Helper Functions

Working with SortedDataFrame structs

HighFrequencyCovariance.get_assets - Function
get_assets(ts::SortedDataFrame, obs_to_include::Integer = 10)

This returns a vector of all of the assets in the SortedDataFrame with at least some number of observations (10 by default).

Inputs

  • ts - The tick data.
  • obs_to_include - An integer for the minimum number of ticks in ts needed for the function to include that asset.

Returns

  • A Vector{Symbol} with each asset.

HighFrequencyCovariance.ticks_per_asset - Function
ticks_per_asset(ts::SortedDataFrame, assets::Vector{Symbol} = get_assets(ts))

Count the number of observations for each asset.

Inputs

  • ts - The tick data
  • assets - A vector with asset Symbols.

Returns

  • A Dict with the number of observations for each input asset.
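
The counting and filtering that get_assets and ticks_per_asset describe can be sketched without the package. This is a minimal illustration rather than the package's implementation: tick_assets, count_ticks and assets_with_min_obs are hypothetical names standing in for the asset column of a SortedDataFrame and for the two functions.

```julia
# Hypothetical stand-in for the asset column of tick data: one entry per tick.
tick_assets = [:A, :B, :A, :C, :A, :B]

# Count the number of ticks observed for each asset.
function count_ticks(tick_assets::Vector{Symbol})
    counts = Dict{Symbol,Int}()
    for asset in tick_assets
        counts[asset] = get(counts, asset, 0) + 1
    end
    return counts
end

# Keep only the assets with at least `obs_to_include` ticks.
function assets_with_min_obs(tick_assets::Vector{Symbol}, obs_to_include::Integer)
    counts = count_ticks(tick_assets)
    return sort([asset for (asset, n) in counts if n >= obs_to_include])
end

counts = count_ticks(tick_assets)
frequent = assets_with_min_obs(tick_assets, 2)
```

Here :C has a single tick, so it is dropped when the threshold is 2.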

HighFrequencyCovariance.duration - Function
duration(ts::SortedDataFrame; in_dates_period::Bool = true)

The time elapsed between the first and the last tick in a SortedDataFrame.

Inputs

  • ts - Tick data.
  • in_dates_period - If true, the duration is returned as a Dates.Period. If false, it is returned as a plain number: the numeric difference between the first and last tick times.

Returns

  • A scalar representing this duration.

HighFrequencyCovariance.subset_to_tick - Function
subset_to_tick(ts::SortedDataFrame, n::Integer)

This subsets a SortedDataFrame to only the first n ticks.

Inputs

  • ts - Tick data.
  • n - How many ticks to subset to.

Returns

  • A (smaller) SortedDataFrame.

HighFrequencyCovariance.subset_to_time - Function
subset_to_time(ts::SortedDataFrame, totime::Real)

This subsets a SortedDataFrame to only the observations up until some time.

Inputs

  • ts - Tick data.
  • totime - Up to what time.

Returns

  • A (smaller) SortedDataFrame.
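
As a rough sketch of what duration, subset_to_tick and subset_to_time compute, the same operations can be written on a plain sorted vector of tick times. The times vector is a hypothetical stand-in for the time column of a SortedDataFrame, and treating the totime boundary as inclusive is an assumption of this sketch.

```julia
# Hypothetical sorted tick times, in units of time_period_per_unit.
times = [0.5, 1.0, 2.5, 4.0, 7.5]

# Numeric duration: time elapsed between the first and the last tick.
numeric_duration = last(times) - first(times)

# First n ticks (the idea behind subset_to_tick).
n = 3
first_ticks = times[1:min(n, length(times))]

# Ticks up to a given time (the idea behind subset_to_time).
# An inclusive boundary is assumed here.
totime = 4.0
ticks_to_time = times[times .<= totime]
```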

Working with CovarianceMatrix structs

HighFrequencyCovariance.covariance - Function
covariance(
    cm::CovarianceMatrix,
    period::Dates.Period = cm.time_period_per_unit,
    assets::Vector{Symbol} = cm.labels,
)

This makes a Hermitian matrix for the covariance matrix over some duration.

Inputs

  • cm - A CovarianceMatrix struct.
  • period - A duration for which you want a covariance matrix. This should be in a Dates.Period.
  • assets - Which assets to include in the covariance matrix.

Returns

  • A Hermitian. The labelling of assets for each row/column is as per the input assets vector.
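
The relationship between the stored correlations and volatilities and a covariance matrix over some period can be sketched as follows. This assumes variances grow linearly with time (Brownian scaling). The period_ratio helper and the numbers are hypothetical, and only the Julia standard library is used.

```julia
using LinearAlgebra, Dates

# Hypothetical inputs: a correlation matrix and per-second volatilities.
cor = Hermitian([1.0 0.5; 0.5 1.0])
vols = [0.01, 0.02]
time_period_per_unit = Second(1)
period = Minute(5)

# Number of time units in the requested period (hypothetical helper).
period_ratio(a::Dates.Period, b::Dates.Period) =
    Dates.value(convert(Millisecond, a)) / Dates.value(convert(Millisecond, b))

# Variances grow linearly with time, so scale by the ratio of the periods.
scale = period_ratio(period, time_period_per_unit)
covar = Hermitian(Diagonal(vols) * Matrix(cor) * Diagonal(vols) .* scale)
```

With these inputs the 5-minute variance of the first asset is 0.01^2 * 300.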

HighFrequencyCovariance.get_correlation - Function
get_correlation(covar::CovarianceMatrix, asset1::Symbol, asset2::Symbol)

Extract the correlation between two assets stored in a CovarianceMatrix.

Inputs

  • covar - A CovarianceMatrix
  • asset1 - A Symbol representing an asset.
  • asset2 - A Symbol representing an asset.

Returns

  • A Scalar (the correlation coefficient).

get_correlation(covar::CovarianceModel, asset1::Symbol, asset2::Symbol)

Extract the correlation between two assets stored in a CovarianceModel.

Inputs

  • covar - A CovarianceModel
  • asset1 - A Symbol representing an asset.
  • asset2 - A Symbol representing an asset.

Returns

  • A Scalar (the correlation coefficient).

HighFrequencyCovariance.get_volatility - Function
get_volatility(
    covar::CovarianceMatrix,
    asset1::Symbol,
    time_period_per_unit::Dates.Period = covar.time_period_per_unit,
)

Get the volatility for a stock from a CovarianceMatrix.

Inputs

  • covar - A CovarianceMatrix
  • asset1 - A Symbol representing an asset.
  • time_period_per_unit - The time interval the volatilities will be for.

Returns

  • A Scalar (the volatility).

get_volatility(
    covar::CovarianceModel,
    asset1::Symbol,
    time_period_per_unit::Dates.Period = covar.time_period_per_unit,
)

Get the volatility for a stock from a CovarianceModel.

Inputs

  • covar - A CovarianceModel
  • asset1 - A Symbol representing an asset.
  • time_period_per_unit - The time interval the volatilities will be for.

Returns

  • A Scalar (the volatility).

HighFrequencyCovariance.make_nan_covariance_matrix - Function
make_nan_covariance_matrix(
    labels::Vector{Symbol},
    time_period_per_unit::Dates.Period,
)

This makes an empty CovarianceMatrix struct with all volatilities and correlations being NaNs.

Inputs

  • labels - The asset names for this (empty) CovarianceMatrix.
  • time_period_per_unit - The time interval the volatilities will be for.

Returns

  • An (empty) CovarianceMatrix

HighFrequencyCovariance.combine_covariance_matrices - Function
combine_covariance_matrices(
    vect::Vector{CovarianceMatrix{T}},
    cor_weights::Vector{<:Real} = repeat([1.0], length(vect)),
    vol_weights::Vector{<:Real} = cor_weights,
    time_period_per_unit::Union{Missing,Dates.Period} = vect[1].time_period_per_unit,
) where T<:Real

Combines a vector of CovarianceMatrix structs into one CovarianceMatrix struct.

Inputs

  • vect - A vector of CovarianceMatrix structs.
  • cor_weights - A vector for how much to weight the correlations from each covariance matrix (by default they will be equally weighted).
  • vol_weights - A vector for how much to weight the volatilities from each covariance matrix (by default they will be equally weighted).
  • time_period_per_unit - What time period should the volatilities be scaled to.

Returns

  • A CovarianceMatrix.
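
One plausible reading of the weighting is a weighted average of the correlation matrices and of the volatility vectors. The sketch below shows that idea on plain arrays. The normalisation of the weights is an assumption here, and the package may treat NaNs or differing label sets in ways this sketch does not cover.

```julia
using LinearAlgebra

# Two hypothetical estimates for the same two assets.
cors = [Hermitian([1.0 0.2; 0.2 1.0]), Hermitian([1.0 0.6; 0.6 1.0])]
vols = [[0.010, 0.020], [0.014, 0.024]]

# Weights are normalised so that they sum to one (an assumption).
cor_weights = [1.0, 3.0]
vol_weights = [1.0, 1.0]
wc = cor_weights ./ sum(cor_weights)
wv = vol_weights ./ sum(vol_weights)

# Weighted averages of the correlation matrices and volatility vectors.
combined_cor = Hermitian(sum(w .* Matrix(c) for (w, c) in zip(wc, cors)))
combined_vol = sum(w .* v for (w, v) in zip(wv, vols))
```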

HighFrequencyCovariance.rearrange - Function
rearrange(
    cm::CovarianceMatrix,
    labels::Vector{Symbol},
    time_period_per_unit::Union{Missing,Dates.Period} = cm.time_period_per_unit,
)

Rearrange the order of labels in a CovarianceMatrix.

Inputs

  • cm - A CovarianceMatrix.
  • labels - A Vector of labels.
  • time_period_per_unit - The time period you want for the resulting CovarianceMatrix.

Returns

  • A CovarianceMatrix.
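
Reordering amounts to applying one permutation to the rows and columns of the correlation matrix and to the volatility vector. A minimal sketch on plain arrays (the variable names are hypothetical and this is not the package's code):

```julia
using LinearAlgebra

labels = [:A, :B, :C]
cor = Hermitian([1.0 0.1 0.2; 0.1 1.0 0.3; 0.2 0.3 1.0])
vols = [0.01, 0.02, 0.03]

# Position of each requested label in the original ordering.
new_labels = [:C, :A, :B]
idx = [findfirst(==(l), labels) for l in new_labels]

# Apply the same permutation to rows, columns and volatilities.
new_cor = Hermitian(Matrix(cor)[idx, idx])
new_vols = vols[idx]
```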

HighFrequencyCovariance.cov_to_cor - Function
cov_to_cor(mat::AbstractMatrix)

Converts a matrix (representing a covariance matrix) into a Hermitian correlation matrix and a vector of standard deviations.

Inputs

  • mat - A matrix (representing a covariance matrix).

Returns

  • A Hermitian.
  • A Vector of standard deviations (not volatilities).
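
The conversion itself is the standard one: standard deviations are the square roots of the diagonal, and each covariance is divided by the product of the two standard deviations. A small sketch with the standard library (the numbers are made up):

```julia
using LinearAlgebra

covmat = [0.04 0.006; 0.006 0.09]

# Standard deviations are the square roots of the diagonal.
sdevs = sqrt.(diag(covmat))

# Divide each entry by the product of the two standard deviations.
cor = Hermitian(covmat ./ (sdevs * sdevs'))
```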

HighFrequencyCovariance.cor_to_cov - Function
cor_to_cov(cor::AbstractMatrix,sdevs::Vector{<:Real})

Converts a correlation matrix and some standard deviations into a Hermitian covariance matrix.

Inputs

  • cor - A correlation matrix.
  • sdevs - A vector of standard deviations (not volatilities).

Returns

  • A Hermitian.
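
In matrix form this is D * cor * D, where D is the diagonal matrix of standard deviations. A sketch with made-up inputs:

```julia
using LinearAlgebra

cor = Hermitian([1.0 0.1; 0.1 1.0])
sdevs = [0.2, 0.3]

# Scale row i and column i of the correlation matrix by sdevs[i].
covmat = Hermitian(Diagonal(sdevs) * Matrix(cor) * Diagonal(sdevs))
```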

HighFrequencyCovariance.cov_to_cor_and_vol - Function
cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix::Dates.Period,
    duration_for_desired_vols::Dates.Period,
)

cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix::Real,
    duration_for_desired_vols::Real,
)

Converts a matrix (representing a covariance matrix) into a Hermitian correlation matrix and a vector of volatilities.

Inputs

  • mat - A matrix representing a covariance matrix.
  • duration_of_covariance_matrix - The duration of the covariance matrix. If these are input as reals they must have the same units.
  • duration_for_desired_vols - The duration you want a volatility for. If these are input as reals they must have the same units.

Returns

  • A Hermitian.
  • A Vector of volatilities.

cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix_in_natural_units::Real,
)

Inputs

  • mat - A matrix representing a covariance matrix.
  • duration_of_covariance_matrix_in_natural_units - The duration of the covariance matrix. This duration must be given in units that you know (for instance the time_period_per_unit of a SortedDataFrame).

Returns

  • A Hermitian.
  • A Vector of volatilities.
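
The difference from cov_to_cor is the rescaling of standard deviations to a chosen time interval. Assuming standard deviations scale with the square root of time, the Real-valued form can be sketched like this (made-up numbers, both durations in seconds):

```julia
using LinearAlgebra

# Hypothetical covariance matrix of returns over a 300-second window.
covmat = [0.03 0.03; 0.03 0.12]
duration_of_covariance_matrix = 300.0   # seconds
duration_for_desired_vols = 1.0         # seconds

# Correlations do not depend on the time scale.
sdevs = sqrt.(diag(covmat))
cor = Hermitian(covmat ./ (sdevs * sdevs'))

# Standard deviations scale with the square root of time.
vols = sdevs .* sqrt(duration_for_desired_vols / duration_of_covariance_matrix)
```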

HighFrequencyCovariance.construct_matrix_from_eigen - Function
construct_matrix_from_eigen(
    eigenvalues::Vector{<:Real},
    eigenvectors::Matrix{<:Real},
)

Constructs a matrix from its eigenvalue decomposition.

Inputs

  • eigenvalues - A vector of eigenvalues.
  • eigenvectors - A matrix of eigenvectors. The i'th column corresponds to the i'th eigenvalue.

Returns

  • A Matrix.
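
For a symmetric matrix the reconstruction is V * Diagonal(λ) * V', with the eigenvectors in the columns of V. The sketch below decomposes a small matrix and rebuilds it. It relies on the eigenvectors being orthonormal, which holds for symmetric input.

```julia
using LinearAlgebra

mat = [2.0 1.0; 1.0 2.0]
decomp = eigen(Symmetric(mat))

# The i'th column of decomp.vectors corresponds to decomp.values[i].
rebuilt = decomp.vectors * Diagonal(decomp.values) * decomp.vectors'
```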

HighFrequencyCovariance.get_returns - Function
get_returns(dd::DataFrame; rescale_for_duration::Bool = false)

Converts a long format DataFrame of prices into a DataFrame of returns.

Inputs

  • dd - A DataFrame with a column called :Time and all other columns being asset prices in each period.
  • rescale_for_duration - Should returns be rescaled to reflect a common time interval.

Returns

  • A DataFrame of returns.

HighFrequencyCovariance.valid_correlation_matrix - Function
valid_correlation_matrix(mat::Hermitian, min_eigen_threshold::Real = 0.0)

valid_correlation_matrix(covar::CovarianceMatrix, min_eigen_threshold::Real = 0.0)

Test if a Hermitian matrix is a valid correlation matrix. This is done by testing if it is psd, if it has a unit diagonal and if all other elements are less than one. If a Hermitian is input then it will be tested. If a CovarianceMatrix is input then its correlation matrix will be tested.

Inputs

  • mat - A Hermitian matrix or a CovarianceMatrix
  • min_eigen_threshold - How big does the smallest eigenvalue have to be.

Returns

  • A Bool that is true if mat is a valid correlation matrix and false if not.
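
The three tests described above can be sketched as a single function. This is an illustration under stated assumptions, not the package's code: the name looks_like_valid_correlation is hypothetical, the eigenvalue comparison uses >=, the diagonal is compared approximately, and off-diagonal entries are checked in absolute value.

```julia
using LinearAlgebra

function looks_like_valid_correlation(mat::Hermitian, min_eigen_threshold::Real = 0.0)
    # psd test: the smallest eigenvalue must reach the threshold.
    minimum(eigvals(mat)) >= min_eigen_threshold || return false
    # Unit diagonal (up to floating point error).
    all(isapprox.(diag(mat), 1.0)) || return false
    # Off-diagonal entries strictly inside (-1, 1).
    n = size(mat, 1)
    return all(abs(mat[i, j]) < 1 for i in 1:n for j in 1:n if i != j)
end

good = Hermitian([1.0 0.5; 0.5 1.0])
bad = Hermitian([1.0 1.2; 1.2 1.0])   # has a negative eigenvalue

good_is_valid = looks_like_valid_correlation(good)
bad_is_valid = looks_like_valid_correlation(bad)
```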

valid_correlation_matrix(covar::CovarianceModel, min_eigen_threshold::Real = 0.0)

HighFrequencyCovariance.is_psd_matrix - Function
is_psd_matrix(mat::Hermitian, min_eigen_threshold::Real = 0.0)

is_psd_matrix(covar::CovarianceMatrix)

Test if a matrix is psd (Positive Semi-Definite). This is done by seeing if all eigenvalues are positive. If a Hermitian is input then it will be tested. If a CovarianceMatrix is input then its correlation matrix will be tested.

Inputs

  • mat - A Hermitian matrix or a CovarianceMatrix
  • min_eigen_threshold - How big does the smallest eigenvalue have to be.

Returns

  • A Bool that is true if mat is psd and false if not.

is_psd_matrix(covar::CovarianceModel)

Test if a CovarianceModel is psd (Positive Semi-Definite). This is done by seeing if all eigenvalues of its correlation matrix are positive.

Inputs

  • covar - A CovarianceModel
  • min_eigen_threshold - How big does the smallest eigenvalue have to be.

Returns

  • A Bool that is true if mat is psd and false if not.

HighFrequencyCovariance.relabel - Function
relabel(covar::CovarianceMatrix, mapping::Dict{Symbol,Symbol})

This relabels a CovarianceMatrix struct to give all the assets alternative names.

Inputs

  • covar - The CovarianceMatrix object you want to relabel.
  • mapping - A dict mapping from the names you have to the names you want.

Returns

  • A CovarianceMatrix the same as the one you input but with new labels.

relabel(covar::CovarianceModel, mapping::Dict{Symbol,Symbol})

This relabels a CovarianceModel struct to give all the assets alternative names.

Inputs

  • covar - The CovarianceModel object you want to relabel.
  • mapping - A dict mapping from the names you have to the names you want.

Returns

  • A CovarianceModel the same as the one you input but with new labels.

Blocking and Regularisation Functions

HighFrequencyCovariance.put_assets_into_blocks_by_trading_frequency - Function
put_assets_into_blocks_by_trading_frequency(
   ts::SortedDataFrame,
   obs_multiple_for_new_block::Real,
   func::Symbol,
   optional_parameters::NamedTuple = NamedTuple(),
)

This makes a DataFrame that describes how to estimate the covariance matrix blockwise.

Inputs

  • ts - The tick data.
  • obs_multiple_for_new_block - The relative number of ticks needed before a new block is made. So if this is 1.2 that means a new group is made when one asset has at least 20% more ticks than the slowest traded asset in the previous block.
  • func - A symbol representing the covariance estimation function to be used.
  • optional_parameters - Optional parameters to be used in the func function.

Returns

  • A DataFrame representing what estimations should be performed. The order of rows in the DataFrame shows the order of estimation.

References

Hautsch, N., Kyj, L.M. and Oomen, R.C.A. (2012), A blocking and regularization approach to high‐dimensional realized covariance estimation. J. Appl. Econ., 27: 625-645


HighFrequencyCovariance.blockwise_estimation - Function
blockwise_estimation(ts::SortedDataFrame, blocking_frame::DataFrame)

Run a series of covariance estimations and combine the results. Two things should be input: a SortedDataFrame with the price update data and a DataFrame describing what estimations should be performed. The DataFrame should be of the same form as is output by put_assets_into_blocks_by_trading_frequency (although the actual estimations can be customised to differ from what that function outputs).

Inputs

  • ts - The tick data.
  • blocking_frame - A DataFrame representing what estimations to do and in what order. This is often one generated by the put_assets_into_blocks_by_trading_frequency function (and potentially then modified).

Returns

  • A CovarianceMatrix.

HighFrequencyCovariance.put_assets_into_blocks - Function
put_assets_into_blocks(ts::SortedDataFrame, new_group_mult::Real)

This splits assets into separate blocks depending on their number of ticks.

Inputs

  • ts - The tick data.
  • new_group_mult - The relative number of ticks needed before a new block is made. So if this is 1.2 that means a new group is made when one asset has at least 20% more ticks than the slowest traded asset in the previous block.

Returns

  • A DataFrame.
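
The grouping rule can be sketched as a single pass over assets sorted by tick count. This is an illustration of the rule as described, not the package's implementation. The function name group_by_tick_count and the tick counts are hypothetical, and the sketch returns plain vectors of symbols rather than a DataFrame.

```julia
# Hypothetical tick counts per asset.
tick_counts = Dict(:A => 100, :B => 110, :C => 130, :D => 150, :E => 400)

function group_by_tick_count(tick_counts::Dict{Symbol,Int}, new_group_mult::Real)
    # Order assets from least to most frequently traded.
    ordered = sort(collect(keys(tick_counts)); by = a -> tick_counts[a])
    groups = Vector{Vector{Symbol}}()
    slowest_in_group = 0
    for asset in ordered
        n = tick_counts[asset]
        # Open a new group once an asset has enough ticks relative to
        # the slowest traded asset of the current group.
        if isempty(groups) || n >= new_group_mult * slowest_in_group
            push!(groups, Symbol[])
            slowest_in_group = n
        end
        push!(last(groups), asset)
    end
    return groups
end

groups = group_by_tick_count(tick_counts, 1.2)
```

With a multiple of 1.2, :B (110 ticks) stays with :A (100 ticks), while :C (130 ticks) starts a new group.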

Monte Carlo

HighFrequencyCovariance.generate_random_path - Function
generate_random_path(
   dimensions::Integer,
   ticks::Integer;
   syncronous::Bool = false,
   rng::Union{MersenneTwister,StableRNG} = MersenneTwister(1),
   vol_dist::Distribution = Uniform(
       0.1 / sqrt(252 * 8 * 3600),
       0.5 / sqrt(252 * 8 * 3600),
   ),
   refresh_rate_dist::Distribution = Uniform(0.5, 5.0),
   time_period_per_unit::Dates.Period = Second(1),
   micro_noise_dist::Distribution = Uniform(
       vol_dist.a * sqrt(time_period_ratio(Minute(5), time_period_per_unit)),
       vol_dist.b * sqrt(time_period_ratio(Minute(5), time_period_per_unit)),
   ),
   assets::Union{Vector,Missing} = missing,
   brownian_corr_matrix::Union{Hermitian,Missing} = missing,
   vols::Union{Vector,Missing} = missing,
   rng_timing::Union{MersenneTwister,StableRNG} = MersenneTwister(1),
)

Generate a random path of price updates with a specified number of dimensions and ticks. There are options for whether the data is synchronous or asynchronous (the syncronous keyword), the volatility of the price processes, the refresh rate on the (exponential) arrival times of price updates, and the distribution of the microstructure noise.

Note the defaults are chosen to reflect a high-cap stock with annualised volatility between 10% and 50%. The standard deviation of microstructure noise is of the same order of magnitude as the standard deviation of a 5-minute return, that is vol * sqrt(60*5) if vol is per second. Ticks are refreshed every 0.5-5 seconds (in expectation).

Inputs

  • dimensions - The number of assets.
  • ticks - The number of ticks to produce.
  • syncronous - Should ticks be synchronous (for each asset) or asynchronous.
  • rng - The Random.MersenneTwister or StableRNGs.StableRNG used for RNG.
  • vol_dist - The distribution to draw volatilities from (only used if vols is missing).
  • refresh_rate_dist - The distribution to draw refresh rates (exponential distribution rates) from. Note if you want all intervals to be evenly spaced you can do something like DiscreteUniform(1,1).
  • time_period_per_unit - What time period should the time column correspond to.
  • micro_noise_dist - The distribution from which assetwise microstructure noise standard deviations are drawn.
  • assets - The names of the assets that you want to use. The length of this must be equal to the dimensions input.
  • brownian_corr_matrix - The correlation matrix to use. This is sampled from the Inverse Wishart distribution if none is input.
  • vols - The volatilities to use. These are sampled from vol_dist if none are input.

Returns

  • A SortedDataFrame of tick data.
  • A CovarianceMatrix representing the true data generation process used in making the tick data.
  • A Dict of microstructure noise variances for each asset.
  • A Dict of update rates for each asset.
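
To make the data generation process concrete, here is a single-asset sketch using only the standard library. It is not the package's implementation. It assumes exponential gaps between ticks, Brownian increments scaled by the square root of each gap, and independent Gaussian noise added to every observation. All names and parameter values are hypothetical, with the scale of the numbers taken from the defaults described above.

```julia
using Random

rng = MersenneTwister(1)
ticks = 1000
vol = 0.3 / sqrt(252 * 8 * 3600)   # per-second volatility (30% annualised)
mean_gap = 2.0                     # expected seconds between ticks
noise_sd = vol * sqrt(60 * 5)      # noise comparable to a 5-minute return

# Exponential gaps between price updates give asynchronous arrival times.
gaps = mean_gap .* randexp(rng, ticks)
times = cumsum(gaps)

# Latent log price: Brownian increments scaled by the square root of each gap.
latent = cumsum(vol .* sqrt.(gaps) .* randn(rng, ticks))

# The observed price adds independent microstructure noise to each tick.
observed = latent .+ noise_sd .* randn(rng, ticks)
```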

StochasticIntegrals.ItoSet - Type
StochasticIntegrals.ItoSet(covariance_matrix::CovarianceMatrix{<:Real})

Convert a CovarianceMatrix into an ItoSet from the StochasticIntegrals package. That package can then be used, for example, to generate draws from the multivariate Gaussian corresponding to the covariance matrix.

Inputs

  • covariance_matrix - The CovarianceMatrix that you want to convert into a StochasticIntegrals.ItoSet

Returns

  • A StochasticIntegrals.ItoSet struct.

Example

using Dates, HighFrequencyCovariance, StochasticIntegrals
covar = CovarianceMatrix(make_random_psd_matrix_from_wishart(5), rand(5), [:A,:B,:C,:D,:E], Dates.Hour(1))
iset = ItoSet(covar)
# To see how this is used for something useful you can look at the get_draws function.

StochasticIntegrals.get_draws - Function
StochasticIntegrals.get_draws(
    covariance_matrix::CovarianceMatrix{<:Real},
    num::Integer;
    number_generator::NumberGenerator = Mersenne(
        MersenneTwister(1234),
        length(covariance_matrix.labels),
    ),
    antithetic_variates = false,
)

Get pseudorandom draws from a CovarianceMatrix struct. This is basically a convenience wrapper over StochasticIntegrals.get_draws which does the necessary constructing of the structs of that package. If the antithetic_variates control is set to true then every second set of draws will be antithetic to the previous. If you want to do something like Sobol sampling you can change the number_generator. See StochasticIntegrals to see what is available (and feel free to make new ones and put in Pull Requests).

Inputs

  • covariance_matrix - A CovarianceMatrix struct that you want to draw from.
  • num - The number of draws you want.
  • number_generator - A NumberGenerator struct that can be queried for a series of unit interval vectors that are then transformed by the covariance matrix into draws.
  • antithetic_variates - A boolean indicating if antithetic variates should be used (every second draw is made from 1 minus the uniform draw of the previous one).

Returns

  • A Vector of Dicts of draws. Note you can convert this to a dataframe or array with StochasticIntegrals.to_dataframe or StochasticIntegrals.to_array.
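
The antithetic idea can be sketched without StochasticIntegrals. For a Gaussian, replacing a uniform draw u by 1 - u flips the sign of the corresponding standard normal, so each antithetic partner is the negative of the previous normal vector before the covariance structure is applied. The function antithetic_draws below is a hypothetical illustration that uses a Cholesky factor and returns plain vectors rather than Dicts.

```julia
using LinearAlgebra, Random

function antithetic_draws(covmat::AbstractMatrix, num::Integer, rng::AbstractRNG)
    # The lower Cholesky factor maps independent normals to correlated draws.
    L = cholesky(Symmetric(covmat)).L
    draws = Vector{Vector{Float64}}()
    z = zeros(size(covmat, 1))
    for i in 1:num
        # Odd draws are fresh. Even draws mirror the previous one.
        z = isodd(i) ? randn(rng, length(z)) : -z
        push!(draws, L * z)
    end
    return draws
end

draws = antithetic_draws([0.04 0.006; 0.006 0.09], 4, MersenneTwister(1234))
```

Each pair of consecutive draws sums to zero, which is what makes the sample mean exact for a zero-mean Gaussian.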

StochasticIntegrals.get_draws(
    covariance_model::CovarianceModel{<:Real},
    num::Integer;
    number_generator::NumberGenerator = Mersenne(
        MersenneTwister(1234),
        length(covariance_matrix.labels),
    ),
    antithetic_variates = false,
)

Get pseudorandom draws from a CovarianceModel struct. This is basically a convenience wrapper over StochasticIntegrals.get_draws which does the necessary constructing of the structs of that package. If the antithetic_variates control is set to true then every second set of draws will be antithetic to the previous. If you want to do something like Sobol sampling you can change the number_generator. See StochasticIntegrals to see what is available (and feel free to make new ones and put in Pull Requests).

Inputs

  • covariance_model - A CovarianceModel struct that you want to draw from.
  • num - The number of draws you want.
  • number_generator - A NumberGenerator struct that can be queried for a series of unit interval vectors that are then transformed by the covariance matrix into draws.
  • antithetic_variates - A boolean indicating if antithetic variates should be used (every second draw is made from 1 minus the uniform draw of the previous one).

Returns

  • A Vector of Dicts of draws. Note you can convert this to a dataframe or array with StochasticIntegrals.to_dataframe or StochasticIntegrals.to_array.

HighFrequencyCovariance.convert_to_stochastic_integrals_type - Function
convert_to_stochastic_integrals_type(x::MersenneTwister, num::Integer)

convert_to_stochastic_integrals_type(x::StableRNG, num::Integer)

This makes either a StochasticIntegrals.Mersenne or StochasticIntegrals.Stable_RNG type depending on what random number generator is input.


For getting a DataFrame version of a CovarianceMatrix and vice versa.

DataFrames.DataFrame - Type
DataFrames.DataFrame(
    covar::CovarianceMatrix,
    othercols::Dict = Dict{Symbol,Any}();
    delete_duplicate_correlations::Bool = true,
)

Convert a CovarianceMatrix to a DataFrame format.

Inputs

  • covar - The CovarianceMatrix
  • othercols - This adds columns to the DataFrame. For instance if it is Dict{Symbol,String}([:pc] .=> ["Fred's PC"]), then there will be a column indicating that this estimation was done on Fred's PC.
  • delete_duplicate_correlations - Should the duplicate correlations be deleted (as correlation matrices are symmetric, half the entries duplicate information).

Returns

  • A DataFrame.

DataFrames.DataFrame(
    covar::CovarianceModel,
    othercols::Dict = Dict{Symbol,Any}();
    delete_duplicate_correlations::Bool = true,
)

Convert a CovarianceModel to a DataFrame format.

Inputs

  • covar - The CovarianceModel
  • othercols - This adds columns to the DataFrame. For instance if it is Dict{Symbol,String}([:pc] .=> ["Fred's PC"]), then there will be a column indicating that this estimation was done on Fred's PC.
  • delete_duplicate_correlations - Should the duplicate correlations be deleted (as correlation matrices are symmetric, half the entries duplicate information).

Returns

  • A DataFrame.

HighFrequencyCovariance.CovarianceMatrix - Type
CovarianceMatrix(
    correlation::Hermitian{R},
    volatility::Vector{R},
    labels::Vector{Symbol},
    time_period_per_unit::Dates.Period,
) where R<:Real

This struct stores four elements: a Hermitian correlation matrix, a vector of volatilities, a vector of labels and the time period that one unit of volatility corresponds to. The order of the labels matches the order of the assets in the volatility vector and correlation matrix. The default constructor is used.

Inputs

  • correlation - A Hermitian correlation matrix.
  • volatility - Volatilities for each asset.
  • labels - The labels for the correlation and volatility members. The n'th entry of the labels vector should contain the name of the asset that has its volatility in the n'th entry of the volatility member and its correlations in the n'th row/column of the correlation member.
  • time_period_per_unit - The period that one unit of volatility corresponds to.

Returns

  • A CovarianceMatrix.