Helper Functions
- DataFrames.DataFrame
- HighFrequencyCovariance.CovarianceMatrix
- StochasticIntegrals.ItoSet
- HighFrequencyCovariance.blockwise_estimation
- HighFrequencyCovariance.combine_covariance_matrices
- HighFrequencyCovariance.construct_matrix_from_eigen
- HighFrequencyCovariance.convert_to_stochastic_integrals_type
- HighFrequencyCovariance.cor_to_cov
- HighFrequencyCovariance.cov_to_cor
- HighFrequencyCovariance.cov_to_cor_and_vol
- HighFrequencyCovariance.covariance
- HighFrequencyCovariance.duration
- HighFrequencyCovariance.generate_random_path
- HighFrequencyCovariance.get_assets
- HighFrequencyCovariance.get_correlation
- HighFrequencyCovariance.get_returns
- HighFrequencyCovariance.get_volatility
- HighFrequencyCovariance.is_psd_matrix
- HighFrequencyCovariance.make_adjacent_block_sequence
- HighFrequencyCovariance.make_nan_covariance_matrix
- HighFrequencyCovariance.make_sorted_adjacent_block_sequence
- HighFrequencyCovariance.put_assets_into_blocks
- HighFrequencyCovariance.put_assets_into_blocks_by_trading_frequency
- HighFrequencyCovariance.rearrange
- HighFrequencyCovariance.relabel
- HighFrequencyCovariance.subset_to_tick
- HighFrequencyCovariance.subset_to_time
- HighFrequencyCovariance.ticks_per_asset
- HighFrequencyCovariance.valid_correlation_matrix
- StochasticIntegrals.get_draws
Working with SortedDataFrame structs
HighFrequencyCovariance.get_assets — Function
get_assets(ts::SortedDataFrame, obs_to_include::Integer = 10)
Returns a vector of all the assets in the SortedDataFrame that have at least some number of observations (10 by default).
Inputs
- `ts` - The tick data.
- `obs_to_include` - The minimum number of ticks in `ts` that an asset needs for the function to include it.
Returns
- A `Vector{Symbol}` with each asset.
HighFrequencyCovariance.ticks_per_asset — Function
ticks_per_asset(ts::SortedDataFrame, assets::Vector{Symbol} = get_assets(ts))
Count the number of observations for each asset.
Inputs
- `ts` - The tick data.
- `assets` - A vector with asset `Symbol`s.
Returns
- A `Dict` with the number of observations for each input asset.
HighFrequencyCovariance.duration — Function
duration(ts::SortedDataFrame; in_dates_period::Bool = true)
The time elapsed between the first and the last tick in a SortedDataFrame.
Inputs
- `ts` - Tick data.
- `in_dates_period` - Whether to return the duration in `Dates.Period` format or as a plain number (the numeric difference between the first and last tick).
Returns
- A scalar representing this duration.
HighFrequencyCovariance.subset_to_tick — Function
subset_to_tick(ts::SortedDataFrame, n::Integer)
This subsets a SortedDataFrame to only the first n ticks.
Inputs
- `ts` - Tick data.
- `n` - How many ticks to subset to.
Returns
- A (smaller) `SortedDataFrame`.
HighFrequencyCovariance.subset_to_time — Function
subset_to_time(ts::SortedDataFrame, totime::Real)
This subsets a SortedDataFrame to only the observations up until some time.
Inputs
- `ts` - Tick data.
- `totime` - The time up to which observations are kept.
Returns
- A (smaller) `SortedDataFrame`.
Working with CovarianceMatrix structs
HighFrequencyCovariance.covariance — Function
covariance(
    cm::CovarianceMatrix,
    period::Dates.Period = cm.time_period_per_unit,
    assets::Vector{Symbol} = cm.labels,
)
This makes a Hermitian matrix for the covariance matrix over some duration.
Inputs
- `cm` - A `CovarianceMatrix` struct.
- `period` - The duration for which you want a covariance matrix. This should be a `Dates.Period`.
- `assets` - Which assets to include in the covariance matrix.
Returns
- A `Hermitian`. The labelling of assets for each row/column is as per the input `assets` vector.
HighFrequencyCovariance.get_correlation — Function
get_correlation(covar::CovarianceMatrix, asset1::Symbol, asset2::Symbol)
Extract the correlation between two assets stored in a CovarianceMatrix.
Inputs
- `covar` - A `CovarianceMatrix`.
- `asset1` - A `Symbol` representing an asset.
- `asset2` - A `Symbol` representing an asset.
Returns
- A scalar (the correlation coefficient).
get_correlation(covar::CovarianceModel, asset1::Symbol, asset2::Symbol)
Extract the correlation between two assets stored in a CovarianceModel.
Inputs
- `covar` - A `CovarianceModel`.
- `asset1` - A `Symbol` representing an asset.
- `asset2` - A `Symbol` representing an asset.
Returns
- A scalar (the correlation coefficient).
HighFrequencyCovariance.get_volatility — Function
get_volatility(
    covar::CovarianceMatrix,
    asset1::Symbol,
    time_period_per_unit::Dates.Period = covar.time_period_per_unit,
)
Get the volatility for a stock from a CovarianceMatrix.
Inputs
- `covar` - A `CovarianceMatrix`.
- `asset1` - A `Symbol` representing an asset.
- `time_period_per_unit` - The time interval the volatilities will be for.
Returns
- A scalar (the volatility).
get_volatility(
    covar::CovarianceModel,
    asset1::Symbol,
    time_period_per_unit::Dates.Period = covar.time_period_per_unit,
)
Get the volatility for a stock from a CovarianceModel.
Inputs
- `covar` - A `CovarianceModel`.
- `asset1` - A `Symbol` representing an asset.
- `time_period_per_unit` - The time interval the volatilities will be for.
Returns
- A scalar (the volatility).
HighFrequencyCovariance.make_nan_covariance_matrix — Function
make_nan_covariance_matrix(
    labels::Vector{Symbol},
    time_period_per_unit::Dates.Period,
)
This makes an empty CovarianceMatrix struct with all volatilities and correlations being NaNs.
Inputs
- `labels` - The asset names for this (empty) `CovarianceMatrix`.
- `time_period_per_unit` - The time interval the volatilities will be for.
Returns
- An (empty) `CovarianceMatrix`.
HighFrequencyCovariance.combine_covariance_matrices — Function
combine_covariance_matrices(
    vect::Vector{CovarianceMatrix{T}},
    cor_weights::Vector{<:Real} = repeat([1.0], length(vect)),
    vol_weights::Vector{<:Real} = cor_weights,
    time_period_per_unit::Union{Missing,Dates.Period} = vect[1].time_period_per_unit,
) where T<:Real
Combines a vector of CovarianceMatrix structs into one CovarianceMatrix struct.
Inputs
- `vect` - A vector of `CovarianceMatrix` structs.
- `cor_weights` - A vector for how much to weight the correlations from each covariance matrix (by default they are equally weighted).
- `vol_weights` - A vector for how much to weight the volatilities from each covariance matrix (by default they are equally weighted).
- `time_period_per_unit` - The time period the volatilities should be scaled to.
Returns
- A `CovarianceMatrix` (as stated in the description above).
HighFrequencyCovariance.rearrange — Function
rearrange(
    cm::CovarianceMatrix,
    labels::Vector{Symbol},
    time_period_per_unit::Union{Missing,Dates.Period} = cm.time_period_per_unit,
)
Rearrange the order of labels in a CovarianceMatrix.
Inputs
- `cm` - A `CovarianceMatrix`.
- `labels` - A `Vector` of labels.
- `time_period_per_unit` - The time period you want for the resultant `CovarianceMatrix`.
Returns
- A `CovarianceMatrix`.
HighFrequencyCovariance.cov_to_cor — Function
cov_to_cor(mat::AbstractMatrix)
Converts a matrix (representing a covariance matrix) into a Hermitian correlation matrix and a vector of standard deviations.
Inputs
- `mat` - A matrix (representing a covariance matrix).
Returns
- A `Hermitian`.
- A `Vector` of standard deviations (not volatilities).
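The conversion described above can be sketched with the standard library alone. This is a minimal illustration of the usual formula (divide each covariance by the product of the two standard deviations), not the package's own implementation; the name `sketch_cov_to_cor` is hypothetical.

```julia
using LinearAlgebra

# Hypothetical helper for illustration only (not part of HighFrequencyCovariance).
function sketch_cov_to_cor(mat::AbstractMatrix)
    sdevs = sqrt.(diag(mat))            # standard deviations from the diagonal
    cor = mat ./ (sdevs * sdevs')       # entry (i, j) divided by sd_i * sd_j
    return Hermitian(cor), sdevs
end

covmat = [4.0 1.2; 1.2 9.0]
cor, sdevs = sketch_cov_to_cor(covmat)
```

Here the diagonal of `cor` is one by construction and the off-diagonal entry is 1.2 / (2 * 3).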
HighFrequencyCovariance.cor_to_cov — Function
cor_to_cov(cor::AbstractMatrix, sdevs::Vector{<:Real})
Converts a correlation matrix and some standard deviations into a Hermitian covariance matrix.
Inputs
- `cor` - A correlation matrix.
- `sdevs` - A vector of standard deviations (not volatilities).
Returns
- A `Hermitian`.
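As a rough sketch of this direction, a covariance matrix can be formed by scaling the correlation matrix on both sides by a diagonal matrix of standard deviations. The helper name `sketch_cor_to_cov` is hypothetical and the package may compute this differently.

```julia
using LinearAlgebra

# Hypothetical helper for illustration only (not part of HighFrequencyCovariance).
function sketch_cor_to_cov(cor::AbstractMatrix, sdevs::Vector{<:Real})
    D = Diagonal(sdevs)                 # standard deviations on the diagonal
    return Hermitian(D * cor * D)       # entry (i, j) is sd_i * cor_ij * sd_j
end

cormat = [1.0 0.2; 0.2 1.0]
covmat = sketch_cor_to_cov(cormat, [2.0, 3.0])
```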
HighFrequencyCovariance.cov_to_cor_and_vol — Function
cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix::Dates.Period,
    duration_for_desired_vols::Dates.Period,
)
cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix::Real,
    duration_for_desired_vols::Real,
)
Converts a matrix (representing a covariance matrix) into a Hermitian correlation matrix and a vector of volatilities.
Inputs
- `mat` - A matrix (representing a covariance matrix).
- `duration_of_covariance_matrix` - The duration of the covariance matrix. If the durations are input as reals they must have the same units.
- `duration_for_desired_vols` - The duration you want a volatility for. If the durations are input as reals they must have the same units.
Returns
- A `Hermitian`.
- A `Vector` of volatilities.
cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix_in_natural_units::Real,
)
Inputs
- `mat` - A matrix (representing a covariance matrix).
- `duration_of_covariance_matrix_in_natural_units` - The duration of the covariance matrix. The duration must be input in units that you know of (for instance the `time_period_per_unit` of a `SortedDataFrame`).
Returns
- A `Hermitian`.
- A `Vector` of volatilities.
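The difference between standard deviations and volatilities is the time rescaling. The sketch below assumes that variance grows linearly with time, so that a standard deviation is rescaled by the square root of the ratio of durations; that assumption, and the helper name `sketch_cov_to_cor_and_vol`, are illustrative and not taken from the package source.

```julia
using LinearAlgebra

# Hypothetical helper; assumes variance scales linearly with elapsed time.
function sketch_cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix::Real,
    duration_for_desired_vols::Real,
)
    sdevs = sqrt.(diag(mat))
    cor = Hermitian(mat ./ (sdevs * sdevs'))
    # Rescale standard deviations from the matrix duration to the desired one.
    vols = sdevs .* sqrt(duration_for_desired_vols / duration_of_covariance_matrix)
    return cor, vols
end

# A covariance matrix over 3600 seconds, converted to per-second volatilities.
hourly_cov = [0.0004 0.0001; 0.0001 0.0009]
cor, vols = sketch_cov_to_cor_and_vol(hourly_cov, 3600, 1)
```

Both durations are given in seconds here; as the input notes say, reals must share the same units.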
HighFrequencyCovariance.construct_matrix_from_eigen — Function
construct_matrix_from_eigen(
    eigenvalues::Vector{<:Real},
    eigenvectors::Matrix{<:Real},
)
Constructs a matrix from its eigenvalue decomposition.
Inputs
- `eigenvalues` - A vector of eigenvalues.
- `eigenvectors` - A matrix of eigenvectors. The i'th column corresponds to the i'th eigenvalue.
Returns
- A `Matrix`.
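A matrix can be rebuilt from its eigendecomposition as V * diag(λ) * V⁻¹, where the columns of V are the eigenvectors. The following is a minimal sketch of that identity using only the standard library; `sketch_construct_matrix_from_eigen` is a hypothetical name and the package's implementation may differ in detail.

```julia
using LinearAlgebra

# Hypothetical helper for illustration only (not part of HighFrequencyCovariance).
function sketch_construct_matrix_from_eigen(
    eigenvalues::Vector{<:Real},
    eigenvectors::Matrix{<:Real},
)
    # The i'th column of `eigenvectors` pairs with the i'th eigenvalue.
    return eigenvectors * Diagonal(eigenvalues) * inv(eigenvectors)
end

A = [2.0 1.0; 1.0 2.0]
decomposition = eigen(Symmetric(A))
rebuilt = sketch_construct_matrix_from_eigen(decomposition.values, decomposition.vectors)
```

Decomposing and rebuilding should recover `A` up to floating point error.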
HighFrequencyCovariance.get_returns — Function
get_returns(dd::DataFrame; rescale_for_duration::Bool = false)
Converts a long format DataFrame of prices into a DataFrame of returns.
Inputs
- `dd` - A `DataFrame` with a column called `:Time` and all other columns being asset prices in each period.
- `rescale_for_duration` - Whether returns should be rescaled to reflect a common time interval.
Returns
- A `DataFrame` of returns.
HighFrequencyCovariance.valid_correlation_matrix — Function
valid_correlation_matrix(mat::Hermitian, min_eigen_threshold::Real = 0.0)
valid_correlation_matrix(covar::CovarianceMatrix, min_eigen_threshold::Real = 0.0)
Test if a Hermitian matrix is a valid correlation matrix. This is done by testing if it is psd, if it has a unit diagonal and if all other elements are less than one. If a Hermitian is input then it will be tested. If a CovarianceMatrix is input then its correlation matrix will be tested.
Inputs
- `mat` - A `Hermitian` matrix or a `CovarianceMatrix`.
- `min_eigen_threshold` - How big the smallest eigenvalue has to be.
Returns
- A `Bool` that is true if `mat` is a valid correlation matrix and false if not.
valid_correlation_matrix(covar::CovarianceModel, min_eigen_threshold::Real = 0.0)
HighFrequencyCovariance.is_psd_matrix — Function
is_psd_matrix(mat::Hermitian, min_eigen_threshold::Real = 0.0)
is_psd_matrix(covar::CovarianceMatrix)
Test if a matrix is psd (Positive Semi-Definite). This is done by seeing if all eigenvalues are positive. If a Hermitian is input then it will be tested. If a CovarianceMatrix is input then its correlation matrix will be tested.
Inputs
- `mat` - A `Hermitian` matrix or a `CovarianceMatrix`.
- `min_eigen_threshold` - How big the smallest eigenvalue has to be.
Returns
- A `Bool` that is true if `mat` is psd and false if not.
is_psd_matrix(covar::CovarianceModel)
Test if a matrix is psd (Positive Semi-Definite). This is done by seeing if all eigenvalues are positive. If a Hermitian is input then it will be tested. If a CovarianceModel is input then its correlation matrix will be tested.
Inputs
- `mat` - A `CovarianceModel`.
- `min_eigen_threshold` - How big the smallest eigenvalue has to be.
Returns
- A `Bool` that is true if `mat` is psd and false if not.
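The two checks described above (eigenvalues against a threshold, then unit diagonal and bounded off-diagonal entries) can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the comparison against `min_eigen_threshold` is taken to be non-strict, off-diagonal entries are bounded in absolute value, and both helper names are hypothetical.

```julia
using LinearAlgebra

# Hypothetical helpers for illustration only (not part of HighFrequencyCovariance).
# Assumption: the smallest eigenvalue must be at least the threshold.
sketch_is_psd(mat::Hermitian, min_eigen_threshold::Real = 0.0) =
    minimum(eigvals(mat)) >= min_eigen_threshold

function sketch_valid_correlation_matrix(mat::Hermitian, min_eigen_threshold::Real = 0.0)
    has_unit_diagonal = all(isapprox.(diag(mat), 1.0))
    # Assumption: off-diagonal entries are checked in absolute value.
    offdiag_ok = all(abs(mat[i, j]) <= 1 for i in axes(mat, 1), j in axes(mat, 2) if i != j)
    return sketch_is_psd(mat, min_eigen_threshold) && has_unit_diagonal && offdiag_ok
end

good = Hermitian([1.0 0.5; 0.5 1.0])   # eigenvalues 0.5 and 1.5
bad = Hermitian([1.0 1.2; 1.2 1.0])    # eigenvalues -0.2 and 2.2
good_is_valid = sketch_valid_correlation_matrix(good)
bad_is_valid = sketch_valid_correlation_matrix(bad)
```

Raising `min_eigen_threshold` makes the test stricter: `good` passes at the default of 0.0 but would fail with a threshold of 0.6.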
HighFrequencyCovariance.relabel — Function
relabel(covar::CovarianceMatrix, mapping::Dict{Symbol,Symbol})
This relabels a CovarianceMatrix struct to give all the assets alternative names.
Inputs
- `covar` - The `CovarianceMatrix` object you want to relabel.
- `mapping` - A dict mapping from the names you have to the names you want.
Returns
- A `CovarianceMatrix` the same as the one you input but with new labels.
relabel(covar::CovarianceModel, mapping::Dict{Symbol,Symbol})
This relabels a CovarianceModel struct to give all the assets alternative names.
Inputs
- `covar` - The `CovarianceModel` object you want to relabel.
- `mapping` - A dict mapping from the names you have to the names you want.
Returns
- A `CovarianceModel` the same as the one you input but with new labels.
Blocking and Regularisation Functions
HighFrequencyCovariance.put_assets_into_blocks_by_trading_frequency — Function
put_assets_into_blocks_by_trading_frequency(
    ts::SortedDataFrame,
    obs_multiple_for_new_block::Real,
    func::Symbol,
    optional_parameters::NamedTuple = NamedTuple(),
)
This makes a DataFrame that describes how to estimate the covariance matrix blockwise.
Inputs
- `ts` - The tick data.
- `obs_multiple_for_new_block` - The relative number of ticks needed before a new block is made. So if this is 1.2 that means a new group is made when one asset has 20% or more ticks than the slowest traded asset in the previous block.
- `func` - A symbol representing the covariance estimation function to be used.
- `optional_parameters` - Optional parameters to be used in the `func` function.
Returns
- A `DataFrame` representing what estimations should be performed. The order of rows in the `DataFrame` shows the order of estimation.
References
Hautsch, N., Kyj, L.M. and Oomen, R.C.A. (2012). A blocking and regularization approach to high-dimensional realized covariance estimation. J. Appl. Econ., 27: 625-645.
HighFrequencyCovariance.blockwise_estimation — Function
blockwise_estimation(ts::SortedDataFrame, blocking_frame::DataFrame)
Run a series of covariance estimations and combine the results. Two things should be input: a SortedDataFrame with the price update data and a DataFrame describing what estimations should be performed. The latter should be of the same form as the output of put_assets_into_blocks_by_trading_frequency (although the actual estimations can be customised to differ from what that function outputs).
Inputs
- `ts` - The tick data.
- `blocking_frame` - A `DataFrame` representing what estimations to do and in what order. This is often one generated by the `put_assets_into_blocks_by_trading_frequency` function (and potentially then modified).
Returns
- A `CovarianceMatrix`.
HighFrequencyCovariance.make_adjacent_block_sequence — Function
make_adjacent_block_sequence(blocks::Vector{Vector{Symbol}})
This makes a sequence of adjacent blocks.
Inputs
- `blocks` - The blocks for blockwise estimation.
Returns
- A `Vector`.
HighFrequencyCovariance.make_sorted_adjacent_block_sequence — Function
make_sorted_adjacent_block_sequence(blocks::Vector{Vector{Symbol}})
This makes a sequence of adjacent blocks and then sorts them by length.
Inputs
- `blocks` - The blocks for blockwise estimation.
Returns
- A `Vector`.
HighFrequencyCovariance.put_assets_into_blocks — Function
put_assets_into_blocks(ts::SortedDataFrame, new_group_mult::Real)
This splits assets into separate blocks depending on their number of ticks.
Inputs
- `ts` - The tick data.
- `new_group_mult` - The relative number of ticks needed before a new block is made. So if this is 1.2 that means a new group is made when one asset has 20% or more ticks than the slowest traded asset in the previous block.
Returns
- A `DataFrame`.
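To make the `new_group_mult` rule concrete, here is a simplified sketch that groups assets by tick count. It is one reading of the rule, not the package's algorithm: assets are sorted from slowest to fastest traded, and a new block starts once an asset has at least `new_group_mult` times the ticks of the slowest asset in the block being filled. The function name, the plain `Dict` input and the vector-of-vectors output are all assumptions for illustration (the real function takes a `SortedDataFrame` and returns a `DataFrame`).

```julia
# Hypothetical helper for illustration only (not part of HighFrequencyCovariance).
function sketch_put_assets_into_blocks(tick_counts::Dict{Symbol,<:Integer}, new_group_mult::Real)
    # Order assets from slowest to fastest traded.
    sorted_assets = sort(collect(keys(tick_counts)); by = a -> tick_counts[a])
    blocks = Vector{Vector{Symbol}}()
    block_floor = 0                      # ticks of the slowest asset in the current block
    for asset in sorted_assets
        n = tick_counts[asset]
        if isempty(blocks) || n >= new_group_mult * block_floor
            push!(blocks, [asset])       # start a new block
            block_floor = n
        else
            push!(blocks[end], asset)    # similar trading frequency, same block
        end
    end
    return blocks
end

counts = Dict(:A => 100, :B => 110, :C => 125, :D => 300)
blocks = sketch_put_assets_into_blocks(counts, 1.2)
```

With a multiple of 1.2, `:B` (110 ticks) stays with `:A` (100 ticks) because 110 is below 120, while `:C` and `:D` each open a new block.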
Monte Carlo
HighFrequencyCovariance.generate_random_path — Function
generate_random_path(
dimensions::Integer,
ticks::Integer;
syncronous::Bool = false,
rng::Union{MersenneTwister,StableRNG} = MersenneTwister(1),
vol_dist::Distribution = Uniform(
0.1 / sqrt(252 * 8 * 3600),
0.5 / sqrt(252 * 8 * 3600),
),
refresh_rate_dist::Distribution = Uniform(0.5, 5.0),
time_period_per_unit::Dates.Period = Second(1),
micro_noise_dist::Distribution = Uniform(
vol_dist.a * sqrt(time_period_ratio(Minute(5), time_period_per_unit)),
vol_dist.b * sqrt(time_period_ratio(Minute(5), time_period_per_unit)),
),
assets::Union{Vector,Missing} = missing,
brownian_corr_matrix::Union{Hermitian,Missing} = missing,
vols::Union{Vector,Missing} = missing,
rng_timing::Union{MersenneTwister,StableRNG} = MersenneTwister(1),
)
Generate a random path of price updates with a specified number of dimensions and ticks. There are options for whether the data is synchronous or asynchronous, the volatility of the price processes, the refresh rate on the (exponential) arrival times of price updates, and the minimum and maximum microstructure noise.
Note the defaults are chosen to reflect a high-cap stock with annualised volatility between 10% and 50%. The standard deviation of microstructure noise is of the same order of magnitude as the 5-minute standard deviation of returns, that is vol * sqrt(60*5) if vol is in seconds. Ticks are refreshed every 0.5-5 seconds (in expectation).
Inputs
- `dimensions` - The number of assets.
- `ticks` - The number of ticks to produce.
- `syncronous` - Should ticks be synchronous (for each asset) or asynchronous.
- `rng` - The `Random.MersenneTwister` or `StableRNGs.StableRNG` used for random number generation.
- `vol_dist` - The distribution to draw volatilities from (only used if `vols` is missing).
- `refresh_rate_dist` - The distribution to draw refresh rates (exponential distribution rates) from. Note if you want all intervals to be evenly spaced you can do something like DiscreteUniform(1,1).
- `time_period_per_unit` - The time period that the time column corresponds to.
- `micro_noise_dist` - The distribution that assetwise microstructure noise standard deviations are drawn from.
- `assets` - The names of the assets that you want to use. The length of this must be equal to the `dimensions` input.
- `brownian_corr_matrix` - The correlation matrix to use. This is sampled from the Inverse Wishart distribution if none is input.
- `vols` - The volatilities to use. These are sampled from `vol_dist` if none are input.
Returns
- A `SortedDataFrame` of tick data.
- A `CovarianceMatrix` representing the true data generation process used in making the tick data.
- A `Dict` of microstructure noise variances for each asset.
- A `Dict` of update rates for each asset.
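The default volatility and noise bounds follow from simple arithmetic, worked through below. The snippet only reproduces the numbers in the default arguments, assuming 252 trading days of 8 hours and a time unit of one second; the variable names are illustrative.

```julia
# Seconds in a trading year under the defaults: 252 days of 8 hours each.
seconds_per_year = 252 * 8 * 3600

# Per-second volatilities matching 10% and 50% annualised volatility.
vol_low = 0.1 / sqrt(seconds_per_year)
vol_high = 0.5 / sqrt(seconds_per_year)

# Microstructure noise bounds: the standard deviation of a 5-minute return,
# which is the per-second volatility times sqrt(60 * 5).
five_minute_scale = sqrt(60 * 5)
noise_low = vol_low * five_minute_scale
noise_high = vol_high * five_minute_scale
```

Scaling `vol_low` back up by `sqrt(seconds_per_year)` recovers the 10% annualised figure.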
StochasticIntegrals.ItoSet — Type
StochasticIntegrals.ItoSet(covariance_matrix::CovarianceMatrix{<:Real})
Convert a CovarianceMatrix into an ItoSet from the StochasticIntegrals package. That package can then be used to do things like generate draws from the multivariate Gaussian corresponding to the covariance matrix.
Inputs
- `covariance_matrix` - The `CovarianceMatrix` that you want to convert into a `StochasticIntegrals.ItoSet`.
Returns
- A `StochasticIntegrals.ItoSet` struct.
Example
using Dates
covar = CovarianceMatrix(make_random_psd_matrix_from_wishart(5), rand(5), [:A,:B,:C,:D,:E], Dates.Hour(1))
iset = ItoSet(covar)
# To see how this is used for something useful you can look at the get_draws function.
StochasticIntegrals.get_draws — Function
StochasticIntegrals.get_draws(
    covariance_matrix::CovarianceMatrix{<:Real},
    num::Integer;
    number_generator::NumberGenerator = Mersenne(
        MersenneTwister(1234),
        length(covariance_matrix.labels),
    ),
    antithetic_variates = false,
)
Get pseudorandom draws from a CovarianceMatrix struct. This is basically a convenience wrapper over `StochasticIntegrals.get_draws` which does the necessary constructing of the structs of that package. If the `antithetic_variates` control is set to true then every second set of draws will be antithetic to the previous. If you want to do something like Sobol sampling you can change the `number_generator`. See StochasticIntegrals to see what is available (and feel free to make new ones and put in pull requests).
Inputs
- `covariance_matrix` - A `CovarianceMatrix` struct that you want to draw from.
- `num` - The number of draws you want.
- `number_generator` - A `NumberGenerator` struct that can be queried for a series of unit interval vectors that are then transformed by the covariance matrix into draws.
- `antithetic_variates` - A boolean indicating if antithetic variates should be used (every second draw is made from 1 - uniform draw of the previous).
Returns
- A `Vector` of `Dict`s of draws. Note you can convert this to a dataframe or array with `StochasticIntegrals.to_dataframe` or `StochasticIntegrals.to_array`.
StochasticIntegrals.get_draws(
    covariance_model::CovarianceModel{<:Real},
    num::Integer;
    number_generator::NumberGenerator = Mersenne(
        MersenneTwister(1234),
        length(covariance_matrix.labels),
    ),
    antithetic_variates = false,
)
Get pseudorandom draws from a CovarianceModel struct. This is basically a convenience wrapper over `StochasticIntegrals.get_draws` which does the necessary constructing of the structs of that package. If the `antithetic_variates` control is set to true then every second set of draws will be antithetic to the previous. If you want to do something like Sobol sampling you can change the `number_generator`. See StochasticIntegrals to see what is available (and feel free to make new ones and put in pull requests).
Inputs
- `covariance_model` - A `CovarianceModel` struct that you want to draw from.
- `num` - The number of draws you want.
- `number_generator` - A `NumberGenerator` struct that can be queried for a series of unit interval vectors that are then transformed by the covariance matrix into draws.
- `antithetic_variates` - A boolean indicating if antithetic variates should be used (every second draw is made from 1 - uniform draw of the previous).
Returns
- A `Vector` of `Dict`s of draws. Note you can convert this to a dataframe or array with `StochasticIntegrals.to_dataframe` or `StochasticIntegrals.to_array`.
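The antithetic option described above pairs each unit-interval draw with its mirror image. The sketch below shows only that pairing step on raw uniforms, using the standard library; it does not call StochasticIntegrals, it omits the transformation into Gaussian draws, and the helper name is hypothetical.

```julia
using Random

# Hypothetical helper for illustration only (not part of StochasticIntegrals).
function sketch_antithetic_uniforms(rng::AbstractRNG, dims::Integer, num::Integer)
    draws = Vector{Vector{Float64}}()
    while length(draws) < num
        u = rand(rng, dims)                          # a fresh unit-interval vector
        push!(draws, u)
        # Every second draw is 1 minus the previous uniform draw.
        length(draws) < num && push!(draws, 1 .- u)
    end
    return draws
end

draws = sketch_antithetic_uniforms(MersenneTwister(1234), 3, 4)
```

Each consecutive pair sums to a vector of ones, which is what makes the second draw antithetic to the first.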
HighFrequencyCovariance.convert_to_stochastic_integrals_type — Function
convert_to_stochastic_integrals_type(x::MersenneTwister, num::Integer)
convert_to_stochastic_integrals_type(x::StableRNG, num::Integer)
This makes either a `StochasticIntegrals.Mersenne` or `StochasticIntegrals.Stable_RNG` type depending on what random number generator is input.
For getting a DataFrame version of a CovarianceMatrix and vice versa.
DataFrames.DataFrame — Type
DataFrames.DataFrame(
    covar::CovarianceMatrix,
    othercols::Dict = Dict{Symbol,Any}();
    delete_duplicate_correlations::Bool = true,
)
Convert a CovarianceMatrix to a DataFrame format.
Inputs
- `covar` - The `CovarianceMatrix`.
- `othercols` - This adds columns to the `DataFrame`. For instance if it is `Dict{Symbol,String}([:pc] .=> ["Fred's PC"])`, then there will be a column indicating that this estimation was done on Fred's PC.
- `delete_duplicate_correlations` - Should the redundant correlations be deleted (as correlation matrices are symmetric, half the entries duplicate information).
Returns
- A `DataFrame`.
DataFrames.DataFrame(
    covar::CovarianceModel,
    othercols::Dict = Dict{Symbol,Any}();
    delete_duplicate_correlations::Bool = true,
)
Convert a CovarianceModel to a DataFrame format.
Inputs
- `covar` - The `CovarianceModel`.
- `othercols` - This adds columns to the `DataFrame`. For instance if it is `Dict{Symbol,String}([:pc] .=> ["Fred's PC"])`, then there will be a column indicating that this estimation was done on Fred's PC.
- `delete_duplicate_correlations` - Should the redundant correlations be deleted (as correlation matrices are symmetric, half the entries duplicate information).
Returns
- A `DataFrame`.
HighFrequencyCovariance.CovarianceMatrix — Type
CovarianceMatrix(correlation::Hermitian{R},
                 volatility::Vector{R},
                 labels::Vector{Symbol}) where R<:Real
This struct stores three elements: a Hermitian correlation matrix, a vector of volatilities and a vector of labels. The order of the labels matches the order of the assets in the volatility vector and correlation matrix. The default constructor is used.
Inputs
- `correlation` - A `Hermitian` correlation matrix.
- `volatility` - Volatilities for each asset.
- `labels` - The labels for the `correlation` and `volatility` members. The n'th entry of the `labels` vector should contain the name of the asset that has its volatility in the n'th entry of the `volatility` member and its correlations in the n'th row/column of the `correlation` member.
- `time_period_per_unit` - The period that one unit of volatility corresponds to.
Returns
- A `CovarianceMatrix`.