Helper Functions
DataFrames.DataFrame
HighFrequencyCovariance.CovarianceMatrix
StochasticIntegrals.ItoSet
HighFrequencyCovariance.blockwise_estimation
HighFrequencyCovariance.combine_covariance_matrices
HighFrequencyCovariance.construct_matrix_from_eigen
HighFrequencyCovariance.convert_to_stochastic_integrals_type
HighFrequencyCovariance.cor_to_cov
HighFrequencyCovariance.cov_to_cor
HighFrequencyCovariance.cov_to_cor_and_vol
HighFrequencyCovariance.covariance
HighFrequencyCovariance.duration
HighFrequencyCovariance.generate_random_path
HighFrequencyCovariance.get_assets
HighFrequencyCovariance.get_correlation
HighFrequencyCovariance.get_returns
HighFrequencyCovariance.get_volatility
HighFrequencyCovariance.is_psd_matrix
HighFrequencyCovariance.make_adjacent_block_sequence
HighFrequencyCovariance.make_nan_covariance_matrix
HighFrequencyCovariance.make_sorted_adjacent_block_sequence
HighFrequencyCovariance.put_assets_into_blocks
HighFrequencyCovariance.put_assets_into_blocks_by_trading_frequency
HighFrequencyCovariance.rearrange
HighFrequencyCovariance.relabel
HighFrequencyCovariance.subset_to_tick
HighFrequencyCovariance.subset_to_time
HighFrequencyCovariance.ticks_per_asset
HighFrequencyCovariance.valid_correlation_matrix
StochasticIntegrals.get_draws
Working with SortedDataFrame structs
HighFrequencyCovariance.get_assets
— Function
get_assets(ts::SortedDataFrame, obs_to_include::Integer = 10)
Returns a vector of all of the assets in the SortedDataFrame with at least some number of observations (10 by default).
Inputs
ts - The tick data.
obs_to_include - The minimum number of ticks an asset needs for the function to include it.
Returns
- A Vector{Symbol} with each asset.
HighFrequencyCovariance.ticks_per_asset
— Function
ticks_per_asset(ts::SortedDataFrame, assets::Vector{Symbol} = get_assets(ts))
Count the number of observations for each asset.
Inputs
ts - The tick data.
assets - A vector of asset Symbols.
Returns
- A Dict with the number of observations for each input asset.
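The counting behind ticks_per_asset and the threshold applied by get_assets can be sketched in plain Julia. This is only an illustration on a bare vector of asset names, not the package's implementation: sketch_ticks_per_asset and sketch_get_assets are hypothetical names, and a real SortedDataFrame holds more than the names used here.

```julia
# Hypothetical stand-ins that work on a plain vector of asset names (one entry per tick).
function sketch_ticks_per_asset(names::Vector{Symbol})
    counts = Dict{Symbol,Int}()
    for name in names
        counts[name] = get(counts, name, 0) + 1   # one more tick for this asset
    end
    return counts
end

# Keep only assets with at least obs_to_include ticks (sorted for a stable order).
function sketch_get_assets(names::Vector{Symbol}, obs_to_include::Integer = 10)
    counts = sketch_ticks_per_asset(names)
    return sort([asset for (asset, n) in counts if n >= obs_to_include])
end

names = vcat(fill(:A, 12), fill(:B, 3), fill(:C, 10))
counts = sketch_ticks_per_asset(names)
assets = sketch_get_assets(names)
```

With the default threshold of 10, asset :B (3 ticks) is dropped while :A and :C are kept.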
HighFrequencyCovariance.duration
— Function
duration(ts::SortedDataFrame; in_dates_period::Bool = true)
The time elapsed between the first and the last tick in a SortedDataFrame.
Inputs
ts - Tick data.
in_dates_period - If true the result is in Dates.Period format; otherwise it is just the numeric difference between the first and last tick.
Returns
- A scalar representing this duration.
HighFrequencyCovariance.subset_to_tick
— Function
subset_to_tick(ts::SortedDataFrame, n::Integer)
This subsets a SortedDataFrame to only the first n ticks.
Inputs
ts - Tick data.
n - How many ticks to subset to.
Returns
- A (smaller) SortedDataFrame.
HighFrequencyCovariance.subset_to_time
— Function
subset_to_time(ts::SortedDataFrame, totime::Real)
This subsets a SortedDataFrame to only the observations up until some time.
Inputs
ts - Tick data.
totime - Up to what time.
Returns
- A (smaller) SortedDataFrame.
Working with CovarianceMatrix structs
HighFrequencyCovariance.covariance
— Function
covariance(
    cm::CovarianceMatrix,
    period::Dates.Period = cm.time_period_per_unit,
    assets::Vector{Symbol} = cm.labels,
)
This makes a Hermitian matrix for the covariance matrix over some duration.
Inputs
cm - A CovarianceMatrix struct.
period - The duration for which you want a covariance matrix. This should be a Dates.Period.
assets - Which assets to include in the covariance matrix.
Returns
- A Hermitian. The labelling of assets for each row/column is as per the input assets vector.
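To make the role of the period argument concrete, here is a minimal sketch of how a correlation matrix and per-unit volatilities turn into a covariance matrix over a longer duration. It assumes the usual Brownian scaling (variance grows linearly with time) and works on plain arrays; scaled_covariance is a hypothetical helper, not the package's implementation.

```julia
using LinearAlgebra

# Hypothetical helper: `units` is the requested duration measured in multiples of
# the time period that the volatilities are quoted in.
function scaled_covariance(cor::AbstractMatrix, vols::Vector{<:Real}, units::Real)
    sdevs = vols .* sqrt(units)                 # standard deviations over the requested duration
    return Hermitian(cor .* (sdevs * sdevs'))   # entry (i, j) = cor[i, j] * sdevs[i] * sdevs[j]
end

cor = [1.0 0.5; 0.5 1.0]
vols = [0.01, 0.02]                          # volatility per unit of time (say, per second)
cov60 = scaled_covariance(cor, vols, 60)     # covariance over 60 units of time
```

Under this assumption the variances are 60 times the per-unit variances, and the off-diagonal entry scales by the same factor.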
HighFrequencyCovariance.get_correlation
— Function
get_correlation(covar::CovarianceMatrix, asset1::Symbol, asset2::Symbol)
Extract the correlation between two assets stored in a CovarianceMatrix.
Inputs
covar - A CovarianceMatrix.
asset1 - A Symbol representing an asset.
asset2 - A Symbol representing an asset.
Returns
- A Scalar (the correlation coefficient).
get_correlation(covar::CovarianceModel, asset1::Symbol, asset2::Symbol)
Extract the correlation between two assets stored in a CovarianceModel.
Inputs
covar - A CovarianceModel.
asset1 - A Symbol representing an asset.
asset2 - A Symbol representing an asset.
Returns
- A Scalar (the correlation coefficient).
HighFrequencyCovariance.get_volatility
— Function
get_volatility(
    covar::CovarianceMatrix,
    asset1::Symbol,
    time_period_per_unit::Dates.Period = covar.time_period_per_unit,
)
Get the volatility for a stock from a CovarianceMatrix.
Inputs
covar - A CovarianceMatrix.
asset1 - A Symbol representing an asset.
time_period_per_unit - The time interval the volatilities will be for.
Returns
- A Scalar (the volatility).
get_volatility(
    covar::CovarianceModel,
    asset1::Symbol,
    time_period_per_unit::Dates.Period = covar.time_period_per_unit,
)
Get the volatility for a stock from a CovarianceModel.
Inputs
covar - A CovarianceModel.
asset1 - A Symbol representing an asset.
time_period_per_unit - The time interval the volatilities will be for.
Returns
- A Scalar (the volatility).
HighFrequencyCovariance.make_nan_covariance_matrix
— Function
make_nan_covariance_matrix(
    labels::Vector{Symbol},
    time_period_per_unit::Dates.Period,
)
This makes an empty CovarianceMatrix struct with all volatilities and correlations being NaNs.
Inputs
labels - The asset names for this (empty) CovarianceMatrix.
time_period_per_unit - The time interval the volatilities will be for.
Returns
- An (empty) CovarianceMatrix.
HighFrequencyCovariance.combine_covariance_matrices
— Function
combine_covariance_matrices(
    vect::Vector{CovarianceMatrix{T}},
    cor_weights::Vector{<:Real} = repeat([1.0], length(vect)),
    vol_weights::Vector{<:Real} = cor_weights,
    time_period_per_unit::Union{Missing,Dates.Period} = vect[1].time_period_per_unit,
) where T<:Real
Combines a vector of CovarianceMatrix structs into one CovarianceMatrix struct.
Inputs
vect - A vector of CovarianceMatrix structs.
cor_weights - A vector for how much to weight the correlations from each covariance matrix (by default they will be equal-weighted).
vol_weights - A vector for how much to weight the volatilities from each covariance matrix (by default they will be equal-weighted).
time_period_per_unit - What time period the volatilities should be scaled to.
Returns
- A CovarianceMatrix.
HighFrequencyCovariance.rearrange
— Function
rearrange(
    cm::CovarianceMatrix,
    labels::Vector{Symbol},
    time_period_per_unit::Union{Missing,Dates.Period} = cm.time_period_per_unit,
)
Rearrange the order of labels in a CovarianceMatrix.
Inputs
cm - A CovarianceMatrix.
labels - A Vector of labels.
time_period_per_unit - The time period you want for the resultant CovarianceMatrix.
Returns
- A CovarianceMatrix.
HighFrequencyCovariance.cov_to_cor
— Function
cov_to_cor(mat::AbstractMatrix)
Converts a matrix (representing a covariance matrix) into a Hermitian correlation matrix and a vector of standard deviations.
Inputs
mat - A matrix (representing a covariance matrix).
Returns
- A Hermitian.
- A Vector of standard deviations (not volatilities).
HighFrequencyCovariance.cor_to_cov
— Function
cor_to_cov(cor::AbstractMatrix, sdevs::Vector{<:Real})
Converts a correlation matrix and some standard deviations into a Hermitian covariance matrix.
Inputs
cor - A correlation matrix.
sdevs - A vector of standard deviations (not volatilities).
Returns
- A Hermitian.
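The two conversions above are inverses of each other, which a short sketch on plain arrays can show. The functions sketch_cov_to_cor and sketch_cor_to_cov are hypothetical stand-ins written with the standard library only. They mirror the documented inputs and outputs but are not the package's own code.

```julia
using LinearAlgebra

# Hypothetical stand-in: split a covariance matrix into correlations and standard deviations.
function sketch_cov_to_cor(mat::AbstractMatrix)
    sdevs = sqrt.(diag(mat))            # standard deviations from the diagonal
    cor = mat ./ (sdevs * sdevs')       # divide entry (i, j) by sdevs[i] * sdevs[j]
    return Hermitian(cor), sdevs
end

# Hypothetical stand-in: rebuild the covariance matrix from correlations and standard deviations.
sketch_cor_to_cov(cor::AbstractMatrix, sdevs::Vector{<:Real}) =
    Hermitian(cor .* (sdevs * sdevs'))

covmat = [4.0 1.2; 1.2 9.0]
cor, sdevs = sketch_cov_to_cor(covmat)
roundtrip = sketch_cor_to_cov(cor, sdevs)
```

Here the standard deviations are 2 and 3, the correlation is 1.2 / (2 * 3) = 0.2, and the round trip recovers the original matrix.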
HighFrequencyCovariance.cov_to_cor_and_vol
— Function
cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix::Dates.Period,
    duration_for_desired_vols::Dates.Period,
)
cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix::Real,
    duration_for_desired_vols::Real,
)
Converts a matrix (representing a covariance matrix) into a Hermitian correlation matrix and a vector of volatilities.
Inputs
mat - A matrix (representing a covariance matrix).
duration_of_covariance_matrix - The duration of the covariance matrix. If the durations are input as reals they must have the same units.
duration_for_desired_vols - The duration you want a volatility for. If the durations are input as reals they must have the same units.
Returns
- A Hermitian.
- A Vector of volatilities.
cov_to_cor_and_vol(
    mat::AbstractMatrix,
    duration_of_covariance_matrix_in_natural_units::Real,
)
Inputs
mat - A matrix (representing a covariance matrix).
duration_of_covariance_matrix_in_natural_units - The duration of the covariance matrix. The duration must be input in units that you know (for instance the time_period_per_unit of a SortedDataFrame).
Returns
- A Hermitian.
- A Vector of volatilities.
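The difference between standard deviations and volatilities is the time scaling. The sketch below assumes Brownian scaling (standard deviation grows with the square root of time) and takes both durations as reals in the same units. The name sketch_cov_to_cor_and_vol is hypothetical and the code is an illustration, not the package's implementation.

```julia
using LinearAlgebra

# Hypothetical stand-in: cov_duration and vol_duration must be in the same units.
function sketch_cov_to_cor_and_vol(mat::AbstractMatrix, cov_duration::Real, vol_duration::Real)
    sdevs = sqrt.(diag(mat))                              # standard deviations over cov_duration
    cor = Hermitian(mat ./ (sdevs * sdevs'))              # correlations do not depend on the duration
    vols = sdevs .* sqrt(vol_duration / cov_duration)     # rescale to the desired duration
    return cor, vols
end

hourly_cov = [0.0036 0.0018; 0.0018 0.0144]               # covariance over 3600 seconds
cor, vols = sketch_cov_to_cor_and_vol(hourly_cov, 3600, 1)
```

The hourly standard deviations of 0.06 and 0.12 become per-second volatilities of 0.001 and 0.002, while the correlation stays at 0.25.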
HighFrequencyCovariance.construct_matrix_from_eigen
— Function
construct_matrix_from_eigen(
    eigenvalues::Vector{<:Real},
    eigenvectors::Matrix{<:Real},
)
Constructs a matrix from its eigenvalue decomposition.
Inputs
eigenvalues - A vector of eigenvalues.
eigenvectors - A matrix of eigenvectors. The i'th column corresponds to the i'th eigenvalue.
Returns
- A Matrix.
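As a sketch of what reconstructing a matrix from its eigenvalue decomposition means, the example below rebuilds a symmetric matrix as V * Diagonal(λ) * V'. It assumes the eigenvectors are orthonormal, as they are when they come from a symmetric matrix. The name sketch_from_eigen is hypothetical and this is not necessarily how the package does it.

```julia
using LinearAlgebra

# Hypothetical stand-in: column i of vecs is the eigenvector for vals[i].
sketch_from_eigen(vals::Vector{<:Real}, vecs::Matrix{<:Real}) =
    vecs * Diagonal(vals) * vecs'

A = [2.0 0.5; 0.5 1.0]
decomp = eigen(Symmetric(A))
rebuilt = sketch_from_eigen(decomp.values, decomp.vectors)

# Changing the eigenvalues before rebuilding, for instance flooring them at zero,
# is one way such a reconstruction is typically used in regularisation.
floored = sketch_from_eigen(max.(decomp.values, 0.0), decomp.vectors)
```

Because A is already positive definite, flooring the eigenvalues at zero leaves it unchanged.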
HighFrequencyCovariance.get_returns
— Function
get_returns(dd::DataFrame; rescale_for_duration::Bool = false)
Converts a long format DataFrame of prices into a DataFrame of returns.
Inputs
dd - A DataFrame with a column called :Time and all other columns being asset prices in each period.
rescale_for_duration - Should returns be rescaled to reflect a common time interval.
Returns
- A DataFrame of returns.
HighFrequencyCovariance.valid_correlation_matrix
— Function
valid_correlation_matrix(mat::Hermitian, min_eigen_threshold::Real = 0.0)
valid_correlation_matrix(covar::CovarianceMatrix, min_eigen_threshold::Real = 0.0)
valid_correlation_matrix(covar::CovarianceModel, min_eigen_threshold::Real = 0.0)
Test if a Hermitian matrix is a valid correlation matrix. This is done by testing if it is psd, if it has a unit diagonal and if all other elements are less than one. If a Hermitian is input then it will be tested. If a CovarianceMatrix is input then its correlation matrix will be tested.
Inputs
mat - A Hermitian matrix or a CovarianceMatrix.
min_eigen_threshold - How big does the smallest eigenvalue have to be.
Returns
- A Bool that is true if mat is a valid correlation matrix and false if not.
HighFrequencyCovariance.is_psd_matrix
— Function
is_psd_matrix(mat::Hermitian, min_eigen_threshold::Real = 0.0)
is_psd_matrix(covar::CovarianceMatrix)
Test if a matrix is psd (Positive Semi-Definite). This is done by seeing if all eigenvalues are positive. If a Hermitian is input then it will be tested. If a CovarianceMatrix is input then its correlation matrix will be tested.
Inputs
mat - A Hermitian matrix or a CovarianceMatrix.
min_eigen_threshold - How big does the smallest eigenvalue have to be.
Returns
- A Bool that is true if mat is psd and false if not.
is_psd_matrix(covar::CovarianceModel)
Test if a matrix is psd (Positive Semi-Definite). This is done by seeing if all eigenvalues are positive. If a Hermitian is input then it will be tested. If a CovarianceModel is input then its correlation matrix will be tested.
Inputs
mat - A CovarianceModel.
min_eigen_threshold - How big does the smallest eigenvalue have to be.
Returns
- A Bool that is true if mat is psd and false if not.
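The checks described for is_psd_matrix and valid_correlation_matrix can be sketched with the standard library. The functions below are hypothetical stand-ins. Whether the package compares eigenvalues with a strict or non-strict inequality, and what tolerance it uses for the unit diagonal, are assumptions made here for illustration.

```julia
using LinearAlgebra

# Assumed rule: the smallest eigenvalue must be at least the threshold.
sketch_is_psd(mat::Hermitian, min_eigen_threshold::Real = 0.0) =
    minimum(eigvals(mat)) >= min_eigen_threshold

# Assumed rule: unit diagonal, entries bounded by one in absolute value, and psd.
function sketch_valid_correlation(mat::Hermitian, min_eigen_threshold::Real = 0.0)
    unit_diagonal = all(isapprox.(diag(mat), 1.0))
    bounded = all(abs.(mat) .<= 1.0 + 1e-12)
    return unit_diagonal && bounded && sketch_is_psd(mat, min_eigen_threshold)
end

good = Hermitian([1.0 0.3; 0.3 1.0])
# Every entry is a plausible correlation, but the three pairwise values are
# mutually inconsistent, so the matrix has a negative eigenvalue.
bad = Hermitian([1.0 0.9 -0.9; 0.9 1.0 0.9; -0.9 0.9 1.0])
```

The second matrix shows why the eigenvalue test matters: a unit diagonal and bounded entries are not enough on their own.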
HighFrequencyCovariance.relabel
— Function
relabel(covar::CovarianceMatrix, mapping::Dict{Symbol,Symbol})
This relabels a CovarianceMatrix struct to give all the assets alternative names.
Inputs
covar - The CovarianceMatrix object you want to relabel.
mapping - A dict mapping from the names you have to the names you want.
Returns
- A CovarianceMatrix the same as the one you input but with new labels.
relabel(covar::CovarianceModel, mapping::Dict{Symbol,Symbol})
This relabels a CovarianceModel struct to give all the assets alternative names.
Inputs
covar - The CovarianceModel object you want to relabel.
mapping - A dict mapping from the names you have to the names you want.
Returns
- A CovarianceModel the same as the one you input but with new labels.
Blocking and Regularisation Functions
HighFrequencyCovariance.put_assets_into_blocks_by_trading_frequency
— Function
put_assets_into_blocks_by_trading_frequency(
    ts::SortedDataFrame,
    obs_multiple_for_new_block::Real,
    func::Symbol,
    optional_parameters::NamedTuple = NamedTuple(),
)
This makes a DataFrame that describes how to estimate the covariance matrix blockwise.
Inputs
ts - The tick data.
obs_multiple_for_new_block - The relative number of ticks needed before a new block is made. So if this is 1.2, a new block is made when one asset has at least 20% more ticks than the slowest traded asset in the previous block.
func - A symbol representing the covariance estimation function to be used.
optional_parameters - Optional parameters to be used in the func function.
Returns
- A DataFrame representing what estimations should be performed. The order of rows in the DataFrame shows the order of estimation.
References
Hautsch, N., Kyj, L.M. and Oomen, R.C.A. (2012), A blocking and regularization approach to high-dimensional realized covariance estimation. J. Appl. Econ., 27: 625-645
HighFrequencyCovariance.blockwise_estimation
— Function
blockwise_estimation(ts::SortedDataFrame, blocking_frame::DataFrame)
Run a series of covariance estimations and combine the results. Two things should be input: a SortedDataFrame with the price update data and a DataFrame describing what estimations should be performed. This should be of the same form as is output by put_assets_into_blocks_by_trading_frequency (although the actual estimations can be customised to something different from what that function outputs).
Inputs
ts - The tick data.
blocking_frame - A DataFrame representing what estimations to do and in what order. This is often one generated by the put_assets_into_blocks_by_trading_frequency function (and potentially then modified).
Returns
- A CovarianceMatrix.
HighFrequencyCovariance.make_adjacent_block_sequence
— Function
make_adjacent_block_sequence(blocks::Vector{Vector{Symbol}})
This makes a sequence of adjacent blocks.
Inputs
blocks - The blocks for blockwise estimation.
Returns
- A Vector.
HighFrequencyCovariance.make_sorted_adjacent_block_sequence
— Function
make_sorted_adjacent_block_sequence(blocks::Vector{Vector{Symbol}})
This makes a sequence of adjacent blocks and then sorts them by length.
Inputs
blocks - The blocks for blockwise estimation.
Returns
- A Vector.
HighFrequencyCovariance.put_assets_into_blocks
— Function
put_assets_into_blocks(ts::SortedDataFrame, new_group_mult::Real)
This splits assets into separate blocks depending on their number of ticks.
Inputs
ts - The tick data.
new_group_mult - The relative number of ticks needed before a new block is made. So if this is 1.2, a new block is made when one asset has at least 20% more ticks than the slowest traded asset in the previous block.
Returns
- A DataFrame.
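The grouping rule described for new_group_mult can be illustrated on a plain dictionary of tick counts. This is only a sketch of the threshold rule as worded above. The name sketch_blocks is hypothetical, the output is a vector of blocks rather than the DataFrame the package returns, and the ordering from slowest to fastest traded is an assumption.

```julia
# Hypothetical stand-in: group assets so that a new block starts whenever an asset
# has at least new_group_mult times the ticks of the slowest asset in the current block.
function sketch_blocks(tick_counts::Dict{Symbol,Int}, new_group_mult::Real)
    ordered = sort(collect(tick_counts); by = last)   # slowest traded first
    blocks = Vector{Vector{Symbol}}()
    block_floor = 0                                    # ticks of the slowest asset in the current block
    for (asset, n) in ordered
        if isempty(blocks) || n >= new_group_mult * block_floor
            push!(blocks, [asset])                     # start a new block
            block_floor = n
        else
            push!(blocks[end], asset)                  # stay in the current block
        end
    end
    return blocks
end

counts = Dict(:A => 100, :B => 110, :C => 125, :D => 300, :E => 310)
blocks = sketch_blocks(counts, 1.2)
```

With a multiple of 1.2, :B (110 ticks) stays with :A (100 ticks), while :C (125 ticks) starts a new block because 125 is at least 1.2 * 100.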
Monte Carlo
HighFrequencyCovariance.generate_random_path
— Function
generate_random_path(
dimensions::Integer,
ticks::Integer;
syncronous::Bool = false,
rng::Union{MersenneTwister,StableRNG} = MersenneTwister(1),
vol_dist::Distribution = Uniform(
0.1 / sqrt(252 * 8 * 3600),
0.5 / sqrt(252 * 8 * 3600),
),
refresh_rate_dist::Distribution = Uniform(0.5, 5.0),
time_period_per_unit::Dates.Period = Second(1),
micro_noise_dist::Distribution = Uniform(
vol_dist.a * sqrt(time_period_ratio(Minute(5), time_period_per_unit)),
vol_dist.b * sqrt(time_period_ratio(Minute(5), time_period_per_unit)),
),
assets::Union{Vector,Missing} = missing,
brownian_corr_matrix::Union{Hermitian,Missing} = missing,
vols::Union{Vector,Missing} = missing,
rng_timing::Union{MersenneTwister,StableRNG} = MersenneTwister(1),
)
Generate a random path of price updates with a specified number of dimensions and ticks. There are options for whether the data is synchronous or asynchronous, the volatility of the price processes, the refresh rate on the (exponential) arrival times of price updates, and the minimum and maximum microstructure noise.
Note the defaults are chosen to reflect a high-cap stock with annualised volatility between 10% and 50%. The standard deviation of microstructure noise is of the same order of magnitude as the standard deviation of a 5 minute return, that is vol * sqrt(60*5) if vol is per second. Ticks are refreshed every 0.5-5 seconds (in expectation).
Inputs
dimensions - The number of assets.
ticks - The number of ticks to produce.
syncronous - Should ticks be synchronous (for each asset) or asynchronous.
rng - The Random.MersenneTwister or StableRNGs.StableRNG used for RNG.
vol_dist - The distribution to draw volatilities from (only used if vols is missing).
refresh_rate_dist - The distribution to draw refresh rates (exponential distribution rates) from. Note if you want all intervals to be evenly spaced you can do something like DiscreteUniform(1,1).
time_period_per_unit - What time period the time column should correspond to.
micro_noise_dist - The distribution that assetwise microstructure noise standard deviations are drawn from.
assets - The names of the assets that you want to use. The length of this must be equal to the dimensions input.
brownian_corr_matrix - The correlation matrix to use. This is sampled from the Inverse Wishart distribution if none is input.
vols - The volatilities to use. These are drawn from vol_dist if none are input.
Returns
- A SortedDataFrame of tick data.
- A CovarianceMatrix representing the true data generation process used in making the tick data.
- A Dict of microstructure noise variances for each asset.
- A Dict of update rates for each asset.
StochasticIntegrals.ItoSet
— Type
StochasticIntegrals.ItoSet(covariance_matrix::CovarianceMatrix{<:Real})
Convert a CovarianceMatrix into an ItoSet from the StochasticIntegrals package. This package can then be used to do things like generate draws from the Multivariate Gaussian corresponding to the covariance matrix.
Inputs
covariance_matrix - The CovarianceMatrix that you want to convert into a StochasticIntegrals.ItoSet.
Returns
- A StochasticIntegrals.ItoSet struct.
Example
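using Dates
using HighFrequencyCovariance
using StochasticIntegrals: ItoSet
# Build a CovarianceMatrix for five assets with volatilities quoted per hour.
covar = CovarianceMatrix(make_random_psd_matrix_from_wishart(5), rand(5), [:A,:B,:C,:D,:E], Dates.Hour(1))
iset = ItoSet(covar)
# To see how this is used for something useful you can look at the get_draws function.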
StochasticIntegrals.get_draws
— Function
StochasticIntegrals.get_draws(
covariance_matrix::CovarianceMatrix{<:Real},
num::Integer;
number_generator::NumberGenerator = Mersenne(
MersenneTwister(1234),
length(covariance_matrix.labels),
),
antithetic_variates = false,
)
Get pseudorandom draws from a CovarianceMatrix struct. This is basically a convenience wrapper over StochasticIntegrals.get_draws which does the necessary constructing of the structs of that package. If the antithetic_variates control is set to true then every second set of draws will be antithetic to the previous. If you want to do something like Sobol sampling you can change the number_generator. See StochasticIntegrals to see what is available (and feel free to make new ones and put in Pull Requests).
Inputs
covariance_matrix - A CovarianceMatrix struct that you want to draw from.
num - The number of draws you want.
number_generator - A NumberGenerator struct that can be queried for a series of unit interval vectors that are then transformed by the covariance matrix into draws.
antithetic_variates - A boolean indicating if antithetic variates should be used (every second draw is made from 1 - the uniform draw of the previous).
Returns
- A Vector of Dicts of draws. Note you can convert this to a dataframe or array with StochasticIntegrals.to_dataframe or StochasticIntegrals.to_array.
StochasticIntegrals.get_draws(
covariance_model::CovarianceModel{<:Real},
num::Integer;
number_generator::NumberGenerator = Mersenne(
MersenneTwister(1234),
length(covariance_matrix.labels),
),
antithetic_variates = false,
)
Get pseudorandom draws from a CovarianceModel struct. This is basically a convenience wrapper over StochasticIntegrals.get_draws which does the necessary constructing of the structs of that package. If the antithetic_variates control is set to true then every second set of draws will be antithetic to the previous. If you want to do something like Sobol sampling you can change the number_generator. See StochasticIntegrals to see what is available (and feel free to make new ones and put in Pull Requests).
Inputs
covariance_model - A CovarianceModel struct that you want to draw from.
num - The number of draws you want.
number_generator - A NumberGenerator struct that can be queried for a series of unit interval vectors that are then transformed by the covariance matrix into draws.
antithetic_variates - A boolean indicating if antithetic variates should be used (every second draw is made from 1 - the uniform draw of the previous).
Returns
- A Vector of Dicts of draws. Note you can convert this to a dataframe or array with StochasticIntegrals.to_dataframe or StochasticIntegrals.to_array.
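The idea of antithetic variates can be sketched with the standard library alone. Replacing a uniform draw u by 1 - u and mapping both through the normal quantile function gives standard normal draws z and -z, so the sketch below works directly with z and -z and a Cholesky factor. The name sketch_antithetic_draws is hypothetical, and this does not use or reproduce the StochasticIntegrals machinery.

```julia
using LinearAlgebra, Random

# Hypothetical stand-in: each ordinary draw is followed by its antithetic partner.
function sketch_antithetic_draws(covmat::AbstractMatrix, num_pairs::Integer;
                                 rng = MersenneTwister(1234))
    L = cholesky(Hermitian(covmat)).L          # covmat = L * L'
    draws = Vector{Vector{Float64}}()
    for _ in 1:num_pairs
        z = randn(rng, size(covmat, 1))        # independent standard normals
        push!(draws, L * z)                    # ordinary correlated draw
        push!(draws, L * (-z))                 # antithetic partner
    end
    return draws
end

covmat = [1.0 0.5; 0.5 2.0]
draws = sketch_antithetic_draws(covmat, 3)
```

Each pair of draws sums to zero, so the sample mean of the draws is exactly the true (zero) mean, which is the variance reduction that antithetic sampling is after.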
HighFrequencyCovariance.convert_to_stochastic_integrals_type
— Function
convert_to_stochastic_integrals_type(x::MersenneTwister, num::Integer)
convert_to_stochastic_integrals_type(x::StableRNG, num::Integer)
This makes either a StochasticIntegrals.Mersenne or StochasticIntegrals.Stable_RNG type depending on what random number generator is input.
For getting a DataFrame version of a CovarianceMatrix and vice versa.
DataFrames.DataFrame
— Type
DataFrames.DataFrame(
    covar::CovarianceMatrix,
    othercols::Dict = Dict{Symbol,Any}();
    delete_duplicate_correlations::Bool = true,
)
Convert a CovarianceMatrix to a DataFrame format.
Inputs
covar - The CovarianceMatrix.
othercols - This adds columns to the DataFrame. For instance if it is Dict{Symbol,String}([:pc] .=> ["Fred's PC"]), then there will be a column indicating that this estimation was done on Fred's PC.
delete_duplicate_correlations - Should the duplicated correlations be deleted (as correlation matrices are symmetric, half the entries duplicate information).
Returns
- A DataFrame.
DataFrames.DataFrame(
    covar::CovarianceModel,
    othercols::Dict = Dict{Symbol,Any}();
    delete_duplicate_correlations::Bool = true,
)
Convert a CovarianceModel to a DataFrame format.
Inputs
covar - The CovarianceModel.
othercols - This adds columns to the DataFrame. For instance if it is Dict{Symbol,String}([:pc] .=> ["Fred's PC"]), then there will be a column indicating that this estimation was done on Fred's PC.
delete_duplicate_correlations - Should the duplicated correlations be deleted (as correlation matrices are symmetric, half the entries duplicate information).
Returns
- A DataFrame.
HighFrequencyCovariance.CovarianceMatrix
— Type
CovarianceMatrix(correlation::Hermitian{R},
                 volatility::Vector{R},
                 labels::Vector{Symbol}) where R<:Real
This Struct stores three elements. A Hermitian correlation matrix, a vector of volatilities and a vector of labels. The order of the labels matches the order of the assets in the volatility vector and correlation matrix. The default constructor is used.
Inputs
correlation - A Hermitian correlation matrix.
volatility - Volatilities for each asset.
labels - The labels for the correlation and volatility members. The n'th entry of the labels vector should contain the name of the asset that has its volatility in the n'th entry of the volatility member and its correlations in the n'th row/column of the correlation member.
time_period_per_unit - The period that one unit of volatility corresponds to.
Returns
- A CovarianceMatrix.