Estimation Functions

Estimating Volatility

The estimate_volatility function is the main volatility estimation function. Either of the two estimation methods can be called by specifying :simple_volatility or :two_scales_volatility as the method argument in the estimate_volatility function. Alternatively the simple_volatility or two_scales_volatility functions can be called directly.

The simple_volatility function returns a Dict with the estimated volatility for each asset. The two_scales_volatility function, on the other hand, returns a tuple with a Dict of estimated volatilities in the first position and a Dict of estimated microstructure noise variances in the second. For uniformity of output, the estimate_volatility function returns a Dict with the estimated volatility for each asset regardless of which method is chosen.

If a user wants to calculate both volatilities and microstructure noises then they are advised to call the two_scales_volatility function directly rather than calling both estimate_volatility (with the :two_scales_volatility method argument) and estimate_microstructure_noise. The results are the same, but calling the two functions means everything is calculated twice.

HighFrequencyCovariance.estimate_volatility - Function
estimate_volatility(
    ts::SortedDataFrame,
    assets::Vector{Symbol} = get_assets(ts),
    method::Symbol = :two_scales_volatility;
    time_grid::Union{Missing,Dict} = missing,
    fixed_spacing::Union{Missing,Dict,<:Real} = missing,
    use_all_obs::Bool = false,
    rough_guess_number_of_intervals::Integer = 5,
    num_grids::Real = default_num_grids(ts),
)

This is a convenience wrapper for the two volatility estimation techniques included in this package.

General Inputs

  • ts - The tick data.
  • assets - The assets from ts that you want to estimate volatilities for.
  • method - The method can be :simple_volatility (for the simple volatility method) or :two_scales_volatility (for the two scales volatility method)

Inputs only used in :simple_volatility method.

  • time_grid - The grid with which to calculate returns. If missing one is generated with a fixed spacing (if that is provided) or a default spacing.
  • fixed_spacing - A spacing used to calculate a time grid. Not used if a time_grid is input or if use_all_obs = true.
  • use_all_obs - Use all observations to estimate volatilities. Not used if a time_grid is provided.
  • rough_guess_number_of_intervals - A rough number of intervals to calculate a default spacing. Not used if a time_grid or fixed_spacing is provided or if use_all_obs = true.

Inputs only used in :two_scales_volatility method.

  • num_grids - The number of grids used in the two scales estimation.

Returns

  • A Dict with estimated volatilities for each asset.
HighFrequencyCovariance.simple_volatility - Function
simple_volatility(
   ts::SortedDataFrame,
   assets::Vector{Symbol} = get_assets(ts);
   time_grid::Union{Missing,Dict} = missing,
   fixed_spacing::Union{Missing,Dict,<:Real} = missing,
   use_all_obs::Bool = false,
   rough_guess_number_of_intervals::Integer = 5,
)

Calculates volatility with the simple method.

Inputs

  • ts - The tick data.
  • assets - The assets you want to estimate volatilities for.
  • time_grid - The grid with which to calculate returns. If missing one is generated with a fixed spacing (if that is provided) or a default spacing.
  • fixed_spacing - A spacing used to calculate a time grid. Not used if a time_grid is input or if use_all_obs = true.
  • use_all_obs - Use all observations to estimate volatilities. Not used if a time_grid is provided.
  • rough_guess_number_of_intervals - A rough number of intervals to calculate a default spacing. Not used if a time_grid or fixed_spacing is provided or if use_all_obs = true.

Returns

  • A Dict with an estimated volatility for each asset.
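
As a rough illustration of what a grid-based estimate involves, the sketch below samples prices on a fixed time grid and computes a realised volatility from the resulting returns. It is not the package's implementation: the function name realised_volatility_sketch, the use of log returns, the previous-tick sampling and the normalisation by elapsed time are all assumptions made for the example.

```julia
# Sketch only: realised volatility per unit time from prices sampled on a fixed grid.
# The function name, log-return convention and normalisation are assumptions.
function realised_volatility_sketch(
    prices::AbstractVector{<:Real},
    times::AbstractVector{<:Real},
    fixed_spacing::Real,
)
    # Build an evenly spaced time grid over the observed period.
    grid = collect(first(times):fixed_spacing:last(times))
    # Use the last observed price at or before each grid point (previous tick).
    grid_prices = [prices[searchsortedlast(times, t)] for t in grid]
    # Log returns between consecutive grid points.
    returns = diff(log.(grid_prices))
    duration = last(grid) - first(grid)
    # Realised variance divided by elapsed time, then square-rooted.
    return sqrt(sum(abs2, returns) / duration)
end

# A price that only bounces between 100 and 101 on every tick.
times = collect(0.0:1.0:10.0)
prices = [isodd(i) ? 100.0 : 101.0 for i in eachindex(times)]

vol_every_tick = realised_volatility_sketch(prices, times, 1.0)
vol_coarse_grid = realised_volatility_sketch(prices, times, 2.0)
```

In this toy series the price movement is pure bounce, so sampling every tick reports a positive volatility while the coarser grid reports zero. This is the kind of effect that choosing a spacing (or a noise-robust method) is meant to control.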
HighFrequencyCovariance.two_scales_volatility - Function
two_scales_volatility(vals::Vector, times::Vector, num_grids::Real)

Calculates volatility with the two scales method of Zhang, Mykland and Aït-Sahalia (2005). By default the grid spacing is a tenth of the total duration. If this does not make sense for your use case then choose a spacing at which you expect the effect of microstructure noise to be small.

Inputs

  • vals - The prices at each instant in time.
  • times - The times corresponding to each element in vals.
  • num_grids - The number of grids used in the two scales estimation.

Returns

  • A scalar for the estimated volatility of the asset.

  • A scalar for the estimated microstructure noise variance.

two_scales_volatility(
    ts::SortedDataFrame,
    assets::Vector{Symbol} = get_assets(ts);
    num_grids::Real = default_num_grids(ts),
)

Calculates volatility with the two scales method of Zhang, Mykland and Aït-Sahalia (2005). By default the grid spacing is a tenth of the total duration. If this does not make sense for your use case then choose a spacing at which you expect the effect of microstructure noise to be small.

Inputs

  • ts - The tick data.
  • assets - The assets you want to estimate volatilities for.
  • num_grids - The number of grids used in the two scales estimation.

Returns

  • A Dict with estimated volatilities for each asset.
  • A Dict with estimated microstructure noise variances for each asset.

References

Zhang L, Mykland PA, Aït-Sahalia Y (2005). "A Tale of Two Time Scales: Determining Integrated Volatility with Noisy High-Frequency Data." Journal of the American Statistical Association, 100(472), 1394–1411. ISSN 01621459. doi:10.1198/016214505000000169.
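
To make the idea behind the two scales estimator concrete, here is a minimal sketch of the basic estimator from the paper cited above. It combines a slow-scale realised variance (averaged over K offset subgrids) with the fast-scale realised variance that uses every observation. The function name two_scales_sketch, the integer argument K and the choice to return an integrated variance rather than a volatility are assumptions for illustration; the package's own implementation may differ in details such as scaling and small-sample adjustments.

```julia
# Sketch only: basic two scales realised variance for one asset.
# `logp` holds log prices in time order; `K` is the number of subgrids.
function two_scales_sketch(logp::AbstractVector{<:Real}, K::Integer)
    n = length(logp) - 1                       # number of returns
    # Fast scale: realised variance using every observation.
    rv_all = sum(abs2, diff(logp))
    # Slow scale: average of the K realised variances built from K-step returns.
    rv_avg = sum(abs2, logp[(K + 1):end] .- logp[1:(end - K)]) / K
    # Average number of returns in each subgrid.
    nbar = (n - K + 1) / K
    # Subtracting the scaled fast-scale term removes the noise-induced bias.
    integrated_variance = rv_avg - (nbar / n) * rv_all
    # The fast-scale realised variance is dominated by noise: E[rv_all] ≈ 2n * noise variance.
    noise_variance = rv_all / (2n)
    return integrated_variance, noise_variance
end

logp = collect(0.01 .* (0:100))
iv5, noise5 = two_scales_sketch(logp, 5)
iv1, noise1 = two_scales_sketch(logp, 1)
```

With K = 1 the two scales coincide and the estimate collapses to zero, which is why more than one grid is needed for the correction to be informative.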

Estimating Microstructure Noise

There is one function that returns a Dict of microstructure noise estimates for each asset. These estimates come from the two_scales_volatility method and are identical to the second element of the tuple returned by that function.

HighFrequencyCovariance.estimate_microstructure_noise - Function
estimate_microstructure_noise(
    ts::SortedDataFrame,
    assets::Vector{Symbol} = get_assets(ts);
    num_grids::Real = default_num_grids(ts),
)

This estimates microstructure noise with the two_scales_volatility method.

Inputs

  • ts - The tick data.
  • assets - The assets from ts that you want to estimate microstructure noise for.
  • num_grids - The number of grids used in the two scales estimation.

Returns

  • A Dict with estimated microstructure noise variances for each asset.

Estimating Covariance Matrices

The estimate_covariance function is the main function for estimating a CovarianceMatrix. Five possible methods can be input to this function (or the functions for each method can alternatively be called directly).

All covariance estimation functions take in a SortedDataFrame and (optionally) a vector of symbol names representing assets and (optionally) a specified regularisation method. If the vector of symbol names for assets is input then the CovarianceMatrix will only include those assets, in the order specified in the vector.

If the regularisation method is specified then this will be used in regularising the resulting matrix. This can alternatively be missing in which case no regularisation will be done. By default the nearest_psd_matrix will be used for every method except the two_scales_covariance method and this regularisation is done on the estimated covariance matrix before its correlation matrix and volatilities are split up and placed in a CovarianceMatrix struct. For the two_scales_covariance method the correlation matrix is estimated directly and regularisation is applied to this correlation matrix. Hence the nearest_correlation_matrix is the default.

Note that some combinations of estimation technique and regularisation technique will not work. For instance, nearest_correlation_matrix should not be applied with the preaveraged_covariance method, as it would attempt to turn a covariance matrix into a correlation matrix with a unit diagonal. In addition, if the estimated matrix is far from psd then heavy regularisation might be required, which may give poor results. In these cases it may be useful to turn off regularisation in the estimation function and instead apply regularisation to the resulting CovarianceMatrix struct.
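
The split of a covariance matrix into a correlation matrix and a vector of volatilities, mentioned above, can be sketched in a few lines. This is only an illustration using the standard library: the helper name split_covariance is hypothetical, and how the package stores these pieces inside a CovarianceMatrix is not shown here.

```julia
using LinearAlgebra

# Sketch only: split a covariance matrix into a correlation matrix and volatilities.
function split_covariance(cov::AbstractMatrix)
    # Volatilities are the square roots of the variances on the diagonal.
    vols = sqrt.(diag(cov))
    # Dividing each entry by the product of the two volatilities gives correlations.
    corr = cov ./ (vols * vols')
    return Hermitian(corr), vols
end

cov = [4.0 2.0; 2.0 9.0]
corr, vols = split_covariance(cov)

# The covariance matrix can be rebuilt from the two pieces.
rebuilt = Diagonal(vols) * Matrix(corr) * Diagonal(vols)
```

This also shows why applying a correlation-matrix regularisation to a covariance matrix is problematic: forcing a unit diagonal discards the variances that the volatilities carry.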

HighFrequencyCovariance.estimate_covariance - Function
estimate_covariance(
    ts::SortedDataFrame,
    assets::Vector{Symbol} = get_assets(ts),
    method::Symbol = :preaveraged_covariance;
    regularisation::Union{Missing,Symbol} = :default,
    regularisation_params::Dict = Dict(),
    only_regulise_if_not_PSD::Bool = false,
    time_grid::Union{Missing,Vector} = missing,
    fixed_spacing::Union{Missing,<:Real} = missing,
    refresh_times::Bool = false,
    rough_guess_number_of_intervals::Integer = 5, # General Inputs
    kernel::HFC_Kernel{<:Real} = parzen,
    H::Real = kernel.c_star * mean(a -> length(ts.groupingrows[a]), assets)^0.6,
    m::Integer = 2, # BNHLS parameters
    numJ::Integer = 100,
    num_blocks::Integer = 10,
    block_width::Real = (maximum(ts.df[:, ts.time]) - minimum(ts.df[:, ts.time])) /
                        num_blocks,
    microstructure_noise_var::Dict{Symbol,<:Real} = two_scales_volatility(ts, assets)[2], # Spectral Covariance parameters
    drop_assets_if_not_enough_data::Bool = false,
    theta::Real = 0.15,
    g::NamedTuple = g, # Preaveraging
    equalweight::Bool = false,
    num_grids::Real = default_num_grids(ts),
    min_obs_for_estimation::Integer = 10,
    if_dont_have_min_obs::Real = NaN,
)

This is a convenience wrapper for the covariance estimation techniques included in this package.

General Inputs

  • ts - The tick data.
  • assets - What assets from ts that you want to estimate the covariance for.
  • method - The method you want to use. This can be :simple_covariance, :bnhls_covariance, :spectral_covariance, :preaveraged_covariance or :two_scales_covariance.
  • regularisation - The regularisation method to use. This can be :identity_regularisation, :eigenvalue_clean, :nearest_correlation_matrix or :nearest_psd_matrix. You can also choose :covariance_default (which is :nearest_psd_matrix) or :correlation_default (which is :nearest_correlation_matrix). If :default (the default value) then the default regularisation method for your chosen covariance estimation method will be used. If missing then no regularisation is performed.
  • regularisation_params - Keyword arguments that will be used by your chosen regularisation method.
  • only_regulise_if_not_PSD - Should the resultant matrix only be regularised if it is not psd.

Inputs only used in :simple_covariance method.

  • time_grid - The grid with which to calculate returns (:simple_covariance method only).
  • fixed_spacing - A spacing used to calculate a time grid. Not used if refresh_times=true (:simple_covariance method only).
  • refresh_times - Should refresh times be used to estimate covariance (:simple_covariance method only).
  • rough_guess_number_of_intervals - A rough number of intervals to calculate a default spacing. Not used if a time_grid or fixed_spacing is provided or if refresh_times=true (:simple_covariance method only).

Inputs only used in :bnhls_covariance method.

  • kernel - The kernel used. See the BNHLS paper for details (:bnhls_covariance method only).
  • H - The number of lags/leads used in estimation. See the BNHLS paper for details (:bnhls_covariance method only).
  • m - The number of end returns to average. (:bnhls_covariance method only)

Inputs only used in :spectral_covariance method.

  • numJ - The number of J values. See the paper for details (:spectral_covariance method only).
  • num_blocks - The number of blocks to split the time frame into. See the paper for details (:spectral_covariance method only).
  • block_width - The width of each block to split the time frame into (:spectral_covariance method only).
  • microstructure_noise_var - Estimates of microstructure noise variance for each asset (:spectral_covariance method only).

Inputs only used in :preaveraged_covariance method.

  • drop_assets_if_not_enough_data - If we do not have enough data to estimate for all the input assets should we just calculate the correlation/volatilities for those assets we do have?
  • theta - A theta value. See paper for details (:preaveraged_covariance method only).
  • g - A tuple containing a preaveraging method (with name "f") and a ψ value. See paper for details (:preaveraged_covariance method only).

Inputs only used in :two_scales_covariance method.

  • equalweight - Should we use equal weight for the two different linear combinations of assets. If false then an optimal weight is calculated (from volatilities) (:two_scales_covariance method only).
  • num_grids - The number of grids used in the two scales estimation (:two_scales_covariance method only).
  • min_obs_for_estimation - How many observations are needed for estimation. If there are fewer than this, the fallback below is used.
  • if_dont_have_min_obs - If we do not have sufficient observations to estimate a correlation then what should be used?

Returns

  • A CovarianceMatrix
HighFrequencyCovariance.simple_covariance - Function
simple_covariance(
    ts::SortedDataFrame,
    assets::Vector{Symbol} = get_assets(ts);
    regularisation::Union{Missing,Symbol} = :covariance_default,
    regularisation_params::Dict = Dict(),
    only_regulise_if_not_PSD::Bool = false,
    time_grid::Union{Missing,Vector} = missing,
    fixed_spacing::Union{Missing,<:Real} = missing,
    refresh_times::Bool = false,
    rough_guess_number_of_intervals::Integer = 5,
)

Estimation of the covariance matrix in the standard textbook way.

Inputs

  • ts - The tick data.
  • assets - The assets you want to estimate the covariance for.
  • regularisation - A symbol representing what regularisation technique should be used. If missing no regularisation is performed.
  • regularisation_params - keyword arguments to be consumed in the regularisation algorithm.
  • only_regulise_if_not_PSD - Should regularisation only be attempted if the matrix is not psd already.
  • time_grid - The grid with which to calculate returns.
  • fixed_spacing - A spacing used to calculate a time grid. Not used if refresh_times=true.
  • refresh_times - Should refresh times be used to estimate covariance.
  • rough_guess_number_of_intervals - A rough number of intervals to calculate a default spacing. Not used if a time_grid or fixed_spacing is provided or if refresh_times=true.

Returns

  • A CovarianceMatrix.
HighFrequencyCovariance.bnhls_covariance - Function
bnhls_covariance(
    ts::SortedDataFrame,
    assets::Vector{Symbol} = get_assets(ts);
    regularisation::Union{Missing,Symbol} = :covariance_default,
    regularisation_params::Dict = Dict(),
    only_regulise_if_not_PSD::Bool = false,
    kernel::HFC_Kernel{<:Real} = parzen,
    H::Real = kernel.c_star * (mean(map(a -> length(ts.groupingrows[a]), assets)))^0.6,
    m::Integer = 2,
)

This calculates covariance with the multivariate realised kernel of BNHLS (2011).

Inputs

  • ts - The tick data.
  • assets - The assets you want to estimate the covariance for.
  • regularisation - A symbol representing what regularisation technique should be used. If missing no regularisation is performed.
  • regularisation_params - keyword arguments to be consumed in the regularisation algorithm.
  • only_regulise_if_not_PSD - Should regularisation only be attempted if the matrix is not psd already.
  • kernel - The kernel used. See the paper for details.
  • H - The number of lags/leads used in estimation. See the paper for details.
  • m - The number of end returns to average.

Returns

  • A CovarianceMatrix.

References

Barndorff-Nielsen, O., Hansen, P.R., Lunde, A., Shephard, N. 2011. See the whole paper, but particularly sections 2.2 and 2.3. Kernels are in Table 1. Choices of H are discussed in section 3.4 of the paper.

HighFrequencyCovariance.spectral_covariance - Function
spectral_covariance(
    ts::SortedDataFrame,
    assets::Vector{Symbol} = get_assets(ts);
    regularisation::Union{Missing,Symbol} = :covariance_default,
    regularisation_params::Dict = Dict(),
    only_regulise_if_not_PSD::Bool = false,
    numJ::Integer = 100,
    num_blocks::Integer = 10,
    block_width::Real = (maximum(ts.df[:, ts.time]) - minimum(ts.df[:, ts.time])) /
                        num_blocks,
    microstructure_noise_var::Dict{Symbol,<:Real} = two_scales_volatility(ts, assets)[2],
)

Estimation of a CovarianceMatrix using the spectral covariance method.

Inputs

  • ts - The tick data.
  • assets - The assets you want to estimate the covariance for.
  • regularisation - A symbol representing what regularisation technique should be used. If missing no regularisation is performed.
  • regularisation_params - keyword arguments to be consumed in the regularisation algorithm.
  • only_regulise_if_not_PSD - Should regularisation only be attempted if the matrix is not psd already.
  • numJ - The number of J values. See the paper for details.
  • num_blocks - The number of blocks to split the time frame into. See the paper for details.
  • block_width - The width of each block to split the time frame into.
  • microstructure_noise_var - Estimates of microstructure noise variance for each asset.

Returns

  • A CovarianceMatrix.

References

Bibinger M, Hautsch N, Malec P, Reiss M (2014). “Estimating the quadratic covariation matrix from noisy observations: Local method of moments and efficiency.” The Annals of Statistics, 42(4), 1312–1346. doi:10.1214/14-AOS1224.

HighFrequencyCovariance.preaveraged_covariance - Function
preaveraged_covariance(
    ts::SortedDataFrame,
    assets::Vector{Symbol} = get_assets(ts);
    regularisation::Union{Missing,Symbol} = :covariance_default,
    drop_assets_if_not_enough_data::Bool = false,
    regularisation_params::Dict = Dict(),
    only_regulise_if_not_PSD::Bool = false,
    theta::Real = 0.15,
    g::NamedTuple = g,
)

Estimation of a CovarianceMatrix using the preaveraging method.

Inputs

  • ts - The tick data.
  • assets - The assets you want to estimate the covariance for.
  • regularisation - A symbol representing what regularisation technique should be used. If missing no regularisation is performed.
  • drop_assets_if_not_enough_data - If we do not have enough data to estimate for all the input assets should we just calculate the correlation/volatilities for those assets we do have?
  • regularisation_params - keyword arguments to be consumed in the regularisation algorithm.
  • only_regulise_if_not_PSD - Should regularisation only be attempted if the matrix is not psd already.
  • theta - A theta value. See paper for details.
  • g - A tuple containing a preaveraging method function (with name "f") and a ψ (with name "psi") value. psi here should be the integral of the function over the interval between zero and one.

Returns

  • A CovarianceMatrix.

References

Christensen K, Podolskij M, Vetter M (2013). “On covariation estimation for multivariate continuous Itô semimartingales with noise in non-synchronous observation schemes.” Journal of Multivariate Analysis, 120, 59–84. doi:10.1016/j.jmva.2013.05.002.

HighFrequencyCovariance.two_scales_covariance - Function
two_scales_covariance(
    ts::SortedDataFrame,
    assets::Vector{Symbol} = get_assets(ts);
    regularisation::Union{Missing,Symbol} = :correlation_default,
    regularisation_params::Dict = Dict(),
    only_regulise_if_not_PSD::Bool = false,
    equalweight::Bool = false,
    num_grids::Real = default_num_grids(ts),
    min_obs_for_estimation::Integer = 10,
    if_dont_have_min_obs::Real = NaN,
)

Estimation of a CovarianceMatrix using the two scales covariance method.

Inputs

  • ts - The tick data.
  • assets - The assets you want to estimate the covariance for.
  • regularisation - A symbol representing what regularisation technique should be used. If missing no regularisation is performed.
  • regularisation_params - keyword arguments to be consumed in the regularisation algorithm.
  • only_regulise_if_not_PSD - Should regularisation only be attempted if the matrix is not psd already.
  • equalweight - Should we use equal weight for the two different linear combinations of assets. If false then an optimal weight is calculated (from volatilities).
  • num_grids - The number of grids used in the two scales estimation.
  • min_obs_for_estimation - How many observations are needed for estimation. If there are fewer than this, the fallback below is used.
  • if_dont_have_min_obs - If we do not have sufficient observations to estimate a correlation then what should be used?

Returns

  • A CovarianceMatrix.

Regularisation of Covariance Matrices

The main function for regularisation is the regularise function. In addition, four regularisation methods are implemented; each can be called directly or through the regularise function. All of these functions can be applied to either a Hermitian matrix or a CovarianceMatrix struct.

If these functions are applied to a Hermitian then regularisation is applied and a regularised Hermitian is returned.

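If these functions are applied to a CovarianceMatrix struct then regularisation is applied to either its covariance matrix or its correlation matrix (controlled by the apply_to_covariance argument where the function takes one) and a regularised CovarianceMatrix is returned.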

HighFrequencyCovariance.regularise - Function
regularise(
    mat::Hermitian,
    ts::SortedDataFrame,
    mat_labels::Vector,
    method::Symbol = :correlation_default;
    spacing::Union{Missing,<:Real} = missing,
    weighting_matrix = Diagonal(eltype(mat).(I(size(mat)[1]))),
    doDykstra = true,
    stop_at_first_correlation_matrix = true,
    max_iterates = 1000,
)

This is a convenience wrapper for the regularisation techniques.

General Inputs

  • mat - The matrix you want to regularise.
  • ts - The tick data.
  • mat_labels - The name of the assets for each row/column of the matrix.
  • method - The method you want to use. This can be :identity_regularisation, :eigenvalue_clean, :nearest_correlation_matrix or :nearest_psd_matrix. You can also choose :covariance_default (which is :nearest_psd_matrix) or :correlation_default (which is :nearest_correlation_matrix).

Inputs only used in :identity_regularisation method.

  • spacing - The interval spacing used in choosing an identity weight (:identity_regularisation method only).

Inputs only used in :nearest_correlation_matrix method.

  • weighting_matrix - The weighting matrix used to calculate the nearest correlation matrix (:nearest_correlation_matrix method only).
  • doDykstra - Should a Dykstra correction be applied (:nearest_correlation_matrix method only).
  • stop_at_first_correlation_matrix - Should we stop at first valid correlation matrix (:nearest_correlation_matrix method only).
  • max_iterates - Maximum number of iterates (:nearest_correlation_matrix method only).

Returns

  • A Hermitian

regularise(
    covariance_matrix::CovarianceMatrix,
    ts::SortedDataFrame,
    method::Symbol = :nearest_correlation_matrix;
    apply_to_covariance::Bool = true,
    spacing::Union{Missing,<:Real} = missing,
    weighting_matrix = Diagonal(eltype(covariance_matrix.correlation).(I(size(covariance_matrix.correlation)[1]))),
    doDykstra = true,
    stop_at_first_correlation_matrix = true,
    max_iterates = 1000,
)

This is a convenience wrapper for the regularisation techniques.

General Inputs

  • covariance_matrix - The matrix you want to regularise.
  • ts - The tick data.
  • method - The method you want to use. This can be :identity_regularisation, :eigenvalue_clean, :nearest_correlation_matrix or :nearest_psd_matrix. You can also choose :covariance_default (which is :nearest_psd_matrix) or :correlation_default (which is :nearest_correlation_matrix).
  • apply_to_covariance - Should regularisation be applied to the covariance matrix. If false it is applied to the correlation matrix.

Inputs only used in :identity_regularisation method.

  • spacing - The interval spacing used in choosing an identity weight (:identity_regularisation method only).

Inputs only used in :nearest_correlation_matrix method.

  • weighting_matrix - The weighting matrix used to calculate the nearest correlation matrix (:nearest_correlation_matrix method only).
  • doDykstra - Should a Dykstra correction be applied (:nearest_correlation_matrix method only).
  • stop_at_first_correlation_matrix - Should we stop at first valid correlation matrix (:nearest_correlation_matrix method only).
  • max_iterates - Maximum number of iterates (:nearest_correlation_matrix method only).

Returns

  • A CovarianceMatrix
HighFrequencyCovariance.identity_regularisation - Function
identity_regularisation(mat::Hermitian, identity_weight::Real)

Regularisation of the correlation matrix by mixing with the identity matrix.

Inputs

  • mat - A matrix to be regularised.
  • identity_weight - How much weight to give to the identity matrix. Should be between 0 and 1.

Returns

  • A Hermitian.
identity_regularisation(mat::Hermitian, asset_returns::DataFrame)

Regularisation of the correlation matrix by mixing with the identity matrix as per Ledoit & Wolf 2003.

Inputs

  • mat - A matrix to be regularised.
  • asset_returns - A DataFrame of asset returns.

Returns

  • A Hermitian.

identity_regularisation(
    mat::Hermitian,
    ts::SortedDataFrame,
    mat_labels::Vector;
    spacing::Union{Missing,<:Real} = missing,
)

Regularisation of the correlation matrix by mixing with the identity matrix as per Ledoit & Wolf 2003.

Inputs

  • mat - A matrix to be regularised.
  • ts - Tick data.
  • mat_labels - The labels for each asset in the matrix.
  • spacing - A spacing to use to estimate returns. This is used in determining the optimal weight to give to the identity matrix.

Returns

  • A Hermitian.

identity_regularisation(
    covariance_matrix::CovarianceMatrix,
    ts::SortedDataFrame;
    spacing::Union{Missing,<:Real} = missing,
    apply_to_covariance::Bool = true,
)

Regularisation of the correlation matrix by mixing with the identity matrix as per Ledoit & Wolf 2003.

Inputs

  • covariance_matrix - The CovarianceMatrix to be regularised.
  • ts - Tick data.
  • spacing - A spacing to use to estimate returns. This is used in determining the optimal weight to give to the identity matrix.
  • apply_to_covariance - Should regularisation be applied to the covariance matrix or the correlation matrix.

Returns

  • A CovarianceMatrix.

identity_regularisation(
    covariance_matrix::CovarianceMatrix,
    identity_weight::Real;
    apply_to_covariance = false,
)

Regularisation of the correlation matrix by mixing with the identity matrix.

Inputs

  • covariance_matrix - The CovarianceMatrix to be regularised.
  • identity_weight - How much weight to give to the identity matrix. Should be between 0 and 1.
  • apply_to_covariance - Should regularisation be applied to the covariance matrix or the correlation matrix.

Returns

  • A CovarianceMatrix.

References

Ledoit, O. , Wolf, M. 2003. Improved Estimation of the Covariance Matrix of Stock Returns with an application to portfolio selection. Journal of empirical finance. 10. 603-621.
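
For the fixed-weight variant, mixing with the identity matrix amounts to a convex combination of the input matrix and the identity. The sketch below illustrates this for a correlation matrix using only the standard library. The name mix_with_identity is hypothetical and the exact formula used by the package is an assumption here; in particular, this sketch does not attempt the data-driven weight choice of Ledoit & Wolf.

```julia
using LinearAlgebra

# Sketch only: shrink a correlation matrix towards the identity with a fixed weight.
function mix_with_identity(mat::Hermitian, identity_weight::Real)
    0 <= identity_weight <= 1 ||
        throw(ArgumentError("identity_weight should be between 0 and 1"))
    # Convex combination: off-diagonal entries shrink towards zero,
    # while a unit diagonal stays at one.
    mixed = (1 - identity_weight) * Matrix(mat) + identity_weight * I
    return Hermitian(mixed)
end

corr = Hermitian([1.0 0.8; 0.8 1.0])
shrunk = mix_with_identity(corr, 0.25)
```

A weight of 0 returns the input unchanged and a weight of 1 returns the identity, so the weight directly trades estimated correlation structure against stability.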

HighFrequencyCovariance.eigenvalue_clean - Function
eigenvalue_clean(
    eigenvalues::Vector{<:Real},
    eigenvectors::Matrix{<:Real},
    eigenvalue_threshold::Real,
)

This takes the small eigenvalues (with values below eigenvalue_threshold) and sets them to the greater of their average or eigenvalue_threshold/(4*number_of_small_eigens). Then the matrix is reconstructed and returned (as a Hermitian).

Inputs

  • eigenvalues - The eigenvalues of a matrix.
  • eigenvectors - The eigenvectors of a matrix.
  • eigenvalue_threshold - The threshold for an eigenvalue to be altered.

Returns

  • A Hermitian.

eigenvalue_clean(mat::Hermitian, eigenvalue_threshold::Real)

This splits a matrix into its eigenvalues and eigenvectors. It then takes the small eigenvalues (with values below eigenvalue_threshold) and sets them to the greater of their average or eigenvalue_threshold/(4*number_of_small_eigens). Then the matrix is reconstructed and returned (as a Hermitian).

Inputs

  • mat - A matrix that you want to regularise with eigenvalue regularisation.
  • eigenvalue_threshold - The threshold for an eigenvalue to be altered.

Returns

  • A Hermitian.

    eigenvalue_clean(mat::Hermitian, ts::SortedDataFrame)

Similarly to the two methods above, this method regularises a matrix by setting small eigenvalues to near zero. The method of Laloux, Cizeau, Bouchaud & Potters 2000 is used to choose a threshold.

Inputs

  • mat - A matrix that you want to regularise with eigenvalue regularisation.
  • ts - The tick data.

Returns

  • A Hermitian.

eigenvalue_clean(
    covariance_matrix::CovarianceMatrix,
    ts::SortedDataFrame;
    apply_to_covariance::Bool = true,
)

Inputs

  • covariance_matrix - The CovarianceMatrix that you want to regularise with eigenvalue regularisation.
  • ts - The tick data.
  • apply_to_covariance - Should regularisation be applied to the covariance matrix or the correlation matrix.

Returns

  • A CovarianceMatrix.

Note that if the input matrices include any NaN terms then regularisation is not possible. The matrix will be silently returned (these NaNs will generally come from upstream problems, so it is more useful to return the matrix than to throw at this point). As a result, outputs should be checked.

References

Laloux, L., Cizeau, P., Bouchaud, J., Potters, M. 2000. "Random matrix theory and financial correlations." International Journal of Theoretical and Applied Finance, 3, 391-397.
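
The threshold-based cleaning rule described above (small eigenvalues are replaced by the greater of their average or eigenvalue_threshold/(4*number_of_small_eigens)) can be sketched as follows. This is an illustration written against the standard library only; the name eigenvalue_clean_sketch is hypothetical and the package's own function may differ in details such as NaN handling.

```julia
using LinearAlgebra

# Sketch only: replace small eigenvalues and rebuild the matrix.
function eigenvalue_clean_sketch(mat::Hermitian, eigenvalue_threshold::Real)
    decomposition = eigen(mat)
    vals = copy(decomposition.values)
    vecs = decomposition.vectors
    # Identify the eigenvalues that fall below the threshold.
    small = vals .< eigenvalue_threshold
    n_small = count(small)
    if n_small > 0
        average_small = sum(vals[small]) / n_small
        # Replace each small eigenvalue with the greater of the two candidates.
        vals[small] .= max(average_small, eigenvalue_threshold / (4 * n_small))
    end
    # Rebuild the matrix from the cleaned eigenvalues and original eigenvectors.
    return Hermitian(vecs * Diagonal(vals) * vecs')
end

# This matrix has eigenvalues -1 and 3, so it is not psd.
mat = Hermitian([1.0 2.0; 2.0 1.0])
cleaned = eigenvalue_clean_sketch(mat, 0.4)
```

Here the single small eigenvalue (-1) is replaced by 0.4/4 = 0.1, because that exceeds the average of the small eigenvalues, while the large eigenvalue is left alone.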

HighFrequencyCovariance.nearest_psd_matrix - Function
nearest_psd_matrix(mat::Hermitian)

This function maps a Hermitian matrix to the nearest psd matrix. This uses the project_to_S method in Higham (2002; Theorem 3.2). No special weighting is applied in this case. Advanced users can use project_to_S directly if they want to use weights to decide which psd matrix is closest.

Inputs

  • mat - The matrix you want to map to a psd matrix

Returns

  • A Hermitian

nearest_psd_matrix(
    covariance_matrix::CovarianceMatrix;
    apply_to_covariance::Bool = true,
)

This function regularises a CovarianceMatrix by mapping its covariance or correlation matrix (depending on apply_to_covariance) to the nearest psd matrix. This uses the project_to_S method in Higham (2002; Theorem 3.2). No special weighting is applied in this case. Advanced users can use project_to_S directly if they want to use weights to decide which psd matrix is closest.

Inputs

  • covariance_matrix - The matrix you want to map to a psd matrix
  • apply_to_covariance - Should regularisation be applied to the correlation or covariance matrix.

Returns

  • A CovarianceMatrix

nearest_psd_matrix(
    covariance_matrix::CovarianceMatrix,
    ts::SortedDataFrame;
    apply_to_covariance::Bool = true,
)

This function regularises a CovarianceMatrix by mapping its covariance or correlation matrix (depending on apply_to_covariance) to the nearest psd matrix. This uses the project_to_S method in Higham (2002; Theorem 3.2). No special weighting is applied in this case. Advanced users can use project_to_S directly if they want to use weights to decide which psd matrix is closest.

Inputs

  • covariance_matrix - The matrix you want to map to a psd matrix
  • ts - The Tick data
  • apply_to_covariance - Should regularisation be applied to the correlation or covariance matrix.

Returns

  • A CovarianceMatrix

References

Higham NJ (2002). "Computing the nearest correlation matrix - a problem from finance." IMA Journal of Numerical Analysis, 22, 329–343. doi:10.1002/nla.258.
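
In the unweighted case, the projection onto the set of psd matrices has a simple form: decompose the matrix, set negative eigenvalues to zero and rebuild. The following sketch shows that step with the standard library. The name nearest_psd_sketch is hypothetical and this is not the package's project_to_S code, which also supports weights.

```julia
using LinearAlgebra

# Sketch only: unweighted projection of a Hermitian matrix onto the psd cone.
function nearest_psd_sketch(mat::Hermitian)
    decomposition = eigen(mat)
    # Negative eigenvalues are clipped at zero; the rest are unchanged.
    clipped = max.(decomposition.values, 0.0)
    vecs = decomposition.vectors
    return Hermitian(vecs * Diagonal(clipped) * vecs')
end

# This matrix has eigenvalues -1 and 3, so it is not psd.
mat = Hermitian([1.0 2.0; 2.0 1.0])
projected = nearest_psd_sketch(mat)
```

Note that the projected matrix no longer has a unit diagonal, which is one reason a separate routine is needed when the target is a valid correlation matrix.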

HighFrequencyCovariance.nearest_correlation_matrix - Function
nearest_correlation_matrix(
    mat::AbstractMatrix,
    weighting_matrix::Union{Diagonal,Hermitian} = Diagonal(Float64.(I(size(mat)[1])));
    doDykstra::Bool = true,
    stop_at_first_correlation_matrix::Bool = true,
    max_iterates::Integer = 1000,
)

Maps a matrix to the nearest valid correlation matrix (psd matrix with unit diagonal and all other entries below 1 in absolute value).

Inputs

  • mat - A matrix you want to regularise.
  • weighting_matrix - The weighting matrix used to weight what the nearest valid correlation matrix is.
  • doDykstra - Should Dykstra correction be done.
  • stop_at_first_correlation_matrix - Should we keep iterating until we have done all iterates or stop at the first valid correlation matrix.
  • max_iterates - The maximum number of iterates to do towards a valid correlation matrix.

Returns

  • A Matrix

  • An integer saying how many iterates were done

  • A Symbol with a convergence message.
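
To make the roles of doDykstra, stop_at_first_correlation_matrix and max_iterates more concrete, here is a minimal sketch of the unweighted alternating projections scheme from Higham (2002) for real symmetric matrices. It is an illustration under stated assumptions, not the package's implementation: every function name below is hypothetical, the keyword names only mirror the documented ones, and the returned status symbols are made up for this example.

```julia
using LinearAlgebra

# All helpers here are hypothetical and written only for this illustration.

# Projection onto the psd cone: zero out negative eigenvalues.
function project_psd(A::AbstractMatrix)
    F = eigen(Symmetric(A))
    X = F.vectors * Diagonal(max.(F.values, 0.0)) * F.vectors'
    return (X + X') / 2                 # remove rounding asymmetry
end

# Projection onto the set of matrices with unit diagonal.
function project_unit_diag(A::AbstractMatrix)
    B = copy(A)
    B[diagind(B)] .= 1.0
    return B
end

is_valid_correlation(A; tol = 1e-8) =
    all(abs.(diag(A) .- 1) .< tol) && eigmin(Symmetric(A)) > -tol

function alternating_projections(A::AbstractMatrix; doDykstra = true,
        stop_at_first_correlation_matrix = true, max_iterates = 1000)
    Y = Matrix{Float64}(A)
    ΔS = zeros(size(Y))                 # Dykstra's correction term
    for k in 1:max_iterates
        R = doDykstra ? Y - ΔS : Y      # undo the previous psd correction
        X = project_psd(R)
        ΔS = X - R
        Y = project_unit_diag(X)
        if stop_at_first_correlation_matrix && is_valid_correlation(Y)
            return Y, k, :converged     # illustrative status symbol
        end
    end
    return Y, max_iterates, :max_iterates_reached
end

A = [1.0 0.9 -0.9; 0.9 1.0 0.9; -0.9 0.9 1.0]   # indefinite input
Y, iterates, status = alternating_projections(A)
println((iterates, status))
```

The three values returned by the sketch correspond to the three documented return values: the regularised matrix, the number of iterates performed, and a convergence message.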

nearest_correlation_matrix(
    covariance_matrix::CovarianceMatrix,
    ts::SortedDataFrame;
    weighting_matrix::Union{Diagonal,Hermitian} = Diagonal(eltype(covariance_matrix.correlation).(I(size(covariance_matrix.correlation)[1]))),
    doDykstra::Bool = true,
    stop_at_first_correlation_matrix::Bool = true,
    max_iterates::Integer = 1000,
)

Maps a matrix to the nearest valid correlation matrix (a psd matrix with unit diagonal and all other entries below 1 in absolute value).

Inputs

  • covariance_matrix - The matrix you want to regularise.
  • ts - The tick data.
  • weighting_matrix - The weighting matrix used to weight what the nearest valid correlation matrix is.
  • doDykstra - Should Dykstra correction be done.
  • stop_at_first_correlation_matrix - Should we keep iterating until we have done all iterates or stop at the first valid correlation matrix.
  • max_iterates - The maximum number of iterates to do towards a valid correlation matrix.

Returns

  • A CovarianceMatrix

nearest_correlation_matrix(
    covariance_matrix::CovarianceMatrix;
    weighting_matrix::Union{Diagonal,Hermitian} = Diagonal(eltype(covariance_matrix.correlation).(I(size(covariance_matrix.correlation)[1]))),
    doDykstra::Bool = true,
    stop_at_first_correlation_matrix::Bool = true,
    max_iterates::Integer = 1000,
)

Maps a matrix to the nearest valid correlation matrix (a psd matrix with unit diagonal and all other entries below 1 in absolute value).

Inputs

  • covariance_matrix - The matrix you want to regularise.
  • weighting_matrix - The weighting matrix used to weight what the nearest valid correlation matrix is.
  • doDykstra - Should Dykstra correction be done.
  • stop_at_first_correlation_matrix - Should we keep iterating until we have done all iterates or stop at the first valid correlation matrix.
  • max_iterates - The maximum number of iterates to do towards a valid correlation matrix.

Returns

  • A CovarianceMatrix

nearest_correlation_matrix(
    mat::Hermitian,
    ts::SortedDataFrame;
    weighting_matrix::Union{Diagonal,Hermitian} = Diagonal(eltype(mat).(I(size(mat)[1]))),
    doDykstra::Bool = true,
    stop_at_first_correlation_matrix::Bool = true,
    max_iterates::Integer = 1000,
)

Maps a matrix to the nearest valid correlation matrix (a psd matrix with unit diagonal and all other entries below 1 in absolute value).

Inputs

  • mat - The matrix you want to regularise.
  • ts - The tick data.
  • weighting_matrix - The weighting matrix used to weight what the nearest valid correlation matrix is.
  • doDykstra - Should Dykstra correction be done.
  • stop_at_first_correlation_matrix - Should we keep iterating until we have done all iterates or stop at the first valid correlation matrix.
  • max_iterates - The maximum number of iterates to do towards a valid correlation matrix.

Returns

  • A Hermitian

nearest_correlation_matrix(
    mat::Hermitian;
    weighting_matrix::Union{Diagonal,Hermitian} = Diagonal(eltype(mat).(I(size(mat)[1]))),
    doDykstra::Bool = true,
    stop_at_first_correlation_matrix::Bool = true,
    max_iterates::Integer = 1000,
)

Maps a matrix to the nearest valid correlation matrix (a psd matrix with unit diagonal and all other entries below 1 in absolute value).

Inputs

  • mat - The matrix you want to regularise.
  • weighting_matrix - The weighting matrix used to weight what the nearest valid correlation matrix is.
  • doDykstra - Should Dykstra correction be done.
  • stop_at_first_correlation_matrix - Should we keep iterating until we have done all iterates or stop at the first valid correlation matrix.
  • max_iterates - The maximum number of iterates to do towards a valid correlation matrix.

Returns

  • A Hermitian
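
The default weighting_matrix expression in the signatures above is dense to read, so the sketch below unpacks it using only LinearAlgebra. It shows that the default is an identity matrix stored as a Diagonal with the same element type as the input. The custom weights at the end are hypothetical values chosen for illustration; how the package uses them internally is not shown here.

```julia
using LinearAlgebra

mat = Hermitian([1.0 0.3; 0.3 1.0])
n = size(mat)[1]

# Same expression as the default argument: an n x n identity,
# converted to the element type of mat and stored as a Diagonal.
W_default = Diagonal(eltype(mat).(I(n)))

# Hypothetical non-default weights (illustrative values only).
W_custom = Diagonal([1.0, 4.0])

println(W_default)
println(W_custom isa Union{Diagonal,Hermitian})
```

Any Diagonal or Hermitian matrix of matching size satisfies the type annotation, so either object could be passed as the weighting_matrix argument.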

References

Higham NJ (2002). "Computing the nearest correlation matrix - a problem from finance." IMA Journal of Numerical Analysis, 22, 329–343. doi:10.1093/imanum/22.3.329.
