Forecasting Volatility
Learning Outcome
discuss methods of forecasting volatility
In some applications, the analyst is concerned with forecasting the variance for only a single asset. More often, however, the analyst needs to forecast the variance–covariance matrix for several, perhaps many, assets in order to analyze the risk of portfolios. Estimating a single variance that is believed to be constant is straightforward: The familiar sample variance is unbiased and its precision can be enhanced by using higher-frequency data. The analyst’s task becomes more complicated if the variance is not believed to be constant or the analyst needs to forecast a variance–covariance (VCV) matrix. These issues are addressed in this section. In addition, we elaborate on de-smoothing real estate and other returns.
Estimating a Constant VCV Matrix with Sample Statistics
The simplest and most heavily used method for estimating constant variances and covariances is to use the corresponding sample statistic—variance or covariance—computed from historical return data. These elements are then assembled into a VCV matrix. There are two main problems with this method, both related to sample size. First, given the short to intermediate sample periods typical in finance, the method cannot be used to estimate the VCV matrix for large numbers of assets. If the number of assets exceeds the number of historical observations, then some portfolios will erroneously appear to be riskless. Second, given typical sample sizes, this method is subject to substantial sampling error. A useful rule of thumb that addresses both of these issues is that the number of observations should be at least 10 times the number of assets in order for the sample VCV matrix to be deemed reliable. In addition, since each element is estimated without regard to any of the others, this method does not address the issue of imposing cross-sectional consistency.
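As a concrete sketch of the mechanics, the snippet below computes a sample VCV matrix from simulated monthly returns and checks both sample-size conditions discussed above. The dimensions and return-generating assumptions are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical data: T monthly returns for N assets (values are simulated,
# purely for illustration).
rng = np.random.default_rng(seed=42)
T, N = 120, 20                               # 10 years of monthly data, 20 assets
returns = rng.normal(0.005, 0.04, size=(T, N))

# Sample VCV matrix: each element is the sample variance or covariance.
sample_vcv = np.cov(returns, rowvar=False)   # N x N matrix

# Two practical checks mentioned in the text:
# 1) If N > T, the sample VCV matrix is rank-deficient and some portfolios
#    will erroneously appear to be riskless.
rank_ok = N <= T
# 2) Rule of thumb: at least 10 observations per asset for reliable estimates.
rule_of_thumb_ok = T >= 10 * N

print(f"Sample VCV shape: {sample_vcv.shape}")
print(f"N <= T (no spurious riskless portfolios): {rank_ok}")
print(f"T >= 10 * N (rule-of-thumb sample size):  {rule_of_thumb_ok}")
```

With 120 observations and 20 assets, the first check passes but the 10-to-1 rule of thumb does not, which is exactly the situation that motivates the structured approaches discussed next.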
VCV Matrices from Multi-Factor Models
Factor models have become the standard method of imposing structure on the VCV matrix of asset returns. From this perspective, their main advantage is that the number of assets can be very large relative to the number of observations. The key to making this work is that the covariances are fully determined by exposures to a small number of common factors whereas each variance includes an asset-specific component.
In a model with K common factors, the return on the ith asset is given by
Ri = αi + βi1F1 + βi2F2 + … + βiKFK + εi   (10)
where αi is a constant intercept, βik is the asset’s sensitivity to the kth factor, Fk is the kth common factor return, and εi is a stochastic term with a mean of zero that is unique to the ith asset. In general, the factors will be correlated. Given the model, the variance of the ith asset is

σi² = Σm Σn βim βin ρmn + νi²   (11)

where the sums run over the K factors (m, n = 1, …, K), ρmn is the covariance between the mth and nth factors, and νi² is the variance of the unique component of the ith asset’s return. The covariance between the ith and jth assets is

σij = Σm Σn βim βjn ρmn   (12)

As long as none of the factors are redundant and none of the asset returns are completely determined by the factors (so νi² ≠ 0), there will not be any portfolios that erroneously appear to be riskless. That is, we will not encounter the first problem mentioned earlier with respect to using sample statistics.

Imposing structure with a factor model makes the VCV matrix much simpler. With N assets, there are [N(N − 1)/2] distinct covariance elements in the VCV matrix. For example, if N = 100, there are 4,950 distinct covariances to be estimated. The factor model reduces this problem to estimating [N × K] factor sensitivities plus [K(K + 1)/2] elements of the factor VCV matrix, Ω. With N = 100 and K = 5, this would mean “only” 500 sensitivities and 15 elements of the factor VCV matrix—almost a 90% reduction in items to estimate. (Of course, we also need to estimate the asset-specific variance terms, νi², in order to get the N variances, σi².) If the factors are chosen well, the factor-based VCV matrix will contain substantially less estimation error than the sample VCV matrix does.
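In matrix form, Equations 11 and 12 say that the factor-based VCV matrix equals BΩBᵀ plus a diagonal matrix of asset-specific variances, where B is the N × K matrix of sensitivities. The sketch below, using simulated sensitivities and factor returns (all values are hypothetical), assembles such a matrix and reproduces the parameter-count comparison just discussed.

```python
import numpy as np

# Hypothetical dimensions: N assets, K common factors (illustrative values only).
rng = np.random.default_rng(seed=7)
N, K = 100, 5

B = rng.normal(1.0, 0.3, size=(N, K))          # factor sensitivities (betas)
factor_returns = rng.normal(0.0, 0.02, size=(600, K))
omega = np.cov(factor_returns, rowvar=False)   # K x K factor VCV matrix
nu2 = rng.uniform(0.0001, 0.0004, size=N)      # asset-specific variances

# Factor-based VCV matrix: covariances come from the factors alone;
# variances add the asset-specific component on the diagonal.
factor_based_vcv = B @ omega @ B.T + np.diag(nu2)

# Parameter count: N*K sensitivities + K(K+1)/2 factor terms (plus N nu2 terms),
# versus N(N-1)/2 distinct sample covariances.
n_factor_params = N * K + K * (K + 1) // 2
print(f"Distinct sample covariances:       {N * (N - 1) // 2}")
print(f"Factor-model parameters (ex. nu2): {n_factor_params}")
```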
A well-specified factor model can also improve cross-sectional consistency. To illustrate, suppose we somehow know that the true covariance of any asset i with any asset j is proportional to asset i’s covariance with any third asset, k, so

σij = φ σik

for any assets i, j, and k, where the proportionality factor φ depends on j and k but not on i. We would want our estimates to come as close as possible to satisfying this relationship. Sample covariances computed from any given sample of returns will not, in general, do so. However, using Equation 12 with only one factor (i.e., K = 1) shows that the covariances from a single-factor model will satisfy

σij/σik = (βi1 βj1 ρ11)/(βi1 βk1 ρ11) = βj1/βk1
for all assets i, j, and k. Thus, in this simple example, a single-factor model imposes exactly the right cross-sectional structure.
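To see this concretely, a few lines of Python (with made-up betas and factor variance; asset-specific variance terms are omitted because only covariances between distinct assets are needed) confirm that the ratio does not depend on which asset i is used:

```python
import numpy as np

# Single-factor case (K = 1): sigma_ij = beta_i * beta_j * rho_11 for i != j,
# so sigma_ij / sigma_ik reduces to beta_j / beta_k regardless of i.
betas = np.array([0.8, 1.0, 1.3, 0.6])   # beta_i1 for four hypothetical assets
rho_11 = 0.02 ** 2                       # variance of the single factor
cov = np.outer(betas, betas) * rho_11    # off-diagonal elements are the covariances

i, j, k = 0, 1, 2
print(cov[i, j] / cov[i, k], betas[j] / betas[k])   # identical ratios
i = 3
print(cov[i, j] / cov[i, k], betas[j] / betas[k])   # same ratio for a different i
```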
The benefits obtained by imposing a factor structure—handling large numbers of assets, a reduced number of parameters to be estimated, imposition of cross-sectional structure, and a potentially substantial reduction of estimation error—come at a cost. In contrast to the simple example just discussed, in general, the factor model will almost certainly be mis-specified. The structure it imposes will not be exactly right. As a result, the factor-based VCV matrix is biased; that is, the expected value is not equal to the true (unobservable) VCV matrix of the returns. To put it differently, the matrix is not correct even “on average.” The matrix is also inconsistent; that is, it does not converge to the true matrix as the sample size gets arbitrarily large. In contrast, the sample VCV matrix is unbiased and consistent. Thus, when we use a factor-based matrix instead of the sample VCV matrix, we are choosing to estimate something that is “not quite right” with relative precision rather than the “right thing” with a lot of noise. The point is that although factor models are very useful, they are not a panacea.
Shrinkage Estimation of VCV Matrices
As with shrinkage estimation in general, the idea here is to combine the information in the sample data, the sample VCV matrix, with an alternative estimate, the target VCV matrix—which reflects assumed “prior” knowledge of the structure of the true VCV matrix—and thereby mitigate the impact of estimation error on the final matrix. Each element (variance or covariance) of the final shrinkage estimate of the VCV matrix is simply a weighted average of the corresponding elements of the sample VCV matrix and the target VCV matrix. The same weights are used for all elements of the matrix. The analyst must determine how much weight to put on the target matrix (the “prior” knowledge) and how much weight to put on the sample data (the sample VCV matrix).
Aside from a technical condition that rules out the appearance of riskless portfolios, virtually any choice of target VCV matrix will increase (or at least not decrease) the efficiency of the estimates versus the sample VCV matrix. “Efficiency” in this context means a smaller mean-squared error (MSE), which is equal to an estimator’s variance plus the square of its bias. Although the shrinkage estimator is biased, its MSE will in general be smaller than the MSE of the (unbiased) sample VCV matrix. The more plausible (and presumably less biased) the selected target matrix, the greater the improvement will be. A factor-model-based VCV matrix would be a reasonable candidate for the target.
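A minimal sketch of the calculation follows; the helper name shrink_vcv and the 2 × 2 matrices are invented purely for illustration.

```python
import numpy as np

def shrink_vcv(sample_vcv, target_vcv, w):
    """Element-by-element weighted average of a sample VCV matrix and a target
    VCV matrix. The weight w on the target ("prior") matrix is the same for
    every element, as described in the text."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("weight on the target matrix must lie in [0, 1]")
    return w * np.asarray(target_vcv) + (1.0 - w) * np.asarray(sample_vcv)

# Toy illustration (numbers are made up): a noisy sample matrix is pulled
# toward a smoother factor-model-based target.
sample = np.array([[0.0400, 0.0150],
                   [0.0150, 0.0900]])
target = np.array([[0.0380, 0.0210],
                   [0.0210, 0.0850]])
print(shrink_vcv(sample, target, w=0.5))
```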
Solution:
The VCV matrix based on sample statistics is correct on average (it is unbiased) and converges to the true VCV matrix as the sample size gets arbitrarily large (it is “consistent”). The sample VCV method cannot be used if the number of assets exceeds the number of observations, which is not an issue in this case. However, it is subject to large sampling errors unless the number of observations is large relative to the number of assets. A 10-to-1 rule of thumb would suggest that Berkitz needs more than 250 observations (20+ years of monthly data) in order for the sample VCV matrix to give her reliable estimates, but she has at most 120 observations. In addition, the sample VCV matrix does not impose any cross-sectional consistency on the estimates. A factor-model-based VCV matrix can be used even if the number of assets exceeds the number of observations. It can substantially reduce the number of unique parameters to be estimated, it imposes cross-sectional structure, and it can substantially reduce estimation errors. However, unless the structure imposed by the factor model is exactly correct, the VCV matrix will not be correct on average (it will be biased). Shrinkage estimation—a weighted average of the sample VCV and factor-based VCV matrices—will increase (or at least not decrease) the efficiency of the estimates. In effect, the shrinkage estimator captures the benefits of each underlying methodology and mitigates their respective limitations.
Estimating Volatility from Smoothed Returns
The available return data for such asset classes as private real estate, private equity, and hedge funds generally reflect smoothing of unobservable underlying “true” returns. The smoothing dampens the volatility of the observed data and distorts correlations with other assets. Thus, the raw data tend to understate the risk and overstate the diversification benefits of these asset classes. Failure to adjust for the impact of smoothing will almost certainly lead to distorted portfolio analysis and hence poor asset allocation decisions.
The basic idea is that the observed returns are a weighted average of current and past true, unobservable returns. One of the simplest and most widely used models implies that the current observed return, Rt, is a weighted average of the current true return, rt, and the previous observed return:

Rt = (1 − λ)rt + λRt−1, where 0 < λ < 1

Assuming the true returns are serially uncorrelated, this smoothing implies var(r) = [(1 + λ)/(1 − λ)]var(R), so the observed data understate the true variance. As an example, if λ = 0.8, then the true variance, var(r), of the asset is 9 times the variance of the observed data. Equivalently, the standard deviation is 3 times larger.
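A quick simulation (made-up parameters, simulated “true” returns, and λ treated as known purely for the illustration) confirms the variance relationship and shows how observed returns can be de-smoothed by inverting the smoothing equation:

```python
import numpy as np

# Simulate "true" returns, smooth them with lambda = 0.8, and verify that the
# observed variance is roughly one-ninth of the true variance.
rng = np.random.default_rng(seed=1)
lam = 0.8
true_r = rng.normal(0.01, 0.06, size=100_000)

observed = np.empty_like(true_r)
observed[0] = true_r[0]
for t in range(1, len(true_r)):
    observed[t] = (1 - lam) * true_r[t] + lam * observed[t - 1]

print(np.var(true_r) / np.var(observed))          # approximately 9

# De-smooth by inverting the smoothing equation:
#   r_t = (R_t - lam * R_{t-1}) / (1 - lam)
desmoothed = (observed[1:] - lam * observed[:-1]) / (1 - lam)
print(np.std(desmoothed) / np.std(observed[1:]))  # approximately 3
```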
This model cannot be estimated directly because the true return, rt, is not observable. To get around this problem, the analyst assumes a relationship between the unobservable return and one or more observable variables. For private real estate, a natural choice might be a REIT index, whereas for private equity, an index of similar publicly traded equities could be used.
Time-Varying Volatility: ARCH Models
The discussion up to this point has focused on estimating variances and covariances under the assumption that their true values do not change over time. It is well known, however, that financial asset returns tend to exhibit volatility clustering, evidenced by periods of high and low volatility. A class of models known collectively as autoregressive conditional heteroskedasticity (ARCH) models has been developed to address these time-varying volatilities.
One of the simplest and most heavily used forms of this broad class of models specifies that the variance in period t is given by

σt² = γ + ασt−1² + βηt²
    = γ + (α + β)σt−1² + β(ηt² − σt−1²)

where α, β, and γ are non-negative parameters such that (α + β) < 1. The term ηt is the unexpected component of return in period t; that is, it is a random variable with a mean of zero conditional on information at time (t − 1). Rearranging the equation as in the second line shows that

(ηt² − σt−1²)

can be interpreted as the “shock” to the variance in period t. Thus, the variance in period t depends on the variance in period (t − 1) plus a shock. The parameter β controls how much of the current “shock” feeds into the variance. In the extreme, if β = 0, then variance would be deterministic. The quantity (α + β) determines the extent to which the variance in future periods is influenced by the current level of volatility. The higher (α + β) is, the more the variance “remembers” what happened in the past and the more it “clusters” at high or low levels. The unconditional expected value of the variance is [γ/(1 − α − β)].
As an example, assume that γ = 0.000002, α = 0.9, and β = 0.08 and that we are estimating daily equity volatility. Given these parameters, the unconditional expected value of the variance is 0.0001, implying that the daily standard deviation is 1% (0.01). Suppose the estimated variance at time (t − 1) was 0.0004 (= 0.02²) and the return in period t was 3% above expectations (ηt = 0.03). Then the variance in period t would be

σt² = 0.000002 + 0.9(0.0004) + 0.08(0.03)² = 0.000434

which is equivalent to a daily standard deviation of approximately 2.08%.
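The arithmetic can be checked with a few lines of Python; the variable names are chosen for readability and are not taken from any particular library.

```python
# Variance update of the form described in the text:
#   sigma2_t = gamma + alpha * sigma2_prev + beta * eta_t**2
gamma, alpha, beta = 0.000002, 0.9, 0.08

# Unconditional expected variance: gamma / (1 - alpha - beta)
uncond_var = gamma / (1 - alpha - beta)
print(uncond_var, uncond_var ** 0.5)      # roughly 0.0001 and 0.01 (1% daily)

# Update given the prior variance and the return surprise from the example
sigma2_prev = 0.0004                      # (0.02)**2
eta_t = 0.03                              # return 3% above expectations
sigma2_t = gamma + alpha * sigma2_prev + beta * eta_t ** 2
print(sigma2_t, sigma2_t ** 0.5)          # 0.000434 and roughly 0.0208
```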
The ARCH methodology can be extended to multiple assets—that is, to estimation of a VCV matrix. The most straightforward extensions tend to be limited to only a few assets since the number of parameters rises very rapidly. However, Engle (2002) developed a class of models with the potential to handle large matrices with relatively few parameters.