I was looking again at this question, which I believe haunts every quant, and I was thinking about the effect of gaps in the data on the computed volatility of a series.
Let's define the problem more specifically for clarity.
Say you're working on a simple mean-variance optimization for asset allocation. You have different asset classes (Equities, Bonds, Hedge Funds, ...), perhaps even split by region (Equities = {Equities EU, Equities US, ...}), and you usually run this optimization on index prices (e.g. the S&P 500 for Equities US).
Assuming we use daily data for the optimization: since the different indices do not trade on the same dates, if you put the series together in one data set you will have "holes" in some of them. As the series are prices, you need a policy for filling the gaps.
1) Removing data
You can decide to remove all the dates on which any of the values is missing. If you do so, then when computing returns you will have "ignored" a point for the series that actually had data on those dates, so the computed return can be of larger magnitude than it should be (because the two prices are actually separated by more than a single day).
2) Filling with data
Alternatively, you can decide to fill in the prices, for example with LVCF (last value carried forward), but this introduces 0 returns that are not part of the real data.
You could also fill the gaps with some interpolation algorithm, but this ends up "flattening" the path of the returns. A short sketch comparing the removal and carry-forward policies follows below.
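Here is a minimal sketch of the two policies using pandas and purely synthetic prices; the index names, the missing dates, and the parameter values are made up for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulate two daily price series with ~16% annualized volatility, then
# knock out a few dates in the second one to mimic local holidays.
dates = pd.bdate_range("2023-01-02", periods=250)
rets = rng.normal(0.0, 0.16 / np.sqrt(252), size=(250, 2))
prices = pd.DataFrame(100 * np.exp(rets.cumsum(axis=0)),
                      index=dates, columns=["IDX_US", "IDX_EU"])
holidays = dates[[30, 31, 95, 180]]          # arbitrary missing days
prices.loc[holidays, "IDX_EU"] = np.nan

# Policy 1: drop every date with a missing value, then take log returns.
# Returns that span the removed dates cover more than one trading day,
# so the per-observation returns of IDX_US get a few larger entries.
vol_drop = np.log(prices.dropna()).diff().std() * np.sqrt(252)

# Policy 2: last value carried forward, which inserts artificial 0 returns
# on the filled dates for IDX_EU.
vol_ffill = np.log(prices.ffill()).diff().std() * np.sqrt(252)

# Compare the two annualized estimates against the 0.16 used to simulate.
print(vol_drop)
print(vol_ffill)
```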
How do you usually tackle this issue so that the volatility is not distorted by these workarounds? Removing? LVCF? Interpolation?
Do you think that maybe this effect is negligible?
Answer
The usual technique of computing the mean and standard deviation of returns happens to coincide with the maximum likelihood estimate when the data are regularly spaced. However, when the data are not regularly spaced, you can still do a maximum likelihood estimate. It's just more computationally intensive than before.
That is to say, assume you have observations of asset price $S_i$ at (possibly irregular) times $t_i$, and let $\Delta t_i = t_{i+1} - t_i$ be the time differences. Then the transition probabilities are given by $$ p\left(S_{i+1} \mid S_i ; \alpha,\sigma \right) = \phi\left( \log(S_{i+1} / S_i );\ \left(\alpha-\tfrac12 \sigma^2\right)\Delta t_i ,\ \sigma \sqrt{\Delta t_i} \right) $$ where $\phi(x; \mu, s)$ is the Gaussian density with mean $\mu$ and standard deviation $s$. This transition probability comes from the Black-Scholes (lognormal) model, of course, which is the same one you are implicitly using when you employ the standard volatility estimators.
Now you can run an optimization algorithm (such as BFGS) to maximize the overall path probability. That is, you form the objective function $$ F\left(\alpha,\sigma; \{S_i,t_i\} \right) = \sum_i \log p\left(S_{i+1}\mid S_i; \alpha,\sigma \right) $$ and find its maximum over all possible values of $\alpha$ and $\sigma$. The corresponding value of $\sigma$ is your MLE volatility estimate.
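A minimal sketch of this estimate, assuming the Black-Scholes (GBM) transition density above; the synthetic path, the parameter values, and the helper `neg_log_likelihood` are illustrative, not part of the original answer. In practice the `times` array would be your actual observation dates expressed as year fractions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_log_likelihood(params, prices, times):
    """Negative log path probability of GBM observed at irregular times."""
    alpha, sigma = params
    if sigma <= 0:
        return np.inf                          # keep the optimizer in sigma > 0
    dt = np.diff(times)                        # possibly irregular gaps
    log_ret = np.diff(np.log(prices))          # log(S_{i+1} / S_i)
    mean = (alpha - 0.5 * sigma ** 2) * dt     # drift term per gap
    std = sigma * np.sqrt(dt)                  # scale term per gap
    return -norm.logpdf(log_ret, loc=mean, scale=std).sum()

# Synthetic example: simulate a GBM path on an irregular grid (gaps of 1-3
# business days), then recover alpha and sigma by maximizing the path
# likelihood with BFGS.
rng = np.random.default_rng(1)
times = np.cumsum(rng.choice([1, 2, 3], size=500)) / 252.0
dt = np.diff(times)
true_alpha, true_sigma = 0.05, 0.20
log_ret = rng.normal((true_alpha - 0.5 * true_sigma ** 2) * dt,
                     true_sigma * np.sqrt(dt))
prices = 100 * np.exp(np.concatenate([[0.0], np.cumsum(log_ret)]))

result = minimize(neg_log_likelihood, x0=[0.0, 0.1],
                  args=(prices, times), method="BFGS")
alpha_hat, sigma_hat = result.x
print(f"sigma MLE: {sigma_hat:.4f} (true {true_sigma})")
```

Because each transition is scaled by its own $\Delta t_i$, the multi-day gaps contribute correctly to the likelihood instead of being dropped or padded with artificial zero returns.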