Assuming an asset price $S$ follows a geometric Brownian motion (GBM), the log returns $R$ are distributed as $$ R_i := \log\left(\frac{S_i}{S_{i-1}}\right) \sim \mathcal{N}\left(\left(\mu - \frac{\sigma^2}{2}\right) \Delta t, \sigma^2 \Delta t \right), \quad i=1,\ldots,N. $$
Let $m = \left(\mu - \frac{\sigma^2}{2}\right) \Delta t$ and $s^2 = \sigma^2 \Delta t$, and consider calibrating a GBM to some returns $R_i$. We'll use the maximum likelihood estimate for $m$, and for simplicity we assume $s$ is known (as would be the case if we were generating the data through a simulation ourselves), in which case $$ \hat{m} = \frac{1}{N} \sum_{i=1}^N R_i. $$ Then the sampling distribution for the sample mean is approximately $\hat{m} \sim \mathcal{N}\left(m, \frac{s^2}{N}\right)$, and an approximate $(1-\alpha)100\%$ confidence interval for the true mean $m$ is $$ \left[\hat{m} - z_{\alpha/2}\frac{s}{\sqrt{N}},\: \hat{m} + z_{\alpha/2}\frac{s}{\sqrt{N}}\right] \qquad (1). $$ In particular, for fixed $s$, increasing the number of observations $N$ results in a narrower confidence interval. This, of course, is a standard result from elementary statistics.
On the other hand, we really need an estimate for $\mu$ in practice, and from $(1)$ we can derive a confidence interval for $\mu$ centered at the estimator $\hat{\mu} = \frac{\hat{m}}{\Delta t} + \frac{\sigma^2}{2}$: \begin{align*} & \hat{m} - z_{\alpha/2}\frac{s}{\sqrt{N}} < m < \hat{m} + z_{\alpha/2}\frac{s}{\sqrt{N}} \\ & \qquad \iff \hat{m} - z_{\alpha/2}\frac{s}{\sqrt{N}} < \left(\mu - \frac{\sigma^2}{2}\right) \Delta t < \hat{m} + z_{\alpha/2}\frac{s}{\sqrt{N}} \\ & \qquad \iff \frac{\hat{m}}{\Delta t} + \frac{\sigma^2}{2} - z_{\alpha/2}\frac{s}{\Delta t\sqrt{N}} < \mu < \frac{\hat{m}}{\Delta t} + \frac{\sigma^2}{2} + z_{\alpha/2}\frac{s}{\Delta t\sqrt{N}} \\ & \qquad \iff \hat{\mu} - z_{\alpha/2}\frac{\sigma}{\sqrt{N \Delta t}} < \mu < \hat{\mu} + z_{\alpha/2}\frac{\sigma}{\sqrt{N \Delta t}}, \end{align*} where the last step uses $s = \sigma\sqrt{\Delta t}$. Then, since $\Delta t = \frac{T}{N}$ for some final observation time $T$, a $(1-\alpha)100\%$ confidence interval for the true drift $\mu$ is $$ \left[\hat{\mu} - z_{\alpha/2}\frac{\sigma}{\sqrt{T}},\: \hat{\mu} + z_{\alpha/2}\frac{\sigma}{\sqrt{T}}\right]. $$ In particular, increasing the number of observations $N$ has no effect on the confidence interval for the drift $\mu$. Instead, we only obtain a narrower confidence interval by increasing the final time, $T$.
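To make the two interval widths concrete, here is a minimal Python sketch. The function names `mean_ci_half_width` and `drift_ci_half_width` are made up for illustration, $\sigma$ is assumed known as above, and $z_{\alpha/2}$ is taken from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_half_width(sigma, T, N, alpha=0.05):
    """Half-width z_{alpha/2} * s / sqrt(N) of interval (1) for m, with s = sigma * sqrt(dt)."""
    dt = T / N
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma * sqrt(dt) / sqrt(N)


def drift_ci_half_width(sigma, T, alpha=0.05):
    """Half-width z_{alpha/2} * sigma / sqrt(T) of the interval for the drift mu."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma / sqrt(T)


if __name__ == "__main__":
    sigma, T = 0.2, 1.0  # illustrative values, not taken from any data set
    for N in (10, 1_000, 1_000_000):
        dt = T / N
        # Dividing the half-width for m by dt gives the half-width for mu.
        print(N, mean_ci_half_width(sigma, T, N) / dt, drift_ci_half_width(sigma, T))
```

Dividing the half-width for $m$ by $\Delta t$ reproduces the half-width for $\mu$ for every $N$, which is the cancellation in the derivation above: the interval for $m$ shrinks like $1/N$ in this setup, but so does $\Delta t$.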
Indeed, for fixed $T$ we may think of obtaining higher and higher frequency data so that $N$ becomes larger and larger. But then $\Delta t$ becomes smaller and smaller by definition, since $\Delta t = \frac{T}{N}$. This seems quite counterintuitive: for fixed $T$, no matter if I have 1,000 or $10^{16}$ observations, I get no closer to my true drift $\mu$. On the other hand, if I have only 10 observations over 100 years, I get a much better estimate of $\mu$.
Have I overlooked something? Perhaps this is a well-known problem with estimating the drift that I'm not aware of?
Answer
Yes, you are correct. Consider the following toy example:
1) Log prices follow: $dp_t=\mu dt+\sigma dW_t$
2) Then: $r_{t+h,h}=p_{t+h}-p_t \sim N(\mu h, \sigma^2 h)$
3) Standard ML estimators, with $n$ observations spaced $h$ apart so that $T = nh$:
- $\hat{\mu}=\frac{1}{nh}\sum_{k=1}^{n} r_{kh,h}$
- $\hat{\sigma^2}=\frac{1}{nh}\sum_{k=1}^{n} (r_{kh,h}-\hat{\mu}h)^2$
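A short worked step shows why sampling more finely cannot help the drift estimate. Writing $T = nh$ for the length of the sample and using $r_{kh,h} = p_{kh} - p_{(k-1)h}$, the sum of returns telescopes:

$$ \hat{\mu} = \frac{1}{nh}\sum_{k=1}^{n} \left(p_{kh} - p_{(k-1)h}\right) = \frac{p_{T} - p_{0}}{T}. $$

The estimator depends only on the first and last log prices, so the intermediate observations carry no extra information about $\mu$, however many of them there are.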
Asymptotic distribution of estimators:
- $\sqrt T(\hat{\mu}-\mu) \rightarrow N(0,\sigma^2)$
- $\sqrt n (\hat{\sigma^2}-\sigma^2)\rightarrow N(0,2\sigma^4)$
So as $n$ tends to infinity (even with $T$ fixed) we get a precise estimator of $\sigma^2$, whereas the estimator of $\mu$ becomes precise only as $T$ tends to infinity.
This was first noted by Merton (1980).
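The two rates can be checked with a small Monte Carlo experiment. The following is only a sketch using the Python standard library: the helper names `estimate` and `estimator_spread` and the parameter values ($\mu = 0.05$, $\sigma = 0.2$, $T = 1$) are arbitrary choices for illustration, not part of the argument above.

```python
import random
from math import sqrt
from statistics import pstdev


def estimate(mu, sigma, T, n, rng):
    """Simulate n log returns over [0, T] and return (mu_hat, sigma2_hat)."""
    h = T / n
    # Each return is N(mu * h, sigma^2 * h), as in the toy model.
    returns = [rng.gauss(mu * h, sigma * sqrt(h)) for _ in range(n)]
    mu_hat = sum(returns) / (n * h)
    sigma2_hat = sum((r - mu_hat * h) ** 2 for r in returns) / (n * h)
    return mu_hat, sigma2_hat


def estimator_spread(mu, sigma, T, n, trials=400, seed=0):
    """Monte Carlo standard deviations of mu_hat and sigma2_hat across trials."""
    rng = random.Random(seed)
    draws = [estimate(mu, sigma, T, n, rng) for _ in range(trials)]
    mu_hats, sigma2_hats = zip(*draws)
    return pstdev(mu_hats), pstdev(sigma2_hats)


if __name__ == "__main__":
    mu, sigma, T = 0.05, 0.2, 1.0  # assumed values for illustration
    for n in (10, 100, 1000):
        sd_mu, sd_s2 = estimator_spread(mu, sigma, T, n)
        print(f"n={n:5d}  sd(mu_hat)={sd_mu:.4f}  sd(sigma2_hat)={sd_s2:.5f}")
    print(f"theory: sd(mu_hat) = sigma / sqrt(T) = {sigma / sqrt(T):.4f}")
```

With $T$ held fixed, the spread of $\hat{\mu}$ should stay near $\sigma/\sqrt{T}$ for every $n$, while the spread of $\hat{\sigma^2}$ should shrink roughly like $1/\sqrt{n}$.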