r - Obtaining risk-neutral probability from option prices

Suppose I have the following data (current stock and option prices for Bank of America):


   Strike Last        IV Probability
4       8 5.43 0.5813566   0.0000000
7      11 2.45 0.2868052   0.1571556
8      12 1.68 0.3611712   0.0000000
9      13 0.93 0.3149634   0.0000000
10     14 0.42 0.2906097   0.4216563
11     15 0.16 0.2827868   0.0000000
12     16 0.06 0.2894076   0.0000000
13     17 0.03 0.3147238  12.5000000
14     18 0.02 0.3498626   0.0000000
15     19 0.02 0.4019490   0.0000000
16     20 0.01 0.4093461 100.0000000
17     21 0.01 0.4513419   0.0000000
18     22 0.02 0.5374740   0.0000000
19     23 0.01 0.5280132         Inf
20     24 0.02 0.6147154   0.0000000
21     25 0.01 0.5967137   0.0000000

What I want to do is obtain the risk-neutral probability distribution of stock returns from this data. I read this question. It states that for this purpose I need to take the second derivative of the option price with respect to the strike, $\frac{\partial^2 c}{\partial K^2}$. Am I right that I cannot do that analytically from the option pricing formula (I use BSM)? If so, what is the practical solution (could you please explain in detail, preferably with an example using my data)?
In Hull's book it is stated that one can use the following expression to estimate the probability density $g(K)$:

$g(K) = e^{rT}\frac{c_1+c_3-2c_2}{\delta^2}$ where $K$ is a strike, and $c_1$, $c_2$ and $c_3$ are the prices of European call options with maturity $T$ and strikes $K-\delta$, $K$ and $K+\delta$ respectively. Is this the approach I should use in practice to estimate $g(K)$? I have tried it with my data but I get unrealistic results (e.g. negative probabilities).
I appreciate your help.

Update: following @Quantuple's answer, I calculated the probability, $Probability_i$, that the stock price will lie between $Strike_{i-1}$ and $Strike_{i+1}$ (am I interpreting it correctly?). The values that I got seem unrealistic (e.g. negative probability). I then tried the same thing with another data set (Apple current stock and option prices) and got the following:



   Strike  Last        IV Probability
8    85.0 21.41 0.2814728  0.00000000
10   90.0 16.65 0.2712171  0.04287807
11   92.5 14.15 0.2350727  0.00000000
12   95.0 12.10 0.2530275  0.00000000
13   97.5 10.05 0.2506622  0.01535698
14  100.0  8.23 0.2525582  0.00000000
15  105.0  5.01 0.2436027  0.00000000
16  110.0  2.71 0.2368809  0.07017230
17  115.0  1.35 0.2363258  0.00000000
18  120.0  0.61 0.2362289  0.00000000
19  125.0  0.29 0.2435342  0.85066163
20  130.0  0.15 0.2548730  0.00000000
21  135.0  0.08 0.2660732  0.00000000
22  140.0  0.05 0.2814728  4.00000000
23  145.0  0.03 0.2935170  0.00000000

I use the following R code

library(quantmod)  # getOptionChain(), getQuote()
library(ggplot2)   # ggplot()
# iv.opt() is not provided by quantmod or ggplot2; it is assumed to be defined elsewhere.

# Download the call chain and keep the first two columns (Strike, Last)
chain <- getOptionChain("BAC", Exp = "2016-05-20")
chain <- chain$calls
chain <- chain[, 1:2]
chain$IV <- 0

# Time to expiry in years (360-day convention) and risk-free rate
time_remain <- as.numeric(as.Date("2016-05-20") - as.Date(Sys.time()))
time <- time_remain/360
rf <- 0.01
Spot <- getQuote("BAC")
Spot <- Spot$Last
chain <- as.data.frame(apply(chain, 2, as.numeric))

# Implied volatility for every strike
for (i in 1:nrow(chain)) {
  chain$IV[i] <- iv.opt(S = Spot, K = chain$Strike[i], T = time, riskfree = rf, price = chain$Last[i], type = "Call")
}
chain <- na.omit(chain)
ggplot(chain, aes(x = Strike, y = IV)) + geom_line(size = 1, color = "red") + ylab("Implied volatility") + theme(axis.text = element_text(size = 18), panel.border = element_rect(fill = NA, colour = "black", size = 2), axis.title = element_text(size = 20))

# Butterfly-based "probability", computed for every third row only
chain$Probability <- 0
for (i in seq(2, nrow(chain), 3)) {
  chain$Probability[i] <- (-2*chain$Last[i] + chain$Last[i-1] + chain$Last[i+1])/((chain$Last[i+1] - chain$Last[i-1])^2)
}
chain

For Apple everything looks fine except that $P\left[S \in (135; 145)\right] = 4$, which is again unrealistic. Is it possible to obtain risk-neutral probabilities that I could interpret in the "usual" way (i.e. they are non-negative and their sum does not exceed $1$)?

Answer

The risk-neutral probability density function $q(.)$ is indeed given by $q(S_T=s) = \frac{1}{P(0,T)} \frac{ \partial^2 C }{\partial K^2} (K=s,T)$ where $P(0,T)$ denotes the relevant discount factor. This is known as the Breeden-Litzenberger identity.

Because you do not observe a continuum of call prices in practice, you can use a finite difference approximation to estimate the second derivative. In your case, because the strikes lie on a uniform grid (constant spacing; otherwise you should use a more general formula, see the additional comments below), you can use

$\frac{ \partial^2 C }{\partial K^2} (K=s,T) \approx \frac{ C(K=s-\Delta K,T) - 2 C(K=s,T) + C(K=s+\Delta K,T)}{ (\Delta K)^2 }$

which is equivalent to forming a (normalised) butterfly around your target strike $K=s$. Hence, if your prices admit no arbitrage, your pdf should be non-negative, as you rightly mention.
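As a minimal sketch of the uniform-grid formula above, the following R snippet applies it to the BAC last prices quoted in the question for strikes 11 to 25, where the spacing is constant. The function name `butterfly_density` and the `discount` argument (set to 1 here, for lack of rate data) are illustrative assumptions, not part of the original post.

```r
# BAC last prices from the question, strikes 11..25 (uniform spacing of 1)
strike <- 11:25
last <- c(2.45, 1.68, 0.93, 0.42, 0.16, 0.06, 0.03, 0.02,
          0.02, 0.01, 0.01, 0.02, 0.01, 0.02, 0.01)

# Hypothetical helper: second finite difference of call prices in strike,
# divided by the discount factor P(0,T)
butterfly_density <- function(strike, price, discount = 1) {
  dk <- unique(diff(strike))
  stopifnot(length(dk) == 1)  # the formula is only valid on a uniform grid
  n <- length(price)
  d2 <- (price[1:(n - 2)] - 2 * price[2:(n - 1)] + price[3:n]) / dk^2
  data.frame(Strike = strike[2:(n - 1)], Density = d2 / discount)
}

bac <- butterfly_density(strike, last)
bac[bac$Density < 0, ]  # strikes at which the butterfly has a negative price
```

On these numbers the density estimate is negative at strikes 19, 22 and 24, which is the symptom described in the question.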

The problem here is that you might observe some 'fictive' arbitrage opportunities, because you are dealing with last prices.

For instance, with your data: $q(S_T=10) = \frac{C(K=9,T)-2C(K=10,T)+C(K=11,T)}{1^2} = (4.75 - 2(3.68) + 2.52) < 0$ which does not seem right, as you point out.

Actually, if you simply look at the call values for $K=\{21,22,23,24\}$ , you'll see that the prices are not even monotonically decreasing, hence a clear vertical spread arbitrage opportunity.
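The monotonicity check just described can be sketched in a few lines of R. The prices are again the BAC last prices from the question for strikes 11 to 25; the variable names are illustrative.

```r
# BAC last prices from the question, strikes 11..25
strike <- 11:25
last <- c(2.45, 1.68, 0.93, 0.42, 0.16, 0.06, 0.03, 0.02,
          0.02, 0.01, 0.01, 0.02, 0.01, 0.02, 0.01)

# A call with a higher strike should never cost more than one with a lower strike
rises <- diff(last) > 0
violations <- data.frame(
  LowStrike  = strike[-length(strike)][rises],
  HighStrike = strike[-1][rises],
  LowPrice   = last[-length(last)][rises],
  HighPrice  = last[-1][rises]
)
violations
```

Each row is a pair of adjacent strikes where the higher strike call is quoted above the lower strike call, i.e. a vertical spread that could be bought for a negative price.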

So yes, you do have the right formula, but you need to either

Clean your input prices. You can look at techniques such as arbitrage free smoothing, but this goes beyond the scope of your question.

Use a more reliable data source (e.g. when working with live prices, such blatant arbitrage opportunities rarely persist due to the presence of arbitrageurs, plus there is usually a bid/ask spread).

Some clarifications requested by the OP:

A last price, as the name indicates, is the last price at which a certain security traded before the market closed. Using this kind of price can be dangerous. For instance, suppose we are a few seconds away from the market close and you observe a set of prices $C(K_1)$, ..., $C(K_N)$ that allow for no arbitrage opportunity. Now, someone wishes to trade $C(K_3)$ just before the market closes. Assume that the price at which the trade is made implies a butterfly arbitrage opportunity given the values of the neighbouring prices $C(K_1)$ and $C(K_2)$. If the market closes just after this trade, no participant can actually do anything about the arbitrage opportunity that the last participant created. In the last prices displayed by the exchange, the arbitrage opportunity will be present, although no one could benefit from it in practice. This is what I meant by 'fictive' arbitrage.

$q(S_T = K)$ does not represent the probability that $S_T$ will lie between strikes $K-\Delta K$ and $K + \Delta K$. Rather, it is a density and should be understood in the infinitesimal sense. More specifically: $q(S_T = K) = \lim_{\Delta K\rightarrow 0} \frac{Prob[ K - \Delta K \leq S_T \leq K + \Delta K ]}{2 \Delta K}$, so the probability of an interval is obtained by integrating $q$ over that interval.
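Since $q$ is a density, a probability for an interval of strikes comes from integrating it, not from reading off a single value. A minimal sketch, assuming trapezoidal integration and using the Apple density estimates reported in the table of this answer (strikes 90 to 140); the helper name `interval_prob` is hypothetical.

```r
# Apple strikes and density estimates q(S_T = K) taken from the table in this answer
strike <- c(90, 92.5, 95, 97.5, 100, 105, 110, 115, 120, 125, 130, 135, 140)
dens <- c(-0.0128, 0.0720, 0.0000, 0.0368, 0.0224, 0.0368, 0.0376,
          0.0248, 0.0168, 0.0072, 0.0028, 0.0016, 0.0004)

# Hypothetical helper: trapezoidal integral of the density over [lo, hi],
# where lo and hi are assumed to coincide with grid strikes
interval_prob <- function(strike, dens, lo, hi) {
  keep <- strike >= lo & strike <= hi
  k <- strike[keep]
  q <- dens[keep]
  sum(diff(k) * (head(q, -1) + tail(q, -1)) / 2)
}

p_tail <- interval_prob(strike, dens, 130, 140)  # mass between 130 and 140
p_all  <- interval_prob(strike, dens, 90, 140)   # mass over the whole quoted range
c(p_tail, p_all)
```

With these inputs the mass over the whole quoted range stays below 1, which is the kind of sanity check the question asks for; the negative density at strike 90 is still carried into the integral.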

In the first part of my answer, I wrote that you could only use the formula I gave (which also happens to be the one you mention in your original post) if you had a constant spacing $\Delta K$ on your strike grid. With the Apple data you do not. In that case you should rather use: $\frac{ \partial^2 C }{\partial K^2} (K=s,T) \approx \frac{2 }{\Delta K^-(\Delta K^- + \Delta K^+)} C(K=s-\Delta K^-,T) - \left( \frac{2}{\Delta K^-(\Delta K^- + \Delta K^+)} + \frac{2}{\Delta K^+(\Delta K^- + \Delta K^+)} \right) C(K=s,T) + \frac{2 }{\Delta K^+(\Delta K^- + \Delta K^+)} C(K=s+\Delta K^+,T)$ where $\Delta K^-$ is allowed to differ from $\Delta K^+$ (when they are equal you fall back on the original formula for a uniform grid). Note that both formulas are approximations: the smaller $\Delta K$, the better they work.

Running this formula on your example should give (first column = strike price $K$, second column = last price $C(K,T)$, third column = $q(S_T=K)$)


   85.0000   21.4100       NaN
   90.0000   16.6500   -0.0128
   92.5000   14.1500    0.0720
   95.0000   12.1000    0.0000
   97.5000   10.0500    0.0368
  100.0000    8.2300    0.0224
  105.0000    5.0100    0.0368
  110.0000    2.7100    0.0376
  115.0000    1.3500    0.0248
  120.0000    0.6100    0.0168
  125.0000    0.2900    0.0072
  130.0000    0.1500    0.0028
  135.0000    0.0800    0.0016
  140.0000    0.0500    0.0004
  145.0000    0.0300       NaN

which is still not totally satisfying (note that I took a unit discount factor, in the absence of relevant data).
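The non-uniform formula can be sketched in R as follows. The inputs are the Apple strikes and last prices from the question; the function name `second_diff_nonuniform` and the `discount` argument (kept at 1, as in the table above) are illustrative assumptions. Note that the spacings in the denominators are differences of strikes, whereas the loop in the question divides by a squared difference of `Last` prices.

```r
# Apple strikes and last prices from the question
strike <- c(85, 90, 92.5, 95, 97.5, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145)
last <- c(21.41, 16.65, 14.15, 12.10, 10.05, 8.23, 5.01, 2.71,
          1.35, 0.61, 0.29, 0.15, 0.08, 0.05, 0.03)

# Hypothetical helper: three-point second derivative on a non-uniform grid,
# divided by the discount factor P(0,T); end points are left as NA
second_diff_nonuniform <- function(strike, price, discount = 1) {
  n <- length(strike)
  q <- rep(NA_real_, n)
  for (i in 2:(n - 1)) {
    hm <- strike[i] - strike[i - 1]  # spacing to the lower strike
    hp <- strike[i + 1] - strike[i]  # spacing to the upper strike
    q[i] <- (2 / (hm * (hm + hp)) * price[i - 1]
             - 2 / (hm * hp) * price[i]
             + 2 / (hp * (hm + hp)) * price[i + 1]) / discount
  }
  q
}

aapl <- data.frame(Strike = strike, Last = last,
                   Density = round(second_diff_nonuniform(strike, last), 4))
aapl
```

The middle coefficient is written as $2/(\Delta K^- \Delta K^+)$, which is the sum of the two outer coefficients, so the expression is the same as the formula in the text.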

I would recommend using OTMF options only for better results. That is, use call options for strikes $K \geq F(0,T)$ (where you see the pdf is relatively well behaved) and put options for strikes $K < F(0,T)$ .
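One way to combine the two option types, sketched below, is to convert each out-of-the-money put into an equivalent call price through put-call parity, $C = P + P(0,T)\,(F(0,T) - K)$, so that a single call curve can be fed to the finite difference formula. The quotes in this example are made-up toy numbers, not market data, and the function name `otmf_call_curve` is hypothetical.

```r
# Hypothetical helper: build a call price curve from OTM-forward quotes only.
# Calls are used for K >= forward; puts for K < forward are mapped to calls
# with put-call parity: C = P + discount * (forward - K).
otmf_call_curve <- function(strike, call, put, forward, discount = 1) {
  use_call <- strike >= forward
  data.frame(
    Strike = strike,
    Source = ifelse(use_call, "call", "put"),
    CallEquivalent = ifelse(use_call, call, put + discount * (forward - strike))
  )
}

# Toy quotes for illustration only (NOT market data)
toy_strike <- c(95, 100, 105)
toy_call   <- c(6.0, 3.0, 1.2)
toy_put    <- c(1.1, 3.0, 6.1)
curve <- otmf_call_curve(toy_strike, toy_call, toy_put, forward = 100)
curve
```

Because the parity adjustment is linear in $K$, it does not change the second derivative in strike; it only swaps in the quote that is out of the money at each strike.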


Tuesday, February 6, 2018
