Tuesday, February 6, 2018

r - Obtaining risk-neutral probability from option prices


Suppose I have the following data (for the current stock and option prices of the Bank of America)



   Strike Last        IV Probability
 4      8 5.43 0.5813566   0.0000000
 7     11 2.45 0.2868052   0.1571556
 8     12 1.68 0.3611712   0.0000000
 9     13 0.93 0.3149634   0.0000000
10     14 0.42 0.2906097   0.4216563
11     15 0.16 0.2827868   0.0000000
12     16 0.06 0.2894076   0.0000000
13     17 0.03 0.3147238  12.5000000
14     18 0.02 0.3498626   0.0000000
15     19 0.02 0.4019490   0.0000000
16     20 0.01 0.4093461 100.0000000
17     21 0.01 0.4513419   0.0000000
18     22 0.02 0.5374740   0.0000000
19     23 0.01 0.5280132         Inf
20     24 0.02 0.6147154   0.0000000
21     25 0.01 0.5967137   0.0000000

I want to obtain the risk-neutral probability distribution of stock returns from these data. I read this question, which states that for this purpose I need to take the second derivative of the option price with respect to the strike, $\frac{\partial^2 c}{\partial K^2}$. Am I right that I cannot do that analytically from the option formula (I use BSM)? If so, what is a practical solution (could you please explain in detail, preferably with an example using my data)?
In Hull's book it is stated that one can use the following expression to evaluate the probability density $g(K)$


$$ g(K) = e^{rT}\frac{c_1+c_3-2c_2}{\delta^2} $$ where $K$ is a strike, $c_1$, $c_2$ and $c_3$ are prices of European call options with maturity $T$ and strikes $K-\delta$, $K$ and $K+\delta$ respectively. Is this the approach I should use in practice to evaluate $g(K)$? I have tried it with my data but I get unrealistic results (e.g. negative probabilities).
Appreciate your help




Update: following @Quantuple's answer, I calculated the probability, $Probability_i$, that the stock price will lie between $Strike_{i-1}$ and $Strike_{i+1}$ (am I interpreting it correctly?). The values that I got seem to be unrealistic (e.g. negative probability). I then tried the same with another data set (Apple current stock and option prices) and got the following




   Strike  Last        IV Probability
 8   85.0 21.41 0.2814728  0.00000000
10   90.0 16.65 0.2712171  0.04287807
11   92.5 14.15 0.2350727  0.00000000
12   95.0 12.10 0.2530275  0.00000000
13   97.5 10.05 0.2506622  0.01535698
14  100.0  8.23 0.2525582  0.00000000
15  105.0  5.01 0.2436027  0.00000000
16  110.0  2.71 0.2368809  0.07017230
17  115.0  1.35 0.2363258  0.00000000
18  120.0  0.61 0.2362289  0.00000000
19  125.0  0.29 0.2435342  0.85066163
20  130.0  0.15 0.2548730  0.00000000
21  135.0  0.08 0.2660732  0.00000000
22  140.0  0.05 0.2814728  4.00000000
23  145.0  0.03 0.2935170  0.00000000

I use the following R code


library(quantmod)  # getOptionChain(), getQuote()
library(ggplot2)   # ggplot()
# iv.opt() is assumed to be available in the session; its source is not shown here

# Call chain for the chosen expiry: keep the first two columns (Strike, Last)
chain <- getOptionChain("BAC", Exp = "2016-05-20")
chain <- chain$calls
chain <- chain[, 1:2]
chain$IV <- 0

# Time to expiry in years (360-day convention) and a flat risk-free rate
time_remain <- as.numeric(as.Date("2016-05-20") - as.Date(Sys.time()))
time <- time_remain/360
rf <- 0.01
Spot <- getQuote("BAC")
Spot <- Spot$Last
chain <- as.data.frame(apply(chain, 2, as.numeric))

# Implied volatility for each strike
for (i in 1:nrow(chain)) {
  chain$IV[i] <- iv.opt(S = Spot, K = chain$Strike[i], T = time, riskfree = rf,
                        price = chain$Last[i], type = "Call")
}
chain <- na.omit(chain)

ggplot(chain, aes(x = Strike, y = IV)) +
  geom_line(size = 1, color = "red") +
  ylab("Implied volatility") +
  theme(axis.text = element_text(size = 18),
        panel.border = element_rect(fill = NA, colour = "black", size = 2),
        axis.title = element_text(size = 20))

# Second difference of Last at every third row, divided by the squared price difference
chain$Probability <- 0
for (i in seq(2, nrow(chain), 3)) {
  chain$Probability[i] <- (-2*chain$Last[i] + chain$Last[i-1] + chain$Last[i+1])/((chain$Last[i+1] - chain$Last[i-1])^2)
}
chain

For Apple everything looks OK except that $P\left[S \in (135; 145)\right] = 4$, which is unrealistic (again). Is it possible to obtain risk-neutral probabilities that I could interpret in the usual way (e.g. they are not negative and their sum does not exceed $1$)?




Answer



The risk-neutral probability density function $q(.)$ is indeed given by $$ q(S_T=s) = \frac{1}{P(0,T)} \frac{ \partial^2 C }{\partial K^2} (K=s,T) $$ where $P(0,T)$ denotes the relevant discount factor. This is known as the Breeden-Litzenberger identity.


Because you do not observe a continuum of call prices in practice, you can use a finite difference approximation to estimate the second derivative. In your case, because the strikes lie on a uniform grid (constant spacing; if they do not, you should use a more general formula, see the additional comments below), you can use


$$ \frac{ \partial^2 C }{\partial K^2} (K=s,T) \approx \frac{ C(K=s-\Delta K,T) - 2 C(K=s,T) + C(K=s+\Delta K,T)}{ (\Delta K)^2 } $$


which is equivalent to forming a (normalised) butterfly around your target strike $K=s$. Hence if you have no arbitrage in your prices then your pdf should be positive, as you rightfully mention.
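As a minimal sketch of this butterfly approximation, the snippet below applies the central second difference to the five Apple quotes from the question whose strikes are spaced 2.5 apart. The function name `butterfly_density` and the `discount` argument are illustrative, and a unit discount factor is assumed.

```r
# Apple call quotes from the question, restricted to the uniformly spaced strikes
strike <- c(90, 92.5, 95, 97.5, 100)
last   <- c(16.65, 14.15, 12.10, 10.05, 8.23)

# Central second difference on a uniform strike grid.
# `discount` stands for P(0,T); it is set to 1 here as an assumption.
butterfly_density <- function(strike, price, discount = 1) {
  dK <- diff(strike)
  stopifnot(all(abs(dK - dK[1]) < 1e-8))  # the formula needs constant spacing
  n <- length(price)
  q <- rep(NA_real_, n)                   # end points have no two-sided neighbours
  idx <- 2:(n - 1)
  q[idx] <- (price[idx - 1] - 2 * price[idx] + price[idx + 1]) / dK[1]^2 / discount
  q
}

q <- butterfly_density(strike, last)
print(data.frame(strike, last, q))
```

Each interior value is the price of a butterfly centred at that strike divided by the squared spacing, so a negative entry would signal a butterfly arbitrage in the quotes.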


The problem here is that you might observe some 'fictive' arbitrage opportunities, because you are dealing with last prices.


For instance, with your data: $$ q(S_T=10) = \frac{C(K=9,T)-2C(K=10,T)+C(K=11,T)}{1^2} = (4.75 - 2(3.68) + 2.52) < 0$$ which does not seem right, as you point out.


Actually, if you simply look at the call values for $K=\{21,22,23,24\}$, you'll see that the prices are not even monotonically decreasing, hence a clear vertical spread arbitrage opportunity.
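A quick way to screen quotes for this kind of problem is sketched below, using the Bank of America last prices for strikes 20 to 25 from the question. The tolerance `1e-12` is an arbitrary choice to guard against floating-point noise.

```r
# Bank of America last prices from the question for strikes 20..25
strike <- 20:25
last   <- c(0.01, 0.01, 0.02, 0.01, 0.02, 0.01)

# Vertical spread check: a call price must not increase with the strike
spread_violation <- diff(last) > 1e-12
print(data.frame(lower = strike[-length(strike)],
                 upper = strike[-1],
                 violation = spread_violation))

# Butterfly check on a uniform grid: second differences must not be negative
butterfly_violation <- diff(last, differences = 2) < -1e-12
print(data.frame(centre = strike[2:(length(strike) - 1)],
                 violation = butterfly_violation))
```

With these quotes the call price rises from strike 21 to 22 and again from 23 to 24, and the butterflies centred at 22 and 24 have negative value.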


So yes, you do have the right formula, but you need to either





  • Clean your input prices. You can look at techniques such as arbitrage-free smoothing, but this goes beyond the scope of your question.




  • Use a more reliable data source (e.g. when working with live prices, such blatant arbitrage opportunities rarely persist due to the presence of arbitrageurs; in addition, there is usually a bid/ask spread).






Some clarifications required by the OP:





  1. A last price, as the name indicates, is the last price at which a certain security traded before the market closed. Using this kind of price can be dangerous. For instance, suppose we are a few seconds away from the market close and you observe a set of prices $C(K_1)$,...,$C(K_N)$ that allow for no arbitrage opportunity. Now, someone wishes to trade $C(K_3)$ just before the market closes. Assume that the price at which the trade is made would imply a butterfly arbitrage opportunity given the values of $C(K_1)$ and $C(K_2)$. If the market closes just after this trade, no participant can actually do anything about the arbitrage opportunity that the last participant created. In the last prices displayed by the exchange, the arbitrage opportunity will be present, although no one could benefit from it in practice. This is what I meant by 'fictive' arbitrage.




  2. $q(S_T = K)$ does not represent the probability that $S_T$ will lie between strike $K-\Delta K$ and $K + \Delta K$. Rather, it should be understood in the infinitesimal sense, as a probability per unit of strike. More specifically: $$q(S_T = K) = \lim_{\Delta K\rightarrow 0} \frac{Prob[ K - \Delta K \leq S_T \leq K + \Delta K ]}{2\Delta K}$$




  3. In the first part of my answer, I wrote that you could only use the formula I gave (which also happens to be the one you mention in your original post) if you had a constant spacing $\Delta K$ on your strike grid. Here you don't have it. In that case you should rather use: $$ \frac{ \partial^2 C }{\partial K^2} (K=s,T) \approx \frac{2 }{\Delta K^-(\Delta K^- + \Delta K^+)} C(K=s-\Delta K^-,T) - \left( \frac{2}{\Delta K^-(\Delta K^- + \Delta K^+)} + \frac{2}{\Delta K^+(\Delta K^- + \Delta K^+)} \right) C(K=s,T) + \frac{2 }{\Delta K^+(\Delta K^- + \Delta K^+)} C(K=s+\Delta K^+,T)$$ where $\Delta K^-$ is allowed to be different from $\Delta K^+$ (if they are the same you fall back on the original formula for a uniform grid). Note that both formulas are approximations: the smaller $\Delta K$, the better they work.





Running this formula on your example should give (first column = strike price $K$, second column = last price $C(K,T)$, third column = $q(S_T=K)$)



 85.0000 21.4100     NaN
 90.0000 16.6500 -0.0128
 92.5000 14.1500  0.0720
 95.0000 12.1000  0.0000
 97.5000 10.0500  0.0368
100.0000  8.2300  0.0224
105.0000  5.0100  0.0368
110.0000  2.7100  0.0376
115.0000  1.3500  0.0248
120.0000  0.6100  0.0168
125.0000  0.2900  0.0072
130.0000  0.1500  0.0028
135.0000  0.0800  0.0016
140.0000  0.0500  0.0004
145.0000  0.0300     NaN

which is still not totally satisfying (note that I took a unit discount factor, in the absence of relevant data).
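The non-uniform formula can be sketched in R as follows, using the Apple strikes and last prices from the question. The function name `second_diff` and the `mass` column are illustrative additions: `mass` multiplies each density value by half the distance between its two neighbouring strikes, which is one simple way (an assumption, not part of the formula above) to turn the density into an approximate probability per strike.

```r
# Apple strikes and last call prices from the question
strike <- c(85, 90, 92.5, 95, 97.5, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145)
last   <- c(21.41, 16.65, 14.15, 12.10, 10.05, 8.23, 5.01, 2.71, 1.35,
            0.61, 0.29, 0.15, 0.08, 0.05, 0.03)

# Three-point second difference that allows unequal spacing on either side
second_diff <- function(K, C) {
  n <- length(K)
  i <- 2:(n - 1)
  h_minus <- K[i] - K[i - 1]
  h_plus  <- K[i + 1] - K[i]
  h_sum   <- h_minus + h_plus
  inner <- 2 * C[i - 1] / (h_minus * h_sum) -
    2 * C[i] / (h_minus * h_plus) +
    2 * C[i + 1] / (h_plus * h_sum)
  c(NA, inner, NA)  # the end points have no two-sided neighbours
}

discount <- 1  # unit discount factor, as in the table above
q <- second_diff(strike, last) / discount

# Approximate probability mass attached to each interior strike
n <- length(strike)
cell_width <- c(NA, (strike[3:n] - strike[1:(n - 2)]) / 2, NA)
mass <- q * cell_width

print(data.frame(strike, last, q = round(q, 4), mass = round(mass, 4)))
total_mass <- sum(mass, na.rm = TRUE)
print(total_mass)
```

The `q` column reproduces the table above. The masses add up to about 0.95 rather than a value above 1, and the negative entry at strike 90 is the remaining sign of inconsistent last prices.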


I would recommend using OTMF (out-of-the-money-forward) options only for better results. That is, use call options for strikes $K \geq F(0,T)$ (where you see the pdf is relatively well behaved) and put options for strikes $K < F(0,T)$.
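One possible way to combine the two option types is to convert the out-of-the-money puts into call-equivalent prices through put-call parity and then run the same second difference on the stitched curve. The sketch below only illustrates the mechanics: all inputs (spot, rate, maturity, volatility, strikes) are hypothetical, the prices are generated from the Black-Scholes formula rather than taken from the market, and the forward assumes no dividends.

```r
# Black-Scholes prices, used only to generate an illustrative arbitrage-free chain
bs_price <- function(S, K, T, r, sigma, type = c("call", "put")) {
  type <- match.arg(type)
  d1 <- (log(S / K) + (r + 0.5 * sigma^2) * T) / (sigma * sqrt(T))
  d2 <- d1 - sigma * sqrt(T)
  if (type == "call") {
    S * pnorm(d1) - K * exp(-r * T) * pnorm(d2)
  } else {
    K * exp(-r * T) * pnorm(-d2) - S * pnorm(-d1)
  }
}

# Hypothetical market inputs (not from the question)
S <- 100; r <- 0.01; T <- 0.25; sigma <- 0.25
strike <- seq(80, 120, by = 5)
call <- bs_price(S, strike, T, r, sigma, "call")
put  <- bs_price(S, strike, T, r, sigma, "put")

discount <- exp(-r * T)   # P(0,T)
forward  <- S / discount  # F(0,T), assuming no dividends

# Put-call parity: C = P + P(0,T) * (F(0,T) - K)
call_from_put <- put + discount * (forward - strike)

# Quoted calls at or above the forward, put-implied calls below it
otmf_call <- ifelse(strike >= forward, call, call_from_put)

# Uniform-grid second difference on the stitched curve
dK <- strike[2] - strike[1]
q <- diff(otmf_call, differences = 2) / dK^2 / discount
print(data.frame(strike = strike[2:(length(strike) - 1)], q))
```

With model-generated prices the put-implied calls coincide with the quoted calls, so nothing changes. With market quotes the two can differ, and the out-of-the-money side is usually the more actively traded one.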


