I'm trying to study the Merton model for portfolio optimization and the document doesn't explain quite an important step: if V(t,x) = \sup_{\pi} \Bbb{E}_t\left[U(X_T^{\pi,x})\right] is the value function then, "under some regularity assumptions", it will satisfy the Hamilton-Jacobi-Bellman equation.
What are those regularity assumptions? How can we prove them?
Answer
This is an optimal control problem.
Consider a self-financing strategy \pi := (\pi_s)_{s\in[t,T]} over the horizon [t,T] which consists in, over each infinitesimal period of time [t,t+dt[, investing a fraction \pi_t of the current wealth in a risky asset S_t and placing the remaining part in the risk-free asset B_t. Given the dynamics
dS_t = S_t(\mu_t dt + \sigma_t dW_t)
dB_t = B_t r_t \, dt
and starting from an initial wealth x, the wealth at time t of an investor following the strategy \pi will be
X_t^{\pi,x} = \frac{ \pi_t X_t^{\pi,x} }{ S_t } S_t + \frac{ (1-\pi_t) X_t^{\pi,x} }{B_t} B_t
and its evolution will be governed by the following SDE
dX_t^{\pi,x} = X_t^{\pi,x} \left[ (r_t + \pi_t(\mu_t-r_t))\,dt + \pi_t \sigma_t\, dW_t \right]
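For intuition, here is a minimal Euler–Maruyama simulation of this wealth SDE for a constant-fraction strategy. This is only a sketch: all parameter values (mu, r, sigma, pi_frac, horizon, etc.) are illustrative assumptions, not from the original post.

```python
import numpy as np

# Euler-Maruyama sketch of the wealth SDE
#   dX_t = X_t [ (r + pi*(mu - r)) dt + pi*sigma dW_t ]
# Parameter values below are assumptions chosen for illustration only.
mu, r, sigma = 0.08, 0.02, 0.20   # risky drift, risk-free rate, volatility
pi_frac = 0.5                     # constant fraction of wealth in the risky asset
x0, T, n_steps, n_paths = 1.0, 1.0, 252, 10_000
dt = T / n_steps

rng = np.random.default_rng(42)
X = np.full(n_paths, x0)          # wealth paths, all starting from x0
for _ in range(n_steps):
    dW = rng.standard_normal(n_paths) * np.sqrt(dt)   # Brownian increments
    X = X * (1.0 + (r + pi_frac * (mu - r)) * dt + pi_frac * sigma * dW)

print("mean terminal wealth:", X.mean())
print("std  terminal wealth:", X.std())
```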
Consider the value function
V(t,x;(\pi_s)_{s\in[t,T]}) = \Bbb{E}_t \left[ U(X_T^{\pi,x}) \right]
where \Bbb{E}_t denotes the expectation conditional on the information available at time t (in particular X_t^{\pi,x} = x) and U is the investor's utility function.
The optimal control (\pi^*_s)_{s \in [t,T]} is the stochastic process such that
(\pi^*_s)_{s \in [t,T]} = \text{argsup}_{(\pi_s)_{s \in [t,T]}} V(t,x;(\pi_s)_{s \in [t,T]})
while the optimal cost is
V(t,x) = \Bbb{E}_t \left[ U(X_T^{\pi^*,x}) \right]
The optimal cost function solves the Hamilton-Jacobi-Bellman equation.
The proof can be obtained by viewing the control problem as a Dynamic Programming Problem and relying on Bellman's principle of optimality (see (1) below).
As @noob2 mentions, at some point the Itô differential of the optimal cost V(t,x) appears. The regularity assumptions are therefore the usual conditions under which Itô's formula can be applied: V(t,x) must be C^{1,2}, i.e. once continuously differentiable in t and twice in x, and the wealth process X_t must satisfy the standard integrability conditions for Itô integration.
Some intuition:
\begin{align}
V(t,x) &= \Bbb{E}_t \left[ U(X_T^{\pi^*,x}) \right] \\
&= \Bbb{E}_t \left[ \Bbb{E}_{t+dt} \left[ U\left(X_T^{\pi^*,\,x+dX_t(\pi^*_t)}\right) \right] \right] \\
&= \Bbb{E}_t \left[ V(t+dt,\, x+dX_t(\pi^*_t)) \right] \tag{1}\\
&= \sup_{\pi_t} \Bbb{E}_t \left[ V(t+dt,\, x+dX_t(\pi_t)) \right]\\
&= \sup_{\pi_t} \Bbb{E}_t \left[ V(t,x) + \frac{\partial V(t,x)}{\partial t}\, dt + \frac{\partial V(t,x)}{\partial x}\, dX_t + \frac{1}{2}\frac{\partial^2 V(t,x)}{\partial x^2}\, d\langle X \rangle_t \right] \\
&= V(t,x) + \frac{\partial V(t,x)}{\partial t}\, dt + \sup_{\pi_t} \left( \frac{\partial V(t,x)}{\partial x}\, x\, (r_t + \pi_t(\mu_t-r_t)) + \frac{1}{2} \frac{\partial^2 V(t,x)}{\partial x^2}\, x^2 \pi_t^2 \sigma_t^2 \right) dt
\end{align}
hence finally
\frac{\partial V(t,x)}{\partial t} + \sup_{\pi_t} \left( \frac{\partial V(t,x)}{\partial x}\, x\, (r_t + \pi_t(\mu_t-r_t)) + \frac{1}{2} \frac{\partial^2 V(t,x)}{\partial x^2}\, x^2 \pi_t^2 \sigma_t^2 \right) = 0
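As a side note (not part of the original answer), the supremum above can be computed explicitly, since the bracketed expression is quadratic in \pi_t and concave whenever \frac{\partial^2 V}{\partial x^2} < 0. Under the additional assumptions of CRRA utility U(x) = \frac{x^{1-\gamma}}{1-\gamma} and constant coefficients, this recovers the classical Merton fraction:
\begin{align}
\pi_t^* &= -\frac{\mu_t - r_t}{\sigma_t^2}\,\frac{\partial V/\partial x}{x\,\partial^2 V/\partial x^2} \\
0 &= \frac{\partial V}{\partial t} + r_t\, x\, \frac{\partial V}{\partial x} - \frac{(\mu_t - r_t)^2}{2\sigma_t^2}\,\frac{(\partial V/\partial x)^2}{\partial^2 V/\partial x^2} \\
\pi^* &= \frac{\mu - r}{\gamma \sigma^2} \quad \text{with the ansatz } V(t,x) = f(t)\,\frac{x^{1-\gamma}}{1-\gamma}
\end{align}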
[edit]
The DPP point of view consists in viewing the optimal control (\pi^*_s)_{s \in [t,T]} as the "union" of what you choose to do over [t,t+dt[ and what you do over [t+dt,T]. Informally:
(\pi^*_s)_{s \in [t,T]} = \pi^*_t \cup (\pi^*_s)_{s \in [t+dt,T]}
At this point, Bellman's optimality principle tells you that the restriction (\pi^*_s)_{s \in [t+dt,T]} of the optimal control is itself the optimal policy over the horizon [t+dt,T]. This is why in (1) you can write that
\Bbb{E}_{t+dt} \left[ U\left(X_T^{\pi^*,x+dX_t(\pi^*_t)}\right) \right] = V(t+dt,x+dX_t(\pi^*_t))
with V the optimal cost (and not simply the value function).
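To connect the HJB equation back to the original problem, here is a small Monte Carlo sketch (my addition, with illustrative parameter values) that compares E[U(X_T)] across a few constant-fraction strategies and checks that the Merton fraction (\mu - r)/(\gamma\sigma^2) obtained from the HJB equation comes out best. It uses the fact that, for a constant \pi, the wealth SDE above has a lognormal solution.

```python
import numpy as np

# Illustrative parameters (assumptions, not from the post):
mu, r, sigma, gamma = 0.08, 0.02, 0.20, 2.0   # drift, risk-free rate, vol, CRRA risk aversion
x0, T, n_paths = 1.0, 1.0, 500_000

rng = np.random.default_rng(0)
Z = rng.standard_normal(n_paths)              # common random numbers across strategies

def expected_utility(pi):
    """Monte Carlo estimate of E[U(X_T)] for a constant-fraction strategy pi.
    With pi constant, X_T = x0 * exp((r + pi*(mu-r) - 0.5*pi^2*sigma^2)*T + pi*sigma*sqrt(T)*Z)."""
    drift = (r + pi * (mu - r) - 0.5 * pi**2 * sigma**2) * T
    X_T = x0 * np.exp(drift + pi * sigma * np.sqrt(T) * Z)
    return np.mean(X_T**(1.0 - gamma) / (1.0 - gamma))   # CRRA utility

merton = (mu - r) / (gamma * sigma**2)        # candidate optimum from the HJB equation
for pi in [0.25, 0.5, merton, 1.0, 1.25]:
    print(f"pi = {pi:.2f}  E[U(X_T)] ~ {expected_utility(pi):.5f}")
```

With these assumed numbers the Merton fraction is 0.75, and the printed expected utilities should peak at that value.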