Tuesday, April 11, 2017

portfolio optimization - Hamilton-Jacobi-Bellman equation in Merton Model


I'm trying to study the Merton model for portfolio optimization, and the document doesn't explain a rather important step: if
$$V(t,x) = \sup\big\{\, \mathbb{E}\left[U(X_T(\phi)) \mid X_t = x\right] \;\big|\; \phi \text{ an admissible trading strategy} \,\big\}$$
is the value function, then "under some regularity assumptions" it will satisfy the Hamilton-Jacobi-Bellman equation.


What are those regularity assumptions? How can we prove them?



Answer



This is an optimal control problem.


Consider a self-financing strategy $\pi := (\pi_s)_{s \in [t,T]}$ over the horizon $[t,T]$, which consists in, over each infinitesimal period of time $[t, t+dt[$, investing a fraction $\pi_t$ of the current wealth in a risky asset $S_t$ and placing the remaining part in the risk-free asset $B_t$. Given the following dynamics
$$dS_t = S_t\left(\mu_t\, dt + \sigma_t\, dW_t\right)$$
$$dB_t = B_t\, r_t\, dt$$
and starting from an initial wealth $x$, the wealth at time $t$ of an investor following the strategy $\pi$ will be
$$X_t^{\pi,x} = \frac{\pi_t X_t^{\pi,x}}{S_t}\, S_t + \frac{(1-\pi_t) X_t^{\pi,x}}{B_t}\, B_t$$
and its evolution will be governed by the following SDE
$$dX_t^{\pi,x} = X_t^{\pi,x}\left[\left(r_t + \pi_t(\mu_t - r_t)\right) dt + \pi_t \sigma_t\, dW_t\right]$$
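To make these wealth dynamics concrete, here is a minimal Python sketch that simulates the SDE by Euler-Maruyama for a constant fraction $\pi$ and constant coefficients; the function name and all parameter values are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def simulate_wealth(x0, pi, mu, sigma, r, T, n_steps, n_paths, seed=0):
    """Euler-Maruyama simulation of dX = X[(r + pi*(mu - r)) dt + pi*sigma dW]."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X = np.full(n_paths, x0, dtype=float)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        # one Euler step of the wealth SDE
        X *= 1.0 + (r + pi * (mu - r)) * dt + pi * sigma * dW
    return X

# Illustrative parameters (assumed for the example)
X_T = simulate_wealth(x0=1.0, pi=0.6, mu=0.08, sigma=0.2, r=0.02,
                      T=1.0, n_steps=252, n_paths=100_000)
print("mean terminal wealth:", X_T.mean(), " std:", X_T.std())
```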



Consider the value function
$$V\big(t, x; (\pi_s)_{s\in[t,T]}\big) = \mathbb{E}_t\left[U\big(X_T^{\pi,x}\big)\right]$$


The optimal control $\pi_t^*$ is the stochastic process such that
$$(\pi_s^*)_{s\in[t,T]} = \underset{(\pi_s)_{s\in[t,T]}}{\operatorname{arg\,sup}}\; V\big(t, x; (\pi_s)_{s\in[t,T]}\big)$$


while the optimal cost is
$$V^*(t,x) = \mathbb{E}_t\left[U\big(X_T^{\pi^*,x}\big)\right]$$
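To make the distinction between the value function of a given strategy and the optimal cost tangible, here is a small Monte Carlo sketch. It assumes a CRRA utility $U(x)=x^{1-\gamma}/(1-\gamma)$, constant market parameters and constant-fraction strategies; all of these choices are illustrative assumptions of mine, not part of the original question.

```python
import numpy as np

def crra_utility(x, gamma=3.0):
    # CRRA utility; gamma is an illustrative risk-aversion parameter
    return x ** (1.0 - gamma) / (1.0 - gamma)

def value_of_strategy(pi, x0=1.0, mu=0.08, sigma=0.2, r=0.02,
                      T=1.0, gamma=3.0, n_paths=200_000, seed=1):
    """Monte Carlo estimate of V(0, x0; pi) = E[U(X_T^{pi,x0})] for a constant fraction pi."""
    rng = np.random.default_rng(seed)
    # For constant coefficients the wealth SDE integrates in closed form (geometric Brownian motion)
    W_T = rng.normal(0.0, np.sqrt(T), size=n_paths)
    drift = (r + pi * (mu - r) - 0.5 * pi**2 * sigma**2) * T
    X_T = x0 * np.exp(drift + pi * sigma * W_T)
    return crra_utility(X_T, gamma).mean()

# Scan candidate fractions; the best one should sit near the Merton fraction (mu - r) / (gamma * sigma^2)
grid = np.linspace(0.0, 1.0, 21)
values = [value_of_strategy(pi) for pi in grid]
print("best pi on grid:", grid[int(np.argmax(values))])
print("Merton fraction:", (0.08 - 0.02) / (3.0 * 0.2**2))
```

The same random numbers are reused for every candidate $\pi$, so the comparison across strategies is low-variance even with a modest number of paths.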


The optimal cost function $V^*(t,x)$ solves the Hamilton-Jacobi-Bellman equation.


The proof can be obtained by viewing the control problem as a Dynamic Programming Problem and relying on Bellman's principle of optimality (see (1) below).


As @noob2 mentions, at some point the Itô differential of the optimal cost $V^*(t,x)$ appears. The regularity conditions are therefore the usual conditions for applying Itô's lemma, both to $X_t$ and to $V^*(t,x)$ (in particular, $V^*$ needs to be $C^{1,2}$ in $(t,x)$).


Some intuition:
$$\begin{aligned}
V^*(t,x) &= \mathbb{E}_t\left[U\big(X_T^{\pi^*,x}\big)\right] \\
&= \mathbb{E}_t\left[\mathbb{E}_{t+dt}\left[U\big(X_T^{\pi^*,\, x + dX_t(\pi_t^*)}\big)\right]\right] \\
&= \mathbb{E}_t\left[V^*\big(t+dt,\, x + dX_t(\pi_t^*)\big)\right] \quad (1)\\
&= \sup_{\pi_t}\, \mathbb{E}_t\left[V^*\big(t+dt,\, x + dX_t(\pi_t)\big)\right] \\
&= \sup_{\pi_t}\, \mathbb{E}_t\left[V^*(t,x) + \frac{\partial V^*(t,x)}{\partial t}\, dt + \frac{\partial V^*(t,x)}{\partial x}\, dX_t + \frac{1}{2}\frac{\partial^2 V^*(t,x)}{\partial x^2}\, (dX_t)^2\right] \\
&= V^*(t,x) + \frac{\partial V^*(t,x)}{\partial t}\, dt + \sup_{\pi_t}\left(\frac{\partial V^*(t,x)}{\partial x}\, x\big(r_t + \pi_t(\mu_t - r_t)\big) + \frac{1}{2}\frac{\partial^2 V^*(t,x)}{\partial x^2}\, x^2 \pi_t^2 \sigma_t^2\right) dt
\end{aligned}$$

hence finally
$$\frac{\partial V^*(t,x)}{\partial t} + \sup_{\pi_t}\left(\frac{\partial V^*(t,x)}{\partial x}\, x\big(r_t + \pi_t(\mu_t - r_t)\big) + \frac{1}{2}\frac{\partial^2 V^*(t,x)}{\partial x^2}\, x^2 \pi_t^2 \sigma_t^2\right) = 0$$
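The expression inside the supremum is a concave quadratic in $\pi_t$, so the pointwise maximizer is $\pi_t^* = -\frac{\partial_x V^*\,(\mu_t - r_t)}{x\,\partial_{xx} V^*\,\sigma_t^2}$. As a sanity check, the sketch below plugs in a CRRA ansatz $V^*(t,x) = f(t)\, x^{1-\gamma}/(1-\gamma)$ with illustrative parameter values (both the ansatz and the numbers are my assumptions) and confirms that a numerical maximization of the bracket reproduces the classical Merton fraction $(\mu - r)/(\gamma\sigma^2)$.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative market parameters and CRRA risk aversion (assumptions, not from the post)
mu, r, sigma, gamma, x = 0.08, 0.02, 0.2, 3.0, 1.0

# CRRA ansatz V*(t, x) = f(t) * x**(1 - gamma) / (1 - gamma), f(t) > 0:
V_x = x ** (-gamma)                # dV*/dx,   up to the positive factor f(t)
V_xx = -gamma * x ** (-gamma - 1)  # d2V*/dx2, up to the same factor

def hjb_bracket(pi):
    # The expression inside sup_{pi} in the HJB equation
    return V_x * x * (r + pi * (mu - r)) + 0.5 * V_xx * x**2 * pi**2 * sigma**2

res = minimize_scalar(lambda pi: -hjb_bracket(pi))       # numerical maximization
pi_foc = -V_x * (mu - r) / (x * V_xx * sigma**2)         # first-order condition
print("numerical maximizer           :", res.x)
print("first-order condition         :", pi_foc)
print("Merton fraction (mu-r)/(g*s^2):", (mu - r) / (gamma * sigma**2))
```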


[edit]


The DPP point of view consists in viewing the optimal control $(\pi_s^*)_{s\in[t,T]}$ as the "union" of what you choose to do over $[t, t+dt[$ and what you do over $[t+dt, T[$. Informally:
$$(\pi_s^*)_{s\in[t,T]} = \pi_t^* \cup (\pi_s^*)_{s\in[t+dt,T]}$$


At this point, Bellman's optimality principle tells you that the restriction $(\pi_s^*)_{s\in[t+dt,T]}$ of the optimal control is itself the optimal policy over the horizon $[t+dt, T[$. This is why in (1) you can write
$$\mathbb{E}_{t+dt}\left[U\big(X_T^{\pi^*,\, x + dX_t(\pi_t)}\big)\right] = V^*\big(t+dt,\, x + dX_t(\pi_t)\big)$$
with $V^*$ the optimal cost (and not simply the value function).
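To see Bellman's principle in action, here is a sketch of the discrete-time analogue: a backward-induction (dynamic programming) recursion on a binomial approximation of the wealth process. It again assumes CRRA utility, constant coefficients and illustrative parameter values; by homogeneity $V(t,x) = v(t)\,U(x)$, so the recursion only involves the scalar $v(t)$, and the argmax over today's fraction recovers the Merton fraction.

```python
import numpy as np

# Discrete-time dynamic programming sketch of Bellman's principle for
# CRRA utility U(x) = x**(1 - gamma) / (1 - gamma). All parameters are illustrative assumptions.
mu, r, sigma, gamma = 0.08, 0.02, 0.2, 3.0
T, n_steps = 1.0, 50
dt = T / n_steps
pi_grid = np.linspace(0.0, 1.0, 401)

# Binomial one-step wealth growth factors G(pi) for shocks eps = +/-1 with probability 1/2
eps = np.array([1.0, -1.0])
G = 1.0 + (r + pi_grid[:, None] * (mu - r)) * dt + pi_grid[:, None] * sigma * np.sqrt(dt) * eps

# By homogeneity V(t, x) = v(t) * U(x) with v(T) = 1, so Bellman's recursion
#   V(t, x) = sup_pi E[ V(t + dt, x * G(pi)) ]
# reduces to a scalar recursion on v(t). For gamma > 1, U < 0, so the sup over pi
# is attained where E[G(pi)**(1 - gamma)] is smallest; the optimizer is the same at every date.
step_value = np.mean(G ** (1.0 - gamma), axis=1)
best = np.argmin(step_value)
v = 1.0
for _ in range(n_steps):
    v = step_value[best] * v   # backward induction from T down to 0

print("v(0):", v)
print("DPP optimal fraction:", pi_grid[best])
print("Merton fraction     :", (mu - r) / (gamma * sigma**2))
```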



