Wednesday, October 7, 2015

optimization - Min VaR and Min TE as second order cone program


The quadratic optimization (min variance) $$ w^{T} \Sigma w \rightarrow \text{min}, $$ where $w$ is the vector of portfolio weights and $\Sigma$ is the covariance matrix of asset returns, is a well-studied problem. In practice we have to impose certain constraints. These are either easy linear, continuous ones (like $\sum_{i=1}^n w_i = 1$ or constraints on turnover) or difficult binary constraints (if an asset is bought then with at least $x\%$, a cardinality constraint, etc.).
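As a rough illustration (not part of the original question), a minimal cvxpy sketch of this baseline min-variance problem, with made-up data and an illustrative per-asset cap, might look like this:

```python
# Minimal min-variance QP sketch in cvxpy (toy data; the cap of 0.4 is illustrative).
import cvxpy as cp
import numpy as np

np.random.seed(0)
n = 5                                    # number of assets (toy size)
A = np.random.randn(n, n)
Sigma = A @ A.T + np.eye(n)              # a positive-definite stand-in covariance matrix

w = cp.Variable(n)                       # portfolio weights
objective = cp.Minimize(cp.quad_form(w, Sigma))
constraints = [cp.sum(w) == 1,           # fully invested
               w >= 0,                   # long-only
               w <= 0.4]                 # hypothetical per-asset cap
cp.Problem(objective, constraints).solve()
print(np.round(w.value, 4))
```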


Setting $\hat{w} = w_{Pf} - w_{BM}$ as the vector of active weights of the portfolio relative to a benchmark, the above formula describes minimizing the (squared) tracking error.
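Expanding this objective also shows where a linear term like the $w^T c$ in EDIT 2 below can come from (assuming a fixed benchmark $w_{BM}$): $$ \hat{w}^{T} \Sigma \hat{w} = w_{Pf}^{T} \Sigma w_{Pf} - 2\, w_{BM}^{T} \Sigma\, w_{Pf} + w_{BM}^{T} \Sigma w_{BM}, $$ so, up to the constant last term, minimizing the squared tracking error over $w_{Pf}$ is a quadratic objective with linear part $c = -2\,\Sigma\, w_{BM}$.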



The difficult binary constraints lead to mixed-integer programming (MIP). Commercial solvers handle this with methods such as branch and bound, which become very time-consuming for large problems (e.g. 1000+ assets).


I have heard of an approach that formulates such problems as second-order cone programs (SOCP), and that such problems can usually be solved more efficiently.


I have plenty of experience with branch and bound and I would like to switch to SOCP. Do you know of any good reference on SOCP and portfolio/TE optimization with hard real-world constraints (especially with binary variables)? Do you have any experience with whether switching is worth the effort?


PS: Let us assume that $\Sigma$ is well estimated and that variance makes sense. I know that this is debatable ...


EDIT: This paper, Applications of second-order cone programming, describes the formulation of a quadratically constrained quadratic program as an SOCP. I will also have a look here.


EDIT 2: To spell out every detail, I have the mixed-integer quadratic program (I formulate the TE case): $$ w^{T} \Sigma w + w^T c \rightarrow \text{min}, $$ with $0 \le w_i \le u$ for $i = 1,\ldots, N$ with $N \approx 1000$ and a constant vector $c$. The constraints are of the form $$ w_i \ge b_i\, l \text{ for } i = 1,\ldots,N \\ \sum_{i=1}^N b_i \le K \\ b_i \in \{0,1\} \text{ for } i = 1,\ldots,N $$ with a small real $l \in (0,1)$ and some large integer $K$. Moreover I have lots of continuous constraints $$ G w \le g. $$ To be totally precise, there are additional continuous variables that appear in the constraints only (these model turnover).
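For concreteness, a cvxpy sketch of this mixed-integer QP with toy data and sizes (the real problem has $N \approx 1000$, and the boolean variables require a MIQP-capable solver such as Gurobi, CPLEX or SCIP) could look like this:

```python
# Sketch of the mixed-integer QP above (toy data; values of N, K, l, u are made up).
import cvxpy as cp
import numpy as np

np.random.seed(1)
N, K, l, u = 20, 5, 0.02, 0.3            # toy values, not from the question
A = np.random.randn(N, N)
Sigma = A @ A.T + np.eye(N)              # stand-in covariance matrix
c = 0.01 * np.random.randn(N)            # stand-in linear term

w = cp.Variable(N)                       # continuous weights
b = cp.Variable(N, boolean=True)         # buy/no-buy indicators

constraints = [
    w >= 0, w <= u,                      # 0 <= w_i <= u
    w >= l * b,                          # if b_i = 1 then w_i >= l
    w <= u * b,                          # not stated explicitly in the question:
                                         # forces w_i = 0 when b_i = 0 (cf. the answer below)
    cp.sum(b) <= K,                      # at most K positions
    cp.sum(w) == 1,                      # example of a continuous constraint G w <= g
]
prob = cp.Problem(cp.Minimize(cp.quad_form(w, Sigma) + c @ w), constraints)
prob.solve(solver=cp.GUROBI)             # or cp.CPLEX / cp.SCIP, any MIQP-capable solver
print(np.round(w.value, 3))
```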


If I follow the link above, then this can be formulated as a mixed-integer SOCP: $$ t \rightarrow \text{min}, $$ subject to all of the above constraints plus one additional cone constraint $$ \left\| \Sigma^{1/2} w + \tfrac{1}{2}\Sigma^{-1/2}c \right\| \le t $$ (the square of the left-hand side equals $w^T \Sigma w + w^T c$ up to a constant).
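To see what this epigraph form looks like in code, here is a small continuous (no binaries) cvxpy sketch of the cone constraint with toy data; cvxpy performs such reformulations internally anyway, so this is only for illustration:

```python
# Epigraph / SOC form: minimize t s.t. || Sigma^{1/2} w + (1/2) Sigma^{-1/2} c || <= t.
import cvxpy as cp
import numpy as np
import scipy.linalg as sla

np.random.seed(2)
N = 20
A = np.random.randn(N, N)
Sigma = A @ A.T + np.eye(N)               # stand-in covariance matrix
c = 0.01 * np.random.randn(N)             # stand-in linear term

S_half = sla.sqrtm(Sigma).real            # Sigma^{1/2}
S_half_inv = np.linalg.inv(S_half)        # Sigma^{-1/2}

w = cp.Variable(N)
t = cp.Variable()
cons = [cp.sum(w) == 1, w >= 0,
        cp.norm(S_half @ w + 0.5 * S_half_inv @ c) <= t]
cp.Problem(cp.Minimize(t), cons).solve()
print(np.round(w.value, 4))
```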


Is there a chance that the mixed-integer SOCP problem as formulated above can be solved more efficiently than the mixed-integer quadratic program (with linear and binary but no quadratic constraints)?



Answer



Without the discrete constraints, the minimum tracking error/variance problem is a quadratic program. If you constrain the tracking error, you have a convex quadratically-constrained problem, which modern commercial solvers handle as an SOCP. SOCP does not address discrete constraints like cardinality of assets or minimum investment levels; SOCP deals with continuous convex constraints, while the discrete constraints make the feasible region non-convex. As an example, a portfolio of 50% A and 50% B has just 2 assets, and a portfolio of 50% B and 50% C has 2 assets, but any strict convex combination of those two portfolios will have 3 assets.



Linear vs. SOCP and continuous vs. mixed-integer are orthogonal concepts. Mixed-integer solvers usually require search trees; the linear-vs.-SOCP distinction determines which subproblem is solved at each node of the search tree. With the commercial solvers CPLEX and Gurobi you can have continuous linear, mixed-integer linear, continuous quadratic/SOCP, and mixed-integer quadratic/SOCP problems. There is a video by Gurobi on SOCP that has an aside on the futility of trying to coin names for all the combinations.


A good commercial solver, combining sophisticated heuristics and branch-and-cut algorithms, can reliably solve problems like the one you describe with thousands of assets in a reasonable amount of time (seconds to a few minutes). The actual performance you observe depends heavily on how you formulate the constraints and on the quality of the solver. There are many ways to improve a formulation. For example, the cardinality constraints are usually modeled with indicator variables $$ y_i = \left\{\begin{array}{ll} 1 & \mbox{if $w_i > 0$} \\ 0 & \mbox{otherwise} \end{array}\right. $$


which are enforced with the constraints \begin{eqnarray} w_i &\le y_i &\hspace{1in} \forall i \in \{1, \ldots, n\} \\ \sum_{i = 1}^n y_i &\le k & \end{eqnarray} where $k$ is the cardinality limit. The constraint $w_i \le y_i$ is very weak, however. If you know an a-priori upper bound on the $w_i$, say $u_i$, then you can add the constraint $$w_i \le u_i y_i$$ to the model. If $u_i$ is much smaller than $1$, the continuous relaxation at the nodes will be tighter, which helps prune the branching tree and guide the heuristics. You can get an upper bound on the $w_i$ either from an explicit constraint (a hard limit on the weight of any single asset) or by deriving it from the problem.
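As a concrete (hypothetical) illustration, in a modeling language the weak and the tightened coupling differ only in the coefficient on the indicator:

```python
# Cardinality constraint with weak vs. tightened indicator coupling (sketch;
# u is a hypothetical vector of per-asset upper bounds, k the cardinality cap).
import cvxpy as cp
import numpy as np

np.random.seed(3)
n, k = 20, 5
A = np.random.randn(n, n)
Sigma = A @ A.T + np.eye(n)               # stand-in covariance matrix
u = np.full(n, 0.25)                      # assumed a-priori upper bounds on w_i

w = cp.Variable(n)
y = cp.Variable(n, boolean=True)

weak  = w <= y                            # always valid, but a loose relaxation
tight = w <= cp.multiply(u, y)            # much tighter when u_i << 1

constraints = [w >= 0, cp.sum(w) == 1, cp.sum(y) <= k, tight]
prob = cp.Problem(cp.Minimize(cp.quad_form(w, Sigma)), constraints)
prob.solve(solver=cp.GUROBI)              # any MIQP-capable solver
print(np.round(w.value, 3))
```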


If you are actually trying to explicitly minimize $w^{T} \Sigma w$ for thousands of assets, that means $\Sigma$ has over 1,000,000 entries and you have explicitly estimated the covariance of every pair of assets. This is both difficult or impossible to do reliably and inefficient from a computational standpoint. It would be better to use a factor model, where the return on asset $i$ is given by $r_i = \epsilon_i + \sum_j \beta_{ij} r_j$, with $j$ indexing the factors and $\sigma_i^2$ denoting the variance of $\epsilon_i$. If $\Sigma^f$ is the variance/covariance matrix of the factor returns, then you have the objective $$ \mbox{minimize} \; f^T \Sigma^f f + \sum_i \sigma_i^2 w_i^2 $$ with the additional constraints $$ \sum_i \beta_{ij} w_i = f_j \hspace{0.5in} \forall j. $$ The continuous relaxation is much smaller, so the node subproblems solve much faster. The $\sigma_i$ can also be used to compute upper bounds on the $w_i$, since the idiosyncratic variance is the part that can't be hedged by other assets.
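A hedged sketch of that factor-model formulation in cvxpy, with made-up dimensions and loadings (here `beta` is an $n \times m$ loading matrix, `Sigma_f` the factor covariance, and `sigma` the idiosyncratic volatilities; none of these come from the original post):

```python
# Factor-model risk: minimize f' Sigma_f f + sum_i sigma_i^2 w_i^2,
# with f_j = sum_i beta_ij w_i introduced as auxiliary variables (toy data).
import cvxpy as cp
import numpy as np

np.random.seed(4)
n, m = 1000, 20                           # assets and factors (illustrative)
beta = 0.1 * np.random.randn(n, m)        # hypothetical factor loadings
B = np.random.randn(m, m)
Sigma_f = B @ B.T + np.eye(m)             # factor covariance
sigma = 0.1 + 0.2 * np.random.rand(n)     # idiosyncratic vols

w = cp.Variable(n)                        # asset weights
f = cp.Variable(m)                        # portfolio factor exposures

risk = cp.quad_form(f, Sigma_f) + cp.sum_squares(cp.multiply(sigma, w))
constraints = [f == beta.T @ w,           # f_j = sum_i beta_ij w_i
               cp.sum(w) == 1, w >= 0]
cp.Problem(cp.Minimize(risk), constraints).solve()
print(np.round(f.value, 3))
```

The quadratic part of the objective then involves only the $m \times m$ factor covariance plus a diagonal term, rather than a dense $n \times n$ matrix.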

