Friday, October 2, 2015

Equity Risk Model Using PCA


I'm trying to build a simple risk model for stocks using PCA. I've noticed that when the number of dimensions exceeds the number of observations (for example, 1000 stocks but only 250 days of returns), the transformed return series (the returns rotated by the eigenvectors, i.e. the factor returns) have non-zero correlations with each other.


Intuitively, I can see why this might be, since in the PCA process I am estimating a 1000x1000 covariance matrix from a 250x1000 matrix of observations, so it is like an underdetermined system. But I'm not exactly sure what's going on. Can someone explain what is happening?


Also, for risk model purposes, is it better to assume a diagonal covariance matrix or use the sample covariance of the factors?



Here is some MATLAB code to demonstrate the problem:


% More observations than dimensions
Nstock = 10;
Nobs = 11;
obs = randn(Nobs, Nstock);
rot = princomp(obs);
rotobs = obs * rot;
corr(rotobs) % off diagonals are all zero

% More dimensions than observations

Nstock = 10;
Nobs = 9;
obs = randn(Nobs, Nstock);
rot = princomp(obs);
rotobs = obs * rot;
corr(rotobs) % some off diagonals are non-zero

Answer



Regarding the second part of your question: you are running into the classic N > T problem (N = number of assets, T = number of observations). The number of parameters you must estimate grows quadratically with N, while the amount of data grows only linearly with each additional day of observations. Because the covariance matrix is symmetric, you must estimate its upper triangle including the diagonal, which is N*(N+1)/2 distinct entries, from only T observations of each asset. When N > T, the sample covariance matrix has rank at most T-1, so it is singular.
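The counting argument can be checked on the small example from the question. This is a minimal sketch rather than the asker's exact code: it uses only base MATLAB functions (eig instead of princomp), and the fixed seed and the names nParams, covRank, and factorVar are illustrative assumptions.

```matlab
% Sketch: rank deficiency of the sample covariance when Nstock > Nobs.
rng(0);                                  % arbitrary seed, for reproducibility only
Nstock = 10;
Nobs = 9;
obs = randn(Nobs, Nstock);

S = cov(obs);                            % Nstock x Nstock sample covariance
nParams = Nstock * (Nstock + 1) / 2;     % distinct entries of a symmetric matrix
covRank = rank(S);                       % at most Nobs - 1 because of demeaning

% Eigen-decomposition of the (symmetrized) covariance, largest eigenvalue first.
[V, D] = eig((S + S') / 2);
[lambda, idx] = sort(diag(D), 'descend');
V = V(:, idx);

% Rotate the demeaned returns; the variances of the rotated series match lambda.
centered = obs - repmat(mean(obs, 1), Nobs, 1);
factorVar = var(centered * V);

fprintf('parameters: %d, data points: %d, rank: %d\n', ...
        nParams, Nobs * Nstock, covRank);
disp(factorVar);                         % trailing entries are zero up to rounding
```

The trailing rotated series have essentially zero variance, so any correlation computed with them is a ratio of rounding errors. That is one likely reason the off-diagonal entries in the second example of the question look non-zero.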


A better approach is a shrinkage estimator, which pulls the sample covariance matrix toward a structured target, for example one that assumes constant correlation or constant covariance across securities. The out-of-sample performance of this approach is strong. Consider blending the diagonal covariance matrix with the sample covariance matrix; see Ledoit and Wolf's paper "Honey, I Shrunk the Sample Covariance Matrix".
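As a rough illustration of the blending idea, the sketch below shrinks the sample covariance toward its own diagonal. The shrinkage intensity delta is a hypothetical fixed value chosen for illustration. It is not the optimal intensity derived by Ledoit and Wolf, which is estimated from the data, and the names F, delta, and Sigma are assumptions of this example.

```matlab
% Sketch: blend the sample covariance with a diagonal target.
rng(0);                                  % arbitrary seed, for reproducibility only
Nstock = 10;
Nobs = 9;
obs = randn(Nobs, Nstock);

S = cov(obs);                            % singular, since Nstock > Nobs - 1
F = diag(diag(S));                       % target: sample variances, zero covariances
delta = 0.5;                             % hypothetical intensity, NOT the Ledoit-Wolf optimum
Sigma = delta * F + (1 - delta) * S;     % shrunk covariance estimate

% Compare the smallest eigenvalues before and after shrinkage.
minEigSample = min(eig((S + S') / 2));
minEigShrunk = min(eig((Sigma + Sigma') / 2));
fprintf('min eigenvalue: sample %.3g, shrunk %.3g\n', minEigSample, minEigShrunk);
```

Any delta strictly between 0 and 1 leaves the variances unchanged and scales the covariances by (1 - delta). As long as every sample variance is positive, the blended matrix is positive definite and therefore invertible, which the raw sample covariance is not in this setting.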

