Tuesday, November 28, 2017

Proof for non-positive semi-definite covariance matrix estimator


It is well known that the standard estimator of the covariance matrix can lose the property of being positive-semidefinite if the number of variables (e.g. number of stocks) exceeds the number of observations (e.g. trading days). I think the matrix can become singular. I have a clear idea why (inspired by the geometry of the problem) but does anybody have a short but rigorous proof for this fact?



Answer



The standard estimator of the covariance matrix is $$\widehat{ \mathrm{cov}}(X) = \frac 1 {n-1} \sum_{i=1}^n (X_i-\bar X)(X_i-\bar X)^T,$$ where $X_i$ is the column vector containing the $i$th observation of all the variables and $\bar X$ is the sample mean. Each summand is the outer product of a vector with itself, i.e., a square matrix of rank at most one, and the rank of a sum is at most the sum of the ranks. Therefore $$\mathrm{rk\;}\widehat{\mathrm{cov}}(X) \le n$$ (in fact at most $n-1$, because the centered vectors $X_i-\bar X$ sum to zero), and the matrix not only can be but always is singular if $$n \lt \dim X,$$ i.e., if the number of observations is less than the number of variables.
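The rank argument can be checked numerically. The following is only a minimal sketch, assuming NumPy and simulated Gaussian data; the names `n_obs`, `n_vars`, `D` and `S` are illustrative and not part of the original answer.

```python
import numpy as np

# Assumed toy setup: fewer observations (5) than variables (8).
rng = np.random.default_rng(0)
n_obs, n_vars = 5, 8
X = rng.standard_normal((n_obs, n_vars))  # each row is one observation X_i^T

# Build the estimator as a sum of outer products of centered observations.
D = X - X.mean(axis=0)                    # rows are (X_i - X_bar)^T
S = D.T @ D / (n_obs - 1)                 # equals sum_i d_i d_i^T / (n - 1)

# Sanity check against NumPy's own estimator (rowvar=False: columns are variables).
matches_numpy = np.allclose(S, np.cov(X, rowvar=False))

# Numerical rank; it cannot exceed the number of rank-one summands.
rank = np.linalg.matrix_rank(S)
print("matches np.cov:", matches_numpy)
print("rank:", rank, "of a", S.shape, "matrix")
```

Since the rank stays below `n_vars`, the 8 x 8 matrix `S` cannot be inverted, which is what "singular" means here.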





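Edit: regarding positive-semidefiniteness: $\widehat{ \mathrm{cov}}(X)$ is always positive-semidefinite because it is a Gram matrix: for any vector $v$, $$v^T\,\widehat{ \mathrm{cov}}(X)\,v = \frac 1 {n-1} \sum_{i=1}^n \left((X_i-\bar X)^T v\right)^2 \ge 0,$$ and this holds even if the rank is not full. What the estimator loses is the property of being positive-definite, and it does so if and only if it is singular.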
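The distinction between positive-semidefinite and positive-definite shows up in the eigenvalues. Below is a small sketch, again assuming NumPy and simulated data; the tolerance `tol` is an arbitrary choice made for this illustration, since floating-point eigenvalues that are exactly zero in theory can come out as tiny negative numbers.

```python
import numpy as np

# Assumed toy setup: 4 observations of 6 variables.
rng = np.random.default_rng(1)
n_obs, n_vars = 4, 6
X = rng.standard_normal((n_obs, n_vars))

# Sample covariance with the 1/(n-1) normalisation (NumPy's default).
S = np.cov(X, rowvar=False)

# eigvalsh is meant for symmetric matrices and returns real eigenvalues
# in ascending order.
eigvals = np.linalg.eigvalsh(S)

# Illustrative tolerance for treating an eigenvalue as numerically zero.
tol = 1e-10 * max(eigvals.max(), 1.0)

is_psd = bool(eigvals.min() >= -tol)  # no genuinely negative eigenvalue
is_pd = bool(eigvals.min() > tol)     # all eigenvalues strictly positive

print("eigenvalues:", eigvals)
print("positive-semidefinite:", is_psd)
print("positive-definite:", is_pd)
```

With fewer observations than variables, some eigenvalues are zero up to rounding, so the matrix remains positive-semidefinite but is not positive-definite.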

