Let's assume the usual state-space linear model without constant term for simplicity:
$y_{t}=\beta_{t} X_{t}+\epsilon_{t}$
If we apply a Gaussian Kalman filter to estimate $\beta_{t}$, we get $P_{t}$, the covariance matrix of the predicted state, and $v_{t}$, the prediction error.
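For reference, here is a sketch of where these two quantities come from, under the assumption (not stated above) that the coefficient follows a random walk, $\beta_{t+1}=\beta_{t}+\eta_{t}$, with $\eta_{t}\sim N(0,Q)$ and $\epsilon_{t}\sim N(0,H)$. Writing $a_{t}$ for the prediction of $\beta_{t}$ given the data up to $t-1$ (the symbols $a_{t}$, $F_{t}$ and $K_{t}$ are introduced here only for illustration), the scalar recursions are roughly:
$v_{t}=y_{t}-a_{t} X_{t}$
$F_{t}=X_{t}^{2} P_{t}+H$
$K_{t}=P_{t} X_{t} / F_{t}$
$a_{t+1}=a_{t}+K_{t} v_{t}$
$P_{t+1}=P_{t}\left(1-K_{t} X_{t}\right)+Q$
So $P_{t}$ measures the uncertainty about the coefficient itself, while $F_{t}$ is the variance of $v_{t}$, which means $v_{t}/\sqrt{F_{t}}$ is on a comparable scale across series.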
The following simple R code downloads a pair of tickers (QQQ and XLK, for instance) from Yahoo Finance, estimates $P_{t}$ and $v_{t}$, and plots them:
# ======================================== #
# Kalman filter errors and states variance #
# ======================================== #
op <- par(no.readonly = TRUE)
Sys.setenv(TZ = 'UTC')
# Contents:
# 1. Installing packages
# 2. Loading packages
# 3. Downloading and plotting data
# 4. Kalman filtering of linear regression Beta
# *********************************
# 1. Installing packages
# *********************************
#install.packages('KFAS')
#install.packages('latticeExtra')
#install.packages('quantmod')
# *********************************
# 2. Loading packages
# *********************************
require(compiler)
require(latticeExtra)
require(KFAS)
require(quantmod)
# *********************************
# 3. Downloading and plotting data
# *********************************
Symbols <- c('QQQ', 'XLK')
getSymbols(Symbols, from = '1950-01-01')
data <- na.omit(merge(Cl(QQQ), Cl(XLK)))
colnames(data) <- Symbols
xyplot(data)
# *********************************
# 4. Kalman filtering of linear regression Beta
# *********************************
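# Daily close-to-close returns, computed once and aligned on common dates
returns <- na.omit(merge(ClCl(QQQ), ClCl(XLK)))
y <- as.numeric(returns[, 1])
X <- as.numeric(returns[, 2])
# regSSM() belongs to old KFAS releases; current releases build the same
# regression with a time-varying coefficient via SSModel() + SSMregression().
# H = NA and Q = NA mark the two variances to be estimated.
model <- SSModel(y ~ -1 + SSMregression(~ X, Q = NA), H = NA)
# Maximum likelihood over the two log-variances, both started at log(1) = 0
object <- fitSSM(model = model, inits = rep(0, 2))$model
# Named kfs rather than KFAS to avoid shadowing the package name
kfs <- KFS(object)
# kfs$P holds n + 1 predicted-state variances; dropping the first leaves one per date
P <- xts(as.vector(kfs$P)[-1], index(returns))
v <- xts(as.vector(kfs$v), index(returns))
Z <- cbind(P, v)
colnames(Z) <- c('Covariance of predicted state', 'Prediction error')
xyplot(tail(Z, 1000))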
Now suppose we iterate this procedure over several pairs of securities to estimate their $\beta_{t}$, $P_{t}$ and $v_{t}$, and want to sort the pairs by the stability and accuracy of $\beta_{t}$, that is, by low state variance and low prediction error.
Given $P_{t}$ and $v_{t}$, what criteria are suitable for building this ranking, i.e. how should a linear relationship be penalized for having too high a variance and too large prediction errors?
For instance, replacing QQQ and XLK in my code with VXX and TLT gives greater $P_{t}$ and $v_{t}$: the linear relationship between VXX and TLT is more volatile and has weaker predictive power than the one between QQQ and XLK.
This is similar to a ranking system and I would like to know how to produce some numeric criteria.
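To make the sorting step concrete, here is a naive baseline in base R, offered as a rough sketch rather than an established criterion. It assumes each pair's filter output has already been reduced to two numeric vectors, `P` and `v`. The helper names `summarise_pair` and `rank_pairs`, the choice of mean and root-mean-square as summaries, and the equal weighting of the two ranks are all hypothetical choices made for illustration.

```r
# Hypothetical helper: reduce one pair's filter output to two numbers.
# P: numeric vector of predicted-state variances
# v: numeric vector of prediction errors
summarise_pair <- function(P, v) {
  c(mean_P = mean(P, na.rm = TRUE),
    rmse_v = sqrt(mean(v^2, na.rm = TRUE)))
}

# results: named list with one element per pair, each a list(P = ..., v = ...)
rank_pairs <- function(results) {
  stats <- t(sapply(results, function(r) summarise_pair(r$P, r$v)))
  stats <- as.data.frame(stats)
  # Rank each criterion separately (1 = smallest), then average the two ranks.
  # Equal weighting is an arbitrary assumption of this sketch.
  stats$score <- (rank(stats$mean_P) + rank(stats$rmse_v)) / 2
  stats[order(stats$score), ]
}

# Toy input with made-up numbers, only to show the expected shape of the data
set.seed(1)
results <- list(
  "QQQ/XLK" = list(P = rep(0.01, 100), v = rnorm(100, sd = 0.005)),
  "VXX/TLT" = list(P = rep(0.50, 100), v = rnorm(100, sd = 0.050))
)
ranking <- rank_pairs(results)
print(ranking)
```

Raw prediction errors scale with the volatility of the returns themselves, so a pair of volatile securities is penalized even if the relationship is stable. Dividing $v_{t}$ by the square root of its variance (which KFAS returns as `F`) before summarising may make pairs more comparable.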