Tuesday, October 22, 2019

beta - Good criteria to sort state-space $\beta_{t}$ according to Kalman filter output


Let's assume the usual linear state-space model, without a constant term for simplicity:


$y_{t}=\beta_{t} X_{t}+\epsilon_{t}$


If we apply a Gaussian Kalman filter to estimate $\beta_{t}$, we get $P_{t}$, the covariance matrices of the predicted states, and $v_{t}$, the prediction errors.
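For reference, the filter recursions that produce $P_{t}$ and $v_{t}$ can be written out for the scalar case. This is only a sketch under assumptions not stated above: the coefficient is taken to follow a random walk (which is what `Q = NA` in the code below suggests), there is a single regressor, and the symbols $b_{t}$ (predicted state) and $F_{t}$ (prediction error variance) are notation introduced here for illustration. $H$ and $Q$ are the observation and state noise variances, named after the corresponding arguments in the code.

```latex
\begin{aligned}
y_{t} &= \beta_{t} X_{t} + \epsilon_{t}, & \epsilon_{t} &\sim N(0, H) \\
\beta_{t+1} &= \beta_{t} + \eta_{t}, & \eta_{t} &\sim N(0, Q) \\
b_{t} &= E[\beta_{t} \mid y_{1}, \dots, y_{t-1}], & P_{t} &= \mathrm{Var}[\beta_{t} \mid y_{1}, \dots, y_{t-1}] \\
v_{t} &= y_{t} - X_{t} b_{t}, & F_{t} &= X_{t}^{2} P_{t} + H \\
b_{t+1} &= b_{t} + \frac{P_{t} X_{t}}{F_{t}} v_{t}, & P_{t+1} &= P_{t} - \frac{P_{t}^{2} X_{t}^{2}}{F_{t}} + Q
\end{aligned}
```

Under these assumptions, a small $P_{t}$ means the filter is confident about the current $\beta_{t}$, and a small $v_{t}$ means the previous estimate predicted $y_{t}$ well.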



The following simple R code downloads a pair of tickers (QQQ and XLK, for instance) from Yahoo Finance, estimates $P_{t}$ and $v_{t}$, and plots them:


# ======================================== #
# Kalman filter errors and states variance #
# ======================================== #

op <- par(no.readonly = TRUE)
Sys.setenv(TZ = 'UTC')

# Contents:


# 1. Installing packages
# 2. Loading packages
# 3. Downloading and plotting data
# 4. Kalman filtering of linear regression Beta

# *********************************
# 1. Installing packages
# *********************************

#install.packages('KFAS')

#install.packages('latticeExtra')
#install.packages('quantmod')

# *********************************
# 2. Loading packages
# *********************************

require(compiler)
require(latticeExtra)
require(KFAS)

require(quantmod)

# *********************************
# 3. Downloading and plotting data
# *********************************

Symbols <- c('QQQ', 'XLK')
getSymbols(Symbols, from = '1950-01-01')
data <- na.omit(merge(Cl(QQQ), Cl(XLK)))
colnames(data) <- Symbols

xyplot(data)

# *********************************
# 4. Kalman filtering of linear regression Beta
# *********************************

y <- na.omit(merge(ClCl(QQQ), ClCl(XLK)))[,1]
X <- na.omit(merge(ClCl(QQQ), ClCl(XLK)))[,2]
model <- regSSM(y = y, X = X, H = NA, Q = NA)
object <- fitSSM(inits = rep(0, 2), model = model)$model

KFAS <- KFS(object = object)
P <- xts(as.vector(KFAS$P)[-1], index(y))
v <- xts(t(KFAS$v), index(y))
Z <- cbind(P, v)
colnames(Z) <- c('Covariance of predicted state', 'Prediction error')
xyplot(tail(Z, 1000))

Now suppose you iterate this procedure over several pairs of securities to estimate their $\beta_{t}$, $P_{t}$ and $v_{t}$, and you want to sort these pairs by the stability and accuracy of $\beta_{t}$, that is, by low variance and low prediction error.
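To make the iteration concrete, here is a minimal base-R sketch of the bookkeeping involved. It is not the KFAS code above: `kalman_beta` is a hypothetical hand-written scalar filter, the two "pairs" are simulated rather than downloaded, $H$ and $Q$ are assumed known instead of estimated, and the two summaries (`mean_P`, `rmse_v`) are placeholders chosen only to show how per-pair outputs could be tabulated and ordered, not a recommended criterion.

```r
# Scalar Kalman filter for y_t = beta_t * x_t + eps_t, beta_{t+1} = beta_t + eta_t.
# Assumption: H (observation variance) and Q (state variance) are known.
kalman_beta <- function(y, x, H, Q, a1 = 0, P1 = 1e7) {
  n <- length(y)
  P <- numeric(n)  # predicted state variances P_t
  v <- numeric(n)  # one-step prediction errors v_t
  a <- a1          # predicted state
  p <- P1          # predicted state variance (large value = vague prior)
  for (i in seq_len(n)) {
    P[i] <- p
    v[i] <- y[i] - x[i] * a
    Fv <- x[i]^2 * p + H         # prediction error variance
    K <- p * x[i] / Fv           # Kalman gain
    a <- a + K * v[i]            # next predicted state
    p <- p * (1 - K * x[i]) + Q  # next predicted variance
  }
  list(P = P, v = v)
}

# Simulated stand-ins for two pairs of return series (hypothetical data).
set.seed(1)
simulate_pair <- function(n, H, Q) {
  x <- rnorm(n)
  beta <- 1 + cumsum(rnorm(n, sd = sqrt(Q)))
  y <- beta * x + rnorm(n, sd = sqrt(H))
  list(y = y, x = x, H = H, Q = Q)
}
pair_data <- list(
  stable = simulate_pair(500, H = 0.01, Q = 1e-4),
  noisy  = simulate_pair(500, H = 1.00, Q = 1e-2)
)

# Run the filter on every pair and collect one row of summaries per pair,
# dropping an initial burn-in so the vague prior does not dominate.
burn <- 50
summary_table <- do.call(rbind, lapply(names(pair_data), function(nm) {
  d <- pair_data[[nm]]
  out <- kalman_beta(d$y, d$x, d$H, d$Q)
  keep <- -seq_len(burn)
  data.frame(pair = nm,
             mean_P = mean(out$P[keep]),
             rmse_v = sqrt(mean(out$v[keep]^2)))
}))

# Order by the placeholder summaries: lowest state variance first.
ranked <- summary_table[order(summary_table$mean_P, summary_table$rmse_v), ]
print(ranked)
```

With real data, the body of the `lapply` call would instead run the KFAS steps from section 4 of the script for each pair and extract `P` and `v` in the same way.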


I would like to know suitable criteria for building this ranking from the available $P_{t}$ and $v_{t}$, i.e. how should a linear relationship be penalized for high state variance and large prediction errors?


For instance:



Replacing QQQ and XLK in my code with VXX and TLT, you will see greater $P_{t}$ and $v_{t}$: the linear relationship between VXX and TLT is more volatile and has weaker predictive power than the one between QQQ and XLK.


This is similar to a ranking system and I would like to know how to produce some numeric criteria.



