A common way to select order parameters (e.g. the number of AR terms to include in the model) in time series modelling is to rely on an information criterion (AIC, BIC, Hannan-Quinn, ...) to measure the relative quality of the models: let's call it Rule A.
Then, in a second step, misspecification tests are performed (Ljung-Box test, Engle's ARCH test, ...).
However, the methodology is not clear to me when I need to choose a model for a series that has autocorrelation in both the mean and the variance process:
I noticed that the model selected (using Rule A) is not always the same depending on whether:
- I use a "two-step method": first, I select the order parameters of the mean process using Rule A; secondly, keeping the parameters obtained in the first step, I use Rule A again to select the parameters of the variance process.
Example: I fit all ARMA(p,q) models to the series with (p,q) = 0:2 and select the most parsimonious one. Let's say the best model is p=1 and q=2. Second step: I fit all ARMA(1,2)-GARCH(s,t) models to the series with (s,t) = 0:2 and select the "best" (s,t) parameters using Rule A again. If we let p,q range over 0:4 and s,t over 0:2, there are $5^2 + 3^2$ models to estimate.
- OR "direct" modelling: I fit the full ARMA(p,q)-GARCH(s,t) directly to the time series and select the best model (p,q,s,t) using Rule A. However, in this case the number of combinations (the number of models to fit) can be very high: if we let p,q range over 0:4 and s,t over 0:2, there are $5^2 \times 3^2$ candidate models (it takes time and CPU...).
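The cost gap between the two approaches is easy to quantify. A minimal sketch of the counting (function names are mine, not from any library):

```python
# Count candidate models for the two selection strategies described above.
# Assumptions: p, q range over 0..4 and s, t over 0..2, as in the example.

def n_two_step(pq_max=4, st_max=2):
    """Two-step: first all ARMA(p,q), then all GARCH(s,t) on top of the winner."""
    return (pq_max + 1) ** 2 + (st_max + 1) ** 2

def n_direct(pq_max=4, st_max=2):
    """Direct: every ARMA(p,q)-GARCH(s,t) combination is estimated."""
    return (pq_max + 1) ** 2 * (st_max + 1) ** 2

print(n_two_step())  # 5^2 + 3^2 = 34
print(n_direct())    # 5^2 * 3^2 = 225
```

The direct search is additive-versus-multiplicative: here 34 fits against 225, and the gap widens quickly as the order ranges grow.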
Obviously the direct method's candidate set includes the model selected by the two-step method, so it may give the most significant results. I say "may" because it is possible that the model selected by the direct method does not pass the misspecification tests...
My question is: how can I deal with this cost/efficiency trade-off? How should I proceed?
Answer
I will try to give a simple technique for identifying the $ARIMA(p,d,q)$ orders of a time series. It is an empirical technique, but the results are very close to those of techniques based on the $AIC$ or $BIC$ criteria.
- Identifying the integration order $d$:
It is the first parameter to determine; indeed, ARMA models are based on the assumption that your time series $\{x_t\}$ is stationary. So you should start by testing the stationarity of $x_t$, using the Dickey–Fuller test for example (there are many other tests). If it is stationary then $d=0$; otherwise take a first difference $y_t = \Delta x_t = x_t - x_{t-1}$ and test it for stationarity (generally a first difference is sufficient), in which case $d=1$; otherwise difference $y_t$ once more, giving $d=2$, and so on.
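The "difference until stationary" loop can be sketched as follows. This is a toy illustration: `is_stationary` is a placeholder for a real unit-root test (e.g. an augmented Dickey–Fuller test from statsmodels or R), and `crude_test` below is a deliberately naive stand-in I made up for the example.

```python
# Sketch of choosing d by differencing until a stationarity test passes.

def difference(x):
    """First difference: y_t = x_t - x_{t-1}."""
    return [b - a for a, b in zip(x, x[1:])]

def choose_d(x, is_stationary, max_d=2):
    """Return (d, differenced series), differencing at most max_d times."""
    for d in range(max_d + 1):
        if is_stationary(x):
            return d, x
        x = difference(x)
    return max_d, x  # gave up at max_d; in practice, re-examine the series

# Naive stand-in for a unit-root test: call the series "stationary" when
# the means of its two halves are close. Do NOT use this in practice.
def crude_test(x):
    h = len(x) // 2
    m1 = sum(x[:h]) / h
    m2 = sum(x[h:]) / (len(x) - h)
    return abs(m1 - m2) < 0.5

trend = [0.5 * t for t in range(100)]  # linear trend: non-stationary mean
d, y = choose_d(trend, crude_test)
print(d)  # 1: one difference removes the linear trend
```

With a proper test you would also watch the test's p-values at each step rather than differencing blindly, since over-differencing introduces an artificial MA component.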
Let's assume $d=0$ (so that your $x_t$ is already stationary).
- Identifying the AutoRegressive (AR) order $p$:
To determine this order, plot the Partial AutoCorrelation Function (PACF) of $x_t$, then $p$ will be the maximum lag at which the PACF is significant.
- Identifying the Moving Average (MA) order $q$:
Plot the AutoCorrelation Function (ACF) of $x_t$, and set $q$ to be the maximum lag at which the ACF is significant.
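The two reading rules above can be sketched in plain code. This is a minimal illustration with function names of my own choosing; in practice you would plot the ACF/PACF with statsmodels or R rather than compute them by hand. The PACF here uses the Durbin-Levinson recursion, and "significant" means outside the usual approximate 95% band $\pm 1.96/\sqrt{n}$.

```python
# Sample ACF/PACF and the "maximum significant lag" reading rule.
import math

def acf(x, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    return [sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / n / c0
            for k in range(1, max_lag + 1)]

def pacf(x, max_lag):
    """Partial autocorrelations at lags 1..max_lag (Durbin-Levinson)."""
    r = acf(x, max_lag)
    phi = [[0.0] * (max_lag + 1) for _ in range(max_lag + 1)]
    pac = []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi[1][1] = r[0]
        else:
            num = r[k - 1] - sum(phi[k - 1][j] * r[k - 1 - j] for j in range(1, k))
            den = 1.0 - sum(phi[k - 1][j] * r[j - 1] for j in range(1, k))
            phi[k][k] = num / den
            for j in range(1, k):
                phi[k][j] = phi[k - 1][j] - phi[k][k] * phi[k - 1][k - j]
        pac.append(phi[k][k])
    return pac

def max_significant_lag(values, n):
    """Largest lag whose coefficient lies outside +/- 1.96/sqrt(n)."""
    bound = 1.96 / math.sqrt(n)
    sig = [k for k, v in enumerate(values, start=1) if abs(v) > bound]
    return max(sig) if sig else 0
```

Reading the orders is then `p = max_significant_lag(pacf(x, L), len(x))` and `q = max_significant_lag(acf(x, L), len(x))` for some maximum lag `L` you consider.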
Thus you get your empirical model $ARIMA(p,d,q)$.
If you are using R, you can try to fit a model to your series with the auto.arima function from the forecast package, and you will notice that the $AIC$ and $BIC$ values of your empirical model are very close to those of the automatically fitted one.
The technique I explained above is inspired by the book Analysis of Financial Time Series ($3^{rd}$ edition, by Ruey S. Tsay).
In my opinion, it is more interesting to do it this way, because you understand the relations between your parameters ($\theta_i, \phi_j$ of the ARMA) and the ACF/PACF values, but also the economic justification (how many lagged days...).