Friday, August 5, 2016

market microstructure - Analyzing tick data


What are some of the commonly used techniques to analyze tick data? I am looking at tick data to see how the quotes and the mid-price evolve in response to certain events in the market. Since tick data is asynchronous, one can't directly apply traditional time series models, which assume evenly spaced observations, to explain these price movements. Some people have suggested that I create price bars based on either clock time or trade time, but I think that tends to miss information arriving in between the bars.


Any suggestions on how I can approach this?



Answer



Your question is very vague (e.g. what are you trying to measure, and what "tick data" do you have), but I'll give you some pointers:




  1. In general, when people consider how prices evolve, they will tend to think about things like volatility and correlation dynamics. So I would start by defining exactly what you want to measure. The irregularity of time series data is not a problem in itself, except in so far as you are making assumptions in your calculations about things like dispersion in time. The amount of variation over 1 millisecond will generally be different than over 1 second (and will also vary by asset), so you need to arrange your statistics to account for this.


    1.1 There is a vast literature on measuring volatility using high-frequency tick data. Search for papers on realized variance, volatility, and correlation from people like Neil Shephard (see his institute) or Tim Bollerslev. One feature of this literature is that it is often better not to use tick-by-tick data, because of what is known as microstructure noise (e.g. bid-ask bounce), and you're generally better off making estimates from something like 5-minute data.



    1.2 There is also a literature on dealing with unevenly spaced data (see, for instance, papers by Müller and Zumbach). A recent paper on the subject is "Algorithms for Unevenly-Spaced Time Series: Moving Averages and Other Rolling Operators". There is a nice section in Eric Zivot's book on time series analysis that covers this (look for irregularly spaced high-frequency data or inhomogeneous operators).




  2. Looking at statistics in clock time or trade time is an important distinction. For instance, the number of quotes or trades can vary dramatically across assets, with illiquid assets only trading a few times a day vs. liquid assets which trade many times each second. Using trade time to measure things like volatility can partly address this problem (as well as things like the significance of your estimate), although you will need to consider whether there are other clock time effects (such as open or close time seasonalities) even when you work in trade time.



  3. For tick data, are you working with level 1 (top of the book quotes and trades) or level 2 (full order book) data? If it's level 2, then you may not only want to consider changes through time, but also across the book.
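
To make point 1.1 concrete, here is a rough sketch of realized variance computed on a sparse clock grid instead of on every tick. It assumes you have already extracted sorted timestamps (in seconds) and prices; the function names `previous_tick_sample` and `realized_variance` are made up for this illustration, and previous-tick sampling is just one simple way to build the grid, not the only one used in the literature.

```python
import math

def previous_tick_sample(times, prices, step):
    """Sample prices on a regular clock grid, taking the last tick
    at or before each grid point (previous-tick sampling).

    times  -- tick timestamps in seconds, sorted ascending (assumed input)
    prices -- price observed at each tick
    step   -- grid spacing in seconds, e.g. 300 for 5-minute sampling
    """
    sampled = []
    i = 0  # index of the most recent tick at or before the grid point
    k = 0  # grid counter
    while times[0] + k * step <= times[-1]:
        grid = times[0] + k * step
        # Advance to the last tick whose timestamp does not exceed the grid point.
        while i + 1 < len(times) and times[i + 1] <= grid:
            i += 1
        sampled.append(prices[i])
        k += 1
    return sampled

def realized_variance(prices):
    """Sum of squared log returns of a price series."""
    return sum(math.log(b / a) ** 2 for a, b in zip(prices, prices[1:]))
```

On a toy series that only bounces between 100.0 and 100.1 on every tick, `realized_variance` on the raw ticks is positive even though the price never really moves, while sampling every second tick gives exactly zero. That is the bid-ask bounce effect in miniature.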

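For point 1.2, one simple operator for unevenly spaced data is an exponential moving average whose decay depends on the actual time elapsed between observations. The sketch below is a minimal version under my own assumptions: the name `irregular_ema` and the parameter `tau` (the decay time scale, in the same units as the timestamps) are illustrative, and the full weight of each update goes to the newest observation, which is only one of several interpolation choices.

```python
import math

def irregular_ema(times, values, tau):
    """Exponential moving average for unevenly spaced observations.

    times  -- observation timestamps, sorted ascending (assumed input)
    values -- observed values, same length as times
    tau    -- decay time scale, in the same units as times
    """
    ema = [values[0]]  # seed with the first observation
    for k in range(1, len(values)):
        dt = times[k] - times[k - 1]
        # A long gap gives a small weight to the old average;
        # a zero gap leaves the average unchanged.
        w = math.exp(-dt / tau)
        ema.append(w * ema[-1] + (1.0 - w) * values[k])
    return ema
```

The point of the design is that two ticks a millisecond apart barely move the average, while a tick arriving after a long quiet period almost replaces it, so you do not need to resample onto a grid first.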

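For point 2, working in trade time just means indexing by trade count instead of by the clock. A minimal sketch, assuming you have a list of trade prices in execution order (the function name `trade_time_bars` and the choice to drop an incomplete final bar are mine):

```python
def trade_time_bars(prices, trades_per_bar):
    """Group consecutive trades into bars with a fixed number of trades.

    Returns a list of (open, high, low, close) tuples. A trailing
    group with fewer than trades_per_bar trades is dropped.
    """
    bars = []
    for start in range(0, len(prices) - trades_per_bar + 1, trades_per_bar):
        chunk = prices[start:start + trades_per_bar]
        bars.append((chunk[0], max(chunk), min(chunk), chunk[-1]))
    return bars
```

Each bar then contains the same amount of trading activity whether the asset trades a few times a day or many times a second, but you would still want to keep the bar timestamps around to check for open and close effects.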