Sunday, July 19, 2015

equities - Scanning a stock database for errors/flaws


I'm currently working on some MATLAB code that checks a stock database for errors (missing values, wrong values, etc.). After reading this post, I concluded that I'll probably have to write some data-cleaning code to get accurate and reliable results when backtesting with this database.


The database was downloaded from Yahoo Finance and contains the following columns for each stock: Date, Open, High, Low, Close, Volume, AdjClose.


So far the program scans for the following trivial errors:



  • Close > High
  • Close < Low
  • Open > High
  • Open < Low
  • High < Low
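
For illustration, the five checks above could be written as vectorized comparisons along these lines. This is only a minimal sketch, not my actual program: the sample prices and the variable names (Open, High, Low, Close, badRows, badIdx) are made up for the example, and it assumes each column has already been loaded as a numeric column vector with one row per trading day.

```matlab
% Hypothetical sample data, one row per trading day (values invented for illustration).
Open  = [10.0; 10.5; 11.0];
High  = [10.8; 10.4; 11.5];
Low   = [ 9.9; 10.1; 11.2];
Close = [10.5; 10.2; 11.4];

% Each flag is a logical column vector; true marks a suspicious row.
closeAboveHigh = Close > High;
closeBelowLow  = Close < Low;
openAboveHigh  = Open  > High;
openBelowLow   = Open  < Low;
highBelowLow   = High  < Low;

% A row is inconsistent if any of the five checks fires.
% Comparisons involving NaN evaluate to false, so rows with missing
% values pass these checks silently and need a separate NaN test.
badRows = closeAboveHigh | closeBelowLow | openAboveHigh | openBelowLow | highBelowLow;
badIdx  = find(badRows);   % row numbers of the flagged days
```

Keeping one flag per rule, rather than a single combined test, makes it possible to report which rule a given row violated.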


The program also checks whether any of the data columns contains negative values or NaN.
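
A rough sketch of that negative/NaN check, again with invented values: it assumes the six numeric columns (everything except Date) sit in one matrix called data, which is a hypothetical name chosen for this example.

```matlab
% Hypothetical numeric matrix; columns in the order
% Open, High, Low, Close, Volume, AdjClose (Date is assumed to be stored separately).
data = [10.0 10.8  9.9 10.5 1200 10.5;
        10.2  NaN 10.0 10.3 1500 10.3;
        10.4 10.9 10.1 10.6 -300 10.6];

hasNaN      = any(isnan(data), 2);   % true for rows with at least one NaN
hasNegative = any(data < 0, 2);      % NaN < 0 is false, so NaN entries are not flagged here

nanRows = find(hasNaN);        % row numbers containing missing values
negRows = find(hasNegative);   % row numbers containing negative values
```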


What other errors/flaws could I look for in the database?



