I'm currently working on some MATLAB code that checks a stock database for errors (missing values, wrong values, etc.). After reading this post, I came to the conclusion that I'll probably have to write some data-cleaning code to get accurate and reliable results when backtesting with this database.
The database was downloaded from Yahoo Finance and contains the following columns for each stock: Date, Open, High, Low, Close, Volume, AdjClose.
So far the program scans for the following trivial errors:
- Close > High
- Close < Low
- Open > High
- Open < Low
- High < Low
The program also checks whether any of the data columns contain NaN or values less than zero.
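For reference, the checks listed so far can be sketched roughly as follows. This is only a minimal illustration, not the actual program: it assumes the data for one stock sits in a MATLAB table `T` whose variable names match the column names above, and the sample values and the names `bad`, `vals`, `flagged`, and `badRows` are made up for the example. Note that comparisons involving NaN evaluate to false in MATLAB, so missing values need their own `isnan` test.

```matlab
% Hypothetical sample data; variable names follow the columns described above.
T = table( ...
    [10; 12; 11; NaN], ...
    [11; 13; 10; 12], ...
    [ 9; 11; 12; 10], ...
    [10.5; 14; 11; 11], ...
    [1000; -5; 2000; 1500], ...
    [10.5; 14; 11; 11], ...
    'VariableNames', {'Open','High','Low','Close','Volume','AdjClose'});

% One logical column vector per consistency check (true = row violates it).
bad = struct();
bad.CloseAboveHigh = T.Close > T.High;
bad.CloseBelowLow  = T.Close < T.Low;
bad.OpenAboveHigh  = T.Open  > T.High;
bad.OpenBelowLow   = T.Open  < T.Low;
bad.HighBelowLow   = T.High  < T.Low;

% Negative and missing values across all numeric columns.
vals = T{:, {'Open','High','Low','Close','Volume','AdjClose'}};
bad.Negative = any(vals < 0, 2);
bad.Missing  = any(isnan(vals), 2);

% Combine all checks into a single flag per row.
names   = fieldnames(bad);
flagged = false(height(T), 1);
for k = 1:numel(names)
    flagged = flagged | bad.(names{k});
end
badRows = find(flagged);   % indices of rows that fail at least one check
```

With this sample data, rows 2, 3, and 4 are flagged: row 2 has Close above High and a negative Volume, row 3 has High below Low, and row 4 has a NaN in Open. Keeping one logical vector per check makes it easy to report which rule a given row violated.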
What other errors/flaws could I look for in the database?