Although we obtain most of our data either directly from the exchanges or from a very robust archival process, this raw data is not research-ready. Errors can persist even in exchange-sourced data.
Therefore, we have introduced two filtering processes that add significant value to our equity, futures, and cash index trade data. Using a proprietary price filtering algorithm and codes included in the data by the exchanges, we remove or modify suspect ticks to improve the utility of the data without filtering out the genuine volatility of free markets. Our process includes:
- Data Validation – We first programmatically look for gaps where no trade has occurred for a period that is unusually long for that symbol at that time of day, and for consecutive trades whose price difference is uncharacteristic for that symbol. We also ensure we have data for all available symbols in the dataset.
- Price Filtering – Our algorithmic filters flag trades that appear to be non-representative of market conditions (i.e. “bad ticks”) and suggest corrected values. We do not overwrite suspect ticks in the source files, but flag them for removal by TickWrite®. Our TickWrite data management software can use the unfiltered or filtered prices in its output. Having both prices in the data allows clients to review the results of our filtering and choose the time series they prefer.
- Condition Code Filtering – We also filter our tick-by-tick trade data for condition codes that denote out-of-sequence trades, cancelled trades, block trades, off-exchange transactions, and other conditions that require data points to be removed before the data is used for quantitative research. As with the price filter, the tick-by-tick trade source data files contain the ticks flagged for removal, and the user can decide whether or not TickWrite should include them in output time series files.
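
To make the flag-rather-than-overwrite idea concrete, here is a minimal Python sketch of the three steps. It is illustrative only: the price filtering algorithm itself is proprietary and is not reproduced here, so a simple deviation-from-recent-median rule stands in for it. The `Tick` fields, the condition code names, and the thresholds are all assumptions made for this example and do not reflect TickWrite's file format or the exchanges' actual codes.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional, Tuple

# Hypothetical condition codes; real codes differ from exchange to exchange.
EXCLUDED_CODES = {"OUT_OF_SEQ", "CANCELLED", "BLOCK", "OFF_EXCHANGE"}


@dataclass
class Tick:
    time: float                        # seconds since midnight
    price: float                       # price as received; never overwritten
    code: str = ""                     # condition code, empty for a regular trade
    bad_price: bool = False            # set by the price filter
    bad_code: bool = False             # set by the condition code filter
    suggested: Optional[float] = None  # suggested corrected price, if flagged


def find_gaps(ticks: List[Tick], max_gap: float) -> List[Tuple[float, float]]:
    """Data validation: report spans where consecutive trades are more than
    max_gap seconds apart. A real check would vary max_gap by symbol and
    time of day; a single constant is used here for brevity."""
    return [(a.time, b.time) for a, b in zip(ticks, ticks[1:])
            if b.time - a.time > max_gap]


def flag_ticks(ticks: List[Tick], window: int = 5, max_dev: float = 0.05) -> None:
    """Flag suspect ticks in place without touching the raw price.

    Stand-in rule (an assumption, not the proprietary filter): a price is
    suspect if it deviates from the median of the last `window` accepted
    prices by more than `max_dev` as a fraction of that median."""
    accepted: List[float] = []
    for t in ticks:
        t.bad_code = t.code in EXCLUDED_CODES
        if accepted:
            ref = median(accepted[-window:])
            if abs(t.price - ref) / ref > max_dev:
                t.bad_price = True
                t.suggested = ref
        if not (t.bad_price or t.bad_code):
            accepted.append(t.price)


def build_series(ticks: List[Tick], filtered: bool = True) -> List[Tuple[float, float]]:
    """Emit (time, price) pairs, optionally dropping flagged ticks."""
    return [(t.time, t.price) for t in ticks
            if not filtered or not (t.bad_price or t.bad_code)]


ticks = [
    Tick(34200.0, 100.00),
    Tick(34201.0, 100.02),
    Tick(34202.0, 10.00),                # implausible price
    Tick(34203.0, 100.01, "CANCELLED"),  # excluded by condition code
    Tick(34204.0, 100.03),
    Tick(34504.0, 100.05),               # arrives after a 300-second gap
]
flag_ticks(ticks)
print(find_gaps(ticks, max_gap=60.0))
print(len(build_series(ticks, filtered=True)), "filtered ticks")
print(len(build_series(ticks, filtered=False)), "unfiltered ticks")
```

Because the flags sit alongside the untouched raw price, the same records can yield either the filtered or the unfiltered series, which is the property the bullets above describe.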