The markets have changed significantly since we began. New technologies and changes in market participant behavior have caused the number of trade and quote messages per second to explode. To reduce the risk of dropped packets and lost information, Tick Data has not collected data over a live feed since 2001. Our preferred method for obtaining data for new markets is to negotiate directly with exchanges for their official archive. However, some exchanges do not grant reasonable terms for redistribution, or they archive their data in a manner that results in incomplete or inaccurate information. In these cases, we partner with quality real-time data providers who have direct exchange connections, multiple, redundant ticker plants, and the ability to make the data available for download after markets close each day. Every provider we use meets our high standards for completeness and accuracy.
Despite the care we take in obtaining the finest raw tick-by-tick data, omissions and errors can persist, even in data sourced directly from exchanges. Therefore, all data passes through our validation process, a suite of extraction, filtering, verification, and reporting programs developed in-house for the sole purpose of producing clean, robust data:
Condition Code Filtering – We filter our tick-by-tick trade data for various condition codes that denote out-of-sequence trades, cancelled trades, and other conditions that require data points to be removed prior to use by a quant. While these prints still exist in the data, by default TickWrite removes them from output time series files. Clients can add these prints back into output files by simply checking a box.
Price Filtering – We also run our tick-by-tick trade data through a proprietary algorithmic filter that flags trades it considers to be non-representative of market conditions (i.e., “bad ticks”) and suggests corrected values. By default TickWrite replaces these “bad ticks” with the corrected prices, but it can extract both the filtered and unfiltered data. To learn more about our data filtering approach, please see our white paper on High-Frequency Data Filtering.
Data Validation – Employing a number of internal processes, we first ensure that we have the data we expect to receive from our various sources and then programmatically identify gaps in trade data that we deem to be unusual for each individual symbol at that time of day.
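As a rough illustration only, the first two filtering steps described above can be sketched in Python. This is not the proprietary filter: the condition codes (`Z`, `X`), the record layout, and the rolling-median rule with its `window` and `max_dev` parameters are all assumptions chosen for the example, and real condition codes vary by exchange and feed.

```python
from statistics import median

# Hypothetical condition codes to exclude by default (placeholders,
# not any exchange's actual code table).
EXCLUDED_CODES = {"Z", "X"}


def filter_condition_codes(trades, excluded=EXCLUDED_CODES, keep_excluded=False):
    """Drop trades whose condition code is excluded.

    keep_excluded=True mimics "checking the box" to add the prints back.
    """
    if keep_excluded:
        return list(trades)
    return [t for t in trades if t["cond"] not in excluded]


def flag_bad_ticks(trades, window=5, max_dev=0.05):
    """Flag prices far from the median of a centered window of prices.

    A simple stand-in for a price filter (assumed technique: rolling median).
    Returns a list of (trade, is_bad, suggested_price) tuples.
    """
    prices = [t["price"] for t in trades]
    half = window // 2
    results = []
    for i, trade in enumerate(trades):
        lo = max(0, i - half)
        hi = min(len(prices), i + half + 1)
        reference = median(prices[lo:hi])  # window includes the tick itself
        is_bad = abs(trade["price"] - reference) / reference > max_dev
        results.append((trade, is_bad, reference if is_bad else trade["price"]))
    return results


# Invented sample ticks: one excluded print and one misplaced decimal.
trades = [
    {"time": "09:30:00.100", "price": 100.00, "size": 200, "cond": ""},
    {"time": "09:30:00.250", "price": 100.02, "size": 100, "cond": ""},
    {"time": "09:30:00.300", "price": 99.50, "size": 100, "cond": "Z"},
    {"time": "09:30:00.410", "price": 10.01, "size": 300, "cond": ""},
    {"time": "09:30:00.600", "price": 100.03, "size": 100, "cond": ""},
    {"time": "09:30:00.900", "price": 100.01, "size": 500, "cond": ""},
]

clean = filter_condition_codes(trades)
flagged = flag_bad_ticks(clean)
for trade, is_bad, suggested in flagged:
    print(trade["time"], trade["price"], "BAD" if is_bad else "ok", suggested)
```

Keeping both the original and the suggested price for each tick mirrors the option to extract filtered or unfiltered data.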
When you order equity data from Tick Data, LLC, you receive as-traded data, along with powerful tools that convert the as-traded data into time series data. These as-traded files are delivered one file per company per day, each named by the ticker symbol under which it traded on that day. Our TickWrite® data management tools use these files as inputs for generating custom-formatted, research-ready mapped output files. TickWrite® uses our proprietary Ticker Mapping® and corporate actions information to link together the data for all ticker symbols used by that corporate entity throughout its history, creating a single mapped file for the range of data processed. Files can also be adjusted for stock splits/dividends and spinoffs if desired.
Here is an example of the difference between as-traded and time series data using Citigroup (ticker: C) trading in the U.S.:
The corporate entity known today as Citigroup (NYSE ticker: C) has traded under these symbols throughout its history: C, CCI, TRV, and PA. If a client orders the complete history of the current symbol ‘C’ since 1993, we deliver this data in files by symbol, one file per symbol per day, that include C, CCI, TRV, and PA files. If a client wants to generate a single file of Citigroup data from 1993 to present, TickWrite® would combine the as-traded files with these various symbols into a single time series file of historical Citigroup data.
For more information on the available versions of TickWrite®, please see our Data Delivery
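The idea of stitching as-traded files into one mapped series can be sketched as follows. This is a minimal illustration under stated assumptions, not how TickWrite® or Ticker Mapping® is implemented: the symbols (`OLDA`, `OLDB`, `NEWC`), the date ranges, and the in-memory layout keyed by `(symbol, day)` are invented placeholders and do not describe Citigroup's or any real company's history.

```python
from datetime import date

# Hypothetical symbol history for one corporate entity:
# (first trading day, last trading day or None if still trading, symbol).
SYMBOL_HISTORY = [
    (date(2000, 1, 3), date(2004, 6, 30), "OLDA"),
    (date(2004, 7, 1), date(2010, 12, 31), "OLDB"),
    (date(2011, 1, 3), None, "NEWC"),
]


def symbol_on(day, history=SYMBOL_HISTORY):
    """Return the symbol the entity traded under on `day`, or None."""
    for first, last, symbol in history:
        if first <= day and (last is None or day <= last):
            return symbol
    return None


def build_mapped_series(as_traded, days, history=SYMBOL_HISTORY):
    """Concatenate per-symbol, per-day tick rows into one chronological list.

    as_traded maps (symbol, day) -> list of tick rows, standing in for
    the one-file-per-symbol-per-day delivery described above.
    """
    mapped = []
    for day in sorted(days):
        symbol = symbol_on(day, history)
        if symbol is None:
            continue  # entity did not trade on this day
        for row in as_traded.get((symbol, day), []):
            mapped.append((day, symbol, row))
    return mapped


# Invented ticks on either side of a symbol change.
as_traded = {
    ("OLDA", date(2004, 6, 30)): [("15:59:59", 25.10, 100)],
    ("OLDB", date(2004, 7, 1)): [("09:30:00", 25.15, 200)],
}
mapped = build_mapped_series(as_traded, [date(2004, 7, 1), date(2004, 6, 30)])
for day, symbol, row in mapped:
    print(day, symbol, row)
```

Adjustments for splits, dividends, and spinoffs would be a separate step applied to the mapped rows and are not shown here.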
Survivorship bias is an often-ignored but significant source of variation between theoretical and actual trading results. It occurs when models are developed, tested, and optimized using only currently traded ticker symbols. A benefit of ordering Equity or OPRA Options data products from Tick Data, LLC is the inclusion of symbols that are no longer traded due to mergers, acquisitions, delistings, etc. Clients who order all symbols for a market receive both active and inactive companies, which they can use to eliminate survivorship bias from their analyses. For more information, see our white paper: Survivorship Bias in the Development of Equity Trading Systems.
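To make the distinction concrete, here is a small sketch of selecting a point-in-time universe rather than today's survivors. The symbols, dates, and the `LISTINGS` layout are hypothetical and serve only to illustrate the idea.

```python
from datetime import date

# Hypothetical listings: symbol -> (first trading day, last trading day).
# A last day of None means the symbol is still active.
LISTINGS = {
    "AAA": (date(1995, 1, 3), None),
    "BBB": (date(1998, 5, 1), date(2003, 8, 15)),  # later delisted
    "CCC": (date(2001, 2, 1), date(2008, 9, 30)),  # later acquired
    "DDD": (date(2005, 6, 1), None),
}


def universe_as_of(day, listings=LISTINGS):
    """Symbols that were actually tradable on `day` (active or since inactive)."""
    return sorted(
        symbol
        for symbol, (first, last) in listings.items()
        if first <= day and (last is None or day <= last)
    )


def survivors_only(listings=LISTINGS):
    """Symbols still trading today: the biased universe."""
    return sorted(s for s, (_, last) in listings.items() if last is None)


test_day = date(2002, 1, 2)
print("Point-in-time universe:", universe_as_of(test_day))
print("Survivors only:        ", survivors_only())
```

A backtest for early 2002 built from the survivors list would omit `BBB` and `CCC`, which were tradable at the time, and would include `DDD`, which was not yet listed.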
For all equity markets and U.S. options, we offer tick-by-tick data for both trades and quotes. This provides clients with the ability to analyze every single trade and/or every Level I bid and offer. However, for clients who measure frequency in minutes or hours rather than milliseconds or seconds, we also offer data in pre-built one-minute bars. Each interval of our pre-built one-minute trade data includes Date, Time, Open, High, Low, Close, and Volume. We deliver the data in zipped comma-delimited text files and offer three choices of data granularity:
1) Tick-by-tick Level I Quotes (bid/ask with size) and Trades (last price with volume)
2) Tick-by-tick Trades only
3) One-Minute bar data pre-built from the Trade data (OHLCV)
We also offer two additional choices for U.S. equities:
4) Tick-by-tick NBBO Quotes and Trades
5) One-minute Quote Bars built from NBBO quote data
In addition, clients can use TickWrite® to build tick- or time-based bars from trade data and time-based bars from quote data. Order data for all available symbols (both active & inactive symbols) by year or by market, or specify a custom symbol list and date range. Clients can also subscribe to our daily update service to keep data current.
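For readers who want to see how one-minute bars relate to tick-by-tick trades, the aggregation can be sketched as below. This is an illustrative sketch, not the method used to produce the pre-built bars: the input tuple layout, the choice to label each bar by the start of its minute, the date and time formats, and the decision to emit no bar for minutes without trades are all assumptions.

```python
from datetime import datetime


def build_minute_bars(trades):
    """Aggregate (timestamp, price, volume) trades into one-minute OHLCV bars.

    Trades must be sorted by time. Each bar is labeled with the start of
    its minute (an assumed convention); minutes with no trades are skipped.
    """
    bars = {}
    for ts, price, volume in trades:
        key = ts.replace(second=0, microsecond=0)  # truncate to the minute
        bar = bars.get(key)
        if bar is None:
            bars[key] = {"Open": price, "High": price, "Low": price,
                         "Close": price, "Volume": volume}
        else:
            bar["High"] = max(bar["High"], price)
            bar["Low"] = min(bar["Low"], price)
            bar["Close"] = price  # last trade seen so far in this minute
            bar["Volume"] += volume
    return [
        {"Date": key.strftime("%m/%d/%Y"), "Time": key.strftime("%H:%M"), **bar}
        for key, bar in sorted(bars.items())
    ]


# Invented sample trades.
sample_trades = [
    (datetime(2020, 1, 2, 9, 30, 0, 100000), 100.00, 200),
    (datetime(2020, 1, 2, 9, 30, 15), 100.05, 100),
    (datetime(2020, 1, 2, 9, 30, 59, 900000), 99.98, 300),
    (datetime(2020, 1, 2, 9, 31, 2), 100.01, 150),
    (datetime(2020, 1, 2, 9, 33, 40), 100.10, 50),
]

minute_bars = build_minute_bars(sample_trades)
for bar in minute_bars:
    print(",".join(str(bar[k]) for k in
                   ("Date", "Time", "Open", "High", "Low", "Close", "Volume")))
```

The same loop generalizes to other time-based bars by changing how the timestamp is truncated; tick-based bars would instead group a fixed number of consecutive trades.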