Kaippallimalil J. Jacob
Morgan Stanley Dean Witter & Co., New York
(kjacob@acm.org)
Dennis Shasha
Prof, Computer Science Dept
Department of Computer Science
Courant Institute of Mathematical Sciences
New York University
Dennis Shasha's home page
(shasha@cs.nyu.edu)
How many times have you seen projects start using a relational engine only to find that they take more and more of their functionality out of the engine for performance purposes? The result is a nightmare of maintenance, but it does the job well.
One way we in the financial industry can do better is to demand that the database vendors give us products that give us relevant functionality. The Transaction Processing Council (www.tpc.org) benchmarks are helpful, but they essentially ignore time series and other forms of ordered data, a critical component of financial database systems.
The two of us, an academic interested in time series databases and a practitioner who has built many database systems for finance, have created a financial time series benchmark in order to challenge vendors to produce products we won't have to implement around
FinTime (http://cs.nyu.edu/cs/faculty/shasha/fintime.html) is a set of data and queries that reflects the needs of financial analysts who are studying patterns in stock market data, but it should apply to any time-dependent financial instruments.
Unlike some other benchmarks, this one makes no requirement that all queries be expressed in a given language (e.g. SQL 2000). If a vendor has a query language, that's good enough. It's up to the vendor's customers to decide on syntactic and semantic elegance.
FinTime has evolved from a tutorial on time series databases given by Shasha during VLDB 98 (see his web page) and reflects Jacob and Shasha's best understanding of typical data analysis queries issued by users. Since many vendors have products that handle ordered data, such a benchmark can help would-be customers to evaluate them.
The models suggested in FinTime reflect two frequently occurring cases in the financial industry, namely, a historical market data system (decision support) and real-time price tick database (on-line transaction processing). FinTime also suggests and defines metrics that capture three useful dimensions of any time-series system, namely, performance in a single-user mode, performance in a multi-user mode and the price to performance ratio.
Models for a Time-series Benchmark
Before deciding on a model, we have to examine the different parameters that determine a model for time-series system. The most important parameters that influence a time-series database system are:
Combinations of these factors will give rise to 64 possible models but for simplicity we can focus on the following commonly occurring cases in the financial industry
Model 1: Historical market Information
Attribute |
Specification |
Periodicity of Data |
Periodic |
Density of Data |
Dense |
Schedule of updates |
Periodic updates (e.g. at the end of a business day) |
Complexity of queries |
Complex (e.g. Decision support queries) |
Nature of query arrival |
Batch |
Concurrency of users |
Low (e.g. Few concurrent users) |
Model 2: Tick databases for financial instruments
Attribute |
Specification |
Periodicity of Data |
Non-periodic |
Density of Data |
Sparse to moderately dense |
Schedule of updates |
Continuous |
Complexity of queries |
Simple |
Nature of query arrival |
Ad hoc |
Concurrency of users |
High (e.g. Many concurrent users) |
Let us now discuss the characteristics of the two models in some detail (Datatypes used in the models are explained in greater details in the glossary)
Model 1: Historical Market Information
Historical Market data systems are closely related to decision support systems of the traditional relational database world. The various elements of this model are:
Following are the benchmark queries to be run against the data model defined above. Many of them include a notion of ``specified set of securities'' or ``specified period''. This notion is defined in the glossary with respect to a simple random model.
1 |
Get the closing price of a set of 10 stocks for a 10-year period and group into weekly, monthly and yearly aggregates. For each aggregate period determine the low, high and average closing price value. The output should be sorted by id and trade date. |
2 |
Adjust all prices and volumes (prices are multiplied by the split factor and volumes are divided by the split factor) for a set of 1000 stocks to reflect the split events during a specified 300 day period, assuming that events occur before the first trade of the split date. These are called split-adjusted prices and volumes. |
3 |
For each stock in a specified list of 1000 stocks, find the differences between the daily high and daily low on the day of each split event during a specified period. |
4 |
Calculate the value of the S&P500 and Russell 2000 index for a specified day using unadjusted prices and the index composition of the 2 indexes (see appendix for spec) on the specified day |
5 |
Find the 21-day and 5-day moving average price for a specified list of 1000 stocks during a 6-month period. (Use split adjusted prices) |
6 |
(Based on the previous query) Find the points (specific days) when the 5-month moving average intersects the 21-day moving average for these stocks. The output is to be sorted by id and date. |
7 |
Determine the value of $100,000 now if 1 year ago it was invested equally in 10 specified stocks (i.e. allocation for each stock is $10,000). The trading strategy is: When the 20-day moving average crosses over the 5-month moving average the complete allocation for that stock is invested and when the 20-day moving average crosses below the 5-month moving average the entire position is sold. The trades happen on the closing price of the trading day. |
8 |
Find the pair-wise coefficients of correlation in a set of 10 securities for a 2 year period. Sort the securities by the coefficient of correlation, indicating the pair of securities corresponding to that row. [Note: coefficient of correlation defined in appendix] |
9 |
Determine the yearly dividends and annual yield (dividends/average closing price) for the past 3 years for all the stocks in the Russell 2000 index that did not split during that period. Use unadjusted prices since there were no splits to adjust for. |
Since the benchmark model assumes that the level of concurrency is small, the number of users should be 5. The benchmark methodology should be that each of the five users executes all the queries listed above in a randomly generated permutation. The five users will be simultaneously active at any point in time. Where the output order is not specified, it can be displayed/stored in any order.
Model 2: Tick databases for financial instruments
This benchmark models a very different, but commonly occurring, scenario from the one above. In this case, the benchmark attempts to model a case where the database is expected to keep up with a very high rate of updates while responding to several users issuing fairly simple queries. This case is quite similar to OLTP systems in the relational world.
The case being modeled is the tick database for a financial system. Ticks are price quotation or trade (transaction) prices and associated attributes for individual securities that occur either on the floor of a stock exchange or in an electronic trading system, such as the NASDAQ market system. Ticks include 2 basic kinds of data (1) Trades are transactions between buyers and a sellers at a fixed price and quantity (2) Quotes are price quotations offered by buyers and sellers. Quotes can have the ask quote, the bid quote or both along with their associate attributes such as quantity offered.
A very important consideration in tick databases is the ability to quickly apply "cancel/correct". Occasionally an incorrect quote or trade record is published. The vendor will then send a correction record with the identifier of the security and its sequence number. The record will either be corrected according to the new published or simply deleted.
1 |
Get all ticks a specified set of 100 securities for a specified three hour time period on a specified trade date. |
2. |
Determine the volume weighted price of a security considering only the ticks in a specified three hour interval |
3. |
Determine the top 10 percentage losers for the specified date on the specified exchanges sorted by percentage loss. The loss is calculated as a percentage of the last trade price of the previous day. |
4. |
Determine the top 10 most active stocks for a specified date sorted by cumulative trade volume by considering all trades |
5. |
Find the most active stocks in the "COMPUTER" industry (use SIC code) |
6. |
Find the 10 stocks with the highest percentage spreads. Spread is the difference between the last ask-price and the last bid-price. Percentage spread is calculated as a percentage of the mid-point price (average of ask and bid price). |
Metrics
Here are metrics to allow customers to compare one set of results against another. Each metric should be stated with respect to a scale factor and results for Historical Market Information should be reported separately from results for Tick Databases. For Historical Market Information, the scale factor is the number of securities, assuming 4,000 days of history. For Tick Databases, the scale factor is also the number of securities, assuming 100 ticks per security per day and 90 days of history. Since the queries entail random selections from larger sets as explained in the glossary (e.g. ``specified set of 100 securities''), these tests should be run 10 times and vendors should report averages and standard deviations.
This can be defined as the geometric mean of the execution time of each of the queries executed one after the other in a single-user mode
The Throughput Metric is defined as the average time taken by a user to complete his workload in a multi-user system.
The cost metric is used to bring the capabilities of the hardware into the equation. It is based on the assumption that additional hardware capabilities would be reflected in higher hardware costs.
Constraints
Conclusions
Benchmarks help compare different products that solve the same problem. It gives the potential user of these systems a meaningful way of assessing the capabilities of differing products. Since it is quite difficult to capture all the information about a system in a few benchmark measures, it is important that the implementation of this benchmark record the exact conditions under which the tests were carried out and the results that were used to arrive at the metrics. Eventually, these conditions will fall under a certification procedure. In the meantime, ask for your vendor's FinTime numbers. You may be surprised at who is the best.
Bibliography
Glossary
Defined below are some concepts and measures commonly used in the financial industry and referred to in the benchmark specification