SpyTime --- a Performance Benchmark for Bitemporal Databases

SpyTime --- a Performance Benchmark for Bitemporal Databases

Dennis Shasha
Prof, Computer Science Dept, Department of Computer Science
Courant Institute of Mathematical Sciences
New York University
(shasha@cs.nyu.edu)

Yunyue Zhu
Department of Computer Science
Courant Institute of Mathematical Sciences
New York University
(yunyue@cs.nyu.edu)

Abstract

A bitemporal table records both the history of the facts and the history of changes to the records in the database. In addition to normal non-temporal queries, bitemporal databases permit queries over two orthogonal time dimensions: valid time and transaction time. Valid time is the time when a fact is effective in reality. Transaction time denotes the time when the record is effective in the database. The diversity of possible queries makes bitemporal tables versatile and useful in many applications. Numerous indexing methods for temporal (as well as bitemporal) databases have been proposed and a comparison [2] has been made among these access methods. It is our hope that the SpyTime benchmark will be used as a basis for comparing the performance of different bitemporal database management systems.

It should also be noted that our benchmark is not intended for the comparison of the descriptive and expressive power of query languages. One can turn to a semantic benchmark [3,6] for those comparisons. In fact, a system satisfies the requirements of SpyTime regardless of the query language it supports, as long as that query language is powerful enough to support the semantics of the queries listed here.

Models for a Bitemporal Benchmark

The scheme we use as our bitemporal benchmark is a database that records spy activities in different cities. The uncertainty surrounding spy activity is reflected in our bitemporal database. Each tuple in the database represents the assertion that a specific spy was reported to appear in a specific city during a specific (valid time) interval. The transaction time interval is the time period during which the database believed this fact to be true.

The following table describes the data in this benchmark.

Spy-City Information Table

Field Name	Type	Comments
Spy	Char(10)	The name of the spy.
City	Char(10)	The city where the spy was reported to appear in.
Vt_begin	Date	Begin of the Valid Time period. The spy was supposed to be in the specific city from this date.
Vt_end	Date	End of the Valid Time period. The spy was supposed to be in the specific city until this date.
Tt_begin	Date	Begin of the Transaction Time period. The database records the fact on this date.
Tt_end	Date	End of the Transaction Time period. The database finds out the data is no longer true on this date.
Reporter	Char(10)	The reporter who report the spy event. This can be other information that is relevant to the bitemporal data.

Query characteristics. There will be basically two types of queries.

Queries on one table. We will use the notation proposed by Tsotras et. al. [1] Each query has a type of Explicit Attribute//Valid Time/Transaction Time, and each entry can be one of the following. Note that none of these queries are now-queries.

V: a single attribute value, or a single time instant (slice).
R: a specific range of attribute values, or a continuous time interval.
S: a set of ranges of attribute values or a set of time intervals.
*: indicates the absence of a selection predicate on the attribute.

Queries on multiple tables involving temporal joins of various kinds (on valid time, transaction time, or both).

For clarity, we also give the queries in SQL3/Temporal [4].

1	(V///) The activities of a specified spy s recorded in the database. This is a query on non-temporal attribute. SELECT SPY,CITY FROM SPYTIME WHERE SYP=s
2	(*//V/V) Give the spy events on a specified day d1 as we knew on a specified day d2. This is a typical bitemporal query when the valid-time and transaction-time qualifiers are both single values. It is also known as a bitemporal time slice query. VALIDTIME AND TRANSACTIONTIME SELECT SPY,CITY FROM SPYTIME WHERE VALIDTIME(SPYTIME) OVERLAPS DATE d1 AND TRANSACTIONTIME(SPYTIME) OVERLAPS DATE d2
3	(//V/) When were the spy events that took place on a specified day d recorded? NONSEQUENCED VALIDTIME AND TRANSACTIONTIME SELECT SPY,CITY FROM SPYTIME WHERE VALIDTIME(SPYTIME) OVERLAPS DATE d
4	(////V) Give the history of the spy activities as known as of a specified day d. VALIDTIME AND NONSEQUENCED TRANSACTIONTIME SELECT SPY,CITY FROM SPYTIME WHERE TRANSACTIONTIME(SPYTIME) OVERLAPS DATE d
5	(//R/) When were the spy events that took place at some time during a specified time interval p recorded? NONSEQUENCED VALIDTIME AND TRANSACTIONTIME SELECT SPY,CITY FROM SPYTIME WHERE VALIDTIME(SPYTIME) OVERLAPS PERIOD p
6	Give the names of the spies that were in the same city where a specified spy s has been at any time (earlier or later or at the same time), as recorded on a specified day d. NONSEQUENCED VALIDTIME AND NONSEQUENCED TRANSACTIONTIME SELECT S2.SPY,CITY FROM SPYTIME AS S1,SPYTIME AS S2 WHERE TRANSACTIONTIME(S1) OVERLAPS DATE d AND TRANSACTIONTIME(S2) OVERLAPS DATE d AND S1.SPY=s AND S1.SPY<>S2.SPY AND S1.CITY=S2.CITY
7	For a specified spy s, for each city c visited by s, give the names of spies that were in c at the same time s was in c. This answer should be given based on the information known as of a specified day d. SEQUENCED VALIDTIME AND NONSEQUENCED TRANSACTIONTIME SELECT S2.SPY,CITY FROM SPYTIME AS S1,SPYTIME AS S2 WHERE TRANSACTIONTIME(S1) OVERLAPS DATE d AND TRANSACTIONTIME(S2) OVERLAPS DATE d AND S1.SPY=s AND S1.SPY<>S2.SPY AND S1.CITY=S2.CITY
8	When did we believe that some spies visited any city that a specified spy s visited? NONSEQUENCED VALIDTIME AND SEQUENCED TRANSACTIONTIME SELECT S2.SPY,CITY FROM SPYTIME AS S1,SPYTIME AS S2 WHERE S1.SPY=s AND S1.SPY<>S2.SPY AND S1.CITY=S2.CITY
9	Find all records that indicate inconsistency. Specifically, records that suggest that a specified spy s was in two different cities at the same time. This is a data integrity query. VALIDTIME AND TRANSACTIONTIME SELECT S1.SPY,S1.CITY,S2.CITY FROM SPYTIME AS S1,SPYTIME AS S2 WHERE S1.SPY=s AND S2.SPY=s AND S1.CITY<>S2.CITY
10	How long after the event did we find out that two specified spies s1 and s2 met in a specified city? This query concerns the relationship between the two time dimensions. VALIDTIME AND TRANSACTIONTIME SELECT BEGIN(TRANSACTIONTIME)-BEGIN(VALIDTIME) FROM SPYTIME AS S1,SPYTIME AS S2 WHERE S1.SPY=s1 AND S2.SPY=s2 AND S1.CITY=S2.CITY

Data Generation

The generation of data is simplified due to the nature of our scheme. We assume that a spy can be reported, maybe erroneously, anywhere and any time. We won’t check for data integrity in the original data. Checking for data integrity is one of our benchmark queries. Click here for software to generate the data.

Metric

Here is a metric to allow customers to compare one set of results against another. Each metric should be stated with respect to a scale factor, the number of records, assuming 10,000 records in 6 years. Since the queries entail random selections from larger sets, e.g. ''a specified day''), these tests should be run 10 times and vendors should report averages and standard deviations.

Response Time Metric:

This can be defined as the geometric mean of the execution time of each of the queries executed one after the other in a single-user mode

Where Qi is the Execution time for query i

One result of the performance of bitemporal database using this benchmark can be found here .

Bibliography

Vassilis J. Tsotras, Christian S. Jensen, Richard T. Snodgrass, "An Extensible Notation for Spatiotemporal Index Queries", SIGMOD Record 27(1): 47-53(1998)

B. Salzberg, V.J. Tsotras, "A Comparison of Access Methods for Temporal Data", to appear at ACM Computing Surveys.

Patrick P Kalua, Edward L. Robertson. "Benchmark Queries for Temporal Databases." Technical Report 379, Computer Science Department, Indiana University March 1993, 40 pages.

Snodgrass, R.T., M.H.Boehlen, C. S. Jensen and A. Steiner, "Adding Valid Time to SQL/Temporal.'' change proposal, ANSI X3H2-96-501r2, ISO/IECJTC1/SC21/ WG3 DBL MAD-146r2, November, 1996, 77 pages, http://www.cs.arizona.edu/people/rts/sql3.html

Kaippallimalil J. Jacob, Dennis Shasha, " FinTime --- a financial time series benchmark", http://www.cs.nyu.edu/cs/faculty/shasha/fintime.d/bench.html

Jensen, C. S. (editor), J. Clifford, S. K. Gadia, F. Grandi, P. P. Kalua, N. Kline, N. Lorentzos, Y. Mitsopoulos, A. Montanari, S. S. Nair, E. Peressi, B. Pernici, E. L. Robertson, J. F. Roddick, N. L. Sarda,M. R. Scalas, A. Segev, R. T. Snodgrass, A. Tansel, R. Tiberio, A. Tuzhilin, and G. T. J. Wuu, '' A Consensus Test Suite of Temporal Database Queries,'' Technical Report R 93-2034, Dept. of Mathematics and Computer Science, Aalborg University, Denmark, November, 1993, 45 pages. http://www.cs.auc.dk/~csj/Papers/testSuiteFTR1.ps.gz

Last updated May, 2001