SpyTime --- a Performance Benchmark for Bitemporal Databases
Dennis Shasha |
Yunyue Zhu |
Abstract
A bitemporal table records both the history of the facts and the history of changes to the records in the database. In addition to normal non-temporal queries, bitemporal databases permit queries over two orthogonal time dimensions: valid time and transaction time. Valid time is the time when a fact is effective in reality. Transaction time denotes the time when the record is effective in the database. The diversity of possible queries makes bitemporal tables versatile and useful in many applications. Numerous indexing methods for temporal (as well as bitemporal) databases have been proposed and a comparison [2] has been made among these access methods. It is our hope that the SpyTime benchmark will be used as a basis for comparing the performance of different bitemporal database management systems.
It should also be noted that our benchmark is not
intended for the comparison of the descriptive and expressive
power of query languages. One can turn to a semantic
benchmark [3,6] for those comparisons.
In fact, a system satisfies the requirements of SpyTime regardless
of the query language it supports, as long as that query language
is powerful enough to support the semantics of the queries listed here.
Models for a Bitemporal Benchmark
The scheme we use as our bitemporal benchmark is a database that records spy activities in different cities. The uncertainty surrounding spy activity is reflected in our bitemporal database. Each tuple in the database represents the assertion that a specific spy was reported to appear in a specific city during a specific (valid time) interval. The transaction time interval is the time period during which the database believed this fact to be true.
The following table describes the data in this benchmark.
Spy-City Information Table
Field Name |
Type |
Comments |
Spy |
Char(10) |
The name of the spy. |
City |
Char(10) |
The city where the spy was reported to appear in. |
Vt_begin |
Date |
Begin of the Valid Time period. The spy was supposed to be in the specific city from this date. |
Vt_end |
Date |
End of the Valid Time period. The spy was supposed to be in the specific city until this date. |
Tt_begin |
Date |
Begin of the Transaction Time period. The database records the fact on this date. |
Tt_end |
Date |
End of the Transaction Time period. The database finds out the data is no longer true on this date. |
Reporter |
Char(10) |
The reporter who report the spy event. This can be other information that is relevant to the bitemporal data. |
Query characteristics. There will be basically two types of queries.
Queries on one table. We will use the notation proposed by Tsotras et. al. [1] Each query has a type of Explicit Attribute//Valid Time/Transaction Time, and each entry can be one of the following. Note that none of these queries are now-queries.
For clarity, we also give the queries in SQL3/Temporal [4].
1 |
(V//*/*) The activities of a specified spy s recorded in the database. This is a query on non-temporal attribute. SELECT SPY,CITY FROM SPYTIME |
2 |
(*//V/V) Give the spy events on a specified day d1 as we knew on a specified day d2. This is a typical bitemporal query when the valid-time and transaction-time qualifiers are both single values. It is also known as a bitemporal time slice query. VALIDTIME AND TRANSACTIONTIME |
3 |
(*//V/*) When were the spy events that took place on a specified day d recorded? NONSEQUENCED VALIDTIME AND TRANSACTIONTIME |
4 |
(*//*//V) Give the history of the spy activities as known as of a specified day d. VALIDTIME AND NONSEQUENCED TRANSACTIONTIME |
5 |
(*//R/*) When were the spy events that took place at some time during a specified time interval p recorded? NONSEQUENCED VALIDTIME AND TRANSACTIONTIME |
6 |
Give the names of the spies that were in the same city where a specified spy s has been at any time (earlier or later or at the same time), as recorded on a specified day d. NONSEQUENCED VALIDTIME AND NONSEQUENCED TRANSACTIONTIME |
7 |
For a specified spy s, for each city c visited by s, give the names of spies that were in c at the same time s was in c. This answer should be given based on the information known as of a specified day d. SEQUENCED VALIDTIME AND NONSEQUENCED TRANSACTIONTIME |
8 |
When did we believe that some spies visited any city that a specified spy s visited? NONSEQUENCED VALIDTIME AND SEQUENCED TRANSACTIONTIME |
9 |
Find all records that indicate inconsistency. Specifically, records that suggest that a specified spy s was in two different cities at the same time. This is a data integrity query. VALIDTIME AND TRANSACTIONTIME |
10 |
How long after the event did we find out that two specified spies s1 and s2 met in a specified city? This query concerns the relationship between the two time dimensions. VALIDTIME AND TRANSACTIONTIME |
Data Generation
The generation of data is simplified due to the nature of our scheme. We assume that a spy can be reported, maybe erroneously, anywhere and any time. We won’t check for data integrity in the original data. Checking for data integrity is one of our benchmark queries. Click
here for software to generate the data.Metric
Here is a metric to allow customers to compare one set of results against another. Each metric should be stated with respect to a scale factor, the number of records, assuming 10,000 records in 6 years. Since the queries entail random selections from larger sets, e.g. ''a specified day''), these tests should be run 10 times and vendors should report averages and standard deviations.
Response Time Metric:
This can be defined as the geometric mean of the execution time of each of the queries executed one after the other in a single-user mode
Where Qi is the Execution time for query i
One result of the performance of bitemporal database using this benchmark can be found here .
Bibliography
Last updated May, 2001