DEPARTMENT OF COMPUTER SCIENCE
DOCTORAL DISSERTATION DEFENSE


Candidate: Elizabeth Shriver
Advisors: Alan Siegel and John Wilkes

Performance modeling for realistic storage devices

3:00 p.m., Thursday, January 30, 1997
12th floor conference room, 719 Broadway




Abstract

Managing large amounts of storage is difficult, and it is becoming more so as both the complexity and the number of storage devices increase. One approach to this problem is a self-managing storage system. Since a self-managing storage system is a real-time system, it requires a model that quickly approximates the behavior of the storage device in a workload-dependent fashion. We develop such a model.

Our approach to modeling devices is to model the individual components of the device, such as queues, caches, and disk mechanisms, and then compose the components. Each component modifies the use patterns of the workload entering it and determines its own performance from the workload use patterns and the behavior of the lower-level device. For example, modifying the use patterns allows us to capture the altered spatial locality that occurs when queues reorder their requests.
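The compositional idea can be illustrated with a minimal sketch. It is not the model developed in the dissertation: the class names, the two workload measures, the rule that sequential requests hit in the cache, and the M/M/1 FCFS queueing formula are all assumptions chosen only to show how each component can transform the workload before handing it to the level below.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Workload:
    """Hypothetical use-pattern summary of a request stream."""
    request_rate: float         # requests per second
    sequential_fraction: float  # fraction of requests that continue the previous one


class Disk:
    """Lowest level: positioning is paid only by non-sequential requests (assumption)."""

    def __init__(self, positioning_s, transfer_s):
        self.positioning_s = positioning_s
        self.transfer_s = transfer_s

    def transform(self, w):
        return w

    def response_time(self, w, lower_s):
        return (1.0 - w.sequential_fraction) * self.positioning_s + self.transfer_s


class Cache:
    """Assumes sequential requests hit (read-ahead) and all others miss."""

    def __init__(self, hit_s):
        self.hit_s = hit_s

    def transform(self, w):
        # Only misses reach the lower level, and they are no longer sequential.
        miss = 1.0 - w.sequential_fraction
        return replace(w, request_rate=w.request_rate * miss, sequential_fraction=0.0)

    def response_time(self, w, lower_s):
        miss = 1.0 - w.sequential_fraction
        return self.hit_s + miss * lower_s


class Queue:
    """FCFS queue approximated as M/M/1; a reordering queue would also change locality."""

    def transform(self, w):
        return w

    def response_time(self, w, lower_s):
        utilization = w.request_rate * lower_s
        if utilization >= 1.0:
            raise ValueError("workload saturates the device")
        return lower_s / (1.0 - utilization)


def predict(components, w):
    """Mean response time of a top-to-bottom stack of components."""
    if not components:
        return 0.0
    head, rest = components[0], components[1:]
    lower_s = predict(rest, head.transform(w))
    return head.response_time(w, lower_s)


stack = [Queue(), Cache(hit_s=0.0001), Disk(positioning_s=0.010, transfer_s=0.001)]
predicted = predict(stack, Workload(request_rate=20.0, sequential_fraction=0.5))
```

Because every component exposes the same two operations, a more complicated device can be described by adding components to the stack rather than by rewriting the model.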

Our model predicts the device behavior, in terms of response time, to within 8% relative error for an interesting subset of the domain of devices and workloads. To demonstrate this, the model has been validated with synthetic traces of parallel scientific file system applications and traces of transaction processing applications.
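For reference, relative error is conventionally the absolute difference between the predicted and measured values divided by the measured value. The short sketch below assumes that definition applied to a mean response time; the function name and the sample numbers are illustrative and are not results from the dissertation.

```python
def relative_error(predicted, measured):
    """Relative error of a prediction with respect to a nonzero measurement."""
    if measured == 0:
        raise ValueError("measured value must be nonzero")
    return abs(predicted - measured) / abs(measured)


# Illustrative numbers only: a 10.8 ms prediction against a 10.0 ms measurement.
error = relative_error(10.8, 10.0)
within_bound = error <= 0.08 + 1e-12  # small tolerance for floating-point rounding
```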

Our contributions to the area of performance modeling for storage devices include the following:

1. Methods to approximate the positioning time for the disk head of a magnetic disk.
2. Methods to approximate the queue delay for non-FCFS scheduling algorithms.
3. Methods to approximate the cache-miss probabilities and the full and partial cache-hit probabilities in the data caches in the I/O path using measures of workload spatial locality.
4. Methods to approximate the mean seek time and rotational latency of the disk mechanism using measures of workload spatial locality.
5. An infrastructure for developing a composite model. The infrastructure supports the development of more complicated devices and workloads than we have validated.

Together, these mean that we have analytic methods to approximate the behavior of a set of realistic storage devices.
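
To give a feel for what an analytic approximation of this kind can look like, the sketch below computes a mean seek time as the expectation of a seek-time curve over a distribution of seek distances, and a mean rotational latency as half a revolution. The curve shape (square root for short seeks, linear for long ones) is a commonly used form, and every coefficient, the cutoff, and the sample distribution are hypothetical; none of it is taken from the dissertation.

```python
import math


def seek_time_ms(distance, a=3.0, b=0.3, c=7.4, e=0.004, cutoff=400):
    """Seek time in ms for a distance in cylinders (hypothetical coefficients)."""
    if distance == 0:
        return 0.0                          # no arm movement
    if distance < cutoff:
        return a + b * math.sqrt(distance)  # short seeks: acceleration dominated
    return c + e * distance                 # long seeks: coast dominated


def mean_seek_time_ms(distance_probs):
    """Expected seek time given a mapping from seek distance to probability."""
    total = sum(distance_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * seek_time_ms(d) for d, p in distance_probs.items())


def mean_rotational_latency_ms(rpm):
    """Half a revolution, the expected wait when the target sector is uniformly placed."""
    return 0.5 * 60000.0 / rpm


# A workload with high spatial locality puts more probability on distance 0.
local_workload = {0: 0.5, 100: 0.3, 1000: 0.2}
mean_seek = mean_seek_time_ms(local_workload)
mean_latency = mean_rotational_latency_ms(7200)
```

In this sketch, workload spatial locality enters only through the distance distribution: shifting probability toward distance 0 lowers the mean seek time without changing the device parameters.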