Abstract:
Current APIs for multiprocessor multi-disk file systems are not easy
to use in developing out-of-core algorithms that choreograph parallel
data accesses. Consequently, the efficiency that these algorithms
promise is hard to achieve in practice. We address this deficiency by
specifying an API that includes data-access primitives for data
choreography.
With our API, the programmer can easily access specific blocks from
each disk in a single operation, thereby fully utilizing the
parallelism of the underlying storage system.
Our API supports the development of libraries of commonly used
higher-level routines such as matrix-matrix addition, matrix-matrix
multiplication, and BMMC (bit-matrix-multiply/complement)
permutations. We illustrate our API with implementations of these
three routines to demonstrate its ease of use.
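
For readers unfamiliar with BMMC permutations, the following is a minimal in-memory sketch of the address mapping only. It is not the API specified here and performs no disk I/O. It assumes the standard definition: a record at n-bit source address x moves to target address y = A x XOR c, where A is a nonsingular n-by-n bit matrix over GF(2) and c is an n-bit complement vector. The names bmmc_target and bmmc_permute, and the encoding of each row of A as an integer bitmask, are hypothetical choices made for illustration.

```python
def bmmc_target(x, rows, c):
    """Return the target address y = A*x XOR c over GF(2).

    rows[i] is an integer bitmask for row i of A: bit j of rows[i]
    holds A[i][j]. Bit i of the result is the parity of (rows[i] AND x),
    which is the GF(2) dot product of row i with the address bits of x.
    """
    y = 0
    for i, row in enumerate(rows):
        parity = bin(row & x).count("1") & 1
        y |= parity << i
    return y ^ c


def bmmc_permute(records, rows, c):
    """Permute a list of 2**n records according to the BMMC mapping.

    Raises ValueError if two sources map to the same target, which
    happens exactly when A is singular (so the mapping is not a
    permutation).
    """
    out = [None] * len(records)
    filled = [False] * len(records)
    for x, rec in enumerate(records):
        y = bmmc_target(x, rows, c)
        if filled[y]:
            raise ValueError("matrix A is singular: not a permutation")
        out[y] = rec
        filled[y] = True
    return out


# Bit reversal on 3-bit addresses: target bit i takes source bit 2 - i.
bit_reversal = [0b100, 0b010, 0b001]
reversed_records = bmmc_permute(list(range(8)), bit_reversal, 0b000)

# Identity matrix with complement vector 001: flips the lowest address bit.
identity = [0b001, 0b010, 0b100]
swapped_records = bmmc_permute(list(range(8)), identity, 0b001)
```

In an out-of-core setting the records would live in blocks spread across the disks, and the interesting part is scheduling the parallel block reads and writes; the sketch shows only which record ends up at which address.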