Candidate: Arash Baratloo
Advisor: Zvi Kedem
Metacomputing on Commodity Computers
10 a.m., Tuesday, January 26, 1999
7th floor conference room, 715 Broadway
The advantages of using a set of networked commodity computers for parallel processing are well understood: such computers are cheap, widely available, and mostly underutilized. So why has the use of such environments for compute-intensive applications not proliferated? A major reason is that the inherent complexities of programming applications and coordinating their execution on networked computers outweigh the advantages.
In networked environments populated with multiuser commodity computers, both the computing speed and the number of available computers for executing parallel programs may change frequently and unpredictably. As a consequence, programs need to continuously adapt their execution to the changing environment. The execution of an application must therefore address such issues as dynamic changes in effective machine speeds, dynamic changes in the number of available machines, and sudden network and machine failures. It is not feasible for an application programmer to write programs that adapt to the behavior of a system whose critical aspects cannot be anticipated.
I will present a unified set of techniques to implement a virtual reliable parallel-processing platform on a set of unreliable computers with temporally varying execution speeds. These techniques are specifically designed for automatically adapting the execution of parallel programs to distributed environments. I will explain these techniques in the context of two software systems, Calypso and ResourceBroker, that have been built to validate them.
Calypso gives a programmer a simple tool to build and effectively
execute parallel programs on a set of commodity computers. The
notable properties of Calypso are: (1) a simple, intuitive programming
model based on a virtual machine interface; (2) separation of logical
and physical parallelism, allowing the source code to codify the
algorithm rather than the execution environment; and (3) a runtime
system that efficiently adapts the execution of the program to the
dynamic nature of the runtime environment. ResourceBroker is a
resource manager that demonstrates a novel technique to dynamically
manage the assignment of computers to parallel programs.
ResourceBroker can work with a variety of parallel systems, even
transparently managing those that are not aware of its existence, such
as PVM and MPI, and will distribute available resources fairly among
multiple computations. As a result, a mix of parallel programs
written using diverse programming systems can effectively execute
concurrently on a set of computers.
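
As a rough illustration of what distributing machines "fairly" among
several computations could mean, the sketch below computes a max-min
fair split of identical machines. This is an assumption made purely
for illustration: the abstract does not state which fairness policy
ResourceBroker uses, and the function name fair_share and the job
names are hypothetical.

```python
def fair_share(machines, demands):
    """Max-min fair split of `machines` identical machines.

    Illustrative only; NOT ResourceBroker's actual policy.
    demands: dict mapping job name -> most machines the job can use.
    Returns a dict mapping job name -> machines granted.
    """
    grant = {job: 0 for job in demands}
    remaining = machines
    # Jobs that could still use more machines, in a fixed (name) order.
    unsatisfied = sorted(job for job, want in demands.items() if want > 0)
    while remaining > 0 and unsatisfied:
        share = remaining // len(unsatisfied)
        if share == 0:
            # Fewer machines than waiting jobs: one each, in name order.
            for job in unsatisfied[:remaining]:
                grant[job] += 1
            remaining = 0
            break
        still_waiting = []
        for job in unsatisfied:
            # Never grant a job more than it asked for.
            give = min(share, demands[job] - grant[job])
            grant[job] += give
            remaining -= give
            if grant[job] < demands[job]:
                still_waiting.append(job)
        unsatisfied = still_waiting
    return grant


if __name__ == "__main__":
    # Hypothetical mix of three computations sharing ten machines.
    print(fair_share(10, {"calypso_job": 8, "pvm_job": 2, "mpi_job": 5}))
```

In this sketch a job that asks for less than an equal share gets all it
asks for, and the leftover machines are split among the remaining jobs.
A real resource manager would also have to rerun such a computation
whenever machines join, leave, or fail.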