Mapping algorithms on regular parallel architectures

Candidate: Lee,PeiZong

Abstract

It is significant that many of time-intensive scientific algorithms are formulated as nested loops, which are inherently regularly structured. In this dissertation the relations between the mathematical structure of nested loop algorithms and the architectural capabilities required for their parallel execution are studied. The architectural model considered in depth is that of an arbitrary dimensional systolic array. The mathematical structure of the algorithm is characterized by classifying its data-dependence vectors according to the new ZERO-ONE-INFINITE property introduced. Using this classification, the first complete set of necessary and sufficient conditions for correct transformation of a nested loop algorithm onto a given systolic array of an arbitrary dimension by means of linear mappings is derived. Practical methods to derive optimal or suboptimal systolic array implementations are also provided. The techniques developed are used constructively to develop families of implementations satisfying various optimization criteria and to design programmable arrays efficiently executing classes of algorithms. In addition, a Computer-Aided Design system running on SUN workstations has been implemented to help in the design. The methodology, which deals with general algorithms, is illustrated by synthesizing linear and planar systolic array algorithms for matrix multiplication, a reindexed Warshall-Floyd transitive closure algorithm, and the longest common subsequence algorithm.