Theses & Reports
Instructions for submitting a technical report or thesis.
You can find technical reports published prior to 1990 archived here.
- TR1990-532 1990 On Triangulations of the 3-Ball and the Solid Torus Bohus, G.; Jockush, W.; Lee, C.; Prabhu, N. Abstract | PDF
Title: On Triangulations of the 3-Ball and the Solid Torus
Author(s): Bohus, G.; Jockush, W.; Lee, C.; Prabhu, N.
Abstract:
We show that neither the 3-ball nor the solid torus admits a triangulation in which (i) every vertex is on the boundary, and (ii) every tetrahedron has exactly one triangle on the boundary. (Such triangulations are relevant to an unresolved conjecture of Perles.) Our result settles a question posed at the DIMACS Workshop on Polytopes and Convex Sets.
- TR1990-520 1990 Stable Perturbations of Nonsymmetric Matrices Burke, J. Abstract | PDF
- TR1990-506 1990 Domain Decomposition Algorithms for Indefinite Elliptic Problems Cai, X.; Widlund, O. Abstract | PDF
Title: Domain Decomposition Algorithms for Indefinite Elliptic Problems
Author(s): Cai, X.; Widlund, O.
Abstract:
Iterative methods for the linear systems of algebraic equations arising from elliptic finite element problems are considered. Methods previously known to work well for positive definite, symmetric problems are extended to certain nonsymmetric problems, which also can have some eigenvalues in the left half plane.
We first consider an additive Schwarz method applied to linear, second order, symmetric or nonsymmetric, indefinite elliptic boundary value problems in two and three dimensions. An alternative linear system, which has the same solution as the original problem, is derived and this system is then solved by using GMRES, an iterative method of conjugate gradient type. In each iteration step, a coarse mesh finite element problem and a number of local problems are solved on small, overlapping subregions into which the original region is subdivided. We show that the rate of convergence is independent of the number of degrees of freedom and the number of local problems if the coarse mesh is fine enough. The performance of the method is illustrated by results of several numerical experiments.
We also consider two other iterative methods for solving the same class of elliptic problems in two dimensions. Using an observation of Dryja and Widlund, we show that the rate of convergence of certain iterative substructuring methods deteriorates only quite slowly when the local problems increase in size. A similar result is established for Yserentant's hierarchical basis method.
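As a rough illustration of the additive Schwarz idea described in this abstract (independent solves on overlapping subregions, summed to form a preconditioner), the following toy sketch applies it to a 1D model problem. It uses preconditioned conjugate gradients rather than the GMRES solver of the paper, and every name in it is invented for the sketch.

```python
import numpy as np

def poisson_1d(n):
    # Tridiagonal finite-difference Laplacian on n interior points.
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def additive_schwarz(A, subdomains):
    """M_inv(v) = sum over subdomains of a local solve of the restricted residual."""
    local = [(idx, np.linalg.inv(A[np.ix_(idx, idx)])) for idx in subdomains]
    def M_inv(v):
        z = np.zeros_like(v)
        for idx, Ainv in local:
            z[idx] += Ainv @ v[idx]   # extend each local correction by zero
        return z
    return M_inv

def pcg(A, b, M_inv, tol=1e-10, maxit=200):
    # Standard preconditioned conjugate gradient iteration.
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv(r)
        rz, rz_old = r @ z, rz
        p = z + (rz / rz_old) * p
    return x

n = 40
A = poisson_1d(n)
b = np.ones(n)
# Two overlapping subdomains covering the interval.
M_inv = additive_schwarz(A, [np.arange(0, 25), np.arange(15, 40)])
x = pcg(A, b, M_inv)
```

The paper's point about the coarse mesh is not modeled here: with many subdomains and no coarse problem, the iteration count of such a method grows, which is exactly what the coarse solve prevents.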
- TR1990-512 1990 Tight Bounds on the Complexity of the Boyer-Moore Pattern Matching Algorithm Cole, R. Abstract | PDF
- TR1990-510 1990 On the Optimal Design of Columns Against Buckling Cox, S.; Overton, M. Abstract | PDF
Title: On the Optimal Design of Columns Against Buckling
Author(s): Cox, S.; Overton, M.
Abstract:
We establish existence, derive necessary conditions, and construct and test an algorithm for the maximization of a column's Euler buckling load under a variety of boundary conditions over a general class of admissible designs. We prove that symmetric clamped-clamped columns possess a positive first eigenfunction and introduce a symmetric rearrangement that does not decrease the column's buckling load. Our necessary conditions, expressed in the language of Clarke's generalized gradient, subsume those proposed by Olhoff and Rasmussen, Masur, and Seiranian. The work of Olhoff and Rasmussen, Masur, and Seiranian sought to correct the necessary conditions of Tadjbakhsh and Keller who had not foreseen the presence of a multiple least eigenvalue. This remedy has been hampered by Tadjbakhsh and Keller's miscalculation of the buckling loads of their clamped-clamped and clamped-hinged columns. We resolve this issue in the appendix.
In our numerical treatment of the associated finite dimensional optimization problem we build on the work of Overton in devising an efficient means of extracting an ascent direction from the column's least eigenvalue. Owing to its possible multiplicity this is indeed a nonsmooth problem and again the ideas of Clarke are exploited.
- TR1990-534 1990 Physical Idealization as Plausible Inference Davis, E. Abstract | PDF
Title: Physical Idealization as Plausible Inference
Author(s): Davis, E.
Abstract:
The analysis of physical systems almost always relies on idealized models of the objects involved. Any idealization, however, will be incorrect or insufficiently accurate some of the time. It seems reasonable, therefore, to view a physical idealization as a defeasible inference which can be withdrawn in the presence of contrary evidence. This talk discusses the consequences of such a view.
We focus on examples where a system may or may not go into a state where idealizations are violated, such as dropping a ball near an open switch connected across a battery. We show that:
- Non-monotonic logics will try to enforce the idealization by supposing that the ball will miss the switch. This anomaly does not seem to be solvable by the kinds of techniques that have been applied to the Yale Shooting Problem, which it superficially resembles. We show that this problem is analogous to anomalies in non-monotonic logic that are time-independent.
- A probabilistic analysis is possible, but it relies on independence assumptions that are hard to justify in general.
- For completely specified systems, the rule "If the idealization gives solvable equations, then assume that it holds" is, in fact, a monotonic system of inferences. It should therefore be possible to characterize this in a purely deductive theory. We show that this is, indeed, possible for simple cases, but can get messy in complex systems.
- Programs that make physical predictions can avoid these problems by simply avoiding reasoning from the future to the past. Though most current programs observe this restriction, it seems likely that more powerful and general systems will have to violate it, and thus deal with this issue.
- Finally, we look at dynamic systems where the idealization can be observed at any single instant, but it is inconsistent over extended intervals.
- Ph.D. Thesis 1990 Detecting Nondeterminism in Shared Memory Parallel Programs Dinning, Anne Abstract | PDF
Title: Detecting Nondeterminism in Shared Memory Parallel Programs
Candidate: Dinning, Anne
Advisor(s): Mishra, Bud
Abstract:
This thesis addresses the problem of detecting a specific type of nondeterminism in shared memory parallel programs known as access anomalies. An access anomaly occurs when an update to a shared variable X is concurrent with either a read of X or another update of X.
The first part of the work considers dynamic detection of access anomalies. We introduce a new technique called task recycling that detects access anomalies "on the fly" by monitoring the program execution. This technique is designed with two goals in mind. The first goal is minimal monitoring overhead. Costs are incurred only at thread create, terminate, and coordinate operations and every time a monitored variable is accessed. Because variable accesses are generally the most frequent operation, the task recycling technique reduces the overhead per variable access to a small constant. The second goal is generality. The task recycling technique is applicable to a wide variety of parallel constructs and all common synchronous and asynchronous coordination primitives. Combined with a protocol for specifying ordering constraints, the method of representing concurrency relationships in task recycling can be extended to detect general race conditions in parallel programs.
The second part of the thesis involves static detection of several types of nondeterminism that make dynamic anomaly detection inefficient. In particular, the notion of nondeterminism arising from critical section coordination is refined by distinguishing between three types of nondeterminism: parallel, sequential, and reference nondeterminism. The presence of these types of nondeterminism in a program impacts access anomaly detection in two significant ways: (i) how critical section coordination is modeled during anomaly detection, and (ii) the confidence level and complexity of guaranteeing that a program has no access anomalies. In particular, it is shown that access anomalies can be detected efficiently only if a program is parallel, sequential and reference deterministic. Heuristics are presented that make access anomaly detection tractable in the presence of other nondeterminism through a better classification and semantic understanding of a coordination protocol.
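The anomaly condition in this abstract (an update concurrent with a read or another update of the same variable) can be sketched with vector clocks, though task recycling itself uses a more economical encoding of concurrency than the full vector clocks assumed here; all names below are invented for the sketch.

```python
from itertools import combinations

def happens_before(u, v):
    # Vector clock u precedes v iff u <= v componentwise and u != v.
    return all(a <= b for a, b in zip(u, v)) and u != v

def concurrent(u, v):
    return not happens_before(u, v) and not happens_before(v, u)

def find_anomalies(accesses):
    """accesses: list of (kind, var, vector_clock) with kind 'r' or 'w'.
    An access anomaly is an update concurrent with a read or another
    update of the same shared variable."""
    return [(a, b)
            for a, b in combinations(accesses, 2)
            if a[1] == b[1] and 'w' in (a[0], b[0]) and concurrent(a[2], b[2])]

# Two threads with no coordination between thread 0's write of X and
# thread 1's read of X:
accesses = [
    ('w', 'X', (1, 0)),   # thread 0 updates X
    ('r', 'X', (0, 1)),   # thread 1 reads X, concurrent with the update
    ('r', 'Y', (2, 0)),   # a later access to a different variable
]
anomalies = find_anomalies(accesses)   # exactly one anomaly: the w/r pair on X
```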
- TR1990-498 1990 Performance of Shared Memory in a Parallel Computer Donovan, K. Abstract | PDF
- TR1990-507 1990 Multilevel Additive Methods for Elliptic Finite Element Problems Dryja, M.; Widlund, O. Abstract | PDF
Title: Multilevel Additive Methods for Elliptic Finite Element Problems
Author(s): Dryja, M.; Widlund, O.
Abstract:
An additive variant of the Schwarz alternating method is discussed. A general framework is developed, which is useful in the design and analysis of a variety of domain decomposition methods as well as certain multigrid methods. Three methods of this kind are then considered and estimates of their rates of convergence given. One is a Schwarz-type method using several levels of overlapping subregions. The others use multilevel, multigrid-like decompositions of finite element spaces and have previously been considered by Yserentant and Bramble, Pasciak and Xu. Throughout, we work with finite element approximations of linear, self-adjoint, elliptic problems.
- TR1990-529 1990 Substructuring Methods for Parabolic Problems Dryja, M. Abstract | PDF
Title: Substructuring Methods for Parabolic Problems
Author(s): Dryja, M.
Abstract:
Domain decomposition methods without overlapping for the approximation of parabolic problems are considered. Two kinds of methods are discussed. In the first method systems of algebraic equations resulting from the approximation on each time level are solved iteratively with a Neumann-Dirichlet preconditioner. The second method is direct and similar to certain iterative methods with a Neumann-Neumann preconditioner. An analysis of convergence of the methods is presented.
- TR1990-531 1990 Cutting a Polytope Jockush, W.; Prabhu, N. Abstract | PDF
Title: Cutting a Polytope
Author(s): Jockush, W.; Prabhu, N.
Abstract:
We show that, given two vertices of a polytope, one cannot in general find a hyperplane containing the vertices that has two or more facets of the polytope in one closed half-space. Our result refutes a long-standing conjecture.
We prove the result by constructing a 4-dimensional polytope that provides the counter-example. Also, we show that such a cutting hyperplane can be found for each pair of vertices, if the polytope is either simplicial or 3-dimensional.
- TR1990-503 1990 Tree Locking on Changing Trees Lanin, V.; Shasha, D. Abstract | PDF
Title: Tree Locking on Changing Trees
Author(s): Lanin, V.; Shasha, D.
Abstract:
The tree locking protocol is a deadlock-free method of concurrency control defined and verified by Silberschatz and Kedem for data organized in a directed tree. Can the tree protocol work for applications that change the tree? We define a set of three operations capable of changing any tree to any other tree and show that the tree protocol continues to ensure serializability and deadlock-freedom in the presence of these operations.
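The tree protocol's locking rules can be sketched in a few lines. This is an illustrative reading of the Silberschatz-Kedem rules only (first lock anywhere; later locks require the parent to be held; no re-locking); the class and names are invented, and the paper's tree-changing operations are not modeled.

```python
class TreeProtocolViolation(Exception):
    pass

class Transaction:
    """Tree protocol of Silberschatz and Kedem (sketch):
    - the first lock may be taken on any node;
    - later locks require the node's parent to be currently held;
    - a node, once released, may never be re-locked by this transaction."""
    def __init__(self, parent):
        self.parent = parent            # child -> parent map for the tree
        self.held, self.released = set(), set()

    def lock(self, node):
        if node in self.held or node in self.released:
            raise TreeProtocolViolation(f"cannot re-lock {node}")
        first = not self.held and not self.released
        if not first and self.parent.get(node) not in self.held:
            raise TreeProtocolViolation(f"parent of {node} not held")
        self.held.add(node)

    def unlock(self, node):
        self.held.remove(node)
        self.released.add(node)

# Tree: a is the root with children b and c.
t = Transaction({'b': 'a', 'c': 'a'})
t.lock('a'); t.lock('b'); t.lock('c'); t.unlock('a')   # a legal sequence
try:
    t.lock('a')                                        # re-locking is forbidden
    ok = False
except TreeProtocolViolation:
    ok = True
```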
- TR1990-518 1990 Dag Representation and Optimization of Rewriting Li, K. Abstract | PDF
Title: Dag Representation and Optimization of Rewriting
Author(s): Li, K.
Abstract:
In all applications of term rewriting systems, computing the normal forms of terms is the fundamental step. In this paper, we propose a directed acyclic graph (dag) representation for terms and term rewriting. Rewriting on dags is much more efficient than rewriting on trees. We design several efficient strategies for rewriting on dags. With the dag representation, we can even obtain efficient rewriting strategies for non-left-linear rewriting systems.
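The efficiency gain of dags over trees comes from normalizing each shared subterm once. A minimal sketch of that idea, using a memo table as the dag and Peano addition as the rewriting system (the rules and names here are chosen for illustration, not taken from the paper):

```python
def root_step(t):
    # One rewrite step at the root, for Peano addition:
    #   add(zero, y) -> y        add(s(x), y) -> s(add(x, y))
    if isinstance(t, tuple) and t[0] == 'add':
        _, x, y = t
        if x == 'zero':
            return y
        if isinstance(x, tuple) and x[0] == 's':
            return ('s', ('add', x[1], y))
    return None

def normalize(term, memo):
    """Normal form of a term (nested tuples). The memo table makes equal
    subterms share one normalization, i.e. we rewrite on a dag, not a tree."""
    if term in memo:
        return memo[term]
    if isinstance(term, tuple):
        t = (term[0],) + tuple(normalize(a, memo) for a in term[1:])
    else:
        t = term
    r = root_step(t)
    result = normalize(r, memo) if r is not None else t
    memo[term] = result
    return result

def to_int(t):
    n = 0
    while t != 'zero':
        t = t[1]
        n += 1
    return n

two = ('s', ('s', 'zero'))
shared = ('add', two, two)                      # 2 + 2, occurring twice below
answer = normalize(('add', shared, shared), {})  # (2 + 2) + (2 + 2) = 8
```

On a tree, the second occurrence of `shared` would be rewritten from scratch; here the memo lookup returns its normal form immediately.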
- Ph.D. Thesis 1990 Program transformation for efficient derivation of multiple solutions in concurrent logic languages Markantonatos, Nikolaos Abstract | PDF
Title: Program transformation for efficient derivation of multiple solutions in concurrent logic languages
Candidate: Markantonatos, Nikolaos
Advisor(s): Harrison, Malcolm C.
Abstract:
Concurrent logic languages provide a flexible and powerful vehicle for expressing parallel programs using explicit processes. However, their drastic departure from conventional logic programming with respect to completeness renders them unsuitable for a variety of useful applications involving search. A multiple solution extension to concurrent logic languages appears to successfully obtain the effect of backtracking in a parallel environment, but has been impeded by inefficiency problems. Moreover, the multiple solution subset introduces a new language which is incoherent with the single solution base language. We propose a multiple solution subset definition that adheres to the base language both syntactically and semantically. Subsequently, we advocate a source-to-source transformational approach for the efficient implementation of the subset. Multiple solution programs are converted at compile-time into equivalent single solution programs that derive all possible solutions into a single list. Alternative solutions are obtained in an eager or lazy fashion as specified by the program. A number of multiple solution program classes that are transformable into efficient single solution programs are identified and the corresponding transformation procedures are presented and further illustrated using a variety of examples. The techniques employed for the various transformations include partial evaluation, abstract interpretation, continuation-based transformation, layered stream transformation and loop fusion. As a result of such a static transformational methodology, a broad range of multiple solution programs enjoy efficient execution. We believe that our approach forms a definite step towards an efficient multiple solution subset for concurrent logic languages.
- Ph.D. Thesis 1990 Data structures and algorithms for hierarchical memory machines Mirza, Mirza G. R. Abstract | PDF
Title: Data structures and algorithms for hierarchical memory machines
Candidate: Mirza, Mirza G. R.
Advisor(s): Siegel, Alan
Abstract:
This thesis analyzes the influence of hierarchical memory in models of practical computation. While hierarchical memory is the standard in real computing systems, the most common models of computation, Random Access Memory Machines and Turing Machines, do not reflect this form of memory. Our main contributions are: (1) Models of computation that have memory hierarchy, and which provide a rich structure for the complexity analysis of real computational problems. (2) Optimal bounds for problems such as sorting, with respect to both space and time, for a variety of memory access costs. (3) Related bounds for other problems, including constrained multitape merging and the implementation of Priority Queues and B-Trees. (4) The introduction of multiprogramming and multiprocessing concepts for these models, and an analysis of their relative computational power.
- TR1990-523 1990 Execution of Regular DO Loops on Asynchronous Multiprocessors Ouyang, P. Abstract | PDF
Title: Execution of Regular DO Loops on Asynchronous Multiprocessors
Author(s): Ouyang, P.
Abstract:
This paper studies issues concerning parallel execution of regular Fortran DO loops on an asynchronous shared-memory multiprocessor, where each iteration is the basic unit to be executed by a single processing element. An iteration is a dependent predecessor of another iteration if execution of the latter iteration has to wait until execution of the former iteration has completed. During the execution of a DO loop, an iteration passes through four states: idle, pending, ready, and finished. An iteration is idle if none of its dependent predecessors have completed; an iteration is pending if some, but not all, of its dependent predecessors have completed; an iteration is ready if all its dependent predecessors have completed but it has not itself completed; otherwise, an iteration is finished. In addition, an iteration without any dependent predecessors is called an initial iteration, which can only have ready and finished states. By describing an execution scheme, this paper studies the characteristics of Fortran DO loops that are related to the efficiency of the execution. Specifically, this paper investigates (1) the number of initial iterations, (2) the maximum number of ready iterations at any instant during the execution, (3) the maximum number of pending iterations at any instant during the execution, (4) a hash function to disperse different pending iterations, and (5) the parallel execution time.
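The state machine in this abstract can be simulated directly: track the number of unfinished dependent predecessors per iteration, and promote iterations from pending to ready as counts reach zero. This toy sketch (all names invented) measures the quantities (1)-(3) for a classic wavefront dependence pattern:

```python
def simulate(n, preds):
    """Greedy execution of an n-by-n iteration space with unlimited
    processors. preds(i, j) lists the dependent predecessors of (i, j).
    Returns (#initial iterations, max #ready at any instant, #parallel steps)."""
    remaining = {(i, j): len(preds(i, j)) for i in range(n) for j in range(n)}
    succ = {}
    for it in remaining:
        for p in preds(*it):
            succ.setdefault(p, []).append(it)
    ready = {it for it, c in remaining.items() if c == 0}   # initial iterations
    initial, max_ready, steps = len(ready), 0, 0
    while ready:
        max_ready = max(max_ready, len(ready))
        steps += 1
        finished, ready = ready, set()
        for it in finished:                 # all ready iterations run in parallel
            for s in succ.get(it, []):
                remaining[s] -= 1
                if remaining[s] == 0:       # last predecessor done: pending -> ready
                    ready.add(s)
    return initial, max_ready, steps

def preds(i, j):
    # Dependences of a loop body like A(I,J) = A(I-1,J) + A(I,J-1).
    return [(a, b) for (a, b) in ((i - 1, j), (i, j - 1)) if a >= 0 and b >= 0]

result = simulate(4, preds)   # wavefronts: 1 initial iteration, widest front 4, 2n-1 steps
```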
- TR1990-505 1990 Large-Scale Optimization of Eigenvalues Overton, Michael L. Abstract | PDF
Title: Large-Scale Optimization of Eigenvalues
Author(s): Overton, Michael L.
Abstract:
Optimization problems involving eigenvalues arise in many applications. Let x be a vector of real parameters and let A(x) be a differentiable symmetric matrix function of x. We consider a particular problem which occurs frequently: the minimization of the maximum eigenvalue of A(x), subject to linear constraints and bounds on x. The eigenvalues of A(x) are not differentiable at points x where they coalesce, so the optimization problem is said to be nonsmooth. Furthermore, it is typically the case that the optimization objective tends to make eigenvalues coalesce at a solution point.
There are three main purposes of the paper. The first is to present a clear and self-contained derivation of the Clarke generalized gradient of the max eigenvalue function in terms of a "dual matrix". The second purpose is to describe a new algorithm, based on the ideas of a previous paper by the author (SIAM J. Matrix Anal. Appl. 9 (1988) 256-268), which is suitable for solving large-scale eigenvalue optimization problems. The algorithm uses a "successive partial linear programming" formulation which should be useful for other large-scale structured nonsmooth optimization problems as well as large-scale nonlinear programming with a relatively small number of nonlinear constraints. The third purpose is to report on our extensive numerical experience with the new algorithm, solving problems which arise in the following application areas: the optimal design of columns against buckling; the construction of optimal preconditioners for numerical linear equation solvers; the bounding of the Shannon capacity of a graph. We emphasize the role of the dual matrix, whose dimension is equal to the multiplicity of the minimal max eigenvalue. The dual matrix is computed by the optimization algorithm and used for verification of optimality and sensitivity analysis.
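The gradient structure behind the dual matrix can be seen numerically in the simple-eigenvalue case, where the derivative of the max eigenvalue in the k-th parameter is q'A_k q for the top eigenvector q; when the eigenvalue has multiplicity t, this scalar becomes the t-by-t dual matrix of the paper and the function is nonsmooth. A small sketch with an invented random matrix family, checked against finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
def sym(M):
    return (M + M.T) / 2

# A(x) = A0 + x1*A1 + x2*A2, a differentiable symmetric matrix function.
A0, A1, A2 = (sym(rng.standard_normal((5, 5))) for _ in range(3))

def A(x):
    return A0 + x[0] * A1 + x[1] * A2

def lam_max(x):
    return np.linalg.eigvalsh(A(x))[-1]   # eigvalsh returns ascending order

def grad_lam_max(x):
    # Simple-eigenvalue case: d(lam_max)/dx_k = q' A_k q, q the top eigenvector.
    # With multiplicity t, q' A_k q generalizes to the t-by-t dual matrix Q' A_k Q.
    w, V = np.linalg.eigh(A(x))
    q = V[:, -1]
    return np.array([q @ A1 @ q, q @ A2 @ q])

x = np.array([0.3, -0.7])
g = grad_lam_max(x)
h = 1e-6
fd = np.array([(lam_max(x + h * e) - lam_max(x - h * e)) / (2 * h)
               for e in np.eye(2)])
err = np.max(np.abs(g - fd))   # small when the top eigenvalue is simple
```

A generic random symmetric family has a simple top eigenvalue, so the formula is smooth here; the paper's difficulty is precisely that optimization drives eigenvalues to coalesce, where only the dual-matrix characterization survives.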
- Ph.D. Thesis 1990 Design and implementation of HyTeK: A knowledge-based hypertext system Perez-Carballo, Jose F. Abstract | PDF
Title: Design and implementation of HyTeK: A knowledge-based hypertext system
Candidate: Perez-Carballo, Jose F.
Advisor(s): Strzalkowski, Tomek; Shasha, Dennis
Abstract:
A Hypertext system is a text data base where the units of information are interlinked using pointers that the user can follow. We call the pointers explicit links (as opposed to computed or virtual links). HyTeK provides a set of tools designed to help the user explore the information contained in the system. The information contained in the system is represented using at least one of the three following methods: fragments of full text, explicit links between fragments, and a collection of frame-like objects organized in a taxonomy. Explicit links are used to represent discourse relationships between fragments of text. The frame-like objects, called Topics, represent concepts in the domain of the text contained in the fragments. Topics are used to index the fragments for retrieval. The taxonomy of Topics represents some of the relationships between fragments that a traditional Hypertext System would represent using explicit links. HyTeK's query system uses the taxonomy of Topics in order to implement tools that allow the user to retrieve fragments selectively by their contents. A user queries the system by building a set of Topics in an interactive process of reformulation. Query reformulation is supported by a set of tools that allow the user to explore the space of Topics. The relationships between the Topics are used to define a similarity measure which is used to rank the target set of the query. This work describes an automatic indexing scheme, a query system and an extension of the Knowledge Representation (KR) system NIKL (KLONE) that was used in HyTeK to implement the taxonomy of Topics. A prototype of HyTeK was implemented in Common Lisp on a Symbolics 3645 running Genera 7.2. The system has been extensively tested on several test collections of a total of 1000 fragments of text about AIDS treatments.
The results indicate clear advantages over traditional Information Retrieval systems and suggest that the use of a KR system for the implementation of a query module for a Hypertext System is promising.
- Ph.D. Thesis 1990 On a generalization of Herbrand's theorem Policriti, Alberto Abstract | PDF
Title: On a generalization of Herbrand's theorem
Candidate: Policriti, Alberto
Advisor(s): Davis, Martin D.
Abstract:
In this thesis we prove a generalized version of Herbrand's theorem. Our result guarantees the existence of a semi-decision procedure a la Herbrand for testing unsatisfiability with respect to a given theory T, in which the decision procedure used at the ground level depends upon T. This is opposed to the classical case, in which the procedure used at the ground level is simply a test for propositional satisfiability. The problem of finding suitable analogues for the general case of the exhaustive search procedures is also tackled, and one such generalization is proposed. The underlying motivation for this study was to find theoretical results that could provide the basis for a set-theoretic proof checker. Thus, the case of set theory is considered in more detail. In particular, decidability and undecidability results for classes of set-theoretic, purely universal formulae are proved.
- TR1990-533 1990 On a Conjecture of Micha Perles Prabhu, N. Abstract | PDF
Title: On a Conjecture of Micha Perles
Author(s): Prabhu, N.
Abstract:
We prove a conjecture of Micha Perles concerning simple polytopes, for a subclass that properly contains the duals of stacked polytopes and crosspolytopes. As a consequence of a special property of this subclass, it also follows that the entire combinatorial structure of a polytope in the subclass can be recovered from its graph by applying our results recursively.
- Ph.D. Thesis 1990 Space-variant computer vision with a complex-logarithmic sensor geometry Rojer, Alan S. Abstract | PDF
Title: Space-variant computer vision with a complex-logarithmic sensor geometry
Candidate: Rojer, Alan S.
Advisor(s): Schwartz, Eric
Abstract:
The complex logarithm as a conformal mapping has drawn interest as a sensor architecture for computer vision due to its pseudo-invariance with respect to rotation and scaling, its high ratio of field width to resolution for a given number of pixels, and its utilization in biological vision as the topographic mapping from the retina to primary visual cortex. This thesis extends the computer vision applications of the complex-logarithmic geometry. Sensor design is based on the complex log mapping w = log(z + a), with real a > 0, which smoothly removes the singularity in the log at the origin. Previous applications of the complex-logarithmic geometry to computer vision, graphics and sensory neuroscience are surveyed. A quantitative analysis of the space complexity of a complex-logarithmic sensor as a function of map geometry, field width and angular resolution is presented. The computer-graphic problems of warping uniform scenes according to the complex logarithm and inversion of log-mapping scenes to recover the original uniform scene are considered, as is the problem of blending the resulting inverse log maps to reconstruct the original (uniform) scene. A series of simple algorithms for segmentation of log scenes by contour completion and region filling are presented. A heuristic algorithm for figure/ground segmentation using the log geometry is also shown. The problem of fixation-point selection (visual attention) is considered. Random selection of fixation points, inhibition around previous fixations, spatial and temporal derivatives in the sensor periphery, and regions found by segmentation are all examined as heuristic attentional algorithms. For the special case where targets can be parametrically defined, a theory of model-based attention based on the Hough transform is introduced. A priori knowledge about the consistency between potential objects in the scene and measured features in the scene is used to select fixation points.
The exponential storage requirements of the usual Hough transform are avoided.
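The pseudo-invariance mentioned above follows directly from the mapping w = log(z + a): far from the fovea (|z| >> a), scaling a scene point by s translates its image by log(s) along the real axis, and rotating it translates the image along the imaginary axis. A minimal numerical sketch (the sample point and a = 1 are arbitrary choices):

```python
import cmath

def log_map(z, a=1.0):
    """Complex-log sensor mapping w = log(z + a); the real a > 0 smoothly
    removes the singularity of log at the origin (the fovea)."""
    return cmath.log(z + a)

z = complex(200.0, 120.0)   # a peripheral scene point, |z| >> a
w1 = log_map(z)
w2 = log_map(2.0 * z)       # the same point in a scene scaled by 2
shift = w2 - w1             # approximately log(2) + 0j: a pure translation
```

The invariance is only approximate because of the +a term, which is why it is called pseudo-invariance; it becomes exact as |z|/a grows.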
- Ph.D. Thesis 1990 SAGE: A real-time operating system for robotic supervisory control Salkind, Louis K. Abstract | PDF
Title: SAGE: A real-time operating system for robotic supervisory control
Candidate: Salkind, Louis K.
Advisor(s): Mishra, Bud
Abstract:
The next generation of robotic applications--computer integrated manufacturing, teleoperation, and mobile autonomous robots--will require far more computer systems support than currently available. In particular, real-time supervisory control systems will be needed to integrate an increasing number of sensors and actuators, as well as to communicate with other computers in a distributed environment. This thesis describes the design and implementation of SAGE, an operating system built specifically for real-time robotic supervisory control. The SAGE kernel runs on off-the-shelf Motorola 68020 processor boards, and features lightweight processes, virtual memory support, extensible low-overhead synchronization primitives, and real-time communications capabilities. Because SAGE is one of the first systems built for robotic supervisory control, the thesis focuses on the issues and design tradeoffs that arise in building a supervisory control operating system. The thesis also describes how SAGE was used to control a number of intelligent devices, including a Utah/MIT hand and a PUMA robot arm. The robotic experiments performed demonstrate that the operating system can be used in real-time supervisory control applications.
- TR1990-514 1990 Beyond Fail-Stop: Wait-Free Serializability and Resiliency in the Presence of Slow-Down Failures Shasha, D.; Turek, J. Abstract | PDF
Title: Beyond Fail-Stop: Wait-Free Serializability and Resiliency in the Presence of Slow-Down Failures
Author(s): Shasha, D.; Turek, J.
Abstract:
Historically, database researchers have dealt with two kinds of process failures: fail-stop failures and malicious failures. Under the fail-stop assumption, processes fail by halting. Such failures are easily detectable. Under the malicious (or Byzantine) failure assumption, processes fail by behaving unpredictably, perhaps as adversaries. Such failures are not necessarily detectable. When system designers discuss fault tolerance, they typically restrict themselves to the problem of handling fail-stop failures only. This paper proposes an intermediate failure model and presents a practical algorithm for handling transactions under this model. The new failure model allows processes to fail by either slowing down or stopping; slow processes may later speed up, continue to proceed slowly, or (eventually) stop. We call such failures slow-down failures. The model does not assume the ability to distinguish among these possibilities, say, by using a timeout mechanism, nor does it assume that it is possible to kill a slow process. Our algorithm, instead, allows for a new process to be dispatched to do the job that had been assigned to a slow process. The problem is that several processes may end up doing the same task and interfere with one another. Our algorithm controls such interference while guaranteeing both serializability and resiliency.
- TR1990-519 1990 A Domain Decomposition Algorithm for Elliptic Problems in Three Dimensions Smith, B. Abstract | PDF
Title: A Domain Decomposition Algorithm for Elliptic Problems in Three Dimensions
Author(s): Smith, B.
Abstract:
Most domain decomposition algorithms have been developed for problems in two dimensions. One reason for this is the difficulty in devising a satisfactory, easy-to-implement, robust method of providing global communication of information for problems in three dimensions. Several methods that work well in two dimensions do not perform satisfactorily in three dimensions.
A new iterative substructuring algorithm for three dimensions is proposed. It is shown that the condition number of the resulting preconditioned problem is bounded independently of the number of subdomains and that the growth is quadratic in the logarithm of the number of degrees of freedom associated with a subdomain. The condition number is also bounded independently of the jumps in the coefficients of the differential equation between subdomains. The new algorithm also has more potential parallelism than the iterative substructuring methods previously proposed for problems in three dimensions.
- TR1990-517 1990 Domain Decomposition Algorithms for the Partial Differential Equations of Linear Elasticity Smith, B. Abstract | PDF
Title: Domain Decomposition Algorithms for the Partial Differential Equations of Linear Elasticity
Author(s): Smith, B.
Abstract:
The use of the finite element method for elasticity problems results in extremely large, sparse linear systems. Historically these have been solved using direct solvers like Choleski's method. These linear systems are often ill-conditioned and hence require good preconditioners if they are to be solved iteratively. We propose and analyze three new, parallel iterative domain decomposition algorithms for the solution of these linear systems. The algorithms are also useful for other elliptic partial differential equations.
Domain decomposition algorithms are designed to take advantage of a new generation of parallel computers. The domain is decomposed into overlapping or non-overlapping subdomains. The discrete approximation to a partial differential equation is then obtained iteratively by solving problems associated with each subdomain. The algorithms are often accelerated using the conjugate gradient method.
The first new algorithm presented here borrows heavily from multi-level type algorithms. It involves a local change of basis on the interfaces between the substructures to accelerate the convergence. It works well only in two dimensions.
The second algorithm is optimal in that the condition number of the iteration operator is bounded independently of the number of subdomains and unknowns. It uses non-overlapping subdomains, but overlapping regions of the interfaces between subdomains. This is an additive Schwarz algorithm, which works equally well in two or three dimensions.
The third algorithm is designed for problems in three dimensions. It includes a coarse problem associated with the unknowns on the wirebaskets of the subdomains. The new method offers more potential parallelism than previous algorithms proposed for three dimensional problems since it allows for the simultaneous solution of the coarse problem and the local problems.
- Ph.D. Thesis 1990 The APRAM: A model for asynchronous parallel computation Zajicek, Ofer Abstract | PDF
Title: The APRAM: A model for asynchronous parallel computation
Candidate: Zajicek, Ofer
Advisor(s): Cole, Richard
Abstract:
It is becoming increasingly clear that parallel computers will play a significant role in the area of computer science and its applications. In order to develop parallel machines and in order to be able to take advantage of them as they become available it is important to understand the issues underlying parallel computation. This thesis investigates one such issue, the synchronization costs of shared memory parallel computation. It defines the APRAM model, an asynchronous variation of the PRAM model, and analyzes a number of fundamental algorithms in this model; it uses three different complexity measures. The first part of the thesis defines the rounds complexity. It describes the complexity of an algorithm as a function of the slowest process. It is used to measure the explicit costs of synchronization: the cost of executing extra code in order to achieve synchronization. Three algorithms are analyzed under this complexity measure: a tree based summation algorithm; a list based recursive doubling algorithm; and an algorithm for computing the connected components of an undirected graph. In all three cases it is shown that global synchronization can be replaced by local synchronization thereby reducing the explicit costs of synchronization. The connectivity algorithm is significantly more substantial than the other two. We avoid the need to synchronize the processes, thereby obtaining an algorithm whose behavior appears somewhat chaotic. Due to its apparently chaotic nature and the unpredictability of the asynchronous environment, its analysis is quite challenging. In an asynchronous environment processes may proceed at different speeds. In the second part of the thesis we model the non-uniformity of the environment by defining the speeds of the processes to be random variables with a known probability distribution. 
We then quantify conditions under which asynchronous execution may have a significant advantage over a lock step execution, even if the explicit costs of a lock step execution are ignored. Both the summation algorithm and the recursive doubling algorithm are analyzed using two different probability distributions. In addition, we quantify conditions under which the list based recursive doubling algorithm is significantly faster than the tree based summation algorithm.
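The recursive doubling algorithm analyzed in this thesis is the classic pointer-jumping scheme: every list element repeatedly replaces its successor pointer by its successor's successor, halving the remaining distance each round. A sequential sketch of the synchronous version (in the APRAM the global rounds used here are exactly what local synchronization replaces):

```python
def list_ranking(succ):
    """Pointer-jumping (recursive doubling) on a linked list.
    succ[i] is the successor of node i, or i itself at the tail.
    Returns dist[i] = number of links from i to the tail, after
    O(log n) synchronous rounds."""
    n = len(succ)
    succ = list(succ)
    dist = [0 if succ[i] == i else 1 for i in range(n)]
    changed = True
    while changed:
        changed = False
        new_succ, new_dist = succ[:], dist[:]
        for i in range(n):              # conceptually, all i in parallel
            j = succ[i]
            if succ[j] != j:            # jump over j unless j is the tail
                new_dist[i] = dist[i] + dist[j]
                new_succ[i] = succ[j]
                changed = True
        succ, dist = new_succ, new_dist
    return dist

# List 0 -> 1 -> 2 -> 3 -> 4 (tail).
ranks = list_ranking([1, 2, 3, 4, 4])   # [4, 3, 2, 1, 0]
```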