Instructions for submitting a technical report or thesis.
Title: A New Primal-Dual Interior-Point Method for Semidefinite Programming
Author(s): Alizadeh, F.; Haeberly, J. A.; Overton, M.
Abstract:
Semidefinite programming (SDP) is a convex optimization problem in the space of symmetric matrices. Primal-dual interiorpoint methods for SDP are discussed. These generate primal and dual matrices X and Z which commute only in the limit. A new method is proposed which iterates in the space of commuting matrices.
Title: Automatic Synthesis Algorithms for Supervisory Controllers (Preliminary Report)
Author(s): Antoniotti, M.; Mishra, B.
Abstract:
In this paper we describe our experience with a prototype system capable of synthesizing "Supervisor Controller Programs" based largely on the theory of discrete event systems (DES) first proposed by Ramadge and Wonham. We augment the theory by also allowing continuous time trajectories modeling transitions between events. We illustrate our approach by an example, - the discrete control of a walking machine - which poses some challenges on the applicability of the theory and finally, discuss some possible solutions.
Notes: Appeared in IEEE Proceedings of the Fourth International Conference on Computer Integrated Manufacturing and Automation Technology, Troy, NY, Oct. 1994
Title: Discrete Event Models + Temporal Logic = Supervisory Controller: Automatic Synthesis of Locomotion Controllers
Author(s): Antoniotti, M.; Mishra, B.
Abstract:
In this paper, we address the problem of the synthesis of controller programs for a variety of robotics and manufacturing tasks. The problem we choose for test and illustrative purposes is the standard ``Walking Machine Problem,'' a representative instance of a real "hybrid" problem with both logical/discrete and continuous properties and strong mutual influence without any reasonable separation. We aim to produce a ``compiler technology'' for this class of problems in a manner analogous to the development of the so-called ``Silicon Compilers'' for the VLSI technology. To cope with the difficulties inherent to the problem, we resort to a novel approach that combines many key ideas from a variety of disciplines: namely, ``Discrete Event Supervisory Systems'', Petri Nets approaches and ``Temporal Logic''.
Notes: Will appear in the 1995 IEEE International Conference on Robotics and Automation, Nagoya, Japan
Title: Multilevel Schwarz Methods with Partial Refinement
Author(s): Chen, H.
Abstract:
We consider multilevel additive Schwarz methods with partial refinement. These algorithms are generalizations of the multilevel additive Schwarz methods developed by Dryja and Widlund and many others. We will give two different proofs by using quasi-interpolants under two different assumptions on selected refinement subregions to show that this class of methods has an optimal condition number. The first proof is based purely on the localization property of quasi-interpolants. However, the second proof use some results on iterative refinement methods. As a by-product, the multiplicative versions which corresponds to the FAC algorithms with inexact solvers consisting of one Gauss-Seidel or damped Jacobi iteration have optimal rates of convergence. Finally, some numerical results are presented for these methods.
Title: Approximate Euclidean Shortest Path in 3-Space
Author(s): Choi, J.; Sellen, J.; Yap, C.K.
Abstract:
Papadimitriou's approximation approach to the Euclidean shortest path (ESP) in 3-space is revisited. As this problem is NPhard, his approach represents an important step towards practical algorithms. However, there are several gaps in the original description. Besides giving a complete treatment in the framework of bit complexity, we also improve on his subdivision method. Among the tools needed are root-separation bounds and non-trivial applications of Brent's complexity bounds on evaluation of elementary functions using floating point numbers.
Title: Branching Continuous Time and the Semantics of Continuous Action
Author(s): Davis, E.
Abstract:
It is often useful to model the behavior of an autonomous intelligent creature in terms of continuous control and choice. For example, a robot who moves through space can be idealized as able to execute any continuous motion, subject to constraints on velocity and acceleration; in such a model, the robot can "choose" at any instant to change his acceleration. We show how such models can be described using a continuous branching time structure. We discuss mathematical foundations of continuous branching structures, theories of continuous action in physical worlds, embedding of discrete theories of action in a continuous structure, and physical and epistemic feasibility of plans with continuous action.
Title: Adaptive Time-Frequency Approximations with Matching Pursuits
Author(s): Davis, G.; Mallat, S.; Zhang, Z.
Abstract:
Computing the optimal expansion of a signal in a redundant dictionary of waveforms is an NP-complete problem. We introduce a greedy algorithm called a matching pursuit which computes a sub-optimal expansion. The dictionary waveforms which best match a signal's structures are chosen iteratively. An orthogonalized version of the matching pursuit is also developed. Matching pursuits are general procedures for computing adaptive signal representations. With a dictionary of Gabor functions, a matching pursuit defines an adaptive time-frequency transform. We derive a signal energy distribution in the time-frequency plane which does not contain interference terms, unlike the Wigner and Cohen class distributions. A matching pursuit is a chaotic map whose asymptotic properties are studied. We describe an algorithm which isolates the coherent structures of a signal and show an application to pattern extraction from noisy signals.
Title: Systolic Combining Switch Designs
Candidate: Dickey, Susan
Advisor(s): Gottlieb, Allan
Abstract:
High-performance VLSI switches are needed in the interconnection network of massively parallel shared memory multiprocessors. The switch designs we consider alleviate the ``hot spot'' problem by adding extra logic to the switches to combine conventional loads and stores as well as fetch-and- $\phi$
operations destined for the same memory location. The performance of three buffered switch architectures was investigated through probabilistic analysis and simulation: Type A switches, with k queues, one at each output, each accepting k inputs per cycle; and two one-input queue designs, Type B switches, with $k^2$
output queues, and Type C switches, with k input queues. While the Type C switch is less expensive, Type A and B have considerably better performance. An efficient CMOS implementation for systolic queue designs was devised. A non-combining switch containing these systolic queues was fabricated through MOSIS in 3 micron CMOS and employed the NORA clocking methodology, using qualified clocks for distributing global control.
A combining switch was fabricated in 2 micron CMOS for use in the 16 by 16 processor/memory interconnection network of the NYU Ultracomputer prototype. Details are given about the internal logic of the two component types used in the network. A design usable in networks of size up to 256 * 256 has been prepared for fabrication by NCR at a smaller feature size in a higher pincount package. Differences in the logic partitioning of the two designs are described. We describe the performance of these designs for systems of up to 1024 PEs obtained through simulation. Our experience in implementing a combining switch indicates that the cost of hardware combining is much less than is widely believed. We compare the cost of a combining switch to that of a non-combining switch and discuss the scalability of the implemented design to large numbers of processors. Differences in the capabilities of combining switch architectures are studied. We describe the implementation of ``two-and-a-half-way'' combining, which promises to avoid network saturation in large networks at only slightly greater cost than two-way combining. We also discuss implementation alternatives and performance for a 4 by 4 combining switch.
Title: A Direct-Drive Hand: Design, Modeling and Control
Author(s): Ebner, M.; Wallace, R.
Abstract:
An artificial 15 degrees of mobility direct drive hand, slightly bigger than a human hand, is presented. The underlying technology are the miniature direct drive actuators recently developed. The motivation for our design and the construction plan for the hand is given. The dynamics of the hand are analyzed theoretically and a model for control of the hand is presented. Finally we describe our experiences made while experimenting with the hand. A direct drive hand graphics interface has been developed to simulate the dynamics of the hand and to test out control algorithms for the hand.
Title: Gedanken: A tool for Pondering the Tractability of Correct Program Technology
Candidate: Ericson, Lars
Advisor(s): Mishra, Bud
Abstract:
We examine the feasibility of the Correct Program Technology (CPT) approach to program verification using available technology, with pessimistic results.
We compare CPT with RAPTS and the Calculus of Constructions. We specify the Correct Programmer's Workbench (CPW), and review six programming environments as platforms. We define a Correct Program Editor and prototype it in Mathematica.
CPT applies decision procedures for specification sublanguages to make shorter proofs, hoping these shorter proofs will have faster verifications, but these sublanguages are NP-Complete or worse. We review some heuristics for improving their average case. CPT relies on a sublanguage of set theory, MLS. We prove that MLS is NP-average complete in the sense of the Levin-Gurevich theory of average case complexity. We conjecture that shorter proofs of random theorems cost more to verify.
EMLS is an elementary relational language (ERL). We define syntactic simplification rule sets (SSRs) for ERLs. The average case effect of an SSR is determined by the number of matches of the SSR with ERL sentences of n individual variables. EMLS sentences over n variables can be constructed from sentences in L _{ 4,2, n } and L _{ 2,3, n } , where L _{ k , m , n } is the language of k relations of m arguments over n variables. We recursively define a match-counting algorithm for L _{ k , m , n } SSRs and extend it to EMLS. If an SSR has p patterns in w pattern variables over n individual variables, match counting costs O ( p n ^{ w } 2 ^{ pn w - 1 } (2 + k n ^{ m } )). Match counting for L _{ k ,0,0 } is in #P, and we conjecture that it is # P -Complete. We conjecture generating functions do not yield a method of approximating the number of matches, and we conjecture that the problem of approximating matches is also # P -Complete. We count the matches for low n for some EMLS SSRs, with discouraging results, and note that the matches of an effective rule set must grow as the size of the language for n variables.
We conclude that the remaining hope for verification is to build a large library of specification language constructs which occur frequently and can be verified in polynomial time.
Title: Optimizing Eigenvalues of Symmetric Definite Pencils
Author(s): Haeberly, J. A.; Overton, M.
Abstract:
We consider the following quasiconvex optimization problem: minimize the largest eigenvalue of a symmetric definite matrix pencil depending on parameters. A new form of optimality conditions is given, emphasizing a complementarity condition on primal and dual matrices. Newton's method is then applied to these conditions to give a new quadratically convergent interior-point method which works well in practice. The algorithm is closely related to primal-dual interior-point methods for semidefinite programming.
Title: Designing Pattern Matching Algorithms by Exploiting Structural Pattern Properties
Candidate: Hariharan, Ramesh
Advisor(s): Cole, Richard
Abstract:
Exact Complexity of String Matching: We consider the question of how many character comparisons are needed to find all occurrences of a pattern string of length m in a text string of length n . We show an almost tight upper bound of the form n + O ( n / m ) character comparisons, following preprocessing. Specifically, we show an upper bound of n + 8/(3( m +1)) ( n - m ) character comparisons. The following lower bounds are also shown: for on-line algorithms, a bound of n + 9 / (4( m +1)) ( n - m ) character comparisons for m =35+36 k , for any integer k >= 1, and for general algorithms, a bound of n +2( n - m ) / ( m +3) character comparisons, for m =2 k +1, for any integer k >= 1.
Parallel Two-Dimensional Pattern Matching: We give the first time, space and work optimal common CRCW-PRAM algorithm for finding all occurrences of a two-dimensional pattern of size m _{ 1 } * m _{ 2 } in a two-dimensional text of size n _{ 1 } * n _{ 2 } . Our algorithm runs in O (1) time performing O ( n _{ 1 } * n _{ 2 } ) work, following preprocessing of the pattern. A major portion of the preprocessing step is the computation of witnesses for the pattern. We show how to compute witnesses for the pattern in O ( log log m _{ 2 } )time and O ( m _{ 1 } * m _{ 2 } ) work when m _{ 2 } >= m _{ 1 } . In the process of designing the above algorithm, we also obtain some new periodicity properties of two-dimensional patterns.
Parallel Suffix Tree Construction: We consider the problem of constructing the suffix tree of a given string s of length m in parallel. An O ( m )-work, O ( m )-space, O ( log ^{ 4 } m )-time CREW-PRAM algorithm for constructing the suffix tree of s is obtained when s is drawn from any fixed alphabet set. This is the first work and space optimal parallel algorithm known for this problem. It can be generalized to construct the suffix tree of a string s drawn from any general alphabet set to perform in O ( log ^{ 4 } m ) time, $O(m\log |\Sigma|)$
work, and $O(m\log |\Sigma|)$
space, after the characters in s have been sorted alphabetically; here $|\Sigma|$
is the number of distinct characters in s . In this case too, the algorithm is work optimal.
Title: Compilation of Array-Style Programs for Distributed Memory MIMD Machines: a Geometric Approach
Candidate: Katz, Alex
Advisor(s): Schonberg, Edmond
Abstract:
Distributed memory MIMD (Multiple Instruction Multiple Data) machines are emerging as a cost-effective means of speeding up numerically intensive programs. They scale more easily than other parallel machines. But writing explicitly parallel programs for these machines is both difficult and error prone. Compilers for languages like HPF make the task easier by generating the necessary inter-processor communication from the data distribution directives supplied by the programmer. This dissertation shows that for a large class of array-style programs automatic data distribution can produce a significant speedup on a distributed memory MIMD machine. Array-style programs use array primitives to manipulate entire arrays, rather than looping explicitly over the array elements. APL programs are typically array-style.
We show how to apply automated data distribution to APL programs, that treat arrays and operations on them as atomic. Automated data distribution determines the necessary inter-processor communication from the way APL primitives manipulate the entire arrays, rather than by complex algebraic analysis of the patterns of array subscripts, as would be done in more conventional compilers. A simple distribution and alignment scheme automatically distributes arrays across available processors. Arrays can be dynamic, with sizes varying during program execution. Data distribution is guided by array size estimates. Distribution trade-off analysis attempts to optimize the initial distribution by comparing the estimated communication and computation times, and replicating arrays whose partitioning results in excessive communication.
Building on the APL to C compiler developed by W.-M. Ching, we produce explicitly parallelized C, from APL source programs. We describe the parallel implementation of most of the APL primitives. The implementation of several APL primitives uses the monotonic data movement algorithm. The ideas developed are demonstrated with eight APL programs of varying complexity. We show the speedup and efficiency obtained when running these programs on 2 to 32 processors. The speedup achieved on 32 processors, ranging from 7 to 30, shows the technique to be applicable to a wide range of programs.
Title: An Optimal Preconditioner for a Class of Saddle Point Problems with a Penalty Term
Author(s): Klawonn, A.
Abstract:
Iterative methods are considered for a class of saddle point problems with a penalty term arising from finite element discretizations of certain elliptic problems. An optimal preconditioner which is independent of the and the penalty parameter is constructed. This approach is then used to design an iterative method with a convergence rate independent of the Lam\'{e} parameters occuring in the equations of linear elasticity.
Please see revised version tr683.
Title: New Estimates for Ritz Vectors
Author(s): Knyazev, A.
Abstract:
The followiing estimate for the Rayleigh--Ritz method is proved: $$ | \tilde \lambda - \lambda | |( \tilde u , u )| \le { \| A \tilde u - \tilde \lambda \tilde u \| } \sin \angle \{ u ; \tilde U \}, \ \| u \| =1. $$ Here $A$ is a bounded self-adjoint operator in a real Hilbert/euclidian space, $\{ \lambda, u \}$ one of its eigenpairs, $\tilde U$ a trial subspace for the Rayleigh--Ritz method, and $\{ \tilde \lambda, \tilde u \}$ a Ritz pair. %$\| u \| = \| \tilde u \| = 1.$ This inequality makes it possible to analyze the fine structure of the error of the Rayleigh--Ritz method, in particular, it shows that $ |( \tilde u , u )| \le C \epsilon^2, $ if an eigenvector $u$ is close to the trial subspace with accuracy $\epsilon$ and a Ritz vector $\tilde u$ is an $\epsilon$ approximation to another eigenvector, with a different eigenvalue. Generalizations of the estimate to the cases of eigenspaces and invariant subspaces are suggested, and estimates of approximation of eigenspaces and invariant subspaces are proved.
Title: Lazy SETL Debugging with Persistent Data Structures
Candidate: Liu, Zhiqing
Advisor(s): Schwartz, Jack
Abstract:
Debugging tools have been traditionally difficult to use, particularly in accumulating and exploring program runtime information. This dissertation addresses these issues by proposing a lazy debugging approach, which postpones investigation of debugging hypothesis until complete runtime history is available. This approach encourages a systematic way of debugging and supports many high-level debugging facilities. Recent advance in persistent data structures reduces the time and memory space overhead incurred in recording and storing execution events drastically, and also makes the overhead easily manageable.
To demonstrate this approach, a visual SETL debugger prototype has been designed and implemented based on D. Bacon's SETL translator. This debugger has a persistent runtime system designed using the persistent data structures of the node splitting type, developed by Driscoll, et al. It can efficiently record changes in program execution state under different recording granularities, along with supporting normal SETL executions. Users of this debugger are provided with a graphical interface, which supports many powerful tools, such as forward/backward control/data breakpoints, interactive variable printing, program animation, and re-execution from an recorded execution moment.
A strong set of conclusions can be drawn from an evaluation of the debugger's performance and usability issues, as well as the limitations and open questions of this debugging approach.
Title: Searching for Strings and Searching in Presence of Errors
Candidate: Muthukrishnan, S.
Advisor(s): Spencer, Joel
Abstract:
This dissertation deals with two classes of searching problems. The first class consists of pattern matching problems, and the second class comprises combinatorial searching problems in presence of errors in response to the queries. Our results are as follows.
Standard Stringology. Standard Stringology is the study of pattern matching problems in which a text location matches one in the pattern provided the associated symbols are identical. The basic problem here is the string matching problem of detecting all occurrences of a pattern string in a text string. This naturally generalizes to the dictionary matching problem of finding all occurrences of a set of patterns, rather than a single pattern, in a given text. Very fast optimal parallel algorithms exist for string matching in the PRAM model. These algorithms rely on structural properties of the strings. Unfortunately these structural properties are not useful for solving the dictionary matching problem. We have obtained the fastest and the most work-efficient algorithms known for this problem and a number of its variants by introducing and using a new technique called shrink-and-spawn .
Non-Standard Stringology. In problems from Non-Standard Stringology, an arbitrary many-to-many matching relation holds between the text and pattern locations. An example is string matching with ``don't cares'' where the position in the text that has a ``don't care'' symbol matches every pattern position. The inherent complexity and structure of such non-standard string matching problems is not well understood. Our main results are inherent complexity bounds for these problems, characterized in terms of algebraic convolutions. Traditionally structure in pattern matching has meant repetitions in patterns, but this work exposes a novel graph-theoretic structure in these problems.
Searching in presence of errors. Given a set of items containing one or more distinguished items, the generic combinatorial search problem is to determine the distinguished item(s) using detection tests on groups of items. Motivated by fault-tolerance issues, we consider the scenario when some tests get incorrect responses. We have developed a strategy to solve the generic problem above using at most one test more than that necessary, even under adversarial placement of incorrect responses to the tests.
Title: Visual Programming
Candidate: Nickerson, Jeffrey
Advisor(s): Schonberg, Edmond
Abstract:
While computer science textbooks and classroom lectures are filled with diagrams, and much of our design activity as programmers takes place on whiteboards, we write our pro- grams as text. Proponents of visual programming suggest that we should take advantage of graphic user interface technol- ogy and draw rather than write our programs. This disserta- tion examines the extent to which this is possible, address- ing the question of how graphic representation can best be used in the process of programming.
The use of diagrams in the field of computer science is thoroughly surveyed, and some underlying principles identi- fied. The visual conventions of Adjoinment, Linking, and Enclosure are defined and illustrated. Three languages are developed - a simple programming language that encompasses shell commands, a visual version of APL, and a visual front end for Mathematica. The visual version of APL is notable in that it presents both a program and instances of data under- going transformation as part of one unified diagram.
Building on the work of R. J. A. Buhr, new visual systems designing conventions are created to handle the intricacies of facilities in the Ada9X language. Asynchronous transfers of control, requeueing, and generic formal parameters are addressed. The asynchronous transfer of control convention is suitable for CASE representations of the language con- struct, and can be easily animated.
Some existing software metrics are modified for use in analyzing diagrams, and two new metrics are proposed: graphic token count and diagram class complexity. A graphic design measure, data density, is transformed into a computer science measure, token density. Using these metrics, graphic representations can be compared to each other and to textual representations. From this, a strong set of conclusions are drawn about the relative strengths of graphic and textual representation, as well as the limits and possibilities of graphic representation in programming.
Title: Planning Paths of Minimal Curvature
Author(s): Sellen, J.
Abstract:
We consider the problem of planning curvature constrained paths amidst polygonal obstacles, connecting given start and target configurations. Let the critical curvature Rc be the minimal curvature for which a constrained path exists. We describe an algorithm, which approximates the critical curvature and finds a corresponding path. Further, we give an efficient decision procedure to determine if there exists a path satisfying a given curvature constraint R, with running time polynomial in |R-Rc|/R.
Title: Simple Multi Function Vision System for 3D Data Acquisition
Author(s): Sokolov, S. M.; Max, D. P.; Wallace, R. S.
Abstract:
We have developed a simple multi function vision system for 3D data acquisition for a wide range of applications in robotics and automation. The system uses one CCD video camera and an active directed laser light source based on a direct drive spherical pointing motor (SPM). The anatomy of the system and algorithms used are described. System calibration methods and measurements of accuracy of the outputs are presented. A list of applications is shown.
Title: Scaling Direct Drive Robots
Author(s): Wallace, R.; Selig, J.
Abstract:
Recent experimental and analytical evidence indicates that direct drive robots become very practical and economical at miniature and microscopic scales, so it is interesting to understand quantitatively the properties of direct drive robots under scaling transformations. This leads to a study of how screws and their dual co-screws behave under the group of similarity transforms. This group is the group of isometries together with dilations. Several different representations are found on the space of screws and complementary representations are found on the dual space of co-screws. From the electromagnetic theory of the force and torque on a magnet in a magnetic field, we derive the scaling properties of the electromagnetic wrench. Hence, these results can be directly applied to the scaling of direct drive motors [1]. We conclude by proposing a scale-invariant measure for direct drive actuator performance.
Title: Representing Control in Parallel Applicative Programing
Author(s): Yao, C.
Abstract:
This research is an attempt to reason about the control of parallel computation in the world of applicative programming languages.
Applicative languages, in which computation is performed through function application and in which functions are treated as first-class objects, have the benefits of elegance, expressiveness and having clean semantics. Parallel computation and real-world concurrent activities are much harder to reason about than the sequential counterparts. Many parallel applicative languages have thus hidden most control details with their declarative programming styles, but they are not expressive enough to characterize many real world concurrent activities that can be easily explained with concepts such as message passing, pipelining and so on.
Ease of programming should not come at the expense of expressiveness. Therefore, we design a parallel applicative language Pscheme such that programmers can express explicitly the control of parallel computation while maintaining the clean semantics and the ease of programming of applicative languages. In Pscheme, we propose the concept of ports to model the general control in parallel computation. Through program examples, we show how Pscheme and ports support various parallel programming paradigms. We have also built libraries for higher level control facilities with ports so that programming in Pscheme becomes easier.
We provide an operational semantics for Pscheme, and develop a compiler and a run time system on NYU's Ultracomputer. Our experiments with parallel programs have shown satisfactory speedup. We claim that ports are the natural parallel extensions of continuations in sequential computation, and thus conclude that representing general control in parallel applicativeprogramming is feasible.
Title: Pscheme: Extending Continuations to Express Control and Synchronization in a Parallel LISP
Author(s): Yao, C.; Goldberg, B.
Abstract:
In this paper, we describe Pscheme, a parallel dialect of Scheme. The primary construct for specifying parallelism, synchronization, and communication is a natural extension of first-class continuations which we call a port. We describe the behavior of ports, along with the other parallel constructs of Pscheme. Because the user has precise control over the parallel computation, the Pscheme constructs can be used to build higher-level parallel programming abstractions, such as futures, semaphores, and Ada-style rendezvous. We provide the Pscheme code for these abstractions and discuss the current implementation of Pscheme on a shared-memory multiprocessor.
Title: Representing Control in Parallel Applicative Programming
Candidate: Yao, Chi
Advisor(s): Goldberg, Benjamin
Abstract:
This research is an attempt to reason about the control of parallel computation in the world of applicative programming languages.
Applicative languages, in which computation is performed through function application and in which functions are treated as first-class objects, have the benefits of elegance, expressiveness and having clean semantics. Parallel computation and real-world concurrent activities are much harder to reason about than the sequential counterparts. Many parallel applicative languages have thus hidden most control details with their declarative programming styles, but they are not expressive enough to characterize many real world concurrent activities that can be easily explained with concepts such as message passing, pipelining and so on. Ease of programming should not come at the expense of expressiveness. Therefore, we design a parallel applicative language Pscheme such that programmers can express explicitly the control of parallel computation while maintaining the clean semantics and the ease of programming of applicative languages. In Pscheme, we propose the concept of ports to model the general control in parallel computation. Through program examples, we show how Pscheme and ports support various parallel programming paradigms. We have also built libraries for higher level control facilities with ports so that programming in Pscheme becomes easier.
We provide an operational semantics for Pscheme, and develop a compiler and a run time system on NYU's Ultracomputer. Our experiments with parallel programs have shown satisfactory speedup. We claim that ports are the natural parallel extensions of continuations in sequential computation, and thus conclude that representing general control in parallel applicative programming is feasible.