Title: On the use of Global Optimization Algorithms for the Detection of Semantic Programming Errors (Setl, Data Flow, Type Finding)
Candidate: Freudenberger, Stefan M.
It has been pointed out repeatedly that it should be possible to adapt global program optimization algorithms for the purpose of detecting faults in programs. It has become clear that global program analysis can be beneficial in program development, debugging, verification, and documentation since it can provide information about all possible executions of the code at once. The techniques employed are not only capable of revealing errors, interfacing errors, and other shortcomings but of doing so in a way which helps to pinpoint the source of problems. In this dissertation we systematically examine the global optimization techniques available today to determine how these techniques can be used to aid the rapid, compile-time detection of program errors. The techniques considered include flow tracing, type finding, and value flow. The approach is to determine what facts about a program can be collected using the best available program analysis technique, and to use this information to mark suspicious program segments. The techniques proposed have been implemented in an extensive global bug finder, and examples of its use are included.
Title: Description of Operating Systems using Very-High-Level Diction (Programming Languages)
Candidate: Leshem, Gavriel
Operating systems are generally large and complicated, and therefore difficult to write, debug and maintain. This thesis approaches the problem of simplifying these complex descriptions by writing operating system prototypes using a very high-level programming language that significantly relieves the burden of low-level and machine dependent details. The language used includes special constructs designed to facilitate clear and concise description of the mechanisms involved in multiprocessing systems: (1) a coroutine mechanism to implement concurrent processes, (2) a interprocess communication mechanism, (3) a real I/O facility that provides access to I/O system services. Using these intermediate-level constructs simplifies the problem of describing the high-level structure of operating systems significantly. These constructs are written in a high-level programming language, using several simple low-level primitives. They can be modified easily and new operations can be added at will. The main purpose of our high-level approach is to provide a tool for describing and designing operating systems. The high-level description can be used as a blueprint for writing the real operating system in a suitable lower-level implementation language. The thesis also describes an implementation of the suggested language that can be used to test the high-level description of an operating system and possibly also to simulate the real system to predict its potential performance. We test our descriptive tools by giving extended descriptions of two well-known operating systems using the proposed high-level language. Several basic design issues concerning these operating systems are then examined and the operating systems are compared in a manner that emphasizes the design issues that emerge. Some modifications of these systems, inspired by the high-level representation, are suggested.
Title: Decidability and Proof Procedures for Set Theory with a Choice Operator
Candidate: Omodeo, Eugenio Giovanni
Advisor(s): Davis, Martin D.
Various decision algorithms are described and proved correct, each applying to a particular collection of unquantified set-theoretical formulas. Some of these algorithms are able to determine whether each given formula is satisfiable, some others can only establish whether it is satisfiable by means of an interpretation in which the values of the terms appearing in the formula are finite sets. In most cases, formulas are allowed to involve a choice operator which selects from every non-empty set s the minimum of s with respect to a well-ordering of the class of all sets. A semi-decision procedure is also described which applies to unquantified formulas in which all familiar set-theoretical operators are allowed to appear, with certain limitations imposed only on the occurrences of the unionset and choice operators. The execution of this procedure only terminates when the input formula is finitely satisfiable.
Title: A Self-Organizing Database System - a Different Approach to Query Optimization
Candidate: Piatetsky-Shapiro, Gregory Ilya
A Self-Organizing Database System (SODS) monitors queries asked, finds a good (or optimal) database structure for those queries, and suggests or does the reorganization. In this thesis we describe a prototype SODS for single-file relational queries and give an integrated analysis of its major design problems: (1) estimation of the number of records satisfying a condition (i.e., condition selectivity); (2) query optimization; (3) storing information about a set of queries; (4) optimal selection of secondary indices. We present new results for each of those problems. Some of this research was implemented in FASTSCAN, a commercial query system. We present a new method for accurate estimation of the number of records satisfying a condition field rel constant, where rel is one of =, , (LESSTHEQ), (GREATERTHEQ). We also examine estimates for more complicated conditions. We present elementary operations (such as UNION, INTERSECT) on pointer and record streams. We show how to use the query parse tree to construct a query evaluation method (EM) from those operations. Then we give an algorithm for selecting the optimal EM, based on converting the query to conjunctive normal form. We examine ways to compress information about a set of queries by combining information for similar queries. We derive a compression scheme which allows a correct and fast computation of the cost of the average query under any index set. We combine all previous results in analyzing the NP-hard problem of optimal index selection. We present two algorithms for it. The first one always finds the optimal answer and runs fast on real-size problems despite its exponential worst-case complexity. The second one (a Greedy method) runs much faster, yet finds the optimal answer very frequently. We analyze the Maximum Cover problem (also NP-hard), a simplification of the optimal index selection. We prove that the Greedy method is an epsilon-approximate algorithm: its answer is always > 63% of the optimal answer.
Title: Concurrency Control using Locks in Distributed Databases
Candidate: Wolfson, Ouri
Distributed Databases have drawn a great deal of research interest recently because of a combination of several related reasons. First is the tremendous expansion in the quantity of data that has to be processed in the modern world. Second is the growth in the number of interelated processing centers because microcomputers and communication technology enable greater dispersion of organizations. Third is the realization that complex problems to be addressed in this and next decade, such as different aspects of Artificial Intelligence, will require at least some parallel processing for adequate solution. In Distributed Databases the typical problems of Centralized Databases become more difficult. One of them is Concurrency Control. It can be summarized as follows. Users of the Database access it by executing transactions. Different transactions are executed concurrently therefore their actions interleave. Without proper control this interleaving may produce incorrect results, even if individual transactions are correct. The Concurrency Control process has to prevent these situations. There are several possible mechanisms for controlling concurrency, of which the most widely used is Locking. In this thesis we examine and analyze Locking as a Concurrency Control mechanism for Distributed Databases. We define Distributed Locking Policies (methods for locking entities in Distributed Databases) and show how existing Policies for a Centralized Database generalize to the Distributed case. We also define a new category of Distributed Locking Policies, D-policies, into which these generalizations fall. An algorithm which determines whether all transactions of a given D-policy are guaranteed to produce only correct interleavings (are safe) is presented. The algorithm is efficient, even though testing an arbitrary set of transactions for safety is coNP-complete. However, we prove that optimal locking of transactions to satisfy the conditions tested by the algorithm is NP-hard even for a Centralized Database.