Instructions for submitting a technical report or thesis.
Title: On the Solution of Elliptic Partial Differential Equations on Regions with Corners III: Curved Boundaries
Author(s): Serkh, Kirill
Abstract:
In this report we investigate the solution of boundary value problems for elliptic partial differential equations on domains with corners. Previously, we observed that when, in the case of polygonal domains, the boundary value problems are formulated as boundary integral equations of classical potential theory, the solutions are representable by series of certain elementary functions. Here, we extend this observation to the general case of regions with boundaries consisting of analytic curves meeting at corners. We show that the solutions near the corners have the same leading terms as in the polygonal case, plus a series of corrections involving products of the leading terms with integer powers and powers of logarithms. Furthermore, we show that if the curve in the vicinity of a corner approximates a polygon to order \(k\), then the correction added to the leading terms will vanish like \(O(t^k)\), where \(t\) is the distance from the corner.
Title: On the Design of Small Coarse Spaces for Domain Decomposition Algorithms
Author(s): Dohrmann, Clark; Widlund, Olof
Abstract:
Methods are presented for automatically constructing %low-dimensional coarse spaces of low dimension for domain decomposition algorithms. These constructions use equivalence classes of nodes on the interface between the subdomains into which the domain of a given elliptic problem has been subdivided, e.g., by a mesh partitioner such as METIS; these equivalence classes already play a central role in the design, analysis, and programming of many domain decomposition algorithms. The coarse space elements are well defined even for irregular subdomains, are continuous, and well suited for use in two-level or multi-level preconditioners such as overlapping Schwarz algorithms. An analysis for scalar elliptic and linear elasticity problems ms reveals that significant reductions in the coarse space dimension can be achieved while not sacrificing the favorable condition number estimates for larger coarse spaces previously developed. These estimates depend primarily on the Lipschitz parameters of the subdomains. Numerical examples for problems in three dimensions are presented to illustrate the methods and to confirm the analysis. In some of the experiments, the coefficients have large discontinuities across the interface between the subdomains, and in some, the subdomains are generated by mesh partitioners.
Title: Isogeometric BDDC Deluxe Preconditioners for Linear Elasticity
Author(s): Pavarino, Luca F.; Scacchi, Simone; Widlund, Olof B.; Zampini, Stefano
Abstract:
Balancing domain decomposition by constraints (BDDC) preconditioners have been shown to provide rapidly convergent preconditioned conjugate gradient methods for solving many of the very ill-conditioned systems of algebraic equations which often arise in finite element approximations of a large variety of problems in continuum mechanics. These algorithms have also been developed successfully for problems arising in isogeometric analysis. In particular, the BDDC deluxe version has proven very successful for problems approximated by non-uniform rational B-splines (NURBS), even those of high order and regularity. One main purpose of this paper is to extend the theory, previously fully developed only for scalar elliptic problems in the plane, to problems of linear elasticity in three dimensions. Numerical experiments supporting the theory, are also reported. Some of these experiments highlight the fact that the development of the theory can help to decrease substantially the dimension of the primal space of the BDDC algorithm, which provides the necessary global component of these preconditioners, while maintaining scalability and good convergence rates.
Title: Detecting Missing and Spurious Edges in Large, Dense Networks Using Parallel Computing
Author(s): Coolidge, Sam; Simon, Dan; Shasha, Dennis
Abstract:
Certain pairs of drugs can cause death from their interaction. Knowledge of such interactions is held in drug interaction networks. The problem is that such networks may miss interactions that should be present and may include interactions that should be absent. Clearly, such information is valuable. Drug interaction networks are not unique in this regard. The same holds for protein-protein interaction networks, ecological networks, and many others. Improving the quality of such networks often requires a ground truth analysis (e.g. more experiments) but Roger Guimerá, Marta Sales-Prado, and their colleagues have shown in several papers that a structural analysis of networks can lead to predictions of missing and spurious edges that can improve those networks. Our contribution in this paper and the accompanying software is to create a program implementing their algorithmic ideas that is parallelizable and easy to modify for researchers who wish to try out new ideas. Our software can be found at https://github.com/samcoolidge/network.
Title: Finding Prospects for Shopping Centers: a machine learning approach
Author(s): Kogan, Jonathan; Jain, Rishabh; Jean, Joe; Lowrance, Roy; Shasha, Dennis
Abstract:
We have developed an algorithm that predicts which store types are the best prospects to fill vacancies in shopping centers given the combinations of stores already there. The model is able to make predictions with accuracies up to 81.62% for the first prediction, 90.05% for the first two predictions, 93.34% for the first three predictions, 95.52% for the first four predictions, and 96.48% for the first five predictions. The p-values with respect to a naïve strategy of choosing the store types that are simply most frequent are all below 0.0001%. This paper explains how the system was built and some user tests, not all of which were positive. The system can be found at http://linserv2.cims.nyu.edu:54321. The code for the project can be found at https://github.com/jgk99/Store-Prospector.
Title: On the Solution of Elliptic Partial Differential Equations on Regions with Corners II: Detailed Analysis
Author(s): Serkh, Kirill
Abstract:
In this report we investigate the solution of boundary value problems on polygonal domains for elliptic partial differential equations.
Title: Alphacodes: Usable, Secure Transactions with Untrusted Providers using Human Computable Puzzles
Author(s): Sharma, Ashlesh; Chandrasekaran, Varun; Amjad, Fareeha; Shasha, Dennis; Subramanian, Lakshminarayanan
Abstract:
Many banking and commerce payment systems, especially in developing regions, continue to require users to share private or sensitive information in clear-text with untrusted providers exposing them to different forms of man-in-the-middle attacks. In this paper, we introduce Alphacodes, a new paradigm that provides a usable security solution that enables users to perform secure transactions with untrusted parties using the notion of visual puzzles. Alphacodes are designed as verification codes for short message transactions and provide easy authentication of critical portions of a transaction. We describe how Alphacodes can be applied in different use cases and also show two simple applications that we have built using the Alphacodes framework. We show security vulnerabilities in existing systems and show how our protocol overcomes them. We also demonstrate the ease of use of Alphacodes with minimal training using two simple mechanical turk studies. Using another simple real world user study involving 10 users who speak Kannada (local Indian language), we show that the Alphacodes concept can be easily extended to other languages beyond English.
Title: Scaling Multicore Databases via Constrained Parallel Execution
Author(s): Wang, Zhaoguo; Mu, Shuai; Cui, Yang; Yi, Han; Chen, Haibo; Li, Jinyang
Abstract:
Multicore in-memory databases often rely on traditional concurrency control schemes such as two-phase-locking (2PL) or optimistic concurrency control (OCC). Unfortunately, when the workload exhibits a non-trivial amount of contention, both 2PL and OCC sacrifice much parallel execution opportunity. In this paper, we describe a new concurrency control scheme, interleaving constrained concurrency control (IC3), which provides serializability while allowing for parallel execution of certain conflicting transactions. IC3 combines the static analysis of the transaction workload with runtime techniques that track and enforce dependencies among concurrent transactions. The use of static analysis simplifies IC3’s runtime design, allowing it to scale to many cores. Evaluations on a 64-core machine using the TPC-C benchmark show that IC3 outperforms traditional concurrency control schemes under contention. It achieves the throughput of 434K transactions/sec on the TPC-C benchmark configured with only one warehouse. It also scales better than several recent concurrent control schemes that also target contended workloads.
Title: Adaptive Selection of Primal Constraints for Isogeometric BDDC Deluxe Preconditioners
Author(s): Beirão da Veiga, L.; Pavarino, L. F.; Scacchi, S.; Widlund, O. B.; Zampini, S.
Abstract:
Isogeometric analysis has been introduced as an alternative to finite element methods in order to simplify the integration of CAD software and the discretization of variational problems of continuum mechanics. In contrast with the finite element case, the basis functions of isogeometric analysis are often not nodal. As a consequence, there are fat interfaces which can easily lead to an increase in the number of interface variables after a decomposition of the parameter space into subdomains. Building on earlier work on the deluxe version of the BDDC family of domain decomposition algorithms, several adaptive algorithms are here developed for scalar elliptic problems in an effort to decrease the dimension of the global, coarse component of these preconditioners. Numerical experiments provide evidence that this work can be successful, yielding scalable and quasi-optimal adaptive BDDC algorithms for isogeometric discretizations.
Title: An Adaptive Choice of Primal Constraints for BDDC Domain Decomposition Algorithms
Author(s): Calvo, Juan G.; Widlund, Olof B.
Abstract:
An adaptive choice for primal spaces, based on parallel sums, is developed for BDDC deluxe methods and elliptic problems in three dimensions. The primal space, which form the global, coarse part of the domain decomposition algorithm, and which is always required for any competitive algorithm, is defined in terms of generalized eigenvalue problems related to subdomain edges and faces; selected eigenvectors associated to the smallest eigenvalues are used to enhance the primal spaces. This selection can be made automatic by using tolerance parameters specified for the subdomain faces and edges. Numerical results verify the results and provide a comparison with primal spaces commonly used. They include results for cubic subdomains as well as subdomains obtained by a mesh partitioner. Different distributions for the coefficients are also considered, with constant coefficients, highly random values, and channel distributions.
Title: Domain Decomposition Methods for Problems in H(curl)
Author(s): Calvo, Juan Gabriel
Abstract:
Two domain decomposition methods for solving vector field problems posed in H(curl) and discretized with Nédélec finite elements are considered. These finite elements are conforming in H(curl).
A two-level overlapping Schwarz algorithm in two dimensions is analyzed, where the subdomains are only assumed to be uniform in the sense of Peter Jones. The coarse space is based on energy minimization and its dimension equals the number of interior subdomain edges. Local direct solvers are based on the overlapping subdomains. The bound for the condition number depends only on a few geometric parameters of the decomposition. This bound is independent of jumps in the coefficients across the interface between the subdomains for most of the different cases considered.
A bound is also obtained for the condition number of a balancing domain decomposition by constraints (BDDC) algorithm in two dimensions, with Jones subdomains. For the primal variable space, a continuity constraint for the tangential average over each interior subdomain edge is imposed. For the averaging operator, a new technique named deluxe scaling is used. The optimal bound is independent of jumps in the coefficients across the interface between the subdomains.
Furthermore, a new coarse function for problems in three dimensions is introduced, with only one degree of freedom per subdomain edge. In all the cases, it is established that the algorithms are scalable. Numerical results that verify the results are provided, including some with subdomains with fractal edges and others obtained by a mesh partitioner.
Title: Kmax: Analyzing the Linux Build System
Author(s): Gazzillo, Paul
Abstract:
Large-scale C software like Linux needs software engineering tools. But such codebases are software product families, with complex build systems that tailor the software with myriad features. This variability management is a challenge for tools, because they need awareness of variability to process all software product lines within the family. With over 14,000 features, processing all of Linux's product lines is infeasible by brute force, and current solutions employ incomplete heuristics. But having the complete set of compilation units with precise variability information is key to static tools such a bug-finders, which could miss critical bugs, and refactoring tools, since behavior-preservation requires a complete view of the software project. Kmax is a new tool for the Linux build system that extracts all compilation units with precise variability information. It processes build system files with a variability-aware make evaluator that stores variables in a conditional symbol table and hoists conditionals around complete statements, while tracking variability information as presence conditions. Kmax is evaluated empirically for correctness and completeness on the Linux kernel. Kmax is compared to previous work for correctness and running time, demonstrating that a complete solution's added complexity incurs only minor latency compared to the incomplete heuristic solutions.
Title: A Crop Recommendation Tool for Organic Farmers
Author(s): Hsu, Jasmine; Shasha, Dennis
Abstract:
We describe the data sources and machine learning algorithms that go into the current version of http://www.whatcanifarm.com , a website to help prospective organic farmers determine what to grow given the climate characterized by their zip code.
Title: BDDC Algorithm with Deluxe Scaling and Adaptive Selection of Primal Constraints for Raviart-Thomas Vector Fields
Author(s): Oh, Duk-Soon; Widlund, Olof B.; Zampini, Stefano; Dohrmann, Clark R.
Abstract:
A BDDC domain decomposition preconditioner is defined by a coarse component, expressed in terms of primal constraints, a weighted average across the interface between the subdomains, and local components given in terms of solvers of local subdomain problems. BDDC methods for vector field problems discretized with Raviart-Thomas finite elements are introduced. The methods are based on a new type of weighted average and an adaptive selection of primal constraints developed to deal with coefficients with high contrast even inside individual subdomains. For problems with very many subdomains, a third level of the preconditioner is introduced.
Under the assumption that the subdomains are all built from elements of a coarse triangulation of the given domain, and that the material parameters are constant in each subdomain, a bound is obtained for the condition number of the preconditioned linear system which is independent of the values and the jumps of the coefficients across the interface and has a polylogarithmic condition number bound in terms of the number of degrees of freedom of the individual subdomains. Numerical experiments, using the PETSc library, and a large parallel computer, for two and three dimensional problems are also presented which support the theory and show the effectiveness of the algorithms even for problems not covered by the theory. Included are also experiments with a variety of finite element approximations.
Title: Practical SMT-Based Type Error Localization
Author(s): Pavlinovic, Zvonimir; Wies, T
Abstract:
Compilers for statically typed functional programming languages are notorious for generating confusing type error messages. When the compiler detects a type error, it typically reports the program location where the type checking failed as the source of the error. Since other error sources are not even considered, the actual root cause is often missed. A more adequate approach is to consider all possible error sources and report the most useful one subject to some usefulness criterion. In our previous work, we showed that this approach can be formulated as an optimization problem related to satisfiability modulo theories (SMT). This formulation cleanly separates the heuristic nature of usefulness criteria from the underlying search problem. Unfortunately, algorithms that search for an optimal error source cannot directly use principal types which are crucial for dealing with the exponential-time complexity of the decision problem of polymorphic type checking. In this paper, we present a new algorithm that efficiently finds an optimal error source in a given ill-typed program. Our algorithm uses an improved SMT encoding to cope with the high complexity of polymorphic typing by iteratively expanding the typing constraints from which principal types are derived. The algorithm preserves the clean separation between the heuristics and the actual search. We have implemented our algorithm for OCaml. In our experimental evaluation, we found that the algorithm reduces the running times for optimal type error localization from minutes to seconds and scales better than previous localization algorithms.
Title: Acronym Disambiguation
Author(s): Turtel, Benjamin D.; Shasha, Dennis
Abstract:
Acronym disambiguation is the process of determining the correct expansion of an acronym in a given context. We describe a novel approach for expanding acronyms, by identifying acronym / expansion pairs in a large training corpus of text from Wikipedia and using these as a training dataset to expand acronyms based on word frequencies. On instances in which the correct acronym expansion has at least one instance in our training set (therefore making correct expansion possible), and in which the correct expansion is not the only expansion of an acronym seen in our training set (therefore making the expansion decision a non-trivial decision), we achieve an average accuracy of 88.6%. On a second set of experiments using user-submitted documents, we achieve an average accuracy of 81%.
Title: Overlapping Schwarz Algorithms for Almost Incompressible Linear Elasticity
Author(s): Cai, Mingchao; Pavarino, Luca F.; Widlund, Olof B.
Abstract:
Low order finite element discretizations of the linear elasticity system suffer increasingly from locking effects and ill-conditioning, when the material approaches the incompressible limit, if only the displacement variable are used. Mixed finite elements using both displacement and pressure variables provide a well-known remedy, but they yield larger and indefinite discrete systems for which the design of scalable and efficient iterative solvers is challenging. Two-level overlapping Schwarz preconditioner for the almost incompressible system of linear elasticity, discretized by mixed finite elements with discontinuous pressures, are constructed and analyzed. The preconditioned systems are accelerated either by a GMRES (generalized minimum residual) method applied to the resulting discrete saddle point problem or by a PCG (preconditioned conjugate gradient) method applied to a positive definite, although extremely ill-conditioned, reformulation of the problem obtained by eliminating all pressure variables on the element level. A novel theoretical analysis of the algorithm for the positive definite reformulation is given by extending some earlier results by Dohrmann and Widlund. The main result of the paper is a bound on the condition number of the algorithm which is cubic in the relative overlap and grows logarithmically with the number of elements across individual subdomains but is otherwise independent of the number of subdomains, their diameters and mesh sizes, and the incompressibility of the material and possible discontinuities of the material parameters across the subdomain interfaces. Numerical results in the plane confirm the theory and also indicate that an analogous result should hold for the saddle point formulation, as well as for spectral element discretizations.
Title: A two-level overlapping Schwarz method for H(curl) in two dimensions with irregular subdomains
Author(s): Calvo, Juan G.
Abstract:
A bound is obtained for the condition number of a two-level overlapping Schwarz algorithm for problems posed in H(curl) in two dimensions, where the subdomains are only assumed to be uniform in the sense of Peter Jones. The coarse space is based on energy minimization and its dimension equals the number of interior subdomain edges. Local direct solvers are used on the overlapping subdomains. Our bound depends only on a few geometric parameters of the decomposition. This bound is independent of jumps in the coefficients across the interface between the subdomains for most of the different cases considered. Numerical experiments that verify the result are shown, including some with subdomains with fractal edges and others obtained by a mesh partitioner.
Title: A BDDC algorithm with deluxe scaling for H(curl) in two dimensions with irregular subdomains
Author(s): Calvo, Juan G.
Abstract:
A bound is obtained for the condition number of a BDDC algorithm for problems posed in H(curl) in two dimensions, where the subdomains are only assumed to be uniform in the sense of Peter Jones. For the primal variable space, a continuity constraint for the tangential average over each interior subdomain edge is imposed.
For the averaging operator, a new technique named deluxe scaling is used. Our bound is independent of jumps in the coefficients across the interface between the subdomains and depends only on a few geometric parameters of the decomposition. Numerical results that verify the result are shown, including some with subdomains with fractal edges and others obtained by a mesh partitioner.
Title: A BDDC algorithm with deluxe scaling for three-dimensional H(curl) problems
Author(s): Dohrmann, Clark R.; Widlund, Olof B.
Abstract:
In this paper, we present and analyze a BDDC algorithm for a class of elliptic problems in the three-dimensional H(curl) space. Compared with existing results, our condition number estimate requires fewer assumptions and also involves two fewer powers of log(H/h), making it consistent with optimal estimates for other elliptic problems. Here, H/h is the maximum of H _{ i } /h _{ i } over all subdomains, where H _{ i } and h _{ i } are the diameter and the smallest element diameter for the subdomain Ω _{ i } .
The analysis makes use of two recent developments. The first is a new approach to averaging across the subdomain interfaces, while the second is a new technical tool which allows arguments involving trace classes to be avoided. Numerical examples are presented to confirm the theory and demonstrate the importance of the new averaging approach in certain cases.
Title: Local temporal reasoning
Author(s): Koskinen, Eric
Abstract:
We present the first method for reasoning about temporal logic properties of higher-order, infinite-data programs. By distinguishing between the finite traces and infinite traces in the specification, we obtain rules that permit us to reason about the temporal behavior of program parts via a type-and-effect system, which is then able to compose these facts together to prove the overall target property of the program. The type system alone is strong enough to derive many temporal safety properties using refinement types and temporal effects. We also show how existing techniques can be used as oracles to provide liveness information (e.g. termination) about program parts and that the type-and-effect system can combine this information with temporal safety information to derive nontrivial temporal properties. Our work has application toward verification of higher-order software, as well as modular strategies for procedural programs.
Title: The Push/Pull model of transactions
Author(s): Koskinen, Eric; Parkinson, Matthew
Abstract:
We present a general theory of serializability, unifying a wide range of transactional algorithms, including some that are yet to come. To this end, we provide a compact semantics in which concurrent transactions push their effects into the shared view (or unpush to recall effects) and pull the effects of potentially uncommitted concurrent transactions into their local view (or unpull to detangle). Each operation comes with simple side-conditions given in terms of commutativity (Lipton's left-movers and right-movers).
The benefit of this model is that most of the elaborate reasoning (coinduction, simulation, subtle invariants, etc.) necessary for proving the serializability of a transactional algorithm is already proved within the semantic model. Thus, proving serializability (or opacity) amounts simply to mapping the algorithm on to our rules, and showing that it satisfies the rules' side-conditions.
Title: VerifiableAuction: An Auction System for a Suspicious World
Author(s): Rosenberg, Michael; Shasha, Dennis
Abstract:
This paper presents a cryptosystem that will allow for fair first-price sealed-bid auctions among groups of individuals to be conducted over the internet without the need for a trusted third party. A client who maintains the secrecy of his or her private key will be able to keep his/her bid secret from the server and from all other clients until this client explicitly decides to reveal his/her bid, which will be after all clients publish their obfuscated bids. Each client will be able to verify that every other client's revealed bid corresponds to that client's obfuscated bid at the end of each auction. Each client is provided with a transcript of all auction proceedings so that they may be independently audited.
Title: On Automating Separation Logic with Trees and Data
Author(s): Wies, Thomas
Abstract:
Separation logic (SL) is a widely used formalism for verifying heap manipulating programs. Existing SL solvers focus on decidable fragments for list-like structures. More complex data structures such as trees are typically unsupported in implementations, or handled by incomplete heuristics.
While complete decision procedures for reasoning about trees have been proposed, these procedures suffer from high complexity, or make global assumptions about the heap that contradict the separation logic philosophy of local reasoning. In this paper, we present a fragment of classical first-order logic for local reasoning about tree-like data structures. The logic is decidable in NP and the decision procedure allows for combinations with other decidable first-order theories for reasoning about data. Such extensions are essential for proving functional correctness properties.
We have implemented our decision procedure and, building on earlier work on translating SL proof obligations into classical logic, integrated it into an SL-based verification tool. We successfully used the tool to verify functional correctness of tree-based data structure implementations.
Title: Isogeometric BDDC Preconditioners with Deluxe Scaling
Author(s): Beirao Da Veiga, Lourenco; Pavarino, Luca; Scacchi, Simone; Widlund, Olof; Zampni, Stefano
Abstract:
A BDDC (Balancing Domain Decomposition by Constraints) preconditioner with a novel scaling, introduced by Dohrmann for problems with more than one variable coeffcient and here denoted as deluxe scaling, is extended to Isogeometric Analysis of scalar elliptic problems. This new scaling turns out to be more powerful than the standard rho- and stiffness scalings considered in a previous isogeometric BDDC study. Our h-analysis shows that the condition number of the resulting deluxe BDDC preconditioner is scalable with a quasi-optimal polylogarithmic bound which is also independent of coeffcient discontinuities across subdomain interfaces. Extensive numerical experiments support the theory and show that the deluxe scaling yields a remarkable improvement over the older scalings, in particular, for large isogeometric polynomial degree and high regularity.
Title: Tight Lower Bound on the Probability of a Binomial Exceeding its Expectation
Author(s): Greenberg, Spencer; Mohri, Mehryar
Abstract:
We give the proof of a tight lower bound on the probability that a binomial random variable exceeds its expected value. The inequality plays an important role in a variety of contexts, including the analysis of relative deviation bounds in learning theory and generalization bounds for unbounded loss functions.
Title: Diet Planner: Finding a Nutritionally Sound Diet While Following (Most) of a Dieter’s Desires
Author(s): Jermsurawong, Mick Jermsak; Shasha, Dennis
Abstract:
We describe the design and implementation of a diet website currently housed at http://nutrientdata.herokuapp.com/. The site allows users or dieticians to enter nutritional constraints (e.g. at least this much calcium but not more than that amount of calcium) and objectives (e.g. minimize calories), a list of foods/brands the person likes. The site then determines, if possible, the quantities of at least some of those desired foods that would meet the nutritional constraints. If not possible, then the site guides the user in the choice of other foods that may meet the nutritional constraints. The net result is a tailored diet measured in servings.
Title: A General Method for Energy-Error Tradeoffs in Approximate Adders
Author(s): Kedem, Zvi; Muntimadugu, Kirthi Krishna
Abstract:
Approximate adders are adders with conventional architectures run in an overclocked mode. With this mode, erroneous sums may be produced at the savings of energy required to execute the computation. The results presented in this report lead to a procedure for allocating the available energy budgets among the adders modules so as to minimize the expected error. For simplicity, only the uniform distribution of the inputs is considered.
Title: Reversibility of Turing Machine Computations
Author(s): Kedem, Zvi M.
Abstract:
Since Bennett's 1973 seminal paper, there has been a growing interest in general-purpose, reversible computations and they have been studied using both mathematical and physical models. Following Bennett, given a terminating computation of a deterministic Turing Machine, one may be interested in constructing a new Turing Machine, whose computation consists of two stages. The first stage emulates the original Turing Machine computation on its working tape, while also producing the trace of the computation on a new history tape. The second stage reverses the first stage using the trace information. Ideally, one would want the second stage to traverse whole-machine states in the reverse order from that traversed in the first stage. But this is impossible other than for trivial computations. Bennett constructs the second phase by using additional controller states, beyond those used during the first stage. In this report, a construction of the new machine is presented in which the second stage uses the same and only those controller states that the first stage used and they are traversed in the reverse order. The sole element that is not fully reversed is the position of the head on the history tape, where it is out of phase by one square compared to the first stage.
Title: A Balancing Domain Decomposition By Constraints Deluxe Method For Numerically Thin Reissner-Mindlin Plates Approximated With Falk-tu Finite Elements
Author(s): Lee, Jong Ho
Abstract:
The Reissner-Mindlin plate models thin plates. The condition numbers of finite element approximations of these plate models increase very rapidly as the thickness of the plate goes to 0. A Balancing Domain Decomposition by Constraints (BDDC) De Luxe method is developed for these plate problems discretized by Falk-Tu finite elements. In this new algorithm, subdomain Schur complements restricted to individual edges are used to define the average operator for the BDDC De Luxe method. It is established that the condition number of this preconditioned iterative method is bounded by C(1 + log (H/h))^2 if t, the thickness of the plate, is on the order of the element size h or smaller; H is the maximum diameter of the subdomains. The constant C is independent of the thickness t as well as H and h. Numerical results, which verify the theory, and a comparison with a traditional BDDC method are also provided.
Title: Cryptographic Security of Macaroon Authorization Credentials
Author(s): Lopez-Alt, Adriana
Abstract:
Macaroons, recently introduced by Birgisson et al., are authorization credentials that provide support for controlled sharing in decentralized systems. Macaroons are similar to cookies in that they are bearer credentials, but unlike cookies, macaroons include caveats that attenuate and contextually confine when, where, by who, and for what purpose authorization should be granted.
In this work, we formally study the cryptographic security of macaroons. We define macaroon schemes, introduce corresponding security definitions and provide several constructions. In particular, the MAC-based and certificate-based constructions outlined by Birgisson et al., can be seen as instantiations of our definitions. We also present a new construction that is privately-verifiable (similar to the MAC-based construction) but where the verifying party does not learn the intermediate keys of the macaroon, a problem already observed by Birgisson et al.
We also formalize the notion of a protocol for "discharging" third-party caveats and present a security definition for such a protocol. The encryption-based protocol outlined by Birgisson et al. can be seen as an instantiation of our definition, and we also present a new signature-based construction.
Finally, we formally prove the security of all constructions in the given security models.
Title: Foundations of a Formal Theory of Time Travel
Author(s): Morgenstern, Leora
Abstract:
Although the phenomenon of time travel is common in popular culture, there has been little work in AI on developing a formal theory of time travel. This paper develops such a theory. The paper introduces a branching-time ontology that maintains the classical restriction of forward movement through a temporal tree structure, but permits the representation of paths in which one can perform inferences about time-travel scenarios. Central to the ontology is the notion of an agent embodiment whose beliefs are equivalent to those of an agent who has time-traveled from the future. We show how to formalize an example scenario and demonstrate what it means for such a scenario to be motivated with respect to an agent embodiment.
Title: A BDDC Algorithm for Raviart-Thomas Vector Fields
Author(s): Oh, Duk-Soon; Widlund, Olof B.; Dohrmann, Clark R.
Abstract:
A BDDC preconditioner is defined by a coarse component, expressed in terms of primal constraints and a weighted average across the interface between the subdomains, and local components given in terms of Schur complements of local subdomain problems. A BDDC method for vector field problems discretized with Raviart-Thomas finite elements is introduced. Our method is based on a new type of weighted average developed to deal with more than one variable coefficient. A bound on the condition number of the preconditioned linear system is also provided which is independent of the values and jumps of the coefficients across the interface and has a polylogarithmic condition number bound in terms of the number of degrees of freedom of the individual subdomains. Numerical experiments for two and three dimensional problems are also presented, which support the theory and show the effectiveness of our algorithm even for certain problems not covered by our theory.
Title: Automating Separation Logic Using SMT
Author(s): Piskac, Ruzica; Wies, Thomas; Zufferey, Damien
Abstract:
Separation logic (SL) has gained widespread popularity because of its ability to succinctly express complex invariants of a program's heap configurations. Several specialized provers have been developed for decidable SL fragments. However, these provers cannot be easily extended or combined with solvers for other theories that are important in program verification, e.g., linear arithmetic. In this paper, we present a reduction of decidable SL fragments to a decidable first-order theory that fits well into the satisﬁability modulo theories (SMT) framework. We show how to use this reduction to automate satisfiability, entailment, frame inference, and abduction problems for separafion logic using SMT solvers. Our approach provides a simple method of integrating separation logic into existing verification tools that provide SMT backends, and an elegant way of combining SL fragments with other decidable first-order theories. We implemented this approach in a verification tool and applied it to heap-manipulating programs whose verification involves reasoning in theory combinations.
Title: Online Machine Learning Algorithms For Currency Exchange Prediction
Author(s): Soulas, Eleftherios; Shasha, Dennis
Abstract:
Using Machine Learning Algorithms to analyze and predict security price patterns is an area of active interest. Most practical stock traders combine computational tools with their intuitions and knowledge to make decisions.
This document explains the algorithms and discusses various metrics of accuracy. It validates the models by applying the model to a real-life trading price stream. Though it is very hard to replace the expertise of an experienced trader, software like this may enhance the trader's performance.
Title: Transaction chains: achieving serializability with low latency in geo-distributed storage systems
Author(s): Zhang, Yang; Power, Russell; Zhou, Siyuan; Sovran, Yair; Aguilera, Marcos K.; Li, Jinyang
Abstract:
Currently, users of geo-distributed storage systems face a hard choice between having serializable transactions with high latency, or limited or no transactions with low latency. We show that it is possible to obtain both serializable transactions and low latency, under two conditions. First, transactions are known ahead of time, permitting an a priori static analysis of conflicts. Second, transactions are structured as transaction chains consisting of a sequence of hops, each hop modifying data at one server. To demonstrate this idea, we built Lynx, a geo-distributed storage system that offers transaction chains, secondary indexes, materialized join views, and geo-replication. Lynx uses static analysis to determine if each hop can execute separately while preserving serializability.if so, a client needs wait only for the first hop to complete, which occurs quickly. To evaluate Lynx, we built three applications: an auction service, a Twitter-like microblogging site and a social networking site. These applications successfully use chains to achieve low latency operation and good throughput.
Title: A Note on the Complexity of Model-Checking Bounded Multi-Pushdown Systems
Author(s): Bansal, Kshitij; Demri, Stephane
Abstract:
In this note, we provide complexity characterizations of model checking multi-pushdown systems. Multi-pushdown systems model recursive concurrent programs in which any sequential process has a finite control. We consider three standard notions for boundedness: context boundedness, phase boundedness and stack ordering. The logical formalism is a linear-time temporal logic extending well-known logic CaRet but dedicated to multi-pushdown systems in which abstract operators (related to calls and returns) such as those for next-time and until are parameterized by stacks. We show that the problem is EXPTIME-complete for context-bounded runs and unary encoding of the number of context switches; we also prove that the problem is 2EXPTIME-complete for phase-bounded runs and unary encoding of the number of phase switches. In both cases, the value k is given as an input (whence it is not a constant of the model-checking problem), which makes a substantial difference in the complexity. In certain cases, our results improve previous complexity results.
Title: Hitting the Sweet Spot for Streaming Languages: Dynamic Expressivity with Static Optimization
Author(s): Soulé, Robert; Gordon, Michael I.; Amarasinghe, Saman; Grimm, Robert; Hirzel, Martin
Abstract:
Developers increasingly use stream processing languages to write applications that process large volumes of data with high throughput. Unfortunately, when choosing which stream processing language to use, they face a difficult choice. On the one hand, dynamically scheduled languages allow developers to write a wider range of applications, but cannot take advantage of many crucial optimizations. On the other hand, statically scheduled languages are extremely performant, but cannot express many important streaming applications.
This paper presents the design of a hybrid scheduler for stream processing languages. The compiler partitions the streaming application into coarse-grained subgraphs separated by dynamic rate boundaries. It then applies static optimizations to those subgraphs. We have implemented this scheduler as an extension to the StreamIt compiler, and evaluated its performance against three scheduling techniques used by dynamic systems: OS thread, demand, and no-op. Our scheduler not only allows the previously static version of StreamIt to run dynamic rate applications, but it outperforms the three dynamic alternatives. This demonstrates that our scheduler strikes the right balance between expressivity and performance for stream processing languages.
Title: Two-Level Overlapping Schwarz Algorithms for a Staggered Discontinuous Galerkin Method
Author(s): Chung, Eric T.; Kim, Hyea Hyun; Widlund, Olof B.
Abstract:
Two overlapping Schwarz algorithms are developed for a discontinuous Galerkin (DG) finite element approximation of second order scalar elliptic problems in both two and three dimensions. The discontinuous Galerkin formulation is based on a staggered discretization introduced by Chung and Engquist for the acoustic wave equation. Two types of coarse problems are introduced for the two-level Schwarz algorithms. The first is built on a non-overlapping subdomain partition, which allows quite general subdomain partitions, and the second on introducing an additional coarse triangulation that can also be quite independent of the fine triangulation. Condition number ounds are established and numerical results are presented.
Title: An Alternative Coarse Space for Irregular Subdomains and an Overlapping Schwarz Algorithm
Author(s): Dohrmann, Clark R.; Widlund, Olof B.
Abstract:
In earlier work on domain decomposition methods for elliptic problems in the plane, an assumption that each subdomain is triangular, or a union of a few coarse triangles, has often been made. This is similar to what is required in geometric multigrid theory and is unrealistic if the subdomains are produced by a mesh partitioner. In an earlier paper, coauthored with Axel Klawonn, the authors introduced a coarse subspace for an overlapping Schwarz method with one degree of freedom for each subdomain vertex and one for each subdomain edge. A condition number bound proportional to \((1+\log(H/h))^2(1+H/\delta)\) was established assuming only that the subdomains are John domains; here \(H/\delta\) measures the relative overlap between neighboring subdomains and \(H/h\) the maximum number of elements across individual subdomains. We were also able to relate the rate of convergence to a parameter in an isoperimetric inequality for the subdomains into which the domain of the problem has been partitioned.
In this paper, the dimension of the coarse subspace is decreased by using only one degree of freedom for each subdomain vertex; if all subdomains have three edges, this leads to a reduction of the dimension of the coarse subspace by approximately a factor four. In addition, the condition number bound is shown to be proportional to \((1+\log(H/h))(1+H/\delta)\) under a quite mild assumption on the relative length of adjacent subdomain edges.
In this study, the subdomains are assumed to be uniform in the sense of Peter Jones. As in our earlier work, the results are insensitive to arbitrary large jumps in the coefficients of the elliptic problem across the interface between the subdomains.
Numerical results are presented which confirm the theory and demonstrate the usefulness of the algorithm for a variety of mesh decompositions and distributions of material properties. It is also shown that the new algorithm often converges faster than the older one in spite of the fact that the dimension of the coarse space has been decreased considerably.
Title: Parsing All of C by Taming the Preprocessor
Author(s): Gazzillo, Paul; Grimm, Robert
Abstract:
Given the continuing popularity of C for building large-scale programs, such as Linux, Apache, and Bind, it is critical to provide effective tool support, including, for example, code browsing, bug finding, and automated refactoring. Common to all such tools is a need to parse C. But C programs contain not only the C language proper but also preprocessor invocations for file inclusion (#include), conditional compilation (#if, #ifdef, and so on), and macro definition/expansion (#define). Worse, the preprocessor is a textual substitution system, which is oblivious to C constructs and operates on individual tokens. At the same time, the preprocessor is indispensable for improving C's expressivity, abstracting over software/hardware dependencies, and deriving variations from the same code base. The x86 version of the Linux kernel, for example, depends on about 7,600 header files for file inclusion, 7,000 configuration variables for conditional compilation, and 520,000 macros for code expansion.
In this paper, we present a new tool for parsing all of C, including arbitrary preprocessor use. Our tool, which is called SuperC, is based on a systematic analysis of all interactions between lexing, preprocessing, and parsing to ensure completeness. It first lexes and preprocesses source code while preserving conditionals. It then parses the result using a novel variant of LR parsing, which automatically forks parsers when encountering a conditional and merges them again when reaching the same input in the same state. The result is a well-formed AST, containing static choice nodes for conditionals. While the parsing algorithm and engine are new, neither grammar nor LR parser table generator need to change. We discuss the results of our problem analysis, the parsing algorithm itself, the pragmatics of building a real-world tool, and a demonstration on the x86 version of the Linux kernel.
Title: Sharing is Caring: Combination of Theories
Author(s): Jovanovic, Dejan; Barrett, Clark
Abstract:
One of the main shortcomings of the traditional methods for combining theories is the complexity of guessing the arrangement of the variables shared by the individual theories. This paper presents a reformulation of the Nelson-Oppen method that takes into account explicit equality propagation and can ignore pairs of shared variables that the theories do not care about. We show the correctness of the new approach and present care functions for the theory of uninterpreted functions and the theory of arrays. The effectiveness of the new method is illustrated by experimental results demonstrating a dramatic performance improvement on benchmarks combining arrays and bit-vectors.
Title: Formalization and Automated Verification of RESTful Behavior
Author(s): Klein, Uri; Namjoshi, Kedar S.
Abstract:
REST is a software architectural style used for the design of highly scalable web applications. Interest in REST has grown rapidly over the past decade, spurred by the growth of open web APIs. On the other hand, there is also considerable confusion surrounding REST: many examples of supposedly RESTful APIs violate key REST constraints. We show that the constraints of REST and of RESTful HTTP can be pre- cisely formulated within temporal logic. This leads to methods for model checking and run-time verfication of RESTful behavior. We formulate several relevant verification questions and analyze their complexity.
Title: Effective Synthesis of Asynchronous Systems from GR(1) Specifications
Author(s): Klein, Uri; Piterman, Nir; Pnueli, Amir
Abstract:
We consider automatic synthesis from linear temporal logic specifications for asynchronous systems. We aim the produced reactive systems to be used as software in a multi-threaded environment. We extend previous reduction of asynchronous synthesis to synchronous synthesis to the setting of multiple input and multiple output variables. Much like synthesis for synchronous designs, this solution is not practical as it requires determinization of automata on infinite words and solution of complicated games. We follow advances in synthesis of synchronous designs, which restrict the handled specifications but achieve scalability and efficiency. We propose a heuristic that, in some cases, maintains scalability for asynchronous synthesis. Our heuristic can prove that specifications are realizable and extract designs. This is done by a reduction to synchronous synthesis that is inspired by the theoretical reduction.
Title: Domain Decomposition Methods for Reissner-Mindlin Plates Discretized with the Falk-Tu Elements
Author(s): Lee, Jong Ho
Abstract:
The Reissner-Mindlin plate theory models a thin plate with thickness t. The condition number of finite element approximations of this model deteriorates badly as the thickness t of the plate converges to 0. In this thesis, we develop an overlapping domain decomposition method for the Reissner-Mindlin plate model discretized by Falk-Tu elements with a convergence rate which does not deteriorate when t converges to 0. We use modern overlapping methods which use the Schur complements to define coarse basis functions and show that the condition number of this overlapping method is bounded by C(1 + H/delta )^3*(1 + log(H/h))^2. Here H is the maximum diameter of the subdomains, delta the size of overlap between subdomains, and h the element size. Numerical examples are provided to confirm the theory. We also modify the overlapping method to develop a BDDC method for the Reissner-Mindlin model. We establish numerically an extension lemma to obtain a constant bound and an edge lemma to obtain a C(1 + log(H/h))^2 bound. Given such bounds, the condition number of this BDDC method is shown to be bounded by C(1 + log(H/h))^2.
Title: Domain Decomposition Methods for Raviart-Thomas Vector Fields
Author(s): Oh, Duk-Soon
Abstract:
Raviart-Thomas finite elements are very useful for problems posed in H(div) since they are H(div)-conforming. We introduce two domain decomposition methods for solving vector field problems posed in H(div) discretized by Raviart-Thomas finite elements.
A two-level overlapping Schwarz method is developed. The coarse part of the preconditioner is based on energy-minimizing extensions and the local parts consist of traditional solvers on overlapping subdomains. We prove that our method is scalable and that the condition number grows linearly with the logarithm of the number of degrees of freedom in the individual subdomains and linearly with the relative overlap between the overlapping subdomains. The condition number of the method is also independent of the values and jumps of the coefficients across the interface between subdomains. We provide numerical results to support our theory.
We also consider a balancing domain decomposition by constraints (BDDC) method. The BDDC preconditioner consists of a coarse part involving primal constraints across the interface between subdomains and local parts related to the Schur complements corresponding to the local subdomain problems. We provide bounds of the condition number of the preconditioned linear system and suggest that the condition number has a polylogarithmic bound in terms of the number of degrees of freedom in the individual subdomains from our numerical experiments for arbitrary jumps of the coefficients across the subdomain interfaces.
Title: From a Calculus to an Execution Environment for Stream Processing
Author(s): Soulé, Robert; Hirzel, Martin; Gedik, Bugra; Grimm, Robert
Abstract:
At one level, this paper is about River, a virtual execution environment for stream processing. Stream processing is a paradigm well-suited for many modern data processing systems that ingest high-volume data streams from the real world, such as audio/video streaming, high-frequency trading, and security monitoring. One attractive property of stream processing is that it lends itself to parallelization on multi-cores, and even to distribution on clusters when extreme scale is required. Stream processing has been coevolved by several communities, leading to diverse languages with similar core concepts. Providing a common execution environment reduces language development effort and increases portability. We designed River as a practical realization of Brooklet, a calculus for stream processing. So at another level, this paper is about a journey from theory (the calculus) to practice (the execution environment). The challenge is that, by definition, a calculus abstracts away all but the most central concepts. Hence, there are several research questions in concretizing the missing parts, not to mention a significant engineering effort in implementing them. But the effort is well worth it, because the benefit of using a calculus as a foundation is that it yields clear semantics and proven correctness results.
Title: Design and Results of the 4th Annual Satisfiability Modulo Theories Competition (SMT-COMP 2008)
Author(s): Barrett, Clark; Deters, Morgan; Oliveras, Albert; Stump, Aaron
Abstract:
The Satisfiability Modulo Theories Competition (SMT-COMP) is an annual competition aimed at stimulating the advance of the state-of-the-art techniques and tools developed by the Satisfiability Modulo Theories (SMT) community. As with the first three editions, SMT-COMP 2008 was held as a satellite event of CAV 2008, held July 7-14, 2008. This report gives an overview of the rules, competition format, benchmarks, participants and results of SMT-COMP 2008.
Title: Coordination Mechanisms for Weighted Sum of Completion Times
Author(s): Cole, Richard; Gkatzelis, Vasilis; Mirrokni, Vahab
Abstract:
We study policies aiming to minimize the weighted sum of completion times of jobs in the context of coordination mechanisms for selfish scheduling problems. Our goal is to design local policies that achieve a good price of anarchy in the resulting equilibria for unrelated machine scheduling. In short, we present the first constant-factor-approximate coordination mechanisms for this model.
First, we present a generalization of the ShortestFirst policy for weighted jobs, called SmithRule; we prove that it achieves an approximation ratio of 4 and we show that any set of non-preemptive ordering policies can result in equilibria with approximation ratio at least 3 even for unweighted jobs. Then, we present ProportionalSharing, a preemptive strongly local policy that beats this lower bound of 3; we show that this policy achieves an approximation ratio of 2.61 for the weighted sum of completion times and that the EqualSharing policy achieves an approximation ratio of 2.5 for the (unweighted) sum of completion times. Furthermore, we show that ProportionalSharing induces potential games (in which best-response dynamics converge to pure Nash equilibria).
All of our upper bounds are for the robust price of anarchy, defined by Roughgarden [36], so they naturally extend to mixed Nash equilibria, correlated equilibria, and regret minimization dynamics. Finally, we prove that our price of anarchy bound for ProportionalSharing can be used to design a new combinatorial constant-factor approximation algorithm minimizing weighted completion time for unrelated machine scheduling.
Title: An Iterative Substructuring Algorithm for Two-dimensional Problems in H(curl)
Author(s): Dohrmann, Clark R.; Widlund, Olof B.
Abstract:
A domain decomposition algorithm, similar to classical iterative substructuring algorithms, is presented for two-dimensional problems in the space H0(curl). It is defined in terms of a coarse space and local subspaces associated with individual edges of the subdomains into which the domain of the problem has been subdivided. The algorithm differs from others in three basic respects. First, it can be implemented in an algebraic manner that does not require access to individual subdomain matrices or a coarse discretization of the domain; this is in contrast to algorithms of the BDDC, FETIâDP, and classical twoâlevel overlapping Schwarz families. Second, favorable condition number bounds can be established over a broader range of subdomain material properties than in previous studies. Third, we are able to develop theory for quite irregular subdomains and bounds for the condition number of our preconditioned conjugate gradient algorithm, which depend only on a few geometric parameters.
The coarse space for the algorithm is based on simple energy minimization concepts, and its dimension equals the number of subdomain edges. Numerical results are presented which confirm the theory and demonstrate the usefulness of the algorithm for a variety of mesh decompositions and distributions of material properties.
Title: Information Extraction on High-School Level Chemistry Labs
Author(s): Galron, Daniel
Abstract:
In this report we present a feasibility study on automatically interpreting instructions found in a set of high school chemistry labs, and discuss the role of deep domain knowledge in the interpretation. We define the task of sentence-level interpretation as the extraction of symbolic representations of the sentence semantics. In the broader scope, the sentence-level semantics of a particular sentence will be resolved with semantics from other sentences in the lab along with domain knowledge to disambiguate and reason about a physical system. The task of general automatic sentence-level interpretation is a difficult one. The general problem is not very well defined in the natural language processing research community, and few researchers have studied the problem. The common practice is to decompose the problem into subtasks, such as resolving coreferences of noun phrases, labeling the semantic roles of arguments to predicates, and identifying word categories. We describe a pipeline combining the subtasks described, along with parsing, to create a system capable of extracting sentence-level semantics. All the systems used for the subtask are found off-the-shelf, and we should stress that such a system will be highly-error prone for reasons we discuss. Finally, we do a close study of the chemistry lab corpus, and analyze each instruction to determine the feasibility of its automatic interpretation and the role of deep domain knowledge in its disambiguation and understanding.
Title: Polite Theories Revisited
Author(s): Jovanovic, Dejan; Barrett, Clark
Abstract:
The classic method of Nelson and Oppen for combining decision procedures requires the theories to be stably-infnite. Unfortunately, some important theories do not fall into this category (e.g. the theory of bit-vectors). To remedy this problem, previous work introduced the notion of polite theories. Polite theories can be combined with any other theory using an extension of the Nelson-Oppen approach. In this paper we revisit the notion of polite theories, fxing a subtle flaw in the original definition. We give a new combination theorem which specifies the degree to which politeness is preserved when combining polite theories. We also give conditions under which politeness is preserved when instantiating theories by identifying two sorts. These results lead to a more general variant of the theorem for combining multiple polite theories.
Title: The Temporal Logic of Token Causes
Author(s): Kleinberg, Samantha; Mishra, Bud
Abstract:
While type causality helps understand general relationships such as the etiology of a disease (smoking causing lung cancer), token causality aims to explain causal connections in specific instantiated events, such as the diagnosis of a specific patient (Ravi's developing lung cancer after a 20-year smoking habit). Understanding why something happened, as in these examples, is central to reasoning in such diverse cases as the diagnosis of patients, understanding why the US financial market collapsed in 2007 and finding a causal explanation for Obama's victory over Clinton in the US primary. However, despite centuries of work in philosophy and decades of research in computer science, the problem of how to rigorously formalize token causality and how to automate such reasoning has remained unsolved. In this paper, we show how to use type-level causal relationships, represented as temporal logic formulas, together with philosophical principles, to reason about these token-level cases. Finally, we show how this method can correctly reason about examples that have traditionally proven difficult for both computational and philosophical theories to handle.
Title: An overlapping domain decomposition method for the Reissner-Mindlin Plate with the Falk-Tu Elements
Author(s): Lee, Jong Ho
Abstract:
The Reissner-Mindlin plate theory models a thin plate with thickness t. The condition numbers of finite element approximations of this model deteriorate badly as the thickness t of the plate converges to 0. In this paper, we develop an overlapping domain decomposition method for the Reissner-Mindlin plate model discretized by the Falk-Tu elements with the convergence rate which does not deteriorate when t converges to 0. It is shown that the condition number of this overlapping method is bounded by C(1+ H/delta)^3(1 +logH/h)^2. Here H is the maximum diameter of the subdomains, delta the size of overlap between subdomains, and h the element size. Numerical examples are provided to confirm the theory.
Title: An Overlapping Schwarz Algorithm for Raviart-Thomas Vector Fields with Discontinuous Coefficients
Author(s): Oh, Duk-Soon
Abstract:
Overlapping Schwarz methods form one of two major families of domain decomposition methods. We consider a two-level overlapping Schwarz method for Raviart-Thomas vector fields. The coarse part of the preconditioner is based on the energy-minimizing extensions and the local parts are based on traditional solvers on overlapping subdomains. We show that the condition number grows linearly with the logarithm of the number of degrees of freedom in the individual subdomains and linearly with the relative overlap between the overlapping subdomains. The condition number of the method is also independent of the values and jumps of the coefficients. Numerical results for 2D and 3D problems, which support the theory, are also presented.
Title: BDDC preconditioners for spectral element discretizations of almost incompressible elasticity in three dimensions
Author(s): Pavarino, Luca F.; Widlund, Olof B.; Zampini, Stefano
Abstract:
BDDC algorithms are constructed and analyzed for the system of almost incompressible elasticity discretized with Gauss-Lobatto-Legendre spectral elements in three dimensions. Initially mixed spectral elements are employed to discretize the almost incompressible elasticity system, but a positive definite reformulation is obtained by eliminating all pressure degrees of freedom interior to each subdomain into which the spectral elements have been grouped. Appropriate sets of primal constraints can be associated with the subdomain vertices, edges, and faces so that the resulting BDDC methods have a fast convergence rate independent of the almost incompressibility of the material. In particular, the condition number of the BDDC preconditioned operator is shown to depend only weakly on the polynomial degree \(n\), the ratio \(H/h\) of subdomain and element diameters, and the inverse of the inf-sup constants of the subdomains and the underlying mixed formulation, while being scalable, i.e., independent of the number of subdomains and robust, i.e., independent of the Poisson ratio and Young's modulus of the material considered. These results also apply to the related FETI-DP algorithms defined by the same set of primal constraints. Numerical experiments carried out on parallel computing systems confirm these results.
Title: An Empirical Bayesian Interpretation and Generalization of NL-means
Author(s): Raphan, Martin; Simoncelli, Eero P.
Abstract:
A number of recent algorithms in signal and image processing are based on the empirical distribution of localized patches. Here, we develop a nonparametric empirical Bayesian estimator for recovering an image corrupted by additive Gaussian noise, based on fitting the density over image patches with a local exponential model. The resulting solution is in the form of an adaptively weighted average of the observed patch with the mean of a set of similar patches, and thus both justifies and generalizes the recently proposed nonlocal-means (NL-means) method for image denoising. Unlike NL-means, our estimator includes a dependency on the size of the patch similarity neighborhood, and we show that this neighborhood size can be chosen in such a way that the estimator converges to the optimal Bayes least squares estimator as the amount of data grows. We demonstrate the increase in performance of our method compared to NL-means on a set of simulated examples.
Title: Henrique Andrade, Vibhore Kumar, and Kun-Lung Wu, A Universal Calculus for Stream Processing Languages
Author(s): Soulé, Robert; Hirzel, Martin; Grimm, Robert; Gedik, Buğra
Abstract:
Stream processing applications such as algorithmic trading, MPEG processing, and web content analysis are ubiquitous and essential to business and entertainment. Language designers have developed numerous domain-specific languages that are both tailored to the needs of their applications, and optimized for performance on their particular target platforms. Unfortunately, the goals of generality and performance are frequently at odds, and prior work on the formal semantics of stream processing languages does not capture the details necessary for reasoning about implementations. This paper presents Brooklet, a core calculus for stream processing that allows us to reason about how to map languages to platforms and how to optimize stream programs. We translate from three representative languages, CQL, StreamIt, and Sawzall, to Brooklet, and show that the translations are correct. We formalize three popular and vital optimizations, data-parallel computation, operator fusion, and operator re-ordering, and show under which conditions they are correct. Language designers can use Brooklet to specify exactly how new features or languages behave. Language implementors can use Brooklet to show exactly under which circumstances new optimizations are correct. In ongoing work, we are developing an intermediate language for streaming that is based on Brooklet. We are implementing our intermediate language on System S, IBM's high-performance streaming middleware.
Title: Learning Image Decompositions with Hierarchical Sparse Coding
Author(s): Zeiler, Matthew D.; Fergus, Rob
Abstract:
We present a hierarchical model that learns image decompositions via alternating layers of convolutional sparse coding and max pooling. When trained on natural images, the layers of our model capture image information in a variety of forms: low-level edges, mid-level edge junctions, high-level object parts and complete objects. To build our model we rely on a novel inference scheme that ensures each layer reconstructs the input, rather than just the output of the layer directly beneath, as is common with existing hierarchical approaches. This scheme makes it possible to robustly learn multiple layers of representation and we show a model with 4 layers, trained on images from the Caltech-101 dataset. We use our model to produce image decompositions that, when used as input to standard classification schemes, give a significant performance gain over low-level edge features and yield an overall performance competitive with leading approaches.
Title: Hybrid Domain Decomposition Algorithms for Compressible and Almost Incompressible Elasticity
Author(s): Dohrmann, Clark R.; Widlund, Olof B.
Abstract:
Overlapping Schwarz methods are considered for mixed finite element approximations of linear elasticity, with discontinuous pressure spaces, as well as for compressible elasticity approximated by standard conforming finite elements. The coarse components of the preconditioners are based on %spaces, with a fixed number of degrees of freedom per subdomain, spaces, with a number of degrees of freedom per subdomain which is uniformly bounded, and which are similar to those previously developed for scalar elliptic problems and domain decomposition methods of iterative substructuring type, i.e., methods based on non-overlapping decompositions of the domain. The local components of the new preconditioners are based on solvers on a set of overlapping subdomains.
Title: A numerical method for simulating the dynamics of 3D axisymmetric vesicles suspended in viscous flows
Author(s): K. Veerapaneni, Shravan; Gueyerer, Denis; Biros, George; Zorin, Denis
Abstract:
We extend "A boundary integral method for simulating the dynamics of inextensible vesicles suspended in a viscous fluid in 2D", Veerapaneni et al. Journal of Computational Physics, 228(7), 2009 to the case of three dimensional axisymmetric vesicles of spherical or toroidal topology immersed in viscous flows. Although the main components of the algorithm are similar in spirit to the 2D case.spectral approximation in space, semi-implicit time-stepping scheme.the main differences are that the bending and viscous force require new analysis, the linearization for the semi-implicit schemes must be rederived, a fully implicit scheme must be used for the toroidal topology to eliminate a CFL-type restriction, and a novel numerical scheme for the evaluation of the 3D Stokes single-layer potential on an axisymmetric surface is necessary to speed up the calculations. By introducing these novel components, we obtain a time-scheme that experimentally is unconditionally stable, has low cost per time step, and is third-order accurate in time. We present numerical results to analyze the cost and convergence rates of the scheme. To verify the solver, we compare it to a constrained variational approach to compute equilibrium shapes that does not involve interactions with a viscous fluid. To illustrate the applicability of method, we consider a few vesicle-flow interaction problems: the sedimentation of a vesicle, interactions of one and three vesicles with a background Poiseuille flow.
Title: A Hybrid Domain Decomposition Method and its Applications to Contact Problems
Author(s): Lee, Jungho
Abstract:
Our goal is to solve nonlinear contact problems. We consider bodies in contact with each other divided into subdomains, which in turn are unions of elements. The contact surface between the bodies is unknown a priori, and we have a nonpen-etration condition between the bodies, which is essentially an inequality constraint. We choose to use an active set method to solve such problems, which has both outer iterations in which the active set is updated, and inner iterations in which a (linear) minimization problem is solved on the current active face. In the first part of this dissertation, we review the basics of domain decomposition methods. In the second part, we consider how to solve the inner minimization problems. Using an approach based purely on FETI algorithms with only Lagrange multipliers as unknowns, as has been developed by the engineering community, does not lead to a scalable algorithm with respect to the number of subdomains in each body. We prove that such an algorithm has a condition number estimate which depends linearly on the number of subdomains across a body; numerical experiments suggest that this is the best possible bound. We also consider a new method based on the saddle point formulation of the FETI methods with both displacement vectors and Lagrange multipliers as unknowns. The resulting system is solved with a block-diagonal preconditioner which combines the one-level FETIand the BDDC methods. This approach allows the use of inexact solvers. We show that this new method is scalable with respect to the number of subdomains, and that its convergence rate depends only logarithmically on the number of degrees of freedom of the subdomains and bodies. In the last part of this dissertation, a model contact problem is solved by two approaches. The first one is a nonlinear algorithm which combines an active set method and the new method of Chapter 4. We also present a novel way of finding an initial active set. The second one uses the SMALBE algorithm, developed by Dostal et al. We show that the former approach has advantages over the latter.
Title: Learning least squares estimators without assumed priors or supervision
Author(s): Raphan, Martin; Simoncelli, Eero P.
Abstract:
The two standard methods of obtaining a least-squares optimal estimator are (1) Bayesian estimation, in which one assumes a prior distribution on the true values and combines this with a model of the measurement process to obtain an optimal estimator, and (2) supervised regression, in which one optimizes a parametric estimator over a training set containing pairs of corrupted measurements and their associated true values. But many real-world systems do not have access to either supervised training examples or a prior model. Here, we study the problem of obtaining an optimal estimator given a measurement process with known statistics, and a set of corrupted measurements of random values drawn from an unknown prior. We develop a general form of nonparametric empirical Bayesian estimator that is written as a direct function of the measurement density, with no explicit reference to the prior. We study the observation conditions under which such "prior-free" estimators may be obtained, and we derive specific forms for a variety of different corruption processes. Each of these prior-free estimators may also be used to express the mean squared estimation error as an expectation over the measurement density, thus generalizing Stein's unbiased risk estimator (SURE) which provides such an expression for the additive Gaussian noise case. Minimizing this expression over measurement samples provides an "unsupervised regression" method of learning an optimal estimator from noisy measurements in the absence of clean training data. We show that combining a prior-free estimator with its corresponding unsupervised regression form produces a generalization of the "score matching" procedure for parametric density estimation, and we develop an incremental form of learning for estimators that are written as a linear combination of nonlinear kernel functions. Finally, we show through numerical simulations that the convergence of these estimators can be comparable to their supervised or Bayesian counterparts.
Title: Body Signature Recognition
Author(s): Williams, George; Bregler, Christoph; Hackney, Peggy; Rosenthal, Sally; McDowall, Ian; Smolskiy, Kirill
Abstract:
This paper describes a new visual representation of motion that is used to learn and classify body language - what we call .body signatures. - of people while they are talking. We applied this technique to several hours of internet videos and television broadcasts that include US politicians and leaders from Germany, France, Iran, Russia, Pakistan, and India, and public figures such as the Pope, as well as numerous talk show hosts and comedians. Dependent on the complexity of the task, we show up to 80% recognition performance and clustering into broader body language categories.
Title: General Algorithms for Testing the Ambiguity of Finite Automata
Author(s): Allauzen, Cyril; Mohri, Mehryar; Rastogi, Ashish
Abstract:
This paper presents efficient algorithms for testing the finite, polynomial, and exponential ambiguity of finite automata with \(\epsilon\)-transitions. It gives an algorithm for testing the exponential ambiguity of an automaton \(A\) in time \(O(|A|_E2)\), and finite or polynomial ambiguity in time \(O(|A|_E3)\). These complexities significantly improve over the previous best complexities given for the same problem. Furthermore, the algorithms presented are simple and are based on a general algorithm for the composition or intersection of automata. We also give an algorithm to determine the degree of polynomial ambiguity of a finite automaton \(A\) that is polynomially ambiguous in time \(O(|A|_E3)\). Finally, we present an application of our algorithms to an approximate computation of the entropy of a probabilistic automaton.
Title: Competitive Hybridization Model
Author(s): Cherepinsky, Vera; Hashmi, Ghazala; Seul, Michael; Mishra, Bud
Abstract:
Microarray technology, in its simplest form, allows one to gather abundance data for target DNA molecules, associated with genomes or gene-expressions, and relies on hybridizing the target to many short probe oligonucleotides arrayed on a surface. While for such multiplexed reactions conditions are optimized to make the most of each individual probe-target interaction, subsequent analysis of these experiments is based on the implicit assumption that a given experiment gives the same result regardless of whether it was conducted in isolation or in parallel with many others. It has been discussed in the literature that this assumption is frequently false, and its validity depends on the types of probes and their interactions with each other. We present a detailed physical model of hybridization as a means of understanding probe interactions in a multiplexed reaction. The model is formulated as a system of ordinary di.erential equations (ODE.s) describing kinetic mass action and conservation-of-mass equations completing the system.
We examine pair-wise probe interactions in detail and present a model of .competition. between the probes for the target.especially, when target is in short supply. These e.ects are shown to be predictable from the a.nity constants for each of the four probe sequences involved, namely, the match and mismatch for both probes. These a.nity constants are calculated from the thermodynamic parameters such as the free energy of hybridization, which are in turn computed according to the nearest neighbor (NN) model for each probe and target sequence.
Simulations based on the competitive hybridization model explain the observed variability in the signal of a given probe when measured in parallel with di.erent groupings of other probes or individually. The results of the simulations are used for experiment design and pooling strategies, based on which probes have been shown to have a strong e.ect on each other.s signal in the in silico experiment. These results are aimed at better design of multiplexed reactions on arrays used in genotyping (e.g., HLA typing, SNP or CNV detection, etc.) and mutation analysis (e.g., cystic .brosis, cancer, autism, etc.).
Title: Factor Graphs for Relational Regression
Author(s): Chopra, Sumit; Thampy, Trivikaraman; Leahy, John; Caplin, Andrew; LeCun, Yann
Abstract:
Traditional methods for supervised learning involve treating the input data as a set of independent, identically distributed samples. However, in many situations, the samples are related in such a way that variables associated with one sample depend on other samples. We present a new form of relational graphical model that, in addition to capturing the dependence of the output on sample specific features, can also capture hidden relationships among samples through a non-parametric latent manifold. Learning in the proposed graphical model involves simultaneously learning the non-parametric latent manifold along with a non-relational parametric model. Efficient inference algorithms are introduced to accomplish this task. The method is applied to the prediction of house prices. A non-relational model predicts an ``intrinsic" price of the house which depends only on its individual characteristics, and a relational model estimates a hidden surface of ``desirability'' coefficients which links the price of a house to that of similar houses in the neighborhood.
Title: Pointer Analysis, Conditional Soundness, and Proving the Absence of Errors
Author(s): Conway, Christopher L.; Dams, Dennis; Namjoshi, Kedar S.; Barrett, Clark
Abstract:
It is well known that the use of points-to information can substantially improve the accuracyof a static program analysis. Commonly used algorithms for computing points-to information are known to be sound only for memory-safe programs. Thus, it appears problematic to utilize points-to information to verify the memory safety property without giving up soundness. We show that a sound combination is possible, even if the points-to information is computed separately and only conditionally sound. This result is based on a refined statement of the soundness conditions of points-to analyses and a general mechanism for composing conditionally sound analyses.
Title: An Overlapping Schwarz Algorithm for Almost Incompressible Elasticity
Author(s): Dohrmann, Clark R.; Widlund, Olof B.
Abstract:
Overlapping Schwarz methods are extended to mixed finite element approximations of linear elasticity which use discontinuous pressure spaces. The coarse component of the preconditioner is based on a low-dimensional space previously developed for scalar elliptic problems and a domain decomposition method of iterative substructuring type, i.e., a method based on non-overlapping decompositions of the domain, while the local components of the preconditioner are based on solvers on a set of overlapping subdomains.
A bound is established for the condition number of the algorithm which grows in proportion to the square of the logarithm of the number of degrees of freedom in individual subdomains and the third power of the relative overlap between the overlapping subdomains, and which is independent of the Poisson ratio as well as jumps in the Lam\'e parameters across the interface between the subdomains. A positive definite reformulation of the discrete problem makes the use of the standard preconditioned conjugate gradient method straightforward. Numerical results, which include a comparison with problems of compressible elasticity, illustrate the findings.
Title: Modal Logic, Temporal Models and Neural Circuits: What Connects Them
Author(s): Kleinberg, Samantha; Antoniotti, Marco; Ramakrishnan, Naren; Mishra, Bud
Abstract:
Traditional methods for supervised learning involve treating the input data as a set of independent, identically distributed samples. However, in many situations, the samples are related in such a way that variables associated with one sample depend on other samples. We present a new form of relational graphical model that, in addition to capturing the dependence of the output on sample specific features, can also capture hidden relationships among samples through a non-parametric latent manifold.
Learning in the proposed graphical model involves simultaneously learning the non-parametric latent manifold along with a non-relational parametric model. Efficient inference algorithms are introduced to accomplish this task. The method is applied to the prediction of house prices. A non-relational model predicts an ``intrinsic" price of the house which depends only on its individual characteristics, and a relational model estimates a hidden surface of ``desirability'' coefficients which links the price of a house to that of similar houses in the neighborhood.
Title: Extension of Two-level Schwarz Preconditioners to Symmetric Indefinite Problems
Author(s): Leong, Alan
Abstract:
Two-level overlapping Schwarz preconditioners are extended for use for a class of large, symmetric, indefinite systems of linear algebraic equations. The focus is on an enriched coarse space with additional basis functions built from free space solutions of the underlying partial differential equation. GMRES is used to accelerate the convergence of preconditioned systems. Both additive and hybrid Schwarz methods are considered and reports are given on extensive numerical experiments.
Title: Nonlinear extraction of 'Independent Components' of elliptically symmetric densities using radial Gaussianization
Author(s): Lyu, Siwei; Simoncelli, Eero P.
Abstract:
We consider the problem of efficiently encoding a signal by transforming it to a new representation whose components are statistically independent (also known as factorial). A widely studied family of solutions, generally known as independent components analysis (ICA), exists for the case when the signal is generated as a linear transformation of independent non-Gaussian sources. Here, we examine a complementary case, in which the signal density is non-Gaussian but elliptically symmetric. In this case, no linear transform suffices to properly decompose the signal into independent components, and thus, the ICA methodology fails. We show that a simple nonlinear transformation, which we call radial Gaussianization (RG), provides an exact solution for this case. We then examine this methodology in the context of natural image statistics, demonstrating that joint statistics of spatially proximal coefficients in a multi-scale image representation are better described as elliptical than factorial. We quantify this by showing that reduction in dependency achieved by RG is far greater than that achieved by ICA, for local spatial neighborhoods. We also show that the RG transformation may be closely approximated by divisive normalization transformations that have been used to model the nonlinear response properties of visual neurons, and that have been shown to reduce dependencies between multi-scale image coefficients.
Title: An Efficient Reduction of Ranking to Classification
Author(s): Ailon, Nir; Mohri, Mehryar
Abstract:
This paper describes an efficient reduction of the learning problem of ranking to binary classification. As with a recent result of Balcan et al. (2007), the reduction guarantees an average pairwise misranking regret of at most \(2r\) using a binary classifier with regret \(r\). However, our reduction applies to a broader class of ranking loss functions, admits a simpler proof, and the expected running time complexity of our algorithm in terms of number of calls to a classifier or preference function is improved from \(\Omega(n2)\) to \(O(n \log n)\). Furthermore, when the top \(k\) ranked elements only are required (\(k \ll n\)), as in many applications in information extraction or search engines, the time complexity of our algorithm can be further reduced to \(O(k \log k + n)\). Our reduction and algorithm are thus practical for realistic applications where the number of points to rank exceeds several thousands. Much of our results also extend beyond the bipartite case previously studied.
Title: N-Way Composition of Weighted Finite-State Transducers
Author(s): Allauzen, Cyril; Mohri, Mehryar
Abstract:
Composition of weighted transducers is a fundamental algorithm used in many applications, including for computing complex edit-distances between automata, or string kernels in machine learning, or to combine different components of a speech recognition, speech synthesis, or information extraction system. We present a generalization of the composition of weighted transducers, \emph{$n$-way composition}, which is dramatically faster in practice than the standard composition algorithm when combining more than two transducers. The expected worst-case complexity of our algorithm for composing three transducers $T_1$, $T_2$, and $T_3$\ignore{ depending on the strategy used, is $O(|T_1|_E|T_2|_Q|T_3|_E + |T|)$ or $(|T_1|_Q|T_2|_E|T_3|_Q + |T|)$, } is $O(\min(|T_1|_E|T_2|_Q|T_3|_E, |T_1|_Q|T_2|_E|T_3|_Q) + |T|)$, where $T$ is the result of that composition and $|T_i| = |T_i|_Q + |T_i|_E$ with $|T_i|_Q$ the number of states and $|T_i|_E$ the number of transitions of $T_i$, $i = 1, 2, 3$. In many cases, this significantly improves on the complexity of standard composition. Our algorithm also leads to a dramatically faster composition in practice. Furthermore, standard composition can be obtained as a special case of our algorithm. We report the results of several experiments demonstrating this improvement. These theoretical and empirical improvements significantly enhance performance in the applications already mentioned.
Title: On the Computation of the Relative Entropy of Probabilistic Automata
Author(s): Cortes, Corinna; Mohri, Mehryar; Rastogi, Ashish; Riley, Michael
Abstract:
We present an exhaustive analysis of the problem of computing the relative entropy of two probabilistic automata. We show that the problem of computing the relative entropy of unambiguous probabilistic automata can be formulated as a shortest-distance problem over an appropriate semiring, give efficient exact and approximate algorithms for its computation in that case, and report the results of experiments demonstrating the practicality of our algorithms for very large weighted automata. We also prove that the computation of the relative entropy of arbitrary probabilistic automata is PSPACE-complete.
The relative entropy is used in a variety of machine learning algorithms and applications to measure the discrepancy of two distributions. We examine the use of the symmetrized relative entropy in machine learning algorithms and show that, contrarily to what is suggested by a number of publications, the symmetrized relative entropy is neither positive definite symmetric nor negative definite symmetric, which limits its use and application in kernel methods. In particular, the convergence of training for learning algorithms is not guaranteed when the symmetrized relative entropy is used directly as a kernel, or as the operand of an exponential as in the case of Gaussian Kernels.
Finally, we show that our algorithm for the computation of the entropy of an unambiguous probabilistic automaton can be generalized to the computation of the norm of an unambiguous probabilistic automaton by using a monoid morphism. In particular, this yields efficient algorithms for the computation of the Lp -norm of a probabilistic automaton.
Title: Magnitude-Preserving Ranking Algorithms
Author(s): Cortes, Corinna; Mohri, Mehryar; Rastogi, Ashish
Abstract:
This paper studies the learning problem of ranking when one wishes not just to accurately predict pairwise ordering but also preserve the magnitude of the preferences or the difference between ratings, a problem motivated by its crucial importance in the design of search engines, movie recommendation, and other similar ranking systems. We describe and analyze several algorithms for this problem and give stability bounds for their generalization error, extending previously known stability results to non- bipartite ranking and magnitude of preference-preserving algorithms. We also report the results of experiments comparing these algorithms on several datasets and contrast these results with those obtained using an AUC-maximization algorithm.
Title: Domain Decomposition for Less Regular Subdomains: Overlapping Schwarz in Two Dimensions
Author(s): Dohrmann, Clark R.; Klawonn, Axel; Widlund, Olof B.
Abstract:
In the theory of domain decomposition methods, it is often assumed that each subdomain is the union of a small set of coarse triangles or tetrahedra. In this study, extensions to the existing theory which accommodates subdomains with much less regular shape are presented; the subdomains are only required to be John domains. Attention is focused on overlapping Schwarz preconditioners for problems in two dimensions with a coarse space component of the preconditioner which allows for good results even for coefficients which vary considerably. It is shown that the condition number of the domain decomposition method is bounded by C(1 + H/δ)(1 + log(H/h))
^{2}
, where the constant C independent of the number of subdomains and possible jumps in coefficients between subdomains. Numerical examples are provided which confirm the theory and demonstrate very good performance of the method for a variety of subregions including those obtained when a mesh partitioner is used for the domain decomposition.
Title: Typical: Taking the Tedium Out of Typing
Author(s): Grimm, Robert; Harris, Laune; Le, Anh
Abstract:
The implementation of real-world type checkers requires a non-trivial engineering effort. The resulting code easily comprises thousands of lines, which increases the probability of software defects in a component critical to compiler correctness. To make type checkers easier to implement and extend, this paper presents Typical, a domain-specific language and compiler that directly and concisely captures the structure of type systems. Our language builds on the functional core for ML to represent syntax trees and types as variants and to traverse them with pattern matches. It then adds declarative constructs for common type checker concerns, such as scoping rules, namespaces, and constraints on types. It also integrates error checking and reporting with other constructs to promote comprehensive error management. We have validated our system with two real-world type checkers written in Typical, one for Typical itself and the other for C.
Title: Declarative Syntax Tree Engineering* Or, One Grammar to Rule Them All
Author(s): Grimm, Robert
Abstract:
Grammars for many parser generators not only specify a language's syntax but also the corresponding syntax tree. Unfortunately, most parser generators pick a somewhat arbitrary combination of features from the design space for syntax trees and thus lock in specific trade-offs between expressivity, safety, and performance. This paper discusses the three major axes of the design space---specification within or outside a grammar, concrete or abstract syntax trees, and dynamically or statically typed trees---and their impact. It then presents algorithms for automatically realizing all major choices from the same, unmodified grammar with inline syntax tree declarations. In particular, this paper shows how to automatically (1) extract a separate syntax tree specification, (2) embed an abstract syntax tree within a concrete one, and (3) infer a strongly typed view on a dynamically typed tree. All techniques are implemented in the Rats! parser generator and have been applied to real-world C and Java grammars and their syntax trees.
Title: An analysis of a FETI--DP algorithm on irregular subdomains in the plane
Author(s): Klawonn, Axel; Rheinbach, Oliver; Widlund, Olof B.
Abstract:
In the theory for domain decomposition algorithms of the iterative substructuring family, each subdomain is typically assumed to be the union of a few coarse triangles or tetrahedra. This is an unrealistic assumption, in particular, if the subdomains result from the use of a mesh partitioner in which case they might not even have uniformly Lipschitz continuous boundaries.
The purpose of this study is to derive bounds for the condition number of these preconditioned conjugate gradient methods which depend only on a parameter in an isoperimetric inequality and two geometric parameters characterizing John and uniform domains. A related purpose is to explore to what extent well known technical tools previously developed for quite regular subdomains can be extended to much more irregular subdomains.
Some of these results are valid for any John domains, while an extension theorem, which is needed in this study, requires that the subdomains are uniform. The results, so far, are only complete for problems in two dimensions. Details are worked out for a FETI--DP algorithm and numerical results support the findings. Some of the numerical experiments illustrate that care must be taken when selecting the scaling of the preconditioners in the case of irregular subdomains.
Title: Empirical Bayes least squares estimation without an explicit prior
Author(s): Raphan, Martin; Simoncelli, Eero P.
Abstract:
Bayesian estimators are commonly constructed using an explicit prior model. In many applications, one does not have such a model, and it is difficult to learn since one does not have access to uncorrupted measurements of the variable being estimated. In many cases however, including the case of contamination with additive Gaussian noise, the Bayesian least squares estimator can be formulated directly in terms of the distribution of noisy measurements. We demonstrate the use of this formulation in removing noise from photographic images. We use a local approximation of the noisy measurement distribution by exponentials over adaptively chosen intervals, and derive an estimator from this approximate distribution. We demonstrate through simulations that this adaptive Bayesian estimator performs as well or better than previously published estimators based on simple prior models.
Title: DNA Hash Pooling and its Applications
Author(s): Shasha, Dennis; Amos, Martyn
Abstract:
In this paper we describe a new technique for the characterisation of populations of DNA strands. Such tools are vital to the study of ecological systems, at both the micro (e.g., individual humans) and macro (e.g., lakes) scales. Existing methods make extensive use of DNA sequencing and cloning, which can prove costly and time consuming. The overall objective is to address questions such as: (i) (Genome detection) Is a known genome sequence present at least in part in an environmental sample? (ii) (Sequence query) Is a specific fragment sequence present in a sample? (iii) (Similarity Discovery) How similar in terms of sequence content are two unsequenced samples?
We propose a method involving multiple filtering criteria that result in ``pools" of DNA of high or very high purity. Because our method is similar in spirit to hashing in computer science, we call the method {\it DNA hash pooling}. To illustrate this method, we describe examples using pairs of restriction enzymes. The {\it in silico} empirical results we present reflect a sensitivity to experimental error. The method requires minimal DNA sequencing and, when sequencing is required, little or no cloning.
Title: A Unified Construction of the Glushkov, Follow, and Antimirov Automata
Author(s): Allauzen, Cyril; Mohri, Mehryar
Abstract:
Many techniques have been introduced in the last few decades to create ε-free automata representing regular expressions: Glushkov automata, the so-called follow automata, and Antimirov automata. This paper presents a simple and unified view of all these ε-free automata both in the case of unweighted and weighted regular expressions.It describes simple and general algorithms with running time complexities at least as good as that of the best previously known techniques, and provides concise proofs.The construction methods are all based on two standard automata algorithms: epsilon-removal and minimization. This contrasts with the multitude of complicated and special-purpose techniques and proofs put forward by others to construct these automata. Our analysis provides a better understanding of ε-free automata representing regular expressions: they are all the results of the application of some combinations of epsilon-removal and minimization to the classical Thompson automata. This makes it straight forward to generalize these algorithms to the weighted case, which also results in much simpler algorithms than existing ones. For weighted regular expressions over a closed semiring, we extend the notion of follow automata to the weighted case. We also present the first algorithm to compute the Antimirov automata in the weighted case.
Title: Shape Analysis of Single-Parent Heaps
Author(s): Balaban, Ittai; Pnueli, Amir; Zuck, Lenore
Abstract:
We define the class of single-parent heap systems, which rely on a singly-linked heap in order to model destructive updates on tree structures. This encoding has the advantage of relying on a relatively simple theory of linked lists in order to support abstraction computation. To facilitate the application of this encoding, we provide a program transformation that, given a program operating on a multi-linked heap without sharing, transforms it into one over a single-parent heap. It is then possible to apply shape analysis by predicate and ranking abstraction as in [BPZ05]. The technique has been successfully applied on examples with trees of fixed arity (balancing of and insertion into a binary sort tree).
Title: Invisible Safety of Distributed Protocols
Author(s): Balaban, Ittai; Pnueli, Amir; Zuck, Lenore
Abstract:
The method of ``Invisible Invariants'' has been applied successfully to protocols that assume a ``symmetric'' underlying topology, be it cliques, stars, or rings. In this paper we show how the method can be applied to proving safety properties of distributed protocols running under arbitrary topologies. Many safety properties of such protocols have reachability predicates, which, on first glance, are beyond the scope of the Invisible Invariants method. To overcome this difficulty, we present a technique, called ``coloring,'' that allows, in many instances, to replace the second order reachability predicates by first order predicates, resulting in properties that are amenable to Invisible Invariants, where ``reachable'' is replaced by ``colored.'' We demonstrate our techniques on several distributed protocols, including a variant on Luby's Maximal Independent Set protocol, the Leader Election protocol used in the IEEE 1394 (Firewire) distributed bus protocol, and various distributed spanning tree algorithms. All examples have been tested using the symbolic model checker TLV.
Title: On Transductive Regression
Author(s): Cortes, Corinna; Mohri, Mehryar
Abstract:
In many modern large-scale learning applications, the amount of unlabeled data far exceeds that of labeled data. A common instance of this problem is the 'transductive' setting where the unlabeled test points are known to the learning algorithm. This paper presents a study of regression problems in that setting. It presents 'explicit' VC-dimension error bounds for transductive regression that hold for all bounded loss functions and coincide with the tight classification bounds of Vapnik when applied to classification. It also presents a new transductive regression algorithm inspired by our bound that admits a primal and kernelized closed-form solution and deals efficiently with large amounts of unlabeled data. The algorithm exploits the position of unlabeled points to locally estimate their labels and then uses a global optimization to ensure robust predictions. Our study also includes the results of experiments with several publicly available regression data sets with up to 20,000 unlabeled examples. The comparison with other transductive regression algorithms shows that it performs well and that it can scale to large data sets.
Title: A FETI-DP algorithm for elasticity problems with mortar discretization on geometrically non-conforming partitions
Author(s): Kim, Hyea Hyun
Abstract:
In this paper, a FETI-DP formulation for three dimensional elasticity on non-matching grids over geometrically non-conforming subdomain partitions is considered. To resolve the nonconformity of the finite elements, a mortar matching condition is imposed on the subdomain interfaces (faces). A FETI-DP algorithm is then built by enforcing the mortar matching condition in dual and primal ways. In order to make the FETI-DP algorithm scalable, a set of primal constraints, which include average and momentum constraints over interfaces, are selected from the mortar matching condition. A condition number bound, $C(1+\text{log}(H/h))2$, is then proved for the FETI-DP formulation for the elasticity problems with discontinuous material parameters. Only some faces need to be chosen as primal faces on which the average and momentum constraints are imposed.
Title: PSL Model Checking and Run-time Verification via Testers
Author(s): Pnueli, Amir; Zaks, Aleksandr
Abstract:
The paper introduces the construct of \emm{temporal testers} as a compositional basis for the construction of automata corresponding to temporal formulas in the PSL logic. Temporal testers can be viewed as (non-deterministic) transducers that, at any point, output a boolean value which is 1 iff the corresponding temporal formula holds starting at the current position.
The main advantage of testers, compared to acceptors (such as Buchi automata) is that they are compositional. Namely, a tester for a compound formula can be constructed out of the testers for its sub-formulas. In this paper, we extend the application of the testers method from LTL to the logic PSL.
Besides providing the construction of testers for PSL, we indicate how the symbolic representation of the testers can be directly utilized for efficient model checking and run-time monitoring.
Title: Infrastructure for Automatic Dynamic Deployment of J2EE Applications in Distributed Environments
Author(s): Akkerman, Anatoly; Totok, Alexander; Karamcheti, Vijay
Abstract:
Recent studies showed potential for using component frameworks for building flexible adaptible applications for deployment in distributed environments. However this approach is hindered by the complexity of deployment of component-based applications, which usually involves a great deal of configuration of both the application components and system services they depend on. In this paper we propose an infrastructure for automatic dynamic deployment of J2EE applications,that specifically addresses the problems of (1) inter-component connectivity specification and its effects on component configuration and deployment; and (2) application component dependencies on application server services, their configuration and deployment. The proposed infrastructure provides simple yet expressive abstractions for potential application adaptation through dynamic deployment and undeployment of components. We implement the infrastructure as a part of the JBoss J2EE application server and test it on several sample J2EE applications.
Title: Remembrance of Experiments Past: Analyzing Time Course Datasets to Discover Complex Temporal Invariants
Author(s): Antoniotti, Marco; Ramakrishnan, Naren; Kumar, Deept; Spivak, Marina; Mishra, Bud
Abstract:
Motivation: Current microarray data analysis techniques draw the biologist's attention to targeted sets of genes but do not otherwise present global and dynamic perspectives (e.g., invariants) inferred collectively over a dataset. Such perspectives are important in order to obtain a process-level understanding of the underlying cellular machinery, especially how cells react, respond, and recover from stresses.
Results: We present GOALIE, a novel computational approach and software system that uncovers formal temporal logic models of biological processes from time course microarray datasets. GOALIE `redescribes' data into the vocabulary of biological processes and then pieces together these redescriptions into a Kripke-structure model, where possible worlds encode transcriptional states and are connected to future possible worlds. This model then supports various query, inference, and comparative assessment tasks, besides providing descriptive process-level summaries. An application of GOALIE to characterizing the yeast (S. cerevisiae) cell cycle is described.
Availability: GOALIE runs on Windows XP platforms and is available on request from the authors.
Title: An Abstract Decision Procedure for Satisfiability in the Theory of Recursive Data Types
Author(s): Barrett, Clark; Shikanian, Igor; Tinelli, Cesare
Abstract:
The theory of recursive data types is a valuable modeling tool for software verification. In the past, decision procedures have been proposed for both the full theory and its universal fragment. However, previous work has been limited in various ways, including an inability to deal with multiple constructors, multi-sorted logic, and mutually recursive data types. More significantly, previous algorithms for the universal case have been based on inefficient nondeterministic guesses and have been described in fairly complex procedural terms.
We present an algorithm which addresses these issues for the universal theory. The algorithm is presented declaratively as a set of abstract rules which are terminating, sound, and complete. We also describe strategies for applying the rules and explain why our recommended strategy is more efficient than those used by previous algorithms. Finally, we discuss how the algorithm can be used within a broader framework of cooperating decision procedures.
Title: Squidball: An Experiment in Large-Scale Motion Capture and Game Design
Author(s): Bregler, Christoph; Castiglia, Clothilde; DeVincenzo, Jessica; DuBois, Roger Luke; Feeley, Kevin; Igoe, Tom; Meyer, Jonathan; Naimark, Michael; Postelnicu, Alexandru; Rabinovich, Michael; Rosenthal, Sally; Salen, Katie; Sudol, Jeremi; Wright, Bo
Abstract:
This paper describes a new large-scale motion capture based game that is called Squidball. It was tested on up to 4000 player audiences last summer at SIGGRAPH 2004. It required to build the world's largest motion capture space, the largest motion capture markers (balls), and many other challenges in technology, production, game play, and social studies. Our aim was to entertain the SIGGRAPH Electronic Theater audience with a cooperative and energetic game that is played by everybody together, in controlling real-time graphics and audio, while bouncing and batting multiple large helium filled balloons across the entire theater space. We detail in this paper all the lessons learned in producing such a system and game, and argue why we believe Squidball was a great success.
Title: A Domain Decomposition Discretization of Parabolic Problems
Author(s): Dryja, Maksymilian; Tu, Xuemin
Abstract:
In recent years, domain decomposition methods have attracted much attention due to their successful application to many elliptic and parabolic problems. Domain decomposition methods treat problems based on a domain substructuring, which is attractive for parallel computation, due to the independence among the subdomains. In principle, domain decomposition methods may be applied to the system resulting from a standard discretization of the parabolic problems or, directly, be carried out through a direct discretization of parabolic problems. In this paper, a direct domain decomposition method is introduced to discretize the parabolic problems. The stability and convergence of this algorithm are analyzed, and an $O(\tau+h)$ error bound is provided.
Title: Nonlinear Image Representation via Local Multiscale Orientation
Author(s): Hammond, David K.; Simoncelli, Eero P.
Abstract:
We present a nonlinear image representation based on multiscale local orientation measurements. Specifically, an image is first decomposed using a two-orientation steerable pyramid, a tight-frame representation in which the basis functions are directional derivatives of a radially symmetric blurring operator. The pair of subbands at each scale are thus gradients of progressively blurred copies of the original image. We then discard the magnitude information and retain only the orientation of each gradient vector. We develop a method for reconstructing the original image from this orientation information using an algorithm based on projection onto convex sets, and demonstrate its robustness to quantization.
Title: Oriented Overlays For Clustering Client Requests To Data-Centric Network Services
Author(s): He, Congchun; Karamcheti, Vijay
Abstract:
Many of the data-centric network services deployed today hold massive volumes of data at their origin websites, accessing the data to dynamically generate responses. Such dynamic responses are poorly supported by traditional caching infrastructures and result in poor performance and scalability for such services. One way of remedying this situation is to develop alternative caching infrastructures, which can dynamically detect the often large degree of service usage locality and leverage such information to on-demand replicate and redirect requests to service portions at appropriate network locations. Key to building such infrastructures is the ability to cluster and inspect client requests, at various points across a wide-area network.
This paper presents a zone-based scheme for constructing oriented overlays, which provide such an ability. Oriented overlays differ from previously proposed unstructured overlays in supporting network traffic flows from many sources towards one (or a small number) of destinations, and vice-versa. A good oriented overlay would offer sufficient clustering ability without adversely affecting path latencies. Our overlay construction scheme organizes participating nodes into different zones according to their latencies from the origin server(s), and has each node associate with one or more parents in another zone closer to the origin. Extensive experiments with a PlanetLab-based implementation of our scheme shows that it produces overlays that are (1) robust to network dynamics; (2) offer good clustering ability; and (3) minimally impact end-to-end network latencies seen by clients.
Title: An Analysis of Usage Locality for Data-Centric Web Services
Author(s): He, Congchun; Karamcheti, Vijay
Abstract:
The growing popularity of XML Web Services is resulting in a significant increase in the proportion of Internet traffic that involves requests to and responses from Web Services. Unfortunately, web service responses, because they are generated dynamically, are considered ``uncacheable" by traditional caching infrastructures. One way of remedying this situation is by developing alternative caching infrastructures, which improve performance using on-demand service replication, data offloading, and request redirection. These infrastructures benefit from two characteristics of web service traffic --- (1) the open nature of the underlying protocols, SOAP, WSDL, UDDI, which results in service requests and responses adhering to a well-formatted, widely known structure; and (2) the observation that for a large number of currently deployed data-centric services, requests can be interpreted as structured accesses against a physical or virtual database --- but require that there be sufficient locality in service usage to offset replication and redirection costs.
This paper investigates whether such locality does in fact exist in current web service workloads. We examine access logs from two large data-centric web service sites, SkyServer and TerraServer, to characterize workload locality across several dimensions: data space, network regions, and different time epochs. Our results show that both workloads exhibit a high degree of spatial and network locality: 10\% of the client IP addresses in the SkyServer trace contribute to about 99.95\% of the requests, and 99.94\% of the requests in the TerraServer trace are directed towards regions that represent less than 10\% of the overall data space accessible through the service. Our results point to the substantial opportunity for improving Web Services scalability by on-demand service replication.
Title: Two-Level Schwarz Algorithms, Using Overlapping Subregions, for Mortar Finite Element Methods
Author(s): Hyun Kim, Hyea; Widlund, Olof B.
Abstract:
Preconditioned conjugate gradient methods based on two-level overlapping Schwarz methods often perform quite well. Such a preconditioner combines a coarse space solver with local components which are defined in terms of subregions which form an overlapping covering of the region on which the elliptic problem is defined. Precise bounds on the rate of convergence of such iterative methods have previously been obtained in the case of conforming lower order and spectral finite elements as well as in a number of other cases. In this paper, this domain decomposition algorithm and analysis are extended to mortar finite elements. It is established that the condition number of the relevant iteration operator is independent of the number of subregions and varies with the relative overlap between neighboring subregions linearly as in the conforming cases previously considered.
Title: A FETI-DP formulation of three dimensional elasticity problems with mortar discretization
Author(s): Kim, Hyea Hyun
Abstract:
In this paper, a FETI-DP formulation for the three dimensional elasticity problem on non-matching grids over a geometrically conforming subdomain partition is considered. To resolve the nonconformity of the finite elements, a mortar matching condition on the subdomain interfaces (faces) is imposed. By introducing Lagrange multipliers for the mortar matching constraints, the resulting linear system becomes similar to that of a FETI-DP method. In order to make the FETI-DP method efficient for solving this linear system, a relatively large set of primal constraints, which include average and momentum constraints over interfaces (faces) as well as vertex constraints, is introduced. A condition number bound $C(1+\text{log}(H/h))2$ for the FETI-DP formulation with a Neumann-Dirichlet preconditioner is then proved for the elasticity problems with discontinuous material parameters when only some faces are chosen as primal faces on which the average and momentum constraints will be imposed. An algorithm which selects a quite small number of primal faces is also discussed.
Title: A BDDC algorithm for problems with mortar discretization
Author(s): Kim, Hyea Hyun; Dryja, Maksymilian; Widlund, Olof B.
Abstract:
A BDDC (balancing domain decomposition by constraints) algorithm is developed for elliptic problems with mortar discretizations for geometrically non-conforming partitions in both two and three spatial dimensions. The coarse component of the preconditioner is defined in terms of one mortar constraint for each edge/face which is an intersection of the boundaries of a pair of subdomains. A condition number bound of the form $C \max_i \left\{ (1+\text{log} (H_i/h_i) )3 \right\}$ is established. In geometrically conforming cases, the bound can be improved to $C \max_i \left\{ (1+\text{log} (H_i/h_i) )2 \right\}$. This estimate is also valid in the geometrically nonconforming case under an additional assumption on the ratio of mesh sizes and jumps of the coefficients. This BDDC preconditioner is also shown to be closely related to the Neumann-Dirichlet preconditioner for the FETI--DP algorithms of \cite{K-04-3d,KL-02} and it is shown that the eigenvalues of the BDDC and FETI--DP methods are the same except possibly for an eigenvalue equal to 1.
Title: BDDC Algorithms for Incompressible Stokes Equations
Author(s): Li, Jing; Widlund, Olof B.
Abstract:
The purpose of this paper is to extend the BDDC (balancing domain decomposition by constraints) algorithm to saddle-point problems that arise when mixed finite element methods are used to approximate the system of incompressible Stokes equations. The BDDC algorithms are iterative substructuring methods, which form a class of domain decomposition methods based on the decomposition of the domain of the differential equations into nonoverlapping subdomains. They are defined in terms of a set of primal continuity constraints, which are enforced across the interface between the subdomains and which provide a coarse space component of the preconditioner. Sets of such constraints are identified for which bounds on the rate of convergence can be established that are just as strong as previously known bounds for the elliptic case. In fact, the preconditioned operator is effectively positive definite, which makes the use of a conjugate gradient method possible. A close connection is also established between the BDDC and FETI-DP algorithms for the Stokes case.
Title: FETI--DP, BDDC, and Block Cholesky Methods
Author(s): Li, Jing; Widlund, Olof B.
Abstract:
Two popular non-overlapping domain decomposition methods, the FETI--DP and BDDC algorithms, are reformulated using Block Cholesky factorizations, an approach which can provide a useful framework for the design of domain decomposition algorithms for solving symmetric positive definite linear system of equations. Instead of introducing Lagrange multipliers to enforce the coarse level, primal continuity constraints in these algorithms, a change of variables is used such that each primal constraint corresponds to an explicit degree of freedom. With the new formulations of these algorithms, a simplified proof is provided that the spectra of a pair of FETI--DP and BDDC algorithms, with the same set of primal constraints, are the same. Results of numerical experiments also confirm this result.
Title: On the Use of Inexact Subdomain Solvers for BDDC Algorithms
Author(s): Li, Jing; Widlund, Olof B.
Abstract:
The standard BDDC (balancing domain decomposition by constraints) preconditioner is shown to be equivalent to a preconditioner built from a partially subassembled finite element model. This results in a system of linear algebraic equations which is much easier to solve in parallel than the fully assembled model; the cost is then often dominated by that of the problems on the subdomains. An important role is also played, both in theory and practice, by an average operator and in addition exact Dirichlet solvers are used on the subdomains in order to eliminate the residual in the interior of the subdomains. The use of inexact solvers for these problems and even the replacement of the Dirichlet solvers by a trivial extension are considered. It is established that one of the resulting algorithms has the same eigenvalues as the standard BDDC algorithm, and the connection of another with the FETI-DP algorithm with a lumped preconditioner is also considered. Multigrid methods are used in the experimental work and under certain assumptions, it can be established that the iteration count essentially remains the same as when exact solvers are used, while considerable gains in the speed of the algorithm can be realized since the cost of the exact solvers grows superlinearly with the size of the subdomain problems while the multigrid methods are linear.
Title: Real-time rendering of normal maps with discontinuities
Author(s): Parilov, Evgueni; Rosenberg, Ilya; Zorin, Denis
Abstract:
Title: Real-time rendering of normal maps with discontinuities (NYU-CS-TR872) Authors: Evgueni Parilov, Ilya Rosenberg and Denis Zorin Abstract:
Normal mapping uses normal perturbations stored in a texture to give objects a more geometrically complex appearance without increasing the number of geometric primitives. Standard bi- and trilinear interpolation of normal maps works well if the normal field is continuous, but may result in visible artifacts in the areas where the field is discontinuous, which is common for surfaces with creases and dents.
In this paper we describe a real-time rendering technique which preserves the discontinuity curves of the normal field at sub-pixel level and its GPU implementation. Our representation of the piecewise-continuous normal field is based on approximations of the distance function to the discontinuity set and its gradient. Using these approximations we can efficiently reconstruct discontinuities at arbitrary resolution and ensure that no normals are interpolated across the discontinuity. We also described a method for updating the normal field along the discontinuities in real-time based on blending the original field with the one calculated from a user-defined surface profile.
Title: Algorithmic Algebraic Model Checking I: The Case of Biochemical Systems and their Reachability Analysis
Author(s): Piazza, C.; Antoniotto, M.; Mysore, V.; Policriti, A.; Winkler, F.; Mishra, B.
Abstract:
Presently, there is no clear way to determine if the current body of biological facts is sufficient to explain phenomenology. Rigorous mathematical models with automated tools for reasoning, simulation, and computation can be of enormous help to uncover cognitive flaws, qualitative simplification or overly generalized assumptions. The approaches developed by control theorists analyzing stability of a system with feedback, physicists studying asymptotic properties of dynamical systems, computer scientists reasoning about discrete or hybrid (combining discrete events with continuous events) reactive systems---all have tried to address some aspects of the same problem in a very concrete manner. We explore here how biological processes could be studied in a similar manner, and how the appropriate tools for this purpose can be created.
In this paper, we suggest a possible confluence of the theory of hybrid automata and the techniques of algorithmic algebra to create a computational basis for systems biology. We start by discussing our basis for this choice -- semi-algebraic hybrid systems, as we also recognize its power and limitations. We explore solutions to the bounded-reachability problem through symbolic computation methods, applied to the descriptions of the traces of the hybrid automaton. Because the description of the automaton is through semi-algebraic sets, the evolution of the automaton can be described even in cases where system parameters and initial conditions are unspecified. Nonetheless, semialgebraic decision procedures provide a succinct description of algebraic constraints over the initial values and parameters for which proper behavior of the system can be expected. In addition, by keeping track of conservation principles in terms of constraint or invariant manifolds on which the system must evolve, we avoid many of the obvious pitfalls of numerical approaches.
Title: Ranking with a P-norm Push
Author(s): Rudin, Cynthia
Abstract:
We are interested in supervised ranking with the following twist: our goal is to design algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. Towards this goal, we provide a general form of convex objective that gives high-scoring examples more importance. This ``push'' near the top of the list can be chosen arbitrarily large or small. We choose $\ell_p$-norms to provide a specific type of push; as $p$ becomes large, the algorithm concentrates harder near the top of the list.
We derive a generalization bound based on the $p$-norm objective. We then derive a corresponding boosting-style algorithm, and illustrate the usefulness of the algorithm through experiments on UCI data.
Title: Better Burst Detection
Author(s): Shasha, Dennis; Zhang, Xin
Abstract:
A burst is a large number of events occurring within a certain time window. As an unusual activity, it's a noteworthy phenomenon in many natural and social processes. Many data stream applications require the detection of bursts across a variety of window sizes. For example, stock traders may be interested in bursts having to do with institutional purchases or sales that are spread out over minutes or hours. Detecting a burst over any of $k$ window sizes, a problem we call {\em elastic burst detection}, in a stream of length $N$ naively requires $O(kN)$ time. Previous work \cite{DiscoveryBook03} showed that a simple Shifted Binary Tree structure can reduce this time substantially (in very favorable cases near to $O(N)$) by filtering away obvious non-bursts. Unfortunately, for certain data distributions, the filter marks many windows of events as possible bursts, even though a detailed check shows them to be non-bursts.
In this paper, we present a new algorithmic framework for elastic burst detection: a family of data structures that generalizes the Shifted Binary Tree. We then present a heuristic search algorithm to find an efficient structure among the many offered by the framework, given the input. We study how different inputs affect the desired structures. Experiments on both synthetic and real world data show a factor of up to 35 times improvement compared with the Shifted Binary Tree over a wide variety of inputs, depending on the data distribution. We show an example application that identifies interesting correlations between bursts of activity in different stocks.
Title: Modeling Of Concurrent Web Sessions With Bounded Inconsistency In Shared Data
Author(s): Totok, Alexander; Karamcheti, Vijay
Abstract:
Client interactions with modern web-accessible network services are typically organized into sessions involving multiple requests that read and write shared application data. Therefore when executed concurrently, web sessions may invalidate each other's data. Depending on the nature of the business represented by the service, allowing the session with invalid data to progress might lead to financial penalties for the service provider, while blocking the session's progress and deferring its execution (e.g., by relaying its handling to the customer service) will most probably result in user dissatisfaction. A compromise would be to tolerate some bounded data inconsistency, which would allow most of the sessions to progress, while limiting the potential financial loss incurred by the service. In order to quantitatively reason about these tradeoffs, the service provider can benefit from models that predict metrics, such as the percentage of successfully completed sessions, for a certain degree of tolerable data inconsistency.
This paper develops such analytical models of concurrent web sessions with bounded inconsistency in shared data for three popular concurrency control algorithms. We illustrate our models using the sample buyer scenario from the TPC-W e-Commerce benchmark, and validate them by showing their close correspondence to measured results of concurrent session execution in both a simulated and a real web server environment. Our models take as input parameters of service usage, which can be obtained through profiling of incoming client requests. We augment our web application server environment with a profiling and automated decision making infrastructure which is shown to successfully choose, based on the specified performance metric, the best concurrency control algorithm in real time in response to changing service usage patterns.
Title: A BDDC Algorithm for Mixed Formulation of Flow in Porous Media
Author(s): Tu, Xuemin
Abstract:
The BDDC (balancing domain decomposition by constraints) algorithms are similar to the balancing Neumann-Neumann methods, with a small number of continuity constraints enforced across the interface throughout the iterations. These constraints form a coarse, global component of the preconditioner. The BDDC methods are powerful for solving large sparse linear algebraic systems arising from discretizations of elliptic boundary value problems. In this paper, the BDDC algorithm is extended to saddle point problems generated from the mixed finite element methods used to approximate the scalar elliptic problems for flow in porous media.
Edge/face average constraints are enforced and the same rate of convergence is obtained as for simple elliptic cases. The condition number bound is estimated and numerical experiments are discussed. In addition, a comparison of the BDDC method with an edge/face-based iterative substructuring method is provided.
Title: BDDC Domain Decomposition Algorithms: Methods with Three Levels and for Flow in Porous Media
Author(s): Tu, Xuemin
Abstract:
Two inexact coarse solvers for Balancing Domain Decomposition by Constraints (BDDC) algorithms are introduced and analyzed. These solvers help remove a bottleneck for the two-level BDDC algorithms related to the cost of the coarse problem when the number of subdomains is large. At the same time, a good convergence rate is maintained.
BDDC algorithms are also developed for the linear systems arising from flow in porous media discretized with mixed and hybrid finite elements. Our methods are proven to be scalable and the condition numbers of the operators with our BDDC preconditioners grow only polylogarithmically with the size of the subdomain problems.
Title: Three-Level BDDC in Three Dimensions
Author(s): Tu, Xuemin
Abstract:
BDDC methods are nonoverlapping iterative substructuring domain decomposition methods for the solution of large sparse linear algebraic systems arising from discretization of elliptic boundary value problems. Its coarse problem is given by a small number of continuity constraints which are enforced across the interface. The coarse problem matrix is generated and factored by direct solvers at the beginning of the computation and it can ultimately become a bottleneck, if the number of subdomains is very large.
In this paper, two three-level BDDC methods are introduced for solving the coarse problem approximately in three dimensions. This is an extension of previous work for the two dimensional case and since vertex constraints alone do not suffice to obtain polylogarithmic condition number bound, edge constraints are considered in this paper. Some new technical tools are then needed in the analysis and this makes the three dimensional case more complicated than the two dimensional case.
Estimates of the condition numbers are provided for two three-level BDDC methods and numerical experiments are also discussed.
Title: A BDDC algorithm for flow in porous media with a hybrid finite element discretization
Author(s): Tu, Xuemin
Abstract:
The BDDC (balancing domain decomposition by constraints) methods have been applied successfully to solve the large sparse linear algebraic systems arising from conforming finite element discretizations of elliptic boundary value problems. In this paper, the scalar elliptic problems for flow in porous media are discretized by a hybrid finite element method which is equivalent to a nonconforming finite element method. The BDDC algorithm is extended to these problems which originate as saddle point problems.
Edge/face average constraints are enforced across the interface and the same rate of convergence is obtained as in conforming cases. The condition number of the preconditioned system is estimated and numerical experiments are discussed.
Title: Fast and Cheap Genome wide Haplotype Construction via Optical Mapping
Author(s): Anantharaman, Thomas; Mysore, Venkatesh; Mishra, Bud
Abstract:
We describe an efficient algorithm to construct genome wide haplotype restriction maps of an individual by aligning single molecule DNA fragments collected with Optical Mapping technology. Using this algorithm and small amount of genomic material, we can construct the parental haplotypes for each diploid chromosome for any individual, one from the father and the other from the mother. Since such haplotype maps reveal the polymorphisms due to single nucleotide differences (SNPs) and small insertions and deletions (RFLPs), they are useful in association studies, studies involving genomic instabilities in cancer, and genetics. For instance, such haplotype restriction maps of individuals in a population can be used in association studies to locate genes responsible for genetics diseases with relatively low cost and high throughput. If the underlying problem is formulated as a combinatorial optimization problem, it can be shown to be NP-complete (a special case of K-population problem). But by effectively exploiting the structure of the underlying error processes and using a novel analog of the Baum-Welch algorithm for HMM models, we devise a probabilistic algorithm with a time complexity that is linear in the number of markers. The algorithms were tested by constructing the first genome wide haplotype restriction map of the microbe T. Pseudoana, as well as constructing a haplotype restriction map of a 120 Megabase region of Human chromosome 4. The frequency of false positives and false negatives was estimated using simulated data. The empirical results were found very promising.
Title: Naturally Speaking: A Systems Biology Tool with Natural Language Interfaces
Author(s): Antoniotti, Marco; Lau, Ian T.; Mishra, Bud
Abstract:
This short paper describes a systems biology software tool that can engage in a dialogue with a biologist by responding to questions posed to it in English (or another natural language) regarding the behavior of a complex biological system, and by suggesting a set of facts about the biological system based on a timetested generate and test approach. Thus, this bioinformatics system improves the quality of the interaction that a biologist can have with a system built on rigorous mathematical modeling, but without being aware of the underlying mathematically sophisticated concepts or notations. Given the nature of the mathematical semantics of our Simpathica/XSSYS tool, it was possible to construct a well-founded natural language interface on top of the computational kernel. We discuss our tool and illustrate its use with a few examples. The natural language subsystem is available as an integrated subsystem of the Simpathica/XSSYS tool and through a simple Web-based interface; we describe both systems in the paper. More details about the system can be found at: http://bioinformatics.nyu.edu, and its sub-pages.
Title: Practical Packrat Parsing
Author(s): Grimm, Robert
Abstract:
A considerable number of research projects are exploring how to extend object-oriented programming languages such as Java with, for example, support for generics, multiple dispatch, or pattern matching. To keep up with these changes, language implementors need appropriate tools. In this context, easily extensible parser generators are especially important because parsing program sources is a necessary first step for any language processor, be it a compiler, syntax-highlighting editor, or API documentation generator. Unfortunately, context-free grammars and the corresponding LR or LL parsers, while well understood and widely used, are also unnecessarily hard to extend. To address this lack of appropriate tools, we introduce Rats!, a parser generator for Java that supports easily modifiable grammars and avoids the complexities associated with altering LR or LL grammars. Our work builds on recent research on packrat parsers, which are recursive descent parsers that perform backtracking but also memoize all intermediate results (hence their name), thus ensuring linear-time performance. Our work makes this parsing technique, which has been developed in the context of functional programming languages, practical for object-oriented languages. Furthermore, our parser generator supports simpler grammar specifications and more convenient error reporting, while also producing better performing parsers through aggressive optimizations. In this paper, we motivate the need for more easily extensible parsers, describe our parser generator and its optimizations in detail, and present the results of our experimental evaluation.
Title: Sekitei: An AI planner for Constrained Component Deployment in Wide-Area Networks
Author(s): Kichkaylo, Tatiana; Ivan, Anca; Karamcheti, Vijay
Abstract:
Wide-area network applications are increasingly being built using component-based models, enabling integration of diverse functionality in modules distributed across the network. In such models, dynamic component selection and deployment enables an application to flexibly adapt to changing client and network characteristics, achieve loadbalancing, and satisfy QoS requirements. Unfortunately, the problem of finding a valid component deployment is hard because one needs to decide on the set of components while satisfying various constraints resulting from application semantic requirements, network resource limitations, and interactions between the two. In this paper, we describe a general model for the component placement problem and present an algorithm for it, which is based on AI planning algorithms. We validate the effectiveness of our algorithm by demonstrating its scalability with respect to network size and number of components in the context of deployments generated for two example applications a security-sensitive mail service, and a webcast service in a variety of network environments.
Title: Dual-Primal FETI Methods for Linear Elasticity
Author(s): Klawonn, Axel; Widlund, Olof B.
Abstract:
Dual-Primal FETI methods are nonoverlapping domain decomposition methods where some of the continuity constraints across subdomain boundaries are required to hold throughout the iterations, as in primal iterative substructuring methods, while most of the constraints are enforced by Lagrange multipliers, as in one-level FETI methods. The purpose of this article is to develop strategies for selecting these constraints, which are enforced throughout the iterations, such that good convergence bounds are obtained, which are independent of even large changes in the stiffnesses of the subdomains across the interface between them. A theoretical analysis is provided and condition number bounds are established which are uniform with respect to arbitrarily large jumps in the Young's modulus of the material and otherwise only depend polylogarithmically on the number of unknowns of a single subdomain.
Title: Optical flow estimation as distributed optimization problem - an aVLSI implementation
Author(s): Stocker, Alan
Abstract:
I present a new focal-plane analog VLSI sensor that estimates optical flow in two visual dimensions. The chip significantly improves previous approaches both with respect to the applied model of optical flow estimation as well as the actual hardware implementation. Its distributed computational architecture consists of an array of locally connected motion units that collectively solve for the unique optimal optical flow estimate. The novel gradient-based motion model assumes visual motion to be translational, smooth and biased. The model guarantees that the estimation problem is computationally well-posed regardless of the visual input. Model parameters can be globally adjusted, leading to a rich output behavior. Varying the smoothness strength, for example, can provide a continuous spectrum of motion estimates, ranging from normal to global optical flow. Unlike approaches that rely on the explicit matching of brightness edges in space or time, the applied gradient-based model assures spatiotemporal continuity on visual information. The non-linear coupling of the individual motion units improves the resulting optical flow estimate because it reduces spatial smoothing across large velocity differences. Extended measures of a 30x30 array prototype sensor under real-world conditions demonstrate the validity of the model and the robustness and functionality of the implementation.
Title: Three-level BDDC in Two Dimensions
Author(s): Tu, Xuemin
Abstract:
BDDC methods are nonoverlapping iterative substructuring domain decomposition methods for the solutions of large sparse linear algebraic systems arising from discretization of elliptic boundary value problems. They are similar to the balancing Neumann-Neumann algorithm. However, in BDDC methods, a small number of continuity constraints are enforced across the interface, and these constraints form a new coarse, global component. An important advantage of using such constraints is that the Schur complements that arise in the computation willa ll be strictly positive definite. The coarse problem is generated and factored by a direct solver at the beginning of the computation. However, this problem can ultimately become a bottleneck, if the number of subdomains is very large. In this paper, two three-level BDDC methods are introduced for solving the coarse problem approximately in two dimensional space, while still maintaining a good convergence rate. Estimates of the condition numbers are provided for the two three-level BDDC methods and numerical experiments are also discussed.
Title: A kernel-independent fast multipole algorithm
Author(s): Biros, George; Ying, Lexing; Zorin, Denis
Abstract:
Title: A kernel-independent fast multipole algorithm (NYU-CS-TR839) Authors: George Biros, Lexing Ying, and Denis Zorin Abstract: We present a new fast multipole method for particle simulations. The main feature of our algorithm is that is kernel independent, in the sense that no analytic expansions are used to represent the far field. Instead we use equivalent densities, which we compute by solving small Dirichlet-type boundary value problems. The translations from the sources to the induced potentials are accelerated by singular value decomposition in 2D and fast Fourier transforms in 3D. We have tested the new method on the single and double layer operators for the Laplacian, the modified Laplacian, the Stokes, the modified Stokes, the Navier, and the modified Navier operators in two and three dimensions. Our numerical results indicate that our method compares very well with the best known implementations of the analytic FMM method for both the Laplacian and modified Laplacian kernels. Its advantage is the (relative) simplicity of the implementation and its immediate extension to more general kernels.
Title: An Embedded Boundary Integral Solver for the Stokes Equations
Author(s): Biros, George; Ying, Lexing; Zorin, Denis
Abstract:
We present a new method for the solution of the Stokes equations. Our goal is to develop a robust and scalable methodology for two and three dimensional, moving-boundary, flow simulations. Our method is based on Anita Mayo's method for the Poisson's equation: "The Fast Solution of Poisson's and the Biharmonic Equations on Irregular Regions", SIAM J. Num. Anal., 21 (1984), pp. 285--299. We embed the domain in a rectangular domain, for which fast solvers are available, and we impose the boundary conditions as interface (jump) conditions on the velocities and tractions. We use an indirect boundary integral formulation for the homogeneous Stokes equations to compute the jumps. The resulting integral equations are discretized by Nystrom's method. The rectangular domain problem is discretized by finite elements for a velocity-pressure formulation with equal order interpolation bilinear elements (Q1-Q1). Stabilization is used to circumvent the inf-sup condition for the pressure space. For the integral equations, fast matrix vector multiplications are achieved via a N log N algorithm based on a block representation of the discrete integral operator, combined with (kernel independent) singular value decomposition to sparsify low-rank blocks. Our code is built on top of PETSc, an MPI based parallel linear algebra library. The regular grid solver is a Krylov method (Conjugate Residuals) combined with an optimal two-level Schwartz-preconditioner. For the integral equation we use GMRES. We have tested our algorithm on several numerical examples and we have observed optimal convergence rates.
Title: Survey: Eigenvector Analysis in Webpage Rankings
Author(s): Chang, Hung-Hsien
Abstract:
Two major techniques have been proposed for using the structure of links in the World Wide Web to determine the relative significance of Web Pages. The PageRank algorithm \cite{BP98}, which is a critical part of the Google search engine, gives a single measure of importance of each page in the Web. The HITS algorithm \cite{K98} applies to a set of pages believed relevant to a given query, and assigns two values to each page: the degree to which the page is a hub and the degree to which it is an authority. Both algorithms have a natural interpretation in terms of a random walk over the set of pages involved, and in both cases the computation involved amounts to computing an eigenvector over the transition matrix for this random walk.
This paper surveys the literature discussing these two techniques and their variants, and their connection to random walks and eigenvector computation. It also discusses the stability of these techniques under small changes in the Web link structure.
Title: Shrinkage-Based Similarity Metric for Cluster Analysis of Microarray Data
Author(s): Cherepinsky, Vera; Feng, Jiawu; Rejali, Marc; Mishra, Bud
Abstract:
The current standard correlation coefficient used in the analysis of microarray data, including gene expression arrays, was introduced in [1]. Its formulation is rather arbitrary. We give a mathematically rigorous derivation of the correlation coefficient of two gene expression vectors based on James-Stein Shrinkage estimators. We use the background assumptions described in [1], also taking into account the fact that the data can be treated as transformed into normal distributions. While [1] uses zero as an estimator for the expression vector mean μ, we start with the assumption that for each gene, μ is itself a zero-mean normal random variable (with a priori distribution N(0,τ ^{2})), and use Bayesian analysis to update that belief, to obtain a posteriori distribution of μ in terms of the data. The estimator for μ, obtained after shrinkage towards zero, differs from the mean of the data vectors and ultimately leads to a statistically robust estimator for correlation coefficients.
To evaluate the effectiveness of shrinkage, we conducted in silico experiments and also compared similarity metrics on a biological example using the data set from [1]. For the latter, we classified genes involved in the regulation of yeast cell-cycle functions by computing clusters based on various definitions of correlation coefficients, including the one using shrinkage, and contrasting them against clusters based on the activators known in the literature. In addition, we conducted an extensive computational analysis of the data from [1], empirically testing the performance of different values of the shrinkage factor γ and comparing them to the values of γ corresponding to the three metrics adressed here, namely, γ=0 for the Eisen metric, γ=1 for the Pearson correlation coefficient, and γ computed from the data for the Shrinkage metric.
The estimated "false-positives" and "false-negatives" from this study indicate the relative merits of clustering algorithms based on different statistical correlation coefficients as well as the sensitivity of the clustering algorithm to small perturbations in the correlation coefficients. These results indicate that using the shrinkage metric improves the accuracy of the analysis.
All derivation steps are described in detail; all mathematical assertions used in the derivation are proven in the appendix.
[1] Eisen, M.B., Spellman, P.T., Brown, P.O., and Botstein, D. (1998), PNAS USA 95, 14863-14868.
Title: Comparing the Performance of Centralized and Distributed Coordination on Systems with Improved Combining Switches
Author(s): Freudenthal, Eric; Gottlieb, Allan
Abstract:
Memory system congestion due to serialization of hot spot accesses can adversely affect the performance of interprocess coordination algorithms. Hardware and software techniques have been proposed to reduce this congestion and thereby provide superior system performance. The combining networks of Gottlieb et al. automatically parallelize concurrent hot spot memory accesses, improving the performance of algorithms that poll a small number of shared variables. The widely used MCS distributed-spin algorithms take a software approach: they reduce hot spot congestion by polling only variables stored locally. Our investigations detected performance problems in existing designs for combining networks and we propose mechanisms that alleviate them. Simulation studies described herein indicate that a centralized readers writers algorithms executed on the improved combining networks have performance at least competitive to the MCS algorithms.
Title: QTM: Trust Management with Quantified Stochastic Attributes
Author(s): Freudenthal, Eric; Karamcheti, Vijay
Abstract:
Trust management systems enable the construction of access-control infrastructures suitable for protecting sensitive resources from access by unauthorized agents. The state of the art in such systems (i) provide fail-safe in that access will be denied when authorizing credentials are revoked, (ii) can mitigate the risk of insider attacks using mechanisms for threshold authorization in which several independent partially trusted agents are required to co-sponsor sensitive activities, and (iii) are capable of enforcing intra- and inter- organizational access control policies.
Despite these advantages, trust management systems are limited in their ability to express partial trust. Additionally, they are cumbersome to administer when there are a large number of related access rights with differing trust (and thereby access) levels due to the need for explicit enumeration of the exponential number of agent combinations. More importantly, these systems have no provision for fault tolerance in cases where a primary authorization is lost (perhaps due to revocation), but others are available. Such situations may result in a cascading loss of access and possible interruption of service.
In this paper, we propose extending traditional trust management systems through a framework of reliability and confidence metrics. This framework naturally captures partial trust relationships, thereby reducing administrative complexity of access control systems with multiple related trust levels and increasing system availability in the presence of authorization faults while still maintaining equivalent safety properties.
Title: Why Path-Based Adaptation? Performance Implications of Different Adaptation Mechanisms for Network Content Delivery
Author(s): Fu, Xiaodong; Karamcheti, Vijay
Abstract:
Because of heterogeneous and dynamic changing network environments, content delivery across the network requires system support for coping with different network conditions in order to provide satisfactory user experiences. Despite the existence of many adaptation frameworks, the question that which adaptation approach performs the best under what network configurations still remains unanswered. The performance implication of different adaptation approaches (end-point, proxy-based and path-based approaches) has not been studied yet. This paper aims to address this shortcoming by conducting a series simulation-based experiments to compare performance among these adaptation approaches under different network configurations. In order to make a fair comparison, in this paper approach-neutral strategies are proposed for constructing communication paths and managing network resources. The experiment results show that there are well-defined network environments under which each of these approaches delivers its best performance; and among them, the path-based approach, which uses the whole communication path to do adaptation, provides the best and the most robust performance under different network configurations, and for different types of servers and clients.
Title: Balancing Neumann-Neumann Preconditioners for the Mixed Formulation of Almost-Incompressible Linear Elasticity
Author(s): Goldfeld, Paulo
Abstract:
Balancing Neumann-Neumann methods are extended to the equations arising from the mixed formulation of almost-incompressible linear elasticity problems discretized with discontinuous-pressure finite elements. This family of domain decomposition algorithms has previously been shown to be effective for large finite element approximations of positive definite elliptic problems. Our methods are proved to be scalable and to depend weakly on the size of the local problems. Our work is an extension of previous work by Pavarino and Widlund on BNN methods for Stokes equation.
Our iterative substructuring methods are based on the partition of the unknowns into interior ones - including interior displacements and pressures with zero average on every subdomain - and interface ones - displacements on the geometric interface and constant-by-subdomain pressures. The restriction of the problem to the interior degrees of freedom is then a collection of decoupled local problems that are well-posed even in the incompressible limit. The interior variables are eliminated and a hybrid preconditioner of BNN type is designed for the Schur complement problem. The iterates are restricted to a benign subspace, on which the preconditioned operator is positive definite, allowing for the use of conjugate gradient methods.
A complete convergence analysis of the method is presented for the constant coefficient case. The algorithm is extended to handle discontinuous coefficients, but a full analysis is not provided. Extensions of the algorithm and of the analysis are also presented for problems combining pure-displacement and mixed finite elements in different subregions. An algorithm is also proposed for problems with continuous discrete pressure spaces.
All the algorithms discussed have been implemented in parallel codes that have been successfully tested on large sample problems on large parallel computers; results are presented and discussed. Implementations issues are also discussed, including a version of our main algorithm that does not require the solution of any auxiliary saddle-point problem since all subproblems of the preconditioner can be reduced to solving symmetric positive definite linear systems.
Title: AQuery: Query Language for Ordered Data, Optimization Techniques, and Experiments
Author(s): Lerner, Alberto; Shasha, Dennis
Abstract:
An order-dependent query is one whose result (interpreted as a multi-set) changes if the order of the input records is changed. In a stock-quotes database, for instance, retrieving all quotes concerning a given stock for a given day does not depend on order, because the collection of quotes does not depend on order. By contrast, finding the five price moving average in a trade table gives a result that depends on the order of the table. Query languages based on the relational data model can handle order-dependent queries only through add-ons. SQL:1999, for example, permits the use of a data ordering mechanism called a "window" in limited parts of a query. As a result, order-dependent queries become difficult to write in those languages and optimization techniques for these features, applied as pre- or post-enumerating phases, are generally crude. The goal of this paper is to show that when order is a property of the underlying data model and algebra, writing order-dependent queries in a language can be as natural as is their optimization. We introduce AQuery, an SQL-like query language and algebra that has from-the-ground-up support for order. We also present a framework for optimization of the order-dependent queries catagories it expresses. The framework is able to take advantage of the large body of query transformations on relational systems while incorporating new ones described here. We show by experiment that the resulting system is orders of magnitude faster than current SQL:1999 systems on many natural order-dependent queries.
Title: Secure Untrusted Data Repository (SUNDR)
Author(s): Li, Jinyuan; Krohn, Maxwell; Mazieres, David; Shasha, Dennis
Abstract:
We have implemented a secure network file system called SUNDR that guarantees the integrity of data even when malicious parties control the server. SUNDR splits storage functionality between two untrusted components, a block store and a consistency server. The block store holds all file data and most metadata. Without interpreting metadata, it presents a simple interface for clients to store variable-sized data blocks and later retrieve them by cryptographic hash.
The consistency server implements a novel protocol that guarantees close-to-open consistency whenever users see each other's updates. The protocol roughly consists of users exchanging version-stamped digital signatures of block server metadata, though a number of subtleties arise in efficiently supporting concurrent clients and group-writable files. We have proven the protocol's security under basic cryptographic assumptions. Without somehow producing signed messages valid under a user's (or the superuser's) public key, an attacker cannot tamper with a user's files---even given control of the servers and network. Despite this guarantee, SUNDR performs within a reasonable factor of existing insecure network file systems.
Title: Robust Model-Free Tracking of Non-Rigid Shape
Author(s): Torresani, Lorenzo; Hertzmann, Aaron; Bregler, Christoph
Abstract:
We present a robust algorithm for estimating non-rigid motion in video sequences. We build on recent methods for tracking video by enforcing global structure (such as rank constraints) on the tracking. These methods assume color constancy in the neighborhood of each tracked feature, an assumption that is violated by occlusions, deformations, lighting changes, and other effects. Our method identifies outliers while solving for flow. This allows us to obtain high-quality tracking from difficult sequences, even when there is no single "reference frame" in which all tracks are visible.
Title: Improved Link-Based Algorithms for Ranking Web Pages
Author(s): Wang, Ziyang
Abstract:
Several link-based algorithms, such as PageRank [19], HITS [15] and SALSA [16], have been developed to evaluate the popularity of web pages. These algorithms can be interpreted as computing the steady-state distribution of various Markov processes over web pages. The PageRank and HITS algorithms tend to over-rank tightly interlinked collections of pages, such as well-organized message boards. We show that this effect can be alleviated using a number of modications to the underlying Markov process. Specically, rather than weight all outlinks from a given page equally, greater weight is given to links between pages that are, in other respects, further off in the web, and less weight is given to links between pages that are nearby. We have experimented with a number of variants of this idea, using a number of different measures of ``distance'' in the Web, and a number different weighting schemes. We show that these revised algorithms often do avoid the over-ranking problem and give better overall rankings.
Title: A Distributed Adaptive Cache Update Algorithm for Dynamic Source Routing
Author(s): Yu, Xin; Kedem, Zvi M.
Abstract:
On-demand routing protocols use route caches to make routing decisions. Due to frequent topology changes, cached routes easily become stale. To address the cache staleness issue in DSR (the Dynamic Source Routing protocol), prior work mainly used heuristics with ad hoc parameters to predict the lifetime of a link or a route. However, heuristics cannot accurately predict timeouts because topology changes are unpredictable. In this paper, we present a novel distributed cache update algorithm to make route caches adapt quickly to topology changes without using ad hoc parameters. We define a new cache structure called a cache table to maintain the information necessary for cache updates. When a node detects a link failure, our algorithm proactively notifies all reachable nodes that have cached the broken link in a distributed manner. We compare our algorithm with DSR with path caches and with Link-MaxLife through detailed simulations. We show that our algorithm significantly outperforms DSR with path caches and with Link-MaxLife.
Title: An Embedded Boundary Integral Solver for the Unsteady Incompressible Navier-Stokes Equations
Author(s): Biros, George; Ying, Lexing; Zorin, Denis
Abstract:
We present a new method for the solution of the unsteady incompressible Navier-Stokes equations. Our goal is to achieve a robust and scalable methodology for two and three dimensional incompressible laminar flows. The Navier-Stokes operator discretization is done using boundary integrals and structured-grid finite elements. We use a two-step second-order accurate scheme to advance the equations in time. The convective term is discretized by an explicit, but unconditionally stable, semi-Lagrangian formulation; at each time step we inverta spatial constant-coefficient (modified) Stokes operator. The Dirichlet problem for the modified Stokes operator is formulated as a double-layer boundary integral equation. Domain integrals are computed via finite elements with appropriate forcing singularities to account for the irregular geometry. We use a velocity-pressure formulation which we discretize with bilinear elements (Q1-Q1), which give equal order interpolation for the velocities and pressures. Stabilization is used to circumvent the div-stability condition for the pressure space. The integral equations are discretized by Nystrom's method. For the specific approximation choices the method is second order accurate. We will present numerical results and discuss the performance and scalability of the method in two dimensions.
Title: Adaptive Service Access in Shared Wireless Environments
Author(s): Fu, Xiaodong; Karamcheti, Vijay
Abstract:
Adaptation to network changes is important to provide applications with seamless service access in a shared wireless environment. Path-based mechanisms, which augment data paths with application-specific ``bridging'' components guided by minimal application input, are promising approaches for providing such support. Although shown to be successful in static network situations, their utility under dynamically changing network conditions has not been well-studied.
In this paper, we answer this question by investigating the performance of a path-based approach, CANS (Composable Adaptive Network Services) in a dynamic environment. We find that the suitability of CANS-like approaches is hampered by inaccurate component models and expensive planning and reconfiguration. We address these problems by extending CANS to support (1) generalized path creation strategies to match different application performance preferences; (2) refined component models that enable adjustment at a finer granularity and more accurately represent behavior of component compositions; and (3) local planning and reconfiguration mechanisms that improve responsiveness. We present the problems and evaluate our solutions using an image streaming application. The experiment results show that our solutions are effective.
Title: Balancing Neumann-Neumann Preconditioners for Mixed Approximations of Heterogeneous Problems in Linear Elasticity
Author(s): Goldfeld, Paulo; Pavarino, Luca F.; Widlund, Olof B.
Abstract:
Balancing Neumann-Neumann methods are extented to mixed formulations of the linear elasticity system with discontinuous coefficients, discretized with mixed finite or spectral elements with discontinuous pressures.
These domain decomposition methods implicitly eliminate the degrees of freedom associated with the interior of each subdomain and solve iteratively the resulting saddle point Schur complement using a hybrid preconditioner based on a coarse mixed elasticity problem and local mixed elasticity problems with natural and essential boundary conditions. A polylogarithmic bound in the local number of degrees of freedom is proven for the condition number of the preconditioned operator in the constant coefficient case.
Parallel and serial numerical experiments confirm the theoretical results, indicate that they still hold for systems with discontinuous coefficients, and show that our algorithm is scalable, parallel, and robust with respect to material heterogeneities. The results on heterogeneous general problems are also supported in part by our theory.
Title: Overlapping Schwarz Preconditioners for Spectral Nedelec Elements for a Model Problem in H(curl)
Author(s): Hientzsch, Bernhard
Abstract:
A two-level overlapping domain decomposition method is analyzed for a Nedelec spectral element approximation of a model problem appearing in the solution of Maxwell's equations. The overlap between subdomains can consist of entire spectral elements or rectangular subsets of spectral elements. For fixed relative overlap and overlap made from entire elements, the condition number of the method is bounded, independently of the mesh size, the number of subregions, the coefficients and the degree of the spectral elements. In the case of overlap including just parts of spectral elements, a bound linear in the degree of the elements is proven. It is assumed that the coarse and fine mesh are quasi-uniform and shape-regular and that the domain is convex. Arguments that would not require quasi-uniformity of the coarse mesh and convexity of the domain are mentioned. Our work generalizes results obtained for lower-order Nedelec elements in Toselli [Numerische Mathematik (2000) 86:733-752]. Numerical results for the two-level algorithm in two dimensions are also presented, supporting our analysis.
Title: Dual-Primal FETI Methods for Stationary Stokes and Navier-Stokes Equations
Author(s): Li, Jing
Abstract:
Finite element tearing and interconnecting (FETI) type domain decomposition methods are first extended to solving incompressible Stokes equations. One-level, two-level, and dual-primal FETI algorithms are proposed. Numerical experiments show that these FETI type algorithms are scalable, i.e., the number of iterations is independent of the number of subregions into which the given domain is subdivided. A convergence analysis is then given for dual-primal FETI algorithms both in two and three dimensions.
Extension to solving linearized nonsymmetric stationary Navier-Stokes equations is also discussed. The resulting linear system is no longer symmetric and a GMRES method is used to solve the preconditioned linear system. Eigenvalue estimates show that, for small Reynolds number, the nonsymmetric preconditioned linear system is a small perturbation of that in the symmetric case. Numerical experiments also show that, for small Reynolds number, the convergence of GMRES method is similar to the convergence of solving symmetric Stokes equations with the conjugate gradient method. The convergence of GMRES method depends on the Reynolds number; the larger the Reynolds number, the slower the convergence.
Dual-primal FETI algorithms are further extended to nonlinear stationary Navier-Stokes equations, which are solved by using a Picard iteration. In each iteration step, a linearized Navier-Stokes equation is solved by using a dual-primal FETI algorithm. Numerical experiments indicate that convergence of the Picard iteration depends on the Reynolds number, but is independent of both the number of subdomains and the subdomain problem size.
Title: Dual-Primal FETI Methods for Incompressible Stokes and Linearized Navier-Stokes Equations
Author(s): Li, Jing
Abstract:
In this paper, a dual-primal FETI method is developed for solving incompressible Stokes equations approximated by mixed finite elements with discontinuous pressures in three dimensions. The domain of the problem is decomposed into non-overlapping subdomains, and the continuity of the velocity across the subdomain interface is enforced by introducing Lagrange multipliers. By a Schur complement procedure, the indefinite Stokes problem is reduced to a symmetric positive definite problem for the dual variables, i.e., the Lagrange multipliers. This dual problem is solved by a Krylov space method with a Dirichlet preconditioner. At each step of the iteration, both subdomain problems and a coarse problem on a coarse subdomain mesh are solved by a direct method. It is proved that the condition number of this preconditioned dual problem is independent of the number of subdomains and bounded from above by the product of the inverse of the inf-sup constant of the discrete problem and the square of the logarithm of the number of unknowns in the individual subdomain problems. Illustrative numerical results are presented by solving lid driven cavity problems. This algorithm is also extended to solving linearized non-symmetric Navier-Stokes equation.
Title: Efficiently Distributing Component-based Applications Across Wide-Area Environments
Author(s): Llambiri, Deni; Totok, Alexander; Karamcheti, Vijay
Abstract:
Distribution and replication of network-accessible applications has been shown to be an effective approach for delivering improved Quality of Service (QoS) to end users. An orthogonal trend seen in current-day network services is the use of component-based frameworks. Even though such component-based applications are natural candidates for distributed deployment, it is unclear if the design patterns underlying component frameworks also enable efficient service distribution in wide-area environments. In this paper, we investigate application design rules and their accompanying system-level support essential to a beneficial and efficient service distribution process. Our study targets the widely used Java 2 Enterprise Edition (J2EE) component platform and two sample component-based applications: Java Pet Store and RUBiS. Our results present strong experimental evidence that component-based applications can be efficiently distributed in wide-area environments, significantly improving QoS delivered to end users as compared to a centralized solution. Although current design patterns underlying component frameworks are not always suitable, we identify a small set of design rules for orchestrating interactions and managing component state that together enable efficient distribution. Futhermore, we show how enforcement of the identified design rules and automation of pattern implementation can be supported by container frameworks.
Title: Online Codes
Author(s): Maymounkov, Petar
Abstract:
We introduce online codes - a class of near-optimal codes for a very general loss channel which we call the free channel. Online codes are linear encoding / decoding time codes, based on sparse bipartite graphs, similar to Tornado codes, with a couple of novel properties: local encodability and rateless-ness. Local encodability is the property that each block of the encoding of a message can be computed independently from the others in constant time. This also implies that each encoding block is only dependent on a constant-sized part of the message and a few preprocessed bits. Rateless-ness is the property that each message has an encoding of practically infinite size.
We argue that rateless codes are more appropriate than fixed-rate codes for most situations where erasure codes were considered a solution. Furthermore, rateless codes meet new areas of application, where they are not replaceable by fixed-rate codes. One such area is information dispersal over peer-to-peer networks.
Title: Building Secure File Systems Out of Byzantine Storage
Author(s): Mazieres, David; Shasha, Dennis
Abstract:
This paper shows how to implement a trusted network file system on an untrusted server. While cryptographic storage techniques exist that allow users to keep data secret from untrusted servers, this work concentrates on the detection of tampering attacks and stale data. Ideally, users of an untrusted storage server would immediately and unconditionally notice any misbehavior on the part of the server. This ideal is unfortunately not achievable. However, we define a notion of data integrity called fork consistency in which, if the server delays just one user from seeing even a single change by another, the two users will never again see one another's changes - a failure easily detectable with on-line communication. We give a practical protocol for a multi-user network file system called SUNDR, and provfe that SUNDR offers fork consistency whether or not the server obeys the protocol.
Title: Image Denoising using a Gaussian Scale Mixture in the Wavelet Domain
Author(s): Portilla, Javier; Strela, Vasily; Wainwright, Martin J.; Simoncelli, Eero P.
Abstract:
We describe a method for removing noise from digital images, based on a statistical model of the coefficients of an overcomplete multi-scale oriented basis. Neighborhoods of coefficients at adjacent positions and scales are modeled as the product of two independent random variables: a Gaussian vector and a hidden positive scalar multiplier. The latter modulates the local variance of the coefficients in the neighborhood, and is thus able to account for the empirically observed correlation between the amplitudes of pyramid coefficients. Under this model, the Bayesian least squares estimate of each coefficient reduces to a weighted average of the local linear (Wiener) estimate over all possible values of the hidden multiplier variable. We demonstrate through simulations with images contaminated by additive Gaussian noise of known covariance that the performance of this method substantially surpasses that of previously published methods, both visually and in terms of mean squared error. In addition, we demonstrate the performance of the algorithm in removing sensor noise from high-ISO digital camera images.
Title: Workload Characterization of a Personalized Web Site - And Its Implications for Dynamic Content Caching
Author(s): Shi, Weisong; Wright, Randy; Collins, Eli; Karamcheti, Vijay
Abstract:
Requests for dynamic and personalized content increasingly dominate current-day Internet traffic; however, traditional caching architectures are not well-suited to cache such content. Several recently proposed techniques, which exploit reuse at the sub-document level, promise to add this shortcoming, but require a better understanding of the workloads seen on web sites that serve such content. In this paper, we study the characteristicsof a medium-sized personalized web site, NYUHOME, which is a customizable portal used by approximately 44,000 users from the New York University community. Our study leverages detailed server-side overheads, and the client-perceived request latencies. We then use these statistics to derive general implications for efficient caching and edge generation of dynamic content in the context of our ongoing CONCA project. Our study verifies both the need for and likely benefit from caching content at sub-document granularity, and points to additional opportunities for reducing client-perceived latency using prefetching, access prediction, and content transcoding.
Title: StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time
Author(s): Zhu, Yunyue; Shasha, Dennis
Abstract:
Consider the problem of monitoring tens of thousands of time series data streams in an online fashion and making decisions based on them. In addition to single stream statistics such as average and standard deviation, we also want to find high correlations among all pairs of streams. A stock market trader might use such a tool to spot arbitrage opportunities. This paper proposes efficient methods for solving this problem based on Discrete Fourier Transforms and a three level time interval hierarchy. Extensive experiments on synthetic data and real world financial trading data show that our algorithm beats the direct computation approach by several orders of magnitude. It also improves on previous Fourier Transform approaches by allowing the efficient computation of time-delayed correlation over any size sliding window and any time delay. Correlation also lends itself to an efficient grid-based data structure.The result is the first algorithm that we know of to compute correlations over thousands of data streams in real time. The algorithm is incremental,has fixed response time, and can monitor the pairwise correlations of 10,000 streams on a single PC. The algorithm is embarrassingly parallelizable.
Title: Genomics via Optical Mapping IV: Sequence Validation via Optical Map Matching
Author(s): Antoniotti, Marco; Anantharaman, Thomas; Paxia, Salvatore; Mishra, Bud
Abstract:
This paper describes the unerlying mathematical model and the dynamic programming algorithm technique for the valicdation of a (DNA) sequence against a (DNA) map. The sequence can be obtained from a variety of sources (r,g, GenBAnk, Sanger's Lab, or Celera P.E.) and it is assumed to be written out as a string of nucleotides. The map is an ordered restriction map obtained through an optical mapping process and is augmented with statistical information which will ne used to place (or not) the sequence in the genome.
Our approach has many other applications beyond validation: e.g. map-based sequence assembly, phasing sequence contigs, detecting and closing gaps and annotation of partially sequenced genomes to find open reading frames, genes and synteny groups.
We tested our system by checking various maps against publicly available sequence data for Plasmodium falciparum.
Title: dRBAC: Distributed Role-based Access Control for Dynamic Environments
Author(s): Freudenthal, Eric; Pesin, Tracy; Port, Lawrence; Keenan, Edward; Karamcheti, Vijay
Abstract:
Distributed Role-Based Access Control (dRBAC) is a scalable, decentralized trust-management and access-control mechanism for systems that span multiple administrative domains. dRBAC represents controlled actions in terms of roles , which are defined within the trust domain of one entity and can be transitively delegated to other roles within a different trust domain. dRBAC utilizes PKI to identify all entities engaged in trust-sensitive operations and to validate delegation certificates. The mapping of roles to authorized name spaces obviates the need to identify additional policy roots. dRBAC distinguishes itself from previous trust management and role-based access control approaches in its support for three features: (1) third-party delegations , which improve expressiveness by allowing an entity to delegate roles outside its namespace when authorized by an explicit delegation of assignment ; (2) valued attributes , which modulate transferred access rights via mechanisms that assign and manipulate numerical values associated with roles; and (3) credential subscriptions , which enable continuous monitoring of established trust relationships using a pub/sub infrastructure to track the status of revocable credentials. This paper describes the dRBAC model, its scalable implementation using a graph-based model of credential discovery and validation, and its application in a larger security context.
Title: Credentialed Secure Communication "Switchboards"
Author(s): Freudenthal, Eric; Port, Lawrence; Keenan, Edward; Pesin, Tracy; Karamcheti, Vijay
Abstract:
Software development in distributed computation is complicated by the extra overhead of communication between connected, dispersed hosts in dynamically changing, multiple administrative domains. Many disparate technologies exist for trust management, authentication, secure communication channels, and service discovery, but composing all of these elements into a single system can outweigh principal development efforts. The NYU Disco Switchboard consolidates these connectivity issues into a single convenient, extensible architecture, providing an abstraction for managing secure, host-pair communication with connection monitoring facilities. Switchboard extends the secure authenticated communication channel abstraction provided by standard interfaces such as SSL/TLS with mechanisms to support trust management, key sharing, service discovery, and connection liveness and monitoring. We present an extensible architecture which is particularly useful in dynamically changing, distributed coalition environments. Applications that utilize Switchboard benefit from the availability of authentication, trust management, cryptography, and discovery, while retaining the simplicity of a common interface.
Title: DisCo: A Distribution Infrastructure for Securely Deploying Decomposable Services in Partly Trusted Environments
Author(s): Freudenthal, Eric; Keenan, Edward; Pesin, Tracy; Port, Lawrence; Karamcheti, Vijay
Abstract:
The growing popularity of network-based services and peer-to-peer networks has resulted in situations where components of a distributed application often need to execute in environments that are only partly trusted by the application's owner. Such deployment into partial or unstable trust environments exacerbates the classical problems of distributing decomposable services: authentication and access control, trust management, secure communication, code distribution and installation, and process rights management. Unfortunately, the application developer's burden of coping with these latter issues often dominates the benefits of service distribution. The DisCo infrastructure is specifically targeted to the development of systems and services deployed into coalition environments: networks of users and hosts administered by multiple authorities with changing trust relationships. The DisCo infrastructure provides application-neutral support for the classical problems of distributed services, thereby relieving the developer of the burden of independently managing these features. DisCo also includes support for continuously monitoring established connections, enabling corrective action from an application to cope with changing trust relationships. Our experience with building a secure video distribution service using the DisCo toolkit indicates that the latter permits distributed secure deployment into a partly trusted environment with minimal application developer effort, affording the advantages of natural expression and convenient deployment without compromising on efficiency.
Title: Title: Automatic Deployment of Transcoding Components for Ubiquitous, Network-Aware Access to Internet Services
Author(s): Fu, Xiadong; Shi, Weisong; Karamacheti, Vijay
Abstract:
Advances in wireless communication together with the growing number of mobile end devices hold the potential of ubiquitous access to sophisticated internet services; however, such access must cope with an inherent mismatch between the low-bandwidth, limited-resource characteristics of mobile devices and the high-bandwidth expectations of many content-rich services. One promising way of bridging this gap is by deploying application-specific components on the path between the device and service, which perform operations such as protocol conversion and content transcoding. Although several researchers have proposed infrastructures allowing such deployment, most rely on static, hand-tuned deployment strategies restricting their applicability in dynamic situations.
In this paper, we present an automatic approach for the dynamic deployment of such transcoding components, which can additionally be dynamically reconfigured as required. Our approach relies on three components: (a) a high-level integrated type-based specification of components and network resources, essential for "late binding" components to paths; (b) an automatic path creation strategy that selects and maps components so as to optimize a global metric; and (c) system support for low-overhead path reconfiguration, consisting of both restrictions on component interfaces and protocols satisfying application semantic continuity requirements. We comprehensively evaluate the effectiveness of our approach over a range of network and end-device characteristics using both a web-access scenario where client performance is for reduced access time, and a streaming scenario where client preference is for increased throughput. Our results verify that (1) automatic path creation and reconfiguration is achievable and does in fact yield substantial performance benefits; and (2) that despite their flexibility, both path creation and reconfiguration can be supported with low run-time overhead.
Title: Fast Solvers and Domain Decomposition Preconditioners for Spectral Element Discretizations of Problems in H(curl)
Author(s): Hientzsch, Bernhard
Abstract:
For problems with piecewise smooth solutions, spectral element methods hold great promise. They combine the exponential convergence of spectral methods with the geometric flexibility of finite elements. Spectral elements are well-established for scalar elliptic problems and problems of fluid dynamics, and recently the first methods for problems in H(curl) and H(div) were proposed. In this dissertation we study spectral element methods for a model problem. We first consider Maxwell's equation and derive the model problem in H(curl). Then we introduce anisotropic spectral Nédélec element discretizations with variable numerical integration for the model problem. We discuss their structure, and their convergence and approximation properties. We also obtain results on the norm of the Nédélec interpolants between Nédélec and Raviart-Thomas spaces of different degree, needed for the computation of the splitting constant for the domain decomposition preconditioner and the numerical analysis of nonlinear equations. We also prove a Friedrichs-like inequality for the model problem for the spectral case.
We present fast direct solvers for the model problem on separable domains, taking advantage of the tensor product discretization and fast diagonalization methods. We use those fast solvers as local solvers in domain decomposition methods for problems that are too large to be solved directly, or posed on non-separable domains, and use them to compute and subassemble the Schur complement system corresponding to the interface. We also apply them in the direct solution of the Schur complement system for general domains.
As an example for the domain decomposition methods that can be implemented with these tools, we introduce overlapping Schwarz methods, both one-level and two-level versions.
We extend the theory for overlapping Schwarz methods to the spectral Nédélec element case. We reduce the proof of the condition number estimate to three basic estimates, and present theoretical and numerical results on those estimates. The technique of the proof works in both the two-dimensional and three-dimensional case.
We also present numerical results for one-level and two-level methods in two dimensions.
Title: Overlapping Schwarz Algorithms using Discontinuous Iterates for Poisson's Equation
Author(s): Kimn, Jung-Han
Abstract:
A new type of overlapping Schwarz methods, the overlapping Schwarz algorithms using discontinuous iterates is constructed from the classical overlapping Schwarz algorithm. It allows for discontinuities at each artificial interface. The new algorithm, for Poisson's equation, can be considered as an overlapping version of Lions' Robin iteration method for which little is known concerning the convergence. Since overlap improves the performance of the classical algorithms considerably, the existence of a uniform convergence factor is the fundamental question for our new algorithm.
The first part of this thesis concerns the formulation of the new algorithm. A variational formulation of the new algorithm is derived from the classical algorithms. The discontinuity of the iterates of the new algorithm is the fundamental distinction from the classical algorithms. To analyze this important property, we use a saddle-point approach. We show that the new algorithm can be interpreted as a block Gauss-Seidel method with dual and primal variables.
The second part of the thesis deals with algebraic properties of the new algorithm. We prove that the fractional steps of the new algorithm are nonsymmetric. The algebraic systems of the primal variables can be reduced to those of the dual variables. We analyze the structure of the dual formulation algebraically and analyze its numerical behavior.
The remaining part of the thesis concerns convergence theory and numerical results for the new algorithm. We first extend the classical convergence theory, without using Lagrange multipliers, in some limited cases. A new theory using Lagrange multiplier is then introduced and we find conditions for the existence of uniform convergence factors of the dual variables, which implies convergence of the primal variables, in the two overlapping subdomain case with any Robin boundary condition. Our condition shows a relation between the given conditions and the artificial interface condition. The numerical results for the general case with cross points are also presented. They indicate possible extensions of our results to this more general case.
Title: Dual-Primal FETI Methods for Three-dimensional Elliptic Problems with Heterogeneous Coefficients
Author(s): Klawonn, Axel; Widlund, Olof; Dryja, Maksymilian
Abstract:
In this paper, certain iterative substructuring methods with Lagrange multipliers are considered for elliptic problems in three dimensions. The algorithms belong to the family of dual--primal FETI methods which have recently been introduced and analyzed successfully for elliptic problems in the plane. The family of algorithms for three dimensions is extended and a full analysis is provided for the new algorithms. Particular attention is paid to finding algorithms with a small primal subspace since that subspace represents the only global part of the dual--primal preconditioner. It is shown that the condition numbers of several of the dual--primal FETI methods can be bounded polylogarithmically as a function of the dimension of the individual subregion problems and that the bounds are otherwise independent of the number of subdomains, the mesh size, and jumps in the coefficients. These results closely parallel those for other successful iterative substructuring methods of primal as well as dual type.
Title: A Dual-Primal FETI Method for Incompressible Stokes Equations
Author(s): Li, Jing
Abstract:
In this paper, a dual-primal FETI method is developed for incompressible Stokes equation approximated by mixed finite elements with discontinuous pressures. The domain of the problem is decomposed into nonoverlapping subdomains, and the continuity of the velocity across the subdomain interface is enforced by introducing Lagrange multipliers. By a Schur complement procedure, solving the indefinite Stokes problem is reduced to solving a symmetric positive definite problem for the dual variables, i.e., the Lagrange multipliers. This dual problem is solved by a Krylov space method with a Dirichlet preconditioner. At each step of the iteration, both subdomain problems and a coarse problem on the course subdomain mesh are solved by a direct method. It is proved that the condition number of this preconditioned problem is independent of the number of subdomains and bounded from above by the product of the inverse of the inf-sup constant of the discrete problem and the square of the logarithm of the number of unknowns in the individual subdomain problems. Illustrative results are presented by solving a lid driven cavity problem.
Title: Title: Balancing Neumann-Neumann Methods for Incompressible Stokes Equations
Author(s): Pavarino, Luca; Widlund, Olof
Abstract:
This paper describes the unerlying mathematical model and the Balancing Neumann-Neumann methods are introduced and studied for incompressible Stokes equations discretized with mixed finite or spectral elements with discontinuous pressures. After decomposing the original domain of the problem into nonoverlapping subdomains, the interior unknowns, which are the interior velocity component and all except the constant pressure component, of each subdomain problem are implicitly eliminated. The resulting saddle point Schur complement is solved with a Krylov space method with a balancing Neumann-Neumann preconditioner based on the solution of a coarse Stokes problem with a few degrees of freedom per subdomain and on the solution of local Stokes problems with natural %Neumann velocity and essential boundary conditions on the subdomains. This preconditioner is of hybrid form in which the coarse problem is treated multiplicatively while the local problems are treated additively. The condition number of the preconditioned operator is independent of the number of subdomains and is bounded from above by the product of the square of the logarithm of the local number of unknowns in each subdomain and the inverse of the inf-sup constants of the discrete problem and of the coarse subproblem. Numerical results show that the method is quite fast; they are also fully consistent with the theory.
Title: Modeling Object Characteristics of Dynamic Web Content
Author(s): Shi, Weisong; Collins, Eli; Karamcheti, Vijay
Abstract:
Requests for dynamic and personalized content increasingly dominate current-day Internet traffic, driven both by a growth in dynamic web services and a ``trickle-down'' effect stemming from the effectiveness of caches and content-distribution networks at serving static content. To efficiently serve this trend, several server-side and cache-side techniques have recently been proposed. Although such techniques, which exploit different forms of reuse at the sub-document level, appear promising, a significant impediment to their widespread deployment is (1) the absence of good models describing characteristics of dynamic web content, and (2) the lack of effective synthetic content generators, which reduce the effort involved in verifying the effectiveness of a proposed solution.
This paper addresses both of these shortcomings. Its primary contribution is a set of models that capture the characteristics of dynamic content both in terms of independent parameters such as the distributions of object sizes and their freshness times, as well as derived parameters such as content reusability across time and linked documents. These models are derived from an analysis of the content from six representative news and e-commerce sites, using both size-based and level-based splitting techniques to infer document objects. A secondary contribution is a Tomcat-based dynamic content emulator, which uses these models to generate ESI-based dynamic content and serve requests for whole document and separate objects. To validate both the models and the design of the content emulator, we compare the bandwidth requirements seen by an idealized cache simulator that is driven by both the real trace and emulated content. Our simulation results verify that the output of the content emulator effectively and efficiently models real content.
Title: Enforcing Resource Sharing Agreements among Distributed Server Cluster
Author(s): Zhao, Tao; Karamcheti, Vijay
Abstract:
Future scalable, high throughput, and high performance applications are likely to execute on platforms constructed by clustering multiple autonomous distributed servers, with resource access governed by agreements between the owners and users of these servers. As an example, application service providers (ASPs) can pool their resources together according to pre-specified sharing agreements to provide better services to their customers. Such systems raise several new resource management challenges, chief amongst which is the enforcement of agreements to ensure that, despite the distributed nature of both requests and resources, user requests only receive a predetermined share of the aggregate resource and that the resources of a participant are not misused. Current solutions only enforce such agreements at a coarse granularity and in a centralized fashion, limiting their applicability for general workloads.
This paper presents an architecture for the distributed enforcement of resource sharing agreements. Our approach exploits a uniform application-independent representation of agreements, and combines it with efficient time-window based coordinated queuing algorithms running on multiple nodes. We have successfully implemented this general strategy in two different network layers: a layer-7 HTTP redirector and a layer-4 packet redirector, which redirect connection requests from distributed clients to a cluster of distributed servers. Our measurements of both implementations verify that our approach is general and effective: different client groups receive service commensurate with their agreements.
Title: Describing Spatial Transitions Using Mereotopological Relations Over Histories
Author(s): Davis, Ernest
Abstract:
Muller (1998) develops a language of motion and shape change in terms of topological relations and temporal order relations between regions of space-time (histories). He uses this language to state and prove the transition rules developed in (Randell, Cui, and Cohn, 1992) that constrain the changes in spatial relations possible for objects whose shape changes continuously. Unfortunately, Muller's statement of the transition rules is inadequate. This paper presents an alternative statement of these transition rules.
Title: Continuous Shape Transformation and Metrics on Shapes
Author(s): Davis, Ernest
Abstract:
A natural approach to defining continuous change of shape is in terms of a metric that measures the difference between two regions. We consider four such metrics over regions: the Hausdorff distance, the dual-Hausdorff distance, the area of the symmetric difference, and the optimal-homeomorphism metric. Each of these gives a different criterion for continuous change. We establish qualitative properties of all of these; in particular, the continuity of basic functions such as union, intersection, set difference, area, distance, and the boundary function; the transition graph between RCC relations (Randell, Cui, and Cohn, 1992). We discuss the physical significance of these different criteria.
We also show that the history-based definition of continuity proposed by Muller (1998) is equivalent to continuity with respect to the Hausdorff distance. An examination of the difference between the transition rules that we have found for the Hausdorff distance and the transition theorems that Muller derives leads to the conclusion that Muller's analysis of state transitions is not adequate. We propose an alternative characterization of transitions in Muller's first-order language over histories.
Title: CANS: Composable, Adaptive Network Services Infrastructure
Author(s): Fu, Xiaodong; Shi, Weisong; Akkerman, Anatoly; Karamcheti, Vijay
Abstract:
The growth of the internet has been fueled by an increasing number of sophisticated network-accessible services. Unfortunately, the high bandwidth and processing requirements of such services is at odds with current trends towards increased variation in network characteristics and a large diversity in end devices. Ubiquitous access to sucr services requires the injection of additional functionality into the network to handle protocol conversion, data transcoding, and in general bridge disparate portions of the physical network. Several researchers have proposed infrastructures for injecting such functionality; however, many challenges remain before these infrastructures can be widely deployed.
CANS is an application-level infrastructure for injecting application-specific components into the network that focuses on three such challenges: (a) efficient and dynamic composition of individual components; (b) dynamic and distributed adaptation of injected components in response to system conditions; and (c) support for legacy applications and services. The network view supported by CANS consists of applications, stateful services, and data paths between them built up from mobile soft-state objects called drivers. Both services and data paths can be dynamically created and reconfigured: a planning and event propagation model assists in distributed adaptation, and a run-time type-based composition model dictates how new services and drivers are integrated with existing components. An interception layer that virtualizes network bindings permits legacy applications to plug into the CANS infrastructure, and a delegation model does the same for legacy services.
This paper describes the CANS architecture and implementation, and a case study involving a shrink-wrapped client application in a dynamically changing network environment where CANS was used to improve overall user experience.
Title: Paint By Relaxation
Author(s): Hertzmann, Aaron
Abstract:
We use relaxation to produce painted imagery from images and video. An energy function is first specified; a painting is then generated by performing a search for a painting with minimal energy. The appeal of this strategy is that, ideally, we need only specify what we want, not how to directly compute it. Because the energy function is very difficult to optimize, we use a relaxation algorithm combined with various search heuristics.
This formulation allows us to specify painting style by varying the relative weights of energy terms. The basic energy function yields an economical painting that effectively conveys an image with few strokes. This economical style produces moderate temporal coherence when processing video, without losing the essential 2D quality of the painting. The system allows as fine user control as desired: the user may interac-tively change the painting style, specify variations of style over an image, and/or add specific strokes to the painting. Procedural stroke textures may be used to enhance visual appeal.
Title: Verifying a Design Pattern for the Fault-Tolerant Execution of Parallel Programs
Author(s): Kindler, Ekkart; Shasha, Dennis
Abstract:
We present a protocol for the fault-tolerant execution of parallel programs. The protocol leaves the implementation free to make choices concerning efficiency tradeoffs. Thus, we are proposing a design pattern rather than a fully specified algorithm. The protocol is modeled with the help of Petri nets.
Based on the Petri net model, we formally prove the correctness of the design pattern. This verification serves two goals: first, it guarantees the correctness of the design pattern; second, it serves as a test case for the underlying verification technique.
Title: An Overlapping Domain Decomposition Preconditioner for a Class of Discontinuous Galerkin Approximations of Advection-Diffusion Problems
Author(s): Lasser, Caroline; Toselli, Andrea
Abstract:
We consider a scalar advection-diffusion problem and a recently proposed discontinuous Galerkin approximation, which employs discontinuous finite element spaces and suitable bilinear forms containing interface terms that ensure consistency. For the corresponding sparse, non-symmetric linear system, we propose and study an additive, two--level overlapping Schwarz preconditioner, consisting of a coarse problem on a coarse triangulation and local solvers associated to suitable problems defined on a family of subdomains.
This is a generalization of the corresponding overlapping method for approximations on continuous finite element spaces. Related to the lack of continuity of our approximation spaces, some interesting new features arise in our generalization, which have no analog in the conforming case.
We prove an upper bound for the number of iterations obtained by using this preconditioner with GMRES, which is independent of the number of degrees of freedom of the original problem and the number of subdomains. The performance of the method is illustrated by several numerical experiments for different test problems, using linear finite elements in two dimensions.
Title: A Numerical Study of FETI Algorithms for Mortar Finite Element Methods
Author(s): Stefanica, Dan
Abstract:
The Finite Element Tearing and Interconnecting (FETI) method is an iterative substructuring method using Lagrange multipliers to enforce the continuity of the finite element solution across the subdomain interface. Mortar finite elements are nonconforming finite elements that allow for a geometrically nonconforming decomposition of the computational domain into subregions and, at the same time, for the optimal coupling of different variational approximations in different subregions. We present a numerical study of FETI algorithms for elliptic self-adjoint equations discretized by mortar finite elements. Several preconditioners which have been successful for the case of conforming finite elements are considered. We compare the performance of our algorithms when applied to classical mortar elements and to a new family of biorthogonal mortar elements and discuss the differences between enforcing mortar conditions instead of continuity conditions for the case of matching nodes across the interface. Our experiments are carried out for both two and three dimensional problems, and include a study of the relative costs of applying different preconditioners for mortar elements.
Title: Domain Decomposition Methods for Mortar Finite Elements
Author(s): Stefanica, Dan
Abstract:
Domain decomposition methods are powerful iterative methods for solving systems of algebraic equations arising from the discretization of partial differential equations by, e.g., finite elements. The computational domain is decomposed into overlapping or nonoverlapping subdomains. The problem is divided into, or assembled from, smaller subproblems corresponding to these subdomains. In this dissertation, we focus on domain decomposition methods for mortar finite elements, which are nonconforming finite element methods that allow for a geometrically nonconforming decomposition of the computational domain into subregions and for the optimal coupling of different variational approximations in different subregions.
We introduce a FETI method for mortar finite elements, and provide numer- ical comparisons of FETI algorithms for mortar finite elements when different preconditioners, given in the FETI literature, are considered. We also analyze the complexity of the preconditioners for the three dimensional versions of the algorithms.
We formulate a variant of the balancing method for mortar finite elements, which uses extended local regions to account for the nonmortar sides of the subre- gions. We prove a polylogarithmic condition number estimate for our algorithm in the geometrically nonconforming case. Our estimate is similar to those for other Neumann{Neumann and substructuring methods for mortar finite elements.
In addition, we establish several fundamental properties of mortar finite elements: the existence of the nonmortar partition of any interface, the L^2 stability of the mortar projection for arbitrary meshes on the nonmortar side, and prove Friedrichs and Poincare inequalities for geometrically nonconforming mortar elements.
Title: FETI domain decomposition methods for scalar advection-diffusion problems
Author(s): Toselli, A.
Abstract:
In this paper, we show that iterative substructuring methods of Finite Element Tearing and Interconnecting type can be successfully employed for the solution of linear systems arising from the finite element approximation of scalar advection-diffusion problems. Using similar ideas as those of a recently developed Neumann-Neumann method, we propose a one-level algorithm and a class of two-level algorithms, obtained by suitably modifying the local problems on the subdomains. We present some numerical results for some significant test cases. Our methods appear to be optimal for flows without closed streamlines and possibly very small values of the viscosity. They also show very good performances for rotating flows and moderate Reynolds numbers. Therefore, the algorithms proposed appear to be well-suited for many convection-dominated problems of practical interest.
Title: hp-finite element approximations on non-matching grids for partial differential equations with non-negative characteristic form
Author(s): Toselli, Andrea
Abstract:
We propose and analyze a domain decomposition method on non-matching grids for partial differential equations with non-negative characteristic form. No weak or strong continuity of the finite element functions, their normal derivatives, or linear combinations of the two is imposed across the boundaries of the subdomains. Instead, we employ suitable bilinear forms defined on the common interfaces, typical of discontinuous Galerkin approximations. We prove an error bound which is optimal with respect to the mesh-size and suboptimal with respect to the polynomial degree. Our analysis is valid for arbitrary shape-regular meshes and arbitrary partitions into subdomains. Our method can be applied to advective, diffusive, and mixed-type equations, as well, and is well-suited for problems coupling hyperbolic and elliptic equations.
Title: Expressing and Enforcing Distributed Resource Sharing Agreements
Author(s): Zhao, Tao; Karamcheti, Vijay
Abstract:
Advances in computing and networking technology, and an explosion in information sources has resulted in a growing number of distributed systems getting constructed out of resources contributed by multiple sources. Use of such resources is typically governed by sharing agreements between owning principals, which limit both who can access a resource and in what quantity. Despite their increasing importance, existing resource management infrastructures offer only limited support for the expression and enforcement of sharing agreements, typically restricting themselves to identifying compatible resources. In this paper, we present a novel approach building on the concepts of tickets and currencies to express resource sharing agreements in an abstract, dynamic, and uniform fashion. We also formu-late the allocation problem of enforcing these agreements as a linear-programming model, automatically factoring the transitive availability of resources via chained agreements. A case study modeling resource sharing among ISP-level web proxies shows the benefits of enforcing transitive agreements: worst-case waiting times of clients accessing these proxies improves by up to two orders of magnitude.
Title: Comic Strips for Algorithm Visualization
Author(s): Biermann, H.; Cole, R.
Abstract:
This paper presents visualizations of binary search trees and splay trees. The visualizations comprise sequences of figures or frames, called comic strips. Consecutive frames are viewed two at a time to facilitate user (viewer) understanding of the algorithm steps. The visualizations are implemented in Java to facilitate their wide use. This paper explores several other considerations in the design of instructional visualizations.
Title: Stateless Remote Environment Navigation with View Compression
Author(s): Biermann, H.; Hertzmann, A.; Meyer, J.; Perlin, K.
Abstract:
We present a set of very low bandwidth techniques for navigating remote environments. In a typical setup using our system, a virtual environment resides on a server machine, and one or more users explore the environment from client machines. Each client uses previous views of the environment to predict the next view, using the known camera motion and image-based rendering techniques. The server performs the same prediction, and sends only the difference between the predicted and actual view. Compressed difference images require significantly less bandwidth than the compressed images of each frame, and thus can yield much higher frame rates. To request a view, the client simply sends the coordinates of the desired view and of the previous view to the server. This avoids the overhead of maintaining connections between the server and each client.
No restrictions are placed on the scene or the camera motions; the view compression technique may be used with arbitrarily complex 3D scenes or dynamically changing views from a web camera or a digital television broadcast. A lossy compression scheme is presented in which the client estimates the cumulative error in each frame, and requests a comprete refresh before errors become noticable.
This work is applicable to remote exploration of virtual worlds such as on head-mounted displays, Digital Television, or over the Internet.
Title: Piecewise Smooth Subdivision Surfaces with Normal Control
Author(s): Biermann, H.; Levin, A.; Zorin, D.
Abstract:
In this paper we introduce improved rules for Catmull-Clark and Loop subdivision that overcome several problems with the original schemes (lack of smoothness at extraordinary boundary vertices, folds near concave corners). In addition, our approach to rule modification allows generation of surfaces with prescribed normals, both on the boundary and in the interior, which considerably improves control of the shape of surfaces.
Title: Recovering Non-Rigid 3D Shape from Image Streams
Author(s): Bregler, C.; Hertzmann, A.; Biermann, H.
Abstract:
This paper addresses the problem of recovering 3D non-rigid shape models from image sequences. For example, given a video recording of a talking person, we would like to estimate a 3D model of the lips and the full head and its internal modes of variation. Many solutions that recover 3D shape from 2D image sequences have been proposed; these so-called structure-from-motion techniques usually assume that the 3D object is rigid. For example, Tomasi and Kanade's factorization technique is based on a rigid shape matrix, which produces a tracking matrix of rank 3 under orthographic projection. We propose a novel technique based on a non-rigid model, where the 3D shape in each frame is a linear combination of a set of basis shapes. Under this model, the tracking matrix is of higher rank, and can be factored in a three step process to yield to pose, configuration and shape. We demonstrate this simple but effective algorithm on video sequences of speaking people. We were able to recover 3D non-rigid facial models with high accuracy.
Title: Variational Analysis of Non-Lipschitz Spectral Functions
Author(s): Burke, J. V.; Overton, M. L.
Abstract:
We consider spectral functions f o lambda, where f is any permutation-invariant mapping from C^n to R, and lambda is the eigenvalue map from C^{n X n} to C^n, ordering the eigenvalues lexicographically. For example, if f is the function "maximum real part", then f o lambda is the spectral abscissa, while if f is "maximum modulus", then f o lambda is the spectral radius. Both these spectral functions are continuous, but they are neither convex nor Lipschitz. For our analysis, we use the notion of subgradient extensively analyzed in Variational Analysis, R.T. Rockafellar and R. J.-B. Wets (Springer, 1998), which is particularly well suited to the variational analysis of non-Lipschitz spectral functions. We derive a number of necessary conditions for subgradients of spectral functions. For the spectral abscissa, we give both necessary and sufficient conditions for subgradients, and precisely identify the case where subdifferential regularity holds. We conclude by introducing the notion of semistable programming: minimizing a linear function of a matrix subject to linear constraints, together with the constraint that the eigenvalues of the matrix all lie in the right half-plane or on the imaginary axis. This is a generalization of semidefinite programming for non-Hermitian matrices. Using our analysis, we derive a necessary condition for a local minimizer of a semistable program, and give a generalization of the complementarity condition familiar from semidefinite programming.
Title: Optimizing Matrix Stability
Author(s): Burke, J. V.; Lewis, A. S.; Overton, M. L.
Abstract:
Given an affine subspace of square matrices, we consider the problem of minimizing the spectral abscissa (the largest real part of an eigenvalue). We give an example whose optimal solution has Jordan form consisting of a single Jordan block, and we show, using non-lipschitz variational analysis, that this behaviour persists under arbitrary small perturbations to the example. Thus although matrices with nontrivial Jordan structure are rare in the space of all matrices, they appear naturally in spectral abscissa minimization.
Title: Automatic Configuration and Run-time Adaptation of Distributed Applications
Author(s): Chang, F.; Karamcheti, V.
Abstract:
Current technology trends point towards both an increased heterogeneity in hardware platforms and an increase in the mechanisms available to applications for controlling how these platforms are utilized. These trends motivate the design of resource-aware distributed applications, which proactively monitor and control utilization of the underlying platform, ensuring a desired performance level by adapting their behavior to changing resource characteristics.
This paper describes a general framework for enabling application adaptation on distributed platforms. The framework combines programmer specification of alternate execution behaviors (configurations) with automatic support for deciding when and how to adapt, relying extensively on two components: (1) profile-based modeling of application behavior, automatically generated by measuring application performance in a virtual execution environment with controllable resource consumption, and (2)application-specific continuous monitoring of current resource characteristics. The latter detects when application configurations need to change while the former guides the selection of a new configuration.
We evaluate these framework components using an interactive image visualization application. Our results demonstrate that starting from a natural specification of alternate application behaviors and an automatically generated performance database, our framework permits the application to both configure itself in diverse distributed environments and adapt itself to run-time changes in resource characteristics so as to satisfy user preferences of output quality.
Title: Secure, User-level Resource-constrained Sandboxing
Author(s): Chang, F.; Itzkovitz, A.; Karamcheti, V.
Abstract:
The popularity of mobile and networked applications has resulted in an increasing demand for execution ``sandboxes''---environments that impose irrevocable qualitative and quantitative restrictions on resource usage. Existing approaches either verify application compliance to restrictions at start time (e.g., using certified code or language-based protection) or enforce it at run time (e.g., using kernel support, binary modification, or active interception of the application's interactions with the operating system). However, their general applicability is constrained by the fact that they are either too heavyweight and inflexible, or are limited in the kinds of sandboxing restrictions and applications they can handle.
This paper presents a secure user-level sandboxing approach for enforcing both qualitative and quantitative restrictions on resource usage of applications in distributed systems. Our approach actively monitors an application's interactions with the underlying system, proactively controlling it as desired to enforce the desired behavior. Our approach leverages a core set of user-level mechanisms that are available in most modern operating systems: fine-grained timers, monitoring infrastructure (e.g., the /proc filesystem), debugger processes, priority-based scheduling, and page-based memory protection. We describe implementations of a sandbox that imposes quantitative restrictions on CPU, memory, and network usage on two commodity operating systems: Windows NT and Linux. Our results show that application usage of resources can be restricted to within 3% of desired limits with minimal run-time overhead.
Title: Edge-Coloring Bipartite Multigraphs in $0(E\log D)$ Time
Author(s): Cole, R.; Ost, K.; Schirra, S.
Abstract:
Let $V$, $E$, and $D$ denote the cardinality of the vertex set, the cardinality of the edge set, and the maximum degree of a bipartite multigraph $G$. We show that a minimal edge-coloring of $G$ can be computed in $O(E\log D)$ time.
Title: Randomized Swap Matching in $O(m \log m \log |\Sigma| )$ time
Author(s): Cole, R.; Hariharan, R.
Abstract:
We give a randomized algorithm for the {\em Pattern Matching with Swaps} problem which runs in $O(m \log m \log |\Sigma| )$ time on a text of length $2m-1$ and a pattern of length $m$ drawn from an alphabet set of size $|\Sigma|$. This algorithm gives the correct answer with probability at least $1-\frac{1}{m}$ and does not miss a match. The best deterministic algorithm known for this problem takes $O(m^{4/3} \mbox{polylog}(m))$ time.
Title: An Improved Intra-procedural May-alias Analysis Algorithm
Author(s): Goyal, D.
Abstract:
Hind et al.~\cite({Hind99}) use a standard data flow framework \cite{Rosen79, Tarjan81} to formulate an intra-procedural may-alias computation. The intra-procedural aliasing information is computed by applying well-known iterative techniques to the Sparse Evaluation Graph (SEG) (\cite{Choi91}). The computation requires a transfer function for each node that causes a potential pointer assignment (relating the data flow information flowing into and out of the node), and a set of aliases holding at the entry node of the SEG. The intra-procedural analysis assumes that precomputed information in the form of summary functions is available for all function-call sites in the procedure being analyzed. The time complexity of the intra-procedural may-alias computation for the algorithm presented by Hind et al.~(\cite{Hind99}) is $O(N^6)$ in the worst case (where $N$ is the size of the SEG). In this paper we present a worst case $O(N^3)$ time algorithm to compute the same may-alias information.
Title: Interactive 3D Scene Reconstruction from Images
Author(s): Hertzmann, A.
Abstract:
We propose an interactive framework for reconstructing an arbitrary 3D scene consistent with a set of images, for use in example-based image synthesis. Previous research has used human input to specify feature matches, which are then processed off-line; however, it is very difficult to correctly match images without feedback. The central idea of this paper is to perform and display 3D reconstruction during user modification. By allowing the user to interactively manipulate the image correspondence and the resulting 3D reconstruction, we can exploit both the user's intuitive image understanding and the computer's processing power.
Title: A Domain Decomposition Method with Lagrange Multipliers for Linear Elasticity
Author(s): Klawonn, A.; Widlund, O. B.
Abstract:
A new domain decomposition method with Lagrange multipliers for elliptic problems is introduced. It is based on a reformulation of the well--known FETI method as a saddle point problem with both primal and dual variables as unknowns. The resulting linear system is solved with block--structured preconditioners combined with a suitable Krylov subspace method. This approach allows the use of inexact subdomain solvers for the positive definite subproblems. It is shown that the condition number of the preconditioned saddle point problem is bounded independently of the number of subregions and depends only polylogarithmically on the number of degrees of freedom of individual local subproblems. Numerical results are presented for a plane stress cantilever membrane problem.
Title: Parallel Programming for Everyone
Author(s): Schwartz, N.
Abstract:
This article proposes a novel architectural model which augments the latest developments in automatic program parallelization and distributed systems to achieve a level of practicality as yet unknown to either field. Today's premier automatic parallelization model is well suited to implementation on a network of commodity workstations (NOW) using only a very thin layer of software support. We describe a parallelizing compiler framework which greatly simplifies the parallelization of even highly complex sequential applications while producing extremely effective parallelizations for the NOW. We further show how our model greatly enhances programmer productivity through the use of minimally invasive C++ transformation techniques, aiding both debugging and portability.
Title: Memory Classification Analysis for Recursive C Structures
Author(s): Schwartz, N.
Abstract:
The long-time quest of the parallelizing compiler community for effective aggregate summarization techniques has led to increasingly sophisticated array section representations. In this paper, we show how the latest of these can be used for nested C structure summarization. We then show how this summarization notation can be used to make Shape Analysis precise on arbitrarily low-level code. Combining these techniques, we show that an appropriate generalization of Memory Classification Analysis, originally presented for Fortran programs, provides a flow dependence summarization technique for C code as well, while avoiding code normalization compared with previous techniques. In so doing, we break down perhaps the final conceptual barriers in the construction of practical programmer-friendly C parallelizing compilers.
Title: Sparse Constant Propagation via Memory Classification Analysis
Author(s): Schwartz, N.
Abstract:
This article presents a novel Sparse Constant Propagation technique which provides a heretofore unknown level of practicality. Unlike other techniques which are based on data flow, it is based on the execution-order summarization sweep employed in Memory Classification Analysis (MCA), a technique originally developed for array dependence analysis. This methodology achieves a precise description of memory reference activity within a summary representation that grows only linearly with program size. Because of this, the collected sparse constant information need not be artificially limited to satisfy classical data flow lattice requirements, which constrain other algorithms to discard information in the interests of efficient termination. Sparse Constant Propagation is not only more effective within the MCA framework, but it in fact generalizes the framework. Original MCA provids the means to break only simple induction and reduction types of flow-dependences. The integrated framework provides the means to also break flow-dependences for which array values can be propagated.
Title: Neumann-Neumann Methods for Vector Field Problems
Author(s): Toselli, A.
Abstract:
In this paper, we study some Schwarz methods of Neumann-Neumann type for some vector field problems, discretized with the lowest order Raviart-Thomas and Nedelec finite elements. We consider a hybrid Schwarz peconditioner consisting of a coarse component, which involves the solution of the original problem on a coarse mesh, and local ones, which involve the solution of Neumann problems on the elements of the coarse triangulation, also called substructures. We show that the condition number of the corresponding method is independent of the number of substructures and grows logarithmically with the number of unknowns associated with an individual substructure. It is also independent of the jumps of both the coefficients of the original problem. The numerical results presented validate our theoretical bound.
Title: A FETI Domain Decomposition Method for Maxwell's Equations with Discontinuous Coefficients in Two Dimensions
Author(s): Toselli, A.; Klawonn, A.
Abstract:
A class of FETI methods for the edge element approximation of vector field problems in two dimensions is introduced and analyzed. First, an abstract framework is presented for the analysis of a class of FETI methods where a natural coarse problem, associated with the substructures, is lacking. Then, a family of FETI methods for edge element approximations is proposed. It is shown that the condition number of the corresponding method is independent of the number of substructures and grows only polylogarithmically with the number of unknowns associated with individual substructures. The estimate is also independent of the jumps of both of the coefficients of the original problem. Numerical results validating the theoretical bounds are given. The method and its analysis can be easily generalized to Raviart-Thomas element approximations in two and three dimensions.
Title: Domain Decomposition Methods for Vector Field Problems
Author(s): Toselli, A.
Abstract:
Finite element approximation of vector equations gives rise to very large, sparse linear systems. In this dissertation, we study some domain decomposition methods for finite element approximations of vector--valued problems, involving the curl and the divergence operators. Edge and Raviart--Thomas finite element are employed. Problems involving the curl operator arise, for instance, when approximating Maxwell's equations and the stream function--vorticity formulation of Stokes' problem, while mixed approximations of second order elliptic equations and stabilized mixed formulations of the Stoke' problem give rise to problems involving the divergence operator.
We first consider Maxwell's equations in three dimensional conductive media using implicit time--stepping. We prove that the condition number of a two-level overlapping algorithm is bounded independently of the number of unknowns, the number of subregions, and the time step.
For the same equation in two dimensions, we consider two new iterative substructuring methods. The first one is based on individual edges, while the second one is a Neumann-Neumann method. We show that the condition numbers of the corresponding methods increase slowly with the number of unknowns in each substructure, but are independent of the time step and even large jumps of the coefficients. We also analyze similar preconditioners for a three--dimensional vector problem involving the divergence operator, and prove that the preconditioners are quasi--optimal and scalable in this case as well.
For each method, we provide a series of numerical experiments, that confirm our theoretical analysis.
This work generalizes well--known results for scalar second order elliptic equations and has required the development of several new technical tools.
Title: Transparent Network Connectivity in Dynamic Cluster Environments
Author(s): Xiadong, F.; Wang, H.; Karamcheti, V.
Abstract:
Improvements in microprocessor and networking performance have made networks of workstations a very attractive platform for high-end parallel and distributed computing. However, the effective deployment of such environments requires addressing two problems not associated with dedicated parallel machines: heterogeneous resource capabilities and dynamic availability. Achieving good performance requires that application components be able to migrate between cluster resources and efficiently adapt to the underlying resource capabilities. An important component of the required support is maintaining network connectivity, which directly impacts on the transparency of migration to the application and its performance after migration. Unfortunately, existing approaches rely on either extensive operating system modifications or new APIs to maintain network connectivity, both of which limits their wider applicability.
This paper presents the design, implementation, and performance of a transparent network connectivity layer for dynamic cluster environments. Our design uses the techniques of API interception and virtualization to construct a transparent layer in user space; use of the layer requires no modification either to the application or the underlying operating system and messaging layers. Our layer enables the migration of application components without breaking network connections, and additionally permits adaptation to the characteristics of the underlying networking substrate. Experiments with supporting a persistent socket interface in two environments---an Ethernet LAN on top of TCP/IP, and a Myrinet LAN on top of Fast Messages---show that our approach incurs minimal overheads and can effectively select the best substrate for implementing application communication requirements.
Title: Genomics via Optical Mapping (I): Probabilistic Analysis of Optical Mapping Models
Author(s): Anantharaman, T.; Mishra, B.
Abstract:
We study several simple models for optical mapping and explore their power and limitations when applied to the construction of maps of clones (e.g., lambdas, cosmids, BACs and YACs). We provide precise lower and upper bounds on the number of clone molecules needed to create the correct map of the clone. Our probabilistic analysis shows that as the number of clone molecules is increased in the optical mapping data, the probability of successful computation of the map jumps from 0 to 1 for fairly small number of molecules (for typical values of the parameterS, the transition point is around 70 molecules). These observations have been independently verified with extensive tests, with both in vitro and in silico data.
In addition, we compare our results with those derived by Karp and Shamir in a recent paper. We hope that this paper clarifies certain misconceptions and explains why the model proposed in Anantharaman et al. (1997) has proven so powerful.
Title: Genomics via Optical Mapping II(A): Restriction Maps from Partial Molecules and Variations
Author(s): Anantharaman, T.; Mishra, B.
Abstract:
In this paper, we extend an algorithmic approach to constructing ordered restriction maps from images of a population of individual DNA molecules (clones) digested by restriction enzymes. The original algorithm was capable of producing high-resolution, high-accuracy maps rapidly and in a scalable manner given a certain class of data errors, including contamination, sizing errors, false and missing restriction sites and unknown orientation. Here we extend this set of errors to include possibly broken molecules where the amount of breakage is not known beforehand, which is necessary for handling larger clones. In an earlier paper~\cite{optmapII}, we had shown that the problem of making maps from molecules with end fragments missing as the only source of error is NP-complete. We also show how to handle multiple reliability levels in the input data when calling restriction sites, where the actual reliability levels are not known and must be infered from the data.
Title: Genomics via Optical Mapping III: Contiging Genomic DNA and Variations
Author(s): Anantharaman, T.; Mishra, B.; Schwartz, D.
Abstract:
In this paper, we describe our algorithmic approach to constructing an alignment (Contig) of a set of optical maps created from the images of individual genomic DNA molecules digested by restriction enzymes. Generally, these DNA segments are sized in the range of 1--4 Mb. The problem of assembling clone contig maps is a simpler special case of this contig problem and is handled by our algorithms. The goal is to devise contiging algorithms capable of producing high-quality composite maps rapidly and in a scalable manner. The resulting software is a key component of our physical mapping automation tools and has been used routinely to create composite maps of various microorganisms (E.coli, P.falciparum and D.radioduran). The experimental results appear highly promising.
Title: An Efficient Primal-Dual Interior-Point Method for Minimizing a Sum of Euclidean Norms
Author(s): Anderson, K. D.; Christiansen, E.; Conn, A. R.; Overton, M. L.
Abstract:
The problem of minimizing a sum of Euclidean norms dates from the 17th century and may be the earliest example of duality in the mathematical programming literature. This nonsmooth optimization problem arises in many different kinds of modern scientific applications. We derive a primal-dual interior-point algorithm for the problem, by applying Newton's method directly to a system of nonlinear equations characterizing primal and dual feasibility and a perturbed complementarity condition. The main work at each step consists of solving a system of linear equations (the Schur complement equations). This Schur complement matrix is not symmetric, unlike in linear programming. We incorporate a Mehrotra-type predictor-corrector scheme and present some experimental results comparing several variations of the algorithm, including, as one option, explicit symmetrization of the Schur complement with a skew corrector term. We also present results obtained from a code implemented to solve large sparse problems, using a symmetrized Schur complement. This has been applied to problems arising in plastic collapse analysis, with hundreds of thousands of variables and millions of nonzeros in the constraint matrix. The algorithm typically finds accurate solutions in less than 50 iterations and determines physically meaningful solutions previously unobtainable.
Title: Just-in-Time Transparent Resource Management
Author(s): Baratloo, A.
Abstract:
This paper presents the design and the implementation of a resource management system for monitoring computing resources on a network and for dynamically allocating them to concurrently executing jobs. In particular, it is designed to support adaptive parallel computations---computations that benefit from addition of new machines, and can tolerate removal of machines while executing. The challenge for such a resource manager is to communicate the availability of resources to running programs even when the programs were not developed to work with external resource managers. Our main contribution is a novel mechanism addressing this issue, built on low-level features common to popular parallel programming systems.
Existing resource management systems for adaptive computations either require tight integration with the operating system (DRMS), or require an integration with a programming system that is aware of external resource managers (e.g. Condor/CARMI, MPVM, Piranha). Thus in each case, their support is limited to a single type of programming system. In contrast, our resource management system is unique in supporting several unmodified parallel programming systems. Furthermore, the system runs with user-level privilege, and thus can not compromise the security of the network.
The underlying mechanism and the overall system have been validated on a dynamically changing mix of jobs, some sequential, some PVM, some MPI, and some Calypso computations. We demonstrate the feasibility and the usefulness of our approach, thus showing how to construct a middleware resource management system to enhance the utilizations of distributed systems.
Title: Exploiting Application Tunability for Efficient, Predictable, Parallel Resource Management
Author(s): Chang, F.; Karamcheti, V.; Kedem, Z.
Abstract:
Parallel computing is becoming increasing central and mainstream, driven both by the widespread availability of commodity SMP and high-performance cluster platforms, as well as the growing use of parallelism in general-purpose applications such as image recognition, virtual reality, and media processing. In addition to performance requirements, the latter computations impose soft real-time constraints, necessitating em efficient, predictable parallel resource management. Unfortunately, traditional resource management approaches in both parallel and real-time systems are inadequate for meeting this objective; the parallel approaches focus primarily on improving application performance and/or system utilization at the cost of arbitrarily delaying a given application, while the real-time approaches are overly conservative sacrificing system utilization in order to meet application deadlines. In this paper, we propose a novel approach for increasing parallel system utilization while meeting application soft real-time deadlines. Our approach exploits the application tunability found in several general-purpose computations. Tunability refers to an application's ability to trade off resource requirements over time, while maintaining a desired level of output quality. In other words, a large allocation of resources in one stage of the computation's lifetime may compensate, in a parameterizable manner, for a smaller allocation in another stage. We first describe language extensions to support tunability in the Calypso programming system, a component of the MILAN metacomputing project, and evaluate their expressiveness using an image processing application. We then characterize the performance benefits of tunability, using a synthetic task system to systematically identify its benefits and shortcomings. Our results are very encouraging: application tunability is convenient to express, and can significantly improve parallel system utilization for computations with predictability requirements.
Title: A New Solution to the Hidden Copy Problem
Author(s): Goyal, D.; Paige, R.
Abstract:
We consider the well-known problem of avoiding unnecessary costly copying that arises in languages with copy/value semantics and large aggregate structures such as arrays, sets, or files. The origins of many recent studies focusing on avoiding copies of flat arrays in functional languages may be traced back to SETL copy optimization [Schwartz 75]. The problem is hard, and progress is slow, but a successful solution is crucial to achieving a pointer-free style of programming envisioned by [Hoare 75].
We give a new solution to copy optimization that uses dynamic reference counts and lazy copying to implement updates efficiently in an imperative language with arbitrarily nested finite sets and maps (which can easily model arrays, records and other aggregate datatypes). Big step operational semantics and abstract interpretations are used to prove the soundness of the analysis and the correctness of the transformation. An efficient algorithm to implement the analysis is presented. The approach is supported by realistic empirical evidence.
Our solution anticipates the introduction of arbitrarily nested polymorphic sets and maps into JAVA. It may also provide a new efficient strategy for implementing object cloning in Java and object assigment in C++. We illustrate how our methods might improve the recent approach of [Wand and Clinger 98] to avoid copies of flat arrays in a language of first-order recursion equations.
Title: Competitive Equilibrium
Author(s): Greenwald, A.
Abstract:
This report includes a modern account of welfare economics and competitive equilibrium theory. In particular, competitive, or Walrasian, equilibrium is defined. Moreover, existence, optimality, and uniqueness are demonstrated. However, no reliable mechanism for computing equilibrium prices is suggested. At this stage, the problem shifts from the realm of economics to an algorithmic problem in computer science.
Title: Learning to Play Network Games
Author(s): Greenwald, A.
Abstract:
The idea of learning to play equilibrium strategies in repeated games is an active area of research in the game-theoretic community. Game theorists are primarily concerned with the equilibrium outcomes of learning algorithms in the limit: i.e., over an infinite amount of time. One of the goals of this research is to apply computer science ideology to learning theory. In particular, this thesis will consider imposing restrictions on traditional game-theoretic learning algorithms such that players learn to play approximations to equilibrium strategies in bounded amounts of time. The idea of such bounded learning algorithms is to quickly learn to exploit the obvious, while ignoring any subtleties.
The idea of bounded learning is applicable to network games, in which players learn to utilize networks during times of minimal congestion. These games are atypical as compared with traditional games described in the game-theoretic literature, since their underlying structure is not commonly understood by the players, and moreover, common knowledge of rationality is not a valid assumption. As such, this class of repeated games does not naturally lend itself to belief-based learning algorithms. Rather, this thesis will investigate learning algorithms for network games that are analyzed on the basis of performance, without requiring that players maintain prior beliefs about expected network congestion. In sum, the initial focus of this thesis is to explore an application of computer science ideology to learning algorithms in game theory; secondly, bounded game-theoretic learning will be applied to routing and congestion problems in network environments.
Title: Steering Clear of Triples: Deriving the Control Flow Graph Directly from the Abstract Syntax Tree in C Programs
Author(s): Schwartz, N.
Abstract:
This article explores the extension of Morgenthaler's Virtual Control Flow technique, which derives control flow semantics directly from the Abstract Syntax Tree, from the relatively coarse granularity of syntactic C expressions to the finer granularity of basic block expressions, that is, expressions without embedded control flow. We explain why this is a better level of abstraction for program analysis, and discuss the elements of an efficient and elegant solution, motivating the presentation by appealing to a more explicit intermediate form. We present our algorithm, and conclude with remarks about the suitability of Morgenthaler's version of Virtual Control Flow for customary exhaustive data-flow analysis.
Title: Poincare and Friedrichs Inequalities For Mortar Finite Element Methods
Author(s): Stefanica, D.
Abstract:
Mortar finite elements are nonconforming finite elements that allow for a geometrically nonconforming decomposition of the computational domain and, at the same time, for the optimal coupling of different variational approximations in different subregions. Poincare and Friedrichs inequalities for mortar finite elements are derived. Using these inequalities, it is shown that the condition number for self-adjoint elliptic problems discretized using mortars is comparable to that of the conforming finite element case. Geometrically non-conforming mortars of the second generation are considered, i.e. no continuity conditions are imposed at the vertices of the subregions.
Title: On the L(2) Stability of the 1-D Mortar Projection
Author(s): Stefanica, D.
Abstract:
It is previously known that the one dimensional mortar finite element projection is stable in the $L^2$ norm, provided that the ratio of any two neighboring mesh intervals is uniformly bounded, but with the constant in the bound depending on the maximum value of that ratio. In this paper, we show that this projection is stable in the $L^2$ norm, independently of the properties of the nonmortar mesh. The 1D trace of the mortar space considered here is a piecewise polynomial space of arbitrary degree; therefore, our result can be used for both the $h$ and the $hp$ version of the mortar finite element.
Title: A Numerical Study of a Class of FETI Preconditioners for Mortar Finite Elements in Two Dimensions
Author(s): Stefanica, D.; Klawonn, A.
Abstract:
The FETI method is an iterative substructuring method using Lagrange multipliers. It is actively used in industrial--size parallel codes for solving difficult computational mechanics problems, for example the system ANSYS. Mortar finite elements are nonconforming finite elements that also allow for a geometrically nonconforming decomposition of the computational domain and for the optimal coupling of different variational approximations in different subdomains. We present a numerical study of three different FETI preconditioners for two dimensional, self-adjoint, elliptic equations discretized by mortar finite elements.
Title: Some Results on Overlapping Schwarz Methods for the Helmholtz Equation Employing Perfectly Matched Layers
Author(s): Toselli, A.
Abstract:
In this paper, we build a class of overlapping Schwarz preconditioners for a finite element approximation of the Helmholtz equation in two dimensions. Perfectly Matched Layers are employed to build the local problems and two kinds of boundary conditions are employed to match the local solutions. Numerical results are presented to compare the different preconditioners.
Title: An Iterative Substructuring Method for Maxwell's Equations in Two Dimensions
Author(s): Toselli, A.; Widlund, O. B.; Wohlmuth, B. I.
Abstract:
Iterative substructuring methods, also known as Schur complement methods, form an important family of domain decomposition algorithms. They are preconditioned conjugate gradient methods where solvers on local subregions and a solver on a coarse mesh are used to construct the preconditioner. For conforming finite element approximations of $H^1$, it is known that the number of conjugate gradient steps required to reduce the residual norm by a fixed factor is independent of the number of substructures and that it grows only as the logarithm of the dimension of the local problem associated with an individual substructure. In this paper, the same result is established for similar iterative methods for low--order N{\'e}d{\'e}lec finite elements, which approximate $\Hcurl$ in two dimensions. Results of numerical experiments are also provided.
Title: An Iterative Substructuring Method for Raviart-Thomas Vector Fields in Three Dimensions
Author(s): Wohlmuth, B. I.; Toselli, A.; Widlund, O. B.
Abstract:
The iterative substructuring methods, also known as Schur complement methods, form one of two important families of domain decomposition algorithms. They are based on a partitioning of a given region, on which the partial differential equation is defined, into non-overlapping substructures. The preconditioners of these conjugate gradient methods are then defined in terms of local problems defined on individual substructures and pairs of substructures, and, in addition, a global problem of low dimension. An iterative method of this kind is introduced for the lowest order Raviart-Thomas finite elements in three dimensions and it is shown that the condition number of the relevant operator is independent of the number of substructures and grows only as the square of the logarithm of the number of unknowns associated with an individual substructure. The theoretical bounds are confirmed by a series of numerical experiments.
Title: Finding Idle Work Periods on Networks of Workstations
Author(s): Wyckoff, P.; Jeong, K.; Johnson, T.
Abstract:
We present a simple technique for predicting the probability that an idle workstation will continue to be idle for $i$ minutes, given that it has been idle for $x$ minutes (i.e., find the {\em remaining idle period probability} $P(i;x)$). By idle we mean that the workstation owner is not interactively using the workstation or executing other tasks on it. The results are particularly applicable to the scheduling of tasks in systems that harvest cycles from idle-only workstations. Our Remaining Idle Period Probability Predictor (RIPPP) uses the distribution of the lengths of idle periods on the managed workstations. Collecting, storing, and processing these distributions (in the form of histograms) is a small overhead on modern workstations (a few kilobytes of storage per workstation).
We investigated the behavior of our RIPPP with usage traces of 31 workstations collected over a five month period, and discovered the following six results. (1) The distribution of one month of idle periods predicts the remaining idle period probability in the next month for most workstations. (2) Different workstations tend to have significantly different idle period length distributions. (3) The average length of an idle period does not necessarily correlate well with the probability of being able to find long idle periods, contrary to intuition and previous scheduling heuristics. (4) A workstation that has been idle a long time does not necessarily have a high probability of remaining idle for a long time. (5) Using the time of day can improve predictions. (6) The length of the previous and the current idle periods are positively correlated, but the length of the previous idle period is not strongly correlated with finding long remaining idle periods.
Based on these studies, we conclude that an effective way to find idle workstations is to collect their idle period length distribution and use it to compute $P(i;x)$. We believe our analysis will be applicable to predicting the length of busy periods, which is useful for deciding whether to migrate or suspend tasks when a workstation becomes busy (the owner reclaims it).
From our results, we have developed a remaining idle period probability toolkit which includes a statistics collector and a prediction library in C. This will be available from our project homepage.
Title: Iterative Substructuring Preconditioners for Mortar Element Methods in Two Dimensions
Author(s): Achdou, Y.; Maday, Y.; Widlund, O. B.
Abstract:
The mortar methods are based on domain decomposition and they allow for the coupling of different variational approximations in different subdomains. The resulting methods are nonconforming but still yield optimal approximations. In this paper, we will discuss iterative substructuring algorithms for the algebraic systems arising from the discretization of symmetric, second order, elliptic equations in two dimensions. Both spectral and finite element methods, for geometrically conforming as well as nonconforming domain decompositions, are studied. In each case, we obtain a polylogarithmic bound on the condition number of the preconditioned matrix.
Title: SDPPACK User's Guide -- Version 0.9 Beta for Matlab 5.0
Author(s): Alizadeh, F.; Haeberly, J. A.; Nayakkankuppa, M. V.; Overton, M.L.; Schmieta, S.
Abstract:
This report describes SDPpack Version 0.9 Beta for Matlab 5.0. This version extends the previous release for semidefinite programming (SDP) to mixed semidefinite--quadratic--linear programs (SQLP), i.e.\ linear optimization problems over a product of semidefinite cones, quadratic cones and the nonnegative orthant. Together, these cones make up all possible homogeneous self-dual cones over the reals.
The main routine implements a primal--dual Mehrotra predictor--corrector scheme based on the XZ+ZX search direction for SDP. More specialized routines are also available, one to solve SDP's with diagonal constraints only, and one to compute the Lov\'asz $\theta$ function of a graph, both using the XZ search direction. Routines are also provided to determine whether an SQLP is primal or dual degenerate at its solution and whether strict complementarity holds there. Primal nondegeneracy is associated with dual uniqueness and dual nondegeneracy with primal uniqueness, though these conditions are not equivalent if strict complementarity fails to hold.
A routine is also provided to compute the condition number of an SQLP. The Matlab code calls mex files for improved performance; binaries are available for several platforms. Benchmarks show that the codes provide highly accurate solutions to a wide variety of problems.
Title: SDPPACK User's Guide -- Version 0.8 Beta
Author(s): Alizadeh, F.; Haeberly, J.; Nayakkankuppam, M.V.; Overton, M.L.
Abstract:
Abstract: This report describes SDPpack, a package of Matlab files designed to solve semidefinite programs (SDP). SDP is a generalization of linear programming to the space of block diagonal, symmetric, positive semidefinite matrices. The main routine implements a primal--dual Mehrotra predictor--corrector scheme based on the XZ+ZX search direction. We also provide certain specialized routines, one to solve SDP's with only diagonal constraints, and one to compute the Lov\'asz $\theta$ function of a graph, using the XZ search direction. Routines are also provided to determine whether an SDP is primal or dual degenerate, and to compute the condition number of an SDP. The code optionally uses MEX files for improved performance; binaries are available for several platforms. Benchmarks show that the codes provide highly accurate solutions to a wide variety of problems.
Title: The coupling of mixed and conforming finite element discretizations
Author(s): Baratloo, A.; Karaul, M.; Karl, H.; Kedem, Z. M.
Abstract:
While Java and applets have created a new perspective for Web applications, some problems are still unsolved. Among these are the question of how Java applets can find other members of the collaboration session, how to deal with the restrictions imposed by the Java security model, and how to overcome the inability of applets to communicate directly, even if they belong to the same distributed application. KnittingFactory addresses the problem of finding other members of a collaboration session by providing a distributed registry system where the search is performed within a Web browser without violating its security model; the problem of arbitrary placement of applications by providing the core functionality for downloading applets from an arbitrary node; and finally the problem of direct applet-applet communication by using the Java Remote Method Invocation mechanisms to give applets information on how their fellow applets can be reached. Two example applications validate this concept and demonstrate the ease of use of KnittingFactory.
Title: Iterative Substructuring Algorithms for the P-Version Finite Element Method for Elliptic Problems
Author(s): Bica, I.
Abstract:
In this thesis, we study iterative substructuring methods for linear elliptic problems approximated by the $p$-version finite element method. They form a class of nonoverlapping domain decomposition methods, for which the information exchange between neighboring subdomains is limited to the variables directly associated with the interface, i.e. those common to more than one subregion. Our objective is to design algorithms in $3D$ for which we can find an upper bound for the {\it condition number} $\kappa$ of the preconditioned linear system, which is independent of the number of subdomains and grows slowly with $p$.
Iterative substructuring methods for the $h-$version finite element, and spectral elements have previously been developed and analysed by several authors. However, some very real difficulties remained when the extension of these methods and their analysis to the $p-$version finite element method were attempted, such as a lack extension theorems for polynomials. The corresponding results are well known for Sobolev spaces, but their extension to finite element spaces is quite intricate. In our technical work, we use and further develop extension theorems for polynomials in order to prove bounds on the condition numbers of several algorithms.
We have also made many numerical tests. We can use our programs for several purposes. Not only can we compute the condition numbers and study the rate of convergence for a variety of the algorithms that we have developed, but we can also compute the bounds on these condition numbers, as given by the theory. This is useful because the theory predicts the order of magnitude actual condition numbers.
Title: On the Singular Limit of the Quantum-Classical Molecular Dynamics Model
Author(s): Bornemann, F. A.; Schuette, C.
Abstract:
In molecular dynamics applications there is a growing interest in so-called mixed quantum-classical models. These models describe most atoms of the molecular system by the means of classical mechanics but an important, small portion of the system by the means of quantum mechanics. A particularly extensively used model, the QCMD model, consists of a singularly perturbed Schrodinger equation nonlinearly coupled to a classical Newtonian equation of motion.
This paper studies the singular limit of the QCMD model for finite dimensional Hilbert spaces. The main result states that this limit is given by the time-dependent Born-Oppenheimer model of quantum theory ---provided the Hamiltonian under consideration has a smooth spectral decomposition. This result is strongly related to the quantum adiabatic theorem. The proof uses the method of weak convergence by directly discussing the density matrix instead of the wave functions. This technique avoids the discussion of highly oscillatory phases.
On the other hand, the limit of the QCMD model is of a different nature if the spectral decomposition of the Hamiltonian happens not to be smooth. We will present a generic example for which the limit set is not a unique trajectory of a limit dynamical system but rather a funnel consisting of infinitely many trajectories.
Title: Overlapping Schwarz Algorithms for Solving Helmholtz's Equation
Author(s): Cai, X.; Casarin, M. A., Jr.; Elliot, F. W., Jr.; Widlund, O. B.
Abstract:
In this paper, prepared for the proceedings of the international conference on domain decomposition held in Boulder, CO in August 1997, we give a progress report on the development of a new family of domain decomposition methods for the solution of Helmholtz's equation.
We present three algorithms based on overlapping Schwarz methods; in our favorite method we proceed to the continuous finite element approximation of the Helmholtz's equation through a sequence of discontinuous iterates. While this is, quite possibly, a new type of overlapping Schwarz methods, we have been inspired to develop this idea by the thesis of Bruno Despr\'{e}s.
Title: Smile consistency - A Memory Consistency Model with User Definable High Level Synchronization Primitives
Author(s): Chu, C.; Piatko, P.
Abstract:
We propose a new natural memory consistency model, Smile consistency. Not only does Smile provide an intuitive memory consistency model but also a paradigm in which users can define their own synchronization primitives, called synchronization classes. Programmers can use the synchronization class to ease the programming work related to basic synchronization operations. Therefore, in addition to shared memory, threads can also communicate with each other via synchronization objects, instances of synchronization classes. Programs with high-level synchronization objects may also outperform those with only basic synchronization primitives.
Title: Order of Magnitude Comparisons of Distance
Author(s): Davis, E.
Abstract:
Order of magnitude reasoning --- reasoning by rough comparisons of the sizes of quantities --- is often called "back of the envelope calculation", with the implication that the calculations are quick though approximate. This paper exhibits an interesting class of constraint sets in which order of magnitude reasoning is demonstrably much faster than ordinary quantitative reasoning. Specifically, we present a polynomial-time algorithm that can solve a set of constraints of the form ``Points a and b are much closer together than points c and d.'' We prove that this algorithm can be applied if ``much closer together'' is interpreted either as referring to an infinite difference in scale or as referring to a finite difference in scale, as long as the difference in scale is greater than the number of variables in the constraint set. We also prove the first-order theory over such constraints is decidable.
Title: The Naive Physics Perplex
Author(s): Davis, E.
Abstract:
The ``Naive Physics Manifesto'' of Pat Hayes [1978] proposes a large-scale project of developing a formal theory encompassing the entire knowledge of physics of naive reasoners, expressed in a declarative symbolic form. The theory is organized in clusters of closely interconnected concepts and axioms. More recent work in the representation of commonsense physical knowledge has followed a somewhat different methodology. The goal has been to develop a competence theory powerful enough to justify commonsense physical inferences, and the research is organized in microworlds, each microworld covering a small range of physical phenomena. In this paper we compare the advantages and disadvantages of the two approaches. We also discuss some difficult key issues in automating commonsense physical reasoning.
Title: The On-Line K-Server Problem
Author(s): Floratos, A.
Abstract:
We survey the research performed during the last few years on the on-line $k$-server problem over metric spaces. A variety of algorithms are presented \mbox{--- both} deterministic and \mbox{randomized ---} and their performance is studied in the framework of competitive analysis. Restrictions of the problem to special cases of metric spaces are also considered.
Title: Overlapping Schwarz Methods for Vector Valued Elliptic Problems in Three Dimensions
Author(s): Hiptmair, R.; Toselli, A.
Abstract:
This paper is intended as a survey of current results on algorithmic and theoretical aspects of overlapping Schwarz methods for discrete $\Hcurl$ and $\Hdiv$--elliptic problems set in suitable finite element spaces. The emphasis is on a unified framework for the motivation and theoretical study of the various approaches developed in recent years.
Generalized Helmholtz decompositions -- orthogonal decompositions into the null space of the relevant differential operator and its complement -- are crucial in our considerations. It turns out that the decompositions the Schwarz methods are based upon have to be designed separately for both components. In the case of the null space, the construction has to rely on liftings into spaces of discrete potentials.
Taking the cue from well-known Schwarz schemes for second order elliptic problems, we devise uniformly stable splittings of both parts of the Helmholtz decomposition. They immediately give rise to powerful preconditioners and iterative solvers.
Title: Adaptive Mixed Hybrid and Macro-Hybrid Finite Element Methods
Author(s): Hoppe, R. H. W.; Wohlmuth, B.
Abstract:
In this paper, we consider efficient multilevel based iterative solvers and efficient and reliable a posteriori error estimators for mixed hybrid and macro-hybrid finite element discretizations of elliptic boundary value problems. We give an overview concerning the state-of-the-art techniques for these nonconforming approaches and illustrate the performance of the adaptivity concepts realized by some selected numerical examples.
Title: WebSeal: Web Server Allocation
Author(s): Karaul, M. H.; Korilis, Y. A.; Orda, A.
Abstract:
With the rapid growth of the World Wide Web, clients attempting to access some popular web sites are experiencing slow response times due to server load and network congestion. Replacing the single server machine with a set of replicated servers is a cost-effective solution to partition server load which also allows incremental scalability and fault transparency. Distributing these replicated servers geographically can reduce network congestion and increase availability. However, distributed web sites are faced with the issue of allocating servers: how do clients find out about the replicas and how do they decide which one to contact? Popular web sites have well publicized server names and require a transparent mapping of the public server name to replicated servers.
Unlike most traditional approaches, we propose a technique which pushes the server allocation functionality onto the client. We argue that this approach scales well and results in increased performance in many cases. Building on theoretical work based on game theory, we show that the usage of individual replicas can be effectively controlled with cost functions even when the clients are noncooperative. We present the design and implementation of WebSeAl, our prototype system realizing these techniques. WebSeAl does not require any changes to existing client and server code, conforms to all standards, and does not generate any control messages. Preliminary experiments utilizing servers on six continents and in controlled settings indicate that WebSeal improves performance significantly while imposing little overhead.
Title: Pincer-Search: A New Algorithm for Discovering the Maximum Frequent Set
Author(s): Lin, D-I.; Kedem, Z.
Abstract:
Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up breadth-first search direction. The computation starts from frequent 1-itemsets (minimal length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform reasonably well when all maximal frequent itemsets are short. However, performance drastically decreases when some of the maximal frequent itemsets are relatively long. We present a new algorithm which combines both the bottom-up and top-down directions. The main search direction is still bottom-up but a restricted search is conducted in the top-down direction. This search is used only for maintaining and updating a new data structure we designed, the maximum frequent candidate set. It is used to prune candidates in the bottom-up search. As a very important characteristic of the algorithm, it is not necessary to explicitly examine every frequent itemset. Therefore it performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, which therefore specifies immediately all frequent itemsets. We evaluate the performance of the algorithm using a well-known benchmark database. The improvements can be up to several orders of magnitude, compared to the best current algorithms.
Title: Inapproximability of Flip-Cut, Shift-Cut and Other problems from Optical Mapping
Author(s): Parida, L.
Abstract:
Optical Mapping is an emerging technology for constructing ordered restriction maps of DNA molecules. The study of the complexity of the problems arising in Optical Mapping has generated considerable interest amongst computer science researchers. In this paper we examine the complexity of these problems.
Optical Mapping leads to various computational problems such as the Binary Flip Cut (BFC) problem, the Weighted Flip Cut (WFC) problem the Exclusive Binary Flip Cut (EBFC) problem \cite{parida1, parida2}, the Binary Shift Cut (BSC) problem, the Binary Partition Cut (BPC) problem and others. The complexity and the hardness of the BFC problem, the WFC problem were not known. Using the technique of {\em gap-preserving} reduction of the max-cut problem, we show that BFC and WFC problems are MAX SNP-hard and achieving an approximation ratio $1-\Upsilon/7$ for these problems is NP-hard, where $\Upsilon$ denotes the upper bound on the polynomial time approximation factor of the well-known max cut problem. A slight variation of BFC, BFC$_{\max K}$, had been shown to be NP-hard; we improve the result to show that BFC$_{\max K}$ is MAX SNP-hard and achieving an approximation ratio $(1-\Upsilon/7)\frac{p_{max}}{p_{min}}$ for BFC$_{\max K}$ is NP-hard, where $p_{\min}$ and $p_{\max}$ are the minimum and maximum of the digestion rates in the given problem. The EBFC problem was shown to be NP-Complete; improve this result to show that EBFC is MAX SNP-hard and achieving an approximation ratio $1-\Upsilon/7$ for EBFC is NP-hard. However, a dense instance of the EBFC problem does have a PTAS.
The Binary Partition Cut (modeling spurious molecules) problem has been shown to be NP-Complete: we show, in this paper, that a (reasonable) unrestrained version of it has an efficient polynomial time algorithm. A variation of the Binary Shift Cut (modeling missing fragments) BSC$_{\max K}$, had been shown to be NP-hard \cite{Tom}; we show both the versions of this problem to be MAX SNP-hard and achieving an approximation ratio $1-\Upsilon/6$ for BSC and a ratio $(1-\Upsilon/6)\frac{p_{max}}{p_{min}}$ for BSC$_{\max K}$ is NP-hard. In addition, we show that $d$-wise Match ($d$M) problem is MAX SNP-hard and achieving an approximation ratio $1-\Upsilon$ is NP-hard.
Title: Junctions: Detection, Classification and Reconstruction
Author(s): Parida, L.; Geiger, D.; Hummel, R.
Abstract:
Junctions are important features for image analysis and form a critical aspect of image understanding tasks such as object recognition. We present a unified approach to detecting (location of the center of the junction), classifying (by the number of wedges -- lines, corners, $3$-junctions such as $T$ or $Y$ junctions, or $4$-junctions such as $X$-junctions) and reconstructing junctions (in terms of radius size, the angles of each wedge and the intensity in each of the wedges) in images. Our main contribution is a modeling of the junction which is complex enough to handle all these issues and yet simple enough to admit an effective dynamic programming solution. Broadly, we use a template deformation framework along with a gradient criterium to detect radial partitions of the template. We use the Minimum Description Length (MDL) principle to obtain the optimal number of partitions that best describes the junction.
Kona is an implementation of this model. We (quantitatively) demonstrate the stability and robustness of the detector by analyzing its behavior in the presence of noise, using synthetic/controlled apparatus. We also present a qualitative study of its behavior on real images.
Title: A Uniform Framework for Ordered Restriction Map Problems
Author(s): Parida, L.
Abstract:
Optical Mapping is an emerging technology for constructing ordered restriction maps of DNA molecules. The underlying computational problems for this technology have been studied and several cost functions have been proposed in recent literature. Most of these propose combinatorial models; one of them also presents a probabilistic approach. However, it is not {\em a priori} clear as to how these cost functions relate to one another and to the underlying problem. We present a uniform framework for the restriction map problems where each of these various models is a specific instance of the basic framework. We achieve this by identifying the following approaches to the ordered restriction map problem: (1) using data consensus or agreement, and, (2) optimizing a characteristic function of the data. Our framework also opens up the possibility of exploring other cost functions. An additional feature is that we not only integrate the combinatorial models but also analyze the probabilistic model within the same framework. %Finally, for completeness, we include i brief survey of %the best known complexity results for these problems. Finally, we indicate the open problems by including a survey of the best known complexity results for these problems.
Title: Iterative Substructuring Methods for Spectral Element Discretizations of Elliptic Systems in Three Dimensions
Author(s): Pavarino, L. F.; Widlund, O. B.
Abstract:
Spectral element methods are considered for symmetric elliptic systems of second-order partial differential equations, such as the linear elasticity and the Stokes systems in three dimensions. The resulting discrete problems can be positive definite, as in the case of compressible elasticity in pure displacement form, or saddle point problems, as in the case of almost incompressible elasticity in mixed form and Stokes equations. Iterative substructuring algorithms are developed for both cases. They are domain decomposition preconditioners constructed from local solvers for the interior of each element and for each face of the elements and a coarse, global solver related to the wire basket of the elements. In the positive definite case, the condition number of the resulting preconditioned operator is independent of the number of spectral elements and grows at most in proportion to the square of the logarithm of the spectral degree. For saddle point problems, there is an additional factor in the estimate of the condition number, namely, the inverse of the discrete inf-sup constant of the problem.
Title: Iterative Substructuring Methods for Spectral Element Discretizations of Elliptic Systems. II: Mixed Methods for Linear Elasticity and Stokes Flow
Author(s): Pavarino, L. F.; Widlund, O. B.
Abstract:
Iterative substructuring methods are introduced and analyzed for saddle point problems with a penalty term. Two examples of saddle point problems are considered: the mixed formulation of the linear elasticity system and the generalized Stokes system in three dimensions. These problems are discretized with mixed spectral element methods. The resulting stiffness matrices are symmetric and indefinite. The unknowns interior to each element are first implicitly eliminated by using exact local solvers. The resulting saddle point Schur complement is solved with a Krylov space method with block preconditioners. The velocity block can be approximated by a domain decomposition method, e.g., of wire basket type, which is constructed from local solvers for each face of the elements, and a coarse solver related to the wire basket of the elements. The condition number of the preconditioned operator is independent of the number of spectral elements and is bounded from above by the product of the square of the logarithm of the spectral degree and the inverse of the discrete inf-sup constant of the problem.
Title: Iterative Substructuring Methods for Spectral Element Discretizations of Elliptic Systems. I: Compressible Linear Elasticity
Author(s): Pavarino, L. F.; Widlund, O. B.
Abstract:
An iterative substructuring method for the system of linear elasticity in three dimensions is introduced and analyzed. The pure displacement formulation for compressible materials is discretized with the spectral element method. The resulting stiffness matrix is symmetric and positive definite.
The method proposed provides a domain decomposition preconditioner constructed from local solvers for the interior of each element, and for each face of the elements and a coarse, global solver related to the wire basket of the elements. As in the scalar case, the condition number of the preconditioned operator is independent of the number of spectral elements and grows as the square of the logarithm of the spectral degree.
Title: Overlapping Schwarz Methods for Maxwell's Equations in Three Dimensions
Author(s): Toselli, A.
Abstract:
Two-level overlapping Schwarz methods are considered for finite element problems of 3D Maxwell's equations. Nedelec elements built on tetrahedra and hexahedra are considered. Once the relative overlap is fixed, the condition number of the additive Schwarz method is bounded, independently of the diameter of the triangulation and the number of subregions. A similar result is obtained for a multiplicative method. These bounds are obtained for quasi-uniform triangulations. In addition, for the Dirichlet problem, the convexity of the domain has to be assumed. Our work generalizes well-known results for conforming finite elements for second order elliptic scalar equations.
Title: The Coupling of Mixed and Conforming Finite Element Discretizations
Author(s): Wieners, C.; Wohlmuth, B.
Abstract:
In this paper, we introduce and analyze a special mortar finite element method. We restrict ourselves to the case of two disjoint subdomains, and use Raviart-Thomas finite elements in one subdomain and conforming finite elements in the other. In particular, this might be interesting for the coupling of different models and materials. Because of the different role of Dirichlet and Neumann boundary conditions a variational formulation without a Lagrange multiplier can be presented. It can be shown that no matching conditions for the discrete finite element spaces are necessary at the interface. Using static condensation, a coupling of conforming finite elements and enriched nonconforming Crouzeix-Raviart elements satisfying Dirichlet boundary conditions at the interface is obtained. The Dirichlet problem is then extended to a variational problem on the whole nonconforming ansatz space. It can be shown that this is equivalent to a standard mortar coupling between conforming and Crouzeix-Raviart finite elements where the Lagrange multiplier lives on the side of the Crouzeix-Raviart elements. We note that the Lagrange multiplier represents an approximation of the Neumann boundary condition at the interface. Finally, we present some numerical results and sketch the ideas of the algorithm. The arising saddle point problems is be solved by multigrid techniques with transforming smoothers.
Title: Hierarchical A Posteriori Error Estimators for Mortar Finite Element Methods with Lagrange Multipliers
Author(s): Wohlmuth, B.
Abstract:
Hierarchical a posteriori error estimators are introduced and analyzed for mortar finite element methods. A weak continuity condition at the interfaces is enforced by means of Lagrange multipliers. The two proposed error estimators are based on a defect correction in higher order finite element spaces and an adequate hierarchical two-level splitting. The first provides upper and lower bounds for the discrete energy norm of the mortar finite element solution whereas the second also estimates the error for the Lagrange multiplier. It is shown that an appropriate measure for the nonconformity of the mortar finite element solution is the weighted $L^2$-norm of the jumps across the interfaces.
Title: Primal-Dual Interior-Point Methods for Semidefinite Programming: Convergence Rates, Stability and Numerical Results
Author(s): Alizadeh, F.; Haeberly, J.A.; Overton, M.L.
Abstract:
Primal-dual interior-point path-following methods for semidefinite programming (SDP) are considered. Several variants are discussed, based on Newton's method applied to three equations: primal feasibility, dual feasibility, and some form of centering condition.
The focus is on three such algorithms, called respectively the XZ, XZ+ZX and Q methods. For the XZ+ZX and Q algorithms, the Newton system is well-defined and its Jabobian is nonsingular at the solution, under nondegeneracy assumptions. The associated Schur complement matrix has an unbounded condition number on the central path, under the nondegeneracy assumptions and an additional rank assumption.
Practical aspects are discussed, including Mehrotra predictor-corrector variants and issues of numerical stability. Compared to the other methods considered, the XZ+ZX method is more robust with respect to its ability to step close to the boundary, converges more rapidly, and achieves higher accuracy.
Title: PLinda User Manual
Author(s): Brown, T.; Jeong, K.; Li, B.; Talla, S.; Wyckoff, P.; Shasha, D.
Abstract:
Persistent Linda (PLinda) is a programming environment for writing fault-tolerant distributed/parallel programs that may be run on networks of workstations. PLinda is a set of extensions to the Linda parallel programming model and PLinda/C++ (and Fortran77 respectively) is an implementation combined with the sequential language C++.
The PLinda User Manual introduces the PLinda model, mechanics of the PLinda operations, and programming in PLinda/C++ and PLinda/Fortran77.
Title: Schwarz Preconditioners for Spectral and Mortar Finite Element Methods with Applications to Incompressible Fluids
Author(s): Casarin, M.A., Jr.
Abstract:
The spectral element method has been used extensively for the simulation of fluid flows. The resulting linear systems are often not amenable to direct methods of solution, and are especially ill-conditioned. Domain decomposition preconditioners, well adapted to the solution on parallel computers, are proposed and analyzed; both two and three space dimensions are considered.
Second-order elliptic equations are considered first, and the now well-developed theory of domain decomposition methods for finite elements is fully extended to spectral elements. This includes an analysis of exotic coarse spaces, which have proven necessary for the efficient solution of elliptic problems with large discontinuities in the coefficients, as well as a study of overlapping methods. Estimates of the condition numbers of the Schur complement restricted to an edge (in two dimensions) or to a face (in three dimensions) are also given; in particular, a fast method is designed and studied in full detail for problems with many subregions.
The Stokes problem, when restricted to the space of discrete divergence free velocities, is symmetric positive definite. A number of preconditioners are proposed, which are based on previous results for the scalar elliptic case, and new global models. The construction of a basis for the constrained velocity space is not required,and the resulting condition numbers grow only weakly with the degree $N$ and are independent of the number of subdomains.
We also consider the stationary Navier-Stokes equations, solved with Newton's method. In each iteration, a non-symmetric indefinite problem is solved using a Schwarz preconditioner. A new coarse space is proposed which satisfies the usual properties required by the elliptic theory, and also a specific $H^1$-approximation property. The rate of convergence of the algorithm grows only weakly with $N$, and does not depend on the number of subdomains, or the Newton step.
Finally, a hierarchical basis preconditioner for the mortar finite element method in two dimensions is proposed and analyzed. It is also further shown that the analysis of the symmetric positive definite preconditioner can also be applied to construct preconditioners for symmetric indefinite problems arising from second-order elliptic equations. Numerical results are presented for the Helmholtz equation.
Title: Building a Fast Double-Dummy Bridge Solver
Author(s): Chang, M-S.
Abstract:
Compared to other games, particularly chess, the research in computer bridge is immature, and the best bridge-playing programs are mediocre. In this paper we address the problem of designing a fast double-dummy bridge game (i.e., a simplified bridge game with perfect information) solver. Although th size of the game tree we generated for searching the best line of play is huge (about on the order of $13! \cdot 2^{39} \approx 10^{21}$, even if we assume the average branching factor for players to follow suit is just 2), we show that, through varieties of searching techniques and some efficient moves ordering and pruning heuristics, most double-dummy bridge hands can be solved within a reasonable amount of time. In this paper we first give a brief introduction to computer bridge and previous work on the card-playing phase of bridge. Next, we describe the top-level architecture of our double-dummy solver (dds), followed by a number of implementation techniques we employed in our dds. Finally we present experimental results, draw our conclusion and describe some future work toward automating card-playing in real bridge.
Title: An O(n log n) Algorithm for the Maximum Agreement Subtree Problem for Binary Trees
Author(s): Cole, R.; Farach, M.; Hariharan, R.; Przytycka, T.; Thorup, M.
Abstract:
The Maximum Agreement Subtree problem is the following:
Given two trees whose leaves are drawn from the same set of items (e.g., species), find the largest subset of these items so that the portions of the two trees restricted to these items are isomorphic. We consider the case which occurs frequently in practice, i.e., the case when the trees are binary, and give an O(n log n) time algorithm for this problem.
Title: Tree Pattern Matching and Subset Matching in Randomized O(n log^3 m) Time
Author(s): Cole, R.; Hariharan, R.
Abstract:
The main goal of this paper is to give an efficient algorithm for the Tree Pattern Matching problem. We also introduce and give an efficient algorithm for the Subset Matching problem.
The Subset Matching problem is to find all occurrences of a pattern string p of length m in a text string t of length n, where each pattern and text location is a set of characters drawn from some alphabet. The pattern is said to occur at text position i if the set p[j] is a subset of the set t[i+j-1], for all j, 1 <= j <= m. We give an O((s+n)\log^3 m) randomized algorithm for this problem, where s denotes the sum of the sizes of all the sets.
Then we reduce the Tree Pattern Matching problem to a number of instances of the Subset Matching problem. This reduction takes linear time and the sum of the sizes of the Subset Matching problems obtained is also linear. Coupled with our first result, this implies an O(nlog^3 m) time randomized of metric spaces are also considered.
Title: Two Heuristics for the Steiner Tree Problem
Author(s): Dreyer, D. R.; Overton, M.L.
Abstract:
The Steiner tree problem is to find the tree with minimal Euclidean length spanning a set of fixed points in the plane, given the ability to add points (Steiner points). The problem is NP-hard, so polynomial-time heuristics are desired. We present two such heuristics, both of which utilize an efficient method for optimizing a tree with a given topology. The first systematically inserts Steiner points between edges of the minimal spanning tree meeting at angles less than 120 degrees, performing a local optimization at the end. The second begins by finding the Steiner tree for three of the fixed points. Then, at each iteration, it introduces a new fixed point to the tree, connecting it to each possible edge by inserting a Steiner point, and minimizes over all connections, performing a local optimizations for each. We present a variety of test cases that demonstrate the strengths and weaknesses of both algorithms.
Title: A Model and Solution to the DNA Flipping String Problem
Author(s): Geiger, D.; Parida, L.
Abstract:
We consider the case where a pool of DNA molecules clones both, flipped and not-flipped, have been cut by restriction enzymes. Ideally, each clone is cut in the same positions, although in practice due to errors, this does not always happen. The computational problem is to determine where the cuts have occurred.
This is a key problem in determining the structure of the original DNA molecule.
A single molecule is represented by a string of 1's and 0's, with cuts represented by $1's$. A set of molecules clones (with errors) is observed, but the orientation/parity of each molecule is unknown. Clear is that the location of the observed cuts of one molecule are dependent on the parity: flipping the molecule would result in the cuts location, as observed, being ``flipped'' .
We propose a Bayesian approach to generate a posterior distribution on the cuts and parity, given the data. We first present an approximate algorithm where we attempt to divide the problem into subproblems, but it is not guaranteed to solve the problem. Then, we propose another approximate method based on a statistical framework and a mean field annealing algorithm. It computes the maximum posterior marginal (MPM estimator) and maximum aposteriori estimate (MAP estimator).
We also provide evidence that the exact solution of the problem is intractable.
Title: Hierarchically Split Cube Forests for Decision Support: description and tuned design
Author(s): Johnson, T.; Shasha, D.
Abstract:
The paradigmatic view of data in decision support consists of a set of dimensions (e.g., location, product, time period, ...), each encoding a hierarchy (e.g., location has hemisphere, country, state/province, ..., block). Typical queries consist of aggregates over a quantifiable attribute (e.g., sales) as a function of at most one attribute in each dimension of this ``data cube.'' For example, find the sum of all sales of blue polo shirts in Palm Beach during the last quarter. In this paper, we introduce an index structure for storing and indexing aggregates, called ``cube forests,'' to support such cube queries efficiently --- one index search is usually enough.
In their most general form, cube forests require a lot of space. So, we present an optimized structure, called ``hierarchically split cube forests'' that exploit the hierarchical nature of the data to save space. We then present a model and algorithms to arrive at designs that further reduce update time, but suffer an increase in query time. Our experiments bear out the model and show that the structure has promise for decision support applications in read-intensive environments.
Title: Preconditioners for Indefinite Problems
Author(s): Klawonn, A.
Abstract:
Two different preconditioners for symmetric saddle point problems with a penalty term are analyzed. The saddle point problems are discretized by mixed finite elements. The preconditioners are applied in combination with Krylov space methods. It is shown that both methods yield convergence rates that are independent from both, the discretization and the penalty parameters. The first method is based on a symmetric positive definite block-diagonal preconditioner and the second one uses a non-symmetric and indefinite block-triangular preconditioner. Numerical results are presented for a problem of linear elasticity. The preconditioners in our experiments are based on domain decomposition and multilevel techniques. It is further shown that the analysis of the symmetric positive definite preconditioner can also be applied to construct preconditioners for symmetric indefinite problems arising from second-order elliptic equations. Numerical results are presented for the Helmholtz equation.
Title: Skip-Over: Algorithms and Complexity for Overloaded Systems that Allow Skips
Author(s): Koren, G.; Shasha, D.
Abstract:
In applications ranging from video reception to telecommunications and packet communication to aircraft control, tasks enter periodically and have fixed response time constraints, but missing a deadline is acceptable, provided most deadlines are met. We call such tasks ``occasionally skippable''. We look at the problem of uniprocessor scheduling of occasionally skippable periodic tasks in an environment having periodic tasks. We show that making optimal use of skips is NP-hard. We then look at two algorithms called Skip-Over Algorithms (one a variant of earliest deadline first and one of rate monotonic scheduling) that exploit skips. We give schedulability bounds for both.
Title: Highly Efficient Instruction Scheduling of Realtime Programs on RISC Processors
Author(s): Leung, A.; Palem, K.V.; Pnueli, A.
Abstract:
Enabled by RISC technologies, low-cost commodity microprocessors are performing at ever increasing levels, significantly via instruction level parallelism (ILP). This in turn increases the opportunities for their use in a variety of day-to-day applications ranging from the simple control of appliances such as microwave ovens, to sophisticated systems for cabin control in modern aircraft. Indeed, ``embedded'' applications such as these represent segments in the computer industry with great potential for growth. However, this growth is currently impeded by the lack of robust optimizing compiler technologies that support the assured, rapid and inexpensive prototyping of real-time software in the context of microprocessors with ILP. In this paper, we will present fast (polynomial-time) algorithms for compile-time instruction scheduling, of programs instrumented with timing-constraints, on processors with ILP. Our algorithms can be distinguished from earlier work in that they are guaranteed to find feasible schedules --- those satisfying the timing-constraints --- whenever such schedules exist, in cases of practical interest. Consequently, they can serve as powerful engines and can simultaneously support the ``analysis'' of the program prior to compilation, as well as during compilation once a feasible schedule is identified via analysis. We will also describe a novel notation, Time_tract, for specifying timing-constraints in programs, independent of the base language being used to develop the embedded application; Time_tract specifications are language independent and can be instrumented into imperative and object-oriented languages non-intrusively. As we will show, the instruction scheduling questions that arise out of Time_tract specifications are always ``tractable''. In contrast, a range of specification mechanisms proposed earlier yield substantially intractable instruction scheduling questions, thereby limiting their potential utility. We will sketch a formal and precise comparison of the tractability and related expressive power issues between Time_tract and some of the extant mechanisms for specifying properties of timed programs; this will be done using the canonical framework of timed-automata.
Title: CoRRet: A CONSTRAINT Based Environment for Rapid Prototyping Real Time Programs
Author(s): Palem, K.
Abstract:
The information revolution that we are in the midst of has led to the use of computers controlling applications ranging from automobiles and games, to video-pumps in the information highway. These applications are distinguished by the fact that they use programs with special timing relationships between their constituent elements. For example, a program running in the microprocessor controlling an ABS system in a modern automobile must sense and react to the friction coefficient between the brake pads and the wheel at well-defined intervals of time; failure to do so will result in a systemic failure of the brakes. Referred to typically as embedded systems, these applications constitute a significant portion of the potential growth in the computer industry. However, this growth opportunity is being hampered by a lack of adequate support via software development tools, to aid the easy, rapid and correct prototyping of embedded applications.
In this report, we outline CoRReT, a COnstraint based environment for the Rapid prototyping of REal Time programs. The report outlines the overall system architecture as well as the key modules in this environment that are being currently developed. CoRReT is a scheduling centric system in that a suite of algorithms for instruction scheduling programs instrumented with real-time constraints, are at its core. These algorithms are an integral part of an (optimizing) compiler which will compile these programs automaticallywhile attempting to ensure that the timing constraints are met; when the constraints are met, the resulting schedule for the instructions is referred to be feasible. If a feasible schedule is found, it will be fed automatically into a code-generator in the back-end of the compiler. Our envisioned scheduler can --- in addition to traditional control- and data-dependence constraints in the source program --- also cope with a variety of timing constraints specified by the programmer.
Our focus is on computational platforms that embody parallelism at two levels of granularity. At the highest level, we envision a tightly-coupled parallel machine offering large-scale parallelism. In this setting, a single embedded application can be distributed across the individual processors of the cluster. Furthermore, each processor in this parallel machine can embody Instruction Level Parallelism (ILP) at a fine-grained level.
Unfortunately, due to a lack of automatic tools and technology that can provide compilation support for real-time constraints ubiquitous to embedded applications, parallel computing platforms have not proliferated in this setting. Considering the fine-grained case first, RISC processors with ILP have not yet found a niche in this domain; currently, developers of embedded systems are reluctant to embrace ILP technologies due to the onerous task of ensuring timing relationships in the program by hand --- a difficulty compounded by parallelism (at a fine-grained level) in the processor. Clearly, providing support through automation that frees the programmer of these difficulties, is a means of overcoming this challenge.
Our response to this challenge via CoRReT is to develop scheduling methodologies and tools for automatically harnessing very high performance from these platforms, in the context of embedded systems. In the absence of time-constraints, major progress has been achieved in this direction at the coarse-grained level. The situation is even better at the fine-grained level where scheduling technology is being used routinely in product-quality compilers for RISC processors.
The methodology on which CoRReT is based is independent of any particular target processor, and is applicable to third and fourth generation languages. Furthermore, we propose to use the same scheduling engines during the static analysis of the program as well as during compilation. We anticipate this ``confluence'' in the scheduling algorithms to aid in shorter prototyping cycles, since identical schedules will be used by the analysis tools and the back-end of the compiler to generate code. We envision that the algorithms and tools that go into CoRReT will naturally form an integral part of a full-fledged programming environment for prototyping real-time programs on parallel platforms.
Title: NYU Reactive Gripper: An Implementation
Author(s): Teichmann, M.; Mishra, B.
Abstract:
We consider the problem of grasping an unknown polygonal flat object using a parallel jaw gripper. Our design equips a standard gripper with several light-beam sensors (close to each jaw) and employs a control scheme based on a reactive grasping algorithm. This is done by probing the object to locate a good grasp position, and then grasping, without moving the object significantly. The goal is to do as little motion as possible to find a grasp. In this paper, we discuss an implementation of this device using NYU's MOSAIC robot, following a quick overview of the underlying reactivity principle.
Title: Some Numerical Results Using An Additive Schwarz Method For Maxwell's Equations
Author(s): Toselli, A.
Abstract:
We present some numerical results for a two-level additive overlapping Schwarz method applied to the 2-D Maxwell's equations. Nedelec finite elements defined on rectangles are employed. Numerical examples show the convergence properties of the method, when varying the mesh size of the coarse and fine problems, the overlap and the time step of the implicit finite difference scheme employed.
Title: Formal Models of Distributed Memory Management
Author(s): Ungureanu, C.; Goldberg, B.
Abstract:
We develop an abstract model of memory management in distributed systems. The model is low-level enough so we can express communication, allocation and garbage collection, but otherwise hide many of the lower-level details of an actual implementation.
Recently, such formal models have been developed for memory management in a functional, sequential setting by Morrisett, Felleisen, and Harper. The models are rewriting systems whose terms are programs. Programs have both the "code" (control string) and the "store" syntactically apparent. Evaluation is expressed as conditional rewriting and includes store operations. Garbage collection becomes a rewriting relation that removes part of the store without affecting the behavior of the program.
By using techniques developed for communicating and concurrent systems such as Milner's CCS, we extend the model for a distributed environment. Sending and receiving messages is also made apparent at the syntactic level. A very general garbage collection rule based on reachability is introduced and proved correct. Now, proving correct a specific collection strategy is reduced to showing that the relation between programs defined by the strategy is a subrelation of the general relation. Any actual implementation which is capable of providing the transitions (including their atomicity constraints) specified by the strategy is therefore correct.
Title: A Note on Scheduling Algorithms for Processors with Lookahead
Author(s): Ungureanu, C.
Abstract:
Many superscalar processors designed today are able to dynamically schedule instructions. Dynamic scheduling means that a processor is able to analyze a portion of the instruction stream ``on the fly'', and has the capability of issuing an instruction other than the next one available in the input, in order to avoid stalling. Such an instruction is said to be executed out of order.
Scheduling algorithms for machines with in-order execution are used in most compilers today. However, schedules which are optimal for machines with in-order execution may be sub-optimal for a machine with out-of-order execution. Here, we are presenting an algorithm which produces a local schedule for a trace of basic blocks, such that the completion time is minimized for a processor with a depth of the pipeline k=2 and dynamic scheduling ability with scope size s=2. The algorithm runs in polynomial time. A generalization of the algorithm to work for machines with larger scopes is straightforward.
Title: Complementarity and Nondegeneracy in Semidefinite Programming
Author(s): Alizadeh, F.; Haeberly, J.; Overton, M.
Abstract:
Primal and dual nondegeneracy conditions are defined for semidefinite programming. Given the existence of primal and dual solutions, it is shown that primal nondegeneracy implies a unique dual solution and that dual nondegeneracy implies a unique primal solution. The converses hold if strict complementarity is assumed. Primal and dual nondegeneracy assumptions do not imply strict complementarity, as they do in LP. The primal and dual nondegeneracy assumptions imply a range of possible ranks for primal and dual solutions $X$ and $Z$. This is in contrast with LP where nondegeneracy assumptions exactly determine the number of variables which are zero. It is shown that primal and dual nondegeneracy and strict complementarity all hold generically. Numerical experiments suggest probability distributions for the ranks of $X$ and $Z$ which are consistent with the nondegeneracy conditions.
Title: Computing Limit Loads by Minimizing a Sum of Norms
Author(s): Andersen, K.; Christiansen, E.; Overton, M.
Abstract:
We consider the problem of computing the collapse state in limit analysis for a solid with a quadratic yield condition, such as, for example, the Mises condition. After discretization with the finite element method, using divergence-free elements for the plastic flow, the kinematic formulation turns into the problem of minimizing a sum of Euclidean vector norms, subject to a single linear constraint. This is a nonsmooth minimization problem, since many of the norms in the sum may vanish at the optimal point. However, efficient solution algorithms for this particular convex optimization problem have recently been developed.
The method is applied to test problems in limit analysis in two different plane models: plane strain and plates. In the first case more than 80 percent of the terms in the sum are zero in the optimal solution, causing severe ill-conditioning. In the last case all terms are nonzero. In both cases the algorithm works very well, and we solve problems which are larger by at least an order of magnitude than previously reported. The relative accuracy for the discrete problems, measured by duality gap and feasibility, is typically of the order 1.0E-8. The discretization error, due to the finite grid, depends on the nature of the solution. In the applications reported here it ranges from 1.0E-5 to 1.0E-2.
Keywords: Limit analysis, plasticity, finite element method, nonsmooth optimization.
Title: The Supervisor Synthesis Problem for Unrestricted CTL is NP-complete
Author(s): Antoniotti, M.; Mishra, B.
Abstract:
The problem of restricting a finite state model (a Kripke structure) in order to satisfy a set of unrestricted CTL formulae is named the ``Unrestricted CTL Supervisor Synthesis Problem''. The finite state model has the characteristics described in \cite{ramadge-wonham87}, that is, its transitions are partitioned between "controllable" and "uncontrollable" ones. The set of CTL formulae represents a specification of the "desired behavior" of the system, which may be achieved through a "control action". This note shows the problem to be NP-complete.
Title: A Hierarchical Preconditioner for the Mortar Finite Element Method
Author(s): Casarin, M. A.; Widlund, O. B.
Abstract:
Mortar elements form a family of nonconforming finite element methods that are more flexible than conforming finite elements and are known to be as accurate as their conforming counterparts. A fast iterative method is developed for linear, second order elliptic equations in the plane. Our algorithm is modeled on a hierarchical basis preconditioner previously analyzed and tested, for conforming case, by Barry Smith and the second author. A complete analysis and results of numerical experiments are given for lower order mortar elements and geometrically conforming decompositions of the region into subregions.
Title: Diagonal Edge Preconditioners in p-Version and Spectral Element Methods
Author(s): Casarin, M. A.
Abstract:
Abstract: Domain decomposition preconditioners for high-order Galerkin methods in two dimensions are often built from modules associated with the decomposition of the discrete space into subspaces of functions related to the interior of elements, individual edges, and vertices. The restriction of the original bilinear form to a particular subspace gives rise to a diagonal block of the preconditioner, and the action of its inverse on a vector has to be evaluated in each iteration. Each block can be replaced by a preconditioner in order to decrease the cost. Knowledge of the quality of this local preconditioner can be used directly in a study of the convergence rate of the overall iterative process.
The Schur complement of an edge with respect to the variables interior to two adjacent elements is considered. The assembly and factorization of this block matrix are potentially expensive, especially for polynomials of high degree. It is demonstrated that the diagonal preconditioner of one such block has a condition number that increases approximately linearly with the degree of the polynomials. Numerical results demonstrate that the actual condition numbers are relatively small.
Title: Quasi-Optimal Schwarz Methods for the Conforming Spectral Element Discretization
Author(s): Casarin, M. A.
Abstract:
The spectral element method is used to discretize self-adjoint elliptic equations in three dimensional domains. The domain is decomposed into hexahedral elements, and in each of the elements the discretization space is the set of polynomials of degree $N$ in each variable. A conforming Galerkin formulation is used, the corresponding integrals are computed approximately with Gauss-Lobatto-Legendre (GLL) quadrature rules of order $N$, and a Lagrange interpolation basis associated with the GLL nodes is used. Fast methods are developed for solving the resulting linear system by the preconditioned conjugate gradient method. The conforming {\it finite element} space on the GLL mesh, consisting of piecewise $Q_{1}$ or $P_1$ functions, produces a stiffness matrix $K_h$ that is known to be spectrally equivalent to the spectral element stiffness matrix $K_N$. $K_h$ is replaced by a preconditioner $\tilde{K}_h$ which is well adapted to parallel computer architectures. The preconditioned operator is then $\tilde{K}_h^{-1} K_N$.
Our techniques for non-regular meshes make it possible to estimate the condition number of $\tilde{K}_h^{-1} K_N$, where $\tilde{K}_h$ is a standard finite element preconditioner of $K_h$, based on the GLL mesh. The analysis of two finite element based preconditioners: the wirebasket method of Smith, and the overlapping Schwarz algorithm for the spectral element method, are given as examples of the use of these tools. Numerical experiments performed by Pahl are briefly discussed to illustrate the efficiency of these methods in two dimensions.
Title: On the Dynamic Finger Conjecture for Splay Trees Part I: Splay Sorting log n-Block Sequences
Author(s): Cole, R.; Mishra, B.; Schmidt, J.; Siegel, A.
Abstract:
A special case of the Dynamic Finger Conjecture is proved; this special case introduces a number of useful techniques.
Title: On the Dynamic Finger Conjecture for Splay Trees Part II: The Proof
Author(s): Cole, R.
Abstract:
The following result is shown: On an n-node splay tree, the amortized cost of an access at distance d from the preceding access is O(log (d+1)). In addition, there is an O(n) initialization cost. The accesses include searches, insertions and deletions.
Title: The Average Case Complexity of Multilevel Syllogistic
Author(s): Cox, J.; Ericson, L.; Mishra, B.
Abstract:
An approach to the problem of developing provably correct programs has been to enrich a theorem prover for Hoare logic with decision procedures for a number of decidable sublanguages of set theory (EMLS, MLS, and extensions) and arithmetic (FPILP) (See [Schwartz, 1977]). Citing results of Goldberg (refer to [Goldberg, 79]) on average case behavior of algorithms for SAT, it was hoped that these decision procedures would perform well on average.
So far, it has been fairly difficult to prove average case NP-hardness under the various definitions (see [Levin, 86], [Ben-David et al, 89], [Blass & Gurevich, 91], [Gurevich, 91], [Venkatesan & Rajagopalan, 92], [Schuler & Yamakami, 92] and [Reischuk & Schindelhauer, 93]). We should note that the definitions in the literature haven't yet been standardized. We survey some of the results of the average case analysis of NP-complete problems, and compare the results of Goldberg with more pessimistic results. We prove that FPILP, EMLS and related fragments of set theory are NP-average complete, and show that there are simple distributions that will frustrate any algorithm for these decision problems.
Title: Approximation and Abstraction in Solid Object Kinematics
Author(s): Davis, E.
Abstract:
Physical reasoning often involves approximating or abstracting the situation or the theory at hand. This paper studies the nature of approximation and abstraction as applied to the kinematic theory of rigid solid objects.
Five categories of approximation are considered: 1. Geometric approximation. 2. Abstraction of a complex kinematic structure by a simpler kinematic structure. For example, the abstraction of a collection of tightly-linked objects as a single object. 3. Abstraction of a kinematic structure by a simpler theory. For example, the abstraction by a connectivity graph in configuration space. 4. Approximation of a complex kinematic structure by a simpler structure in a more complex theory. For example, the approximation of a chain by a string. 5. Approximation of a more complex theory by a kinematic theory. For example, the approximation of solid object dynamics by kinematics.
We discuss how some of these types of approximation can be implemented and combined. We conclude that abstraction and approximation are open-ended styles of reasoning, rather than neatly categorizable meta-relationships.
Title: A Highly Expressive Language of Spatial Constraints
Author(s): Davis, E.
Abstract:
AI applications require the representation and manipulation of partial spatial knowledge of many different kinds. This paper argues that a representation rich in primitives but fairly restricted in logical form will suffice for many of these purposes. We present and discuss one such representation language. We demonstrate that the language is expressive enough to capture exactly or closely approximate many of the representations that have been used in the AI literature. It also contains some original constructs for dealing with collections of regions of unknown cardinality.
Title: Approximations of Shape and Configuration Space
Author(s): Davis, E.
Abstract:
We consider the issue of shape approximation in kinematic mechanical systems; that is, systems of rigid solid objects whose behavior can be characterized entirely in terms of the constraints that each object moves rigidly and that no two objects overlap, without considering masses or forces. The general question we address is the following: Suppose we have calculated the behavior of some kinematic system using ideal descriptions of the shapes of the objects involved. Does it then follow that a real mechanism, in which the shape of the objects approximates this ideal will have a similar behavior? In addressing this question, we present various possible definitions of what it means (a) for one shape to approximate another and (b) for the behavior of one mechanism to be similarto the behavior of another. We characterize the behavioral properties of a kinematic system in terms of its configuration space; that is, the set of physically feasible positions and orientations of the objects. We prove several existential theorems that guarantee that a sufficiently precise approximation of shape preserves significant properties of configuration space. In particular, we show that It is often possible to guarantee that the configuration space of system A is close to that of system B in terms of metric criteria by requiring that the shapes of A closely approximate those of B in terms of the dual-Hausdorff distance. It is often possible to guarantee further that the configuration space of A is topologically similar to that of B by requiring that the surface normals are close at corresponding boundary points of A and B.
Title: A Knowledge Representation Based on the Belnap's Four-Valued Logic
Author(s): Kaluzhny, Y.; Muravitsky, A.
Abstract:
We treat knowledge from a computer-oriented point of view, considering it as a kind of data type. An axiomatic approach to the notion of data type undertaken by Dana Scott in [D.S.Scott, Outline of a Mathematical Theory of Computation, in: Proceedings of Princeton Conference on Information Science, 1971, pp. 169--176], is explored to find entities suitable for representation techniques. At the same time, we stay in Belnap's paradigm of the toleration of inconsistency. We propose a representation of knowledge (possible with contradictions) in simple propositional language, and we show how such knowledge can be maintained and how it should be transformed on receipt of new information. In this transformation, the key role is played by Scott's continuity rather than consistency.
Title: Dirichlet Problem for the Schrodinger Operator in a Half-space with Boundary Data of Arbitrary Growth at Infinity
Author(s): Kheyfits, A.
Abstract:
We consider the Dirichlet problem for the Schrodinger operator in a half-space with boundary data having an arbitrary growth at infinity. A solution is constructed as the generalized Poisson integral. Uniqueness of the solution is investigated too.
Title: An Optimal Preconditioner for a Class of Saddle Point Problems with a Penalty Term, Part II: General Theory
Author(s): Klawonn, A.
Abstract:
Iterative methods are considered for saddle point problems with penalty term. A positive definite preconditioner is constructed and it is proved that the condition number of the preconditioned system can be made independent of the discretization and the penalty parameters. Examples include the pure displacement problem in linear elasticity, the Timoshenko beam, and the Mindlin-Reissner plate.
Key words: Saddle point problems, penalty term, nearly incompressible materials, Timoshenko, Mindlin-Reissner, preconditioned conjugate residual method, multilevel, domain decomposition.
Please note: This report is a revised version of tr676.
Title: Run-time versus Compile-time Instruction Scheduling in Superscalar (RISC) Processors: Performance and Tradeoffs
Author(s): Leung, A.; Palem, K.; Ungureanu, C.
Abstract:
The RISC revolution has spurred the development of processors with increasing levels of instruction level parallelism (ILP). In order to realize the full potential of these processors, multiple instructions must be issued and executed in a single cycle. Consequently, instruction scheduling plays a crucial role as an optimization in this context. While early attempts at instruction scheduling were limited to compile-time approaches, the recent trend is to provide dynamic support in hardware. In this paper, we present the results of a detailed comparative study of the performance advantages to be derived by the spectrum of instruction scheduling approaches: from limited basic-block schedulers in the compiler, to novel and aggressive run-time schedulers in hardware. A significant portion of our experimental study via simulations, is devoted to understanding the performance advantages of run-time scheduling. Our results indicate it to be effective in extracting the ILP inherent to the program trace being scheduled, over a wide range of machine and program parameters. Furthermore, we also show that this effectiveness can be further enhanced by a simple basic-block scheduler in the compiler, which optimizes for the presence of the run-time scheduler in the target; current basic-block schedulers are not designed to take advantage of this feature. We demonstrate this fact by presenting a novel enhanced basic-block scheduler in this paper. Finally, we outline a simple analytical characterization of the performance advantage, that run-time schedulers have to offer.
Key words: Compile-time Optimizations, Dynamic Schedulers, Instruction Scheduling, Program Traces, Scope, Superscalar Processors
Title: Computational Real Algebraic Geometry
Author(s): Mishra, B.
Abstract:
Computational real algebraic geometry studies various algorithmic questions dealing with the real solutions of a system of equalities, inequalities, and inequations of polynomials over the real numbers. This emerging field is largely motivated by the power and elegance with which it solves a broad and general class of problems arising in robotics, vision, computer aided design, geometric theorem proving, etc.
The following survey paper discusses the underlying concepts, algorithms and a series of representative applications. This paper will appear as a chapter in the "Handbook of Discrete and Computational Geometry" (Edited by J.E. Goodman and J. O'Rourke), CRC Series in Discrete and Combinatorial Mathematics.
Title: Grasp Metrics: Optimality and Complexity
Author(s): Mishra, B.
Abstract:
In this paper, we discuss and compare various metrics for goodness of a grasp. We study the relations and trade-offs among the goodness of a grasp, geometry of the grasped object, number of fingers and the computational complexity of the grasp-synthesis algorithms. The results here employ the techniques from convexity theory first introduced by the author and his colleagues.
Title: Three Finger Optimal Planar Grasp
Author(s): Mishra, B.; Teichman, M.
Abstract:
In this paper, we study various algorithmic questions regarding the computation of an optimal three finger planar grasp. We present a novel O(n^2 log n)-time algorithm to compute such an optimal grasp for an arbitrary simple n-gon. This algorithm can be used for finding ``good'' immobilizing sets. We also discuss several variations on the problem and many intriguing open questions in the area that remain unsolved.
Title: On the Lidskii-Vishik-Lyusternik Perturbation Theory for Eigenvalues of Matrices with Arbitrary Jordan Structure
Author(s): Moro, J.; Burke, J.V.; Overton, M.L.
Abstract:
Let $A$ be a complex matrix with arbitrary Jordan structure, and $\lambda$ an eigenvalue of $A$ whose largest Jordan block has size $n$. We review previous results due to Lidskii, showing that the splitting of $\lambda$ under a small perturbation of $A$ of order $\epsilon$, is, generically, of order $\epsilon^{1/n}$. Explicit formulas for the leading coefficients are obtained, involving the perturbation matrix and the eigenvectors of $A$. We also present an alternative proof of Lidskii's main theorem, based on the use of the Newton diagram. This approach clarifies certain difficulties which arise in the nongeneric case, and leads, in some situations, to the extension of Lidskii's results. These results suggest a new notion of Holder condition number for multiple eigenvalues, depending only on the conditioning of the associated eigenvectors, not the conditioning of the Jordan vectors.
Title: Combined Instruction Scheduling and Register Allocation
Author(s): Motwani, R.; Palem, K.; Sarkar, V.; Reyen, S.
Abstract:
In this paper, we describe a novel framework for expressing an optimization problem that simultaneously addresses instruction scheduling and register allocation, referred to as CRISP. By modeling spill-costs directly in the scheduling problem, CRISP permits the design of algorithms whose objective is to exploit the available instruction level parallelism --- the traditional goal of instruction scheduling --- while lowering the cost of register spilling at the same time. Currently, these optimizations are performed in separate phases and interact in ways that are not characterized very well, leading to phase-ordering problems. We also develop a fast heuristic in this paper for solving this combined optimization in the context of basic-blocks; our algorithm runs in time O ( E N) where the basic block of N instructions has E edges; this time includes all preprocessing costs. In comparison to conventional phase-ordered approaches, our combined heuristic performed well in experimental evaluations on basic-blocks with sizes in the range 5 to 100. We also present rigorous characterizations of the inherent difficulty of solving optimization problems in our CRISP framework, as well as in classical frameworks. A surprising outcome of this work is that problems expressed in CRISP are provably easier to solve, than say graph coloring --- graph coloring is the classical approach to expressing just one of the two phases of interest, namely register allocation. By eliminating the phase-ordering problem, CRISP lowers the overall complexity of the software engineering effort involved. This is because optimizers designed based on our unified approach will be relatively ``lightweight,'' when compared to those that have to cope with phase-ordering. This has positive implications both for the duration of the design cycles, as well as the concomitant costs of designing low-level optimizations in modern compilers.
Title: Knowledge Representation as Domains
Author(s): Muravitsky, A.
Abstract:
This is a continuing attempt in a series of papers [ knowledge.ps, inform.ps, frame.ps ] to show how computer-represented knowledge can be arranged as elements of an effectively represented semantic domain in the sense of [C.A.Gunter and D.S.Scott, Semantic Domains, in: J. van Leeuwen (ed.), Handbook of Theoretical Computer Science, Vol. B, pp. 635--674]. We present a direct deductive description of the domain, which was defined semantically in [ knowledge.ps ], via the Scott's notion of information system. Also, the internal structure of the continuous ampliative operations coordinated with the domain's effective basis is established. Though we always remain in the paradigm of the toleration of contradictory information described in [N.D.Belnap, A Useful Four-Valued Logic: How a Computer Should Think, in: A.R.Anderson, N.D.Belnap, and J.M.Dunn, Entailment: the Logic of Relevance and Necessity, Vol. 2, Princeton Univ. Press, 1992], the approach in question could be extended to include domains for consistency knowledge bases.
Title: A Framework for Knowledge-Based Systems
Author(s): Muravitsky, A.
Abstract:
The paper continues the theme of [ knowledge.ps ]. We differentiate between our approach to knowledge representation and that of others by expressing the following Working Hypothesis: Knowledge is a data type, and knowledge revision is accomplished by continuous operations on it, which are coordinated with its effective basis. Staying in the limits of Belnap's paradigm of the admittance of contradictory information into the computer's memory, our purpose in this paper is to reduce as much as possible all the computable processes needed for modifing the current state of the computer's knowledge and describe conditions for possible maneuvering. In particular, we solve some problems of decidability concerning operations on the minimal states, which are regarded as natural knowledge transformers. We show, also, how to express those operations in lattice theory terms that leads to the simplification of their computation on the lattice of minimal states. The problem of backtracking in the presented context is considered as well.
Title: Some Knowledge Transformers: Infons and Constraints
Author(s): Muravitsky, A.
Abstract:
The goal of this paper is twofold. First, it is to present a general scheme within which information is supposed to turn into the computer-represented knowledge and, second, to define two natural kinds of transfomers of this knowledge which this scheme thrusts us into considering.
Title: Logic of Information Knowledge
Author(s): Muravitsky, A.
Abstract:
We share with some philosophers the view that a state of knowledge, being a part of the real world, can bring contradiction into it. Such an ontological reading of knowledge is very important when one deals with information knowledge, which arises as the content of the computer's memory when the computer is placed into changeable information environment ("information flow"), and "must" be able to tolerate any (not excluding contradictions) from the computer's users. Continuing research begun in [KM 93], we consider in length one kind of Scott-continuous operations introduced there. Each such operation [A->B](x), where A and B are formulas in a propositional language, called a rule, moves the computer to a "minimal" state of knowledge, in which B is true, if in a current state A is true. Note that the notion of rule is used here in an information-transforming sense, rather than in the ordinary truth-sound sense. We distinguish between global and local rules and show that these notions are decidable. Also, we define a modal epistemic logic as a tool for the prediction of possible evolution of the system's knowledge and establish decidability of this logic.
Title: A Perspective of New Foundations for Knowledge Maintenance Systems: Research Program
Author(s): Muravitsky, A.
Abstract:
We propose to provide new mathematical foundations for the design of knowledge-based systems. The underlying idea is that the knowledge which the computer (``artificial agent'') operates with is considered as a kind of abstract data type. In this context, a relation of approximation arises in a natural way when one imagines the computer as operating in a changing information environment (``information flow''). This notion of pproximation can be studied using the techniques that have been developed for domain theory in the context of denotational semantics of programming languages.
Title: On the First Degree Entailment of Two 3-Valued Logics
Author(s): Muravitsky, A.
Abstract:
We note first that the first degree entailment of {\L}ukasiewicz's 3-valued logic and a 3-valued logic that is extracted of Belnap's 4-valued logic is the same. Then, we give an axiomatization of that entailment as the calculus E_{fde} + A &-A- > B\/-B, where E_{fde} is the first degree entailment of Anderson-Belnap's logic E of relevance and necessity.
Title: Some Knowledge Transformers: Infons and Constraints
Author(s): Muravitsky, Alexei Yu.
Abstract:
The goal of this paper is twofold. First, it is to present a general scheme within which information is supposed to turn into the computer-represented knowledge and, second, to define two natural kinds of transfomers of this knowledge which this scheme thrusts us into considering.
Title: New Mathematical Foundations for Knowledge Maintenance Systems: Research Program
Author(s): Muravitsky, Alexei Yu.
Abstract:
We propose to provide new mathematical foundations for the design of knowledge-based systems. The underlying idea is that the knowledge which the computer ("artificial agent") operates with is considered as a kind of abstract data type. In this context, a relation of approximation arises in a natural way when one imagines the computer as operating in a changing information environment ("information flow"). This notion of approximation can be studied using the techniques that have been developed for domain theory in the context of denotational semantics of programming languages.
Title: Double Hashing is Computable and Randomizable with Universal Hash Functions
Author(s): Schmidt, J.; Siegel, A.
Abstract:
Universal hash functions that exhibit (c log n)-wise independence are shown to give a performance in double hashing and virtually any reasonable generalization of double hashing that has an expected probe count of 1/(1-alpha) + epsilon for the insertion of the (alpha n)-th item into a table of size n, for any fixed alpha < 1 and epsilon > 0. This performance is within epsilon of optimal. These results are derived from a novel formulation that overestimates the expected probe count by underestimating the presence of partial items already inserted into the hash table, and from a sharp analysis of the underlying stochastic structures formed by colliding items.
Title: An API for Choreographing Data Accesses
Author(s): Shriver, E.A.M.; Wisniewski, L.F.
Abstract:
Current APIs for multiprocessor multi-disk file systems are not easy to use in developing out-of-core algorithms that choreograph parallel data accesses. Consequently, the efficiency of these algorithms is hard to achieve in practice. We address this deficiency by specifying an API that includes data-access primitives for data choreography. With our API, the programmer can easily access specific blocks from each disk in a single operation, thereby fully utilizing the parallelism of the underlying storage system.
Our API supports the development of libraries of commonly-used higher-level routines such as matrix-matrix addition, matrix-matrix multiplication, and BMMC (bit-matrix-multiply/complement) permutations. We illustrate our API in implementations of these three high-level routines to demonstrate how easy it is to use.
Title: Toward a Usable Theory of Chernoff Bounds for Heterogeneous and Partially Dependent Random Variables
Author(s): Siegel, A.
Abstract:
Let X be a sum of real valued random variables and have a bounded mean E[X]. The generic Chernoff-Hoeffding estimate for large deviations of X is: P{X-E[X]>=a}<=min_{y>=0}exp(-y(a+E[X]))E[exp(y X)], which applies with a>=0 to random variables with very small tails. At issue is how to use this method to attain sharp and useful estimates. We present a number of Chernoff-Hoeffding bounds for sums of random variables that may have a variety of dependent relationships and that may be heterogeneously distributed.
Title: Closed Hashing is Computable and Optimally Randomizable with Universal Hash Functions
Author(s): Siegel, A.; Schmidt, J.
Abstract:
Abstract: Universal hash functions that exhibit (c log n)-wise independence are shown to give a performance in double hashing, uniform hashing and virtually anyreasonable generalization of double hashing that has an expected probe count of 1/(1-alpha)+O(1/n) for the insertion of the (alpha n)-th item into a table of size n, for any fixed alpha < 1. This performance is optimal. These results are derived from a novel formulation that overestimates the expected probe count by underestimating the presence of local items already inserted into the hash table, and from a very sharp analysis of the underlying stochasticstructures formed by colliding items.
Analogous bounds are attained for the expected r-th moment of the probe count, or any fixed r, and linear probing is also shown to achieve a performance with universal hash functions that is equivalent to the fully random case.
Title: On Universal Classes of Extremely Random Constant Time Hash Functions and their Time-space Tradeoff
Author(s): Siegel, A.
Abstract:
A family of functions F that map [0,n]->[0,n], is said to be h-wise independent if any h points in [0,n] have an image, for randomly selected f in F, that is uniformly distributed. This paper gives both probabilistic and explicit randomized constructions of (n**epsilon)-wise independent functions, for epsilon < 1, that can be evaluated in constant time for the standard random access model of computation. Simple extensions give comparable behavior for larger domains. As a consequence, many probabilistic algorithms can for the first time be shown to achieve their expected asymptotic performance for a feasible model of computation.
This paper also establishes a tight tradeoff in the number of random seeds that must be precomputed for a random function that runs in time T and is h-wise independent.
Title: Report on NSF Workshop on Manufacturing and Computational Geometry
Author(s): Yap, C.
Abstract:
This is a summary of the NSF Workshop on Manufacturing and Computational Geometry, held at the Courant Institute of Mathematical Sciences, New York University, on April 1-2, 1994. The meeting brought together about 30 participants from both the manufacturing and the computational geometry communities for the purposes of discussing current trends in the two communities, identifying areas of mutual interest, and proposing future joint activities.
Title: A New Primal-Dual Interior-Point Method for Semidefinite Programming
Author(s): Alizadeh, F.; Haeberly, J. A.; Overton, M.
Abstract:
Semidefinite programming (SDP) is a convex optimization problem in the space of symmetric matrices. Primal-dual interiorpoint methods for SDP are discussed. These generate primal and dual matrices X and Z which commute only in the limit. A new method is proposed which iterates in the space of commuting matrices.
Title: Automatic Synthesis Algorithms for Supervisory Controllers (Preliminary Report)
Author(s): Antoniotti, M.; Mishra, B.
Abstract:
In this paper we describe our experience with a prototype system capable of synthesizing "Supervisor Controller Programs" based largely on the theory of discrete event systems (DES) first proposed by Ramadge and Wonham. We augment the theory by also allowing continuous time trajectories modeling transitions between events. We illustrate our approach by an example, - the discrete control of a walking machine - which poses some challenges on the applicability of the theory and finally, discuss some possible solutions.
Notes: Appeared in IEEE Proceedings of the Fourth International Conference on Computer Integrated Manufacturing and Automation Technology, Troy, NY, Oct. 1994
Title: Discrete Event Models + Temporal Logic = Supervisory Controller: Automatic Synthesis of Locomotion Controllers
Author(s): Antoniotti, M.; Mishra, B.
Abstract:
In this paper, we address the problem of the synthesis of controller programs for a variety of robotics and manufacturing tasks. The problem we choose for test and illustrative purposes is the standard ``Walking Machine Problem,'' a representative instance of a real "hybrid" problem with both logical/discrete and continuous properties and strong mutual influence without any reasonable separation. We aim to produce a ``compiler technology'' for this class of problems in a manner analogous to the development of the so-called ``Silicon Compilers'' for the VLSI technology. To cope with the difficulties inherent to the problem, we resort to a novel approach that combines many key ideas from a variety of disciplines: namely, ``Discrete Event Supervisory Systems'', Petri Nets approaches and ``Temporal Logic''.
Notes: Will appear in the 1995 IEEE International Conference on Robotics and Automation, Nagoya, Japan
Title: Multilevel Schwarz Methods with Partial Refinement
Author(s): Chen, H.
Abstract:
We consider multilevel additive Schwarz methods with partial refinement. These algorithms are generalizations of the multilevel additive Schwarz methods developed by Dryja and Widlund and many others. We will give two different proofs by using quasi-interpolants under two different assumptions on selected refinement subregions to show that this class of methods has an optimal condition number. The first proof is based purely on the localization property of quasi-interpolants. However, the second proof use some results on iterative refinement methods. As a by-product, the multiplicative versions which corresponds to the FAC algorithms with inexact solvers consisting of one Gauss-Seidel or damped Jacobi iteration have optimal rates of convergence. Finally, some numerical results are presented for these methods.
Title: Approximate Euclidean Shortest Path in 3-Space
Author(s): Choi, J.; Sellen, J.; Yap, C.K.
Abstract:
Papadimitriou's approximation approach to the Euclidean shortest path (ESP) in 3-space is revisited. As this problem is NPhard, his approach represents an important step towards practical algorithms. However, there are several gaps in the original description. Besides giving a complete treatment in the framework of bit complexity, we also improve on his subdivision method. Among the tools needed are root-separation bounds and non-trivial applications of Brent's complexity bounds on evaluation of elementary functions using floating point numbers.
Title: Branching Continuous Time and the Semantics of Continuous Action
Author(s): Davis, E.
Abstract:
It is often useful to model the behavior of an autonomous intelligent creature in terms of continuous control and choice. For example, a robot who moves through space can be idealized as able to execute any continuous motion, subject to constraints on velocity and acceleration; in such a model, the robot can "choose" at any instant to change his acceleration. We show how such models can be described using a continuous branching time structure. We discuss mathematical foundations of continuous branching structures, theories of continuous action in physical worlds, embedding of discrete theories of action in a continuous structure, and physical and epistemic feasibility of plans with continuous action.
Title: Adaptive Time-Frequency Approximations with Matching Pursuits
Author(s): Davis, G.; Mallat, S.; Zhang, Z.
Abstract:
Computing the optimal expansion of a signal in a redundant dictionary of waveforms is an NP-complete problem. We introduce a greedy algorithm called a matching pursuit which computes a sub-optimal expansion. The dictionary waveforms which best match a signal's structures are chosen iteratively. An orthogonalized version of the matching pursuit is also developed. Matching pursuits are general procedures for computing adaptive signal representations. With a dictionary of Gabor functions, a matching pursuit defines an adaptive time-frequency transform. We derive a signal energy distribution in the time-frequency plane which does not contain interference terms, unlike the Wigner and Cohen class distributions. A matching pursuit is a chaotic map whose asymptotic properties are studied. We describe an algorithm which isolates the coherent structures of a signal and show an application to pattern extraction from noisy signals.
Title: A Direct-Drive Hand: Design, Modeling and Control
Author(s): Ebner, M.; Wallace, R.
Abstract:
An artificial 15 degrees of mobility direct drive hand, slightly bigger than a human hand, is presented. The underlying technology are the miniature direct drive actuators recently developed. The motivation for our design and the construction plan for the hand is given. The dynamics of the hand are analyzed theoretically and a model for control of the hand is presented. Finally we describe our experiences made while experimenting with the hand. A direct drive hand graphics interface has been developed to simulate the dynamics of the hand and to test out control algorithms for the hand.
Title: Optimizing Eigenvalues of Symmetric Definite Pencils
Author(s): Haeberly, J. A.; Overton, M.
Abstract:
We consider the following quasiconvex optimization problem: minimize the largest eigenvalue of a symmetric definite matrix pencil depending on parameters. A new form of optimality conditions is given, emphasizing a complementarity condition on primal and dual matrices. Newton's method is then applied to these conditions to give a new quadratically convergent interior-point method which works well in practice. The algorithm is closely related to primal-dual interior-point methods for semidefinite programming.
Title: An Optimal Preconditioner for a Class of Saddle Point Problems with a Penalty Term
Author(s): Klawonn, A.
Abstract:
Iterative methods are considered for a class of saddle point problems with a penalty term arising from finite element discretizations of certain elliptic problems. An optimal preconditioner which is independent of the and the penalty parameter is constructed. This approach is then used to design an iterative method with a convergence rate independent of the Lam\'{e} parameters occuring in the equations of linear elasticity.
Please see revised version tr683.
Title: New Estimates for Ritz Vectors
Author(s): Knyazev, A.
Abstract:
The followiing estimate for the Rayleigh--Ritz method is proved: $$ | \tilde \lambda - \lambda | |( \tilde u , u )| \le { \| A \tilde u - \tilde \lambda \tilde u \| } \sin \angle \{ u ; \tilde U \}, \ \| u \| =1. $$ Here $A$ is a bounded self-adjoint operator in a real Hilbert/euclidian space, $\{ \lambda, u \}$ one of its eigenpairs, $\tilde U$ a trial subspace for the Rayleigh--Ritz method, and $\{ \tilde \lambda, \tilde u \}$ a Ritz pair. %$\| u \| = \| \tilde u \| = 1.$ This inequality makes it possible to analyze the fine structure of the error of the Rayleigh--Ritz method, in particular, it shows that $ |( \tilde u , u )| \le C \epsilon^2, $ if an eigenvector $u$ is close to the trial subspace with accuracy $\epsilon$ and a Ritz vector $\tilde u$ is an $\epsilon$ approximation to another eigenvector, with a different eigenvalue. Generalizations of the estimate to the cases of eigenspaces and invariant subspaces are suggested, and estimates of approximation of eigenspaces and invariant subspaces are proved.
Title: Planning Paths of Minimal Curvature
Author(s): Sellen, J.
Abstract:
We consider the problem of planning curvature constrained paths amidst polygonal obstacles, connecting given start and target configurations. Let the critical curvature Rc be the minimal curvature for which a constrained path exists. We describe an algorithm, which approximates the critical curvature and finds a corresponding path. Further, we give an efficient decision procedure to determine if there exists a path satisfying a given curvature constraint R, with running time polynomial in |R-Rc|/R.
Title: Simple Multi Function Vision System for 3D Data Acquisition
Author(s): Sokolov, S. M.; Max, D. P.; Wallace, R. S.
Abstract:
We have developed a simple multi function vision system for 3D data acquisition for a wide range of applications in robotics and automation. The system uses one CCD video camera and an active directed laser light source based on a direct drive spherical pointing motor (SPM). The anatomy of the system and algorithms used are described. System calibration methods and measurements of accuracy of the outputs are presented. A list of applications is shown.
Title: Scaling Direct Drive Robots
Author(s): Wallace, R.; Selig, J.
Abstract:
Recent experimental and analytical evidence indicates that direct drive robots become very practical and economical at miniature and microscopic scales, so it is interesting to understand quantitatively the properties of direct drive robots under scaling transformations. This leads to a study of how screws and their dual co-screws behave under the group of similarity transforms. This group is the group of isometries together with dilations. Several different representations are found on the space of screws and complementary representations are found on the dual space of co-screws. From the electromagnetic theory of the force and torque on a magnet in a magnetic field, we derive the scaling properties of the electromagnetic wrench. Hence, these results can be directly applied to the scaling of direct drive motors [1]. We conclude by proposing a scale-invariant measure for direct drive actuator performance.
Title: Pscheme: Extending Continuations to Express Control and Synchronization in a Parallel LISP
Author(s): Yao, C.; Goldberg, B.
Abstract:
In this paper, we describe Pscheme, a parallel dialect of Scheme. The primary construct for specifying parallelism, synchronization, and communication is a natural extension of first-class continuations which we call a port. We describe the behavior of ports, along with the other parallel constructs of Pscheme. Because the user has precise control over the parallel computation, the Pscheme constructs can be used to build higher-level parallel programming abstractions, such as futures, semaphores, and Ada-style rendezvous. We provide the Pscheme code for these abstractions and discuss the current implementation of Pscheme on a shared-memory multiprocessor.
Title: Representing Control in Parallel Applicative Programing
Author(s): Yao, C.
Abstract:
This research is an attempt to reason about the control of parallel computation in the world of applicative programming languages.
Applicative languages, in which computation is performed through function application and in which functions are treated as first-class objects, have the benefits of elegance, expressiveness and having clean semantics. Parallel computation and real-world concurrent activities are much harder to reason about than the sequential counterparts. Many parallel applicative languages have thus hidden most control details with their declarative programming styles, but they are not expressive enough to characterize many real world concurrent activities that can be easily explained with concepts such as message passing, pipelining and so on.
Ease of programming should not come at the expense of expressiveness. Therefore, we design a parallel applicative language Pscheme such that programmers can express explicitly the control of parallel computation while maintaining the clean semantics and the ease of programming of applicative languages. In Pscheme, we propose the concept of ports to model the general control in parallel computation. Through program examples, we show how Pscheme and ports support various parallel programming paradigms. We have also built libraries for higher level control facilities with ports so that programming in Pscheme becomes easier.
We provide an operational semantics for Pscheme, and develop a compiler and a run time system on NYU's Ultracomputer. Our experiments with parallel programs have shown satisfactory speedup. We claim that ports are the natural parallel extensions of continuations in sequential computation, and thus conclude that representing general control in parallel applicativeprogramming is feasible.
Title: The Cell Programming Language
Author(s): Agarwal, P.
Abstract:
We describe the Cell Programming Language (CPL), which we have designed to write programs that mimic the life of a biological cell. The aim is to study the complex interactions between cells which lead to diverse shapes of cellular aggregates.
Each cell is treated as a two-dimensional homogeneous polygon with a specific area. A cell goes through a series of states in its lifetime. In each state, one can specify the cell's growth rate; information about cell division and cell differentiation; the chemical constituents of the cell and their interactions; and cell motion. This behavior may be conditional based on the cell's own status (chemical concentrations and size) or on its neighborhood (the type of cells surrounding it, the contact lengths with each of them, their areas, the directions to these cells, and their chemical concentrations). The language is explored by modeling cellular sorting in vitro, and aggregation in the Dictyostelium discoidea (cellular slime mold).
Title: A Language for Semantic Analysis
Author(s): Cai, J.
Abstract:
Semantic analysis is important for compilers. In the APTS program transformation system, semantics is specified by rules in the language RSL. The semantic rules are interpreted by APTS to generate the semantic information of the program, which is then used by the rewriting engine for program translation. This approach has proved to be convenient and powerful in our construction of a SETL-to-C compiler. In this paper, we discuss the features, applications, implementation strategy, and performance of RSL.
Title: Knowledge Preconditions for Plans
Author(s): Davis, Ernest
Abstract:
For an agent to be able to rely on a plan, he must know both that he is physically capable of carrying out the physical actions involved, and that he knows enough to carry out the plan. In this talk, we advance and discuss new definitions of "knowing enough to carry out a plan", for the case of a single agent carrying out a sequence of primitive actions one at a time. We consider both determinate and indeterminate plans.
We show how these definition can be expressed in a formal logic, using a situation calculus model of time and a possible worlds model of knowledge. The definitions strictly subsume previous theories for the single-agent case without concurrent actions.
We illustrate the power of the definition by showing that it supports results of the following kinds:
Title: Schwarz Analysis of Iterative Substructuring Algorithms for Elliptic Problems in Three Dimensions
Author(s): Dryja, M.; Smith, B.; Widlund, O.
Abstract:
638 SCHWARZ ANALYSIS OF ITERATIVE SUBSTRUCTURING ALGORITHMS FOR ELLIPTIC PROBLEMS IN THREE DIMENSIONS M. Dryja, B. Smith, O. Widlund, May 1993 Domain decomposition methods provide powerful preconditioners for the iterative solution of the large systems of algebraic equations that arise in finite element or finite difference approximations of partial differential equations. The preconditioners are constructed from exact or approximate solvers for the same partial differential equation restricted to a set of subregions into which the given region has been divided. In addition, the preconditioner is often augmented by a coarse, second level approximation, that provides additional, global exchange of information, and which can enhance the rate of convergence considerably. The iterative substructuring methods, based on decompositions of the region into non-overlapping subregions, form one of the main families of such Many domain decomposition algorithms can conveniently be described and analyzed as Schwarz methods. These algorithms are fully defined in terms of a set of subspaces and auxiliary bilinear forms. A general theoretical framework has previously been developed and, in this paper, these techniques are used in an analysis of iterative substructuring methods for elliptic problems in three dimensions. A special emphasis is placed on the difficult problem of designing good coarse models and obtaining robust methods for which the rate of convergence is insensitive to large variations in the coefficients of the differential equation. Domain decomposition algorithms can conveniently be built from modules, which represent local and global components of the preconditioner. In this paper, a number of such possibilities is explored and it is demonstrated how a great variety of fast algorithms can be designed and analyzed.
Title: Schwarz Methods of Neumann-Neumann Type for Three-Dimensional Elliptic Finite Element Problems
Author(s): Dryja, M.; Widlund, O.
Abstract:
Several domain decomposition methods of Neumann-Neumann type are considered for solving the large linear systems of algebraic equations that arise from discretizations of elliptic problems by finite elements. We will only consider problems in three dimensions. Several new variants of the basic algorithm are introduced in a Schwarz method framework that provides tools which have already proven very useful in the design and analysis of other domain decomposition and multi-level methods.
The Neumann-Neumann algorithms have several advantages over other domain decomposition methods. The subregions, which define the subproblems, only share the boundary degrees of freedom with their neighbors. The subregions can also be of quite arbitrary shape and many of the major components of the preconditioner can be constructed from subprograms available in standard finite element program libraries. However, in its original form, the algorithm lacks a mechanism for global transportation of information and its performance therefore suffers when the number of subregions increases. In the new variants of the algorithms, considered in this paper, the preconditioners include global components, of low rank, to overcome this difficulty. Bounds are established for the condition number of the iteration operator, which are independent of the number of subregions, and depend only polylogarithmically on the number of degrees of freedom of individual local subproblems. Results are also given for problems with arbitrarily large jumps in the coefficients across the interfaces separating the subregions.
Title: The Complexity of Resolvent Resolved
Author(s): Gallo, G.; Mishra, B.
Abstract:
The concept of a resolvent of a prime ideal was originally introduced by J.F. Ritt along with the notion of a characteristic set. The motivation for studying resolvents comes from its connections with the birational isomorphisms that describe structures of irreducible algebraic varieties by means of an equivalent hypersurface and a one-to-one rational map. As a result, these ideas have a wide range of applications in such areas as solid modeling, computer aided design and manufacturing. An algorithm to compute the resolvent by means of characteristic sets was first proposed by Ritt. This and some related algorithms have resurfaced as interest in resolvent structures have grown, spurred by its applicability.
Unfortunately, the algebraic complexity of the resolvent and the computational complexity of the associated algorithms have never been satisfactorily explored. In this paper, we give single exponential upper and lower bounds for the degrees of the resolvent and its associated parametrizing polynomials. We also show that the resolvent can be computed deterministically in single exponential sequential and polynomial parallel time complexity. All previous algorithms for resolvent had relied on a random choice of certain extraneous parameters.
Title: A Hybrid Algorithm for Optimizing Eigenvalues of Symmetric Definite Pencils
Author(s): Haeberly, J.; Overton, M.
Abstract:
We present an algorithm for the optimization of the maximum eigenvalue of a symmetric definite pencil depending affinely on a vector of parameters. The algorithm uses a hybrid approach, combining a scheme based on the method of centers, developed by Boyd and El Ghaoui, with a new quadratically convergent local scheme. A convenient expression for the generalized gradient of the maximum eigenvalue of the pencil is also given, expressed in terms of a dual matrix. The algorithm computes the dual matrix which establishes the optimality of the computed solution.
Title: Chacterization of Self-Similar with Wavelet Maxima
Author(s): Hwang, W.; Mallat, S.
Abstract:
Self-similar multifractals have a wavelet transform whose maxima define self-similar curves in the scale-space plane. We introduce an algorithm that recovers the affine self-similarity parameters through a voting procedure in the corresponding parameter space. The voting approach is robust with respect to renormalization noises and can recover the value of parameters having random fluctuations. We describe numerical applications to Cantor measures, dyadique multifractals and to the study of Diffusion Limited Aggregates.
Title: Competitive Algorithms and Lower Bounds for On-Line Scheduling of Multiprocessor Real-Time Systems
Author(s): Koren, G.; Shasha, D.
Abstract:
We study competitive on-line scheduling in multi-processor real-time environments. In our model, every task has a deadline and a value that it obtains only if it completes by its deadline. A task can be assigned to any processor, all of which are equally powerful. The problem is to design an on-line scheduling algorithm (i.e., the scheduler has no knowledge of a task until it is released) with worst case guarantees as to the total value obtained by the system.
We study systems with two or more processors and with uniform or non-uniform value density. We present an inherent limit on the best competitive guarantee that any on-line parallel real-time scheduler can give. Then we present a competitive algorithm that achieves a worst case guarantee which is only within a factor of 2 from the best possible guarantee in many cases. These are the most general results yet known for parallel overloaded real-time scheduling.
Title: Matching Pursuits with Time-Frequency Dictionaries , Rev
Author(s): Mallat, S.; Zhang, Z.
Abstract:
We introduce an algorithm, called matching pursuit, that decomposes any signal into a linear expansion of waveforms that are selected from a redundant dictionary of functions. These waveforms are chosen in order to best match the signal structures. Matching pursuits are general procedures to compute adaptive signal representations. With a dictionary of Gabor functions, a matching pursuit defines an adaptive time-frequency transform. We derive a signal energy distribution in the time-frequency plane, which does not include interference terms, unlike Wigner and Cohen class distributions [?]. A matching pursuit isolates the signal structures that are coherent with respect to a given dictionary. An application to pattern extraction from noisy signals is described. We compare a matching pursuit decomposition with a signal expansion over an optimized wavepacket orthonormal basis, selected with the algorithm of Coifman and Wickerhauser.
Title: Feedback Control of Miniature Direct Drive Devices
Author(s): Max, D.; Wallace, R.
Abstract:
We discuss dynamics and control of miniature direct drive actuators, specifically for the two axis spherical pointing motor and for the one axis direct drive finger joint. These actuators can move both accurately and at high speed, however the capabilities of these devices cannot be fully exploited using open loop control techniques. We derive an ideal PID feedback control scheme. Our initial experiments indicate that PID feedback control for the SPM and finger joint is highly feasible and a significant improvement over open loop methods.
Title: NYU Educational Robotics Project: A Pedagogic Overview
Author(s): Mishra, B.; Antoniotti, M.; Hansen, F.; Wallace, R.
Abstract:
The primary goal of the NYU educational robotics project (NYU-ED) is to create a disseminable, multi-functional and inexpensive laboratory course sequence, aimed at improving the practical skills of undergraduate students specializing in robotics, vision, AI and manufacturing disciplines.
The earlier approaches to robotics laboratory education have been to use either industrial robot arms or commercially available low-power arms. In each case, there have been considerable problems in the lack of an ideal interface, in not providing enough flexibility to add other devices, or in the absence of adequate safety. Also, the underlying dynamical model is usually so complicated that performing any control experiment has been beyond the scope of all but advanced graduate students.
In this report, we describe our approach to deal with these problems in constructing a modern robotics laboratory. The main work-horse of the NYU educational project was chosen to be a multi-functional ED I robot system, consisting of a 4 DOF DD arm and several auxiliary devices. The system was designed to be simple, inexpensive, flexible and safe.
We also describe our experience with some advanced laboratory experiments using miniature direct drive robots that have proven to be ideal for teaching as they are reconfigurable, safe and easy to program.
Title: ED I: NYU Educational Robot Design and Evaluation
Author(s): Mishra, B.; Antoniotti, M.
Abstract:
The primary goal of the NYU educational robot project is to create a disseminable, multi-functional and inexpensive laboratory course sequence, aimed at improving the practical skills of undergraduate students specializing in robotics, vision, AI and manufacturing disciplines.
The main work-horse of the NYU educational project was chosen to be a multi-functional ED I robot system, consisting of a 4 DOF DD arm and several auxiliary devices. The system was designed to be simple, inexpensive, flexible and safe.
In this report, we describe the history, design, structure and evaluation of this robot system. We also describe several robotics and related course sequence that can use the ED I system effectively. We also provide some example experiments that have been run on ED I successfully.
This report has benefited from the labor, contribution, discussions, advice and criticisms of several people on the ED I project team and the credit for the final product goes to the entire team.
Title: A Survey of Computational Differential Algebra
Author(s): Mishra, B.
Abstract:
In this note, we explore the computational aspects of several problems in differential algebra with concrete applications in dynamics and motion-planning problems in robotics, automatic synthesis of control schemes for nonlinear systems and simulation of physical systems with fixed degrees of freedom.
Our primary aim is to study, compute and structurally describe the solution of a system of differential equations with coefficients in a field (say, the field of complex numbers, ). There seem to have been various approaches in this direction: e.g. ideal theoretic approach of Ritt, Galois theoretic approach of Kolchin and Singer and group theoretic technique of Lie. It is interesting to study their interrelationship and effectivity of various problems they suggest.
In general, these problems are known to be uncomputable; thus, we need to understand under what situations these problems become feasible. As related computer science questions, we also need to study the complexity of these problems, underlying data-structures, effects of the representation (e.g. sparsity).
Of related interest are some closely-related problems such as symbolic integration problem, solving difference equations, integro-differential equations and differential equations with algebraic constraints.
Title: Towards Second-Order Methods for Structured Nonsmooth Optimization
Author(s): Overton, M.; Ye, X.
Abstract:
Structured nonsmooth optimization objectives often arise in a composite form f = h ffi a, where h is convex (but not necessarily polyhedral) and a is smooth. We consider the case where the structure of the nonsmooth convex function h is known. Specifically, we assume that, for any given point in the domain of h, a parameterization of a manifold \Omega , on which h reduces locally to a smooth function, is given. We discuss two affine spaces: the tangent space to the manifold \Omega at a point, and the affine hull of the subdifferential of h at the same point, and explain that these are typically orthogonal complements. We indicate how the construction of locally second-order methods is possible, even when h is nonpolyhedral, provided the appropriate Lagrangian, modeling the structure, is used. We illustrate our ideas with two important convex functions: the ordinary max function, and the max eigenvalue function for symmetric matrices, and we solicit other interesting examples with genuinely different structure from the community.
Title: Two-Level Schwarz Methods for Nonconforming Finite Elements and Discontinuous Coefficients
Author(s): Sarkis, M.
Abstract:
A two-level domain decomposition methods are developed for a simple nonconforming approximation of second order elliptic problems. A bound is established for the condition number of these iterative methods, which grows only logarithmically with the number of degrees of freedom in each subregion. This bound holds for two and three dimensions and is independent of jumps in the value of the coefficients.
Title: Using Line Invariants for Object Recognition by Geometric Hashing
Author(s): Tsai, F.
Abstract:
Geometric Hashing is a model-based object recognition technique for detecting objects which can be partially overlapping or partly occluded. It precompiles, from local geometric features, redundant transformation-invariant information of the models in a hash table and uses the invariants computed from a scene for fast indexing into the hash table to hypothesize possible matches between object instances and object models during recognition.
In its simplest form, the geometric hashing method assumes relatively noise-free data and is applied to objects with points as local features. However, extracting of the locations of point features is inherently error-prone and the analysis of geometric hashing on point sets shows considerable noise sensitivity. Line features can generally be extracted with greater accuracy.
We investigate the use of line features for geometric hashing applied to 2-D (or flat 3-D) object recognition and derive, from a combination of line features, invariants for lines under various geometric transformations.
Title: A Probabilistic Approach to Geometric Hashing Using Line Features
Author(s): Tsai, F.
Abstract:
Most current object recognition algorithms assume reliable image segmentation, which in practice is often not available. We examine the combination of the Hough Transform with a variation of Geometric Hashing as a technique for model-based object recognition in seriously degraded single intensity images. Prior work on the performance analysis of geometric hashing has focused on point features and shown its noise sensitivity. This paper uses line features to compute recognition invariants in a potentially more robust way. We investigate the statistical behavior of these line features analytically. Various viewing transformations, which 2-D (or flat 3-D) objects undergo during image formation, are considered. For the case of affine transformations, which are often suitable substitutes for more general perspective transformations, we show experimentally that the technique is noise resistant and can be used in highly occluded environments.
Title: Miniature Direct-Drive Rotary Actuators
Author(s): Wallace, R.
Abstract:
This paper reports the development of direct drive DC motor actuators for miniature robots. The motors are based on Nd-Fe-B rare earth permanent magnets and controlled by low cost microcontrollers. The motors have low friction, small size, high speed, low construction cost, no gear backlash, operate safely without limit switches, have limited self-braking, and generate moderate torque. Significantly, one motor can generate enough torque to lift a second motor of about the same size against the force of gravity, at a distance approximately equal to the size of the motor, without resorting to the use of a counterweight. We demonstrated the feasibility of using these actuators to make a two-link robot finger or leg.
Title: Miniature Direct Drive Rotary Actuators II: Eye, Finger, and Leg
Author(s): Wallace, R.
Abstract:
We have developed miniature direct drive DC motor actuators for robotics. These actuators have low friction, small size, high speed, low construction cost, no gear backlash, operate safely without the use of limit switches and generate moderate torque at a high torque to weight ratio. Our initial experiments indicated the feasibility of constructing a variety of new high speed low cost actuators, for applications in camera pointing, robot hands, and robot legs. In this work we study some prototype devices in each of these categories.
Title: Voice-Bandwidth Visual Communication Through Logmaps: The Telecortex
Author(s): Wallace, R.; Bederson, B.; Schwartz, E.
Abstract:
We present a robotic video telephone application of the Cortex-1 miniaturized space-variant active vision system. The embedded processor architecture of Cortex-1 enables it to implement a variety of functions not found in conventional video telephones, for example the camera tracks moving users with its pantilt mechanism. We also report an analog channel coding scheme to transmit logmap video images through band-limited analog channels such as the public switched telephone network (PSTN). The transmitter divides the voice frequency band into 768 channels, and modulates two values in quadrature on each channel. Some channels are reserved for special calibration signals enabling the receiver to recover both the phase and magnitude of the transmitted signal. The remaining channels carry pixel intensities. We synthesize the signal in the frequency domain and run the FFT algorithm to implement a fast conversion to a real, time-domain signal. A phase-lock loop keeps the receiver frame-synchronized with the transmitter. We constructed an experimental video telephone that sends 1376 pixel logmap images at 3.9 frames per second through the PSTN. Using the analog channel coding scheme, we achieve an effective data transfer rate in excess of 40000 bits per second.
Title: Space Variant Image Processing
Author(s): Wallace, R.; Ong, P.; Bederson, B.; Schwartz, E.
Abstract:
This paper describes a graph-based approach to image processing, intended for use with images obtained from sensors having space variant sampling grids. The connectivity graph (CG) is presented as a fundamental framework for posing image operations in any kind of space variant sensor. Partially motivated by the observation that human vision is strongly space variant, a number of research groups have been experimenting with space variant sensors. Such systems cover wide solid angles yet maintain high acuity in their central regions. Implementation of space variant systems pose at least two outstanding problems. First, such a system must be active, in order to utilize its high acuity region; second, there are significant image processing problems introduced by the non-uniform pixel size, shape and connectivity. Familiar image processing operations such as connected components, convolution, template matching, and even image translation, take on new and different forms when defined on space variant images. The present paper provides a general method for space variant image processing, based on a connectivity graph which represents the neighbor-relations in an arbitrarily structured sensor. We illustrate this approach with the following applications: (1) Connected components is reduced to its graph theoretic counterpart. We illustrate this on a logmap sensor, which possesses a difficult topology due to the branch cut associated with the complex logarithm function. (2) We show how to write local image operators in the connectivity graph that are independent of the sensor geometry. (3) We relate the connectivity graph to pyramids over irregular tessalations, and implement a local binarization operator in a 2-level pyramid. (4) Finally, we expand the connectivity graph into a structure we call a transformation graph, which represents the effects of geometric transformations in space variant image sensors. Using the transformation graph, we define an efficient algorithm for matching in the logmap images and solve the template matching problem for space variant images. Because of the very small number of pixels typical of logarithmic structured space variant arrays, the connectivity graph approach to image processing is suitable for real-time implementation, and provides a generic solution to a wide range of image processing applications with space variant sensors.
Title: HESFCN - A Fortran package of Hessian Subroutines for Testing Nonlinear Optimization Software
Author(s): Averbukh, V.; Figueroa, S.; Schlick, T.
Abstract:
We report the development of Hessian FORTRAN routines for testing unconstrained nonlinear optimization. An existing package, "Algorithm 566" (J. Mor'e, B. G. Garbow, and K. Hillstrom, ACM Trans. Math. Softw. 7, 14-41 and 136-140, 1981) provides function and gradient subroutines of 18 test functions for multivariate minimization. Our supplementary Hessian segments will enable users to test optimization software that requires second derivative information. Eigenvalue analysis throughout the minimization will also be possible in the goal of better understanding minimization progress by different algorithms and the relation of progress to eigenvalue distribution and condition number.
Title: More Efficient Bottom-Up Mult-Pattern Matching in Trees
Author(s): Cai, J.; Paige, R.; Tarjan, R.
Abstract:
Pattern matching in trees is fundamental to a variety of programming language systems. However, progress has been slow in satisfying a pressing need for general purpose pattern matching algorithms that are efficient in both time and space. We offer asymptotic improvements in both time and space to Chase's bottom-up algorithm for pattern preprocessing. A preliminary implementation of our algorithm runs ten times faster than Chase's implementation on the hardest problem instances. Our preprocessing algorithm has the advantage of being on-line with respect to pattern additions and deletions. It also adapts to favorable input instances, and on Hoffmann and O'Donnell's class of Simple Patterns, it performs better than their special purpose algorithm tailored to this class. We show how to modify our algorithm using a new decomposition method to obtain a space/time tradeoff. Finally, we trade a log factor in time for a linear space bottom-up pattern matching algorithm that handles a wide subclass of Hoffmann and O'Donnell's Simple Patterns.
Title: An $O(m$log$n)$-Time Algorithm for the Maximal Planar Subgraph Problem
Author(s): Cai, J.; Han, X.; Tarjan, R.
Abstract:
Based on a new version of Hopcroft and Tarjan's planarity testing algorithm, we develop an O(mlogn)-time algorithm to find a maximal planar subgraph.
Title: Counting Embeddings of Planar Graphs Using DFS Trees
Author(s): Cai, Jiazhen
Abstract:
Previously counting embeddings of planar graphs used P-Q trees and was restricted to biconnected graphs. Although the P-Q tree approach is conceptually simple, its implementation is complicated. In this paper we solve this problem using DFS trees, which are easy to implement. We also give formulas that count the number of embeddings of general planar graphs (not necessarily connected or biconnected) in O (n) arithmetic steps, where n is the number of vertices of the input graph. Finally, our algorithm can be extended to generate all embeddings of a planar graph in linear time with respect to the output.
Title: Multiplicative Schwarz Algorithms for Some Nonsymmetric and Indefinite Problems
Author(s): Cai, X.-C.; Widlund, O.
Abstract:
The classical Schwarz alternating method has recently been generalized in several directions. This effort has resulted in a number of new powerful domain decomposition methods for elliptic problems, in new insight into multigrid methods and in the development of a very useful framework for the analysis of a variety of iterative methods. Most of this work has focused on positive definite, symmetric problems. In this paper a general framework is developed for multiplicative Schwarz algorithms for nonsymmetric and indefinite problems. Several applications are then discussed including two- and multi-level Schwarz methods and iterative substructuring algorithms. Some new results on additive Schwarz methods are also presented.
Title: Backward Analysis for Higher-Order Functions Using Inverse Images
Author(s): Chuang, T.-R.; Goldberg, B.
Abstract:
We propose a method for performing backward analysis on higher-order functional programming languages based on computing inverse images of functions over abstract domains. This method can be viewed as abstract interpretation done backward. Given an abstract semantics which supports forward analysis, we can transform it into an abstract semantics which performs backward analysis. We show that if the original abstract semantics is correct and computable, then the transformed version of the abstract semantics is also correct and computable.
More specifically, given a forward abstract semantics of a higher-order functional language which is expressed in terms of Scott-closed powerdomains, we derive an backward abstraction semantics which is expressed in terms of Scott-open powerdomains. The derivation is shown to be correct and the relationships between forward analysis and backward analysis is established. We apply this method to the classic strictness analysis in functional languages and obtain promising results. We show that the time complexity of inverse image based backward analysis is not much worse than the forward analysis from which it is derived. We then compare this work with previous works of Wadler and Hughes (1987), Hughes (1988), and Burn (1990), showing that some special restrictions and constructions in previous works have natural interpretation in the Scott-closed/Scott-open powerdomain framework. A brief outline of applying the inverse image method to other backward semantics analysis is also given.
Title: Some Recent Results on Schwarz Type Domain Decomposition Algorithms
Author(s): Dryja, M.; Widlund, O.
Abstract:
Numerical experiments have shown that two-level Schwarz methods, for the solution of discrete elliptic problems, often perform very well even if the overlap between neighboring subregions is quite small. This is true to an even greater extent for a related algorithm, due to Barry Smith, where a Schwarz algorithm is applied to the reduced linear system of equations that remains after that the variables interior to the subregions have been eliminated. A supporting theory is outlined.
Title: Domain Decomposition Algorithms with Small Overlap
Author(s): Dryja, M.; Widlund, O.
Abstract:
Numerical experiments have shown that two-level Schwarz methods often perform very well even if the overlap between neighboring subregions is quite small. This is true to an even greater extent for a related algorithm, due to Barry Smith, where a Schwarz algorithm is applied to the reduced linear system of equations that remains after that the variables interior to the subregions have been eliminated. In this paper, a supporting theory is developed.
Title: GMRES/CR and Arnoldi/Lanczos as Matrix Approximation Problems
Author(s): Greenbaum, A.; Trefethen, L.
Abstract:
The GMRES and Arnoldi algorithms, which reduce to the CR and Lanczos algorithms in the symmetric case, both minimize kp(A)bk over polynomials p of degree n. The difference is that p is normalized at z = 0 for GMRES and at z = inf for Arnoldi. Analogous "ideal GMRES" and "ideal Arnoldi" problems are obtained if one removes b from the discussion and minimizes kp(A)k instead. Investigation of these true and ideal approximation problems gives insight into how fast GMRES converges and how the Arnoldi iteration locates eigenvalues.
Title: Semantic Analyses for Storage Management Optimizations in Functional Language Implementations
Author(s): Park, G.
Abstract:
One of the major overheads in implementing functional languages is the storage management overhead due to dynamic allocation and automatic reclamation of indefinite-extent storage. This dissertation investigates the problems of statically inferring lifetime information about dynamically-allocated objects in higher-order polymorphic functional languages, both strict and non-strict, and of applying that information to reduce the storage management overhead.
We have developed a set of compile-time semantic analyses for a higher-order, monomorphic, strict functional language based on denotational semantics and abstract interpretation. They are 1) escape analysis, which provides information about the relative lifetimes of objects such as arguments and local objects defined within a function with respect to an activation of the function call, 2) refined escape analysis which, as a refinement of escape analysis, provides information about the lifetimes of components of aggregate structures, and 3) reference escape analysis which provides information about the relative lifetimes of references created within a function with respect to an activation of the function.
We also have developed a compile-time semantic analysis called order-of-demand analysis for higher-order, monomorphic, non-strict functional languages, which provides information about the order in which the values of bound variables are demanded and thus allows one to compute a range of information including strictness, evaluation-order, and evaluationstatus information.
Using the notion of polymorphic invariance, we describe a method for analyzing a polymorphic language by using the analyses for a monomorphic language. We then extend those analyses for a strict language to a non-strict language using non-strict program transformation and evaluation-status information.
Based on statically inferred escape information, we propose a combination of storage management optimization techniques including stack allocation, explicit reclamation, in-place reuse, reference counting elimination, block allocation/reclamation, and improving generational garbage collection.
Title: Domain Decomposition Algorithms for the P-Version Finite Element Method for Elliptic Problems
Author(s): Pavarino, L.
Abstract:
Domain decomposition algorithms based on the Schwarz framework were originally proposed for the h-version finite element method for elliptic problems. In this thesis, we study some Schwarz algorithms for the p-version finite element method, in which increased accuracy is achieved by increasing the degree p of the elements while the mesh is fixed. These iterative algorithms, often of conjugate gradient type, are both parallel and scalable, and therefore very well suited for massively parallel computing.
We consider linear, scalar, self adjoint, second order elliptic problems and quadrilateral elements in the finite element discretization. For a class of overlapping methods, we prove a constant bound, independent of the degree p, the mesh size H and the number of elements N , for the condition number of the iteration operator. This optimal result holds in two and three dimensions for additive and multiplicative schemes, as well as variants on the interface.
We consider then local refinement for the same class of overlapping methods in two dimensions. Optimal bounds are obtained under certain hypothesis on the choice of refinement points, while in general almost optimal bounds with logarithmic growth in p are obtained. In the analysis of these local refinement methods, we prove some results of independent interest, such as a polynomial discrete Sobolev inequality and a bounded decomposition of discrete harmonic polynomials.
Iterative substructuring methods in two dimensions are also considered. We use the additive Schwarz framework to prove almost optimal bounds as in the h-version finite element method.
Results of numerical experiments, confirming the theoretical results, are conducted in two dimensions for model problems.
Title: Some Schwarz Algorithms for the P-Version Finite Element Method
Author(s): Pavarino, L.
Abstract:
Domain decomposition methods based on the Schwarz framework were originally proposed for the h-version finite element method for elliptic problems. In this paper, we consider instead the p-version, in which increased accuracy is achieved by increasing the degree of the elements while the mesh is fixed. We consider linear, scalar, self adjoint, second order elliptic problems and quadrilateral elements in the finite element discretization. For a class of overlapping additive Schwarz methods, we prove a constant bound, independent of the degree p and the number of elements N , for the condition number of the iteration operator. This optimal result holds in two and three dimensions for additive and multiplicative schemes, as well as variants on the interface. We then study local refinement for the same class of overlapping methods in two dimensions. A constant bound holds under certain hypotheses on the refinement region, while in general an almost optimal bound with logarithmic growth in p is obtained.
Title: Statistical Approach to Affine Invariant Matching with Line Features
Author(s): Tsai, F.
Abstract:
One of the most important goals of computer vision research is object recognition. Currently, most object recognition algorithms assume reliable quality of image segmentation, which in practice is often not the case. This report examines the combination of the Hough Transform with a variation of Geometric Hashing as a technique for model-based object recognition in seriously degraded single intensity images.
There is recently much focus on the performance analysis of geometric hashing. However, to our knowledge, all of them are focusing on applying the paradigm to point features and show that the technique is sensitive to noise. There is as yet no exploration of line features. In this report, we use lines as the primitive features to compute the geometric invariants for fast indexing into the geometric hash table containing the pre-processed model information. In addition, we analytically determine the effect of perturbations of line parameters on the computed invariant for the case where models are allowed to undergo affine transformation.
We have implemented the system with a series of experiments on polygonal objects, which are modeled by lines. It shows that the technique is noise resistant and suitable in an environment containing many occlusions.
Title: Differential Properties of Eigenvalues
Author(s): Burke, J.; Overton, M.
Abstract:
We define and study a directional derivative for two functions of the spectrum of an analytic matrix valued function. These are the maximum real part and the maximum modulus of the spectrum. Results are first obtained for the roots of polynomials with analytic coefficients by way of Puiseux-Newton series. In this regard, the primary analytic tool is the so called Puiseux-Newton diagram. These results are then translated into the context of matrices. Precise results are obtained when the eigenvalues that achieve the maximum value for the function under consideration are all either nondefective or nonderogatory. In the defective derogatory cases a general lower bound for the directional derivative is given which, in particular, describes those directions in which the directional derivative attains an infinite value.
Title: Randomized Parallel Algorithms for Trapezoidal Diagrams
Author(s): Clarkson, K. L.; Cole, R.; Tarjan, R. E.
Abstract:
We describe randomized parallel CREW PRAM algorithms for building trapezoidal diagrams of line segments in the plane. For general segments, we give an algorithm requiring optimal O(A + nlogn) expected work and optimal O(logn) time, where A is the number of intersecting pairs of segments. If the segments for a simple chain, we give an algorithm requiring optimal O(n) expected work and O(lognloglognlog*n) expected time, and a simpler algorithm requiring O(nlog*n) expected work. The serial algorithm corresponding to the latter is the simplest known algorithm requiring O(nlog*n) expected operations. For a set of segments forming K chains, we give an algorithm requiring O(A + nlog*n + Klogn) expected work and O(lognloglognlog*n) expected time. The parallel time bounds require the assumption that enough processors are available, with processor allocations every logn steps.
Title: On the Detection of Robust Curves
Author(s): Cole, R.; Vishkin, U.
Abstract:
Given m points in the plane and a threshold t, a curve is defined to be robust if at least t points lie on it. Efficient algorithms for detecting robust curves are given; the key contribution is to use randomized sampling. In addition, an approximate version of the problem is introduced. A geometric solution to this problem is given; it too can be enhanced by randomization.
These algorithms are readily generalized to solve the problem of robust curve detection in a scene of curve fragments: given a set of curve segments, a curve oe is defined to be robust if curve segments of total length at least l lie on oe. Again, both an exact and an approximate version of the problem are considered.
The problems and solutions are closely related to the well-investigated Hough Transform technique.
Title: The Expected Advantage of Asynchrony
Author(s): Cole, R.; Zajicek, O.
Abstract:
This paper studies the implicit costs of synchronization and the advantage that may be gained by avoiding synchronization in asynchronous environments. An asynchronous generalization of the PRAM model called the APRAM model is used and appropriate complexity measures are defined. The advantage asynchrony provides is illustrated by analyzing two algorithms: a parallel summation algorithm which proceeds along an implicit complete binary tree and a recursive doubling algorithm which proceeds along a linked list.
Title: An Asynchronous Parallel Algorithm for Undirected Graph Connectivity
Author(s): Cole, R.; Zajicek, O.
Abstract:
An algorithm for computing the components of an undirected graph in the (asynchronous) APRAM model is given; the algorithm uses O(n + e) processes and O(log n) rounds.
Title: The APRAM - The Rounds Complexity Measure and the Explicit Costs of Synchronization
Author(s): Cole, R.; Zajicek, O.
Abstract:
This paper studies the explicit costs of synchronization by examining an asynchronous generalization of the PRAM model called the APRAM model. The APRAM model and its associated complexity measure, the rounds complexity, are defined and then illustrated by designing and analyzing two algorithms: a parallel summation algorithm which proceeds along an implicit complete binary tree and a recursive doubling algorithm which proceeds along a linked list. In both cases replacing global synchronization with local synchronization yields algorithms with reduced complexity.
Title: Lucid Representations
Author(s): Davis, E.
Abstract:
This paper criticizes the widespread idea that knowledge bases in AI systems should be complete and that representations should be "model-like." The arguments in favor of such representations are less cogent and more ambiguous than they appear at first. Levesque's suggestion that representations should be "vivid" is extremely restrictive, particularly in its uniform imposition of a closed-world assumption. Spatial representations that are adequate for reasoning about a wide range of physical phenomena must ultimately either use complex symbolic reasoning or deal with partial and imperfect approximations. Requiring that temporal representations be fully detailed simulations will often be extremely inefficient. Finally, a flexible intelligent system must necessarily deal with partial information of all kinds, and the techniques for carrying out reasoning about partial information using complete representations are very limited in their application.
Title: The Kinematics of Cutting Solid Objects
Author(s): Davis, E.
Abstract:
This paper studies how the cutting of one solid object by another can be described in a formal theory. We present two alternative first-order representations for this domain. The first views an object as gradually changing its shape until it is split, at which time the original object ceases to exist and two (or more) new objects come into existence. The second focusses instead on chunks of material which are part of the overall object. A chunk persists with constant shape until some piece of it is cut away, when the chunk ceases to exist. We prove that the two theories are equivalent under ordinary circumstances, and we show that they are sufficient to support some simple commonsense inferences and algorithms.
Title: Additive Schwarz Methods for Elliptic Finite Element Problems in Three Dimensions
Author(s): Dryja, M.; Widlund, O.
Abstract:
Many domain decomposition algorithms and certain multigrid methods can be described and analyzed as additive Schwarz methods. When designing and analyzing domain decomposition methods, we encounter special difficulties in the case of three dimensions and if the coefficients are discontinuous and vary over a large range. In this paper, we first introduce a general framework for Schwarz methods. Three classes of applications are then considered: certain wire basket based iterative substructuring methods, Neumann-Neumann algorithms with low dimensional, global subspaces and a modified form of a multilevel algorithm introduced by Bramble, Pasciak and Xu.
Title: Efficient Algorithms for Cyclic Scheduling
Author(s): Gasperoni, F.; Schwiegelshohn, U.
Abstract:
This work addresses the problem of non-preemptively scheduling a cyclic set of interdependent operations, representing for instance a program loop, when p identical processors are available. For p = 1 we give a simple, efficient, polynomial time algorithm producing optimum results. When p ! 1 the problem becomes NP-hard and a slight modification of our algorithm generates provably close to optimum results.
We consider real-time systems in which the value of a task is proportional to its computation time. The system obtains the value of a given task if the task completes by its deadline. Otherwise, the system obtains no value for the task.
When such a system is underloaded (i.e. there exists a schedule for which all tasks meet their deadlines), Dertouzos showed that the earliest deadline first algorithm will achieve 100% of the possible value. We consider the case of a possibly overloaded system and present an algorithm which: 1. behaves like the earliest deadline first algorithm when the system is underloaded. 2. obtains at least 1/4 of the maximum value that an optimal clairvoyant algorithm could obtain even when the system is overloaded.
We implement our algorithm with an amortized cost of O(log n) time per task, where n bounds the number of tasks in the system at any instant.
Title: On Shape Optimizing the Ratio of the First Two Eigenvalues of the Laplacian
Author(s): Haeberly, J.
Abstract:
We investigate numerically a 1956 conjecture of Payne, Polya, and Weinberger. The conjecture asserts that the ratio of the first two eigenvalues of the Laplacian on a bounded domain \Omega of the plane with Dirichlet boundary conditions reaches its minimum value precisely when \Omega is a disk. A crucial feature of this problem is the loss of smoothness of the objective function at the solution. The following results form the core of our numerical treatment. First, we construct finite dimensional families of deformations of a disk equipped with a uniform triangulation. This permits the formulation of a discrete model of the problem via finite element techniques. Second, we build on the work of M. Overton to derive optimality conditions in terms of Clarke's generalized gradients for nonsmooth functions. These ideas are then combined into an algorithm and implemented in Fortran.
Title: Programming with Structures, Functions, and Objects
Author(s): Henglein, F.; Laufer, K.
Abstract:
We describe program structuring mechanisms for integrating algebraic, functional and object-oriented programming in a single framework. Our language is a statically typed higher-order language with specifications, structures, types, and values, and with universal and existential abstraction over structures, types, and values.
We show that existential types over structures generalize both the necessarily homogeneous type classes of Haskell and the necessarily heterogeneous object classes of object-oriented programming languages such as C++ or Eiffel. Following recent work on ML, we provide separate linguistic mechanisms for reusing specifications and structures. Subtyping is provided in the form of explicit type conversions.
The language mechanisms are introduced by examples to emphasize their pragmatic aspects. We compare them with the mechanisms of XML+, Haskell and Eiffel and give a type-theoretic perspective. These mechanisms have been developed within a larger, ongoing prototyping language design project.
Title: Efficient Loop-Level Parallelishm in ADA
Author(s): Hind, M.
Abstract:
Parallelism in scientific applications can most often be found at the loop level. Although Ada supports parallelism via the task construct, its coarseness renders it unsuitable for this light-weight parallelism. In this work, we propose Ada constructs to achieve efficient loop-level parallelism in ANSI-Ada. This is accomplished in two steps. First, we present an idiom that allows the specification of light-weight tasks. Second, we give an efficient implementation of this idiom that is considerably more efficient than a standard Ada task.
In addition, we present an idiom that makes the fetch and add synchronization primitive available at the Ada level. Our implementation of this idiom is more efficient in both time and space than previous results. In addition to providing universal synchronization, using fetch and add simplifies program analysis (e.g. proving the absence of race conditions in the implementation of a parallel algorithm). Since all these idioms are written in standard Ada, they maintain the portability that is central to the mandated uses of the language.
Title: Decomposition and Fictitious Domains Methods for Elliptic Boundary Value Problems
Author(s): Nepomnyaschikh, S.
Abstract:
Boundary value problems for elliptic second order equations in three-dimensional domains with piecewise smooth boundaries are considered. Discretization of the problem is