Theses & Reports
Instructions for submitting a technical report or thesis.
You can find technical reports published prior to 1990 archived here.
-
TR2005-867
2005
Title: Infrastructure for Automatic Dynamic Deployment of J2EE Applications in Distributed Environments
Author(s): Akkerman, Anatoly; Totok, Alexander; Karamcheti, Vijay
Abstract:
Recent studies have shown the potential of using component frameworks for building flexible, adaptable applications for deployment in distributed environments. However, this approach is hindered by the complexity of deploying component-based applications, which usually involves a great deal of configuration of both the application components and the system services they depend on. In this paper we propose an infrastructure for automatic dynamic deployment of J2EE applications that specifically addresses the problems of (1) inter-component connectivity specification and its effects on component configuration and deployment; and (2) application component dependencies on application server services, their configuration and deployment. The proposed infrastructure provides simple yet expressive abstractions for potential application adaptation through dynamic deployment and undeployment of components. We implement the infrastructure as a part of the JBoss J2EE application server and test it on several sample J2EE applications.
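To make the deployment-ordering aspect of this problem concrete, here is a minimal Python sketch (not the paper's infrastructure; the component and service names are invented) that derives a valid deployment order from declared dependencies between components and server services:

```python
from graphlib import TopologicalSorter

# Hypothetical component -> dependencies map: each J2EE component or
# application server service must be deployed after everything it depends on.
dependencies = {
    "OrderEJB":           {"JMSQueue", "OrderDB-DataSource"},
    "CartServlet":        {"OrderEJB"},
    "JMSQueue":           {"JMSService"},
    "OrderDB-DataSource": set(),
    "JMSService":         set(),
}

# A valid dynamic-deployment order: dependencies always come first, e.g.
# ['OrderDB-DataSource', 'JMSService', 'JMSQueue', 'OrderEJB', 'CartServlet']
print(list(TopologicalSorter(dependencies).static_order()))
```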
-
TR2005-858
2005
Title: Remembrance of Experiments Past: Analyzing Time Course Datasets to Discover Complex Temporal Invariants
Author(s): Antoniotti, Marco; Ramakrishnan, Naren; Kumar, Deept; Spivak, Marina; Mishra, Bud
Abstract:
Motivation: Current microarray data analysis techniques draw the biologist's attention to targeted sets of genes but do not otherwise present global and dynamic perspectives (e.g., invariants) inferred collectively over a dataset. Such perspectives are important in order to obtain a process-level understanding of the underlying cellular machinery, especially how cells react, respond, and recover from stresses.
Results: We present GOALIE, a novel computational approach and software system that uncovers formal temporal logic models of biological processes from time course microarray datasets. GOALIE `redescribes' data into the vocabulary of biological processes and then pieces together these redescriptions into a Kripke-structure model, where possible worlds encode transcriptional states and are connected to future possible worlds. This model then supports various query, inference, and comparative assessment tasks, besides providing descriptive process-level summaries. An application of GOALIE to characterizing the yeast (S. cerevisiae) cell cycle is described.
Availability: GOALIE runs on Windows XP platforms and is available on request from the authors.
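For illustration only (the states and labels below are invented, and GOALIE's actual models are far richer), a Kripke-structure model of the kind described above can be encoded as labeled states plus a transition relation, with a simple temporal query reduced to graph search:

```python
from collections import deque

# Toy Kripke structure: possible worlds are transcriptional states, labeled
# with active biological processes (states and labels invented).
transitions = {"G1": ["S"], "S": ["G2"], "G2": ["M"], "M": ["G1"]}
labels = {
    "G1": {"cell growth"}, "S": {"DNA replication"},
    "G2": {"growth", "repair"}, "M": {"mitosis"},
}

def ef(start, prop):
    """EF prop: is some state labeled with `prop` reachable from `start`?"""
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        if prop in labels[s]:
            return True
        for t in transitions[s]:
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return False

print(ef("G1", "mitosis"))  # True: mitosis is eventually reachable from G1
```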
-
TR2005-878
2005
Title: An Abstract Decision Procedure for Satisfiability in the Theory of Recursive Data Types
Author(s): Barrett, Clark; Shikanian, Igor; Tinelli, Cesare
Abstract:
The theory of recursive data types is a valuable modeling tool for software verification. In the past, decision procedures have been proposed for both the full theory and its universal fragment. However, previous work has been limited in various ways, including an inability to deal with multiple constructors, multi-sorted logic, and mutually recursive data types. More significantly, previous algorithms for the universal case have been based on inefficient nondeterministic guesses and have been described in fairly complex procedural terms.
We present an algorithm which addresses these issues for the universal theory. The algorithm is presented declaratively as a set of abstract rules which are terminating, sound, and complete. We also describe strategies for applying the rules and explain why our recommended strategy is more efficient than those used by previous algorithms. Finally, we discuss how the algorithm can be used within a broader framework of cooperating decision procedures.
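As a rough illustration of the rule-based flavor of such procedures (this is not the paper's calculus; acyclicity and selector rules, among others, are omitted), the sketch below applies constructor injectivity (Decompose) and detects constructor clashes over ground list terms using a union-find:

```python
# Toy fragment of a rule-based procedure for the data type of lists built
# from cons/nil: a union-find maintains equalities; Decompose propagates
# constructor injectivity, and Clash rejects equations between distinct
# constructors. Terms are tuples: ("cons", h, t), ("nil",), ("var", name).
parent = {}

def find(t):
    parent.setdefault(t, t)
    while parent[t] != t:
        parent[t] = parent[parent[t]]   # path halving
        t = parent[t]
    return t

def union(a, b):
    ra, rb = find(a), find(b)
    if ra == rb:
        return
    if ra[0] == "cons" and rb[0] == "cons":          # Decompose (injectivity)
        parent[ra] = rb
        union(ra[1], rb[1])
        union(ra[2], rb[2])
    elif ra[0] != rb[0] and "var" not in (ra[0], rb[0]):
        raise ValueError(f"Clash: {ra} = {rb} is unsatisfiable")
    else:
        parent[ra] = rb

x, y, nil = ("var", "x"), ("var", "y"), ("nil",)
# Assert cons(x, cons(y, nil)) = cons(nil, cons(x, nil)).
union(("cons", x, ("cons", y, nil)), ("cons", nil, ("cons", x, nil)))
print(find(x), find(y))   # both nil: the equation forces x = y = nil
```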
-
TR2005-869
2005
Title: Squidball: An Experiment in Large-Scale Motion Capture and Game Design
Author(s): Bregler, Christoph; Castiglia, Clothilde; DeVincenzo, Jessica; DuBois, Roger Luke; Feeley, Kevin; Igoe, Tom; Meyer, Jonathan; Naimark, Michael; Postelnicu, Alexandru; Rabinovich, Michael; Rosenthal, Sally; Salen, Katie; Sudol, Jeremi; Wright, Bo
Abstract:
This paper describes Squidball, a new large-scale motion-capture-based game. It was tested on audiences of up to 4000 players at SIGGRAPH 2004, and required building the world's largest motion capture space and the largest motion capture markers (balls), as well as meeting many other challenges in technology, production, game play, and social studies. Our aim was to entertain the SIGGRAPH Electronic Theater audience with a cooperative and energetic game played by everybody together, controlling real-time graphics and audio by bouncing and batting multiple large helium-filled balloons across the entire theater space. We detail in this paper the lessons learned in producing such a system and game, and argue why we believe Squidball was a great success.
-
TR2005-860
2005
Title: A Domain Decomposition Discretization of Parabolic Problems
Author(s): Dryja, Maksymilian; Tu, Xuemin
Abstract:
In recent years, domain decomposition methods have attracted much attention due to their successful application to many elliptic and parabolic problems. Domain decomposition methods treat problems based on a domain substructuring, which is attractive for parallel computation due to the independence among the subdomains. In principle, domain decomposition methods may be applied to the system resulting from a standard discretization of a parabolic problem or be carried out directly through a domain decomposition discretization of the problem itself. In this paper, a direct domain decomposition method is introduced to discretize parabolic problems. The stability and convergence of this algorithm are analyzed, and an $O(\tau+h)$ error bound is provided.
-
Ph.D. Thesis
2005
Title: Translation Validation of Optimizing Compilers
Candidate: Fang, Yi
Advisor(s): Pnueli, Amir; Zuck, Lenore
Abstract:
There is a growing awareness, both in industry and academia, of the crucial role of formally verifying the translation from high-level source code into low-level object code that is typically performed by an optimizing compiler. Formally verifying an optimizing compiler, as one would verify any other large program, is not feasible due to its size, ongoing evolution and modification, and, possibly, proprietary considerations. Translation validation is a novel approach that offers an alternative to the verification of translators in general and compilers in particular: rather than verifying the compiler itself, one constructs a validation tool which, after every run of the compiler, formally confirms that the target code produced in the run is a correct translation of the source program. This thesis work takes an important step towards ensuring an extremely high level of confidence in compilers targeted at EPIC architectures.
In this thesis, we focus on the translation validation of structure-preserving optimizations, i.e., transformations that do not modify a program's structure in a major way. This category covers most of the global optimizations performed by compilers. The thesis has two main parts. One develops a proof rule that formally establishes the correctness of structure-preserving transformations based on computational induction. The other is the development of a tool that applies the proof rule to the automatic validation of global optimizations performed by Intel's ORC compiler for the IA-64 architecture. With minimal instrumentation from the compiler, the tool constructs ``verification conditions'' -- formal theorems that, if valid, establish the correctness of a translation. The verification conditions are then transferred to an automatic theorem prover that checks their validity. Together, these components offer a fully automatic method to formally establish the correctness of each translation.
-
TR2005-875
2005
Title: Nonlinear Image Representation via Local Multiscale Orientation
Author(s): Hammond, David K.; Simoncelli, Eero P.
Abstract:
We present a nonlinear image representation based on multiscale local orientation measurements. Specifically, an image is first decomposed using a two-orientation steerable pyramid, a tight-frame representation in which the basis functions are directional derivatives of a radially symmetric blurring operator. The pair of subbands at each scale are thus gradients of progressively blurred copies of the original image. We then discard the magnitude information and retain only the orientation of each gradient vector. We develop a method for reconstructing the original image from this orientation information using an algorithm based on projection onto convex sets, and demonstrate its robustness to quantization.
-
TR2005-866
2005
Title: An Analysis of Usage Locality for Data-Centric Web Services
Author(s): He, Congchun; Karamcheti, Vijay
Abstract:
The growing popularity of XML Web Services is resulting in a significant increase in the proportion of Internet traffic that involves requests to and responses from Web Services. Unfortunately, web service responses, because they are generated dynamically, are considered ``uncacheable'' by traditional caching infrastructures. One way of remedying this situation is by developing alternative caching infrastructures, which improve performance using on-demand service replication, data offloading, and request redirection. These infrastructures benefit from two characteristics of web service traffic --- (1) the open nature of the underlying protocols, SOAP, WSDL, UDDI, which results in service requests and responses adhering to a well-formatted, widely known structure; and (2) the observation that for a large number of currently deployed data-centric services, requests can be interpreted as structured accesses against a physical or virtual database --- but require that there be sufficient locality in service usage to offset replication and redirection costs.
This paper investigates whether such locality does in fact exist in current web service workloads. We examine access logs from two large data-centric web service sites, SkyServer and TerraServer, to characterize workload locality across several dimensions: data space, network regions, and different time epochs. Our results show that both workloads exhibit a high degree of spatial and network locality: 10\% of the client IP addresses in the SkyServer trace contribute to about 99.95\% of the requests, and 99.94\% of the requests in the TerraServer trace are directed towards regions that represent less than 10\% of the overall data space accessible through the service. Our results point to the substantial opportunity for improving Web Services scalability by on-demand service replication.
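A minimal sketch of the kind of locality measurement reported above, assuming a parsed list of client IPs from an access log (the sample data below is invented; real logs would be parsed from the server):

```python
from collections import Counter

# Measure what fraction of requests the busiest 10% of client IPs account
# for (the paper reports ~99.95% for the SkyServer trace).
access_log = ["128.1.2.3", "128.1.2.3", "66.7.8.9", "128.1.2.3", "10.0.0.1"]

counts = Counter(access_log)
top_k = max(1, len(counts) // 10)            # top 10% of distinct client IPs
busiest = counts.most_common(top_k)
share = sum(n for _, n in busiest) / len(access_log)
print(f"top {top_k} IP(s) account for {share:.1%} of requests")
```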
-
TR2005-877
2005
Title: Oriented Overlays For Clustering Client Requests To Data-Centric Network Services
Author(s): He, Congchun; Karamcheti, Vijay
Abstract:
Many of the data-centric network services deployed today hold massive volumes of data at their origin websites, accessing the data to dynamically generate responses. Such dynamic responses are poorly supported by traditional caching infrastructures and result in poor performance and scalability for such services. One way of remedying this situation is to develop alternative caching infrastructures, which can dynamically detect the often large degree of service usage locality and leverage such information to on-demand replicate and redirect requests to service portions at appropriate network locations. Key to building such infrastructures is the ability to cluster and inspect client requests, at various points across a wide-area network.
This paper presents a zone-based scheme for constructing oriented overlays, which provide such an ability. Oriented overlays differ from previously proposed unstructured overlays in supporting network traffic flows from many sources towards one (or a small number) of destinations, and vice versa. A good oriented overlay offers sufficient clustering ability without adversely affecting path latencies. Our overlay construction scheme organizes participating nodes into different zones according to their latencies from the origin server(s), and has each node associate with one or more parents in another zone closer to the origin. Extensive experiments with a PlanetLab-based implementation of our scheme show that it produces overlays that (1) are robust to network dynamics; (2) offer good clustering ability; and (3) have minimal impact on the end-to-end network latencies seen by clients.
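A toy rendering of the zone-based construction (all parameters and latencies are invented; the real scheme measures node-to-node latencies when choosing parents, whereas this sketch ranks candidates by their latency to the origin for simplicity):

```python
# Group nodes into latency bands ("zones") around the origin server; each
# node picks a parent in a strictly closer zone, zone 0 attaches to origin.
ZONE_WIDTH_MS = 50

nodes = {"a": 12, "b": 38, "c": 61, "d": 95, "e": 140}  # latency to origin

zone = {n: int(lat // ZONE_WIDTH_MS) for n, lat in nodes.items()}

def parent(n):
    cands = [m for m in nodes if zone[m] < zone[n]]
    return min(cands, key=lambda m: nodes[m]) if cands else "origin"

for n in sorted(nodes):
    print(n, "zone", zone[n], "-> parent", parent(n))
```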
-
Ph.D. Thesis
2005
Title: Translation Validation of Loop Optimizations
Candidate: Hu, Ying
Advisor(s): Goldberg, Benjamin; Barrett, Clark
Abstract:
Formal verification is important in designing reliable computer systems. For a critical software system, it is not enough to have a proof of correctness for the source code; there must also be an assurance that the compiler produces a correct translation of the source code into the target machine code. Verifying the correctness of modern optimizing compilers is a challenging task because of their size, their complexity, and their evolution over time.
In this thesis, we focus on the translation validation of loop optimizations. In order to validate the optimizations performed by the compiler, we try to prove the equivalence of the intermediate code before and after the optimizations. Previous work established a set of proof rules for building the equivalence relation between two programs; however, those rules cannot validate some legal loop optimizations. We propose new proof rules that consider the conditions of loops and the possible elimination of some loops, so that those cases can also be handled. Based on these new proof rules, we design algorithms that apply them in an automatic validation process.
Based on the above proof rules, we implement an automatic validation tool for loop optimizations which analyzes the loops, guesses what kinds of loop optimizations occur, proves the validity of a combination of loop optimizations, and synthesizes a series of intermediate codes. We integrate this new loop tool into our translation validation tool TVOC, so that TVOC handles not only optimizations which do not significantly change the structure of the code, but also loop optimizations which do change the structure greatly. With this new part, TVOC has succeeded in validating many examples with loop optimizations.
Speculative optimizations are the aggressive optimizations that are only correct under certain conditions that cannot be known at compile time. In this thesis, we present the theory and algorithms for validating speculative optimizations and generating the runtime tests necessary for speculative optimizations. We also provide several examples and the results of the algorithms for speculative optimizations.
-
TR2005-870
2005
Title: Two-Level Schwarz Algorithms, Using Overlapping Subregions, for Mortar Finite Element Methods
Author(s): Kim, Hyea Hyun; Widlund, Olof B.
Abstract:
Preconditioned conjugate gradient methods based on two-level overlapping Schwarz methods often perform quite well. Such a preconditioner combines a coarse space solver with local components which are defined in terms of subregions which form an overlapping covering of the region on which the elliptic problem is defined. Precise bounds on the rate of convergence of such iterative methods have previously been obtained in the case of conforming lower order and spectral finite elements as well as in a number of other cases. In this paper, this domain decomposition algorithm and analysis are extended to mortar finite elements. It is established that the condition number of the relevant iteration operator is independent of the number of subregions and varies with the relative overlap between neighboring subregions linearly as in the conforming cases previously considered.
-
Ph.D. Thesis
2005
Title: Construction of Component-Based Applications by Planning
Candidate: Kichkaylo, Tatiana
Advisor(s): Karamcheti, Vijay; Davis, Ernest
Abstract:
Many modern wide-area distributed systems are component-based. This approach provides great flexibility in adapting applications to the changing state of the environment and user requirements, but increases the complexity of configuring the applications. Because of the scale and heterogeneity of modern wide-area environments, manual configuration is hard, inefficient, suboptimal, and error-prone. Automated application configuration is desired.
Constructing distributed applications requires choosing a set of components that will constitute the application instance and assigning network resources to component executions and data transfers. Stated this way, the application configuration problem (ACP) is similar to the planning (action selection) and scheduling (resource allocation) problems studied by the Artificial Intelligence (AI) community.
This thesis investigates the problem of solving the ACP using AI planning techniques. However, the ACP poses several challenges not usually encountered and addressed by the traditional AI solutions. The problem specification for the ACP can be much larger than the solution, with the relevant portions only identified during the search. Additionally, the interactions between planning operators are numeric rather than logical. Finally, it is desirable to be able to trade off quality of the solution versus search time.
We show that the ACP is undecidable in general. Therefore, instead of a single algorithm, we propose a set of techniques that can be used to compose an algorithm for a particular variety of the ACP that can exploit natural restrictions exhibited by that variety. These techniques address the challenges above by dynamically obtaining portions of the problem specification as necessary during the search, using envelope hierarchies based on numeric information for pruning and search guidance, and discretizing continuous variables to approximate numeric parameters without restricting the form of supported numeric functions.
We illustrate these techniques by describing their use in algorithms tailored for two specific varieties of the ACP --- snapshot configurations for dynamic component-based frameworks, and scheduling of grid workflows with replica selection and explicit resource reservations. Experimental evaluation of the performance of these two algorithms shows that the techniques successfully achieve their goals, with acceptable run-time overhead.
-
TR2005-873
2005
Title: A BDDC algorithm for problems with mortar discretization
Author(s): Kim, Hyea Hyun; Dryja, Maksymilian; Widlund, Olof B.
Abstract:
A BDDC (balancing domain decomposition by constraints) algorithm is developed for elliptic problems with mortar discretizations for geometrically non-conforming partitions in both two and three spatial dimensions. The coarse component of the preconditioner is defined in terms of one mortar constraint for each edge/face which is an intersection of the boundaries of a pair of subdomains. A condition number bound of the form $C \max_i \left\{ (1+\log (H_i/h_i))^3 \right\}$ is established. In geometrically conforming cases, the bound can be improved to $C \max_i \left\{ (1+\log (H_i/h_i))^2 \right\}$. This estimate is also valid in the geometrically nonconforming case under an additional assumption on the ratio of mesh sizes and jumps of the coefficients. This BDDC preconditioner is also shown to be closely related to the Neumann-Dirichlet preconditioner for the FETI--DP algorithms of \cite{K-04-3d,KL-02} and it is shown that the eigenvalues of the BDDC and FETI--DP methods are the same except possibly for an eigenvalue equal to 1.
-
TR2005-863
2005
Title: A FETI-DP formulation of three dimensional elasticity problems with mortar discretization
Author(s): Kim, Hyea Hyun
Abstract:
In this paper, a FETI-DP formulation for the three dimensional elasticity problem on non-matching grids over a geometrically conforming subdomain partition is considered. To resolve the nonconformity of the finite elements, a mortar matching condition is imposed on the subdomain interfaces (faces). By introducing Lagrange multipliers for the mortar matching constraints, the resulting linear system becomes similar to that of a FETI-DP method. In order to make the FETI-DP method efficient for solving this linear system, a relatively large set of primal constraints, which include average and momentum constraints over interfaces (faces) as well as vertex constraints, is introduced. A condition number bound $C(1+\log(H/h))^2$ for the FETI-DP formulation with a Neumann-Dirichlet preconditioner is then proved for elasticity problems with discontinuous material parameters when only some faces are chosen as primal faces on which the average and momentum constraints are imposed. An algorithm which selects quite a small number of primal faces is also discussed.
-
TR2005-861
2005
Title: BDDC Algorithms for Incompressible Stokes Equations
Author(s): Li, Jing; Widlund, Olof B.
Abstract:
The purpose of this paper is to extend the BDDC (balancing domain decomposition by constraints) algorithm to saddle-point problems that arise when mixed finite element methods are used to approximate the system of incompressible Stokes equations. The BDDC algorithms are iterative substructuring methods, which form a class of domain decomposition methods based on the decomposition of the domain of the differential equations into nonoverlapping subdomains. They are defined in terms of a set of primal continuity constraints, which are enforced across the interface between the subdomains and which provide a coarse space component of the preconditioner. Sets of such constraints are identified for which bounds on the rate of convergence can be established that are just as strong as previously known bounds for the elliptic case. In fact, the preconditioned operator is effectively positive definite, which makes the use of a conjugate gradient method possible. A close connection is also established between the BDDC and FETI-DP algorithms for the Stokes case.
-
TR2004-857
2005
Title: FETI--DP, BDDC, and Block Cholesky Methods
Author(s): Li, Jing; Widlund, Olof B.
Abstract:
Two popular non-overlapping domain decomposition methods, the FETI--DP and BDDC algorithms, are reformulated using block Cholesky factorizations, an approach which can provide a useful framework for the design of domain decomposition algorithms for solving symmetric positive definite linear systems of equations. Instead of introducing Lagrange multipliers to enforce the coarse-level primal continuity constraints in these algorithms, a change of variables is used such that each primal constraint corresponds to an explicit degree of freedom. With the new formulations of these algorithms, a simplified proof is provided that the spectra of a pair of FETI--DP and BDDC algorithms, with the same set of primal constraints, are the same. Results of numerical experiments also confirm this equality.
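The following numpy sketch shows the block elimination that underlies such reformulations, with interior unknowns ordered first and the interface Schur complement handled last (a generic illustration of block Cholesky substructuring, not the paper's algorithms; the matrix is a random SPD example):

```python
import numpy as np

# Partition an SPD matrix K = [[K_II, K_IG], [K_GI, K_GG]] with interior (I)
# unknowns first and interface (G) unknowns last; the Schur complement S
# is the operator that the interface iteration effectively sees.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
K = A @ A.T + 6 * np.eye(6)          # a random SPD test matrix
nI = 4                               # 4 interior, 2 interface unknowns

K_II, K_IG = K[:nI, :nI], K[:nI, nI:]
K_GI, K_GG = K[nI:, :nI], K[nI:, nI:]

S = K_GG - K_GI @ np.linalg.solve(K_II, K_IG)   # Schur complement

# Solve K x = b by block forward/back substitution.
b = rng.standard_normal(6)
y = np.linalg.solve(K_II, b[:nI])
x_G = np.linalg.solve(S, b[nI:] - K_GI @ y)
x_I = y - np.linalg.solve(K_II, K_IG @ x_G)
print(np.allclose(K @ np.concatenate([x_I, x_G]), b))   # True
```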
-
TR2005-871
2005
Title: On the Use of Inexact Subdomain Solvers for BDDC Algorithms
Author(s): Li, Jing; Widlund, Olof B.
Abstract:
The standard BDDC (balancing domain decomposition by constraints) preconditioner is shown to be equivalent to a preconditioner built from a partially subassembled finite element model. This results in a system of linear algebraic equations which is much easier to solve in parallel than the fully assembled model; the cost is then often dominated by that of the problems on the subdomains. An important role is also played, both in theory and practice, by an average operator; in addition, exact Dirichlet solvers are used on the subdomains in order to eliminate the residual in the interior of the subdomains. The use of inexact solvers for these problems, and even the replacement of the Dirichlet solvers by a trivial extension, is considered. It is established that one of the resulting algorithms has the same eigenvalues as the standard BDDC algorithm, and the connection of another with the FETI-DP algorithm with a lumped preconditioner is also considered. Multigrid methods are used in the experimental work, and under certain assumptions it can be established that the iteration count remains essentially the same as when exact solvers are used. Considerable gains in speed can nevertheless be realized, since the cost of the exact solvers grows superlinearly with the size of the subdomain problems while the cost of the multigrid methods is linear.
-
TR2005-872
2005
Title: Real-time rendering of normal maps with discontinuities
Author(s): Parilov, Evgueni; Rosenberg, Ilya; Zorin, Denis
Abstract:
Normal mapping uses normal perturbations stored in a texture to give objects a more geometrically complex appearance without increasing the number of geometric primitives. Standard bi- and trilinear interpolation of normal maps works well if the normal field is continuous, but may result in visible artifacts in the areas where the field is discontinuous, which is common for surfaces with creases and dents.
In this paper we describe a real-time rendering technique which preserves the discontinuity curves of the normal field at sub-pixel level, together with its GPU implementation. Our representation of the piecewise-continuous normal field is based on approximations of the distance function to the discontinuity set and its gradient. Using these approximations, we can efficiently reconstruct discontinuities at arbitrary resolution and ensure that no normals are interpolated across a discontinuity. We also describe a method for updating the normal field along the discontinuities in real time, based on blending the original field with one calculated from a user-defined surface profile.
-
TR2005-859
2005
Title: Algorithmic Algebraic Model Checking I: The Case of Biochemical Systems and their Reachability Analysis
Author(s): Piazza, C.; Antoniotti, M.; Mysore, V.; Policriti, A.; Winkler, F.; Mishra, B.
Abstract:
Presently, there is no clear way to determine whether the current body of biological facts is sufficient to explain phenomenology. Rigorous mathematical models with automated tools for reasoning, simulation, and computation can be of enormous help in uncovering cognitive flaws, qualitative simplifications, or overly generalized assumptions. The approaches developed by control theorists analyzing the stability of systems with feedback, physicists studying asymptotic properties of dynamical systems, and computer scientists reasoning about discrete or hybrid (combining discrete events with continuous events) reactive systems have all tried to address aspects of the same problem in a very concrete manner. We explore here how biological processes could be studied in a similar manner, and how the appropriate tools for this purpose can be created.
In this paper, we suggest a possible confluence of the theory of hybrid automata and the techniques of algorithmic algebra to create a computational basis for systems biology. We start by discussing the basis for this choice -- semi-algebraic hybrid systems -- while also recognizing its power and limitations. We explore solutions to the bounded-reachability problem through symbolic computation methods applied to descriptions of the traces of the hybrid automaton. Because the description of the automaton is through semi-algebraic sets, the evolution of the automaton can be described even in cases where system parameters and initial conditions are unspecified. Nonetheless, semi-algebraic decision procedures provide a succinct description of the algebraic constraints over the initial values and parameters for which proper behavior of the system can be expected. In addition, by keeping track of conservation principles in terms of constraint or invariant manifolds on which the system must evolve, we avoid many of the obvious pitfalls of numerical approaches.
-
Ph.D. Thesis
2005
Title: Extensible MultiModal Environment Toolkit (EMMET): A Toolkit for Prototyping and Remotely Testing Speech and Gesture Based Multimodal Interfaces
Candidate: Robbins, Christopher A.
Advisor(s): Perlin, Ken
Abstract:
Ongoing improvements to the performance and accessibility of less conventional input modalities such as speech and gesture recognition now provide new dimensions for interface designers to explore. Yet there is a scarcity of commercial applications which utilize these modalities either independently or multimodally. This scarcity partially results from a lack of development tools and design guidelines to facilitate the use of speech and gesture.
An integral aspect of the user interface design process is the ability to easily evaluate various design solutions through an iterative process of prototyping and testing. Through this process, guidelines emerge that aid in the design of future interfaces. Today there is no shortage of tools supporting the development of conventional interfaces. However, no comparable resources allow interface designers to easily prototype and quickly test, via remote distribution, interface designs utilizing speech and gesture.
The thesis work for this dissertation explores the development of an Extensible MultiModal Environment Toolkit (EMMET) for prototyping and remotely testing speech and gesture based multimodal interfaces to three-dimensional environments. The overarching goals for this toolkit are to allow its users to: explore speech and gesture based interface design without requiring an understanding of the details involved in the low-level implementation of speech or gesture recognition, quickly distribute their multimodal interface prototypes via the Web, and receive multimodal usage statistics collected remotely after each use of their application.
EMMET ultimately contributes to the field of multimodal user interface design by providing an environment to existing user interface developers in which speech and gesture recognition have been seamlessly integrated into their palette of user input options. Such seamless integration serves to increase the utilization within applications of speech and gesture modalities by removing any actual or perceived deterrents to the use of these modalities versus the use of conventional modalities. EMMET additionally strives to improve the quality of speech and gesture based interfaces by supporting the prototype-and-test development cycle through its Web distribution and usage statistics collection capabilities. These capabilities also allow developers to realize new design guidelines specific to the use of speech and gesture.
-
TR2005-874
2005
Title: Ranking with a P-norm Push
Author(s): Rudin, Cynthia
Abstract:
We are interested in supervised ranking with the following twist: our goal is to design algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. Towards this goal, we provide a general form of convex objective that gives high-scoring examples more importance. This ``push'' near the top of the list can be chosen arbitrarily large or small. We choose $\ell_p$-norms to provide a specific type of push; as $p$ becomes large, the algorithm concentrates harder near the top of the list.
We derive a generalization bound based on the $p$-norm objective. We then derive a corresponding boosting-style algorithm, and illustrate the usefulness of the algorithm through experiments on UCI data.
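A plausible numpy rendering of such a push objective with an exponential loss (hedged: the paper's exact form may differ) is given below; note how increasing $p$ concentrates the penalty on the highest-scoring negatives:

```python
import numpy as np

# Each negative example k pays sum_i exp(-(s_pos_i - s_neg_k)); we report
# the l_p norm of the per-negative losses, so large p emphasizes the worst
# (highest-scoring) negatives, pushing them down from the top of the list.
def pnorm_push(pos_scores, neg_scores, p):
    margins = pos_scores[:, None] - neg_scores[None, :]   # pairwise margins
    per_negative = np.exp(-margins).sum(axis=0)           # loss per negative
    return (per_negative ** p).sum() ** (1.0 / p)

pos = np.array([2.0, 1.5, 1.0])
neg = np.array([0.0, 0.2, 1.8])      # the 1.8 negative sits near the top
for p in (1, 4, 16):
    print(p, round(pnorm_push(pos, neg, p), 2))
```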
-
TR2005-876
2005
Title: Better Burst Detection
Author(s): Shasha, Dennis; Zhang, Xin
Abstract:
A burst is a large number of events occurring within a certain time window. As an unusual level of activity, it is a noteworthy phenomenon in many natural and social processes. Many data stream applications require the detection of bursts across a variety of window sizes. For example, stock traders may be interested in bursts having to do with institutional purchases or sales that are spread out over minutes or hours. Detecting a burst over any of $k$ window sizes, a problem we call {\em elastic burst detection}, in a stream of length $N$ naively requires $O(kN)$ time. Previous work \cite{DiscoveryBook03} showed that a simple Shifted Binary Tree structure can reduce this time substantially (in very favorable cases, to near $O(N)$) by filtering away obvious non-bursts. Unfortunately, for certain data distributions, the filter marks many windows of events as possible bursts even though a detailed check shows them to be non-bursts.
In this paper, we present a new algorithmic framework for elastic burst detection: a family of data structures that generalizes the Shifted Binary Tree. We then present a heuristic search algorithm to find an efficient structure among the many offered by the framework, given the input. We study how different inputs affect the desired structures. Experiments on both synthetic and real world data show a factor of up to 35 times improvement compared with the Shifted Binary Tree over a wide variety of inputs, depending on the data distribution. We show an example application that identifies interesting correlations between bursts of activity in different stocks.
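The sketch below implements a one-level simplification of the filtering idea (the paper's framework generalizes the Shifted Binary Tree much further): any window of length at most B lies inside two consecutive size-B buckets, so the sum of each adjacent bucket pair is a cheap upper bound that rules out most windows before a detailed check. All parameters and data are invented:

```python
import numpy as np

def find_bursts(events, w, threshold, B=64):
    assert w <= B
    n = len(events) // B * B
    buckets = events[:n].reshape(-1, B).sum(axis=1)
    prefix = np.concatenate([[0], np.cumsum(events)])
    bursts = []
    for i in range(len(buckets) - 1):       # final bucket omitted for brevity
        if buckets[i] + buckets[i + 1] < threshold:
            continue                         # cheap filter: no burst possible
        for s in range(i * B, (i + 1) * B):  # detailed check via prefix sums
            if prefix[s + w] - prefix[s] >= threshold:
                bursts.append(s)
    return bursts

rng = np.random.default_rng(1)
data = rng.poisson(0.1, 10_000).astype(float)
data[5_000:5_020] += 3                       # inject a burst
print(find_bursts(data, w=20, threshold=30)[:3])   # starts inside the burst
```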
-
TR2005-868
2005
Title: Modeling Of Concurrent Web Sessions With Bounded Inconsistency In Shared Data
Author(s): Totok, Alexander; Karamcheti, Vijay
Abstract:
Client interactions with modern web-accessible network services are typically organized into sessions involving multiple requests that read and write shared application data. Therefore when executed concurrently, web sessions may invalidate each other's data. Depending on the nature of the business represented by the service, allowing the session with invalid data to progress might lead to financial penalties for the service provider, while blocking the session's progress and deferring its execution (e.g., by relaying its handling to the customer service) will most probably result in user dissatisfaction. A compromise would be to tolerate some bounded data inconsistency, which would allow most of the sessions to progress, while limiting the potential financial loss incurred by the service. In order to quantitatively reason about these tradeoffs, the service provider can benefit from models that predict metrics, such as the percentage of successfully completed sessions, for a certain degree of tolerable data inconsistency.
This paper develops such analytical models of concurrent web sessions with bounded inconsistency in shared data for three popular concurrency control algorithms. We illustrate our models using the sample buyer scenario from the TPC-W e-Commerce benchmark, and validate them by showing their close correspondence to measured results of concurrent session execution in both a simulated and a real web server environment. Our models take as input parameters of service usage, which can be obtained through profiling of incoming client requests. We augment our web application server environment with a profiling and automated decision making infrastructure which is shown to successfully choose, based on the specified performance metric, the best concurrency control algorithm in real time in response to changing service usage patterns.
-
Ph.D. Thesis
2005
Title: Pattern Discovery for Hypotheses Generation in Biology
Candidate: Tsirigos, Aristotelis
Advisor(s): Shasha, Dennis
Abstract:
In recent years, the increase in the amounts of available genomic as well as gene expression data has provided researchers with the necessary information to train and test various models of gene origin, evolution, function, and regulation. In this thesis, we present novel solutions to key problems in computational biology that deal with nucleotide sequences (horizontal gene transfer detection), amino-acid sequences (protein sub-cellular localization prediction), and gene expression data (discovery of transcription factor/binding site pairs). Different pattern discovery techniques, such as maximal sequence motif discovery and maximal itemset discovery, are utilized and combined with support vector machines to achieve significant improvements over previously proposed methods.
-
TR2005-865
2005
Title: A BDDC algorithm for flow in porous media with a hybrid finite element discretization
Author(s): Tu, Xuemin
Abstract:
The BDDC (balancing domain decomposition by constraints) methods have been applied successfully to solve the large sparse linear algebraic systems arising from conforming finite element discretizations of elliptic boundary value problems. In this paper, the scalar elliptic problems for flow in porous media are discretized by a hybrid finite element method which is equivalent to a nonconforming finite element method. The BDDC algorithm is extended to these problems which originate as saddle point problems.
Edge/face average constraints are enforced across the interface and the same rate of convergence is obtained as in conforming cases. The condition number of the preconditioned system is estimated and numerical experiments are discussed.
-
TR2005-864
2005
Title: A BDDC Algorithm for Mixed Formulation of Flow in Porous Media
Author(s): Tu, Xuemin
Abstract:
The BDDC (balancing domain decomposition by constraints) algorithms are similar to the balancing Neumann-Neumann methods, with a small number of continuity constraints enforced across the interface throughout the iterations. These constraints form a coarse, global component of the preconditioner. The BDDC methods are powerful for solving large sparse linear algebraic systems arising from discretizations of elliptic boundary value problems. In this paper, the BDDC algorithm is extended to saddle point problems generated from the mixed finite element methods used to approximate the scalar elliptic problems for flow in porous media.
Edge/face average constraints are enforced and the same rate of convergence is obtained as for simple elliptic cases. The condition number bound is estimated and numerical experiments are discussed. In addition, a comparison of the BDDC method with an edge/face-based iterative substructuring method is provided.
-
TR2005-879
2005
Title: BDDC Domain Decomposition Algorithms: Methods with Three Levels and for Flow in Porous Media
Author(s): Tu, Xuemin
Abstract:
Two inexact coarse solvers for Balancing Domain Decomposition by Constraints (BDDC) algorithms are introduced and analyzed. These solvers help remove a bottleneck for the two-level BDDC algorithms related to the cost of the coarse problem when the number of subdomains is large. At the same time, a good convergence rate is maintained.
BDDC algorithms are also developed for the linear systems arising from flow in porous media discretized with mixed and hybrid finite elements. Our methods are proven to be scalable and the condition numbers of the operators with our BDDC preconditioners grow only polylogarithmically with the size of the subdomain problems.
-
TR2005-862
2005
Title: Three-Level BDDC in Three Dimensions
Author(s): Tu, Xuemin
Abstract:
BDDC methods are nonoverlapping iterative substructuring domain decomposition methods for the solution of large sparse linear algebraic systems arising from the discretization of elliptic boundary value problems. Their coarse problem is given by a small number of continuity constraints which are enforced across the interface. The coarse problem matrix is generated and factored by direct solvers at the beginning of the computation, and it can ultimately become a bottleneck if the number of subdomains is very large.
In this paper, two three-level BDDC methods are introduced for solving the coarse problem approximately in three dimensions. This is an extension of previous work on the two-dimensional case. Since vertex constraints alone do not suffice to obtain a polylogarithmic condition number bound, edge constraints are considered in this paper. Some new technical tools are then needed in the analysis, which makes the three-dimensional case more complicated than the two-dimensional one.
Estimates of the condition numbers are provided for two three-level BDDC methods and numerical experiments are also discussed.
-
Ph.D. Thesis
2005
Title: Automatic Verification of Parameterized Systems
Candidate: Xu, Jiazhao
Advisor(s): Pnueli, Amir
Abstract:
Verification plays an indispensable role in designing reliable computer hardware and software systems. With the fast growth in design complexity and the quick turnaround in design time, formal verification has become an increasingly important technology for establishing correctness as well as for finding difficult bugs. Since there is no ``silver bullet'' that solves all verification problems, a spectrum of powerful techniques in formal verification has been developed to tackle different verification problems and complexity issues. Depending on the nature of the problem, whose most salient components are the system implementation and the property specification, a proper methodology or a combination of different techniques is applied to solve the problem.
In this thesis, we focus on the research and development of formal methods to uniformly verify parameterized systems. A parameterized system is a class of systems obtained by instantiating the system parameters. Parameterized verification seeks a single correctness proof of a property for the entire class. Although the general parameterized verification problem is undecidable [AK86], it is possible to solve special classes by applying a repertoire of techniques and heuristics. Many methods in parameterized verification require a great deal of human interaction. This makes the application of these methods to real world problems infeasible. Thus, the main focus of this research is to develop techniques that can be automated to deliver proofs of safety and liveness properties.
Our research combines various formal techniques such as deductive methods, abstraction, and model checking. One main result in this thesis is an automatic deductive method for parameterized verification. We apply small model properties of Bounded Data Systems (a special type of parameterized system) to help prove deductive inference rules for the safety properties of BDS systems. Another methodology we developed enables us to prove liveness properties of parameterized systems via an automatic abstraction method called counter abstraction. There are several useful by-products of our research: a set of heuristics for the automatic generation of program invariants, which can benefit deductive verification in general, and methodologies for the automatic abstraction of the fairness conditions that are crucial for proving liveness properties.
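As a toy illustration of counter abstraction (not the thesis machinery), the sketch below abstracts N identical processes by counters capped at "many" and model checks mutual exclusion on the resulting finite abstract state space; the protocol and cap are invented:

```python
# Processes with local states {idle, try, crit} are abstracted by a counter
# per state, capped at 2 ("many"); the abstract space is finite and
# independent of N, so safety can be checked by exhaustive search.
CAP = 2

def dec(c):
    # Soundness: decrementing "many" may leave "many" or drop to CAP-1.
    return {CAP, CAP - 1} if c == CAP else {c - 1}

def step(state):
    idle, try_, crit = state
    if idle:                              # idle -> try
        for i in dec(idle):
            yield (i, min(CAP, try_ + 1), crit)
    if try_ and crit == 0:                # try -> crit, guarded by crit == 0
        for t in dec(try_):
            yield (idle, t, 1)
    if crit:                              # crit -> idle
        for c in dec(crit):
            yield (min(CAP, idle + 1), try_, c)

seen, frontier = set(), {(CAP, 0, 0)}     # initially: "many" idle processes
while frontier:
    s = frontier.pop()
    if s not in seen:
        seen.add(s)
        frontier.update(step(s))

# Mutual exclusion holds iff no reachable abstract state has crit >= 2.
print(all(s[2] < 2 for s in seen))        # expect: True
```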
-
Ph.D. Thesis
2005
Title: Mobility, Route Caching, and TCP Performance in Mobile Ad Hoc Networks
Candidate: Yu, Xin
Advisor(s): Johnson, David B.
Abstract:
In a mobile ad hoc network, mobile nodes communicate with each other through wireless links. Mobility causes frequent topology changes. This thesis addresses the fundamental challenges mobility presents to on-demand routing protocols and to TCP.
On-demand routing protocols use route caches to make routing decisions. Due to mobility, cached routes easily become stale. To address the cache staleness issue, prior work used adaptive timeout mechanisms. However, heuristics cannot accurately estimate timeouts because topology changes are unpredictable. I propose to proactively disseminate broken link information to the nodes that have cached the link. I define a new cache structure called a cache table to maintain the information necessary for cache updates, and design a distributed cache update algorithm. This algorithm is the first to proactively update route caches in an adaptive manner. Simulation results show that proactive cache updating is more efficient than adaptive timeout mechanisms. I conclude that proactive cache updating is key to the adaptation of on-demand routing protocols to mobility.
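A dict-based sketch of the cache-table idea (illustrative only; the thesis defines the actual structure and the distributed update algorithm): indexing cached routes by the links they traverse lets a broken-link notification evict exactly the stale routes rather than waiting for a timeout:

```python
from collections import defaultdict

class CacheTable:
    def __init__(self):
        self.routes = set()                  # cached routes (tuples of nodes)
        self.by_link = defaultdict(set)      # link -> routes that use it

    def add_route(self, route):
        self.routes.add(route)
        for link in zip(route, route[1:]):
            self.by_link[link].add(route)

    def link_broken(self, u, v):
        # Proactive update: remove every cached route crossing link (u, v).
        for route in self.by_link.pop((u, v), set()):
            self.routes.discard(route)

table = CacheTable()
table.add_route(("A", "B", "C", "D"))
table.add_route(("A", "B", "E"))
table.link_broken("B", "C")
print(table.routes)   # only ('A', 'B', 'E') survives
```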
TCP does not perform well in mobile ad hoc networks. Prior work provided link failure feedback to TCP so that it can avoid invoking congestion control mechanisms for packet losses caused by route failures. Simulation results show that my cache update algorithm significantly improves TCP throughput since it reduces the effect of mobility on TCP. TCP still suffers from frequent data and ACK losses. I propose to make routing protocols aware of lost TCP packets and help reduce TCP timeouts. I design two mechanisms that exploit cross-layer information awareness: early packet loss notification (EPLN) and best-effort ACK delivery (BEAD). EPLN notifies TCP senders about lost data. BEAD retransmits ACKs at intermediate nodes or at TCP receivers. Simulation results show that the two mechanisms significantly improve TCP throughput. I conclude that cross-layer information awareness is key to making TCP efficient in the presence of mobility.
I also study the impact of route caching strategies on the scalability of on-demand routing protocols with mobility. I show that making route caches adapt quickly and efficiently to topology changes is key to the scalability of on-demand routing protocols with mobility.
-
Ph.D. Thesis
2005
Title: Information Extraction from Multiple Syntactic Sources
Candidate: Zhao, Shubin
Advisor(s): Grishman, Ralph
Abstract:
Information Extraction is the automatic extraction of facts from text, including the detection of named entities, entity relations, and events. Conventional approaches to Information Extraction try to find syntactic patterns based on deep processing of text, such as partial or full parsing. The problem these solutions face is that as deeper analysis is used, the accuracy of the result decreases, and one cannot recover from the induced errors. Lower-level processing, on the other hand, is more accurate and can also provide useful information; however, within the framework of conventional approaches, this kind of information cannot be efficiently incorporated.
This thesis describes a novel supervised approach based on kernel methods to address these issues. In this approach, customized kernels are used to match syntactic structures produced by different preprocessing phases. Using closure properties of kernels, individual kernels are combined into composite kernels that integrate and extend all the information. The composite kernels can be used with various classifiers, such as Nearest Neighbor or Support Vector Machines (SVM). The main classifier we propose to use is SVM, due to its ability to generalize in high-dimensional feature spaces. We will show that each level of syntactic information can contribute to IE tasks, and that low-level information can help to recover from errors in deep processing.
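A small Python sketch of kernel composition (illustrative; the thesis designs custom kernels over n-grams and dependency trees, which differ from the toy kernels below): since sums of valid kernels with positive weights are valid kernels, evidence from different syntactic sources can be merged into one composite kernel.

```python
import numpy as np

def ngram_kernel(s, t, n=2):
    # Word n-gram overlap: an inner product of indicator vectors, hence PSD.
    grams = lambda x: {tuple(x.split()[i:i + n])
                       for i in range(len(x.split()) - n + 1)}
    return len(grams(s) & grams(t))

def length_kernel(s, t):
    return min(len(s), len(t))       # a second, cruder source (also PSD)

def composite(s, t, a=1.0, b=0.5):
    # Positive combination of kernels is again a kernel.
    return a * ngram_kernel(s, t) + b * length_kernel(s, t)

docs = ["george was named chairman", "smith was named ceo", "stocks fell"]
K = np.array([[composite(s, t) for t in docs] for s in docs])
print(K)
print(np.all(np.linalg.eigvalsh(K) > -1e-9))    # PSD check: expect True
```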
The new approach has demonstrated state-of-the-art performance on two benchmark tasks. The first task is detecting slot fillers for management succession events (MUC-6). For this task two types of kernels were designed, a surface kernel based on word n-grams and a kernel built on sentence dependency trees; the second task is the ACE RDR evaluation, which is to recognize relations between entities in text from newswire and broadcast news transcript. For this task, five kernels were built to represent information from sentence tokenization, syntactic parsing and dependency parsing. Experimental results for the two tasks will be shown and discussed.