Theses & Reports

Instructions for submitting a technical report or thesis.

You can find technical reports published prior to 1990 archived here.

Title

Authors

Year

Ph.D. Thesis 2026 Surface Parametrization and Remeshing with Guarantees Zhu, Leyi Abstract | PDF
Ph.D. Thesis 2025 Designing Efficient and Equitable Networked Systems for Mobile Users in Emerging Regions Asim, Rohail Abstract | PDF

Title: Designing Efficient and Equitable Networked Systems for Mobile Users in Emerging Regions

Candidate: Asim, Rohail

Advisor(s): Yasir Zaki

Abstract:

Global improvements in network infrastructure have enabled the development of exciting applications across a broad spectrum. These developments range from research powering lightweight educational, informative, and community-building services in rural and developing regions with poor internet accessibility and hardware to Collaborative Extended Reality applications that push the limits of the state-of-the-art network infrastructure available today with the aim of realizing ideas that, prior to recent advancements, were only available in fiction. Across the spectrum, significant challenges restrict the development and deployment of new applications. In emerging regions, these challenges stem from poor connectivity, limited access to high-performance devices, and unaffordable service costs; in regions with state-of-the-art network infrastructure, next-generation networked applications such as immersive reality and large-scale AI systems introduce unprecedented demands on bandwidth, latency, and sustainability. This thesis addresses these twin challenges of digital inequality and network inefficiency by developing new systems and methodologies that operate across both the application and transport layers of the Internet stack.

In the first part of this dissertation, we present a series of lightweight web access systems designed for low-end phones, offline environments, and bandwidth-constrained regions. Through a global measurement study of 56 cities, we quantify disparities in page load times, web complexity, and mobile affordability. We then introduce Lite-Web, a browser-level rewriting system that accelerates existing websites on low-end devices, and MAML, a markup abstraction for building simplified and visually consistent pages. These systems are deployed in GAIUS, a hyperlocal, offline-first web ecosystem adopted in communities across Kenya, Bangladesh, and India. These web simplification efforts enabled internet access in regions with poor connectivity and hardware constraints. However, many emerging regions lack the network infrastructure needed to deliver even lightweight, simplified webpages to the people living there. To address this, we also design Sonic, a novel hybrid system that leverages radio infrastructure to broadcast pre-rendered web content over FM radio and supports interaction through SMS, enabling access in disconnected regions such as rural Cameroon.

In the second part of the dissertation, we turn our focus to the transport layer, where emerging applications face severe limitations from current congestion control protocols. Using a new benchmarking framework, we evaluate the performance of state-of-the-art CCAs across synthetic and real 5G networks. Our analysis reveals significant mismatches between protocol behavior and the requirements of next-generation collaborative and immersive applications. To address this, we design Hera, a QoE-aware modular framework for next-generation immersive applications. By bridging the gap between application-level responsiveness and network-level adaptability, Hera lays the foundation for more scalable, robust, and high-fidelity multi-user immersive experiences.

Together, these contributions demonstrate how cross-layer design, from simplified content to smarter transport, can dramatically improve web accessibility, application quality of experience (QoE), and sustainability in both high-demand and underserved settings. The work advances a broader vision for an inclusive and efficient Internet: one that adapts to user constraints, application demands, and the infrastructural realities of the global majority.
Ph.D. Thesis 2025 On The Applications of Coarse Network Geometry to Personalized Immuno-Oncology Bannon, James Abstract | PDF

Title: On The Applications of Coarse Network Geometry to Personalized Immuno-Oncology

Candidate: Bannon, James

Advisor(s): Bud Mishra

Abstract:

Immune checkpoint inhibitors (ICIs), also called immune checkpoint blockers, are a promising category of targeted therapy for solid tumors. Predicting which patients will respond to ICI therapy remains an open problem under active investigation. This thesis aims to improve the precision with which immune checkpoint inhibitors are prescribed. By focusing on one type of biological measurement (whole-tumor shotgun RNA sequencing data, which we call bulk RNA-seq), we are able to deeply explore the potential and limits of predictors built from this kind of measurement. Two of the algorithms presented here are based on a notion of graph curvature which we believe has extensive promise in bioinformatic inquiry.

The first part of this thesis performs a rigorous permutation testing evaluation of machine learning models for the task of predicting therapy response, which we cast as a binary classification problem. We show that bulk RNA-seq data contains predictive signal but that there is an upper limit to ML model efficacy that can potentially be remedied by curating larger data sets or augmenting RNA-seq data with other biological measurements.
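As a rough illustration of the permutation-testing idea in this paragraph, the sketch below estimates how often shuffled labels would score at least as well as the real ones for a fixed set of predictions. The toy labels, the predictions, and the function names are assumptions made for illustration only; the models and evaluation protocol actually used in the thesis are not described in this abstract.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_p_value(y_true, y_pred, n_permutations=2000, seed=0):
    """Estimate how often shuffled labels score at least as well as the real ones."""
    rng = random.Random(seed)
    observed = accuracy(y_true, y_pred)
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between labels and predictions
        if accuracy(shuffled, y_pred) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)

# Hypothetical responder (1) / non-responder (0) labels and model predictions.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
observed, p_value = permutation_p_value(y_true, y_pred)
```

A small p-value suggests the predictions carry signal beyond chance. A fuller evaluation would typically retrain the model on each permuted label set, which this sketch omits for brevity.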

The next part presents a modular pipeline for the discovery of biomarkers from bulk RNA-seq data. We contextualize gene expression measurements using a protein-protein interaction (PPI) network and then use a notion of graph curvature to find (pairs of) genes in the PPI network that could serve as potential biomarkers. Our candidate biomarkers are evaluated using an extensive literature search and transfer learning experiments. We also provide a harmonized collection of drug-specific candidate markers found through rank aggregation that we believe merit further study.

Lastly, we cluster patients in an unsupervised manner using discrete Ollivier-Ricci Flow (ORF). Our method surfaces populations with distinct survival curves which in turn allows us to find many potential biomarkers, including gene expression modules. We believe the algorithm may be of independent interest for clustering other datasets in a diverse set of research areas.

As a result of the work here we have provided novel algorithmic techniques for analyzing (biological) data and advanced the state of the art in finding biomarkers for ICI therapy.
Ph.D. Thesis 2025 Modern machine learning methods for protein design Berenberg, Daniel Abstract | PDF

Title: Modern machine learning methods for protein design

Candidate: Berenberg, Daniel

Advisor(s): Richard Bonneau, Kyunghyun Cho

Abstract:

Designing biosynthetic molecules such as proteins is critical for applications in therapeutics and agriculture, yet the vast sequence space and complex functional landscape pose significant challenges.

Previous design workflows rely on clustering, mechanistic modeling, or directed evolution and are often constrained by hand-crafted heuristics and domain-specific biases. Advances in deep generative modeling and protein databases of unprecedented size present an opportunity to apply modern machine learning techniques.
In this work, we develop methods to generate and score protein sequences. We propose several steering and guidance techniques that balance data-driven exploration with expert-guided refinement.

Leveraging established classifications of antibodies, we enable targeted redesign of designated regions for applications such as affinity maturation and framework optimization. Expanding the scope to general sequence design, we show effective classifier-guided generation of protein sequences using a novel sequence denoising autoencoder. Finally, we investigate the utility of natural language text embeddings in classifier-free generation and show the capabilities of text conditioned models on downstream generative modeling tasks.

Our work provides a spectrum of methods that transition from bespoke, domain-specific approaches toward a generalized, human-centric framework for modern protein engineering and molecular programming.
Ph.D. Thesis 2025 Fair and Explainable Machine Learning: Estimating Bias, Detecting Disparities, and Designing for Algorithmic Recourse Boxer, Kate Abstract | PDF

Title: Fair and Explainable Machine Learning: Estimating Bias, Detecting Disparities, and Designing for Algorithmic Recourse

Candidate: Boxer, Kate

Advisor(s): Daniel Neill

Abstract:

This dissertation investigates algorithmic bias and explainability from the perspective of an individual's interactions with computational models that have an impact on their circumstances, including those influencing their environmental conditions and those used during institutional decision-making. Accordingly, this dissertation focuses on three subtopics within this broad field: estimating data bias in datasets that inform policy decisions, auditing for predictive bias, and multi-objective formulations for systems that provide algorithmic recourse.

In relation to estimating data bias in datasets utilized to inform governmental resource allocation, we introduce two methods—a novel grouping algorithm for statistical significance testing and a custom latent variable model—to detect under-reporting in citizen-generated data. This introduces a domain-specific framework that is instrumental for practitioners interested in making data-informed policy decisions using self-reported data collected from populations located in urban settings. To audit for predictive bias, we introduce a domain- and model-agnostic framework for detecting statistically significant predictive biases in model outputs affecting both marginal and intersectional subpopulations of a target population through novel pattern detection methods for subgroup scanning, where predictive biases take the form of group-fairness violations.

Lastly, we propose a set of principles aimed at ensuring that systems that provide algorithmic recourse materially increase individual agency. Based on these principles, we endorse specific design choices to ensure the reliability of recommendations, develop burden-based measurements to assess the accessibility and fairness of these systems, and train algorithmic decision-makers that uphold these principles when used in systems that provide algorithmic recourse.

Collectively, these works represent key methodologies to detect data bias and predictive bias, spanning both context-specific and domain-agnostic settings, and also contribute to an effort to fundamentally shift institutional decision-making to ensure that algorithmic decision-makers are designed in such a way that individuals have means to achieve favorable outcomes.
Ph.D. Thesis 2025 Simple Structures in Neural Networks: On Expressiveness, Optimization and Data Distribution Chen, Lei Abstract | PDF

Title: Simple Structures in Neural Networks: On Expressiveness, Optimization and Data Distribution

Candidate: Chen, Lei

Advisor(s): Prof. Joan Bruna

Abstract:

In this era of Large Language Models (LLMs) and other giant neural networks, we aim to analyze simplified settings from scratch, as foundational steps towards understanding the functionality of the giant models. We present our understanding from three aspects. On expressive power, we investigate the function class of simplified graph networks, i.e., Graph-Augmented Multi-layer Perceptrons (GA-MLPs), against the classic Graph Neural Networks (GNNs) using measurements of graph isomorphism testing and counting attributed walks. On optimization, we theoretically study instabilities from large learning rates in training neural networks, i.e., Edge of Stability. We investigate the conditions of how the loss landscape contains such unstable training trajectories, especially oscillating in a low-dimensional subspace. Then we leverage such property in simple, yet representative, learning problems in a teacher-student style. On data distribution of reasoning tasks, we propose a decomposition of next-token prediction into two parts: in-context reasoning and distributional association. We study this decomposition empirically and theoretically in a controlled synthetic setting, and find that feed-forward layers tend to learn simple distributional associations such as bigrams, while attention layers focus on in-context reasoning. Finally, we discuss how such an understanding of next-token predictions and feed-forward layers could be applied to some recent developments of LLMs.
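The large-learning-rate instability mentioned above can be seen in the simplest possible setting. The sketch below runs gradient descent on a one-dimensional quadratic, where the iterates shrink only when the learning rate is below 2 divided by the curvature. The function name, the curvature value, and the learning rates are illustrative assumptions, and this toy shows only the classical stability threshold, not the richer settings analyzed in the thesis.

```python
def gradient_descent_quadratic(curvature, learning_rate, x0=1.0, steps=50):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2 and return all iterates."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        grad = curvature * x          # f'(x) = curvature * x
        x = x - learning_rate * grad  # each step multiplies x by (1 - lr * curvature)
        xs.append(x)
    return xs

curvature = 4.0  # assumed sharpness, so the stability threshold is 2 / 4 = 0.5
stable = gradient_descent_quadratic(curvature, learning_rate=0.45)
unstable = gradient_descent_quadratic(curvature, learning_rate=0.55)
```

With the learning rate just below the threshold, the iterates oscillate in sign while shrinking toward zero; just above it, the same oscillation grows without bound.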
Ph.D. Thesis 2025 Distributed Randomness in Adversarial Settings Choi, Kevin Abstract | PDF

Title: Distributed Randomness in Adversarial Settings

Candidate: Choi, Kevin

Advisor(s): Joseph Bonneau

Abstract:

Distributed randomness in adversarial settings concerns the problem of jointly computing a random output in a network of mutually untrusting participants such that the output is not predictable or biasable by any participant or any coalition of participants. A distributed randomness beacon (DRB) is a service that periodically emits random outputs through such distributed randomness protocols and has found applications in cryptographically verifiable lotteries and gaming as well as leader election in distributed systems and consensus algorithms. In the past decade, the landscape of DRBs has evolved, with many DRB protocols relying on ad hoc heuristics rather than structured design principles. While this bottom-up approach has led to interesting integrations of cryptographic techniques, establishing a unifying framework of DRBs has remained open prior to this work. Similarly, the consideration of security properties of DRBs, such as unbiasability and unpredictability, has typically been restricted to specific settings.

This dissertation seeks to address these gaps by adopting a top-down approach to realizing a distributed randomness beacon. We conceptualize the broader design space of DRBs, introduce comprehensive security definitions applicable to all DRBs, and consider a variety of practical deployment scenarios. Simultaneously, we compare protocols based on their communication and computational efficiency and also highlight the functionality of various cryptographic building blocks in light of DRBs rather than solely focusing on their technical details.

Furthermore, we shed light on the security gap that exists between theoretical models and real-world scenarios, where most theoretical DRBs rely on the honest majority assumption (the network assumption that more than half of the nodes are honest), which has been shown to break down in practice (e.g., the $625 million Axie Infinity Ronin hack in 2022). Recognizing this issue, we propose two new optimized DRB protocols, Bicorn and Cornucopia, that offer robustness even in the presence of a dishonest majority.
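To make the problem setting concrete, here is a minimal sketch of the classic commit-reveal approach to jointly generating randomness, using Python's standard `hashlib` and `secrets` modules. It is not Bicorn or Cornucopia, and the helper names and the three-participant setup are hypothetical. Plain commit-reveal is known to be biasable by a participant who sees the other reveals and then refuses to open their own commitment, which is one kind of weakness that more robust DRB protocols aim to remove.

```python
import hashlib
import secrets

def commit(value: bytes, nonce: bytes) -> bytes:
    """Hash commitment to `value`, hidden by a random fixed-length nonce."""
    return hashlib.sha256(nonce + value).digest()

def verify(commitment: bytes, value: bytes, nonce: bytes) -> bool:
    """Check that a revealed (value, nonce) pair opens the commitment."""
    return secrets.compare_digest(commitment, commit(value, nonce))

def combine(values) -> bytes:
    """XOR all revealed 32-byte contributions into one 32-byte output."""
    out = bytes(32)
    for v in values:
        out = bytes(a ^ b for a, b in zip(out, v))
    return out

# Phase 1: each of three hypothetical participants commits to a secret contribution.
contributions = [secrets.token_bytes(32) for _ in range(3)]
nonces = [secrets.token_bytes(32) for _ in range(3)]
commitments = [commit(v, n) for v, n in zip(contributions, nonces)]

# Phase 2: everyone reveals; a reveal that does not match its commitment is rejected.
all_valid = all(verify(c, v, n) for c, v, n in zip(commitments, contributions, nonces))
beacon_output = combine(contributions) if all_valid else None
```

The output is unpredictable as long as at least one contribution is random and every participant reveals. The sketch does nothing to handle a participant who withholds a reveal.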
Ph.D. Thesis 2025 Computational Design through Differentiable Elastodynamic Simulation, Parametrized by Geometric Techniques, for Applications in Soft Robotics Gjoka, Arvi Abstract | PDF

Title: Computational Design through Differentiable Elastodynamic Simulation, Parametrized by Geometric Techniques, for Applications in Soft Robotics

Candidate: Gjoka, Arvi

Advisor(s): Daniele Panozzo, Denis Zorin

Abstract:

Traditionally, the design of physical objects is a tedious task that involves many time-consuming cycles of design and experimentation, often done by area experts who specialize in one or the other. This is especially true of soft objects, which exhibit large displacements that are central to their form and function and whose behavior is hard to predict during the design phase. In this talk, we will explore viewing computational design through the lens of differentiable simulation. First, we will outline a framework for differentiable FEM simulation with robust contact handling, which allows us to extract gradients with respect to shape, material parameters, boundary conditions, etc. Next, we discuss how this framework allows us to explore computational design of highly deformable objects, starting with pneumatic soft robots and then looking at the modeling and design of deformable capacitive thin-film sensors that can be draped over objects (such as soft robots or human body parts). For each, we demonstrate the validity of our simulation and optimization results by validating on fabricated objects.
Ph.D. Thesis 2025 Noise and Games in Distribution Estimation: From Survival Analysis to Generative Models Goldstein, Mark Abstract | PDF

Title: Noise and Games in Distribution Estimation: From Survival Analysis to Generative Models

Candidate: Goldstein, Mark

Advisor(s): Rajesh Ranganath

Abstract:

This thesis presents new machine learning methodologies for generative modeling and survival analysis. We introduce novel approaches for specifying, training, and sampling from diffusion-based generative models, including auxiliary-variable and nonlinear noising processes, data-dependent base distributions, and hybrid strategies that combine deterministic and stochastic sampling. We validate these methods across applications such as images, videos, partial differential equations, and active matter systems, showing that simple choices at training and inference time can significantly impact both efficiency and performance.

Shifting focus to a different kind of distribution estimation problem, survival analysis (time-to-event modeling), we propose Inverse-Weighted Survival Games, an optimization framework that handles censoring (i.e., missing data) through the simultaneous estimation of failure and censoring distributions. This approach improves both discriminative performance and calibration on real-world medical datasets.

Returning to diffusion models with new optimization tools, we introduce GameFlow, a novel training method for consistency models (i.e., flow maps that directly and quickly solve the diffusion sampling process). GameFlow uses a game-like formulation to efficiently train consistency models from scratch via Jacobian-vector products, avoiding the need for adversarial objectives, model inverses, or pre-trained models.

Collectively, these works leverage latent variables, differential equations, estimating equations, Monte Carlo gradient methods, and stop-gradient games—highlighting how a shared set of computational tools can be broadly useful for tackling distribution estimation across diverse domains.
Ph.D. Thesis 2025 Understanding Inductive Bias in the Era of Large-Scale Pretraining with Scientific Data Gruver, Nathaniel Abstract | PDF

Title: Understanding Inductive Bias in the Era of Large-Scale Pretraining with Scientific Data

Candidate: Gruver, Nathaniel

Advisor(s): Andrew Wilson

Abstract:

Inductive biases are crucial for machine learning in data-scarce settings, but their optimal role in data-rich regimes remains poorly understood. This thesis challenges the conventional wisdom that strict architectural constraints are necessary for modeling numerical data, particularly in physics and chemistry. Through systematic empirical studies, I demonstrate that data-driven approaches can effectively learn both physical symmetries and broader numerical patterns without explicit architectural constraints. First, I show that transformer models trained with data augmentation can acquire stronger equivariance properties than convolutional neural networks, despite lacking built-in symmetry constraints. Building on this insight, I investigate whether pretrained language models can learn generalizable numerical capabilities from text alone. By studying the behavior of language models in many settings, I demonstrate that text pretraining induces a preference for simple functions that serves as a powerful inductive bias across numerical domains. This emergent bias enables large language models to outperform specialized architectures on benchmark tasks in time series forecasting and 3D structure prediction, achieving state-of-the-art results with minimal task-specific adaptation. However, these benefits do not extend universally: I identify molecular property prediction as a key limitation and trace this failure to fundamental constraints in discrete token representations. This work provides a comprehensive framework for understanding when learned biases can replace architectural constraints in numerical domains, with important implications for model design in scientific machine learning.
Ph.D. Thesis 2025 Computational Shape Design through Robust Physics Simulations Huang, Zizhou Abstract | PDF

Title: Computational Shape Design through Robust Physics Simulations

Candidate: Huang, Zizhou

Advisor(s): Denis Zorin, Daniele Panozzo

Abstract:

Additive manufacturing enables the fabrication of complex geometric structures tailored to specific material properties, with diverse applications ranging from lightweight yet strong aerospace components to customized shoe soles, prosthetic devices, and flexible robotic parts. However, due to the complexity of the geometry, novel techniques for engineering analysis and optimization are needed. Our research seeks to address these problems by developing robust and accurate physics simulation methods that can enhance the design process of complex structures.

This thesis introduces a physics-based simulation method for elastodynamics, incorporating collisions and friction, that resolves the artifacts in the state-of-the-art method and provides better robustness and efficiency. Further, the simulator is extended to support differentiability with respect to input physics parameters, enabling gradient-based inverse optimization applications such as optimal shape design and material inference. Specifically, we investigate the desired force response of shock-absorbing materials and leverage our differentiable simulator for shape optimization to achieve the desired behavior. The resulting microstructures are fabricated and validated through real-world experiments, demonstrating the accuracy and practical applicability of the proposed simulation framework.
Ph.D. Thesis 2025 Understanding and Mitigating Goal Misgeneralization in Language Models Joshi, Nitish Abstract | PDF

Title: Understanding and Mitigating Goal Misgeneralization in Language Models

Candidate: Joshi, Nitish

Abstract:

As Large Language Models (LLMs) are being widely used in various applications, it is critical that they are robust and generalize well. One of the reasons why LLMs might perform poorly after deployment is due to goal misgeneralization. Goal misgeneralization refers to the issue where an LLM performs well on the training distribution (e.g., high accuracy or reward), but performs poorly on the test distribution due to misgeneralization. Specifically, misgeneralization implies that the model has a systematic failure on the test distribution due to learning unintended functions, as opposed to performing randomly or lacking capability to do well on the test distribution. This encapsulates various problems that the machine learning community has worked on, including spurious correlations, underspecification, and reward hacking.

This dissertation focuses on goal misgeneralization in language models and consists of the following components. (1) For finetuning language models, if explicit knowledge of the spurious correlation which the model relies on is available, mitigating it is not too hard. We propose a new method to mitigate spurious correlations when such knowledge is not available; our method relies on complementary knowledge based on semantic corruptions. We empirically demonstrate the effectiveness of our method outperforming standard training methods. (2) For such methods which do rely on the knowledge of semantics to mitigate spurious correlations, scalably discovering robust semantic features can be done through crowdsourcing, such as in counterfactual data augmentation. We critically analyze the discrepancy between theory and practice for this training method, where in practice it seems to give marginal to no benefits. We show that this occurs due to the difficulty in obtaining diversity in counterfactuals, and this lack of diversity could even exacerbate spurious correlations. (3) We take a step back and ask: Can we use a mitigation method for any spurious correlation encountered in language data? We argue that there are two main sources of spurious correlations in language data, and methods to mitigate and evaluate spurious correlations might not work well for both. One is when the feature is irrelevant to the label (e.g. extra spaces), and the other is when the feature's effect on the label depends on the context (e.g. negation). We formalize this distinction using causal models and demonstrate why the distinction is necessary empirically. (4) We discuss other goal misgeneralization issues beyond spurious correlations in finetuning. First, we demonstrate how goal misgeneralization could occur during pretraining. Specifically, focusing on causal reasoning we show that language models have learned unintended position bias and post hoc fallacy from the pretraining data. We also show that only scaling language models does not address this misgeneralization. Next, we show that underspecification in in-context learning is also an instance of goal misgeneralization, and understand feature preferences of language models in the setting.

Finally, we discuss future directions focusing on other goal misgeneralization issues in language models. We briefly mention goal misgeneralization in the context of safety for LLM-agents, and reward hacking during reinforcement learning in language models.
Ph.D. Thesis 2025 Decision Problems for Global Protocol Specifications Li, Elaine Abstract | PDF

Title: Decision Problems for Global Protocol Specifications

Candidate: Li, Elaine

Advisor(s): Thomas Wies

Abstract:

Concurrency is ubiquitous in modern computing, message passing is a major concurrency paradigm, and communication protocols are therefore a key target for formal verification. Writing implementations for each protocol participant individually, such that their composition is free from communication errors and deadlocks, is challenging and error-prone. In response, various verification methodologies center on the construct of a global protocol. Global protocol specifications synchronously describe the message-passing behaviors of all protocol participants from a bird’s-eye view, and thus rule out large classes of communication errors by construction. Global protocols are adopted in industry by the ITU standard and UML, and are widely studied in academia in the form of high-level message sequence charts, session types and choreographic programs. Application domains for this top-down verification methodology include cryptographic security, cyber-physical systems, and web services.

This thesis contributes decision procedures for three problems central to global protocol verification: implementability, synthesis, and subtyping. Implementability asks whether a protocol admits a distributed implementation, synthesis in turn computes one, and subtyping asks whether an admissible implementation can be substituted in whole or part to yield fewer behaviors. This thesis additionally contributes a Rocq mechanization of a precise implementability characterization for infinite-state protocols, and the SPROUT tool for automatically verifying such protocols.
Ph.D. Thesis 2025 Governing the Scientific Journals: What Big Data and Computational Modeling Tell Us about the Policies That Shape Editorial Boards Liu, Fengyuan "Michael" Abstract | PDF

Title: Governing the Scientific Journals: What Big Data and Computational Modeling Tell Us about the Policies That Shape Editorial Boards

Candidate: Liu, Fengyuan "Michael"

Advisor(s): Talal Rahwan

Abstract:

Academic journal editors are the gatekeepers of science, collectively shaping the content of scientific publications and setting standards for their fields of research. Yet, most editors take on this role as a form of community service while maintaining their primary careers as research-active scientists. This dual role raises two key questions at the heart of this thesis: (1) To what extent are editors representative of scientists at large in terms of their demographic composition? (2) How prevalent are conflicts of interest among academic editors? To address these questions, I construct two large, novel longitudinal datasets of academic editors and provide quantitative evidence on both fronts. Furthermore, these datasets enable me to evaluate the impact of policy interventions designed to (1) increase editorial board diversity and (2) mitigate conflicts of interest. By leveraging natural experiments identified in historical archives of journal policy documents, I analyze cases where such policies have been implemented and evaluate their effectiveness. Finally, I discuss the broader implications of big data and computational modeling for quantitative policy research.
Ph.D. Thesis 2025 Machine Learning for Simulations Otness, Karl Abstract | PDF

Title: Machine Learning for Simulations

Candidate: Otness, Karl

Advisor(s): Joan Bruna, Benjamin Peherstorfer

Abstract:

Computational modeling of physical systems is a core task of scientific computing. Machine learning methods can extend traditional approaches to modeling partial differential equations and hold the potential to simplify the modeling process and improve simulation accuracy and performance. In this thesis we explore the use of neural networks to learn the behavior of systems from data. We evaluate the performance-accuracy tradeoffs involved in their use as emulators, and use insights gained here to explore a specific application to learning subgrid parameterizations for climate models in particular. For this task we propose two novel techniques to improve the accuracy and stability of the learned parameterizations by tailoring architectures to incorporate favorable inductive biases, and by augmenting training data to encourage stability.
Ph.D. Thesis 2025 Language Models at the Scale of Evolution Rives, Alexander Abstract | PDF

Title: Language Models at the Scale of Evolution

Candidate: Rives, Alexander

Advisor(s): Rob Fergus, Yann LeCun

Abstract:

I will describe the development of the evolutionary scale modeling (ESM) program, which proposes to solve an inverse problem across evolution to learn the biology of proteins from their sequences at the scale of life. Beginning from the idea that the sequences of proteins contain an image of biology in their patterns, this thesis shows that language models trained on protein sequences spanning the natural diversity of the Earth, by learning to predict which amino acids evolution chooses, develop feature spaces that reflect the immense scope and complexity of protein biology containing known and unknown biology. Biological structure and function emerge in the representations of the models. This emergence is shown to occur in a direct linkage with improvements in the language modeling of sequences. The representation space has an ordered structure in which proteins are organized according to their underlying biology, and directions correspond to meaningful biological variations. Attention patterns materialize in the neural network that correspond to the folded three-dimensional structure of proteins. The probabilities assigned to amino acids within a given sequence context reflect protein function and predict the effects of mutations. The representations learned by protein language models constitute a general and transferable feature space which supports the discovery and generation of new biology. This has enabled an effort to reveal the structures of hundreds of millions of metagenomic proteins for the first time. The thesis concludes with experimental characterizations of proteins created by language models, which demonstrate that the feature space learned from natural proteins supports generating proteins beyond those in nature.
Ph.D. Thesis 2025 Static Analysis Tools For Network-Device Stacks Ruffy, Fabian Abstract | PDF

Title: Static Analysis Tools For Network-Device Stacks

Candidate: Ruffy, Fabian

Advisor(s): Anirudh Sivaraman

Abstract:

Networking devices are becoming more programmable. With this trend, network-device software---dedicated to forwarding packets and interpreting instructions from the network control plane---now covers more functionality and also increases in complexity. Faults in network-device software can have an outsized impact on a network. Hence, network operators and device manufacturers are reaching for static analysis to ensure that this code is both functionally correct and well-optimized. Network-device software is extensive and often written in general-purpose languages such as Python or C++. These languages contain loops, aliasing, or indirection, which can make developing effective static analysis techniques challenging.

In this dissertation, we explore an opportunity to build better static analysis tools for network-device software. We use P4, a domain-specific language for network programming, as our foundation. We develop an execution model for P4 which describes the behavior of a network device, and we reify this execution model using satisfiability modulo theories (SMT), expressed in quantifier-free bit vectors. We refine this execution model through three distinct projects and show its utility by adopting techniques from software engineering research that are theoretically powerful but were considered practically limited for general-purpose languages. Applying our specialized techniques, we were able to find over 50 bugs in network-device software which cause incorrect packet-processing. Furthermore, we reuse our model to optimize network programs based on their control-plane configuration, which can reduce resource usage and increase packet-processing performance.

Our SMT-based execution model for packet processing is protocol-independent, device-agnostic, and precise enough for bug-finding and program optimization. We attribute these successes to tailoring our model to a DSL specialized in packet processing while also appropriately exploiting the restrictions of this DSL.
Ph.D. Thesis 2025 Towards Generally Intelligent Robots that Simply Work Everywhere Shafiullah, Nur Muhammad "Mahi" Abstract | PDF

Title: Towards Generally Intelligent Robots that Simply Work Everywhere

Candidate: Shafiullah, Nur Muhammad "Mahi"

Advisor(s): Lerrel Pinto

Abstract:

Applications of machine learning have touched the lives of common people in innumerable novel ways. Robotics today seems poised to make such an impact, too. Yet the current state-of-the-art systems in robotics, whether a parkouring humanoid from Boston Dynamics or a T-shirt-folding robot from Google DeepMind, are specialists in their own environments, made possible either by instrumenting and extensively modeling the scene or by collecting weeks or months of data on the exact same setup.

In this thesis, we focus on building generally intelligent robots that simply work everywhere by studying the interplay of representation, data, and memory in robotics. To create robots that can address the broad and diverse challenges of operating in messy and unstructured environments everywhere, this thesis investigates three fundamental directions. We first look into algorithms that optimize the use of data in robot learning since data, as fuel, plays a critical role in creating broadly capable ML systems. We not only create efficient, self-supervised representations of the robots' perception, but also develop action representations that enable scaling to large, uncurated demonstration datasets. Then, we take a deep dive on creating systems – bridging algorithms and hardware – that can create and learn from robot data in the wild. Such systems enable few-shot and then zero-shot behavior generalization in novel homes in New York City and beyond. Finally, to enable generally intelligent robot behavior that extends over time and space, we construct neural data structures called spatio-semantic memory for robots. These memory modules enable scaling in-the-wild autonomous robot behavior from seconds to hours, and beyond.
Ph.D. Thesis 2025 Shape Design, Repair and Optimization Wang, Siqi Abstract | PDF

Title: Shape Design, Repair and Optimization

Candidate: Wang, Siqi

Advisor(s): Denis Zorin, Daniele Panozzo

Abstract:

Digital geometric models are fundamental to modern engineering, media, and manufacturing. However, models created by artists in-the-wild often contain ambiguities that preclude their use in simulation and manufacturing, while complex designs may need to be simplified for efficiency or functionally optimized to meet competing aesthetic and performance goals. This necessity for robust, useful, and high-performing geometry creates a critical need for advanced computational techniques that can automatically repair, simplify, and optimize digital shapes. Our research addresses these challenges by developing a suite of shape processing and optimization methods designed to enhance the quality and functionality of geometric models for a range of applications.

This thesis delivers solutions across three key areas. First, we present a Bézier curve simplification framework that simplifies complex vector graphics while preserving visual fidelity by defining a curve-to-curve distance metric and repeatedly conducting local segment removal operations. Second, we propose a solid or shell labeling technique for artist-created surface meshes that lack a well-defined interior, guided by a sparse set of user inputs. These labels reduce ambiguity and enable the construction of valid volumetric meshes for downstream applications. Finally, we introduce two powerful shape optimization frameworks: one that leverages neural network-based models to independently control the tactile properties and visual appearance of a texture, and another that optimizes the geometry and position of radiofrequency (RF) receive coil arrays to increase the signal-to-noise ratio (SNR) in magnetic resonance imaging (MRI).
Ph.D. Thesis 2025 Mechanisms to Advance the Adoption of Programmable High-speed Packet-Processing Pipelines Wang, Tao Abstract | PDF

Title: Mechanisms to Advance the Adoption of Programmable High-speed Packet-Processing Pipelines

Candidate: Wang, Tao

Advisor(s): Anirudh Sivaraman, Aurojit Panda

Abstract:

Today's programmable high-speed packet-processing pipelines have enabled a wide range of network offloads, such as in-network telemetry and parameter aggregation in machine learning. However, these pipelines are not yet ready to let a larger number of people and applications benefit from them.

This dissertation examines this problem from two specific aspects, multitenancy and general L7 processing, and argues that new hardware primitives together with software toolchains are necessary for high-speed packet-processing pipelines to gain wider adoption among application developers. Specifically, in this dissertation, we propose two systems: (1) Menshen designs isolation mechanisms to support multiple programs running atop a single pipeline without interfering with each other; (2) QingNiao targets L7 dispatch, a type of L7 processing that is pervasive in the networking infrastructure layer, and presents a holistic solution based on new hardware primitives and a programming model to support running such L7 processing on the programmable pipelines.
Ph.D. Thesis 2025 Enhancing Computational Music Intelligence via Concept Alignment Wang, Ziyu Abstract | PDF

Title: Enhancing Computational Music Intelligence via Concept Alignment

Candidate: Wang, Ziyu

Advisor(s): Gus Xia

Abstract:

Recent advances in generative AI have led to impressive achievements in music generation. Yet, a fundamental challenge remains: how can these black-box models move beyond imitating music data to truly understand human creative intent and collaborate meaningfully with humans? We argue that the missing piece is a deeper alignment between humans and AI. This thesis introduces concept alignment as a framework to bridge the human creative process and machine behavior through various ways of concept manipulation. I explore this through three core directions: (1) concept representation, using disentangled latent codes to control musical attributes like pitch contour and texture; (2) concept organization, designing hierarchical models that structure musical ideas and abstractions; and (3) concept emergence, guiding models to discover symbolic representations directly from raw data in an unsupervised way. These contributions demonstrate how models can learn, organize, and reveal human-like concepts, opening a path toward more interpretable, controllable, and collaborative music AI.
Ph.D. Thesis 2025 Better Incentives: Performant and Private Machine Learning Xu, Mimee Abstract | PDF

Title: Better Incentives: Performant and Private Machine Learning

Candidate: Xu, Mimee

Advisor(s): Leon Bottou

Abstract:

Machine learning algorithms benefit from large and diverse datasets. However, business needs and research workflows are potentially at odds with the ownership of private data. Without sharing private data in their respective contexts, current privacy-enhancing solutions tend to, instead, compromise on performance or privacy.

This thesis addresses gaps between machine learning and data ownership by modeling a system of three parties: model owners, data owners, and overseers. Incentive issues between the parties are addressed with secure and confidential computation, consisting of Secure Multi-Party Computation (S-MPC) and Fully Homomorphic Encryption (FHE). Though lesser-known in machine learning, these techniques can help support data rights.
- First, as data used for training tends to be owned by disparate parties, the first sub-problem pertains to whether unshared training data's utility can be evaluated without sharing it. We implemented influence-based appraisal functions that are compatible with efficient S-MPC computation, achieving 92.3% correlation with plain-text ground truth ranking for 100 datasets under induced class imbalance, and 96.0% under label-flipping, without the usability challenge of sensitive hyperparameters of training a joint model under S-MPC.
- Second, seeing the trend of deploying proprietary ML models where the input and output to those models are hidden, can the public audit privately-held data, especially in domains where encryption is the default? Using FHE for auditing triaging fairness in hospital emergency departments as an example, my prior work provided a qualitative description of the setup that can be applied to ease the tension between regulators and private data parties, without the need to decrypt private data.
- Finally, is it necessary to trade off data utility and privacy in low data domains? Our practical framework, Secure-KL (SKL), incurs no privacy leakage while enabling robust evaluation of additional data to combine with. Without making assumptions about the final downstream model, our dataset-divergence approximation, in secure computation, is consistent with plaintext divergence values by over 90%. We show it successfully identifies beneficial data partnerships for intensive care unit (ICU) mortality prediction, thereby improving downstream classifier performance for the source hospital. We also show that secure methods are more robust and reliable than alternatives of sharing a subset of data (medium leakage), using demographic information (low leakage), or selecting blind (high variance). With zero leakage, SKL allows all parties' data to remain private while entire datasets are utilized, eliminating a key roadblock towards orchestrating broader collaborations in healthcare with limited resources.
Ph.D. Thesis 2025 Diagnosing AI Misbehavior: Why Do Models Fail? Zhang, Anqi Abstract | PDF

Title: Diagnosing AI Misbehavior: Why Do Models Fail?

Candidate: Zhang, Anqi

Advisor(s): Jinyang Li, Aurojit Panda

Abstract:

As AI models become increasingly pervasive across critical domains, understanding and diagnosing their failures has become paramount for ensuring safety, reliability, and trust. This dissertation addresses the importance of diagnosing AI misbehavior across shifting deep learning paradigms — from classifiers to Graph Neural Networks (GNNs) to Large Reasoning Models (LRMs) — each exhibiting distinct failure modes that demand specialized diagnostic approaches.

In this thesis, we focus on different models to explore and diagnose model misbehaviors: (a) for classifiers, we introduce the Average Marginal Effect (AME), a scalable data attribution method that traces prediction errors back to problematic training data, achieving efficient attribution under the sparsity assumption; (b) for GNNs, we develop a novel long-distance targeted poisoning attack that reveals critical blind spots in GNN explanation tools, and adapt our AME method to locate poisoned subgraphs; (c) for large reasoning models, we design self-verification probes, which reveal that intermediate answer correctness signals are encoded in a reasoning model’s hidden states, and enable confidence-based early-exit strategies that reduce inference tokens without compromising accuracy. Our work advances both the understanding of AI misbehavior and the development of practical tools for building more trustworthy, efficient, and interpretable AI systems.
Ph.D. Thesis 2025 An Explicit Certified Method for Path Planning Problem of an SE(3) Robot Zhang, Zhaoqi Abstract | PDF

Title: An Explicit Certified Method for Path Planning Problem of an SE(3) Robot

Candidate: Zhang, Zhaoqi

Advisor(s): Chee Yap

Abstract:

The design and implementation of theoretically-sound robot motion planning algorithms is challenging, especially for robots with high degrees of freedom (DOF). This thesis presents an explicit, practical and certified path planner for a rigid spatial robot with 6 DOFs. The robot is a spatial triangle moving amidst polyhedral obstacles. Correct, complete and practical path planners for such a robot have never been achieved; building one is widely recognized as a key challenge in robotics. We design such a planner by using the Soft Subdivision Search (SSS) framework, based on the twin foundations of ε-exactness and soft predicates. This SSS planner is a theoretical alternative to the standard exact algorithms, and provides much stronger guarantees than probabilistic or sampling algorithms.

In this thesis, we address technical challenges for the SE(3) robot. First, we establish the foundational theory of the SSS framework by proving a general form of the Fundamental Theorem of SSS. Second, we introduce a topologically correct data structure for non-Euclidean path planning in the SE(3) space. Third, we analyze the distortion bound of the SE(3) representation. Fourth, we design an approximate footprint and combine it with the highly efficient feature set technique, which leads to its soft predicate. Finally, we explicitly design the geometric primitives to avoid using a general solver of a polynomial system. This allows a direct implementation. These contributions represent a robust, practical, and adaptable solution to robot motion planning.
Ph.D. Thesis 2025 On the Diversity and Stability of Internal Representations in Deep Neural Networks Zhu, Jiachen Abstract | PDF

Title: On the Diversity and Stability of Internal Representations in Deep Neural Networks

Candidate: Zhu, Jiachen

Advisor(s): Yann LeCun

Abstract:

The quality of internal representations is fundamental to the performance and generalization capabilities of deep neural networks. However, standard training paradigms often produce representations that are suboptimal; they can suffer from feature redundancy and dimensional collapse, which harms transferability, and they rely on complex normalization layers to ensure stable training dynamics. This thesis addresses these critical challenges through a comprehensive investigation into methods that directly shape and control the properties of learned representations. First, we tackle the problem of feature diversity by introducing Variance-Covariance Regularization (VCReg), an explicit regularization method that encourages the network to learn high-variance and low-covariance features. By applying this principle to intermediate representations, we show that VCReg effectively mitigates neural collapse and gradient starvation. This leads to significant improvements in transfer learning performance across a wide range of tasks and modalities, including image classification, video action recognition, and long-tail learning scenarios.
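The idea of rewarding high-variance, low-covariance features can be illustrated with a small sketch. The penalty below follows a common variance-covariance formulation (a hinge on each feature's standard deviation plus squared off-diagonal covariances); it is an assumption made for illustration, and the function name `vc_penalty`, the `target_std` and `eps` parameters, and the exact weighting are hypothetical rather than taken from the thesis.

```python
import math
from statistics import fmean


def vc_penalty(batch, target_std=1.0, eps=1e-4):
    """Return (variance_term, covariance_term) for a batch of feature vectors.

    batch: list of n feature vectors, each a list of d floats (n >= 2).
    Illustrative formulation only; not the thesis's exact loss.
    """
    n, d = len(batch), len(batch[0])
    means = [fmean(row[j] for row in batch) for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in batch]
    # Sample covariance matrix of the features.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Variance term: penalize features whose std falls below target_std.
    var_term = fmean(
        max(0.0, target_std - math.sqrt(cov[j][j] + eps)) for j in range(d)
    )
    # Covariance term: penalize correlation between different features.
    cov_term = sum(
        cov[i][j] ** 2 for i in range(d) for j in range(d) if i != j
    ) / d
    return var_term, cov_term


collapsed = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]          # no variance at all
redundant = [[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [-2.0, -2.0]]  # duplicated feature
diverse = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]    # decorrelated
```

In this sketch a collapsed batch is penalized by the variance term, a batch with duplicated features by the covariance term, and a decorrelated batch with sufficient spread incurs neither penalty.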

Second, we study the challenge of training stability. Motivated by an empirical analysis of how normalization layers shape activation distributions, we introduce Dynamic Tanh (DyT), a simple, element-wise function designed to replace normalization layers entirely. We demonstrate that Transformers equipped with DyT can be trained stably without any normalization, matching or exceeding the performance of their conventional counterparts on benchmarks spanning computer vision, language modeling, and generative modeling. Taken together, the contributions in this thesis demonstrate that by controlling the statistical properties of internal representations—through both explicit regularization and principled architectural design—we can build deep learning models that are more robust, generalizable, and efficient.
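A minimal sketch of an element-wise replacement for a normalization layer is shown below. It assumes the form gamma * tanh(alpha * x) + beta with a scalar `alpha` and per-feature `gamma` and `beta`; this form and the default values are assumptions for illustration, since the abstract only states that DyT is a simple element-wise function.

```python
import math


def dyt(x, alpha=0.5, gamma=None, beta=None):
    """Apply an assumed DyT-style transform to one feature vector.

    x: list of floats. alpha: scalar that would be learnable in a real model.
    gamma, beta: per-feature scale and shift (default to 1 and 0).
    """
    d = len(x)
    gamma = [1.0] * d if gamma is None else gamma
    beta = [0.0] * d if beta is None else beta
    # No batch or layer statistics are computed: each element is squashed
    # independently, which is what makes the function element-wise.
    return [g * math.tanh(alpha * v) + b for v, g, b in zip(x, gamma, beta)]


activations = [-1000.0, -1.0, 0.0, 1.0, 1000.0]
squashed = dyt(activations)
```

Unlike layer normalization, nothing here depends on the mean or variance of the input; extreme activations are simply saturated by the tanh.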
Ph.D. Thesis 2024 On Efficient Instantiations of Secure Multi-Party Computation in Practice Bienstock, Alexander Abstract | PDF

Title: On Efficient Instantiations of Secure Multi-Party Computation in Practice

Candidate: Bienstock, Alexander

Advisor(s): Yevgeniy Dodis, Marshall Ball

Abstract:

Secure Multi-Party Computation (MPC) is an area of cryptography that has been studied extensively since the 1980s. In full generality, MPC allows a set of mutually distrusting parties to privately compute a function of their inputs. That is, the parties interact in some protocol, and at the end obtain the output of the function, and nothing else. In the decades since the inception of MPC, great strides have been made towards making it more efficient. However, despite this progress, the use of MPC in practice still faces some shortcomings.
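As a toy illustration of computing a function of private inputs, the sketch below sums several parties' values using additive secret sharing, a standard textbook technique. It simulates all parties in one process, assumes honest participants, and is not one of the protocols developed in the thesis; `MODULUS`, `share`, and `secure_sum` are hypothetical names.

```python
import secrets

MODULUS = 2**61 - 1  # all arithmetic is done modulo this value


def share(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(inputs):
    """Simulate parties jointly computing the sum of their private inputs."""
    n = len(inputs)
    # Party i splits its input and sends share j to party j.
    all_shares = [share(x, n) for x in inputs]
    # Party j adds up the shares it received; one share alone reveals nothing.
    partial = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # The partial sums are published and combined into the final output.
    return sum(partial) % MODULUS


total = secure_sum([3, 10, 29])
```

Each individual share is uniformly random, so a party learns only the final sum and its own input.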

In this thesis, we take steps to mitigate two such shortcomings. The first deficiency we study is related to the communication networks in which such MPC protocols operate. MPC protocols are usually designed assuming that all parties have pairwise secure communication channels which are stable; i.e., nodes never crash, messages always arrive on time, etc. However, in the real world, this is rarely the case: it is hard to sustain a stable connection between parties over long periods of time. One model that has been introduced to address this deficiency is called Fluid MPC (Choudhuri et al., CRYPTO 2021). In this model, parties are not mandated to stay online for long periods of time. Instead, parties come online for short periods of time and work together in committees to compute some function. The benefit is that individual committees are much more likely to be able to sustain stable connections for these shorter interactions. However, existing protocols in this model do not match the level of efficiency that is obtained by traditional MPC protocols. In the first part of this thesis, we study Fluid MPC, and in particular, introduce Fluid MPC protocols with efficiency that matches that of traditional MPC.

The second deficiency of MPC which we study in this thesis is that general-purpose protocols often are still not efficient enough to be used in practice. One way to resolve this is by using protocols that are tailor-made for specific applications. One such application that has gained recent attention is called Private Join and Compute (PJC). In this application, two parties come together with input sets and associated values for each item in their sets. The goal is to privately compute a function over the associated values of the intersection of the two sets. In practice, the size of the intersection is quite small, and therefore the private computation of the intersection is actually much more expensive than whatever computation needs to be done over it. In the second part of this thesis, we improve the efficiency of tailor-made state-of-the-art protocols that are used to privately compute the intersection, thus improving the efficiency of prior PJC protocols.
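To make the PJC functionality concrete, the sketch below computes it in the clear, with no privacy at all: it shows only what the two parties want to learn, not how a protocol hides everything else. The function name `join_and_compute`, the choice of a sum of products as the function, and the sample data are all hypothetical.

```python
def join_and_compute(values_a, values_b, combine=lambda a, b: a * b):
    """Plaintext reference for the join-and-compute functionality.

    values_a, values_b: dicts mapping each party's items to associated values.
    Returns the intersection size and the sum of combined values over it.
    """
    # Step 1: find the items both parties hold (the expensive step in practice).
    common = values_a.keys() & values_b.keys()
    # Step 2: evaluate the function over the associated values of those items.
    total = sum(combine(values_a[k], values_b[k]) for k in common)
    return len(common), total


ad_views = {"u1": 1, "u2": 1, "u3": 1}
purchases = {"u2": 30, "u3": 12, "u9": 99}
size, spend = join_and_compute(ad_views, purchases)
```

A private protocol would produce the same output while revealing neither party's non-matching items nor the individual values.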
Ph.D. Thesis 2024 Generative modeling and Stochastic Control as Dynamics on Probability Distributions Domingo i Enrich, Carles Abstract | PDF

Title: Generative modeling and Stochastic Control as Dynamics on Probability Distributions

Candidate: Domingo i Enrich, Carles

Advisor(s): Joan Bruna

Abstract:

Several modern machine learning algorithms can be studied from the perspective of evolution dynamics on the space of probability measures. Gradient descent-ascent algorithms that are used to solve minimax problems such as the ones arising in generative adversarial networks (GANs) can be interpreted as a joint evolution of two measures: one over the space of parameters of the generator, and one over the space of parameters of the discriminator.

In the first chapter of the thesis, I study systems of this form, and provide convergence guarantees when possible. Diffusion models, which are another generative modeling technique, are also based on dynamics on probability measures, in this case over the space of samples. The dynamics are simulated at inference time; the starting distribution is a Gaussian, and the final distribution is meant to be the target data distribution. Diffusion models were generalized by the Flow Matching framework, which allows constructing different paths between the Gaussian noise distribution and the data distribution.

In the second part, I introduce Multisample Flow Matching, which is a generalization of Flow Matching with intimate connections to optimal transport. Stochastic optimal control is a third problem where dynamics on measures play a critical role. The goal is to learn a vector field (the control) in order to drive the behavior of the solutions of a stochastic differential equation.

In the third chapter, I present Stochastic Optimal Control Matching, a least-squares loss that is based on the same principles that are used to formulate diffusion model losses, and which achieves errors that are an order of magnitude lower than for existing methods.

The talk will cover the second and third chapters.
Ph.D. Thesis 2024 Solver-Aided Compiler Design for Programmable Network Devices Gao, Xiangyu Abstract | PDF

Title: Solver-Aided Compiler Design for Programmable Network Devices

Candidate: Gao, Xiangyu

Advisor(s): Anirudh Sivaraman, Srinivas Narayana

Abstract:

Historically, network devices were mostly fixed-function ones. They could run at a line rate of one network packet per nanosecond, but it was impossible to support newly developed network algorithms without upgrading the device. The emergence of programmable network devices remedies this drawback. These devices use the reconfigurable match table model to enable programmability and provide more flexibility for developers to continue updating and adding new algorithms to the device. Several programming languages have been developed to write programs for these devices. Even though it is not hard to get started with writing packet-processing code, writing programs that can fit within the target devices’ various resource constraints is not an easy job. The root cause is the lack of optimizing compilers in this domain. Hence, this thesis focuses on optimizing compiler design for domain-specific network accelerators using solver-aided techniques that can generate better compilation results compared with state-of-the-art compilers.

First, we build the Chipmunk compiler that does code generation for stateful transactions into programmable switches using program synthesis. We frame the compilation problem as a solution-searching problem and use a program synthesis engine, SKETCH, to find a semantically equivalent compilation outcome. We also develop a series of algorithms to speed up the compilation process. We find that the Chipmunk compiler can generate better compilation results in terms of hardware resource usage within a reasonable time period.

Second, we build the CaT compiler that does both code generation and resource allocation into the packet-processing pipeline using solver-aided technologies. We decompose the compilation problem for such pipelines into three phases—making extensive use of solver engines to simplify the development of these phases. We also incorporate some heuristics for further resource usage optimization. We observe that the CaT compiler can generate near-optimal compilation results at a much faster speed than Chipmunk.

Third, we build the Polyglotter compiler that outputs programs for target hardware devices from input programs written for source hardware devices in the parser portion. This compiler unifies features across different programming languages and reduces the efforts required to write algorithms across platforms. We discover that the Polyglotter compiler can generate correct transpilation results with better hardware resource usage.

To the best of our knowledge, we are the first to propose incorporating solver-aided techniques into compiler design for programmable network devices. In the domain of programmable network devices, these compilers can outperform traditional compilers that rely on program rewrite rules. Our contributions go beyond building solver-aided compilers and include domain-specific algorithms to speed up the whole compilation process. Based on these compilers, we explore several useful aspects of solver-aided techniques and hope to extend them to more applications in future work.
Ph.D. Thesis 2024 Predictive and Generative Models of Protein Sequence and Structure Lin, Zeming Abstract | PDF

Title: Predictive and Generative Models of Protein Sequence and Structure

Candidate: Lin, Zeming

Advisor(s): Yann LeCun

Abstract:

Historically, protein engineering has predominantly involved a bottom-up strategy, utilizing naturally occurring components as the building blocks. However, the problem of designing arbitrary protein sequences and structures for specific problems presents significant challenges due to the complexity of biological systems. In this work, we tackle the problem of developing models of protein sequences and structures for prediction and generation. We show that neural networks can learn the patterns inherent to these systems and provide results for modeling proteins through predicting protein structures from a given sequence and vice versa. Generative models can also model the unconditional distributions of protein sequence and structure.

To model protein structures, we present an autoencoder architecture that can produce a wide array of protein backbones. These structures exhibit both local and global coherence in terms of secondary and tertiary structures. Using classical techniques to design sequences that fold to generated backbones, we show that the model can generate novel sequences which are validated in-silico. To generate better sequences for these backbones, we then present ESM-IF1, a model for fixed backbone protein design. We designed a large-scale system to predict millions of structures using AlphaFold. By training on the synthetic data, we were able to obtain state-of-the-art results, with over 50% sequence recovery.

We then scale large protein language models to 15 billion parameters (ESM-2) as an unconditional model of protein sequences. ESM-2 is capable of replacing multiple sequence alignment (MSA) features to obtain nearly state-of-the-art structure prediction results from a single sequence. Removing MSA features gives a 60x speed up, allowing us to catalog the largest database of predicted protein structures. We open-sourced the ESM Metagenomic Atlas, a database of over 225 million high-confidence predicted structures, giving us an unprecedented view into the vast breadth and diversity of natural proteins. Finally, the speed and single sequence nature of our model allows us to directly optimize the protein sequence with respect to the protein structure. We show that black box optimization techniques can enable the design of proteins with structural constraints such as symmetry, scaffolding, and binding. In sum, we present a series of models that are able to model the conditional and unconditional distributions of protein sequence and structure.
Ph.D. Thesis 2024 Towards Responsible AI: Safeguarding Privacy, Integrity, and Fairness Mirza, Muhammad Shujaat Abstract | PDF

Title: Towards Responsible AI: Safeguarding Privacy, Integrity, and Fairness

Candidate: Mirza, Muhammad Shujaat

Advisor(s): Christina Pöpper

Abstract:

The widespread adoption of Artificial Intelligence (AI) into digital platforms, spanning general-purpose applications such as chatbots, professional tools like code generation, and high-risk domains like healthcare, has profoundly transformed user experiences. However, this rapid integration has also brought to the forefront critical concerns surrounding privacy, integrity, and fairness. This thesis systematically investigates these three interconnected challenges through comprehensive investigations revealing vulnerabilities and proposes approaches to address them, contributing to the responsible development of AI technologies.

In addressing privacy concerns, we focus on managing personal information exposure in an era where digital data persists indefinitely. We begin with a global longitudinal analysis of privacy narratives to contextualize the evolving landscape of privacy concerns. Next, we systematically develop a semi-automated pipeline to assess the risks of training data extraction from large language models (LLMs), particularly those used for code generation such as GitHub Copilot. We demonstrate the feasibility of leaking various types of sensitive personal information, including email addresses, medical records, and passwords. Finally, we undertake a comprehensive systematization of privacy-enhancing technologies for exposure management, bridging gaps between technical solutions and user needs. We identify key discrepancies and propose actionable strategies for aligning technical solutions with user expectations. These findings lay the groundwork for user-centric privacy solutions that effectively address data persistence challenges.

To tackle threats to information integrity, we focus on the potential misuse of generative AI tools and coordinated disinformation campaigns. We conduct a detailed evaluation of factual accuracy of frontier LLMs, such as the GPT series, in the zero-shot classification setting. By comparing different model versions we uncover inconsistencies in performance improvements, with GPT-4's March release outperforming its June counterpart. Next, we develop a novel cybersecurity-inspired framework for characterizing disinformation threats, profiling threat actors, attack patterns, targets, and channels. We validate our framework's effectiveness through case studies of real-world disinformation campaigns, highlighting its potential to strengthen the integrity of online information ecosystems and laying the groundwork for potential automated threat-scoring systems.

Lastly, we address fairness in machine learning systems by identifying biases that reinforce inequalities. We introduce Global-Liar, a novel dataset uniquely balanced in terms of geographic representation, facilitating a more nuanced factuality evaluation of LLM biases across different regions. Using this dataset, we conduct a rigorous evaluation of general-purpose LLMs, revealing significant disadvantages faced by the Global South. Next, we conduct a thorough investigation into fairness in high-risk computer vision models used for medical diagnosis in healthcare. Our assessment reveals significant racial and sex biases in kidney and tumor segmentation tasks. We investigate a range of bias mitigation approaches, from pre-processing techniques, like stratified batch sampling, to algorithmic interventions, like fair meta-learning. Notably, our findings suggest that architectural choices play a significant role in bias reduction, emphasizing the necessity of careful design and thorough evaluation of model architectures.

In summary, our findings and proposed solutions in privacy, integrity, and fairness contribute to responsible AI development, aiming to democratize its benefits across all constituencies.
Ph.D. Thesis 2024 Learning from Rewards in Text Generation Pang, Richard Yuanzhe Abstract | PDF

Title: Learning from Rewards in Text Generation

Candidate: Pang, Richard Yuanzhe

Advisor(s): He He, Kyunghyun Cho

Abstract:

The progress in text generation comes from every stage in the pipeline: problem definition, data curation, learning, decoding, and evaluation. This dissertation focuses on learning. There is a mismatch between traditional training objectives and evaluation objectives: regular maximum likelihood estimation tries to minimize the cross-entropy loss with respect to each sample in the dataset, but the downstream evaluation is often based on a reward that scores the compatibility of the input-output pair (e.g., human judgments of the output). I aim to bridge this gap by optimizing for the reward of the generated text directly.

The talk is composed of the following components. (1) Rewards could be expensive to obtain. To tackle this challenge in the social dialogue setting, we extract implicit signals from deployment data without extra human annotations. (2) The model could make slow or no progress in learning, and one idea is to obtain denser and high-quality rewards. In neural machine translation, we define a reward inspired by noisy channel decoding, which has a long history, and we are able to increase decoding speed significantly while ensuring similar translation quality. (3) Another way to make progress in learning is to innovate on training algorithms instead. We set the rewards to be based on the simple exact match of generations and references, but algorithm-wise we explore the extreme case where we do not deviate too far from references by framing text generation as an offline reinforcement learning (RL) problem. We propose generation by off-policy learning from demonstrations (GOLD) using importance weighting. Our generations outperform those trained by MLE and policy gradient on a range of tasks. (4) We show that we do not need to rely on RL using a few reasoning tasks (e.g., math, science, commonsense) as the testbed. We develop an approach called iterative reasoning preference optimization (IRPO) that optimizes for winning vs. losing reasoning chains of thought, using modified direct preference optimization as the criterion. IRPO results in markedly increased accuracies compared to a range of baselines.
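As a rough illustration of the preference-optimization criterion mentioned in component (4), the sketch below computes the standard direct preference optimization (DPO) loss for one winning/losing pair. It is a minimal sketch, not the thesis's method: the abstract does not spell out how IRPO modifies DPO, so only the unmodified loss is shown, and the function names, the `beta` default, and the use of summed sequence log-probabilities as inputs are all assumptions made for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """Standard DPO loss for a single preference pair (illustrative only).

    logp_* are sequence log-probabilities under the model being trained;
    ref_logp_* are the same quantities under a frozen reference model.
    beta is a hypothetical default, not a value taken from the thesis.
    """
    # How much more the trained model favors the winner over the loser,
    # relative to the reference model.
    margin = beta * ((logp_win - ref_logp_win) - (logp_lose - ref_logp_lose))
    # -log(sigmoid(margin)) equals softplus(-margin).
    return softplus(-margin)
```

When the trained model and the reference model agree, the margin is zero and the loss equals log 2; the loss falls below that value as the trained model shifts probability toward the winning chain of thought.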

To conclude the talk, I will discuss the future directions of using large language models as rewards. I will briefly mention the initial promise given by our work on self-rewarding language models using LLM-based rewards with a learning algorithm connected to that in IRPO; the discussion is then followed by the corresponding challenges and next steps. I will also touch on human–AI collaboration – an additional way to improve LLM evaluation capabilities.
Ph.D. Thesis 2024 Verification of Concurrent Search Structures Patel, Nisarg Abstract | PDF

Title: Verification of Concurrent Search Structures

Candidate: Patel, Nisarg

Advisor(s): Prof. Thomas Wies

Abstract:

Concurrent search structures are a class of concurrent data structures that implement a key-value store. Concurrent search structures are integral components of modern software systems, yet they are notoriously difficult to design and implement. In the context of concurrency, linearizability is the accepted notion of correctness of a data structure. Verifying linearizability of concurrent search structures remains a formidable challenge due to the inherent complexity of the underlying algorithms. So far, verification of these data structures has often led to large, intricate proofs that are hard to comprehend and reuse.

The concrete contribution of the thesis is developing and verifying new template algorithms that cover several variants of lock-free skiplists and lock-based log-structured merge (LSM) trees. The template algorithms capture the concurrency mechanisms, but abstract away node-level details and the maintenance operations.

The generalizable contribution of the thesis is the advancement in the verification technology required to prove the new template algorithms. There are two key contributions here: the first relates to hindsight reasoning and the second to keyset reasoning. Hindsight reasoning has been shown to be useful for proving linearizability, but it has not been explored in the context of a foundational program logic. The thesis addresses the challenge by embedding the technique of hindsight reasoning in the concurrent separation logic Iris via prophecy variables. Keyset reasoning is useful for lifting assertions on a node's contents to the global contents held by the structure. The thesis develops a keyset resource algebra, an Iris resource algebra to enable keyset reasoning in Iris.

All of the techniques and proofs are mechanized in Iris/Coq. Verified search structures include in particular the Michael set, the Harris list, the Herlihy-Shavit skiplist and an LSM-tree implementation based on LevelDB. The verification effort represents a significant contribution as it is the first mechanized proof of linearizability for concurrent skiplists and LSM-trees.
Ph.D. Thesis 2024 Neural Language Representations and Scaling Semi-Supervised Learning for Speech Recognition Peyser, Cal Abstract | PDF

Title: Neural Language Representations and Scaling Semi-Supervised Learning for Speech Recognition

Candidate: Peyser, Cal

Advisor(s): Prof. Kyunghyun Cho, Prof. Michael Picheny

Abstract:

Speech recognition research has been focused for several years on the incorporation of unpaired speech and text data alongside conventional supervised datasets. Dominant methods have emphasized auxiliary tasks for refining speech and/or text representations during model training. These methods have generally performed strongly when paired with very small supervised datasets, but do not yield the same improvements against strong, supervised baselines. We argue in this thesis that the path to scaling these methods lies in the speech and text representations themselves. We investigate statistical properties of these representations, and show that downstream ASR performance corresponds to a model's ability to jointly represent speech and text. We analyze existing methods for semi-supervised ASR, and develop an algorithm to improve them at scale by aligning speech and text in representation space.
Ph.D. Thesis 2024 Unlocking AI outside the training distribution: Generalization, Causality, and Coronary Risk Modeling Puli, Aahlad Manas Abstract | PDF

Title: Unlocking AI outside the training distribution: Generalization, Causality, and Coronary Risk Modeling

Candidate: Puli, Aahlad Manas

Advisor(s): Prof. Rajesh Ranganath

Abstract:

Modern AI models make it easy to exploit the correlations in a dataset to predict a target of interest from a given set of inputs. However, the primary use of these models often lies outside the training data. For example, while one can train a Transformer to correlate a patient's medical history to their chances of developing coronary heart disease (CHD), the goal would be to estimate risks on populations elsewhere or in the future. Challenges arise if the model relies on correlations that shift between training and test times or capture non-causal relationships. Predictions based on unstable relationships can degrade outside the training distribution, and basing treatment decisions on non-causal relationships can result in harm. This thesis first develops a methodology for generalizing out-of-distribution (OOD) and estimating causal effects. It closes with an empirical study of building and transporting CHD risk models at two large hospital systems.

The first part begins by defining a class of distribution shifts where standard training or balancing the data yields models that can perform worse than random guessing. We characterize representations that generalize across such shifts and derive an algorithm to build models with such representations. Next, we develop an approach to encode knowledge of features used by humans into building robust models. The last work in this part identifies biases implicit in the standard way of training, gradient-based optimization of cross-entropy, that force models to depend more on unstable features than on the more informative stable ones. We develop a class of loss functions to encourage dependence on the more informative features.

The second part of this thesis studies cases where common assumptions that enable causal estimation are violated. We provide an algorithm to estimate causal effects with deep models from confounded data where instrumental variables are available. This algorithm generalizes the control function method and works without the separability assumptions required by popular algorithms like the two-stage least-squares and generalized method of moments. Then, we consider tasks where the confounders are known to equal a function of the variables whose effects we want to estimate; this setup violates an assumption known as overlap or positivity, commonly made to uniquely determine (identify) causal effects from non-randomized data. In this setting, we derive nonparametric conditions for identifiability and derive an estimator that solves a gradient flow equation to answer general causal queries from the data without overlap.

The last part of this thesis performs an empirical study of building and transporting CHD risk models between two large hospitals. Departing from the standard approach of constructing risk scores from carefully chosen features, we use broad feature sets available in the electronic health records (EHRs). We train AI models to predict time-to-CHD from minimally curated EHR data that outperform existing risk scores at the institution where they were trained and when transported externally.
Ph.D. Thesis 2024 DrawTalking: Building Interactive Worlds by Sketching and Speaking Rosenberg, Karl Toby Abstract | PDF

Title: DrawTalking: Building Interactive Worlds by Sketching and Speaking

Candidate: Rosenberg, Karl Toby

Advisor(s): Ken Perlin

Abstract:

This thesis introduces the design and implementation of an interaction concept called DrawTalking. Through simple combinations of sketching and speaking, the user can improvisationally build an interactive world of graphics, animations, diagrams, and dynamic mechanisms with behavior and rules, as if by narrating a story or explaining a concept to an audience. The interface demonstrates a possible step towards designing future interfaces more closely in tune with how we naturally communicate and think.

For context, sketching while speaking has played a major part in innovation across disciplines. The combination of visuals and spoken language enables us to make-believe: think about, describe, communicate, and interact with anything that we can think of, including things that do not or cannot exist in the real world. Evolving technology creates opportunities to move beyond sketching and speech alone. Human-computer interactions of the future, drawing inspiration from our process of make-believe, can add interactive computation to the combination of sketching and speech, allowing us to work with explorable worlds, simulations, and mechanics. By enabling such interactions, we might think, learn, design, play, and tell stories in increasingly expressive ways.

Towards this idea, what makes for a good interface for computation-mediated sketching and speaking? This touches upon several fundamental questions in interaction design, human-AI interaction, and human-centered interfaces, chief among them: how should human control and machine automation be balanced?

Inspired by real-world speaking and sketching interactions, and seminal works in dynamic sketching, interactive visual programming, and language interfaces, we designed interaction techniques that draw on the way people describe objects and phenomena when telling stories and explaining processes at a whiteboard.

How does it work? The user speaks to label hand-drawn sketches with names and properties, and to define rules for how their world should behave. This communicates semantic intent to the computer, while giving the user the flexibility to choose how to represent and change their drawings. Now the user can interact with a simulated world simply by narrating stories or describing mechanics, which dynamically creates running interactive programs from built-in primitives and user-customized rules.

To gauge understanding of the mechanics of DrawTalking and to derive use cases, we invited participants to an open-ended one-on-one user-study session with the researcher to discover and explore the features in DrawTalking. Each user improvised and prototyped interactive sketch-based animations and gameplay scenarios by collaborating with the researcher. The resulting artifacts and discussion were oriented around each participant's specific experiences and background.

Feedback suggests that our approach is promising and intuitive: it prioritizes user control; it is flexible and supports improvisation; the workflow is fluid; the features are extensible and adaptable to other application domains and contexts beyond sketching; the design demonstrates how multiple applications can use similar language-based interaction techniques and behaviors predictably alongside other language-based technologies; it enables programming-like capability without code.

Through the research and design process of DrawTalking, we learned that it could represent an approach to designing complex interoperating systems for human-AI collaboration. We hope it can serve as a useful example for research and design of future machine-mediated interfaces, interactions, and computer systems.
Ph.D. Thesis 2024 Algorithmic enhancements to causal inference problems Shen, Bingran Abstract | PDF

Title: Algorithmic enhancements to causal inference problems

Candidate: Shen, Bingran

Advisor(s): Prof. Dennis Shasha

Abstract:

This thesis explores novel approaches to inferring and representing causal relationships in biological networks. We introduce EnsInfer, an ensemble method that combines state-of-the-art inference algorithms using a Naive Bayes classifier, outperforming individual methods and providing a flexible framework for integrating diverse data types. Our research then challenges the conventional representation of gene regulatory networks (GRNs) by demonstrating that nonlinear machine learning models achieve better predictive performance than models based solely on "gold standard" regulatory edges. To address this limitation, we propose a bipartite network representation that better captures the synergistic regulatory effects of multiple transcription factors on target genes. This framework focuses on four key goals: predictive accuracy, parsimonious enumeration of predictive regulatory genes, identification of disjoint sets of predictive regulatory genes, and construction of a bipartite network representation of causality. Our work provides an actionable and interpretable paradigm for investigating causal gene regulation, with potential applications across diverse domains of causality research.
Ph.D. Thesis 2024 Olympiad-level Geometry Theorem Proving without Human Demonstrations Trinh, Trieu Abstract | PDF

Title: Olympiad-level Geometry Theorem Proving without Human Demonstrations

Candidate: Trinh, Trieu

Advisor(s): He He

Abstract:

Proving mathematical theorems at Olympiad level represents a significant milestone in human-level automated reasoning, owing to their reputed difficulty among the world’s best talents in pre-university mathematics. Current machine learning approaches, however, are not applicable to most mathematical domains due to the high cost of translating human proofs into machine-verifiable format. The problem is even worse for geometry due to its unique translation challenges, resulting in severe scarcity of training data. We propose G0, a theorem prover for Euclidean plane geometry that sidesteps the need for human demonstrations by synthesizing millions of theorems and proofs across different levels of complexity. G0 is a neuro-symbolic system that uses a neural language model, trained from scratch on our large-scale synthetic data, to guide a symbolic deduction engine through infinite branching points in challenging problems. On a test set of 30 latest Olympiad problems, G0 solves 25, outperforming the previous best method that only solves 10 problems and approaching the performance of an average International Mathematical Olympiad (IMO) gold medalist. Notably, G0 produces human-readable proofs, solves all geometry problems in the IMO 2000 and 2015 under human expert evaluation, and discovers a generalized version of a translated IMO theorem in 2004.
Ph.D. Thesis 2024 Improve Language Model Serving Efficiency with Fine-grained and Stateful Scheduling Yu, Lingfan Abstract | PDF

Title: Improve Language Model Serving Efficiency with Fine-grained and Stateful Scheduling

Candidate: Yu, Lingfan

Advisor(s): Jinyang Li

Abstract:

The world has witnessed the remarkable success of large language models (LLMs), led by the fast-growing popularity of ChatGPT. However, it is challenging to serve these language models and deliver both high throughput and low latency due to the iterative nature of language models. This thesis identifies two key issues impacting the performance of existing systems: (1) Coarse-grained batching at the request level results in wasteful computation for requests with variable input and output lengths; (2) The lack of stateful context management results in duplicate computation for applications that engage in multi-turn interactions with the LLM. Two systems, BatchMaker and Pensieve, are then presented to address these issues.

BatchMaker proposes a technique called cellular batching to improve the latency and throughput of language model inference. Existing systems use batch execution of the dataflow graphs of a fixed set of requests. By contrast, BatchMaker makes finer-grained batching decisions at each token processing step, and dynamically assembles a batch for execution as requests join and leave the system.
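The contrast between request-level and step-level batching can be made concrete with a toy step counter. This is only a sketch under simplifying assumptions, not BatchMaker's scheduler: each request is reduced to a positive number of token steps, every batched step costs one time unit regardless of batch size, and the function names and the `max_batch` parameter are hypothetical.

```python
from collections import deque


def request_level_steps(lengths, max_batch):
    """Steps used when each fixed batch runs until its longest request ends."""
    steps = 0
    for start in range(0, len(lengths), max_batch):
        # Short requests in the batch sit idle until the longest one finishes.
        steps += max(lengths[start:start + max_batch])
    return steps


def step_level_steps(lengths, max_batch):
    """Steps used when the batch is re-assembled at every token step."""
    waiting = deque(lengths)
    active = []  # remaining token steps of each running request
    steps = 0
    while waiting or active:
        # Newly arrived requests join as soon as a slot is free.
        while waiting and len(active) < max_batch:
            active.append(waiting.popleft())
        # Run one token step for every active request, then drop finished ones.
        active = [remaining - 1 for remaining in active]
        active = [remaining for remaining in active if remaining > 0]
        steps += 1
    return steps
```

For example, with request lengths `[1, 8, 1, 8]` and `max_batch=2`, the request-level count is 16 steps while the step-level count is 10, because finished short requests free their slots immediately.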

Pensieve is a system optimized for multi-turn conversation LLM serving. It maintains the conversation state across requests from the same conversation by caching previously processed history to avoid duplicate processing. Pensieve's multi-tier caching strategy utilizes both GPU and CPU memory to store and retrieve cached data efficiently. Pensieve also generalizes the recent PagedAttention kernel to support attention between multiple input tokens whose KV cache is spread over non-contiguous GPU memory.

Experiments on various workloads show that BatchMaker improves throughput by 25-80% while reducing latency by 18-90%, and Pensieve improves throughput by 33-100% and reduces latency by 40-77%.
Ph.D. Thesis 2024 Theory of Symmetric Neural Networks Zweig, Aaron Abstract | PDF

Title: Theory of Symmetric Neural Networks

Candidate: Zweig, Aaron

Advisor(s): Joan Bruna

Abstract:

Symmetric functions, which take as input an unordered, fixed-size set, find practical application in myriad physical settings based on indistinguishable points or particles, and are also used as intermediate building blocks to construct networks with other invariances. Symmetric functions are known to be universally representable by neural networks that enforce permutation invariance. However, the theoretical tools that characterize the approximation, optimization and generalization of typical networks fail to adequately characterize architectures that enforce invariance.

This thesis explores when these tools can be adapted to symmetric architectures, and when the invariance properties lead to new theoretical findings altogether. We study and prove approximation limitations on the extension of symmetric neural networks to infinite-sized inputs, the approximation capabilities of symmetric and antisymmetric networks relative to the interaction between set elements, and the learnability of simple symmetric functions with gradient methods.
Ph.D. Thesis 2023 On Matching Problems in Large Settings Agarwal, Ishan Abstract | PDF

Title: On Matching Problems in Large Settings

Candidate: Agarwal, Ishan

Advisor(s): Richard Cole

Abstract:

Matching problems arise in several settings in practice and have been a longstanding subject of theoretical analysis. Typically, the settings of interest involve a large number of agents. We further the study of matching problems in two settings: the stable matching setting, which has been studied since the seminal work of Gale and Shapley, and a setting where agents' values to prospective partners degrade over time, leading them to have to balance the trade-off between searching for a better partner versus deciding to match.

In the stable matching setting, we extend a line of research that seeks to explain the dichotomy between the fact that Gale and Shapley's Deferred Acceptance algorithm seems to work well in practice, even when agents only submit a short list of prospective partners to the centralized matching algorithm, and the fact that if the agents' preferences are allowed to be arbitrary, complete lists of all agents' preferences are needed in order to guarantee a stable matching. To this end, we consider probabilistically generated preference lists and we show that under fairly general assumptions and in a variety of models, with high probability, short lists of prospective partners, namely length $\Theta (\log n)$ instead of $n$, suffice for most of the agents. We prove our bounds are tight up to constant factors. Furthermore, we construct a simple set of $\Theta (\log n)$ possible matches per agent for almost all agents and demonstrate (in the form of an approximate equilibrium result) that they can afford to restrict their proposals to this set, while incurring only a small loss in utility.
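For readers unfamiliar with Deferred Acceptance on short lists, the following is a minimal sketch of the proposer-proposing Gale-Shapley procedure in which proposers may submit truncated preference lists. It illustrates the algorithm only; it does not reproduce the thesis's probabilistic preference models or its bounds, and the function and variable names are hypothetical.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Proposer-proposing Deferred Acceptance with possibly short lists.

    proposer_prefs[p] is p's ordered (possibly truncated) list of receivers.
    receiver_prefs[r] is r's ordered list of acceptable proposers.
    Returns a dict mapping each matched receiver to its proposer; agents
    whose lists run out stay unmatched.
    """
    # rank[r][p] is the position of p in r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    match = {}
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        prefs = proposer_prefs[p]
        if next_choice[p] >= len(prefs):
            continue  # list exhausted: p remains unmatched
        r = prefs[next_choice[p]]
        next_choice[p] += 1
        if p not in rank[r]:
            free.append(p)  # r does not list p, so p is rejected
        elif r not in match:
            match[r] = p  # r tentatively accepts its first proposal
        elif rank[r][p] < rank[r][match[r]]:
            free.append(match[r])  # r trades up and releases its old partner
            match[r] = p
        else:
            free.append(p)  # r keeps its current partner
    return match
```

With full lists every agent in a balanced market is matched; when the proposers' lists are cut short, some proposers can end up unmatched, which is why the length of list that suffices is the quantity of interest.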

In the time discounted utilities setting, we consider a dynamic matching market, and study how agents should balance accepting a proposed match with the cost of continuing their search. Our model has two new features: finite agent lifetimes with linear loss in utility over time, and a discrete population model, aspects which are underexplored in the literature. We quantify how well the agents can do by providing upper and lower bounds on the collective losses of the agents, with a polynomially small failure probability, where the notion of loss is with respect to a plausible baseline we define. These bounds are also tight up to constant factors.

In both settings, we complement our theoretical results with numerical simulations.
Ph.D. Thesis 2023 Function Space Reasoning in Gaussian Processes and Neural Networks Benton, Gregory Abstract | PDF

Title: Function Space Reasoning in Gaussian Processes and Neural Networks

Candidate: Benton, Gregory

Advisor(s): Andrew Gordon Wilson

Abstract:

In a typical modeling setting we have prior notions of what types of functions we want to learn. For example, in regression we may want to learn a smooth function or a periodic function and in image classification we may want to learn a function that is invariant to rotations. While function space provides us the benefit of being able to reason about traits like invariance or smoothness, it is often difficult to directly quantify the functional properties of models, in particular for large parametric models like neural networks.

In this thesis we leverage our ability to reason about function space to build more powerful models in both Gaussian processes (GPs) and neural networks. By generating GP kernels as functions themselves of latent processes, we introduce methods for providing uncertainty over what types of functions we produce, not just over the functions themselves in GP models. We also introduce methods for learning levels of invariance and equivariance in neural networks, enabling us to imbue the functions our models produce with soft or limited equivariance constraints. Finally, we show how we can leverage our understanding of parameter space in neural networks to efficiently ensemble diverse collections of functions to improve the accuracy and robustness of our models. Through the introduction of these methods we show that by carefully considering the types of functions we are producing we can describe models with a range of desirable properties. These properties include more flexible models, models that better align with domain knowledge, and models that are both accurate and robust. We demonstrate these results on a broad range of problems, including time series forecasting, image classification, and reinforcement learning.
Ph.D. Thesis 2023 Bridging the Gap from Supervised Learning to Control Brandfonbrener, David Abstract | PDF

Title: Bridging the Gap from Supervised Learning to Control

Candidate: Brandfonbrener, David

Advisor(s): Joan Bruna

Abstract:

The combination of deep learning and internet-scale data with supervised learning has led to impressive progress in recent years. However, the potential of this progress has yet to be realized in the context of control problems beyond games that are easy to simulate. This thesis attempts to bridge this gap so as to leverage tools from supervised learning to solve control problems. To do this, we focus on the offline reinforcement learning setting which attempts to learn a control policy from a fixed dataset rather than requiring the policy to learn and collect data at the same time. This removes issues of non-stationary training data and exploration from the control problem, which allows the more straightforward application of tools from supervised learning.

We study this intersection between supervised learning and control from several angles. In the first part of the thesis, we present work on policy learning, focusing on simplified algorithms that look more like standard supervised algorithms. In the second part, we move one step earlier in the pipeline and consider how to best collect datasets for offline reinforcement learning. And in the last part, we consider how to design pretraining objectives to learn representations for downstream offline policy learning. Taken together, these contributions present a view of the promise and challenges that face the application of machine learning to control problems.
Ph.D. Thesis 2023 Provably Robust and Accurate Methods for Rigid and Deformable Simulation with Contact Ferguson, Zachary Abstract | PDF

Title: Provably Robust and Accurate Methods for Rigid and Deformable Simulation with Contact

Candidate: Ferguson, Zachary

Advisor(s): Daniele Panozzo

Abstract:

Contacts are essential to virtually every aspect of life and play a vital role in many physical phenomena. Because of this, the study of contact mechanics has produced a deep wealth of knowledge. Surprisingly, however, simulating contact remains a challenge, with many parameters to adjust carefully. Incorrect parameters can result in numerical explosions, intersections, and other failures. Our research seeks to address these problems by developing robust methods that can handle arbitrary scenarios with guaranteed success.

In this thesis, we introduce the Incremental Potential Contact (IPC) method. IPC is the first simulation algorithm for deformable and rigid bodies that is unconditionally robust, requires minimal parameter tuning, and provides a direct way of controlling the trade-off between running time and accuracy. We further back up these claims by providing a large-scale benchmark of continuous collision detection (CCD) algorithms (a core component of the IPC method) based on their efficiency and correctness. As part of this study, we introduce the first efficient CCD algorithm that is provably conservative. For extended accuracy and efficiency, we show how nonlinear geometry and function spaces can be used within the IPC framework. Finally, we introduce the first physically-based adaptive meshing strategy which produces more accurate discretizations depending on elastic, contact, and frictional forces.

This work and our open-source implementations have quickly garnered attention from the computer graphics, mechanical engineering, and biomechanical engineering communities for their robustness and ability to seamlessly handle scenarios that have long been a challenge. This marks a large step towards democratizing simulation tools for design, robotics, biomechanical, and visual effects applications, among others.
Ph.D. Thesis 2023 Understanding and Incorporating Mathematical Inductive Biases in Neural Networks Finzi, Marc Abstract | PDF

Title: Understanding and Incorporating Mathematical Inductive Biases in Neural Networks

Candidate: Finzi, Marc

Advisor(s): Andrew Gordon Wilson

Abstract:

To overcome the enormous sample complexity of deep learning models, we can leverage basic elements of human and scientific knowledge and imbue these elements into our models. By doing so, we can short-circuit the thousands of years of evolutionary development that has enabled such rapid learning in humans, and the development of science which provides a framework to fit new knowledge into. In this work I develop new methods for incorporating mathematical inductive biases into our models, biasing them towards solutions that reflect our priors and our knowledge. This work helps to broaden the scope and automation of equivariant model construction across diverse domains, uncover the role of inductive biases in learning and generalization, and develop new machine learning models for scientific applications that capture relevant scientific knowledge.
Ph.D. Thesis 2023 Deconstructing Models and Methods in Deep Learning Izmailov, Pavel Abstract | PDF

Title: Deconstructing Models and Methods in Deep Learning

Candidate: Izmailov, Pavel

Advisor(s): Andrew Gordon Wilson

Abstract:

Machine learning models are ultimately used to make decisions in the real world, where mistakes can be incredibly costly. We still understand surprisingly little about neural networks and the procedures that we use to train them, and, as a result, our models are brittle, often rely on spurious features, and generalize poorly under minor distribution shifts. Moreover, these models are often unable to faithfully represent uncertainty in their predictions, further limiting their applicability. In this dissertation, I present results on neural network loss surfaces, probabilistic deep learning, uncertainty estimation and robustness to distribution shifts. In each of these works, we aim to build foundational understanding of models, training procedures, and their limitations, and then use this understanding to develop practically impactful, interpretable, robust and broadly applicable methods and models.
Ph.D. Thesis 2023 Learning structured and stable reduced models from data with operator inference Sawant, Nihar Abstract | PDF

Title: Learning structured and stable reduced models from data with operator inference

Candidate: Sawant, Nihar

Advisor(s): Benjamin Peherstorfer

Abstract:

Operator inference learns low-dimensional dynamical-system models with polynomial nonlinear terms from trajectories of high-dimensional physical systems (non-intrusive model
reduction). This work focuses on the large class of physical systems that can be well described by models with quadratic and cubic nonlinear terms and proposes a regularizer for
operator inference that induces a stability bias onto learned models. The proposed regularizer is physics informed in the sense that it penalizes higher-order terms with large norms and
so explicitly leverages the polynomial model form that is given by the underlying physics.
This means that the proposed approach judiciously learns from data and physical insights
combined, rather than from either data or physics alone. A formulation of operator inference
is proposed that enforces model constraints for preserving structure such as symmetry and
definiteness in linear terms. Additionally, for a system of nonlinear conservation laws, we
enforce model constraints that preserve the entropy stability of the dynamical system. Numerical results demonstrate that models learned with operator inference and the proposed
regularizer and structure preservation are accurate and stable even in cases where using no
regularization or Tikhonov regularization leads to models that are unstable.
Ph.D. Thesis 2023 Continuous LWE and its Applications Song, Min Jae Abstract | PDF

Title: Continuous LWE and its Applications

Candidate: Song, Min Jae

Advisor(s): Oded Regev/Joan Bruna

Abstract:

Efficiently extracting useful information from high-dimensional data is a major challenge in machine learning (ML). Oftentimes, the challenge comes not from a lack of data, but from its high dimensionality and computational constraints. For instance, when data exhibits a low-dimensional structure, one could in principle exhaustively search over all candidate structures, and obtain estimators with strong statistical guarantees. Of course, such a brute-force approach is prohibitively expensive in high dimensions, necessitating computationally efficient alternatives. When our problem, however, *persistently* eludes efficient algorithms, we may find ourselves asking the following perplexing question: is the failure due to our lack of algorithmic ingenuity or is the problem just too hard? Is there a *gap* between what we can achieve statistically and what we can achieve computationally?

This thesis is one attempt at answering such questions on the computational complexity of statistical inference. We provide results of both positive and negative nature on the complexity of canonical learning problems by establishing connections between ML and lattice-based cryptography. The Continuous Learning with Errors (CLWE) problem, which can be seen as a continuous variant of the well-known Learning with Errors (LWE) problem from lattice-based cryptography, lies at the center of this fruitful connection.

In the first part of this thesis, we show that CLWE enjoys essentially the same average-case hardness guarantees as LWE. This result has several important applications. For example, it shows that estimating the density of high-dimensional Gaussian mixtures is computationally hard, and gives rise to "backdoored" Gaussian distributions that can be used to plant undetectable backdoors in ML models and construct novel public-key encryption schemes.

Next, we focus on the "backdoored" Gaussian distributions, which we refer to as Gaussian Pancakes, and the problem of distinguishing these distributions from the standard Gaussian. We provide several pieces of evidence for the hardness of this distinguishing problem based on a reduction from CLWE and lower bounds against restricted classes of algorithms, such as algorithms that compute low-degree polynomials of the observations.

Finally, we end on a positive note by showing that the Lenstra-Lenstra-Lovasz (LLL) algorithm, commonly used in computational number theory and lattice-based cryptography, has surprising implications for noiseless inference. In particular, we show that LLL solves both CLWE and Gaussian Pancakes in the noiseless setting, in spite of the low-degree lower bound for Gaussian Pancakes. Furthermore, we show that LLL surpasses Sum-of-Squares and Approximate Message Passing algorithms, two methods often conjectured to be optimal among polynomial-time algorithms, on other noiseless problems such as Gaussian Clustering and Gaussian Phase Retrieval. These results highlight the crucial but subtle role of noise and hidden algebraic structure in the onset of statistical-to-computational gaps.
Ph.D. Thesis 2023 Expanding Structural Design through Shape Optimization and Microstructures Tozoni, Davi Colli Abstract | PDF

Title: Expanding Structural Design through Shape Optimization and Microstructures

Candidate: Tozoni, Davi Colli

Advisor(s): Denis Zorin

Abstract:

3D printing and other modern manufacturing tools allow users to design and produce customized objects for their needs at a relatively low cost. However, designing structures that perform well is not an easy task, and doing it manually can be a very slow and tedious process. In this context, structural optimization techniques can be very useful and help automate the design and analysis process.

This thesis describes techniques that can expand the usage of structural optimization for digital fabrication by formulating optimization to be used with simulation models that are closer to reality, through the addition of contact and friction. Moreover, we show a fast method to compute gradients from differentiable simulations, which can be used to optimize shape, material and physical properties of our domain. In addition, we provide ways of expanding the use of two-scale topology optimization by presenting microstructures that have a smooth map from material to geometry and which can be used on curved shapes defined by irregular lattices with close to rhombic cells. Finally, we introduce two low-parametric microstructures that together are able to cover almost the whole possible range of elastic properties for isotropic metamaterials.

Our results in simulation and physical experiments, both for static and time-dependent scenarios, show the advantages of our techniques and how they can be used in practice.
Ph.D. Thesis 2022 Enhancing Robustness through Domain Faithful Deep Learning Systems Balashankar, Ananth Abstract | PDF

Title: Enhancing Robustness through Domain Faithful Deep Learning Systems

Candidate: Balashankar, Ananth

Advisor(s): Lakshminarayanan Subramanian

Abstract:

In high-stakes domains like health, socio-economic inference, and content moderation, a fundamental roadblock for relying on deep learning systems is that models' predictions diverge from established domain knowledge when deployed in the real world and fail to faithfully incorporate domain-specific structure. In this talk, I will focus on the design of Domain Faithful Deep Learning Systems that translate expert-understandable domain knowledge and constraints so that they are faithfully incorporated into robust deep learning models. Through methodological contributions in causal-aware ML model design, constrained optimization, counterfactual data augmentation, and feature selection, I have addressed the core research questions of "What data distributions do domain practitioners care about?", "How to faithfully convert domain knowledge into model constraints for better generalization?" and finally "How to evaluate whether the ML models we learn are grounded in the domain knowledge and in what ways do they deviate?". I will demonstrate how, through these new approaches to incorporating domain knowledge, I have been able to meaningfully improve performance in four real-world applications: news-based famine forecasting, medication recommendations, causal question answering, and toxicity detection in online social media. These causal-aware and robust prediction models I have developed in collaboration with the World Bank and Google have shown that incorporating domain-specific structure is essential for building robust predictive models.
Ph.D. Thesis 2022 Unstructured Mesh Generation and Repairing in the Wild Hu, Yixin Abstract | PDF

Title: Unstructured Mesh Generation and Repairing in the Wild

Candidate: Hu, Yixin

Advisor(s): Daniele Panozzo

Abstract:

A mesh is a representation used to digitally represent the boundary or volume of an object for manipulation and analysis. Meshes can be used in many fields, including physical simulation in manufacturing, architecture design, and medical scan analysis. In this thesis, we propose a series of meshing algorithms, named WildMeshing, that tackles one of the long-standing, yet fundamental, problems in geometry modeling: robustly and automatically generating high-quality triangle and tetrahedral meshes and repairing imperfect geometries in the wild. Different from existing methods that make assumptions about the input and thus often fail on real-world input geometries, WildMeshing provides strict guarantees of termination and is a black box that can be easily integrated into any geometry processing pipeline in research or industry.

This thesis first investigates the problem of tetrahedralizing 3D geometries represented by piecewise linear surfaces. We propose an algorithm, TetWild, that is unconditionally robust, requires no user interaction, and can directly convert a triangle soup into an analysis-ready volumetric tetrahedral mesh. It relies on three core principles: a hybrid geometric kernel, tolerance of the mesh relative to the surface input, and iterative mesh optimization with guarantees on the output validity. We then consider improving the algorithm's efficiency for tetrahedralizing large-scale geometries. We design a new algorithm, fTetWild, that is based on the principles of TetWild but replaces the hybrid kernel with a floating-point kernel, which largely reduces runtime while keeping the same robustness. Next, this thesis explores meshing curved geometries. We start from the problem of triangulating 2D planar shapes whose boundaries are represented by curves. We introduce TriWild, an algorithm to robustly generate curved triangle meshes reproducing smooth feature curves, which leads to coarse meshes designed to match the simulation requirements of applications and avoids the geometrical errors introduced by linear meshes.

We test our algorithms on over ten thousand real-world input geometries and they achieve a 100% success rate. Our methods generate meshes without any assumptions about the input while repairing the imperfect geometries, opening the door to automatic, large-scale processing of real-world geometric data.
Ph.D. Thesis 2022 Data-driven Solutions for Addressing Two Pressing Urban Sustainability Challenges: Air Pollution Reduction and Traffic Management Iyer, Shiva Abstract | PDF

Title: Data-driven Solutions for Addressing Two Pressing Urban Sustainability Challenges: Air Pollution Reduction and Traffic Management

Candidate: Iyer, Shiva

Advisor(s): Lakshmi Subramanian

Abstract:

Data Science and AI-driven solutions are abundant today for a large variety of practical applications. With a continuing focus on urban development and sustainability, in this thesis, I present our attempts at addressing two prominent urban challenges – urban air pollution control and road traffic congestion management. For both these applications, we have developed novel methods, such as the message-passing recurrent neural network, for predictive analytics and inference. The city of Delhi has 32 air quality monitors over an area of about 900 sq km, but we do not have information on fine-grained variations in air quality in the city in order to reason about citizen exposure and identify hotspots. We have installed 28 low-cost sensors, many of them concentrated in the south Delhi region. We have developed a generic definition of "hotspots" in terms of spatio-temporal variations, using which we validate some known hotspots and discover new ones. We have also designed a novel model combining geostatistics and deep learning that is able to make spatio-temporal pollution predictions by the hour with an MAPE of about 10% across all locations.

In the context of urban traffic management, we first show that road networks can experience traffic jams over prolonged periods such as several hours due to sudden traffic bursts over short time scales. We illustrate this using real data from two different cities – New York and Nairobi. We provide a formalism for understanding the phenomena of traffic collapse and sudden jams. In the second work, we devise a novel model called the message-passing recurrent neural network (MPRNN) for modeling the propagation of congestion within a road network and forecasting congestion. The MPRNN achieves the lowest mean error of < 0.3 mph when predicting ahead in 10 minute intervals, for up to 3 road segments ahead (message passing across 3 hops). Finally, in the third work, we describe an algorithm for signal control in free-flow road networks, inspired by congestion control in computer networks. Our proposed method significantly enhances the operational capacity of free-flow road networks in the real world (by a factor of between 3× and 5×) and prevents congestion collapse.
Ph.D. Thesis 2022 Synergistic Geometry Processing: from Robust Geometric Modeling to Scalable Physical Simulation Jiang, Zhongshi Abstract | PDF

Title: Synergistic Geometry Processing: from Robust Geometric Modeling to Scalable Physical Simulation

Candidate: Jiang, Zhongshi

Advisor(s): Daniele Panozzo

Abstract:

Various applications, from artistic creation to scientific computing,
require the processing and reasoning of 3D digital objects.
The computational modeling of 3D geometric shapes, materials, and
textures, as well as the simulation of their deformation and
interactions, is essential to bring the algorithmic power of computing
to real-life manufacture, architecture, and medical device design.
Depending on the specific numerical properties, better algorithm
designs might prefer 3D data with different representations, for
example, in planes, surfaces, or inside volumes.

This thesis investigates problems related to the representations of
data on 3D shapes and across different domains,
so that computations for different stages within a pipeline may come
together synergistically without manual tuning that disrupts an
automated data flow.

I propose novel geometrical principles in various geometric modeling
and processing stages. I also showcase various geometric computing
applications that easily integrate such principles to guarantee the
geometry validity and algorithm effectiveness of surface
parameterization, rendering, deformation/animation, and mechanical
simulation.
In addition, we can finally explore creative solutions that reliably
coarsen the surface. Such simplification accelerates everyday
geometric modeling operations; the contribution also includes a
scalable method to construct coarse and curved meshes for fast
animation and scientific computing.

Furthermore, the thesis provides a declarative way to formulate mesh
processing and adaptation algorithms to facilitate the practical
development of robust and reliable mesh processing software.
Finally, the thesis includes extensive numerical validations involving
tens of thousands of complex geometric shapes. To maintain
replicability and foster further research in this direction, I have also
released the implementation and generated data as open source and
accessible.
Ph.D. Thesis 2022 Cryptography: From Practice to Theory Karthikeyan, Harish Abstract | PDF

Title: Cryptography: From Practice to Theory

Candidate: Karthikeyan, Harish

Advisor(s): Yevgeniy Dodis

Abstract:

This work is yet another attempt to turn an age-old adage on its head by deriving inspiration for theoretical research from problems that are germane to practitioners and real-world deployment. This could be viewed as a departure from the practice of creating real-world solutions that trace their origin to theoretical research, or alternatively from ex post facto theoretical analyses of practically deployed solutions that can be rather ad hoc. Specifically, we look at four different problems that are relevant for practical deployment - random number generation, provably secure block ciphers, searching over encrypted data, and forward-secure group messaging.
Ph.D. Thesis 2022 Scalable Distributed Payment Systems with Minimal Trust Assumptions Kattis, Assimakis Abstract | PDF

Title: Scalable Distributed Payment Systems with Minimal Trust Assumptions

Candidate: Kattis, Assimakis

Advisor(s): Prof. Joseph Bonneau

Abstract:

Over the last decade, the security and resilience of Bitcoin
as a stable payment network has motivated substantial study of the
viability of distributed payment protocols, with many works focusing on their suitability as alternatives to centralized payment processing. We investigate the design of scalable distributed payment systems in the permissionless setting, where no actors in the protocol can be trusted or identified with out-of-band information. Scalability is identified with two desirable properties: high transaction processing rate (or throughput) and low confirmation latency (or settlement times). We analyze the trade-offs inherent to distributed protocols that prevent
naive optimization of the above parameters and study techniques from verifiable computation as potential tools for overcoming these
bottlenecks.

One technique to increase throughput in distributed payment systems
involves the use of Succinct Non-interactive ARguments of Knowledge
(SNARKs, or SNARK proofs) to verify the integrity of transactions.
Transaction rollups are one such solution, using SNARK computations to achieve scalability. Many instantiations of rollups leveraging SNARKs show encouraging evidence that this technique could achieve commercial-capacity throughput rates if implemented on top of current distributed payment systems, even in the smart-contract setting. Although promising, all rollup approaches require the resolution of an additional yet crucial question. For protocols operating in the permissionless setting, we need to ensure that a system relying on proof generation to scale also incentivizes actors to compute proofs cheaply and quickly. This is a governance problem, as the protocol needs to decide on how participants will be chosen to perform these (expensive) computations. We pose the question of who will compute the proofs, identify it as a consensus problem and provide a technical proposal towards its resolution.

Our main contributions are twofold: in Part I, we design a
permissionless consensus protocol that solves the problem of state
verification for resource-limited clients in an incentive-compatible way. We show formal proofs of security and achieve minimal resource requirements for full ledger verification. This protocol showcases our key contribution: the design of a proof-of-work (PoW) process that computes SNARK proofs as valid outputs. Suitably choosing the statement whose proof is generated through PoW provides an incentive-compatible way to enforce the computation required by proof-based scaling techniques. In Part II, we look at one of the key components of SNARK-based throughput optimization: the non-interactive proof itself. We design a novel proof system which provides security guarantees in the trustless setting, while still being small and efficiently computable.
This proof system (a transparent SNARK, or STARK) can be used directly for scaling throughput in distributed payments through transaction rollups. In conjunction with an incentivized PoW process, it also demonstrates a way for participants in consensus to quickly generate the rollup proofs in a permissionless way.
Ph.D. Thesis 2022 Characterizing and Resolving Degeneracies in Neural Autoregressive Text Generation Kulikov, Ilia Abstract | PDF

Title: Characterizing and Resolving Degeneracies in Neural Autoregressive Text Generation

Candidate: Kulikov, Ilia

Advisor(s): Kyunghyun Cho, Jason Weston

Abstract:

Autoregressive neural networks have shown great success as part of the sequence-to-sequence framework, solving a diverse set of sequence generation tasks. These tasks include machine translation, dialogue modeling, question answering, text summarization, and sequence completion. In spite of the visible success, many challenges remain to be solved and are reported across these tasks. These challenges are usually discussed as visible deviations in the predicted sequence compared to the given reference. It is, however, not always possible to do the comparison, because interactive tasks, such as dialogue modeling, do not come together with reference sequences in the middle of the conversation at test time. We refer to such deviations as *degeneracies*, which result in degenerate sequences. In this thesis, we work on reducing widely reported degeneracies within specific tasks or in text generation in general. To do so, we often first need to formulate the degeneracy in a measurable way and hypothesize what the major cause behind it is.
We investigate the issue of oversmoothing, where the model assigns high probability to overly short sequences. We address this degeneracy from the learning aspect by proposing a novel regularization which minimizes the newly proposed oversmoothing rate directly. We show the effectiveness of the proposed method in the context of neural machine translation. Still concentrating on the learning aspect, we next address the problem of repetition in the context of sequence completion, where the generated sequences have unreasonably many repetitive substrings compared to the ones we see in the data. We propose a novel unlikelihood training procedure which allows us to penalize undesired continuations, such as repetitive substrings. Unlikelihood training significantly reduces the number of repetitions and improves the naturalness of the generated continuations. One issue with the repetition degeneracy is that it can also lead to non-termination. We study whether the original model is able to terminate the repetitive loop itself even if we do not enforce the maximum generated length during decoding. We connect this problem of non-termination with the consistency of the distribution induced by the chosen decoding algorithm. After proving that an incomplete decoding algorithm, such as beam search, may induce an inconsistent distribution when paired with a consistent model, we propose an alternative parametrization which guarantees the decoding-induced distribution to be consistent. After that, we switch to a more complicated scenario of conversation modeling, where the model has to generate a response in a multi-turn setting. We investigate the issue of unengaging or dull responses by highlighting the importance of the decoding algorithm. We observe a low diversity of beam search candidates compared to iterative beam search which explores a wider search subspace via efficient pruning. We find that the selection criterion is as important as the decoding strategy.
Along the way, we stress the importance of careful human evaluation in the presence of annotator bias and calibrate the observed scores using Bayesian inference. While we address different kinds of degeneracy, the list we tackle is not exhaustive. For instance, neural machine translation is known to produce hallucinated translations or copy large parts of the input sentence. Furthermore, degeneracies exist beyond autoregressive modeling in both non-autoregressive and semi-autoregressive settings. We believe our contributions will be helpful for future research solving new problems.
Ph.D. Thesis 2022 Finding and Fixing Undesirable Behaviors in Pretrained Language Models Perez, Ethan Abstract | PDF

Title: Finding and Fixing Undesirable Behaviors in Pretrained Language Models

Candidate: Perez, Ethan

Advisor(s): Kyunghyun Cho

Abstract:

Natural Language Processing (NLP) promises to deliver tools for a variety of impactful applications, ranging from automatic summarization to question-answering systems and conversational assistants. Recently, NLP has been revolutionized by the advent of Pretrained Language Models (PLMs). We train PLMs using "self-supervised" learning objectives -- prediction tasks that operate on unlabeled text alone, such as next word prediction or missing word prediction. As a result, PLMs are able to learn from large quantities of internet text, to obtain strong performance on many NLP tasks.

Despite the success of self-supervised objectives, they face a fundamental limitation: they train PLMs to behave in ways that are misaligned with human preferences. PLMs learn to repeat internet misinformation, offensive jokes, and personal contact information, and it is hard to control or guide the text that PLMs generate. Next, we show that PLM-based classifiers are effective at predicting which texts humans prefer. As a result, it is possible to use such classifiers as a learning signal to automatically correct the PLM. We showcase this approach to train a high-quality retrieval system, obtaining strong performance across a variety of tasks using Retrieval-Augmented Generation (RAG). Even after such training schemes, some undesirable behaviors may remain undetected during training. We thus go a step further and generate inputs that elicit undesirable behaviors from the PLM using other PLMs, to preemptively find and fix such behaviors. Overall, we find that some of the most powerful tools for aligning PLMs with human preferences are PLMs themselves.
Ph.D. Thesis 2022 Identifying, Addressing, and Understanding Challenging Cases in Machine Learning Resnick, Cinjon Abstract | PDF

Title: Identifying, Addressing, and Understanding Challenging Cases in Machine Learning

Candidate: Resnick, Cinjon

Advisor(s): Kyunghyun Cho/Joan Bruna

Abstract:

Machine learning has advanced tremendously this past decade. Object
detection systems routinely perform beyond human-level accuracy with
no loss in speed, game-playing agents play at superhuman level in real
time, and generative models write language useful enough for
downstream products. And yet, autonomous vehicles (AV) crash due to
surprising mistakes, the best gaming agents lose to simple strategies,
and our language models produce nonsensical utterances at a
surprisingly high rate. I could have chosen examples from any field
because these failures are not endemic to just vision, games, or
language. There are always challenging cases remaining after training
our system, and these cases are where the systems fail.

This thesis focuses on the challenging cases in a machine learning
system in order to improve its overall capabilities. In the first
part, we study methods for identifying the challenging cases, an
important precursor for improving the system. In the second part, we
then study methods for addressing the challenging cases, arguably the
most important part of this thesis for real-world applicability. And
in the third part, we study methods for understanding the root cause
of challenging cases, an important step in attaining guarantees about
our system's capabilities. As machine learning is practiced in many
different settings, our study does too. We explore these questions in
the context of computer vision, language learning, and task learning.
The connecting thread among them is the drive towards creating a
communicative and visually aware robot that can capably complete
household tasks. In that context, we present in parallel the Machine
Learning Application Framework that highlights where our contributions
improve downstream applications.

Altogether, this work studies how to identify, address, and
understand the most challenging cases over a diverse array of machine
learning systems. This research is imperative for deploying many
systems that we care about, including most autonomous vehicles and
health assistants. Consequently, it represents an important step
towards society's technological goals.
Ph.D. Thesis 2022 Constrained Surface Parameterization Methods with Guarantees Shen, Hanxiao Abstract | PDF

Title: Constrained Surface Parameterization Methods with Guarantees

Candidate: Shen, Hanxiao

Advisor(s): Denis Zorin/Daniele Panozzo

Abstract:

Surface parameterization for piecewise-linear surfaces is a
fundamental problem in computer graphics and geometry processing. The
generation of surface parameterization is a key step in numerous
applications like texture mapping, remeshing, quadrangulation,
inter-surface mapping, and shape analysis.

Due to its popularity, the robustness of mapping
generation methods plays a major role in its applicability. In
addition, depending on the specific requirements of the application at
hand, various formulations of constraints are used to control or guide
the parameterization.
Typical examples of the constraints are point constraints, curvature
constraints, and topological constraints. In many practical cases, to
ensure that the input assumptions of downstream algorithms are
satisfied, the constraints need to be imposed exactly (as
opposed, e.g., to approximation via penalties).

In this work, we investigate different
constraint formulations suitable for various applications and present
algorithms with guarantees to generate parameterizations fully
satisfying these constraints. In the first part of this thesis, we
develop an algorithm that solves the
classical problem of mapping a disk domain with boundary constraints;
in the special case of domains with convex boundary, it improves, in
terms of robustness, on the classical Tutte's algorithm. Utilizing it
as a building block, we design a parameterization method that supports
arbitrary positional constraints. In the second part, building on
recent developments in the theory of discrete uniformization, we
develop a highly robust algorithm for discrete conformal maps that
satisfy prescribed curvature constraints. In the third part, we
provide a constructive proof for the existence of globally seamless
parameterization that matches admissible user-prescribed cone position
and curvature constraints. Lastly, we generalize this to constraints
on holonomy angles on a homology basis of loops, which fully capture
the topology of seamless parameterizations. This method yields
parameterizations that are very close to field-aligned
parameterizations obtained using commonly used methods but, in contrast
to these methods, guarantees the existence of a solution satisfying all
constraints.
Ph.D. Thesis 2022 On deep learning tools for scientific discovery in healthcare Sudarshan, Mukund Abstract | PDF

Title: On deep learning tools for scientific discovery in healthcare

Candidate: Sudarshan, Mukund

Advisor(s): Rajesh Ranganath/Oded Regev

Abstract:

Scientists validate hypotheses by building mathematical models of the
real world. They make inferences by checking if their models are
supported by data. Often, the models are hand-crafted and do not
accurately reflect real processes. This often leads to low power to
make scientific discoveries or even false discoveries.

Machine learning can solve these issues in several ways. By allowing
data to inform the construction of models, scientists can use machine
learning to create more powerful statistical hypothesis testing
procedures, or build more realistic models of underlying processes.

This thesis details techniques for both of these approaches.
First we address the creation of machine learning-based statistical
procedures for scientific discovery. Specifically, we
discuss how machine learning can be used to construct conditional
independence tests, which are used to identify causal links in data.

We detail how such methods can be used to control the false discovery
rate when testing multiple hypotheses. We then apply these techniques
to two important domains. We solve a timely problem in medical
informatics: identifying a small set of variables that are highly
informative of whether an ICU patient with Covid will experience an
adverse event. At the height of Covid in 2020, NYU doctors used a
deployed version of this tool to quickly identify patients to
discharge and free up beds in the ICU. We also apply our methods to a
problem in cancer genomics, where the goal is to identify a set of
gene mutations that are most predictive of tumor metastasis. In the
near future, we expect tools like ours to lead to targeted gene
therapies that tailor treatments to the mutations present in an
individual's tumor.
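As a rough illustration of what controlling the false discovery rate across many hypotheses involves, the sketch below implements the standard Benjamini-Hochberg step-up procedure. It is a textbook stand-in, not the procedure developed in the thesis; the function name, the `alpha` default, and the assumption that each hypothesis already has a p-value (for example from a conditional independence test) are all illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return the indices of hypotheses rejected at FDR level alpha.

    Textbook Benjamini-Hochberg step-up procedure (illustrative only).
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Track the largest rank whose p-value falls under the BH line.
        if p_values[idx] <= alpha * rank / m:
            k = rank
    # Reject every hypothesis ranked at or below k.
    return sorted(order[:k])
```

A call such as `benjamini_hochberg(p_values, alpha=0.05)` returns the positions of the variables (or mutations) whose tests survive the correction.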

Next we detail the construction of an interpretable machine learning
model that helps understand an important step in the creation of
proteins. Specifically, we build a model to understand RNA splicing,
which involves removing non-coding regions from precursor messenger
RNA (pre-mRNA) and joining coding regions together. Our model
accurately predicts splicing outcomes across a large dataset of
sequences, but more importantly leads to several biologically
validated insights. We use the interpretable nature of our model to
infer that most splicing decisions are a function of a small set of
short sequence features. We also learn that certain pre-mRNA secondary
structures strongly inhibit the inclusion of a coding region in the
final mRNA transcript. Finally, we validate these model-driven
findings by carefully designing experiments for the wet lab.
Ph.D. Thesis 2022 Efficient Verification of Untrusted Services Tzialla, Ioanna Abstract | PDF

Title: Efficient Verification of Untrusted Services

Candidate: Tzialla, Ioanna

Advisor(s): Michael Walfish

Abstract:

Using a third-party service today requires trusting that it is executing as promised. Meanwhile, the correct execution of services is regularly impeded by failures, bugs, misconfigurations, operational mistakes, and insider attacks. Is it possible to verify, instead of trust, that a third-party service executes correctly?

We study this question for two services that execute on remote servers: transparency dictionaries, a foundational infrastructure for end-to-end encryption and other applications, and event-driven web applications. For each of these two services, we leverage their workloads to introduce a practical system that allows a verifier to get a strong security guarantee that the service executes correctly.

In the case of a transparency dictionary, this guarantee is in the form of a cryptographic proof provided by the service. Producing cryptographic proofs typically requires high resource costs. We show that tailoring the cryptographic tools used by the transparency dictionary for its use case mitigates these costs and results in a system, Verdict, that scales to dictionaries with millions of entries while imposing modest overheads on the service and its clients.

In the case of outsourced event-driven web applications, the verifier gets the required guarantee by replaying the requests on a trusted machine using Karousos, a novel record-replay system in which the service has the role of the untrusted recorder. Karousos takes advantage of the particular characteristics of event-driven web applications to enable the replayer (the verifier) to use fewer computational resources than the recorder (the service), while imposing tolerable overheads on the recorder and keeping communication small.
Ph.D. Thesis 2022 NLP Evaluation in the Time of Large Language Models Wang, Alex Abstract | PDF

Title: NLP Evaluation in the Time of Large Language Models

Candidate: Wang, Alex

Advisor(s): Kyunghyun Cho

Abstract:

The field of natural language processing (NLP) has been
dramatically impacted by the creation and proliferation of large
language models that are pretrained on Internet-scale text data. These
models have led to significant improvements on a myriad of NLP tasks.
However, as the capabilities of these models drive up performance on
existing task benchmarks, there is a critical need for evaluation
metrics that are up-to-date with current models. In this dissertation,
we develop NLP evaluation methodologies that benchmark and leverage
pretrained language models. We first present two multi-task benchmarks
for evaluating the generalization ability of NLP models and discuss
the role of these benchmarks in the development of large language
models. Next, we demonstrate that we can leverage the capabilities of
pretrained language models to develop new automatic evaluation metrics
that better measure the semantics of model-generated text.
Specifically, we make use of the question answering abilities of
pretrained models to evaluate the faithfulness of automatically
generated summaries. Finally, we explore methods for crowdsourcing
high-quality and challenging text generation data to address issues of
data quality that have been surfaced by the ability of language models
to replicate noise in benchmark datasets. Overall, we show that the
rise of pretrained language models presents both challenges and
opportunities in how we evaluate NLP systems, and that incorporating
these very models into our evaluation methodologies offers a promising
path forward.
Ph.D. Thesis 2022 Improving Sample Efficiency in Off-policy and Offline Deep Reinforcement Learning Wu, Yanqiu (Autumn) Abstract | PDF

Title: Improving Sample Efficiency in Off-policy and Offline Deep Reinforcement Learning

Candidate: Wu, Yanqiu (Autumn)

Advisor(s): Keith Ross

Abstract:

Reinforcement Learning (RL) is an area of Machine Learning, where agents are trained through trial and error to make a sequence of decisions in some given environment to achieve a goal. Traditional reinforcement learning methodology suffers from the curse of dimensionality. Fortunately, with the help of deep learning, Deep Reinforcement Learning (DRL) can overcome the issue and can often find high performing policies for applications with large state and action spaces. Over the past few years, DRL has achieved major breakthroughs in complex tasks, such as outperforming human players in video games [Mnih et al. 2013; Vinyals et al. 2019], defeating the human world champion in Go [Silver et al. 2016, 2018] and autonomous robotics control [Lillicrap et al. 2019; Haarnoja et al. 2018a].

Despite the recent breakthroughs, sample efficiency remains an important issue in deep reinforcement learning. In some complex tasks, where data collection is very expensive and agents can afford only relatively few interactions with the environment for training, sample efficiency is of central concern for making DRL practical for applications. This thesis addresses the sample efficiency problem in the context of off-policy and offline Deep Reinforcement Learning. We develop training algorithms which not only lead to policies with high asymptotic performance, but are also highly sample efficient in both online and offline settings. We demonstrate the performance of our methods in simulated robotic locomotion environments.

In the first part of this thesis, we develop a streamlined off-policy algorithm that utilizes an output normalization scheme and non-uniform sampling. We identify the squashing exploration problem and show how maximum entropy DRL [Haarnoja et al. 2018a,b] helps to resolve it. Based on our observation, we develop an output normalization scheme as an alternative to maximum entropy algorithms. We show that this normalization scheme can then be combined with non-uniform sampling, resulting in high performing policies. Next, we develop a simple off-policy algorithm that takes advantage of a high update-to-data (UTD) ratio and Q-ensembles, and which demonstrates superior sample efficiency in early-stage training and also achieves high asymptotic performance in late-stage training. We employ Q-ensembles and keep the several lowest values for updating to address the overestimation bias. Finally, we consider offline deep reinforcement learning. We introduce the novel notion of the “upper envelope of the data” and then develop an imitation-learning algorithm based on this notion. Our algorithm is computationally much faster and achieves state-of-the-art performance.
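The idea of keeping only the lowest values from a Q-ensemble can be sketched as a bootstrapped target computation. This is a minimal illustration under stated assumptions, not the algorithm from the thesis: the function name, the parameters `keep` and `gamma`, and the choice to average the retained values are all hypothetical, since the abstract only says that several lowest values are kept.

```python
def k_lowest_target(reward, next_q_values, gamma=0.99, keep=2, done=False):
    """Bootstrapped target built from the `keep` lowest ensemble estimates.

    `next_q_values` holds one Q-value estimate per ensemble member for the
    next state-action pair. Averaging the lowest estimates is an assumption
    made for illustration; it damps overestimation relative to using the
    full-ensemble mean.
    """
    if keep < 1 or not next_q_values:
        raise ValueError("need at least one Q-value and keep >= 1")
    # Discard the most optimistic ensemble members.
    lowest = sorted(next_q_values)[:keep]
    # No bootstrapping past a terminal transition.
    bootstrap = 0.0 if done else sum(lowest) / len(lowest)
    return reward + gamma * bootstrap
```

Setting `keep` to 1 recovers a plain minimum over the ensemble, while setting it to the ensemble size recovers the ensemble mean, so the parameter trades pessimism against bias.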
Ph.D. Thesis 2022 On-Policy Deep Reinforcement Learning — The Discounted and Average Reward Criteria Zhang, Yiming Abstract | PDF

Title: On-Policy Deep Reinforcement Learning — The Discounted and Average Reward Criteria

Candidate: Zhang, Yiming

Advisor(s): Keith Ross

Abstract:

Reinforcement Learning (RL) is the study of sequential decision making where an agent attempts to maximize its overall cumulative reward in some given environment. Combined with deep learning, reinforcement learning has made remarkable strides in the past decade in complex tasks such as playing video games (Mnih et al. 2013, Vinyals et al. 2019), playing Go (Silver et al. 2016, 2018), robotics (Lillicrap et al. 2016, Haarnoja et al. 2018), and chip design (Mirhoseini et al. 2021). However, despite these successes, modern RL algorithms often suffer from poor sample efficiency and lack of safety guarantees. In this thesis we tackle these issues in the context of on-policy Deep Reinforcement Learning (DRL), both theoretically and algorithmically. This work addresses both the discounted and average reward criteria. In the first part of this thesis, we develop theory for average reward on-policy reinforcement learning by extending recent results for local policy search. We show that previous work based on the discounted return (Schulman et al. 2015, Achiam et al. 2017) results in a non-meaningful bound in the average-reward setting. By addressing the average-reward criterion directly, we derive a novel bound which depends on the average divergence between the two policies and Kemeny's constant. Based on this bound, we develop an iterative procedure which produces a sequence of monotonically improved policies for the average reward criterion. We show that this iterative procedure can then be combined with classic deep reinforcement learning methods, resulting in practical DRL algorithms that target the long-run average reward criterion. Next, we develop a unifying framework for the on-policy sample efficiency problem. This methodology uses a two-step approach which first learns an optimal policy in the non-parameterized policy space before projecting said policy back into the parameter space. Our approach is general in that it applies to both discrete and continuous action spaces, and can handle a wide variety of proximity constraints. Finally, we address the problem of reinforcement learning with safety constraints. We provide theoretical support that trust region-based methods can be applied to problems with both discounted and non-discounted cost constraints. We then propose a novel first-order algorithm for policy optimization for maximizing an agent's cumulative reward while at the same time satisfying a set of cost constraints. Our algorithm is extremely simple to implement and has an approximate upper bound for worst-case constraint violation throughout training.
Ph.D. Thesis 2021 Advances in computer bridge: techniques for a partial-information, communication-based game. Bethe, Paul Abstract | PDF

Title: Advances in computer bridge: techniques for a partial-information, communication-based game.

Candidate: Bethe, Paul

Advisor(s): Ernest Davis

Abstract:

Bridge is an imperfect information game with elements of competition
against opponents as well as cooperation with a partner. Despite the
application of many tenets of artificial intelligence, humans have yet
to be consistently bested by the computer. This thesis explores AI
shortcomings in both the play and bidding phases of the game. In the
play, we examine weaknesses in cutting-edge Monte Carlo techniques
and explore both inference- and learning-based solutions. In the bidding,
we go beyond existing rule-based systems and investigate deep
reinforcement learning as a method to learn how to bid.
Ph.D. Thesis 2021 Learning Causality in Molecular Biology Cirrone, Jacopo Abstract | PDF

Title: Learning Causality in Molecular Biology

Candidate: Cirrone, Jacopo

Advisor(s): Dennis Shasha

Abstract:

The Systems Biology community has invested a great deal of effort in
modeling gene regulatory networks that should be able to (i) accurately
predict future states and (ii) identify regulatory hubs that can be
manipulated to achieve desired phenotypes. Most computational tools for
the problem embody linear models (e.g., 5*TF1 + 2*TF2 - 0.4*TF3 ...).
However, it is well known that biological interactions are highly
synergistic and non-linear. Further, those tools mostly try to directly
predict networks even when the discovered edges (which usually come from
some assay such as ChIP-seq) may have little physiological significance
(e.g., may not influence gene expression).

This thesis considers an alternative approach to inferring gene
causality. Specifically, we consider the problem of predicting the
expression of genes at a future time point in a genomic time series. In
this, we follow the philosophy that accurate prediction often
corresponds to a good understanding of causality.
The prediction may rest on several sources of data: the time point
immediately preceding t, the entire target time series preceding t, and
ancillary data. In biology, for example, the ancillary data may consist
of a network based on binding data, data from different time series,
steady state data, a community-blessed gold standard network, or some
combination of those. We introduce OutPredict, which is a machine
learning method for time series that incorporates ancillary steady state
and network data to achieve a low error in gene expression prediction.
We show that OutPredict outperforms several of the best state-of-the-art
methods for prediction. The predictive models of OutPredict in turn
generate a causal network.

Thus, this thesis presents an approach to the inference of causality
based on predictions of out-of-sample time-points based on both steady
state and time series data. Because the model for each gene identifies
those transcription factors that have the most importance in prediction,
those important transcription factors are the most likely causal
elements for that gene. We validate those predictions for a set of
well-documented transcription factors in Arabidopsis.
Because our methods apply to any situation in which there is time series
data, ancillary data, and the need for non-linear causal models, we
believe that this work will have a broad appeal to the scientific
community, specifically those studying causality networks in any
biological system.
Ph.D. Thesis 2021 Responsibility Analysis by Abstract Interpretation Deng, Chaoqiang Abstract | PDF

Title: Responsibility Analysis by Abstract Interpretation

Candidate: Deng, Chaoqiang

Advisor(s): Patrick Cousot

Abstract:

Given a behavior of interest, automatically determining the corresponding responsible entity (that is, the root cause) is a task of critical importance in various scientific fields, especially in static program analysis. Classical static analysis techniques (e.g. dependency analysis, taint analysis, slicing, etc.) assist programmers in narrowing down the scope of responsibility, but none of them can explicitly identify the responsible entity. Meanwhile, causality analysis is generally not pertinent for analyzing programs, and the structural equations model (SEM) of actual causality misses some information inherent in programs (e.g. temporal information, and whether an entity is free to make choices or not), making the corresponding program analysis imprecise.

In this dissertation, inspired by a classic forest fire example used in defining causality, a novel definition of responsibility based on the abstraction of trace semantics is proposed, which is expressive and generic enough to cope with both program analyses and tasks in other scientific fields. Briefly speaking, an action aR is responsible for behavior B in a certain trace if and only if aR is free to make choices, and such a choice is the first one that ensures the occurrence of B in that trace. Such a definition makes use of the information regarding the temporal ordering of actions, as well as whether an action has free choices or not. In addition, our definition of responsibility takes into account the cognizance of the observer, which, to the best of our knowledge, is a new idea in program analysis. Compared to current dependency and causality analysis methods, the responsibility analysis is demonstrated to be more precise in many examples.

Furthermore, this dissertation proposes a sound framework of abstract responsibility analysis, which allows a balance between cost and precision to solve the undecidable problem of responsibility. Essentially, the abstract analysis builds a trace partitioning automaton by an iteration of over-approximating forward reachability analysis with trace partitioning and under-approximating/over-approximating backward impossible failure accessibility analysis, and determines the bounds of potentially responsible entities along paths in the automaton. Unlike the concrete responsibility analysis, which identifies exactly a single action as the responsible entity along every concrete trace, the abstract analysis may lose some precision and find multiple actions potentially responsible along each automaton path. However, soundness is preserved, and every responsible entity in the concrete is guaranteed to be also found responsible in the abstract.
Ph.D. Thesis 2021 Enhancing Collaboration and Productivity for Virtual and Augmented Reality He, Zhenyi Abstract | PDF

Title: Enhancing Collaboration and Productivity for Virtual and Augmented Reality

Candidate: He, Zhenyi

Advisor(s): Ken Perlin

Abstract:

Immersive environments such as Virtual Reality (VR) and Augmented Reality (AR) are now receiving more and more attention. Although VR and AR have largely been used for individual entertainment experiences, they also possess huge potential as a platform for the support of collaboration and productivity. My thesis work is concerned with enabling VR/AR to be flexibly adapted for collaborative and productive uses. I approach this scope from several facets: a new haptic user interface based on actuated robots to bridge the virtual and physical worlds, a reconfigurable framework for both co-located and geographically dispersed multi-user communication, and a text entry system in which users type by tapping their fingers, without needing to look at their hands or be aware of their hand positions. Further, I extend these ideas to a daily video conferencing experience that requires minimal hardware.
Ph.D. Thesis 2021 Larger-Context Neural Machine Translation Jean, Sébastien Abstract | PDF

Title: Larger-Context Neural Machine Translation

Candidate: Jean, Sébastien

Advisor(s): Kyunghyun Cho

Abstract:

Translation helps connect people by bridging language barriers. It can make travel more enjoyable, allow our minds to explore imaginary worlds, let us talk to others, and so on. Given the need for translation, but the limited availability of human translators, machine translation has flourished. Most systems translate sentences one by one, ignoring their context, which isn't always sufficient as the missing information can lead to incorrect or inconsistent translations. We believe that neural machine translation (NMT) is particularly well-suited to incorporate the surrounding context. Indeed, NMT systems can attend to arbitrarily distant words, while the use of continuous representations improves generalization on unseen examples.

As such, in this thesis, we extend neural machine translation to leverage information from the surrounding context. To do so, we first highlight the potential of the then-nascent NMT paradigm. We subsequently introduce architectural changes to integrate information from the surrounding document, initially starting from the preceding sentence. We further encourage models to use context from either a learning or data augmentation perspective. We also consider the efficient use of document-level neural language models for this task. While some challenges still remain, our work has helped establish larger-context translation on a solid footing, and we are optimistic about future progress.
Ph.D. Thesis 2021 Improving Sample Efficiency of Imitation and Reinforcement Learning Kostrikov, Ilya Abstract | PDF

Title: Improving Sample Efficiency of Imitation and Reinforcement Learning

Candidate: Kostrikov, Ilya

Advisor(s): Rob Fergus

Abstract:

Reinforcement Learning (RL) is an area of machine learning focused on learning to make a sequence of actions in an environment that maximizes cumulative rewards. Combined with Deep Learning, Reinforcement Learning has made significant progress over the last decade across various domains. Notable successes include achieving superhuman performance on Atari games, Go, StarCraft II, Dota 2, and various continuous control tasks.

However, RL's success stories are often limited to games and simulations where it is possible to generate a large amount of training data. This thesis describes several methods focused on improving sample efficiency to enable a wider variety of RL applications. For the first half of the thesis, we focus on Imitation Learning, where ground truth rewards are usually unknown, and expert demonstrations define optimality. First, we introduce a method for robust and sample efficient imitation learning. We adapt an imitation learning approach where an agent tries to mimic a domain expert using a GAN-like framework called GAIL. We identify two primary sources of sample inefficiency associated with this approach: on-policy RL and GAN discriminator training. We show that sample inefficiency can be mitigated by performing off-policy RL training combined with off-policy training of the discriminator. We also identify and resolve some task-specific biases associated with the family of adversarial imitation learning algorithms based on GAIL. Then, we derive a principled off-policy formulation of robust imitation learning that is entirely offline and allows us to learn a policy that imitates the expert relying only on the previously collected data. This work concludes the part of the thesis focused on imitation learning, and for the rest of the thesis, we focus on online and offline RL where we have access to environment rewards. We observe that off-policy RL from pixels suffers from overfitting and propose a simple solution inspired by image augmentation techniques from Computer Vision. Finally, we introduce a method for offline RL that utilizes a pre-trained behavioral policy to improve the robustness of behavior regularization widely used in the context of offline RL. In contrast to prior work on Offline RL, this method utilizes the behavior policy to regularize the critic instead of constraining the training policy.
Ph.D. Thesis 2021 Latent Variable Models and Iterative Refinement for Non-Autoregressive Neural Machine Translation Lee, Jason Abstract | PDF

Title: Latent Variable Models and Iterative Refinement for Non-Autoregressive Neural Machine Translation

Candidate: Lee, Jason

Advisor(s): Kyunghyun Cho

Abstract:

Deep neural networks have fundamentally transformed the field of machine translation, and replaced statistical phrase-based approaches to serve translations to millions of users in production systems every day. Despite impressive progress in translation accuracy, improving decoding speed remains a key challenge as most systems are autoregressive and generate a sentence word-by-word. As neural machine translation (NMT) models are becoming increasingly deep and complex, there is a growing need for more efficient translation systems with sub-linear or constant inference latency, with respect to the sentence length. The main challenge in non-autoregressive machine translation is capturing the dependencies between tokens in a target sentence without autoregression. Motivated by a rich history of probabilistic graphical models in sequence generation, this thesis proposes to use latent variables to model intra-sentence dependencies, such that the output distribution can be factorized given the latent variables. We also present several inference algorithms for non-autoregressive machine translation based on iterative refinement, which revises a sentence over multiple iterations. Our non-autoregressive models based on latent variables and iterative refinement can deliver significant decoding speedup with comparable translation accuracy relative to a strong autoregressive baseline. Finally, we investigate the correlation between training (log-likelihood) and test objective (BLEU) of several model families. We observe the two metrics are not correlated when comparing models from different families (e.g. between autoregressive and latent variable models).
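The factorization described in this abstract can be written out in a generic form. The notation is an assumption introduced here for illustration and may differ from the thesis: x is the source sentence, y = (y_1, ..., y_T) is the target sentence, z is the latent variable, and the target length T is treated as given.

```latex
p(y \mid x) = \int p(z \mid x) \prod_{t=1}^{T} p(y_t \mid z, x) \, dz
```

Because each target token depends only on z and x, all T tokens can be generated in parallel once z is fixed, while dependencies between tokens are carried by z rather than by left-to-right conditioning.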
Ph.D. Thesis 2021 Neural Structured Prediction using Iterative Refinement with Applications to Text and Molecule Generation Mansimov, Elman Abstract | PDF

Title: Neural Structured Prediction using Iterative Refinement with Applications to Text and Molecule Generation

Candidate: Mansimov, Elman

Advisor(s): Kyunghyun Cho

Abstract:

Humans excel at generating structured data in the form of images, text, speech, molecules, computer code, and others. Researchers have spent several decades proposing various solutions for the effective generation of these structured objects in a data-driven way, known as structured prediction. With the revival of deep neural networks, autoregressive models that process structured objects in fixed left-to-right monotonic ordering became a de facto solution for this problem. Notable successes of autoregressive models include neural machine translation [Sutskever et al., 2014, Bahdanau et al., 2014, Vaswani et al., 2017], open-ended text generation [Radford et al., 2019, Brown et al., 2020], and text-to-speech synthesis [van den Oord et al., 2016], among many others.

Despite the considerable success of autoregressive models on many applications, a natural question arises whether alternative approaches are possible for structured prediction. This thesis describes a novel method for structured prediction based on the principle of iterative refinement with a particular focus on applications to text and molecule generation. We first introduce the iterative refinement framework for text generation. Starting from a blank sentence, the iterative refinement approach gradually refines text over multiple steps. Using this approach, we show that we can flexibly generate text in various ways, such as generating all or some words in parallel and generating text according to an ordering learned from the data. We show that iterative refinement achieves competitive performance compared to autoregressive models while delivering a speedup in decoding. We conclude this thesis by showing how we can adapt the iterative refinement framework originally introduced for text generation for molecule generation. In particular, we demonstrate two iterative refinement approaches for molecular graph generation and molecular geometry prediction. We anticipate that models based on iterative refinement will be broadly applicable to other domains of interest.
Ph.D. Thesis 2021 Scalable Particulate Flow Simulations with Boundary Integral Equations Morse, Matthew Abstract | PDF

Title: Scalable Particulate Flow Simulations with Boundary Integral Equations

Candidate: Morse, Matthew

Advisor(s): Denis Zorin

Abstract:

Numerical simulation of complex particulate flows, and of red blood cell flows through capillaries in particular, is an important investigational tool in the biological sciences. The ability to rapidly evaluate the impact of vessel and cell geometries, plasma viscosity, and particulate densities on macroscopic physiology is crucial to pursuing further biological understanding. Experimental techniques are costly and time-consuming, while analytical approaches are often of limited practical use in realistic scenarios, ultimately underscoring the importance of a computational approach.

In this work, we construct such a simulation, capable of simulating microliters of blood flowing through realistic vasculature, along with more general particulate suspensions. Due to the micrometer length scales of typical capillaries, we can model the blood plasma as a Stokesian fluid and red blood cells as inextensible, deformable membranes. By reformulating the viscous flow as a set of boundary integral equations, we are able to produce a method that has optimal complexity with high-order accuracy that is capable of handling dense particulate suspensions in complex geometries.

This approach relies on a novel, robust solver for elliptic partial differential equations, applied to Stokes flow. A core component of the solver is a novel fast algorithm to compute the value of the solution near and on the domain boundary, which we have named \qbkix. We provide a set of algorithms to guarantee the accuracy of \qbkix on piecewise smooth surfaces, discuss the error behavior and complexity of \qbkix, and evaluate its performance.

Leveraging this solver in a confined blood flow simulation involves advecting deformable particulates along the flow trajectory. Large timesteps are required for an efficient simulation, but can cause collisions among cells and with the vessel wall if performed naively. We present collision detection and resolution algorithms for the red blood cells and the blood vessel. We parallelize \qbkix and the collision algorithms and scale the final simulation to nearly 35,000 cores.
Ph.D. Thesis 2021 Towards More General and Adaptive Deep Reinforcement Learning Agents Raileanu, Roberta Abstract | PDF

Title: Towards More General and Adaptive Deep Reinforcement Learning Agents

Candidate: Raileanu, Roberta

Advisor(s): Rob Fergus

Abstract:

Building agents with general skills that can be applied in a wide
range of settings has been a long-standing problem in machine
learning. The most popular framework for training agents to make
sequential decisions in order to maximize reward in a given
environment is Reinforcement Learning (RL). Over the last decade, deep
reinforcement learning, where RL agents are parameterized by neural
networks, has achieved impressive results on a number of tasks, from
games such as Atari, Go, StarCraft, or Dota, to continuous control
tasks with applications in robotics.

However, current RL agents are prone to overfitting and struggle to
generalize when even minor perturbations are applied to the training
environment. This hinders progress on real-world applications such as
autonomous vehicles or home robots, where agents need to deal with a
large variety of scenarios. In this thesis, we introduce several
methods for improving the versatility of deep reinforcement learning
agents. We start by studying the problem of zero-shot generalization
to new instances of a task after training on a limited number of
environments. We first propose an approach for regularizing the policy
and value function of an RL agent and automatically finding an
effective type of data augmentation for a given task. We also identify
that there is an asymmetry between the information needed to represent
the optimal policy and the true value function, which leads to
overfitting when using standard deep RL algorithms. As a step towards
solving this problem, we propose a method which decouples the
optimization of the policy and value, and constrains the
representation to be invariant to the task instance. Next, we focus on
the problem of learning general exploration strategies for
procedurally generated environments with sparse rewards. We formulate
a new type of intrinsic reward which encourages agents to impact their
environments and show that it outperforms other popular exploration
methods. Then, we discuss a novel approach for fast adaptation to new
dynamics. We show that our method, which leverages self-supervised
techniques to learn policy and environment embeddings, enables
adaptation within a single episode on a number of continuous control
tasks. Finally, we investigate how agents can learn more flexible
strategies for interacting with different opponents and collaborators.
Ph.D. Thesis 2021 Theory and Algorithms for Several Central Problems in Large-Scale Machine Learning Storcheus, Dmitry Abstract | PDF

Title: Theory and Algorithms for Several Central Problems in Large-Scale Machine Learning

Candidate: Storcheus, Dmitry

Advisor(s): Mehryar Mohri

Abstract:

This Ph.D. dissertation presents a fundamental analysis of several central problems in large-scale machine learning. We derive novel, scalable algorithms supported by strong theoretical guarantees for the most practically important large-scale learning scenarios. These scenarios include extensions of standard supervised learning to multiple base hypothesis spaces, multiple objective functions, multiple distributions, multiple classes, and high-dimensional feature spaces.

A standard supervised learning scenario consists of fitting a predictor from a fixed hypothesis space that minimizes a certain empirical loss on a sample drawn i.i.d. from a particular distribution. The richness of modern machine learning applications requires the learning scenario to be large-scale by having the ability to learn from many training examples. While scalability in terms of many examples is widely studied, the current state of research in the field overlooks other scenarios and directions for scalability that may be even more important than many training examples. Examples include allowing the learner to select predictors from multiple hypothesis spaces of varying complexity, or to fit multiple objective functions.

While the problems mentioned above may seem to relate to separate aspects of large-scale learning, this thesis provides a unified theoretical analysis framework that brings these central problems together. This framework is based on the Rademacher complexity analysis as well as on the Empirical and Structural Risk Minimization principles.
Ph.D. Thesis 2021 The Evolutionary Maps of Data Tamaskar, Abhinav Abstract | PDF

Title: The Evolutionary Maps of Data

Candidate: Tamaskar, Abhinav

Advisor(s): Bud Mishra

Abstract:

We present a geometric view of analyzing temporal causal models from the perspective of topology and limit graphs. We first give an intuitive overview of the topological techniques used and the theory of limit graphs. We then briefly describe the Suppes-Bayes causal networks, which are used as the temporal causal models. We briefly describe evolutionary models used in the scientific literature and show an efficient model for performing simulations on generalized large-scale evolutionary networks. We then present techniques for analyzing large-scale evolutionary populations and showcase their generality through two real-world examples: (1) linguistic data from Reddit over the course of 5 years, where we show the existence of echo chambers and give a metric to analyze similarities of populations over time, and (2) the TCGA and COSMIC datasets for cancer mutation of over 11,000 genes, where we use an approximation metric on the space of causal models to find similar cancer types and perform transfer learning to boost survival forecasting through black-box learning models.
Ph.D. Thesis 2021 Order and Learning in Sequential Neural Structured Prediction Welleck, Sean Abstract | PDF

Title: Order and Learning in Sequential Neural Structured Prediction

Candidate: Welleck, Sean

Advisor(s): Kyunghyun Cho

Abstract:

Structured objects such as sets, trees, and sequences appear in a variety of scientific and industrial domains. Developing machine learning methods that generate these objects is of interest for both scientific understanding and practical applications. One approach, sequential neural structured prediction, decomposes generation into a sequence of predictions, with each prediction made by a deep neural network. Choosing an appropriate sequential representation of each structured object and selecting an effective learning objective are key to adopting this approach. The standard method for learning specifies a canonical ordering of elements in the sequential representation and maximizes the likelihood of the resulting sequences. We develop two streams of research that explore alternatives to this fixed-order, maximum likelihood approach for sequentially generating sets, trees, and sequences, with a focus on natural language processing applications.

First, we focus on text generation and study degenerate properties of fixed-order maximum-likelihood learning, motivating new learning methods. We characterize the degeneracy using three properties observed in generated text: non-termination, logical incoherence, and repetition. To study non-termination, we develop theory that allows us to prove that conventional text generation methods can generate infinite-length sequences with high probability. To study logical incoherence, we create a dataset for investigating the degree to which a model logically contradicts its preceding statements. For reducing degeneration, we develop unlikelihood training, a learning method which penalizes task-specific textual properties. In the second part of the thesis, we remove the requirement of a fixed generation order with a learning framework called non-monotonic generation, which yields models that select input-dependent generation orders. We use non-monotonic generation to generate multisets, parse trees, and text. The investigations and techniques presented in this thesis lead to promising directions for future work.
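As a rough illustration of the unlikelihood idea mentioned above, the sketch below adds a penalty on probability mass assigned to unwanted tokens to the usual likelihood term for a single prediction step. It assumes one common token-level form of the penalty, -log(1 - p(c)) for each negative candidate c. The function name, the dictionary representation of the distribution, the weight `alpha`, and the choice of negative candidates are all illustrative assumptions, not the thesis's exact setup.

```python
import math


def unlikelihood_step_loss(probs, target, negatives, alpha=1.0):
    """Loss for one prediction step (hypothetical helper).

    probs:     dict mapping each token to its predicted probability
    target:    the reference token (likelihood term)
    negatives: tokens the model should avoid, e.g. ones it has already repeated
    alpha:     weight of the unlikelihood penalty (assumed hyperparameter)
    """
    # Standard maximum-likelihood term: reward probability on the target.
    likelihood_term = -math.log(probs[target])
    # Unlikelihood term: penalize probability placed on each negative candidate.
    # The target itself is never treated as a negative.
    penalty = -sum(math.log(1.0 - probs[c]) for c in negatives if c != target)
    return likelihood_term + alpha * penalty


# Toy next-token distribution over a three-word vocabulary.
step_probs = {"the": 0.5, "cat": 0.25, "sat": 0.25}
plain = unlikelihood_step_loss(step_probs, "the", negatives=[])
penalized = unlikelihood_step_loss(step_probs, "the", negatives=["cat"])
print(plain, penalized)
```

With an empty negative set the loss reduces to the ordinary negative log-likelihood; each negative candidate adds a cost that grows with the probability the model assigns to it.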
Ph.D. Thesis 2021 Techniques for Sample-Efficient Reinforcement Learning Whitney, William Abstract | PDF

Title: Techniques for Sample-Efficient Reinforcement Learning

Candidate: Whitney, William

Advisor(s): Kyunghyun Cho

Abstract:

By leveraging advances in deep learning, reinforcement learning (RL) has recently made such advances that for any task which has a simulator, and thus enables the collection of nearly unlimited data, it might now be expected to yield superhuman performance. However, many practically relevant tasks take place in the physical world. Constructing physical simulators of sufficient fidelity and correspondence to transfer is a non-trivial challenge, so for the majority of physical tasks at least some amount of training on real data is required. Collecting data in the real world is sufficiently expensive that it makes up much of the cost of training a reinforcement learning agent.

This thesis focuses on improving the sample efficiency of reinforcement learning in order to make it more practical to use on physical systems. It includes three approaches to this goal. The first part studies the data collection process, and in particular the opportunity for exploration to improve the sample efficiency of RL. The second part considers the use of representation learning to improve generalization, and thus sample efficiency, in reinforcement learning. The third part examines the offline RL setting, which consists of pure policy optimization using a fixed dataset and therefore does not require additional data collection.

Taken together, this work studies techniques for improving the sample efficiency of reinforcement learning by collecting data which is more useful and diverse, then learning more from every sample.
It represents an early step on the path to RL as an everyday tool for control of physical systems.
Ph.D. Thesis 2021 Methods to Improve Knowledge Transfer Efficiency for Data-limited Problems in Genomics Yi, Ren Abstract | PDF

Title: Methods to Improve Knowledge Transfer Efficiency for Data-limited Problems in Genomics

Candidate: Yi, Ren

Advisor(s): Richard Bonneau

Abstract:

The recent advancement in computational genomics has greatly benefited from the explosion of high-throughput genomic data and similar growth in biological databases. However, as more sequencing technologies become available and large genomic consortiums start to crowdsource data from larger cohorts of research groups, data heterogeneity has become an increasingly prominent issue. Data integration across multiple data sources and data modalities becomes particularly important for a greater number of biological systems. High-throughput omics data are typically highly skewed towards a small number of model organisms, factors, and conditions with which wet-lab experiments have higher success rates. This skew further introduces technical challenges when building machine learning models for problems with limited data. This thesis describes methods that improve knowledge transfer efficiency for learning data-limited problems through effective task-specific feature representation in the multitask learning setting. We demonstrate the performance of our methods in two genomic problems -- genetic variant calling and cell type-specific transcription factor binding predictions.
Ph.D. Thesis 2020 Out of Distribution Generalization in Machine Learning Arjovsky, Martin Abstract | PDF

Title: Out of Distribution Generalization in Machine Learning

Candidate: Arjovsky, Martin

Advisor(s): Leon Bottou

Abstract:

Machine learning has achieved tremendous success in a variety of
domains in recent years. However, a lot of these success stories have
been in places where the training and the testing distributions are
extremely similar to each other. In everyday situations, when models
are tested on data slightly different from what they were trained on,
ML algorithms can fail spectacularly. This research attempts to
formally define this problem, what sets of assumptions are reasonable
to make about our data, and what kind of guarantees we hope to obtain
from them.
Then, we focus on a certain class of out of distribution problems,
their assumptions, and introduce simple algorithms that follow from
these assumptions, and that are able to provide more reliable
generalization. A central topic in the thesis is the strong link
between discovering the causal structure of the data, finding features
that are reliable (when using them to predict) regardless of their
context, and out of distribution generalization.
Ph.D. Thesis 2020 Behavior of the Limited-Memory BFGS Method on Nonsmooth Optimization Problems Asl, Azam Abstract | PDF

Title: Behavior of the Limited-Memory BFGS Method on Nonsmooth Optimization Problems

Candidate: Asl, Azam

Advisor(s): Michael Overton

Abstract:

The limited memory BFGS (Broyden-Fletcher-Goldfarb-Shanno) method, abbreviated L-BFGS, is widely used for large-scale unconstrained optimization, but its behavior on nonsmooth problems has received little attention. In this thesis we give the first convergence analysis of the L-BFGS method applied to nonsmooth functions. We focus on the simplest version of the method, sometimes known as memoryless BFGS, which uses just one update. L-BFGS can be used with or without “scaling”; the use of scaling is normally recommended. We consider a simple class of convex piecewise linear nonsmooth functions f that are unbounded below. On this class of problems, we show that memoryless BFGS with scaling, using any Armijo-Wolfe line search and initialized at any point where f is differentiable, generates iterates that converge to a non-optimal point, if a certain condition relating the Lipschitz constant of f to the line search Armijo parameter holds. We also present an analysis of the ordinary gradient method with the same line search applied to the same class of functions, giving conditions under which it also fails. However, scaled memoryless BFGS fails under a weaker condition relating the Lipschitz constant of the function to the line search Armijo parameter than that implying failure of the gradient method. Furthermore, in sharp contrast to the gradient method, if a specific standard Armijo-Wolfe bracketing line search is used, scaled memoryless BFGS fails if the Lipschitz constant is sufficiently large regardless of the Armijo parameter. Our experimental results demonstrate that our analysis is tight on this class of functions, and that similar results likely hold for L-BFGS with any fixed number of updates. In contrast, the “full” BFGS method is remarkably effective for minimizing nonsmooth functions, but it is not a practical approach when the number of variables is large.
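To make the object of study concrete, the following sketch computes the search direction of L-BFGS with a single stored update (the memoryless variant discussed above) using the standard two-loop recursion. It is a minimal illustration, assuming the usual scaling factor s·y / y·y and a curvature pair with s·y > 0. The function names are hypothetical, and the Armijo-Wolfe line search that the analysis depends on is not included.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def memoryless_bfgs_direction(grad, s, y, scaled=True):
    """Return -H * grad, where H is built from a single update pair (s, y).

    grad: current gradient
    s:    previous step, x_k - x_{k-1}
    y:    previous gradient change, g_k - g_{k-1}; requires s.y > 0
    """
    sy = dot(s, y)
    if sy <= 0:
        raise ValueError("curvature condition s.y > 0 violated")
    rho = 1.0 / sy
    # First loop of the two-loop recursion (only one pair is stored).
    alpha = rho * dot(s, grad)
    q = [g - alpha * yi for g, yi in zip(grad, y)]
    # Initial inverse-Hessian approximation: gamma * I with scaling, I without.
    gamma = sy / dot(y, y) if scaled else 1.0
    r = [gamma * qi for qi in q]
    # Second loop.
    beta = rho * dot(y, r)
    r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]


s, y = [1.0, 0.0], [2.0, 1.0]
d_secant = memoryless_bfgs_direction(y, s, y)      # gradient equal to y
d = memoryless_bfgs_direction([1.0, 3.0], s, y)    # an arbitrary gradient
print(d_secant, d)
```

A quick sanity check is the secant condition: passing `y` as the gradient should return `-s`, and for any nonzero gradient the result should be a descent direction.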

We also conduct extensive experiments applying L-BFGS, both scaled and unscaled, with various choices for the number of updates, on many other classes of convex nonsmooth functions, ranging from artificially devised, highly ill-conditioned nonsmooth problems to eigenvalue optimization problems that are equivalent to semidefinite programming problems arising from applications. We also apply L-BFGS to smoothed versions of these problems. We find that although L-BFGS is usually a reliable method for minimizing ill-conditioned smooth problems, when the condition number is so large that the function is effectively nonsmooth, L-BFGS consistently fails. This behavior is in sharp contrast to the behavior of full BFGS, which is consistently reliable for nonsmooth optimization problems. We arrive at the conclusion that, for large-scale nonsmooth optimization problems for which BFGS and other methods are not practical, it is far preferable to apply L-BFGS to a smoothed variant of a nonsmooth problem than to apply it directly to the nonsmooth problem.
Ph.D. Thesis 2020 Enhanced Representations for Relations by Multi-task Learning Fu, Lisheng Abstract | PDF

Title: Enhanced Representations for Relations by Multi-task Learning

Candidate: Fu, Lisheng

Advisor(s): Grishman, Ralph

Abstract:

A relation describes the relationship between a pair of entities. Relation Extraction is the process of extracting relations from free text and converting them to structured machine-readable knowledge. This process can facilitate building and extending knowledge bases, and therefore can benefit a variety of natural language processing applications such as Question Answering and Summarization.

Typical relation extraction projects start by defining a relation schema: a set of mutually exclusive relation types. Based on these definitions, all instances of these relations in a text corpus are labeled by hand, producing a dataset which can be used to train a statistical model. Labeling relations in text is difficult and time-consuming, so only a limited number of relation datasets have been developed in this way. New applications will give rise to new schemas, so the lack of high-quality labeled data is almost inevitable for Relation Extraction.

Despite limited labeled samples in relation datasets, neural net models have been shown to be more effective than traditional methods in learning feature representations with pre-trained word embeddings. In the context of representation learning, this thesis presents multi-task learning frameworks to learn enhanced representations for relations. It shows how to learn better feature representations in both unsupervised and supervised ways. First, the dissertation shows how to learn domain invariant representations using unlabeled entity pairs. Then it shows how to learn a unified encoder by combining multiple annotated datasets. Finally, it shows how to learn the relatedness between relation types across different relation schemas. These techniques improve the relation models without requiring more annotation from the target dataset. The multi-task learning frameworks could be an efficient toolkit for relation extraction in general.
Ph.D. Thesis 2020 Scaling Multi-user Virtual and Augmented Reality Herscher, Sebastian Abstract | PDF

Title: Scaling Multi-user Virtual and Augmented Reality

Candidate: Herscher, Sebastian

Advisor(s): Perlin, Ken

Abstract:

The Virtual and Augmented Reality (XR) ecosystems have been gaining substantial momentum and traction within the gaming, entertainment, enterprise, and training markets in the past half-decade, but have been hampered by limitations in concurrent user count, throughput, and accessibility to mass audiences. Although a litany of XR devices have been made available for public purchase, most XR experiences have been developed for either a single user or a small set of users at a time. Few systems or experiments in co-located XR environments have expanded past a small set of users, leaving the paradigm of being part of a larger virtual audience relatively untested. This thesis presents a set of components, systems, and experiments that assist in the creation, deployment, and scaling of multi-user virtual and augmented reality experiences, and outlines the strengths of techniques found in traditional co-located media for the design space of scaled co-located XR.
Ph.D. Thesis 2020 Auditing Outsourced Services Tan, Cheng Abstract | PDF

Title: Auditing Outsourced Services

Candidate: Tan, Cheng

Advisor(s): Michael Walfish

Abstract:

Outsourcing to the cloud is based on assuming that remote servers behave as expected, even under failures, bugs, misconfigurations, operational mistakes, insider threats, and external attacks. Can we instead verify their behavior? There have been various attempts at such verification, but these attempts have had to choose: comprehensive guarantees or good performance? This dissertation studies how to get both.

This dissertation focuses on two essential services: outsourced computation and outsourced databases. Verifying them correspondingly introduces two new abstract problems. We call the first problem the Efficient Server Audit Problem, which examines how to efficiently verify a concurrent and untrusted server. The second problem is verifying a core correctness contract of black-box databases while scaling to real-world online workloads.

To address the two problems, this dissertation respectively introduces two systems: orochi and cobra. Both systems tolerate arbitrary failures in the service provider, and have good performance: in our experiments, orochi’s verifier achieves 5.6–10.9x speedup versus simply re-executing inputs, with less than 10% CPU overhead on the server side; cobra improves over baselines by 10x in verification cost, with modest overhead on clients (less than 5% throughput degradation and about 7% 90th-percentile latency increases).
Ph.D. Thesis 2020 Market Efficiency and Dynamics Tao, Yixin Abstract | PDF

Title: Market Efficiency and Dynamics

Candidate: Tao, Yixin

Advisor(s): Richard Cole

Abstract:

General equilibrium theory, initiated by Walras over a century ago, explains the interaction between supply and demand in an economy. In this dissertation, we look at Fisher Markets, which are a particular case of the general equilibrium theory. We consider two issues in Fisher Markets: strategic behavior and dynamics.

Strategic behavior is usually considered in a game, such as an auction, in which participants may choose not to report their real preferences in order to improve their payoff. In general equilibrium theory, buyers are usually considered to be non-strategic: given the prices, buyers will maximize their true utilities by properly distributing their money on different goods. In this case, the market equilibrium should be efficient. However, the prices in the market equilibrium are influenced by the demands of the buyers. In principle, buyers can affect prices by changing their demands, which may improve buyers' final utilities. This may result in inefficient outcomes. In this thesis, we investigate this possibility in large Fisher markets. We show that the market will approach full efficiency as the market becomes larger and larger. We also show a similar result for the Walrasian mechanism in large settings.

We also study two dynamics in Fisher Markets in this dissertation:
1. Proportional response is a buyer-oriented dynamics. Each round, buyers update their spending in proportion to the utilities they received in the last round, where prices are the sum of the spendings. This dissertation establishes new convergence results for two generalizations of proportional response in Fisher markets with buyers having CES utility functions. The starting points are respectively a new convex and a new convex-concave formulation of such markets. The two generalizations of proportional response correspond to suitable mirror descent algorithms applied to these formulations. Among other results, we analyze a damped generalized proportional response and show a linear rate of convergence in a Fisher market with buyers whose utility functions cover the full spectrum of CES utilities aside from the extremes of linear and Leontief utilities; when these utilities are included, we obtain an empirical O(1 / T) rate of convergence.
2. Tatonnement is considered the most natural dynamics in Fisher Markets: the price of a good is raised if the demand exceeds the supply of the good, and lowered if the demand falls short of the supply. Implicitly, buyers' demands are assumed to be a best response to the current prices. This dissertation addresses a lack of robustness in existing convergence results for discrete forms of tatonnement, including the fact that it need not converge when buyers have linear utility functions. This dissertation shows that for Fisher markets with buyers having CES utility functions, including linear utility functions, tatonnement will converge quickly to an approximate equilibrium (i.e., at a linear rate), modulo a suitable large market assumption. The quality of the approximation is a function of the parameters of the large market assumption.
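The two dynamics in the list above can be sketched on a toy market. The code below is only an illustration under simplifying assumptions that are not taken from the dissertation: proportional response is shown for linear utilities with strictly positive coefficients, tatonnement is shown for Cobb-Douglas buyers (a limiting case of CES) with one unit of each good and a capped multiplicative price update, and all function names, step sizes, and numbers are hypothetical.

```python
def proportional_response(utilities, budgets, rounds=50):
    """Linear utilities u_i = sum_j utilities[i][j] * x_ij, all coefficients > 0."""
    n, m = len(utilities), len(utilities[0])
    # Start by splitting every budget evenly across the goods.
    spend = [[budgets[i] / m] * m for i in range(n)]
    for _ in range(rounds):
        # Prices are the sum of the spending on each good.
        prices = [sum(spend[i][j] for i in range(n)) for j in range(m)]
        new_spend = []
        for i in range(n):
            # Utility from good j last round: a_ij * x_ij, with x_ij = b_ij / p_j.
            gains = [utilities[i][j] * spend[i][j] / prices[j] for j in range(m)]
            total = sum(gains)
            # Next round, spend in proportion to those utilities.
            new_spend.append([budgets[i] * g / total for g in gains])
        spend = new_spend
    prices = [sum(spend[i][j] for i in range(n)) for j in range(m)]
    return spend, prices


def tatonnement(shares, budgets, prices, step=0.5, rounds=100):
    """Cobb-Douglas buyers: buyer i spends shares[i][j] * budgets[i] on good j."""
    n, m = len(shares), len(prices)
    prices = list(prices)
    for _ in range(rounds):
        for j in range(m):
            # Best-response demand for good j at the current price.
            demand = sum(shares[i][j] * budgets[i] for i in range(n)) / prices[j]
            excess = demand - 1.0  # supply of each good is one unit
            # Raise the price when demand exceeds supply, lower it otherwise.
            prices[j] *= 1.0 + step * min(1.0, excess)
    return prices


spend, pr_prices = proportional_response([[2.0, 1.0], [1.0, 2.0]], [1.0, 1.0])
tat_prices = tatonnement([[0.75, 0.25], [0.25, 0.75]], [1.0, 1.0], [0.2, 3.0])
print(pr_prices, tat_prices)
```

In this symmetric example both procedures settle at prices close to 1 for each good, and under proportional response each buyer ends up spending almost all of their budget on the good they value more.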
Ph.D. Thesis 2020 Flexible and Efficient Systems for Training Emerging Deep Neural Networks Wang, Minjie Abstract | PDF

Title: Flexible and Efficient Systems for Training Emerging Deep Neural Networks

Candidate: Wang, Minjie

Advisor(s): Li, Jinyang

Abstract:

The success of deep neural networks (DNNs) is due to their strong capability to learn from data. Leveraging more data requires larger models that may exceed the capacity of a single computing device. Leveraging graph-structured data demands models with sparse computation patterns. Unfortunately, current deep learning systems limit the exploration of such models, resulting in a poor user experience. This thesis proposes a system design to guide the development of new deep learning systems. The goal of this design is to enable efficient training of these emerging DNNs with little user effort.

We then realize the design in two systems, Tofu and DGL. Tofu partitions very large DNNs across multiple GPUs to reduce per-GPU memory footprint. To automatically partition each operator, we propose a description language for annotating the semantics of an operator. To optimally partition the whole training, Tofu proposes an algorithm that minimizes the total communication cost. We evaluate the capability of Tofu to train very large models, demonstrating the substantial gains from applying the design. We then implement DGL, a new framework for training DNNs on graph-structured data. DGL provides an intuitive and expressive interface that can cover a wide range of graph DNN models. We introduce batching and kernel fusion techniques that enable training GNNs on large graphs and achieve significant improvements in performance relative to existing systems.
Ph.D. Thesis 2019 From 2.5G To 5G: Enhancing Access And Performance For Mobile Users Ahmad, Talal Abstract | PDF

Title: From 2.5G To 5G: Enhancing Access And Performance For Mobile Users

Candidate: Ahmad, Talal

Advisor(s): Subramanian, Lakshminarayanan

Abstract:

This dissertation has two overarching themes: i) enhancing connectivity access for mobile users in rural contexts and ii) enhancing transport layer performance for mobile users.

More than half of the world’s population faces barriers in accessing the Internet. A recent ITU study estimates that 2.6 billion people cannot afford connectivity and that 3.8 billion do not have access to it. To enhance access I have worked on two projects: Wi-Fly and GreenApps. Wi-Fly is a new connectivity paradigm designed for regions without Internet coverage that enables communication between a lightweight Wi-Fi device on commercial planes and ground stations. Through empirical experiments with test flights and simulation, we show that Wi-Fly and its extensions have the potential to provide connectivity in the most remote regions of the world. In GreenApps, we look at how localized cellular applications can be built for rural communities on top of software-defined cellular base stations. We deployed the GreenApps platform on rural base stations for communities in Ghana and Nicaragua and supported multiple localized applications for rural communities.

Enhancing transport layer performance over cellular networks is critical to improve end-to-end application performance for mobile users. Cellular networks have unique challenges that make conventional transport protocols not suitable for these environments. In the past few years, several new delay-based congestion-control algorithms have been developed with complex nonlinear control loops for cellular contexts. While these protocols have shown promise, it has been extremely challenging to analyze and interpret the behavior of these algorithms especially under highly variable network conditions (e.g., cellular links). In the Model-Driven Interpretable (MDI) congestion control work, we provide a model-driven framework to reason about the behavior of such congestion control algorithms. Our modeling approach simplifies a congestion control algorithm’s behavior into a guided random walk over a two-dimensional Markov model. We show that the model of a congestion-control algorithm can give key insights into its convergence and performance. More recently, we also looked at how to learn early signals of congestion in highly varying 5G channels. In particular we worked with Wi-Gig traces collected at 60 GHz and showed that it is possible to learn highly accurate early congestion signals using delay features observed at end-hosts.
Ph.D. Thesis 2019 Co-Located Augmented and Virtual Reality Systems DeFanti, Connor Abstract | PDF

Title: Co-Located Augmented and Virtual Reality Systems

Candidate: DeFanti, Connor

Advisor(s): Perlin, Ken

Abstract:

Augmented and Virtual Reality (AVR) systems have become increasingly popular in the worlds of entertainment and industry. However, many current systems are limited in scope to experiences that isolate a single user within a given physical space. While many such experiences allow for interactions between remotely located users, very few experiences allow for multiple users to coexist in the same physical space while interacting with a consistent world-view of shared virtual objects. Our research has found that by enabling this co-located paradigm, users are able to have rich interactions that are otherwise impossible. This thesis presents a series of experiments that demonstrate the importance of the social aspects of co-located AVR, a set of solutions that overcome the difficulties often encountered in such experiences, and directions for future scalability using forthcoming hardware and technologies.
Ph.D. Thesis 2019 Design for Customized Manufacturing Gil-Ureta, Francisca T. Abstract | PDF

Title: Design for Customized Manufacturing

Candidate: Gil-Ureta, Francisca T.

Advisor(s): Denis Zorin

Abstract:

Over the past few years, 3D printing technology has captivated businesses and consumers alike with its promise of affordable custom manufacturing. The expectation is that, in the future, people will be able to easily customize and manufacture objects to fit individual needs. To make this a reality, we need new methods that support the creative process of makers, from conception to fabrication.

In this thesis, I present three projects where we reexamine the tools and workflows used for customized design. The core idea behind these projects is that, compared with traditional methods, we design for an unknown or changeable manufacturing process, which affects the life-cycles of design. Our goal is to create tools that simplify the modification, optimization, and evaluation of designs such that they can be easily altered to fit manufacturing and personal constraints.

Although fabrication constraints are unlimited, we can study specific domains to learn the most common ones. In the first project, we present an interactive modeling tool for designing mechanical objects, which are determined mostly by kinematic constraints. In the second project, we study the structural efficiency of shells and introduce an efficient method for designing shell reinforcements of minimal weight. Finally, in the third project, we develop a robust collision resolution algorithm, crucial for the design and optimization of models subject to dynamic impulses.
Ph.D. Thesis 2019 Scalable Machine Learning using Dataflow Graph Analysis Huang, Chien-Chin Abstract | PDF

Title: Scalable Machine Learning using Dataflow Graph Analysis

Candidate: Huang, Chien-Chin

Advisor(s): Li, Jinyang

Abstract:

In the past decade, the abundance of computing resources and the growth of data have boosted the development of machine learning applications. Many computation frameworks, e.g., Hadoop, Spark, TensorFlow, and PyTorch, have been proposed and become widely used in industry. However, programming large-scale machine learning applications is still challenging and requires manual effort from developers to achieve good performance.

For example, when parallelizing arrays across hundreds of CPU machines, it is critical to choose a good partition strategy that co-locates the computation arrays to reduce network communication. Unfortunately, existing distributed array frameworks usually use a default partition scheme and require manual partitioning if another parallel strategy is used, making it harder to develop a distributed array program. Another example is running deep learning applications on GPUs. Modern GPUs can be orders of magnitude faster than CPUs and have become an attractive computation resource. Unfortunately, the limited memory size of a GPU restricts the scale of the DNN models that can be run. A computation framework that allows users to explore deeper and wider DNN models is therefore desirable.

Modern distributed frameworks generally adopt a dataflow-style programming paradigm. The dataflow graph of an application exposes valuable information to optimize the application. In this thesis, we present two techniques to address the above issues via dataflow graph analysis.

We first design Spartan to help users parallelize distributed arrays on a CPU cluster. Spartan is a distributed array framework, built on top of a set of higher-order dataflow operators. Based on the operators, Spartan provides a collection of Numpy-like array APIs. Developers can choose the built-in array APIs or directly use the operators to construct machine learning applications. To achieve good performance for the distributed application, Spartan analyzes the communication pattern of the dataflow graph captured through the operators and applies a greedy strategy to find a good partition scheme to minimize the communication cost.

To support memory-intensive deep learning applications on a single GPU, we develop SwapAdvisor, a swapping system that automatically swaps temporarily unused tensors from GPU memory to CPU memory. To minimize the communication overhead, SwapAdvisor analyzes the dataflow graph of the given DNN model and uses a custom-designed genetic algorithm to optimize the operator scheduling and memory allocation. Based on the optimized operator schedule and memory allocation, SwapAdvisor can determine what and when to swap to achieve good performance.
Ph.D. Thesis 2019 Compositional Abstractions for Verifying Concurrent Data Structures Krishna, Siddharth Abstract | PDF

Title: Compositional Abstractions for Verifying Concurrent Data Structures

Candidate: Krishna, Siddharth

Advisor(s): Thomas Wies

Abstract:

Formal verification has had great success in improving the reliability of real-world software, with projects such as ASTREE, CompCert, and Infer showing that rigorous mathematical analysis can handle the scale of today's cyber-infrastructure. However, despite these successes, many core software components are yet to be verified formally. Concurrent data structures are a class of algorithms that are becoming ubiquitous, as software systems seek to make use of the increasingly parallel design of computers and servers. These data structures use sophisticated algorithms to perform fine-grained synchronization between threads, making them notoriously difficult to design correctly, with bugs being found both in actual implementations and in the designs proposed by experts in peer-reviewed publications. The rapid development and deployment of these concurrent algorithms has resulted in a rift between the algorithms that can be verified by the state-of-the-art techniques and those being developed and used today. The goal of this dissertation is to bridge this gap and bring the certified safety of formal verification to the concurrent data structures used in practice.

Permission-based program logics such as separation logic have been established as the standard technique for verifying programs that manipulate complex heap-based data structures. These logics build on so-called separation algebras, which allow expressing properties of heap regions such that modifications to a region do not invalidate properties stated about the remainder of the heap. This concept is key to enabling modular reasoning and also extends to concurrency. However, certain data structure idioms prevalent in real-world programs, especially concurrent programs, are notoriously difficult to reason about, even in these advanced logics (e.g., random access into inductively defined structures, data structure overlays). The underlying issue is that while heaps are naturally related to mathematical graphs, many ubiquitous graph properties are non-local in character. Examples of such properties include reachability between nodes, path lengths, acyclicity and other structural invariants, as well as data invariants which combine with these notions. Reasoning modularly about such global graph properties remains a hard problem, since a local modification can have side-effects on a global property that cannot be easily confined to a small region.

This dissertation addresses the question: What separation algebra can be used to prove that a program maintains a global graph property by reasoning only about the local region modified by the program? We propose a general class of global graph properties, that we call flows, that can be expressed as fixpoints of algebraic equations over graphs. Flows can encode structural properties of the heap (e.g. the reachable nodes from the root form a tree), data invariants (e.g. sortedness), as well as combinations of both shape and data constraints of overlaid structures in a uniform manner. We then introduce the notion of a flow interface, an abstraction of a region in the heap, which expresses the constraints and guarantees between the region and its context with respect to the flow. Under a suitable notion of composition that preserves the flow values, we show that flow interfaces form the desired separation algebra.
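
As a rough illustration of a flow as a fixpoint over a graph, the sketch below computes a path-counting flow: each node's value is its inflow plus the sum of its predecessors' values. This is one possible instance chosen for illustration, not necessarily the formulation used in the thesis. The function names, the dictionary-based graph encoding, and the restriction to acyclic graphs are all assumptions. With an inflow of 1 at the root, the reachable part of the graph is a tree exactly when no node has a flow value greater than 1.

```python
def path_count_flow(edges, inflow):
    """Iterate flow(m) = inflow(m) + sum of flow(n) over edges n -> m.

    edges maps a node to a list of its successors; inflow maps a node to
    its external inflow (missing nodes have inflow 0). Hypothetical helper:
    it only handles graphs where no cycle is reachable from the inflow.
    """
    nodes = set(inflow) | set(edges)
    for successors in edges.values():
        nodes.update(successors)

    flow = {n: inflow.get(n, 0) for n in nodes}
    # On an acyclic graph the iteration stabilizes within len(nodes) rounds.
    for _ in range(len(nodes) + 1):
        new_flow = {n: inflow.get(n, 0) for n in nodes}
        for n, successors in edges.items():
            for m in successors:
                new_flow[m] += flow[n]
        if new_flow == flow:
            return flow
        flow = new_flow
    raise ValueError("no fixpoint reached: a cycle is reachable from the inflow")


def reachable_part_is_tree(flow):
    """With inflow 1 at the root, a tree has at most one path to each node."""
    return all(count <= 1 for count in flow.values())


tree = {"r": ["a", "b"], "a": ["c"]}
shared = {"r": ["a", "b"], "a": ["c"], "b": ["c"]}  # c has two parents

tree_flow = path_count_flow(tree, {"r": 1})
shared_flow = path_count_flow(shared, {"r": 1})
```

Here the global property (tree-shapedness) reduces to a per-node check on the flow value, which is the kind of local condition the abstract refers to.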

Building on our theory of flows, we develop the flow framework, a general proof technique for modular reasoning about global graph properties over program heaps that can be integrated with existing separation logics. We further devise a strategy for automating this technique using SMT-based verification tools. We have implemented this strategy on top of the verification tool Viper and applied it successfully to a variety of challenging benchmarks including 1) algorithms involving general graphs such as Dijkstra's algorithm and a priority inheritance protocol, 2) inductive data structures such as linked lists and B-trees, 3) overlaid data structures such as the Harris list and threaded trees, and 4) OO design patterns such as Composite and Subject/Observer. We are not aware of any single other approach that can handle these examples with the same degree of simplicity or automation.

While the flow framework is applicable to any data structure, its features give rise to a new form of modular reasoning for certain concurrent data structures. Concurrent separation logics already apply modularity on multiple levels to simplify correctness proofs, decomposing them according to program structure, program state, and individual threads. Despite these advances, it remains difficult to achieve proof reuse across different data structure implementations. For the large class of concurrent search structures, we demonstrate how one can achieve further proof modularity by decoupling the proof of thread safety from the proof of structural integrity. We base our work on the template algorithms of Shasha and Goodman that dictate how threads interact but abstract from the concrete layout of nodes in memory. By using the flow framework of compositional abstractions in the separation logic Iris, we show how to prove correctness of template algorithms, and how to instantiate them to obtain multiple verified implementations. We demonstrate our approach by formalizing three concurrent search structure templates, based on link, give-up, and lock-coupling synchronization, and deriving implementations based on B-trees, hash tables, and linked lists. These case studies represent algorithms used in real-world file systems and databases, which have so far been beyond the capability of automated or mechanized state-of-the-art verification techniques. Our verification is split between the Coq proof assistant and the deductive verification tool GRASShopper in order to demonstrate that our proof technique and framework can be applied both in fully mechanized proof assistants as well as automated program verifiers. In addition, our approach reduces proof complexity and is able to achieve significant proof reuse.
Ph.D. Thesis 2019 Parallel Contact-Aware Algorithms for Large-Scale Direct Blood Flow Simulations Lu, Libin Abstract | PDF

Title: Parallel Contact-Aware Algorithms for Large-Scale Direct Blood Flow Simulations

Candidate: Lu, Libin

Advisor(s): Zorin, Denis

Abstract:

Experimental and theoretical evidence suggests that blood flow can be well approximated by a mixture model of a Newtonian fluid and deformable particles representing the red blood cells. We use a well-established boundary integral formulation for the problem as the foundation of our approach. This type of formulation, with a high-order spatial discretization and an implicit and adaptive time discretization, has been shown to handle complex interactions between particles with high accuracy. Yet, for dense suspensions, very small time-steps or expensive implicit solves, as well as a large number of discretization points, are required to avoid non-physical contact and intersections between particles, leading to infinite forces and numerical instability. Given the importance of vesicle flows, in this thesis we focus on efficient numerical methods for such problems: we present parallel-scalable algorithms for the simulation of dense deformable vesicles in two and three dimensions, in both unbounded and bounded domains.

Our method maintains the accuracy of previous methods at a significantly lower cost for dense suspensions, and the time step size is independent of the volume fraction. The key idea is to ensure an interference-free configuration by introducing explicit contact constraints into the system. While such constraints are unnecessary in the formulation, in the discrete form of the problem they make it possible to eliminate catastrophic loss of accuracy by preventing contact explicitly.

Introducing contact constraints results in a significant increase in stable time-step size for locally-implicit time-stepping, and a reduction in the number of points adequate for stability. Our method permits simulations with high volume fractions; we report results with up to 60% volume fraction. We demonstrated the parallel scaling of the algorithms on up to 35K CPU cores.
Ph.D. Thesis 2019 Leveraging Program Analysis for Type Inference Pavlinovic, Zvonimir Abstract | PDF

Title: Leveraging Program Analysis for Type Inference

Candidate: Pavlinovic, Zvonimir

Advisor(s): Wies, Thomas

Abstract:

Type inference is a popular feature of programming languages used to automatically guarantee the absence of certain execution errors in programs at compile time. The convenience of type inference, unfortunately, comes at a cost. Developing type inference algorithms is a challenging task that currently lacks a systematic approach. Moreover, programmers often have problems interpreting error reports produced by type inference. The overarching goal of this thesis is to provide a mathematically rigorous framework for the systematic development of sophisticated type inference algorithms that are convenient for programmers to use. To this end, we focus on two specific problems in this thesis: (1) how to constructively design type inference algorithms that improve over the state of the art and (2) how to automatically debug type errors that arise during inference. We base our approach on the observation that, similar to type inference, program analysis algorithms automatically discover various program properties that can be used to show program correctness. Type inference and program analysis techniques, although similar, have traditionally been developed independently of each other. In contrast, this thesis further explores the recent path of leveraging program analysis for type inference.

As our first contribution, we use abstract interpretation to constructively design type inference algorithms. We specifically focus on Liquid types, an advanced family of algorithms that combine classical typing disciplines and known static analyses to prove various safety properties of functional programs. By using abstract interpretation, we make the design space of Liquid type inference explicit. We also unveil the general type inference framework underlying Liquid types. By properly instantiating this general framework, one obtains novel type inference algorithms that are sound by construction.

Our second contribution is a framework for automatically debugging type errors for languages that deploy type inference in the style of Hindley-Milner, such as OCaml and Haskell. Such languages are notorious for producing cryptic type error reports that are often not helpful in fixing the actual bug. We formulate the problem of finding the root cause of type errors as an optimization problem expressed in a formal logic. We then show how to solve this problem using automated theorem provers. We experimentally illustrate how our framework can efficiently produce type error reports that outperform the state-of-the-art solutions in identifying the true cause of type errors.

In summary, this thesis introduces a mathematical framework for the systematic design of sophisticated type inference algorithms that are sound by construction. Our results further enable automatic generation of more meaningful type error diagnostics, ultimately making type inference more usable for programmers.
Ph.D. Thesis 2019 Concentration and Anti-concentration for Markov Chains Rao, Shravas Abstract | PDF

Title: Concentration and Anti-concentration for Markov Chains

Candidate: Rao, Shravas

Advisor(s): Regev, Oded

Abstract:

We study tail bounds and small ball probabilities for sums of random variables obtained from a Markov chain. In particular, we consider the following sum $S_n = f_1(Y_1)+\cdots+f_n(Y_n)$ where $\{Y_i\}_{i=1}^{\infty}$ is a Markov chain with state space $[N]$, transition matrix $A$, and stationary distribution $\mu$ such that $Y_1$ is distributed as $\mu$, and $f_i: [N] \rightarrow \mathbb{R}$. We also consider settings in which $f_i(Y_i)$ is vector-valued.

In all results, the bounds are in terms of the spectral gap of the Markov chain. In almost all of the results in this thesis, when the transitions are independent and the spectral gap is $1$, the bounds match the corresponding bounds for independent random variables up to constant factors.

We first obtain tail bounds in the case that only the $p$th moment of the random variable $f_i(Y_i)$ is bounded. This is a Markov chain version of a corollary of the Marcinkiewicz–Zygmund inequality. Using this, we also obtain tail bounds for $S_n$ when the $f_i(Y_i)$ are elements of an $\ell_q$ space.

Next, we obtain sharp tail bounds when the random variables $f_i(Y_i)$ are bounded and the expected value of $S_n$ is small. This is a Markov chain version of a Poisson approximation to sums of independent random variables. As an application, we explain how such tail bounds can be used to construct simple and explicit resilient functions that match the non-constructive functions shown to exist due to the work of Ajtai and Linial.

Next, we obtain tail bounds in the case that the $f_i(Y_i)$ are bounded in the range $[-a_i, a_i]$ for each $i$. This is a Markov chain version of the Hoeffding inequality. This improves upon previously known bounds in that the dependence is on $\sqrt{a_1^2+\cdots+a_n^2}$ rather than $\max_{i}\{a_i\}\sqrt{n}.$ Using this, we obtain tail bounds for certain types of random variables in which the $f_i(Y_i)$ are elements of any Banach space.
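
For comparison, the classical Hoeffding inequality for independent random variables can be stated as follows. This is the standard independent-case statement, given here only as a reference point; it is not the thesis's bound, whose dependence on the spectral gap is not reproduced here. If $X_1, \ldots, X_n$ are independent with $X_i \in [-a_i, a_i]$ and $S_n = X_1 + \cdots + X_n$, then for every $t > 0$,

$$\Pr\left[\,\left|S_n - \mathbb{E}[S_n]\right| \ge t\,\right] \le 2\exp\left(-\frac{t^2}{2\,(a_1^2+\cdots+a_n^2)}\right).$$

Deviations therefore scale with $\sqrt{a_1^2+\cdots+a_n^2}$, which is never larger than $\max_{i}\{a_i\}\sqrt{n}$ and can be much smaller when the $a_i$ vary widely.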

Finally, we show that if the $f_i(Y_i)$ take on values $\{-a_i, a_i\}$ with equal probability and the $a_i$ are Euclidean vectors with norm at least $1$, the probability that $S_n$ lies in a ball of volume $1$ is small. This is a Markov chain version of the Littlewood-Offord inequality. We also construct a new pseudorandom generator for the Littlewood-Offord problem.
Ph.D. Thesis 2019 Approximation algorithms, Hardness, and PCPs Thiruvenkatachari, Devanathan Abstract | PDF

Title: Approximation algorithms, Hardness, and PCPs

Candidate: Thiruvenkatachari, Devanathan

Advisor(s): Khot, Subhash

Abstract:

This thesis is a collection of theoretical results on the topic of approximation algorithms and hardness of approximation. The results presented here use a combination of classical and modern techniques to achieve better approximation algorithms and hardness results for some pivotal NP-hard problems and their variants. We study CSPs from a multi-objective point of view, with the goal of simultaneous optimization of multiple instances over the same set of variables, with MAX-CUT as the central focus. We provide an approximation algorithm that is near optimal assuming the unique games conjecture. We also study PCPs and their role in hardness of approximation, and present a hardness result for 3-LIN in the sub-constant soundness regime. Lastly, dictatorship testing is a property testing problem with significant applications in proving hardness results, and we present an improvement on the soundness of the k-bit dictatorship test with perfect completeness.
Ph.D. Thesis 2019 Tactile Perception Design for Fabrication Tymms, Chelsea Abstract | PDF

Title: Tactile Perception Design for Fabrication

Candidate: Tymms, Chelsea

Advisor(s): Zorin, Denis

Abstract:

High-resolution 3D printing technology provides the ability to manufacture shapes with precise geometry. Controlling this fine-scale geometry to confer haptic qualities is a growing area of research in fabrication. In this thesis, I will present three projects addressing the question of how to fabricate surface textures with controlled tactile properties and exploring how tactile textures can be used in custom manufacturing and to expand the understanding of the human sense of touch.

Surface roughness is one of the most significant qualities in haptic perception, essential to material identification, comfort, and usability. Past perceptual studies on roughness have typically used stimuli drawn from existing materials or from a narrow range of custom-made materials. In the first project presented in this thesis, we explore the use of 3D printing to manufacture stimuli. We used modeling and 3D printing to manufacture a set of fine parametric bump textures, and we used these texture stimuli in a psychophysical study of human roughness perception. We investigated the contribution of the texton spacing, size, and arrangement to the texture's perceived tactile roughness.

In the second project, we quantitatively address the problem of mapping arbitrary texture geometry to tactile roughness. Drawing from insights in past neurophysiology research, we developed a model that simulates human touch to predict a texture's tactile roughness from its surface geometry. We fabricated a set of 46 parametric and real-life textures, and we used psychophysical experiments with human subjects to place them in the perceptual space for tactile roughness using non-metric multidimensional scaling. We closely match this space with our quantitative model, obtained from strain fields derived from the elasticity simulations of the human skin contacting texture geometry. We demonstrate how this model can be applied to predict and alter surface roughness, and we show several applications in the context of fabrication.

The third project extends these ideas by developing a method to control a texture's haptic qualities and visual appearance at the same time. The tactile feeling and visual appearance of objects often interact in unpredictable ways, and both serve important purposes for identification and usability. In this project, we develop an optimization method to maintain a texture's visual appearance while altering its perceived tactile roughness or tactile temperature. Our optimization method, which is enabled by neural network-based models, allows us to change a texture to a different desired tactile feeling while preserving the visual appearance, at a relatively low computational cost.
Ph.D. Thesis 2019 End-to-End Learning for Autonomous Driving Zhang, Jiakai Abstract | PDF

Title: End-to-End Learning for Autonomous Driving

Candidate: Zhang, Jiakai

Advisor(s): Cho, Kyunghyun

Abstract:

The end-to-end learning approach for autonomous driving has sparked great interest in both academia and industry in recent years. The approach can be defined as learning a model that maps from sensory input, such as image frames from a camera, to driving actions for controlling the autonomous vehicle, such as steering. Compared to the traditional autonomous driving system, which often includes perception, localization, mapping, and path planning, the end-to-end learning approach offers a more efficient method of utilizing large amounts of expert driver demonstrations to achieve fully autonomous driving without acquiring expensive labeled data such as bounding boxes for objects.

The end-to-end learning for autonomous driving can be done by supervised learning, where a model is tuned to minimize the difference between predicted actions and ground-truth actions. The ground truth of driving actions is usually obtained from driver demonstrations. A model trained in this way, however, suffers from unexpected behaviors due to the mismatch between the samples visited by a learned model and the samples collected by an expert driver. To address this issue, we first introduce an end-to-end supervised learning approach with data augmentation to train a model to keep a vehicle driving at the center of a lane. The data augmentation is done by synthetically generating new samples through rotating and translating input images captured from a front-facing camera and calculating compensatory steering. We show that using such automatically-augmented data, a trained model can drive a car to follow a lane in various conditions on highways and local and residential roads.

Instead of generating augmented data, we can also collect new samples when trying out the learned model. Aiming to reduce the number of times an expert is queried for labels, we propose the SafeDAgger algorithm, a query-efficient imitation learning approach. We show that our method significantly reduces the number of queries and trains a driving model more efficiently. A model trained by our proposed SafeDAgger algorithm can successfully drive a racing car in a simulator to do lane following and overtaking.

The expert demonstrations provided by humans and used for training models often show significant variability due to latent factors. Given such expert demonstrations, a model trained by minimizing the difference between the expert driving actions and predicted driving actions can output dangerous driving actions that may cause serious accidents. We address this issue by introducing a variational mixture density network to model the variability using a discrete latent variable. The experimental results in a racing car simulator show that the model trained using our proposed method can learn the variability of driving signals from expert demonstrations and successfully distinguish certain driving behaviors such as changing lanes and following lanes.

We introduce a simulator to support the development, training, and evaluation of autonomous driving systems using the end-to-end learning approaches. Leveraging this simulator, we demonstrate how to train and evaluate models to drive a truck that follows a navigation map in a video game.

In summary, this thesis introduces the end-to-end learning approaches for autonomous driving to address the data mismatch issue and learn the variability of expert driving actions. Our results show that the trained model can drive the vehicle to follow a lane, change lanes and make turns in simulated driving environments.
Ph.D. Thesis 2019 Text Representation using Convolutional Networks Zhang, Xiang Abstract | PDF

Title: Text Representation using Convolutional Networks

Candidate: Zhang, Xiang

Advisor(s): LeCun, Yann

Abstract:

This dissertation applies convolutional networks for learning representations of text, and it consists of several parts. The first part offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks. These results indicate that using low-level inputs – in this case characters – for convolutional networks could be feasible for text representation learning.

The second part concerns which text encoding method might work for convolutional networks. We include a comprehensive comparison of different encoding methods for the task of text classification using 14 large-scale datasets in 4 languages including Chinese, English, Japanese and Korean. Different encoding levels are studied, including UTF-8 bytes, characters, words, romanized characters and romanized words. For all encoding levels, whenever applicable, we provide comparisons with linear models, fastText and convolutional networks. For convolutional networks, we compare between encoding mechanisms using character glyph images, one-hot (or one-of-n) encoding, and embedding. From these 473 models, one of the conclusions is that byte-level one-hot encoding works consistently best for convolutional networks.
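
A minimal sketch of what byte-level one-hot encoding can look like is given below. The function name, the fixed `max_len`, and the zero-row padding are assumptions made for illustration; the exact scheme used in the dissertation may differ.

```python
def utf8_byte_one_hot(text, max_len):
    """Encode text as max_len rows of 256 entries, one row per UTF-8 byte.

    Each row has a single 1 at the index of the byte value. Inputs shorter
    than max_len are padded with all-zero rows; longer inputs are truncated,
    which may cut a multi-byte character in the middle.
    """
    data = text.encode("utf-8")[:max_len]
    matrix = [[0] * 256 for _ in range(max_len)]
    for position, byte in enumerate(data):
        matrix[position][byte] = 1
    return matrix


# "n" is one byte (110); "é" is two bytes in UTF-8 (195, 169).
encoded = utf8_byte_one_hot("né", 4)
```

Because every language shares the same 256-symbol byte alphabet, the input dimension stays fixed regardless of the script being encoded.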

Based on this, in the third part of the dissertation we develop a convolutional network at the level of bytes for learning representations through the task of auto-encoding. The proposed model is a multi-stage deep convolutional encoder-decoder framework using residual connections, containing up to 160 parameterized layers. Each encoder or decoder contains a shared group of modules that consists of either pooling or up-sampling layers, making the network recursive in terms of abstraction levels in representation. The decoding process is non-sequential. Results for 6 large-scale paragraph datasets are reported, in 3 languages including Arabic, Chinese and English. Analyses are conducted to study several properties of the proposed model. Experiments are presented to verify that the auto-encoder can learn useful representations.

In the fourth part of the dissertation, we apply the improved design from the previous auto-encoding model to text classification, adding comparisons between residual and dense connections. This further validates the choice of the architecture we made for the auto-encoding model, and the effectiveness of the recursive architecture with residual or dense connections.
Ph.D. Thesis 2019 Unsupervised Learning with Regularized Autoencoders Zhao, Junbo Abstract | PDF

Title: Unsupervised Learning with Regularized Autoencoders

Candidate: Zhao, Junbo

Advisor(s): LeCun, Yann

Abstract:

Deep learning has enjoyed remarkable successes in a variety of domains. These successes often emerge at the cost of large annotated datasets and training computationally heavy neural network models. The learning paradigm for this is called supervised learning. However, reducing the sample complexity while improving the universality of the trained models is a crucial next step that may lead to artificial intelligence. Unsupervised learning, in contrast to supervised learning, aims to build neural network models with more generic loss objectives requiring little or no labelling effort, and therefore it is not tied to any specific domain or task. In spite of the brevity of its goal, unsupervised learning is a broad topic that relates to or includes several sub-fields, such as density estimation, generative modeling, and world models. In this thesis, we primarily adopt an energy-based view unifying these different fields. A desired energy function reflects the data manifold by differentiating the energy assigned to the points on the data manifold against points off the manifold. With this foundation, we first cast the popular autoencoder and adversarial learning frameworks into an energy-based perspective, and then propose several techniques and architectures motivated by learning better-shaped energy functions. We also show that the proposed techniques in this thesis cover a wide spectrum of applications including image/text generative modeling, text summarization, style-transfer without aligned data, and transfer/semi-supervised learning in both computer vision and natural language processing. The thesis is organized as follows. First, we assess the validity and the main challenges of energy-based learning. We then introduce two frameworks focusing on strengthening autoencoders by building unit connection hierarchies via either hard-coded pooling or self-learned graphs. Finally, we propose several systematic regularization techniques, based on adversarial training and vector discretization.
Ph.D. Thesis 2018 Deep Generative Models of Images and Video Denton, Emily Lynn Abstract | PDF

Title: Deep Generative Models of Images and Video

Candidate: Denton, Emily Lynn

Advisor(s): Fergus, Rob

Abstract:

Deep neural networks have seen wide success in the supervised setting in recent years. Many of these successes rely heavily on large training sets of manually annotated data. Given the difficulty of obtaining enough labeled data to scale many deep learning approaches, it is increasingly important to look for better methods of utilizing large amounts of unlabeled data. Building generative models of images and video is a fundamental paradigm of learning from unlabeled data. Unsupervised criteria based on generating or reconstructing images drive many representation learning frameworks. Video is a particularly appealing domain for unsupervised learning due to the inherent temporal structure of the data. This structure lends itself to representation learning approaches based on extracting invariances and predicting future frames, given the past.

Additionally, building accurate models of the world that facilitate future prediction can be useful for model-based reinforcement learning, planning, and more generally, endowing an agent with the capacity to reason about its environment. Incorporating predictive models can potentially help alleviate the sample inefficiency of many reinforcement learning systems.

In this thesis, we review the challenges associated with generating images and videos. We then introduce a multi-scale image generation framework that demonstrates impressive performance on real-world image datasets. This method was the first to demonstrate empirically the potential of generative adversarial networks. We also address two challenging aspects of video generation: learning a latent space that affords easier prediction and modeling the uncertainty in video sequences.
Ph.D. Thesis 2018 Deep Networks for Forward Prediction and Planning Henaff, Mikael Bruce Abstract | PDF

Title: Deep Networks for Forward Prediction and Planning

Candidate: Henaff, Mikael Bruce

Advisor(s): LeCun, Yann

Abstract:

Learning to predict how an environment will evolve and the consequences of one’s actions is an important ability for autonomous agents, and can enable planning with relatively few interactions with the environment which may be slow or costly. However, learning an accurate forward model is often difficult in practice due to several features often present in complex environments. First, many environments exhibit long-term dependencies which require the system to learn to record and maintain relevant information in its memory over long timescales. Second, the environment may only be partially observed, and the aspects of the environment which are observed may depend on parts of the environment which are hidden. Third, many observed processes contain some form of apparent or inherent stochasticity, which makes the task of predicting future states ill-defined. In this thesis, we propose approaches to tackle and better understand these different challenges associated with learning predictive models of the environment and using them for planning. We first provide an analysis of recurrent neural network (RNN) memory, which sheds light on the mechanisms by which RNNs are able to store different types of information in their memory over long timescales through the analysis of two synthetic benchmark tasks. We then introduce a new neural network architecture which keeps an estimate of the state of the environment in its memory, and can deal with partial observability by reasoning based on what is observed. We next present a new method for performing planning using a learned model of the environment with both discrete and continuous actions. Finally, we propose an approach for model-based planning in the presence of both environment uncertainty and model uncertainty, and evaluate it on a new real-world dataset and environment with applications to autonomous driving.
Ph.D. Thesis 2018 Learning Representations of Text through Language and Discourse Modeling: From Characters to Sentences Jernite, Yacine Abstract | PDF

Title: Learning Representations of Text through Language and Discourse Modeling: From Characters to Sentences

Candidate: Jernite, Yacine

Advisor(s): Sontag, David

Abstract:

In this thesis, we consider the problem of obtaining a representation of the meaning expressed in a text. How to do so correctly remains a largely open problem, combining a number of inter-related questions (e.g., what is the role of context in interpreting text? how should language understanding models handle compositionality?). In this work, after reflecting on some of these questions and describing the most common sequence modeling paradigms in use in recent work, we focus on two specifically: what level of granularity text should be read at, and what training objectives can lead models to learn useful representations of a text’s meaning.

In the first part, we argue for the use of sub-word information for that purpose, and present new neural network architectures which can either process words in a way that takes advantage of morphological information, or do away with word separations altogether while still being able to identify relevant units of meaning.

The second part starts by arguing for the use of language modeling as a learning objective, provides algorithms which can help with its scalability issues, and proposes a globally rather than locally normalized probability distribution. It then explores the question of what makes a good language learning objective, and introduces discriminative objectives inspired by the notion of discourse coherence which help learn a representation of the meaning of sentences.
Ph.D. Thesis 2018 Deep Learning for Information Extraction Nguyen, Thien Huu Abstract | PDF

Title: Deep Learning for Information Extraction

Candidate: Nguyen, Thien Huu

Advisor(s): Grishman, Ralph

Abstract:

The explosion of data has made it crucial to analyze the data and distill important information effectively and efficiently. A significant part of such data is presented in unstructured and free-text documents. This has prompted the development of information extraction techniques that allow computers to automatically extract structured information from natural free-text data. Information extraction is a branch of natural language processing in artificial intelligence that has a wide range of applications, including question answering, knowledge base population, and information retrieval. The traditional approach to information extraction has mainly involved hand-designing large feature sets (feature engineering) for different information extraction problems, i.e., entity mention detection, relation extraction, coreference resolution, event extraction, and entity linking. This approach is limited by the laborious and expensive effort required for feature engineering for different domains, and suffers from the unseen word/feature problem of natural languages.

This dissertation explores a different approach for information extraction that uses deep learning to automate the representation learning process and generate more effective features. Deep learning is a subfield of machine learning that uses multiple layers of connections to reveal the underlying representations of data. I develop the fundamental deep learning models for information extraction problems and demonstrate their benefits through systematic experiments.

First, I examine word embeddings, a general word representation that is produced by training a deep learning model on a large unlabelled dataset. I introduce methods to use word embeddings to obtain new features that generalize well across domains for relation extraction. This is done for both the feature-based method and the kernel-based method of relation extraction.

Second, I investigate deep learning models for different problems, including entity mention detection, relation extraction and event detection. I develop new mechanisms and network architectures that allow deep learning to model the structures of information extraction problems more effectively. Extensive experiments are conducted in the domain adaptation and transfer learning settings to highlight the generalization advantage of the deep learning models for information extraction.

Finally, I investigate joint frameworks that simultaneously solve several information extraction problems and benefit from the inter-dependencies among these problems. I design a novel memory-augmented network for deep learning to properly exploit such inter-dependencies. I demonstrate the effectiveness of this network on two important problems of information extraction, i.e., event extraction and entity linking.
Ph.D. Thesis 2018 Accelerating Approximate Simulation with Deep Learning Schlachter, Kristofer Abstract | PDF

Title: Accelerating Approximate Simulation with Deep Learning

Candidate: Schlachter, Kristofer

Advisor(s): Perlin, Ken

Abstract:

Once a simulation resorts to an approximate numerical solution, one is faced with various tradeoffs in accuracy versus computation time. We propose that, for two chosen simulations, another approximate solution can be learned which, in our case, is just as useful but faster to compute. The two problems addressed in this thesis are fluid simulation and the simulation of diffuse inter-reflection in computer graphics.

Real-time simulation of fluid and smoke is a long-standing problem in computer graphics, where state-of-the-art approaches require large compute resources, often making real-time applications impractical. In this work, we propose a data-driven approach that leverages the approximation power of deep-learning methods with the precision of standard fluid solvers to obtain both fast and highly realistic simulations. The proposed method solves the incompressible Euler equations following the standard operator splitting method in which a large, often ill-conditioned linear system must be solved. We propose replacing this system by learning a Convolutional Network (ConvNet) from a training set of simulations using a semi-supervised learning method to minimize long-term velocity divergence.

ConvNets are amenable to efficient GPU implementations and, unlike exact iterative solvers, have fixed computational complexity and latency. The proposed hybrid approach restricts the learning task to a linear projection without modeling the well understood advection and body forces. We present real-time 2D and 3D simulations of fluids and smoke; the obtained results are realistic and show good generalization properties to unseen geometry.

The next simulation that we address is the synthesis of images for training ConvNets. A challenge with training deep learning models is that they commonly require a large corpus of training data, and retrieving sufficient real-world data may be unachievable. A solution to this problem can be found in the use of synthetic or simulated training data. However, for simulated photographs or renderings, there hasn't been a systematic approach to comparing the relative benefits of different techniques in image synthesis.

We compare multiple synthesis techniques to one another as well as to the real data that they seek to replicate. We also introduce learned synthesis techniques that either train models better than the most realistic graphical methods used by standard rendering packages or else approach their fidelity using far less computation. We accomplish this by learning shading of geometry as well as denoising the results of low-sample Monte Carlo image synthesis. Our major contributions are (i) a dataset that allows comparison of real and synthetic versions of the same scene, (ii) an augmented data representation that boosts the stability of learning, and (iii) three different partially differentiable rendering techniques where lighting, denoising and shading are learned. Finally, we are able to generate datasets that can outperform full global illumination rendering and approach the performance of training on real data.
Ph.D. Thesis 2018 Elements of Intelligence: Memory, Communication and Intrinsic Motivation Sukhbaatar, Sainbayar Abstract | PDF

Title: Elements of Intelligence: Memory, Communication and Intrinsic Motivation

Candidate: Sukhbaatar, Sainbayar

Advisor(s): Fergus, Rob

Abstract:

Building an intelligent agent that can learn and adapt to its environment has always been a challenging task. This is because intelligence consists of many different elements such as recognition, memory, and planning. In recent years, deep learning has shown impressive results in recognition tasks. The aim of this thesis is to extend deep learning techniques to other elements of intelligence.

We start our investigation with memory, an integral part of intelligence that bridges past experience with current decision making. In particular, we focus on the episodic memory, which is responsible for storing our past experiences and recalling them. An agent without such memory will struggle at many tasks such as having a coherent conversation. We show that a neural network with an external memory is better at such tasks, outperforming traditional recurrent networks with an internal memory.

Another crucial ingredient of intelligence is the capability to communicate with others. In particular, communication is essential for cooperative tasks, enabling agents to better collaborate and improve their division of labor. We investigate whether agents can learn to communicate from scratch without any external supervision. Our finding is that communication through a continuous vector facilitates faster learning by allowing gradients to flow between agents.

Lastly, an intelligent agent must have an intrinsic motivation to learn about its environment on its own without any external supervision or rewards. Our investigation led to one such learning strategy where an agent plays a two-role game with itself. The first role proposes a task, and the second role tries to execute it. Since their goal is to make the other fail, their adversarial interplay pushes them to explore increasingly complex tasks, which results in a better understanding of the environment.
Ph.D. Thesis 2018 Rethinking Customer Segmentation and Demand Learning in the Presence of Sparse, Diverse, and Large-scale Data Venkataraman, Ashwin Abstract | PDF

Title: Rethinking Customer Segmentation and Demand Learning in the Presence of Sparse, Diverse, and Large-scale Data

Candidate: Venkataraman, Ashwin

Advisor(s): Jagabathula, Srikanth; Subramanian, Lakshminarayanan

Abstract:

Firms are now able to collect unprecedented amounts of data. This wealth of data provides new opportunities and capabilities for the firm to better solve classical problems within operational and marketing contexts, such as customer segmentation and demand learning. At the same time, the data imposes new challenges. In addition to its large-scale nature, which creates computational issues, the data comes from a diversity of sources, varying in their respective measurement scales (e.g., clicks, ratings, purchase signals, etc.), and is typically sparse, containing a large fraction of missing observations. The diversity in the data makes it hard to directly compare different observations (clicks vs. purchases, for instance) and the severe sparsity precludes any meaningful imputations of unobserved entries. The data also comes from unreliable sources, which introduce both unintentional and deliberate errors. The identities of such sources are very often unknown, which makes it difficult to determine which sources to trust.

These data challenges require a rethink of traditional techniques for customer segmentation and demand learning. Given their importance and widespread use, this dissertation revisits the classical problems of customer segmentation and demand learning but in the presence of sparse, diverse, and large-scale data. The key contribution of the dissertation is a suite of novel methodologies to deal with the challenges described above.

Part I of the dissertation focuses on the problem of customer segmentation. In Chapter 1, we consider the problem of segmenting (or clustering) a large population of customers based on their preferences, when the preference signals (e.g., clicks, ratings, etc.) come from a multitude of diverse data sources and each customer provides only a few observations. These data characteristics preclude the applicability of traditional marketing techniques as well as standard clustering approaches in machine learning. We propose a model-based embedding technique which takes the customer observations and a probabilistic model class generating the observations as inputs, and outputs an embedding—a low-dimensional vector representation in Euclidean space—for each customer. We then cluster the embeddings to obtain the segments. We show that our segmentation technique can be used to generate highly accurate personalized recommendations in two real-world case studies, including up to 8% improvement over the existing approach on an eBay dataset consisting of millions of customers and items. In addition, it outperforms (both in speed and accuracy) standard techniques in marketing and machine learning.

In Chapter 2, we turn our attention to the domain of crowdsourced labeling, which provides a low-cost, easy and scalable way to collect labels from the crowd—composed of "workers"—which are then aggregated and used as inputs for training machine learning applications. The main challenge is that workers are often unreliable, and therefore can introduce unintentional or even intentional errors into the labels. The reliabilities of the workers are a priori unknown, so correctly aggregating the labels becomes difficult. We propose algorithms to separate the worker population into two segments, what we call "honest" and "adversarial" workers. Honest workers can provide incorrect labels, but their errors are probabilistic and therefore, can be corrected. Adversarial workers, on the other hand, adopt arbitrary labeling strategies (whether deterministic or probabilistic) and therefore, their labels cannot be trusted. We demonstrate that discarding the labels provided by even a few adversarial workers can significantly improve the accuracy of several existing approaches for aggregating the labels in real-world crowdsourcing datasets.

Part II is devoted to demand learning. In Chapter 3, we consider the problem of learning customer demand for a set of substitutable products. Within operations, the customer demand is typically modeled using a mixture of logit models, which can capture heterogeneity as well as rich substitution patterns in customer preferences. The mixture model is fit to historical sales transactions and inventory data, and the fitted model is used to inform pricing and assortment decisions. We propose a novel nonparametric estimator for the mixture of logit models, providing the ability to make effective use of the large amounts of transaction data that firms have access to. By contrast, most existing techniques impose parametric assumptions—usually driven by tractability considerations—on the mixing distribution, and consequently can suffer from model misspecification issues. We show that our estimator is able to recover good approximations of different ground-truth mixing distributions—despite having no knowledge of their underlying structure—and outperforms the standard expectation-maximization (EM) benchmark in predictive and decision accuracies, while being an order of magnitude faster.
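To make the mixture-of-logits demand model concrete, the sketch below computes choice probabilities for a hypothetical two-segment mixture. The segment weights, utilities, and product names are illustrative assumptions, there is no outside (no-purchase) option, and this is only the standard forward model, not the nonparametric estimator proposed in the thesis.

```python
import math


def mnl_choice_probs(utilities, offered):
    """Multinomial logit probabilities over the offered products only."""
    weights = {j: math.exp(utilities[j]) for j in offered}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def mixture_choice_probs(components, offered):
    """components: list of (mixing weight, {product: utility}) pairs."""
    probs = {j: 0.0 for j in offered}
    for weight, utilities in components:
        for j, p in mnl_choice_probs(utilities, offered).items():
            probs[j] += weight * p
    return probs


# Hypothetical customer segments (assumed values, not from the thesis).
components = [
    (0.6, {"A": 1.0, "B": 0.0, "C": -1.0}),
    (0.4, {"A": -1.0, "B": 0.5, "C": 1.0}),
]

full = mixture_choice_probs(components, ["A", "B", "C"])
without_a = mixture_choice_probs(components, ["B", "C"])
```

Removing product A from the assortment shifts its demand to B and C differently in each segment, which is the kind of substitution pattern a single logit model cannot express.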
Ph.D. Thesis 2017 On Quadtrees, Voronoi Diagrams, and Lattices: Results in Geometric Algorithms Bennett, Huxley Abstract | PDF

Title: On Quadtrees, Voronoi Diagrams, and Lattices: Results in Geometric Algorithms

Candidate: Bennett, Huxley

Advisor(s): Yap, Chee

Abstract:

We present several results on geometric algorithms, and somewhat more specifically on algorithmic aspects of geometric structures including quadtrees, Voronoi diagrams, and lattices. Our work contains two parts, the first of which is on subdivision algorithms, and the second of which is on lattice algorithms.

Subdivision algorithms amount to recursively splitting an ambient space into smaller pieces until certain conditions hold. Often the underlying space is a square in the plane (or a box in higher dimensions), whose subdivision is represented by a quadtree (or its higher-dimensional analogs). A quadtree is smooth if any two adjacent leaf boxes differ by at most one in depth. We first study the cost of the smooth split operation in quadtrees, showing that it has constant amortized cost in quadtrees of any fixed dimension.

We then present a subdivision-based algorithm for computing isotopic epsilon-approximations of planar minimization diagrams. Given a family of continuous functions, its minimization diagram partitions the plane into regions on which each function is minimal. Minimization diagrams generalize many natural Voronoi diagrams, and we show how to use our framework to compute an anisotropic Voronoi diagram on polygonal sites. We have implemented a prototype of our algorithm for anisotropic Voronoi diagrams, and we provide experimental results.

We then turn to studying lattice algorithms. A lattice is a regular ordering of points in Euclidean space, which is represented as the set of all integer combinations of some linearly independent vectors (which we call a basis of the lattice). In our first work on lattices, we introduce and study the Lattice Distortion Problem (LDP). LDP asks how "similar" two lattices are, i.e., what the minimum distortion of a linear bijection between two lattices is. We show how to compute low-distortion mappings with a tradeoff between approximation quality and running time based on a notion of basis reduction introduced by Seysen (Combinatorica 1993). We also show that LDP is NP-hard to approximate to within any constant factor (under randomized reductions).

Finally, we study the problem of finding lattice bases which are optimal with respect to two basis quality measures. Namely, we study the problem of finding bases with minimal orthogonality defect, and with nearly minimal Seysen condition number. We give algorithms which solve both problems while running in time depending only on the rank of the lattice times a polynomial in the input length.
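As a small illustration of one of these quality measures, the sketch below computes the orthogonality defect of a full-rank square basis using the common definition (product of the basis vector lengths divided by the absolute determinant). That definition and the example bases are assumptions made for illustration; the thesis may use a normalized variant.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def orthogonality_defect(basis):
    """Rows of `basis` are the basis vectors; the result is always >= 1."""
    volume = abs(determinant(basis))
    if volume == 0.0:
        raise ValueError("basis vectors must be linearly independent")
    lengths = math.prod(math.hypot(*b) for b in basis)
    return lengths / volume


# Two bases of the same lattice Z^2: one orthogonal, one badly skewed.
good = orthogonality_defect([[1, 0], [0, 1]])
skewed = orthogonality_defect([[1, 0], [7, 1]])
```

Both bases generate the integer lattice, but only the orthogonal one attains the minimum defect of 1.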
Ph.D. Thesis 2017 Improving Event Extraction: Casting a Wider Net Cao, Kai Abstract | PDF

Title: Improving Event Extraction: Casting a Wider Net

Candidate: Cao, Kai

Advisor(s): Grishman, Ralph

Abstract:

Information extraction is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. One facet of information extraction is event extraction (EE): identifying instances of selected types of events appearing in natural language text. For each instance, EE should identify the type of the event, the event trigger (the word or phrase which evokes the event), the participants in the event, and (where possible) the time and place of the event.

One EE task was defined and intensively studied as part of the ACE (Automatic Content Extraction) research program. The 2005 ACE EE task involved 8 types and 33 subtypes of events. For instance, given the sentence "She was killed by an automobile yesterday.", an EE system should be able to recognize the word "killed" as a trigger for an event of subtype DIE, and discover "an automobile" and "yesterday" as the Agent and Time arguments. This task is quite challenging, as the same event might appear in the form of various trigger expressions and an expression might represent different types of events in different contexts.
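The example sentence above can be written down as a structured record. The following is a minimal sketch of one possible representation; the class and field names are hypothetical and are not the ACE annotation format.

```python
from dataclasses import dataclass, field


@dataclass
class EventMention:
    """Hypothetical container for one extracted event instance."""
    subtype: str                                    # e.g. the ACE subtype
    trigger: str                                    # word or phrase evoking the event
    arguments: dict = field(default_factory=dict)   # role -> text span


def is_grounded(event, sentence):
    """Check that the trigger and every argument span occur in the sentence."""
    spans = [event.trigger, *event.arguments.values()]
    return all(span in sentence for span in spans)


sentence = "She was killed by an automobile yesterday."
event = EventMention(
    subtype="DIE",
    trigger="killed",
    arguments={"Agent": "an automobile", "Time": "yesterday"},
)
grounded = is_grounded(event, sentence)
```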

To support the development and evaluation of ACE EE systems, the Linguistic Data Consortium annotated a text corpus (consisting primarily of news articles) with information on the events mentioned. This corpus was widely used to train ACE EE systems. However, the event instances in the ACE corpus are not evenly distributed, and so some frequent expressions involving ACE events do not appear in the training data, adversely affecting performance.

The thesis presents several strategies for improving the performance of EE. We first demonstrate the effectiveness of two types of linguistic analysis -- dependency regularization and Abstract Meaning Representation -- in boosting EE performance. Next, we show the benefit of an active learning strategy in which a person is asked to judge a limited number of phrases which may be event triggers. Finally, we report the impact of combining our baseline system with event patterns from a system developed for a different EE task (the TABARI program). This step contains expert-level patterns generated by other research groups. Because the information received is complicated and quite different from the original corpus (ACE), the integration of this information requires more complex processing.
Ph.D. Thesis 2017 Random Growth Models Florescu, Laura Abstract | PDF

Title: Random Growth Models

Candidate: Florescu, Laura

Advisor(s): Spencer, Joel

Abstract:

This work explores variations of randomness in networks, and more specifically, how drastically the dynamics and structure of a network change when a little bit of information is added to "chaos". On one hand, I investigate how much determinism in diffusions de-randomizes the process, and on the other hand, I look at how superposing "planted" information on a random network changes its structure in such a way that the "planted" structure can be recovered.

The first part of the dissertation is concerned with rotor-router walks, a deterministic counterpart to random walk, which is the mathematical model of a path consisting of a succession of random steps. I study and show results on the volume ("the range") of the territory explored by the random rotor-router model, confirming an old prediction of physicists.
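A rotor-router walk is easy to simulate. The sketch below assumes a walk on the square grid in which each site holds a rotor that turns clockwise before the walker follows it, with each rotor's initial direction drawn uniformly at random on first visit. These conventions are assumptions made for illustration and may differ from the exact model analyzed in the thesis.

```python
import random

# Clockwise order: north, east, south, west.
DIRECTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def rotor_router_range(steps, seed=0):
    """Return the number of distinct sites visited after `steps` moves."""
    rng = random.Random(seed)
    rotors = {}                      # site -> index into DIRECTIONS
    position = (0, 0)
    visited = {position}
    for _ in range(steps):
        rotor = rotors.get(position)
        if rotor is None:
            # First visit: the rotor starts in a uniformly random direction.
            rotor = rng.randrange(4)
        # Turn the rotor clockwise, then step in its new direction.
        rotor = (rotor + 1) % 4
        rotors[position] = rotor
        dx, dy = DIRECTIONS[rotor]
        position = (position[0] + dx, position[1] + dy)
        visited.add(position)
    return len(visited)


explored = rotor_router_range(steps=10_000, seed=1)
```

Once the initial rotors are fixed, the walk is fully deterministic, so the only randomness is in the initial configuration.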

The second major part of the dissertation consists of two constrained diffusion problems. The goals here are to understand the long-term behavior of the models, as well as how the boundaries of the processes evolve in time.

The third part is detecting communities in, or more generally, clustering networks. This is a fundamental problem in mathematics, machine learning, biology and economics, both for its theoretical foundations as well as for its practical implications. This problem can be viewed as "planting" some structure in a random network; for example, in cryptography, a code can be viewed as hiding some integers in a random sequence. For such a model with two communities, I show both information theoretic thresholds when it is impossible to recover the communities based on the density of the edges "planted" between the communities, as well as thresholds for when it is computationally possible to recover the communities.
Ph.D. Thesis 2017 Zero-knowledge Proofs: Efficient Techniques for Combination Statements and their Applications Ganesh, Chaya Abstract | PDF

Title: Zero-knowledge Proofs: Efficient Techniques for Combination Statements and their Applications

Candidate: Ganesh, Chaya

Advisor(s): Dodis, Yevgeniy

Abstract:

Zero-knowledge proofs provide a powerful tool, which allows a prover to convince a verifier that a statement is true without revealing any further information. It is known that every language in NP has a zero-knowledge proof system, thus opening up several cryptographic applications. While true in theory, designing proof systems that are efficient enough to be used in practice remains challenging. The most common and most efficient systems implemented are approaches based on sigma protocols, and approaches based on SNARKs (Succinct Non-interactive ARguments of Knowledge). Each approach has its own advantages and shortcomings, and is suited for certain statements.

While sigma protocols are efficient for algebraic statements, they are expensive for non-algebraic statements. SNARKs, on the other hand, result in short proofs and efficient verification, and are better suited for proving statements about hash functions. But proving an algebraic statement, for instance, knowledge of discrete logarithm, is expensive as the prover needs to perform public-key operations proportional to the size of the circuit.
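For readers unfamiliar with sigma protocols, the sketch below runs one round of Schnorr's protocol for proving knowledge of a discrete logarithm, a standard textbook example of an algebraic statement. It is not taken from the thesis, and the group parameters are toy values chosen for readability that offer no security.

```python
import secrets

# Toy parameters (assumed for illustration, NOT secure):
P = 23   # prime modulus
Q = 11   # prime order of the subgroup generated by G
G = 2    # generator of that subgroup, since 2**11 % 23 == 1


def commit():
    """Prover: pick a random nonce r and send t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def respond(r, challenge, secret):
    """Prover: answer the verifier's challenge with s = r + c*x mod Q."""
    return (r + challenge * secret) % Q


def verify(public, commitment, challenge, response):
    """Verifier: accept iff G^s == t * y^c mod P."""
    return pow(G, response, P) == (commitment * pow(public, challenge, P)) % P


secret = 7                       # prover's witness x
public = pow(G, secret, P)       # public statement y = G^x mod P

r, commitment = commit()
challenge = secrets.randbelow(Q) # verifier's random challenge c
response = respond(r, challenge, secret)
accepted = verify(public, commitment, challenge, response)
```

The prover's work is a handful of group operations, which is why this style of proof is cheap for algebraic statements but awkward for statements such as "I know a preimage of this hash".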

Recent work achieves zero-knowledge proofs, based on garbled circuits (GC), that are efficient for statements phrased as Boolean circuits. This, again, is expensive for large circuits, in addition to being inherently interactive. Thus, SNARKs and GC-based approaches are better suited for non-algebraic statements, and sigma protocols are efficient for algebraic statements.

But in some applications, one is interested in proving combination statements, that is, statements that have both algebraic and non-algebraic components. The state of the art fails to take advantage of the best of all worlds and has to forgo the efficiency of one approach to obtain the other's. In this work, we ask how to efficiently prove a statement that is a combination of algebraic and non-algebraic statements.

We first show how to combine the GC-based approach with sigma protocols. Then, we study how to combine sigma protocol proofs with SNARKs to obtain non-interactive arguments for combination statements. We show applications of our techniques to anonymous credentials, and privacy-preserving protocols on the blockchain. Finally, we study garbled circuits as a primitive and present an efficient way of hashing garbled circuits. We show applications of our hashing technique, including application to GC-based zero-knowledge.
Ph.D. Thesis 2017 Circuit Complexity: New Techniques and Their Limitations Golovnev, Aleksandr Abstract | PDF

Title: Circuit Complexity: New Techniques and Their Limitations

Candidate: Golovnev, Aleksandr

Advisor(s): Dodis, Yevgeniy; Regev, Oded

Abstract:

We study the problem of proving circuit lower bounds. The strongest known lower bound of 3n-o(n) for an explicit function was proven by Blum in 1984. We prove a lower bound of (3+1/86)n-o(n) for affine dispersers for sublinear dimensions.

We introduce the weighted gate elimination method to give an elementary proof of a 3.11n lower bound for quadratic dispersers (although there are currently no explicit constructions of such functions). Also, we develop a general framework which allows us to turn lower-bound proofs into upper bounds for Circuit SAT algorithms.

Finally, we prove strong limitations of the developed techniques.
Ph.D. Thesis 2017 Unsupervised Learning Under Uncertainty Mathieu, Michael Abstract | PDF

Title: Unsupervised Learning Under Uncertainty

Candidate: Mathieu, Michael

Advisor(s): LeCun, Yann

Abstract:

Deep learning, in particular neural networks, has achieved remarkable success in recent years. However, most of it is based on supervised learning, and relies on ever larger datasets and immense computing power. One step towards general artificial intelligence is to build a model of the world, with enough knowledge to acquire a kind of “common sense”. Representations learned by such a model could be reused in a number of other tasks. It would reduce the requirement for labeled samples and possibly acquire a deeper understanding of the problem. The vast quantities of knowledge required to build common sense preclude the use of supervised learning, and suggest relying on unsupervised learning instead.

The concept of uncertainty is central to unsupervised learning. The task is usually to learn a complex, multimodal distribution. Density estimation and generative models aim at representing the whole distribution of the data, while predictive learning consists of predicting the state of the world given the context and, more often than not, the prediction is not unique. That may be because the model lacks the capacity or the computing power to make a certain prediction, or because the future depends on parameters that are not part of the observation. Finally, the world can be chaotic or truly stochastic. Representing complex, multimodal continuous distributions with deep neural networks is still an open problem.

In this thesis, we first assess the difficulties of representing probabilities in high dimensional spaces, and review the related work in this domain. We then introduce two methods to address the problem of video prediction, first using a novel form of linearizing auto-encoder and latent variables, and secondly using Generative Adversarial Networks (GANs). We show how GANs can be seen as trainable loss functions to represent uncertainty, then how they can be used to disentangle factors of variation. Finally, we explore a new non-probabilistic framework for GANs.
Ph.D. Thesis 2017 Fine-scale Structure Design for 3D Printing Panetta, Francis Julian Abstract | PDF

Title: Fine-scale Structure Design for 3D Printing

Candidate: Panetta, Francis Julian

Advisor(s): Zorin, Denis

Abstract:

Modern additive fabrication technologies can manufacture shapes whose geometric complexities far exceed what existing computational design tools can analyze or optimize. At the same time, falling costs have placed these fabrication technologies within the average consumer's reach. Especially for inexpert designers, new software tools are needed to take full advantage of 3D printing technology.

My thesis develops such tools and demonstrates the exciting possibilities enabled by fine-tuning objects at the small scales achievable by 3D printing. The thesis applies two high-level ideas to invent these tools: two-scale design and worst-case analysis.

The two-scale design approach addresses the problem that accurately simulating---let alone optimizing---geometry at the full resolution one can print requires orders of magnitude more computational power than currently available. However, we can use periodic homogenization to decompose the design problem into a small-scale problem (designing tileable structures achieving a particular deformation behavior) and a macro-scale problem (deciding where to place these structures in the larger object). We can then design structures for every possible deformation behavior and store them in a database, so that they can be re-used for many different macro-scale design problems.

Worst-case analysis refers to determining how likely an object is to fracture by studying the worst possible scenario: the forces most efficiently breaking it. This analysis is needed when the designer has insufficient knowledge or experience to predict what forces an object will undergo, or when the design is intended for use in many different scenarios unknown a priori.
Ph.D. Thesis 2017 On the Gaussian Measure Over Lattices Stephens-Davidowitz, Noah Abstract | PDF

Title: On the Gaussian Measure Over Lattices

Candidate: Stephens-Davidowitz, Noah

Advisor(s): Dodis, Yevgeniy; Regev, Oded

Abstract:

We study the Gaussian mass of a lattice coset \[ \rho_s(\mathcal{L} - \vec{t}) := \sum_{\vec{y} \in \mathcal{L}} \exp(-\pi \|\vec{y} - \vec{t}\|^2/s^2) \; , \] where $\mathcal{L} \subset \mathbb{R}^n$ is a lattice and $\vec{t} \in \mathbb{R}^n$ is a vector describing a shift of the lattice. In particular, we use bounds on this Gaussian mass to obtain a partial converse to Minkowski's celebrated theorem bounding the number of lattice points in a ball.
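The definition above can be evaluated numerically for small examples. The sketch below approximates $\rho_s(\mathcal{L} - \vec{t})$ by summing over lattice points whose integer coefficients lie in a finite box; the function name, the `radius` truncation parameter, and the example lattice are assumptions made for illustration, and the truncation is only accurate when the box is large relative to $s$.

```python
import itertools
import math


def gaussian_mass(basis, shift, s, radius=10):
    """Truncated sum of exp(-pi * ||y - t||^2 / s^2) over lattice points y.

    `basis` is a list of basis vectors (rows); only integer coefficient
    vectors in [-radius, radius]^n are included in the sum.
    """
    dim = len(shift)
    total = 0.0
    for coeffs in itertools.product(range(-radius, radius + 1), repeat=len(basis)):
        y = [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(dim)]
        dist_sq = sum((yi - ti) ** 2 for yi, ti in zip(y, shift))
        total += math.exp(-math.pi * dist_sq / s ** 2)
    return total


# Integer lattice Z^2 with and without a shift to the deepest hole.
basis = [[1.0, 0.0], [0.0, 1.0]]
unshifted = gaussian_mass(basis, [0.0, 0.0], s=1.0)
shifted = gaussian_mass(basis, [0.5, 0.5], s=1.0)
```

In this example the shifted mass is smaller than the unshifted one, consistent with the well-known fact that the Gaussian mass of a coset never exceeds that of the lattice itself.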

We also consider the discrete Gaussian distribution $D_{\mathcal{L} - \vec{t}, s}$ induced by the Gaussian measure over $\mathcal{L} - \vec{t}$, and we use procedures for sampling from this distribution to construct the current fastest known algorithms for the two most important computational problems over lattices, the Shortest Vector Problem (SVP) and the Closest Vector Problem (CVP).

Finally, we study $\rho_s(\mathcal{L} - \vec{t})$ and $D_{\mathcal{L} - \vec{t}, s}$ as interesting computational and mathematical objects in their own right. In particular, we show that the computational problem of sampling from $D_{\mathcal{L} - \vec{t}, s}$ is equivalent to CVP in a very strong sense (and that sampling from $D_{\mathcal{L}, s}$ is no harder than SVP). We also prove a number of bounds on the moments of $D_{\mathcal{L} - \vec{t}, s}$ and various monotonicity properties of $\rho_s(\mathcal{L} - \vec{t})$.
Ph.D. Thesis 2016 Decision Procedures for Finite Sets with Cardinality, and Local Theories Extensions Bansal, Kshitij Abstract | PDF

Title: Decision Procedures for Finite Sets with Cardinality, and Local Theories Extensions

Candidate: Bansal, Kshitij

Advisor(s): Barrett, Clark; Wies, Thomas

Abstract:

Many tasks in design, verification, and testing of hardware and computer systems can be reduced to checking satisfiability of logical formulas. Certain fragments of first-order logic that model the semantics of prevalent data types, and hardware and software constructs, such as integers, bit-vectors, and arrays are thus of most interest. The appeal of satisfiability modulo theories (SMT) solvers is that they implement decision procedures for efficiently reasoning about formulas in these fragments. Thus, they can often be used off-the-shelf as automated back-end solvers in verification tools. In this thesis, we expand the scope of SMT solvers by developing decision procedures for new theories of interest in reasoning about hardware and software.

First, we consider the theory of finite sets with cardinality. Sets are a common high-level data structure used in programming; thus, such a theory is useful for modeling program constructs directly. More importantly, sets are a basic construct of mathematics and thus natural to use when mathematically defining the properties of a computer system. We extend a calculus for finite sets to reason about cardinality constraints. The reasoning for cardinality involves tracking how different sets overlap. For an efficient procedure in an SMT solver, we'd like to avoid considering Venn regions directly, which has been the approach in earlier work. We develop a novel technique wherein potentially overlapping regions are considered incrementally. We use a graph to track the interaction of the different regions. Additionally, our technique leverages the procedure for reasoning about the other set operations (besides cardinality) in a modular fashion.

Second, a limitation frequently encountered is that verification problems are often not fully expressible in the theories supported natively by the solvers. Many solvers allow the specification of application-specific theories as quantified axioms, but their handling is incomplete outside of narrow special cases. We show how SMT solvers can be used to obtain complete decision procedures for local theory extensions, an important class of theories that are decidable using finite instantiation of axioms. We present an algorithm that uses E-matching to generate instances incrementally during the search, significantly reducing the number of generated instances compared to eager instantiation strategies.
Ph.D. Thesis 2016 Analyzing Source Code Across Static Conditionals Gazzillo, Paul Abstract | PDF

Title: Analyzing Source Code Across Static Conditionals

Candidate: Gazzillo, Paul

Advisor(s): Wies, Thomas

Abstract:

We need better tools for C, such as source browsers, bug finders, and automated refactorings. The problem is that large C systems such as Linux are software product lines, containing thousands of configuration variables controlling every aspect of the software from architecture features to file systems and drivers. The challenge of such configurability is how software tools can accurately analyze all configurations of the source without the exponential explosion of trying them all separately. To this end, we focus on two key subproblems, parsing and the build system. The contributions of this thesis are the following: (1) a configuration-preserving preprocessor and parser called SuperC that preserves configurations in its output syntax tree; (2) a configuration-preserving Makefile evaluator called Kmax that collects Linux's compilation units and their configurations; and (3) a framework for configuration-aware analyses of source code using these tools.

C tools need to process two languages: C itself and the preprocessor. The latter improves expressivity through file includes, macros, and static conditionals. But it operates only on tokens, making it hard to even parse both languages. SuperC is a complete, performant solution to parsing all of C. First, a configuration-preserving preprocessor resolves includes and macros yet leaves static conditionals intact, thus preserving a program's variability. To ensure completeness, we analyze all interactions between preprocessor features and identify techniques for correctly handling them. Second, a configuration-preserving parser generates a well-formed AST with static choice nodes for conditionals. It forks new subparsers when encountering static conditionals and merges them again after the conditionals. To ensure performance, we present a simple algorithm for table-driven Fork-Merge LR parsing and four novel optimizations. We demonstrate SuperC's effectiveness on the x86 Linux kernel.
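To illustrate what it means to keep static conditionals intact rather than pick one configuration, the following toy sketch labels each source line with the condition under which it is present. It is not SuperC: it handles only `#ifdef`, `#ifndef`, `#else`, and `#endif`, ignores macros and includes, and the function name `presence_conditions` is a hypothetical one chosen for this example.

```python
def presence_conditions(lines):
    """Pair every non-directive line with the conjunction of enclosing conditions."""
    stack = []   # one entry per open conditional, e.g. "CONFIG_A" or "!CONFIG_A"
    result = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("#ifdef"):
            stack.append(stripped.split()[1])
        elif stripped.startswith("#ifndef"):
            stack.append("!" + stripped.split()[1])
        elif stripped.startswith("#else"):
            # Negate the innermost condition for the else branch.
            top = stack.pop()
            stack.append(top[1:] if top.startswith("!") else "!" + top)
        elif stripped.startswith("#endif"):
            stack.pop()
        elif stripped:
            result.append((" && ".join(stack) or "true", stripped))
    return result

if __name__ == "__main__":
    source = [
        "int x;",
        "#ifdef CONFIG_A",
        "int a;",
        "#ifndef CONFIG_B",
        "int ab;",
        "#else",
        "int a_b;",
        "#endif",
        "#endif",
    ]
    for condition, text in presence_conditions(source):
        print(f"{condition:25} {text}")
```

A single pass records all configurations at once, instead of running the preprocessor once per combination of `CONFIG_A` and `CONFIG_B`.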

Large-scale C codebases like Linux are software product families, with complex build systems that tailor the software with myriad features. Such variability management is a challenge for tools, because they need awareness of variability to process all software product lines within the family. With over 14,000 features, processing all of Linux's product lines is infeasible by brute force, and current solutions employ incomplete heuristics. But having the complete set of compilation units with precise variability information is key to static tools such as bug-finders, which could miss critical bugs, and refactoring tools, since behavior-preservation requires a complete view of the software project. Kmax is a new tool for the Linux build system that extracts all compilation units with precise variability information. It processes build system files with a variability-aware \texttt{make} evaluator that stores variables in a conditional symbol table and hoists conditionals around complete statements, while tracking variability information as presence conditions. Kmax is evaluated empirically for correctness and completeness on the Linux kernel. Kmax is compared to previous work for correctness and running time, demonstrating that a complete solution's added complexity incurs only minor latency compared to the incomplete heuristic solutions.

SuperC's configuration-preserving parsing of compilation units and Kmax's project-wide capabilities are in a unique position to process source code across all configurations. Bug-finding is one area where such capability is useful. Bugs may appear in untested combinations of configurations, and testing each configuration one at a time is infeasible. For example, a compilation unit that defines a global function called by other compilation units may not be linked into the final program due to configuration variable selection. Such a bug can be found with Kmax and SuperC's cross-configuration capability. Cilantro is a framework for creating variability-aware bug-checkers. Kmax is used to determine the complete set of compilation units and the combinations of features that activate them, while SuperC's parsing framework is extended with semantic actions in order to implement the checkers. A checker for linker errors across all compilation units in the Linux kernel demonstrates each part of the Cilantro framework and is evaluated on the Linux source code.
Ph.D. Thesis 2016 Semi-Supervised Learning for Electronic Phenotyping in Support of Precision Medicine Halpern, Yonatan Abstract | PDF

Title: Semi-Supervised Learning for Electronic Phenotyping in Support of Precision Medicine

Candidate: Halpern, Yonatan

Advisor(s): Sontag, David

Abstract:

Medical informatics plays an important role in precision medicine, delivering the right information to the right person, at the right time. With the introduction and widespread adoption of electronic medical records, in the United States and world-wide, there is now a tremendous amount of health data available for analysis. Electronic record phenotyping refers to the task of determining, from an electronic medical record entry, a concise descriptor of the patient, comprising their medical history, current problems, presentation, etc. In inferring such a phenotype descriptor from the record, a computer, in a sense, "understands" the relevant parts of the record. These phenotypes can then be used in downstream applications such as cohort selection for retrospective studies, real-time clinical decision support, contextual displays, intelligent search, and precise alerting mechanisms.

To handle the incomplete data present in medical records, we use the formal framework of probabilistic graphical models with latent or unobserved variables. The first part of the thesis presents two different structural conditions under which learning with latent variables is computationally tractable. The first is the "anchored" condition, where every latent variable has at least one child that is not shared by any other parent. The second is the "singly-coupled" condition, where every latent variable is connected to at least three children that satisfy conditional independence (possibly after a transformation of the data). Variables that satisfy these conditions can be specified by an expert without requiring that the entire structure or its parameters be specified, allowing for effective use of human expertise and making room for statistical learning to do some of the heavy lifting in model learning. For both the anchored and singly-coupled conditions, practical algorithms are presented.
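The anchored condition is a purely structural property, so it can be checked directly on a bipartite graph from latent variables to observed children. The sketch below assumes the graph is given as a dictionary from each latent variable to its list of children; the names `find_anchors`, `is_anchored`, `Z1`, and `x1` are hypothetical and chosen only for this illustration. It checks the condition and does not perform any learning.

```python
def find_anchors(children_of):
    """Map each latent variable to its children that have no other latent parent."""
    parent_count = {}
    for children in children_of.values():
        for child in set(children):
            parent_count[child] = parent_count.get(child, 0) + 1
    return {
        latent: sorted(c for c in set(children) if parent_count[c] == 1)
        for latent, children in children_of.items()
    }

def is_anchored(children_of):
    """True when every latent variable has at least one unshared child."""
    return all(find_anchors(children_of).values())

if __name__ == "__main__":
    # Toy structure: x2 is shared, x1 and x3 each have a single latent parent.
    graph = {"Z1": ["x1", "x2"], "Z2": ["x2", "x3"]}
    print(find_anchors(graph))
    print(is_anchored(graph))
```

A structure in which two latent variables share all of their children fails the check, because neither has a child of its own.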

The second part of the thesis describes real-life applications using the anchored condition for electronic phenotyping. A human-in-the-loop learning system and a functioning emergency informatics system for real-time extraction of important clinical variables are described and evaluated.

The algorithms and discussion presented here were developed for the purpose of improving healthcare, but are much more widely applicable, dealing with the very basic questions of identifiability and learning models with latent variables - a problem that lies at the very heart of the natural and social sciences.
Ph.D. Thesis 2016 Improving Knowledge Base Population with Information Extraction Li, Xiang Abstract | PDF

Title: Improving Knowledge Base Population with Information Extraction

Candidate: Li, Xiang

Advisor(s): Grishman, Ralph

Abstract:

Knowledge Bases (KBs) are data resources that encode world knowledge in machine-readable formats. Knowledge Base Population (KBP) aims at understanding this knowledge and extending KBs with more semantic information, which is a fundamental problem in Artificial Intelligence. It can benefit a wide range of tasks, such as semantic search and question answering. Information Extraction (IE), the task of discovering important types of facts (entities, relations and events) in unstructured text, is necessary and crucial for successfully populating knowledge bases. This dissertation focuses on four essential aspects of knowledge base population by leveraging IE techniques: extracting facts from unstructured data, validating the extracted information, accelerating and enhancing systems with less annotation effort, and utilizing knowledge bases to improve real-world applications.

First, we investigate the Slot Filling task, which is a key component for knowledge base population. Slot filling aims to collect information from a large collection of news, web, or other sources of documents to determine a set of predefined attributes ("slots") for given person and organization entities. We introduce a statistical language understanding approach to automatically construct personal (user-centric) knowledge bases from conversational dialogs.

Second, we consider how to probabilistically estimate the correctness of the extracted slot values. Despite the significant progress of KBP research and systems in recent years, slot filling approaches are still far from completely reliable. Using the NIST KBP Slot Filling task as a case study, we propose a confidence estimation model based on the Maximum Entropy framework, and demonstrate the effectiveness of this model in both precision and the capability to improve the slot filling aggregation through a weighted voting strategy.
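One simple way to picture confidence-weighted aggregation is to sum the confidence assigned to each candidate slot value and keep the value with the largest total. The sketch below assumes this particular scheme and uses made-up confidences; it is not the thesis's Maximum Entropy model, and the function name `weighted_vote` is hypothetical.

```python
from collections import defaultdict

def weighted_vote(candidates):
    """Pick the slot value with the largest summed confidence.

    `candidates` is a list of (value, confidence) pairs, one per system output.
    Returns (value, total_confidence), or None when there are no candidates.
    """
    if not candidates:
        return None
    totals = defaultdict(float)
    for value, confidence in candidates:
        totals[value] += confidence
    return max(totals.items(), key=lambda item: item[1])

if __name__ == "__main__":
    # Two low-confidence answers for "B" lose to one high-confidence answer for "A".
    answers = [("A", 0.9), ("B", 0.3), ("B", 0.3)]
    print(weighted_vote(answers))
```

With unweighted majority voting the same input would select "B", which is the kind of case where a confidence estimate can change the aggregated answer.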

Third, we study rich annotation guided learning to fill the gap between an expert annotator and a feature engineer. We develop an algorithm to enrich features with the guidance of all levels of rich annotations from human annotators. We also evaluate the comparative efficacy, generality and scalability of this framework by conducting case studies on three distinct applications in various domains, including facilitating KBP slot filling systems. Empirical studies demonstrate that with little additional annotation time, we can significantly improve the performance for all tasks.

Finally, we explore utilizing knowledge bases in a real-world application - personalized content recommendation. Traditional systems infer user interests from surface-level features derived from online activity logs and user demographic profiles, rather than deeply understanding the context semantics. We conduct a systematic study to show the effectiveness of incorporating deep semantic knowledge encoded in the entities on modeling user interests, by utilizing the abundance of entity information from knowledge bases.
Ph.D. Thesis 2016 Improving SAT Solvers by Exploiting Empirical Characteristics of CDCL Oh, Chanseok Abstract | PDF

Title: Improving SAT Solvers by Exploiting Empirical Characteristics of CDCL

Candidate: Oh, Chanseok

Advisor(s): Wies, Thomas

Abstract:

The Boolean Satisfiability Problem (SAT) is a canonical decision problem originally shown to be NP-complete in Cook’s seminal work on the theory of computational complexity. The SAT problem is one of several computational tasks identified by researchers as core problems in computer science. The existence of an efficient decision procedure for SAT would imply P = NP. However, numerous algorithms and techniques for solving the SAT problem have been proposed in various forms in practical settings. Highly efficient solvers are now actively being used, either directly or as a core engine of a larger system, to solve real-world problems that arise from many application domains. These state-of-the-art solvers use the Davis-Putnam-Logemann-Loveland (DPLL) algorithm extended with Conflict-Driven Clause Learning (CDCL). Due to the practical importance of SAT, building a fast SAT solver can have a huge impact on current and prospective applications. The ultimate contribution of this thesis is improving the state of the art of CDCL by understanding and exploiting the empirical characteristics of how CDCL works on real-world problems. The first part of the thesis shows empirically that most of the unsatisfiable real-world problems solvable by CDCL have a refutation proof with near-constant width for the great portion of the proof. Based on this observation, the thesis provides an unconventional perspective that CDCL solvers can solve real-world problems very efficiently and often more efficiently just by maintaining a small set of certain classes of learned clauses. The next part of the thesis focuses on understanding the inherently different natures of satisfiable and unsatisfiable problems and their implications on the empirical workings of CDCL. We examine the varying degree of roles and effects of crucial elements of CDCL based on the satisfiability status of a problem.
Ultimately, we propose effective techniques to exploit the new insights about the different natures of proving satisfiability and unsatisfiability to improve the state of the art of CDCL. In the last part of the thesis, we present a reference solver that incorporates all the techniques described in the thesis. The design of the presented solver emphasizes minimality in implementation while guaranteeing state-of-the-art performance. Several versions of the reference solver have demonstrated top-notch performance, earning several medals in the annual SAT competitive events. The minimal spirit of the reference solver shows that a simple CDCL framework alone can still be made competitive with state-of-the-art solvers that implement sophisticated techniques outside the CDCL framework.
Ph.D. Thesis 2016 Graph-based Approaches to Resolve Entity Ambiguity Pershina, Maria Abstract | PDF

Title: Graph-based Approaches to Resolve Entity Ambiguity

Candidate: Pershina, Maria

Advisor(s): Grishman, Ralph

Abstract:

Information Extraction is the task of automatically extracting structured information from unstructured or semi-structured machine-readable documents. One of the challenges of Information Extraction is to resolve ambiguity between entities either in a knowledge base or in text documents. There are many variations of this problem and it is known under different names, such as coreference resolution, entity disambiguation, entity linking, entity matching, etc. For example, the task of coreference resolution decides whether two expressions refer to the same entity; entity disambiguation determines how to map an entity mention to an appropriate entity in a knowledge base (KB); the main focus of entity linking is to infer that two entity mentions in a document(s) refer to the same real world entity even if they do not appear in a KB; entity matching (also record deduplication, entity resolution, reference reconciliation) is to merge records from databases if they refer to the same object.

Resolving ambiguity and finding proper matches between entities is an important step for many downstream applications, such as data integration, question answering, relation extraction, etc. The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains, posing a scalability challenge for Information Extraction systems. Tools for automatically aligning these knowledge bases would make it possible to unify many sources of structured knowledge and to answer complex queries. However the efficient alignment of large-scale knowledge bases still poses a considerable challenge.

Various aspects and different settings to resolve ambiguity between entities are studied in this dissertation. A new scalable domain-independent graph-based approach utilizing Personalized PageRank is developed for entity matching across large-scale knowledge bases and evaluated on datasets of 110 million and 203 million entities. A new model for entity disambiguation between a document and a knowledge base utilizing a document graph and effectively filtering out noise is proposed. A new technique based on a paraphrase detection model is proposed to recognize name variations for an entity in a document. A new approach integrating a graph-based entity disambiguation model and this technique is presented for an entity linking task and is evaluated on a dataset for the Text Analysis Conference Entity Discovery and Linking 2014 task.
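As background for the graph-based approach mentioned above, the following is a minimal sketch of Personalized PageRank computed by power iteration on a small directed graph. It shows only the generic random walk with restart to a seed node, not the entity matching method of the dissertation. The function name, the restart probability `alpha`, the treatment of dangling nodes, and the toy graph are all assumptions made for this example; every node is assumed to appear as a key of the dictionary.

```python
def personalized_pagerank(graph, seed, alpha=0.15, iterations=100):
    """Random walk with restart: with probability alpha jump back to `seed`.

    `graph` maps each node to a list of out-neighbors. Returns node -> score,
    where the scores sum to 1.
    """
    nodes = list(graph)
    scores = {node: 0.0 for node in nodes}
    scores[seed] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in nodes}
        nxt[seed] += alpha * sum(scores.values())  # restart mass
        for node in nodes:
            out = graph[node]
            if out:
                share = (1.0 - alpha) * scores[node] / len(out)
                for neighbor in out:
                    nxt[neighbor] += share
            else:
                # Assumption: a node without out-links sends its mass to the seed.
                nxt[seed] += (1.0 - alpha) * scores[node]
        scores = nxt
    return scores

if __name__ == "__main__":
    toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
    for node, score in sorted(personalized_pagerank(toy, "a").items()):
        print(node, round(score, 4))
```

Nodes that cannot be reached from the seed, such as `d` in the toy graph, receive a score of zero, which is what makes the ranking specific to the seed.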
Ph.D. Thesis 2016 Partition Memory Models for Program Analysis Wang, Wei Abstract | PDF

Title: Partition Memory Models for Program Analysis

Candidate: Wang, Wei

Advisor(s): Barrett, Clark; Wies, Thomas

Abstract:

Scalability is a key challenge in static program analyses based on solvers for Satisfiability Modulo Theories (SMT). For imperative languages like C, the approach taken for modeling memory can play a significant role in scalability. The main theme of this thesis is using partitioned memory models to divide up memory based on the alias information derived from a points-to analysis.

First, a general analysis framework based on memory partitioning is presented. It incorporates a points-to analysis as a preprocessing step to determine a conservative approximation of which areas of memory may alias or overlap and splits the memory into distinct arrays for each of these areas.

Then we propose a new cell-based field-sensitive points-to analysis, which is an extension of Steensgaard’s unification-based algorithms. A cell is a unit of access with scalar or record type. Arrays and dynamic memory allocations are viewed as a collection of cells. We show how our points-to analysis yields more precise alias information for programs with complex heap data structures.

Our work is implemented in Cascade, a static analysis framework for C programs. It replaces the former flat memory model that models the memory as a single array of bytes. We show that the partitioned memory models achieve better scalability within Cascade, and the cell-based memory model, in particular, improves the performance significantly, making Cascade a state-of-the-art C analyzer.
Ph.D. Thesis 2016 Learning Algorithms from Data Zaremba, Wojciech Abstract | PDF

Title: Learning Algorithms from Data

Candidate: Zaremba, Wojciech

Advisor(s): Fergus, Rob; LeCun, Yann

Abstract:

Statistical machine learning is concerned with learning models that describe observations. We train our models from data on tasks like machine translation or object recognition because we cannot explicitly write down programs to solve such problems. A statistical model is only useful when it generalizes to unseen data. Solomonoff has proved that one should choose the model that agrees with the observed data, while preferring the model that can be compressed the most, because such a choice guarantees the best possible generalization. The size of the best possible compression of the model is called the Kolmogorov complexity of the model. We define an algorithm as a function with small Kolmogorov complexity.

This Ph.D. thesis outlines the problem of learning algorithms from data and shows several partial solutions to it. Our data model is mainly neural networks as they have proven to be successful in various domains like object recognition, language modeling, speech recognition and others. First, we examine empirical trainability limits for classical neural networks. Then, we extend them by providing interfaces, which provide a way to read memory, access the input, and postpone predictions. The model learns how to use them with reinforcement learning techniques like Reinforce and Q-learning. Next, we examine whether contemporary algorithms such as the convolution layer can be automatically rediscovered. We show that it is indeed possible to learn convolution as a special case in a broader range of models. Finally, we investigate whether it is directly possible to enumerate short programs and find a solution to a given problem. This follows the original line of thought behind Solomonoff induction. Our approach is to learn a prior over programs such that we can explore them efficiently.
Ph.D. Thesis 2016 Distributed Stochastic Optimization for Deep Learning Zhang, Sixin Abstract | PDF

Title: Distributed Stochastic Optimization for Deep Learning

Candidate: Zhang, Sixin

Advisor(s): LeCun, Yann

Abstract:

We study the problem of how to distribute the training of large-scale deep learning models in the parallel computing environment. We propose a new distributed stochastic optimization method called Elastic Averaging SGD (EASGD). We analyze the convergence rate of the EASGD method in the synchronous scenario and compare its stability condition with the existing ADMM method in the round-robin scheme. An asynchronous and momentum variant of the EASGD method is applied to train deep convolutional neural networks for image classification on the CIFAR and ImageNet datasets. Our approach accelerates the training and furthermore achieves better test accuracy. It also requires a much smaller amount of communication than other common baseline approaches such as the DOWNPOUR method.
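The synchronous elastic averaging update can be sketched on a toy one-dimensional problem. Each worker takes a gradient step plus an elastic pull toward a shared center variable, and the center moves toward the workers. The step size `eta`, the elastic strength `rho`, the number of workers, the quadratic objective, and the Gaussian gradient noise are all assumptions for illustration; the asynchronous and momentum variants used in the thesis are not shown.

```python
import random

def easgd(grad, x0, workers=4, eta=0.1, rho=0.5, steps=500, noise_std=0.1, seed=0):
    """Synchronous elastic averaging SGD on a scalar parameter.

    Worker update: x_i <- x_i - eta * (g_i(x_i) + rho * (x_i - center))
    Center update: center <- center + eta * rho * sum_i (x_i - center)
    Both updates use the values from the previous step.
    """
    rng = random.Random(seed)
    local = [x0] * workers
    center = x0
    for _ in range(steps):
        diffs = [x - center for x in local]
        local = [
            x - eta * (grad(x) + rng.gauss(0.0, noise_std) + rho * d)
            for x, d in zip(local, diffs)
        ]
        center = center + eta * rho * sum(diffs)
    return center, local

if __name__ == "__main__":
    # Toy objective f(x) = 0.5 * (x - 3)^2 with gradient x - 3.
    center, local = easgd(lambda x: x - 3.0, x0=0.0)
    print(center, local)
```

The workers only exchange the difference between their parameters and the center, which is the quantity that has to be communicated in a distributed setting.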

We then investigate the limit in speedup of the initial and the asymptotic phase of the mini-batch SGD, the momentum SGD, and the EASGD methods. We find that the spread of the input data distribution has a big impact on their initial convergence rate and stability region. We also find a surprising connection between the momentum SGD and the EASGD method with a negative moving average rate. A non-convex case is also studied to understand when EASGD can get trapped by a saddle point.

Finally, we scale up the EASGD method by using a tree-structured network topology. We show empirically its advantages and challenges. We also establish a connection between the EASGD and the DOWNPOUR method with the classical Jacobi and the Gauss-Seidel method, thus unifying a class of distributed stochastic optimization methods.
Ph.D. Thesis 2016 Pushing the Limits of Additive Fabrication Technologies Zhou, Qingnan (James) Abstract | PDF

Title: Pushing the Limits of Additive Fabrication Technologies

Candidate: Zhou, Qingnan (James)

Advisor(s): Zorin, Denis

Abstract:

A rough symmetry can be observed in the stock price of 3D Systems (NYSE:DDD), the leading and largest 3D printer manufacturer, from its IPO on June 3, 2011 to the beginning of 2016. The price skyrocketed nearly 600% from 2011 to the end of 2013, and took a free fall back to its original value by 2016. Coincidentally, it is also the period during which I got my hands dirty and investigated some of the toughest challenges as well as exciting new possibilities associated with different types of 3D printing technologies. In this thesis, I document my attempts from 3 different angles to push the limits of 3D printing: printability, microstructure design and robust geometry processing with mesh arrangements.

Printability checking has long been the bottleneck that prevents 3D printing from scaling up. Oftentimes, designers of 3D models lack the expertise or tools to ensure 3D printability. 3D printing service providers typically rely on human inspections to filter out unprintable designs. This process is manual and error-prone. As designs become ever more complex, manual printability checking becomes increasingly difficult. To tackle this problem, my colleagues and I proposed an algorithm to automatically determine structurally weak regions and the worst-case usage scenario to break a given model. We validate the algorithm by physically breaking a number of real 3D printed designs.

A key distinctive feature of 3D printing technologies is that the cost and time of fabrication is uncorrelated with geometric complexity. This opens up many exciting new possibilities. In particular, by pushing geometric complexity to the extreme, 3D printing has the potential of fabricating soft, deformable shapes with microscopic structures using a single raw material. In our recent SIGGRAPH publication, my colleagues and I have not only demonstrated fabricating microscopic frame structures is possible but also proposed an entire pipeline for designing spatially varying microstructures to satisfy target material properties or deformation goals.

With the boost of 3D printing technologies, 3D models have become more abundant and easily accessible than ever before. These models are sometimes known as "wild" models because they differ significantly in complexity and quality from traditional models in graphics research. This poses a serious challenge in robustly analyzing 3D designs. Many state-of-the-art geometry processing algorithms/libraries are ill-prepared for dealing with "wild" models that are non-manifold, self-intersecting, locally degenerate and/or containing multiple and possibly nested components. In our most recent SIGGRAPH submission, we proposed a systematic recipe based on mesh arrangements for conducting a family of exact constructive solid geometry operations. We exhaustively tested our algorithm on 10,000 "wild" models crawled from Thingiverse, a popular online shape repository. Both the code and the dataset are freely available to the public.
Ph.D. Thesis 2015 Big Data Analytics for Development: Events, Knowledge Graphs and Predictive Models Chakraborty, Sunandan Abstract | PDF

Title: Big Data Analytics for Development: Events, Knowledge Graphs and Predictive Models

Candidate: Chakraborty, Sunandan

Advisor(s): Subramanian, Lakshminarayanan; Nyarko, Yaw

Abstract:

Volatility in critical socio-economic indices can have a significant negative impact on global development. This thesis presents a suite of novel big data analytics algorithms that operate on unstructured Web data streams to automatically infer events, knowledge graphs and predictive models to understand, characterize and predict the volatility of socioeconomic indices.

This thesis makes four important research contributions. First, given a large volume of diverse unstructured news streams, we present new models for capturing events and learning spatio-temporal characteristics of events from news streams. We specifically explore two types of event models in this thesis: one centered around the concept of event triggers and a probabilistic meta-event model that explicitly delineates named entities from text streams to learn a generic class of meta-events. The second contribution focuses on learning several different types of knowledge graphs from news streams and events: a) Spatio-temporal article graphs that capture intrinsic relationships between different news articles; b) Event graphs that characterize relationships between events and, given a news query, provide a succinct summary of a timeline of events relating to the query; c) Event-phenomenon graphs that provide a condensed representation of classes of events that relate to a given phenomenon at a given location and time; d) Causality testing on word-word graphs, which can capture strong spatio-temporal relationships between word occurrences in news streams; e) Concept graphs that capture relationships between different word concepts that occur in a given text stream.

The third contribution focuses on connecting the different knowledge graph representations and structured time series data corresponding to a socio-economic index to automatically learn event-driven predictive models for the given socio-economic index to predict future volatility. We propose several types of predictive models centered around our two event models: event triggers and probabilistic meta-events. The final contribution focuses on a broad spectrum of inference case studies for different types of socio-economic indices including food prices, stock prices, disease outbreaks and interest rates. Across all these indices, we show that event-driven predictive models provide significant improvements in prediction accuracy over state-of-the-art techniques.
Ph.D. Thesis 2015 SMT-Based and Disjunctive Relational Abstract Domains for Static Analysis Chen, Junjie Abstract | PDF

Title: SMT-Based and Disjunctive Relational Abstract Domains for Static Analysis

Candidate: Chen, Junjie

Advisor(s): Patrick Cousot

Abstract:

Abstract Interpretation is a theory of sound approximation of program semantics. In recent decades, it has been widely and successfully applied to the static analysis of computer programs. In this thesis, we will work on abstract domains, one of the key concepts in abstract interpretation, which aim at automatically collecting information about the set of all possible values of the program variables. We will focus, in particular, on two aspects: the combination with theorem provers and the refinement of existing abstract domains.

Satisfiability modulo theories (SMT) solvers are popular theorem provers, which proved to be very powerful tools for checking the satisfiability of first-order logical formulas with respect to some background theories. In the first part of this thesis, we introduce two abstract domains whose elements are logical formulas involving finite conjunctions of affine equalities and finite conjunctions of linear inequalities. These two abstract domains rely on SMT solvers for the computation of transformations and other logical operations.

In the second part of this thesis, we present an abstract domain functor whose elements are binary decision trees. It is parameterized by decision nodes which are a set of boolean tests appearing in the programs and by a numerical or symbolic abstract domain whose elements are the leaves. This new binary decision tree abstract domain functor provides a flexible way of adjusting the cost/precision ratio in path-dependent static analysis.
Ph.D. Thesis 2015 Iris: Mitigating Phase Noise in Millimeter Wave OFDM Systems Dhananjay, Aditya Abstract | PDF

Title: Iris: Mitigating Phase Noise in Millimeter Wave OFDM Systems

Candidate: Dhananjay, Aditya

Advisor(s): Li, Jinyang

Abstract:

Next-generation wireless networks are widely expected to operate over millimeter-wave (mmW) frequencies of over 28GHz. These bands mitigate the acute spectrum shortage in the conventional microwave bands of less than 6GHz. The shorter wavelengths in these bands also allow for building dense antenna arrays on a single chip, thereby enabling various MIMO configurations and highly directional links that can increase the spatial reuse of spectrum.

While attempting to build a practical over-the-air (OTA) link over mmW, we realized that the traditional baseband processing techniques used in the microwave bands simply could not cope with the exacerbated frequency offsets (or phase noise) observed in the RF oscillators at these bands. While the frequency offsets are large, the real difficulty arose from the fact that they varied significantly over very short time-scales. Traditional feedback loop techniques still left significant residual offsets, which in turn led to inter-carrier-interference (ICI). The result was very high symbol error rates (SER).

This thesis presents Iris, a baseband processing block that enables clean mmW links, even in the presence of previously fatal amounts of phase noise. Over real mmW hardware, Iris reduces the SER by one to two orders of magnitude, as compared to competing techniques.
Ph.D. Thesis 2015 Predicting Images using Convolutional Networks: Visual Scene Understanding with Pixel Maps Eigen, David Abstract | PDF

Title: Predicting Images using Convolutional Networks: Visual Scene Understanding with Pixel Maps

Candidate: Eigen, David

Advisor(s): Fergus, Rob

Abstract:

In the greater part of this thesis, we develop a set of convolutional networks that infer predictions at each pixel of an input image. This is a common problem that arises in many computer vision applications: for example, predicting a semantic label at each pixel describes not only the image content, but also fine-grained locations and segmentations; at the same time, finding depth or surface normals provides 3D geometric relations between points. The second part of this thesis investigates convolutional models also in the contexts of classification and unsupervised learning.

To address our main objective, we develop a versatile Multi-Scale Convolutional Network that can be applied to diverse vision problems using simple adaptations, and apply it to predict depth at each pixel, surface normals and semantic labels. Our model uses a series of convolutional network stacks applied at progressively finer scales. The first uses the entire image field of view to predict a spatially coarse set of feature maps based on global relations; subsequent scales correct and refine the output, yielding a high resolution prediction. We look exclusively at depth prediction first, then generalize our method to multiple tasks. Our system achieves state-of-the-art results on all tasks we investigate, and can match many image details without the need for superpixelation.

Leading to our multi-scale network, we also design a purely local convolutional network to remove dirt and raindrops present on a window surface, which learns to identify and inpaint compact corruptions. We also investigate a weighted nearest-neighbors labeling system applied to superpixels, in which we learn weights for each example, and use local context to find rare class instances.

In addition, we investigate the relative importance of sizing parameters using a recursive convolutional network, finding that network depth is most critical. We also develop a Convolutional LISTA Autoencoder, which learns features similar to stacked sparse coding at a fraction of the cost, combine it with a local entropy objective, and describe a convolutional adaptation of ZCA whitening.
Ph.D. Thesis 2015 Unsupervised Feature Learning in Computer Vision Goroshin, Ross Abstract | PDF

Title: Unsupervised Feature Learning in Computer Vision

Candidate: Goroshin, Ross

Advisor(s): LeCun, Yann

Abstract:

Much of computer vision has been devoted to the question of representation through feature extraction. Ideal features transform raw pixel intensity values to a representation in which common problems such as object identification, tracking, and segmentation are easier to solve. Recently, deep feature hierarchies have proven to be immensely successful at solving many problems in computer vision. In the supervised setting, these hierarchies are trained to solve specific problems by minimizing an objective function of the data and problem-specific label information. Recent findings suggest that despite being trained on a specific task, the learned features can be transferred across multiple visual tasks. These findings suggest that there exists a generically useful feature representation for natural visual data.

This work aims to uncover the principles that lead to these generic feature representations in the unsupervised setting, which does not require problem specific label information. We begin by reviewing relevant prior work, particularly the literature on autoencoder networks and energy based learning. We introduce a new regularizer for autoencoders that plays an analogous role to the partition function in probabilistic graphical models. Next we explore the role of specialized encoder architectures for sparse inference. The remainder of the thesis explores visual feature learning from video. We establish a connection between slow-feature learning and metric learning, and experimentally demonstrate that semantically coherent metrics can be learned from natural videos. Finally, we posit that useful features linearize natural image transformations in video. To this end, we introduce a new architecture and loss for training deep feature hierarchies that linearize the transformations observed in unlabeled natural video sequences by learning to predict future frames in the presence of uncertainty.
Ph.D. Thesis 2015 Efficient and Trustworthy Theory Solver for Bit-vectors in Satisfiability Modulo Theories Hadarean, Liana Abstract | PDF

Title: Efficient and Trustworthy Theory Solver for Bit-vectors in Satisfiability Modulo Theories

Candidate: Hadarean, Liana

Advisor(s): Barrett, Clark

Abstract:

As software and hardware systems grow in complexity, automated techniques for ensuring their correctness are becoming increasingly important. Many modern formal verification tools rely on back-end satisfiability modulo theories (SMT) solvers to discharge complex verification goals. These goals are usually formalized in one or more fixed first-order logic theories, such as the theory of fixed-width bit-vectors. The theory of bit-vectors offers a natural way of encoding the precise semantics of typical machine operations on binary data. The predominant approach to deciding the bit-vector theory is via eager reduction to propositional logic. While this often works well in practice, it does not scale well as the bit-width and number of operations increase. The first part of this thesis seeks to fill this gap by exploring efficient techniques for solving bit-vector constraints that leverage the word-level structure. We propose two complementary approaches: an eager approach that takes full advantage of the solving power of off-the-shelf propositional logic solvers, and a lazy approach that combines on-the-fly algebraic reasoning with efficient propositional logic solvers. In the second part of the thesis, we propose a proof system for encoding automatically checkable refutation proofs in the theory of bit-vectors. These proofs can be automatically generated by the SMT solver, and act as a certificate for the correctness of the result.
Ph.D. Thesis 2015 Predicting the Market Value of Single-Family Residences Lowrance, Roy Abstract | PDF

Title: Predicting the Market Value of Single-Family Residences

Candidate: Lowrance, Roy

Advisor(s): LeCun, Yann; Shasha, Dennis

Abstract:

This work develops the best linear model of residential real estate prices for 2003 through 2009 in Los Angeles County. It differs from other studies comparing models for predicting house prices by covering a larger geographic area than most, more houses than most, a longer time period than most, and the time period both before and after the real estate price boom in the United States.

In addition, it open sources all of the software. We test designs for linear models to determine the best form for the model as well as the training period, features, and regularizer that produce the lowest errors. We compare the best of our linear models to random forests and point to directions for further research.
Ph.D. Thesis 2015 Building Fast, CPU-Efficient Distributed Systems on Ultra-Low Latency, RDMA-Capable Networks Mitchell, Christopher Abstract | PDF

Title: Building Fast, CPU-Efficient Distributed Systems on Ultra-Low Latency, RDMA-Capable Networks

Candidate: Mitchell, Christopher

Advisor(s): Li, Jinyang

Abstract:

Modern datacenters utilize traditional Ethernet interconnects to connect hundreds or thousands of machines. Although inexpensive and ubiquitous, Ethernet imposes design constraints on datacenter-scale distributed storage systems that use traditional client-server architectures. Recent technological trends indicate that future datacenters will embrace interconnects with ultra-low latency, high bandwidth, and the ability to offload work from servers to clients. Future datacenter-scale distributed storage systems will need to be designed specifically to exploit these features. This thesis explores what these features mean for large-scale in-memory storage systems, and derives two key insights for building RDMA-aware distributed systems.

First, relaxing locality between data and computation is now practical: data can be copied from servers to clients for computation. Second, selectively relaxing data-computation locality makes it possible to optimally balance load between server and client CPUs to maintain low application latency. This thesis presents two in-memory distributed storage systems built around these two insights, Pilaf and Cell, that demonstrate effective use of ultra-low-latency, RDMA-capable interconnects. Through Pilaf and Cell, this thesis demonstrates that by combining RDMA and message passing to selectively relax locality, systems can achieve ultra-low latency and optimal load balancing with modest CPU resources.
Ph.D. Thesis 2015 Instance Segmentation of RGBD Scenes Silberman, Nathan Abstract | PDF

Title: Instance Segmentation of RGBD Scenes

Candidate: Silberman, Nathan

Advisor(s): Fergus, Rob

Abstract:

The vast majority of literature in scene parsing can be described as semantic pixel labeling or semantic segmentation: predicting the semantic class of the object represented by each pixel in the scene. Our familiar perception of the world, however, provides a far richer representation. Firstly, rather than just being able to predict the semantic class of a location in a scene, humans are able to reason about object instances. Discriminating between a region that might represent a single object versus ten objects is a crucial and basic faculty. Secondly, rather than reasoning about objects as merely occupying the space visible from a single vantage point, we are able to quickly and easily reason about an object's true extent in 3D. Thirdly, rather than viewing a scene as a collection of objects independently existing in space, humans exhibit a representation of scenes that is highly grounded through an intuitive model of physics. Such models allow us to reason about how objects relate physically: via physical support relationships.

Instance segmentation is the task of segmenting a scene into regions which correspond to individual object instances. We argue that this task is not only closer to our own perception of the world than semantic segmentation, but also directly allows for subsequent reasoning about a scene's constituent elements. We explore various strategies for instance segmentation in indoor RGBD scenes.

Firstly, we explore tree-based instance segmentation algorithms. The utility of trees for semantic segmentation has been thoroughly demonstrated and we adapt them to instance segmentation and analyze both greedy and global approaches to inference.

Next, we investigate exemplar-based instance segmentation algorithms, in which a set of representative exemplars are chosen from a large pool of regions and pixels are assigned to exemplars. Inference can either be performed in two stages, exemplar selection followed by pixel-to-exemplar assignment, or in a single joint reasoning stage. We consider the advantages and disadvantages of each approach.

We introduce the task of support-relation prediction, in which we predict which objects are physically supporting other objects. We propose an algorithm and a new set of features for performing discriminative support prediction, demonstrate the effectiveness of our method, and compare training mechanisms.

Finally, we introduce an algorithm for inferring scene and object extent. We demonstrate how reasoning about 3D extent can be done by extending known 2D methods and highlight the strengths and limitations of this approach.
Ph.D. Thesis 2015 Localization of Humans in Images Using Convolutional Networks Tompson, Jonathan Abstract | PDF

Title: Localization of Humans in Images Using Convolutional Networks

Candidate: Tompson, Jonathan

Advisor(s): Bregler, Christopher

Abstract:

Tracking of humans in images is a long-standing problem in computer vision research for which, despite significant research effort, an adequate solution has not yet emerged. This is largely due to the fact that human body localization is complicated and difficult; potential solutions must find the location of body joints in images with invariance to shape, lighting and texture variation, and must do so in the presence of occlusion and incomplete data. However, despite these significant challenges, this work will present a framework for human body pose localization that not only offers a significant improvement over existing traditional architectures, but has sufficient localization performance and computational efficiency for use in real-world applications.

At its core, this framework makes use of Convolutional Networks to infer the location of body joints efficiently and accurately. We describe solutions to two applications: 1) hand-tracking from a depth image source and 2) human body-tracking from an RGB image source. For both of these applications we show that Convolutional Networks are able to significantly outperform the existing state of the art.

We propose a new hybrid architecture that consists of a deep Convolutional Network and a Probabilistic Graphical Model which can exploit structural domain constraints such as geometric relationships between body joint locations to improve tracking performance. We then explore the use of both color and motion features to improve tracking performance. Finally we introduce a novel architecture which includes an efficient ‘position refinement’ model that is trained to estimate the joint offset location within a small region of the image. This refinement model allows our network to improve spatial localization accuracy even with large amounts of spatial pooling.
Ph.D. Thesis 2015 Joint Training of a Neural Network and a Structured Model for Computer Vision Wan, Li Abstract | PDF

Title: Joint Training of a Neural Network and a Structured Model for Computer Vision

Candidate: Wan, Li

Advisor(s): Fergus, Rob

Abstract:

Identifying objects and telling where they are in real world images is one of the most important problems in Artificial Intelligence. The problem is challenging due to occluded objects, varying object viewpoints and object deformations. These factors make the vision problem extremely difficult, and it cannot be solved efficiently without learning.

This thesis explores hybrid systems that combine a neural network as a trainable feature extractor and structured models that capture high level information such as object parts. The resulting models combine the strengths of the two approaches: a deep neural network which provides a powerful non-linear feature transformation and a high level structured model which integrates domain-specific knowledge. We develop discriminative training algorithms to jointly optimize these entire models end-to-end.

First, we propose a unified model which combines a deep neural network with a latent topic model for image classification. The hybrid model is shown to outperform models based solely on neural networks or topic models alone. Next, we investigate techniques for training a neural network system, introducing an effective way of regularizing the network called DropConnect. DropConnect allows us to train large models while avoiding over-fitting. This yields state-of-the-art results on a variety of standard benchmarks for image classification. Third, we work on object detection for the PASCAL challenge. We improve the deformable parts model and propose a new non-maximal suppression algorithm. This system was the joint winner of the 2011 challenge. Finally, we develop a new hybrid model which integrates a deep network, a deformable parts model and non-maximal suppression. Joint training of our hybrid model shows a clear advantage over training each component individually and achieves competitive results on standard benchmarks.
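As a rough illustration of the regularizer named above, the following minimal sketch shows the general idea behind DropConnect as it is commonly described: during training, individual weights of a fully connected layer (rather than whole activations) are randomly zeroed. The function name `dropconnect_forward`, the `keep_prob` parameter, and the toy weights are assumptions made for this example; the nonlinearity and the inference-time procedure are omitted, and this is not the thesis's implementation.

```python
import random


def dropconnect_forward(x, weights, keep_prob, rng):
    """Training-time forward pass of one linear layer with a random mask on weights.

    Hypothetical helper for illustration only: each weight is kept
    independently with probability keep_prob and zeroed otherwise.
    """
    out = []
    for row in weights:
        total = 0.0
        for w, xi in zip(row, x):
            # Mask the individual connection, not the input unit as a whole.
            if rng.random() < keep_prob:
                total += w * xi
        out.append(total)
    return out


rng = random.Random(0)  # fixed seed so the random mask is reproducible
x = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.25], [2.0, 0.0, 1.0]]

dense = dropconnect_forward(x, weights, 1.0, rng)   # keep_prob 1.0: nothing is dropped
masked = dropconnect_forward(x, weights, 0.5, rng)  # about half the connections are dropped
```

With `keep_prob` set to 1.0 the layer reduces to an ordinary matrix-vector product, which makes the masked case easy to compare against.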
Ph.D. Thesis 2015 Partition Memory Models in Program Analysis Wang, Wei Abstract | PDF

Title: Partition Memory Models in Program Analysis

Candidate: Wang, Wei

Advisor(s): Barrett, Clark

Abstract:

Scalability is a key challenge in static program analyses based on solvers for Satisfiability Modulo Theories (SMT). For imperative languages like C, the approach taken for modeling memory can play a significant role in scalability. The main theme of this thesis is using partitioned memory models to divide up memory based on the alias information derived from a points-to analysis.

First, a general analysis framework based on memory partitioning is presented. It incorporates a points-to analysis as a preprocessing step to determine a conservative approximation of which areas of memory may alias or overlap and splits the memory into distinct arrays for each of these areas.

Then we propose a new cell-based field-sensitive points-to analysis, which is an extension of Steensgaard's unification-based algorithms. A cell is a unit of access with scalar or record type. Arrays and dynamic memory allocations are viewed as collections of cells. We show how our points-to analysis yields more precise alias information for programs with complex heap data structures.

Our work is implemented in Cascade, a static analysis framework for C programs. It replaces the former flat memory model that models the memory as a single array of bytes. We show that the partitioned memory models achieve better scalability within Cascade, and the cell-based memory model, in particular, improves the performance significantly, making Cascade a state-of-the-art C analyzer.
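For readers unfamiliar with unification-based points-to analysis, here is a minimal sketch of the plain, field-insensitive idea in the style of Steensgaard: every equivalence class of locations has at most one pointee class, and assignments merge classes with union-find. It handles only `p = &x` and `p = q`, and it does not show the cell-based, field-sensitive extension proposed in the thesis. The class `PointsTo` and its method names are hypothetical.

```python
class PointsTo:
    """Toy unification-based points-to analysis (illustrative assumption only)."""

    def __init__(self):
        self._parent = {}  # union-find parent links
        self._target = {}  # class representative -> node that class points to
        self._fresh = 0    # counter for placeholder pointee nodes

    def _find(self, node):
        self._parent.setdefault(node, node)
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]  # path halving
            node = self._parent[node]
        return node

    def _unify(self, a, b):
        ra, rb = self._find(a), self._find(b)
        if ra == rb:
            return
        ta, tb = self._target.pop(ra, None), self._target.pop(rb, None)
        self._parent[rb] = ra
        if ta is not None and tb is not None:
            # Merged classes must agree on a pointee, so unify the pointees too.
            self._target[ra] = ta
            self._unify(ta, tb)
        elif ta is not None or tb is not None:
            self._target[ra] = ta if ta is not None else tb

    def _pointee(self, var):
        root = self._find(var)
        if root not in self._target:
            self._fresh += 1
            self._target[root] = ("fresh", self._fresh)
        return self._target[root]

    def address_of(self, p, x):
        """Model the statement p = &x."""
        self._unify(self._pointee(p), x)

    def copy(self, p, q):
        """Model the statement p = q."""
        self._unify(self._pointee(p), self._pointee(q))

    def may_alias(self, p, q):
        tp = self._target.get(self._find(p))
        tq = self._target.get(self._find(q))
        return tp is not None and tq is not None and self._find(tp) == self._find(tq)


analysis = PointsTo()
analysis.address_of("p", "x")
analysis.address_of("q", "y")
before = analysis.may_alias("p", "q")  # p and q point to separate classes so far
analysis.copy("r", "p")
analysis.copy("r", "q")                # r may point to x or y, so x and y are merged
analysis.address_of("s", "z")
after = analysis.may_alias("p", "q")
unrelated = analysis.may_alias("p", "s")
```

The merge caused by `r` illustrates the precision lost by unification: `p` and `q` are reported as possible aliases even though neither was assigned to the other.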
Ph.D. Thesis 2014 On the Human Form: Efficient acquisition, modeling and manipulation of the human body Braga, Otavio Abstract | PDF

Title: On the Human Form: Efficient acquisition, modeling and manipulation of the human body

Candidate: Braga, Otavio

Advisor(s): Geiger, Davi

Abstract:

This thesis concerns the acquisition, modeling and manipulation of the human form.

First, we acquire body models. We introduce an efficient bootstrapped algorithm that we employed to register over 2,000 high resolution body scans of male and female adult subjects. Our algorithm outputs not only the traditional vertex correspondences, but also directly produces a high quality model which can be immediately deformed. We then employ the result to fit noisy depth maps coming from now commercially available 3D sensors such as Microsoft's Kinect and PrimeSense's Carmine.

We then switch focus to the topic of body manipulation. We first revisit the more traditional way of specifying bodies from a set of measurements, such as those coming from clothing sizing charts, showing how the statistics of the population learned during the registration can aid us in accurately defining the body shape. We then introduce a new manipulation metaphor, where we navigate through the space of body shapes and poses by directly dragging the body mesh surface.

We conclude by describing a new real-time system for image-based body manipulation called BodyJam, which lets you change your outfit with a finger snap. BodyJam is inspired by a technique invented by the surrealists a century ago: "Exquisite Corpse", a method by which a collection of images (of body parts) is collectively assembled. BodyJam does it on a video display that mirrors, in real time, the pose of a real person standing in front of the camera/display mirror, and allows the user to change clothes and other appearance attributes. Using Microsoft's Kinect, poses are matched to a video database of different torsos and legs, and "pages" showing different clothes are turned by hand gestures.
Ph.D. Thesis 2014 Analyzing Tatonnement Dynamics in Economic Markets Cheung, Yun Kuen Abstract | PDF

Title: Analyzing Tatonnement Dynamics in Economic Markets

Candidate: Cheung, Yun Kuen

Advisor(s): Cole, Richard

Abstract:

The impetus for this dissertation is to explain why well-functioning markets might be able to stay at or near a market equilibrium. We argue that tatonnement, a natural, simple and distributed price update dynamic in economic markets, is a plausible candidate to explain how markets might reach their equilibria.

Tatonnement is broadly defined as follows: if the demand for a good is more than the supply, increase the price of the good, and conversely, decrease the price when the demand is less than the supply. Prior works show that tatonnement converges to market equilibrium in some markets while it fails to converge in other markets. Our goal is to extend the classes of markets in which tatonnement is shown to converge. The prior positive results largely concerned markets with substitute goods. We seek market constraints which enable tatonnement to converge in markets with complementary goods, or with a mixture of substitutes and complementary goods. We also show fast convergence rates for some of these markets.
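The price update rule described above can be sketched on a toy market. The example below assumes a small market with unit supply of each good and buyers who spend a fixed fraction of their budget on each good (Cobb-Douglas preferences); the multiplicative update, the step size, and all names and numbers are assumptions chosen for illustration, not the specific markets or update rules analyzed in the dissertation.

```python
def excess_demand(prices, budgets, shares):
    """Demand minus supply per good in a toy market with unit supply (assumed model)."""
    n_goods = len(prices)
    demand = [0.0] * n_goods
    for budget, share in zip(budgets, shares):
        for j in range(n_goods):
            # Each buyer spends the fraction share[j] of its budget on good j.
            demand[j] += budget * share[j] / prices[j]
    return [d - 1.0 for d in demand]


def tatonnement(prices, budgets, shares, step=0.5, rounds=60):
    """Raise the price of over-demanded goods and lower it for under-demanded ones."""
    prices = list(prices)
    for _ in range(rounds):
        z = excess_demand(prices, budgets, shares)
        prices = [p * (1.0 + step * zj) for p, zj in zip(prices, z)]
    return prices


budgets = [2.0, 1.0]
shares = [[0.5, 0.5], [0.25, 0.75]]  # rows: buyers, columns: goods

initial_excess = excess_demand([1.0, 1.0], budgets, shares)
final_prices = tatonnement([1.0, 1.0], budgets, shares)
final_excess = excess_demand(final_prices, budgets, shares)
```

In this particular toy market the total spending on each good is fixed, so the update moves each price steadily toward the value at which demand equals the unit supply (1.25 and 1.75 here).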

We introduce an amortized analysis technique to handle asynchronous events, in our case asynchronous price updates. Separately, for some markets we show that tatonnement is equivalent to generalized gradient descent (GGD). The amortized analysis and our analysis of GGD may be of independent interest.
Ph.D. Thesis 2014 Low-latency Image Recognition with GPU-accelerated Convolutional Networks for Web-based Services Huang, Fu Jie Abstract | PDF

Title: Low-latency Image Recognition with GPU-accelerated Convolutional Networks for Web-based Services

Candidate: Huang, Fu Jie

Advisor(s): LeCun, Yann

Abstract:

In this work, we describe an application of convolutional networks to object classification and detection in images. The task of image based object recognition is surveyed in the first chapter. Its application in internet advertisement is one of the main motivations of this work.

The architecture of the convolutional networks is described in detail in the following chapter. Stochastic gradient descent is used to train the networks.

We then describe the data collection and labelling process. The labelled training data largely determines what kind of recognizer is built. Four binary classifiers are trained for the object types of sailboat, car, motorbike, and dog.

A massively parallel GPU-based implementation of the convolutional networks is built. This enables us to run the convolution operations close to 40 times faster than on a traditional CPU. Details of how to implement the convolution operation on NVIDIA GPUs using CUDA are discussed.

In order to apply the object recognizer in a production environment where millions of images are processed daily, we have built a platform with cloud computing. We describe how large scale and low latency image processing can be achieved with such a system.
Ph.D. Thesis 2014 Effective Algorithms for the Satisfiability of Quantifier-Free Formulas Over Linear Real and Integer Arithmetic King, Tim Abstract | PDF

Title: Effective Algorithms for the Satisfiability of Quantifier-Free Formulas Over Linear Real and Integer Arithmetic

Candidate: King, Tim

Advisor(s): Barrett, Clark

Abstract:

A core technique of modern tools for formally reasoning about computing systems is generating and dispatching queries to automated theorem provers, including Satisfiability Modulo Theories (SMT) provers. SMT provers aim at the tight integration of decision procedures for propositional satisfiability and decision procedures for fixed first-order theories ‒ known as theory solvers. This thesis presents several advancements in the design and implementation of theory solvers for quantifier-free linear real, integer, and mixed integer and real arithmetic. These are implemented within the SMT system CVC4. We begin by formally describing the Satisfiability Modulo Theories problem and the role of theory solvers within CVC4. We discuss known techniques for building solvers for quantifier-free linear real, integer, and mixed integer and real arithmetic around the Simplex for DPLL(T) algorithm. We give several small improvements to theory solvers using this algorithm and describe the implementation and theory of this algorithm in detail. To extend the class of problems that the theory solver can robustly support, we borrow and adapt several techniques from linear programming (LP) and mixed integer programming (MIP) solvers which come from the tradition of optimization. We propose a new decision procedure for quantifier-free linear real arithmetic that replaces the Simplex for DPLL(T) algorithm with a variant of the Simplex algorithm that performs a form of optimization ‒ minimizing the sum of infeasibilities. In this thesis, we additionally describe techniques for leveraging LP and MIP solvers to improve the performance of SMT solvers without compromising correctness. Previous efforts to leverage such solvers in the context of SMT have concluded that in addition to being potentially unsound, such solvers are too heavyweight to compete in the context of SMT.
We present an empirical comparison against other state-of-the-art SMT tools to demonstrate the effectiveness of the proposed solutions.
Ph.D. Thesis 2014 Cryptographic Algorithms for the Secure Delegation of Multiparty Computation Lopez-Alt, Adriana Abstract | PDF

Title: Cryptographic Algorithms for the Secure Delegation of Multiparty Computation

Candidate: Lopez-Alt, Adriana

Advisor(s): Dodis, Yevgeniy

Abstract:

In today’s world, we store our data and perform expensive computations remotely on powerful servers (a.k.a. “the cloud”) rather than on our local devices. In this dissertation we study the question of achieving cryptographic security in the setting where multiple (mutually distrusting) clients wish to delegate the computation of a joint function on their inputs to an untrusted cloud, while keeping these inputs private. We introduce two frameworks for modeling such protocols.
1. The first, called cloud-assisted multiparty computation (cloud-assisted MPC), builds on the standard notion of MPC to incorporate the concept of delegation. In particular, since the cloud is expected to perform the computation of the function, our definition requires the communication complexity of the protocol, as well as the computation time of all clients to be (essentially) independent of the complexity of the function.
2. The second, called on-the-fly MPC, builds on the notion of cloud-assisted MPC and further requires that the clients be involved only when initially uploading their input to the cloud, and in a final phase when outputs are revealed. In particular, this allows the server to dynamically choose functions (and subsets of data on which to evaluate these functions) “on-the-fly”, and evaluate them without requiring any interaction with the clients. The only interaction required takes place in the final phase after the computation has been completed, when the clients must retroactively approve both the chosen functions, and the subsets of data upon which these functions were evaluated.
We construct cloud-assisted and on-the-fly MPC protocols using fully homomorphic encryption (FHE). However, FHE requires inputs to be encrypted under the same key; we extend it to the multiparty setting in two ways:
1. We introduce the notion of threshold FHE: fully homomorphic encryption that allows the clients to jointly generate a common public key (whose corresponding secret key is shared among them), as well as decrypt a ciphertext under this public key without learning anything but the plaintext. Using threshold FHE, we show how to construct an efficient cloud-assisted MPC protocol. We construct threshold FHE using (a modification of) the Brakerski-Vaikuntanathan (ring-based) FHE scheme; however our ideas extend to many other lattice-based FHE schemes in the literature.
2. We introduce the notion of multikey FHE: fully homomorphic encryption that allows the cloud to perform homomorphic evaluation on ciphertexts encrypted under different and independent keys. We show a construction of on-the-fly MPC using multikey FHE, and construct a multikey FHE scheme based on NTRU encryption. We highlight that it was previously not known how to make NTRU fully homomorphic, even for a single key. Therefore, we view the construction of (multikey) FHE from NTRU encryption as a main contribution of independent interest.
Ph.D. Thesis 2014 Robust and Efficient Methods for Approximation and Optimization of Stability Measures Mitchell, Tim Abstract | PDF

Title: Robust and Efficient Methods for Approximation and Optimization of Stability Measures

Candidate: Mitchell, Tim

Advisor(s): Overton, Michael

Abstract:

We consider two new algorithms with practical application to the problem of designing controllers for linear dynamical systems with input and output: a new spectral value set based algorithm called hybrid expansion-contraction intended for approximating the H-infinity norm, or equivalently, the complex stability radius, of large-scale systems, and a new BFGS SQP based optimization method for nonsmooth, nonconvex constrained optimization motivated by multi-objective controller design. In comprehensive numerical experiments, we show that both algorithms in their respective domains are significantly faster and more robust than other available alternatives. Moreover, we present convergence guarantees for hybrid expansion-contraction, proving that it converges at least superlinearly, and observe that it converges quadratically in practice, and typically to good approximations to the H-infinity norm, for problems where we can verify this. We also extend the hybrid expansion-contraction algorithm to the real stability radius, a measure which is known to be more difficult to compute than the complex stability radius. Finally, for the purposes of comparing multiple optimization methods, we present a new visualization tool called relative minimization profiles that allows for simultaneously assessing the relative performance of algorithms with respect to three important performance characteristics, highlighting how these measures interrelate to one another and compare to the other competing algorithms on heterogeneous test sets. We employ relative minimization profiles to empirically validate our proposed BFGS SQP method in terms of quality of minimization, attaining feasibility, and speed of progress compared to other available methods on challenging test sets comprised of nonsmooth, nonconvex constrained optimization problems arising in controller design.
Ph.D. Thesis 2014 Building Efficient Distributed In-memory Systems Power, Russell Abstract | PDF

Title: Building Efficient Distributed In-memory Systems

Candidate: Power, Russell

Advisor(s): Li, Jinyang

Abstract:

The recent cloud computing revolution has changed the distributed computing landscape, making the resources of entire datacenters available to ordinary users. This process has been greatly aided by dataflow style frameworks such as MapReduce which expose a simple model for programs, allowing for efficient, fault-tolerant execution across many machines. While the MapReduce model has proved to be effective for many applications, there is a wide class of applications which are difficult to write or inefficient in such a model. This includes many familiar and important applications such as PageRank, matrix factorization and a number of machine learning algorithms. In lieu of a good framework for building these applications, users resort to writing applications using MPI or RPC, a difficult and error-prone construction.

This thesis presents two complementary frameworks, Piccolo and Spartan, which help programmers to write in-memory distributed applications not served well by existing approaches.

Piccolo presents a new data-centric programming model for in-memory applications. Unlike data-flow models, Piccolo allows programs running on different machines to share distributed, mutable state via a key-value table interface. This design allows for both high performance and additional flexibility. Piccolo makes novel use of commutative updates to efficiently resolve write-write conflicts. We find Piccolo provides an efficient backend for a wide range of applications: from PageRank and matrix multiplication to web-crawling.
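
As a rough illustration of why commutative updates sidestep write-write conflicts, the sketch below models a shared table in a single process. It is not Piccolo's API: the names AccumulatorTable and apply_updates are hypothetical, and the "workers" are just lists of updates. The point is only that when the merge function is commutative and associative (here, addition), the final value does not depend on the order in which updates arrive.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple


class AccumulatorTable:
    """Toy, single-process stand-in for a shared key-value table (hypothetical)."""

    def __init__(self, accumulate: Callable[[float, float], float]):
        # `accumulate` merges an incoming update into the stored value.
        # It is assumed to be commutative and associative.
        self._accumulate = accumulate
        self._data: Dict[Hashable, float] = {}

    def update(self, key: Hashable, value: float) -> None:
        if key in self._data:
            self._data[key] = self._accumulate(self._data[key], value)
        else:
            self._data[key] = value

    def get(self, key: Hashable, default: float = 0.0) -> float:
        return self._data.get(key, default)


def apply_updates(
    updates: Iterable[Tuple[Hashable, float]],
    accumulate: Callable[[float, float], float] = lambda a, b: a + b,
) -> AccumulatorTable:
    """Apply a stream of (key, value) updates to a fresh table."""
    table = AccumulatorTable(accumulate)
    for key, value in updates:
        table.update(key, value)
    return table


# Two hypothetical workers both write to "page_a".
worker_1 = [("page_a", 0.25), ("page_b", 0.5)]
worker_2 = [("page_a", 0.5), ("page_c", 0.125)]

# Either arrival order yields the same table contents.
in_order = apply_updates(worker_1 + worker_2)
reordered = apply_updates(worker_2 + worker_1)
assert in_order.get("page_a") == reordered.get("page_a") == 0.75
```

Because no update overwrites another, the table never has to pick a "winner" between concurrent writers; it only has to merge.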

While Piccolo provides an efficient backend for distributed computation, it can still be somewhat cumbersome to write programs using it directly. To address this, we created Spartan. Spartan is a distributed implementation of the NumPy array language, and fully supports important array language features such as spatial indexing (slicing), fancy indexing and broadcasting. A key feature of Spartan is its use of a small number of simple, powerful high-level operators to provide most functionality. Not only do these operators dramatically simplify the design and implementation of Spartan, they also allow users to implement new functionality with ease.

We evaluate Piccolo and Spartan on a wide range of applications and find that they both perform significantly better than existing approaches.
Ph.D. Thesis 2014 Runtime Compilation of Array-Oriented Python Programs Rubinsteyn, Alex Abstract | PDF

Title: Runtime Compilation of Array-Oriented Python Programs

Candidate: Rubinsteyn, Alex

Advisor(s): Shasha, Dennis

Abstract:

The Python programming language has become a popular platform for data analysis and scientific computing. To mitigate the poor performance of Python's standard interpreter, numerically intensive computations are typically offloaded to library functions written in languages such as Fortran or C. If, however, some algorithm does not have an existing low-level implementation, then the scientific programmer must either accept sub-standard performance (sometimes orders of magnitude slower than native code) or themselves implement the desired functionality in a less productive but more efficient language.

To alleviate this problem, this thesis presents Parakeet, a runtime compiler for an array-oriented subset of Python. Parakeet does not replace the Python interpreter, but rather selectively augments it by compiling and executing functions explicitly marked by the programmer. Parakeet uses runtime type specialization to eliminate the performance-defeating dynamism of untyped Python code. Parakeet's pervasive use of data parallel operators as a means for implementing array operations enables high-level restructuring optimization and compilation to parallel hardware such as multi-core CPUs and graphics processors. We evaluate Parakeet on a collection of numerical benchmarks and demonstrate its dramatic capacity for accelerating array-oriented Python programs.
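
The dispatch side of runtime type specialization can be sketched as follows. This is not Parakeet's interface: the decorator name specialize and the helper compile_for are hypothetical, and no native code is generated. The sketch only shows the idea of marking a function explicitly and keeping one "compiled" version per combination of argument types observed at call time.

```python
import functools


def compile_for(fn, signature):
    """Hypothetical stand-in for a compiler back end.

    A real runtime compiler would emit code specialized to `signature`;
    this sketch simply returns the original Python function.
    """
    return fn


def specialize(fn):
    """Mark `fn` for specialization, caching one version per type signature."""
    cache = {}

    @functools.wraps(fn)
    def wrapper(*args):
        signature = tuple(type(a) for a in args)
        if signature not in cache:
            # First call with these argument types: "compile" and remember.
            cache[signature] = compile_for(fn, signature)
        return cache[signature](*args)

    wrapper.specializations = cache  # exposed only for inspection
    return wrapper


@specialize
def dot(xs, ys):
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total


dot([1, 2, 3], [4, 5, 6])        # first (list, list) call triggers a specialization
dot((1.0, 2.0), (3.0, 4.0))      # (tuple, tuple) triggers a second one
dot([7, 8], [9, 10])             # reuses the cached (list, list) version
assert len(dot.specializations) == 2
```

Unmarked functions are untouched, which mirrors the abstract's point that the interpreter is augmented rather than replaced.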
Ph.D. Thesis 2014 A Deep Learning Pipeline for Image Understanding and Acoustic Modeling Sermanet, Pierre Abstract | PDF

Title: A Deep Learning Pipeline for Image Understanding and Acoustic Modeling

Candidate: Sermanet, Pierre

Advisor(s): LeCun, Yann

Abstract:

One of the biggest challenges artificial intelligence faces is making sense of the real world through sensory signals such as audio or video. Noisy inputs, varying object viewpoints, deformations and lighting conditions turn it into a high-dimensional problem which cannot be efficiently solved without learning from data.

This thesis explores a general way of learning from high dimensional data (video, images, audio, text, financial data, etc.) called deep learning. It thrives on the increasingly large amounts of data available to learn robust and invariant internal features in a hierarchical manner directly from the raw signals.

We propose a unified pipeline for feature learning, recognition, localization and detection using Convolutional Networks (ConvNets) that can obtain state-of-the-art accuracy on a number of pattern recognition tasks, including acoustic modeling for speech recognition and object recognition in computer vision. ConvNets are particularly well suited for learning from continuous signals in terms of both accuracy and efficiency.

Additionally, a novel and general deep learning approach to detection is proposed and successfully demonstrated on the most challenging vision datasets. We then generalize it to other modalities such as speech data. This approach allows accurate localization and detection of objects in images or phones in voice signals by learning to predict boundaries from internal representations. We extend the reach of deep learning from classification to detection tasks in an integrated fashion by learning multiple tasks using a single deep model. This work is among the first to outperform human vision and establishes a new state of the art on some computer vision and speech recognition benchmarks.
Ph.D. Thesis 2014 Towards New Interfaces For Pedagogy Stein, Murphy Abstract | PDF

Title: Towards New Interfaces For Pedagogy

Candidate: Stein, Murphy

Advisor(s): Perlin, Ken

Abstract:

Developing technology to help people teach and learn is an important topic in Human Computer Interaction (HCI).

In this thesis we present three studies on this topic. In the first study, we demonstrate new games for learning mathematics and discuss the evidence for key design decisions from user studies. In the second study, we develop a real-time video compositing system for distance education and share evidence for its potential value compared to standard techniques from two user studies. In the third study, we demonstrate our markerless hand tracking interface for real-time 3D manipulation and explain its advantages compared to other state-of-the-art methods.

Ph.D. Thesis 2014 Computational Complexity Implications of Secure Coin-Flipping Tentes, Aristeidis Abstract | PDF

Title: Computational Complexity Implications of Secure Coin-Flipping

Candidate: Tentes, Aristeidis

Advisor(s): Dodis, Yevgeniy

Abstract:

Modern cryptography is based on computational intractability assumptions, e.g., Factoring, Discrete Logarithm, Diffie-Hellman, etc. However, since an assumption might be proven incorrect, there has been much focus on constructing cryptographic primitives based on the most minimal possible assumption. The most popular minimal assumption, which is implied by the existence of almost all cryptographic primitives, is the existence of One-Way Functions. Coin-Flipping protocols are known to be implied by One-Way Functions; however, a complete characterization of the inverse direction is not known. There was even speculation that weak notions of Coin-Flipping protocols might be strictly weaker than One-Way Functions. In this thesis we show that even very weak notions of Coin-Flipping protocols do imply One-Way Functions. In particular, we show that the existence of a coin-flipping protocol safe against any non-trivial constant bias (e.g., 0.499) implies the existence of One-Way Functions. This improves upon a recent result of Haitner and Omri [FOCS '11], who proved this implication for protocols with bias 0.207. Unlike the former result, our result also holds for weak coin-flipping protocols.
Ph.D. Thesis 2014 Data-driven Approaches for Paraphrasing across Language Variations Xu, Wei Abstract | PDF

Title: Data-driven Approaches for Paraphrasing across Language Variations

Candidate: Xu, Wei

Advisor(s): Grishman, Ralph

Abstract:

Our language changes very rapidly, accompanying political, social and cultural trends, as well as the evolution of science and technology. The Internet, especially social media, has accelerated this process of change. This poses a severe challenge for both human beings and natural language processing (NLP) systems, which usually only model a snapshot of language presented in the form of text corpora within a certain domain and time frame.

While much previous effort has investigated monolingual paraphrase and bilingual translation, we focus on modeling meaning-preserving transformations between variants of a single language. We use Shakespearean and Internet language as examples to investigate various aspects of this new paraphrase problem, including acquisition, generation, detection and evaluation.

A data-driven methodology is applied intensively throughout the course of this study. Several paraphrase corpora are constructed using automatic techniques, experts and crowdsourcing platforms. Paraphrase systems are trained and evaluated by using these data as a cornerstone. We show that even with a very noisy or a relatively small amount of parallel training data, it is possible to learn paraphrase models which capture linguistic phenomena. This work expands the scope of paraphrase studies to targeting different language variations, and more potential applications, such as text normalization and domain adaptation.
Ph.D. Thesis 2014 Positive-Unlabeled Learning in the Context of Protein Function Prediction Youngs, Noah Abstract | PDF

Title: Positive-Unlabeled Learning in the Context of Protein Function Prediction

Candidate: Youngs, Noah

Advisor(s): Shasha, Dennis

Abstract:

With the recent proliferation of large, unlabeled data sets, a particular subclass of semi-supervised learning problems has become more prevalent. Known as positive-unlabeled learning (PU learning), this scenario provides only positive labeled examples, usually just a small fraction of the entire dataset, with the remaining examples unknown and thus potentially belonging to either the positive or negative class. Since the vast majority of traditional machine learning classifiers require both positive and negative examples in the training set, a new class of algorithms has been developed to deal with PU learning problems.

A canonical example of this scenario is topic labeling of a large corpus of documents. Once the size of a corpus reaches into the thousands, it becomes largely infeasible to have a curator read even a sizable fraction of the documents, and annotate them with topics. In addition, the entire set of topics may not be known, or may change over time, making it impossible for a curator to annotate which documents are NOT about certain topics. Thus a machine learning algorithm needs to be able to learn from a small set of positive examples, without knowledge of the negative class, and knowing that the unlabeled training examples may contain an arbitrary number of additional but as yet unknown positive examples. Another example of a PU learning scenario recently garnering attention is the protein function prediction problem (PFP problem).

While the number of organisms with fully sequenced genomes continues to grow, the progress of annotating those sequences with the biological functions that they perform lags far behind. Machine learning methods have already been successfully applied to this problem, but with many organisms having a small number of positive annotated training examples, and the lack of availability of almost any labeled negative examples, PU learning algorithms can make large gains in predictive performance.

The first part of this dissertation motivates the protein function prediction problem, explores previous work, and introduces novel methods that improve upon previously reported benchmarks for a particular type of learning algorithm, known as Gaussian Random Field Label Propagation (GRFLP). In addition, we present improvements to the computational efficiency of the GRFLP algorithm, and a modification to the traditional structure of the PFP learning problem that allows for simultaneous prediction across multiple species.

The second part of the dissertation focuses specifically on the positive-unlabeled aspects of the PFP problem. Two novel algorithms are presented, and rigorously compared to existing PU learning techniques in the context of protein function prediction. Additionally, we take a step back and examine some of the theoretical considerations of the PU scenario in general, and provide an additional novel algorithm applicable in any PU context. This algorithm is tailored for situations in which the labeled positive examples are a small fraction of the set of true positive examples, and where the labeling process may be subject to some type of bias rather than being a random selection of true positives (arguably some of the most difficult PU learning scenarios).

The third and fourth sections return to the PFP problem, examining the power of tertiary structure as a predictor of protein function, as well as presenting two case studies of function prediction performance on novel benchmarks. Lastly, we conclude with several promising avenues of future research into both PU learning in general, and the protein function prediction problem specifically.
Ph.D. Thesis 2014 Hierarchical Convolutional Deep Learning in Computer Vision Zeiler, Matthew Abstract | PDF

Title: Hierarchical Convolutional Deep Learning in Computer Vision

Candidate: Zeiler, Matthew

Advisor(s): Fergus, Rob

Abstract:

It has long been the goal in computer vision to learn a hierarchy of features useful for object recognition. Spanning the two traditional paradigms of machine learning, unsupervised and supervised learning, we investigate the application of deep learning methods to tackle this challenging task and to learn robust representations of images.

We begin our investigation with the introduction of a novel unsupervised learning technique called deconvolutional networks. Based on convolutional sparse coding, we show this model learns interesting decompositions of images into parts without object label information. This method, which easily scales to large images, becomes increasingly invariant by learning multiple layers of feature extraction coupled with pooling layers. We introduce a novel pooling method called Gaussian pooling to enable these layers to store continuous location information while being differentiable, creating a unified objective function to optimize.

In the supervised learning domain, a well-established model for recognition of objects is the convolutional network. We introduce a new regularization method for convolutional networks called stochastic pooling which relies on sampling noise to prevent these powerful models from overfitting. Additionally, we show novel visualizations of these complex models to better understand what they learn and to provide insight on how to develop state-of-the-art architectures for large-scale classification of 1,000 different object categories.
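
A minimal sketch of the sampling idea behind stochastic pooling, assuming non-negative activations (for example, after a rectifying nonlinearity): within each pooling region, one activation is drawn with probability proportional to its value, and a probability-weighted average is used when sampling is switched off. The function names stochastic_pool and expected_pool are hypothetical, and the code handles a single flattened region rather than a full feature map.

```python
import random
from typing import Sequence


def stochastic_pool(region: Sequence[float], rng: random.Random) -> float:
    """Sample one activation from `region`, weighted by activation value.

    Assumes all activations are non-negative. An all-zero region pools to 0.
    """
    total = sum(region)
    if total == 0:
        return 0.0
    # p_i = a_i / sum(a): larger activations are picked more often,
    # but smaller ones still get through occasionally.
    return rng.choices(region, weights=region, k=1)[0]


def expected_pool(region: Sequence[float]) -> float:
    """Deterministic counterpart: sum_i p_i * a_i with p_i = a_i / sum(a)."""
    total = sum(region)
    if total == 0:
        return 0.0
    return sum(a * a for a in region) / total


rng = random.Random(0)  # seeded only to make the sketch repeatable
sampled = stochastic_pool([1.0, 3.0], rng)   # either 1.0 or 3.0
averaged = expected_pool([1.0, 3.0])         # (1 + 9) / 4 = 2.5
```

The injected sampling noise is what acts as the regularizer; max pooling would always return 3.0 for this region.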

We also investigate some other related problems in deep learning. First, we introduce a model for the task of mapping one high dimensional time series sequence onto another. Second, we address the choice of nonlinearity in neural networks, showing evidence that rectified linear units outperform other types in automatic speech recognition. Finally, we introduce a novel optimization method called ADADELTA which shows promising convergence speeds in practice while being robust to hyper-parameter selection.
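
The abstract does not state the ADADELTA update rule. The sketch below follows the rule as commonly published for the method: each parameter keeps a decaying average of squared gradients and of squared updates, and the step is the gradient scaled by the ratio of their root-mean-square values. The function name adadelta_minimize, the default values of rho and eps, and the quadratic test problem are illustrative assumptions, not values taken from the thesis.

```python
import math
from typing import Callable, List, Sequence


def adadelta_minimize(
    grad: Callable[[List[float]], Sequence[float]],
    x0: Sequence[float],
    rho: float = 0.95,    # decay rate of the running averages (assumed default)
    eps: float = 1e-6,    # small constant that keeps the RMS terms positive
    steps: int = 1000,
) -> List[float]:
    """Run `steps` ADADELTA updates from `x0` and return the final point."""
    x = list(x0)
    avg_sq_grad = [0.0] * len(x)   # E[g^2]
    avg_sq_step = [0.0] * len(x)   # E[dx^2]
    for _ in range(steps):
        g = grad(x)
        for i in range(len(x)):
            avg_sq_grad[i] = rho * avg_sq_grad[i] + (1 - rho) * g[i] ** 2
            # dx = -(RMS[dx]_{t-1} / RMS[g]_t) * g ; note: no learning rate.
            step = -(math.sqrt(avg_sq_step[i] + eps)
                     / math.sqrt(avg_sq_grad[i] + eps)) * g[i]
            avg_sq_step[i] = rho * avg_sq_step[i] + (1 - rho) * step ** 2
            x[i] += step
    return x


def quadratic_grad(x: List[float]) -> List[float]:
    """Gradient of f(x) = (x - 3)^2, a toy objective with its minimum at 3."""
    return [2.0 * (x[0] - 3.0)]


# A short run moves the iterate from 0 toward the minimizer at 3.
x_after = adadelta_minimize(quadratic_grad, [0.0], steps=50)
```

With a small eps the first steps are tiny and grow as the running averages warm up, so this sketch makes no claim about how many iterations a given tolerance requires.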
Ph.D. Thesis 2013 Incentive-Centered Design of Money-Free Mechanisms Gkatzelis, Vasilis Abstract | PDF

Title: Incentive-Centered Design of Money-Free Mechanisms

Candidate: Gkatzelis, Vasilis

Advisor(s): Cole, Richard

Abstract:

This thesis serves as a step toward a better understanding of how to design fair and efficient multiagent resource allocation systems by bringing the incentives of the participating agents to the center of the design process. As the quality of these systems critically depends on the ways in which the participants interact with each other and with the system, an ill-designed set of incentives can lead to severe inefficiencies. The special focus of this work is on the problems that arise when the use of monetary exchanges between the system and the participants is prohibited. This is a common restriction that substantially complicates the designer's task; we nevertheless provide a sequence of positive results in the form of mechanisms that maximize efficiency or fairness despite the possibly self-interested behavior of the participating agents.

The first part of this work is a contribution to the literature on approximate mechanism design without money. Given a set of divisible resources, our goal is to design a mechanism that allocates them among the agents. The main complication here is due to the fact that the agents' preferences over different allocations of these resources may not be known to the system. Therefore, the mechanism needs to be designed in such a way that it is in the best interest of every agent to report the truth about her preferences; since monetary rewards and penalties cannot be used in order to elicit the truth, a much more delicate regulation of the resource allocation is necessary. Our contribution mostly revolves around a new truthful mechanism that we propose, which we call the /Partial Allocation/ mechanism. We first show how to use the two-agent version of this mechanism to create a system with the best currently known worst-case efficiency guarantees for problem instances involving two agents. We then consider fairness measures and prove that the general version of this elegant mechanism yields surprisingly good approximation guarantees for the classic problem of fair division. More specifically, we use the well established solution of /Proportional Fairness/ as a benchmark and we show that for an arbitrary number of agents and resources, and for a very large class of agent preferences, our mechanism provides /every agent/ with a value close to her proportionally fair value. We complement these results by also studying the limits of truthful money-free mechanisms, and by providing other mechanisms for special classes of problem instances. Finally, we uncover interesting connections between our mechanism and the Vickrey-Clarke-Groves mechanism from the literature on mechanism design with money.

The second part of this work concerns the design of money-free resource allocation mechanisms for /decentralized/ multiagent systems. As the world has become increasingly interconnected, such systems are using more and more resources that are geographically dispersed; in order to provide scalability in these systems, the mechanisms need to be decentralized. That is, the allocation decisions for any given resource should not assume global information regarding the system's resources or participants. We approach this restriction by using /coordination mechanisms/: a collection of simple resource allocation policies, each of which controls only one of the resources and uses only local information regarding the state of the system. The system's participants, facing these policies, have the option of choosing which resources they will access. We study a variety of coordination mechanisms and we prove that the social welfare of any equilibrium of the games that these mechanisms induce is a good approximation of the optimal welfare. Once again, we complement our positive results by studying the limits of coordination mechanisms. We also provide a detailed explanation of the seemingly counter-intuitive incentives that some of these mechanisms yield. Finally, we use this understanding in order to design a combinatorial constant-factor approximation algorithm for maximizing the social welfare, thus providing evidence that a game-theoretic mindset can lead to novel optimization algorithms.
Ph.D. Thesis 2013 Locality Optimization for Data Parallel Programs Hielscher, Eric Abstract | PDF

Title: Locality Optimization for Data Parallel Programs

Candidate: Hielscher, Eric

Advisor(s): Shasha, Dennis

Abstract:

Productivity languages such as NumPy and Matlab make it much easier to implement data-intensive numerical algorithms than it is to implement them in efficiency languages such as C++. This is important as many programmers (1) aren't expert programmers; or (2) don't have time to tune their software for performance, as their main job focus is not programming per se. The tradeoff is typically one of execution time versus programming time: unless there are specialized library functions or precompiled primitives for the particular task, a productivity language is likely to be orders of magnitude slower than an efficiency language.

In this thesis, we present Parakeet, an array-oriented language embedded within Python, a widely-used productivity language. The Parakeet just-in-time compiler dynamically translates whole user functions to high performance multi-threaded native code. This thesis focuses in particular on our use of data parallel operators as a basis for locality enhancing program optimizations. We transform Parakeet programs written with the classic data parallel operators (Map, Reduce, and Scan; in Parakeet these are called adverbs) to process small local pieces (called tiles) of data at a time. To express this locality we introduce three new adverbs: TiledMap, TiledReduce, and TiledScan. These tiled adverbs are not exposed to the programmer but rather are automatically generated by a tiling transformation.
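
To make the tiling idea concrete, the sketch below rewrites a map and a reduce so that they walk the input one fixed-size tile at a time. It is plain Python, not Parakeet: the functions tiles, tiled_map, and tiled_reduce are hypothetical stand-ins for what the abstract says the compiler generates automatically, and the tile size is passed by hand rather than chosen by the compiler or an autotuner.

```python
from functools import reduce
from operator import add
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def tiles(data: Sequence[T], tile_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `data` with at most `tile_size` elements."""
    for start in range(0, len(data), tile_size):
        yield data[start:start + tile_size]


def tiled_map(fn: Callable[[T], U], data: Sequence[T], tile_size: int) -> List[U]:
    """Same result as [fn(x) for x in data], computed tile by tile."""
    out: List[U] = []
    for tile in tiles(data, tile_size):
        out.extend(fn(x) for x in tile)
    return out


def tiled_reduce(
    combine: Callable[[T, T], T], data: Sequence[T], init: T, tile_size: int
) -> T:
    """Reduce each tile locally, then combine the per-tile partial results.

    Assumes `combine` is associative and `init` is its identity element.
    """
    partials = [reduce(combine, tile, init) for tile in tiles(data, tile_size)]
    return reduce(combine, partials, init)


values = list(range(10))
squares = tiled_map(lambda x: x * x, values, tile_size=4)
total = tiled_reduce(add, values, 0, tile_size=3)
assert squares == [x * x for x in values]
assert total == sum(values)
```

The results match the untiled operators; what changes is the order of memory accesses, which is where the locality benefit would come from in compiled code.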

We use this tiling algorithm to bring two classic locality optimizations to a data parallel setting: cache tiling, and register tiling. We set register tile sizes statically at compile time, but use an online autotuning search to find good cache tile sizes at runtime. We evaluate Parakeet and these optimizations on various benchmark programs, and exhibit excellent performance even compared to typical C implementations.
Ph.D. Thesis 2013 Piecewise Smooth Surfaces with Features Kovacs, Denis Abstract | PDF

Title: Piecewise Smooth Surfaces with Features

Candidate: Kovacs, Denis

Advisor(s): Zorin, Denis

Abstract:

The creation, manipulation and display of piecewise smooth surfaces has been a fundamental topic in computer graphics since its inception. The applications range from highest-quality surfaces for manufacturing in CAD, to believable animations of virtual creatures in Special Effects, to virtual worlds rendered in real-time in computer games.

Our focus is on improving the a) mathematical representation and b) automatic construction of such surfaces from finely sampled meshes in the presence of features. Features can be areas of higher geometric detail in an otherwise smooth area of the mesh, or sharp creases that contrast with the overall smooth appearance of an object.

In the first part, we build on techniques that define piecewise smooth surfaces, to improve their quality in the presence of features. We present a crease technique suitable for real-time applications that helps increase the perceived visual detail of objects that are required to be very compactly represented and efficiently evaluated.

We then introduce a new subdivision scheme that allows the use of T-junctions for better local refinement. It thus reduces the need for extraordinary vertices, which can cause surface artifacts especially on animated objects.

In the second part, we consider the problem of how to build the control meshes of piecewise smooth surfaces, in a way that the resulting surface closely approximates an existing data set (such as a 3D range scan), particularly in the presence of features. To this end, we introduce a simple modification that can be applied to a wide range of parameterization techniques to obtain an anisotropic parameterization. We show that a resulting quadrangulation can indeed better approximate the original surface. Finally, we present a quadrangulation scheme that turns a data set into a quad mesh with T-junctions, which we then use as a T-Spline control mesh to obtain a smooth surface.
Ph.D. Thesis 2013 Low-level Image Priors and Laplacian Preconditioners for Applications in Computer Graphics and Computational Photography Krishnan, Dilip Abstract | PDF

Title: Low-level Image Priors and Laplacian Preconditioners for Applications in Computer Graphics and Computational Photography

Candidate: Krishnan, Dilip

Advisor(s): Fergus, Rob

Abstract:

In the first part of this thesis, we develop novel image priors and efficient algorithms for image denoising and deconvolution applications. Our priors and algorithms enable fast, high-quality restoration of images corrupted by noise or blur. In the second part, we develop effective preconditioners for Laplacian matrices. Such matrices arise in a number of computer graphics and computational photography problems such as image colorization, tone mapping and geodesic distance computation on 3D meshes.

The first prior we develop is a spectral prior which models correlations between different spectral bands. We introduce a prototype camera and flash system, used in conjunction with the spectral prior, to enable taking photographs at very low light levels. Our second prior is a sparsity-based measure for blind image deconvolution. This prior gives lower costs to sharp images than blurred ones, enabling the use of simple and efficient Maximum a-Posteriori algorithms.

We develop a new algorithm for the non-blind deconvolution problem. This enables extremely fast deconvolution of images blurred by a known blur kernel. Our algorithm uses Fast Fourier Transforms and Lookup Tables to achieve real-time deconvolution performance with nonconvex gradient-based priors. Finally, for certain image restoration problems with no clear formation model, we demonstrate how learning a direct mapping between original/corrupted patch pairs enables effective restoration.

We develop multi-level preconditioners to solve discrete Poisson equations. Existing multilevel preconditioners have two major drawbacks: excessive bandwidth growth at coarse levels; and the inability to adapt to problems with highly varying coefficients. Our approach tackles both these problems by introducing sparsification and compensation steps at each level. We interleave the selection of fine and coarse-level variables with the removal of weak connections between potential fine-level variables (sparsification) and compensate for these changes by strengthening nearby connections. By applying these operations before each elimination step and repeating the procedure recursively on the resulting smaller systems, we obtain highly efficient schemes. The construction is linear in time and memory. Numerical experiments demonstrate that our new schemes outperform state of the art methods, both in terms of operation count and wall-clock time, over a range of 2D and 3D problems.
Ph.D. Thesis 2013 Relation Extraction with Weak Supervision and Distributional Semantics Min, Bonan Abstract | PDF

Title: Relation Extraction with Weak Supervision and Distributional Semantics

Candidate: Min, Bonan

Advisor(s): Grishman, Ralph

Abstract:

Relation Extraction aims at detecting and categorizing semantic relations between pairs of entities in unstructured text. It benefits an enormous number of applications such as Web search and Question Answering. Traditional approaches for relation extraction either rely on learning from a large number of accurate human-labeled examples or pattern matching with hand-crafted rules. These resources are very laborious to obtain and can only be applied to a narrow set of target types of interest.

This thesis focuses on learning relations with little or no human supervision. First, we examine the approach that treats relation extraction as a supervised learning problem. We develop an algorithm that is able to train a model with approximately 1/3 of the human-annotation cost and that matches the performance of models trained with high-quality annotation. Second, we investigate distant supervision, a weakly supervised algorithm that automatically generates its own labeled training data. We develop a latent Bayesian framework for this purpose. By using a model which provides a better approximation of the weak source of supervision, it outperforms the state-of-the-art methods. Finally, we investigate the possibility of building all relational tables beforehand with an unsupervised relation extraction algorithm. We develop an effective yet efficient algorithm that combines the power of various semantic resources that are automatically mined from a corpus based on distributional semantics. The algorithm is able to extract a very large set of relations from the web at high precision.
Ph.D. Thesis 2013 Usable Security Mechanisms in the Developing World Paik, Michael Abstract | PDF

Title: Usable Security Mechanisms in the Developing World

Candidate: Paik, Michael

Advisor(s): Subramanian, Lakshminarayanan

Abstract:

Security and privacy are increasingly important in our interconnected world. Cybercrimes, including identity theft, phishing, and other attacks, are on the rise, and computer-assisted crimes such as theft and stalking are becoming commonplace.

Contemporary with this trend is the uptake of technology in the developing world, proceeding at a pace often outstripping that of the developed world. Penetration of mobile phones and services such as healthcare delivery, mobile money, and social networking is higher than that of even amenities like electricity. Connectivity is empowering disenfranchised people, providing information and services to the heretofore disconnected poor.

There are efforts to use technology to enhance physical security and well-being in the developing world, including citizen journalism, education, improving drug security, attendance tracking, etc.

However, there are significant challenges to security both in the digital and the physical domains that are particular to these contexts. Infrastructure is constrained, literacy, numeracy, and familiarity with basic technologies cannot be assumed, and environments are harsh on hardware. These circumstances often prevent security best practices from being transplanted directly to these regions; in many ways, the adoption of technology has overtaken the users' ability to use it safely, and their trust in it is oftentimes greater than it should be.

This dissertation describes several systems and methodologies designed to operate in the developing world, using technologies and metaphors that are familiar to users and that are robust against the operating environments.

It begins with an overview of the state of affairs, and several threat models. It continues with a description of Signet, a method to use SIM cards as trusted computing hardware to provide secure signed receipts. Next, Epothecary describes a low-infrastructure system for tracking pharmaceuticals that also significantly and asymmetrically increases costs for counterfeiters. The balance consists of a description of a low-cost Biometric Terminal currently in use by NGOs in India performing DOTS-based tuberculosis treatment, Blacknoise, an investigation into the use of low-cost cameraphones with noisy imaging sensors for image-based steganography, and finally Innoculous, a low-cost, crowdsourcing system for combating the spread of computer viruses, particularly among non-networked computers, while also collecting valuable "epidemiological" data.
Ph.D. Thesis 2013 Inapproximability Reductions and Integrality Gaps Popat, Preyas Abstract | PDF

Title: Inapproximability Reductions and Integrality Gaps

Candidate: Popat, Preyas

Advisor(s): Khot, Subhash

Abstract:

In this thesis we prove intractability results for several well studied problems in combinatorial optimization.

Closest Vector Problem with Preprocessing (CVPP): We show that the preprocessing version of the well known Closest Vector Problem is hard to approximate to an almost polynomial factor unless NP is in quasi polynomial time. The approximability of CVPP is closely related to the security of lattice based cryptosystems.

Pricing Loss Leaders: We show hardness of approximation results for the problem of maximizing profit from buyers with single-minded valuations where each buyer is interested in bundles of at most k items, and the items are allowed to have negative prices ("Loss Leaders"). For k = 2, we show that assuming the Unique Games Conjecture, it is hard to approximate the profit to any constant factor. For k > 2, we show the same result assuming P != NP.

Integrality gaps: We show SemiDefinite Programming (SDP) integrality gaps for Unique Games and 2 to 1 Games. Inapproximability results for these problems imply inapproximability results for many fundamental optimization problems. For the first problem, we show "approximate" integrality gaps for super constant rounds of the powerful Lasserre hierarchy. For the second problem we show integrality gaps for the basic SDP relaxation with perfect completeness.
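To make the Pricing Loss Leaders objective concrete, the following is a minimal sketch of the profit computation on a toy instance. It assumes each single-minded buyer purchases their whole bundle exactly when its total price is within their budget, and that profit is simply the sum of prices paid (no production costs). The function name `profit`, the item names, and the numbers are illustrative and do not come from the thesis.

```python
from typing import Dict, FrozenSet, List, Tuple

# A buyer is (bundle, budget). Assumption: the buyer purchases the entire
# bundle if and only if the sum of its item prices is at most the budget.
Buyer = Tuple[FrozenSet[str], float]


def profit(prices: Dict[str, float], buyers: List[Buyer]) -> float:
    """Total revenue collected from all buyers who can afford their bundle."""
    total = 0.0
    for bundle, budget in buyers:
        cost = sum(prices[item] for item in bundle)
        if cost <= budget:
            total += cost
    return total


# Toy instance with bundles of at most k = 2 items.
buyers = [
    (frozenset({"printer", "ink"}), 10.0),
    (frozenset({"ink"}), 12.0),
    (frozenset({"ink"}), 12.0),
]

# A pricing restricted to non-negative prices.
nonnegative = {"printer": 0.0, "ink": 10.0}
# A pricing that sells the printer below zero (a "loss leader").
loss_leader = {"printer": -2.0, "ink": 12.0}

profit_nonnegative = profit(nonnegative, buyers)
profit_loss_leader = profit(loss_leader, buyers)
```

In this instance the negative printer price lets the ink price rise without losing the bundle buyer, so the second pricing collects more than the first.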
Ph.D. Thesis 2013 Natural Interaction with a Virtual World Rosenberg, Ilya Abstract | PDF

Title: Natural Interaction with a Virtual World

Candidate: Rosenberg, Ilya

Advisor(s): Perlin, Ken

Abstract:

A large portion of computer graphics and human/computer interaction is concerned with the creation, manipulation and use of two and three dimensional objects existing in a virtual world. By creating more natural physical interfaces and virtual worlds which behave in physically plausible ways, it is possible to empower nonexpert users to create, work and play in virtual environments. This thesis is concerned with the design, creation, and optimization of user-input devices which break down the barriers between the real and the virtual as well as the development of software algorithms which allow for the creation of physically realistic virtual worlds.
Ph.D. Thesis 2013 Security Mechanisms for Physical Authentication Sharma, Ashlesh Abstract | PDF

Title: Security Mechanisms for Physical Authentication

Candidate: Sharma, Ashlesh

Advisor(s): Subramanian, Lakshminarayanan

Abstract:

Counterfeiting of goods is a worldwide problem where the losses are in billions of dollars. It is estimated that 10% of all the world trade is counterfeit. To alleviate counterfeiting, a number of techniques are used from barcodes to holograms. But these technologies are easily reproducible and hence they are ineffective against counterfeiters.

In this thesis, we introduce PaperSpeckle, a novel way to fingerprint any piece of paper based on its unique microscopic properties. Next, we extend and generalize this work to introduce TextureSpeckle, a novel way to fingerprint and characterize the uniqueness of the surface of a material based on the interaction of light with the natural randomness present in the rough structure at the microscopic level of the surface. We show the existence and uniqueness of these fingerprints by analyzing a large number of surfaces (over 20,000 microscopic surfaces and 200 million pairwise comparisons) of different materials. We also define the entropy of the fingerprints and show how each surface can be uniquely identified in a robust manner even in case of damage.

From a theoretical perspective, we consider a discrete approximation model from light scattering theory which allows us to compute the speckle pattern for a given surface. Under this computational model, we show that given a speckle pattern, it is computationally hard to reconstruct the physical surface characteristics by simulating the multiple scattering of light. Using TextureSpeckle as a security primitive, we design secure protocols to enable a variety of scenarios such as: i) supply chain security, where applications range from drug tracking to inventory management, ii) mobile based secure transfer of money (mobile money), where any paper can be changed to an on-demand currency, and iii) fingerprint ecosystem, a cloud based system, where any physical object can be identified and authenticated on-demand.

We discuss the construction of the prototype device ranging from optical lens design to usability aspects and show how our technique can be applied in the real world to alleviate counterfeiting and forgery. In addition, we introduce Pattern Matching Puzzles (PMPs), a usable security mechanism that provides a 'human computable' one-time-MAC (message authentication code) for every transaction, making each transaction information-theoretically secure against various adversarial attacks. The puzzles are easy to solve even for semi-literate users with simple pattern recognition skills.
Ph.D. Thesis 2013 Augmenting Information Flow for Visual Privacy Spiro, Ian Abstract | PDF

Title: Augmenting Information Flow for Visual Privacy

Candidate: Spiro, Ian

Advisor(s): Bregler, Christopher

Abstract:

In the Information Age, visual media take on powerful new forms. Photographs once printed on paper and stored in physical albums now exist as digital files. With the rise of social media, photo data has moved to the cloud for rapid dissemination. The upside can be measured in terms of increased efficiency, greater reach, or reduced printing costs. But there is a downside that is harder to quantify: the risk of private photos or videos leaking inappropriately. Human imagery is potentially sensitive, revealing private details of a person's body, lifestyle, activities, and more. Images create visceral responses and have the potential to permanently damage a person's reputation.

We employed the theory of contextual integrity to explore privacy aspects of transmitting the human form. In response to privacy threats from new sociotechnical systems, we developed practical solutions that have the potential to restore balance. The main work is a set of client-side, technical interventions that can be used to alter information flows and provide features to support visual privacy. In the first approach, we use crowdsourcing to extract specific, useful human signal from video to decouple it from bundled identity information. The second approach is an attempt to achieve similar ends with pure software. Instead of using information workers, we developed a series of filters that alter video to hide identity information while still revealing motion signal. The final approach is an attempt to control the recipients of photos by encoding them in the visual channel. The software completely protects data from third-parties who lack proper credentials and maintains data integrity by exploiting the visual coherence of uploaded images, even in the face of JPEG compression. The software offers end-to-end encryption that is compatible with existing social media applications.
Ph.D. Thesis 2013 Toward a computational solution to the inverse problem of how hypoxia arises in metabolically heterogeneous cancer cell populations Sundstrom, Andrew Abstract | PDF

Title: Toward a computational solution to the inverse problem of how hypoxia arises in metabolically heterogeneous cancer cell populations

Candidate: Sundstrom, Andrew

Advisor(s): Mishra, Bud; Bar-Sagi, Dafna

Abstract:

As a tumor grows, it rapidly outstrips its blood supply, leaving portions of tumor that undergo hypoxia. Hypoxia is strongly correlated with poor prognosis as it renders tumors less responsive to chemotherapy and radiotherapy. During hypoxia, HIFs upregulate production of glycolysis enzymes and VEGF, thereby promoting metabolic heterogeneity and angiogenesis, and proving to be directly instrumental in tumor progression. Prolonged hypoxia leads to necrosis, which in turn activates inflammatory responses that produce cytokines that stimulate tumor growth. Hypoxic tumor cells interact with macrophages and fibroblasts, both involved with inflammatory processes tied to tumor progression. So it is of clinical and theoretical significance to understand: Under what conditions does hypoxia arise in a heterogeneous cell population? Our aim is to transform this biological origins problem into a computational inverse problem, and then attack it using approaches from computer science. First, we develop a minimal, stochastic, spatiotemporal simulation of large heterogeneous cell populations interacting in three dimensions. The simulation can manifest stable localized regions of hypoxia. Second, we employ and develop a variety of algorithms to analyze histological images of hypoxia in xenografted colorectal tumors, and extract features to construct a spatiotemporal logical characterization of hypoxia. We also consider characterizing hypoxia by a linear regression functional learning mechanism that yields a similarity score. Third, we employ a Bayesian statistical model checking algorithm that can determine, over some bounded number of simulation executions, whether hypoxia is likely to emerge under some fixed set of simulation parameters, and some fixed logical or functional description of hypoxia.
Driving the model checking process is one of three adaptive Monte Carlo sampling algorithms we developed to explore the high dimensional space of simulation initial conditions and operational parameters. Taken together, these three system components formulate a novel approach to the inverse problem above, and constitute a design for a tool that can be placed into the hands of experimentalists, for testing hypotheses based upon known parameter values or ones the tool might discover. In principle, this design can be generalized to other biological phenomena involving large heterogeneous populations of interacting cells.
Ph.D. Thesis 2013 Rethinking Information Privacy for the Web Tierney, Matthew Abstract | PDF

Title: Rethinking Information Privacy for the Web

Candidate: Tierney, Matthew

Advisor(s): Subramanian, Lakshminarayanan

Abstract:

In response to Supreme Court Justice Samuel Alito's opinion that society should accept a decline in personal privacy with modern technology, Hanni M. Fakhoury, staff attorney with the Electronic Frontier Foundation, argued, "Technology doesn't involve an 'inevitable' tradeoff [of increased convenience] with privacy. The only inevitability must be the demand that privacy be a value built into our technology" [42]. Our position resonates with Mr. Fakhoury's. In this thesis, we present three artifacts that address the balance between usability, efficiency, and privacy as we rethink information privacy for the web.

In the first part of this thesis, we present the design, implementation and evaluation of Cryptagram, a system designed to enhance online photo privacy. Cryptagram enables users to convert photos into encrypted images, which the users upload to Online Social Networks (OSNs). Users directly manage access control to those photos via shared keys that are independent of OSNs or other third parties. OSNs apply standard image transformations (JPEG compression) to all uploaded images so Cryptagram provides image encoding and encryption protocols that are tolerant to these transformations. Cryptagram guarantees that the recipient with the right credentials can completely retrieve the original image from the transformed version of the uploaded encrypted image while the OSN cannot infer the original image. Cryptagram's browser extension integrates seamlessly with preexisting OSNs, including Facebook and Google+, and currently has over 400 active users.

In the second part of this thesis, we present the design and implementation of Lockbox, a system designed to provide end-to-end private file-sharing with the convenience of Google Drive or Dropbox. Lockbox uniquely combines two important design points: (1) a federated system for detecting and recovering from server equivocation and (2) a hybrid cryptosystem over delta encoded data to balance storage and bandwidth costs with efficiency for syncing end-user data. To facilitate appropriate use of public keys in the hybrid cryptosystem, we integrate a service that we call KeyNet, which is a web service designed to leverage existing authentication media (e.g., OAuth, verified email addresses) to improve the usability of public key cryptography.

In the third part of this thesis, we present the design of Compass, which realizes the philosophical privacy framework of contextual integrity (CI) as a full OSN design. We believe CI better captures users' privacy expectations in OSNs. In Compass, three properties hold: (a) users are associated with roles in specific contexts; (b) every piece of information posted by a user is associated with a specific context; (c) norms defined on roles and attributes of posts in a context govern how information is shared across users within that context. Given the definition of a context and its corresponding norm set, we describe the design of a compiler that converts the human-readable norm definitions to generate appropriate information flow verification logic including: (a) a compact binary decision diagram for the norm set; and (b) access control code that evaluates how a new post to a context will flow. We have implemented a prototype that shows how the philosophical framework of contextual integrity can be realized in practice to achieve strong privacy guarantees with limited additional verification overhead.
Ph.D. Thesis 2012 Learning Hierarchical Feature Extractors For Image Recognition Boureau, Y-Lan Abstract | PDF

Title: Learning Hierarchical Feature Extractors For Image Recognition

Candidate: Boureau, Y-Lan

Advisor(s): LeCun, Yann

Abstract:

Telling cow from sheep is effortless for most animals, but requires much engineering for computers. In this thesis, we seek to tease out basic principles that underlie many recent advances in image recognition. First, we recast many methods into a common unsupervised feature extraction framework based on an alternation of coding steps, which encode the input by comparing it with a collection of reference patterns, and pooling steps, which compute an aggregation statistic summarizing the codes within some region of interest of the image.

Within that framework, we conduct extensive comparative evaluations of many coding or pooling operators proposed in the literature. Our results demonstrate a robust superiority of sparse coding (which decomposes an input as a linear combination of a few visual words) and max pooling (which summarizes a set of inputs by their maximum value). We also propose macrofeatures, which import into the popular spatial pyramid framework the joint encoding of nearby features commonly practiced in neural networks, and obtain significantly improved image recognition performance. Next, we analyze the statistical properties of max pooling that underlie its better performance, through a simple theoretical model of feature activation. We then present results of experiments that confirm many predictions of the model. Beyond the pooling operator itself, an important parameter is the set of pools over which the summary statistic is computed. We propose locality in feature configuration space as a natural criterion for devising better pools. Finally, we propose ways to make coding faster and more powerful through fast convolutional feedforward architectures, and examine how to incorporate supervision into feature extraction schemes. Overall, our experiments offer insights into what makes current systems work so well, and state-of-the-art results on several image recognition benchmarks.
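The alternation of a coding step and a pooling step described above can be illustrated with a small sketch. Note the substitution: for brevity it uses hard assignment to the nearest reference pattern as the coding operator, not the sparse coding the thesis favors, and the codebook and input descriptors are made-up values. All function names are hypothetical.

```python
from typing import List, Sequence


def hard_code(x: Sequence[float], codebook: Sequence[Sequence[float]]) -> List[float]:
    """Coding step: one-hot code marking the nearest reference pattern."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, w)) for w in codebook]
    nearest = dists.index(min(dists))
    return [1.0 if k == nearest else 0.0 for k in range(len(codebook))]


def max_pool(codes: Sequence[Sequence[float]]) -> List[float]:
    """Pooling step: per-codeword maximum over a region of interest."""
    return [max(column) for column in zip(*codes)]


def avg_pool(codes: Sequence[Sequence[float]]) -> List[float]:
    """Pooling step: per-codeword average over a region of interest."""
    return [sum(column) / len(column) for column in zip(*codes)]


# Three reference patterns and four local descriptors from one image region.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
region = [[0.1, 0.0], [0.9, 0.1], [0.0, 0.2], [0.1, 0.1]]

codes = [hard_code(x, codebook) for x in region]
pooled_max = max_pool(codes)   # records whether each pattern occurred at all
pooled_avg = avg_pool(codes)   # records how often each pattern occurred
```

Here max pooling reports only the presence of each pattern in the region, while average pooling reports its frequency, which is the kind of difference the comparative evaluation examines.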
Ph.D. Thesis 2012 On populations, haplotypes and genome sequencing Franquin, Pierre Abstract | PDF

Title: On populations, haplotypes and genome sequencing

Candidate: Franquin, Pierre

Advisor(s): Mishra, Bud

Abstract:

Population genetics has seen a renewed interest since the completion of the human genome project. With the availability of rapidly growing volumes of genomic data, the scientific and medical communities have been optimistic that a better understanding of human diseases, as well as their treatment, was imminent. Many population genomic models and association studies have been designed (or redesigned) to address these problems. For instance, the genome-wide association studies (GWAS) had raised hopes for finding disease markers, personalized medicine and rational drug design. Yet, as of today, they have not yielded results that live up to their promise and have only led to frustrating disappointment.

Intrigued, but not deterred, by these challenges, this dissertation visits the different aspects of these problems. In the first part, we will review the different models and theories of population genetics that are now challenged. We will propose our own implementation of a model to test different hypotheses. This effort will hopefully help us in understanding whether our expectations were unreasonably high or if we had ignored a crucial piece of information. When discussing association studies, we must not forget that we rely on data that are produced by the sequencing technologies available so far. We have to ensure that the quality of this data is reasonably good for GWAS. Unfortunately, as we will see in the second part, despite the existence of a diverse set of sequencing technologies, none of them can produce haplotypes with phasing, which appears to be the most important type of sequence data needed for association studies. To address this challenge, we propose a novel approach for a sequencing technology, called SMASH, that allows us to create the quality and type of haplotypic genome sequences necessary for efficient population genetics.
Ph.D. Thesis 2012 Optimizing Machine Translation by Learning to Search Galron, Daniel Abstract | PDF

Title: Optimizing Machine Translation by Learning to Search

Candidate: Galron, Daniel

Advisor(s): Melamed, Dan

Abstract:

We present a novel approach to training discriminative tree-structured machine translation systems by learning to search. We describe three primary innovations in this work: a new parsing coordinator architecture and algorithms to synthesize the required training examples for the learning algorithm; a new semiring that provides an unbiased way to compare translations; and a new training objective that measures whether a translation inference improves the quality of a translation. We also apply the reinforcement learning concept of exploration to SMT. Finally, we empirically evaluate the effects of our innovations on the quality of translations output by our system.
Ph.D. Thesis 2012 Flexible-Cost SLAM Grimes, Matthew Abstract | PDF

Title: Flexible-Cost SLAM

Candidate: Grimes, Matthew

Advisor(s): LeCun, Yann

Abstract:

The ability of a robot to track its position and its surroundings is critical in mobile robotics applications, such as autonomous transport, farming, search-and-rescue, and planetary exploration.

As a foundational building block to such tasks, localization must remain reliable and unobtrusive. For example, it must not provide an unneeded level of precision, when the cost of doing so displaces higher-level tasks from a busy CPU. Nor should it produce noisy estimates on the cheap, when there are CPU cycles to spare.

This thesis explores localization solutions that provide exactly the amount of accuracy needed for a given task. We begin with a real-world system used in the DARPA Learning Applied to Ground Robotics (LAGR) competition. Using a novel hybrid of wheel and visual odometry, we cut the cost of visual odometry from 100% of a CPU to 5%, clearing room for other critical visual processes, such as long-range terrain classification. We present our hybrid odometer in chapter 2.

Next, we describe a novel SLAM algorithm that provides a means to choose the desired balance between cost and accuracy. At its fastest setting, our algorithm converges faster than previous stochastic SLAM solvers, while maintaining significantly better accuracy. At its most accurate, it provides the same solution as exact SLAM solvers. Its main feature, however, is the ability to flexibly choose any point between these two extremes of speed and precision, as circumstances demand. As a result, we are able to guarantee real-time performance at each timestep on city-scale maps with large loops. We present this solver in chapter 3, along with results from both commonly available datasets and Google Street View data.

Taken as a whole, this thesis recognizes that precision and efficiency can be competing values, whose proper balance depends on the application and its fluctuating circumstances. It demonstrates how a localizer can and should fit its cost to the task at hand, rather than the other way around. In enabling this flexibility, we demonstrate a new direction for SLAM research, as well as provide a new convenience for end-users, who may wish to map the world without stopping it.
Ph.D. Thesis 2012 SMT Beyond DPLL(T): A New Approach to Theory Solvers and Theory Combination Jovanovic, Dejan Abstract | PDF

Title: SMT Beyond DPLL(T): A New Approach to Theory Solvers and Theory Combination

Candidate: Jovanovic, Dejan

Advisor(s): Barrett, Clark

Abstract:

Satisfiability modulo theories (SMT) is the problem of deciding whether a given logical formula can be satisfied with respect to a combination of background theories. The past few decades have seen many significant developments in the field, including fast Boolean satisfiability solvers (SAT), efficient decision procedures for a growing number of expressive theories, and frameworks for modular combination of decision procedures. All these improvements, with the addition of robust SMT solver implementations, culminated in the acceptance of SMT as a standard tool in the fields of automated reasoning and computer aided verification. In this thesis we develop new decision procedures for the theory of linear integer arithmetic and the theory of non-linear real arithmetic, and develop a new general framework for combination of decision procedures. The new decision procedures integrate theory specific reasoning and the Boolean search to provide more powerful and efficient procedures, and allow a more expressive language for explaining problematic states. The new framework for combination of decision procedures overcomes the complexity limitations and restrictions on the theories imposed by the standard Nelson-Oppen approach.
Ph.D. Thesis 2012 An Adaptive Fast Multipole Method-Based PDE Solver in Three Dimensions Langston, Matthew Harper Abstract | PDF

Title: An Adaptive Fast Multipole Method-Based PDE Solver in Three Dimensions

Candidate: Langston, Matthew Harper

Advisor(s): Zorin, Denis

Abstract:

Many problems in scientific computing require the accurate and fast solution to a variety of elliptic PDEs. These problems become increasingly difficult in three dimensions when forces become non-homogeneously distributed and geometries are complex.

We present an adaptive fast volume solver using a new version of the fast multipole method, incorporated with a pre-existing boundary integral formulation for the development of an adaptive embedded boundary solver.

For the fast volume solver portion of the algorithm, we present a kernel-independent, adaptive fast multipole method of arbitrary order accuracy for solving elliptic PDEs in three dimensions with radiation boundary conditions. The algorithm requires only a Green's function evaluation routine for the governing equation and a representation of the source distribution (the right-hand side) that can be evaluated at arbitrary points.

The performance of the method is accelerated in two ways. First, we construct a piecewise polynomial approximation of the right-hand side and compute far-field expansions in the FMM from the coefficients of this approximation. Second, we precompute tables of quadratures to handle the near-field interactions on adaptive octree data structures, keeping the total storage requirements in check through the exploitation of symmetries. We additionally show how we extend the free-space volume solver to solvers with periodic as well as Dirichlet boundary conditions.

For incorporation with the boundary integral solver, we develop interpolation methods to maintain the accuracy of the volume solver. These methods use the existing FMM-based octree structure to locate appropriate interpolation points, building polynomial approximations to this larger set of forces and evaluating these polynomials to the locally under-refined grid in the area of interest.

We present numerical examples for the Laplace, modified Helmholtz and Stokes equations for a variety of boundary conditions and geometries as well as studies of the interpolation procedures and stability of far-field and polynomial constructions.
Ph.D. Thesis 2012 Acquiring information from wider scope to improve event extraction Liao, Shasha Abstract | PDF

Title: Acquiring information from wider scope to improve event extraction

Candidate: Liao, Shasha

Advisor(s): Grishman, Ralph

Abstract:

Event extraction is a particularly challenging type of information extraction (IE). Most current event extraction systems rely on local information at the phrase or sentence level. However, this local context may be insufficient to resolve ambiguities in identifying particular types of events; information from a wider scope can serve to resolve some of these ambiguities.

In this thesis, we first investigate how to extract supervised and unsupervised features to improve a supervised baseline system. Then, we present two additional tasks to show the benefit of wider scope features in semi-supervised learning (self-training) and active learning (co-testing). Experiments show that using features from wider scope can not only aid a supervised local event extraction baseline system, but also help the semi-supervised or active learning approach.
Ph.D. Thesis 2012 Mobile Accessibility Tools for the Visually Impaired Paisios, Nektarios Abstract | PDF

Title: Mobile Accessibility Tools for the Visually Impaired

Candidate: Paisios, Nektarios

Advisor(s): Subramanian, Lakshminarayanan

Abstract:

Visually impaired users are in dire need of better accessibility tools. The past few years have witnessed an exponential growth in the computing capabilities and onboard sensing capabilities of mobile phones making them an ideal candidate for building next-generation applications. We believe that the mobile device can play a significant role in the future for aiding visually impaired users in day-to-day activities with simple and usable mobile accessibility tools. This thesis describes the design, implementation, evaluation and user-study based analysis of four different mobile accessibility applications.

Our first system is the design of a highly accurate and usable mobile navigational guide that uses Wi-Fi and accelerometer sensors to navigate unfamiliar environments. A visually impaired user can use the system to construct a virtual topological map across points of interest within a building based on correlating the user's walking patterns (with turn signals) with the Wi-Fi and accelerometer readings. The user can subsequently use the map to navigate previously traveled routes. Our second system, Mobile Brailler, presents several prototype methods of text entry on a modern touch screen mobile phone that are based on the Braille alphabet and thus are convenient for visually impaired users. Our third system enables visually impaired users to leverage the camera of a mobile device to accurately recognize currency bills even if the images are partially or highly distorted. The final system enables visually impaired users to determine whether a pair of clothes, in this case a tie and a shirt, can be worn together or not, based on the current social norms of color-matching.

We believe that these applications together provide a suite of important mobile accessibility tools to enhance four critical aspects of a day-to-day routine of a visually impaired user: to navigate easily, to type easily, to recognize currency bills (for payments) and to identify matching clothes.
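As a rough illustration of Braille-based text entry, the sketch below decodes a sequence of six-dot Braille cells into letters. It covers only the letters a through j of the standard Braille alphabet, with dots numbered 1 to 3 down the left column and 4 to 6 down the right. How Mobile Brailler actually captures the dots on a touch screen is not described in the abstract, so the input representation and the names `BRAILLE_LETTERS` and `decode` are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List

# Standard Braille cells for the letters a-j, keyed by the set of raised dots.
BRAILLE_LETTERS: Dict[FrozenSet[int], str] = {
    frozenset({1}): "a",
    frozenset({1, 2}): "b",
    frozenset({1, 4}): "c",
    frozenset({1, 4, 5}): "d",
    frozenset({1, 5}): "e",
    frozenset({1, 2, 4}): "f",
    frozenset({1, 2, 4, 5}): "g",
    frozenset({1, 2, 5}): "h",
    frozenset({2, 4}): "i",
    frozenset({2, 4, 5}): "j",
}


def decode(cells: Iterable[Iterable[int]]) -> str:
    """Translate a sequence of dot sets into text; unknown cells become '?'."""
    letters: List[str] = []
    for dots in cells:
        letters.append(BRAILLE_LETTERS.get(frozenset(dots), "?"))
    return "".join(letters)


# Hypothetical input: each inner list is the set of dots the user entered
# for one character, in whatever order they were touched.
typed = decode([[2, 1], [1], [5, 4, 1]])
```

Keying the table on a set of dots, not an ordered sequence, means the order in which a user touches the dots within one cell does not matter.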
Ph.D. Thesis 2012 Reusable Software Infrastructure for Stream Processing Soule, Robert Abstract | PDF

Title: Reusable Software Infrastructure for Stream Processing

Candidate: Soule, Robert

Advisor(s): Grimm, Robert

Abstract:

Developers increasingly use streaming languages to write their data processing applications. While a variety of streaming languages exist, each targeting a particular application domain, they are all similar in that they represent a program as a graph of streams (i.e. sequences of data items) and operators (i.e. data transformers). They are also similar in that they must process large volumes of data with high throughput. To meet this requirement, compilers of streaming languages must provide a variety of streaming-specific optimizations, including automatic parallelization. Traditionally, when many languages share a set of optimizations, language implementors translate the source languages into a common representation called an intermediate language (IL). Because optimizations can modify the IL directly, they can be re-used by all of the source languages, reducing the overall engineering effort. However, traditional ILs and their associated optimizations target single-machine, single-process programs. In contrast, the kinds of optimizations that compilers must perform in the streaming domain are quite different, and often involve reasoning across multiple machines. Consequently, existing ILs are not suited to streaming languages.

This thesis addresses the problem of how to provide a reusable infrastructure for stream processing languages. Central to the approach is the design of an intermediate language specifically for streaming languages and optimizations. The hypothesis is that an intermediate language designed to meet the requirements of stream processing can assure implementation correctness; reduce overall implementation effort; and serve as a common substrate for critical optimizations. In evidence, this thesis provides the following contributions: (1) a catalog of common streaming optimizations that helps define the requirements of a streaming IL; (2) a calculus that enables reasoning about the correctness of source language translation and streaming optimizations; and (3) an intermediate language that preserves the semantics of the calculus, while addressing the implementation issues omitted from the calculus. This work significantly reduces the effort it takes to develop stream processing languages, and jump-starts innovation in language and optimization design.
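The view of a streaming program as a graph of streams and operators can be sketched in a few lines. The example below is a deliberately simplified, single-process illustration using Python generators and a linear chain of operators; it is not the intermediate language or calculus of the thesis, and the helper names `map_op`, `filter_op`, and `run_pipeline` are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List

# An operator transforms one stream (an iterator of items) into another.
Operator = Callable[[Iterator[int]], Iterator[int]]


def map_op(f: Callable[[int], int]) -> Operator:
    """Operator that applies f to every data item."""
    def run(stream: Iterator[int]) -> Iterator[int]:
        for item in stream:
            yield f(item)
    return run


def filter_op(pred: Callable[[int], bool]) -> Operator:
    """Operator that forwards only the items satisfying pred."""
    def run(stream: Iterator[int]) -> Iterator[int]:
        for item in stream:
            if pred(item):
                yield item
    return run


def run_pipeline(source: Iterable[int], operators: List[Operator]) -> List[int]:
    """Connect the operators in sequence and drain the final stream."""
    stream = iter(source)
    for op in operators:
        stream = op(stream)
    return list(stream)


pipeline = [
    map_op(lambda x: x * 2),
    filter_op(lambda x: x % 3 == 0),
    map_op(lambda x: x + 1),
]
result = run_pipeline(range(1, 7), pipeline)
```

Because each operator only sees its input stream, a compiler is free to rearrange, merge, or replicate operators as long as the observable output stream is preserved, which is the kind of reasoning a shared intermediate language is meant to support.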
Ph.D. Thesis 2012 Building scalable geo-replicated storage backends for web applications Sovran, Yair Abstract | PDF

Title: Building scalable geo-replicated storage backends for web applications

Candidate: Sovran, Yair

Advisor(s): Li, Jinyang

Abstract:

Web applications increasingly require a storage system that is both scalable and can replicate data across many distant data centers or sites. Most existing storage solutions fall into one of two categories: Traditional databases offer strict consistency guarantees and programming ease, but are difficult to scale in a geo-replicated setting. NoSQL stores are scalable and efficient, but have weak consistency guarantees, placing the burden of ensuring consistency on programmers. In this dissertation, we describe two systems that help bridge the two extremes, providing scalable, geo-replicated storage for web applications while also being easy to program. Walter is a key-value store that supports transactions and replicates data across distant sites. A key feature underlying Walter is a new isolation property: Parallel Snapshot Isolation (PSI). PSI allows Walter to replicate data asynchronously, while providing strong guarantees within each site. PSI does not allow write-write conflicts, alleviating the burden of writing conflict resolution logic. To prevent write-write conflicts and implement PSI, Walter uses two new and simple techniques: preferred sites and counting sets. Lynx is a distributed database backend for scaling latency-sensitive web applications. Lynx supports optimizing queries via data denormalization, distributed secondary indexes, and materialized join views. To preserve data constraints across denormalized tables and secondary indexes, Lynx relies on a novel primitive: Distributed Transaction Chain (DTC). A DTC groups a sequence of transactions to be executed on different nodes while providing two guarantees. First, all transactions in a DTC execute exactly once despite failures. Second, transactions from concurrent DTCs are interleaved consistently on common nodes. We built several web applications on top of Walter and Lynx: an auction service, a microblogging service, and a social networking website.
We have found that building web applications using Walter and Lynx is quick and easy. Our experiments show that the resulting applications are capable of providing scalable, low latency operation across multiple geo-replicated sites.
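The abstract names counting sets without defining them. As a minimal sketch, assume a counting set is a set whose elements carry integer counts (possibly negative), so that adds and removes from different sites commute. The class and method names below are hypothetical and are not Walter's API.

```python
from collections import Counter


class CountingSet:
    """Toy counting set: each element maps to an integer count (may be negative)."""

    def __init__(self):
        self.counts = Counter()

    def add(self, item):
        # Adding an element increments its count.
        self.counts[item] += 1

    def remove(self, item):
        # Removing decrements, even below zero, so updates never conflict.
        self.counts[item] -= 1

    def merge(self, other):
        """Combine updates recorded at another site; summing counts is order-independent."""
        merged = CountingSet()
        merged.counts.update(self.counts)   # Counter.update adds counts
        merged.counts.update(other.counts)
        return merged

    def members(self):
        # Only elements with a positive count are considered present.
        return {item for item, n in self.counts.items() if n > 0}


site_a, site_b = CountingSet(), CountingSet()
site_a.add("x")
site_a.add("y")
site_b.remove("y")   # concurrent remove at another site
site_b.add("z")
```

Because merging only sums counts, `site_a.merge(site_b)` and `site_b.merge(site_a)` yield the same membership, which is the property that lets concurrent updates avoid write-write conflicts in this sketch.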
Ph.D. Thesis 2012 Rapid Training of Information Extraction with Local and Global Data Views Sun, Ang Abstract | PDF

Title: Rapid Training of Information Extraction with Local and Global Data Views

Candidate: Sun, Ang

Advisor(s): Grishman, Ralph

Abstract:

This dissertation focuses on fast system development for Information Extraction (IE). State-of-the-art systems heavily rely on extensively annotated corpora, which are slow to build for a new domain or task. Moreover, previous systems are mostly built with local evidence such as words in a short context window or features that are extracted at the sentence level. They usually generalize poorly on new domains.

This dissertation presents novel approaches for rapidly training an IE system for a new domain or task based on both local and global evidence. Specifically, we present three systems: a relation type extension system based on active learning, a relation type extension system based on semi-supervised learning, and a cross-domain bootstrapping system for domain adaptive named entity extraction.

The active learning procedure adopts features extracted at the sentence level as the local view and distributional similarities between relational phrases as the global view. It builds two classifiers based on these two views to find the most informative contention data points to request human labels so as to reduce annotation cost.
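The contention-based selection described above can be sketched as follows. This is an illustration only: the two classifiers are hypothetical threshold functions standing in for the local-view and global-view models, and the feature names are assumptions.

```python
def contention_points(unlabeled, local_clf, global_clf):
    """Return the examples on which the two view-specific classifiers disagree."""
    return [x for x in unlabeled if local_clf(x) != global_clf(x)]


def local_view(example):
    # Hypothetical stand-in for a classifier over sentence-level features.
    return example["sentence_score"] > 0.5


def global_view(example):
    # Hypothetical stand-in for a classifier over distributional similarity.
    return example["phrase_similarity"] > 0.5


pool = [
    {"id": 1, "sentence_score": 0.9, "phrase_similarity": 0.8},
    {"id": 2, "sentence_score": 0.7, "phrase_similarity": 0.2},
    {"id": 3, "sentence_score": 0.1, "phrase_similarity": 0.3},
]

# Only examples where the views disagree are sent to the human annotator.
to_label = contention_points(pool, local_view, global_view)
```

In this toy pool only example 2 is a contention point, so it is the only one for which a label would be requested.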

The semi-supervised system aims to learn a large set of accurate patterns for extracting relations between names from only a few seed patterns. It estimates the confidence of a name pair both locally and globally: locally by looking at the patterns that connect the pair in isolation; globally by incorporating the evidence from the clusters of patterns that connect the pair. The use of pattern clusters can prevent semantic drift and contribute to a natural stopping criterion for semi-supervised relation pattern discovery.

For adapting a named entity recognition system to a new domain, we propose a cross-domain bootstrapping algorithm, which iteratively learns a model for the new domain with labeled data from the original domain and unlabeled data from the new domain. We first use word clusters as global evidence to generalize features that are extracted from a local context window. We then select self-learned instances as additional training examples using multiple criteria, including some based on global evidence.
Ph.D. Thesis 2012 Combating Sybil attacks in cooperative systems Tran, Nguyen Abstract | PDF

Title: Combating Sybil attacks in cooperative systems

Candidate: Tran, Nguyen

Advisor(s): Li, Jinyang

Abstract:

Cooperative systems are ubiquitous nowadays. In a cooperative system, end users contribute resources to run the service instead of only receiving the service passively from the system. For example, users upload and comment on pictures and videos on Flickr and YouTube, and users submit and vote on news articles on Digg. As another example, users in BitTorrent contribute bandwidth and storage to help each other download content. As long as users behave as expected, these systems benefit immensely from user contribution. In fact, five of the ten most popular websites operate in this cooperative fashion (Facebook, YouTube, Blogger, Twitter, Wikipedia). BitTorrent dominates global Internet traffic.

A robust cooperative system cannot blindly trust that its users will truthfully participate in the system. Malicious users seek to exploit the systems for profit. Selfish users consume resources but avoid contributing them. For example, adversaries have manipulated the voting system of Digg to promote their articles of dubious quality. Selfish users in public BitTorrent communities leave the system to avoid uploading files to others, resulting in drastic performance degradation for these content distribution systems. The ultimate way to disrupt security and incentive mechanisms of cooperative systems is using Sybil attacks, in which the adversary creates many Sybil identities (fake identities) and uses them to disrupt the systems' normal operation. No security and incentive mechanism works correctly if the systems do not have robust identity management that can defend against Sybil attacks.

This thesis provides robust identity management schemes which are resilient to the Sybil attack, and uses them to secure and incentivize user contribution in several example cooperative systems. The main theme of this work is to leverage the social network among users in designing secure and incentive-compatible cooperative systems. First, we develop a distributed admission control protocol, called Gatekeeper, that leverages the social network to admit most honest user identities and only a few Sybil identities into the systems. Gatekeeper can be used as a robust identity management scheme for both centralized and decentralized cooperative systems. Second, we provide a vote aggregation system for content voting systems, called SumUp, that can prevent an adversary from casting many bogus votes for a piece of content using the Sybil attack. SumUp leverages unique properties of content voting systems to provide significantly better Sybil defense compared with applying a general admission control protocol such as Gatekeeper. Finally, we provide a robust reputation system, called Credo, that can be used to incentivize bandwidth contribution in peer-to-peer content distribution networks. Credo reputation can capture user contribution, and is resilient to both Sybil and collusion attacks.
Ph.D. Thesis 2012 Multi-species biclustering: An integrative method to identify functional gene conservation between multiple species Waltman, Peter Abstract | PDF

Title: Multi-species biclustering: An integrative method to identify functional gene conservation between multiple species

Candidate: Waltman, Peter

Advisor(s): Bonneau, Richard

Abstract:

Background: Several recent comparative functional genomics projects have indicated that the co-regulation of many genes is conserved across species, at least in part. This suggests that comparative analysis of functional genomics data-sets could prove powerful in identifying co-regulated groups that are conserved across multiple species.

Results: We present recent work to extend our cMonkey algorithm to simultaneously bicluster heterogeneous data from multiple species to identify conserved modules of orthologous genes, which can yield evolutionary insights into the formation of regulatory modules. We also present results from applying the multi-species analysis to two triplets of bacteria. The first of these is a triplet of Gram-positive bacteria consisting of Bacillus subtilis, Bacillus anthracis, and Listeria monocytogenes, while the second is a triplet of Gram-negative bacteria that includes Escherichia coli, Salmonella typhimurium and Vibrio cholerae. Finally, we present initial results from the multi-species biclustering analysis of human and mouse hematopoietic differentiation data.

Conclusion: Analysis of biclusters obtained revealed a surprising number of gene groups with conserved modularity and high biological significance as judged by several measures of cluster quality. We also highlight cases of interest from the Gram-positive triplet, including one that suggests a temporal difference in the expression of genes governing sporulation in the two Bacillus species. While analysis of the mouse and human hematopoietic differentiation is preliminary, it indicates the applicability of this analysis to eukaryotic systems, including comparison of cancer model systems. Finally, we suggest ways in which this analysis could be extended to identify divergent modules that may exist between normal and disease tissue.
Ph.D. Thesis 2011 Collusion Preserving Computation Alwen, Joel Abstract | PDF

Title: Collusion Preserving Computation

Candidate: Alwen, Joel

Advisor(s): Dodis, Yevgeniy

Abstract:

In collusion-free protocols, subliminal communication is impossible and parties are thus unable to communicate any information beyond what the protocol allows. Collusion-free protocols are interesting for several reasons, but have specifically attracted attention because they can be used to reduce trust in game-theoretic mechanisms. Collusion-free protocols are impossible to achieve (in general) when all parties are connected by point-to-point channels, but exist under certain physical assumptions (Lepinski et al., STOC 2005) or in specific network topologies (Alwen et al., Crypto 2008).

In addition to proposing the definition, we explore necessary properties of the underlying communication resource. Next we provide a general feasibility result for collusion-preserving computation of arbitrary functionalities. We show that the resulting protocols enjoy an elegant (and surprisingly strong) fallback security even in the case when the underlying communication resource acts in a Byzantine manner. Finally, we investigate the implications of these results in the context of mechanism design.
Ph.D. Thesis 2011 Re-architecting Web and Mobile Information Access for Emerging Regions Chen, Jay Abstract | PDF

Title: Re-architecting Web and Mobile Information Access for Emerging Regions

Candidate: Chen, Jay

Advisor(s): Subramanian, Lakshminarayanan

Abstract:

Providing access to information for people in emerging regions is an important problem. Over the past decade there have been many proposed and increasingly numerous deployed systems to enable information access, but successes are few and modest at best. Internet in emerging regions is still generally unusable or intolerably slow. Mobile phone applications are either not designed for the phones that poor people own, or they lack functionality, are difficult to use, or are expensive to operate. In this work we focus on enabling digital information access for people in emerging regions.

To advance the state of the art, we contribute numerous observations about how people access information in emerging regions, why the current models for web access and SMS platforms are broken, and techniques to enable applications over constrained Internet or SMS. The mechanisms presented here were designed after extensive field work in several different regions including rural, peri-urban, and urban areas in India, Kenya, Ghana, and Mexico. Multiple user studies were conducted throughout the course of system design and prototyping. We present a novel set of context-appropriate platforms and tools, some spanning several layers of the networking stack. Five complete systems were implemented and deployed in the field. First, Event Logger for Firefox (ELF) is an easily deployable Firefox extension which functions as both a web browsing analysis tool and an in-browser web optimization platform. Second, RuralCafe provides a platform for web search and browsing over extremely slow or intermittent networks. Third, Contextual Information Portals (CIP) provide cached repositories of web pages tailored to the particular context in which they are to be used. Fourth, UjU is a mobile application platform that simplifies the design of new SMS-based mobile applications. Finally, SMSFind is an SMS-based search service that runs on mobile phones without setup or subscription to a data plan.

Taken as a whole, the systems here are a comprehensive solution for addressing the problem of enabling digital information access in emerging regions.
Ph.D. Thesis 2011 Automatic Deduction for Theories of Algebraic Data Types Chikanian, Igor Abstract | PDF

Title: Automatic Deduction for Theories of Algebraic Data Types

Candidate: Chikanian, Igor

Advisor(s): Barrett, Clark

Abstract:

In this thesis we present formal logical systems, concerned with reasoning about algebraic data types.

The first formal system is based on the quantifier-free calculus (outermost universally quantified). This calculus is comprised of state change rules, and computations are performed by successive applications of these rules. Thereby, our calculus gives rise to an abstract decision procedure. This decision procedure determines if a given formula involving algebraic type members is valid. It is shown that this calculus is sound and complete. We also examine how this system performs practically and give experimental results. Our main contribution, as compared to previous work on this subject, is a new and more efficient decision procedure for checking satisfiability of the universal fragment within the theory of algebraic data types.

The second formal system, called Term Builder, is the deductive system based on higher order type theory, which subsumes second order and higher order logics. The main purpose of this calculus is to formulate and prove theorems about algebraic or other arbitrary user-defined types. Term Builder supports proof objects and is both an interactive theorem prover and a verifier. We describe the built-in deductive capabilities of Term Builder and show its consistency. The logic represented by our prover is intuitionistic. Naturally, it is also incomplete and undecidable, but its expressive power is much higher than that of the first formal system.

Among our achievements in building this theorem prover is an elegant and intuitive GUI for building proofs. Also, a new feature from the foundational viewpoint is that, in contrast with other approaches, we have a uniqueness-of-types property, which is not modulo beta-conversion.
Ph.D. Thesis 2011 Efficient Cryptographic Primitives for Non-Interactive Zero-Knowledge Proofs and Applications Haralambiev, Kristiyan Abstract | PDF

Title: Efficient Cryptographic Primitives for Non-Interactive Zero-Knowledge Proofs and Applications

Candidate: Haralambiev, Kristiyan

Advisor(s): Shoup, Victor

Abstract:

Non-interactive zero-knowledge (NIZK) proofs have enjoyed much interest in cryptography since they were introduced more than twenty years ago by Blum et al. [BFM88]. While quite useful when designing modular cryptographic schemes, until recently NIZK could be realized efficiently only using certain heuristics. However, such heuristic schemes have been widely criticized. In this work we focus on designing schemes which avoid them. In [GS08], Groth and Sahai presented the first efficient (and currently the only) NIZK proof system in the standard model. The construction is based on bilinear maps and is limited to languages of certain satisfiable systems of equations. Given this expressibility limitation of the system of equations, we are interested in cryptographic primitives that are "compatible" with it. Equipped with such primitives and the Groth-Sahai proof system, we show how to construct cryptographic schemes efficiently in a modular fashion.

In this work, we describe properties required by any cryptographic scheme to mesh well with Groth-Sahai proofs. Towards this, we introduce the notion of a "structure-preserving" cryptographic scheme. We present the first constant-size structure-preserving signature scheme for messages consisting of general bilinear group elements. This allows us (for the first time) to instantiate efficiently a modular construction of a round-optimal blind signature based on the framework of Fischlin [Fis06].

Our structure-preserving homomorphic trapdoor commitment schemes yield efficient leakage-resilient signatures (in the bounded leakage model) which satisfy the standard security requirements and additionally tolerate any amount of leakage; all previous works satisfied at most two of those three properties.

Lastly, we build a structure-preserving encryption scheme which satisfies the standard CCA security requirements. While somewhat similar to the notion of verifiable encryption, it provides better properties and yields the first efficient two-party protocol for joint ciphertext computation. Note that the efficient realization of such a protocol was not previously possible even using the heuristics mentioned above.

Lastly, in this line of work, we revisit the notion of simulation extractability and define "true-simulation extractable" NIZK proofs. Although quite similar to the notion of simulation-sound extractable NIZK proofs, there is a subtle but rather important difference which makes it weaker and easier to instantiate efficiently. As it turns out, in many scenarios, this new notion is sufficient, and using it, we can construct efficient leakage-resilient signatures and CCA encryption schemes.
Ph.D. Thesis 2011 Learning Feature Hierarchies for Object Recognition Kavukcuoglu, Koray Abstract | PDF

Title: Learning Feature Hierarchies for Object Recognition

Candidate: Kavukcuoglu, Koray

Advisor(s): LeCun, Yann

Abstract:

In this thesis we study unsupervised learning algorithms for training feature extractors and building deep learning models. We propose sparse-modeling algorithms as the foundation for unsupervised feature extraction systems. To reduce the cost of the inference process required to obtain the optimal sparse code, we model a feed-forward function that is trained to predict this optimal sparse code. Using an efficient predictor function enables the use of sparse coding in hierarchical models for object recognition. We demonstrate the performance of the developed system on several recognition tasks, including object recognition, handwritten digit classification and pedestrian detection. Robustness to noise or small variations in the input is a very desirable property for a feature extraction algorithm. In order to train locally-invariant feature extractors in an unsupervised manner, we use group sparsity criteria that promote similarity between the dictionary elements within a group. This model produces locally-invariant representations under small perturbations of the input, thus improving the robustness of the features. Many sparse modeling algorithms are trained on small image patches that are the same size as the dictionary elements. This forces the system to learn multiple shifted versions of each dictionary element. However, when used convolutionally over large images to extract features, these models produce very redundant representations. To avoid this problem, we propose convolutional sparse coding algorithms that yield a richer set of dictionary elements, reduce the redundancy of the representation and improve recognition performance.
Ph.D. Thesis 2011 Topics in Formal Synthesis and Modeling Klein, Uri Abstract | PDF

Title: Topics in Formal Synthesis and Modeling

Candidate: Klein, Uri

Advisor(s): Pnueli, Amir; Zuck, Lenore

Abstract:

The work presented focuses on two problems, that of synthesizing systems from formal specifications, and that of formalizing REST -- a popular web applications' development pattern.

For the synthesis problem, we distinguish between the synchronous and the asynchronous case. For the former, we solve a problem concerning a fundamental flaw in specification construction in previous work. We continue with exploring effective synthesis of asynchronous systems (programs on multi-threaded systems). Two alternative models of asynchrony are presented, and shown to be equally expressive for the purpose of synthesis.

REST is a software architectural style used for the design of highly scalable web applications. Interest in REST has grown rapidly over the past decade. However, there is also considerable confusion surrounding REST: many examples of supposedly RESTful APIs violate key REST constraints. We show that the constraints of REST and of RESTful HTTP can be precisely formulated within temporal logic. This leads to methods for model checking and run-time verification of RESTful behavior. We formulate several relevant verification questions and analyze their complexity.
Ph.D. Thesis 2011 Adaptive Isotopic Approximation of Nonsingular Curves and Surfaces Lin, Long Abstract | PDF

Title: Adaptive Isotopic Approximation of Nonsingular Curves and Surfaces

Candidate: Lin, Long

Advisor(s): Yap, Chee

Abstract:

Consider the problem of computing isotopic approximations of nonsingular curves and surfaces that are implicitly represented by equations of the form f(X, Y) = 0 and f(X, Y, Z) = 0. This fundamental problem has seen much progress along several fronts, but we will focus on domain subdivision algorithms. Two algorithms in this area are from Snyder (1992) and Plantinga and Vegter (2004). We introduce a family of new algorithms that combines the advantages of these two algorithms: like Snyder, we use the parameterizability criterion for subdivision, and like Plantinga and Vegter, we exploit nonlocal isotopy.

We first apply our approach to curves, resulting in a more efficient algorithm. We then extend our approach to surfaces. The extension is by no means routine, as the correctness arguments and case analysis are more subtle. Also, a new phenomenon arises in which local rules for constructing surfaces are no longer sufficient.

We further extend our algorithms in two important and practical directions: first, we allow subdivision cells to be non-squares or non-cubes, with arbitrary but bounded aspect ratios: in 2D, we allow boxes to be split into 2 or 4 children; and in 3D, we allow boxes to be split into 2, 4 or 8 children. Second, we allow the input region-of-interest (ROI) to have arbitrary geometry represented by a quadtree or octree, as long as the curves or surfaces have no singularities in the ROI and intersect the boundary of the ROI transversally.

Our algorithm is numerical because our primitives are based on interval arithmetic and exact BigFloat numbers. It is practical, easy to implement exactly (compared to algebraic approaches) and does not suffer from implementation gaps (compared to geometric approaches). We report some very encouraging experimental results, showing that our algorithms can be much more efficient than the algorithms of Plantinga and Vegter (2D and 3D) and Snyder (2D only).
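To illustrate how interval arithmetic drives domain subdivision, the sketch below applies only the simplest ingredient: an exclusion test that discards a box when the interval evaluation of f over the box cannot contain zero. It is not the thesis's algorithm (there is no parameterizability test and no isotopy guarantee), it uses ordinary floats rather than BigFloat numbers with controlled rounding, and the example curve, class, and function names are assumptions.

```python
class Interval:
    """Closed interval [lo, hi] with the few operations this example needs."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def square(self):
        a, b = self.lo * self.lo, self.hi * self.hi
        lo = 0.0 if self.lo <= 0.0 <= self.hi else min(a, b)
        return Interval(lo, max(a, b))

    def contains_zero(self):
        return self.lo <= 0.0 <= self.hi


def f_box(x, y):
    """Interval extension of f(X, Y) = X^2 + Y^2 - 1 (the unit circle)."""
    return x.square() + y.square() - Interval(1.0, 1.0)


def subdivide(x, y, depth):
    """Return the boxes at the given depth that the exclusion test cannot discard."""
    if not f_box(x, y).contains_zero():
        return []  # f has no zero in this box, so the curve misses it
    if depth == 0:
        return [(x, y)]
    xm, ym = (x.lo + x.hi) / 2, (y.lo + y.hi) / 2
    boxes = []
    # Split the box into 4 children, as in a quadtree.
    for cx in (Interval(x.lo, xm), Interval(xm, x.hi)):
        for cy in (Interval(y.lo, ym), Interval(ym, y.hi)):
            boxes.extend(subdivide(cx, cy, depth - 1))
    return boxes


candidates = subdivide(Interval(-2.0, 2.0), Interval(-2.0, 2.0), 4)
```

The surviving boxes form a band around the circle; boxes far from the curve, such as those near the origin, are pruned early without being split further.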
Ph.D. Thesis 2011 Real-Space Localization Methods for Minimizing the Kohn-Sham Energy Millstone, Marc Abstract | PDF

Title: Real-Space Localization Methods for Minimizing the Kohn-Sham Energy

Candidate: Millstone, Marc

Advisor(s): Overton, Michael

Abstract:

The combination of ever increasing computational power and new mathematical models has fundamentally changed the field of computational chemistry. One example of this is the use of new algorithms for computing the charge density of a molecular system from which one can predict many physical properties of the system.

This thesis presents two new algorithms for minimizing the Kohn-Sham energy, which is used to describe a system of non-interacting electrons through a set of single-particle wavefunctions. By exploiting a known localization region of the wavefunctions, each algorithm evaluates the Kohn-Sham energy function and gradient at a set of iterates that have a special sparsity structure. We have chosen to represent the problem in real-space using finite-differences, allowing us to efficiently evaluate the energy function and gradient using sparse linear algebra. Detailed numerical experiments are provided on a set of representative molecules demonstrating the performance and robustness of these methods.
Ph.D. Thesis 2011 Scoring-and-Unfolding Trimmed Tree Assembler: Algorithms for Assembling Genome Sequences Accurately and Efficiently Narzisi, Giuseppe Abstract | PDF

Title: Scoring-and-Unfolding Trimmed Tree Assembler: Algorithms for Assembling Genome Sequences Accurately and Efficiently

Candidate: Narzisi, Giuseppe

Advisor(s): Mishra, Bud

Abstract:

The recent advances in DNA sequencing technology and their many potential applications to Biology and Medicine have rekindled enormous interest in several classical algorithmic problems at the core of Genomics and Computational Biology: primarily, the whole-genome sequence assembly problem (WGSA). Two decades back, in the context of the Human Genome Project, the problem had received unprecedented scientific prominence: its computational complexity and intractability were thought to have been well understood; various competitive heuristics, thoroughly explored and the necessary software, properly implemented and validated. However, several recent studies, focusing on the experimental validation of de novo assemblies, have highlighted several limitations of the current assemblers.

Intrigued by these negative results, this dissertation reinvestigates the algorithmic techniques required to correctly and efficiently assemble genomes. Mired by its connection to a well-known NP-complete combinatorial optimization problem, historically, WGSA has been assumed to be amenable only to greedy and heuristic methods. By placing efficiency as their priority, these methods opted to rely on local searches, and are thus inherently approximate, ambiguous or error-prone. This dissertation presents a novel sequence assembler, SUTTA, that dispenses with the idea of limiting the solutions to just the approximated ones, and instead favors an approach that could potentially lead to an exhaustive (exponential-time) search of all possible layouts but tames the complexity through constrained search (Branch-and-Bound) and quick identification and pruning of implausible solutions.

Complementary to this problem is the task of validating the generated assemblies. Unfortunately, no commonly accepted method exists yet and widely used metrics to compare the assembled sequences emphasize only size, poorly capturing quality and accuracy. This dissertation also addresses these concerns by developing a more comprehensive metric, the Feature-Response Curve, that, using ideas from classical ROC (receiver-operating characteristic) curve, more faithfully captures the trade-off between contiguity and quality.

Finally, this dissertation demonstrates the advantages of a complete pipeline integrating base-calling (TotalReCaller) with assembly (SUTTA) in a Bayesian manner.
Ph.D. Thesis 2011 Cryptographic Resilience to Continual Information Leakage Wichs, Daniel Abstract | PDF

Title: Cryptographic Resilience to Continual Information Leakage

Candidate: Wichs, Daniel

Advisor(s): Dodis, Yevgeniy

Abstract:

We study the question of achieving cryptographic security on devices that leak information about their internal secret state to an external attacker. This study is motivated by the prevalence of side-channel attacks, where the physical characteristics of a computation (e.g. timing, power-consumption, temperature, radiation, acoustics, etc.) can be measured, and may reveal useful information about the internal state of a device. Since some such leakage is inevitably present in almost any physical implementation, we believe that this problem cannot be addressed by physical countermeasures alone. Instead, it should already be taken into account when designing the mathematical specification of cryptographic primitives and included in the formal study of their security.

In this thesis, we propose a new formal framework for modeling the leakage available to an attacker. This framework, called the continual leakage model, assumes that an attacker can continually learn arbitrary information about the internal secret state of a cryptographic scheme at any point in time, subject only to the constraint that the rate of leakage is bounded. More precisely, our model assumes some abstract notion of time periods. In each such period, the attacker can choose to learn arbitrary functions of the current secret state of the scheme, as long as the number of output bits leaked is not too large. In our solutions, cryptographic schemes will continually update their internal secret state at the end of each time period. This will ensure that leakage observed in different time periods cannot be meaningfully combined to break the security of the cryptosystem. Although these updates modify the secret state of the cryptosystem, the desired functionality of the scheme is preserved, and the users can remain oblivious to these updates. We construct signatures, encryption, and secret sharing/storage schemes in this model.
Ph.D. Thesis 2011 Surface Representation of Particle Based Fluids Yu, Jihun Abstract | PDF

Title: Surface Representation of Particle Based Fluids

Candidate: Yu, Jihun

Advisor(s): Yap, Chee

Abstract:

In this thesis, we focus on surface representation for particle-based fluid simulators such as Smoothed Particle Hydrodynamics (SPH). We first present a new surface reconstruction algorithm which formulates the implicit function as a sum of anisotropic smoothing kernels. The direction of anisotropy at a particle is determined by performing Weighted Principal Component Analysis (WPCA) over the neighboring particles. In addition, we perform a smoothing step that re-positions the centers of these smoothing kernels. Since these anisotropic smoothing kernels capture the local particle distributions more accurately, our method has advantages over existing methods in representing smooth surfaces, thin streams and sharp features of fluids. This method is fast, easy to implement, and the results demonstrate a significant improvement in the quality of reconstructed surfaces as compared to existing methods. Next, we introduce the idea of using an explicit triangle mesh to track the air/liquid interface in an SPH simulator.

Once an initial surface mesh is created, this mesh is carried forward in time using nearby particle velocities to advect the mesh vertices. The mesh connectivity remains mostly unchanged across time-steps; it is only modified locally for topology change events or for the improvement of triangle quality. In order to ensure that the surface mesh does not diverge from the underlying particle simulation, we periodically project the mesh surface onto an implicit surface defined by the physics simulation. The mesh surface presents several advantages over previous SPH surface tracking techniques: A new method for surface tension calculations clearly outperforms the state of the art in SPH surface tension for computer graphics. A new method for tracking detailed surface information (like colors) is less susceptible to numerical diffusion than competing techniques. Finally, a temporally-coherent surface mesh allows us to simulate high-resolution surface wave dynamics without being limited by the particle resolution of the SPH simulation.
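The first part of this abstract determines the anisotropy direction at a particle by weighted PCA over its neighbors. The following is a minimal 2D sketch of that step, assuming a weight that falls smoothly to zero at a support radius; the weight function, the restriction to 2D, and all names are assumptions made for illustration, not details taken from the thesis.

```python
import math


def weighted_pca_2d(center, neighbors, radius):
    """Return (direction, major_eigenvalue, minor_eigenvalue) of the neighborhood."""
    weights = []
    for p in neighbors:
        d = math.dist(center, p)
        # Assumed weight: 1 at the particle, decaying to 0 at the support radius.
        weights.append(max(0.0, 1.0 - (d / radius) ** 3))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no neighbors inside the support radius")

    # Weighted mean of the neighbor positions.
    mx = sum(w * p[0] for w, p in zip(weights, neighbors)) / total
    my = sum(w * p[1] for w, p in zip(weights, neighbors)) / total

    # Entries of the symmetric weighted covariance matrix [[a, b], [b, c]].
    a = sum(w * (p[0] - mx) ** 2 for w, p in zip(weights, neighbors)) / total
    b = sum(w * (p[0] - mx) * (p[1] - my) for w, p in zip(weights, neighbors)) / total
    c = sum(w * (p[1] - my) ** 2 for w, p in zip(weights, neighbors)) / total

    # Closed-form eigen-decomposition of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    half_trace = (a + c) / 2.0
    spread = math.hypot((a - c) / 2.0, b)
    direction = (math.cos(theta), math.sin(theta))
    return direction, half_trace + spread, half_trace - spread


# A thin, nearly horizontal stream of particles.
stream = [(-1.0, 0.0), (-0.5, 0.01), (0.0, 0.0), (0.5, -0.01), (1.0, 0.0)]
direction, major, minor = weighted_pca_2d((0.0, 0.0), stream, radius=2.0)
```

For the thin stream above, the principal direction lies almost along the x-axis and the major eigenvalue is far larger than the minor one, which is the kind of signal an anisotropic kernel can use to stretch along the stream.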
Ph.D. Thesis 2010 On the Randomness Requirements for Privacy Bosley, Carleton Abstract | PDF

Title: On the Randomness Requirements for Privacy

Candidate: Bosley, Carleton

Advisor(s): Dodis, Yevgeniy

Abstract:

Most cryptographic primitives require randomness (for example, to generate secret keys). Usually, one assumes that perfect randomness is available, but, conceivably, such primitives might be built under weaker, more realistic assumptions. This is known to be achievable for many authentication applications, where entropy alone is typically sufficient. In contrast, all known techniques for achieving privacy seem to fundamentally require (nearly) perfect randomness. We ask whether this is just a coincidence or whether privacy inherently requires true randomness.

We completely resolve this question for information-theoretic private-key encryption, where parties wish to encrypt a b-bit value using a shared secret key sampled from some imperfect source of randomness S. Our technique also extends to related primitives which are sufficiently binding and hiding, including computationally secure commitments and public-key encryption.

Our main result shows that if such an n-bit source S allows for a secure encryption of b bits, where b > log n, then one can deterministically extract nearly b almost perfect random bits from S. Further, the restriction that b > log n is nearly tight: there exist sources S allowing one to perfectly encrypt (log n - log log n) bits, but not to deterministically extract even a single slightly unbiased bit.

Hence, to a large extent, true randomness is inherent for encryption: either the key length must be exponential in the message length b, or one can deterministically extract nearly b almost unbiased random bits from the key. In particular, the one-time pad scheme is essentially "universal".
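For readers unfamiliar with the one-time pad mentioned above, a minimal sketch follows. It encrypts a message by XOR with a key of the same length. Using `secrets.token_bytes` as a stand-in for a perfectly random key source is an assumption of this example, as are the function names.

```python
import secrets

def otp_encrypt(message: bytes, key: bytes) -> bytes:
    """XOR each message byte with the corresponding key byte."""
    if len(key) != len(message):
        raise ValueError("one-time pad key must be exactly as long as the message")
    return bytes(m ^ k for m, k in zip(message, key))

# XOR is its own inverse, so decryption is the same operation.
otp_decrypt = otp_encrypt

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # stand-in for a uniformly random key
ciphertext = otp_encrypt(message, key)
recovered = otp_decrypt(ciphertext, key)
```

The key here is as long as the message and is assumed to be uniformly random, which is exactly the kind of randomness requirement the abstract argues is hard to avoid.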
Ph.D. Thesis 2010 Machine Learning Approaches to Gene Duplication and Transcription Regulation Chen, Huang-Wen Abstract | PDF

Title: Machine Learning Approaches to Gene Duplication and Transcription Regulation

Candidate: Chen, Huang-Wen

Advisor(s): Shasha, Dennis

Abstract:

Gene duplication can lead to genetic redundancy or functional divergence, when duplicated genes evolve independently or partition the original function. In this dissertation, we employed machine learning approaches to study two different views of this problem: 1) Redundome, which explored the redundancy of gene pairs in the genome of Arabidopsis thaliana, and 2) ContactBind, which focused on functional divergence of transcription factors by mutating contact residues to change binding affinity.

In the Redundome project, we used machine learning techniques to classify gene family members into redundant and non-redundant gene pairs in Arabidopsis thaliana, where sufficient genetic and genomic data is available. We showed that Support Vector Machines were two-fold more precise than single attribute classifiers, and performed among the best of the machine learning algorithms considered. Machine learning methods predicted that about half of all genes in Arabidopsis show the signature of redundancy with at least one, but typically fewer than three, other family members. Interestingly, a large proportion of predicted redundant gene pairs were relatively old duplications (e.g., Ks>1), suggesting that redundancy is stable over long evolutionary periods. The genome-wide predictions were plotted with similarity trees based on ClustalW alignment scores, and can be accessed at http://redundome.bio.nyu.edu .

In the ContactBind project, we used Bayesian networks to model dependences between contact residues in transcription factors and binding site sequences. Based on the models learned from various binding experiments, we predicted binding motifs and their locations on promoters for three families of transcription factors in three species. The predictions are publicly available at http://contactbind.bio.nyu.edu . The website also provides tools to predict binding motifs and their locations for novel protein sequences of transcription factors. Users can construct their own Bayesian networks for new families once such familial binding data is available.
Ph.D. Thesis 2010 New Privacy-Preserving Architectures for Identity-/Attribute-based Encryption Chow, Sze Ming Abstract | PDF

Title: New Privacy-Preserving Architectures for Identity-/Attribute-based Encryption

Candidate: Chow, Sze Ming

Advisor(s): Dodis, Yevgeniy; Shoup, Victor

Abstract:

The notion of identity-based encryption (IBE) was proposed as an economical alternative to public-key infrastructures. IBE is also a useful building block in various cryptographic primitives such as searchable encryption. A generalization of IBE is attribute-based encryption (ABE). A major application of ABE is fine-grained cryptographic access control of data. Research on these topics is still actively continuing.

However, the security and privacy of IBE and ABE hinge on the assumption that the authority which sets up the system is honest. Our study aims to reduce this trust assumption.

The inherent key escrow of IBE has sparked numerous debates in the cryptography/security community. A curious key generation center (KGC) can simply generate the user's private key to decrypt a ciphertext. However, can a KGC still decrypt if it does not know the intended recipient of the ciphertext? This question is answered by formalizing KGC anonymous ciphertext indistinguishability (ACI-KGC). None of the existing practical pairing-based IBE schemes without random oracles achieves this notion. In this thesis, we propose an IBE scheme with ACI-KGC, and a new system architecture with an anonymous secret key generation protocol such that the KGC can issue keys to authenticated users without knowing the list of users' identities. This also matches the practice that authentication should be done with the local registration authorities. Our proposal can be viewed as mitigating the key escrow problem in a new dimension.

For ABE, it is not realistic to trust a single authority to monitor all attributes, and hence distributing control over many attribute-authorities is desirable. A multi-authority ABE scheme can be realized with a trusted central authority (CA) which issues part of the decryption key according to a user's global identifier (GID). However, this CA may have the power to decrypt every ciphertext, and the use of a consistent GID allows the attribute-authorities to collectively build a full profile with all of a user's attributes. This thesis proposes a solution without the trusted CA and without compromising users' privacy, thus making ABE more usable in practice.

Underlying both contributions are our new privacy-preserving architectures, enabled by borrowing techniques from anonymous credentials.
Ph.D. Thesis 2010 Tools and Techniques for the Sound Verification of Low Level Code Conway, Christopher L. Abstract | PDF

Title: Tools and Techniques for the Sound Verification of Low Level Code

Candidate: Conway, Christopher L.

Advisor(s): Barrett, Clark

Abstract:

Software plays an increasingly crucial role in nearly every facet of modern life, from communications infrastructure to the control systems in automobiles, airplanes, and power plants. To achieve the highest degree of reliability for the most critical pieces of software, it is necessary to move beyond ad hoc testing and review processes towards verification, that is, to prove using formal methods that a piece of code exhibits exactly those behaviors allowed by its specification and no others.

A significant portion of the existing software infrastructure is written in low-level languages like C and C++. Features of these languages present significant verification challenges. For example, unrestricted pointer manipulation means that we cannot prove even the simplest properties of programs without first collecting precise information about potential aliasing relationships between variables.

In this thesis, I present several contributions. The first is a general framework for combining program analyses that are only conditionally sound. Using this framework, I show it is possible to design a sound verification tool that relies on a separate, previously-computed pointer analysis.

The second contribution of this thesis is Cascade, a multi-platform, multi-paradigm framework for verification. Cascade includes support for precise analysis of low-level C code, as well as for higher-level languages such as SPL.

Finally, I describe a novel technique for the verification of datatype invariants in low-level systems code. The programmer provides a high-level specification for a low-level implementation in the form of inductive datatype declarations and code assertions. The connection between the high-level semantics and the implementation code is then checked using bit-precise reasoning. An implementation of this datatype verification technique is available as a Cascade module.
Ph.D. Thesis 2010 Probabilistic and Topological methods in Computational Geometry Dhandapani, Raghavan Abstract | PDF

Title: Probabilistic and Topological methods in Computational Geometry

Candidate: Dhandapani, Raghavan

Advisor(s): Pach, Janos

Abstract:

We consider four problems connected by the common thread of geometry. The first three involve problems and algorithms that arise in applications that a priori do not involve geometry, but geometry turns out to be the right language for visualizing and analyzing them. In the fourth, we generalize some well-known results in geometry to the topological plane. The techniques we use come from probability and topology.

First, we consider two algorithms that work well in practice but whose success is not well understood theoretically.

Greedy routing is a routing mechanism that is commonly used in wireless sensor networks. While routing on the Internet uses standard established protocols, routing in ad-hoc networks with little structure (like sensor networks) is more difficult. Practitioners have devised algorithms that work well in practice; however, there were no known theoretical guarantees. We provide the first such result in this area by showing that greedy routing can be made to work on planar triangulations.
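The greedy routing rule itself is simple to state in code. The sketch below forwards a packet to whichever neighbor is geometrically closest to the destination and reports failure when no neighbor makes progress. The function name, the graphs, and the coordinates are hypothetical; the sketch does not reproduce the construction from the thesis that makes greedy routing succeed on planar triangulations.

```python
import math

def greedy_route(positions, adjacency, source, target):
    """Follow the greedy rule; return (path, delivered)."""
    def dist(u, v):
        (x1, y1), (x2, y2) = positions[u], positions[v]
        return math.hypot(x1 - x2, y1 - y2)

    path = [source]
    current = source
    while current != target:
        neighbors = adjacency[current]
        if not neighbors:
            return path, False
        best = min(neighbors, key=lambda v: dist(v, target))
        # Stuck at a local minimum: no neighbor is strictly closer to the target.
        if dist(best, target) >= dist(current, target):
            return path, False
        current = best
        path.append(current)
    return path, True

# A small strip of triangles: greedy routing reaches the target.
strip_pos = {0: (0, 0), 1: (1, 1), 2: (2, 0), 3: (3, 1), 4: (4, 0)}
strip_adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3, 4], 3: [1, 2, 4], 4: [2, 3]}
ok_path, ok = greedy_route(strip_pos, strip_adj, 0, 4)

# A detour-shaped graph: the only neighbor of the source is farther away.
detour_pos = {"s": (0, 0), "u": (0, 2), "v": (3, 2), "t": (3, 0)}
detour_adj = {"s": ["u"], "u": ["s", "v"], "v": ["u", "t"], "t": ["v"]}
stuck_path, delivered = greedy_route(detour_pos, detour_adj, "s", "t")
```

The second example shows why guarantees are non-trivial: with an unfavorable embedding the packet gets stuck at the source even though a path to the target exists.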

Linear Programming is a technique for optimizing a linear function subject to linear constraints. Simplex Algorithms are a family of algorithms that have proven quite successful in solving Linear Programs in practice. However, examples of Linear Programs on which these algorithms are very inefficient have been obtained by researchers. In order to explain this discrepancy between theory and practice, many authors have shown that Simplex Algorithms are efficient in expectation on randomized Linear Programs. We strengthen these results by proving a partial concentration bound for the Shadow Vertex Simplex Algorithm.

Next, we point out a limitation in an algorithm that is used commonly by practitioners and suggest a way of overcoming this.

Recommendation Systems are algorithms that are used to recommend goods (books, movies etc.) to users based on the similarities between their past preferences and those of other users. Low Rank Approximation is a common method used for this. We point out a common limitation of this method in the presence of ill-conditioning: the presence of multiple local minima. We also suggest a simple averaging based technique to overcome this limitation.

Finally, we consider some basic results in convexity like Radon's, Helly's and Caratheodory's theorems and generalize them to the topological plane, i.e., a plane which has the concept of a linear path which is analogous to a straight line but no notion of metric or distances.
Ph.D. Thesis 2010 Semi-Supervised Learning via Generalized Maximum Entropy Erkan, Ayse Naz Abstract | PDF

Title: Semi-Supervised Learning via Generalized Maximum Entropy

Candidate: Erkan, Ayse Naz

Advisor(s): LeCun, Yann

Abstract:

The maximum entropy (MaxEnt) framework has been studied extensively in the supervised setting. Here, the goal is to find a distribution p that maximizes an entropy function while enforcing data constraints so that the expected values of some (pre-defined) features with respect to p match their empirical counterparts approximately. Using different entropy measures, different model spaces for p and different approximation criteria for the data constraints yields a family of discriminative supervised learning methods (e.g., logistic regression, conditional random fields, least squares and boosting). This framework is known as the generalized maximum entropy framework.

Semi-supervised learning (SSL) has emerged in the last decade as a promising field that enables utilizing unlabeled data along with labeled data so as to increase the accuracy and robustness of inference algorithms. However, most SSL algorithms to date have had trade-offs, for instance in terms of scalability or applicability to multi-categorical data.

In this thesis, we extend the generalized MaxEnt framework to develop a family of novel SSL algorithms using two different approaches:

i. Introducing Similarity Constraints: We incorporate unlabeled data via modifications to the primal MaxEnt objective in terms of additional potential functions. A potential function stands for a closed proper convex function that can take the form of a constraint and/or a penalty representing our structural assumptions on the data geometry. Specifically, we impose similarity constraints as additional penalties based on the semi-supervised smoothness assumption; i.e., we restrict the generalized MaxEnt problem such that similar samples have similar model outputs.

ii. Augmenting Constraints on Model Features: We incorporate unlabeled data to enhance the estimates on the model and empirical expectations based on our assumptions on the data geometry.

In particular, we derive the semi-supervised formulations for three specific instances of the generalized MaxEnt on conditional distributions, namely logistic regression and kernel logistic regression for multi-class problems, and conditional random fields for structured output prediction problems. A thorough empirical evaluation on standard data sets that are widely used in the literature demonstrates the validity and competitiveness of the proposed algorithms. In addition to these benchmark data sets, we apply our approach to two real-life problems: i. vision based robot grasping, and ii. remote sensing image classification, where the scarcity of the labeled training samples is the main bottleneck in the learning process. For the particular case of grasp learning, we propose a combination of semi-supervised learning and active learning, another sub-field of machine learning that is focused on the scarcity of labeled samples, when the problem setup is suitable for incremental labeling.

The novel SSL algorithms proposed in this thesis have numerous advantages over the existing semi-supervised algorithms as they yield convex, scalable, inherently multi-class loss functions that can be kernelized naturally.
Ph.D. Thesis 2010 Solving Quantified First Order Formulas in Satisfiability Modulo Theories Ge, Yeting Abstract | PDF

Title: Solving Quantified First Order Formulas in Satisfiability Modulo Theories

Candidate: Ge, Yeting

Advisor(s): Barrett, Clark

Abstract:

Design errors in computer systems, i.e. bugs, can cause inconvenience, loss of data and time, and in some cases catastrophic damages. One approach for improving design correctness is formal methods: techniques aiming at mathematically establishing that a piece of hardware or software satisfies certain properties. For some industrial cases in which formal methods are utilized, quantified first order formulas in satisfiability modulo theories (SMT) are useful. This dissertation presents several novel techniques for solving quantified formulas in SMT.

In general, the satisfiability of quantified formulas in SMT is undecidable. The practical approach for general quantifier reasoning in SMT is heuristics-based instantiation. This dissertation proposes a number of new heuristics that address several challenges. Experimental results show that with the new heuristics significantly more benchmarks can be solved than before.

When we consider only formulas within certain fragments of first order logic, it is possible to have complete algorithms based on instantiation. We propose several new fragments, and we prove that formulas in these fragments can be solved by a complete algorithm based on instantiation. For satisfiable quantified formulas in these fragments, we show how to construct the models.

As SMT solvers grow in complexity, the correctness of SMT solvers becomes questionable. A practical method to improve the correctness is to check the proofs from SMT solvers. We propose a proof translator that translates proofs from the SMT solver CVC3 into a trusted solver, HOL Light, that actually checks the proofs. Experiments with the proof translator uncovered a faulty proof rule in CVC3 and two mislabeled quantified benchmarks in the SMT benchmark library SMT-LIB.
Ph.D. Thesis 2010 An Algorithmic Enquiry Concerning Causality Kleinberg, Samantha Abstract | PDF

Title: An Algorithmic Enquiry Concerning Causality

Candidate: Kleinberg, Samantha

Advisor(s): Mishra, Bhubaneswar

Abstract:

In many domains we face the problem of determining the underlying causal structure from time-course observations of a system. Whether we have neural spike trains in neuroscience, gene expression levels in systems biology, or stock price movements in finance, we want to determine why these systems behave the way they do. For this purpose we must assess which of the myriad possible causes are significant while aiming to do so with a feasible computational complexity. At the same time, there has been much work in philosophy on what it means for something to be a cause, but comparatively little attention has been paid to how we can identify these causes. Algorithmic approaches from computer science have provided the first steps in this direction, but fail to capture the complex, probabilistic and temporal nature of the relationships we seek.

This dissertation presents a novel approach to the inference of general (type-level) and singular (token-level) causes. The approach combines philosophical notions of causality with algorithmic approaches built on model checking and statistical techniques for false discovery rate control. By using a probabilistic computation tree logic to describe both cause and effect, we allow for complex relationships and explicit description of the time between cause and effect as well as the probability of this relationship being observed (e.g. "a and b until c, causing d in 10-20 time units"). Using these causal formulas and their associated probabilities, we develop a novel measure for the significance of a cause for its effect, thus allowing discovery of those that are statistically interesting, determined using the concepts of multiple hypothesis testing and false discovery control. We develop algorithms for testing these properties in time-series observations and for relating the inferred general relationships to token-level events (described as sequences of observations). Finally, we illustrate these ideas with example data from both neuroscience and finance, comparing the results to those found with other inference methods. The results demonstrate that our approach achieves superior control of false discovery rates, due to its ability to appropriately represent and infer temporal information.
Ph.D. Thesis 2010 Time Series Modeling with Hidden Variables and Gradient-Based Algorithms Mirowski, Piotr Abstract | PDF

Title: Time Series Modeling with Hidden Variables and Gradient-Based Algorithms

Candidate: Mirowski, Piotr

Advisor(s): LeCun, Yann

Abstract:

We collect time series from real-world phenomena, such as gene interactions in biology or word frequencies in consecutive news articles. However, these data present us with an incomplete picture, as they result from complex dynamical processes involving unobserved state variables. Research on state-space models is motivated by simultaneously trying to infer hidden state variables from observations, as well as learning the associated dynamic and generative models.

I have developed a tractable, gradient-based method for training Dynamic Factor Graphs (DFG) with continuous latent variables. A DFG consists of (potentially nonlinear) factors modeling joint probabilities between hidden and observed variables. The DFG assigns a scalar energy to each configuration of variables, and a gradient-based inference procedure finds the minimum-energy state sequence for a given observation sequence. We approximate maximum likelihood learning by minimizing the expected energy over training sequences with respect to the factors' parameters. These alternated inference and parameter updates constitute a deterministic EM-like procedure.

Using nonlinear factors such as deep, convolutional networks, DFGs were shown to reconstruct chaotic attractors, to outperform a time series prediction benchmark, and to successfully impute motion capture data where a large number of markers were missing. In joint work with the NYU Plant Systems Biology Lab, DFGs have subsequently been applied to the discovery of gene regulation networks by learning the dynamics of mRNA expression levels.

DFGs have also been extended into a deep auto-encoder architecture, and used on time-stamped text documents, with word frequencies as inputs. We focused on collections of documents that exhibit a structure over time. Working as dynamic topic models, DFGs could extract a latent trajectory from consecutive political speeches; applied to news articles, they achieved state-of-the-art text categorization and retrieval performance.

Finally, I used an embodiment of DFGs to evaluate the likelihood of discrete sequences of words in text corpora, relying on dynamics on word embeddings. Collaborating with AT&T Labs Research on a project in speech recognition, we have improved on existing continuous statistical language models by enriching them with word features and long-range topic dependencies.
Ph.D. Thesis 2010 Structure Prediction and Visualization in Molecular Biology Poultney, Christopher Abstract | PDF

Title: Structure Prediction and Visualization in Molecular Biology

Candidate: Poultney, Christopher

Advisor(s): Shasha, Dennis

Abstract:

The tools of computer science can be a tremendous help to the working biologist. Two broad areas where this is particularly true are visualization and prediction. In visualization, the size of the data involved often makes meaningful exploration of the data and discovery of salient features difficult and time-consuming. Similarly, intelligent prediction algorithms can greatly reduce the lab time required to achieve significant results, or can reduce an intractable space of potential experiments to a tractable size.

Whereas the thesis discusses both a visualization technique and a machine learning problem, the thesis presentation will focus exclusively on the machine learning problem: prediction of temperature-sensitive mutations from protein structure. Temperature-sensitive mutations are a tremendously valuable research tool, particularly for studying genes such as yeast essential genes. To date, most methods for generating temperature-sensitive mutations involve large-scale random mutations followed by an intensive screening and characterization process. While there have been successful efforts to improve this process by rational design of temperature-sensitive proteins, surprisingly little work has been done in the area of predicting those mutations that will exhibit a temperature-sensitive phenotype. We describe a system that, given the structure of a protein of interest, uses a combination of protein structure prediction and machine learning to provide a ranked "top 5" list of likely candidates for temperature-sensitive mutations.
Ph.D. Thesis 2010 Theoretical Foundations and Algorithms for Learning with Multiple Kernels Rostamizadeh, Afshin Abstract | PDF

Title: Theoretical Foundations and Algorithms for Learning with Multiple Kernels

Candidate: Rostamizadeh, Afshin

Advisor(s): Mohri, Mehryar

Abstract:

Kernel-based algorithms have been used with great success in a variety of machine learning applications. These include algorithms such as support vector machines for classification, kernel ridge regression, ranking algorithms, clustering algorithms, and virtually all popular dimensionality reduction algorithms, since they are special instances of kernel principal component analysis.

However, the choice of the kernel, which is crucial to the success of these algorithms, has traditionally been left entirely to the user. Rather than requesting the user to commit to a specific kernel, multiple kernel algorithms require the user only to specify a family of kernels. This family of kernels can be used by a learning algorithm to form a combined kernel and derive an accurate predictor. This is a problem that has attracted a lot of attention recently, both from the theoretical point of view and from the algorithmic, optimization, and application point of view.

This thesis presents a number of novel theoretical and algorithmic results for learning with multiple kernels.

It gives the first tight margin-based generalization bounds for learning kernels with Lp regularization. In particular, our margin bounds for L1 regularization are shown to have only a logarithmic dependency on the number of kernels, which is a significant improvement over all previous analyses. Our results also include stability-based guarantees for a class of regression algorithms. In all cases, these guarantees indicate the benefits of learning with a large number of kernels.

We also present a family of new two-stage algorithms for learning kernels based on a notion of alignment and give an extensive analysis of the properties of these algorithms. We show the existence of good predictors for the notion of alignment we define and give efficient algorithms for learning a maximum alignment kernel by showing that the problem can be reduced to a simple QP.
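To make the idea of alignment concrete, the sketch below computes one common form, centered kernel alignment, between a kernel matrix and the target kernel built from the labels. Whether this matches the exact definition used in the thesis is an assumption, and the function names and toy data are hypothetical. The QP-based learning step described above is not shown.

```python
import math

def center(K):
    """Center a square kernel matrix: K_ij - rowmean_i - colmean_j + totalmean."""
    m = len(K)
    row_means = [sum(row) / m for row in K]
    col_means = [sum(K[i][j] for i in range(m)) / m for j in range(m)]
    total_mean = sum(row_means) / m
    return [[K[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(m)] for i in range(m)]

def frobenius_inner(A, B):
    """Frobenius inner product: sum over all entries of A_ij * B_ij."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def centered_alignment(K1, K2):
    """Frobenius inner product of the centered matrices, normalized to [-1, 1]."""
    A, B = center(K1), center(K2)
    return frobenius_inner(A, B) / math.sqrt(frobenius_inner(A, A) * frobenius_inner(B, B))

def outer(v):
    """Rank-one linear kernel matrix v v^T for a one-dimensional feature."""
    return [[a * b for b in v] for a in v]

labels = [1.0, 1.0, -1.0, -1.0]
target = outer(labels)                            # target kernel y y^T
informative = outer([2.0, 1.5, -1.0, -2.5])       # feature correlated with labels
uninformative = outer([1.0, -1.0, 1.0, -1.0])     # feature orthogonal to labels

good = centered_alignment(informative, target)
bad = centered_alignment(uninformative, target)
```

A two-stage method in this spirit would first choose kernel combination weights that score well on such an alignment measure and then train a standard kernel predictor with the combined kernel.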

Finally, we also report the results of extensive experiments with our two-stage algorithms in classification and regression tasks, which show an improvement both over the uniform combination of kernels and over other state-of-the-art learning kernel methods for L1 and L2 regularization. These might constitute the first series of results for learning with multiple kernels that demonstrate a consistent improvement over a uniform combination of kernels.
Ph.D. Thesis 2010 Creating collections and evaluating viewpoints: Selection techniques for interface design Secord, Adrian Abstract | PDF

Title: Creating collections and evaluating viewpoints: Selection techniques for interface design

Candidate: Secord, Adrian

Advisor(s): Zorin, Denis

Abstract:

In computer graphics and user interface design, selection problems are those that require the user to select a collection consisting of a small number of items from a much larger library. This dissertation explores selection problems in two diverse domains: large personal multimedia collections, containing items such as personal photographs or songs, and camera positions for 3D objects, where each item is a different viewpoint observing an object. Multimedia collections are characterized by discrete items with strong associated metadata, while camera positions form a continuous space but are weak in metadata. In either domain, the items to be selected have rich interconnections and dependencies, making it difficult to successfully apply simple techniques (such as ranking) to aid the user. Accordingly, we develop separate approaches for the two domains.

For personal multimedia collections, we leverage the semantic metadata associated with each item (such as song title, artist name, etc.) and provide the user with a simple query language to describe their desired collection. Our system automatically suggests a collection of items that conform to the user's query. Since any query language has limited expressive power, and since users often create collections via exploration, we provide various refinement techniques that allow the user to expand, refine and explore their collection directly through examples.

For camera positioning, we do not have the advantage of having semantic metadata for each item, unlike in media collections. We instead create a proxy viewpoint goodness function which can be used to guide the solution of various selection problems involving camera viewpoints. This function is constructed from several different attributes of the viewpoint, such as how much surface area is visible, or how "curvy" the silhouette is. Since there are many possible viewpoint goodness functions, we conducted a large user study of viewpoint preference and use the results to evaluate thousands of different functions and find the best ones. While we suggest several goodness functions to the practitioner, our user study data and methodology can be used to evaluate any proposed goodness function; we hope it will be a useful tool for other researchers.
Ph.D. Thesis 2010 Analysis of Mass Spectrometry Data for Protein Identification In Complex Biological Mixtures Spivak, Marina Abstract | PDF

Title: Analysis of Mass Spectrometry Data for Protein Identification In Complex Biological Mixtures

Candidate: Spivak, Marina

Advisor(s): Greengard, Leslie

Abstract:

Mass spectrometry is a powerful technique in analytical chemistry that was originally designed to determine the composition of small molecules in terms of their constituent elements. In the last several decades, it has begun to be used for much more complex tasks, including the detailed analysis of the amino acid sequence that makes up an unknown protein and even the identification of multiple proteins present in a complex mixture. The latter problem is largely unsolved and the principal subject of this dissertation.

The fundamental difficulty in the analysis of mass spectrometry data is that of ill-posedness. There are multiple solutions consistent with the experimental data and the data is subject to significant amounts of noise. In this work, we have developed application-specific machine learning algorithms that (partially) overcome this ill-posedness. We make use of labeled examples of a single class of peptide fragments and of the unlabeled fragments detected by the instrument. This places the approach within the broader framework of semi-supervised learning.

Recently, there has been considerable interest in classification problems of this type, where the learning algorithm only has access to labeled examples of a single class and unlabeled data. The motivation for such problems is that in many applications, examples of one of the two classes are easy and inexpensive to obtain, whereas the acquisition of examples of a second class is difficult and labor-intensive. For example, in document classification, positive examples are documents that address a specific subject, while unlabeled documents are abundant. In movie rating, the positive data are the movies chosen by clients, while the unlabeled data are all remaining movies in a collection. In medical imaging, positive (labeled) data correspond to images of tissue affected by a disease, while the remaining available images of the same tissue comprise the unlabeled data. Protein identification using mass spectrometry is another variant of such a general problem.

In this work, we propose application-specific machine learning algorithms to address this problem. The reliable identification of proteins from mixtures using mass spectrometry would provide an important tool in both biomedical research and clinical practice.
Ph.D. Thesis 2010 Matrix Approximation for Large-scale Learning Talwalkar, Ameet Abstract | PDF

Title: Matrix Approximation for Large-scale Learning

Candidate: Talwalkar, Ameet

Advisor(s): Mohri, Mehryar

Abstract:

Modern learning problems in computer vision, natural language processing, computational biology, and other areas are often based on large data sets of thousands to millions of training instances. However, several standard learning algorithms, such as kernel-based algorithms, e.g., Support Vector Machines, Kernel Ridge Regression, Kernel PCA, do not easily scale to such orders of magnitude. This thesis focuses on sampling-based matrix approximation techniques that help scale kernel-based algorithms to large-scale datasets. We address several fundamental theoretical and empirical questions including:

What approximation should be used? We discuss two common sampling-based methods, providing novel theoretical insights regarding their suitability for various applications and experimental results motivated by this theory. Our results show that one of these methods, the Nystrom method, is superior in the context of large-scale learning.

Do these approximations work in practice? We show the effectiveness of approximation techniques on a variety of problems. In the largest study to date for manifold learning, we use the Nystrom method to extract low-dimensional structure from high-dimensional data to effectively cluster face images. We also report good empirical results for kernel ridge regression and kernel logistic regression.

How should we sample columns? A key aspect of sampling-based algorithms is the distribution according to which columns are sampled. We study both fixed and adaptive sampling schemes as well as a promising ensemble technique that can be easily parallelized and generates superior approximations, both in theory and in practice.

How well do these approximations work in theory? We provide theoretical analyses of the Nystrom method to understand when this technique should be used. We present guarantees on approximation accuracy based on various matrix properties and analyze the effect of matrix approximation on actual kernel-based algorithms.
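The sampling-based approximation discussed in the questions above can be illustrated with a minimal sketch of the standard Nystrom construction with uniform column sampling. This is not the thesis's code: the function name `nystrom_approximation`, the synthetic rank-3 kernel matrix, and the pseudo-inverse cutoff `rcond=1e-10` are assumptions chosen for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def nystrom_approximation(K, num_columns, rng):
    """Approximate a symmetric PSD matrix K from a uniform sample of its columns.

    Hypothetical helper for illustration: K is approximated by C W^+ C^T,
    where C holds the sampled columns and W is the block of K at the
    intersection of the sampled rows and columns.
    """
    n = K.shape[0]
    idx = rng.choice(n, size=num_columns, replace=False)  # uniform sampling
    C = K[:, idx]                # n x m block of sampled columns
    W = K[np.ix_(idx, idx)]      # m x m intersection block
    # The cutoff is an assumption: it discards numerically zero singular values.
    W_pinv = np.linalg.pinv(W, rcond=1e-10)
    return C @ W_pinv @ C.T


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
K = X @ X.T                      # synthetic linear-kernel matrix of rank 3
K_approx = nystrom_approximation(K, num_columns=10, rng=rng)
relative_error = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print(f"relative Frobenius error: {relative_error:.2e}")
```

Because the synthetic matrix has rank 3 and ten columns are sampled, the reconstruction is exact up to rounding. For a full-rank kernel matrix the error depends on the spectrum and on how the columns are sampled, which is the subject of the questions above.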

This work has important consequences for the machine learning community since it extends to large-scale applications the benefits of kernel-based algorithms. The crucial aspect of this research, involving low-rank matrix approximation, is of independent interest within the field of numerical linear algebra.
Ph.D. Thesis 2009 Factor Graphs for Relational Regression Chopra, Sumit Abstract | PDF

Title: Factor Graphs for Relational Regression

Candidate: Chopra, Sumit

Advisor(s): LeCun, Yann

Abstract:

Inherent in many interesting regression problems is a rich underlying inter-sample "Relational Structure". In these problems, the samples may be related to each other in ways such that the unknown variables associated with any sample not only depend on its individual attributes, but also depend on the variables associated with related samples. One such problem, whose importance is further emphasized by the present economic crises, is understanding real estate prices. The price of a house clearly depends on its individual attributes, such as the number of bedrooms. However, the price also depends on the neighborhood in which the house lies and on the time period in which it was sold. This effect of neighborhood and time on the price is not directly measurable. It is merely reflected in the prices of other houses in the vicinity that were sold around the same time period. Uncovering these spatio-temporal dependencies can certainly help better understand house prices, while at the same time improving prediction accuracy.

Problems of this nature fall in the domain of "Statistical Relational Learning". However, the drawback of most models proposed so far is that they cater only to classification problems. To this end, we propose "relational factor graph" models for doing regression in relational data. A single factor graph is used to capture, one, dependencies among individual variables of a sample, and two, dependencies among variables associated with multiple samples. The proposed models are capable of capturing hidden inter-sample dependencies via latent variables, and also permit non-linear log-likelihood functions in parameter space, thereby allowing considerably more complex architectures. Efficient inference and learning algorithms for relational factor graphs are proposed. The models are applied to predicting the prices of real estate properties and to constructing house price indices. The relational aspect of the model accounts for the hidden spatio-temporal influences on the price of every house. Experiments show that one can achieve considerably superior performance by identifying and using the underlying spatio-temporal structure associated with the problem. To the best of our knowledge this is the first work in the direction of relational regression and is also the first work in constructing house price indices by simultaneously accounting for the spatio-temporal effects on house prices using a large-scale industry standard data set.
Ph.D. Thesis 2009 Numerical Estimation of the Second Largest Eigenvalue of a Reversible Markov Transition Matrix Gade, Kranthi Abstract | PDF

Title: Numerical Estimation of the Second Largest Eigenvalue of a Reversible Markov Transition Matrix

Candidate: Gade, Kranthi

Advisor(s): Goodman, Jonathan

Abstract:

We discuss the problem of finding the second largest eigenvalue of an operator that defines a reversible Markov chain. The second largest eigenvalue governs the rate at which the statistics of the Markov chain converge to equilibrium. Scientific applications include understanding the very slow dynamics of some models of dynamic glass. Applications in computing include estimating the rate of convergence of Markov chain Monte Carlo algorithms.

Most practical Markov chains have state spaces so large that direct or even iterative methods from linear algebra are inapplicable. The size of the state space, which is the dimension of the eigenvalue problem, grows exponentially with the system size. This makes it impossible to store a vector (for sparse methods), let alone a matrix (for dense methods). Instead, we seek a method that uses only time correlation from samples produced from the Markov chain itself.

In this thesis, we propose a novel Krylov subspace type method to estimate the second eigenvalue from the simulation data of the Markov chain using test functions which are known to have good overlap with the slowest mode. This method starts with the naive Rayleigh quotient estimate of the test function and refines it to obtain an improved estimate of the second eigenvalue. We apply the method to a few model problems and the estimate compares very favorably with the known answer. We also apply the estimator to some Markov chains occurring in practice, most notably in the study of glasses. We show experimentally that our estimator is more accurate and stable for these problems compared to the existing methods.
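As a rough illustration of the naive Rayleigh quotient starting point mentioned in this abstract (not the Krylov subspace refinement, which the abstract does not spell out), the sketch below estimates the second eigenvalue of a two-state reversible chain from simulated data alone. The estimate is the ratio of the lag-1 autocovariance to the variance of a test function along the trajectory. The chain, its transition probabilities `a` and `b`, and all function names are assumptions made for this example.

```python
import random


def simulate_two_state_chain(a, b, num_steps, rng):
    """Simulate a chain on {0, 1} with P(0 -> 1) = a and P(1 -> 0) = b."""
    state = 0
    path = []
    for _ in range(num_steps):
        path.append(state)
        u = rng.random()
        if state == 0 and u < a:
            state = 1
        elif state == 1 and u < b:
            state = 0
    return path


def rayleigh_quotient_estimate(values):
    """Ratio of lag-1 autocovariance to variance of a test function f(X_t)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    c0 = sum(x * x for x in centered) / n
    c1 = sum(centered[i] * centered[i + 1] for i in range(n - 1)) / (n - 1)
    return c1 / c0


rng = random.Random(0)
a, b = 0.1, 0.2
path = simulate_two_state_chain(a, b, 200_000, rng)
# Test function f(x) = x; on a two-state chain any non-constant f,
# once centered, is proportional to the second eigenvector.
estimate = rayleigh_quotient_estimate([float(s) for s in path])
exact = 1.0 - a - b  # second eigenvalue of [[1 - a, a], [b, 1 - b]]
print(f"estimate = {estimate:.4f}, exact = {exact:.4f}")
```

On a two-state chain the naive estimate is consistent because the test function has no overlap with any other mode. On larger state spaces the test function generally mixes several modes, which is the situation a refinement step is meant to address.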
Ph.D. Thesis 2009 2D-Centric Interfaces and Algorithms for 3D Modeling Gingold, Yotam Abstract | PDF

Title: 2D-Centric Interfaces and Algorithms for 3D Modeling

Candidate: Gingold, Yotam

Advisor(s): Zorin, Denis

Abstract:

The creation of 3D models is a fundamental task in computer graphics. The task is required by professional artists working on movies, television, and games, and desired by casual users who wish to make their own models for use in virtual worlds or as a hobby.

In this thesis, we consider approaches to creating and editing 3D models that minimize the user's thinking in 3D. In particular, our approaches do not require the user to manipulate 3D positions in space or mentally invert complex 3D-to-2D mappings. We present interfaces and algorithms for the creation of 3D surfaces, for texturing, and for adding small-to-medium scale geometric detail.

First, we present a novel approach for texture placement and editing based on direct manipulation of textures on the surface. Compared to conventional tools for surface texturing, our system combines UV-coordinate specification and texture editing into one seamless process, reducing the need for careful initial design of parameterization and providing a natural interface for working with textures directly on 3D surfaces.

Second, we present a system for free-form surface modeling that allows a user to modify a shape by changing its rendered, shaded image using stroke-based drawing tools. A new shape, whose rendered image closely approximates user input, is computed using an efficient and stable surface optimization procedure. We demonstrate how several types of free-form surface edits which may be difficult to cast in terms of standard deformation approaches can be easily performed using our system.

Third, we present a single-view 2D interface for 3D modeling based on the idea of placing 2D primitives and annotations on an existing, pre-made sketch or image. Our interface frees users to create 2D sketches from arbitrary angles using their preferred tool---including pencil and paper---which they then "describe" using our tool to create a 3D model. Our primitives are manipulated with persistent, dynamic handles, and our annotations take the form of markings commonly used in geometry textbooks.
Ph.D. Thesis 2009 Proximity problems for point sets residing in spaces with low doubling dimension Gottlieb, Lee-Ad Abstract | PDF

Title: Proximity problems for point sets residing in spaces with low doubling dimension

Candidate: Gottlieb, Lee-Ad

Advisor(s): Cole, Richard

Abstract:

In this thesis we consider proximity problems on point sets. Proximity problems arise in all fields of computer science, with broad application to computational geometry, machine learning, computational biology, data mining and the like. In particular, we will consider the problems of approximate nearest neighbor search, and dynamic maintenance of a spanner for a point set.

It has been conjectured that all algorithms for these two problems suffer from the "curse of dimensionality," which means that their run times grow exponentially with the dimension of the point set. To avoid this undesirable growth, we consider point sets that occupy a doubling dimension lambda. We first present a dynamic data structure that uses linear space and supports a (1+e)-approximate nearest neighbor search of the point set. We then extend this algorithm to allow the dynamic maintenance of a low degree (1+e)-spanner for the point set. The query and update times of these structures are exponential in lambda (as opposed to exponential in the dimension); when lambda is small, this provides a significant speed-up over known algorithms, and when lambda is constant then these run times are optimal up to a constant. Even when no assumptions are made on lambda, the query and update times of the nearest neighbor search structure match the best known run times for approximate nearest neighbor search (up to a constant multiple in lambda). Further, the stretch of the spanner is optimal, and its update times exceed all previously known algorithms.
Ph.D. Thesis 2009 Creativity Support for Computational Literature Howe, Daniel Abstract | PDF

Title: Creativity Support for Computational Literature

Candidate: Howe, Daniel

Advisor(s): Perlin, Ken

Abstract:

The creativity support community has a long history of providing valuable tools to artists and designers. Similarly, creative digital media practice has proven a valuable pedagogical strategy for teaching core computational ideas. Neither strain of research has focused on the domain of literary art, however, instead targeting visual and aural media almost exclusively.

To address this situation, this thesis presents a software toolkit created specifically to support creativity in computational literature. Two primary hypotheses direct the bulk of the research presented: first, that it is possible to implement effective creativity support tools for literary art given current resource constraints; and second, that such tools, in addition to facilitating new forms of literary creativity, will provide unique opportunities for computer science education.

Designed both for practicing artists and for pedagogy, the research presented directly addresses impediments to participation in the field for a diverse range of users and provides an end-to-end solution for courses attempting to engage the creative faculties of computer science students, and to introduce a wider demographic--from writers, to digital artists, to media and literary theorists--to procedural literacy and computational thinking.

The tools and strategies presented have been implemented, deployed, and iteratively refined in real-world contexts over the past three years. In addition to their use in large-scale projects by contemporary artists, they have provided effective support for multiple iterations of 'Programming for Digital Art & Literature', a successful inter-disciplinary computer science course taught by the author.

Taken together, this thesis provides a novel set of tools for a new domain, and demonstrates their real-world efficacy in providing both creativity and pedagogical support for a diverse and emerging population of users.
Ph.D. Thesis 2009 Efficient Systems Biology Algorithms for Biological Networks over Multiple Time-Scales: From Evolutionary to Regulatory Time Mitrofanova, Antonina Abstract | PDF

Title: Efficient Systems Biology Algorithms for Biological Networks over Multiple Time-Scales: From Evolutionary to Regulatory Time

Candidate: Mitrofanova, Antonina

Advisor(s): Mishra, Bud

Abstract:

Recently, Computational Biology has emerged as one of the most exciting areas of computer science research, not only because of its immediate impact on many biomedical applications (e.g., personalized medicine, drug and vaccine discovery, tools for diagnostics and therapeutic interventions), but also because it raises many new and interesting combinatorial and algorithmic questions in the process. In this thesis, we focus on robust and efficient algorithms to analyze biological networks, primarily targeting protein networks, possibly the most fascinating networks in computational biology in terms of their structure, evolution and complexity, as well as because of their role in various genetic and metabolic diseases.

Classically, protein networks have been studied statically, i.e., without taking into account time-dependent metamorphic changes in network topology and functionality. In this work, we introduce new analysis techniques that view protein networks as being dynamic in nature, evolving over time, and diverse in regulatory patterns at various stages of the system development. Our analysis is capable of dealing with multiple time-scales: ranging from the slowest time-scale corresponding to evolutionary time between species, speeding up to inter-species pathway evolution time, and finally, moving to the other extreme at the cellular developmental time-scale.

We also provide a new method to overcome limitations imposed by corrupting effects of experimental noise (e.g., high false positive and false negative rates) in Yeast Two-Hybrid (Y2H) networks, which often provide primary data for protein complexes. Our new combinatorial algorithm measures connectivity between proteins in a Y2H network not by edges but by edge-disjoint paths, which reflects pathway evolution better within a single-species network. This algorithm has been shown to be robust against increasing false positives and false negatives, as estimated using variation of information and separation measures.
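To make the notion of connectivity by edge-disjoint paths concrete, the sketch below counts the maximum number of edge-disjoint paths between two nodes of an undirected graph using unit-capacity maximum flow with breadth-first augmenting paths. This is a textbook technique (Menger's theorem via Edmonds-Karp), shown as an assumption about how such a count could be computed; it is not the thesis's algorithm, and the protein names and the function `edge_disjoint_paths` are hypothetical.

```python
from collections import deque


def edge_disjoint_paths(edges, source, target):
    """Count edge-disjoint paths between source and target (undirected graph)."""
    if source == target:
        raise ValueError("source and target must differ")
    # Each undirected edge becomes a pair of unit-capacity arcs.
    cap = {}
    for u, v in edges:
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + 1
        cap[v][u] = cap[v].get(u, 0) + 1
    if source not in cap or target not in cap:
        return 0
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return flow
        # Push one unit of flow back along the path that was found.
        v = target
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        flow += 1


# Hypothetical interaction network: P1 and P4 share one direct edge and two detours.
interactions = [("P1", "P2"), ("P2", "P4"), ("P1", "P3"), ("P3", "P4"),
                ("P1", "P4"), ("P4", "P5")]
print(edge_disjoint_paths(interactions, "P1", "P4"))
print(edge_disjoint_paths(interactions, "P1", "P5"))
```

A pair of proteins joined by several edge-disjoint paths stays connected when a single spurious edge is removed, and a single false-positive edge adds at most one path, which gives some intuition for why such a measure could tolerate noisy interaction data.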

In addition, we have devised a new way to incorporate evolutionary information in order to significantly improve classification of proteins, especially those isolated in their own networks or surrounded by poorly characterized neighbors. In our method, the networks of two (or more) species are joined by edges of high sequence similarity so that protein-homologs of different species can exchange information and acquire new and improved functional associations.

Finally, we have integrated many of these techniques into one tool to create a novel analysis of malaria parasite P. falciparum's life-cycle at the scale of reaction-time, single cell level, and encompassing its entire inter-erythrocytic developmental cycle (IDC). Our approach allows connecting time-course gene expression profiles of consecutive IDC stages in order to assign functions to un-annotated Malaria proteins and predict potential targets for vaccine and drug development.
Ph.D. Thesis 2009 Detecting, modeling and rendering complex configurations of curvilinear features Parilov, Evgueni Abstract | PDF

Title: Detecting, modeling and rendering complex configurations of curvilinear features

Candidate: Parilov, Evgueni

Advisor(s): Zorin, Denis

Abstract:

Curvilinear features allow one to represent a variety of real world regular patterns like honeycomb tiling as well as very complicated random patterns like networks of furrows on the surface of the human skin. We have developed a set of methods and new data representations for solving key problems related to curvilinear features, which include robust detection of intricate networks of curvilinear features from digital images, GPU-based sharp rendering of fields with curvilinear features, and a parametric synthesis approach to generate systems of curvilinear features with desirable local configurations and global control.

The existing edge-detection techniques may underperform in the presence of noise, usually do not link the detected edge points into chains, often fail on complex structures, heavily depend on the initial guess, and assume a significant manual phase. We have developed a technique based on active contours, or snakes, which avoids manual initial positioning of the snakes and can detect large networks of curves with complex junctions without user guidance.

The standard bilinear interpolation of piecewise continuous fields results in unwanted smoothing along the curvilinear discontinuities. Spatially varying features can be best represented as a function of the distance to the discontinuity curves and its gradient. We have developed a real-time, GPU-based method for interpolating the unsigned distance function field and its gradient field that preserves discontinuity feature curves, represented by quadratic Bezier curves, with minimal restriction on their topology.

Detail features are very important visual clues which make computer-generated imagery look less artificial. Instead of using a sample-based synthesis technique, which lacks user control over features and usually produces gaps in features or breaks feature coherency, we have explored an alternative approach of generating features using random fibre processes. We have developed a Gibbs-type random process of linear fibres based on local fibre interactions. It allows generating non-stationary curvilinear networks with some degree of regularity, and provides an intuitive set of parameters which directly defines fibre local configurations and the global pattern of fibres.

For random systems of linear fibres which approximately form two orthogonal dominant orientation fields, we have adapted a streamline placement algorithm which converts such systems into overlapping random sets of coherent smooth curves.
Ph.D. Thesis 2009 Unsupervised Learning of Feature Hierarchies Ranzato, Marc'Aurelio Abstract | PDF

Title: Unsupervised Learning of Feature Hierarchies

Candidate: Ranzato, Marc'Aurelio

Advisor(s): LeCun, Yann

Abstract:

The applicability of machine learning methods is often limited by the amount of available labeled data, and by the ability (or inability) of the designer to produce good internal representations and good similarity measures for the input data vectors.

The aim of this thesis is to alleviate these two limitations by proposing algorithms to learn good internal representations, and invariant feature hierarchies from unlabeled data. These methods go beyond traditional supervised learning algorithms, and rely on unsupervised, and semi-supervised learning.

In particular, this work focuses on "deep learning" methods, a set of techniques and principles to train hierarchical models. Hierarchical models produce feature hierarchies that can capture complex non-linear dependencies among the observed data variables in a concise and efficient manner. After training, these models can be employed in real-time systems because they compute the representation by a very fast forward propagation of the input through a sequence of non-linear transformations.

When the paucity of labeled data does not allow the use of traditional supervised algorithms, each layer of the hierarchy can be trained in sequence starting at the bottom by using unsupervised or semi-supervised algorithms. Once each layer has been trained, the whole system can be fine-tuned in an end-to-end fashion. We propose several unsupervised algorithms that can be used as building blocks to train such feature hierarchies. We investigate algorithms that produce sparse overcomplete representations and features that are invariant to known and learned transformations. These algorithms are designed using the Energy-Based Model framework and gradient-based optimization techniques that scale well on large datasets. The principle underlying these algorithms is to learn representations that are at the same time sparse, able to reconstruct the observation, and directly predictable by some learned mapping that can be used for fast inference at test time.

With the general principles at the foundation of these algorithms, we validate these models on a variety of tasks, from visual object recognition to text document classification and retrieval.
Ph.D. Thesis 2009 Search Problems for Speech and Audio Sequences Weinstein, Eugene Abstract | PDF

Title: Search Problems for Speech and Audio Sequences

Candidate: Weinstein, Eugene

Advisor(s): Mohri, Mehryar

Abstract:

The modern proliferation of very large audio, video, and biological databases has created a need for the design of effective methods for indexing and searching highly variable or uncertain data. Classical search and indexing algorithms deal with clean or perfect input sequences. However, an index created from speech transcriptions is marked with errors and uncertainties stemming from the use of imperfect statistical models in the speech recognition process. Similarly, automatic transcription of music, such as assigning a sequence of notes to represent a stream of music audio, is prone to errors. How can we generalize search and indexing algorithms to deal with such uncertain inputs?

This thesis presents several novel algorithms, analyses, and general techniques and tools for effective indexing and search that not only tolerate but actually exploit this uncertainty. In particular, it develops an algorithmic foundation for music identification, or content-based music search; presents novel automata-theoretic results applicable generally to a variety of search and indexing tasks; and describes new algorithms for topic segmentation, or automatic splitting of speech streams into topic-coherent segments.

We devise a new technique for music identification in which each song is represented by a distinct sequence of music sounds, called "music phonemes." In our approach, we learn the set of music phonemes, as well as a unique sequence of music phonemes characterizing each song, from training data using an unsupervised algorithm. We also propose a novel application of factor automata to create a compact mapping of music phoneme sequences to songs. Using these techniques, we construct an efficient and robust music identification system for a large database of songs.

We further design new algorithms for compact indexing of uncertain inputs based on suffix and factor automata and give novel theoretical guarantees for their space requirements. Suffix automata and factor automata represent the set of all suffixes or substrings of a set of strings, and are used in numerous indexing and search tasks, including the music identification system just mentioned. We show that the suffix automaton or factor automaton of a set of strings U has at most 2Q-2 states, where Q is the number of nodes of a prefix-tree representing the strings in U, a significant improvement over previous work. We also describe a matching new linear-time algorithm for constructing the suffix automaton S or factor automaton F of U in time O(|S|).
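For a concrete sense of the state bound, the sketch below builds the suffix automaton of a single string with the classical online construction and compares its size with 2Q-2. It illustrates only the single-string special case, where the prefix tree is a chain with Q = n + 1 nodes, and is not the thesis's algorithm for a set of strings. The function names are hypothetical.

```python
def build_suffix_automaton(s):
    """Classical online suffix automaton construction for one string.

    Returns a list of transition dictionaries; state 0 is the initial state.
    """
    nxt = [{}]      # outgoing transitions per state
    link = [-1]     # suffix links
    length = [0]    # length of the longest string reaching each state
    last = 0
    for ch in s:
        cur = len(nxt)
        nxt.append({})
        link.append(-1)
        length.append(length[last] + 1)
        p = last
        while p != -1 and ch not in nxt[p]:
            nxt[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = nxt[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split state q by cloning it with a shorter maximal length.
                clone = len(nxt)
                nxt.append(dict(nxt[q]))
                link.append(link[q])
                length.append(length[p] + 1)
                while p != -1 and nxt[p].get(ch) == q:
                    nxt[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    return nxt


def accepts_factor(automaton, word):
    """Return True if word labels a path from the initial state."""
    state = 0
    for ch in word:
        if ch not in automaton[state]:
            return False
        state = automaton[state][ch]
    return True


s = "abcbc"
automaton = build_suffix_automaton(s)
num_states = len(automaton)
Q = len(s) + 1  # nodes of the prefix tree of a single string (a chain)
print(num_states, "states; bound 2Q-2 =", 2 * Q - 2)
print(accepts_factor(automaton, "cbc"), accepts_factor(automaton, "ca"))
```

Ignoring final-state marking, every path from the initial state spells a substring of the input, so the same structure serves as a factor automaton for lookup purposes.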

We also define a new quality measure for topic segmentation systems and design a discriminative topic segmentation algorithm for speech inputs, thus facilitating effective indexation of spoken audio collections. The new quality measure improves on previously used criteria and is correlated with human judgment of topic-coherence. Our segmentation algorithm uses a novel general topical similarity score based on word co-occurrence statistics. This new algorithm outperforms previous methods in experiments over speech and text streams. We further demonstrate that the performance of segmentation algorithms can be improved by using a lattice of competing hypotheses over the speech stream rather than just the one-best hypothesis as input.
Ph.D. Thesis 2009 Using Application-Domain Knowledge in the Runtime Support of Multi-Experiment Computational Studies Yau, Siu-Man Abstract | PDF

Title: Using Application-Domain Knowledge in the Runtime Support of Multi-Experiment Computational Studies

Candidate: Yau, Siu-Man

Advisor(s): Karamcheti, Vijay; Zorin, Denis

Abstract:

A Multi-Experiment Study (MES) is a type of computational study in which the same simulation software is executed multiple times, and the results of all executions need to be aggregated to obtain useful insight. As computational simulation experiments become increasingly accepted as part of the scientific process, the use of MESs is becoming more wide-spread among scientists and engineers.

MESs present several challenging requirements on the computing system. First, many MESs need constant user monitoring and feedback, requiring simultaneous steering of multiple executions of the simulation code. Second, MESs can consist of many executions of long-running simulations; the sheer volume of computation can make them prohibitively long to run.

Parallel architectures offer an attractive computing platform for MESs. Low-cost, small-scale desktops employing multi-core chips allow wide-spread dedicated local access to parallel computation power, offering more research groups an opportunity to achieve interactive MESs. Massively-parallel, high-performance computing clusters can afford a level of parallelism never seen before, and present an opportunity to address the problem of computationally intensive MESs.

However, in order to fully leverage the benefits of parallel architectures, the traditional parallel systems' view has to be augmented. Existing parallel computing systems often treat each execution of the software as a black box, and are prevented from viewing an entire computational study as a single entity that must be optimized for.

This dissertation investigates how a parallel system can view MESs as an end-to-end system and leverage the application-specific properties of MESs to address its requirements. In particular, the system can 1) adapt its scheduling decisions to the overall goal of an MES to reduce the needed computation, 2) simultaneously aggregate results from, and disseminate user actions to, multiple executions of the software to enable simultaneous steering, 3) store reusable information across executions of the simulation software to reduce individual run-time, and 4) adapt its resource allocation policies to the MES's properties to improve resource utilization.

Using a test bed system called SimX and four example MESs across different disciplines, this dissertation shows that the application-aware MES-level approach can achieve multi-fold to multiple orders-of-magnitude improvements over the traditional simulation-level approach.
Ph.D. Thesis 2009 Ensuring Correctness of Compiled Code Zaks, Ganna Abstract | PDF

Title: Ensuring Correctness of Compiled Code

Candidate: Zaks, Ganna

Advisor(s): Pnueli, Amir

Abstract:

Traditionally, the verification effort is applied to the abstract algorithmic descriptions of the underlying software. However, even well understood protocols such as Peterson's protocol for mutual exclusion, whose algorithmic description takes only half a page, have published implementations that are erroneous. Furthermore, the semantics of the implementations can be altered by optimizing compilers, which are very large applications and, consequently, are bound to have bugs. Thus, it is highly desirable to ensure the correctness of the compiled code, especially in safety critical and high-assurance software. This dissertation describes two alternative approaches that bring us closer to solving the problem.

First, we present CoVaC - a deductive framework for proving program equivalence and its application to automatic verification of transformations performed by optimizing compilers. To leverage the existing program analysis techniques, we reduce the equivalence checking problem to analysis of one system - a cross-product of the two input programs. We show how the approach can be effectively used for checking equivalence of single-threaded programs that are structurally similar. Unlike the existing frameworks, our approach accommodates absence of compiler annotations and handles most of the classical intraprocedural optimizations such as constant folding, reassociation, common subexpression elimination, code motion, dead code elimination, branch optimizations, and others. In addition, we have developed rules for translation validation of interprocedural optimizations, which can be applied when compiler annotations are available.

The second contribution is the pancam framework for verifying multi-threaded C programs. Pancam first compiles a multithreaded C program into an optimized bytecode format. The framework relies on Spin, an existing explicit state model checker, to orchestrate the program's state space search. However, the program transitions and states are computed by the pancam bytecode interpreter. A feature of our approach is that pancam not only checks the actual implementation, but can also check the code after compiler optimizations. Pancam addresses the state space explosion problem by allowing users to define data abstraction functions and to constrain the number of allowed context switches. We also describe a partial order reduction method that reduces context switches using dynamic knowledge computed on-the-fly, while being sound for both safety and liveness properties.
Ph.D. Thesis 2008 Verification of Transactional Memories and Recursive Programs Cohen, Ariel Abstract | PDF

Title: Verification of Transactional Memories and Recursive Programs

Candidate: Cohen, Ariel

Advisor(s): Pnueli, Amir

Abstract:

Transactional memory is a programming abstraction intended to simplify the synchronization of conflicting concurrent memory accesses without the difficulties associated with locks. In the first part of this thesis we provide a framework and tools that allow one to formally verify that a transactional memory implementation satisfies its specification. First we show how to specify transactional memory in terms of admissible interchanges of transaction operations, and give proof rules for showing that an implementation satisfies its specification. We illustrate how to verify correctness, first using a model checker for bounded instantiations, and subsequently by using a theorem prover, thus eliminating all bounds. We provide a mechanical proof of the soundness of the verification method, as well as mechanical proofs for several implementations from the literature, including one that supports non-transactional memory accesses.

Procedural programs with unbounded recursion present a challenge to symbolic model-checkers since they ostensibly require the checker to model an unbounded call stack. In the second part of this thesis we present a method for model-checking safety and liveness properties over procedural programs. Our method proceeds by first augmenting a concrete procedural program with a well-founded ranking function, and then abstracting the augmented program by a finitary state abstraction. Using procedure summarization the procedural abstract program is then reduced to a finite-state system, which is model checked for the property.
Ph.D. Thesis 2008 Learning Long-Range Vision for an Offroad Robot Hadsell, Raia Abstract | PDF

Title: Learning Long-Range Vision for an Offroad Robot

Candidate: Hadsell, Raia

Advisor(s): LeCun, Yann

Abstract:

Teaching a robot to perceive and navigate in an unstructured natural world is a difficult task. Without learning, navigation systems are short-range and extremely limited. With learning, the robot can be taught to classify terrain at longer distances, but these classifiers can be fragile as well, leading to extremely conservative planning. A robust, high-level learning-based perception system for a mobile robot needs to continually learn and adapt as it explores new environments. To do this, a strong feature representation is necessary that can encode meaningful, discriminative patterns as well as invariance to irrelevant transformations. A simple realtime classifier can then be trained on those features to predict the traversability of the current terrain.

One such method for learning a feature representation is discussed in detail in this work. Dimensionality reduction by learning an invariant mapping (DrLIM) is a weakly supervised method for learning a similarity measure over a domain. Given a set of training samples and their pairwise relationships, which can be arbitrarily defined, DrLIM can be used to learn a function that is invariant to complex transformations of the inputs such as shape distortion and rotation.
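As a rough illustration of learning from pairwise relationships, the sketch below computes a contrastive loss of the kind generally associated with DrLIM: similar pairs are pulled together in the output space, and dissimilar pairs are pushed apart until they are at least a margin away. It is not taken from the thesis; the function names, the margin value, and the toy vectors are assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(za, zb, dissimilar, margin=1.0):
    """Contrastive loss for one pair of mapped points (hypothetical helper).

    za, zb     -- outputs of the learned mapping for the two samples
    dissimilar -- True if the pair is labeled dissimilar, False if similar
    margin     -- radius beyond which dissimilar pairs stop contributing
    """
    d = euclidean(za, zb)
    if dissimilar:
        # Dissimilar pairs are penalized only while closer than the margin.
        return 0.5 * max(0.0, margin - d) ** 2
    # Similar pairs are penalized by their squared distance.
    return 0.5 * d ** 2


# Toy usage: a similar pair far apart is costly; a dissimilar pair
# already beyond the margin costs nothing.
similar_cost = contrastive_loss([0.0, 0.0], [3.0, 4.0], dissimilar=False)
dissimilar_cost = contrastive_loss([0.0, 0.0], [3.0, 4.0], dissimilar=True)
```

In a full system this loss would be summed over all labeled pairs and minimized with respect to the parameters of the mapping; the sketch only evaluates it for fixed outputs.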

The main contribution of this work is a self-supervised learning process for long-range vision that is able to accurately classify complex terrain, permitting improved strategic planning. As a mobile robot moves through offroad environments, it learns traversability from a stereo obstacle detector. The learning architecture is composed of a static feature extractor, trained offline for a general yet discriminative feature representation, and an adaptive online classifier. This architecture reduces the effect of concept drift by allowing the online classifier to quickly adapt to very few training samples without overtraining. After experiments with several different learned feature extractors, we conclude that unsupervised or weakly supervised learning methods are necessary for training general feature representations for natural scenes.

The process was developed and tested on the LAGR mobile robot as part of a fully autonomous vision-based navigation system.
Ph.D. Thesis 2008 Synthesizing Executable Programs from Requirements Plock, Cory Abstract | PDF

Title: Synthesizing Executable Programs from Requirements

Candidate: Plock, Cory

Advisor(s): Goldberg, Benjamin

Abstract:

Automatic generation of correct software from requirements has long been a "holy grail" for system and software development. According to this vision, instead of implementing a system and then working hard to apply testing and verification methods to prove system correctness, a system is rather built correctly by construction. This problem, referred to as synthesis, is undecidable in the general case. However, by restricting the domain to decidable subsets, it is possible to bring this vision one step closer to reality.

The focus of our study is reactive systems, or non-terminating programs that continuously receive input from an external environment and produce output responses. Reactive systems are often safety critical and include applications such as anti-lock braking systems, auto-pilots, and pacemakers. One of the challenges of reactive system design is ensuring that the software meets the requirements under the assumption of unpredictable environment input. The behavior of many of these systems can be expressed as regular languages over infinite strings, a domain in which synthesis has yielded successful results.

We present a method for synthesizing executable reactive systems from formal requirements. The object-oriented requirements language of Live Sequence Charts (LSCs) is considered. We begin by establishing a mapping between various subsets of the language and finite-state formal models. We also consider LSCs which can express time constraints over a dense-time domain. From one of these models, we show how to formulate a winning strategy that is guaranteed to satisfy the requirements, provided one exists. The strategy is realized in the form of a controller which guides the system in choosing only non-violating behaviors. We describe an implementation of this work as an extension of an existing tool called the Play-Engine.
Ph.D. Thesis 2008 Theory and Algorithms for Modern Machine Learning Problems and an Analysis of Markets Rastogi, Ashish Abstract | PDF

Title: Theory and Algorithms for Modern Machine Learning Problems and an Analysis of Markets

Candidate: Rastogi, Ashish

Advisor(s): Cole, Richard; Mohri, Mehryar

Abstract:

The unprecedented growth of the Internet over the past decade and of data collection, more generally, has given rise to vast quantities of digital information, ranging from web documents, images, and genomic databases to a vast array of business customer information. Consequently, it is of growing importance to develop tools and models that enable us to better understand this data and to design data-driven algorithms that leverage this information. This thesis provides several fundamental theoretical and algorithmic results for tackling such problems with applications to speech recognition, image processing, natural language processing, computational biology and web-based algorithms.

Probabilistic automata provide an efficient and compact way to model sequence-oriented data such as speech or web documents. Measuring the similarity of such automata provides a way of comparing the objects they model, and is an essential first step in organizing this type of data. We present algorithmic and hardness results for computing various discrepancies (or dissimilarities) between probabilistic automata, including the relative entropy and the Lp distance; we also give an efficient algorithm to determine if two probabilistic automata are equivalent. In addition, we study the complexity of computing the norms of probabilistic automata.

Organizing and querying large amounts of digitized data such as images and videos is a challenging task because little or no label information is available. This motivates transduction, a setting in which the learning algorithm can leverage unlabeled data during training to improve performance. We present novel error bounds for a family of transductive regression algorithms and validate their usefulness through experiments.

Widespread success of search engines and information retrieval systems has led to large-scale collection of rating information which is being used to provide personalized rankings. We examine an alternate formulation of the ranking problem for search engines motivated by the requirement that in addition to accurately predicting pairwise ordering, ranking systems must also preserve the magnitude of the preferences or the difference between ratings. We present algorithms with sound theoretical properties, and verify their efficacy through experiments.

Finally, price discovery in a market setting can be viewed as an (ongoing) learning problem. Specifically, the problem is to find and maintain a set of prices that balance supply and demand, a core topic in economics. This appears to involve complex implicit and possibly large-scale information transfers. We show that finding equilibrium prices, even approximately, in discrete markets is NP-hard and complement the hardness result with a matching polynomial time approximation algorithm. We also give a new way of measuring the quality of an approximation to equilibrium prices that is based on a natural aggregation of the dissatisfaction of individual market participants.
Ph.D. Thesis 2008 Geometric Modeling with High Order Derivatives Tosun, Elif Abstract | PDF

Title: Geometric Modeling with High Order Derivatives

Candidate: Tosun, Elif

Advisor(s): Zorin, Denis

Abstract:

Modeling of high quality surfaces is the core of geometric modeling. Such models are used in many computer-aided design and computer graphics applications. Irregular behavior of higher-order differential parameters of the surface (e.g. curvature variation) may lead to aesthetic or physical imperfections. In this work, we consider approaches to constructing surfaces with a high degree of smoothness.

One direction is based on a manifold-based surface definition which ensures well-defined high-order derivatives that can be explicitly computed at any point. We extend a previously proposed manifold-based construction to surfaces with piecewise-smooth boundary and explore trade-offs in some elements of the construction. We show that growth of derivative magnitudes with order is a general property of constructions with locally supported basis functions, derive a lower bound for derivative growth, and numerically study the flexibility of the resulting surfaces at arbitrary points.

An alternative direction to using high-order surfaces is to define an approximation to high-order quantities for meshes, with high-order surface implicit. These approximations do not necessarily converge point-wise, but can nevertheless be successfully used to solve surface optimization problems. Even though fourth-order problems are commonly solved to obtain high quality surfaces, in many cases, these formulations may lead to reflection-line and curvature discontinuities. We consider two approaches to further increasing control over surface properties.

The first approach is to consider data-dependent functionals leading to fourth-order problems but with explicit control over desired surface properties. Our fourth-order functionals are based on reflection line behavior. Reflection lines are commonly used for surface interrogation and high-quality reflection line patterns are well-correlated with high-quality surface appearance. We demonstrate how these can be discretized and optimized accurately and efficiently on general meshes.

A more direct approach is to consider a poly-harmonic function on a mesh, such as the fourth-order biharmonic or the sixth-order triharmonic. The biharmonic and the triharmonic equations can be thought of as a linearization of curvature and curvature variation Euler-Lagrange equations respectively. We present a novel discretization for both problems based on the mixed finite element framework and a regularization technique for solving the resulting, highly ill-conditioned systems of equations. We show that this method, compared to more ad-hoc discretizations, has a higher degree of mesh independence and yields surfaces of better quality.
Ph.D. Thesis 2007 Scaling Data Servers via Cooperative Caching Annapureddy, Siddhartha Abstract | PDF

Title: Scaling Data Servers via Cooperative Caching

Candidate: Annapureddy, Siddhartha

Advisor(s): Mazieres, David

Abstract:

In this thesis, we present design techniques -- and systems that illustrate and validate these techniques -- for building data-intensive applications over the Internet. We enable the use of a traditional bandwidth-limited server in these applications. A large number of cooperating users contribute resources such as disk space and network bandwidth, and form the backbone of such applications. The applications we consider fall into one of two categories. The first type provides user-perceived utility in proportion to the data download rates of the participants; bulk data distribution systems are a typical example. The second type is usable only when the participants have data download rates above a certain threshold; video streaming is a prime example.

We built Shark, a distributed file system, to address the first type of applications. It is designed for large-scale, wide-area deployment, while also providing a drop-in replacement for local-area file systems. Shark introduces a novel locality-aware cooperative-caching mechanism, in which clients exploit each other's file caches to reduce load on an origin file server. Shark also enables sharing of data even when it originates from different servers. In addition, Shark clients are mutually distrustful in order to operate in the wide-area. Performance results show that Shark greatly reduces server load and reduces client-perceived latency for read-heavy workloads both in the wide and local areas.

We built RedCarpet, a near-Video-on-Demand (nVoD) system, to address the second type of applications. nVoD allows a user to watch a video starting at any point after waiting for a small setup time. RedCarpet uses a mesh-based peer-to-peer (P2P) system to provide the nVoD service. In this context, we study the problem of scheduling the dissemination of chunks that constitute a video. We show that providing nVoD is feasible with a combination of techniques that include network coding, avoiding resource starvation for different chunks, and overlay topology management algorithms. Our evaluation, using a simulator as well as a prototype, shows that systems that do not optimize in all these dimensions could deliver significantly worse nVoD performance.
Ph.D. Thesis 2007 Shape Analysis by Abstraction, Augmentation, and Transformation Balaban, Ittai Abstract | PDF

Title: Shape Analysis by Abstraction, Augmentation, and Transformation

Candidate: Balaban, Ittai

Advisor(s): Pnueli, Amir; Zuck, Lenore

Abstract:

The goal of shape analysis is to analyze properties of programs that perform destructive updates of linked structures (heaps). This thesis presents an approach for shape analysis based on program augmentation (instrumentation), predicate abstraction, and model checking, that allows for verification of safety and liveness properties (which, for sequential programs, usually corresponds to program invariance and termination).

One of the difficulties in abstracting heap-manipulating programs is devising a decision procedure for a sufficiently expressive logic of graph properties. Since graph reachability (expressible by transitive closure) is not a first order property, the challenge is in showing that a decision procedure exists for a rich enough subset of first order logic with transitive closure.

Predicate abstraction is in general too weak to verify liveness properties. Thus an additional issue dealt with is how to perform abstraction while retaining enough information. The method presented here is domain-neutral, and applies to concurrent programs as well as sequential ones.
Ph.D. Thesis 2007 Democratizing Content Distribution Freedman, Michael Abstract | PDF

Title: Democratizing Content Distribution

Candidate: Freedman, Michael

Advisor(s): Mazieres, David

Abstract:

In order to reach their large audiences, today's Internet publishers primarily use content distribution networks (CDNs) to deliver content. Yet the architectures of the prevalent commercial systems are tightly bound to centralized control, static deployments, and trusted infrastructure, inherently limiting their scope and scale to ensure cost recovery.

To move beyond such shortcomings, this thesis contributes a number of techniques that realize cooperative content distribution. By federating large numbers of unreliable or untrusted hosts, we can satisfy the demand for content by leveraging all available resources. We propose novel algorithms and architectures for three central mechanisms of CDNs: content discovery (where are nearby copies of the client's desired resource?), server selection (which node should a client use?), and secure content transmission (how should a client download content efficiently and securely from its multiple potential sources?).

These mechanisms have been implemented, deployed, and tested in production systems that have provided open content distribution services for more than three years. Every day, these systems answer tens of millions of client requests, serving terabytes of data to more than a million people.

This thesis presents five systems related to content distribution. First, Coral provides a distributed key-value index that enables content lookups to occur efficiently and returns references to nearby cached objects whenever possible, while still preventing any load imbalances from forming. Second, CoralCDN demonstrates how to construct a self-organizing CDN for web content out of unreliable nodes, providing robust behavior in the face of failures. Third, OASIS provides a general-purpose, flexible anycast infrastructure, with which clients can locate nearby or unloaded instances of participating distributed systems. Fourth, as a more clean-slate design that can leverage untrusted participants, Shark offers a distributed file system that supports secure block-based file discovery and distribution. Finally, our authentication code protocol enables the integrity verification of large files on-the-fly when using erasure codes for efficient data dissemination.

Taken together, this thesis provides a novel set of tools for building highly-scalable, efficient, and secure content distribution systems. By enabling the automated replication of data based on its popularity, we can make desired content available and accessible to everybody and, in effect, democratize content distribution.
Ph.D. Thesis 2007 Joint Inference for Information Extraction and Translation Ji, Heng Abstract | PDF

Title: Joint Inference for Information Extraction and Translation

Candidate: Ji, Heng

Advisor(s): Grishman, Ralph

Abstract:

The traditional natural language processing pipeline incorporates multiple stages of linguistic analysis. Although errors are typically compounded through the pipeline, it is possible to reduce the errors in one stage by harnessing the results of the other stages.

This thesis presents a new framework based on component interactions to approach this goal. The new framework applies all stages in a suitable order, with each stage generating multiple hypotheses and propagating them through the whole pipeline. Then the feedback from subsequent stages is used to enhance the target stage by re-ranking these hypotheses, and then produce the best analysis.

The effectiveness of this framework has been demonstrated by substantially improving the performance of Chinese and English entity extraction and Chinese-to-English entity translation. The inference knowledge includes mono-lingual interactions among information extraction stages such as name tagging, coreference resolution, relation extraction and event extraction, as well as cross-lingual interaction between information extraction and machine translation.

Such symbiosis of analysis components allows us to incorporate information from a much wider context, spanning the entire document and even going across documents, and utilize deeper semantic analysis; it will therefore be essential for the creation of a high-performance NLP pipeline.
Ph.D. Thesis 2007 Authentication Mechanisms for Open Distributed Systems Nicolosi, Antonio Abstract | PDF

Title: Authentication Mechanisms for Open Distributed Systems

Candidate: Nicolosi, Antonio

Advisor(s): Mazieres, David; Shoup, Victor

Abstract:

While authentication within organizations is a well-understood problem, traditional solutions are often inadequate at the scale of the Internet, where the lack of a central authority, the open nature of the systems, and issues such as privacy and anonymity create new challenges. For example, users typically establish dozens of web accounts with independently administered services under a single password, which increases the likelihood of exposure of their credentials; users wish to receive email from anyone who is not a spammer, but the openness of the email infrastructure makes it hard to authenticate legitimate senders; users may have a rightful expectation of privacy when viewing widely-accessed protected resources such as premium website content, yet they are commonly required to present identifying login credentials, which permits tracking of their access patterns.

This dissertation describes enhanced authentication mechanisms to tackle the challenges of each of the above settings. Specifically, the dissertation develops: 1) a remote authentication architecture that lets users recover easily in case of password compromise; 2) a social network-based email system in which users can authenticate themselves as trusted senders without disclosing all their social contacts; and 3) a group access-control scheme where requests can be monitored while affording a degree of anonymity to the group member performing the request.

The proposed constructions combine system designs and novel cryptographic techniques to address their respective security and privacy requirements both effectively and efficiently.
Ph.D. Thesis 2007 New Design Criteria for Hash Functions and Block Ciphers Puniya, Prashant Abstract | PDF

Title: New Design Criteria for Hash Functions and Block Ciphers

Candidate: Puniya, Prashant

Advisor(s): Dodis, Yevgeniy

Abstract:

Cryptographic primitives, such as hash functions and block ciphers, are integral components in several practical cryptographic schemes. In order to prove security of these schemes, a variety of security assumptions are made on the underlying hash function or block cipher, such as collision-resistance, pseudorandomness etc. In fact, such assumptions are often made without much regard for the actual constructions of these primitives. In this thesis, we address this problem and suggest new, and possibly better, design criteria for hash functions and block ciphers.

We start by analyzing the design criteria underlying hash functions. The usual design principle here involves a two-step procedure: First, come up with a heuristically-designed and "hopefully strong" fixed-length input construction (i.e. the compression function), then use a standard domain extension technique, usually the cascade construction, to get a construction that works for variable-length inputs. We investigate this design principle from two perspectives:
- To instantiate the Random Oracle. We suggest modifications to existing constructions that make the resulting construction secure as a random oracle, with appropriate assumptions on the underlying compression function.
- In general, we look for "black-box" fixes to existing hash functions to get secure constructions for each of the common security notions required of hash functions. We also give suggestions for appropriate modes for using existing hash functions along these lines.
We next move on to discuss the Feistel network, which is used in the design of several popular block ciphers such as DES, Triple-DES, Blowfish etc. Currently, the celebrated result of Luby-Rackoff (and further extensions) is regarded as the theoretical basis for using this construction in block cipher design, where it was shown that a four-round Feistel network is a (strong) pseudorandom permutation (PRP) if the round functions are independent pseudorandom functions (PRFs). We study the Feistel network from two different perspectives:
- Is there a weaker security notion for round functions, than pseudorandomness, that suffices to prove security of the Feistel network?
- Can the Feistel network satisfy a much stronger security notion, i.e. security as an ideal cipher, under appropriate assumptions on the round functions?
We give a positive answer to the first question and a partial positive answer to the second question. In the process, we undertake a combinatorial study of the Feistel network, which might be useful in other scenarios as well. We provide several practical applications of our results for the Feistel network.
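For readers unfamiliar with the construction, the following is a minimal sketch of a four-round Feistel network, included only to show why the structure is invertible regardless of the round function. It is not code from the thesis: the use of truncated HMAC-SHA256 as a stand-in round function, the key values, and all function names are assumptions made for the example, and the sketch is not meant as a secure cipher.

```python
import hashlib
import hmac


def round_function(key: bytes, half: bytes) -> bytes:
    """Stand-in keyed round function (assumption): HMAC-SHA256, truncated."""
    return hmac.new(key, half, hashlib.sha256).digest()[:len(half)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(block: bytes):
    """Split a block into equal halves; each half must fit one digest."""
    if len(block) % 2 != 0 or len(block) // 2 > 32:
        raise ValueError("block must have even length of at most 64 bytes")
    half = len(block) // 2
    return block[:half], block[half:]


def feistel_encrypt(keys, block: bytes) -> bytes:
    """Apply one Feistel round per key: (L, R) -> (R, L xor F(R))."""
    left, right = split(block)
    for key in keys:
        left, right = right, xor_bytes(left, round_function(key, right))
    return left + right


def feistel_decrypt(keys, block: bytes) -> bytes:
    """Undo the rounds in reverse order: (L', R') -> (R' xor F(L'), L')."""
    left, right = split(block)
    for key in reversed(keys):
        left, right = xor_bytes(right, round_function(key, left)), left
    return left + right


# Four independent round keys, matching the four-round setting.
round_keys = [b"round-key-1", b"round-key-2", b"round-key-3", b"round-key-4"]
plaintext = b"0123456789abcdef"
ciphertext = feistel_encrypt(round_keys, plaintext)
recovered = feistel_decrypt(round_keys, ciphertext)
```

Decryption never inverts the round function itself; it only recomputes it on the half that passed through unchanged, which is why the round functions can be arbitrary keyed functions.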
Ph.D. Thesis 2007 Being Lazy and Preemptive at Learning toward Information Extraction Shinyama, Yusuke Abstract | PDF

Title: Being Lazy and Preemptive at Learning toward Information Extraction

Candidate: Shinyama, Yusuke

Advisor(s): Sekine, Satoshi

Abstract:

This thesis proposes a novel approach for exploring Information Extraction scenarios. Information Extraction, or IE, is a task aiming at finding events and relations in natural language texts that meet a user's demand. However, it is often difficult to formulate, or even define such events that satisfy both a user's need and technical feasibility. Furthermore, most existing IE systems need to be tuned for a new scenario with proper training data in advance. So a system designer usually needs to understand what a user wants to know in order to maximize the system performance, while the user has to understand how the system will perform in order to maximize his/her satisfaction.

In this thesis, we focus on maximizing the variety of scenarios that the system can handle instead of trying to improve the accuracy of a particular scenario. In traditional IE systems, a relation is defined a priori by a user and is identified by a set of patterns that are manually crafted or acquired in advance. We propose a technique called Unrestricted Relation Discovery, which defers determining what is a relation and what is not until the very end of the processing so that a relation can be defined a posteriori. This laziness gives huge flexibility to the types of relations the system can handle. Furthermore, we use the notion of recurrent relations to measure how useful each relation is. This way, we can discover new IE scenarios without fully specifying definitions or patterns, which leads to Preemptive Information Extraction, where a system can provide a user a portfolio of extractable relations and let the user choose them.

We used one year of news articles obtained from the Web as a development set. We discovered dozens of scenarios that are similar to the existing scenarios tried by many IE systems, as well as new scenarios that are relatively novel. We have evaluated the existing scenarios with the Automatic Content Extraction (ACE) event corpus and obtained reasonable performance. We believe this system will shed new light on IE research by giving various experimental IE scenarios.
Ph.D. Thesis 2007 Constituent Parsing by Classification Turian, Joseph Abstract | PDF

Title: Constituent Parsing by Classification

Candidate: Turian, Joseph

Advisor(s): Melamed, I. Dan

Abstract:

We present an approach to constituent parsing, which is driven by classifiers induced to minimize a single regularized objective. It is the first discriminatively-trained constituent parser to surpass the Collins (2003) parser without using a generative model. Our primary contribution is simplifying the human effort required for feature engineering. Our model can incorporate arbitrary features of the input and parse state. Feature selection and feature construction occur automatically, as part of learning. We define a set of fine-grained atomic features, and let the learner induce informative compound features. Our learning approach includes several novel approximations and optimizations which improve the efficiency of discriminative training. We introduce greedy completion, a new agenda-driven search strategy designed to find low-cost solutions given a limit on search effort. The inference evaluation function was learned accurately enough to guide the deterministic parsers to the optimal parse reasonably quickly without pruning, and thus without search errors. Experiments demonstrate the flexibility of our approach, which has also been applied to machine translation (Wellington et al., AMTA 2006; Turian et al., NIPS 2007).
Ph.D. Thesis 2007 Enhanced Security Models for Network Protocols Walfish, Shabsi Abstract | PDF

Title: Enhanced Security Models for Network Protocols

Candidate: Walfish, Shabsi

Advisor(s): Dodis, Yevgeniy

Abstract:

Modeling security for protocols running in the complex network environment of the Internet can be a daunting task. Ideally, a security model for the Internet should provide the following guarantee: a protocol that "securely" implements a particular task specification will retain all the same security properties as the specification itself, even when an arbitrary set of protocols runs concurrently on the same network. This guarantee must hold even when other protocols are maliciously designed to interact badly with the analyzed protocol, and even when the analyzed protocol is composed with other protocols. The popular Universal Composability (UC) security framework aims to provide this guarantee.

Unfortunately, such strong security guarantees come with a price: they are impossible to achieve without the use of some trusted setup. Typically, this trusted setup is global in nature, and takes the form of a Public Key Infrastructure (PKI) and/or a Common Reference String (CRS). However, the current approach to modeling security in the presence of such setups falls short of providing expected security guarantees. A quintessential example of this phenomenon is the deniability concern: there exist natural protocols that meet the strongest known security notions (including UC) while failing to provide the same deniability guarantees that their task specifications imply they should provide.

We introduce the Generalized Universal Composability (GUC) framework to extend the UC security notion and enable the re-establishment of its original intuitive security guarantees even for protocols that use global trusted setups. In particular, GUC enables us to guarantee that secure protocols will provide the same level of deniability as the task specification they implement. To demonstrate the usefulness of the GUC framework, we first apply it to the analysis and construction of deniable authentication protocols. Building upon such deniable authentication protocols, we then prove a general feasibility result showing how to construct protocols satisfying our security notion for a large class of two-party and multi-party tasks (assuming the availability of some reasonable trusted setup). Finally, we highlight the practical applicability of GUC by constructing efficient protocols that securely instantiate two common cryptographic tasks: commitments and zero-knowledge proofs.
Ph.D. Thesis 2007 Tree-Structured Models of Multitext: Theory, Design and Experiments Wellington, Benjamin Abstract | PDF

Title: Tree-Structured Models of Multitext: Theory, Design and Experiments

Candidate: Wellington, Benjamin

Advisor(s): Melamed, I. Dan

Abstract:

Statistical machine translation (SMT) systems use empirical models to simulate the act of human translation between language pairs. This dissertation surveys the ability of currently popular syntax-aware SMT systems to model real-world multitext, and shows different types of linguistic phenomena occurring in natural language translation that these popular systems cannot capture. It then proposes a new grammar formalism, Generalized Multitext Grammar (GMTG), and a generalization of Chomsky Normal Form, that allows us to build an efficient SMT system using previously developed parsing techniques. The dissertation addresses many software engineering issues that arise when doing syntax-based SMT using large corpora and lays out an object-oriented design for a translation toolkit. Using the toolkit, we show that a tree-transduction based SMT system, which uses modern machine learning algorithms, outperforms a generative baseline.
Ph.D. Thesis 2007 Formal Verification Using Static and Dynamic Analyses Zaks, Aleksandr Abstract | PDF

Title: Formal Verification Using Static and Dynamic Analyses

Candidate: Zaks, Aleksandr

Advisor(s): Pnueli, Amir

Abstract:

One of the main challenges of formal verification is the ability to handle systems of realistic size, a difficulty that is especially pronounced in the context of software verification. In this dissertation, we suggest two related approaches that, while both relying on formal-method techniques, can still be applied to larger practical systems. The scalability is mainly achieved by restricting the types of properties we consider and the guarantees that are given.

Our first approach is a novel run-time monitoring framework. Unlike previous work on this topic, we expect the properties to be specified using Property Specification Language (PSL). PSL is a newly adopted IEEE P1850 standard and is an extension of Linear Temporal Logic (LTL). The new features include regular expressions and finite trace semantics, which make the new logic very attractive for run-time monitoring of both software and hardware designs. To facilitate the new logic we have extended the existing algorithm for LTL tester construction to cover the PSL specific operators. Another novelty of our approach is the ability to use partial information about the program that is being monitored while the existing tools only use the information about the observed trace and the property under consideration. This allows going beyond the focus of traditional run-time monitoring tools -- error detection in the execution trace, towards the focus of static analysis -- bug detection in programs.
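
As a rough illustration of what checking a temporal property over a finite trace involves, the following minimal sketch evaluates a few LTL-style operators on a recorded trace. It is not PSL and not the tester construction described above: the representation of states as dictionaries, every function name, and the choice of a strong reading (obligations must be met within the observed trace) are assumptions made for this example only.

```python
from typing import Callable, Dict, Sequence

State = Dict[str, bool]                      # one observed step (hypothetical encoding)
Formula = Callable[[Sequence[State], int], bool]


def atom(name: str) -> Formula:
    """True at position i if the named proposition holds in that state."""
    return lambda trace, i: bool(trace[i].get(name, False))


def implies(f: Formula, g: Formula) -> Formula:
    return lambda trace, i: (not f(trace, i)) or g(trace, i)


def strong_next(f: Formula) -> Formula:
    """A next state must exist in the trace and satisfy f."""
    return lambda trace, i: i + 1 < len(trace) and f(trace, i + 1)


def always(f: Formula) -> Formula:
    """f holds at every remaining position of the finite trace."""
    return lambda trace, i: all(f(trace, j) for j in range(i, len(trace)))


def eventually(f: Formula) -> Formula:
    """f holds at some remaining position of the finite trace."""
    return lambda trace, i: any(f(trace, j) for j in range(i, len(trace)))


def until(f: Formula, g: Formula) -> Formula:
    """Strong until: g must occur, and f must hold at every step before it."""
    def check(trace: Sequence[State], i: int) -> bool:
        for j in range(i, len(trace)):
            if g(trace, j):
                return True
            if not f(trace, j):
                return False
        return False                         # trace ended before g was seen
    return check


# Hypothetical property: every request is eventually acknowledged.
every_req_acked = always(implies(atom("req"), eventually(atom("ack"))))

good_trace = [{"req": True}, {}, {"ack": True}]
bad_trace = [{"req": True}, {"ack": True}, {"req": True}]
print(every_req_acked(good_trace, 0), every_req_acked(bad_trace, 0))
```

This evaluator re-scans the trace for each operator and sees only the observed trace; an online monitor would process states incrementally, and the approach described above additionally uses partial information about the program.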

In our second approach, we employ static analysis to compute SAT-based function summaries to detect invalid pointer accesses. To compute function summaries, we propose new techniques for improving the precision and performance in order to reduce the false error rates. In particular, we use BDDs to represent a symbolic simulation of functions, where BDDs allow an efficient representation of path-sensitive information and high level simplification. In addition, we use a lightweight range analysis technique for determining lower and upper bounds for program variables, which can further offload work from the SAT solver. Note that while in our current implementation the analysis happens at compile time, we can also use the function summaries as a basis for run-time monitoring.
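
The role of a range analysis can be shown with a small interval-arithmetic sketch: if the computed bounds of an index already prove an access in range, no solver query is needed for it. The `Interval` class, the `index_is_safe` helper, and the example expression are hypothetical and are not taken from the thesis's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed integer interval [lo, hi] bounding a program variable."""
    lo: int
    hi: int

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Interval") -> "Interval":
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def join(self, other: "Interval") -> "Interval":
        """Smallest interval covering both, used where control-flow paths merge."""
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))


def index_is_safe(index: Interval, length: int) -> bool:
    """True only if every value in the interval is a valid index for an array of this length."""
    return 0 <= index.lo and index.hi < length


# Hypothetical fragment: i ranges over 0..3 and the program accesses buf[2*i + 1].
i = Interval(0, 3)
idx = i * Interval(2, 2) + Interval(1, 1)
print(idx, index_is_safe(idx, 8), index_is_safe(idx, 7))
```

When `index_is_safe` returns False the access is not necessarily invalid; the bounds are simply too coarse to decide, and in a design like the one described the question would fall through to the more precise SAT-based check.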
Ph.D. Thesis 2006 Guaranteed Precision for Transcendental and Algebraic Computation Made Easy Du, Zilin Abstract | PDF

Title: Guaranteed Precision for Transcendental and Algebraic Computation Made Easy

Candidate: Du, Zilin

Advisor(s): Yap, Chee

Abstract:

Numerical non-robustness is a well-known phenomenon when implementing geometric algorithms. A general approach to achieve geometric robustness is Exact Geometric Computation (EGC). This dissertation explores the redesign and extension of Core Library, a C++ library which embraces the EGC approach. The contributions of this thesis are organized into three parts.

In the first part, we discuss the redesign of Core Library, especially the expression "Expr" and bigfloat "BigFloat" classes. Our new design emphasizes extensibility in a clean and modular way. The three facilities in "Expr", filter, root bound and bigfloat, are separated into independent modules. This allows new filters, root bounds and bigfloat substitutes to be plugged in. The key approximate evaluation and precision propagation algorithms have been greatly improved. A new bigfloat system based on MPFR and interval arithmetic has been incorporated. Our benchmark shows that the redesigned Core Library typically achieves a 5-10 times speedup. We also provide tools to facilitate extensions of "Expr" to incorporate new types of nodes, especially transcendental nodes.

Although the Core Library was originally designed for algebraic applications, transcendental functions are needed in many applications. In the second part, we present a complete algorithm for absolute approximation of the general hypergeometric functions. Its complexity is also given. The extension of this algorithm to "blackbox number" is provided. A general hypergeometric function package based on our algorithm is implemented and integrated into the Core Library based on our new design.

Brent has shown that many elementary functions, such as $\exp, \log, \sin$, etc., can be efficiently computed using the Arithmetic-Geometric Mean (AGM) based algorithm. However, he only gave an asymptotic error analysis. The constants in the Big $O(\cdot)$ notation required for implementation are unknown. We provide a non-asymptotic error analysis of the AGM algorithm and the related algorithms for logarithm and exponential functions. These algorithms have been implemented and incorporated into the Core Library.
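
To make the AGM idea concrete, here is a minimal double-precision sketch of a logarithm computed from the arithmetic-geometric mean. It relies on the classical asymptotic relation $\ln s \approx \pi / (2\,\mathrm{AGM}(1, 4/s))$ for large $s$; it is not the Core Library code and carries none of the non-asymptotic error bounds described above. The function names, the fixed `shift` of 30 bits, and the restriction to inputs that are not very small are assumptions of this example.

```python
import math


def agm(a: float, b: float, max_iter: int = 60) -> float:
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(max_iter):
        if abs(a - b) <= 1e-15 * abs(a):
            break
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a


def _log_large(s: float) -> float:
    """Approximates ln(s) as pi / (2 * AGM(1, 4/s)); only accurate for large s."""
    return math.pi / (2.0 * agm(1.0, 4.0 / s))


def agm_log(x: float, shift: int = 30) -> float:
    """ln(x) for x > 0, assuming x * 2**shift is large (roughly x >= 1)."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    scale = 2.0 ** shift
    # ln(x) = ln(x * 2^shift) - ln(2^shift); both terms use the large-argument formula.
    return _log_large(x * scale) - _log_large(scale)


print(agm_log(10.0), math.log(10.0))
```

The sketch simply picks a shift that is ample for double precision. Turning this into a guaranteed-precision routine requires knowing how the error depends on the shift and the number of iterations, which is the kind of explicit constant the non-asymptotic analysis supplies.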
Ph.D. Thesis 2006 On Cryptographic Techniques for Digital Rights Management Fazio, Nelly Abstract | PDF

Title: On Cryptographic Techniques for Digital Rights Management

Candidate: Fazio, Nelly

Advisor(s): Dodis, Yevgeniy

Abstract:

With more and more content being produced, distributed, and ultimately rendered and consumed in digital form, devising effective Content Protection mechanisms and building satisfactory Digital Rights Management (DRM) systems have become top priorities for the Publishing and Entertainment Industries.

To help tackle this challenge, several cryptographic primitives and constructions have been proposed, including mechanisms to securely distribute data over a unidirectional insecure channel (Broadcast Encryption), schemes in which leakage of cryptographic keys can be traced back to the leaker (Traitor Tracing), and techniques to combine revocation and tracing capabilities (Trace-and-Revoke schemes).

In this thesis, we present several original constructions of the above primitives, which improve upon existing DRM-enabling cryptographic primitives along the following two directions:
1. Widening their scope of applicability, e.g., by considering models that take into account usability issues typical of the DRM setting; and
2. Strengthening their security guarantees to higher levels that are standard, for example, in the case of stand-alone encryption.
Our results along the first line of work include the following:
- An efficient public-key broadcast encryption scheme, which allows mutually mistrusting content providers to leverage a common delivery infrastructure, and can cope with low-end, stateless receivers;
- A traitor tracing scheme with optimal transmission rate, in which encryption does not cause a blow-up in the size of the content, thus allowing for optimal utilization of the broadcast channel;
- A public-key tracing and revoking scheme that can deal with both server-side and client-side scalability issues, while preserving traceability.
As for the second direction, our contribution can be divided as follows:
- A forward-secure public-key broadcast encryption scheme, in which the unauthorized access resulting from cracking a user-key is constrained to a minimal time frame which is delimited, in the future, by the revocation mechanism, and in the past, by forward secrecy;
- A precise formalization of the notion of adaptive chosen-ciphertext security for public-key broadcast encryption schemes, along with a modular and efficient construction.
Overall, the cryptographic tools developed in this thesis provide more flexibility and more security than existing solutions, and thus offer a better match for the challenges of the DRM setting.
Ph.D. Thesis 2006 Finding Your Match: Techniques for Improving Sequence Alignment in DNA and RNA Gill, Ofer Hirsch Abstract | PDF

Title: Finding Your Match: Techniques for Improving Sequence Alignment in DNA and RNA

Candidate: Gill, Ofer Hirsch

Advisor(s): Mishra, Bud

Abstract:

In Bioinformatics, finding correlations between species allows us to better understand the important biological functions of those species and trace their evolution. This thesis considers sequence alignment, a method for obtaining these correlations. We improve upon sequence alignment tools designed for DNA with Plains, an algorithm that uses piecewise-linear gap functions and parameter optimization to obtain correlations in remotely related species pairs such as human and fugu using reasonable amounts of memory and space on an ordinary computer. We then discuss Planar, which is similar to Plains, but is designed for aligning RNA, and accounts for secondary structure. We also explore SEPA, a tool that uses p-value estimation based on exhaustive empirical data to better emphasize key results from an alignment with a measure of reliability. Using SEPA to measure the quality of an alignment, we proceed to compare Plains and Planar against similar alignment tools, emphasizing the interesting correlations caught in the process.
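
For readers unfamiliar with the term, a piecewise-linear gap function charges a per-position cost that changes at chosen breakpoints, so that long gaps can be penalized more gently than an affine penalty would allow. The sketch below shows one way to evaluate such a function. The opening cost, breakpoints, and slopes are invented for illustration and are not the parameters Plains uses or learns.

```python
from typing import List, Optional, Tuple

# Each piece is (end, slope): the slope applies to gap positions up to `end`;
# an end of None means the piece extends indefinitely.
Pieces = List[Tuple[Optional[int], float]]


def gap_penalty(length: int, open_cost: float, pieces: Pieces) -> float:
    """Penalty for a gap of the given length under a piecewise-linear function."""
    if length <= 0:
        return 0.0
    penalty = open_cost
    start = 0
    for end, slope in pieces:
        upper = length if end is None else min(length, end)
        if upper > start:
            penalty += slope * (upper - start)
        if end is None or length <= end:
            break
        start = end
    return penalty


# Hypothetical parameters: steep for short gaps, flatter for long ones.
example_pieces: Pieces = [(10, 2.0), (50, 1.0), (None, 0.25)]
for n in (5, 10, 30, 100):
    print(n, gap_penalty(n, 5.0, example_pieces))
```

With decreasing slopes, as here, the penalty grows more slowly as a gap gets longer, which is one reason such functions are attractive for remotely related sequences containing long insertions or deletions.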
Ph.D. Thesis 2006 DataSlicer: A Hosting Platform for Data-Centric Network Services He, Congchun Abstract | PDF

Title: DataSlicer: A Hosting Platform for Data-Centric Network Services

Candidate: He, Congchun

Advisor(s): Karamcheti, Vijay

Abstract:

As the Web evolves, the number of network services deployed on the Internet has been growing at a dramatic pace. Such services usually involve a massive volume of data stored in physical or virtual back-end databases, and access the data to dynamically generate responses for client requests. These characteristics restrict use of traditional mechanisms for improving service performance and scalability: large volumes prevent replication of the service data at multiple sites required by content distribution schemes, while dynamic responses do not support the reuse required by web caching schemes.

However, many deployed data-centric network services share other properties that can help alleviate this situation: (1) service usage patterns exhibit locality of various forms, and (2) services are accessed using standard protocols and publicly known message structures. When properly exploited, these characteristics enable the design of alternative caching infrastructures, which leverage distributed network intermediaries to inspect traffic flowing between clients and services, infer locality information dynamically, and potentially improve service performance by taking actions such as partial service replication, request redirection, or admission control.

This dissertation investigates the nature of locality in service usage patterns for two well-known web services, and reports on the design, implementation, and evaluation of such a network intermediary architecture, named DataSlicer. DataSlicer incorporates four main techniques: (1) Service-neutral request inspection and locality detection on distributed network intermediaries; (2) Construction of oriented overlays for clustering client requests; (3) Integrated load-balancing and service replication mechanisms that improve service performance and scalability by either redistributing the underlying traffic in the network or creating partial service replicas on demand at appropriate network locations; and (4) Robustness mechanisms to maintain system stability in a wide-area network environment.

DataSlicer has been successfully deployed on the PlanetLab network. Extensive experiments using synthetic workloads show that our approach can: (1) create appropriate oriented overlays to cluster client requests according to multiple application metrics; (2) detect locality information across multiple dimensions and granularity levels; (3) leverage the detected locality information to perform appropriate load-balancing and service replication actions with minimal cost; and (4) ensure robust behavior in the face of dynamically changing network conditions.
Ph.D. Thesis 2006 Multimarker Genetic Analysis Methods for High Throughput Array Data Ionita, Iuliana Abstract | PDF

Title: Multimarker Genetic Analysis Methods for High Throughput Array Data

Candidate: Ionita, Iuliana

Advisor(s): Mishra, Bud

Abstract:

In this thesis, we focus on multi-marker/-locus statistical methods for analyzing high-throughput array data used for the detection of genes implicated in complex disorders. There are two main parts: the first part concerns the localization of cancer genes from copy number variation data, with an application to lung cancer; the second part concerns the localization of disease genes using an affected-sib-pair design, with an application to inflammatory bowel disease. A third part addresses an important issue involved in the design of these disease-gene-detection studies. More details follow:

1. Detection of Oncogenes and Tumor Suppressor Genes using Multipoint Statistics from Copy Number Variation Data

ArrayCGH is a microarray-based comparative genomic hybridization technique that has been used to compare a tumor genome against a normal genome, thus providing rapid genomic assays of tumor genomes in terms of copy number variations of those chromosomal segments, which have been gained or lost. When properly interpreted, these assays are likely to shed important light on genes and mechanisms involved in initiation and progression of cancer. Specifically, chromosomal segments, amplified or deleted in a group of cancer patients, point to locations of cancer genes. We describe a statistical method to estimate the location of such genes by analyzing segmental amplifications and deletions in the genomes from cancer patients and the spatial relation of these segments to any specific genomic interval. The algorithm assigns to a genomic segment a score that parsimoniously captures the underlying biology. It computes a p-value for every putative disease gene by using results from the theory of scan statistics. We have validated our method using simulated datasets, as well as a real dataset on lung cancer.

2. Multi-locus Linkage Analysis of Affected-Sib-Pairs

The affected-sib-pair (ASP) design is a simple and popular design in the linkage analysis of complex traits. The traditional ASP methods evaluate the linkage information at a locus by considering only the marginal linkage information present at that locus. However, complex traits are influenced by multiple genes that together interact to increase the risk of disease. We describe a multi-locus linkage method that uses both the marginal information and information derived from the possible interactions among several disease loci, thereby increasing the significance of loci with modest marginal effects. Our method is based on a statistic that quantifies the linkage information contained in a set of markers. By a marker selection-reduction process, we screen a set of polymorphisms and select a few that seem linked to disease. We test our approach on simulated data and on genome-scan data for inflammatory bowel disease. We show that our method is expected to be more powerful than single-locus methods in detecting disease loci responsible for complex traits.

3. A Practical Haplotype Inference Algorithm

We consider the problem of efficient inference algorithms to determine the haplotypes and their distribution from a dataset of unrelated genotypes.

With the currently available catalogue of single-nucleotide polymorphisms (SNPs) and given their abundance throughout the genome (one in about $500$ bps) and low mutation rates, scientists hope to significantly improve their ability to discover genetic variants associated with a particular complex trait. We present a solution to a key intermediate step by devising a practical algorithm that has the ability to infer the haplotype variants for a particular individual from its own genotype SNP data in relation to population data. The algorithm we present is simple to describe and implement; it makes no assumption such as perfect phylogeny or the availability of parental genomes (as in trio-studies); it exploits locality in linkages and low diversity in haplotype blocks to achieve a linear time complexity in the number of markers; it combines many of the advantageous properties and concepts of other existing statistical algorithms for this problem; and finally, it outperforms competing algorithms in computational complexity and accuracy, as demonstrated by the studies performed on real data and synthetic data.
Ph.D. Thesis 2006 Expressive Motion Lees, Alyssa Abstract | PDF

Title: Expressive Motion

Candidate: Lees, Alyssa

Advisor(s): Bregler, Christopher; Geiger, Davi

Abstract:

Since the advent of motion capture animation, attempts have been made to extract the seemingly nebulously defined attributes of 'content' and 'style' from the motion data. Enabling quick access to highly precise data, the benefits of motion capture for animation purposes are abundant. Yet manipulating the expressive attributes of the motion data in a comprehensive manner has proved elusive. This dissertation poses practical solutions that are based on insights from the dance community and learning attributes from the motion data itself. The culminating project is a system which learns the deformations of the human body and reapplies them in exaggerated form for enhanced expressivity.

The result, developed alongside efficient and usable tools for animators, is a three-pronged technique to enhance the expressive qualities of motion capture animation. The key aspect is the creation of a deformable skeleton representation of the human body using a unique machine learning approach. The deformable skeleton is modeled by replicating the actual movements of the human spine. The second step relies on exploiting the subtle aspects of motion, such as hand movement, to create an emotional effect visually. Both of these approaches involve exaggerating the movements in the same vein as the traditional 2-D animation technique of 'squash and stretch'. Finally, a novel technique for the application of style on a baseline motion capture sequence is developed.

All of these approaches are rooted in machine learning techniques. Linear discriminant analysis was initially applied to a single phrase of motion demonstrating various style characteristics in LABAN notation. A variety of methods, including nonlinear PCA and LLE, were used to learn the underlying manifold of spine movements. Nonlinear dynamic models were learned in attempts to describe motion segments versus single phrases. In addition, the dissertation focuses on the variety of obstacles in learning with motion data. These include the correct parameterization of angles, applying statistical analysis to quaternions, and appropriate distance measures between postures.
Ph.D. Thesis 2006 Building Trustworthy Storage Services out of Untrusted Infrastructure Li, Jinyuan Abstract | PDF

Title: Building Trustworthy Storage Services out of Untrusted Infrastructure

Candidate: Li, Jinyuan

Advisor(s): Mazieres, David

Abstract:

As the Internet has become increasingly ubiquitous, it has seen tremendous growth in the popularity of online services. These services range from online CVS repositories like SourceForge and shopping sites to online financial and administrative systems. It is critical for these services to provide correct and reliable execution for clients. However, given their attractiveness as targets and ubiquitous accessibility, online servers also have a significant chance of being compromised, leading to Byzantine failures.

Designing and implementing a service to run on a machine that may be compromised is not an easy task, since infrastructure under malicious control may behave arbitrarily. Even worse, as any monitoring facility may also be subverted at the same time, there is no easy way for system behavior to be audited, or for malicious attacks to be detected.

We propose a solution to the problem that reduces the trust needed on the server side in the first place. In other words, our system is designed specifically for running on untrusted hosts. In this thesis, we realize this principle through two different approaches. First, we design and implement a new network file system -- SUNDR. In SUNDR, malicious servers cannot forge users' operations or tamper with their data without being detected. In the worst case, attackers can only conceal users' operations from each other. Still, SUNDR is able to detect this misbehavior whenever users communicate with each other directly.

The limitation of the approach above is that the system cannot guarantee ideal consistency with even a single failure. In the second approach, we use replicated state machines to tolerate some fraction of malicious server failures, which is termed Byzantine Fault Tolerance (BFT) in the literature. Classical BFT systems assume that fewer than 1/3 of the replicas are malicious in order to provide ideal consistency. In this thesis, we push the boundary from 1/3 to 2/3. With fewer than 1/3 of replicas faulty, we provide the same guarantees as classical BFT systems. Additionally, we guarantee weaker consistency, instead of arbitrary behavior, when between 1/3 and 2/3 of replicas fail.
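
The fault thresholds stated above, with the boundary pushed from 1/3 to 2/3, can be summarized in a few lines. This is only a sketch: the function name and the returned labels are hypothetical, and how the exact boundaries (precisely 1/3 or 2/3 of replicas faulty) are treated is an assumption of this example rather than something the abstract specifies.

```python
def consistency_guarantee(total_replicas: int, faulty: int) -> str:
    """Classify the guarantee for a given number of faulty replicas.

    Integer comparisons (3 * faulty vs. total_replicas) avoid floating-point fractions.
    Assumption: exactly 1/3 faulty falls in the weaker band, exactly 2/3 in neither.
    """
    if total_replicas <= 0 or not 0 <= faulty <= total_replicas:
        raise ValueError("need 0 <= faulty <= total_replicas and total_replicas > 0")
    if 3 * faulty < total_replicas:
        return "ideal consistency"       # same regime as classical BFT
    if 3 * faulty < 2 * total_replicas:
        return "weaker consistency"      # the additional regime described above
    return "no guarantee"


for f in (3, 4, 6, 7):
    print(f, consistency_guarantee(10, f))
```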
Ph.D. Thesis 2006 Measures for Robust Stability and Controllability Mengi, Emre Abstract | PDF

Title: Measures for Robust Stability and Controllability

Candidate: Mengi, Emre

Advisor(s): Overton, Michael

Abstract:

A linear time-invariant dynamical system is robustly stable if the system as well as all of its nearby systems in a neighborhood of interest are stable. An important property of robustly stable systems is that they decay asymptotically without exhibiting significant transient behavior. The first part of this thesis work focuses on measures revealing the degree of robust stability of a dynamical system. We put special emphasis on pseudospectral measures, those based on the eigenvalues of nearby matrices for a first-order system or matrix polynomials for a higher-order system. We present algorithms for the computation of pseudospectral measures for continuous and discrete systems with quadratic rate of convergence and analyze their accuracy in the presence of rounding errors. We also provide an efficient algorithm for the numerical radius of a matrix, the modulus of the outermost point in the field of values (the set of Rayleigh quotients) of the matrix. These algorithms are inspired by algorithms of Byers, Boyd-Balakrishnan and Burke-Lewis-Overton.

The second part is devoted to indicators of robust controllability. We call a system robustly controllable if it is controllable and remains controllable under perturbations of interest. We describe efficient methods for the computation of the distance to the closest uncontrollable system. Our first algorithm for the first-order distance to uncontrollability depends on a grid and is well-suited for low precision approximation. We then discuss algorithms for high precision approximation of the first-order distance to uncontrollability. These are based on the bisection method of Gu and the trisection variant of Burke-Lewis-Overton.

These algorithms require the extraction of the real eigenvalues of matrices of size $O(n^2)$ typically at a cost of $O(n^6)$, where $n$ is the dimension of the state space. We propose a new divide-and-conquer algorithm that reduces the cost to $O(n^4)$ on average in both theory and practice and $O(n^5)$ in the worst case. The new iterative approach to the extraction of real eigenvalues may also be useful in other contexts. For higher-order systems we derive a singular value characterization and exploit this characterization for the computation of the higher-order distance to uncontrollability to low precision. The algorithms in this thesis assume arbitrary complex perturbations are applicable to the input system and usually require the extraction of the imaginary eigenvalues of Hamiltonian matrices (or even matrix polynomials) or the unit eigenvalues of symplectic pencils (or palindromic matrix polynomials).
Ph.D. Thesis 2006 Algorithmic Algebraic Model Checking: Hybrid Automata & Systems Biology Mysore, Venkatesh Pranesh Abstract | PDF

Title: Algorithmic Algebraic Model Checking: Hybrid Automata & Systems Biology

Candidate: Mysore, Venkatesh Pranesh

Advisor(s): Mishra, Bud

Abstract:

Systems Biology strives to hasten our understanding of the fundamental principles of life by adopting a systems-level approach for the analysis of cellular function and behavior. One popular framework for capturing the chemical kinetics of interacting biochemicals is Hybrid Automata. Our goal in this thesis is to aid Systems Biology research by improving the current understanding of hybrid automata, by developing techniques for symbolic rather than numerical analysis of the dynamics of biochemical networks modeled as hybrid automata, and by honing the theory to two classes of problems: kinetic mass action based simulation in genetic regulatory & signal transduction pathways, and pseudo-equilibrium simulation in metabolic networks.

We first provide new constructions that prove that the "open" Hierarchical Piecewise Constant Derivative (HPCD) subclass is closer to the decidability and undecidability frontiers than was previously understood. After concluding that the HPCD-like classes are unsuitable for modeling chemical reactions, our quest for semi-decidable subclasses leads us to define the "semi-algebraic" subclass. This is the most expressive hybrid automaton subclass amenable to rigorous symbolic temporal reasoning. We begin with the bounded reachability problem, and then show how the dense-time temporal logic Timed Computation Tree Logic (TCTL) can be model-checked by exploiting techniques from real algebraic geometry, primarily real quantifier elimination. We also prove the undecidability of reachability in the Blum-Shub-Smale Turing Machine formalism. We then develop efficient approximation strategies by extending bisimulation partitioning, rectangular grid-based approximation, polytopal approximation and time discretization. We then develop a uniform algebraic framework for modeling biochemical and metabolic networks, also extending flux balance analysis. We present some preliminary results using a prototypical tool Tolque. It is a symbolic algebraic dense time model-checker for semi-algebraic hybrid automata, which uses Qepcad for quantifier elimination.

The "Algorithmic Algebraic Model Checking" techniques developed in this thesis present a theoretically-grounded mathematically-sound platform for powerful symbolic temporal reasoning over biochemical networks and other semi-algebraic hybrid automata. It is our hope that by building upon this thesis, along with the development of computationally efficient parallelizable quantifier elimination algorithms and the integration of different computer algebra tools, scientific software systems will emerge that fundamentally transform the way biochemical networks (and other hybrid automata) are analyzed.
Ph.D. Thesis 2006 Building an Automatic Phenotyping System of Developing Embryos Ning, Feng Abstract | PDF

Title: Building an Automatic Phenotyping System of Developing Embryos

Candidate: Ning, Feng

Advisor(s): LeCun, Yann

Abstract:

This dissertation presents a learning-based system for the detection, identification, localization, and measurement of various sub-cellular structures in microscopic images of developing embryos. The system analyzes sequences of images obtained through DIC microscopy and detects cell nuclei, cytoplasm, and cell walls automatically. The system described in this dissertation is the key initial component of a fully automated phenotype analysis system.

Our study primarily concerns the early stages of development of C. elegans nematode embryos, from fertilization to the four-cell stage. The method proposed in this dissertation consists of learning the entire processing chain from end to end, from raw pixels to ultimate object categories.

The system contains three modules: (1) a convolutional network trained to classify each pixel into five categories: cell wall, cytoplasm, nuclear membrane, nucleus, outside medium; (2) an Energy-Based Model which cleans up the output of the convolutional network by learning local consistency constraints that must be satisfied by label images; (3) a set of elastic models of the embryo at various stages of development that are matched to the label images.

When observing normal (wild type) embryos it is possible to visualize important cellular functions such as nuclear movements and fusions, cytokinesis and the setting up of crucial cell-cell contacts. These events are highly reproducible from embryo to embryo. The events will deviate from normal behaviors when the function of a specific gene is perturbed, therefore allowing the detection of correlations between gene activities and specific early embryonic events. One important goal of the system is to automatically detect whether the development is normal (and therefore, not particularly interesting), or abnormal and worth investigating. Another important goal is to automatically extract quantitative measurements such as the migration speed of the nuclei and the precise time of cell divisions.
Ph.D. Thesis 2006 A Polymorphic Type System and Compilation Scheme for Record Concatenation Osinski, Edward Abstract | PDF

Title: A Polymorphic Type System and Compilation Scheme for Record Concatenation

Candidate: Osinski, Edward

Advisor(s): Goldberg, Benjamin

Abstract:

The notion of records, which are used to organize closely related groups of data so the group can be treated as a unit, and also provide access to the data within by name, is almost universally supported in programming languages. However, in virtually all cases, the operations permitted on records in statically typed languages are extremely limited. Providing greater flexibility in dealing with records, while simultaneously retaining the benefits of static type checking is a desirable goal.

This problem has generated considerable interest, and a number of type systems dealing with records have appeared in the literature. In this work, we present the first polymorphic type system that is expressive enough to type a number of complex operations on records, including three forms of concatenation and natural join. In addition, the precise types of the records involved are inferred, to eliminate the burden of explicit type declarations. Another aspect of this problem is an efficient implementation of records and their associated operations. We also present a compilation method which accomplishes this goal.
Ph.D. Thesis 2006 A Probabilistic Learning Approach to Attribute Value Inconsistency Resolution Pevzner, Ilya Abstract | PDF

Title: A Probabilistic Learning Approach to Attribute Value Inconsistency Resolution

Candidate: Pevzner, Ilya

Advisor(s): Goldberg, Arthur

Abstract:

Resolving inconsistencies in data is a problem of critical practical importance. Inconsistent data arises whenever an attribute takes on multiple, inconsistent, values. This may occur when a particular entity is stored multiple times in one database, or in multiple databases that are combined.

We investigate Attribute Value Inconsistency Resolution (AVIR), the problem of semi-automatically resolving data inconsistencies among multiple database records that describe the same person or thing.

Our survey of the area shows that existing solutions are either limited in scope or impose a significant burden on their users. Either they do not cover all types of inconsistencies and attributes, or they require users to write or choose attribute resolution functions for each potentially conflicting attribute.

Our ML-based approach applies to all types of inconsistencies and attributes, and automatically selects appropriate resolution functions based on the conflicting data. We have invented and developed a system that uses a set of binary features that detect data properties and relationships and resolution functions that merge data. Many such features and resolution functions have been written. The system uses supervised learning with maximum likelihood estimation to determine which function(s) to apply, based on which feature(s) fire.
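
One simple reading of "maximum likelihood estimation over which features fire" is a relative-frequency estimate: for each set of fired features seen in labeled training data, estimate how often each resolution function was the right choice, then pick the most likely one. The sketch below illustrates that reading only. The feature names, the resolution-function names, and the decision to abstain on unseen feature sets are hypothetical and are not taken from the system described above.

```python
from collections import Counter, defaultdict
from typing import Dict, FrozenSet, Iterable, Optional, Set, Tuple

Pattern = FrozenSet[str]
Model = Dict[Pattern, Dict[str, float]]


def fit(examples: Iterable[Tuple[Set[str], str]]) -> Model:
    """Estimate P(function is correct | fired features) by relative frequency,
    which is the maximum likelihood estimate for a categorical distribution."""
    counts: Dict[Pattern, Counter] = defaultdict(Counter)
    for fired, correct_function in examples:
        counts[frozenset(fired)][correct_function] += 1
    model: Model = {}
    for pattern, counter in counts.items():
        total = sum(counter.values())
        model[pattern] = {fn: c / total for fn, c in counter.items()}
    return model


def choose(model: Model, fired: Set[str]) -> Optional[str]:
    """Return the most likely resolution function, or None to make no decision."""
    dist = model.get(frozenset(fired))
    if not dist:
        return None
    return max(dist, key=dist.get)


# Hypothetical training data: (features that fired, function that resolved correctly).
training = [
    ({"same_prefix"}, "keep_longest"),
    ({"same_prefix"}, "keep_longest"),
    ({"same_prefix"}, "keep_newest"),
    ({"numeric"}, "average"),
]
model = fit(training)
print(choose(model, {"same_prefix"}), choose(model, {"unseen_feature"}))
```

Keying on the whole fired set is the crudest possible choice and generalizes poorly to unseen combinations; a practical system would need some way to share evidence across related feature sets.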

We have validated our system by comparing its error rate, decision rate and decision accuracy on a test data set to baseline values determined by a clairvoyant application of a standard approach where each potentially conflicting attribute is resolved by the best resolution function for the attribute.
Ph.D. Thesis 2006 Animating Autonomous Pedestrians Shao, Wei Abstract | PDF

Title: Animating Autonomous Pedestrians

Candidate: Shao, Wei

Advisor(s): Terzopoulos, Demetri

Abstract:

This thesis addresses the difficult open problem in computer graphics of autonomous human modeling and animation, specifically of emulating the rich complexity of real pedestrians in urban environments.

We pursue an artificial life approach that integrates motor, perceptual, behavioral, and cognitive components within a model of pedestrians as highly capable individuals. Our comprehensive model features innovations in these components, as well as in their combination, yielding results of unprecedented fidelity and complexity for fully autonomous multi-human simulation in large urban environments. Our pedestrian model is entirely autonomous and requires no centralized, global control whatsoever.

To animate a variety of natural interactions between numerous pedestrians and their environment, we represent the environment using hierarchical data structures, which efficiently support the perceptual queries of the autonomous pedestrians that drive their behavioral responses and sustain their ability to plan their actions on local and global scales.

The animation system that we implement using the above models enables us to run long-term simulations of pedestrians in large urban environments without manual intervention. Real-time simulation can be achieved for well over a thousand autonomous pedestrians. With each pedestrian under his/her own autonomous control, the self-animated characters imbue the virtual world with liveliness, social (dis)order, and a realistically complex dynamic.

We demonstrate the automated animation of human activity in a virtual train station, and we employ our pedestrian simulator in the context of virtual archaeology for visualizing urban social life in reconstructed archaeological sites. Our pedestrian simulator is also serving as the basis of a testbed for designing and experimenting with visual sensor networks in the field of computer vision.
Ph.D. Thesis 2006 Complexity Analysis of Algorithms in Algebraic Computation Sharma, Vikram Abstract | PDF

Title: Complexity Analysis of Algorithms in Algebraic Computation

Candidate: Sharma, Vikram

Advisor(s): Yap, Chee

Abstract:

Numerical computations with real algebraic numbers require algorithms for approximating and isolating real roots of polynomials. A classical choice for root approximation is Newton's method. For an analytic function on a Banach space, Smale introduced the concept of approximate zeros, i.e., points from which Newton's method for the function converges quadratically. To identify these approximate zeros he gave computationally verifiable convergence criteria called point estimates. However, in developing these results Smale assumed that Newton's method is computed exactly. For a system of $n$ homogeneous polynomials in $n+1$ variables, Malajovich developed point estimates for a different definition of approximate zero, assuming that all operations in Newton's method are computed with fixed precision. In the first half of this dissertation, we develop point estimates for these two different definitions of approximate zeros of an analytic function on a Banach space, but assume the strong bigfloat computational model of Brent, i.e., where all operations involve bigfloats with varying precision. In this model, we derive a uniform complexity bound for approximating a root of a zero-dimensional system of $n$ integer polynomials in $n$ variables. We also derive a non-asymptotic bound, in terms of the condition number of the system, on the precision required to implement the robust Newton method.

The second part of the dissertation analyses the worst-case complexity of two algorithms for isolating real roots of a square-free polynomial with real coefficients: the Descartes method and Akritas' continued fractions algorithm. The analysis of both algorithms is based upon amortization bounds such as the Davenport-Mahler bound. For the Descartes method, we give a unified framework that encompasses both the power basis and the Bernstein basis variant of the method; we derive an $O(n(L+\log n))$ bound on the size of the recursion tree obtained by applying the method to a square-free polynomial of degree $n$ with integer coefficients of bit-length $L$, and the bound is tight for $L=\Omega(\log n)$; based upon this result we readily obtain the best known bit-complexity bound of $\widetilde{O}(n^4L^2)$ for the Descartes method, where $\widetilde{O}$ means we ignore logarithmic factors. Similar worst-case bounds on the bit-complexity of Akritas' algorithm were not known in the literature. We provide the first such bound, $\widetilde{O}(n^{12}L^3)$, for a square-free integer polynomial of degree $n$ and coefficients of bit-length $L$.
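
As a minimal illustration of the test at the core of the Descartes method (a sketch under stated assumptions, not the thesis's algorithm or its complexity analysis): Descartes' rule of signs bounds the number of positive real roots of a polynomial by the number of sign variations in its coefficient sequence. Applying the rule to $(x+1)^n p(1/(x+1))$ gives a bound on the roots of $p$ in the interval $(0,1)$; a count of 0 or 1 is exact. The function names below are hypothetical, and coefficients are assumed to be integers listed from lowest to highest degree.

```python
def sign_variations(coeffs):
    """Count sign changes in a coefficient sequence, ignoring zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))


def taylor_shift_one(coeffs):
    """Return the coefficients of p(x + 1), lowest degree first."""
    c = list(coeffs)
    n = len(c)
    for i in range(n - 1):
        for j in range(n - 2, i - 1, -1):
            c[j] += c[j + 1]
    return c


def variations_in_unit_interval(coeffs):
    """Descartes test for (0, 1): sign variations of (x+1)^n * p(1/(x+1)).

    Reversing the coefficients gives x^n * p(1/x); shifting by one then
    maps the roots of p in (0, 1) to positive roots of the new polynomial.
    """
    return sign_variations(taylor_shift_one(list(reversed(coeffs))))


# p(x) = 2x^2 - 5x + 2 = (2x - 1)(x - 2), with roots 1/2 and 2.
p = [2, -5, 2]
print(sign_variations(p))              # bound on positive roots
print(variations_in_unit_interval(p))  # bound on roots in (0, 1)
```

For this polynomial the first count is 2 (both roots are positive) and the second is 1 (only the root 1/2 lies in the unit interval). A full isolation procedure would subdivide the interval whenever the count exceeds 1, which produces the recursion tree whose size is bounded above.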
Ph.D. Thesis 2006 Pairwise Comparison between Genomic Sequences and Optical-Maps Sun, Bing Abstract | PDF

Title: Pairwise Comparison between Genomic Sequences and Optical-Maps

Candidate: Sun, Bing

Advisor(s): Mishra, Bud

Abstract:

With the development and improvement of high-throughput experimental technologies, massive amounts of biological data, including genomic sequences and optical-maps, have been collected for various species. Comparative techniques play a central role in investigating the adaptive significance of organismal traits and revealing evolutionary relations among organisms by comparing these biological data. This dissertation presents two efficient comparative analysis tools used in comparative genomics and comparative optical-map study, respectively.

A complete genome sequence of an organism can be viewed as its ultimate genetic map, in the sense that the heritable information is encoded within the DNA and the order of nucleotides along chromosomes is known. Comparative genomics can be applied to find functional sites by comparing genetic maps. Comparing vertebrate genomes requires efficient cross-species sequence alignment programs. The first tool introduced in this thesis is COMBAT (Clean Ordered Mer-Based Alignment Tool), a new mer-based method which can search rapidly for highly similar translated genomic sequences using the stable-marriage algorithm (SM) as an alignment filter. In experiments, COMBAT is applied to comparative analysis between yeast genomes, and between the human genome and the recently published bovine genome. The homologous blocks identified by COMBAT are comparable with the alignments produced by BLASTP and BLASTZ.

When genetic maps are not available, other genomic maps, including optical-maps, can be constructed. An optical map is an ordered enumeration of the restriction sites along with the estimated lengths of the restriction fragments between consecutive restriction sites. CAPO (Comparative Analysis and Phylogeny with Optical-Maps), introduced as a second technique in this thesis, is a tool for inferring phylogeny based on pairwise optical map comparison and bipartite graph matching. CAPO combines the stable matching algorithm with either the Unweighted Pair Group Method with Arithmetic Averaging (UPGMA) or the Neighbor-Joining (NJ) method for constructing phylogenetic trees. This new algorithm is capable of constructing phylogenetic trees in logarithmic steps and performs well in practice. Using optical maps constructed in silico and in vivo, our work shows that both UPGMA-flavored trees and the NJ-flavored trees produced by CAPO share substantial overlapping tree topology and are biologically meaningful.
Ph.D. Thesis 2006 Exploiting Service Usage Information for Optimizing Server Resource Management Totok, Alexander Abstract | PDF

Title: Exploiting Service Usage Information for Optimizing Server Resource Management

Candidate: Totok, Alexander

Advisor(s): Karamcheti, Vijay

Abstract:

It is difficult to provision and manage modern component-based Internet services so that they provide stable quality-of-service (QoS) guarantees to their clients, because: (1) component middleware are complex software systems that expose several independently tuned configurable application runtime policies and server resource management mechanisms; (2) session-oriented client behavior with complex data access patterns makes it hard to predict what impact tuning these policies and mechanisms has on application behavior; (3) component-based Internet services exhibit complex structural organization with requests of different types accessing different components and data sources, which could be distributed and/or replicated for failover, performance, or business purposes.

This dissertation attempts to alleviate this situation by targeting three interconnected goals: (1) providing improved QoS guarantees to the service clients, (2) optimizing server resource utilization, and (3) providing application developers with guidelines for natural application structuring, which enable efficient use of the proposed mechanisms for improving service performance. Specifically, we explore the thesis that exposing and using detailed information about how clients use component-based Internet services enables mechanisms that achieve the range of goals listed above. To validate this thesis we show its applicability to the following four problems: (1) maximizing reward brought by Internet services, (2) optimizing utilization of server resource pools, (3) providing session data integrity guarantees, and (4) enabling service distribution in wide-area environments.

The techniques that we propose for the identified problems are applicable at both the application structuring stage and the application operation stage, and range from automatic (i.e., performed by middleware in real time) to manual (i.e., involving the programmer or the service provider). These techniques take into account service usage information exposed at different levels, ranging from the high-level structure of user sessions to low-level information about data access patterns and resource utilization by requests of different types. To show the benefits of the proposed techniques, we implement various middleware mechanisms in the JBoss application server, which utilizes the J2EE component model, and comprehensively evaluate them on several publicly available sample J2EE applications: Java Pet Store, RUBiS, and our own implementation of the TPC-W web transactional benchmark. Our experimental results show that the proposed techniques achieve optimal utilization of server resources and improve application performance by up to two times for centralized Internet services and by up to six times for distributed ones.
Ph.D. Thesis 2006 Time Series Matching: A Multi-Filter Approach Wang, Zhihua Abstract | PDF

Title: Time Series Matching: A Multi-Filter Approach

Candidate: Wang, Zhihua

Advisor(s): Shasha, Dennis

Abstract:

Data arriving in time order (time series) arises in disciplines ranging from music to meteorology to finance to motion capture data, to name a few. In many cases, a natural way to query the data is what we call time series matching - a user enters a time series by hand, keyboard or voice and the system finds "similar" time series.

Existing time series similarity measures, such as DTW (Dynamic Time Warping), can accommodate certain timing errors in the query and perform with high accuracy on small databases. However, they all have high computational complexity and the accuracy dramatically drops when the data set grows. More importantly, there are types of errors that cannot be captured by a single similarity measure.
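
For readers unfamiliar with DTW, the textbook dynamic program can be sketched as follows. This is a generic illustration, not the implementation evaluated in the thesis; the function name `dtw_distance` and the absolute-difference local cost are assumptions. The table has one cell per pair of positions, which is the quadratic cost referred to above.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j]; each step
    may advance in a, in b, or in both, so a stretched or compressed query
    can still align with the reference.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A held note (the repeated 2) costs nothing under warping.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
# A constant offset cannot be warped away.
print(dtw_distance([0, 0], [1, 1]))
```

The first call returns 0.0 because the timing difference is absorbed by the warping path; the second returns 2.0 because an offset in value is a different kind of error, one that a timing-tolerant measure alone does not capture.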

Here we present a general time series matching framework. This framework can easily optimize, combine and test different features to execute a fast similarity search based on the application's requirement. Basically we use a multi-filter chain and boosting algorithms to compose a ranking algorithm. Each filter is a classifier which removes bad candidates by comparing certain features of the time series data. Some filters use a boosting algorithm to combine a few different weak classifiers into a strong classifier. The final filter will give a ranked list of candidates in the reference data which matches the query data.

The framework is applied to build query algorithms for a Query-by-Humming system. Experiments show that the algorithm has a more accurate similarity measure and that its response time increases much more slowly than that of the pure DTW algorithm when the number of songs in the database increases from 60 to 1400.
Ph.D. Thesis 2006 Incremental Web Search: Tracking Changes in the Web Wang, Ziyang Abstract | PDF

Title: Incremental Web Search: Tracking Changes in the Web

Candidate: Wang, Ziyang

Advisor(s): Davis, Ernest

Abstract:

A large amount of new information is posted on the Web every day. Large-scale web search engines often update their index slowly and are unable to present such information in a timely manner. Here we present our solutions for searching for new information on the web by tracking the changes of web documents.

First, we present the algorithms and techniques useful for solving the following problems: detecting web pages that have changed, extracting changes from different versions of a web page, and evaluating the significance of web changes. We propose a two-level change detector: MetaDetector and ContentDetector. The combined detector successfully reduces network traffic by about 67%. Our algorithm for extracting web changes consists of three steps: document tree construction, document tree encoding and tree matching. It has linear time complexity and extracts effectively the changed content from different versions of a web page. In order to evaluate web changes, we propose a unified ranking framework combining three metrics: popularity ranking, content-based ranking and evolution ranking. Our methods can identify and deliver important new information in a timely manner.

Second, we present an application using the techniques and algorithms we developed, named "Web Daily News Assistant (WebDNA): finding what's new on Your Web". It is a search tool that helps community users search for new information on their community web. Currently WebDNA is deployed on the New York University web site.

Third, we model the changes of web documents using survival analysis. Modeling web changes is useful for web crawler scheduling and web caching. Currently people model changes to web pages as a Poisson Process, and use a necessarily incomplete detection history to estimate the true frequencies of changes. However, other features that can be used to predict change frequency have not previously been studied. Our analysis shows that PageRank value is a good predictor. Statistically, the change frequency is a function proportional to $\exp[0.36\cdot (\ln(\mathrm{PageRank})+C)]$. We further study the problem of combining the predictor and change history into a unified framework. An improved estimator of change frequency is presented, which successfully reduces the error by 27.3% when the change history is short.
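
The stated proportionality can be read as a power law in PageRank. The short derivation below is a worked restatement only; the symbol $\lambda$ for the change frequency is an assumption introduced here, and $C$ is treated as a constant.

```latex
\lambda \;\propto\; \exp\bigl[0.36\,(\ln(\mathrm{PageRank}) + C)\bigr]
       \;=\; e^{0.36\,C}\cdot \mathrm{PageRank}^{0.36}
% Consequence: doubling PageRank multiplies the predicted change
% frequency by 2^{0.36} \approx 1.28, independent of C.
```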
Ph.D. Thesis 2006 Fast Algorithms for Burst Detection Zhang, Xin Abstract | PDF

Title: Fast Algorithms for Burst Detection

Candidate: Zhang, Xin

Advisor(s): Shasha, Dennis

Abstract:

Events occur in every aspect of our lives.

An unexpectedly large number of events occurring within some measurement scope (e.g., within some time duration or spatial region) is called a burst, suggesting unusual behaviors or activities. Bursts come up in many natural and social processes. It is a challenging task to monitor the occurrence of bursts whose duration is unknown in a fast data stream environment.
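
To make the definition concrete, here is a naive baseline sketch (not one of the thesis's data structures): it checks every window of every candidate size against a per-size threshold using prefix sums. The function name `detect_bursts`, the threshold dictionary, and the sample counts are all hypothetical.

```python
from itertools import accumulate


def detect_bursts(counts, thresholds):
    """Report every window whose event total reaches its threshold.

    counts     -- events observed per time step
    thresholds -- maps a window size to the total that counts as a burst
    Returns a list of (start_index, window_size, total) tuples.
    """
    prefix = [0] + list(accumulate(counts))  # prefix[i] = sum(counts[:i])
    bursts = []
    for size, threshold in sorted(thresholds.items()):
        for start in range(len(counts) - size + 1):
            total = prefix[start + size] - prefix[start]
            if total >= threshold:
                bursts.append((start, size, total))
    return bursts


counts = [1, 0, 2, 9, 8, 1, 0]
print(detect_bursts(counts, {1: 9, 2: 15}))
```

On this input the sketch reports a burst of 9 events in the single step at index 3 and a burst of 17 events in the two-step window starting there. Its running time grows with the stream length times the number of window sizes, which is impractical when the burst duration is unknown and many sizes must be monitored on a fast stream.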

This work describes efficient data structures and algorithms for high performance burst detection under different settings. Our view is that bursts, as an unusual phenomenon, constitute a useful preliminary primitive in a knowledge discovery hierarchy. Our intent is to build a high performance primitive detection algorithm to support high-level data mining tasks.

The work starts with an algorithmic framework including a family of data structures and a heuristic optimization algorithm to choose an efficient data structure given the inputs. The advantage of this framework is that it's adaptive to different inputs. Experiments on both synthetic data and real world data show the new framework significantly outperforms existing techniques over a variety of inputs.

Furthermore, we present a greedy dynamic detection algorithm which handles the changing data. It evolves the structure to adapt to the incoming data. It achieves better performance in both synthetic and real data streams than a static algorithm in most cases.

We have applied this framework to different real world applications in physics, stock trading and website traffic monitoring. All the case studies show our framework has superb performance.

We extend this framework to multi-dimensional data and use it in an epidemiology simulation to detect infectious disease outbreak and spread.
Ph.D. Thesis 2006 High Performance Algorithms for Multiple Streaming Time Series Zhao, Xiaojian Abstract | PDF

Title: High Performance Algorithms for Multiple Streaming Time Series

Candidate: Zhao, Xiaojian

Advisor(s): Shasha, Dennis

Abstract:

Data arriving in time order (a data stream) arises in fields ranging from physics to finance to medicine to music, to name a few. Often the data comes from sensors (in physics and medicine for example) whose data rates continue to improve dramatically as sensor technology improves. Furthermore, the number of sensors is increasing, so analyzing data between sensors becomes ever more critical in order to distill knowledge from the data. Fast response is desirable in many applications (e.g. to aim a telescope at an activity of interest or to perform a stock trade). In applications such as finance, recent information, e.g. correlation, is of far more interest than older information, so analysis over sliding windows is a desired operation.

These three factors (huge data size, fast response, and windowed computation) motivated this work. Our intent is to build a foundational library of primitives to perform online or near-online statistical analysis, e.g. windowed correlation, incremental matching pursuit, and burst detection, on thousands or even millions of time series. Besides the algorithms, we also propose the concept of "uncooperative" time series, whose power spectra are spread over all frequencies without any regularity.

Previous work showed how to do windowed correlation with Fast Fourier Transforms and Wavelet Transforms, but such techniques don't work for uncooperative time series. This thesis will show how to use sketches (random projections) in a way that combines several simple techniques (sketches, convolution, structured random vectors, grid structures, combinatorial design, and bootstrapping) to achieve high-performance windowed correlation over a variety of data sets. Experiments confirm the asymptotic analysis.
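
The basic idea of a sketch can be illustrated in a few lines. This is a minimal sketch assuming a plain random ±1 projection; it omits the convolution, structured random vectors, grid structures and other components named above, and all function names are hypothetical. For two z-normalized windows scaled to unit length, the correlation equals $1 - d^2/2$, where $d$ is their Euclidean distance, and a random projection approximately preserves $d$ in far fewer dimensions.

```python
import math
import random


def z_normalize(window):
    """Center a window and scale it to unit length (assumes it is not constant)."""
    mean = sum(window) / len(window)
    centered = [v - mean for v in window]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered]


def exact_correlation(x, y):
    """Pearson correlation as the dot product of normalized windows."""
    return sum(a * b for a, b in zip(z_normalize(x), z_normalize(y)))


def make_projection(window_len, sketch_len, seed=0):
    """Random +1/-1 matrix with sketch_len rows; the seed is fixed for repeatability."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(window_len)]
            for _ in range(sketch_len)]


def sketch(window, projection):
    """Project the normalized window onto each random vector."""
    unit = z_normalize(window)
    scale = 1.0 / math.sqrt(len(projection))
    return [scale * sum(r * v for r, v in zip(row, unit)) for row in projection]


def estimated_correlation(sketch_x, sketch_y):
    """Estimate correlation from the squared distance between two sketches."""
    d2 = sum((a - b) ** 2 for a, b in zip(sketch_x, sketch_y))
    return 1.0 - d2 / 2.0


rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(256)]
y = [xi + rng.gauss(0, 0.5) for xi in x]
proj = make_projection(window_len=256, sketch_len=64)
print(exact_correlation(x, y), estimated_correlation(sketch(x, proj), sketch(y, proj)))
```

The estimate is approximate and its error shrinks as `sketch_len` grows. Because the projection makes no assumption about how energy is distributed across frequencies, it remains usable on series for which keeping only a few Fourier or wavelet coefficients loses most of the signal.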

To conduct matching pursuit (MP) over time series windows, an incremental scheme is designed to reduce the computational effort. Our empirical study demonstrates a substantial improvement in speed.

In previous work, Zhu and Shasha introduced an efficient algorithm to monitor bursts within windows of multiple sizes. We implemented it in a physical system by overcoming several practical challenges. Experimental results support the authors' linear running time analysis.
Ph.D. Thesis 2006 Distribution of Route-Impacting Control Information in a Publish/Subscribe System with Delivery Guarantees Zhao, Yuanyuan Abstract | PDF

Title: Distribution of Route-Impacting Control Information in a Publish/Subscribe System with Delivery Guarantees

Candidate: Zhao, Yuanyuan

Advisor(s): Kedem, Zvi

Abstract:

Event-driven middleware is a popular infrastructure for building large-scale asynchronous distributed systems. Content-based publish/subscribe systems are a type of event-driven middleware that provides service flexibility and specification expressiveness, creating opportunities for improving reliability and efficiency of the system.

The use of route-impacting control information, such as subscription filters and access control rules, has the potential to enable efficient routing for applications that require selective and regional distribution of events. Such applications range from financial information systems to sensor networks to service-oriented architectures. However, it has been a great challenge to design correct and efficient protocols for distributing control information and exploiting it to achieve efficient and highly available message routing.

In this dissertation, we study the problem of distributing and utilizing route-impacting control information. We present an abstract model of content-based routing and reliable delivery in redundant broker networks. Based on this model, we design a generic algorithm that propagates control information, performs content-based routing, and delivers events reliably. The algorithm is efficient and light-weight in that it does not require heavy-weight consensus protocols between redundant brokers. We extend this generic algorithm to support consolidation and merging of control information. Existing protocols can be viewed as particular encodings and optimizations of the generic algorithm. We show an encoding using virtual time vectors that supports reliable delivery and deterministic dynamic access control in redundant broker networks. In our system, the semantics of reliable delivery is clearly defined even if subscription information and access control policy can change dynamically. That is, one or more subscribers of the same principal will receive exactly the same sequence of messages (modulo subscription filter differences) regardless of where they are connected and of the network latency and failure conditions in their parts of the network.

We have implemented these protocols in a fully functioning content-based publish/subscribe system, Gryphon. We evaluate its efficiency, scalability and high availability.
Ph.D. Thesis 2005 Translation Validation of Optimizing Compilers Fang, Yi Abstract | PDF

Title: Translation Validation of Optimizing Compilers

Candidate: Fang, Yi

Advisor(s): Pnueli, Amir; Zuck, Lenore

Abstract:

There is a growing awareness, both in industry and academia, of the crucial role of formally verifying the translation from high-level source code into low-level object code that is typically performed by an optimizing compiler. Formally verifying an optimizing compiler, as one would verify any other large program, is not feasible due to its size, ongoing evolution and modification, and possibly, proprietary considerations. Translation validation is a novel approach that offers an alternative to the verification of translators in general and compilers in particular: rather than verifying the compiler itself, one constructs a validation tool which, after every run of the compiler, formally confirms that the target code produced in the run is a correct translation of the source program. This thesis work takes an important step towards ensuring an extremely high level of confidence in compilers targeted at EPIC architectures.

In this thesis, we focus on the translation validation of structure preserving optimizations, i.e. transformations that do not modify a program's structure in a major way. This category of optimizations covers most of the global optimizations performed by compilers. This thesis has two main parts. One develops a proof rule that formally establishes the correctness of structure preserving transformations based on computational induction. The other part is the development of a tool that applies the proof rule to the automatic validation of global optimizations performed by Intel's ORC compiler for the IA-64 architecture. With minimal instrumentation from the compiler, the tool constructs "verification conditions": formal theorems that, if valid, establish the correctness of a translation. The verification conditions are then transferred to an automatic theorem prover that checks their validity. Together, the tool offers a fully automatic method to formally establish the correctness of each translation.
Ph.D. Thesis 2005 Translation Validation of Loop Optimizations Hu, Ying Abstract | PDF

Title: Translation Validation of Loop Optimizations

Candidate: Hu, Ying

Advisor(s): Goldberg, Benjamin; Barrett, Clark

Abstract:

Formal verification is important in designing reliable computer systems. For a critical software system, it is not enough to have a proof of correctness for the source code; there must also be an assurance that the compiler produces a correct translation of the source code into the target machine code. Verifying the correctness of modern optimizing compilers is a challenging task because of their size, their complexity, and their evolution over time.

In this thesis, we focus on the Translation Validation of loop optimizations. In order to validate the optimizations performed by the compiler, we try to prove the equivalence of the intermediate codes before and after the optimizations. A set of proof rules for building the equivalence relation between two programs already existed. However, they cannot validate some cases with legal loop optimizations. We propose new proof rules that consider the conditions of loops and the possible elimination of some loops, so that those cases can also be handled. Based on these new proof rules, algorithms are designed to apply them in an automatic validation process.

Based on the above proof rules, we implement an automatic validation tool for loop optimizations which analyzes the loops, guesses what kinds of loop optimizations occur, proves the validity of a combination of loop optimizations, and synthesizes a series of intermediate codes. We integrate this new loop tool into our translation validation tool TVOC, so that TVOC handles not only optimizations which do not significantly change the structure of the code, but also loop optimizations which do change the structure greatly. With this new part, TVOC has succeeded in validating many examples with loop optimizations.

Speculative optimizations are the aggressive optimizations that are only correct under certain conditions that cannot be known at compile time. In this thesis, we present the theory and algorithms for validating speculative optimizations and generating the runtime tests necessary for speculative optimizations. We also provide several examples and the results of the algorithms for speculative optimizations.
Ph.D. Thesis 2005 Construction of Component-Based Applications by Planning Kichkaylo, Tatiana Abstract | PDF

Title: Construction of Component-Based Applications by Planning

Candidate: Kichkaylo, Tatiana

Advisor(s): Karamcheti, Vijay; Davis, Ernest

Abstract:

Many modern wide-area distributed systems are component-based. This approach provides great flexibility in adapting applications to the changing state of the environment and user requirements, but increases the complexity of configuring the applications. Because of the scale and heterogeneity of modern wide-area environments, manual configuration is hard, inefficient, suboptimal, and error-prone. Automated application configuration is desired.

Constructing distributed applications requires choosing a set of components that will constitute the application instance and assigning network resources to component executions and data transfers. Stated this way, the application configuration problem (ACP) is similar to the planning (action selection) and scheduling (resource allocation) problems studied by the Artificial Intelligence (AI) community.

This thesis investigates the problem of solving the ACP using AI planning techniques. However, the ACP poses several challenges not usually encountered and addressed by the traditional AI solutions. The problem specification for the ACP can be much larger than the solution, with the relevant portions only identified during the search. Additionally, the interactions between planning operators are numeric rather than logical. Finally, it is desirable to be able to trade off quality of the solution versus search time.

We show that the ACP is undecidable in general. Therefore, instead of a single algorithm, we propose a set of techniques that can be used to compose an algorithm for a particular variety of the ACP that can exploit natural restrictions exhibited by that variety. These techniques address the challenges above by dynamically obtaining portions of the problem specification as necessary during the search, using envelope hierarchies based on numeric information for pruning and search guidance, and discretizing continuous variables to approximate numeric parameters without restricting the form of supported numeric functions.

We illustrate these techniques by describing their use in algorithms tailored for two specific varieties of the ACP: snapshot configurations for dynamic component-based frameworks, and scheduling of grid workflows with replica selection and explicit resource reservations. Experimental evaluation of the performance of these two algorithms shows that the techniques successfully achieve their goals, with acceptable run-time overhead.
Ph.D. Thesis 2005 Extensible MultiModal Environment Toolkit (EMMET): A Toolkit for Prototyping and Remotely Testing Speech and Gesture Based Multimodal Interfaces Robbins, Christopher A. Abstract | PDF

Title: Extensible MultiModal Environment Toolkit (EMMET): A Toolkit for Prototyping and Remotely Testing Speech and Gesture Based Multimodal Interfaces

Candidate: Robbins, Christopher A.

Advisor(s): Perlin, Ken

Abstract:

Ongoing improvements to the performance and accessibility of less conventional input modalities such as speech and gesture recognition now provide new dimensions for interface designers to explore. Yet there is a scarcity of commercial applications which utilize these modalities either independently or multimodally. This scarcity partially results from a lack of development tools and design guidelines to facilitate the use of speech and gesture.

An integral aspect of the user interface design process is the ability to easily evaluate various design solutions through an iterative process of prototyping and testing. Through this process, guidelines emerge that aid in the design of future interfaces. Today there is no shortage of tools supporting the development of conventional interfaces. However, no resources exist that allow interface designers to easily prototype and quickly test, via remote distribution, interface designs utilizing speech and gesture.

The thesis work for this dissertation explores the development of an Extensible MultiModal Environment Toolkit (EMMET) for prototyping and remotely testing speech and gesture based multimodal interfaces to three-dimensional environments. The overarching goals for this toolkit are to allow its users to: explore speech and gesture based interface design without requiring an understanding of the details involved in the low-level implementation of speech or gesture recognition, quickly distribute their multimodal interface prototypes via the Web, and receive multimodal usage statistics collected remotely after each use of their application.

EMMET ultimately contributes to the field of multimodal user interface design by providing an environment to existing user interface developers in which speech and gesture recognition have been seamlessly integrated into their palette of user input options. Such seamless integration serves to increase the utilization within applications of speech and gesture modalities by removing any actual or perceived deterrents to the use of these modalities versus the use of conventional modalities. EMMET additionally strives to improve the quality of speech and gesture based interfaces by supporting the prototype-and-test development cycle through its Web distribution and usage statistics collection capabilities. These capabilities also allow developers to realize new design guidelines specific to the use of speech and gesture.
Ph.D. Thesis 2005 Pattern Discovery for Hypotheses Generation in Biology Tsirigos, Aristotelis Abstract | PDF

Title: Pattern Discovery for Hypotheses Generation in Biology

Candidate: Tsirigos, Aristotelis

Advisor(s): Shasha, Dennis

Abstract:

In recent years, the increase in the amounts of available genomic as well as gene expression data has provided researchers with the necessary information to train and test various models of gene origin, evolution, function and regulation. In this thesis, we present novel solutions to key problems in computational biology that deal with nucleotide sequences (horizontal gene transfer detection), amino-acid sequences (protein sub-cellular localization prediction), and gene expression data (transcription factor - binding site pair discovery). Different pattern discovery techniques are utilized, such as maximal sequence motif discovery and maximal itemset discovery, and combined with support vector machines in order to achieve significant improvements over previously proposed methods.
Ph.D. Thesis 2005 Automatic Verification of Parameterized Systems Xu, Jiazhao Abstract | PDF

Title: Automatic Verification of Parameterized Systems

Candidate: Xu, Jiazhao

Advisor(s): Pnueli, Amir

Abstract:

Verification plays an indispensable role in designing reliable computer hardware and software systems. With the fast growth in design complexity and the quick turnaround in design time, formal verification has become an increasingly important technology for establishing correctness as well as for finding difficult bugs. Since there is no "silver bullet" to solve all verification problems, a spectrum of powerful techniques in formal verification has been developed to tackle different verification problems and complexity issues. Depending on the nature of the problem, whose most salient components are the system implementation and the property specification, a proper methodology or a combination of different techniques is applied to solve the problem.

In this thesis, we focus on the research and development of formal methods to uniformly verify parameterized systems. A parameterized system is a class of systems obtained by instantiating the system parameters. Parameterized verification seeks a single correctness proof of a property for the entire class. Although the general parameterized verification problem is undecidable [AK86], it is possible to solve special classes by applying a repertoire of techniques and heuristics. Many methods in parameterized verification require a great deal of human interaction. This makes the application of these methods to real world problems infeasible. Thus, the main focus of this research is to develop techniques that can be automated to deliver proofs of safety and liveness properties.

Our research combines various formal techniques such as deductive methods, abstraction and model checking. One main result in this thesis is an automatic deductive method for parameterized verification. We apply small model properties of Bounded Data Systems (a special type of parameterized system) to help prove deductive inference rules for the safety properties of BDS systems. Another methodology we developed enables us to prove liveness properties of parameterized systems via an automatic abstraction method called counter abstraction. There are several useful by-products from our research: a set of heuristics is established for the automatic generation of program invariants, which can benefit deductive verification in general; we also propose methodologies for the automatic abstraction of fairness conditions that are crucial for proving liveness properties.
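The idea behind counter abstraction can be illustrated with a minimal sketch. It assumes a toy system in which each process is in exactly one of a few local locations; the function name `counter_abstract`, the location names, and the cap of 2 are illustrative assumptions, not taken from the thesis. Each concrete global state is mapped to a tuple recording, per location, whether 0, 1, or "2 or more" processes are there, so systems of any size share one finite abstract state space.

```python
from collections import Counter

def counter_abstract(local_states, all_locations, cap=2):
    """Map a concrete global state (one location per process) to
    per-location counters, each truncated at `cap` (meaning 'cap or more')."""
    counts = Counter(local_states)
    return tuple(min(counts[loc], cap) for loc in all_locations)

# Hypothetical locations of a mutual-exclusion style protocol.
LOCATIONS = ("idle", "trying", "critical")

# Two instances with very different numbers of processes...
three_procs = ("idle", "idle", "trying")
many_procs = ("idle",) * 100 + ("trying",)

# ...collapse to the same abstract state.
print(counter_abstract(three_procs, LOCATIONS))
print(counter_abstract(many_procs, LOCATIONS))
```

Because the abstract state no longer depends on the number of processes, a finite-state model checker can in principle explore it once for the whole parameterized family; how transitions and fairness conditions are abstracted is the harder part and is not shown here.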
Ph.D. Thesis 2005 Mobility, Route Caching, and TCP Performance in Mobile Ad Hoc Networks Yu, Xin Abstract | PDF

Title: Mobility, Route Caching, and TCP Performance in Mobile Ad Hoc Networks

Candidate: Yu, Xin

Advisor(s): Johnson, David B.

Abstract:

In a mobile ad hoc network, mobile nodes communicate with each other through wireless links. Mobility causes frequent topology changes. This thesis addresses the fundamental challenges mobility presents to on-demand routing protocols and to TCP.

On-demand routing protocols use route caches to make routing decisions. Due to mobility, cached routes easily become stale. To address the cache staleness issue, prior work used adaptive timeout mechanisms. However, heuristics cannot accurately estimate timeouts because topology changes are unpredictable. I propose to proactively disseminate the broken link information to the nodes that have cached the link. I define a new cache structure called a cache table to maintain the information necessary for cache updates, and design a distributed cache update algorithm. This algorithm is the first work that proactively updates route caches in an adaptive manner. Simulation results show that proactive cache updating is more efficient than adaptive timeout mechanisms. I conclude that proactive cache updating is key to the adaptation of on-demand routing protocols to mobility.

TCP does not perform well in mobile ad hoc networks. Prior work provided link failure feedback to TCP so that it can avoid invoking congestion control mechanisms for packet losses caused by route failures. Simulation results show that my cache update algorithm significantly improves TCP throughput since it reduces the effect of mobility on TCP. TCP still suffers from frequent data and ACK losses. I propose to make routing protocols aware of lost TCP packets and help reduce TCP timeouts. I design two mechanisms that exploit cross-layer information awareness: early packet loss notification (EPLN) and best-effort ACK delivery (BEAD). EPLN notifies TCP senders about lost data. BEAD retransmits ACKs at intermediate nodes or at TCP receivers. Simulation results show that the two mechanisms significantly improve TCP throughput. I conclude that cross-layer information awareness is key to making TCP efficient in the presence of mobility.

I also study the impact of route caching strategies on the scalability of on-demand routing protocols with mobility. I show that making route caches adapt quickly and efficiently to topology changes is key to the scalability of on-demand routing protocols with mobility.
Ph.D. Thesis 2005 Information Extraction from Multiple Syntactic Sources Zhao, Shubin Abstract | PDF

Title: Information Extraction from Multiple Syntactic Sources

Candidate: Zhao, Shubin

Advisor(s): Grishman, Ralph

Abstract:

Information Extraction is the automatic extraction of facts from text, which includes detection of named entities, entity relations and events. Conventional approaches to Information Extraction try to find syntactic patterns based on deep processing of text, such as partial or full parsing. The problem these solutions have to face is that as deeper analysis is used, the accuracy of the result decreases, and one cannot recover from the induced errors. On the other hand, lower level processing is more accurate and it can also provide useful information. However, within the framework of conventional approaches, this kind of information cannot be efficiently incorporated.

This thesis describes a novel supervised approach based on kernel methods to address these issues. In this approach customized kernels are used to match syntactic structures produced from different preprocessing phases. Using properties of a kernel, individual kernels are combined into composite kernels to integrate and extend all the information. The composite kernels can be used with various classifiers, such as Nearest Neighbor or Support Vector Machines (SVM). The main classifier we propose to use is SVM due to its ability to generalize in large dimensional feature spaces. We will show that each level of syntactic information can contribute to IE tasks, and low level information can help to recover from errors in deep processing.

The new approach has demonstrated state-of-the-art performance on two benchmark tasks. The first task is detecting slot fillers for management succession events (MUC-6). For this task two types of kernels were designed: a surface kernel based on word n-grams and a kernel built on sentence dependency trees. The second task is the ACE RDR evaluation, which is to recognize relations between entities in text from newswire and broadcast news transcripts. For this task, five kernels were built to represent information from sentence tokenization, syntactic parsing and dependency parsing. Experimental results for the two tasks will be shown and discussed.
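As a rough illustration of a surface kernel over word n-grams and of combining kernels into a composite, the following sketch counts shared n-grams between two token sequences and sums weighted kernels. The function names, the weights, and the example sentences are assumptions made for illustration; the kernels actually used in the thesis are not specified here. The combination relies only on the standard fact that a non-negative weighted sum of kernels is again a kernel.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of contiguous word n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_kernel(x, y, n):
    """Inner product of the n-gram count vectors of x and y."""
    cx, cy = ngrams(x, n), ngrams(y, n)
    return sum(cx[g] * cy[g] for g in cx if g in cy)

def composite_kernel(x, y, weights=((1, 1.0), (2, 1.0))):
    """Non-negative weighted sum of n-gram kernels (itself a valid kernel)."""
    return sum(w * ngram_kernel(x, y, n) for n, w in weights)

a = "smith was named president".split()
b = "jones was named chairman".split()
print(ngram_kernel(a, b, 1), ngram_kernel(a, b, 2), composite_kernel(a, b))
```

A kernel function of this shape can be handed to any kernel-based classifier that accepts a precomputed Gram matrix.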
Ph.D. Thesis 2004 Partitionable Services Framework: Seamless Access to Distributed Applications Ivan, Anca Abstract | PDF

Title: Partitionable Services Framework: Seamless Access to Distributed Applications

Candidate: Ivan, Anca

Advisor(s): Karamcheti, Vijay

Abstract:

A key problem in contemporary distributed systems is how to satisfy user quality of service (QoS) requirements for distributed applications deployed in heterogeneous, dynamically changing environments spanning multiple administrative domains.

An attractive solution is to create an infrastructure which satisfies user QoS requirements by automatically and transparently adapting distributed applications to any environment changes with minimum user input. However, successful use of this approach requires overcoming three challenges: (1) Capturing the application behavior and its relationship with the environment as a set of compact local specifications, using both general, quantitative (e.g., CPU usage) and qualitative (e.g., security) properties. Such information should be sufficient to reason about the global behavior of the application deployment. (2) Finding the "best" application deployment that satisfies both application and user requirements, and the various domain policies. The search algorithm should be complete, efficient, scalable with regard to application and network sizes, and guarantee optimality (e.g., resources consumed by applications). (3) Ensuring that the found deployments are practical and efficient, i.e., that the efficiency of automatic deployments is comparable with the efficiency of hand-tuned solutions.

This dissertation describes three techniques that address these challenges in the context of component-based applications. The modularity and reusability of the latter enable automatic deployments while supporting reasoning about the global connectivity based on the local information exposed by each component. The first technique extends the basic component-based application model with information about conditions and effects of component deployments and linkages, together with interactions between components and the network. The second technique uses AI planning to build an efficient and scalable algorithm which exploits the expressivity of the application model to find an application deployment that satisfies user QoS and application requirements. The last technique ensures that application deployments are both practical and efficient, by leveraging language and run-time system support to automatically customize components, as appropriate for the desired security and data consistency guarantees. These techniques are implemented as integral parts of the Partitionable Services Framework (PSF), a Java-based framework which flexibly assembles component-based applications to suit the properties of their environment. PSF facilitates on-demand, transparent migration and replication of application components at locations closer to clients, while retaining the illusion of a monolithic application.

The benefits of PSF are evaluated by deploying representative component-based applications in an environment simulating fast and secure domains connected by slow and insecure links. Analysis of the programming and the deployment processes shows that: (1) the code modifications required by PSF are minimal, (2) PSF appropriately adapts the deployments based on the network state and user QoS requirements, (3) the run-time deployment overheads incurred by PSF are negligible compared to the application lifetime, and (4) the efficiency of PSF-deployed applications matches that of hand-crafted solutions.
Ph.D. Thesis 2004 VALIS: A Multi-language System for Rapid Prototyping in Computational Biology Paxia, Salvatore Abstract | PDF

Title: VALIS: A Multi-language System for Rapid Prototyping in Computational Biology

Candidate: Paxia, Salvatore

Advisor(s): Mishra, Bud

Abstract:

Bioinformatics is a challenging area for computer science, since the underlying computational formalisms span database systems, numerical methods, geometric modeling and visualization, imaging and image analysis, combinatorial algorithms, data analysis and mining, statistical approaches, and reasoning under uncertainty.

This thesis describes the Valis environment for rapid application prototyping in bioinformatics. The core components of the Valis system are the underlying database structure and the algorithmic development platform.

This thesis presents a novel set of data structures that has marked advantages when dealing with unstructured and unbounded data that are common in scientific fields and bioinformatics.

Bioinformatics problems rarely have a one-language, one-platform solution. The Valis environment allows seamless integration between scripts written in different programming languages and includes tools to rapidly prototype graphical user interfaces.

To date, the speed of computation of most whole genome analysis tools has stood in the way of developing fast interactive programs that may be used as exploratory tools. This thesis presents the basic algorithms and widgets that permit rapid prototyping of whole genomic scale real-time applications within Valis.
Ph.D. Thesis 2004 Thick Surfaces: Interactive Modeling of Topologically Complex Geometric Details Peng, Jianbo Abstract | PDF

Title: Thick Surfaces: Interactive Modeling of Topologically Complex Geometric Details

Candidate: Peng, Jianbo

Advisor(s): Zorin, Denis

Abstract:

Many objects in computer graphics applications are represented by surfaces. This representation works very well for objects of simple topology, but can get prohibitively expensive for objects with complex small-scale geometrical details.

Volumetric textures aligned with a surface can be used to add topologically complex geometric details to an object, while retaining an underlying simple surface structure. The simple surface structure provides great controllability on the overall shape of the model, and volumetric textures handle geometric details and topological changes efficiently.

Adding a volumetric texture to a surface requires more than a conventional two-dimensional parameterization: a part of the space surrounding the surface has to be parameterized. Another problem with using volumetric textures for adding geometric detail is the difficulty of rendering implicitly represented surfaces, especially when they are changed interactively.

We introduce thick surfaces to represent objects with topologically complex geometric details. A thick surface consists of three components. First, a base mesh of simple structure is used to approximate the overall shape of the object. Second, a layer of space along the base mesh is parameterized. We define the layer of space as a shell, which covers the geometric details of the object. Third, volumetric textures of geometric details are mapped into the shell. The object is represented as the implicit surface encoded by the volumetric textures. Places without volumetric textures are filled with patches of the base mesh.

We present algorithms for constructing a shell around a surface and rendering a volumetric-textured surface. A mipmap technique for volumetric textures is explored as well. The gradient field of a generalized distance function is used to construct a non-self-intersecting shell, which has other properties desirable for volumetric texture mapping. The rendering algorithm is designed and implemented on NVIDIA GeForceFX video chips. Finally, we demonstrate a number of interactive operations that these algorithms enable.
Ph.D. Thesis 2004 TM-LPSAT: Encoding Temporal Metric Planning in Continuous Time Shin, Ji-Ae Abstract | PDF

Title: TM-LPSAT: Encoding Temporal Metric Planning in Continuous Time

Candidate: Shin, Ji-Ae

Advisor(s): Davis, Ernest

Abstract:

In any domain with change, the dimension of time is inherently involved. Whether the domain should be modeled in discrete time or continuous time depends on aspects of the domain to be modeled. Many complex real-world domains involve continuous time, resources, metric quantities and concurrent actions. Planning in such domains must necessarily go beyond simple discrete models of time and change.

In this thesis, we show how the SAT-based planning framework can be extended to generate plans of concurrent asynchronous actions that may depend on or change piecewise linear metric constraints in continuous time.

In the SAT-based planning framework, a planning problem is formulated as a satisfiability problem of a set of propositional constraints (axioms) such that any model of the axioms corresponds to a valid plan. There are two parameters to a SAT-based planning system: an encoding scheme for representing plans of bounded length and a propositional SAT solver to search for a model. The LPSAT architecture is composed of a SAT solver integrated with a linear arithmetic constraint solver in order to deal with metric aspects of domains.

We present encoding schemes for temporal models of continuous time defined in PDDL+: (i) Durative actions with discrete and/or continuous changes; (ii) Real-time temporal model with exogenous events and autonomous processes capturing continuous changes. The encoding represents, in a CNF formula over arithmetic constraints and propositional fluents, time-stamped parallel plans possibly with concurrent continuous and/or discrete changes. In addition, we present encoding schemes for multi-capacity resources, partitioned interval resources, and metric quantities which are represented as intervals. An interval type can be used as a parameter to an action as well as a fluent type.

Based on the LPSAT engine, the TM-LPSAT temporal metric planner has been implemented: given a PDDL+ representation of a planning problem, the compiler of TM-LPSAT translates it into a CNF formula, which is fed into the LPSAT engine to find a solution corresponding to a plan for the planning problem. We have also experimented on our temporal metric encodings with another decision procedure, MathSAT, which deals with propositional combinations of linear constraints and Boolean variables. The results show that in terms of search time the SAT-based approach to temporal metric planning can be comparable to other planning approaches, and that there is plenty of room to push further the limits of the SAT-based approach.
Ph.D. Thesis 2004 Unsupervised Discovery of Extraction Patterns for Information Extraction Sudo, Kiyoshi Abstract | PDF

Title: Unsupervised Discovery of Extraction Patterns for Information Extraction

Candidate: Sudo, Kiyoshi

Advisor(s): Grishman, Ralph; Sekine, Satoshi

Abstract:

The task of Information Extraction (IE) is to find specific types of information in natural language text. In particular, *event extraction* identifies instances of a particular type of event or fact (a particular "scenario"), including the entities involved, and fills a database which has been pre-defined for the scenario. As the number of documents available on-line has multiplied, entity extraction has grown in importance for various applications, including tracking terrorist activities from newswire sources and building a database of job postings from the Web, to name a few.

Linguistic contexts, such as predicate-argument relationships, have been widely used as *extraction patterns* to identify the items to be extracted from the text. The cost of creating extraction patterns for each scenario has been a bottleneck limiting the portability of information extraction systems to different scenarios, although there has been some research on semi-supervised pattern discovery procedures to reduce this cost. The challenge is to develop a fully automatic method for identifying extraction patterns for a scenario specified by the user.

This dissertation presents a novel approach for the unsupervised discovery of extraction patterns for event extraction from raw text. First, we present a framework that allows the user to have a self-customizing information extraction system for his/her query: the Query-Driven Information Extraction (QDIE) framework. The input to the QDIE framework is the user's query: either a set of keywords or a narrative description of the event extraction task.

Second, we assess the improvement in extraction pattern models. By considering the shortcomings of the prior work based on predicate-argument models and their extensions, we propose a novel extraction pattern model that is based on arbitrary subtrees of dependency trees.

Third, we address the issue of portability across languages. As a case study of the QDIE framework, we implemented a pre-CODIE system, a Cross-Lingual On-Demand Information Extraction system requiring minimal human intervention, which incorporates the QDIE framework as a component for pattern discovery. In addition, we assess the role of machine translation in cross-lingual information extraction by comparing translation-based implementations.
Ph.D. Thesis 2004 An Efficient and High-Order Accurate Boundary Integral Solver for the Stokes Equations in Three Dimensional Complex Geometries Ying, Lexing Abstract | PDF

Title: An Efficient and High-Order Accurate Boundary Integral Solver for the Stokes Equations in Three Dimensional Complex Geometries

Candidate: Ying, Lexing

Advisor(s): Zorin, Denis

Abstract:

This dissertation presents an efficient and high-order boundary integral solver for the Stokes equations in complex 3D geometries. The targeted applications of this solver are the flow problems in domains involving moving boundaries. In such problems, traditional finite element methods involving 3D unstructured mesh generation experience difficulties. Our solver uses the indirect boundary integral formulation and discretizes the equation using the Nyström method.

Although our solver is designed for the Stokes equations, we show that it can be generalized to other constant coefficient elliptic partial differential equations (PDEs) with non-oscillatory kernels.

First, we present a new geometric representation of the domain boundary. This scheme takes quadrilateral control meshes with arbitrary geometry and topology as input, and produces smooth surfaces approximating the control meshes. Our surfaces are parameterized over several overlapping charts through explicit nonsingular C^∞ parameterizations, depend linearly on the control points, have fixed-size local support for basis functions, and have good visual quality.

Second, we describe a kernel independent fast multipole method (FMM) and its parallel implementation. The main feature of our algorithm is that it is based only on kernel evaluation and does not require the multipole expansions of the underlying kernel. We have tested our method on kernels from a wide range of elliptic PDEs. Our numerical results indicate that our method is efficient and accurate. Other advantages include the simplicity of the implementation and its immediate extension to other elliptic PDE kernels. We also present an MPI based parallel implementation which scales well up to thousands of processors.

Third, we present a framework to evaluate the singular integrals in our solver. A singular integral is decomposed into a smooth far field part and a local part that contains the singularity. The smooth part of the integral is integrated using the trapezoidal rule over overlapping charts, and the singular part is integrated in polar coordinates, which removes or decreases the order of singularity. We also describe a novel algorithm to integrate the nearly singular integrals coming from the evaluation at points close to the boundary.
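Why polar coordinates help can be seen in a simplified model case, assumed here purely for illustration and not taken from the thesis: a smooth density f integrated against a 1/|x| kernel over a flat disk D_R of radius R centered at the singular point. The area element in polar coordinates contributes a factor ρ that cancels the singularity, leaving a smooth integrand.

```latex
\int_{D_R} \frac{f(x)}{|x|}\, dA
  = \int_0^{2\pi}\!\!\int_0^{R}
      \frac{f(\rho\cos\theta,\ \rho\sin\theta)}{\rho}\,\rho \, d\rho\, d\theta
  = \int_0^{2\pi}\!\!\int_0^{R}
      f(\rho\cos\theta,\ \rho\sin\theta)\, d\rho\, d\theta .
```

For kernels with a stronger singularity, the same factor ρ lowers the order of the singularity by one rather than removing it.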
Ph.D. Thesis 2004 High Performance Data Mining in Time Series: Techniques and Case Studies Zhu, Yunyue Abstract | PDF

Title: High Performance Data Mining in Time Series: Techniques and Case Studies

Candidate: Zhu, Yunyue

Advisor(s): Shasha, Dennis

Abstract:

Note: A significantly improved and expanded description of this material is available in the book High Performance Discovery in Time Series, Springer Verlag, 2004, ISBN 0-387-00857-8.

As extremely large time series data sets grow more prevalent in a wide variety of settings, we face the significant challenge of developing efficient analysis methods. This dissertation addresses the problem of designing fast, scalable algorithms for the analysis of time series.

The first part of this dissertation describes the framework for high performance time series data mining based on important primitives. Data reduction transforms, such as the Discrete Fourier Transform, the Discrete Wavelet Transform, Singular Value Decomposition and Random Projection, can reduce the size of the data without substantial loss of information and therefore provide a synopsis of the data. Indexing methods organize data so that the time series data can be retrieved efficiently. Transformations on time series, such as shifting, scaling, time shifting, time scaling and dynamic time warping, facilitate the discovery of flexible patterns from time series.

The second part of this dissertation integrates the above primitives into useful applications ranging from music to physics to finance to medicine.

StatStream
StatStream is a system based on fast algorithms for finding the most highly correlated pairs of time series from among thousands of time series streams and doing so in a moving window fashion. It can be used to find correlations in time series in finance and in scientific applications.

HumFinder
Most people hum rather poorly. Nevertheless, somehow people have some idea what we are humming when we hum. The goal of the query by humming program, HumFinder, is to make a computer do what a person can do. Using pitch translation, time dilation, and dynamic time warping, one can match an inaccurate hum to a melody remarkably accurately.
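To make the matching idea concrete, here is a minimal sketch of textbook dynamic time warping combined with a simple pitch-translation step (subtracting the mean pitch). It is not HumFinder's implementation: the function names, the mean-centering choice, and the toy pitch sequences (MIDI-like note numbers) are assumptions for illustration, and the quadratic-time recurrence below omits the indexing and speed-up techniques the dissertation is about.

```python
def normalize_pitch(seq):
    """Remove the overall pitch level so a transposed hum can still match."""
    mean = sum(seq) / len(seq)
    return [p - mean for p in seq]

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: stretch a, stretch b, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

melody = [60, 62, 64, 62, 60]
other = [60, 60, 67, 67, 69]
# A hum of `melody`, sung three semitones higher and at half speed.
hum = [63, 63, 65, 65, 67, 67, 65, 65, 63, 63]

h = normalize_pitch(hum)
print(dtw_distance(h, normalize_pitch(melody)))
print(dtw_distance(h, normalize_pitch(other)))
```

In this toy case the hum aligns with the intended melody at essentially zero cost despite the change in key and tempo, while the unrelated melody remains clearly farther away.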

OmniBurst
Burst detection is the activity of finding abnormal aggregates in data streams. Our software, OmniBurst, can detect bursts of varying durations. Our example applications are monitoring gamma rays and stock market price volatility. The software makes use of a shifted wavelet structure to create a linear time filter that can guarantee that no bursts will be missed at the same time that it guarantees (under a reasonable statistical model) that the filter eliminates nearly all false positives.
Ph.D. Thesis 2003 Comparing and Improving Centralized and Distributed Techniques for Coordinating Massively Parallel Shared-Memory Systems Freudenthal, Eric Abstract | PDF

Title: Comparing and Improving Centralized and Distributed Techniques for Coordinating Massively Parallel Shared-Memory Systems

Candidate: Freudenthal, Eric

Advisor(s): Gottlieb, Allan

Abstract:

Two complementary approaches have been proposed to achieve high performance inter-process coordination on highly parallel shared-memory systems. Gottlieb et al. introduced the technique of combining concurrent memory references, thereby reducing hot spot contention and enabling the bottleneck-free execution of algorithms referencing a small number of shared variables. Mellor-Crummey and Scott introduced an alternative distributed local-spin technique that minimizes hot spot contention by not polling hot spot variables and exploiting the availability of processor-local shared memory. My principal contributions are a comparison of these two approaches, and significant improvements to the former.

The NYU Ultra3 prototype is the only system built that implements memory reference combining. My research utilizes micro-benchmark simulation studies of massively parallel Ultra3 systems executing coordination algorithms. This investigation detects problems in the Ultra3 design that result in higher-than-expected memory latency for reference patterns typical of busy-wait polling. This causes centralized coordination algorithms to perform poorly. Several architectural enhancements are described that significantly reduce the latency of these access patterns, thereby improving the performance of the centralized algorithms.

I investigate existing centralized algorithms for readers-writers and barrier coordination, all of which require fetch-and-add, and discover variants that require fewer memory accesses (and hence have shorter latency). In addition, my evaluation includes novel algorithms that require only a restricted form of fetch-and-add.

Coordination latency of these algorithms executed on the enhanced combining architecture is compared to the latency of the distributed local-spin alternatives. These comparisons indicate that the distributed local-spin dissemination barrier, which generates no hot spot traffic, has latency slightly inferior to the best centralized algorithms investigated. However, for the less structured readers-writers problem, the centralized algorithms significantly outperform the distributed local-spin algorithm.
Ph.D. Thesis 2003 Infrastructure Support for Accessing Network Services in Dynamic Network Environments Fu, Xiaodong Abstract | PDF

Title: Infrastructure Support for Accessing Network Services in Dynamic Network Environments

Candidate: Fu, Xiaodong

Advisor(s): Karamcheti, Vijay

Abstract:

Despite increases in network bandwidth, accessing network services across a wide area network still remains a challenging task. The difficulty mainly comes from the heterogeneous and constantly changing network environment, which usually causes undesirable user experience for network-oblivious applications.

A promising approach to address this is to provide network awareness in communication paths. While several such path-based infrastructures have been proposed, the network awareness provided by them is rather limited. Many challenging problems remain, in particular: (1) how to automatically create effective network paths whose performance is optimized for encountered network conditions, (2) how to dynamically reconfigure such paths when network conditions change, and (3) how to manage and distribute network resources among different paths and between different network regions. Furthermore, there is poor understanding of the benefits of using the path-based approach over other alternatives.

This dissertation describes solutions for these problems, built into a programmable network infrastructure called Composable Adaptive Network Services (CANS). The CANS infrastructure provides applications with network-aware communication paths that are automatically created and dynamically modified. CANS highlights four key mechanisms: (1) a high-level integrated type-based specification of components and network resources; (2) automatic path creation strategies; (3) system support for low-overhead path reconfiguration; and (4) distributed strategies for managing and allocating network resources.

We evaluate these mechanisms using experiments with typical applications running in the CANS infrastructure, and extensive simulation on a large scale network topology to compare with other alternatives. Experimental results validate the effectiveness of our approach, verifying that (1) the path-based approach provides the best and the most robust performance under a wide range of network configurations as compared to end-point or proxy-based alternatives; (2) automatic generation of network-aware paths is feasible and provides considerable performance advantages, requiring only minimal input from applications; (3) path reconfiguration strategies ensure continuous adaptation and provide desirable adaptation behaviors by using automatically generated paths; (4) both run-time overhead and reconfiguration time of CANS paths are negligible for most applications; (5) the resource management and allocation strategies allow effective setup of shared resource pools in the network and sharing of resources among paths.
Ph.D. Thesis 2003 Enriched Content: Concept, Architecture, Implementation, and Applications Hung-Hsien, Chang Abstract | PDF

Title: Enriched Content: Concept, Architecture, Implementation, and Applications

Candidate: Hung-Hsien, Chang

Advisor(s): Perlin, Ken

Abstract:

Since the debut of the World Wide Web, Web users have been facing the following problems:

[Extended Semantics]
When we read or study a digital document that we wish to explore further, typically, we interrupt our work to start a search. It costs time.

[Reverse Hyperlink]
When we visit a web page, we might be curious about what other hyperlinks point to the visited page. These links would most likely be of related interest. Can we get ``real time'' information about what other pages are pointing to this page?

[Version Control]
Many of us have been frustrated and even annoyed when the hyperlink that we follow gives us a ``404 not found'' or the retrieved webpage content is entirely different from the one we have bookmarked. Could we also have access to the past versions even if the hyperlink has been removed or the content has been changed?

[Composition Assistant]
Writing is not an easy task. We labor to structure a body of text, sort out ideas, find materials, and digest information. We would like an automated service that can associate the content we have produced with other contexts (on the Web) and bring these web contexts to us for reference.

In this thesis, we provide a unified framework and architecture, named enriched content, to resolve the above problems. We apply the architecture and show how the enriched content can be used in each application. We demonstrate that this method can be a new way of writing add-on functions for various document applications without having to write an individual plug-in for each application or rewrite each application. We also briefly discuss possible future development.
Ph.D. Thesis 2003 A framework for optimistic program optimization Pechtchanski, Igor Abstract | PDF

Title: A framework for optimistic program optimization

Candidate: Pechtchanski, Igor

Advisor(s): Goldberg, Benjamin

Abstract:

The problem of program optimization is a non-trivial one. Compilers do a fair job, but can't always deliver the best performance. The expressibility of general-purpose languages is limited, not allowing programmers to describe expected run time behavior, for example, and some programs are thus more amenable to optimization than others, depending on what the compiler expects to see. We present a generic framework that allows addressing this problem in two ways: through specifying verifiable source annotations to guide compiler analyses, and through optimistically using some assumptions and analysis results for the subset of the program seen so far. Two novel applications are presented, one for each of the above approaches: a dynamic optimistic interprocedural type analysis algorithm, and a mechanism for specifying immutability assertions. Both applications result in measurable speedups, demonstrating the feasibility of each approach.
Ph.D. Thesis 2003 Secure and Robust Censorship-Resistant Publishing Systems Waldman, Marc Abstract | PDF

Title: Secure and Robust Censorship-Resistant Publishing Systems

Candidate: Waldman, Marc

Advisor(s): Mazieres, David

Abstract:

In many cases, censoring documents on the Internet is a fairly simple task. Almost any published document can be traced back to a specific host, and from there to an individual responsible for the material. Someone wishing to censor this material can use the courts, threats, or other means of intimidation to compel the relevant parties to delete the material or remove the host from the network. Even if these methods prove unsuccessful, various denial of service attacks can be launched against a host to make the document difficult or impossible to retrieve. Unless a host's operator has a strong interest in preserving a particular document, removing it is often the easiest course of action.

A censorship-resistant publishing system allows an individual to publish a document in such a way that it is difficult, if not impossible, for an adversary to completely remove, or convincingly alter, a published document. One useful technique for ensuring document availability is to replicate the document widely on servers located throughout the world. However, replication alone does not block censorship. Replicas need to be protected from accidental or malicious corruption. In addition, a censorship-resistant publishing system needs to address a number of other important issues, including protecting the publisher's identity while simultaneously preventing storage flooding attacks by anonymous users.

This dissertation presents the design and implementation of two very different censorship-resistant publishing systems. The first system, Publius, is a web based system that allows an individual to publish, update, delete and retrieve documents in a secure manner. Publius's main contributions include an automatic tamper checking mechanism, a method for updating or deleting anonymously published content and methods for publishing anonymously hyperlinked content. The second system, Tangler, is a peer-to-peer based system whose contributions include a unique publication mechanism and a dynamic self-policing network. The benefits of this new publication mechanism include the automatic replication of previously published content and an incentive to audit the reliability with which servers store content published by other people. In part through these incentives, the self-policing network identifies and ejects servers that exhibit faulty behavior.
Ph.D. Thesis 2003 A Qualitative Profile-based Approach to Edge Detection Yen, Ting-jen Abstract | PDF

Title: A Qualitative Profile-based Approach to Edge Detection

Candidate: Yen, Ting-jen

Advisor(s): Yap, Chee

Abstract:

Edge detection is a fundamental problem of computer vision and has been widely investigated. We propose a new framework for edge detection based on edge profiles.

Our model, based on one-dimensional qualitative edge profile fitting and edge consistency, will produce one continuous edge from an initial seed point. A "profile" is defined as a finite cross-section of a two-dimensional image along a line segment. "Edge consistency" means that all the profiles on the same edge should be consistent.

Appropriate evaluation functions are needed for different types of edge profiles, such as step edges, ramp edges, etc. An evaluation function must meet the requirement that it will produce local minima at the positions where edges of a given type occur in the profile. Instead of subjective thresholding, image noise is measured statistically and used as a systematic way of filtering false edges. We describe our method as "qualitative edge profile fitting" because it is not based on arbitrary thresholding. Once an edge point is localized, it can be extended into an edge by matching compatible profiles. Two profiles are considered compatible as long as their average difference is within the noise measurement. Another feature of our approach is its subpixel accuracy. The utilization of profiles and noise-induced threshold selection makes tasks such as joining broken edges more objective.

We develop the necessary algorithms and implement them. Different evaluation functions are constructed for different edge models and experimented on different one-dimensional profiles. The edge detector, using these evaluation functions, is then examined using different images and under different noise conditions.
Ph.D. Thesis 2002 Expert-Driven Validation of Set-Based Data Mining Results Adomavicius, Gediminas Abstract | PDF

Title: Expert-Driven Validation of Set-Based Data Mining Results

Candidate: Adomavicius, Gediminas

Advisor(s): Tuzhilin, Alexander; Davis, Ernest

Abstract:

This dissertation addresses the problem of dealing with large numbers of set-based patterns, such as association rules and itemsets, discovered by data mining algorithms. Since many discovered patterns may be spurious, irrelevant, or trivial, one of the main problems is how to validate them, e.g., how to separate the ``good'' rules from the ``bad.'' Many researchers have advocated the explicit involvement of a human expert in the validation process. However, scalability becomes an issue when large numbers of patterns are discovered, since the expert cannot perform the validation on a pattern-by-pattern basis in a reasonable period of time. To address this problem, this dissertation describes a new expert-driven approach to set-based pattern validation.

The proposed validation approach is based on validation sequences, i.e., we rely on the expert's ability to iteratively apply various validation operators that can validate multiple patterns at a time, thus making the expert-based validation feasible. We identified the class of scalable set predicates called cardinality predicates and demonstrated how these predicates can be effectively used in the validation process, i.e., as a basis for validation operators. We examined various properties of cardinality predicates, including their expressiveness. We also have developed and implemented the set validation language (SVL) that can be used for manual specification of cardinality predicates by a domain expert. In addition, we have proposed and developed a scalable algorithm for set and rule grouping that can be used to generate cardinality predicates automatically.

The dissertation also explores various theoretical properties of sequences of validation operators and facilitates a better understanding of the validation process. We have also addressed the problem of finding optimal validation sequences and have shown that certain formulations of this problem are NP-complete. In addition, we provided some heuristics for addressing this problem.

Finally, we have tested our rule validation approach on several real-life applications, including personalization and bioinformatics applications.
Ph.D. Thesis 2002 Responsive Thinwire Visualization of Large Geographic Datasets Been, Kenneth Abstract | PDF

Title: Responsive Thinwire Visualization of Large Geographic Datasets

Candidate: Been, Kenneth

Advisor(s): Yap, Chee

Abstract:

This thesis describes a web-based, responsive, zooming and panning visualization system for a full-featured geographic description of the United States. Current web-based map servers provide, from a visualization standpoint, little more than one static image per page, with hyperlinks for navigation; continuous zooming and panning requires locally stored data. Our primary contribution is a multi-threaded, scalable and responsive client-server architecture that responds to user requests as naturally and quickly as possible, regardless of network bandwidth reliability. This architecture can be generalized for use in other applications, including non-geographic ones. To this we add a scalable and flexible user interface for navigation of multi-scale geographic data, with intuitive zooming and panning, pop-up feature labels, and a user controlled tree-hierarchy of windows. We build software tools and algorithms for translating the U.S. Census Bureau's TIGER data into a format designed for speedy database retrieval and network delivery, and for generalizing the data into multiple levels of detail. Because of anomalies in the TIGER data, this processing requires some human intervention.
Ph.D. Thesis 2002 Representing and Modifying Complex Surfaces Biermann, Henning Abstract | PDF

Title: Representing and Modifying Complex Surfaces

Candidate: Biermann, Henning

Advisor(s): Zorin, Denis

Abstract:

The increasing demand for highly detailed geometric models poses new and important problems in computer graphics and geometric modeling. Applications for complex models range from geometric design and scientific simulations to feature movies and video games.

We focus on the fundamental problem of creating and manipulating complex surface models. We address the problem by designing an efficient and general surface representation, and develop algorithms for efficient modification of surfaces represented in this form. Our surface representation extends existing subdivision-based representations with explicit representation of sharp features and boundaries, which is crucial in many computer-aided design applications.

We consider two types of surface modifications: boolean operations on solids bounded by surfaces, and surface pasting. For boolean operations, our technique rapidly and robustly computes an approximate result rather than aiming for the precise solution. At the same time, our approach allows one to trade speed for accuracy, and, in most cases, compute the result with any desired accuracy. The second type of editing operation we consider addresses the problem of transferring geometric features between different objects. Our technique makes it easy to combine geometric data from various sources (e.g. 3D scanning, CAGD model) into a single model.
Ph.D. Thesis 2002 On computing the Pareto-optimal solution set in a large scale dynamic network Daruwala, Raoul-Sam Abstract | PDF

Title: On computing the Pareto-optimal solution set in a large scale dynamic network

Candidate: Daruwala, Raoul-Sam

Advisor(s): Mishra, Bud

Abstract:

Let G=(V,E) be a graph with time-dependent edges where the cost of a path p through the graph is determined by a vector function F(p)=[f_1(p),f_2(p),...,f_n(p)], where f_1,f_2,...,f_n are independent objective functions. When n>1 there is no clear idea of what a ``best'' solution is; instead we turn to the idea of Pareto-optimality to define the efficiency of a path. Given the set of paths P through the network, a path p' is Pareto-optimal if for every p in P for all the objective functions (f_i(p) >= f_i(p')).
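
To illustrate Pareto-optimality, the following minimal Python sketch filters a small set of candidate paths down to the non-dominated ones. It uses the standard dominance test (a path is dropped if some other path is no worse on every objective and strictly better on at least one), which is an assumption here and not necessarily the thesis's exact formulation. The function names and the sample cost vectors are hypothetical, and the brute-force pairwise comparison is not the thesis's algorithm.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if cost vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_optimal(costs: Dict[str, Sequence[float]]) -> List[str]:
    """Return (sorted) names of paths whose cost vectors no other path dominates."""
    return sorted(
        name
        for name, cost in costs.items()
        if not any(
            dominates(other, cost)
            for other_name, other in costs.items()
            if other_name != name
        )
    )


# Hypothetical itineraries: (travel time in minutes, fare in dollars).
paths = {
    "express": (30, 9.0),
    "local": (55, 3.0),
    "transfer": (45, 5.0),
    "detour": (60, 6.0),  # slower and pricier than "transfer", so dominated
}
front = pareto_optimal(paths)
print(front)
```

The pairwise check costs O(|P|^2) comparisons and enumerates all paths up front, so it only demonstrates the definition; it does not exploit any structure of the network.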

The problem of planning itineraries on a transportation system involves computing the set of optimal paths through a time-dependent network where the cost of a path is determined by more than one, possibly non-linear and non-additive, cost function. This thesis introduces an algorithmic toolkit for finding the set of Pareto-optimal paths in time-dependent networks in the presence of multiple objective functions.

Multi-criteria path optimization problems are known to be NP-hard. However, by exploiting geometric and periodic properties of the dynamic graphs that model transit networks, we show that it is possible to compute the Pareto-optimal solution sets rapidly without using heuristics. We show that we can solve the itinerary problem in the presence of response time constraints for a large-scale graph.
Ph.D. Thesis 2002 Informative Features in Vision and Learning Rudra, Archisman Abstract | PDF

Title: Informative Features in Vision and Learning

Candidate: Rudra, Archisman

Advisor(s): Geiger, Davi

Abstract:

We explore the role of features in solving problems in computer vision and learning. Features capture important domain-dependent knowledge and are fundamental in simplifying problems. Our goal is to consider the universal features of the problem concerned, and not just particular algorithms used in its solution. Such an approach reveals only the fundamental difficulties of any problem. For most problems we will face a host of other specialized concerns. Therefore, we consider simplified problems which capture the essence of our approach.

This thesis consists of two parts. First, we explore means of discovering features. We come up with an information-theoretic criterion to identify features which has deep connections to statistical estimation theory. We consider features to be ``nice'' representations of objects. We find that, ideally, a feature space representation of an image is the most concise representation of an image which captures all available information in it. In practice, however, we are satisfied with an approximation to it. Therefore, we explore a few such approximations and explain their connection to the information-theoretic approach. We look at the algorithms which implement these approximations and look at their generalizations in the related field of stereo vision.

Using features, whether they come from some feature-discovery algorithm or are hand crafted, is usually an ad hoc process which depends on the actual problem, and the exact representation of features. This diversity mostly arises from the multitude of ways features capture information. In the second part of this thesis, we come up with an architecture which lets us use features in a very flexible way, in the context of content-addressable memories. We apply this approach to two radically different domains, face images and English words. We also look at human performance in reconstructing words from fragments, which gives us some information about the memory subsystem in human beings.
Ph.D. Thesis 2001 Knowledge Discovery in Databases for Intrusion Detection, Disease Classification and Beyond Berger, Gideon Abstract | PDF

Title: Knowledge Discovery in Databases for Intrusion Detection, Disease Classification and Beyond

Candidate: Berger, Gideon

Advisor(s): Mishra, Bud

Abstract:

As the number of networked computers grows and the amount of sensitive information available on them grows as well, there is an increasing need to ensure the security of these systems. The security of computer networks is not a new issue. We have dealt with the need for security for a long time with such measures as passwords and encryption. These will always provide an important initial line of defense. However, given a clever and malicious individual, these defenses can often be circumvented. Intrusion detection is therefore needed as another way to protect computer systems. This thesis describes a novel three-stage algorithm for building classification models in the presence of non-stationary, temporal, high-dimensional data, in general, and for detecting network intrusions, in particular. Given a set of training data records, the algorithm begins by identifying "interesting" temporal patterns in this data using a modal logic. This approach is distinguished from other work in this area where frequent patterns are identified. We show that when frequency is replaced by our measure of "interestingness" the problem of finding temporal patterns is NP-complete. We then offer an efficient heuristic approach that has proven effective in experiments. Having identified interesting patterns, these patterns then become the predictor variables in the construction of a Multivariate Adaptive Regression Splines (MARS) model. This approach will be justified, after surveying other methods for solving the classification problem, by its ability to capture complex nonlinear relationships between the predictor and response variables, which is comparable to other heuristic approaches such as neural networks and classification trees, while offering improved computational properties such as rapid convergence and interpretability.
After considering a variety of approaches to the problems of over-fitting, which is inherent when modeling high-dimensional data, and non-stationarity, we describe our approach to addressing these issues through the use of truncated Stein shrinkage. This approach is motivated by showing the inadmissibility of the maximum likelihood estimator (MLE) for high-dimensional (dimension >= 3) data. We then discuss the application of our approach as participants in the 1999 DARPA Intrusion Detection Evaluation, where we were able to exhibit the benefits of our approach. Finally, we suggest another area of research where we believe that our work would meet with similar success, namely, the area of disease classification.
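
Shrinkage of this kind can be illustrated with the positive-part James-Stein estimator, a standard truncated form of Stein shrinkage. The Python sketch below assumes a single observation x drawn from N(theta, sigma^2 I) with known variance and shrinks toward the origin. Whether this matches the truncated estimator used in the thesis is an assumption, and the function name is hypothetical.

```python
from typing import List


def positive_part_james_stein(x: List[float], sigma2: float = 1.0) -> List[float]:
    """Shrink observation x toward the origin (assumes x ~ N(theta, sigma2 * I))."""
    p = len(x)
    if p < 3:
        # The estimator only improves on the MLE when the dimension is at least 3.
        raise ValueError("James-Stein shrinkage needs dimension >= 3")
    norm2 = sum(v * v for v in x)
    if norm2 == 0.0:
        return list(x)
    # Truncating the factor at zero prevents the estimate from flipping sign.
    factor = max(0.0, 1.0 - (p - 2) * sigma2 / norm2)
    return [factor * v for v in x]


shrunk = positive_part_james_stein([3.0, 4.0, 0.0])    # squared norm 25, factor 0.96
collapsed = positive_part_james_stein([0.5, 0.5, 0.5])  # squared norm 0.75, factor 0
print(shrunk, collapsed)
```

The untruncated estimate, with the observation x itself being the MLE, would multiply the second input by a negative factor; the truncation at zero is what the "positive-part" name refers to.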
Ph.D. Thesis 2001 Algorithms for Rendering in Artistic Styles Hertzmann, Aaron Abstract | PDF

Title: Algorithms for Rendering in Artistic Styles

Candidate: Hertzmann, Aaron

Abstract:

We describe new algorithms and tools for generating paintings, illustrations, and animation on a computer. These algorithms are designed to produce visually appealing and expressive images that look hand-painted or hand-drawn. In many contexts, painting and illustration have many advantages over photorealistic computer graphics, in aspects such as aesthetics, expression, and computational requirements. We explore three general strategies for non-photorealistic rendering:

First, we describe explicit procedures for placing brush strokes. We begin with a painterly image processing algorithm inspired by painting with real physical media. This method produces images with a much greater subjective impression of looking hand-made than do earlier methods. By adjusting algorithm parameters, a variety of styles can be generated, such as styles inspired by the Impressionists and the Expressionists. This method is then extended to processing video, as demonstrated by painterly animations and an interactive installation. We then present a new style of line art illustration for smooth 3D surfaces. This style is designed to clearly convey surface shape, even for surfaces without predefined material properties or hatching directions.

Next, we describe a new relaxation-based algorithm, in which we search for the painting that minimizes some energy function. In contrast to the first approach, we ideally only need to specify what we want, not how to directly compute it. The system allows as fine user control as desired: the user may interactively change the painting style, specify variations of style over an image, and/or add specific strokes to the painting.

Finally, we describe a new framework for processing images by example, called ``image analogies.'' Given an example of a painting or drawing (e.g. scanned from a hand-painted source), we can process new images with some approximation to the style of the painting. In contrast to the first two approaches, this allows us to design styles without requiring an explicit technical definition of the style. The image analogies framework supports many other novel image processing operations.
Ph.D. Thesis 2001 Region-based Register Allocation for EPIC Architectures Kim, Hansoo Abstract | PDF

Title: Region-based Register Allocation for EPIC Architectures

Candidate: Kim, Hansoo

Advisor(s): Palem, Krishna

Abstract:

Instruction-level parallelism (ILP) is a family of processor and compiler design techniques that speed up execution by allowing individual machine operations to execute in parallel. Explicitly Parallel Instruction Computing (EPIC) processors evolved in an attempt to achieve high levels of ILP without the hardware complexity. In EPIC processors most of the functions to extract ILP are performed by the compiler. To take advantage of the higher levels of ILP of these architectures, the ILP compiler must use aggressive ILP techniques. This opportunity for improved performance comes at the price of increased compilation time.

When the size of the compilation unit is limited, the compilation time can be reduced. But the limited scope of compilation may restrict the scope of optimization. As a result, the compiler may generate less efficient code. Ideally, we want a smaller compilation time and the same or better execution time than that obtained using the global approach.

In this thesis, we address the problem of the compilation time and execution performance trade-off in region-based compilation within the context of the key optimization of register allocation. We demonstrate that schemes designed for region-based allocation perform as well as or even better than schemes designed for global allocation while having smaller compilation time. To achieve this goal, we propose several innovative techniques which form the core of this thesis.

We show considerable compilation time savings with comparable execution time performance by synthesizing our techniques in a region-based register allocation. We also explore the relation between the performance of the register allocation and the region size and quantify it. Our research shows that selecting the right region size has an important impact on the performance of register allocation. We propose the concept of restructuring the regions based on register pressure and discuss how we can estimate the register pressure in order to improve compilation time while maintaining the execution time.
Ph.D. Thesis 2001 Adversarial Reasoning: A Logical Approach for Computer Go Klinger, Tamir Abstract | PDF

Title: Adversarial Reasoning: A Logical Approach for Computer Go

Candidate: Klinger, Tamir

Advisor(s): Davis, Ernest

Abstract:

Go is a board game with simple rules but complex strategy requiring ability in almost all aspects of human reasoning. A good Go player must be able to hypothesize moves and analyze their consequences; to judge which areas are relevant to the analysis at hand; to learn from successes and failures; to generalize that knowledge to other ``similar'' situations; and to make inferences from knowledge about a position.

Unlike computer chess, which has seen steady progress since Shannon's [23] and Turing's [24] original papers on the subject, progress on computer Go remains in relative infancy. In computer chess, minimax search with α-β pruning based on a simple evaluation function can beat a beginner handily. No such simple evaluation function is known for Go. To accurately evaluate a Go position requires knowledge of the life and death status of the points on the board. Since the player with the most live points at the end of the game wins, a small mistake in this analysis can be disastrous.
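
Minimax search with alpha-beta pruning, the standard game-tree technique for chess-like games, can be sketched in a few lines of Python. This is a generic textbook version over a hypothetical hand-built tree (nested lists with integer leaf scores), not code from the dissertation; the `visited` list exists only to show which leaves are actually evaluated.

```python
from typing import List


def alphabeta(node, alpha: float, beta: float, maximizing: bool, visited: List[int]):
    """Minimax value of node; subtrees that cannot change the result are skipped."""
    if isinstance(node, int):  # leaf: a static evaluation score
        visited.append(node)
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # the minimizer already has a better option elsewhere
                break
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:  # the maximizer already has a better option elsewhere
            break
    return value


# Hypothetical two-ply tree: the root maximizes, its three children minimize.
tree = [[3, 5], [2, 9], [0, 1]]
visited: List[int] = []
best = alphabeta(tree, float("-inf"), float("inf"), True, visited)
print(best, visited)
```

In this tree the leaves 9 and 1 are never evaluated: once the second and third subtrees are known to be worth at most 2 and 0, they cannot beat the 3 already secured from the first.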

In this dissertation we describe the design, performance, and underlying logic of a knowledge-based program that solves life and death problems in the game of Go. Our algorithm applies life and death theory coupled with knowledge about which moves are reasonable for each relevant goal in that theory to restrict the search space to a tractable size. Our results show that simple depth-first search armed with a goal theory and heuristic move knowledge yields very positive results on standard life and death test problems, even without sophisticated move ordering heuristics.

In addition to a description of the program and its internals, we present a modal logic useful for describing strategic theories in games and use it to give a life and death theory and to formally state the rules of Go. We also give an axiomatization for this logic using the modal μ-calculus [15] and prove some basic theorems of the system.
Ph.D. Thesis 2001 Machine Level Optimizations for High Level Languages Leung, Allen Abstract | PDF

Title: Machine Level Optimizations for High Level Languages

Candidate: Leung, Allen

Advisor(s): Palem, Krishna

Abstract:

Two machine instruction level compiler optimization problems are considered in this work.

The first problem is time-constrained instruction scheduling, i.e., finding optimal schedules for machine code in the presence of time constraints such as release-times and deadlines. These types of time constraints appear naturally in embedded applications, and also as a side effect of many other compiler optimization problems. While the general problem is NP-hard, we have developed a new algorithm which can optimally handle many P-time solvable sub-instances. In fact, we show that almost all previous algorithms in this related area can be seen as an instance of the priority computation scheme that we have developed. Our work extends and unifies many algorithmic results in classical deterministic scheduling theory related to release-times, deadlines and pipeline latencies.

The second problem that we investigate in this work is scalar optimizations in machine code. We present a new framework that utilizes static single assignment form (SSA) at the level of individual machine instructions. Complementing the framework, we have also developed new SSA construction algorithms which are faster than previous algorithms, and are very simple to implement.
Ph.D. Thesis 2001 Exact Geometric Computation: Theory and Applications Li, Chen Abstract | PDF

Title: Exact Geometric Computation: Theory and Applications

Candidate: Li, Chen

Abstract:
This dissertation explores the theory and applications of Exact Geometric Computation (EGC), a general approach to robust geometric computing. The contributions of this thesis are organized into three parts.

A fundamental task in EGC is to support exact comparison of algebraic expressions. This leads to the problem of constructive root bounds for algebraic expressions. Such root bounds determine the worst-case complexity of exact comparisons. In the first part, we present a new constructive root bound which, compared to previous bounds, can give dramatically better performance in many common computations involving divisions and radical roots. We also improve the well-known degree-measure bound by exploiting the sharing of common sub-expressions.

In the second part, we discuss the design and implementation of the Core Library, a C++ library which embraces the EGC approach to robust numerical and geometric computation. Our design emphasizes ease of use and facilitates the rapid development of robust geometric applications. It allows non-specialist programmers to add robustness into new or existing applications with little extra effort. A number of efficiency and implementation issues are investigated. Although focused on geometric computation, the EGC techniques and software we developed can be applied to other areas where it is critical to guarantee numerical precision.

In the third part, we introduce a new randomized test for the vanishing of multivariate radical expressions. With this test, we develop a probabilistic approach to proving elementary geometry theorems about ruler-and-compass constructions. A probabilistic theorem prover based on this approach has been implemented using the Core Library. We present some empirical data.
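
To make the motivation for exact comparison concrete, here is a small Python sketch of an orientation predicate evaluated with exact rational arithmetic. It does not use the Core Library or any root bound (rational inputs need none); Python's `fractions.Fraction` merely stands in for an exact number type, and the function name is hypothetical.

```python
from fractions import Fraction


def orientation(ax, ay, bx, by, cx, cy) -> int:
    """Sign of the signed area of triangle abc: +1 left turn, -1 right turn, 0 collinear."""
    det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    return (det > 0) - (det < 0)


# Three points on the line y = x, parsed as exact decimals.
coords = [Fraction(s) for s in ("0.1", "0.1", "0.2", "0.2", "0.3", "0.3")]
exact_sign = orientation(*coords)

# Binary floating point cannot represent these inputs exactly, so even a
# simple equality between them is decided incorrectly.
float_sum_matches = (0.1 + 0.2 == 0.3)

print(exact_sign, float_sum_matches)
```

With exact arithmetic the predicate reports the three points as collinear. A geometric algorithm that branches on such signs stays consistent only if every sign is decided correctly, which is the guarantee EGC aims to provide for more general algebraic expressions.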
Ph.D. Thesis 2001 An On-Line Handwriting Recognizer with Fisher Matching, Hypotheses Propagation Network and Context Constraint Models Oh, Jong Abstract | PDF

Title: An On-Line Handwriting Recognizer with Fisher Matching, Hypotheses Propagation Network and Context Constraint Models

Candidate: Oh, Jong

Advisor(s): Geiger, Davi

Abstract:

We have developed an on-line handwriting recognition system. Our approach integrates local bottom-up constructs with a global top-down measure into a modular recognition engine. The bottom-up process uses local point features for hypothesizing character segmentations and the top-down part performs shape matching for evaluating the segmentations. The shape comparison, called Fisher segmental matching, is based on Fisher's linear discriminant analysis. The component character recognizer of the system uses two kinds of Fisher matching based on different representations and combines the information to form the multiple experts paradigm.

Along with efficient ligature modeling, the segmentations and their character recognition scores are integrated into a recognition engine termed Hypotheses Propagation Network (HPN), which runs a variant of the topological sort algorithm for graph search. The HPN improves on the conventional Hidden Markov Model and the Viterbi search by using the more robust mean-based scores for word level hypotheses and keeping multiple predecessors during the search.

We have also studied and implemented a geometric context modeling termed Visual Bigram Modeling that improves the accuracy of the system's performance by taking the geometric constraints into account, in which the component characters in a word can be formed in relation with the neighboring characters. The result is a shape-oriented system that is robust with respect to local and temporal features, modular in construction, and offers a rich range of opportunities for further extensions.
Ph.D. Thesis 2001 Continuous Model for Salient Shape Selection and Representation Pao, Hsing-Kuo (Kenneth) Abstract | PDF

Title: Continuous Model for Salient Shape Selection and Representation

Candidate: Pao, Hsing-Kuo (Kenneth)

Advisor(s): Geiger, Davi

Abstract:

We propose a new framework for shape representation and salient shape selection. The framework is considered a low- to middle-level vision process. It can be applied to various topics, including figure/ground separation, searching for the shape axis, junction detection, and illusory figure finding. The model construction is inspired by the Gestalt studies, which suggest that proximity, convexity, similarity, good continuation, closure, symmetry, etc., are useful for figure/ground separation and visual organization construction. First, we quantify those attributes for (completed or partial) shapes by our distributed systems. The shape is then evaluated and represented by those results. In particular, shape convexity, rather than other shape attributes such as the symmetry axis or size, which were well studied before, is emphasized in our discussion. Our problem is posed in a continuous manner. For shape convexity, unlike the conventional mathematical definition, we aim to derive a definition that describes one shape as "more convex" or "less convex" than another. In searching for the shape axis, rather than binary information telling whether a point is on or off an axis, we obtain continuous information, which distinguishes axes with "stronger" or "weaker" declarations. An easy and natural pruning scheme can be applied to such a representation. For junction detection, we do not assume any artificial threshold. Instead, the transition from low-curvature to high-curvature curves, or to curves with discontinuities, is shown by our representation. The model is based on a variational approach, given by minimizing the data fitting error as well as the neighborhood discrepancy. Two models are proposed: the decay diffusion process and the orientation diffusion process.
Ph.D. Thesis 2001 Language Support for Program Generation: Reasoning, Implementation, and Applications Yang, Zhe Abstract | PDF

Title: Language Support for Program Generation: Reasoning, Implementation, and Applications

Candidate: Yang, Zhe

Advisor(s): Danvy, Olivier; Goldberg, Benjamin

Abstract:

This dissertation develops programming languages and associated techniques for sound and efficient implementations of algorithms for program generation.

First, we develop a framework for practical two-level languages. In this framework, we demonstrate that two-level languages are not only a good tool for describing program-generation algorithms, but a good tool for reasoning about them and implementing them as well. We pinpoint several general properties of two-level languages that capture common proof obligations of program-generation algorithms:
- To prove that the generated program behaves as desired, we use an erasure property to reduce the two-level proof obligation to a simpler one-level proof obligation.
- To prove that the generated program satisfies certain syntactic constraints, we use a type-preservation property for a refined type system that enforces these constraints.
In addition, to justify concrete implementations, we use a native embedding of a two-level language into a one-level language.

We present two-level languages with these properties both for a call-by-name object language and for a call-by-value object language with computational effects, and demonstrate them through two classes of non-trivial applications: one-pass transformations into continuation-passing style and type-directed partial evaluation for call-by-name and for call-by-value.

Next, to facilitate implementations, we develop several general approaches to programming with type-indexed families of values within the popular Hindley-Milner type system. Type-indexed families provide a form of type dependency, which is employed by many algorithms that generate typed programs, but is absent from mainstream languages. Our approaches are based on type encodings, so that they are type safe. We demonstrate and compare them through a host of examples, including type-directed partial evaluation and printf-style formatting.

Finally, building on the two-level framework and type-encoding techniques, we recast joint work with Bernd Grobauer, in which we formally derived a suitable self-application for type-directed partial evaluation and achieved automatic compiler generation.
Ph.D. Thesis 2000 SETL for Internet Data Processing Bacon, David Abstract | PDF

Title: SETL for Internet Data Processing

Candidate: Bacon, David

Advisor(s): Schwartz, Jack

Abstract:

Although networks and coordinated processes figure prominently in the kinds of data manipulation found in everything from scientific modeling to large-scale data mining, programmers charged with setting up the requisite software systems frequently find themselves hampered by the inadequacy of available languages. The "real" languages such as C++ and Java tend to be low-level, requiring the specification of a great deal of often repetitive detail, whereas the higher-level "scripting" languages tend to lack the kinds of structuring facilities that lend themselves to the reliable construction of even modestly large systems.

The high-level language SETL meets both of these needs. Originally conceived as a language which aimed to bring programming a little closer to the idealized world of mathematics, making it extremely useful in the human-to-human communication of algorithms, SETL has proven itself over the years to be an excellent language for software prototyping, primarily because its conciseness and immediacy lend it well to rapid experimentation. These characteristics, together with its general freedom from machine-oriented restrictions, its value semantics, its comprehension-style constructors for aggregates, its skill with strings, and especially its syntactic support for mappings, also make it well suited to high-level data processing.

In order to play the role of a full-fledged modern data processing language, however, SETL had to acquire the ability to manipulate processes and communicate with them easily, and furthermore to be able to work with networks, particularly the client-server model that rules the Internet. Accordingly, I have integrated a full set of process and network management features into SETL. In my dissertation, I show how the liberal use of fullweight processes, with the high, protective walls that surround them, sustains a modular design approach which in turn provides a strong defense against the main hazards of distributed computing, namely race conditions and deadlock, while preserving the luxury and convenience of programming in a truly high-level language. To this end, I have evolved protocols and design patterns for developing multiplexing servers and clients in SETL, and in my talk, will present examples of fairly complex systems where hierarchies of processes communicate over the network. Such systems tend to be notorious for their unreliability, but in these instances, robustness seems to follow naturally from the readability of simple programs written in an ancient and friendly language.
Ph.D. Thesis 2000 A Rigorous Framework for Fully Supporting the IEEE Standard for Floating-Point Arithmetic in High-Level Programming Languages Figueroa, Sam Abstract | PDF

Title: A Rigorous Framework for Fully Supporting the IEEE Standard for Floating-Point Arithmetic in High-Level Programming Languages

Candidate: Figueroa, Sam

Advisor(s): Dewar, Robert

Abstract:

Processors conforming to the IEEE Standard for Floating-Point Arithmetic have been commonplace for some years, and now several programming languages seem to support or conform to this standard, from here on referred to as "the IEEE Standard." For example, The Java Language Specification by Gosling, Joy, and Steele, which defines the Java language, frequently mentions the IEEE Standard. Indeed, Java, as do other languages, supports some of the features of the IEEE Standard, including a couple of floating-point data formats, and even requires (in section 4.2.4, "Floating-Point Operations," of the aforementioned book) that "operators on floating-point numbers behave exactly as specified by IEEE 754."

Arguing that the support current languages offer is not enough, this thesis establishes clear criteria for what it means to fully support the IEEE Standard in a programming language. Each aspect of the IEEE Standard is examined in detail from the point of view of how various arithmetic engines implement that aspect of the IEEE Standard, how different languages (and implementations thereof) support it, and what the range of options is in supporting that aspect. Practical recommendations are then offered (particularly, but not exclusively, for Ada and Java), taking, for example, programmer convenience and impact on performance into consideration. A detailed model specification following these recommendations is provided for the Ada language.

In addition, a variety of issues related to the floating-point aspects of programming languages are discussed, so as to serve as a more complete guide to language designers. One such issue is floating-point expression evaluation schemes, and, more specifically, whether bit-for-bit identical results are actually achievable on a variety of platforms that conform to the IEEE Standard, as the Java language promises. Closely tied to this issue is that of double rounding, which occurs when a (possibly intermediate) result is rounded more than once before subsequent use or before being delivered to its final destination. So this thesis discusses when double rounding makes a difference, how it can be avoided, and what the performance impact is in avoiding it.
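
The effect of double rounding can be seen in a small decimal analogue. The Python sketch below rounds a value to an integer once, and then again via an intermediate one-decimal result, using round-half-even in both cases. The helper name `round_to` and the choice of decimal rather than binary arithmetic are assumptions made for readability; the thesis itself concerns binary IEEE formats.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_to(value: Decimal, quantum: str) -> Decimal:
    """Round value to the precision of quantum using round-half-even."""
    return value.quantize(Decimal(quantum), rounding=ROUND_HALF_EVEN)


x = Decimal("9.46")

# Single rounding: 9.46 is closer to 9 than to 10.
rounded_once = round_to(x, "1")

# Double rounding: 9.46 -> 9.5 (one decimal), then 9.5 -> 10 (tie goes to even).
rounded_twice = round_to(round_to(x, "0.1"), "1")

print(rounded_once, rounded_twice)  # prints "9 10"
```

The intermediate rounding turns a value that was clearly below the midpoint into an exact tie, which is how the second rounding ends up on the wrong side.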
Ph.D. Thesis 2000 A Language-Theoretic Approach to Algorithms Goyal, Deepak Abstract | PDF

Title: A Language-Theoretic Approach to Algorithms

Candidate: Goyal, Deepak

Advisor(s): Paige, Bob

Abstract:

An effective algorithm design language should be 1) wide-spectrum in nature, i.e. capable of expressing both abstract specifications and low-level implementations, and 2) "computationally transparent", i.e. facilitate accurate estimation of time and space requirements. The conflict between these requirements is exemplified by SETL which is wide-spectrum, but lacks computational transparency because of its reliance on hash-based data structures. The first part of this thesis develops an effective algorithm design language, and the second demonstrates its usefulness for algorithm explanation and discovery.

In the first part three successively more abstract set-theoretic languages are developed and shown to be computationally transparent. These languages can collectively express both abstract specifications and low-level implementations. We formally define a data structure selection method for these languages using a novel type system. Computational transparency is obtained for the lowest-level language through the type system, and for the higher-level languages by transformation into the next lower level. We show the effectiveness of this method by using it to improve a difficult database query optimization algorithm from expected to worst-case linear time. In addition, a simpler explanation and a shorter proof of correctness are obtained.

In the second part we show how our data structure selection method can be made an effective third component of a transformational program design methodology whose first two components are finite differencing and dominated convergence. Finite differencing replaces costly repeated computations by cheaper incremental counterparts, and dominated convergence provides a generalized iteration scheme for computing fixed-points. This methodology has led us to a simpler explanation of a complex linear-time model-checking algorithm for the alternation-free modal mu-calculus, and to the discovery of an O(N³) time algorithm for computing intra-procedural may-alias information that improves over an existing O(N⁵) time algorithm.
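
The idea behind finite differencing can be shown on a toy problem. In the Python sketch below, the naive version recomputes each window sum from scratch, while the differenced version maintains the sum incrementally as the window slides. The function names and the sliding-window task are illustrative assumptions and do not come from the thesis.

```python
def window_sums_naive(xs, k):
    """Recompute every length-k window sum from scratch: O(n * k) time."""
    return [sum(xs[i:i + k]) for i in range(len(xs) - k + 1)]


def window_sums_incremental(xs, k):
    """Maintain the window sum under small updates: O(n) time.

    Assumes k >= 1.
    """
    if k > len(xs):
        return []
    total = sum(xs[:k])          # establish the invariant once
    sums = [total]
    for i in range(k, len(xs)):
        # Update the invariant by the difference instead of recomputing it.
        total += xs[i] - xs[i - k]
        sums.append(total)
    return sums
```

Both functions return the same list; only the cost of keeping the repeated expression up to date differs.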
Ph.D. Thesis 2000 Supporting a Flexible Parallel Programming Model on a Network of Non-Dedicated Workstations Huang, Shih-Chen Abstract | PDF

Title: Supporting a Flexible Parallel Programming Model on a Network of Non-Dedicated Workstations

Candidate: Huang, Shih-Chen

Advisor(s): Kedem, Zvi

Abstract:

A network of non-dedicated workstations can provide computational resources at minimal or no additional cost. If harnessed properly, the combined computational power of these otherwise "wasted" resources can outperform even mainframe computers. Performing demanding computations on a network of non-dedicated workstations efficiently has previously been studied, but inadequate handling of the unpredictable behavior of the environment and possible failures resulted in limited success only.

This dissertation presents a shared memory software system for executing programs with nested parallelism and synchronization on a network of non-dedicated workstations. The programming model exhibits a very convenient and natural programming style and is especially suitable for computations whose complexity and parallelism emerge only during their execution, such as divide-and-conquer problems. To both support and take advantage of the flexibility inherent in the programming model, an architecture that distributes both the shared memory management and the computation is developed. This architecture removes bottlenecks inherent in centralization, thus enhancing scalability and dependability. By adapting to available resources dynamically and coping with unpredictable machine slowdowns and failures, the system also supports dynamic load balancing and fault tolerance, both transparently to the programmer.
Ph.D. Thesis 2000 Global Optimization Using Embedded Graphs Ishikawa, Hiroshi Abstract | PDF

Title: Global Optimization Using Embedded Graphs

Candidate: Ishikawa, Hiroshi

Advisor(s): Geiger, Davi

Abstract:

One of the challenges of computer vision is that the information we seek to extract from images is not even defined for most images. Because of this, we cannot hope to find a simple process that produces the information directly from a given image. Instead, we need a search, or an optimization, in the space of parameters that we are trying to estimate.

In this thesis, I introduce two new optimization methods that use graph algorithms. They are characterized by their ability to find a global optimum efficiently. Each method defines a graph that can be seen as embedded in a Euclidean space. Graph-theoretic entities such as cuts and cycles represent geometric objects that embody the information we seek.

The first method finds a hypersurface in a Euclidean space that minimizes a certain kind of energy functional. The hypersurface is approximated by a cut of an embedded graph so that the total cost of the cut corresponds to the energy. A globally optimal solution is found by using a minimum cut algorithm. In particular, it can globally solve first order Markov Random Field problems in more generality than was previously possible. I prove that the convexity of the smoothing function in the energy is essential for the applicability of the method and provide an exact criterion in terms of the MRF energy.
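
The correspondence between a minimum cut and a globally optimal labeling can be sketched for the simplest special case: a two-label, one-dimensional restoration problem with a constant penalty for neighboring pixels that disagree. The Python code below is only such a sketch. The function names, the integer costs, and the use of a plain augmenting-path max-flow routine are assumptions for illustration, and the construction is far less general than the embedded-graph method of the thesis.

```python
from collections import defaultdict, deque
from itertools import product


def max_flow(cap, source, sink):
    """Augmenting-path max flow; returns (flow value, source side of a min cut)."""
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:      # breadth-first search
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow, set(parent)             # nodes still reachable from source
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:             # push flow, update residuals
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


def restore_binary(costs, lam):
    """costs[i] = (cost of label 0, cost of label 1); lam = penalty per label change."""
    cap = defaultdict(lambda: defaultdict(int))
    for i, (d0, d1) in enumerate(costs):
        cap["s"][i] += d1        # cut when pixel i ends on the sink side (label 1)
        cap[i]["t"] += d0        # cut when pixel i stays on the source side (label 0)
    for i in range(len(costs) - 1):
        cap[i][i + 1] += lam     # cut when neighbors receive different labels
        cap[i + 1][i] += lam
    energy, source_side = max_flow(cap, "s", "t")
    return energy, [0 if i in source_side else 1 for i in range(len(costs))]


def labeling_energy(costs, lam, labels):
    data = sum(costs[i][x] for i, x in enumerate(labels))
    smooth = lam * sum(a != b for a, b in zip(labels, labels[1:]))
    return data + smooth


def brute_force_minimum(costs, lam):
    return min(labeling_energy(costs, lam, x) for x in product([0, 1], repeat=len(costs)))
```

On small inputs, the energy returned by `restore_binary` can be checked against `brute_force_minimum`, which enumerates every labeling.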

The second method proposed here efficiently finds an optimal cycle in a Euclidean space. It uses a minimum ratio cycle algorithm to find a cycle with minimum energy in an embedded graph. In the case of two dimensions, the energy can depend not only on the cycle itself but also on the region defined by the cycle. Because of this, the method unifies the two competing views of boundary and region segmentation.

I demonstrate the utility of the methods in applications, with the results of experiments in the areas of binocular stereo, image restoration, and image segmentation. The image segmentation, or contour extraction, experiments are carried out in various situations using different types of information, for example motion, stereo, and intensity.
Ph.D. Thesis 2000 On the Use of Functionals on Boundaries in Hierarchical Models of Object Recognition Jermyn, Ian Abstract | PDF

Title: On the Use of Functionals on Boundaries in Hierarchical Models of Object Recognition

Candidate: Jermyn, Ian

Advisor(s): Geiger, Davi

Abstract:

Object recognition is a central problem in computer vision. Typically it is assumed to follow a sequential model in which successively more specific hypotheses are generated about the image. This is a rather simplistic model, allowing as it does no margin for error at any point. We follow a more general approach in which the various representations involved are allowed to influence one another from the outset. As a guide and ultimate goal, we study the problem of finding the region occupied by human beings in images, and the separation of the region into arms, legs and head. We approach the problem as that of defining a functional on the space of boundaries in images whose minimum specifies the region occupied by the human figure. Previous work that uses such functionals suffers from a number of difficulties. These include an uncontrollable dependence on scale, an inability to find the global minimum for boundaries in polynomial time, and the inability to include region as well as boundary information. We present a new form of functional on boundaries in a manifold that solves these problems, and is also the unique form of functional in a specific class that possesses a non-trivial, efficiently computable global minimum. We describe applications of the model to single images and to the extraction of boundaries from stereo pairs and motion sequences. In addition, the functionals used in previous work could not include information about the shape of the region sought. We develop a model for the part structures of boundaries that extends previous work to the case of real images, thus including shape information in the functional framework. We show that such part structures are hyperpaths in a hypergraph. An "optimal hyperpath" algorithm is developed that globally minimizes the functional under some conditions. We show how to use exemplars of a shape to construct a functional that includes specific information about the topology of the part structure sought. An algorithm is developed that globally minimizes such functionals in the case of a fixed boundary. The behaviour of the functional mimics an aspect of human shape comparison.
Ph.D. Thesis 2000 Delegation Logic: A Logic-based Approach to Distributed Authorization Li, Ninghui Abstract | PDF

Title: Delegation Logic: A Logic-based Approach to Distributed Authorization

Candidate: Li, Ninghui

Advisor(s): Feigenbaum, Joan; Siegel, Alan

Abstract:

We address the problem of authorization in large-scale, open, distributed systems. Authorization decisions are needed in electronic commerce, mobile-code execution, remote resource sharing, content advising, privacy protection, etc. We adopt the trust-management approach, in which “authorization” is viewed as a “proof-of-compliance” problem: Does a set of credentials prove that a request complies with a policy? We develop a logic-based language Delegation Logic (DL) to represent policies, credentials, and requests in distributed authorization. Delegation Logic extends logic programming (LP) languages with expressive delegation constructs that feature delegation depth and a wide variety of complex principals (including, but not limited to, k-out-of-n thresholds). D1LP, the monotonic version of DL, extends the LP language Datalog with delegation constructs. D2LP, the nonmonotonic version of DL, also features classical negation, negation-as-failure, and prioritized conflict handling. Our approach to defining and implementing DL is based on tractably compiling DL programs into ordinary logic programs (OLP’s). This compilation approach enables DL to be implemented modularly on top of existing technologies for OLP, e.g., Prolog. As a trust-management language, Delegation Logic provides a concept of proof-of-compliance that is founded on well-understood principles of logic programming and knowledge representation. DL also provides a logical framework for studying delegation, negation of authority, conflicts between authorities, and their interplay.
Ph.D. Thesis 2000 Queryable Expert Systems Tanzer, David Abstract | PDF

Title: Queryable Expert Systems

Candidate: Tanzer, David

Advisor(s): Shasha, Dennis

Abstract:

Dissertation defense: 10:00 a.m., Tuesday, October 17, 2000, 12th floor conference room, 719 Broadway.

Interactive rule-based expert systems, which work by "interviewing" their users, have found applications in fields ranging from aerospace to help desks. Although they have been shown to be useful, people find them difficult to query in flexible ways. This limits the reusability of the knowledge they contain. Databases and noninteractive rule systems such as logic programs, on the other hand, are queryable but they do not offer an interview capability. This thesis is the first investigation that we know of into query-processing for interactive expert systems.

In our query paradigm, the user describes a hypothetical condition and then the system reports which of its conclusions are reachable, and which are inevitable, under that condition. For instance, if the input value for bloodSugar exceeds 100 units, is the conclusion diabetes then inevitable? Reachability problems have been studied in other settings, e.g., the halting problem, but not for interactive expert systems.

We first give a theoretical framework for query-processing that covers a wide class of interactive expert systems. Then we present a query algorithm for a specific language of expert systems. This language is a restriction of production systems to an acyclic form that generalizes decision trees and classical spreadsheets. The algorithm effects a reduction from the reachability and inevitability queries into datalog rules with constraints. When preconditions are conjunctive, the data complexity is tractable. Next, we optimize for queries to production systems that contain regions which are decision trees. When general-purpose datalog methods are applied to the rules that result from our queries, the number of constraints that must be solved is O(n²), where n is the size of the trees. We lower the complexity to O(n). Finally, we have built a query tool for a useful subset of the acyclic production systems. To our knowledge, these are the first interactive expert systems that can be queried about the reachability and inevitability of their conclusions.
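
The query paradigm can be made concrete on a toy decision tree. The Python sketch below answers reachability and inevitability queries by walking the tree while tracking an interval for each input. The tree, the `age` input, the conclusions other than diabetes, and the function names are all hypothetical, and the direct traversal stands in for, rather than reproduces, the reduction to datalog rules with constraints described in the abstract.

```python
import math

# A node is either a conclusion (str) or a test (variable, threshold, low, high):
# follow `low` when value <= threshold, otherwise follow `high`.
TREE = ("bloodSugar", 100,
        ("age", 60, "healthy", "monitor"),
        "diabetes")


def reachable(node, bounds):
    """Conclusions reachable when each variable v satisfies lo < v <= hi.

    bounds maps variable -> (lo, hi); variables not listed are unconstrained.
    """
    if isinstance(node, str):
        return {node}
    var, threshold, low, high = node
    lo, hi = bounds.get(var, (-math.inf, math.inf))
    found = set()
    if lo < threshold:   # some admissible value satisfies value <= threshold
        found |= reachable(low, {**bounds, var: (lo, min(hi, threshold))})
    if hi > threshold:   # some admissible value satisfies value > threshold
        found |= reachable(high, {**bounds, var: (max(lo, threshold), hi)})
    return found


def inevitable(node, bounds, conclusion):
    """A conclusion is inevitable when it is the only reachable one."""
    return reachable(node, bounds) == {conclusion}
```

With the condition that bloodSugar exceeds 100, written as `{"bloodSugar": (100, math.inf)}`, the only reachable conclusion in this toy tree is `"diabetes"`, so it is inevitable under that condition.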
Ph.D. Thesis 2000 Scenario Customization for Information Extraction Yangarber, Roman Abstract | PDF

Title: Scenario Customization for Information Extraction

Candidate: Yangarber, Roman

Advisor(s): Grishman, Ralph

Abstract:

Information Extraction (IE) is an emerging NLP technology whose function is to process unstructured natural language text, to locate specific pieces of information, or facts, in the text, and to use these facts to fill a database. IE systems today are commonly based on pattern matching. The core IE engine uses a cascade of sets of patterns of increasing linguistic complexity. Each pattern consists of a regular expression and an associated mapping from syntactic to logical form. The pattern sets are customized for each new topic, as defined by the set of facts to be extracted.

Construction of a pattern base for a new topic is recognized as a time-consuming and expensive process--a principal roadblock to wider use of IE technology in the large. An effective pattern base must be precise and must have wide coverage. This thesis addresses the portability problem in two stages.

First, we introduce a set of tools for building patterns manually from examples. To adapt the IE system to a new subject domain quickly, the user chooses a set of example sentences from a training text, and specifies how each example maps to the extracted event--its logical form. The system then applies meta-rules to transform the example automatically into a general set of patterns. This effectively shifts the portability bottleneck from building patterns to finding good examples.

Second, we propose a novel methodology for discovering good examples automatically from a large un-annotated corpus of text. The system is initially seeded with a small set of relevant patterns provided by the user. An unsupervised learning procedure then identifies new patterns and classes of related terms on successive iterations. We present experimental results, which confirm that the discovered patterns exhibit high quality, as measured in terms of precision and recall.
Ph.D. Thesis 1999 Higher-Order Conditional Synchronization Afshartous, Niki Abstract | PDF

Title: Higher-Order Conditional Synchronization

Candidate: Afshartous, Niki

Advisor(s): Goldberg, Benjamin

Abstract:

Conditional synchronization, a mechanism that conditionally blocks a thread based on the value of a boolean expression, currently exists in several programming languages. We propose promoting conditional synchronization to first-class status, allowing the synchronization object representing a suspended conditional synchronization to be passed as a value.

To demonstrate our idea we extend Concurrent ML and present several examples illustrating the expressiveness of first-class conditional synchronization (FCS). FCS has broadcast semantics making it appropriate for applications such as barriers and discrete-event simulation. The semantics also guarantee that no transient store configurations are missed. The end result facilitates abstraction and adds flexibility in writing concurrent programs. To minimize re-evaluation of synchronization conditions we propose a static analysis and translation that identifies expressions for the run-time system that could affect the value of a synchronization condition. The static analysis (which is based on an effect type system) therefore precludes excessive run-time system polling of synchronization conditions.
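
For readers unfamiliar with conditional synchronization, the following Python sketch shows the basic mechanism with broadcast wake-ups, used here to build a barrier. It relies on `threading.Condition` from the standard library; the `Store` class, its method names, and `run_barrier` are hypothetical. It is not Concurrent ML, and it does not reproduce the first-class synchronization values or the guarantee about transient store configurations described in the abstract.

```python
import threading


class Store:
    """Shared state whose updates wake every thread blocked on a condition."""

    def __init__(self, state):
        self._state = state
        self._cond = threading.Condition()

    def update(self, change):
        with self._cond:
            change(self._state)
            self._cond.notify_all()      # broadcast: all waiters re-check

    def wait_until(self, predicate, timeout=None):
        """Block until predicate(state) holds; returns False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: predicate(self._state), timeout)


def run_barrier(n):
    """Start n workers that all wait until every worker has arrived."""
    shared = Store({"arrived": 0})
    passed = []

    def arrive(state):
        state["arrived"] += 1

    def worker(i):
        shared.update(arrive)
        if shared.wait_until(lambda s: s["arrived"] == n, timeout=5):
            passed.append(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(passed)
```

Because the predicate is an ordinary function value, it can be stored or passed to other code before anyone waits on it, which loosely hints at why treating the synchronization itself as a value adds flexibility.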
Ph.D. Thesis 1999 Metacomputing on Commodity Computers Baratloo, Arash Abstract | PDF

Title: Metacomputing on Commodity Computers

Candidate: Baratloo, Arash

Advisor(s): Kedem, Zvi

Abstract:

The advantages of using a set of networked commodity computers for parallel processing are well understood: such computers are cheap, widely available, and mostly underutilized. So why has the use of such environments for compute-intensive applications not proliferated? A major reason is that the inherent complexities of programming applications and coordinating their execution on networked computers outweigh the advantages.

In networked environments populated with multiuser commodity computers, both the computing speed and the number of available computers for executing parallel programs may change frequently and unpredictably. As a consequence, programs need to continuously adapt their execution to the changing environment. The execution of an application must therefore address such issues as dynamic changes in effective machine speeds, dynamic changes in the number of available machines, and sudden network and machine failures. It is not feasible for an application programmer to write programs that adapt to the behavior of a system whose critical aspects cannot be anticipated.

I will present a unified set of techniques to implement a virtual reliable parallel-processing platform on a set of unreliable computers with temporally varying execution speeds. These techniques are specifically designed for automatically adapting the execution of parallel programs to distributed environments. I will explain these techniques in the context of two software systems, Calypso and ResourceBroker, that have been built to validate them.

Calypso gives a programmer a simple tool to build and effectively execute parallel programs on a set of commodity computers. The notable properties of Calypso are: (1) a simple, intuitive programming model based on a virtual machine interface; (2) separation of logical and physical parallelism, allowing the source code to codify the algorithm rather than the execution environment; and (3) a runtime system that efficiently adapts the execution of the program to the dynamic nature of the runtime environment. ResourceBroker is a resource manager that demonstrates a novel technique to dynamically manage the assignment of computers to parallel programs. ResourceBroker can work with a variety of parallel systems, even transparently managing those that are not aware of its existence, such as PVM and MPI, and will distribute available resources fairly among multiple computations. As a result, a mix of parallel programs, written using diverse programming systems can effectively execute concurrently on a set of computers.
Ph.D. Thesis 1999 A Maximum Entropy Approach to Named Entity Recognition Borthwick, Andrew Abstract | PDF

Title: A Maximum Entropy Approach to Named Entity Recognition

Candidate: Borthwick, Andrew

Advisor(s): Grishman, Ralph

Abstract:

This thesis describes a novel statistical named-entity (i.e. "proper name") recognition system known as "MENE" (Maximum Entropy Named Entity). Named entity (N.E.) recognition is a form of information extraction in which we seek to classify every word in a document as being a person-name, organization, location, date, time, monetary value, percentage, or "none of the above". The task has particular significance for Internet search engines, machine translation, the automatic indexing of documents, and as a foundation for work on more complex information extraction tasks.

Two of the most significant problems facing the constructor of a named entity system are the questions of portability and system performance. A practical N.E. system will need to be ported frequently to new bodies of text and even to new languages. The challenge is to build a system which can be ported with minimal expense (in particular minimal programming by a computational linguist) while maintaining a high degree of accuracy in the new domains or languages.

MENE attempts to address these issues through the use of maximum entropy probabilistic modeling. It utilizes a very flexible object-based architecture which allows it to make use of a broad range of knowledge sources in making its tagging decisions. In the DARPA-sponsored MUC-7 named entity evaluation, the system displayed an accuracy rate which was well above the median, demonstrating that it can achieve the performance goal. In addition, we demonstrate that the system can be used as a post-processing tool to enhance the output of a hand-coded named entity recognizer through experiments in which MENE improved on the performance of N.E. systems from three different sites. Furthermore, when all three external recognizers are combined under MENE, we are able to achieve very strong results which, in some cases, appear to be competitive with human performance.
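
In a maximum entropy model, the probability of each tag is proportional to the exponential of the summed weights of the features that fire for that tag. The Python sketch below computes such a distribution for a single word. The tag set, feature names, and weights are invented for illustration and are not taken from MENE.

```python
import math

TAGS = ["person", "location", "none"]

# Hypothetical weights for (feature, tag) pairs; a trained model would learn these.
WEIGHTS = {
    ("capitalized", "person"): 1.0,
    ("capitalized", "location"): 0.8,
    ("prev_word_is_mr", "person"): 2.0,
    ("prev_word_is_in", "location"): 1.5,
}


def tag_probabilities(active_features):
    """Return P(tag | features) as a normalized exponential of summed weights."""
    scores = {
        tag: sum(WEIGHTS.get((feature, tag), 0.0) for feature in active_features)
        for tag in TAGS
    }
    normalizer = sum(math.exp(score) for score in scores.values())
    return {tag: math.exp(score) / normalizer for tag, score in scores.items()}


probs = tag_probabilities({"capitalized", "prev_word_is_mr"})
best_tag = max(probs, key=probs.get)
```

Because each knowledge source only contributes features and weights, new sources can be added without changing the form of the model, which is consistent with the flexibility the abstract describes.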

Finally, we demonstrate the trans-lingual portability of the system. We ported the system to two Japanese-language named entity tasks, one of which involved a new named entity category, "artifact". Our results on these tasks were competitive with the best systems built by native Japanese speakers despite the fact that the author speaks no Japanese.
Ph.D. Thesis 1999 Algorithms for Nonlinear Models in Computational Finance and their Object-oriented Implementation Buff, Robert Abstract | PDF

Title: Algorithms for Nonlinear Models in Computational Finance and their Object-oriented Implementation

Candidate: Buff, Robert

Advisor(s): Avellaneda, Marco

Abstract:

Individual components of financial option portfolios cannot be evaluated independently under nonlinear models in mathematical finance. This entails increased algorithmic complexity if the options under consideration are path-dependent. We describe algorithms that price portfolios of vanilla, barrier and American options under worst-case assumptions in an uncertain volatility setting. We present a generalized approach to worst-case volatility scenarios in which only the duration, but not the starting dates of periods of high volatility risk are known. Our implementation follows object-oriented principles and is modular and extensible. Combinatorial and numerical algorithms are separate and orthogonal to each other. We make our tools available to a wide audience by using standard Internet technologies.
Ph.D. Thesis 1999 Prototyping a Prototyping Language Chen, Hseu-Ming Abstract | PDF

Title: Prototyping a Prototyping Language

Candidate: Chen, Hseu-Ming

Advisor(s): Harrison, Malcolm C.

Abstract:

The development of a prototyping language should follow the usual software-engineering methodology: starting with an evolvable, easily modifiable, working prototype of the proposed language. Rather than committing to the development of a mammoth compiler at the outset, we can design a translator from the prototyping language to another high-level language as a viable alternative. From a software-engineering point of view, the advantages of the translator approach are its shorter development cycle and lessened maintenance burden.

In prototyping language design, there are often innovative cutting-edge features which may not be well-understood. It is inevitable that numerous experimentations and revisions will be made to the current design, and hence supporting evolvability and modifiability is critical in the translator design.

In this dissertation we present an action-semantics-based framework for high-level source-to-source language translation. Action semantics is a form of denotational semantics that is based on abstract semantic algebra rather than Scott domain and lambda-notation. More specifically, this model not only provides a formal semantics definition for the source language and sets guidelines for implementations as well as migration, but also facilitates mathematical reasoning and a correctness proof of the entire translation process. The translation is geared primarily towards readability, maintainability, and type-preserving target programs, only secondarily towards reasonable efficiency.

We have acquired a collection of techniques for the translation of certain non-trivial high-level features of prototyping languages and declarative languages into efficient procedural constructs in imperative languages like Ada95, while using the abstraction mechanism of the target languages to maximize the readability of the target programs. In particular, we translate Griffin existential types into Ada95 using its object-oriented features, based on coercion calculus. This translation is actually more general, in that one can add existential types to a language supporting the object-oriented paradigm (with a modicum of extra syntax) without augmenting its type system, through intra-language transformation. We also present a type-preserving translation of closures which allows us to drop the whole-program-transformation requirement.
Ph.D. Thesis 1999 Distributed intelligence with bounded rationality: Applications to economies and networks Even, Ron Abstract | PDF

Title: Distributed intelligence with bounded rationality: Applications to economies and networks

Candidate: Even, Ron

Advisor(s): Mishra, Bud

Abstract:

This dissertation examines bounded rationality as a tool in distributed systems of intelligent agents. We have implemented, in Java, a simulator for complex adaptive systems called CAF??. We use our framework to simulate a simple network and compare the effectiveness of bounded rationality at routing and admission control to that of a more traditional, source-based, greedy routing approach. We find that the boundedly rational approach is particularly effective when user behavior is synchronized, as occurs during breaking news releases on the World Wide Web. We develop the key structures of our framework by first examining, through simulation, the behavior of boundedly rational speculators in a simple economy. We find them to be instrumental in bringing the economy quickly to price equilibrium as well as in maintaining the equilibrium in the face of changing conditions. We draw several interesting conclusions as to the key similarities between economic and computational systems and the situations where they differ drastically.
Ph.D. Thesis 1999 Pattern Discovery in Biology: Theory and Applications Floratos, Aristidis Abstract | PDF

Title: Pattern Discovery in Biology: Theory and Applications

Candidate: Floratos, Aristidis

Advisor(s): Boppana, Ravi; Rigoutsos, Isidore

Abstract:

Molecular Biology studies the composition and interactions of life's agents, namely the various molecules (e.g. DNA, proteins, lipids) sustaining the living process. Traditionally, this study has been performed in wet labs using mostly physicochemical techniques. Such techniques, although precise and detailed, are often cumbersome and time consuming. On top of that, recent advances in sequencing technology have allowed the rapid accumulation of DNA and protein data. As a result a gap has been created (and is constantly being expanded): on the one side there is a rapidly growing collection of data containing all the information upon which life is built; and on the other side we are currently unable to keep up with the study of this data, impaired by the limits of existing analysis tools. It is obvious that alternative analysis techniques are badly needed. In this work we examine how computational methods can help in drilling the information contained in collections of biological data. In particular, we investigate how sequence similarity among various macromolecules (e.g. proteins) can be exploited towards the extraction of biologically useful information.
Ph.D. Thesis 1999 Matching Algorithms and Feature Match Quality Measures for Model-Based Object Recognition with Applications to Automatic Target Recognition Garcia-Keller, Martin Abstract | PDF

Title: Matching Algorithms and Feature Match Quality Measures for Model-Based Object Recognition with Applications to Automatic Target Recognition

Candidate: Garcia-Keller, Martin

Advisor(s): Hummel, Robert

Abstract:

In the fields of computational vision and image understanding, the object recognition problem can often be formulated as a problem of matching a collection of model features to features extracted from an observed scene. This dissertation is concerned with the use of feature-based match similarity measures and feature match algorithms in object detection and classification in the context of image understanding from complex signature data. Our applications are in the domains of target vehicle recognition from radar imagery, and binocular stereopsis.

In what follows, we will consider “image understanding” to encompass the set of activities necessary to identify objects in visual imagery and to establish meaningful three-dimensional relationships between the objects themselves, or between the object and the viewer. The main goal in image understanding then involves the transformation of images to symbolic representation, effectively providing a high-level description of an image in terms of objects, object attributes, and relationships between known objects. As such, image understanding subsumes the capabilities traditionally associated with image processing, object recognition and artificial vision [Crevier and Lepage 1997].

In human and/or biological vision systems, the task of object recognition is a natural and spontaneous one. Humans can recognize immediately and without effort a huge variety of objects from diverse perceptual cues and multiple sensorial inputs. The operations involved are complex and inconspicuous psychophysical and biological processes, including the use of properties such as shape, color, texture, pattern, motion, context, as well as considerations based on contextual information, prior knowledge, expectations, functionality hypothesis, and temporal continuity. These operations and their relation to machine object recognition and artificial vision are discussed in detail elsewhere [Marr 1982], [Biederman 1985], but they are not our concern in this thesis.

In this research, we consider only the simpler problem of model-based vision, where the objects to be recognized come from a library of three-dimensional models known in advance, and the problem is constrained using context and domain-specific knowledge.

The relevance of this work resides in its potential to support state-of-the-art developments in both civilian and military applications including knowledge-based image analysis, sensors exploitation, intelligence gathering, evolving databases, interactive environments, etc. A large number of applications are reviewed below in section 1.4. Experimental results are presented in Chapters 5, 6, and
Ph.D. Thesis 1999 Learning to Play Network Games Greenwald, Amy Abstract | PDF

Title: Learning to Play Network Games

Candidate: Greenwald, Amy

Advisor(s): Mishra, Bud

Abstract:

This talk concerns the strategic behavior of automated agents in the framework of network game theory, with particular focus on the collective behavior that arises via learning. In particular, ideas are conveyed on both the theory and simulation of learning in network games, in terms of two sample applications. The first application is network control, presented via an abstraction known as the Santa Fe bar problem, for which it is proven that rational learning does *not* converge to Nash equilibrium, the classic game-theoretic solution concept. On the other hand, it is observed via simulations that low-rationality learning, where agents trade off between exploration and exploitation, typically converges to mixed strategy Nash equilibria in this game. The second application is the economics of shopbots - agents that automatically search the Internet for price and product information - in which learning yields behaviors ranging from price wars to tacit collusion, with sophisticated low-rationality learning algorithms converging to Nash equilibria. This work forms part of a larger research program that advocates learning and game theory as a framework in which to model the interactions of computational agents in network domains.
Ph.D. Thesis 1999 Experiments in refining graphical interface widgets Hecker, Yaron Chanoch Abstract | PDF

Title: Experiments in refining graphical interface widgets

Candidate: Hecker, Yaron Chanoch

Abstract:

This thesis investigates GUIs and their shortcomings. We demonstrate that there is room for refinement of existing graphical user interfaces, including those interfaces with which we are most familiar. A foundation for our designs is first established. It consists of known human capabilities, especially concerning hand-eye coordination, short term and long term memory, and visual perception. Accumulated experience in static and animated visual design provides additional guides for our work. On the basis of this foundation we analyze existing widgets. A series of new widgets are then proposed to address observed deficiencies in existing designs for scrolling, multiple copy and paste in text environments, text insertion and selection, and window management. Lessons learned from analyzing our new designs and observations of existing widgets are generalized into principles of widget design.
Ph.D. Thesis 1999 Automated Software Deployment Jai, Benchiao Abstract | PDF

Title: Automated Software Deployment

Candidate: Jai, Benchiao

Advisor(s): Siegel, Alan

Abstract:

The work users do with an application can be divided into actual work accomplished using the application and overhead performed in order to use the application. The latter can be further partitioned based on the time at which the work is performed: before (application location and delivery), during (installation) and after (upgrade) the installation of the application. This category can be characterized as the software deployment overhead. This thesis presents a component architecture RADIUS (Rapid Application location, Delivery, Installation and Upgrade System) in which applications can be built with no software deployment overhead to the users. An application is deployed automatically by simply giving the user a document produced by the application. Furthermore, the facilities in RADIUS make the applications self-upgrading. In the end, the users perform no deployment overhead work at all.

The conventional way of using an application is to install the application first, then start using documents of the application. The object-oriented programming (OOP) paradigm suggests that this order should be reversed: the data should lead to the code. However, almost all software fails to meet this model of design at the persistence level. While modern software often uses OOP at the program level, the underlying operating systems do not support OOP at the document/file level. OOP languages use pointers to methods to indicate what operations can be performed on the objects. We extend the idea to include "pointers to applications". Each document has an attached application pointer, which is read by RADIUS when the document is opened. This application pointer is then used to locate and deliver the application module necessary for the document.

RADIUS is designed to be compatible with existing technologies and requires no extensions to either programming languages or operating systems. It is orthogonal to programming tools, is language-independent and compatible among operating systems, and consequently does not impose limitations on which environments the developers can use. We illustrate the implementations for the two most popular platforms today - C++ on Windows, and Java. RADIUS is also orthogonal to other component systems such as CORBA or COM and is easy to integrate with them.
Ph.D. Thesis 1999 Toward Stronger User Authentication Monrose, Newman Fabian Abstract | PDF

Title: Toward Stronger User Authentication

Candidate: Monrose, Newman Fabian

Advisor(s): Kedem, Zvi

Abstract:

Password-based authentication is the dominant mechanism for verifying the identity of computer users, even though it is well known that people frequently choose passwords that are vulnerable to dictionary attacks. This talk addresses the issue of improving the security of password-based authentication, and presents authentication techniques that are more secure than traditional approaches against both on-line and off-line attacks.

We present a technique for strengthening the security of a textual password by augmenting it with biometric information such as the duration and latency of keystrokes during entry of the password. Thereby, both the password and the user's typing pattern are used to corroborate the user's identity. The technique presented adapts to gradual changes in a user's typing pattern while maintaining the same strengthened password across authenticated sessions. Moreover, our technique does not reveal which of a user's keystroke features are used to generate the corresponding strengthened password. This knowledge is hidden even from an attacker who captures all the system information used by the authentication server, and we show that our technique increases significantly the amount of work such an attacker must perform.

Additionally, we present an alternative technique for user authentication that exploits features of graphical input devices. We propose and evaluate ``graphical passwords'', which serve the same purpose as textual passwords, but consist of handwritten drawings, possibly in addition to text. Graphical passwords derive their strength from the fact that graphical input devices allow one to decouple the positions of inputs from the temporal order in which these inputs occur. We use this independence to build new password-based authentication schemes that are convincingly stronger than conventional methods.
Ph.D. Thesis 1999 Optimization Over Symmetric Cones Nayakkankuppam, Madhu Abstract | PDF

Title: Optimization Over Symmetric Cones

Candidate: Nayakkankuppam, Madhu

Advisor(s): Overton, Michael

Abstract:

We consider the problem of optimizing a linear function over the intersection of an affine space and a special class of closed, convex cones, namely the symmetric cones over the reals. This problem subsumes linear programming, convex quadratically constrained quadratic programming, and semidefinite programming as special cases. First, we derive some perturbation results for this problem class. Then, we discuss two solution methods: an interior-point method capable of delivering highly accurate solutions to problems of modest size, and a first order bundle method which provides solutions of low accuracy, but can handle much larger problems. Finally, we describe an application of semidefinite programming in electronic structure calculations, and give some numerical results on sample problems.
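In standard notation, the problem class described above can be written as the following conic program. The symbols are assumptions chosen for illustration: $A$ and $b$ define the affine space, $c$ the linear objective, and $K$ the symmetric cone.

```latex
\min_{x} \ \langle c, x \rangle
\quad \text{subject to} \quad
A x = b, \qquad x \in K
```

Taking $K$ to be the nonnegative orthant gives linear programming, a product of second-order cones covers convex quadratically constrained quadratic programming, and the cone of positive semidefinite matrices gives semidefinite programming.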
Ph.D. Thesis 1999 Efficient Computational Model for Energy Propagation in Geometrically Represented Large Environments Rajkumar, Ajay Abstract | PDF

Title: Efficient Computational Model for Energy Propagation in Geometrically Represented Large Environments

Candidate: Rajkumar, Ajay

Advisor(s): Perlin, Ken

Abstract:

Current radio propagation algorithms are narrowly focused on specific types of input models and do not scale well with an increase in the number of receiver locations or the number of polygons in an input model. In this dissertation, we look at the problem of efficiently computing energy propagation at radio frequencies in a range of geometrically defined environments from a given transmitter location and for various transmitter and receiver characteristics. To achieve this goal, we propose a unified approach to radio propagation for different types of input models and their combinations, representing the geometry as a binary space partitioning tree and broadcasting energy from the source. The approach scales to large input models and adapts dynamically to the scale of the model without incurring unreasonable computational cost. The proposed approach is equally effective for acoustic modeling.

We present a new adaptive ray-beam tracing algorithm which initially tessellates the surface of a transmitter into four-sided polygons. Each polygon is cast as a beam which avoids arbitrarily large gaps or overlaps between adjacent beams. For fast intersection computation each beam carries information of its medial ray as well. As the computation proceeds a ray-beam is adaptively subdivided depending on various parameters. The proposed algorithm has sublinear time complexity in terms of the number of receiver locations.

Modeling diffraction off an edge of a wedge is important for computing the radio signal that reaches the shadow region of the wedge. Storing these edges explicitly in a data structure can be very expensive for large input models and especially for terrain-based models that have significant elevation variations. Instead of storing the edges statically, we present a new runtime edge-detection algorithm and its adaptation to environments represented as binary space partitioning trees.

We have developed a propagation prediction system called Propagate using these algorithms with good statistical correlation between predicted and measured results for a number of different input models. The proposed algorithms have been used to model several other important computations related to a cellular network of transmitters such as signal strength and path loss, delay spread, angular spread, carrier-to-interference ratio, and modeling of different antenna diversity schemes.
Ph.D. Thesis 1999 Automatic Parallelization: An Incremental, Optimistic, Practical Approach Schwartz, Naftali Abstract | PDF

Title: Automatic Parallelization: An Incremental, Optimistic, Practical Approach

Candidate: Schwartz, Naftali

Advisor(s): Kedem, Zvi

Abstract:

The historic focus of Automatic Parallelization efforts has been limited in two ways. First, parallelization has generally been attempted only on codes which can be proven to be parallelizable. Unfortunately, the requisite dependence analysis is undecidable, and today's applications demonstrate that this restriction is more than theoretical. Second, parallel program generation has generally been geared to custom multiprocessing hardware. Although a network of commodity workstations (NOW) could theoretically be harnessed to serve as a multiprocessing platform, the NOW has characteristics which are at odds with effective utilization.

This thesis shows that by restricting our attention to the important domain of ``embarrassingly parallel'' applications, leveraging existing scalable and efficient network services, and carefully orchestrating a synergy between compile-time transformations and a small runtime system, we can achieve a parallelization that not only works in the face of inconclusive program analysis, but is indeed efficient for the NOW. We optimistically parallelize loops whose memory access behavior is unknown, relying on the runtime system to provide efficient detection and recovery in the case of an overly optimistic transformation. Unlike previous work in speculative parallelization, we provide a methodology which is not tied to the Fortran language, making it feasible as a generally useful approach. Our runtime system implements Two-Phase Idempotent Eager Scheduling (TIES) for efficient network execution, providing an Automatic Parallelization platform with performance scalability for the NOW.

Our transformation divides the original program into a server and zero or more clients. The server program is a specialization of the original application with each parallel loop replaced with a scheduling call to the client which comprises the body of that parallel loop. The scheduler remotely executes the appropriate instances of this client on available machines.
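The server/client split can be sketched in a few lines of Python. This is a single-process simulation under stated assumptions, not the thesis's TIES runtime: `client`, `eager_schedule`, `server`, the worker count, and the failure rate are all hypothetical, and remote machines are imitated with a seeded random number generator.

```python
import random


def client(i: int) -> int:
    """Loop body extracted as the client; it must be safe to run twice."""
    return i * i


def eager_schedule(n_iters: int, n_workers: int = 3,
                   failure_rate: float = 0.3, seed: int = 0) -> list:
    """Keep handing unfinished iterations to workers until all are done.

    The same iteration may be assigned more than once. Because the client
    is idempotent, a repeated execution stores the same value again.
    """
    rng = random.Random(seed)
    results = {}
    pending = list(range(n_iters))
    while pending:
        for _ in range(n_workers):
            i = rng.choice(pending)
            if rng.random() < failure_rate:
                continue  # simulated slow or failed machine; retried later
            results[i] = client(i)
        pending = [i for i in range(n_iters) if i not in results]
    return [results[i] for i in range(n_iters)]


def server(n_iters: int) -> list:
    # Original sequential loop: for i in range(n_iters): out[i] = i * i
    # Transformed: the loop is replaced by one scheduling call.
    return eager_schedule(n_iters)


squares = server(8)
```

The point of the sketch is the division of labor: the server keeps the program's control flow, while each loop iteration becomes an independent unit of work that can be retried on another machine without changing the result.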

We describe the transformation and runtime system in detail, and report on the automatic transformation achieved by our implementation prototype in two case studies. In each of these cases, we were able to automatically locate the important coarse-grained loops, construct a shared-memory layout, and generate appropriate server and client code. Furthermore, we show that our generated parallel programs achieve near-linear speedups for sufficiently large problem sizes.
Ph.D. Thesis 1999 Destructive Effect Analysis And Finite Differencing For Strict Functional Languages Yung, Chung Abstract | PDF

Title: Destructive Effect Analysis And Finite Differencing For Strict Functional Languages

Candidate: Yung, Chung

Advisor(s): Goldberg, Benjamin

Abstract:

Destructive update optimization is critical for writing scientific codes in functional languages. Pure functional languages do not allow mutations, destructive updates, or selective updates, so straightforward implementations of functional languages induce large amounts of copying to preserve program semantics. The unnecessary copying of data can increase both the execution time and the memory requirements of an application. Destructive update optimization makes an essential improvement to the implementation of functional programs with compound data structures, such as arrays, sets, and aggregates. Moreover, for many of the compiler optimization techniques that depend on side effects, destructive update analysis provides the input for applying such optimization techniques. Among other compiler optimization techniques, finite differencing captures common yet distinctive program constructions of costly repeated calculations and transforms them into more efficient incremental program constructions.

In this dissertation, we develop a new approach to destructive update analysis, called destructive effect analysis. We present the semantic model and the abstract interpretation of destructive effect analysis. We designed EAS, an experimental applicative language with set expressions. The implementation of the destructive effect analysis is integrated with the optimization phase of our experimental compiler of EAS. We apply finite differencing to optimize pure functional programs, and we show the performance improvement that results from applying the finite differencing optimization together with the destructive update optimization.
Ph.D. Thesis 1998 Foveation Techniques and Scheduling Issues in Thinwire Visualization Chang, Ee-Chien Abstract | PDF

Title: Foveation Techniques and Scheduling Issues in Thinwire Visualization

Candidate: Chang, Ee-Chien

Advisor(s): Yap, Chee

Abstract:

We are interested in the visualization of large images across a network. Upon request, the server sends an image across the network to the client, which in turn presents this image to the viewer. A key observation is that, at any moment, the viewer is mainly interested in a region around his gaze point in the image. To exploit this, we let the viewer interactively indicate this point, and the selected region will have higher priority in the transmission process. As a result, the displayed image is a ``space-variant'' image. A fundamental difference between this scheme and the usual progressive transmission scheme is that we place more emphasis on the visualization process. This shift in emphasis opens up new perspectives on the problem. In this thesis, we focus on this difference.

In chapter two, we formalize the operation of ``foveating an image'', study how to distribute the resolution over an image, and how to progressively refine such a space-variant image. Motivated by properties of human vision, we propose two methods for the construction of space-variant images. In chapter three, we formulate and study an abstract on-line scheduling problem which is motivated by interactions between the client and the server. In the fourth and last chapter, we describe details and issues in an implementation.
Ph.D. Thesis 1998 Techniques to Improve the Performance of Software-based Distributed Shared Memory Systems Chu, Churngwei Abstract | PDF

Title: Techniques to Improve the Performance of Software-based Distributed Shared Memory Systems

Candidate: Chu, Churngwei

Advisor(s): Kedem, Zvi

Abstract:

Software distributed shared memory systems are able to provide programmers with the illusion of global shared memory on networked workstations without special hardware support. This thesis identifies two problems in contemporary software distributed shared memory systems: (1) poor application programming interfaces for programmers who need to solve complicated synchronization problems and (2) inefficiencies in traditional multiple writer protocols. We propose a solution to both of these problems. One is the introduction of user-definable high level synchronization primitives to provide a better application programming interface. The other is the single-owner protocol to provide efficiency. In order to accommodate user-definable high level synchronization primitives, a variant of release consistency is also proposed.

User-definable high level synchronization primitives provide a paradigm for users to define their own synchronization primitives instead of relying on traditional low level synchronization primitives, such as barriers and locks. The single-owner protocol reduces the number of messages from O(n²) (the number needed in the multiple-owner protocol) to Θ(n) when n writers first write to a page and n readers then read the page. Unlike some multiple-owner protocols, in the single-owner protocol garbage collection is performed asynchronously, and the size of a message for doing memory update is smaller in most cases.

We also evaluate the tradeoffs between the single-owner protocol and multiple-owner protocols. We have found that in most cases the single-owner protocol uses fewer messages than multiple-owner protocols, but there are some computations which may perform better with some multiple-owner protocols. In order to combine the advantages of both protocols, we propose a hybrid owner protocol which can be used to increase the efficiency in an adaptive way, with some pages managed by the single-owner protocol and some by a multiple-owner protocol.

Finally, five applications are evaluated using the single-owner protocol and a particular multiple-owner protocol called the lazy invalidate protocol. The performance of these two protocols is compared. We also demonstrate the use of user-definable high level synchronization primitives on one of the applications, and compare its performance against the same application constructed using only low-level synchronization primitives.
Ph.D. Thesis 1998 Deformable Object Tabula Rasa: A Zoomable User Interface System Fox, David Abstract | PDF
Title: Deformable Object Tabula Rasa: A Zoomable User Interface System

Candidate: Fox, David

Advisor(s): Perlin, Ken

Abstract:

This dissertation develops the concept of a zoomable user interface and identifies the design elements which are important to its viability as a successor to the desktop style of interface. The implementation of an example system named Tabula Rasa is described, along with the design and implementation of some sample applications for Tabula Rasa. We show how programming techniques such as delegation and multi-methods can be used to solve certain problems that arise in the implementation of Tabula Rasa, and in the implementation of Tabula Rasa applications.

Over the past thirty years the desktop or WIMP (Windows, Icons, Menus, Pointer) user interface has made the computer into a tool that allows non-specialists to get a variety of tasks done. In recent years, however, the applications available under this interface have become larger and more unwieldy, taking into themselves more and more marginally related functionality. Any inter-operability between applications must be explicitly designed in.

The Zoomable User Interface (ZUI) is a relatively new metaphor designed as a successor to the desktop interface. It is inspired by the Pad system, which is based on a zoomable surface of unlimited resolution. Just as the desktop interface has a set of essential elements, a ZUI has a set of elements each of which is vital to the whole. These include
1. a zoomable imaging model,
2. a persistent virtual geography for data objects,
3. semantic zooming to optimize the utility of screen space,
4. work-through interfaces for application objects,
5. a constraint system for ensuring the consistency of the interface elements.
These basic elements combine to produce an environment that takes advantage of the user's spatial memory to create a more expansive and dynamic working environment, as well as encouraging finer grained applications that automatically inter-operate with various types of data objects and applications.
Ph.D. Thesis 1998 Metacomputing and Resource Allocation on the World Wide Web Karaul, Mehmet Abstract | PDF

Title: Metacomputing and Resource Allocation on the World Wide Web

Candidate: Karaul, Mehmet

Advisor(s): Kedem, Zvi

Abstract:

The World Wide Web is a challenging environment for distributed computing due to its sheer size and the heterogeneity and unreliability of machines and networks. Therefore, scalability, load balancing, and fault masking play an important role for Web-based systems. In this dissertation, I present novel mechanisms for resource allocation and parallel computing on the Web addressing these issues.

Large Web sites rely on a set of geographically dispersed replicated servers among which client requests should be appropriately allocated. I present a scalable decentralized design, which pushes the allocation functionality onto the clients. At its core lies a pricing strategy that provides incentives to clients to control the dispatching of requests while still allowing clients to take advantage of geographic proximity. An adaptive algorithm updates prices to deal with dynamic changes. A prototype system based on this architecture has been implemented and its functionality validated through a series of experiments.

Parallel computing on local area networks is based on a variety of mechanisms targeting the properties of this environment. However, these mechanisms do not effectively extend to wide area networks due to issues such as heterogeneity, security, and administrative boundaries. I present a prototype system which allows application programmers to write parallel programs in Java and allows Java-capable browsers to execute parallel tasks. It comprises a virtual machine model which isolates the program from the execution environment, and a runtime system realizing this machine on the Web. Load balancing and fault masking are transparently provided by the runtime system.
Ph.D. Thesis 1998 Free Parallel Data Mining Li, Bin Abstract | PDF

Title: Free Parallel Data Mining

Candidate: Li, Bin

Advisor(s): Shasha, Dennis

Abstract:

Data mining is the emerging field of applying statistical and artificial intelligence techniques to the problem of finding novel, useful, and non-trivial patterns in large databases. This thesis presents a framework for easily and efficiently parallelizing data mining algorithms. We propose an acyclic directed graph structure, the exploration dag (E-dag), to characterize the computation model of data mining algorithms in classification rule mining, association rule mining, and combinatorial pattern discovery. An E-dag can be constructively formed in parallel from specifications of a data mining problem; a parallel E-dag traversal is then performed on the fly to solve the problem efficiently. The effectiveness of the E-dag framework is demonstrated in biological pattern discovery applications.

We also explore data parallelism in data mining applications. The cross-validation and the windowing techniques used in classification tree algorithms facilitate easy development of efficient data partitioning programs. In this spirit, we present a new classification tree algorithm called NyuMiner that guarantees that every split in a classification tree is optimal with respect to any given impurity function and any given maximum number of branches allowed in a split. NyuMiner can be easily parallelized using the data partitioning technique.

This thesis also presents a software architecture for running parallel data mining programs on networks of workstations (NOW) in a fault-tolerant manner. The software architecture is based on Persistent Linda (PLinda), a robust distributed parallel computing system which automatically utilizes idle cycles. Templates are provided for application programmers to develop parallel data mining programs in PLinda. Parallelization frameworks and the software architecture form a synergy that makes free efficient data mining realistic.
Ph.D. Thesis 1998 Fast Algorithms for Discovering the Maximum Frequent Set Lin, Dao-I Abstract | PDF

Title: Fast Algorithms for Discovering the Maximum Frequent Set

Candidate: Lin, Dao-I

Advisor(s): Kedem, Zvi

Abstract:

Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up breadth-first search direction. The computation starts from frequent 1-itemsets (minimal length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform reasonably well when all maximal frequent itemsets are short. However, performance drastically decreases when some of the maximal frequent itemsets are relatively long. We present a new algorithm which combines both the bottom-up and the top-down searches. The primary search direction is still bottom-up, but a restricted search is also conducted in the top-down direction. This search is used only for maintaining and updating a new data structure we designed, the maximum frequent candidate set. It is used to prune candidates in the bottom-up search. A very important characteristic of the algorithm is that it does not require explicit examination of every frequent itemset. Therefore the algorithm performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, thus specifying immediately all frequent itemsets. We evaluate the performance of the algorithm using well-known synthetic benchmark databases and real-life census and stock market databases. The improvement in performance can be up to several orders of magnitude, compared to the best current algorithms.
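
The bottom-up breadth-first search that the abstract describes as the typical approach can be sketched as follows. This is only an illustration of that baseline, not the combined bottom-up and top-down algorithm of the thesis; the function names, the toy transactions, and the support threshold are assumptions made for the example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Bottom-up, breadth-first search: level k holds the frequent k-itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = []
    while level:
        frequent.extend(level)
        current = set(level)
        # Join step: merge pairs of frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == len(a) + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        level = [
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, len(c) - 1))
            and support(c) >= min_support
        ]
    return frequent

def maximal_itemsets(frequent):
    """Keep only the itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]

# Hypothetical toy data: four transactions over the items a, b, c, d.
transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c", "d"}]
frequent = frequent_itemsets(transactions, min_support=2)
maximal = maximal_itemsets(frequent)
```

On this toy input the search visits all seven frequent itemsets before it can report the single maximal one, {a, b, c}, which is the cost the abstract attributes to purely bottom-up methods when maximal itemsets are long.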
Ph.D. Thesis 1998 Algorithmic Techniques in Computational Genomics Parida, Laxmi Abstract | PDF

Title: Algorithmic Techniques in Computational Genomics

Candidate: Parida, Laxmi

Advisor(s): Mishra, Bud

Abstract:

This thesis explores the application of algorithmic techniques in understanding and solving computational problems arising in Genomics (called Computational Genomics). In the first part of the thesis we focus on the problem of reconstructing physical maps from data, related to "reading" the genome of an organism, and in the second part we focus on problems related to "interpreting" (in a very limited sense) the genome. The main contributions of the thesis are understanding the computational complexity of, and designing algorithms for, some key problems in both these domains.

The primary goal of the Human Genome Project is to determine the entire three billion base pair sequence of the human genome and locate roughly 100,000 genes on the DNA. Recently, a set of single molecule methods (such as optical mapping) have been developed that allow one to create physical maps (a set of landmarks on the DNA whose locations are well defined), but can only do so by combining a population of data in the presence of errors from various sources. In the first part of the thesis, we focus on the problem of computing physical maps from data that arise in single molecule methods. We describe two combinatorial models of the problem, termed the Exclusive Binary Flip Cut (EBFC) and Weighted Consistency Graph (WCG) problems. We show that both problems are MAX SNP hard and give bounds on the approximation factors achievable. We give a polynomial time 0.878-approximation algorithm for the EBFC problem and a 0.817-approximation algorithm for the WCG problem, using the maxcut approximation algorithm due to Goemans and Williamson. We also give a low polynomial time practical algorithm that works well on simulated and real data. Naksha is an implementation of this algorithm and a demonstration is available at http://www.cs.nyu.edu/parida/naksha.html. We also have similar results on complexity for generalizations of the problem which model various other sources of errors. We have generalized our complexity and algorithmic results to the case where there is more than one population in the data (which we call the K-populations problem).

In the second part of the thesis, we focus on "interpreting" the genome. We consider the problem of discovering patterns (or motifs) in strings on a finite alphabet: we show that by appropriately defining irredundant motifs, the number of irredundant motifs is only quadratic in the input size. We use these irredundant motifs in designing algorithms to align multiple genome or protein sequences. Alignment of sequences aids in comparing similarities in the structure and function of the proteins.
Ph.D. Thesis 1998 Thinksheet: a Tool for Information Navigation Piatko, Peter Abstract | PDF

Title: Thinksheet: a Tool for Information Navigation

Candidate: Piatko, Peter

Advisor(s): Shasha, Dennis

Abstract:

Imagine that you are a "knowledge worker" in the coming millennium. You must synthesize information and make decisions such as "Which benefits plan to use?" "What do the regulations say about this course of action?" "How does my job fit into the corporate business plan?" or even "How does this program work?" If the dream of digital libraries is to bring you all material relevant to your task, you may find yourself drowning before long. Reading is harder than talking to people who know the relevant documents and can tell you what you're interested in. That is what many current knowledge workers do, giving rise to professions such as insurance consultant, lawyer, benefits specialist, and so on.

Imagine by contrast that the documents you retrieve could be tailored precisely to your needs. That is, imagine that the document might ask you questions and produce a document filtered and organized according to those you have answered.

We have been developing software that allows writers to tailor documents to the specific needs of large groups of readers. Thinksheet combines the technologies of expert systems, spreadsheets, and database query processing to provide tailoring capabilities for complex documents. The authoring model is only slightly more complex than a spreadsheet.

This thesis discusses the conceptual model and the implementation of Thinksheet, and applications for complex documents and metadata.
Ph.D. Thesis 1998 Corpus-based Parsing and Sublanguage Studies Sekine, Satoshi Abstract | PDF

Title: Corpus-based Parsing and Sublanguage Studies

Candidate: Sekine, Satoshi

Advisor(s): Grishman, Ralph

Abstract:

There are two main topics in this thesis, a corpus-based parser and a study of sublanguage.

A novel approach to corpus-based parsing is proposed. In this framework, a probabilistic grammar is constructed whose rules are partial trees from a syntactically-bracketed corpus. The distinctive feature is that the partial trees are multi-layered. In other words, only a small number of non-terminals are used to cut the initial trees; other grammatical nodes are embedded into the partial trees, and hence into the grammar rules. Good parsing performance was obtained, even with small training corpora. Several techniques were developed to improve the parser's accuracy, including in particular two methods for incorporating lexical information. One method uses probabilities of binary lexical dependencies; the other directly lexicalizes the grammar rules. Because the grammar rules are long, the number of rules is huge - more than thirty thousand from a corpus of one million words. A parsing algorithm which can efficiently handle such a large grammar is described. A Japanese parser based on the same idea was also developed.

Corpus-based sublanguage studies were conducted to relate the notion of sublanguage to lexical and syntactic properties of a text. A statistical method based on word frequencies was developed to define sublanguages within a collection of documents; this method was evaluated by identifying the sublanguage of new documents. Relative frequencies of different syntactic structures were used to assess the domain dependency of syntactic structure in a multi-domain corpus. Cross-entropy measurements showed a clear distinction between fiction and non-fiction domains. Experiments were then performed in which grammars trained on individual domains, or sets of domains, were used to parse texts in the same or other domains. The results correlate with the measurements of syntactic variation across domains; in particular, the best performance is achieved using grammars trained on the same or similar domains.

The parsing and sublanguage techniques were applied to speech recognition. Sublanguage techniques were able to increase recognition accuracy, and some promising cases were found where the parser was able to correct recognition errors.
Ph.D. Thesis 1998 Abstract Models of Distributed Memory Management Ungureanu, Cristian Abstract | PDF

Title: Abstract Models of Distributed Memory Management

Candidate: Ungureanu, Cristian

Advisor(s): Goldberg, Benjamin

Abstract:

In this dissertation, we present a model suitable for reasoning about memory management in concurrent and distributed systems. The model provides a suitable level of abstraction: it is low-level enough that we can express communication, allocation and garbage collection, but otherwise hides many of the lower-level details of an actual implementation. Using it, we can give compact and provably correct characterizations of garbage collection algorithms in distributed systems.

The models are rewriting systems whose terms are programs in which the "code" and the "store" are syntactically apparent. Evaluation is expressed as conditional rewriting and includes store and communication operations. Using techniques developed for communicating and concurrent systems we give a semantics suitable for proving equivalence of such programs. Garbage collection becomes a rewriting relation that removes part of the store without affecting the behavior of the program.

We introduce and prove correct a very general garbage collection rule based on reachability; any actual implementation which is capable of providing the transitions (including their atomicity constraints) specified by the strategy is therefore correct. We give examples of such specific implementations, and show how their correctness follows from the correctness of the general relation.
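
The idea of garbage collection as a step that removes part of the store based on reachability can be illustrated with a minimal sketch. It assumes a single sequential store represented as a dictionary, which is far simpler than the distributed rewriting systems of the thesis; the names `collect`, `store`, and `roots` are hypothetical.

```python
def collect(store, roots):
    """Return the part of the store that is reachable from the roots.

    `store` maps a location to the list of locations its cell refers to.
    Cells that cannot be reached from any root are dropped.
    """
    reachable = set()
    pending = list(roots)
    while pending:
        loc = pending.pop()
        if loc in reachable or loc not in store:
            continue
        reachable.add(loc)
        pending.extend(store[loc])
    return {loc: refs for loc, refs in store.items() if loc in reachable}

# Hypothetical store: "z" points into live data but is itself unreachable,
# and "w" is a self-referencing cell that nothing else points to.
store = {"x": ["y"], "y": [], "z": ["x"], "w": ["w"]}
live = collect(store, roots=["x"])
```

Here only the cells `x` and `y` survive. A program that starts from the root `x` cannot observe the difference between `store` and `live`, which is the sense in which such a step leaves the behavior of the program unchanged.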
Ph.D. Thesis 1998 Fault-tolerant parallel computing on networks of non-dedicated workstations Wyckoff, Peter Abstract | PDF

Title: Fault-tolerant parallel computing on networks of non-dedicated workstations

Candidate: Wyckoff, Peter

Abstract:

This thesis addresses fault tolerance issues in parallel computing on loosely-coupled networks of non-dedicated, heterogeneous workstations. The efficiency of fault tolerance mechanisms is dictated by network and failure characteristics. Traditional approaches to fault tolerance are efficient when network and failure characteristics are identical across workstations, such as in a local area network of homogeneous workstations; however, a loosely coupled network of non-dedicated workstations has non-uniform network and failure characteristics. This thesis presents the design and implementation of a flexible fault tolerance runtime system that allows each process in a parallel application to use one of three rollback recovery mechanisms. Rollback recovery is achieved using a lightweight form of transaction, which performance results show incurs almost no overhead. The system is built on top of the Linda coordination language and runs on Alpha, Linux, Solaris and SGI workstations and Java-enabled browsers. For barrier synchronous parallel applications, a new equi-distant checkpointing interval selection method, the expected maximum heuristic, is presented. The method is applicable to any rollback recovery system in which processes recover from failure independently and communicate through a reliable third party. Simulation results show that the expected maximum heuristic has near optimal performance under a variety of different failure rates and barrier lengths.
Ph.D. Thesis 1997 Multiscale Snakes: Resolution-Appropriate Shape Descriptions Baldwin, Bernard Abstract | PDF

Title: Multiscale Snakes: Resolution-Appropriate Shape Descriptions

Candidate: Baldwin, Bernard

Advisor(s): Geiger, Davi

Abstract:

We present a new type of "snake" in which the dimensionality of the shapes is scaled appropriately for the resolution of the images in which the shapes are embedded. We define shapes as an ordered list of control points and compute the principal components of the shapes in a prior training set. Our energy function is based upon the Mahalanobis distance of a given shape from the mean shape and on the Mahalanobis distance of the image attributes from image attribute values extracted from the training set. We show that the derivative of this energy function with respect to the modal weights is reduced as the image resolution is reduced, and that the derivative of the energy scales with the variance associated with each mode. We exploit this property to determine the subset of the modes which are relevant at a particular level of image resolution, thereby reducing the dimensionality of the shapes. We implement a coarse-to-fine search procedure in the image and shape domains simultaneously, and demonstrate this procedure on the identification of anatomic structures in Computed Tomography images and on the identification of military vehicles in range images.
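
As a small illustration of the shape term, the sketch below computes the Mahalanobis distance of a shape from the mean shape when the shape is expressed by its weights on the principal components. In that basis the covariance is diagonal, so the distance reduces to a variance-weighted sum over modes. The function name, the `num_modes` parameter, and the numbers are assumptions for the example, and the rule the thesis uses to select modes at each resolution is not reproduced here.

```python
import math

def mahalanobis_in_mode_space(weights, variances, num_modes=None):
    """Mahalanobis distance from the mean shape, using the first `num_modes` modes.

    `weights[i]` is the shape's coefficient on principal component i and
    `variances[i]` is the training-set variance along that component.
    """
    m = len(weights) if num_modes is None else num_modes
    return math.sqrt(sum(b * b / v for b, v in zip(weights[:m], variances[:m])))

# Hypothetical modal weights and per-mode variances.
weights = [2.0, 1.0, 0.5]
variances = [4.0, 1.0, 0.25]
full = mahalanobis_in_mode_space(weights, variances)
coarse = mahalanobis_in_mode_space(weights, variances, num_modes=2)
```

Dropping modes, as in `coarse`, is one way to picture working with a lower-dimensional shape description at coarse image resolution.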
Ph.D. Thesis 1997 Deformable Object Recognition with Articulations and Occlusions Liu, Tyng-Luh Abstract | PDF

Title: Deformable Object Recognition with Articulations and Occlusions

Candidate: Liu, Tyng-Luh

Advisor(s): Geiger, Davi

Abstract:

The subject of this thesis is deformable object recognition. We concentrate on issues of articulations and of occlusions.

In order to find a target object (undergoing articulations) in an image we use the following procedures: (i) extracting key features in an image, (ii) detecting key points in the model, (iii) efficiently searching through possible image segmentations and (iv) comparing and grouping shapes. Together, they reconstruct the target object in the image. A Bayesian rationale is presented to justify this strategy.

Our main focus in this thesis is on (iii) and (iv). More precisely, we are interested in shape representation, shape similarity and combining shape similarity with image segmentation.

We consider two possible shape representations for an object. The first is given by its shape contour (SC), or silhouette, and the other is described by the structure of symmetry axis (SA), or skeleton, which has a unique free tree structure. For shape similarity, we review a string matching method based on the SC representation and then, we develop a tree matching scheme using the SA-tree representation. The advantage of this approach is that it becomes extremely simple to account for articulations and occlusions. As a novelty, the SA is obtained via a shape comparison between an SC and its mirror version. Finally we study how to integrate the shape module, for both shape representations (SC and SA), with an active contour tracker to yield an image segmentation.

Our efforts through all these issues have been to provide methods that are guaranteed to find optimal solutions.

We also address the topic of occluded object recognition but from a different viewpoint. Our method treats it as a function approximation problem with an over-complete basis (a library of image templates), but also accounts for occlusions, where the basis superposition principle is no longer valid. Since the basis is over-complete, there are infinitely many ways to decompose the image. We are motivated to select a sparse/compact representation of the image and to account for occlusions and noise.
Ph.D. Thesis 1997 Partial evaluation of concurrent programs Marinescu, Mihnea Abstract | PDF

Title: Partial evaluation of concurrent programs

Candidate: Marinescu, Mihnea

Advisor(s): Goldberg, Benjamin

Abstract:

The goal of this dissertation is to develop partial evaluation (program specialization) techniques specific to concurrent programs.

The language chosen for this investigation is a very simple CSP-like language. A standard binding-time analysis for imperative languages is conservatively extended in order to deal with the basic concurrent constructs: synchronous communication and nondeterministic choice. Based on the resulting binding-time annotations, a specialization transformation is formally defined using a labeled transition system with actions. The correctness of the partial evaluation is stated and a proof is included. This result is closely related to (strong) bisimulation, the equivalence relation on transition systems. We name the two directions of the bisimulation equivalence soundness and completeness, respectively.

In order to maintain a clear presentation, this simple specialization algorithm addresses only the data transfer component of the communication; a post-specialization analysis for the detection and removal of redundant synchronizations (i.e. synchronizations whose removal does not increase the nondeterminism of a program) is presented separately. This redundant-synchronization analysis is based on the characterization of dependencies in a CSP-like language.

Several pragmatic issues such as improving the binding-time analysis, controlling loop unrolling and the consequences of lifting nondeterminism from run-time to specialization-time are discussed. Two additional binding-time analyses are presented. We call one of them speculative because the specialization transformation based on it is sound but not complete. We call the other one extended because it includes an on-line redundant-synchronization analysis.

The relationship between partial evaluation and different types of fairness is also studied. In order to deal with a wide range of fair run-time systems, ranging from strong to weak, and from process-fair to channel-fair and communication-fair, we use a general operational framework for specifying fairness properties as systematic means of reducing nondeterminism. We then prove the correctness (as bisimulation equivalence) or just the soundness of specialization transformations under various binding-time analyses.

Throughout the dissertation, the power of the newly developed techniques is shown in several examples.
Ph.D. Thesis 1997 Pricing and Hedging Volatility Risk in Interest-Rate Derivatives Porras, Juan Abstract | PDF

Title: Pricing and Hedging Volatility Risk in Interest-Rate Derivatives

Candidate: Porras, Juan

Advisor(s): Avellaneda, Marco

Abstract:

This work addresses the problem of pricing interest-rate derivative securities and the use of quoted prices of traded instruments to calibrate the corresponding interest-rate dynamics. To this end, an arbitrage-free model of interest rate evolution is adopted, for which the local drift will depend on the history of volatility, thus leading to path-dependent pricing. This model is based on the Heath-Jarrow-Morton formulation but, in addition, presupposes that the volatility process is not defined a priori. This leads to a path-dependent model that can be formulated in a Markovian framework by considering additional state-variables and hence increasing the dimensionality of the computation. Instead of solving the resulting 3-dimensional partial differential equation, an alternative approach, based on conditional expectations of the history of volatility, is taken. This pricing method is applied to a non-linear (adverse volatility) setting, and used as the core of a non-parametric model calibration technique. The algorithm, by performing an optimization over volatility surfaces, finds a volatility surface that matches the market prices of a given set of securities. This method also finds a hedge for volatility risk, using derivative securities as hedging instruments. In particular, we present results obtained for the problem of hedging American swaptions (options on interest-rate swaps) using European swaptions.

The conditional expectation approach is explored further, and found to be of interest in its own right for the pricing of several kinds of path-dependent instruments, providing an alternative to increasing state-space dimension in order to satisfy a Markov property. In particular, we show how this method speeds up the computation of prices for some types of exotic options, while being general enough to apply to both linear and non-linear pricing of portfolios.
Ph.D. Thesis 1997 Performance Modeling for Realistic Storage Devices Shriver, Elizabeth Abstract | PDF

Title: Performance Modeling for Realistic Storage Devices

Candidate: Shriver, Elizabeth

Advisor(s): Siegel, Alan; Wilkes, John

Abstract:

Managing large amounts of storage is difficult and becoming more so as both the complexity and number of storage devices are increasing. One approach to this problem is a self-managing storage system. Since a self-managing storage system is a real-time system, it requires a model that quickly approximates the behavior of the storage device in a workload-dependent fashion. We develop such a model.

Our approach to modeling devices is to model the individual components of the device, such as queues, caches, and disk mechanisms, and then compose the components. To determine the performance of a component, each component modifies the entering workload use patterns and determines the performance from the workload use patterns and the lower-level device behavior. For example, modifying the use patterns allows us to capture the altered spatial locality that occurs when queues reorder their requests.

Our model predicts the device behavior in terms of response time within an 8% relative error for an interesting subset of the domain of devices and workloads. To demonstrate this, the model has been validated with synthetic traces of parallel scientific file system applications and traces of transaction processing applications.

Our contributions to the area of performance modeling for storage devices include the following:

1. Methods to approximate the positioning time for the disk head of a magnetic disk.
2. Methods to approximate the queue delay for non-FCFS scheduling algorithms.
3. Methods to approximate the cache-miss probabilities and the full and partial cache-hit probabilities in the data caches in the I/O path using measures of workload spatial locality.
4. Methods to approximate the mean seek time and rotational latency of the disk mechanism using measures of workload spatial locality.
5. An infrastructure for developing a composite model. The infrastructure supports the development of more complicated devices and workloads than we have validated.

Together, these mean that we have analytic methods to approximate the behavior of a set of realistic storage devices.
Ph.D. Thesis 1996 Algorithms in Semi-Algebraic Geometry Basu, Saugata Abstract | PDF

Title: Algorithms in Semi-Algebraic Geometry

Candidate: Basu, Saugata

Advisor(s): Pollack, Richard

Abstract:

In this thesis we present new algorithms to solve several very general problems of semi-algebraic geometry. Our algorithms are currently the best algorithms for solving these problems. In addition, we have proved new bounds on the topological complexity of real semi-algebraic sets, in terms of the parameters of the polynomial system defining them, which improve some old and widely used results in this field.

The first part of the thesis deals mainly with the decision problem for the first order theory of real closed fields, and the more general problem of quantifier elimination. We give algorithms which improve the complexity of all the previously known algorithms for these problems. Moreover, our techniques allow us to prove some purely mathematical theorems on the number of connected components and on the existence of small rational points in a given semi-algebraic set.

The second part of this work deals with connectivity questions of semi-algebraic sets. We develop new techniques in order to give an algorithm for computing roadmaps of semi-algebraic sets which improves on the complexity of the previous algorithms for this problem.

The third part of this work deals with bounding the topological complexity of semi-algebraic sets in terms of the number and the degrees of the polynomials describing them. We extend and improve a classical and widely used result of Oleinik and Petrovsky (1949), Thom (1965) and Milnor (1964), bounding the sum of the Betti numbers of semi-algebraic sets. Using the ideas behind this result, we give the first singly exponential algorithm for computing the Euler characteristic of an arbitrary semi-algebraic set.

One common thread that links these results is that our bounds are separated into a combinatorial part (the part depending on the number of polynomials) and an algebraic part (the part depending on the degrees of the polynomials). The combinatorial part of the complexity of our algorithms is frequently tight and this marks the improvement of many of our results. This is most striking when one considers that in many applications, for instance in computational geometry, it is the number of polynomials which is the most important parameter (the degrees and the number of variables are usually small). Another important and new feature of some of our results is that when the given semi-algebraic set is contained in a lower dimensional variety, the combinatorial part of the complexity depends on the dimension of this variety rather than on the dimension of the ambient space. This is useful when one considers semi-algebraic sets which have low real dimension embedded in a higher dimensional space.
Ph.D. Thesis 1996 Statistical Source Channel Models for Natural Language Understanding Epstein, Mark Abstract | PDF

Title: Statistical Source Channel Models for Natural Language Understanding

Candidate: Epstein, Mark

Advisor(s): Grishman, Ralph

Abstract:

The problem of Natural Language Understanding (NLU) has intrigued researchers since the 1960s. Most researchers working in computational linguistics focus on linguistic solutions to their problems. They develop grammars and parsers to process the input natural language into a meaning representation. In this thesis, a new approach is utilized. Borrowing from the field of communication theory, an information theoretic approach to natural language understanding is applied. This is based on the source-channel model of communication.

The source-channel model of NLU assumes that the user has a meaning in the domain of the application that he wishes to convey. This meaning is sent through a noisy channel. The observer receives the English sentence as output from the noisy channel. The observer then submits the English sentence to a decoder, which determines the meaning most likely to have generated the English. The decoder uses mathematical models of the channel and the meanings to process the English sentence. Thus, the following problems must be addressed in a source-channel model for NLU:

1. A mathematical model of the noisy channel must be developed.
2. The parameters of the model must be set, either manually or by an automatic training procedure.
3. A decoder must be built to search through the meaning space for the most likely meaning to have generated the observed English.

This dissertation focuses on the first two of these problems. Several mathematical models of the noisy channel are developed. They are trained from a corpus of context-independent sentence pairs consisting of both English and the corresponding meaning. The parameters of the models are trained to maximize the likelihood of the model's prediction of the observed training data using Dempster and Laird's Expectation-Maximization algorithm. Results are presented for the Air Travel Information Service (ATIS) domain.
Ph.D. Thesis 1996 Solving the Navier-Stokes Equations on a Distributed Parallel Computer Sabbagh, Hadil Abstract | PDF

Title: Solving the Navier-Stokes Equations on a Distributed Parallel Computer

Candidate: Sabbagh, Hadil

Advisor(s): Peskin, Charles S.

Abstract:

Speed and space are two major issues in computational fluid dynamics today. Scalable parallel or distributed computers offer the promise of faster time to solve problems through parallelism and solutions to larger problems by adding more parallel processors with their own private memories. These systems use message passing to share data between processors. Parallel programs are difficult to write, especially for message passing systems, and there are few well-studied test cases.

In this dissertation, we solve the incompressible Navier-Stokes equations on a periodic cubic domain (3-torus). The numerical method is a finite difference method that consists of two parts: upwind differencing applied to the non-linear terms and solution of the Stokes equations. The latter are solved implicitly using a three-dimensional FFT. For the parallel implementation, the domain is divided into equally sized non-periodic cubic subdomains. Each subdomain is assigned to a processor; the processors form a process grid which is periodic. The parallel upwind differencing is preceded by an exchange of face data. The discrete Fourier transform in the Stokes solver is computed by applying one-dimensional FFTs sequentially in the three coordinate directions. In each coordinate direction, data must be exchanged only among those processors that lie on the same line of the process grid.
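The separability that the Stokes solver relies on can be checked on a small grid. The sketch below is not the thesis code: it is a serial, pure-Python illustration with a naive $O(n^2)$ transform standing in for the FFT, and all function names and the sample data are assumptions made for this example. It applies one-dimensional transforms along each coordinate direction in turn and compares the result with a direct three-dimensional transform.

```python
import cmath
from itertools import product


def dft_1d(values):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(values[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def line_indices(shape, axis, fixed):
    """Grid indices of the line along `axis` with the other two coordinates fixed."""
    other_axes = [a for a in range(3) if a != axis]
    indices = []
    for i in range(shape[axis]):
        idx = [0, 0, 0]
        idx[axis] = i
        for a, v in zip(other_axes, fixed):
            idx[a] = v
        indices.append(tuple(idx))
    return indices


def dft_along_axis(grid, shape, axis):
    """Apply dft_1d to every grid line that runs along `axis`."""
    out = {}
    other_shape = [shape[a] for a in range(3) if a != axis]
    for fixed in product(*(range(s) for s in other_shape)):
        indices = line_indices(shape, axis, fixed)
        transformed = dft_1d([grid[idx] for idx in indices])
        for idx, value in zip(indices, transformed):
            out[idx] = value
    return out


def dft_3d_separable(grid, shape):
    """3-D transform built from 1-D transforms in the three directions."""
    for axis in range(3):
        grid = dft_along_axis(grid, shape, axis)
    return grid


def dft_3d_direct(grid, shape):
    """Direct 3-D transform from the definition, used only as a reference."""
    nx, ny, nz = shape
    result = {}
    for kx, ky, kz in product(range(nx), range(ny), range(nz)):
        total = 0
        for (x, y, z), value in grid.items():
            phase = kx * x / nx + ky * y / ny + kz * z / nz
            total += value * cmath.exp(-2j * cmath.pi * phase)
        result[(kx, ky, kz)] = total
    return result


def make_grid(shape):
    """Deterministic sample data (arbitrary, for illustration only)."""
    nx, ny, nz = shape
    return {
        (x, y, z): complex(x + 2 * y + 3 * z, x * y - z)
        for x, y, z in product(range(nx), range(ny), range(nz))
    }


def max_difference(shape=(2, 3, 4)):
    """Largest pointwise gap between the separable and direct transforms."""
    grid = make_grid(shape)
    separable = dft_3d_separable(grid, shape)
    direct = dft_3d_direct(grid, shape)
    return max(abs(separable[k] - direct[k]) for k in separable)
```

In a distributed setting each call to `dft_along_axis` touches only the data on lines parallel to one axis, which is consistent with the abstract's remark that data need be exchanged only among processors on the same line of the process grid.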

The parallel algorithm was implemented twice: once using PVM and once using MPI. Although both implementations are described in the thesis, the performance of only the MPI version is discussed.

The Navier-Stokes solver is tested on the IBM SP-2. Three constant-problem-size experiments and three scalability experiments are used to analyze the performance of the solver. The fluid solver achieves a speedup of 48.8 when solving a $240 \times 240 \times 240$ problem on 216 processors. Furthermore, there is evidence of scalability.
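For context, the reported speedup can be turned into a parallel efficiency figure. The calculation below assumes the conventional definition of efficiency as speedup divided by processor count; the abstract itself does not state an efficiency, and the symbols $E$, $S$, and $p$ are introduced here for illustration.

```latex
E \;=\; \frac{S}{p} \;=\; \frac{48.8}{216} \;\approx\; 0.226
```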
Ph.D. Thesis 1995 Synthesis and Verification of Controllers for Robotics and Manufacturing Devices with Temporal Logic and the "Control-D" System Antoniotti, Marco Abstract | PDF

Title: Synthesis and Verification of Controllers for Robotics and Manufacturing Devices with Temporal Logic and the "Control-D" System

Candidate: Antoniotti, Marco

Advisor(s): Mishra, Bud

Abstract:

This dissertation studies the semi-automated synthesis and verification of control systems for robotics and manufacturing devices using formal methods in a discrete framework, and bears some resemblance to the theory of controlled discrete event systems (CDES) of Ramadge and Wonham. The discrete controller components of a walking machine and of a manufacturing line in the Combat Ration Advanced Manufacturing Technology Demonstration (CRAMTD) of Rutgers University are constructed automatically using the algorithms developed here.

The goal of this research has been to facilitate the integration of CDES theory with the specification and verification formalisms for finite state systems. Many of our techniques rely on the application of some flavor of temporal logic. In particular, the model-checking techniques of Clarke and Emerson, for branching time temporal logic, proved to be valuable in the implementation of a controller synthesis tool for CDES, called Control-D. The main synthesis algorithm used by the Control-D tool compares favorably with the Ramadge-Wonham algorithm in time and space complexity, while achieving improved expressiveness in its underlying specification language.
Ph.D. Thesis 1995 Planning in an Imperfect World Using Previous Experiences Chiu, Jen-Lung Abstract | PDF

Title: Planning in an Imperfect World Using Previous Experiences

Candidate: Chiu, Jen-Lung

Advisor(s): Davis, Ernest

Abstract:

This thesis studies the problem of planning and problem solving in an unpredictable environment by adapting previous experiences. We construct a single agent planning system CADDY and operate it in a simple golf world testbed. The study of CADDY combines the studies of probabilistic, spatial, and temporal reasoning, adapting and reusing plans, and the tradeoff between gains and costs based on various considerations.

The CADDY planning system operates in an uncertain and unpredictable environment. Despite limited perception, incomplete knowledge, and imperfect motion control, CADDY achieves its goal efficiently by finding a plan that is already known to work well in a similar situation and applying repair heuristics to improve it. The capability of adapting experiences makes CADDY a planning system with learning capability.

In this thesis, we discuss the structure of the CADDY planning system and the results of experimental tests of CADDY when applied to a simulated golf world. We compare CADDY with several other research projects on probabilistic planners and planners that utilize experiences. We also discuss how CADDY can be characterized in terms of theoretical work on plan feasibility. Finally, we point out possible directions of system extension and generalizations of the ideas learned from CADDY to other problem domains. Currently CADDY is not directly applied to real-world problems, but it shows an interesting and promising direction of study. By combining the techniques of probabilistic reasoning, planning, and learning, the performance of planning on real-world domains can be improved dramatically.
Ph.D. Thesis 1995 Geodesic Problems in High Dimensions Choi, Joonsoo Abstract | PDF

Title: Geodesic Problems in High Dimensions

Candidate: Choi, Joonsoo

Advisor(s): Yap, Chee

Abstract:

The geometric shortest path (geodesic) problem can be formulated as follows: given a collection of obstacles in $\R^d$, and source and target points $s, t \in \R^d$, find a shortest obstacle-avoiding path between $s$ and $t$. This thesis studies the Euclidean geodesic problem in $\R^3$ with polyhedral obstacles and the rectilinear geodesic problem in $\R^d$ with pairwise-disjoint, axes-parallel boxes.

Computing Euclidean geodesics in $\R^3$ with polyhedral obstacles is known to be NP-hard. In contrast, Papadimitriou gave a polynomial-time approximation algorithm for this problem. Unfortunately his complexity analysis involves an unusual mixture of both the algebraic computing model and the bit computing model. In the first part of the thesis, we present a true bit complexity analysis: there is an approximation algorithm that computes a geodesic with relative error $\epsilon > 0$ in $O((n^3M\log{M} + (nM)^2) \cdot \mu(W))$ time, where $M=O(nL/\epsilon)$, $W = O(\log(n/\epsilon)+L)$, and $\mu(W)$ is the time complexity of multiplying two $W$-bit integers. Our algorithm is a variant of Papadimitriou's algorithm.

The second part of the thesis addresses the rectilinear geodesic problem in $\R^3$ with a set of pairwise-disjoint, axes-parallel boxes. A monotonicity property of rectilinear geodesics is shown: every obstacle-avoiding geodesic between two points is monotone along at least one of the coordinate directions. Using the monotonicity property of geodesics, an algorithm computing a geodesic from a query point to a fixed point is presented. The preprocessing time of the algorithm is $O(n^2 \log n)$ and each query takes $O(\log n + k)$ time, where $k$ is the number of edges in a geodesic.

The last part of the thesis generalizes the above monotonicity property to all dimensions: given a set of pairwise-disjoint, axes-parallel boxes in $\R^d$, every obstacle-avoiding geodesic between two points is monotone along at least one of the coordinate directions.
Ph.D. Thesis 1995 Practical Structures for Parallel Operating Systems Edler, Jan Abstract | PDF

Title: Practical Structures for Parallel Operating Systems

Candidate: Edler, Jan

Advisor(s): Gottlieb, Allan

Abstract:

Large shared memory MIMD computers, with hundreds or thousands of processors, pose special problems for general purpose operating system design. In particular:

1. Serial bottlenecks that are insignificant for smaller machines can seriously limit scalability.
2. The needs of parallel programming environments vary greatly, requiring a flexible model for runtime support.
3. Frequent synchronization within parallel applications can lead to high overhead and bad scheduling decisions.

Because of these difficulties, the algorithms, data structures, and abstractions of conventional operating systems are not well suited to highly parallel machines.

We describe the Symunix operating system for the NYU Ultracomputer, a machine with hardware support for Fetch&Phi operations and combining of memory references. Avoidance of serial bottlenecks, through careful design of interfaces and use of highly parallel algorithms and data structures, is the primary goal of the system. Support for flexible parallel programming models relies on user-mode code to implement common abstractions such as threads and shared address spaces. Close interaction between the kernel and user-mode runtime layer reduces the cost of synchronization and avoids many scheduling anomalies.

Symunix first became operational in 1985 and has been extensively used for research and instruction on two generations of Ultracomputer prototypes at NYU. We present data gathered from actual multiprocessor execution to support our claim of practicality for future large systems.
Ph.D. Thesis 1995 Dreme: for Life in the Net Fuchs, Matthew Abstract | PDF

Title: Dreme: for Life in the Net

Candidate: Fuchs, Matthew

Advisor(s): Perlin, Ken

Abstract:

This dissertation makes four contributions towards supporting distributed, multi-user applications over open networks.

Dreme, a distributed dialect of the Scheme language in which all first-class language objects are mobile in the network. In particular, various distributed topologies, such as client/server and peer-to-peer, can be created by migrating closures with overlapping scopes around the network, correct inter-process communication being assured by Scheme's lexical scoping rules and network wide addressing. Threads of control are passed around through first-class distributed continuations.

A User Interface toolkit for coordinating events in multi-threaded, multi-user applications by organizing continuation callbacks into nested lexical scopes. Each event has certain attributes, such as synchronous/asynchronous. Certain events create new scopes with new events. Continuation callbacks allow both synchronous events, which return values to their callers, and asynchronous ones. Event-handling logic needn't be spread throughout the application, as it is in applications using an event loop.

A distributed garbage collection algorithm that collects all cycles on an open network. The basic algorithm depends on maintaining the inverse reference graph (IRG) among network nodes (i.e., if a->b is in the regular graph, b->a is in the IRG). A single IRG traversal from any object determines the status of each object touched. Communication is decentralized (any object can choose to determine its status), garbage is touched O(1) times (in the absence of failures), it is fault-tolerant, and can handle malicious or faulty neighbors. Each operation uses messages linear in the size of the IRG. Overlapping operations perform like parallel quick sort.

An approach to using the Standard Generalized Markup Language (SGML) over the network to support distributed GUIs, intelligent clients, and mobile agents. SGML is a meta-grammar for creating domain specific document markup languages to which a variety of semantics (display, reading/writing databases, etc.) can be applied. The document, its grammar, and some semantics, are retrieved over the network. Applications normally create interfaces directly out of graphic objects to communicate with the user. However, if the interface has some semantics (and is parsable), a computational agent can interpret the interface and talk directly to the application on behalf of the human.
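The inverse reference graph mentioned in the garbage collection contribution above can be illustrated with a minimal single-process sketch. It is not the dissertation's distributed algorithm: the function names, the `roots` set, and the rule that an object counts as live when a backward walk reaches a root are all assumptions made here for illustration.

```python
from collections import defaultdict, deque


def inverse_reference_graph(references):
    """Build the IRG: for every edge a -> b in `references`, record b -> a."""
    irg = defaultdict(set)
    for source, targets in references.items():
        irg[source]  # make sure every source also appears as a node
        for target in targets:
            irg[target].add(source)
    return dict(irg)


def is_reachable_from_root(obj, irg, roots):
    """Walk the IRG backwards from `obj`; True if any root is encountered."""
    seen = {obj}
    queue = deque([obj])
    while queue:
        node = queue.popleft()
        if node in roots:
            return True
        for referrer in irg.get(node, ()):
            if referrer not in seen:
                seen.add(referrer)
                queue.append(referrer)
    return False


# Hypothetical object graph: "c" and "d" form a cycle that no root references.
references = {"root": ["a"], "a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
irg = inverse_reference_graph(references)
```

With this data, a backward walk from `"b"` reaches `"root"`, while a walk from `"c"` only ever visits the cycle `"c"`, `"d"`, so in this toy setting the cycle would be classified as garbage.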
Ph.D. Thesis 1995 Fault-tolerant Parallel Processing Combining Linda, Checkpointing, and Transactions Jeong, Karpjoo Abstract | PDF

Title: Fault-tolerant Parallel Processing Combining Linda, Checkpointing, and Transactions

Candidate: Jeong, Karpjoo

Advisor(s): Shasha, Dennis

Abstract:

With the advent of high performance workstations and fast LANs, networks of workstations have recently emerged as a promising computing platform for long-running coarse grain parallel applications. Their advantages are wide availability and cost-effectiveness, as compared to massively parallel computers. Long-running computation in the workstation environment, however, requires both fault tolerance and the effective utilization of idle workstations.

In this dissertation, we present a variant of Linda, called Persistent Linda (PLinda), that treats these two issues uniformly: specifically, PLinda treats non-idleness as failure.

PLinda provides a combination of checkpointing and transaction support on both data and program state (an encoding of continuations). The traditional transaction model is simplified and then extended to support robust parallel computation. Treatable failures include processor and main memory hard and slowdown failures, and network omission and corruption failures.

The programmer can customize fault tolerance when constructing an application, trading failure-free performance against recovery time. When creating a PLinda program, the programmer can decide on the frequency of transactions and the encoding of continuations to be saved upon transaction commit. At runtime, the programmer can decide to suppress certain continuations for better failure-free performance.

PLinda has been applied to corporate bond index statistics computation and biological pattern recognition.
Ph.D. Thesis 1995 A Model-Based 3-D Object Recognition System Using Geometric Hashing with Attributed Features Liu, Jyhjong Abstract | PDF

Title: A Model-Based 3-D Object Recognition System Using Geometric Hashing with Attributed Features

Candidate: Liu, Jyhjong

Advisor(s): Hummel, Robert

Abstract:

We build an object recognition system that is able to recognize 3-D objects such as vehicles embedded in highly complicated backgrounds. We use the geometric hashing method, augmenting the approach through the use of attributed features, k-d trees for access to features, and the use of bounds in order to limit the search.

We make use of expressive features to improve the performance of a geometric hashing object recognition system. Various kinds of attributed features, such as the midpoint of a line segment with its orientation, the endpoints of a line segment with its orientation, and the center and the circle features are extracted and used in our system.

The number of features as well as the type of features in each model can vary. We make use of weighted voting, which has a Bayesian interpretation. The distribution of the invariants for various features as well as the bounds of the weighted voting formula are analyzed. In order to improve the performance of the system, we use a k-d tree to search entries in high-dimensional hash tables. The method is generalized in order to treat variables taking on values from a non-interval domain, such as data measuring angles. To make use of available computer resources, we distribute the computation, assigning evidence accumulation for a single hypothesis to one processor in a multiple processor and multiple workstation environment. The implementation reduces the communication overhead to a minimum. The system is implemented using the Khoros software development system.

The results of target recognition are reported in numerous experiments. The experiments show that the use of more expressive features improves the performance of the recognition system.
Ph.D. Thesis 1995 Grasping and Fixturing: a Geometric Study and an Implementation Teichmann, Marek Abstract | PDF
Title: Grasping and Fixturing: a Geometric Study and an Implementation

Candidate: Teichmann, Marek

Advisor(s): Mishra, Bud

Abstract:

The problem of immobilizing an object by placing ``fingers'' (or points) on its boundary occurs in the field of dexterous manipulation, manufacturing and geometry. In this dissertation, we consider the purely static problems of good grasp and fixture set synthesis, and explore their connection to problems in computational and combinatorial geometry. Two efficient randomized approximation algorithms are proposed for finding the smallest cover for a given convex set and for finding the largest magnitude by which a convex set can be scaled and still be covered by a cover of a given size. They generalize an algorithm by Clarkson. The cover points are selected from a set of n points. The following bounds are valid for both types of problems. For the former, c is the size of the optimal cover, and for the latter, c is the desired cover size. In both cases, a cover of size $4 cd \lg c$ is returned.

The running time depends on the set to be covered. Covering an $n$-vertex polytope in $R^d$ takes $O(c^2 n \log n \log c)$ expected time, and covering a ball takes $O(nc^{1+\delta}+c^{\lfloor{d/2}\rfloor+1}\log n\log^{\lfloor{d/2}\rfloor} c)$ expected time. These algorithms have applications to finding a good grasp or fixture set. An $O(n^2 \log n)$ algorithm for finding optimal 3-finger grasps for $n$-sided polygons is also given.

We also introduce a new grasp efficiency measure based on a certain class of ellipsoids, invariant under rigid motions of the object coordinate system. To our knowledge, this is the first measure having this property. We also introduce a new reactive grasping paradigm which does not require a priori knowledge of the object. This paradigm leads to several reactive algorithms for finding a grasp for parallel jaw grippers and three finger robot hands equipped with simple sensors. We show their correctness and discuss our implementation of one such algorithm: a parallel jaw gripper with light-beam sensors which we have built. A short video demonstration will also be shown.
Ph.D. Thesis 1994 Systolic Combining Switch Designs Dickey, Susan Abstract | PDF

Title: Systolic Combining Switch Designs

Candidate: Dickey, Susan

Advisor(s): Gottlieb, Allan

Abstract:

High-performance VLSI switches are needed in the interconnection network of massively parallel shared memory multiprocessors. The switch designs we consider alleviate the ``hot spot'' problem by adding extra logic to the switches to combine conventional loads and stores as well as fetch-and-$\phi$ operations destined for the same memory location. The performance of three buffered switch architectures was investigated through probabilistic analysis and simulation: Type A switches, with $k$ queues, one at each output, each accepting $k$ inputs per cycle; and two one-input queue designs, Type B switches, with $k^2$ output queues, and Type C switches, with $k$ input queues. While the Type C switch is less expensive, Type A and B have considerably better performance. An efficient CMOS implementation for systolic queue designs was devised. A non-combining switch containing these systolic queues was fabricated through MOSIS in 3 micron CMOS and employed the NORA clocking methodology, using qualified clocks for distributing global control.

A combining switch was fabricated in 2 micron CMOS for use in the 16 by 16 processor/memory interconnection network of the NYU Ultracomputer prototype. Details are given about the internal logic of the two component types used in the network. A design usable in networks of size up to 256 * 256 has been prepared for fabrication by NCR at a smaller feature size in a higher pincount package. Differences in the logic partitioning of the two designs are described. We describe the performance of these designs for systems of up to 1024 PEs obtained through simulation. Our experience in implementing a combining switch indicates that the cost of hardware combining is much less than is widely believed. We compare the cost of a combining switch to that of a non-combining switch and discuss the scalability of the implemented design to large numbers of processors. Differences in the capabilities of combining switch architectures are studied. We describe the implementation of ``two-and-a-half-way'' combining, which promises to avoid network saturation in large networks at only slightly greater cost than two-way combining. We also discuss implementation alternatives and performance for a 4 by 4 combining switch.
Ph.D. Thesis 1994 Gedanken: A tool for Pondering the Tractability of Correct Program Technology Ericson, Lars Abstract | PDF

Title: Gedanken: A tool for Pondering the Tractability of Correct Program Technology

Candidate: Ericson, Lars

Advisor(s): Mishra, Bud

Abstract:

We examine the feasibility of the Correct Program Technology (CPT) approach to program verification using available technology, with pessimistic results.

We compare CPT with RAPTS and the Calculus of Constructions. We specify the Correct Programmer's Workbench (CPW), and review six programming environments as platforms. We define a Correct Program Editor and prototype it in Mathematica.

CPT applies decision procedures for specification sublanguages to make shorter proofs, hoping these shorter proofs will have faster verifications, but these sublanguages are NP-Complete or worse. We review some heuristics for improving their average case. CPT relies on a sublanguage of set theory, MLS. We prove that MLS is NP-average complete in the sense of the Levin-Gurevich theory of average case complexity. We conjecture that shorter proofs of random theorems cost more to verify.

EMLS is an elementary relational language (ERL). We define syntactic simplification rule sets (SSRs) for ERLs. The average case effect of an SSR is determined by the number of matches of the SSR with ERL sentences of $n$ individual variables. EMLS sentences over $n$ variables can be constructed from sentences in $L_{4,2,n}$ and $L_{2,3,n}$, where $L_{k,m,n}$ is the language of $k$ relations of $m$ arguments over $n$ variables. We recursively define a match-counting algorithm for $L_{k,m,n}$ SSRs and extend it to EMLS. If an SSR has $p$ patterns in $w$ pattern variables over $n$ individual variables, match counting costs $O(p n^w 2^{p n^w - 1} (2 + k n^m))$. Match counting for $L_{k,0,0}$ is in #P, and we conjecture that it is #P-Complete. We conjecture generating functions do not yield a method of approximating the number of matches, and we conjecture that the problem of approximating matches is also #P-Complete. We count the matches for low $n$ for some EMLS SSRs, with discouraging results, and note that the matches of an effective rule set must grow as the size of the language for $n$ variables.

We conclude that the remaining hope for verification is to build a large library of specification language constructs which occur frequently and can be verified in polynomial time.
Ph.D. Thesis 1994 Designing Pattern Matching Algorithms by Exploiting Structural Pattern Properties Hariharan, Ramesh Abstract | PDF

Title: Designing Pattern Matching Algorithms by Exploiting Structural Pattern Properties

Candidate: Hariharan, Ramesh

Advisor(s): Cole, Richard

Abstract:

Exact Complexity of String Matching: We consider the question of how many character comparisons are needed to find all occurrences of a pattern string of length $m$ in a text string of length $n$. We show an almost tight upper bound of the form $n + O(n/m)$ character comparisons, following preprocessing. Specifically, we show an upper bound of $n + \frac{8}{3(m+1)}(n-m)$ character comparisons. The following lower bounds are also shown: for on-line algorithms, a bound of $n + \frac{9}{4(m+1)}(n-m)$ character comparisons for $m = 35 + 36k$, for any integer $k \ge 1$, and for general algorithms, a bound of $n + \frac{2(n-m)}{m+3}$ character comparisons, for $m = 2k+1$, for any integer $k \ge 1$.

Parallel Two-Dimensional Pattern Matching: We give the first time, space and work optimal common CRCW-PRAM algorithm for finding all occurrences of a two-dimensional pattern of size $m_1 \times m_2$ in a two-dimensional text of size $n_1 \times n_2$. Our algorithm runs in $O(1)$ time performing $O(n_1 n_2)$ work, following preprocessing of the pattern. A major portion of the preprocessing step is the computation of witnesses for the pattern. We show how to compute witnesses for the pattern in $O(\log\log m_2)$ time and $O(m_1 m_2)$ work when $m_2 \ge m_1$. In the process of designing the above algorithm, we also obtain some new periodicity properties of two-dimensional patterns.

Parallel Suffix Tree Construction: We consider the problem of constructing the suffix tree of a given string $s$ of length $m$ in parallel. An $O(m)$-work, $O(m)$-space, $O(\log^4 m)$-time CREW-PRAM algorithm for constructing the suffix tree of $s$ is obtained when $s$ is drawn from any fixed alphabet set. This is the first work and space optimal parallel algorithm known for this problem. It can be generalized to construct the suffix tree of a string $s$ drawn from any general alphabet set to perform in $O(\log^4 m)$ time, $O(m\log |\Sigma|)$ work, and $O(m\log |\Sigma|)$ space, after the characters in $s$ have been sorted alphabetically; here $|\Sigma|$ is the number of distinct characters in $s$. In this case too, the algorithm is work optimal.
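To get a feel for how close the comparison bounds in the string matching result are, they can be evaluated for concrete lengths. The sketch below simply encodes the three formulas quoted in the abstract using exact rational arithmetic; the function names and the sample values of $n$ and $m$ are assumptions chosen for illustration.

```python
from fractions import Fraction


def upper_bound(n, m):
    """Upper bound n + 8/(3(m+1)) * (n - m) on character comparisons."""
    return n + Fraction(8, 3 * (m + 1)) * (n - m)


def online_lower_bound(n, m):
    """Lower bound for on-line algorithms, stated for m = 35 + 36k, k >= 1."""
    if m < 71 or (m - 35) % 36 != 0:
        raise ValueError("bound is stated only for m = 35 + 36k with k >= 1")
    return n + Fraction(9, 4 * (m + 1)) * (n - m)


def general_lower_bound(n, m):
    """Lower bound for general algorithms, stated for m = 2k + 1, k >= 1."""
    if m < 3 or m % 2 == 0:
        raise ValueError("bound is stated only for m = 2k + 1 with k >= 1")
    return n + Fraction(2 * (n - m), m + 3)


# Sample values (hypothetical): a text of length 1000 and a pattern of length 71.
gap = upper_bound(1000, 71) - online_lower_bound(1000, 71)
```

For these sample values the upper bound is $1000 + 929/27$ and the on-line lower bound is $1000 + 929/32$, so the two differ by only a few comparisons on top of the leading term $n$.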
Ph.D. Thesis 1994 Compilation of Array-Style Programs for Distributed Memory MIMD Machines: a Geometric Approach Katz, Alex Abstract | PDF

Title: Compilation of Array-Style Programs for Distributed Memory MIMD Machines: a Geometric Approach

Candidate: Katz, Alex

Advisor(s): Schonberg, Edmond

Abstract:

Distributed memory MIMD (Multiple Instruction Multiple Data) machines are emerging as a cost-effective means of speeding up numerically intensive programs. They scale more easily than other parallel machines. But writing explicitly parallel programs for these machines is both difficult and error prone. Compilers for languages like HPF make the task easier by generating the necessary inter-processor communication from the data distribution directives supplied by the programmer. This dissertation shows that for a large class of array-style programs automatic data distribution can produce a significant speedup on a distributed memory MIMD machine. Array-style programs use array primitives to manipulate entire arrays, rather than looping explicitly over the array elements. APL programs are typically array-style.

We show how to apply automated data distribution to APL programs that treat arrays and operations on them as atomic. Automated data distribution determines the necessary inter-processor communication from the way APL primitives manipulate entire arrays, rather than by complex algebraic analysis of the patterns of array subscripts, as would be done in more conventional compilers. A simple distribution and alignment scheme automatically distributes arrays across available processors. Arrays can be dynamic, with sizes varying during program execution. Data distribution is guided by array size estimates. Distribution trade-off analysis attempts to optimize the initial distribution by comparing the estimated communication and computation times, and replicating arrays whose partitioning results in excessive communication.

Building on the APL to C compiler developed by W.-M. Ching, we produce explicitly parallelized C from APL source programs. We describe the parallel implementation of most of the APL primitives. The implementation of several APL primitives uses the monotonic data movement algorithm. The ideas developed are demonstrated with eight APL programs of varying complexity. We show the speedup and efficiency obtained when running these programs on 2 to 32 processors. The speedup achieved on 32 processors, ranging from 7 to 30, shows the technique to be applicable to a wide range of programs.
Ph.D. Thesis 1994 Lazy SETL Debugging with Persistent Data Structures Liu, Zhiqing Abstract | PDF

Title: Lazy SETL Debugging with Persistent Data Structures

Candidate: Liu, Zhiqing

Advisor(s): Schwartz, Jack

Abstract:

Debugging tools have been traditionally difficult to use, particularly in accumulating and exploring program runtime information. This dissertation addresses these issues by proposing a lazy debugging approach, which postpones investigation of debugging hypotheses until complete runtime history is available. This approach encourages a systematic way of debugging and supports many high-level debugging facilities. Recent advances in persistent data structures drastically reduce the time and memory space overhead incurred in recording and storing execution events, and also make the overhead easily manageable.

To demonstrate this approach, a visual SETL debugger prototype has been designed and implemented based on D. Bacon's SETL translator. This debugger has a persistent runtime system designed using the persistent data structures of the node splitting type, developed by Driscoll, et al. It can efficiently record changes in program execution state under different recording granularities, along with supporting normal SETL executions. Users of this debugger are provided with a graphical interface, which supports many powerful tools, such as forward/backward control/data breakpoints, interactive variable printing, program animation, and re-execution from a recorded execution moment.

A strong set of conclusions can be drawn from an evaluation of the debugger's performance and usability issues, as well as the limitations and open questions of this debugging approach.
Ph.D. Thesis 1994 Searching for Strings and Searching in Presence of Errors Muthukrishnan, S. Abstract | PDF

Title: Searching for Strings and Searching in Presence of Errors

Candidate: Muthukrishnan, S.

Advisor(s): Spencer, Joel

Abstract:

This dissertation deals with two classes of searching problems. The first class consists of pattern matching problems, and the second class comprises combinatorial searching problems in presence of errors in response to the queries. Our results are as follows.

Standard Stringology. Standard Stringology is the study of pattern matching problems in which a text location matches one in the pattern provided the associated symbols are identical. The basic problem here is the string matching problem of detecting all occurrences of a pattern string in a text string. This naturally generalizes to the dictionary matching problem of finding all occurrences of a set of patterns, rather than a single pattern, in a given text. Very fast optimal parallel algorithms exist for string matching in the PRAM model. These algorithms rely on structural properties of the strings. Unfortunately these structural properties are not useful for solving the dictionary matching problem. We have obtained the fastest and the most work-efficient algorithms known for this problem and a number of its variants by introducing and using a new technique called shrink-and-spawn.

Non-Standard Stringology. In problems from Non-Standard Stringology, an arbitrary many-to-many matching relation holds between the text and pattern locations. An example is string matching with ``don't cares'' where the position in the text that has a ``don't care'' symbol matches every pattern position. The inherent complexity and structure of such non-standard string matching problems is not well understood. Our main results are inherent complexity bounds for these problems, characterized in terms of algebraic convolutions. Traditionally structure in pattern matching has meant repetitions in patterns, but this work exposes a novel graph-theoretic structure in these problems.

Searching in presence of errors. Given a set of items containing one or more distinguished items, the generic combinatorial search problem is to determine the distinguished item(s) using detection tests on groups of items. Motivated by fault-tolerance issues, we consider the scenario when some tests get incorrect responses. We have developed a strategy to solve the generic problem above using at most one test more than that necessary, even under adversarial placement of incorrect responses to the tests.
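The ``don't care'' matching problem from the Non-Standard Stringology part above can be stated concretely with a naive quadratic matcher. This is only a sketch of the problem definition, not one of the dissertation's algorithms or bounds; the function name and the choice of `?` as the don't-care symbol are assumptions, and the sketch lets the symbol appear in either the text or the pattern.

```python
DONT_CARE = "?"  # hypothetical wildcard symbol chosen for this example


def match_with_dont_cares(text, pattern):
    """Return every start index at which `pattern` matches `text`.

    A position matches when the two symbols are equal or when either one
    is the don't-care symbol. Runs in O(len(text) * len(pattern)) time.
    """
    n, m = len(text), len(pattern)
    return [
        start
        for start in range(n - m + 1)
        if all(
            t == p or t == DONT_CARE or p == DONT_CARE
            for t, p in zip(text[start:start + m], pattern)
        )
    ]
```

Because a don't-care symbol matches anything, the matching relation is no longer transitive, which is one reason the periodicity arguments used for exact string matching do not carry over directly.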
Ph.D. Thesis 1994 Visual Programming Nickerson, Jeffrey Abstract | PDF

Title: Visual Programming

Candidate: Nickerson, Jeffrey

Advisor(s): Schonberg, Edmond

Abstract:

While computer science textbooks and classroom lectures are filled with diagrams, and much of our design activity as programmers takes place on whiteboards, we write our programs as text. Proponents of visual programming suggest that we should take advantage of graphic user interface technology and draw rather than write our programs. This dissertation examines the extent to which this is possible, addressing the question of how graphic representation can best be used in the process of programming.

The use of diagrams in the field of computer science is thoroughly surveyed, and some underlying principles identi- fied. The visual conventions of Adjoinment, Linking, and Enclosure are defined and illustrated. Three languages are developed - a simple programming language that encompasses shell commands, a visual version of APL, and a visual front end for Mathematica. The visual version of APL is notable in that it presents both a program and instances of data under- going transformation as part of one unified diagram.

Building on the work of R. J. A. Buhr, new visual systems designing conventions are created to handle the intricacies of facilities in the Ada9X language. Asynchronous transfers of control, requeueing, and generic formal parameters are addressed. The asynchronous transfer of control convention is suitable for CASE representations of the language con- struct, and can be easily animated.

Some existing software metrics are modified for use in analyzing diagrams, and two new metrics are proposed: graphic token count and diagram class complexity. A graphic design measure, data density, is transformed into a computer science measure, token density. Using these metrics, graphic representations can be compared to each other and to textual representations. From this, a strong set of conclusions is drawn about the relative strengths of graphic and textual representation, as well as the limits and possibilities of graphic representation in programming.
Ph.D. Thesis 1994 Representing Control in Parallel Applicative Programming Yao, Chi Abstract | PDF

Title: Representing Control in Parallel Applicative Programming

Candidate: Yao, Chi

Advisor(s): Goldberg, Benjamin

Abstract:

This research is an attempt to reason about the control of parallel computation in the world of applicative programming languages.

Applicative languages, in which computation is performed through function application and in which functions are treated as first-class objects, have the benefits of elegance, expressiveness and having clean semantics. Parallel computation and real-world concurrent activities are much harder to reason about than the sequential counterparts. Many parallel applicative languages have thus hidden most control details with their declarative programming styles, but they are not expressive enough to characterize many real world concurrent activities that can be easily explained with concepts such as message passing, pipelining and so on. Ease of programming should not come at the expense of expressiveness. Therefore, we design a parallel applicative language Pscheme such that programmers can express explicitly the control of parallel computation while maintaining the clean semantics and the ease of programming of applicative languages. In Pscheme, we propose the concept of ports to model the general control in parallel computation. Through program examples, we show how Pscheme and ports support various parallel programming paradigms. We have also built libraries for higher level control facilities with ports so that programming in Pscheme becomes easier.

We provide an operational semantics for Pscheme, and develop a compiler and a run time system on NYU's Ultracomputer. Our experiments with parallel programs have shown satisfactory speedup. We claim that ports are the natural parallel extensions of continuations in sequential computation, and thus conclude that representing general control in parallel applicative programming is feasible.
Ph.D. Thesis 1993 Cell-based Computer Models in Developmental Biology Agarwal, Pankaj Abstract | PDF
Title: Cell-based Computer Models in Developmental Biology

Candidate: Agarwal, Pankaj

Advisor(s): Schwartz, Jacob T.

Abstract:

In developmental biology, modeling and simulation play an important role in understanding cellular behavior. We suggest a simple language, the Cell Programming Language (CPL), to write computer programs to describe this behavior. Using these programs, it is possible to simulate and visualize cell behavior.

A genome is the program for the development of an organism. The genome, in conjunction with the environment, determines the behavior of each cell of the organism. The program for each cell (written in CPL) plays the role of its genome. The program for an individual cell consists of a set of states. In each state, rules are specified which determine the cell properties (i.e. shape, motility, concentrations of various molecular species, etc.). Different states of the same cell signify different phases in the cell's life. Each cell has a tissue type associated with it. Cells of the same tissue type execute the same CPL program.

We use the discrete time simulation model. At every time step, each cell executes all the instructions in its present state sequentially. All cells are assumed to be executing in parallel, with synchronization performed after every time step.
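
The synchronous discrete-time model can be sketched as follows. This is not CPL: it is a minimal Python illustration, and the one-dimensional row of cells, the diffusion-like rule, and the name `step` are all assumptions made for the example. The point it shows is that every cell reads the state left by the previous time step, and all updates become visible together at the synchronization point.

```python
def step(concentrations, rate=0.25):
    """Advance every cell by one time step using a snapshot of the previous step."""
    old = list(concentrations)  # snapshot: all cells read the same old state
    new = []
    for i, c in enumerate(old):
        # Neighbors in this toy model are the adjacent cells in the row.
        neighbors = [old[j] for j in (i - 1, i + 1) if 0 <= j < len(old)]
        # Each cell exchanges a fraction `rate` of the difference with each neighbor.
        new.append(c + rate * sum(n - c for n in neighbors))
    return new  # synchronization point: all updates are published at once


state = [0.0, 4.0, 0.0]
state = step(state)  # the chemical spreads from the middle cell to its neighbors
```

Because updates are computed from the snapshot, the result does not depend on the order in which the cells are visited, which is what the assumption of parallel execution with per-step synchronization requires.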

The cells are two-dimensional. Each cell has a physical location comprising a collection of discrete connected points. This physical presence imparts to the cells the attributes of area, perimeter, and neighbors (other cells). The neighbor attribute forms the basis for all intercellular communication.

The language contains features for specifying:
- the location, area, and shape of the cells;
- the concentrations of various chemicals in each cell, the equations of their catalysis, and diffusion;
- the direction and speed of cell motion;
- the rates of cell growth and division;
- cell differentiation: the evolution of cell behavior during its lifetime.
We have employed CPL to model the following: aggregation in cellular slime mold in response to a chemotactic agent; the formation of skeletal elements in the vertebrate limb; and cellular segregation due to differential adhesion.
Ph.D. Thesis 1993 Applications of Convexity in Computational Geometry Capoyleas, Vasilis Abstract | PDF

Title: Applications of Convexity in Computational Geometry

Candidate: Capoyleas, Vasilis

Advisor(s): Pach, Janos; Pollack, Richard

Abstract:

We present seven results in computational geometry. The concept of convexity plays a vital role in all seven of the results; either as a tool in the proof method or as a means of giving a formal definition.

The topics considered are:

Weak $\epsilon$-nets: We provide strong upper bounds for the size of the smallest weak $\epsilon$-net of a set of points, in two basic cases.

Geometric Clusterings: We provide the first polynomial algorithm to find an optimal clustering of a set of points in the plane. The optimality criteria are based on the diameter and radius of the clusters.

The Hadwiger-Kneser-Thue Poulsen conjecture: This famous 40-year-old conjecture states that the area of the union of a set of disks is diminished if the disks are pushed together. We provide two partial results toward this conjecture.

Grasping: We consider grasping of polygonal objects by a pair of parallel jaws. We define a model and prove that a fairly large class of polygons can be grasped under this model.

Graph drawing and crossing numbers: We consider the problem of estimating the maximum number of edges for graphs that satisfy some sort of a relaxed planarity condition. We provide exact bounds for an important special case.
Ph.D. Thesis 1993 New Techniques for the Analysis and Implementation of Functional Programs Chuang, Tyng-Ruey Abstract | PDF

Title: New Techniques for the Analysis and Implementation of Functional Programs

Candidate: Chuang, Tyng-Ruey

Advisor(s): Goldberg, Benjamin

Abstract:

Functional programming languages provide programmers with clean semantics and side-effect free computation, which make easier the tasks of designing programs and reasoning about them. Efficient implementations of purely functional programs, however, can pose certain challenges. Our purpose in this dissertation is to develop new techniques for the efficient analysis and implementation of functional programs.

Our first goal is to investigate a syntactic approach, in contrast to the usual semantic approaches, to finding the least fixed points of higher-order functions over finite domains. The second objective is to develop implementation techniques for aggregate data structures for functional programs such that accesses to aggregates are both efficient and side-effect free.

Finding the least fixed point of a monotonic function over a finite domain is an essential task when analyzing a functional program in the framework of abstract interpretation. Previous methods for least fixed point finding have primarily used semantic approaches, which often traverse large portions of the semantic domain and may be very inefficient even for simple programs. We propose a syntactic method based on an augmented simply typed lambda calculus. It is shown that, for finite domains, the syntactic method is both sound and complete with respect to the semantics. Moreover, we demonstrate that the proposed syntactic method can be quite effective in cases where the usual semantic method is very inefficient.
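
For context, the usual semantic approach can be sketched as plain fixed-point iteration: start from the bottom element and apply the function until nothing changes. The sketch below illustrates that baseline only, not the syntactic method proposed in the thesis. The graph-reachability example and all names are assumptions chosen for illustration.

```python
def least_fixed_point(f, bottom):
    """Iterate f from bottom until it stabilizes.

    Terminates when f is monotonic and the domain is finite.
    """
    x = bottom
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt


# Hypothetical example: nodes reachable from node 0 in a small directed graph,
# computed as the least fixed point of a monotonic function on sets of nodes.
EDGES = {0: {1}, 1: {2}, 2: {0}, 3: {0}}


def reach_step(nodes):
    return frozenset({0}) | frozenset(v for u in nodes for v in EDGES[u])


reachable = least_fixed_point(reach_step, frozenset())
```

Each iteration may revisit a large part of the domain, which is the inefficiency the abstract attributes to semantic methods.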

Efficient implementation of aggregate data structures for functional programs has been an active research topic. The problem arises because once an aggregate is updated, both the old version and the newly updated copy must be preserved to maintain the side-effect free semantics of functional languages. We modify the shallow binding scheme of Baker to implement functional arrays for efficient incremental updates and voluminous reads. The scheme, however, uses side-effects and cannot be implemented in purely functional languages themselves. We then investigate the possibility of implementing efficient aggregates without using side-effects, and show that real-time deques can be implemented in a purely functional way. We describe several interesting applications of this technique.
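
To make the idea of a side-effect free aggregate concrete, here is a minimal sketch of a persistent queue built from two immutable lists. It is the simple two-list queue with amortized costs, not the real-time deque developed in the thesis, and the representation and function names are assumptions for illustration. Every operation returns a new queue and leaves the old version intact.

```python
# A queue is a pair (front, back) of immutable cons lists.
# A cons list is either () or a pair (head, tail).
EMPTY = ((), ())


def push(queue, item):
    """Return a new queue with item added at the back; the old queue is unchanged."""
    front, back = queue
    return (front, (item, back))


def pop(queue):
    """Return (item, new_queue) for the item at the front; the old queue is unchanged."""
    front, back = queue
    if not front:
        # Front is exhausted: reverse the back list into a fresh front list.
        while back:
            head, back = back
            front = (head, front)
    if not front:
        raise IndexError("pop from empty queue")
    head, rest = front
    return head, (rest, back)


q1 = push(push(EMPTY, 1), 2)
first, q2 = pop(q1)        # q1 still holds both items after this call
again, _ = pop(q1)         # popping the old version gives the same item
```

The occasional reversal makes a single `pop` take time proportional to the queue length, which is exactly the kind of worst case a real-time structure avoids.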
Ph.D. Thesis 1993 Nonholonomic Motion Planning : Algorithms and Software Fernandes, Christopher Abstract | PDF

Title: Nonholonomic Motion Planning : Algorithms and Software

Candidate: Fernandes, Christopher

Advisor(s): Mishra, Bud

Abstract:

Robot motion planning with nonholonomic constraints has recently engaged the attention of roboticists, as its application in dexterous manipulation, mobile robots and space robotics has begun to be understood. Such constraints arise from two different sources - Rolling Constraints and Non-Integrable Conservation Laws. For instance, the kinematics of dexterous manipulation using hard fingers making contact on a hard object requires nonholonomic motion planning (NMP) in order to satisfy rolling constraint. On the other hand, the control of attitude of space platform-based manipulators using only the internal motion of their manipulator joints requires NMP, as a result of the law of conservation of angular momentum.

Recently some algorithms and their implementation in software have been created in order to understand, simulate and control nonholonomic systems. Currently most of the algorithms have been demonstrated in somewhat specialized applications. There is a great need for software that enables the researcher to quickly test algorithms on these simple systems and then experiment with potential generalizations.

In this thesis, we describe a software system that we have developed at NYU and the underlying principles and algorithms (the ``Basis algorithm''). The system runs on an SGI Iris and is written in C, with auxiliary tools from Unix, Mathematica, DASSL, etc. We shall also describe how we have designed controllers for such example nonholonomic systems as a unicycle, a space station and a space platform-mounted robot manipulator.

It is hoped that this thesis will be useful for the control engineers engaged in designing non-linear control systems, for roboticists studying dexterous manipulations, motion planning and space robotics and finally, for software engineers interested in building tools and applications for robotics.
Ph.D. Thesis 1993 Dynamic Impact Analysis: Analyzing Error Propagation in Program Executions Goradia, Tarak Abstract | PDF

Title: Dynamic Impact Analysis: Analyzing Error Propagation in Program Executions

Candidate: Goradia, Tarak

Advisor(s): Weyuker, Elaine

Abstract:

Test adequacy criteria serve as rules to determine whether a test set adequately tests a program. The effectiveness of a test adequacy criterion is determined by its ability to detect faults. For a test case to detect a specific fault, it should execute the fault, cause the fault to generate an erroneous state and propagate the error to the output. Analysis of previously proposed code-based testing strategies suggests that satisfying the error propagation condition is both important and expensive. The technique of dynamic impact analysis is proposed for analyzing a program execution and estimating the error propagation behavior of various potential sources of errors in the execution. Impact graphs are introduced to provide an infrastructure supporting the analysis. A program impact graph modifies the notion of a program dependence graph proposed in the literature in order to capture some of the subtle impact relationships that exist in a program. An execution impact graph represents the dynamic impact relationships that are demonstrated during a program execution. The notion of impact strength is defined as a quantitative measure of the error sensitivity of an impact. A cost-effective algorithm for analyzing impact relationships in an execution and computing the impact strengths is presented. A research prototype implemented to demonstrate the feasibility of dynamic impact analysis is briefly described. The time complexity of dynamic impact analysis is shown to be linear with respect to the original execution time, and experimental measurements indicate that the constant of proportionality is a small number. The experiments undertaken to validate the computation of impact strengths are presented. An experience study relating impact strengths to error propagation in faulty programs is also presented. The empirical results provide evidence indicating a strong positive correlation between impact strength and error propagation. The results also emphasize the need for better heuristics to improve the accuracy of the error propagation estimates. Potential applications of dynamic impact analysis to mutation testing, syntactic coverage-based testing and dynamic program slicing are discussed.
Ph.D. Thesis 1993 Singularity Detection, Noise Reduction and Multifractal Fractal Characterization Hwang, Wen-Liang Abstract | PDF

Title: Singularity Detection, Noise Reduction and Multifractal Fractal Characterization

Candidate: Hwang, Wen-Liang

Advisor(s): Mallat, Stephane

Abstract:

Most of a signal's information is often carried by singularities. We study the characterization of the singularities with the wavelet transform and its modulus maxima. We introduce a numerical algorithm to detect and characterize pointwise singularities from the behavior of the wavelet transform maxima across scales. As an application, we develop a denoising algorithm which discriminates the signal information from noise through an analysis of local singularities. In one dimension, we recover a piecewise smooth signal, where the sharp transitions are preserved. In two dimensions, the wavelet maxima algorithm detects and characterizes the edges. The geometrical properties of edges are used to discriminate the noise from the image information and the denoising algorithm restores sharp images even at low SNR.

Multifractals are singular signals having some self-similarity properties. We develop a robust algorithm to extract the fractal parameter of fractional Brownian motion embedded in white noise. Fractal parameters are estimated from the evolution of the variance of the wavelet coefficients across scales with a modified penalty method. Self-similar multifractals have a wavelet transform whose maxima define self-similar curves in the scale-space plane. We introduce an algorithm to recover the affine self-similar parameters with a voting procedure. This voting strategy is robust with respect to renormalization noise. We describe the numerical applications to Cantor measures, dyadic multifractals and to the study of diffusion limited aggregates.
Ph.D. Thesis 1993 Competitive On-line Scheduling for Overloaded Real-Time Systems Koren, Gilad Abstract | PDF

Title: Competitive On-line Scheduling for Overloaded Real-Time Systems

Candidate: Koren, Gilad

Advisor(s): Shasha, Dennis; Mishra, Bud

Abstract:

We study competitive on-line scheduling in uniprocessor and multiprocessor real-time environments. In our model, tasks are sporadic and preemptable. Every task has a deadline and a value that the system obtains only if the task completes its execution by its deadline. The aim of a scheduler is to maximize the total value obtained from all the tasks that complete before their deadline.

An on-line scheduler has no knowledge of a task until it is released. The problem is to design an on-line scheduler with worst case guarantees even in the presence of overloaded periods. The guarantee is given in terms of a positive competitive factor. We say that an on-line algorithm has a competitive factor of $r$, $0 < r \le 1$, when under all possible circumstances (i.e., task sets) the scheduler will get at least $r$ times the best possible value. The best value is the value obtained by a clairvoyant algorithm. In contrast to an on-line scheduler, the clairvoyant algorithm knows the entire task set a priori at time zero.

When a uniprocessor system is underloaded there exist several optimal on-line algorithms that will schedule all tasks to completion (e.g., the Earliest Deadline First algorithm). However, under overload, these algorithms perform poorly. Heuristics have been proposed to deal with overloaded situations but these give no worst case guarantees.
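
A small simulation illustrates how poorly Earliest Deadline First can do under overload. The sketch assumes integer release times, execution times and deadlines, simulates one time unit per step, and finds the clairvoyant value by brute force over subsets of tasks. The task set and all names are hypothetical, and the code does not implement D-over.

```python
from itertools import combinations


def edf_completed(tasks):
    """Simulate preemptive EDF; return the indices of tasks that finish by their deadline.

    Each task is a tuple (release, work, deadline, value) with integer times.
    """
    remaining = [work for _, work, _, _ in tasks]
    horizon = max(deadline for _, _, deadline, _ in tasks)
    done = set()
    for t in range(horizon):  # the slot [t, t + 1)
        ready = [i for i, (release, _, deadline, _) in enumerate(tasks)
                 if release <= t < deadline and remaining[i] > 0]
        if not ready:
            continue
        i = min(ready, key=lambda k: tasks[k][2])  # earliest deadline first
        remaining[i] -= 1
        if remaining[i] == 0:
            done.add(i)
    return done


def edf_value(tasks):
    """Total value EDF obtains on-line from the full task set."""
    return sum(tasks[i][3] for i in edf_completed(tasks))


def clairvoyant_value(tasks):
    """Best value over all subsets of tasks that can be scheduled to completion."""
    best = 0
    for k in range(1, len(tasks) + 1):
        for subset in combinations(tasks, k):
            # EDF completes every task of a subset whenever some schedule can.
            if len(edf_completed(subset)) == k:
                best = max(best, sum(task[3] for task in subset))
    return best


# Hypothetical overloaded task set: a valuable long task and a cheap urgent one.
TASKS = [(0, 4, 5, 10), (1, 3, 4, 1)]
ratio = edf_value(TASKS) / clairvoyant_value(TASKS)
```

On this task set EDF preempts the valuable task in favor of the urgent one and then cannot finish it, so it obtains a value of 1 while the clairvoyant schedule obtains 10.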

We present an optimal on-line scheduling algorithm for uniprocessor overloaded systems called D-over. D-over is optimal in the sense that it has the best competitive factor possible. Moreover, while the system is underloaded, D-over will obtain 100% of the possible value.

In the multiprocessor case, we study systems with two or more processors. We present an inherent limit (lower bound) on the best competitive guarantee that any on-line parallel real-time scheduler can give. Then we present a competitive algorithm that achieves a worst case guarantee which is within a small factor from the best possible guarantee in many cases.

These are the most general results yet known for competitive scheduling of multiprocessor real-time systems.
Ph.D. Thesis 1993 Probabilistic Methods in Computer Science and Combinatorics Narayanan, Babu Abstract | PDF

Title: Probabilistic Methods in Computer Science and Combinatorics

Candidate: Narayanan, Babu

Advisor(s): Boppana, Ravi

Abstract:

Over the last few years, the Probabilistic method has become an important tool in Computer Science and Combinatorics. This thesis deals with three applications of the Probabilistic method.

The first problem concerns a model of imperfect randomness: the slightly-random source of Santha and Vazirani. In a slightly-random source with bias $\epsilon$ , the conditional probability that the next bit output is 1, given complete knowledge of the previous bits output, is between $1/2 - \epsilon$ and $1/2 +\epsilon$ . We show that, for every fixed $\epsilon < 1/2 $ , and for most sets, the probability of hitting that set using a slightly-random source is bounded away from 0.

The second problem arises in parallel and distributed computing. A set of n processors is trying to collectively flip a coin, using a protocol that tolerates a large number of faulty processors. We demonstrate the existence of perfect-information protocols that are immune to sets of $\epsilon n$ faulty processors, for every fixed $\epsilon< 1/2$ .

Finally, we consider a problem in Ramsey theory. Let an adversary color the edges of the Binomial random graph with r colors, the edge probability being $ c / (\sqrt n)$ , where c is a large enough constant. We show that, almost surely, a constant fraction of the triangles in the graph will be monochromatic.
Ph.D. Thesis 1993 Singularity Detection, Dataflow Analysis of Logic Programs Using Typed Domains Papadopoulos, Georgios Abstract | PDF

Title: Singularity Detection, Dataflow Analysis of Logic Programs Using Typed Domains

Candidate: Papadopoulos, Georgios

Advisor(s): Harrison, Malcolm C.

Abstract:

Dataflow analysis for logic programming languages has been collecting information about properties of whole terms. As a result, pessimistic assumptions had to be made about the subterms of a term, missing information that could be used for better compiler optimization and partial evaluation.

We use type information to divide each term into sets of subterms (t-terms) and then collect dataflow information for these sets. We identify and solve several problems as we develop a new sharing analysis for logic programs using our t-terms in a denotational abstract interpretation framework.
Ph.D. Thesis 1993 Statistical Recognition of Textured Patterns From Local Spectral Decomposition Perry, Adi Abstract | PDF

Title: Statistical Recognition of Textured Patterns From Local Spectral Decomposition

Candidate: Perry, Adi

Advisor(s): Lowe, David

Abstract:

Unsupervised segmentation of an image into homogeneously textured regions and the recognition of known texture patterns have been important tasks in computer vision. This thesis presents a new set of algorithms and describes an implemented system which performs these tasks. Initial features are computed from a local multi-channel spectral decomposition of the image that is implemented with Gabor filters. Textures are not assumed to have a band-limited frequency spectrum and there is no supposition regarding the image contents: it may contain some unknown texture patterns or regions with no textures at all. Stability of features is enhanced by employing a method for smoothing reliable measurements. Both recognition and segmentation procedures use robust statistical algorithms and are performed locally for small image patches. In particular, statistical classification with principal components is used for recognition. Further accuracy is achieved by employing spatial consistency constraints. When a slanted texture is projected on the image plane, the patterns undergo systematic changes in the density, area, and directionality of the texture elements. Recognition is made invariant to such transformations by representing texture classes with multiple descriptors. These descriptors are computed from carefully selected 3-D views of the patterns. Simulated projections of textures from arbitrary viewpoints are obtained by using a new texture mapping algorithm. The segmentation algorithm overcomes the non-stationarity of the features by employing a new, robust similarity measure. The performance of these methods is demonstrated by applying them to real images.
Ph.D. Thesis 1993 Automating Physical Database Design: An Extensible Approach Rozen, Steven Abstract | PDF

Title: Automating Physical Database Design: An Extensible Approach

Candidate: Rozen, Steven

Advisor(s): Shasha, Dennis

Abstract:

In a high-level query language such as SQL, queries yield the same result no matter how the logical schema is physically implemented. Nevertheless, a query's cost can vary by orders of magnitude among different physical implementations of the same logical schema, even with the most modern query optimizers. Therefore, designing a low-cost physical implementation is an important pragmatic problem, one that requires a sophisticated understanding of physical design options and query strategies, and that involves estimating query costs, a tedious and error-prone process when done manually.

We have devised a simple framework for automating physical design in relational or post-relational DBMSs and in database programming languages. Within this framework, design options are uniformly represented as ``features'', and designs are represented by ``conflict''-free sets of features. (Mutually exclusive features conflict. An example would be two primary indexes on the same table.) The uniform representation of design options as features accommodates a greater variety of design options than previous approaches; adding a new design option (e.g. a new index type) merely entails characterizing it as a feature with appropriate parameters. We propose an approximation algorithm, based on this framework, that finds low-cost physical designs. In an initial phase, the algorithm examines the logical schema, data statistics, and queries, and generates ``useful features'': features that might reduce query costs. In a subsequent phase, the algorithm uses the DBMS's cost estimates to find ``best features'': features that belong to the lowest-cost designs for each individual query. Finally, the algorithm searches among conflict-free subsets of the best features of all the queries to find organizations with low global cost estimates. We have implemented a prototype physical design assistant for the INGRES relational DBMS, and we evaluate its designs for several benchmarks, including AS3AP. Our experiments with the prototype show that it can produce good designs, and that the critical factor limiting their quality is the accuracy of query cost estimates. The prototype implementation isolates dependencies on INGRES, permitting our framework to produce design assistants for a wide range of relational, nested-relational, and object-oriented DBMSs.
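
The notion of a design as a conflict-free set of features can be sketched in a few lines. The example below searches exhaustively over a tiny feature set, whereas the thesis proposes an approximation algorithm. The feature names, the cost figures, and the simplified cost model (a query pays the cheapest estimate among the features present) are all hypothetical.

```python
from itertools import combinations


def conflict_free(design, conflicts):
    """True if no pair of mutually exclusive features appears in the design."""
    return not any(a in design and b in design for a, b in conflicts)


def design_cost(design, base_cost, feature_cost):
    """Sum over queries of the cheapest cost estimate available under this design."""
    total = 0
    for query, base in base_cost.items():
        options = [base] + [cost for feature, cost in feature_cost[query].items()
                            if feature in design]
        total += min(options)
    return total


def best_design(features, conflicts, base_cost, feature_cost):
    """Exhaustively search conflict-free feature subsets for the lowest total cost."""
    best = frozenset()
    best_cost = design_cost(best, base_cost, feature_cost)
    for k in range(1, len(features) + 1):
        for combo in combinations(features, k):
            design = frozenset(combo)
            if conflict_free(design, conflicts):
                cost = design_cost(design, base_cost, feature_cost)
                if cost < best_cost:
                    best, best_cost = design, cost
    return best, best_cost


# Hypothetical features: two primary indexes on the same table conflict.
FEATURES = ["primary(orders.date)", "primary(orders.customer)", "secondary(orders.customer)"]
CONFLICTS = [("primary(orders.date)", "primary(orders.customer)")]
BASE_COST = {"by_date": 100, "by_customer": 80}
FEATURE_COST = {
    "by_date": {"primary(orders.date)": 10},
    "by_customer": {"primary(orders.customer)": 5, "secondary(orders.customer)": 20},
}

design, cost = best_design(FEATURES, CONFLICTS, BASE_COST, FEATURE_COST)
```

Here the two primary indexes cannot coexist, so the cheapest design pairs the primary index on the date column with the secondary index on the customer column. Exhaustive search grows exponentially with the number of features, which is why restricting the search to each query's ``best features'' matters.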
Ph.D. Thesis 1993 A Probabilistic Approach to Geometric Hashing using Line Features Tsai, Frank Abstract | PDF

Title: A Probabilistic Approach to Geometric Hashing using Line Features

Candidate: Tsai, Frank

Advisor(s): Schwartz, Jacob T.

Abstract:

One of the most important goals of computer vision research is object recognition. Most current object recognition algorithms assume reliable image segmentation, which in practice is often not available. This research exploits the combination of the Hough method with the geometric hashing technique for model-based object recognition in seriously degraded intensity images.

We describe the analysis, design and implementation of a recognition system that can recognize, in a seriously degraded intensity image, multiple objects modeled by a collection of lines.

We first examine the factors affecting line extraction by the Hough transform and propose various techniques to cope with them. Line features are then used as primitive features from which we compute the geometric invariants used by the geometric hashing technique. Various geometric transformations, including rigid, similarity, affine and projective transformations, are examined. We then derive the ``spread'' of the computed invariant over the hash space caused by ``perturbation'' of the lines giving rise to this invariant. This is the first noise analysis of its kind on line features for geometric hashing. The result of the noise analysis is then used in a weighted voting scheme for the geometric hashing technique. We have implemented the system described and carried out a series of experiments on polygonal objects modeled by lines, assuming affine approximations to perspective viewing transformations. Our experimental results show that the technique described is noise resistant and suitable in an environment containing many occlusions.
Ph.D. Thesis 1992 A Miniature Space-Variant Active Vision System: Cortex-I Bederson, Benjamin Abstract | PDF

Title: A Miniature Space-Variant Active Vision System: Cortex-I

Candidate: Bederson, Benjamin

Advisor(s): Schwartz, Eric

Abstract:

We have developed a prototype miniaturized active vision system whose sensor architecture is based on a logarithmically structured space-variant pixel geometry. A space-variant image's resolution changes across the image. Typically, the central part of the image has a very high resolution, and the resolution falls off gradually in the periphery. Our system integrates a miniature CCD-based camera, pan-tilt actuator, controller, general purpose processors and display. Due to the ability of space-variant sensors to cover large work-spaces, yet provide high acuity with an extremely small number of pixels, space-variant active vision system architectures provide the potential for radical reductions in system size and cost. We have realized this by creating an entire system that takes up less than a third of a cubic foot. In this thesis, we describe a prototype space-variant active vision system (Cortex-I) which performs such tasks as tracking moving objects and license plate reading, and functions as a video telephone.

We report on the design and construction of the camera (which is 8x8x8mm), its readout, and a fast mapping algorithm to convert the uniform image to a space-variant image. We introduce a new miniature pan-tilt actuator, the Spherical Pointing Motor (SPM), which is 4x5x6cm. The basic idea behind the SPM is to orient a permanent magnet to the magnetic field induced by three orthogonal coils by applying the appropriate ratio of currents to the coils. Finally, we present results of integrating the system with several applications. Potential application domains for systems of this type include vision systems for mobile robots and robot manipulators, traffic monitoring systems, security and surveillance, telerobotics, and consumer video communications.

The long-range goal of this project is to demonstrate that major new applications of robotics will become feasible when small low-cost machine vision systems can be mass-produced. This notion of `commodity robotics' is expected to parallel the impact of the personal computer, in the sense of opening up new application niches for what has until now been expensive and therefore limited technology.
Ph.D. Thesis 1992 Regular Expressions to DFA's using Compressed NFA's Chang, Chia-Hsiang Abstract | PDF

Title: Regular Expressions to DFA's using Compressed NFA's

Candidate: Chang, Chia-Hsiang

Advisor(s): Paige, Robert

Abstract:

We show how to turn a regular expression $R$ of length $r$ into an $O(s)$ space representation of McNaughton and Yamada's NFA, where $s$ is the number of occurrences of alphabet symbols in $R$, and $s + 1$ is the number of NFA states. The standard adjacency list representation of McNaughton and Yamada's NFA takes up $1 + 2s + s^2$ space in the worst case. The adjacency list representation of the NFA produced by Thompson takes up between $2r$ and $6r$ space, where $r$ can be arbitrarily larger than $s$. Given any subset $V$ of states in McNaughton and Yamada's NFA, our representation can be used to compute the set $U$ of states one transition away from the states in $V$ in optimal time $O(|V| + |U|)$. McNaughton and Yamada's NFA requires $\Theta(|V| \times |U|)$ time in the worst case. Using Thompson's NFA, the equivalent calculation requires $\Theta(r)$ time in the worst case.
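
The step from $V$ to $U$ can be made concrete on a small McNaughton and Yamada style NFA stored as plain adjacency lists. The sketch below uses the regular expression (a|b)*abb, for which $s = 5$, with hand-derived transition sets. It shows the standard representation that the thesis improves upon, not the compressed representation itself, and all names are illustrative.

```python
# Position NFA for (a|b)*abb: state 0 is the start state and states 1..5
# correspond to the five symbol occurrences a, b, a, b, b (so s = 5).
SYMBOL = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b"}
# FOLLOW[p] lists the states reachable from p in one transition; the
# transition into state q is labelled SYMBOL[q]. Derived by hand for this example.
FOLLOW = {0: {1, 2, 3}, 1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: set()}
FINAL = {5}


def step(V, symbol):
    """Return the set U of states one `symbol` transition away from the states in V."""
    U = set()
    for p in V:
        for q in FOLLOW[p]:  # scans every successor of every state in V
            if SYMBOL[q] == symbol:
                U.add(q)
    return U


def accepts(word):
    """Simulate the NFA on word by repeated application of step."""
    V = {0}
    for ch in word:
        V = step(V, ch)
    return bool(V & FINAL)
```

The nested loop in `step` scans the adjacency list of every state in $V$, so successors shared by many states are visited repeatedly, unlike a step whose cost is bounded by $O(|V| + |U|)$.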

An implementation of our NFA representation confirms that it takes up an order of magnitude less space than McNaughton and Yamada's machine. An implementation to produce a DFA from our NFA representation by subset construction shows linear and quadratic speedups over subset construction starting from both Thompson's and McNaughton and Yamada's NFA's.

It also shows that the DFA produced from our NFA is as much as one order of magnitude smaller than DFA's constructed from the two other NFA's.

A UNIX egrep-compatible program called cgrep, based on our NFA representation, is implemented. A benchmark shows that cgrep is dramatically faster than both UNIX egrep and GNU e?grep.

Throughout this thesis the importance of syntax is stressed in the design of our algorithms. In particular, we exploit a method of program improvement in which costly repeated calculations can be avoided by establishing and maintaining program invariants. This method of symbolic finite differencing has been used previously by Douglas Smith to derive efficient functional programs.
Ph.D. Thesis 1992 Complexity Issues in Computational Algebra Gallo, Giovanni Abstract | PDF

Title: Complexity Issues in Computational Algebra

Candidate: Gallo, Giovanni

Advisor(s): Mishra, Bud

Abstract:

The ideal membership problem for the ring of multivariate polynomials is a central problem in Computational Algebra. Relatively tight computational complexity bounds for this problem are known in the case of polynomials with coefficients in a field. After reviewing these results we give an algorithm, together with an upper bound on its complexity, for the solution of the membership problem in the case of polynomials with integer coefficients. This result is obtained by adapting Buchberger's algorithm to the integer case. As an application, we also provide a more general upper bound on the length of strictly ascending chains of ideals in the ring $Z[x_1,\ldots,x_n]$.

Many applications of Computational Algebra, however, do not require the complete solution of the membership problems and alternative approaches are available. In this thesis we survey the method of the characteristic sets, originally introduced by Ritt in the forties and now widely applied, particularly in Elementary Geometry Theorem Proving. We present optimal algorithms for computing a characteristic set with simple-exponential sequential and polynomial parallel time complexities.

We finally present an attempt to generalize some of the constructive methods of Commutative Algebra to the Algebra of differential polynomials. Rings of differential polynomials do not resemble their purely algebraic counterparts: we prove that there exist non-recursive differential ideals in $Z\{x\}$ and hence that, in general, no effective method can be devised to solve the membership problem in this case. However special classes of ideals can be algorithmically approached and toward this goal, we generalize the concept of H-basis, first introduced by Macaulay for algebraic ideals, to differential rings.
Ph.D. Thesis 1992 Typing Higher-Order Functions with Dynamic Dispatching Hsieh, Chih-Hung Abstract | PDF

Title: Typing Higher-Order Functions with Dynamic Dispatching

Candidate: Hsieh, Chih-Hung

Advisor(s): Harrison, Malcolm C.

Abstract:

We design new type expressions and algorithms to classify and check object types in higher-order programming. Our computation model is imperative and strongly typed. It has dynamic-dispatched functions, higher-order bounded polymorphic functions, record and function subtyping, parameterized types, both named and structural types, free-union types, existential union types, poly-typed variables, poly-typed expressions, and heterogeneous collections.

A prototype of a mini-language with the above features is implemented in Prolog with a type checking system. A small but powerful set of typing structures and operations is identified. The type checking rules are formally defined. A new technique is developed for translating recursive type relations into cyclic AND/OR graphs. Efficient algorithms are designed for resolving generalized AND/OR graphs with constraints on valid cycles.

Using elegant syntax, the new type system describes more general and precise type relations than any other type system known to us. The new technique for translating type relations into AND/OR graphs provides a new direction for implementing a higher-order polymorphic type system, which is not available in unification-based type systems. The AND/OR graph models are general enough to represent recursive relations, and their applications are not solely limited to type-checking. Our AND/OR graph resolution algorithms find the optimal solutions. They are theoretically proved efficient and are shown practical in our implementation.
Ph.D. Thesis 1992 Computer Simulation of Cortical Polymaps Landau, Pierre Abstract | PDF

Title: Computer Simulation of Cortical Polymaps

Candidate: Landau, Pierre

Advisor(s): Schwartz, Eric

Abstract:

Neo-cortical sensory areas of the vertebrate brain are organized in terms of topographic maps of peripheral sense-organs. Cortical topography has been generally modeled in terms of a continuous map of a peripheral sensory surface onto a cortical surface. However, the details of cortical architecture do not conform to this concept. Most, if not all, cortical areas consist of an interlaced structure containing multiple topographic maps of distinct classes of neural input. The term "polymap" is used to refer to a cortical area which consists of more than one system, interlaced in a globally topographic, but locally columnar fashion. The best known example of a cortical polymap is provided by the ocular dominance column system in layer IV of primate striate cortex, but the puff/extra-puff and orientation systems of surrounding layers also illustrate this concept, as do the thick-thin-interstripe columns of V-2, and the direction columns of MT. Since polymap architecture seems to be a common architectural pattern in the neo-cortex, this work addresses the computational modeling of polymap systems, with the expectation that such modeling will lead to a better understanding of the underlying biology. An algorithm is presented, based on the computational geometry constructs of Generalized Voronoi Polygon and Medial Axis, which provides a general method for simulating polymap systems. It also adds a powerful technique to the repertoire of Digital Image Warping. The algorithm is illustrated using the ocular dominance column and orientation column systems of V-1. In addition, a mechanism is proposed and demonstrated to account for the spatial registration of the ocular dominance and orientation column systems. Computer simulations of the activity evoked by binocular stimuli, as they would appear at the level of layers III and IV in V-1, are shown, and compared to results from recent experiments. Methods of generalizing these techniques to other common polymap cortical areas are outlined.
Ph.D. Thesis 1992 Polymorphic Type Inference and Abstract Data Types Laufer, Konstantin Abstract | PDF

Title: Polymorphic Type Inference and Abstract Data Types

Candidate: Laufer, Konstantin

Advisor(s): Goldberg, Benjamin; Odersky, Martin (Yale)

Abstract:

Many statically-typed programming languages provide an abstract data type construct, such as the package in Ada, the cluster in CLU, and the module in Modula-2. However, in most of these languages, instances of abstract data types are not first-class values. Thus they cannot be assigned to a variable, passed as a function parameter, or returned as a function result.

The higher-order functional language ML has a strong and static type system with parametric polymorphism. In addition, ML provides type reconstruction and consequently does not require type declarations for identifiers. Although the ML module system supports abstract data types, their instances cannot be used as first-class values for type-theoretic reasons.

In this dissertation, we describe a family of extensions of ML. While retaining ML's static type discipline, type reconstruction, and most of its syntax, we add significant expressive power to the language by incorporating first-class abstract types as an extension of ML's free algebraic datatypes. In particular, we are now able to express
- multiple implementations of a given abstract type,
- heterogeneous aggregates of different implementations of the same abstract type, and
- dynamic dispatching of operations with respect to the implementation type.
Following Mitchell and Plotkin, we formalize abstract types in terms of existentially quantified types. We prove that our type system is semantically sound with respect to a standard denotational semantics.

We then present an extension of Haskell, a non-strict functional language that uses type classes to capture systematic overloading. This language results from incorporating existentially quantified types into Haskell and gives us first-class abstract types with type classes as their interfaces. We can now express heterogeneous structures over type classes. The language is statically typed and offers comparable flexibility to object-oriented languages. Its semantics is defined through a type-preserving translation to a modified version of our ML extension.

We have implemented a prototype of an interpreter for our language, including the type reconstruction algorithm, in Standard ML.
Ph.D. Thesis 1992 A sublanguage based medical language processing system for German Oliver, Neil Abstract | PDF

Title: A sublanguage based medical language processing system for German

Candidate: Oliver, Neil

Advisor(s): Sager, Naomi

Abstract:

The major accomplishments reported in this thesis are:

- The development of a computer grammar for a nontrivial sublanguage of German. This grammar, using the LSP (Linguistic String Processor) grammar formalism, solves a number of parsing problems arising in free word order languages such as German.

- The development of an LSP-based information formatting system that obtains semantic representations of texts in a medical sublanguage of German.

- The confirmation of the sublanguage hypothesis (explained below).

In LSP grammar theory, sentences in a language are derived from a collection of basic sentence types. The basic sentence types are described in terms of the major syntactic classes (e.g., noun, verb, adjective) of the language. Sentences are derived from these basic sentences by the insertion of optional structures called adjuncts, by conjoining, and by substituting words in the major classes. Insertion, conjoining, and substitution are constrained by co-occurrence restrictions between elements in the derived syntactic structures. The restrictions subcategorize the major word classes into subclasses that may co-occur in sentences according to the co-occurrence restrictions.

The sublanguage hypothesis elaborates LSP grammar theory in the following way. In a particular domain of discourse, the subcategorization of the major word classes reflects the underlying semantics of the domain. The basic sentence types of the language, represented by sublanguage subclasses instead of major word classes, can function as data structures (called information formats ) representing the information of the domain.

The LSP Medical Language Processor (LSP/MLP) is an information retrieval/information extraction system based on sublanguage and information formatting. It processes sentences in the English sublanguage of clinical reporting into information formats, which in turn are converted into database update records for a relational database. The information formats are derived from sublanguage co-occurrence information obtained from a corpus of discharge summaries.

The German information formatting system implemented in this work processes German Arztbriefe (doctor letters) of cancer surgery patients into information formats. It confirms the sublanguage hypothesis because it re-uses the sublanguage information (co-occurrence information and formats) of the English LSP/MLP system in an equivalent sublanguage, showing that the sublanguage information reflects the semantics of the domain.
Ph.D. Thesis 1992 Image Processing, Pattern Recognition and Attentional Algorithms in a Space-Variant Active Vision System Ong, Ping-Wen Abstract | PDF

Title: Image Processing, Pattern Recognition and Attentional Algorithms in a Space-Variant Active Vision System

Candidate: Ong, Ping-Wen

Advisor(s): Schwartz, Eric

Abstract:

A space-variant sensor motivated by the human visual system has its highest resolution at the center, with rapidly decreasing resolution toward the peripheral area. It has the advantages of a wide visual field and, at the same time, high central resolution. The dramatic reduction of pixel number in this kind of sensor makes it possible to build a real-time vision system using only moderate computational resources. On the other hand, the space-variant image has a different layout compared to a raster image. The neighbor relationships change from pixel to pixel. We need to devise a special method to solve this neighborhood problem.

We use a connectivity graph to represent neighbor relations between pixels in a space-variant image. We can use it to define operators for edge detection, smoothing, etc. We use a two-level pyramid based on the connectivity graph to perform local thresholding for segmentation. The translation, rotation and scaling graph are three extensions of the connectivity graph which are used to translate, rotate and scale space-variant images. We can use these graphs to perform scale and rotation independent template matching.
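
As a rough illustration of how an operator can be defined over a connectivity graph rather than a raster grid, the sketch below smooths each pixel by averaging it with its graph neighbors. It is a minimal sketch under assumed data structures, not the operators defined in the thesis. The function name `smooth`, the dictionary layout, and the four-pixel example are all hypothetical.

```python
from typing import Dict, List


def smooth(values: Dict[int, float],
           neighbors: Dict[int, List[int]]) -> Dict[int, float]:
    """Replace each pixel value by the mean of itself and its graph neighbors."""
    result: Dict[int, float] = {}
    for pixel, value in values.items():
        # The neighbor list can have a different length for every pixel,
        # which is what distinguishes this from a fixed raster stencil.
        group = [value] + [values[q] for q in neighbors.get(pixel, [])]
        result[pixel] = sum(group) / len(group)
    return result


# Hypothetical 4-pixel image: pixels 0 and 2 have three neighbors each,
# pixels 1 and 3 only two, so neighborhood size varies across the image.
neighbors = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
values = {0: 8.0, 1: 0.0, 2: 4.0, 3: 0.0}

smoothed = smooth(values, neighbors)
```

Because the operator only consults the neighbor lists, the same function applies unchanged to any pixel layout for which such a graph can be built.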

We successfully apply several feature designs for OCR in the space-variant domain. They include: Characteristic-Loci, Partition, Heat-Signature, and Projection. All of them are translation and scale invariant. We also have two rotation invariant methods based on the Partition and Projection methods.

Since a space-variant sensor has higher resolution at the center, the recognition result is more reliable if we point the sensor close to the candidate object. Therefore, if we want to recognize any single character, the center of this character is the best place for pointing the sensor. But for recognizing adjacent characters, except for well separated ones, we need to point the sensor to the place where we can separate these characters.

Based on this reliability analysis, we devised four attentional rules and an algorithm for moving sensors to recognize character strings in static natural scenes.

Finally, we describe the algorithms for reading characters from the license plate on a moving vehicle. They include stages for traffic zone finding, moving car finding, license plate finding, license plate tracking, and character reading.
Ph.D. Thesis 1992 On Compiling Regular Loops for Efficient Parallel Execution Ouyang, Pei Abstract | PDF

Title: On Compiling Regular Loops for Efficient Parallel Execution

Candidate: Ouyang, Pei

Advisor(s): Kedem, Zvi; Palem, Krishna

Abstract:

In this thesis, we study the problem of mapping regular loops onto multiprocessors. We develop mapping schemes that yield very efficient executions of regular loops on shared and distributed memory architectures. We also develop novel analysis techniques, which we use to reason about the efficiency of the resulting executions. The quality of the execution of these regular loops in the distributed memory setting relies heavily on implementing cyclic shifts efficiently. Effectively, cyclic shifts are used to communicate results between individual processors, to which different interdependent iterations are assigned. Therefore, in order to achieve efficient executions of regular loops on distributed memory architectures, we also develop and analyze algorithms for solving the cyclic shift problem. In order to analyze the executions of regular loops that result from any specific mapping, we need to characterize the important parameters that determine its efficiency. We formally characterize a basic set of such parameters. These parameters facilitate the analysis of the memory and the processor requirements of a given execution, as well as its running time. Using these parameters, we analyze a greedy execution scheme in the shared memory model. For example, we can determine the limit on the number of processors beyond which no speedup can be attained by the greedy method for regular loops. The greedy scheme is of interest because it exploits the maximal possible parallelism in a natural way.

We then address the mapping scheme of regular loops onto distributed memory machines. Unfortunately, we show that the problem of finding an optimal mapping is computationally intractable in this case. In order to provide schemes that can be actually applied to regular loops at compile-time, we relax the requirement that the resulting executions be optimum. Instead, we design a heuristic mapping algorithm and validate it through experiments. This heuristic mapping scheme relies heavily on the use of efficient algorithms for realizing cyclic shifts. Therefore, we also study the problem of realizing cyclic shifts on hypercube architectures.
Ph.D. Thesis 1992 Japanese/English Machine Translation Using Sublanguage Patterns and Reversible Grammars Peng, Ping Abstract | PDF

Title: Japanese/English Machine Translation Using Sublanguage Patterns and Reversible Grammars

Candidate: Peng, Ping

Abstract:

For this thesis, a Japanese/English machine translation system with reversible components has been designed and implemented in PROLOG. Sublanguage co-occurrence patterns have been used to address the problems of lexical and structural selection in the transfer between the internal representations of a pair of natural languages. The system has been tested translating Japanese into English in the domain of programming language manuals. The evaluation of the test outputs provides some assessment of the utility of the sublanguage approach as a method for the development and refinement of a machine translation system. The thesis also explores the roles that a reversible grammar would play in sharing linguistic knowledge between parsing and generation.

The system has been developed with the goal of using sublanguage word co-occurrence patterns to simplify the description of syntactic/semantic knowledge needed in both the transfer rules and the analysis of the source language. In particular, sublanguage co-occurrence patterns are introduced to provide semantic constraints and ellipsis recovery in parsing Japanese.

This thesis introduces a right-to-left parsing scheme for Japanese. The idea for the right-to-left parsing algorithm evolved from the desire to produce partial syntactic analyses of Japanese in a more deterministic manner than was achieved by conventional left-to-right parsing schemes. The algorithm makes efficient use of sublanguage co-occurrence patterns as semantic knowledge to help disambiguate Japanese parses. The enforcement of syntactic and semantic constraints is tightly interwoven during the course of parsing. The performance in parsing Japanese has thereby been significantly enhanced.

A procedure has been implemented for translating a Definite Clause Grammar dually into a PROLOG parser and PROLOG generator, so that one grammar can be used for parsing and generation. In current natural language processing systems, separate grammars are used for parsing and generation. However, there has long been an interest in designing a single grammar for both parsing and synthesis for reasons of efficiency and integrity, as well as linguistic elegance and perspicuity. As part of the current implementation, a strategy has been developed for creating efficient grammars for both parsing and generation using a goal reordering technique within the logic programming framework.
Ph.D. Thesis 1992 The Analysis and Generation of Tests for Programming Language Translators Rennels, Deborah Abstract | PDF

Title: The Analysis and Generation of Tests for Programming Language Translators

Candidate: Rennels, Deborah

Advisor(s): Schonberg, Edmond

Abstract:

This thesis addresses the automation of two aspects of compiler validation testing: semantic analysis of existing test programs, and construction of new test programs. Semantic analysis is required during test modification and maintenance, and also when evaluating the language coverage attained by the test suite. In the current state of practice, both the semantic analysis and the construction of new test programs are extremely labor-intensive tasks; both, however, are amenable to automation. We describe two systems; one, which we have implemented, involves test case analysis and feature identification. The other is a proposed system for automatic generation of tests from test specifications. We tested our methods on the largest and most comprehensive compiler validation project to date: the Ada Compiler Validation Capability (ACVC), a large collection of Ada test programs used to verify that compilers conform to the Ada language standard.

We first describe the Ada Features Identification System (AFIS), a system which automates test program analysis. AFIS provides three different methods for identifying Ada language features in test programs, ranging from elementary syntactic items to complex context-sensitive combinations of semantic features. Semantic feature combinations are specified by writing program templates in a pattern language which is an extension of Ada, and pattern-matching these templates against test programs.

In the second part of this thesis we define a language to facilitate the specification of Ada compiler test objectives, and the design of a system that uses these specifications to automatically generate valid Ada test programs. The language allows a test developer to write a specification that embodies the testing goal of a given objective, without including all type and expression information required in a complete test program. These details are supplied automatically by the generator system. We show, by numerous examples taken from the Ada Implementors Guide (the design document for the Ada validation suite), how Ada test objectives can be specified in this language. The focus of our examples is constraint violation checking, which is an important component of Ada's strong typing system, and also a basic organizing principle of the ACVC tests.
Ph.D. Thesis 1992 Massively Parallel Bayesian Object Recognition Rigoutsos, Isidore Abstract | PDF

Title: Massively Parallel Bayesian Object Recognition

Candidate: Rigoutsos, Isidore

Advisor(s): Hummel, Robert

Abstract:

The problem of model-based object recognition is a fundamental one in the field of computer vision, and represents a promising direction for practical applications. In this talk we will describe the design, analysis, implementation and testing of a model-based object recognition system.

In the first part of the talk, we will discuss two parallel algorithms for performing geometric hashing. The first algorithm regards geometric hashing as a connectionist algorithm with information flowing via patterns of communication, and is designed for an SIMD hypercube-based machine. The second algorithm is more general, and treats the parallel architecture as a source of "intelligent memory"; the algorithm achieves parallelism through broadcast facilities from the parallel machine's host. A number of enhancements to the geometric hashing method, such as hash table equalization, the use of hash table symmetries, and hash table foldings will also be presented. These enhancements were developed specifically for the parallel algorithms, and lead to substantial performance improvements.

In the second part of the talk, we will examine the performance of geometric hashing methods in the presence of noise. The quantization of the invariants can result in a non-graceful degradation of the performance. We will present precise formulas as well as first-order approximations describing the dependency of the computed invariants on Gaussian positional error, for the similarity and affine transformation cases. Knowledge of this dependency allows the incorporation of an error model in the geometric hashing framework and subsequently leads to improved performance. A counter-intuitive result regarding the solutions of certain linear systems will also be derived as a corollary of this analysis.

In the final part of the talk, we will present an interpretation of geometric hashing that allows the algorithm to be viewed as a Bayesian approach to model-based object recognition. This interpretation, which is a new form of Bayesian-based model matching, leads to well-justified formulas, and gives a precise weighted-voting method for the evidence-gathering phase of geometric hashing. These formulas replace traditional heuristically-derived methods for performing weighted voting, and also provide a precise method for evaluating uncertainty.

A prototype object recognition system using these ideas has been implemented on a CM-2 Connection Machine. The system is scalable and can recognize aircraft and automobile models subjected to 2D rotation, translation, and scale changes in real-world digital imagery. This is the first system of its kind that is scalable, uses large databases, can handle noisy input data, works rapidly on an existing parallel architecture, and exhibits excellent performance with real-world, natural scenes.
Ph.D. Thesis 1992 Control of a Dexterous Robot Hand: Theory, Implementation, and Experiments Silver, Naomi Abstract | PDF

Title: Control of a Dexterous Robot Hand: Theory, Implementation, and Experiments

Candidate: Silver, Naomi

Advisor(s): Mishra, Bud

Abstract:

Advanced robotic systems, such as multi-fingered hands, are becoming more complex, and, as yet, many of the basic questions involved remain unanswered. What control law should we use? What constitutes a good control law? How should we describe motions? What constitutes a broad, yet efficient description of motions for a grasped object?

In addition to the complexity, robotic systems frequently undergo upgrades, and it is therefore necessary to design the system in an organized manner. This includes such things as hierarchical software, system-independent descriptions of motion, and system-independent control laws.

In this thesis, we address these issues for a less complex system. We have focused our attention on describing motions for objects being grasped by a multi-fingered hand. We present a formulation for motion primitives, which allow object manipulation and require limited parameter specification. We have attempted to find a control law which will perform well under adverse conditions. We have built the system and tested it on the NYU Four Finger Manipulator which is a two dimensional hand. Even for this simplified problem, there remains a large degree of complexity and there are as yet no definitive solutions to these problems.
Ph.D. Thesis 1992 Executable Operational Semantics of Programming Languages Siritzky, Brian Abstract | PDF

Title: Executable Operational Semantics of Programming Languages

Candidate: Siritzky, Brian

Advisor(s): Dewar, Robert

Abstract:

Since the inception of computer languages there have been attempts to define programming languages formally. Several markedly different methodologies have been proposed to solve this problem. This thesis argues for Executable Operational Semantics (EOS) as a methodology for formal definition which has many fundamental advantages. The EOS methodology has, however, been broadly criticized. We show that the major objections against EOS are unfounded, and that executability is suitable and useful for many applications of formal definitions. The primary criticisms of EOS definitions (that they are hardware and architecture specific, that they are unable to describe concurrency and non-determinism, and that they overspecify implementation details) are countered by the demonstration that Ada/Ed, an executable definition of Ada developed at New York University, can avoid or overcome each problem. A description of the implementation of Ada's real arithmetic and representation specifications reveals hardware and architectural independence in an executable definition. Ada/Ed's model of Ada tasking demonstrates that concurrency can be defined within an executable framework, and we argue that an executable definition can describe all non-deterministic aspects of Ada. The problems of overspecificity can be alleviated by the appropriate choice of metalanguage and software techniques, and by suitable parameterization of the formal definition. Finally, we describe some general advantages of executable definitions. We present worked examples of questions to a formal definition of Ada, and establish that requiring a definition to be executable enhances rather than degrades its usability and credibility.
Ph.D. Thesis 1992 Non-Correcting Error Recovery For LR Parsers Snyder, Kirk Abstract | PDF

Title: Non-Correcting Error Recovery For LR Parsers

Candidate: Snyder, Kirk

Advisor(s): Schwartz, Jacob T.

Abstract:

In recent years much effort has been devoted to the automatic generation of parsers, with considerable success. The error-handling mechanisms of these parsers are still not completely satisfactory, however. Currently available techniques are either too slow for practical production compilers, or they leave open the possibility of many spurious diagnostic messages. This thesis presents a parsing technique designed to minimize the frequency of spurious or misleading diagnostic messages emitted by the compiler, without the efficiency cost of similarly robust parsers. The technique parses program text following a syntax error as a "suffix" in the programming language, reporting errors in invalid suffixes. The system achieves its high efficiency by accepting a superset of the suffixes of the language being parsed, but a sufficiently small superset that very few errors are undetected. The technique described has been fully implemented, and a number of experiments on typical syntax errors in various programming languages are presented. We describe our parsing system in detail and assess its strengths and weaknesses relative to other parsing systems.
Ph.D. Thesis 1992 Global Methods for Image Motion Analysis Sundareswaran, V. Abstract | PDF

Title: Global Methods for Image Motion Analysis

Candidate: Sundareswaran, V.

Advisor(s): Hummel, Robert

Abstract:

Processing motion information is an important problem in building automated vision systems. A moving sensor can obtain knowledge about the environmental layout, its own motion, and motion of objects in the scene by processing the temporal information in the imagery. We provide algorithms that can determine self-motion (or egomotion) by observing a sequence of images produced by a moving sensor in a rigid, stationary environment. The algorithms make use of optical flow information extracted from the sequence, and unlike most alternative methods, are global and robust to inaccuracies in the flow data.

Two algorithms are presented. Both algorithms assume that the first stage of visual motion analysis, the computation of an image vector flow field that describes the instantaneous motion of individual points, has been solved.

The first algorithm, the flow circulation algorithm, determines the rotational parameters using the curl of the flow field, which under many conditions is approximately a linear function. The coefficients of the linear function, which may be determined by simple regression, are the desired rotational parameters. Circulation values, defined to be contour integrals of the vector field on the image plane, may be used in place of curl values, resulting in robustness. The second algorithm determines the translational parameters of the motion. The inner product of the image vector flow field and a certain circular vector field gives rise to a scalar function that is of a particular quadratic polynomial form when the center of the circular field is chosen appropriately. This correct choice of the center is related to the translational parameters and can be found by projecting the inner product function onto suitable subspaces determined by the quadratic polynomial form. Three different methods, of increasing complexity and accuracy, are developed. A fourth, fast but approximate method is also presented.

The algorithms are described and analyzed, and experimental results are shown. The thesis contains mathematical observations that provide insight into the problem of motion analysis, and experimental observations that demonstrate the applicability of the algorithms.
Ph.D. Thesis 1991 Persistent LINDA: Design and implementation of a system to add transactions to LINDA Anderson, Brian Abstract | PDF

Title: Persistent LINDA: Design and implementation of a system to add transactions to LINDA

Candidate: Anderson, Brian

Advisor(s): Shasha, Dennis

Abstract:

Persistent Linda (PLinda hereafter) is based on the shared tuple space model of Linda. PLinda extends the model to facilitate the manipulation of sets and to implement transactional persistence. Its operations are upward compatible with Linda's. We have chosen Linda as a basis for the following reasons:

1) A shared memory model is the language of much parallel algorithms work, so implementing a parallel algorithm is easiest in that model. At the same time, a persistent data store is most useful as a shared resource.

2) In a distributed system, the cost of sending a message is often dominated by the cost of setting up the message. By encapsulating accesses into large semantic units (i.e. Linda tuples) as opposed to machine-dependent units such as words, Linda reduces the number of data transfers, thereby reducing set-up overhead. Shared data stores are also best accessed in large chunks for the same reason.

3) Associative retrieval of tuples is a convenient abstraction to the parallel programmer and is a good target of optimization for the database implementer.
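As an illustration of the shared tuple space model and associative retrieval mentioned above, here is a minimal single-process sketch. It is not PLinda's implementation: the class name `TupleSpace` is hypothetical, `None` is assumed to act as a wildcard field, and the retrieval operations are non-blocking, whereas Linda's `in` and `rd` block until a match exists. Transactions and persistence are not modeled.

```python
class TupleSpace:
    """Toy in-memory tuple space with associative (pattern-based) retrieval."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        """Add a tuple to the space."""
        self._tuples.append(tuple(fields))

    @staticmethod
    def _matches(pattern, tup):
        # A pattern matches a tuple of the same length when every
        # non-wildcard field is equal; None is the (assumed) wildcard.
        return len(pattern) == len(tup) and all(
            p is None or p == f for p, f in zip(pattern, tup)
        )

    def rd(self, *pattern):
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def in_(self, *pattern):
        """Remove and return the first matching tuple, or None."""
        tup = self.rd(*pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup
```

A worker might call `ts.in_("job", None, None)` to claim any pending job tuple, which shows how a whole tuple, rather than a machine word, is the unit of transfer.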
Ph.D. Thesis 1991 A Theory of Natural Learning Botta, Alexander Abstract | PDF

Title: A Theory of Natural Learning

Candidate: Botta, Alexander

Advisor(s): Davis, Ernest

Abstract:

Unsupervised learning is based on capturing regularities in data. We formalize the vague notion of regularity, using the concept of algorithmic information (Solomonoff, Chaitin, Koppel). We present a theory on how regularities are induced and accumulated. A generative model captures a regularity if it achieves compression. A basic regularity is a building block for hierarchical structures. We prove that a basic regularity may be identified as a local maximum in compressibility. Stepwise induction is a polynomial-time approach to structures whose basic components have bounded complexity.

Agents exploring a universe engage in active learning. The regularities of their sensory-motor streams are similar to Piaget's schemes and constituents of an induced ontology. We illustrate these ideas on three microworlds. First are Moore automata. State representations are constructed incrementally from results of tests when in that state and from outputs perceived on the way to that state. The second world contains loosely coupled geometric objects. They are basic regularities identifiable by stepwise induction. In the third world the agent has an elaborate eye and can move objects on a tiled surface. Statistical correlations between sets of stimuli are induced, then models are constructed to generate instances of new correlations from already known ones.

Algorithmic information theory allows a unified perspective on many areas of learning research. We define analysis as the separation of novelty in data from the already known. We present explanation based generalization as a well formalized instance of analysis, and constructive induction as an ill defined instance. We show EBG to specialize a theory through positive examples, and we prove it a language independent method, valid beyond the predicate calculus representations.
Ph.D. Thesis 1991 A Practical Method for Constructing Efficient LALR(k) Parsers with Automatic Error Recovery Charles, Phillipe Abstract | PDF

Title: A Practical Method for Constructing Efficient LALR(k) Parsers with Automatic Error Recovery

Candidate: Charles, Phillipe

Advisor(s): Schonberg, Edmond

Abstract:

LR parsing is used for a wide range of applications, including compiler construction, automatic code generation, language-specific editors and natural language processing. Currently, however, solutions have not been developed for practical multiple-lookahead parsing, fully automatic error recovery, and space- and time-efficient LR parsing across this wide range of applications.

We present a practical framework for LR(k) parsing, for k > 1. We give an efficient algorithm that incrementally constructs an LALR(k) parser with varying-length lookahead strings, and whose symbols are consulted during parsing only when necessary.

Currently, effective LR error recovery systems require some user intervention. We describe an effective and fully automated syntactic error recovery method for LR(k) parsers. Finally, we present a generally effective method for compressing LR(k) parsing tables.

We have incorporated these innovations into a parser generator system that automatically constructs a production-quality parser with built-in error diagnostics and recovery. We will show examples of its performance on several programming languages.
Ph.D. Thesis 1991 Statistical Techniques for Parsing Messages Chitrao, Mahesh Abstract | PDF

Title: Statistical Techniques for Parsing Messages

Candidate: Chitrao, Mahesh

Advisor(s): Grishman, Ralph

Abstract:

Message processing is the extraction of information about key events described in brief narratives concerning a narrow domain. This is a suitable task for natural language understanding, since the amount of world knowledge required is limited. However, the messages are often ill-formed and therefore require the grammar which parses them to be quite forgiving. This often results in a proliferation of parses. This problem is compounded by one's inability to construct a complete domain model which would resolve all the semantic ambiguity. Thus, selection of the correct parse becomes an important goal for such systems.

Structural preference is a technique which helps disambiguation by assigning a higher preference to certain syntactic structures. The idea of statistical parsing evolved from the desire to prefer certain structures over others on the basis of empirical observations, rather than ad hoc judgement. In the framework of statistical parsing, every production of the grammar is assigned a priority, which is computed from a statistical analysis of a corpus.

There are two distinct methodologies that can be used for assigning these priorities. In Supervised Training, only the correct parses are used for training the grammar. On the other hand, Unsupervised Training uses parses independent of their semantic validity. After assigning the priorities, the parser searches for parses in a best-first order as dictated by these priorities.

When this scheme was incorporated into the PROTEUS message understanding system while processing OPREP (U.S. Navy Operational) messages, a two-fold advantage was observed. Firstly, the speed of the parsing increased, because rare productions tended not to get used at all. Secondly, since the parses were generated in the best-first order, the parses generated earlier on tended to be more likely and semantically more acceptable.

The performance of the modified parsing algorithm was evaluated with and without several refinements such as the use of context sensitive statistics and the use of heuristic penalties. The relative performances of the grammars trained by Supervised Training and Unsupervised Training were also compared.
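The abstract does not state how production priorities are computed from the corpus. As one plausible sketch, the code below assumes a priority is the relative frequency of a production among all productions sharing its left-hand side. The function name `production_priorities` and the parse representation are hypothetical. Feeding it only correct parses would correspond to Supervised Training; feeding it all parses regardless of validity would correspond to Unsupervised Training.

```python
from collections import Counter, defaultdict


def production_priorities(parses):
    """Estimate a priority for each grammar production from a corpus.

    parses: iterable of parses, each a list of (lhs, rhs) productions,
            where rhs is a tuple of symbols (hypothetical representation).
    Returns {production: relative frequency among productions with same lhs}.
    This relative-frequency estimate is an assumption, not the thesis's formula.
    """
    counts = Counter(prod for parse in parses for prod in parse)
    lhs_totals = defaultdict(int)
    for (lhs, _rhs), count in counts.items():
        lhs_totals[lhs] += count
    return {prod: count / lhs_totals[prod[0]] for prod, count in counts.items()}
```

A best-first parser could then expand higher-priority productions first, so that rarely used productions are tried late or not at all.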
Ph.D. Thesis 1991 On the satisfiability problem for unquantified classes of formulae involving set-theoretical and topological constructs Cutello, Vincenzo Abstract | PDF

Title: On the satisfiability problem for unquantified classes of formulae involving set-theoretical and topological constructs

Candidate: Cutello, Vincenzo

Advisor(s): Schwartz, Jacob T.

Abstract:

In this thesis we prove the solvability of the satisfiability problem for various classes of unquantified set-theoretical formulae. In particular, we will provide satisfiability tests that given a formula as input produce a model for it, if any exists. We will also show how the decidability of certain fragments of set theory can be used to prove the solvability of the satisfiability problem for some unquantified languages involving topological notions. In particular, a list of topological statements whose validity can be checked by our algorithms is given. The underlying motivation for this study is to enrich the class of theoretical results that can be used for a set-theoretic proof verifier; we also provide lower bounds for what is undecidable in set theory and topology.
Ph.D. Thesis 1991 Scheduling for Horizontal Systems: The VLIW Paradigm in Perspective Gasperoni, Franco Abstract | PDF

Title: Scheduling for Horizontal Systems: The VLIW Paradigm in Perspective

Candidate: Gasperoni, Franco

Advisor(s): Schonberg, Edmond

Abstract:

This work focuses on automatic extraction of operation level parallelism from programs originally intended to be sequential. Optimality issues in the framework of very long instruction word architectures and compilers (VLIW) are investigated. Possible advantages of an idealized dynamic scheduler over a purely static one are also explored. More specifically, the model and the results of scheduling theory are extended to account for cyclicity and branching capabilities present in sequential programs. The existence of inherent bottlenecks in the VLIW paradigm is substantiated and the advantage of dynamic over static scheduling is demonstrated for certain types of loops. A novel technique for efficient parallelization of straight line loops is presented. A simple scheduling heuristic for arbitrary programs is proven to perform between a constant and a logarithmic factor from appropriately defined optimality criteria. Finally, we prove the existence of loops containing branches for which no parallel program can achieve time optimal performance on VLIWs with unlimited resources. The overall aim of the thesis is to identify the family of sequential programs for which the VLIW model of parallel computation is viable.
Ph.D. Thesis 1991 Efficient Loop-Level Parallelism in ADA Hind, Michael Abstract | PDF

Title: Efficient Loop-Level Parallelism in ADA

Candidate: Hind, Michael

Advisor(s): Schonberg, Edmond

Abstract:

Parallelism in scientific applications can most often be found at the loop level. Although Ada supports parallelism via the task construct, its coarseness renders it unsuitable for this light-weight parallelism. In this work, we propose Ada constructs to achieve efficient loop-level parallelism in ANSI-Ada. This is accomplished in two steps. First, we present an idiom that allows the specification of light-weight tasks. Second, we give an efficient implementation of this idiom (for a variety of shared memory machines) that is considerably more efficient than a standard Ada task.

In addition, we present an idiom that makes the fetch_and_add synchronization primitive available at the Ada level. Our implementation of this idiom is more efficient in both time and space than previous results. In addition to providing universal synchronization, using fetch_and_add simplifies program analysis (e.g. proving the absence of race conditions in the implementation of a parallel algorithm). Since all these idioms are written in standard Ada, they maintain the portability that is central to the mandated uses of the language.
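To show how fetch_and_add supports loop-level parallelism, here is a minimal sketch in Python rather than Ada. It is not the thesis's idiom: the names `FetchAndAdd` and `parallel_for` are hypothetical, and a lock emulates the atomicity that the hardware primitive would provide. Each worker claims the next loop index with a single fetch_and_add, so no two workers ever execute the same iteration.

```python
import threading


class FetchAndAdd:
    """Shared counter with an atomic fetch-and-add (emulated with a lock)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_and_add(self, delta):
        """Atomically add delta and return the value held before the add."""
        with self._lock:
            old = self._value
            self._value += delta
            return old


def parallel_for(n_iterations, n_workers, body):
    """Run body(i) for i in range(n_iterations), self-scheduled across workers."""
    counter = FetchAndAdd(0)

    def worker():
        while True:
            i = counter.fetch_and_add(1)  # claim the next unclaimed index
            if i >= n_iterations:
                return
            body(i)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Because every index is handed out exactly once, iterations that write to disjoint locations need no further synchronization, which is the kind of property that simplifies race-condition analysis.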
Ph.D. Thesis 1991 Segmentation and Surface-Based Modeling of Objects in Three-Dimensional Biomedical Images Kalvin, Alan Abstract | PDF

Title: Segmentation and Surface-Based Modeling of Objects in Three-Dimensional Biomedical Images

Candidate: Kalvin, Alan

Advisor(s): Hummel, Robert

Abstract:

The rapid development of technologies for imaging the human body has led to a growing interest in the extraction and analysis of objects in 3D biomedical images for applications in fields such as clinical medicine, biomedical research, and physical anthropology.

This dissertation examines the problem of creating surface-based geometric models of biomedical objects that are suitable for analysis through visualization, mensuration, and manipulation. This is a two-stage problem. First the objects are identified by segmenting the 3D image into regions of interest, and then surface-based models of the objects are created.

We discuss the issues of segmentation and surface construction and introduce the following new methods for solving these problems.

First, we present the MLO algorithm, a general-purpose, domain-independent segmentation algorithm that has been applied successfully to identify skulls in CT images, the ventricle walls of the heart in MR images, brain ventricles in CT images, and carotid arteries in MR angiography images. It uses an iterative, cooperative procedure to segment an image by optimizing a cost function. To achieve a fast segmentation, a coarse-to-fine strategy is employed, using a multiresolution pyramid.

The GRG algorithm is a model-driven, special-purpose algorithm for identifying thin bone in CT head images. The algorithm, developed specifically for craniofacial surgical planning, uses anatomical knowledge in the segmentation process, and can handle the abnormal anatomy of craniofacial patients. It successfully finds most of the thin bone that cannot be found using previous methods.

ALLIGATOR is a surface construction algorithm that creates models using the "winged-edge" data structure of Baumgart, enabling efficient access to the topological and geometric information of the surfaces, and permitting efficient, topologically consistent modifications to the representations.

Unlike previous surface construction algorithms, ALLIGATOR is suitable not just for visualizing biomedical objects, but for measuring and manipulating them as well. Another important feature of ALLIGATOR is that it uses an adaptive face-merging process to create surface models that are significantly more concise, in terms of vertices, edges, and faces, than the models produced by other surface construction algorithms.
Ph.D. Thesis 1991 The Development of Parallel Image Algorithms by Prototyping Kelly, Robert Abstract | PDF

Title: The Development of Parallel Image Algorithms by Prototyping

Candidate: Kelly, Robert

Advisor(s): Hummel, Robert

Abstract:

We examine the process of parallel algorithm development for a class of image synthesis and image processing problems. Algorithms are developed for a class of parallel machines characterized by shared memory multiprocessors, as exemplified by the Ultracomputer model. The new algorithms are asynchronous in nature, and many employ the "pool of tasks" paradigm. These algorithms are prototyped using the sequential specification language SETL that has been adapted to function as a parallel specification tool. The issue of refinement of the high-level specification is illustrated with a number of examples of machine-specific implementations.

Parallel algorithms are proposed for the connected components problem and for hidden surface removal in surface rendering, and parallel algorithms for ray tracing are discussed. Within the investigation of connected components algorithms, new algorithms are suggested for four classes of approaches to the problem: (1) adjacency matrix methods, (2) pointer graph methods based on the vertex collapse algorithm of Hirschberg, (3) pointer graph methods based on the Shiloach/Vishkin connected components algorithm, and (4) image scan algorithms, based on the sequential raster scan "blob coloring" algorithm. For the third area, the Shiloach/Vishkin-type connected components algorithm, we show how a stronger model of computation (one that permits constant-time concurrent additive-writes) allows the elimination of one of the steps of the algorithm. Although this modification does not improve the asymptotic time complexity of the algorithm, the MIMD version of the Shiloach/Vishkin algorithm is then considerably simplified, contains fewer synchronization points, and has improved expected execution time performance.

All algorithms are given in the parallel-adapted SETL language. The final versions of all proposed parallel connected components algorithms are further refined into EPEX/Fortran, suitable for execution on an RP3 simulator system. Empirical results are obtained for various algorithms by instrumenting either the SETL version or the EPEX/Fortran version, thereby providing estimates of expected performance times by means of examining average lengths of queues of tasks. In particular, queue activity patterns are examined for executions of the parallel adjacency matrix connected components algorithm, and the MIMD version of the Shiloach/Vishkin connected components algorithm. For the latter, run-time performance estimates are made demonstrating the utility of the modifications made to the MIMD version of the algorithm. For the image scan algorithms, estimates are obtained comparing the size of subimages that are assigned to processors against the sizes of the reduced graph connected components problem that result, based on runs of the EPEX/Fortran version. Finally, the shared-memory access patterns of the parallel ray-tracing algorithm are examined, suggesting that the algorithm is viable in terms of memory contention rates.
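For context, the sequential raster scan "blob coloring" approach on which the image scan algorithms are based can be sketched as follows. This is a generic textbook-style version, not the thesis's SETL or EPEX/Fortran code: it assumes a binary image given as a list of rows, 4-connectivity, and a union-find structure to record label equivalences. The function name `blob_coloring` is hypothetical.

```python
def blob_coloring(image):
    """Label 4-connected components of a binary image by raster scan.

    image: list of equal-length rows; truthy entries are foreground.
    Returns a grid of labels, with -1 for background pixels.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    parent = []  # union-find forest over provisional labels

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    labels = [[-1] * cols for _ in range(rows)]
    # First pass: assign provisional labels from the upper and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else -1
            left = labels[r][c - 1] if c > 0 else -1
            if up < 0 and left < 0:
                parent.append(len(parent))  # start a new blob
                labels[r][c] = len(parent) - 1
            else:
                labels[r][c] = up if up >= 0 else left
                if up >= 0 and left >= 0:
                    union(up, left)  # the two blobs touch here
    # Second pass: replace each provisional label by its representative.
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] >= 0:
                labels[r][c] = find(labels[r][c])
    return labels
```

In a parallel setting one could imagine running the first pass on separate subimages and merging labels along the seams, which is where a reduced graph connected components problem would arise; that step is not shown here.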
Ph.D. Thesis 1991 Semantically Based Concurrent Data Structure Algorithms Lanin, Vladimir Abstract | PDF

Title: Semantically Based Concurrent Data Structure Algorithms

Candidate: Lanin, Vladimir

Advisor(s): Shasha, Dennis

Abstract:

A computational environment is called concurrent when it allows several threads of sequential control, or processes, to overlap in time and to communicate with each other. Such an environment is called synchronous when the length of time it takes any process to execute any sequence of steps can be determined in advance. When such a calculation is impossible (at least to the precision required), the environment is called asynchronous.

Algorithms designed to work in the asynchronous concurrent environment have appeared in the literature for such data structures as B-trees, hash tables, and queues. The most common standard of correctness for a concurrent algorithm is serializability, which requires that the effects of a concurrent computation be equivalent to some serial composition of the same actions. However, several notions of "equivalence" exist, depending on whether they take into account the semantics of the data structure, or only the syntax of the computation.

We examine the drawbacks and advantages of several correctness standards, and identify a particular standard to be of general utility. Furthermore, we formalize the notion of decisive operations, and show how it can be applied to greatly simplify semantic serializability proofs.

We apply the concepts of syntactic and semantic serializability to the development of several novel algorithms, including an extension of the tree protocol to changing trees, a highly concurrent B-tree algorithm, and a wait-free set manipulation algorithm. Useful techniques appearing in the design are identified, and the correctness proofs serve as examples of the techniques previously described.
Ph.D. Thesis 1991 On the Optimization of Term Rewriting Li, Ke Abstract | PDF

Title: On the Optimization of Term Rewriting

Candidate: Li, Ke

Advisor(s): Kedem, Zvi

Abstract:

Term rewriting systems (TRSs) are widely applied in automated theorem proving, equational languages, logic programming, specification of software and hardware, and other symbolic computations. An important computation procedure in applications of TRSs is to reduce a term to its normal form. The research of this thesis examines the complexity of normal form computation and explores efficient rewriting strategies for it.

A rewriting strategy is said to be optimal if, for any term, it always generates the shortest derivation when computing the term's normal form. First, we prove that a universal optimal rewriting strategy for any canonical TRS does not exist unless NP = P. We prove the same result for AC TRSs, in which some functions are associative and commutative.

To find efficient rewriting strategies, we divide TRSs into three categories: variable-more, variable-equal, and variable-fewer, and propose optimal rewriting strategies for the first two categories and approximate strategies for the last.

We have done experiments on RRL (Rewrite Rule Laboratory), an automated theorem prover with term rewriting as the basic inference rule. The experimental output confirms our theoretical results. Based on our theory and experimental results, we improve RRL by implementing new programs that automatically choose an efficient rewriting strategy for a given term rewriting system.
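To illustrate why the choice of rewriting strategy affects derivation length, here is a small sketch of normal form computation under two standard strategies, leftmost-innermost and leftmost-outermost. It does not implement the strategies proposed in the thesis or anything from RRL. The term encoding is an assumption: constants are strings, variables are strings beginning with "?", and applications are tuples whose first element is the function name.

```python
def match(pattern, term, env):
    """Match pattern against term; return extended bindings or None."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return env if pattern == term else None
    if len(pattern) != len(term) or pattern[0] != term[0]:
        return None
    for p, t in zip(pattern[1:], term[1:]):
        env = match(p, t, env)
        if env is None:
            return None
    return env


def substitute(term, env):
    """Replace variables in term by their bindings."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0],) + tuple(substitute(a, env) for a in term[1:])


def rewrite_at_root(term, rules):
    """Apply the first applicable rule at the root, or return None."""
    for lhs, rhs in rules:
        env = match(lhs, term, {})
        if env is not None:
            return substitute(rhs, env)
    return None


def step(term, rules, innermost):
    """Perform one rewrite step; return None if term is in normal form."""
    if not innermost:
        new = rewrite_at_root(term, rules)
        if new is not None:
            return new
    if not isinstance(term, str):
        for i in range(1, len(term)):
            new = step(term[i], rules, innermost)
            if new is not None:
                return term[:i] + (new,) + term[i + 1:]
    return rewrite_at_root(term, rules) if innermost else None


def normalize(term, rules, innermost, max_steps=1000):
    """Rewrite to normal form; return (normal_form, number_of_steps)."""
    for steps in range(max_steps):
        new = step(term, rules, innermost)
        if new is None:
            return term, steps
        term = new
    raise RuntimeError("no normal form reached within max_steps")
```

With the hypothetical rules `first(?x, ?y) -> ?x`, `a -> b`, and `dup(?x) -> pair(?x, ?x)`, the term `first(c, a)` is normalized faster outermost (the argument `a` is discarded before it is rewritten), while `dup(a)` is normalized faster innermost (`a` is rewritten once before it is duplicated). Neither fixed strategy gives the shortest derivation on both terms.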
Ph.D. Thesis 1991 The Design and Implementation of ALLOY, a Higher Level Parallel Programming Language Mitsolides, Thanasis Abstract | PDF

Title: The Design and Implementation of ALLOY, a Higher Level Parallel Programming Language

Candidate: Mitsolides, Thanasis

Advisor(s): Harrison, Malcolm C.

Abstract:

The goal of this thesis is to show that it is possible to define a parallel higher level programming language for programming in the large which will be able to easily express both complicated parallel problems and traditional serial ones. Such a language would provide many good features of serial and parallel programming languages and be appropriate for programming massively parallel computing systems. To demonstrate this, a simple language, called ALLOY, was designed. The main features of this language could be incorporated into other languages.

ALLOY directly supports functional, object oriented and logic programming styles in a unified and controlled framework. Evaluating modes support serial or parallel execution, eager or lazy evaluation, non-determinism or multiple solutions. These modes can be combined freely. ALLOY is simple, utilizing only 29 primitives, half of which are for object oriented programming.

The power of ALLOY is demonstrated through the use of a wide variety of examples. Some of the examples are: a) partition sort and FP library demonstrating clarity, efficiency, and simple parallelism, b) prime numbers and buffering demonstrating the ability to select between eager and lazy evaluation, c) systolic sort and merge sort demonstrating dynamic networks of communicating processes, d) N-queens and list permutations demonstrating serial and parallel searching. A library is given for programming in logic programming styles. Finally a number of parallel objects demonstrate ALLOY's ability to exploit massively parallel architectures effectively.

An interpreter of ALLOY together with a number of utilities and a programming environment has been written in Common Lisp. The system is available for anonymous ftp. It is shown that ALLOY can have a reasonably efficient implementation on shared memory multiprocessor (MIMD) systems supporting highly parallel operations, on distributed architectures, and possibly on Data Flow architectures as well.
Ph.D. Thesis 1991 Semantic program analyses for storage management optimizations in functional language implementations Park, Young G. Abstract | PDF

Title: Semantic program analyses for storage management optimizations in functional language implementations

Candidate: Park, Young G.

Advisor(s): Goldberg, Benjamin

Abstract:

One of the major overheads in implementing functional languages in both uniprocessor and multiprocessor environments is the storage management overhead due to dynamic allocation and automatic reclamation of indefinite-extent storage. We investigate compiler optimization to reduce such overhead by statically inferring the lifetime information about dynamically-allocated objects.

We have developed a set of compile-time semantic analyses for a higher-order monomorphic strict functional language based on denotational semantics and abstract interpretation:
- Escape Analysis: provides information about the relative lifetimes of objects with respect to the activation of the function call.
- Refined Escape Analysis: provides, as a refinement of escape analysis, information about the lifetimes of components of aggregate structures.
- Reference Escape Analysis: provides information about the relative lifetimes of references created within a function with respect to the activation of the function call.
- Order-of-Demand Analysis: provides information about the order in which the values of bound variables are demanded, and thus allows a range of information to be computed, including strictness, evaluation-order and evaluation-status information.
These analyses are extended to both polymorphic and non-strict (either normal-order evaluation or lazy evaluation) languages.

Using statically inferred escape information, we have proposed a variety of storage management optimization techniques including stack allocation, explicit reclamation, in-place reuse, reference counting elimination, block allocation/reclamation, and improving generational garbage collection.
Ph.D. Thesis 1991 Counting Real Zeros Pedersen, Paul Abstract | PDF

Title: Counting Real Zeros

Candidate: Pedersen, Paul

Advisor(s): Mishra, Bud

Abstract:

This thesis presents an n-dimensional generalization of Hermite's theorem for counting real roots of a polynomial using quadratic forms. We solve the problem of counting the number of real solutions of a system of polynomial equations within an algebraic polyhedron in n-dimensional space, where the polynomials are taken to have rational coefficients.

Our algorithm is purely symbolic, which means that it may be used to implement infinite-precision algorithms for arithmetic in the real-algebraic subset of the real numbers. We present algorithms for doing this as an application of the general theory.

Our algorithms are based on resultant theory, both because this theory provides insights into the algorithms, and because it makes possible a comparatively clear complexity analysis which shows the algorithms to be worst-case optimal, i.e., singly exponential in the degree of the polynomials.
Ph.D. Thesis 1991 Combinatorial and algorithmic analysis of stabbing and visibility problems in three-dimensional space Pellegrini, Marco Abstract | PDF

Title: Combinatorial and algorithmic analysis of stabbing and visibility problems in three-dimensional space

Candidate: Pellegrini, Marco

Advisor(s): Pollack, Richard

Abstract:

Given a set $T$ of triangles in 3-space, with $\vert T\vert = n$, let $\mathcal{S}(T)$ be the set of all lines stabbing the set $T$. The combinatorial descriptive complexity of $\mathcal{S}(T)$ is denoted by $\#\mathcal{S}(T)$. The following questions about $\mathcal{S}(T)$ are considered in this thesis: (a) answer the query: given a line $l$, is $l \in \mathcal{S}(T)$? (query problem); (b) decide whether $\mathcal{S}(T) \neq \emptyset$ (existence problem); (c) give upper and lower bounds on $\#\mathcal{S}(T)$. The following results are shown in this thesis: (1) There is an $\Omega(n^3)$ lower bound for $\#\mathcal{S}(T)$. Also, $\mathcal{S}(T)$ may have $\Omega(n^2)$ connected components. (2) There is an $O(n^{3+\epsilon})$ upper bound on $\#\mathcal{S}(T)$. Within the same time bound it is possible to solve the existence problem. (3) The existence problem for triangles on a set of planes with $g$ different plane inclinations can be solved in $O(g^2 n^2 \log n)$ time. (4) The query problem is solvable in $O(n^{2+\epsilon})$ preprocessing and storage and logarithmic $O(\log n)$ query time. (5) The results (1), (2), (3) and (4) extend, with the same asymptotic bounds, to sets of convex polyhedra with total complexity $n$.

Given a set $T$ of $n$ disjoint triangles, the ray shooting problem for $T$ is the following: preprocess $T$ so as to be able to answer queries of the form: given a ray $\rho$, does $\rho$ hit any triangle in $T$? The following results are shown in this thesis: (1) Using $O(n^{3+\epsilon})$ randomized preprocessing time and storage we can solve ray-shooting queries in $O(\sqrt{n}\log^2 n)$ worst case query time. (2) If we are given $m > n^{7/5}$ rays and $n$ disjoint triangles, we can answer all the ray shooting queries in $O(m^{5/6-\delta} n^{5/6+5\delta} \log n + m \log n + n \log m)$ randomized expected time and $O(m+n)$ space, for every $\delta > 0$. The multiplicative constants depend on $\delta$. (3) Given $m$ rays and $n$ axis-oriented boxes we can answer ray shooting queries in randomized expected time $O(m^{3/4-\delta} n^{3/4+3\delta} \log^3 n + m \log^3 n + n \log m)$ and $O(m+n)$ space, for $1/28 < \delta < 1/9$. The multiplicative constants depend on $\delta$.
Ph.D. Thesis 1991 Properties of Convex Polytopes Prabhu, N. Abstract | PDF

Title: Properties of Convex Polytopes

Candidate: Prabhu, N.

Advisor(s): Pollack, Richard

Abstract:

The thesis presents some results about the boundary complexes of convex polytopes.

1. The intersection of affine subspaces with the boundary complexes of convex polytopes: We show that the lower bound on the dimension of a subspace that intersects the relative interiors of all j-faces of a d-polytope is 2(d-j). We also show that every d-simplex attains the above lower bound; hence the bound is tight. Further, using neighborly polytopes, we construct polytopes with arbitrarily large number of vertices which attain the above lower bound.
2. Hamiltonian simple polytopes: Given integers n and d, n > d, does there exist a simple d-polytope with n vertices? We show that for all $n > c d\sqrt{d}$ (c a constant) one can construct a simple d-polytope with n vertices. In fact, for all $n > c d\sqrt{d}$, we construct a Hamiltonian simple d-polytope with n vertices. The Hamiltonicity of the constructed polytopes improves a result of Victor Klee.
3. Construction of a 4-dimensional polytope to show that in general one cannot find a hyperplane in R^d that contains a given pair of vertices of a d-polytope and has two or more facets of the polytope in one of the closed halfspaces.
4. A generalization of Balinski's Theorem: Balinski showed that the graph of every d-polytope is d-connected, i.e., removing any d-1 vertices does not disconnect the remaining subgraph. However, removing all the vertices of a j-face (j < d) leaves the remaining subgraph (d-j-1)-connected and this bound is tight for j < d-1.
5. A conjecture of Micha Perles: Perles conjectured that every induced, connected, (d-1)-regular subgraph of the graph of a simple d-polytope determines a facet of the polytope. Generalizing Perles' conjecture to triangulated spheres leads to a question about the existence of a certain triangulation of the 3-ball and the solid torus. We show that neither the 3-ball nor the solid torus admits the required triangulation. Further, we prove Perles' conjecture for some subclasses of simple polytopes and prove a few reduction theorems.
Ph.D. Thesis 1991 Amortized Complexity of Data Structures Sundar, Rajamani Abstract | PDF

Title: Amortized Complexity of Data Structures

Candidate: Sundar, Rajamani

Advisor(s): Boppana, Ravi

Abstract:

This thesis investigates the amortized complexity of some fundamental data structure problems and introduces interesting ideas for proving lower bounds on amortized complexity and for performing amortized analysis. The problems are as follows:
- Dictionary Problem: A dictionary is a dynamic set that supports searches of elements and changes under insertions and deletions of elements. It is open whether there exists a dictionary data structure that takes constant amortized time per operation and uses space polynomial in the dictionary size. We prove that dictionary operations require log-logarithmic amortized time under a multilevel hashing model that is based on Yao's cell probe model.
- Splay Algorithm's Analysis: Splay is a simple, efficient algorithm for searching binary search trees, devised by Sleator and Tarjan, that uses rotations to reorganize the tree. Tarjan conjectured that Splay takes linear time to process deque operation sequences on a binary tree and proved a special case of this conjecture called the Scanning Theorem. We prove tight bounds on the maximum numbers of various types of right rotations in a sequence of right rotations performed on a binary tree. One of the lower bounds refutes a conjecture of Sleator. We apply the upper bounds to obtain a nearly linear upper bound for Tarjan's conjecture. We give two new proofs of the Scanning Theorem, one of which is a potential-based proof that solves a problem of Tarjan.
- Set Equality Problem: The task of maintaining a dynamic collection of sets under various operations arises in many applications. We devise a fast data structure for maintaining sets under equality-tests and under creations of new sets through insertions and deletions of elements. Equality-tests require constant time and set-creations require logarithmic amortized time. This improves previous solutions.
Ph.D. Thesis 1991 Performance Evaluation of Solutions to the TLB Consistency Problem Teller, Patricia Abstract | PDF

Title: Performance Evaluation of Solutions to the TLB Consistency Problem

Candidate: Teller, Patricia

Advisor(s): Gottlieb, Allan

Abstract:

To implement virtual memory efficiently, virtual-to-physical address translation information is stored in page tables and cached in translation-lookaside buffers (TLBs). In multiprocessors with multiple TLBs, page-table modifications can result in outdated TLB entries, the use of which can cause erroneous memory accesses.

We propose three new solutions to this TLB consistency problem, which, unlike existing solutions for highly-parallel shared-memory multiprocessors, do not require interprocessor synchronization and communication, and neither interrupt processor execution nor introduce unnecessary serialization.

The cost of each of our solutions is embodied in the cost of TLB reloads, which load translation information for referenced pages into TLBs. Two assume TLBs at processors and one assumes TLBs at memory. We study their performance in scalable multiprocessor architectures via a trace-driven simulation system capable of simulating a range of systems using just one address trace.

Our results show that system performance improves if TLBs are located at memory, rather than processors, provided that memory is organized as multiple paging arenas, where the mapping of pages to arenas is fixed.

A class of parallel workloads can produce a number of TLB reloads, R, that grows linearly with N. A set of our simulations for processor-based TLBs validates this model.

A processor-based TLB reload costs O(log N) because of network transit. Thus, management of processor-based TLBs, be it consistency ensuring or not, has an overhead that grows as N log N.

The cost of a memory-based TLB reload within a paging arena can be made smaller than that of a processor-based TLB, since additional network transits are not required.

Simulation results show that when there is only one paging arena, memory-based TLBs exhibit generally larger miss rates than processor-based TLBs, and the related overhead is generally larger. When there are two paging arenas, memory-based TLBs produce smaller miss rates than processor-based TLBs of equal size, and the related overhead is generally smaller. To maintain low overhead for large machines, it is likely that the number of paging arenas must grow as O(N).
Ph.D. Thesis 1991 Applications and Analysis of Probabilistic Techniques Tetali, Prasad Abstract | PDF

Title: Applications and Analysis of Probabilistic Techniques

Candidate: Tetali, Prasad

Advisor(s): Spencer, Joel

Abstract:

The thesis illustrates the strength of randomness by applying some recent probabilistic techniques to solve problems in number theory, graph theory and computer science.

The first part of the thesis is concerned with random construction of integer sequences with certain additive properties. A set of natural numbers is called an asymptotic basis of order k if every sufficiently large number can be expressed as a sum of k distinct numbers from the set. We prove that for every fixed k, there exists an asymptotic basis of order k such that the number of representations of n is $\Theta(\log n)$. The case k = 2 was proved in 1956 by Paul Erdős.
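
To make the definition concrete, the following minimal sketch counts the representations of n as a sum of k distinct elements of a given set. The function name `count_representations` and the example set are illustrative assumptions and do not come from the thesis; the brute-force enumeration is only meant to show what is being counted, not how the probabilistic construction works.

```python
from itertools import combinations

def count_representations(n, k, basis):
    """Count the ways to write n as a sum of k distinct elements of `basis`.

    Representations that differ only in the order of the summands count once.
    """
    candidates = sorted(x for x in set(basis) if x <= n)
    return sum(1 for combo in combinations(candidates, k) if sum(combo) == n)

# Hypothetical example set: the integers 1..20.
print(count_representations(10, 2, range(1, 21)))
```

For an asymptotic basis of order k, this count is at least 1 for every sufficiently large n; the result above concerns sets for which it grows like log n.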

The second part deals with analysis of random walks on graphs. Random walks on graphs have been known to have interesting analogies in electrical networks. A precise characterization of effective resistance in electrical networks is provided in this thesis in terms of random walks on the underlying graphs. The interpretation of effective resistance yields interesting new results and new proofs for some known results. The main result here is an exact formula for the hitting time between two vertices in terms of the effective resistances in the network, settling an open question. This is much in the spirit of the commute time result by Ashok Chandra et al.
Ph.D. Thesis 1991 Resilient Computations in the Presence of Slow-Downs Turek, John Abstract | PDF

Title: Resilient Computations in the Presence of Slow-Downs

Candidate: Turek, John

Advisor(s): Shasha, Dennis; Cole, Richard

Abstract:

With the advent of low cost work stations, distributed systems are becoming increasingly attractive. However, as the number of components in the system increases so does the probability of some component failing. When system designers discuss fault-tolerance, they typically restrict themselves to the problem of handling fail-stop failures. This work proposes an enhanced failure model that allows processes to fail by either slowing down or stopping; slow processes may later speed up, continue to proceed slowly, or, eventually, stop. We call such failures slow-downs. The model does not assume the ability to distinguish among these possibilities, say, by using a timeout mechanism, nor does it assume that it is possible to kill a slow process.

This thesis presents several results in this context. We discuss how to execute transactions under the slow-down model when the correctness criterion is serializability. We then discuss how to transform a class of lock-based concurrent data structures into nonblocking data structures. Both results are developed in the context of a shared memory machine having an atomic compare&swap.

We conclude this thesis by giving algorithms that can be used to emulate a reliable shared memory with compare&swap on a message passing system prone to slow-downs.
Ph.D. Thesis 1991 Query Optimization in Database and Information Retrieval Systems Wang, Tsong-Li Abstract | PDF

Title: Query Optimization in Database and Information Retrieval Systems

Candidate: Wang, Tsong-Li

Advisor(s): Shasha, Dennis

Abstract:

Recently, several prototype and commercial systems based on a loosely-coupled shared-nothing architecture have been proposed and built for database applications. To achieve speed-ups proportional to the number of processors for operations such as selections and joins, such systems often distribute data across storage units using a hashing function. In the first part of this thesis, we investigate ways of minimizing response time for various multi-join queries in such systems. We develop a dynamic programming algorithm for queries whose closures are chains. We next prove NP-completeness for more general queries and propose four heuristics for them. We then evaluate experimentally the performance of these heuristics relative to one another and relative to the optima. The empirical results show that a hybrid heuristic combining our chain algorithm with a heuristic related to Kruskal's spanning tree algorithm performs well.

In the second part of the thesis, we present a scheme to answer best-match queries from a file containing a collection of objects. A best-match query is to find the objects in the file which are closest (according to some (dis)similarity measure) to a given target.

Previous work suggested that one can reduce the computational effort required to achieve the desired results using the triangle inequality when starting with a data structure for the file which reflects some precomputed intrafile distances. We generalize the technique to allow the optimum use of any given set of precomputed intrafile distances. We then extend our scheme to a class of queries for retrieving similar or dissimilar objects that commonly arise in vision and molecular biology. Artificial data and actual protein sequences are used to illustrate the effectiveness of our scheme for different queries, and to compare its performance with previous algorithms.
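
The pruning idea behind the triangle inequality can be sketched as follows. This is a minimal illustration that assumes a single reference object (here called `pivot`) whose distances to all file objects are precomputed; the thesis generalizes to any given set of precomputed intrafile distances, which this sketch does not attempt. All names and the example data are hypothetical.

```python
def best_match(query, objects, pivot, pivot_dists, dist):
    """Return (best_object, best_distance, number_of_distance_evaluations).

    pivot_dists[i] must equal dist(pivot, objects[i]), computed in advance.
    `dist` must be a metric so that the triangle inequality holds.
    """
    d_qp = dist(query, pivot)
    evaluations = 1
    best_obj, best_d = None, float("inf")
    for obj, d_px in zip(objects, pivot_dists):
        # Triangle inequality: dist(query, obj) >= |d_qp - d_px|.
        if abs(d_qp - d_px) >= best_d:
            continue  # obj cannot beat the current best, so skip the evaluation
        d = dist(query, obj)
        evaluations += 1
        if d < best_d:
            best_obj, best_d = obj, d
    return best_obj, best_d, evaluations

# Hypothetical example data: points on a line, absolute difference as the metric.
objects = [1, 4, 9, 16, 25, 36]
pivot = 0
dist = lambda a, b: abs(a - b)
pivot_dists = [dist(pivot, x) for x in objects]
print(best_match(10, objects, pivot, pivot_dists, dist))
```

In this example the last three objects are never compared with the query, because the lower bound from the precomputed distances already exceeds the best distance found so far.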

Finally, we implement our techniques into a tree information system that enables users to retrieve and extract information from trees based on approximate comparison. We expect this system to have applications in pattern recognition, biology, linguistics, and programming languages. The system is implemented in C and X-windows, and is fully operational on SUN workstations.
Ph.D. Thesis 1991 Toward a Fully Integrated VLSI CAD System: from Custom to Fully Automatic You, Yongtao Abstract | PDF

Title: Toward a Fully Integrated VLSI CAD System: from Custom to Fully Automatic

Candidate: You, Yongtao

Advisor(s): Siegel, Alan

Abstract:

This thesis describes an integrated CAD environment, which is intended to support almost all phases of the VLSI circuit design cycle, from high-level circuit description down to mask specification. Several VLSI CAD tools have been integrated together under the environment, including a multi-level simulator Msim, a hardware description language CHDL, some automatic placement tools, a schematic layout editor, and the UC Berkeley-developed geometry layout editor Magic.

The multi-level simulator Msim supports top-down design by allowing circuits whose components are described at different levels to be simulated together. The levels of circuit description currently supported include a hardware description language CHDL, which is a variant of the C programming language for circuit behavior descriptions, a schematic layout representation, and the Magic layout from which masks for wafer fabrication can be generated.

The schematic layout editor allows designers to specify interconnections among circuit components in a very efficient manner. It supports both behavioral descriptions and high level geometric layout of a circuit. Designers can have a graphical view of their design, and specify, within this graphical organization, the behavioral description of components at different levels of abstraction. These schematic layouts with different levels of representation can be simulated using the multi-level simulator Msim.

The automatic placement tool presently performs bottom-up iterative improvement, with simulated annealing as its assistant when needed. An interactive graphics interface is provided which allows human intervention on intermediate as well as final layouts.

In addition, the linear (true) charge-sharing modeling problem with indeterminate transistor switches is shown to be NP-Complete, which explains why it is integrated exclusively within the lattice model for our switch-level simulation.
Ph.D. Thesis 1991 Edge representation from wavelet transform maxima Zhong, Sifen Abstract | PDF

Title: Edge representation from wavelet transform maxima

Candidate: Zhong, Sifen

Advisor(s): Mallat, Stephane

Abstract:

The multiscale edges of a signal are the sharp variation points measured at different scales. This thesis studies a model of multiscale edge representation based on the local maxima of the wavelet transform. The wavelet transform is a mathematical formulation of a multiscale decomposition. It decomposes a signal into multiple components indexed by a scale parameter. A particular class of wavelets is used such that each of these components is the first derivative of a smooth version of the signal, with the scale parameter indicating the degree of smoothing. The local maxima of this wavelet transform are therefore a multiscale edge representation. This thesis shows that the local maxima not only identify the edges but also characterize the edges. An algorithm to reconstruct a signal from its local maximum representation is developed. The experimental results show that the algorithm reconstructs the original signal, and this reconstruction is stable. This implies that the local maximum representation is a reorganization of the signal information. Therefore, various pattern analysis algorithms can be developed based solely on the properties of edges. Image processing can also be done through the multiscale edge representation. An application to image coding is described.
Ph.D. Thesis 1990 Detecting Nondeterminism in Shared Memory Parallel Programs Dinning, Anne Abstract | PDF

Title: Detecting Nondeterminism in Shared Memory Parallel Programs

Candidate: Dinning, Anne

Advisor(s): Mishra, Bud

Abstract:

This thesis addresses the problem of detecting a specific type of nondeterminism in shared memory parallel programs known as access anomalies. An access anomaly occurs when an update to a shared variable X is concurrent with either a read of X or another update of X.
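
The definition of an access anomaly can be illustrated with a small checker over a recorded trace. This sketch is not the task recycling technique introduced in the thesis; it assumes, purely for illustration, that each access carries a standard vector clock, and it compares all pairs after the fact. The tuple layout and function names are hypothetical.

```python
from itertools import combinations

def happens_before(a, b):
    """True if vector clock a precedes b: a <= b componentwise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def find_access_anomalies(accesses):
    """Return pairs of accesses that form an access anomaly.

    Each access is (thread_id, kind, variable, clock), where kind is
    "read" or "update" and clock is a tuple used as a vector clock.
    """
    anomalies = []
    for a, b in combinations(accesses, 2):
        same_variable = a[2] == b[2]
        has_update = "update" in (a[1], b[1])
        different_threads = a[0] != b[0]
        unordered = not happens_before(a[3], b[3]) and not happens_before(b[3], a[3])
        if same_variable and has_update and different_threads and unordered:
            anomalies.append((a, b))
    return anomalies

# Hypothetical trace: thread 1 reads X once concurrently with the update by
# thread 0, and once after synchronizing with thread 0.
trace = [
    (0, "update", "X", (1, 0)),
    (1, "read", "X", (0, 1)),
    (1, "read", "X", (1, 2)),
]
print(find_access_anomalies(trace))
```

Only the first read is reported: its clock is incomparable with that of the update, whereas the second read is ordered after it.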

The first part of the work considers dynamic detection of access anomalies. We introduce a new technique called task recycling that detects access anomalies "on the fly" by monitoring the program execution. This technique is designed with two goals in mind. The first goal is minimal monitoring overhead. Costs are incurred only at thread create, terminate, and coordinate operations and every time a monitored variable is accessed. Because variable accesses are generally the most frequent operation, the task recycling technique reduces the overhead per variable access to a small constant. The second goal is generality. The task recycling technique is applicable to a wide variety of parallel constructs and all common synchronous and asynchronous coordination primitives. Combined with a protocol for specifying ordering constraints, the method of representing concurrency relationships in task recycling can be extended to detect general race conditions in parallel programs.

The second part of the thesis involves static detection of several types of nondeterminism that make dynamic anomaly detection inefficient. In particular, the notion of nondeterminism arising from critical section coordination is refined by distinguishing between three types of nondeterminism: parallel, sequential, and reference nondeterminism. The presence of these types of nondeterminism in a program impacts access anomaly detection in two significant ways: (i) how critical section coordination is modeled during anomaly detection, and (ii) the confidence level and complexity of guaranteeing that a program has no access anomalies. In particular, it is shown that access anomalies can be detected efficiently only if a program is parallel, sequential and reference deterministic. Heuristics are presented that make access anomaly detection tractable in the presence of other nondeterminism through a better classification and semantic understanding of a coordination protocol.
Ph.D. Thesis 1990 Program transformation for efficient derivation of multiple solutions in concurrent logic languages Markantonatos, Nikolaos Abstract | PDF

Title: Program transformation for efficient derivation of multiple solutions in concurrent logic languages

Candidate: Markantonatos, Nikolaos

Advisor(s): Harrison, Malcolm C.

Abstract:

Concurrent logic languages provide a flexible and powerful vehicle for expressing parallel programs using explicit processes. However, their drastic departure from conventional logic programming with respect to completeness renders them unsuitable for a variety of useful applications involving search. A multiple solution extension to concurrent logic languages appears to successfully obtain the effect of backtracking in a parallel environment, but has been impeded by inefficiency problems. Moreover, the multiple solution subset introduces a new language which is incoherent with the single solution base language. We propose a multiple solution subset definition that adheres to the base language both syntactically and semantically. Subsequently, we advocate a source-to-source transformational approach for the efficient implementation of the subset. Multiple solution programs are converted at compile-time into equivalent single solution programs that derive all possible solutions into a single list. Alternative solutions are obtained in an eager or lazy fashion as specified by the program. A number of multiple solution program classes that are transformable into efficient single solution programs are identified and the corresponding transformation procedures are presented and further illustrated using a variety of examples. The techniques employed for the various transformations include partial evaluation, abstract interpretation, continuation-based transformation, layered stream transformation and loop fusion. As a result of such a static transformational methodology, a broad range of multiple solution programs enjoy efficient execution. We believe that our approach forms a definite step towards an efficient multiple solution subset for concurrent logic languages.
Ph.D. Thesis 1990 Data structures and algorithms for hierarchical memory machines Mirza, Mirza G. R. Abstract | PDF

Title: Data structures and algorithms for hierarchical memory machines

Candidate: Mirza, Mirza G. R.

Advisor(s): Siegel, Alan

Abstract:

This thesis analyzes the influence of hierarchical memory in models of practical computation. While hierarchical memory is the standard in real computing systems, the most common models of computation, Random Access Memory Machines and Turing Machines, do not reflect this form of memory. Our main contributions are: (1) Models of computation that have memory hierarchy, and which provide a rich structure for the complexity analysis of real computational problems. (2) Optimal bounds for problems such as sorting, with respect to both space and time, for a variety of memory access costs. (3) Related bounds for other problems, including constrained multitape merging and the implementation of Priority Queues and B-Trees. (4) The introduction of multiprogramming and multiprocessing concepts for these models, and an analysis of their relative computational power.
Ph.D. Thesis 1990 Design and implementation of HyTeK: A knowledge-based hypertext system Perez-Carballo, Jose F. Abstract | PDF

Title: Design and implementation of HyTeK: A knowledge-based hypertext system

Candidate: Perez-Carballo, Jose F.

Advisor(s): Strzalkowski, Tomek; Shasha, Dennis

Abstract:

A Hypertext system is a text data base where the units of information are interlinked using pointers that the user can follow. We call the pointers explicit links (as opposed to computed or virtual links.) HyTeK provides a set of tools designed to help the user explore the information contained in the system. The information contained in the system is represented using at least one of the three following methods: fragments of full text, explicit links between fragments and a collection of frame-like objects organized in a taxonomy. Explicit links are used to represent discourse relationships between fragments of text. The frame-like objects, called Topics, represent concepts in the domain of the text contained in the fragments. Topics are used to index the fragments for retrieval. The taxonomy of Topics represents some of the relationships between fragments that a traditional Hypertext System would represent using explicit links. HyTeK's query system uses the taxonomy of Topics in order to implement tools that allow the user to retrieve fragments selectively by their contents. A user queries the system by building a set of Topics in an interactive process of reformulation. Query reformulation is supported by a set of tools that allow the user to explore the space of Topics. The relationships between the Topics are used to define a similarity measure which is used to rank the target set of the query. This work describes an automatic indexing scheme, a query system and an extension of the Knowledge Representation (KR) system NIKL (KLONE) that was used in HyTeK to implement the taxonomy of Topics. A prototype of HyTeK was implemented in Common-Lisp in a Symbolics 3645 running Genera 7.2. The system has been extensively tested on several test collections of a total of 1000 fragments of text about AIDS treatments. 
The results indicate clear advantages over traditional Information Retrieval systems and suggest that the use of a KR system for the implementation of a query module for a Hypertext System is promising.
Ph.D. Thesis 1990 On a generalization of Herbrand's theorem Policriti, Alberto Abstract | PDF

Title: On a generalization of Herbrand's theorem

Candidate: Policriti, Alberto

Advisor(s): Davis, Martin D.

Abstract:

In this thesis we prove a generalized version of Herbrand's theorem. Our result guarantees the existence of a semi-decision procedure a la Herbrand for testing unsatisfiability with respect to a given theory T, in which the decision procedure used at the ground level depends upon T. This is opposed to the classical case, in which the procedure used at the ground level is simply a test for propositional satisfiability. The problem of finding suitable analogues for the general case of the exhaustive search procedures is also tackled, and one such generalization is proposed. The underlying motivation for this study was to find theoretical results that could provide the basis for a set-theoretic proof checker. Thus, the case of set theory is considered in more detail. In particular, decidability and undecidability results for classes of set-theoretic, purely universal formulae are proved.
Ph.D. Thesis 1990 Space-variant computer vision with a complex-logarithmic sensor geometry Rojer, Alan S. Abstract | PDF

Title: Space-variant computer vision with a complex-logarithmic sensor geometry

Candidate: Rojer, Alan S.

Advisor(s): Schwartz, Eric

Abstract:

The complex logarithm as a conformal mapping has drawn interest as a sensor architecture for computer vision due to its pseudo-invariance with respect to rotation and scaling, its high ratio of field width to resolution for a given number of pixels, and its utilization in biological vision as the topographic mapping from the retina to primary visual cortex. This thesis extends the computer vision applications of the complex-logarithmic geometry. Sensor design is based on the complex log mapping w = log (z + a), with real a > 0, which smoothly removes the singularity in the log at the origin. Previous applications of the complex-logarithmic geometry to computer vision, graphics and sensory neuroscience are surveyed. A quantitative analysis of the space complexity of a complex-logarithmic sensor as a function of map geometry, field width and angular resolution is presented. The computer-graphic problems of warping uniform scenes according to the complex logarithm and inversion of log-mapping scenes to recover the original uniform scene are considered, as is the problem of blending the resulting inverse log maps to reconstruct the original (uniform) scene. A series of simple algorithms for segmentation of log scenes by contour completion and region filling are presented. A heuristic algorithm for figure/ground segmentation using the log geometry is also shown. The problem of fixation-point selection (visual attention) is considered. Random selection of fixation points, inhibition around previous fixations, spatial and temporal derivatives in the sensor periphery, and regions found by segmentation are all examined as heuristic attentional algorithms. For the special case where targets can be parametrically defined, a theory of model-based attention based on the Hough transform is introduced. A priori knowledge about the consistency between potential objects in the scene and measured features in the scene is used to select fixation points.
The exponential storage requirements of the usual Hough transform are avoided.
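
The mapping w = log(z + a) and its inverse can be sketched in a few lines. This is a minimal illustration of the formula quoted in the abstract, assuming image-plane points are represented as Python complex numbers; the function names are hypothetical and no sensor layout from the thesis is reproduced.

```python
import cmath

def log_map(z, a):
    """Forward mapping w = log(z + a), with real a > 0."""
    if a <= 0:
        raise ValueError("a must be a positive real number")
    return cmath.log(z + a)

def inverse_log_map(w, a):
    """Inverse mapping z = exp(w) - a."""
    return cmath.exp(w) - a

a = 0.5                      # hypothetical map parameter
z = 3 + 4j                   # hypothetical image-plane point
w = log_map(z, a)
print(w, inverse_log_map(w, a))
print(log_map(0, a))         # the origin maps to log(a), which is finite
```

Because a > 0, the origin maps to the finite value log(a), whereas the plain logarithm would be singular at z = 0.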
Ph.D. Thesis 1990 SAGE: A real-time operating system for robotic supervisory control Salkind, Louis K. Abstract | PDF

Title: SAGE: A real-time operating system for robotic supervisory control

Candidate: Salkind, Louis K.

Advisor(s): Mishra, Bud

Abstract:

The next generation of robotic applications--computer integrated manufacturing, teleoperation, and mobile autonomous robots--will require far more computer systems support than currently available. In particular, real-time supervisory control systems will be needed to integrate an increasing number of sensors and actuators, as well as to communicate with other computers in a distributed environment. This thesis describes the design and implementation of SAGE, an operating system built specifically for real-time robotic supervisory control. The SAGE kernel runs on off-the-shelf Motorola 68020 processor boards, and features lightweight processes, virtual memory support, extensible low-overhead synchronization primitives, and real-time communications capabilities. Because SAGE is one of the first systems built for robotic supervisory control, the thesis focuses on the issues and design tradeoffs that arise in building a supervisory control operating system. The thesis also describes how SAGE was used to control a number of intelligent devices, including a Utah/MIT hand and a PUMA robot arm. The robotic experiments performed demonstrate that the operating system can be used in real-time supervisory control applications.
Ph.D. Thesis 1990 The APRAM: A model for asynchronous parallel computation Zajicek, Ofer Abstract | PDF

Title: The APRAM: A model for asynchronous parallel computation

Candidate: Zajicek, Ofer

Advisor(s): Cole, Richard

Abstract:

It is becoming increasingly clear that parallel computers will play a significant role in the area of computer science and its applications. In order to develop parallel machines and in order to be able to take advantage of them as they become available it is important to understand the issues underlying parallel computation. This thesis investigates one such issue, the synchronization costs of shared memory parallel computation. It defines the APRAM model, an asynchronous variation of the PRAM model, and analyzes a number of fundamental algorithms in this model; it uses three different complexity measures. The first part of the thesis defines the rounds complexity. It describes the complexity of an algorithm as a function of the slowest process. It is used to measure the explicit costs of synchronization: the cost of executing extra code in order to achieve synchronization. Three algorithms are analyzed under this complexity measure: a tree based summation algorithm; a list based recursive doubling algorithm; and an algorithm for computing the connected components of an undirected graph. In all three cases it is shown that global synchronization can be replaced by local synchronization thereby reducing the explicit costs of synchronization. The connectivity algorithm is significantly more substantial than the other two. We avoid the need to synchronize the processes, thereby obtaining an algorithm whose behavior appears somewhat chaotic. Due to its apparently chaotic nature and the unpredictability of the asynchronous environment, its analysis is quite challenging. In an asynchronous environment processes may proceed at different speeds. In the second part of the thesis we model the non-uniformity of the environment by defining the speeds of the processes to be random variables with a known probability distribution. 
We then quantify conditions under which asynchronous execution may have a significant advantage over a lock step execution, even if the explicit costs of a lock step execution are ignored. Both the summation algorithm and the recursive doubling algorithm are analyzed using two different probability distributions. In addition, we quantify conditions under which the list based recursive doubling algorithm is significantly faster than the tree based summation algorithm.
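
For readers unfamiliar with list-based recursive doubling, the following sketch simulates it in lock step, one synchronous round at a time. It is a sequential illustration under that assumption only; it does not model the asynchronous APRAM setting or the complexity measures analyzed in the thesis, and the function name is hypothetical.

```python
def recursive_doubling_suffix_sums(values, next_index):
    """Lock-step simulation of recursive doubling (pointer jumping) on a linked list.

    next_index[i] is the successor of node i, or None for the last node.
    Returns (sums, rounds), where sums[i] is the total of the values from
    node i to the end of the list.
    """
    sums = list(values)
    nxt = list(next_index)
    rounds = 0
    while any(p is not None for p in nxt):
        # Compute the whole round from the old state, as a synchronous step would.
        new_sums = [s if p is None else s + sums[p] for s, p in zip(sums, nxt)]
        new_nxt = [None if p is None else nxt[p] for p in nxt]
        sums, nxt = new_sums, new_nxt
        rounds += 1
    return sums, rounds

# Hypothetical list 0 -> 1 -> 2 -> 3 -> 4 holding the values 1..5.
print(recursive_doubling_suffix_sums([1, 2, 3, 4, 5], [1, 2, 3, 4, None]))
```

Each round doubles the distance spanned by every pointer, so a list of five nodes finishes after three rounds.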
Ph.D. Thesis 1989 Combinatorial and algorithmic analysis of space decomposition problems Aronov, Boris Abstract | PDF

Title: Combinatorial and algorithmic analysis of space decomposition problems

Candidate: Aronov, Boris

Advisor(s): Sharir, Micha

Abstract:

The first part of the thesis studies geodesic Voronoi diagrams. The closest-site (respectively, furthest-site) Voronoi diagram of a finite set of sites in Euclidean space is a classical geometric structure, which partitions the space into a set of Voronoi cells, each associated with a site, so that any point in the cell of site s is closer to s (resp. further from s) than to any other site. The structure of such diagrams for point sites in the plane has been completely characterized and well-known efficient algorithms exist for computing them. Voronoi diagrams have been generalized by replacing the Euclidean distance by a more general metric and/or relaxing the assumption that sites be single points. We consider the closest- and the furthest-site Voronoi diagrams for a set of k point sites in a simple n-gon, defined by the internal geodesic distance inside the polygon. We demonstrate that the planar map defined by either diagram is comprised of O(n + k) features of bounded complexity each and describe nearly optimal algorithms for constructing the two Voronoi diagrams. Namely, the closest-site geodesic Voronoi diagram can be computed in time O((n + k)log(n + k)log n), while O((n + k)log(n + k)) time is sufficient for the furthest-site diagram. The second part of the thesis analyzes the structure of an arrangement of flat triangles in 3-space. The combined combinatorial complexity of all non-convex cells (i.e., non-convex components of the complement of the union of the triangles), maximized over all arrangements of n triangles is shown to be roughly O($n^{7/3}$), improving the best previously known upper bound of O($n^{3 - 1/49}$) for a smaller quantity--the maximum combinatorial complexity of a single cell.
Our result has applications to algorithmic motion planning, stemming from the well-known technique that transforms a polyhedral body translating in a polyhedral environment into a collection of convex polygonal plates in three-dimensional space; the set of placements of the body reachable from a starting configuration along a collision-free path corresponds to a cell in the arrangement of these plates. Thus analyzing the maximum combinatorial complexity of a single cell and obtaining a comparably efficient algorithm for its calculation constitutes a satisfactory solution to the translational motion planning just mentioned. To this end, we also consider the problem of computing a single cell or a subset of cells in a three-dimensional arrangement of triangles, providing a nearly worst-case optimal randomized algorithm for solving the former problem and a less efficient procedure for the latter. In addition, we examine a few special classes of arrangements for which better estimates on the maximum single-cell complexity can be deduced and where computing a cell or any collection of cells appears easier.
Ph.D. Thesis 1989 Data communication in robot control systems Clark, Dayton R., Jr. Abstract | PDF

Title: Data communication in robot control systems

Candidate: Clark, Dayton R., Jr.

Advisor(s): Mishra, Bud

Abstract:

Robots and robot controllers are becoming more sophisticated. Consequently, the demands on the controller's operating system are increasing. The lower levels of robot control systems (indeed, most real-time control systems) are characterized by servo loops. This thesis examines servo loops and how they affect data communications within robot control systems. In the two systems described in this thesis the special characteristics of servo loops are exploited to enhance the data communications. HIC is an operating system for hierarchies of servo loops. It uses rate monotonic scheduling for the periodic servo loop processes. HIC events (or processes) which are used to implement servo loops are not allowed to block. They will only surrender the processor upon completion or when preempted by a higher priority process. A non-blocking communication structure, Periodic Data Buffers (PDB's), was developed for inter-process communication. HIC has been implemented and is used successfully in a controller for the Utah/MIT hand. GANGLIA is a proposed real-time communication network. It is intended to allow the processors in a robot controller to be distributed within the robot. Thus the processors can be close to the sensors and actuators they control. Much of the traffic on such a network would be periodic. GANGLIA uses a central controller which allocates access to the network. For the periodic traffic a fixed schedule, produced off-line, is used. For the aperiodic traffic round-robin polling is used. Unlike most protocols, messages do not contain the address of the destination node. Instead, each message is labeled with the name of its contents. Each node examines each message and decides whether or not it is interested in the message. A special communication controller in each node (the Communication Memory Management Unit) examines and selects the messages. The result of this protocol is a network-wide common memory.
In this thesis, the G scANGLIA protocol is described in detail and some preliminary analysis of its effectiveness in some real robot systems is given.
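The rate monotonic scheduling mentioned in the abstract above can be illustrated with a short sketch. It is not taken from the thesis: the task names, periods, and execution times are hypothetical, and the check is the classical Liu and Layland utilization bound, which assumes independent, preemptible periodic tasks whose deadlines equal their periods. The bound is sufficient but not necessary, so a task set that fails it may still be schedulable.

```python
def rate_monotonic_order(tasks):
    """Return task names from highest to lowest priority.

    tasks: list of (name, period, worst_case_execution_time) tuples.
    Rate monotonic scheduling gives the shortest period the highest priority.
    """
    return [name for name, period, _ in sorted(tasks, key=lambda t: t[1])]


def utilization(tasks):
    """Total processor utilization: sum of execution time / period."""
    return sum(wcet / period for _, period, wcet in tasks)


def liu_layland_bound(n):
    """Utilization bound n * (2**(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1.0 / n) - 1)


def passes_utilization_test(tasks):
    """Sufficient (not necessary) schedulability test."""
    return utilization(tasks) <= liu_layland_bound(len(tasks))


# Hypothetical servo-loop hierarchy (times in milliseconds), for illustration only.
SERVO_TASKS = [
    ("trajectory_planner", 20.0, 4.0),
    ("joint_servo", 1.0, 0.2),
    ("wrist_servo", 5.0, 1.0),
]
```

Calling `rate_monotonic_order(SERVO_TASKS)` ranks the fastest loop first, and `passes_utilization_test(SERVO_TASKS)` compares the total utilization of the set against the bound for three tasks.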
Ph.D. Thesis 1989 On-line motion planning Cox, James L. Abstract | PDF

Title: On-line motion planning

Candidate: Cox, James L.

Advisor(s): Yap, Chee

Abstract:

In this thesis we investigate online, or exploratory, motion planning. We develop algorithms for planning the motion of a planar rod (or ladder) and a three-link planar arm moving amidst an environment containing obstacles bounded by simple, closed polygons. The exact shape, number and location of the obstacles are assumed unknown to the planning algorithm, which can only obtain information about the obstacles by detecting points of contact with them. The ability to detect contact with obstacles is formalized by move primitives that we call guarded moves. We call ours the online motion planning problem, as opposed to the usual offline version. This is a significant departure from the usual setting for motion planning problems, in which the algorithm is given an explicit description of the scene as part of its input. What we demonstrate is that the retraction method can be applied, although new issues arise that have no counterparts in the usual setting. For the rod we are able to obtain an algorithm with path complexity of $O(m) = O(n^2)$ guarded moves (where $n$ is the number of obstacle walls, and $m$ the number of pairs of obstacle walls and corners of distance less than or equal to the length of the ladder) that matches the known lower bound (Ork85). This lower bound holds for both the online and offline (where the environment is explicitly given) versions of the problem. The computational complexity of the algorithm, $O(m \log n)$, matches that of the best known algorithm (SfS) for the offline version. For the arm we are able to obtain an algorithm with path complexity that is $O(m) = O(n^3)$, where $n$ is the number of obstacle walls and $m$ is the number of pairs of obstacle features that the linkage can simultaneously contact. The computational complexity is $O(n^3 \log n)$. Also, our constraint-based approach can be extended to obtain algorithms for $k > 3$ link arms that are polynomial for each $k$. 
That is, if $k$ is fixed the complexity is proportional to $n^k$.
Ph.D. Thesis 1989 Quantitative analysis of problems in computer algebra: Grobner bases and the Nullstellensatz Dube, Thomas William Abstract | PDF

Title: Quantitative analysis of problems in computer algebra: Grobner bases and the Nullstellensatz

Candidate: Dube, Thomas William

Advisor(s): Yap, Chee

Abstract:

This thesis presents new quantitative results concerning multi-variate polynomial ideals. Since these ideals are the basic objects of (computational) algebraic geometry, these results have important ramifications in algebraic algorithms, particularly in the solving of simultaneous equations. Furthermore, all the new theorems are proven using only constructive techniques and basic algebra. In many cases, the proofs provide algorithms for constructing the objects which the theorems describe. Among the results assembled here, three are of particular importance. The first shows that every ideal and residue class ring can be decomposed into simple pieces called cones. Next, the cone decomposition is used to produce a new upper bound on the degree of polynomials which appear in a reduced Grobner basis. Finally, a new tight upper bound for the exponent in Hilbert's Nullstellensatz is demonstrated.
Ph.D. Thesis 1989 SMARTS--Shared-memory Multiprocessor Ada Run Time Supervisor Flynn-Hummel, Susan Frances Abstract | PDF

Title: SMARTS--Shared-memory Multiprocessor Ada Run Time Supervisor

Candidate: Flynn-Hummel, Susan Frances

Advisor(s): Schonberg, Edmond

Abstract:

The programming language Ada is primarily intended for the construction of large scale and real time systems. Although the tasking model of Ada was aimed mainly at embedded systems, its rich set of synchronization operators together with its support for programming in the large, make Ada increasingly attractive for writing inherently parallel, computationally intensive, numeric and symbolic applications. Highly parallel shared-memory MIMD machines such as the NYU Ultracomputer have traditionally been regarded as suitable for large-scale scientific code, and not for more symbolic or heterogeneous concurrent applications such as are found in Artificial Intelligence or real-time programming. However, these applications would benefit greatly from (and even require) the computational power provided by highly parallel machines. It is therefore desirable to develop Ada implementations for highly parallel machines. The concern has been that the cost of managing large numbers of Ada tasks will negate the speedup obtained from their parallel execution. Indeed, a run-time supervisor for Ada must contend with many potentially expensive serialization points, that is to say, constructs that may take time proportional to the number of tasks involved. In this thesis we show that a run-time supervisor for an implementation of Ada on highly parallel machines can be written which is free of costly serialization points. The run-time supervisor SMARTS (Shared-memory Multiprocessor Ada Run Time Supervisor) depends on the hardware synchronization primitive $fetch\&\Phi$, and supports the tasking features of Ada in a highly parallel manner. We further reduce the overhead of Ada tasking, by means of micro-tasking, i.e. the explicit scheduling of a family of Ada tasks on a specified number of processors. Thus, Ada tasks are implemented as light weight processes managed by SMARTS, rather than full blown operating systems processes. 
Finally, SMARTS implements Ada shared variables efficiently by means of relay sets. Relay sets not only provide a means for identifying and resolving references to shared variables, but also facilitate the implementation of the Ada rendezvous mechanism as a remote procedure call.
Ph.D. Thesis 1989 A computational treatment of the comparative Friedman, Carol Abstract | PDF

Title: A computational treatment of the comparative

Candidate: Friedman, Carol

Advisor(s): Grishman, Ralph

Abstract:

This thesis develops a computational treatment of the comparative in English that is general, efficient, and relatively easy to implement, while not unduly complicating the natural language processing system. Implementation was accomplished using the Proteus Question Answering System, which translates natural language questions into database queries. The comparative is a particularly difficult language structure to process, and presently only a few natural language systems handle it in limited ways. However, the comparative is an essential component of language that frequently occurs in discourse. The comparative is difficult to process because it corresponds to an amazingly diverse range of syntactic forms such as coordinate and subordinate conjunctions and relative clauses which are also very complex and often contain missing elements. Semantically, the comparative is cross-categorical: adjectives, quantifiers, and adverbs can have the comparative feature. The semantics of the comparative has to be consistent with that of different linguistic categories while retaining its own unique characteristics. The computational approach of this thesis is based on a language model which contains functionally independent syntactic, semantic, and pragmatic components. Although the comparative relates to all the components, the syntactic component is the one that is mainly affected. The syntactic stage of processing analyzes and regularizes the comparative structures. The analysis process utilizes existing mechanisms that handle structures similar to the comparative. The regularization process transforms all the different comparative structures into one standard form consisting of a comparative operator and two complete clauses. This process consists of two phases: the first uses a compositional approach based on Montague-style translation rules. 
The subsequent phase uses specialized procedures to complete the regularization process by expanding the comparative, filling in missing elements, and providing the appropriate quantified terms associated with the comparated elements. After the comparative is regularized, the remaining stages of processing are hardly affected. Each clause of the comparative is processed using the same procedures as usual, and only minor modifications are required specifically for the comparative.
Ph.D. Thesis 1989 Verification of three-dimensional model parameters from two-dimensional image data Goldberg, Robert Raphael Abstract | PDF

Title: Verification of three-dimensional model parameters from two-dimensional image data

Candidate: Goldberg, Robert Raphael

Advisor(s): Lowe, David

Abstract:

A unified approach is presented for instantiating model and camera parameters in the verification process of visual recognition. Recognition implies the generation of a hypothesis, a map between projected model data and image data. An important part of the remaining problem is the instantiation of model and camera parameters to verify the hypothesis. We present this camera pose determination as a non-linear least squares problem, with functions minimizing the distance between the projected model and image data. This approach treats camera and model parameters in the same way, simplifying the camera/sensor calibration problem. Coordinate trees with null components, an original data structure, model the objects in the image. This allows the calculation of analytical partial derivatives (with respect to the parameters of model and camera). We discuss objective model functions that best suit general applications. The incorporation of various numeric techniques is analyzed, with tables displaying convergence results for various models and parameters. Good convergence results are obtained and this method can be integrated into general vision applications. No depth information is required, and the algorithms also hold for noisy images, adding much robustness to our techniques. A natural extension of these techniques is to instantiate the parameters of generally constrained models.
Ph.D. Thesis 1989 Topics in algebraic computing: Subresultants, GCD, factoring and primary ideal decomposition Ho, Chung-Jen Abstract | PDF

Title: Topics in algebraic computing: Subresultants, GCD, factoring and primary ideal decomposition

Candidate: Ho, Chung-Jen

Advisor(s): Yap, Chee

Abstract:

Our goal is to present an algorithm for computing a primary decomposition of a zero-dimensional ideal. We compute the decomposition of the radical ideal of the zero-dimensional ideal and lift it to a primary decomposition. The algorithm for decomposing radicals simply uses Kronecker's method of elimination and GCD and factoring algorithms. Kronecker's method of elimination and GCD computations are related to resultant systems and subresultants. Thus, we first investigate the theory of subresultants. We expound the theory of subresultants along the lines suggested by Loos. However, there were some major oversights in Loos's proof of the Subresultant Theorem. We point out where exactly Loos's proof fails and give a correct version of proofs. Then, we define the Sylvester matrix of many polynomials and explore the properties of the Sylvester matrix. By these properties, we derive fast parallel algorithms for computing the GCD of many polynomials. Our algorithms have better processor bound than Von zur Gathen's algorithm. Moreover, one of the algorithms uses no divisions. The factoring algorithm deals with factoring polynomials over multiple algebraic extensions of rational number field. We present an algorithm to find an integer $D$ such that the defect of an integral basis for a multiple extension of Q divides $D$. Though there is a naive algorithm to find a $D$ by translating a multiple extension to a simple extension, our algorithm has much better time and space bound than the naive algorithm. With this result, we can directly factor polynomials without translating a multiple extension to a simple extension. Finally, we improve Kronecker's method of elimination; and then, by applying the GCD and factoring algorithms on the resultant systems generated by Kronecker's method of elimination, we obtain a tree representation of all the associated prime ideals belonging to the zero-dimensional ideal.
Ph.D. Thesis 1989 Object recognition by geometric hashing Lamdan, Yehezkel Abstract | PDF

Title: Object recognition by geometric hashing

Candidate: Lamdan, Yehezkel

Advisor(s): Schwartz, Jacob T.; Wolfson, Haim J.

Abstract:

This thesis proposes a general and efficient model-based object recognition scheme. The scheme addresses the problem of identifying instances of model objects in single images. The model objects are two or three dimensional, and their instances in the scene might be overlapping and partially occluded by other unknown objects. The camera viewpoint is unknown and assumed to be arbitrary. The images can be two dimensional intensity images or three dimensional range images. The scheme deals uniformly with all feasible imaging transformations, from the simplest case of pure translation to the most complex case of the perspective transformation. The proposed method is based on geometric hashing. It hypothesizes model-to-scene transformations based on corresponding model and scene feature subsets. These subsets have the minimal cardinality that still allows recovery of the imaging transformation for a given transformation type. In order to prune the search space of all model and scene feature subset pairs, a hashing scheme is used. It is based on geometrical relations among the object features, which are invariant under the given transformation type. The recognition algorithm has two major steps. First, a hash-table, encoding the geometrical invariants of the model features, is prepared. This stage is independent of the scenes to be later processed, and can be executed off-line. In the second stage, an efficient matching algorithm is performed, which utilizes the previously prepared hash-table. The efficacy of the recognition is achieved by considering only those model and scene subsets that are 'similar' under the given transformation type. The algorithm was tested in 'real-life' situations for the important cases of recognizing flat and solid objects in the 3D world, using the weak perspective approximation to the perspective transformation.
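The two-stage structure described in this abstract (an off-line hash table of invariants, then voting at recognition time) can be sketched for one simple case. The sketch below is an illustrative assumption, not the thesis's implementation: it handles only 2-D point features under similarity transformations (translation, rotation, uniform scale), represents points as complex numbers, uses an ordered pair of points as the basis, and quantizes coordinates with a hypothetical bin size `step`.

```python
from collections import defaultdict
from itertools import permutations


def _invariant_key(p0, p1, q, step):
    """Coordinates of q in the frame where p0 -> 0 and p1 -> 1.

    With points stored as complex numbers, (q - p0) / (p1 - p0) is unchanged
    by any translation, rotation, and uniform scaling of the three points.
    """
    z = (q - p0) / (p1 - p0)
    return (round(z.real / step), round(z.imag / step))


def build_table(models, step=0.1):
    """Off-line stage: hash every model point relative to every basis pair.

    models: dict mapping a model name to a list of distinct complex points.
    """
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for k, q in enumerate(pts):
                if k not in (i, j):
                    table[_invariant_key(pts[i], pts[j], q, step)].append((name, i, j))
    return table


def recognize(table, scene, step=0.1):
    """Recognition stage: try each scene basis pair and vote for (model, basis).

    Returns (best_model_name, votes); the name is None if nothing matched.
    """
    best_name, best_votes = None, 0
    for i, j in permutations(range(len(scene)), 2):
        votes = defaultdict(int)
        for k, q in enumerate(scene):
            if k not in (i, j):
                key = _invariant_key(scene[i], scene[j], q, step)
                for entry in table.get(key, ()):
                    votes[entry] += 1
        for (name, _, _), count in votes.items():
            if count > best_votes:
                best_name, best_votes = name, count
    return best_name, best_votes
```

A scene built by rotating, scaling, and translating one model's points, with an extra clutter point added, should collect the most votes for that model. A fuller system would also verify the winning hypothesis against the image and handle points that fall near bin boundaries.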
Ph.D. Thesis 1989 Mapping algorithms on regular parallel architectures Lee, PeiZong Abstract | PDF

Title: Mapping algorithms on regular parallel architectures

Candidate: Lee, PeiZong

Advisor(s): Kedem, Zvi

Abstract:

It is significant that many time-intensive scientific algorithms are formulated as nested loops, which are inherently regularly structured. In this dissertation the relations between the mathematical structure of nested loop algorithms and the architectural capabilities required for their parallel execution are studied. The architectural model considered in depth is that of an arbitrary dimensional systolic array. The mathematical structure of the algorithm is characterized by classifying its data-dependence vectors according to the new ZERO-ONE-INFINITE property introduced. Using this classification, the first complete set of necessary and sufficient conditions for correct transformation of a nested loop algorithm onto a given systolic array of an arbitrary dimension by means of linear mappings is derived. Practical methods to derive optimal or suboptimal systolic array implementations are also provided. The techniques developed are used constructively to develop families of implementations satisfying various optimization criteria and to design programmable arrays efficiently executing classes of algorithms. In addition, a Computer-Aided Design system running on SUN workstations has been implemented to help in the design. The methodology, which deals with general algorithms, is illustrated by synthesizing linear and planar systolic array algorithms for matrix multiplication, a reindexed Warshall-Floyd transitive closure algorithm, and the longest common subsequence algorithm.
Ph.D. Thesis 1989 Transformations for backtracking SETL programs Nathan, Albert Abstract | PDF

Title: Transformations for backtracking SETL programs

Candidate: Nathan, Albert

Advisor(s): Dewar, Robert

Abstract:

We study program transformations for a class of combinatorial search problems whose solutions are usually found by backtrack searching. High-level algorithms for such problems can be elegantly specified using SETL's backtracking primitives ok and fail, for which we give a more formal and precise semantic definition than the one which currently exists. Then we explore two types of transformations applicable to such specifications. First, we derive Finite Differencing transformations which reduce the amount of computation performed at each node of the search tree. Though the formal derivation of these transformations is somewhat lengthy, the net results are simple and easily understood. In the process of deriving the transformations, we also expose some difficulties encountered when applying Finite Differencing methods to programs which use ok/fail. Second, we propose two general transformations which reduce the size of the search tree generated by pruning subtrees which are guaranteed to fail. The first one is based on the idea of using knowledge accumulated during the search to guide the search, while the second one prunes subtrees which contain no paths of sufficient length needed to extend the current partial solution to a complete solution. For each filter, we describe its enabling conditions, give a high-level specification, and then formally derive an efficient implementation using Finite Differencing. Finally, we suggest suitable representations, based on SETL's Data Representation Sublanguage, for implementing the data structures used in our transformations. We demonstrate the effectiveness of all these transformations by programming some familiar backtrack-search problems and comparing the running times and number of nodes generated in the transformed versions against those of the original specification. 
We also show some papers from the literature in which some suggestion of these transformations does appear, but in which (in contrast to this work) no formal demonstration of their correctness or applicability to other problem domains is given.
Ph.D. Thesis 1989 Optimization and garbage collection in Ada programs on shared memory computers Operowsky, Howard Lawrence Abstract | PDF

Title: Optimization and garbage collection in Ada programs on shared memory computers

Candidate: Operowsky, Howard Lawrence

Advisor(s): Schonberg, Edmond

Abstract:

Compiler development for Ada is still in its infancy. Despite its goal of supporting embedded systems in an efficient manner, Ada programs still tend to be large and slow. In this thesis, we investigate three issues related to the efficient implementation of Ada programs: run-time representation of types and objects, reduction of run-time constraint checking, and parallel garbage collection on a shared memory multiprocessor. We present a collection of type templates for scalar and composite types which are storage-efficient and allow for efficient object code to be produced by the code generator. We present an algorithm for constructing these templates at run-time when constraint information is unavailable at compile-time. We show that a global optimizer is not required to reduce the overhead of constraint checking in Ada programs. We present a series of data-flow equations for available expressions and use them as the basis for a simple algorithm to eliminate redundant constraint checks. The algorithm is syntax-directed and is executed in a single pass over the source program's abstract syntax tree. No control flow analysis is required. Our algorithm also includes constant propagation using an extended framework and induction variable analysis. Because the algorithm operates on the abstract syntax tree, induction variable analysis is simplified. Although programs with goto statements are not considered, the exit statement is handled fully. We also examine the effects of shared variables and exception handling. No commercial compiler for Ada currently performs garbage collection. We examine the difficulties in garbage collection presented by Ada and present practical algorithms for Ada on shared memory multiprocessors. We extend Kung and Song's on-the-fly garbage collection algorithm to support multiple tasks on the NYU Ultracomputer/IBM RP3 computers. We prove that no additional synchronization is required because of Ada's rules on the use of shared variables.
Ph.D. Thesis 1989 Using relational discrete event systems and models for prediction of future behavior of databases Tuzhilin, Alexander Sergei Abstract | PDF

Title: Using relational discrete event systems and models for prediction of future behavior of databases

Candidate: Tuzhilin, Alexander Sergei

Advisor(s): Kedem, Zvi

Abstract:

The following prediction problem is studied in this dissertation: given a specification of the future behavior of a system and the current state of the system described with a relational database, predict what will happen to the system in the future. The behavior is defined in terms of Relational Discrete Event Systems (RDESes) and Models (RDEMs). An RDES is a set of possible non-deterministic trajectories of future states of a system. An RDEM is a finite formal description of a generally infinite RDES set. Various production system RDEMs and a recurrence equation RDEM are defined and formally compared in terms of expressive power in this dissertation. It is shown that one of the production system RDEMs is better than other considered RDEMs not only in terms of expressive power but in other respects as well. Also, the suitability of various control strategies to restrict non-determinism and improve system's performance is considered. In order to obtain predictions about possible future states of a database, Predictive Query Language (PQL) is defined with the syntax based on a predicate temporal logic and the semantics on RDEM models. It is shown how PQL is related to relational queries for Datalog and its extensions. Finally, the prototype of the Cassandra system is described. Cassandra supports PQL with the semantics based on a production system RDEM. An example of a small Flexible Manufacturing System is used throughout the dissertation to illustrate various points about the described methods.
Ph.D. Thesis 1989 Fuzzy disk modeling and rendering of textured complex three-dimensional surfaces of real objects Yang, Xue Dong Abstract | PDF

Title: Fuzzy disk modeling and rendering of textured complex three-dimensional surfaces of real objects

Candidate: Yang, Xue Dong

Advisor(s): Perlin, Ken; Schwartz, Jacob T.

Abstract:

Three-dimensional geometric modeling in computer graphics is concerned with the representation, specification, and manipulation of free-form curves, surfaces, and volumes. This research explores a model for constructing representations of complex three-dimensional surfaces of real-world objects, such as sculptures in a museum, from sample points acquired with a special 3-D camera, and for synthesizing computer-generated pictures from this model. The difficulty of this problem comes from the complexity of the surface characteristics of such objects, which involve complicated irregular shapes and rich textures. This thesis presents a new three-dimensional surface model, the three-dimensional fuzzy disk model, for computer graphics display. This model allows any curved surface to be approximated by a number of overlapping disks. A new blending method has been developed to generate smoothly curved surfaces from the overlapping disks. The shape of a blending surface can be controlled by varying some geometric parameters. This three-dimensional fuzzy disk representation is organized into a multi-resolution structure which allows adaptive refinement of surface details and supports a coarse-to-fine display process. A scan-line rendering algorithm has been developed to synthesize images from the new model. We also present a simpler, less accurate, but more efficient approximation to the original model. In addition, we present a fast shadow penumbra approximation algorithm capable of generating soft shadows.
Ph.D. Thesis 1989 The editing distance between trees: Algorithms and applications Zhang, KaiZhong Abstract | PDF

Title: The editing distance between trees: Algorithms and applications

Candidate: Zhang, KaiZhong

Advisor(s): Shasha, Dennis

Abstract:

Trees are a ubiquitous building block in computer science and related fields. Examples are grammar parses, image descriptions, secondary structures of RNA molecules, and many other phenomena. Comparing trees is therefore useful for comparing scenes, parses, and so on. This thesis presents algorithms for tree comparison and applications of those algorithms. We consider the distance between two labeled trees to be the weighted number of editing operations (insert, delete, and modify) needed to transform one tree into another. We show that for unordered trees this is an NP-complete problem. For ordered trees we present a simple, fast dynamic programming algorithm that is significantly better than the best previously published algorithms. We then show that our method provides a general technique for solving other related tree problems (e.g. approximate tree matching). We also present efficient parallel algorithms under the assumption that all costs are unit. One of our applications is to compare secondary structures of RNA molecules. We describe another application to vision that uses tree comparisons to compare shapes. We have also implemented some of the algorithms in the form of a tree comparison toolkit. The preliminary version of the toolkit has been used at the U.S. National Cancer Institute for the comparison of RNA secondary structures.
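The editing distance between ordered labeled trees, as defined in this abstract, can be made concrete with a small sketch. This is not the dynamic programming algorithm of the thesis: it is a plain memoized recursion over the rightmost roots of two forests, with unit costs assumed for insert, delete, and relabel, and with a hypothetical tree encoding `(label, (child, child, ...))`. It is meant only to show what is being computed, not to be efficient.

```python
from functools import lru_cache


def forest_size(forest):
    """Number of nodes in a forest, i.e. a tuple of (label, children) trees."""
    return sum(1 + forest_size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return forest_size(g)          # insert every node of g
    if not g:
        return forest_size(f)          # delete every node of f
    (label_f, kids_f), (label_g, kids_g) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + kids_f, g) + 1
    # Insert the rightmost root of g.
    insert = forest_distance(f, g[:-1] + kids_g) + 1
    # Map the two rightmost roots to each other, relabeling if they differ.
    match = (forest_distance(kids_f, kids_g)
             + forest_distance(f[:-1], g[:-1])
             + (label_f != label_g))
    return min(delete, insert, match)


def tree_distance(t1, t2):
    """Edit distance between two ordered labeled trees."""
    return forest_distance((t1,), (t2,))
```

For example, with `t1 = ("a", (("b", ()), ("c", ())))` and `t2 = ("a", (("b", ()),))`, the call `tree_distance(t1, t2)` counts the single deletion of node `c`. Non-unit costs could be supported by replacing the three `1`-valued terms with a cost function.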
Ph.D. Thesis 1988 Parallel algorithms for band SPD systems of linear equations Bar-On, Ilan Abstract | PDF

Title: Parallel algorithms for band SPD systems of linear equations

Candidate: Bar-On, Ilan

Advisor(s): Widlund, Olof

Abstract:

In this thesis we consider parallel algorithms for solving band symmetric positive definite systems of linear equations where the number of equations is much larger than the band width. Such systems arise in many practical applications for the dynamic analysis of structures such as the design of dams, bridges, ships, supersonic jets etc. Sequential methods for solving these systems require intolerable turnaround times and hence the importance of fast parallel algorithms for solving them. Our main contribution in this thesis is the presentation of a new practical parallel algorithm. Our algorithm runs in O(m log n) time using nm/log n processors where n is the number of equations and m the band width. Hence, the algorithm is efficient. For tridiagonal systems the algorithm runs in O(log n) time using n/log n processors. We also develop a theoretically faster algorithm that runs in O(log m log n) time using nm$^2$/(log m log n) processors. This algorithm is efficient and runs as fast as the best currently known theoretical method. In chapter one we introduce the basic principles of parallel computations. In chapter two we review the basic algebraic and numerical properties of matrix computations. Here, we present a new parallel efficient algorithm for adding n k-bit integers in O(log n + log k) time based on the Fibonacci sequence. In chapter three we consider parallel methods for solving band triangular systems which arise from the L-U decomposition of A. We conclude that this method is not as efficient for parallel computers as for sequential ones. In chapter four, we give a new efficient parallel algorithm for inverting an s.p.d. matrix in O(log$^2$ n) time. We then present our new parallel algorithm for solving band s.p.d. systems, analyse its complexity, and show its improvement over the odd-even reduction algorithm. We conclude by pointing to yet unresolved problems in this field.
Ph.D. Thesis 1988 ZLISP--a portable parallel LISP environment Dimitrovsky, Isaac Aaron Abstract | PDF

Title: ZLISP--a portable parallel LISP environment

Candidate: Dimitrovsky, Isaac Aaron

Advisor(s): Harrison, Malcolm C.

Abstract:

This thesis concerns ZLISP, a portable parallel LISP environment for shared memory MIMD supercomputers. ZLISP was created as a vehicle for experimenting with parallel symbolic computing on a variety of supercomputer designs. It is a small but reasonably powerful subset of COMMON LISP that includes arrays, strings, structures, most of COMMON LISP's control flow functions, and a native code compiler, among other features. A low-level, flexible set of parallel primitives is provided that can support a wide spectrum of parallel programming styles. ZLISP currently runs on the NYU Ultracomputer prototype. A version that simulates parallelism runs on VAX and SUN minicomputers. I begin this thesis by discussing ZLISP's design and implementation. I attempt to justify the more difficult design decisions made during the development of ZLISP. I also give some details on how the more unusual parts of ZLISP are implemented. The full ZLISP reference manual is included as an appendix. I then turn to some parallel algorithms of independent interest that were discovered during the development of ZLISP. Many of these algorithms use the faa (fetch-and-add) operation, a versatile low-level synchronization primitive that has been promoted by the NYU Ultracomputer group and incorporated in several other supercomputer designs. I first describe some of the parallel algorithms used to implement ZLISP. These include an algorithm for parallel garbage collection and an algorithm for efficiently using hash tables in a parallel garbage collected environment. Finally, I cover some parallel algorithms provided for use by ZLISP programmers. I define the group lock, a new synchronization primitive useful in writing asynchronous parallel algorithms, and give some examples of its use in such applications as parallel stacks, heaps, and databases. I also present an assortment of space efficient parallel data structures such as queues, multiqueues, and stacks.
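The fetch-and-add operation mentioned in this abstract atomically returns the old value of a shared cell and adds an increment to it. The sketch below is only a software illustration and is not taken from ZLISP: a lock stands in for the hardware primitive, and the class and function names are hypothetical. It shows one common use, letting many workers claim distinct slots of a shared array without any further coordination.

```python
import threading


class FetchAndAddCell:
    """Software stand-in for a fetch-and-add cell.

    Assumption: a lock emulates the atomicity that hardware would provide.
    """

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def faa(self, delta):
        """Atomically add delta and return the value held before the add."""
        with self._lock:
            old = self._value
            self._value += delta
            return old


def parallel_fill(n_workers, items_per_worker):
    """Each worker claims output slots with faa(1); no two workers share a slot."""
    next_slot = FetchAndAddCell(0)
    out = [None] * (n_workers * items_per_worker)

    def worker(worker_id):
        for k in range(items_per_worker):
            slot = next_slot.faa(1)      # unique index for this item
            out[slot] = (worker_id, k)

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Because every call to `faa(1)` returns a different index, the array returned by `parallel_fill(4, 50)` has no empty or overwritten slots, although the order in which workers' items appear depends on thread scheduling.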
Ph.D. Thesis 1988 Reasoning about shape and kinematic function in mechanical devices Joskowicz, Leo Abstract | PDF

Title: Reasoning about shape and kinematic function in mechanical devices

Candidate: Joskowicz, Leo

Advisor(s): Davis, Ernest

Abstract:

This thesis presents a general framework for reasoning about the relationship between the shape of a solid object and its kinematic function in a mechanical device. Such a framework is essential for numerous reasoning tasks concerning mechanical devices such as analysis, prediction of behavior, and design. We propose to use an intermediate representation that relates the geometry of objects to their kinematic function in a mechanism; this representation stems from the notion of configuration spaces, originally introduced for motion planning. We show that configuration spaces are an appropriate symbolic representation for reasoning about the kinematics of mechanical devices because the regions of the mechanism's configuration space can be interpreted as representing all the qualitatively different possible motions of its objects. Our theory supports both qualitative and causal reasoning. To describe kinematic behavior functionally, we begin by developing two functional languages: possible motions descriptions and causal descriptions. We then present a two-step analysis procedure that starts by deducing the behavior of all kinematic pairs and then composes these behaviors to obtain the overall behavior of the mechanism. For a subclass of mechanisms (fixed axes mechanisms), we show that a simplified version of the composition operation can be used to obtain the overall behavior, and we outline a constraint propagation, label inferencing algorithm to produce a region diagram. This diagram constitutes a total qualitative envisionment of the mechanism's reachable behaviors. Given a sequence of input motions and a region diagram, we indicate how to predict the behavior of the mechanism. In the second part of this thesis, we address the problem of designing the shape of physical objects defined by a set of functional requirements. In particular, we show how to design kinematic pairs from a description of their desired behavior. 
We provide a general heuristic algorithm for innovative shape design, and present a number of efficient algorithms for special design cases. We also show how to design kinematic pairs when a qualitative or incomplete description of the desired behavior is provided.
Ph.D. Thesis 1988 Use of three-dimensional curves in computer vision Kishon, Eyal Abstract | PDF

Title: Use of three-dimensional curves in computer vision

Candidate: Kishon, Eyal

Advisor(s): Schwartz, Jacob T.

Abstract:

The objective of this work is to study the use of 3-D curves in model-based object recognition. We approach the two main problems of object recognition, i.e., model formation and matching, in a unified way. We propose a framework in which 3-D curves are used to represent objects in a database of models, and then present algorithms that use these curves to perform efficient matching between an observed object and a previously prepared database of object models. The motivation for this work comes from the fact that 3-D curves can describe in a natural way the objects from which they were extracted. Moreover, the use of these curves in the matching process has proved to be highly accurate while at the same time very efficient. In this work we present algorithms to extract 3-D curves from a pair of range and intensity images, and then algorithms that classify and separate the different types of curves. We will also present two efficient algorithms for matching 3-D curves.
Ph.D. Thesis 1988 Simulation-based understanding of texts about equipment Ksiezyk, Tomasz Bartlomiej Abstract | PDF

Title: Simulation-based understanding of texts about equipment

Candidate: Ksiezyk, Tomasz Bartlomiej

Advisor(s): Grishman, Ralph

Abstract:

This thesis presents a natural language understanding system, operating in the domain of equipment consisting of mechanical, hydraulic, and electrical elements. The task of the system is to analyze reports regarding the failure, diagnosis and repair of equipment. We argue that a general knowledge of equipment is not sufficient for a full understanding of such reports. As an alternative, we propose a system which relies on a detailed simulation model to support language understanding. We describe the structure of the model and emphasize features specifically required for language understanding. We show how this model can be used in analyzing and determining the referents for complex noun phrases describing equipment parts. We outline the data structures used for concepts which are mentioned in the text but which have no permanent representation in the model, and explain how they are created during the text analysis. Similarly, we discuss the data structures for representing the facts conveyed by the text, and provide algorithms for translating text expressing facts into their representations. We point out the importance of identifying the implicit temporal and causal relations in the text and show how the simulation capabilities of the model support this task. We present a dynamic graphical interface which gives the user insight into the way the input has been understood by the system. Finally, we indicate how our system may be extended to facilitate dynamic (i.e. during the analysis of text) extensions to its data base, and to assist the user in entering new equipment models. Most aspects of the discussed system were implemented on a Symbolics Lisp machine.
Ph.D. Thesis 1988 Extensions to SETL to support problem specification and transformation of imperative programs Lewis, Henry Merriman Abstract | PDF

Title: Extensions to SETL to support problem specification and transformation of imperative programs

Candidate: Lewis, Henry Merriman

Advisor(s): Dewar, Robert

Abstract:

Programming by transformation is a reliable and efficient way to develop algorithms. An ideal methodology begins with high-level specifications of the problem to be solved. Such dictions are by nature concise, easy to understand, and easy to verify. They are free from the details that determine the method by which the solution is found, yet promote transformations leading to derivation of solutions. The user of the transformation system applies refinements and modifications that transform the problem specifications into algorithm specifications, and so is able to derive programs that solve the original problem. We propose extensions to the set-theoretic programming language SETL to support problem specifications. The resulting language realizes the ideals of problem specification, and further supports direct execution of the highest-level specifications as a search over a solution space. Its dictions are imperative at all levels of derivation, so as to provide consistency of style among all versions, from problems to programs. We show how dictions of the form find variables $\vert$ conditions serve to specify problems, and how transformation of the conditions promotes derivation of algorithms. We propose dictions that allow concise specification of problems that require minimization of a function, and a variant that allows specification of problems that are inherently non-rigorous, or whose solutions admit approximation or tolerance. We suggest transformations of expressions that lead to algorithms employing formal differentiation of expressions or dynamic programming. Through examples we show that the method of transformational programming constitutes a tool for the specification, derivation, and discovery of algorithms.
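As a rough illustration of what "direct execution of a specification as a search over a solution space" can mean, the sketch below exhaustively searches for values that satisfy a condition. It is written in Python, not SETL; the function name `find`, its parameters, and the example problem are assumptions made for illustration and do not reproduce the dictions proposed in the thesis.

```python
from itertools import product


def find(domains, condition):
    """Return the first assignment satisfying `condition`, or None.

    `domains` maps each variable name to an iterable of candidate values.
    This is a plain exhaustive search, the least refined way to "run" a
    specification; a derived algorithm would replace it with something faster.
    """
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        candidate = dict(zip(names, values))
        if condition(**candidate):
            return tuple(candidate[name] for name in names)
    return None


# Hypothetical problem: find x, y in 1..20 with x * y = 91 and x < y.
factors = find(
    {"x": range(1, 21), "y": range(1, 21)},
    lambda x, y: x * y == 91 and x < y,
)
```

The specification states only what a solution looks like; transforming the condition (for example, iterating over divisors only) is what would turn it into an efficient algorithm.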
Ph.D. Thesis 1988 Foundations of a logic of knowledge, action, and communication Morgenstern, Leora Abstract | PDF

Title: Foundations of a logic of knowledge, action, and communication

Candidate: Morgenstern, Leora

Advisor(s): Davis, Ernest

Abstract:

Most Artificial Intelligence planners work on the assumption that they have complete knowledge of their problem domain and situation, so that planning an action consists of searching for an action sequence that achieves some desired goal. In actual planning situations, agents rarely know enough to map out a detailed plan of action when they start out. Instead, they initially draw up a sketchy plan and fill in details as they proceed. This thesis presents a formalism that is expressive enough to describe this flexible planning process. We address ourselves to two central issues: (1) How can an agent determine that he knows enough to do an action? (Knowledge Preconditions Problem) (2) If the agent does not know enough, how can he plan to get the action done? (Ignorant Agent Problem) We demonstrate that modal logic is too weak to serve as the basis for such a theory, and choose instead to work within a first order logic augmented with quotation. We then discuss the Knower Paradoxes that arise from such syntactic treatments of knowledge, and propose a solution to these paradoxes based on Kripke's solution to the Liar Paradox. Next, we present a theory of action and planning that is powerful enough to describe partial plans and joint-effort plans. We then explain what knowledge an agent must have in order to successfully perform an action and how an ignorant agent can construct and execute complex plans in order to overcome his ignorance. A central observation underlying our solution to the Ignorant Agent Problem is that ignorant agents tend to use communicative acts, such as asking for information, and delegating, to plan around their ignorance. During the final part of this thesis, we therefore develop a theory of communication as an integrated part of our theory of action and planning. We show that this theory of communication is more expressive than standard Austinian-type speech act theories. 
The thesis includes comparisons of our theory with other syntactic and modal theories such as Konolige's and Moore's. We demonstrate that our theory is powerful enough to solve classes of problems that these theories cannot handle.
Ph.D. Thesis 1988 Taliere: An interactive system for data structuring SETL programs Straub, Robert Michael Abstract | PDF

Title: Taliere: An interactive system for data structuring SETL programs

Candidate: Straub, Robert Michael

Advisor(s): Schonberg, Edmond

Abstract:

This thesis describes a system designed to aid SETL programmers in the selection of data structures for the representation of program variables. The system uses information from the SETL optimizer, together with information provided interactively by the programmer, to select from the set and map representations which are available to implement SETL objects. We begin by describing previous work on data structure selection for very high level languages, including the data structure selection performed by the SETL optimizer. We then present a general description of a system for data structure selection for SETL programs. We describe techniques used to obtain useful information from a source program. This includes obtaining symbolic estimates of the execution frequencies of individual program operations, and estimates of the sizes of program objects. The data structures considered by the system are then described. We present a detailed description of the data structure selection algorithm, along with optimizations and heuristics used to improve the execution efficiency of the data structuring system. We conclude with examples comparing choices made by the system with choices made by a competent programmer and speculate on the eventual success of semi-automatic structuring systems.
Ph.D. Thesis 1988 Operating system data structures for shared memory MIMD machines with fetch-and-add Wilson, James M. Abstract | PDF

Title: Operating system data structures for shared memory MIMD machines with fetch-and-add

Candidate: Wilson, James M.

Advisor(s): Gottlieb, Allan

Abstract:

Ideally, procedures and data structures on a shared-memory MIMD machine should be serialization-free and concurrently accessible to avoid (potential) performance-limiting bottlenecks. The fetch-and-add coordination primitive, in conjunction with combining interconnection networks, has been proposed as a means for achieving this goal. The first is essentially an indivisible add-to-memory and the second combines simultaneous requests to the same memory location. In this thesis we address serialization-free memory and process management for a shared-memory MIMD machine with fetch-and-add and a combining network. To meet this goal we adopt a self-service paradigm for the operating system that permits each processing element (PE) to service its own requests (thereby avoiding central server bottlenecks). The success of this approach depends upon the use of concurrently accessible data structures to hold data shared among the PEs. We begin by reviewing existing fetch-and-add based queue and multiqueue (a compressed queue) implementations that support concurrent queue insertion and deletion. We then extend these implementations to include new operations (e.g., the removal of an interior queue item) and new data structure representations (e.g., linked lists). Parallel memory allocation algorithms, many based on the modified queue and multiqueue data structures, are then given. These algorithms include parallel analogs to a number of existing serial algorithms such as Knuth's boundary tag method and the binary buddy system. Next, we define a set of primitives that permit various task activities, such as creation and scheduling, to be done in parallel. Task-switching readers/writers and event primitives are given as well. In the readers/writers implementations, reader activity is fully parallel in the absence of writers. An important feature of both the readers/writers and event implementations is that tasks waiting for a resource can be resumed in parallel by multiple PEs. 
We then demonstrate how high-level parallel programming constructs (e.g., parallel loops) may be implemented via the task primitives and the queue and multiqueue data structures. Finally, we prove that one of the readers/writers implementations satisfies certain correctness criteria including freedom from deadlock and the mutual exclusion of readers and writers.
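The role of fetch-and-add described above can be illustrated with a minimal sketch in which several workers each claim distinct slot indices from a shared counter without a central server. This is not one of the thesis's queue algorithms: Python has no hardware fetch-and-add, so a lock stands in for the indivisible operation, and the names `FetchAndAddCell` and `claim_slots` are hypothetical.

```python
import threading


class FetchAndAddCell:
    """Simulated memory cell offering an indivisible fetch-and-add.

    On the machines discussed in the abstract this is a hardware primitive;
    here a lock is an assumed stand-in that only models its indivisibility.
    """

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_and_add(self, delta):
        # Return the old value and add delta as one indivisible step.
        with self._lock:
            old = self._value
            self._value += delta
            return old


def claim_slots(num_workers, claims_per_worker):
    """Let each worker claim slot indices by itself (self-service style)."""
    tail = FetchAndAddCell(0)
    claimed = [[] for _ in range(num_workers)]

    def worker(worker_id):
        for _ in range(claims_per_worker):
            # Each call yields an index that no other worker receives.
            claimed[worker_id].append(tail.fetch_and_add(1))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return claimed


slots = claim_slots(num_workers=4, claims_per_worker=100)
all_slots = sorted(index for per_worker in slots for index in per_worker)
```

Because every worker obtains a unique index, insertions into distinct slots need not wait for one another; a full queue would additionally have to handle overflow, underflow, and deletion, which this sketch omits.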
Ph.D. Thesis 1987 A Decision Procedure for a Class of Unquantified Formulae of Set Theory Involving the Powerset and Singleton Operators Cantone, Domenico A. Abstract | PDF

Title: A Decision Procedure for a Class of Unquantified Formulae of Set Theory Involving the Powerset and Singleton Operators

Candidate: Cantone, Domenico A.

Advisor(s): Schwartz, Jacob T.

Abstract:

The class of unquantified formulae of set theory involving Boolean operators, the powerset and the singleton operators, and the equality and membership predicates is shown to have a solvable satisfiability problem. It is also shown that whenever a formula $\phi$ in the above class is satisfiable there exists a hereditarily finite model of $\phi$, whose rank is bounded by a doubly exponential expression in the number of variables occurring in $\phi$.
Ph.D. Thesis 1987 Tape Reversal and Parallel Time Chen, Jianer Abstract | PDF

Title: Tape Reversal and Parallel Time

Candidate: Chen, Jianer

Advisor(s): Yap, Chee; Gross, Jonathan

Abstract:

Recent research has shown an intimate relationship between reversal complexity on multitape Turing machines and parallel computation time. In this dissertation, we systematically study the structural properties of these two important complexity measures and the relationship between them. We develop some basic techniques necessary for establishing analogues of well-known theorems on space and time complexity. We give a linear simulation of deterministic space by deterministic reversal on multitape Turing machines and the first known tape reduction theorem for reversal complexity. As applications of the tape reduction theorem, we prove a hierarchy theorem and show the existence of complete languages for reversal complexity. The relationship between reversal and tape is also discussed. We show that with respect to reversal complexity there is an intrinsic difference between 1-tape and 2-tape Turing machines. More precisely, we show that in the deterministic case, 2-tape Turing machines can simulate k-tape Turing machines with only a polynomial (quadratic) increase of reversals while 1-tape Turing machines do not have such a property if $P \neq PSPACE$; in the nondeterministic case, reversal complexity is too powerful to be a complexity measure on 2-tape Turing machines but on 1-tape Turing machines it is a reasonable complexity measure which is linearly related to the space complexity. For parallel computation, we introduce the concepts of deterministic, nondeterministic and oracle circuits in a very natural way. Based on our model of oracle circuits, we build up a log-depth hierarchy in parallel computation, and show that our hierarchy corresponds exactly to the well-known NC hierarchy. From this point of view, some structural properties of the NC hierarchy are discussed. Log-depth many-one reducibility and log-depth Turing reducibility are discussed. Several new complete languages for the class of deterministic log-space languages are presented. 
Finally, we give the detailed proofs of the polynomial relationship between reversal complexity on multitape Turing machines and parallel time complexity on uniform circuits. (Some of these proofs have been outlined by Pippenger.)
Ph.D. Thesis 1987 The use of Data Flow Information for the Selection and Evaluation of Software Test Data Frankl, Phyllis G. Abstract | PDF

Title: The use of Data Flow Information for the Selection and Evaluation of Software Test Data

Candidate: Frankl, Phyllis G.

Advisor(s): Weyuker, Elaine

Abstract:

Two families of software test data adequacy criteria, each based on data flow analysis, are defined for programs written in Pascal. Their formal properties are investigated and interactive software testing tools based on them are described. The first of these families, the data flow testing criteria, was previously defined for programs written in a simple language. We extend the definitions to apply to programs written in Pascal. The data flow testing criteria are based purely on the syntax of the program being tested. They require that the test data execute certain paths from program points at which variables are defined to program points at which those definitions are used. We describe the design and implementation of a software testing tool, ASSET, based on the data flow testing criteria. A serious weakness of the data flow testing criteria is that for some programs there exists no set of test data which is adequate for testing the program according to these criteria. This problem arises due to unexecutable paths in the program. The second family of criteria, the feasible data flow testing criteria, circumvents this problem by eliminating from consideration those definition-use associations which can never be exercised. We show that certain formal properties of the feasible data flow testing criteria differ significantly from those of the data flow testing criteria. Since it is undecidable whether a given set of test data satisfies a given feasible data flow testing criterion, feasible data flow testing cannot be fully automated. However, it can be partially automated. We describe a heuristic method, the path expression method, which attempts to determine whether a given definition-use association can be exercised. The path expression method is based on a combination of data flow analysis and symbolic evaluation. We introduce a new symbolic evaluation technique which is more general, but essentially no more expensive, than symbolic execution. 
The path expression method, along with ASSET, constitutes a tool which partially automates feasible data flow testing.
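The notion of a definition-use association can be made concrete with a small sketch. It assumes a hypothetical five-node control flow graph, assumes that within a node all uses happen before any definition, and does not model ASSET, Pascal, or the feasibility analysis; the function names are illustrative only.

```python
def def_use_associations(succ, defs, uses):
    """Collect (variable, def_node, use_node) triples that are joined by
    some syntactic path along which the variable is not redefined."""
    pairs = set()
    for def_node, variables in defs.items():
        for var in variables:
            stack = list(succ.get(def_node, ()))
            seen = set()
            while stack:
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                if var in uses.get(node, ()):
                    pairs.add((var, def_node, node))
                if var in defs.get(node, ()):
                    continue  # redefinition: do not search past this node
                stack.extend(succ.get(node, ()))
    return pairs


def exercised_by(path, defs, uses):
    """Associations exercised when execution follows `path`."""
    latest_def = {}  # variable -> node of its most recent definition
    hit = set()
    for node in path:
        for var in uses.get(node, ()):
            if var in latest_def:
                hit.add((var, latest_def[var], node))
        for var in defs.get(node, ()):
            latest_def[var] = node
    return hit


# Hypothetical program: 1: read x; 2: if x > 0; 3: y := x; 4: y := 0; 5: write y
succ = {1: [2], 2: [3, 4], 3: [5], 4: [5]}
defs = {1: {"x"}, 3: {"y"}, 4: {"y"}}
uses = {2: {"x"}, 3: {"x"}, 5: {"y"}}

required = def_use_associations(succ, defs, uses)
then_branch = exercised_by([1, 2, 3, 5], defs, uses)
else_branch = exercised_by([1, 2, 4, 5], defs, uses)
```

In this example neither execution path alone exercises every association, but the two together do. An association that lies only on unexecutable paths would remain in `required` forever, which is the weakness the feasible criteria are meant to address.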
Ph.D. Thesis 1987 Control and Task Planning for a Four Finger Dextrous Manipulator Hor, Maw-Kae Abstract | PDF

Title: Control and Task Planning for a Four Finger Dextrous Manipulator

Candidate: Hor, Maw-Kae

Abstract:

Various attempts have been made to build a dextrous hand and to study the control and planning issues involved in dextrous manipulation. However, in many practical situations, the following problems make the real time control and planning of dextrous manipulation very difficult: (1) the discrepancy between the model and reality (for example, imprecise knowledge of inertia, friction, and the geometric dimensions), (2) the inadequacy of the control theory used in controlling a highly non-linear manipulator, (3) the numerous computations required in the dynamic and kinematic calculations, and (4) the lack of abstract level manipulation primitives. This thesis investigates several issues in relation to dextrous manipulation and control. We designed and built a planar manipulator, the Four Finger Manipulator, for studying dextrous manipulation. We also developed a prototype software structure for multi-finger manipulators. Models for quasi-static control and real time calculation are presented which make the real time control possible. Heuristics are described for: (a) choosing the finger gripping forces of a force controlled adaptive frictional grasp, (b) estimating the trajectory in compliant motions, and (c) coordinating finger groups to perform tasks that require multiple finger groups. A set of manipulation primitives and algorithms have been developed on the Four Finger Manipulator. Successful performance is demonstrated for various tasks.
Ph.D. Thesis 1987 An Analyzer for the Information Content of Sentences (Semantics) Johnson, Stephen Bennett Abstract | PDF

Title: An Analyzer for the Information Content of Sentences (Semantics)

Candidate: Johnson, Stephen Bennett

Advisor(s): Sager, Naomi

Abstract:

An algorithm is presented which produces a representation of the information content of sentences as a tree of operator words predicating on argument words. The Sentence Analyzer employs a new type of formal grammar which describes the surface syntax of sentences, grammatical constraints, and the operator-argument relations underlying the surface forms. The algorithm works left to right, first obtaining the operator-argument representations of words from a lexicon, and then applying grammar rules to construct operator-argument subtrees over longer and longer segments of the sentence. All alternate analyses are developed simultaneously. The grammar rules are based on the detailed mathematical grammar of Zellig Harris, termed here Composition-Reduction Grammar, in which sentences are generated by a process of operator words entering on argument words. As words enter, this tree structure is linearized. Various reductions may apply to words which are redundant in the operator-argument structure, producing variations such as morphological changes, and the dropping of words from the sentence. Reduction yields sentences with a more compact form, the form we see, while preserving the objective information content. The fundamental unit of the formal grammar developed here is the descriptor, a tuple of six attributes, which represents an operator-argument word class. A descriptor is similar to traditional word classes like nouns and verbs, but can carry information specific to an individual word to form an entry in the lexicon. More importantly, descriptors can replace the use of symbols for phrases in traditional grammar. This is because a descriptor can stand for the entire word sequence spanned by the operator-argument subtree of which it is the root. This feature enables the grammar rules to be specified as a relation between two descriptors whose subtrees span adjacent word sequences. 
The two words related by a rule either have a simple operator-argument relation, or a more complex operator-argument relation made compact by reduction. The result is a formal grammar in which all relations are between words, with sufficient power for the Sentence Analyzer to perform a direct analysis of sentences into their informational relations, without recourse to intricate transformational procedures.
Ph.D. Thesis 1987 Description of Shape using Orientation and Propagation Flow Menczel, Yaron Abstract | PDF

Title: Description of Shape using Orientation and Propagation Flow

Candidate: Menczel, Yaron

Abstract:

A new theory for the partition of an image into its syntactical primitives is introduced. The method uses edge segments and their orientation to mark an image with useful syntactical information. The marking is done by defining a flow initiating from the boundary and propagating inward into the shape. Three algorithms are introduced. The first sends flow waves in a direction perpendicular to the edges into the object. The second algorithm is an iterative version of the first algorithm, with the addition that an edge detector is constantly applied on the growing object. The third labels the edges with their orientation and then iteratively applies a majority vote selection to spread the orientation, with unlabeled pixels inactive in the voting process. The propagation is moderated by a number of heuristics that ensure local and global support within the flow. The flow carries orientation data and spreads the information to all interior pixels. A connected component algorithm based on orientation is then used to construct segments of uniform orientation. These segments constitute the basis of a structural description. The new approach is compared to other methods of segmentation and representation of shapes. These other methods are not always capable of explaining human perception of shapes in a uniform and unique way. Methods that are designed to deal with simple perceptual domains are not capable of dealing with occlusion, texture, touching bodies, and subjective contours. In contrast, this new proposal is shown to work with simple figures as well as more complex real-world images. Several examples are given to show the usefulness of the approach. In particular, we give an implementation of a system that performs automatic character recognition based on this method.
Ph.D. Thesis 1987 Generic: a Programming Language for VLSI Layout and Layout Manipulation Solworth, Jon A. Abstract | PDF

Title: Generic: a Programming Language for VLSI Layout and Layout Manipulation

Candidate: Solworth, Jon A.

Abstract:

We describe a programming language, GENERIC (GENERation of Integrated Circuits), for producing high-quality, general-purpose layout of custom integrated circuits. Unlike other VLSI programming languages, in GENERIC, existing layouts can be manipulated by the VLSI operators to produce new layouts. The design of a layout in GENERIC starts with a circuit description which contains the active components and electrical nets. The circuit description (sometimes called an abstract layout) is then transformed into a realizable layout by the application of VLSI operators. These operators are both design-rule safe and wire connectivity maintaining. Built-in operations include relative placement, primitive compaction, and orientation. A novel mechanism called planes is described, which for the first time enables non-design rule violating topological manipulations. GENERIC forms the kernel of a VLSI design system. We also describe the cell library, Flexcell, which contains parameterized and modifiable cells. Cells in the Flexcell library are created using cell generators, but unlike traditional cell generators, the layout generated need not exhibit a high degree of regularity. For each cell, a number of templates are provided, which encode known good layout schemes. Cells created with a template can then be modified using utilities written in GENERIC. Hence, Flexcell provides highly optimized cells which can be reused in many different environments.
Ph.D. Thesis 1987 A Theory of Concurrent Programs and Test Data Adequacy Weiss, Stewart Neil Abstract | PDF

Title: A Theory of Concurrent Programs and Test Data Adequacy

Candidate: Weiss, Stewart Neil

Abstract:

We establish a general framework for the investigation of concurrent program-based adequacy criteria and we extend notions of program-based test data adequacy to the domain of concurrent programs. This work is consistent with the testing theory proposed by Gourlay and the axiomatization of test data adequacy proposed by Weyuker. Our method is to define a representation of concurrent programs which is particularly suited to the study of the problems of concurrent program testing, and which serves as a model for an extension of a theory of testing to such programs. Our framework also provides the basis for a practical testing tool for concurrent programs. We prove theoretical results concerning various properties of our representation of concurrent programs, among which are notions of completeness, consistency, and computability. We propose approximate solutions to some of the undecidable problems which we encounter. We demonstrate that our theory of concurrent program testing may be used to assess the complexity and reliability of various adequacy criteria for testing concurrent programs. We use our model to investigate and compare concurrent program based adequacy criteria derived from a subclass of structural coverage criteria including a large family of data flow criteria. Finally, we propose practical methods of using our framework as an aid to concurrent program testing.
Ph.D. Thesis 1986 Three-Dimensional Data Acquisition by Means of the Intensity Ratio Depth Sensor (Vision, Robotics) Carrihill, Brian Lee Abstract | PDF

Title: Three-Dimensional Data Acquisition by Means of the Intensity Ratio Depth Sensor (Vision, Robotics)

Candidate: Carrihill, Brian Lee

Abstract:

The thesis discusses the acquisition of three-dimensional information by means of the Intensity Ratio Depth Sensor. The Intensity Ratio Depth Sensor uses a structured-light triangulation approach for the measurement of depth from a camera unit to object surfaces in a scene. The device may be viewed as a modification of the plane-of-light scheme in which multiple illumination planes are encoded by intensity ratio values obtained from two or three intensity images. The modification avoids the need to scan the plane of light which, together with the small amount of processing required for the depth calculation, offers a distinct speed advantage over existing schemes. The system design and calibration issues, necessary in obtaining a working Intensity Ratio Depth Sensor, are analyzed. The depth equation (for the transformation of intensity ratio values into depth values) together with four experimental methods for its calculation are presented. The results of the four sensor implementations are given for test scenes. Potential scene dependent and scene independent error sources are discussed. In particular, mutual illumination (illumination resulting from reflections between surface elements) is an important scene dependent error source. An analysis of mutual illumination based on a radiative energy transfer formulation is presented. The result of the analysis is an iterative mutual illumination removal algorithm which is applied to test scenes. Two empirical methods for mutual illumination removal are also derived and demonstrated. Preliminary processing of the three-dimensional data produced by the sensor, exploiting constraints imposed by the device, is examined. The processing yields first and second derivative surface parameters for points in the scene.
Ph.D. Thesis 1986 Polygon Optimization Problems (Computational Geometry, Algorithm) Chang, Jyun-Sheng Abstract | PDF

Title: Polygon Optimization Problems (Computational Geometry, Algorithm)

Candidate: Chang, Jyun-Sheng

Advisor(s): Yap, Chee

Abstract:

The thesis examines polygon optimization problems arising from the stockcutting problem. Two types of problems are considered: the inclusion problems and the enclosure problems. The inclusion (enclosure) problems ask for a maximum polygonal subset (minimum polygonal superset) of a given polygon, satisfying certain conditions. Both the area and perimeter metrics on the polygons can be used as the measure of optimality. Various geometric properties and algorithms for these problems are shown. The main results are: (1) An $O(n^7)$ time ($O(n^6)$ time) algorithm for finding a maximum area (perimeter) convex subset. (Only exponential time algorithms existed previously for the problem.) (2) An $O(n^2 \log n \log k)$ time algorithm for finding a minimum area enclosing convex k-gon. (3) An $O(n^2)$ time algorithm for finding a minimum perimeter enclosing triangle. (4) An $O(nk^4)$ time algorithm for finding a minimum enclosing k-gon with a fixed shape.
Ph.D. Thesis 1986 Machine Code Optimization Goss, Clinton Francis Abstract | PDF

Title: Machine Code Optimization

Candidate: Goss, Clinton Francis

Abstract:

This dissertation explores classes of compiler optimization techniques which are applicable late in the compilation process, after all executable code for a program has been linked. We concentrate on techniques which, for various reasons, cannot be applied earlier in the compilation process. We begin by demonstrating the need for optimizations at this level in the UNIX® programming environment. We then describe a Machine Code Optimizer which improves code in executable task files in that environment. The specific details of certain algorithms are then described: code elimination to remove unreachable code, code distribution to re-order sections of code, operand reduction which converts operands to use more advantageous addressing modes available on the target architecture, and macro compression which collapses common sequences of instructions. We show that the problem of finding optimal solutions for code distribution is NP-Complete and discuss heuristics for practical solutions. We then describe the implementation of a Machine Code Optimizer containing the code elimination, code distribution, and operand reduction algorithms. This optimizer operates in a production environment and incorporates a machine independent architecture representation which allows it to be ported across a large class of machines. We demonstrate the portability of the Machine Code Optimizer to the Motorola MC68000® and the Digital VAX-11® instruction sets. Finally, metrics on the improvements obtained across architectures and the optimization techniques are provided along with proposed lines of further research. The methods demonstrate that substantial reductions in code space and more modest improvements in execution speed can be obtained using these techniques.
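The idea behind code elimination can be sketched as a reachability computation over a linked program: anything not reachable from the entry point is a candidate for removal. The representation below (a dictionary from routine names to the routines they reference) and all routine names are hypothetical; the thesis works on executable task files and its actual algorithm is not reproduced here.

```python
def reachable_routines(references, entry):
    """Return the set of routines reachable from `entry`.

    `references` maps each routine to the routines it calls or jumps to.
    Routines outside the returned set can never execute and may be dropped.
    """
    keep = set()
    worklist = [entry]
    while worklist:
        name = worklist.pop()
        if name in keep:
            continue
        keep.add(name)
        worklist.extend(references.get(name, ()))
    return keep


# Hypothetical linked program: `debug_dump` was linked in but is never called.
references = {
    "main": ["parse", "emit"],
    "parse": ["error"],
    "emit": [],
    "error": [],
    "debug_dump": ["emit"],
}
kept = reachable_routines(references, "main")
removed = set(references) - kept
```

Running this after linking matters because only then is the full set of references known; a compiler processing one module at a time cannot tell whether another module uses a given library routine.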
Ph.D. Thesis 1986 Sequential Quadratic Programming Methods Based on Approximating a Projected Hessian Matrix (Updating Method, Quasi-Newton, Nonlinear Constraints) Gurwitz, Chaya Bleich Abstract | PDF

Title: Sequential Quadratic Programming Methods Based on Approximating a Projected Hessian Matrix (Updating Method, Quasi-Newton, Nonlinear Constraints)

Candidate: Gurwitz, Chaya Bleich

Advisor(s): Overton, Michael

Abstract:

We consider the nonlinear programming problem, namely minimizing a nonlinear function subject to a set of nonlinear equality and inequality constraints. Sequential quadratic programming (SQP) methods are particularly effective for solving problems of this nature. It is assumed that first derivatives of the objective and constraint functions are available, but that second derivatives may be too expensive to compute. Instead, the methods typically update a suitable matrix which approximates second derivative information at each iteration. We are interested in developing SQP methods which maintain an approximation to second derivative information projected onto the tangent space of the constraints. The main motivation for our work is that only the projected matrix enters into the optimality conditions for the nonlinear problem. Updating projected second derivative information reduces the dimension of the matrix to be recurred; we avoid the necessity of introducing an augmenting term which can lead to ill-conditioned matrices; and we are able to make use of standard quasi-Newton updates which maintain hereditary positive definiteness. We discuss four possible formulations of the quadratic programming subproblem and present numerical results which indicate that our methods may be useful in practice.
Ph.D. Thesis 1986 Analysis of Cache Memories in Highly Parallel Systems Mcauliffe, Kevin Patrick Abstract | PDF

Title: Analysis of Cache Memories in Highly Parallel Systems

Candidate: Mcauliffe, Kevin Patrick

Advisor(s): Gottlieb, Allan

Abstract:

Though advances in VLSI technology will soon make it practical to construct parallel processors consisting of thousands of processing elements (PEs) sharing a central memory, the performance of these parallel processors is limited by the high memory access time due to interconnect network latency. This thesis is a study of how the performance of a parallel processor is affected by associating a cache memory with each PE of the system. Cache parameters and policies are varied and the performance of the resulting cache configurations is compared. The cache coherence problem is discussed and a solution that is compatible with the philosophy of parallel systems is adopted. Performance is analyzed by analytic and simulation models. Due to time and space limitations the simulation modeling is done in a hierarchical fashion: a primary level simulates a single cache and a secondary level simulates a parallel machine. The simulators can run in a trace-driven and self-driven mode. The trace data used to drive the simulators was collected by tracing the reference patterns of actual parallel programs. An approximate analytic model is developed that predicts the queue waiting times of various components of a parallel system, enabling the comparison of a wider range of cache parameters than is possible with the simulators.
Ph.D. Thesis 1986 Synthesizing Realistic Textures by the Composition of Perceptually Motivated Functions (Graphics) Perlin, Kenneth H. Abstract | PDF

Title: Synthesizing Realistic Textures by the Composition of Perceptually Motivated Functions (Graphics)

Candidate: Perlin, Kenneth H.

Advisor(s): Lowe, David

Abstract:

This research demonstrates a uniform functional composition framework for modeling and synthesizing complex textures. The appearance of a wide range of natural phenomena can be expressed and efficiently synthesized in this framework. Animation of texture is readily incorporated. Emphasis will be on explaining the properties leading to generality, expressivity, and efficiency. A system is described in which an image is approximated by a finite collection of samples, representing neighborhoods in the image. The user designs visual simulations of surface textures by constructing an algorithm that is to be independently computed at each image sample. Primitive functions are provided that allow control within the texture algorithm of visually important texture properties, such as frequency and first order spatial statistics. The user proceeds by building from these functions. Feedback is provided by images indicating the state of any computed quantity over all samples. The system includes primitive functions allowing the manipulation of such visually discriminable qualities as brightness, contrast, coherent discontinuities, orientation, and features possessing restricted ranges of frequency. These are used to build up composite functions allowing the manipulation of more sophisticated visual qualities. The system is applied to build the appearance of many textures such as water, star fields, flame, smoke, marble, clouds, stucco, rock, and soap films. Major results are twofold. First, it will be shown that a wide range of naturalistic visual textures can be constructed with this approach. Second, a number of particular functions will be demonstrated that encode the common visual elements of disparate visual textures.
Ph.D. Thesis 1986 Persistent Data Structures Sarnak, Neil Ivor Abstract | PDF

Title: Persistent Data Structures

Candidate: Sarnak, Neil Ivor

Advisor(s): Tarjan, Robert

Abstract:

This dissertation introduces the concept of persistence in data structures. Classical algorithms operate on data structures in such a manner that modifications to the structure do not preserve its state as it appeared before the modification. A persistent data structure is one in which multiple versions of the structure as it varies through time are maintained. Data structures that do not maintain the history of states of the structure are called ephemeral. A differentiation between two types of persistence, partial persistence and full persistence, is made. A partially persistent data structure allows the modification only of the most recent version of the structure. This makes partial persistence useful in cases where the history of update operations is required for query purposes but no changes of prior versions are desired. Under certain constraints, any ephemeral data structure may be made persistent without a major blow-up of the space and time complexity measures. Full persistence allows modification of any version of the data structure. This dissertation presents algorithms that support persistent search trees, with applications in computational geometry. In particular, the planar point location problem will be solved using persistent binary search trees with an O(log n) query time and O(n) space. Persistent lists are described, with applications in applicative programming languages. In particular, persistent deques are presented that have constant space overhead per deque operation, while still maintaining O(1) update times. Persistent finger search trees are also presented, with applications in text editing. Persistent finger search trees are implemented with an O(log d) space overhead per update, and an O(log d) time bound, where d is the distance between the finger and the affected position. A general result is shown that allows making arbitrary ephemeral data structures partially persistent with an O(1) space overhead per update operation.
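To make the idea of keeping multiple versions concrete, here is a minimal sketch of a persistent binary search tree. It uses path copying, a simpler technique than the constant-overhead constructions the abstract describes: each insertion copies only the nodes on the search path, so it costs space proportional to the depth of the tree rather than O(1) per update. The tree is unbalanced, and the names `Node`, `insert`, `contains`, and `versions` are illustrative assumptions.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node; older versions can safely share subtrees."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key):
    """Return the root of a new version; the version rooted at `root` is unchanged."""
    if root is None:
        return Node(key)
    if key < root.key:
        return root._replace(left=insert(root.left, key))
    if key > root.key:
        return root._replace(right=insert(root.right, key))
    return root  # key already present: the new version is the old one


def contains(root, key):
    """Search one particular version of the tree."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


# versions[i] is the tree after i insertions; every version stays queryable.
versions = [None]
for k in (5, 2, 8):
    versions.append(insert(versions[-1], k))
```

Because any earlier root can be passed to `insert` again, the sketch also permits updates to old versions, at the cost of the extra copying noted above.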
Ph.D. Thesis 1986 The Semantics of Shared Variables in Parallel Programming Languages Shulman, Norman Victor Abstract | PDF

Title: The Semantics of Shared Variables in Parallel Programming Languages

Candidate: Shulman, Norman Victor

Abstract:

Chapter 1 surveys the status of shared variables in parallel programming languages, as well as pointing out the problems inherent in the use of shared variables and the importance of a semantic definition. Our approach to the semantics of shared variables is set forth, and used to highlight the deficiencies of shared variables in Ada. Chapter 2 presents a clear simple informal semantic model of shared variables based on the concepts of atomicity, uniqueness and independence. The model captures the relationships between these concepts so that it can be used to resolve questions regarding packing, mutual exclusion, and local copies of shared variables. Chapter 3 discusses the deficiencies of shared variables in Ada. An informal semantic model of shared variables in Ada is presented in terms of the concepts of atomicity, uniqueness and independence. This informal semantic model serves as the basis for proposing changes to the section of the Ada Reference Manual dealing with shared variables for incorporation in a future revision. Chapter 4 shows how the Ada definition can be modified so that execution of programs such as the on-the-fly garbage collector and the Laplace's equation solver mentioned in Chapter 1 will no longer be qualified as erroneous. New restrictions can be imposed to ensure the independence of operations on shared variables. The informal semantic model also serves as the basis for extending the applicability of the axiomatic techniques of Owicki to a wider class of programs subject to certain optimizations of time and space. Chapter 5 shows that it is possible to relax the restrictions on expressions, and to formulate conditions under which it is safe to keep local copies of shared variables and to pack shared structured objects, while preserving the assignment axiom.
Ph.D. Thesis 1986 Recursive Data Types in SETL: Automatic Determination, Data Language Description, and Efficient Implementation (Compilers) Weiss, Gerald Abstract | PDF

Title: Recursive Data Types in SETL: Automatic Determination, Data Language Description, and Efficient Implementation (Compilers)

Candidate: Weiss, Gerald

Abstract:

Very high level languages are often weakly typed in the sense that different occurrences of a name can be associated with distinct types. The types of many entities are nevertheless determinable from the structure of the program, which often allows translators for these languages to incorporate some sort of typefinding algorithm. Due to problems of algorithmic termination, however, these algorithms have been unable to type structures of a recursive nature such as trees. In this thesis we present a method which detects and uncovers the structure of recursive objects, and discuss possible applications of the method to optimization of code. We examine the run-time type model of SETL and the corresponding data representation sublanguage (DRSL), and present a general critique of the design as well as implementation of the current data representation sublanguage. The objects expressible by the latter are shown to be proper subsets of the universe of types assumable by SETL entities at run-time; we present suggestions for extending the representation sublanguage to allow for complete type specification.
Ph.D. Thesis 1985 Extraction and Generalization of Expert Advice (Learning, Representation, Induction) Benjamin, David Paul Abstract | PDF

Title: Extraction and Generalization of Expert Advice (Learning, Representation, Induction)

Candidate: Benjamin, David Paul

Abstract:

This work describes a method for representing knowledge in production systems which makes use of the conflict set. This permits a rich description of task situations, and allows the use of control productions to effect conflict resolution. A set of extensions to the OPS5 production system is described which facilitates the implementation of this approach within OPS5. This extended system is then used to implement a multi-level, goal-directed production system for the construction of expert systems, CAMERA, in which control information is automatically built from the actions of an expert trainer. This control information consists of sequencing and goal information which is interactively extracted from the trainer by CAMERA, and generalized by DISC, which models generalization as the process of finding 'discriminating' features, which are those features of a situation that cause a particular method to be chosen, and then constructing a description of those features. When solving a task, CAMERA examines only the discriminating features specified in the generalized control rules. Thus, instead of matching all the productions against the working memory, CAMERA considers only the relevant rules. Experiments with the system are described.
Ph.D. Thesis 1984 On the use of Global Optimization Algorithms for the Detection of Semantic Programming Errors (SETL, Data Flow, Type Finding) Freudenberger, Stefan M. Abstract | PDF

Title: On the use of Global Optimization Algorithms for the Detection of Semantic Programming Errors (SETL, Data Flow, Type Finding)

Candidate: Freudenberger, Stefan M.

Abstract:

It has been pointed out repeatedly that it should be possible to adapt global program optimization algorithms for the purpose of detecting faults in programs. It has become clear that global program analysis can be beneficial in program development, debugging, verification, and documentation since it can provide information about all possible executions of the code at once. The techniques employed are not only capable of revealing errors, interfacing errors, and other shortcomings but of doing so in a way which helps to pinpoint the source of problems. In this dissertation we systematically examine the global optimization techniques available today to determine how these techniques can be used to aid the rapid, compile-time detection of program errors. The techniques considered include flow tracing, type finding, and value flow. The approach is to determine what facts about a program can be collected using the best available program analysis technique, and to use this information to mark suspicious program segments. The techniques proposed have been implemented in an extensive global bug finder, and examples of its use are included.
Ph.D. Thesis 1984 Description of Operating Systems using Very-High-Level Diction (Programming Languages) Leshem, Gavriel Abstract | PDF

Title: Description of Operating Systems using Very-High-Level Diction (Programming Languages)

Candidate: Leshem, Gavriel

Abstract:

Operating systems are generally large and complicated, and therefore difficult to write, debug and maintain. This thesis approaches the problem of simplifying these complex descriptions by writing operating system prototypes using a very high-level programming language that significantly relieves the burden of low-level and machine dependent details. The language used includes special constructs designed to facilitate clear and concise description of the mechanisms involved in multiprocessing systems: (1) a coroutine mechanism to implement concurrent processes, (2) an interprocess communication mechanism, and (3) a real I/O facility that provides access to I/O system services. Using these intermediate-level constructs simplifies the problem of describing the high-level structure of operating systems significantly. These constructs are written in a high-level programming language, using several simple low-level primitives. They can be modified easily and new operations can be added at will. The main purpose of our high-level approach is to provide a tool for describing and designing operating systems. The high-level description can be used as a blueprint for writing the real operating system in a suitable lower-level implementation language. The thesis also describes an implementation of the suggested language that can be used to test the high-level description of an operating system and possibly also to simulate the real system to predict its potential performance. We test our descriptive tools by giving extended descriptions of two well-known operating systems using the proposed high-level language. Several basic design issues concerning these operating systems are then examined and the operating systems are compared in a manner that emphasizes the design issues that emerge. Some modifications of these systems, inspired by the high-level representation, are suggested.
Ph.D. Thesis 1984 Decidability and Proof Procedures for Set Theory with a Choice Operator Omodeo, Eugenio Giovanni Abstract | PDF

Title: Decidability and Proof Procedures for Set Theory with a Choice Operator

Candidate: Omodeo, Eugenio Giovanni

Advisor(s): Davis, Martin D.

Abstract:

Various decision algorithms are described and proved correct, each applying to a particular collection of unquantified set-theoretical formulas. Some of these algorithms are able to determine whether each given formula is satisfiable, some others can only establish whether it is satisfiable by means of an interpretation in which the values of the terms appearing in the formula are finite sets. In most cases, formulas are allowed to involve a choice operator which selects from every non-empty set s the minimum of s with respect to a well-ordering of the class of all sets. A semi-decision procedure is also described which applies to unquantified formulas in which all familiar set-theoretical operators are allowed to appear, with certain limitations imposed only on the occurrences of the unionset and choice operators. The execution of this procedure only terminates when the input formula is finitely satisfiable.
Ph.D. Thesis 1984 A Self-Organizing Database System - a Different Approach to Query Optimization Piatetsky-Shapiro, Gregory Ilya Abstract | PDF

Title: A Self-Organizing Database System - a Different Approach to Query Optimization

Candidate: Piatetsky-Shapiro, Gregory Ilya

Abstract:

A Self-Organizing Database System (SODS) monitors queries asked, finds a good (or optimal) database structure for those queries, and suggests or does the reorganization. In this thesis we describe a prototype SODS for single-file relational queries and give an integrated analysis of its major design problems: (1) estimation of the number of records satisfying a condition (i.e., condition selectivity); (2) query optimization; (3) storing information about a set of queries; (4) optimal selection of secondary indices. We present new results for each of those problems. Some of this research was implemented in FASTSCAN, a commercial query system. We present a new method for accurate estimation of the number of records satisfying a condition "field rel constant", where rel is one of =, <, >, ≤, ≥. We also examine estimates for more complicated conditions. We present elementary operations (such as UNION, INTERSECT) on pointer and record streams. We show how to use the query parse tree to construct a query evaluation method (EM) from those operations. Then we give an algorithm for selecting the optimal EM, based on converting the query to conjunctive normal form. We examine ways to compress information about a set of queries by combining information for similar queries. We derive a compression scheme which allows a correct and fast computation of the cost of the average query under any index set. We combine all previous results in analyzing the NP-hard problem of optimal index selection. We present two algorithms for it. The first one always finds the optimal answer and runs fast on real-size problems despite its exponential worst-case complexity. The second one (a Greedy method) runs much faster, yet finds the optimal answer very frequently. We analyze the Maximum Cover problem (also NP-hard), a simplification of the optimal index selection. We prove that the Greedy method is an epsilon-approximate algorithm: its answer is always > 63% of the optimal answer.
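The Greedy method for Maximum Cover can be sketched as follows. The sketch assumes the standard formulation (choose at most k subsets so that their union is as large as possible), which may differ in detail from the formulation used in the thesis; the function name `greedy_max_cover` and the sample data are hypothetical. For this standard formulation the greedy choice is known to reach at least a 1 - 1/e fraction (about 63.2%) of the optimum, which matches the figure quoted in the abstract.

```python
def greedy_max_cover(sets, k):
    """Pick up to k sets, each time taking the one that covers the most new elements."""
    covered = set()
    chosen = []
    for _ in range(k):
        # Index of the set with the largest marginal gain; ties go to the lowest index.
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # nothing left to gain
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


# Hypothetical instance where greedy is good but not optimal.
sets = [{1, 2, 3, 4, 5, 6}, {1, 2, 3, 7}, {4, 5, 6, 8}]
chosen, covered = greedy_max_cover(sets, k=2)
# Greedy takes set 0 first and ends up covering 7 elements;
# choosing sets 1 and 2 instead would cover all 8.
```

Here the greedy answer reaches 7/8 of the optimum, comfortably above the worst-case bound.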
Ph.D. Thesis 1984 Concurrency Control using Locks in Distributed Databases Wolfson, Ouri Abstract | PDF

Title: Concurrency Control using Locks in Distributed Databases

Candidate: Wolfson, Ouri

Abstract:

Distributed Databases have drawn a great deal of research interest recently because of a combination of several related reasons. First is the tremendous expansion in the quantity of data that has to be processed in the modern world. Second is the growth in the number of interrelated processing centers because microcomputers and communication technology enable greater dispersion of organizations. Third is the realization that complex problems to be addressed in this and the next decade, such as different aspects of Artificial Intelligence, will require at least some parallel processing for adequate solution. In Distributed Databases the typical problems of Centralized Databases become more difficult. One of them is Concurrency Control. It can be summarized as follows. Users of the Database access it by executing transactions. Different transactions are executed concurrently; therefore their actions interleave. Without proper control this interleaving may produce incorrect results, even if individual transactions are correct. The Concurrency Control process has to prevent these situations. There are several possible mechanisms for controlling concurrency, of which the most widely used is Locking. In this thesis we examine and analyze Locking as a Concurrency Control mechanism for Distributed Databases. We define Distributed Locking Policies (methods for locking entities in Distributed Databases) and show how existing Policies for a Centralized Database generalize to the Distributed case. We also define a new category of Distributed Locking Policies, D-policies, into which these generalizations fall. An algorithm which determines whether all transactions of a given D-policy are guaranteed to produce only correct interleavings (are safe) is presented. The algorithm is efficient, even though testing an arbitrary set of transactions for safety is coNP-complete. However, we prove that optimal locking of transactions to satisfy the conditions tested by the algorithm is NP-hard even for a Centralized Database.
Ph.D. Thesis 1983 A Practical Method for LR and LL Syntactic Error Diagnosis and Recovery Burke, Michael George Abstract | PDF

Title: A Practical Method for LR and LL Syntactic Error Diagnosis and Recovery

Candidate: Burke, Michael George

Abstract:

A powerful, practical, and language-independent method for diagnosing and recovering from syntactic errors within the LR and LL parsing frameworks is described. The method proceeds in three phases. The simple recovery phase attempts a single token modification of the source text, scope recovery attempts a multiple token insertion to close one or more open scopes, and secondary recovery involves a multiple deletion of tokens surrounding the error point. When the token at which the error is detected is not the token that is in error, points on the parse stack must be considered if the error is to be corrected. Condensation that has occurred on the parse stack, however, is sometimes harmful in this context. Also, in some of the parsing frameworks under consideration, unwanted condensation may occur even if the error is detected at the point at which it occurs. This problem motivates the existence of four versions of the method involving tradeoffs between the quality of error recovery and efficiency with respect to space and time. Techniques are described that make the method efficient in practice. Other implementation issues, such as language specific tuning and the issuing of diagnostic messages, are discussed. Empirical results are presented that demonstrate that the versions of the method offer choices ranging from very high quality recovery with reasonable efficiency to high quality recovery with excellent efficiency.
Ph.D. Thesis 1983 Resolution by Unification and Equality Digricoli, Vincent Joseph Abstract | PDF

Title: Resolution by Unification and Equality

Candidate: Digricoli, Vincent Joseph

Abstract:

In resolution by unification and equality, we recast the theory of binary resolution on the basis of the properties of the equality relationship as stated by the equality axioms. In standard binary resolution as introduced by J. A. Robinson in 1965, complete and strict unification is the sole basis for resolving complementary literals leading to exceptionally long proofs for even simple theorems involving equality. In RUE resolution implicit use of the equality axioms is made through their incorporation into two rules of inference which are sound and complete to prove E-unsatisfiability. Proofs by RUE resolution are significantly shorter and more transparent than standard refutations with the equality axioms. These qualities permit more effective application of heuristics to guide the search for refutations. We here present the complete theory of RUE resolution, with proofs of lemmas and theorems in support of the theory. We define RUE hyperresolution as a restriction strategy and develop a heuristic theory to order the search for refutations. We have implemented an RUE theorem prover and performed experiments in the fields of Boolean algebra, Ring theory and Group theory. We present a careful comparison with the work of McCharen, Overbeek and Wos, whose theorem prover using unification resolution with the equality axioms and paramodulation represents one of the most successful uses of unification resolution. The comparison of results presents major evidence that RUE resolution is a significant advance over unification resolution.
Ph.D. Thesis 1983 Measuring SETL Performance Shields, Lynwood David Abstract | PDF

Title: Measuring SETL Performance

Candidate: Shields, Lynwood David

Advisor(s): Schwartz, Jacob T.

Abstract:

Current computer technology is being driven by the hardware advances that have provided a constant and dramatic decrease in the cost of elementary hardware operations. This has made more feasible the use of high-level languages that permit program development without the constant attention to detail needed to achieve efficient execution that characterizes low-level languages; indeed, such languages can be realized by a combination of microcode and special-purpose VLSI chips. However, effective use of this technology requires an understanding of the underlying performance issues. We have analyzed the problem of measuring performance of high-level languages by studying in detail one such language, SETL, and have developed a set of measurement tools addressed both to the user and the implementor. Our thesis is that such measurement efforts must aim to provide measurement tools that can be integrated into the system, but only after their efficacy has been demonstrated by their use on real programs. This work has resulted in prototype versions of four program profilers, each providing a specific view of SETL performance; we discuss their use in analyzing, and then improving, the performance of actual SETL programs. We also discuss the implementation of the hard code system that provides an essential starting point for evaluating the effectiveness of the representation sublanguage provided by SETL. Finally, we indicate some ways in which SETL performance can be improved.
Ph.D. Thesis 1983 Undecidable Complexity Statements in a Hierarchy of Extensions of Primitive Recursive Arithmetic Sigal, Ron Mark Abstract | PDF

Title: Undecidable Complexity Statements in a Hierarchy of Extensions of Primitive Recursive Arithmetic

Candidate: Sigal, Ron Mark

Advisor(s): Weyuker, Elaine; Davis, Martin D.

Abstract:

For each transfinite ordinal α ≤ ε₀, we fix a unique well-ordering of the natural numbers which we call its canonical well-ordering. Let S(α) be Primitive Recursive Arithmetic plus function definition by transfinite recursion on the canonical well-ordering of order type α. For a hierarchy of theories S(α), where ω^(ω^ω) ≤ α < ε₀, we define functions φ_α such that statements asserting extremely loose upper bounds on the computational complexity of φ_α are independent of S(α). We quantify the gap between actual and provable complexity bounds in terms of the Löb-Wainer hierarchy of rapidly growing functions. A statement asserting a primitive recursive upper bound for the complexity of φ_α can be proven in a theory slightly higher in the hierarchy than S(α).
Ph.D. Thesis 1983 Formal Languages with Oracles Weixelbaum, Elia S. Abstract | PDF

Title: Formal Languages with Oracles

Candidate: Weixelbaum, Elia S.

Abstract:

A relativization of formal language theory is studied in this dissertation. Specifically, we examine possible relativizations of the four language classes of the Chomsky hierarchy. Definitions are given for oracle finite automata, oracle pushdown automata, oracle linear bounded automata, and oracle Turing machines. The relativized regular languages are characterized via results derived from AFL (abstract families of languages) theory. We then use this characterization to help us derive a relativization of the Chomsky-Schützenberger theorem for relativized context free languages. We examine relativized recursively enumerable (r.e.) languages by studying oracle Turing machines and also by suggesting a definition for an oracle phrase structure grammar. We demonstrate two different types of equivalences between these two models. The context sensitive languages are relativized in the same manner as are the r.e. languages, although there are difficulties in proving the respective results for the context sensitive case. Several unresolved questions remain in this case.
Ph.D. Thesis 1982 Decision Algorithms for a Class of Set-Theoretic Formulae Involving One Occurrence of the Union-Set Operator Breban, Michael Abstract | PDF

Title: Decision Algorithms for a Class of Set-Theoretic Formulae Involving One Occurrence of the Union-Set Operator

Candidate: Breban, Michael

Advisor(s): Schwartz, Jacob T.

Abstract:

We consider the first order language allowing the operators = (equality), ∈ (membership), ∪ (binary union), ∩ (binary intersection), \ (set difference), { } (singleton former) and one occurrence of Un (unary union). We show that unquantified formulae of this language are decidable. As a preparatory result we show that unquantified formulae of the above mentioned language not involving the singleton former are decidable.
Ph.D. Thesis 1982 The Role of the High Level Specification in Programming by Transformation: Specification and Transformation by Parts Merritt, Susan Mary Abstract | PDF

Title: The Role of the High Level Specification in Programming by Transformation: Specification and Transformation by Parts

Candidate: Merritt, Susan Mary

Abstract:

Specification by parts is a technique for constructing a very high level specification of a problem. The specification is then the target of transformation by parts, a global transformation strategy, which yields a family of high level algorithms which are correct and which solve the problem. The specifications are easy to construct, to understand and to modify. The key to the specification by parts technique is the use of weak parts. Output conditions are factored into conjunctions of weaker conditions, called weak parts, each of which is easier to satisfy than the original condition. In the transformation by parts, an initial guess is made for the output object. The guess satisfies some subset of the weak parts; the conditions in this subset are called the invariant conditions. A general iterative structure is built, which incrementally changes the initial guess, keeping the invariant conditions true, and converging to the remaining conditions. The methodology demonstrates the relationship between invariance and convergence in algorithm construction. In particular it demonstrates that algorithms for the same problem are often the result of different choices of invariant and convergent conditions. The methods are illustrated in three case studies and in three supplementary examples (which are smaller in scope than the case studies), all of which are fundamental computer science problems. These applications demonstrate the flexibility and ease with which the high level specifications can be constructed and transformed. They also demonstrate the potential which this methodology offers for the discovery of new algorithms, the illustration of connections among known algorithms, and the possible semi-automation or automation of algorithm construction.
Ph.D. Thesis 1982 Software Structures for Ultraparallel Computing Rudolph, Lawrence S. Abstract | PDF

Title: Software Structures for Ultraparallel Computing

Candidate: Rudolph, Lawrence S.

Advisor(s): Gottlieb, Allan; Schwartz, Jacob T.

Abstract:

In this thesis we implement several basic parallel processing primitives by using a replace-add operation, which can supersede the standard test and set, and which appears to be a universal primitive for efficiently coordinating large numbers of independently acting sequential processors. The replace-add is essentially an indivisible add-to-memory operation although concurrent replace-adds can all be processed in the same one cycle. In particular, we use the replace-add to develop routines for concurrent access to a queue and show how they can be used to devise many highly parallel algorithms as well as a distributed, concurrent task scheduler. The paracomputer forms our underlying theoretical model of parallel computation although we also consider a realistic architecture approximating this model. We justify our use of the replace-add operation by presenting a hardware implementation that permits multiple replace-adds to be processed nearly as efficiently as loads and stores. Moreover, the crucial special case of concurrent replace-adds updating the same variable is handled particularly well: If every PE simultaneously addresses a replace-add at the same variable, all these requests are satisfied in the time required to process just one request.
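
The following is a minimal software sketch of how an indivisible add-to-memory operation can hand out distinct queue slots to concurrent inserters. It assumes that the operation returns the updated value, and it simulates indivisibility with a lock; the class and function names are hypothetical. Unlike the hardware implementation described in the abstract, the lock serializes the requests rather than satisfying them in the time of a single request.

```python
import threading


class SharedCell:
    """A shared integer with a simulated indivisible add-to-memory operation."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def replace_add(self, increment):
        # Assumption: the operation returns the value after the addition.
        with self._lock:
            self._value += increment
            return self._value


def concurrent_insert(num_workers):
    """Each worker claims a distinct slot by adding 1 to the shared tail index."""
    tail = SharedCell(0)
    slots = [None] * num_workers

    def worker(worker_id):
        index = tail.replace_add(1) - 1  # unique index per worker
        slots[index] = worker_id

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return slots
```

Because every call receives a different return value, no two workers write to the same slot, and no worker has to wait on a test-and-set style critical section around the queue itself.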
Ph.D. Thesis 1981 Stochastic Solutions to the Schroedinger Equation for Fermions Arnow, David Moss Abstract | PDF

Title: Stochastic Solutions to the Schroedinger Equation for Fermions

Candidate: Arnow, David Moss

Abstract:

An exact stochastic method has been developed for generating the antisymmetric eigensolution of lowest index and its associated eigenvalue for the Schroedinger wave equation in 3N dimensions. The method is called the Green's function Monte Carlo method for fermions (FGFMC) because it is based on a Monte Carlo solution to the integral form of the Schroedinger equation (using Green's function) and because it is the fermion class of particles in physics which requires antisymmetric solutions. The solution consists of two sets of 3N-dimensional points, {R_j^+} and {R_j^-}, distributed by density functions ψ^+ and ψ^-, whose difference, ψ^+ - ψ^-, is proportional to the eigensolution, ψ_F. These sets may be used to estimate integrals of the form (DIAGRAM, TABLE OR GRAPHIC OMITTED...PLEASE SEE DAI) where R = (x_1, ..., x_{3N}) and where f(R) and g(R) are antisymmetric functions. By setting g(R) to ψ_T(R) and f(R) to Hψ_T(R), where ψ_T is an antisymmetric trial wave function satisfying the boundary conditions, E_F is obtained. The method is exact because the only sources of error are variance and bias, both of which can be estimated and reduced, either by employing larger sample sizes, or by reconstructing the sampling procedure in ways that make greater use of our understanding of the problem (importance sampling). There are no physical or mathematical approximations other than the statistical one. The crux of the method is a sampling procedure which constructs the two sets of points in linear time (as a function of accuracy). Earlier methods were exponential in cost. The FGFMC method is successfully applied to a one dimensional problem and a nine dimensional problem, the results of which are presented here. These results demonstrate that this method can be successfully applied to small physical problems on medium-scale computing machines. The key to this success was the transformation of the problem from exponential to linear cost as a function of accuracy. The strong dependence on dimensionality, however, currently results in an exponential cost as a function of problem size, and this, until overcome, imposes a severe barrier to calculations on large systems.
Ph.D. Thesis 1981 Synchronization Efficiency Borg, Anita Abstract | PDF

Title: Synchronization Efficiency

Candidate: Borg, Anita

Abstract:

A generally applicable methodology for the analysis of synchronization efficiency is introduced. It is based upon the assumption that synchronization is required because of the need to control the use of resources by concurrent processes. Two aspects of synchronization efficiency are identified: time efficiency and accuracy efficiency. Time efficiency provides a measure of the use of resources during synchronization. Accuracy efficiency specifies how well a solution to a synchronization problem supports the rules of a problem. The methodology involves the simulation of solutions to synchronization problems as greater and greater implementation detail is specified. The assumptions made concerning the execution times of operations, especially synchronization operations, are seen to be crucial to the correct analysis of synchronization efficiency. It is argued that the only reasonable assumption for the execution times of synchronization operations, when their implementation is left unspecified, is that they execute instantaneously. However, it is also shown that this assumption must be used with care in order to avoid erroneous conclusions. The methodology is applied to PV, Monitor, and ADA solutions to the mutual exclusion, reader-writer, and consumer-producer problems. The PV solutions were usually the most efficient, while the ADA solutions were found to be the least efficient. It is also shown that no single characteristic of a solution determines its efficiency. However, the primary characteristics affecting efficiency are shown to be: (1) the execution time required for synchronization; (2) the rules for execution of the synchronizing computations; (3) the amount of competition among processes; and (4) the amount and cost of process switching required during synchronization. It is the interaction of these factors which determines synchronization efficiency.
Ph.D. Thesis 1981 Circle Graphs Buckingham, Mark Alan Abstract | PDF

Title: Circle Graphs

Candidate: Buckingham, Mark Alan

Advisor(s): Golumbic, Martin

Abstract:

From a circle with chords we may derive a graph whose nodes correspond to chords and whose edges correspond to intersecting chords. Such a graph is called a circle graph. After numbering the endpoints of the chords such that two endpoints are numbered the same iff the endpoints belong to the same chord, we form a circle graph sequence by reading off these numbers going around the outside of the circle. Circle graph sequences are often used to prove properties of circle graphs. In this dissertation we discuss many mathematical and algorithmic aspects of circle graphs. The number of different circle with chords representations that yield a chordless path is given. The property that a circle with chords is connected (that is, its derived circle graph is connected) and the property that a circle with chords has two separated chords (that is, two chords that cannot both be intersected by a third chord without the third intersecting a fourth chord) are described in terms of circle graph sequences. They are found to be dual to one another. An incomplete forbidden subgraph characterization of circle graphs is also presented. An important result of this dissertation is that the Berge Strong Perfect Graph Conjecture is shown to hold for the class of circle graphs. Many properties of p-critical graphs and partitionable graphs are given, most with simplified proofs. Some new results are presented and a new, very simple proof of the Berge Conjecture for K_{1,3}-free graphs is put forward. Very efficient algorithms for finding maximum (weighted) cliques and maximum (weighted) stable sets of the derived circle graph of a circle graph sequence are given. We find an O(e log_2 ω) algorithm for the unweighted clique problem, an O(δe) algorithm for the weighted clique problem and an O(c) algorithm for the weighted stable set problem, where e is the number of edges in the graph, ω the maximum clique size, δ the maximum degree and c the number of occurrences of an interval being completely contained in another interval in the circle graph sequence. Some open problems for further research are listed.
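
The derivation of a circle graph from a circle graph sequence can be sketched as follows. This is a straightforward quadratic-time construction written for illustration, not one of the algorithms from the dissertation; the function name is hypothetical. It relies on the fact that two chords intersect exactly when their endpoints interleave around the circle.

```python
from itertools import combinations


def circle_graph(sequence):
    """Return the edge set of the circle graph derived from `sequence`.

    `sequence` lists chord labels in the order their endpoints are met
    going around the circle; every label must occur exactly twice.
    """
    positions = {}
    for index, label in enumerate(sequence):
        positions.setdefault(label, []).append(index)
    if any(len(p) != 2 for p in positions.values()):
        raise ValueError("each chord label must appear exactly twice")

    edges = set()
    for a, b in combinations(sorted(positions), 2):
        a1, a2 = positions[a]
        b1, b2 = positions[b]
        # Chords intersect iff exactly one endpoint of one chord lies
        # between the two endpoints of the other (interleaving).
        if a1 < b1 < a2 < b2 or b1 < a1 < b2 < a2:
            edges.add((a, b))
    return edges
```

For example, the sequence 1, 2, 1, 2 describes two crossing chords and yields a single edge, while 1, 1, 2, 2 describes two disjoint chords and yields no edges.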
Ph.D. Thesis 1981 Decision Procedures for some Classes of Unquantified Set Theoretic Formulae Ferro, Alfredo Abstract | PDF

Title: Decision Procedures for some Classes of Unquantified Set Theoretic Formulae

Candidate: Ferro, Alfredo

Advisor(s): Schwartz, Jacob T.; Mammana, Carmelo

Abstract:

We consider the first order language consisting of = (equality), ∈ (membership), ∪ (binary union), ∩ (binary intersection), \ (set difference), and pow (powerset former). We show that the class of all universal sentences of this language is decidable, provided that we impose the strong restriction that at most two terms appear as arguments of the powerset former. As a preliminary result we show that the class of all universal sentences in the above language, extended by allowing infinitely many constants (one for each hereditarily finite set), is decidable provided that we allow only a single occurrence of the powerset former.
Ph.D. Thesis 1981 A Transformational Framework for Automatic Derived Data Control and its Applications in an Entity-Relationship Data Model Koenig, Shaye Abstract | PDF

Title: A Transformational Framework for Automatic Derived Data Control and its Applications in an Entity-Relationship Data Model

Candidate: Koenig, Shaye

Abstract:

This thesis investigates the specification, implementation and application of derived data in the context of MADAM, an entity-relationship oriented, map-based data model/programming language for database conceptual schema representation and processing. The data representation and manipulation facilities of MADAM, described in chapter 2, represent a synthesis of ideas from the areas of very high level languages, in particular SETL, and the binary association and entity-relationship approaches to data modeling. Derived data refers to data that appears to exist in its declared form, but is actually derived from related data in the database. Previous approaches to the materialization of derived data have been based on a global recalculation strategy in which derived data is recomputed whenever it is referenced. In this thesis we present an alternative approach in which derived data is explicitly stored and incrementally maintained. In chapter 3, we describe the definition of derived data in MADAM; discuss its importance as a means of fostering logical data independence, providing access control mechanisms, and supporting semantic relativism; and present a unified framework for the automatic maintenance of derived data. This framework is based on the transformational techniques of finite differencing in which repeated costly computations are replaced by more efficient incremental counterparts. Beyond supporting alternative views of the same data, the incremental maintenance approach also applies to the implementation of summary data, integrity control, and triggers; these applications are discussed in chapter 4.
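
The contrast between global recalculation and incremental maintenance can be sketched with a single derived value. The example below is a hypothetical illustration in Python, not MADAM code: a stored total is adjusted by the difference each update introduces, instead of being recomputed from all base data on every reference.

```python
class SalaryTable:
    """Base data (salaries) plus one derived value (their total)."""

    def __init__(self):
        self._salaries = {}
        self._total = 0  # derived data, stored explicitly

    def set_salary(self, employee, salary):
        old = self._salaries.get(employee, 0)
        self._salaries[employee] = salary
        # Incremental counterpart of recomputing the sum: apply only the change.
        self._total += salary - old

    def remove(self, employee):
        self._total -= self._salaries.pop(employee)

    def total(self):
        """Constant-time read of the incrementally maintained value."""
        return self._total

    def recomputed_total(self):
        """Global recalculation, shown for comparison; linear in table size."""
        return sum(self._salaries.values())
```

Each update pays a small constant cost so that every later reference to the derived value is cheap, which is the trade the abstract describes.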
Ph.D. Thesis 1981 Upper and Lower Bounds on the Performance of Parallel Algorithms Kruskal, Clyde Philip Abstract | PDF

Title: Upper and Lower Bounds on the Performance of Parallel Algorithms

Candidate: Kruskal, Clyde Philip

Advisor(s): Schwartz, Jacob T.

Abstract:

With the advent of VLSI, new opportunities in computer architecture are emerging. Parallel processors composed of many thousands of PEs will soon be practical. In this thesis, we derive both upper and lower bounds for parallel algorithms. Our analyses emphasize two specific models of parallel computation--the ultracomputer and the paracomputer--but the general ideas and many of the results are much more widely applicable. We present general lower bounds for solving a wide class of problems on direct connection machines, and a sharper lower bound for effecting permutations. This latter bound shows that the permutation problem is not completely parallelizable on any direct connection machine that is not almost completely connected. In addition, using a very general model of parallel computation, we study the worst case time complexity of searching in parallel. We then present a large collection of basic algorithms for both the ultracomputer and the paracomputer. Since the performances of many of these algorithms achieve the lower bounds mentioned above, both models are extremely effective parallel computer systems. Finally, a systematic method for generalizing any dependent-size algorithm to an independent-size one is given.
Ph.D. Thesis 1980 The Transformational Approach to the Development and Verification of Programs in a very High Level Language Deak, Edith Gail Abstract | PDF

Title: The Transformational Approach to the Development and Verification of Programs in a very High Level Language

Candidate: Deak, Edith Gail

Abstract:

In informal exposition, the correctness of a complex algorithm is often demonstrated by deriving it through successive refinement steps from a high level specification, and supplying proofs of the underlying principles used in the process. However, most existing mechanical program verifiers ignore this standard expository practice, and are generally designed to verify programs written in a low level form. While logically simple algorithms can be handled adequately in this manner, attempting to verify more complex algorithms at a low level requires treatment of implementation details which obscure the main arguments of the verification. This thesis describes a systematic technique for proving algorithms correct using a transformational approach, and presents a detailed transformation/verification scenario of the proof of a variety of complex combinatorial algorithms. The algorithms treated here are considerably more involved than those verified by other methods. The programming language used is a variant of SETL, adapted for program verification, which provides a medium for high level specification. A program P is annotated with logical formulae of set theory, which are called assumptions and assertions. P is said to be partially correct if every computation which satisfies all assumptions also satisfies all assertions. In order to prove the correctness of P, which initially contains only assumptions, we apply proof rules which are used both to transform the program into logical formulae called verification conditions and then to prove these verification conditions. The transformation rules are unique in that they enable the combination of correct program fragments. We are able to reuse general code fragments in a variety of contexts without reproof and to derive several different low level algorithms from a single high level algorithm. The transformations often require proof of enabling conditions. In such cases, when a transformation is performed, the enabling condition is introduced into the program text as an assumption which must be verified in turn using the proof mechanism described above.
Ph.D. Thesis 1980 An Implementation for Gyve: a Language for Concurrent Processing Meyer, Jeanine Marietta Abstract | PDF

Title: An Implementation for Gyve: a Language for Concurrent Processing

Candidate: Meyer, Jeanine Marietta

Abstract:

This thesis presents a design for implementing a programming language, called GYVE, for specifying groups of concurrent processes such as operating systems. GYVE was designed by Philip Shaw and is described in his dissertation (New York University, 1978). Important features of GYVE include compile time protection checking, explicit scheduling of processes and a dynamic destroy function. The present work contains a detailed review of most of the constructs of GYVE and discussion of how various features could be modified so as to ease the implementation and/or increase performance in certain situations. One such feature concerns accessing of shared objects. This thesis specifies the syntactic and semantic phases of a GYVE compiler and the runtime structures and procedures required for execution of output from the compiler. Included with the specification is a reconciliation of the definition of GYVE implicit in the implementation with the formal definition of Shaw. Shaw gives his formal definition of the compilation process in the form of a two-level grammar. This is compared with the BNF-based syntactic and semantic phases of the implementation. Shaw's runtime system is specified through procedures written in GYVE. The specification code of the implementation is in a low level form of SETL in which we refer to various system tables of fixed sizes, machines with finite storage, semaphores and a simple timer mechanism. An analysis is given of the use of semaphores as required by the existence of the destroy function and the desire to prevent deadlock.
Ph.D. Thesis 1980 Optimization of Inductive Assertions Warren, Jr., Henry Stanley Abstract | PDF

Title: Optimization of Inductive Assertions

Candidate: Warren, Jr., Henry Stanley

Abstract:

Inductive assertions are assertions placed in the loops of a program, primarily for the purpose of aiding a mechanical correctness prover to prove that the program is correct. Here we assume that the assertions in a program are executed along with the program. That is, the predicate expression of each assertion is evaluated when encountered during program execution, to verify that its value is true. Inductive assertions are particularly expensive in terms of execution time. This is not only because they are in loops, but also because they are frequently themselves loops (quantified expressions). Thus executing them can slow a program's execution by a factor that can be indefinitely large. For example, executing them can change an O(n^2) process to an O(n^3) process. This thesis investigates the possibility of optimizing such quantified inductive assertions by substantially reducing the range of quantification. It is shown that many inductive assertions encountered in practice fall into a simple pattern in which the quantifier may, essentially, be removed. This restores the execution time of the program to the same order of magnitude that it would have been if the inductive assertions were not executed. Emphasis is placed on methods that are no more costly in compiler size and execution time than conventional global optimization techniques.
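
A small hand-worked illustration of the cost of an executed quantified assertion, and of reducing its range, is sketched below. The loop, the assertion, and the function names are assumptions chosen for this example and are not drawn from the thesis; here the reduction is done by hand, whereas the thesis concerns doing it automatically. Since earlier table entries are never modified, the assertion checked on the previous iteration already covers all but the newest entry, so only that one instance needs to be evaluated.

```python
def squares_full_check(n):
    """Build a table of squares, executing the full inductive assertion
    "for all k < i: table[k] == k*k" on every iteration."""
    table = []
    checks = 0
    for i in range(n):
        for k in range(i):  # quantified assertion: a loop inside the loop
            checks += 1
            assert table[k] == k * k
        table.append(i * i)
    return table, checks


def squares_reduced_check(n):
    """Same loop with the quantifier's range reduced to the single new instance."""
    table = []
    checks = 0
    for i in range(n):
        if i > 0:
            # Entries 0..i-2 were verified earlier and are never modified,
            # so only the entry appended last needs checking.
            checks += 1
            assert table[i - 1] == (i - 1) * (i - 1)
        table.append(i * i)
    return table, checks
```

The full version performs a quadratic number of checks for a linear loop, while the reduced version performs a linear number, returning the program to its original order of magnitude.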
Ph.D. Thesis 1979 On Quadtrees, Voronoi Diagrams, and Lattices: Results in Geometric Algorithms Bennett, Huxley Abstract | PDF

Title: On Quadtrees, Voronoi Diagrams, and Lattices: Results in Geometric Algorithms

Candidate: Bennett, Huxley

Advisor(s): Chee Yap

Abstract:

We present several results on geometric algorithms, and somewhat more specifically on algorithmic aspects of geometric structures including quadtrees, Voronoi diagrams, and lattices. Our work contains two parts, the first of which is on subdivision algorithms, and the second of which is on lattice algorithms.

Subdivision algorithms amount to recursively splitting an ambient space into smaller pieces until certain conditions hold. Often the underlying space is a square in the plane (or a box in higher dimensions), whose subdivision is represented by a quadtree (or its higher-dimensional analogs). A quadtree is smooth if any two adjacent leaf boxes differ by at most one in depth. We first study the cost of the smooth split operation in quadtrees, showing that it has constant amortized cost in quadtrees of any fixed dimension.

We then present a subdivision-based algorithm for computing isotopic epsilon-approximations of planar minimization diagrams. Given a family of continuous functions, its minimization diagram partitions the plane into regions on which each function is minimal. Minimization diagrams generalize many natural Voronoi diagrams, and we show how to use our framework to compute an anisotropic Voronoi diagram on polygonal sites. We have implemented a prototype of our algorithm for anisotropic Voronoi diagrams, and we provide experimental results.

We then turn to studying lattice algorithms. A lattice is a regular ordering of points in Euclidean space, which is represented as the set of all integer combinations of some linearly independent vectors (which we call a basis of the lattice). In our first work on lattices, we introduce and study the Lattice Distortion Problem (LDP). LDP asks how "similar" two lattices are, i.e., what the minimum distortion of a linear bijection between two lattices is. We show how to compute low-distortion mappings with a tradeoff between approximation quality and running time based on a notion of basis reduction introduced by Seysen (Combinatorica 1993). We also show that LDP is NP-hard to approximate to within any constant factor (under randomized reductions).

Finally, we study the problem of finding lattice bases which are optimal with respect to two basis quality measures. Namely, we study the problem of finding bases with minimal orthogonality defect, and with nearly minimal Seysen condition number. We give algorithms which solve both problems while running in time depending only on the rank of the lattice times a polynomial in the input length.
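
For orientation, the orthogonality defect of a basis can be computed as the product of the basis vector lengths divided by the lattice determinant, which equals 1 exactly when the basis is orthogonal. The sketch below uses this standard definition with floating-point Gram-Schmidt orthogonalization; it only evaluates the quality measure for a given basis and is not the optimization algorithm from the thesis. The function names are hypothetical.

```python
import math


def gram_schmidt_norms(basis):
    """Lengths of the Gram-Schmidt vectors of linearly independent `basis`."""
    ortho = []
    for b in basis:
        v = [float(x) for x in b]
        for u in ortho:
            # Subtract the projection of b onto each earlier orthogonal vector.
            coeff = sum(x * y for x, y in zip(b, u)) / sum(y * y for y in u)
            v = [x - coeff * y for x, y in zip(v, u)]
        ortho.append(v)
    return [math.sqrt(sum(x * x for x in v)) for v in ortho]


def orthogonality_defect(basis):
    """Product of basis vector lengths divided by the lattice determinant."""
    lengths = [math.sqrt(sum(x * x for x in b)) for b in basis]
    # The determinant equals the product of the Gram-Schmidt vector lengths.
    determinant = math.prod(gram_schmidt_norms(basis))
    return math.prod(lengths) / determinant
```

For the integer lattice, the basis (1, 0), (0, 1) has defect 1, while the equally valid basis (1, 0), (1, 1) has defect equal to the square root of 2, reflecting that its vectors are not orthogonal.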
Ph.D. Thesis 1979 Automatic Storage Optimization Fabri, Janet Abstract | PDF
Ph.D. Thesis 1979 The Optimization of Horizontal Microcode within and Beyond Basic Blocks: an Application of Processor Scheduling with Resources Fisher, Joseph Allen Abstract | PDF
Ph.D. Thesis 1979 On the Complexity of the Satisfiability Problem Goldberg, Allen T. Abstract | PDF
Ph.D. Thesis 1979 Computing Chromatic Polynomials for Special Families of Graphs Loerinc, Beatrice Margaret Abstract | PDF
Ph.D. Thesis 1979 Expression Continuity and the Formal Differentiation of Algorithms Paige, Robert Allan Abstract | PDF
Ph.D. Thesis 1979 Comparison of Direct Code Generation and Intermediate Language Generation for Bootstrapping the Machine-Independent Compiler, Little Schneck, Paul Bennett Abstract | PDF
Ph.D. Thesis 1979 Groups with Solvable Word Problems Semeniuk, Christine Abstract | PDF
Ph.D. Thesis 1979 Automatic Discovery of Heuristics for Nondeterministic Programs from Sample Execution Traces Stolfo, Salvatore Joseph Abstract | PDF
Ph.D. Thesis 1978 Decision Regions for Multi-Stage Allocation Problems Coppage, Samuel Francis, Jr. Abstract | PDF
Ph.D. Thesis 1978 Configurable Software for Satellite Graphics Hartzman, Peter David Abstract | PDF
Ph.D. Thesis 1978 Automatic Data Structure Choice in SETL Liu, Ssu-Cheng Abstract | PDF
Ph.D. Thesis 1978 Gyve, a Programming Language for Protection and Control in a Concurrent Processing Environment Shaw, Philip Sidell Abstract | PDF
Ph.D. Thesis 1977 Computer Reconstruction of Bodies Bounded by Quadric Surfaces from a Set of Imperfect Projections Shapira, Ruth Abstract | PDF

Title: Computer Reconstruction of Bodies Bounded by Quadric Surfaces from a Set of Imperfect Projections

Candidate: Shapira, Ruth

Abstract:

This thesis describes a computer program for constructing a description of solid bodies from a set of n pictures of the bodies. The bodies are assumed to be bounded by faces which are quadric or planar, and they are restricted to have all their vertices formed by exactly three faces. The pictures are taken from different vantage points, with the restriction that a slight shift in vantage point will not alter the topology of the picture. It is assumed that the program receives outline information from a preprocessor which has extracted this information from the pictures. The outline information (set of line structures) may be imperfect in that some junctions may be erroneously reported and some lines may be missing. However, all lines due to shadows are assumed to have been eliminated by the preprocessor. The thesis includes a technique for establishing the validity of the junctions presented by the preprocessor as well as for matching corresponding features in the line structures derived from the different pictures. New grammar rules for line-drawing projections of curved and planar solid bodies are developed. These are useful in parsing the line drawings. They have also led to the definition of a new family of impossible objects. The program works simultaneously with all the available line structures. The parsing of every line structure is supported dynamically by the results gotten thus far from the parsing of the other line structures. Through the parsing of the line structures, the use of picture comparison and the application of the grammar rules, many of the preprocessor errors are detected and partly corrected. The program also can provide feedback to the preprocessor in the form of suggestions as to where to look again for lines in the pictures. The program utilizes the extracted line structures corresponding to the different bodies in all the pictures to determine the set of faces (insofar as possible) for every body. 
Every face is defined by an ordered set of n-tuples. The n-tuples are the matched lines and junctions in the n different pictures. The three-dimensional coordinates of the vertices and the equations of the faces can then be determined from these n-tuples. The program was written in PL/I and has been tested on several scenes.
Ph.D. Thesis 1976 On Algorithms for Minimizing the Number of Multiplications in Matrix Products Laderman, Julian David Abstract | PDF
Ph.D. Thesis 1976 A Comprehensive Survey of Parsing Algorithms for Programming Languages Owens, Philip Jonathan Abstract | PDF
Ph.D. Thesis 1976 Programming of Mechanism Motions Spegel, Marjan Abstract | PDF
Ph.D. Thesis 1976 Inferential Learning through Counterexample Construction Sperling, Michael Zelig Abstract | PDF
Ph.D. Thesis 1975 Operating System Specification using very High Level Dictions Markstein, Peter Willy Abstract | PDF
Ph.D. Thesis 1975 Visual Information Processing of Isolated Character Inputs Stryker, Charles William Abstract | PDF
Ph.D. Thesis 1975 An Investigation into a Probability Model for Correct Target Letter Detection Teichman, Sheldon M. Abstract | PDF
Ph.D. Thesis 1975 A Computer Based Approach to some Geometric Aspects of Character Recognition Wilamowsky, Yonah Abstract | PDF
Ph.D. Thesis 1974 Investigations in the Theory of Descriptive Complexity Gewirtz, William Lawrence Abstract | PDF
Ph.D. Thesis 1974 A Metalanguage for Expressing Grammatical Restrictions in Nodal Spans Parsing of Natural-Language Hobbs, Jerry Robert Abstract | PDF
Ph.D. Thesis 1974 Computer Edge Extraction from Photographs of Curved Objects Ramer, Eugen Urs Abstract | PDF
Ph.D. Thesis 1974 Optimum Correction of Pincushion Distortion Takeuchi, Seiichi Abstract | PDF
Ph.D. Thesis 1974 Type Determination for very High Level Languages Tenenbaum, Aaron Melvin Abstract | PDF
Ph.D. Thesis 1973 Recursive Compiler-Optimization for Nonserial Program Graphs Agresti, William Wolfgang Abstract | PDF
Ph.D. Thesis 1973 Studies in Pattern Recognition of Line-Size, Line-Orientation and their Interaction Friedmann, Jehosua Abstract | PDF
Ph.D. Thesis 1973 Computer Recognition of Handprinted Two-Dimensional Mathematics Grossman, Fred Abstract | PDF
Ph.D. Thesis 1973 Sub-Elementary Classes of Functions and Relations Harrow, Keith Abstract | PDF
Ph.D. Thesis 1973 A Study in Programming Techniques Maly, Kurt Abstract | PDF
Ph.D. Thesis 1973 A Comparison of some Deadlock Models Waxman, Jerry Milton Abstract | PDF
Ph.D. Thesis 1972 An Experimental Comparison of the Efficiency of Parsing Techniques Knobe, Bruce Stuart Abstract | PDF
Ph.D. Thesis 1972 Digital Computer Transformations for Irregular Line-Drawings Reggiori, Giovanni B. Abstract | PDF
Ph.D. Thesis 1971 A Network Queueing Model of a Multiprogrammed Time-Shared Computer System Brown, Theodore David Abstract | PDF
Ph.D. Thesis 1971 Parallel Programming: Operational Model and Detection of Parallelism Firestone, Roger Morris Abstract | PDF
Ph.D. Thesis 1971 Global Flow Analysis and Register Allocation for Simple Code Structures Kennedy, Kenneth Wade, Jr. Abstract | PDF
Ph.D. Thesis 1971 Reconstruction of Polyhedra from Sets of their Perspective Projections Rabinowitz, Andrew David Abstract | PDF
Ph.D. Thesis 1971 A Trainable Syntactic Model for Syntax Specification and Recognition of Handdrawn Two-Dimensional Patterns Sharma, Onkar P. Abstract | PDF
Ph.D. Thesis 1971 A Systematic Method for the Creation of Data Structures in Computer Graphics Applications Williams, Robin Abstract | PDF
Ph.D. Thesis 1971 A Computer Procedure for Generating Visible-Line Drawings of Solids Bounded by Quadric Surfaces Woon, Peter Yi-do Abstract | PDF
Ph.D. Thesis 1970 The Optimum Two-Dimensional Allocation of Irregular, Multiply-Connected Shapes with Linear, Logical and Geometric Constraints Adamowicz, Michael Abstract | PDF