Instructions for submitting a technical report or thesis.
Title: On Quadtrees, Voronoi Diagrams, and Lattices: Results in Geometric Algorithms
Candidate: Bennett, Huxley
Advisor(s): Yap, Chee
Abstract:
We present several results on geometric algorithms, and somewhat more specifically on algorithmic aspects of geometric structures including quadtrees, Voronoi diagrams, and lattices. Our work contains two parts, the first of which is on subdivision algorithms, and the second of which is on lattice algorithms.
Subdivision algorithms amount to recursively splitting an ambient space into smaller pieces until certain conditions hold. Often the underlying space is a square in the plane (or a box in higher dimensions), whose subdivision is represented by a quadtree (or its higher-dimensional analogs). A quadtree is smooth if any two adjacent leaf boxes differ by at most one in depth. We first study the cost of the smooth split operation in quadtrees, showing that it has constant amortized cost in quadtrees of any fixed dimension.
We then present a subdivision-based algorithm for computing isotopic epsilon-approximations of planar minimization diagrams. Given a family of continuous functions, its minimization diagram partitions the plane into regions on which each function is minimal. Minimization diagrams generalize many natural Voronoi diagrams, and we show how to use our framework to compute an anisotropic Voronoi diagram on polygonal sites. We have implemented a prototype of our algorithm for anisotropic Voronoi diagrams, and we provide experimental results.
We then turn to studying lattice algorithms. A lattice is a regular ordering of points in Euclidean space, which is represented as the set of all integer combinations of some linearly independent vectors (which we call a basis of the lattice). In our first work on lattices, we introduce and study the Lattice Distortion Problem (LDP). LDP asks how "similar" two lattices are, i.e., what the minimum distortion of a linear bijection between two lattices is. We show how to compute low-distortion mappings with a tradeoff between approximation quality and running time based on a notion of basis reduction introduced by Seysen (Combinatorica 1993). We also show that LDP is NP-hard to approximate to within any constant factor (under randomized reductions).
Finally, we study the problem of finding lattice bases which are optimal with respect to two basis quality measures. Namely, we study the problem of finding bases with minimal orthogonality defect, and with nearly minimal Seysen condition number. We give algorithms which solve both problems while running in time depending only on the rank of the lattice times a polynomial in the input length.
Title: Improving Event Extraction: Casting a Wider Net
Candidate: Cao, Kai
Advisor(s): Grishman, Ralph
Abstract:
Information extraction is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. One facet of information extraction is event extraction (EE): identifying instances of selected types of events appearing in natural language text. For each instance, EE should identify the type of the event, the event trigger (the word or phrase which evokes the event), the participants in the event, and (where possible) the time and place of the event.
One EE task was defined and intensively studied as part of the ACE (Automatic Content Extraction) research program. The 2005 ACE EE task involved 8 types and 33 subtypes of events. For instance, given the sentence "She was killed by an automobile yesterday.", an EE system should be able to recognize the word "killed" as a trigger for an event of subtype DIE, and discover "an automobile" and "yesterday" as the Agent and Time arguments. This task is quite challenging, as the same event might appear in the form of various trigger expressions and an expression might represent different types of events in different contexts.
To support the development and evaluation of ACE EE systems, the Linguistic Data Consortium annotated a text corpus (consisting primarily of news articles) with information on the events mentioned. This corpus was widely used to train ACE EE systems. However, the event instances in the ACE corpus are not evenly distributed, and so some frequent expressions involving ACE events do not appear in the training data, adversely affecting performance.
The thesis presents several strategies for improving the performance of EE. We first demonstrate the effectiveness of two types of linguistic analysis -- dependency regularization and Abstract Meaning Representation -- in boosting EE performance. Next we show the benefit of an active learning strategy in which a person is asked to judge a limited number of phrases which may be event triggers. Finally we report the impact of combining our baseline system with event patterns from a system developed for a different EE task (the TABARI program). This step contains expert-level patterns generated by other research groups. Because the information received is complicated and quite different from the original corpus (ACE), the integration of this information requires more complex processing.
Title: On the Design of Small Coarse Spaces for Domain Decomposition Algorithms
Author(s): Dohrmann, Clark; Widlund, Olof
Abstract:
Methods are presented for automatically constructing %low-dimensional coarse spaces of low dimension for domain decomposition algorithms. These constructions use equivalence classes of nodes on the interface between the subdomains into which the domain of a given elliptic problem has been subdivided, e.g., by a mesh partitioner such as METIS; these equivalence classes already play a central role in the design, analysis, and programming of many domain decomposition algorithms. The coarse space elements are well defined even for irregular subdomains, are continuous, and well suited for use in two-level or multi-level preconditioners such as overlapping Schwarz algorithms. An analysis for scalar elliptic and linear elasticity problems ms reveals that significant reductions in the coarse space dimension can be achieved while not sacrificing the favorable condition number estimates for larger coarse spaces previously developed. These estimates depend primarily on the Lipschitz parameters of the subdomains. Numerical examples for problems in three dimensions are presented to illustrate the methods and to confirm the analysis. In some of the experiments, the coefficients have large discontinuities across the interface between the subdomains, and in some, the subdomains are generated by mesh partitioners.
Title: Random Growth Models
Candidate: Florescu, Laura
Advisor(s): Spencer, Joel
Abstract:
This work explores variations of randomness in networks, and more specifically, how drastically the dynamics and structure of a network change when a little bit of information is added to "chaos". On one hand, I investigate how much determinism in diffusions de-randomizes the process, and on the other hand, I look at how superposing "planted" information on a random network changes its structure in such a way that the "planted" structure can be recovered.
The first part of the dissertation is concerned with rotor-router walks, a deterministic counterpart to random walk, which is the mathematical model of a path consisting of a succession of random steps. I study and show results on the volume (``the range") of the territory explored by the random rotor-router model, confirming an old prediction of physicists.
The second major part in the dissertation consists of two constrained diffusion problems. The questions in this model are to understand the long-term behavior of the models, as well as how the boundary of the processes evolves in time.
The third part is detecting communities in, or more generally, clustering networks. This is a fundamental problem in mathematics, machine learning, biology and economics, both for its theoretical foundations as well as for its practical implications. This problem can be viewed as "planting" some structure in a random network; for example, in cryptography, a code can be viewed as hiding some integers in a random sequence. For such a model with two communities, I show both information theoretic thresholds when it is impossible to recover the communities based on the density of the edges "planted" between the communities, as well as thresholds for when it is computationally possible to recover the communities.
Title: Zero-knowledge Proofs: Efficient Techniques for Combination Statements and their Applications
Candidate: Ganesh, Chaya
Advisor(s): Dodis, Yevgeniy
Abstract:
Zero-knowledge proofs provide a powerful tool, which allows a prover to convince a verifier that a statement is true without revealing any further information. It is known that every language in NP has a zero knowledge proof system, thus opening up several cryptographic applications. While true in theory, designing proof systems that are efficient to be used in practice remains challenging. The most common and most efficient systems implemented are approaches based on sigma protocols, and approaches based on SNARKs (Succinct Non-interactive ARguments of Knowledge). Each approach has its own advantages and shortcomings, and are suited for certain statements.
While sigma protocols are efficient for algebraic statements, they are expensive for non-algebraic statements. SNARKs, on the other hand, result in short proofs and efficient verification, and are better suited for proving statements about hash functions. But proving an algebraic statement, for instance, knowledge of discrete logarithm, is expensive as the prover needs to perform public-key operations proportional to the size of the circuit.
Recent work achieve zero-knowledge proofs that are efficient for statements phrased as Boolean circuits based on Garbled circuits (GC). This, again, is expensive for large circuits, in addition to being inherently interactive. Thus, SNARKs and GC-based approaches are better suited for non-algebraic statements, and sigma protocols are efficient for algebraic statements.
But in some applications, one is interested in proving combination statements, that is, statements that have both algebraic and non-algebraic components. The state of the art fails to take advantage of the best of all worlds and has to forgo the efficiency of one approach to obtain the other's. In this work, we ask how to efficiently prove a statement that is a combination of algebraic and non-algebraic statements.
We first show how to combine the GC-based approach with sigma protocols. Then, we study how to combine sigma protocol proofs with SNARKs to obtain non-interactive arguments for combination statements. We show applications of our techniques to anonymous credentials, and privacy-preserving protocols on the blockchain. Finally, we study garbled circuits as a primitive and present an efficient way of hashing garbled circuits. We show applications of our hashing technique, including application to GC-based zero-knowledge.
Title: Circuit Complexity: New Techniques and Their Limitations
Candidate: Golovnev, Aleksandr
Advisor(s): Dodis, Yevgeniy; Regev, Oded
Abstract:
We study the problem of proving circuit lower bounds. The strongest known lower bound of 3n-o(n) for an explicit function was proven by Blum in 1984. We prove a lower bound of (3+1/86)n-o(n) for affine dispersers for sublinear dimensions.
We introduce the weighted gate elimination method to give an elementary proof of a 3.11n lower bound for quadratic dispersers. (Although currently there are no explicit constructions of such functions.) Also, we develop a general framework which allows us to turn lower bounds proofs into upper bounds for Circuit SAT algorithms.
Finally, we prove strong limitations of the developed techniques.
Title: Unsupervised Learning Under Uncertainty
Candidate: Mathieu, Michael
Advisor(s): LeCun, Yann
Abstract:
Deep learning, in particular neural networks, achieved remarkable success in the recent years. However, most of it is based on supervised learning, and relies on ever larger datasets, and immense computing power. One step towards general artificial intelligence is to build a model of the world, with enough knowledge to acquire a kind of “common sense”. Representations learned by such a model could be reused in a number of other tasks. It would reduce the requirement for labeled samples and possibly acquire a deeper understanding of the problem. The vast quantities of knowledge required to build common sense preclude the use of supervised learning, and suggest to rely on unsupervised learning instead.
The concept of uncertainty is central to unsupervised learning. The task is usually to learn a complex, multimodal distribution. Density estimation and generative models aim at representing the whole distribution of the data, while predictive learning consists of predicting the state of the world given the context and, more often than not, the prediction is not unique. That may be because the model lacks the capacity or the computing power to make a certain prediction, or because the future depends on parameters that are not part of the observation. Finally, the world can be chaotic of truly stochastic. Representing complex, multimodal continuous distributions with deep neural networks is still an open problem.
In this thesis, we first assess the difficulties of representing probabilities in high dimensional spaces, and review the related work in this domain. We then introduce two methods to address the problem of video prediction, first using a novel form of linearizing auto-encoder and latent variables, and secondly using Generative Adversarial Networks (GANs). We show how GANs can be seen as trainable loss functions to represent uncertainty, then how they can be used to disentangle factors of variation. Finally, we explore a new non-probabilistic framework for GANs.
Title: Deep Learning for Information Extraction
Candidate: Nguyen, Thien Huu
Advisor(s): Grishman, Ralph
Abstract:
The explosion of data has made it crucial to analyze the data and distill important information effectively and efficiently. A significant part of such data is presented in unstructured and free-text documents. This has prompted the development of the techniques for information extraction that allow computers to automatically extract structured information from the natural free-text data. Information extraction is a branch of natural language processing in artificial intelligence that has a wide range of applications, including question answering, knowledge base population, information retrieval etc. The traditional approach for information extraction has mainly involved hand-designing large feature sets (feature engineering) for different information extraction problems, i.e, entity mention detection, relation extraction, coreference resolution, event extraction, and entity linking. This approach is limited by the laborious and expensive effort required for feature engineering for different domains, and suffers from the unseen word/feature problem of natural languages.
This dissertation explores a different approach for information extraction that uses deep learning to automate the representation learning process and generate more effective features. Deep learning is a subfield of machine learning that uses multiple layers of connections to reveal the underlying representations of data. I develop the fundamental deep learning models for information extraction problems and demonstrate their benefits through systematic experiments.
First, I examine word embeddings, a general word representation that is produced by training a deep learning model on a large unlabelled dataset. I introduce methods to use word embeddings to obtain new features that generalize well across domains for relation extraction. This is done for both the feature-based method and the kernel-based method of relation extraction.
Second, I investigate deep learning models for different problems, including entity mention detection, relation extraction and event detection. I develop new mechanisms and network architectures that allow deep learning to model the structures of information extraction problems more effectively. Some extensive experiments are conducted on the domain adaptation and transfer learning settings to highlight the generalization advantage of the deep learning models for information extraction.
Finally, I investigate the joint frameworks to simultaneously solve several information extraction problems and benefit from the inter-dependencies among these problems. I design a novel memory augmented network for deep learning to properly exploit such inter-dependencies. I demonstrate the effectiveness of this network on two important problems of information extraction, i.e, event extraction and entity linking.
Title: Atypical: A type system for live performances
Candidate: Nunes, Gabriel Barbosa
Advisor(s): Panozzo, Daniele; Perlin, Ken
Abstract:
Chalktalk is a visual language based around real-time interaction with virtual objects in a blackboard-style environment. Its aim is to be a presentation and communication tool, using animation and interactivity to allow easy illustration of complex topics or ideas. Among many of the capabilities of these virtual objects is the ability to send data from one object to another via a visual linking system. In this paper, we describe a way of making the link system more robust by adding type information to these links, and compare and contrast the requirements of a presentation-oriented visual language with a more traditional programming language.
Title: Fine-scale Structure Design for 3D Printing
Candidate: Panetta, Francis Julian
Advisor(s): Zorin, Denis
Abstract:
Modern additive fabrication technologies can manufacture shapes whose geometric complexities far exceed what existing computational design tools can analyze or optimize. At the same time, falling costs have placed these fabrication technologies within the average consumer's reach. Especially for inexpert designers, new software tools are needed to take full advantage of 3D printing technology.
My thesis develops such tools and demonstrates the exciting possibilities enabled by fine-tuning objects at the small scales achievable by 3D printing. The thesis applies two high-level ideas to invent these tools: two-scale design and worst-case analysis.
The two-scale design approach addresses the problem that accurately simulating---let alone optimizing---geometry at the full resolution one can print requires orders of magnitude more computational power than currently available. However, we can use periodic homogenization to decompose the design problem into a small-scale problem (designing tileable structures achieving a particular deformation behavior) and a macro-scale problem (deciding where to place these structures in the larger object). We can then design structures for every possible deformation behavior and store them in a database, so that they can be re-used for many different macro-scale design problems.
Worst-case analysis refers to determining how likely an object is to fracture by studying the worst possible scenario: the forces most efficiently breaking it. This analysis is needed when the designer has insufficient knowledge or experience to predict what forces an object will undergo, or when the design is intended for use in many different scenarios unknown a priori.
Title: Isogeometric BDDC Deluxe Preconditioners for Linear Elasticity
Author(s): Pavarino, Luca F.; Scacchi, Simone; Widlund, Olof B.; Zampini, Stefano
Abstract:
Balancing domain decomposition by constraints (BDDC) preconditioners have been shown to provide rapidly convergent preconditioned conjugate gradient methods for solving many of the very ill-conditioned systems of algebraic equations which often arise in finite element approximations of a large variety of problems in continuum mechanics. These algorithms have also been developed successfully for problems arising in isogeometric analysis. In particular, the BDDC deluxe version has proven very successful for problems approximated by non-uniform rational B-splines (NURBS), even those of high order and regularity. One main purpose of this paper is to extend the theory, previously fully developed only for scalar elliptic problems in the plane, to problems of linear elasticity in three dimensions. Numerical experiments supporting the theory, are also reported. Some of these experiments highlight the fact that the development of the theory can help to decrease substantially the dimension of the primal space of the BDDC algorithm, which provides the necessary global component of these preconditioners, while maintaining scalability and good convergence rates.
Title: Accelerating Approximate Simulation with Deep Learning
Candidate: Schlachter, Kristofer
Advisor(s): Perlin, Ken
Abstract:
Once a simulation resorts to an approximate numerical solution one is faced with various tradeoffs in accuracy versus computation time. We propose that another approximate solution can be learned for two chosen simulations, which in our case, are just as useful but can be made faster to compute. The two problems addressed in this thesis are fluid simulation and the simulation of diffuse inter-reflection in computer graphics.
Real-time simulation of fluid and smoke is a long standing problem in computer graphics, where state-of-the-art approaches require large compute resources, making real-time applications often impractical. In this work, we propose a data-driven approach that leverages the approximation power of deep-learning methods with the precision of standard fluid solvers to obtain both fast and highly realistic simulations. The proposed method solves the incompressible Euler equations following the standard operator splitting method in which a large, often ill-condition linear system must be solved. We propose replacing this system by learning a Convolutional Network (ConvNet) from a training set of simulations using a semi-supervised learning method to minimize long-term velocity divergence.
ConvNets are amenable to efficient GPU implementations and, unlike exact iterative solvers, have fixed computational complexity and latency. The proposed hybrid approach restricts the learning task to a linear projection without modeling the well understood advection and body forces. We present real-time 2D and 3D simulations of fluids and smoke; the obtained results are realistic and show good generalization properties to unseen geometry.
The next simulation that we address is the synthesis of images for training convnets. A challenge with training deep learning models is that they commonly require a large corpus of training data and retrieving sufficient real world data may be unachievable. A solution to this problem can be found in the use of synthetic or simulated training data. However, for simulated photographs or renderings, there hasn't been a systematic approach to comparing the relative benefits of different techniques in image synthesis.
We compare multiple synthesis techniques to one another as well as the real data that they seek to replicate. We also introduce learned synthesis techniques that either train models better than the most realistic graphical methods used by standard rendering packages or else approach their fidelity using far less computation. We accomplish this by learning shading of geometry as well as denoising the results of low sample Monte Carlo image synthesis. Our major contributions are (i) a dataset that allows comparison of real and synthetic versions of the same scene, (ii) an augmented data representation that boosts the stability of learning, and (iii) three different partially differentiable rendering techniques where lighting, denoising and shading are learned. Finally we are able to generate datasets that can outperform full global illumination rendering and approach the performance of training on real data.
Title: On the Gaussian Measure Over Lattices
Candidate: Stephens-Davidowitz, Noah
Advisor(s): Dodis, Yevgeniy; Regev, Oded
Abstract:
We study the Gaussian mass of a lattice coset \[ \rho_s(\mathcal{L} - \vec{t}) := \sum_{\vec{y} \in \mathcal{L}} \exp(-\pi \|\vec{y} - \vec{t}\|^2/s^2) \; , \] where \(\mathcal{L} \subset \mathbb{R}^n\) is a lattice and \(\vec{t} \in \mathbb{R}^n\) is a vector describing a shift of the lattice. In particular, we use bounds on this Gaussian mass to obtain a partial converse to Minkowski's celebrated theorem bounding the number of lattice points in a ball.
We also consider the discrete Gaussian distribution \(D_{\mathcal{L} - \vec{t}, s}\) induced by the Gaussian measure over \(\mathcal{L} - \vec{t}\), and we use procedures for sampling from this distribution to construct the current fastest known algorithms for the two most important computation problems over lattices, the Shortest Vector Problem (SVP) and the Closest Vector Problem (CVP).
Finally, we study \(\rho_s(\mathcal{L} - \vec{t})\) and \(D_{\mathcal{L} - \vec{t}, s}\) as interesting computational and mathematical objects in their own right. In particular, we show that the computational problem of sampling from \(D_{\mathcal{L} - \vec{t}, s}\) is equivalent to CVP in a very strong sense (and that sampling from \(D_{\mathcal{L}, s}\) is no harder than SVP). We also prove a number of bounds on the moments of \(D_{\mathcal{L} - \vec{t}, s}\) and various monotonicity properties of \(\rho_s(\mathcal{L} - \vec{t})\).
Title: Inducing Cooperation Through Virtual Reality
Candidate: Zhang, Daniel W.
Advisor(s): Perlin, Ken
Abstract:
There has been a recent resurgence in Virtual Reality (VR) as a new medium for entertainment and communication. With these potentially exciting developments, I decided to create an experiment to test how people can potentially be influenced by virtual reality interfaces. I had hoped that I could induce people to cooperate better in a virtual reality version of a task compared to an un-augmented version of the task in the regular world. After conducting 16 separate trials, with half in VR and the other half in the regular world, there is no conclusive evidence to completely confirm or deny this hypothesis. I have found evidence to suggest that there can be such an influence, as there were more successes in the VR trials than the regular trials, but they can potentially be explained away by the sample size and the attitudes of participants before starting the experiment. This data suggests that further research in this field can lead to interesting discoveries regarding human behavior in virtual reality environments, and that the Holojam framework invented by the Future Reality Lab at New York University can be very helpful in designing experiments for this research.