Theses & Reports
Advances in computer bridge: techniques for a partial-information, communication-based game.
Candidate: Bethe, Paul
Advisor(s): Ernest Davis
Bridge is an imperfect information game with elements of competition
against opponents as well as cooperation with a partner. Despite the
application of many tenets of artificial intelligence, humans have yet
to be consistently bested by the computer. This thesis explores AI
shortcomings in both the play and bidding phases of the game. In the
play, we examine weaknesses in cutting-edge Monte Carlo techniques and
explore both inference-based and learning-based solutions. In the
bidding, we go beyond existing rule-based systems and investigate deep
reinforcement learning as a method for learning to bid.
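The Monte Carlo techniques mentioned above rest on determinization: sampling hidden hands consistent with what a player has seen, then scoring each candidate play across the samples. A minimal sketch of that idea, on a one-trick toy game rather than the thesis's bridge engine (all names and the game itself are illustrative assumptions):

```python
import random

# Toy determinized Monte Carlo card play. We hold two cards and must
# beat a single hidden opponent card; we sample possible opponent
# holdings consistent with the cards already seen, then score each of
# our candidate plays across the sampled "worlds".

def sample_hidden_hand(deck, seen, rng):
    """Deal the opponent one card uniformly from the unseen deck."""
    unseen = [c for c in deck if c not in seen]
    return rng.choice(unseen)

def monte_carlo_play(our_hand, deck, seen, n_samples=1000, seed=0):
    rng = random.Random(seed)
    scores = {card: 0 for card in our_hand}
    for _ in range(n_samples):
        opp = sample_hidden_hand(deck, seen | set(our_hand), rng)
        for card in our_hand:
            scores[card] += 1 if card > opp else 0  # does this card win?
    # Pick the card that wins in the most sampled worlds
    # (breaking ties toward the cheaper card).
    return max(our_hand, key=lambda c: (scores[c], -c))

deck = list(range(1, 14))   # ranks encoded 1..13
our_hand = [5, 11]          # we hold a 5 and a jack
seen = {13}                 # the ace has already been played
best = monte_carlo_play(our_hand, deck, seen)
```

Real bridge engines sample full 13-card layouts under bidding-derived constraints and solve each sampled deal with a double-dummy solver, but the control flow is the same.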
Learning Causality in Molecular Biology
Candidate: Cirrone, Jacopo
Advisor(s): Dennis Shasha
The Systems Biology community has invested a great deal of effort in
modeling gene regulatory networks that should be able to (i) accurately
predict future states and (ii) identify regulatory hubs that can be
manipulated to achieve desired phenotypes. Most computational tools for
this problem embody linear models (e.g., 5*TF1 + 2*TF2 - 0.4*TF3 + ...).
However, it is well known that biological interactions are highly
synergistic and non-linear. Further, those tools mostly try to directly
predict networks even when the discovered edges (which usually come from
some assay such as ChIP-seq) may have little physiological significance
(e.g., may not influence gene expression).
This thesis considers an alternative approach to inferring gene
causality. Specifically, we consider the problem of predicting the
expression of genes at a future time point t in a genomic time series. In
this, we follow the philosophy that accurate prediction often
corresponds to a good understanding of causality.
The prediction may rest on several sources of data: the time point
immediately preceding t, the entire target time series preceding t, and
ancillary data. In biology, for example, the ancillary data may consist
of a network based on binding data, data from different time series,
steady state data, a community-blessed gold standard network, or some
combination of those. We introduce OutPredict, a machine learning
method for time series that incorporates ancillary steady-state and
network data to achieve low error in gene expression prediction.
We show that OutPredict outperforms several of the best state-of-the-art
methods for prediction. The predictive models that OutPredict builds in
turn yield a causal network.
Thus, this thesis presents an approach to the inference of causality
built on predicting out-of-sample time points from both steady-state
and time-series data. Because the model for each gene identifies
those transcription factors that have the most importance in prediction,
those important transcription factors are the most likely causal
elements for that gene. We validate those predictions for a set of
well-documented transcription factors in Arabidopsis.
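The core idea — fit a non-linear model of gene(t+1) from transcription-factor (TF) levels at time t, then read candidate regulators off the model's feature importances — can be sketched as follows. This is an illustrative assumption, not OutPredict itself (which uses random forests); a tiny k-nearest-neighbour regressor stands in as the non-linear model, and importance is measured by permutation:

```python
import random

# Rank TFs by permutation importance: the more shuffling a TF's values
# hurts held-out prediction of the gene, the more likely that TF is a
# causal regulator of the gene.

def knn_predict(train_X, train_y, x, k=3):
    """Non-linear regressor: average the k nearest training targets."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(train_X[i], x)))
    return sum(train_y[i] for i in order[:k]) / k

def mse(train_X, train_y, test_X, test_y):
    return sum((knn_predict(train_X, train_y, x) - y) ** 2
               for x, y in zip(test_X, test_y)) / len(test_y)

def permutation_importance(train_X, train_y, test_X, test_y, tf, rng):
    """Error increase on held-out data when TF column `tf` is shuffled."""
    shuffled = [row[:] for row in test_X]
    col = [row[tf] for row in shuffled]
    rng.shuffle(col)
    for row, v in zip(shuffled, col):
        row[tf] = v
    return (mse(train_X, train_y, shuffled, test_y)
            - mse(train_X, train_y, test_X, test_y))

rng = random.Random(0)
# Synthetic series: gene(t+1) = tf0(t)^2 (non-linear); tf1 is noise.
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [tf0 ** 2 for tf0, _ in X]
train_X, test_X = X[:150], X[150:]
train_y, test_y = y[:150], y[150:]
imp = [permutation_importance(train_X, train_y, test_X, test_y, tf, rng)
       for tf in (0, 1)]
# imp[0] (the true regulator) should dominate imp[1] (noise).
```

A linear model of the kind the abstract criticizes would miss the tf0² dependence entirely, which is the motivation for non-linear predictors here.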
Because our methods apply to any situation in which there is time series
data, ancillary data, and the need for non-linear causal models, we
believe that this work will have broad appeal to the scientific
community, particularly to those studying causality networks.
Responsibility Analysis by Abstract Interpretation
Candidate: Deng, Chaoqiang
Advisor(s): Patrick Cousot
Given a behavior of interest, automatically determining the corresponding responsible entity (that is, the root cause) is a task of critical importance in various scientific fields, especially in static program analysis. Classical static analysis techniques (e.g. dependency analysis, taint analysis, slicing) assist programmers in narrowing down the scope of responsibility, but none of them can explicitly identify the responsible entity. Meanwhile, causality analysis is generally not well suited to analyzing programs, and the structural equations model (SEM) of actual causality misses information inherent in programs (e.g. temporal ordering, and whether an entity is free to make choices), making the corresponding program analysis imprecise.
In this dissertation, inspired by a classic forest fire example used in defining causality, we propose a novel definition of responsibility based on the abstraction of trace semantics, which is expressive and generic enough to cope both with program analyses and with tasks in other scientific fields. Briefly, an action aR is responsible for a behavior B in a certain trace if and only if aR is free to make choices, and such a choice is the first one that ensures the occurrence of B in that trace. This definition makes use of information about the temporal ordering of actions, as well as whether an action has free choices. In addition, our definition of responsibility takes into account the cognizance of the observer, which, to the best of our knowledge, is a novel idea in program analysis. Compared to current dependency and causality analysis methods, responsibility analysis is demonstrated to be more precise on many examples.
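On a finite system the concrete definition can be checked directly. The sketch below is an illustrative assumption (the encoding and names are not the dissertation's): a system is the set of its maximal traces, a behavior B is a predicate on traces, and the responsible action along an observed trace is the first free action whose choice guarantees B:

```python
# An action at position i of the observed trace is responsible for B
# iff it is free and it is the FIRST action whose choice ensures B:
# every maximal trace extending observed[:i+1] satisfies B, while
# some maximal trace extending observed[:i] does not.

def ensures(traces, prefix, B):
    """Do all maximal traces extending `prefix` satisfy behavior B?"""
    ext = [t for t in traces if t[:len(prefix)] == prefix]
    return bool(ext) and all(B(t) for t in ext)

def responsible(traces, observed, free, B):
    for i, action in enumerate(observed):
        if (free(action)
                and not ensures(traces, observed[:i], B)
                and ensures(traces, observed[:i + 1], B)):
            return action
    return None  # B was never ensured by a free choice on this trace

# Forest-fire-style example: two agents in turn choose to drop a match
# ("m1"/"m2") or not ("n1"/"n2"); the fire starts if any match drops.
traces = [["m1", "m2"], ["m1", "n2"], ["n1", "m2"], ["n1", "n2"]]
fire = lambda t: "m1" in t or "m2" in t
free = lambda a: True  # both agents choose freely
blame = responsible(traces, ["m1", "m2"], free, fire)  # first dropper
```

Note how temporal ordering matters: along the trace `["m1", "m2"]` only the first match-dropper is blamed, whereas along `["n1", "m2"]` responsibility shifts to the second agent. The abstract analysis described next over-approximates exactly this computation on infinite-state programs.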
Furthermore, this dissertation proposes a sound framework for abstract responsibility analysis, which allows a balance between cost and precision when tackling the undecidability of exact responsibility. Essentially, the abstract analysis builds a trace partitioning automaton by iterating an over-approximating forward reachability analysis with trace partitioning and an under-/over-approximating backward impossible failure accessibility analysis, and determines the bounds of potentially responsible entities along paths in the automaton. Unlike the concrete responsibility analysis, which identifies exactly one action as the responsible entity along every concrete trace, the abstract analysis may lose precision and find multiple actions potentially responsible along each automaton path. However, soundness is preserved: every entity responsible in the concrete is guaranteed to also be found responsible in the abstract.
Enhancing Collaboration and Productivity for Virtual and Augmented Reality
Candidate: He, Zhenyi
Advisor(s): Ken Perlin
Immersive environments such as Virtual Reality (VR) and Augmented Reality (AR) are receiving more and more attention. Although VR and AR have largely been used for individual entertainment experiences, they also hold great potential as platforms for supporting collaboration and productivity. My thesis work is concerned with enabling VR/AR to be flexibly adapted for collaborative and productive uses. I approach this goal from several directions: a new haptic user interface based on actuated robots that bridges the virtual and physical worlds, a reconfigurable framework for both co-located and geographically dispersed multi-user communication, and a text entry system in which users type by tapping their fingers, without needing to look at their hands or be aware of their hand positions. Further, I extend these ideas to an everyday video conferencing experience that requires minimal hardware.
Neural Structured Prediction using Iterative Refinement with Applications to Text and Molecule Generation
Candidate: Mansimov, Elman
Advisor(s): Kyunghyun Cho
Humans excel at generating structured data in the form of images, text, speech, molecules, computer code, and more. Researchers have spent several decades proposing solutions for the effective generation of these structured objects in a data-driven way, a problem known as structured prediction. With the revival of deep neural networks, autoregressive models that process structured objects in a fixed left-to-right monotonic order became the de facto solution for this problem. Notable successes of autoregressive models include neural machine translation [Sutskever et al., 2014, Bahdanau et al., 2014, Vaswani et al., 2017], open-ended text generation [Radford et al., 2019, Brown et al., 2020], and text-to-speech synthesis [van den Oord et al., 2016], among many others.
Despite the considerable success of autoregressive models on many applications, a natural question arises: are alternative approaches to structured prediction possible? This thesis describes a novel method for structured prediction based on the principle of iterative refinement, with a particular focus on applications to text and molecule generation. We first introduce the iterative refinement framework for text generation. Starting from a blank sentence, the iterative refinement approach gradually refines the text over multiple steps. Using this approach, we show that we can flexibly generate text in various ways, such as generating all or some words in parallel, or generating text according to an ordering learned from the data. We show that iterative refinement achieves performance competitive with autoregressive models while delivering a speedup in decoding. We conclude this thesis by showing how the iterative refinement framework, originally introduced for text generation, can be adapted to molecule generation. In particular, we demonstrate two iterative refinement approaches: one for molecular graph generation and one for molecular geometry prediction. We anticipate that models based on iterative refinement will be broadly applicable to other domains of interest.
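The control flow of refining from a blank sentence — predict every position in parallel, then repeatedly re-mask and re-predict the least confident positions — can be sketched in a few lines. This is a hedged illustration only: the toy bigram table below stands in for the thesis's neural model, and the masking schedule is one simple choice among many.

```python
MASK = "_"

def predict(tokens, model, i):
    """Toy 'model': score candidates for position i from its left
    neighbour via a bigram table (a stand-in for a neural network)."""
    left = tokens[i - 1] if i > 0 else "<s>"
    dist = model.get(left, model["<s>"])
    word = max(dist, key=dist.get)
    return word, dist[word]

def iterative_refine(length, model, steps=3):
    tokens = [MASK] * length
    for step in range(steps):
        # Predict ALL positions in parallel from the current sentence.
        preds = [predict(tokens, model, i) for i in range(length)]
        confs = []
        for i, (word, conf) in enumerate(preds):
            tokens[i] = word
            confs.append((conf, i))
        # Re-mask a shrinking fraction of the least confident positions.
        n_mask = length * (steps - 1 - step) // steps
        for _, i in sorted(confs)[:n_mask]:
            tokens[i] = MASK
    return tokens

# Toy bigram model over a tiny pet-phrase "corpus".
model = {
    "<s>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.8, "ran": 0.2},
    MASK: {"the": 0.5, "cat": 0.5},
}
sentence = iterative_refine(3, model)
```

Because every position is predicted in parallel at each step, decoding takes a constant number of passes instead of one pass per token, which is the source of the speedup the abstract mentions.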
Order and Learning in Sequential Neural Structured Prediction
Candidate: Welleck, Sean
Advisor(s): Kyunghyun Cho
Structured objects such as sets, trees, and sequences appear in a variety of scientific and industrial domains. Developing machine learning methods that generate these objects is of interest for both scientific understanding and practical applications. One approach, sequential neural structured prediction, decomposes generation into a sequence of predictions, with each prediction made by a deep neural network. Choosing an appropriate sequential representation of each structured object and selecting an effective learning objective are key to adopting this approach. The standard method for learning specifies a canonical ordering of elements in the sequential representation and maximizes the likelihood of the resulting sequences. We develop two streams of research that explore alternatives to this fixed-order, maximum likelihood approach for sequentially generating sets, trees, and sequences, with a focus on natural language processing applications.
First, we focus on text generation and study degenerate properties of fixed-order maximum-likelihood learning, motivating new learning methods. We characterize the degeneracy using three properties observed in generated text: non-termination, logical incoherence, and repetition. To study non-termination, we develop theory that allows us to prove that conventional text generation methods can generate infinite-length sequences with high probability. To study logical incoherence, we create a dataset for investigating the degree to which a model logically contradicts its preceding statements. To reduce degeneration, we develop unlikelihood training, a learning method that penalizes task-specific textual properties. In the second part of the thesis, we remove the requirement of a fixed generation order with a learning framework called non-monotonic generation, which yields models that select input-dependent generation orders. We use non-monotonic generation to generate multisets, parse trees, and text. The investigations and techniques presented in this thesis lead to promising directions for future work.
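The unlikelihood objective mentioned above combines the usual likelihood term for the gold next token with a penalty on probability mass assigned to a set of negative candidates (e.g. tokens already repeated in the context). A minimal sketch, with the caveat that the toy logits and token indices are illustrative assumptions rather than anything from the thesis:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def unlikelihood_loss(logits, target, negatives, alpha=1.0):
    """-log p(target) - alpha * sum over negatives of log(1 - p(c))."""
    probs = softmax(logits)
    likelihood = -math.log(probs[target])
    # Push the probability of negative candidates (e.g. repeats) to 0.
    unlikelihood = -sum(math.log(1.0 - probs[c]) for c in negatives)
    return likelihood + alpha * unlikelihood

logits = [2.0, 1.0, 0.5, 0.0]   # toy next-token scores over 4 tokens
loss_plain = unlikelihood_loss(logits, target=0, negatives=[])
loss_repeat = unlikelihood_loss(logits, target=0, negatives=[1])
# Listing token 1 as a negative candidate strictly increases the loss,
# so training pressure moves probability mass away from repeats.
```

With `negatives` empty the loss reduces to standard maximum-likelihood cross-entropy, which is why unlikelihood training can be applied as a drop-in modification of the fixed-order objective.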