The Generalizability and Interpretability of Neural Language Models
Speaker: Urvashi Khandelwal, Stanford University
Date: March 10, 2021, 11 a.m.
Host: Kyunghyun Cho
Neural language models (LMs) have become the workhorse of most natural language processing tasks and systems today. Yet, they are not perfect, and the two most important challenges in improving them further are (1) their lack of interpretability, and (2) their inability to generalize consistently in a range of settings. In this talk, I first describe my work on analyzing these black-box LMs to understand how their predictions change in response to strategic changes in inputs. This makes model predictions more transparent by highlighting the features of the input that the model relies on. Then, I describe my work on “Generalization through Memorization” -- exploiting the notion of similarity between examples by using data saved in an external memory and retrieving nearest neighbors from it. This approach improves existing LM and machine translation models in a range of settings, including both in- and out-of-domain generalization, without any added training costs. Beyond improving generalization, memorization also makes model predictions more interpretable.
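The retrieval-and-interpolation idea sketched in the abstract can be illustrated with a small numeric example. The sketch below is an assumption-laden toy, not the speaker's implementation: it stores (context vector, next token) pairs in a datastore, retrieves the k nearest stored contexts to a query, turns their distances into a neighbor distribution over the vocabulary, and interpolates that with the base LM's distribution. All names, the tiny 2-d datastore, and the interpolation weight `lam` are illustrative.

```python
# Toy sketch of nearest-neighbor LM interpolation (assumed form:
# p(y|x) = lam * p_kNN(y|x) + (1 - lam) * p_LM(y|x)).
import numpy as np

def knn_lm_probs(query, keys, values, p_lm, vocab_size, k=3, lam=0.25, temp=1.0):
    """Mix a base LM distribution with a k-nearest-neighbor distribution
    built from a datastore of (context vector, next token) pairs."""
    # Squared Euclidean distance from the query context to every stored key.
    dists = np.sum((keys - query) ** 2, axis=1)
    nn = np.argsort(dists)[:k]            # indices of the k nearest keys
    # Softmax over negative distances weights each retrieved neighbor.
    w = np.exp(-dists[nn] / temp)
    w /= w.sum()
    # Aggregate neighbor weights onto the token each neighbor stored.
    p_knn = np.zeros(vocab_size)
    for idx, weight in zip(nn, w):
        p_knn[values[idx]] += weight
    return lam * p_knn + (1 - lam) * p_lm

# Toy datastore: 4 stored contexts (2-d keys) and their observed next tokens.
keys = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
values = np.array([2, 2, 0, 1])           # next-token ids for each key
p_lm = np.array([0.5, 0.3, 0.2])          # base LM distribution over 3 tokens
p = knn_lm_probs(np.array([0.05, 0.0]), keys, values, p_lm, vocab_size=3)
# The two closest stored contexts both continue with token 2, so the
# interpolated distribution shifts probability toward token 2.
```

Because the datastore is built from saved data rather than learned parameters, it can be swapped for an in-domain or out-of-domain corpus at test time with no retraining, which is the property the abstract highlights; the retrieved neighbors themselves also show which training examples drove a prediction.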
Urvashi Khandelwal is a PhD student in Computer Science at Stanford University, in the Stanford Natural Language Processing (NLP) Group, where she is advised by Professor Dan Jurafsky. She works at the intersection of NLP and machine learning, and is interested in building interpretable systems that can generalize to, and adapt across, a range of settings. Recently, her work was recognized with a Microsoft Research Dissertation Grant.