Prof. Ralph Grishman
60 Fifth Avenue [Forbes Building], Room 300
Fall office hours: Mondays, 2:30-3:30 PM
(generally also available Mondays 11:00-1:00 and Thursdays 11:00-12:00;
please send an email in advance if possible)
An introductory class in natural language processing.
Good programming skills.
Knowledge of Java.
We assume that you are familiar with the main language analysis components of NLP.
This course is intended to give you hands-on experience in assembling these
components into an integrated NLP system. The system we will build will be
for knowledge-base construction: converting an (unstructured) text corpus
into a (structured) database.
We will begin by reading a few recent papers on each component in order to understand
the design alternatives. Each student will select one component to implement,
delivering a minimal functional version after 3 weeks and a higher-performing one
after 6 weeks. These will be initially connected in a pipeline, most likely with
modest performance. The minimal version needs to be in Java; later versions
may use deep learning frameworks such as Theano or TensorFlow.
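
As a rough illustration of what "connected in a pipeline" could look like in Java, here is a minimal sketch. It is not the course's actual framework: the Document, Component, CapitalizedMentionTagger, and WorksForExtractor names are hypothetical, and the two toy stages stand in for real components such as entity tagging and relation extraction.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineSketch {

    /** Shared state passed down the pipeline: raw text plus accumulated annotations. */
    static class Document {
        final String text;
        final List<String> mentions = new ArrayList<>(); // filled by the mention stage
        final List<String> facts = new ArrayList<>();    // filled by the relation stage
        Document(String text) { this.text = text; }
    }

    /** Hypothetical common interface that each component could implement. */
    interface Component {
        void process(Document doc);
    }

    /** Toy mention detector: treats every capitalized token as an entity mention. */
    static class CapitalizedMentionTagger implements Component {
        public void process(Document doc) {
            for (String token : doc.text.split("[^A-Za-z]+")) {
                if (!token.isEmpty() && Character.isUpperCase(token.charAt(0))) {
                    doc.mentions.add(token);
                }
            }
        }
    }

    /**
     * Toy relation extractor: if the text contains " works for " and at least
     * two mentions were found, links the first two mentions in one fact.
     */
    static class WorksForExtractor implements Component {
        public void process(Document doc) {
            if (doc.mentions.size() >= 2 && doc.text.contains(" works for ")) {
                doc.facts.add("worksFor(" + doc.mentions.get(0) + ", "
                        + doc.mentions.get(1) + ")");
            }
        }
    }

    /** Runs each stage in order over one document and returns the extracted facts. */
    static List<String> run(String text) {
        Document doc = new Document(text);
        List<Component> pipeline =
                List.of(new CapitalizedMentionTagger(), new WorksForExtractor());
        for (Component stage : pipeline) {
            stage.process(doc);
        }
        return doc.facts;
    }

    public static void main(String[] args) {
        // prints [worksFor(Alice, Acme)]
        System.out.println(run("Alice works for Acme."));
    }
}
```

Because every stage only reads and writes the shared Document, a baseline component can later be swapped for a higher-performing one without touching the rest of the pipeline, assuming the annotations it produces keep the same form.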
We will then consider how to improve performance through domain modeling and joint
inference, in particular using Markov Logic Networks. In the final weeks each
student will conduct an experiment to improve some aspect of KBC performance
through domain modeling, joint inference, or deep learning.
deliver baseline component [brief class presentation] (due: class 4)
enhance your component [brief class presentation] (due: class 7)
Markov Logic Network assignment (due: class 9)
conduct experiment involving joint inference, deep learning, or domain
modeling (due: class 13; class presentation week 14)
(2) class presentations of papers
brief paper summaries
Because much of the work will be presented in class, assignment deadlines
will be strictly enforced. There is no final exam.
There is no text for the course; all readings will be made available on-line.
As a general reference for most of the topics covered in the course we
recommend Jurafsky and Martin's Speech and Language Processing.