Language variation in Natural Language Processing
Speaker: Houda Bouamor, Carnegie Mellon University in Qatar
Location: 60 Fifth Avenue C15
Date: June 20, 2017, 11:30 a.m.
Host: Oded Regev
Language variation, or the fact that information can be conveyed in a great variety of ways by means of linguistic expressions, is one of the most challenging, and certainly one of the most fascinating, features of language for NLP, with wide applications in language analysis and generation. Such analysis is an extremely complex inferential process that, like recognizing a face or walking, is effortless for humans. My research approach pairs the development of linguistic tools and resources with their use in applications. Language variation can be considered in two scenarios: monolingual and multilingual. In this talk, I will discuss four NLP tasks in which dealing with language richness and variation is crucial: Paraphrasing, Machine Translation, Machine Translation Evaluation for morphologically rich languages, and Dialectal Arabic processing. I will present the approaches I designed to address these tasks.
Dr. Houda Bouamor is a Visiting Assistant Professor of Computer Science at Carnegie Mellon University in Qatar. She has more than two years of experience as a full-time instructor, specializing in teaching large-enrollment introductory courses to under-prepared students as well as higher-level electives. She is a graduate of Paris-Sud University, France, where she worked on paraphrasing for English and French. She obtained an MSc in Computer Science at the Paris-Est Marne-La-Vallee University. She has worked on projects involving data collection and annotation that tackle various NLP problems, such as statistical machine translation, paraphrase acquisition and validation, automatic spelling correction, and machine translation evaluation. Dr. Bouamor has led, and continues to lead, the day-to-day management of several past and ongoing projects funded by QNRF.
Refreshments will be offered starting 15 minutes prior to the scheduled start of the talk.