Towards high-quality machine translation for complex language pairs
Speaker: Arianna Bisazza, University of Amsterdam
Location: Warren Weaver Hall 102
Date: April 21, 2015, 11:30 a.m.
Host: Davi Geiger
In today’s digital societies, human language remains the medium par excellence for sharing information. As more and more data is produced in languages other than English, the need for fast and accurate machine translation keeps growing rapidly. In this talk I will show that state-of-the-art statistical machine translation (SMT) is close to reaching this objective, but only for a few language pairs with few structural (e.g. grammatical) differences, such as French-English. To achieve high SMT quality for many more language pairs, it is time to revise the simplifying assumptions made so far, with special regard to word reordering and morphology. I will give an overview of my work on these two aspects, namely: (i) how can we capture long-range reordering phenomena during translation without sacrificing efficiency, and (ii) how can we ensure the generation of fluent and understandable translations in languages with complex word formation mechanisms? To conclude, I will discuss future research directions to improve translation from English into a diverse range of target languages.
Arianna Bisazza is a postdoctoral researcher in the Information and Language Processing Systems group of the University of Amsterdam. Her work focuses on the statistical modeling of natural languages, with the primary goal of improving the quality of machine translation and speech recognition for challenging languages. Before joining UvA, she obtained her PhD from the University of Trento, Italy, in 2013. She has spent research periods at Microsoft Research and Dublin City University, and has participated in the development of one of the most widely used open-source machine translation platforms.