Eugene Weinstein's Home Page
I am a software engineer in Google's Android speech group. I spend most of my time building acoustic models
for the 40+ languages and dialects supported by Google Voice Search and
Voice Input, and working on various related research and infrastructure
projects. I have been at Google since finishing my Ph.D. in
the Computer Science Department of the Courant Institute at NYU, where I worked
with Prof. Mehryar Mohri. Prior to starting at NYU, I was at MIT for
nine years, first as an undergraduate, then as a master's student, and
then as a research staff member at the Computer Science and Artificial
Intelligence Laboratory (CSAIL).
Contact Info
Email: eugenew at-sign cs dot nyu dot edu
Snail-Mail:
Eugene Weinstein
Google, Inc.
76 Ninth Ave.
New York, NY 10011
Current and Past Research Interests
Research and Teaching Activities
- I am teaching CSCI-GA.3033-015 Speech Recognition in Fall 2013.
- I have defended my dissertation! [PDF] [Slides PDF]
- I gave a talk at Columbia's Electrical Engineering Signal and Information Processing Series on Friday, April 3rd, 2009.
- I was the TA for Data Structures, Fall 2008: recitation programs, also linked from class page (click on "Notes and Code")
- I was the TA for Data Structures, Spring 2008: recitation programs, class page.
- I gave a talk at the NYU Music Department's Music Technology Program on April 4th, 2008.
- I was the TA for Intro to Computer Science, Fall 2007: link1, link2.
- I ran the weekly Machine Learning Seminar at NYU.
- I gave a talk at MIT on Oct 19th, 2007: [Slides PDF].
- Read about a project that I'm working on in the Google Research Blog.
- I passed my depth qualifying exams. Here are my slides: syllabus and research.
Resume (2009)
[PDF]
Past Projects (at MIT)
Publications
Not up-to-date: please see this page for recent publications.
- M. Mohri, P. Moreno, and E. Weinstein. Discriminative topic
segmentation of text and speech. To appear at the International
Conference on Artificial Intelligence and Statistics (AISTATS), Sardinia, Italy,
2010. [PDF]
- M. Mohri, P. Moreno, and E. Weinstein. A new quality measure for topic segmentation of text and speech. Conference of the International Speech Communication Association (Interspeech), Brighton, UK, 2009. [PDF] [Poster PDF]
- M. Mohri, P. Moreno, and E. Weinstein. Efficient and robust music identification with weighted finite-state transducers. IEEE Transactions on Audio, Speech, and Language Processing 18(1), 2010. [PDF]
- M. Mohri, P. Moreno, and E. Weinstein. General Suffix Automaton Construction Algorithm and Space Bounds. Theoretical Computer Science 410(37), 2009. [PDF]
- M. Mohri, P. Moreno, and E. Weinstein. Robust music identification, detection, and analysis. International Conference on Music Information Retrieval (ISMIR), September 2007, Vienna, Austria. [PDF] [Poster PDF].
- M. Mohri, P. Moreno, and E. Weinstein. Factor Automata of Automata
and Applications. International Conference on Implementation and Application of Automata (CIAA), July 2007, Prague,
Czech Republic. [PDF] [Slides PDF]
- E. Weinstein, K. Steele, A. Agarwal, and J. Glass. LOUD: A
1020-Node Microphone Array and Acoustic Beamformer. International Congress on Sound and Vibration (ICSV), July 2007, Cairns,
Australia. [PDF] [Slides PDF]
- E. Weinstein and P. Moreno. Music Identification with Weighted
Finite-State Transducers. International Conference on Acoustics,
Speech, and Signal Processing (ICASSP) 2007, April 2007, Honolulu,
Hawaii. [PDF]
- T. J. Hazen, E. Weinstein, B. Heisele, A. Park, and J. Ming.
Multi-Modal Face and Speaker Identification for Mobile Devices. In R. I. Hammoud, B. Abidi, and M. Abidi, eds., Face Biometrics for Personal Identification: Multi-Sensory Multi-Modal Systems, Springer-Verlag, Heidelberg, Germany, April 2007. [PDF]
- E. Weinstein, K. Steele, A. Agarwal, and J. Glass.
LOUD: A 1020-Node Modular Microphone Array and Beamformer for
Intelligent Computing Spaces. MIT/LCS Technical Memo MIT-LCS-TM-642,
April 2004. [PDF]
- J. Glass, E. Weinstein, S. Cyphers, J. Polifroni, G. Chung, and
M. Nakano. A Framework for Developing Conversational User Interfaces.
In Proceedings, Computer-Aided Design of User Interfaces (CADUI) 2004,
January 2004, pp. 347-358, Funchal, Isle of Madeira, Portugal. [PDF] [PS] [PPT slides]
- T. Hazen, E. Weinstein, R. Kabir, A. Park, and B. Heisele.
Multi-Modal Face and Speaker Identification on a Handheld Device. In
Proceedings, Workshop on Multimodal User Authentication (MMUA) 2003,
pp. 113-120, December 2003, Santa Barbara, CA, USA. [PDF] [PS]
- T. Hazen, E. Weinstein, and A. Park. Towards Robust Person
Recognition On Handheld Devices Using Face and Speaker Identification
Technologies. In Proceedings, International Conference on Multimodal
Interfaces (ICMI) 2003, November 2003, Vancouver, BC, Canada. [PDF]
- E. Weinstein, P. Ho, B. Heisele, T. Poggio, K. Steele, and A. Agarwal.
Handheld Face Identification Technology in a Pervasive Computing
Environment. In Short Paper Proceedings, International Conference on
Pervasive Computing (Pervasive) 2002, pp. 48-54, August 2002,
Zurich, Switzerland. [PS] [PDF] [CiteSeer]
- J. Glass and E. Weinstein. SPEECHBUILDER: Facilitating Spoken
Dialogue Systems Development. In Proceedings, European Conference on
Speech Communication and Technology (Eurospeech) 2001, pp. 1335-1338, September 2001, Aalborg,
Denmark. [PDF] [CiteSeer]
- E. Weinstein. SpeechBuilder: Facilitating Spoken Dialogue System
Development. M.Eng. thesis, MIT Department of Electrical Engineering and
Computer Science, May 2001. [Gzipped PS] [PDF] [CiteSeer]
Presentations
- Summary of Support Vector Method for Novelty Detection (Schölkopf et al., NIPS 2000) [PDF]
- Introduction to Topic Models [PDF]
- Summary of Discriminative Log-Linear Grammars with Latent Variables, Petrov and Klein (NIPS 2007) [PDF]
- Summary of Search-based
Structured Prediction, by Hal Daume, John Langford, and
Daniel Marcu (submitted to Machine Learning) [PDF]
- Summary of "Discriminative Training of Decoding Graphs for Large
Vocabulary Continuous Speech Recognition", by Kuo, Kingsbury, and Zweig, ICASSP, 2007. [PDF]
- Summary of "Discriminative Log-Linear Grammars with Latent
Variables", by Slav Petrov and Dan Klein, NIPS, 2007. [PDF]
- Summary of "Search-Based Structured Prediction", by Hal Daume et al., submitted to Machine Learning, 2007. [PDF]
- Summary of "Support Vector Machine Learning for Interdependent and Structured Output Spaces", by Ioannis Tsochantaridis et al., ICML, 2004. [PDF]
- Summary of "Large Margin Training of Continuous-density Hidden
Markov Models," Fei Sha and Lawrence Saul, NIPS 2006. [PDF]
- Expectation-Maximization algorithm and applications [PDF]
- Survey of Query-by-Humming (QBH) [PDF]
- Summary of "Computer Vision for Music Identification," Y. Ke, D. Hoiem, R. Sukthankar, Computer Vision and Pattern Recognition (CVPR), 2005. [PDF]
- Summary of "Robust Real-Time Face Detection," P. Viola and M. Jones,
International Journal of Computer Vision 57(2), 2004. [PDF]
- Summary of "The Population Frequencies of Species and the Estimation of
Population Parameters," I. J. Good, Biometrika, Vol. 40, No. 3/4 (Dec. 1953), pp. 237-264. [PDF]
Links