Thesis Defense
Sam Roweis
Articulatory Speech Processing
January 13, 1999
California Institute of Technology
Here is a PDF version of the slides.
Index of Slides
Introduction and Motivation
Articulatory Speech Processing
Speech processing problem
Speech recognition
Speech synthesis
Speaker identification
Current approach: feature extraction
Current approach: templates
An engineering objection
New idea: use a model
Linguistics & psychophysics
Is articulatory variability easier to model?
Approach: analysis of real articulatory data
Approach: analysis of real data
X-ray microbeam database
Placement of tracking beads
Example data & simple statistics
Palate estimation
Error level estimation
Graphical interpretation
What are typical mouth shapes?
Typical shapes of the mouth
Most basic manifold model: PCA
Mixtures of linear manifolds
A pseudo-mechanical model
Data de-noising
Filling in known movements
Filling in missing movements
Forward model: from mouth shape to sound
From movements to acoustics
Spectrogram generation
Spectrogram Reconstruction I
Spectrogram Reconstruction II
Resynthesis from spectrograms
Cartoon Joke
Signal energy estimation
Inverse models: from sounds to mouth shape?
Can you hear the shape of the mouth?
Instantaneous mapping is ill-posed (1)
Instantaneous mapping is ill-posed (2)
Instantaneous mapping is ill-posed (3)
Instantaneous mapping is ill-posed (4)
Instantaneous mapping is ill-posed (5)
Impossible cases
Possible case
Million dollar solution
State estimation from entire trajectories
Constructing a simple LDS
Direct Kalman smoothing?
Self-organizing Markov models
A simple game
Noise and repetitions
Traces in smooth maps
Direct application of SOHMM?
A combined approach
Recovery of movements from acoustics (tt)
Recovery of movements from acoustics (td)
Movement Recovery (all beads)
Articulatory recognition: from mouth shapes to words
From movements to words
"Cheating" experiments
Isolated word spotting
Continuous speech
Articulatory speech recognition
Results: one small step...
Articulatory word spotting
Other research groups
Representation is everything
Acknowledgements
Articulatory Signatures: another way to do articulatory recognition
Phase space
Articulatory "signatures"
Signatures for recognition?
Principal Components of Mouth Shapes
Leading Eigenvector of Mouth Shapes
Second Eigenvector of Mouth Shapes
Third Eigenvector of Mouth Shapes
Speaker identification
Speech Processing Background
Some history
Use what you know
Current approach: statistical model
Problem solved?
What hurts current systems?
Examples of variability
Forces driving the technology
Three cool applications
Excitement
Future research areas
A research paradox?
Statistics of natural speech
Mouths and Ears
Speech production organs
Ear Parts
Ear Diagram
Speech perception organs