Unsupervised Feature Learning
Speaker: Honglak Lee, Stanford University
Location: Warren Weaver Hall 201
Date: April 20, 2010, 11:30 a.m.
Host: Richard Cole
Machine learning has proved a powerful tool for artificial intelligence and data mining problems. However, its success has usually relied on having a good feature representation of the data, and a poor representation can severely limit the performance of learning algorithms. These feature representations are often hand-designed, require significant amounts of domain knowledge and human labor, and do not generalize well to new domains. To address these issues, I will present machine learning algorithms that can automatically learn good feature representations from unlabeled data in various domains, such as images, audio, text, and robotic sensors. Specifically, I will first describe how "sparse coding" algorithms, which represent each input example using a small number of basis vectors, can be used to learn good low-level representations from unlabeled data. I will also show that these representations yield improved performance in object recognition, audio classification, text classification, and 3D point cloud classification. In addition, I will present an algorithm for building hierarchical representations, in which more complex features are automatically learned as compositions of simpler ones. When applied to images, this method automatically learns features that correspond to objects and to decompositions of objects into object parts. These features often lead to performance competitive with or better than highly hand-engineered computer vision algorithms in object recognition and image segmentation tasks. Further, the same algorithm can be used to learn feature representations from audio data. Here, the learned features yield improved performance over state-of-the-art methods in several speech recognition tasks, such as speaker identification, phone classification, and gender classification.
Honglak Lee is a Ph.D. candidate in the Computer Science Department at Stanford University, where he is advised by Andrew Ng. His research interests include machine learning, artificial intelligence, and data mining. He received the ICML 2009 best application paper award and the CEAS 2005 best student paper award. Honglak graduated with a B.Sc. in Physics and Computer Science from Seoul National University in Korea. He is a recipient of the Korea Foundation of Advanced Studies Fellowship and the Stanford Graduate Fellowship.
Refreshments will be offered starting 15 minutes prior to the scheduled start of the talk.