Abstract
========

Feature learning is becoming an important topic in machine learning, but most studies have focused on developing heuristic methods without proper justification. In this talk I will discuss two application problems we investigated recently that led to simple and effective feature learning algorithms based on sound statistical models. The first is unsupervised learning of super-vector coding for image classification, and the second is supervised learning of regularized decision forests for generic nonlinear prediction problems. Both methods can be regarded as learning nonlinear basis functions for statistical additive models. The first method is mathematically motivated by the computation of the Hellinger distance between two empirical probability distributions in high dimension, and our solution achieves state-of-the-art performance in image classification. The second method is motivated by boosted decision trees, where we apply the modern concept of structured sparsity regularization to design superior algorithms.
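
The abstract mentions the Hellinger distance between empirical distributions as the motivation for super-vector coding but does not define it. As background, here is a minimal sketch of the standard Hellinger distance for discrete distributions, H(p, q) = (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2; this is the textbook definition, not the authors' coding scheme itself:

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Both p and q are nonnegative vectors summing to 1 over the same support.
    The distance lies in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2.0)

# Example usage on two small empirical distributions
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
print(hellinger(p, q))
```

The square-root transform inside the norm is what links this distance to coding schemes: it maps distributions onto the unit sphere, where Euclidean geometry applies.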