Marginalization is not Marginal: Non-Convex, Bayesian-Inspired Algorithms for Sparse and Low-Rank Estimation
Speaker: David Wipf, Microsoft Research Beijing
Location: Warren Weaver Hall 1302
Date: April 13, 2016, 11:30 a.m.
Host: Dennis Shasha
Sparse estimation and related matrix rank minimization have emerged as important tools in diverse fields including computational neuroscience, signal/image processing, computer vision, and machine learning. For both computational efficiency and theoretical accessibility, convex algorithms have come to dominate the practical optimization landscape. The typical recipe, which has been iterated in numerous application domains, involves first postulating some putatively ideal, combinatorial cost function favoring sparsity, minimal rank, or both; then forming the tightest convex relaxation; and finally assessing equivalence conditions under which the convex approximation leads to solutions sufficiently close to those of the original non-convex problem.

In particular, the recent popularity of compressive sensing and robust PCA has expedited the acceptance of such convex surrogates, largely because the requisite equivalence conditions can be naturally satisfied using, for example, simple randomized or incoherent designs. But in reality, many practical applications of sparsity and low-rank matrices do not enjoy this luxury; rather, because of intrinsic correlations in the signal dictionary (or related structure in rank minimization problems), convex algorithms must be used in regimes where theoretical support no longer holds. Moreover, in some situations convex relaxations have been shown to be provably bad.

Consequently, non-convex algorithms, while perhaps theoretically less accommodating, may nonetheless produce superior results. Here we will examine non-convex estimation algorithms, many of which originate from Bayesian machine learning ideas, that thrive in environments where more popular convex alternatives fail. In all cases, theoretical model justification will be provided independent of any assumed prior distributions. Illustrative examples related to robust PCA and rank minimization under affine constraints will be considered for evaluation purposes.
Time permitting, connections with deep learning networks will also be discussed.
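To make the contrast in the abstract concrete, the sketch below (a minimal NumPy illustration, not the speaker's algorithm) shows the standard convex surrogate for sparse estimation, the L1-penalized least squares problem solved via iterative soft thresholding (ISTA), alongside one well-known non-convex alternative: iteratively reweighted L1 minimization, which corresponds to a log-sum sparsity penalty and is closely related to Bayesian-inspired schemes. All function names here are illustrative.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the (weighted) L1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam, n_iter=1000):
    """Convex surrogate: min_x 0.5*||Ax - y||^2 + lam*||x||_1 via ISTA."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam / L)
    return x

def reweighted_l1(A, y, lam, eps=1e-3, n_outer=5, n_inner=300):
    """Non-convex alternative: iteratively reweighted L1 (log-sum penalty).
    Each outer step re-solves a weighted L1 problem; large weights on small
    coefficients push them toward exactly zero, reducing the bias of plain L1."""
    L = np.linalg.norm(A, 2) ** 2
    w = np.ones(A.shape[1])                # initial round is ordinary L1
    x = np.zeros(A.shape[1])
    for _ in range(n_outer):
        for _ in range(n_inner):
            x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam * w / L)
        w = 1.0 / (np.abs(x) + eps)        # reweight: penalize small entries more
    return x
```

For incoherent (e.g., random Gaussian) dictionaries both approaches recover sparse signals; the practical interest of non-convex schemes like the reweighted variant is precisely the correlated-dictionary regime the abstract describes, where the plain convex relaxation degrades.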
Refreshments will be offered starting 15 minutes prior to the scheduled start of the talk.