Homework 6
Assigned Mon Mar 21, due Fri Apr 1

Subgradients and ADMM

Let f be a convex function. Either
- Prove the Fenchel-Young inequality f(x) + f*(y) ≥ x^Ty and show that it holds with equality if and only if y ∈ ∂f(x), or
- Prove that y ∈ ∂f(x) if and only if y^Td ≤ f′(x;d) for all d ∈ Rⁿ, where f′(x;d) is the ordinary directional derivative of f at x in the direction d
My notes on subgradients and the subdifferential are here.
Implement the ADMM method for the LASSO problem on p.43 of the ADMM paper by Boyd et al. See p.32 for the definition of the soft thresholding operator S. My annotated version of the relevant parts of the Boyd paper is here.
1. First test it on a small problem and check whether you get the right answer using cvx.
2. Then fix A to be a randomly generated large but very sparse matrix via A=sprandn(M,N,density); (sparse-random-normal distribution) with M=100000, N=10000, density=2/M. Take a look at its sparsity pattern by typing spy(A), shg (note that nnz, at the bottom, means number of nonzeros, which you can also display directly with nnz(A)). Set b=randn(M,1); and fix rho (i.e. ρ) to some positive value. Then compute the (sparse) Cholesky factorization LL^T (see p.669 of BV) of A^TA + ρ I via B = A'*A + rho*speye(N); L=chol(B,'lower'); (speye means sparse identity; do not compute the dense identity matrix eye(N)) before the ADMM iteration starts. Take a look at L's sparsity with spy(L). Then, inside the ADMM iteration you can solve the relevant systems with forward and back substitution (Algorithm C.2, p.670 of BV), using the backslash operator \.
For both the small problem and the large problem
- Experiment with λ : does larger λ result in solutions x which are more sparse, as it should?
- Experiment with ρ : what effect does this have on the method?