Assigned Mon Mar 21, due Fri Apr 1
Subgradients and ADMM
- Let f be a convex function. (My notes on subgradients and the subdifferential are here.) Either
  - prove the Fenchel–Young inequality f(x) + f*(y) ≥ xᵀy,
    and show that it holds with equality if and only if y ∈ ∂f(x), or
  - prove that y ∈ ∂f(x) if and only if yᵀd ≤ f′(x; d) for all d ∈ Rⁿ, where f′(x; d) is the ordinary directional derivative of f at x in the direction d.
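For reference, both options use the standard definition of the subdifferential of a convex function f at x:

```latex
\[
\partial f(x) = \left\{\, y \in \mathbb{R}^n \;:\; f(z) \ge f(x) + y^{\mathsf{T}}(z - x) \ \text{for all } z \in \mathbb{R}^n \,\right\}.
\]
```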
- Implement the ADMM method for the LASSO problem on p.43 of the ADMM paper by Boyd et al.
See p.32 for the definition of the soft thresholding operator S.
My annotated version of the relevant parts of the Boyd paper is here.
Apply your implementation to both a small problem and a large problem:
- First test it on a small problem and check that you get the right
  answer by comparing with cvx.
- Then fix A to be a randomly generated large but very sparse matrix via
A=sprandn(M,N,density); (sparse-random-normal distribution)
with M=100000, N=10000, density=2/M. Take a look at its sparsity pattern
by typing spy(A), shg (note that nnz, at the bottom, means number of nonzeros, which you can
also display directly with nnz(A)). Set b=randn(M,1); and
fix rho (i.e. ρ) to some positive value. Then compute the (sparse) Cholesky factorization
LLᵀ (see p.669 of BV) of AᵀA + ρI via
B = A'*A + rho*speye(N); L=chol(B,'lower');
(speye means sparse identity; do not compute the dense identity matrix eye(N))
before the ADMM iteration starts. Take a look at L's sparsity with spy(L).
Then, inside the ADMM iteration you can solve the
relevant systems with forward and back substitution (Algorithm C.2, p.670 of BV),
using the backslash operator \.
- Experiment with λ: does a larger λ result in solutions
x that are more sparse, as it should?
- Experiment with ρ: what effect does it have on the method?
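The small-problem check against cvx can be sketched as follows; the problem sizes, λ, and use of `sum_square` are illustrative choices, not prescribed by the assignment (this assumes cvx is installed and on your MATLAB path).

```matlab
% Small LASSO instance solved with cvx, for checking your ADMM answer.
% minimize (1/2)||Ax - b||^2 + lambda*||x||_1
m = 50; n = 20;                 % illustrative small dimensions
A = randn(m, n); b = randn(m, 1);
lambda = 1;                     % illustrative value
cvx_begin quiet
    variable x(n)
    minimize( 0.5*sum_square(A*x - b) + lambda*norm(x, 1) )
cvx_end
% Compare this x (and the optimal value cvx_optval) with the output
% of your ADMM code on the same A, b, lambda.
```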
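The large-problem setup and iteration above can be sketched as below, following the LASSO section of the Boyd et al. paper; the iteration limit, tolerances, and the helper name `soft_threshold` are my own illustrative choices, and you should adapt the stopping test to whatever the paper recommends.

```matlab
% LASSO via ADMM: minimize (1/2)||Ax - b||^2 + lambda*||x||_1
M = 100000; N = 10000; density = 2/M;
A = sprandn(M, N, density);     % large, very sparse random matrix
b = randn(M, 1);
lambda = 1; rho = 1;            % illustrative values; experiment with both

% Factor A'A + rho*I once, BEFORE the iteration (sparse Cholesky, LL')
B = A'*A + rho*speye(N);
L = chol(B, 'lower');
Atb = A'*b;

x = zeros(N, 1); z = zeros(N, 1); u = zeros(N, 1);  % u is the scaled dual
for k = 1:100
    % x-update: solve (A'A + rho*I) x = A'b + rho(z - u)
    % via forward then back substitution with the cached factor L
    q = Atb + rho*(z - u);
    x = L' \ (L \ q);
    % z-update: soft thresholding S_{lambda/rho}
    zold = z;
    z = soft_threshold(x + u, lambda/rho);
    % dual update
    u = u + x - z;
    if norm(x - z) < 1e-4 && rho*norm(z - zold) < 1e-4
        break                   % crude stopping test; illustrative only
    end
end

function z = soft_threshold(a, kappa)
% Soft thresholding operator S_kappa(a) (defined on p.32 of the paper)
z = max(0, a - kappa) - max(0, -a - kappa);
end
```

Since L is lower triangular, the two backslash solves cost only forward and back substitution; MATLAB's backslash detects the triangular structure automatically.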