Assigned Mon Apr 6, due Thurs Apr 23 (any time)
Note: there is no class on Apr 13.
The Nuclear Norm and Semidefinite Programming
- The singular value decomposition of a p by q matrix A
can be defined in several ways, which can lead to confusion. Let's assume
that p ≥ q and, for the moment, that the columns are linearly
independent, so the rank is q. Then the definition in the text on
p.648 is consistent with the "economy-sized" SVD in Matlab, [U,S,V]=svd(A,0).
When the matrix has rank r < q, the definition in the book
gives only the nonzero singular values, but in the more standard definition
there are q-r zero singular values (and corresponding additional columns in U).
Then, the singular values of A are the square roots of the eigenvalues
of the symmetric positive semidefinite q by q matrix
A^T A. (This is shown on p.648; see also p.646.)
Assuming, to avoid confusion, that the rank is q,
define a (p+q) by (p+q) symmetric indefinite matrix B
with zero diagonal blocks and with A and A^T
in the off-diagonal block positions. Show that B has 2q
nonzero eigenvalues, which are plus and minus the singular values of A,
and p-q eigenvalues which are zero.
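The claimed spectrum of B is easy to check numerically. Below is a quick sketch (in Python/NumPy for concreteness; the same experiment is a few lines of Matlab) for a random full-column-rank A:

```python
import numpy as np

# Check the eigenvalue structure of B = [[0, A], [A^T, 0]] for a random
# p x q matrix A with p >= q and full column rank.
rng = np.random.default_rng(0)
p, q = 6, 3
A = rng.standard_normal((p, q))

sigma = np.linalg.svd(A, compute_uv=False)      # the q singular values of A
B = np.block([[np.zeros((p, p)), A],
              [A.T, np.zeros((q, q))]])
eigs = np.sort(np.linalg.eigvalsh(B))

# Expected spectrum: -sigma, then p-q zeros, then +sigma
expected = np.sort(np.concatenate([-sigma, np.zeros(p - q), sigma]))
print(np.allclose(eigs, expected))              # True
```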
- In his second Courant lecture, Candes discussed the
low rank matrix completion problem, where a p by q
matrix (such as the Netflix matrix), which is assumed to be low rank, needs to
be completed from knowledge of only a few of its entries. He explained that
this can be very successfully tackled by solving the following convex optimization
problem: minimize the sum of all q singular values of the matrix
subject to the constraint that the matrix has the correct known entries. The sum of
the singular values is called the nuclear norm (also known as the Schatten
1-norm or the Ky Fan q-norm) of the matrix. It turns out that the nuclear norm
minimization problem can be written as a semidefinite program (SDP):
see the top of p. 9 in
a paper by Candes and Recht. (In order to make the optimal objective equal
to the nuclear norm, I think there needs to be a factor of 1/2 in the objective.)
Making use of Question 1 above, given any X satisfying its constraints,
write down a feasible (W1,W2) satisfying
the semidefinite constraint, namely by setting both to the same multiple of the
identity matrix: how small can you make this multiple? Also, give an example of
a very special X for which this feasible
(W1,W2) actually is the optimal solution.
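To see what is going on (without giving the argument away), note that with W1 = W2 = s*I the block matrix [[W1, X], [X^T, W2]] is just s*I plus the matrix B of Question 1, so Question 1 tells you exactly which s make it positive semidefinite. A sketch of the numerical check, again in Python/NumPy:

```python
import numpy as np

# With W1 = W2 = s*I, the matrix [[W1, X], [X^T, W2]] equals s*I + B,
# with B as in Question 1, so it is positive semidefinite exactly when
# s >= sigma_max(X), the largest singular value of X.
rng = np.random.default_rng(1)
p, q = 5, 3
X = rng.standard_normal((p, q))
s = np.linalg.svd(X, compute_uv=False)[0]       # sigma_max(X)

def block_psd(s):
    M = np.block([[s * np.eye(p), X],
                  [X.T, s * np.eye(q)]])
    return np.linalg.eigvalsh(M).min() >= -1e-10

print(block_psd(s))          # True:  s = sigma_max(X) works
print(block_psd(0.99 * s))   # False: anything smaller fails
```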
- Write this SDP in standard dual form
(max b^T y subject to Σ_k y_k A_k + Z = C, Z positive semidefinite).
What is the number of "dual" variables (the y variables)? How much work will it
be to (a) form and (b) factor (by Cholesky) the "Schur complement" matrix that needs to be
solved in interior point methods? For the simplest method, the "XZ" or
"H..K..M" method, the (i,j) entry of the Schur complement is
tr(A_i X A_j Z^{-1}), using standard
form notation. When figuring out the cost, take into account the
fact that the A_i are very sparse, but X and
Z are dense matrices. Which costs more, forming the Schur complement
matrix or factoring it? (You don't need these to answer this question, but references
for primal-dual methods for SDP include Steve Wright's wonderful book
Primal-Dual Interior-Point Methods (SIAM, 1997, in the library),
a great survey by M.J. Todd and
a paper by me and two coauthors.)
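As a sanity check on the entry formula (not on the cost question, which is yours to work out), here is a sketch in Python/NumPy verifying that tr(A_i X A_j Z^{-1}) can be accumulated by touching only the nonzeros of A_i once M = X A_j Z^{-1} has been formed:

```python
import numpy as np

# The H..K..M Schur complement entry is H_ij = tr(A_i X A_j Z^{-1}).
# With a sparse A_i, the trace needs only the nonzeros of A_i
# once the dense matrix M = X A_j Z^{-1} is available.
rng = np.random.default_rng(2)
n = 6
X = rng.standard_normal((n, n)); X = X @ X.T + n * np.eye(n)  # dense SPD
Z = rng.standard_normal((n, n)); Z = Z @ Z.T + n * np.eye(n)  # dense SPD

Ai = np.zeros((n, n)); Ai[0, 2] = Ai[2, 0] = 1.0   # sparse, symmetric
Aj = np.zeros((n, n)); Aj[1, 1] = 1.0              # sparse, symmetric

M = X @ Aj @ np.linalg.solve(Z, np.eye(n))         # M = X A_j Z^{-1}, dense
dense_entry = np.trace(Ai @ M)
# sparse evaluation: tr(A_i M) = sum over nonzeros (k,l) of A_i[k,l] * M[l,k]
sparse_entry = sum(Ai[k, l] * M[l, k] for k, l in zip(*np.nonzero(Ai)))
print(np.isclose(dense_entry, sparse_entry))       # True
```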
- Write the primal standard form version of the same SDP
(min tr(CX) subject to tr(A_k X) = b_k, X positive semidefinite).
Does this give any insight?
Read the CVX user's guide
sufficiently carefully so that you can write a Matlab function to solve this SDP
using CVX. Note that the beauty of
CVX is that you do not have to express the SDP in standard primal or dual
form; CVX does the conversion for you. Using the examples in
Xdata.mat, minimize the nuclear norm specifying only
a relatively small number of the entries, where you generate the row and column
indexes for these specified entries randomly.
Run successive SDPs with increasing numbers of entries specified
until you successfully recover the matrix. For each of the 3 matrices in the
data file, approximately how many of the entries do you need to specify to be
able to recover the matrix exactly? Average your results over repeated runs
with different row and column index pairs. (Use ceil(k*rand) to
generate a random integer between 1 and k.)
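For the sampling step, the Matlab idiom ceil(k*rand) returns a random integer between 1 and k. A sketch of the equivalent index generation in Python/NumPy (duplicates allowed, as in the recipe above):

```python
import numpy as np

# Draw m random (row, column) pairs for a p x q matrix, each index uniform
# on 1..p (rows) and 1..q (columns), mirroring Matlab's ceil(k*rand).
rng = np.random.default_rng(3)
p, q, m = 8, 5, 12
rows = rng.integers(1, p + 1, size=m)   # random integers in 1..p
cols = rng.integers(1, q + 1, size=m)   # random integers in 1..q
pairs = list(zip(rows.tolist(), cols.tolist()))
print(all(1 <= i <= p and 1 <= j <= q for i, j in pairs))  # True
```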
- For randomly generated square matrices X of order q and
fixed rank 3, do a more systematic investigation of how the number of required
specified entries (to recover the matrix) and the time required by CVX
(use cputime) varies with q.
Plot the results. How big can you make
q and still solve one of these SDPs in less than 5 minutes?
- For LP, in a primal-dual interior point method, once (Δx, Δy, Δz)
has been computed by solving the Newton linear system, one updates x by
x + α Δx, where α = min(1, τ t), where τ is a little
smaller than one, e.g. 0.999, and t
is the "step to the boundary", the maximum s
such that x + s Δx ≥ 0. Note that x > 0, so if Δx is also
nonnegative, t = ∞ and so α = 1. (In this situation it is possibly true,
as I said confidently in class, that the LP is unbounded, but I was confusing this with
the simplex method: we cannot conclude that here. In the simplest primal-dual interior point methods,
unboundedness of the iterates is detected simply by their growing very large; unboundedness of
the primal implies dual infeasibility. The reason for the maximum step of 1 is that the method
is based on Newton's method. Furthermore, if a step of 1 is ever taken, the iterates
are primal feasible for all subsequent iterations.)
Write down a simple "ratio test" algorithm for computing t.
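One possible ratio test, sketched in Python/NumPy (your write-up can of course be in pseudocode or Matlab): only the components with Δx_i < 0 limit the step, each at s = -x_i/Δx_i, and t is the smallest such ratio.

```python
import numpy as np

# Step to the boundary for LP: the largest s with x + s*dx >= 0.
# Components with dx_i >= 0 never limit the step; each dx_i < 0 limits it
# at s = -x_i/dx_i, so t is the minimum of those ratios (inf if none).
def step_to_boundary(x, dx):
    neg = dx < 0
    if not np.any(neg):
        return np.inf
    return np.min(-x[neg] / dx[neg])

x = np.array([1.0, 2.0, 0.5])
dx = np.array([1.0, -4.0, -0.25])
print(step_to_boundary(x, dx))           # 0.5 (limited by the second component)
print(step_to_boundary(x, np.abs(dx)))   # inf (no negative component)
```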
- In SDP, the step to the boundary is the maximum s such that X + s ΔX ≥ 0
in the positive semidefinite sense. Note that X > 0 in the positive definite sense,
so if ΔX is positive semidefinite, again we have t = ∞ and so α = 1.
Write down a simple algorithm for computing t. Hint: use the Cholesky
factorization of X and compute the eigenvalues of a certain matrix.
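One way to follow the hint, sketched in Python/NumPy: with X = L L^T, we have X + s ΔX ≥ 0 iff I + s L^{-1} ΔX L^{-T} ≥ 0, so t is determined by the smallest eigenvalue of L^{-1} ΔX L^{-T}.

```python
import numpy as np

# Step to the boundary for SDP: the largest s with X + s*dX psd.
# With X = L L^T, congruence by L^{-1} gives I + s * M, M = L^{-1} dX L^{-T},
# so t = -1/lambda_min(M) when lambda_min(M) < 0, and t = inf otherwise.
def psd_step_to_boundary(X, dX):
    L = np.linalg.cholesky(X)
    Linv_dX = np.linalg.solve(L, dX)         # L^{-1} dX
    M = np.linalg.solve(L, Linv_dX.T)        # L^{-1} dX L^{-T} (dX symmetric)
    lam_min = np.linalg.eigvalsh(M).min()
    return np.inf if lam_min >= 0 else -1.0 / lam_min

rng = np.random.default_rng(4)
n = 4
G = rng.standard_normal((n, n))
X = G @ G.T + n * np.eye(n)                  # positive definite
H = rng.standard_normal((n, n))
dX_neg = -(H @ H.T + np.eye(n))              # negative definite direction

t = psd_step_to_boundary(X, dX_neg)          # finite: boundary is hit
# at s = t, the matrix X + s*dX_neg has smallest eigenvalue 0
print(np.isclose(np.linalg.eigvalsh(X + t * dX_neg).min(), 0.0, atol=1e-8))
print(psd_step_to_boundary(X, X))            # inf: a psd direction never limits
```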