Homework 5
Assigned Mon Apr 6, due Thurs Apr 23 (any time)

Note: there is no class on Apr 13.

The Nuclear Norm and Semidefinite Programming
1. The singular value decomposition of a p by q matrix A can be defined in several ways, which can lead to confusion. Let's assume that p ≥ q and, for the moment, that the columns are linearly independent, so the rank is q. Then the definition in the text on p.648 is consistent with the "economy-sized" SVD in Matlab, [U,S,V]=svd(A,0). When the matrix has rank r < q, the definition in the book defines only nonzero singular values, but in the more standard definition there are q-r zero singular values (and additional columns in U and V). Then the singular values of A are the square roots of the eigenvalues of the symmetric positive semidefinite q by q matrix AᵀA. (This is shown on p.648; see also p.646.) Assuming, to avoid confusion, that the rank is q, define a (p+q) by (p+q) symmetric indefinite matrix B with zero diagonal blocks and with A and Aᵀ in the off-diagonal block positions. Show that B has 2q nonzero eigenvalues, which are plus and minus the singular values of A, and p-q eigenvalues which are zero.
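A quick numerical check of this relationship (a sketch in Python/NumPy rather than Matlab; the sizes here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 5, 3
A = rng.standard_normal((p, q))          # full column rank with probability 1

# B = [[0, A], [A^T, 0]], a (p+q) by (p+q) symmetric indefinite matrix
B = np.block([[np.zeros((p, p)), A],
              [A.T, np.zeros((q, q))]])

sigma = np.linalg.svd(A, compute_uv=False)   # the q singular values of A
eigs = np.sort(np.linalg.eigvalsh(B))        # the p+q eigenvalues of B

# Eigenvalues of B: +sigma_i, -sigma_i, and p-q zeros
expected = np.sort(np.concatenate([sigma, -sigma, np.zeros(p - q)]))
print(np.allclose(eigs, expected))           # True
```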

2. In his second Courant lecture, Candes discussed the low rank matrix completion problem, where a p by q matrix (such as the Netflix matrix), which is assumed to be low rank, needs to be completed from knowledge of only a few of its entries. He explained that this can be tackled very successfully by solving the following convex optimization problem: minimize the sum of all q singular values of the matrix subject to the constraint that the matrix has the correct known entries. The sum of the singular values is called the nuclear norm (or the Schatten norm, or the Ky Fan norm) of the matrix. It turns out that the nuclear norm minimization problem can be written as a semidefinite program (SDP): see the top of p. 9 in a paper by Candes and Recht. (In order to make the optimal objective equal to the nuclear norm, I think there needs to be a factor of 1/2 in the objective.) Making use of Question 1 above, given any X satisfying its constraints, write down a feasible (W1,W2) satisfying the semidefinite constraint, namely by setting both to the same multiple of the identity matrix: how small can you make this multiple? Also, give an example of a very special X for which this feasible (W1,W2) actually is the optimal solution.
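For reference, the SDP in question (as I recall it from the Candes-Recht paper, with the factor of 1/2 included so that the optimal value equals the nuclear norm; M denotes the matrix with the known entries):

```latex
\begin{aligned}
\min_{X,\,W_1,\,W_2}\quad & \tfrac12\bigl(\operatorname{tr} W_1 + \operatorname{tr} W_2\bigr)\\
\text{s.t.}\quad & \begin{bmatrix} W_1 & X \\ X^T & W_2 \end{bmatrix} \succeq 0,\\
& X_{ij} = M_{ij} \ \text{for the known entries } (i,j).
\end{aligned}
```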

3. Write this SDP in standard dual form (max bᵀy subject to Σₖ yₖAₖ + Z = C, Z positive semidefinite). What is the number of "dual" variables (the y variables)? How much work will it be to (a) form and (b) factor (by Cholesky) the "Schur complement" matrix that needs to be solved in interior point methods? For the simplest method, the "XZ" or "H..K..M" method, the (i,j) entry of the Schur complement is tr(AᵢXAⱼZ⁻¹), using standard form notation. When figuring out the cost, take into account the fact that the Aᵢ are very sparse, but X and Z are dense matrices. Which costs more, forming the Schur complement matrix or factoring it? (You don't need these to answer this question, but references for primal-dual methods for SDP include Steve Wright's wonderful book Primal-Dual Interior-Point Methods (SIAM, 1997, in the library), a great survey by M.J. Todd and a paper by me and two coauthors.)
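A small NumPy illustration of why the sparsity of the Aᵢ matters when forming the Schur complement: when each Aₖ has only a couple of nonzero entries (as for completion-type constraints; the index pairs below are hypothetical, not from the actual SDP), each entry tr(AᵢXAⱼZ⁻¹) collapses to a handful of lookups in X and Z⁻¹ rather than dense matrix products.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 4                                # matrix order and number of constraints

# Each sparse A_k = e_r e_c^T + e_c e_r^T (two nonzeros); (r, c) pairs are illustrative
pairs = [(0, 1), (2, 3), (1, 4), (0, 5)]
A = []
for r, c in pairs:
    Ak = np.zeros((n, n)); Ak[r, c] = Ak[c, r] = 1.0
    A.append(Ak)

# Dense X and Z, as in an interior point method; S = Z^{-1}
G = rng.standard_normal((n, n)); X = G @ G.T + n * np.eye(n)
H = rng.standard_normal((n, n)); Z = H @ H.T + n * np.eye(n)
S = np.linalg.inv(Z)

# Naive formation: dense products, O(n^3) work per entry
M_naive = np.array([[np.trace(Ai @ X @ Aj @ S) for Aj in A] for Ai in A])

# Exploiting sparsity: each entry is a sum of four scalar lookups, O(1) per entry
M_fast = np.zeros((m, m))
for i, (a, b) in enumerate(pairs):
    for j, (c, d) in enumerate(pairs):
        M_fast[i, j] = X[b, c]*S[d, a] + X[b, d]*S[c, a] + X[a, c]*S[d, b] + X[a, d]*S[c, b]

print(np.allclose(M_naive, M_fast))        # True
```

Note that the Cholesky factorization of the resulting m by m Schur complement still costs O(m³) regardless of how cheaply it was formed, which is the trade-off the question asks about.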

4. Write the primal standard form version of the same SDP (min tr(CX) subject to tr(AₖX) = bₖ, X positive semidefinite). Does this give any insight?

5. Read the CVX user's guide sufficiently carefully that you can write a Matlab function to solve this SDP using CVX. Note that the beauty of CVX is that you do not have to express the SDP in standard primal or dual form; CVX does the conversion for you. Using the examples in Xdata.mat, minimize the nuclear norm specifying only a relatively small number of the entries, where you generate the row and column indices for these specified entries randomly. Run successive SDPs with increasing numbers of entries specified until you successfully recover the matrix. For each of the 3 matrices in the data file, approximately how many of the entries do you need to specify to be able to recover the matrix exactly? Average your results over repeated runs with different row and column index pairs. (Use ceil(k*rand) to generate a random integer between 1 and k.)

6. For randomly generated square matrices X of order q and fixed rank 3, do a more systematic investigation of how the number of specified entries required to recover the matrix, and the time required by CVX (use cputime), vary with q. Plot the results. How big can you make q and still solve one of these SDPs in less than 5 minutes?

7. For LP, in a primal-dual interior point method, once (Δx, Δy, Δz) has been computed by solving the Newton linear system, one updates x by x + αΔx, where α = min(1, τt), where τ is a little smaller than one, e.g. 0.999, and t is the "step to the boundary", the maximum s such that x + sΔx ≥ 0. Note that x > 0, so if Δx is also positive, t = ∞ and so α = 1. (In this situation it is possibly true, as I said confidently in class, that the LP is unbounded, but I was confusing this with the simplex method: we cannot conclude that here. In the simplest primal-dual interior point methods, unboundedness of the iterates is detected simply by their growing very large; unboundedness of the primal implies dual infeasibility. The reason for the maximum step of 1 is that the method is based on Newton's method. Furthermore, if a step of 1 is ever taken, the iterates are primal feasible for all subsequent iterations.)

Write down a simple "ratio test" algorithm for computing t.
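One standard way to do this ratio test looks like the following (a NumPy sketch, vectorized rather than written as a loop; compare your answer):

```python
import numpy as np

def step_to_boundary(x, dx):
    """Largest s with x + s*dx >= 0, given x > 0 componentwise (inf if dx >= 0)."""
    neg = dx < 0
    if not np.any(neg):
        return np.inf                       # no component ever hits the boundary
    return np.min(x[neg] / -dx[neg])        # ratio test over decreasing components

x = np.array([1.0, 2.0, 0.5])
dx = np.array([-2.0, 1.0, -0.25])
t = step_to_boundary(x, dx)                 # min(1/2, 0.5/0.25) = 0.5
alpha = min(1.0, 0.999 * t)                 # damped step with tau = 0.999
print(t, alpha)                             # 0.5 0.4995
```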

8. In SDP, the step to the boundary is the maximum s such that X + sΔX ≥ 0 in the positive semidefinite sense. Note that X > 0 in the positive definite sense, so if ΔX is positive semidefinite, again we have t = ∞ and so α = 1. Write down a simple algorithm for computing t. Hint: use the Cholesky factorization of X and compute the eigenvalues of a certain matrix.
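The hint can be turned into code along the following lines (a NumPy sketch; the matrices here are small random examples):

```python
import numpy as np

def step_to_boundary_sdp(X, dX):
    """Largest s with X + s*dX positive semidefinite, given X positive definite."""
    L = np.linalg.cholesky(X)                 # X = L L^T
    # X + s*dX >= 0  <=>  I + s*W >= 0, where W = L^{-1} dX L^{-T}
    W = np.linalg.solve(L, np.linalg.solve(L, dX).T).T
    lam_min = np.linalg.eigvalsh(W).min()     # need 1 + s*lam >= 0 for every eigenvalue lam
    return np.inf if lam_min >= 0 else 1.0 / (-lam_min)

rng = np.random.default_rng(3)
n = 5
G = rng.standard_normal((n, n)); X = G @ G.T + n * np.eye(n)   # positive definite X
dX = rng.standard_normal((n, n)); dX = (dX + dX.T) / 2          # symmetric direction
t = step_to_boundary_sdp(X, dX)
# At the boundary step, X + t*dX should be singular: smallest eigenvalue ~ 0
print(np.linalg.eigvalsh(X + t * dX).min())
```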