Homework 5
Assigned Mon Mar 7, due Tue Mar 22
Nesterov's Optimal Method
In class we derived a lower bound for the complexity of gradient
methods on strongly convex functions; see the last 5 pages of
my notes, which are
taken from pp. 66-68 of
Nesterov's book.
Nesterov then derives several methods that attain this bound, but the
derivations are quite involved, so in class we presented only the
simplest scheme, given on the last page of my notes (see also (2.2.11), p. 81 of his book).
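For orientation, the scheme can be sketched in a few lines. The sketch below is in Python rather than Matlab, and it assumes the constant-momentum form of the method (a gradient step from y_k with stepsize 1/M, followed by extrapolation with fixed coefficient beta = (sqrt(M) - sqrt(m))/(sqrt(M) + sqrt(m))); verify this against the version in the notes before using it. The quadratic at the bottom is only an illustrative stand-in, not one of the assigned examples.

```python
import numpy as np

def gradient_method(grad, x0, t, iters):
    """Ordinary gradient method with fixed stepsize t."""
    x = x0.copy()
    for _ in range(iters):
        x = x - t * grad(x)
    return x

def nesterov_optimal(grad, x0, m, M, iters):
    """Nesterov's simplest optimal scheme in its constant-step form
    (an assumption here; check against the notes): gradient step from
    y_k with t = 1/M, then extrapolation with fixed momentum beta."""
    beta = (np.sqrt(M) - np.sqrt(m)) / (np.sqrt(M) + np.sqrt(m))
    x = x0.copy()
    y = x0.copy()
    for _ in range(iters):
        x_next = y - grad(y) / M          # gradient step with t = 1/M
        y = x_next + beta * (x_next - x)  # extrapolation (momentum) step
        x = x_next
    return x

# Illustrative strongly convex quadratic f(x) = 0.5 x'Ax with m = 1, M = 100.
A = np.diag(np.linspace(1.0, 100.0, 50))
grad = lambda x: A @ x
x0 = np.ones(50)
x_gd = gradient_method(grad, x0, 1.0 / 100.0, 200)
x_nes = nesterov_optimal(grad, x0, 1.0, 100.0, 200)
```

The same `gradient_method` routine serves for both assigned stepsizes, t = 1/M and t = 2/(m+M), by changing the argument `t`.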
This assignment asks you to program this optimal method and compare the
results with those of the ordinary gradient method, using two different
fixed stepsizes: t = 1/M (for which we have the bound
on the first page of my notes, from (9.18) in BV) and t = 2/(m+M) (for which we have
the bound on the second page of my notes, from Theorem 2.1.15 in Nesterov's book),
on three examples:
- the quadratic defined by the Hilbert matrix that you used in HW4,
using the same starting point (the vector of all ones) as in HW4
- the example of Exercise 9.30 (BV page 519) that you used in HW4, using
the same starting point (zero) as in HW4
- Nesterov's worst-case example mentioned above, with the given
starting point (zero) and with the order of the tridiagonal matrix truncated to
n = 10,000, with m = 1
and two values of M: M = 100 and M = 10,000. (Recall that Nesterov uses μ and L for
m and M.) Be sure to store the tridiagonal matrix as a sparse matrix
(type help sparse in Matlab) so that matrix-vector multiplications with it
are efficient.
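In Matlab, sparse (together with spdiags) is the natural tool for this. For concreteness, here is a Python/SciPy sketch of one common form of the truncated worst-case function, assuming f(x) = ((M-m)/8)(x'Ax - 2 x_1) + (m/2)||x||^2 with A the tridiagonal matrix having 2 on the diagonal and -1 off it; its Hessian ((M-m)/4)A + mI then has eigenvalues in [m, M). Check the exact constants against the notes.

```python
import numpy as np
from scipy.sparse import diags

def worst_case(n, m, M):
    """Truncated worst-case function (a sketch; verify the constants
    against the notes).  A is the n x n tridiagonal matrix with 2 on
    the diagonal and -1 off it, stored sparsely so A @ x costs O(n)."""
    A = diags([-np.ones(n - 1), 2 * np.ones(n), -np.ones(n - 1)],
              offsets=[-1, 0, 1], format="csr")
    e1 = np.zeros(n)
    e1[0] = 1.0
    c = (M - m) / 4.0
    f = lambda x: (c / 2.0) * (x @ (A @ x) - 2 * x[0]) + (m / 2.0) * (x @ x)
    grad = lambda x: c * (A @ x - e1) + m * x
    return f, grad

# e.g. n = 10,000, m = 1, M = 100, starting from x0 = 0:
f, grad = worst_case(10_000, 1.0, 100.0)
x0 = np.zeros(10_000)
```

Because A is stored in sparse (CSR) format, each matrix-vector product, and hence each gradient evaluation, is O(n) rather than O(n^2), which matters at n = 10,000.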
Summarize your results in whatever way seems most convenient.