Assigned Mon Mar 23, due Thu Apr 2 (any time)

Please do the following:

Ground rules for homework: you can get help from any source (friends, relatives, books, the web), but you must write up the solutions yourself.

Generally, homework will not be accepted late unless there is a good reason!

- Write a Matlab **function** to implement the Gradient Method (equivalently, Steepest Descent in the 2-norm). The first line should be **function [x,f,iters] = gradmeth(fun, x0, tol, maxit)**, as in the template gradmeth.m. The first input parameter **fun** is an "anonymous function", which means that calls to **fun** inside **gradmeth** will actually be to some function whose name is unknown until **gradmeth** is called; this is the function that **gradmeth** is supposed to minimize. The second input parameter is the starting point, the third is the tolerance on the norm of the gradient, and the fourth is the maximum number of iterations allowed. The **gradmeth** code should include the backtracking line search described on p. 464. You can use two nested **while** loops (the outer one for the main iteration, the inner one for the backtracking line search), or you can code the backtracking line search in a separate function, since we will also need it for Newton's method below. Use α = 1/4, β = 1/2. Test my template on the simple quadratic function coded in quad.m by calling the script example.m, and make sure you understand how all this works before replacing the dummy code in the template with your own code. (If you are using Matlab 6 or earlier, or a clone that does not support anonymous functions, you can achieve the same effect with a little more trouble using **feval**.)
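The assignment asks for Matlab; as a rough illustration of the same logic, here is a Python/NumPy sketch of a gradient method with the backtracking line search (α = 1/4, β = 1/2). The names mirror the template, and the test quadratic here is made up for illustration, not the one in quad.m:

```python
import numpy as np

def gradmeth(fun, x0, tol, maxit):
    """Gradient method with backtracking line search (alpha=1/4, beta=1/2).

    fun(x) must return the pair (value, gradient)."""
    x = np.asarray(x0, dtype=float)
    f, g = fun(x)
    iters = 0
    while np.linalg.norm(g) > tol and iters < maxit:
        t = 1.0
        # Backtracking: shrink t until the sufficient-decrease condition
        # f(x - t*g) <= f(x) - alpha * t * ||g||^2 holds, with alpha = 1/4.
        while fun(x - t * g)[0] > f - 0.25 * t * (g @ g):
            t *= 0.5                       # beta = 1/2
        x = x - t * g
        f, g = fun(x)
        iters += 1
    return x, f, iters

# Made-up test quadratic (not the one in quad.m): f(x) = 0.5 x'Ax + b'x.
A = np.array([[2.0, 0.0], [0.0, 10.0]])
b = np.array([1.0, -1.0])
quad = lambda x: (0.5 * x @ (A @ x) + b @ x, A @ x + b)

x, f, iters = gradmeth(quad, np.zeros(2), 1e-8, 1000)
# The exact minimizer is -A\b in Matlab notation, i.e. -inv(A) @ b.
```

The anonymous-function idea carries over directly: `quad` here plays the role of the Matlab anonymous function handle passed into **gradmeth**.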

- Suppose you run your code for 1000 iterations. Let **p*** denote the minimum value of **fun**. By what factor does the theory that we discussed last lecture (see Section 9.3) predict that the ratio **(fun(x)-p*)/(fun(x0)-p*)** will be reduced, in terms of **m** and **M**, the smallest and largest eigenvalues of the Hessian of **fun**? How does this compare with what the code actually computes? You can compute the eigenvalues of the Hessian with **eig(A)**, and the minimal value **p*** by evaluating **fun** at the exact minimizer **-A\b**. (Type "help mldivide" at the Matlab prompt if you are not familiar with the crucial "matrix left divide" operator "\".)
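The Section 9.3 bound is easy to evaluate numerically. A Python/NumPy sketch (the A and b are illustrative, not the course's quad.m data; the constant used below, c = 1 - m/M, is the exact-line-search version of the bound, and Section 9.3 gives an analogous constant for backtracking):

```python
import numpy as np

# Illustrative quadratic f(x) = 0.5 x'Ax + b'x; its Hessian is A.
A = np.array([[2.0, 0.0], [0.0, 10.0]])
b = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ (A @ x) + b @ x

eigs = np.linalg.eigvalsh(A)        # Matlab: eig(A)
m, M = eigs[0], eigs[-1]            # smallest and largest eigenvalues
xstar = -np.linalg.solve(A, b)      # Matlab: -A\b, the exact minimizer
pstar = f(xstar)                    # minimum value p*

k = 1000
# Section 9.3 (exact line search) predicts
#   (f(x_k) - p*) / (f(x_0) - p*) <= (1 - m/M)^k.
predicted = (1 - m / M) ** k
```

Comparing `predicted` against the reduction your code actually achieves shows how pessimistic the worst-case theory is on a given problem.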

- Write another function **newtmeth** that implements Newton's method in the same way. Now the function to be minimized must return a third output argument, the Hessian **H**. The Newton code must solve a system of linear equations, which could be done using "\" or "inv", but **chol** (Cholesky factorization) is far preferable, as it exploits the symmetry and positive definiteness of **H**. Follow the factorization with two triangular solves using "\" (the implementation of "\" checks whether the left-hand side is triangular, or even a permuted triangular matrix, and exploits this with a solve that is only O(n^2) work). Newton's method minimizes a quadratic function in one step, so, once your code appears to be working, apply it to the problem in Exercise 9.30 (page 519). (Ignore the instructions in parts (a) and (b).) Although in principle writing the code for computing **g** and **H** is straightforward, in practice it is easy to get this wrong. Therefore, **check your derivative formulas against explicit finite difference quotients** (of the function for **g**, of the gradient for **H**) before proceeding. The size of the difference should be in the range 1e-6 to 1e-8; if you make it too small, rounding errors will make the finite difference quotients meaningless. For the data defining the problem, set **n** to 100, **m** to 50, and take the vectors **a_i** as the columns of the matrix saved in Adata.mat (type **load Adata** at the prompt). (Note the two different meanings of **m** in this homework!)
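Two pieces of this are worth sketching: the Cholesky-based Newton step and the finite-difference gradient check. A hedged Python/NumPy sketch (the quadratic objective below is a stand-in for testing, not the Exercise 9.30 function; the Matlab equivalents appear in comments):

```python
import numpy as np

def newton_step(g, H):
    # Matlab: R = chol(H); dx = -(R \ (R' \ g)) -- two triangular solves.
    L = np.linalg.cholesky(H)          # H = L @ L.T, symmetric positive definite
    y = np.linalg.solve(L, g)          # forward substitution with L
    return -np.linalg.solve(L.T, y)    # back substitution with L'

def max_gradient_error(fun, grad, x, h=1e-7):
    """Largest discrepancy between grad(x) and forward differences of fun,
    using a step h in the recommended 1e-6 to 1e-8 range."""
    g_fd = np.array([(fun(x + h * e) - fun(x)) / h for e in np.eye(len(x))])
    return np.max(np.abs(grad(x) - g_fd))

# Stand-in objective for testing (a quadratic, NOT the Exercise 9.30 one):
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
fun  = lambda x: 0.5 * x @ (A @ x) + b @ x
grad = lambda x: A @ x + b
hess = lambda x: A

x0 = np.zeros(2)
dx = newton_step(grad(x0), hess(x0))
# Newton's method minimizes a quadratic in one step: x0 + dx equals -A\b.
```

The same `max_gradient_error` pattern, applied to the gradient instead of the function, checks the Hessian formula.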

- Using x0=0, how many iterations of Newton's method are required to
minimize this function to approximately the limits of the machine precision
(say what you mean by this)? How does this compare with the
upper bound on the number of Newton iterations from the classical
Newton theory that we discussed last week? You can estimate **m** and **M** by computing them using **eig** at x0, at the minimizer, and perhaps at some other points, and you know **p*** after successfully minimizing the function. How about the upper bound given by the self-concordance theory that we will discuss in class today?
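One reasonable meaning of "to the limits of machine precision" is to iterate until the predicted decrease λ²/2 (half the squared Newton decrement) drops below the machine epsilon, since beyond that point f cannot decrease detectably in double precision. A Python sketch on a made-up 1-D strictly convex function (not the Exercise 9.30 objective, where pure Newton would also need the backtracking safeguard):

```python
import numpy as np

# Made-up 1-D strictly convex test: f(x) = x^2/2 + log(cosh x), minimum at 0.
f = lambda x: 0.5 * x**2 + np.log(np.cosh(x))
g = lambda x: x + np.tanh(x)                 # f'
H = lambda x: 1.0 + 1.0 / np.cosh(x)**2     # f'' = 1 + sech^2(x) >= 1

x = 3.0                                      # x0
iters = 0
eps = np.finfo(float).eps                    # about 2.2e-16
# Stop when lambda^2 / 2 = g^2 / (2 H) <= eps: the predicted decrease in f
# is below machine precision, so further iterations change nothing visible.
while g(x)**2 / H(x) > 2 * eps and iters < 50:
    x = x - g(x) / H(x)                      # pure Newton step
    iters += 1
```

Because of quadratic convergence, `iters` comes out tiny (a handful of steps), which is the number to compare against the much larger classical and self-concordance upper bounds.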

- Using **semilogy**, plot the norm of the gradient and the Newton decrement as functions of the iteration count (add output arguments to **newtmeth** returning these quantities as vectors so you don't need to plot inside **newtmeth**). Use **legend**, **title**, **xlabel**, **ylabel** to label the plot nicely. Do you observe quadratic convergence?
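Matlab's **semilogy**, **legend**, **title**, **xlabel**, and **ylabel** all have direct matplotlib analogues. A Python sketch of the history-then-plot pattern, on a made-up 1-D convex function (the plotting section is guarded so the script still runs where matplotlib is unavailable):

```python
import numpy as np

# Made-up 1-D convex test: f(x) = x^2/2 + log(cosh x), minimized at 0.
g = lambda x: x + np.tanh(x)                 # gradient
H = lambda x: 1.0 + 1.0 / np.cosh(x)**2     # Hessian

x = 3.0
gnorms, decrements = [], []
for _ in range(8):                           # record history per iteration
    gnorms.append(abs(g(x)))
    decrements.append(abs(g(x)) / np.sqrt(H(x)))   # Newton decrement lambda
    x = x - g(x) / H(x)                      # Newton step

try:                                         # plotting, Matlab-style labels
    import matplotlib
    matplotlib.use("Agg")                    # render without a display
    import matplotlib.pyplot as plt
    plt.semilogy(gnorms, label="||grad f||")
    plt.semilogy(decrements, label="Newton decrement")
    plt.legend()
    plt.title("Newton's method convergence")
    plt.xlabel("iteration")
    plt.ylabel("magnitude (log scale)")
    plt.savefig("newton_convergence.png")
except ImportError:
    pass
```

On a semilog plot, quadratic convergence shows up as a curve that bends downward ever more steeply: the number of correct digits roughly doubles each iteration.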