Homework 1: 30 pts

DUE: APRIL 16 2007

NOTE: DON’T FORGET TO SEND ME SUMMARY AND DIARY FILE ALONG WITH THE OTHER FILES THAT I ASKED IN EXERCISES....

Please send me only files that I asked ...

Introduction

This homework is concerned with interpolating data with polynomials. There is a difference between interpolating and approximating. Interpolating means that the function passes through all the given points little regard to the points between them. Approximating means only that the function is close to all the given points. Although it might seem that interpolating is always better than approximating, this is not true.

Polynomials are commonly used to interpolate data. There are several methods for defining, constructing, and evaluating the polynomial of a given degree that passes through data values. One common way is to solve for the coefficients of the polynomial by setting up a linear system called Vandermonde's equation. A second way is to represent the polynomial as the sum of simple Lagrange polynomials. We will see that interpolation does not necessarily mean good approximation and that polynomial interpolation can break down with a fatal case of ``the wiggles.''

Notation

This homework is focussed on finding functions that pass through certain specified points. In customary notation, you are given a set of points $\{(x_k,y_k)\vert k=1,2,\ldots,n\}$ and you want to find a function $f(x)$ that passes through each point ( $f(x_k)=y_k$ for $k=1,2,\ldots,n$ ). The given points are variously called the ``given'' points, the ``specified'' points, the ``data'' points, etc. In this homework, we will use the term ``data points.''

Written in the customary notation, it is easy to see that the quantities $x$ and $x_k$ are essentially different. In Matlab, without kerning and font differentiation, it can be difficult to keep the various quantities straight. In this lab, we will use the name xval to denote the values of $x$ and the names xdata and ydata to denote the data $x_k$ and $y_k$ . The variable names fval or yval are used for the value $f(x)$ to emphasize that it is the value of the function approximating $y$ .

Matlab Tips

At some point in this lab, you will need to determine if a quantity A is not equal to a quantity B. The Matlab syntax for this is

  if A ~= B

Sometimes it is convenient to enclose the logical expression in parentheses, but this is not required.

If you have a (square) linear system of equations, of the form

  A * x = b

where A is an N by N matrix, and b and x are column vectors, then Matlab can solve the linear system either by using the expression:

  x = inv(A)*b

or, better, via the ``backslash'' command:

  x = A \ b

The backslash command is strange looking. You might remember it by noting that the matrix A appears underneath the backslash. The difference between the two commands is merely that the backslash command is about three times faster because it solves the equation without constructing the inverse matrix. You will not notice a difference in this lab, but you might see one later if you use Matlab for more complicated projects.

The backslash command is actually more general than just multiplication by the inverse. It can find solutions when the matrix A is singular or when it is not a square matrix!

Matlab has several commands for dealing with polynomials:

poly(r)

Finds the coefficients of the polynomial whose roots are given by the vector r.

roots(c)

Finds the roots of the polynomial whose coefficients are given by the vector c.

polyfit( xdata, ydata, n)

Finds the coefficients of the polynomial of degree n passing through the points xdata(k),ydata(k), for $\texttt{k}=1,2,\dots,K$ , where $K$ need not be near to n.

polyval( c, xvals)

Evaluates a polynomial with coefficients given by the vector c at the values xvals(k) for $k=1,2,\dots,K$ for $K\ge 1$ .

When there are more data values than the minimum, the polyfit function returns the coefficients of a polynomial that ``best fits'' the given values in the sense of least squares. This polynomial approximates, but does not necessarily interpolate, the data. In this lab, you will be writing m-files with functions similar to polyfit but that generate polynomials of the precise degree determined by the number of data points. (n=length(xdata)-1.

The vector of coefficients c of a polynomial in Matlab are, by convention, defined as

$\displaystyle p(x)=\sum_{k=1}^N c_k x^{N-k}$

(1)

Beware: When a vector of numbers, c, of length N, is regarded as coefficients of a polynomial, the subscripts run backwards! c(1) is the coefficient of the term with degree N-1, and the constant term is c(N).

In this and some later labs, you will be writing m-files with functions analogous to polyfit and polyval, using several different methods. Rather than following the matlab naming convention, functions with the prefix coef_ will generate a vector of coefficients, as by polyfit, and functions with the prefix eval_ will evaluate a polynomial (or other function) at values of xval, as with polyval.

We will be using Matlab functions to construct a known polynomial and using it to generate ``data'' values. Then we use our interpolation functions to recover the original, known, polynomial. This strategy is a powerful tool for illustrative and debugging purposes, but practical use of interpolation starts from arbitrary data, not contrived data.

The Polynomial through Given Data

We discussed Newton's method last semester, which can be derived by determining a linear polynomial (degree 1) that passes through the point $(a,f_a)$ with derivative $f'_a$ . That is $p(a)=f_a$ and $p'(a)=dp/dx(a)=f'_a$ . One way of looking at this is that we are constructing an interpolating function, in this case a polynomial, that explains all the data that we have. We may then want to examine the graph of the polynomial, evaluate it at other points, determine its integral or derivative, or do other things with it. This is a reason that polynomial interpolation is important even when the functions we are interested in are not polynomials.

Vandermonde's Equation

Here's one way to see how to organize the computation of the polynomial that passes through a set of data. It's not the best way, but it will give us a good place to start.

Suppose we wanted to determine the linear polynomial $p(x)=c_1x+c_2$ that passes through the data points $(x_1,y_1)$ and $(x_{2},y_{2})$ . We simply have to solve a set of linear equations for $c_1$ and $c_2$ .

$\displaystyle c_1 x_1 + c_2$	$\displaystyle = y_1$
$\displaystyle c_1 x_2 + c_2$	$\displaystyle = y_2$

or, equivalently,

$\displaystyle \left[\begin{array}{cc}x_1 & 1 x_2 & 1\end{array}\right] \left[... ...c_1 c_2\end{array}\right]= \left[\begin{array}{c}y_1 y_2\end{array}\right]$

which (usually) has the solution

$\displaystyle c_1$	$\displaystyle =$	$\displaystyle ( y_2 - y_1 ) / ( x_2 - x_1 )$
$\displaystyle c_2$	$\displaystyle =$	$\displaystyle ( x_2 y_1 - x_1 y_2 ) / ( x_2 - x_1 )$

Compare that situation with the case where we want to determine the quadratic polynomial $p(x) = c_1 x^2 + c_2 x + c_3$ that passes through three sets of data values. Then we have to solve the following set of linear equations for the polynomial coefficients c:

$\displaystyle \left[\begin{array}{ccc}x_1^2 & x_1 & 1 x_2^2&x_2 & 1\\ x_3^2&... ...c_3\end{array}\right]= \left[\begin{array}{c}y_1 y_2 y_3\end{array}\right]$

This is an example of a third order Vandermonde Equation. It is characterized by the fact that for each row (sometimes column) of the coefficient matrix, the succesive entries are generated by a decreasing (sometimes increasing) set of powers of a set of variables.

You should be able to see that, for any collection of abscissae and ordinates, it is possible to define a linear system that should be satisfied by the (unknown) polynomial coefficients. If we can solve the system, and solve it accurately, then that is one way to determine the interpolating polynomial.

Now, let's see how to construct and solve the Vandermonde equation using Matlab. This involves setting up the coefficient matrix A. As before, we use the Matlab variables xdata and ydata to represent the quantities $x_k$ and $y_k$ , and we will assume them to have length n.

for j = 1:n

  for k = 1:n

    A(j,k) = xdata(j)^(n-k) ;

end

end

Then we have to set the right hand side to the ordinates ydata. If we can get all of that set up, then actually solving the linear system is easy. We just write:

c = A \ ydata';

Recall that the backslash symbol means to solve the system with matrix A and right side ydata'. Notice that we are using the transpose of the row vector ydata in this equation. By default, Matlab constructs row vectors unless told to do otherwise.

Exercise 1: The Matlab built-in function polyfit finds the coefficients of a polynomial through a set of points. We will write our own using the Vandermonde matrix. (This is the way that Matlab actually uses.)

1. Write a Matlab function m-file, coef_vander.m with signature

2.

3.                 function c = coef_vander ( xdata, ydata )

4.                 % c = coef_vander ( xdata, ydata )

5.                 % xdata= ???

6.                 % ydata= ???

7.                 % c= ???

8.                 % other comments

9.

10.             % your name and the date

that accepts a pair of vectors xdata and ydata of arbitrary but equal length, and returns the coefficient vector c of the polynomial that passes through that data.

11. Test your function by computing the coefficients of the polynomial through the following data points. (This polynomial is $y=x^2$ , so you can check your coefficient vector ``by inspection.'')

12.

13.             xdata= [  0   1   2 ]

14.             ydata= [  0   1   4 ]

15. Test your function by computing the coefficients of the polynomial that passes through the following points

16.

17.             xdata= [ -3   -2  -1   0   1   2   3]

18.             ydata= [-364 -63  -6  -1   0  21 182]

19. Confirm using polyval that your polynomial passes through these data points.

In the following exercise you will construct a polynomial using coef_vander to interpolate data points and then you will see what happens between the interpolation points.

Exercise 2:

1. Consider the polynomial whose roots are r=[1 2 3 4 5]. Use the Matlab poly function to find its coefficients. Call these coefficients cTrue.

2. This polynomial obviously passes through zero at each of these five ``data'' points. We want to see if our coef_vander function can reproduce it. To use our coef_vander function, we need a sixth point. You can ``read off'' the value of the polynomial at x=0 from its coefficients above. What is this value?

3. Use the coef_vander function to find the coefficients of the polynomial passing through the ``data'' points

4.

5.                   xdata=[0  1 2 3 4 5];

6.                   ydata=[?? 0 0 0 0 0];

Call these coefficients cVander.

7. What would happen if you had used only the five roots as xdata?

8. Use the following code to compute and plot the values of the two polynomials on the interval [0,5]. How big is the difference between the curves? (We will be using essentially this same code in several following exercises. You should be sure you understand what it does. You might want to copy it to a script m-file.)

9.

10.             xval=linspace(0,5,500);  % test abscissae for plotting

11.             yvalTrue=polyval(cTrue,xval);  % true ordinates

12.             yvalVander=polyval(cVander,xval);  % our ordinates

13.             plot(xval,yvalTrue,'b'); % true curve in blue

14.             hold on

15.             plot(xval,yvalVander,'r');  % our curve in red

16.             hold off

17.             norm(yvalTrue-yvalVander)

Please include a copy of this plot with your summary. (If it appears that there is only one curve, it is possible that the two curves are on top of one another.)

Lagrange Polynomials

Suppose we fix the set of $n$ distinct abscissæ $x_k$ , $k=1,\ldots,n$ and think about the problem of constructing a polynomial that has (not yet specified) values $y_i$ at these points. Now suppose I have a polynomial $\ell_{7}(x)$ that is zero at each abscissa value, except that at $x_{7}$ , where it is 1. Then the intermediate polynomial $y_{7} \ell_{7}(x)$ would have the value $y_{7}$ at $x_{7}$ , and be 0 at all the other $x_i$ . Doing the same for each abscissa and adding the intermediate polynomials together results in the polynomial that interpolates the data!

In fact, the Lagrange polynomials $\ell_k$ are easily constructed for any set of abscissae. Each Lagrange polynomial will be of degree $n-1$ . There will be $n$ Lagrange polynomials, one per abscissa, and the $k^{\mbox{th}}$ polynomial $\ell_{k}(x)$ will have the special relationship with the abscissa $x_{k}$ , namely, it will be 1 there, and 0 at the other abscissæ.

In terms of Lagrange polynomials, then, the interpolating polynomial has the form:

$\displaystyle p(x)$	$\displaystyle =$	$\displaystyle y_1 \ell_1(x) + y_2 \ell_2(x) +\ldots + y_n \ell_n(x)$	(2)
	$\displaystyle =$	$\displaystyle \sum_{k=1}^n y_k\ell_k(x)$

Assuming we can determine these magical polynomials, this is a second way to define the interpolating polynomial to a set of data.

Remark: The trick of finding a function that equals 1 at a distinguished point and zero at all other points in a set is very powerful. One of the places you will meet it again is when you study the construction of finite elements for solving partial differential equations.

Exercise 3: In this exercise and the following one, we will construct the polynomials passing through the same data points as in the previous exercise, but we will use the Lagrange polynomials. This exercise will concentrate on construction of the Lagrange polynomials and the following exercise will concentrate on fitting polynomials. Recall the data set for $y=x^2$ :

   k :    1   2   3

xdata= [  0   1   2   ]

ydata= [  0   1   4   ]

(Actually, ydata is immaterial for construction of $\ell_k(x)$ .) In general, the formula for $\ell_{k}(x)$ can be written as:

$\displaystyle \ell_k(x)= (f_{1}(x)) (f_2(x)) \cdots (f_{k-1}(x)) (f_{k+1}(x)) \cdots (f_{n}(x) )$

(3)

(skipping the $k^{\mbox{th}}$ factor), where each factor has the form

$\displaystyle f_{j}(x) = \frac{( x - x_{j} ) }{ ( x_{k} - x_{j})}.$

1. Write a Matlab function m-file called lagrange.m that computes the Lagrange polynomials (3) for any k. The signature should be

2.

3.                 function pval = lagrange( k , xdata, xval )

4.                 % pval = lagrange( k , xdata, xval )

5.                 % comments

6.                 % k= ???

7.                 % xdata= ???

8.                 % xval= ???

9.                 % pval= ???

10.

11.             % your name and the date

and the function should evaluate the k-th Lagrange polynomial for the abscissas xdata at the point xval. Hint, you can implement the general formula using code like the following.

pval = 1;

for j = 1 : n

  if j ~= k

    pval = pval .* ???

end

end

Note 1: You will need to decide what to use for n.

Note 2: If xval is a vector of values, then pval will be the vector of corresponding values, so that an elementwise multiplication (.*) is being performed.

12. Check that lagrange gives the correct values (ones and zeros) at the data points xdata for each k. You should be able to use the commands:

13.

14.               lagrange ( 1, xdata, xdata )

15.               lagrange ( 2, xdata, xdata )

16.               lagrange ( 3, xdata, xdata )

Exercise 4: The hard part is done. Now we want to use our lagrange routine as a helper for a second replacement for polyfit-polyval pair, called eval_lag, that implements Equation (2). Unlike coef_vander, the coefficient vector of the polynomial does not need to be generated separately because it is so easy, and that is why eval_lag both fits and evaluates the Lagrange interpolating polynomial.

1. Write a Matlab function m-file called eval_lag.m with the signature

2.

3.                 function yval = eval_lag ( xdata, ydata, xval )

4.                 % yval = eval_lag ( xdata, ydata, xval )

5.                 %  comments

6.

7.                 % your name and the date

This function should take the data values xdata and ydata, and compute the value of the interpolating polynomial at xval according to (2), using the lagrange.m file for the Lagrange polynomials. Be sure to include comments to that effect.

8. Test eval_lag on the simplest data set we have been using.

9.

10.                  k :    1   2   3

11.               xdata= [  0   1   2 ]

12.               ydata= [  0   1   4 ]

by evaluating it at xval=xdata. Of course, you should get ydata back.

13. Test your function by interpolating the polynomial that passes through the following points, again by evaluating it at xval=xdata.

14.

15.             xdata= [  -3  -2  -1   0   1   2   3]

16.             ydata= [-364 -63  -6  -1   0  21 182]

17. Returning to the polynomial constructed in an earlier exercise with roots r=[1 2 3 4 5] and coefficients cTrue, reconstruct (using polyval) or recall the ydata values associated with xdata=[0 1 2 3 4 5] and compute the values of the Lagrange interpolating polynomial at the same 100 test points between 0 and 3. Call these values yvalLag. Plot yvalLag and yvalTrue and compute the error between them using the test plotting approach used above. Include a copy of this plot with your summary.

Interpolating functions that are not polynomials

Interpolating functions that are polynomials and using polynomials to do it is cheating a little bit. It is harder to use polynomials to interpolate functions that are not themselves polynomials. At the interpolation points, the function and its interpolant agree exactly, so we want to examine the behavior between the interpolation points.

Exercise 5:

1. Construct a function m-file for the Runge function $y=\frac{1}{1+x^2}$ . Name the file runge.m and give it the signature

2.

3.                 function y=runge(x)

4.                 % comments

Use componentwise (vector) division and exponentiation (./ and .^).

5. We would like to interpolate the Runge function on the interval $[-\pi,\pi]$ , so use the following outline to examine the behavior of the polynomial interpolant to the Runge function for five evenly-spaced points. It would be best if you put these commands into a script m-file named exer5.m.

6.

7.                 % construct n=5 data points

8.                 n=5;

9.                 xdata=linspace(-pi,pi,n);

10.             ydata=runge(xdata);

11.

12.             % construct many test points

13.             xval=linspace(??,??,4001);

14.             % construct the true function, for reference

15.             yvalTrue=runge(??);

16.

17.             % use Lagrange polynomial interpolation to evaluate

18.             % the interpolant at the test points

19.             yval=eval_lag(??,??,xval);

20.

21.             % plot reference values in blue, interpolant in red

22.             plot(xval,yvalTrue,'b');

23.             hold on

24.             plot(xval,yval,'r');

25.             hold off

26.

27.             % estimate the approximation error of the interpolant

28.             approximationError=max(abs(yvalTrue-yval))

Please send me this plot.

29. Confirm visually that the Runge function and its interpolant agree at the interpolation points, but not necessarily between them.

30. Using more data points gives higher degree interpolation polynomials. Fill in the following table using Lagrange interpolation with increasing numbers of data points.

31.

32.               n =  5  Approximation Error = ________

33.               n = 11  Approximation Error = ________

34.               n = 21  Approximation Error = ________

You probably are surprised to see that the errors do not decrease.

Most people expect that an interpolating polynomial $p(x)$ gives a good approximation to the function $f(x)$ everywhere, no matter what function we choose. If the approximation is not good, we expect it to get better if we increase the number of data points. These expectations hold only when the function does not exhibit some ``essentially non-polynomial'' behavior. You will see why the Runge function cannot be approximated well by polynomials in the following exercise.

Exercise 6: The Runge function has Taylor series

$\displaystyle \frac{1}{1+x^2}=1-x^2+x^4-x^6+\dots$

(4)

as you can easily prove. This series has a radius of convergence in the complex plane of 1. Polynomials, on the other hand, are entire functions, i.e., their Taylor series converges everywhere in the complex plane. No finite sum of polynomials can be anything but entire, but no entire function can approximate the Runge function on a disk with radius larger than one about the origin in the complex plane. If there were one, it would have to agree with the series (4) inside the unit disk, but the series diverges at $x=i$ and an entire function cannot have an infinite value.

1. Make a copy of exer5.m called exer6.m that uses coef_vander and polyval to evaluate the interpolating polynomial rather than eval_lag.

2. Confirm that you get the same results as in Exercise 5 when you use Vandermonde interpolation for the Runge function.

3.

4.                   n =  5  Approximation Error = ________

5.                   n = 11  Approximation Error = ________

6.                   n = 21  Approximation Error = ________

7. Look at the nontrivial coefficients ( $c_k$ ) of the interpolating polynomials by filling in the following table.

8.

9.                 n=5   c( 5)= _____  c( 3)= _____  c( 1)= _____

10.             n=11  c(11)= _____  c( 9)= _____  c( 7)= _____  c( 5)= _____

11.             n=21  c(21)= _____  c(19)= _____  c(17)= _____  c(15)= _____

12. Look at the trivial coefficients ( $c_k$ ) of the interpolating polynomials by filling in the following table. (Look carefully at what the colon notation does.)

13.

14.             n=5   max(abs(c(2:2: 5)))= _____

15.             n=11  max(abs(c(2:2:11)))= _____

16.             n=21  max(abs(c(2:2:21)))= _____

You should see that the interpolating polynomials are ``trying'' to reproduce the Taylor series (4). These polynomials cannot approximate the Runge function at all points, though, because the Taylor series does not converge at all points.

Chebyshev Points

We know that the interpolation error at any point x is given by an expression of the form:

$\displaystyle f(x)-p(x)=\left[\frac{(x-x_1)(x-x_2)\ldots(x-x_n)}{n!}\right]f^{n}(\xi)$

where $\xi$ is an unknown nearby point. This is something like an error term for a Taylor's series.

We can't do a thing about the expression $f^{n}(\xi)$ , because is an arbitrary (differentiable) function, but we can affect the magnitude of the bracketed expression. For instance, if we are only using a single data point () in the interval [10,20], then the very best choice for the data point would be , because then the maximum absolute value of the expression

$\displaystyle \left[\frac{(x-x_1)}{1!}\right]$

would be 5. The worst value would be to choose at one of the endpoints, in which case we'd double the maximum value. This doesn't guarantee better results, but it improves the chance.

Constructing the Chebyshev points

For a given number of data points, the Chebyshev points can be constructed in the following way.

Pick equally-spaced angles $\theta_k=(2k-1)\pi/(2n)$ , for $k=1,2,\ldots,n$ .
The Chebyshev points on the interval are given as

$\displaystyle x_k=0.5(a+b+(a-b)\cos\theta_k),$

for $k=1,2,\ldots,n$ .

Exercise 7

1. Write a Matlab function m-file called cheby_points.m with the signature:

2.                 function xdata = cheby_points ( a, b, ndata )

3.                 % xdata = cheby_points ( a, b, ndata )

4.                 % more comments

5.

6.                 % your name and the date

that returns in the vector xdata the values of the ndata Chebyshev points in the interval [a,b]. You can do this in 3 lines using vector notation:

k = (1:ndata);

theta = (vector expression involving k)

x = (vector expression involving theta)

or you can use a for loop. (The vector notation is more compact and runs faster, but is a little harder to understand.)

7. To check, here's what you should get for ndata=4, and [a,b]=[-5,5]:

8.                       1         2         3         4

9.                   -4.61940  -1.91342   1.91342   4.61940

10. You should be able to see from the formula that the Chebyshev points are not uniformly distributed on the interval, but they are symmetrical about its center.

Exercise 8 Repeat exercise 1 but with Chebyshev points. Fill in the table. (Note the extra row!) You should see smaller errors than before, especially for larger ndata.

        Runge function, Chebyshev points

  ndata =  5  Max Error = ________

  ndata = 11  Max Error = ________

  ndata = 21  Max Error = ________

  ndata = 41  Max Error = ________