
Lecture Notes on Numerical Analysis

Virginia Tech · MATH/CS 5466 · Spring 2016

Image from Johannes Kepler’s Astronomia nova, 1609 (ETH Bibliothek). In this text Kepler derives his famous equation that solves two-body orbital motion,
\[ M = E - e \sin E, \]
where $M$ (the mean anomaly) and $e$ (the eccentricity) are known, and one solves for $E$ (the eccentric anomaly). This vital problem spurred the development of algorithms for solving nonlinear equations.

We model our world with continuous mathematics. Whether our interest is natural science, engineering, even finance and economics, the models we most often employ are functions of real variables. The equations can be linear or nonlinear, involve derivatives, integrals, combinations of these and beyond. The tricks and techniques one learns in algebra and calculus for solving such systems exactly cannot tackle the complexities that arise in serious applications. Exact solution may require an intractable amount of work; worse, for many problems, it is impossible to write an exact solution using elementary functions like polynomials, roots, trig functions, and logarithms.

This course tells a marvelous success story. Through the use of clever algorithms, careful analysis, and speedy computers, we can construct approximate solutions to these otherwise intractable problems with remarkable speed. Nick Trefethen defines numerical analysis to be ‘the study of algorithms for the problems of continuous mathematics’.1 This course takes a tour through many such algorithms, sampling a variety of techniques suitable across many applications. We aim to assess alternative methods based on both accuracy and efficiency, to discern well-posed problems from ill-posed ones, and to see these methods in action through computer implementation.

1 We highly recommend Trefethen’s essay, ‘The Definition of Numerical Analysis’ (reprinted on pages 321–327 of Trefethen & Bau, Numerical Linear Algebra), which inspires our present manifesto.

Perhaps the importance of numerical analysis can be best appreciated by realizing the impact its disappearance would have on our world. The space program would evaporate; aircraft design would be hobbled; weather forecasting would again become the stuff of soothsaying and almanacs. The ultrasound technology that uncovers cancer and illuminates the womb would vanish. Google couldn’t rank web pages. Even the letters you are reading, whose shapes are specified by polynomial curves, would suffer. (Several important exceptions involve discrete, not continuous, mathematics: combinatorial optimization, cryptography and gene sequencing.)

On one hand, we are interested in complexity: we want algorithms that minimize the number of calculations required to compute a solution. But we are also interested in the quality of approximation: since we do not obtain exact solutions, we must understand the accuracy of our answers. Discrepancies arise from approximating a complicated function by a polynomial, a continuum by a discrete grid of points, or the real numbers by a finite set of floating point numbers. Different algorithms for the same problem will differ in the quality of their answers and the labor required to obtain those answers; we will learn how to evaluate algorithms according to these criteria.

Numerical analysis forms the heart of ‘scientific computing’ or ‘computational science and engineering,’ fields that also encompass the high-performance computing technology that makes our algorithms practical for problems with millions of variables, visualization techniques that illuminate the data sets that emerge from these computations, and the applications that motivate them.

Though numerical analysis has flourished in the past seventy years, its roots go back centuries, where approximations were necessary in celestial mechanics and, more generally, ‘natural philosophy’. Science, commerce, and warfare magnified the need for numerical analysis, so much so that the early twentieth century spawned the profession of ‘computers,’ people who conducted computations with hand-crank desk calculators. But numerical analysis has always been more than mere number-crunching, as observed by Alston Householder in the introduction to his Principles of Numerical Analysis, published in 1953, the end of the human computer era:

The material was assembled with high-speed digital computation always in mind, though many techniques appropriate only to “hand” computation are discussed. ... How otherwise the continued use of these machines will transform the computer’s art remains to be seen. But this much can surely be said, that their effective use demands a more profound understanding of the mathematics of the problem, and a more detailed acquaintance with the potential sources of error, than is ever required by a computation whose development can be watched, step by step, as it proceeds.

Thus the analysis component of ‘numerical analysis’ is essential. We rely on tools of classical real analysis, such as continuity, differentiability, Taylor expansion, and convergence of sequences and series.

Matrix computations play a fundamental role in numerical analysis. Discretization of continuous variables turns calculus into algebra. Algorithms for the fundamental problems in linear algebra are covered in MATH/CS 5465. If you have missed this beautiful content, your life will be poorer for it; when the methods we discuss this semester connect to matrix techniques, we will provide pointers.

These lecture notes were developed alongside courses that were supported by textbooks, such as An Introduction to Numerical Analysis by Süli and Mayers, Numerical Analysis by Gautschi, and Numerical Analysis by Kincaid and Cheney. These notes have benefited from this pedigree, and reflect certain hallmarks of these books. We have also been significantly influenced by G. W. Stewart’s inspiring volumes, Afternotes on Numerical Analysis and Afternotes Goes to Graduate School. I am grateful for comments and corrections from past students, and welcome suggestions for further repair and amendment.

— Mark Embree


1 Interpolation

Lecture 1: Polynomial Interpolation in the Monomial Basis

Among the most fundamental problems in numerical analysis is the construction of a polynomial that approximates a continuous real function $f : [a,b] \to \mathbb{R}$. Of the several ways we might design such polynomials, we begin with interpolation: we will construct polynomials that exactly match $f$ at certain fixed points in the interval $[a,b] \subset \mathbb{R}$.

1.1 Basic definitions and notation

Definition 1.1 The set of continuous functions that map $[a,b] \subset \mathbb{R}$ to $\mathbb{R}$ is denoted by $C[a,b]$. The set of continuous functions whose first $r$ derivatives are also continuous on $[a,b]$ is denoted by $C^r[a,b]$. (Note that $C^0[a,b] \equiv C[a,b]$.)

Definition 1.2 The set of polynomials of degree $n$ or less is denoted by $\mathcal{P}_n$.

Note that $C[a,b]$, $C^r[a,b]$ (for any $a < b$, $r \ge 0$) and $\mathcal{P}_n$ are linear spaces of functions (since linear combinations of such functions maintain continuity and polynomial degree). Furthermore, note that $\mathcal{P}_n$ is an $(n+1)$-dimensional subspace of $C[a,b]$.

We freely use the concept of vector spaces. A set of functions $V$ is a real vector space if it is closed under vector addition and multiplication by a real number: for any $f, g \in V$, $f + g \in V$, and for any $f \in V$ and $\alpha \in \mathbb{R}$, $\alpha f \in V$. For more details, consult a text on linear algebra or functional analysis.

The polynomial interpolation problem can be stated as:

Given $f \in C[a,b]$ and $n+1$ points $\{x_j\}_{j=0}^{n}$ satisfying
\[ a \le x_0 < x_1 < \cdots < x_n \le b, \]
determine some $p_n \in \mathcal{P}_n$ such that
\[ p_n(x_j) = f(x_j) \quad \text{for } j = 0, \ldots, n. \]


It shall become clear why we require $n+1$ points $x_0, \ldots, x_n$, and no more, to determine a degree-$n$ polynomial $p_n$. (You know the $n = 1$ case well: two points determine a unique line.) If the number of data points were smaller, we could construct infinitely many degree-$n$ interpolating polynomials. Were it larger, there would in general be no degree-$n$ interpolant.
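Concretely, for $n = 1$ the unique line through $(x_0, f(x_0))$ and $(x_1, f(x_1))$ is
\[ p_1(x) = f(x_0) + \frac{f(x_1) - f(x_0)}{x_1 - x_0}\,(x - x_0), \]
and one can check directly that $p_1(x_0) = f(x_0)$ and $p_1(x_1) = f(x_1)$.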

As numerical analysts, we seek answers to the following questions:

• Does such a polynomial $p_n \in \mathcal{P}_n$ exist?

• If so, is it unique?

• Does $p_n \in \mathcal{P}_n$ behave like $f \in C[a,b]$ at points $x \in [a,b]$ when $x \ne x_j$ for $j = 0, \ldots, n$?

• How can we compute $p_n \in \mathcal{P}_n$ efficiently on a computer?

• How can we compute $p_n \in \mathcal{P}_n$ accurately on a computer (with floating point arithmetic)?

• If we want to add a new interpolation point $x_{n+1}$, can we easily adjust $p_n$ to give an interpolating polynomial $p_{n+1}$ of one higher degree?

• How should the interpolation points $\{x_j\}$ be chosen?

Regarding this last question, we should note that, in practice, we are not always able to choose the interpolation points as freely as we might like. For example, our ‘continuous function $f \in C[a,b]$’ could actually be a discrete list of previously collected experimental data, and we are stuck with the values $\{x_j\}_{j=0}^{n}$ at which the data was measured.

1.2 Constructing interpolants in the monomial basis

Of course, any polynomial $p_n \in \mathcal{P}_n$ can be written in the form
\[ p_n(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n \]
for coefficients $c_0, c_1, \ldots, c_n$. We can view this formula as an expression for $p_n$ as a linear combination of the basis functions $1, x, x^2, \ldots, x^n$; these basis functions are called monomials.

To construct the polynomial interpolant to $f$, we merely need to determine the proper values for the coefficients $c_0, c_1, \ldots, c_n$ in the above expansion. The interpolation conditions $p_n(x_j) = f(x_j)$ for $j = 0, \ldots, n$ reduce to the equations

\[
\begin{aligned}
c_0 + c_1 x_0 + c_2 x_0^2 + \cdots + c_n x_0^n &= f(x_0) \\
c_0 + c_1 x_1 + c_2 x_1^2 + \cdots + c_n x_1^n &= f(x_1) \\
&\ \,\vdots \\
c_0 + c_1 x_n + c_2 x_n^2 + \cdots + c_n x_n^n &= f(x_n).
\end{aligned}
\]

Note that these $n+1$ equations are linear in the $n+1$ unknown parameters $c_0, \ldots, c_n$. Thus, our problem of finding the coefficients $c_0, \ldots, c_n$ reduces to solving the linear system

\[
\begin{bmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^n \\
1 & x_1 & x_1^2 & \cdots & x_1^n \\
1 & x_2 & x_2^2 & \cdots & x_2^n \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^n
\end{bmatrix}
\begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}
=
\begin{bmatrix} f(x_0) \\ f(x_1) \\ f(x_2) \\ \vdots \\ f(x_n) \end{bmatrix},
\tag{1.1}
\]

which we denote as $A\mathbf{c} = \mathbf{f}$. Matrices of this form, called Vandermonde matrices, arise in a wide range of applications.1 Provided all the interpolation points $\{x_j\}$ are distinct, one can show that this matrix is invertible.2 Hence, fundamental properties of linear algebra allow us to confirm that there is exactly one degree-$n$ polynomial that interpolates $f$ at the given $n+1$ distinct interpolation points.

1 Higham presents many interesting properties of Vandermonde matrices and algorithms for solving Vandermonde systems in Chapter 21 of Accuracy and Stability of Numerical Algorithms, 2nd ed. (SIAM, 2002). Vandermonde matrices arise often enough that MATLAB has a built-in command for creating them. If x = [x_0, ..., x_n]^T, then A = fliplr(vander(x)).

2 In fact, the determinant takes the simple form
\[ \det(A) = \prod_{j=0}^{n} \prod_{k=j+1}^{n} (x_k - x_j). \]
This is evident for $n = 1$; we will not prove it for general $n$, as we will have more elegant ways to establish existence and uniqueness of polynomial interpolants. For a clever proof, see p. 193 of Bellman, Introduction to Matrix Analysis, 2nd ed. (McGraw-Hill, 1970).
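As a quick numerical sanity check of this determinant formula (an illustrative sketch, not part of the original notes), one can compare it against MATLAB’s det on a few distinct points, using the fliplr(vander(x)) construction mentioned above:

    % Check the Vandermonde determinant formula on a few distinct points.
    x = [0; 0.3; 0.7; 1.2];        % any distinct nodes will do
    A = fliplr(vander(x));         % A(i,j) = x(i)^(j-1), as in (1.1)
    d = 1;
    for j = 1:length(x)-1
        for k = j+1:length(x)
            d = d*(x(k) - x(j));   % product of (x_k - x_j) over k > j
        end
    end
    disp([det(A), d])              % the two values agree to rounding error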

Theorem 1.1 Given $f \in C[a,b]$ and distinct points $\{x_j\}_{j=0}^{n}$, $a \le x_0 < x_1 < \cdots < x_n \le b$, there exists a unique $p_n \in \mathcal{P}_n$ such that $p_n(x_j) = f(x_j)$ for $j = 0, 1, \ldots, n$.

To determine the coefficients $\{c_j\}$, we could solve the above linear system with the Vandermonde matrix using some variant of Gaussian elimination (e.g., using MATLAB’s \ command); this will take $O(n^3)$ floating point operations. Alternatively, we could (and should) use a specialized algorithm that exploits the Vandermonde structure to determine the coefficients $\{c_j\}$ in only $O(n^2)$ operations, a vast improvement.3

3 See Higham’s book for details and stability analysis of specialized Vandermonde algorithms.
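For concreteness, here is a minimal sketch of one such $O(n^2)$ scheme, the classical Björck–Pereyra algorithm (our illustration of the idea, not code taken from Higham’s book): it computes Newton divided differences in place, then converts them to monomial coefficients.

    % Minimal sketch of a Bjorck-Pereyra-type O(n^2) solve of A*c = f,
    % where A(i,j) = x(i)^(j-1) for distinct nodes x(1),...,x(n+1).
    function c = vandersolve(x, f)
        n = length(x) - 1;
        c = f(:);
        % Phase 1: Newton divided differences, computed in place.
        for k = 1:n
            for j = n+1:-1:k+1
                c(j) = (c(j) - c(j-1)) / (x(j) - x(j-k));
            end
        end
        % Phase 2: convert Newton coefficients to monomial coefficients.
        for k = n:-1:1
            for j = k:n
                c(j) = c(j) - x(k)*c(j+1);
            end
        end
    end

In exact arithmetic, polyval(flipud(vandersolve(x, f)), x) returns f at the nodes; the two short nested loops make the $O(n^2)$ operation count plain.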

1.2.1 Potential pitfalls of the monomial basis

Though it is straightforward to see how to construct interpolating polynomials in the monomial basis, this procedure can give rise to some unpleasant numerical problems when we actually attempt to determine the coefficients $\{c_j\}$ on a computer. The primary difficulty is that the monomial basis functions $1, x, x^2, \ldots, x^n$ look increasingly alike as we take higher and higher powers. Figure 1.1 illustrates this behavior on the interval $[a,b] = [0,1]$ with $n = 5$ and $x_j = j/5$.

Figure 1.1: The six monomial basis vectors for $\mathcal{P}_5$, based on the interval $[a,b] = [0,1]$ with $x_j = j/5$ (red circles). Note that the basis vectors increasingly align as the power increases: this basis becomes ill-conditioned as the degree of the interpolant grows.

Because these basis vectors become increasingly alike, one finds that the expansion coefficients $\{c_j\}$ in the monomial basis can become very large in magnitude even if the function $f(x)$ remains of modest size on $[a,b]$.

Consider the following analogy from linear algebra. The vectors
\[ \begin{bmatrix} 1 \\ 10^{-10} \end{bmatrix}, \qquad \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
form a basis for $\mathbb{R}^2$. However, both vectors point in nearly the same direction, though of course they are linearly independent. We can write the vector $[1, 1]^T$ as a unique linear combination of these basis vectors:
\[ \begin{bmatrix} 1 \\ 1 \end{bmatrix} = 10{,}000{,}000{,}000 \begin{bmatrix} 1 \\ 10^{-10} \end{bmatrix} - 9{,}999{,}999{,}999 \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \tag{1.2} \]
Although the vector we are expanding and the basis vectors themselves all have modest size (norm), the expansion coefficients are enormous. Furthermore, small changes to the vector we are expanding will lead to huge changes in the expansion coefficients. This is a recipe for disaster when computing with finite-precision arithmetic.
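To see this sensitivity numerically, here is a tiny illustrative sketch (ours, not from the original notes) in MATLAB:

    % Expansion in the nearly aligned basis of (1.2).
    B = [1 1; 1e-10 0];           % the two basis vectors, as columns
    a = B \ [1; 1]                % coefficients: roughly [1e10; -1e10]
    a_pert = B \ [1; 1 + 1e-8]    % perturb the expanded vector by 1e-8...
    % ...and each coefficient changes by about 100: an amplification
    % of ten orders of magnitude.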

This same phenomenon can occur when we express polynomials in the monomial basis. As a simple example, consider interpolating $f(x) = 2x + x\sin(40x)$ at uniformly spaced points ($x_j = j/n$, $j = 0, \ldots, n$) in the interval $[0,1]$. Note that $f \in C^\infty[0,1]$: this $f$ is a ‘nice’ function with infinitely many continuous derivatives. As seen in Figures 1.2–1.3, $f$ oscillates modestly on the interval $[0,1]$, but it certainly does not grow excessively large in magnitude or exhibit any nasty singularities.

Figure 1.2: Degree $n = 10$ interpolant $p_{10}(x)$ to $f(x) = 2x + x\sin(40x)$ at the uniformly spaced points $x_0, \ldots, x_{10}$, $x_j = j/10$, over $[a,b] = [0,1]$. Even though $p_{10}(x_j) = f(x_j)$ at the eleven points $x_0, \ldots, x_n$ (red circles), the interpolant gives a poor approximation to $f$ at the ends of the interval.

Figure 1.3: Repetition of Figure 1.2, but now with the degree $n = 30$ interpolant at uniformly spaced points $x_j = j/30$ on $[0,1]$. The polynomial still overshoots $f$ near $x = 0$ and $x = 1$, though by less than for $n = 10$; for this example, the overshoot goes away as $n$ is increased further.

Comparing the interpolants with $n = 10$ and $n = 30$ between the two figures, it appears that, in some sense, $p_n \to f$ as $n$ increases. Indeed, this is the case, in a manner we shall make precise in future lectures.

However, we must address a crucial question:

Can we accurately compute the coefficients $c_0, \ldots, c_n$ that specify the interpolating polynomial?


Use MATLAB’s basic Gaussian elimination algorithm to solve the Vandermonde system $A\mathbf{c} = \mathbf{f}$ for $\mathbf{c}$ via the command c = A\f, then evaluate
\[ p_n(x) = \sum_{j=0}^{n} c_j x^j, \]
e.g., using MATLAB’s polyval command. Since $p_n$ was constructed to interpolate $f$ at the points $x_0, \ldots, x_n$, we might (at the very least!) expect
\[ f(x_j) - p_n(x_j) = 0, \qquad j = 0, \ldots, n. \]
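As a minimal sketch of this experiment (assuming the setup above, with uniformly spaced nodes on $[0,1]$; the variable names are ours):

    % Interpolate f(x) = 2x + x sin(40x) at n+1 uniform nodes, monomial basis.
    n = 10;
    x = (0:n)'/n;                    % interpolation points x_j = j/n
    f = 2*x + x.*sin(40*x);          % samples f(x_j)
    A = fliplr(vander(x));           % Vandermonde matrix, A(i,j) = x(i)^(j-1)
    c = A\f;                         % coefficients via Gaussian elimination
    err = f - polyval(flipud(c), x); % residual at the nodes (polyval expects
                                     % the highest-degree coefficient first)
    max(abs(err))                    % ideally O(eps); watch it grow with n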

Since we are dealing with numerical computations with a finite precision floating point system, we should instead be well satisfied if our numerical computations only achieve $|f(x_j) - p_n(x_j)| = O(\varepsilon_{\mathrm{mach}})$, where $\varepsilon_{\mathrm{mach}}$ denotes the precision of the floating point arithmetic system.4

More precisely, we might expect
\[ |f(x_j) - p_n(x_j)| \approx \varepsilon_{\mathrm{mach}} \|f\|_{L^\infty}, \]
where $\|f\|_{L^\infty} := \max_{x \in [a,b]} |f(x)|$.

4 For the double-precision arithmetic used by MATLAB, $\varepsilon_{\mathrm{mach}} \approx 2.2 \times 10^{-16}$.

Instead, the results of our numerical computations are remarkably inaccurate due to the magnitude of the coefficients $c_0, \ldots, c_n$ and the ill-conditioning of the Vandermonde matrix.

Recall from numerical linear algebra that the accuracy of solving the system $A\mathbf{c} = \mathbf{f}$ depends on the condition number $\|A\|\|A^{-1}\|$ of $A$.5 Figure 1.4 shows that this condition number grows exponentially as $n$ increases.6 Thus, we should expect the computed value of $\mathbf{c}$ to have errors that scale like $\|A\|\|A^{-1}\|\varepsilon_{\mathrm{mach}}$. Moreover, consider the entries in $\mathbf{c}$. For $n = 10$ (a typical example), we have

     j             c_j
     0         0.00000
     1       363.24705
     2    -10161.84204
     3    113946.06962
     4   -679937.11016
     5   2411360.82690
     6  -5328154.95033
     7   7400914.85455
     8  -6277742.91579
     9   2968989.64443
    10   -599575.07912

5 For information on conditioning and the accuracy of solving linear systems, see, e.g., Lecture 12 of Trefethen and Bau, Numerical Linear Algebra (SIAM, 1997).

6 The curve has very regular behavior up until about $n = 20$; beyond that point, where $\|A\|\|A^{-1}\| \approx 1/\varepsilon_{\mathrm{mach}}$, the computation is sufficiently unstable that the condition number is no longer computed accurately! For $n > 20$, take all the curves in Figure 1.4 with a healthy dose of salt.

The entries in $\mathbf{c}$ grow in magnitude and oscillate in sign, akin to the simple $\mathbb{R}^2$ vector example in (1.2). The sign-flips and magnitude of the coefficients would make
\[ p_n(x) = \sum_{j=0}^{n} c_j x^j \]
difficult to compute accurately for large $n$, even if all the coefficients $c_0, \ldots, c_n$ were given exactly. Figure 1.4 shows how the largest computed value in $\mathbf{c}$ grows with $n$. Finally, this figure also shows the quantity we began discussing,
\[ \max_{0 \le j \le n} |f(x_j) - p_n(x_j)|. \]

Figure 1.4: Illustration of some pitfalls of working with interpolants in the monomial basis for large $n$: (a) the condition number $\|A\|\|A^{-1}\|$ of $A$ grows large with $n$; (b) as a result, some coefficients $c_j$ are large in magnitude (blue line) and inaccurately computed; (c) consequently, the computed ‘interpolant’ $p_n$ is far from $f$ at the interpolation points (red line).

Rather than being nearly zero, this quantity grows with $n$, until the computed ‘interpolating’ polynomial differs from $f$ at some interpolation point by roughly $1/10$ for the larger values of $n$: we must have higher standards!

This is an example where a simple problem formulation quickly yields an algorithm, but that algorithm gives unacceptable numerical results.
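One can reproduce the qualitative trends in Figure 1.4 with a loop like the following sketch (our illustration; MATLAB’s cond(A) here estimates $\|A\|\|A^{-1}\|$ in the 2-norm, one reasonable choice of norm):

    % Sweep n and record the condition number, coefficient size, and
    % the error of the computed 'interpolant' at the nodes.
    for n = 5:5:40
        x = (0:n)'/n;
        f = 2*x + x.*sin(40*x);
        A = fliplr(vander(x));
        c = A\f;
        err = max(abs(f - polyval(flipud(c), x)));
        fprintf('n = %2d  cond(A) = %8.1e  max|c_j| = %8.1e  err = %8.1e\n', ...
                n, cond(A), max(abs(c)), err);
    end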

Perhaps you are now troubled by this entirely reasonable question: If the computations of $p_n$ are as unstable as Figure 1.4 suggests, why should we put any faith in the plots of interpolants for $n = 10$ and, especially, $n = 30$ in Figures 1.2–1.3?

You should trust those plots because I computed them using a much better approach, about which we shall next learn.