Haoudout1-4

INTRODUCTION TO MATLAB FORENGINEERING STUDENTS

David HoucqueNorthwestern University

(version 1.2, August 2005)

Contents

1 Tutorial lessons 1 1

1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Basic features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.3 A minimum MATLAB session . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.3.1 Starting MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.3.2 Using MATLAB as a calculator . . . . . . . . . . . . . . . . . . . . . 4

1.3.3 Quitting MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.4 Getting started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.4.1 Creating MATLAB variables . . . . . . . . . . . . . . . . . . . . . . . 5

1.4.2 Overwriting variable . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.4.3 Error messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.4.4 Making corrections . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.4.5 Controlling the hierarchy of operations or precedence . . . . . . . . . 6

1.4.6 Controlling the appearance of floating point number . . . . . . . . . . 8

1.4.7 Managing the workspace . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.4.8 Keeping track of your work session . . . . . . . . . . . . . . . . . . . 9

1.4.9 Entering multiple statements per line . . . . . . . . . . . . . . . . . . 9

1.4.10 Miscellaneous commands . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.4.11 Getting help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2 Tutorial lessons 2 12

2.1 Mathematical functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.1.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

i

2.2 Basic plotting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.2.1 overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.2.2 Creating simple plots . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.2.3 Adding titles, axis labels, and annotations . . . . . . . . . . . . . . . 15

2.2.4 Multiple data sets in one plot . . . . . . . . . . . . . . . . . . . . . . 16

2.2.5 Specifying line styles and colors . . . . . . . . . . . . . . . . . . . . . 17

2.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.4 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.5 Matrix generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.5.1 Entering a vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.5.2 Entering a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.5.3 Matrix indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.5.4 Colon operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.5.5 Linear spacing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.5.6 Colon operator in a matrix . . . . . . . . . . . . . . . . . . . . . . . . 22

2.5.7 Creating a sub-matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.5.8 Deleting row or column . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.5.9 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.5.10 Continuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.5.11 Transposing a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.5.12 Concatenating matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.5.13 Matrix generators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.5.14 Special matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3 Array operations and Linear equations 30

3.1 Array operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.1.1 Matrix arithmetic operations . . . . . . . . . . . . . . . . . . . . . . . 30

3.1.2 Array arithmetic operations . . . . . . . . . . . . . . . . . . . . . . . 30

3.2 Solving linear equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

3.2.1 Matrix inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

ii

3.2.2 Matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

4 Introduction to programming in MATLAB 35

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.2 M-File Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.2.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

4.2.2 Script side-effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4.3 M-File functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

4.3.1 Anatomy of a M-File function . . . . . . . . . . . . . . . . . . . . . . 38

4.3.2 Input and output arguments . . . . . . . . . . . . . . . . . . . . . . . 40

4.4 Input to a script file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

4.5 Output commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

5 Control flow and operators 43

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

5.2 Control flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

5.2.1 The ‘‘if...end’’ structure . . . . . . . . . . . . . . . . . . . . . . . 43

5.2.2 Relational and logical operators . . . . . . . . . . . . . . . . . . . . . 45

5.2.3 The ‘‘for...end’’ loop . . . . . . . . . . . . . . . . . . . . . . . . . 45

5.2.4 The ‘‘while...end’’ loop . . . . . . . . . . . . . . . . . . . . . . . 46

5.2.5 Other flow structures . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

5.2.6 Operator precedence . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

5.3 Saving output to a file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

5.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

6 Debugging M-files 49

6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

6.2 Debugging process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

6.2.1 Preparing for debugging . . . . . . . . . . . . . . . . . . . . . . . . . 50

6.2.2 Setting breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

iii

6.2.3 Running with breakpoints . . . . . . . . . . . . . . . . . . . . . . . . 50

6.2.4 Examining values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

6.2.5 Correcting and ending debugging . . . . . . . . . . . . . . . . . . . . 51

6.2.6 Ending debugging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

6.2.7 Correcting an M-file . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

A Summary of commands 53

B Release notes for Release 14 with Service Pack 2 58

B.1 Summary of changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

B.2 Other changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

B.3 Further details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

C Main characteristics of MATLAB 62

C.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

C.2 Strengths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

C.3 Weaknesses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

C.4 Competition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

iv

List of Tables

1.1 Basic arithmetic operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.2 Hierarchy of arithmetic operations . . . . . . . . . . . . . . . . . . . . . . . . 7

2.1 Elementary functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.2 Predefined constant values . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.3 Attributes for plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.4 Elementary matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.5 Special matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3.1 Array operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.2 Summary of matrix and array operations . . . . . . . . . . . . . . . . . . . . 32

3.3 Matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

4.1 Anatomy of a M-File function . . . . . . . . . . . . . . . . . . . . . . . . . . 38

4.2 Difference between scripts and functions . . . . . . . . . . . . . . . . . . . . 39

4.3 Example of input and output arguments . . . . . . . . . . . . . . . . . . . . 40

4.4 disp and fprintf commands . . . . . . . . . . . . . . . . . . . . . . . . . . 41

5.1 Relational and logical operators . . . . . . . . . . . . . . . . . . . . . . . . . 45

5.2 Operator precedence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

A.1 Arithmetic operators and special characters . . . . . . . . . . . . . . . 53

A.2 Array operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

A.3 Relational and logical operators . . . . . . . . . . . . . . . . . . . . . . 54

A.4 Managing workspace and file commands . . . . . . . . . . . . . . . . . 55

A.5 Predefined variables and math constants . . . . . . . . . . . . . . . . . 55

v

A.6 Elementary matrices and arrays . . . . . . . . . . . . . . . . . . . . . . 56

A.7 Arrays and Matrices: Basic information . . . . . . . . . . . . . . . . . 56

A.8 Arrays and Matrices: operations and manipulation . . . . . . . . . . 56

A.9 Arrays and Matrices: matrix analysis and linear equations . . . . . 57

vi

List of Figures

1.1 The graphical interface to the MATLAB workspace . . . . . . . . . . . . . . 3

2.1 Plot for the vectors x and y . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2 Plot of the Sine function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.3 Typical example of multiple plots . . . . . . . . . . . . . . . . . . . . . . . . 17

vii

Preface

“Introduction to MATLAB for Engineering Students” is a document for an introductorycourse in MATLAB R©1 and technical computing. It is used for freshmen classes at North-western University. This document is not a comprehensive introduction or a reference man-ual. Instead, it focuses on the specific features of MATLAB that are useful for engineeringclasses. The lab sessions are used with one main goal: to allow students to become familiarwith computer software (e.g., MATLAB) to solve application problems. We assume that thestudents have no prior experience with MATLAB.

The availability of technical computing environment such as MATLAB is now reshapingthe role and applications of computer laboratory projects to involve students in more intenseproblem-solving experience. This availability also provides an opportunity to easily conductnumerical experiments and to tackle realistic and more complicated problems.

Originally, the manual is divided into computer laboratory sessions (labs). The labdocument is designed to be used by the students while working at the computer. Theemphasis here is “learning by doing”. This quiz-like session is supposed to be fully completedin 50 minutes in class.

The seven lab sessions include not only the basic concepts of MATLAB, but also an in-troduction to scientific computing, in which they will be useful for the upcoming engineeringcourses. In addition, engineering students will see MATLAB in their other courses.

The end of this document contains two useful sections: a Glossary which contains thebrief summary of the commands and built-in functions as well as a collection of release notes.The release notes, which include several new features of the Release 14 with Service Pack2, well known as R14SP2, can also be found in Appendix. All of the MATLAB commandshave been tested to take advantage with new features of the current version of MATLABavailable here at Northwestern (R14SP2). Although, most of the examples and exercises stillwork with previous releases as well.

This manual reflects the ongoing effort of the McCormick School of Engineering andApplied Science leading by Dean Stephen Carr to institute a significant technical computingin the Engineering First R©2 courses taught at Northwestern University.

Finally, the students - Engineering Analysis (EA) Section - deserve my special grati-tude. They were very active participants in class.

David HoucqueEvanston, Illinois

August 2005

1MATLAB R© is a registered trademark of MathWorks, Inc.2Engineering First R© is a registered trademark of McCormick

School of Engineering and Applied Science (Northwestern University)

viii

Acknowledgements

I would like to thank Dean Stephen Carr for his constant support. I am grateful to a numberof people who offered helpful advice and comments. I want to thank the EA1 instructors(Fall Quarter 2004), in particular Randy Freeman, Jorge Nocedal, and Allen Taflove fortheir helpful reviews on some specific parts of the document. I also want to thank MalcombMacIver, EA3 Honors instructor (Spring 2005) for helping me to better understand theanimation of system dynamics using MATLAB. I am particularly indebted to the manystudents (340 or so) who have used these materials, and have communicated their commentsand suggestions. Finally, I want to thank IT personnel for helping setting up the classes andother computer related work: Rebecca Swierz, Jesse Becker, Rick Mazec, Alan Wolff, KenKalan, Mike Vilches, and Daniel Lee.

About the author

David Houcque has more than 25 years’ experience in the modeling and simulation of struc-tures and solid continua including 14 years in industry. In industry, he has been working asR&D engineer in the fields of nuclear engineering, oil rig platform offshore design, oil reser-voir engineering, and steel industry. All of these include working in different internationalenvironments: Germany, France, Norway, and United Arab Emirates. Among other things,he has a combined background experience: scientific computing and engineering expertise.He earned his academic degrees from Europe and the United States.

Here at Northwestern University, he is working under the supervision of Professor BrianMoran, a world-renowned expert in fracture mechanics, to investigate the integrity assess-ment of the aging highway bridges under severe operating conditions and corrosion.

ix

Chapter 1

Tutorial lessons 1

1.1 Introduction

The tutorials are independent of the rest of the document. The primarily objective is to helpyou learn quickly the first steps. The emphasis here is “learning by doing”. Therefore, thebest way to learn is by trying it yourself. Working through the examples will give you a feelfor the way that MATLAB operates. In this introduction we will describe how MATLABhandles simple numerical expressions and mathematical formulas.

The name MATLAB stands for MATrix LABoratory. MATLAB was written originallyto provide easy access to matrix software developed by the LINPACK (linear system package)and EISPACK (Eigen system package) projects.

MATLAB [1] is a high-performance language for technical computing. It integratescomputation, visualization, and programming environment. Furthermore, MATLAB is amodern programming language environment: it has sophisticated data structures, containsbuilt-in editing and debugging tools, and supports object-oriented programming. These factorsmake MATLAB an excellent tool for teaching and research.

MATLAB has many advantages compared to conventional computer languages (e.g.,C, FORTRAN) for solving technical problems. MATLAB is an interactive system whosebasic data element is an array that does not require dimensioning. The software packagehas been commercially available since 1984 and is now considered as a standard tool at mostuniversities and industries worldwide.

It has powerful built-in routines that enable a very wide variety of computations. Italso has easy to use graphics commands that make the visualization of results immediatelyavailable. Specific applications are collected in packages referred to as toolbox. There aretoolboxes for signal processing, symbolic computation, control theory, simulation, optimiza-tion, and several other fields of applied science and engineering.

In addition to the MATLAB documentation which is mostly available on-line, we would

1

recommend the following books: [2], [3], [4], [5], [6], [7], [8], and [9]. They are excellent intheir specific applications.

1.2 Basic features

As we mentioned earlier, the following tutorial lessons are designed to get you startedquickly in MATLAB. The lessons are intended to make you familiar with the basics ofMATLAB. We urge you to complete the exercises given at the end of each lesson.

1.3 A minimum MATLAB session

The goal of this minimum session (also called starting and exiting sessions) is to learn thefirst steps:

• How to log on

• Invoke MATLAB

• Do a few simple calculations

• How to quit MATLAB

1.3.1 Starting MATLAB

After logging into your account, you can enter MATLAB by double-clicking on the MATLABshortcut icon (MATLAB 7.0.4) on your Windows desktop. When you start MATLAB, aspecial window called the MATLAB desktop appears. The desktop is a window that containsother windows. The major tools within or accessible from the desktop are:

• The Command Window

• The Command History

• The Workspace

• The Current Directory

• The Help Browser

• The Start button

2

Figure 1.1: The graphical interface to the MATLAB workspace

3

When MATLAB is started for the first time, the screen looks like the one that shownin the Figure 1.1. This illustration also shows the default configuration of the MATLABdesktop. You can customize the arrangement of tools and documents to suit your needs.

Now, we are interested in doing some simple calculations. We will assume that youhave sufficient understanding of your computer under which MATLAB is being run.

You are now faced with the MATLAB desktop on your computer, which contains the prompt(>>) in the Command Window. Usually, there are 2 types of prompt:

>> for full version

EDU> for educational version

Note: To simplify the notation, we will use this prompt, >>, as a standard prompt sign,though our MATLAB version is for educational purpose.

1.3.2 Using MATLAB as a calculator

As an example of a simple interactive calculation, just type the expression you want toevaluate. Let’s start at the very beginning. For example, let’s suppose you want to calculatethe expression, 1 + 2× 3. You type it at the prompt command (>>) as follows,

>> 1+2*3

ans =

7

You will have noticed that if you do not specify an output variable, MATLAB uses adefault variable ans, short for answer, to store the results of the current calculation. Notethat the variable ans is created (or overwritten, if it is already existed). To avoid this, youmay assign a value to a variable or output argument name. For example,

>> x = 1+2*3

x =

7

will result in x being given the value 1 + 2 × 3 = 7. This variable name can alwaysbe used to refer to the results of the previous computations. Therefore, computing 4x willresult in

>> 4*x

ans =

28.0000

Before we conclude this minimum session, Table 1.1 gives the partial list of arithmeticoperators.

4

Table 1.1: Basic arithmetic operators

Symbol Operation Example+ Addition 2 + 3− Subtraction 2− 3∗ Multiplication 2 ∗ 3/ Division 2/3

1.3.3 Quitting MATLAB

To end your MATLAB session, type quit in the Command Window, or select File −→ ExitMATLAB in the desktop main menu.

1.4 Getting started

After learning the minimum MATLAB session, we will now learn to use some additionaloperations.

1.4.1 Creating MATLAB variables

MATLAB variables are created with an assignment statement. The syntax of variable as-signment is

variable name = a value (or an expression)

For example,

>> x = expression

where expression is a combination of numerical values, mathematical operators, variables,and function calls. On other words, expression can involve:

• manual entry

• built-in functions

• user-defined functions

5

1.4.2 Overwriting variable

Once a variable has been created, it can be reassigned. In addition, if you do not wish tosee the intermediate results, you can suppress the numerical output by putting a semicolon(;) at the end of the line. Then the sequence of commands looks like this:

>> t = 5;

>> t = t+1

t =

6

1.4.3 Error messages

If we enter an expression incorrectly, MATLAB will return an error message. For example,in the following, we left out the multiplication sign, *, in the following expression

>> x = 10;

>> 5x

??? 5x

|

Error: Unexpected MATLAB expression.

1.4.4 Making corrections

To make corrections, we can, of course retype the expressions. But if the expression islengthy, we make more mistakes by typing a second time. A previously typed commandcan be recalled with the up-arrow key ↑. When the command is displayed at the commandprompt, it can be modified if needed and executed.

1.4.5 Controlling the hierarchy of operations or precedence

Let’s consider the previous arithmetic operation, but now we will include parentheses. Forexample, 1 + 2× 3 will become (1 + 2)× 3

>> (1+2)*3

ans =

9

and, from previous example

6

>> 1+2*3

ans =

7

By adding parentheses, these two expressions give different results: 9 and 7.

The order in which MATLAB performs arithmetic operations is exactly that taughtin high school algebra courses. Exponentiations are done first, followed by multiplicationsand divisions, and finally by additions and subtractions. However, the standard order ofprecedence of arithmetic operations can be changed by inserting parentheses. For example,the result of 1+2×3 is quite different than the similar expression with parentheses (1+2)×3.The results are 7 and 9 respectively. Parentheses can always be used to overrule priority,and their use is recommended in some complex expressions to avoid ambiguity.

Therefore, to make the evaluation of expressions unambiguous, MATLAB has estab-lished a series of rules. The order in which the arithmetic operations are evaluated is givenin Table 1.2. MATLAB arithmetic operators obey the same precedence rules as those in

Table 1.2: Hierarchy of arithmetic operations

Precedence Mathematical operationsFirst The contents of all parentheses are evaluated first, starting

from the innermost parentheses and working outward.Second All exponentials are evaluated, working from left to rightThird All multiplications and divisions are evaluated, working

from left to rightFourth All additions and subtractions are evaluated, starting

from left to right

most computer programs. For operators of equal precedence, evaluation is from left to right.Now, consider another example:

1

2 + 32+

4

5× 6

7

In MATLAB, it becomes

>> 1/(2+3^2)+4/5*6/7

ans =

0.7766

or, if parentheses are missing,

>> 1/2+3^2+4/5*6/7

ans =

10.1857

7

So here what we get: two different results. Therefore, we want to emphasize the importanceof precedence rule in order to avoid ambiguity.

1.4.6 Controlling the appearance of floating point number

MATLAB by default displays only 4 decimals in the result of the calculations, for example−163.6667, as shown in above examples. However, MATLAB does numerical calculationsin double precision, which is 15 digits. The command format controls how the results ofcomputations are displayed. Here are some examples of the different formats together withthe resulting outputs.

>> format short

>> x=-163.6667

If we want to see all 15 digits, we use the command format long

>> format long

>> x= -1.636666666666667e+002

To return to the standard format, enter format short, or simply format.

There are several other formats. For more details, see the MATLAB documentation,or type help format.

Note - Up to now, we have let MATLAB repeat everything that we enter at theprompt (>>). Sometimes this is not quite useful, in particular when the output is pages enlength. To prevent MATLAB from echoing what we type, simply enter a semicolon (;) atthe end of the command. For example,

>> x=-163.6667;

and then ask about the value of x by typing,

>> x

x =

-163.6667

1.4.7 Managing the workspace

The contents of the workspace persist between the executions of separate commands. There-fore, it is possible for the results of one problem to have an effect on the next one. To avoidthis possibility, it is a good idea to issue a clear command at the start of each new inde-pendent calculation.

8

>> clear

The command clear or clear all removes all variables from the workspace. Thisfrees up system memory. In order to display a list of the variables currently in the memory,type

>> who

while, whos will give more details which include size, space allocation, and class of thevariables.

1.4.8 Keeping track of your work session

It is possible to keep track of everything done during a MATLAB session with the diary

command.

>> diary

or give a name to a created file,

>> diary FileName

where FileName could be any arbitrary name you choose.

The function diary is useful if you want to save a complete MATLAB session. Theysave all input and output as they appear in the MATLAB window. When you want to stopthe recording, enter diary off. If you want to start recording again, enter diary on. Thefile that is created is a simple text file. It can be opened by an editor or a word processingprogram and edited to remove extraneous material, or to add your comments. You canuse the function type to view the diary file or you can edit in a text editor or print. Thiscommand is useful, for example in the process of preparing a homework or lab submission.

1.4.9 Entering multiple statements per line

It is possible to enter multiple statements per line. Use commas (,) or semicolons (;) toenter more than one statement at once. Commas (,) allow multiple statements per linewithout suppressing output.

>> a=7; b=cos(a), c=cosh(a)

b =

0.6570

c =

548.3170

9

1.4.10 Miscellaneous commands

Here are few additional useful commands:

• To clear the Command Window, type clc

• To abort a MATLAB computation, type ctrl-c

• To continue a line, type . . .

1.4.11 Getting help

To view the online documentation, select MATLAB Help from Help menu or MATLAB Helpdirectly in the Command Window. The preferred method is to use the Help Browser. TheHelp Browser can be started by selecting the ? icon from the desktop toolbar. On the otherhand, information about any command is available by typing

>> help Command

Another way to get help is to use the lookfor command. The lookfor command differsfrom the help command. The help command searches for an exact function name match,while the lookfor command searches the quick summary information in each function fora match. For example, suppose that we were looking for a function to take the inverse ofa matrix. Since MATLAB does not have a function named inverse, the command help

inverse will produce nothing. On the other hand, the command lookfor inverse willproduce detailed information, which includes the function of interest, inv.

>> lookfor inverse

Note - At this particular time of our study, it is important to emphasize one main point.Because MATLAB is a huge program; it is impossible to cover all the details of each functionone by one. However, we will give you information how to get help. Here are some examples:

• Use on-line help to request info on a specific function

>> help sqrt

• In the current version (MATLAB version 7), the doc function opens the on-line versionof the help manual. This is very helpful for more complex commands.

>> doc plot

10

• Use lookfor to find functions by keywords. The general form is

>> lookfor FunctionName

1.5 Exercises

Note: Due to the teaching class during this Fall 2005, the problems are temporarily removedfrom this section.

11

Chapter 2

Tutorial lessons 2

2.1 Mathematical functions

MATLAB offers many predefined mathematical functions for technical computing whichcontains a large set of mathematical functions.

Typing help elfun and help specfun calls up full lists of elementary and specialfunctions respectively.

There is a long list of mathematical functions that are built into MATLAB. Thesefunctions are called built-ins. Many standard mathematical functions, such as sin(x), cos(x),tan(x), ex, ln(x), are evaluated by the functions sin, cos, tan, exp, and log respectively inMATLAB.

Table 2.1 lists some commonly used functions, where variables x and y can be numbers,vectors, or matrices.

Table 2.1: Elementary functions

cos(x) Cosine abs(x) Absolute valuesin(x) Sine sign(x) Signum functiontan(x) Tangent max(x) Maximum valueacos(x) Arc cosine min(x) Minimum valueasin(x) Arc sine ceil(x) Round towards +∞atan(x) Arc tangent floor(x) Round towards −∞exp(x) Exponential round(x) Round to nearest integersqrt(x) Square root rem(x) Remainder after divisionlog(x) Natural logarithm angle(x) Phase anglelog10(x) Common logarithm conj(x) Complex conjugate

In addition to the elementary functions, MATLAB includes a number of predefined

12

constant values. A list of the most common values is given in Table 2.2.

Table 2.2: Predefined constant values

pi The π number, π = 3.14159 . . .i,j The imaginary unit i,

√−1Inf The infinity, ∞NaN Not a number

2.1.1 Examples

We illustrate here some typical examples which related to the elementary functions previouslydefined.

As a first example, the value of the expression y = e−a sin(x) + 10√

y, for a = 5, x = 2, andy = 8 is computed by

>> a = 5; x = 2; y = 8;

>> y = exp(-a)*sin(x)+10*sqrt(y)

y =

28.2904

The subsequent examples are

>> log(142)

ans =

4.9558

>> log10(142)

ans =

2.1523

Note the difference between the natural logarithm log(x) and the decimal logarithm (base10) log10(x).

To calculate sin(π/4) and e10, we enter the following commands in MATLAB,

>> sin(pi/4)

ans =

0.7071

>> exp(10)

ans =

2.2026e+004

13

Notes:

• Only use built-in functions on the right hand side of an expression. Reassigning thevalue to a built-in function can create problems.

• There are some exceptions. For example, i and j are pre-assigned to√−1. However,

one or both of i or j are often used as loop indices.

• To avoid any possible confusion, it is suggested to use instead ii or jj as loop indices.

2.2 Basic plotting

2.2.1 overview

MATLAB has an excellent set of graphic tools. Plotting a given data set or the resultsof computation is possible with very few commands. You are highly encouraged to plotmathematical functions and results of analysis as often as possible. Trying to understandmathematical equations with graphics is an enjoyable and very efficient way of learning math-ematics. Being able to plot mathematical functions and data freely is the most importantstep, and this section is written to assist you to do just that.

2.2.2 Creating simple plots

The basic MATLAB graphing procedure, for example in 2D, is to take a vector of x-coordinates, x = (x1, . . . , xN), and a vector of y-coordinates, y = (y1, . . . , yN), locate thepoints (xi, yi), with i = 1, 2, . . . , n and then join them by straight lines. You need to preparex and y in an identical array form; namely, x and y are both row arrays or column arrays ofthe same length.

The MATLAB command to plot a graph is plot(x,y). The vectors x = (1, 2, 3, 4, 5, 6)and y = (3,−1, 2, 4, 5, 1) produce the picture shown in Figure 2.1.

>> x = [1 2 3 4 5 6];

>> y = [3 -1 2 4 5 1];

>> plot(x,y)

Note: The plot functions has different forms depending on the input arguments. If y is avector plot(y)produces a piecewise linear graph of the elements of y versus the index of theelements of y. If we specify two vectors, as mentioned above, plot(x,y) produces a graphof y versus x.

For example, to plot the function sin (x) on the interval [0, 2π], we first create a vector ofx values ranging from 0 to 2π, then compute the sine of these values, and finally plot theresult:

14

1 2 3 4 5 6−1

0

1

2

3

4

5

Figure 2.1: Plot for the vectors x and y

>> x = 0:pi/100:2*pi;

>> y = sin(x);

>> plot(x,y)

Notes:

• 0:pi/100:2*pi yields a vector that

– starts at 0,

– takes steps (or increments) of π/100,

– stops when 2π is reached.

• If you omit the increment, MATLAB automatically increments by 1.

2.2.3 Adding titles, axis labels, and annotations

MATLAB enables you to add axis labels and titles. For example, using the graph from theprevious example, add an x- and y-axis labels.

Now label the axes and add a title. The character \pi creates the symbol π. Anexample of 2D plot is shown in Figure 2.2.

15

0 1 2 3 4 5 6 7−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

x = 0:2π

Sin

e of

x

Plot of the Sine function

Figure 2.2: Plot of the Sine function

>> xlabel(’x = 0:2\pi’)

>> ylabel(’Sine of x’)

>> title(’Plot of the Sine function’)

The color of a single curve is, by default, blue, but other colors are possible. The desiredcolor is indicated by a third argument. For example, red is selected by plot(x,y,’r’). Notethe single quotes, ’ ’, around r.

2.2.4 Multiple data sets in one plot

Multiple (x, y) pairs arguments create multiple graphs with a single call to plot. For example,these statements plot three related functions of x: y1 = 2 cos(x), y2 = cos(x), and y3 =0.5 ∗ cos(x), in the interval 0 ≤ x ≤ 2π.

>> x = 0:pi/100:2*pi;

>> y1 = 2*cos(x);

>> y2 = cos(x);

>> y3 = 0.5*cos(x);

>> plot(x,y1,’--’,x,y2,’-’,x,y3,’:’)

>> xlabel(’0 \leq x \leq 2\pi’)

>> ylabel(’Cosine functions’)

>> legend(’2*cos(x)’,’cos(x)’,’0.5*cos(x)’)

16

>> title(’Typical example of multiple plots’)

>> axis([0 2*pi -3 3])

The result of multiple data sets in one graph plot is shown in Figure 2.3.

0 1 2 3 4 5 6−3

−2

−1

0

1

2

3

0 ≤ x ≤ 2π

Cos

ine

func

tions

Typical example of multiple plots

2*cos(x)cos(x)0.5*cos(x)

Figure 2.3: Typical example of multiple plots

By default, MATLAB uses line style and color to distinguish the data sets plotted inthe graph. However, you can change the appearance of these graphic components or addannotations to the graph to help explain your data for presentation.

2.2.5 Specifying line styles and colors

It is possible to specify line styles, colors, and markers (e.g., circles, plus signs, . . . ) usingthe plot command:

plot(x,y,’style_color_marker’)

where style_color_marker is a triplet of values from Table 2.3.

To find additional information, type help plot or doc plot.

17

Table 2.3: Attributes for plot

Symbol Color Symbol Line Style Symbol Marker

k Black − Solid + Plus signr Red −− Dashed o Circleb Blue : Dotted ∗ Asteriskg Green −. Dash-dot . Pointc Cyan none No line × Crossm Magenta s Squarey Yellow d Diamond

2.3 Exercises

Note: Due to the teaching class during this Fall Quarter 2005, the problems are temporarilyremoved from this section.

18

2.4 Introduction

Matrices are the basic elements of the MATLAB environment. A matrix is a two-dimensionalarray consisting of m rows and n columns. Special cases are column vectors (n = 1) and rowvectors (m = 1).

In this section we will illustrate how to apply different operations on matrices. The followingtopics are discussed: vectors and matrices in MATLAB, the inverse of a matrix, determinants,and matrix manipulation.

MATLAB supports two types of operations, known as matrix operations and array opera-tions. Matrix operations will be discussed first.

2.5 Matrix generation

Matrices are fundamental to MATLAB. Therefore, we need to become familiar with matrixgeneration and manipulation. Matrices can be generated in several ways.

2.5.1 Entering a vector

A vector is a special case of a matrix. The purpose of this section is to show how to createvectors and matrices in MATLAB. As discussed earlier, an array of dimension 1×n is calleda row vector, whereas an array of dimension m× 1 is called a column vector. The elementsof vectors in MATLAB are enclosed by square brackets and are separated by spaces or bycommas. For example, to enter a row vector, v, type

>> v = [1 4 7 10 13]

v =

1 4 7 10 13

Column vectors are created in a similar way, however, semicolon (;) must separate thecomponents of a column vector,

>> w = [1;4;7;10;13]

w =

1

4

7

10

13

On the other hand, a row vector is converted to a column vector using the transpose operator.The transpose operation is denoted by an apostrophe or a single quote (’).

19

>> w = v’

w =

1

4

7

10

13

Thus, v(1) is the first element of vector v, v(2) its second element, and so forth.

Furthermore, to access blocks of elements, we use MATLAB’s colon notation (:). For exam-ple, to access the first three elements of v, we write,

>> v(1:3)

ans =

1 4 7

Or, all elements from the third through the last elements,

>> v(3,end)

ans =

7 10 13

where end signifies the last element in the vector. If v is a vector, writing

>> v(:)

produces a column vector, whereas writing

>> v(1:end)

produces a row vector.

2.5.2 Entering a matrix

A matrix is an array of numbers. To type a matrix into MATLAB you must

• begin with a square bracket, [

• separate elements in a row with spaces or commas (,)

• use a semicolon (;) to separate rows

• end the matrix with another square bracket, ].

20

Here is a typical example. To enter a matrix A, such as,

A =

1 2 34 5 67 8 9

(2.1)

type,

>> A = [1 2 3; 4 5 6; 7 8 9]

MATLAB then displays the 3× 3 matrix as follows,

A =

1 2 3

4 5 6

7 8 9

Note that the use of semicolons (;) here is different from their use mentioned earlier tosuppress output or to write multiple commands in a single line.

Once we have entered the matrix, it is automatically stored and remembered in theWorkspace. We can refer to it simply as matrix A. We can then view a particular element ina matrix by specifying its location. We write,

>> A(2,1)

ans =

4

A(2,1) is an element located in the second row and first column. Its value is 4.

2.5.3 Matrix indexing

We select elements in a matrix just as we did for vectors, but now we need two indices.The element of row i and column j of the matrix A is denoted by A(i,j). Thus, A(i,j)in MATLAB refers to the element Aij of matrix A. The first index is the row number andthe second index is the column number. For example, A(1,3) is an element of first row andthird column. Here, A(1,3)=3.

Correcting any entry is easy through indexing. Here we substitute A(3,3)=9 byA(3,3)=0. The result is

>> A(3,3) = 0

A =

1 2 3

4 5 6

7 8 0

21

Single elements of a matrix are accessed as A(i,j), where i ≥ 1 and j ≥ 1. Zero or negativesubscripts are not supported in MATLAB.

2.5.4 Colon operator

The colon operator will prove very useful and understanding how it works is the key toefficient and convenient usage of MATLAB. It occurs in several different forms.

Often we must deal with matrices or vectors that are too large to enter one ele-ment at a time. For example, suppose we want to enter a vector x consisting of points(0, 0.1, 0.2, 0.3, · · · , 5). We can use the command

>> x = 0:0.1:5;

The row vector has 51 elements.

2.5.5 Linear spacing

On the other hand, there is a command to generate linearly spaced vectors: linspace. Itis similar to the colon operator (:), but gives direct control over the number of points. Forexample,

y = linspace(a,b)

generates a row vector y of 100 points linearly spaced between and including a and b.

y = linspace(a,b,n)

generates a row vector y of n points linearly spaced between and including a and b. This isuseful when we want to divide an interval into a number of subintervals of the same length.For example,

>> theta = linspace(0,2*pi,101)

divides the interval [0, 2π] into 100 equal subintervals, then creating a vector of 101 elements.

2.5.6 Colon operator in a matrix

The colon operator can also be used to pick out a certain row or column. For example, thestatement A(m:n,k:l specifies rows m to n and column k to l. Subscript expressions referto portions of a matrix. For example,

22

>> A(2,:)

ans =

4 5 6

is the second row elements of A.

The colon operator can also be used to extract a sub-matrix from a matrix A.

>> A(:,2:3)

ans =

2 3

5 6

8 0

A(:,2:3) is a sub-matrix with the last two columns of A.

A row or a column of a matrix can be deleted by setting it to a null vector, [ ].

>> A(:,2)=[]

ans =

1 3

4 6

7 0

2.5.7 Creating a sub-matrix

To extract a submatrix B consisting of rows 2 and 3 and columns 1 and 2 of the matrix A,do the following

>> B = A([2 3],[1 2])

B =

4 5

7 8

To interchange rows 1 and 2 of A, use the vector of row indices together with the colonoperator.

>> C = A([2 1 3],:)

C =

4 5 6

1 2 3

7 8 0

It is important to note that the colon operator (:) stands for all columns or all rows. Tocreate a vector version of matrix A, do the following

23

>> A(:)

ans =

1

2

3

4

5

6

7

8

0

The submatrix comprising the intersection of rows p to q and columns r to s is denoted byA(p:q,r:s).

As a special case, a colon (:) as the row or column specifier covers all entries in that row orcolumn; thus

• A(:,j) is the jth column of A, while

• A(i,:) is the ith row, and

• A(end,:) picks out the last row of A.

The keyword end, used in A(end,:), denotes the last index in the specified dimension. Hereare some examples.

>> A

A =

1 2 3

4 5 6

7 8 9

>> A(2:3,2:3)

ans =

5 6

8 9

>> A(end:-1:1,end)

ans =

9

6

3

24

>> A([1 3],[2 3])

ans =

2 3

8 9

2.5.8 Deleting row or column

To delete a row or column of a matrix, use the empty vector operator, [ ].

>> A(3,:) = []

A =

1 2 3

4 5 6

Third row of matrix A is now deleted. To restore the third row, we use a technique forcreating a matrix

>> A = [A(1,:);A(2,:);[7 8 0]]

A =

1 2 3

4 5 6

7 8 0

Matrix A is now restored to its original form.

2.5.9 Dimension

To determine the dimensions of a matrix or vector, use the command size. For example,

>> size(A)

ans =

3 3

means 3 rows and 3 columns.

Or more explicitly with,

>> [m,n]=size(A)

25

2.5.10 Continuation

If it is not possible to type the entire input on the same line, use consecutive periods, calledan ellipsis . . ., to signal continuation, then continue the input on the next line.

B = [4/5 7.23*tan(x) sqrt(6); ...

1/x^2 0 3/(x*log(x)); ...

x-7 sqrt(3) x*sin(x)];

Note that blank spaces around +, −, = signs are optional, but they improve readability.

2.5.11 Transposing a matrix

The transpose operation is denoted by an apostrophe or a single quote (’). It flips a matrixabout its main diagonal and it turns a row vector into a column vector. Thus,

>> A’

ans =

1 4 7

2 5 8

3 6 0

By using linear algebra notation, the transpose of m× n real matrix A is the n×m matrixthat results from interchanging the rows and columns of A. The transpose matrix is denotedAT .

2.5.12 Concatenating matrices

Matrices can be made up of sub-matrices. Here is an example. First, let’s recall our previousmatrix A.

A =

1 2 3

4 5 6

7 8 9

The new matrix B will be,

>> B = [A 10*A; -A [1 0 0; 0 1 0; 0 0 1]]

B =

1 2 3 10 20 30

26

4 5 6 40 50 60

7 8 9 70 80 90

-1 -2 -3 1 0 0

-4 -5 -6 0 1 0

-7 -8 -9 0 0 1

2.5.13 Matrix generators

MATLAB provides functions that generates elementary matrices. The matrix of zeros, thematrix of ones, and the identity matrix are returned by the functions zeros, ones, and eye,respectively.

Table 2.4: Elementary matrices

eye(m,n) Returns an m-by-n matrix with 1 on the main diagonaleye(n) Returns an n-by-n square identity matrixzeros(m,n) Returns an m-by-n matrix of zerosones(m,n) Returns an m-by-n matrix of onesdiag(A) Extracts the diagonal of matrix Arand(m,n) Returns an m-by-n matrix of random numbers

For a complete list of elementary matrices and matrix manipulations, type help elmat

or doc elmat. Here are some examples:

1. >> b=ones(3,1)

b =

1

1

1

Equivalently, we can define b as >> b=[1;1;1]

2. >> eye(3)

ans =

1 0 0

0 1 0

0 0 1

3. >> c=zeros(2,3)

c =

0 0 0

27

0 0 0

In addition, it is important to remember that the three elementary operations of ad-dition (+), subtraction (−), and multiplication (∗) apply also to matrices whenever thedimensions are compatible.

Two other important matrix generation functions are rand and randn, which generatematrices of (pseudo-)random numbers using the same syntax as eye.

In addition, matrices can be constructed in a block form. With C defined by C = [1

2; 3 4], we may create a matrix D as follows

>> D = [C zeros(2); ones(2) eye(2)]

D =

1 2 0 0

3 4 0 0

1 1 1 0

1 1 0 1

2.5.14 Special matrices

MATLAB provides a number of special matrices (see Table 2.5). These matrices have inter-esting properties that make them useful for constructing examples and for testing algorithms.For more information, see MATLAB documentation.

Table 2.5: Special matrices

hilb Hilbert matrixinvhilb Inverse Hilbert matrixmagic Magic squarepascal Pascal matrixtoeplitz Toeplitz matrixvander Vandermonde matrixwilkinson Wilkinson’s eigenvalue test matrix

28

2.6 Exercises


29

Chapter 3

Array operations and Linearequations

3.1 Array operations

MATLAB has two different types of arithmetic operations: matrix arithmetic operationsand array arithmetic operations. We have seen matrix arithmetic operations in the previouslab. Now, we are interested in array operations.

3.1.1 Matrix arithmetic operations

As we mentioned earlier, MATLAB allows arithmetic operations: +, −, ∗, and ˆ to becarried out on matrices. Thus,

A+B or B+A is valid if A and B are of the same sizeA*B is valid if A’s number of column equals B’s number of rowsA^2 is valid if A is square and equals A*Aα*A or A*α multiplies each element of A by α

3.1.2 Array arithmetic operations

On the other hand, array arithmetic operations or array operations for short, are doneelement-by-element. The period character, ., distinguishes the array operations from thematrix operations. However, since the matrix and array operations are the same for addition(+) and subtraction (−), the character pairs (.+) and (.−) are not used. The list of arrayoperators is shown below in Table 3.2. If A and B are two matrices of the same size withelements A = [aij] and B = [bij], then the command

30

.* Element-by-element multiplication

./ Element-by-element division

.^ Element-by-element exponentiation

Table 3.1: Array operators

>> C = A.*B

produces another matrix C of the same size with elements cij = aijbij. For example, usingthe same 3× 3 matrices,

A =

1 2 34 5 67 8 9

, B =

10 20 3040 50 6070 80 90

we have,

>> C = A.*B

C =

10 40 90

160 250 360

490 640 810

To raise a scalar to a power, we use for example the command 10^2. If we want theoperation to be applied to each element of a matrix, we use .^2. For example, if we wantto produce a new matrix whose elements are the square of the elements of the matrix A, weenter

>> A.^2

ans =

1 4 9

16 25 36

49 64 81

The relations below summarize the above operations. To simplify, let’s consider twovectors U and V with elements U = [ui] and V = [vj].

U. ∗ V produces [u1v1 u2v2 . . . unvn]U./V produces [u1/v1 u2/v2 . . . un/vn]U.ˆV produces [uv1

1 uv22 . . . uvn

n ]

31

Operation Matrix Array

Addition + +Subtraction − −

Multiplication ∗ .∗Division / ./

Left division \ .\Exponentiation ˆ .̂

Table 3.2: Summary of matrix and array operations

3.2 Solving linear equations

One of the problems encountered most frequently in scientific computation is the solution ofsystems of simultaneous linear equations. With matrix notation, a system of simultaneouslinear equations is written

Ax = b (3.1)

where there are as many equations as unknown. A is a given square matrix of order n, b is agiven column vector of n components, and x is an unknown column vector of n components.

In linear algebra we learn that the solution to Ax = b can be written as x = A−1b, whereA−1 is the inverse of A.

For example, consider the following system of linear equations

x + 2y + 3z = 14x + 5y + 6z = 17x + 8y = 1

The coefficient matrix A is

A =

1 2 34 5 67 8 9

and the vector b =

111

With matrix notation, a system of simultaneous linear equations is written

Ax = b (3.2)

This equation can be solved for x using linear algebra. The result is x = A−1b.

There are typically two ways to solve for x in MATLAB:

1. The first one is to use the matrix inverse, inv.

32

>> A = [1 2 3; 4 5 6; 7 8 0];

>> b = [1; 1; 1];

>> x = inv(A)*b

x =

-1.0000

1.0000

-0.0000

2. The second one is to use the backslash (\)operator. The numerical algorithm behindthis operator is computationally efficient. This is a numerically reliable way of solvingsystem of linear equations by using a well-known process of Gaussian elimination.

>> A = [1 2 3; 4 5 6; 7 8 0];

>> b = [1; 1; 1];

>> x = A\b

x =

-1.0000

1.0000

-0.0000

This problem is at the heart of many problems in scientific computation. Hence it is impor-tant that we know how to solve this type of problem efficiently.

Now, we know how to solve a system of linear equations. In addition to this, we willsee some additional details which relate to this particular topic.

3.2.1 Matrix inverse

Let’s consider the same matrix A.

A =

1 2 34 5 67 8 0

Calculating the inverse of A manually is probably not a pleasant work. Here the hand-calculation of A−1 gives as a final result:

A−1 =1

9

−16 8 −114 −7 2−1 2 −1

In MATLAB, however, it becomes as simple as the following commands:

33

>> A = [1 2 3; 4 5 6; 7 8 0];

>> inv(A)

ans =

-1.7778 0.8889 -0.1111

1.5556 -0.7778 0.2222

-0.1111 0.2222 -0.1111

which is similar to:

A−1 =1

9

−16 8 −1

14 −7 2−1 2 −1

and the determinant of A is

>> det(A)

ans =

27

For further details on applied numerical linear algebra, see [10] and [11].

3.2.2 Matrix functions

MATLAB provides many matrix functions for various matrix/vector manipulations; seeTable 3.3 for some of these functions. Use the online help of MATLAB to find how to usethese functions.

det Determinantdiag Diagonal matrices and diagonals of a matrixeig Eigenvalues and eigenvectorsinv Matrix inversenorm Matrix and vector normsrank Number of linearly independent rows or columns

Table 3.3: Matrix functions

3.3 Exercises


34

Chapter 4

Introduction to programming inMATLAB

4.1 Introduction

So far in these lab sessions, all the commands were executed in the Command Window.The problem is that the commands entered in the Command Window cannot be savedand executed again for several times. Therefore, a different way of executing repeatedlycommands with MATLAB is:

1. to create a file with a list of commands,

2. save the file, and

3. run the file.

If needed, corrections or changes can be made to the commands in the file. The files thatare used for this purpose are called script files or scripts for short.

This section covers the following topics:

• M-File Scripts

• M-File Functions

4.2 M-File Scripts

A script file is an external file that contains a sequence of MATLAB statements. Scriptfiles have a filename extension .m and are often called M-files. M-files can be scripts thatsimply execute a series of MATLAB statements, or they can be functions that can acceptarguments and can produce one or more outputs.

35

4.2.1 Examples

Here are two simple scripts.

Example 1

Consider the system of equations:

x + 2y + 3z = 13x + 3y + 4z = 12x + 3y + 3z = 2

Find the solution x to the system of equations.

Solution:

• Use the MATLAB editor to create a file: File → New → M-file.

• Enter the following statements in the file:

A = [1 2 3; 3 3 4; 2 3 3];

b = [1; 1; 2];

x = A\b

• Save the file, for example, example1.m.

• Run the file, in the command line, by typing:

>> example1

x =

-0.5000

1.5000

-0.5000

When execution completes, the variables (A, b, and x) remain in the workspace. To see alisting of them, enter whos at the command prompt.

Note: The MATLAB editor is both a text editor specialized for creating M-files and agraphical MATLAB debugger. The MATLAB editor has numerous menus for tasks such assaving, viewing, and debugging. Because it performs some simple checks and also uses colorto differentiate between various elements of codes, this text editor is recommended as thetool of choice for writing and editing M-files.

There is another way to open the editor:

36

>> edit

or

>> edit filename.m

to open filename.m.

Example 2

Plot the following cosine functions, y1 = 2 cos(x), y2 = cos(x), and y3 = 0.5 ∗ cos(x), in theinterval 0 ≤ x ≤ 2π. This example has been presented in previous Chapter. Here we putthe commands in a file.

• Create a file, say example2.m, which contains the following commands:

x = 0:pi/100:2*pi;

y1 = 2*cos(x);

y2 = cos(x);

y3 = 0.5*cos(x);

plot(x,y1,’--’,x,y2,’-’,x,y3,’:’)

xlabel(’0 \leq x \leq 2\pi’)

ylabel(’Cosine functions’)

legend(’2*cos(x)’,’cos(x)’,’0.5*cos(x)’)

title(’Typical example of multiple plots’)

axis([0 2*pi -3 3])

• Run the file by typing example2 in the Command Window.

4.2.2 Script side-effects

All variables created in a script file are added to the workspace. This may have undesirableeffects, because:

• Variables already existing in the workspace may be overwritten.

• The execution of the script can be affected by the state variables in the workspace.

As a result, because scripts have some undesirable side-effects, it is better to code anycomplicated applications using rather function M-file.

37

4.3 M-File functions

As mentioned earlier, functions are programs (or routines) that accept input arguments andreturn output arguments. Each M-file function (or function or M-file for short) has its ownarea of workspace, separated from the MATLAB base workspace.

4.3.1 Anatomy of a M-File function

This simple function shows the basic parts of an M-file.

function f = factorial(n) (1)

% FACTORIAL(N) returns the factorial of N. (2)

% Compute a factorial value. (3)

f = prod(1:n); (4)

The first line of a function M-file starts with the keyword function. It gives the functionname and order of arguments. In the case of function factorial, there are up to one outputargument and one input argument. Table 4.1 summarizes the M-file function.

As an example, for n = 5, the result is,

>> f = factorial(5)

f =

120

Table 4.1: Anatomy of a M-File function

Part no. M-file element Description

(1) Function Define the function name, and thedefinition number and order of input andline output arguments

(2) H1 line A one line summary descriptionof the program, displayed when yourequest Help

(3) Help text A more detailed description ofthe program

(4) Function body Program code that performsthe actual computations

Both functions and scripts can have all of these parts, except for the function definitionline which applies to function only.

38

In addition, it is important to note that function name must begin with a letter, andmust be no longer than than the maximum of 63 characters. Furthermore, the name of thetext file that you save will consist of the function name with the extension .m. Thus, theabove example file would be factorial.m.

Table 4.2 summarizes the differences between scripts and functions.

Table 4.2: Difference between scripts and functions

Scripts Functions

- Do not accept input - Can accept input arguments andarguments or return output return output arguments.arguments.- Store variables in a - Store variables in a workspaceworkspace that is shared internal to the function.with other scripts- Are useful for automating - Are useful for extending the MATLABa series of commands language for your application

39

4.3.2 Input and output arguments

As mentioned above, the input arguments are listed inside parentheses following the functionname. The output arguments are listed inside the brackets on the left side. They are usedto transfer the output from the function file. The general form looks like this

function [outputs] = function_name(inputs)

Function file can have none, one, or several output arguments. Table 4.3 illustrates somepossible combinations of input and output arguments.

Table 4.3: Example of input and output arguments

function C=FtoC(F) One input argument andone output argument

function area=TrapArea(a,b,h) Three inputs and one outputfunction [h,d]=motion(v,angle) Two inputs and two outputs

4.4 Input to a script file

When a script file is executed, the variables that are used in the calculations within the filemust have assigned values. The assignment of a value to a variable can be done in threeways.

1. The variable is defined in the script file.

2. The variable is defined in the command prompt.

3. The variable is entered when the script is executed.

We have already seen the two first cases. Here, we will focus our attention on the third one.In this case, the variable is defined in the script file. When the file is executed, the user isprompted to assign a value to the variable in the command prompt. This is done by usingthe input command. Here is an example.

% This script file calculates the average of points

% scored in three games.

% The point from each game are assigned to a variable

% by using the ‘input’ command.

game1 = input(’Enter the points scored in the first game ’);

40

game2 = input(’Enter the points scored in the second game ’);

game3 = input(’Enter the points scored in the third game ’);

average = (game1+game2+game3)/3

The following shows the command prompt when this script file (saved as example3) isexecuted.

>> example3

>> Enter the points scored in the first game 15

>> Enter the points scored in the second game 23

>> Enter the points scored in the third game 10

average =

16

The input command can also be used to assign string to a variable. For more information,see MATLAB documentation.

A typical example of M-file function programming can be found in a recent paper whichrelated to the solution of the ordinary differential equation (ODE) [12].

4.5 Output commands

As discussed before, MATLAB automatically generates a display when commands are exe-cuted. In addition to this automatic display, MATLAB has several commands that can beused to generate displays or outputs.

Two commands that are frequently used to generate output are: disp and fprintf.The main differences between these two commands can be summarized as follows (Table4.4).

Table 4.4: disp and fprintf commands

disp . Simple to use.. Provide limited control over the appearance of output

fprintf . Slightly more complicated than disp.. Provide total control over the appearance of output

41

4.6 Exercises

1. Liz buys three apples, a dozen bananas, and one cantaloupe for $2.36. Bob buys a dozenapples and two cantaloupe for $5.26. Carol buys two bananas and three cantaloupefor $2.77. How much do single pieces of each fruit cost?

2. Write a function file that converts temperature in degrees Fahrenheit (◦F) to degreesCentigrade (◦C). Use input and fprintf commands to display a mix of text andnumbers. Recall the conversion formulation, C = 5/9 ∗ (F− 32).

3. Write a user-defined MATLAB function, with two input and two output argumentsthat determines the height in centimeters (cm) and mass in kilograms (kg)of a personfrom his height in inches (in.) and weight in pounds (lb).

(a) Determine in SI units the height and mass of a 5 ft.15 in. person who weight 180lb.

(b) Determine your own height and weight in SI units.

42

Chapter 5

Control flow and operators

5.1 Introduction

MATLAB is also a programming language. Like other computer programming languages,MATLAB has some decision making structures for control of command execution. Thesedecision making or control flow structures include for loops, while loops, and if-else-end

constructions. Control flow structures are often used in script M-files and function M-files.

By creating a file with the extension .m, we can easily write and run programs. Wedo not need to compile the program since MATLAB is an interpretative (not compiled)language. MATLAB has thousand of functions, and you can add your own using m-files.

MATLAB provides several tools that can be used to control the flow of a program(script or function). In a simple program as shown in the previous Chapter, the commandsare executed one after the other. Here we introduce the flow control structure that makepossible to skip commands or to execute specific group of commands.

5.2 Control flow

MATLAB has four control flow structures: the if statement, the for loop, the while loop,and the switch statement.

5.2.1 The ‘‘if...end’’ structure

MATLAB supports the variants of “if” construct.

• if ... end

• if ... else ... end

43

• if ... elseif ... else ... end

The simplest form of the if statement is

if expression

statements

end

Here are some examples based on the familiar quadratic formula.

1. discr = b*b - 4*a*c;

if discr < 0

disp(’Warning: discriminant is negative, roots are

imaginary’);

end

2. discr = b*b - 4*a*c;

if discr < 0


imaginary’);

else

disp(’Roots are real, but may be repeated’)

end

3. discr = b*b - 4*a*c;

if discr < 0


imaginary’);

elseif discr == 0

disp(’Discriminant is zero, roots are repeated’)

else

disp(’Roots are real’)

end

It should be noted that:

• elseif has no space between else and if (one word)

• no semicolon (;) is needed at the end of lines containing if, else, end

• indentation of if block is not required, but facilitate the reading.

• the end statement is required

44

5.2.2 Relational and logical operators

A relational operator compares two numbers by determining whether a comparison is trueor false. Relational operators are shown in Table 5.1.

Table 5.1: Relational and logical operators

Operator Description

> Greater than< Less than

>= Greater than or equal to<= Less than or equal to== Equal to∼= Not equal to& AND operator| OR operator∼ NOT operator

Note that the “equal to” relational operator consists of two equal signs (==) (with no spacebetween them), since = is reserved for the assignment operator.

5.2.3 The ‘‘for...end’’ loop

In the for ... end loop, the execution of a command is repeated at a fixed and predeter-mined number of times. The syntax is

for variable = expression

statements

end

Usually, expression is a vector of the form i:s:j. A simple example of for loop is

for ii=1:5

x=ii*ii

end

It is a good idea to indent the loops for readability, especially when they are nested. Notethat MATLAB editor does it automatically.

Multiple for loops can be nested, in which case indentation helps to improve thereadability. The following statements form the 5-by-5 symmetric matrix A with (i, j) elementi/j for j ≥ i:

45

n = 5; A = eye(n);

for j=2:n

for i=1:j-1

A(i,j)=i/j;

A(j,i)=i/j;

end

end

5.2.4 The ‘‘while...end’’ loop

This loop is used when the number of passes is not specified. The looping continues until astated condition is satisfied. The while loop has the form:

while expression

statements

end

The statements are executed as long as expression is true.

x = 1

while x <= 10

x = 3*x

end

It is important to note that if the condition inside the looping is not well defined, the loopingwill continue indefinitely. If this happens, we can stop the execution by pressing Ctrl-C.

5.2.5 Other flow structures

• The break statement. A while loop can be terminated with the break statement,which passes control to the first statement after the corresponding end. The break

statement can also be used to exit a for loop.

• The continue statement can also be used to exit a for loop to pass immediately tothe next iteration of the loop, skipping the remaining statements in the loop.

• Other control statements include return, continue, switch, etc. For more detailabout these commands, consul MATLAB documentation.

46

5.2.6 Operator precedence

We can build expressions that use any combination of arithmetic, relational, and logicaloperators. Precedence rules determine the order in which MATLAB evaluates an expression.We have already seen this in the “Tutorial Lessons”.

Here we add other operators in the list. The precedence rules for MATLAB are shownin this list (Table 5.2), ordered from highest (1) to lowest (9) precedence level. Operatorsare evaluated from left to right.

Table 5.2: Operator precedence

Precedence Operator

1 Parentheses ()2 Transpose (. ′), power (.ˆ), matrix power (ˆ)3 Unary plus (+), unary minus (−), logical negation (∼)4 Multiplication (. ∗), right division (. /), left division (.\),

matrix multiplication (∗), matrix right division (/),matrix left division (\)

5 Addition (+), subtraction (−)6 Colon operator (:)7 Less than (<), less than or equal to (≤), greater (>),

greater than or equal to (≥), equal to (==), not equal to (∼=)8 Element-wise AND, (&)9 Element-wise OR, (|)

5.3 Saving output to a file

In addition to displaying output on the screen, the command fprintf can be used forwriting the output to a file. The saved data can subsequently be used by MATLAB or othersoftwares.

To save the results of some computation to a file in a text format requires the followingsteps:

1. Open a file using fopen

2. Write the output using fprintf

3. Close the file using fclose

Here is an example (script) of its use.

47

% write some variable length strings to a file

op = fopen(’weekdays.txt’,’wt’);

fprintf(op,’Sunday\nMonday\nTuesday\nWednesday\n’);

fprintf(op,’Thursday\nFriday\nSaturday\n’);

fclose(op);

This file (weekdays.txt) can be opened with any program that can read .txt file.

5.4 Exercises


48

Chapter 6

Debugging M-files

6.1 Introduction

This section introduces general techniques for finding errors in M-files. Debugging is theprocess by which you isolate and fix errors in your program or code.

Debugging helps to correct two kind of errors:

• Syntax errors - For example omitting a parenthesis or misspelling a function name.

• Run-time errors - Run-time errors are usually apparent and difficult to track down.They produce unexpected results.

6.2 Debugging process

We can debug the M-files using the Editor/Debugger as well as using debugging functionsfrom the Command Window. The debugging process consists of

• Preparing for debugging

• Setting breakpoints

• Running an M-file with breakpoints

• Stepping through an M-file

• Examining values

• Correcting problems

• Ending debugging

49

6.2.1 Preparing for debugging

Here we use the Editor/Debugger for debugging. Do the following to prepare for debugging:

• Open the file

• Save changes

• Be sure the file you run and any files it calls are in the directories that are on thesearch path.

6.2.2 Setting breakpoints

Set breakpoints to pause execution of the function, so we can examine where the problemmight be. There are three basic types of breakpoints:

• A standard breakpoint, which stops at a specified line.

• A conditional breakpoint, which stops at a specified line and under specified conditions.

• An error breakpoint that stops when it produces the specified type of warning, error,NaN, or infinite value.

You cannot set breakpoints while MATLAB is busy, for example, running an M-file.

6.2.3 Running with breakpoints

After setting breakpoints, run the M-file from the Editor/Debugger or from the CommandWindow. Running the M-file results in the following:

• The prompt in the Command Window changes to

K>>

indicating that MATLAB is in debug mode.

• The program pauses at the first breakpoint. This means that line will be executedwhen you continue. The pause is indicated by the green arrow.

• In breakpoint, we can examine variable, step through programs, and run other callingfunctions.

50

6.2.4 Examining values

While the program is paused, we can view the value of any variable currently in theworkspace. Examine values when we want to see whether a line of code has producedthe expected result or not. If the result is as expected, step to the next line, and continuerunning. If the result is not as expected, then that line, or the previous line, contains anerror. When we run a program, the current workspace is shown in the Stack field. Use who

or whos to list the variables in the current workspace.

Viewing values as datatips

First, we position the cursor to the left of a variable on that line. Its current value appears.This is called a datatip, which is like a tooltip for data. If you have trouble getting thedatatip to appear, click in the line and then move the cursor next to the variable.

6.2.5 Correcting and ending debugging

While debugging, we can change the value of a variable to see if the new value producesexpected results. While the program is paused, assign a new value to the variable in the Com-mand Window, Workspace browser, or Array Editor. Then continue running and steppingthrough the program.

6.2.6 Ending debugging

After identifying a problem, end the debugging session. It is best to quit debug mode beforeediting an M-file. Otherwise, you can get unexpected results when you run the file. To enddebugging, select Exit Debug Mode from the Debug menu.

6.2.7 Correcting an M-file

To correct errors in an M-file,

• Quit debugging

• Do not make changes to an M-file while MATLAB is in debug mode

• Make changes to the M-file

• Save the M-file

• Clear breakpoints

51

• Run the M-file again to be sure it produces the expected results.

For details on debugging process, see MATLAB documentation.

52

Appendix A

Summary of commands

Table A.1: Arithmetic operators and special characters

Character Description

+ Addition− Subtraction∗ Multiplication (scalar and array)/ Division (right)ˆ Power or exponentiation: Colon; creates vectors with equally spaced elements; Semi-colon; suppresses display; ends row in array, Comma; separates array subscripts

. . . Continuation of lines% Percent; denotes a comment; specifies output format′ Single quote; creates string; specifies matrix transpose= Assignment operator( ) Parentheses; encloses elements of arrays and input arguments[ ] Brackets; encloses matrix elements and output arguments

53

Table A.2: Array operators


.∗ Array multiplication

./ Array (right) division

.ˆ Array power

.\ Array (left) division.′ Array (nonconjugated) transpose

Table A.3: Relational and logical operators


< Less than≤ Less than or equal to> Greater than≥ Greater than or equal to

== Equal to∼= Not equal to& Logical or element-wise AND| Logical or element-wise OR

&& Short-circuit AND| | Short-circuit OR

54

Table A.4: Managing workspace and file commands

Command Description

cd Change current directoryclc Clear the Command Window

clear (all) Removes all variables from the workspaceclear x Remove x from the workspacecopyfile Copy file or directorydelete Delete filesdir Display directory listing

exist Check if variables or functions are definedhelp Display help for MATLAB functions

lookfor Search for specified word in all help entriesmkdir Make new directory

movefile Move file or directorypwd Identify current directory

rmdir Remove directorytype Display contents of filewhat List MATLAB files in current directorywhich Locate functions and fileswho Display variables currently in the workspacewhos Display information on variables in the workspace

Table A.5: Predefined variables and math constants

Variable Description

ans Value of last variable (answer)eps Floating-point relative accuracyi Imaginary unit of a complex number

Inf Infinity (∞)eps Floating-point relative accuracyj Imaginary unit of a complex number

NaN Not a numberpi The number π (3.14159 . . .)

55

Table A.6: Elementary matrices and arrays

Command Description

eye Identity matrixlinspace Generate linearly space vectors

ones Create array of all onesrand Uniformly distributed random numbers and arrayszeros Create array of all zeros

Table A.7: Arrays and Matrices: Basic information

Command Description

disp Display text or arrayisempty Determine if input is empty matrixisequal Test arrays for equalitylength Length of vectorndims Number of dimensionsnumel Number of elementssize Size of matrix

Table A.8: Arrays and Matrices: operations and manipulation

Command Description

cross Vector cross productdiag Diagonal matrices and diagonals of matrixdot Vector dot productend Indicate last index of arrayfind Find indices of nonzero elementskron Kronecker tensor productmax Maximum value of arraymin Minimum value of arrayprod Product of array elements

reshape Reshape arraysort Sort array elementssum Sum of array elementssize Size of matrix

56

Table A.9: Arrays and Matrices: matrix analysis and linear equations

Command Description

cond Condition number with respect to inversiondet Determinantinv Matrix inverse

linsolve Solve linear system of equationslu LU factorization

norm Matrix or vector normnull Null spaceorth Orthogonalizationrank Matrix rankrref Reduced row echelon formtrace Sum of diagonal elements

57

Appendix B

Release notes for Release 14 withService Pack 2

B.1 Summary of changes

MATLAB 7 Release 14 with Service Pack 2 (R14SP2) includes several new features. Themajor focus of R14SP2 is on improving the quality of the product. This document doesn’tattempt to provide a complete specification of every single feature, but instead providesa brief introduction to each of them. For full details, you should refer to the MATLABdocumentation (Release Notes).

The following key points may be relevant:

1. Spaces before numbers - For example: A* .5, you will typically get a mystifyingmessage saying that A was previously used as a variable. There are two workarounds:

(a) Remove all the spaces:

A*.5

(b) Or, put a zero in front of the dot:

A * 0.5

2. RHS empty matrix - The right-hand side must literally be the empty matrix [ ]. Itcannot be a variable that has the value [ ], as shown here:

rhs = [];

A(:,2) = rhs

??? Subscripted assignment dimension mismatch

58

3. New format option - We can display MATLAB output using two new formats:short eng and long eng.

• short eng – Displays output in engineering format that has at least 5 digits anda power that is a multiple of three.

>> format short eng

>> pi

ans =

3.1416e+000

• long eng – Displays output in engineering format that has 16 significant digitsand a power that is a multiple of three.

>> format long eng

>> pi

ans =

3.14159265358979e+000

4. Help - To get help for a subfunction, use

>> help function_name>subfunction_name

In previous versions, the syntax was

>> help function_name/subfunction_name

This change was introduced in R14 (MATLAB 7.0) but was not documented. Use theMathWorks Web site search features to look for the latest information.

5. Publishing - Publishing to LATEXnow respects the image file type you specify in pref-erences rather than always using EPSC2-files.

• The Publish image options in Editor/Debugger preferences for Publishing Imageshave changed slightly. The changes prevent you from choosing invalid formats.

• The files created when publishing using cells now have more natural extensions.For example, JPEG-files now have a .jpg instead of a .jpeg extension, and EPSC2-files now have an .eps instead of an .epsc2 extension.

• Notebook will no longer support Microsoft Word 97 starting in the next releaseof MATLAB.

6. Debugging - Go directly to a subfunction or using the enhanced Go To dialog box.Click the Name column header to arrange the list of function alphabetically, or clickthe Line column header to arrange the list by the position of the functions in the file.

59

B.2 Other changes

1. There is a new command mlint, which will scan an M-file and show inefficiencies inthe code. For example, it will tell you if you’ve defined a variable you’ve never used, ifyou’ve failed to pre-allocate an array, etc. These are common mistakes in EA1 whichproduce runnable but inefficient code.

2. You can comment-out a block of code without putting % at the beginning of each line.The format is

%{

Stuff you want MATLAB to ignore...

%}

The delimiters %{ and %} must appear on lines by themselves, and it may not workwith the comments used in functions to interact with the help system (like the H1line).

3. There is a new function linsolve which will solve Ax = b but with the user’s choice ofalgorithm. This is in addition to left division x = A\b which uses a default algorithm.

4. The eps constant now takes an optional argument. eps(x) is the same as the oldeps*abs(x).

5. You can break an M-file up into named cells (blocks of code), each of which you canrun separately. This may be useful for testing/debugging code.

6. Functions now optionally end with the end keyword. This keyword is mandatory whenworking with nested functions.

B.3 Further details

1. You can dock and un-dock windows from the main window by clicking on an icon.Thus you can choose to have all Figures, M-files being edited, help browser, commandwindow, etc. All appear as panes in a single window.

2. Error messages in the command window resulting from running an M-file now includea clickable link to the offending line in the editor window containing the M-file.

3. You can customize figure interactively (labels, line styles, etc.) and then automaticallygenerate the code which reproduces the customized figure.

60

4. feval is no longer needed when working with function handles, but still works forbackward compatibility. For example, x=@sin; x(pi) will produce sin(pi) just likefeval(x,pi) does, but faster.

5. You can use function handles to create anonymous functions.

6. There is support for nested functions, namely, functions defined within the body ofanother function. This is in addition to sub-functions already available in version 6.5.

7. There is more support in arithmetic operations for numeric data types other thandouble, e.g. single, int8, int16, uint8, uint32, etc.

Finally, please visit our webpage for other details:

http://computing.mccormick.northwestern.edu/matlab/

61

Appendix C

Main characteristics of MATLAB

C.1 History

• Developed primarily by Cleve Moler in the 1970’s

• Derived from FORTRAN subroutines LINPACK and EISPACK, linear and eigenvaluesystems.

• Developed primarily as an interactive system to access LINPACK and EISPACK.

• Gained its popularity through word of mouth, because it was not officially distributed.

• Rewritten in C in the 1980’s with more functionality, which include plotting routines.

• The MathWorks Inc. was created (1984) to market and continue development ofMATLAB.

According to Cleve Moler, three other men played important roles in the origins of MATLAB:J. H. Wilkinson, George Forsythe, and John Todd. It is also interesting to mention theauthors of LINPACK: Jack Dongara, Pete Steward, Jim Bunch, and Cleve Moler. Sincethen another package emerged: LAPACK. LAPACK stands for Linear Algebra Package. Ithas been designed to supersede LINPACK and EISPACK.

C.2 Strengths

• MATLAB may behave as a calculator or as a programming language

• MATLAB combine nicely calculation and graphic plotting.

• MATLAB is relatively easy to learn

62

• MATLAB is interpreted (not compiled), errors are easy to fix

• MATLAB is optimized to be relatively fast when performing matrix operations

• MATLAB does have some object-oriented elements

C.3 Weaknesses

• MATLAB is not a general purpose programming language such as C, C++, or FOR-TRAN

• MATLAB is designed for scientific computing, and is not well suitable for other appli-cations

• MATLAB is an interpreted language, slower than a compiled language such as C++

• MATLAB commands are specific for MATLAB usage. Most of them do not have adirect equivalent with other programming language commands

C.4 Competition

• One of MATLAB’s competitors is Mathematica, the symbolic computation program.

• MATLAB is more convenient for numerical analysis and linear algebra. It is frequentlyused in engineering community.

• Mathematica has superior symbolic manipulation, making it popular among physicists.

• There are other competitors:

– Scilab

– GNU Octave

– Rlab

63

Bibliography

[1] The MathWorks Inc. MATLAB 7.0 (R14SP2). The MathWorks Inc., 2005.

[2] S. J. Chapman. MATLAB Programming for Engineers. Thomson, 2004.

[3] C. B. Moler. Numerical Computing with MATLAB. Siam, 2004.

[4] C. F. Van Loan. Introduction to Scientific Computing. Prentice Hall, 1997.

[5] D. J. Higham and N. J. Higham. MATLAB Guide. Siam, second edition edition, 2005.

[6] K. R. Coombes, B. R. Hunt, R. L. Lipsman, J. E. Osborn, and G. J. Stuck. DifferentialEquations with MATLAB. John Wiley and Sons, 2000.

[7] A. Gilat. MATLAB: An introduction with Applications. John Wiley and Sons, 2004.

[8] J. Cooper. A MATLAB Companion for Multivariable Calculus. Academic Press, 2001.

[9] J. C. Polking and D. Arnold. ODE using MATLAB. Prentice Hall, 2004.

[10] D. Kahaner, C. Moler, and S. Nash. Numerical Methods and Software. Prentice-Hall,1989.

[11] J. W. Demmel. Applied Numerical Linear Algebra. Siam, 1997.

[12] D. Houcque. Applications of MATLAB: Ordinary Differential Equations. Internalcommunication, Northwestern University, pages 1–12, 2005.

64

1

Polynomials in Matlab

Polynomials

• f(x) = anxn + an-1xn-1 + ... + a1x + a0

• n is the degree of the polynomial• Examples:

f(x) = 2x2 - 4x + 10 degree 2f(x) = 6 degree 0

Polynomials in Matlab

• Represented by a row vector in which the elements are the coefficients.

• Must include all coefficients, even if 0• Examples

8x + 5 p = [8 5]6x2 - 150 h = [6 0 -150]

2

Value of a Polynomial

• Recall that we can compute the value of a polynomial for any value of x directly.

• Ex: f(x) = 5x3 + 6x2 - 7x + 3

x = 2;y = (5 * x ^ 3) + (6 * x ^ 2) - (7 * x) + 3y =

53

Value of a Polynomial

• Matlab can also compute the value of a polynomial at point x using a function, which is sometimes more convenient

• polyval (p, x)– p is a vector with the coefficients of the

polynomial– x is a number, variable or expression

Value of a polynomial

• Ex: f(x) = 5x3 + 6x2 - 7x + 3

p = [5 6 -7 3];x = 2;y = polyval(p, x)y =

53

3

Roots of a Polynomial

• Recall that the roots of a polynomial are the values of the argument for which the polynomial is equal to zero

• Ex: f(x) = x2 - 2x -30 = x2 - 2x -30 = (x + 1)(x - 3)0 = x + 1 OR 0 = x - 3x = -1 x = 3


• Matlab can compute the roots of a function• r = roots(p)

– p is a row vector with the coefficients of the polynomial

– r is a column vector with the roots of the polynomial


• Ex: f(x) = x2 - 2x -3

p = [1 -2 -3];r = roots(p)r =

3.0000-1.0000

4

Polynomial Coefficients

• Given the roots of a polynomial, the polynomial itself can be calculated

• Ex: roots = -3, +2x = -3 OR x = 20 = x + 3 0 = x - 20 = (x + 3)(x - 2)f(x) = x2 + x - 6


• Given the roots of a polynomial, Matlabcan compute the coefficients

• p = poly(r)– r is a row or column vector with the roots of

the polynomial– p is a row vector with the coefficients


• Ex: roots = -3, +2

r = [-3 +2];p = poly(r)p =

1 1 -6% f(x) = x2 + x -6

5

Adding and Subtracting Polynomials

• Polynomials can be added or subtracted• Ex: f1(x) + f2(x) f1(x) = 3x 6 + 15x 5 - 10x3 - 3x2 +15x - 40f2(x) = 3x3 - 2x - 6

3x6 + 15x5 - 7x3 -3x2 +13x - 46


• Can do this in Matlab by just adding or subtracting the coefficient vectors– Both vectors must be of the same size, so the

shorter vector must be padded with zeros


Ex: f1(x) = 3x6 + 15x5 - 10x3 - 3x2 +15x - 40f2(x) = 3x3 - 2x - 6p1 = [3 15 0 -10 -3 15 -40];p2 = [0 0 0 3 0 -2 -6];p = p1+p2p =

3 15 0 -7 -3 13 -46%f(x) = 3x6 + 15x5 -7x3 -3x2 +13x -46

6

Multiplying Polynomials

• Polynomials can be multiplied:• Ex: (2x 2 + x -3) * (x + 1) =

2x3 + x2 - 3x+ 2x2 + x -3

2x3 +3x2 -2x -3


• Matlab can also multiply polynomials• c = conv(a, b)

– a and b are the vectors of the coefficients of the functions being multiplied

– c is the vector of the coefficients of the product


• Ex: (2x 2 + x -3) * (x + 1)a = [2 1 -3];b = [1 1];c = conv(a, b)c =

2 3 -2 -3% 2x3 + 3x2 -2x -3

7

Dividing Polynomials

• Recall that polynomials can also be divided

x -10x + 1 x2 - 9x - 10

-x2 - x-10x -10-10x -10

0


• Matlab can also divide polynomials• [q,r] = deconv(u, v)

– u is the coefficient vector of the numerator– v is the coefficient vector of the denominator– q is the coefficient vector of the quotient– r is the coefficient vector of the remainder


• Ex: (x2 - 9x -10) ÷ (x + 1)u = [1 -9 -10];v = [1 1];[q, r] = deconv(u, v)q =

1 -10 % quotient is (x - 10)r =

0 0 0 % remainder is 0

8

Example 1

• Write a program to calculate when an object thrown straight up will hit the ground. The equation iss(t) = -½gt2 +v0t + h0

s is the position at time t (a position of zero means that the object has hit the ground)g is the acceleration of gravity: 9.8m/s2

v0 is the initial velocity in m/sh0is the initial height in meters (ground level is 0, a positive height means that the object was thrown from a raised platform)

Example 1

Prompt for and read in initial velocityPrompt for and read in initial heightFind the roots of the equationSolution is the positive rootDisplay solution

Example 1

v = input('Please enter initial velocity: ');h = input('Please enter initial height: ');x = [-4.9 v h];y = roots(x);if y(1) >= 0

fprintf('The object will hit the ground in %.2f seconds\n', y(1))

elsefprintf('The object will hit the ground in %.2f seconds\n',

y(2))end

9

Example 1

Please enter initial velocity: 19.6Please enter initial height: 58.8The object will hit the ground in 6.00

seconds

Derivatives of Polynomials

• We can take the derivative of polynomials

f(x) = 3x2 -2x + 4dy = 6x - 2dx


• Matlab can also calculate the derivatives of polynomials

• k = polyder(p)p is the coefficient vector of the polynomialk is the coefficient vector of the derivative

10


• Ex: f(x) = 3x2 - 2x + 4

p = [3 -2 4];k = polyder(p)k =

6 -2% dy/dx = 6x - 2

Integrals of Polynomials

• ?6x2 dx = 6 ?x2 dx= 6 * ? x3

= 2 x3

Integrals of Polynomials

• Matlab can also calculate the integral of a polynomialg = polyint(h, k)h is the coefficient vector of the polynomialg is the coefficient vector of the integralk is the constant of integration - assumed

to be 0 if not present

11

Integration of Polynomials

• ?6x2 dxh = [6 0 0];g = polyint(h)g =

2 0 0 0% g(x) = 2x3

Polynomial Curve Fitting

• Curve fitting is fitting a function to a set of data points

• That function can then be used for various mathematical analyses

• Curve fitting can be tricky, as there are many possible functions and coefficients

Curve Fitting

• Polynomial curve fitting uses the method of least squares– Determine the difference between each data

point and the point on the curve being fitted, called the residual

– Minimize the sum of the squares of all of the residuals to determine the best fit

12

Curve Fitting

• A best-fit curve may not pass through any actual data points

• A high-order polynomial may pass through all the points, but the line may deviate from the trend of the data

13

Matlab Curve Fitting

• Matlab provides an excellent polynomial curve fitting interface

• First, plot the data that you want to fitt = [0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0];

w = [6.00 4.83 3.70 3.15 2.41 1.83 1.49 1.21 0.96 0.73 0.64];plot(t, w)

• Choose Tools/Basic Fitting from the menu on the top of the plot

14

Matlab Curve Fitting

• Choose a fit – it will be displayed on the plot– the numerical results will show the equation

and the coefficients– the norm of residuals is a measure of the

quality of the fit. A smaller residual is a better fit.

• Repeat until you find the curve with the best fit

Linear Fit

Linear Fit

15

8th Degree Polynomial

8th Degree Polynomial

Example 2

• Find the parabola that best fits the data points (-1, 10) (0, 6) (1, 2) (2, 1) (3, 0) (4, 2) (5, 4) and (6, 7)

• The equation for a parabola is f(x) = ax2 + bx + a

16

Example 2

X = [-1, 0, 1, 2, 3, 4, 5, 6];Y= [10, 6, 2, 1, 0, 2, 4, 7];plot (X, Y)

17

Other curves

• All previous examples use polynomial curves. However, the best fit curve may also be power, exponential, logarithmic or reciprocal

• See your textbook for information on fitting data to these types of curves

Interpolation

• Interpolation is estimating values between data points

• Simplest way is to assume a line between each pair of points

• Can also assume a quadratic or cubic polynomial curve connects each pair of points

Interpolation

• yi = interp1(x, y, xi, 'method')interp1 - last character is onex is vector with x pointsy is a vector with y pointsxi is the x coordinate of the point to be

interpolatedyi is the y coordinate of the point being

interpolated

18

Interpolation

• method is optional:'nearest' - returns y value of the data point that

is nearest to the interpolated x point'linear' - assume linear curve between each two

points (default)'spline' - assumes a cubic curve between each

two points

Interpolation

• Example: x = [0 1 2 3 4 5];y = [1.0 -0.6 -1.4 3.2 -0.7 -6.4];yi = interp1(x, y, 1.5, 'linear')yi =

-1yj = interp1(x, y, 1.5, 'spline')yj =

-1.7817

Lecture 8

Matrices and Matrix Operations in

Matlab

Matrix operations

Recall how to multiply a matrix A times a vector v:

Av =

(

1 23 4

)(

−12

)

=

(

1 · (−1) + 2 · 23 · (−1) + 4 · 2

)

=

(

35

)

.

This is a special case of matrix multiplication. To multiply two matrices, A and B you proceed asfollows:

AB =

(

1 23 4

)(

−1 −22 1

)

=

(

−1 + 4 −2 + 2−3 + 8 −6 + 4

)

=

(

3 05 −2

)

.

Here both A and B are 2 × 2 matrices. Matrices can be multiplied together in this way providedthat the number of columns of A match the number of rows of B. We always list the size of a matrixby rows, then columns, so a 3× 5 matrix would have 3 rows and 5 columns. So, if A is m× n andB is p× q, then we can multiply AB if and only if n = p. A column vector can be thought of as ap× 1 matrix and a row vector as a 1× q matrix. Unless otherwise specified we will assume a vectorv to be a column vector and so Av makes sense as long as the number of columns of A matches thenumber of entries in v.

Printing matrices on the screen takes up a lot of space, so you may want to use> format compact

Enter a matrix into Matlab with the following syntax:> A = [ 1 3 -2 5 ; -1 -1 5 4 ; 0 1 -9 0]

Also enter a vector u:> u = [ 1 2 3 4]’

To multiply a matrix times a vector Au use *:> A*u

Since A is 3 by 4 and u is 4 by 1 this multiplication is valid and the result is a 3 by 1 vector.

Now enter another matrix B using:> B = [3 2 1; 7 6 5; 4 3 2]

You can multiply B times A:> B*A

but A times B is not defined and> A*B

will result in an error message.

You can multiply a matrix by a scalar:> 2*A

28

29

Adding matrices A+A will give the same result:> A + A

You can even add a number to a matrix:> A + 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . This should add 3 to every entry of A.

Component-wise operations

Just as for vectors, adding a ’.’ before ‘*’, ‘/’, or ‘^’ produces entry-wise multiplication, divisionand exponentiation. If you enter:> B*B

the result will be BB, i.e. matrix multiplication of B times itself. But, if you enter:> B.*B

the entries of the resulting matrix will contain the squares of the same entries of B. Similarly if youwant B multiplied by itself 3 times then enter:> B^3

but, if you want to cube all the entries of B then enter:> B.^3

Note that B*B and B^3 only make sense if B is square, but B.*B and B.^3 make sense for any sizematrix.

The identity matrix and the inverse of a matrix

The n× n identity matrix is a square matrix with ones on the diagonal and zeros everywhere else.It is called the identity because it plays the same role that 1 plays in multiplication, i.e.

AI = A, IA = A, Iv = v

for any matrix A or vector v where the sizes match. An identity matrix in Matlab is produced bythe command:> I = eye(3)

A square matrix A can have an inverse which is denoted by A−1. The definition of the inverse isthat:

AA−1 = I and A−1A = I.

In theory an inverse is very important, because if you have an equation:

Ax = b

where A and b are known and x is unknown (as we will see, such problems are very common andimportant) then the theoretical solution is:

x = A−1b.

We will see later that this is not a practical way to solve an equation, and A−1 is only importantfor the purpose of derivations.

In Matlab we can calculate a matrix’s inverse very conveniently:> C = randn(5,5)

> inv(C)

However, not all square matrices have inverses:> D = ones(5,5)

> inv(D)

30 LECTURE 8. MATRICES AND MATRIX OPERATIONS IN MATLAB

The “Norm” of a matrix

For a vector, the “norm” means the same thing as the length. Another way to think of it is how farthe vector is from being the zero vector. We want to measure a matrix in much the same way andthe norm is such a quantity. The usual definition of the norm of a matrix is the following:

Definition 1 Suppose A is a m× n matrix. The norm of A is

|A| ≡ max|v|=1

|Av|.

The maximum in the definition is taken over all vectors with length 1 (unit vectors), so the definitionmeans the largest factor that the matrix stretches (or shrinks) a unit vector. This definition seemscumbersome at first, but it turns out to be the best one. For example, with this definition we havethe following inequality for any vector v:

|Av| ≤ |A||v|.

In Matlab the norm of a matrix is obtained by the command:> norm(A)

For instance the norm of an identity matrix is 1:> norm(eye(100))

and the norm of a zero matrix is 0:> norm(zeros(50,50))

For a matrix the norm defined above and calculated by Matlab is not the square root of the sumof the square of its entries. That quantity is called the Froebenius norm, which is also sometimesuseful, but we will not need it.

Some other useful commands

Try out the following:> C = rand(5,5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . random matrix with uniform distribution in [0, 1].> size(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . gives the dimensions (m× n) of C.> det(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . the determinant of the matrix.> max(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . the maximum of each column.> min(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .the minimum in each column.> sum(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . sums each column.> mean(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . the average of each column.> diag(C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . just the diagonal elements.> C’ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .tranpose the matrix.

In addition to ones, eye, zeros, rand and randn, Matlab has several other commands that auto-matically produce special matrices:> hilb(6)

> pascal(5)

31

Exercises

8.1 Enter the matrix M by> M = [1,3,-1,6;2,4,0,-1;0,-2,3,-1;-1,2,-5,1]

and also the matrix

N =

−1 −3 32 −1 61 4 −12 −1 2

.

Multiply M and N using M * N. Can the order of multiplication be switched? Why or whynot? Try it to see how Matlab reacts.

8.2 By hand, calculate Av, AB, and BA for:

A =

2 4 −1−2 1 9−1 −1 0

, B =

0 −1 −11 0 2−1 −2 0

, v =

31−1

.

Check the results using Matlab. Think about how fast computers are. Turn in your handwork.

8.3 (a) Write a Matlab function program myinvcheck that makes a n × n random matrix(normally distributed, A = randn(n,n)), calculates its inverse (B = inv(A)), multipliesthe two back together, calculates the error equal to the norm of the difference of theresult from the n× n identity matrix (eye(n)), and returns error.

(b) Write a Matlab script program that calls myinvcheck for n = 10, 20, 40, . . . , 2i10,records the results of each trial, and plots the error versus n using a log plot. (Seehelp loglog.)

What happens to error as n gets big? Turn in a printout of the programs, the plot, and avery brief report on the results of your experiments.

Section 4.2 Control Structures in Matlab 299

4.2 Control Structures in MatlabIn this section we will discuss the control structures offered by the Matlab

programming language that allow us to add more levels of complexity to thesimple programs we have written thus far. Without further ado and fanfare, let’sbegin.

IfIf evaluates a logical expression and executes a block of statements based onwhether the logical expressions evaluates to true (logical 1) or false (logical 0).The basic structure is as follows.

if logical_expressionstatements

end

If the logical_expression evaluates as true (logical 1), then the block of state-ments that follow if logical_expression are executed, otherwise the statementsare skipped and program control is transferred to the first statement that followsend. Let’s look at an example of this control structure’s use.

Matlab’s rem(a,b) returns the remainder when a is divided by b. Thus, if a isan even integer, then rem(a,2) will equal zero. What follows is a short programto test if an integer is even. Open the Matlab editor, enter the following script,then save the file as evenodd.m.

n = input(’Enter an integer: ’);if (rem(n,2)==0)

fprintf(’The integer %d is even.\n’, n)end

Return to the command window and run the script by entering evenodd atthe Matlab prompt. Matlab responds by asking you to enter an integer. As aresponse, enter the integer 12 and press Enter.

Copyrighted material. See: http://msenux.redwoods.edu/Math4Textbook/1

300 Chapter 4 Programming in Matlab

>> evenoddEnter an integer: 12The integer 12 is even.

Run the program again. When prompted, enter the nonnegative integer 17 andpress Enter.

>> evenoddEnter an integer: 17

There is no response in this case, because our program provides no alternative ifthe input is odd.

Besides the new conditional control structure, we have two new commandsthat warrant attention.

1. Matlab’s input command, when used in the form n = input(’Enter aninteger: ’), will display the string as a prompt and wait for the user to entera number and hit Enter on the keyboard, whereupon it stores the numberinput by the user in the variable n.

2. The command fprintf is used to print formatted data to a file. The defaultsyntax is fprintf(FID,FORMAT,A,...).i. The argument FID is a file identifier. If no file identifier is present, then

fprintf prints to the screen.ii. The argument FORMAT is a format string, which may contain con-

tain conversion specifications from the C programming language. In theevenodd.m script, %d is conversion specification which will format thefirst argument following the format string as a signed decimal number.Note the \n at the end of the format string. This is a newline character,which creates a new line after printing the format string to the screen.

iii. The format string can be followed by zero or more arguments, which willbe substituted in sequence for the C-language conversion specifications inthe format string.

ElseWe can provide an alternative if the logical_expression evaluates as false. Thebasic structure is as follows.


if logical_expressionstatements

elsestatments

end

In this form, if the logical_expression evaluates as true (logical 1), the blockof statements between if and else are executed. If the logical_expressionevaluates as false (logical 0), then the block of statements between else and endare executed.

We can provide an alternative to our evenodd.m script. Add the followinglines to the script and resave as evenodd.m.

n = input(’Enter an integer: ’);if (rem(n,2)==0)

fprintf(’The integer %d is even.\n’, n)else

fprintf(’The integer %d is odd.\n’, n)end

Run the program, enter 17, and note that we now have a different response.

>> evenoddEnter an integer: 17The integer 17 is odd.

Because the logical expression rem(n,2)==0 evaluates as false when n = 17,the fprintf command that lies between else and end is executed, informing usthat the input is an odd integer.

ElseifSometimes we need to add more than one alternative. For this, we have thefollowing structure.


if logical_expression1statements

elseif logical_expression2statements

elsestatements

end

Program flow is as follows.

1. If logical_expression1 evaluates as true, then the block of statements be-tween if and elseif are executed.

2. If logical_expression1 evaluates as false, then program control is passed toelseif. At that point, if logical_expression2 evaluates as true, then theblock of statements between elseif and else are executed. If the logical ex-pression logical_expression2 evaluates as false, then the block of statmentsbetween else and end are executed.

You can have more than one elseif in this control structure, depending onneed. As an example of use, consider a program that asks a user to make a choicefrom a menu, then reacts accordingly.

First, ask for input, then set up a menu of choices with the following com-mands.

a=input(’Enter a number a: ’);b=input(’Enter a number b: ’);fprintf(’\n’)fprintf(’1) Add a and b.\n’)fprintf(’2) Subtract b from a.\n’)fprintf(’3) Multiply a and b.\n’)fprintf(’4) Divide a by b.\n’)fprintf(’\n’)n=input(’Enter your choice: ’);

Now, execute the appropriate choice based on the user’s menu selection.


if n==1fprintf(’The sum of %0.2f and %0.2f is %0.2f.\n’,a,b,a+b)

elseif n==2fprintf(’The difference of %0.2f and %.2f is %.2f.\n’,a,b,a-b)

elseif n==3fprintf(’The product of %.2f and %.2f is %.2f.\n’,a,b,a*b)

elseif n==4fprintf(’The quotient of %.2f and %.2f is %.2f.\n’,a,b,a/b)

elsefprintf(’Not a valid choice.\n’)

end

Save this script as mymenu.m, then execute the command mymenu at thecommand window prompt. You will be prompted to enter two numbers a and b.As shown, we entered 15.637 for a and 28.4 for b. The program presents a menuof choices and prompts us for a choice.

Enter a number a: 15.637Enter a number b: 28.4

1). Add a and b.2). Subtract b from a.3). Multiply a and b.4). Divide a by b.

Enter your choice:

We enter 1 as our choice and the program responds by adding the values of a andb.

Enter your choice: 1

The sum of 15.64 and 28.40 is 44.04.

Some comments are in order.

1. Note that after several elseif statments, we still list an else command to catchinvalid entries for n. The choice for n should match a menu designation, either1, 2, 3, or 4, but any other choice for n causes the statement following else to


be executed. This is a good programming practice, having a sort of “otherwiseblock ’ as a catchall for unexpected responses by the user.

2. We use the conversion specification %0.2f in this example, which outputs anumber in fixed point format with two decimal places. Note that this resultsin rounding in the output of numbers with more than two decimal places.

Switch, Case, and OtherwiseAs the number of elseif entries in a conditional control structure increases, thecode becomes harder to understand. Matlab provides a much simpler controlconstruct designed for this situation.

switch expressioncase value1

statementscase value2

statements...otherwise

statementsend

The expression can be a scalar or string. Switch works by comparing theinput expression to each case value. It executes the statements following thefirst match it finds, then transfers control to the line following the end. If switchfinds no match, it executes the statment block following the otherwise statement.

Let’s create a second script having the same menu building code.

a=input(’Enter a number a: ’);b=input(’Enter a number b: ’);fprintf(’\n’)fprintf(’1) Add a and b.\n’)fprintf(’2) Subtract b from a.\n’)fprintf(’3) Multiply a and b.\n’)fprintf(’4) Divide a by b.\n’)fprintf(’\n’)n=input(’Enter your choice: ’);fprintf(’\n’)


However, instead of using an if ...elseif...else...end control structure to select ablock of statments to execute based on the user’s choice of menu item, we willimplement a switch structure instead.

switch ncase 1fprintf(’The sum of %.2f and %.2f is %.2f.\n’,a,b,a+b)

case 2fprintf(’The difference of %.2f and %.2f is %.2f.\n’,a,b,a-b)

case 3fprintf(’The product of %.2f and %.2f is %.2f.\n’,a,b,a*b)

case 4fprintf(’The quotient of %.2f and %.2f is %.2f.\n’,a,b,a/b)

otherwisefprintf(’Not a valid choice.\n’)

end

Save the script as mymenuswitch.m, change to the command window, and enterand execute the command mymenuswitch at the command prompt. Note thatthe behavior of the program mymenuswitch is identical to that of mymenu.


1) Add a and b.2) Subtract b from a.3) Multiply a and b.4) Divide a by b.


The sum of 15.64 and 28.40 is 44.04.

If the user enters an invalid choice, note how control is passed to the statmentfollowing otherwise.



1) Add a and b.2) Subtract b from a.3) Multiply a and b.4) Divide a by b.


Not a valid choice.

LoopsA for loop is a control structure that is designed to execute a block of statementsa predetermined number of times. Here is its general use syntax.

for index=start:increment:finishstatements

end

As an example, we display the squares of every other integer from 5 to 13, inclu-sive.

for k=5:2:13fprintf(’The square of %d is %d.\n’, k, k^2)

end

The output of this simple loop follows.

The square of 5 is 25.The square of 7 is 49.The square of 9 is 81.The square of 11 is 121.The square of 13 is 169.

Another looping construct is Matlab’s while structure who’s basic syntax follows.


while expressionstatements

end

The while loop executes its block of statements as long as the logical controllingexpresssion evaluates as true (logical 1). For example, we can duplicate theoutput of the previous for loop with the following while loop.

k=5;while k<=13

fprintf(’The square of %d is %d.\n’, k, k^2)k=k+2;

end

When using while, it is not difficult to fall into a trap of programming a loop thatiterates indefinitely. In the example above, the loop will iterate as long a k<=13,so it is imperative that we increment the value of k in the interior of the loop, aswe do with the command k=k+2. This command adds 2 to the current value ofk, then replaces k with this incremented value. Thus, the first time through theloop, k = 5, the second time through the loop, k = 7, etc. When k is no longerless than or equal to 13, the loop terminates.

The simple use of for and while loops to generate the squares of every otherinteger from 5 to 13 can be accomplished just as easily with array operations.

k=5:2:13;A=[k; k.^2];fprintf(’The square of %d is %d.\n’, A)

The use of fprintf in this example warrants some explanation. First, the com-mand A=[k; k.^2] generates a matrix with two rows, the first row holding thevalues of k, the second the squares of k.

A =5 7 9 11 13

25 49 81 121 169


Again, with roots deeply planted in Fortran, Matlab enters the entries of matrixA in columnwise fashion. The following command will print all entries in thematrix A.

>> A(:)ans =

5257

499

8111

12113

169

It is important to note the order in which the entries were extracted from thematrix A, that is, the entries from the first column, followed by the entries fromthe second column, etc, until all of the entries of matrix A are on display. This isprecisely the order in which the fprintf command picks off the entries of matrixA in our example above. The first time through the loop, fprintf replaces the twooccurrences of %d in its format string with 5 and 25. The second time throughthe loop, the two occurrences of %d are replaced with 7 and 49, and so on, untilthe loop terminates.

For additional help on fprintf, type doc fprintf at the Matlab commandprompt.

Although a good introductory example of using loops, producing the squaresof integers is not a very interesting task. Let’s look at a more interesting exampleof the use of for and while loops.

I Example 1. For a number of years, a number of leading mathematicicansbelieved that the infinite series

∞∑k=1

1k2 = 1 + 1

22 + 132 + 1

42 + · · · (4.1)

converged to a finite sum, but none of them could determine what that sumwas. Until 1795, that is, when Leonhard Euler produced the suprising resultthat the sum of the series equaled π2/6. Use a for loop to sum the first twentyterms of the series and compare this partial sum with π2/6 (compute the relative


error). Secondly, write a while loop that will determine how many terms mustbe summed to produce an approximation of pi2/6 that is correct to 4 significantdigits.

We wish to sum the first 20 terms, so we begin by setting n = 20. We willstore the running sum in the variable s, which we initialize to zero. A for loop isused to track the running sum. On each pass of the for loop, the sum is updatedwith the command s=s+1/k^2, which takes the previous value of the runningsum stored in s, adds the value of the next term in the series, and stores the resultback in s.

n=20;s=0;for k=1:n

s=s+1/k^2;end

The true sum of the series in π2/6. Hence, we compute the relative error withthe following statement.

rel=abs(s-pi^2/6)/abs(pi^2/6);

We can then use fprintf to report the results.

fprintf(’The sum of the first %d terms is %f.\n’, n, s)fprintf(’The relative error is %e.\n’, rel)

You should obtain the following results.

The sum of the first 20 terms is 1.596163.The relative error is 2.964911e-02.

Note that the relative error is less than 5 × 10−2, so our approximation of π2/6(using the first 20 terms of the series) contains only 2 significant digits.

The next question asks how many terms must be summed to produce andapproximation of π2/6 correct to 4 significant digits. Thus, the relative error inthe approximation must be less than 5× 10−4.


We start by setting tol=5e-4. The intent is to loop until the relative erroris less than this tolerance, which signifies that we have 4 signicant digits in ourapproximation of π2/6 using the infinite series. The variable s will again containthe running sum and we again initialize the running sum to zero. The variable kcontains the term number, so we initialize k to 1. We next initialize rel_errorto 1 so that we enter the while loop at least once.

tol=5e-4;s=0;k=1;rel_error=1;

Each pass through the loop, we do three things:

1. We update the running sum by adding the next term in the series to theprevious running sum. This is accomplished with the statement s=s+1/k^2.

2. We increment the term number by 1, using the statement k=k+1 for thispurpose.

3. We calculate the current relative error.

We continue to loop while the relative error is greater than or equal to thetolerance.

while rel_error>=tols=s+1/k^2;k=k+1;rel_error=abs(s-pi^2/6)/abs(pi^2/6);

end

When the loop is completed, we use fprintf to output results.

fprintf(’The actual value of pi^2/6 is %f.\n’, pi^2/6)fprintf(’The sum of the first %d terms is %f.\n’, k, s)fprintf(’The relative error is %f.\n’, rel_error)

The results follow.


The actual value of pi^2/6 is 1.644934.The sum of the first 1217 terms is 1.644112.The relative error is 0.000500.

Note that we have agreement in approximately 4 significant digits.

This example demonstrates when we should use a for loop and when a whileloop is more appropriate. We offer the following advice.

What type of loop should I use? When you know in advance the precisenumber of times the loop should interate, use a for loop. On the other hand,if you have no predetermined knowledge of how many times you will needthe loop to execute, use a while loop.

For k = A

Matlab also supports a type of for loop whose block of statements are executedfor each column in a matrix A. The syntax is as follows.

for k = Astatements

end

In this construct, the columns of matrix A are stored one at a time in the variablek while the following statments, up to the end, are executed. For example, enterthe matrix A.

>> A=[1 2;3 4]A =

1 23 4

Now, enter and exercute the following loop.


>> for k=A, k, endk =

13

k =24

Note that k is assigned a column of matrix A at each iteration of the loop.This column assignment does not contradict the use we have already investi-

gated, i.e., for k=start:increment:finish. In this case, the Matlab constructstart:increment:finish creates a row vector and k is assigned a new column ofthe row vector at each iteration. Of course, this means that k is assigned a newelement of the row vector at each iteration.

As an example of use, let’s plot the family of functions defined by the equation

y = 2Cte−t2, (4.2)

where C is one of a number of specific constants. The following code producesthe family of curves shown in Figure 4.1. At each iteration of the loop, C isassigned the next value in the row vector [-5,-3,-1,0,1,3,5].

t=linspace(-4,4,200);for C=[-5,-3,-1,0,1,3,5]

y=-2*C*t.*exp(-t.^2);line(t,y)

end

Break and ContinueThere are two sitautions that frequently occur when writing loops.

1. If a certain state is achieved, the programmer wishes to terminate the loopand pass control to the code that follows the end of the loop.

2. The programmer doesn’t want to exit the loop, but does want to pass controlto the next iteration of the loop, thereby skipping any remaining code thatremains in the loop.

As an example of the first situation, consider the following snippet of code.Matlab’s break command will cause the loop to terminate if a prime is found inthe loop index k.


Figure 4.1. A family of curves for C = −5, −3, −1, 0, 1,3, and 5.

for k=[4,8,12,15,17,19,21,24]if isprime(k)

fprintf(’Prime found, %d, exiting loop.\n’,k)break

endfprintf(’Current value of index k is: %d\n’,k)

end

Note that the output lists each value of k until a prime is found. The fprinftcommand issues a warning message and the break command terminates the loop.

Current value of index k is: 4Current value of index k is: 8Current value of index k is: 12Current value of index k is: 15Prime found, 17, exiting loop.

In this next code snippet, we want to print the multiples of three that are lessthan or equal to 20. At each iteration of the while loop, Matlab checks to see if kis divisible by 3. If not, it increments the counter k, then the continue statment


that follows skips the remaining statements in the loop, and returns control tothe next iteration of the loop.

N=20;k=1;while k<=N

if mod(k,3)~=0k=k+1;continue

endfprintf(’Multiple of 3: %d\n’,k)k=k+1;

end

If a multiple of 3 is found, fprintf is used to output the result in a nicely formattedstatement.

Multiple of 3: 3Multiple of 3: 6Multiple of 3: 9Multiple of 3: 12Multiple of 3: 15Multiple of 3: 18

Any and AllThe command any returns true if any element of a vector is a nonzero numberor is logical 1 (true).

>> v=[0 0 1 0 1]; any(v)ans =

1

On the other hand, any returns false if non of the elements of a vector are nonzeronumbers or logical 1’s.


>> v=[0 0 0 0]; any(v)ans =

0

The command all returns true if all of the elements of a vector are nonzeronumbers or are logical 1’s (true).

>> v=[1 1 1 1]; all(v)ans =

1

On the other hand, all returns false if the vector contains any zeros or logical0’s.

>> v=[1 0 1 1 1]; all(v)ans =

0

The any and all command can be quite useful in programs. Suppose forexample, that you want to pick out all the numbers from 1 to 2000 that aremultiples of 8, 12, and 15. Here’s a code snippet that will perform this task. Eachtime through the loop, we use the mod function to determine the remainder whenthe current value of k is divided by the numbers in divisors. If “all” remaindersare equal to zero, then k is divisible by each of the numbers in divisors and weappend k to a list of multiples of 8, 12, and 15.

N=2000;divisors=[8,12,15];multiples=[];for k=1:N

if all(mod(k,divisors)==0)multiples=[multiples,k];

endendfmt=’%5d %5d %5d %5d %5d\n’;fprintf(fmt,multiples)


The resulting output follows.

120 240 360 480 600720 840 960 1080 1200

1320 1440 1560 1680 18001920

Vectorizing the Code. In this case we can avoid loops altogether by usinglogical indexing.

k=1:2000;b=(mod(k,8)==0) & (mod(k,12)==0) & (mod(k,15)==0);fmt=’%5d %5d %5d %5d %5d\n’;fprintf(fmt,k(b))

The reader should check that this code snippet produces the same result, i.e., allmultiples of 8, 12, and 15 from 1 to 2000.

Nested LoopsIt’s possible to nest one loop inside another. You can nest a second for loopinside a primary for loop, or you can nest a for loop inside a while loop. Othercombinations are also possible. Let’s look at some situations where this is useful.

I Example 2. A Hilbert Matrix H is an n× n matrix defined by

H(i, j) = 1i + j − 1

, (4.3)

where 1 ≤ i, j ≤ n. Set up nested for loops to construct a Hilbert Matrix of ordern = 5.

We could create this Hilbert Matrix using hand calculations. For example,the entry in row 3 column 2 is

H(3, 2) = 13 + 2− 1

= 14.

These calculations are not difficult, but for large order matrices, they would betedious. Let’s let Matlab do the work for us.


n=5;H=zeros(n);

Now, doubly nested for loops achieve the desired result.

for i=1:nfor j=1:n

H(i,j)=1/(i+j-1);end

end

Set rational display format, display the matrix H, then set the format back todefault.

format ratHformat

The result is shown in the following Hilbert Matrix of order 5.

H =1 1/2 1/3 1/4 1/51/2 1/3 1/4 1/5 1/61/3 1/4 1/5 1/6 1/71/4 1/5 1/6 1/7 1/81/5 1/6 1/7 1/8 1/9

Note that the entry is row 3 column 2 is 1/4, as predicted by our hand calculationsabove. Similar hand calculations can be used to verify other entries in this result.

How does it work? Because n = 5, the primary loop becomes for i=1:5.Similarly, the inner loop becomes for j=1:5. Now, here is the way the nestedstructure proceeds. First, set i = 1, then execute the statement H(i,j)=1/(i+j-1) for j = 1, 2, 3, 4, and 5. With this first pass, we set the entries H(1, 1), H(1, 2),H(1, 3), H(1, 4), and H(1, 5). The inner loop terminates and control returns tothe primary loop, where the program next sets i = 2 and again executes thestatement H(i,j)=1/(i+j-1) for j = 1, 2, 3, 4, and 5. With this second pass, weset the entries H(2, 1), H(2, 2), H(2, 3), H(2, 4), and H(2, 5). A third and fourth


pass of the primary loop occur next, with i = 3 and 4, each time iterating theinner loop for j = 1, 2, 3, 4, and 5. Finally, on the last pass through the primary,the program sets i = 5, then executes the statement H(i,j)=1/(i+j-1) for j = 1,2, 3, 4, and 5. On this last pass, the program sets the entries H(5, 1), H(5, 2),H(5, 3), H(5, 4), and H(5, 5) and terminates.

Fourier Series — An Application of a For LoopMatlab’s mod(a,b) works equally well with real numbers, providing the remain-der when a is divided by b. For example, if you divide 5.7 by 2, the quotientis 2 and the remainder is 1.7. The command mod(5.7,2) should return theremainder, namely 1.7.

>> mod(5.7,2)ans =

1.7000

The following commands produce what engineers call a sawtooth curve, shown inFigure 4.2(a).

t=linspace(0,6,500);y=mod(t,2);plot(t,y)

In Figure 4.2(a), we plot the remainders when the time is divided by 2. Hence,we get the a periodic function of period 2, where the remainders grow from 0 tojust less than 2, then repeat every 2 units.

With the following adjustment, we produce what engineers call a square wave(shown in Figure 4.2(b)). Recall that y<1 returns true (logical 1) when y isless than 1, and false (logical 0) when y is not less than 1.

y=(y<1);plot(t,y,’*’)

This type of curve can emulate a switch that is “on” for one second (y-value 1),then off for the next second (y-value 0), and then periodically repeats this “on-off”cycle over its domain.


(a) (b)Figure 4.2. Generating a square wave with mod.

Let’s make one last adjustment, increasing the amplitude by 2, then shifting thegraph downward 1 unit. This produces the square wave shown in Figure 4.3.

y=2*y-1;plot(t,y,’*’)

Figure 4.3. A squarewave with period 2.

Note that this square waves alternates between the values 1 and −1. The curveequals 1 on the first half of its period, then −1 on its second half. This patternthen repeats with period 2 over the remainder of its domain.

Fourier Series. Using advanced mathematics, it can be shown that thefollowing Fourier Series “converges” to the square wave pictured in Figure 4.3

∞∑n=0

4(2n + 1)π

sin(2n + 1)πt (4.4)


It is certain remarkable, if not somewhat implausible, that one can sum a seriesof sinusoidal function and get the resulting square wave picture in Figure 4.3.

We will generate and plot the first five terms of the series (4.4), saving eachterm in a row of matrix A for later use. First, we initialize the time and thenumber of terms of the series that we will use, then we allocate appropriate spacefor matrix A. Note how we wisely save the number of points and the number ofterms in variables, so that if we later decide to change these values, we won’t haveto scan every line of our program making changes.

N=500;numterms=5;t=linspace(0,6,N);A=zeros(numterms,N);

Next, we use a for loop to compute each of the first 5 terms (numterms) ofseries (4.4), storing each term in a row of matrix A, then plotting the sinusoid inthree space, where we use the angular frequency as the third dimension.

for n=0:numterms-1y=4/((2*n+1)*pi)*sin((2*n+1)*pi*t);A(n+1,:)=y;line((2*n+1)*pi*ones(size(t)),t,y)

end

We orient the 3D-view, add a box for depth, turn on the grid, and annotate eachaxis to produce the image shown in Figure 4.4(a).

view(20,20)box ongrid onxlabel(’Angular Frequency’)ylabel(’t-axis’)zlabel(’y-axis’)

As one would expect, after examining the terms of series (4.4), each each consecu-tive term of the series is a sinusoid with increasing angular frequency and decreas-ing amplitude. This is evident with the sequence of terms shown in Figure 4.4(a).


However, a truly remarkable result occurs when we add the sinusoids inFigure 4.4(a). This is easy to do because we saved each individual term ina row of matrix A. Matlab’s sum command will sum the columns of matrix A,effectively summing the first five terms of series (4.4) at each instant of time t.

figureplot(t,sum(A))

We “tighten” the axes, add a grid, then annotate each axis. This producesthe image in Figure 4.4(b). Note the striking similarity to the square wavein Figure 4.3.

axis tightgrid onxlabel(’t-axis’)ylabel(’y-axis’)

(a) (b)Figure 4.4. Approximating a square wave with a Fourier series (5 terms).

Because we smartly stored the number of terms in the variable numterms,to see the effect of adding the first 10 terms of the fourier series (4.4) is a simplematter of changing numterms=5 to numterms=10 and running the programagain. The ouptut is shown in Figures 4.5(a) and (b). In Figure 4.5(b), notethe even closer resemblance to the square wave in Figure 4.3, except at the ends,where the “ringing” exhibited there is known as Gibb’s Phenomenon. If you one


day get a chance to take a good upper-division differential equations course, youwill study Fourier series in more depth.

(a) (b)Figure 4.5. Adding additional terms of the Fourier series (4.4).


4.2 Exercises

1. Write a for loop that will outputthe cubes of the first 10 positive in-tegers. Use fprintf to output the re-sults, which should include the integerand its cube. Write a second programthat uses a while loop to produce anindentical result.

2. Without appealing to the Matlabcommand factorial, write a for loopto output the factorial of the numbers10 through 15. Use fprintf to formatthe output. Write a second programthat uses a while loop to produce anindentical result. Hint: Consider theprod command.

3. Write a single program that willcount the number of divisors of each ofthe following integers: 20, 36, 84, and96. Use fprintf to output each resultin a form similar to “The number ofdivisors of 12 is 6.”

4. Set A=magic(5). Write a pro-gram that uses nested for loops tofind the sum of the squares of all en-tries of matrix A. Use fprintf to for-mat the output. Write a second pro-gram that uses array operations andMatlab’s sum function to obtain thesame result.

5. Write a program that uses nestedfor loops to produce Pythagorean Triples,positive integers a, b and c that satisfya2+b2 = c2. Find all such triples suchthat 1 ≤ a, b, c ≤ 20 and use fprintfto produce nicely formatted results.

6. Another result proved by Leon-

hard Euler shows that

π4

90= 1 + 1

24 + 134 + 1

44 + · · · .

Write a program that uses a for loopto sum the first 20 terms of this series.Compute the relative error when thissum is used as an approximation ofπ4/90. Write a second program thatuses a while loop to determine thenumber of terms required so that thesum approximates π4/90 to four sig-nificant digits. In both programs, usefprintf to format your output.

7. Some attribute the following se-ries to Leibniz.

π

4= 1− 1

3+ 1

5− 1

7+ 1

9− · · ·

Write a program that uses a for loopto sum the first 20 terms of this se-ries. Compute the relative error whenthis sum is used as an approximationof π/4. Write a second program thatuses a while loop to determine thenumber of terms required so that thesum approximates π/4 to four signif-icant digits. In both programs, usefprintf to format your output.

8. Goldbach’s Conjecture is one ofthe most famous unproved conjecturesin all of mathematics, which is remark-able in light of its simplistic statement:

All even integers greater than2 can be expressed as a sumof two prime integers. Writea program that expresses 1202as the sum of two primes in ten


different ways. Use fprintf toformat your output.

9. Here is a simple idea for gener-ating a list of prime integers. Createa vector primes with a single entry,the prime integer 2. Write a programto test each integer less than 100 tosee if it is a prime using the followingprocedure.

i. If an integer is divisible by any ofthe integers in primes, skip it andgo to the next integer.

ii. If an integer is not divisible by anyof the integers in primes, appendthe integer to the vector primesand go to the next integer.

Use fprintf to output the vector primesin five columns, right justified.

10. There is a famous stoy about SirThomas Hardy and Srinivasa Ramanu-jan, which Hardy relates in his famouswork “A Mathematician’s Apology,”a copy of which resides in the CR li-brary along with the work “The ManWho Knew Infinity: A Life of the Ge-nius Ramanujan.”

I remember once going to seehim when he was ill at Putney.I had ridden in taxi cab num-ber 1729 and remarked that thenumber seemed to me rather adull one, and that I hoped itwas not an unfavorable omen.“No,” he replied, “it is a veryinteresting number; it is the small-est number expressible as thesum of two cubes in two differ-ent ways.”

Write a program with nested loops to

find integers a and b (in two ways) sothat a3 + b3 = 1729. Use fprintf toformat the output of your program.

11. Write a program to perform eachof the following tasks.

i. Use Matlab to draw a circle of ra-dius 1 centered at the origin andinscribed in a square having ver-tices (1, 0), (−1, 1), (−1,−1), and(1,−1). The ratio of the area ofthe circle to the area of the squareis π : 4 or π/4. Hence, if we wereto throw darts at the square in arondom fashion, the ratio of dartsinside the circle to the number ofdarts thrown should be approxi-mately equal to π/4.

ii. Write a for loop that will plot 1000randomly generated points insidethe square. Use Matlab’s randcommand for this task. Each timerandom point lands within the unitcircle, increment a counter hits.When the for loop terminates, usefprintf to output the ratio dartsthat land inside the circle to thenumber of darts thrown. Calcu-late the relative error in approxi-mating pi/4 with this ratio.

12. Write a program to perform eachof the following tasks.

i. Prompt the user to enter a 3 × 4matrix. Store the result in the ma-trix A.

ii. Use if..elseif..else to provide a menuwith three choices: (1) Switch tworows of the matrix; (2) Multiply arow of the matrix by a scalar; and(3) Subtract a scalar multiple of a


row from another row of the ma-trix.

iii. Perform the task in the requestedmenu item, then return the result-ing matrix to the user.a. In the case of (1), your pro-

gram should prompt the userfor the rows to switch.

b. In the case of (2), your pro-gram should prompt the userfor a scalar and a row that willbe multiplied by the scalar.

c. In the case of (3), your pro-gram should prompt the userfor a scalar and two row num-bers, the first of which is tobe multiplied by the scalar andsubtracted from the second.

13. The solutions of the quadraticequation ax2 +bx+c = 0 are given bythe quadratic formula

x = −b±√

b2 − 4ac

2a.

The discriminant D = b2−4ac is usedto predict the number of roots. Thereare three cases.

1. If D > 0, there are two real solu-tions.

2. If D = 0, there is one real solution.3. If D < 0, there are no real solu-

tions.

Write a program the performs each ofthe following tasks.

i. The program prompts the user toinput a, b, and c.

ii. The programs computes the dis-criminant D which it uses with theconditional if..elseif..else to de-

termine and output the number ofsolutions.

iii. The program should also outputthe solutions, if any.

Use well crafted format strings withfprintf to output all results.

14. Plot the modified sawtooth curve.

N=500;t=linspace(0,6*pi,N);y=mod(t+pi,2*pi)-pi;line(t,y)

It can be shown that the Fourierseries

∞∑n=1

2(−1)n+1

nsin nx

“converges” to the sawtooth curve. Per-form each of the following tasks.

i. Sketch each of the individual termsin 3-space, as shown in the narra-tive.

ii. Sketch the sawtooth, hold the graph,then sum the first five terms of theseries and superimpose the plot ofthe result.

15. Logical arrays are helpful whenit comes to drawing piecewise func-tions. For example, consider the piece-wise definition

f(x) ={

0, if −π ≤ x < 0π − x, if 0 ≤ x ≤ π.

Enter the following to plot the piece-wise function f .


N=500;x=linspace(-pi,pi,N);y=0*(x<0)+(pi-x).*(x>=0);plot(x,y,’.’)

A Fourier series representation of thefunction f is given by

a02

+∞∑

n=1[an cos nx + bn sin nx] ,

where a0 = π/2, and

an = 1− cos nπ

n2πand bn = 1

n.

Hold the plot of f , then superimposethe Fourier series sum for n = 1 ton = 5. Hint: This is a nice placefor anonymous functions, for example,set:

b = @(n) 1/n;


4.2 Answers

1. This for loop will print the cubes of the first 10 positive integers.

for k=1:10fprintf(’The cube of %d is %d.\n’,k,k^3)

end

This while loop will print the cubes of the first 10 positive integers.

k=1;while k<=10

fprintf(’The cube of %d is %d.\n’,k,k^3)k=k+1;

end

3. The following program will print the number of divisors of 20, 36, 84, and 96.

for k=[20, 36, 84, 96]count=0;divisor=1;while (divisor<=k)

if mod(k,divisor)==0count=count+1;

enddivisor=divisor+1;

endfprintf(’The number of divisors of %d is %d.\n’,k,count)

end

5. This program will print Pythagorean Triples so that 1 ≤ a, b, c ≤ 20.


N=20;for a=1:N

for b=1:Nfor c=1:N

if (c^2==a^2+b^2)fprintf(’Pythagorean Triple: %d, %d, %d\n’, a, b, c)

endend

endend

7. The following loop will sum the first 20 terms.

N=20;s=0;for k=1:N;

s=s+(-1)^(k+1)/(2*k-1);end

Compute the relative error in approximating π/4 with the sum of the first 20terms.

rel=abs(s-pi/4)/abs(pi/4);

Output the results.

fprintf(’The actual value of pi/4 is %.6f.\n’,pi/4)fprintf(’The sum of the first %d terms is %f.\n’,N,s)fprintf(’The relative error is %.2e.\n’,rel)

9. The mod(m,primes)==0 comparison will produce a logical vector whichcontains a 1 in each position that indicates that the current number m is divisibleby the number in the prime vector in the corresponding position. If “any” ofthese are 1’s, then the number m is divisible by at least one of the primes in thecurrent list primes. In that case, we continue to the next value of m. Otherwise,we append m to the list of primes.


primes=2;for m=3:100

if any(mod(m,primes)==0)continue

elseprimes=[primes,m];

endend

We create a format of 5 fields, each of width 5, then fprintf the vector primesusing this format.

fmt=’%5d %5d %5d %5d %5d\n’;

The resulting list of primes is nicely formatted in 5 columns.

2 3 5 7 1113 17 19 23 2931 37 41 43 4753 59 61 67 7173 79 83 89 97

11. We open a figure window and set a lighter background color of gray.

fig=figure;set(fig,’Color’,[0.95,0.95,0.95])

We draw a circle of radius one, increase its line width, change its color todark red, set the axis equal, then turn the axes off.

t=linspace(0,2*pi,200);x=cos(t);y=sin(t);line(x,y,’LineWidth’,2,’Color’,[0.625,0,0])axis equalaxis off


We draw the square in counterclockwise order of vertices, starting at thepoint (1, 1). In that order, we set the x- and y-values of the vertices in the vectorsx and y. We then draw the square with a thickened line width and color it darkblue.

x=[1,-1,-1,1,1]; y=[1,1,-1,-1,1];line(x,y,’LineWidth’,2,’Color’,[0,0,0.625])

Next, we set the number of darts thrown at 1000. Then we plot one dartat the origin, change its linestyle to ’none’ (points will not be connected with linesegements), choose a marker style, and color it dark blue.

N=1000;dart=line(0,0,...

’LineStyle’,’none’,...’Marker’,’.’,...’Color’,[0,0,0.625]);

In the loop that follows, we use the handle to this initial dart to add x- andy-data (additional darts). First, we set the number of hits (darts that land insidethe circle) to zero (we don’t count the initial dart at (0, 0)).

hits=0;for k=1:Nx=2*rand-1;y=2*rand-1;set(dart,...

’XData’,[get(dart,’XData’),x],...’YData’,[get(dart,’YData’),y])%drawnowif (x^2+y^2<1)

hits=hits+1;end

end

In the body of the loop, we choose uniform random numbers between −1and 1 for both x and y. We then add these new x- and y-values to the XDataand YData of the existing dart by using its handle dart. We do this for x by


first “getting” the XData for the dart with get(dart,’XData’), then we appendthe current value of x with [get(dart,’XData’),x]. The set command is usedto set the updated list of x-values. A similar task updates the dart’s y-values.At each iteration of the loop, we also increment the hits counter if the dart lieswithin the border of the circle, that is, if x2 + y2 < 1. You can uncomment thedrawnow command to animate the dart throwing at the expense of waiting for1000 darts to be thrown to the screen.Finally, we output the requested data to the command window.

fprintf(’%d of %d darts fell inside the circle.\n’,hits,N)fprintf(’The ratio of hits to throws is %f\n\n’,hits/N)

fprintf(’The ratio of the area of the circle to the area\n’)fprintf(’the area of the square is pi/4, or approximately %f.\n\n’,pi/4)

rel_err=abs(hits/N-pi/4)/abs(pi/4);fprintf(’The relative error in approximating pi/4 wiht the ratio\n’)fprintf(’of hits to total darts thrown is %.2e.\n’,rel_err)

15. We begin by using logical relations to craft the piecewise function.

N=500;x=linspace(-pi,pi,N);y=0*(x<0)+(pi-x).*(x>=0);line(x,y,...

’LineStyle’,’none’,...’Marker’,’.’)

Next, we compose two anonymous functions for the coefficients an and bn.

a=@(n) (1-cos(n*pi))/(n^2*pi);b=@(n) 1/n;

Note that a0 = π/2, so a0/2 = π/4, which must be added to the sum of thefirst five terms of the series. Thus, we’ll start our running sum at s = π/4. Thefor loop then sums the first five terms. We use another line command to plotthe result.


s=pi/4;N=5;for k=1:N

s=s+(a(k)*cos(k*x)+b(k)*sin(k*x));endline(x,s,’LineWidth’,2,’Color’,[0.625,0,0])

Chapter 6

Eigenvalues and Eigenvectors

6.1 Introduction to Eigenvalues

Linear equations Ax D b come from steady state problems. Eigenvalues have their greatestimportance in dynamic problems. The solution of du=dt D Au is changing with time—growing or decaying or oscillating. We can’t find it by elimination. This chapter enters anew part of linear algebra, based on Ax D �x. All matrices in this chapter are square.

A good model comes from the powers A; A2; A3; : : : of a matrix. Suppose you need thehundredth power A100. The starting matrix A becomes unrecognizable after a few steps,and A100 is very close to Œ :6 :6I :4 :4 � :

�:8 :3

:2 :7

� �:70 :45

:30 :55

� �:650 :525

:350 :475

��

�:6000 :6000

:4000 :4000

�

A A2 A3 A100

A100 was found by using the eigenvalues of A, not by multiplying 100 matrices. Thoseeigenvalues (here they are 1 and 1=2) are a new way to see into the heart of a matrix.

To explain eigenvalues, we first explain eigenvectors. Almost all vectors change di-rection, when they are multiplied by A. Certain exceptional vectors x are in the samedirection as Ax. Those are the “eigenvectors”. Multiply an eigenvector by A, and thevector Ax is a number � times the original x.

The basic equation is Ax D �x. The number � is an eigenvalue of A.

The eigenvalue � tells whether the special vector x is stretched or shrunk or reversed or leftunchanged—when it is multiplied by A. We may find � D 2 or 1

2or �1 or 1. The eigen-

value � could be zero! Then Ax D 0x means that this eigenvector x is in the nullspace.If A is the identity matrix, every vector has Ax D x. All vectors are eigenvectors of I .

All eigenvalues “lambda” are � D 1. This is unusual to say the least. Most 2 by 2 matriceshave two eigenvector directions and two eigenvalues. We will show that det.A � �I / D 0.

283

284 Chapter 6. Eigenvalues and Eigenvectors

This section will explain how to compute the x’s and �’s. It can come early in the coursebecause we only need the determinant of a 2 by 2 matrix. Let me use det.A � �I / D 0 tofind the eigenvalues for this first example, and then derive it properly in equation (3).

Example 1 The matrix A has two eigenvalues � D 1 and � D 1=2. Look at det.A��I /:

A D�

:8 :3

:2 :7

�det

�:8 � � :3

:2 :7 � �

�D �2 � 3

2� C 1

2D .� � 1/

�� 1

2

�:

I factored the quadratic into � � 1 times � � 12

, to see the two eigenvalues � D 1 and� D 1

2 . For those numbers, the matrix A � �I becomes singular (zero determinant). Theeigenvectors x1 and x2 are in the nullspaces of A � I and A � 1

2I .

.A � I /x1 D 0 is Ax1 D x1 and the first eigenvector is .:6; :4/.

.A � 12I /x2 D 0 is Ax2 D 1

2x2 and the second eigenvector is .1; �1/:

x1 D�

:6

:4

�and Ax1 D

�:8 :3

:2 :7

� �:6

:4

�D x1 (Ax D x means that �1 D 1)

x2 D�

1

�1

�and Ax2 D

�:8 :3

:2 :7

� �1

�1

�D

�:5

�:5

�(this is 1

2x2 so �2 D 1

2).

If x1 is multiplied again by A, we still get x1. Every power of A will give Anx1 D x1.Multiplying x2 by A gave 1

2x2, and if we multiply again we get . 1

2/2 times x2.

When A is squared, the eigenvectors stay the same. The eigenvalues are squared.

This pattern keeps going, because the eigenvectors stay in their own directions (Figure 6.1)and never get mixed. The eigenvectors of A100 are the same x1 and x2. The eigenvaluesof A100 are 1100 D 1 and . 1

2/100 D very small number.

�� D 1 Ax1 D x1 D

�:6

:4

�

Ax2 D �2x2 D�

:5

�:5

�� D :5

x2 D�

1

�1

�

�

A2x1 D .1/2x1

A2x2 D .:5/2x2 D�

:25

�:25

�

Ax D �x

A2x D �2x

�2 D :25

�2 D 1

Figure 6.1: The eigenvectors keep their directions. A2 has eigenvalues 12 and .:5/2.

Other vectors do change direction. But all other vectors are combinations of the twoeigenvectors. The first column of A is the combination x1 C .:2/x2:

Separate into eigenvectors�

:8

:2

�D x1 C .:2/x2 D

�:6

:4

�C

�:2

�:2

�: (1)

6.1. Introduction to Eigenvalues 285

Multiplying by A gives .:7; :3/, the first column of A2. Do it separately for x1 and .:2/x2.Of course Ax1 D x1. And A multiplies x2 by its eigenvalue 1

2:

Multiply each x i by �i A

�:8

:2

�D

�:7

:3

�is x1 C 1

2.:2/x2 D

�:6

:4

�C

�:1

�:1

�:

Each eigenvector is multiplied by its eigenvalue, when we multiply by A. We didn’t needthese eigenvectors to find A2. But it is the good way to do 99 multiplications. At every stepx1 is unchanged and x2 is multiplied by . 1

2/, so we have . 1

2/99:

A99

�:8

:2

�is really x1 C .:2/.

1

2/99x2 D

�:6

:4

�C

24 very

smallvector

35 :

This is the first column of A100. The number we originally wrote as :6000 was not exact.We left out .:2/. 1

2/99 which wouldn’t show up for 30 decimal places.

The eigenvector x1 is a “steady state” that doesn’t change (because �1 D 1/. Theeigenvector x2 is a “decaying mode” that virtually disappears (because �2 D :5/. Thehigher the power of A, the closer its columns approach the steady state.

We mention that this particular A is a Markov matrix. Its entries are positive andevery column adds to 1. Those facts guarantee that the largest eigenvalue is � D 1 (as wefound). Its eigenvector x1 D .:6; :4/ is the steady state—which all columns of Ak willapproach. Section 8.3 shows how Markov matrices appear in applications like Google.

For projections we can spot the steady state .� D 1/ and the nullspace .� D 0/.

Example 2 The projection matrix P D�

:5 :5

:5 :5

�has eigenvalues � D 1 and � D 0.

Its eigenvectors are x1 D .1; 1/ and x2 D .1; �1/. For those vectors, P x1 D x1 (steadystate) and P x2 D 0 (nullspace). This example illustrates Markov matrices and singularmatrices and (most important) symmetric matrices. All have special �’s and x’s:

1. Each column of P D�

:5 :5

:5 :5

�adds to 1, so � D 1 is an eigenvalue.

2. P is singular, so � D 0 is an eigenvalue.

3. P is symmetric, so its eigenvectors .1; 1/ and .1; �1/ are perpendicular.

The only eigenvalues of a projection matrix are 0 and 1. The eigenvectors for � D 0

(which means P x D 0x/ fill up the nullspace. The eigenvectors for � D 1 (which meansP x D x/ fill up the column space. The nullspace is projected to zero. The column spaceprojects onto itself. The projection keeps the column space and destroys the nullspace:

Project each part v D�

1

�1

�C

�2

2

�projects onto P v D

�0

0

�C

�2

2

�:

Special properties of a matrix lead to special eigenvalues and eigenvectors.That is a major theme of this chapter (it is captured in a table at the very end).


Projections have � D 0 and 1. Permutations have all j�j D 1. The next matrix R (areflection and at the same time a permutation) is also special.

Example 3 The reflection matrix R D �0 11 0

�has eigenvalues 1 and �1.

The eigenvector .1; 1/ is unchanged by R. The second eigenvector is .1; �1/—its signsare reversed by R. A matrix with no negative entries can still have a negative eigenvalue!The eigenvectors for R are the same as for P , because reflection D 2.projection/ � I :

R D 2P � I

�0 1

1 0

�D 2

�:5 :5

:5 :5

��

�1 0

0 1

�: (2)

Here is the point. If P x D �x then 2P x D 2�x. The eigenvalues are doubled whenthe matrix is doubled. Now subtract Ix D x. The result is .2P � I /x D .2� � 1/x.When a matrix is shifted by I , each � is shifted by 1. No change in eigenvectors.

Figure 6.2: Projections P have eigenvalues 1 and 0. Reflections R have � D 1 and �1.A typical x changes direction, but not the eigenvectors x1 and x2.

Key idea: The eigenvalues of R and P are related exactly as the matrices are related:

The eigenvalues of R D 2P � I are 2.1/ � 1 D 1 and 2.0/ � 1 D �1.

The eigenvalues of R2 are �2. In this case R2 D I . Check .1/2 D 1 and .�1/2 D 1.

The Equation for the Eigenvalues

For projections and reflections we found �’s and x’s by geometry: P x D x; P x D 0;

Rx D �x. Now we use determinants and linear algebra. This is the key calculation inthe chapter—almost every application starts by solving Ax D �x.

First move �x to the left side. Write the equation Ax D �x as .A � �I /x D 0. Thematrix A � �I times the eigenvector x is the zero vector. The eigenvectors make up thenullspace of A � �I . When we know an eigenvalue �, we find an eigenvector by solving.A � �I /x D 0.

Eigenvalues first. If .A � �I /x D 0 has a nonzero solution, A � �I is not invertible.The determinant of A � �I must be zero. This is how to recognize an eigenvalue �:


Eigenvalues The number � is an eigenvalue of A if and only if A � �I is singular:

det.A � �I / D 0: (3)

This “characteristic equation” det.A � �I / D 0 involves only �, not x. When A is n by n,the equation has degree n. Then A has n eigenvalues and each � leads to x:

For each � solve .A � �I /x D 0 or Ax D �x to find an eigenvector x:

Example 4 A D�

1 2

2 4

�is already singular (zero determinant). Find its �’s and x’s.

When A is singular, � D 0 is one of the eigenvalues. The equation Ax D 0x hassolutions. They are the eigenvectors for � D 0. But det.A � �I / D 0 is the way to find all�’s and x’s. Always subtract �I from A:

Subtract � from the diagonal to find A � �I D�

1 � � 2

2 4 � �

�: (4)

Take the determinant “ad � bc” of this 2 by 2 matrix. From 1 � � times 4 � �,the “ad” part is �2 � 5� C 4. The “bc” part, not containing �, is 2 times 2.

det

�1 � � 2

2 4 � �

�D .1 � �/.4 � �/ � .2/.2/ D �2 � 5�: (5)

Set this determinant �2 � 5� to zero. One solution is � D 0 (as expected, since A issingular). Factoring into � times � � 5, the other root is � D 5:

det.A � �I / D �2 � 5� D 0 yields the eigenvalues �1 D 0 and �2 D 5 :

Now find the eigenvectors. Solve .A � �I /x D 0 separately for �1 D 0 and �2 D 5:

.A � 0I /x D�

1 2

2 4

� �y

z

�D

�0

0

�yields an eigenvector

�y

z

�D

�2

�1

�for �1 D 0

.A � 5I /x D��4 2

2 �1

� �y

z

�D

�0

0

�yields an eigenvector

�y

z

�D

�1

2

�for �2 D 5:

The matrices A � 0I and A � 5I are singular (because 0 and 5 are eigenvalues). Theeigenvectors .2; �1/ and .1; 2/ are in the nullspaces: .A � �I /x D 0 is Ax D �x.

We need to emphasize: There is nothing exceptional about � D 0. Like every othernumber, zero might be an eigenvalue and it might not. If A is singular, it is. The eigenvec-tors fill the nullspace: Ax D 0x D 0. If A is invertible, zero is not an eigenvalue. We shiftA by a multiple of I to make it singular.

In the example, the shifted matrix A � 5I is singular and 5 is the other eigenvalue.


Summary To solve the eigenvalue problem for an n by n matrix, follow these steps:

1. Compute the determinant of A � �I . With � subtracted along the diagonal, thisdeterminant starts with �n or ��n. It is a polynomial in � of degree n.

2. Find the roots of this polynomial, by solving det.A � �I / D 0. The n roots arethe n eigenvalues of A. They make A � �I singular.

3. For each eigenvalue �, solve .A � �I /x D 0 to find an eigenvector x.

A note on the eigenvectors of 2 by 2 matrices. When A � �I is singular, both rows aremultiples of a vector .a; b/. The eigenvector is any multiple of .b; �a/. The example had� D 0 and � D 5:

� D 0 W rows of A � 0I in the direction .1; 2/; eigenvector in the direction .2; �1/

� D 5 W rows of A � 5I in the direction .�4; 2/; eigenvector in the direction .2; 4/:

Previously we wrote that last eigenvector as .1; 2/. Both .1; 2/ and .2; 4/ are correct.There is a whole line of eigenvectors—any nonzero multiple of x is as good as x.MATLAB’s eig.A/ divides by the length, to make the eigenvector into a unit vector.

We end with a warning. Some 2 by 2 matrices have only one line of eigenvectors.This can only happen when two eigenvalues are equal. (On the other hand A D I hasequal eigenvalues and plenty of eigenvectors.) Similarly some n by n matrices don’t haven independent eigenvectors. Without n eigenvectors, we don’t have a basis. We can’t writeevery v as a combination of eigenvectors. In the language of the next section, we can’tdiagonalize a matrix without n independent eigenvectors.

Good News, Bad News

Bad news first: If you add a row of A to another row, or exchange rows, the eigenvaluesusually change. Elimination does not preserve the �’s. The triangular U has its eigenvaluessitting along the diagonal—they are the pivots. But they are not the eigenvalues of A!Eigenvalues are changed when row 1 is added to row 2:

U D�

1 3

0 0

�has � D 0 and � D 1; A D

�1 3

2 6

�has � D 0 and � D 7:

Good news second: The product �1 times �2 and the sum �1 C �2 can be found quicklyfrom the matrix. For this A, the product is 0 times 7. That agrees with the determinant(which is 0/. The sum of eigenvalues is 0 C 7. That agrees with the sum down the maindiagonal (the trace is 1 C 6/. These quick checks always work:

The product of the n eigenvalues equals the determinant.The sum of the n eigenvalues equals the sum of the n diagonal entries.


The sum of the entries on the main diagonal is called the trace of A:

�1 C �2 C � � � C �n D trace D a11 C a22 C � � � C ann: (6)

Those checks are very useful. They are proved in Problems 16–17 and again in the nextsection. They don’t remove the pain of computing �’s. But when the computation is wrong,they generally tell us so. To compute the correct �’s, go back to det.A � �I / D 0.

The determinant test makes the product of the �’s equal to the product of the pivots(assuming no row exchanges). But the sum of the �’s is not the sum of the pivots—as theexample showed. The individual �’s have almost nothing to do with the pivots. In this newpart of linear algebra, the key equation is really nonlinear: � multiplies x.

Why do the eigenvalues of a triangular matrix lie on its diagonal?

Imaginary Eigenvalues

One more bit of news (not too terrible). The eigenvalues might not be real numbers.

Example 5 The 90ı rotation QD �0 1

�1 0

�has no real eigenvectors. Its eigenvalues are

� D i and � D �i . Sum of �’s D trace D 0. Product D determinant D 1.

After a rotation, no vector Qx stays in the same direction as x (except x D 0 which isuseless). There cannot be an eigenvector, unless we go to imaginary numbers. Which wedo.

To see how i can help, look at Q2 which is �I . If Q is rotation through 90ı, thenQ2 is rotation through 180ı. Its eigenvalues are �1 and �1. (Certainly �Ix D �1x.)Squaring Q will square each �, so we must have �2 D �1. The eigenvalues of the 90ırotation matrix Q are Ci and �i , because i 2 D �1.

Those �’s come as usual from det.Q � �I / D 0. This equation gives �2 C 1 D 0.Its roots are i and �i . We meet the imaginary number i also in the eigenvectors:

Complexeigenvectors

�0 1

�1 0

� �1

i

�D i

�1

i

�and

�0 1

�1 0

� �i

1

�D �i

�i

1

�:

Somehow these complex vectors x1 D .1; i/ and x2 D .i; 1/ keep their direction asthey are rotated. Don’t ask me how. This example makes the all-important point that realmatrices can easily have complex eigenvalues and eigenvectors. The particular eigenvaluesi and �i also illustrate two special properties of Q:

1. Q is an orthogonal matrix so the absolute value of each � is j�j D 1.

2. Q is a skew-symmetric matrix so each � is pure imaginary.


A symmetric matrix .AT D A/ can be compared to a real number. A skew-symmetricmatrix .AT D �A/ can be compared to an imaginary number. An orthogonal matrix.ATA D I / can be compared to a complex number with j�j D 1. For the eigenvalues thoseare more than analogies—they are theorems to be proved in Section 6:4.

The eigenvectors for all these special matrices are perpendicular. Somehow .i; 1/ and.1; i/ are perpendicular (Chapter 10 explains the dot product of complex vectors).

Eigshow in MATLAB

There is a MATLAB demo (just type eigshow), displaying the eigenvalue problem for a 2

by 2 matrix. It starts with the unit vector x D .1; 0/. The mouse makes this vector movearound the unit circle. At the same time the screen shows Ax, in color and also moving.Possibly Ax is ahead of x. Possibly Ax is behind x. Sometimes Ax is parallel to x. Atthat parallel moment, Ax D �x (at x1 and x2 in the second figure).

The eigenvalue � is the length of Ax, when the unit eigenvector x lines up. The built-inchoices for A illustrate three possibilities: 0; 1, or 2 directions where Ax crosses x.

0. There are no real eigenvectors. Ax stays behind or ahead of x. This means theeigenvalues and eigenvectors are complex, as they are for the rotation Q.

1. There is only one line of eigenvectors (unusual). The moving directions Ax and x

touch but don’t cross over. This happens for the last 2 by 2 matrix below.

2. There are eigenvectors in two independent directions. This is typical! Ax crosses x

at the first eigenvector x1, and it crosses back at the second eigenvector x2. ThenAx and x cross again at �x1 and �x2.

You can mentally follow x and Ax for these five matrices. Under the matrices I willcount their real eigenvectors. Can you see where Ax lines up with x?

A D�

2 0

0 1

� �0 1

1 0

� �0 1

�1 0

� �1 �1

1 �1

� �1 1

0 1

�

2 2 0 1 1


When A is singular (rank one), its column space is a line. The vector Ax goes upand down that line while x circles around. One eigenvector x is along the line. Anothereigenvector appears when Ax2 D 0. Zero is an eigenvalue of a singular matrix.

REVIEW OF THE KEY IDEAS

1. Ax D �x says that eigenvectors x keep the same direction when multiplied by A.

2. Ax D �x also says that det.A � �I / D 0. This determines n eigenvalues.

3. The eigenvalues of A2 and A�1 are �2 and ��1, with the same eigenvectors.

4. The sum of the �’s equals the sum down the main diagonal of A (the trace).The product of the �’s equals the determinant.

5. Projections P , reflections R, 90ı rotations Q have special eigenvalues 1; 0; �1; i; �i .Singular matrices have � D 0. Triangular matrices have �’s on their diagonal.

WORKED EXAMPLES

6.1 A Find the eigenvalues and eigenvectors of A and A2 and A�1 and A C 4I :

A D�

2 �1

�1 2

�and A2 D

�5 �4

�4 5

�:

Check the trace �1 C �2 and the determinant �1�2 for A and also A2.

Solution The eigenvalues of A come from det.A � �I / D 0:

det.A � �I / Dˇ̌̌ˇ2 � � �1

�1 2 � �

ˇ̌̌ˇ D �2 � 4� C 3 D 0:

This factors into .��1/.��3/ D 0 so the eigenvalues of A are �1 D 1 and �2 D 3. For thetrace, the sum 2C2 agrees with 1C3. The determinant 3 agrees with the product �1�2 D 3.The eigenvectors come separately by solving .A � �I /x D 0 which is Ax D �x:

� D 1: .A � I /x D�

1 �1

�1 1

� �x

y

�D

�0

0

�gives the eigenvector x1 D

�1

1

�

� D 3: .A � 3I /x D��1 �1

�1 �1

� �x

y

�D

�0

0

�gives the eigenvector x2 D

�1

�1

�


A2 and A�1 and A C 4I keep the same eigenvectors as A. Their eigenvalues are �2 and��1 and � C 4:

A2 has eigenvalues 12 D 1 and 32 D 9 A�1 has1

1and

1

3A C 4I has

1 C 4 D 5

3 C 4 D 7

The trace of A2 is 5 C 5 which agrees with 1 C 9. The determinant is 25 � 16 D 9.Notes for later sections: A has orthogonal eigenvectors (Section 6.4 on symmetric

matrices). A can be diagonalized since �1 ¤ �2 (Section 6.2). A is similar to any 2 by 2

matrix with eigenvalues 1 and 3 (Section 6.6). A is a positive definite matrix (Section 6.5)since A D AT and the �’s are positive.

6.1 B Find the eigenvalues and eigenvectors of this 3 by 3 matrix A:

Symmetric matrixSingular matrixTrace 1 C 2 C 1 D 4

A D24 1 �1 0

�1 2 �1

0 �1 1

35

Solution Since all rows of A add to zero, the vector x D .1; 1; 1/ gives Ax D 0. Thisis an eigenvector for the eigenvalue � D 0. To find �2 and �3 I will compute the 3 by 3

determinant:

det.A � �I / Dˇ̌̌ˇ̌̌1 � � �1 0

�1 2 � � �1

0 �1 1 � �

ˇ̌̌ˇ̌̌ D .1 � �/.2 � �/.1 � �/ � 2.1 � �/

D .1 � �/Œ.2 � �/.1 � �/ � 2�

D .1 � �/.��/.3 � �/:

That factor �� confirms that � D 0 is a root, and an eigenvalue of A. The other factors.1 � �/ and .3 � �/ give the other eigenvalues 1 and 3, adding to 4 (the trace). Eacheigenvalue 0; 1; 3 corresponds to an eigenvector :

x1 D241

1

1

35 Ax1 D 0x1 x2 D

24 1

0

�1

35 Ax2 D 1x2 x3 D

24 1

�2

1

35 Ax3 D 3x3 :

I notice again that eigenvectors are perpendicular when A is symmetric.The 3 by 3 matrix produced a third-degree (cubic) polynomial for det.A � �I / D

��3 C 4�2 � 3�. We were lucky to find simple roots � D 0; 1; 3. Normally we would usea command like eig.A/, and the computation will never even use determinants (Section 9.3shows a better way for large matrices).

The full command ŒS; D� D eig.A/ will produce unit eigenvectors in the columns ofthe eigenvector matrix S . The first one happens to have three minus signs, reversed from.1; 1; 1/ and divided by

p3. The eigenvalues of A will be on the diagonal of the eigenvalue

matrix (typed as D but soon called ƒ).


Problem Set 6.11 The example at the start of the chapter has powers of this matrix A:

A D�

:8 :3

:2 :7

�and A2 D

�:70 :45

:30 :55

�and A1 D

�:6 :6

:4 :4

�:

Find the eigenvalues of these matrices. All powers have the same eigenvectors.

(a) Show from A how a row exchange can produce different eigenvalues.

(b) Why is a zero eigenvalue not changed by the steps of elimination?

2 Find the eigenvalues and the eigenvectors of these two matrices:

A D�

1 4

2 3

�and A C I D

�2 4

2 4

�:

A C I has the eigenvectors as A. Its eigenvalues are by 1.

3 Compute the eigenvalues and eigenvectors of A and A�1. Check the trace !

A D�

0 2

1 1

�and A�1 D

��1=2 1

1=2 0

�:

A�1 has the eigenvectors as A. When A has eigenvalues �1 and �2, its inversehas eigenvalues .

4 Compute the eigenvalues and eigenvectors of A and A2:

A D��1 3

2 0

�and A2 D

�7 �3

�2 6

�:

A2 has the same as A. When A has eigenvalues �1 and �2, A2 has eigenvalues. In this example, why is �2

1 C �22 D 13?

5 Find the eigenvalues of A and B (easy for triangular matrices) and A C B:

A D�

3 0

1 1

�and B D

�1 1

0 3

�and A C B D

�4 1

1 4

�:

Eigenvalues of A C B (are equal to)(are not equal to) eigenvalues of A plus eigen-values of B .

6 Find the eigenvalues of A and B and AB and BA:

A D�

1 0

1 1

�and B D

�1 2

0 1

�and AB D

�1 2

1 3

�and BA D

�3 2

1 1

�:

(a) Are the eigenvalues of AB equal to eigenvalues of A times eigenvalues of B?

(b) Are the eigenvalues of AB equal to the eigenvalues of BA?


7 Elimination produces A D LU . The eigenvalues of U are on its diagonal; theyare the . The eigenvalues of L are on its diagonal; they are all . Theeigenvalues of A are not the same as .

8 (a) If you know that x is an eigenvector, the way to find � is to .

(b) If you know that � is an eigenvalue, the way to find x is to .

9 What do you do to the equation Ax D �x, in order to prove (a), (b), and (c)?

(a) �2 is an eigenvalue of A2, as in Problem 4.

(b) ��1 is an eigenvalue of A�1, as in Problem 3.

(c) � C 1 is an eigenvalue of A C I , as in Problem 2.

10 Find the eigenvalues and eigenvectors for both of these Markov matrices A and A1.Explain from those answers why A100 is close to A1:

A D�

:6 :2

:4 :8

�and A1 D

�1=3 1=3

2=3 2=3

�:

11 Here is a strange fact about 2 by 2 matrices with eigenvalues �1 ¤ �2: The columnsof A � �1I are multiples of the eigenvector x2. Any idea why this should be?

12 Find three eigenvectors for this matrix P (projection matrices have �D1 and 0):

Projection matrix P D24 :2 :4 0

:4 :8 0

0 0 1

35 :

If two eigenvectors share the same �, so do all their linear combinations. Find aneigenvector of P with no zero components.

13 From the unit vector u D �16; 1

6; 3

6; 5

6

construct the rank one projection matrix

P D uuT. This matrix has P 2 D P because uTu D 1.

(a) P uDu comes from .uuT/uDu. /. Then u is an eigenvector with �D1.

(b) If v is perpendicular to u show that P v D 0. Then � D 0.

(c) Find three independent eigenvectors of P all with eigenvalue � D 0.

14 Solve det.Q � �I / D 0 by the quadratic formula to reach � D cos � ˙ i sin � :

Q D�

cos � � sin �

sin � cos �

�rotates the xy plane by the angle � . No real �’s.

Find the eigenvectors of Q by solving .Q � �I /x D 0. Use i 2 D �1.


15 Every permutation matrix leaves x D .1; 1; : : :; 1/ unchanged. Then � D 1. Findtwo more �’s (possibly complex) for these permutations, from det.P � �I / D 0 :

P D240 1 0

0 0 1

1 0 0

35 and P D

240 0 1

0 1 0

1 0 0

35 :

16 The determinant of A equals the product �1�2 � � � �n. Start with the polynomialdet.A � �I / separated into its n factors (always possible). Then set � D 0 :

det.A � �I / D .�1 � �/.�2 � �/ � � � .�n � �/ so det A D :

Check this rule in Example 1 where the Markov matrix has � D 1 and 12

.

17 The sum of the diagonal entries (the trace) equals the sum of the eigenvalues:

A D�

a b

c d

�has det.A � �I / D �2 � .a C d/� C ad � bc D 0:

The quadratic formula gives the eigenvalues � D .aCd Cp/=2 and � D .

Their sum is . If A has �1 D 3 and �2 D 4 then det.A � �I / D .

18 If A has �1 D 4 and �2 D 5 then det.A � �I / D .� � 4/.� � 5/ D �2 � 9� C 20.Find three matrices that have trace a C d D 9 and determinant 20 and � D 4; 5.

19 A 3 by 3 matrix B is known to have eigenvalues 0; 1; 2. This information is enoughto find three of these (give the answers where possible) :

(a) the rank of B

(b) the determinant of BTB

(c) the eigenvalues of BTB

(d) the eigenvalues of .B2 C I /�1.

20 Choose the last rows of A and C to give eigenvalues 4; 7 and 1; 2; 3:

Companion matrices A D�

0 1

� ��

C D240 1 0

0 0 1

� � �

35 :

21 The eigenvalues of A equal the eigenvalues of AT. This is because det.A � �I /

equals det.AT � �I /. That is true because . Show by an example that theeigenvectors of A and AT are not the same.

22 Construct any 3 by 3 Markov matrix M : positive entries down each column add to 1.Show that M T.1; 1; 1/ D .1; 1; 1/. By Problem 21, � D 1 is also an eigenvalueof M . Challenge: A 3 by 3 singular Markov matrix with trace 1

2has what �’s ?


23 Find three 2 by 2 matrices that have �1 D �2 D 0. The trace is zero and thedeterminant is zero. A might not be the zero matrix but check that A2 D 0.

24 This matrix is singular with rank one. Find three �’s and three eigenvectors:

A D241

2

1

35 �

2 1 2� D

242 1 2

4 2 4

2 1 2

35 :

25 Suppose A and B have the same eigenvalues �1; : : :; �n with the same independenteigenvectors x1; : : :; xn. Then A D B . Reason: Any vector x is a combinationc1x1 C � � � C cnxn. What is Ax? What is Bx?

26 The block B has eigenvalues 1; 2 and C has eigenvalues 3; 4 and D has eigenval-ues 5; 7. Find the eigenvalues of the 4 by 4 matrix A:

A D�

B C

0 D

�D

2664

0 1 3 0

�2 3 0 4

0 0 6 1

0 0 1 6

3775 :

27 Find the rank and the four eigenvalues of A and C :

A D

2664

1 1 1 1

1 1 1 1

1 1 1 1

1 1 1 1

3775 and C D

2664

1 0 1 0

0 1 0 1

1 0 1 0

0 1 0 1

3775 :

28 Subtract I from the previous A. Find the �’s and then the determinants of

B D A � I D

2664

0 1 1 1

1 0 1 1

1 1 0 1

1 1 1 0

3775 and C D I � A D

2664

0 �1 �1 �1

�1 0 �1 �1

�1 �1 0 �1

�1 �1 �1 0

3775 :

29 (Review) Find the eigenvalues of A, B , and C :

A D241 2 3

0 4 5

0 0 6

35 and B D

240 0 1

0 2 0

3 0 0

35 and C D

242 2 2

2 2 2

2 2 2

35 :

30 When a C b Dc C d show that .1; 1/ is an eigenvector and find both eigenvalues :

A D�

a b

c d

�:


31 If we exchange rows 1 and 2 and columns 1 and 2, the eigenvalues don’t change.Find eigenvectors of A and B for � D 11. Rank one gives �2 D �3 D 0.

A D241 2 1

3 6 3

4 8 4

35 and B D PAP T D

246 3 3

2 1 1

8 4 4

35 :

32 Suppose A has eigenvalues 0; 3; 5 with independent eigenvectors u; v; w.

(a) Give a basis for the nullspace and a basis for the column space.

(b) Find a particular solution to Ax D v C w. Find all solutions.

(c) Ax Du has no solution. If it did then would be in the column space.

33 Suppose u; v are orthonormal vectors in R2, and A D uvT. Compute A2 D uvTuvT

to discover the eigenvalues of A. Check that the trace of A agrees with �1 C �2.

34 Find the eigenvalues of this permutation matrix P from det .P � �I / D 0. Whichvectors are not changed by the permutation? They are eigenvectors for � D 1. Canyou find three more eigenvectors?

P D

2664

0 0 0 1

1 0 0 0

0 1 0 0

0 0 1 0

3775 :

Challenge Problems

35 There are six 3 by 3 permutation matrices P . What numbers can be the determinantsof P ? What numbers can be pivots? What numbers can be the trace of P ? Whatfour numbers can be eigenvalues of P , as in Problem 15?

36 Is there a real 2 by 2 matrix (other than I ) with A3 D I ? Its eigenvalues must satisfy�3 D 1. They can be e2�i=3 and e�2�i=3. What trace and determinant would thisgive? Construct a rotation matrix as A (which angle of rotation?).

37 (a) Find the eigenvalues and eigenvectors of A. They depend on c:

A D�

:4 1 � c

:6 c

�:

(b) Show that A has just one line of eigenvectors when c D 1:6.

(c) This is a Markov matrix when c D :8. Then An will approach what matrix A1?

Outlines

Cramer’s Rule and Gauss Elimination

Mike Renfro

September 28, 2004

Mike Renfro Cramer’s Rule and Gauss Elimination

OutlinesPart I: Review of Previous LecturePart II: Cramer’s Rule and Gauss Elimination

Review of Previous Lecture


OutlinesPart I: Review of Previous LecturePart II: Cramer’s Rule and Gauss Elimination


Cramer’s RuleIntroductionExampleAdvantages and Disadvantages

Gauss EliminationIntroduction and RulesExampleMatrix Version and ExampleAdvantages and Disadvantages

Homework


Part I




Graphical interpretation

Solvable and unsolvable problems

Linear dependence and independence

Ill-conditioning


Cramer’s RuleGauss Elimination

Homework

Part II




Homework

IntroductionExampleAdvantages and Disadvantages

Cramer’s Rule (1750)

A linear system of equations can be solved by using Cramer’s rule,which for a system of 2 equations[

a11 a12

a21 a22

]{x1

x2

}=

{b1

b2

}yields

x1 =|[A1]||[A]|

, x2 =|[A2]||[A]|

where

A1 =

[b1 a12

b2 a22

]A2 =

[a11 b1

a21 b2

]



Homework


Cramer’s Rule Details

For a system of n equations, Cramer’s rule requires that youcalculate n + 1 determinants of n × n matrices.

In the general case for a system of equations [A]{x} = {b},the matrix [Ai ] is obtained by replacing the ith column of theoriginal [A] matrix with the contents of the {b} vector.

Each unknown variable xi is found by dividing the determinant|[Ai ]| by the determinant of the original coefficient matrix|[A]|.



Homework


Cramer’s Rule Example

Solve the following system of equations using Cramer’s rule:

x1 − x2 + x3 = 3

2x1 + x2 − x3 = 0

3x1 + 2x2 + 2x3 = 15

Convert the system of equations into matrix form: 1 −1 12 1 −13 2 2

x1

x2

x3

=

30

15



Homework


Cramer’s Rule Example (continued)

[A] =

1 −1 12 1 −13 2 2

, {x} =

x1

x2

x3

, {b} =

30

15

Define matrices [A1], [A2], and [A3] as

[A1] =

3 −1 10 1 −1

15 2 2

, [A2] =

1 3 12 0 −13 15 2

,

[A3] =

1 −1 32 1 03 2 15



Homework



Calculate determinants of [A], [A1], [A2], and [A3]:

|[A]| = 12

|[A1]| = 12

|[A2]| = 24

|[A3]| = 48

Unknowns x1, x2, and x3 are then calculated as

x1 =|[A1]||[A]|

=12

12= 1, x2 =

|[A2]||[A]|

=24

12= 2, x3 =

|[A3]||[A]|

=48

12= 4



Homework



Be sure to double-check your answers by substituting them intothe original equations:

x1 − x2 + x3 = 1 − 2 + 4 = 3

2x1 + x2 − x3 = 2 + 2 − 4 = 0

3x1 + 2x2 + 2x3 = 3 + 4 + 8 = 15



Homework


Cramer’s Rule Advantages/Disadvantages

Advantages

Easy to remember steps

Disadvantages

Computationally intensive compared to other methods: themost efficient ways of calculating the determinant of an n × nmatrix require (n − 1)(n!) operations. So Cramer’s rule wouldrequire (n − 1)((n + 1)!) total operations. For 8 equations,that works out to 7(9!) = 2540160 operations, or around 700hours if you can perform one operation per second.

Roundoff error may become significant on large problems withnon-integer coefficients.



Homework

Introduction and RulesExampleMatrix Version and ExampleAdvantages and Disadvantages

Gauss Elimination

Recall the scaffolding problem from the beginning of Chapter 3. Itsmatrix form was

1 1 −1 −1 0 −10 −9 1 4 0 70 0 1 1 −1 00 0 0 −3 2 00 0 0 0 1 10 0 0 0 0 −4

TA

TB

TC

TD

TE

TF

=

P1

−5P1

P2

−P2

P3

−P3

Notice that its coefficient matrix contains nothing but zeroesbelow the diagonal. This is an example of an upper triangularmatrix, and these systems of equations are very easy to solve.



Homework


Gauss Elimination Introduction (continued)

The original system of equations on the scaffolding problem was

TA +TB −TC −TD −TF = P1

−9TB +TC +4TD +7TF = −5P1

TC +TD −TE = P2

−3TD +2TE = −P2

TE +TF = P3

−4TF = −P3

.

Notice that we can solve for TF using only the sixth equation inthe system. That is, TF = P3

4 . After solving for TF , we can solvefor TE using only the fifth equation. The pattern continues,back-substituting through the system of equations until finally wesolve for TA using the first equation.



Homework


Gauss Elimination Introduction (continued)

The goal of Gauss elimination is to convert any given system ofequations into an equivalent upper triangular form. Onceconverted, we can back-substitute through the equations, solvingfor the unknowns algebraically.



Homework


Gauss Elimination Rules

The operations used in converting a system of equations to uppertriangular form are known as elementary operations and are:

Any equation may be multiplied by a nonzero scalar.

Any equation may be added to (or subtracted from) anotherequation.

The positions of any two equations in the system may beswapped.



Homework


Gauss Elimination Example

2x1 − x2 + x3 = 4 (1)

4x1 + 3x2 − x3 = 6 (2)

3x1 + 2x2 + 2x3 = 15 (3)

To eliminate the 4x1 term in Equation 2, multiply Equation 1 by 2and subtract it from Equation 2. To eliminate the 3x1 term inEquation 3, multiply Equation 1 by 3

2 and subtract it fromEquation 3. This gives a system of equations

2x1 − x2 + x3 = 4 (4)

5x2 − 3x3 = −2 (5)

7

2x2 +

1

2x3 = 9 (6)



Homework


Gauss Elimination Example (continued)

To eliminate the 72x2 term from Equation 6, multiply Equation 5

by 710 and subtract it from Equation 6. This gives a system of

equations

2x1 − x2 + x3 = 4 (7)

5x2 − 3x3 = −2 (8)

13

5x3 =

52

5(9)



Homework


Gauss Elimination Example (continued)

Equation 9 can easily be solved for x3.

x3 =52

5

(5

13

)= 4

Equation 8 can easily be solved for x2, once x3 is known.

x2 =1

5(−2 + 3x3) =

1

5(−2 + 3(4)) = 2

Equation 7 can easily be solved for x1, once both x2 and x3 areknown.

x1 =1

2(4 + x2 − x3) =

1

2(4 + 2 − 4) = 1



Homework


Matrix Version of Gauss Elimination

The Gauss elimination method can be applied to a system ofequations in matrix form. Instead of eliminating terms fromequations, we’ll be replacing certain elements of the coefficientmatrix with zeroes.



Homework


Matrix Version (Step 0)

Start by defining the augmented matrix [C (0)] for the problem:

[C (0)] =

a(0)11 a

(0)12 · · · a

(0)1n a

(0)1,n+1

a(0)21 a

(0)22 · · · a

(0)2n a

(0)2,n+1

......

......

...

a(0)n1 a

(0)n2 · · · a

(0)nn a

(0)n,n+1

where the first n columns are the elements of the original [A]matrix, and the last column is the elements of the original {b}matrix.



Homework



Zero out the first column of the [C ] matrix, rows 2 · · · n. To turna21 to a zero, multiply row 1 by a21

a11, then subtract the numbers on

row 1 from row 2. To turn a31 to a zero, multiply row 1 by a31a11

,then subtract the numbers on row 1 from row 3. Repeat for rows4 · · · n.

[C (1)] =

a(0)11 a

(0)12 · · · a

(0)1n a

(0)1,n+1

0 a(1)22 · · · a

(1)2n a

(1)2,n+1

......

......

...

0 a(1)n2 · · · a

(1)nn a

(1)n,n+1



Homework



Zero out the second column of the [C ] matrix, rows 3 · · · n. Toturn a32 to a zero, multiply row 2 by a32

a22, then subtract the

numbers on row 2 from row 3. To turn a42 to a zero, multiply row2 by a42

a22, then subtract the numbers on row 2 from row 4. Repeat

for rows 5 · · · n.

[C (1)] =

a(0)11 a

(0)12 a

(0)13 · · · a

(0)1n a

(0)1,n+1

0 a(1)22 a

(1)13 · · · a

(1)2n a

(1)2,n+1

0 0 a(2)33 · · · a

(2)3n a

(2)3,n+1

......

......

......

0 0 a(2)n3 · · · a

(2)nn a

(2)n,n+1



Homework


Matrix Version (Step n-1)

Zero out the (n − 1)th column of the [C ] matrix, row n. To turnan,n−1 to a zero, multiply row n − 1 by

an,n−1

an−1,n−1, then subtract the

numbers on row n − 1 from row n.

[C (1)] =

a(0)11 a

(0)12 a

(0)13 · · · a

(0)1n a

(0)1,n+1

0 a(1)22 a

(1)13 · · · a

(1)2n a

(1)2,n+1

0 0 a(2)33 · · · a

(2)3n a

(2)3,n+1

......

......

......

0 0 0 · · · a(n−1)nn a

(n−1)n,n+1



Homework


Example

Solve the following system of equations with Gauss elimination: 2 −1 14 3 −13 2 2

x1

x2

x3

=

46

15

First, set up the augmented matrix [C (0)]:

[C (0)] =

2 −1 1 44 3 −1 63 2 2 15



Homework


Example (continued)

Step 1a: eliminate the 4 on row 2, column 1. Multiply all theelements of row 1 by a21

a11= 4

2 = 2, then subtract them from theelements of row 2.

[C ] =

2 −1 1 44 − (2)(2) 3 − (2)(−1) −1 − (2)(1) 6 − (2)(4)

3 2 2 15

=

2 −1 1 40 5 −3 −23 2 2 15



Homework


Example (continued)

Step 1b: eliminate the 3 on row 3, column 1. Multiply all theelements of row 1 by a31

a11= 3

2 = 1.5, then subtract them from theelements of row 3.

[C (1)] =

2 −1 1 40 5 −3 −2

3 − (1.5)(2) 2 − (1.5)(−1) 2 − (1.5)(1) 15 − (1.5)(4)

=

2 −1 1 40 5 −3 −20 3.5 0.5 9

This completes the first elimination step.



Homework


Example (continued)

Step 2: eliminate the 3.5 on row 3, column 2. Multiply all theelements of row 2 by a32

a22= 3.5

5 = 0.7, then subtract them from theelements of row 3.

[C (2)] =

2 −1 1 40 5 −3 −20 3.5 − (0.7)(5) 0.5 − (0.7)(−3) 9 − (0.7)(−2)

=

2 −1 1 40 5 −3 −20 0 2.6 10.4

This completes the second elimination step.



Homework


Example (continued)

We’ve now converted the original system of equations 2 −1 14 3 −13 2 2

x1

x2

x3

=

46

15

into an equivalent upper-triangular system of equations 2 −1 1

0 5 −30 0 2.6

x1

x2

x3

=

4

−210.4



Homework


Example (continued)

The new system of equations can be converted back to algebraicform as:

2x1 − x2 + x3 = 4 (10)

5x2 − 3x3 = −2 (11)

2.6x3 = 10.4 (12)

Solve Equation 12 for x3: x3 = 10.42.6 = 4. Then solve Equation 11

for x2: x2 = 15(−2 + 3x3) = 2. Then solve Equation 10 for x1:

x1 = 12(4 + x2 − x3) = 1.



Homework


Example (continued)

Double-check the solution by substituting the values of x1, x2, andx3 into the original equations:

2x1 − x2 + x3 = 2(1) − 2 + 4 = 4

4x1 − 3x2 − x3 = 4(1) + 3(2) − 4 = 6

3x1 + 2x2 + 2x3 = 3(1) + 2(2) + 2(4) = 15



Homework


Gauss Elimination Advantages/Disadvantages

Advantages

Much less computation required for larger problems. Gausselimination requires n3

3 multiplications to solve a system of nequations. For 8 equations, this works out to around 170operations, versus the roughly 2.5 million operations forCramer’s rule.

Disadvantages

Not quite as easy to remember the procedure for handsolutions.

Roundoff error may become significant, but can be partiallymitigated by using more advanced techniques such as pivotingor scaling.



Homework

Homework

Solve Problem 3.4 using: Cramer’s rule, Gauss elimination,and MATLAB’s \ operator. Double-check your answers bysubstituting them back into the original equations.


Lecture 12

LU Decomposition

In many applications where linear systems appear, one needs to solve Ax = b for many differentvectors b. For instance, a structure must be tested under several different loads, not just one. As inthe example of a truss (9.2), the loading in such a problem is usually represented by the vector b.Gaussian elimination with pivoting is the most efficient and accurate way to solve a linear system.Most of the work in this method is spent on the matrix A itself. If we need to solve several differentsystems with the same A, and A is big, then we would like to avoid repeating the steps of Gaussianelimination on A for every different b. This can be accomplished by the LU decomposition, whichin effect records the steps of Gaussian elimination.

LU decomposition

The main idea of the LU decomposition is to record the steps used in Gaussian elimination on A inthe places where the zero is produced. Consider the matrix:

A =

1 −2 32 −5 120 2 −10

.

The first step of Gaussian elimination is to subtract 2 times the first row from the second row. Inorder to record what we have done, we will put the multiplier, 2, into the place it was used to makea zero, i.e. the second row, first column. In order to make it clear that it is a record of the step andnot an element of A, we will put it in parentheses. This leads to:

1 −2 3(2) −1 60 2 −10

.

There is already a zero in the lower left corner, so we don’t need to eliminate anything there. Werecord this fact with a (0). To eliminate the third row, second column, we need to subtract −2times the second row from the third row. Recording the −2 in the spot it was used we have:

1 −2 3(2) −1 6(0) (−2) 2

.

Let U be the upper triangular matrix produced, and let L be the lower triangular matrix with therecords and ones on the diagonal, i.e.:

L =

1 0 02 1 00 −2 1

, U =

1 −2 30 −1 60 0 2

,

43

44 LECTURE 12. LU DECOMPOSITION

then we have the following mysterious coincidence:

LU =

1 0 02 1 00 −2 1

1 −2 30 −1 60 0 2

=

1 −2 32 −5 120 2 −10

= A.

Thus we see that A is actually the product of L and U . Here L is lower triangular and U isupper triangular. When a matrix can be written as a product of simpler matrices, we call that adecomposition of A and this one we call the LU decomposition.

Using LU to solve equations

If we also include pivoting, then an LU decomposition for A consists of three matrices P , L and U

such that:PA = LU. (12.1)

The pivot matrix P is the identity matrix, with the same rows switched as the rows of A are switchedin the pivoting. For instance:

P =

1 0 00 0 10 1 0

,

would be the pivot matrix if the second and third rows of A are switched by pivoting. Matlab willproduce an LU decomposition with pivoting for a matrix A with the following command:> [L U P] = lu(A)

where P is the pivot matrix. To use this information to solve Ax = b we first pivot both sides bymultiplying by the pivot matrix:

PAx = Pb ≡ d.

Substituting LU for PA we get:LUx = d.

Then we need only to solve two back substitution problems:

Ly = d

andUx = y.

In Matlab this would work as follows:> A = rand(5,5)

> [L U P] = lu(A)

> b = rand(5,1)

> d = P*b

> y = L\d

> x = U\y

> rnorm = norm(A*x - b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Check the result.We can then solve for any other b without redoing the LU step. Repeat the sequence for a newright hand side: c = randn(5,1); you can start at the third line. While this may not seem like abig savings, it would be if A were a large matrix from an actual application.

45

Exercises

12.1 Solve the systems below by hand using the LU decomposition. Pivot if appropriate. In eachof the two problems, check by hand that LU = PA and Ax = b.

(a) A =

(

2 4.5 4

)

, b =

(

0−3

)

(b) A =

(

1 43 5

)

, b =

(

32

)

12.2 Finish the following Matlab function program:

function [x1, e1, x2, e2] = mysolve(A,b)

% Solves linear systems using the LU decomposition with pivoting

% and also with the built-in solve function A\b.

% Inputs: A -- the matrix

% b -- the right-hand vector

% Outputs: x1 -- the solution using the LU method

% e1 -- the norm of the residual using the LU method

% x2 -- the solution using the built-in method

% e2 -- the norm of the residual using the built-in method

format long

Test the program on both random matrices (randn(n,n)) and Hilbert matrices (hilb(n))with n really big and document the results.

Chapter 2

Linear Equations

One of the problems encountered most frequently in scientific computation is thesolution of systems of simultaneous linear equations. This chapter covers the solu-tion of linear systems by Gaussian elimination and the sensitivity of the solution toerrors in the data and roundoff errors in the computation.

2.1 Solving Linear SystemsWith matrix notation, a system of simultaneous linear equations is written

Ax = b.

In the most frequent case, when there are as many equations as unknowns, A is agiven square matrix of order n, b is a given column vector of n components, and xis an unknown column vector of n components.

Students of linear algebra learn that the solution to Ax = b can be writtenx = A−1b, where A−1 is the inverse of A. However, in the vast majority of practicalcomputational problems, it is unnecessary and inadvisable to actually computeA−1. As an extreme but illustrative example, consider a system consisting of justone equation, such as

7x = 21.

The best way to solve such a system is by division:

x =217

= 3.

Use of the matrix inverse would lead to

x = 7−1 × 21 = 0.142857× 21 = 2.99997.

The inverse requires more arithmetic—a division and a multiplication instead ofjust a division—and produces a less accurate answer. Similar considerations apply

February 15, 2008

1

2 Chapter 2. Linear Equations

to systems of more than one equation. This is even true in the common situationwhere there are several systems of equations with the same matrix A but differentright-hand sides b. Consequently, we shall concentrate on the direct solution ofsystems of equations rather than the computation of the inverse.

2.2 The MATLAB Backslash OperatorTo emphasize the distinction between solving linear equations and computing in-verses, Matlab has introduced nonstandard notation using backward slash andforward slash operators, “\” and “/”.

If A is a matrix of any size and shape and B is a matrix with as many rowsas A, then the solution to the system of simultaneous equations

AX = B

is denoted byX = A\B.

Think of this as dividing both sides of the equation by the coefficient matrix A.Because matrix multiplication is not commutative and A occurs on the left in theoriginal equation, this is left division.

Similarly, the solution to a system with A on the right and B with as manycolumns as A,

XA = B,

is obtained by right division,X = B/A.

This notation applies even if A is not square, so that the number of equations is notthe same as the number of unknowns. However, in this chapter, we limit ourselvesto systems with square coefficient matrices.

2.3 A 3-by-3 ExampleTo illustrate the general linear equation solution algorithm, consider an example oforder three:

10 −7 0−3 2 65 −1 5

x1

x2

x3

=

746

.

This, of course, represents the three simultaneous equations

10x1 − 7x2 = 7,

−3x1 + 2x2 + 6x3 = 4,

5x1 − x2 + 5x3 = 6.

The first step of the solution algorithm uses the first equation to eliminate x1 fromthe other equations. This is accomplished by adding 0.3 times the first equation

2.3. A 3-by-3 Example 3

to the second equation and subtracting 0.5 times the first equation from the thirdequation. The coefficient 10 of x1 in the first equation is called the first pivotand the quantities −0.3 and 0.5, obtained by dividing the coefficients of x1 in theother equations by the pivot, are called the multipliers. The first step changes theequations to

10 −7 00 −0.1 60 2.5 5

x1

x2

x3

=

76.12.5

.

The second step might use the second equation to eliminate x2 from the thirdequation. However, the second pivot, which is the coefficient of x2 in the secondequation, would be −0.1, which is smaller than the other coefficients. Consequently,the last two equations are interchanged. This is called pivoting. It is not actuallynecessary in this example because there are no roundoff errors, but it is crucial ingeneral:

10 −7 00 2.5 50 −0.1 6

x1

x2

x3

=

72.56.1

.

Now the second pivot is 2.5 and the second equation can be used to eliminate x2 fromthe third equation. This is accomplished by adding 0.04 times the second equationto the third equation. (What would the multiplier have been if the equations hadnot been interchanged?)

10 −7 00 2.5 50 0 6.2

x1

x2

x3

=

72.56.2

.

The last equation is now6.2x3 = 6.2.

This can be solved to give x3 = 1. This value is substituted into the second equation:

2.5x2 + (5)(1) = 2.5.

Hence x2 = −1. Finally, the values of x2 and x3 are substituted into the firstequation:

10x1 + (−7)(−1) = 7.

Hence x1 = 0. The solution is

x =

0−11

.

This solution can be easily checked using the original equations:

10 −7 0−3 2 65 −1 5

0−11

=

746

.


The entire algorithm can be compactly expressed in matrix notation. For thisexample, let

L =

1 0 00.5 1 0−0.3 −0.04 1

, U =

10 −7 00 2.5 50 0 6.2

, P =

1 0 00 0 10 1 0

.

The matrix L contains the multipliers used during the elimination, the matrix Uis the final coefficient matrix, and the matrix P describes the pivoting. With thesethree matrices, we have

LU = PA.

In other words, the original coefficient matrix can be expressed in terms of productsinvolving matrices with simpler structure.

2.4 Permutation and Triangular MatricesA permutation matrix is an identity matrix with the rows and columns interchanged.It has exactly one 1 in each row and column; all the other elements are 0. Forexample,

P =

0 0 0 11 0 0 00 0 1 00 1 0 0

.

Multiplying a matrix A on the left by a permutation matrix to give PA permutesthe rows of A. Multiplying on the right, AP , permutes the columns of A.

Matlab can also use a permutation vector as a row or column index to rear-range the rows or columns of a matrix. Continuing with the P above, let p be thevector

p = [4 1 3 2]

Then P*A and A(p,:) are equal. The resulting matrix has the fourth row of A asits first row, the first row of A as its second row, and so on. Similarly, A*P andA(:,p) both produce the same permutation of the columns of A. The P*A notationis closer to traditional mathematics, PA, while the A(p,:) notation is faster anduses less memory.

Linear equations involving permutation matrices are trivial to solve. Thesolution to

Px = b

is simply a rearrangement of the components of b:

x = PT b.

An upper triangular matrix has all its nonzero elements above or on the maindiagonal. A unit lower triangular matrix has ones on the main diagonal and all the

2.5. LU Factorization 5

rest of its nonzero elements below the main diagonal. For example,

U =

1 2 3 40 5 6 70 0 8 90 0 0 10

is upper triangular, and

L =

1 0 0 02 1 0 03 5 1 04 6 7 1

is unit lower triangular.Linear equations involving triangular matrices are also easily solved. There are

two variants of the algorithm for solving an n-by-n upper triangular system Ux = b.Both begin by solving the last equation for the last variable, then the next-to-lastequation for the next-to-last variable, and so on. One subtracts multiples of thecolumns of U from b.

x = zeros(n,1);for k = n:-1:1

x(k) = b(k)/U(k,k);i = (1:k-1)’;b(i) = b(i) - x(k)*U(i,k);

end

The other uses inner products between the rows of U and portions of theemerging solution x.

x = zeros(n,1);for k = n:-1:1

j = k+1:n;x(k) = (b(k) - U(k,j)*x(j))/U(k,k);

end

2.5 LU FactorizationThe algorithm that is almost universally used to solve square systems of simultane-ous linear equations is one of the oldest numerical methods, the systematic elimi-nation method, generally named after C. F. Gauss. Research in the period 1955 to1965 revealed the importance of two aspects of Gaussian elimination that were notemphasized in earlier work: the search for pivots and the proper interpretation ofthe effect of rounding errors.

In general, Gaussian elimination has two stages, the forward elimination andthe back substitution. The forward elimination consists of n − 1 steps. At the kthstep, multiples of the kth equation are subtracted from the remaining equationsto eliminate the kth variable. If the coefficient of xk is “small,” it is advisable to


interchange equations before this is done. The elimination steps can be simultane-ously applied to the right-hand side, or the interchanges and multipliers saved andapplied to the right-hand side later. The back substitution consists of solving thelast equation for xn, then the next-to-last equation for xn−1, and so on, until x1 iscomputed from the first equation.

Let Pk, k = 1, . . . , n − 1, denote the permutation matrix obtained by in-terchanging the rows of the identity matrix in the same way the rows of A areinterchanged at the kth step of the elimination. Let Mk denote the unit lower tri-angular matrix obtained by inserting the negatives of the multipliers used at thekth step below the diagonal in the kth column of the identity matrix. Let U be thefinal upper triangular matrix obtained after the n− 1 steps. The entire process canbe described by one matrix equation,

U = Mn−1Pn−1 · · ·M2P2M1P1A.

It turns out that this equation can be rewritten

L1L2 · · ·Ln−1U = Pn−1 · · ·P2P1A,

where Lk is obtained from Mk by permuting and changing the signs of the multi-pliers below the diagonal. So, if we let

L = L1L2 · · ·Ln−1,P = Pn−1 · · ·P2P1,

then we haveLU = PA.

The unit lower triangular matrix L contains all the multipliers used during theelimination and the permutation matrix P accounts for all the interchanges.

For our example

A =

10 −7 0−3 2 65 −1 5

,

the matrices defined during the elimination are

P1 =

1 0 00 1 00 0 1

, M1 =

1 0 00.3 1 0−0.5 0 1

,

P2 =

1 0 00 0 10 1 0

, M2 =

1 0 00 1 00 0.04 1

.

The corresponding L’s are

L1 =

1 0 00.5 1 0−0.3 0 1

, L2 =

1 0 00 1 00 −0.04 1

.

2.6. Why Is Pivoting Necessary? 7

The relation LU = PA is called the LU factorization or the triangular de-composition of A. It should be emphasized that nothing new has been introduced.Computationally, elimination is done by row operations on the coefficient matrix,not by actual matrix multiplication. LU factorization is simply Gaussian elimina-tion expressed in matrix notation.

With this factorization, a general system of equations

Ax = b

becomes a pair of triangular systems

Ly = Pb,Ux = y.

2.6 Why Is Pivoting Necessary?The diagonal elements of U are called pivots. The kth pivot is the coefficient of thekth variable in the kth equation at the kth step of the elimination. In our 3-by-3example, the pivots are 10, 2.5, and 6.2. Both the computation of the multipliers andthe back substitution require divisions by the pivots. Consequently, the algorithmcannot be carried out if any of the pivots are zero. Intuition should tell us that itis a bad idea to complete the computation if any of the pivots are nearly zero. Tosee this, let us change our example slightly to

10 −7 0−3 2.099 65 −1 5

x1

x2

x3

=

73.901

6

.

The (2, 2) element of the matrix has been changed from 2.000 to 2.099, and theright-hand side has also been changed so that the exact answer is still (0,−1, 1)T .Let us assume that the solution is to be computed on a hypothetical machine thatdoes decimal floating-point arithmetic with five significant digits.

The first step of the elimination produces

10 −7 00 −0.001 60 2.5 5

x1

x2

x3

=

76.0012.5

.

The (2, 2) element is now quite small compared with the other elements in the ma-trix. Nevertheless, let us complete the elimination without using any interchanges.The next step requires adding 2.5 · 103 times the second equation to the third:

(5 + (2.5 · 103)(6))x3 = (2.5 + (2.5 · 103)(6.001)).

On the right-hand side, this involves multiplying 6.001 by 2.5 · 103. The result is1.50025 ·104, which cannot be exactly represented in our hypothetical floating-pointnumber system. It must be rounded to 1.5002 · 104. The result is then added to 2.5and rounded again. In other words, both of the 5’s shown in italics in

(5 + 1.5000 · 104)x3 = (2.5 + 1.50025 · 104)


are lost in roundoff errors. On this hypothetical machine, the last equation becomes

1.5005 · 104x3 = 1.5004 · 104.

The back substitution begins with

x3 =1.5004 · 104

1.5005 · 104= 0.99993.

Because the exact answer is x3 = 1, it does not appear that the error is too serious.Unfortunately, x2 must be determined from the equation

−0.001x2 + (6)(0.99993) = 6.001,

which gives

x2 =1.5 · 10−3

−1.0 · 10−3= −1.5.

Finally, x1 is determined from the first equation,

10x1 + (−7)(−1.5) = 7,

which givesx1 = −0.35.

Instead of (0,−1, 1)T , we have obtained (−0.35,−1.5, 0.99993)T .Where did things go wrong? There was no “accumulation of rounding error”

caused by doing thousands of arithmetic operations. The matrix is not close tosingular. The difficulty comes from choosing a small pivot at the second step of theelimination. As a result, the multiplier is 2.5 · 103, and the final equation involvescoefficients that are 103 times as large as those in the original problem. Roundofferrors that are small if compared to these large coefficients are unacceptable interms of the original matrix and the actual solution.

We leave it to the reader to verify that if the second and third equations areinterchanged, then no large multipliers are necessary and the final result is accurate.This turns out to be true in general: If the multipliers are all less than or equalto one in magnitude, then the computed solution can be proved to be satisfactory.Keeping the multipliers less than one in absolute value can be ensured by a processknown as partial pivoting. At the kth step of the forward elimination, the pivot istaken to be the largest (in absolute value) element in the unreduced part of the kthcolumn. The row containing this pivot is interchanged with the kth row to bringthe pivot element into the (k, k) position. The same interchanges must be donewith the elements of the right-hand side b. The unknowns in x are not reorderedbecause the columns of A are not interchanged.

2.7 lutx, bslashtx, luguiWe have three functions implementing the algorithms discussed in this chapter.The first function, lutx, is a readable version of the built-in Matlab function lu.

2.7. lutx, bslashtx, lugui 9

There is one outer for loop on k that counts the elimination steps. The inner loopson i and j are implemented with vector and matrix operations, so that the overallfunction is reasonably efficient.

function [L,U,p] = lutx(A)%LU Triangular factorization% [L,U,p] = lutx(A) produces a unit lower triangular% matrix L, an upper triangular matrix U, and a% permutation vector p, so that L*U = A(p,:).

[n,n] = size(A);p = (1:n)’

for k = 1:n-1

% Find largest element below diagonal in k-th column[r,m] = max(abs(A(k:n,k)));m = m+k-1;

% Skip elimination if column is zeroif (A(m,k) ~= 0)

% Swap pivot rowif (m ~= k)

A([k m],:) = A([m k],:);p([k m]) = p([m k]);

end

% Compute multipliersi = k+1:n;A(i,k) = A(i,k)/A(k,k);

% Update the remainder of the matrixj = k+1:n;A(i,j) = A(i,j) - A(i,k)*A(k,j);

endend

% Separate resultL = tril(A,-1) + eye(n,n);U = triu(A);

Study this function carefully. Almost all the execution time is spent in thestatement

A(i,j) = A(i,j) - A(i,k)*A(k,j);


At the kth step of the elimination, i and j are index vectors of length n-k. Theoperation A(i,k)*A(k,j) multiplies a column vector by a row vector to producea square, rank one matrix of order n-k. This matrix is then subtracted from thesubmatrix of the same size in the bottom right corner of A. In a programminglanguage without vector and matrix operations, this update of a portion of A wouldbe done with doubly nested loops on i and j.

The second function, bslashtx, is a simplified version of the built-in Matlabbackslash operator. It begins by checking for three important special cases: lowertriangular, upper triangular, and symmetric positive definite. Linear systems withthese properties can be solved in less time than a general system.

function x = bslashtx(A,b)% BSLASHTX Solve linear system (backslash)% x = bslashtx(A,b) solves A*x = b

[n,n] = size(A);if isequal(triu(A,1),zeros(n,n))

% Lower triangularx = forward(A,b);return

elseif isequal(tril(A,-1),zeros(n,n))% Upper triangularx = backsubs(A,b);return

elseif isequal(A,A’)[R,fail] = chol(A);if ~fail

% Positive definitey = forward(R’,b);x = backsubs(R,y);return

endend

If none of the special cases is detected, bslashtx calls lutx to permute and fac-tor the coefficient matrix, then uses the permutation and factors to complete thesolution of a linear system.

% Triangular factorization[L,U,p] = lutx(A);

% Permutation and forward eliminationy = forward(L,b(p));

% Back substitutionx = backsubs(U,y);

2.8. Effect of Roundoff Errors 11

The bslashtx function employs subfunctions to carry out the solution of lower andupper triangular systems.

function x = forward(L,x)% FORWARD. Forward elimination.% For lower triangular L, x = forward(L,b) solves L*x = b.[n,n] = size(L);for k = 1:n

j = 1:k-1;x(k) = (x(k) - L(k,j)*x(j))/L(k,k);

end

function x = backsubs(U,x)% BACKSUBS. Back substitution.% For upper triangular U, x = backsubs(U,b) solves U*x = b.[n,n] = size(U);for k = n:-1:1

j = k+1:n;x(k) = (x(k) - U(k,j)*x(j))/U(k,k);

end

A third function, lugui, shows the steps in LU decomposition by Gaussianelimination. It is a version of lutx that allows you to experiment with various pivotselection strategies. At the kth step of the elimination, the largest element in theunreduced portion of the kth column is shown in magenta. This is the element thatpartial pivoting would ordinarily select as the pivot. You can then choose amongfour different pivoting strategies:

• Pick a pivot. Use the mouse to pick the magenta element, or any otherelement, as pivot.

• Diagonal pivoting. Use the diagonal element as the pivot.

• Partial pivoting. Same strategy as lu and lutx.

• Complete pivoting. Use the largest element in the unfactored submatrix asthe pivot.

The chosen pivot is shown in red and the resulting elimination step is taken. As theprocess proceeds, the emerging columns of L are shown in green and the emergingrows of U in blue.

2.8 Effect of Roundoff ErrorsThe rounding errors introduced during the solution of a linear system of equationsalmost always cause the computed solution—which we now denote by x∗—to differsomewhat from the theoretical solution x = A−1b. In fact, if the elements of x


are not floating-point numbers, then x∗ cannot equal x. There are two commonmeasures of the discrepancy in x∗: the error,

e = x− x∗,

and the residual,r = b−Ax∗.

Matrix theory tells us that, because A is nonsingular, if one of these is zero, theother must also be zero. But they are not necessarily both “small” at the sametime. Consider the following example:

(0.780 0.5630.913 0.659

) (x1

x2

)=

(0.2170.254

).

What happens if we carry out Gaussian elimination with partial pivoting on ahypothetical three-digit decimal computer? First, the two rows (equations) areinterchanged so that 0.913 becomes the pivot. Then the multiplier

0.7800.913

= 0.854 (to three places)

is computed. Next, 0.854 times the new first row is subtracted from the new secondrow to produce the system

(0.913 0.659

0 0.001

) (x1

x2

)=

(0.2540.001

).

Finally, the back substitution is carried out:

x2 =0.0010.001

= 1.00 (exactly),

x1 =0.254− 0.659x2

0.913= −0.443 (to three places).

Thus the computed solution is

x∗ =(−0.443

1.000

).

To assess the accuracy without knowing the exact answer, we compute the residuals(exactly):

r = b−Ax∗ =(

0.217− ((0.780)(−0.443) + (0.563)(1.00))0.254− ((0.913)(−0.443) + (0.659)(1.00))

)

=(−0.000460−0.000541

).

2.8. Effect of Roundoff Errors 13

The residuals are less than 10−3. We could hardly expect better on a three-digitmachine. However, it is easy to see that the exact solution to this system is

x =(

1.000−1.000

).

So the components of our computed solution actually have the wrong signs; theerror is larger than the solution itself.

Were the small residuals just a lucky fluke? You should realize that thisexample is highly contrived. The matrix is very close to being singular and is nottypical of most problems encountered in practice. Nevertheless, let us track downthe reason for the small residuals.

If Gaussian elimination with partial pivoting is carried out for this example ona computer with six or more digits, the forward elimination will produce a systemsomething like

(0.913000 0.659000

0 −0.000001

)(x1

x2

)=

(0.2540000.000001

).

Notice that the sign of U2,2 differs from that obtained with three-digit computation.Now the back substitution produces

x2 =0.000001−0.000001

= −1.00000,

x1 =0.254− 0.659x2

0.913= 1.00000,

the exact answer. On our three-digit machine, x2 was computed by dividing twoquantities, both of which were on the order of rounding errors and one of which didnot even have the correct sign. Hence x2 can turn out to be almost anything. Thenthis arbitrary value of x2 was substituted into the first equation to obtain x1.

We can reasonably expect the residual from the first equation to be small—x1 was computed in such a way as to make this certain. Now comes a subtle butcrucial point. We can also expect the residual from the second equation to be small,precisely because the matrix is so close to being singular. The two equations are verynearly multiples of one another, so any pair (x1, x2) that nearly satisfies the firstequation will also nearly satisfy the second. If the matrix were known to be exactlysingular, we would not need the second equation at all—any solution of the firstwould automatically satisfy the second.

In Figure 2.1, the exact solution is marked with a circle and the computedsolution with an asterisk. Even though the computed solution is far from the exactintersection, it is close to both lines because they are nearly parallel.

Although this example is contrived and atypical, the conclusion we reachedis not. It is probably the single most important fact that we have learned aboutmatrix computation since the invention of the digital computer:

Gaussian elimination with partial pivoting is guaranteed to produce smallresiduals.


−1.5 −1 −0.5 0 0.5 1 1.5 2

−1.5

−1

−0.5

0

0.5

1

1.5

Figure 2.1. The computed solution, marked by an asterisk, shows a largeerror, but a small residual.

Now that we have stated it so strongly, we must make a couple of qualifyingremarks. By “guaranteed” we mean it is possible to prove a precise theorem thatassumes certain technical details about how the floating-point arithmetic systemworks and that establishes certain inequalities that the components of the residualmust satisfy. If the arithmetic units work some other way or if there is a bug inthe particular program, then the “guarantee” is void. Furthermore, by “small” wemean on the order of roundoff error relative to three quantities: the size of theelements of the original coefficient matrix, the size of the elements of the coefficientmatrix at intermediate steps of the elimination process, and the size of the elementsof the computed solution. If any of these are “large,” then the residual will notnecessarily be small in an absolute sense. Finally, even if the residual is small, wehave made no claims that the error will be small. The relationship between the sizeof the residual and the size of the error is determined in part by a quantity knownas the condition number of the matrix, which is the subject of the next section.

2.9 Norms and Condition NumbersThe coefficients in the matrix and right-hand side of a system of simultaneous linearequations are rarely known exactly. Some systems arise from experiments, and sothe coefficients are subject to observational errors. Other systems have coefficientsgiven by formulas that involve roundoff error in their evaluation. Even if the systemcan be stored exactly in the computer, it is almost inevitabe that roundoff errors willbe introduced during its solution. It can be shown that roundoff errors in Gaussianelimination have the same effect on the answer as errors in the original coefficients.

2.9. Norms and Condition Numbers 15

Consequently, we are led to a fundamental question. If perturbations are madein the coefficients of a system of linear equations, how much is the solution altered?In other words, if Ax = b, how can we measure the sensitivity of x to changes in Aand b?

The answer to this question lies in making the idea of nearly singular precise.If A is a singular matrix, then for some b’s a solution x will not exist, while forothers it will not be unique. So if A is nearly singular, we can expect small changesin A and b to cause very large changes in x. On the other hand, if A is the identitymatrix, then b and x are the same vector. So if A is nearly the identity, smallchanges in A and b should result in correspondingly small changes in x.

At first glance, it might appear that there is some connection between the sizeof the pivots encountered in Gaussian elimination with partial pivoting and nearnessto singularity, because if the arithmetic could be done exactly, all the pivots wouldbe nonzero if and only if the matrix is nonsingular. To some extent, it is also truethat if the pivots are small, then the matrix is close to singular. However, whenroundoff errors are encountered, the converse is no longer true—a matrix might beclose to singular even though none of the pivots are small.

To get a more precise, and reliable, measure of nearness to singularity thanthe size of the pivots, we need to introduce the concept of a norm of a vector. Thisis a single number that measures the general size of the elements of the vector.The family of vector norms known as lp depends on a parameter p in the range1 ≤ p ≤ ∞:

‖x‖p =

(n∑

i=1

|xi|p)1/p

.

We almost always use p = 1, p = 2, or lim p →∞:

‖x‖1 =n∑

i=1

|xi|,

‖x‖2 =

(n∑

i=1

|xi|2)1/2

,

‖x‖∞ = maxi|xi|.

The l1-norm is also known as the Manhattan norm because it corresponds to thedistance traveled on a grid of city streets. The l2-norm is the familiar Euclideandistance. The l∞-norm is also known as the Chebyshev norm.

The particular value of p is often unimportant and we simply use ‖x‖. Allvector norms have the following basic properties associated with the notion of dis-tance:

‖x‖ > 0 if x 6= 0,

‖0‖ = 0,

‖cx‖ = |c|‖x‖ for all scalars c,

‖x + y‖ ≤ ‖x‖+ ‖y‖ (the triangle inequality).


In Matlab, ‖x‖p is computed by norm(x,p), and norm(x) is the same asnorm(x,2). For example,

x = (1:4)/5norm1 = norm(x,1)norm2 = norm(x)norminf = norm(x,inf)

produces

x =0.2000 0.4000 0.6000 0.8000

norm1 =2.0000

norm2 =1.0954

norminf =0.8000

Multiplication of a vector x by a matrix A results in a new vector Ax that canhave a very different norm from x. This change in norm is directly related to thesensitivity we want to measure. The range of the possible change can be expressedby two numbers:

M = max‖Ax‖‖x‖ ,

m = min‖Ax‖‖x‖ .

The max and min are taken over all nonzero vectors x. Note that if A is singular,then m = 0. The ratio M/m is called the condition number of A:

κ(A) =max ‖Ax‖

‖x‖min ‖Ax‖

‖x‖.

The actual numerical value of κ(A) depends on the vector norm being used,but we are usually only interested in order of magnitude estimates of the conditionnumber, so the particular norm is usually not very important.

Consider a system of equations

Ax = b

and a second system obtained by altering the right-hand side:

A(x + δx) = b + δb.


We think of δb as being the error in b and δx as being the resulting error in x,although we need not make any assumptions that the errors are small. BecauseA(δx) = δb, the definitions of M and m immediately lead to

‖b‖ ≤ M‖x‖and

‖δb‖ ≥ m‖δx‖.Consequently, if m 6= 0,

‖δx‖‖x‖ ≤ κ(A)

‖δb‖‖b‖ .

The quantity ‖δb‖/‖b‖ is the relative change in the right-hand side, and the quantity‖δx‖/‖x‖ is the relative error caused by this change. The advantage of using relativechanges is that they are dimensionless, that is, they are not affected by overall scalefactors.

This shows that the condition number is a relative error magnification factor.Changes in the right-hand side can cause changes κ(A) times as large in the solution.It turns out that the same is true of changes in the coefficient matrix itself.

The condition number is also a measure of nearness to singularity. Althoughwe have not yet developed the mathematical tools necessary to make the idea pre-cise, the condition number can be thought of as the reciprocal of the relative distancefrom the matrix to the set of singular matrices. So, if κ(A) is large, A is close tosingular.

Some of the basic properties of the condition number are easily derived.Clearly, M ≥ m, and so

κ(A) ≥ 1.

If P is a permutation matrix, then the components of Px are simply a rearrangementof the components of x. It follows that ‖Px‖ = ‖x‖ for all x, and so

κ(P ) = 1.

In particular, κ(I) = 1. If A is multiplied by a scalar c, then M and m are bothmultiplied by the same scalar, and so

κ(cA) = κ(A).

If D is a diagonal matrix, then

κ(D) =max |dii|min |dii| .

These last two properties are two of the reasons that κ(A) is a better measure ofnearness to singularity than the determinant of A. As an extreme example, considera 100-by-100 diagonal matrix with 0.1 on the diagonal. Then det(A) = 10−100,which is usually regarded as a small number. But κ(A) = 1, and the components ofAx are simply 0.1 times the corresponding components of x. For linear systems ofequations, such a matrix behaves more like the identity than like a singular matrix.


The following example uses the l1-norm:

A =(

4.1 2.89.7 6.6

),

b =(

4.19.7

),

x =(

10

).

Clearly, Ax = b, and‖b‖ = 13.8, ‖x‖ = 1.

If the right-hand side is changed to

b̃ =(

4.119.70

),

the solution becomes

x̃ =(

0.340.97

).

Let δb = b− b̃ and δx = x− x̃. Then

‖δb‖ = 0.01,

‖δx‖ = 1.63.

We have made a fairly small perturbation in b that completely changes x. In fact,the relative changes are

‖δb‖‖b‖ = 0.0007246,

‖δx‖‖x‖ = 1.63.

Because κ(A) is the maximum magnification factor,

κ(A) ≥ 1.630.0007246

= 2249.4.

We have actually chosen the b and δb that give the maximum, and so, for thisexample with the l1-norm,

κ(A) = 2249.4.

It is important to realize that this example is concerned with the exact so-lutions to two slightly different systems of equations and that the method used toobtain the solutions is irrelevant. The example is constructed to have a fairly largecondition number so that the effect of changes in b is quite pronounced, but similarbehavior can be expected in any problem with a large condition number.

The condition number also plays a fundamental role in the analysis of theroundoff errors introduced during the solution by Gaussian elimination. Let us


assume that A and b have elements that are exact floating-point numbers, and letx∗ be the vector of floating-point numbers obtained from a linear equation solversuch as the function we shall present in the next section. We also assume that exactsingularity is not detected and that there are no underflows or overflows. Then itis possible to establish the following inequalities:

‖b−Ax∗‖‖A‖‖x∗‖ ≤ ρε,

‖x− x∗‖‖x∗‖ ≤ ρκ(A)ε.

Here ε is the relative machine precision eps and ρ is defined more carefully later,but it usually has a value no larger than about 10.

The first inequality says that the relative residual can usually be expected tobe about the size of roundoff error, no matter how badly conditioned the matrix is.This was illustrated by the example in the previous section. The second inequalityrequires that A be nonsingular and involves the exact solution x. It follows directlyfrom the first inequality and the definition of κ(A) and says that the relative errorwill also be small if κ(A) is small but might be quite large if the matrix is nearlysingular. In the extreme case where A is singular but the singularity is not detected,the first inequality still holds but the second has no meaning.

To be more precise about the quantity ρ, it is necessary to introduce the ideaof a matrix norm and establish some further inequalities. Readers who are notinterested in such details can skip the remainder of this section. The quantity Mdefined earlier is known as the norm of the matrix. The notation for the matrixnorm is the same as for the vector norm:

‖A‖ = max‖Ax‖‖x‖ .

It is not hard to see that ‖A−1‖ = 1/m, so an equivalent definition of the conditionnumber is

κ(A) = ‖A‖‖A−1‖.Again, the actual numerical values of the matrix norm and condition number

depend on the underlying vector norm. It is easy to compute the matrix normscorresponding to the l1 and l∞ vector norms. In fact, it is not hard to show that

‖A‖1 = maxj

∑

i

|ai,j |,

‖A‖∞ = maxi

∑

j

|ai,j |.

Computing the matrix norm corresponding to the l2 vector norm involves the sin-gular value decomposition (SVD), which is discussed in a later chapter. Matlabcomputes matrix norms with norm(A,p) for p = 1, 2, or inf.

The basic result in the study of roundoff error in Gaussian elimination is dueto J. H. Wilkinson. He proved that the computed solution x∗ exactly satisfies

(A + E)x∗ = b,


where E is a matrix whose elements are about the size of roundoff errors in theelements of A. There are some rare situations where the intermediate matricesobtained during Gaussian elimination have elements that are larger than those ofA, and there is some effect from accumulation of rounding errors in large matrices,but it can be expected that if ρ is defined by

‖E‖‖A‖ = ρε,

then ρ will rarely be bigger than about 10.From this basic result, we can immediately derive inequalities involving the

residual and the error in the computed solution. The residual is given by

b−Ax∗ = Ex∗,

and hence‖b−Ax∗‖ = ‖Ex∗‖ ≤ ‖E‖‖x∗‖.

The residual involves the product Ax∗, so it is appropriate to consider the relativeresidual, which compares the norm of b− Ax to the norms of A and x∗. It followsdirectly from the above inequalities that

‖b−Ax∗‖‖A‖‖x∗‖ ≤ ρε.

If A is nonsingular, the error can be expressed using the inverse of A by

x− x∗ = A−1(b−Ax∗),

and so‖x− x∗‖ ≤ ‖A−1‖‖E‖‖x∗‖.

It is simplest to compare the norm of the error with the norm of the computedsolution. Thus the relative error satisfies

‖x− x∗‖‖x∗‖ ≤ ρ‖A‖‖A−1‖ε.

Hence‖x− x∗‖‖x∗‖ ≤ ρκ(A)ε.

The actual computation of κ(A) requires knowing ‖A−1‖. But computing A−1

requires roughly three times as much work as solving a single linear system. Com-puting the l2 condition number requires the SVD and even more work. Fortunately,the exact value of κ(A) is rarely required. Any reasonably good estimate of it issatisfactory.

Matlab has several functions for computing or estimating condition numbers.

• cond(A) or cond(A,2) computes κ2(A). Uses svd(A). Suitable for smallermatrices where the geometric properties of the l2-norm are important.

2.10. Sparse Matrices and Band Matrices 21

• cond(A,1) computes κ1(A). Uses inv(A). Less work than cond(A,2).

• cond(A,inf) computes κ∞(A). Uses inv(A). Same as cond(A’,1).

• condest(A) estimates κ1(A). Uses lu(A) and a recent algorithm of Highamand Tisseur [9]. Especially suitable for large, sparse matrices.

• rcond(A) estimates 1/κ1(A). Uses lu(A) and an older algorithm developedby the LINPACK and LAPACK projects. Primarily of historical interest.

2.10 Sparse Matrices and Band MatricesSparse matrices and band matrices occur frequently in technical computing. Thesparsity of a matrix is the fraction of its elements that are zero. The Matlabfunction nnz counts the number of nonzeros in a matrix, so the sparsity of A isgiven by

density = nnz(A)/prod(size(A))sparsity = 1 - density

A sparse matrix is a matrix whose sparsity is nearly equal to 1.The bandwidth of a matrix is the maximum distance of the nonzero elements

from the main diagonal.

[i,j] = find(A)bandwidth = max(abs(i-j))

A band matrix is a matrix whose bandwidth is small.As you can see, both sparsity and bandwidth are matters of degree. An n-by-n

diagonal matrix with no zeros on the diagonal has sparsity 1− 1/n and bandwidth0, so it is an extreme example of both a sparse matrix and a band matrix. On theother hand, an n-by-n matrix with no zero elements, such as the one created byrand(n,n), has sparsity equal to zero and bandwidth equal to n− 1, and so is farfrom qualifying for either category.

The Matlab sparse data structure stores the nonzero elements together withinformation about their indices. The sparse data structure also provides efficienthandling of band matrices, so Matlab does not have a separate band matrix storageclass. The statement

S = sparse(A)

converts a matrix to its sparse representation. The statement

A = full(S)

reverses the process. However, most sparse matrices have orders so large that it isimpractical to store the full representation. More frequently, sparse matrices arecreated by

S = sparse(i,j,x,m,n)


This produces a matrix S with

[i,j,x] = find(S)[m,n] = size(S)

Most Matlab matrix operations and functions can be applied to both full andsparse matrices. The dominant factor in determining the execution time and mem-ory requirements for sparse matrix operations is the number of nonzeros, nnz(S),in the various matrices involved.

A matrix with bandwidth equal to 1 is known as a tridiagonal matrix. It isworthwhile to have a specialized function for one particular band matrix operation,the solution of a tridiagonal system of simultaneous linear equations:

b1 c1

a1 b2 c2

a2 b3 c3

. . . . . . . . .an−2 bn−1 cn−1

an−1 bn

x1

x2

x3...

xn−1

xn

=

d1

d2

d3...

dn−1

dn

.

The function tridisolve is included in the NCM directory. The statement

x = tridisolve(a,b,c,d)

solves the tridiagonal system with subdiagonal a, diagonal b, superdiagonal c, andright-hand side d. We have already seen the algorithm that tridisolve uses; itis Gaussian elimination. In many situations involving tridiagonal matrices, thediagonal elements dominate the off-diagonal elements, so pivoting is unnecessary.Furthermore, the right-hand side is processed at the same time as the matrix itself.In this context, Gaussian elimination without pivoting is also known as the Thomasalgorithm.

The body of tridisolve begins by copying the right-hand side to a vectorthat will become the solution.

x = d;n = length(x);

The forward elimination step is a simple for loop.

for j = 1:n-1mu = a(j)/b(j);b(j+1) = b(j+1) - mu*c(j);x(j+1) = x(j+1) - mu*x(j);

end

The mu’s would be the multipliers on the subdiagonal of L if we were saving the LUfactorization. Instead, the right-hand side is processed in the same loop. The backsubstitution step is another simple loop.

2.11. PageRank and Markov Chains 23

x(n) = x(n)/b(n);for j = n-1:-1:1

x(j) = (x(j)-c(j)*x(j+1))/b(j);end

Because tridisolve does not use pivoting, the results might be inaccurate if abs(b)is much smaller than abs(a)+abs(c). More robust, but slower, alternatives thatdo use pivoting include generating a full matrix with diag:

T = diag(a,-1) + diag(b,0) + diag(c,1);x = T\d

or generating a sparse matrix with spdiags:

S = spdiags([a b c],[-1 0 1],n,n);x = S\d

2.11 PageRank and Markov ChainsOne of the reasons why GoogleTM is such an effective search engine is the PageRankTM

algorithm developed by Google’s founders, Larry Page and Sergey Brin, when theywere graduate students at Stanford University. PageRank is determined entirely bythe link structure of the World Wide Web. It is recomputed about once a monthand does not involve the actual content of any Web pages or individual queries.Then, for any particular query, Google finds the pages on the Web that match thatquery and lists those pages in the order of their PageRank.

Imagine surfing the Web, going from page to page by randomly choosing anoutgoing link from one page to get to the next. This can lead to dead ends atpages with no outgoing links, or cycles around cliques of interconnected pages.So, a certain fraction of the time, simply choose a random page from the Web.This theoretical random walk is known as a Markov chain or Markov process. Thelimiting probability that an infinitely dedicated random surfer visits any particularpage is its PageRank. A page has high rank if other pages with high rank link toit.

Let W be the set of Web pages that can be reached by following a chain ofhyperlinks starting at some root page, and let n be the number of pages in W . ForGoogle, the set W actually varies with time, In June 2004, their value of n was over4 billion. Today, Google does not reveal how many pages they reach. Let G be then-by-n connectivity matrix of a portion of the Web, that is, gij = 1 if there is ahyperlink to page i from page j and gij = 0 otherwise. The matrix G can be huge,but it is very sparse. Its jth column shows the links on the jth page. The numberof nonzeros in G is the total number of hyperlinks in W .

Let ri and cj be the row and column sums of G:

ri =∑

j

gij , cj =∑

i

gij .

The quantities rj and cj are the in-degree and out-degree of the jth page. Let p bethe probability that the random walk follows a link. A typical value is p = 0.85.


Then 1− p is the probability that some arbitrary page is chosen and δ = (1− p)/nis the probability that a particular random page is chosen. Let A be the n-by-nmatrix whose elements are

aij ={

pgij/cj + δ : cj 6= 01/n : cj = 0.

Notice that A comes from scaling the connectivity matrix by its column sums. Thejth column is the probability of jumping from the jth page to the other pages onthe Web. If the jth page is a dead end, that is has no out-links, then we assign auniform probability of 1/n to all the elements in its column. Most of the elementsof A are equal to δ, the probability of jumping from one page to another withoutfollowing a link. If n = 4 · 109 and p = 0.85, then δ = 3.75 · 10−11.

The matrix A is the transition probability matrix of the Markov chain. Itselements are all strictly between zero and one and its column sums are all equal toone. An important result in matrix theory known as the Perron–Frobenius theoremapplies to such matrices. It concludes that a nonzero solution of the equation

x = Ax

exists and is unique to within a scaling factor. If this scaling factor is chosen sothat ∑

i

xi = 1,

then x is the state vector of the Markov chain and is Google’s PageRank. Theelements of x are all positive and less than one.

The vector x is the solution to the singular, homogeneous linear system

(I −A)x = 0.

For modest n, an easy way to compute x in Matlab is to start with some approx-imate solution, such as the PageRanks from the previous month, or

x = ones(n,1)/n

Then simply repeat the assignment statement

x = A*x

until successive vectors agree to within a specified tolerance. This is known as thepower method and is about the only possible approach for very large n.

In practice, the matrices G and A are never actually formed. One step of thepower method would be done by one pass over a database of Web pages, updatingweighted reference counts generated by the hyperlinks between pages.

The best way to compute PageRank in Matlab is to take advantage of theparticular structure of the Markov matrix. Here is an approach that preserves thesparsity of G. The transition matrix can be written

A = pGD + ezT


where D is the diagonal matrix formed from the reciprocals of the outdegrees,

djj ={

1/cj : cj 6= 00 : cj = 0,

e is the n-vector of all ones, and z is the vector with components

zj ={

δ : cj 6= 01/n : cj = 0.

The rank-one matrix ezT accounts for the random choices of Web pages that donot follow links. The equation

x = Ax

can be written(I − pGD)x = γe

whereγ = zT x.

We do not know the value of γ because it depends upon the unknown vector x, butwe can temporarily take γ = 1. As long as p is strictly less than one, the coefficientmatrix I − pGD is nonsingular and the equation

(I − pGD)x = e

can be solved for x. Then the resulting x can be rescaled so that∑

i

xi = 1.

Notice that the vector z is not actually involved in this calculation.The following Matlab statements implement this approach

c = sum(G,1);k = find(c~=0);D = sparse(k,k,1./c(k),n,n);e = ones(n,1);I = speye(n,n);x = (I - p*G*D)\e;x = x/sum(x);

The power method can also be implemented in a way that does not actuallyform the Markov matrix and so preserves sparsity. Compute

G = p*G*D;z = ((1-p)*(c~=0) + (c==0))/n;

Start with

x = e/n


Then repeat the statement

x = G*x + e*(z*x)

until x settles down to several decimal places.It is also possible to use an algorithm known as inverse iteration.

A = p*G*D + deltax = (I - A)\ex = x/sum(x)

At first glance, this appears to be a very dangerous idea. Because I − A is the-oretically singular, with exact computation some diagonal element of the uppertriangular factor of I − A should be zero and this computation should fail. Butwith roundoff error, the computed matrix I - A is probably not exactly singular.Even if it is singular, roundoff during Gaussian elimination will most likely pre-vent any exact zero diagonal elements. We know that Gaussian elimination withpartial pivoting always produces a solution with a small residual, relative to thecomputed solution, even if the matrix is badly conditioned. The vector obtainedwith the backslash operation, (I - A)\e, usually has very large components. If itis rescaled by its sum, the residual is scaled by the same factor and becomes verysmall. Consequently, the two vectors x and A*x equal each other to within roundofferror. In this setting, solving the singular system with Gaussian elimination blowsup, but it blows up in exactly the right direction.

alpha

beta

gamma

delta

sigma rho

Figure 2.2. A tiny Web.

Figure 2.2 is the graph for a tiny example, with n = 6 instead of n = 4 · 109.Pages on the Web are identified by strings known as uniform resource locators,or URLs. Most URLs begin with http because they use the hypertext transferprotocol. In Matlab , we can store the URLs as an array of strings in a cell array.This example involves a 6-by-1 cell array.


U = {’http://www.alpha.com’’http://www.beta.com’’http://www.gamma.com’’http://www.delta.com’’http://www.rho.com’’http://www.sigma.com’}

Two different kinds of indexing into cell arrays are possible. Parentheses denotesubarrays, including individual cells, and curly braces denote the contents of thecells. If k is a scalar, then U(k) is a 1-by-1 cell array consisting of the kth cell in U,while U{k} is the string in that cell. Thus U(1) is a single cell and U{1} is the string’http://www.alpha.com’. Think of mail boxes with addresses on a city street.B(502) is the box at number 502, while B{502} is the mail in that box.

We can generate the connectivity matrix by specifying the pairs of indices(i,j) of the nonzero elements. Because there is a link to beta.com from alpha.com,the (2,1) element of G is nonzero. The nine connections are described by

i = [ 2 6 3 4 4 5 6 1 1]j = [ 1 1 2 2 3 3 3 4 6]

A sparse matrix is stored in a data structure that requires memory only for thenonzero elements and their indices. This is hardly necessary for a 6-by-6 matrixwith only 27 zero entries, but it becomes crucially important for larger problems.The statements

n = 6G = sparse(i,j,1,n,n);full(G)

generate the sparse representation of an n-by-n matrix with ones in the positionsspecified by the vectors i and j and display its full representation.

0 0 0 1 0 11 0 0 0 0 00 1 0 0 0 00 1 1 0 0 00 0 1 0 0 01 0 1 0 0 0

The statement

c = full(sum(G))

computes the column sums

c =2 2 3 1 0 1

Notice that c(5) = 0 because the 5th page, labeled rho, has no out-links.The statements


x = (I - p*G*D)\ex = x/sum(x)

solve the sparse linear system to produce

x =0.32100.17050.10660.13680.06430.2007

1 2 3 4 5 60

0.05

0.1

0.15

0.2

0.25

0.3

0.35Page Rank

Figure 2.3. Page Rank for the tiny Web

The bar graph of x is shown in figure 2.3. If the URLs are sorted in PageRankorder and listed along with their in- and out-degrees, the result is

page-rank in out url1 0.3210 2 2 http://www.alpha.com6 0.2007 2 1 http://www.sigma.com2 0.1705 1 2 http://www.beta.com4 0.1368 2 1 http://www.delta.com3 0.1066 1 3 http://www.gamma.com5 0.0643 1 0 http://www.rho.com

We see that alpha has a higher PageRank than delta or sigma, even though theyall have the same number of in-links. A random surfer will visit alpha over 32% ofthe time and rho only about 6% of the time.

For this tiny example with p = .85, the smallest element of the Markov tran-sition matrix is δ = .15/6 = .0250.


A =0.0250 0.0250 0.0250 0.8750 0.1667 0.87500.4500 0.0250 0.0250 0.0250 0.1667 0.02500.0250 0.4500 0.0250 0.0250 0.1667 0.02500.0250 0.4500 0.3083 0.0250 0.1667 0.02500.0250 0.0250 0.3083 0.0250 0.1667 0.02500.4500 0.0250 0.3083 0.0250 0.1667 0.0250

Notice that the column sums of A are all equal to one.Our collection of NCM programs includes surfer.m. A statement like

[U,G] = surfer(’http://www.xxx.zzz’,n)

starts at a specified URL and tries to surf the Web until it has visited n pages. Ifsuccessful, it returns an n-by-1 cell array of URLs and an n-by-n sparse connectivitymatrix. The function uses urlread, which was introduced in Matlab 6.5, alongwith underlying Java utilities to access the Web. Surfing the Web automaticallyis a dangerous undertaking and this function must be used with care. Some URLscontain typographical errors and illegal characters. There is a list of URLs toavoid that includes .gif files and Web sites known to cause difficulties. Mostimportantly, surfer can get completely bogged down trying to read a page froma site that appears to be responding, but that never delivers the complete page.When this happens, it may be necessary to have the computer’s operating systemruthlessly terminate Matlab. With these precautions in mind, you can use surferto generate your own PageRank examples.

0 100 200 300 400 500

0

50

100

150

200

250

300

350

400

450

500

nz = 2636

Figure 2.4. Spy plot of the harvard500 graph.


The statement

[U,G] = surfer(’http://www.harvard.edu’,500)

accesses the home page of Harvard University and generates a 500-by-500 test case.The graph generated in August 2003 is available in the NCM directory. The state-ments

load harvard500spy(G)

produce a spy plot (Figure 2.4) that shows the nonzero structure of the connectivitymatrix. The statement

pagerank(U,G)

computes page ranks, produces a bar graph (Figure 2.5) of the ranks, and printsthe most highly ranked URLs in PageRank order.

For the harvard500 data, the dozen most highly ranked pages are

page-rank in out url1 0.0843 195 26 http://www.harvard.edu

10 0.0167 21 18 http://www.hbs.edu42 0.0166 42 0 http://search.harvard.edu:8765/

custom/query.html130 0.0163 24 12 http://www.med.harvard.edu18 0.0139 45 46 http://www.gse.harvard.edu15 0.0131 16 49 http://www.hms.harvard.edu9 0.0114 21 27 http://www.ksg.harvard.edu

17 0.0111 13 6 http://www.hsph.harvard.edu46 0.0100 18 21 http://www.gocrimson.com13 0.0086 9 1 http://www.hsdm.med.harvard.edu260 0.0086 26 1 http://search.harvard.edu:8765/

query.html19 0.0084 23 21 http://www.radcliffe.edu

The URL where the search began, www.harvard.edu, dominates. Like most uni-versities, Harvard is organized into various colleges and institutes, including theKennedy School of Government, the Harvard Medical School, the Harvard Busi-ness School, and the Radcliffe Institute. You can see that the home pages of theseschools have high PageRank. With a different sample, such as the one generatedby Google itself, the ranks would be different.

2.12 Further ReadingFurther reading on matrix computation includes books by Demmel [2], Golub andVan Loan [3], Stewart [4, 5], and Trefethen and Bau [6]. The definitive referenceson Fortran matrix computation software are the LAPACK Users’ Guide and Website [1]. The Matlab sparse matrix data structure and operations are described

Exercises 31

0 50 100 150 200 250 300 350 400 450 5000

0.002

0.004

0.006

0.008

0.01

0.012

0.014

0.016

0.018

0.02Page Rank

Figure 2.5. PageRank of the harvard500 graph.

in [8]. Information available on Web sites about PageRank includes a brief expla-nation at Google [7], a technical report by Page, Brin, and colleagues [11], and acomprehensive survey by Langville and Meyer [10].

Exercises2.1. Alice buys three apples, a dozen bananas, and one cantaloupe for $2.36. Bob

buys a dozen apples and two cantaloupes for $5.26. Carol buys two bananasand three cantaloupes for $2.77. How much do single pieces of each fruitcost? (You might want to set format bank.)

2.2. What Matlab function computes the reduced row echelon form of a ma-trix? What Matlab function generates magic square matrices? What is thereduced row echelon form of the magic square of order six?

2.3. Figure 2.6 depicts a plane truss having 13 members (the numbered lines)connecting 8 joints (the numbered circles). The indicated loads, in tons, areapplied at joints 2, 5, and 6, and we want to determine the resulting force oneach member of the truss.For the truss to be in static equilibrium, there must be no net force, hor-izontally or vertically, at any joint. Thus, we can determine the memberforces by equating the horizontal forces to the left and right at each joint,and similarly equating the vertical forces upward and downward at each joint.For the eight joints, this would give 16 equations, which is more than the 13unknown factors to be determined. For the truss to be statically determi-nate, that is, for there to be a unique solution, we assume that joint 1 isrigidly fixed both horizontally and vertically and that joint 8 is fixed verti-


1 2 5 6 8

3 4 7

1 3 5 7 9 11 12

2 6 10 13

4 8

10 15 20

Figure 2.6. A plane truss.

cally. Resolving the member forces into horizontal and vertical componentsand defining α = 1/

√2, we obtain the following system of equations for the

member forces fi:

Joint 2: f2 = f6,

f3 = 10;Joint 3: αf1 = f4 + αf5,

αf1 + f3 + αf5 = 0;Joint 4: f4 = f8,

f7 = 0;Joint 5: αf5 + f6 = αf9 + f10,

αf5 + f7 + αf9 = 15;Joint 6: f10 = f13,

f11 = 20;Joint 7: f8 + αf9 = αf12,

αf9 + f11 + αf12 = 0;Joint 8: f13 + αf12 = 0.

Solve this system of equations to find the vector f of member forces.2.4. Figure 2.7 is the circuit diagram for a small network of resistors.

There are five nodes, eight resistors, and one constant voltage source. Wewant to compute the voltage drops between the nodes and the currents aroundeach of the loops.Several different linear systems of equations can be formed to describe thiscircuit. Let vk, k = 1, . . . , 4, denote the voltage difference between each ofthe first four nodes and node number 5 and let ik, k = 1, . . . , 4, denote theclockwise current around each of the loops in the diagram. Ohm’s law saysthat the voltage drop across a resistor is the resistance times the current. For

Exercises 33

12

3

4

5

r23

r34

r45

r25

r13

r12

r14

r35

vs

i1

i2

i3

i4

Figure 2.7. A resistor network.

example, the branch between nodes 1 and 2 gives

v1 − v2 = r12(i2 − i1).

Using the conductance, which is the reciprocal of the resistance, gkj = 1/rkj ,Ohm’s law becomes

i2 − i1 = g12(v1 − v2).

The voltage source is included in the equation

v3 − vs = r35i4.

Kirchhoff’s voltage law says that the sum of the voltage differences aroundeach loop is zero. For example, around loop 1,

(v1 − v4) + (v4 − v5) + (v5 − v2) + (v2 − v1) = 0.

Combining the voltage law with Ohm’s law leads to the loop equations forthe currents:

Ri = b.

Here i is the current vector,

i =

i1i2i3i4

,

b is the source voltage vector,

b =

000vs

,


and R is the resistance matrix,

r25 + r12 + r14 + r45 −r12 −r14 −r45

−r12 r23 + r12 + r13 −r13 0−r14 −r13 r14 + r13 + r34 −r34

−r45 0 −r34 r35 + r45 + r34

.

Kirchhoff’s current law says that the sum of the currents at each node is zero.For example, at node 1,

(i1 − i2) + (i2 − i3) + (i3 − i1) = 0.

Combining the current law with the conductance version of Ohm’s law leadsto the nodal equations for the voltages:

Gv = c.

Here v is the voltage vector,

v =

v1

v2

v3

v4

,

c is the source current vector,

c =

00

g35vs

0

,

and G is the conductance matrix,

g12 + g13 + g14 −g12 −g13 −g14

−g12 g12 + g23 + g25 −g23 0−g13 −g23 g13 + g23 + g34 + g35 −g34

−g14 0 −g34 g14 + g34 + g45

.

You can solve the linear system obtained from the loop equations to computethe currents and then use Ohm’s law to recover the voltages. Or you can solvethe linear system obtained from the node equations to compute the voltagesand then use Ohm’s law to recover the currents. Your assignment is to verifythat these two approaches produce the same results for this circuit. You canchoose your own numerical values for the resistances and the voltage source.

2.5. The Cholesky algorithm factors an important class of matrices known aspositive definite matrices. Andre-Louis Cholesky (1875–1918) was a Frenchmilitary officer involved in geodesy and surveying in Crete and North Africajust before World War I. He developed the method now named after him tocompute solutions to the normal equations for some least squares data-fitting

Exercises 35

problems arising in geodesy. His work was posthumously published on hisbehalf in 1924 by a fellow officer, Benoit, in the Bulletin Geodesique.A real symmetric matrix A = AT is positive definite if any of the followingequivalent conditions hold:

• The quadratic formxT Ax

is positive for all nonzero vectors x.

• All determinants formed from symmetric submatrices of any order cen-tered on the diagonal of A are positive.

• All eigenvalues λ(A) are positive.

• There is a real matrix R such that

A = RT R.

These conditions are difficult or expensive to use as the basis for checking ifa particular matrix is positive definite. In Matlab , the best way to checkpositive definiteness is with the chol function. See

help chol

(a) Which of the following families of matrices are positive definite?

M = magic(n)H = hilb(n)P = pascal(n)I = eye(n,n)R = randn(n,n)R = randn(n,n); A = R’ * RR = randn(n,n); A = R’ + RR = randn(n,n); I = eye(n,n); A = R’ + R + n*I

(b) If the matrix R is upper triangular, then equating individual elements inthe equation A = RT R gives

akj =k∑

i=1

rikrij , k ≤ j.

Using these equations in different orders yields different variants of the Choleskyalgorithm for computing the elements of R. What is one such algorithm?

2.6. This example shows that a badly conditioned matrix does not necessarilylead to small pivots in Gaussian elimination. The matrix is the n-by-n uppertriangular matrix A with elements

aij =

−1, i < j,1, i = j,0, i > j.


Show how to generate this matrix in Matlab with eye, ones, and triu.Show that

κ1(A) = n2n−1.

For what n does κ1(A) exceed 1/eps?This matrix is not singular, so Ax cannot be zero unless x is zero. However,there are vectors x for which ‖Ax‖ is much smaller than ‖x‖. Find one suchx.Because this matrix is already upper triangular, Gaussian elimination withpartial pivoting has no work to do. What are the pivots?Use lugui to design a pivot strategy that will produce smaller pivots thanpartial pivoting. (Even these pivots do not completely reveal the large con-dition number.)

2.7. The matrix factorizationLU = PA

can be used to compute the determinant of A. We have

det(L)det(U) = det(P )det(A).

Because L is triangular with ones on the diagonal, det(L) = 1. Because U istriangular, det(U) = u11u22 · · ·unn. Because P is a permutation, det(P ) =+1 if the number of interchanges is even and −1 if it is odd. So

det(A) = ±u11u22 · · ·unn.

Modify the lutx function so that it returns four outputs.

function [L,U,p,sig] = lutx(A)%LU Triangular factorization% [L,U,p,sig] = lutx(A) computes a unit lower triangular% matrix L, an upper triangular matrix U, a permutation% vector p, and a scalar sig, so that L*U = A(p,:) and% sig = +1 or -1 if p is an even or odd permutation.

Write a function mydet(A) that uses your modified lutx to compute thedeterminant of A. In Matlab, the product u11u22 · · ·unn can be computedby the expression prod(diag(U)).

2.8. Modify the lutx function so that it uses explicit for loops instead of Matlabvector notation. For example, one section of your modified program will read

% Compute the multipliersfor i = k+1:n

A(i,k) = A(i,k)/A(k,k);end

Compare the execution time of your modified lutx program with the origi-nal lutx program and with the built-in lu function by finding the order ofthe matrix for which each of the three programs takes about 10 s on yourcomputer.

Exercises 37

2.9. Let

A =

1 2 34 5 67 8 9

, b =

135

.

(a) Show that the set of linear equations Ax = b has infinitely many solutions.Describe the set of possible solutions.(b) Suppose Gaussian elimination is used to solve Ax = b using exact arith-metic. Because there are infinitely many solutions, it is unreasonable toexpect one particular solution to be computed. What does happen?(c) Use bslashtx to solve Ax = b on an actual computer with floating-pointarithmetic. What solution is obtained? Why? In what sense is it a “good”solution? In what sense is it a “bad” solution?(d) Explain why the built-in backslash operator x = A\b gives a differentsolution from x = bslashtx(A,b).

2.10. Section 2.4 gives two algorithms for solving triangular systems. One subtractscolumns of the triangular matrix from the right-hand side; the other usesinner products between the rows of the triangular matrix and the emergingsolution.(a) Which of these two algorithms does bslashtx use?(b) Write another function, bslashtx2, that uses the other algorithm.

2.11. The inverse of a matrix A can be defined as the matrix X whose columns xj

solve the equationsAxj = ej ,

where ej is the jth column of the identity matrix.(a) Starting with the function bslashtx, write a Matlab function

X = myinv(A)

that computes the inverse of A. Your function should call lutx only once andshould not use the built-in Matlab backslash operator or inv function.(b) Test your function by comparing the inverses it computes with the inversesobtained from the built-in inv(A) on a few test matrices.

2.12. If the built-in Matlab lu function is called with only two output arguments

[L,U] = lu(A)

the permutations are incorporated into the output matrix L. The help entryfor lu describes L as “psychologically lower triangular.” Modify lutx so thatit does the same thing. You can use

if nargout == 2, ...

to test the number of output arguments.2.13. (a) Why is

M = magic(8)lugui(M)


an interesting example?(b) How is the behavior of lugui(M) related to rank(M)?(c) Can you pick a sequence of pivots so that no roundoff error occurs inlugui(M)?

2.14. The pivot selection strategy known as complete pivoting is one of the optionsavailable in lugui. It has some slight numerical advantages over partialpivoting. At each stage of the elimination, the element of largest magnitudein the entire unreduced matrix is selected as the pivot. This involves bothrow and column interchanges and produces two permutation vectors p and qso that

L*U = A(p,q)

Modify lutx and bslashtx so that they use complete pivoting.2.15. The function golub in the NCM directory is named after Stanford professor

Gene Golub. The function generates test matrices with random integer en-tries. The matrices are very badly conditioned, but Gaussian eliminationwithout pivoting fails to produce the small pivots that would reveal the largecondition number.(a) How does condest(golub(n)) grow with increasing order n? Becausethese are random matrices you can’t be very precise here, but you can givesome qualitative description.(b) What atypical behavior do you observe with the diagonal pivoting optionin lugui(golub(n))?(c) What is det(golub(n))? Why?

2.16. The function pascal generates symmetric test matrices based on Pascal’striangle.(a) How are the elements of pascal(n+1) related to the binomial coefficientsgenerated by nchoosek(n,k)?(b) How is chol(pascal(n)) related to pascal(n)?(c) How does condest(pascal(n)) grow with increasing order n?(d) What is det(pascal(n))? Why?(e) Let Q be the matrix generated by

Q = pascal(n);Q(n,n) = Q(n,n) - 1;

How is chol(Q) related to chol(pascal(n))? Why?(f) What is det(Q)? Why?

2.17. Play “Pivot Pickin’ Golf” with pivotgolf. The goal is to use lugui tocompute the LU decompositions of nine matrices with as little roundoff erroras possible. The score for each hole is

‖R‖∞ + ‖Lε‖∞ + ‖Uε‖∞,

where R = LU−PAQ is the residual and ‖Lε‖∞ and ‖Uε‖∞ are the nonzerosthat should be zero in L and U .

Exercises 39

(a) Can you beat the scores obtained by partial pivoting on any of the courses?(b) Can you get a perfect score of zero on any of the courses?

2.18. The object of this exercise is to investigate how the condition numbers ofrandom matrices grow with their order. Let Rn denote an n-by-n matrix withnormally distributed random elements. You should observe experimentallythat there is an exponent p so that

κ1(Rn) = O(np).

In other words, there are constants c1 and c2 so that most values of κ1(Rn)satisfy

c1np ≤ κ1(Rn) ≤ c2n

p.

Your job is to find p, c1, and c2.The NCM M-file randncond.m is the starting point for your experiments.This program generates random matrices with normally distributed elementsand plots their l1 condition numbers versus their order on a loglog scale.The program also plots two lines that are intended to enclose most of theobservations. (On a loglog scale, power laws like κ = cnp produce straightlines.)(a) Modify randncond.m so that the two lines have the same slope and enclosemost of the observations.(b) Based on this experiment, what is your guess for the exponent p inκ(Rn) = O(np)? How confident are you?(c) The program uses (’erasemode’,’none’), so you cannot print the re-sults. What would you have to change to make printing possible?

2.19. For n = 100, solve this tridiagonal system of equations three different ways:

2x1 − x2 =1,

−xj−1 + 2xj − xj+1 =j, j = 2, . . . , n− 1,

−xn−1 + 2xn =n.

(a) Use diag three times to form the coefficient matrix and then use lutxand bslashtx to solve the system.(b) Use spdiags once to form a sparse representation of the coefficient matrixand then use the backslash operator to solve the system.(c) Use tridisolve to solve the system.(d) Use condest to estimate the condition of the coefficient matrix.

2.20. Use surfer and pagerank to compute PageRanks for some subset of the Webthat you choose. Do you see any interesting structure in the results?

2.21. Suppose that U and G are the URL cell array and the connectivity matrixproduced by surfer and that k is an integer. Explain what

U{k}, U(k), G(k,:), G(:,k), U(G(k,:)), U(G(:,k))

are.


2.22. The connectivity matrix for the harvard500 data set has four small, almostentirely nonzero, submatrices that produce dense patches near the diagonalof the spy plot. You can use the zoom button to find their indices. Thefirst submatrix has indices around 170 and the other three have indices inthe 200s and 300s. Mathematically, a graph with every node connected toevery other node is known as a clique. Identify the organizations within theHarvard community that are responsible for these near cliques.

2.23. A Web connectivity matrix G has gij = 1 if it is possible to get to page ifrom page j with one click. If you multiply the matrix by itself, the entriesof the matrix G2 count the number of different paths of length two to page ifrom page j. The matrix power Gp shows the number of paths of length p.(a) For the harvard500 data set, find the power p where the number ofnonzeros stops increasing. In other words, for any q greater than p, nnz(G^q)is equal to nnz(G^p).(b) What fraction of the entries in Gp are nonzero?(c) Use subplot and spy to show the nonzeros in the successive powers.(d) Is there a set of interconnected pages that do not link to the other pages?

2.24. The function surfer uses a subfunction, hashfun, to speed up the search for apossibly new URL in the list of URLs that have already been processed. Findtwo different URLs on The MathWorks home page http://www.mathworks.comthat have the same hashfun value.

2.25. Figure 2.8 is the graph of another six-node subset of the Web. In this example,there are two disjoint subgraphs.

alpha

beta

gamma

delta

sigma rho

Figure 2.8. Another tiny Web.

(a) What is the connectivity matrix G?(b) What are the PageRanks if the hyperlink transition probability p is thedefault value 0.85?(c) Describe what happens with this example to both the definition of PageR-ank and the computation done by pagerank in the limit p → 1.

Exercises 41

2.26. The function pagerank(U,G) computes PageRanks by solving a sparse linearsystem. It then plots a bar graph and prints the dominant URLs.(a) Create pagerank1(G) by modifying pagerank so that it just computesthe PageRanks, but does not do any plotting or printing.(b) Create pagerank2(G) by modifying pagerank1 to use inverse iterationinstead of solving the sparse linear system. The key statements are

x = (I - A)\ex = x/sum(x)

What should be done in the unlikely event that the backslash operation in-volves a division by zero?(c) Create pagerank3(G) by modifying pagerank1 to use the power methodinstead of solving the sparse linear system. The key statements are

G = p*G*Dz = ((1-p)*(c~=0) + (c==0))/n;while termination_test

x = G*x + e*(z*x)end

What is an appropriate test for terminating the power iteration?(d) Use your functions to compute the PageRanks of the six-node examplediscussed in the text. Make sure you get the correct result from each of yourthree functions.

2.27. Here is yet another function for computing PageRank. This version usesthe power method, but does not do any matrix operations. Only the linkstructure of the connectivity matrix is involved.

function [x,cnt] = pagerankpow(G)% PAGERANKPOW PageRank by power method.% x = pagerankpow(G) is the PageRank of the graph G.% [x,cnt] = pagerankpow(G)% counts the number of iterations.

% Link structure

[n,n] = size(G);for j = 1:n

L{j} = find(G(:,j));c(j) = length(L{j});

end

% Power method

p = .85;delta = (1-p)/n;x = ones(n,1)/n;


z = zeros(n,1);cnt = 0;while max(abs(x-z)) > .0001

z = x;x = zeros(n,1);for j = 1:n

if c(j) == 0x = x + z(j)/n;

elsex(L{j}) = x(L{j}) + z(j)/c(j);

endendx = p*x + delta;cnt = cnt+1;

end

(a) How do the storage requirements and execution time of this functioncompare with the three pagerank functions from the previous exercise?(b) Use this function as a template to write a function that computes PageR-ank in some other programming language.

Bibliography

[1] E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J.Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney,and D. Sorensen, LAPACK Users’ Guide, Third Edition, SIAM, Philadelphia,1999.http://www.netlib.org/lapack

[2] J. W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.

[3] G. H. Golub and C. F. Van Loan, Matrix Computations, Third Edition,The Johns Hopkins University Press, Baltimore, 1996.

[4] G. W. Stewart, Introduction to Matrix Computations, Academic Press, NewYork, 1973.

[5] G. W. Stewart, Matrix Algorithms: Basic Decompositions, SIAM, Philadel-phia, 1998.

[6] L. N. Trefethen and D. Bau, III, Numerical Linear Algebra, SIAM,Philadelphia, 1997.

[7] Google, Google Technology.http://www.google.com/technology/index.html

[8] J. R. Gilbert, C. Moler, and R. Schreiber, Sparse matrices in MATLAB:Design and implementation, SIAM Journal on Matrix Analysis and Applica-tions, 13 (1992), pp. 333–356.

[9] N. J. Higham, and F. Tisseur, A block algorithm for matrix 1-norm esti-mation, with an application to 1-norm pseudospectra, SIAM Journal on MatrixAnalysis and Applications, 21 (2000), pp. 1185–1201.

[10] A. Langville, and C. Meyer, Deeper Inside PageRank,http://meyer.math.ncsu.edu/Meyer/PS_Files/DeeperInsidePR.pdf

[11] L. Page, S. Brin, R. Motwani, and T. Winograd, The PageRank Cita-tion Ranking: Bringing Order to the Web.http://dbpubs.stanford.edu:8090/pub/1999-66

43

1

SSoolluuttiioonn ooff nnoonn--lliinneeaarr eeqquuaattiioonnss By Gilberto E. Urroz, September 2004

In this document I present methods for the solution of single non-linear equations as well as for systems of such equations.

SSoolluuttiioonn ooff aa ssiinnggllee nnoonn--lliinneeaarr eeqquuaattiioonn Equations that can be cast in the form of a polynomial are referred to as algebraic equations. Equations involving more complicated terms, such as trigonometric, hyperbolic, exponential, or logarithmic functions are referred to as transcendental equations. The methods presented in this section are numerical methods that can be applied to the solution of such equations, to which we will refer, in general, as non-linear equations. In general, we will we searching for one, or more, solutions to the equation,

f(x) = 0.

We will present the Newton-Raphson algorithm, and the secant method. In the secant method we need to provide two initial values of x to get the algorithm started. In the Newton-Raphson methods only one initial value is required. Because the solution is not exact, the algorithms for any of the methods presented herein will not provide the exact solution to the equation f(x) = 0, instead, we will stop the algorithm when the equation is satisfied within an allowed tolerance or error, ε. In mathematical terms this is expressed as

|f(xR)| < ε. The value of x for which the non-linear equation f(x)=0 is satisfied, i.e., x = xR, will be the solution, or root, to the equation within an error of ε units. The Newton-Raphson method Consider the Taylor-series expansion of the function f(x) about a value x = xo:

f(x)= f(xo)+f'(xo)(x-xo)+(f"(xo)/2!)(x-xo)2+…. Using only the first two terms of the expansion, a first approximation to the root of the equation

f(x) = 0

can be obtained from

f(x) = 0 ≈ f(xo)+f'(xo)(x1 -xo)

2

Such approximation is given by,

x1 = xo - f(xo)/f'(xo). The Newton-Raphson method consists in obtaining improved values of the approximate root through the recurrent application of equation above. For example, the second and third approximations to that root will be given by

x2 = x1 - f(x1)/f'(x1), and

x3= x2 - f(x2)/f'(x2), respectively. This iterative procedure can be generalized by writing the following equation, where i represents the iteration number:

xi+1 = xi - f(xi)/f'(xi). After each iteration the program should check to see if the convergence condition, namely,

|f(x i+1)|<ε, is satisfied. The figure below illustrates the way in which the solution is found by using the Newton-Raphson method. Notice that the equation f(x) = 0 ≈ f(xo)+f'(xo)(x1 -xo) represents a straight line tangent to the curve y = f(x) at x = xo. This line intersects the x-axis (i.e., y = f(x) = 0) at the point x1 as given by x1 = xo - f(xo)/f'(xo). At that point we can construct another straight line tangent to y = f(x) whose intersection with the x-axis is the new approximation to the root of f(x) = 0, namely, x = x2. Proceeding with the iteration we can see that the intersection of consecutive tangent lines with the x-axis approaches the actual root relatively fast.

3

The Newton-Raphson method converges relatively fast for most functions regardless of the initial value chosen. The main disadvantage is that you need to know not only the function f(x), but also its derivative, f'(x), in order to achieve a solution. The secant method, discussed in the following section, utilizes an approximation to the derivative, thus obviating such requirement.

The programming algorithm of any of these methods must include the option of stopping the program if the number of iterations grows too large. How large is large? That will depend of the particular problem solved. However, any Newton-Raphson, or secant method solution that takes more than 1000 iterations to converge is either ill-posed or contains a logical error. Debugging of the program will be called for at this point by changing the initial values provided to the program, or by checking the program's logic. AA MMAATTLLAABB ffuunnccttiioonn ffoorr tthhee NNeewwttoonn--RRaapphhssoonn mmeetthhoodd

The function newton, listed below, implements the Newton-Raphson algorithm. It uses as arguments an initial value and expressions for f(x) and f'(x). function [x,iter]=newton(x0,f,fp) % newton-raphson algorithm N = 100; eps = 1.e-5; % define max. no. iterations and error maxval = 10000.0; % define value for divergence xx = x0; while (N>0) xn = xx-f(xx)/fp(xx); if abs(f(xn))<eps x=xn;iter=100-N; return; end; if abs(f(xx))>maxval disp(['iterations = ',num2str(iter)]); error('Solution diverges'); break; end; N = N - 1; xx = xn; end; error('No convergence'); break; % end function

We will use the Newton-Raphson method to solve for the equation, f(x) = x3-2x2+1 = 0. The following MATLAB commands define the function f001(x) and its derivative, f01p(x): »f001 = inline('x.^3-2*x.^2+1','x') f001 = Inline function: f001(x) = x.^3-2*x.^2+1

4

» f01p = inline('3*x.^2-2','x') f01p = Inline function: f01p(x) = 3*x.^2-2 To have an idea of the location of the roots of this polynomial we'll plot the function using the following MATLAB commands: » x = [-0.8:0.01:2.0]';y=f001(x); » plot(x,y);xlabel('x');ylabel('f001(x)'); » grid on

-1 -0.5 0 0.5 1 1.5 2-1

-0.5

0

0.5

1

x

f001

(x)

We see that the function graph crosses the x-axis somewhere between –1.0 and –0.5, close to 1.0, and between 1.5 and 2.0. To activate the function we could use, for example, an initial value x0 = 2: » [x,iterations] = newton(2,f001,f01p) x = 1.6180 iterations = 39 The following command are aimed at obtaining vectors of the solutions provided by function newton.m for f001(x)=0 for initial values in the vector x0 such that –20 < x0 < 20. The solutions found are stored in variable xs while the required number of iterations is stored in variable is. » x0 = [-20:0.5:20]; xs = []; is = []; EDU» for i = 1:length(x0) [xx,ii] = newton(x0(i),f001,f01p); xs = [xs,xx]; is = [is,ii]; end Plot of xs vs. x0, and of is vs. x0 are shown next:

5

» figure(1);plot(x0,xs,'o');hold;plot(x0,xs,'-');hold; » title('Newton-Raphson solution'); » xlabel('initial guess, x0');ylabel('solution, xs');

-20 -15 -10 -5 0 5 10 15 201

1.1

1.2

1.3

1.4

1.5

1.6

1.7

1.8Newton-Raphson solution

initial guess, x0

solu

tion,

xs

This figure shows that for initial guesses in the range –20 < xo<20, function newton.m converges mainly to the solution x = 1.6180, with few instances converting to the solutions x = -1 and x = 1. » figure(2);plot(x0,is,'+'); hold;plot(x0,is,'-');hold; » title('Newton-Raphson solution'); » xlabel('initial guess, x0');ylabel('iterations, is');

-20 -15 -10 -5 0 5 10 15 200

10

20

30

40

50

60

70Newton-Raphson solution

initial guess, x0

itera

tions

, is

This figure shows that the number of iterations required to achieve a solution ranges from 0 to about 65. Most of the time, about 50 iterations are required. The following example shows a case in which the solution actually diverges:

6

» [x,iter] = newton(-20.75,f001,f01p) iterations = 6 ??? Error using ==> newton Solution diverges NOTE: If the function of interest is defined by an m-file, the reference to the function name in the call to newton.m should be placed between quotes. The Secant Method In the secant method, we replace the derivative f'(xi) in the Newton-Raphson method with

f'(xi) ≈ (f(xi) - f(xi-1))/(xi - xi-1).

With this replacement, the Newton-Raphson algorithm becomes

To get the method started we need two values of x, say xo and x1, to get the first approximation to the solution, namely,

As with the Newton-Raphson method, the iteration is stopped when

|f(x i+1)|<ε.

Figure 4, below, illustrates the way that the secant method approximates the solution of the equation f(x) = 0.

).()()(

)(1

11 −

−+ −⋅

−−= ii

ii

iii xx

xfxfxf

xx

).()()(

)(01

1

112 xx

xfxfxfxx

o

−⋅−

−=

7

AA MMAATTLLAABB ffuunnccttiioonn ffoorr tthhee sseeccaanntt mmeetthhoodd

The function secant, listed below, uses the secant method to solve for non-linear equations. It requires two initial values and an expression for the function, f(x). function [x,iter]=secant(x0,x00,f) % newton-raphson algorithm N = 100; eps = 1.e-5; % define max. no. iterations and error maxval = 10000.0; % define value for divergence xx1 = x0; xx2 = x00; while N>0 gp = (f(xx2)-f(xx1))/(xx2-xx1); xn = xx1-f(xx1)/gp; if abs(f(xn))<eps x=xn; iter = 100-N; return; end; if abs(f(xn))>maxval iter=100-N; disp(['iterations = ',num2str(iter)]); error('Solution diverges'); abort; end; N = N - 1; xx1 = xx2; xx2 = xn; end; iter=100-N; disp(['iterations = ',iter]); error('No convergence'); abort; % end function

AApppplliiccaattiioonn ooff sseeccaanntt mmeetthhoodd

We use the same function f001(x) = 0 presented earlier. The following commands call the function secant.txt to obtain a solution to the equation: » [x,iter] = secant(-10.0,-9.8,f001) x = -0.6180 iter = 11

The following command are aimed at obtaining vectors of the solutions provided by function Newton.m for f001(x)=0 for initial values in the vector x0 such that –20 < x0 < 20. The solutions found are stored in variable xs while the required number of iterations is stored in variable is.

8

» x0 = [-20:0.5:20]; x00 = x0 + 0.5; xs = []; is = []; » for i = 1:length(x0) [xx,ii] = secant(x0(i),x00(i),f001); xs = [xs, xx]; is = [is, ii]; end Plot of xs vs. x0, and of is vs. x0 are shown next: » figure(1);plot(x0,xs,'o');hold;plot(x0,xs,'-');hold; » title('Secant method solution'); » xlabel('first initial guess, x0');ylabel('solution, xs');

-20 -15 -10 -5 0 5 10 15 20-1

-0.5

0

0.5

1

1.5

2Secant method solution

first initial guess, x0

solu

tion,

xs

This figure shows that for initial guesses in the range –20 < xo<20, function newton.m converges to the three solutions. Notice that initial guesses in the range –20 < x0 < -1, converge to x = - 0.6180; those in the range –1< x < 1, converge to x = 1; and, those in the range 1<x<20, converge to x =1.6180. » figure(2);plot(x0,is,'o');hold;plot(x0,is,'-');hold; » xlabel('first initial guess, x0');ylabel('iterations, is'); » title('Secant method solution');

-20 -15 -10 -5 0 5 10 15 200

5

10

15

first initial guess, x0

itera

tions

, is

Secant method solution

9

This figure shows that the number of iterations required to achieve a solution ranges from 0 to about 15. Notice also that, the closer the initial guess is to zero, the less number of iterations to convergence are required. NOTE: If the function of interest is defined by an m-file, the reference to the function name in the call to newton.m should be placed between quotes. SSoollvviinngg ssyysstteemmss ooff nnoonn--lliinneeaarr eeqquuaattiioonnss

Consider the solution to a system of n non-linear equations in n unknowns given by

f1(x1,x2,…,xn) = 0 f2(x1,x2,…,xn) = 0

.

.

. fn(x1,x2,…,xn) = 0

The system can be written in a single expression using vectors, i.e.,

f(x) = 0, where the vector x contains the independent variables, and the vector f contains the functions fi(x):

.

)(

)()(

),...,,(

),...,,(),...,,(

)(, 2

1

21

212

211

2

1

=

=

=

x

xx

xfx

nnn

n

n

n f

ff

xxxf

xxxfxxxf

x

xx

MMM

Newton-Raphson method to solve systems of non-linear equations A Newton-Raphson method for solving the system of linear equations requires the evaluation of a matrix, known as the Jacobian of the system, which is defined as:

.][

///

//////

),...,,(),...,,(

21

22212

12111

21

21nn

j

i

nnnn

n

n

n

n

xf

xfxfxf

xfxfxfxfxfxf

xxxfff

×∂∂

=

∂∂∂∂∂∂

∂∂∂∂∂∂∂∂∂∂∂∂

=∂∂

=

L

MOMM

L

L

J

If x = x0 (a vector) represents the first guess for the solution, successive approximations to the solution are obtained from

xn+1 = xn - J-1⋅f(xn) = xn - ∆xn, with ∆xn = xn+1 - xn.

10

A convergence criterion for the solution of a system of non-linear equation could be, for example, that the maximum of the absolute values of the functions fi(xn) is smaller than a certain tolerance ε, i.e.,

.|)(|max ε<niif x

Another possibility for convergence is that the magnitude of the vector f(xn) be smaller than the tolerance, i.e.,

|f(xn)| < ε. We can also use as convergence criteria the difference between consecutive values of the solution, i.e.,

.,|)()(|max 1 ε<−+ niniixx

or, |∆xn | = |xn+1 - xn| < ε.

The main complication with using Newton-Raphson to solve a system of non-linear equations is having to define all the functions ∂fi/∂xj, for i,j = 1,2, …, n, included in the Jacobian. As the number of equations and unknowns, n, increases, so does the number of elements in the Jacobian, n2. MATLAB function for Newton-Raphson method for a system of non-linear equations

The following MATLAB function, newtonm, calculates the solution to a system of n non-linear equations, f(x) = 0, given the vector of functions f and the Jacobian J, as well as an initial guess for the solution x0. function [x,iter] = newtonm(x0,f,J) % Newton-Raphson method applied to a % system of linear equations f(x) = 0, % given the jacobian function J, with % J = del(f1,f2,...,fn)/del(x1,x2,...,xn) % x = [x1;x2;...;xn], f = [f1;f2;...;fn] % x0 is an initial guess of the solution N = 100; % define max. number of iterations epsilon = 1e-10; % define tolerance maxval = 10000.0; % define value for divergence xx = x0; % load initial guess while (N>0) JJ = feval(J,xx); if abs(det(JJ))<epsilon error('newtonm - Jacobian is singular - try new x0'); abort; end; xn = xx - inv(JJ)*feval(f,xx);

11

if abs(feval(f,xn))<epsilon x=xn; iter = 100-N; return; end; if abs(feval(f,xx))>maxval iter = 100-N; disp(['iterations = ',num2str(iter)]); error('Solution diverges'); abort; end; N = N - 1; xx = xn; end; error('No convergence after 100 iterations.'); abort; % end function

The functions f and the Jacobian J need to be defined as separate functions. To illustrate the definition of the functions consider the system of non-linear equations:

f1(x1,x2) = x12

+ x22 - 50 = 0,

f2(x1,x2) = x1 ⋅ x2 - 25 = 0, whose Jacobian is

.22

12

21

2

2

1

2

2

1

1

1

=

∂∂

∂∂

∂∂

∂∂

=xxxx

xf

xf

xf

xf

J

We can define the function f as the following user-defined MATLAB function f2: function [f] = f2(x) % f2(x) = 0, with x = [x(1);x(2)] % represents a system of 2 non-linear equations f1 = x(1)^2 + x(2)^2 - 50; f2 = x(1)*x(2) -25; f = [f1;f2]; % end function The corresponding Jacobian is calculated using the user-defined MATLAB function jacob2x2: function [J] = jacob2x2(x) % Evaluates the Jacobian of a 2x2 % system of non-linear equations J(1,1) = 2*x(1); J(1,2) = 2*x(2); J(2,1) = x(2); J(2,2) = x(1); % end function

12

Illustrating the Newton-Raphson algorithm for a system of two non-linear equations

Before using function newtonm, we will perform some step-by-step calculations to illustrate the algorithm. We start by defining an initial guess for the solution as: » x0 = [2;1] x0 = 2 1

Let’s calculate the function f(x) at x = x0 to see how far we are from a solution: » f2(x0) ans = -45 -23 Obviously, the function f(x0) is far away from being zero. Thus, we proceed to calculate a better approximation by calculating the Jacobian J(x0): » J0 = jacob2x2(x0) J0 = 4 2 1 2 The new approximation to the solution, x1, is calculated as: » x1 = x0 - inv(J0)*f2(x0) x1 = 9.3333 8.8333 Evaluating the functions at x1 produces: f2(x1) ans = 115.1389 Still far away from convergence. Let’s calculate a new approximation, x2: » x2 = x1-inv(jacob2x2(x1))*f2(x1) x2 = 6.0428 5.7928

13

Evaluating the functions at x2 indicates that the values of the functions are decreasing: » f2(x2) ans = 20.0723 10.0049 A new approximation and the corresponding function evaluations are: » x3 = x2 - inv(jacob2x2(x2))*f2(x2) x3 = 5.1337 5.0087 E» f2(x3) ans = 1.4414 0.7129

The functions are getting even smaller suggesting convergence towards a solution. Solution using function newtonm

Next, we use function newtonm to solve the problem postulated earlier. A call to the function using the values of x0, f2, and jacob2x2 is: » [x,iter] = newtonm(x0,'f2','jacob2x2') x = 5.0000 5.0000 iter = 16 The result shows the number of iterations required for convergence (16) and the solution found as x1

= 5.0000 and x2 = 5.000. Evaluating the functions for those solutions results in: » f2(x) ans = 1.0e-010 * 0.2910 -0.1455

14

The values of the functions are close enough to zero (error in the order of 10-11). “Secant” method to solve systems of non-linear equations In this section we present a method for solving systems of non-linear equations through the Newton-Raphson algorithm, namely, xn+1 = xn - J-1⋅f(xn), but approximating the Jacobian through finite differences. This approach is a generalization of the secant method for a single non-linear equation. For that reason, we refer to the method applied to a system of non-linear equations as a “secant” method, although the geometric origin of the term not longer applies. The “secant” method for a system of non-linear equations free us from having to define the n2 functions necessary to define the Jacobian for a system of n equations. Instead, we approximate the partial derivatives in the Jacobian with

,),,,,,(),,,,,( 2121

xxxxxfxxxxxf

xf njinji

j

i

∆

−∆+≈

∂∂ LLLL

where ∆x is a small increment in the independent variables. Notice that ∂fi/∂xj represents element Jij in the jacobian J = ∂(f1,f2,…,fn)/∂(x1,x2,…,xn). To calculate the Jacobian we proceed by columns, i.e., column j of the Jacobian will be calculated as shown in the function jacobFD (jacobian calculated through Finite Differences) listed below: function [J] = jacobFD(f,x,delx) % Calculates the Jacobian of the % system of non-linear equations: % f(x) = 0, through finite differences. % The Jacobian is built by columns [m n] = size(x); for j = 1:m xx = x; xx(j) = x(j) + delx; J(:,j) = (f(xx)-f(x))/delx; end; % end function Notice that for each column (i.e., each value of j) we define a variable xx which is first made equal to x, and then the j-th element is incremented by delx, before calculating the j-th column of the Jacobian, namely, J(:,j). This is the MATLAB implementation of the finite difference approximation for the Jacobian elements Jij = ∂fi/∂xj as defined earlier.

15

Illustrating the “secant” algorithm for a system of two non-linear equations

To illustrate the application of the “secant” algorithm we use again the system of two non-linear equations defined earlier through the function f2. We choose an initial guess for the solution as x0 = [2;3], and an increment in the independent variables of ∆x = 0.1: x0 = [2;3] x0 = 2 3 EDU» dx = 0.1 dx = 0.1000 Variable J0 will store the Jacobian corresponding to x0 calculated through finite differences with the value of ∆x defined above: » J0 = jacobFD('f2',x0,dx) J0 = 4.1000 6.1000 3.0000 2.0000

A new estimate for the solution, namely, x1, is calculated using the Newton-Raphson algorithm: » x1 = x0 - inv(J0)*f2(x0) x1 = 6.1485 6.2772

The finite-difference Jacobian corresponding to x1 gets stored in J1: » J1 = jacobFD('f2',x1,dx) J1 = 12.3970 12.6545 6.2772 6.1485 And a new approximation for the solution (x2) is calculated as: » x2 = x1 - inv(J1)*f2(x1) x2 = 4.6671 5.5784

16

The next two approximations to the solution (x3 and x4) are calculated without first storing the corresponding finite-difference Jacobians: » x3 = x2 - inv(jacobFD('f2',x2,dx))*f2(x2) x3 = 4.7676 5.2365 EDU» x4 = x3 - inv(jacobFD('f2',x3,dx))*f2(x3) x4 = 4.8826 5.1175

To check the value of the functions at x = x4 we use: » f2(x4) ans = 0.0278 -0.0137 The functions are close to zero, but not yet at an acceptable error (i.e., something in the order of 10-6). Therefore, we try one more approximation to the solution, i.e., x5: » x5 = x4 - inv(jacobFD('f2',x4,dx))*f2(x4) x5 = 4.9413 5.0587 The functions are even closer to zero than before, suggesting a convergence to a solution. » f2(x5) ans = 0.0069 -0.0034 MMAATTLLAABB ffuunnccttiioonn ffoorr ““sseeccaanntt”” mmeetthhoodd ttoo ssoollvvee ssyysstteemmss ooff nnoonn--lliinneeaarr eeqquuaattiioonnss

To make the process of achieving a solution automatic, we propose the following MATLAB user -defined function, secantm: function [x,iter] = secantm(x0,dx,f) % Secant-type method applied to a % system of linear equations f(x) = 0,

17

% given the jacobian function J, with % The Jacobian built by columns. % x = [x1;x2;...;xn], f = [f1;f2;...;fn] % x0 is the initial guess of the solution % dx is an increment in x1,x2,... variables N = 100; % define max. number of iterations epsilon = 1.0e-10; % define tolerance maxval = 10000.0; % define value for divergence if abs(dx)<epsilon error('dx = 0, use different values'); break; end; xn = x0; % load initial guess [n m] = size(x0); while (N>0) JJ = [1,2;2,3]; xx = zeros(n,1); for j = 1:n % Estimating xx = xn; % Jacobian by xx(j) = xn(j) + dx; % finite fxx = feval(f,xx); fxn = feval(f,xn); JJ(:,j) = (fxx-fxn)/dx; % differences end; % by columns if abs(det(JJ))<epsilon error('newtonm - Jacobian is singular - try new x0,dx'); break; end; xnp1 = xn - inv(JJ)*fxn; fnp1 = feval(f,xnp1); if abs(fnp1)<epsilon x=xnp1; disp(['iterations: ', num2str(100-N)]); return; end; if abs(fnp1)>maxval disp(['iterations: ', num2str(100-N)]); error('Solution diverges'); break; end; N = N - 1; xn = xnp1; end; error('No convergence'); break; % end function

18

SSoolluuttiioonn uussiinngg ffuunnccttiioonn sseeccaannttmm

To solve the system represented by function f2, we now use function secantm. The following call to function secantm produces a solution after 18 iterations: » [x,iter] = secantm(x0,dx,'f2') iterations: 18 x = 5.0000 5.0000 iter = 18

Solving equations with Matlab function fzero Matlab provides function fzero for the solution of single non-linear equations. Use » help fzero to obtain additional information on function fzero. Also, read Chapter 7 (Function Functions) in the Using Matlab guide. For the solution of systems of non-linear equations Matlab provides function fsolve as part of the Optimization package. Since this is an add-on package, function fsolve is not available in the student version of Matlab. If using the full version of Matlab, check the help facility for function fsolve by using: » help fsolve If function fsolve is not available in the Matlab installation you are using, you can always use function secantm (or newtonm) to solve systems of non-linear equations.

Documents

Haoudout1-4