
Linear Algebra

Santanu Dey

January 17, 2011

1. INTRODUCTION

This is the second course in Mathematics for you at IIT.

Central theme: to study geometric objects in higher dimensions, and also functions, thoroughly.

The simplest geometric objects are lines and planes; the simplest functions are linear functions.

Linear algebra brings a unified approach to topics like coordinate geometry and vector algebra.

It is useful for the calculus of several variables, systems of differential equations, etc., with applications to electrical networks, mechanics, optimization problems, processes in statistics, etc.

HOW DO WE GO ABOUT IT?

1. proper foundation: vector spaces, linear transformations, linear dependence, dimension, matrices, determinants, eigenvalues, inner product spaces, etc.

2. applications: the study of quadratic forms

3. an elementary proof of the fundamental theorem of algebra (using linear algebra)

Main textbook for the course: Chapters 6 and 7 of "Advanced Engineering Mathematics" by E. Kreyszig, 8th edition.

Cartesian coordinate space

René Descartes (1596-1650)

French philosopher, mathematician, physicist, and writer.

"cogito ergo sum" (I think, therefore I am)

n-dimensional Cartesian coordinate space ℝ^n

ℝ^n ≅ ℝ × ... × ℝ (n factors)

ℝ^n is the totality of all ordered n-tuples (x_1, ..., x_n) with each x_i ∈ ℝ; for n = 2 these are the pairs (x, y) ∈ ℝ^2.

π_i : ℝ^n → ℝ defined by

π_i((x_1, ..., x_n)) = x_i

is called the i-th coordinate function or i-th coordinate projection.

Given a function f : A → ℝ^n, define f_i := π_i ∘ f. The component functions f_1, ..., f_n completely determine f.
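As a quick illustration (ours, not from the slides), here is a minimal Python sketch of the coordinate projections π_i and of the component functions of a sample map f:

def proj(i, x):
    """The i-th coordinate projection pi_i on R^n (1-indexed, as in the slides)."""
    return x[i - 1]

def f(t):
    """A sample map f : R -> R^3."""
    return (t, t**2, t**3)

# Component functions f_i := pi_i o f.
f1 = lambda t: proj(1, f(t))
f2 = lambda t: proj(2, f(t))
f3 = lambda t: proj(3, f(t))

t = 2.0
assert (f1(t), f2(t), f3(t)) == f(t)   # together, the f_i determine f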

For n < m we have the inclusion map ι : ℝ^n → ℝ^m,

ι : (x_1, ..., x_n) ↦ (x_1, ..., x_n, 0, ..., 0).

These fail to determine the behaviour of a function completely, so the calculus of several variables is not a mere extension of the calculus of one variable.

Examples:

1. Let f(x, y) = (x^2 y, x + y) and g(x, y) = (cos(x + y), sin(x/y)).

⇒ The domain of g is ℝ^2∖{(x, 0) : x ∈ ℝ}. Therefore, neither f ∘ g nor g ∘ f is defined on all of ℝ^2.

Restrict the domain of f to ℝ^2∖{(x, y) : x + y = 0}; then

g ∘ f(x, y) = (cos(x^2 y + x + y), sin(x^2 y/(x + y))).
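A hedged numeric check of this composition (our own illustration, not part of the slides):

import math

def f(x, y):
    return (x**2 * y, x + y)

def g(u, v):
    if v == 0:
        raise ValueError("g is undefined when its second argument is 0")
    return (math.cos(u + v), math.sin(u / v))

# On the restricted domain x + y != 0 the composite g o f is defined:
x, y = 1.0, 2.0
u, v = f(x, y)
assert g(u, v) == (math.cos(x**2 * y + x + y), math.sin(x**2 * y / (x + y)))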

2. Let N = {1, 2, ..., n}.

N^n = N × ⋅⋅⋅ × N;

S(n) = the set of all sequences of length n with values in N;

F(N, N) = the set of all functions from N to N.

There are natural ways of getting one-to-one mappings of any one of these three sets into another.

Let Σ(n) denote the subset of F(N, N) consisting of those functions which are one-to-one. What does it correspond to in N^n and S(n)?
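For concreteness, a small enumeration (our own sketch) identifying a function N → N with its tuple of values, so that all three sets have n^n elements and Σ(n) corresponds to the tuples with distinct entries:

from itertools import product
from math import factorial

n = 3
N = range(1, n + 1)

tuples = list(product(N, repeat=n))   # identified with N^n and with S(n)
# A function N -> N is identified with its value tuple (f(1), ..., f(n)).
injections = [t for t in tuples if len(set(t)) == n]   # corresponds to Sigma(n)

assert len(tuples) == n ** n
assert len(injections) == factorial(n)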

Exercise:

(i) Consider the map (x, y) ↦ (x + y^2, y + x^2 + 2xy^2 + y^4). Is it a bijective map?

(ii) (optional) Let f : ℝ^2 → ℝ be a continuous map. Then there exist a, b ∈ ℝ such that for all r ∈ ℝ∖{a, b}, f^{-1}(r) is either ∅ or infinite. Prove that:

(a) for every n there exist at least n points which are mapped to the same point by f;

(b) if f is surjective, then f^{-1}(r) is infinite for all r ∈ ℝ;

(c) find a continuous function f : ℝ^2 → ℝ such that f^{-1}(-1) = {-1} and f^{-1}(1) = {1}.

Algebraic structure of ℝ^n

For x = (x_1, ..., x_n), y = (y_1, ..., y_n) define

x + y = (x_1 + y_1, ..., x_n + y_n)

Note: the usual laws of addition hold, with

0 = (0, ..., 0),  −x = (−x_1, ..., −x_n)

Scalar multiplication: αx := (αx_1, ..., αx_n)

1. associative: α(βx) = (αβ)x
2. distributive: α(x + y) = αx + αy
3. identity: 1x = x
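A minimal Python sketch (ours, not from the slides) of these componentwise operations on tuples, with a spot-check of the three laws:

def add(x, y):
    """Componentwise addition on R^n."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(a, x):
    """Scalar multiplication a*x on R^n."""
    return tuple(a * xi for xi in x)

x, y = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
a, b = 2.0, -3.0
assert scale(a, scale(b, x)) == scale(a * b, x)               # associativity
assert scale(a, add(x, y)) == add(scale(a, x), scale(a, y))   # distributivity
assert scale(1.0, x) == x                                     # identity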

Geometry of ℝ^n

Distance function on ℝ^n (linear metric):

d(x, y) = √( ∑_{i=1}^n (x_i − y_i)^2 ),   x, y ∈ ℝ^n

This is related to the dot product (x, y) ↦ x·y := ∑_{i=1}^n x_i y_i.

The norm function:

∥x∥ := d(x, 0) = √( ∑_i x_i^2 ) = √(x·x)

It follows that d(x, y) = ∥x − y∥.

(d1) symmetry: d(x, y) = d(y, x)
(d2) triangle inequality: d(x, y) ≤ d(x, w) + d(w, y)
(d3) positivity: d(x, y) ≥ 0, and d(x, y) = 0 ⇔ x = y
(d4) homogeneity: d(αx, αy) = |α| d(x, y)

(For (d2) use the Cauchy-Schwarz inequality: |x·y| ≤ ∥x∥ ∥y∥.)
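A small Python sketch (our own illustration) of d, the norm, and the dot product, with a numeric spot-check of (d1)-(d4) and of Cauchy-Schwarz:

import math
import random

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def d(x, y):
    return norm([xi - yi for xi, yi in zip(x, y)])

random.seed(0)
def rand_vec(n=4):
    return [random.uniform(-1, 1) for _ in range(n)]

x, y, w = rand_vec(), rand_vec(), rand_vec()
a = 2.5

assert math.isclose(d(x, y), d(y, x))                         # (d1)
assert d(x, y) <= d(x, w) + d(w, y) + 1e-12                   # (d2)
assert d(x, y) >= 0 and d(x, x) == 0                          # (d3)
assert math.isclose(d([a * t for t in x], [a * t for t in y]),
                    abs(a) * d(x, y))                         # (d4)
assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-12            # Cauchy-Schwarz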

Exercise:

(i) Define δ(x, y) = ∑_{i=1}^n |x_i − y_i| and D(x, y) = max{|x_i − y_i| : 1 ≤ i ≤ n}. Show that δ and D satisfy (d1), (d2), (d3).

(ii) On the set of 64 squares of a chess board, define the distance d from one square to another to be the least number of knight moves required.

(a) Check that this distance function satisfies (d1), (d2), (d3).
(b) Determine the diameter of the chess board with respect to this distance, where the diameter of a space is the supremum of all values of d.
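If one wants to experiment with part (ii), here is a small breadth-first-search sketch (our own, not part of the exercise) that computes the knight-move distance from a given square to every other square; the diameter is then the maximum over all pairs:

from collections import deque

MOVES = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]

def knight_distances(start):
    """Least number of knight moves from `start` to every square of an 8x8 board."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES:
            nr, nc = r + dr, c + dc
            if 0 <= nr < 8 and 0 <= nc < 8 and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

squares = [(r, c) for r in range(8) for c in range(8)]
diameter = max(max(knight_distances(s).values()) for s in squares)
print("diameter =", diameter)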

"Not to rely too much on intuition when dealing with higher dimensions!"

Consider a square of side 4 units and place four coins of unit radius, one in each corner, so that each touches two of the sides. Of course each coin touches two other coins. Now place a coin at the center of the square so as to touch all four coins.

Do the same inside an n-dimensional cube of side 4 units with 2^n n-dimensional balls of unit radius.

For n = 2, 3, ..., 9 the central ball, which is kept touching all the balls in the corners, lies inside the cube.

The surprise is that for n > 9 the central ball cannot fit inside the cube. Prove this by showing that the radius of the central ball is √n − 1.
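A sketch of the computation (our own reasoning, consistent with the statement above): centre the cube at the origin, so each corner ball has radius 1 and centre (±1, ..., ±1). The distance from the origin to such a centre is √(1^2 + ... + 1^2) = √n, so the central ball touching all of them has radius √n − 1, and it stays inside the cube exactly when √n − 1 ≤ 2, i.e. when n ≤ 9. A quick numeric check in Python:

import math

for n in (2, 9, 10, 16):
    r = math.sqrt(n) - 1   # radius of the central ball
    fits = r <= 2          # 2 = half the side of the cube of side 4
    print(n, round(r, 3), "fits inside the cube" if fits else "pokes out of the cube")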

2. LINEAR MAPS ON EUCLIDEAN SPACES AND MATRICES

Definition
f : ℝ^n → ℝ^m is said to be a linear map if f(αx + βy) = αf(x) + βf(y) for all x, y ∈ ℝ^n and all α, β ∈ ℝ.

Examples: the projection map π_i, the inclusion map, multiplication by a scalar, the dot product with a fixed vector; what about the converse?

f : ℝ^n → ℝ^m is linear iff the f_i are linear.

Distance travelled is a linear function of time when the velocity is constant. So is the voltage as a function of resistance when the current is constant. The logarithm of the change in concentration in any first-order chemical reaction is a linear function of time.

|x|, x^n (n > 1), sin x, etc. are not linear.
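As an illustration (ours; the helper and the sample maps below are our own choices), a random spot-check of the defining identity, showing a coordinate projection passes while sin does not:

import math
import random

def looks_linear(f, n, trials=100):
    """Spot-check f(a*x + b*y) == a*f(x) + b*f(y) on random samples (evidence, not a proof)."""
    random.seed(1)
    for _ in range(trials):
        x = [random.uniform(-1, 1) for _ in range(n)]
        y = [random.uniform(-1, 1) for _ in range(n)]
        a, b = random.uniform(-2, 2), random.uniform(-2, 2)
        lhs = f([a * xi + b * yi for xi, yi in zip(x, y)])
        rhs = [a * u + b * v for u, v in zip(f(x), f(y))]
        if not all(math.isclose(p, q, abs_tol=1e-9) for p, q in zip(lhs, rhs)):
            return False
    return True

proj1 = lambda x: [x[0]]            # coordinate projection: linear
sine = lambda x: [math.sin(x[0])]   # not linear

print(looks_linear(proj1, 2))   # True
print(looks_linear(sine, 1))    # False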

Exercise:

(i) Show that if f is a linear map then

f(∑_{i=1}^k α_i x_i) = ∑_{i=1}^k α_i f(x_i).

(ii) Show that the projection on a line L passing through the origin defines a linear map from ℝ^2 to ℝ^2 and that its image is equal to L.

(iii) Show that rotation through a fixed angle θ is a linear map from ℝ^2 to ℝ^2.

(iv) By a rigid motion of ℝ^n we mean a map f : ℝ^n → ℝ^n such that

d(f(x), f(y)) = d(x, y).

Show that a rigid motion of ℝ^3 which fixes the origin is a linear map.

Structure of linear maps

L(n, m) = the set of all linear maps from ℝ^n to ℝ^m

For f, g ∈ L(n, m) define αf and f + g by

(αf)(x) = αf(x);   (f + g)(x) = f(x) + g(x)

If f ∈ L(n, m) and g ∈ L(m, l), then g ∘ f ∈ L(n, l).

If f, g ∈ L(n, 1), then define fg : ℝ^n → ℝ by (fg)(x) = f(x)g(x). Does fg ∈ L(n, 1)?

Let e_i = (0, ..., 0, 1, 0, ..., 0) (the standard basis elements, with the 1 in the i-th place). If x ∈ ℝ^n, then x = ∑_{i=1}^n x_i e_i.

If f ∈ L(n, m), then

f(x) = ∑_i x_i f(e_i)
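A small sketch (ours; the map f below is just a sample) of the identity f(x) = ∑_i x_i f(e_i):

def f(x):
    """A sample linear map R^3 -> R^2."""
    x1, x2, x3 = x
    return (2*x1 - x3, x1 + 5*x2)

def basis(i, n):
    """Standard basis vector e_i of R^n (1-indexed)."""
    return tuple(1.0 if j == i else 0.0 for j in range(1, n + 1))

x = (1.0, -2.0, 4.0)
recombined = tuple(
    sum(x[i] * f(basis(i + 1, 3))[k] for i in range(3)) for k in range(2)
)
assert recombined == f(x)   # f is determined by its values on e_1, e_2, e_3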

Conversely, given v_1, ..., v_n ∈ ℝ^m, we can define a (unique) linear map f by assigning f(e_i) = v_i.

Examples:

1. Given f ∈ L(n, 1), if we put u = (f(e_1), ..., f(e_n)), then f(x) = ∑_i x_i f(e_i) = u·x.

2. Consider the system

a_{11} x_1 + a_{12} x_2 + ... + a_{1n} x_n = b_1
a_{21} x_1 + a_{22} x_2 + ... + a_{2n} x_n = b_2
. . .
a_{m1} x_1 + a_{m2} x_2 + ... + a_{mn} x_n = b_m

The set of all solutions of the j-th equation is a hyperplane P_j in ℝ^n. Solving the system means finding P_1 ∩ ... ∩ P_m.

On the other hand, the left-hand side of each of these equations can be thought of as a linear map T_i : ℝ^n → ℝ. Together, they define one function

T ∈ L(ℝ^n, ℝ^m)

such that T = (T_1, ..., T_m).

Solving the system amounts to determining all x ∈ ℝ^n such that T(x) = b, where b = (b_1, ..., b_m).
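For a concrete instance (our own, using NumPy as an assumed dependency), a 2 × 2 system T(x) = b solved numerically:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # the coefficients a_{ij}, so T(x) = A x
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)    # the point lying on both hyperplanes (here: lines)
assert np.allclose(A @ x, b)
print(x)                     # [1. 3.]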

Matrix representation

A column vector is written

(x_1, x_2, ..., x_n)^T ∈ ℝ^n   (T stands for transpose),

that is, the n entries x_1, ..., x_n stacked in a column.

Given a linear map f : ℝ^n → ℝ^m we get n column vectors (of size m), viz. f(e_1), ..., f(e_n). Place them side by side: for instance, if f(e_j) = (f_{1j}, f_{2j}, ..., f_{mj})^T, then we obtain

ℳ_f = [ f_{11}  f_{12}  ...  f_{1n} ]
      [ f_{21}  f_{22}  ...  f_{2n} ]
      [  ...     ...          ...   ]
      [ f_{m1}  f_{m2}  ...  f_{mn} ]

This array is called a matrix with m rows and n columns. We say the matrix ℳ_f is of size m × n.
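Here is a minimal NumPy sketch (our own; the map g below is just a sample) that builds the matrix of a linear map column by column, i.e. from g(e_1), ..., g(e_n), and checks that matrix-vector multiplication reproduces the map:

import numpy as np

def g(x):
    """A sample linear map R^3 -> R^2."""
    x1, x2, x3 = x
    return np.array([2*x1 - x3, x1 + 5*x2])

n = 3
E = np.eye(n)                                          # columns are e_1, ..., e_n
M = np.column_stack([g(E[:, j]) for j in range(n)])    # M_g, of size 2 x 3

x = np.array([1.0, -2.0, 4.0])
assert np.allclose(M @ x, g(x))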

Notation: ℳ_f = ((f_{ij}))

Two matrices are equal if their sizes are the same and their entries are the same.

If m = 1 we get row matrices; if n = 1 we get column matrices.

The assignment

ℳ : L(n, m) → M_{m,n},  f ↦ ℳ_f

is one-one and is called the matrix representation of linear maps.

ℳ_{f+g} = ℳ_f + ℳ_g;   ℳ_{αf} = α ℳ_f

Examples:

1. ℳ_Id = I or I_n, where

I = [ 1  0  ...  0 ]
    [ 0  1  ...  0 ]
    [ ...         ]
    [ 0  0  ...  1 ]  = ((δ_{ij})),

with δ_{ij} = 1 if i = j and δ_{ij} = 0 otherwise (the Kronecker delta).

2. The linear map T : ℝ^2 → ℝ^2 which interchanges the coordinates is represented by

[ 0  1 ]
[ 1  0 ]

3. Corresponding to multiplication by λ ∈ ℝ is the diagonal matrix D(λ, ..., λ) = ((λ δ_{ij})).

4. The rotation through an angle θ about the origin ←→

[ cos θ   −sin θ ]
[ sin θ    cos θ ]

Linearity can be shown by the law of congruent triangles; alternatively, write down the images of e_1 and e_2 under the rotation, write down the matrix M, and then show that the rotation takes (x, y)^T to M(x, y)^T.

5. The reflection through a line L passing through the origin and making an angle θ with the x-axis (with 0 ≤ θ < π) ←→

[ cos 2θ    sin 2θ ]
[ sin 2θ   −cos 2θ ]
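A NumPy spot-check (our own illustration) that the rotation matrix acts on e_1 as expected, that reflecting twice gives the identity, and that rotations preserve distances:

import numpy as np

def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def reflection(theta):
    """Reflection across the line through the origin at angle theta."""
    return np.array([[np.cos(2*theta),  np.sin(2*theta)],
                     [np.sin(2*theta), -np.cos(2*theta)]])

theta = np.pi / 6
R, S = rotation(theta), reflection(theta)

assert np.allclose(R @ np.array([1.0, 0.0]), [np.cos(theta), np.sin(theta)])  # image of e_1
assert np.allclose(S @ S, np.eye(2))     # reflecting twice is the identity
assert np.allclose(R @ R.T, np.eye(2))   # rotations preserve the dot product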

Operations on matrices

The set of all matrices of size m × n is denoted by M_{m,n}.

For A = ((a_{ij})), B = ((b_{ij})) ∈ M_{m,n}:

A + B = ((a_{ij} + b_{ij}));   αA = ((α a_{ij}));

0 is the matrix with all entries 0;   −A = ((−a_{ij})).

ℝ^n ←→ M_{1,n}, and similarly ℝ^m ←→ M_{m,1}.

The 'transpose' operation introduced earlier can be extended to all matrices:

A^T := ((b_{ij})) ∈ M_{n,m}, where b_{ij} := a_{ji}

(αA + βB)^T = α A^T + β B^T

Let f : ℝ^n → ℝ^m and g : ℝ^m → ℝ^l be linear maps. If A := ℳ_f and B := ℳ_g, then ℳ_{g∘f} = ?

We have

(g ∘ f)(e_j) = g(∑_i a_{ij} e_i) = ∑_i a_{ij} g(e_i)
            = ∑_i a_{ij} (∑_k b_{ki} e_k) = ∑_k (∑_i b_{ki} a_{ij}) e_k

So ℳ_{g∘f} = C = ((c_{kj})), where c_{kj} = ∑_{i=1}^m b_{ki} a_{ij}.

Define BA := C.

Properties:

1. Associativity: A(BC) = (AB)C (if AB and BC are defined)

2. Right and left distributivity: A(B + C) = AB + AC,  (B + C)A = BA + CA
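A minimal Python sketch (ours) of the product formula c_{kj} = ∑_i b_{ki} a_{ij}, checked against composing the two linear maps directly:

def matmul(B, A):
    """(BA)[k][j] = sum_i B[k][i] * A[i][j]; B is l x m, A is m x n."""
    l, m, n = len(B), len(A), len(A[0])
    return [[sum(B[k][i] * A[i][j] for i in range(m)) for j in range(n)]
            for k in range(l)]

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

A = [[1, 2, 0],
     [0, 1, 3]]    # matrix of f : R^3 -> R^2
B = [[2, 1],
     [1, 0],
     [0, 4]]       # matrix of g : R^2 -> R^3

x = [1, -1, 2]
assert apply(matmul(B, A), x) == apply(B, apply(A, x))   # M_{g o f} = M_g M_f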

3. Multiplicative identity: if A ∈ M_{m,n} and B ∈ M_{n,k}, then A I_n = A and I_n B = B.

4. (AB)^T = B^T A^T

5. Let f : ℝ^n → ℝ^m and g : ℝ^m → ℝ^l be linear maps. Then ℳ_{g∘f} = ℳ_g ℳ_f.

Remark

M_{2,2} ←→ ℝ^4 (the correspondence preserves sums and scalar products). Similarly M_{m,n} ←→ ℝ^{mn}.

Invertible Transformations and Matrices

Definition
A function f : X → Y is said to be invertible if there exists g : Y → X such that

g ∘ f = Id_X and f ∘ g = Id_Y.

The inverse of a function, if it exists, is unique and is denoted by f^{-1}.

An n × n matrix (i.e., a square matrix) A is said to be invertible if there exists another n × n matrix B such that AB = BA = I_n. We call B an inverse of A.

Remarks:
(i) An inverse of a matrix is unique. [If C is another matrix such that CA = AC = I_n, then

C = C I_n = C(AB) = (CA)B = I_n B = B.]

Denote it by A^{-1}.

26/51


(ii) If A_1, A_2 are invertible then so is A_1A_2. What is its inverse?
(iii) Clearly I_n, and diag(a_1, a_2, . . . , a_n) with all a_i ≠ 0, are invertible.
(iv) Let B := A^{-1}. If f_A, f_B : ℝ^n → ℝ^n are the linear maps associated with A, B resp., then it follows that

f_A ∘ f_B = f_{AB} = Id.

Likewise f_B ∘ f_A = Id. Even the converse holds (viz., ℳ_f ℳ_{f^{-1}} = I_n).
(v) An invertible map is one-one and onto.
(vi) If f : ℝ^n → ℝ^n is an invertible linear map, then f^{-1} is linear.

27/51
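The inverse asked for in remark (ii) is A_2^{-1}A_1^{-1}, since (A_1A_2)(A_2^{-1}A_1^{-1}) = A_1(A_2A_2^{-1})A_1^{-1} = I_n, and similarly on the other side. A short NumPy check (random matrices; illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
A1 = rng.standard_normal((4, 4))   # a random Gaussian matrix is invertible with probability 1
A2 = rng.standard_normal((4, 4))

# Candidate inverse of A1 A2
B = np.linalg.inv(A2) @ np.linalg.inv(A1)
assert np.allclose((A1 @ A2) @ B, np.eye(4))
assert np.allclose(B @ (A1 @ A2), np.eye(4))
```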


Elementary and Permutation Matrix

Consider the square matrix E_{ij} whose (i, j)-th entry is 1 and all other entries are 0.
If we multiply a matrix A by E_{ij} on the left, then what we get is a matrix whose i-th row is equal to the j-th row of A and all other rows are zero. In particular, E_{ij}E_{ij} = 0 for i ≠ j.
It follows that for any λ and i ≠ j,

(I + λE_{ij})(I − λE_{ij}) = I + λE_{ij} − λE_{ij} − λ^2 E_{ij}E_{ij} = I.

Also, for any λ and μ,

(I + λE_{ii})(I + μE_{ii}) = I + (λ + μ + λμ)E_{ii}.

For the right-hand side to equal I we must have λ + μ + λμ = 0, i.e., μ = −λ/(1 + λ). Thus I + λE_{ii} is invertible if λ ≠ −1.
Alternatively, I + λE_{ii} is the diagonal matrix with all diagonal entries equal to 1 except the (i, i)-th one, which is equal to 1 + λ.

28/51
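The identities above are easy to check numerically. The sketch below (NumPy, with arbitrary choices of n, i, j and λ; the helper E is our own name, not notation from the notes) also shows the effect of multiplying by E_{ij} on the left.

```python
import numpy as np

n, i, j, lam = 4, 1, 3, 2.5            # arbitrary choices with i != j

def E(i, j, n):
    """Matrix unit E_ij: 1 in position (i, j), zeros elsewhere."""
    M = np.zeros((n, n))
    M[i, j] = 1.0
    return M

I = np.eye(n)

# (I + lam*E_ij)(I - lam*E_ij) = I for i != j
assert np.allclose((I + lam * E(i, j, n)) @ (I - lam * E(i, j, n)), I)

# I + lam*E_ii is inverted by I - (lam/(1+lam))*E_ii when lam != -1
mu = -lam / (1 + lam)
assert np.allclose((I + lam * E(i, i, n)) @ (I + mu * E(i, i, n)), I)

# Left multiplication by E_ij copies row j of A into row i and zeroes the rest
A = np.arange(16).reshape(n, n)
print(E(i, j, n) @ A)
```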


Further, consider I + E_{ij} + E_{ji} − E_{ii} − E_{jj}, which is similarly invertible.
This matrix is nothing but the identity matrix with the i-th and j-th rows interchanged. These are called transposition matrices.
The linear maps corresponding to them merely interchange the i-th and j-th coordinates. We shall denote them simply by T_{ij}. To sum up we have:

Theorem
The elementary matrices I + λE_{ij} (i ≠ j), I + λE_{ii} (λ ≠ −1) and T_{ij} = I + E_{ij} + E_{ji} − E_{ii} − E_{jj} are all invertible, with respective inverses I − λE_{ij}, I − (λ/(1 + λ))E_{ii} and T_{ij}.

Permutation matrices are defined to be those square matrices which have all the entries in any given row (and column) equal to zero except one entry, which is equal to 1.

29/51


From a permutation matrix we can get a map σ : N → N which is a one-to-one mapping.
Conversely, given a permutation σ : N → N, we define a matrix P_σ = ((p_{ij})) by

p_{ij} = 1 if j = σ(i), and p_{ij} = 0 if j ≠ σ(i).

A permutation matrix is obtained by merely shuffling the rows of the identity matrix (or by shuffling the columns).
If A denotes a permutation matrix, then

AA^T = A^T A = I_n.

In particular, they are invertible.

30/51
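A small sketch (NumPy; the permutation σ is an arbitrary choice and indices are 0-based here) building P_σ and checking P_σP_σ^T = P_σ^TP_σ = I_n:

```python
import numpy as np

sigma = [2, 0, 3, 1]                  # an arbitrary permutation of {0, 1, 2, 3}
n = len(sigma)

# P_sigma has p_ij = 1 exactly when j = sigma(i)
P = np.zeros((n, n), dtype=int)
for i, s in enumerate(sigma):
    P[i, s] = 1

assert np.array_equal(P @ P.T, np.eye(n, dtype=int))
assert np.array_equal(P.T @ P, np.eye(n, dtype=int))

# Left multiplication by P_sigma permutes the rows of a matrix
A = np.arange(16).reshape(n, n)
print(P @ A)
```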


Gauss Elimination

Carl Friedrich Gauss (1777-1855)

German mathematician and scientist; contributed to number theory, statistics, algebra, analysis, differential geometry, geophysics, electrostatics, astronomy, optics.

31/51


Gauss elimination method: to solve a system of m linear equations in n unknowns.

The three types of operations on these equations which do not alter the solutions:
(1) Interchanging two equations.
(2) Multiplying all the terms of an equation by a nonzero scalar.
(3) Adding to one equation a multiple of another equation.

We only need to keep track of which coefficient came from which variable.

32/51


$$\begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix} \qquad (*)$$

The matrix A = ((a_{ij})) is called the coefficient matrix. By a solution of (∗) we mean any choice of x_1, x_2, . . . , x_n which satisfies all the equations in it.
If each b_i = 0 we say that the system is homogeneous. Otherwise it is called an inhomogeneous system.

The matrix
$$\left(\begin{array}{ccc|c} a_{11} & \dots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \dots & a_{mn} & b_m \end{array}\right)$$
is called the augmented matrix.

Now the above three operations on the equations correspond to certain operations on the rows of the augmented matrix. These are called elementary row operations.

33/51


Example 1 (A system with a unique solution):

x − 2y + z = 5
2x − 5y + 4z = −3
x − 4y + 6z = 10.

$$\left(\begin{array}{ccc|c} 2 & -5 & 4 & -3 \\ 1 & -2 & 1 & 5 \\ 1 & -4 & 6 & 10 \end{array}\right)$$

The three basic operations mentioned above will be performed on the rows of the augmented matrix. After interchanging the first two rows we get

$$\left(\begin{array}{ccc|c} 1 & -2 & 1 & 5 \\ 2 & -5 & 4 & -3 \\ 1 & -4 & 6 & 10 \end{array}\right)$$

34/51


First we add −2 times the first row to the second row. Then we subtract the first row from the third row:

$$\left(\begin{array}{ccc|c} 1 & -2 & 1 & 5 \\ 0 & -1 & 2 & -13 \\ 0 & -2 & 5 & 5 \end{array}\right)$$

This last step is also called ‘sweeping’ a column. Now we repeat the process for the smaller matrix:

$$\left(\begin{array}{cc|c} -1 & 2 & -13 \\ -2 & 5 & 5 \end{array}\right) \Rightarrow \left(\begin{array}{cc|c} 1 & -2 & 13 \\ -2 & 5 & 5 \end{array}\right) \Rightarrow \left(\begin{array}{cc|c} 1 & -2 & 13 \\ 0 & 1 & 31 \end{array}\right)$$

Put back the rows and columns that have been cut out earlier:

35/51


$$\left(\begin{array}{ccc|c} 1 & -2 & 1 & 5 \\ 0 & 1 & -2 & 13 \\ 0 & 0 & 1 & 31 \end{array}\right) \qquad (*)$$

The matrix represents the linear system:

x − 2y + z = 5
y − 2z = 13
z = 31

These can be solved successively by backward substitution:
z = 31; y = 13 + 2z = 75; x = 5 + 2y − z = 124

36/51
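Backward substitution is easy to mechanize. The following sketch (NumPy assumed; illustrative only) solves exactly the triangular system above and recovers x = 124, y = 75, z = 31.

```python
import numpy as np

# Upper triangular system from (*): U x = c
U = np.array([[1, -2,  1],
              [0,  1, -2],
              [0,  0,  1]], dtype=float)
c = np.array([5, 13, 31], dtype=float)

# Backward substitution: solve the last equation first, then work upwards.
x = np.zeros(3)
for i in reversed(range(3)):
    x[i] = (c[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]

print(x)          # expected: [124.  75.  31.]
```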


Alternatively, we can continue the process, called the Gauss-Jordan Process. Here, at this stage, we first make sure that all diagonal entries are indeed 1 or 0.
Then, for those columns for which the diagonal entry is 1, we sweep the column above the diagonal entry too. This is carried out in decreasing order of the column numbers.

Recall

$$\left(\begin{array}{ccc|c} 1 & -2 & 1 & 5 \\ 0 & 1 & -2 & 13 \\ 0 & 0 & 1 & 31 \end{array}\right) \qquad (*)$$

(1) add twice the third row to the second
(2) then subtract the third row from the first
(3) add twice the second row to the first

37/51


$$\left(\begin{array}{ccc|c} 1 & 0 & 0 & 124 \\ 0 & 1 & 0 & 75 \\ 0 & 0 & 1 & 31 \end{array}\right)$$

⇒ The augmented matrix gives the desired solution x = 124; y = 75; z = 31.

Notation: Let R_i denote the i-th row of a given matrix.

Operation                                      Notation
Multiply R_i by a scalar c                     cR_i
Multiply R_j by a scalar c and add to R_i      R_i + cR_j
Interchange R_i and R_j                        R_i ↔ R_j

38/51
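The three row operations can be written as one-line helpers, and Example 1 can then be replayed end to end. This is only an illustrative NumPy sketch with 0-based row indices; the helper names are ours, not from the notes.

```python
import numpy as np

def scale(M, i, c):            # cR_i
    M[i] *= c

def add_multiple(M, i, j, c):  # R_i + cR_j
    M[i] += c * M[j]

def swap(M, i, j):             # R_i <-> R_j
    M[[i, j]] = M[[j, i]]

# Augmented matrix of Example 1 (rows are 0-indexed in the code)
M = np.array([[2, -5, 4, -3],
              [1, -2, 1,  5],
              [1, -4, 6, 10]], dtype=float)

swap(M, 0, 1)               # R1 <-> R2
add_multiple(M, 1, 0, -2)   # R2 - 2R1
add_multiple(M, 2, 0, -1)   # R3 - R1
scale(M, 1, -1)             # -R2
add_multiple(M, 2, 1, 2)    # R3 + 2R2
add_multiple(M, 1, 2, 2)    # R2 + 2R3
add_multiple(M, 0, 2, -1)   # R1 - R3
add_multiple(M, 0, 1, 2)    # R1 + 2R2
print(M)                    # identity matrix augmented with (124, 75, 31)
```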


Example 2 (A system with infinitely many solutions):

x − 2y + z − u + v = 5
2x − 5y + 4z + u − v = −3
x − 4y + 6z + 2u − v = 10

$$\left(\begin{array}{ccccc|c} 1 & -2 & 1 & -1 & 1 & 5 \\ 2 & -5 & 4 & 1 & -1 & -3 \\ 1 & -4 & 6 & 2 & -1 & 10 \end{array}\right)$$

We shall use the notation introduced above for the row operations.

$$\xrightarrow{\;R_2 - 2R_1,\; R_3 - R_1\;} \left(\begin{array}{ccccc|c} 1 & -2 & 1 & -1 & 1 & 5 \\ 0 & -1 & 2 & 3 & -3 & -13 \\ 0 & -2 & 5 & 3 & -2 & 5 \end{array}\right)$$

$$\xrightarrow{\;R_3 - 2R_2\;} \left(\begin{array}{ccccc|c} 1 & -2 & 1 & -1 & 1 & 5 \\ 0 & -1 & 2 & 3 & -3 & -13 \\ 0 & 0 & 1 & -3 & 4 & 31 \end{array}\right)$$

39/51


$$\xrightarrow{\;-R_2\;} \left(\begin{array}{ccccc|c} 1 & -2 & 1 & -1 & 1 & 5 \\ 0 & 1 & -2 & -3 & 3 & 13 \\ 0 & 0 & 1 & -3 & 4 & 31 \end{array}\right)$$

$$\xrightarrow{\;R_1 + 2R_2\;} \left(\begin{array}{ccccc|c} 1 & 0 & -3 & -7 & 7 & 31 \\ 0 & 1 & -2 & -3 & 3 & 13 \\ 0 & 0 & 1 & -3 & 4 & 31 \end{array}\right)$$

$$\xrightarrow{\;R_2 + 2R_3,\; R_1 + 3R_3\;} \left(\begin{array}{ccccc|c} 1 & 0 & 0 & -16 & 19 & 124 \\ 0 & 1 & 0 & -9 & 11 & 75 \\ 0 & 0 & 1 & -3 & 4 & 31 \end{array}\right)$$

The system of linear equations corresponding to the last augmented matrix is:

x = 124 + 16u − 19v
y = 75 + 9u − 11v
z = 31 + 3u − 4v.

40/51


We say that u and v are independent variables and x, y, z are dependent variables.

(x, y, z, u, v)^T = (124 + 16t_1 − 19t_2, 75 + 9t_1 − 11t_2, 31 + 3t_1 − 4t_2, t_1, t_2)^T
                  = (124, 75, 31, 0, 0)^T + t_1(16, 9, 3, 1, 0)^T + t_2(−19, −11, −4, 0, 1)^T.

The above equation gives the general solution to the system. (124, 75, 31, 0, 0) is a particular solution of the inhomogeneous system.
v_1 = (16, 9, 3, 1, 0) and v_2 = (−19, −11, −4, 0, 1) are solutions of the corresponding homogeneous system. (These two solutions are linearly independent and) every other solution of the homogeneous system is a linear combination of these two solutions.

41/51
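It is worth verifying the structure of the general solution directly. The sketch below (NumPy; the values of t_1, t_2 are arbitrary) checks that (124, 75, 31, 0, 0) solves the inhomogeneous system, that v_1 and v_2 solve the homogeneous one, and that any combination c + t_1v_1 + t_2v_2 is again a solution.

```python
import numpy as np

A = np.array([[1, -2, 1, -1,  1],
              [2, -5, 4,  1, -1],
              [1, -4, 6,  2, -1]])
b = np.array([5, -3, 10])

c  = np.array([124, 75, 31, 0, 0])      # particular solution
v1 = np.array([16, 9, 3, 1, 0])         # homogeneous solutions
v2 = np.array([-19, -11, -4, 0, 1])

assert np.array_equal(A @ c, b)
assert np.array_equal(A @ v1, np.zeros(3, dtype=int))
assert np.array_equal(A @ v2, np.zeros(3, dtype=int))

# Any combination c + t1*v1 + t2*v2 is again a solution of Ax = b
t1, t2 = 7, -4
assert np.array_equal(A @ (c + t1 * v1 + t2 * v2), b)
```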


Theorem
Suppose Ax = b is a system of linear equations where A = ((a_{ij})) is an m × n matrix and x = (x_1, x_2, . . . , x_n)^T, b = (b_1, b_2, . . . , b_m)^T. Suppose c = (c_1, c_2, . . . , c_n)^T is a solution of Ax = b and S is the set of all solutions to the associated homogeneous system Ax = 0.
Then the set of all solutions to Ax = b is c + S := {c + v ∣ v ∈ S}.

Proof: Let r ∈ ℝ^n be a solution of Ax = b. Then

A(r − c) = Ar − Ac = b − b = 0.

Hence r − c ∈ S. Thus r ∈ c + S.
Conversely, let v ∈ S. Then

A(c + v) = Ac + Av = b + 0 = b.

Hence c + v is a solution to Ax = b. □

42/51


Example 3 (A system with no solution):

x − 5y + 4z = 3
x − 5y + 3z = 6
2x − 10y + 13z = 5

$$\left(\begin{array}{ccc|c} 1 & -5 & 4 & 3 \\ 1 & -5 & 3 & 6 \\ 2 & -10 & 13 & 5 \end{array}\right)$$

→ Apply the Gauss Elimination Method to get

$$\left(\begin{array}{ccc|c} 1 & -5 & 4 & 3 \\ 0 & 0 & -1 & 3 \\ 0 & 0 & 0 & 14 \end{array}\right)$$

→ The bottom row corresponds to the equation 0 · z = 14.
→ Hence the system has no solutions.

43/51


Definition
An m × n matrix A = ((a_{i,j})) is called an echelon matrix (respectively, a reduced echelon matrix) if A = 0, or if there exists an integer r, 1 ≤ r ≤ min{m, n}, and integers

1 ≤ k(1) < k(2) < . . . < k(r) ≤ n

such that
(i) a_{i,j} = 0 for all i > r and for all j.
(ii) for each 1 ≤ i ≤ r, a_{i,j} = 0 for j < k(i).
(iii) for each 1 ≤ i ≤ r, a_{i,k(i)} ≠ 0 (respectively, = 1).
(iv) for each 1 ≤ i ≤ r, a_{s,k(i)} = 0 for all s > i (respectively, for s ≠ i).

44/51
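Conditions (i)-(iv) translate directly into a membership test. The following sketch (NumPy, 0-based indices, so k(i) becomes an ordinary array index; the function name is ours) is one way to check whether a given matrix is an echelon or reduced echelon matrix.

```python
import numpy as np

def is_echelon(A, reduced=False):
    """Check conditions (i)-(iv) of the definition (0-based indexing)."""
    A = np.asarray(A)
    # (i): the nonzero rows must be exactly the first r rows
    nonzero_rows = [i for i in range(A.shape[0]) if np.any(A[i])]
    r = len(nonzero_rows)
    if nonzero_rows != list(range(r)):
        return False
    # k[i]: column of the first nonzero entry of row i (so (ii) holds automatically)
    k = [int(np.flatnonzero(A[i])[0]) for i in range(r)]
    if any(k[i] >= k[i + 1] for i in range(r - 1)):   # k(1) < ... < k(r)
        return False
    for i in range(r):
        if np.any(A[i + 1:, k[i]]):                   # (iv): zeros below each pivot
            return False
        if reduced and (A[i, k[i]] != 1 or np.any(A[:i, k[i]])):
            return False                              # pivot 1 and zeros above it
    return True

print(is_echelon([[1, -5, 4], [0, 0, -1], [0, 0, 0]]))              # True
print(is_echelon([[1, 0, 0], [0, 1, 0], [0, 0, 1]], reduced=True))  # True
```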


Remark
(i) Echelon matrices are also called Gauss matrices, and reduced echelon matrices are also called Gauss-Jordan matrices.
(ii) The integer r occurring in the definition is called the number of steps. The columns C_{k(s)} are called the step columns.
(iii) The (i, k(i))-th entries, which are nonzero in G(A) (and equal to 1 in J(A)), are called pivots.
(iv) Let ν(A) denote the set of indices j such that the j-th column of A is not a step column.

45/51


Theorem (Algorithm for GEM)
Let A = ((a_{ij})) be an m × n matrix. The following algorithm will convert A into an echelon matrix Z.

Step 0 Put X = A and X = ((x_{i,j})).
Step 1 If X is the zero matrix or empty, then declare Z = A and stop. Else go to Step 2.
Step 2 Look for a nonzero entry in the ‘first column’ of X. If you cannot find one, cut down the ‘first column’ of X, call the new matrix X, and go back to Step 1. Otherwise find the first nonzero entry in the ‘first column’, say x_{i,1}.
Step 3 Swap the first row and the i-th row of X. (Swap the corresponding rows of A also.)
Step 4 Add −(x_{j,1}/x_{1,1})R_1 to R_j for j ≥ 2, and perform the corresponding row operations on the matrix A also.
Step 5 Cut down both the first column and the first row of X, call the new matrix X, and go back to Step 1.

46/51
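The steps above can be transcribed almost literally. In the sketch below (NumPy; ‘cutting down’ rows and columns is simulated by moving a row/column offset rather than actually deleting them, and the function name is ours) the function returns an echelon form of its input. Note that on the matrix of Example 1 it does not swap rows, since x_{1,1} = 2 is already nonzero, so the echelon matrix it produces differs from the hand computation while still satisfying the definition.

```python
import numpy as np

def gem(A, tol=1e-12):
    """Row-reduce A to an echelon matrix following Steps 0-5 (a sketch)."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    row, col = 0, 0                   # top-left corner of the current submatrix X
    while row < m and col < n:        # Step 1: stop when X is empty (or all zero)
        column = A[row:, col]
        nz = np.flatnonzero(np.abs(column) > tol)
        if nz.size == 0:              # Step 2: no nonzero entry -> drop this column
            col += 1
            continue
        i = row + nz[0]               # first nonzero entry of the 'first column'
        A[[row, i]] = A[[i, row]]     # Step 3: swap rows (in A itself)
        for j in range(row + 1, m):   # Step 4: sweep the column below the pivot
            A[j] -= (A[j, col] / A[row, col]) * A[row]
        row += 1                      # Step 5: cut down the first row and column of X
        col += 1
    return A

Z = gem([[2, -5, 4, -3],
         [1, -2, 1,  5],
         [1, -4, 6, 10]])
print(Z)
```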


Theorem (Algorithm for GJEM): Let A be an echelon matrix. The following row operations on A convert it into a reduced echelon matrix:

For 1 ≤ i ≤ r, where r is the number of steps in A, let ai,k(i) be the first non-zero entry in the i-th row.

→ Divide the i-th row of A by ai,k(i).

→ Now for each 1 ≤ i ≤ r add −aj,k(i)Ri to Rj for all 1 ≤ j < i.

Theorem: Given any matrix A there exists an invertible matrix R such that RA = G(A) (respectively, = J(A)) is an echelon (respectively, reduced echelon) matrix.

47/51
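
(Illustration, not from the slides.) Assuming the same list-of-lists representation as in the gem sketch above, the two row operations of GJEM can be written as follows; the helper name gjem and the tolerance are mine.

def gjem(Z, tol=1e-12):
    """Convert an echelon matrix Z into a reduced echelon matrix:
    scale each non-zero row by its leading entry a_{i,k(i)}, then
    clear the entries above every step column."""
    A = [row[:] for row in Z]
    # locate the step column k(i) of each non-zero row
    steps = [(i, next(j for j, v in enumerate(row) if abs(v) > tol))
             for i, row in enumerate(A)
             if any(abs(v) > tol for v in row)]
    for i, k in steps:
        lead = A[i][k]
        A[i] = [v / lead for v in A[i]]    # divide the i-th row by a_{i,k(i)}
        for j in range(i):                 # add -a_{j,k(i)} R_i to R_j, j < i
            factor = A[j][k]
            A[j] = [vj - factor * vi for vj, vi in zip(A[j], A[i])]
    return A

Composing the two sketches, gjem(gem(A)) plays the role of the Gauss-Jordan form J(A) in the theorem above (with R the product of the corresponding elementary matrices).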


Significance of the number of Steps in GEM

Let r be the number of steps that occur in the GEM for a given m × n matrix A.

(a) If r < m there is a possibility that the system Ax = b may not have any solution. This depends on the vector b. Indeed, it is always possible to find such a b.

(b) Suppose now r = m. (Observe that this automatically implies that n ≥ m = r.) Then, irrespective of the vector b, we know that Ax = b has at least one solution. Indeed, in this case, we get infinitely many solutions unless

(c) r = m = n, in which case Ax = b has precisely one solution for all b ∈ ℝ^m.

48/51
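
A quick numerical illustration of (a) and (c), not from the slides: NumPy's matrix_rank plays the role of r, and comparing the rank of A with that of the augmented matrix [A∣b] detects the inconsistent case in (a). The matrices below are made up for the example.

import numpy as np

A = np.array([[1., 2.], [2., 4.], [0., 1.]])     # m = 3, n = 2
r = np.linalg.matrix_rank(A)                     # r = 2 < m: case (a)

for b in (np.array([1., 3., 0.]),                # inconsistent choice of b
          np.array([1., 2., 0.])):               # consistent choice of b
    consistent = np.linalg.matrix_rank(np.column_stack([A, b])) == r
    print("b =", b, "-> consistent:", consistent)

B = np.array([[1., 2.], [3., 4.]])               # r = m = n = 2: case (c)
print(np.linalg.solve(B, np.array([1., 0.])))    # the unique solution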


The number of steps r is called the row rank of A.

From GJEM, we know that

J(A) = RA

where R is an invertible matrix. On the other hand, when r = m = n we also know that J(A) = Idn. Therefore

A = (R−1R)A = R−1(RA) = R−1J(A) = R−1.

This is the same as saying A is invertible and A−1 = R.

GJEM applied to find the inverse: Given an n × n matrix A, apply GJEM to

[A∣Idn].

If the end result is

[Idn∣R]

then declare R = A−1. Otherwise declare that A is not invertible.

49/51
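
(Illustration, not from the slides.) The recipe can be sketched in Python by reusing the gem and gjem helpers introduced earlier; the function name inverse_via_gjem and the tolerance are my own.

def inverse_via_gjem(A, tol=1e-12):
    """Apply GJEM to [A | Id_n]; if the left block becomes Id_n,
    the right block is A^{-1}, otherwise A is not invertible."""
    n = len(A)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    augmented = [list(row) + identity[i] for i, row in enumerate(A)]
    reduced = gjem(gem(augmented, tol), tol)
    left_is_identity = all(
        abs(reduced[i][j] - (1.0 if i == j else 0.0)) < tol
        for i in range(n) for j in range(n))
    if left_is_identity:
        return [row[n:] for row in reduced]      # this block is R = A^{-1}
    return None                                  # A is not invertible

For instance, inverse_via_gjem([[2.0, 1.0], [1.0, 1.0]]) returns [[1.0, -1.0], [-1.0, 2.0]].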


We shall call G(A) the Gauss form of A and J(A) the Gauss-Jordan form of A.

Observe that by the very definition, which is algorithmic, G(A) and J(A) are well defined.

HOW TO SOLVE A SYSTEM OF LINEAR EQUATIONS Ax = b.

Step 1 Write down the augmented matrix [A∣b].

Step 2 Apply GJEM and obtain the Gauss-Jordan form of it: J([A∣b]) = [J(A)∣b′]. Let r be the number of nonzero rows of J(A) and let k(1) < . . . < k(r) be the indices corresponding to the step columns.

Step 3 If b′i ≠ 0 for any i > r, declare that the system is inconsistent, i.e., the system has no solution. Otherwise proceed.

50/51
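
(Illustration, not from the slides.) Steps 1-3 can be phrased as a small check built on the gem/gjem sketches above; the function names are made up.

def gauss_jordan_augmented(A, b, tol=1e-12):
    """Steps 1-2: write down [A | b] and compute its Gauss-Jordan form."""
    augmented = [list(row) + [bi] for row, bi in zip(A, b)]
    return gjem(gem(augmented, tol), tol)

def is_consistent(A, b, tol=1e-12):
    """Step 3: the system is inconsistent exactly when some reduced row is
    zero in the A-part but has a non-zero entry b'_i in the last column."""
    n = len(A[0])
    for row in gauss_jordan_augmented(A, b, tol):
        if all(abs(v) < tol for v in row[:n]) and abs(row[n]) > tol:
            return False
    return True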


(contd...)

Step 4 If r = n (the number of columns of A) declare that the system has a unique solution, viz.,

(x1, . . . , xn) = (b′1, . . . , b′n).

Otherwise proceed.

Step 5 In [J(A)∣b′] delete all the zero rows.

Step 6 Let the indices corresponding to the non-step columns be l(1) < l(2) < . . . < l(s), (s + r = n). Transfer all the non-step columns to the right side and change their sign to obtain a matrix of the form [Ir∣B′], where the first column of B′ is b′.

The general solution is given by putting xl(i) = ti, 1 ≤ i ≤ s, and (xk(1), . . . , xk(r))^T = B′(1, t1, . . . , ts)^T.

51/51
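
(Illustration, not from the slides.) With the same caveats as before, the remaining steps can be sketched as follows: the function returns the step columns k(i), the non-step columns l(i) and the matrix B′ of Step 6, so that every solution is obtained by choosing parameters t1, . . . , ts, setting xl(i) = ti and reading off (xk(1), . . . , xk(r))^T = B′(1, t1, . . . , ts)^T.

def general_solution(A, b, tol=1e-12):
    """Steps 4-6: return (step_cols, free_cols, B_prime), or None when the
    system is inconsistent.  B_prime has b' as its first column and the
    negated non-step columns of J(A) as its remaining columns."""
    if not is_consistent(A, b, tol):
        return None
    n = len(A[0])
    reduced = gauss_jordan_augmented(A, b, tol)
    rows, step_cols = [], []
    for row in reduced:                          # Step 5: drop the zero rows
        k = next((j for j in range(n) if abs(row[j]) > tol), None)
        if k is not None:
            rows.append(row)
            step_cols.append(k)
    free_cols = [j for j in range(n) if j not in step_cols]
    # Step 6: B' = [b' | -(non-step columns)], one row per step
    B_prime = [[row[n]] + [-row[j] for j in free_cols] for row in rows]
    return step_cols, free_cols, B_prime

def solution_from_parameters(step_cols, free_cols, B_prime, t):
    """Build the solution x for one choice of the parameters t = (t1, ..., ts)."""
    x = [0.0] * (len(step_cols) + len(free_cols))
    for j, tj in zip(free_cols, t):
        x[j] = tj                                # x_{l(i)} = t_i
    params = [1.0] + list(t)
    for i, k in enumerate(step_cols):            # x_{k(i)} = (B' (1, t)^T)_i
        x[k] = sum(B_prime[i][c] * params[c] for c in range(len(params)))
    return x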
