
Numerical Ch17


Least-Square Regression

Chapter 17

July 2013

Physics I/II, English 123, Statics, Dynamics, Strength, Structure I/II, C++, Java, Data, Algorithms, Numerical, Economy

info@eng-hs.com, eng-hs.com, eng-hs.net


Where substantial error is associated with data, polynomial interpolation is inappropriate and may yield unsatisfactory results when used to predict intermediate values. Experimental data is often of this type. For example, the following figure (a) shows seven experimentally derived data points exhibiting significant variability. The data indicates that higher values of y are associated with higher values of x.

Now, if a sixth-order interpolating polynomial is fitted to this data (fig. b), it will pass exactly through all of the points. However, because of the variability in the data, the curve oscillates widely in the interval between the points. In particular, the interpolated values at x = 1.5 and x = 6.5 appear to be well beyond the range suggested by the data.

A more appropriate strategy is to derive an approximating function that fits the shape of the data. Fig. (c) illustrates how a straight line can be used to generally characterize the trend of the data without passing through any particular point.

One way to determine the line in figure (c) is to look at the plotted data and then sketch a "best" line through the points. Such approaches are deficient because they are arbitrary. That is, unless the points define a perfect straight


line (in which case, interpolation would be appropriate), different analysts would draw different lines.

To avoid this, some criterion must be devised to establish a basis for the fit. One way to do this is to derive a curve that minimizes the discrepancy between the data points


and the curve. One technique for doing this is called least-squares regression.

17.1 Linear Regression

The simplest example of a least-squares approximation is fitting a straight line to a set of paired observations: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.

The mathematical expression for the straight line is

$$y = a_0 + a_1 x + e$$

where $a_0$ and $a_1$ are coefficients representing the intercept and the slope, respectively, and $e$ is the error between the model and the observations, which can be represented by rearranging the previous equation as

$$e = y - a_0 - a_1 x$$

Thus, the error is the discrepancy between the true value of y and the approximate value, $a_0 + a_1 x$, predicted by the linear equation.

17.1.1 Criteria for the "Best" Fit

One strategy for fitting a "best" line through the data would be to minimize the sum of the residual errors for all the available data, as in


$$\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)$$

where n = total number of points. However, this is an inadequate criterion, as illustrated by the next figure, which shows the fit of a straight line to two points.


Obviously, the best fit is the line connecting the points. However, any straight line passing through the midpoint of the connecting line results in a minimum value of the previous equation equal to zero because the errors cancel.

Therefore, another logical criterion might be to minimize the sum of the absolute values of the discrepancies, as in

$$\sum_{i=1}^{n} |e_i| = \sum_{i=1}^{n} |y_i - a_0 - a_1 x_i|$$

The previous fig. (b) demonstrates why this criterion is also inadequate. For the four points shown, any straight line falling within the dashed lines will minimize the sum of the absolute values. Thus, this criterion also does not yield a unique best fit.

A third strategy for fitting a best line is the minimax criterion. In this technique, the line is chosen that minimizes the maximum distance that an individual point falls from the line. As shown in the previous fig. (c), this strategy is ill-suited for regression because it gives undue influence to an outlier, that is, a single point with a large error.
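The cancellation argument for the first criterion can be checked numerically. The sketch below is illustrative only: the two points and the helper name `residual_sums` are assumptions introduced here, not taken from the source figures.

```python
def residual_sums(points, a0, a1):
    """Return (sum of residuals, sum of |residuals|, sum of squared residuals)
    for the line y = a0 + a1*x."""
    residuals = [y - (a0 + a1 * x) for x, y in points]
    return (sum(residuals),
            sum(abs(e) for e in residuals),
            sum(e * e for e in residuals))

# Two hypothetical points; the line through both of them is y = 1 + 2x.
points = [(1.0, 3.0), (3.0, 7.0)]
mid_x, mid_y = 2.0, 5.0  # midpoint of the segment connecting the points

# Try lines with different slopes, all forced through the midpoint.
for slope in (2.0, 0.0, -1.0, 5.0):
    intercept = mid_y - slope * mid_x
    s, s_abs, s_sq = residual_sums(points, intercept, slope)
    print(f"slope={slope:5.1f}  sum e={s:6.2f}  "
          f"sum |e|={s_abs:6.2f}  sum e^2={s_sq:6.2f}")
```

Every line through the midpoint gives a zero sum of residuals, so that criterion cannot distinguish between them, whereas the sum of squares is zero only for the line that actually connects the points.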


A strategy that overcomes the shortcomings of the previous approaches is to minimize the sum of the squares of the residuals between the measured y and the y calculated with the linear model:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_{i,\text{measured}} - y_{i,\text{model}})^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$

This criterion has a number of advantages, including the fact that it yields a unique line for a given set of data.

17.1.2 Least-Squares Fit of a Straight Line

To determine values of $a_0$ and $a_1$, the previous equation is differentiated with respect to each coefficient:

$$\frac{\partial S_r}{\partial a_0} = -2 \sum (y_i - a_0 - a_1 x_i)$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum \left[(y_i - a_0 - a_1 x_i)\, x_i\right]$$

Note that we have simplified the summation symbols; unless otherwise indicated, all summations are from i = 1 to n. Setting these derivatives equal to zero will result in a minimum $S_r$:

$$0 = \sum y_i - \sum a_0 - \sum a_1 x_i$$


$$0 = \sum y_i x_i - \sum a_0 x_i - \sum a_1 x_i^2$$

Now, realizing that $\sum a_0 = n a_0$, we can express the equations as a set of two simultaneous linear equations with two unknowns ($a_0$ and $a_1$):

$$n a_0 + \left(\sum x_i\right) a_1 = \sum y_i \qquad (17.4)$$

$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i$$

These are called the normal equations. They can be solved simultaneously for

$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$

This result can then be used in conjunction with Eq. (17.4) to solve for

$$a_0 = \bar{y} - a_1 \bar{x}$$

where $\bar{y}$ and $\bar{x}$ are the means of y and x, respectively.
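The closed-form expressions for the slope and intercept translate almost directly into code. The following is a minimal sketch, not a definitive implementation; the function name `fit_line` and the sample data are assumptions made for illustration.

```python
def fit_line(x, y):
    """Least-squares straight line y = a0 + a1*x from the normal equations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    a1 = (n * sxy - sx * sy) / denom   # slope
    a0 = sy / n - a1 * sx / n          # intercept: ybar - a1 * xbar
    return a0, a1

# Hypothetical data lying exactly on y = 1 + 2x
a0, a1 = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a0, a1)
```

Only four running sums are needed, so the data can be processed in a single pass without storing a matrix.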

Example 17.1 Linear Regression

Problem Statement. Fit a straight line to the x and y values in the first two columns of the next table.


Solution.

The following quantities can be computed:

$$n = 7 \qquad \sum x_i y_i = 119.5 \qquad \sum x_i^2 = 140$$

$$\sum x_i = 28 \qquad \bar{x} = \frac{28}{7} = 4$$

$$\sum y_i = 24 \qquad \bar{y} = \frac{24}{7} = 3.428571$$

Using the previous two equations,

$$a_1 = \frac{7(119.5) - 28(24)}{7(140) - (28)^2} = 0.8392857$$

$$a_0 = 3.428571 - 0.8392857(4) = 0.07142857$$

Therefore, the least-squares fit is

$$y = 0.07142857 + 0.8392857x$$

The line, along with the data, is shown in the first figure (c).
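The arithmetic of this example can be reproduced from the quoted sums alone. This is a small check rather than a general routine; the variable names are chosen here for illustration.

```python
# Sums quoted in the example
n = 7
sum_x, sum_y = 28.0, 24.0
sum_xx, sum_xy = 140.0, 119.5

# Slope from the normal equations, then intercept from the means
a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
a0 = sum_y / n - a1 * sum_x / n

print(f"a1 = {a1:.7f}")
print(f"a0 = {a0:.8f}")
```

Both values should agree with the coefficients quoted in the example to the number of digits shown there.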

17.1.3 Quantification of Error of Linear Regression

Any line other than the one computed in the previous example results in a larger sum of the squares of the residuals. Thus, the line is unique and in terms of our chosen criterion is a "best" line through the points.


A number of additional properties of this fit can be explained by examining more closely the way in which residuals were computed. Recall that the sum of the squares is defined as

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$

Notice the similarity between the previous equation and

$$S_t = \sum (y_i - \bar{y})^2$$

The similarity can be extended further for cases where (1) the spread of the points around the line is of similar magnitude along the entire range of the data and (2) the distribution of these points about the line is normal.

It can be demonstrated that if these criteria are met, least-squares regression will provide the best (that is, the most likely) estimates of $a_0$ and $a_1$.

In addition, if these criteria are met, a "standard deviation" of the regression line can be determined as

$$s_{y/x} = \sqrt{\frac{S_r}{n-2}}$$

where $s_{y/x}$ is called the standard error of the estimate. The subscript notation "y/x" designates that the error is for a predicted value of y corresponding to a particular value of x.


Also, notice that we now divide by $n - 2$ because two data-derived estimates, $a_0$ and $a_1$, were used to compute $S_r$; thus, we have lost two degrees of freedom.

Another justification for dividing by $n - 2$ is that there is no such thing as the "spread of data" around a straight line connecting two points.

Like the standard deviation, the standard error of the estimate quantifies the spread of the data. However, $s_{y/x}$ quantifies the spread around the regression line, as shown in the next figure (b), in contrast to the original standard deviation $s_y$ that quantified the spread around the mean (fig. (a)).

The above concepts can be used to quantify the "goodness" of our fit. This is particularly useful for comparison of several regressions (next figure). To do this, we return to the original data and determine the total sum of the squares around the mean for


the dependent variable (in our case, y). This quantity is designated $S_t$. This is the magnitude of the residual error associated with the dependent variable prior to regression. After performing the regression, we can compute $S_r$, the sum of the squares of the residuals around the regression line. This characterizes the residual error that remains after the regression. It is, therefore, sometimes called the unexplained sum of the squares.

The difference between the two quantities, $S_t - S_r$, quantifies the improvement or error reduction due to


describing the data in terms of a straight line rather than as an average value. Because the magnitude of this quantity is scale-dependent, the difference is normalized to $S_t$ to yield

$$r^2 = \frac{S_t - S_r}{S_t}$$

where $r^2$ is called the coefficient of determination and r is the correlation coefficient ($= \sqrt{r^2}$).

For a perfect fit, $S_r = 0$ and $r = r^2 = 1$, signifying that the line explains 100 percent of the variability of the data. For $r = r^2 = 0$, $S_r = S_t$ and the fit represents no improvement.

An alternative formulation for r that is more convenient for computer implementations is

$$r = \frac{n \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n \sum x_i^2 - \left(\sum x_i\right)^2}\;\sqrt{n \sum y_i^2 - \left(\sum y_i\right)^2}}$$
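The sums-based formula for r can be sketched as follows. The function name `correlation` and the sample data are assumptions for illustration only.

```python
import math

def correlation(x, y):
    """Correlation coefficient r computed from running sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(x, y))
    numerator = n * sxy - sx * sy
    denominator = (math.sqrt(n * sxx - sx * sx)
                   * math.sqrt(n * syy - sy * sy))
    return numerator / denominator

# Hypothetical data: one perfectly increasing and one perfectly decreasing trend
x = [1.0, 2.0, 3.0, 4.0]
print(correlation(x, [2.0, 4.0, 6.0, 8.0]))
print(correlation(x, [8.0, 6.0, 4.0, 2.0]))
```

Unlike taking the square root of $r^2$, this form keeps the sign of the trend: it is negative when y decreases as x increases.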

Example 17.2 Estimate of Errors for the Linear Least-Squares Fit


Problem Statement.

Compute the total standard deviation, the standard error of the estimate, and the correlation coefficient for the data in Example 17.1.

Solution.

The summations are performed and presented in the table of the previous example. The standard deviation is

$$s_y = \sqrt{\frac{S_t}{n-1}} = \sqrt{\frac{22.7143}{7-1}} = 1.9457$$

and the standard error of the estimate is

$$s_{y/x} = \sqrt{\frac{S_r}{n-2}} = \sqrt{\frac{2.9911}{7-2}} = 0.7735$$

Thus, because $s_{y/x} < s_y$, the linear regression model has merit. The extent of the improvement is quantified by

$$r^2 = \frac{S_t - S_r}{S_t} = \frac{22.7143 - 2.9911}{22.7143} = 0.868$$

or

$$r = \sqrt{0.868} = 0.932$$

These results indicate that 86.8 percent of the original uncertainty has been explained by the linear model.
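The three statistics in this example follow directly from $S_t$, $S_r$ and n. The short check below uses only the values quoted in the example; the variable names are illustrative.

```python
import math

n = 7
St = 22.7143   # total sum of squares around the mean
Sr = 2.9911    # sum of squared residuals around the regression line

s_y = math.sqrt(St / (n - 1))    # standard deviation
s_yx = math.sqrt(Sr / (n - 2))   # standard error of the estimate
r2 = (St - Sr) / St              # coefficient of determination
r = math.sqrt(r2)                # correlation coefficient (magnitude)

print(f"s_y   = {s_y:.4f}")
print(f"s_y/x = {s_yx:.4f}")
print(f"r^2   = {r2:.3f}")
print(f"r     = {r:.3f}")
```

The computed values should match those quoted above to about three decimal places; small differences in the last digit can arise from rounding of the tabulated sums.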

17.1.5 Linearization of Nonlinear Relationships


Linear regression provides a powerful technique for fitting a best line to data. However, it is predicated on the assumption that the relationship between the dependent and independent variables is linear.

This is not always the case, and the first step in any regression analysis should be to plot and visually inspect the data to determine whether a linear model applies. For example, the next figure shows some data that is obviously curvilinear. In some cases, techniques such as polynomial regression are appropriate. In others, transformations can be used to express the data in a form that is compatible with linear regression.


One example is the exponential model

$$y = \alpha_1 e^{\beta_1 x} \qquad (17.12)$$

where $\alpha_1$ and $\beta_1$ are constants. As shown in the next figure, the equation represents a nonlinear relationship (for $\beta_1 \neq 0$) between x and y.

Another example of a nonlinear model is the simple power equation

$$y = \alpha_2 x^{\beta_2} \qquad (17.13)$$


where $\alpha_2$ and $\beta_2$ are constant coefficients. As shown in the previous figure, the equation (for $\beta_2 \neq 0$ or 1) is nonlinear.

A third example of a nonlinear model is the saturation-growth-rate equation

$$y = \alpha_3 \frac{x}{\beta_3 + x} \qquad (17.14)$$

where $\alpha_3$ and $\beta_3$ are constant coefficients. This model also represents a nonlinear relationship between y and x, one that levels off as x increases.

Rather than fitting these equations to data directly with nonlinear regression, a simpler alternative is to use mathematical manipulations to transform the equations into a linear form. Then, simple linear regression can be employed to fit the equations to data.

Equation (17.12) can be linearized by taking its natural logarithm to yield

$$\ln y = \ln \alpha_1 + \beta_1 x \ln e$$

But because $\ln e = 1$,

$$\ln y = \ln \alpha_1 + \beta_1 x$$

Thus, a plot of ln y versus x will yield a straight line with a slope of $\beta_1$ and an intercept of $\ln \alpha_1$ (previous fig. d).


Equation (17.13) is linearized by taking its base-10 logarithm to give

$$\log y = \beta_2 \log x + \log \alpha_2$$

Thus, a plot of log y versus log x will yield a straight line with a slope of $\beta_2$ and an intercept of $\log \alpha_2$ (previous fig. e).

Equation (17.14) is linearized by inverting it to give

$$\frac{1}{y} = \frac{\beta_3}{\alpha_3} \frac{1}{x} + \frac{1}{\alpha_3}$$

Thus, a plot of 1/y versus 1/x will be linear, with a slope of $\beta_3/\alpha_3$ and an intercept of $1/\alpha_3$ (previous fig. f).

In their transformed forms, these models can be fitted by linear regression to evaluate the constant coefficients. They can then be transformed back to their original state and used for predictive purposes.

Example 17.4 illustrates this procedure for Eq. (17.13).

Example 17.4 Linearization of a Power Equation

Problem Statement.

Fit Eq. (17.13) to the data in the next table using a logarithmic transformation of the data.


Solution.

The next figure (a) is a plot of the original data in its untransformed state. Figure (b) shows the plot of the transformed data. A linear regression of the log-transformed data yields the result

$$\log y = 1.75 \log x - 0.300$$

Thus, the intercept, $\log \alpha_2$, equals $-0.300$, and therefore, by taking the antilogarithm, $\alpha_2 = 10^{-0.3} = 0.5$. The slope is $\beta_2 = 1.75$. Consequently, the power equation is

$$y = 0.5 x^{1.75}$$

This curve, as plotted in the next figure (a), indicates a good fit.
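The log-transform procedure can be sketched in a few lines. Because the example's data table is not reproduced here, the sketch uses synthetic, noise-free data generated from the fitted equation itself; the function name `fit_power` and the x values are assumptions.

```python
import math

def fit_power(x, y):
    """Fit y = alpha * x**beta by linear regression on (log10 x, log10 y).

    All x and y values must be positive for the logarithms to exist.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    sx, sy = sum(lx), sum(ly)
    sxx = sum(v * v for v in lx)
    sxy = sum(a * b for a, b in zip(lx, ly))
    beta = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope
    log_alpha = sy / n - beta * sx / n                 # intercept
    return 10.0 ** log_alpha, beta                     # undo the transform

# Synthetic data generated from y = 0.5 * x**1.75 (illustrative only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.5 * v ** 1.75 for v in x]
alpha, beta = fit_power(x, y)
print(alpha, beta)
```

With real, noisy measurements the recovered coefficients minimize the squared error in log space, which is not quite the same as minimizing it in the original units.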


17.1.6 General Comments on Linear Regression

We have focused on the simple derivation and practical use of equations to fit data. Some statistical assumptions that are inherent in the linear least-squares procedures are

1. Each x has a fixed value; it is not random and is known without error.
2. The y values are independent random variables and all have the same variance.
3. The y values for a given x must be normally distributed.

Such assumptions are relevant to the proper derivation and use of regression. For example, the first assumption means that (1) the x values must be error-free and (2) the regression of y versus x is not the same as x versus y.


17.2 Polynomial Regression

Some engineering data, although exhibiting a marked pattern, is poorly represented by a straight line. For these cases, a curve would be better suited to fit the data. One method to accomplish this objective is to use transformations. Another alternative is to fit polynomials to the data using polynomial regression.

The least-squares procedure can be readily extended to fit the data to a higher-order polynomial. For example, suppose that we fit a second-order polynomial or quadratic:

$$y = a_0 + a_1 x + a_2 x^2 + e$$

For this case the sum of the squares of the residuals is

$$S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - a_2 x_i^2)^2$$

Following the procedure of the previous section, we take the derivative of the previous equation with respect to each of the unknown coefficients of the polynomial, as in

$$\frac{\partial S_r}{\partial a_0} = -2 \sum (y_i - a_0 - a_1 x_i - a_2 x_i^2)$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum x_i (y_i - a_0 - a_1 x_i - a_2 x_i^2)$$

$$\frac{\partial S_r}{\partial a_2} = -2 \sum x_i^2 (y_i - a_0 - a_1 x_i - a_2 x_i^2)$$


These equations can be set equal to zero and rearranged to develop the following set of normal equations:

$$(n)\, a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 = \sum y_i$$

$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 = \sum x_i y_i$$

$$\left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 = \sum x_i^2 y_i$$

where all summations are from i = 1 through n. Note that the above three equations are linear and have three unknowns: $a_0$, $a_1$, and $a_2$. The coefficients of the unknowns can be calculated directly from the observed data.

For this case, we see that the problem of determining a least-squares second-order polynomial is equivalent to solving a system of three simultaneous linear equations.


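Example: Polynomial Regression

Problem Statement.

Fit a second-order polynomial to the data in the first two columns of the next table.

Solution.

From the given data,

$$m = 2 \qquad \sum x_i = 15 \qquad \sum x_i^4 = 979$$

$$n = 6 \qquad \sum y_i = 152.6 \qquad \sum x_i y_i = 585.6$$

$$\bar{x} = 2.5 \qquad \sum x_i^2 = 55 \qquad \sum x_i^2 y_i = 2488.8$$

$$\bar{y} = 25.433 \qquad \sum x_i^3 = 225$$

Therefore, the simultaneous linear equations are

$$\begin{bmatrix} 6 & 15 & 55 \\ 15 & 55 & 225 \\ 55 & 225 & 979 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 152.6 \\ 585.6 \\ 2488.8 \end{Bmatrix}$$

Solving these equations through a technique such as Gauss elimination gives $a_0 = 2.47857$, $a_1 = 2.35929$, and $a_2 = 1.86071$.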
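The 3 x 3 system of this example can be solved with a short Gauss elimination routine. This is a minimal sketch using partial pivoting; the helper name `gauss_solve` is an assumption, and the matrix and right-hand side are the ones quoted in the example.

```python
def gauss_solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Choose the row with the largest pivot candidate in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

# Normal equations quoted in the example
A = [[6.0, 15.0, 55.0],
     [15.0, 55.0, 225.0],
     [55.0, 225.0, 979.0]]
b = [152.6, 585.6, 2488.8]

a0, a1, a2 = gauss_solve(A, b)
print(f"a0 = {a0:.5f}, a1 = {a1:.5f}, a2 = {a2:.5f}")
```

Normal-equation matrices for higher-order polynomials tend to become ill-conditioned, which is one reason pivoting is included even in a small sketch like this.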


Solution (continued).

Therefore, the least-squares quadratic equation for this case is

$$y = 2.47857 + 2.35929x + 1.86071x^2$$

The standard error of the estimate based on the regression polynomial is

$$s_{y/x} = \sqrt{\frac{S_r}{n-(m+1)}} = \sqrt{\frac{3.74657}{6-3}} = 1.12$$

The coefficient of determination is

$$r^2 = \frac{S_t - S_r}{S_t} = \frac{2513.39 - 3.74657}{2513.39} = 0.99851$$

and the correlation coefficient is r = 0.99925.

These results indicate that 99.851 percent of the original uncertainty has been explained by the model. This result supports the conclusion that the quadratic equation represents an excellent fit, as is also evident from the next figure.
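The goodness-of-fit measures generalize to any polynomial order m by changing only the degrees of freedom. The sketch below assumes a helper named `fit_statistics` (not from the source) and feeds it the sums of squares quoted in this example.

```python
import math

def fit_statistics(St, Sr, n, m):
    """Return (s_y/x, r^2, r) for an m-th order fit to n data points."""
    dof = n - (m + 1)   # degrees of freedom left after fitting m+1 coefficients
    if dof <= 0:
        raise ValueError("need more data points than coefficients")
    s_yx = math.sqrt(Sr / dof)
    r2 = (St - Sr) / St
    return s_yx, r2, math.sqrt(r2)

# Values quoted in the example: quadratic (m = 2) fit to n = 6 points
s_yx, r2, r = fit_statistics(St=2513.39, Sr=3.74657, n=6, m=2)
print(f"s_y/x = {s_yx:.2f}, r^2 = {r2:.5f}, r = {r:.5f}")
```

For a straight line, m = 1 and the denominator reduces to the n - 2 used in linear regression.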


17.3 Multiple Linear Regression

A useful extension of linear regression is the case where y is a linear function of two or more independent variables. For example, y might be a linear function of $x_1$ and $x_2$, as in

$$y = a_0 + a_1 x_1 + a_2 x_2 + e$$

Such an equation is particularly useful when fitting experimental data where the variable being studied is often a function of two other variables. For this two-dimensional case, the regression "line" becomes a "plane" (next figure).


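As with the previous cases, the "best" values of the coefficients are determined by setting up the sum of the squares of the residuals,

$$S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_{1i} - a_2 x_{2i})^2$$

and differentiating with respect to each of the unknown coefficients:

$$\frac{\partial S_r}{\partial a_0} = -2 \sum (y_i - a_0 - a_1 x_{1i} - a_2 x_{2i})$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum x_{1i} (y_i - a_0 - a_1 x_{1i} - a_2 x_{2i})$$

$$\frac{\partial S_r}{\partial a_2} = -2 \sum x_{2i} (y_i - a_0 - a_1 x_{1i} - a_2 x_{2i})$$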


The coefficients yielding the minimum sum of the squares of the residuals are obtained by setting the partial derivatives equal to zero and expressing the result in matrix form as

$$\begin{bmatrix} n & \sum x_{1i} & \sum x_{2i} \\ \sum x_{1i} & \sum x_{1i}^2 & \sum x_{1i} x_{2i} \\ \sum x_{2i} & \sum x_{1i} x_{2i} & \sum x_{2i}^2 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} \sum y_i \\ \sum x_{1i} y_i \\ \sum x_{2i} y_i \end{Bmatrix}$$

Example 17.6 Multiple Linear Regression

Problem Statement.

The following data was calculated from the equation $y = 5 + 4x_1 - 3x_2$:

Use multiple linear regression to fit this data.

Solution.


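The summations required to develop the previous equation are computed from the data. The result is

$$\begin{bmatrix} 6 & 16.5 & 14 \\ 16.5 & 76.25 & 48 \\ 14 & 48 & 54 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 54 \\ 243.5 \\ 100 \end{Bmatrix}$$

which can be solved using a method such as Gauss elimination for

$$a_0 = 5 \qquad a_1 = 4 \qquad a_2 = -3$$

which is consistent with the original equation from which the data was derived.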
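The whole procedure, from data to coefficients, can be sketched as below. The sample points `x1` and `x2` are an assumption: they are chosen here so that their sums reproduce the matrix quoted above, and y is generated from the stated equation. The helper names and the use of Cramer's rule (convenient for a 3 x 3 system, in place of Gauss elimination) are also choices made for this illustration.

```python
def normal_equations(x1, x2, y):
    """Build the 3x3 normal-equation matrix and right-hand side."""
    n = len(y)
    s = lambda u, v: sum(a * b for a, b in zip(u, v))  # sum of products
    A = [[n,       sum(x1),   sum(x2)],
         [sum(x1), s(x1, x1), s(x1, x2)],
         [sum(x2), s(x1, x2), s(x2, x2)]]
    b = [sum(y), s(x1, y), s(x2, y)]
    return A, b

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def cramer3(A, b):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(A)
    if d == 0:
        raise ValueError("matrix is singular")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in A]
        for r in range(3):
            replaced[r][col] = b[r]
        solution.append(det3(replaced) / d)
    return solution

# Assumed sample points; y follows y = 5 + 4*x1 - 3*x2 exactly
x1 = [0.0, 2.0, 2.5, 1.0, 4.0, 7.0]
x2 = [0.0, 1.0, 2.0, 3.0, 6.0, 2.0]
y = [5 + 4 * u - 3 * v for u, v in zip(x1, x2)]

A, b = normal_equations(x1, x2, y)
a0, a1, a2 = cramer3(A, b)
print(A, b)
print(a0, a1, a2)
```

Because the data is noise-free, the regression recovers the generating coefficients; with measured data the same steps give the least-squares plane.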

The foregoing two-dimensional case can be easily extended to m dimensions, as in

$$y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m + e$$

where the standard error is formulated as

$$s_{y/x} = \sqrt{\frac{S_r}{n-(m+1)}}$$

and the coefficient of determination is computed as in Eq. (17.10).


Although there may be certain cases where a variable is linearly related to two or more other variables, multiple linear regression has additional utility in the derivation of power equations of the general form

$$
y = a_0 \, x_1^{a_1} x_2^{a_2} \cdots x_m^{a_m}
$$

Such equations are extremely useful when fitting experimental data.

To use multiple linear regression, the equation is transformed by taking its logarithm to yield

$$
\log y = \log a_0 + a_1 \log x_1 + a_2 \log x_2 + \cdots + a_m \log x_m
$$

This transformation is similar in spirit to the one used to fit a power equation when y is a function of a single variable x.
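The log transformation can be sketched as follows: take base-10 logarithms of y and of every x, fit a multiple linear regression in the transformed variables, and recover a0 as 10 raised to the fitted intercept. Everything here is illustrative. The function names are hypothetical, and the data is synthetic, generated from y = 2·x1^1.5·x2^−0.5, which is an assumed model chosen only so that the result can be checked.

```python
import math


def gauss_solve(A, rhs):
    """Gauss elimination with partial pivoting, then back substitution."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_multilinear(columns, y):
    """Least-squares fit of y = a0 + a1*x1 + ... + am*xm via the normal equations."""
    n = len(y)
    Z = [[1.0] * n] + [list(col) for col in columns]  # constant column plus each x
    A = [[sum(p * q for p, q in zip(zi, zj)) for zj in Z] for zi in Z]
    rhs = [sum(p * q for p, q in zip(zi, y)) for zi in Z]
    return gauss_solve(A, rhs)


def fit_power(columns, y):
    """Fit y = a0 * x1**a1 * ... * xm**am by regressing log10(y) on log10(x)."""
    log_cols = [[math.log10(v) for v in col] for col in columns]
    log_y = [math.log10(v) for v in y]
    coeffs = fit_multilinear(log_cols, log_y)
    return [10 ** coeffs[0]] + coeffs[1:]  # undo the log on the intercept only


# Synthetic, noise-free data from an assumed model y = 2 * x1**1.5 * x2**-0.5.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
y = [2.0 * a ** 1.5 * b ** -0.5 for a, b in zip(x1, x2)]

a0, a1, a2 = fit_power([x1, x2], y)
print("a0, a1, a2 =", a0, a1, a2)
```

Note that the fit minimizes the squared residuals of log y, not of y itself, so with noisy data the result generally differs from a direct nonlinear fit.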

Problem 17.5

Use least-squares regression to fit a straight line to

x    6    7   11   15   17   21   23   29   29   37   39
y   29   21   29   14   21   15    7    7   13    0    3

Compute the standard error of the estimate and the correlation coefficient. Plot the data and the regression line. If someone made an additional measurement of x = 10, y = 10, would you suspect that the measurement was valid or faulty? Justify your conclusion.

Solution.

The results can be summarized as

$$
y = 31.0589 - 0.78055x \qquad (s_{y/x} = 4.476306;\ r = 0.901489)
$$

At x = 10, the best-fit equation gives 23.2534. The line and data can be plotted along with the point (10, 10).

The value of 10 is nearly 3 times the standard error away from the line,

23.2534 − 10 = 13.2534 ≅ 3(4.476)

Thus, we can conclude that the value is probably erroneous.
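The numbers in this solution can be reproduced with a short script. The sketch below uses the standard closed-form expressions for the slope and intercept of a least-squares line; the variable names are illustrative, and r is reported as a magnitude to match the value quoted above (the slope itself is negative).

```python
import math

x = [6, 7, 11, 15, 17, 21, 23, 29, 29, 37, 39]
y = [29, 21, 29, 14, 21, 15, 7, 7, 13, 0, 3]
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)

# Slope and intercept of the least-squares line y = a0 + a1*x.
a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
a0 = sy / n - a1 * sx / n

# Residual and total sums of squares.
s_r = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
s_t = sum((yi - sy / n) ** 2 for yi in y)

s_yx = math.sqrt(s_r / (n - 2))   # standard error of the estimate
r = math.sqrt((s_t - s_r) / s_t)  # magnitude of the correlation coefficient

# Check the additional measurement (10, 10) against the line.
y_line = a0 + a1 * 10
ratio = (y_line - 10) / s_yx      # distance from the line in standard errors

print(f"y = {a0:.4f} + ({a1:.5f}) x")
print(f"s_y/x = {s_yx:.6f}, r = {r:.6f}")
print(f"line at x = 10: {y_line:.4f}, off by {ratio:.2f} standard errors")
```

A distance of roughly three standard errors is the basis for calling the extra point suspect; the threshold itself is a judgment call rather than a formal test.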

Problem 17.13

An investigator has reported the data tabulated below for an experiment to determine the growth rate of bacteria k (per d), as a function of oxygen concentration c (mg/L). It is known that such data can be modeled by the following equation:

$$
k = \frac{k_{max} c^2}{c_s + c^2}
$$

where c_s and k_max are parameters. Use a transformation to linearize this equation. Then use linear regression to estimate c_s and k_max and predict the growth rate at c = 2 mg/L.

c    0.5   0.8   1.5   2.5   4
k    1.1   2.4   5.3   7.6   8.9

Solution.

The equation can be linearized by inverting it to yield

$$
\frac{1}{k} = \frac{1}{k_{max}} + \frac{c_s}{k_{max}} \frac{1}{c^2}
$$

Consequently, a plot of 1/k versus 1/c² should yield a straight line with an intercept of 1/k_max and a slope of c_s/k_max.

c, mg/L   k, /d   1/c²       1/k        (1/c²)(1/k)   (1/c²)²
0.5       1.1     4.000000   0.909091   3.636364      16.000000
0.8       2.4     1.562500   0.416667   0.651042       2.441406
1.5       5.3     0.444444   0.188679   0.083857       0.197531
2.5       7.6     0.160000   0.131579   0.021053       0.025600
4         8.9     0.062500   0.112360   0.007022       0.003906
Sum               6.229444   1.758375   4.399338      18.66844


The slope and the intercept can be computed as

$$
a_1 = \frac{5(4.399338) - 6.229444(1.758375)}{5(18.66844) - (6.229444)^2} = 0.202489
$$

$$
a_0 = \frac{1.758375}{5} - 0.202489 \, \frac{6.229444}{5} = 0.099396
$$

Therefore, k_max = 1/0.099396 = 10.06074 and c_s = 10.06074(0.202489) = 2.037189, and the fit is

$$
k = \frac{10.06074 \, c^2}{2.037189 + c^2}
$$

This equation can be plotted together with the data.

The equation can be used to compute

$$
k = \frac{10.06074 (2)^2}{2.037189 + (2)^2} = 6.666
$$
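The whole procedure (transform, fit a straight line, back-transform, predict) can be checked with a short script. This is a sketch under the linearization given above, with X = 1/c² and Y = 1/k; the names `X`, `Y`, and `growth_rate` are illustrative and do not come from the notes.

```python
c = [0.5, 0.8, 1.5, 2.5, 4.0]   # oxygen concentration, mg/L
k = [1.1, 2.4, 5.3, 7.6, 8.9]   # growth rate, per day

# Linearizing transformation: 1/k = 1/k_max + (c_s/k_max) * (1/c^2).
X = [1.0 / ci ** 2 for ci in c]
Y = [1.0 / ki for ki in k]
n = len(X)

sx, sy = sum(X), sum(Y)
sxy = sum(xi * yi for xi, yi in zip(X, Y))
sxx = sum(xi * xi for xi in X)

# Straight-line least squares on the transformed data.
a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope = c_s / k_max
a0 = sy / n - a1 * sx / n                       # intercept = 1 / k_max

# Back-transform to the model parameters.
k_max = 1.0 / a0
c_s = k_max * a1


def growth_rate(conc):
    """Growth rate predicted by k = k_max * c^2 / (c_s + c^2)."""
    return k_max * conc ** 2 / (c_s + conc ** 2)


print(f"slope a1 = {a1:.6f}, intercept a0 = {a0:.6f}")
print(f"k_max = {k_max:.5f}, c_s = {c_s:.6f}")
print(f"k at c = 2 mg/L: {growth_rate(2.0):.3f}")
```

Because the regression is done on 1/k, points with small k carry the most weight in the fit, which is worth keeping in mind when comparing against a direct nonlinear fit of the original equation.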
