
Page 1

Regression II

Dr. Rahim Mahmoudvand

Department of Statistics,

Bu-Ali Sina University

[email protected]


Page 2

Chapter 4

Model Adequacy Checking

Page 3

Ch4: Partial Regression Plot: Definition and Usage

Is a curvature effect for the regressor needed in the model?

A partial regression plot is a variation of the plot of residuals versus the regressor.

This plot evaluates whether we have specified the relationship between the response and the regressor variables correctly.

This plot studies the marginal relationship of a regressor given the other variables that are in the model.

The partial regression plot is also called the added-variable plot or the adjusted-variable plot.

Regression II; Bu-Ali Sina UniversityRegression II; Bu-Ali Sina UniversityFall 2014Fall 2014 33

Page 4

Ch4: Partial Regression Plot: The way of working, with an example

In this plot, the response variable y and the regressor xj are both regressed against the other regressors in the model, and the residuals are obtained for each regression. The plot of these residuals against each other provides information about the nature of the marginal relationship for the regressor xj under consideration.

Example: Consider the model
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i, \quad i = 1, 2, \dots, n.$$

y is regressed on x2:
$$\hat y_i(x_2) = \hat\alpha_0 + \hat\alpha_1 x_{i2}, \qquad e_i(y \mid x_2) = y_i - \hat y_i(x_2), \quad i = 1, 2, \dots, n.$$

x1 is regressed on x2:
$$\hat x_{i1}(x_2) = \hat\gamma_0 + \hat\gamma_1 x_{i2}, \qquad e_i(x_1 \mid x_2) = x_{i1} - \hat x_{i1}(x_2), \quad i = 1, 2, \dots, n.$$

The partial regression plot for x1 is the scatterplot of e_i(y | x2) against e_i(x1 | x2).

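To make the construction concrete, here is a minimal sketch in Python with NumPy (not from the slides; the simulated data and names are illustrative). It computes the two sets of residuals for a model like the example above and verifies that the least-squares slope of the partial regression plot equals β̂1 from the full fit:

```python
import numpy as np

# Illustrative data (not from the slides): y depends linearly on x1 and x2.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.5, size=n)

def residuals(v, Z):
    """Residuals from the least-squares regression of v on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
    return v - Z @ coef

Z = np.column_stack([np.ones(n), x2])   # intercept plus the "other" regressor x2
e_y = residuals(y, Z)                   # e_i(y | x2)
e_x1 = residuals(x1, Z)                 # e_i(x1 | x2)

# The least-squares slope of the partial regression plot equals beta1_hat
# from the full model (and the line passes through the origin).
slope = (e_x1 @ e_y) / (e_x1 @ e_x1)
full_coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), x1, x2]), y, rcond=None)
print(slope, full_coef[1])              # the two values agree
```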

Page 5

Ch4: Partial Regression Plot: Interpretation of plot

[Four prototype partial regression plots of e_i(y | x2) versus e_i(x1 | x2), one per pattern below.]

• Regressor x1 enters the model linearly; the line passes through the origin, and the slope of the line is equal to β̂1.

• A higher-order term in x1, such as x1², is required.

• A transformation, such as replacing x1 with 1/x1, is required.

• There is no additional useful information in x1 for predicting y.

Page 6

Ch4: Partial Regression Plot: Relationship among Residuals

Consider the model $Y = X\beta + \varepsilon$. Denote by $x_j$ the $j$th column of $X$ and by $X_{(j)}$ the remaining columns, so that
$$X_{(j)} = [\,1, x_1, \dots, x_{j-1}, x_{j+1}, \dots, x_k\,], \qquad Y = X_{(j)}\beta_{(j)} + \beta_j x_j + \varepsilon,$$
and let $H_{(j)} = X_{(j)}(X_{(j)}'X_{(j)})^{-1}X_{(j)}'$.

Y is regressed on X(j):
$$e(Y \mid X_{(j)}) = (I - H_{(j)})Y.$$

xj is regressed on X(j):
$$e(x_j \mid X_{(j)}) = (I - H_{(j)})x_j.$$

We have:
$$e(Y \mid X_{(j)}) = (I - H_{(j)})\big(X_{(j)}\beta_{(j)} + \beta_j x_j + \varepsilon\big) = \beta_j (I - H_{(j)})x_j + (I - H_{(j)})\varepsilon,$$
since $(I - H_{(j)})X_{(j)} = 0$. Therefore
$$e(Y \mid X_{(j)}) = \beta_j\, e(x_j \mid X_{(j)}) + (I - H_{(j)})\varepsilon,$$
so the slope in the partial regression plot is $\beta_j$.


Page 7

Ch4: Partial Regression Plot: Shortcomings

• These plots may not give information about the proper form of the relationship if several variables already in the model are incorrectly specified.

• Partial regression plots will not, in general, detect interaction effects among the regressors.

• The presence of strong multicollinearity can cause partial regression plots to give incorrect information about the relationship between the response and the regressor variables.


Page 8

Ch4: Partial Residual Plot: Definition and Usage

• The partial residual plot is closely related to the partial regression plot.

• A partial residual plot is a variation of the plot of residuals versus the predictor.

• It is designed to show the relationship between the response variable and the regressors.


Page 9

Ch4: Partial Residual Plot: Computation of partial residuals

Consider the model $Y = X\beta + \varepsilon$, and write the $i$th row of $X$ as $[\,X_{i,(j)},\, x_{ij}\,]$, where $X_{i,(j)} = [\,1, x_{i1}, \dots, x_{i,j-1}, x_{i,j+1}, \dots, x_{ik}\,]$, so that the fitted value is
$$\hat y_i = X_{i,(j)}\hat\beta_{(j)} + \hat\beta_j x_{ij}.$$

We have:
$$e_i = y_i - \hat y_i = y_i - X_{i,(j)}\hat\beta_{(j)} - \hat\beta_j x_{ij}.$$

The partial residual is defined and calculated by:
$$e_i^*(y \mid x_j) = y_i - X_{i,(j)}\hat\beta_{(j)} = e_i + \hat\beta_j x_{ij}, \quad i = 1, 2, \dots, n.$$

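A minimal sketch of the computation (illustrative simulated data, same setup as the added-variable sketch earlier):

```python
import numpy as np

# Illustrative data, not from the slides.
rng = np.random.default_rng(0)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta              # ordinary residuals e_i
e_star = e + beta[1] * x1     # partial residuals e*_i(y | x1) = e_i + beta1_hat * x_i1

# Plotting e_star against x1 gives the partial residual plot;
# its least-squares slope equals beta[1].
```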

Page 10

Ch4: Partial Residual Plot: Interpretation of plot

[Four prototype partial residual plots of e*_i(y | x_j) versus x_ij, one per pattern below.]

• Regressor xj enters the model linearly; the line passes through the origin, and the slope of the line is equal to β̂j.

• A higher-order term in xj, such as x_ij², is required.

• A transformation, such as replacing x_ij with 1/x_ij, is required.

• There is no additional useful information in xj for predicting y.

Page 11

Ch4: Other Plots: Regressor versus regressor

A scatterplot of regressor xi against regressor xj:
• is useful in studying the relationship between regressor variables,
• is useful in detecting multicollinearity.


[Two prototype scatterplots of xj versus xi.]

• There is one unusual observation with respect to xj.

• There is one unusual observation with respect to both regressors.

Page 12

Ch4: Other Plots: Response versus regressor

A scatterplot of the response y against regressor xi:
• is useful in distinguishing the types of unusual points.


[Three prototype scatterplots of y versus xi, one per case below.]

• Influential point; outlier in x space; the prediction variance for this point is large, and the residual variance for this point is small.

• Influential point; outlier in the y direction.

• Leverage point; outlier in both coordinates; the prediction variance for this point is large, and the residual variance for this point is small.

Page 13

Ch4: PRESS Statistic: Computation and usage

• PRESS is generally regarded as a measure of how well a regression model will perform in predicting new data.

• A model with a small value of PRESS is desired.

• PRESS residuals are:
$$e_{(i)} = y_i - \hat y_{(i)} = \frac{e_i}{1 - h_{ii}}, \quad i = 1, 2, \dots, n,$$
and accordingly the PRESS statistic is defined as follows:
$$\text{PRESS} = \sum_{i=1}^{n} e_{(i)}^2.$$

• R² for prediction based on the PRESS statistic:
$$R^2_{\text{prediction}} = 1 - \frac{\text{PRESS}}{\text{TSS}}.$$


Page 15

Ch4: PRESS Statistic: Interpretation with an example


y is regressed on x1, x2:
$$\text{PRESS} = \sum_{i=1}^{n} e_{(i)}^2 = 459, \qquad e_{(9)}^2 = 218,$$
$$R^2_{\text{prediction}} = 1 - \frac{459}{5784} = 0.9206, \qquad R^2 = 1 - \frac{233.73}{5784} = 0.9596.$$

y is regressed on x1 only:
$$\text{PRESS} = \sum_{i=1}^{n} e_{(i)}^2 = 733.55.$$

So the model including both x1 and x2 is better than the model in which only x1 is included.

Page 16

Ch4: Detection and Treatment of Outliers: Tools and methods

Recall that an outlier is an extreme observation: one that is considerably different from the majority of the data.

Detection tools:
• residuals,
• scaled residuals,
• statistical tests.

Outliers can be categorized as:
• bad values, occurring as a result of unusual but explainable events, such as faulty measurement or analysis, incorrect recording of data, or failure of a measurement instrument;
• normally observed values, such as leverage and influential observations.

Treatment:
• Remove bad values.
• Follow up the analysis of outliers; this may help us improve the process, or result in new knowledge concerning factors whose effect on the response was previously unknown.
• The effect of outliers may be checked easily by dropping these points and refitting the regression equation.


Page 17

Ch4: Detection and Treatment of Outliers: Example 1 (Rocket data)

Quantity    Obs 5 & 6 in    Obs 5 & 6 out
Intercept   2627.82         2658.97
Slope       -37.15          -37.69
R²          0.9018          0.9578
MS_Res      9244.59         3964.63

Page 18

Ch4: Detection and Treatment of Outliers: Example 2

Country Cigarette Deaths

Australia 480 180

Canada 500 150

Denmark 380 170

Finland 1100 350

UK 1100 460

Iceland 230 60

Netherlands 490 240

Norway 250 90

Sweden 300 110

Switzerland 510 250

USA 1300 200

Regression with all the data: the regression equation is
y = 67.6 + 0.228 x,  R² = 54.4%.

Regression without the USA: the regression equation is
y = 9.1 + 0.369 x,  R² = 88.9%.

Page 19

Ch4: Lack of fit of the regression model: What is meant?

All models are wrong; some models are useful. (George Box)

In the simple linear regression setting, if we have n distinct data points we can always fit a polynomial of order up to n − 1. In the process, what we claim to be random error is actually a systematic departure resulting from not fitting enough terms.


[Two prototype plots of y versus x.]

• A perfect linear fit is always possible when we have two distinct points.

• A perfect linear fit is not possible in general when we have three (or more) distinct points.

Page 20

Ch4: Lack of fit of the regression model: A formal test

This test assumes that the normality, independence, and constant-variance requirements are met, and that only the first-order (straight-line) character of the relationship is in doubt. To perform this test, we must have replicate observations on the response y for at least one level of x. These replicates provide a model-independent estimate of σ².


[Plot of y_i versus x_ij with replicate observations at each x level and the fitted straight line.]

• The straight-line fit is not satisfactory.

Page 21

Ch4: Lack of fit of the regression model: A formal test

Let y_ij denote the jth observation on the response at x_i, for j = 1, …, n_i and i = 1, …, m, so that we have n = Σ n_i observations. The model is
$$y_{ij} = \beta_0 + \beta_1 x_i + \varepsilon_{ij}, \quad j = 1, \dots, n_i, \ i = 1, \dots, m,$$
which we can write in matrix form as $Y = X\beta + \varepsilon$ with
$$Y = (y_{11}, \dots, y_{1n_1},\ y_{21}, \dots, y_{2n_2},\ \dots,\ y_{m1}, \dots, y_{mn_m})'$$
and X the corresponding two-column design matrix in which each value x_i is repeated n_i times.

Considering this linear regression, the least-squares estimates solve
$$\min_{\beta_0,\beta_1} \sum_{i=1}^{m}\sum_{j=1}^{n_i} \big(y_{ij} - \beta_0 - \beta_1 x_i\big)^2,$$
giving fitted values $\hat y_{ij} = \hat\beta_0 + \hat\beta_1 x_i$, $j = 1, \dots, n_i$, $i = 1, \dots, m$, with
$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x, \qquad \hat\beta_1 = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n_i} (x_i - \bar x)(y_{ij} - \bar y)}{S_{xx}}, \qquad \bar y = \frac{1}{n}\sum_{i=1}^{m}\sum_{j=1}^{n_i} y_{ij}.$$

Page 22

Ch4: Lack of fit of the regression model: A formal test

We have:
$$e_{ij} = y_{ij} - \hat y_{ij} = (y_{ij} - \bar y_i) + (\bar y_i - \hat y_i),$$
where $\bar y_i$ is the average of the $n_i$ responses at $x_i$ and $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$. Accordingly, we get:
$$SS_{Res} = \sum_{i=1}^{m}\sum_{j=1}^{n_i} e_{ij}^2 = \sum_{i=1}^{m}\sum_{j=1}^{n_i} \big[(y_{ij} - \bar y_i) + (\bar y_i - \hat y_i)\big]^2$$
$$= \sum_{i=1}^{m}\sum_{j=1}^{n_i} (y_{ij} - \bar y_i)^2 + \sum_{i=1}^{m} n_i (\bar y_i - \hat y_i)^2 + 2\sum_{i=1}^{m}\sum_{j=1}^{n_i} (y_{ij} - \bar y_i)(\bar y_i - \hat y_i),$$
where the cross-product term is 0 because $\sum_{j=1}^{n_i} (y_{ij} - \bar y_i) = 0$ for each $i$.

Page 23

Ch4: Lack of fit of the regression model: A formal test

Accordingly, we get:
$$SS_{Res} = SS_{PE} + SS_{LOF}, \qquad SS_{PE} = \sum_{i=1}^{m}\sum_{j=1}^{n_i} (y_{ij} - \bar y_i)^2, \qquad SS_{LOF} = \sum_{i=1}^{m} n_i (\bar y_i - \hat y_i)^2,$$
with degrees of freedom
$$d.f.(SS_{PE}) = \sum_{i=1}^{m} (n_i - 1) = n - m, \qquad d.f.(SS_{LOF}) = m - 2.$$

• If the assumption of constant variance is satisfied, SS_PE is a model-independent measure of pure error, with n − m degrees of freedom. Note also that
$$SS_{PE} = \sum_{i=1}^{m} (n_i - 1) S_i^2,$$
where $S_i^2$ is the variance of the response at level $x_i$ and $\bar y_i = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}$.

• If the fitted values $\hat y_i$ are close to the corresponding average responses $\bar y_i$, there is a strong indication that the regression function is linear.

Page 24

Ch4: Lack of fit of the regression model: A formal test

It is well known that $E(S_i^2) = \sigma^2$, and so we get:
$$E(SS_{PE}) = \sum_{i=1}^{m} (n_i - 1) E(S_i^2) = (n - m)\sigma^2.$$

But for SS_LOF we have:
$$E(SS_{LOF}) = \sum_{i=1}^{m} n_i\, E(\bar y_i - \hat y_i)^2 = (m - 2)\sigma^2 + \sum_{i=1}^{m} n_i \big[E(y_i) - \beta_0 - \beta_1 x_i\big]^2,$$
using
$$\operatorname{var}(\hat y_i) = \sigma^2\Big(\frac{1}{n} + \frac{(x_i - \bar x)^2}{S_{xx}}\Big), \qquad \operatorname{var}(\bar y_i) = \frac{\sigma^2}{n_i}, \qquad \operatorname{cov}(\bar y_i, \hat y_i) = \operatorname{var}(\hat y_i),$$
together with $\sum_{i=1}^{m} n_i \operatorname{var}(\hat y_i) = 2\sigma^2$, since $S_{xx} = \sum_{i=1}^{m} n_i (x_i - \bar x)^2$.

Page 25

Ch4: Lack of fit of the regression model: A formal test

An unbiased estimate of the variance can be obtained from the pure error:
$$MS_{PE} = \frac{SS_{PE}}{n - m}, \qquad E(MS_{PE}) = \sigma^2.$$

Moreover, we have:
$$E(MS_{LOF}) = E\Big(\frac{SS_{LOF}}{m - 2}\Big) = \sigma^2 + \frac{\sum_{i=1}^{m} n_i \big[E(y_i) - \beta_0 - \beta_1 x_i\big]^2}{m - 2}.$$

So the ratio
$$F_0 = \frac{MS_{LOF}}{MS_{PE}}$$
can be considered as a statistic for testing the linearity assumption in the linear regression model. Under linearity, F0 follows an F_{m−2, n−m} distribution, and therefore the regression function is not linear if F0 > F_{m−2, n−m, 1−α}.
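A sketch of the test in Python (assuming SciPy is available for the F tail probability; the helper name and data are illustrative):

```python
import numpy as np
from scipy import stats

def lack_of_fit_test(x, y):
    """Pure-error lack-of-fit F test for simple linear regression.
    Requires replicate y values at one or more of the m distinct x levels."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ beta) ** 2)
    levels = np.unique(x)
    m, n = len(levels), len(x)
    # SS_PE: within-level squared deviations from each level mean.
    ss_pe = sum(np.sum((y[x == xi] - y[x == xi].mean()) ** 2) for xi in levels)
    ss_lof = ss_res - ss_pe
    f0 = (ss_lof / (m - 2)) / (ss_pe / (n - m))
    return f0, stats.f.sf(f0, m - 2, n - m)   # F0 and its p-value

# Illustrative usage: three replicates at each of five x levels.
x = np.repeat([1.0, 2.0, 3.3, 4.0, 5.6], 3)
y = 2.0 + 0.5 * x + np.random.default_rng(2).normal(scale=0.3, size=x.size)
print(lack_of_fit_test(x, y))
```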

Page 26

Ch4: Lack of fit of the regression model: Limitations and solutions

Limitations:
• Ideally, we find that the F ratio for lack of fit is not significant and the hypothesis of significance of regression is rejected.
• Unfortunately, this does not guarantee that the model will be satisfactory as a prediction equation; the model may have been fitted to the error only.

Solutions:
• Regard the regression model as useful as a predictor only when the F ratio for significance of regression is at least four or five times the critical value from the F table.
• Compare the range of the fitted values $\hat y_i$ to their average standard error. To do this, we can use the following measure of the average variance of the fitted values:
$$\overline{\operatorname{var}}(\hat y) = \frac{1}{n}\sum_{i=1}^{n} \widehat{\operatorname{var}}(\hat y_i) = \frac{(k + 1)\hat\sigma^2}{n},$$
where $\hat\sigma^2$ is a model-independent estimate of the error variance.

Page 27

Ch4: Lack of fit of the regression model: Multiple regression version

Repeat observations do not often occur in multiple regression. As a solution, we search for points in x space that are near neighbors, that is, sets of observations that have been taken with nearly identical levels of x1, x2, …, xk. As a measure of the distance between any two points, for example $(x_{i1}, \dots, x_{ik})$ and $(x_{i'1}, \dots, x_{i'k})$, we will use the weighted sum of squared distances (WSSD):
$$D_{ii'}^2 = \sum_{j=1}^{k} \frac{\big[\hat\beta_j (x_{ij} - x_{i'j})\big]^2}{MS_{Res}}.$$

Pairs of points that have small $D_{ii'}^2$ are near neighbors. The residuals at two points with a small value of $D_{ii'}^2$ can be used to obtain an estimate of pure error.

Page 28

Ch4: Lack of fit of the regression model: Multiple regression version

There is a relationship between the range of a sample from a normal population and the population standard deviation. For samples of size 2, this relationship is (Exercise)
$$\hat\sigma = \frac{E}{1.128} = 0.886\,E, \qquad E = |e_i - e_{i'}|.$$

An algorithm for sample sizes greater than 2:
• First arrange the data points x_{i1}, x_{i2}, …, x_{ik} in order of increasing $\hat y_i$.
• Compute the values of $D_{ii'}^2$ for all n − 1 pairs of points with adjacent values of $\hat y_i$. Repeat this calculation for the pairs of points separated by one, two, and three intermediate $\hat y$ values. This produces 4n − 10 values of $D_{ii'}^2$.
• Arrange these values in ascending order. Let E_u, for u = 1, …, 4n − 10, denote the range of the residuals at each pair of points, and calculate the estimate of the standard deviation of pure error by
$$\hat\sigma = \frac{0.886}{m}\sum_{u=1}^{m} E_u,$$
where E_1, E_2, …, E_m are the ranges associated with the m smallest values of $D_{ii'}^2$.

Page 29

Chapter 5

Methods to Correct Model Inadequacy

Page 30

Ch5: Transformation and Weighting

The main assumptions in the model Y = Xβ + ε are:

• E(ε) = 0, Var(ε) = σ²I,
• ε ~ N(0, σ²I),
• the form of Xβ used in the model is correct.

We use residual analysis to detect violations of these basic assumptions. In this chapter, we focus on methods and procedures for building regression models when some of the above assumptions are violated.

Page 31

Ch5: Transformation and Weighting: Problems?

Problems:
• The error variance is not constant.
• The relationship between y and the regressors is not linear.

Consequences: the parameter estimators remain unbiased, but they are no longer BLUE.

Solutions:
• Transformation: use transformed data to stabilize the variance.
• Weighting: use weighted least squares.

Page 32

Ch5: Transformation: Stabilizing variance

Let Var(Y_i) = c²·[E(Y_i)]^h, with E(Y_i) = β0 + β1 x_i. For a transformation y' = f(y), a first-order Taylor expansion gives
$$\operatorname{var}(y_i') \approx \big[f'(\mu_i)\big]^2 \operatorname{var}(y_i) = c^2 \big[f'(\mu_i)\big]^2 \mu_i^{\,h}, \qquad \mu_i = E(Y_i),$$
so the variance is stabilized by choosing f with $f'(\mu) \propto \mu^{-h/2}$, that is, $f(y) = y^{1 - h/2}$ for h ≠ 2 and $f(y) = \log y$ for h = 2.

Example 1: Poisson data, Var(Y_i) = E(Y_i). Here h = 1, and y' = y^{1/2} gives an approximately constant variance.

Example 2: Inverse-Gaussian data, Var(Y_i) = E³(Y_i). Here h = 3, and y' = y^{−1/2} gives an approximately constant variance.

Page 33

Ch5: Transformation: Stabilizing variance

Relationship of σ² to E(Y)      Transformation
σ² ∝ constant                   Y' = Y (no transformation)
σ² ∝ E(Y)                       Y' = Y^(1/2) (square root; Poisson data)
σ² ∝ E(Y)[1 − E(Y)]             Y' = arcsin(Y^(1/2)) (binomial data)
σ² ∝ [E(Y)]²                    Y' = log(Y) (gamma distribution)
σ² ∝ [E(Y)]³                    Y' = Y^(−1/2) (inverse Gaussian)
σ² ∝ [E(Y)]⁴                    Y' = Y^(−1)

Page 34

Ch5: Transformation for stabilizing variance: Limitations

Note that the predicted values are in the transformed scale, so:

• Applying the inverse transformation directly to the predicted values gives an estimate of the median of the distribution of the response instead of the mean.

• Confidence or prediction intervals may be directly converted from one metric to another. However, there is no assurance that the resulting intervals in the original units are the shortest possible intervals.


Page 35

Ch5: Transformation: Linearizing the model

The linearity assumption is the usual starting point in regression analysis. Occasionally we find that this assumption is inappropriate. Nonlinearity may be detected via:

• the lack-of-fit test,
• scatter diagrams and the matrix of scatterplots,
• residual plots such as the partial regression plot,
• prior experience or theoretical considerations.

In some cases a nonlinear function can be linearized by using a suitable transformation. Such nonlinear models are called intrinsically or transformably linear.


Page 36

Ch5: Transformation: Linearizing the model

Example 1: $y = \beta_0 e^{\beta_1 x}\varepsilon$. This function is intrinsically linear, since it can be transformed to a straight line by a logarithmic transformation:
$$\log y = \log\beta_0 + \beta_1 x + \log\varepsilon, \qquad \text{i.e.}\quad y' = \beta_0' + \beta_1 x + \varepsilon'.$$

Example 2: $y = \beta_0 + \beta_1 \dfrac{1}{x} + \varepsilon$. This function can be linearized by using the reciprocal transformation x' = 1/x:
$$y = \beta_0 + \beta_1 x' + \varepsilon.$$

Page 37

Ch5: Transformation: Linearizing the model

When transformations such as those described above are employed, the least-squares estimator has least-squares properties with respect to the transformed data, not the original data.

Linearizable function      Transformation
Y = β0 exp(β1 x)           Y' = log(Y)
Y = β0 x^β1                Y' = log(Y) and x' = log(x)
Y = β0 + β1 log(x)         x' = log(x)
Y = x/(β0 x − β1)          Y' = 1/Y and x' = 1/x

Page 38

Ch5: Transformation: Analytical method

Transformation on Y: the power transformation, chosen by the Box–Cox method:
$$y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda\, \dot y_G^{\,\lambda - 1}}, & \lambda \neq 0,\\[6pt] \dot y_G \log y, & \lambda = 0, \end{cases}$$
where $\dot y_G = \big(\prod_{i=1}^{n} y_i\big)^{1/n}$ is the geometric mean of the observations. Then fit the model
$$Y^{(\lambda)} = X\beta + \varepsilon.$$

The maximum-likelihood estimate of λ corresponds to the value of λ for which the residual sum of squares from the fitted model, SS_Res(λ), is a minimum.

Page 39

Ch5: Transformation: Analytical method

Obtaining a suitable value for λ is easy: plot SS_Res(λ) against λ for a grid of possible values of λ (usually between −3 and +3) and read off the minimum.

[Plot of SS_Res(λ) versus λ omitted.]
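A minimal grid-search sketch of this procedure (the function name and grid are illustrative; it assumes y > 0):

```python
import numpy as np

def boxcox_lambda(X, y, lambdas=np.linspace(-3, 3, 121)):
    """Grid search for the Box-Cox lambda minimizing SS_Res(lambda).
    Uses the scaled transformation y^(lambda) so that the SS_Res values
    are comparable across lambda; y must be positive."""
    y_g = np.exp(np.mean(np.log(y)))            # geometric mean of y
    best = (None, np.inf)
    for lam in lambdas:
        if abs(lam) < 1e-12:
            y_lam = y_g * np.log(y)
        else:
            y_lam = (y ** lam - 1.0) / (lam * y_g ** (lam - 1.0))
        coef, *_ = np.linalg.lstsq(X, y_lam, rcond=None)
        ss = np.sum((y_lam - X @ coef) ** 2)
        if ss < best[1]:
            best = (lam, ss)
    return best   # (lambda_hat, minimum SS_Res(lambda))
```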

Page 40

Ch5: Transformation: Analytical method for the regressors

Suppose that the relationship between y and one or more of the regressor variables is nonlinear, but that the usual assumptions of normally and independently distributed responses with constant variance are at least approximately satisfied.

Assume E(Y) = β0 + β1 z, where
$$z = \begin{cases} x^{\alpha}, & \alpha \neq 0,\\ \log(x), & \alpha = 0. \end{cases}$$

Assuming α ≠ 0, we expand about the initial guess α0 = 1 in a Taylor series and ignore terms of higher than first order:
$$E(Y) = \beta_0 + \beta_1 x^{\alpha_0} + (\alpha - \alpha_0)\,\beta_1 x^{\alpha_0}\log(x) = \beta_0^* + \beta_1^* x_1 + \beta_2^* x_2,$$
where β0* = β0, β1* = β1, β2* = (α − α0)β1, and x1 = x^α0, x2 = x^α0·log(x).

Page 41

Ch5: Transformation: Analytical method for the regressors

Now use the following algorithm (Box and Tidwell, 1962), where the index i counts the iterations and α0 = 1:

1. Fit the model E(Y) = β0 + β1 x and find the least-squares estimates of β0 and β1.
2. Fit the model E(Y) = β0 + β1 x + β2 x·log(x) and find the least-squares estimates of β0, β1, and β2.
3. Applying the equality β2* = (α − α_{i−1})β1 provides an updated estimate
$$\hat\alpha_i = \frac{\hat\beta_2}{\hat\beta_1} + \hat\alpha_{i-1},$$
where β̂1 is taken from step 1.
4. Set x = x^α̂_i and repeat steps 1–3.
5. Stop when the difference between α̂_i and α̂_{i−1} is small.
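A sketch of the iteration in Python (illustrative; it assumes x > 0, and takes the log in the extra term on the original x so that the additive update of step 3 is an exact first-order Taylor step):

```python
import numpy as np

def box_tidwell(x, y, alpha0=1.0, tol=1e-4, max_iter=25):
    """Box-Tidwell iteration for the power alpha in E(Y) = b0 + b1 * x**alpha."""
    alpha = alpha0
    for _ in range(max_iter):
        w = x ** alpha
        # Step 1: fit E(Y) = b0 + b1 * w.
        b = np.linalg.lstsq(np.column_stack([np.ones_like(w), w]), y, rcond=None)[0]
        # Step 2: fit E(Y) = g0 + g1 * w + g2 * w * log(x).
        X2 = np.column_stack([np.ones_like(w), w, w * np.log(x)])
        g = np.linalg.lstsq(X2, y, rcond=None)[0]
        # Step 3: update alpha_i = g2 / b1 + alpha_{i-1}.
        alpha_new = g[2] / b[1] + alpha
        if abs(alpha_new - alpha) < tol:   # step 5: stop when stable
            return alpha_new
        alpha = alpha_new                  # step 4: re-express x and repeat
    return alpha
```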

Page 42

Ch5: Transformation: Analytical method for the regressors

Example:


First iteration:
ŷ = 0.1309 + 0.2411 x
ŷ = −2.4168 + 1.5344 x − 0.462 x log(x)
α̂1 = −0.462/0.2411 + 1 = −0.92

Second iteration, with x' = x^(−0.92):
ŷ = 3.1039 − 6.6874 x'
ŷ = 3.2409 − 6.445 x' + 0.5994 x' log(x')
α̂2 = 0.5994/(−6.6874) + (−0.92) = −1.01

Page 43

Ch5: Transformation: Analytical method for the regressors

Example:


Model 1 (blue, solid line in the graph):
ŷ = 0.1309 + 0.2411 x,  R² = 0.87

Model 2 (red, dotted line in the graph):
ŷ = 2.9650 − 6.9693/x,  R² = 0.98

Page 44

Ch5: Generalized least squares: Covariance matrix is nonsingular

Consider the model Y = Xβ + ε with the following assumptions:

• E(ε) = 0,
• Var(ε) = σ²V,

where V is a nonsingular square matrix. We will approach this problem by transforming the model to a new set of observations that satisfy the standard least-squares assumptions, and then use ordinary least squares on the transformed data.


Page 45

Ch5: Generalized least squares: Covariance matrix is nonsingular

Since V is nonsingular and positive definite, we can write
V = K'K = KK,
where K is a nonsingular, symmetric square matrix. Define the new variables:
Z = K⁻¹Y ; B = K⁻¹X ; g = K⁻¹ε.

Multiplying both sides of the original regression model by K⁻¹ gives:
Z = Bβ + g.

This new transformed model has the following properties:
E(g) = K⁻¹E(ε) = 0 ; Var(g) = E{[g − E(g)][g − E(g)]'} = E(gg') = K⁻¹E(εε')K⁻¹ = σ²K⁻¹VK⁻¹ = σ²K⁻¹KKK⁻¹ = σ²I.


Page 46

Ch5: Generalized least squares: Covariance matrix is nonsingular

So, in this transformed model, the error term g has zero mean and constant variance and is uncorrelated. In this model,
$$\hat\beta = (B'B)^{-1}B'Z = \big[(K^{-1}X)'(K^{-1}X)\big]^{-1}(K^{-1}X)'K^{-1}Y = (X'V^{-1}X)^{-1}X'V^{-1}Y.$$

This estimator is called the generalized least-squares (GLS) estimator of β. We easily have:
$$E(\hat\beta) = (X'V^{-1}X)^{-1}X'V^{-1}E(Y) = (X'V^{-1}X)^{-1}X'V^{-1}X\beta = \beta,$$
$$\operatorname{var}(\hat\beta) = \sigma^2 (B'B)^{-1} = \sigma^2 (X'V^{-1}X)^{-1}.$$

Page 47

Ch5: Generalized least squares: Covariance matrix is diagonal

When the errors ε are uncorrelated but have unequal variances, so that the covariance matrix of ε is
$$\operatorname{Var}(\varepsilon) = \sigma^2 V = \sigma^2 \operatorname{diag}\Big(\frac{1}{w_1}, \frac{1}{w_2}, \dots, \frac{1}{w_n}\Big),$$
the estimation procedure is usually called weighted least squares. Let W = V⁻¹ = diag(w1, w2, …, wn). Then we have
$$\hat\beta = (X'WX)^{-1}X'WY,$$
which is called the weighted least-squares estimator. Note that observations with large variances will have smaller weights than observations with small variances.

Page 48

Ch5: Generalized least squares: Covariance matrix is diagonal

For the case of simple linear regression, the weighted least-squares function is
$$S(\beta_0, \beta_1) = \sum_{i=1}^{n} w_i (y_i - \beta_0 - \beta_1 x_i)^2.$$

Setting the derivatives with respect to β0 and β1 to zero, the resulting least-squares normal equations become
$$\hat\beta_0 \sum_{i=1}^{n} w_i + \hat\beta_1 \sum_{i=1}^{n} w_i x_i = \sum_{i=1}^{n} w_i y_i,$$
$$\hat\beta_0 \sum_{i=1}^{n} w_i x_i + \hat\beta_1 \sum_{i=1}^{n} w_i x_i^2 = \sum_{i=1}^{n} w_i x_i y_i.$$

Exercise: show that the solution of the above system coincides with the general formula stated on the previous page.

Page 49

Chapter 6

Diagnostics for Leverage and Influence

Page 50

Ch6: Diagnostics for Leverage and Influence

In this chapter, we present several diagnostics for leverage and influence.

[Two prototype scatterplots of y versus x: one with an influential point and one with a leverage point.]

• An influential point has a noticeable impact on the model coefficients.

• A leverage point does not affect the estimates of the regression coefficients, but it has a dramatic effect on the model summary statistics, such as R² and the standard errors of the regression coefficients.

Page 51

Ch6: Diagnostics for Leverage and Influence: Importance

• A regression coefficient may have a sign that does not make engineering or scientific sense,
• A regressor known to be important may be statistically insignificant,
• A model that fits the data well and that is logical from an application-environment perspective may produce poor predictions.

These situations may be the result of one or perhaps a few influential observations. Finding these observations can then shed considerable light on the problems with the model.


Page 52

Ch6: Diagnostics for Leverage and Influence: Leverage

The basic measure is the hat matrix. The hat matrix diagonal h_ii is a standardized measure of the distance of the ith observation from the center (or centroid) of the x space. Thus, large hat diagonals reveal observations that are leverage points, because they are remote in x space from the rest of the sample.

Rule: if h_ii > 2h̄ = 2(K+1)/n, then the ith observation is a leverage point.

Two problems with this rule:

• If 2(K+1) > n, the cutoff does not apply.
• Leverage points are only potentially influential.

Page 53

Ch6: Diagnostics for Leverage and Influence: Measures of influence

Measure: Cook's D
$$D_i = \frac{(\hat\beta_{(i)} - \hat\beta)'\,X'X\,(\hat\beta_{(i)} - \hat\beta)}{(K + 1)\,MS_{Res}}$$
Rules: D_i is not an F statistic, but practically it can be compared with F_{α, K+1, n−K−1}; we consider points with D_i > 1 to be influential.

Measure: DFBETAS
$$DFBETAS_{j,i} = \frac{\hat\beta_j - \hat\beta_{j(i)}}{\sqrt{S_{(i)}^2\, C_{jj}}}$$
Rule: if $|DFBETAS_{j,i}| > 2/\sqrt{n}$, the ith observation warrants examination.

Measure: DFFITS
$$DFFITS_i = \frac{\hat y_i - \hat y_{(i)}}{\sqrt{S_{(i)}^2\, h_{ii}}}$$
Rule: if $|DFFITS_i| > 2\sqrt{(K + 1)/n}$, the ith observation warrants attention.

Page 54

Ch6: Diagnostics for Leverage and Influence: Cook's D

There are several equivalent formulas (Exercise):
$$D_i = \frac{r_i^2}{K + 1}\cdot\frac{\widehat{\operatorname{var}}(\hat y_i)}{\widehat{\operatorname{var}}(e_i)} = \frac{r_i^2}{K + 1}\cdot\frac{h_{ii}}{1 - h_{ii}},$$
where $r_i$ is the studentized residual. The first factor is big if case i is unusual in the y direction; the second factor is big if case i is unusual in the x direction.

$$D_i = \frac{(\hat y_{(i)} - \hat y)'(\hat y_{(i)} - \hat y)}{(K + 1)\,MS_{Res}}$$
is the (scaled) squared Euclidean distance that the vector of fitted values moves when the ith observation is deleted.

Page 55

Ch6: Diagnostics for Leverage and Influence: DFBETAS

There is an interesting computational formula for DFBETAS. Let R = (X'X)⁻¹X' and let r_j' = [r_{j,1}, r_{j,2}, …, r_{j,n}] denote the jth row of R. Then we can write (Exercise)
$$DFBETAS_{j,i} = \frac{r_{j,i}}{\sqrt{r_j' r_j}}\cdot\frac{t_i}{\sqrt{1 - h_{ii}}},$$
where $t_i$ is the R-student residual. The first factor is a measure of the impact of the ith observation on $\hat\beta_j$; the second factor is big if case i is unusual in both coordinates.

Page 56

Ch6: Diagnostics for Leverage and Influence: DFFITS

DFFITS_i is the number of standard deviations that the fitted value ŷ_i changes if observation i is removed. Computationally we may find (Exercise)
$$DFFITS_i = \Big(\frac{h_{ii}}{1 - h_{ii}}\Big)^{1/2} t_i,$$
where $h_{ii}$ is the leverage of the ith observation and $t_i$ is the R-student residual, which is big if case i is an outlier. Note, however, that if h_ii ≈ 0, the effect of R-student will be moderated; similarly, a near-zero R-student combined with a high-leverage point could produce a small value of DFFITS.
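A sketch computing Cook's D and DFFITS from the closed forms above; the deletion variance S_(i)² uses the standard identity, which is an assumption not spelled out on these slides:

```python
import numpy as np

def influence_measures(X, y):
    """Cook's D and DFFITS for every observation.
    p = K + 1 is the number of columns of X (intercept included)."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    h = np.diag(H)
    e = y - H @ y
    ms_res = e @ e / (n - p)
    r = e / np.sqrt(ms_res * (1.0 - h))                           # studentized residuals
    # Standard deletion identity: S_(i)^2 = [(n-p) MS_Res - e_i^2/(1-h_ii)] / (n-p-1)
    s2_del = ((n - p) * ms_res - e**2 / (1.0 - h)) / (n - p - 1)
    t = e / np.sqrt(s2_del * (1.0 - h))                           # R-student
    cooks_d = (r**2 / p) * (h / (1.0 - h))
    dffits = t * np.sqrt(h / (1.0 - h))
    return cooks_d, dffits
```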

Page 57

Ch6: Diagnostics for Leverage and Influence: A measure of model performance

The diagnostics D_i, DFBETAS_{j,i}, and DFFITS_i provide insight about the effect of observations on the estimated coefficients β̂_j and fitted values ŷ_i. They do not provide any information about the overall precision of estimation. Since it is fairly common practice to use the determinant of the covariance matrix as a convenient scalar measure of precision, called the generalized variance, we could define the generalized variance of β̂ as
$$GV(\hat\beta) = \big|\operatorname{var}(\hat\beta)\big| = \big|\sigma^2 (X'X)^{-1}\big|.$$

Page 58

Ch6: Diagnostics for Leverage and Influence: A measure of model performance

To express the role of the ith observation on the precision of estimation, we could define
$$COVRATIO_i = \frac{\big|(X_{(i)}'X_{(i)})^{-1} S_{(i)}^2\big|}{\big|(X'X)^{-1} MS_{Res}\big|}, \qquad i = 1, 2, \dots, n.$$

Clearly, if COVRATIO_i > 1 the ith observation improves the precision of estimation, while if COVRATIO_i < 1 inclusion of the ith point degrades precision. Computationally (Exercise):
$$COVRATIO_i = \Big(\frac{S_{(i)}^2}{MS_{Res}}\Big)^{K+1}\frac{1}{1 - h_{ii}}.$$

A cutoff value for COVRATIO is not easy to obtain, but researchers suggest that if COVRATIO_i > 1 + 3(K+1)/n or if COVRATIO_i < 1 − 3(K+1)/n, then the ith point should be considered influential.

Page 59

Chapter 7

Polynomial Regression Models

Page 60

Ch7: Polynomial regression models

Polynomial regression is a subclass of multiple regression.

Example 1: the second-order polynomial in one variable,
Y = β0 + β1 x + β2 x² + ε.

Example 2: the second-order polynomial in two variables,
Y = β0 + β1 x1 + β2 x2 + β11 x1² + β22 x2² + β12 x1 x2 + ε.

Polynomials are widely used in situations where the response is curvilinear. Complex nonlinear relationships can be adequately modeled by polynomials over reasonably small ranges of the x's. This chapter surveys several problems and issues associated with fitting polynomials.


Page 61

Ch7: Polynomial regression models: In one variable

In general, the kth-order polynomial model in one variable is
$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_k x^k + \varepsilon.$$

If we set x_j = x^j, j = 1, 2, …, k, then the above model becomes a multiple linear regression model in the k regressors x1, x2, …, xk. Thus, a polynomial model of order k may be fitted using the techniques studied previously.

Let E(Y | X = x) = g(x) be an unknown function. Using a Taylor series expansion about a point a,
$$Y = g(x) + \varepsilon \approx \sum_{m=0}^{k} \frac{g^{(m)}(a)}{m!}(x - a)^m + \varepsilon,$$
so polynomial models are also useful as approximating functions to unknown and possibly very complex nonlinear relationships.

Page 62

Ch7: Polynomial regression models: In one variable

Example (second-order or quadratic model):
$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon.$$
We often call β1 the linear effect parameter and β2 the quadratic effect parameter. The parameter β0 is the mean of y when x = 0 if the range of the data includes x = 0; otherwise β0 has no physical interpretation.

[Numerical example: plot of E(Y) = 5 − 2x − 0.25x² for x from 0 to 10.]

Page 63

Ch7: Polynomial regression models: Important considerations in fitting these models

1. Order of the model: Keep the order of the model as low as possible

2. Model building strategy: Use forward selection or backward elimination

3. Extrapolation: extrapolation with polynomial models can be extremely hazardous

[Plot of E(Y) = 5 + 2x − 0.25x² over the region of the original data and the extrapolation region.]

Example: if we extrapolate beyond the range of the original data, the predicted response turns downward. This may be at odds with the true behavior of the system. In general, a polynomial model may turn in unanticipated and inappropriate directions, both in interpolation and in extrapolation.

Page 64

Ch7: Polynomial regression models: Important considerations in fitting these models

4. Ill-conditioning I: the matrix inversion calculations may be inaccurate, and considerable error may be introduced into the parameter estimates. Nonessential ill-conditioning caused by the arbitrary choice of origin can be removed by first centering the regressor variables, as the sketch below illustrates.

5. Ill-conditioning II: if the values of x are limited to a narrow range, there can be significant ill-conditioning or multicollinearity in the columns of the X matrix. For example, if x varies between 1 and 2, x² varies between 1 and 4, which could create strong multicollinearity between x and x².

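A small illustration (the numbers are made up, not from the slides) of how centering removes nonessential ill-conditioning, comparing the condition number of X'X before and after centering:

```python
import numpy as np

# Quadratic model matrix for x confined to a narrow range.
x = np.linspace(1.0, 2.0, 20)
X_raw = np.column_stack([np.ones_like(x), x, x**2])

# Center the regressor before building the polynomial terms.
xc = x - x.mean()
X_cen = np.column_stack([np.ones_like(x), xc, xc**2])

print(np.linalg.cond(X_raw.T @ X_raw))   # large condition number: ill-conditioned
print(np.linalg.cond(X_cen.T @ X_cen))   # much smaller after centering
```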

Page 65

Ch7: Polynomial regression models: Important considerations in fitting these models

Example: hardwood concentration in pulp and tensile strength of kraft paper.


Model: $y = \beta_0 + \beta_1 (x - \bar x) + \beta_2 (x - \bar x)^2 + \varepsilon$.

Fitting:
$$\hat y = 45.295 + 2.546\,(x - 7.2632) - 0.635\,(x - 7.2632)^2.$$

Testing:
$$H_0:\ \beta_2 = 0 \quad \text{vs} \quad H_1:\ \beta_2 \neq 0, \qquad F_0 = \frac{SS_R(\beta_2 \mid \beta_1, \beta_0)}{MS_{Res}} = 105.45 > F_{0.01,1,16} = 8.53,$$
so the quadratic term contributes significantly to the model.

Diagnostics: residual analysis.

Page 66

Ch7: Polynomial regression models: In two or more variables

In general, these models are straightforward extensions of the model with one variable. An example of a second-order model in two variables is
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon,$$
where β1 and β2 are linear effect parameters, β11 and β22 are quadratic effect parameters, and β12 is an interaction effect parameter.

This model has received considerable attention, both from researchers and from practitioners. The regression function of this model is called a response surface. Response surface methodology (RSM) is widely applied in industry for modeling the output response(s) of a process in terms of the important controllable variables and then finding the operating conditions that optimize the response.


Page 67

Ch7: Polynomial regression models: In two or more variables

Example:

Observation  Run order  Temperature (T)  Concentration (C)  Conversion
1            4          200              15                 43
2            12         250              15                 78
3            11         200              25                 69
4            5          250              25                 73
5            6          189.65           20                 48
6            7          260.35           20                 76
7            3          225              12.93              65
8            1          225              27.07              74
9            8          225              20                 76
10           10         225              20                 79
11           9          225              20                 83
12           2          225              20                 81

Coded variables: x1 = (T − 225)/25, x2 = (C − 20)/5.

x1       x2       y
-1       -1       43
1        -1       78
-1       1        69
1        1        73
-1.414   0        48
1.414    0        76
0        -1.414   65
0        1.414    74
0        0        76
0        0        79
0        0        83
0        0        81

Page 68

Ch7: Polynomial regression models: In two or more variables

Example: the central composite design is widely used for fitting response surface models.

[Central composite design in the (x1, x2) plane: temperature from 175 to 275 on the horizontal axis (x1 from −2 to 2), concentration from 10 to 30 on the vertical axis (x2 from −2 to 2).]

Runs at:
• Corners of the square: (x1, x2) = (−1, −1), (−1, 1), (1, −1), (1, 1);
• Center of the square: (x1, x2) = (0, 0), replicated four times;
• Axial points: (x1, x2) = (0, −1.414), (0, 1.414), (−1.414, 0), (1.414, 0).

Page 69

Ch7: Polynomial regression models: In two or more variables

We fit the second-order model
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon.$$
To do this, the design matrix X has columns [1, x1, x2, x1², x2², x1x2] built from the 12 coded rows above, and
$$X'X = \begin{bmatrix} 12 & 0 & 0 & 8 & 8 & 0\\ 0 & 8 & 0 & 0 & 0 & 0\\ 0 & 0 & 8 & 0 & 0 & 0\\ 8 & 0 & 0 & 12 & 4 & 0\\ 8 & 0 & 0 & 4 & 12 & 0\\ 0 & 0 & 0 & 0 & 0 & 4 \end{bmatrix}, \qquad X'y = \begin{bmatrix} 845\\ 78.592\\ 33.726\\ 511\\ 541\\ -31 \end{bmatrix}.$$

Solving the normal equations $X'X\hat\beta = X'y$ gives
$$\hat\beta = (79.75,\ 9.83,\ 4.22,\ -8.88,\ -5.13,\ -7.75)'.$$
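The normal-equation solution can be checked directly; a sketch with NumPy using the coded data from the table above:

```python
import numpy as np

# Coded CCD data from the table on Page 67.
a = 1.414
x1 = np.array([-1, 1, -1, 1, -a, a, 0, 0, 0, 0, 0, 0])
x2 = np.array([-1, -1, 1, 1, 0, 0, -a, a, 0, 0, 0, 0])
y = np.array([43, 78, 69, 73, 48, 76, 65, 74, 76, 79, 83, 81], dtype=float)

X = np.column_stack([np.ones(12), x1, x2, x1**2, x2**2, x1 * x2])
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta.round(2))   # approximately [79.75, 9.83, 4.22, -8.88, -5.13, -7.75]
```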

Page 70

Ch7: Polynomial regression models: In two or more variables

So, the fitted model in the coded variables is
$$\hat y = 79.75 + 9.83 x_1 + 4.22 x_2 - 8.88 x_1^2 - 5.13 x_2^2 - 7.75 x_1 x_2,$$
and in terms of the original variables the model is
$$\hat y = 79.75 + 9.83\frac{T - 225}{25} + 4.22\frac{C - 20}{5} - 8.88\Big(\frac{T - 225}{25}\Big)^2 - 5.13\Big(\frac{C - 20}{5}\Big)^2 - 7.75\frac{(T - 225)(C - 20)}{125}$$
$$= -1105.56 + 8.0242\,T + 22.994\,C - 0.0142\,T^2 - 0.20502\,C^2 - 0.062\,TC.$$

We use the coded data for the computation of sums of squares:
$$\hat y = (43.96, 79.11, 67.89, 72.04, 48.11, 75.90, 63.54, 75.46, 79.75, 79.75, 79.75, 79.75)',$$
$$SS_R = \sum_{i=1}^{12} (\hat y_i - \bar y)^2 = 1733.57, \qquad SS_T = \sum_{i=1}^{12} (y_i - \bar y)^2 = 1768.92.$$

Source of variation   SS       D.F.  MS      F      P-value
Regression            1733.58  5     346.72  58.87  <0.0001
Residual              35.34    6     5.89
Total                 1768.92  11

Page 71

Ch7: Polynomial regression models: In two or more variables

If we fit only the linear model in the coded variables, we have:

Here X has columns [1, x1, x2], and
$$X'X = \begin{bmatrix} 12 & 0 & 0\\ 0 & 8 & 0\\ 0 & 0 & 8 \end{bmatrix}, \qquad X'y = \begin{bmatrix} 845\\ 78.592\\ 33.726 \end{bmatrix}, \qquad \hat\beta = \begin{bmatrix} 70.42\\ 9.83\\ 4.22 \end{bmatrix}.$$

The fitted values are
$$\hat y = (56.37, 76.03, 64.81, 84.47, 56.52, 84.32, 64.45, 76.39, 70.42, 70.42, 70.42, 70.42)',$$
with
$$SS_R = \sum_{i=1}^{12} (\hat y_i - \bar y)^2 = 914.41, \qquad SS_T = \sum_{i=1}^{12} (y_i - \bar y)^2 = 1768.92.$$

Source of variation   SS       D.F.  MS      F     P-value
Regression            914.41   2     457.21  4.82  0.0377
Residual              854.51   9     94.95
Total                 1768.92  11

Page 72

Ch7: Polynomial regression models: In two or more variables

As, the last four rows of the matrix X in page 64 are the same, we can divide the SSRes ibto two components and do a

lack of fit test. We have:

Regression II; Bu-Ali Sina UniversityRegression II; Bu-Ali Sina UniversityFall 2014Fall 2014 7272

Source of variation SS D.F MS F P-value

Regression (SSR) SSR(1, 2|0) SSR(11, 22, 12 |1, 2, 0)=SSR-SSR(1, 2|0)

1733.58(914.4)(819.2)

5(2)(3)

346.72(457.2)(273.1)

58.87 <0.0001

Residual Lack of fit Pure error

35.34(8.5)(26.8)

6(3)(3)

5.89(2.83)(8.92)

0.3176 0.8120

Total 1768.92 11

With $m=9$ distinct design points, $S_i^2=0$ for $i=1,\dots,8$ (unreplicated runs), while the center point is replicated four times (76, 79, 83, 81):

$$
S_9^2=\frac{(76-79.75)^2+(79-79.75)^2+(83-79.75)^2+(81-79.75)^2}{4-1}=\frac{26.75}{3}
$$

$$
SS_{PE}=\sum_{i=1}^{m}(n_i-1)S_i^2=(4-1)S_9^2=26.75\;;\qquad
SS_{LOF}=SS_{Res}-SS_{PE}=35.34-26.75=8.59
$$
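A short sketch of the pure-error/lack-of-fit split, continuing the numpy objects from the sketches above (the residuals come from the quadratic fit):

```python
# Pure error comes only from the four replicated center runs
center = y[8:]                                    # 76, 79, 83, 81
ss_pe  = np.sum((center - center.mean())**2)      # 26.75

ss_res = np.sum((y - yhat)**2)                    # approx. 35.34
ss_lof = ss_res - ss_pe                           # approx. 8.59

# 3 df for lack of fit (9 distinct points - 6 parameters),
# 3 df for pure error (4 replicates - 1)
F_lof = (ss_lof / 3) / (ss_pe / 3)                # approx. 0.32, far from significant
```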


Ch 7: Polynomial regression models: in two or more variables

As the quadratic model is significant for these data, we can test the individual variables to drop unimportant terms, if there are any. We use the following statistic, where $C_{jj}$ is the jth diagonal element of $(\mathbf{X}'\mathbf{X})^{-1}$:

Regression II; Bu-Ali Sina UniversityRegression II; Bu-Ali Sina UniversityFall 2014Fall 2014 7373

$$
t_j=\frac{\hat\beta_j}{\widehat{\mathrm{se}}(\hat\beta_j)}=\frac{\hat\beta_j}{\sqrt{C_{jj}\,MS_{Res}}}
$$

Variable    Estimated coefficient   Standard error   t        P-value
Intercept   79.75                   1.21             65.72
x1          9.83                    0.86             11.45    0.0001
x2          4.22                    0.86             4.913    0.0027
x1^2        -8.88                   0.96             -9.25    0.0001
x2^2        -5.13                   0.96             -5.341   0.0018
x1x2        -7.75                   1.21             -6.386   0.0007

$$
(\mathbf{X}'\mathbf{X})^{-1}=\begin{bmatrix}
\tfrac14&0&0&-\tfrac18&-\tfrac18&0\\
0&\tfrac18&0&0&0&0\\
0&0&\tfrac18&0&0&0\\
-\tfrac18&0&0&\tfrac{5}{32}&\tfrac{1}{32}&0\\
-\tfrac18&0&0&\tfrac{1}{32}&\tfrac{5}{32}&0\\
0&0&0&0&0&\tfrac14
\end{bmatrix}
$$

With $MS_{Res}=5.89$, for example $\widehat{\mathrm{se}}(\hat\beta_0)=\sqrt{5.89\times\tfrac14}=1.21$.
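These standard errors and t statistics can be reproduced numerically; a sketch continuing the setup above (scipy is used only for the p-values):

```python
from scipy import stats

n, p   = X.shape                             # 12 runs, 6 parameters
ms_res = np.sum((y - X @ b)**2) / (n - p)    # approx. 5.89
C      = np.linalg.inv(X.T @ X)              # C_jj on the diagonal

se   = np.sqrt(ms_res * np.diag(C))   # approx. (1.21, 0.86, 0.86, 0.96, 0.96, 1.21)
t    = b / se
pval = 2 * stats.t.sf(np.abs(t), df=n - p)
```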


Ch 7: Polynomial regression models: in two or more variables

Generally we prefer to fit the full quadratic model whenever possible, unless there are large differences between the full and the reduced model in terms of PRESS and adjusted R².

$$
\hat y = 79.75 + 9.83\,x_1 + 4.22\,x_2 - 8.88\,x_1^2 - 5.13\,x_2^2 - 7.75\,x_1x_2
$$

Using the equation $h_{ii}=\mathbf{x}_i'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_i$ we obtain the table below. Note that runs 1 to 8 all have the same $h_{ii}$, as these points are equidistant from the center of the design. The last four (center) runs all have $h_{ii}=0.25$.

Regression II; Bu-Ali Sina UniversityRegression II; Bu-Ali Sina UniversityFall 2014Fall 2014 7474

x1 x2 y ŷ ei hii ti e[i]

-1 -1 43 43.96 -0.96 0.625 -0.67 -2.55

1 -1 78 79.11 -1.11 0.625 -0.74 -2.95

-1 1 69 67.89 1.11 0.625 0.75 2.96

1 1 73 72.04 0.96 0.625 0.65 2.56

-1.414 0 48 48.11 -0.11 0.625 -0.07 -0.29

1.414 0 76 75.90 0.10 0.625 0.07 0.28

0 -1.414 65 63.54 1.46 0.625 0.98 3.89

0 1.414 74 75.46 -1.46 0.625 0.99 -3.90

0 0 76 79.75 -3.75 0.250 -1.78 -5.00

0 0 79 79.75 -0.75 0.250 -0.36 -1.00

0 0 83 79.75 3.25 0.250 1.55 4.33

0 0 81 79.75 1.25 0.250 0.59 1.67

For example, for run 1, $\mathbf{x}_1'=(1,\,-1,\,-1,\,1,\,1,\,1)$ and

$$
h_{11}=\mathbf{x}_1'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_1=0.625.
$$

$R^2=0.98$ , $R^2_{Adj}=0.96$ , $R^2_{Predicted}=0.94$
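The leverages, the $e_{[i]}$ column (PRESS residuals), and $R^2_{Predicted}$ can be reproduced as follows; a sketch continuing the numpy objects above, where r denotes the internally studentized residuals:

```python
H = X @ np.linalg.inv(X.T @ X) @ X.T     # hat matrix
h = np.diag(H)                           # 0.625 for runs 1-8, 0.25 for center runs
e = y - X @ b                            # ordinary residuals

r      = e / np.sqrt(ms_res * (1 - h))   # internally studentized residuals
e_pred = e / (1 - h)                     # PRESS residuals e_[i]

PRESS = np.sum(e_pred**2)
print(round(1 - PRESS / np.sum((y - y.mean())**2), 2))   # approx. 0.94
```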


Ch 7: Polynomial regression models: in two or more variables


[Residual plots for the quadratic model, not reproduced here; their annotations state that normality holds, the variance is stable, and independence holds.]


Ch 7: Polynomial regression models: Orthogonal polynomial

Consider the kth-order polynomial model in one variable as

$$Y=\beta_0+\beta_1x+\beta_2x^2+\dots+\beta_kx^k+\epsilon$$

Generally the columns of the X matrix will not be orthogonal. One approach to dealing with this problem is orthogonal polynomials. In this approach we fit the model

$$Y=\alpha_0+\alpha_1P_1(x)+\alpha_2P_2(x)+\dots+\alpha_kP_k(x)+\epsilon$$

where $P_j(x)$ is a jth-order orthogonal polynomial, defined so that:


$$
\sum_{i=1}^{n}P_r(x_i)P_s(x_i)=0,\;\;r\neq s\;;\qquad P_0(x_i)=1,\;\;i=1,\dots,n
$$


Ch 7: Polynomial regression models: Orthogonal polynomial

With this model we have

$$
\mathbf{X}'\mathbf{X}=\begin{bmatrix}
\sum_{i=1}^{n}P_0^2(x_i)&0&\cdots&0\\
0&\sum_{i=1}^{n}P_1^2(x_i)&\cdots&0\\
\vdots&\vdots&\ddots&\vdots\\
0&0&\cdots&\sum_{i=1}^{n}P_k^2(x_i)
\end{bmatrix}
$$

so that

$$
\hat\alpha_j=\frac{\sum_{i=1}^{n}P_j(x_i)\,y_i}{\sum_{i=1}^{n}P_j^2(x_i)}\;,\qquad j=0,\dots,k.
$$

$P_j(x)$ can be determined by the Gram–Schmidt process. When the levels of x are equally spaced we have:

$$
\begin{aligned}
P_0(x)&=1\\
P_1(x)&=\lambda_1\Big[\frac{x-\bar x}{d}\Big]\\
P_2(x)&=\lambda_2\Big[\Big(\frac{x-\bar x}{d}\Big)^2-\frac{n^2-1}{12}\Big]\\
P_3(x)&=\lambda_3\Big[\Big(\frac{x-\bar x}{d}\Big)^3-\Big(\frac{x-\bar x}{d}\Big)\frac{3n^2-7}{20}\Big]\\
P_4(x)&=\lambda_4\Big[\Big(\frac{x-\bar x}{d}\Big)^4-\Big(\frac{x-\bar x}{d}\Big)^2\frac{3n^2-13}{14}+\frac{3(n^2-1)(n^2-9)}{560}\Big]
\end{aligned}
$$

where $d$ is the spacing between consecutive levels of x and the $\lambda_j$ are scaling constants.


Ch 7: Polynomial regression models: Orthogonal polynomial

Gram–Schmidt process: Consider an arbitrary set $S=\{U_1,\dots,U_k\}$ and denote by $\langle U_i,U_j\rangle$ the inner product of $U_i$ and $U_j$. Then the set $S'=\{V_1,\dots,V_k\}$ is orthogonal when computed as below:


$$
\begin{aligned}
V_1&=U_1\\
V_2&=U_2-\frac{\langle U_2,V_1\rangle}{\langle V_1,V_1\rangle}V_1\\
V_3&=U_3-\frac{\langle U_3,V_2\rangle}{\langle V_2,V_2\rangle}V_2-\frac{\langle U_3,V_1\rangle}{\langle V_1,V_1\rangle}V_1\\
&\;\;\vdots\\
V_k&=U_k-\sum_{j=1}^{k-1}\frac{\langle U_k,V_j\rangle}{\langle V_j,V_j\rangle}V_j
\end{aligned}
$$

Normalizing:

$$
e_j=\frac{V_j}{\sqrt{\langle V_j,V_j\rangle}}\;,\qquad j=1,\dots,k.
$$
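A minimal sketch of the process for column vectors (plain numpy; the function name is ours):

```python
import numpy as np

def gram_schmidt(U):
    """Orthogonalize the columns of U; returns V with mutually orthogonal columns."""
    V = []
    for u in U.T:
        v = u.astype(float).copy()
        for w in V:
            v -= (u @ w) / (w @ w) * w   # subtract the projection onto each earlier V_j
        V.append(v)
    return np.column_stack(V)

# Powers of x, as in the orthogonal-polynomial construction below
x = np.arange(1.0, 11.0)
V = gram_schmidt(np.column_stack([x**0, x, x**2]))
print(np.round(V.T @ V, 8))              # off-diagonal entries are all 0
```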


Ch 7: Polynomial regression models: Orthogonal polynomial

In polynomial regression with one variable, take $U_i=x^{i-1}$. Applying the Gram–Schmidt process, we have

$$
V_2=x-\frac{\langle x,1\rangle}{\langle 1,1\rangle}\,1=x-\bar x
$$

and, normalizing,

$$
P_1(x)=\frac{V_2}{\sqrt{\langle V_2,V_2\rangle}}=\frac{x-\bar x}{\sqrt{\sum_{i=1}^{n}(x_i-\bar x)^2}}.
$$

If the levels of x are equally spaced, $x_i=x_1+(i-1)d$, we have

$$
\sum_{i=1}^{n}(x_i-\bar x)^2=d^2\sum_{i=1}^{n}\Big(i-\frac{n+1}{2}\Big)^2=\frac{d^2\,n(n^2-1)}{12}.
$$

So, in this case we have

$$
P_1(x)=\frac{1}{\sqrt{n(n^2-1)/12}}\cdot\frac{x-\bar x}{d}=\lambda_1\,\frac{x-\bar x}{d}.
$$

Exercise: Give a proof for the other $P_j(x)$ on page 72 by a similar method.

Note: any arbitrary constant can be substituted for $\lambda_j$ in $P_j(x)$.


Ch 7: Polynomial regression models: Orthogonal polynomial

Example:


x     y
50    335
75    326
100   316
125   313
150   311
175   314
200   318
225   328
250   337
275   345

i     P0(xi)   P1(xi)   yi
1     1        -9       335
2     1        -7       326
3     1        -5       316
4     1        -3       313
5     1        -1       311
6     1         1       314
7     1         3       318
8     1         5       328
9     1         7       337
10    1         9       345

$x_i-x_{i-1}=25$ for all i, so the levels of x are equally spaced and we have

$$
P_1(x_i)=2\,(i-5.5)\;,\qquad P_2(x_i)=0.5\Big[(i-5.5)^2-\frac{99}{12}\Big]
$$

(taking $\lambda_1=2$ and $\lambda_2=1/2$, with $n=10$).

$$
\hat\alpha_0=\frac{\sum_{i=1}^{10}P_0(x_i)y_i}{\sum_{i=1}^{10}P_0^2(x_i)}=\bar y=324.3\;,\qquad
\hat\alpha_1=\frac{\sum_{i=1}^{10}P_1(x_i)y_i}{\sum_{i=1}^{10}P_1^2(x_i)}=0.74\;,\qquad
\hat\alpha_2=\frac{\sum_{i=1}^{10}P_2(x_i)y_i}{\sum_{i=1}^{10}P_2^2(x_i)}=2.8
$$
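A numpy sketch reproducing these estimates (the P1 and P2 columns use the equally spaced formulas with λ1 = 2 and λ2 = 1/2):

```python
import numpy as np

y = np.array([335, 326, 316, 313, 311, 314, 318, 328, 337, 345.0])
i = np.arange(1, 11)
n = 10

P0 = np.ones(n)
P1 = 2 * (i - 5.5)                           # lambda_1 = 2
P2 = 0.5 * ((i - 5.5)**2 - (n**2 - 1) / 12)  # lambda_2 = 1/2

# Orthogonality makes each estimate a simple ratio
for P in (P0, P1, P2):
    print(round((P @ y) / (P @ P), 2))       # 324.3, 0.74, 2.8
```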


Ch 7: Polynomial regression models: Orthogonal polynomial

Then, the fitted model is:


Source of variation   SS        D.F   MS        F        P-value
Regression            1213.43   2     606.72    159.24   <0.0001
  Linear              181.89    1     181.89    47.74    <0.0002
  Quadratic           1031.54   1     1031.54   270.75   <0.0001
Residual              26.67     7     3.81
Total                 1240.1    9

$$
\begin{aligned}
\hat y_i&=324.3+0.74\,P_1(x_i)+2.8\,P_2(x_i)\\
&=324.3+1.48\,(i-5.5)+1.4\Big[(i-5.5)^2-\frac{99}{12}\Big]\\
&=346.96-13.92\,i+1.4\,i^2
\end{aligned}
$$


Chapter 8

Indicator Variables



Ch 8: Indicator Variables

The variables employed in regression analysis are often quantitative variables. Examples: temperature, distance, income.

These variables have a well-defined scale of measurement.

In some situations it is necessary to use qualitative or categorical variables as predictor variables. Examples: sex, operators, employment status.

In general, these variables have no natural scale of measurement.

Question: How can we account for the effect that these variables may have on the response?

This is done through the use of indicator variables. Indicator variables are sometimes called dummy variables.



Ch 8: Indicator Variables: Example 1

Y = life of the cutting tool; X1 = lathe speed (in revolutions per minute); X2 = type of cutting tool, which is qualitative with two levels (e.g. tool types A and B).

Let

$$
x_2=\begin{cases}0&\text{if the observation is from tool type A}\\ 1&\text{if the observation is from tool type B}\end{cases}
$$

Assuming that a first-order model is appropriate, we have $Y=\beta_0+\beta_1x_1+\beta_2x_2+\epsilon$, so:

Tool type A: $Y=\beta_0+\beta_1x_1+\beta_2(0)+\epsilon=\beta_0+\beta_1x_1+\epsilon$
Tool type B: $Y=\beta_0+\beta_1x_1+\beta_2(1)+\epsilon=(\beta_0+\beta_2)+\beta_1x_1+\epsilon$


Ch 8: Indicator Variables


[Figure: tool life vs. lathe speed x1 (RPM), showing two parallel lines: E(Y|x2=0) = β0 + β1x1 for tool type A and E(Y|x2=1) = (β0 + β2) + β1x1 for tool type B; both have slope β1, and the vertical gap between them is β2.]

• The regression lines are parallel;

• β2 is a measure of the difference in mean tool life resulting from changing from tool type A to tool type B;

• The variance of the error is assumed to be the same for both tool types A and B.


Ch 8: Indicator Variables: Example 2

Consider again Example 1, but now assume that X2 = type of cutting tool is qualitative with three levels (e.g. tool types A, B, and C).

Define two indicator variables:

$$
(x_2,x_3)=\begin{cases}(0,0)&\text{if the observation is from tool type A}\\(1,0)&\text{if the observation is from tool type B}\\(0,1)&\text{if the observation is from tool type C}\end{cases}
$$

Assuming that a first-order model is appropriate, we have $Y=\beta_0+\beta_1x_1+\beta_2x_2+\beta_3x_3+\epsilon$, i.e.

$$
Y=\begin{cases}
\beta_0+\beta_1x_1+\beta_2(0)+\beta_3(0)+\epsilon=\beta_0+\beta_1x_1+\epsilon&\text{tool type A}\\
\beta_0+\beta_1x_1+\beta_2(1)+\beta_3(0)+\epsilon=(\beta_0+\beta_2)+\beta_1x_1+\epsilon&\text{tool type B}\\
\beta_0+\beta_1x_1+\beta_2(0)+\beta_3(1)+\epsilon=(\beta_0+\beta_3)+\beta_1x_1+\epsilon&\text{tool type C}
\end{cases}
$$

In general, a qualitative variable with l levels is represented by l − 1 indicator variables, each taking on the values 0 and 1.
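For illustration, a small numpy sketch of this l − 1 encoding for a three-level factor (the tool-type vector here is hypothetical):

```python
import numpy as np

tool = np.array(["A", "B", "C", "B", "A"])   # hypothetical tool types
x2 = (tool == "B").astype(float)             # 1 for type B, else 0
x3 = (tool == "C").astype(float)             # 1 for type C, else 0
# Type A is the baseline, encoded as (x2, x3) = (0, 0)
D = np.column_stack([x2, x3])
```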


Ch 8: Indicator Variables: Numerical example


yi xi1 xi2

18.73 610 A

14.52 950 A

17.43 720 A

14.54 840 A

13.44 980 A

24.39 530 A

13.34 680 A

22.71 540 A

12.68 890 A

19.32 730 A

30.16 670 B

27.09 770 B

25.40 880 B

26.05 1000 B

33.49 760 B

35.62 590 B

26.07 910 B

36.78 650 B

34.95 810 B

43.67 500 B

We fit the model $Y=\beta_0+\beta_1x_1+\beta_2x_2+\epsilon$, where $x_2=0$ for tool type A and $x_2=1$ for tool type B:

$$
\mathbf{X}'\mathbf{X}=\begin{bmatrix}20&15010&10\\15010&11717500&7540\\10&7540&10\end{bmatrix},\qquad
\mathbf{X}'\mathbf{y}=\begin{bmatrix}490.38\\356515.7\\319.28\end{bmatrix}
$$

The normal equations $\mathbf{X}'\mathbf{X}\hat{\boldsymbol\beta}=\mathbf{X}'\mathbf{y}$,

$$
\begin{aligned}
20\hat\beta_0+15010\hat\beta_1+10\hat\beta_2&=490.38\\
15010\hat\beta_0+11717500\hat\beta_1+7540\hat\beta_2&=356515.7\\
10\hat\beta_0+7540\hat\beta_1+10\hat\beta_2&=319.28
\end{aligned}
$$

give $\hat{\boldsymbol\beta}=(36.99,\;-0.03,\;15.00)'$. Then

$$
\hat{\mathbf{y}}'=\hat{\boldsymbol\beta}'\mathbf{X}'=[36.99\;\;-0.03\;\;15]\,\mathbf{X}'=(20.76,\;11.71,\;\dots,\;38.69)
$$

$$
SS_R=\sum_{i=1}^{20}(\hat y_i-\bar y)^2=1418.03\;;\qquad SS_T=\sum_{i=1}^{20}(y_i-\bar y)^2=1575.09
$$
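A numpy sketch fitting this model to the 20 observations (tool type A coded as x2 = 0, type B as x2 = 1):

```python
import numpy as np

speed = np.array([610, 950, 720, 840, 980, 530, 680, 540, 890, 730,
                  670, 770, 880, 1000, 760, 590, 910, 650, 810, 500.0])
life  = np.array([18.73, 14.52, 17.43, 14.54, 13.44, 24.39, 13.34, 22.71,
                  12.68, 19.32, 30.16, 27.09, 25.40, 26.05, 33.49, 35.62,
                  26.07, 36.78, 34.95, 43.67])
toolB = np.r_[np.zeros(10), np.ones(10)]   # indicator: first 10 rows A, last 10 B

X = np.column_stack([np.ones(20), speed, toolB])
b = np.linalg.solve(X.T @ X, X.T @ life)
print(np.round(b, 2))                      # approx. (36.99, -0.03, 15.00)
```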


Ch 8: Indicator Variables: Numerical example

Then, the fitted model is:


Source of variation   SS        D.F   MS       F       P-value
Regression            1418.03   2     709.02   76.74   <0.0001
Residual              157.06    17    9.24
Total                 1575.09   19

$$\hat y_i=36.99-0.03\,x_1+15.00\,x_2$$

Variable    Estimated coefficient   Standard error   t       P-value
Intercept   36.99
x1          -0.03                   0.005            -5.89   <0.00001
x2          15                      1.360            11.04   <0.00001


Ch 8: Indicator Variables: Comparing regression models

Consider the case of simple linear regression where the n observations can be formed into M groups, with the mth group having $n_m$ observations. The most general model consists of M separate equations:

$$Y=\beta_{0m}+\beta_{1m}x+\epsilon\;,\qquad m=1,2,\dots,M$$

It is often of interest to compare this general model to a more restrictive one. Indicator variables are helpful in this regard. Using indicator variables we can write:

$$Y=(\beta_{01}+\beta_{11}x)D_1+(\beta_{02}+\beta_{12}x)D_2+\dots+(\beta_{0M}+\beta_{1M}x)D_M+\epsilon$$

where $D_i=1$ when the observation belongs to group i and 0 otherwise. We call this the full model (FM). It has 2M parameters, so the degrees of freedom for $SS_{Res}(FM)$ are $n-2M$.

Exercise: Let $SS_{Res}(FM_m)$ denote the residual sum of squares from fitting $Y=\beta_{0m}+\beta_{1m}x+\epsilon$ to group m alone. Show that $SS_{Res}(FM)=SS_{Res}(FM_1)+SS_{Res}(FM_2)+\dots+SS_{Res}(FM_M)$.

We consider three cases:
1) Parallel lines: $\beta_{11}=\beta_{12}=\dots=\beta_{1M}$
2) Concurrent lines: $\beta_{01}=\beta_{02}=\dots=\beta_{0M}$
3) Coincident lines: $\beta_{11}=\beta_{12}=\dots=\beta_{1M}$ and $\beta_{01}=\beta_{02}=\dots=\beta_{0M}$



Ch 8: Indicator Variables: Parallel lines

For parallel lines, all M slopes are identical but the intercepts may differ. So here we want to test:

$$H_0:\;\beta_{11}=\beta_{12}=\dots=\beta_{1M}=\beta_1$$

Recall that this procedure involves fitting a full model (FM) and a reduced model (RM) restricted by the null hypothesis, and computing the F statistic below. Under H0 the full model reduces to

$$Y=\beta_{01}+\beta_1x+\beta_2D_2+\dots+\beta_MD_M+\epsilon$$

for which $df_{RM}=n-(M+1)$. Therefore, using the F statistic below, we can test the hypothesis H0.


$$
F=\frac{\big[SS_{Res}(RM)-SS_{Res}(FM)\big]\big/\big(df_{RM}-df_{FM}\big)}{SS_{Res}(FM)\big/df_{FM}}\;;\qquad
H_0\text{ is rejected when }F>F_{\alpha,\;df_{RM}-df_{FM},\;df_{FM}}
$$

Analysis of covariance
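A sketch of the full-versus-reduced comparison for the parallel-lines test (numpy; `ols_ssres` and `parallel_lines_F` are helper names of ours, not library functions):

```python
import numpy as np

def ols_ssres(X, y):
    """Residual sum of squares from an OLS fit."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ b)**2)

def parallel_lines_F(x, y, group):
    """F statistic for H0: equal slopes in all M groups."""
    levels = np.unique(group)
    n, M = len(y), len(levels)
    D = (group[:, None] == levels[None, :]).astype(float)   # group indicators
    FM = np.column_stack([D, D * x[:, None]])   # M intercepts + M slopes
    RM = np.column_stack([D, x])                # M intercepts + 1 common slope
    ss_fm, ss_rm = ols_ssres(FM, y), ols_ssres(RM, y)
    df_fm, df_rm = n - 2 * M, n - (M + 1)
    return ((ss_rm - ss_fm) / (df_rm - df_fm)) / (ss_fm / df_fm)
```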


Ch 8: Indicator Variables: Concurrent and coincident lines


For concurrent lines, all M intercepts are identical but the slopes may differ:

$$H_0:\;\beta_{01}=\beta_{02}=\dots=\beta_{0M}=\beta_0$$

Under H0 the full model reduces to

$$Y=\beta_0+\beta_1x+\beta_2xD_2+\dots+\beta_MxD_M+\epsilon$$

for which $df_{RM}=n-(M+1)$. As for parallel lines, we can test H0 using the F statistic above.

For coincident lines we want to test:

$$H_0:\;\beta_{01}=\dots=\beta_{0M}=\beta_0\;\text{ and }\;\beta_{11}=\dots=\beta_{1M}=\beta_1$$

Under H0 the full model reduces to the simple model

$$Y=\beta_0+\beta_1x+\epsilon$$

for which $df_{RM}=n-2$. Again, H0 is tested with the same F statistic.


Ch 8: Indicator Variables: Regression approach to analysis of variance

Consider the one-way model:

$$y_{ij}=\mu+\tau_i+\epsilon_{ij}=\mu_i+\epsilon_{ij}\;,\qquad i=1,\dots,k;\;j=1,2,\dots,n$$

In the fixed-effects case we test:

$$H_0:\;\tau_1=\tau_2=\dots=\tau_k=0\qquad\text{vs}\qquad H_1:\;\tau_i\neq0\;\text{for at least one }i$$


Source of variation   SS                              D.F       MS                  F
Treatment             n Σᵢ (ȳᵢ. − ȳ..)²               k − 1     SS_T/(k − 1)        MS_T/MS_Res
Error                 Σᵢ Σⱼ (yᵢⱼ − ȳᵢ.)²              k(n − 1)  SS_Res/[k(n − 1)]
Total                 Σᵢ Σⱼ (yᵢⱼ − ȳ..)²              kn − 1


Ch 8: Indicator Variables: Regression approach to analysis of variance

The equivalent regression model for the one-way model

$$y_{ij}=\mu_i+\epsilon_{ij}\;,\qquad i=1,\dots,k;\;j=1,2,\dots,n$$

is

$$Y_{ij}=\beta_0+\beta_1x_{1j}+\beta_2x_{2j}+\dots+\beta_{k-1}x_{k-1,j}+\epsilon_{ij}$$

where

$$x_{ij}=\begin{cases}1&\text{if observation }j\text{ is from treatment }i\\0&\text{otherwise}\end{cases}$$

Relationship between the two models:

$$\beta_0=\mu_k\;;\qquad \beta_i=\mu_i-\mu_k\;,\quad i=1,\dots,k-1$$

Exercise: Find the relationship among the sums of squares in the regression and one-way ANOVA formulations.


Ch 8: Indicator Variables: Regression approach to analysis of variance

For the case k=3 we have:


With k = 3 and n = 3 observations per treatment:

$$
\mathbf{X}=\begin{bmatrix}
1&1&0\\ 1&1&0\\ 1&1&0\\
1&0&1\\ 1&0&1\\ 1&0&1\\
1&0&0\\ 1&0&0\\ 1&0&0
\end{bmatrix}\;(\text{columns }1,x_1,x_2),\qquad
\mathbf{y}=\begin{bmatrix}y_{11}\\y_{12}\\y_{13}\\y_{21}\\y_{22}\\y_{23}\\y_{31}\\y_{32}\\y_{33}\end{bmatrix}
$$

$$
\mathbf{X}'\mathbf{X}=\begin{bmatrix}9&3&3\\3&3&0\\3&0&3\end{bmatrix},\qquad
\mathbf{X}'\mathbf{y}=\begin{bmatrix}y_{..}\\y_{1.}\\y_{2.}\end{bmatrix}
$$

and solving $\mathbf{X}'\mathbf{X}\hat{\boldsymbol\beta}=\mathbf{X}'\mathbf{y}$ gives

$$
\hat{\boldsymbol\beta}=\begin{bmatrix}\hat\beta_0\\\hat\beta_1\\\hat\beta_2\end{bmatrix}
=\begin{bmatrix}\bar y_{3.}\\ \bar y_{1.}-\bar y_{3.}\\ \bar y_{2.}-\bar y_{3.}\end{bmatrix}
$$

The hypotheses

$$H_0:\;\tau_1=\tau_2=\tau_3=0\qquad\text{vs}\qquad H_1:\;\tau_i\neq0\;\text{for at least one }i$$

are equivalent to

$$H_0:\;\beta_0=\mu\;\text{ and }\;\beta_1=\beta_2=0\qquad\text{vs}\qquad H_1:\;\beta_1\neq0\;\text{or}\;\beta_2\neq0\;\text{(or both)}$$
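A numpy sketch verifying this equivalence for k = 3, n = 3 with simulated data (the treatment means are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 3, 3
mu = np.array([10.0, 12.0, 9.0])                  # hypothetical treatment means
y  = np.repeat(mu, n) + rng.normal(0, 1, k * n)   # y_ij = mu_i + eps_ij

g = np.repeat(np.arange(k), n)                    # treatment labels 0, 1, 2
X = np.column_stack([np.ones(k * n),
                     (g == 0).astype(float),      # indicator for treatment 1
                     (g == 1).astype(float)])     # indicator for treatment 2
b = np.linalg.solve(X.T @ X, X.T @ y)

ybar = np.array([y[g == m].mean() for m in range(k)])
# b equals (ybar_3., ybar_1. - ybar_3., ybar_2. - ybar_3.)
print(np.round(b, 3))
print(np.round([ybar[2], ybar[0] - ybar[2], ybar[1] - ybar[2]], 3))
```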