Sequential sums of squares … or … extra sums of squares



Sequential sums of squares: what are they?

• The reduction in the error sum of squares when one or more predictor variables are added to the regression model.

• Or, the increase in the regression sum of squares when one or more predictor variables are added to the regression model.


Sequential sums of squares: why?

• They can be used to test whether one slope parameter is 0.

• They can be used to test whether a subset (more than one, but fewer than all) of the slope parameters are 0.


Example: Brain and body size predictive of intelligence?

• Sample of n = 38 college students

• Response (Y): intelligence based on the PIQ (performance) scores from the (revised) Wechsler Adult Intelligence Scale.

• Predictor (X1): Brain size based on MRI scans (given as count/10,000)

• Predictor (X2): Height in inches

• Predictor (X3): Weight in pounds


OUTPUT #1

The regression equation is
PIQ = 4.7 + 1.18 MRI

Predictor   Coef      SE Coef   T       P
Constant    4.65      43.71     0.11    0.916
MRI         1.1766    0.4806    2.45    0.019

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  1    2697.1    2697.1   5.99   0.019
Error       36   16197.5   449.9
Total       37   18894.6


OUTPUT #2

The regression equation is
PIQ = 111 + 2.06 MRI - 2.73 Height

Predictor   Coef      SE Coef   T       P
Constant    111.28    55.87     1.99    0.054
MRI         2.0606    0.5466    3.77    0.001
Height      -2.7299   0.9932    -2.75   0.009

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  2    5572.7    2786.4   7.32   0.002
Residual    35   13321.8   380.6
Total       37   18894.6

Source      DF   Seq SS
MRI         1    2697.1
Height      1    2875.6


OUTPUT #3

The regression equation is
PIQ = 111 + 2.06 MRI - 2.73 Height + 0.001 Weight

Predictor   Coef      SE Coef   T       P
Constant    111.35    62.97     1.77    0.086
MRI         2.0604    0.5634    3.66    0.001
Height      -2.732    1.229     -2.22   0.033
Weight      0.0006    0.1971    0.00    0.998

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  3    5572.7    1857.6   4.74   0.007
Error       34   13321.8   391.8
Total       37   18894.6

Source      DF   Seq SS
MRI         1    2697.1
Height      1    2875.6
Weight      1    0.0


Sequential sums of squares: definition using SSE notation

• SSR(X2|X1) = SSE(X1) - SSE(X1,X2)

• In general, you subtract the error sum of squares due to all of the predictors both left and right of the bar from the error sum of squares due to the predictor to the right of the bar.

• SSR(X2,X3|X1) = SSE(X1) - SSE(X1,X2,X3)


Sequential sums of squares: definition using SSR notation

• SSR(X2|X1) = SSR(X1,X2) - SSR(X1)

• In general, you subtract the regression sum of squares due to the predictor to the right of the bar from the regression sum of squares due to all of the predictors both left and right of the bar.

• SSR(X2,X3|X1) = SSR(X1,X2,X3) - SSR(X1)
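
As a quick numerical check, both definitions can be applied to the PIQ example. The sketch below is illustrative only: the function names are invented for this example, and the inputs are the rounded values printed in OUTPUT #1 and OUTPUT #2, so the two routes agree only up to rounding.

```python
# Values read off the Minitab output (rounded to one decimal place there).
SSE_X1 = 16197.5      # error SS, model with MRI only (OUTPUT #1)
SSE_X1_X2 = 13321.8   # error SS, model with MRI and Height (OUTPUT #2)
SSR_X1 = 2697.1       # regression SS, model with MRI only (OUTPUT #1)
SSR_X1_X2 = 5572.7    # regression SS, model with MRI and Height (OUTPUT #2)


def seq_ss_from_sse(sse_reduced, sse_full):
    """Reduction in error SS when the extra predictors are added."""
    return sse_reduced - sse_full


def seq_ss_from_ssr(ssr_full, ssr_reduced):
    """Increase in regression SS when the extra predictors are added."""
    return ssr_full - ssr_reduced


via_sse = seq_ss_from_sse(SSE_X1, SSE_X1_X2)
via_ssr = seq_ss_from_ssr(SSR_X1_X2, SSR_X1)
print(f"SSR(X2|X1) via SSE: {via_sse:.1f}")
print(f"SSR(X2|X1) via SSR: {via_ssr:.1f}")
```

Both values match, up to rounding in the printed tables, the Seq SS that Minitab reports for Height when it is entered after MRI.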


Decomposition of regression sum of squares

In multiple regression, there is more than one way to decompose the regression sum of squares. For example:

SSR(X1,X2) = SSR(X1) + SSR(X2|X1)

SSR(X1,X2) = SSR(X2) + SSR(X1|X2)


OUTPUT #2

The regression equation is
PIQ = 111 + 2.06 MRI - 2.73 Height

Predictor   Coef      SE Coef   T       P
Constant    111.28    55.87     1.99    0.054
MRI         2.0606    0.5466    3.77    0.001
Height      -2.7299   0.9932    -2.75   0.009

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  2    5572.7    2786.4   7.32   0.002
Residual    35   13321.8   380.6
Total       37   18894.6

Source      DF   Seq SS
MRI         1    2697.1
Height      1    2875.6


OUTPUT #4

The regression equation is
PIQ = 111 - 2.73 Height + 2.06 MRI

Predictor   Coef      SE Coef   T       P
Constant    111.28    55.87     1.99    0.054
Height      -2.7299   0.9932    -2.75   0.009
MRI         2.0606    0.5466    3.77    0.001

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  2    5572.7    2786.4   7.32   0.002
Error       35   13321.8   380.6
Total       37   18894.6

Source      DF   Seq SS
Height      1    164.0
MRI         1    5408.8
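
The two entry orders in OUTPUT #2 and OUTPUT #4 split the same regression sum of squares differently. A minimal sketch of that check follows; the variable and function names are illustrative, and the inputs are the rounded Seq SS values from the two outputs, so the totals can differ from the reported SSR in the last digit.

```python
# Sequential SS in the order the predictors were entered (from the Minitab output).
order_mri_first = [("MRI", 2697.1), ("Height", 2875.6)]      # OUTPUT #2
order_height_first = [("Height", 164.0), ("MRI", 5408.8)]    # OUTPUT #4
SSR_FULL = 5572.7  # regression SS reported for the two-predictor model


def total_seq_ss(terms):
    """Sum the one-degree-of-freedom sequential sums of squares."""
    return sum(ss for _, ss in terms)


for label, terms in [("MRI first", order_mri_first),
                     ("Height first", order_height_first)]:
    print(f"{label}: {total_seq_ss(terms):.1f} (reported SSR = {SSR_FULL})")
```

The individual pieces depend heavily on the order of entry (2697.1 versus 5408.8 for MRI), while their total does not.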


Decomposition of SSR: how?

SSTO(X1) = SSR(X1) + SSE(X1)

SSTO(X1,X2) = SSR(X1,X2) + SSE(X1,X2)

SSR(X1,X2) = SSR(X1) + SSR(X2|X1)

SSTO(X1) = SSR(X1) + SSR(X2|X1) + SSE(X1,X2)


Decomposition of SSR: how?

SSTO(X2) = SSR(X2) + SSE(X2)

SSTO(X1,X2) = SSR(X1,X2) + SSE(X1,X2)

SSR(X1,X2) = SSR(X2) + SSR(X1|X2)

SSTO(X2) = SSR(X2) + SSR(X1|X2) + SSE(X1,X2)


Even more ways to decompose SSR when there are 3 or more predictors

SSR(X1,X2,X3) = …

SSR(X1,X2,X3) = …

SSR(X1,X2,X3) = …
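
To make the count of possibilities concrete, the sketch below lists the one-degree-of-freedom decomposition that each entry order of three predictors would produce. It is purely symbolic and is not taken from the slide; the function names are illustrative. Decompositions that use multi-degree-of-freedom terms, such as SSR(X1) + SSR(X2,X3|X1), are also valid but are not enumerated here.

```python
from itertools import permutations


def sequential_decomposition(predictors):
    """Return the one-df decomposition of SSR for one entry order."""
    terms = []
    for k, name in enumerate(predictors):
        given = ",".join(predictors[:k])  # predictors already in the model
        terms.append(f"SSR({name}|{given})" if given else f"SSR({name})")
    return " + ".join(terms)


def all_decompositions(predictors):
    """One decomposition per ordering of the predictors."""
    return [sequential_decomposition(list(p)) for p in permutations(predictors)]


decomps = all_decompositions(["X1", "X2", "X3"])
for rhs in decomps:
    print("SSR(X1,X2,X3) =", rhs)
```

The order of the predictors listed after the bar has no effect on the value of a term; it only reflects the order in which they were entered.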


Degrees of freedom and regression mean squares

A sequential sum of squares involving one extra predictor variable has one degree of freedom associated with it:

MSR(X2|X1) = SSR(X2|X1) / 1

A sequential sum of squares involving two extra predictor variables has two degrees of freedom associated with it:

MSR(X2,X3|X1) = SSR(X2,X3|X1) / 2


Sequential sums of squares in Minitab

• The SSR is automatically decomposed into one-degree-of-freedom sequential sums of squares, in the order in which the predictor variables are entered into the model.

• To get a sequential sum of squares involving two or more predictor variables, sum the appropriate one-degree-of-freedom sequential sums of squares.
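
The summing step can be sketched as follows, using the Seq SS column of OUTPUT #3 (order of entry: MRI, Height, Weight). The helper name is hypothetical. The sketch assumes the predictors of interest were entered last, because each Seq SS is conditional on every predictor listed above it.

```python
# Sequential SS from OUTPUT #3, in the order entered: MRI, Height, Weight.
seq_ss = [("MRI", 2697.1), ("Height", 2875.6), ("Weight", 0.0)]


def extra_ss_last(terms, k):
    """Sequential SS and mean square for the last k predictors entered.

    Only valid for the *last* k terms: each Seq SS is conditional on
    every predictor listed above it.
    """
    if not 1 <= k <= len(terms):
        raise ValueError("k must be between 1 and the number of predictors")
    ss = sum(value for _, value in terms[-k:])
    return ss, ss / k  # (sum of squares, mean square with k df)


ss, ms = extra_ss_last(seq_ss, 2)
print(f"SSR(Height,Weight|MRI) = {ss:.1f}, MSR = {ms:.1f}")
```

To obtain a sequential sum of squares for a different subset, refit the model with that subset entered last.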


OUTPUT #3

The regression equation is
PIQ = 111 + 2.06 MRI - 2.73 Height + 0.001 Weight

Predictor   Coef      SE Coef   T       P
Constant    111.35    62.97     1.77    0.086
MRI         2.0604    0.5634    3.66    0.001
Height      -2.732    1.229     -2.22   0.033
Weight      0.0006    0.1971    0.00    0.998

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  3    5572.7    1857.6   4.74   0.007
Error       34   13321.8   391.8
Total       37   18894.6

Source      DF   Seq SS
MRI         1    2697.1
Height      1    2875.6
Weight      1    0.0


OUTPUT #5

The regression equation is
PIQ = 111 - 2.73 Height + 0.001 Weight + 2.06 MRI

Predictor   Coef      SE Coef   T       P
Constant    111.35    62.97     1.77    0.086
Height      -2.732    1.229     -2.22   0.033
Weight      0.0006    0.1971    0.00    0.998
MRI         2.0604    0.5634    3.66    0.001

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  3    5572.7    1857.6   4.74   0.007
Error       34   13321.8   391.8
Total       37   18894.6

Source      DF   Seq SS
Height      1    164.0
Weight      1    169.5
MRI         1    5239.2


Testing one slope β1 = βMRI is 0

Predictor   Coef      SE Coef   T       P
Constant    111.35    62.97     1.77    0.086
Height      -2.732    1.229     -2.22   0.033
Weight      0.0006    0.1971    0.00    0.998
MRI         2.0604    0.5634    3.66    0.001

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  3    5572.7    1857.6   4.74   0.007
Error       34   13321.8   391.8
Total       37   18894.6

Source      DF   Seq SS
Height      1    164.0
Weight      1    169.5
MRI         1    5239.2


Testing one slope β2 = βHT is 0

Predictor   Coef      SE Coef   T       P
Constant    111.35    62.97     1.77    0.086
MRI         2.0604    0.5634    3.66    0.001
Weight      0.0006    0.1971    0.00    0.998
Height      -2.732    1.229     -2.22   0.033

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  3    5572.7    1857.6   4.74   0.007
Error       34   13321.8   391.8
Total       37   18894.6

Source      DF   Seq SS
MRI         1    2697.1
Weight      1    940.9
Height      1    1934.7


Testing one slope β3 = βWT is 0

Predictor   Coef      SE Coef   T       P
Constant    111.35    62.97     1.77    0.086
MRI         2.0604    0.5634    3.66    0.001
Height      -2.732    1.229     -2.22   0.033
Weight      0.0006    0.1971    0.00    0.998

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  3    5572.7    1857.6   4.74   0.007
Error       34   13321.8   391.8
Total       37   18894.6

Source      DF   Seq SS
MRI         1    2697.1
Height      1    2875.6
Weight      1    0.0
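
In each of the three outputs above, the predictor being tested is entered last, so its Seq SS is the one-degree-of-freedom sum of squares given the other two predictors. Dividing it by the full-model MSE gives an F-statistic which, by a standard result for single-slope tests, equals the square of that predictor's t-statistic. The sketch below checks this with the printed values; the names are illustrative, and agreement is only up to rounding in the printed tables.

```python
MSE_FULL = 391.8  # error mean square of the three-predictor model (34 df)

# (predictor, Seq SS when entered last, reported t-statistic)
last_entered = [
    ("MRI", 5239.2, 3.66),
    ("Height", 1934.7, -2.22),
    ("Weight", 0.0, 0.00),
]


def partial_f(seq_ss_last, mse_full):
    """F* for one slope: a one-df sequential SS divided by the full-model MSE."""
    return (seq_ss_last / 1) / mse_full


for name, ss, t in last_entered:
    print(f"{name}: F* = {partial_f(ss, MSE_FULL):.2f}, t^2 = {t * t:.2f}")
```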


Testing one slope βk is 0: why it works?

Full model: Y_i = β0 + β1 X1 + β2 X2 + β3 X3 + ε_i

SSE(F) = SSE(X1,X2,X3), df_F = n - 4

Reduced model: Y_i = β0 + β1 X1 + β2 X2 + ε_i

SSE(R) = SSE(X1,X2), df_R = n - 3


Testing one slope βk is 0: why it works? (cont’d)

The general linear test statistic:

F* = [ (SSE(R) - SSE(F)) / (df_R - df_F) ] / [ SSE(F) / df_F ]

becomes:

F* = [ SSR(X3|X1,X2) / 1 ] / [ SSE(X1,X2,X3) / (n - 4) ] = MSR(X3|X1,X2) / MSE(X1,X2,X3)


Testing whether β2 = β3 = 0

Full model: Y_i = β0 + β1 X1 + β2 X2 + β3 X3 + ε_i

SSE(F) = SSE(X1,X2,X3), df_F = n - 4

Reduced model: Y_i = β0 + β1 X1 + ε_i

SSE(R) = SSE(X1), df_R = n - 2


Testing whether β2 = β3 = 0 (cont’d)

The general linear test statistic:

F* = [ (SSE(R) - SSE(F)) / (df_R - df_F) ] / [ SSE(F) / df_F ]

becomes:

F* = [ SSR(X2,X3|X1) / 2 ] / [ SSE(X1,X2,X3) / (n - 4) ] = MSR(X2,X3|X1) / MSE(X1,X2,X3)
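
The general linear test can be written as a small function and applied to both hypotheses considered so far. This is a sketch rather than a reference implementation: the function name is made up, and the inputs are the rounded error sums of squares and degrees of freedom from OUTPUT #1, #2, and #3.

```python
def general_linear_f(sse_reduced, df_reduced, sse_full, df_full):
    """F* = [(SSE(R) - SSE(F)) / (df_R - df_F)] / [SSE(F) / df_F]."""
    if df_reduced <= df_full:
        raise ValueError("the reduced model must have more error df than the full model")
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


# Full model (MRI, Height, Weight): SSE = 13321.8 on 34 df (OUTPUT #3).
# H0: beta3 = 0 -> reduced model (MRI, Height): SSE = 13321.8 on 35 df (OUTPUT #2).
f_weight = general_linear_f(13321.8, 35, 13321.8, 34)
# H0: beta2 = beta3 = 0 -> reduced model (MRI): SSE = 16197.5 on 36 df (OUTPUT #1).
f_height_weight = general_linear_f(16197.5, 36, 13321.8, 34)
print(f"F* for beta3 = 0: {f_weight:.2f}")
print(f"F* for beta2 = beta3 = 0: {f_height_weight:.2f}")
```

Because the numerator SSE(R) - SSE(F) is exactly the sequential sum of squares for the dropped predictors, the same statistics can be read off the Seq SS column without refitting the reduced model.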


OUTPUT #3

The regression equation is
PIQ = 111 + 2.06 MRI - 2.73 Height + 0.001 Weight

Predictor   Coef      SE Coef   T       P
Constant    111.35    62.97     1.77    0.086
MRI         2.0604    0.5634    3.66    0.001
Height      -2.732    1.229     -2.22   0.033
Weight      0.0006    0.1971    0.00    0.998

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  3    5572.7    1857.6   4.74   0.007
Error       34   13321.8   391.8
Total       37   18894.6

Source      DF   Seq SS
MRI         1    2697.1
Height      1    2875.6
Weight      1    0.0


H0: β2 = β3 = 0
HA: β2 ≠ 0 or β3 ≠ 0

F* = (2875.6 / 2) / 391.8 = 3.670

Cumulative Distribution Function

F distribution with 2 DF in numerator and 34 DF in denominator

x        P( X <= x )
3.6700   0.9640

P-value is: P(F(2,34) > 3.670) = 1 - 0.964 = 0.036


Getting P-value for F-statistic in Minitab

• Select Calc >> Probability Distributions >> F…

• Select Cumulative Probability. Use default noncentrality parameter of 0.

• Type in numerator DF and denominator DF.

• Select Input constant. Type in F-statistic. Answer appears in session window.

• P-value is 1 minus the number that appears.
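
Outside Minitab, the same "1 minus the cumulative probability" step can be reproduced in a few lines. The sketch below is not part of the original slides. It relies on a closed form that holds only when the numerator has 2 degrees of freedom, which covers the F* = 3.670 example with 2 and 34 DF. For other degrees of freedom a statistics library would be needed, for example `scipy.stats.f.sf`.

```python
def f_upper_tail_two_numerator_df(f_stat, denom_df):
    """P(F > f_stat) for an F distribution with 2 and denom_df degrees of freedom.

    Closed form valid only when the numerator df is 2:
    P(F > x) = (1 + 2x / denom_df) ** (-denom_df / 2).
    """
    if f_stat < 0 or denom_df <= 0:
        raise ValueError("f_stat must be >= 0 and denom_df must be > 0")
    return (1.0 + 2.0 * f_stat / denom_df) ** (-denom_df / 2.0)


cumulative = 1.0 - f_upper_tail_two_numerator_df(3.670, 34)  # what Minitab reports
p_value = 1.0 - cumulative                                   # "1 minus the number"
print(f"P(X <= 3.670) = {cumulative:.4f}, P-value = {p_value:.3f}")
```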


Test whether β1 = β3 = 0

Analysis of Variance

Source      DF   SS        MS       F      P
Regression  3    5572.7    1857.6   4.74   0.007
Error       34   13321.8   391.8
Total       37   18894.6

Source      DF   Seq SS
Height      1    164.0
Weight      1    169.5
MRI         1    5239.2
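
The slide stops at the output, so the remaining arithmetic is sketched here under the same recipe used for β2 = β3 = 0. Height (X2) is entered first, so the last two Seq SS values sum to SSR(X1,X3|X2). The constant names are illustrative, the inputs are the rounded values printed above, and the P-value uses the closed form for an F distribution with 2 numerator degrees of freedom.

```python
# Seq SS with Height entered first (order: Height, Weight, MRI).
SEQ_SS_WEIGHT_GIVEN_HEIGHT = 169.5
SEQ_SS_MRI_GIVEN_HEIGHT_WEIGHT = 5239.2
MSE_FULL = 391.8   # full-model error mean square
DF_ERROR = 34      # full-model error degrees of freedom

# SSR(X1,X3|X2): sum of the last two one-df sequential sums of squares.
extra_ss = SEQ_SS_WEIGHT_GIVEN_HEIGHT + SEQ_SS_MRI_GIVEN_HEIGHT_WEIGHT
f_star = (extra_ss / 2) / MSE_FULL

# Upper-tail probability of F(2, DF_ERROR); closed form for 2 numerator df.
p_value = (1.0 + 2.0 * f_star / DF_ERROR) ** (-DF_ERROR / 2.0)
print(f"SSR(X1,X3|X2) = {extra_ss:.1f}, F* = {f_star:.2f}, P-value = {p_value:.4f}")
```

A small P-value here would indicate that at least one of β1 and β3 differs from 0 once Height is in the model.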