Business Statistics, 4e, by Ken Black. © 2003 John Wiley & Sons. Chapter 13: Simple Regression Analysis

Learning Objectives

• Compute the equation of a simple regression line from a sample of data, and interpret the slope and intercept of the equation.

• Understand the usefulness of residual analysis in testing the assumptions underlying regression analysis and in examining the fit of the regression line to the data.

• Compute a standard error of the estimate and interpret its meaning.

• Compute a coefficient of determination and interpret it.

• Test hypotheses about the slope of the regression model and interpret the results.

• Estimate values of Y using the regression model.

Regression and Correlation

• Regression analysis is the process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable.

• Correlation is a measure of the degree of relatedness of two variables.

Simple Regression Analysis

• Bivariate (two-variable) linear regression: the most elementary regression model

  – Dependent variable: the variable to be predicted, usually called Y

  – Independent variable: the predictor or explanatory variable, usually called X

Airline Cost Data

Number of Passengers (X)    Cost ($1,000) (Y)
61                          4.280
63                          4.080
67                          4.420
69                          4.170
70                          4.480
74                          4.300
76                          4.820
81                          4.700
86                          5.110
91                          5.130
95                          5.640
97                          5.560

Scatter Plot of Airline Cost Data

[Figure: scatter plot with Number of Passengers (0 to 120) on the horizontal axis and Cost ($1000) (0 to 6) on the vertical axis]

Regression Models

Deterministic Regression Model

\[ Y = \beta_0 + \beta_1 X \]

Probabilistic Regression Model

\[ Y = \beta_0 + \beta_1 X + \varepsilon \]

$\beta_0$ and $\beta_1$ are population parameters.

$\beta_0$ and $\beta_1$ are estimated by the sample statistics $b_0$ and $b_1$.

Equation of the Simple Regression Line

\[ \hat{Y} = b_0 + b_1 X \]

where:

$b_0$ = the sample intercept

$b_1$ = the sample slope

$\hat{Y}$ = the predicted value of Y

Least Squares Analysis

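\[ b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}} \]

\[ b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n} \]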

Least Squares Analysis

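\[ SS_{XY} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n} \]

\[ SS_{XX} = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n} \]

\[ b_1 = \frac{SS_{XY}}{SS_{XX}} \]

\[ b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n} \]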

Solving for b1 and b0 of the Regression Line: Airline Cost Example (Part 1)

Number of Passengers (X)    Cost ($1,000) (Y)    X^2       XY
61                          4.28                 3,721     261.08
63                          4.08                 3,969     257.04
67                          4.42                 4,489     296.14
69                          4.17                 4,761     287.73
70                          4.48                 4,900     313.60
74                          4.30                 5,476     318.20
76                          4.82                 5,776     366.32
81                          4.70                 6,561     380.70
86                          5.11                 7,396     439.46
91                          5.13                 8,281     466.83
95                          5.64                 9,025     535.80
97                          5.56                 9,409     539.32

ΣX = 930    ΣY = 56.69    ΣX^2 = 73,764    ΣXY = 4,462.22

Solving for b1 and b0 of the Regression Line: Airline Cost Example (Part 2)

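\[ SS_{XY} = \sum XY - \frac{(\sum X)(\sum Y)}{n} = 4{,}462.22 - \frac{(930)(56.69)}{12} = 68.745 \]

\[ SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n} = 73{,}764 - \frac{(930)^2}{12} = 1689 \]

\[ b_1 = \frac{SS_{XY}}{SS_{XX}} = \frac{68.745}{1689} = .0407 \]

\[ b_0 = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n} = \frac{56.69}{12} - (.0407)\frac{930}{12} = 1.57 \]

\[ \hat{Y} = 1.57 + .0407X \]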
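The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch that uses only the airline data and the sum-of-squares formulas shown above; the names `least_squares`, `passengers`, and `cost` are illustrative and do not come from the original slides.

```python
# Airline cost data from the slides: passengers (X) and cost in $1,000 (Y).
passengers = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
cost = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]


def least_squares(x, y):
    """Return (b0, b1, SS_XY, SS_XX) using the computational formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n
    b1 = ss_xy / ss_xx                 # sample slope
    b0 = sum_y / n - b1 * sum_x / n    # sample intercept
    return b0, b1, ss_xy, ss_xx


b0, b1, ss_xy, ss_xx = least_squares(passengers, cost)
print(f"SS_XY = {ss_xy:.3f}, SS_XX = {ss_xx:.0f}")
print(f"Y-hat = {b0:.2f} + {b1:.4f} X")
```

Rounded to the precision used on the slide, the slope and intercept should agree with b1 = .0407 and b0 = 1.57.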

Graph of Regression Line for the Airline Cost Example

[Figure: the airline cost data with the fitted regression line; Number of Passengers (0 to 120) on the horizontal axis and Cost ($1000) (0 to 6) on the vertical axis]

Airline Cost: Excel Summary Output

SUMMARY OUTPUT

Regression Statistics

Multiple R 0.94820033

R Square 0.89908386

Adjusted R Square 0.88899225

Standard Error 0.17721746

Observations 12

ANOVA

  df SS MS F Significance F

Regression 1 2.79803 2.79803 89.092179 2.7E-06

Residual 10 0.31406 0.03141

Total 11 3.11209      

                      Coefficients   Standard Error   t Stat    P-value
Intercept             1.56979278     0.33808          4.64322   0.0009175
Number of Passengers  0.0407016      0.00431          9.43887   2.692E-06

Residual Analysis: Airline Cost Example

Number of Passengers (X)    Cost ($1,000) (Y)    Predicted Value (Ŷ)    Residual (Y − Ŷ)
61                          4.28                 4.053                   .227
63                          4.08                 4.134                  -.054
67                          4.42                 4.297                   .123
69                          4.17                 4.378                  -.208
70                          4.48                 4.419                   .061
74                          4.30                 4.582                  -.282
76                          4.82                 4.663                   .157
81                          4.70                 4.867                  -.167
86                          5.11                 5.070                   .040
91                          5.13                 5.274                  -.144
95                          5.64                 5.436                   .204
97                          5.56                 5.518                   .042

Σ(Y − Ŷ) = -.001
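The residual table can be regenerated as follows. This is a sketch, with variable names chosen for illustration; it refits the line at full precision, so individual values may differ from the slide in the last printed digit. With an intercept in the model, least-squares residuals sum to zero apart from floating-point error, so the -.001 on the slide reflects rounding of the tabulated values.

```python
x = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept (deviation form of the formulas).
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
b0 = y_bar - b1 * x_bar

# Predicted values and residuals for every observation.
predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, predicted)]

for xi, yi, yh, e in zip(x, y, predicted, residuals):
    print(f"{xi:3d}  {yi:5.2f}  {yh:6.3f}  {e:7.3f}")
print(f"Sum of residuals: {sum(residuals):.6f}")
```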

Excel Graph of Residuals for the Airline Cost Example

[Figure: residuals (-0.3 to 0.2) plotted against Number of Passengers (60 to 100)]

Nonlinear Residual Plot

[Figure: residual plot against X, with a reference line at 0]

Nonconstant Error Variance

[Figure: two residual plots against X, each with a reference line at 0]

Graphs of Nonindependent Error Terms

[Figure: two residual plots against X, each with a reference line at 0]

Healthy Residual Plot

[Figure: residual plot against X, with a reference line at 0]

Standard Error of the Estimate

Sum of Squares Error

\[ SSE = \sum (Y - \hat{Y})^2 = \sum Y^2 - b_0 \sum Y - b_1 \sum XY \]

Standard Error of the Estimate

\[ S_e = \sqrt{\frac{SSE}{n - 2}} \]

Determining SSE for the Airline Cost Example

Number of Passengers (X)    Cost ($1,000) (Y)    Residual (Y − Ŷ)    (Y − Ŷ)^2
61                          4.28                  .227                .05153
63                          4.08                 -.054                .00292
67                          4.42                  .123                .01513
69                          4.17                 -.208                .04326
70                          4.48                  .061                .00372
74                          4.30                 -.282                .07952
76                          4.82                  .157                .02465
81                          4.70                 -.167                .02789
86                          5.11                  .040                .00160
91                          5.13                 -.144                .02074
95                          5.64                  .204                .04162
97                          5.56                  .042                .00176

Σ(Y − Ŷ) = -.001    Σ(Y − Ŷ)^2 = .31434

Sum of squares of error = SSE = .31434

Standard Error of the Estimate for the Airline Cost Example

Sum of Squares Error

\[ SSE = \sum (Y - \hat{Y})^2 = 0.31434 \]

Standard Error of the Estimate

\[ S_e = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{0.31434}{10}} = 0.1773 \]
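As a rough check on SSE and the standard error of the estimate, the sketch below computes SSE two ways: directly from the residuals and from the shortcut formula ΣY² − b0ΣY − b1ΣXY. Variable names are illustrative. Because the fit here is not rounded, the results land near the Excel values (SSE about 0.3141, S_e about 0.1772) and differ slightly from the slide's 0.31434 and 0.1773, which were built from rounded residuals.

```python
import math

x = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
b0 = y_bar - b1 * x_bar

# SSE from the definition: sum of squared residuals.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# SSE from the shortcut formula on the earlier slide.
sum_y = sum(y)
sum_y2 = sum(yi * yi for yi in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sse_shortcut = sum_y2 - b0 * sum_y - b1 * sum_xy

# Standard error of the estimate, with n - 2 degrees of freedom.
s_e = math.sqrt(sse / (n - 2))

print(f"SSE = {sse:.5f} (shortcut: {sse_shortcut:.5f})")
print(f"S_e = {s_e:.4f}")
```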

Coefficient of Determination

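\[ SS_{YY} = \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n} \]

\[ SS_{YY} = \text{explained variation} + \text{unexplained variation} \]

\[ SS_{YY} = SSR + SSE \]

\[ 1 = \frac{SSR}{SS_{YY}} + \frac{SSE}{SS_{YY}} \]

\[ r^2 = \frac{SSR}{SS_{YY}} = 1 - \frac{SSE}{SS_{YY}} = 1 - \frac{SSE}{\sum Y^2 - \frac{(\sum Y)^2}{n}} \]

\[ 0 \le r^2 \le 1 \]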

Coefficient of Determination for the Airline Cost Example

\[ SSE = 0.31434 \]

\[ SS_{YY} = \sum Y^2 - \frac{(\sum Y)^2}{n} = 270.9251 - \frac{(56.69)^2}{12} = 3.11209 \]

\[ r^2 = 1 - \frac{SSE}{SS_{YY}} = 1 - \frac{.31434}{3.11209} = .899 \]

89.9% of the variability of the cost of flying a Boeing 737 is accounted for by the number of passengers.
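The coefficient of determination can be verified in the same way. The following is a small sketch with illustrative variable names; it computes SS_YY from the computational formula and then r² as 1 − SSE/SS_YY.

```python
x = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
b0 = y_bar - b1 * x_bar

# Total variation in Y: sum(Y^2) - (sum Y)^2 / n.
ss_yy = sum(yi * yi for yi in y) - sum(y) ** 2 / n

# Unexplained variation, then the share of variation that is explained.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / ss_yy

print(f"SS_YY = {ss_yy:.5f}")
print(f"r^2   = {r_squared:.3f}")
```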

Hypothesis Tests for the Slope of the Regression Model

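\[ t = \frac{b_1 - \beta_1}{S_b} \]

where:

\[ S_b = \frac{S_e}{\sqrt{SS_{XX}}} \qquad S_e = \sqrt{\frac{SSE}{n - 2}} \qquad SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n} \]

$\beta_1$ = the hypothesized slope

df = n − 2

Three forms of the test:

Two-tailed: $H_0: \beta_1 = 0$, $H_1: \beta_1 \ne 0$

Upper-tailed: $H_0: \beta_1 = 0$, $H_1: \beta_1 > 0$

Lower-tailed: $H_0: \beta_1 = 0$, $H_1: \beta_1 < 0$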

Hypothesis Test: Airline Cost Example (Part 1)

$H_0: \beta_1 = 0$

$H_1: \beta_1 \ne 0$

$\alpha = .05$

df = n − 2 = 12 − 2 = 10

$t_{.025,10} = 2.228$

If |t| > 2.228, reject $H_0$.

If −2.228 ≤ t ≤ 2.228, do not reject $H_0$.

Hypothesis Test: Airline Cost Example (Part 2)

\[ t = \frac{b_1 - \beta_1}{S_b} = \frac{.0407 - 0}{\dfrac{.1773}{\sqrt{73{,}764 - \frac{(930)^2}{12}}}} = 9.43 \]

Since t = 9.43 > 2.228, reject $H_0$.
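The slope test can be sketched in Python as shown here. The critical value 2.228 is taken from the slides rather than computed, and the names `s_b`, `t_stat`, and `T_CRIT` are illustrative. With an unrounded fit, t comes out near the 9.44 reported by Excel and MINITAB.

```python
import math

x = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))     # standard error of the estimate
s_b = s_e / math.sqrt(ss_xx)       # standard error of the slope

beta_1 = 0.0                       # hypothesized slope under H0
t_stat = (b1 - beta_1) / s_b

T_CRIT = 2.228                     # t(.025, 10), table value quoted on the slides
decision = "reject H0" if abs(t_stat) > T_CRIT else "do not reject H0"
print(f"t = {t_stat:.2f}: {decision}")
```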

Testing the Overall Model (Part 1)

$H_0: \beta_1 = 0$

$H_1: \beta_1 \ne 0$

$\alpha = .05$

$df_{reg} = k = 1$

$df_{err} = n - k - 1 = 12 - 1 - 1 = 10$

$F_{.05,1,10} = 4.96$

If F > 4.96, reject $H_0$.

If F ≤ 4.96, do not reject $H_0$.

Testing the Overall Model (Part 2)

ANOVA

  df SS MS F Significance F

Regression 1 2.79803 2.79803 89.092179 2.7E-06

Residual 10 0.31406 0.03141

Total 11 3.11209      

\[ F = \frac{MS_{reg}}{MS_{err}} = \frac{SS_{reg} / df_{reg}}{SS_{err} / df_{err}} = \frac{2.7980 / 1}{0.3141 / 10} = \frac{2.7980}{0.03141} = 89.09 \]

F = 89.09 > 4.96, reject H0
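The ANOVA quantities can be rebuilt from the data as a check. This is a sketch with illustrative names; the critical value 4.96 is the table value from the slides. In simple regression with one predictor, F equals the square of the slope's t statistic, which is why 89.09 is approximately 9.44 squared.

```python
x = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n, k = len(x), 1                   # k = number of predictors
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
b0 = y_bar - b1 * x_bar

ss_total = sum((yi - y_bar) ** 2 for yi in y)                      # SS_YY
ss_err = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))   # SSE
ss_reg = ss_total - ss_err                                         # SSR

ms_reg = ss_reg / k
ms_err = ss_err / (n - k - 1)
f_stat = ms_reg / ms_err

F_CRIT = 4.96                      # F(.05, 1, 10), table value quoted on the slides
print(f"SSR = {ss_reg:.4f}, SSE = {ss_err:.4f}, F = {f_stat:.2f}")
print("reject H0" if f_stat > F_CRIT else "do not reject H0")
```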

Point Estimation for the Airline Cost Example

\[ \hat{Y} = 1.57 + 0.0407X \]

For X = 73,

\[ \hat{Y} = 1.57 + 0.0407(73) = 4.5411 \]

or $4,541.10

Confidence Interval to Estimate Y : Airline Cost Example

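\[ \hat{Y} \pm t_{\alpha/2,\,n-2}\, S_e \sqrt{\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{SS_{XX}}} \]

where:

$X_0$ = a particular value of X

\[ SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n} \]

For $X_0 = 73$ and a 95% confidence level,

\[ 4.5411 \pm (2.228)(0.1773)\sqrt{\frac{1}{12} + \frac{(73 - 77.5)^2}{73{,}764 - \frac{(930)^2}{12}}} = 4.5411 \pm .1220 \]

\[ 4.4191 \le E(Y_{73}) \le 4.6631 \]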
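The interval for the mean response can be sketched as a small function. The function name `mean_response_interval` and the constant `T_CRIT` are illustrative; the critical value 2.228 is the one quoted on the slides. Because the fit is not rounded, the output may differ from the slide in the fourth decimal place.

```python
import math

x = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
b0 = y_bar - b1 * x_bar
s_e = math.sqrt(sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2))

T_CRIT = 2.228  # t(.025, 10), table value quoted on the slides


def mean_response_interval(x0):
    """Return (point estimate, half-width) of the 95% CI for E(Y) at x0."""
    y_hat = b0 + b1 * x0
    half_width = T_CRIT * s_e * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_xx)
    return y_hat, half_width


y_hat_73, half_73 = mean_response_interval(73)
print(f"{y_hat_73 - half_73:.4f} <= E(Y_73) <= {y_hat_73 + half_73:.4f}")
```

The half-width grows as X0 moves away from the mean of X (77.5), which is why the intervals at 62 and 90 are wider than the one at 73.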

Confidence Interval to Estimate the Average Value of Y for Some Values of X: Airline Cost Example

X     Ŷ ± half-width       Confidence Interval
62    4.0934 ± .1876       3.9058 to 4.2810
68    4.3376 ± .1461       4.1915 to 4.4837
73    4.5411 ± .1220       4.4191 to 4.6631
85    5.0295 ± .1349       4.8946 to 5.1644
90    5.2330 ± .1656       5.0674 to 5.3986

Prediction Interval to Estimate Y for a given value of X

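\[ \hat{Y} \pm t_{\alpha/2,\,n-2}\, S_e \sqrt{1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{SS_{XX}}} \]

where:

$X_0$ = a particular value of X

\[ SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n} \]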
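For illustration, the prediction interval formula can be applied to the airline data at X0 = 73. The slide gives no worked number for this case, so the choice of X0, the function name `prediction_interval`, and the reuse of the critical value 2.228 from the earlier slides are all assumptions of this sketch.

```python
import math

x = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
b0 = y_bar - b1 * x_bar
s_e = math.sqrt(sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2))

T_CRIT = 2.228  # t(.025, 10), table value quoted on the earlier slides


def prediction_interval(x0):
    """Return (lower, upper) 95% prediction limits for a single Y at x0."""
    y_hat = b0 + b1 * x0
    # The leading 1 under the root accounts for the scatter of one new Y.
    half_width = T_CRIT * s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
    return y_hat - half_width, y_hat + half_width


lower_73, upper_73 = prediction_interval(73)
print(f"95% PI at X0 = 73: {lower_73:.4f} to {upper_73:.4f}")
```

The extra 1 under the square root is the only difference from the confidence interval for the mean, and it makes the prediction interval wider at every X0.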

Confidence Intervals for Estimation

[Figure: regression plot of Cost (4 to 6) against Number of Passengers (60 to 100), showing the regression line with its 95% CI and 95% PI bands]

MINITAB Regression Analysis of the Airline Cost Example

The regression equation is
Cost = 1.57 + 0.0407 Number of Passengers

Predictor    Coef        StDev       T       P
Constant     1.5698      0.3381      4.64    0.001
Number o     0.040702    0.004312    9.44    0.000

S = 0.1772 R-Sq = 89.9% R-Sq(adj) = 88.9%

Analysis of Variance

Source            DF    SS        MS        F        P
Regression         1    2.7980    2.7980    89.09    0.000
Residual Error    10    0.3141    0.0314
Total             11    3.1121

Obs    Number o    Cost      Fit       StDev Fit    Residual    St Resid
  1    61.0        4.2800    4.0526    0.0876        0.2274      1.48
  2    63.0        4.0800    4.1340    0.0808       -0.0540     -0.34
  3    67.0        4.4200    4.2968    0.0683        0.1232      0.75
  4    69.0        4.1700    4.3782    0.0629       -0.2082     -1.26
  5    70.0        4.4800    4.4189    0.0605        0.0611      0.37
  6    74.0        4.3000    4.5817    0.0533       -0.2817     -1.67
  7    76.0        4.8200    4.6631    0.0516        0.1569      0.93
  8    81.0        4.7000    4.8666    0.0533       -0.1666     -0.99
  9    86.0        5.1100    5.0701    0.0629        0.0399      0.24
 10    91.0        5.1300    5.2736    0.0775       -0.1436     -0.90
 11    95.0        5.6400    5.4364    0.0912        0.2036      1.34
 12    97.0        5.5600    5.5178    0.0984        0.0422      0.29

Pearson Product-Moment Correlation Coefficient

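\[ r = \frac{SS_{XY}}{\sqrt{SS_{XX}\, SS_{YY}}} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}} \]

\[ -1 \le r \le 1 \]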
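Applied to the airline data, the correlation formula can be sketched as below. The slide itself does not work this example, so treating the airline data as the input is an assumption, and the function name `pearson_r` is illustrative. The result should match the "Multiple R" value in the Excel output, and its square should match R Square.

```python
import math

x = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]


def pearson_r(xs, ys):
    """Pearson product-moment correlation via the sum-of-squares formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xy = sum(a * b for a, b in zip(xs, ys)) - sum_x * sum_y / n
    ss_xx = sum(a * a for a in xs) - sum_x ** 2 / n
    ss_yy = sum(b * b for b in ys) - sum_y ** 2 / n
    return ss_xy / math.sqrt(ss_xx * ss_yy)


r = pearson_r(x, y)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```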

Three Degrees of Correlation

[Figure: three scatter plots, labeled r < 0, r > 0, and r = 0]