Chapter 12 Regression and Correlation Analysis

Stats and Maths

Page 1: Stats and Maths

8/12/2019 Stats and Maths

http://slidepdf.com/reader/full/stats-and-maths 1/29

Chapter 12

Regression and Correlation Analysis

Chapter 12 - Chapter Outcomes

After studying the material in this chapter, you should be able to:

Calculate and interpret the simple correlation between two variables.
Determine whether the correlation is significant.
Calculate and interpret the simple linear regression coefficients for a set of data.
Understand the basic assumptions behind regression analysis.
Determine whether a regression model is significant.

Chapter 12 - Chapter Outcomes (continued)

After studying the material in this chapter, you should be able to:

Calculate and interpret confidence intervals for the regression coefficients.
Recognize regression analysis applications for purposes of prediction and description.
Recognize some potential problems if regression analysis is used incorrectly.
Recognize several nonlinear relationships between two variables.

Scatter Diagrams

A scatter plot is a graph that may be used to represent the relationship between two variables. It is also referred to as a scatter diagram.

Dependent and Independent Variables

A dependent variable is the variable to be predicted or explained in a regression model. This variable is assumed to be functionally related to the independent variable.

Dependent and Independent Variables

An independent variable is the variable related to the dependent variable in a regression equation. The independent variable is used in a regression model to estimate the value of the dependent variable.

Two Variable Relationships (Figure 11-1)

[Figure 11-1: scatter plots of Y against X in five panels: (a) Linear, (b) Linear, (c) Curvilinear, (d) Curvilinear, (e) No Relationship]

Correlation

The correlation coefficient is a quantitative measure of the strength of the linear relationship between two variables. The correlation ranges from +1.0 to -1.0. A correlation of +1.0 or -1.0 indicates a perfect linear relationship, whereas a correlation of 0 indicates no linear relationship.

Correlation

SAMPLE CORRELATION COEFFICIENT, or Pearson’s Correlation Coefficient

\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}
\]

where:

r = Sample correlation coefficient
n = Sample size
x = Value of the independent variable
y = Value of the dependent variable

Correlation

SAMPLE CORRELATION COEFFICIENT

or the algebraic equivalent:

\[
r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}
\]

Correlation (Example 11-1)

        Sales    Years
          y        x         xy         y^2      x^2
         487       3       1,461      237,169      9
         445       5       2,225      198,025     25
         272       2         544       73,984      4
         641       8       5,128      410,881     64
         187       2         374       34,969      4
         440       6       2,640      193,600     36
         346       7       2,422      119,716     49
         238       1         238       56,644      1
         312       4       1,248       97,344     16
         269       2         538       72,361      4
         655       9       5,895      429,025     81
         563       6       3,378      316,969     36
Total  4,855      55      26,091    2,240,687    329

(Table 11-1)

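Correlation (Example 11-1)

\[
r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}
\]

\[
r = \frac{12(26{,}091) - 55(4{,}855)}{\sqrt{\left[12(329) - (55)^2\right]\left[12(2{,}240{,}687) - (4{,}855)^2\right]}} = 0.8325
\]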
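The arithmetic in Example 11-1 can be checked with a short script. This is a minimal sketch in plain Python: the data are the Sales (y) and Years (x) columns of Table 11-1, while the function name `pearson_r` and the variable names are illustrative choices, not part of the original slides.

```python
import math

# Sales (y) and Years (x) columns from Table 11-1.
sales = [487, 445, 272, 641, 187, 440, 346, 238, 312, 269, 655, 563]
years = [3, 5, 2, 8, 2, 6, 7, 1, 4, 2, 9, 6]


def pearson_r(x, y):
    """Sample correlation using the algebraic (sums-based) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(years, sales)
print(round(r, 4))  # 0.8325, matching the slide
```

Working from the column totals rather than from deviations about the means mirrors the hand calculation on the slide, so each intermediate sum can be compared with the Total row of Table 11-1.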

Simple Linear Regression Analysis

Simple linear regression analysis analyzes the linear relationship that exists between a dependent variable and a single independent variable.

Simple Linear Regression Analysis

SIMPLE LINEAR REGRESSION MODEL (POPULATION MODEL)

\[
y = \beta_0 + \beta_1 x + \varepsilon
\]

where:

y = Value of the dependent variable
x = Value of the independent variable
\(\beta_0\) = Population’s y-intercept
\(\beta_1\) = Slope of the population regression line
\(\varepsilon\) = Error term, or residual

Simple Linear Regression Analysis

The simple linear regression model has four assumptions:

Individual values of the error terms, \(\varepsilon_i\), are statistically independent of one another.
The distribution of all possible values of \(\varepsilon_i\) is normal.
The distributions of possible \(\varepsilon_i\) values have equal variances for all values of x.
The means of the dependent variable, y, for all specified values of the independent variable, x, can be connected by a straight line called the population regression model.

Simple Linear Regression Analysis

REGRESSION COEFFICIENTS

In the simple regression model, there are two coefficients: the intercept and the slope.

Simple Linear Regression Analysis

The regression slope coefficient gives the average change in the dependent variable for a unit increase in the independent variable. The slope coefficient may be positive or negative, depending on the relationship between the two variables.

Simple Linear Regression Analysis

The least squares criterion is used for determining a regression line that minimizes the sum of squared residuals.
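Written in symbols, the criterion can be sketched as follows. The notation is an assumption borrowed from the estimated (sample) model used elsewhere in this chapter, with \(b_0\) the intercept, \(b_1\) the slope, and \(\hat{y} = b_0 + b_1 x\) the predicted value:

\[
\min_{b_0,\, b_1} \sum (y - \hat{y})^2 = \min_{b_0,\, b_1} \sum (y - b_0 - b_1 x)^2
\]

Squaring the residuals keeps positive and negative deviations from cancelling each other out and penalizes large misses more heavily than small ones.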

Simple Linear Regression Analysis

A residual is the difference between the actual value of the dependent variable and the value predicted by the regression model.

\[
\text{residual} = y - \hat{y}
\]

Simple Linear Regression Analysis

ESTIMATED REGRESSION MODEL (SAMPLE MODEL)

\[
\hat{y}_i = b_0 + b_1 x
\]

where:

\(\hat{y}_i\) = Estimated, or predicted, y value
\(b_0\) = Unbiased estimate of the regression intercept
\(b_1\) = Unbiased estimate of the regression slope
x = Value of the independent variable

Simple Linear Regression Analysis

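LEAST SQUARES EQUATIONS

\[
b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}
\]

algebraic equivalent:

\[
b_1 = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}}
\]

and

\[
b_0 = \bar{y} - b_1 \bar{x}
\]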

Simple Linear Regression Analysis (Annual Truck Repair Expense Example, page 662)

The director of Chapel Hill is interested in the relationship between the age of a garbage truck and its annual repair expense, using data for 4 trucks.

Repair expense during         Age of truck
last year, in hundreds (y)    in years (x)     xy     y^2    x^2
            7                       5          35      49     25
            7                       3          21      49      9
            6                       3          18      36      9
            4                       1           4      16      1

\[
\sum x = 12, \quad \sum y = 24, \quad \sum x^2 = 44, \quad \sum xy = 78, \quad \sum y^2 = 150
\]

Simple Linear Regression Analysis

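(Table 11-3)

\[
b_1 = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}} = \frac{78 - \dfrac{12(24)}{4}}{44 - \dfrac{(12)^2}{4}} = 0.75
\]

\[
b_0 = \bar{y} - b_1 \bar{x} = 6 - 0.75(3) = 3.75
\]

The least squares regression line is:

\[
\hat{y} = 3.75 + 0.75(x)
\]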
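The truck example can be reproduced with a few lines of plain Python. This is a sketch only: the data come from the four trucks in the example, and the function name `least_squares` and the variable names are illustrative assumptions rather than anything defined in the slides.

```python
# Truck data from the example: age in years (x), repair expense in hundreds (y).
age = [5, 3, 3, 1]
expense = [7, 7, 6, 4]


def least_squares(x, y):
    """Return (b0, b1) using the algebraic least squares equations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    # Slope: (sum(xy) - sum(x)sum(y)/n) / (sum(x^2) - (sum(x))^2/n)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: mean of y minus slope times mean of x
    b0 = sum_y / n - b1 * (sum_x / n)
    return b0, b1


b0, b1 = least_squares(age, expense)
print(b0, b1)  # 3.75 0.75
```

Under this fitted line, each additional year of truck age is associated with an average increase of 0.75 (in hundreds) in annual repair expense.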

Least Squares Regression Properties

The sum of the residuals from the least squares regression line is 0.
The sum of the squared residuals is a minimum.
The simple regression line always passes through the mean of the y variable and the mean of the x variable.
The least squares coefficients are unbiased estimates of \(\beta_0\) and \(\beta_1\).
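The first and third properties can be checked numerically on the truck repair example. The sketch below assumes the fitted coefficients from that example (intercept 3.75, slope 0.75); the variable names are illustrative.

```python
# Truck data and fitted coefficients from the repair expense example.
age = [5, 3, 3, 1]          # x
expense = [7, 7, 6, 4]      # y
b0, b1 = 3.75, 0.75

# Residual for each truck: actual y minus predicted y.
residuals = [y - (b0 + b1 * x) for x, y in zip(age, expense)]
residual_sum = sum(residuals)

# The fitted line evaluated at the mean of x should equal the mean of y.
x_bar = sum(age) / len(age)
y_bar = sum(expense) / len(expense)
y_hat_at_x_bar = b0 + b1 * x_bar

print(residuals)               # [-0.5, 1.0, 0.0, -0.5]
print(residual_sum)            # 0.0
print(y_hat_at_x_bar, y_bar)   # 6.0 6.0
```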

Simple Linear Regression Analysis

Standard Error of Estimate: Statisticians use the standard error of estimate to measure the reliability of the estimated regression equation. In other words, the standard error of estimate measures the variability, or scatter, of the observations around the estimated regression line.
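The slide does not give a formula, so the sketch below assumes the usual definition for simple linear regression: the square root of the sum of squared residuals divided by n - 2. It reuses the truck repair data and the fitted line from that example; the names `sse` and `std_error` are illustrative.

```python
import math

# Truck data and fitted coefficients from the repair expense example.
age = [5, 3, 3, 1]          # x
expense = [7, 7, 6, 4]      # y
b0, b1 = 3.75, 0.75

# Sum of squared residuals around the estimated regression line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(age, expense))

# Assumed definition: divide by n - 2 because two coefficients were estimated.
n = len(age)
std_error = math.sqrt(sse / (n - 2))

print(sse)                  # 1.5
print(round(std_error, 3))  # 0.866
```

A smaller value means the observations lie closer to the estimated line; here the typical scatter is a little under 0.9, in the same units as repair expense (hundreds).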