
Regression Analysis:

“Regression is that technique of statistical analysis by which when one variable is known the other variable can be estimated. In regression analysis the variable, which is known, is called independent variable and which is to be estimated is called dependent variable.”

 


Nondependent and Dependent Relationships

Types of Relationship

Nondependent (correlation) -- neither one of the variables is the target. Example: protein and fat intake.

Dependent (regression) -- value of one variable is used to predict value of another variable. Example: ACT and MCAT scores for medical applicants; MCAT is the dependent and ACT is the independent variable.

Statistical Expressions

Correlation Coefficient -- index of nondependent relationship

Regression Coefficient -- index of dependent relationship

Regression: 3 Main Purposes

To describe (or model)

To predict (or estimate)

To control (or administer)

Simple Linear Regression

Statistical method for finding the “line of best fit” for one response (dependent) numerical variable based on one explanatory (independent) variable.

Difference between Correlation and Regression:

1. Degree and nature: Correlation studies whether and how strongly two or more series are related, whereas regression analysis measures the extent of this relationship in a form that provides a basis for estimation.

2. Cause and effect relationship:

Correlation specifies the relationship between two series, but it cannot specify which series is the cause and which is the effect. In regression, the series whose value is known is called the independent series and the series whose value is to be predicted is called the dependent series. The independent series is treated as the cause and the dependent series as the effect.

3. Limit of co-efficient:

The co-efficient of correlation is limited to the range -1 to +1, but this is not the case with a regression co-efficient. However, the product of the two regression co-efficients cannot be greater than 1.
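The bound on the product can be checked with a short derivation. This is a sketch that assumes the usual expressions of the two regression coefficients in terms of the correlation coefficient r and the standard deviations of the two series, written here as \sigma_x and \sigma_y:

```latex
b_{xy} \cdot b_{yx}
  = \left( r\,\frac{\sigma_x}{\sigma_y} \right)
    \left( r\,\frac{\sigma_y}{\sigma_x} \right)
  = r^{2} \le 1
  \qquad \text{since } -1 \le r \le 1 .
```

So one regression coefficient can exceed 1 in magnitude only when the other is correspondingly smaller than 1.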

Regression Lines:

The regression analysis between two related series of data is usually done with the help of diagrams.

On the scatter diagram obtained by plotting the various values of related series X and Y, two lines of best fit are drawn through the various points of the diagrams, which are called regression lines.

Why two regression lines?

With two series there are two lines of regression, because either series can be estimated from the other.

If the two series are named X and Y, one regression line is called X on Y (used to estimate X from a known Y) and the other is called Y on X (used to estimate Y from a known X).

Deviations Taken From Arithmetic Mean

(i) Regression equation of X on Y: $X - \bar{X} = r \frac{\sigma_x}{\sigma_y} (Y - \bar{Y})$

Here $r \frac{\sigma_x}{\sigma_y}$ is known as the regression coefficient of X on Y.

It is also denoted by $b_{xy}$ or $b_1$. It measures the change in X corresponding to a unit change in Y. Also $b_{xy} = \frac{\sum xy}{\sum y^2}$

(ii) Regression equation of Y on X: $Y - \bar{Y} = r \frac{\sigma_y}{\sigma_x} (X - \bar{X})$

Here $r \frac{\sigma_y}{\sigma_x}$ is known as the regression coefficient of Y on X.

It is also denoted by $b_{yx}$ or $b_2$. It measures the change in Y corresponding to a unit change in X. Also $b_{yx} = \frac{\sum xy}{\sum x^2}$

Where $x = X - \bar{X}$ (deviation from the mean of the X series) and $y = Y - \bar{Y}$ (deviation from the mean of the Y series).

Also:

Regression equation of X on Y: $(X - \bar{X}) = b_1 (Y - \bar{Y})$

Regression equation of Y on X: $(Y - \bar{Y}) = b_2 (X - \bar{X})$

Here r is known as the coefficient of correlation between the X and Y series, and $r = \pm\sqrt{b_1 \times b_2}$, taking the sign shared by $b_1$ and $b_2$.

Least square method:

X on Y:

$\sum X = n a + b \sum Y$

$\sum XY = a \sum Y + b \sum Y^2$

$X = a + bY$

Y on X:

$\sum Y = n a + b \sum X$

$\sum XY = a \sum X + b \sum X^2$

$Y = a + bX$

Example:

Calculate the regression equations taking deviations of items from the mean of X and Y series.

X Y

6 9

2 11

10 5

4 8

8 7
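One way to work this example is sketched below in Python. It reads the data as the five pairs (6, 9), (2, 11), (10, 5), (4, 8), (8, 7) and applies the deviation-from-mean formulas b_xy = Σxy / Σy² and b_yx = Σxy / Σx². The function name `regression_from_means` is an illustrative choice, not something from the slides.

```python
# Regression coefficients from deviations about the arithmetic means.
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]

def regression_from_means(xs, ys):
    """Return (mean_x, mean_y, b_xy, b_yx) using deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # x = X - mean of X
    dy = [y - mean_y for y in ys]          # y = Y - mean of Y
    sum_xy = sum(p * q for p, q in zip(dx, dy))
    sum_x2 = sum(p * p for p in dx)
    sum_y2 = sum(q * q for q in dy)
    b_xy = sum_xy / sum_y2                 # regression coefficient of X on Y
    b_yx = sum_xy / sum_x2                 # regression coefficient of Y on X
    return mean_x, mean_y, b_xy, b_yx

mean_x, mean_y, b_xy, b_yx = regression_from_means(X, Y)

# Rewrite each line in intercept form: X = a_x + b_xy * Y and Y = a_y + b_yx * X.
a_x = mean_x - b_xy * mean_y
a_y = mean_y - b_yx * mean_x

print(f"X on Y: X = {a_x:.2f} + ({b_xy:.2f}) Y")
print(f"Y on X: Y = {a_y:.2f} + ({b_yx:.2f}) X")
```

With these data the means are 6 and 8, Σxy = -26, Σx² = 40 and Σy² = 20, so the lines work out to X = 16.4 - 1.3Y and Y = 11.9 - 0.65X.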

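Deviations taken from assumed mean

(i) Regression equation of X on Y: $X - \bar{X} = r \frac{\sigma_x}{\sigma_y} (Y - \bar{Y})$

Here $r \frac{\sigma_x}{\sigma_y} = \frac{N \sum dx\,dy - \sum dx \sum dy}{N \sum dy^2 - (\sum dy)^2}$

(ii) Regression equation of Y on X: $Y - \bar{Y} = r \frac{\sigma_y}{\sigma_x} (X - \bar{X})$

Here $r \frac{\sigma_y}{\sigma_x} = \frac{N \sum dx\,dy - \sum dx \sum dy}{N \sum dx^2 - (\sum dx)^2}$

Where dx and dy are the deviations of X and Y from their assumed means.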

Example:

From the following data of the rainfall and production of rice, find (i) the most likely production corresponding to rainfall 40 cm (ii) the most likely rainfall corresponding to production 45 kgs.

                Rainfall (cm)   Production (kgs)

Mean            35              50

Std deviation   5               8

Coefficient of correlation between rainfall and production = 0.8
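A minimal sketch of how this example could be worked from the summary figures alone, assuming the regression coefficients are taken as r·σy/σx and r·σx/σy with rainfall as X and production as Y. The variable and function names are illustrative.

```python
# Summary statistics given in the example.
mean_rain, mean_prod = 35.0, 50.0   # means (cm, kgs)
sd_rain, sd_prod = 5.0, 8.0         # standard deviations
r = 0.8                             # coefficient of correlation

# Regression coefficient of production on rainfall, and of rainfall on production.
b_prod_on_rain = r * sd_prod / sd_rain
b_rain_on_prod = r * sd_rain / sd_prod

def predict_production(rainfall_cm):
    """Most likely production for a given rainfall (production-on-rainfall line)."""
    return mean_prod + b_prod_on_rain * (rainfall_cm - mean_rain)

def predict_rainfall(production_kgs):
    """Most likely rainfall for a given production (rainfall-on-production line)."""
    return mean_rain + b_rain_on_prod * (production_kgs - mean_prod)

print(round(predict_production(40), 2))   # (i) production at 40 cm of rainfall
print(round(predict_rainfall(45), 2))     # (ii) rainfall at 45 kgs of production
```

Here the two coefficients are 1.28 and 0.5, giving a most likely production of 56.4 kgs at 40 cm of rainfall and a most likely rainfall of 32.5 cm at 45 kgs of production. Note that each question uses a different regression line.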

Example:

Obtain the lines of regression:

X Y

5 2

6 4

5 8

7 5

2 1
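One possible route for this example is the least square method, solving the two normal equations for each line directly. The sketch below does this with Cramer's rule in plain Python; `solve_normal_equations` is a hypothetical helper name, and the data are the five (X, Y) pairs listed above.

```python
X = [5, 6, 5, 7, 2]
Y = [2, 4, 8, 5, 1]

def solve_normal_equations(u, v):
    """Fit v = a + b*u by solving the normal equations
         sum(v)   = n*a      + b*sum(u)
         sum(u*v) = a*sum(u) + b*sum(u^2)
    with Cramer's rule. Returns (a, b)."""
    n = len(u)
    su, sv = sum(u), sum(v)
    suv = sum(p * q for p, q in zip(u, v))
    suu = sum(p * p for p in u)
    det = n * suu - su * su
    if det == 0:
        raise ValueError("all u values are equal; the line is not defined")
    a = (sv * suu - su * suv) / det
    b = (n * suv - su * sv) / det
    return a, b

a_yx, b_yx = solve_normal_equations(X, Y)   # Y on X: Y = a + bX
a_xy, b_xy = solve_normal_equations(Y, X)   # X on Y: X = a + bY

print(f"Y on X: Y = {a_yx:.4f} + {b_yx:.4f} X")
print(f"X on Y: X = {a_xy:.4f} + {b_xy:.4f} Y")
```

For these data ΣX = 25, ΣY = 20, ΣXY = 111, ΣX² = 139 and ΣY² = 110, so the slopes come out as 11/14 for Y on X and 11/30 for X on Y.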

 

Example:

From the following series X and Y, find out the value of:

(i) Two regression coefficients.

(ii) Two regression equations.

(iii) Most likely value of X when Y is 34.

(iv) Most likely value of Y when X is 47.

Series X Series Y

48 36

50 32

53 33

49 38

51 37

55 31

53 35

49 30
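The sketch below works this example with the assumed-mean formulas, b_xy = (NΣdxdy − ΣdxΣdy) / (NΣdy² − (Σdy)²) and b_yx = (NΣdxdy − ΣdxΣdy) / (NΣdx² − (Σdx)²). The assumed means A = 50 and B = 35 are arbitrary choices made here for illustration; any other values should give the same coefficients.

```python
X = [48, 50, 53, 49, 51, 55, 53, 49]
Y = [36, 32, 33, 38, 37, 31, 35, 30]
A, B = 50, 35            # assumed means (arbitrary, chosen for illustration)

n = len(X)
dx = [x - A for x in X]  # deviations of X from its assumed mean
dy = [y - B for y in Y]  # deviations of Y from its assumed mean

s_dx, s_dy = sum(dx), sum(dy)
s_dxdy = sum(p * q for p, q in zip(dx, dy))
s_dx2 = sum(p * p for p in dx)
s_dy2 = sum(q * q for q in dy)

numerator = n * s_dxdy - s_dx * s_dy
b_xy = numerator / (n * s_dy2 - s_dy ** 2)   # (i) regression coefficient of X on Y
b_yx = numerator / (n * s_dx2 - s_dx ** 2)   # (i) regression coefficient of Y on X

# Actual means recovered from the assumed means.
mean_x = A + s_dx / n
mean_y = B + s_dy / n

# (ii) The lines are X - mean_x = b_xy (Y - mean_y) and Y - mean_y = b_yx (X - mean_x).
x_when_y_34 = mean_x + b_xy * (34 - mean_y)  # (iii)
y_when_x_47 = mean_y + b_yx * (47 - mean_x)  # (iv)

print(round(b_xy, 4), round(b_yx, 4))
print(round(x_when_y_34, 2), round(y_when_x_47, 2))
```

With these figures the actual means are 51 and 34, so Y = 34 is the mean of Y and the most likely X is simply the mean of X.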

Example:

The two regression equations are as follows:

20X - 3Y = 975 ... (i)

4Y - 15X + 530 = 0 ... (ii)

Find out:

(i) Mean value of X and Y

(ii) The coefficient of correlation between X and Y

(iii) Estimate the value of Y when X = 90, and that of X when Y = 130.
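A sketch of one way to approach this kind of problem in Python follows. It relies on two facts: both regression lines pass through the point of means, and the product of the two regression coefficients cannot exceed 1, which decides which equation is X on Y. The function `analyse_regression_pair` and its (p, q, c) encoding of a line pX + qY = c are illustrative conventions, and the sketch assumes no coefficient is zero.

```python
from math import sqrt

def analyse_regression_pair(line1, line2):
    """Each line is (p, q, c), meaning p*X + q*Y = c.
    Returns (mean_x, mean_y, b_xy, b_yx, r)."""
    (p1, q1, c1), (p2, q2, c2) = line1, line2

    # Both lines pass through (mean_x, mean_y): solve the 2x2 system.
    det = p1 * q2 - p2 * q1
    mean_x = (c1 * q2 - c2 * q1) / det
    mean_y = (p1 * c2 - p2 * c1) / det

    # Try each assignment of "X on Y" / "Y on X". The two candidate products
    # multiply to 1, so at most one assignment gives a product in [0, 1)
    # unless both equal 1.
    for x_line, y_line in ((line1, line2), (line2, line1)):
        b_xy = -x_line[1] / x_line[0]   # solve x_line for X: slope on Y
        b_yx = -y_line[0] / y_line[1]   # solve y_line for Y: slope on X
        product = b_xy * b_yx
        if 0 <= product <= 1:
            r = sqrt(product) if b_xy >= 0 else -sqrt(product)
            return mean_x, mean_y, b_xy, b_yx, r
    raise ValueError("no valid assignment of the two regression lines")

# 20X - 3Y = 975 and -15X + 4Y = -530 (equation (ii) rearranged).
mean_x, mean_y, b_xy, b_yx, r = analyse_regression_pair((20, -3, 975), (-15, 4, -530))

y_at_x_90 = mean_y + b_yx * (90 - mean_x)    # uses the Y on X line
x_at_y_130 = mean_x + b_xy * (130 - mean_y)  # uses the X on Y line

print(mean_x, mean_y, r)
print(y_at_x_90, x_at_y_130)
```

Under this reading, equation (i) is the line of X on Y (b_xy = 0.15) and equation (ii) is the line of Y on X (b_yx = 3.75); the reverse assignment would give a product of about 1.78, which is not admissible.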

Example:

For a certain X and Y series, the two lines of regression are given below:

6Y - 5X = 90

15X - 8Y = 130

Variance of X series is 16.

1. Find the mean value of X and Y series.

2. Coefficient of correlation between X and Y series.

3. Standard deviation of Y series.
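A worked outline for this example is given below as a sketch. It assumes the first equation is the line of Y on X and the second the line of X on Y, which is the only assignment that keeps the product of the regression coefficients at or below 1 (the other assignment gives 6/5 × 15/8 = 9/4).

```latex
\begin{align*}
&\text{Means (both lines pass through } (\bar{X}, \bar{Y})\text{):} \\
&\quad 3(6\bar{Y} - 5\bar{X}) + (15\bar{X} - 8\bar{Y}) = 3(90) + 130
   \;\Rightarrow\; 10\bar{Y} = 400
   \;\Rightarrow\; \bar{Y} = 40, \quad \bar{X} = \frac{6(40) - 90}{5} = 30 \\[4pt]
&\text{Regression coefficients:} \\
&\quad Y = \tfrac{5}{6}X + 15 \;\Rightarrow\; b_{yx} = \tfrac{5}{6},
   \qquad
   X = \tfrac{8}{15}Y + \tfrac{26}{3} \;\Rightarrow\; b_{xy} = \tfrac{8}{15} \\[4pt]
&\text{Correlation:} \\
&\quad r = +\sqrt{b_{xy}\, b_{yx}} = \sqrt{\tfrac{8}{15} \cdot \tfrac{5}{6}}
   = \sqrt{\tfrac{4}{9}} = \tfrac{2}{3} \\[4pt]
&\text{Standard deviation of } Y \text{ (with } \sigma_x = \sqrt{16} = 4\text{):} \\
&\quad b_{yx} = r\,\frac{\sigma_y}{\sigma_x}
   \;\Rightarrow\; \sigma_y = \frac{b_{yx}\,\sigma_x}{r}
   = \frac{\tfrac{5}{6} \cdot 4}{\tfrac{2}{3}} = 5
\end{align*}
```

The sign of r is taken as positive because both regression coefficients are positive.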

Real Life Applications

Estimating Seasonal Sales for Department Stores (Periodic)

Real Life Applications

Predicting Student Grades Based on Time Spent Studying

Practice Problems

Measure height vs. arm span. Find the line of best fit for height. Predict height for one student not in the data set. Check predictability of the model.

Practice Problems

Is there any correlation between shoe size and height?

Does gender make a difference in this analysis?

Practice Problems

Can the number of points scored in a basketball game be predicted by the time a player plays in the game?

By the player’s height?

Questions ???
