Cause (Part II) - Causal Systems. Topics: I. The Logic of Multiple Relationships; II. Multiple Correlation; III. Multiple Regression; IV. Path Analysis


Cause (Part II) - Causal Systems

Topics:

I. The Logic of Multiple Relationships

II. Multiple Correlation

III. Multiple Regression

IV. Path Analysis


I. The Logic of Multiple Relationships

One Dependent Variable, Multiple Independent Variables

[Figure: three overlapping circles labeled Y, X1, and X2; the diagram carries the labels R and NR.]

In this diagram, the overlap of any two circles can be thought of as the $r^2$ between the two variables. When we add a third variable, however, we must "partial out" the redundant overlap between the independent variables, so that the portion of Y they explain jointly is counted only once.


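II. Multiple Correlation

[Figure: two circle diagrams of Y, X1, and X2, one labeled R, NR, NR and the other labeled NR, NR.]

When X1 and X2 are independent of each other:

$$R^2_{y \cdot x_1 x_2} = r^2_{y x_1} + r^2_{y x_2}$$

When X1 and X2 are related:

$$R^2_{y \cdot x_1 x_2} = r^2_{y x_1} + r^2_{y x_2 \cdot x_1}$$

Notice that when the independent variables are independent of each other, the squared multiple correlation coefficient ($R^2$) is simply the sum of the individual $r^2$ values. If the independent variables are related, $R^2$ is the zero-order $r^2$ of one variable plus the partial $r^2$ of the other(s). (Strictly, the added term is the squared semipartial, or part, correlation: the share of the total variance in Y that X2 explains beyond X1.) This adjustment is required because related independent variables would otherwise be double counted as explaining the same portion of the dependent variable. Partialling out this redundancy solves the problem.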
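As a rough numerical sketch of the two cases above, the snippet below computes $R^2$ for two predictors from three correlations. The function names are hypothetical, and the closed-form two-predictor formula is the standard textbook one rather than anything shown on the slides. The correlations are those reported later in the path-analysis example (.69, .57, and .68).

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of Y on X1 and X2 from zero-order correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


def semipartial_sq(r_y1, r_y2, r_12):
    """Squared semipartial correlation of Y with X2 once X1 is removed from X2."""
    return (r_y2 - r_y1 * r_12) ** 2 / (1 - r_12**2)


# Correlations taken from the later example: Y with X1, Y with X2, X1 with X2.
r_y1, r_y2, r_12 = 0.69, 0.57, 0.68

# Case 1: pretend the predictors are unrelated (r_12 = 0); R^2 is the plain sum.
r2_independent = r2_two_predictors(r_y1, r_y2, 0.0)

# Case 2: use the actual overlap between the predictors.
r2_related = r2_two_predictors(r_y1, r_y2, r_12)

# Same quantity built up step by step: zero-order r^2 plus the unique part of X2.
r2_stepwise = r_y1**2 + semipartial_sq(r_y1, r_y2, r_12)

print(f"independent: {r2_independent:.3f}")
print(f"related:     {r2_related:.3f}")
print(f"stepwise:    {r2_stepwise:.3f}")
```

The related and stepwise figures agree, and both fall well below the naive sum, which is the double counting the slide describes.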


III. Multiple Regression

[Figure: diagrams relating X1 and X2 to Y.]

$$Y' = a + b_{yx_1} X_1 + b_{yx_2} X_2$$

or, standardized:

$$Y' = B_{yx_1} X_1 + B_{yx_2} X_2$$

Translated into the language of regression, independent variables that are themselves independent of each other each have their own regression slope, and each simply appears as another term added to the regression equation.
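The point can be checked on a tiny example. The sketch below uses made-up toy data (not from the slides) in which X1 and X2 are exactly uncorrelated, and compares each simple regression slope with the corresponding slope from the two-predictor regression. All function names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def centered(values):
    m = mean(values)
    return [v - m for v in values]


def simple_slope(x, y):
    """Slope of the one-predictor regression of y on x."""
    cx, cy = centered(x), centered(y)
    return sum(p * q for p, q in zip(cx, cy)) / sum(p * p for p in cx)


def two_predictor_slopes(x1, x2, y):
    """Slopes of y on x1 and x2 together, via the 2x2 normal equations."""
    c1, c2, cy = centered(x1), centered(x2), centered(y)
    s11 = sum(p * p for p in c1)
    s22 = sum(p * p for p in c2)
    s12 = sum(p * q for p, q in zip(c1, c2))
    s1y = sum(p * q for p, q in zip(c1, cy))
    s2y = sum(p * q for p, q in zip(c2, cy))
    det = s11 * s22 - s12**2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# Toy data, invented for illustration: x1 and x2 form a balanced 2x2 design,
# so their correlation is exactly zero.
x1 = [1, 1, 2, 2]
x2 = [1, 2, 1, 2]
y = [3, 5, 4, 9]

b1, b2 = two_predictor_slopes(x1, x2, y)
print("multiple:", b1, b2)
print("simple:  ", simple_slope(x1, y), simple_slope(x2, y))
```

Because the predictors do not overlap, adding the second one leaves the first slope unchanged; with correlated predictors the two sets of slopes would differ.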


III. Multiple Regression (continued)

[Figure: diagrams relating X1 and X2 to Y, with X1 and X2 overlapping.]

$$Y' = a + b_{yx_1} X_1 + b_{yx_2 \cdot x_1} X_2$$

or, standardized:

$$Y' = B_{yx_1} X_1 + B_{yx_2 \cdot x_1} X_2$$

Once we assume that the independent variables are themselves related with respect to the variance they explain in the dependent variable, we must distinguish between direct and indirect predictive effects. Partial regression coefficients give the direct effects. When standardized, these B values are called "path coefficients" or "beta weights."
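For the two-predictor case, the standardized partial coefficients can be written directly in terms of the zero-order correlations. This is the standard textbook result, not a formula shown on the slides. It is written here with each coefficient adjusted for the other predictor, as happens when both are entered in the regression together:

$$
B_{yx_1 \cdot x_2} = \frac{r_{yx_1} - r_{yx_2}\, r_{x_1x_2}}{1 - r_{x_1x_2}^2},
\qquad
B_{yx_2 \cdot x_1} = \frac{r_{yx_2} - r_{yx_1}\, r_{x_1x_2}}{1 - r_{x_1x_2}^2}
$$

When $r_{x_1x_2} = 0$, both expressions reduce to the zero-order correlations, which is the case where the independent variables are unrelated.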


IV. Path Analysis – The Steps and an Example

1. Input the data

2. Calculate the Correlation Matrix

3. Specify the Path Diagram

4. Enumerate the Equations

5. Solve for the Path Coefficients (Betas)

6. Interpret the Findings


Path Analysis – Steps and Example

Step 1 – Input the Data

Y = DV - income

X1 = IV - pinc

X2 = IV - pedu

X3 = IV - educ

[Figure: SPSS data view showing the ten cases and four variables.]

Assume you have information from ten respondents on their income, education, parent's education, and parent's income. We would input these ten cases and four variables into SPSS in the usual way. In this analysis, we will try to explain the respondent's income (Y) using the other three variables as independent variables (X1, X2, X3).


Step 2 – Calculate the Correlation Matrix

[Figure: SPSS correlation matrix for X1, X2, X3, and Y.]

These correlations are calculated in the usual manner through the "Analyze", "Correlate", "Bivariate" menu clicks.

Notice the zero-order correlations of each IV with the DV. Clearly these IVs must be interrelated: otherwise their $r^2$ values would sum to an $R^2$ of more than 100% of the variance in the DV, which is impossible.
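The "more than 100%" observation is easy to verify. The sketch below squares and sums the three zero-order correlations with income, using the values reported on the later "Checking the Findings" slide (.69, .57, .52); the variable names are illustrative.

```python
# Zero-order correlations of each IV with income (Y), as reported later in the deck.
r_with_y = {"pinc": 0.69, "pedu": 0.57, "educ": 0.52}

# Square each correlation to get the share of variance it would explain alone.
r_squared = {name: r**2 for name, r in r_with_y.items()}

# Adding the shares as if the IVs did not overlap.
naive_total = sum(r_squared.values())

for name, value in r_squared.items():
    print(f"{name}: r^2 = {value:.4f}")
print(f"naive sum = {naive_total:.4f}")
```

The naive sum exceeds 1, so some of the explained variance must be shared among the independent variables.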


Step 3 – Specify the Path Diagram

[Figure: path diagram arranged along a "Time" axis, with Y at the far right. Paths are labeled a through f; per the Step 4 equations, a = X1 → Y, b = X3 → Y, c = X2 → Y, d = X2 → X3, e = X1 → X3, and f is the correlation between X1 and X2.]

X1 = Parent's income

X2 = Parent's education

X3 = Offspring's education

Y = Offspring's income

Therefore, we must specify a model that explains the relationships among the variables across time. We start with the dependent variable on the rightmost side of the diagram and arrange the independent variables to its left, indicating their effects on subsequent variables.
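One way to make the diagram concrete is to write it down as a small data structure. The sketch below is only an illustration: the path-to-variable mapping is inferred from the path equations in Step 4, and the helper `regressions_needed` is a hypothetical name. It groups the directed paths by their target, which shows which regressions the model calls for.

```python
# Directed paths: label -> (source, target). Mapping inferred from the Step 4 equations.
paths = {
    "a": ("X1", "Y"),
    "b": ("X3", "Y"),
    "c": ("X2", "Y"),
    "d": ("X2", "X3"),
    "e": ("X1", "X3"),
}

# Path f is the unanalyzed correlation between the two earliest variables,
# so it is kept apart from the directed paths.
correlated = {"f": ("X1", "X2")}


def regressions_needed(directed_paths):
    """Group paths by target: each target is the DV of one regression."""
    by_target = {}
    for label, (source, target) in directed_paths.items():
        by_target.setdefault(target, []).append((source, label))
    return {target: sorted(preds) for target, preds in by_target.items()}


needed = regressions_needed(paths)
for target, predictors in needed.items():
    names = ", ".join(f"{src} (path {label})" for src, label in predictors)
    print(f"{target} regressed on: {names}")
```

Each variable with arrows pointing into it becomes the dependent variable of its own regression, which matches the two SPSS runs in Step 5.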


Step 4 – Enumerate the Path Equations

[Figure: the path diagram from Step 3, with paths a through f linking X1, X2, X3, and Y.]

With the diagram specified, we need to write out the formulae necessary to find the path coefficients (arbitrarily indicated here by letters on each path). The overall correlation between an independent variable and the dependent variable can be separated into its direct effect plus the sum of its indirect effects.

1. $r_{yx_1} = a + b\,r_{x_3x_1} + c\,r_{x_2x_1}$

2. $r_{yx_2} = c + b\,r_{x_3x_2} + a\,r_{x_1x_2}$

3. $r_{yx_3} = b + a\,r_{x_1x_3} + c\,r_{x_2x_3}$

4. $r_{x_3x_2} = d + e\,r_{x_1x_2}$

5. $r_{x_3x_1} = e + d\,r_{x_1x_2}$

6. $r_{x_1x_2} = f$

The solution to the two equations in two unknowns (equations 4 and 5) is worked through in the exercise at the end.
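Equations 1 to 3 are three linear equations in the three unknowns a, b, and c, so they can also be solved directly. The sketch below does this with a small hand-written Gauss-Jordan routine (`solve_linear` is a hypothetical helper, not part of SPSS), using the two-decimal correlations reported on the later "Checking the Findings" slide.

```python
def solve_linear(coeffs, rhs):
    """Solve coeffs * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix [coeffs | rhs], copied so the inputs are not modified.
    m = [row[:] + [value] for row, value in zip(coeffs, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Correlations among the IVs and with Y (rounded values from the slides).
r12, r13, r23 = 0.68, 0.75, 0.82
r_y1, r_y2, r_y3 = 0.69, 0.57, 0.52

# Unknowns in the order (a, b, c); one row per path equation.
coeffs = [
    [1.0, r13, r12],  # r_yx1 = a + b*r_x3x1 + c*r_x2x1
    [r12, r23, 1.0],  # r_yx2 = c + b*r_x3x2 + a*r_x1x2
    [r13, 1.0, r23],  # r_yx3 = b + a*r_x1x3 + c*r_x2x3
]
a, b, c = solve_linear(coeffs, [r_y1, r_y2, r_y3])
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}")
```

With these rounded inputs the solution comes out at about .64, -.21, and .31, close to the SPSS Betas of .63, -.21, and .31; the small gap in a reflects the two-decimal rounding of the correlations.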


Step 5 – Solve for the Path Coefficients – a, b and c

The easiest way to calculate B is to use the Regression module in SPSS. By specifying income as the dependent variable and pinc, pedu, and educ as the independent variables, we can solve for the beta weights, or path coefficients, for each of the independent variables.

[Figure: SPSS regression output with the Beta values circled.]

The circled numbers correspond to the Betas for paths a, c, and b, respectively, in the previous path diagram.


Step 5 – Solve for the Path Coefficients – d and e

The same Regression module gives the remaining coefficients. By specifying offspring's education (educ) as the dependent variable and parent's income (pinc) and parent's education (pedu) as the independent variables, we can solve for the beta weights, or path coefficients, of these independent variables on offspring's education. The circled numbers in the SPSS output correspond to the Betas for path e (parent's income) and path d (parent's education) in the previous path diagram.
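With only two predictors, the same coefficients can be obtained in closed form from path equations 4 and 5. The sketch below is a hedged illustration using the rounded correlations reported elsewhere in the deck (.82, .75, .68); the variable names are assumptions.

```python
# Rounded correlations from the slides.
r_x3x2 = 0.82  # offspring's education with parent's education
r_x3x1 = 0.75  # offspring's education with parent's income
r_x1x2 = 0.68  # parent's income with parent's education

# Equation 4: r_x3x2 = d + e * r_x1x2
# Equation 5: r_x3x1 = e + d * r_x1x2
# Solving the pair by substitution gives the closed forms below.
denom = 1 - r_x1x2**2
d = (r_x3x2 - r_x3x1 * r_x1x2) / denom
e = (r_x3x1 - r_x3x2 * r_x1x2) / denom

print(f"d = {d:.3f}, e = {e:.3f}")
```

The results are within rounding of the values shown in the deck for these two paths (.58 and .35).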


Step 5a – Solving for $R^2$

The SPSS Regression module also calculates $R^2$. According to this statistic, for our data, 50% of the variation in the respondent's income (Y) is accounted for by the respondent's education (X3), parent's education (X2), and parent's income (X1).

$R^2$ is calculated by multiplying each path coefficient (Beta) by its respective zero-order correlation and summing across all of the independent variables (see the spreadsheet on the original slide).
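The calculation described above can be sketched in a few lines. The Betas and correlations are the rounded values reported on the "Checking the Findings" slide; the dictionary names are illustrative.

```python
# Path coefficients (Betas) into income, and the matching zero-order correlations.
betas = {"pinc": 0.63, "pedu": 0.31, "educ": -0.21}
r_with_y = {"pinc": 0.69, "pedu": 0.57, "educ": 0.52}

# R^2 is the sum over predictors of Beta times its zero-order correlation.
contributions = {name: betas[name] * r_with_y[name] for name in betas}
r2 = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name}: {value:+.4f}")
print(f"R^2 = {r2:.4f}")
```

The total rounds to .50, matching the 50% figure quoted from SPSS.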


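Checking the Findings

[Figure: the path diagram along the "Time" axis, annotated with the zero-order correlation (r) and path coefficient (B) for each path.]

X1 → Y (path a): r = .69, B = .63

X3 → Y (path b): r = .52, B = -.21

X2 → Y (path c): r = .57, B = .31

X2 → X3 (path d): r = .82, B = .58

X1 → X3 (path e): r = .75, B = .35

X1 – X2 (path f): r = B = .68

Residual: e = .50

Substituting these values into the first three path equations:

$r_{yx_1} = a + b\,r_{x_3x_1} + c\,r_{x_2x_1}$: .69 = .63 + (-.21)(.75) + .31(.68)

$r_{yx_2} = c + b\,r_{x_3x_2} + a\,r_{x_1x_2}$: .57 = .31 + (-.21)(.82) + .63(.68)

$r_{yx_3} = b + a\,r_{x_1x_3} + c\,r_{x_2x_3}$: .52 = -.21 + .63(.75) + .31(.82)

The values of r and B tell us three things: 1) the value of Beta is the direct effect; 2) dividing Beta by r gives the proportion of the correlation that is a direct effect; and 3) the product of Beta and r, summed across the variables with direct arrows into the dependent variable, is $R^2$. The value of $1 - R^2$ is the residual e (shown as e = .50, and not to be confused with path e).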
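The three checks above can be automated. The sketch below rebuilds each correlation from its direct and indirect parts and reports the gap from the observed value; because every input is rounded to two decimals, agreement is only expected to within about .01. Variable names are illustrative.

```python
# Path coefficients (Betas) and correlations as reported on the slide.
a, b, c = 0.63, -0.21, 0.31        # X1 -> Y, X3 -> Y, X2 -> Y
r12, r13, r23 = 0.68, 0.75, 0.82   # correlations among X1, X2, X3
observed = {"r_yx1": 0.69, "r_yx2": 0.57, "r_yx3": 0.52}

# Direct effect plus the indirect effects, following path equations 1 to 3.
implied = {
    "r_yx1": a + b * r13 + c * r12,
    "r_yx2": c + b * r23 + a * r12,
    "r_yx3": b + a * r13 + c * r23,
}

gaps = {name: observed[name] - implied[name] for name in observed}
for name in observed:
    print(f"{name}: observed {observed[name]:.2f}, "
          f"implied {implied[name]:.4f}, gap {gaps[name]:+.4f}")
```

All three gaps are small enough to be attributed to rounding, which is what the hand check on the slide shows.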


Step 6 – Interpret the Findings

[Figure: the path diagram along the "Time" axis with the path coefficients filled in: .63, -.21, .31, .58, .35, and .68, with residual e = .50.]

X1 = Parent's income

X2 = Parent's education

X3 = Offspring's education

Y = Offspring's income

With the path coefficients (Betas) specified, several facts are apparent. Parent's income has the highest percentage of direct effect: .63/.69 ≈ .91, so about 91% of its correlation with offspring's income is a direct effect and about 9% is indirect. Moreover, although the overall correlation of educ with income is positive, the direct effect of offspring's education, in these data, is actually negative!
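The direct-effect proportions behind this interpretation can be computed for all three predictors at once. This is a minimal sketch using the rounded Betas and correlations from the slides; the names `direct_share` and `indirect` are illustrative.

```python
# Direct effects (Betas) on offspring's income and the zero-order correlations.
betas = {"pinc": 0.63, "pedu": 0.31, "educ": -0.21}
r_with_y = {"pinc": 0.69, "pedu": 0.57, "educ": 0.52}

# Share of each correlation that is a direct effect, and the indirect remainder.
direct_share = {name: betas[name] / r_with_y[name] for name in betas}
indirect = {name: r_with_y[name] - betas[name] for name in betas}

for name in betas:
    print(f"{name}: direct share {direct_share[name]:+.2f}, "
          f"indirect effect {indirect[name]:+.2f}")

largest = max(direct_share, key=direct_share.get)
print("largest direct share:", largest)
```

For educ the share is negative: the direct effect runs opposite to the overall correlation, so the positive correlation is carried entirely by the indirect paths.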



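Exercise - Solving Two Equations in Two Unknowns

$$r_{32} = d + e\,r_{12}$$

$$r_{31} = e + d\,r_{12}$$

Substituting the correlations from the matrix, we get

$$.82 = d + .68e$$

$$.75 = e + .68d$$

Then, restating the second equation for e,

$$e = .75 - .68d$$

We substitute this into the first equation and solve for d:

$$.82 = d + .68(.75 - .68d)$$

$$.82 = d + .51 - .46d$$

$$.82 = .54d + .51$$

$$d = .31/.54 = .57$$

Finally, inserting d's value, we solve for e:

$$.82 = .57 + .68e$$

$$.68e = .25$$

$$e = .25/.68 \approx .37$$

(If d is carried forward unrounded, e comes out at about .36.)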