

September 1, 2009 Session 2 Slide 1

PSC 5940: Regression Review and Questions about “Causality”

Session 2

Fall, 2009


September 1, 2009 Session 2 Slide 2

Data Discussion
• EE09 & NS09 Data: research ideas?
• Fixing data in Excel: EE09
  – NA replacement
  – Text to numeric (e28_gcc)
  – Getting rid of extraneous characters
    • $ in "random_p"
• EE and partisanship
  – Loading and attaching the data
  – Examining party identification ("e216_par")
  – Examining gender ("e3_gender")
• Dealing with awkward names and NA values


September 1, 2009 Session 2 Slide 3

Deterministic Linear Models

• Theoretical Model: Yi = α + βXi
  – α and β are constant terms
    • α is the intercept
    • β is the slope
  – Xi is a predictor of Yi

[Figure: the line Yi = α + βXi plotted in the X–Y plane, with intercept α and slope β]


September 1, 2009 Session 2 Slide 4

Stochastic Linear Models
• E[Yi] = β0 + β1Xi
  – Variation in Y is caused by more than X: error (εi)
• So:
  εi = Yi − (β0 + β1Xi) = Yi − E[Yi]
  Yi = E[Yi] + εi = β0 + β1Xi + εi
• β0 = Y when X = 0
• Each 1-unit increase in X increases Y by β1
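To make the stochastic model concrete, here is a minimal R sketch (R is the course software; all values are illustrative) that simulates Yi = β0 + β1Xi + εi and checks that lm() approximately recovers the parameters:

# Simulate the stochastic linear model: Y_i = beta0 + beta1*X_i + e_i
set.seed(42)               # reproducible draws
n     <- 200
beta0 <- 2                 # true intercept (illustrative value)
beta1 <- 0.5               # true slope (illustrative value)
x     <- runif(n, 0, 10)   # predictor
e     <- rnorm(n, 0, 1)    # normal iid errors: zero mean, constant variance
y     <- beta0 + beta1 * x + e

coef(lm(y ~ x))            # estimates should be close to 2 and 0.5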


September 1, 2009 Session 2 Slide 5

Assumptions Necessary for Estimating Linear Models

1. Errors have identical distributions
   – Zero mean, same variance, across the range of X
2. Errors are independent of X and of the other εi:
   E[εi] ≠ f(Xi) and E[εi] ≠ f(εj), j ≠ i
3. Errors are normally distributed, with E[εi] = 0 across X


September 1, 2009 Session 2 Slide 6

Normal, Independent & Identical εi Distributions ("Normal iid")

[Figure: identical normal error distributions around the regression line of Y on X]

Problem: We don't know:
a) if the error assumptions hold true; b) the values of β0 and β1

Solution: Estimate 'em!


September 1, 2009 Session 2 Slide 7

OLS Derivation of b0

Given that: Yi = b0 + b1Xi + ei and Ŷi = b0 + b1Xi,

SSE = Σ ei² = Σ (Yi − Ŷi)² = Σ (Yi − b0 − b1Xi)²

Now we minimize w.r.t. b0. Use partial differentiation and the chain rule in this step:

f′(b0) = Σ 2(Yi − b0 − b1Xi)·(−1)
       = −2 Σ (Yi − b0 − b1Xi)
       = −2 Σ Yi + 2nb0 + 2b1 Σ Xi


September 1, 2009 Session 2 Slide 8

Derivation of b0, step 2

Now we set the derivative to zero and solve:

−2 Σ Yi + 2nb0 + 2b1 Σ Xi = 0

First shift the non-b0 terms to the other side:

2nb0 = 2 Σ Yi − 2b1 Σ Xi

Now divide through by 2n:

2nb0 / 2n = 2 Σ Yi / 2n − 2b1 Σ Xi / 2n

which is: b0 = Ȳ − b1X̄


September 1, 2009 Session 2 Slide 9

Derivation of b1

Step 1: Multiply out e²

Σ ei² = Σ (Yi − b0 − b1Xi)²
      = Σ (Yi − b0 − b1Xi)(Yi − b0 − b1Xi)
      = Σ Yi² − Σ Yib0 − Σ Yib1Xi − Σ Yib0 + nb0² + Σ b0b1Xi − Σ Yib1Xi + Σ b0b1Xi + Σ b1²Xi²

Now add the like terms and then drag all the constants through the summations to get:

= Σ Y² − 2b0 Σ Y − 2b1 Σ XY + nb0² + 2b0b1 Σ X + b1² Σ X²


September 1, 2009 Session 2 Slide 10

Derivation of b1

Step 2: Differentiate w.r.t. b1

Next, partially differentiate e² with respect to b1:

f′(b1) = d/db1 [ Σ Y² − 2b0 Σ Y − 2b1 Σ XY + nb0² + 2b0b1 Σ X + b1² Σ X² ]

Note that all terms without b1 are, in effect, constants and can therefore be dropped from the derivation to obtain:

f′(b1) = −2 Σ XY + 2b0 Σ X + 2b1 Σ X²


September 1, 2009 Session 2 Slide 11

Derivation of b1

Step 3: Substitute for b0

f′(b1) = −2 Σ XY + 2b0 Σ X + 2b1 Σ X²

Since b0 = Ȳ − b1X̄, we can write f′(b1) = 0 as follows:

−2 Σ XY + 2 Σ X Σ Y / n − 2b1 (Σ X)² / n + 2b1 Σ X² = 0


September 1, 2009 Session 2 Slide 12

Derivation of b1

Step 4: Simplify and Isolate b1

Now we can multiply through by n/2 and put all the b1 terms on the same side:

−2 Σ XY + 2 Σ X Σ Y / n − 2b1 (Σ X)² / n + 2b1 Σ X² = 0, so

b1 (n Σ X² − (Σ X)²) = n Σ XY − Σ X Σ Y, and finally

b1 = (n Σ XY − Σ X Σ Y) / (n Σ X² − (Σ X)²)


September 1, 2009 Session 2 Slide 13

Calculating b0 and b1

• The formulas for b1 and b0 allow you (or, preferably, your computer) to calculate the error-minimizing slope and intercept for any data set representing a bivariate, linear relationship.

b̂1 = (n Σ XY − Σ X Σ Y) / (n Σ X² − (Σ X)²) = Σ (Xi − X̄)(Yi − Ȳ) / Σ (Xi − X̄)²

b̂0 = Ȳ − b̂1X̄

• No other line, using the same data, will result in a smaller sum of squared errors (Σ e²). OLS gives the best fit.
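As a quick check of these formulas, here is a minimal R sketch (with made-up data) that computes b1 both ways and compares the results against lm():

# Hand-compute the OLS slope and intercept and compare with lm()
set.seed(1)
x <- rnorm(25)
y <- 3 + 2 * x + rnorm(25)                   # illustrative data
n <- length(x)

# Computational formula: b1 = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
b1 <- (n * sum(x * y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)

# Equivalent deviation-score formula
b1_dev <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

b0 <- mean(y) - b1 * mean(x)                 # b0 = Ybar - b1*Xbar

c(b0 = b0, b1 = b1, b1_dev = b1_dev)
coef(lm(y ~ x))                              # should match b0 and b1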


September 1, 2009 Session 2 Slide 14

Interpreting b1 and b0

b̂1 = Σ (Xi − X̄)(Yi − Ȳ) / Σ (Xi − X̄)²

b̂0 = Ȳ − b̂1X̄

For each 1-unit increase in X, you get b1 units of change in Y.

When X is zero, Y will be equal to b0. Note that a regression model with no independent variables simply estimates the mean of Y.
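One way to see that last point in R (a minimal sketch with arbitrary data): an intercept-only model's coefficient is the sample mean.

# An intercept-only model's coefficient equals the sample mean of Y
y <- c(4, 7, 2, 9, 5)          # arbitrary illustrative values
coef(lm(y ~ 1))                # intercept: 5.4
mean(y)                        # also 5.4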


September 1, 2009 Session 2 Slide 15

Theoretical Specification of Multivariate Regression

E[Yi] = β0 + β1Xi,1 + β2Xi,2 + ... + βK−1Xi,K−1

where K is the number of parameters (β's)

Yi = E[Yi] + εi, or (in matrix form) Y = XB + U

Ŷi = b0 + b1Xi,1 + b2Xi,2 + ... + bK−1Xi,K−1

So RSS = Σ (Yi − Ŷi)² = Σ ei²

September 1, 2009 Session 2 Slide 16

Regression in Matrix Form
• Assume a model using n observations, with K−1 Xi (independent) variables

Y (n × 1) is a column vector of the observed dependent variable
Ŷ (n × 1) is a column vector of predicted Y values
X (n × K): each column holds the observations on one X; the first column is all 1's
B (K × 1) is a column vector of regression coefficients (the first is b0)
U (n × 1) is a column vector of n residual values

September 1, 2009 Session 2 Slide 17

Regression in Matrix Form

Y = XB + U
Ŷ = XB
B = (X′X)⁻¹X′Y

Note: we can't uniquely define (X′X)⁻¹ if any column in the X matrix is a linear function of any other column(s) in X.
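A minimal R sketch of the matrix formula, using made-up data (names and values are illustrative):

# OLS in matrix form: B = (X'X)^(-1) X'Y
set.seed(7)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)   # illustrative data

X <- cbind(1, x1, x2)                    # first column of 1's for b0
B <- solve(t(X) %*% X) %*% t(X) %*% y    # the matrix OLS formula
U <- y - X %*% B                         # residual vector

cbind(B, coef(lm(y ~ x1 + x2)))          # the two sets of estimates agree

If any column of X were an exact linear function of the others, solve() would fail, which is the non-uniqueness noted above.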

September 1, 2009 Session 2 Slide 18

The X’X Matrix

(X′X) =

⎡ n       Σ X1      Σ X2      Σ X3    ⎤
⎢ Σ X1    Σ X1²     Σ X1X2    Σ X1X3  ⎥
⎢ Σ X2    Σ X2X1    Σ X2²     Σ X2X3  ⎥
⎣ Σ X3    Σ X3X1    Σ X3X2    Σ X3²   ⎦

Note that you can obtain the basis for all the necessary means, variances, and covariances among the Xs from the (X′X) matrix.

September 1, 2009 Session 2 Slide 19

An Example of Matrix Regression

Using a sample of 7 observations, where X has elements {X0, X1, X2, X3}:

Y = XB + U

[The slide displays the numeric 7 × 1 vector Y, the 7 × 4 matrix X (first column all 1's), the estimated 4 × 1 coefficient vector B, and the 7 × 1 residual vector U; the numeric values are not recoverable from the extraction.]


September 1, 2009 Session 2 Slide 20

Summary of OLS Assumption Failures and their Implications

Problem              Biased b   Biased SE   Invalid t/F   High Variance
Non-linearity        Yes        Yes         Yes           ---
Omit relevant X      Yes        Yes         Yes           ---
Irrelevant X         No         No          No            Yes
X measured w/ error  Yes        Yes         Yes           ---
Heteroscedasticity   No         Yes         Yes           Yes
Autocorrelation      No         Yes         Yes           Yes
X corr. with error   Yes        Yes         Yes           ---
Non-normal errors    No         No          Yes           Yes
Multicollinearity    No         No          No            Yes


September 1, 2009 Session 2 Slide 21

BREAK


September 1, 2009 Session 2 Slide 22

Causality and Experiments

X2 → Y
(Number of Fire Trucks → Number of Fire Deaths)

Question: What is the relationship between the number of fire trucks at the scene of a fire and the number of deaths caused by that fire?

Experimental approach: Randomly assign fire incidents to different categories, which receive different numbers of trucks (the treatment).


September 1, 2009 Session 2 Slide 23

Causality and Observational Data

The problem of spurious relations...

X1 (Size of Fire) → X2 (Number of Fire Trucks)
X1 (Size of Fire) → Y (Number of Fire Deaths)

In an experimental design, we fully control for spurious relationships. With OLS, we try to manage them statistically.


September 1, 2009 Session 2 Slide 24

Statistical Calculation of Partial Effects

In calculating the effect of X1 on Y, we remove the effect of the other X's on both X1 and Y:

Yi = b0 + b2Xi,2 + e(i, Y|X2)      ← Y stripped of the effect of X2

Xi,1 = b0 + b2Xi,2 + e(i, X1|X2)   ← X1 stripped of the effect of X2

so

e(i, Y|X2) = b0 + b1·e(i, X1|X2)

The use of residuals "cleans" both Y and X1 of their correlations with X2, permitting estimation of partial regression coefficients (PRCs).
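This residual-on-residual logic can be verified in R (a minimal sketch with simulated data; the equality of the two coefficients below is the Frisch–Waugh–Lovell result):

# Partial regression coefficient via residuals
set.seed(3)
n  <- 100
x2 <- rnorm(n)
x1 <- 0.6 * x2 + rnorm(n)                # X1 correlated with X2
y  <- 1 + 2 * x1 + 1.5 * x2 + rnorm(n)   # illustrative data

e_y  <- resid(lm(y ~ x2))                # Y stripped of X2
e_x1 <- resid(lm(x1 ~ x2))               # X1 stripped of X2

coef(lm(e_y ~ e_x1))["e_x1"]             # partial slope for X1
coef(lm(y ~ x1 + x2))["x1"]              # the same coefficient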


September 1, 2009 Session 2 Slide 25

Intuition of PRCs

• All overlapping variance is stripped out
• Highly correlated IVs are problematic
  – But what if the overlap is important?
    • What if X1 and X2 are really part of some larger construct?
  – The case of knowledge, efficacy and behavior
  – Kelstet et al
• How should we interpret the PRCs in this case?


September 1, 2009 Session 2 Slide 26

Workshop
• Load the EE data
• Run a simple model (a sketch in R follows below):
  • Willingness to pay for an alternative energy tax
  • Use the randomly assigned cost as the IV
  • Plot the relationship (use jitter)
• Now add: Income, Ideology
• Change in the cost variable? (Why?)
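A minimal R sketch of these steps, assuming the EE09 file and the placeholder variable names below (wtp, random_p, income, ideology stand in for the actual names in the EE09 codebook):

# Workshop sketch -- file path and variable names are illustrative
ee <- read.csv("EE09.csv")                  # load the EE data

# Simple model: willingness to pay ~ randomly assigned cost
m1 <- lm(wtp ~ random_p, data = ee)
summary(m1)

# Plot the relationship; jitter() reduces overplotting of discrete values
plot(jitter(ee$random_p), jitter(ee$wtp),
     xlab = "Randomly assigned cost", ylab = "Willingness to pay")
abline(m1)

# Add income and ideology; compare the cost coefficient across models
m2 <- lm(wtp ~ random_p + income + ideology, data = ee)
summary(m2)

Because cost is randomly assigned, and therefore roughly uncorrelated with income and ideology, its coefficient should change little between m1 and m2; that is the point of the final question.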


September 1, 2009 Session 2 Slide 27

Homework
• Generate and analyze the residuals (a sketch follows below)
• Add to the model:
  • Belief in anthropogenic climate change
    • Will require recodes
  • Understanding of GCC science
    • Recode the "What scientists believe..." variables
• Write a 1-page summary of findings for class next week
• Next extension: Modeling Dummies and Interactions
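A minimal sketch of the residual-analysis step in R, assuming the model object m2 from the workshop sketch above:

# Basic residual analysis for the workshop model (m2 assumed from above)
e <- resid(m2)

hist(e, main = "Residuals")        # roughly symmetric around zero?
qqnorm(e); qqline(e)               # check the normality assumption
plot(fitted(m2), e,                # constant variance across fitted values?
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)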