Quantitative Methods II

Dummy Variables & Interaction Effects

Edmund Malesky, Ph.D., UCSD



The Homogeneity Assumption

OLS assumes all cases in your data are comparable

x’s are a sample drawn from a single population

But we may analyze distinct groups of cases together in one analysis

Mean value of y may differ by group


Qualitative Variables

These group effects remain as part of the error term

If groups differ in their distribution of x’s, then we get a correlation between the X variables and the error term

Violates the assumption $\operatorname{cov}(X_i, u_i) = E(u) = 0$

Omitted Variable Bias!


Testing for Differences Across Groups (p. 249-252)

The Chow Test, e.g. testing for a difference between males and females in academic performance.

$SSR_1$ = males only; $SSR_2$ = females only

$SSR_{ur} = SSR_1 + SSR_2$

$SSR_P = SSR_r$ = pooling across both groups

$$F = \frac{[SSR_P - (SSR_1 + SSR_2)]}{SSR_1 + SSR_2} \cdot \frac{[n - 2(k+1)]}{k+1}$$

The Chow Test:

1. Is only valid under homoskedasticity (the error variance for the two groups must be equal).

2. The null hypothesis is that there is no difference at all, either in the intercept or in the slope, between the two groups.

3. This may be too restrictive. In these cases, we should use dummy variables and dummy interactions, which allow us to predict different slopes and intercepts for the two groups.
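The Chow statistic can be computed directly from three OLS fits. The sketch below is a minimal illustration on simulated data, not the course's own code: the helper names (`ssr`, `chow_f`, `make_group`) and all numbers are assumptions made for the example, and it uses the form F = [(SSR_P - (SSR_1 + SSR_2)) / (SSR_1 + SSR_2)] * [(n - 2(k+1)) / (k+1)].

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X (X includes a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def chow_f(X1, y1, X2, y2):
    """Chow F statistic; each X has a leading column of ones plus k regressors."""
    k = X1.shape[1] - 1
    n = len(y1) + len(y2)
    ssr_ur = ssr(X1, y1) + ssr(X2, y2)                            # unrestricted: separate fits
    ssr_p = ssr(np.vstack([X1, X2]), np.concatenate([y1, y2]))    # restricted: pooled fit
    return ((ssr_p - ssr_ur) / ssr_ur) * ((n - 2 * (k + 1)) / (k + 1))

rng = np.random.default_rng(0)

def make_group(n, intercept, slope):
    """Simulate one group with a single regressor and unit error variance."""
    x = rng.normal(size=n)
    y = intercept + slope * x + rng.normal(size=n)
    return np.column_stack([np.ones(n), x]), y

X_a, y_a = make_group(200, 1.0, 2.0)   # hypothetical group A
X_b, y_b = make_group(200, 1.0, 2.0)   # same relationship as A
X_c, y_c = make_group(200, 3.0, 0.5)   # different intercept and slope

f_same = chow_f(X_a, y_a, X_b, y_b)
f_diff = chow_f(X_a, y_a, X_c, y_c)
print(f"F, identical groups: {f_same:.2f}")
print(f"F, different groups: {f_diff:.2f}")
```

Under the null, the statistic is compared with an F distribution with k+1 and n-2(k+1) degrees of freedom. Both simulated groups share the same error variance, in line with the homoskedasticity requirement.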


Example: Democracy & Tariffs

Here we see that democracies have lower tariffs

Here we see that states in Regional Trading Arrangements (RTA’s) have lower tariffs

[Figure: Percent Tariffs (y-axis, 0 to 40) by regime type (Dictator, Oligarch, Anocracy, Democracy); series: Pooled Data.]

[Figure: Percent Tariffs (y-axis, 0 to 50) by regime type (Dictator, Oligarch, Anocracy, Democracy); series: RTA, No RTA, Pooled Data.]

But if Democracies are more likely to be in RTA’s, then pooling RTA and non-RTA states biases the coefficient
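A small simulation can make the pooling bias concrete. This is a sketch under stated assumptions: democracy is simplified to a 0/1 dummy, democracies join RTAs with probability 0.8 versus 0.2 for non-democracies, and the "true" effects (-5 for democracy, -10 for RTA) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

dem = rng.integers(0, 2, size=n)                  # 1 = democracy (hypothetical dummy)
p_rta = np.where(dem == 1, 0.8, 0.2)              # democracies are more likely to be in an RTA
rta = (rng.random(n) < p_rta).astype(float)

# Assumed data-generating process: democracy lowers tariffs by 5, RTA membership by 10.
tariff = 30 - 5 * dem - 10 * rta + rng.normal(0, 2, size=n)

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b_pooled = ols(np.column_stack([ones, dem]), tariff)       # RTA left in the error term
b_full = ols(np.column_stack([ones, dem, rta]), tariff)    # RTA specified as an x

print(f"DEM coefficient, RTA omitted:  {b_pooled[1]:.2f}")
print(f"DEM coefficient, RTA included: {b_full[1]:.2f}")
```

With RTA omitted, the DEM coefficient absorbs part of the RTA effect and overstates how much democracy lowers tariffs. Including RTA brings the estimate back toward the assumed value of -5.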


Solution: The Qualitative Variable

Measure this group difference (RTA vs. Non-RTA) and specify it as an x

This eliminates bias

But we have no numerical scale to measure RTA’s

Create a categorical variable that captures this group difference


The Qualitative “Dummy”

Create a variable that equals 1 when a case is part of a group, 0 otherwise

This variable creates a new intercept for the cases in the group marked by the dummy

Specifically, how would we interpret:

$$\text{TARIFF} = \beta_0 + \beta_1 \text{DEM} + \beta_2 \text{RTA} + u$$
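One way to work toward the interpretation is to write out the expected tariff for each group at the same level of DEM. The derivation below is a sketch that assumes RTA is coded 1 for members and 0 otherwise and that the error term has conditional mean zero.

```latex
\begin{align*}
\text{TARIFF} &= \beta_0 + \beta_1 \text{DEM} + \beta_2 \text{RTA} + u \\
E[\text{TARIFF} \mid \text{DEM}, \text{RTA}=0] &= \beta_0 + \beta_1 \text{DEM} \\
E[\text{TARIFF} \mid \text{DEM}, \text{RTA}=1] &= (\beta_0 + \beta_2) + \beta_1 \text{DEM} \\
E[\text{TARIFF} \mid \text{DEM}, \text{RTA}=1] - E[\text{TARIFF} \mid \text{DEM}, \text{RTA}=0] &= \beta_2
\end{align*}
```

So the dummy's coefficient is the gap between the two groups at any value of DEM: the slope on DEM is shared, and only the intercept moves.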


Democracy and Tariff Barriers

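[Figure: Percent Tariffs (y-axis, 0 to 50) by regime type (Dictator, Oligarch, Anocracy, Democracy); series: RTA, No RTA.]

$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$

$\hat{\beta}_0 = 50$ and $\hat{\beta}_1 = -5$ and $\hat{\beta}_2 = -10$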

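Graphical Depiction of a Dummy

[Figure: y plotted against x1 (could be continuous, categorical, or dichotomous). Two lines share the slope $\hat{\beta}_1$: the line for $x_2 = 0$ has intercept $\hat{\beta}_0$, and the line for $x_2 = 1$ has intercept $\hat{\beta}_0 + \hat{\beta}_2$.]

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 \quad \text{if } x_2 = 1$$

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 \quad \text{if } x_2 = 0, \text{ i.e. } \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1$$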


Multiple Category Dummies

Dummy variables are a very flexible way to assess categorical differences in the mean of y

We can use dummies even for concepts with multiple categories

Imagine we want to capture the impact of global region on tariffs

Regions: Americas, Europe, Asia, Africa


Warning!

Do not fall into the dummy variable trap!

The trap arises when you enter both values of a dummy variable in the same regression. The two variables sum to 1 for every case, so together with the constant they are linearly dependent. One will drop out.


Multiple Category Dummies

Create 4 separate dummy variables - 1 for each region

Include all except one of these dummies in the equation

If you include all 4 dummies you get perfect collinearity with the constant. The fourth dummy will drop out.

Americas+Europe+Asia+Africa=1
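The collinearity can be checked numerically by comparing the rank of the design matrix with its number of columns. This is a minimal sketch on a made-up list of 20 cases; the variable names are illustrative only.

```python
import numpy as np

names = ["Americas", "Europe", "Asia", "Africa"]
regions = np.array(names * 5)                       # 20 hypothetical cases

# One dummy column per region, plus a constant.
dummies = np.column_stack([(regions == r).astype(float) for r in names])
const = np.ones(len(regions))

X_all = np.column_stack([const, dummies])           # constant + all 4 dummies
X_ok = np.column_stack([const, dummies[:, 1:]])     # constant + 3 dummies (Americas excluded)

rank_all = np.linalg.matrix_rank(X_all)
rank_ok = np.linalg.matrix_rank(X_ok)

print(X_all.shape[1], rank_all)   # prints "5 4": five columns, but only rank 4
print(X_ok.shape[1], rank_ok)     # prints "4 4": full column rank
```

Because the four dummies sum to the constant column, the five-column matrix carries only four independent columns, which is why regression software drops one of them.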


Interpreting Multi-Category Dummies

Each coefficient compares the mean for that group to the mean in the excluded category

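Thus if:

$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{EUR} + \hat{\beta}_3 \text{ASIA} + \hat{\beta}_4 \text{AFR} + \hat{u}$$

then βhat2 through βhat4 compare the mean tariff in each region to the mean in the Americas

Mean in Americas is βhat0 (when DEM = 0)

An alternative strategy is to drop the constant and run all dummies, as discussed last week.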
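The comparison-to-the-excluded-category reading can be verified on simulated data. The sketch below is a simplification: it leaves DEM out of the regression so that the coefficients can be compared with raw group means, and the regional means are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
names = ["Americas", "Europe", "Asia", "Africa"]
region = rng.choice(names, size=400)

# Hypothetical mean tariff by region, plus noise.
true_mean = {"Americas": 20.0, "Europe": 10.0, "Asia": 15.0, "Africa": 25.0}
tariff = np.array([true_mean[r] for r in region]) + rng.normal(0, 3, size=400)

# Constant plus dummies for every region except the Americas (the excluded category).
X = np.column_stack([np.ones(400)] + [(region == r).astype(float) for r in names[1:]])
b = np.linalg.lstsq(X, tariff, rcond=None)[0]

group_mean = {r: tariff[region == r].mean() for r in names}

print(f"constant = {b[0]:.3f}, Americas mean = {group_mean['Americas']:.3f}")
for j, r in enumerate(names[1:], start=1):
    gap = group_mean[r] - group_mean["Americas"]
    print(f"{r}: coefficient = {b[j]:.3f}, mean gap vs Americas = {gap:.3f}")
```

In this dummy-only setup the constant equals the sample mean of the excluded category and each dummy coefficient equals that region's mean minus the Americas mean. With DEM in the model, the same comparisons hold conditional on DEM.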


Dumb Dummies

Dummy variables are easy, flexible ways to measure categorical concepts

They CAN be just labels for ignorance

Try to use dummies to capture theoretical constructs, not empirical observations

If possible, measure the theoretical construct more directly


Interaction Effects

Dummy variables specify new intercepts

Other slope coefficients in the equation do not change

OLS assumes that the slopes of continuous variables are constant across all cases

What if slopes are different for different groups in our sample?


Interaction Effects: An Example

What if the effect of democracy on tariffs depends on whether the state is in an RTA?

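$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$

$$\hat{\beta}_1 = \hat{\delta}_0 + \hat{\delta}_1 \text{RTA}$$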


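Interaction Effects: An Illustration

(Notice that democracy has been converted to a dummy as well for illustration purposes)

[Figure: Percent Tariffs (y-axis, 0 to 35) for Non-Dem and Democracy; series: RTA, No RTA.]

$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$

$$\hat{\beta}_1 = -5 \quad \text{if RTA} = 0$$

$$\hat{\beta}_1 = -6 \quad \text{if RTA} = 1$$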


How Do We Estimate This Set of Relationships?

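We begin with:

$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$

$$\hat{\beta}_1 = \hat{\delta}_0 + \hat{\delta}_1 \text{RTA}$$

Substituting for βhat1, we get:

$$\text{TARIFF} = \hat{\beta}_0 + (\hat{\delta}_0 + \hat{\delta}_1 \text{RTA})\,\text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$

$$\text{TARIFF} = \hat{\beta}_0 + \hat{\delta}_0 \text{DEM} + \hat{\delta}_1 (\text{DEM} \times \text{RTA}) + \hat{\beta}_2 \text{RTA} + \hat{u}$$

In STATA, $\hat{\delta}_0$, $\hat{\delta}_1$, and $\hat{\beta}_2$ will appear as regular coefficients: βhat1 (on DEM), βhat2 (on DEM*RTA), and βhat3 (on RTA).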
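In practice the substituted equation is estimated by adding the product DEM*RTA as an extra regressor. The following is a sketch in Python with NumPy rather than Stata, on simulated data; the data-generating values are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4000
dem = rng.integers(0, 2, size=n).astype(float)   # democracy as a 0/1 dummy
rta = rng.integers(0, 2, size=n).astype(float)   # RTA membership as a 0/1 dummy

# Assumed data-generating values (illustrative only).
beta0, delta0, delta1, beta2 = 30.0, -5.0, -1.0, -10.0
tariff = beta0 + delta0 * dem + delta1 * dem * rta + beta2 * rta + rng.normal(0, 1, size=n)

# Columns: constant, DEM, DEM*RTA, RTA. The product is just another regressor.
X = np.column_stack([np.ones(n), dem, dem * rta, rta])
b = np.linalg.lstsq(X, tariff, rcond=None)[0]

slope_no_rta = b[1]          # slope of DEM when RTA = 0
slope_rta = b[1] + b[2]      # slope of DEM when RTA = 1

print(f"coefficients: {np.round(b, 2)}")
print(f"slope of DEM, RTA=0: {slope_no_rta:.2f}")
print(f"slope of DEM, RTA=1: {slope_rta:.2f}")
```

The regression output lists four ordinary coefficients. The conditional slopes of DEM are recovered afterwards by adding the interaction coefficient to the DEM coefficient.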


What Do These Coefficients Mean?

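$$\text{TARIFF} = \hat{\beta}_0 + \hat{\delta}_0 \text{DEM} + \hat{\delta}_1 (\text{DEM} \times \text{RTA}) + \hat{\beta}_2 \text{RTA} + \hat{u}$$

$\hat{\beta}_0$ is the intercept for DEM when RTA=0

$\hat{\beta}_0 + \hat{\beta}_2$ is the new intercept for DEM when RTA=1

$\hat{\delta}_0$ is the slope of DEM when RTA=0

$\hat{\delta}_1$ is the impact of RTA on the coefficient for DEM

So if RTA=1, the slope of DEM is $\hat{\delta}_0 + \hat{\delta}_1$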


Interpreting the Interaction

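Recall that:

$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$

$$\hat{\beta}_1 = \hat{\delta}_0 + \hat{\delta}_1 \text{RTA}$$

$$\text{TARIFF} = \hat{\beta}_0 + (\hat{\delta}_0 + \hat{\delta}_1 \text{RTA})\,\text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$

$$\text{TARIFF} = \hat{\beta}_0 + \hat{\delta}_0 \text{DEM} + \hat{\delta}_1 (\text{DEM} \times \text{RTA}) + \hat{\beta}_2 \text{RTA} + \hat{u}$$

RTA is a dummy variable taking on the values 0 or 1

Thus if RTA=0, then $\hat{\beta}_1 = \hat{\delta}_0$

But if RTA=1, then $\hat{\beta}_1 = \hat{\delta}_0 + \hat{\delta}_1$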


An Illustration of the Coefficients

Imagine we estimate:

$$\text{TARIFF} = 30 - 5(\text{DEM}) - 1(\text{DEM} \times \text{RTA}) - 10(\text{RTA})$$

[Figure: Percent Tariffs (y-axis, 0 to 35) for Non-Dem and Democracy; series: RTA, No RTA.]


Substantive Effects of Dummy Interactions

                 No RTA                  RTA
Non-Democracy    βhat0 = 30              βhat0 + βhat3 = 20
Democracy        βhat0 + βhat1 = 25      βhat0 + βhat1 + βhat2 + βhat3 = 14
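The four cells can be reproduced by plugging the dummy values into the fitted equation. The sketch below backs the coefficients out of the cell values (30, 20, 25, 14), with βhat1 on DEM, βhat2 on DEM*RTA, and βhat3 on RTA; the function and variable names are illustrative.

```python
from itertools import product

# Coefficients implied by the four cells: constant, DEM, DEM*RTA, RTA.
b0, b1, b2, b3 = 30, -5, -1, -10

def predicted_tariff(dem, rta):
    """Predicted tariff for a given combination of the two dummies."""
    return b0 + b1 * dem + b2 * dem * rta + b3 * rta

table = {(dem, rta): predicted_tariff(dem, rta) for dem, rta in product([0, 1], repeat=2)}
for (dem, rta), value in table.items():
    print(f"DEM={dem} RTA={rta}: {value}")

# Effect of democracy within each RTA group.
effect_no_rta = table[(1, 0)] - table[(0, 0)]   # -5
effect_rta = table[(1, 1)] - table[(0, 1)]      # -6
```

The effect of democracy is 5 points lower tariffs outside an RTA and 6 points lower inside one. That one-point difference is the interaction coefficient.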


Interactions with Continuous Variables

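The exact same logic about interactions applies if βhat1 depends on a continuous variable

$$y = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{u}$$

$$\hat{\beta}_1 = \hat{\delta}_0 + \hat{\delta}_1 x_2$$

$\hat{\delta}_0$ is the impact of $x_1$ when $x_2 = 0$

$\hat{\delta}_1$ is the change in $\hat{\beta}_1$ for each one unit increase in $x_2$

$\hat{\beta}_2$ is the impact of $x_2$ when $x_1 = 0$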
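With a continuous moderator, the slope of x1 is no longer one of two values but a line in x2. The short sketch below evaluates that conditional slope at a few values of x2; the coefficient values are hypothetical and not taken from the slides.

```python
import numpy as np

# Hypothetical coefficients: delta0 is the slope of x1 at x2 = 0,
# delta1 is the change in that slope per one-unit increase in x2.
delta0, delta1 = -2.0, -1.0

def slope_of_x1(x2):
    """Conditional slope of x1, computed as delta0 + delta1 * x2."""
    return delta0 + delta1 * np.asarray(x2, dtype=float)

x2_values = np.array([0, 2, 4, 6])
slopes = slope_of_x1(x2_values)
for x2, s in zip(x2_values, slopes):
    print(f"x2 = {x2}: slope of x1 = {s}")
```

Reporting the slope at several substantively meaningful values of x2 is usually more informative than reporting the interaction coefficient alone.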


Example: Democracy, Tariffs & Unemployment

$$\text{TARIFF} = 28 - 2(\text{DEM}) - 1(\text{DEM} \times \text{UNEMP}) + 5(\text{UNEMP})$$

[Figure: predicted Tariff Rate (y-axis, 10 to 50) by Democracy 1-4 (x-axis: Dictator, Oligarch, Anocrat, Democracy), with one yhat series for each of Unemployment == 0, 2, 4, and 6.]

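Graphical Depiction of a Dummy/Continuous Interaction

[Figure: y plotted against x1 (could be continuous, categorical, or dichotomous). The line for $x_2 = 0$ has intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1 = \hat{\delta}_0$. The line for $x_2 = 1$ has intercept $\hat{\beta}_0 + \hat{\beta}_3$ and slope $\hat{\beta}_1 = \hat{\delta}_0 + \hat{\delta}_1$. The figure also labels the dummy-only line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$ if $x_2 = 1$.]

$$\hat{y} = \hat{\beta}_0 + \hat{\delta}_0 x_1 + \hat{\delta}_1 (x_1 \times x_2) + \hat{\beta}_3 x_2 \quad \text{if } x_2 = 1$$

$$\hat{y} = \hat{\beta}_0 + \hat{\delta}_0 x_1 + \hat{\delta}_1 (x_1 \times x_2) + \hat{\beta}_3 x_2 \quad \text{if } x_2 = 0$$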


What if a Variable Interacts with Itself?

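What if βhat1 depends on the value of x1?

$$y = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{u}$$

$$\hat{\beta}_1 = \hat{\delta}_0 + \hat{\delta}_1 x_1$$

Then we substitute in as before:

$$y = \hat{\beta}_0 + (\hat{\delta}_0 + \hat{\delta}_1 x_1) x_1 + \hat{\beta}_2 x_2 + \hat{u}$$

$$y = \hat{\beta}_0 + \hat{\delta}_0 x_1 + \hat{\delta}_1 x_1^2 + \hat{\beta}_2 x_2 + \hat{u}$$

Curvilinear (Quadratic) effect is a type of interaction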
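Estimating the quadratic form only requires adding the square of x1 as a regressor. The sketch below uses simulated data with assumed coefficient values; none of the numbers come from the slides.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
x1 = rng.uniform(0, 10, size=n)
x2 = rng.normal(size=n)

# Assumed data-generating values (illustrative only).
beta0, delta0, delta1, beta2 = 5.0, 3.0, -0.25, 1.0
y = beta0 + delta0 * x1 + delta1 * x1**2 + beta2 * x2 + rng.normal(0, 1, size=n)

# Columns: constant, x1, x1 squared (x1 interacted with itself), x2.
X = np.column_stack([np.ones(n), x1, x1**2, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]

def marginal_effect(x):
    """Derivative of the fitted curve with respect to x1, evaluated at x."""
    return b[1] + 2 * b[2] * x

print(f"coefficients: {np.round(b, 3)}")
for x in (0.0, 5.0, 10.0):
    print(f"x1 = {x}: marginal effect = {marginal_effect(x):.2f}")
```

One point worth noting: differentiating the fitted curve gives an instantaneous slope of $\hat{\delta}_0 + 2\hat{\delta}_1 x_1$, so the marginal effect of x1 changes twice as fast in x1 as the coefficient on the squared term alone suggests.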


More Complex Interactions

We can use this method to specify the functional form of βhat1 in any way we choose

Simply substitute the function in for βhat1, multiply out the terms, and estimate

The only limitations are theories of interaction and levels of collinearity


Examples of interaction effects from my own research


Governance and Economic Welfare

Figure 4: PCI Performance and Economic Welfare

[Figure: GDP per capita (in Millions of Constant 1994 VND) plotted against Structural Endowments (Infrastructure, Human Capital, Proximity to Markets), 0 to 100; series: Low PCI, High PCI.]

“The Governance Premium”: Better governed (high PCI) provinces are able to generate higher living standards from the same level of development


Predicted Number of Loans by Legal Status among Vietnamese Private Firms

                     Land Use Rights Certificate
Registered at DPI    None    Partial    Full
No                   0.83    0.99       1.2
Yes                  2.73    3.27       3.98


Predicted Probability of Provincial Division in Vietnam

(By State Sector Output with Number of Cabinet Officials)

[Figure: Predicted Probability of Provincial Division (y-axis, .4 to .8) plotted against State Contribution to Provincial Output (x-axis, 0 to 1); series: No Cabinet Members, 1 Cabinet Member, 2+ Cabinet Members. Contribution of covariates at 75th percentile.]
