77
Case Studies and Regression Analysis Jason Seawright [email protected] August 11, 2010 J. Seawright (PolSci) Essex 2010 August 11, 2010 1 / 44

Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright [email protected] August

Embed Size (px)

Citation preview

Page 1: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Case Studies and RegressionAnalysis

Jason Seawright

[email protected]

August 11, 2010

J. Seawright (PolSci) Essex 2010 August 11, 2010 1 / 44

Page 2: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Examples of Strengths andWeaknesses

J. Seawright (PolSci) Essex 2010 August 11, 2010 2 / 44

Page 3: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Case Selection

J. Seawright (PolSci) Essex 2010 August 11, 2010 3 / 44

Page 4: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Case Selection

1 Study the entire population.

J. Seawright (PolSci) Essex 2010 August 11, 2010 3 / 44

Page 5: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Case Selection

1 Study the entire population.

2 Take a random sample.

J. Seawright (PolSci) Essex 2010 August 11, 2010 3 / 44

Page 6: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Case Selection

1 Study the entire population.

2 Take a random sample.

3 Follow some rule for deliberate case

selection.

J. Seawright (PolSci) Essex 2010 August 11, 2010 3 / 44

Page 7: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Mill’s Methods

J. Seawright (PolSci) Essex 2010 August 11, 2010 4 / 44

Page 8: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Mill’s Methods

Method of Agreement

J. Seawright (PolSci) Essex 2010 August 11, 2010 4 / 44

Page 9: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Mill’s Methods

Method of Agreement

Method of Difference

J. Seawright (PolSci) Essex 2010 August 11, 2010 4 / 44

Page 10: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Mill’s Methods

Method of Agreement

Method of Difference

Etc..

J. Seawright (PolSci) Essex 2010 August 11, 2010 4 / 44

Page 11: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Crucial Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 5 / 44

Page 12: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Crucial Cases

Theory assigns high likelihood to an

outcome in a particular case that, for (all,

or most, or a major) competing theory has

low likelihood.

J. Seawright (PolSci) Essex 2010 August 11, 2010 5 / 44

Page 13: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Regression Analysis

In a survey of 1000 articles in 10 leading

political science journals, 49% used

statistics (Bennett, Barth, and Rutherford

2003).

Presumably, most of them involve somevariant of regression.

J. Seawright (PolSci) Essex 2010 August 11, 2010 6 / 44

Page 14: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Regression Analysis

In a survey of 1000 articles in 10 leading

political science journals, 49% used

statistics (Bennett, Barth, and Rutherford

2003).

Presumably, most of them involve somevariant of regression.

A search in JStor for the word “regression”

finds 10,404 relevant articles.

J. Seawright (PolSci) Essex 2010 August 11, 2010 6 / 44

Page 15: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Regression Analysis

Almost no matter what you work on, you

will have to interact with regression-based

studies.

J. Seawright (PolSci) Essex 2010 August 11, 2010 7 / 44

Page 16: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Choosing Cases

Case-selection rules:

J. Seawright (PolSci) Essex 2010 August 11, 2010 8 / 44

Page 17: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Choosing Cases

Case-selection rules:

Random sampling

J. Seawright (PolSci) Essex 2010 August 11, 2010 8 / 44

Page 18: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Choosing Cases

Case-selection rules:

Random samplingTypical cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 8 / 44

Page 19: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Choosing Cases

Case-selection rules:

Random samplingTypical casesDiverse cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 8 / 44

Page 20: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Choosing Cases

Case-selection rules:

Random samplingTypical casesDiverse casesExtreme cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 8 / 44

Page 21: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Choosing Cases

Case-selection rules:

Random samplingTypical casesDiverse casesExtreme casesDeviant cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 8 / 44

Page 22: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Choosing Cases

Case-selection rules:

Random samplingTypical casesDiverse casesExtreme casesDeviant casesInfluential cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 8 / 44

Page 23: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Choosing Cases

Case-selection rules:

Random samplingTypical casesDiverse casesExtreme casesDeviant casesInfluential casesMost-similar cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 8 / 44

Page 24: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Running Example

J. Seawright (PolSci) Essex 2010 August 11, 2010 9 / 44

Page 25: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Typical Cases

Typicalityi = −abs[yi − E(yi |x1,i , x2,i , . . . , xk ,i)] (1)

J. Seawright (PolSci) Essex 2010 August 11, 2010 10 / 44

Page 26: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Typical Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 11 / 44

Page 27: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Extreme Cases

Extremityi = |xi − x

s| (2)

J. Seawright (PolSci) Essex 2010 August 11, 2010 12 / 44

Page 28: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Deviant Cases

Deviantnessi = −Typicalityi (3)

J. Seawright (PolSci) Essex 2010 August 11, 2010 13 / 44

Page 29: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Influential Cases

Cook’s distance is a statistical measure of

how much the overall regression result

would change if a given case is deleted.

J. Seawright (PolSci) Essex 2010 August 11, 2010 14 / 44

Page 30: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Influential Cases

Cook’s distance is a statistical measure of

how much the overall regression result

would change if a given case is deleted.

A Cook’s distance score of 1 or more

usually is regarded as representing

substantial influence.

J. Seawright (PolSci) Essex 2010 August 11, 2010 14 / 44

Page 31: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Influential Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 15 / 44

Page 32: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Most-Similar Cases

Matching techniques are an automated way

of finding most similar cases.

J. Seawright (PolSci) Essex 2010 August 11, 2010 16 / 44

Page 33: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in Y

Y∗i = Yi + δY ,i

J. Seawright (PolSci) Essex 2010 August 11, 2010 17 / 44

Page 34: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in Y

Y∗i = Yi + δY ,i

Random Sampling

J. Seawright (PolSci) Essex 2010 August 11, 2010 17 / 44

Page 35: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in Y

Typical/Deviant Cases:

ei = Yi −Hi ,·Y + δY ,i

J. Seawright (PolSci) Essex 2010 August 11, 2010 18 / 44

Page 36: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in Y

Influential Cases Strategy:

Maximizes the product of the error term and the

weighted average distance of the right-hand-side

variables from their means.

J. Seawright (PolSci) Essex 2010 August 11, 2010 19 / 44

Page 37: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in Y

Extreme Cases:

Y∗i = Yi + δY ,i

J. Seawright (PolSci) Essex 2010 August 11, 2010 20 / 44

Page 38: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in Y

Most-Similar Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 21 / 44

Page 39: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in Y

Most-Similar Cases

Most-Different Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 21 / 44

Page 40: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in X

X∗i = Xi + δX ,i

J. Seawright (PolSci) Essex 2010 August 11, 2010 22 / 44

Page 41: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in X

X∗i = Xi + δX ,i

Random Sampling

J. Seawright (PolSci) Essex 2010 August 11, 2010 22 / 44

Page 42: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in X

Typical/Deviant Cases:

ei = Yi − Xi β∗ − δX ,i β

J. Seawright (PolSci) Essex 2010 August 11, 2010 23 / 44

Page 43: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in X

Influential Cases Strategy:

Maximizes the product of the error term and the

weighted average distance of the right-hand-side

variables from their means.

J. Seawright (PolSci) Essex 2010 August 11, 2010 24 / 44

Page 44: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in X

Extreme Cases:

X∗i = Xi + δX ,i

J. Seawright (PolSci) Essex 2010 August 11, 2010 25 / 44

Page 45: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in X

Most-Similar Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 26 / 44

Page 46: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Measurement Error in X

Most-Similar Cases

Most-Different Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 26 / 44

Page 47: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Omitted Variables

ei = di + γZi , where Zi = ZI − E(Zi |Xi)

J. Seawright (PolSci) Essex 2010 August 11, 2010 27 / 44

Page 48: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Omitted Variables

ei = di + γZi , where Zi = ZI − E(Zi |Xi)

Random Sampling

J. Seawright (PolSci) Essex 2010 August 11, 2010 27 / 44

Page 49: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Omitted Variables

Typical/Deviant Cases:

ei = di + γZi

J. Seawright (PolSci) Essex 2010 August 11, 2010 28 / 44

Page 50: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Omitted Variables

Influential Cases Strategy:

Maximizes the product of the error term and the

weighted average distance of the right-hand-side

variables from their means.

J. Seawright (PolSci) Essex 2010 August 11, 2010 29 / 44

Page 51: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Omitted Variables

Extreme Cases:

J. Seawright (PolSci) Essex 2010 August 11, 2010 30 / 44

Page 52: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Omitted Variables

Extreme Cases:

For confounders, extreme on X may be a good

strategy.

J. Seawright (PolSci) Essex 2010 August 11, 2010 30 / 44

Page 53: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Omitted Variables

Extreme Cases:

For confounders, extreme on X may be a good

strategy.

Extreme on Y maximizes:

Yi + di + γZi

J. Seawright (PolSci) Essex 2010 August 11, 2010 30 / 44

Page 54: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Omitted Variables

Most-Similar Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 31 / 44

Page 55: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Omitted Variables

Most-Similar Cases

Most-Different Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 31 / 44

Page 56: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Pathway Variables

Wi = ν + µXi + ωi

Yi = α + τWi + σi

J. Seawright (PolSci) Essex 2010 August 11, 2010 32 / 44

Page 57: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Pathway Variables

Wi = ν + µXi + ωi

Yi = α + τWi + σiRandom Sampling

J. Seawright (PolSci) Essex 2010 August 11, 2010 32 / 44

Page 58: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Pathway Variables

Typical/Deviant Cases:

ei = τωi + σi

J. Seawright (PolSci) Essex 2010 August 11, 2010 33 / 44

Page 59: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Pathway Variables

Influential Cases Strategy:

Maximizes the product of the error term and the

weighted average distance of the right-hand-side

variables from their means.

J. Seawright (PolSci) Essex 2010 August 11, 2010 34 / 44

Page 60: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Pathway Variables

Extreme Cases:

J. Seawright (PolSci) Essex 2010 August 11, 2010 35 / 44

Page 61: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Pathway Variables

Extreme Cases:

Wi = ν + µXi + ωi

J. Seawright (PolSci) Essex 2010 August 11, 2010 35 / 44

Page 62: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Pathway Variables

Extreme Cases:

Wi = ν + µXi + ωi

Extreme on Y maximizes:

Yi = α + τWi + σi

J. Seawright (PolSci) Essex 2010 August 11, 2010 35 / 44

Page 63: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Pathway Variables

Most-Similar Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 36 / 44

Page 64: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Pathway Variables

Most-Similar Cases

Most-Different Cases

J. Seawright (PolSci) Essex 2010 August 11, 2010 36 / 44

Page 65: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Summary: Analytic Arguments

Deviant Influential Ext. X Ext. Y

Error in Y Good Mixed Poor Good

Error in X Mixed Mixed Good Poor

Confound Good Mixed Mixed Good

Pathway Good Mixed Good Mixed

J. Seawright (PolSci) Essex 2010 August 11, 2010 37 / 44

Page 66: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Monte Carlo for Case Selection

Simulate case selection for the same problem

10,000 times.

J. Seawright (PolSci) Essex 2010 August 11, 2010 38 / 44

Page 67: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Monte Carlo for Case Selection

Simulate case selection for the same problem

10,000 times.

Analysis of presidential vote shares and the

economy in Latin America, 1980-2000.

J. Seawright (PolSci) Essex 2010 August 11, 2010 38 / 44

Page 68: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Monte Carlo for Case Selection

Simulate case selection for the same problem

10,000 times.

Analysis of presidential vote shares and the

economy in Latin America, 1980-2000.

Add measurement error, omitted variables,

etc.

J. Seawright (PolSci) Essex 2010 August 11, 2010 38 / 44

Page 69: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Monte Carlo for Case Selection

Simulate case selection for the same problem

10,000 times.

Analysis of presidential vote shares and the

economy in Latin America, 1980-2000.

Add measurement error, omitted variables,

etc.

2 SD Rule

J. Seawright (PolSci) Essex 2010 August 11, 2010 38 / 44

Page 70: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Simulation Results

Random Typical Influential Deviant Extreme Y Extreme X Similar Different Contrast

0.00

0.05

0.10

0.15

0.20

Figure: Case Selection for Finding Confounder.

J. Seawright (PolSci) Essex 2010 August 11, 2010 39 / 44

Page 71: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Simulation Results

Random Typical Influential Deviant Extreme Y Extreme X Similar Different Contrast

0.00

0.05

0.10

0.15

0.20

0.25

0.30

Figure: Case Selection for Other Causes.

J. Seawright (PolSci) Essex 2010 August 11, 2010 40 / 44

Page 72: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Simulation Results

Random Typical Influential Deviant Extreme Y Extreme X Similar Different Contrast

0.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

Figure: Case Selection for Exploring Mechanisms.

J. Seawright (PolSci) Essex 2010 August 11, 2010 41 / 44

Page 73: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Simulation Results

Random Typical Influential Deviant Extreme Y Extreme X Similar Different Contrast

0.00

0.05

0.10

0.15

0.20

0.25

0.30

Figure: Case Selection for Error in X .

J. Seawright (PolSci) Essex 2010 August 11, 2010 42 / 44

Page 74: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Simulation Results

Random Typical Influential Deviant Extreme Y Extreme X Similar Different Contrast

0.0

0.2

0.4

0.6

0.8

Figure: Case Selection for Error in Y .

J. Seawright (PolSci) Essex 2010 August 11, 2010 43 / 44

Page 75: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Simulation Results

Random Typical Influential Deviant Extreme Y Extreme X Similar Different Contrast

0.0

0.2

0.4

0.6

0.8

1.0

1.2

Figure: Case Selection for Estimating Overall Slope.

J. Seawright (PolSci) Essex 2010 August 11, 2010 44 / 44

Page 76: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Case-selection software in R

J. Seawright (PolSci) Essex 2010 August 11, 2010 45 / 44

Page 77: Case Studies and Regression Analysisfaculty.wcas.northwestern.edu/~jws780/essexslides/day3.pdfCase Studies and Regression Analysis Jason Seawright j-seawright@northwestern.edu August

Assignment

Implement each case-selection technique for a

data set of interest, or off my website. Be

prepared to discuss what kinds of cases you get,

and whether they seem on first glance to be

useful, tomorrow.

J. Seawright (PolSci) Essex 2010 August 11, 2010 46 / 44