Life After P-hacking (APS May 2013, Washington DC). With minor edits for posting. Uri Simonsohn, Penn (gave the talk); Leif Nelson, UC Berkeley; Joe Simmons, Penn also. Photo not necessary.


Page 1

Life After P-hacking (APS May 2013, Washington DC)

With minor edits for posting

Uri Simonsohn, Penn (gave the talk)

Leif Nelson, UC Berkeley

Joe Simmons, Penn also

Photo not necessary

Page 2

Definition

p-hacking: exploiting researcher degrees of freedom in pursuit of p < .05
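One way to see what this definition implies is a small simulation. The sketch below is not from the talk: it assumes two groups with no true difference, a known standard deviation of 1 (so a z-test suffices), and a hypothetical researcher who collects three independent measures and reports whichever gives the smallest p. All names and parameter values are illustrative.

```python
import random
from statistics import NormalDist, mean


def two_sided_p(group_a, group_b, sd=1.0):
    """Two-sided z-test p-value for a difference in means (sd assumed known)."""
    se = sd * (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (mean(group_a) - mean(group_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rates(n_per_cell=20, n_measures=3, n_sims=2000, seed=1):
    """Share of null studies reaching p < .05, honest vs. best-of-several measures."""
    rng = random.Random(seed)
    honest = hacked = 0
    for _ in range(n_sims):
        p_values = []
        for _ in range(n_measures):
            # Both groups come from the same distribution: there is no true effect.
            a = [rng.gauss(0, 1) for _ in range(n_per_cell)]
            b = [rng.gauss(0, 1) for _ in range(n_per_cell)]
            p_values.append(two_sided_p(a, b))
        honest += p_values[0] < 0.05    # report the one measure chosen in advance
        hacked += min(p_values) < 0.05  # report whichever measure "worked"
    return honest / n_sims, hacked / n_sims


honest_rate, hacked_rate = false_positive_rates()
print(f"honest: {honest_rate:.3f}  p-hacked: {hacked_rate:.3f}")
```

Because the pre-chosen measure is always among the candidates, the hacked rate can never fall below the honest one, and with independent measures it grows with every extra measure tried.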

Page 3

Life after p-hacking

• n>50
• Direct replications
• 21 words
• Compromise writing
• Who to hire
• What about Bayesian?

Page 4

~ Median study: n=20

• False-Positive Psych: n>20

• What can you reliably detect with n=20?

• MTurk study
  – N=674
  – Why not published ds?

Page 5

n=20 is enough for:

• Men taller than women: n=6

• People above median age closer to retirement: n=10

• Women, more shoes than men: n=15

Page 6

n=20 is not enough for:

• People who like spicy food are more likely to like Indian food: n = 27

• Liberals rate social equality as more important than do conservatives: n = 34

• People who like eggs report eating egg salad more often: n = 47

• Men weigh more than women: n = 47

• Smokers think smoking is less likely to kill someone than do non-smokers: n = 146

Page 7

• People who like spicy food are more likely to like Indian food: n = 27

• Liberals rate social equality as more important than do conservatives: n = 34

• People who like eggs report eating egg salad more often: n = 47

• Men weigh more than women: n = 47

• Smokers think smoking is less likely to kill someone than do non-smokers: n = 146

Page 8

• Are you studying a bigger effect than "men weigh more than women"?

• If not, use n>50
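The n>50 rule of thumb can be related to a standard power calculation. The sketch below uses the normal approximation for a two-cell comparison at two-sided α = .05 and 80% power. Those settings and the function names are assumptions for illustration, not values stated on the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist


def _z_sum(alpha=0.05, power=0.80):
    """Sum of the critical z for a two-sided test and the z for the target power."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def min_detectable_d(n_per_cell, alpha=0.05, power=0.80):
    """Smallest standardized difference (Cohen's d) detectable with n per cell."""
    return _z_sum(alpha, power) * sqrt(2 / n_per_cell)


def n_per_cell_needed(d, alpha=0.05, power=0.80):
    """Participants per cell needed to detect a standardized difference d."""
    return ceil(2 * (_z_sum(alpha, power) / d) ** 2)


print(round(min_detectable_d(20), 2))  # 0.89
print(round(min_detectable_d(50), 2))  # 0.56
print(n_per_cell_needed(0.5))          # 63
```

The normal approximation runs slightly low compared with an exact t-based calculation, so these figures are best read as rough lower bounds on the sample size.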

Page 9

Life after p-hacking

• n>50
• Direct replications
• 21 words
• Compromise writing
• Who to hire
• What about Bayesian?

Page 10

[Figure: Estimate (y-axis, -0.15 to 0.15), Low vs. High, for Lion's Weight, Coins, and Calories]

Estimates are way off

Subjects confused?

Big outliers

Page 11

[Figure: Estimate (y-axis, -0.25 to 0.25), Low vs. High, for Lion's Weight, Coins, and Calories]

p < .03

Estimates are way off

Subjects confused?

Big outliers

Page 12

[Figure: Estimate (y-axis, -0.25 to 0.25), Low vs. High, for Calories]

p < .03

Study 1?

Page 13

• Run calories study again.
• Same exclusion rule.

Page 14

Why not just conceptual replication?

• Restart p-hacking clock

• Failures do not count
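The "failures do not count" point comes down to simple arithmetic. As a sketch, assume each conceptual replication is an independent attempt with a 5% false-positive rate and that failed attempts go unreported (both are assumptions made here for illustration). The chance that at least one attempt "works" when there is no effect is then 1 - (1 - α)^k.

```python
def chance_of_some_success(n_attempts, alpha=0.05):
    """P(at least one of n independent null studies reaches p < alpha)."""
    return 1 - (1 - alpha) ** n_attempts


for k in (1, 5, 10):
    print(k, round(chance_of_some_success(k), 3))
# 1 0.05
# 5 0.226
# 10 0.401
```

A direct replication with the same design and the same exclusion rule corresponds to the k = 1 row: a failure cannot be quietly set aside, so the nominal rate holds.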

Page 15

Replications

• Conceptual
  – Rule out confounds
  – Rule in generalizability

• Direct
  – Rule out false-positive

Page 16

Life after p-hacking

• n>50
• Direct replications
• 21 words (Google it)
• Compromise writing
• Who to hire
• What about Bayesian?

Page 17

How can an organic farmer compete?

Page 18

How can an organic researcher compete?

• If you determined sample size in advance: say it.

• If you did not drop variables: say it.

• If you did not drop conditions: say it.

Page 19

21 Word Solution (get .pdf here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2160588)

Footnote 1

We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study.

Organic Farmer Organic Researcher

Page 20

Life after p-hacking

• n>50
• Direct replications
• 21 words
• Compromise writing
• Who to hire
• What about Bayesian?

Page 21

Compromise writing

• While reviewers still in dark ages.
• Have it both ways.
• “Clean” version in main text
  – All studies “worked” & < 2500 words
• Supplement/footnote
  – n=100, n=150
  – p=.08 w/o exclusion
  – Data and materials online
• Only reformers read small print
• Organic 21 words applies.
• Everybody likes the paper

Page 22

Life after p-hacking

• n>50
• Direct replications
• 21 words
• Compromise writing
• Who to hire
• What about Bayesian?

Page 23

If you hire based on quantity, you pass on these guys

Page 24

What’s the alternative to counting papers?

• Rookies: Best 1
• Tenure: Best 3
• Full: Best 5

Try it. It is a powerful question. What’s her best paper?

Page 25

Life after p-hacking

• n>50
• Direct replications
• 21 words
• Compromise writing
• Who to hire
• What about Bayesian? Only speak for myself here.

My prior: Bayesians will be unhappy in 3, 2, 1

Page 26

P-hacking also invalidates Bayesian results

Page 27

P-hacking also invalidates Bayesian results

Let me say that again

Page 28

• Bayesian proposals for Psych

1) Bayesian t-test
• Replications use it sometimes
• Turns out
  – α = 5%

2) Bayesian estimation
• Latest JEP:G.
• Turns out
  – Changes nothing

1%

Page 29

t-test “vs” Bayesian Estimation: changes nothing

How similar?

Results change by less than if we dropped 1 observation at random.

Page 30

But!

• Isn’t data-peeking OK for Bayes?
  – Not when used for hypothesis testing

• Also:
  – Dropped subjects, measures, conditions invalidate all inference.
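A sketch of why peeking matters once a Bayes factor is used as a decision rule. This is not the Bayesian t-test discussed in the talk: it swaps in a simpler textbook model (normal data with known σ = 1, H0: μ = 0, H1: μ ~ N(0, τ²) with τ = 1). The threshold of 3, the sample sizes, and all function names are hypothetical choices.

```python
import random
from math import exp, sqrt


def bf10(sample_mean, n, sigma=1.0, tau=1.0):
    """Bayes factor for H1: mu ~ N(0, tau^2) against H0: mu = 0 (sigma assumed known)."""
    s0 = sigma ** 2 / n              # sampling variance of the mean
    z2 = sample_mean ** 2 / s0       # squared z statistic
    shrink = tau ** 2 / (s0 + tau ** 2)
    return sqrt(s0 / (s0 + tau ** 2)) * exp(0.5 * z2 * shrink)


def null_hit_rates(n_max=100, n_min=10, threshold=3.0, n_sims=2000, seed=1):
    """Share of null data sets where BF10 exceeds the threshold, fixed n vs. peeking."""
    rng = random.Random(seed)
    fixed_hits = peek_hits = 0
    for _ in range(n_sims):
        total = 0.0
        crossed = False
        final_bf = 0.0
        for n in range(1, n_max + 1):
            total += rng.gauss(0, 1)  # the true mean is 0: H0 holds
            if n >= n_min:
                final_bf = bf10(total / n, n)
                # A peeking researcher would stop the first time this is true.
                crossed = crossed or final_bf > threshold
        fixed_hits += final_bf > threshold  # look only once, at n_max
        peek_hits += crossed                # look after every observation
    return fixed_hits / n_sims, peek_hits / n_sims


fixed_rate, peek_rate = null_hit_rates()
print(f"fixed n: {fixed_rate:.3f}  peeking: {peek_rate:.3f}")
```

Both analyses see the same data streams, so the peeking rate can only match or exceed the fixed-n rate. How large the gap is depends on the prior and threshold assumed above.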

Page 31

• P-hacking : Bayesian stats

• Drunk driving : leather seats

Good reasons to go Bayesian do not include p-hacking.

Page 32

• Next slide is the last.

Page 33

Life after p-hacking

• n>50
• Direct replications
• 21 words
• Compromise writing
• Who to hire
• What about Bayesian? Only speak for myself here.

Leif Nelson, UC Berkeley

Joe Simmons, Penn