
Page 1: (and Precision) Effective Research Design  Planning for Grant Proposals & More

9 February 2007 SSP Core Facility 1

Department of Statistics

Power (and Precision)
Effective Research Design Planning for Grant Proposals & More

Walt Stroup, Ph.D.
Professor & Chair, Department of Statistics
University of Nebraska, Lincoln

Page 2

Outline for Talk

I. What is “Power Analysis”? Why should I do it?

II. Essential Background

III. A Word about Software

IV. Decisions that Affect Power – several examples

V. Latest Thinking

VI. Final Thoughts

Page 3

Power and Precision Defined

• Precision, a.k.a. "margin of error": in most cases, the standard error of the relevant estimate
• Power
 − Prob{ reject H0 given H0 false }
 − Prob{ research hypothesis statistically significant }
• Power analysis: essentially, "If I do the study this way, power = ?"
• Sample size estimation: how many observations are required to achieve a given power?
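Power as defined here can be computed directly in the simplest case. A minimal stdlib-Python sketch for a two-sided one-sample z-test; the effect size, n, and alpha below are hypothetical illustrations, not numbers from the talk:

```python
# Power = Prob{ reject H0 given H0 false } for a two-sided one-sample
# z-test. Hypothetical inputs: true effect d = 0.5 SD units, n = 25.
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def z_test_power(d, n, z_crit=1.959964):
    """Two-sided z-test at alpha = 0.05 (z_crit hard-coded to avoid
    needing an inverse CDF). Under HA, Z ~ N(d*sqrt(n), 1)."""
    delta = d * sqrt(n)
    return (1.0 - phi(z_crit - delta)) + phi(-z_crit - delta)

print(round(z_test_power(0.5, 25), 3))  # 0.705
```

Larger n or a larger effect shifts the test statistic's distribution further from the null; that shift is exactly the non-centrality idea developed in the slides that follow.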

Page 4

What’s involved in Power Analysis

• WHAT IT'S NOT: "painting by numbers..."
• IF IT'S DONE RIGHT, power analysis should be
 − a comprehensive conversation to plan the study
 − a "dress rehearsal" for the statistical analysis once the data are collected

Page 5

Why do a Power Analysis?

• For an NIH grant proposal: because it's required
• For many other grant proposals: because it gives you a competitive edge
• Other reasons
 − practical: increases the chance of success; reduces the "we don't have time to do it right, but lots of time to do it over" syndrome
 − ethical

Page 6

Ethical???

• Last Ph.D. in the U.S. Senate; an irritant to the doctrinaire left and right alike
• Keynote address to the 1997 American Statistical Association meetings: "... we can continue to make policy based on 'data-free ideology' or we can inform policy where possible by competent inquiry..."
 − the late U.S. Senator Daniel Patrick Moynihan

Page 7

Ethical

• Results of your study may affect policy
• Well-conceived research means
 − better information
 − greater chance of sound decisions
• Poorly conceived research
 − lost opportunity
 − deprives policy-makers of information that might have been useful
 − or worse: bad information misinforms or misleads the public

Page 8

What affects Power & Precision?

A short statistics lesson:

1. What goes into computing test statistics
2. What test statistics are supposed to tell us
3. A bit about the distribution of test statistics
4. Central and non-central t, F, and chi-square (mostly F)

Page 9

What goes into a test statistic?

• Research hypothesis: the motivation for the study
• Assumed not true unless the data show compelling evidence otherwise
• Research hypothesis: HA; its opposite: H0

                     H0 true         HA true
  Fail to reject H0                  Type II error
  Reject H0          Type I error    Power

Page 10

What goes into a test statistic?

• Visualize using F, but the same basic principles hold for t, chi-square, etc.
• F is the ratio of variation attributable to the factor under study to variation attributable to noise:

 F ≈ N of obs × (effect size)² / variance of noise (i.e., among obs)

Page 11

When H0 True – i.e. no trt effect

F ~ F[ numerator (trt) d.f., denominator (noise/error) d.f. ]   (the central F distribution)

Page 12

When H0 false (i.e. Research HA true)

F ~ F[ num. d.f., den. (error) d.f., φ ]

where φ = N of obs × (effect size)² / variance of noise is the "non-centrality parameter."

Page 13

What affects Power?

Increasing the "non-centrality parameter"

 φ = N of obs × (effect size)² / variance of noise (i.e., among obs)

increases power.
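The non-centrality parameter is simple arithmetic once means and a variance are postulated. A sketch for a one-way (CRD) layout, using as assumed inputs the numbers from the simulation example later in the talk (treatment means 100, 94, 90; SD 10; 5 reps per treatment):

```python
# phi = reps * sum_i (mu_i - mu_bar)^2 / sigma^2 for a one-way layout.
def noncentrality(means, reps, sigma2):
    mu_bar = sum(means) / len(means)
    return reps * sum((m - mu_bar) ** 2 for m in means) / sigma2

phi = noncentrality([100, 94, 90], reps=5, sigma2=100)
print(round(phi, 4))  # 2.5333
```

This matches the ncparm value 2.53333 in the GLIMMIX output shown later in the talk.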

Page 14

What should be in a conversation about Power?

The same non-centrality parameter, φ = N of obs × (effect size)² / variance of noise, drives the conversation:

• Effect size: what is the minimum that matters?
• Variance: how much "noise" is in the response variable (range? distribution? count? pct?)
• Practical constraints
• Design: the same N can produce varying power

Page 15

About Software (part I)

• Canned software
 − lots of it
 − Xiang and Zhou working on a report
 − "painting by numbers"
• Simulation
 − most accurate; not constrained by canned scenarios
 − you can see what will happen if you actually do this...
• "Exemplary data set" + modeling software
 − nearly as accurate as simulation
 − a "dress rehearsal" for the actual analysis
 − MIXED, GLIMMIX, NLMIXED: if you can model it, you can do power analysis

Page 16

Design Decisions – Some Examples

• Main idea: for the same amount of effort, or $$$, or # of observations, power and precision can be quite different
• Power analysis objective: work smarter, not harder
• Simple example: design of a regression study
 − from a STAT 412 exercise

Page 17

Treatment Design Exercise

The class was asked to predict the bounce height of a basketball from its drop height, and to see if the relationship changes depending on floor surface.

Decision: What drop heights to use???

Page 18

Objectives and Operating Definitions

• Recall the objective: does the drop height : bounce height relationship change with floor surface?
• Model, one regression line per surface (C and T):

 y = β0C + β1C X   (surface C)
 y = β0T + β1T X   (surface T)

• "The relationship changes" means β1C ≠ β1T: the operating definition.

Page 19

Consequences of Drop Height Decisions

• Should we use fewer drop heights and more obs per drop height, or vice versa?

[table from the Stat 412 Avery archive]

Page 20

Simulation

• CRD example: 3 treatments, 5 reps per treatment
• Suspected effect size: 6-10% relative to control, whose mean is known to be ~100
• Standard deviation: 10 considered "reasonable"
• Simulate 1000 experiments
• Reject H0 (equal trt means) 228 times
 − power = 0.228 at alpha = 0.05
• Control mean ranked correctly 820 times (intermediate mean ranked correctly 589 times)
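The simulation just described is easy to reproduce. A stdlib-Python sketch (the talk does not show its simulation code, so this is a stand-in): 1000 CRD experiments with means 100/94/90, SD 10, 5 reps, each tested with the one-way ANOVA F statistic against the (2, 12)-df critical value 3.885:

```python
import random

def anova_f(groups):
    """One-way ANOVA F statistic."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_trt = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_err = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_trt / (k - 1)) / (ss_err / (n - k))

random.seed(1)
means, sd, reps, fcrit = [100, 94, 90], 10, 5, 3.885
rejects = sum(
    anova_f([[random.gauss(m, sd) for _ in range(reps)] for m in means]) > fcrit
    for _ in range(1000)
)
print(rejects / 1000)  # close to the talk's 0.228
```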

Page 21

"Exemplary Data"

• Many software packages for power & sample size
 − e.g., SAS PROC POWER
 − for FIXED effect models only
• "Exemplary data" is more general
• Especially (but not only) when there are "mixed model issues"
 − random effects
 − split-plot structure
 − potentially correlated errors: longitudinal or spatial data
 − any other non-standard model structure
• Methods use PROC MIXED or GLIMMIX
 − adapted from Stroup (2002, JABES)
 − Chapter 12, SAS for Mixed Models (Littell et al., 2006)

Page 22

“Exemplary Data” - Computing Power using SAS

1. Create a data set like the proposed design.
2. Run PROC GLIMMIX (or MIXED) with the variance fixed.
3. Non-centrality parameter: φ = rank(K) × (F computed by GLIMMIX) [or the chi-square with GLM].
4. Compute the critical F (Fcrit), the value such that P{F(rank(K), ν, 0) > Fcrit} = α [or chi-square].
5. Power = P{F[rank(K), ν, φ] > Fcrit}.
6. SAS functions can compute Fcrit and power.
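Steps 4 and 5 are exactly what SAS's FINV and PROBF functions deliver on the next slides. As a software-independent check, the same tail probability P{F(df1, df2, φ) > Fcrit} can be approximated by Monte Carlo draws from the non-central F (this sketch is mine, not part of the talk; the numbers are the CRD example's):

```python
# P{ F(df1, df2, phi) > fcrit }: simulate the non-central F as a
# non-central chi-square(df1, phi)/df1 over a central chi-square(df2)/df2.
import random
from math import sqrt

def noncentral_f_tail(df1, df2, phi, fcrit, sims=20000, seed=2):
    rng = random.Random(seed)
    delta = sqrt(phi)  # place all the non-centrality on one numerator normal
    hits = 0
    for _ in range(sims):
        num = (rng.gauss(0, 1) + delta) ** 2
        num += sum(rng.gauss(0, 1) ** 2 for _ in range(df1 - 1))
        den = sum(rng.gauss(0, 1) ** 2 for _ in range(df2))
        hits += (num / df1) / (den / df2) > fcrit
    return hits / sims

print(noncentral_f_tail(2, 12, 2.5333, 3.88529))  # near 0.224
```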

Page 23

Compute Power with GLIMMIX – CRD example

/* step 1 - create data set with same structure as proposed design;
   use MU (expected mean) instead of observed Y_ij values.
   This example shows power for 5, 10, and 15 e.u. per trt */
data crdpwrx1;
 input trt mu;
 do n=5 to 15 by 5;
  do eu=1 to n;
   output;
  end;
 end;
cards;
1 100
2 94
3 90
;

Page 24

Compute Power with GLIMMIX – CRD example

/* step 2 - use PROC GLIMMIX to compute non-centrality parameters
   for ANOVA tests & contrasts;
   ODS statements output them to new data sets */
proc sort data=crdpwrx1;
 by n;
proc glimmix data=crdpwrx1;
 by n;
 class trt;
 model mu=trt;
 parms (100)/hold=1;
 contrast 'et1 v et2' trt 0 1 -1;
 contrast 'c vs et' trt 2 -1 -1;
 ods output tests3=b;
 ods output contrasts=c;
run;

Page 25

/* step 3: combine ANOVA & contrast n-c parameter data sets;
   use SAS functions PROBF and FINV to compute power */
data power;
 set b c;
 alpha=0.05;
 ncparm=numdf*fvalue;
 fcrit=finv(1-alpha,numdf,dendf,0);
 power=1-probf(fcrit,numdf,dendf,ncparm);
proc print;

Obs  Effect  Label      DF  DenDF  alpha  ncparm   fcrit    power
 1   trt                 2   12    0.05   2.53333  3.88529  0.22361
 2           et1 v et2   1   12    0.05   0.40000  4.74723  0.08980
 3           c vs et     1   12    0.05   2.13333  4.74723  0.26978

Type III Tests of Fixed Effects

Effect  Num DF  Den DF  F Value  Pr > F
trt       2      12       1.27   0.3169

Contrasts

Label      Num DF  Den DF  F Value  Pr > F
et1 v et2    1      12       0.40   0.5390
c vs et      1      12       2.13   0.1698

Note the close agreement of the simulated power (0.228) and the "exemplary data" power (0.224).

Page 26

More Advanced Example

• Plots in an 8 × 3 grid
• Main variation along the 8 "rows"
• 3 × 2 treatment design
• Alternative designs
 − randomized complete block (4 blocks, size 6)
 − incomplete block (8 blocks, size 3)
 − split plot
• RCBD is "easy" but ignores natural variation

Page 27

Picture the 8 x 3 Grid

[figure: the 8 × 3 grid, with a gradient running along the 8 rows]

e.g., 8 schools, where the gradient is "SES", with 3 classrooms each

Page 28

SAS Programs to Compare the 8 x 3 Design: Split-Plot

data a;
 input bloc trtmnt @@;
 do s_plot=1 to 3;
  input dose @@;
  mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
  output;
 end;
cards;
1 1 1 2 3
1 2 1 2 3
2 1 1 2 3
2 2 1 2 3
3 1 1 2 3
3 2 1 2 3
4 1 1 2 3
4 2 1 2 3
;

proc glimmix data=a noprofile;
 class bloc trtmnt dose;
 model mu=bloc trtmnt|dose;
 random trtmnt/subject=bloc;
 parms (4) (6) / hold=1,2;
 lsmeans trtmnt*dose / diff;
 contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
 ods output diffs=b;
 ods output contrasts=c;
run;

Page 29

8 x 3 – Incomplete Block

data a;
 input bloc @@;
 do eu=1 to 3;
  input trtmnt dose @@;
  mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
  output;
 end;
cards;
1 1 1 1 2 1 3
2 1 1 1 2 2 2
3 1 1 1 3 2 3
4 1 1 2 1 2 2
5 1 2 1 3 2 2
6 1 2 2 1 2 3
7 1 3 2 1 2 3
8 2 1 2 2 2 3
;

proc glimmix data=a noprofile;
 class bloc trtmnt dose;
 model mu=trtmnt|dose;
 random intercept / subject=bloc;
 parms (4) (6) / hold=1,2;
 lsmeans trtmnt*dose / diff;
 contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
 ods output diffs=b;
 ods output contrasts=c;
run;

Page 30

8 x 3 Example - RCBD

data a;
 input trtmnt dose @@;
 do bloc=1 to 4;
  mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
  output;
 end;
cards;
1 1 1 2 1 3 2 1 2 2 2 3
;

proc glimmix data=a noprofile;
 class bloc trtmnt dose;
 model mu=bloc trtmnt|dose;
 parms (10) / hold=1;
 lsmeans trtmnt*dose / diff;
 contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
 ods output diffs=b;
 ods output contrasts=c;
run;

Page 31

How did designs compare?

• Suppose the main objective is to compare regressions over the 3 dose levels: do they differ by treatment? (similar to the basketball experiment)
• The operating definition is thus H0: dose regression coefficients are equal
• Power for the randomized complete block: 0.66
• Power for the incomplete block: 0.85
• Power for the split-plot: 0.85
• Same # of observations: you can work smarter

Page 32

But what if I don’t know Trt Effect Size or Variance?

• "How can I do a power analysis? If I knew the effect size and the variance, I wouldn't have to do the study."
• What the trt effect size is NOT: it is NOT the effect size you are going to observe
• It is somewhere between
 − what current knowledge suggests is a reasonable expectation
 − the minimum difference that would be considered "important" or "meaningful"

Page 33

And Variance??

• Know thy relevant background / do thy homework
• Literature search: what have others working with similar subjects reported as variance?
• Pilot study
• Educated guess
 − the range you'd expect to cover 95% of likely obs? divide it by 4
 − the most extreme values you can plausibly imagine? divide that range by 6
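Those two rules of thumb amount to dividing a plausible range by 4 or 6 to get a standard-deviation guess. A trivial helper, with hypothetical numbers:

```python
def sd_guess(low, high, rule="95pct"):
    """Rough SD from a plausible range: the range covering ~95% of
    observations spans about 4 SDs; the most extreme plausible range
    spans about 6 SDs."""
    span = high - low
    return span / 4 if rule == "95pct" else span / 6

print(sd_guess(80, 120))             # 10.0 (95% range of 40)
print(sd_guess(70, 130, "extreme"))  # 10.0 (extreme range of 60)
```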

Page 34

Hierarchical Linear Models

From the Bovaird seminar (10-27-2006):

• 2 treatments
• 20 classrooms / trt
• 25 students / classroom
• 4 years
• reasonable ideas of classroom(trt), student(classroom*trt), and within-student variances, as well as effect size
• Implement via exemplary data + GLIMMIX

Page 35

Categorical Data?

• Example: binary data
• "Standard" has a success probability of 0.25
• "New & Improved": hope to increase it to 0.30
• Have N subjects at each of L locations
• For the sake of argument, suppose we have
 − 900 subjects / location
 − 10 locations

Page 36

Power for GLMs

• 2 treatments
• P{favorable outcome}: for trt 1, p = 0.30; for trt 2, p = 0.25
• Power if n1 = 300, n2 = 600, using exemplary data:

data a;
 input trt y n;
datalines;
1 90 300
2 150 600
;

proc glimmix;
 class trt;
 model y/n=trt / chisq;
 ods output tests3=pwr;
run;

data power;
 set pwr;
 alpha=0.05;
 ncparm=numdf*chisq;
 crit=cinv(1-alpha,numdf,0);
 power=1-probchi(crit,numdf,ncparm);
proc print;
run;
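Before adding location-to-location variance, it is useful to know the fixed-effects answer. A stdlib-Python cross-check of this setup, simulating the two-proportion z-test directly (my stand-in for the GLIMMIX/PROBCHI calculation; it ignores the between-location variance taken up on the next slides):

```python
# Simulated power for H0: p1 = p2, with p1 = 0.30 (n1 = 300) and
# p2 = 0.25 (n2 = 600), two-sided z-test at alpha = 0.05.
import random
from math import sqrt

def two_prop_power(p1, n1, p2, n2, sims=4000, z_crit=1.959964, seed=3):
    rng = random.Random(seed)
    rejects = 0
    for _ in range(sims):
        y1 = sum(rng.random() < p1 for _ in range(n1))
        y2 = sum(rng.random() < p2 for _ in range(n2))
        pooled = (y1 + y2) / (n1 + n2)
        se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        if se > 0 and abs(y1 / n1 - y2 / n2) / se > z_crit:
            rejects += 1
    return rejects / sims

print(two_prop_power(0.30, 300, 0.25, 600))  # roughly 0.36
```

The fixed-effects power is noticeably higher than the 0.27 the GLMM analysis reports below, because the GLMM also has to see past the location variance components.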

Page 37

Power for GLMM

• Same trt and sample size per location as before
• 10 locations
• Var(Location) = 0.25; Var(Trt*Loc) = 0.125
• Variance components: variation in log(OddsRatio)
• Power?

data a;
 input trt y n;
 do loc=1 to 10;
  output;
 end;
datalines;
1 90 300
2 150 600
;

proc glimmix data=a initglm;
 class trt loc;
 model y/n = trt / oddsratio;
 random intercept trt / subject=loc;
 random _residual_;
 parms (0.25) (0.125) (1) / hold=1,2,3;
 ods output tests3=pwr;
run;

Page 38

GLMM Power Analysis Results

Obs  Effect  NumDF  DenDF  alpha  ncparm   fcrit    power
 1   trt       1      9    0.05   2.29868  5.11736  0.27370

Odds Ratio Estimates

trt  _trt  Estimate  DF  95% Confidence Limits
 1    2     1.286     9   0.884    1.871

• Gives you the expected confidence limits for the # of locations and N / location contemplated
• Gives you the power of the test of the TRT effect on prob(favorable)

Page 39

GLMM Power: Impact of Sample Size?

• N of subjects per trt per location?
• N of locations?
• Three cases:
 1. n = 300/600, 10 locations
 2. n = 600/1200, 10 locations
 3. n = 300/600, 20 locations

data a;
 input trt y n;
 do loc=1 to 10;
  output;
 end;
datalines;
1 90 300
2 150 600
;

data a;
 input trt y n;
 do loc=1 to 10;
  output;
 end;
datalines;
1 180 600
2 300 1200
;

data a;
 input trt y n;
 do loc=1 to 20;
  output;
 end;
datalines;
1 90 300
2 150 600
;

Page 40

GLMM Power: Impact of Sample Size?

Recall, for 10 locations and N = 300/600, the CI for the odds ratio was (0.884, 1.871) and power was 0.274.

For 10 locations, N = 600/1200:

trt  _trt  Estimate  DF  95% Confidence Limits
 1    2     1.286     9   0.891    1.855

Obs  Effect  NumDF  DenDF  alpha  ncparm   fcrit    power
 1   trt       1      9    0.05   2.40715  5.11736  0.28421

For 20 locations, N = 300/600:

trt  _trt  Estimate  DF  95% Confidence Limits
 1    2     1.286    19   1.006    1.643

Obs  Effect  NumDF  DenDF  alpha  ncparm   fcrit    power
 1   trt       1     19    0.05   4.59736  4.38075  0.53003

N alone has almost no impact.

Page 41

Recent developments

Continue the binary example. The power analysis shows:

 α-level:      0.10  0.05  0.05  0.01  0.05  0.01
 Power:        0.80  0.80  0.90  0.80  0.95  0.90
 # locations:   27    38    46    53    57    68

What do you do?

Page 42

More Information

• Consider studies directed toward improving a success rate similar to that proposed in this study
• A literature search yields 95 such studies
• 29 have reported statistically significant gains of p1 - p2 > 0.05 (or, alternatively, significant odds ratios of [(30/70)/(25/75)] = 1.28 or greater)
• If this holds, the "prior" probability of the desired effect size is approximately 29/95 ≈ 0.3

Page 43

An Intro Stat Result

Pr{desired effect size (D.E.S.) | reject H0}
 = Pr{reject | D.E.S.} Pr{D.E.S.} / [ Pr{reject | D.E.S.} Pr{D.E.S.} + Pr{reject | no D.E.S.} Pr{no D.E.S.} ]

For α = 0.10 and power = 0.8:

 = (0.8 × 0.3) / (0.8 × 0.3 + 0.1 × 0.7) = 0.77

So the real Pr{type I error} is more like 0.23 than 0.10!!!

Page 44

Returning to All Scenarios

 α-level:              0.10  0.05  0.05  0.01  0.05  0.01
 Power:                0.80  0.80  0.90  0.80  0.95  0.90
 # locations:           27    38    46    53    57    68
 Pr{DES | reject H0}:  0.77  0.87  0.89  0.97  0.89  0.97

NOTE the dramatic impact of the alpha-level when the "prior" Pr{DES} is relatively low.

POWER's role increases as Pr{DES} increases.

Page 45

Closing Comments

• In case it's not obvious
 − I'm not a fan of "painting by numbers"
 − the role of power analysis is misunderstood and underappreciated
• MOST of ALL, it is an opportunity to explore and rehearse the study design and planned analysis
• Engage a statistician as a participating member of the research team
• Give it the TIME it REQUIRES

Page 46

Thanks ... for coming