Cohort studies: Statistical analysis Jan Wohlfahrt Department of Epidemiology Research Statens Serum Institut


Cohort studies: Statistical analysis

Jan Wohlfahrt

Department of Epidemiology Research

Statens Serum Institut


Contents

1. A research question and a wrong answer

2. What kind of data is needed

3. How to analyse data

4. Confounder adjustment

5. Poisson regression

6. Cox regression

Danish Epidemiology Science Centre, Copenhagen, Denmark


1.1. A research question

Does MMR vaccination increase the risk of autistic disorder?


1.1. Material

• Cohort: all children born 1991 to 1998 (537,000 children), identified in the Danish Civil Registration System.

• Register-based information on MMR vaccination (441,000 vaccinated) from the Danish National Board of Health.

• Register-based information on autistic disorder (412 cases) from the Danish Psychiatric Central Register.

1.2. A wrong answer I

What is the proportion of children with autism among vaccinated and non-vaccinated children in the cohort before the end of 2000 (end of follow-up)?

              -autism   +autism
- MMR vacn.     96587    62 (0.064%)
+ MMR vacn.    440594   350 (0.079%)

Relative risk = 0.079/0.064 = 1.23.
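The crude comparison can be reproduced with a few lines of Python (used for illustration here; the slides themselves use SAS). This is only a sketch of the arithmetic on the slide, with function and variable names chosen for this example. Working with unrounded proportions gives a relative risk of about 1.24; the slide's 1.23 comes from dividing the rounded percentages.

```python
def proportion(cases, non_cases):
    """Proportion of children with autism in a group."""
    return cases / (cases + non_cases)

# Counts taken from the 2x2 table on the slide
no_mmr = proportion(cases=62, non_cases=96587)
mmr = proportion(cases=350, non_cases=440594)

# Crude relative risk: the "wrong answer" that ignores time under risk
relative_risk = mmr / no_mmr

print(f"-MMR: {100 * no_mmr:.3f}%  +MMR: {100 * mmr:.3f}%  RR = {relative_risk:.2f}")
```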


1.2. A wrong answer II

The simple comparison of proportions is not correct, because:

1) autism may be diagnosed before MMR vaccination, so a child can be counted as a vaccinated case although the diagnosis came first;

2) there is no age adjustment, and time under risk is not taken into account.

Conclusion: Compare person-time under risk, not the number of persons under risk.

2. What kind of data is needed


2.1. Information on time

• Time of study entry (date of first birthday)

• Time of status change (date of vaccination)

• Time of outcome (date of autism diagnosis)

• Time of study exit (date of autism diagnosis, death, emigration, disappearance, or end of study)

2.2 Datalines

Who   1yr birth date   Vacn.       Autism      Death/emig.
1     11sep1995        04apr1997   17oct1997   .
2     13dec1994        .           .           24jan2000
3     27jan1990        .           .           .
4     23jul1993        04nov1995   01jan1998   .
5     15nov2000        .           .           .
6     15jun1997        03apr1999   .           15apr2001
7     03may1992        .           06nov1995   .
…     …                …           …           …

More than 500000 datalines in total.

"." = not before end of 2000

2.3 Data as livelines


3. How to analyse data

3.1 Cox vs. Poisson regression

• Poisson regression: suited to large datasets with time-dependent variables

• Cox regression: suited to small datasets

3.2 Livelines


3.3 Contribution of pyrs

Who   Vaccinated?   Person-years   Autism?
1     No            1.56 yr        No
1     Yes           0.54 yr        Yes
2     No            5.11 yr        No
3     .             .              .
4     No            2.28 yr        No
4     Yes           2.16 yr        Yes
5     No            0.13 yr        No
6     No            1.80 yr        No
6     Yes           2.03 yr        No
7     No            3.51 yr        Yes
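As a rough sketch of how such contributions can be derived from the datalines, the Python function below splits each child's follow-up at the vaccination date. The function and variable names are made up for this illustration, follow-up is assumed to end on 31 December 2000, and a year is taken as 365.25 days. Under these assumptions the sketch reproduces the contributions listed for children 1, 2, 4, 5 and 7. For child 6 it gives 1.75 vaccinated years, because follow-up is cut at the end of 2000; the 2.03 yr in the table corresponds to counting up to the exit date 15apr2001.

```python
from datetime import date

END_OF_STUDY = date(2000, 12, 31)  # assumed end of follow-up

def years(start, stop):
    """Length of an interval in years (365.25 days per year), rounded to 2 decimals."""
    return round((stop - start).days / 365.25, 2)

def contributions(entry, vacc=None, autism=None, other=None):
    """Return [(vaccinated, person_years, autism_case), ...] for one child.

    entry  : date of first birthday (study entry)
    vacc   : date of MMR vaccination, or None
    autism : date of autism diagnosis, or None
    other  : date of death/emigration, or None
    """
    exit_ = min(d for d in (autism, other, END_OF_STUDY) if d is not None)
    case = autism is not None and autism == exit_
    if vacc is None or vacc >= exit_:
        # Never vaccinated during follow-up: all time is unvaccinated
        return [(False, years(entry, exit_), case)]
    # Unvaccinated time up to vaccination, vaccinated time afterwards
    return [(False, years(entry, vacc), False), (True, years(vacc, exit_), case)]

child_1 = contributions(date(1995, 9, 11), vacc=date(1997, 4, 4), autism=date(1997, 10, 17))
child_7 = contributions(date(1992, 5, 3), autism=date(1995, 11, 6))
print(child_1)
print(child_7)
```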


3.5 Data reduction

Vacn.     Cases   Person-years (pyrs)
- vacn.   1       1.56 + 2.28 + 1.80 + 3.51 + 5.11 + 0.13 = 14.39
+ vacn.   2       0.54 + 2.16 + 2.03 = 4.73
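The reduction step amounts to summing cases and person-years within each vaccination group. A minimal Python sketch, using the seven example children's contributions as listed in the slides (the tuple layout and names are chosen for this illustration):

```python
from collections import defaultdict

# (who, vaccinated, person_years, autism) as listed for the example children
rows = [
    (1, False, 1.56, False), (1, True, 0.54, True),
    (2, False, 5.11, False),
    (4, False, 2.28, False), (4, True, 2.16, True),
    (5, False, 0.13, False),
    (6, False, 1.80, False), (6, True, 2.03, False),
    (7, False, 3.51, True),
]

# Collapse the individual contributions into one cell per vaccination status
totals = defaultdict(lambda: {"cases": 0, "pyrs": 0.0})
for who, vaccinated, pyrs, autism in rows:
    totals[vaccinated]["cases"] += int(autism)
    totals[vaccinated]["pyrs"] += pyrs

for vaccinated in (False, True):
    label = "+ vacn." if vaccinated else "- vacn."
    cell = totals[vaccinated]
    print(f"{label}  cases={cell['cases']}  pyrs={cell['pyrs']:.2f}")
```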


3.6 Rate ratio calculation

Vacn.     Cases   Person-years   Rate (per 100000 pyrs)
- vacn.    68       577,000      11.8
+ vacn.   344     2,084,000      16.5

(Incidence) rate = number of new autism cases per person-year = cases/pyrs

Rate ratio = RR+vacn vs -vacn = rate+vacn / rate-vacn = 1.40
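The rate and rate-ratio arithmetic can be checked with a short Python sketch (names are illustrative only):

```python
def rate_per_100000(cases, pyrs):
    """Incidence rate expressed per 100000 person-years."""
    return 100_000 * cases / pyrs

rate_unvacc = rate_per_100000(cases=68, pyrs=577_000)
rate_vacc = rate_per_100000(cases=344, pyrs=2_084_000)

# Crude (unadjusted) rate ratio, vaccinated vs non-vaccinated
rate_ratio = rate_vacc / rate_unvacc

print(f"-vacn: {rate_unvacc:.1f}  +vacn: {rate_vacc:.1f}  RR = {rate_ratio:.2f}")
```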


4. Confounder Adjustment


4.1 The lexis-diagram


4.2 Person-years by age and period

Nr.   Vacn.   Age   Period   Pyrs   Autism
…     …       …     …        …      …
4     0       1     1993     0.44   No
4     0       1     1994     0.56   No
4     0       2     1994     0.44   No
4     0       2     1995     0.56   No
4     0       3     1995     0.28   No
4     1       3     1995     0.16   No
4     1       3     1996     0.56   No
4     1       4     1996     0.44   No
4     1       4     1997     0.56   No
4     1       5     1997     0.44   Yes
…     …       …     …        …      …
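The splitting of child 4's follow-up can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used for the study: the function name is hypothetical, the date of birth is taken to be one year before the "1yr birth date" on the dataline, and a year is counted as 365.25 days. Follow-up is cut at every birthday, every 1 January and the vaccination date, and each piece is labelled with vaccination status, age and calendar year.

```python
from datetime import date

def lexis_split(birth, entry, exit_, vacc=None, event=False):
    """Split follow-up from entry to exit_ at birthdays, 1 January and vaccination."""
    cuts = {entry, exit_}
    for year in range(entry.year, exit_.year + 1):
        cuts.add(date(year, 1, 1))                    # calendar-year boundary
        cuts.add(date(year, birth.month, birth.day))  # birthday (assumes birth is not 29 Feb)
    if vacc is not None:
        cuts.add(vacc)
    points = sorted(c for c in cuts if entry <= c <= exit_)

    rows = []
    for start, stop in zip(points, points[1:]):
        had_birthday = (start.month, start.day) >= (birth.month, birth.day)
        rows.append({
            "vacn": int(vacc is not None and start >= vacc),
            "age": start.year - birth.year - (0 if had_birthday else 1),
            "period": start.year,
            "pyrs": round((stop - start).days / 365.25, 2),
            "autism": event and stop == exit_,  # event belongs to the last piece
        })
    return rows

# Child 4: first birthday 23jul1993, vaccinated 04nov1995, autism 01jan1998
rows = lexis_split(birth=date(1992, 7, 23), entry=date(1993, 7, 23),
                   exit_=date(1998, 1, 1), vacc=date(1995, 11, 4), event=True)
for r in rows:
    print(r["vacn"], r["age"], r["period"], r["pyrs"], "Yes" if r["autism"] else "No")
```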


4.3 Person-years by age and period

(9 ages) x (9 periods) x (2 vacn. groups) = 162 groups

e.g.

Age   Period   Vacn.   Pyrs    Cases
4     1996     yes     54626   9

4.4 Relative rates by age and period

Age   Period   Vacn.   Cases   Pyrs (1)   Rate (2)   RR
1-4   92-95    -vacn    13     227         5.7       1
               +vacn    49     569         8.6       1.51
      96-00    -vacn    34     237        14.3       1
               +vacn   192     808        23.8       1.66
5-9   92-95    -vacn     0       5         -         -
               +vacn     4      27         -         -
      96-00    -vacn    21     108        19.4       1
               +vacn    99     679        14.5       0.75

(1) in thousands, (2) per 100000 yr
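The stratum-specific rate ratios can be recomputed from the table with a short Python sketch. The data layout is chosen for this illustration, and skipping the stratum without unvaccinated cases is an assumption that mirrors the "-" entries in the table. Because the person-years are rounded to thousands, the last digit may differ slightly from the printed values.

```python
# (cases, person-years) per stratum, from the table above (pyrs given in thousands)
strata = {
    ("1-4", "92-95"): {"unvacc": (13, 227_000), "vacc": (49, 569_000)},
    ("1-4", "96-00"): {"unvacc": (34, 237_000), "vacc": (192, 808_000)},
    ("5-9", "92-95"): {"unvacc": (0, 5_000), "vacc": (4, 27_000)},
    ("5-9", "96-00"): {"unvacc": (21, 108_000), "vacc": (99, 679_000)},
}

def rate(cases, pyrs):
    """Incidence rate per 100000 person-years."""
    return 100_000 * cases / pyrs

rate_ratios = {}
for stratum, groups in strata.items():
    if groups["unvacc"][0] == 0:
        continue  # no unvaccinated cases: rate ratio not computed
    rate_ratios[stratum] = rate(*groups["vacc"]) / rate(*groups["unvacc"])

for (age, period), rr in rate_ratios.items():
    print(f"age {age}, period {period}: RR = {rr:.2f}")
```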


5. Poisson Regression


5.1 Regression analysis of the rates

log(rate) = const + aI(vacn) + bI(5-9) + cI(96-00)

I(vacn) = 1 if vacn, 0 otherwise
I(5-9) = 1 if 5-9 years, 0 otherwise
I(96-00) = 1 in period 1996-2000, 0 otherwise

For non-vaccinated children aged 6 in 1997, log(rate) is modelled by: const + b + c.

5.2 Log-linear Poisson regression (I)

log(rate) = log((nr of cases)/pyrs) = log(nr of cases) - log(pyrs)

i.e. log(nr of cases) = log(pyrs) + log(rate)

With

log(rate) = const + aI(vacn) + bI(5-9) + cI(96-00)

this gives

log(nr of cases) = log(pyrs) + const + aI(vacn) + bI(5-9) + cI(96-00)

5.3 Log-linear Poisson regression (II)

log(nr of cases) = log(pyrs) + const + aI(vacn) + bI(5-9) + cI(96-00)

• The number of cases is Poisson-distributed.

• The log of the expected number of cases is modelled with a linear function.

• log(pyrs) is considered known for every cell and is called an offset.
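To make the role of the offset concrete, here is a minimal pure-Python sketch that fits this three-indicator model to the eight cells of the age-by-period table in 4.4. It uses a hand-written Newton-Raphson iteration as a stand-in for a statistical package, so it is an illustration under assumptions rather than the analysis reported in the slides. The estimate it produces comes from this coarse 2 x 2 x 2 grouping and will not equal the result of the finer model fitted later.

```python
import math

# (vacn, age 5-9, period 96-00, cases, pyrs) from the table in 4.4
cells = [
    (0, 0, 0, 13, 227_000), (1, 0, 0, 49, 569_000),
    (0, 0, 1, 34, 237_000), (1, 0, 1, 192, 808_000),
    (0, 1, 0, 0, 5_000),    (1, 1, 0, 4, 27_000),
    (0, 1, 1, 21, 108_000), (1, 1, 1, 99, 679_000),
]
X = [[1.0, c[0], c[1], c[2]] for c in cells]   # intercept + three indicators
y = [c[3] for c in cells]                      # observed cases
offset = [math.log(c[4]) for c in cells]       # log(pyrs), treated as known

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def expected(beta):
    """Expected cases per cell: exp(log(pyrs) + linear predictor)."""
    return [math.exp(o + sum(b * x for b, x in zip(beta, row)))
            for o, row in zip(offset, X)]

# Start at the crude log rate and iterate Newton-Raphson on the Poisson log-likelihood
beta = [math.log(sum(y) / sum(c[4] for c in cells)), 0.0, 0.0, 0.0]
for _ in range(25):
    mu = expected(beta)
    score = [sum(X[i][j] * (y[i] - mu[i]) for i in range(8)) for j in range(4)]
    info = [[sum(X[i][j] * X[i][k] * mu[i] for i in range(8)) for k in range(4)]
            for j in range(4)]
    beta = [b + s for b, s in zip(beta, solve(info, score))]

print("const, a, b, c =", [round(b, 4) for b in beta])
print("rate ratio +vacn vs -vacn =", round(math.exp(beta[1]), 2))
```

At the maximum-likelihood estimate the fitted number of cases matches the observed number in total and within each indicator group, which gives a simple check that the iteration has converged.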


5.4 Parameters and rate ratios

log(rate) = k + aI(vacn) + bI(5-9) + cI(96-00)

rate = exp(k + aI(vacn) + bI(5-9) + cI(96-00)) = exp(k)exp(aI(vacn))exp(bI(5-9))exp(cI(96-00))

For children 5-9 yr in the period 1996-2000:

RR+vacn vs -vacn = rate+vacn / rate-vacn

= (exp(k)exp(a)exp(b)exp(c)) / (exp(k)exp(b)exp(c))

= exp(a)
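The same cancellation holds in every age and period stratum, not only for children aged 5-9 in 1996-2000. A small numerical check in Python, where the coefficient values are made up purely for illustration:

```python
import math
from itertools import product

def rate(k, a, b, c, vacn, age_5_9, period_96_00):
    """Rate under the log-linear model with three 0/1 indicators."""
    return math.exp(k + a * vacn + b * age_5_9 + c * period_96_00)

# Hypothetical coefficients, not estimates from the study
k, a, b, c = -9.0, 0.3, -0.2, 0.9

# Rate ratio vaccinated vs non-vaccinated in each of the four age-period strata
ratios = [
    rate(k, a, b, c, 1, age, period) / rate(k, a, b, c, 0, age, period)
    for age, period in product((0, 1), repeat=2)
]
print(ratios, math.exp(a))
```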


5.5 A more complicated model

log(rate) = k + aI(vacn) +
b1I(1yr) + b2I(2yr) + b3I(3yr) +
b4I(4yr) + b5I(5yr) + b6I(6yr) +
b7I(7yr) + b8I(8yr) +
c1I(92-93) + c2I(94) + c3I(95) +
c4I(96) + c5I(97) + c6I(98) + c7I(99)

with non-vacn as the vacn-reference, age=9yr as the age-reference, and period=2000 as the period-reference.

5.6 SAS-dataset to Poisson regression

/* One line per cell: age, period, vaccination (0/1), cases and person-years */
data mmrdata;
  input age period vacn cases pyrs;
  logpyrs = log(pyrs);  /* offset for the Poisson model */
  datalines;
1 92 0 0 20301.68
1 92 1 0 12027.50
. . . . .
. . . . .
8 00 0 0 9553.12
8 00 1 2 54829.64
9 00 0 1 4844.91
9 00 1 0 26937.23
;
run;

5.7 SAS-procedure to Poisson regression

/* Poisson regression of cases on age, period and vaccination, with log(pyrs) as offset */
proc genmod data=mmrdata;
  class age period;
  model cases = age period vacn / dist=poisson link=log offset=logpyrs;
run;

5.8 SAS-output

Parameter       DF   Estimate   Std Err   ChiSquare   Pr>Chi
INTERCEPT        1   -10.2733    1.0063    104.2281   0.0001
AGE       1      1     0.0720    1.0488      0.0047   0.9453
AGE       2      1     1.6150    1.0137      2.5381   0.1111
AGE       3      1     2.4219    1.0088      5.7637   0.0164
AGE       4      1     2.3435    1.0093      5.3913   0.0202
AGE       5      1     2.0080    1.0118      3.9386   0.0472
AGE       6      1     1.6608    1.0168      2.6679   0.1024
AGE       7      1     1.0864    1.0338      1.1044   0.2933
AGE       8      1     0.4626    1.0966      0.1780   0.6731
AGE       9      0     0.0000    0.0000       .        .
PERIOD 1992      1    -1.4554    0.7289      3.9869   0.0459
PERIOD 1994      1    -0.6997    0.3148      4.9397   0.0262
PERIOD 1995      1    -0.9527    0.2619     13.2350   0.0003
PERIOD 1996      1    -0.6582    0.2033     10.4808   0.0012
PERIOD 1997      1    -0.3866    0.1728      5.0079   0.0252
PERIOD 1998      1     0.0366    0.1478      0.0614   0.8044
PERIOD 1999      1     0.1157    0.1423      0.6614   0.4161
PERIOD 2000      0     0.0000    0.0000       .        .
VACN      1      1    -0.1111    0.1348      0.6791   0.4099
VACN    999      0     0.0000    0.0000       .        .

5.9 Confidence-interval

RR+vacn vs -vacn = exp(-0.1111) = 0.89

Confidence-interval:

RRlower = exp(estimate - 1.96*StdErr)

RRupper = exp(estimate + 1.96*StdErr)

RR+vacn vs -vacn = 0.89 (0.69-1.17)
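The interval can be reproduced from the VACN row of the SAS output (estimate -0.1111, standard error 0.1348). A short Python sketch with illustrative variable names:

```python
import math

estimate, std_err = -0.1111, 0.1348  # VACN 1 row of the SAS output

rr = math.exp(estimate)
rr_lower = math.exp(estimate - 1.96 * std_err)  # lower 95% limit
rr_upper = math.exp(estimate + 1.96 * std_err)  # upper 95% limit

print(f"RR = {rr:.2f} ({rr_lower:.2f}-{rr_upper:.2f})")
```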


X.X Time since vaccination

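Example: 0.8 years of follow-up before vaccination and 5.9 years after.

Split by vaccination status:

- vacn      0.8 yr
+ vacn      5.9 yr

Split further by time since vaccination (V):

- vacn      0.8 yr
V: <1 yr    1 yr
V: 1-2 yr   2 yr
V: 3-4 yr   2 yr
V: 5+ yr    0.9 yr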
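The split by time since vaccination can be sketched in Python as below. The band boundaries (0, 1, 3 and 5 years after vaccination) are read off the labels in the example, and the function name is hypothetical.

```python
# (label, start, end) of each band in years since vaccination
BANDS = [
    ("<1 yr", 0, 1),
    ("1-2 yr", 1, 3),
    ("3-4 yr", 3, 5),
    ("5+ yr", 5, float("inf")),
]

def split_by_time_since_vaccination(years_after):
    """Distribute vaccinated follow-up time over the time-since-vaccination bands."""
    return {
        label: round(max(0.0, min(years_after, end) - start), 2)
        for label, start, end in BANDS
    }

print(split_by_time_since_vaccination(5.9))
```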


6. Cox Regression


6.1 Cox regression

6.2 Cox regression

log(rate) = k + aI(vacn) +
b1I(1yr) + b2I(2yr) + b3I(3yr) +
b4I(4yr) + b5I(5yr) + b6I(6yr) +
b7I(7yr) + b8I(8yr) +
c1I(92-93) + c2I(94) + c3I(95) +
c4I(96) + c5I(97) + c6I(98) + c7I(99) +

(age)


6.3 Live-lines


6.4 Data to Cox-regression

/* One line per child: entry, vaccination, autism and death/emigration dates */
data coxdata;
  input @1 intime date7. @9 vactime date7. @17 auttime date7. @25 othtime date7.;
  datalines;
11sep95 04apr97 17oct97 .
13dec94 .       .       24jan00
27jan90 .       .       .
23jul93 04nov95 01jan98 .
15nov00 .       .       .
15jun97 03apr99 .       15apr01
03may92 .       06nov95 .
.....
;
run;

6.5 Cox SAS-program

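/* Derive exit time, follow-up time (days since entry) and event status */
data coxdata2;
  set coxdata;
  outtime = min(auttime, othtime, "31dec2000"d);  /* first of autism, death/emigration, end of study */
  time = outtime - intime;
  if auttime = outtime then status = 1;
  else status = 0;
run;

/* Cox regression with vaccination as a time-dependent covariate */
proc phreg data=coxdata2;
  model time*status(0) = vacn;
  if (vactime = . or time < (vactime - intime)) then vacn = 0;
  else vacn = 1;
run;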