Cohort studies: Statistical analysis
Jan Wohlfahrt
Department of Epidemiology Research
Statens Serum Institut
Contents
1. A research question and a wrong answer
2. What kind of data is needed
3. How to analyse data
4. Confounder adjustment
5. Poisson regression
6. Cox regression
Danish Epidemiology Science Centre, Copenhagen, Denmark
1.1. A research question
Does MMR vaccination increase the risk of autistic disorder?
1.1. Material
• All children born 1991 to 1998 (537,000 children).
• Register-based information on MMR vaccination (441,000 children) and autistic disorder (412 cases).
Cohort: Danish Civil Registration System
Information on autism: Danish Psychiatric Central Register
Information on MMR: Danish National Board of Health
1.2. A wrong answer I
What is the proportion of children with autism among vaccinated and unvaccinated children in the cohort before the end of 2000 (end of follow-up)?

             -autism   +autism
- MMR vacn.    96587    62 (0.064%)
+ MMR vacn.   440594   350 (0.079%)

Relative risk = 0.079/0.064 = 1.23.
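The naive calculation can be retraced in a few lines. This is a minimal Python sketch (the slides themselves use SAS); the dictionary and function names are illustrative and not taken from the slides. With unrounded proportions the ratio comes out marginally above the 1.23 obtained from the rounded percentages.

```python
# Counts from the 2x2 table in 1.2 (children without / with autism).
no_autism = {"-MMR": 96587, "+MMR": 440594}
autism = {"-MMR": 62, "+MMR": 350}

def proportion_with_autism(group):
    """Share of children in the group who are registered with autism."""
    return autism[group] / (autism[group] + no_autism[group])

p_unvacc = proportion_with_autism("-MMR")
p_vacc = proportion_with_autism("+MMR")
relative_risk = p_vacc / p_unvacc

print(f"-MMR: {100 * p_unvacc:.3f}%  +MMR: {100 * p_vacc:.3f}%")
print(f"naive relative risk: {relative_risk:.2f}")
```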
1.2. A wrong answer II
The simple comparison of proportions is not correct, because:
1) autism may be diagnosed before MMR vaccination,
2) there is no age adjustment, and time under risk is not taken into account.
Conclusion: Compare person-time under risk, not the number of persons under risk.
2. What kind of data is needed
2.1. Information on time
• Time of study entry (first birthday)
• Time of status change (date of vaccination)
• Time of outcome (date of autism diagnosis)
• Time of study exit (date of autism, death, emigration, disappearance, or end of study)
2.2 Datalines
Who  1yr birth date  Vacn.      Autism     Death/emig.
1    11sep1995       04apr1997  17oct1997  .
2    13dec1994       .          .          24jan2000
3    27jan1990       .          .          .
4    23jul1993       04nov1995  01jan1998  .
5    15nov2000       .          .          .
6    15jun1997       03apr1999  .          15apr2001
7    03may1992       .          06nov1995  .
…..  ……              …..        …..        …..
(more than 500000 datalines)
. = not before end of 2000
2.3 Data as lifelines
3. How to analyse data
3.1 Cox vs. Poisson regression
• Poisson regression in large datasets with time-dependent variables
• Cox regression in small datasets
3.2 Lifelines
3.3 Contribution of pyrs
Who  Vaccinated?  Person-years  Autism?
1    No           1.56 yr       No
1    Yes          0.54 yr       Yes
2    No           5.11 yr       No
3    .            .             .
4    No           2.28 yr       No
4    Yes          2.16 yr       Yes
5    No           0.13 yr       No
6    No           1.80 yr       No
6    Yes          2.03 yr       No
7    No           3.51 yr       Yes
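The person-year contributions can be derived from the dates in 2.2 by cutting each child's follow-up at the vaccination date. The following is a minimal Python sketch rather than the SAS used later in the slides. The function names, the end-of-study constant (31 December 2000) and the convention of 365.25 days per year are assumptions made for illustration. Children 1, 4 and 7 serve as checks.

```python
from datetime import date

END_OF_STUDY = date(2000, 12, 31)  # assumed end of follow-up

def years(start, stop):
    """Length of an interval in years (365.25 days per year, an assumption)."""
    return (stop - start).days / 365.25

def person_years(entry, vacc=None, autism=None, other_exit=None):
    """Return a list of (vaccinated, pyrs, autism_case) pieces for one child."""
    exit_ = min(d for d in (autism, other_exit, END_OF_STUDY) if d is not None)
    if exit_ <= entry:
        return []  # no time under risk within the study
    is_case = autism is not None and autism == exit_
    if vacc is None or vacc >= exit_:
        return [(False, years(entry, exit_), is_case)]
    if vacc <= entry:
        return [(True, years(entry, exit_), is_case)]
    # Follow-up is cut at vaccination; the case belongs to the last piece.
    return [(False, years(entry, vacc), False),
            (True, years(vacc, exit_), is_case)]

child1 = person_years(date(1995, 9, 11), date(1997, 4, 4), date(1997, 10, 17))
child4 = person_years(date(1993, 7, 23), date(1995, 11, 4), date(1998, 1, 1))
child7 = person_years(date(1992, 5, 3), autism=date(1995, 11, 6))

for name, pieces in (("1", child1), ("4", child4), ("7", child7)):
    for vaccinated, pyrs, case in pieces:
        print(name, "Yes" if vaccinated else "No", f"{pyrs:.2f} yr",
              "Yes" if case else "No")
```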
3.5 Data reduction
Vaccinated?  Cases  Person-years (pyrs)
- vacn.      1      1.56 + 2.28 + 1.80 + 3.51 + 5.11 + 0.13 = 14.39
+ vacn.      2      0.54 + 2.16 + 2.03 = 4.73
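The reduction step is a plain aggregation of cases and person-years by vaccination status. A short Python sketch, using the rows of the contribution table as input (the variable names are illustrative):

```python
from collections import defaultdict

# (child, vaccinated, person-years, autism case) from the contribution table.
rows = [
    (1, False, 1.56, False), (1, True, 0.54, True),
    (2, False, 5.11, False),
    (4, False, 2.28, False), (4, True, 2.16, True),
    (5, False, 0.13, False),
    (6, False, 1.80, False), (6, True, 2.03, False),
    (7, False, 3.51, True),
]

totals = defaultdict(lambda: {"cases": 0, "pyrs": 0.0})
for _child, vaccinated, pyrs, case in rows:
    totals[vaccinated]["cases"] += int(case)
    totals[vaccinated]["pyrs"] += pyrs

for vaccinated in (False, True):
    label = "+ vacn." if vaccinated else "- vacn."
    print(label, totals[vaccinated]["cases"], f"{totals[vaccinated]['pyrs']:.2f}")
```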
3.6 Rate ratio calculation
Vacn.    Cases  Person-years  Rate (per 100000 pyrs)
- vacn.   68      577,000     11.8
+ vacn.  344    2,084,000     16.5
(Incidence) rate = number of new autism cases per person-year
= cases/pyrs
Rate ratio = RR+vacn vs -vacn = rate+vacn/rate-vacn = 16.5/11.8 = 1.40
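The crude rates and the rate ratio follow directly from the two rows of the table. A minimal Python sketch (names are illustrative):

```python
# Cases and person-years from the table in 3.6.
cases = {"-vacn": 68, "+vacn": 344}
pyrs = {"-vacn": 577_000, "+vacn": 2_084_000}

def rate_per_100000(group):
    """Incidence rate: cases per 100000 person-years."""
    return 100_000 * cases[group] / pyrs[group]

rate_unvacc = rate_per_100000("-vacn")
rate_vacc = rate_per_100000("+vacn")
rate_ratio = rate_vacc / rate_unvacc

print(f"-vacn: {rate_unvacc:.1f}  +vacn: {rate_vacc:.1f}  RR: {rate_ratio:.2f}")
```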
4. Confounder Adjustment
4.1 The Lexis diagram
4.2 Person-years by age and period
Nr.   Vacn.  Age   Period  Pyrs  Autism
…..   …..    …..   …..     …..   …..
4     0      1     1993    0.44  No
4     0      1     1994    0.56  No
4     0      2     1994    0.44  No
4     0      2     1995    0.56  No
4     0      3     1995    0.28  No
4     1      3     1995    0.16  No
4     1      3     1996    0.56  No
4     1      4     1996    0.44  No
4     1      4     1997    0.56  No
4     1      5     1997    0.44  Yes
…..   …..    …..   …..     …..   …..
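The split of one child's follow-up into age by period cells can be sketched as cutting the follow-up at every birthday, every 1 January and the vaccination date. This is an illustrative Python sketch, not the procedure used to produce the slides. The function names, the 365.25-day year and the handling of 29 February birthdays are assumptions. The birth date of child 4 is taken to be one year before the first-birthday entry date in 2.2.

```python
from datetime import date

def birthday(birth, year):
    """Birthday in a given year (1 March stands in for 29 February)."""
    try:
        return birth.replace(year=year)
    except ValueError:
        return date(year, 3, 1)

def split_by_age_and_period(birth, entry, exit_, vacc=None):
    """Return rows (vacn, age, period, pyrs) for one child's follow-up."""
    cuts = {entry, exit_}
    for year in range(entry.year, exit_.year + 1):
        cuts.add(date(year, 1, 1))        # change of calendar period
        cuts.add(birthday(birth, year))   # change of age
    if vacc is not None:
        cuts.add(vacc)                    # change of vaccination status
    points = sorted(c for c in cuts if entry <= c <= exit_)
    rows = []
    for start, stop in zip(points, points[1:]):
        age = start.year - birth.year - (start < birthday(birth, start.year))
        vaccinated = int(vacc is not None and start >= vacc)
        pyrs = round((stop - start).days / 365.25, 2)
        rows.append((vaccinated, age, start.year, pyrs))
    return rows

child4 = split_by_age_and_period(
    birth=date(1992, 7, 23), entry=date(1993, 7, 23),
    exit_=date(1998, 1, 1), vacc=date(1995, 11, 4))

for row in child4:
    print(*row)
```

The autism diagnosis is the exit date here, so the case would be attached to the last row.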
4.3 Person-years by age and period
(9 ages) x (9 periods) x (2 vacn. groups) = 162 groups
e.g.
Age  Period  Vacn.  Pyrs   Cases
4    1996    yes    54626  9
4.4 Relative rates by age and period
Age  Period  Vacn.  Cases  Pyrs (1)  Rate (2)  RR
1-4  92-95   -vacn    13     227       5.7     1
             +vacn    49     569       8.6     1.51
     96-00   -vacn    34     237      14.3     1
             +vacn   192     808      23.8     1.66
5-9  92-95   -vacn     0       5       -       -
             +vacn     4      27       -       -
     96-00   -vacn    21     108      19.4     1
             +vacn    99     679      14.5     0.75
(1) in thousands, (2) per 100000 yr
5. Poisson Regression
5.1 Regression analysis of the rates
log(rate) = const + aI(vacn) + bI(5-9) + cI(96-00)
I(vacn) = 1 if vacn, 0 otherwise
I(5-9) = 1 if 5-9 years, 0 otherwise
I(96-00) = 1 in period 1996-2000, 0 otherwise
For non-vacn. children in 1997 aged 6 log(rate) is modelled by: const+b+c.
5.2 Log-linear Poisson regression (I)
log(rate) = log((nr of cases)/pyrs) = log(nr of cases) - log(pyrs)
i.e. log(nr of cases) = log(pyrs) + log(rate)
log(rate) = const + aI(vacn) + bI(5-9) + cI(96-00)
log(nr of cases) =
log(pyrs) + const + aI(vacn) + bI(5-9) + cI(96-00)
5.3 Log-linear Poisson regression (II)
log(nr of cases) =
log(pyrs) + const + aI(vacn) + bI(5-9) + cI(96-00)
• The number of cases in each cell is Poisson-distributed.
• The log of the expected number of cases is modelled with a linear function.
• log(pyrs) is considered known for every cell and is called an offset.
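To make the offset model concrete, the three-parameter model from 5.1 can be fitted to the eight cells of the table in 4.4. The following is a hedged Python sketch using a hand-written Newton-Raphson iteration with step halving. It is an illustration of maximum likelihood for a log-linear Poisson model, not the algorithm or the output of SAS, and all function names are assumptions. Person-years are the table's thousands converted to years.

```python
import math

# (vacn, age 5-9, period 96-00, cases, pyrs) for the eight cells in 4.4.
cells = [
    (0, 0, 0, 13, 227_000), (1, 0, 0, 49, 569_000),
    (0, 0, 1, 34, 237_000), (1, 0, 1, 192, 808_000),
    (0, 1, 0, 0, 5_000),    (1, 1, 0, 4, 27_000),
    (0, 1, 1, 21, 108_000), (1, 1, 1, 99, 679_000),
]
X = [(1.0, c[0], c[1], c[2]) for c in cells]   # const, I(vacn), I(5-9), I(96-00)
cases = [c[3] for c in cells]
pyrs = [c[4] for c in cells]

def fitted(beta):
    """Expected cases: pyrs * exp(linear predictor), i.e. log(pyrs) as offset."""
    return [t * math.exp(sum(b * x for b, x in zip(beta, row)))
            for t, row in zip(pyrs, X)]

def loglik(beta):
    """Poisson log-likelihood up to a constant."""
    return sum(y * math.log(m) - m for y, m in zip(cases, fitted(beta)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_poisson(max_iter=50):
    k = len(X[0])
    beta = [math.log(sum(cases) / sum(pyrs))] + [0.0] * (k - 1)  # crude rate start
    for _ in range(max_iter):
        mu = fitted(beta)
        score = [sum(row[j] * (y - m) for row, y, m in zip(X, cases, mu))
                 for j in range(k)]
        info = [[sum(row[i] * row[j] * m for row, m in zip(X, mu))
                 for j in range(k)] for i in range(k)]
        step = solve(info, score)
        if max(abs(s) for s in step) < 1e-12:
            break
        scale = 1.0  # halve the step until the likelihood does not decrease
        while (loglik([b + scale * s for b, s in zip(beta, step)]) < loglik(beta)
               and scale > 1e-8):
            scale /= 2
        beta = [b + scale * s for b, s in zip(beta, step)]
    return beta

beta = fit_poisson()
rate_ratio = math.exp(beta[1])   # exp(a): vaccinated vs unvaccinated
print("coefficients:", [round(b, 4) for b in beta])
print("adjusted rate ratio:", round(rate_ratio, 2))
```

At the maximum the fitted number of cases equals the observed number in total and within each indicator group, which gives a simple check that the iteration has converged.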
5.4 Parameters and rate ratios
log(rate) = k + aI(vacn) + bI(5-9) + cI(96-00)
rate = exp(k + aI(vacn) + bI(5-9) + cI(96-00))
     = exp(k) exp(aI(vacn)) exp(bI(5-9)) exp(cI(96-00))
For children 5-9 yr in the period 1996-2000:
RR+vacn vs -vacn = rate+vacn/rate-vacn
= (exp(k)exp(a)exp(b)exp(c)) / (exp(k)exp(b)exp(c))
= exp(a)
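The cancellation holds in every age and period stratum, not only for children aged 5-9 in 1996-2000. A small Python check, with coefficient values that are purely hypothetical and chosen only for illustration:

```python
import math

# Hypothetical coefficients (not estimates from the slides).
k, a, b, c = -9.0, 0.3, 0.5, 0.8

def rate(vacn, age_5_9, period_96_00):
    """Rate under the model log(rate) = k + a*I(vacn) + b*I(5-9) + c*I(96-00)."""
    return math.exp(k + a * vacn + b * age_5_9 + c * period_96_00)

ratios = [rate(1, age, period) / rate(0, age, period)
          for age in (0, 1) for period in (0, 1)]

print(ratios, math.exp(a))
```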
5.5 A more complicated model
log(rate) = k + aI(vacn) +
b1I(1yr) + b2I(2yr) + b3I(3yr) +
b4I(4yr) + b5I(5yr) + b6I(6yr) +
b7I(7yr) + b8I(8yr) +
c1I(92-93) + c2I(94) + c3I(95) +
c4I(96) + c5I(97) + c6I(98) + c7I(99)
with non-vacn as the vacn-reference, age=9yr as the age-reference, and period=2000 as the period-reference.
5.6 SAS-dataset to Poisson regression
data mmrdata;
  input age period vacn cases pyrs;
  logpyrs=log(pyrs);
datalines;
1 92 0 0 20301.68
1 92 1 0 12027.50
. . . . .
. . . . .
8 00 0 0 9553.12
8 00 1 2 54829.64
9 00 0 1 4844.91
9 00 1 0 26937.23
;
run;
5.7 SAS-procedure to Poisson regression
proc genmod data=mmrdata;
  class age period;
  model cases=age period vacn / dist=poisson link=log offset=logpyrs;
run;
5.8 SAS-output
Parameter      DF  Estimate  Std Err  ChiSquare  Pr>Chi
INTERCEPT       1  -10.2733   1.0063   104.2281  0.0001
AGE     1       1    0.0720   1.0488     0.0047  0.9453
AGE     2       1    1.6150   1.0137     2.5381  0.1111
AGE     3       1    2.4219   1.0088     5.7637  0.0164
AGE     4       1    2.3435   1.0093     5.3913  0.0202
AGE     5       1    2.0080   1.0118     3.9386  0.0472
AGE     6       1    1.6608   1.0168     2.6679  0.1024
AGE     7       1    1.0864   1.0338     1.1044  0.2933
AGE     8       1    0.4626   1.0966     0.1780  0.6731
AGE     9       0    0.0000   0.0000     .       .
PERIOD  1992    1   -1.4554   0.7289     3.9869  0.0459
PERIOD  1994    1   -0.6997   0.3148     4.9397  0.0262
PERIOD  1995    1   -0.9527   0.2619    13.2350  0.0003
PERIOD  1996    1   -0.6582   0.2033    10.4808  0.0012
PERIOD  1997    1   -0.3866   0.1728     5.0079  0.0252
PERIOD  1998    1    0.0366   0.1478     0.0614  0.8044
PERIOD  1999    1    0.1157   0.1423     0.6614  0.4161
PERIOD  2000    0    0.0000   0.0000     .       .
VACN    1       1   -0.1111   0.1348     0.6791  0.4099
VACN    999     0    0.0000   0.0000     .       .
5.9 Confidence-interval
RR+vacn vs –vacn = exp(-0.1111) = 0.89
Confidence-interval:
RRlower = exp(estimate - 1.96 x StdErr)
RRupper = exp(estimate + 1.96 x StdErr)
RR+vacn vs -vacn= 0.89 (0.69-1.17)
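The interval can be reproduced from the VACN row of the SAS output. A short Python sketch (variable names are illustrative):

```python
import math

estimate = -0.1111   # VACN 1 estimate from the SAS output
std_err = 0.1348     # its standard error

rr = math.exp(estimate)
rr_lower = math.exp(estimate - 1.96 * std_err)
rr_upper = math.exp(estimate + 1.96 * std_err)

print(f"RR = {rr:.2f} ({rr_lower:.2f}-{rr_upper:.2f})")
```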
X.X Time since vaccination
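Example: a child followed for 0.8 years as unvaccinated and 5.9 years as vaccinated.
Split by vaccination status:
- vacn     0.8 yr
+ vacn     5.9 yr
Split by time since vaccination (V):
- vacn     0.8 yr
V:<1 yr    1 yr
V:1-2 yr   2 yr
V:3-4 yr   2 yr
V:5+ yr    0.9 yr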
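Splitting the vaccinated follow-up into bands of time since vaccination works like the split by age and period. A minimal Python sketch; the band limits are read off the labels above (1-2 yr meaning from 1 up to 3 years, and so on), and the names are illustrative:

```python
# (label, lower limit, upper limit) in years since vaccination.
BANDS = [("<1 yr", 0, 1), ("1-2 yr", 1, 3), ("3-4 yr", 3, 5),
         ("5+ yr", 5, float("inf"))]

def split_by_time_since_vaccination(vaccinated_years):
    """Distribute vaccinated follow-up time over the bands."""
    return {label: max(0.0, min(vaccinated_years, hi) - lo)
            for label, lo, hi in BANDS}

contribution = split_by_time_since_vaccination(5.9)
for label, pyrs in contribution.items():
    print(f"V:{label:7s} {pyrs:.1f} yr")
```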
6. Cox Regression
6.1 Cox regression
log(rate) = k + aI(vacn) +
b1I(1yr) + b2I(2yr) + b3I(3yr) +
b4I(4yr) + b5I(5yr) + b6I(6yr) +
b7I(7yr) + b8I(8yr) +
c1I(92-93) + c2I(94) + c3I(95) +
c4I(96) + c5I(97) + c6I(98) + c7I(99)
6.2 Cox regression
(age)
6.3 Lifelines
6.4 Data to Cox-regression
data coxdata;
  input @1 intime date7. @9 vactime date7. @17 auttime date7. @25 othtime date7.;
datalines;
11sep95 04apr97 17oct97 .
13dec94 .       .       24jan00
27jan90 .       .       .
23jul93 04nov95 01jan98 .
15nov00 .       .       .
15jun97 03apr99 .       15apr01
03may92 .       06nov95 .
.....
;
run;
6.5 Cox SAS-program
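data coxdata2;
  set coxdata;
  /* exit at autism, death/emigration or end of study, whichever comes first */
  outtime=min(auttime,othtime,"31dec2000"d);
  time=(outtime-intime);
  if auttime=outtime then status=1; else status=0;
run;
proc phreg data=coxdata2;
  model time*status(0)=vacn;
  /* time-dependent covariate: unvaccinated until the vaccination date */
  if (vactime=. or time<(vactime-intime)) then vacn=0;
  else vacn=1;
run;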
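The data step and the time-dependent covariate can be retraced outside SAS. This is an illustrative Python sketch of the same logic for the seven example children, with time measured in days since study entry. The function names are assumptions, and the sketch does not fit the Cox model itself.

```python
from datetime import date

END_OF_STUDY = date(2000, 12, 31)

# (intime, vactime, auttime, othtime); None plays the role of a SAS missing value.
children = [
    (date(1995, 9, 11), date(1997, 4, 4), date(1997, 10, 17), None),
    (date(1994, 12, 13), None, None, date(2000, 1, 24)),
    (date(1990, 1, 27), None, None, None),
    (date(1993, 7, 23), date(1995, 11, 4), date(1998, 1, 1), None),
    (date(2000, 11, 15), None, None, None),
    (date(1997, 6, 15), date(1999, 4, 3), None, date(2001, 4, 15)),
    (date(1992, 5, 3), None, date(1995, 11, 6), None),
]

def follow_up(intime, vactime, auttime, othtime):
    """Return (time in days, status) as in the coxdata2 data step."""
    outtime = min(d for d in (auttime, othtime, END_OF_STUDY) if d is not None)
    time = (outtime - intime).days
    status = 1 if auttime == outtime else 0
    return time, status

def vacn_at(t, intime, vactime):
    """Vaccination status t days after entry (the time-dependent covariate)."""
    if vactime is None or t < (vactime - intime).days:
        return 0
    return 1

results = [follow_up(*child) for child in children]
for number, (time, status) in enumerate(results, start=1):
    print(number, time, status)
```

For child 1 the covariate switches from 0 to 1 at day 571, the number of days between entry and vaccination.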