The University of Sydney
School of Public Health
Diagnostic test accuracy reviews.
Advanced Meta-analysis: dealing with
heterogeneity and test comparisons.
Petra Macaskill
Screening and Test Evaluation Program
School of Public Health
University of Sydney
Co-convenor, Cochrane Screening and Diagnostic Tests
Methods Group
Outline
• Background
• Descriptive Analyses (available in RevMan)
– Graphical displays
– Summary ROC
– Exploring heterogeneity
• Hierarchical Models (not available in RevMan)
– Rationale for using hierarchical models
– Choice of model:
• Bivariate
• HSROC (Rutter and Gatsonis model)
– Investigating heterogeneity
– Index test comparisons
Requires statistical expertise
Major steps covered in:
Cochrane Handbook for Systematic Reviews of
Diagnostic Test Accuracy
Objective of the review (e.g. performance of a single test,
exploring heterogeneity in test performance, test comparisons)
Locating and selecting studies
Assessing study quality – QUADAS2 updates in preparation
Extracting data – to be updated
Meta-analysis
Interpretation of the results – in preparation
Chapter 10: Analysing and Presenting Results Petra Macaskill, Constantine Gatsonis, Jonathan Deeks, Roger Harbord, Yemisi
Takwoingi.
Systematic Review of
Diagnostic Test Performance
http://srdta.cochrane.org/handbook-dta-reviews
Single index test:
Remains a common form of systematic review
Heterogeneity in test performance between studies is likely to be present, and reasons for it should be explored.
Test comparisons:
Increasing in importance and relevance
Methods for investigating heterogeneity can be applied
Ideally, test comparisons should focus on studies that directly compare the tests of interest
Systematic Review of
Diagnostic Test Performance
Reference test (binary)
“true” disease status, i.e. target condition
Index test (continuous, ordinal or binary)
Test threshold
Sensitivity and specificity
Likelihood ratios
ROC curve
Underlying Concepts
Test threshold: Individual Study Level
A plot of sensitivity against 1-specificity across the range of thresholds
results in a receiver operating characteristic (ROC) curve.
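[Figure, a single study: distributions of the test measurement in the diseased and non-diseased groups, with the threshold marked. Moving the threshold in one direction increases both TP and FP; moving it in the other direction decreases both.]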
ROC curves: Individual Study Level
[Figure, four panels: each panel shows the distributions of the test measurement (0 to 120) in the diseased and non-diseased groups, alongside the corresponding ROC curve (sensitivity, 0.0 to 1.0, plotted against specificity, 1.0 to 0.0).]
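The way an ROC curve arises from moving the threshold can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the measurements are hypothetical values made up for the example, the function name `roc_points` is an assumption, and a result is treated as positive when the measurement is at or above the threshold (that is, higher values are assumed to indicate disease).

```python
def roc_points(diseased, non_diseased):
    """Return (1-specificity, sensitivity) pairs, one per candidate threshold.

    Assumption: a test result is positive when measurement >= threshold.
    """
    thresholds = sorted(set(diseased) | set(non_diseased))
    points = []
    for t in thresholds:
        tp = sum(x >= t for x in diseased)      # diseased, test positive
        fp = sum(x >= t for x in non_diseased)  # non-diseased, test positive
        points.append((fp / len(non_diseased), tp / len(diseased)))
    return points


# Hypothetical measurements, for illustration only.
diseased = [55, 62, 70, 78, 85, 93]
non_diseased = [20, 31, 40, 48, 58, 66]

points = roc_points(diseased, non_diseased)
for fpr, tpr in points:
    print(f"1-specificity={fpr:.2f}  sensitivity={tpr:.2f}")
```

As the threshold rises, both coordinates can only fall or stay the same, which is the trade-off the figure illustrates.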
Most studies report test sensitivity and specificity at a threshold(s),
or provide sufficient information to construct the following 2 x 2
table at the threshold(s):
                 "true" disease status
                      +       -
test result    +     TP      FP
               -     FN      TN

From this table we can compute

True positive rate (tpr):
$$\text{sensitivity} = \frac{TP}{TP + FN} = \frac{TP}{n_D}$$
False positive rate (fpr):
$$1 - \text{specificity} = \frac{FP}{FP + TN} = \frac{FP}{n_{\bar{D}}}$$
Data extraction
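As a small worked example of the 2 x 2 calculation, the sketch below uses the counts listed for Aho 1999 in the rheumatoid factor example later in this document. The function name `accuracy_from_2x2` is an assumption introduced for illustration.

```python
def accuracy_from_2x2(tp, fp, fn, tn):
    """Compute sensitivity (tpr) and specificity from 2 x 2 counts."""
    n_diseased = tp + fn        # n_D
    n_non_diseased = fp + tn    # n_D-bar
    sensitivity = tp / n_diseased
    specificity = tn / n_non_diseased
    return sensitivity, specificity


# Counts for Aho 1999 (rheumatoid factor example).
sens, spec = accuracy_from_2x2(tp=64, fp=16, fn=27, tn=153)
print(f"sensitivity (tpr) = {sens:.2f}")
print(f"specificity       = {spec:.2f}")
print(f"1 - specificity (fpr) = {1 - spec:.2f}")
```

The two values agree, to two decimal places, with the sensitivity and specificity reported for that study in the forest plot data.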
Reasons for variability in test accuracy
between studies
• Random sampling error
For each study, the estimated sensitivity and specificity are subject to
sampling error. The larger the sample size, the smaller the
sampling error, as shown by the confidence intervals in a forest
plot.
Because the sensitivity and specificity are both proportions, the
within study sampling error is straightforward to estimate using
the binomial distribution.
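To illustrate how the within-study sampling error shrinks with sample size, the sketch below computes a binomial confidence interval for a proportion. It uses the Wilson score interval, which is one standard choice and is an assumption here: the slides do not say which interval method produced the forest plot intervals. The counts are hypothetical.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (approximate 95% by default)."""
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical studies with the same observed sensitivity (0.70) but different sizes.
small = wilson_interval(7, 10)
large = wilson_interval(70, 100)
print(f"n=10:  {small[0]:.2f} to {small[1]:.2f}")
print(f"n=100: {large[0]:.2f} to {large[1]:.2f}")
```

The larger study gives the narrower interval, even though the point estimate is identical.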
Reasons for variability in test accuracy
between studies
• True underlying differences between studies
– In diagnostic reviews, sampling error is unlikely to account for all
of the variability (scatter) between studies.
– Additional heterogeneity in test performance between studies is
likely to occur for other reasons, including differences in:
• Cut-point chosen to define a positive test (threshold effect)
• Spectrum of disease
• Clinical setting
• Study design
• etc…
Even if all studies use the same cut-point, sensitivity and
specificity are expected to vary between studies
Graphical Displays
Descriptive plots should include:
– Forest plot showing sensitivity and specificity for each study and
the numbers on which these estimates are based for each study
– Scatter plot showing (1-specificity, sensitivity) pair for each study
in ROC space. The size of each marker should ideally reflect the
numbers in both the diseased and non-diseased groups.
RevMan provides facilities for:
• graphical displays (improvements made in version 5.2).
• summary ROC curve estimation based on Moses-Littenberg method
• Descriptive exploration of heterogeneity using subgroup analyses
50 studies taken from the review conducted by Nishimura (2007) of
Rheumatoid factor (RF) as a marker for rheumatoid arthritis (RA)
The cut-point for test positivity for RF varied between studies, ranging from 3
to 100 U/ml (not all studies reported the cut-point)
The reference standard was based on the 1987 revised American
College of Rheumatology (ACR) criteria or clinical diagnosis.
Note: RF contributes to the ACR criteria so there is some risk of bias in
this analysis.
Example: Rheumatoid Factor as a marker
for Rheumatoid Arthritis
Study                       TP   FP   FN    TN  Method               Sensitivity        Specificity
Aho 1999                    64   16   27   153  LA                   0.70 [0.60, 0.79]  0.91 [0.85, 0.94]
Anuradha 2005              482    2   82   153  LA                   0.85 [0.82, 0.88]  0.99 [0.95, 1.00]
Banchuin 1992               36    6   41   313  ELISA                0.47 [0.35, 0.58]  0.98 [0.96, 0.99]
Bas 2003                   143   43   53   196  ELISA                0.73 [0.66, 0.79]  0.82 [0.77, 0.87]
Berthelot 1995              80   50   39    45  LA                   0.67 [0.58, 0.76]  0.47 [0.37, 0.58]
Bizzaro 2001                61   36   37   196  Nephelometry         0.62 [0.52, 0.72]  0.84 [0.79, 0.89]
Bombardieri 2004            27    6    3    33  Nephelometry         0.90 [0.73, 0.98]  0.85 [0.69, 0.94]
Carpenter 1989              60    8   20   119  ELISA                0.75 [0.64, 0.84]  0.94 [0.88, 0.97]
Choi 2005                  261   54   63   197  LA                   0.81 [0.76, 0.85]  0.78 [0.73, 0.83]
Cordonnier 1996             20    2   29    18  LA                   0.41 [0.27, 0.56]  0.90 [0.68, 0.99]
Das 2004                    42   46   14   127  Nephelometry         0.75 [0.62, 0.86]  0.73 [0.66, 0.80]
Davis 1989                  18    3   31    25  ELISA                0.37 [0.23, 0.52]  0.89 [0.72, 0.98]
de Bois 1996                 8    8    0    31  ELISA                1.00 [0.63, 1.00]  0.79 [0.64, 0.91]
De Rycke 2004               93   28   25   118  LA                   0.79 [0.70, 0.86]  0.81 [0.73, 0.87]
Despres 1994               143   39   63   130  LA                   0.69 [0.63, 0.76]  0.77 [0.70, 0.83]
Dubucquoi 2004              84   41   56    90  ELISA                0.60 [0.51, 0.68]  0.69 [0.60, 0.77]
Fernandez-Suarez 2005       30    2   23    73  Nephelometry         0.57 [0.42, 0.70]  0.97 [0.91, 1.00]
Girelli 2004                32   29    3    13  Nephelometry         0.91 [0.77, 0.98]  0.31 [0.18, 0.47]
Goldbach-Mansky 2000        70   39   36    93  Nephelometry         0.66 [0.56, 0.75]  0.70 [0.62, 0.78]
Gomes-Daudrix 1994          48    1   40    99  ELISA                0.55 [0.44, 0.65]  0.99 [0.95, 1.00]
Greiner 2005                75   42   12   191  Nephelometry         0.86 [0.77, 0.93]  0.82 [0.76, 0.87]
Grootenboer-Mignot 2004     64   18   29    73  Nephelometry         0.69 [0.58, 0.78]  0.80 [0.71, 0.88]
Hitchon 2004                32   10    9    13  Nephelometry         0.78 [0.62, 0.89]  0.57 [0.34, 0.77]
Jansen 2003                130    8  128   113  Nephelometry         0.50 [0.44, 0.57]  0.93 [0.87, 0.97]
Jonsson 1998                50   14   20   191  ELISA                0.71 [0.59, 0.82]  0.93 [0.89, 0.96]
Kamali 2005                 20   32   26    25  LA                   0.43 [0.29, 0.59]  0.44 [0.31, 0.58]
Kwok 2005                   77   16   52    52  Nephelometry         0.60 [0.51, 0.68]  0.76 [0.65, 0.86]
Lee 2003                    73   22   29    90  LA                   0.72 [0.62, 0.80]  0.80 [0.72, 0.87]
Lopez-Hoyos 2004            36    3    5    70  Nephelometry         0.88 [0.74, 0.96]  0.96 [0.88, 0.99]
Nell 2005                   56   11   46    87  Not reported         0.55 [0.45, 0.65]  0.89 [0.81, 0.94]
Quinn 2006                 115   53   67    63  Not reported         0.63 [0.56, 0.70]  0.54 [0.45, 0.64]
Rantapaa-Dahlqvist 2003     49   23   28   359  ELISA                0.64 [0.52, 0.74]  0.94 [0.91, 0.96]
Raza 2005                   22    2   20    80  LA                   0.52 [0.36, 0.68]  0.98 [0.91, 1.00]
Saraux 1995                  8    8   31    91  LA                   0.21 [0.09, 0.36]  0.92 [0.85, 0.96]
Saraux 2003                 35    8   51   149  ELISA                0.41 [0.30, 0.52]  0.95 [0.90, 0.98]
Sauerland 2005             161   89    7   360  Nephelometry         0.96 [0.92, 0.98]  0.80 [0.76, 0.84]
Schellekens 2000            80   28   69   284  ELISA                0.54 [0.45, 0.62]  0.91 [0.87, 0.94]
Soderlin 2004                5    4   11    49  LA                   0.31 [0.11, 0.59]  0.92 [0.82, 0.98]
Spiritus 2004               57    9   33    93  Nephelometry         0.63 [0.53, 0.73]  0.91 [0.84, 0.96]
Suzuki 2003                383   38  166   170  Nephelometry         0.70 [0.66, 0.74]  0.82 [0.76, 0.87]
Swedler                     89    3    9    39  Nephelometry         0.91 [0.83, 0.96]  0.93 [0.81, 0.99]
Thammanichanond 2005        57   25    6   111  LA                   0.90 [0.80, 0.96]  0.82 [0.74, 0.88]
Vallbracht 2004            196   75   99   345  ELISA                0.66 [0.61, 0.72]  0.82 [0.78, 0.86]
van Leeuwen 1988           163   10   28   140  ELISA                0.85 [0.80, 0.90]  0.93 [0.88, 0.97]
Vasiliauskiene 2001         75   21   21   106  ELISA                0.78 [0.69, 0.86]  0.83 [0.76, 0.89]
Visser 1996                157  287   78  1466  ELISA                0.67 [0.60, 0.73]  0.84 [0.82, 0.85]
Vittecoq 2001               26    1   32    29  LA                   0.45 [0.32, 0.58]  0.97 [0.83, 1.00]
Vittecoq 2004               62   11  114   127  ELISA                0.35 [0.28, 0.43]  0.92 [0.86, 0.96]
Winkles 1989               113   19   29   481  LA                   0.80 [0.72, 0.86]  0.96 [0.94, 0.98]
Young 1991                  25    1   14    20  RA hemagglutination  0.64 [0.47, 0.79]  0.95 [0.76, 1.00]

cutoff (U/ml): only 39 values were extracted for the 50 studies (not all studies reported the cut-point), so they cannot be matched to individual rows here. In extraction order:
8.0, 100.0, 15.0, 87.0, 9.0, 40.0, 16.3, 3.0, 3.125, 100.0, 20.0, 50.0, 20.0, 20.0, 20.0, 20.0, 30.0, 40.0, 20.0, 15.0, 80.0, 22.0, 40.0, 20.0, 30.0, 40.0, 20.0, 3.0, 20.0, 20.0, 15.0, 20.0, 20.0, 15.0, 17.0, 9.0, 80.0, 16.0, 40.0

[Forest plot: sensitivity and specificity for each study, each plotted on an axis from 0 to 1.]
Forest plot – sorted by specificity
Example: Rheumatoid Factor as a marker
for Rheumatoid Arthritis
Moses LE, Shapiro D, Littenberg B Stat Med 1993; 12:1293-1316.
For each study i
Compute accuracy (log diagnostic odds ratio, lnDOR):
$$D_i = \operatorname{logit}(tpr_i) - \operatorname{logit}(fpr_i)$$
and proxy for threshold (based on overall positivity rate):
$$S_i = \operatorname{logit}(tpr_i) + \operatorname{logit}(fpr_i)$$
Moses-Littenberg SROC regression
SROC regression: model specification
The relationship between test accuracy and test threshold is modelled
to estimate a summary ROC curve:
$$D = a + bS$$
This fixed effect model is generally fitted using linear regression
(unweighted or weighted by inverse variance of lnDOR).
b ≠ 0: Accuracy depends on threshold, resulting in an
asymmetric SROC
b = 0: Accuracy is independent of threshold, resulting in a
symmetric SROC
The SROC is produced by using the estimates of a and b to compute the
expected sensitivity (tpr) across a range of values for 1-specificity (fpr)
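The regression and the resulting curve can be sketched as follows. This is an illustrative, unweighted least-squares version only, using the first five studies of the rheumatoid factor example; the function names are assumptions, and studies with a zero cell (where a logit is undefined) are deliberately not handled. Solving D = a + bS for logit(tpr) gives logit(tpr) = (a + (1 + b) logit(fpr)) / (1 - b), which is what `sroc_tpr` evaluates.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def moses_littenberg(tables):
    """Unweighted least-squares fit of D = a + b*S.

    tables: iterable of (tp, fp, fn, tn); zero cells are not handled here.
    """
    d, s = [], []
    for tp, fp, fn, tn in tables:
        tpr = tp / (tp + fn)
        fpr = fp / (fp + tn)
        d.append(logit(tpr) - logit(fpr))  # lnDOR
        s.append(logit(tpr) + logit(fpr))  # proxy for threshold
    s_mean = sum(s) / len(s)
    d_mean = sum(d) / len(d)
    sxx = sum((x - s_mean) ** 2 for x in s)
    sxy = sum((x - s_mean) * (y - d_mean) for x, y in zip(s, d))
    b = sxy / sxx
    a = d_mean - b * s_mean
    return a, b


def sroc_tpr(fpr, a, b):
    """Expected sensitivity at a given 1-specificity (requires b != 1)."""
    return expit((a + (1 + b) * logit(fpr)) / (1 - b))


# (TP, FP, FN, TN) for Aho, Anuradha, Banchuin, Bas and Berthelot.
tables = [(64, 16, 27, 153), (482, 2, 82, 153), (36, 6, 41, 313),
          (143, 43, 53, 196), (80, 50, 39, 45)]
a, b = moses_littenberg(tables)
curve = [(fpr, sroc_tpr(fpr, a, b)) for fpr in (0.05, 0.1, 0.2, 0.4)]
for fpr, tpr in curve:
    print(f"fpr={fpr:.2f}  expected tpr={tpr:.2f}")
```

Five studies are far too few for a meaningful fit; the subset only keeps the example short.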
SROC regression:
properties and summary measures
Example: Rheumatoid Factor as a marker
for Rheumatoid Arthritis
Moses-Littenberg SROC
Historically, this has been the most commonly used method
easy to implement
uses standard regression methods / software
can use regression diagnostics to identify influential studies
but
does not take proper account of within and between study variability
confidence intervals and P-values are likely to be inaccurate
should be regarded as a descriptive/exploratory analysis
Hence:
RevMan 5 will provide only exploratory analyses based on SROC
regression. Statistical inference will require more complex analyses
using multilevel (hierarchical) models in other software.
Moses-Littenberg SROC regression:
comments
Multilevel (hierarchical) models have the advantage that they take
proper account of both:
(i) within study variability (sampling error)
(ii) between study variability not accounted for by (i), through the
inclusion of random effects
Hierarchical models provide a more rigorous method that
allows statistical inferences to be made.
Hierarchical (Mixed) models
Two hierarchical models are most commonly used for the
meta-analysis of studies of diagnostic accuracy:
Bivariate model: the primary objective is to obtain a
summary estimate of sensitivity and specificity
and
HSROC model: the primary objective is to fit a
summary ROC
The two models are mathematically equivalent when no
covariates are included in the model
Hierarchical (Mixed) models
Estimating a summary operating point:
• This is appropriate if there is a common cut-point or criterion for
test positivity between studies
• If studies use different criteria for test positivity the summary
operating point will be difficult to interpret.
Estimating a summary curve:
• This is appropriate if there is variation in the cut-point or criterion
for test positivity between studies
• If studies use the same criterion for test positivity, there will be
very limited information to inform the shape of the curve.
Which method to use?
If no covariates are included in the model, the Bivariate
and HSROC methods are mathematically equivalent:
• The parameter estimates from the HSROC model can be used to
derive the summary point and corresponding confidence region
• The parameter estimates from the Bivariate model can be used
to obtain the HSROC
If covariates are included in the model to explore
reasons for heterogeneity in test performance, the
choice will be guided jointly by:
The research question: Whether we want to make inferences about (i)
the summary curve or (ii) the summary point
Whether or not there is a common criterion for test positivity.
Which method to use?
Bivariate model:
Models the relationship between sensitivity and specificity directly (after
logit transformation), including random effects for both and allowing
for correlation between them.
The focus is on estimating the expected sensitivity and specificity (i.e.
expected operating point).
An underlying SROC can be derived from the estimated model
parameters (the HSROC is one of the possible SROC curves).
HSROC (Rutter and Gatsonis) model:
Includes random effects for test accuracy and for the proxy for test threshold.
The focus is on estimating a summary ROC.
The expected sensitivity for a given specificity, expected operating
point, etc can be derived from the estimated model parameters.
Multilevel (hierarchical) models
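Bivariate model
LEVEL 1
For each study (i), the number testing positive ($y_{ij}$) is assumed to follow a
Binomial distribution
$$y_{ij} \sim B(n_{ij}, \pi_{ij})$$
where j=1 represents diseased group
j=2 represents non-diseased group
$n_{ij}$ represents the number in group j
$\pi_{ij}$ represents the probability of a positive
test result in group j
LEVEL 2
$$\begin{pmatrix} \mu_{Ai} \\ \mu_{Bi} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix}, \begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix} \right)$$
where $\mu_{Ai} = \operatorname{logit}(\pi_{i1})$ is the logit sensitivity and $\mu_{Bi} = \operatorname{logit}(1 - \pi_{i2})$ is the logit specificity in study i.
Model can be fitted using random effects logistic regression
(e.g. SAS, Stata, R, ...)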
37 studies taken from the review conducted by Nishimura (2007) of
anti-cyclic citrullinated peptide antibody (anti-CCP).
The anti-CCP test is deemed positive if any anti-CCP antibody is
detected. Hence, detection may be considered a common threshold.
The reference standard was based on the 1987 revised American
College of Rheumatology (ACR) criteria or clinical diagnosis.
If we can assume a common threshold (cut-point or criterion for test
positivity) across studies, it is appropriate to focus on summary
estimate(s) for sensitivity and specificity.
Bivariate Model Example :
Anti-CCP for the diagnosis of rheumatoid arthritis.
Example: Anti-CCP for the diagnosis of
rheumatoid arthritis.

Study                       TP   FP   FN    TN  Generation  Sensitivity        Specificity
Bas 2003                   110   24   86   215  CCP1        0.56 [0.49, 0.63]  0.90 [0.85, 0.93]
Bizzaro 2001                40    5   58   227  CCP1        0.41 [0.31, 0.51]  0.98 [0.95, 0.99]
Goldbach-Mansky 2000        43    1   63   120  CCP1        0.41 [0.31, 0.51]  0.99 [0.95, 1.00]
Jansen 2003                110    3  148   118  CCP1        0.43 [0.37, 0.49]  0.98 [0.93, 0.99]
Saraux 2003                 40   11   46   146  CCP1        0.47 [0.36, 0.58]  0.93 [0.88, 0.96]
Schellekens 2000            72   14   77   298  CCP1        0.48 [0.40, 0.57]  0.96 [0.93, 0.98]
Vincent 2002               139    7  101   464  CCP1        0.58 [0.51, 0.64]  0.99 [0.97, 0.99]
Zeng 2003                   90    7  101   313  CCP1        0.47 [0.40, 0.54]  0.98 [0.96, 0.99]
Aotsuka 2005               115   17   16    73  CCP2        0.88 [0.81, 0.93]  0.81 [0.71, 0.89]
Bombardieri 2004            23    0    7    39  CCP2        0.77 [0.58, 0.90]  1.00 [0.91, 1.00]
Choi 2005                  236   20   88   231  CCP2        0.73 [0.68, 0.78]  0.92 [0.88, 0.95]
Correa 2004                 74   11    8   130  CCP2        0.90 [0.82, 0.96]  0.92 [0.86, 0.96]
De Rycke 2004               89    4   29   142  CCP2        0.75 [0.67, 0.83]  0.97 [0.93, 0.99]
Dubucquoi 2004              90    2   50   129  CCP2        0.64 [0.56, 0.72]  0.98 [0.95, 1.00]
Fernandez-Suarez 2005       31    0   22    75  CCP2        0.58 [0.44, 0.72]  1.00 [0.95, 1.00]
Garcia-Berrocal 2005        69    8   18    38  CCP2        0.79 [0.69, 0.87]  0.83 [0.69, 0.92]
Girelli 2004                25    2   10    40  CCP2        0.71 [0.54, 0.85]  0.95 [0.84, 0.99]
Greiner 2005                70    5   17   228  CCP2        0.80 [0.71, 0.88]  0.98 [0.95, 0.99]
Grootenboer-Mignot 2004    167    8   98    88  CCP2        0.63 [0.57, 0.69]  0.92 [0.84, 0.96]
Hitchon 2004                26    8   15    15  CCP2        0.63 [0.47, 0.78]  0.65 [0.43, 0.84]
Kamali 2005                 26    1   20    56  CCP2        0.57 [0.41, 0.71]  0.98 [0.91, 1.00]
Kumagai 2004                64   14   15   293  CCP2        0.81 [0.71, 0.89]  0.95 [0.92, 0.97]
Kwok 2005                   71    2   58    66  CCP2        0.55 [0.46, 0.64]  0.97 [0.90, 1.00]
Lee 2003                    68   14   35   132  CCP2        0.66 [0.56, 0.75]  0.90 [0.84, 0.95]
Lopez-Hoyos 2004            38    3    0    73  CCP2        1.00 [0.91, 1.00]  0.96 [0.89, 0.99]
Nell 2005                   42    2   60    96  CCP2        0.41 [0.32, 0.51]  0.98 [0.93, 1.00]
Nielen 2005                149    7  109   114  CCP2        0.58 [0.51, 0.64]  0.94 [0.88, 0.98]
Quinn 2006                 147   10   35   106  CCP2        0.81 [0.74, 0.86]  0.91 [0.85, 0.96]
Rantapaa-Dahlqvist 2003     47    7   20   375  CCP2        0.70 [0.58, 0.81]  0.98 [0.96, 0.99]
Raza 2005                   24    3   18    79  CCP2        0.57 [0.41, 0.72]  0.96 [0.90, 0.99]
Sauerland 2005             171   26   60   443  CCP2        0.74 [0.68, 0.80]  0.94 [0.92, 0.96]
Soderlin 2004                7    2    9    51  CCP2        0.44 [0.20, 0.70]  0.96 [0.87, 1.00]
Suzuki 2003                481   23   68   185  CCP2        0.88 [0.85, 0.90]  0.89 [0.84, 0.93]
Vallbracht 2004            190   12  105   408  CCP2        0.64 [0.59, 0.70]  0.97 [0.95, 0.99]
van Gaalen 2005             82   13   71   301  CCP2        0.54 [0.45, 0.62]  0.96 [0.93, 0.98]
van Venrooij 2004          865   79  252  2218  CCP2        0.77 [0.75, 0.80]  0.97 [0.96, 0.97]
Vittecoq 2004               69    5  107   133  CCP2        0.39 [0.32, 0.47]  0.96 [0.92, 0.99]

[Forest plot: sensitivity and specificity for each study, each plotted on an axis from 0 to 1.]
Proc NLMIXED for Bivariate Model
data accp (keep=study_id sens spec true n);
  input study_id $ generation tp fp fn tn;
  /* first record: numerator and denominator for sensitivity */
  sens=1; spec=0; true=tp; n=tp+fn; output;
  /* second record: numerator and denominator for specificity */
  sens=0; spec=1; true=tn; n=tn+fp; output;
  cards;
Bas 1 110 24 86 215
Bizzaro 1 40 5 58 227
Goldbach-Mansky 1 43 1 63 120
Jansen 1 110 3 148 118
Saraux 1 40 11 46 146
Schellekens 1 72 14 77 298
Vincent 1 139 7 101 464
Zeng 1 90 7 101 313
Aotsuka 2 115 17 16 73
Bombardieri 2 23 0 7 39
.
.
;
The resulting SAS dataset accp will have two records per study:
the first contains the numerator and denominator for sensitivity,
the second contains the numerator and denominator for specificity.
Proc NLMIXED for Bivariate Model
proc nlmixed data=accp cov ecov;
  /* msens = summary estimate of logit(sensitivity),
     mspec = summary estimate of logit(specificity) */
  parms msens=2 mspec=2 s2usens=0.5 s2uspec=0.5 covsesp=0;
  /* usens, uspec = study-level random effects */
  logitp = (msens + usens)*sens + (mspec + uspec)*spec;
  p = exp(logitp)/(1+exp(logitp));
  model true ~ binomial(n,p);
  /* distribution of the random effects: bivariate normal with
     variances s2usens, s2uspec and covariance covsesp */
  random usens uspec ~ normal([0 , 0],[s2usens,covsesp,s2uspec])
         subject=study_id out=randeffs;
run;
Fit Statistics
-2 Log Likelihood 545.6
AIC (smaller is better) 555.6
AICC (smaller is better) 556.4
BIC (smaller is better) 563.6
Parameter Estimates
Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha Lower Upper Gradient
msens 0.6534 0.1275 35 5.13 <.0001 0.05 0.3946 0.9122 0.000013
mspec 3.1090 0.1459 35 21.31 <.0001 0.05 2.8128 3.4051 -0.00015
s2usens 0.5426 0.1463 35 3.71 0.0007 0.05 0.2455 0.8397 0.000222
s2uspec 0.5717 0.1873 35 3.05 0.0043 0.05 0.1914 0.9520 0.000039
covsesp -0.2704 0.1199 35 -2.26 0.0304 0.05 -0.5137 -0.02710 0.000036
Covariance Matrix of Parameter Estimates
Row Parameter msens mspec s2usens s2uspec covsesp
1 msens 0.01625 -0.00741 0.000890 -0.00004 -0.00004
2 mspec -0.00741 0.02128 -0.00006 0.004286 -0.00116
3 s2usens 0.000890 -0.00006 0.02142 0.003997 -0.00874
4 s2uspec -0.00004 0.004286 0.003997 0.03509 -0.01184
5 covsesp -0.00004 -0.00116 -0.00874 -0.01184 0.01436
Proc NLMIXED for Bivariate Model
Input of Model Results to RevMan
The specificities appear to be
relatively homogeneous but there is
considerable variation in the
sensitivities. (This is evident in the size of
the prediction region on the SROC plot.)
The summary estimate of sensitivity
and specificity is shown by the solid
black dot. (The sensitivity and specificity at
this point can be computed by inverse
transformation of the logit estimates to give
0.66 and 0.96 respectively.)
Bivariate Model Example :
Anti-CCP for the diagnosis of rheumatoid arthritis.
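The inverse transformation mentioned above can be checked directly from the NLMIXED output. The sketch below back-transforms the reported estimates of msens and mspec, together with their reported lower and upper limits, from the logit scale to the probability scale. The helper name `expit` and the dictionary layout are assumptions made for illustration; the numbers are copied from the Parameter Estimates table.

```python
import math


def expit(x):
    """Inverse logit: maps a logit-scale value to a probability."""
    return 1 / (1 + math.exp(-x))


# (estimate, lower, upper) on the logit scale, from the NLMIXED output.
logit_estimates = {
    "sensitivity": (0.6534, 0.3946, 0.9122),  # msens
    "specificity": (3.1090, 2.8128, 3.4051),  # mspec
}

summary = {
    name: tuple(expit(v) for v in values)
    for name, values in logit_estimates.items()
}
for name, (est, lower, upper) in summary.items():
    print(f"{name}: {est:.2f} ({lower:.2f} to {upper:.2f})")
```

Transforming the limits, rather than building an interval on the probability scale, keeps the interval inside 0 to 1.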
Rutter and Gatsonis HSROC model
LEVEL 1
For each study (i), the number testing positive ($y_{ij}$) is assumed to follow a
Binomial distribution
$$y_{ij} \sim B(n_{ij}, \pi_{ij})$$
where j=1 represents diseased group
j=2 represents non-diseased group
$n_{ij}$ represents the number in group j
$\pi_{ij}$ represents the probability of a positive
test result in group j
The model takes the form:
$$\operatorname{logit}(\pi_{ij}) = (\theta_i + \alpha_i\, dis_{ij}) \exp(-\beta\, dis_{ij})$$
where $dis_{ij}$ represents the “true” disease status (coded as -0.5 for the non-
diseased and 0.5 for the diseased)
Rutter and Gatsonis HSROC model
LEVEL 1 cont.
The model is based on the ordinal logistic regression proposed by McCullagh.
$$\operatorname{logit}(\pi_{ij}) = (\theta_i + \alpha_i\, dis_{ij}) \exp(-\beta\, dis_{ij})$$
$\theta_i$: threshold (random effect)
$\alpha_i$: accuracy (random effect)
$\beta$: dependence of accuracy on threshold (fixed effect)
When $\beta = 0$, the model reduces to a logistic regression model and
$\theta_i$ is estimated by $(\operatorname{logit}(tpr_i) + \operatorname{logit}(fpr_i))/2$ ($= S_i/2$)
$\alpha_i$ is estimated by $\operatorname{logit}(tpr_i) - \operatorname{logit}(fpr_i)$ ($= lnDOR_i$)
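Rutter and Gatsonis HSROC model
LEVEL 2
The random effects are assumed to be independent and normally
distributed:
$$\theta_i \sim N(\Theta, \sigma_\theta^2), \qquad \alpha_i \sim N(\Lambda, \sigma_\alpha^2)$$
The SROC curve is computed using
$$E(tpr) = \frac{1}{1 + \exp\left[-\left(\Lambda e^{-0.5\beta} + \operatorname{logit}(fpr)\, e^{-\beta}\right)\right]}$$
for chosen values of fpr
When $\beta = 0$, $\Lambda$ provides a global estimate of the expected test accuracy
(lnDOR) and the resulting SROC is symmetric.
The expected tpr and fpr are given by
$$\frac{1}{1 + \exp\left[-(\Theta + 0.5\Lambda)\, e^{-0.5\beta}\right]} \quad \text{and} \quad \frac{1}{1 + \exp\left[-(\Theta - 0.5\Lambda)\, e^{0.5\beta}\right]}$$
respectively.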
The Rutter and Gatsonis HSROC model is a generalised non-linear
random effects model and hence requires more specialised software
to fit it.
It is often fitted using SAS Proc NLMIXED, or using Bayesian (MCMC)
methods.
Notes:
Metandi (macro available for Stata) exploits the relationship between
the Bivariate model and the HSROC model to fit the summary curve.
This software cannot accommodate covariates.
The METADAS macro for SAS creates code for Proc NLMIXED and
provides output suitable for input to RevMan
Fitting the HSROC model
Example: Rheumatoid Factor as a marker
for Rheumatoid Arthritis
data rf (keep=study_id dis pos n);
  input study_id $ tp fp fn tn method $;
  /* diseased group: dis=0.5, number testing positive = TP */
  dis=0.5; pos=tp; n=tp+fn; output;
  /* non-diseased group: dis=-0.5, number testing positive = FP */
  dis=-0.5; pos=fp; n=tn+fp; output;
  cards;
Bizzaro 61 36 37 196 N
Bombardieri 27 6 3 33 N
Das 42 46 14 127 N
Suzuki 383 38 166 170 N
Swedler 89 3 9 39 N
Aho 64 16 27 153 LA
Berthelot 80 50 39 45 LA
Choi 261 54 63 197 LA
Cordonnier 20 2 29 18 LA
DeRycke 93 28 25 118 LA
.
.
;
Proc NLMIXED for HSROC Model
The resulting SAS dataset rf will have two records per study,
the first contains the numerator and denominator for sensitivity
the second contains the numerator and denominator for 1-specificity
proc nlmixed data=rf ecov cov;
  /* theta = summary estimate for "threshold",
     alpha = summary estimate for "accuracy",
     beta  = shape parameter estimate */
  parms alpha=2 theta=0 beta=0 s2ua=0 s2ut=0;
  /* ut, ua = random effects for threshold and accuracy */
  logitp = (theta + ut + (alpha + ua)*dis) * exp(-(beta)*dis);
  p = exp(logitp)/(1+exp(logitp));
  model pos ~ binomial(n,p);
  /* distribution of the random effects: independent normals
     with variances s2ut and s2ua */
  random ut ua ~ normal([0,0],[s2ut,0,s2ua])
         subject=study_id out=randeffs;
run;
Parameter Estimates
Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha Lower Upper Gradient
alpha 2.6016 0.1862 48 13.97 <.0001 0.05 2.2273 2.9759 2.227E-6
theta -0.4370 0.1469 48 -2.98 0.0046 0.05 -0.7323 -0.1417 4.573E-6
beta 0.2267 0.1624 48 1.40 0.1691 0.05 -0.09978 0.5532 -1.16E-6
s2ua 1.3014 0.3046 48 4.27 <.0001 0.05 0.6890 1.9137 -6.42E-7
s2ut 0.5423 0.1237 48 4.39 <.0001 0.05 0.2937 0.7909 -6.99E-6
Proc NLMIXED for HSROC Model
Input of Model Results to RevMan
Example: RF for the diagnosis of
rheumatoid arthritis.
The summary curve shows the
expected trade-off between sensitivity
and specificity as threshold varies.
Notes:
Since RF constitutes part of the ACR
criteria, diagnostic accuracy may be
overestimated.
The impact of potentially influential studies
should be investigated.
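The summary curve and the expected operating point can be computed from the fitted parameters using the model form in the SAS code, logitp = (theta + alpha*dis) * exp(-beta*dis) with dis = 0.5 for the diseased and -0.5 for the non-diseased. The sketch below plugs in the reported estimates for alpha, theta and beta. The function names are assumptions made for illustration, and this reproduces only the point estimates of the curve, not any confidence or prediction region.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


# Estimates from the NLMIXED output for the RF example.
alpha, theta, beta = 2.6016, -0.4370, 0.2267


def expected_operating_point(alpha, theta, beta):
    """Expected (fpr, tpr) at the mean threshold and mean accuracy."""
    tpr = expit((theta + 0.5 * alpha) * math.exp(-0.5 * beta))
    fpr = expit((theta - 0.5 * alpha) * math.exp(0.5 * beta))
    return fpr, tpr


def hsroc_tpr(fpr, alpha, beta):
    """Expected sensitivity on the summary curve at a given 1-specificity."""
    return expit(alpha * math.exp(-0.5 * beta) + logit(fpr) * math.exp(-beta))


exp_fpr, exp_tpr = expected_operating_point(alpha, theta, beta)
print(f"expected operating point: fpr={exp_fpr:.3f}, tpr={exp_tpr:.3f}")
for fpr in (0.05, 0.10, 0.20, 0.40):
    print(f"fpr={fpr:.2f}  tpr on curve={hsroc_tpr(fpr, alpha, beta):.3f}")
```

The expected operating point lies on the summary curve by construction, which gives a quick consistency check on the two formulas.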
Covariates can be included in both the Bivariate and
HSROC models to investigate factors that may be
associated with heterogeneity.
37 studies taken from the review conducted by Nishimura (2007) of
anti-cyclic citrullinated peptide antibody (anti-CCP).
The anti-CCP test is deemed positive if any anti-CCP antibody is
detected. Hence, detection may be considered a common threshold.
The reference standard was based on the 1987 revised American
College of Rheumatology (ACR) criteria or clinical diagnosis.
Two generations of CCP are included in the analysis, CCP1 and CCP2.
Bivariate Model Example :
Anti-CCP for the diagnosis of rheumatoid arthritis:
generation of CCP.
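Bivariate model with a covariate
LEVEL 1
For each study (i), the number testing positive ($y_{ij}$) is assumed to follow a
Binomial distribution
$$y_{ij} \sim B(n_{ij}, \pi_{ij})$$
where j=1 represents diseased group
j=2 represents non-diseased group
$n_{ij}$ represents the number in group j
$\pi_{ij}$ represents the probability of a positive
test result in group j
LEVEL 2
Assuming a study level covariate $Z_i$ (assumed to have a fixed effect)
$$\begin{pmatrix} \mu_{Ai} \\ \mu_{Bi} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_A + \nu_A Z_i \\ \mu_B + \nu_B Z_i \end{pmatrix}, \begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix} \right)$$
where $\mu_{Ai} = \operatorname{logit}(\pi_{i1})$ is the logit sensitivity and $\mu_{Bi} = \operatorname{logit}(1 - \pi_{i2})$ is the logit specificity in study i.
Model can be fitted using random effects logistic regression
(e.g. SAS, Stata, R, ...)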
Proc NLMIXED for Bivariate Model
data accp (keep=study_id sens spec true n ccpg);
  input study_id $ generation tp fp fn tn;
  /* ccpg=0 for CCP1 (the referent category), ccpg=1 for CCP2 */
  if generation eq 1 then ccpg=0;
  if generation eq 2 then ccpg=1;
  sens=1; spec=0; true=tp; n=tp+fn; output;
  sens=0; spec=1; true=tn; n=tn+fp; output;
  cards;
Bas 1 110 24 86 215
Bizzaro 1 40 5 58 227
Goldbach-Mansky 1 43 1 63 120
Jansen 1 110 3 148 118
Saraux 1 40 11 46 146
Schellekens 1 72 14 77 298
Vincent 1 139 7 101 464
Zeng 1 90 7 101 313
Aotsuka 2 115 17 16 73
Bombardieri 2 23 0 7 39
.
.
;
CCP1 is the
referent category
Proc NLMIXED for Bivariate Model
proc nlmixed data=accp cov ecov;
  parms msens=2 mspec=2 s2usens=0.5 s2uspec=0.5 covsesp=0
        se1=0 sp1=0;
  /* se1 and sp1 are the effects of CCP2 (ccpg=1) relative to CCP1 */
  logitp = (msens + se1*ccpg + usens)*sens + (mspec + sp1*ccpg + uspec)*spec;
  p = exp(logitp)/(1+exp(logitp));
  model true ~ binomial(n,p);
  random usens uspec ~ normal([0 , 0],[s2usens,covsesp,s2uspec])
         subject=study_id out=randeffs;
  /* Estimate logit(sensitivity) and logit(specificity) for CCP2 */
  estimate 'logitsens CCP2' msens + se1;
  estimate 'logitspec CCP2' mspec + sp1;
run;
Notes:
The variances of the random effects for CCP1 and CCP2 are assumed to be the same
Proc NLMIXED for Bivariate Model
Random effects estimates common to both CCP1 and CCP2
Fit Statistics
-2 Log Likelihood 533.4
AIC (smaller is better) 547.4
AICC (smaller is better) 549.1
BIC (smaller is better) 558.6
Parameter Estimates
Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha Lower Upper Gradient
msens -0.09653 0.2203 35 -0.44 0.6640 0.05 -0.5438 0.3507 -0.00024
mspec 3.4467 0.2982 35 11.56 <.0001 0.05 2.8412 4.0522 -0.00002
s2usens 0.3598 0.1022 35 3.52 0.0012 0.05 0.1524 0.5673 0.000479
s2uspec 0.5399 0.1802 35 3.00 0.0050 0.05 0.1742 0.9057 -0.00002
covsesp -0.1968 0.09836 35 -2.00 0.0532 0.05 -0.3965 0.002825 0.000213
se1 0.9626 0.2513 35 3.83 0.0005 0.05 0.4523 1.4728 -0.00025
sp1 -0.4302 0.3377 35 -1.27 0.2111 0.05 -1.1158 0.2554 0.000046
Covariance Matrix of Parameter Estimates
Row Parameter msens mspec s2usens s2uspec covsesp se1 sp1
1 msens 0.04854 -0.02464 -0.00012 -0.00001 -0.00003 -0.04855 0.02465
2 mspec -0.02464 0.08895 -0.00002 0.004771 -0.00065 0.02463 -0.08834
3 s2usens -0.00012 -0.00002 0.01044 0.002118 -0.00440 0.000693 -0.00005
4 s2uspec -0.00001 0.004771 0.002118 0.03246 -0.00860 -0.00007 -0.00039
5 covsesp -0.00003 -0.00065 -0.00440 -0.00860 0.009674 0.000100 -0.00091
6 se1 -0.04855 0.02463 0.000693 -0.00007 0.000100 0.06317 -0.03160
7 sp1 0.02465 -0.08834 -0.00005 -0.00039 -0.00091 -0.03160 0.1140
The estimates msens and mspec are for CCP1 (the referent category).
Additional Estimates
Standard
Label Estimate Error DF t Value Pr > |t| Alpha Lower Upper
logitsens CCP2 0.8660 0.1209 35 7.16 <.0001 0.05 0.6206 1.1114
logitspec CCP2 3.0165 0.1622 35 18.59 <.0001 0.05 2.6871 3.3459
Covariance Matrix of Additional Estimates
Row Label Cov1 Cov2
1 logitsens CCP2 0.01461 -0.00697
2 logitspec CCP2 -0.00697 0.02632
Proc NLMIXED for Bivariate Model
The ESTIMATE statement is used to get the corresponding values for CCP2
The change in -2logLikelihood when
the two covariates were added to the
model was 12.2 (a chi-squared statistic
with 2 df, P=0.002).
Hence, there is strong statistical
evidence that sensitivity and/or
specificity vary by generation.
The confidence regions show that
sensitivity varies by generation, but not
specificity.
Further models may be fitted to formally test
the effect of removing the covariate for
specificity from the model.
Bivariate Model Example :
Anti-CCP for the diagnosis of rheumatoid arthritis:
generation of CCP.
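The P value quoted for the likelihood ratio test can be checked directly. For a chi-squared statistic with 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so no statistical library is needed. This is a minimal sketch; the function name is chosen for illustration, and the shortcut applies only when the test has exactly 2 df (for other degrees of freedom a general routine such as `scipy.stats.chi2.sf` would be needed).

```python
import math

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-squared variable with 2 df: exp(-x/2)."""
    return math.exp(-x / 2.0)

# Change in -2 log likelihood when the two generation covariates are added.
p_generation = chi2_sf_2df(12.2)
print(f"P = {p_generation:.3f}")
```

The same calculation reproduces the P values of 0.9 and 0.74 quoted for the 2 df tests in the rheumatoid factor example (chi-squared statistics of 0.2 and 0.6).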
Summary estimates for specificity:
0.97 (95%CI 0.94, 0.98) for CCP1 and
0.95 (95%CI 0.94, 0.97) for CCP2.
Summary estimates for sensitivity:
0.48 (95%CI 0.37, 0.59) for CCP1 and
0.70 (95% CI 0.65, 0.75) for CCP2.
These results indicate an improvement
in sensitivity, without loss of specificity
for CCP2 compared with CCP1.
Bivariate Model Example :
Anti-CCP for the diagnosis of rheumatoid arthritis:
generation of CCP.
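The summary sensitivities and specificities are obtained by back-transforming the logit estimates and their confidence limits from the NLMIXED output. A minimal sketch of that step, assuming the limits are simply transformed with the inverse logit (the dictionary layout and names are illustrative only):

```python
import math

def expit(x):
    """Inverse logit: convert a logit back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

# (estimate, lower, upper) on the logit scale, copied from the NLMIXED output:
# msens/mspec for CCP1, and the ESTIMATE results for CCP2.
logit_estimates = {
    "sensitivity CCP1": (-0.09653, -0.5438, 0.3507),
    "sensitivity CCP2": (0.8660, 0.6206, 1.1114),
    "specificity CCP1": (3.4467, 2.8412, 4.0522),
    "specificity CCP2": (3.0165, 2.6871, 3.3459),
}

# Back-transform every value and round to two decimals for reporting.
summary = {
    label: tuple(round(expit(v), 2) for v in values)
    for label, values in logit_estimates.items()
}

for label, (est, lower, upper) in summary.items():
    print(f"{label}: {est:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

Because the inverse logit is monotone, transforming the limits preserves the coverage of the interval, although the interval is no longer symmetric about the estimate.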
50 studies taken from the review conducted by Nishimura (2007) of
rheumatoid factor (RF) as a marker for rheumatoid arthritis (RA).
The cut-point for test positivity for RF varied between studies, ranging
from 3 to 100 U/ml (not all studies reported the cut-point).
The reference standard was based on the 1987 revised American
College of Rheumatology (ACR) criteria or clinical diagnosis.
Method of measurement of RF:
15 studies used nephelometry (N), 16 used latex agglutination (LA),
16 used ELISA (E)
(3 studies excluded: 2 method not specified, 1 used RA
hemagglutination)
Example: Rheumatoid Factor as a marker for
Rheumatoid Arthritis:
Method of measurement of RF
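Rutter and Gatsonis HSROC model with a covariate
LEVEL 1
For each study (i), the number testing positive is assumed to follow a
Binomial distribution:
$$ y_{ij} \sim B(n_{ij}, \pi_{ij}) $$
where j=1 represents the diseased group
j=2 represents the non-diseased group
$n_{ij}$ represents the number in group j
$\pi_{ij}$ represents the probability of a positive test result in group j
Assuming a study level covariate $Z_i$ (assumed to have a fixed effect):
$$ \operatorname{logit}(\pi_{ij}) = \left(\theta_i + \gamma Z_i + (\alpha_i + \lambda Z_i)\, dis_{ij}\right) \exp\left(-(\beta + \delta Z_i)\, dis_{ij}\right) $$
where $dis_{ij}$ represents the “true” disease status (coded as -0.5 for the non-diseased and 0.5 for the diseased), and $\gamma$, $\lambda$ and $\delta$ are the effects of the covariate on threshold, accuracy and shape (the t, a and b terms in the NLMIXED code).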
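To make the structure of the model concrete, the sketch below evaluates the probability of a positive test result for the diseased and non-diseased groups of a single study. It assumes the usual Rutter and Gatsonis form, logit(pi) = (theta + gamma*Z + (alpha + lambda*Z)*dis) * exp(-(beta + delta*Z)*dis). The argument names `gamma`, `lam` and `delta` for the covariate effects, and the numerical values used, are hypothetical and chosen only for illustration.

```python
import math

def hsroc_positive_prob(dis, theta_i, alpha_i, beta, z=0.0, gamma=0.0, lam=0.0, delta=0.0):
    """Probability of a positive test result under the HSROC model with one covariate.

    dis is +0.5 for the diseased group and -0.5 for the non-diseased group.
    gamma, lam and delta shift threshold, accuracy and shape for covariate value z.
    """
    threshold = theta_i + gamma * z
    accuracy = alpha_i + lam * z
    shape = beta + delta * z
    logit_p = (threshold + accuracy * dis) * math.exp(-shape * dis)
    return 1.0 / (1.0 + math.exp(-logit_p))

# Hypothetical study-level values (not estimates from the example).
sens = hsroc_positive_prob(0.5, theta_i=-0.5, alpha_i=2.5, beta=0.2)
fpr = hsroc_positive_prob(-0.5, theta_i=-0.5, alpha_i=2.5, beta=0.2)
print(f"sensitivity = {sens:.3f}, specificity = {1 - fpr:.3f}")
```

Evaluating the function at dis = 0.5 gives the sensitivity and at dis = -0.5 gives the false positive rate, which is why each study contributes two records to the analysis data set.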
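data rf (keep=study_id dis pos n rfm1 rfm2);
  /* allow study names longer than the default 8 characters */
  length study_id $ 20 method $ 2;
  input study_id $ tp fp fn tn method $;
  /* indicator variables for the method of RF measurement */
  rfm1=0; if method eq 'LA' then rfm1=1;
  rfm2=0; if method eq 'E' then rfm2=1;
  /* two records per study: diseased (dis=0.5) and non-diseased (dis=-0.5) */
  dis=0.5; pos=tp; n=tp+fn; output;
  dis=-0.5; pos=fp; n=tn+fp; output;
  /* the data listing is truncated; ". . ." marks the rows not shown */
cards;
Bizzaro 61 36 37 196 N
Bombardieri 27 6 3 33 N
Das 42 46 14 127 N
Suzuki 383 38 166 170 N
Swedler 89 3 9 39 N
Aho 64 16 27 153 LA
Berthelot 80 50 39 45 LA
Choi 261 54 63 197 LA
Cordonnier 20 2 29 18 LA
DeRycke 93 28 25 118 LA
. . .
;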
Proc NLMIXED for HSROC Model
N is the referent
category
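proc nlmixed data=rf ecov cov;
  /* starting values; a, t and b are covariate effects on accuracy, threshold and shape */
  parms alpha=2 theta=0 beta=0 s2ua=0 s2ut=0
        a1=0 a2=0 t1=0 t2=0 b1=0 b2=0;
  /* HSROC linear predictor: (threshold + accuracy*dis) * exp(-shape*dis) */
  logitp = (theta + t1*rfm1 + t2*rfm2 + ut +
           (alpha + a1*rfm1 + a2*rfm2 + ua)*dis)
           * exp(-(beta + b1*rfm1 + b2*rfm2)*dis);
  p = exp(logitp)/(1+exp(logitp));
  model pos ~ binomial(n,p);
  /* independent random effects for threshold (ut) and accuracy (ua) */
  random ut ua ~ normal([0,0],[s2ut,0,s2ua])
         subject=study_id out=randeffs;
run;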
Proc NLMIXED for HSROC Model
This model allows the SROC curves to differ in shape between methods.
Removing b1*rfm1 + b2*rfm2 from the model changed the -2 logLikelihood
by only 0.2 (a chi-squared statistic with 2df, P=0.9 ). Hence there is no statistical
evidence that the curves differ in shape.
proc nlmixed data=rf ecov cov;
parms alpha=2 theta=0 beta=0 s2ua=0 s2ut=0
a1=0 a2=0 t1=0 t2=0;
logitp = (theta + t1*rfm1 +t2*rfm2 + ut +
(alpha + a1*rfm1 +a2*rfm2 + ua)*dis)
* exp(-(beta)*dis);
p = exp(logitp)/(1+exp(logitp));
model pos ~ binomial(n,p);
random ut ua ~ normal([0,0],[s2ut,0,s2ua])
subject=study_id out=randeffs;
/* parameter estimates for the methods of RF measurement; */
estimate 'alpha ELISA' alpha + a1;
estimate 'theta ELISA' theta + t1;
estimate 'alpha Nephelometry' alpha + a2;
estimate 'theta Nephelometry' theta + t2;
run;
Proc NLMIXED for HSROC Model
This model assumes the SROC curves all have the same asymmetric shape
-2 Log Likelihood 753.1
AIC (smaller is better) 771.1
AICC (smaller is better) 773.2
BIC (smaller is better) 787.7
Parameter Estimates
Parameter  Estimate  Standard Error  DF  t Value  Pr > |t|  Alpha  Lower     Upper    Gradient
alpha      2.4552    0.3245          45  7.57     <.0001    0.05   1.8017    3.1087   -0.0004
theta      -0.5490   0.2137          45  -2.57    0.0136    0.05   -0.9794   -0.1186  0.000139
beta       0.1995    0.1702          45  1.17     0.2472    0.05   -0.1432   0.5423   -0.00018
s2ua       1.2865    0.3109          45  4.14     0.0002    0.05   0.6603    1.9128   -0.00038
s2ut       0.4786    0.1139          45  4.20     0.0001    0.05   0.2492    0.7080   0.00062
a1         0.2483    0.4408          45  0.56     0.5760    0.05   -0.6395   1.1361   -0.00038
a2         0.3328    0.4439          45  0.75     0.4573    0.05   -0.5612   1.2269   0.000093
t1         -0.1962   0.2614          45  -0.75    0.4568    0.05   -0.7227   0.3303   -0.00017
t2         0.4960    0.2627          45  1.89     0.0654    0.05   -0.03301  1.0250   0.000366
Additional Estimates
Label               Estimate  Standard Error  DF  t Value  Pr > |t|  Alpha  Lower    Upper
alpha ELISA         2.7035    0.3278          45  8.25     <.0001    0.05   2.0433   3.3637
theta ELISA         -0.7452   0.2103          45  -3.54    0.0009    0.05   -1.1687  -0.3217
alpha Nephelometry  2.7880    0.3067          45  9.09     <.0001    0.05   2.1704   3.4057
theta Nephelometry  -0.05297  0.2125          45  -0.25    0.8043    0.05   -0.4810  0.3750
Proc NLMIXED for HSROC Model
The shape parameter common to all 3 curves is given by beta.
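The fit statistics reported by NLMIXED can be related to -2 log likelihood and the number of parameters, which helps when comparing models by hand. The sketch below assumes that BIC is penalised by the log of the number of studies (subjects) rather than the number of records, which appears to be what the output reflects. The figure of 37 studies for the bivariate model is inferred from DF = 35 (studies minus two random effects) and is not stated in the output; 47 is the number of studies in the rheumatoid factor analysis.

```python
import math

def information_criteria(minus2ll, n_params, n_studies):
    """Return (AIC, BIC) given -2 log likelihood, parameter count and number of studies."""
    aic = minus2ll + 2 * n_params
    bic = minus2ll + n_params * math.log(n_studies)
    return aic, bic

# Bivariate model with CCP generation: 7 parameters, 37 studies (assumed from DF = 35).
aic_biv, bic_biv = information_criteria(533.4, 7, 37)

# HSROC model with a common shape parameter: 9 parameters, 47 studies.
aic_hsroc, bic_hsroc = information_criteria(753.1, 9, 47)

print(f"Bivariate: AIC {aic_biv:.1f}, BIC {bic_biv:.1f}")
print(f"HSROC:     AIC {aic_hsroc:.1f}, BIC {bic_hsroc:.1f}")
```

The results agree with the printed AIC values and are within rounding error of the printed BIC values, since the -2 log likelihood is itself shown to only one decimal place.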
Example: Rheumatoid Factor as a marker for
Rheumatoid Arthritis:
Method of measurement of RF
LA appears to be less accurate
than N and E whose curves show
very similar accuracy.
Removing a1*rfm1 +a2*rfm2
from the model gave a chi-squared
statistic of 0.6, 2df, P=0.74. Hence,
there is no statistical evidence
that the method of measurement
of RF is associated with
accuracy.
The effect of potentially influential
studies should be investigated.
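The summary curves being compared can be computed from the accuracy and shape estimates. Eliminating the threshold parameter from the HSROC model (with disease status coded as 0.5 and -0.5) gives logit(sensitivity) = alpha*exp(-beta/2) + exp(-beta)*logit(1 - specificity). The sketch below evaluates this at a specificity of 0.90 using the estimates in the output for the common-shape model. The function names are illustrative, and the dictionary keys simply follow the labels printed in the output.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Log odds of a proportion."""
    return math.log(p / (1.0 - p))

def sroc_sensitivity(fpr, alpha, beta):
    """Sensitivity on the HSROC summary curve at a given false positive rate."""
    return expit(alpha * math.exp(-beta / 2.0) + math.exp(-beta) * logit(fpr))

beta = 0.1995  # shape parameter common to all three curves
alphas = {
    "referent category": 2.4552,   # alpha
    "ELISA": 2.7035,               # 'alpha ELISA'
    "Nephelometry": 2.7880,        # 'alpha Nephelometry'
}

for label, alpha in alphas.items():
    sens_at_90 = sroc_sensitivity(0.10, alpha, beta)
    print(f"{label}: sensitivity {sens_at_90:.2f} at specificity 0.90")
```

Because beta is common to all curves, the curves differ only through alpha, so their ordering at any fixed specificity follows the ordering of the alpha estimates.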
Index Test Comparisons
Comparison based on all studies that evaluate one or both tests:
Methods of analysis follow the same approach as already outlined for investigation of heterogeneity
It may be necessary to allow variances of random effects to vary by test.
Such comparisons may be biased due to confounding arising from heterogeneity among studies in terms of design, study quality, setting, etc.
Adjusting for potential confounders is often not feasible because the required information is typically missing or poorly reported.
Index Test Comparisons
Comparison restricted to studies that evaluate both tests:
Restricting the analysis to studies that evaluated both tests in the same patients (truly “paired” studies), or that randomized patients to receive each test, removes the need to adjust for confounders.
Methods of analysis for investigation of heterogeneity are extended to model sensitivity and specificity for both tests within each study (i.e. 2 records for sensitivity and 2 records for specificity per study, with a covariate for test type):
– all studies are analysed as if they are randomised
– this approach is generally conservative
– methods for dealing with pairing of test results within studies are under development
The cross-classification of test results within disease groups for truly paired studies is generally not reported.
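The data layout described above (two sensitivity records and two specificity records per study, with a covariate for test type) can be sketched as follows. The study names, counts and field names (`sens`, `spec`, `true`, `n`, `test`) are hypothetical and chosen only to show the structure; they are not taken from any review.

```python
from typing import Dict, List, Tuple

# Hypothetical 2x2 counts (tp, fp, fn, tn) for studies that evaluated both tests.
studies: Dict[str, Dict[str, Tuple[int, int, int, int]]] = {
    "Study 1": {"A": (45, 5, 5, 95), "B": (40, 10, 10, 90)},
    "Study 2": {"A": (28, 3, 2, 67), "B": (24, 6, 6, 64)},
}

def to_long_format(data) -> List[dict]:
    """Return four records per study: sensitivity and specificity for each test."""
    records = []
    for study_id, tests in data.items():
        for test_name, (tp, fp, fn, tn) in tests.items():
            test = 1 if test_name == "B" else 0  # covariate for test type
            # Sensitivity record: test positives among the diseased.
            records.append({"study_id": study_id, "test": test,
                            "sens": 1, "spec": 0, "true": tp, "n": tp + fn})
            # Specificity record: test negatives among the non-diseased.
            records.append({"study_id": study_id, "test": test,
                            "sens": 0, "spec": 1, "true": tn, "n": tn + fp})
    return records

records = to_long_format(studies)
for row in records:
    print(row)
```

This layout treats the two tests within a study as if they were evaluated in independent groups, which matches the "analysed as if randomised" simplification noted above.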
Example: Comparison of Computed Tomography (CT)
and Ultrasonography (US) for the diagnosis of
appendicitis.
22 studies were included in the review by Terasawa (2004)
12 studies evaluated CT
14 studies evaluated US
4 studies evaluated both CT and US.
Example: Comparison of Computed Tomography (CT)
and Ultrasonography (US) for the diagnosis of
appendicitis.
Analysis based on all studies:
Strong statistical evidence of a difference
in sensitivity and specificity between the
tests (P<0.001)
CT has higher sensitivity and specificity
than US.
Example: Comparison of Computed Tomography (CT)
and Ultrasonography (US) for the diagnosis of
appendicitis.
Analysis based on comparative studies:
CT consistently shows higher sensitivity
than US
Specificity for CT is equal to or greater
than for US
Only 4 studies available for this model.
Convergence is an issue, and simplifying
assumptions may be necessary.
Concluding Remarks
Analyses in RevMan are designed to be descriptive and exploratory.
Hierarchical models provide a more rigorous approach. The bivariate model and the Rutter and Gatsonis HSROC model are the most commonly used.
The choice of model must be informed by the research question and by whether a common threshold for test positivity is used across studies.
Covariates can be included in hierarchical models to investigate heterogeneity.
The results can be input to RevMan for graphical display.
Modelling of test comparisons follows the approach used for the investigation of heterogeneity.
Ideally, comparative meta-analysis should focus on studies that compare tests directly.
A comprehensive list of references is provided in Chapter 10 of the Handbook for DTA Reviews.
Discussion Points (Methods continue to be extended and refined!)
Small number of studies
Convergence issues
Model checking
Data reported at multiple thresholds per study:
• choosing a cutpoint for each study
• methods for analysing multiple 2x2 tables per study:
  Hamza TH, Arends LR, van Houwelingen HC, Stijnen T. Multivariate random effects meta-analysis of diagnostic tests with multiple thresholds. BMC Medical Research Methodology 2009; 9: 73. DOI: 10.1186/1471-2288-9-73. Published 10 Nov 2009.
Other?