Introduction
o Daily diaries are used in clinical trials and observational studies to record
patient health-related quality of life (HRQOL) over a time-limited recall period (eg,
24 hours) across multiple days (1)
o Useful for symptomatic conditions such as pain, where symptoms vary from day to day
o Applicable when recall may be affected by day-to-day factors (eg, mood or health state)
o The burden of daily report can lead to missing data and biased interpretation
o Typically methodologists choose a minimum number of non-missing responses
(days) to calculate an average score (eg, 4-5 days to generate a weekly average)
o Longitudinal study factors may influence the amount of missing data: 1) more
severe patients may have an increased likelihood of missing diary entries (Table 1)
and 2) missing data tends to increase as a trial progresses (2,3)
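The minimum-days rule described above can be sketched as follows. This is a minimal illustration, not the poster's scoring code: the function name `weekly_average`, the use of `None` for a missed entry, and the example week are all assumptions.

```python
from statistics import mean
from typing import Optional, Sequence

def weekly_average(scores: Sequence[Optional[float]], min_days: int = 4) -> Optional[float]:
    """Mean of the non-missing daily scores, or None if too few days are present."""
    observed = [s for s in scores if s is not None]
    if len(observed) < min_days:
        return None  # week is treated as missing under the chosen rule
    return mean(observed)

# Hypothetical diary week on a 0-10 scale; None marks a missed entry.
week = [5, None, 6, 4, None, 7, 5]
print(weekly_average(week, min_days=4))  # 5 days observed, so a mean is returned
print(weekly_average(week, min_days=6))  # fewer than 6 days observed, so None
```

Lowering `min_days` keeps more participants in the analysis; the rest of the poster examines what that does to reliability.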
Griffiths P,1 Floden E,2 Doll H,1 Morris M,2 Hudgens S2
1Clinical Outcomes Solutions, Folkestone, UK; 2Clinical Outcomes Solutions, Tucson, AZ, USA.
Results
o Reliability bias increased as the proportion of missing data increased
o Higher bias for MNAR compared with MAR
o When participants have up to 4 days of missing data:
  o MNAR or MAR: Reliability estimates were less biased than complete case analysis
o When participants have 5 days of missing data:
  o MAR: Reliability estimates were comparable to complete case analysis
  o MNAR: Reliability estimates were more biased than complete case analysis
o Loss of power relative to the fully observed data: MDA under MNAR and MAR
mostly performed better than CCA (Table 2)
o Over 90% of MDA samples reached the critical threshold (>0.70) with 4 days missing
o Lower 95% CIs were mostly <0.70 for all missingness scenarios with >1 day missing
Conclusions
• For a simulation of real clinical trial pain scores where the ICC was approximately 0.8, reliability was either comparable or less biased when calculating weekly averages from available data, irrespective of how many days were present, compared with a complete case analysis
• When at least 3 days of data are present, more reliable estimates are obtained from using all available data to create participant mean scores
• As few as two days of data could be justified in the right context
  • For example, when only 10% or 20% of the population have missing data
• When 50% of the sample was missing data, although most MDA point estimates achieved reliability (>0.70), the lower 95% CI of the ICC was poor (<0.70)
  • This was for an extreme case (50% of population with missing data)
Implications
• Calculating a weekly average pain score utilising all available data maintains power compared with complete case analysis without adversely affecting instrument reliability
• Utilising all available patient data is more in line with Intention to Treat principles
Psychometric Properties in the Face of Missing Data
A Simulation Study Assessing the Effect of Missing Data on Test-Retest Reliability in Diary Studies
Methods
Simulation of test-retest data
o Correlation matrix was designed from real trial data of pain recorded on an 11-point NRS
o 1000 datasets of N = 100
o Time 1 data: Each “participant” had 7 days of integer pain scores on a 0–10 scale
o Time 2 data: a second timepoint was simulated by adding a random value with mean=1 and SD=1 to each day of each participant's score
o All scores were rounded to the nearest integer
o Mean (SD) ICC over 1000 datasets = 0.82 (0.024)
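A rough sketch of this simulation is given below. The trial-derived correlation matrix is not reported here, so the sketch substitutes a participant-level severity plus independent daily noise; the severity distribution (mean 5, SD 2), the daily noise SD, the clipping of scores to 0–10 and the names `clip` and `simulate_dataset` are all assumptions made for illustration.

```python
import random

def clip(score, low=0, high=10):
    """Keep a score on the 0-10 NRS (clipping is an assumption of this sketch)."""
    return max(low, min(high, score))

def simulate_dataset(n=100, days=7, seed=None):
    """Return (time1, time2): two lists of n participants x `days` integer scores."""
    rng = random.Random(seed)
    time1, time2 = [], []
    for _ in range(n):
        # Assumed stand-in for the trial-derived correlation matrix:
        # one severity level per participant plus independent day-to-day noise.
        severity = rng.gauss(5.0, 2.0)
        t1 = [clip(round(severity + rng.gauss(0.0, 1.0))) for _ in range(days)]
        # Time 2: add a random value with mean=1 and SD=1 to each day, then round.
        t2 = [clip(round(score + rng.gauss(1.0, 1.0))) for score in t1]
        time1.append(t1)
        time2.append(t2)
    return time1, time2

# 1000 datasets of N = 100, one seed per dataset so each is reproducible.
datasets = [simulate_dataset(n=100, days=7, seed=s) for s in range(1000)]
print(len(datasets), len(datasets[0][0]), len(datasets[0][0][0]))
```

Because the correlation structure is assumed, this sketch will not necessarily reproduce the reported mean ICC of 0.82.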
Missingness
o Missingness was created in the dataset for
MCAR, MAR and MNAR mechanisms (described
in Table 1)
o 10%, 20%, 30%, 40% and 50% of the sample
had missing data assigned, according to the
missing mechanism
o 1 day, 2 days, 3 days, 4 days and 5 days of
data were deleted according to the missing
mechanism
o 7 days of deleted data (complete case
analysis) was also used as a comparator
o Thus, for each scenario a new dataset was
created (eg 10% of population missing 1
day, 10% missing 2 days etc.)
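The scenario grid implied by this list can be enumerated as in the sketch below. The variable names are illustrative, and the complete case comparator is left out because it is handled separately.

```python
from itertools import product

MECHANISMS = ("MCAR", "MAR", "MNAR")
PROPORTIONS = (0.10, 0.20, 0.30, 0.40, 0.50)  # share of the sample assigned missing data
DAYS_MISSING = (1, 2, 3, 4, 5)                # diary days deleted per affected participant

# One scenario per combination of mechanism, proportion and number of deleted days.
scenarios = list(product(MECHANISMS, PROPORTIONS, DAYS_MISSING))
print(len(scenarios))  # 3 mechanisms x 5 proportions x 5 day counts
```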
Testing reliability in Weekly Mean Scores
o A reliability index (the intraclass correlation [ICC], Box 1) and associated confidence interval (CI) were averaged over all samples for the fully observed data and all missingness scenarios:
1. For each missing data scenario, the difference between the fully observed data ICC and the missing data scenario ICC was calculated
2. Equivalence of the ICC between the fully observed data simulation and each missingness condition was tested at 0.05 and 0.10 ICC units difference (Figure 1)
3. The number of samples achieving a reliability estimate of ≥0.70 was counted (Table 2)
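The three summaries above could be computed along the following lines. The equivalence procedure itself is not specified in the text, so the sketch only checks whether the mean ICC difference falls inside each margin; it is not a formal equivalence test. The function name `summarise_scenario` and the example ICC values are hypothetical.

```python
from statistics import mean

def summarise_scenario(full_iccs, scenario_iccs, margins=(0.05, 0.10), threshold=0.70):
    """Summarise one missingness scenario against the fully observed data."""
    # 1. Difference between scenario and fully observed mean ICC.
    diff = mean(scenario_iccs) - mean(full_iccs)
    # 2. Simple margin check (assumed stand-in for the equivalence test).
    within = {m: abs(diff) <= m for m in margins}
    # 3. Proportion of samples whose ICC reaches the reliability threshold.
    prop = sum(icc >= threshold for icc in scenario_iccs) / len(scenario_iccs)
    return {"mean_difference": diff, "within_margin": within, "prop_reaching_threshold": prop}

# Hypothetical per-sample ICCs, for illustration only.
full = [0.80, 0.82, 0.84]
scenario = [0.75, 0.69, 0.78]
print(summarise_scenario(full, scenario))
```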
Background
o Previously, we assessed bias of mean diary scores including missing
observations (4):
  o Complete case analysis (CCA) had larger bias than missing data analysis:
    o For more than 50% of CCA samples, the estimate bias was larger than 0.50
    standard deviations (SD) of the true mean
  o Missing data analysis (MDA), using estimates derived from all available records,
  led to less bias:
    o For all MDA samples, the estimate bias was smaller than 0.20 SDs of the
    true mean
o False positive rate (Type 1 error) comparing group differences for the CCA or MDA
was controlled regardless of the number of missing diary days
o Compared with fully observed data, power to detect a difference between group
mean scores (1-Type 2 error) was reduced in the CCA (>15% lower) but not in the
MDA (all mechanisms: <2% lower)
o Therefore, excluding patients with missing data:
  o reduces power compared with using all available data, and
  o leads to larger biases in score estimates
Mechanism: Missing Completely at Random (MCAR)
Description: Not related to health status (eg, forgot diary)
Simulation technique:
1. Participants chosen at random to receive missing data
2. Days selected for deletion had equal weighting

Mechanism: Missing at Random (MAR)
Description: Related to other (observed) data (eg, health status gets worse immediately prior to the missing event)
Simulation technique:
1. Participants chosen to receive missing data weighted by severity
2. Days more likely to be deleted weighted by the next most recent score

Mechanism: Missing Not at Random (MNAR)
Description: Related to the unobserved (ie, missing) variable (eg, missing due to severity – patient too ill to complete)
Simulation technique:
1. Participants chosen to receive missing data weighted by severity
2. Days more likely to be deleted weighted by the severity of each day's score

Table 1 Missingness Mechanisms
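The simulation techniques in Table 1 can be sketched as weighted sampling without replacement. Several details are assumptions, since the exact weights are not given: participant severity is taken as the mean weekly score, every weight has 1 added so that a score of 0 can still be selected, and for MAR the "next most recent score" is read as the previous day's score (with day 1 borrowing day 2). The names `weighted_sample` and `apply_missingness` are illustrative.

```python
import random

def weighted_sample(items, weights, k, rng):
    """Draw k distinct items with probability proportional to weight."""
    items, weights = list(items), list(weights)
    chosen = []
    for _ in range(k):
        pick = rng.choices(range(len(items)), weights=weights, k=1)[0]
        chosen.append(items.pop(pick))
        weights.pop(pick)
    return chosen

def apply_missingness(data, mechanism, prop, n_days, rng):
    """Return a copy of `data` (participants x 7 days) with deleted days set to None."""
    n = len(data)
    if mechanism == "MCAR":
        p_weights = [1.0] * n  # participants chosen at random
    else:
        # MAR and MNAR: weighted by severity (assumed: mean score + 1).
        p_weights = [sum(row) / len(row) + 1 for row in data]
    out = [list(row) for row in data]
    for i in weighted_sample(range(n), p_weights, round(prop * n), rng):
        row = data[i]
        if mechanism == "MCAR":
            d_weights = [1.0] * len(row)  # equal weighting of days
        elif mechanism == "MAR":
            # Assumed reading: weight each day by the previous day's (observed) score.
            d_weights = [(row[d - 1] if d > 0 else row[1]) + 1 for d in range(len(row))]
        else:
            # MNAR: weight each day by its own (to-be-deleted) score.
            d_weights = [score + 1 for score in row]
        for d in weighted_sample(range(len(row)), d_weights, n_days, rng):
            out[i][d] = None
    return out

rng = random.Random(0)
toy = [[(i + d) % 11 for d in range(7)] for i in range(20)]  # hypothetical scores
mnar = apply_missingness(toy, "MNAR", prop=0.5, n_days=3, rng=rng)
print(sum(None in row for row in mnar))  # participants with missing data
```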
Objectives
o To understand the impact of the type of missingness (MCAR, MAR and MNAR) on
reliability in a simulated test-retest study using a pain numeric rating scale (NRS).
Specifically, we assess:
1. Bias of reliability estimates between complete case analysis and missing data
analysis (assigned under three missingness mechanisms)
2. The loss of power of reliability estimates between fully observed data and
both complete case analysis and missing data analysis
Figure 2 ICCs of 100 Individual Samples compared with Fully Observed Data and MCAR
Panel columns: Missing Not at Random, Missing at Random. Panel rows: 2 Days Missing, 4 Days Missing, 5 Days Missing
Figure 3 Complete Case Analysis
Days missing | MAR: Estimate | MAR: Lower CI | MNAR: Estimate | MNAR: Lower CI
Fully observed data | 100% | 88.4% | 100% | 88.4%
1 Day | 100% | 81.2% | 100% | 82.5%
2 Days | 100% | 69.1% | 99.9% | 68.3%
3 Days | 99.6% | 50.4% | 99.4% | 42.3%
4 Days | 97.1% | 27.0% | 92.9% | 12.5%
5 Days | 88.3% | 11.6% | 52.8% | 1.1%
7 Days (complete case analysis) | 57.6% | 7.5% | 57.6% | 7.5%
Table 2 Proportion of Samples Achieving ≥0.70 Reliability (50% Missing Scenario)
Waterfall plots showing reliability (ICC)
• BLUE LINE: fully observed data
• RED LINE: MCAR samples
• BARS represent each sample
Data show that reliability estimates for MAR and MNAR data at 4 Days missing are less biased than CCA
MAR data at 5 Days missing are similar to CCA
Poster presented at the International Society for Quality of Life Research 25th Annual Conference, 24th–27th October 2018, Dublin, Ireland
References:
(1) Coons SJ, Eremenco S, Lundy JJ, O’Donohoe P, O’Gorman H, & Malizia W (2015). Capturing patient-reported outcome (PRO) data electronically: the past, present, and promise of ePRO measurement in clinical trials. The Patient - Patient-Centered Outcomes Research, 8(4), 301-309.
(2) Fairclough DL (2010). Design and analysis of quality of life studies in clinical trials. Chapman and Hall/CRC.
(3) Seitz C, Lanius V, Lippert S, Gerlinger C, Haberland C, Oehmke F, & Tinneberg HR (2018). Patterns of missing data in the use of the endometriosis symptom diary. BMC women's health, 18(1), 88.
(4) Griffiths P, Floden E, & Hudgens S (2017). Scoring and Interpretation of Daily Diary Data in the Presence of Non-ignorable Missing Data. International Society for Quality of Life Research.
(5) Bell ML, & Fairclough DL (2014). Practical and statistical issues in missing data for longitudinal patient-reported outcomes. Statistical methods in medical research, 23(5), 440-459.
(6) McGraw KO, & Wong SP (1996). Forming inferences about some intraclass correlation coefficients. Psychological methods, 1(1), 30.
Figure 1 Mean ICC Differences from Fully Observed Data
Panels: Missing at Random; Missing Not at Random. Axis: Average ICC Difference from Fully Observed Data
X Axis – Samples ordered by reliability (ICC) from lowest to highest
Box 1 ICC(2,1) Formula for Agreement (6)
$$ICC = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2 + \sigma_e^2}$$
Where $\sigma_B^2$ = between-subjects variance, $\sigma_W^2$ = within-subjects variance and $\sigma_e^2$ = error variance
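One way to estimate the ICC in Box 1 from test-retest weekly means is the mean-squares form of the two-way random-effects, absolute-agreement, single-measure ICC described by McGraw and Wong. The sketch below assumes that form corresponds to the variance ratio in Box 1; the function name `icc_2_1` and the toy data are illustrative and not taken from the poster.

```python
def icc_2_1(rows):
    """Two-way random-effects, absolute-agreement, single-measure ICC.

    rows: one list per participant holding k scores (eg, the weekly mean at
    Time 1 and the weekly mean at Time 2).
    """
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between timepoints
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_err = ss_total - ss_rows - ss_cols                   # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Toy weekly means (Time 1, Time 2) for three hypothetical participants.
print(icc_2_1([[1, 1], [2, 2], [3, 3]]))  # identical scores at both timepoints
print(icc_2_1([[1, 2], [2, 3], [3, 4]]))  # constant shift of 1 lowers agreement
```

Because this is an agreement coefficient, a systematic shift between timepoints reduces the ICC even when the rank order of participants is preserved.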