A Simulation Study Assessing the Effect of Missing Data on ...

Introduction

o Daily diaries are used in clinical trials and observational studies to record patient health-related quality of life (HRQOL) over a time-limited recall period (eg, 24 hours) across multiple days (ref. 1)
o Useful for symptomatic conditions such as pain, because of day-to-day variability
o Applicable when day-to-day recall may be affected (eg, by mood or health state)
o The burden of daily reporting can lead to missing data and biased interpretation
o Typically, methodologists choose a minimum number of non-missing responses (days) to calculate an average score (eg, 4-5 days to generate a weekly average)
o Longitudinal study factors may influence the amount of missing data: 1) more severe patients may have an increased likelihood of missing diary entries (Table 1), and 2) missing data tend to increase as a trial progresses (refs. 2, 3)

Griffiths P,1 Floden E,2 Doll H,1 Morris M,2 Hudgens S2

1Clinical Outcomes Solutions, Folkestone, UK; 2Clinical Outcomes Solutions, Tucson, AZ, USA.

Results

o Reliability bias increased as the proportion of missing data increased
o Bias was higher for MNAR than for MAR
o When participants have up to 4 days of missing data:
   o MNAR or MAR: reliability estimates were less biased than complete case analysis
o When participants have 5 days of missing data:
   o MAR: reliability estimates were comparable to complete case analysis
   o MNAR: reliability estimates were more biased than complete case analysis
o In terms of loss of power relative to the fully observed data, MDA assuming MNAR or MAR mostly performed better than CCA (Table 2)
   o Over 90% of MDA samples reached the critical threshold (≥0.70) with 4 days missing
   o Lower 95% CIs were mostly <0.70 for all missingness scenarios with >1 day missing

Conclusions

• In a simulation based on real clinical trial pain scores, where the ICC was approximately 0.8, reliability was either comparable or less biased when weekly averages were calculated from available data, irrespective of how many days were present, compared with a complete case analysis
• When at least 3 days of data are present, more reliable estimates are obtained by using all available data to create participant mean scores
• As few as two days of data could be justified in the right context
   • For example, when only 10% or 20% of the population have missing data
• When 50% of the sample had missing data, most MDA point estimates achieved reliability (≥0.70), but the lower 95% CI of the ICC was poor (<0.70)
   • This was an extreme case (50% of the population with missing data)

Implications

• Calculating a weekly average pain score using all available data maintains power compared with complete case analysis without adversely affecting instrument reliability
• Using all available patient data is more in line with intention-to-treat principles

Psychometric Properties in the Face of Missing Data
A Simulation Study Assessing the Effect of Missing Data on Test-Retest Reliability in Diary Studies

Methods

Simulation of test-retest data

o A correlation matrix was designed from real trial data of pain recorded on an 11-point NRS
o 1000 datasets of N = 100
o Time 1 data: each "participant" had 7 days of integer pain scores on a 0–10 scale
o Time 2 data: a second timepoint was simulated by adding a random value with mean = 1 and SD = 1 to each day of each participant's score
o All scores were rounded to the nearest integer
o Mean (SD) ICC over 1000 datasets = 0.82 (0.024)
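The simulation steps above could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the exchangeable correlation matrix (`rho`), the daily mean and SD, and the clipping of scores to the 0–10 range are all assumptions, because the poster does not report the trial-derived correlation matrix or how out-of-range values were handled.

```python
import numpy as np

def simulate_test_retest(n=100, days=7, rho=0.7, mean=5.0, sd=2.0, seed=0):
    """Simulate Time 1 and Time 2 daily pain scores as integers on 0-10."""
    rng = np.random.default_rng(seed)
    # Hypothetical exchangeable correlation matrix (assumption): the poster
    # used a matrix derived from real trial data, which is not reported.
    corr = np.full((days, days), rho)
    np.fill_diagonal(corr, 1.0)
    cov = corr * sd ** 2
    time1 = rng.multivariate_normal(np.full(days, mean), cov, size=n)
    # Round to the nearest integer; clipping to 0-10 is an assumption.
    time1 = np.clip(np.rint(time1), 0, 10)
    # Time 2: add a random value with mean 1 and SD 1 to each daily score.
    time2 = time1 + rng.normal(loc=1.0, scale=1.0, size=time1.shape)
    time2 = np.clip(np.rint(time2), 0, 10)
    return time1, time2

# One dataset of N = 100; the poster repeated this for 1000 datasets,
# which here would correspond to looping over 1000 different seeds.
time1, time2 = simulate_test_retest()
```

Varying `seed` from 0 to 999 would give the 1000 replicate datasets; the resulting ICC depends on the assumed parameters and is not expected to reproduce the reported 0.82.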

Missingness

o Missingness was created in the dataset for MCAR, MAR and MNAR mechanisms (described in Table 1)
o 10%, 20%, 30%, 40% and 50% of the sample had missing data assigned, according to the missingness mechanism
o 1 day, 2 days, 3 days, 4 days and 5 days of data were deleted according to the missingness mechanism
o 7 days of deleted data (complete case analysis) was also used as a comparator
o Thus, a new dataset was created for each scenario (eg, 10% of the population missing 1 day, 10% missing 2 days, etc.)

Testing reliability in Weekly Mean Scores

o A reliability index (the intraclass correlation [ICC], Box 1) and associated confidence interval (CI) were averaged over all samples for the fully observed data and all missingness scenarios:
   1. For each missing data scenario, the difference between the fully observed data ICC and the missing data scenario ICC was calculated
   2. Equivalence of the ICC between the fully observed data simulation and each missingness condition was tested at 0.05 and 0.10 ICC units difference (Figure 1)
   3. Number of samples achieving a reliability estimate of ≥0.70 (Table 2)
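The three summary steps above might be expressed along these lines. This is only a sketch under stated assumptions: the poster does not say how equivalence was tested, so the rule used here (the central 95% of per-sample ICC differences must lie inside the margin) is a stand-in, and the function name and the four-sample input arrays are hypothetical illustration values, not study results.

```python
import numpy as np

def summarise_scenario(icc_full, icc_scenario, lower_ci_scenario,
                       margins=(0.05, 0.10), threshold=0.70):
    """Summarise one missingness scenario against the fully observed data."""
    icc_full = np.asarray(icc_full, dtype=float)
    icc_scenario = np.asarray(icc_scenario, dtype=float)
    lower_ci_scenario = np.asarray(lower_ci_scenario, dtype=float)

    # Step 1: per-sample difference from the fully observed ICC.
    diff = icc_scenario - icc_full

    # Step 2: equivalence at each margin. ASSUMED rule: the 2.5th and 97.5th
    # percentiles of the differences must both fall inside (-margin, +margin).
    lo, hi = np.percentile(diff, [2.5, 97.5])
    equivalent = {m: bool(lo > -m and hi < m) for m in margins}

    # Step 3: proportion of samples reaching the reliability threshold.
    return {
        "mean_difference": float(diff.mean()),
        "equivalent": equivalent,
        "prop_estimate_ok": float(np.mean(icc_scenario >= threshold)),
        "prop_lower_ci_ok": float(np.mean(lower_ci_scenario >= threshold)),
    }

# Hypothetical per-sample values for four simulated samples.
summary = summarise_scenario(
    icc_full=[0.82, 0.80, 0.84, 0.81],
    icc_scenario=[0.80, 0.77, 0.82, 0.74],
    lower_ci_scenario=[0.72, 0.66, 0.74, 0.58],
)
```

In the study itself each array would hold 1000 values, one per simulated dataset.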

Background

o Previously, we assessed the bias of mean diary scores in the presence of missing observations (ref. 4):
   o Complete case analysis (CCA) had larger bias than missing data analysis:
      o For more than 50% of CCA samples, the estimate bias was larger than 0.50 standard deviations (SD) of the true mean
   o Missing data analysis (MDA), using estimates derived from all available records, led to less bias:
      o For all MDA samples, the estimate bias was smaller than 0.20 SDs of the true mean
   o The false positive rate (Type 1 error) when comparing group differences was controlled for both CCA and MDA, regardless of the number of missing diary days
   o Compared with fully observed data, power to detect a difference between group mean scores (1 - Type 2 error) was reduced in the CCA (>15% lower) but not in the MDA (all mechanisms: <2% lower)
o Therefore, excluding patients with missing data:
   o reduces power compared with using all available data, and
   o leads to larger biases in score estimates
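To make the CCA/MDA distinction concrete, the two scoring rules can be illustrated as below. This is a hedged sketch: the function names, the `min_days` parameter and the three-participant diary are hypothetical, chosen only to show how the same data yield different analysable samples under each rule.

```python
import numpy as np

def weekly_mean_available(diary, min_days=1):
    """MDA-style score: mean of observed days; NaN if fewer than min_days."""
    diary = np.asarray(diary, dtype=float)
    observed = (~np.isnan(diary)).sum(axis=1)   # observed days per participant
    totals = np.nansum(diary, axis=1)           # sum over observed days only
    means = np.full(diary.shape[0], np.nan)
    ok = observed >= max(min_days, 1)
    means[ok] = totals[ok] / observed[ok]
    return means

def weekly_mean_complete_case(diary):
    """CCA-style score: only participants with every day observed are scored."""
    diary = np.asarray(diary, dtype=float)
    return weekly_mean_available(diary, min_days=diary.shape[1])

# Hypothetical diary: rows are participants, columns are 7 diary days.
diary = np.array([
    [4, 5, 6, 5, 4, 5, 6],
    [7, np.nan, 8, np.nan, 9, 8, 8],
    [2, 2, np.nan, np.nan, np.nan, np.nan, np.nan],
])
mda = weekly_mean_available(diary, min_days=3)  # scores participants 1 and 2
cca = weekly_mean_complete_case(diary)          # scores participant 1 only
```

Here the available-data rule retains the second participant (5 observed days), whereas the complete case rule discards them, which is the source of the power loss described above.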

| Mechanism | Description | Simulation Technique |
| --- | --- | --- |
| Missing Completely at Random (MCAR) | Not related to health status (eg, forgot diary) | 1. Participants chosen at random to receive missing data. 2. Days selected for deletion had equal weighting |
| Missing at Random (MAR) | Related to other (observed) data (eg, health status gets worse immediately prior to the missing event) | 1. Participants chosen to receive missing data weighted by severity. 2. Days more likely to be deleted weighted by the next most recent score |
| Missing Not at Random (MNAR) | Related to the unobserved (ie, missing) variable (eg, missing due to severity: patient too ill to complete) | 1. Participants chosen to receive missing data weighted by severity. 2. Days more likely to be deleted weighted by the severity of each day's score |

Table 1 Missingness Mechanisms
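One possible reading of the simulation techniques in Table 1 is sketched below. Several details are assumptions rather than the authors' implementation: "severity" is taken as each participant's mean score, the "next most recent score" is interpreted as the previous day's score (with the first day reusing its own score), and the function name, the small constant `eps` and the random demonstration data are hypothetical.

```python
import numpy as np

def apply_missingness(scores, mechanism, prop, n_days, rng):
    """Return a float copy of `scores` (participants x days) with NaNs added."""
    scores = np.asarray(scores, dtype=float)
    out = scores.copy()
    n, d = out.shape
    n_affected = int(round(prop * n))
    eps = 1e-6  # keeps every sampling weight strictly positive

    # Step 1: choose participants (uniform for MCAR, severity-weighted otherwise).
    if mechanism == "MCAR":
        p_person = np.full(n, 1.0 / n)
    elif mechanism in ("MAR", "MNAR"):
        severity = scores.mean(axis=1) + eps  # assumed definition of severity
        p_person = severity / severity.sum()
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    chosen = rng.choice(n, size=n_affected, replace=False, p=p_person)

    # Step 2: choose which days to delete for each selected participant.
    for i in chosen:
        if mechanism == "MCAR":
            w = np.ones(d)                      # equal weighting
        elif mechanism == "MAR":
            # Weight by the previous day's score (assumed interpretation);
            # day 0 has no predecessor, so it reuses its own score.
            w = np.concatenate(([scores[i, 0]], scores[i, :-1])) + eps
        else:  # MNAR: weight by the score that is itself deleted
            w = scores[i] + eps
        days = rng.choice(d, size=n_days, replace=False, p=w / w.sum())
        out[i, days] = np.nan
    return out

rng = np.random.default_rng(1)
demo = rng.integers(0, 11, size=(100, 7))  # hypothetical 0-10 diary scores
with_gaps = apply_missingness(demo, "MNAR", prop=0.5, n_days=4, rng=rng)
```

Calling the function over every combination of mechanism, proportion (10% to 50%) and number of deleted days (1 to 5) would produce one dataset per scenario.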

Objectives

o To understand the impact of the type of missingness (MCAR, MAR and MNAR) on reliability in a simulated test-retest study using a pain numeric rating scale (NRS). Specifically, we assess:
   1. Bias of reliability estimates between complete case analysis and missing data analysis (assigned under three missingness mechanisms)
   2. The loss of power of reliability estimates between fully observed data and both complete case analysis and missing data analysis

Figure 2 ICCs of 100 Individual Samples compared with Fully Observed Data and MCAR
[Figure panels: Missing at Random and Missing Not at Random, each at 2 Days Missing, 4 Days Missing and 5 Days Missing]

Figure 3 Complete Case Analysis

| Days Missing | MAR: Estimate | MAR: Lower CI | MNAR: Estimate | MNAR: Lower CI |
| --- | --- | --- | --- | --- |
| Fully Observed Data | 100% | 88.4% | 100% | 88.4% |
| 1 Day | 100% | 81.2% | 100% | 82.5% |
| 2 Days | 100% | 69.1% | 99.9% | 68.3% |
| 3 Days | 99.6% | 50.4% | 99.4% | 42.3% |
| 4 Days | 97.1% | 27.0% | 92.9% | 12.5% |
| 5 Days | 88.3% | 11.6% | 52.8% | 1.1% |
| 7 Days (complete case analysis) | 57.6% | 7.5% | 57.6% | 7.5% |

Table 2 Proportion of Samples Achieving ≥0.70 Reliability (50% Missing Scenario)

Waterfall plots showing reliability (ICC)
• BLUE LINE: fully observed data
• RED LINE: MCAR samples
• BARS represent each sample

Data show that reliability estimates for MAR and MNAR data at 4 days missing are less biased than CCA
MAR data at 5 days missing are similar to CCA

Poster presented at the International Society for Quality of Life Research 25th Annual Conference, 24th–27th October 2018, Dublin, Ireland

References:
(1) Coons SJ, Eremenco S, Lundy JJ, O’Donohoe P, O’Gorman H, & Malizia W (2015). Capturing patient-reported outcome (PRO) data electronically: the past, present, and promise of ePRO measurement in clinical trials. The Patient-Patient-Centered Outcomes Research, 8(4), 301-309.
(2) Fairclough DL (2010). Design and analysis of quality of life studies in clinical trials. Chapman and Hall/CRC.
(3) Seitz C, Lanius V, Lippert S, Gerlinger C, Haberland C, Oehmke F, & Tinneberg HR (2018). Patterns of missing data in the use of the endometriosis symptom diary. BMC Women's Health, 18(1), 88.
(4) Griffiths P, Floden E, & Hudgens S (2017). Scoring and Interpretation of Daily Diary Data in the Presence of Non-ignorable Missing Data. International Society for Quality of Life Research.
(5) Bell ML, & Fairclough DL (2014). Practical and statistical issues in missing data for longitudinal patient-reported outcomes. Statistical Methods in Medical Research, 23(5), 440-459.
(6) McGraw KO, & Wong SP (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30.

Figure 1 Mean ICC Differences from Fully Observed Data
[Figure panels: Missing at Random; Missing Not at Random. Axis: average ICC difference from fully observed data]

X Axis – Samples ordered by reliability (ICC) from lowest to highest

Box 1 ICC(2,1) Formula for Agreement (ref. 6)

\[
\mathrm{ICC} = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2 + \sigma_e^2}
\]

where \(\sigma_B^2\) = between-subjects variance, \(\sigma_W^2\) = within-subjects variance and \(\sigma_e^2\) = error variance
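For readers who want to compute the index in Box 1, a minimal sketch follows. It uses the standard mean-squares estimator of the two-way random effects, absolute agreement, single-measure ICC described by McGraw and Wong; treating the occasion (column) variance as the within-subjects term of Box 1 is an interpretation, and the function name and toy data are hypothetical.

```python
import numpy as np

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is a subjects x occasions array (eg, weekly means at Time 1 and
    Time 2). Subjects with any missing occasion are dropped.
    """
    x = np.asarray(scores, dtype=float)
    x = x[~np.isnan(x).any(axis=1)]
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)   # per-subject means
    col_means = x.mean(axis=0)   # per-occasion means

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = x - row_means[:, None] - col_means[None, :] + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Toy data: identical occasions agree perfectly; a constant shift of 1
# lowers agreement even though the correlation is still perfect.
t1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
perfect = icc_2_1(np.column_stack([t1, t1]))
shifted = icc_2_1(np.column_stack([t1, t1 + 1.0]))
```

The shifted example shows why an agreement ICC is sensitive to a systematic difference between timepoints, which is relevant when the retest scores are simulated with a nonzero mean change.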
