Data Integrity Michelle A. Detry Department of Biostatistics and Medical Informatics University of Wisconsin - Madison ICTR Short Course – June 9, 2010




Data Integrity

•  What is Data Integrity?

•  Learning objective is to “Maintain the integrity of data when collecting, recording, analyzing and reporting it”

•  Success of research depends on data collected

•  Need to collect quality data to assure integrity of the results

•  Depends on careful attention to detail, from planning until publication

•  Analyses at end of study cannot “fix” data quality


Data Quality

•  Two fundamental measures of data quality

– Completeness

– Accuracy

•  Poor data quality can result in bias and increased variability

•  Bias – systematic error that would result in erroneous conclusions given a sufficiently large sample

•  Increased variability – decreases power for detecting differences between groups or increases uncertainty in differences identified
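The effect of increased variability on power can be sketched numerically. The following is a minimal illustration, assuming a two-sided two-sample z-test on a difference in means with equal group sizes and a known standard deviation; the effect size, standard deviations, and sample size are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def power_two_sample(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test for a mean difference.

    delta: true difference in means (hypothetical)
    sigma: standard deviation of the outcome in each group (hypothetical)
    """
    z = NormalDist()
    se = sigma * (2 / n_per_group) ** 0.5      # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)          # critical value, about 1.96 for alpha = 0.05
    # The tiny probability of rejecting in the wrong direction is ignored.
    return 1 - z.cdf(z_crit - delta / se)


# Same true difference and sample size; only the noise in the data changes.
for sigma in (10, 12, 15):
    print(f"sigma = {sigma:>2}: power = {power_two_sample(5, sigma, 64):.2f}")
```

With the true difference and sample size held fixed, power falls as the standard deviation grows, which is the sense in which noisy data cannot be repaired at the analysis stage.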


Data Quality

• Close attention must be paid to data collection process and design of data collection forms

•  Primary goal should be to minimize potential for bias through completeness of data


Data Collection

• Needs to be thoroughly planned before study begins

• Difficult to make changes after study begins

• How much data do you collect?

– Not so much that it is a burden on the study team

– But enough to answer the study question


Data Collection

• Need to have well specified study objectives (primary, secondary, exploratory) before study begins

• Need to have well defined outcomes

• Careful planning will allow for collection of enough data to answer questions


Data Collection – Keys to Success

• Defining variables and recordable events clearly

• Communicating these definitions clearly to research nurses, clinic personnel, data managers, and statistical analysts

• Developing strategies for consistent data collection


Data Collection – Issues to Consider

• What data can we collect…where…and when?

– During Clinic Visit?

– During Surgery?

– Data from Pathology and Radiology Reports?

– Data associated with outside treatments and previous disease events?


Data Collection – Issues to Consider

•  Who is responsible for collecting and recording each type of data?

•  Who is responsible for quality control of data?

•  Which data sources override one another?

•  What is the frequency with which data are reviewed?


Data Collection – Case Report Forms (CRFs)

•  Well designed forms are crucial

•  Must be clear and easy to use

•  Can include directions, but prefer forms to be self-explanatory

•  Consistency in how forms are filled out is crucial

•  Consider testing forms prior to implementation


Data Collection – Case Report Forms (CRFs)

•  Each subject should have a unique study id which is on every form

•  Data entered in coded fields with boxes to check for appropriate categories

•  Specify whether more than one box can be checked

•  For a yes/no variable, include boxes for yes AND no


Data Collection – Case Report Forms (CRFs)

•  Date formats should be clearly specified

– DD-MON-YYYY or MM/DD/YYYY or DD/MM/YYYY

•  Open ended text fields are problematic

•  Missing data – is it missing, was it not done, was it not applicable?

•  In addition to outcome yes/no collect dates if applicable
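The points about date formats and missing data can be sketched as a data-entry check. This is only an illustration: the DD-MON-YYYY choice, the missing-data codes (ND, NA, UNK), and the function name are hypothetical, not part of any particular CRF system.

```python
from datetime import datetime
from enum import Enum


class MissingReason(Enum):
    """Hypothetical codes that say WHY a value is absent."""
    NOT_DONE = "ND"
    NOT_APPLICABLE = "NA"
    UNKNOWN = "UNK"


# One format for the whole study: DD-MON-YYYY, e.g. 09-JUN-2010.
# Note: %b matches month abbreviations in the current locale (English assumed here).
CRF_DATE_FORMAT = "%d-%b-%Y"


def parse_crf_date(raw):
    """Return a date, a MissingReason, or raise ValueError for anything else."""
    text = raw.strip().upper()
    for reason in MissingReason:
        if text == reason.value:
            return reason
    if text == "":
        # A blank field is ambiguous: missing, not done, or not applicable?
        raise ValueError("blank field: enter a date or a missing-data code")
    # strptime raises ValueError when the text does not match the format,
    # so an entry such as 06/09/2010 is rejected instead of being guessed at.
    return datetime.strptime(text, CRF_DATE_FORMAT).date()


print(parse_crf_date("09-JUN-2010"))
print(parse_crf_date("ND"))
```

Rejecting blank and ambiguous entries at data entry forces the question "missing, not done, or not applicable?" to be answered while the source documents are still at hand.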


Data Integrity

•  Problems could be due to fraudulent activity

•  Most commonly, problems are due to poor design or lack of planning

•  Science is based on replication of results

•  Clinical trials sometimes repeated

•  Cannot be perfect, but want to do best you can


Data Integrity - Incompetence

•  Need to have competent investigators

•  Need to have timely collection of data

•  Need to collect high quality data

•  Need to have a competent lab if samples are collected

•  Can the lab handle the volume of samples it will receive


Data Integrity - Misunderstanding

•  Data to be collected must be clearly and specifically defined

•  Cannot have variation due to interpretation of definition

•  Eligibility criteria need to be well thought out and clearly defined

•  Outcome measures need to be specifically defined

•  Death is clear, recurrence may not be clear

•  Impact of misunderstanding could be serious


Data Integrity - Misunderstanding

•  Train personnel who will be collecting data

•  Train data entry personnel

•  Train personnel who will be assessing eligibility

•  Impact of misunderstanding could be serious

– i.e., misunderstanding of outcome

•  Collect all data needed to determine outcome

•  If the outcome has multiple components, collect the details of each component


Data Integrity - Errors

•  Errors may be random or systematic

•  No way of predicting random errors, but they are unlikely to be repeated in same way

•  Systematic errors more problematic

•  High probability errors will happen again in similar situation

•  Random errors add variability “noise” to study but most likely will not invalidate results

•  Systematic errors may affect results and credibility of study
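The distinction between random and systematic errors can be illustrated with a small simulation. This is a sketch under invented assumptions: the "true" measurements, the size of the random error, and the +3 unit systematic offset (for example, a miscalibrated instrument) are all hypothetical.

```python
import random
from statistics import mean, stdev

rng = random.Random(2010)  # fixed seed so the run is reproducible

# Hypothetical true measurements for 10,000 subjects.
true_values = [rng.gauss(120, 10) for _ in range(10_000)]

# Random error: zero-mean noise, different for every measurement.
with_random_error = [v + rng.gauss(0, 5) for v in true_values]

# Systematic error: the same offset repeated on every measurement.
with_systematic_error = [v + 3 for v in true_values]

for label, values in [
    ("true values", true_values),
    ("random error", with_random_error),
    ("systematic error", with_systematic_error),
]:
    print(f"{label:<17} mean = {mean(values):6.1f}   sd = {stdev(values):5.1f}")
```

In this setup the random error leaves the mean essentially unchanged but inflates the standard deviation, while the systematic error shifts the mean and leaves the standard deviation untouched, so it cannot be detected from the spread of the data alone.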


Data Integrity - Errors

•  Subjects may have been enrolled/randomized but do not meet eligibility criteria

•  What do you do?

•  Still follow subjects for study measurements and outcomes?

•  YES!

•  In a randomized trial, random errors will be balanced between the study groups and will add variability but will not invalidate the trial


Data Integrity - Errors

•  Important to monitor for systematic errors

• May not be possible to go back and correct errors


Data Integrity - Bias

•  Prejudices, conscious or unconscious, can introduce bias

•  Blinding important for both subjects and investigator where possible

•  Patient knowledge of treatment could affect actions

•  Investigator’s knowledge of treatment assignment can subconsciously affect evaluation of outcomes


Data Integrity - Bias

• Could be introduced by excluding randomized/enrolled patients from analysis because they did not complete therapy or did not meet eligibility criteria

•  Bias in primary outcomes can be very problematic

•  Effort should be given to eliminate bias in design, conduct, and analysis


Data Integrity - Intention to Treat

•  Intention-to-treat (ITT) principle:

– All subjects meeting admission criteria and subsequently randomized should be counted in their originally assigned treatment groups without regard to deviations from assigned treatment (Fisher et al., 1990)


Data Integrity - Intention to Treat

• The intention-to-treat (ITT) principle means:

– All subjects randomized should be counted in their originally assigned treatment groups without regard to deviations from assigned treatment

– No exceptions


Data Integrity - Intention to Treat

•  Examples of subjects that are frequently excluded:

– Failed to meet compliance/adherence requirements

– Discontinued treatment due to adverse effects

– Received no treatment

– Received the wrong treatment (e.g., due to record keeping error)

– Failed inclusion/exclusion criteria after randomization


Data Integrity – Intention to Treat

•  Deviations from ITT may bias analyses

•  Treatment discontinuations, non-compliance with protocol, and non-adherence to treatment are frequently treatment and outcome dependent

•  Example: a drug whose only effect is to cause a severe reaction in the sickest subjects: if you exclude from the analysis those who discontinue, the drug appears to make subjects better
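The example of a drug whose only effect is a severe reaction in the sickest subjects can be sketched as a simulation. Everything numeric here is an assumption made for illustration: severity is uniform on [0, 1], the risk of death is 0.3 times severity in both arms, and subjects on the drug with severity above 0.8 discontinue.

```python
import random


def simulate(n_per_arm=20_000, seed=1):
    """Simulate a two-arm trial in which the drug has NO effect on mortality."""
    rng = random.Random(seed)
    subjects = []
    for arm in ("drug", "placebo"):
        for _ in range(n_per_arm):
            severity = rng.random()                  # 0 = healthiest, 1 = sickest
            died = rng.random() < 0.3 * severity     # risk depends on severity only
            # The drug's only effect: the sickest subjects react and discontinue.
            discontinued = arm == "drug" and severity > 0.8
            subjects.append((arm, died, discontinued))
    return subjects


def death_rate(subjects, arm, exclude_discontinued=False):
    """Death rate in one arm, optionally dropping subjects who discontinued."""
    rows = [s for s in subjects
            if s[0] == arm and not (exclude_discontinued and s[2])]
    return sum(died for _, died, _ in rows) / len(rows)


subjects = simulate()
print("ITT           drug: %.3f   placebo: %.3f" % (
    death_rate(subjects, "drug"), death_rate(subjects, "placebo")))
print("Excl. discont drug: %.3f   placebo: %.3f" % (
    death_rate(subjects, "drug", exclude_discontinued=True),
    death_rate(subjects, "placebo", exclude_discontinued=True)))
```

Under these assumptions the intention-to-treat comparison shows similar mortality in both arms, as it should for an ineffective drug, while excluding the discontinuations removes the sickest subjects from the drug arm only and makes the drug look protective.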


References

•  Cook T and DeMets DL. Introduction to Statistical Methods for Clinical Trials. Chapman & Hall/CRC; Taylor & Francis Group, LLC, Boca Raton, FL, 2008.

•  DeMets DL. Distinctions between fraud, bias, errors, misunderstanding, and incompetence. Controlled Clinical Trials 1997;18:637-650.

•  Steneck NH. Introduction to Responsible Research. Office of Research Integrity, Department of Health and Human Services. http://ori.dhhs.gov/documents/rcrintro.pdf


Data Integrity – ITT example

•  Anturane Reinfarction Trial (ART):

•  Trial of the clotting inhibitor anturane

•  Subjects with recent myocardial infarction

•  Primary outcome: mortality

•  1629 randomized

•  Re-evaluation of eligibility identified 71 ineligible subjects


Data Integrity – ITT example

•  Anturane Reinfarction Trial (ART):

•  Initial mortality analysis excluded the ineligible subjects

• Rationale: pre-specified eligibility criteria, based on data measured prior to randomization


Data Integrity – ITT example

Reference: Temple and Pledger (1980) NEJM p. 1488

Subgroup                   Anturane          Placebo           P-value
ITT (all randomized)       74/813 (9.1%)     89/816 (10.9%)    0.20
Eligible                   64/775 (8.3%)     85/783 (10.9%)    0.07
Ineligible                 10/38 (26.3%)     4/33 (12.1%)      0.12
Eligible vs. Ineligible    p = 0.0001        p = 0.98
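The mortality counts reported above can be checked for internal consistency with a few lines of code. This sketch recomputes the percentages and confirms that the eligible and ineligible subgroups add up to the randomized totals. The p-value helper uses a pooled two-proportion z-test purely as an assumption; the slides do not say which test produced the reported p-values, so its output is not expected to match them exactly.

```python
from math import erfc, sqrt

# (deaths, subjects) transcribed from the ART mortality counts above.
counts = {
    "ITT":        {"anturane": (74, 813), "placebo": (89, 816)},
    "Eligible":   {"anturane": (64, 775), "placebo": (85, 783)},
    "Ineligible": {"anturane": (10, 38),  "placebo": (4, 33)},
}


def rate(deaths, n):
    """Proportion of subjects who died."""
    return deaths / n


def two_proportion_p(a, b):
    """Two-sided p-value from a pooled two-proportion z-test (assumed test)."""
    (d1, n1), (d2, n2) = a, b
    pooled = (d1 + d2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (d1 / n1 - d2 / n2) / se
    return erfc(abs(z) / sqrt(2))


for arm in ("anturane", "placebo"):
    # Eligible + ineligible must reproduce the intention-to-treat row.
    deaths = counts["Eligible"][arm][0] + counts["Ineligible"][arm][0]
    total = counts["Eligible"][arm][1] + counts["Ineligible"][arm][1]
    assert (deaths, total) == counts["ITT"][arm]

for subgroup, arms in counts.items():
    print(f"{subgroup:<11}"
          f" anturane {100 * rate(*arms['anturane']):5.1f}%"
          f"  placebo {100 * rate(*arms['placebo']):5.1f}%"
          f"  z-test p = {two_proportion_p(arms['anturane'], arms['placebo']):.2f}")
```

The consistency check also makes the pattern visible: the 71 excluded subjects account for 10 of the 74 anturane deaths but only 4 of the 89 placebo deaths, so removing them favors the treatment arm.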


Data Integrity – Intention to Treat

•  Exclusions prior to randomization are not the problem; these subjects should, by definition, be excluded from analyses

• Withdrawals after randomization are the concern

• Need to specifically define when a subject has been randomized