Post-Lock Data Flow: From CRF to FDA
Ben Vaughn, MS, RAC
Principal Statistical Scientist
Broad Strokes of What Statisticians Do with Clinical Trial Data
Get data from data management
Do assorted “Stuff” with that data
Make Tables, Listings and Figures
Transfer Tables, Listings and Figures to Medical Writing
A little more detail and some terms - Step 1: EDC System Data
• Data exported from an EDC System is frequently called “raw” data.
– Can contain superfluous variables (edit checks, audit trails, etc.)
– Not very conducive to generating tables (form follows function: typically parallels CRF pages)
– Documentation isn’t designed for an FDA reviewer as the audience
– Datasets are highly variable from vendor to vendor and from various EDC systems
– Generally speaking there is a 1:1 relationship; each data point appears once and only once in the raw data
A little more detail and some terms - Step 2: Analysis datasets
• Datasets generated from the raw data to facilitate the production of Tables, Listings and Figures are called “Analysis Datasets” (ADS)
– Any variables that must be derived are added: scoring of instruments, determination of whether AEs are treatment emergent, determination of baseline values and calculation of change from baseline, etc.
– Key variables are merged onto all records: treatment codes, covariates, study start and stop dates, age, gender, race, etc.
– Datasets might be split or combined into more logical groups; many different patient-reported outcomes might be in a single dataset from DM but have different analysis rules, and are therefore split into multiple analysis datasets.
– Goal is to create datasets where most key points of information on tables can be generated with one procedure
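The derivation and merge steps described above can be sketched in a few lines. This is only an illustration in Python, not the SAS programs the slides refer to. The record layout, the field names (`subjid`, `visitn`, `test`, `value`, `trt`) and the rule that visit 0 is baseline are all assumptions made for the example.

```python
# Hypothetical "raw" vital-sign records, one row per data point.
raw_vitals = [
    {"subjid": "001", "visitn": 0, "test": "SYSBP", "value": 120.0},
    {"subjid": "001", "visitn": 1, "test": "SYSBP", "value": 126.0},
    {"subjid": "001", "visitn": 2, "test": "SYSBP", "value": 118.0},
    {"subjid": "002", "visitn": 0, "test": "SYSBP", "value": 135.0},
    {"subjid": "002", "visitn": 1, "test": "SYSBP", "value": 130.0},
]

# Hypothetical subject-level key variables to merge onto every record.
subject_level = {
    "001": {"trt": "Active", "age": 54},
    "002": {"trt": "Placebo", "age": 61},
}

BASELINE_VISIT = 0  # assumption: the visit numbered 0 is baseline


def build_analysis_dataset(records, subjects, baseline_visit=BASELINE_VISIT):
    """Add baseline, change from baseline, and subject-level variables."""
    # First pass: collect the baseline value per subject and test.
    baselines = {}
    for r in records:
        if r["visitn"] == baseline_visit:
            baselines[(r["subjid"], r["test"])] = r["value"]

    # Second pass: derive change from baseline for post-baseline rows
    # and merge the key variables onto every row.
    ads = []
    for r in records:
        base = baselines.get((r["subjid"], r["test"]))
        is_post_baseline = r["visitn"] > baseline_visit
        chg = r["value"] - base if (base is not None and is_post_baseline) else None
        ads.append({**r, "base": base, "chg": chg, **subjects[r["subjid"]]})
    return ads


ads = build_analysis_dataset(raw_vitals, subject_level)
```

Because the logic lives in a function of the input records, it can be rerun unchanged on a new cut of data.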
The Assorted “Stuff”
• Documentation is written for the raw EDC data to allow an FDA reviewer to understand the source of each data point
• Analysis datasets (ADS) are generated by programs (typically SAS) that transform the raw data; these programs can be rerun on new cuts of data
• Documentation is written for the ADS to allow an FDA reviewer to understand the source of each variable/row and how it maps from the raw data
• Did I mention documentation? FDA loves documentation.
But wait! Data standards
Electronic Standardized Study Data Timeline (Fitzmartin, PhUSE 2014)
Data Standards, Cont.
• ALL data submitted to FDA for studies starting next year MUST conform to data standards (but sponsors should already be doing it)
• These guidances are BINDING; refusal to file is possible if they are not followed
• A draft guidance defines what the standards are: Study Data Tabulation Model (SDTM) for the “raw” data and Analysis Data Model (ADaM) for the analysis datasets; this guidance is actively reviewed and updated
• A sponsor may apply for a waiver, but FDA seems unlikely to grant one
Data Standards: SDTM
• Extremely rigid format
• Anything can be mapped into this format, and there is a standard for expanding it for things that don’t map well to an existing pre-specified dataset
• Does not necessarily reflect the flow of a clinic visit or the CRF design, which can make it difficult to implement directly in an EDC system
• For some types of data there is no excuse not to get SDTM from the start (e.g., a central lab vendor should be able to provide SDTM data)
• Standardized documentation (define.xml)
• Submitted to FDA in place of raw data
Data Standards: ADaM
• Typically uses SDTM as its source
• Somewhat less rigid than SDTM
• Fewer specified data structures (but expanding):
– ADSL (Subject-Level dataset; standard variables for treatments, dates, sites, age, sex, race, populations)
– ADAE (Adverse Events)
– ADTTE (Time to event)
– OCCDS (Occurrence Data Structure, generalization of ADAE for things like Medical History and Concomitant Medications)
– BDS (Everything else)
• Standardized documentation (Define.xml or Define.pdf)
Data Standards: ADaM, cont.
Legacy data is frequently a “Wide” format…
Subject Visit DIABP SYSBP PULSE RESP WEIGHT HEIGHT BMI
DIABPBL SYSBPBL PULSEBL RESPBL WEIGHTBL HEIGHTBL BMIBL
DIABPCBL SYSBPCBL PULSECBL RESPCBL WEIGHTCBL HEIGHTCBL BMICBL
Data Standards: ADaM, cont.
Crammed onto one row:
Subject Visit DIABP SYSBP PULSE RESP WEIGHT HEIGHT BMI DIABPBL SYSBPBL PULSEBL RESPBL WEIGHTBL HEIGHTBL BMIBL DIABPCBL SYSBPCBL PULSECBL RESPCBL WEIGHTCBL HEIGHTCBL BMICBL
Data Standards: ADaM, cont.
SDTM and ADaM are “Tall, Skinny” formats
SUBJID  AVISITN  PARAMCD  AVAL  BASE  CHG
001     1        DIABP
001     1        SYSBP
001     1        PULSE
001     1        RESP
001     1        WEIGHT
001     1        HEIGHT
001     1        BMI
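The relationship between the wide legacy layout and the tall layout can be sketched as a simple transpose. The Python below is a minimal illustration, not a CDISC-conformant mapping. It assumes the legacy naming pattern shown on the slides (`<TEST>`, `<TEST>BL` for baseline, `<TEST>CBL` for change from baseline), and the numeric values are invented for the example.

```python
# Parameter codes taken from the column names on the slides.
PARAMS = ["DIABP", "SYSBP", "PULSE", "RESP", "WEIGHT", "HEIGHT", "BMI"]


def wide_to_tall(wide_row):
    """Turn one wide legacy row into one tall row per parameter."""
    tall = []
    for param in PARAMS:
        tall.append({
            "SUBJID": wide_row["Subject"],
            "AVISITN": wide_row["Visit"],
            "PARAMCD": param,
            "AVAL": wide_row.get(param),          # analysis value
            "BASE": wide_row.get(param + "BL"),   # baseline value
            "CHG": wide_row.get(param + "CBL"),   # change from baseline
        })
    return tall


# Hypothetical wide record; columns that are absent come through as None.
wide_row = {
    "Subject": "001", "Visit": 1,
    "DIABP": 80.0, "DIABPBL": 84.0, "DIABPCBL": -4.0,
    "SYSBP": 126.0, "SYSBPBL": 120.0, "SYSBPCBL": 6.0,
}

tall_rows = wide_to_tall(wide_row)
```

Every parameter ends up with the same three value columns, so adding a new measurement adds rows and leaves the column structure alone.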
Data Standards: ADaM Advantages
• Huge efficiencies for table programming:
– You almost never need to look up variable names
– Programming code for one table can be altered to make a similar table by just changing the dataset and parameters
• Standard documentation allows reviewers to easily understand what is in each dataset, how it was derived and which flags should be used to produce a particular display
• Data from multiple studies can be “Stacked” as long as things like the parameter codes are uniform
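A rough sketch of why the tall structure helps: one summary routine serves any parameter, and studies stack by simple concatenation once parameter codes agree. The Python below is illustrative only. The rows are invented, and the use of `TRTP` as the treatment variable and `CHG` as the summarized value is an assumption for this example.

```python
from statistics import mean


def summarize(dataset, paramcd, value_var="CHG"):
    """Return n and mean of value_var by treatment for one parameter."""
    groups = {}
    for row in dataset:
        if row["PARAMCD"] == paramcd and row[value_var] is not None:
            groups.setdefault(row["TRTP"], []).append(row[value_var])
    return {trt: {"n": len(vals), "mean": mean(vals)} for trt, vals in groups.items()}


# Two hypothetical studies that share the same structure and parameter codes.
study_a = [
    {"STUDYID": "A", "SUBJID": "001", "TRTP": "Active", "PARAMCD": "SYSBP", "CHG": -4.0},
    {"STUDYID": "A", "SUBJID": "002", "TRTP": "Placebo", "PARAMCD": "SYSBP", "CHG": 1.0},
    {"STUDYID": "A", "SUBJID": "001", "TRTP": "Active", "PARAMCD": "PULSE", "CHG": 2.0},
]
study_b = [
    {"STUDYID": "B", "SUBJID": "101", "TRTP": "Active", "PARAMCD": "SYSBP", "CHG": -6.0},
    {"STUDYID": "B", "SUBJID": "102", "TRTP": "Placebo", "PARAMCD": "SYSBP", "CHG": 3.0},
]

# "Stacking" is plain concatenation because the structures are uniform.
stacked = study_a + study_b

sysbp_summary = summarize(stacked, "SYSBP")
pulse_summary = summarize(stacked, "PULSE")  # same code, different parameter
```

Producing the pulse table required changing only the parameter code passed to the function.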
Data Standards: ADaM Disadvantages
• Datasets are a bigger investment
• Completely fails where you need multiple outcomes on a single row
• “Drill down” questions are problematic; can be created as additional rows/outcomes, but clinical reviewers are typically interested in how they relate to the questions that triggered the drill down
• Clinical reviewers almost always want “Wide” listings: Everything collected at the same time point on a single row (Transpose of the data is required)
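The transpose needed for a wide listing can be sketched as follows. This is a minimal Python illustration with invented values. It assumes that one listing row per subject and visit is wanted and that `AVAL` is the value to display.

```python
def tall_to_wide(rows):
    """Collapse tall rows into one listing row per subject and visit."""
    wide = {}
    for r in rows:
        key = (r["SUBJID"], r["AVISITN"])
        # Create the listing row on first sight of this subject/visit.
        listing_row = wide.setdefault(
            key, {"SUBJID": r["SUBJID"], "AVISITN": r["AVISITN"]}
        )
        # Each parameter code becomes its own column.
        listing_row[r["PARAMCD"]] = r["AVAL"]
    return [wide[key] for key in sorted(wide)]


# Hypothetical tall analysis rows.
tall = [
    {"SUBJID": "001", "AVISITN": 1, "PARAMCD": "DIABP", "AVAL": 80.0},
    {"SUBJID": "001", "AVISITN": 1, "PARAMCD": "SYSBP", "AVAL": 126.0},
    {"SUBJID": "001", "AVISITN": 2, "PARAMCD": "DIABP", "AVAL": 78.0},
    {"SUBJID": "002", "AVISITN": 1, "PARAMCD": "SYSBP", "AVAL": 130.0},
]

listing = tall_to_wide(tall)
```

Parameters not collected at a given visit are simply absent from that listing row, so a real listing program would still need a rule for displaying missing values.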
CDASH: Related, but not required
• Clinical Data Acquisition Standards Harmonization (CDASH) is a suite of standardized CRFs and variable names for the data points collected in those forms
• Goes cleanly and uniformly into SDTM
• Saves time and money!
• Your study is no longer a unique snowflake
• It is likely that there will always be non-standard data collected, so manual mapping will be required
Broad Strokes of What Statisticians Do
Get data from data management
Map “raw” data to SDTM and generate documentation
Map SDTM data to ADaM and generate documentation
Make Tables, Listings and Figures
Transfer Tables, Listings and Figures to Medical Writing
NDA/BLA Submission
• An integrated analysis of safety and efficacy (ISS/ISE) will be needed for nearly all NDAs and BLAs
• Many individual studies must be combined into an ISS/ISE database
• Integrated data must be summarized in ISS/ISE post-text tables
• This is distilled into the ISS/ISE text and sections 2.7.3 and 2.7.4 of the eCTD
Ideal Dataflow Process
[Diagram: for each study, CDASH CRF data → SDTM → ADaM → TLFs → CSR. The study ADS (Study 2, Study 3, … Study n) feed the ISS/ISE ADS → ISS/ISE TLFs → ISS/ISE → NDA]
All SDTM is created consistently; study analysis datasets are created with uniform structures; all information can be cleanly and sequentially linked back to the CRF data.
More Typical State of Data
[Diagram: (Some) Phase III studies: CRF data → SDTM → Study ADS (ADaM) → TLFs → CSR. Phase I/II (III) studies: CRF data in Legacy Format → Study ADS → TLFs → CSR]
Assorted judgment calls with documentation of varying quality
Considerations for Legacy Conversions
• FDA places an extremely high value on traceability and reproducibility; this trumps any data standard
• SDTM conversion of legacy data is NOT required
• When converting legacy data to SDTM for submission (where CSRs were generated off legacy data), FDA suggests additionally submitting the legacy data
• FDA has not clearly indicated that it uses SDTM data in any way for non-pivotal trials where the CSR relies on legacy data.
Suggested Integration and Submission Approach
[Diagram: SDTM Study #1 and SDTM Study #2..n: Study ADS (ADaM) → TLFs → CSR. Legacy Study #1 and Legacy Study #2..n: Study ADS → TLFs → CSR. All study ADS: map study ADS into uniform ADaM → ISS/ISE TLFs → ISS/ISE → NDA]
References
• SDTM and ADaM specs and implementation: http://www.cdisc.org/
• Study Data Technical Conformance Guide: http://www.fda.gov/downloads/ForIndustry/DataStandards/StudyDataStandards/UCM384744.pdf
• eStudy Data Guidance: http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM292334.pdf