CPTR RDST Data Platform Concept
September 22, 2014
Outline
• C-Path overview and examples of data projects
• Knowledge sharing concept
• RDST approach
• Examples of RDST data types
• Database architecture
• Next steps – timeline
CPTR-RDST Data Platform 2014 Workshop Slides 2
C-Path Consortia
• Coalition Against Major Diseases: UNDERSTANDING DISEASES OF THE BRAIN
• Critical Path to TB Drug Regimens: TESTING DRUG COMBINATIONS
• Multiple Sclerosis Outcome Assessments Consortium: DRUG EFFECTIVENESS IN MS
• Polycystic Kidney Disease Consortium: NEW IMAGING BIOMARKERS
• Patient-Reported Outcome Consortium: DRUG EFFECTIVENESS
• Electronic Patient-Reported Outcome Consortium: DRUG EFFECTIVENESS
• Predictive Safety Testing Consortium: DRUG SAFETY
Seven global consortia developing novel drug development tools
• Biomarkers
• Clinical Outcome Assessment Instruments
• Clinical Trial Simulation Tools
• In vitro tools
• Data Standards
C-Path Online Data Repository
Current C-Path examples
• CAMD – AD Clinical Trial Simulation Tool
• PKD – Biomarker Qualification Project
• MSOAC – New Outcome Assessment Instrument for MS
CDC TB study data now available
Datasets contributed to C-Path for consortia projects
Consortium                            Therapeutic Area                  # of Studies  Total Number of Subjects  Number of Data Contributors
Coalition Against Major Diseases      Alzheimer's disease               27            7340                      11
Coalition Against Major Diseases      Parkinson's disease               7             2597                      2
Critical Path to TB Drug Regimens     Tuberculosis                      10            2495                      5
MS Outcome Assessments Consortium     Multiple sclerosis                6             4700                      4
Polycystic Kidney Disease             Polycystic kidney disease         5             2941                      4
Predictive Safety Testing Consortium  Normal healthy volunteer-kidney   1             172                       1
Predictive Safety Testing Consortium  Skeletal-muscular (non-clinical)  38            1766                      6
Predictive Safety Testing Consortium  Hepato-toxicity (non-clinical)    43            2340                      7
Predictive Safety Testing Consortium  Nephro-toxicity (non-clinical)    14            941                       8
Value of data sharing, data standards & data pooling
• Nine member companies agreed to share data from 24 Alzheimer’s disease (AD) trials
• The data were not in a common format
• The data were remapped to the CDISC AD standard and pooled
• A new clinical trial simulation tool was created; it was the first model endorsed by the FDA and EMA
• Researchers are using the database to advance research
Start Point
Result
24 studies, >6500 patients
Model endorsed by FDA and EMA
Access to AD data available to
qualified researchers
Future TB model
Rapid DST TB Data Sharing Platform Architecture Concept
CPTR TB Drug Resistance DB
Data Platform to Inform Assay Development
How do we build this system?
Linking Global TB Sequence Researchers
• TB sequence community inputs
• Expert review to advance investigational biomarkers to validated status
• Can use separate or consolidated DBs
• FDA compliant CDISC architecture for regulatory submission DB
Approved members
• Academic labs
• Reference labs
• Commercial companies
• Others…
Validated DR biomarkers
Expert Panel Review
Approved biomarkers
Sequence repository
Anonymized sequence data
Clinical annotation
Phenotypic methods
User friendly cloud interface
Analysis files generated
Investigational DB
How do we accomplish this?
• Clear objective
– Improved research resource to enable development of new rapid diagnostics for TB
• Future objective
– With sustainability funding: resource for clinicians
• Build on previous efforts for TB and for data sharing
– Apply technology product development discipline
– Design to handle wide range of data types
– Quality criteria and defined process for incoming data
– Lean, efficient and well managed implementation
– Expandable / adaptable / flexible
– Great usability
• Strong alignment with anticipated analysis use cases
Related efforts: TBDReaMDB
http://www.tbdreamdb.com/index.html
Example of future objective: Stanford HIV database
http://hivdb.stanford.edu/
Product development discipline
• Detailed, documented requirements
• Early prototyping
• Design to requirements
• Staged development with clear milestones
• Extensive testing
– Verification of all features and function
– Usability
– Performance
– Scalability
• Phased rollout
– RDST members
– Qualified external researchers
• Ongoing support and enhancements based on user feedback
Requirements
Prototype
Design
Build
Test
Deploy
Support and enhance
RDST Data: multiple data types
Need to incorporate multiple types of data
• sequence data
• SNP reports
• resistance test data
• clinical trial/study/registry data
• external information resources
• any other data that may be necessary
Which need to be analyzed to find and validate correlations
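As a rough illustration of what "find and validate correlations" could look like once genotypic and phenotypic data sit side by side, the sketch below cross-tabulates the presence of one mutation against a phenotypic resistance result and derives sensitivity and specificity. The record layout, the names `mutA` and `drugX`, and the "R"/"S" coding are all assumptions made for this example; none of them come from the slides.

```python
from collections import Counter

def cross_tabulate(isolates, mutation, drug):
    """Count isolates by (mutation present, phenotypically resistant)."""
    counts = Counter()
    for isolate in isolates:
        result = isolate["dst"].get(drug)
        if result is None:
            continue  # skip isolates with no phenotypic result for this drug
        has_mutation = mutation in isolate["mutations"]
        counts[(has_mutation, result == "R")] += 1
    return counts

def sensitivity_specificity(counts):
    """Treat the mutation as a test for phenotypic resistance."""
    tp = counts[(True, True)]
    fn = counts[(False, True)]
    tn = counts[(False, False)]
    fp = counts[(True, False)]
    sensitivity = tp / (tp + fn) if tp + fn else None
    specificity = tn / (tn + fp) if tn + fp else None
    return sensitivity, specificity

# Hypothetical isolates: mutation calls plus drug susceptibility test results.
isolates = [
    {"id": "iso1", "mutations": {"mutA"}, "dst": {"drugX": "R"}},
    {"id": "iso2", "mutations": {"mutA"}, "dst": {"drugX": "R"}},
    {"id": "iso3", "mutations": set(), "dst": {"drugX": "S"}},
    {"id": "iso4", "mutations": set(), "dst": {"drugX": "R"}},
    {"id": "iso5", "mutations": {"mutB"}, "dst": {"drugX": "S"}},
    {"id": "iso6", "mutations": {"mutA"}, "dst": {}},
]

counts = cross_tabulate(isolates, "mutA", "drugX")
sensitivity, specificity = sensitivity_specificity(counts)
print(counts, sensitivity, specificity)
```

A real analysis would also need confidence intervals and a way to handle isolates carrying several candidate mutations; this sketch only shows why the data types have to be linked at the isolate level.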
RDST Data: genotypic data example
@M00347:61:000000000-A9B8J:1:1101:15324:1677 1:N:0:1
TCTTGATCGCGAGTTCGCGGCCCGGGGTGAGCACCCAGGTGAGCGGGAAATGCGTGGTGTCGTGGTAGCTGACGTCGACGATGCCGTGGCGGTATTCGAGGTCTGTGAACGTGTCGTCGTCGAGGAAGTTCTGCAGCACCAGCAGCGGATC
+
11>>1@BF1>>11AEF00000AA////A//A1AB/?/GAEC1GBE??///FFG/?E?EFHF/F?A?EG1BDBC/FCGGCCCC-.1011/0//?/333333/030333044300///B1111/00//>22222211111111101111100
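The record above is in FASTQ format: a header line starting with "@", the base calls, a "+" separator, and one quality character per base. A minimal reader might look like the sketch below. It assumes unwrapped four-line records and Phred+33 quality encoding, which is common for recent Illumina output but is not stated on the slide; the toy record is invented for the example.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO

class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str

def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with exactly four lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)

def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode quality characters to Phred scores (offset 33 is an assumption)."""
    return [ord(ch) - offset for ch in quality]

# Toy record, not taken from the slides.
toy = io.StringIO("@read1\nACGT\n+\nII5!\n")
record = next(read_fastq(toy))
scores = phred_scores(record.quality)
print(record.read_id, record.sequence, scores)
```

Checks such as the length comparison are the kind of quality criteria that could be applied to incoming sequence files before they enter a shared pipeline.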
RDST Data: SNP report example
SNP reports
RDST Data: phenotypic data example
resistance test data
https://tbdr.org/cgi/tbdr
RDST Data: clinical data example (hypothetical data)
STUDYID  DOMAIN  USUBJID  AGE  SEX  RACE                       ARM
19       DM      10001    27   F    WHITE                      Ethambutol 5 Times Per Week
19       DM      10002    63   M    WHITE                      Moxifloxacin 3 Times Per Week
19       DM      10003    42   M    BLACK OR AFRICAN AMERICAN  Moxifloxacin 5 Times Per Week
19       DM      10004    30   F    ASIAN                      Moxifloxacin 5 Times Per Week
19       DM      10005    29   M    BLACK OR AFRICAN AMERICAN  Moxifloxacin 3 Times Per Week
19       DM      10006    35   M    BLACK OR AFRICAN AMERICAN  Ethambutol 3 Times Per Week
19       DM      10007    46   F    UNKNOWN                    Ethambutol 3 Times Per Week
19       DM      10008    34   F    BLACK OR AFRICAN AMERICAN  Moxifloxacin 5 Times Per Week
19       DM      10009    55   M    BLACK OR AFRICAN AMERICAN  Ethambutol 3 Times Per Week
19       DM      10010    42   M    ASIAN                      Moxifloxacin 5 Times Per Week
19       DM      10011    23   F    BLACK OR AFRICAN AMERICAN  Ethambutol 3 Times Per Week
19       DM      10012    47   F    WHITE                      Ethambutol 3 Times Per Week
19       DM      10013    25   F    BLACK OR AFRICAN AMERICAN  Moxifloxacin 5 Times Per Week
19       DM      10014    21   M    WHITE                      Ethambutol 3 Times Per Week
19       DM      10015    79   M    WHITE                      Moxifloxacin 3 Times Per Week
19       DM      10016    27   F    ASIAN                      Moxifloxacin 3 Times Per Week
19       DM      10017    37   M    BLACK OR AFRICAN AMERICAN  Ethambutol 3 Times Per Week
19       DM      10018    28   M    BLACK OR AFRICAN AMERICAN  Moxifloxacin 3 Times Per Week
STUDYID  DOMAIN  USUBJID  MBTESTCD  MBTEST                        MBORRES                               MBSPEC                VISIT
13       MB      10001    AFB       Acid Fast Bacilli             NEGATIVE                              SPONT SPUTUM          WEEK 8
13       MB      10001    ORGANISM  Organism Present              NEGATIVE FOR TUBERCULOSIS             SPONT SPUTUM          WEEK 4
13       MB      10001    MTBINH    M.tuberculosis INH Resistant  POSITIVE                              NON-OVERNIGHT SPUTUM  SCREENING
15       MB      10001    ORGANISM  Organism Present              NEGATIVE FOR TUBERCULOSIS             SPONT SPUTUM          WEEK 4
13       MB      10002    AFB       Acid Fast Bacilli             NEGATIVE                              SPONT SPUTUM          WEEK 8
13       MB      10002    ORGANISM  Organism Present              POSITIVE FOR M. TUBERCULOSIS COMPLEX  SPONT SPUTUM          SCREENING
15       MB      10002    ORGANISM  Organism Present              POSITIVE FOR M. TUBERCULOSIS COMPLEX  SPONT SPUTUM          SCREENING
13       MB      10003    ORGANISM  Organism Present              NEGATIVE FOR TUBERCULOSIS             INDUCED SPUTUM        WEEK 4
15       MB      10004    ORGANISM  Organism Present              NEGATIVE FOR TUBERCULOSIS             SPONT SPUTUM          WEEK 4
STUDYID  DOMAIN  USUBJID  MOTESTCD  MOTEST           MOORRES  MOSTRESC  MOLOC       VISIT      MODY
17       MO      10001    CAVIT     Cavitation       Y        Y         LUNG, LEFT  SCREENING  -4
17       MO      10002    CAVIT     Cavitation       Y        Y         LUNG, LEFT  SCREENING  -5
17       MO      10002    PLEURALD  Pleural Disease  N        N         LUNG, LEFT  SCREENING  1
17       MO      10004    PLEURALD  Pleural Disease  N        N         LUNG, LEFT  SCREENING  -8
17       MO      10005    CAVIT     Cavitation       N        N         LUNG, LEFT  SCREENING  -9
17       MO      10005    CAVIT     Cavitation       N        N         LUNG, LEFT  SCREENING  -15
17       MO      10006    CAVIT     Cavitation       Y        Y         LUNG, LEFT  SCREENING  1
17       MO      10006    PLEURALD  Pleural Disease  N        N         LUNG, LEFT  SCREENING  -7
clinical trial data
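Because every domain table carries USUBJID, findings can be linked back to demographics with a simple key join. The sketch below does this for a few rows modeled on the hypothetical DM and MB examples; the in-memory list-of-dicts layout and the function name `findings_with_demographics` are assumptions for illustration, and a production system would more likely do this in a database or data-frame library.

```python
from collections import defaultdict

# Rows modeled on the hypothetical DM and MB examples (subset of columns).
dm_rows = [
    {"USUBJID": "10001", "AGE": 27, "SEX": "F", "ARM": "Ethambutol 5 Times Per Week"},
    {"USUBJID": "10002", "AGE": 63, "SEX": "M", "ARM": "Moxifloxacin 3 Times Per Week"},
]
mb_rows = [
    {"USUBJID": "10001", "MBTESTCD": "AFB", "MBORRES": "NEGATIVE", "VISIT": "WEEK 8"},
    {"USUBJID": "10002", "MBTESTCD": "AFB", "MBORRES": "NEGATIVE", "VISIT": "WEEK 8"},
    {"USUBJID": "10002", "MBTESTCD": "ORGANISM",
     "MBORRES": "POSITIVE FOR M. TUBERCULOSIS COMPLEX", "VISIT": "SCREENING"},
]

def findings_with_demographics(dm, mb):
    """Attach each subject's demographics to each of their microbiology findings."""
    findings_by_subject = defaultdict(list)
    for finding in mb:
        findings_by_subject[finding["USUBJID"]].append(finding)
    merged = []
    for subject in dm:
        for finding in findings_by_subject.get(subject["USUBJID"], []):
            merged.append({**subject, **finding})
    return merged

merged = findings_with_demographics(dm_rows, mb_rows)
for row in merged:
    print(row["USUBJID"], row["ARM"], row["MBTESTCD"], row["MBORRES"], row["VISIT"])
```

The same join works across studies once the data share one standard, which is the practical payoff of remapping contributed data before pooling.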
RDST Data: TB strain summary table
http://www.ncbi.nlm.nih.gov/genome/genomes/166
external resources
RDST data platform: design to handle multiple data types
[Figure: Aggregated Research Database. Labeled inputs are subject-level clinical trial data (variables and observations per subject over time), surveillance data, genotypic data (strain sequence data), and phenotypic data. CDISC data standards are applied, and the pooled data support data analysis.]
Key success factors for incoming data
• Buy in for data contributions
– Survey and prioritize
– Proactive engagement
– Recognition and incentives for contributions
• Clearly defined quality criteria
– Develop and vet during initial data survey & prioritization
• Consistent process for incoming data processing
– Unified pipeline for incoming sequence data
– Ability to apply CDISC standards to create efficient database (vs large number of small data buckets)
• Ongoing curation and quality control
Quality criteria and defined process for incoming sequence data
Incoming FASTQ plus associated SNP report
New SNP report generated with RDST unified pipeline
RDST unified pipeline for sequence data
TB clinical data mapping to CDISC
We do this today for CPTR
Data elements
• Phase of TB treatment
• TB Symptoms
• Tuberculin Skin Test Result
– Definition: The number of millimeters in diameter of the induration, or raised hardening, at the tuberculin skin test site.
– Permissible value set: mm of induration.
Map to CDISC domains, CDISC variables, and controlled terminology
• Exposure (EX): USUBJID EXTRT EXDOS EXDOSU
• Clinical Events (CE): USUBJID CETERM CEPRESP CEOCCUR
• Skin Response (SR): USUBJID SRTESTCD SRTEST SRORRES SRORRESU
– Example: 12345 INDURDIA Induration Diameter 16 mm
• Preserve, do not change the data content
• A place for everything, everything in its place
• Capture the smallest usable elements of data
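The tuberculin skin test example can be written as a small mapping function. The sketch below takes a contributor's free-text result such as "16 mm" and produces one Skin Response (SR) row using the variables shown on the slide. The function name `map_skin_test` and the accepted input format are assumptions; the original value is kept as reported rather than converted, in line with "preserve, do not change the data content".

```python
import re

def map_skin_test(usubjid: str, raw_result: str) -> dict:
    """Map a raw tuberculin skin test result to an SR-domain row."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(mm)\s*", raw_result, flags=re.IGNORECASE)
    if match is None:
        # Reject rather than guess: unparseable values go back for curation.
        raise ValueError(f"unrecognized skin test result: {raw_result!r}")
    value, unit = match.groups()
    return {
        "USUBJID": usubjid,
        "SRTESTCD": "INDURDIA",           # test code from the slide
        "SRTEST": "Induration Diameter",  # test name from the slide
        "SRORRES": value,                 # result as originally reported
        "SRORRESU": unit.lower(),         # original unit
    }

row = map_skin_test("12345", "16 mm")
print(row)
```

Splitting the value and the unit into separate variables is one instance of capturing the smallest usable elements of data.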
Rapid DST Data Platform
• Can we apply CDISC standards to TB genotypic and phenotypic data?
• What is the benefit of doing this?
CDISC Study Data Tabulation Model (SDTM) domains for classification of data elements
• Interventions: Exposure, Con Meds, Substance Use
• Events: Adverse Events, Medical History, Deviations, Clinical Events, Disposition
• Findings: ECG, Incl/Excl Exceptions, Labs, Microbiology Spec., Microbiology Suscept., PK Concentrations, PK Parameters, Physical Exam, Questionnaire, Subject Characteristics, Vital Signs, Drug Accountability
• Findings About
• Special Purpose: Demographics, Comments, Subject Elements, Subject Visits
• Trial Design: Trial Elements, Trial Arms, Trial Visits, Trial Incl/Excl, Trial Summary
Data Mapping: SNP report example
PFORREF – reference result (can apply to nucleotides or amino acids, depends on value in PFTEST)
PFORRES – experimental result (can apply to nucleotides or amino acids, depends on value in PFTEST)
PFRESCAT – category of result (is this a nonsense or missense mutation? frameshift? etc.)
PFGENTYP – type of feature we’re looking at (gene, sector, protein, etc.)
PFGENRI – region of interest (it is defined as the specific gene or locus being looked at)
PFSTRESC – standard result of the analysis. Usually uses HGVS nomenclature
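To make the variable list concrete, the sketch below turns one row of a SNP report into a record carrying the PF variables named above. The input keys (`gene`, `position`, `ref`, `alt`, `effect`), the placeholder gene `geneX`, and the values used for PFTEST and PFGENTYP are assumptions for illustration only and are not CDISC controlled terminology; the standardized result uses the HGVS substitution pattern `c.<position><ref>><alt>`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PFRecord:
    USUBJID: str
    PFTEST: str    # what was assessed (placeholder wording)
    PFGENTYP: str  # type of feature (gene, protein, ...)
    PFGENRI: str   # region of interest: the gene or locus
    PFORREF: str   # reference result
    PFORRES: str   # experimental result
    PFRESCAT: str  # category of result (missense, nonsense, frameshift, ...)
    PFSTRESC: str  # standardized result, HGVS-style

def snp_to_pf(usubjid: str, snp: dict) -> PFRecord:
    """Build a PF record from one nucleotide substitution in a SNP report."""
    hgvs = f"c.{snp['position']}{snp['ref']}>{snp['alt']}"
    return PFRecord(
        USUBJID=usubjid,
        PFTEST="Nucleotide Variation",  # assumed label, not official terminology
        PFGENTYP="GENE",
        PFGENRI=snp["gene"],
        PFORREF=snp["ref"],
        PFORRES=snp["alt"],
        PFRESCAT=snp["effect"],
        PFSTRESC=hgvs,
    )

# Hypothetical SNP report row; the gene name and position are placeholders.
pf = snp_to_pf("10001", {"gene": "geneX", "position": 100, "ref": "C",
                         "alt": "T", "effect": "MISSENSE"})
print(pf)
```

Keeping the reference and observed values in separate variables, alongside the standardized string, lets the same record serve both curation and analysis.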
Rapid DST Data Platform
Three primary categories of data
• Data as received from contributors
• Quality checked, processed, standardized data
– Master copy
– Full complement of data for RDST consortium use
– Authorized subset for external researchers (as broad as possible within sharing terms and conditions imposed by each data contributor)
• Analysis data extracts and reports to support research
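One way to picture the "full complement" versus "authorized subset" split is a per-dataset sharing flag checked against the requester's role. The sketch below is only that: the role names, the `external_sharing_allowed` flag, and the dataset names are hypothetical, and real sharing terms would be richer than a single boolean.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass(frozen=True)
class Dataset:
    name: str
    contributor: str
    external_sharing_allowed: bool  # set from the contributor's terms

def visible_datasets(datasets: Iterable[Dataset], role: str) -> List[Dataset]:
    """Return the standardized datasets a user role may see."""
    datasets = list(datasets)
    if role in ("data_team", "rdst_member"):
        return datasets  # full complement for consortium use
    if role == "external":
        # Authorized subset: only what each contributor's terms permit.
        return [d for d in datasets if d.external_sharing_allowed]
    raise ValueError(f"unknown role: {role!r}")

catalog = [
    Dataset("study_a", "contributor_1", external_sharing_allowed=True),
    Dataset("study_b", "contributor_2", external_sharing_allowed=False),
]
member_view = visible_datasets(catalog, "rdst_member")
external_view = visible_datasets(catalog, "external")
print([d.name for d in member_view], [d.name for d in external_view])
```

Recording the flag per dataset keeps the external subset as broad as each contributor allows without maintaining a separate copy of the data.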
Rapid DST Data Platform
+ Lean, efficient and well managed implementation
+ Expandable / adaptable / flexible
+ Great usability
Strong alignment with anticipated analysis use cases
Rapid DST Data Platform
Next Steps
RDST Data Sharing Platform Timeline v4 (September 2014 to October 2017)
1.1 Governance Model
1.5 Dev Ph 1 Dev Ph 2
Data Platform Available for Consortium Members
2.4 Perform phase 1 program assessment
2.6 Enable for external researchers
Sustainability Funding Secured
C-Path Milestone
1.3 Req’s, Arch and Design
1.2 Value proposition, DUA updates and Communication Plan
1.5 Test Ph 1 1.7 Test Ph 2
Dev Ph 3
1.8 Test Ph 3
3.5 Perform Phase 2 program assessment
Data Platform Available for external researchers
3.6/3.8 Release 2 Dev/Test
FIND Milestone
2.5 Expand Capacity 3.7 Expand Capacity
1.6/2.2/3.3 Prepare and load contributed data in Data Platform as it becomes available
2.3/3.4 Review and approve access requests as they are submitted
2.1/3.2 Monitor performance and usage
1.1.1 Inventory of available DBs
1.2.1 Form Expert Panel and support Data Platform development and use
1.2.2 Develop criteria for determination of resistance mutations
1.2.3 Develop algorithms for interpretation of genotypic data
1.2.3 Published algorithm for interpretation of genotypic data
1.3.1 Develop guidelines/criteria for clinical validation of assays to detect/interpret resistance mutation
1.4.1 Support for development of access models and tools for broad access
PHASE 2 PHASE 3
C-PATH Milestones
3.9 Pursue funding to support clinical use
2.8/3.1 Pursue sustainability funding
1.6 Load early data
2.7 Beta Test
1.4 Request early data
1.9 Prep for production
Request early data
1.5.1/1.5.2 Support for sustainable business model and review process
1.1.2 Early Data Packages available for inclusion in Data Platform
1.2.2 Defined Criteria for determination of resistance mutations
1.3.1 WHO report on guidelines/criteria for validation of assays to detect and interpret resistance mutations
1.1.2 Prepare data packages for inclusion in Data Platform
1.1.3 Input to C-Path on design of Data Platform
PHASE 1
FIND Milestones
Assist with development of Value Proposition and Communications Plan
Next steps: the art of the start
• Build and deploy
• Expanded access
• Sustainability
Rapid DST Data Platform
• Big job in front of us
• We are not starting from scratch
• We have lots of help
We can do this!
www.c-path.org
Rapid DST Data Platform
Backup
Rapid DST Data Platform
+ Lean, efficient and well managed implementation
+ Expandable / adaptable / flexible
+ Great usability
Rapid DST Data Platform
Investigational DB with user access levels (data team, RDST, external)
User friendly cloud interface
FASTQ data files
Incoming data storage
internal
external
Rapid DST Data Platform
Strong alignment with anticipated analysis use cases