NEXT-GENERATION DISEASE REGISTRIES AND POPULATION HEALTH INFORMATICS INITIATIVES, WITH A FOCUS ON REAL-WORLD EVIDENCE
Jim Lacey, Jr., Ph.D. Director, Division of Cancer Etiology
@jimlaceyjr
2.
• A Disease Registry, an Observational Cohort Study, and a Population Health Plan walk into a data-driven, forward-looking restaurant …
3.
“Welcome. Table for one?”
A modern data ecosystem should leverage the core similarities between:
- Disease registries
- Observational studies
- Population health analysis
4.
Precision Medicine and Population Health
• Registries and observational data are the current workhorses of population health, patient-centered, and translational research
• Population Health Management plans have the potential to facilitate even broader & better integrated research
• None of these 3 approaches is currently where it should be
https://cancercontrol.cancer.gov/research-emphasis/precision-medicine.html
5.
Registry
• Data: Uniform data about specific outcomes in a population
• Goals: Facilitate research, evaluate effectiveness, improve outcomes
• Challenges: How to sustain, standardize, share, and generalize data?
Observational Study
• Data: Uniform data, including outcomes, about a specific population
• Goals: Conduct etiologic and translational research
• Challenges: How to sustain, standardize, share, and generalize data?
Population Health
• Data: Aggregated data from a population of patients
• Goals: Analyze data to improve clinical and financial outcomes
• Challenges: How to prioritize, standardize, stratify, and translate data?
Similar data, goals, and challenges
6.
Problem We’re All Trying to Solve
Same general process for registries, observational studies, and population health plans, but no one has yet perfected it
7.
Voice of the (Frustrated) Customer
8.
California Teachers Study www.calteachersstudy.org
9.
CTS Clinical Endpoint Data
10.
A Typical Participant Journey
Surveys ’17-pres.
And a typical observational study:
• All on-premise data,
• siloed at 4 institutions, &
• managed and analyzed manually
11.
How the CTS is Tackling this Problem Today
12.
CTS Data Warehouse, built by SDSC
13.
Address the Workforce Gap through Managed Services
14.
Relational Database, File, and ETL Servers
15.
Scope
16.
Two Primary Use Cases for Epidemiologists
Data Exploration
• How many …
  – Of a type of cancer?
  – Biospecimens?
  – Cancers with biospecimens?
  – Responded to the questions about …?
• Goal: inform feasibility
  – Exemplifies many disease-registry use cases
  – Patient matching for trials
Data Analysis
• Enable statistical analyses
  – Cohort selection
  – Include/exclude criteria & follow-up time
  – Choose covariates
  – Univariate & multivariate
• Goal: discover & publish
  – Analytic methods for disease endpoints are similar to methods for population-health endpoints, e.g., cost
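The "How many …" questions above can be sketched as simple counting queries. This is only an illustration using Python's built-in sqlite3 module: the table names, column names, and rows are invented, since the slides do not show the actual CTS warehouse schema.

```python
import sqlite3

# Hypothetical toy tables and rows; NOT the real CTS warehouse schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cancers (participant_id INTEGER, site TEXT);
CREATE TABLE biospecimens (participant_id INTEGER, specimen_type TEXT);
INSERT INTO cancers VALUES (1, 'breast'), (2, 'breast'), (3, 'colon'), (4, 'breast');
INSERT INTO biospecimens VALUES (1, 'blood'), (3, 'blood'), (5, 'saliva'), (1, 'saliva');
""")


def count_participants(sql, params=()):
    """Run a query that returns one count and hand it back as an int."""
    return conn.execute(sql, params).fetchone()[0]


# How many participants with a given type of cancer?
n_breast = count_participants(
    "SELECT COUNT(DISTINCT participant_id) FROM cancers WHERE site = ?",
    ("breast",),
)

# How many participants with any biospecimen?
n_with_specimens = count_participants(
    "SELECT COUNT(DISTINCT participant_id) FROM biospecimens"
)

# How many participants with that cancer AND at least one biospecimen?
n_breast_with_specimens = count_participants(
    """SELECT COUNT(DISTINCT c.participant_id)
       FROM cancers c
       JOIN biospecimens b ON b.participant_id = c.participant_id
       WHERE c.site = ?""",
    ("breast",),
)

print(n_breast, n_with_specimens, n_breast_with_specimens)
```

Counting distinct participants rather than rows matters here, because one person can contribute several specimens or diagnoses.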
17.
Data Exploration: Visualization
18.
Pick-Lists, not Code, for Routine Data
19.
Anticipate, Automate, and Deliver
20.
Anticipate, Automate, and Deliver (cont’d)
21.
Strengths of CTS DW Environment
Data Exploration
• Provide value by …
  – Replacing code with clicks
  – Displaying related results right away, instead of requiring users to ask for the same data every time
• Goal: enable better, faster, deeper decision-making about feasibility
Data Analysis
• Enable statistical analyses
  – Cohort selection
  – Include/exclude criteria & follow-up time
  – Choose covariates
  – Univariate & multivariate
• Goal: discover & publish
  – Analytic methods for disease endpoints are similar to methods for population-health endpoints, e.g., cost
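As a rough illustration of "cohort selection" and "include/exclude criteria & follow-up time", the sketch below applies one example exclusion and computes years at risk, with follow-up ending at diagnosis, death, or end of study, whichever comes first. The field names, dates, and the single exclusion rule are assumptions made for the example, not the CTS definitions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Participant:
    # Field names are hypothetical; the CTS data model is not shown in the slides.
    pid: int
    baseline: date                      # date follow-up starts
    prior_cancer: bool                  # example exclusion criterion
    dx_date: Optional[date] = None      # first cancer diagnosis, if any
    death_date: Optional[date] = None   # death from any cause, if any


def select_cohort(people: List[Participant]) -> List[Participant]:
    """Apply one example exclusion: drop anyone with cancer before baseline."""
    return [p for p in people if not p.prior_cancer]


def follow_up(p: Participant, end_of_study: date) -> Tuple[float, bool]:
    """Return (years at risk, event flag); exit at diagnosis, death, or study end."""
    exits = [d for d in (p.dx_date, p.death_date, end_of_study) if d is not None]
    exit_date = min(exits)
    # The participant counts as an event only if diagnosis is what ended follow-up.
    is_event = p.dx_date is not None and p.dx_date == exit_date
    years = (exit_date - p.baseline).days / 365.25
    return years, is_event


END = date(2010, 12, 31)  # assumed administrative end of follow-up
people = [
    Participant(1, date(2000, 1, 1), False, dx_date=date(2005, 1, 1)),
    Participant(2, date(2000, 1, 1), True),
    Participant(3, date(2000, 1, 1), False, death_date=date(2003, 1, 1)),
]
cohort = select_cohort(people)
results = {p.pid: follow_up(p, END) for p in cohort}
```

Participants who die or reach the end of the study without a diagnosis are censored: they contribute person-time but no event.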
22.
Data Analysis: Add Structure and Process
• Before: “Email the PI …”
  – Email is not a process
• Instead, standardize the way users define & request data
  – Framework: Timeline, type of analysis, number of outcomes
  – Cancer: Codes/site, Incident/prevalent/fatal, Dx date, censoring
  – Mortality: Cause of death; previous diagnoses
  – Hospitalization: ICD code(s), Incident/prevalent, 1st/2nd, Dx date
  – Survey data: Which survey / sections / questions
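One way to picture "standardize the way users define & request data" is a structured request object that is checked before any data are pulled. The sketch below is a guess at what such a structure might look like; the class names, field names, and validation rules are illustrative assumptions and not the actual CTS request form.

```python
from dataclasses import dataclass, field
from typing import List

# Allowed values taken from the slide text: incident / prevalent / fatal.
ALLOWED_CASE_TYPES = {"incident", "prevalent", "fatal"}


@dataclass
class CancerEndpoint:
    # Hypothetical fields mirroring "Codes/site, Incident/prevalent/fatal, censoring".
    site_codes: List[str]
    case_type: str
    censoring_rule: str


@dataclass
class DataRequest:
    # Hypothetical fields mirroring "type of analysis, number of outcomes, survey data".
    analysis_type: str
    n_outcomes: int
    cancer: CancerEndpoint
    surveys: List[str] = field(default_factory=list)


def validate(req: DataRequest) -> List[str]:
    """Return a list of human-readable problems; an empty list means the request is usable."""
    problems = []
    if req.n_outcomes < 1:
        problems.append("at least one outcome is required")
    if not req.cancer.site_codes:
        problems.append("cancer site codes are missing")
    if req.cancer.case_type not in ALLOWED_CASE_TYPES:
        problems.append(f"unknown case type: {req.cancer.case_type!r}")
    return problems


good = DataRequest("cohort", 1, CancerEndpoint(["C50"], "incident", "death or end of follow-up"), ["Q5"])
bad = DataRequest("cohort", 0, CancerEndpoint([], "incident", "death or end of follow-up"))

good_problems = validate(good)
bad_problems = validate(bad)
```

Unlike an email thread, a request in this form can be rejected, stored, and replayed in the same way every time.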
23.
Analysis Data Request Form
24.
Analysis Data Request Form: Cancer
25.
Analysis Data Request Form: Self-report data
26.
What We Do With that Information
• Epidemiology lives in a desktop SAS world …
Insert choices into SAS PROC SQL template
Identifies data to call from DM
Give users shared workspace:
• Tableau workbook
• 1-line SAS include code template
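The step "insert choices into SAS PROC SQL template" might look roughly like the sketch below, which fills a text template with a user's choices and produces the one-line include statement. The table names, the site_code column, and the file path are placeholders invented for illustration; the real CTS templates are not shown in the slides.

```python
import re
from string import Template

# Hypothetical PROC SQL template; site_code is an assumed column name.
PROC_SQL_TEMPLATE = Template("""\
proc sql;
  create table $out_table as
  select $columns
  from $source_table
  where site_code in ($site_list);
quit;
""")

# Accept only plain SAS-style names (optionally libref.member) and simple codes,
# so that user choices cannot smuggle extra statements into the generated program.
_IDENT = r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?"
_CODE = r"[A-Za-z0-9.]+"


def render_proc_sql(out_table, source_table, columns, site_codes):
    """Fill the template with validated choices and return the SAS program text."""
    for name in [out_table, source_table, *columns]:
        if not re.fullmatch(_IDENT, name):
            raise ValueError(f"unsafe identifier: {name!r}")
    for code in site_codes:
        if not re.fullmatch(_CODE, code):
            raise ValueError(f"unsafe site code: {code!r}")
    return PROC_SQL_TEMPLATE.substitute(
        out_table=out_table,
        source_table=source_table,
        columns=", ".join(columns),
        site_list=", ".join(f"'{c}'" for c in site_codes),
    )


def include_line(path):
    """Return the single %include statement a user would paste into desktop SAS."""
    return f'%include "{path}";'


sql = render_proc_sql(
    "work.breast_cases",             # hypothetical output table
    "dm.cancer_events",              # hypothetical source table
    ["participant_id", "dx_date"],   # hypothetical columns
    ["C50"],
)
line = include_line("/shared/project123/get_data.sas")  # hypothetical path
```

Validating every choice before substitution keeps the generated program limited to the single query the template intends.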
27.
Strengths of CTS DW Environment
Data Exploration
• How many …
  – Of a type of cancer?
  – Biospecimens?
  – Cancers with biospecimens?
  – Responded to the questions about …?
• Goal: inform feasibility
  – Exemplifies many disease-registry use cases
  – Patient matching for trials
Data Analysis
• Provide value by …
  – Standardizing the process of specifying essential data decisions
  – Delivering ready-to-go, fit-for-project, analytic data
• Goal: eliminate unnecessary barriers to use and empower users to spend time analyzing, not wrangling, data
28.
Ready, Set, Go!
• Replaced silos with single source
• Exposed prev. unavailable data
• Standardized & normalized data
• Introduced data visualization
• Provided shared workspace
• Eliminated pages of SAS code
… led to moderate uptake.
29.
Not So Fast …
It only takes 10 minutes of direct observation of a nurse in a hospital to understand care-delivery processes are not standardized and are dependent on individuals, not systems. This lack of reproducibility leads to errors. Since every caregiver does it his or her own way, it’s difficult to improve anything. Stable systems that are reproducible are required to deliver consistently high quality. Industrial companies figured this out 50 years ago. What that nurse and doctor do and how they do it are still what will determine successful outcomes of care. Every nurse and doctor does not get to do it his or her own way. Standards are established about how the work is performed, and those standards are followed by all until a better way is determined collectively by the team.
30.
Not So Fast …
It only takes 10 minutes of direct observation of a _____ to understand _____ not standardized and are dependent on individuals, not systems. This lack of reproducibility leads to errors. Since every _____ does it his or her own way, it’s difficult to improve anything. Stable systems that are reproducible are required to deliver _____. Industrial companies figured this out 50 years ago. What that _____ do and how they do it are still what will determine successful outcomes of _____. Every _____ does not get to do it his or her own way. Standards are established about how the work is performed, and those standards are followed by all until a better way is determined collectively by the team.
31.
A Counterclockwise CTS Journey, From the Bottom Up
32.
Where is the Boundary Between Exploration & Analysis?
Open-Ended Exploration
Traditional Analysis
Increasing #s of queries & slice-and-dice strain exploration & visualization tools
Example 1: calculate power
• All-cause mortality endpoint
• Multiple questionnaire variables
• Within Tableau, connect 26,000 deaths to all 5,122 questionnaire variables to enable picklist that updates distributions
Example 2: focused analytic subgroup
• Identify all participants who completed Q5, were cancer-free, donated blood within +/- 1 year of Q5, and answered specific questions in Q5
• Ask user to request analytic data
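Example 2 is essentially a chain of inclusion criteria, which is why it fits a data request better than a dashboard. A minimal sketch of that filter follows; the record layout, dates, question names, and the 365-day reading of "+/- 1 year" are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical records; field names and values are invented for this sketch.
participants = [
    {"pid": 1, "q5_date": date(2013, 3, 1), "blood_date": date(2013, 9, 1),
     "cancer_free": True, "q5_answered": {"sleep", "diet"}},
    {"pid": 2, "q5_date": date(2013, 3, 1), "blood_date": date(2015, 6, 1),
     "cancer_free": True, "q5_answered": {"sleep", "diet"}},    # blood draw too far from Q5
    {"pid": 3, "q5_date": None, "blood_date": date(2013, 5, 1),
     "cancer_free": True, "q5_answered": set()},                # never completed Q5
    {"pid": 4, "q5_date": date(2013, 4, 1), "blood_date": date(2013, 2, 1),
     "cancer_free": False, "q5_answered": {"sleep", "diet"}},   # not cancer-free
    {"pid": 5, "q5_date": date(2013, 4, 1), "blood_date": date(2012, 10, 1),
     "cancer_free": True, "q5_answered": {"sleep"}},            # missing a required question
]

WINDOW = timedelta(days=365)            # assumed meaning of "+/- 1 year"
REQUIRED_QUESTIONS = {"sleep", "diet"}  # hypothetical "specific questions in Q5"


def in_subgroup(p):
    """True if the participant meets every inclusion criterion in Example 2."""
    if p["q5_date"] is None or p["blood_date"] is None:
        return False
    if not p["cancer_free"]:
        return False
    if abs(p["blood_date"] - p["q5_date"]) > WINDOW:
        return False
    return REQUIRED_QUESTIONS <= p["q5_answered"]


subgroup_ids = [p["pid"] for p in participants if in_subgroup(p)]
```

Each criterion removes participants for a different reason, so recording which rule excluded whom is a natural extension when feasibility numbers need to be explained.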
33.
Next Steps for the CTS
• 5-year project funded by U01-CA199277, Sep. 2015 to Aug. 2020
  – Continue to build library of analytic & exploration tools
    • Every new project helps to fill the shelves with reusable templates
  – Build out domain-specific visualizations to drive insights
    • Cancers, hospitalizations, questionnaire, biospecimens
• Ultimate goals:
  – Deliver analytic results at the time users define their data requests
    • A priori defined analysis plan should be a script that runs at selection
  – Utilize the Calif. Teachers Study to develop, test, & harden observational data in the NCI Data Commons and Data Ecosystem
    • All CTS data should be publicly available for individual-level analysis
    • Need to address data processes, but lots of lessons learned to date
34.
• Observational data and disease registries are likely to be rate-limiting steps in achieving a fully integrated Cancer Data Ecosystem
  – Less big data means less recognition of need for & urgency to change
  – More process problems means a bigger workflow challenge
    • And greater need to accelerate adoption of automated analytics
• Leveraging the pipelines and analytic workflows from the NCI GDC and genomics clouds could accelerate the integration of registry, observational study, and population-based data
• Essential to bend the cost curve
  – Less data management and analysis, and more results interpretation
35.
Open Invitation to use California Teachers Study data
• Aim 4 of this U01 grant: openly share our progress & lessons learned
• Please join us & help us expand this work
  – Multiple opportunities, collaborative or independent, for…
    • Analytics projects
    • Innovative visualizations
    • Software development
    • Grants and analyses
www.calteachersstudy.org