Secure Data Sharing and Related Matters – An NIH View
Philip E. Bourne, PhD, FACMI
Associate Director for Data Science
National Institutes of Health
October 26, 2015
Disclaimer…
I am not a cybersecurity expert and, as an informatician who previously
worked primarily in the pre-clinical space, I am not an expert in security
associated with human subjects
Conversation Cards
What is happening that makes a discussion of security important?
What is secure anyway?
How is the NIH responding in this changing landscape?
Big Data in the Life Sciences …
This speaks to something more fundamental than more data …
It speaks to new methodologies, new skills, new emphasis, new cultures,
new modes of discovery …
Consider this change from my own career experience ….
The History of Computational Biomedicine According to Bourne
Timeline: 1980s, 1990s, 2000s, 2010s, 2020
Discipline: Unknown; Expt. Driven; Emergent; Over-sold; A Service; A Partner; A Driver
The Raw Material: Non-existent; Limited/Poor; More/Ontologies; Big Data/Siloed; Open/Integrated
The People: No name; Technicians; Industry recognition; Data scientists; Academics
Searls (ed) The Roots in Bioinformatics Series PLOS Comp Biol
Consider what the expert prophets are saying …
We are at a Point of Deception …
Evidence:
– Google car
– 3D printers
– Waze
– Robotics
– Sensors
From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies by Erik Brynjolfsson & Andrew McAfee
Example - Photography
(Figure: the six stages of the framework; axes: Time; Volume, Velocity, Variety)
– Digitization: Digital camera invented by Kodak but shelved
– Deception: Megapixels & quality improve slowly; Kodak slow to react
– Disruption: Film market collapses; Kodak goes bankrupt
– Demonetization: Phones replace cameras
– Dematerialization: Instagram, Flickr become the value proposition
– Democratization: Digital media becomes bona fide form of communication
We Are at a Point of Deception: The 6D Exponential Framework
Digitization of Basic & Clinical Research & EHRs
Deception (We Are Here)
Disruption
Demonetization
Dematerialization
Democratization
Open science
Patient centered health care
What Are Some General Implications of Such a Future?
Open collaborative science becomes increasingly important
Data and associated analytics become increasingly valuable to scholarship
Opportunities exist to improve the efficiency of the research enterprise and hence fund more research
Cooperation between funders will be needed to sustain the emergent digital enterprise
Current training content and modalities will not match supply to demand
Balancing accessibility vs security becomes more important yet more complex
An Example of That Promise: Comorbidity Network for 6.2M Danes Over 14.9 Years
Jensen et al 2014 Nat Comm 5:4022
“And that’s why we’re here today. Because something called precision medicine … gives us one of the greatest opportunities for new medical breakthroughs that we have ever seen.”
President Barack Obama
January 30, 2015
Precision Medicine Initiative
National Research Cohort
– >1 million U.S. volunteers
– Numerous existing cohorts (many funded by NIH)
– New volunteers
Participants will be centrally involved in design and implementation of the cohort
They will be able to share genomic data, lifestyle information, biological samples – all linked to their electronic health records
For the Purposes of this NIH Centric Digital Discussion:
What is Secure Anyway?
Access to digital research objects is limited to when, how, and by whom
access is authorized, in accordance with the wishes of the owner and/or
the laws and policies which define accessibility
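As an illustration only, the working definition above can be read as an access check with three dimensions: who, how, and when. The sketch below is a minimal, hypothetical model, not an NIH system or API; the names `AccessRule`, `ResearchObject`, and `is_authorized` are assumptions introduced for this example, and the rules stand in for whatever the owner's wishes and the applicable laws and policies would allow.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class AccessRule:
    """Hypothetical authorization: who may access, how, and during which dates."""
    who: str          # a named party or role, e.g. "approved-investigator"
    modes: frozenset  # permitted modes of access, e.g. {"read"}
    start: date       # first day on which access is allowed
    end: date         # last day on which access is allowed


@dataclass
class ResearchObject:
    """Hypothetical digital research object (narrative, data, software, publication)."""
    name: str
    owner: str
    rules: list = field(default_factory=list)


def is_authorized(obj: ResearchObject, who: str, how: str, when: date) -> bool:
    """Allow access only if some rule covers the requester, the mode, and the date."""
    return any(
        rule.who == who and how in rule.modes and rule.start <= when <= rule.end
        for rule in obj.rules
    )


if __name__ == "__main__":
    dataset = ResearchObject(
        name="example-dataset",
        owner="Institution A",
        rules=[AccessRule("approved-investigator", frozenset({"read"}),
                          date(2015, 1, 1), date(2015, 12, 31))],
    )
    print(is_authorized(dataset, "approved-investigator", "read", date(2015, 10, 26)))
```

Under this reading, anything not explicitly authorized is denied; a real system would also need to reconcile rules coming from different owners and levels of governance.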
Some of the Complexities
Research objects
– Narrative
– Data – preclinical and clinical
– Software
– Publications
Owner
– Individual
– Institution
– Funding agency
– Third party
Governance
– Federal
– Funding agency
– Institutional
– Third party
Consider This Response from 3 Intersecting Perspectives
Community
Policy
Infrastructure
Laying the Foundation for Open Access:
HGP, Bermuda, 1996
“The HGP changed the norms around data sharing in biomedical research.”
Data Sharing Goes Global: GA4GH
Global Alliance for Genomics and Health
Accelerating the potential of genomic medicine to advance human health, by:
– Establishing common framework of approaches to enable effective, responsible sharing of genomic and clinical data
– Catalyzing data sharing projects that drive and demonstrate value of data sharing
Alliance*: >350 leading institutions (healthcare, research, advocacy, life science, IT) representing 35 countries
Working groups (Clinical, Data, Security, Regulatory & Ethics) assess, prioritize needs
– Form task teams to produce tools, solutions, demonstration projects
*Statistics as of October 5, 2015
A Culture of Sharing
1999: Research Tools Policy
2003: NIH Data Sharing Policy
2004: Model Organism Policy
2007: Genome-wide Association (GWAS) Policy
2008: NIH Public Access Policy (Publications)
2012: Big Data to Knowledge (BD2K) Initiative
2013: White House Initiative (“Holdren Memo”)
2014: Genomic Data Sharing (GDS) Policy
2014: Modernization of NIH Clinical Trials
Guiding Principle of NIH GWAS Policy
The greatest public benefit will be realized if data from GWAS are made available, under terms and conditions consistent with the informed consent provided by individual participants, in a timely manner to the largest possible number of investigators.
NIH expectation that data would be shared in the NIH database of Genotypes and Phenotypes (dbGaP)
(Figure: Data Access Requests Per Year, 2007–September 2015)
NIH Public Access Policy for Publications
Ensures public access to published results of all research funded by NIH since 2008
– Recipients of NIH funds required to submit final peer-reviewed journal manuscripts to PubMed Central (PMC) upon acceptance for publication
– Papers must be accessible to the public on PMC no later than 12 months after publication
Harnessing Data to Improve Health: BD2K (Big Data to Knowledge)
NIH’s 6-year initiative to use data science to foster an open digital ecosystem that will accelerate efficient, cost-effective biomedical research to enhance health, lengthen life, and reduce illness and disability
Programs and activities:
– Advance discovery for biomedical research
– Facilitate use and re-use of biomedical data
– Develop analytical methods and software
– Enhance biomedical data science training
NIH Genomic Data Sharing (GDS) Policy
Purpose
– Sets forth expectations, responsibilities that ensure broad, responsible sharing of genomic research data in a timely manner
Scope
– All NIH-funded research generating large-scale human or non-human genomic data – and their use for subsequent research
• Data to be submitted to NIH-designated data repositories (e.g., dbGaP, GEO, GenBank, WormBase, FlyBase, Rat Genome Database)
– Applies to all funding mechanisms (grants, contracts, intramural support) with no minimum threshold for cost
Released August 2014; effective January 25, 2015
gds.nih.gov
Modernizing NIH Clinical Trials Activities: The Need
NIH-Funded trials published within 100 months of completion
Less than 50% published within 30 months of completion
BMJ 2012;344:d7292
Modernizing NIH Clinical Trials Activities:
Call to Action
Increasing Clinical Trial Transparency
Proposed November 2014; Final Spring 2016 (est.)
Notice of Proposed Rulemaking: Clinical Trials Registration and Results Submission (FDAAA, Section 801)
– Further implements statutory requirements on private and public sponsors to register; report results on phase 2, 3, and 4 trials
– Includes drugs, biologics, and devices (except small feasibility)
Draft NIH Policy on Clinical Trial Information Dissemination
– Extends Section 801 requirements to all NIH-funded clinical trials
– Includes phase 1 trials and trials of non-FDA regulated interventions such as behavioral trials
Infrastructure - The Commons
(Figure labels: BD2K Centers, DDICC, Software, Standards, Labs)
The Commons: Components
The Commons Digital Object Compliance: FAIR
Attributes of digital objects in the Commons
– Initial Phase
• Unique digital object identifiers of some type
• A minimal set of searchable metadata
• Physically available in a cloud based Commons provider
• Clear access rules (especially important for human subjects data)
• An entry (with metadata) in one or more indices
– Future Phases
• Standard, community based unique digital object identifiers
• Conform to community approved standard metadata for enhanced searching
• Digital objects accessible via open standard APIs
• Are physically and logically available to the Commons
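To make the initial-phase attributes concrete, here is a minimal sketch of how a digital object might be checked against them. It is an assumption-laden illustration, not the Commons implementation: the `DigitalObject` fields and the `missing_initial_phase_attributes` function are hypothetical names chosen to mirror the five initial-phase bullets, and "present" is reduced to "non-empty" for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical record mirroring the five initial-phase attributes."""
    identifier: str = ""                                # unique identifier of some type
    metadata: dict = field(default_factory=dict)        # minimal searchable metadata
    cloud_location: str = ""                            # where it sits in a cloud provider
    access_rules: str = ""                              # statement of who may access it
    index_entries: list = field(default_factory=list)   # indices that list the object


def missing_initial_phase_attributes(obj: DigitalObject) -> list:
    """Return the names of initial-phase attributes that the object lacks."""
    checks = {
        "unique identifier": bool(obj.identifier),
        "searchable metadata": bool(obj.metadata),
        "cloud availability": bool(obj.cloud_location),
        "access rules": bool(obj.access_rules),
        "index entry": bool(obj.index_entries),
    }
    return [name for name, present in checks.items() if not present]


if __name__ == "__main__":
    candidate = DigitalObject(
        identifier="example-id-0001",           # placeholder, not a real identifier
        metadata={"title": "Example dataset"},
        cloud_location="example-provider/bucket/object",
        index_entries=["example-index"],
    )
    print(missing_initial_phase_attributes(candidate))
```

An object with an empty result would meet the initial-phase list as modeled here; the future-phase items (community standards, open APIs) would need checks against external specifications that this sketch does not attempt.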
BD2K Targeted Software Topics
Supports innovative analytical methods and software tools that address critical current and emerging needs of biomedical research
2015 Topics (18 awards, U01s)
– Data Compression
– Data Provenance
– Data Visualization
– Data Wrangling
2016 Topics (U01s, under review)
– Data Privacy
– Data Repurposing
– Applying Metadata
– 2016: Crowdsourcing and interactive Digital Media (UH2)
I not only use all the brains I have, but all I can borrow.
– Woodrow Wilson
The Team
NIH… Turning Discovery Into Health
[email protected]
datascience.nih.gov
http://www.ncbi.nlm.nih.gov/research/staff/bourne/