Three Critical Ideas for UC Health Sciences Cyber Infrastructure

3/23/2015

Joe Hesse - [email protected]
Director of Innovation, UCSF Memory and Aging Center
Technical Lead, UCSF Neuroscience Knowledge Network
HPC Cluster Administrator, UCSF Institute for Human Genetics

cnc.ucr.edu/uccybersummit/images/joehesseuccyberplan.pdf · 2015-05-06

Driving “Cyber” Needs for Health Science

In discovering causes and developing treatments for disease, and in promoting health, encouraging prevention, and delivering care, we fundamentally need:

1.  To reason, compute, and discover as health professionals, clinical researchers, and social and basic scientists, in any combination of roles, at any time.

2.  To harness agile, cost-effective, and easy-to-use computational infrastructure throughout the full lifecycle of our research and clinical activities.

3.  To interact with colleagues through ubiquitous and collaborative “data science” environments characterized by rich data and method annotations, secure and audited sharing, and transformational communication methods.

3/24/15 Three Critical Ideas for UC Health Sciences Cyber Infrastructure - Joe Hesse 2

Idea One: Develop and Deliver Regulatory Compliant Service Layers for Research Cyber Infrastructure.

A Fundamental Problem:

•  We practically force clinical researchers to abandon their clinical role (and access to their patients’ data) before they use most research computational infrastructure.

•  Asking medical professionals to really “de-identify” data to meet compliance standards (with associated legal liabilities) is impractical. This results in a lot of “don’t ask, don’t tell” behavior.

A Key Opportunity:

•  Current technology trends (e.g. agile dev-ops, platforms as a service, software-defined everything) and maturing open source tools (often with enterprise options) make it practical to develop and deliver complex security and monitoring to research infrastructure at commodity prices using existing university IT capabilities.

Delivering Regulatory Compliance through Mgmt & Orchestration

HPC Pilot Project FY 2015/16 (funding decision pending)

Designing a new unified management and orchestration layer and providing a single portal for access to three distinct high performance computing clusters (each currently serving distinct user communities).

Opportunity to intentionally design security, monitoring and auditing layers with regulatory compliance as a target.

There are many benefits, but reducing user and administration costs by standardizing the most complex aspects of the environments is a key driver.

High Performance Computing: Simple Schematic of Layers.

Common tools and service layers to support distinct HPC workloads and hardware

Idea Two: Create a Continuum of Research Cyber Infrastructure to Support the Complete User Investigatory Experience.

Fundamental Barriers / Problems:

•  Most research computational infrastructure is organized around technology stacks that aim to meet cohering subsets of user needs, but that fall short of addressing the complete analytic and discovery lifecycle needs of complex investigations.

•  Users often find it difficult to use idiosyncratically developed technology solutions; find it impossible to navigate between these infrastructural silos; are usually unable to apply their funding in an agile manner across support organizations; and frequently lack knowledge about the most appropriate tools.

Key Opportunity:

•  Extend common, regulatory compliant management and orchestration layers to the full continuum of research cyber technologies.

Vision of Continuum of Research Cyber Infrastructure

Common portal, access, and billing tools streamline the user experience. For example, using a secure reporting station to 1) query the EMR or research database for correlative variables that 2) drive an exploratory neuroimaging analysis on a large virtual workstation, which will 3) become a pipelined HPC or GPU cluster analytic applied retrospectively to thousands of imaging studies, is not only possible but commonplace.

Idea Three: Use Novel Collaboration and Data Science Environments to Bridge the Huge Data, Method and Knowledge Divides.

Fundamental Problems / Barriers:

•  With the increasing size, complexity, and privacy / ethical concerns of our health sciences / biomedical data, siloed research infrastructure stacks become like true islands without any chance of meaningful integration for users. Data portability is extremely difficult and frequently insecure.

•  Reproducibility of results and cleanly annotated analytic provenance of derived data products remain elusive.

•  Enormous startup costs to simply adopt methods and tools from other labs or collaborators.

Key Opportunity:

•  Support and prioritize development of novel ubiquitous data environments designed for research and collaborative science.

Collaboration and Data Workspace Environments

KBase (www.kbase.us) is the first large-scale bioinformatics system that enables users to upload their own data, analyze it (along with collaborator and public data), build increasingly realistic models, and share and publish their workflows and conclusions.

KBase aims to provide a knowledgebase: an integrated environment where knowledge and insights are created and multiplied.

Collaboration and Data Workspace Environments

KNECT (inspired by KBase) is a prototype knowledge network environment for precision medicine.

KNECT aims to provide a common data workspace with:

•  richly typed and annotated data objects,

•  tightly integrated scalable data science cluster and service technologies (e.g. Spark, Docker), and

•  clinically compliant security and auditing frameworks.
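
The three workspace properties listed above can be made concrete with a small sketch. This is only a hypothetical illustration, not KNECT's actual design: the names `DataObject`, `Workspace`, `put`, `get`, and `lineage` are invented here, the data values are synthetic, and an in-memory dictionary stands in for whatever storage and cluster services a real system would use. It shows typed objects that carry annotations and provenance, and a workspace that writes an audit record for every access.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib
import json


@dataclass(frozen=True)
class DataObject:
    """A typed, annotated data object (hypothetical schema, for illustration)."""
    type_name: str            # declared type, e.g. "ImagingStudy"
    payload: dict             # the data itself (kept tiny here)
    annotations: dict         # free-form metadata describing the data
    derived_from: tuple = ()  # ids of parent objects (provenance)
    method: str = ""          # name of the method that produced this object

    @property
    def object_id(self) -> str:
        # Content-derived id: identical content always yields the same id.
        body = json.dumps(
            [self.type_name, self.payload, self.annotations,
             list(self.derived_from), self.method],
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()[:12]


@dataclass
class Workspace:
    """Shared workspace that records every read and write in an audit log."""
    objects: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def _audit(self, user: str, action: str, object_id: str) -> None:
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "object_id": object_id,
        })

    def put(self, user: str, obj: DataObject) -> str:
        self.objects[obj.object_id] = obj
        self._audit(user, "put", obj.object_id)
        return obj.object_id

    def get(self, user: str, object_id: str) -> DataObject:
        self._audit(user, "get", object_id)
        return self.objects[object_id]

    def lineage(self, object_id: str) -> list:
        """Return the ids of all ancestors of an object, nearest first."""
        ancestors = []
        queue = list(self.objects[object_id].derived_from)
        while queue:
            parent_id = queue.pop(0)
            if parent_id not in ancestors:
                ancestors.append(parent_id)
                queue.extend(self.objects[parent_id].derived_from)
        return ancestors


# Usage with synthetic values: one raw object and one derived object.
ws = Workspace()
raw = DataObject("ImagingStudy", {"n_images": 3}, {"source": "synthetic example"})
raw_id = ws.put("alice", raw)
summary = DataObject(
    "SummaryTable", {"mean_volume": 1.5}, {"units": "arbitrary"},
    derived_from=(raw_id,), method="mean_volume_v1",
)
summary_id = ws.put("alice", summary)
ws.get("bob", summary_id)  # a collaborator's read is audited as well
```

In this sketch, provenance is stored on the object itself, so a derived data product can always be traced back to its inputs, while auditing lives in the workspace, so no access path bypasses the log. A compliant system would additionally need authentication, authorization, and tamper-resistant log storage, none of which is modeled here.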

Vision for Complete Data Science Environments

•  Collaboration spaces / knowledge networks / open science environments layer on top of a unified continuum of technologies to connect investigators and investigations.

•  Ubiquitous hyper-converged data environments underneath the full technology stack enable complete data portability and scientific agility.

Summary of Ideas for Cyber Infrastructure Health Science Priorities

1) Build regulatory compliance at the foundation of research infrastructure.

2) Emphasize a unified user experience across the continuum of research computational tools needed for translational health science discovery and delivery.

3) Prioritize support for novel collaboration and data environments that connect investigators and investigations.

Acknowledgements and Appreciation

Funding Sources and Key Supporters

Dr. Keith Yamamoto, UCSF Vice Chancellor for Research

Dr. Bruce Miller, Director, UCSF Memory and Aging Center

Tau Consortium, www.tauconsortium.com

Dr. Neil Risch, Director, UCSF Institute for Human Genetics

Joe Bengfort, UCSF Chief Information Officer

Colleagues and Inspiration

Dr. Kate Rankin, UCSF Memory and Aging

Brad Dispensa, UCSF Institute for Human Genetics

Dr. Adam Arkin, LBNL, UC Berkeley, KBase

Michael Schaffer, Dir. of Tech., UCSF Memory and Aging

Contact Info

Joe Hesse - [email protected]
Office: 415-502-0590
Mobile: 415-819-1054

UCSF Sandler Neurosciences Center