tranSMART Community Meeting 5-7 Nov 13 - Session 3: tranSMART’s Application to Clinical Biomarker Discovery Studies in Sanofi

Sherry Cao, Sanofi

This presentation discusses the challenges we are encountering in clinical biomarker discovery studies and how we are using tranSMART to help address them.


tranSMART’s Application to Clinical Biomarker Discovery in Sanofi

Sherry Cao, Ph.D.

tranSMART Community Meeting

Nov. 6th, 2013

Outline

● Challenges in clinical biomarker discovery

● How Sanofi is meeting those challenges

● Role of tranSMART

● tranSMART in Sanofi

Clinical Biomarker Discovery Process

Data Capture Discovery & Interpretation

Clinical Sample Validation

Molecular Information
• DNA
• RNA
• Protein
• Lipid
• Metabolites

Biomarkers
• Diagnostic
• Prognostic
• Efficacy

Signatures
• Molecular classifications
• Patient stratifications

Target ID/Credentialing
• Molecular targets
• Pathways
• Clinical phenotypes

Clinical Sample Procurement

Sample Sources
• In house
• Public

Type
• In silico
• Experimental

Clinical Information
• Patients
• Diseases
• Clinical Phenotypes
• Lab tests
• Pathology reports
• Drugs

Challenges for Clinical Biomarker Discovery

● High-throughput biological measurements generate an unprecedented amount of data for each biological sample
  ● Chip-based profiling technologies
  ● Exome, transcriptome & genomic sequencing technologies

● The complexity of disease biology requires large sample numbers to reach statistical significance
  ● GWAS studies for complex traits
  ● Molecular signature development for patient stratification

● Heterogeneous data types & data sources
  ● Research & clinical
  ● Structured & non-structured data

● Data curation is a critical & time-consuming process

● Complex analysis & visualizations are needed to transform data into knowledge

Data Management

Integration & Analysis

Interdisciplinary team for Clinical Biomarker Research

Clinical Informaticians

Research Informaticians

Clinical Statisticians

Clinicians

Research Scientists

CBR Team

Two Distinctive User Groups

Clinicians, Research Scientists
● Main Role: Hypothesis generation, mechanistic interpretation
● Statistical Analysis Type: Single variable, correlative analysis
● Statistical Tool Access: Very limited
● User Interface: Drag & drop GUI
● Major Complaints: Data acquisition, data analysis turnaround time

Informatic Scientists & Statisticians
● Main Role: Data analysis
● Statistical Analysis Type: Multi-variable complex analysis
● Statistical Tool Access: SAS, JMP, R
● User Interface: API
● Major Complaints: Data acquisition, data curation & reformatting, not enough time to do real analysis

Informatics Systems Mapped onto Research Flow

Research flow: Data Capture, Discovery, Interpretation, Clinical Sample Validation

Informatics systems: Data Management & Integration, Platform Specific System

Role of tranSMART within Sanofi

● Translational data hub: a one-stop shop for all data related to a biomarker discovery project
  ● Clinical & research data
  ● Structured & non-structured data
  ● Fully curated data for integrated analysis & not-fully curated data

● Deliver critically needed statistical/informatics analysis tools to clinicians & research scientists
  ● Univariate analysis
  ● Simple clustering analysis & heatmap generation

● Help informatics scientists generate custom analysis data sets based on distinctive cohort definitions
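The idea of generating a custom analysis data set from a cohort definition can be sketched as follows. This is a minimal illustration in Python and not tranSMART's API: the `Subject` fields, the `select_cohort` and `build_dataset` helpers, and the sample records are all hypothetical.

```python
import csv
import io
from dataclasses import asdict, dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Subject:
    # Hypothetical subject-level record; field names are illustrative only.
    subject_id: str
    diagnosis: str
    age: int
    marker_level: float


def select_cohort(subjects: Iterable[Subject],
                  definition: Callable[[Subject], bool]) -> List[Subject]:
    """Return the subjects that satisfy a cohort definition (a predicate)."""
    return [s for s in subjects if definition(s)]


def build_dataset(cohort: List[Subject]) -> str:
    """Serialize a cohort to CSV text for use in an external statistics tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["subject_id", "diagnosis", "age", "marker_level"],
        lineterminator="\n",
    )
    writer.writeheader()
    for subject in cohort:
        writer.writerow(asdict(subject))
    return buffer.getvalue()


# Made-up records, for illustration only.
SUBJECTS = [
    Subject("S1", "disease A", 54, 1.8),
    Subject("S2", "disease B", 61, 0.7),
    Subject("S3", "disease A", 47, 2.4),
]

# Cohort definition: subjects with disease A aged 50 or older.
older_disease_a = select_cohort(
    SUBJECTS, lambda s: s.diagnosis == "disease A" and s.age >= 50
)
dataset_csv = build_dataset(older_disease_a)
```

Expressing the cohort definition as a predicate keeps the selection logic separate from the export step, so the same export can be reused for any cohort.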

Data management & integration


Clinical Biomarker Discovery Use Case 1

● Business unit with established & active biomarker discovery process

● Samples are routinely sent out for profiling on different platforms

● Data are generated routinely by both CROs & internal groups
  ● High-throughput profiling data
  ● Low-throughput imaging & assay data (IHC, ELISA, qPCR, etc.)

● Situation
  ● Biomarker team reps are overwhelmed by data management-related questions, with little time to do actual analysis

● Critical need
  ● How to organize data effectively?
  ● How to manage the low-throughput data systematically together with clinical & high-throughput data?
  ● How to search & find the relevant data quickly?

tranSMART in Sanofi – Data Management

Navigate within Programs > Studies > Assays, Analyses and File Folders (see next slide)

Search data using dictionaries

Create new Programs > Studies > Assays and Files Folders, and annotate (tag) them

Export files

Visualize gene expression analysis results

Global view of all the data available, from level 1 data (uncurated/raw files) to levels 3-4 data (analysis results, findings)

Run analysis on subject-level data (formerly Dataset Explorer)

Browse level 2 (processed) data – incl. clinical / preclinical / molecular data, etc.

Search subject-level data

Select data subsets (cohorts)

Run basic statistical and genomic analyses on those subsets (standard features from tranSMART v1.0)

Export data subsets

Data organization

● Data is organized in a hierarchical structure: Program > Study > Assay / Analysis, plus File Folders*

* A file folder can be created at any level: program, study, assay…

Each object (Program, Study, Assay, etc.) is tagged with metadata, which:
– Provides information on the object
– Enables queries using search

Predefined annotation templates:
– Most fields use a controlled vocabulary (CV) with pick-list or autocomplete functionality. Examples of dictionaries used: MeSH, WHO-DD, some branches of the NextBio ontology.
– The Description field captures free text
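The hierarchy and metadata tagging described on this slide can be pictured with a small tree structure. The following Python sketch is only an illustration under stated assumptions: the `Node` class, the tag keys, and the exact nesting (Analysis placed under Study) are hypothetical and do not reflect tranSMART's internal data model.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Node:
    # kind is one of "Program", "Study", "Assay", "Analysis", "File Folder".
    kind: str
    name: str
    tags: Dict[str, str] = field(default_factory=dict)  # controlled-vocabulary fields
    description: str = ""                                # free-text field
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child object and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self) -> Iterator["Node"]:
        """Yield this object and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def search(root: Node, tag: str, value: str) -> List[Node]:
    """Return every object under root whose metadata tag equals value."""
    return [node for node in root.walk() if node.tags.get(tag) == value]


# Hypothetical objects and tag values, for illustration only.
program = Node("Program", "Example program", tags={"therapeutic_area": "Oncology"})
study = program.add(Node("Study", "Example study", tags={"disease": "Neoplasms"}))
study.add(Node("Assay", "Expression profiling", tags={"disease": "Neoplasms"}))
study.add(Node("File Folder", "Raw files"))
study.add(Node("Analysis", "Differential expression"))

hits = search(program, "disease", "Neoplasms")
```

Because every object type carries the same `tags` dictionary, one search routine covers Programs, Studies, Assays, Analyses and File Folders alike.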

Program Explorer

The Program Explorer box lets you navigate within Programs, Studies, Assays, Analyses or File Folders.

Integrated search

A new search function sits at the top of the screen. Any data (levels 1-4) can be searched.

Dropdown with a list of dictionaries + free-text search

Autocomplete feature for values in dictionaries

Browse view: the search returns Programs, Studies, Assays and/or Files that match your query.

Analyze view: the system points you to level 2 data.
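An autocomplete over dictionary values, as mentioned on this slide, can be approximated with a sorted term list and a binary search on the typed prefix. The Python sketch below is one generic way to do this, not a description of how tranSMART implements it; the `build_index` and `autocomplete` helpers and the term list are hypothetical.

```python
from bisect import bisect_left
from typing import List


def build_index(terms: List[str]) -> List[str]:
    """Lowercase, de-duplicate and sort terms once so lookups can bisect."""
    return sorted({term.lower() for term in terms})


def autocomplete(index: List[str], prefix: str, limit: int = 5) -> List[str]:
    """Return up to `limit` indexed terms that start with `prefix`."""
    prefix = prefix.lower()
    start = bisect_left(index, prefix)  # first term that is >= prefix
    matches: List[str] = []
    for term in index[start:]:
        if not term.startswith(prefix) or len(matches) == limit:
            break
        matches.append(term)
    return matches


# Illustrative dictionary values only.
TERMS = ["Diabetes Mellitus", "Diabetic Nephropathy", "Dermatitis", "Neoplasms"]
index = build_index(TERMS)
suggestions = autocomplete(index, "diab")
```

Sorting once and bisecting per keystroke keeps each lookup cheap even for large dictionaries, since all terms sharing a prefix sit next to each other in the sorted list.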

Filter

A new Filter option can also be used for selections based on fields with a small set of possible values.

The search returns Programs, Studies, Assays and/or Files that match your query.

Search & filter in Analyze

Synchronized search & filter function in Analyze

Visualization of gene expression analysis

Creation of a template for loading and displaying gene expression analysis results.

File export – Shopping Cart function

New concept of Shopping Cart for exporting files.

Note: If users give positive feedback on this Shopping Cart concept, we may extend the feature to subject-level data in RC-2.

Clinical Biomarker Discovery Use Case 2

● Business unit with focused biomarker discovery program

● Goal is to identify better disease progression biomarkers than the current clinical functional test

● Situation at hand
  ● Researchers don’t have any appropriate analytical tools for correlative analysis
  ● A variety of profiling experiments are being planned
    • RNAseq, Proteomics, RBM, miRNA, Metabolomics
  ● Patient data at multiple time points are collected

● Critical need
  ● How to integrate all the data?
  ● How to enable clinical researchers to analyze and visualize data?
  ● How to analyze time series data more effectively?
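One way to think about the integration need above is to key every record, whatever its platform, on the patient and the time point. The Python sketch below illustrates that idea only; the `integrate` and `time_series` helpers, the field names, and the values are hypothetical and made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (patient_id, visit)


def integrate(*sources: Tuple[str, List[dict]]) -> Dict[Key, dict]:
    """Merge records from several platforms on (patient_id, visit).

    Each source is a (platform_name, records) pair. Variables are prefixed
    with the platform name so that names from different platforms cannot clash.
    """
    merged: Dict[Key, dict] = defaultdict(dict)
    for platform, records in sources:
        for record in records:
            key = (record["patient_id"], record["visit"])
            for name, value in record.items():
                if name not in ("patient_id", "visit"):
                    merged[key][f"{platform}.{name}"] = value
    return dict(merged)


def time_series(merged: Dict[Key, dict], patient_id: str, variable: str,
                visit_order: List[str]) -> List[Tuple[str, float]]:
    """Return (visit, value) pairs for one variable, in the given visit order."""
    return [
        (visit, merged[(patient_id, visit)][variable])
        for visit in visit_order
        if variable in merged.get((patient_id, visit), {})
    ]


# Made-up records, for illustration only.
clinical = [
    {"patient_id": "P1", "visit": "baseline", "function_score": 40},
    {"patient_id": "P1", "visit": "week52", "function_score": 35},
]
rnaseq = [{"patient_id": "P1", "visit": "baseline", "GENE_X": 5.2}]

merged = integrate(("clinical", clinical), ("rnaseq", rnaseq))
series = time_series(merged, "P1", "clinical.function_score", ["baseline", "week52"])
```

Visits with no measurement for a variable are simply skipped in the series, which matters when platforms are profiled at different time points.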

tranSMART in Sanofi – Data Integration

● Current state
  ● Within-study clinical & gene expression profiling data

[Figure; axis labels: Gene expression, End Point]

tranSMART in Sanofi – Data Integration

● In the pipeline
  ● Multi-modal profiling data support

● Data types to be addressed
  ● RNAseq
  ● miRNA profiling (qPCR + seq)
  ● Metabolomics
  ● Proteomics
  ● RBM

[Figure; axis labels: Gene expression, Protein Level]

tranSMART in Sanofi – Providing Analysis Tools to Research Scientists

General Summary Statistics on Patient Cohorts

Baseline marker gene expression is correlated with outcome at 52 weeks

Disease Signature Evaluation
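A correlative analysis such as relating baseline marker expression to a later outcome is commonly summarized with a Pearson correlation coefficient. The slides do not say which statistic was used, so the Python sketch below is an assumption, and the numbers are synthetic values chosen only to exercise the function.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / math.sqrt(spread_x * spread_y)


# Synthetic values, not study data: one entry per hypothetical patient.
baseline_expression = [1.0, 2.0, 3.0, 4.0]
outcome_week_52 = [2.0, 4.0, 6.0, 8.0]

r = pearson_r(baseline_expression, outcome_week_52)
```

A value of `r` near 1 or -1 indicates a strong linear relationship; a real analysis would also report a significance test and inspect the scatter plot.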

Clinical Biomarker Discovery Use Case 3

● Efficacy biomarker discovery for complex disease with 15,000 patients

● Situation at hand
  ● A number of profiling experiments are being planned
    • RNAseq, RBM, Metabolomics
  ● Patients often manifest other disease symptoms

● Critical issue
  ● How to load such a large dataset?
  ● How to analyze such large sample numbers with multiple high-dimensional data types?
  ● How to analyze comorbidities?
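A simple starting point for the comorbidity question is to count how often pairs of conditions occur in the same patient. The Python sketch below shows that counting step only, under the assumption that each patient's diagnoses are available as a set; the `comorbidity_counts` helper and the records are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set


def comorbidity_counts(diagnoses: Dict[str, Set[str]]) -> Counter:
    """Count, for every pair of conditions, the patients who have both."""
    counts: Counter = Counter()
    for conditions in diagnoses.values():
        # Sort so each pair is always stored in the same (alphabetical) order.
        for pair in combinations(sorted(conditions), 2):
            counts[pair] += 1
    return counts


# Made-up records, for illustration only.
DIAGNOSES = {
    "P1": {"diabetes", "hypertension"},
    "P2": {"diabetes", "hypertension", "obesity"},
    "P3": {"diabetes"},
}

counts = comorbidity_counts(DIAGNOSES)
most_common_pair = counts.most_common(1)[0]
```

The loop processes one patient at a time, so the same approach works when patient records are streamed from a file instead of held in memory, which is relevant for a cohort of this size.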

Conclusions

● tranSMART can provide critical solutions for clinical biomarker discovery needs
  ● Data management, integration & analysis

● Two distinctive user groups use tranSMART: one through the user interface and one through the API

● Different business units have different requirements for tranSMART

● Sanofi developed critical user interface and functionality improvements to meet Sanofi’s and general clinical biomarker discovery needs

Question

Functionality User Interface

Acknowledgement

● Genzyme
  ● Jike Cui, Adam Palermo, Rena Baek, Petra Olivova, Leslie Jost, Rob Pomponio, Allison McVie-Wylie, Steve Madden, Clarence Wang

● Diabetes
  ● Juergen Kammerer, Manfred Hendlich, Dan Crowther

● Oncology
  ● Mary Penniston, Jack Pollard

● Sanofi tranSMART development team
  ● Claire Virenque, Annick Peraux
  ● Angelo Decristofano, Lars Greiffenberg, Christophe Gibault, David Peyruc

Dream Analysis Process

Define question

Identify patient cohort

Obtain relevant profile & clinical data

Run analysis

Export & publish results

Satisfied

Format!
