CELL INDEX DATABASE (CELLX): A WEB TOOL FOR CANCER PRECISION MEDICINE
http://cellx.sourceforge.net
Pacific Symposium on Biocomputing (PSB) 2015, January 4-8, 2015, The Big Island of Hawaii
Keith Ching
Senior Principal Scientist, Computational Biology
Pfizer, Oncology Research Unit, San Diego, CA
What is CELLX?
• Web interface to a database of molecular profiling data
  • Cell Lines (CCLE, Broad, Sanger, GSK, Pfizer)
  • TCGA – The Cancer Genome Atlas
  • Published studies (GSE from NCBI GEO)
  • GTEx – Genotype-Tissue Expression project
  • Custom data (internal studies)
• Datatypes
  • Microarray expression
  • RNA-Seq expression (RSEM)
  • Mutation (COSMIC, TCGA, CCLE)
  • Copy Number Variation (CNV)
  • Compound activity (limited)
  • Protein array, RPPA (limited)
  • Meta data, annotations
Pfizer Confidential │ 2
Architecture
• Demo: http://cellx.sourceforge.net
• Open source: http://sourceforge.net/projects/cellx/
• YouTube tutorials: http://sourceforge.net/p/cellx/wiki/Tutorials/
• Components (from the architecture diagram): MySQL, Apache/Tomcat, Rserve, Perl, Java
• Amazon Web Services minimum requirements: t2.micro VM, 1 GB RAM, 1 CPU, 150 GB disk space
Why CELLX?
• For each analysis, half the time is spent on data collection and formatting.
  – getting the most recent dataset
  – matching identifiers, merging datatypes
• Analyses developed to answer a specific question are abstracted and generalized.
• As new data is generated, the same analysis will be repeated over and over.
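The identifier-matching and datatype-merging step can be illustrated with a minimal Python sketch. It is not CELLX's implementation: the function names, the normalization rule, and the sample values are all assumptions made for illustration.

```python
import re


def normalize_id(sample_id):
    """Collapse case and punctuation so 'MCF-7', 'mcf7' and 'MCF 7' match."""
    return re.sub(r"[^A-Z0-9]", "", sample_id.upper())


def merge_datatypes(expression, copy_number):
    """Join two {sample_id: value} tables on the normalized sample ID.

    Only samples present in both tables are kept (an inner join).
    """
    cnv_by_key = {normalize_id(s): v for s, v in copy_number.items()}
    merged = {}
    for sample, expr in expression.items():
        key = normalize_id(sample)
        if key in cnv_by_key:
            merged[key] = {"expression": expr, "cnv": cnv_by_key[key]}
    return merged


# Hypothetical values, for illustration only.
expression = {"MCF-7": 8.2, "T47D": 7.9, "HCC1937": 3.1}
copy_number = {"mcf7": 2.1, "T-47D": 1.9, "BT-474": 4.0}
merged = merge_datatypes(expression, copy_number)
```

Stripping all punctuation is a deliberately crude rule: it can collide for distinct cell lines whose names differ only by punctuation, so a curated alias table is likely a safer choice in practice.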
Generalized query
• For target gene X:
  – what kinds of alterations
    • mutation, fusion, amplification, deletion, over/under expression
  – where are alterations found
    • cell lines, primary samples, PDX models
  – what gene alterations associate (or not) with gene X alterations
    • KRAS mutation, ALK fusion, CCND1 amplification, PD1 expression
  – what sample characteristics associate with gene X alterations
    • tissue type, subtype, compound sensitivity
• For target genes W, X, Y, Z:
  – which tumor types have W and X alterations but not Y or Z
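The multi-gene question ("W and X altered, but not Y or Z") reduces to set operations over altered samples. The sketch below shows one way to express it in Python; the data layout, function names, and sample values are hypothetical and are not taken from CELLX.

```python
from collections import Counter


def select_samples(alterations, required, excluded):
    """Return samples altered in every `required` gene and in no `excluded` gene.

    `alterations` maps gene -> set of altered sample IDs.
    """
    if not required:
        return set()
    # Intersection builds a new set, so the input sets are never mutated.
    selected = set.intersection(*(alterations.get(g, set()) for g in required))
    for gene in excluded:
        selected -= alterations.get(gene, set())
    return selected


def count_by_tumor_type(samples, tumor_type):
    """Tally the selected samples per tumor type."""
    return Counter(tumor_type[s] for s in samples)


# Hypothetical samples, for illustration only.
alterations = {
    "W": {"s1", "s2", "s3", "s5"},
    "X": {"s1", "s2", "s3", "s4"},
    "Y": {"s2"},
    "Z": {"s6"},
}
tumor_type = {"s1": "breast", "s2": "lung", "s3": "breast",
              "s4": "colon", "s5": "lung", "s6": "colon"}

hits = select_samples(alterations, required=["W", "X"], excluded=["Y", "Z"])
counts = count_by_tumor_type(hits, tumor_type)
```

Here `hits` holds the samples passing all four constraints and `counts` groups them by tumor type, which is the shape of answer the query asks for.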
Precision Medicine Support
• Pre-clinical and translational programs for late-stage targeted oncology agents (small molecules or antibodies)
  – cell line or Patient Derived Xenograft (PDX) selection
    • mutation status, CNV amp or del, high/low expression
  – cell line / PDX correlates with agent activity
    • tissue type, mutation, CNV, expression, meta data
  – understanding the size / frequency of potential responder indications
    • presence / absence of biomarkers
    • one or more constraints (tissue type, subtype, subgroup, viral status)
  – hypothesis testing
    • confirming literature reports, investigator results in public datasets
  – easy data access, merging for custom analyses
    • adding custom analyses as new queries
Expression
CNV
Exp vs CNV
Matrix
Expression / mutation
Breast Cell line panel screening – CDK4i
[Figure panels: IC50 values, Palbociclib*; Gene expression, Sens vs. Resist; CNV / mutations. Genes labeled: RB1, CCNE1]
*Finn RS, et al. Breast Cancer Res. 2009;11(5):R77. doi: 10.1186/bcr2419.
Metadata test vs. expression of RB1
Meta association with EGFR expression
RB1, CDKN2A, CCND1 in TCGA breast
Cutoffs
Across all TCGA
Genes correlated with RB1 expression
TCGA-BRCA-RSEM
GLI1
ACRG HCC – ACVRL1 correlation
Data: George Kan
Classes: Kai Wang
Multiple correlations across TCGA (PD-L1)
• runs correlation across 32 TCGA datasets
• summary table of number of times a gene appears
• zip file of each correlation table
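The summary table described above (how often a gene appears among the top correlates of a target across datasets) can be sketched as follows. The slides do not state which correlation measure or ranking rule CELLX uses, so Pearson correlation ranked by signed value is an assumption here, as are the function names, the dataset names, and the toy values (`GENE_A` is a made-up gene).

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def top_correlated(dataset, target, top_n):
    """Rank genes by correlation with `target` within one dataset.

    `dataset` maps gene -> expression values over the same ordered samples.
    """
    reference = dataset[target]
    scores = {g: pearson(reference, v) for g, v in dataset.items() if g != target}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


def summarize(datasets, target, top_n):
    """Count how many datasets place each gene in the target's top `top_n`."""
    counts = Counter()
    for dataset in datasets.values():
        if target in dataset:
            counts.update(top_correlated(dataset, target, top_n))
    return counts


# Toy expression values, for illustration only.
datasets = {
    "study_1": {"CD274": [1, 2, 3, 4], "PDCD1LG2": [2, 4, 6, 8],
                "JAK2": [1, 3, 2, 4], "GENE_A": [4, 3, 2, 1]},
    "study_2": {"CD274": [5, 1, 3, 2], "PDCD1LG2": [10, 2, 6, 4],
                "JAK2": [2, 5, 1, 3], "GENE_A": [5, 1, 2, 3]},
}
counts = summarize(datasets, target="CD274", top_n=2)
```

Genes that recur in the top list across many datasets rise to the top of `counts`; ranking by absolute correlation instead would also surface consistently anti-correlated genes.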
[Figure panels: top 100 genes per dataset; top 1000 genes per dataset]
CD274 / JAK2 / PDCD1LG2: same locus, 9p24
ACRG157T
Survival
• Genome-wide rank of gene expression and survival.
TCGA-HNSC
Acknowledgments
Paul Rejto : Exec Dir Precision Med
CompBio
Kai Wang – ACRG subclasses
Zhengyan (George) Kan – ACRG data
Julio Fernandez – CCLE data
Wenyan Zhong – requirements, exp, cnv, mutation correlations
Jarek Kostrowicki – R optimization
Tao Xi – Tumor vs. normal plots
Zhou Zhu – METABRIC data
Requirements
• Oncology Business Unit
Jean-François Martini : Sr. Dir
Biomarker reports, venn, freq
Maria Koehler : VP
multiple datatype scatter plot
• Integrative Biology and Biochem
Kim Arndt : VP IBB
biomarker frequencies by subtype
[email protected]
http://cellx.sourceforge.net