Computational analysis of biological systems: Past, present and future

Sven Bergmann

UNIL tenure track commission 5 January 2010

Research Overview

Large (genomic) systems
• many uncharacterized elements
• relationships unknown
• computational analysis should: improve annotation, reveal relations, reduce complexity

Small systems
• elements well-known
• many relationships established
• aim at quantitative modeling of systems properties like: dynamics, robustness, logic

PAST

Large-scale data analyses

Search for transcription modules:

Set of genes co-regulated under a certain set of conditions

• context specific

• allow for overlaps

How to extract information from very large-scale expression data?

J Ihmels, G Friedlander, SB, O Sarig, Y Ziv & N Barkai Nature Genetics (2002)

Identification of transcription modules using many random “seeds”

[Figure: random “seeds” → transcription modules]

Independent identification: modules may overlap!

SB, J Ihmels & N Barkai Physical Review E (2003)
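The seed-based search can be sketched roughly as follows. This is only a minimal illustration in the spirit of the iterative refinement described above, not the published implementation: the function names, the z-score thresholds `t_g` and `t_c`, the row standardization and the toy data are all assumptions made for this example.

```python
import numpy as np


def zscore(v):
    """Standardize a score vector to zero mean and unit variance."""
    sd = v.std()
    return (v - v.mean()) / sd if sd > 0 else np.zeros_like(v, dtype=float)


def refine_module(E, seed_genes, t_g=1.5, t_c=1.0, max_iter=50):
    """Alternately update a condition set and a gene set until both are stable.

    E          : genes x conditions expression matrix (no constant rows assumed)
    seed_genes : boolean mask over genes used as the starting set
    t_g, t_c   : hypothetical z-score thresholds for genes and conditions
    """
    # Standardize every gene across conditions.
    Z = (E - E.mean(axis=1, keepdims=True)) / E.std(axis=1, keepdims=True)
    genes = np.asarray(seed_genes, dtype=bool).copy()
    conds = np.zeros(Z.shape[1], dtype=bool)
    for _ in range(max_iter):
        if not genes.any():
            conds[:] = False
            break  # the seed dissolved: no module found
        # Conditions under which the current genes are jointly up or down.
        cond_score = Z[genes].mean(axis=0)
        new_conds = np.abs(zscore(cond_score)) > t_c
        if not new_conds.any():
            return np.zeros_like(genes), new_conds
        # Genes that follow the signed condition profile.
        sign = np.sign(cond_score[new_conds])
        gene_score = (Z[:, new_conds] * sign).mean(axis=1)
        new_genes = zscore(gene_score) > t_g
        stable = (np.array_equal(new_genes, genes)
                  and np.array_equal(new_conds, conds))
        genes, conds = new_genes, new_conds
        if stable:
            break
    return genes, conds


# Toy data: 200 genes x 40 conditions with one planted module
# (genes 0-19 are induced under conditions 0-9).
rng = np.random.default_rng(0)
E = rng.normal(size=(200, 40))
E[:20, :10] += 4.0

seed = np.zeros(200, dtype=bool)
seed[[0, 1, 2, 3, 4, 100, 101, 102, 103, 104]] = True  # half of the seed is wrong
genes, conds = refine_module(E, seed)
print(genes[:20].sum(), "of 20 planted genes kept;", genes[20:].sum(), "other genes")
print(conds[:10].sum(), "of 10 planted conditions kept")
```

In practice one would start from many random seeds and collect the distinct fixed points; because each seed is refined independently, the resulting modules are free to share genes and conditions.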

New Tools: Module Visualization

http://maya.unil.ch:7575/ExpressionView

Data Integration: Example NCI60

60 cancer cell lines (9 tissue types)

Gene expression data: ~23,000 gene probes

Drug response data: ~5,000 drugs

How to identify Co-modules?

Iteratively refine genes, cell lines and drugs to get co-modules

Z Kutalik, J Beckmann & SB, Nature Biotechnology (2008)
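One way to picture the iterative refinement across two data sets is the sketch below. It is an assumption-laden toy version, not the published algorithm: the two matrices are linked only through their shared cell-line dimension, and the helper names, the single threshold `t` and the planted toy data are invented for illustration.

```python
import numpy as np


def standardize_rows(X):
    """Give every row zero mean and unit variance (no constant rows assumed)."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)


def threshold(scores, t):
    """Keep entries whose z-score exceeds t (as weights); zero out the rest."""
    sd = scores.std()
    if sd == 0:
        return np.zeros_like(scores)
    z = (scores - scores.mean()) / sd
    return np.where(z > t, z, 0.0)


def co_module(G, D, seed_genes, t=1.5, n_iter=20):
    """Alternate between the two data sets through their shared cell lines.

    G : genes x cell lines (expression), D : drugs x cell lines (response).
    Returns boolean masks over genes, cell lines and drugs.
    """
    G, D = standardize_rows(G), standardize_rows(D)
    g = np.asarray(seed_genes, dtype=float)
    c = np.zeros(G.shape[1])
    d = np.zeros(D.shape[0])
    for _ in range(n_iter):
        c = threshold(g @ G, t)   # cell lines where the current genes are high
        d = threshold(D @ c, t)   # drugs with a strong response in those lines
        c = threshold(d @ D, t)   # cell lines re-scored from the drugs
        g = threshold(G @ c, t)   # genes re-scored from the cell lines
    return g > 0, c > 0, d > 0


# Toy data: 60 cell lines, 300 genes, 100 drugs, with one planted co-module
# (genes 0-14 and drugs 0-7 both stand out in cell lines 0-9).
rng = np.random.default_rng(1)
G = rng.normal(size=(300, 60))
D = rng.normal(size=(100, 60))
G[:15, :10] += 4.0
D[:8, :10] += 4.0

seed = np.zeros(300, dtype=bool)
seed[:8] = True
genes_m, cells_m, drugs_m = co_module(G, D, seed)
print(genes_m.sum(), "genes,", cells_m.sum(), "cell lines,", drugs_m.sum(), "drugs")
```

The design choice worth noting is that neither data set is analysed on its own: every update of the gene set passes through the drug data and vice versa, so only patterns supported by both survive.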

6’189 individuals

Phenotypes: 159 measurements, 144 questions

Genotypes: 500,000 SNPs

CoLaus = Cohort Lausanne

Collaboration with: Vincent Mooser (GSK), Peter Vollenweider & Gerard Waeber (CHUV)

PCA of POPRES cohort

Impact: Web of Science 2005-2009

Impact: Who cites our work?

PRESENT

Large-scale data analysis

Current insights from GWAS:

• Well-powered (meta-)studies with (ten-)thousands of samples have identified a few (dozen) candidate loci with highly significant associations

• Many of these associations have been replicated in independent studies

Current insights from GWAS:

• Each locus explains but a tiny (<1%) fraction of the phenotypic variance

• All significant loci together explain only a small fraction (<10%) of the variance
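To see why a single locus typically explains so little, it helps to write down the variance explained by one additive SNP. Under Hardy-Weinberg equilibrium a SNP with allele frequency p and per-allele effect β (in units of the phenotypic standard deviation) explains a fraction 2p(1−p)β² of the variance. The sketch below checks this against a simulation; the values of `p`, `beta` and the sample size are hypothetical and chosen only for illustration.

```python
import numpy as np


def variance_explained(p, beta):
    """Fraction of phenotypic variance explained by one additive SNP.

    p    : allele frequency (Hardy-Weinberg equilibrium assumed)
    beta : per-allele effect in phenotypic standard deviations
    """
    return 2.0 * p * (1.0 - p) * beta ** 2


rng = np.random.default_rng(0)
n, p, beta = 200_000, 0.3, 0.1          # hypothetical illustration values

genotype = rng.binomial(2, p, size=n)   # 0, 1 or 2 copies of the allele
h2 = variance_explained(p, beta)
# Add noise so that the phenotype has (approximately) unit variance.
phenotype = beta * genotype + rng.normal(scale=np.sqrt(1.0 - h2), size=n)

r2 = np.corrcoef(genotype, phenotype)[0, 1] ** 2
print(f"analytic fraction:  {h2:.4f}")
print(f"simulated r-squared: {r2:.4f}")
```

Even an effect of a tenth of a standard deviation per allele, which is large by GWAS standards, corresponds to well under 1% of the variance in this setting.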

David Goldstein:

“~93,000 SNPs would be required to explain 80% of the population variation in height.”

Common Genetic Variation and Human Traits, NEJM 360;17

So what do we miss?

1. Other variants like Copy Number Variations or epigenetics may play an important role

2. Interactions between genetic variants (GxG) or with the environment (GxE)

3. Many causal variants may be rare and/or poorly tagged by the measured SNPs

4. Many causal variants may have very small effect sizes

Status: submitted in December to PLoS Computational Biology (IF=6.2), after a positive reply to a pre-submission inquiry

Status: accepted for publication in Nature (IF=31.4)

Status: submitted in December to PLoS Genetics (IF=8.7), currently under review

Status: submitted to Biostatistics (IF=3.4, 2nd best of 92 journals for Statistics & Probability); a revision addressing the reviewers’ comments is to be submitted soon

Status: accepted for publication in Gastroenterology (IF=12.6)

Status: submitted as an application note to Bioinformatics (IF=4.32, 2nd best of 28 journals for Mathematical & Computational Biology)

Status: manuscript ready for submission to PLoS Computational Biology

Research Overview

Large (genomic) systems
• many uncharacterized elements
• relationships unknown
• computational analysis should: improve annotation, reveal relations, reduce complexity

Small systems
• elements well-known
• many relationships established
• aim at quantitative modeling of systems properties like: dynamics, robustness, logic

PAST

Modeling

Drosophila as model for Development

Quantitative Experimental Study using Automated Image Processing

a: mark anterior and posterior pole, first and last eve-stripe
b: extract region around dorsal midline
c: semi-automatic determination of stripes/boundaries

Experimental Results: Positions

• Bergmann S, Sandler S, Sberro H, Shnider S, Shilo B-Z, Schejter E and Barkai N. Pre-Steady-State Decoding of the Bicoid Morphogen Gradient. PLoS Biology 5(2) (2007) e46.
• Bergmann S, Tamari Z, Shilo B-Z, Schejter E and Barkai N. Stability of the Bicoid Gradient? Cell 132 (2008) 15.

A bit of Theory…

The morphogen density M(x,t) can be modeled by a differential equation (reaction diffusion equation):

• Change in concentration of the morphogen at position x, time t
• Diffusion (D: diffusion constant)
• Degradation (α: decay rate)
• Source
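Putting the labelled terms together gives a reaction-diffusion equation of the following general form. The source is written generically as s(x) here, which is a notational assumption, since its exact form is not spelled out in the text.

```latex
\frac{\partial M(x,t)}{\partial t}
  = D\,\frac{\partial^{2} M(x,t)}{\partial x^{2}}
  - \alpha\, M(x,t)
  + s(x)

% Steady state for a source localized at x = 0 (semi-infinite domain):
M(x) = M_0 \, e^{-x/\lambda},
\qquad
\lambda = \sqrt{D/\alpha}
```

The second line is the familiar consequence of such a model: at steady state and away from a localized source, the profile decays exponentially with a length scale set by the ratio of diffusion to degradation.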

The Canonical Model

Model including nuclear trapping

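\frac{\partial M}{\partial t} = D \frac{\partial^2 M}{\partial x^2} - k_n M B + k_{-n} M_n + s_0(x)

\frac{\partial M_n}{\partial t} = k_n M B - k_{-n} M_n

• M(x,t): free morphogen
• M_n(x,t): nuclear morphogen
• B(x,t): nuclei density
• D: diffusion
• s_0: production
• k_n: nuclear absorption
• k_{-n}: nuclear emission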
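A model of this kind, with free morphogen that diffuses and is produced, and nuclei that absorb it at rate k_n·M·B and re-emit it at rate k_-n·M_n, can be explored numerically with a few lines of code. The sketch below uses a simple explicit finite-difference scheme; all parameter values are arbitrary, the nuclei density B is held constant, and production is placed in the first grid cell only. These are simplifying assumptions for illustration, not the settings of the original study.

```python
import numpy as np


def simulate(D=1.0, k_n=0.5, k_minus_n=0.2, s0=1.0, B=1.0,
             length=20.0, nx=101, t_end=5.0):
    """Explicit finite-difference integration of the two coupled equations.

    dM/dt   = D d2M/dx2 - k_n*M*B + k_minus_n*Mn + source
    dMn/dt  = k_n*M*B - k_minus_n*Mn
    All values are dimensionless illustration values (assumed).
    """
    dx = length / (nx - 1)
    dt = 0.2 * dx ** 2 / D             # well inside the stability limit dx^2/(2D)
    n_steps = int(round(t_end / dt))
    M = np.zeros(nx)                    # free morphogen M(x, t)
    Mn = np.zeros(nx)                   # nuclear morphogen M_n(x, t)
    source = np.zeros(nx)
    source[0] = s0 / dx                 # total production rate s0 at the left end
    for _ in range(n_steps):
        # No-flux boundaries: pad with copies of the edge values.
        padded = np.pad(M, 1, mode="edge")
        lap = (padded[2:] - 2.0 * M + padded[:-2]) / dx ** 2
        exchange = k_n * M * B - k_minus_n * Mn   # net nuclear absorption
        M = M + dt * (D * lap - exchange + source)
        Mn = Mn + dt * exchange
    return M, Mn, dx, n_steps * dt


M, Mn, dx, t_total = simulate()
total = (M + Mn).sum() * dx
print(f"simulated time: {t_total:.2f}")
print(f"total morphogen (free + nuclear): {total:.4f}")
print(f"nuclear fraction: {Mn.sum() / (M + Mn).sum():.2f}")
```

Because this version of the model contains no degradation term, the total amount of morphogen simply grows as production rate times elapsed time, which makes a convenient sanity check on the numerics.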

PRESENT

Modeling

Precision is highest at mid-embryo

Similar trend in direct measurements of Bcd noise by Gregor et al. (Cell 2007)

[Figure legend: 1xbcd, 2xbcd, 4xbcd; Δ: Gt, Δ: Kr, □: Hb, o: Eve]

Scaling is position-dependent!

“hyper-scaling” at anterior pole

Status:
- May: submitted to Molecular Systems Biology (IF=12.2)
- Aug: first resubmission after mostly positive reviews
- Dec: second submission (informally) accepted, subject to a proper response to minor issues

Modeling the Drosophila wing disk

• Partner in SystemsX.ch project WingX
  - PhD student: Aitana Morton Delachapelle
  - PostDoc: Sascha Dalessi
• Image processing to obtain spatio-temporal measures of proteins
• Modeling Dpp gradient formation with focus on scaling

Modeling plant growth

• Partner in SystemsX.ch project PlantX
  - PostDoc: Micha Hersch
  - PostDoc: Tim Hohm
• Image processing to obtain spatio-temporal measures of seedlings
• Modeling shade avoidance behavior

Future directions

Organisms

Data types

– Genotypic (SNPs/CNVs)
– Epigenetic data
– Gene/protein expression
– Protein interactions
– Organismal data?

Biological Insight

The challenge of many datasets: How to integrate all the information?

Modular Approach for Integrative Analysis of Genotypes and Phenotypes

[Figure: Individuals × Genotypes (SNPs/haplotypes) and Individuals × Phenotypes (measurements) data matrices, connected by modular links]

Association of (average) module expression is often stronger than for any of its constituent genes
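A small simulation illustrates why averaging over a module can help. It assumes, purely for illustration, that every gene in the module carries the same genetic signal plus independent noise; the sample size, module size, allele frequency and effect size below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_individuals, n_genes = 2000, 20   # hypothetical cohort and module size
effect = 0.3                        # hypothetical per-allele effect on each gene

# Genotype: 0, 1 or 2 copies of an allele with assumed frequency 0.3.
genotype = rng.binomial(2, 0.3, size=n_individuals)

# Each gene = shared genetic signal + its own independent noise.
noise = rng.normal(size=(n_individuals, n_genes))
expression = effect * genotype[:, None] + noise


def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]


gene_r = np.array([corr(genotype, expression[:, j]) for j in range(n_genes)])
module_r = corr(genotype, expression.mean(axis=1))

print(f"strongest single gene: r = {gene_r.max():.2f}")
print(f"module average:        r = {module_r:.2f}")
```

Averaging leaves the shared signal untouched while shrinking the independent noise, so the module-level association comes out stronger. When the noise is correlated between genes the gain is smaller, which is consistent with the word "often" rather than "always".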

Towards interactions: Network Approaches for Integrative Association Analysis

Using knowledge on physical gene-interactions or pathways to prioritize the search for functional interactions

Modeling: Cross-talk between Drosophila and Arabidopsis modeling

Both systems are growing multi-cellular tissues: modelers (in my group and within the two RTDs) may learn from each other and exchange tools

Acknowledgements to my group

http://serverdgm.unil.ch/bergmann

Funding: SystemsX.ch, SNSF, SIB, Cavaglieri, Leenaards, European FP

People: Zoltán Kutalik, Micha Hersch, Aitana Morton, Diana Marek, Barbara Piasecka, Bastian Peter, Karen Kapur, Alain Sewer*, Toby Johnson*, Armand Valsessia, Gabor Csardi, Sascha Dalessi, Tim Hohm (*alumni)

Acknowledgements to my collaborators

DGM: Jacqui Beckmann, Roman Chrast, Carlo Rivolta

CIG: Christian Fankhauser, Sophie Martin, Alexandre Reymond, Mehdi Tafti, Bernard Thorens

UNIL/CHUV: Murielle Bochud, Pierre-Yves Bochud, Fabienne Maurer, Marc Robinson-Rechavi, Amalio Telenti, Peter Vollenweider, Gerard Waeber

Uni Geneva: Stylianos Antonarakis, Manolis Dermitzakis, Jacques Schrenzel

Uni Bern: Cris Kuhlemeier, Andri Rauch, Richard Smith

ETH & Uni Zurich: Konrad Basler, Ernst Hafen, Matthias Heinemann, Christian v. Mehring, Markus Noll, Eckart Zitzler

EPFL: Dario Floreano, Felix Naef

Uni Basel: Markus Affolter, Mihaela Zavolan

Weizmann: Naama Barkai, Benny Shilo, Orly Reiner

MRC Cambridge: Ruth Loos, Nick Wareham

Uni Minnesota: Judith Berman

GSK: Vincent Mooser, Dawn Waterworth

UCSD: Trey Ideker

UCLA: John Novembre

Teaching: Past and Present

http://www2.unil.ch/cbg/index.php?title=Teaching

Teaching: Future

1. How can we equip Biology students at UNIL with basic knowledge in Computational Biology?

• more “hands on” training!
• group projects
• new Master

2. How can we educate proficient Computational Biologists?

• New Master program jointly with SIB, UniGE?
• Develop ties with EPFL?

Integration: Past & Present

Integration: Future

How can UNIL/FBM strengthen its position in Computational Biology?

1. Networking!

2. Create new senior positions!

Integration: Future

How can UNIL/FBM strengthen its ties with the industry?

Vincent Mooser

Andreas Schupert

Manuel Peitsch

Ulrich Genick

Pierre Farmer

David Heard

Pietro Scalfaro

CBG
