
Statistical learning of biological networks: a brief overview

Florence d’Alché–Buc

IBISC CNRS, Université d’Evry, GENOPOLE, Evry, France
Email: [email protected]


Biological networks


Motivation

Identify and understand the complex mechanisms at work in the cell.

Biological networks:
- signaling pathways
- gene regulatory networks
- protein-protein interaction networks
- metabolic pathways

Use experimental data, prior knowledge, AND statistical inference to unravel biological networks and predict their behaviour.


How to learn biological networks from data?

- Data-mining approaches: extract co-expressed and/or co-regulated patterns, reduce dimension [large-scale data, often preliminary to more accurate modelling or prediction]
- Modelling approaches: model the network behaviour, can be used to simulate and predict the network as a system [smaller-scale data]
- Predictive approaches: predict (only) edges, in an unsupervised or supervised way [large- or medium-scale data]


Learning (biological) networks


Outline

1 Introduction

2 Supervised Predictive approaches

3 Modelling approaches

4 Conclusion


Supervised learning of the regulation concept

Instance Problem 1 (transcriptional regulatory networks):
Training sample S = {(w_i = (v_i, v'_i), y_i), i = 1...n}, where each w_i is a pair of components v_i and v'_i (think transcription factor and potential regulatee) and y_i ∈ Y indicates whether v_i is a transcription factor for v'_i. We wish to be able to predict new regulations.
Reference: Qian et al. 2003, Bioinformatics.
In symbolic machine learning, this corresponds to the framework of relational learning, classically associated with inductive logic programming (ILP) and, more recently, with statistical ILP: the predicate interaction(X,Y) can be learned from labeled examples.
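As a rough, self-contained illustration of this setting (not the pipeline of Qian et al.), the sketch below classifies (regulator, target) pairs with an SVM; the toy expression profiles, the pair encoding, and the kernel choice are all assumptions made for the example.

```python
# Minimal sketch: supervised classification of (regulator, target) pairs.
# Pair features and the SVM kernel are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_genes, n_conditions = 50, 30
expr = rng.normal(size=(n_genes, n_conditions))      # x(v): toy expression profile per gene

def pair_features(i, j):
    # Concatenate the two profiles; any pairwise encoding could be used instead.
    return np.concatenate([expr[i], expr[j]])

# Known labeled examples (w_i = (v_i, v'_i), y_i); random placeholders here.
pairs = [(rng.integers(n_genes), rng.integers(n_genes)) for _ in range(200)]
labels = rng.integers(0, 2, size=len(pairs))

X = np.array([pair_features(i, j) for i, j in pairs])
clf = SVC(kernel="rbf", probability=True).fit(X, labels)

# Score a new candidate regulation v_3 -> v_7
print(clf.predict_proba(pair_features(3, 7).reshape(1, -1))[0, 1])
```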


Supervised learning of interactions

From a known network where each vertex is described by some input feature vector x, predict the edges involving new vertices described by their input feature vectors.


Supervised prediction of protein-protein interaction network

Instance Problem 2 (protein-protein interaction networks):
Training sample S = {(w_i = (v_i, v'_i), y_i), i = 1...n}, where each w_i is a couple of components v_i and v'_i (think proteins) and y_i ∈ Y indicates whether there is an edge between v_i and v'_i. We wish to predict interactions for test and training input data.
- Noble et al. in 2005 (SVM) with kernel combination
- Further studied by Biau and Bleakley 2006, Bleakley et al. 2007


Similarity or kernel learning

- In the case of non-oriented graphs, a similarity between components can be learnt instead of a classification function.
- Yamanishi and Vert's work (2005) first introduced this kind of approach.
- We proposed a new way of formulating the problem as regression in an output space endowed with a kernel (Geurts et al. 2006, 2007).


Supervised learning with output (kernel) feature space

Suppose we have a learning sample LS = {x_i = x(v_i), i = 1, ..., N} drawn from a fixed but unknown probability distribution, and additional information provided by a Gram matrix K with K_ij = k(v_i, v_j), i, j = 1, ..., N, that expresses how close the objects v_i are to each other.
Let φ be the implicit output feature map and k the positive definite kernel defined on V × V such that ⟨φ(v), φ(v')⟩ = k(v, v').

From a learning sample {(x_i, K_ij) | i = 1, ..., N, j = 1, ..., N} with x_i ∈ X, find a function f : X → F that minimizes the expectation of some loss function ℓ : F × F → ℝ over the joint distribution of input/output pairs:

E_{x, φ(v)} { ℓ(f(x), φ(v)) }


Application to supervised inference of edges in a graph (1)

For objects v_1, ..., v_N, let us assume we have: feature vectors x(v_i), i = 1...N, and a Gram matrix K defined as K_ij = k(v_i, v_j). The kernel k reflects the proximity between the objects v as vertices in the known graph.
Reminder: a kernel k is a positive definite (similarity) function. For such a function, there exists a feature map φ : V → F such that k(v, v') = ⟨φ(v), φ(v')⟩.


Supervised inference of edges in a graph

- Use a machine learning method that can infer a function h : X → F, i.e. obtain, for a given x(v), an approximation of φ(v), and take g(x(v), x(v')) = ⟨h(x(v)), h(x(v'))⟩ as an approximation of the kernel value between v and v', described by their input feature vectors x(v) and x(v').
- Connect these two vertices if g(x(v), x(v')) > θ.

(By varying θ we get different tradeoffs between true positive and false positive rates.)
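A minimal numerical sketch of this procedure, with kernel ridge regression into the output feature space as the base learner (an assumption made for the sketch; the work cited above uses output kernel trees instead): given an input Gram matrix and the output Gram matrix of the known graph, the approximation g(x(v), x(v')) is a bilinear form in the ridge coefficients. The toy graph and all parameter values are illustrative.

```python
# Sketch: edge inference by regression into the output kernel feature space.
# Base learner: kernel ridge regression (illustrative choice).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
N, d = 30, 5
X = rng.normal(size=(N, d))                      # input feature vectors x(v_i)

A = (rng.random((N, N)) < 0.1).astype(float)     # toy known graph (adjacency)
A = np.triu(A, 1); A = A + A.T
L = np.diag(A.sum(1)) - A
K_out = expm(-0.5 * L)                           # output kernel on the known graph (diffusion kernel)

def rbf(U, V, gamma=0.5):
    d2 = ((U[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K_in = rbf(X, X)
lam = 1e-1
C = np.linalg.solve(K_in + lam * np.eye(N), np.eye(N))   # (K_in + lam I)^{-1}

def g(x_new, x_other):
    # g(x, x') = k_in(x, .) C K_out C k_in(x', .)^T  approximates <h(x), h(x')>
    a = rbf(x_new[None, :], X) @ C
    b = rbf(x_other[None, :], X) @ C
    return float(a @ K_out @ b.T)

theta = 0.5                                       # decision threshold on g
print(g(X[0], X[1]) > theta)
```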


A kernel on graph nodes

Diffusion kernel (Kondor and Lafferty, 2002): the Gram matrix K with K_ij = k(v_i, v_j) is given by:

K = exp(−βL)

where the graph Laplacian L is defined by:

L_ij = d_i (the degree of node v_i)   if i = j;
       −1                             if v_i and v_j are connected;
       0                              otherwise.
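A short sketch of computing this kernel numerically; the adjacency matrix and the value of β are arbitrary illustrative choices.

```python
# Diffusion kernel on graph nodes: K = exp(-beta * L), with L the graph Laplacian.
import numpy as np
from scipy.linalg import expm

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # adjacency of a small toy graph

L = np.diag(A.sum(axis=1)) - A               # L_ii = d_i, L_ij = -1 if connected, 0 otherwise
beta = 0.8                                   # diffusion parameter (arbitrary here)
K = expm(-beta * L)                          # Gram matrix: K_ij = k(v_i, v_j)

print(np.allclose(K, K.T), np.all(np.linalg.eigvalsh(K) > 0))  # symmetric, positive definite
```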


Interpretability: rules and clusters (an example with a protein-protein network)


Network completion and function prediction for yeast data


Challenges and limitations in supervised predictive approaches

- Semi-supervised learning or even transductive learning
- Issue: unbalanced distribution of positive and negative examples
- Local approach (the graph is not seen as a single variable)
- Data (labeled examples) are not i.i.d.: regulations are not independent


Outline

1 Introduction

2 Supervised Predictive approaches

3 Modelling approaches

4 Conclusion


Graphical models: from simple interaction models to complex ones

- Graphical Gaussian model estimation: estimating partial correlation as a measure of conditional independence (classified as graph prediction in my terminology)
- Bayesian network estimation: modelling directed interactions
- Dynamic Bayesian network estimation: modelling directed interactions through time
- State-space model estimation: modelling observed and hidden dynamical processes as well


Focus on state-space models

Goal:
- Quantitative models (easier to learn, encompass mechanistic models: biological relevance)
- Taking time into account
- Some variables are not measured: assumption of a hidden process
- Linear Gaussian models: the parameters encapsulate the network structure (Perrin et al. 03, Rangel et al. 04)
- Nonlinear models (more biologically relevant): the structure is encapsulated in the form of the transition function (Nachman 04, Rogers et al. 06, Quach et al. 07)

x(t_{k+1}) = F(x(t_k), u; θ) + ε_h(t_k)

y(t_k) = H(x(t_k), u(t_k); θ) + ε(t_k)


System of Ordinary Differential Equations (ODE)

dx/dt = f(x(t), u(t); θ)

Let us focus on gene regulatory networks.

x(t): state variables at time t
- protein concentrations
- mRNA concentrations

f: the form of f encodes the nature of the interactions (and their structure)
- linear/nonlinear models
- Michaelis-Menten kinetics
- mass-action kinetics
- ...

θ: parameter set (kinetic parameters, rate constants, ...)
u(t): input variables at time t


Reverse Engineering of Biological Networks

Given:
- An ODE model:

  dx(t)/dt = f(x(t), u(t); θ)

- A partial and noisy observation model:

  y(t) = H(x(t), u(t); θ) + ε(t)

  where H is a nonlinear observation function and ε(t) is i.i.d. noise.
- A sequence of observed data: y_1:K = {y_1, ..., y_K} at times t_1, t_2, ..., t_K

Goal:
- Structure estimation
- Parameter estimation (θ)
- State estimation (x(t))


Structure learning

- Case 1: only a few variables are involved; a combinatorial search over structures can then be performed. For each candidate structure, the parameters have to be estimated (see the sketch below).
- Case 2: more than a few tens of variables are involved; it is then worth using an algorithm dedicated to structure learning. Structure learning in nonlinear dynamical models, as in static Bayesian networks, can be addressed by a stochastic exploration of the (huge) candidate set, using a criterion that accounts for the data and for parameter estimation given the candidate structure. MCMC methods and evolutionary approaches are used.

In the following, we assume that the network structure is given.
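A minimal sketch of the Case-1 combinatorial search, under simplifying assumptions not made in the talk: linear discrete-time dynamics, least-squares parameter fitting per candidate structure, and BIC as the selection criterion.

```python
# Sketch of Case 1: exhaustive structure search for a tiny network.
# Assumed model: x_{k+1} = W x_k + noise, with W restricted by a candidate structure.
import itertools
import numpy as np

rng = np.random.default_rng(2)
G, K = 3, 100
W_true = np.array([[0.8, 0.0, -0.5],
                   [0.6, 0.7,  0.0],
                   [0.0, -0.4, 0.9]])
X = np.zeros((K, G)); X[0] = rng.normal(size=G)
for k in range(K - 1):
    X[k + 1] = W_true @ X[k] + 0.05 * rng.normal(size=G)

def fit_and_score(mask):
    """Fit the allowed coefficients by least squares; return the BIC of the structure."""
    rss, n_params = 0.0, int(mask.sum())
    for i in range(G):                              # one regression per target gene
        parents = np.flatnonzero(mask[i])
        if parents.size:
            A, b = X[:-1][:, parents], X[1:, i]
            coef, *_ = np.linalg.lstsq(A, b, rcond=None)
            rss += ((b - A @ coef) ** 2).sum()
        else:
            rss += (X[1:, i] ** 2).sum()
    n = (K - 1) * G
    return n * np.log(rss / n) + n_params * np.log(n)

best = min((np.array(m).reshape(G, G) for m in itertools.product([0, 1], repeat=G * G)),
           key=fit_and_score)
print(best)   # best-scoring adjacency pattern
```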


An example of Nonlinear State-Space Model

Continuous time ODE model

dx(t)/dt = f(x(t), u(t); θ)

y(t) = H(x(t), u(t); θ) + ε(t)

The system at discrete-time points t_1, ..., t_K:

x(t_{k+1}) = F(x(t_k), u; θ)

y(t_k) = H(x(t_k), u(t_k); θ) + ε(t_k)

with

F(x(t_k), u; θ) = x(t_k) + ∫_{t_k}^{t_{k+1}} f(x(τ), u(τ); θ) dτ
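A minimal sketch of how F can be obtained in practice, by numerically integrating f over one sampling interval; the integrator and the toy right-hand side are illustrative assumptions.

```python
# Sketch: build the discrete-time transition F from the continuous-time f
# by integrating the ODE over one sampling interval [t_k, t_{k+1}].
import numpy as np
from scipy.integrate import solve_ivp

def f(t, x, theta):
    # Toy nonlinear right-hand side; any f(x, u; theta) could be plugged in here.
    a, b = theta
    return np.array([a * x[1] - x[0], b / (1.0 + x[0] ** 2) - x[1]])

def F(x_k, t_k, t_next, theta):
    """x(t_{k+1}) = x(t_k) + integral_{t_k}^{t_{k+1}} f(x(tau); theta) dtau."""
    sol = solve_ivp(f, (t_k, t_next), x_k, args=(theta,), rtol=1e-6)
    return sol.y[:, -1]

x_k = np.array([1.0, 0.2])
print(F(x_k, 0.0, 0.5, theta=(0.9, 2.0)))
```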


Bayesian inference

Given:
- A prior distribution over the initial state and parameters: p(x_1, θ)
- A state transition model: p(x_k | x_{k−1}, θ)
- An observation model: p(y_k | x_k, θ)
- A sequence of observations: y_1:K = {y_1, ..., y_K}

Estimating the posterior distributions: focus on the filtering distribution p(x_k, θ | y_1:k)

Tool: the Unscented Kalman Filter, to deal with nonlinearities (Quach et al., 2007)
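The tool used in the talk is the UKF; purely as a self-contained illustration of the same filtering recursion p(x_k, θ | y_1:k), the sketch below runs a bootstrap particle filter over the augmented state (x_k, θ) on a toy one-dimensional model. The model, noise levels, and particle count are assumptions, not the setup of Quach et al.

```python
# Bootstrap particle filter sketch: joint filtering of hidden state x_k and
# static parameter theta, p(x_k, theta | y_{1:k}), for a toy 1-D nonlinear model.
import numpy as np

rng = np.random.default_rng(3)
K, n_part = 50, 2000
theta_true, sx, sy = 0.7, 0.1, 0.2

# Simulate toy data: x_{k+1} = theta * x_k / (1 + x_k^2) + noise, y_k = x_k + noise
x = np.zeros(K); x[0] = 1.0
for k in range(1, K):
    x[k] = theta_true * x[k - 1] / (1 + x[k - 1] ** 2) + sx * rng.normal()
y = x + sy * rng.normal(size=K)

# Particles over the augmented state (x_k, theta); theta stays fixed within each particle.
xp = rng.normal(1.0, 1.0, n_part)
thetap = rng.uniform(0.0, 2.0, n_part)

for k in range(K):
    if k > 0:  # propagate through the transition model p(x_k | x_{k-1}, theta)
        xp = thetap * xp / (1 + xp ** 2) + sx * rng.normal(size=n_part)
    w = np.exp(-0.5 * ((y[k] - xp) / sy) ** 2)     # weight by p(y_k | x_k)
    w /= w.sum()
    idx = rng.choice(n_part, size=n_part, p=w)     # resample to approximate the filtering law
    xp, thetap = xp[idx], thetap[idx]

print("filtered mean of theta:", thetap.mean())    # static-parameter degeneracy is a known caveat
```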


Example: the Repressilator

[Elowitz and Leibler, Nature 2000]

dr_1/dt = vmax_1 · k_12^n / (k_12^n + p_2^n) − kmRNA_1 · r_1
dr_2/dt = vmax_2 · k_23^n / (k_23^n + p_3^n) − kmRNA_2 · r_2
dr_3/dt = vmax_3 · k_31^n / (k_31^n + p_1^n) − kmRNA_3 · r_3
dp_1/dt = k_1 · r_1 − kprotein_1 · p_1
dp_2/dt = k_2 · r_2 − kprotein_2 · p_2
dp_3/dt = k_3 · r_3 − kprotein_3 · p_3

- mRNAs are observed, proteins are hidden
- mRNA and protein degradation rate constants are supposed to be known
- Estimate 9 parameters
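A minimal sketch of simulating this system and generating the kind of data assumed here (noisy mRNA observations, hidden proteins); the parameter values, noise level, and sampling grid are illustrative choices, not those of the original study.

```python
# Repressilator ODEs (as written above), simulated with scipy; mRNAs r_i are
# observed with noise, proteins p_i are hidden. Parameter values are illustrative.
import numpy as np
from scipy.integrate import solve_ivp

n = 2.0
vmax = np.array([1.5, 1.5, 1.5])        # vmax_1..3 (unknown, to be estimated)
k_half = np.array([1.0, 1.0, 1.0])      # k_12, k_23, k_31 (unknown)
k_transl = np.array([0.5, 0.5, 0.5])    # k_1, k_2, k_3 (unknown)
k_mrna = np.array([0.3, 0.3, 0.3])      # mRNA degradation rates, assumed known
k_prot = np.array([0.2, 0.2, 0.2])      # protein degradation rates, assumed known

def repressilator(t, state):
    r, p = state[:3], state[3:]
    rep = np.roll(p, -1)                 # p_2 represses gene 1, p_3 gene 2, p_1 gene 3
    dr = vmax * k_half**n / (k_half**n + rep**n) - k_mrna * r
    dp = k_transl * r - k_prot * p
    return np.concatenate([dr, dp])

t_obs = np.linspace(0, 60, 30)
sol = solve_ivp(repressilator, (0, 60), y0=[1, 0, 0, 0, 0, 0], t_eval=t_obs)

rng = np.random.default_rng(4)
y_obs = sol.y[:3].T + 0.05 * rng.normal(size=(len(t_obs), 3))   # noisy mRNA observations
hidden = sol.y[3:].T                                            # protein trajectories (unobserved)
print(y_obs.shape, hidden.shape)
```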


Parameter Estimation


Challenges in (dynamical) modelling approaches

- Identifiability of dynamical models
- Theoretical results about sample complexity
- Scaling to large networks
- Non-stationarity
- Incorporating other components: space, cellular compartments, ...
- Coupled systems: metabolic and regulatory networks, protein-protein interactions and regulatory networks
- MORE DATA: benchmark problems, challenges


General conclusion and perspective

- Different views of the learning problem, different scales, different prior knowledge
- Some of these methods could be combined to contribute to the same discovery process
- Need for building data repositories, and demand for biological validation


References

- C. Auliac, V. Frouin, X. Gidrol, F. d'Alché-Buc. Evolutionary approaches for the reverse-engineering of gene regulatory networks: a study on a realistic biological dataset. Accepted at BMC Bioinformatics, to appear in 2008.
- P. Geurts, N. Touleimat, M. Dutreix, F. d'Alché-Buc. Inferring biological networks with output kernel trees. BMC Bioinformatics, to appear, May 3, 2007.
- Kato, K. Tsuda. EM based algorithm for kernel matrix completion. Bioinformatics, vol. 21, 2005.
- B.-E. Perrin, L. Ralaivola, A. Mazurie, S. Bottani, J. Mallet, F. d'Alché-Buc. Inference of gene regulatory networks with Dynamic Bayesian Networks. Bioinformatics (Oxford Press), vol. 19, pp. i38-49, 2003.
- M. Quach, N. Brunel, F. d'Alché-Buc. Estimating parameters and hidden variables in nonlinear state-space models based on ODEs for biological networks inference. Bioinformatics, 23:3209-3216, November 2007.
- Y. Yamanishi, J.-P. Vert, M. Kanehisa. Supervised enzyme network inference from the integration of genomic data and chemical information. Bioinformatics, vol. 21, 2005.
