Predicting Gene Expression using Logic Modeling and Optimization

Abhimanyu Krishna

New Challenges in the European Area: Young Scientist’s 1st International Baku Forum

Gene Regulatory Network reconstruction

[Figure: network diagrams; labels: Input Stimuli, R, TRB, TRC, p, A, B, C]

What is Gene Expression? -> Regulation? -> Gene Regulatory Network?

Introduction:

Objective

Literature-based gene regulatory network + experimental expression data (missing expression values shown in grey)

How can the literature-based network be contextualized to our experimental conditions?

[Figure: landscape with two stable states and an unstable transient state]

Biological processes represented as transitions in a landscape

“Predicting missing expression values in gene regulatory networks using a discrete logic modeling optimization guided by network stable states”

Introduction: Networks of interactions

Why are these predictions not trivial?

Noisy network reconstruction process


Problem: Inconsistency between network and experimental expression data

Solution: Contextualize the network using experimental expression data

Why is this an optimization problem?


Local consistency


Edge removal


Local consistency and global consistency


Which property are we going to use in the optimization?

Network stability

Iterative network pruning

Objective function: The score S uses the normalized Hamming distance (h) to compare N Boolean gene expression values (σ) between all calculated steady states (α) of a pruned network and the two known phenotypes (φ1 and φ2) defined by the expression data, in order to identify the two best-matching phenotype/steady-state pairs (φα1 and φα2).
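The formula for S is not reproduced in this transcript, so the following is only a minimal sketch of the idea described above. It computes the normalized Hamming distance between each steady state and each phenotype, ignoring missing values, and keeps the best match per phenotype. The function names, the use of `None` for missing values, and the aggregation into a single score (1 minus the mean of the two best distances) are assumptions made for illustration, not the published definition.

```python
from typing import Optional, Sequence, Tuple

State = Sequence[int]                # Boolean values (0/1), one per gene
Phenotype = Sequence[Optional[int]]  # None marks a missing expression value


def normalized_hamming(state: State, phenotype: Phenotype) -> float:
    """Fraction of measured genes whose Boolean value differs."""
    pairs = [(s, p) for s, p in zip(state, phenotype) if p is not None]
    if not pairs:
        raise ValueError("phenotype has no measured genes")
    return sum(s != p for s, p in pairs) / len(pairs)


def best_match(steady_states, phenotype) -> Tuple[float, tuple]:
    """Return (distance, steady_state) for the closest steady state."""
    candidates = ((normalized_hamming(a, phenotype), tuple(a)) for a in steady_states)
    return min(candidates, key=lambda t: t[0])


def score(steady_states, phi1, phi2) -> float:
    """Assumed aggregation: 1 minus the mean of the two best distances."""
    d1, _ = best_match(steady_states, phi1)
    d2, _ = best_match(steady_states, phi2)
    return 1.0 - (d1 + d2) / 2.0


# Hypothetical data: two steady states of a pruned 4-gene network
steady_states = [(1, 0, 1, 0), (0, 1, 0, 1)]
phi1 = (1, 0, None, 0)  # third gene not measured
phi2 = (0, 1, 0, 0)
print(score(steady_states, phi1, phi2))
```

In this sketch a higher score means that the pruned network has steady states closer to both phenotypes; a pruning procedure would compare candidate networks by this value.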


However, the contributions of individual interactions to network stability are not independent of one another.

The evaluation of one specific link depends strongly on the links already removed, in other words on the order of removal.

We capture these interdependencies between variables by considering, in sequence, the probability distribution over positive circuits and the probability distribution over individual edges.

[Figure: two positive circuits and one negative circuit]

Thomas R, Thieffry D, Kaufman M: Dynamical behavior of biological regulatory networks. I. Biological role of feedback loops and practical use of the concept of the loop-characteristic state. Bulletin of Mathematical Biology 1995, 57:247-276.

A positive circuit is a necessary condition for the existence of several fixed points.
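To make the notion of circuit sign concrete, here is a small sketch that enumerates the simple circuits of a signed directed graph and classifies each one. A circuit is positive when the product of its edge signs is +1, that is, when it contains an even number of inhibitions. The example network and the function names are hypothetical, and the plain depth-first enumeration is only suitable for small networks.

```python
def simple_cycles(edges):
    """Enumerate simple circuits; each is reported once, starting at its smallest node."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    nodes = sorted(set(graph) | {dst for _, dst in edges})
    cycles = []

    def dfs(start, node, path):
        for nxt in graph.get(node, []):
            if nxt == start:
                cycles.append(tuple(path))
            elif nxt > start and nxt not in path:
                dfs(start, nxt, path + [nxt])

    for start in nodes:
        dfs(start, start, [start])
    return cycles


def circuit_sign(cycle, edges):
    """Product of the edge signs (+1 activation, -1 inhibition) along the circuit."""
    sign = 1
    for i, src in enumerate(cycle):
        dst = cycle[(i + 1) % len(cycle)]
        sign *= edges[(src, dst)]
    return sign


# Hypothetical signed network: A and B inhibit each other; B activates C; C inhibits B
edges = {("A", "B"): -1, ("B", "A"): -1, ("B", "C"): +1, ("C", "B"): -1}

cycles = simple_cycles(edges)
positive = [c for c in cycles if circuit_sign(c, edges) == +1]
negative = [c for c in cycles if circuit_sign(c, edges) == -1]
print("positive circuits:", positive)
print("negative circuits:", negative)
```

The mutual inhibition between A and B is a positive circuit because two negative signs multiply to +1, while the B/C loop with a single inhibition is negative.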

Iterative network pruning is then repeated for Positive Circuit 1, Positive Circuit 2 and Positive Circuit 3, with the same objective function S.


Biological scope targeted by this approach: transitions between long-term expression patterns, or stable states

Example: epithelial-mesenchymal transition (from the epithelial state to the mesenchymal state)

Computing attractors in a discrete dynamical system (Boolean)

Based on logic functions and the assumption of only two possible gene states: active (ON or 1) and inactive (OFF or 0).

Logic functions: the state of node x_i at time t+1 depends on the states of its regulators at time t.

Updating scheme: Synchronous

Types of attractors: fixed points and limit cycles

[Figure: examples of a fixed point and a limit cycle]
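The attractor computation described above can be sketched for a very small network. The three logic rules below are hypothetical and chosen only so that the example shows both attractor types; the search simply follows the synchronous trajectory from every one of the 2^n states, so it is practical only for small n.

```python
from itertools import product


def step(state):
    """Synchronous update of a hypothetical 3-gene network (A, B, C).

    A(t+1) = B(t);  B(t+1) = A(t);  C(t+1) = A(t) AND B(t)
    """
    a, b, c = state
    return (b, a, a & b)


def find_attractors(step_fn, n_genes):
    """Follow the trajectory from every state and collect the cycles reached."""
    attractors = set()
    for start in product((0, 1), repeat=n_genes):
        seen = {}        # state -> position in the trajectory
        trajectory = []
        state = start
        while state not in seen:
            seen[state] = len(trajectory)
            trajectory.append(state)
            state = step_fn(state)
        cycle = trajectory[seen[state]:]
        # Rotate so that the smallest state comes first (canonical form)
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


attractors = find_attractors(step, 3)
fixed_points = sorted(a[0] for a in attractors if len(a) == 1)
limit_cycles = sorted(a for a in attractors if len(a) > 1)
print("fixed points:", fixed_points)
print("limit cycles:", limit_cycles)
```

A fixed point is an attractor of length 1 (the state maps to itself); a limit cycle is an attractor that revisits the same sequence of two or more states.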


Consistency between expression data and network stable states


Iterative network pruning

Network topology optimized using an Estimation of Distribution Algorithm (EDA)

Toy example: optimization of h(x) (objective function)

h(x) = x1 + x2 + x3 + x4 + x5 + x6, with xi = 0 or 1

EDA: toy example

[Figure: an initial population, its top 10 solutions, and the next population generated from them]

Values shown for the six variables, filled in one at a time: 0.7 0.7 0.6 0.6 0.8 0.7

Stop criteria
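The toy example can be sketched as a univariate EDA: sample a population from per-variable probabilities, keep the top 10 solutions according to h(x), re-estimate each probability as the frequency of 1 among those solutions, and repeat. The population size, the iteration limit, the seed and the stop rule (all probabilities reach 0 or 1) are assumptions made for illustration; the slides do not specify them.

```python
import random


def h(x):
    """Toy objective from the slides: number of variables set to 1."""
    return sum(x)


def estimate_marginals(selected):
    """Per-variable frequency of 1 among the selected solutions."""
    n = len(selected)
    return [sum(column) / n for column in zip(*selected)]


def univariate_eda(n_vars=6, pop_size=30, top_k=10, max_iter=50, seed=0):
    rng = random.Random(seed)
    probs = [0.5] * n_vars  # initial population is sampled uniformly
    best = None
    for _ in range(max_iter):
        population = [
            tuple(int(rng.random() < p) for p in probs) for _ in range(pop_size)
        ]
        population.sort(key=h, reverse=True)
        top = population[:top_k]
        if best is None or h(top[0]) > h(best):
            best = top[0]
        probs = estimate_marginals(top)
        # Assumed stop criterion: the distribution has fully converged
        if all(p in (0.0, 1.0) for p in probs):
            break
    return best, probs


best, probs = univariate_eda()
print("best solution:", best, "h =", h(best))
print("final probabilities:", probs)
```

With 10 selected solutions every estimated probability is a multiple of 0.1, which matches the kind of values shown on the slides.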


Algorithm:


Predictions are based on the consensus among the family of alternative solutions.
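One simple way to form such a consensus is a per-gene majority vote over the values predicted by each alternative solution. The sketch below assumes that rule, which the slides do not spell out; the function name, the `min_agreement` threshold and the example data are hypothetical.

```python
from collections import Counter


def consensus(predictions, min_agreement=0.5):
    """Majority vote per gene over a family of alternative solutions.

    predictions: list of dicts mapping gene name -> predicted Boolean value.
    Returns gene -> (consensus value or None, agreement fraction).
    """
    genes = sorted({gene for p in predictions for gene in p})
    result = {}
    for gene in genes:
        votes = Counter(p[gene] for p in predictions if gene in p)
        value, count = votes.most_common(1)[0]
        agreement = count / sum(votes.values())
        # No call is made unless strictly more than min_agreement of the solutions agree
        result[gene] = (value if agreement > min_agreement else None, agreement)
    return result


# Hypothetical predictions for two unmeasured genes from four pruned networks
predictions = [
    {"G1": 1, "G2": 0},
    {"G1": 1, "G2": 1},
    {"G1": 1, "G2": 0},
    {"G1": 0, "G2": 1},
]
result = consensus(predictions)
print(result)
```

Reporting the agreement fraction alongside the value makes it possible to leave a gene unpredicted when the alternative solutions disagree.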


Availability:

Paper: http://nar.oxfordjournals.org/content/early/2012/08/30/nar.gks785.full

Software: http://maia.uni.lu/demo/

Isaac Crespo, Computational Biology Unit (LCSB)

Abhimanyu Krishna, Bioinformatics Core (LCSB)

Antony Le Béchec, Antonio del Sol

Head of Computational Biology Unit (LCSB)

Life Sciences Research Unit (LSRU)

Vital-IT (SIB)

Thank you!

Questions?
