VL Netzwerke, WS 2007/08 Edda Klipp 1 Max Planck Institute Molecular Genetics Humboldt University Berlin Theoretical Biophysics Networks in Metabolism and Signaling Edda Klipp Humboldt University Berlin Lecture 4 / WS 2007/08 Boolean Networks

One Network, Different Models


One Network, Different Models

[Figure: gene regulatory network of gene a, gene b, gene c, gene d with the proteins A, B, C, D and AB. Legend: transcription, translation, activation (+), repression, gene, protein.]

Directed graphs

V = {a, b, c, d}

E = {(a,c,+), (b,c,+), (c,b,-), (c,d,-), (d,b,+)}

Boolean network

a(t+1) = a(t)

b(t+1) = (not c(t)) and d(t)

c(t+1) = a(t) and b(t)

d(t+1) = not c(t)

Bayesian network

p(x_a)

p(x_b)

p(x_c | x_a, x_b)

p(x_d | x_c)


Simplification of Gene Expression Regulation

[Figure: the scheme Gene, mRNA, Protein (transcription factor), Gene, mRNA, Protein is reduced to a network of nodes A, B, C, D, E, F, G.]


Boolean Network

A Boolean network is

- a directed graph G(V,E)

characterized by

- the number of nodes ("genes"): N

- the number of inputs per node (regulatory interactions): k

[Figure: example network with nodes A, B, C, D, E, F, G]

N = 7; in-degrees: k_A = 0, k_B = 1, k_C = 2, …


Boolean Logic

(George Boole, 1815-1864)

Each gene can assume one of two states:

expressed ("1") or not expressed ("0")

Background:

- Not enough information for a more detailed description

- Increasing complexity and computational effort for more specific models

Replacement of continuous functions (e.g. Hill function) by a step function

Boolean models are discrete (in state and time) and deterministic.
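
The replacement of a continuous response by a step function can be illustrated with a short sketch. The Hill form x^n / (K^n + x^n), the threshold K = 1 and the exponents used here are illustrative assumptions, not values from the lecture.

```python
def hill(x, K=1.0, n=2):
    """Activating Hill function x^n / (K^n + x^n) (assumed form)."""
    return x**n / (K**n + x**n)

def step(x, K=1.0):
    """Boolean approximation: 1 ("expressed") at or above threshold K, else 0."""
    return 1 if x >= K else 0

# A steep Hill function (large n) approaches the step function.
for x in (0.5, 0.9, 1.1, 2.0):
    print(x, round(hill(x, n=2), 3), round(hill(x, n=50), 3), step(x))
```

The larger the Hill exponent, the closer the continuous curve comes to the 0/1 switch used in the Boolean description.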


Boolean Network

Boolean networks always have a finite number of possible states: $2^N$

and, therefore, a finite number of state transitions: $2^{2^N}$

[Figure: example network with nodes A, B, C, D, E, F, G]

N = 7: $2^7$ states, theoretically possible state transitions: $2^{2^7}$


Dynamics of Boolean Networks

The dynamics are described by rules:

"if the input value(s) at time t is/are …, then the output value at t+1 is …"

A → B

A(t) → B(t+1)


Boolean Models: Truth functions

input | output
  p   | rule 0   rule 1   rule 2   rule 3
  0   |   0        0        1        1
  1   |   0        1        0        1
      |            p      not p

Example: A → B with B(t+1) = not A(t) uses rule 2.


Dynamics of Boolean Networks with k=1

Linear chain: A → B → C → D

A is fixed (no input). Rules 0 and 3 are not considered, since their output is independent of the input.

A(t) → B(t+1)

B(t+1) → C(t+2)

C(t+2) → D(t+3)

The system reaches a steady state after N-1 time steps.


Dynamics of Boolean Networks with k=1

Ring

[Figure: ring networks; the example uses the two-node ring of A and B]

Again: rules 0 and 3 are not considered, since their output is independent of the input.

Case 1: A(t+1) = B(t), B(t+1) = A(t) (both rule 1)

t     A B   A B   A B   A B
0     0 0   1 0   0 1   1 1
1     0 0   0 1   1 0   1 1
2     0 0   1 0   0 1   1 1

Fixpoint or cycle of length 2, depending on initial conditions.

Case 2: A(t+1) = not B(t), B(t+1) = A(t) (rule 2 for A, rule 1 for B)

t     A B   A B   A B   A B
0     0 0   1 0   0 1   1 1
1     1 0   1 1   0 0   0 1
2     1 1   0 1   1 0   0 0
3     0 1   0 0   1 1   1 0
4     0 0   1 0   0 1   1 1

Cycle of length 4, independent of initial conditions.
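
The two ring examples can be replayed with a few lines of code. This is a minimal sketch: the names `RULES`, `ring_step` and `trajectory` are illustrative, and the rule numbering follows the k=1 truth table of the lecture (rule 1 copies the input, rule 2 negates it).

```python
# One-input rules, numbered as in the k=1 truth table.
RULES = {0: lambda p: 0, 1: lambda p: p, 2: lambda p: 1 - p, 3: lambda p: 1}

def ring_step(state, rule_a, rule_b):
    """Two-node ring: A reads B, B reads A; both update synchronously."""
    a, b = state
    return (RULES[rule_a](b), RULES[rule_b](a))

def trajectory(state, rule_a, rule_b, steps):
    """List of states visited, starting with the initial state."""
    out = [state]
    for _ in range(steps):
        state = ring_step(state, rule_a, rule_b)
        out.append(state)
    return out

for start in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(start, trajectory(start, 1, 1, 2), trajectory(start, 2, 1, 4))
```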


Attractor

The trajectory connects the successive states for increasing time.

An attractor is a region of a dynamical system's state space that the system can enter but not leave, and which contains no smaller such region (a special trajectory).

Fixpoint: cycle of length 1

Cycles of length L

Basin of attraction: the surrounding region in state space such that all trajectories starting in that region end up in the attractor.

Bifurcation: appearance of a border separating two basins of attraction.


Boolean Models: Truth functions k=2

input | output
p  q  | rule: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
0  0  |       0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1
0  1  |       0  0  0  0  1  1  1  1  0  0  0  0  1  1  1  1
1  0  |       0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1
1  1  |       0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1

Named rules: rule 1 = And, rule 7 = Or, rule 8 = Nor.

Example: A → C ← B with C(t+1) = (not A(t)) and B(t) uses rule 4, where p = A(t), q = B(t).
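
The rule numbering of the k=2 table can be made explicit in code. In this sketch (function names are illustrative), the rule number written as four binary digits is read as the output column from input row (0,0) down to row (1,1).

```python
def rule_output(rule, p, q):
    """Output of two-input rule number `rule` (0..15) for inputs p, q."""
    row = 2 * p + q                  # 0 for (0,0), 1 for (0,1), 2 for (1,0), 3 for (1,1)
    return (rule >> (3 - row)) & 1   # pick the binary digit belonging to that row

def truth_column(rule):
    """Outputs for the input rows (0,0), (0,1), (1,0), (1,1)."""
    return [rule_output(rule, p, q) for p in (0, 1) for q in (0, 1)]

print(truth_column(1))   # And
print(truth_column(7))   # Or
print(truth_column(8))   # Nor
print(truth_column(4))   # (not p) and q
```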


Example Network

Three genes X, Y, and Z

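[Figure: network of the three genes X, Y, Z]

Rules

X(t+1) = X(t) and Y(t)

Y(t+1) = X(t) or Y(t)

Z(t+1) = X(t) or (not Y(t) and Z(t))

Current state   Next state
000             000
001             001
010             010
011             010
100             011
101             011
110             111
111             111

[Figure: state transition graph over the states 000, 001, 010, 011, 100, 101, 110, 111]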
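
The transition table follows mechanically from the three rules. A minimal sketch (the names `step_xyz`, `table` and `fixed_points` are illustrative):

```python
from itertools import product

def step_xyz(state):
    """Synchronous update of (X, Y, Z) with the three rules above."""
    x, y, z = state
    return (x & y, x | y, x | ((1 - y) & z))

# Enumerate all 2^3 states and their successors.
table = {s: step_xyz(s) for s in product((0, 1), repeat=3)}
fixed_points = [s for s, nxt in table.items() if s == nxt]

for s, nxt in table.items():
    print("".join(map(str, s)), "->", "".join(map(str, nxt)))
print("fixed points:", fixed_points)
```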


Example Network

[Figure: network of X, Y, Z and its state transition graph over the states 000 to 111]

- The number of accessible states is finite: $2^N$.

- Cyclic trajectories are possible.

- Not every state must be reachable from every other state.

- The successor state is unique; the predecessor state is not unique.


Example Network as Boolean Model

[Figure: gene regulatory network of gene a, gene b, gene c, gene d with the proteins A, B, C, D and AB. Legend: transcription, translation, activation (+), repression, gene, protein.]

Boolean network

a(t+1) = a(t)

b(t+1) = (not c(t)) and d(t)

c(t+1) = a(t) and b(t)

d(t+1) = not c(t)


Example Network as Boolean Model

Boolean network

a(t+1) = a(t)

b(t+1) = (not c(t)) and d(t)

c(t+1) = a(t) and b(t)

d(t+1) = not c(t)

State transitions (state = abcd):

Current   Next           Current   Next
0000      0001           1000      1001
0001      0101           1001      1101
0010      0000           1010      1000
0011      0000           1011      1000
0100      0001           1100      1011
0101      0101           1101      1111
0110      0000           1110      1010
0111      0000           1111      1010

Steady state: 0101

Cycle: 1000 → 1001 → 1101 → 1111 → 1010 → 1000
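
The steady state and the cycle can be found by brute force over all 16 states. The following is a sketch only; `step_abcd`, `attractor_of` and `basins` are illustrative names.

```python
from itertools import product

def step_abcd(state):
    """Synchronous update of (a, b, c, d) with the four rules above."""
    a, b, c, d = state
    return (a, (1 - c) & d, a & b, 1 - c)

def attractor_of(state):
    """Iterate until a state repeats; return the cycle as a sorted tuple of states."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step_abcd(state)
    return tuple(sorted(seen[seen.index(state):]))

# Group all states by the attractor they end up in (basins of attraction).
basins = {}
for s in product((0, 1), repeat=4):
    basins.setdefault(attractor_of(s), []).append(s)

for cycle, basin in basins.items():
    print(len(cycle), "state(s) on the attractor,", len(basin), "state(s) in its basin")
```

Since a(t+1) = a(t), the value of a never changes, which is why the state space splits into two halves with one attractor each.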


Naïve Reconstruction of Boolean Models

If one knows

- the number of vertices, N,

- the number of inputs per vertex, k, and

- a sufficient set of successive states,

one can reconstruct the network.

List:

- For each vertex, list all possible input combinations

- List all respective outputs

Experiments:

- After every "experiment", delete all "wrong" entries from the list


Naïve Reconstruction of Boolean Models

[Figure: network of the two nodes A and B]

N=2, k=1

Truth functions for k=1:

In | Out
p  | rule 0   rule 1   rule 2   rule 3
0  |   0        0        1        1
1  |   0        1        0        1

Each table below lists, for one wiring and for every pair of rules (rule of A, rule of B), the output state A B for each input state A B. The notation A(X) means that A receives its input from X.

Wiring A(A), B(A)
Input | rule pair (A B)
A B   | 00 01 02 03 10 11 12 13 20 21 22 23 30 31 32 33
0 0   | 00 00 01 01 00 00 01 01 10 10 11 11 10 10 11 11
0 1   | 00 00 01 01 00 00 01 01 10 10 11 11 10 10 11 11
1 0   | 00 01 00 01 10 11 10 11 00 01 00 01 10 11 10 11
1 1   | 00 01 00 01 10 11 10 11 00 01 00 01 10 11 10 11

Wiring A(B), B(B)
Input | rule pair (A B)
A B   | 00 01 02 03 10 11 12 13 20 21 22 23 30 31 32 33
0 0   | 00 00 01 01 00 00 01 01 10 10 11 11 10 10 11 11
0 1   | 00 01 00 01 10 11 10 11 00 01 00 01 10 11 10 11
1 0   | 00 00 01 01 00 00 01 01 10 10 11 11 10 10 11 11
1 1   | 00 01 00 01 10 11 10 11 00 01 00 01 10 11 10 11

Wiring A(A), B(B)
Input | rule pair (A B)
A B   | 00 01 02 03 10 11 12 13 20 21 22 23 30 31 32 33
0 0   | 00 00 01 01 00 00 01 01 10 10 11 11 10 10 11 11
0 1   | 00 01 00 01 00 01 00 01 10 11 10 11 10 11 10 11
1 0   | 00 00 01 01 10 10 11 11 00 00 01 01 10 10 11 11
1 1   | 00 01 00 01 10 11 10 11 00 01 00 01 10 11 10 11

Wiring A(B), B(A)
Input | rule pair (A B)
A B   | 00 01 02 03 10 11 12 13 20 21 22 23 30 31 32 33
0 0   | 00 00 01 01 00 00 01 01 10 10 11 11 10 10 11 11
0 1   | 00 00 01 01 10 10 11 11 00 00 01 01 10 10 11 11
1 0   | 00 01 00 01 00 01 00 01 10 11 10 11 10 11 10 11
1 1   | 00 01 00 01 10 11 10 11 00 01 00 01 10 11 10 11

"Experiments…"

In    | Out
A B   | A B
0 0   | 0 1
0 1   | 1 1
1 0   | 0 0
1 1   | 1 0
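
The elimination procedure for N=2, k=1 can be sketched in code: enumerate every wiring and rule pair, then discard all candidates that contradict an experiment. The names `RULES`, `predict`, `candidates` and `experiments` are illustrative; the four experiments are the input/output pairs listed above.

```python
from itertools import product

# One-input rules, numbered as in the k=1 truth table.
RULES = {0: lambda p: 0, 1: lambda p: p, 2: lambda p: 1 - p, 3: lambda p: 1}
NODES = ("A", "B")

def predict(model, state):
    """model maps each node to (input node, rule number); returns the next state."""
    return tuple(
        RULES[rule](state[NODES.index(src)])
        for src, rule in (model[n] for n in NODES)
    )

# All candidate models: every choice of input node and rule for A and for B.
candidates = [
    {"A": (src_a, rule_a), "B": (src_b, rule_b)}
    for src_a, src_b in product(NODES, repeat=2)
    for rule_a, rule_b in product(RULES, repeat=2)
]

# The four "experiments": input state (A, B) -> output state (A, B).
experiments = [((0, 0), (0, 1)), ((0, 1), (1, 1)), ((1, 0), (0, 0)), ((1, 1), (1, 0))]

for state_in, state_out in experiments:
    # Delete all "wrong" entries after every experiment.
    candidates = [m for m in candidates if predict(m, state_in) == state_out]
    print(state_in, "->", state_out, ":", len(candidates), "candidate(s) left")

print(candidates)
```

With these four experiments a single candidate survives: A reads B with rule 1, and B reads A with rule 2.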


Random Boolean Networks

If the rules for updating states are unknown:

select rules randomly

N nodes, ½ p N (N-1) edges

[Figure: random network whose nodes carry randomly assigned rules, e.g. Rule 2, Rule 0, Rule 1, Rule 2]


Kauffman’s NK Boolean Networks

An NK automaton is an autonomous random network of N Boolean logic elements. Each element has K inputs and one output. The signals at inputs and outputs take binary (0 or 1) values. The Boolean elements of the network and the connections between elements are chosen in a random manner. There are no external inputs to the network. The number of elements N is assumed to be large.

S.A. Kauffman, 1969, J Theor Biol. Metabolic Stability and Epigenesis in Randomly Constructed Genetic Nets

S.A. Kauffman. The Origins of Order: Self-Organization and Selection in Evolution, Oxford University Press, New York, 1993.

S.A. Kauffman, 2003, PNAS, Random Boolean Network Models and the Yeast Transcriptional Network


Kauffman’s NK Boolean Networks

An automaton operates in discrete time. The set of the output signals of the Boolean elements at a given moment of time characterizes a current state of an automaton. During an automaton operation, the sequence of states converges to a cyclic attractor. The states of an attractor can be considered as a "program" of an automaton operation. The number of attractors M and the typical attractor length L are important characteristics of NK automata.
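
A random NK automaton and its convergence to a cyclic attractor can be sketched as follows. Several choices here are assumptions made for illustration only: inputs are drawn with replacement (so an element may read itself or the same element twice), truth tables are filled with unbiased random bits, and N = 12, K = 2 and the seed are arbitrary.

```python
import random

def random_nk_network(n, k, rng):
    """Each element gets k random input elements and a random truth table of length 2**k."""
    inputs = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronous update of all elements."""
    new = []
    for ins, table in zip(inputs, tables):
        index = 0
        for i in ins:                 # pack the input values into a table index
            index = 2 * index + state[i]
        new.append(table[index])
    return tuple(new)

def find_cycle(state, inputs, tables):
    """Return (transient length, cycle length) of the trajectory starting at `state`."""
    first_seen = {}
    t = 0
    while state not in first_seen:
        first_seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return first_seen[state], t - first_seen[state]

rng = random.Random(0)   # fixed seed, only to make the sketch reproducible
inputs, tables = random_nk_network(n=12, k=2, rng=rng)
start = tuple(rng.randint(0, 1) for _ in range(12))
print(find_cycle(start, inputs, tables))
```

Repeating `find_cycle` over many initial states and many random networks gives empirical estimates of the number of attractors M and the typical attractor length L.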


Kauffman’s Boolean Network

Fundamental question: Do metabolic stability and epigenesis require the genetic regulatory circuits to be precisely constructed?

Has fortunate evolutionary history selected only nets of highly ordered circuits which alone insure metabolic stability;

or are stability and epigenesis, even in nets of randomly interconnected regulatory circuits, to be expected as the probable consequence of as yet unknown mathematical laws?

Are living things more akin to precisely programmed automata selected by evolution, or to randomly assembled automata…?

Note: cellular differentiation despite identical sets of genes


Kauffman’s Boolean Network


Further Properties

K connections: $2^{2^K}$ Boolean input functions

Nets are free of external inputs.

Once connections and rules are selected, they remain constant and the time evolution is deterministic.

Earlier work by Walker and Ashby (1965): same Boolean function for all genes.

Choice of Boolean function affects length of cycles:

- "and" yields short cycles

- "exclusive or" yields cycles of immense length


Further Properties: Cycles

State of the net: row listing the present value of all N elements (0 or 1)

Finite number of states ($2^N$): as the system passes along a sequence of states from an arbitrary initial state, it must eventually re-enter a state previously passed, which closes a cycle.

Cycle length: number of states on a re-entrant cycle of behavior

Cycle of length 1 – equilibrial state

Transient (or run-in) length: number of states between the initial state and entry into the cycle

Confluent: set of states leading to or being part of a cycle


Further Properties: Number of Cycles

Such a net must contain at least one cycle; it may have more.

Their number can be counted simply by releasing the net from different initial states.

No state can diverge onto two different states; no state can be on two different cycles.


Further Properties: Number of Cycles

(a) A net of three binary elements, each of which receives inputs from the other two. The Boolean function assigned to each element is shown beside the element.

(b) All possible states of the 3-element net are shown in the left 3 x 8 matrix below T. The subsequent state of the net at time T+1, shown in the matrix on the right, is derived from the inputs and functions shown in (a).

(c) A kimatograph showing the sequence of state transitions leading into a state cycle of length 3. All states lie on one confluent. There are three run-ins to the single state cycle.


Example: Net with N=10

Periodic attractor (yellow) and basin of attraction (cyan)


Example: Net with N=10

The entire state space of an RBN with 10 nodes. Note: Self connections do not appear so a period-1 attractor appears to have no outputs although each network state must have exactly one output.


Further Properties: Distance

Distance compares two states of the net.

It can be defined as the number of genes with different values in the two states.

For example, N=5: state (00000) and state (00111) differ in the value of three elements.

This is used as a measure of dissimilarity between

- subsequent states on a transient

- subsequent states on a cycle

- cycles
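
This distance is the Hamming distance between two state vectors. A minimal sketch (function names are illustrative; states are given as strings or tuples of equal length):

```python
def distance(s1, s2):
    """Number of elements whose values differ between two states (Hamming distance)."""
    if len(s1) != len(s2):
        raise ValueError("states must have the same number of elements")
    return sum(x != y for x, y in zip(s1, s2))

def normalized_distance(s1, s2):
    """Distance as a fraction of the N elements."""
    return distance(s1, s2) / len(s1)

print(distance("00000", "00111"))
print(normalized_distance("00000", "00111"))
```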


Totally Connected Nets, K=N

A totally connected net is like a random mapping of a finite set of numbers into itself.

Expected length of cycle: $\sqrt{2^N} = 2^{N/2}$

E.g. a net with N = 200 elements: expected cycle length $2^{100} \approx 10^{30}$

Compare to Hubble's age of the universe: $10^{23}$

If every transition took only a second…

Such networks are biologically impossible.


One Connected Nets, K=1

Either one cycle of length N

or a number of disconnected cycles: for the full system, the state cycle lengths are lowest common multiples of the individual loop lengths,

so the state cycle length easily becomes very large.

Again biologically not feasible


Two Connected Nets, K=2

Kauffman studied networks of N = 15, 50, 64, …, 400, 1024, …, 8191

Nets of 1000 elements possess $2^{1000} \approx 10^{300}$ states

16 Boolean functions

Study of cycle length (surprisingly short)


Two Connected Nets, K=2: Cycle Length

(a) A histogram of the lengths of state cycles in nets of 400 binary elements which used all 16 Boolean functions of two variables equiprobably. The distribution is skewed toward short cycles.

(b) A histogram of the lengths of state cycles in nets of 400 binary elements which used neither tautology nor contradiction, but used the remaining 14 Boolean functions of 2 variables equiprobably. The distribution is skewed toward short cycles.


Two Connected Nets, K=2: Cycle Length

Log median cycle length as a function of log N, in nets using all 16 Boolean functions of two inputs (all Boolean functions used), and in nets disallowing tautology and contradiction (tautology and contradiction not used). The asymptotic slopes are about 0.3 and 0.6.


K=2: Transient Lengths

A scattergram of run-in length and cycle length in nets of 400 binary elements using neither tautology nor contradiction. Run-in length appears uncorrelated with cycle length. A log/log plot was used merely to accommodate the data.


K=2: Number of Cycles

A histogram of the number of cycles per net in nets of 400 elements using neither tautology nor contradiction, but the remaining Boolean functions of two inputs equiprobably. The median is 10 cycles per net. The distribution is skewed toward few cycles.

Expected number of cycles: $\sqrt{N}$


K=2: Activity

After release from an arbitrary initial state:

The number of elements changing their state per state transition decreases.

Example: net of 100 elements

- first step: about 0.4 N elements change

- exponential decay of this number

- minimum activity: 0 to 0.25 N

On a cycle: 0 to 35 of 100 elements change

Most genes are constant during a cycle.

(Lecture covered up to here on 12 Nov 2007)


Noise

One unit of noise may be introduced by arbitrarily changing the value of a single gene for one time step.

The system may return to the cycle perturbed or run into a different cycle.

In a net of size N there are just N states which differ from any given state in the value of just one gene.

Consider a net with several cycles: by perturbing all states on each cycle (distance 1), one obtains a matrix listing all cycles and how often each is reached from another one.

Dividing all cells by the row totals yields transition probabilities.

The matrix defines a Markov chain.


Noise: for the Example

Boolean network

a(t+1) = a(t)

b(t+1) = (not c(t)) and d(t)

c(t+1) = a(t) and b(t)

d(t+1) = not c(t)

Cycle 1

Current   Next
0000      0001
0001      0101
0010      0000
0011      0000
0100      0001
0101      0101
0110      0000
0111      0000

Steady state: 0101

Cycle 2

Current   Next
1000      1001
1001      1101
1010      1000
1011      1000
1100      1011
1101      1111
1110      1010
1111      1010

Cycle: 1000 → 1001 → 1101 → 1111 → 1010 → 1000

Transition matrix

      C1    C2
C1    ¾     ¼
C2    ¼     ¾
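
The transition matrix can be reproduced by flipping each gene of each cycle state once and recording the cycle that is reached. This is a sketch with illustrative names (`cycle_from`, `transition_row`); cycles are ordered by length, so C1 is the steady state and C2 the five-state cycle.

```python
from fractions import Fraction
from itertools import product

def step_abcd(state):
    """Synchronous update of (a, b, c, d) with the four rules above."""
    a, b, c, d = state
    return (a, (1 - c) & d, a & b, 1 - c)

def cycle_from(state):
    """Follow the trajectory until it repeats; return the set of states on the cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step_abcd(state)
    return frozenset(seen[seen.index(state):])

# All distinct cycles of the network, shortest first: C1 (fixed point), C2.
cycles = sorted({cycle_from(s) for s in product((0, 1), repeat=4)}, key=len)

def transition_row(cycle):
    """Apply every one-gene perturbation to every state on `cycle`; normalize the counts."""
    counts = [0] * len(cycles)
    for state in cycle:
        for i in range(len(state)):
            flipped = state[:i] + (1 - state[i],) + state[i + 1:]
            counts[cycles.index(cycle_from(flipped))] += 1
    total = sum(counts)
    return [Fraction(c, total) for c in counts]

matrix = [transition_row(c) for c in cycles]
print(matrix)
```

Only a flip of gene a moves the system to the other cycle, which is one of the four possible one-gene perturbations of any state.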


Noise

(a) A matrix listing the 30 cycles of one net and the total number of times one unit of perturbation shifted the net from each cycle to each cycle. The system generally returns to the cycle perturbed. Division of the value in each cell of the matrix by the total of its row yields the matrix of transition probabilities between modes of behavior which constitute a Markov chain. The transition probabilities between cycles may be asymmetric.

(b) Transitions between cycles in the net shown in (a). The solid arrows are the most probable transition to a cycle other than the cycle perturbed, the dotted arrows are the second most probable. The remaining transitions are not shown. Cycles 2, 7, 5 and 15 form an ergodic set into which the remaining cycles flow. If all the transitions between cycles are included, the ergodic set of cycles becomes: 1, 2, 3, 5, 6, 12, 13, 15, 16. The remainder are transient cycles leading into this single ergodic set.


Noise

The total number of cycles reached from each cycle after it was perturbed in all possible ways by one unit of noise, correlated with the number of cycles in the net being perturbed. The data is from nets using neither tautology nor contradiction, with N = 191 and 400.


Application to Cell Cycle

Logarithm of cell replication time in minutes against logarithm of estimated number of genes for various single-cell organisms and cell types. Solid line: connects median replication times of bacteria, protozoa, chicken, mouse, dog, and man.


Application to Cellular Differentiation

The logarithm of the number of cell types is plotted against the logarithm of the estimated number of genes per cell, and the logarithm of the median number of state cycles is plotted against logarithm N. The observed and theoretical slopes are about 0.5. Scale: $2 \times 10^{6}$ genes per cell = $6 \times 10^{-12}$ g DNA per cell.


Distance: the Derrida Plot

Recurrence relation showing the expected distance $D_{T+1}$ between two states at time T+1 after each is acted upon by the network at time T, as a function of the distance $D_T$ between the two states at time T. Distance is normalized to the fraction of elements in different activity values in the two states being compared. For K=2, the recurrence curve is below the 45° line, and hence the distance between arbitrary initial states decreases toward zero over iterations. For K>2, states that are initially very close diverge to an asymptotic distance given by the crossing of the corresponding K curve at the 45° line. Thus K>2 networks exhibit sensitivity to initial conditions and chaos, not order.

Example: N=3, at T two states (000) and (001): distance 1 (or 1/3 normalized)

Transition to T+1: (000) → (100) and (001) → (010): distance 2 (or 2/3)
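
The shape of such recurrence curves can be illustrated with the annealed approximation $D_{T+1} = (1 - (1 - D_T)^K)/2$ for unbiased random Boolean functions (Derrida and Pomeau). This formula is not given on the slide and is used here as an assumption; the function names are illustrative.

```python
def derrida_map(d, k):
    """Annealed approximation: D_{T+1} = (1 - (1 - D_T)**K) / 2 (assumed, unbiased functions)."""
    return (1.0 - (1.0 - d) ** k) / 2.0

def asymptotic_distance(k, d0=0.01, iterations=1000):
    """Iterate the map from a small initial distance d0."""
    d = d0
    for _ in range(iterations):
        d = derrida_map(d, k)
    return d

for k in (1, 2, 3, 4):
    curve = [round(derrida_map(d, k), 3) for d in (0.1, 0.3, 0.5)]
    print(k, curve, round(asymptotic_distance(k), 3))
```

Under this approximation the slope of the curve at the origin is K/2, so the curve for K=2 stays below the 45° line, while for K>2 it crosses the line at a nonzero distance.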


Ordered and Chaotic Regimes

Series of states:

White: changing

Black: not changing

Edge of Chaos

Chaotic regime Ordered regime


Kauffman’s NK Boolean Networks: Dependence of Behavior on Degree K

K large (K = N): the behavior is essentially stochastic.

The successive states are random with respect to the preceding ones. The "programs" are very sensitive to minimal disturbances (a minimal disturbance is a change of an output of a particular element during an automaton operation) and to mutations (changes in Boolean element types and in network connections). The attractor lengths L are very large: L ~ $2^{N/2}$. The number of attractors M is of the order of N.

If the connection degree K is decreased, this stochastic type of behavior is still observed, until K ~ 2.


Kauffman’s NK Boolean Networks: Dependence of Behavior on Degree K

At K ~ 2 the network behavior changes drastically.

The sensitivity to minimal disturbances is small.

Mutations typically create only slight variations in the automaton dynamics. Only some rare mutations evoke radical, cascading changes in the automata "programs".

The attractor length L and the number of attractors M are of the order of $\sqrt{N}$. This is the behavior at the edge of chaos, at the borderland between chaos and order.


Boolean Dynamics – Scale Free Nets


Boolean Dynamics – Scale Free Nets

Typical examples of directed networks are shown for the size N = 64: (a),(b) random network with K = 2; (e),(f) scale-free network with <k> = 2.

The same network is shown in two representations. In (a) and (e) the nodes are placed on a circle at equal distances. In (b) and (f) the nodes are randomly distributed in the square.

Each node is drawn as a bold point whose size is proportional to its number of input links. The two ends of a link are drawn in deep and faint color, so that the direction of the link is indicated by the color gradation along it.


Boolean Dynamics – Scale Free Nets: Distribution of Cycle Lengths

K=2 <k>=2


Boolean Dynamics – Scale Free Nets: Distribution of Cycle Lengths

K=2 <k>=2

N=40


Boolean Dynamics – Scale Free Nets: Distribution of Cycle Lengths

Histograms h of the lengths Lc of state cycles in various types of directed networks. The network size is N = 80. Each histogram is generated by $10^3$ different sets of the Boolean functions and five different network structures. The maximum iteration number of the Boolean dynamics is $10^5$ until the convergence to the cycle is realized. (a) the RBN with K = 2; (b) the SFRBN with <k> = 2.


Boolean Dynamics – Scale Free Nets: Derrida Plots (Distance between States)

Derrida plots of the SFRBN with <k> = 1, <k> = 2 and <k> = 4. The analytical curves for the RBN with K = 1, K = 2 and K = 4 are also overplotted. The line H(t+1) = H(t) is the dividing line between order and chaos. It is clear that K = 2 lies directly on this line. The system size is N = 1024 and the number of initial states for averaging is 2000.


Example: Yeast Cell Cycle

[Figure: scheme of the yeast cell cycle. Phases and events: Start, G1, S, M (metaphase), M (anaphase), Finish, cell division, budding. Components: Cln2, Clb5, Clb2, Sic1, Sic1P, Sic1Clb5, Sic1Clb2, Cdc20, SBF, MBF, Hct1, APC. Legend: progression through cell cycle; production, degradation, complex formation; activation; inhibition; active protein or complex; inactive protein or complex.]


Cell Cycle Models

[Figure: scheme of the minimal model; labels: Cyclin, M+, M, X+, X, rates vi, vd, v1, v2, v3, v4]

Minimal model taking into account a cyclin, a cyclin dependent kinase (CDK = M) and a protease (X). M and X may assume active and inactive states. Model shows oscillations. (Goldbeter, 1991)

ODE models of increasing complexity (Tyson & Novak groups, 1993-2007), including cyclins, CDKs, transcriptional activators and repressors.

Shows oscillations, with some tricks.


Yeast Cell Cycle – Data


Yeast Cell Cycle – Data


Yeast Cell Cycle – Model


Yeast Cell Cycle – Model

Page 61: One Network, Different Models

VL Netzwerke, WS 2007/08 Edda Klipp 61

Max Planck Institute Molecular Genetics

Humboldt University BerlinTheoretical Biophysics

Yeast Cell Cycle – Model

Regulatory interactions of 20 genes of S. cerevisiae. Solid arcs represent activating regulation, dashed arcs represent inhibitory regulation. The relationship between genes regulating a common target gene is described by an 'OR' function.


Boolean Network Identification

Given: Experimental data

Sought: Network connectivity and Boolean rules



Identification Problem

Let (Ij, Oj) be a pair of expression patterns of {v1, …, vn}, where Ij corresponds to the INPUT and Oj corresponds to the OUTPUT. We call the pair (Ij, Oj) an example.

The identification problem is defined formally in terms of such examples. Three related problems are defined as well: the consistency problem, the counting problem and the enumeration problem.

A node vi in a Boolean network G(V, F) is consistent with an example (Ij, Oj) if Oj(vi) = fi(Ij(vi1), …, Ij(vik)) holds, where vi1, …, vik are the input nodes of vi. A Boolean network G(V, F) is consistent with (Ij, Oj) if all its nodes are consistent with (Ij, Oj). For a set of examples EX = {(I1, O1), (I2, O2), …, (Im, Om)}, the network G(V, F) (resp. node vi) is consistent with EX if G(V, F) (resp. node vi) is consistent with all (Ij, Oj) for 1 ≤ j ≤ m.


CONSISTENCY: Given N (the number of nodes) and EX, decide whether or not there exists a Boolean network consistent with EX and output one if it exists;

COUNTING: Given N and EX, count the number of Boolean networks consistent with EX;

ENUMERATION: Given N and EX, output all the Boolean networks consistent with EX;

IDENTIFICATION: Given N and EX, decide whether or not there exists a unique Boolean network consistent with EX and output it if it exists.



Identification: Algorithm

For the consistency problem:

The algorithm below is conceptually very simple: it outputs Boolean functions consistent with the given examples.

(1) For each node vi ∈ V, execute STEP (2).

(2) If there exists a triplet (fi, vk, vh) satisfying Oj(vi) = fi(Ij(vk), Ij(vh)) for all j = 1, …, m, output fi as a Boolean function assigned to vi and output vk, vh as input nodes to vi.

To find a triplet (fi, vk, vh), we use a simple exhaustive search: for each pair of nodes (vk, vh) (k < h) and for each Boolean function f, we check whether or not Oj(vi) = f(Ij(vk), Ij(vh)) holds for all j.
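The exhaustive search of STEPs (1) and (2) might be sketched as follows for two inputs per node. This is a minimal illustration, not the lecture's implementation: Boolean functions are represented as truth tables (dicts), and the names `find_triplet`, `solve_consistency`, `step` and `EXAMPLES` are hypothetical. The examples are generated from the four-gene network shown earlier in the lecture.

```python
from itertools import combinations, product

NODES = ["a", "b", "c", "d"]
PAIR_INPUTS = list(product((0, 1), repeat=2))   # (0,0), (0,1), (1,0), (1,1)


def step(s):
    """Example network from the lecture, used only to generate examples."""
    return {
        "a": s["a"],
        "b": int((not s["c"]) and s["d"]),
        "c": int(s["a"] and s["b"]),
        "d": int(not s["c"]),
    }


def find_triplet(node, nodes, examples):
    """STEP (2): first (f, vk, vh) with O_j(node) = f(I_j(vk), I_j(vh)) for all j."""
    for vk, vh in combinations(nodes, 2):            # all pairs with k < h
        for outputs in product((0, 1), repeat=4):    # all 16 two-input functions
            f = dict(zip(PAIR_INPUTS, outputs))      # truth table of f
            if all(out[node] == f[(inp[vk], inp[vh])] for inp, out in examples):
                return f, vk, vh
    return None


def solve_consistency(nodes, examples):
    """STEP (1): run STEP (2) for each node; None if some node has no triplet."""
    network = {}
    for node in nodes:
        triplet = find_triplet(node, nodes, examples)
        if triplet is None:
            return None
        network[node] = triplet
    return network


# All 16 state transitions of the example network serve as the example set EX.
states = [dict(zip(NODES, bits)) for bits in product((0, 1), repeat=len(NODES))]
EXAMPLES = [(s, step(s)) for s in states]
network = solve_consistency(NODES, EXAMPLES)
for node, (f, vk, vh) in network.items():
    print(node, "<-", vk, vh, f)
```

Nodes with a single effective input, such as a and d, are still reported with a pair of input nodes; the returned truth table then simply ignores one of the two.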


For the enumeration problem, replace STEP (2) with the following:

(2') Enumerate all triplets (fi, vk, vh) satisfying Oj(vi) = fi(Ij(vk), Ij(vh)) for all j = 1, …, m.

Then any combination of triplets ((f1, vk1, vh1), (f2, vk2, vh2), …, (fn, vkn, vhn)) represents a consistent Boolean network. The triplets must be enumerated carefully, since two or more triplets can represent the same Boolean function (such as vk ∧ vh and vh ∧ vk).


For the counting problem: simply multiply the numbers of triplets consistent with each node.

For the identification problem: replace STEP (2) with the following:

(2") If there exists exactly one triplet (fi, vk, vh) satisfying Oj(vi) = fi(Ij(vk), Ij(vh)) for all j = 1, …, m, output fi as a Boolean function assigned to vi and output vk, vh as input nodes to vi.
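The enumeration, counting and identification variants can be sketched together for two inputs per node. The sketch below is an illustration under stated assumptions, not the lecture's code: the helper `canonical` is a hypothetical way of enumerating carefully, reducing each truth table to the inputs it really depends on so that the same function found via different node pairs is counted once. Here `identify` returns a network only if every node has exactly one candidate.

```python
from itertools import combinations, product

NODES = ["a", "b", "c", "d"]
PAIR_INPUTS = list(product((0, 1), repeat=2))


def step(s):
    """Example network from the lecture, used only to generate examples."""
    return {"a": s["a"], "b": int((not s["c"]) and s["d"]),
            "c": int(s["a"] and s["b"]), "d": int(not s["c"])}


def canonical(f, vk, vh):
    """Drop inputs that f ignores, so that equal functions compare equal."""
    dep_k = any(f[(0, y)] != f[(1, y)] for y in (0, 1))
    dep_h = any(f[(x, 0)] != f[(x, 1)] for x in (0, 1))
    if dep_k and dep_h:
        return (vk, vh), tuple(f[i] for i in PAIR_INPUTS)
    if dep_k:
        return (vk,), (f[(0, 0)], f[(1, 0)])
    if dep_h:
        return (vh,), (f[(0, 0)], f[(0, 1)])
    return (), (f[(0, 0)],)


def enumerate_rules(node, nodes, examples):
    """STEP (2'): all distinct functions consistent with every example."""
    found = set()
    for vk, vh in combinations(nodes, 2):
        for outputs in product((0, 1), repeat=4):
            f = dict(zip(PAIR_INPUTS, outputs))
            if all(out[node] == f[(inp[vk], inp[vh])] for inp, out in examples):
                found.add(canonical(f, vk, vh))
    return found


def count_networks(nodes, examples):
    """Counting: multiply the numbers of consistent functions per node."""
    total = 1
    for node in nodes:
        total *= len(enumerate_rules(node, nodes, examples))
    return total


def identify(nodes, examples):
    """STEP (2''): a network only if each node has exactly one candidate."""
    rules = {}
    for node in nodes:
        candidates = enumerate_rules(node, nodes, examples)
        if len(candidates) != 1:
            return None
        rules[node] = next(iter(candidates))
    return rules


states = [dict(zip(NODES, bits)) for bits in product((0, 1), repeat=4)]
ALL_EXAMPLES = [(s, step(s)) for s in states]
print(count_networks(NODES, ALL_EXAMPLES))
print(identify(NODES, ALL_EXAMPLES))
print(count_networks(NODES, ALL_EXAMPLES[:1]))
```

With all 16 transitions the network is determined uniquely; with a single example many networks remain consistent, so identification fails while the consistency problem is still solvable.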


Identification: Time Complexity

There are $2^{2^K}$ Boolean functions with K input variables.

There are $O(N^K)$ possible combinations of input nodes per node.

For each node, $O(2^{2^K} \cdot N^K)$ triplets are examined in the algorithm.

For each triplet, m examples are examined. Therefore, $O(2^{2^K} \cdot N^{K+1} \cdot m)$ pairs of Boolean functions and examples are examined in total. Examining one pair requires O(K) time. Therefore, the algorithm runs in $O(K \cdot 2^{2^K} \cdot N^{K+1} \cdot m)$ time. Thus, the algorithm runs in polynomial time for fixed K.

Similarly, we can show that the algorithms for the counting problem and the identification problem run in polynomial time for fixed K.

For any Boolean network of fixed K, O(log N) INPUT/OUTPUT pairs are sufficient for identification with high probability.
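To get a feeling for these counts, the factors can be evaluated for concrete values of N, K and m. The helper `search_space` below is a hypothetical illustration; it uses the exact number of unordered input sets, the binomial coefficient C(N, K), which is bounded by the N^K used in the estimate above.

```python
from math import comb


def search_space(n_nodes, k, m_examples):
    """Return (functions, input sets, candidates per node, total checks)."""
    functions = 2 ** (2 ** k)            # Boolean functions of k inputs
    input_sets = comb(n_nodes, k)        # unordered choices of k input nodes
    per_node = functions * input_sets    # candidates examined per node
    checks = n_nodes * per_node * m_examples
    return functions, input_sets, per_node, checks


# Four-gene example with two inputs per node and all 16 transitions.
print(search_space(4, 2, 16))
# A network of 50 genes with indegree 3 and 100 examples.
print(search_space(50, 3, 100))
```

The double exponential in K is harmless for small indegree (256 functions for K = 3), so the cost is dominated by the polynomial factor in N.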


Boolean Network Identification: Reveal

Results obtained with the REVEAL algorithm suggested that only a small number of state transition pairs (100 pairs out of the $2^{50} \approx 10^{15}$ possible) were sufficient for inferring Boolean networks with 50 nodes (genes) whose indegree (the number of input nodes to a node) was bounded by 3.


Information theoretic principles of mutual information (M) analysis

Information theory provides us with a quantitative information measure, the Shannon entropy, H. The Shannon entropy is defined in terms of the probability $p_i$ of observing a particular symbol or event i within a given sequence (Shannon & Weaver, 1963):

$H = -\sum_i p_i \log p_i$.


Shannon Entropy

In a binary system, an element X may be in either of s = 2 states, say on or off.

Over a particular sequence of events, the probabilities of X being on, p(1), and off, p(0), must sum to unity. Therefore p(1) = 1 - p(0), and H(X) = -p(0) log[p(0)] - [1 - p(0)] log[1 - p(0)].

H reaches its maximum when the on and off states are equiprobable, i.e. the system is using each information carrying state to its fullest possible extent. As one state becomes more probable than the other, H decreases - the system is becoming biased. In the limiting case, where one probability is unity (certainty) and the other(s) zero (impossibility), H is zero (no uncertainty - no freedom of choice - no information).

The maximum entropy, Hmax, occurs when all states are equiprobable, i.e. p(0)=p(1) =1/2. Accordingly, Hmax=log(2).

Entropies are commonly measured in “bits” (binary digits), when using the logarithm on base 2; e.g. Hmax=1 for a 2 state system.
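The behaviour of H for a binary element can be checked numerically. The function `binary_entropy` below is a small illustrative helper (the name is an assumption); it uses the logarithm to base 2, so the result is in bits, and it treats 0 log 0 as 0.

```python
from math import log2


def binary_entropy(p0):
    """Shannon entropy in bits of a binary element with P(off) = p0."""
    h = 0.0
    for p in (p0, 1.0 - p0):
        if p > 0.0:              # convention: 0 * log(0) contributes nothing
            h -= p * log2(p)
    return h


for p0 in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p0, round(binary_entropy(p0), 3))
```

The values rise from 0 at p(0) = 0 to the maximum of 1 bit at p(0) = 1/2 and fall back to 0 at p(0) = 1, symmetric about 1/2.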


Determination of H. a) Single element. Probabilities are calculated from frequency of on/off values of X and Y.


Shannon Entropy: Co-Occurrence

Determination of H. b) Distribution of value pairs. H is calculated from the probabilities of co-occurrence.

$H(X) = -\sum_i p_i \log p_i$, $H(Y) = -\sum_j p_j \log p_j$, and $H(X, Y) = -\sum_{i,j} p_{i,j} \log p_{i,j}$
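A minimal sketch of how H(X), H(Y) and H(X, Y) can be estimated from observed on/off sequences, with probabilities taken as relative frequencies. The sequences `X` and `Y` are made-up illustration data, not the data from the slide, and the function names are assumptions.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy in bits, with probabilities estimated as frequencies."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())


def joint_entropy(xs, ys):
    """H(X, Y): entropy of the sequence of co-occurring value pairs."""
    return entropy(list(zip(xs, ys)))


# Hypothetical on/off time series of two elements.
X = [0, 0, 1, 1]
Y = [0, 1, 0, 1]
print(entropy(X), entropy(Y), joint_entropy(X, Y))
```

In this example every value pair occurs equally often, so H(X, Y) equals H(X) + H(Y); any dependence between X and Y would make the joint entropy smaller than that sum.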


Conditional Entropies

There are 2 conditional entropies which capture the relationship between the sequences of X and Y,

H(X|Y) and H(Y|X).

These are related as follows (Shannon & Weaver, 1963):

H(X,Y) = H(Y|X) + H(X) = H(X|Y) + H(Y) .

In words, the uncertainty of X and the remaining uncertainty of Y given knowledge of X, H(Y|X), i.e. the information contained in Y that is not shared with X, sum to the entropy of the combination of X and Y.


Mutual Information

“Mutual information”, M(X,Y), also referred to as “rate of transmission” between an input/output channel pair (Shannon & Weaver, 1963) is defined as:

M(X,Y) = H(Y) - H(Y|X) = H(X) - H(X|Y).

The shared information between X and Y corresponds to the remaining information of X if we remove the information of X that is not shared with Y. Using the above equations, mutual information can be defined directly in terms of the original entropies; this formulation will be important for the considerations below:

M(X,Y) = H(X) + H(Y) - H(X,Y).
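The relations between joint entropy, conditional entropy and mutual information can be checked on small sequences. The sketch below is illustrative only; the sequences are made up and the function names are assumptions. Conditional entropy is computed as H(Y|X) = H(X,Y) - H(X), and mutual information as M(X,Y) = H(X) + H(Y) - H(X,Y).

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy in bits, with probabilities estimated as frequencies."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())


def conditional_entropy(ys, xs):
    """H(Y|X) = H(X,Y) - H(X)."""
    return entropy(list(zip(xs, ys))) - entropy(xs)


def mutual_information(xs, ys):
    """M(X,Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


X = [0, 0, 1, 1]
Y_independent = [0, 1, 0, 1]       # carries no information about X
Y_negated = [1, 1, 0, 0]           # Y = not X, fully determined by X
print(mutual_information(X, Y_independent), conditional_entropy(Y_independent, X))
print(mutual_information(X, Y_negated), conditional_entropy(Y_negated, X))
```

When Y is a function of X, H(Y|X) vanishes and M(X,Y) equals H(Y).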


Venn diagrams of information relationships. In each case, add the shaded portions of both squares to determine one of the following: [H(X)+H(Y)], H(X,Y), and M(X,Y). The small corner rectangles represent information that X and Y have in common. H(Y) is shown smaller than H(X) and with the corner rectangle on the left instead of the right to indicate that X and Y are different, although they have some mutual information.


REVEAL Algorithm

1. Identification of perfect input-output state pairs of connectivity k=1. Compute the mutual information of all input-output state vector pairs. The calculation reveals that M(a(t+1), a(t)) = H(a(t+1)), i.e. a(t) uniquely determines a(t+1). Likewise, M(d(t+1), c(t)) = H(d(t+1)), i.e. c(t) uniquely determines d(t+1). For all other genes there is no perfect match.

2. Determination of the rules for the identified pairs at k=1. We retrieve the rules a(t+1) = a(t) and d(t+1) = not c(t) from the respective rule tables.

3. Identification of perfect input-output state pairs of connectivity k=2. If not all rules can be retrieved at k=1, we consider k=2 by comparing the output state vectors of the remaining genes with all possible pairs of input state vectors. The calculation gives M(b(t+1), [c(t), d(t)]) = H(b(t+1)), i.e. the pair c(t), d(t) determines b(t+1). Likewise, M(c(t+1), [a(t), b(t)]) = H(c(t+1)), i.e. the pair a(t), b(t) determines c(t+1).

4. Determination of the rules for the identified pairs at k=2. We retrieve the rules b(t+1) = (not c(t)) and d(t), and likewise c(t+1) = a(t) and b(t).

5. Identification of perfect input-output state pairs of connectivity k=p.

6. Determination of the rules for the identified pairs at k=p. Stop if all genes have been assigned a rule; otherwise increment p and go to 5.
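The steps above can be sketched in a few lines for the four-gene example. This is a minimal illustration under stated assumptions, not the original REVEAL implementation: a group of inputs is accepted for a gene when the mutual information between the input group and the output equals the output entropy, compared with a small numerical tolerance `tol`; the names `reveal`, `step`, `max_k` and `tol` are hypothetical. All 16 state transitions are used as data.

```python
from collections import Counter
from itertools import combinations, product
from math import log2

NODES = ("a", "b", "c", "d")


def step(s):
    """Example network from the lecture, used only to generate transitions."""
    return {"a": s["a"], "b": int((not s["c"]) and s["d"]),
            "c": int(s["a"] and s["b"]), "d": int(not s["c"])}


def entropy(symbols):
    """Shannon entropy in bits, with probabilities estimated as frequencies."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())


def reveal(nodes, inputs, outputs, max_k=3, tol=1e-9):
    """Assign each gene the first input group with M(out, inputs) = H(out)."""
    rules = {}
    for k in range(1, max_k + 1):                  # steps 1, 3, 5: raise k
        for node in nodes:
            if node in rules:
                continue
            out_col = [o[node] for o in outputs]
            h_out = entropy(out_col)
            for group in combinations(nodes, k):
                in_col = [tuple(i[v] for v in group) for i in inputs]
                m = entropy(in_col) + h_out - entropy(list(zip(in_col, out_col)))
                if abs(m - h_out) < tol:           # perfect match
                    # steps 2, 4, 6: read the rule table off the data
                    rules[node] = (group, dict(zip(in_col, out_col)))
                    break
        if len(rules) == len(nodes):
            break
    return rules


inputs = [dict(zip(NODES, bits)) for bits in product((0, 1), repeat=4)]
outputs = [step(s) for s in inputs]
for node, (group, table) in reveal(NODES, inputs, outputs).items():
    print(node, "<-", group, table)
```

On this data the sketch assigns a and d at k=1 and b and c at k=2, matching the rules retrieved in steps 2 and 4. A gene whose output never changes has zero entropy and would match any input group, so such genes would need separate handling.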


REVEAL Algorithm: Other example
