Diversity and Design in Cellular Networks Prediction, Control and Design of and with Biology Adam Arkin, University of California, Berkeley

Diversity and Design in Cellular Networks

Prediction, Control and Design of and with Biology

Adam Arkin, University of California, Berkeley

http://genomics.lbl.gov

"Nothing in biology makes sense except in the light of evolution."

Theodosius Dobzhansky, The American Biology Teacher, March 1973

Bacillus Yeast Volvox An egg Humpty Dumpty

A scientist

The Advent of Molecular Biology

Genome Macromolecules Metabolites

Biochemistry

Through RNA

Feedback & Feedforward

Images from Reichardt or D. Kaiser

Myxococcus xanthus

• Even cells as “simple” as bacteria are highly social, differentiating, sensing/actuation systems

Immune cells

• They perform amazing engineering feats under the control of complex cellular networks

Onsum, Arkin, UCB Mione, Redd, UCL

1/50 of the known neutrophil chemotaxis network

Fc- receptor c5a- receptor

Calcium control PIP3 control

Systems and Synthetic Biology

• Systems biology seeks to uncover the design and control principles of cellular systems through
– Biophysical characterization of macromolecules and other cellular structures
– Comparative genomic analysis
– Functional genomic and high-throughput phenotyping of cellular systems
– Mathematical modeling of regulatory networks and interacting cell populations

• Synthetic biology seeks to develop new designs in the biological substrate for biotechnological, medical, and material science.
– Founded on the understanding garnered from systems biology
– New modalities for genetic engineering and directed evolution
– Scaling towards programmable biomaterials

Systems biology is necessary

• Because of the highly interconnected nature of cellular networks

• Because it is the best way to understand what is controllable and what is not in pathway dynamics

• Because it discovers what designs evolution has arrived at to solve cellular engineering problems that we emulate in our own designs.

A broader overview
• Evolutionary Game Theory
• Ecological Modeling
• Population Biology
• Epidemiology
• Neuroscience
• Organ Physiology
• Immune Networks
• Cellular Networks

• Problems:
– Static and Dynamic Representations
– Physical Picture for Representation (e.g. deterministic vs. stochastic)
– Mathematical Description of Physics (e.g. Langevin vs. Master Equation)
– Levels of abstraction: formal and ad hoc
– Measurement: high-throughput/broad-brush/imprecise vs. low-throughput/targeted/precise

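[Figure: collision volume swept out by sphere 1 relative to sphere 2, with relative velocity v12, length v12 dt, and collision radius r12 = r1 + r2]

$\delta V_{col} = \pi r_{12}^2 v_{12}\,\delta t$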

Consider a collision between two hard spheres:

In a small time interval, dt, sphere 1 will sweep out a small volume relative to sphere 2.

If the center of sphere 2 lies within this volume at time t, then in the small time interval the spheres will collide.

The probability that a given sphere of type 2 is in that volume is simply Vcol/V (where V is the containing volume):

$\delta V_{col}/V = V^{-1} \pi r_{12}^2 v_{12}\,\delta t$

All that remains is to average this quantity over the velocity distributions of the spheres.

Chemical Kinetics: The short course I.

Given that, at time t, there are X1 type-1 spheres and X2 type-2 spheres, the probability that a 1-2 collision will occur in V in the next time interval is:

$X_1 X_2 V^{-1} \pi r_{12}^2 \langle v_{12} \rangle\,dt$

Now if each collision has a probability of causing a reaction then in analogy to the last equation, all we can say is

X1 X2 c1 dt = average probability that an R1 reaction will occur somewhere in V within the interval dt.

Chemical Kinetics: The short course II.

If we wish to map trajectories of chemical concentration, we want to know the probability that there will be

$X = \{X_1, X_2, X_3, \ldots, X_n\}$

molecules of each species in the chemical mechanism at time t in V. We call that probability:

$P(X, t)$

This function gives complete knowledge of the stochastic state of the system at time t.

The master equation is simply the time evolution of this probability. To derive it we need $P(X, t+dt)$, which follows directly from our previous work.

It is the sum of two terms:

1. The probability that we were at X at time t and we stayed there.
2. The probability that a reaction of type μ brought us to this state.

Chemical Kinetics: The Master Equation I.

The first term is given by:

$P_{stay} = P(X,t)\left[1 - \sum_{\mu=1}^{M} a_\mu\,dt\right]$

where

$a_\mu\,dt = c_\mu h_\mu\,dt$ = the probability that a reaction of type μ will occur given that the system is in a given state at time t,

and where $h_\mu$ is a combinatorial function of the number of molecules of each chemical species in reaction type μ.

Chemical Kinetics: The Master Equation II

The second term is given by:

$P_{enter} = \sum_{\mu=1}^{M} B_\mu\,dt$

where $B_\mu\,dt$ is the probability that the system is one reaction away from state X at time t and then undergoes a reaction of type μ.

Plugging these terms into the equation for $P(X, t+dt)$ and rearranging we arrive at the master equation:

$\frac{\partial}{\partial t} P(X,t) = \sum_{\mu=1}^{M} \left[ B_\mu - a_\mu P(X,t) \right]$

Chemical Kinetics: The Master Equation III

Deterministic Kinetics I.

Deterministic kinetics may be derived with some assumptions from the master equation. The end result is simply a set of coupled ODEs:

$\frac{d\mathbf{X}}{dt} = \mathbf{S}\,\mathbf{v}$

where $\mathbf{S}$ is the stoichiometric matrix and $\mathbf{v}$ is a vector of rate laws.

Example: Enzyme kinetics

$\frac{\partial}{\partial t} P(X,t) = \sum_{\mu=1}^{M} \left[ B_\mu - a_\mu P(X,t) \right]$

Mathematical Representation

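X + 2 Y → 2 Z
Z + E ⇌ EZ
EZ → E + P   (Enzymatic)

Very simplest “Mass action representation”:

$$\frac{d}{dt}\begin{pmatrix} X \\ Y \\ Z \\ E \\ \mathrm{EZ} \\ P \end{pmatrix} = \underbrace{\begin{pmatrix} -1 & 0 & 0 & 0 \\ -2 & 0 & 0 & 0 \\ 2 & -1 & 1 & 0 \\ 0 & -1 & 1 & 1 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix}}_{\text{Stoichiometric Matrix}} \underbrace{\begin{pmatrix} k_1\,X\,Y^2 \\ k_2\,Z\,E \\ k_{-2}\,\mathrm{EZ} \\ k_{cat}\,\mathrm{EZ} \end{pmatrix}}_{\text{Flux Vector}}$$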
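The mass-action form can be evaluated numerically as a matrix-vector product. The following is a minimal sketch, assuming the reactions X + 2Y → 2Z, Z + E ⇌ EZ, EZ → E + P with rate constants named `k1`, `k2`, `k_minus2`, `kcat`; the function names and the numeric values are illustrative placeholders, not values from the lecture.

```python
import numpy as np

# Rows: X, Y, Z, E, EZ, P.  Columns: R1 (X + 2Y -> 2Z), R2 (Z + E -> EZ),
# R3 (EZ -> Z + E), R4 (EZ -> E + P).
S = np.array([
    [-1,  0,  0,  0],   # X
    [-2,  0,  0,  0],   # Y
    [ 2, -1,  1,  0],   # Z
    [ 0, -1,  1,  1],   # E
    [ 0,  1, -1, -1],   # EZ
    [ 0,  0,  0,  1],   # P
], dtype=float)

def flux(state, k1, k2, k_minus2, kcat):
    """Mass-action rate of each reaction for the given concentrations."""
    X, Y, Z, E, EZ, P = state
    return np.array([k1 * X * Y**2, k2 * Z * E, k_minus2 * EZ, kcat * EZ])

def rhs(state, k1=1.0, k2=1.0, k_minus2=1.0, kcat=1.0):
    """Time derivative of the state: stoichiometric matrix times flux vector."""
    return S @ flux(state, k1, k2, k_minus2, kcat)

# Placeholder initial condition: only X, Y and free enzyme E are present.
state = np.array([1.0, 2.0, 0.0, 1.0, 0.0, 0.0])
print(rhs(state))
```

Because the E and EZ rows of the matrix sum to zero, total enzyme is conserved for any choice of rate constants, which is a quick sanity check on the matrix.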

Mathematical Representation

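X + 2 Y → 2 Z
Z + E ⇌ EZ
EZ → E + P

$$\frac{d}{dt}\begin{pmatrix} X \\ Y \\ Z \\ P \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ -2 & 0 \\ 2 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} k_1\,X\,Y^2 \\ \dfrac{V_{max}\,Z}{K_m + Z} \end{pmatrix}$$

Often times…the enzyme isn’t represented…

X + 2 Y → 2 Z
Z → P   (catalyzed by E)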

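$$\frac{d\mathbf{X}}{dt} = \frac{d}{dt}\begin{pmatrix} [E] \\ [S] \\ [ES] \\ [P] \end{pmatrix} = \begin{pmatrix} -k_1[E][S] + k_2[ES] + k_3[ES] \\ -k_1[E][S] + k_2[ES] \\ k_1[E][S] - k_2[ES] - k_3[ES] \\ k_3[ES] \end{pmatrix}$$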

But often times we make assumptions equivalent to a singular perturbation. E.g. we assume that E, S and ES are in rapid equilibrium:

$$[E]_{tot} = [E] + [ES]$$
$$K_M = \frac{[E][S]}{[ES]}$$
$$[E]_{tot} = \frac{K_M [ES]}{[S]} + [ES]$$
$$[ES] = \frac{[E]_{tot}[S]}{K_M + [S]}$$
$$\frac{dP}{dt} = k_3 [ES] = \frac{k_3 [E]_{tot} [S]}{K_M + [S]} = \frac{V_{max} [S]}{K_M + [S]}$$

Enzyme Kinetics II.

These forms are the common forms used in basic analysis

Stationary State Analysis

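$$\begin{pmatrix} -1 & 0 & 0 & 0 \\ -2 & 0 & 0 & 0 \\ 2 & -1 & 1 & 0 \\ 0 & -1 & 1 & 1 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} k_1\,X\,Y^2 \\ k_2\,Z\,E \\ k_{-2}\,\mathrm{EZ} \\ k_{cat}\,\mathrm{EZ} \end{pmatrix} = 0$$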

Clearly, the steady state fluxes are in the “null space” of the stoichiometric matrix.

But these are only unique if significant constraints are also applied (the system is underdetermined).

Also, highly dependent on “representation”.
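The null-space statement can be checked directly. The sketch below is an illustration under stated assumptions: it uses the 6×4 stoichiometric matrix of the enzyme example (rows X, Y, Z, E, EZ, P; columns R1 to R4) and a hypothetical helper `null_space` built on NumPy's singular value decomposition.

```python
import numpy as np

# Stoichiometric matrix of the enzyme example (rows X, Y, Z, E, EZ, P).
S = np.array([
    [-1,  0,  0,  0],
    [-2,  0,  0,  0],
    [ 2, -1,  1,  0],
    [ 0, -1,  1,  1],
    [ 0,  1, -1, -1],
    [ 0,  0,  0,  1],
], dtype=float)

def null_space(A, tol=1e-10):
    """Orthonormal basis (as columns) of the null space of A, via the SVD."""
    _, sing, vt = np.linalg.svd(A)
    rank = int(np.sum(sing > tol))
    return vt[rank:].T

N = null_space(S)
print(N.shape)   # one column per independent steady-state flux mode
print(N)
```

For this closed system the null space is one-dimensional: the only steady-state flux pattern has equal forward and reverse binding fluxes (R2 = R3) and zero flux through R1 and R4. Any further narrowing of the solution needs additional constraints, as the slide notes.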

The Stoichiometric Matrix

• This matrix is a description of the “topology” of the network.

• It is tricky to abstract into a simple incidence matrix, for example.

• Most experimental measurements can only capture a small fraction of the interactions that make up a network.

• However, it does put some limits on behavior…

        R1   R2   R3   R4
X       -1    0    0    0
Y       -2    0    0    0
Z        2   -1    1    0
E        0   -1    1    1
EZ       0    1   -1   -1
P        0    0    0    1

Graph Theory: “Scale-Free” networks?

• Nodes are protein domains
• Edges are “interactions”
• Statements are made about
– Robustness
– Signal Propagation (small world properties)
– Evolution

Stability Analysis for Deterministic Systems

• → a            v = m
• a → b          v = k*a
• a + 2 b → 3 b  v = a*b^2
• b → c          v = b

da/dt = m − k*a − a*b^2

db/dt = k*a + a*b^2 − b

Stationary State

da/dt = m − k*a − a*b^2 = 0
db/dt = k*a + a*b^2 − b = 0

a_ss = m/(m^2 + k), b_ss = m

So for any given value of m or k we can calculate the steady state. These are “parameters”.

Stability

• We calculate stability by figuring out if small perturbations around a stationary state grow away from the state or fall back towards the state.

• So we expand our differential equations around a steady state and ask how small perturbations in a and b grow.

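Stability

Here $k_1$ denotes the rate constant written k in the rate laws above.

$$\frac{da}{dt} = m - k_1 a - a b^2 = f(a, b \mid m, k_1)$$
$$\frac{db}{dt} = k_1 a + a b^2 - b = g(a, b \mid m, k_1)$$
$$\frac{d\,\delta a}{dt} = f_{ss} + \left(\frac{\partial f}{\partial a}\right)_{ss}\delta a + \left(\frac{\partial f}{\partial b}\right)_{ss}\delta b$$
$$\frac{d\,\delta b}{dt} = g_{ss} + \left(\frac{\partial g}{\partial a}\right)_{ss}\delta a + \left(\frac{\partial g}{\partial b}\right)_{ss}\delta b$$
$$f_{ss} = g_{ss} = 0$$

e.g., $\left(\frac{\partial f}{\partial a}\right)_{ss} = -(k_1 + b_{ss}^2) = -(k_1 + m^2)$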

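Stability

$$\frac{da}{dt} = m - k_1 a - a b^2 = f(a, b \mid m, k_1)$$
$$\frac{db}{dt} = k_1 a + a b^2 - b = g(a, b \mid m, k_1)$$

e.g., $\left(\frac{\partial f}{\partial a}\right)_{ss} = -(k_1 + b_{ss}^2) = -(k_1 + m^2)$

$$J = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial f_n}{\partial x_1} & \cdots & \dfrac{\partial f_n}{\partial x_n} \end{pmatrix}_{ss} = \begin{pmatrix} -(k_1 + m^2) & -\dfrac{2m^2}{k_1 + m^2} \\ k_1 + m^2 & \dfrac{2m^2}{k_1 + m^2} - 1 \end{pmatrix}$$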

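Stability

$$\frac{d}{dt}\begin{pmatrix} \delta a \\ \delta b \end{pmatrix} = J \begin{pmatrix} \delta a \\ \delta b \end{pmatrix}$$
$$\delta a = c_1 \exp(\lambda_1 t) + c_2 \exp(\lambda_2 t)$$
$$\delta b = c_3 \exp(\lambda_1 t) + c_4 \exp(\lambda_2 t)$$

Thus the $\lambda_i$ are the eigenvalues of the perturbation matrix and will determine if the perturbations grow or diminish.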
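The eigenvalue test can be carried out numerically for the two-variable example (da/dt = m − k a − a b², db/dt = k a + a b² − b). This is a minimal sketch, not the lecture's own code; the function names and the parameter values tried at the end are assumptions chosen for illustration.

```python
import numpy as np

def steady_state(m, k):
    """Stationary state of the example: a_ss = m / (m^2 + k), b_ss = m."""
    return m / (m**2 + k), m

def jacobian(m, k):
    """Jacobian of (f, g) with respect to (a, b), evaluated at the steady state."""
    a, b = steady_state(m, k)
    return np.array([
        [-(k + b**2), -2.0 * a * b],        # df/da, df/db
        [k + b**2,     2.0 * a * b - 1.0],  # dg/da, dg/db
    ])

def is_stable(m, k):
    """True when every eigenvalue has a negative real part (perturbations decay)."""
    eigenvalues = np.linalg.eigvals(jacobian(m, k))
    return bool(np.all(eigenvalues.real < 0.0))

# Illustrative parameter choices (placeholders).
for m, k in [(1.0, 1.0), (0.5, 0.01)]:
    print(m, k, np.linalg.eigvals(jacobian(m, k)), is_stable(m, k))
```

With m = 1, k = 1 the eigenvalues are a complex pair with negative real part, so perturbations spiral back to the steady state. With m = 0.5, k = 0.01 the trace of the Jacobian is positive while its determinant (k + m²) stays positive, so the steady state is unstable.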

Why is quantitative analysis important?

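[Figure: reaction scheme in which B-p drives the conversion of A to A-p; plot of [A]ss versus [B-p], both axes from 5 to 20]

E.g. Focal Adhesion Kinase Alternative Splice

$$\text{Phosphorylation}_B = \frac{k_{cat}^{f}\,[B\text{-}p]\,[A]}{K_A^{f} + [A]}$$
$$\text{Dephosphorylation} = \frac{V_{max}^{r}\,[A\text{-}p]}{K_A^{r} + [A\text{-}p]}$$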

Quantitative AnalysisB-p

A A-p

Phosphorylation k A pA

K AA p cat fAAf

*[ ]*[ ]

[ ]2

$$\frac{d[A]}{dt} = \text{Dephosphorylation} - \left(\text{Phosphorylation}_B + \text{Phosphorylation}_{A\text{-}p}\right)$$

Bistability

A simple model of the positive feedback

Monostable

Weakly bistable

Irreversibly Bistable

kC=1.6

kc – catalytic constant for the trans-autophosphorylation.

[Figure: stationary state [FAK-I] versus kc]

Signal Filtering

Brief Digression: Chemical Impedance

I → A

$$\frac{d[A]}{dt} = k_1[I] - k_2[A] \;\Rightarrow\; A(I)_{t\to\infty} = \frac{k_1}{k_2}[I]$$

So A is the signal inside the cell and I is the signal outside the cell. What if A signals to downstream targets by reacting with them?

A + B → C

$$\frac{d[A]}{dt} = k_1[I] - k_2[A] - k_3[A][B] \;\Rightarrow\; A(I)_{t\to\infty} = \frac{k_1[I]}{k_2 + k_3[B]}$$

The rates and concentrations of downstream processes degrade the signal from A.

Brief Digression: Chemical Impedance

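I → A

$$\frac{d[A]}{dt} = k_1[I] - k_2[A] \;\Rightarrow\; A(I)_{t\to\infty} = \frac{k_1}{k_2}[I]$$

But what if reaction is by reversible binding?

A + B ⇌ C

$$\frac{d[A]}{dt} = k_1[I] - k_2[A] - k_3[A][B] + k_4[C] \;\Rightarrow\; A(I)_{t\to\infty} = \frac{k_1}{k_2}[I]$$

The rates and concentrations of downstream processes don’t affect the signal.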


But….what about the ME

$\frac{\partial}{\partial t} P(X,t) = \sum_{\mu=1}^{M} \left[ B_\mu - a_\mu P(X,t) \right]$

Error and ORDINARY DIFFERENTIAL EQUATIONS

Ordinary Differential Equations

• A differential equation defines a relationship between an unknown function and one or more of its derivatives

• Physical problems using differential equations
– electrical circuits
– heat transfer
– motion


• The derivatives are of the dependent variable with respect to the independent variable

• First order differential equation with y as the dependent variable and x as the independent variable would be:

$\frac{dy}{dx} = f(x, y)$

A second order differential equation would have the form:

$\frac{d^2 y}{dx^2} = f\left(x, y, \frac{dy}{dx}\right)$


• An ordinary differential equation is one with a single independent variable.

• Thus, the previous two equations are ordinary differential equations

• The following is not:

$\frac{dy}{dx_1} = f(x_1, x_2, y)$

Partial Differential Equations

Written with ordinary derivatives:

$\frac{dy}{dx_1} = f(x_1, x_2, y)$

Correct notation:

$\frac{\partial y}{\partial x_1} = f(x_1, x_2, y)$


• The analytical solution of ordinary differential equation as well as partial differential equations is called the “closed form solution”

• This solution requires that the constants of integration be evaluated using prescribed values of the independent variable(s).


• At best, only a few differential equations can be solved analytically in a closed form.

• Solutions of most practical engineering problems involving differential equations require the use of numerical methods.

One Step Methods

• Focus is on solving ODE in the form

$\frac{dy}{dx} = f(x, y)$

$y_{i+1} = y_i + \phi h$

[Figure: y versus x, showing the point y_i, the step h, and slope = φ]

This is the same as saying: new value = old value + (slope) x (step size)

Euler’s Method

• The first derivative provides a direct estimate of the slope at xi

• The equation is applied iteratively, or one step at a time, over a small distance in order to reduce the error

• Hence this is often referred to as Euler’s One-Step Method

Taylor Series

$y(x_i + h) = y(x_i) + h\,y'(x_i) + \frac{h^2}{2} y''(x_i) + \cdots$

$y(x_i + h) \approx y(x_i) + h\,y'(x_i)$

EXAMPLE

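For the initial condition y(1) = 1, determine y(1.1) (step size h = 0.1) analytically and using Euler’s method, given:

$\frac{dy}{dx} = 4x^2$

Analytical solution:

$\frac{dy}{dx} = 4x^2$, I.C. y = 1 at x = 1

$y = \frac{4}{3}x^3 + C$

$C = -\frac{1}{3}$

$y = \frac{4}{3}x^3 - \frac{1}{3}$

$y(1.1) = 1.44133$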

Euler’s method:

$\frac{dy}{dx} = 4x^2$

$y_{i+1} = y_i + \phi h$

$y(1.1) = y(1) + 4(1)^2(0.1) = 1.4$

Note: in this expression y(1) is the I.C., 4(1)^2 is dy/dx, and 0.1 is the step size.

Recall the analytical solution was 1.4413.

If we instead reduce the step size to 0.05 and apply Euler’s method twice:

$y(1.05) = y(1) + 4(1)^2(1.05 - 1.00) = 1 + 0.2 = 1.2$

$y(1.1) = y(1.05) + 4(1.05)^2(1.1 - 1.05) = 1.4205$
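The worked example can be reproduced in a few lines. This is a minimal sketch of Euler's method for dy/dx = 4x² with y(1) = 1; the function name `euler` and its argument order are illustrative choices, not part of the lecture.

```python
def euler(f, x0, y0, h, n_steps):
    """Advance y' = f(x, y) from (x0, y0) by n_steps Euler steps of size h."""
    x, y = x0, y0
    for _ in range(n_steps):
        y += h * f(x, y)   # new value = old value + slope * step size
        x += h
    return y

def f(x, y):
    """Right-hand side of the example ODE, dy/dx = 4 x^2."""
    return 4.0 * x**2

def exact(x):
    """Analytical solution y = (4/3) x^3 - 1/3 for the I.C. y(1) = 1."""
    return (4.0 * x**3 - 1.0) / 3.0

one_step = euler(f, 1.0, 1.0, 0.1, 1)     # single step of h = 0.1
two_steps = euler(f, 1.0, 1.0, 0.05, 2)   # two steps of h = 0.05
print(one_step, two_steps, exact(1.1))
```

Halving the step size moves the estimate from 1.4 to 1.4205, closer to the analytical value 1.4413, at the cost of one extra function evaluation.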

Error Analysis of Euler’s Method

• Truncation error - caused by the nature of the techniques employed to approximate values of y
– local truncation error (from Taylor Series)
– propagated truncation error
– sum of the two = global truncation error

• Round off error - caused by the limited number of significant digits that can be retained by a computer or calculator

Taylor Series

$y(x_i + h) = y(x_i) + h\,y'(x_i) + \frac{h^2}{2} y''(x_i) + \frac{h^3}{6} y'''(x_i) + \cdots$

$y(x_i + h) \approx y(x_i) + h\,y'(x_i) + \frac{h^2}{2} y''(x_i)$

Higher Order Taylor Series Methods

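$y(x_i + h) \approx y(x_i) + h\,y'(x_i) + \frac{h^2}{2} y''(x_i)$

$y'(x) = f(x, y)$

$y''(x) = \frac{d f(x,y)}{dx} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial x} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} f$

$y_{i+1} = y_i + f(x_i, y_i)\,h + \frac{h^2}{2}\left[ f_x(x_i, y_i) + f_y(x_i, y_i)\, f(x_i, y_i) \right]$

Derivatives

$y' = f(x, y)$

$y'' = f_x + f_y f$

$y''' = f_{xx} + 2 f f_{xy} + f^2 f_{yy} + f_x f_y + f f_y^2$

$\vdots$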

Modification of Euler’s Methods

• A fundamental error in Euler’s method is that the derivative at the beginning of the interval is assumed to apply across the entire interval

• Two simple modifications will be demonstrated

• These modifications actually belong to a larger class of solution techniques called Runge-Kutta, which we will explore later.

Heun’s Method

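Consider our Taylor expansion:

$y_{i+1} = y_i + f_i h + \frac{h^2}{2} f'_i$

Approximate f’ as a simple forward difference:

$f'(x_i, y_i) \approx \frac{f(x_{i+1}, y_{i+1}) - f(x_i, y_i)}{h}$

Substituting into the expansion:

$y_{i+1} = y_i + f_i h + \frac{f_{i+1} - f_i}{h}\,\frac{h^2}{2} = y_i + \frac{f_{i+1} + f_i}{2}\,h$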

Heun’s Method

• Determine the derivatives for the interval at
– the initial point
– the end point (based on an Euler step from the initial point)

• Use the average to obtain an improved estimate of the slope for the entire interval

• We can think of the Euler step as a “test” step

Heun’s Method

[Figure sequence: y versus x between x_i and x_{i+1}, with step size h]

1. Take the slope at x_i.
2. Project to get f(x_{i+1}) based on the step size h.
3. Now determine the slope at x_{i+1}.
4. Take the average of these two slopes.
5. Use this “average” slope to predict y_{i+1}:

$y_{i+1} = y_i + \frac{f(x_i, y_i) + f(x_{i+1}, y_{i+1})}{2}\,h$

This has the same one-step form $y_{i+1} = y_i + \phi h$, with φ the average slope.

Improved Polygon Method

• Another modification of Euler’s Method
• Uses Euler’s method to predict a value of y at the midpoint of the interval

$y_{i+1/2} = y_i + f(x_i, y_i)\,\frac{h}{2}$

• This predicted value is used to estimate the slope at the midpoint

$y'_{i+1/2} = f(x_{i+1/2}, y_{i+1/2})$

• We then assume that this slope represents a valid approximation of the average slope for the entire interval

• Use this slope to extrapolate linearly from x_i to x_{i+1} using Euler’s algorithm

$y_{i+1} = y_i + f(x_{i+1/2}, y_{i+1/2})\,h$


We could also get this algorithm by substituting a forward difference in f to i+1/2 into the Taylor expansion for f’, i.e.

$y_{i+1} = y_i + f_i h + \frac{f_{i+1/2} - f_i}{h/2}\,\frac{h^2}{2} = y_i + f_{i+1/2}\,h$
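Both modifications of Euler's method fit in a few lines each. The sketch below applies one step of Heun's method and one step of the improved polygon (midpoint) method to the earlier example dy/dx = 4x², y(1) = 1, h = 0.1; the function names are illustrative assumptions.

```python
def heun_step(f, x, y, h):
    """One Heun step: average the slope at the start and at the Euler-predicted end."""
    slope_start = f(x, y)
    y_predicted = y + h * slope_start        # Euler "test" step
    slope_end = f(x + h, y_predicted)
    return y + 0.5 * h * (slope_start + slope_end)

def midpoint_step(f, x, y, h):
    """One improved-polygon step: use the slope at the Euler-predicted midpoint."""
    y_half = y + 0.5 * h * f(x, y)           # Euler half step to x + h/2
    return y + h * f(x + 0.5 * h, y_half)

def f(x, y):
    """Right-hand side of the example ODE, dy/dx = 4 x^2."""
    return 4.0 * x**2

y_heun = heun_step(f, 1.0, 1.0, 0.1)
y_mid = midpoint_step(f, 1.0, 1.0, 0.1)
print(y_heun, y_mid)
```

For this example Heun's method gives 1.442 and the midpoint method gives 1.441, both much closer to the analytical value 1.4413 than the single Euler step (1.4), at the cost of one extra slope evaluation per step.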


[Figure sequence: y versus x. Start from f(x_i), take a half step h/2 to x_{i+1/2}, evaluate f(x_{i+1/2}) and the slope f’(x_{i+1/2}) there, then extend that slope over the full step h to get f(x_{i+1})]

Conclusions

• Algorithms can be more or less stable to truncation or round off error.

• Algorithms can be better or worse approximations to the math you want to do.

• Algorithms can be more or less complex

(Based on Gillespie, D.T. (1977) JPC, 81(25): 2340)

We are given a system in the state (X1,...,XN) at time t.

To move the system forward in time we must ask two questions:

• When will the next reaction occur?
• What kind of reaction will it be?

In order to answer these questions we introduce

P(τ,μ)dτ = probability that, given the state (X1,...,XN) at time t, the next reaction in V will occur in the infinitesimal time interval (t+τ, t+τ+dτ) and will be a reaction of type Rμ.

Master Equation Simulation I

Now we can define P(τ,μ)dτ to be the probability that no reaction occurs in the interval (t, t+τ), written P0(τ), times the probability that reaction Rμ will occur in the infinitesimal time dτ following this interval (aµdτ):

P(τ,μ)dτ = P0(τ) aµ dτ

Now aµ is simply a term related to the rate equation for a given reaction. In fact it is a transition probability, cµ, times a combinatorial term, hµ, which enumerates the number of ways n species can react in volume V given the configuration (X1,...,XN).

Therefore

[1 − Σν aν dτ'] = probability that no reaction will occur in time dτ' from the state (X1,...,XN),

and

P0(τ' + dτ') = P0(τ') [1 − Σν aν dτ']

the solution of which is

P0(τ) = exp[−Σν aν τ]

Master Equation Simulation II

• The Algorithm

Step 0: Choose Initial Conditions and Rates

Step 1: Calculate aµ for each reaction as well as the sum of all of them.

Step 2: Generate a random waiting time, τ, based on P0(τ), and roulette-wheel select a reaction µ based on aµ.

Step 3: Increment time by τ and execute reaction µ. Go to Step 1.
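The four steps can be sketched as a short simulation loop. This is a minimal illustration of the direct method under stated assumptions, not the lecture's code: the function `gillespie`, its arguments, and the example decay reaction A → ∅ with rate constant 1 are hypothetical choices made here.

```python
import math
import random

def gillespie(x0, stoich, propensities, t_end, rng=None):
    """Simulate one trajectory; returns the event times and the states visited.

    x0           -- initial molecule counts (Step 0)
    stoich       -- stoich[mu] is the change in counts caused by reaction mu
    propensities -- function mapping counts to the list of a_mu values
    """
    rng = rng or random.Random()
    t, x = 0.0, list(x0)
    times, states = [t], [tuple(x)]
    while t < t_end:
        a = propensities(x)                      # Step 1: a_mu and their sum
        a_total = sum(a)
        if a_total <= 0.0:
            break                                # no reaction can fire
        # Step 2a: waiting time drawn from P0(tau) = exp(-a_total * tau)
        tau = -math.log(1.0 - rng.random()) / a_total
        if t + tau > t_end:
            break
        # Step 2b: roulette-wheel selection of reaction mu, weighted by a_mu
        r = rng.random() * a_total
        mu, cumulative = 0, a[0]
        while cumulative <= r and mu < len(a) - 1:
            mu += 1
            cumulative += a[mu]
        # Step 3: advance time and execute the chosen reaction
        t += tau
        x = [xi + change for xi, change in zip(x, stoich[mu])]
        times.append(t)
        states.append(tuple(x))
    return times, states

# Illustrative example: first-order decay A -> (nothing), c = 1, ten molecules.
times, states = gillespie([10], [[-1]], lambda x: [1.0 * x[0]],
                          t_end=1.0e6, rng=random.Random(0))
print(len(times) - 1, "reactions; final state", states[-1])
```

Each run gives one stochastic trajectory; estimating P(X, t) requires repeating the simulation many times and histogramming the states.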

Master Equation Simulation III

Endogenous Noise

• One gene
• Growing cell, 45 minutes division time
• Average ~60 seconds between transcripts
• Average 10 proteins/transcript

[Figure: gene a with promoter PA producing signal protein A; Monte Carlo simulation data of protein number (0 to 70) versus time (0 to 45 minutes), annotated “about 50 molecules” and “25 molecules”]

[Figure: reaction scheme with B-p, A and A-p]

What happens when you have bistability and noise?

Langevin equation

• But what if there is external noise on E?

• Let’s start with…

E+

A A-p


,)()(

,)()()()()(

ConstEtEtE

WEfEtNoiseEtEtEtEboundfree

tboundfree

The compact Langevin

• Plug the conservation conditions into the equations for A-p (A*)

**

*( ) t

k E A k E A k AdA dA dt f E dB

K A K A K A

Drift Diffusion

Note that another term in 1/(K+A) has been introduced. There is now the possibility of a cubic nullcline.

The Fokker-Planck equivalent.

• Compared to the deterministic nullcline…

2*

*

( ; ) 1( ; ) ( ) ( ; )

2

k E A k E A k Ap A tp A t f E p A t

t A K A K A A K A

Which yields the stationary nullcline

220

20

( )( )( ) 0

( ) ( )ss ss

ss ss ss

k E A A K A k KE f E

k A K A A K A

0

0

( )( )0

( )ss ss

ss ss

K A X AkE E

k K X A A

Depending on the noise type

[Figure: stationary state Xss (log scale, 0.005 to 1) versus E (0.3 to 2), comparing the deterministic curve (“det”) with curves for p = 0, p = ½ and p = 1]

$f(E) = E^p$

p = 0: Normal noise
p = 1/2: Chi-square noise
p = 1: Log-normal noise

Validation by ME simulation

EXC

EXC

CEX

k

k

k

*3

2

1

EXC

EXC

CEX

k

k

k

3

2

1

*

**

**

EN

NENN

k

k

k

k

22

22

21

21

It turns out this generates log-normal noise on E+

ME Simulation

With noise on E Without noise

Stationary Distribution with Noise

N

E

*X

Stationary Distribution w/o Noise

*X E

Summary

• Adding noise to a system (in this case external noise) can qualitatively change its dynamics.

• Interestingly we can predict the effect with a compact Langevin approach AND a MM approximation pretty well compared to what’s observed in a full ME simulation.

• The implications for noise-induced bistability and switching haven’t been fully worked out.

B-p

A A-p

But an Ugly specter is raised….

Is this really a valid picture? Adding noise changes the nullcline!

Nonetheless: Static noise can make things look bistable

[Figure: input distribution p(E); response X versus E for a linear response and for a switch response; resulting output distributions p(x) for the Linear and Switch cases]

There is a relationship between the variance on E and the slope of the response that determines whether the stationary distribution will be bimodal.

Niches are Dynamic

abiotic reservoir

• Characteristic times may be spent in each environment.

• Environments themselves are variable.

Life Cycle

• Adaptability: Adjustment on the time scale of the life cycle of the organism

• Evolvability: Capacity for genetic changes to invade new life cycles

New niches with new lifecycles

Adaptability vs. Evolvability

Chris Voigt

• In a dynamic environment, the lineage that adapts first, wins

• Fewer mutations means faster evolution

• Are some biosystems constructed to minimize the mutations required to find improvements?

[Figure: four panels plotting Pattern against { Parameter Space }, bracketed by “Environment” labels]

• Modularity

• Robustness / Neutral drift improves functional sampling

• Shape of functionality in parameter space

• Minimize null regions in parameter space (entropy of multiple mutations)

Evolvability

Chris Voigt

Logic of B.subtilis stress response

• Network organization has a functional logic.
• There are different levels of abstraction to be found.

ComA~P

AbrB; SinR

DegU~P PhoP~PResD~P

Spo0A~P

AbrB

DegU~P ComK

AbrB; SinR;SigH

AbrBSinR

ComA~P

Sporulation

Clustered Phylogenetic Profiles

• Clustered phylogenetic profile shows blocks of conserved genes

1. methyl-processing receptors and chemotaxis genes in motile bacteria
2. methyl-processing receptors and chemotaxis genes in motile Archaea
3. flagellar genes in motile bacteria
4. type III secretion system (virulence) in non-motile pathogenic bacteria
5. motility genes in spore-forming bacteria
6. late-stage sporulation genes in spore-forming bacteria
7. spore coat and germination response genes in spore-forming bacteria that are not competent
8. late-stage sporulation genes in spore-forming bacteria that are also competent
9. DNA uptake genes in Gram positive bacteria
10. DNA uptake genes in Gram negative bacteria

[Figure: clustered phylogenetic profile across species, with blocks numbered 1 to 10 and gene groups labeled Chemotaxis genes, Sporulation, Competence]

Consider Chemotaxis: E. coli

Periplasm

Cytoplasm


[Block diagram: input → Sensor (Input Transducer) → signal proportional to input → error or actuating signal → Controller → Actuator → output; a Sensor (Output Transducer) feeds back a signal proportional to output]

(Adapted from Control Systems Engineering, N.S. Nise 2000)

receptors

CheAWYZ Flagella

cheB/cheR

Integral Feedback Controller

Clusters are functionally coherent

Receptors

Signal Transduction (che)

Hook and Flagellar Body

Flagellar export/Type III secretion

Flagellar length and motor control

Hypothetical receptors

Cross-Regulation with Sporulation/Cell Cycle

Different modules for different lives

[Figure column labels: Archaeal Extremophiles; Sporulators; Endopathogens; Plant pathogens; Animal pathogens; Endopathogens]

What Ontology Recovers Modules?

Color legend:

■ sensor

■ controller

■ actuator

■ cross-talk between networks

■ unknown

Systems Ontology

Comparative analysis is especially important

These are the homologous chemotaxis pathways in E. coli and B. subtilis.

They have the same wild-type behavior but different biochemical mechanisms.

Different robustnesses! Chris Rao/John Kirby

Rao, CV, Kirby, J, Arkin, AP (2004) PLoS Biology, 2(2), 239-252

Two important features

Exact Adaptation

Adaptation Time

Differences in robustness: E. coli vs. B. subtilis

Do these differences lead to differences in actual fitness?

Chris Rao/John Kirby


Sporulation initiation

A Motif

P1 P3sinI sinR

Sporulation genes (stage II)

spoIIG as model

SIN Operon

Spo0A

• Vegetative (healthy) growth: Constitutive SinR expression from P3

Environmental & Cellular Signals

Spo0A~P

• Resource depletion and high cell density leads to the phosphorylation of Spo0A

The SIN Operon: A recurrent motif

Feedback provides filtering

[Figure: time-course panels, shown for two cases; x-axis: time (s), 0 to 10000; y-axes: I (nM), Spo0A~P (nM), I (nM)]

INPUT of Spo0A~P

[Figure: two bar charts over 11 conditions (x-axis 1 to 11, y-axis 0.0 to 1.2); labels: P1, SinI Activity, SinR Activity, P3]

Parameters: k1, GS, GRNAP, GR, AI, I, KI, k3, AR, R, KR

Parameter Space

Bistability

Type 1

Type 2

Oscillations

Hopf points

Functional Regions in Parameter Space

Chris Voigt

Single steady state

Two steady states

Oscillations

• Tuning the expression of SinR (AR) with respect to SinI leads to dynamical plasticity

• Transcription from P3 (k3) strengthens bistability and damps oscillations

[Figure: [SinI] (nM), log scale from 10^-2 to 10^3, versus AR (protein/mRNA-s), 0 to 0.4; region labeled Bistability]

[Figure: k3 (mRNA/s) versus AR (protein/mRNA-s); regions labeled Osc, Pulse, Switch, Graded; shown for 0A = 10,000 nM and 0A = 10 nM]

Full Bifurcation Analyses: Evolvability?

• How can complicated dynamical behavior arise from simple evolutionary events?

• What are the requirements to bias the operon to one function?

• Once established can one function evolve into another?

[Figures: three protein-antagonist operons, each drawn with IN and OUT connections]

• sigX / rsiX: Iron concentration, Iron flux control, Thermotolerance, Bacitracin Resistance, Growth phase. Behavior: Bistable Switch
• phrA / rapA: Competence (ComA~P), Sporulation (Spo0A~P). Behavior: Pulse Generator
• soj / spo0J: Sporulation (Spo0A and stage II promoters), Chromosome organization. Behavior: ? – spatial oscillations

Examples of Protein-Antagonist Operons

Chris Voigt; Lisa Fontaine-Bodin, Keasling Lab

Comparative analysis of SinI/SinR

In anthracis:
Mutations mostly affect KI and k1.

Threshold of the switch is most affected.

Comparison of five strains of Bacillus anthracis

Across ALL sporulators:
Very variable.

region affecting k1, KI

Voigt, CA, Wolf, DM, Arkin, AP (2004) Genetics, in press. PMID: 15466432

Feedback induces stochastic bimodality

[Figure: histograms of count versus I (log10 nM, 0.0 to 2.5) for [spo0A~p] = 1 nM, [spo0A~p] = 4 nM and [spo0A~p] = 100 nM; the plotted quantity is [sinI]]

Though we must be careful since the addition of noise itself changes the qualitative dynamics.

Heterogeneity of Entry to Sporulation

Microscopic analysis of LF25 (amyE::PspoIIE cm). Observation by DIC X60 (A.) and fluorescence (B.) of cells resuspended to induce sporulation and incubated 3 hours at 37°C. Examples of cells not showing fluorescence are circled in figure A.

A. B.

Lisa Fontaine-Bodin, Denise Wolf, Jay Keasling

Summary 1

So this motif:

• Has flexible function based on parameters
– Most parameters tune response
– A couple of parameters qualitatively change the response

• Is an example of a possible Evolvable Motif

• Sometimes exhibits stochastic effects
– Are they adaptive?

Stochastic Effects Are Ubiquitous

Stochastic Gene Expression in HIV-1 Derived Lentiviruses

[Figure: flow cytometry histograms of stable clones, x-axis FL1 LOG: GFP (10^-1 to 10^3); panels: No Positive Feedback; Tat Feedback: Bright Sort; Tat Feedback: Very Bright Sort; accompanied by clone images]

Software

• MatLab

• Mathematica

• Berkeley Madonna

• GEPASI

• TerraNode

• JDesigner

Environment

[Figure: environment states E1 to E4 over time t; Organism 1 and Organism 2, each with sensors S1…SN, output signals, quorum and noise inputs]

Beginning to link Game Theory to Dynamical Cellular Strategies.

The game of life

Formal Model

[Figure: formal model. a) Time-varying environment switching between E1 (gy > gx) and E2 (gx > gy) with transition probabilities p1,2 and p2,1. b) Update from time kt to time (k+1)t of the population vector X(k) = (x1(k), y1(k), x2(k), y2(k)) through the transition matrix Ti,j(k) and rate matrix Ri(k): observers (probability pObs) sense Ei correctly or incorrectly; non-observers (probability 1 − pObs) do not]

Model parameters: Accuracy Si, Observability pObs, Mixing M

[Figure: cell states x and y in environments E1 and E2, switching with probabilities P1 and P2, lasting ~n and ~m generations]

e.g. x=pili; y=no pili E1=in host; E2=out

IF E1: selects for x, against y E2: selects against x, for y

E1 E2 E1 E2

x

y

Example: two environments, two moves, no sensor

Denise Wolf, Vijay Vazirani

With no sensor, the options are…

1. ALL cells in state x
2. ALL cells in state y
3. Statically mixed population (some x, some y)
4. Phase variation of individual cells between x and y

[Figure sequence over the environments E1, E2, E1, …: the first three options end in “Extinction”; phase variation ends in “Proliferation!”]


This is a Devil’s compromise: phase-variation behavior is not optimal in any one environment but is necessary for survival with noisy sensors in a fluctuating environment.

[Figure axes: Rate of X→Y Switching; Rate of Y→X Switching]

Phase variation for survival


Learning Environment from Cell State

Columns: Strategy; Sensor profile; Environmental profile

Random Phase Variation (RPV)
– Sensor profile: No sensors; or O = Low prob. observable transitions over DC or extinction set; or D = Long delays relative to env. transition times.
– Environmental profile:
  • Devil’s Compromise (DC) lifecycle: time varying environment with different environmental states selecting for different cell states.
  • Optimal switching rates a function of lifecycle asymmetries and environmental autocorrelation.
  • Time variation required (spatial variation insufficient).
– Sensor profile: Perfect sensors. Environmental profile: Frequency dependent growth curves with mixed ESS.

Sensor Based Mixed
– Sensor profile: O = High prob. observable transitions; A = Poor accuracy.
– Environmental profile:
  • Devil’s Compromise lifecycle.
  • Asymmetric lifecycle required.
  • Optimal mixing probabilities biased toward selected cell-states in dominant environmental states.

Sensor Based Mixed; LPF
– Sensor profile: O = High prob. observable transitions; A = Poor accuracy; N = High additive noise.

Sensor Based Pure
– Sensor profile: O = High prob. observable transitions; A = High accuracy, or moderate accuracy and low noise N.
– Environmental profile: Temporally or spatially varying environment with each environmental state selecting for a single cell state.

Sensor Based Pure; LPF
– Sensor profile: O = High prob. observable transitions; A = Moderate accuracy; N = High additive noise.


Robustness and Fragility

• The stratagems of a cell evolve in a given environment for robust survival.

• Evolution writes an internal model of the environment into the genome.

• But the system is fragile both
– to certain changes in the environment (though there are evolvable designs)
– and to certain random changes in its process structure.

• One of the central questions has to be: Robust on what time scale? Can evolution “design” for the future by learning from the past?

Summary

• The availability of large numbers of bacterial genomes and our ability to measure their expression opens a new field of “Evolutionary Systems Biology” or “Regulatory Phylogenomics”.

• Comparative genomics identifies particularly conserved motifs, parts of which are evolutionarily variable and select for different behaviors of the network.

• By understanding what evolution selects in a network context we better understand what the engineerable aspects of the network are.

Acknowledgements

• Comparative Stress Response: Amoolya Singh, Denise Wolf

• SinIR analysis: Chris Voigt, Denise Wolf

• Chemotaxis: Chris Rao, John Kirby

• HIV: Leor Weinberger, David Schaffer

• Games: Denise Wolf, Vijay V. Vazirani

• Funding:
– NIGMS/NIH
– DOE Office of Science
– DARPA BioCOMP
– HHMI
