A simple model for the evolution of molecular codes driven by the interplay of accuracy, diversity and cost Tsvi Tlusty, Physical Biology Gidi Lasovski


Page 1: Tsvi Tlusty, Physical Biology Gidi Lasovski

A simple model for the evolution of molecular codes driven by the interplay of accuracy, diversity and cost

Tsvi Tlusty, Physical Biology

Gidi Lasovski

Page 2: Tsvi Tlusty, Physical Biology Gidi Lasovski

The main idea

Understanding molecular codes

Their evolution and the forces that affect them

Page 3: Tsvi Tlusty, Physical Biology Gidi Lasovski

What is a molecular code

The genetic code

The fitness of molecular codes

The evolution and emergence of molecular codes

Suggested experimental verification

Page 4: Tsvi Tlusty, Physical Biology Gidi Lasovski

The Central Dogma of Molecular Biology

1. A signaling protein binds to a gene

2. The RNA polymerase generates mRNA from the gene

3. The mRNA exits the nucleus of the cell

4. A Ribosome reads the mRNA and creates a protein, with the help of tRNAs

The tRNAs provide the Ribosome with amino acids, the building blocks of the protein

Page 5: Tsvi Tlusty, Physical Biology Gidi Lasovski

What is a molecular code?

The genetic code is a molecular code:

The symbols are A, U, C and G.

The machine: RNA polymerase, signaling molecules (proteins), mRNA, ribosome.

The output: proteins.

The cost of operating the machine is the ATP and the tRNAs.

The symbols encode amino acids redundantly: 64 options for only 20 amino acids. For robustness reasons?
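The redundancy mentioned on this slide can be counted directly. The following is a minimal sketch, assuming the standard codon length of three symbols and counting only the 20 amino acids named on the slide; the variable names are illustrative.

```python
from itertools import product
import math

SYMBOLS = "AUCG"        # the four RNA symbols from the slide
CODON_LENGTH = 3        # assumption: symbols are read in triplets (codons)
N_AMINO_ACIDS = 20      # the number of meanings quoted on the slide

# Enumerate every possible codon: 4 symbols in 3 positions.
codons = ["".join(triplet) for triplet in product(SYMBOLS, repeat=CODON_LENGTH)]
n_codons = len(codons)

# Information capacity of one codon versus what 20 meanings require.
bits_per_codon = math.log2(n_codons)
bits_needed = math.log2(N_AMINO_ACIDS)

# Average number of codons available per amino acid.
mean_codons_per_amino_acid = n_codons / N_AMINO_ACIDS

print(n_codons, bits_per_codon, round(bits_needed, 2), mean_codons_per_amino_acid)
```

Under these assumptions a codon carries more bits than 20 meanings require, which is the slack the slide attributes to robustness.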

Page 6: Tsvi Tlusty, Physical Biology Gidi Lasovski

The genetic code

Legend: non-polar, polar, basic, acidic

Page 7: Tsvi Tlusty, Physical Biology Gidi Lasovski

The genetic code - similarity

Page 8: Tsvi Tlusty, Physical Biology Gidi Lasovski

The fitness of molecular codes

Three parameters: error load, diversity, cost

We define the fitness of the code as the linear combination of these three conflicting needs

Page 9: Tsvi Tlusty, Physical Biology Gidi Lasovski

Error load

When reading a number, we can misread 3 for 8 (or vice versa) anywhere:

3838383838383838383838

here or here

We want to make sure errors are less likely where they matter more

3838383838383838383838
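To make the point concrete, here is a small sketch that misreads a single digit of the slide's number at each position and records how far the result is from the original. The function name `misread_error` is hypothetical and introduced only for this illustration.

```python
def misread_error(number: str, position: int) -> int:
    """Absolute numerical error caused by swapping 3 <-> 8 at one position."""
    swap = {"3": "8", "8": "3"}
    digits = list(number)
    digits[position] = swap[digits[position]]
    return abs(int("".join(digits)) - int(number))


number = "38" * 11  # the 22-digit number shown on the slide

# Error for a single misreading at each position, from left to right.
errors = [misread_error(number, k) for k in range(len(number))]

print("misreading the first digit:", errors[0])
print("misreading the last digit: ", errors[-1])
```

The same reading mistake costs 5 units in the last position and 5 times a power of ten in every position to its left, so accuracy matters most at the leading digits.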

Page 10: Tsvi Tlusty, Physical Biology Gidi Lasovski

Error load

Similar meaning should go with a similar (close) symbol, so that a small reading error would cause only a small understanding error.

If this -> signifies the deviation in sugar level, which code would you prefer:

A or B?

Page 11: Tsvi Tlusty, Physical Biology Gidi Lasovski

Diversity

Enables efficient and accurate delivery of different messages.

A small lack of sugar - I’m hungry

A medium lack of sugar - I’m starving

A large lack of sugar – Let’s go to San Martin NOW!

Page 12: Tsvi Tlusty, Physical Biology Gidi Lasovski

Diversity

Enables the code to transmit as many different symbols as possible, equivalent to the different symbols of a universal Turing machine (UTM)

Many different symbols mean fewer states of the machine

More symbols also enable faster, more accurate control

Page 13: Tsvi Tlusty, Physical Biology Gidi Lasovski

Cost

Car insurance – the cost of improving the robustness of your driving

Another example is the price of ink and space in my demonstration

Page 14: Tsvi Tlusty, Physical Biology Gidi Lasovski

Cost

Strong binding takes up more energy to create and read

The energy is proportional to the length of the binding site.

The binding probability scales like $e^{-E/T}$, so $E \sim \ln p$ (up to sign and a factor of $T$)

Notice that diversity has its costs as well: more symbols mean longer molecules

Page 15: Tsvi Tlusty, Physical Biology Gidi Lasovski

Summary

The code has to be optimized at an equilibrium of error load, diversity and cost.

Page 16: Tsvi Tlusty, Physical Biology Gidi Lasovski

Quantifying the code

Using Lagrange multipliers:

$H = -\text{Load} + w_D \cdot \text{Diversity} - w_C \cdot \text{Cost}$

$C$ is the reduction of entropy, so $w_C$ is equivalent to the temperature ($w_C C \sim T\,dS$)

Page 17: Tsvi Tlusty, Physical Biology Gidi Lasovski

$w_C$ is equivalent to the temperature

$J/w_C = 1$ is the phase transition:

“liquid” (the non-coding state): $J/w_C < 1$

“solid” (the coding state): $J/w_C > 1$

$\psi$ – the order parameter

$H$ – the fitness

$C$ – the cost

$D$ – the diversity

$L$ – the error load

The result is an Ising-like model

Page 18: Tsvi Tlusty, Physical Biology Gidi Lasovski

Possible experiment

Take a bacterium with the transcription factor i.

Duplicate the gene that codes for i; call the duplicate j.

i and j control the response to A(t).

If A(t) fluctuates strongly, i and j may evolve to two different meanings, giving better control.

If A(t) fluctuates weakly, one of them may be deleted.

Experiment around the critical point.

Page 19: Tsvi Tlusty, Physical Biology Gidi Lasovski

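Using Lagrange multipliers:

$H = -L + w_D \cdot D - w_C \cdot C$

$C$ is the reduction of entropy, so $w_C$ is equivalent to the temperature ($w_C C \sim T\,dS$)

Diversity

$D = \sum_{i,j,\alpha,\beta} (1 - \delta_{ij})\, p_{i\alpha} p_{j\beta} c_{\alpha\beta}$

Error load

$L = \sum_{i,j,\alpha,\beta} r_{ij}\, p_{i\alpha} p_{j\beta} c_{\alpha\beta}$

Cost

$C = \sum_{i,\alpha} p_{i\alpha} \ln(p_{i\alpha}/p_\alpha)$

$E_{i\alpha} \sim \ln p_{i\alpha}$

$p_\alpha = n_s^{-1} \sum_j p_{j\alpha}$

$r_{ij}$ – the probability to read $i$ as $j$

$p_{i\alpha}$ – the probability for $i$ to be mapped to $\alpha$

$c_{\alpha\beta}$ – the cost of misinterpreting $\alpha$ as $\beta$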
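The three sums on this slide translate almost line by line into code. The following is a minimal sketch in plain Python, assuming the code is given as a row-normalised table `p[i][a]` and that `r` and `c` are square tables; the function names and the numerical values in the example are illustrative choices, not taken from the slides.

```python
import math


def error_load(p, r, c):
    """L = sum over i, j, a, b of r[i][j] * p[i][a] * p[j][b] * c[a][b]."""
    ns, nm = len(p), len(p[0])
    return sum(r[i][j] * p[i][a] * p[j][b] * c[a][b]
               for i in range(ns) for j in range(ns)
               for a in range(nm) for b in range(nm))


def diversity(p, c):
    """D = sum over i != j, a, b of p[i][a] * p[j][b] * c[a][b]."""
    ns, nm = len(p), len(p[0])
    return sum(p[i][a] * p[j][b] * c[a][b]
               for i in range(ns) for j in range(ns) if i != j
               for a in range(nm) for b in range(nm))


def cost(p):
    """C = sum over i, a of p[i][a] * ln(p[i][a] / p_a), p_a = mean over symbols."""
    ns, nm = len(p), len(p[0])
    p_mean = [sum(p[j][a] for j in range(ns)) / ns for a in range(nm)]
    # Terms with p[i][a] == 0 contribute nothing (x ln x -> 0).
    return sum(p[i][a] * math.log(p[i][a] / p_mean[a])
               for i in range(ns) for a in range(nm) if p[i][a] > 0)


def fitness(p, r, c, w_d, w_c):
    """H = -L + w_D * D - w_C * C."""
    return -error_load(p, r, c) + w_d * diversity(p, c) - w_c * cost(p)


# Illustrative two-symbol, two-meaning example (values are assumptions).
c = [[0.0, 1.0], [1.0, 0.0]]          # misinterpretation costs
r = [[0.9, 0.1], [0.1, 0.9]]          # misreading probabilities
p_coding = [[0.8, 0.2], [0.2, 0.8]]   # each symbol prefers one meaning
p_uniform = [[0.5, 0.5], [0.5, 0.5]]  # non-coding state

print(fitness(p_coding, r, c, w_d=0.5, w_c=1.0))
print(fitness(p_uniform, r, c, w_d=0.5, w_c=1.0))
```

In the uniform state the cost term is zero, since every symbol maps to every meaning with the same probability as the average.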

Page 20: Tsvi Tlusty, Physical Biology Gidi Lasovski

Additional slides for the mathematical model

Page 21: Tsvi Tlusty, Physical Biology Gidi Lasovski

$J = c\,(1 - 2r + w_D)$

$w_C$ is equivalent to the temperature

$J/w_C = 1$ is the phase transition:

“liquid” (the non-coding state): $J/w_C < 1$

“solid” (the coding state): $J/w_C > 1$

$\psi$ – the order parameter

$H$ – the fitness

$C$ – the cost

$D$ – the diversity

$L$ – the error load

$\psi^* = \tanh(J/w_C \cdot \psi^*)$

$H = J\,\psi^2 - w_C\,[(1 + \psi) \ln(1 + \psi) + (1 - \psi) \ln(1 - \psi)]$ (up to a $\psi$-independent constant)
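The self-consistency condition $\psi^* = \tanh(J/w_C \cdot \psi^*)$ has no closed-form solution, but it can be solved by simple fixed-point iteration. The sketch below is one way to do that; the function name, starting value and tolerances are assumptions made for the illustration, and the iteration converges slowly very close to $J/w_C = 1$.

```python
import math


def solve_order_parameter(j, w_c, psi0=0.5, tol=1e-12, max_iter=100_000):
    """Iterate psi <- tanh(J / w_C * psi) from psi0 until it stops changing."""
    psi = psi0
    for _ in range(max_iter):
        new_psi = math.tanh(j / w_c * psi)
        if abs(new_psi - psi) < tol:
            return new_psi
        psi = new_psi
    return psi


# Scan across the transition at J / w_C = 1 (w_C fixed to 1 for convenience).
for ratio in (0.5, 0.9, 1.1, 2.0):
    psi_star = solve_order_parameter(j=ratio, w_c=1.0)
    state = "coding" if psi_star > 1e-6 else "non-coding"
    print(f"J/w_C = {ratio:.1f}  psi* = {psi_star:.6f}  ({state})")
```

For ratios below 1 the iteration collapses to $\psi^* = 0$ (the "liquid" state); above 1 it settles on a nonzero $\psi^*$ (the "solid" state).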

Page 22: Tsvi Tlusty, Physical Biology Gidi Lasovski

Quantifying the code

$N_s$ symbols ($i, j, k, \ldots$) mapped to $N_m$ meanings ($\alpha, \beta, \ldots$)

$p_{i\alpha}$ – the probability for $i$ to be mapped to $\alpha$

$\sum_\alpha p_{i\alpha} = 1$

In the non-coding state, the probability is constant: $1/N_m$

$r_{ij}$ – the probability to read $i$ as $j$

$c_{\alpha\beta}$ – the cost of misinterpreting $\alpha$ as $\beta$

The total error load:

$L = \sum_{i,j,\alpha,\beta} r_{ij}\, p_{i\alpha} p_{j\beta} c_{\alpha\beta}$

Just like a ferromagnet: $r$ – the interaction, $c$ – the magnitude, $p$ – the spin

It also prefers specific symbols: $L(r_{ii}) = 0$ only if $i$ signifies a specific meaning

Page 23: Tsvi Tlusty, Physical Biology Gidi Lasovski

Toy model (1 bit)

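$p^*$, the optimal code, can be found from the derivative condition $\partial H_T/\partial p_{i\alpha} = 0$:

$p^*_{i\alpha} = z^{-1}\, p^*_\alpha \exp(-G_{i\alpha}/w_C)$

$z = \sum_\beta p^*_\beta \exp(-G_{i\beta}/w_C)$

$G_{i\alpha} = 2 \sum_{j,\beta} (r_{ij} - w_D (1 - \delta_{ij}))\, p_{j\beta} c_{\alpha\beta}$

$c = \begin{pmatrix} 0 & c \\ c & 0 \end{pmatrix}$, $r = \begin{pmatrix} 1-r & r \\ r & 1-r \end{pmatrix}$, $p = 0.5 \begin{pmatrix} 1+\psi & 1-\psi \\ 1-\psi & 1+\psi \end{pmatrix}$

$\psi^* = \tanh(J/w_C \cdot \psi^*)$

$J = c\,(1 - 2r + w_D)$

$w_C^* = J = (1 - 2r + w_D)\, c$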
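As a numerical check on the toy model, the sketch below applies the exponential update for $p_{i\alpha}$ once to the two-by-two ansatz and confirms that the resulting order parameter equals $\tanh(J\psi/w_C)$. The function name `boltzmann_update` and the parameter values are assumptions for illustration; $r$ and $c$ are taken to be symmetric, as in the toy matrices.

```python
import math


def boltzmann_update(p, r, c, w_d, w_c):
    """One application of p[i][a] = p_a * exp(-G[i][a] / w_C) / z."""
    ns, nm = len(p), len(p[0])
    # Average usage of each meaning: p_a = (1 / ns) * sum_j p[j][a].
    p_mean = [sum(p[j][a] for j in range(ns)) / ns for a in range(nm)]
    new_p = []
    for i in range(ns):
        # G[i][a] = 2 * sum_{j,b} (r[i][j] - w_D * (1 - delta_ij)) * p[j][b] * c[a][b]
        g = [2 * sum((r[i][j] - w_d * (i != j)) * p[j][b] * c[a][b]
                     for j in range(ns) for b in range(nm))
             for a in range(nm)]
        weights = [p_mean[a] * math.exp(-g[a] / w_c) for a in range(nm)]
        z = sum(weights)  # normalises row i so that it sums to 1
        new_p.append([w / z for w in weights])
    return new_p


# Toy model with illustrative parameter values (assumptions).
c0, r0, w_d, w_c, psi = 1.0, 0.1, 0.5, 1.0, 0.3
c = [[0.0, c0], [c0, 0.0]]
r = [[1 - r0, r0], [r0, 1 - r0]]
p = [[0.5 * (1 + psi), 0.5 * (1 - psi)],
     [0.5 * (1 - psi), 0.5 * (1 + psi)]]

new_p = boltzmann_update(p, r, c, w_d, w_c)
new_psi = new_p[0][0] - new_p[0][1]
coupling = c0 * (1 - 2 * r0 + w_d)  # J = c (1 - 2r + w_D)

print(new_psi, math.tanh(coupling * psi / w_c))
```

Both printed numbers agree, which is the step that turns the matrix equation for $p^*$ into the scalar condition on $\psi^*$.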

Page 24: Tsvi Tlusty, Physical Biology Gidi Lasovski

General criteria

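The critical point is where $Q_{i\alpha j\beta} = -\partial^2 H/\partial p_{i\alpha}\, \partial p_{j\beta}$ stops being positive definite:

$w_C^* = 2\, n_m^{-1}\, (\lambda_r^* + w_D)\, |\lambda_c^*|$

$\lambda_r^*$ is the 2nd-largest eigenvalue of $r$

$\lambda_c^*$ is the smallest eigenvalue of $c$; it corresponds to the longest wavelength, the smallest error load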
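As a consistency check, the general criterion can be applied to the 1-bit toy model. The worked step below assumes $0 < r < 1/2$, $c > 0$, $n_m = 2$, and reads the prefactor of the criterion as $2/n_m$; under these assumptions it recovers the toy-model result $w_C^* = J$.

```latex
\begin{aligned}
r &= \begin{pmatrix} 1-r & r \\ r & 1-r \end{pmatrix}
  &&\Rightarrow\ \text{eigenvalues } 1,\ 1-2r
  &&\Rightarrow\ \lambda_r^* = 1-2r, \\
c &= \begin{pmatrix} 0 & c \\ c & 0 \end{pmatrix}
  &&\Rightarrow\ \text{eigenvalues } c,\ -c
  &&\Rightarrow\ \lambda_c^* = -c, \\
w_C^* &= \frac{2}{n_m}\,(\lambda_r^* + w_D)\,|\lambda_c^*|
  = \frac{2}{2}\,(1 - 2r + w_D)\,c
  = J .
\end{aligned}
```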