
Extreme Pathway Lengths and Reaction Participation in Genome Scale Metabolic Networks

Jason A. Papin, Nathan D. Price and Bernhard Ø. Palsson


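Introduction

[Figure: Reaction Network. Duplication is only for easy drawing.]

Stoichiometric Matrix $S$ (rows are metabolites, columns are fluxes; a positive entry means the reaction produces the metabolite, a negative entry means it consumes it):

         v1   v2   v3   v4   v5   v6   b1   b2   b3
    A    -1    0    0    0    0    0    1    0    0
    B     1   -2   -2    0    0    0    0    0    0
    C     0    1    0    0   -1   -1    0    0    0
    D     0    0    1   -1    1    0    0    0    0
    E     0    0    0    1    0    1    0   -1    0
    byp   0    1    1    0    0    0    0    0   -1
    cof   0    0   -1    1   -1    0    0    0    0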

Background

For every metabolite $X_i$ in the system we get the following equation:

$$\frac{dX_i}{dt} = \sum_j S_{i,j}\, v_j$$

Let's look at B for example:

$$\frac{dB}{dt} = v_1 - 2v_2 - 2v_3$$

Since the time constants associated with growth are much larger than those associated with each individual reaction we assume:

$$\frac{dB}{dt} = v_1 - 2v_2 - 2v_3 = 0$$

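We get:

$$\forall i:\quad \frac{dX_i}{dt} = \sum_j S_{i,j}\, v_j = 0 \quad\Longrightarrow\quad S \cdot v = 0$$

$$S \cdot v =
\begin{pmatrix}
-1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & -2 & -2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & -1 & -1 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & -1 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 0 & -1 & 1 & -1 & 0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \\ v_6 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$

Every solution of this set of equations is a steady state that the system can be in.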
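As a quick numerical check of the steady-state condition, the sketch below multiplies the stoichiometric matrix by a candidate flux vector and tests whether every entry of the product is zero. The signs in `S` are an assumption (production positive, consumption negative), and the helper names `mat_vec` and `is_steady_state` are illustrative only.

```python
# Stoichiometric matrix S: rows A, B, C, D, E, byp, cof;
# columns v1..v6 followed by b1..b3.
# ASSUMPTION: signs follow "produced = +, consumed = -".
S = [
    [-1,  0,  0,  0,  0,  0, 1,  0,  0],  # A
    [ 1, -2, -2,  0,  0,  0, 0,  0,  0],  # B
    [ 0,  1,  0,  0, -1, -1, 0,  0,  0],  # C
    [ 0,  0,  1, -1,  1,  0, 0,  0,  0],  # D
    [ 0,  0,  0,  1,  0,  1, 0, -1,  0],  # E
    [ 0,  1,  1,  0,  0,  0, 0,  0, -1],  # byp
    [ 0,  0, -1,  1, -1,  0, 0,  0,  0],  # cof
]


def mat_vec(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def is_steady_state(matrix, flux):
    """A flux vector is a steady state when S * v is the zero vector."""
    return all(value == 0 for value in mat_vec(matrix, flux))


candidate = [2, 1, 0, 0, 0, 1, 2, 1, 1]   # balances every metabolite
not_steady = [1, 0, 0, 0, 0, 0, 0, 0, 0]  # only v1 runs: A drains, B piles up

print(is_steady_state(S, candidate))   # True
print(is_steady_state(S, not_steady))  # False
```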

Reminder:

• Such a system is called homogeneous.
• Such a system always has a solution (the zero solution).
• If it has more than one solution it has an infinite number of solutions.
• The set of all the solutions is a vector space.
• This vector space is called the null space.
• From the rank theorem of linear algebra we know:

$$\dim \operatorname{Nul} S = n - \operatorname{rank} S \qquad (n \text{ is the number of reactions})$$
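To make the rank theorem concrete, the following sketch computes the rank of the example stoichiometric matrix by exact Gaussian elimination and derives the dimension of the null space from it. The signs in `S` are an assumption (production positive, consumption negative), and the function name `rank` is a hypothetical helper, not part of any library.

```python
from fractions import Fraction

# ASSUMPTION: signs follow "produced = +, consumed = -".
# Rows: A, B, C, D, E, byp, cof; columns: v1..v6, b1..b3.
S = [
    [-1,  0,  0,  0,  0,  0, 1,  0,  0],
    [ 1, -2, -2,  0,  0,  0, 0,  0,  0],
    [ 0,  1,  0,  0, -1, -1, 0,  0,  0],
    [ 0,  0,  1, -1,  1,  0, 0,  0,  0],
    [ 0,  0,  0,  1,  0,  1, 0, -1,  0],
    [ 0,  1,  1,  0,  0,  0, 0,  0, -1],
    [ 0,  0, -1,  1, -1,  0, 0,  0,  0],
]


def rank(matrix):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


n = len(S[0])            # number of reactions
rank_S = rank(S)
nullity = n - rank_S     # dim Nul S = n - rank S

print(rank_S)   # 6 (the cof row is the negative of the D row)
print(nullity)  # 3
```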

Background: Defining the null space

In order to define the null space we need to find a spanning set.

Reminder:

A spanning set $K = \{k_1, \dots, k_m, \dots, k_l\}$ for a vector space $U$ of dimension $m$ is a set of vectors, such that every other vector in $U$ can be written as a linear combination of the vectors in $K$.

Mathematically:

$$\forall u \in U \;\; \exists\, \alpha_1, \dots, \alpha_l \ \text{ s.t. } \ u = \sum_{j=1}^{l} \alpha_j k_j$$

The minimal possible size of a spanning set is $m$.

If the spanning set satisfies this then it is called a basis and all the vectors in it are linearly independent.

Since $K = \{k_1, \dots, k_m\} \subseteq \operatorname{Nul} S$ we have $S \cdot k_i = 0$ for every $i$. This implies that every member of the basis is a possible steady state.

A Problem

$$\forall u \in \operatorname{Nul} S \;\; \exists\, \alpha_1, \dots, \alpha_m \ \text{ s.t. } \ u = \sum_{j=1}^{m} \alpha_j k_j$$

Mathematically, $\alpha_1, \dots, \alpha_m$ can take negative values. Biologically this creates a problem since each vector defines a flux which cannot be “reversed”.

Background: The solution

Notice that we are only interested in solutions where $v_i \ge 0$ for every $i$ (since the reactions must take place in the “right” direction).

We find a spanning set $K$ such that every such solution can be written as a linear combination of the vectors in $K$ where all the coefficients take non-negative values.

Notice that the vectors in such a set need not be linearly independent.

These vectors will be called genetically independent.

Genetically independent vectors are a group of vectors in which no vector can be expressed as a linear combination of the other vectors such that all the coefficients are non-negative.

An algorithm to find a genetically independent minimum spanning set is described in Clarke’s paper “Complete set of steady states for the general stoichiometric dynamical systems” and will not be shown within the framework of this presentation.

The resulting solution space takes the shape of a convex polyhedral cone.

Extreme Pathways

The genetically independent spanning set in the above example is the following (each column is one vector of the set):

    v1    2    2    2
    v2    1    0    1
    v3    0    1    0
    v4    0    1    1
    v5    0    0    1
    v6    1    0    0
    b1    2    2    2
    b2    1    1    1
    b3    1    1    1

Notice that each such vector defines a pathway in the reaction network. With the entries ordered $(v_1, v_2, v_3, v_4, v_5, v_6, b_1, b_2, b_3)$, the three pathways are:

$$(2, 1, 0, 0, 0, 1, 2, 1, 1)^T$$

$$(2, 1, 0, 1, 1, 0, 2, 1, 1)^T$$

$$(2, 0, 1, 1, 0, 0, 2, 1, 1)^T$$

These pathways are called extreme pathways.

From the way they were calculated we know that every possible steady state flux can be expressed as a non-negative linear combination of these extreme pathways.

The extreme pathways define the topological structure of the network.
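The statement that non-negative combinations of extreme pathways are steady states can be checked numerically. The sketch below is only an illustration: the signs in `S` are an assumption (production positive, consumption negative), the weights are arbitrary, and `combine` and `mat_vec` are hypothetical helper names.

```python
# ASSUMPTION: signs follow "produced = +, consumed = -".
# Rows: A, B, C, D, E, byp, cof; columns: v1..v6, b1..b3.
S = [
    [-1,  0,  0,  0,  0,  0, 1,  0,  0],
    [ 1, -2, -2,  0,  0,  0, 0,  0,  0],
    [ 0,  1,  0,  0, -1, -1, 0,  0,  0],
    [ 0,  0,  1, -1,  1,  0, 0,  0,  0],
    [ 0,  0,  0,  1,  0,  1, 0, -1,  0],
    [ 0,  1,  1,  0,  0,  0, 0,  0, -1],
    [ 0,  0, -1,  1, -1,  0, 0,  0,  0],
]

# The three extreme pathways of the example, ordered (v1..v6, b1..b3).
EXTREME_PATHWAYS = [
    [2, 1, 0, 0, 0, 1, 2, 1, 1],
    [2, 0, 1, 1, 0, 0, 2, 1, 1],
    [2, 1, 0, 1, 1, 0, 2, 1, 1],
]


def combine(weights, pathways):
    """Weighted sum of pathways; weights must be non-negative."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    size = len(pathways[0])
    return [sum(w * p[k] for w, p in zip(weights, pathways)) for k in range(size)]


def mat_vec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


flux = combine([1, 3, 2], EXTREME_PATHWAYS)  # arbitrary non-negative weights
residual = mat_vec(S, flux)

print(flux)      # [12, 3, 3, 5, 2, 1, 12, 6, 6]
print(residual)  # [0, 0, 0, 0, 0, 0, 0]
```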

We now define the extreme pathway matrix $P$:

          EP1  EP2  EP3
    v1     2    2    2
    v2     1    0    1
    v3     0    1    0
    v4     0    1    1
    v5     0    0    1
    v6     1    0    0
    b1     2    2    2
    b2     1    1    1
    b3     1    1    1

$P_{i,j}$ equals the relative flux value through the $i^{th}$ reaction in the $j^{th}$ extreme pathway.

Extreme Pathway Length

A property of the extreme pathways which we are interested in is the length of the extreme pathways.

These lengths can be calculated from the extreme pathway matrix.

First we transform $P$ to a binary matrix $\bar{P}$ by changing all the non-zero values to 1.

    $P$:                          $\bar{P}$:
          EP1  EP2  EP3                 EP1  EP2  EP3
    v1     2    2    2            v1     1    1    1
    v2     1    0    1            v2     1    0    1
    v3     0    1    0            v3     0    1    0
    v4     0    1    1            v4     0    1    1
    v5     0    0    1            v5     0    0    1
    v6     1    0    0            v6     1    0    0
    b1     2    2    2            b1     1    1    1
    b2     1    1    1            b2     1    1    1
    b3     1    1    1            b3     1    1    1

We then simply multiply $\bar{P}^T$ with $\bar{P}$.

$\bar{P}^T \cdot \bar{P}$ (the matrix is symmetric, so only the upper triangle is shown):

           EP1  EP2  EP3
    EP1     6    4    5
    EP2          6    5
    EP3               7

The numbers in the $(i,i)$ position represent the length of the $i^{th}$ extreme pathway.

The numbers in the $(i,j)$ position represent the shared length of the $i^{th}$ and $j^{th}$ extreme pathways.
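The two steps (binarize, then multiply the transpose by the matrix) can be sketched in a few lines. The helper names `binarize`, `transpose` and `mat_mul` are illustrative and not taken from the presentation.

```python
# Extreme pathway matrix P: rows v1..v6, b1..b3; columns EP1..EP3.
P = [
    [2, 2, 2],  # v1
    [1, 0, 1],  # v2
    [0, 1, 0],  # v3
    [0, 1, 1],  # v4
    [0, 0, 1],  # v5
    [1, 0, 0],  # v6
    [2, 2, 2],  # b1
    [1, 1, 1],  # b2
    [1, 1, 1],  # b3
]


def binarize(matrix):
    """Replace every non-zero entry by 1."""
    return [[1 if x != 0 else 0 for x in row] for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


P_bin = binarize(P)
length_matrix = mat_mul(transpose(P_bin), P_bin)

for row in length_matrix:
    print(row)
# [6, 4, 5]
# [4, 6, 5]
# [5, 5, 7]
```

The diagonal gives the pathway lengths (6, 6 and 7 reactions), and the off-diagonal entries give the number of reactions shared by each pair of pathways.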

Why is this true?

Let's look at the (1,3) entry for example:

                               v1  v2  v3  v4  v5  v6  b1  b2  b3
    1st row of $\bar{P}^T$:     1   1   0   0   0   1   1   1   1
    3rd column of $\bar{P}$:    1   1   0   1   1   0   1   1   1

The entry is the dot product of these two vectors. It counts the reactions that are active in both EP1 and EP3 ($v_1$, $v_2$, $b_1$, $b_2$, $b_3$), which gives 5.
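The same entry can be reproduced by hand: a position contributes 1 to the dot product only when both pathways use that reaction. The sketch below lists those shared reactions; the variable names are illustrative.

```python
REACTIONS = ["v1", "v2", "v3", "v4", "v5", "v6", "b1", "b2", "b3"]

ep1 = [1, 1, 0, 0, 0, 1, 1, 1, 1]  # 1st row of the transposed binary matrix
ep3 = [1, 1, 0, 1, 1, 0, 1, 1, 1]  # 3rd column of the binary matrix

# Reactions that are active in both pathways.
shared = [name for name, a, b in zip(REACTIONS, ep1, ep3) if a and b]

# The (1,3) entry is the dot product of the two binary vectors.
entry_1_3 = sum(a * b for a, b in zip(ep1, ep3))

print(shared)     # ['v1', 'v2', 'b1', 'b2', 'b3']
print(entry_1_3)  # 5
```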

Using this method we can calculate the extreme pathway lengths for various organisms.

In this article the lengths of the extreme pathways responsible for producing amino acids were calculated for:

1. Haemophilus influenzae – AKA Pfeiffer's bacillus or Bacillus influenzae.

2. Helicobacter pylori – A bacterium that infects the lining of the human stomach.

Extreme Pathway Length: Haemophilus influenzae

The extreme pathway length distributions have more than one peak.

This implies that there are often multiple common extreme pathway lengths around which deviations are made.

Extreme Pathway Length: Helicobacter pylori

The histograms for valine and alanine are almost identical except that the histogram is shifted.

It takes five extra reaction steps to make valine for shorter extreme pathways and only three extra reactions for the longer ones.

Conclusion: The number of extra reaction steps needed to create valine instead of alanine depends on the length of the pathway.

Extreme Pathway Reaction Participation

Another property of the extreme pathways which we are interested in is the reaction participation in the extreme pathways.

The reaction participation of a reaction $v_i$ is the number of extreme pathways that the reaction takes place in.

[Figure: the extreme pathways EP1, EP2 and EP3.]

$v_1$'s reaction participation is 3, for example.

We want to calculate the reaction participation value for each of the reactions.

Recall that $\bar{P}$ is the matrix obtained from $P$ by changing all the non-zero values to 1.

This can be achieved by multiplying $\bar{P}$ with $\bar{P}^T$.

$\bar{P} \cdot \bar{P}^T$ (the matrix is symmetric, so only the upper triangle is shown):

          v1  v2  v3  v4  v5  v6  b1  b2  b3
    v1     3   2   1   2   1   1   3   3   3
    v2         2   0   1   1   1   2   2   2
    v3             1   1   0   0   1   1   1
    v4                 2   1   0   2   2   2
    v5                     1   0   1   1   1
    v6                         1   1   1   1
    b1                             3   3   3
    b2                                 3   3
    b3                                     3

The numbers in the $(i,i)$ position represent in how many extreme pathways the $i^{th}$ reaction participates.

The numbers in the $(i,j)$ position represent in how many extreme pathways both reaction $i$ and reaction $j$ participate.
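A minimal sketch of this product follows. Each entry is the dot product of two rows of the binary pathway matrix, so no explicit transpose is needed; the function name `participation_matrix` is a hypothetical helper.

```python
REACTIONS = ["v1", "v2", "v3", "v4", "v5", "v6", "b1", "b2", "b3"]

# Binary extreme pathway matrix: rows are reactions, columns EP1..EP3.
P_BIN = [
    [1, 1, 1],  # v1
    [1, 0, 1],  # v2
    [0, 1, 0],  # v3
    [0, 1, 1],  # v4
    [0, 0, 1],  # v5
    [1, 0, 0],  # v6
    [1, 1, 1],  # b1
    [1, 1, 1],  # b2
    [1, 1, 1],  # b3
]


def participation_matrix(p_bin):
    """Entry (i, j) counts the pathways that use both reaction i and reaction j."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in p_bin]
        for row_i in p_bin
    ]


R = participation_matrix(P_BIN)

for name, row in zip(REACTIONS, R):
    print(name, row)

diagonal = [R[k][k] for k in range(len(R))]
print(diagonal)  # [3, 2, 1, 2, 1, 1, 3, 3, 3]
```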

Why is this true?

Let's look at the (2,4) entry for example:

                                 EP1  EP2  EP3
    2nd row of $\bar{P}$:          1    0    1
    4th column of $\bar{P}^T$:     0    1    1

Their dot product is 1: only EP3 uses both $v_2$ and $v_4$.

What can we learn from the extreme pathway reaction participation matrix?

Let's look at $v_1$ for example:

$v_1$ participates in 3 extreme pathways. Since there are only 3 extreme pathways we know that $v_1$ participates in all the extreme pathways.

What else can we learn from the extreme pathway reaction participation matrix?

Let's look at $v_1$ and $b_1$ for example:

$v_1$ participates in 3 extreme pathways. $b_1$ participates in 3 extreme pathways.

Since there are 3 extreme pathways in which they both appear we know that one takes place iff the other takes place.

The reactions in region 1 participate in all of the extreme pathways.

Conclusion: all the reactions in region 1 either participate or not together.

This information is of value. If we know all the reactions that must occur together we can control (or completely prevent) a reaction by affecting a different reaction.

As we just saw, in some cases this information is easily seen in the matrix.

We will now describe an algorithm which, based on the reaction participation matrix, will find all the reactions that must occur (or not occur) together.

The algorithm:

1. Each reaction is in a group of its own: $K = \{v_1, \dots, v_n\}$.

2. While $K$ changes:

I. Check if there exist $i, j$ such that: $R^{PM}_{i,i} = R^{PM}_{j,j} = R^{PM}_{i,j}$

II. For every such couple merge the two groups.

$R^{PM}$ is the reaction participation matrix.

Naïve implementation: $O(n^3)$ iterations.

The implementation can be improved to work in $O(n^2)$ iterations by choosing the pairs on which we perform the test more carefully.

Can we make the algorithm work faster?

Answer: Who cares?
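The steps above can be sketched as follows. This is a naive illustration rather than a tuned implementation; the data layout (a list of group ids, one per reaction) and the names `participation_matrix` and `group_reactions` are assumptions made for the example.

```python
REACTIONS = ["v1", "v2", "v3", "v4", "v5", "v6", "b1", "b2", "b3"]

# Binary extreme pathway matrix: rows are reactions, columns EP1..EP3.
P_BIN = [
    [1, 1, 1], [1, 0, 1], [0, 1, 0], [0, 1, 1], [0, 0, 1],
    [1, 0, 0], [1, 1, 1], [1, 1, 1], [1, 1, 1],
]


def participation_matrix(p_bin):
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in p_bin]
        for row_i in p_bin
    ]


def group_reactions(R):
    """Naive grouping: repeat pairwise merges until nothing changes."""
    n = len(R)
    group_of = list(range(n))  # step 1: every reaction in its own group
    changed = True
    while changed:             # step 2: loop while the grouping changes
        changed = False
        for i in range(n):
            for j in range(i + 1, n):
                # step I: same participation count and always together
                together = R[i][i] == R[j][j] == R[i][j]
                if together and group_of[i] != group_of[j]:
                    # step II: merge the two groups
                    old, new = group_of[j], group_of[i]
                    group_of = [new if g == old else g for g in group_of]
                    changed = True
    groups = {}
    for k, g in enumerate(group_of):
        groups.setdefault(g, []).append(k)
    return sorted(groups.values())


R = participation_matrix(P_BIN)
groups = [[REACTIONS[k] for k in group] for group in group_reactions(R)]
print(groups)
# [['v1', 'b1', 'b2', 'b3'], ['v2'], ['v3'], ['v4'], ['v5'], ['v6']]
```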

Example:

$R^{PM}$ (the matrix is symmetric, so only the upper triangle is shown):

          v1  v2  v3  v4  v5  v6  b1  b2  b3
    v1     3   2   1   2   1   1   3   3   3
    v2         2   0   1   1   1   2   2   2
    v3             1   1   0   0   1   1   1
    v4                 2   1   0   2   2   2
    v5                     1   0   1   1   1
    v6                         1   1   1   1
    b1                             3   3   3
    b2                                 3   3
    b3                                     3

The groups evolve as follows:

$K = \{v_1, v_2, v_3, v_4, v_5, v_6, b_1, b_2, b_3\}$

$K = \{\{v_1, b_1\}, v_2, v_3, v_4, v_5, v_6, b_2, b_3\}$

$K = \{\{v_1, b_1, b_2\}, v_2, v_3, v_4, v_5, v_6, b_3\}$

$K = \{\{v_1, b_1, b_2, b_3\}, v_2, v_3, v_4, v_5, v_6\}$

We could make a small optimization. When we reach a reaction that was already joined with another reaction we need not check it.

This, however, doesn't change the worst-case complexity.
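One way to realize this optimization is a single pass that skips every reaction already placed in a group. The sketch below assumes this is valid because the test $R^{PM}_{i,i} = R^{PM}_{j,j} = R^{PM}_{i,j}$ holds exactly when two reactions appear in the same set of extreme pathways, which is an equivalence relation. The function name and the way the matrix is rebuilt from its upper triangle are illustrative choices.

```python
REACTIONS = ["v1", "v2", "v3", "v4", "v5", "v6", "b1", "b2", "b3"]

# Upper triangle of the reaction participation matrix, row by row.
UPPER = [
    [3, 2, 1, 2, 1, 1, 3, 3, 3],
    [2, 0, 1, 1, 1, 2, 2, 2],
    [1, 1, 0, 0, 1, 1, 1],
    [2, 1, 0, 2, 2, 2],
    [1, 0, 1, 1, 1],
    [1, 1, 1, 1],
    [3, 3, 3],
    [3, 3],
    [3],
]

# Mirror the upper triangle to obtain the full symmetric matrix.
n = len(UPPER)
R = [[0] * n for _ in range(n)]
for i, row in enumerate(UPPER):
    for offset, value in enumerate(row):
        R[i][i + offset] = value
        R[i + offset][i] = value


def group_reactions_single_pass(R):
    """Group reactions in one pass, skipping those already joined."""
    size = len(R)
    assigned = [False] * size
    groups = []
    for i in range(size):
        if assigned[i]:
            continue  # already joined with an earlier reaction
        group = [i]
        assigned[i] = True
        for j in range(i + 1, size):
            if not assigned[j] and R[i][i] == R[j][j] == R[i][j]:
                group.append(j)
                assigned[j] = True
        groups.append(group)
    return groups


groups = [[REACTIONS[k] for k in g] for g in group_reactions_single_pass(R)]
print(groups)
# [['v1', 'b1', 'b2', 'b3'], ['v2'], ['v3'], ['v4'], ['v5'], ['v6']]
```

The nested loops still examine up to n(n-1)/2 pairs when no reactions are joined, which matches the remark that the worst case is unchanged.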