17
Qiong Cheng, Robert Harrison, Alexander Zel ikovsky Computer Science in Georgia State Univer sity 2007 IEEE 7 th International Conference on BioInformatics & BioEn Homomorphism Mapping in Metabolic Pathways

Homomorphism Mapping in Metabolic Pathways

Embed Size (px)

DESCRIPTION

Homomorphism Mapping in Metabolic Pathways. Qiong Cheng, Robert Harrison , Alexander Zelikovsky Computer Science in Georgia State University. Oct. 17 2007 IEEE 7 th International Conference on BioInformatics & BioEngineering. Outline. Comparison of metabolic pathways - PowerPoint PPT Presentation

Citation preview

Page 1: Homomorphism Mapping in Metabolic Pathways

Qiong Cheng, Robert Harrison, Alexander ZelikovskyComputer Science in Georgia State University

Oct. 17 2007 IEEE 7th International Conference on BioInformatics & BioEngineering

Homomorphism Mapping in Metabolic Pathways

Page 2: Homomorphism Mapping in Metabolic Pathways

Outline Comparison of metabolic pathways Graph mappings: embeddings & homomorphisms Min cost homomorphism problem

Enzyme similarity Optimal DP algorithm for trees Searching metabolic networks for

pathway motifs pathway holes

Future work

Page 3: Homomorphism Mapping in Metabolic Pathways

Metabolic pathway & pathways model

Metabolic pathways model

A portion of pentose phosphate pathway

1.1.1.49

1.1.1.34

2.7.1.13

3.1.1.31 1.1.1.44

Metabolic pathway

Page 4: Homomorphism Mapping in Metabolic Pathways

Comparison of metabolic pathways

Enzyme Similarity

Pathway topology Similarity

Enzyme similarity and pathway topology together represent the similarity of pathway functionality.

Mismatch/Substitute

match

match

match

Page 5: Homomorphism Mapping in Metabolic Pathways

Related work

Linear topology

Tree topology

DCBA

D’X’

A’

(Forst & Schulten[1999], Chen & Hofestaedt[2004];)

D

CB

A

X’

B’

A’

(Pinter [2005] VGVTlogVGVGVTlogVT)

Arbitrary topology

Mapping : Linear pattern Graph (Kelly et al 2004) ( VTVG)Exhaustively search(Sharan et al 2005 ( VTVG) Yang et al 2007 ( VGVG)

Page 6: Homomorphism Mapping in Metabolic Pathways

Enzyme mapping cost

EC (Enzyme Commission) notation

Enzyme similarity score Δ

Measure Δ by tight reaction property Enzyme X = x1 . x2 . x3 . x4Enzyme Y = y1 . y2 . y3 . y4

= = = = Δ[X, Y ] = 1= = = Δ[X, Y ] = 10

Δ[X, Y ] = +∞

Measure Δ by the lowest common upper class distribution

Enzyme D = d1 . d2 . d3 . d4

Δ[X, Y ] = log2c(X, Y )

otherwise

Page 7: Homomorphism Mapping in Metabolic Pathways

Graph mappings: embeddings & homomorphisms

Isomorphism

T G

f

Homomorphism

Isomorphic embedding Homeomorphic embedding

Homomorphism f : T G: fv : VT VG

fe : ET paths of GEdge-to-path cost : (|fe(e)|-1)

We allow different enzymes to be mapped to the same enzyme.

Homomorphism coste in ET

(|fe(e)|-1)+

Δ(v, fv(v))v in T=

Page 8: Homomorphism Mapping in Metabolic Pathways

Min cost homomorphism problem

A multi-source tree is a directed graph, whose underlying undirected graph is a tree.

ignoring direction

Given a multisource tree T =<VT, ET> (Pattern) and an arbitrary graph G =<VG, EG> (Text), find min cost homomorphism f : T G

Page 9: Homomorphism Mapping in Metabolic Pathways

Preprocessing of text graph

Transitive closure of G is graph G*=(V, E*), where E*={(i,j): there is i-j-path in G}

Text G

A B

C D

E

FTransitive Closure of G : G*

A B

C

D

EF

1

1 1

1

1

1

2

23

2

2

3

Transitive closure

Page 10: Homomorphism Mapping in Metabolic Pathways

Pattern graph ordering

Pattern T

a b

c

d

Ordering

c bd a

Construct ordered pattern T’• DFS traversal • Processing order in opposite way

• Each edge ei in T’ is the unique edge connecting vi with the previous vertices in the order

Ordered pattern T’

Page 11: Homomorphism Mapping in Metabolic Pathways

DP table

u1 … uj … u|VG|

a

b

c

d

Text arbitrary order

DT[a, uj]

min cost homomorphism mapping from T’s subgraph induced by

previous vertices in the order into G*

Pattern T

a b

c

d

Page 12: Homomorphism Mapping in Metabolic Pathways

Filling DP table

is penalty for gaps

Δ(vi , uj) if vi is a leaf in T

Δ(vi, uj) + ∑l=1 to adj(vi)Minj’=1 to |VG|C(il, j’) if vi is a leaf in T

=DT[i, j]i<|VT|j<|VG|

Recursive function

h(j, jl) = #(hops between uj and ujl in G)

C[il, jl] = DT[il, jl] + (h(j, jl) - 1)

vil

vi

Pattern T G*

uj’

uj

h(j, j’)

Page 13: Homomorphism Mapping in Metabolic Pathways

Runtime Analysis

Transitive closure takes O(|VG||EG|).

The total runtime is O(|VG||EG|+|VG*||VT|).

Pattern graph ordering takes O(|VT| + |ET|) Dynamic programming

- Calculate min contribution of all child pairs of node pair (vi∈T,uj∈G) takes tij = degT (vi)degG*(uj)

- Filling DT takes

j=1 to |VG| i=1 to |VT|tij = j=1 to |VG| degG*(uj)i=1 to |VT|degT(vi)

= 2|EG*||ET|

Page 14: Homomorphism Mapping in Metabolic Pathways

Statistical significance

Randomized P-Value computation

Random degree-conserved graph generation: Reshuffle nodes a b

c d

a b

c d

Reshuffle edgesReshuffle edge

Page 15: Homomorphism Mapping in Metabolic Pathways

Experiments & applications

Identifying conserved pathways 24 pathways that are conserved across all 4

species 18 more pathways that are conserved

across at least three of these species

Resolving ambiguity

Discovering pathways holes

All-against-all mappings among S. cerevisiae, B. subtilis, T. thermophilus, and E.coli

Page 16: Homomorphism Mapping in Metabolic Pathways

Observation example 1 : Resolving Ambiguity

2.6.1.1

2.6.1.11.2.4.-1.2.4.2

2.3.1.-2.3.1.61

6.2.1.5

6.2.1.5

1.3.99.1

1.3.99.1

4.2.1.2

4.2.1.2

1.

1.1.1.82

1.1.1.82

Mapping of glutamate degradation VII pathways from B. subtilis to T. thermophilus (p < 0.01).

Page 17: Homomorphism Mapping in Metabolic Pathways

Future work

Approximation algorithm to handle with the comparison of general graphs

Mining protein interaction network

A web-oriented tool will be developed

Discovery of critical elements or modules based on graph comparison

Discovery of evolution relation of organisms by pathway comparison of different organisms at different time points

Integration with genome database