BIBM 2008
Qiong Cheng, Georgia State University
Joint work with Piotr Berman (Penn State), Robert Harrison (GSU), Alexander Zelikovsky (GSU)
Fast Alignments of Metabolic Networks
Metabolic pathway & pathways model
[Figure: a portion of the pentose phosphate pathway and its metabolic pathway model; nodes are the enzymes EC 1.1.1.49, 3.1.1.31, 1.1.1.44, 1.1.1.34, and 2.7.1.13.]
Alignments of metabolic pathways
Enzyme similarity and pathway topology together represent the similarity of pathway functionality.
Pattern P: query pathway
Text T: pathway in database
[Figure: pattern vertices aligned to text vertices; labels: match, mismatch/substitute.]
Types of Pathway Alignments
Mapping f from Pattern to Text
+ gene duplication and function sharing = vertex collapsing
embedding + enzyme insertions = edge subdividing (fine per insertion), Pinter et al 2005
+ enzyme deletion = bypass deletion: send vertex to b, Kelly et al 2005
+ subpath deletion = strong deletion: send vertex to d, Yang et al 2007
Optimal Alignment Problem Formulation
Given: a metabolic pathway $P = \langle V_P, E_P \rangle$ (pattern) and a metabolic network $T = \langle V_T, E_T \rangle$ (text)
Find: a minimum-cost alignment $f : P \to T$
Minimize: $\mathrm{cost}(f) = \sum_{u \in V_P} \Delta(u, f_v(u)) + \lambda \sum_{l \in l_P} (|f_l(l)| - 1)$
$f_v$: every vertex in $V_P$ is mapped to a vertex in $V_T \cup \{b, d\}$
$f_l$: every path $l_P$ across vertices in $f_v^{-1}(V_T)$ is mapped to a path $l_T$
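The cost function above can be evaluated directly for a given alignment. The following is a minimal sketch, not the authors' implementation: it assumes that pattern paths are single pattern edges, that |f_l(l)| counts the edges on the image path (so an edge mapped to one text edge costs nothing), and that Δ is a 0/1 label mismatch. The names `alignment_cost`, `p_label`, `t_label`, and the toy graph are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def alignment_cost(
    pattern_vertices: List[Vertex],
    pattern_edges: List[Edge],
    f_v: Dict[Vertex, Vertex],
    f_l: Dict[Edge, List[Vertex]],
    delta: Callable[[Vertex, Vertex], float],
    lam: float,
) -> float:
    """cost(f) = sum_u delta(u, f_v(u)) + lam * sum_l (|f_l(l)| - 1)."""
    vertex_cost = sum(delta(u, f_v[u]) for u in pattern_vertices)
    gap_cost = 0.0
    for edge in pattern_edges:
        image_path = f_l[edge]                 # list of text vertices
        edges_on_path = len(image_path) - 1    # assumption: |f_l(l)| = edge count
        gap_cost += edges_on_path - 1          # extra edges = inserted enzymes
    return vertex_cost + lam * gap_cost


# Hypothetical toy data: pattern a -> b -> c, text x -> y -> z -> w.
p_label = {"a": "1.1.1.49", "b": "3.1.1.31", "c": "1.1.1.44"}
t_label = {"x": "1.1.1.49", "y": "3.1.1.31", "z": "2.7.1.13", "w": "1.1.1.34"}


def mismatch(u: Vertex, v: Vertex) -> float:
    # Assumed 0/1 enzyme dissimilarity; the slides do not define Delta.
    return 0.0 if p_label[u] == t_label[v] else 1.0


example_cost = alignment_cost(
    pattern_vertices=["a", "b", "c"],
    pattern_edges=[("a", "b"), ("b", "c")],
    f_v={"a": "x", "b": "y", "c": "w"},
    f_l={("a", "b"): ["x", "y"], ("b", "c"): ["y", "z", "w"]},
    delta=mismatch,
    lam=0.5,
)
print(example_cost)
```

In this toy alignment one vertex is mismatched and one pattern edge is stretched over an extra text vertex, so the total is one mismatch plus λ times one insertion.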
DP solution when the pattern is a multisource tree.
Runtime for the DP solution with Fibonacci heaps: $O(|V_P|(|E_T| + |V_T| \log |V_T|))$.
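To illustrate the bottom-up idea behind a tree DP, here is a simplified sketch under stated assumptions. It handles a single-rooted pattern tree rather than a multisource tree, ignores the deletion targets b and d, treats the text as an unweighted directed graph, and uses plain breadth-first search for all-pairs distances instead of the Fibonacci-heap shortest-path step, so it does not achieve the runtime quoted above. All names (`align_tree`, `bfs_dist`, the toy labels) are hypothetical.

```python
from collections import deque
from math import inf


def bfs_dist(adj, src):
    """Edge-count distances from src in an unweighted directed graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def align_tree(children, root, text_adj, delta, lam):
    """Minimum cost of mapping a rooted pattern tree into the text graph.

    best[(u, v)] is the cheapest cost of aligning the subtree rooted at
    pattern vertex u when u itself is mapped to text vertex v.
    """
    text_vertices = list(text_adj)
    dist = {v: bfs_dist(text_adj, v) for v in text_vertices}
    best = {}

    def solve(u):
        for c in children.get(u, ()):
            solve(c)                      # children first (post-order)
        for v in text_vertices:
            total = delta(u, v)
            for c in children.get(u, ()):
                # Pattern edge (u, c) maps to a text path v -> w of
                # dist[v][w] edges; each extra edge costs lam.
                total += min(
                    (best[(c, w)] + lam * (dist[v][w] - 1)
                     for w in text_vertices if w != v and w in dist[v]),
                    default=inf,
                )
            best[(u, v)] = total

    solve(root)
    return min(best[(root, v)] for v in text_vertices)


# Hypothetical toy data: pattern a -> b -> c, text x -> y -> z -> w.
p_label = {"a": "E1", "b": "E2", "c": "E3"}
t_label = {"x": "E1", "y": "E2", "z": "E9", "w": "E3"}
optimal = align_tree(
    children={"a": ["b"], "b": ["c"]},
    root="a",
    text_adj={"x": ["y"], "y": ["z"], "z": ["w"], "w": []},
    delta=lambda u, v: 0.0 if p_label[u] == t_label[v] else 1.0,
    lam=0.5,
)
print(optimal)
```

Here the best mapping sends a, b, c to x, y, w: all labels match and the edge (b, c) is stretched over one extra text vertex, which is cheaper than mapping c to the mismatched neighbour z.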
Handling cycles
[Figure: two drawings of an example pattern on vertices a, b, c, d, e.]
DP does not work when the pattern has cycles.
"Fix" images for some pattern vertices and reduce to the acyclic case.
Find a minimum feedback vertex set F(P): V_P − F(P) is acyclic. The problem is NP-complete but easy to approximate.
Runtime is increased by a factor of O(|V_T|^|F(P)|).
• Total runtime: O(|V_T|^|F(P)| |V_P| (|E_T| + |V_T| log|V_T|))
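As a rough illustration of the reduction to the acyclic case, the sketch below finds a feedback vertex set with a simple greedy heuristic: find a directed cycle, remove its highest-degree vertex, and repeat. This is an assumed stand-in, not the approximation method used in the talk, and it gives no optimality guarantee. The graph and function names are hypothetical.

```python
def find_cycle(adj):
    """Return the vertices of one directed cycle, or None if acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in adj}
    stack = []

    def dfs(u):
        color[u] = GRAY
        stack.append(u)
        for w in adj[u]:
            if color[w] == GRAY:          # back edge closes a cycle
                return stack[stack.index(w):]
            if color[w] == WHITE:
                cycle = dfs(w)
                if cycle:
                    return cycle
        stack.pop()
        color[u] = BLACK
        return None

    for v in adj:
        if color[v] == WHITE:
            cycle = dfs(v)
            if cycle:
                return cycle
    return None


def greedy_feedback_vertex_set(adj):
    """Greedy heuristic: delete a max-degree vertex from each cycle found."""
    adj = {u: set(vs) for u, vs in adj.items()}   # work on a copy
    removed = set()
    while (cycle := find_cycle(adj)) is not None:
        def degree(v):
            # out-degree plus in-degree in the current graph
            return len(adj[v]) + sum(v in adj[u] for u in adj)
        victim = max(cycle, key=degree)
        removed.add(victim)
        del adj[victim]
        for u in adj:
            adj[u].discard(victim)
    return removed


# Hypothetical cyclic pattern: a -> b -> c -> a, plus c -> d -> e.
pattern = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": []}
fvs = greedy_feedback_vertex_set(pattern)
print(fvs)
```

Once F(P) is known, each of its vertices can be assigned every possible image in V_T in turn, which is where the extra |V_T|^|F(P)| factor in the runtime comes from.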
Comparison on different methods
Alignment of tree pathways from different species with optimal homomorphism (HM) and optimal network alignment (NA). Average numbers of mismatches and gaps are reported on common statistically significant matched pathways.
Significant deletion
Aspartate superpathway in E. coli; lysine biosynthesis in T. thermophilus
Mapping result: unmatched vertices are deleted.
Pathway holes: find and fill
Hole = missing enzyme in the pathway description (in the database)
Finding holes is a difficult task: comparison can help
Check if there is such an enzyme in the pattern
Find the closest protein in the same group
If identity is high (> 80%), we expect good filling
Align to the previous and next enzyme: their functions may be taken over
Figure: Mapping of formaldehyde oxidation V pathway in B. subtilis to formylTHF biosynthesis pathway in E. coli
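The selection step in the hole-filling procedure can be sketched as follows. This is only an illustration under assumptions: sequence identities are taken as already computed fractions in [0, 1], and the function name, protein identifiers, and values are hypothetical. Only the 80% threshold comes from the slide.

```python
def propose_hole_filler(candidates, threshold=0.80):
    """Pick the closest protein in the group and accept it if identity > threshold.

    candidates maps a (hypothetical) protein id to its sequence identity
    with the enzyme expected at the hole, as a fraction in [0, 1].
    Returns (protein_id, identity) or None if no candidate is close enough.
    """
    if not candidates:
        return None
    best_id = max(candidates, key=candidates.get)
    best_identity = candidates[best_id]
    if best_identity > threshold:
        return best_id, best_identity
    return None


# Made-up identities for illustration only.
good = propose_hole_filler({"protA": 0.62, "protB": 0.91})
weak = propose_hole_filler({"protA": 0.62, "protC": 0.75})
print(good, weak)
```

When no candidate passes the threshold, the slide's fallback is to look at the previous and next enzymes in the pathway, which this sketch leaves out.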
Resolving Ambiguity
Mapping of glutamate degradation VII pathways from B. subtilis to T. thermophilus (p<0.01). The shaded node reflects enzyme homology.
Future work
Improve method of filling pathway holes
Discover critical metabolic elements/modules/motifs
Describe evolution of metabolic pathways
Integrate with genome database
Acknowledgments
GSU Molecular Basis of Disease (MBD) fellowship
Peter Karp, Oleg Rokhlenko, Florian Rasche, Amit Sabnis, Dipendra Kaur, Kelly Westbrooks, Irina Astrovskaya, Stefan Gremalschi, Jingwu He, Dumitru Brinza, Weidong Mao, Nisar Hudewale