6
Exploring mitochondrial evolution and metabolism organization principles by comparative analysis of metabolic networks Xiao Chang a,b,d , Zhuo Wang c , Pei Hao a,d , Yuan-Yuan Li a,d, , Yi-Xue Li a,d, a Bioinformatics Center, Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China b Graduate School of the Chinese Academy of Sciences, 19 Yuquan Road, Beijing 100039, China c College of Life Sciences and Biotechnology, Shanghai Jiaotong University, 800 Dongchuan Road, Shanghai 200240, China d Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai 200235, China abstract article info Article history: Received 27 May 2009 Accepted 8 March 2010 Available online 15 March 2010 Keywords: Metabolic network Betweenness centrality Degree centrality Triad signicance proles The endosymbiotic theory proposed that mitochondrial genomes are derived from an alpha-proteobacterium- like endosymbiont, which was concluded from sequence analysis. We rebuilt the metabolic networks of mitochondria and 22 relative species, and studied the evolution of mitochondrial metabolism at the level of enzyme content and network topology. Our phylogenetic results based on network alignment and motif identication supported the endosymbiotic theory from the point of view of systems biology for the rst time. It was found that the mitochondrial metabolic network were much more compact than the relative species, probably related to the higher efciency of oxidative phosphorylation of the specialized organelle, and the network is highly clustered around the TCA cycle. Moreover, the mitochondrial metabolic network exhibited high functional specicity to the modules. This work provided insight to the understanding of mitochondria evolution, and the organization principle of mitochondrial metabolic network at the network level. © 2010 Elsevier Inc. All rights reserved. Introduction According to the endosymbiotic theory, mitochondria are descended from aerobic bacteria that somehow survived endocytosis and became incorporated into the cytoplasm [1]. The closest eubacterial relatives of mitochondria are considered to be members of the rickettsial subdivision of the α-proteobacteria [2,3]. However, only a small fraction of all of the mitochondrial proteome can be traced back to an origin in the α-proteobacteria, with the remaining mitochondrial proteins probably originating from other eubacteria or the primitive eukaryotic host [46], as is the case with the enzymes of the tricarboxylic acid (TCA) cycle, the major pathway of carbon metabolism in mitochondria [7]. During the endosymbiosis, genome reduction plays an important role so that the mitochondrial genome shrinks relative to its bacteria ancestor [8]. Many genes of the ancestral genome have been transferred to the host genome by horizontal gene transfer [9], while some of the remaining ones have been lost with their function replaced by host processes [10]. Thus, a selective set of metabolism pathways encoded by both mitochondrial genome and nuclear genome are maintained in mitochondria. Since the extensive genome reduction and gene reshufing in organelle limit the utilization of phylogenetic analysis based on sequence information [11], we set out to study the organelle evolution by comparing the metabolic networks. In a previous work, we compared metabolic network properties of chloroplasts and prokary- otic photosynthetic organisms, and found that the differences in metabolic network properties may reect the evolutionary changes during endosymbiosis that led to the improvement of the photosyn- thesis efciency in higher plants [12]. In this decade, the metabolic reconstruction of organelles as well as organisms have been constantly carried out by using both graph theoretic methods [1216] and constraint-based methods [1723], generating experimentally test- able hypotheses and valuable information on possible cellular phenotypes. In this work, we employed the graph theoretic methods to rebuild the metabolic networks of mitochondria and 22 relative species and compared the topological properties among these networks. The comparisons provided insight to the study of mitochondria evolution and the organization principle of mitochondrial metabolic network. Results Phylogenetic analysis based on enzyme content The metabolic networks of mitochondria and the 22 relative species were constructed based on the compound and enzyme information respectively (Supplementary Table 1). Before the phylo- genetic analysis at the topology level, we rst compared the enzyme content of the metabolic networks of mitochondria and the 22 relative Genomics 95 (2010) 339344 Corresponding authors. Address: Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai 200235, China. Fax: +86 21 64838882. E-mail addresses: [email protected] (X. Chang), [email protected] (Z. Wang), [email protected] (P. Hao), [email protected] (Y.-Y. Li), [email protected] (Y.-X. Li). 0888-7543/$ see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.ygeno.2010.03.006 Contents lists available at ScienceDirect Genomics journal homepage: www.elsevier.com/locate/ygeno

Exploring mitochondrial evolution and metabolism organization principles by comparative analysis of metabolic networks

  • Upload
    fudan

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Genomics 95 (2010) 339–344

Contents lists available at ScienceDirect

Genomics

j ourna l homepage: www.e lsev ie r.com/ locate /ygeno

Exploring mitochondrial evolution and metabolism organization principles bycomparative analysis of metabolic networks

Xiao Chang a,b,d, Zhuo Wang c, Pei Hao a,d, Yuan-Yuan Li a,d,⁎, Yi-Xue Li a,d,⁎a Bioinformatics Center, Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, Chinab Graduate School of the Chinese Academy of Sciences, 19 Yuquan Road, Beijing 100039, Chinac College of Life Sciences and Biotechnology, Shanghai Jiaotong University, 800 Dongchuan Road, Shanghai 200240, Chinad Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai 200235, China

⁎ Corresponding authors. Address: Shanghai Center f100 Qinzhou Road, Shanghai 200235, China. Fax: +86 2

E-mail addresses: [email protected] (X. Chang), [email protected] (P. Hao), [email protected] (Y.-Y. Li), yxl

0888-7543/$ – see front matter © 2010 Elsevier Inc. Adoi:10.1016/j.ygeno.2010.03.006

a b s t r a c t

a r t i c l e i n f o

Article history:Received 27 May 2009Accepted 8 March 2010Available online 15 March 2010

Keywords:Metabolic networkBetweenness centralityDegree centralityTriad significance profiles

The endosymbiotic theory proposed thatmitochondrial genomes are derived from an alpha-proteobacterium-like endosymbiont, which was concluded from sequence analysis. We rebuilt the metabolic networks ofmitochondria and 22 relative species, and studied the evolution of mitochondrial metabolism at the level ofenzyme content and network topology. Our phylogenetic results based on network alignment and motifidentification supported the endosymbiotic theory from the point of view of systems biology for the first time.It was found that the mitochondrial metabolic network were much more compact than the relative species,probably related to the higher efficiency of oxidative phosphorylation of the specialized organelle, and thenetwork is highly clustered around the TCA cycle. Moreover, the mitochondrial metabolic network exhibitedhigh functional specificity to the modules. This work provided insight to the understanding of mitochondriaevolution, and the organization principle of mitochondrial metabolic network at the network level.

or Bioinformation Technology,1 [email protected] (Z. Wang),[email protected] (Y.-X. Li).

ll rights reserved.

© 2010 Elsevier Inc. All rights reserved.

Introduction

According to the endosymbiotic theory,mitochondria are descendedfrom aerobic bacteria that somehow survived endocytosis and becameincorporated into the cytoplasm [1]. The closest eubacterial relatives ofmitochondria are considered to be members of the rickettsialsubdivision of theα-proteobacteria [2,3]. However, only a small fractionof all of the mitochondrial proteome can be traced back to an origin inthe α-proteobacteria, with the remaining mitochondrial proteinsprobably originating from other eubacteria or the primitive eukaryotichost [4–6], as is the casewith theenzymesof the tricarboxylic acid (TCA)cycle, the major pathway of carbon metabolism in mitochondria [7].

During the endosymbiosis, genome reduction plays an importantrole so that the mitochondrial genome shrinks relative to its bacteriaancestor [8]. Many genes of the ancestral genome have beentransferred to the host genome by horizontal gene transfer [9], whilesomeof the remainingones have been lostwith their function replacedby host processes [10]. Thus, a selective set of metabolism pathwaysencoded by both mitochondrial genome and nuclear genome aremaintained in mitochondria.

Since the extensive genome reduction and gene reshuffling inorganelle limit the utilization of phylogenetic analysis based on

sequence information [11], we set out to study the organelle evolutionby comparing the metabolic networks. In a previous work, wecompared metabolic network properties of chloroplasts and prokary-otic photosynthetic organisms, and found that the differences inmetabolic network properties may reflect the evolutionary changesduring endosymbiosis that led to the improvement of the photosyn-thesis efficiency in higher plants [12]. In this decade, the metabolicreconstruction of organelles aswell as organisms have been constantlycarried out by using both graph theoretic methods [12–16] andconstraint-based methods [17–23], generating experimentally test-able hypotheses and valuable information on possible cellularphenotypes.

In this work, we employed the graph theoretic methods to rebuildthe metabolic networks of mitochondria and 22 relative species andcompared the topological properties among these networks. Thecomparisons provided insight to the study of mitochondria evolutionand the organization principle of mitochondrial metabolic network.

Results

Phylogenetic analysis based on enzyme content

The metabolic networks of mitochondria and the 22 relativespecies were constructed based on the compound and enzymeinformation respectively (Supplementary Table 1). Before the phylo-genetic analysis at the topology level, we first compared the enzymecontent of themetabolic networks of mitochondria and the 22 relative

Fig. 2. Cluster tree based on network alignment. Pink, green, brown, and red indicateRickettsiae, non-parasite bacteria, Archaea and mitochondria, respectively.

340 X. Chang et al. / Genomics 95 (2010) 339–344

species, including five archaea, nine non-parasite bacteria and eightrickettsiae out of α-proteobacteria, in order to deduce the evolution-ary relationships in terms of the overall function. It should be noticedthat the enzyme content here referred to mitochondrial proteomeencoded by both mitochondrial genome and the host genome.

The enzyme content distance of each species pair was calculatedaccording to Korbel's definition [24], and the phylogenetic tree wasconstructed (Fig. 1). In this tree, the Rickettsiae formed one group,whichwas parallel to the group of non-parasite bacteria; all the bacteriaclades were clustered together. The Archaea formed another groupoutside of the bacteria cluster. These were basically consistent with thephylogenetic tree based on 16S rRNA sequences [25,26] with theexception that the Archaeoglobus fulgidus (abbreviated as afu) wasmixed in the Rickettsiae group, which might be attributed to a similarmetabolic network organization shared by afu and Rickettsiae.Noticeably, there was a clear separation of mitochondria from theother three groups of species. The large divergence betweenmitochon-dria and Rickettsiae, probably resulted from mutability of enzymes orrecruitment of exogenous enzymes for specialized function.

Compound network analysis by graph alignment

To study the topological properties of the metabolic networks ofmitochondria and the relative species,we conductednetwork alignmentamong their compound networks by using an algorithm depending ofthe topological information of both edge and node [27–29], andobtained a distance matrix.

In the cluster tree from the distance matrix (Fig. 2), the non-parasitic bacteria and the Archaea were grouped together with eachgroup clustered separately; the mitochondria was clustered with theRickettsiae. In addition, Rickettsia prowazekii (abbreviated as rpr),which have the most mitochondria-like eubacterial genome [26], wasseparated from the other Rickettsiae.

Network comparison based on specific motif

To understand the local structures and design principles of themetabolic network of mitochondria and the relative species, bothenzyme networks and compound networks were scanned for the 13directly connected three-node motifs (triads) (Fig. 3A). The motifscontainingmore than three nodes were routinely ignored in our studyto reduce the computation challenge.

The significant profiles (SPs) of the 13 triads (Triad SignificanceProfiles, TSPs) for two sets of metabolic networks (enzyme andcompound) were calculated, resulting in two matrixes of 26 rowsindicating species and 13 columns indicating motifs.

Fig. 1. Phylogenetic tree based on enzyme content distance. Pink, gree, brown and redindicate Rickettsiae, non-parasite bacteria, Archaea and mitochondria, respectively.

Species clustering was performed on the rows with the Pearsoncorrelation coefficient as the similarity measure using average linkagemethod [30]. In the enzyme graph (Fig. 3B), the mitochondria weregrouped with the Rickettsiae and an archaeon (ape), which wasparallel to another group containing the non-parasitic bacteria andtwo archaea (pai, pab). In the compound graph (Fig. 3C), themitochondria were exclusively clustered with most of the Rickettsiaein an isolated group, while the other Rickettsiae (ecn, erw and eru)were clustered with the non-parasitic bacteria and the Archaea,especially forming a small cluster with an exception of Neisseriameningitidis MC58 (abbreviated as nme), a non-parasitic bacteria.

Moreover, motif clustering was carried out on the column using theEuclideandistance, and the 13 triadswere divided into twomain classesin both enzyme networks and compound networks (Figs. 3B, C), withthe first class containing triads 1, 2, 3, 4, 7, 8 and the second containingtriads 5, 6, 9, 10, 11, 12, 13. Itwas found that the triads in thefirst class donot contain any bidirectional edges, except for triad 4, and all triads inthe second one contain at least one bidirectional edge. Therefore, wetermed the first class as unidirectional triads, and the second asbidirectional triads. According to the clustering results, the bidirectionaltriadoccurredmoremuch frequently than theunidirectional triad in ourmetabolic networks, which might be related to the extensive existenceof reversible biochemical reactions.

Centrality and modularity of the mitochondrial metabolic network

Degree centrality and betweenness centrality were used here tomeasure the network topological centrality [31]. It was found thatdegree centralities of both the enzyme network and the compoundnetwork of mitochondria were much higher than the other species(Supplementary Fig. 1), indicating that mitochondria possess tighterlinkage in the metabolic network, which might contribute to thehigher efficiency of oxidative phosphorylation of the specializedorganelle.

Since the tricarboxylic acid (TCA) cycle is the major pathway ofcarbon metabolism in mitochondria of oxygen-respiring eukaryotesto generate usable energy, the average degree of whole network andTCA sub-network of mitochondria were further examined. In themitochondrial compound network, the degree of whole network was2.6 while that of TCA sub-network was 4.8 with the t-test p-value of0.001027, indicating that the compounds involved in TCA cycle wereconnected much tighter than the background. A similar phenomenonwas observed in the enzyme network, where the average degrees ofwhole network and TCA sub-network were 3.38 and 4.9. It was thusproposed that the mitochondrial network was highly clusteredaround the TCA cycle, similar to our previous work on the chloroplast,which proposed that the density of sub-networks directly linked to

Fig. 3. (A) the directly connected three-node motifs (triads). Clustering results based on the triad significance profile (TSP) for metabolic enzyme (B) and compound (C) networks.Pink, green, brown and red indicate Rickettsiae, non-parasite bacteria, Archaea and mitochondria, respectively.

341X. Chang et al. / Genomics 95 (2010) 339–344

Calvin Cycle was much higher than the overall density of themetabolic network in chloroplasts [12].

The mitochondrial metabolic networks were then broken downinto modules by iteratively removing the edges with the highestbetweenness centrality [31]. The function of eachmodule was definedusing the classification scheme in KEGG which includes nine majorpathways: carbohydrate metabolism, energy metabolism, lipid me-tabolism, nucleotide metabolism, amino-acid metabolism, glycanbiosynthesis and metabolism, metabolism of cofactors and vitamins,biosynthesis of secondary metabolites, and biodegradation ofxenobiotics.

In the mitochondrial enzyme network, five out of the ninemodules, module 1, 2, 4, 5, 6, had their dominant KEGG pathway(Fig. 4A). Enzymes involved in TCA cycle were distributed in threemodules, module 1, 2 and 3. Module 1 was matched to carbohydratemetabolism and amino-acid metabolism; module 2 was the largestone, associated with three pathways including carbohydrate metab-olism, amino-acid metabolism, lipid metabolism; and almost all theenzymes in module 3 participate in the TCA cycle.

In the mitochondrial compound network, most of the twelvemodules, module 2, 4, 5, 6, 7, 8, 10, 11 and 12, had their dominantKEGG pathway (Fig. 4B). Especially, module 4 and 7 were concom-itantly matched to lipid metabolism while module 8 and 12 matchedto nucleotide metabolism. Inversely, it was noticeable that com-pounds involved in TCA cycle were also distributed in three modules,1, 2 and 9. Module 1 was the largest module mainly related to threepathways including carbohydrate metabolism, amino-acid metabo-

lism, metabolism of cofactors and vitamins. Module 9 contained mostTCA cycle related compounds, and nearly half of the compounds inthis module were involved in TCA cycle. The compounds of TCA cyclein module 2, 1, 9 were respectively related to enzymes in module 1, 2,3 of the enzyme network.

In general, these results showed that the mitochondrial moduleswere quite pathway-specific. Sincemodules represent the grouping ofreactions based on their connections and reflect in some degree thecoordination of thewholemetabolism, it seems that themitochondrialnetwork is organized with a modular structure, although the overallcomplexity is reduced with fewer reactions and absence of somepathways.

Discussion

In the present work, we reconstructed the metabolic networks ofmitochondria and 22 relative species. The network comparison resultssupported the serial endosymbiont hypothesis of mitochondria andthe closest evolutionary relationship between mitochondria andRickettsiae from the viewpoint of metabolic networks. Moreover,the centrality and modularity of mitochondria provided clues to theunderstanding of the organization principles of mitochondria.

Although an abundance of sequence data, including genomeorganization, gene order and nucleotide and amino acid sequence,can be used to deduce evolutionary relationships, the extensivereshuffling of mitochondrial genes limits the usefulness of organiza-tion and order information, and the resolving power of single-gene

Fig. 4. Functional modules of metabolic enzyme (A) and compound (B) networks.

342 X. Chang et al. / Genomics 95 (2010) 339–344

analyses is limited by the inherently small information content ofindividual genes, complicated in the particular case of mitochondriaby extreme differences in base composition and in the rate ofsequence divergence of mtDNA-encoded genes in different eukaryoticlines [11]. Considering that cellular metabolism is a complex networkof physico-chemical processes constructed by sequences of biochem-ical reactions, we resorted to the enzyme and the compoundinformation at the level of networks instead of sequence level.

According to the topological analyses based on network alignmentand motif identification, the mitochondria are evolutionarily similarto the rickettsial subdivision of the alpha-proteobacteria, supportingthe mitochondria endosymbiont hypothesis from the point of view ofbiological network for the first time. Among the Rickettsiae, themitochondria seems much closer to Rickettsia prowazekii based onnetwork alignment, which is consistent with the phylogenetic resultson genome sequences [26]. While, the phylogenetic analysis based onenzyme content showed a clear separation of mitochondria from theother three groups of species, Archaea, Rickettsiae and non-paraciticbacteria (Fig. 1), suggesting that mitochondrial enzymes underwentgreater changes during evolution than the topology of their metabolicnetworks did. We attributed this phenomenon to the enzymeinstability during the evolution. Recent studies on the mechanisms

of pathways evolution show that about 20% enzyme superfamilies arequite variable, and these variable superfamilies account for nearly halfof all known reactions [32]. The changes of enzymes could beoriginated by the specifization of multifunctional enzymes, orrecruitment of exogenous ones [32,33]. It was proposed that themost commonly occurring metabolites provide a helping hand for thechanges of enzymes since they can be accommodated by manyenzyme superfamilies. New or modified pathways thus prefer toevolve around central metabolites, thereby keeping the overalltopology and the function of the metabolic network [13].

According to Antoine Danchin in his book The Delphic Boat, whatmatters is the relationship and organization of the system, but notactual original components, whether the system is a boat or anorganism. In our case, the metabolic network is like the ‘boat’(system) and the enzymes are like the ‘planks’ (components). Thetopological properties are more related to the overall function, andthus conserved between mitochondria and Rickettsiae; while theenzyme content is more relevant to detailed or specialized function,and therefore different between mitochondria and Rickettsiae.

The simplest building blocks of the complex networks, the motifs,such as triads, are believed to carry significant information about theirfunction and overall organization [34–37]. For instance, the triads 7

343X. Chang et al. / Genomics 95 (2010) 339–344

(Fig. 3A)may correspondwith feed-forward loops and perform signal-processing tasks such as persistence detection, pulse generation, andacceleration of transcription responses [36]. We proposed theunidirectional triads and bidirectional triads enriched in two differentclasses based on the triad significance profile for mitochondria and itsrelative species (Figs. 2B, C) might be related to the more extensiveexistence of reversible biochemical reactions. The singular classifica-tion of triad 4 in these species also needs further studies.

This paper studied the mitochondria evolution at the networklevel and investigated the network properties of mitochondrial wholemetabolic network and TCA sub-network, which provided clues to thestudies of mitochondria on both the evolution and the function fromthe angle of systems biology. We believe that the mitochondrialsystems biology will help to unravel the evolutionary origin ofmitochondria, the mechanism of evolution events, the specializedfunction as an organelle, the related disorders, and ultimately leadingto the practical therapy.

Material and methods

Dataset preparation and network construction

The mitochondrial pathway data were extracted from the Saccharo-myces Genome Database (http://www.yeastgenome.org/) and Com-prehensive Yeast Genome Database (http://mips.gsf.de/genre/proj/yeast/). The enzymes and reactions of 22 selected organisms(seeSupplementary Table 1) were obtained from the KEGG LIGAND database(ftp://ftp.genome.jp/pub/kegg/ligand/). We considered two comple-mentary representations of a metabolic network. First, the compoundgraph, (Vs, Es), was constructed with chemical compounds as nodes andtransitions as edges. Two substrates s1, s2 are joined by an edge, whenthey occur in the same chemical reaction. The enzyme graph,Ge=(Ve,Ee),has a vertex set Ve consisting of all enzymes in the network. Two enzymese1, e2 are joined by an edge ,if e1 catalyzes a reaction whose product istaken by e2 as a substrate. The “current metabolites”, which is referred toas cofactors in biochemistry such as ATP, ADP, NADP, H2O,were removedfrom our networks in order to avoid the unrealistic definition of the pathlength of network according to the previous work.

Phylogenetic analysis of enzyme content

The distance of enzyme content between two organisms is defined

as dij = − lnðsijÞ = − lnnijffiffiffi

2p

ninj=ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffin2i + n2

j

q0B@

1CA. sij is the Korbel's

definition to describe the similarity of a pair of objects. ni, nj are thenumbers of enzymes in organism i and j, respectively, and nij is thenumber of shared enzymes in the two compared organisms. The valueof dij is between 0 and infinity. A distance matrix containing thedistance information of all studied organisms is obtained according tothis definition. The phylogenetic tree is then constructed based on thedistance matrix using the software package PHYLIP.

Significance of network motif profiles

Network motifs are the subgraph patterns that occur in complexnetworks at much higher frequency than expected in randomizednetworks [35,38]. The statistical significance of subgraph i is evaluated bythe Z score, Zi=(Nreali−bNrandiN)/std(Nrandi) where Nreali is thenumberof times the subgraphappears in thenetwork, andbNrandliN andstd (Nrandli ) are the mean and standard deviation of its appearances inthe randomized networks with the same degree sequence as the realnetwork. Considering that motifs in large networks tend to displayhigher Z scores than motifs in small networks, the vector of Z scorescalculated from different species is normalized to length 1, which is

termed as significance profile (SP): SPi=Zi/(∑Zi2)1/2. Average-link

algorithm is used to produce hierarchical classification from thecorrelation coefficient matrix of SPs [13].

Cross-species analysis based on network alignment

The similarity of two networks i and j is defined as sij sij =∑n

2ki∩kjki + kjffiffiffiffiffiffiffiffiffiffiffiffiGi × Gj

paccording to thenetworkalignment algorithmraised byZhang et al [28],where the two networks contain Gi and Gj vertices respectively, with nshared vertices, and each shared vertex has ki and kj neighbors.Therefore the topological distance between the two graphs can bedefined as:

dij = 1−∑n

2ki∩kjki + kjffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi

Gi × Gj

q

Degree centrality and Module detection

For a graph G=(V,E) with n vertices, the degree centrality CD(v)for vertex v is: CDðvÞ = degðvÞ

n−1 . Let v⁎ be the node with highest degreecentrality in G,Then the degree centrality of the graph G is defined as

follows: CD Gð Þ =∑j V j

i=1CD v*ð Þ−CD við Þ½ �

n−2 [39]. The degree centrality iscalculated by the social network analysis software pajek.

Themodules are detected by themethod of Newman [31], which isimplemented by R package igraph.

Acknowledgements

This work was supported by grants from Shanghai Institutesfor Biological Sciences, Chinese Academy of Sciences (2008KIP207),the National “973” Basic Research Program (2006CB0D1203,2006CB0D1205), the National Natural Science Foundation of China(30770497), the National Natural Science Foundation of China(30800199) and the National Key Technologies R&D Program(2007AA02Z331) andNational S& TMajor Project (2009ZX10004-302).

Appendix A. Supplemental data

Supplementary data associated with this article can be found, inthe online version, at doi:10.1016/j.ygeno.2010.03.006.

References

[1] L. Margulis, Symbiosis in Cell Evolution, Freeman, San Francisco, 1981.[2] D. Yang, Y. Oyaizu, H. Oyaizu, G.J. Olsen, C.R. Woese, Mitochondrial origins, Proc.

Natl. Acad. Sci. U. S. A. 82 (1985) 4443–4447.[3] M.W. Gray, D.F. Spencer, Evolution of Microbial Life, in: D.M. Roberts, P. Sharp, G.

Alderson, M. Collins (Eds.), Society for General Microbiology Symposia, Cam-bridge Univ Press, Cambridge, 1996, pp. 107–126.

[4] O. Karlberg, B. Canback, C.G. Kurland, S.G. Andersson, The dual origin of the yeastmitochondrial proteome, Yeast 17 (2000) 170–187.

[5] C.G. Kurland, S.G. Andersson, Origin and evolution of themitochondrial proteome,Microbiol Mol Biol Rev 64 (2000) 786–820.

[6] W.E. Walden, From bacteria to mitochondria: aconitase yields surprises, Proc.Natl. Acad. Sci. U S A 99 (2002) 4138–4140.

[7] C. Schnarrenberger, W. Martin, Evolution of the enzymes of the citric acid cycleand the glyoxylate cycle of higher plants. A case study of endosymbiotic genetransfer, Eur. J. Biochem. 269 (2002) 868–883.

[8] M.W. Gray, G. Burger, B.F. Lang, The origin and early evolution of mitochondria,Genome Biol. 2 (2001) REVIEWS1018.

[9] S.D. Dyall, M.T. Brown, P.J. Johnson, Ancient invasions: from endosymbionts toorganelles, Science 304 (2004) 253–257.

[10] T. Gabaldon, M.A. Huynen, Reconstruction of the proto-mitochondrial metabo-lism, Science 301 (2003) 609.

[11] M.W. Gray, G. Burger, B.F. Lang, Mitochondrial evolution, Science 283 (1999)1476–1481.

[12] Z. Wang, et al., Exploring photosynthesis evolution by comparative analysis ofmetabolic networks between chloroplasts and photosynthetic bacteria, BMCGenomics 7 (2006) 100.

344 X. Chang et al. / Genomics 95 (2010) 339–344

[13] H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, A.L. Barabasi, The large-scaleorganization of metabolic networks, Nature 407 (2000) 651–654.

[14] J. Podani, et al., Comparable system-level organization of Archaea and Eukaryotes,Nat. Genet. 29 (2001) 54–56.

[15] H.Ma, A.P. Zeng, Reconstruction ofmetabolic networks fromgenomedata and analysisof their global structure for various organisms, Bioinformatics 19 (2003) 270–277.

[16] H.W. Ma, A.P. Zeng, Phylogenetic comparison of metabolic capacities of organismsat genome level, Mol. Phylogenet. Evol. 31 (2004) 204–213.

[17] M.L. Mo, N. Jamshidi, B.O. Palsson, A genome-scale, constraint-based approach tosystems biology of human metabolism, Mol. Biosyst. 3 (2007) 598–603.

[18] T.D. Vo, H.J. Greenberg, B.O. Palsson, Reconstruction and functional characteriza-tion of the human mitochondrial metabolic network based on proteomic andbiochemical data, J. Biol. Chem. 279 (2004) 39532–39540.

[19] J.L. Reed, T.D. Vo, C.H. Schilling, B.O. Palsson, An expanded genome-scale model ofEscherichia coli K-12 (iJR904 GSM/GPR), Genome Biol. 4 (2003) R54.

[20] I. Thiele, T.D. Vo, N.D. Price, B.O. Palsson, Expanded metabolic reconstruction ofHelicobacter pylori (iIT341 GSM/GPR): an in silico genome-scale characterizationof single- and double-deletion mutants, J. Bacteriol. 187 (2005) 5818–5830.

[21] T.D. Vo, B.O. Palsson, Isotopomer analysis of myocardial substrate metabolism: asystems biology approach, Biotechnol. Bioeng 95 (2006) 972–983.

[22] N.C. Duarte, et al., Global reconstruction of the human metabolic network based ongenomic and bibliomic data, Proc. Natl. Acad. Sci. U S A 104 (2007) 1777–1782.

[23] T.D. Vo, B.O. Palsson, Building the power house: recent advances in mitochondrialstudies through proteomics and systems biology, Am. J. Physiol. Cell Physiol. 292(2007) C164–C177.

[24] J.O. Korbel, B. Snel, M.A. Huynen, P. Bork, SHOT: a web server for the constructionof genome phylogenies, Trends Genet. 18 (2002) 158–162.

[25] S.G. Andersson, O. Karlberg, B. Canback, C.G. Kurland, On the origin of mitochondria:a genomics perspective, Philos. Trans. R. Soc. Lond. B. Biol. Sci. 358 (2003) 165–177discussion 177-9.

[26] S.G. Andersson, et al., The genome sequence of Rickettsia prowazekii and theorigin of mitochondria, Nature 396 (1998) 133–140.

[27] R. Sharan, T. Ideker, Modeling cellular machinery through biological networkcomparison, Nat. Biotechnol. 24 (2006) 427–433.

[28] Y. Zhang, et al., Phylophenetic properties of metabolic pathway topologies asrevealed by global analysis, BMC Bioinf. 7 (2006) 252.

[29] D. Zhu, Z.S. Qin, Structural comparison of metabolic networks in selected singlecell organisms, BMC Bioinf. 6 (2005) 8.

[30] R. Milo, et al., Superfamilies of evolved and designed networks, Science 303(2004) 1538–1542.

[31] M.E. Newman, M. Girvan, Finding and evaluating community structure innetworks, Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 69 (2004) 026113.

[32] E. Zientz, T. Dandekar, R. Gross, Metabolic interdependence of obligateintracellular bacteria and their insect hosts, Microbiol. Mol. Biol. Rev. 68 (2004)745–770.

[33] S. Roy, Multifunctional enzymes and evolution of biosynthetic pathways: retro-evolution by jumps, Proteins 37 (1999) 303–309.

[34] S.S. Shen-Orr, R. Milo, S. Mangan, U. Alon, Network motifs in the transcriptionalregulation network of Escherichia coli, Nat. Genet. 31 (2002) 64–68.

[35] R. Milo, et al., Network motifs: simple building blocks of complex networks,Science 298 (2002) 824–827.

[36] U. Alon, Network motifs: theory and experimental approaches, Nat. Rev. Genet.8 (2007) 450–461.

[37] A. Vazquez, et al., The topological relationship between the large-scale attributesand local interaction patterns of complex networks, Proc. Natl. Acad. Sci. U S A 101(2004) 17940–17945.

[38] S. Maslov, K. Sneppen, Specificity and stability in topology of protein networks,Science 296 (2002) 910–913.

[39] L. Freeman, Centrality in social networks: Conceptual clarification, SocialNetworks 1 (1979) 215–239.