1
M ITOCHONDRIAL P ROTEOMICS OF A S ECONDARY G REEN A LGA Emilie P EREZ 1,2 , Hervé D EGAND 3 , Pierre M ORSOMME 3 , Claire R EMACLE 2 , Pierre CARDOL 2 , Denis B AURAIN 1 1 Phylogénomique des Eucaryotes, Institut de Botanique, Université de Liège, Belgium 2 Génétique des Microorganismes, Institut de Botanique, Université de Liège, Belgium 3 SST/ISV, Institut des Sciences de la Vie, Université Catholique de Louvain, Belgium Euglena gracilis, a secondary green alga Figure 1: Primary endosymbiosis. Origin of photosynthetic cells. D URING evolution, some eukaryotic lineages have acquired the ability to photosynthesize through endosymbiosis and conversion of the endosymbiont into a plastid. The endosymbiosis is termed pri- mary when the endosymbiont is a cyanobacterium whereas it is termed secondary when it is a eukaryotic alga. Euglena gracilis derives from a secondary en- dosymbiosis between its phagotrophic ancestor and a green alga. It is thus a kind of repetition of the much older primary endosymbiosis. Figure 2: Secondary endosymbiosis. Origin of Euglenids. C ONSIDERED as a secondary green alga, Euglena gracilis belongs to the phylum of euglenids, which are part of the larger group Euglenozoa (supergroup Excavates). Also included in Euglenozoa are kineto- plastids, which encompass trypanosoma (the para- sites causing sleeping sickness, among other diseases). Algae such as Euglena gracilis are termed "complex" because they are genetic chimeras. Due to their par- ticular history, their genes may come from Euglenozoa but also from the green lineage (or possibly from other lineages such as red algae). T HE principal metabolic pathways, especially those ensuring cellular energetics, are mainly found in the plastid and the mitochondrion. Our general objective is to study the interactions established between these two organelles during secondary endosymbiosis and to determine the phylogenetic origin of the genes en- coding proteins involved in these interactions. As a first step, we performed a high-throughput analysis of the mitochondrial proteome of Euglena gracilis. Methods T HE mass spectrometry analysis of proteins in sam- ples of mitochondria extracted from Euglena gra- cilis allowed us to identify several hundred protein frag- ments. First, the peptides obtained by MS/MS were compared to a translated database of publicly avail- able Euglena gracilis ESTs. Then, the ESTs matching one or more peptides were annotated with BLAST against the NCBI non-redundant protein database. This strategy led to the identification of most protein fragments obtained by mass spectrometry. Subcellular localization of identified protein fragments 0 50 100 150 200 250 1e-50 1e-20 1e-10 1e-03 1 # identified protein fragments E-value unknown others mitochondrion cytoplasm Figure 3: Subcellular localization of protein fragments broken by E-value. 1e-50: E-value <= 1e-50, 1e-20: 1e-50 < E-value <= 1e-20, 1e-10: 1e-20 < E-value <= 1e-10, 1e-03: 1e-10 < E-value <= 1e-03 and 1: 1e-03 < E-value <= 1. W E can evaluate the quality of our analysis by fo- cusing on the subcellular localization of these protein fragments. Here, we see that for the best E- values (< 1e-10), the vast majority of protein fragments are localized in the mitochondrion. In contrast, for low (> 1e-10) or non-significant E-values (> 1e-03), the mi- tochondrial subcellular localization is much rarer, thus indicating that the identification is probably incorrect. Furthermore, we observe that there are very few iden- tifications with non-significant E-values (about 70 of nearly 500 identified protein fragments). Taxonomic affiliation of identified protein fragments 0 50 100 150 200 250 1e-50 1e-20 1e-10 1e-03 1 # identified protein fragments E-value unknown prokaryotes others unikonts Euglenozoa reds greens Figure 4: Taxonomic affiliation of protein fragments broken by E-value. 1e-50: E-value <= 1e-50, 1e-20: 1e-50 < E-value <= 1e-20, 1e-10: 1e-20 < E-value <= 1e-10, 1e-03: 1e-10 < E-value <= 1e-03 and 1: 1e-03 < E-value <= 1. W HEN we examine the group of organisms corre- sponding to the first BLAST hits, results are incon- clusive. Indeed, there are relatively few organisms be- longing to Euglenozoa, contrary to what is expected, and this trend persists over all ranges of E-value. Mor- ever, we observe a relatively large proportion of hits corresponding to red algae and allies ("reds"), while hits from green algae and land plants ("greens") are much less abundant. Though already reported in Eu- glena gracilis, this remains to explained. Coverage of the mitochondrial metabolic pathways F INALLY , we searched for mitochondrial metabolic pathways present in our set of identified proteins. oxidative phosphorylation & ATP synthesis 34 Krebs & glyoxylate cycles 10 fatty acid & lipid metabolisms 18 pyruvate decarboxylation 5 intermembrane space matrix inner membrane outer membrane amino acid metabolism 12 photo- respiration 3 cellular protection mechanisms 7 DNA regulation & translation 7 other pathways 7 indeterminate pathways 18 transporters 13 Figure 5: Coverage of the mitochondrial metabolic pathways. Number of identified proteins for each pathway. F OR each pathway, we report the number of different identified proteins. Most mitochondrial metabolic pathways are represented by many of their proteins. The most represented pathway is oxidative phospho- rylation associated with ATP synthesis (34 identified components), which was expected since this is the most important pathway in terms of components (the four complexes of the respiratory chain and ATP syn- thetase are mostly composed of several tens of pro- teins). Thus, we identify a total of 134 mitochondrial proteins involved in various metabolic pathways. Preprocessing of the EST database 0 0.02 0.04 0.06 0.08 0.1 0 100 200 300 400 500 600 700 frequency length (aa) translated ESTs length distribution raw ESTs EST contigs Figure 6: Translated ESTs length distribution. Lengths are those of the longest ORF for each EST. A MONG the approaches likely to increase the num- ber of identified proteins in the mass spectrometry analysis, the preprocessing of the public database of ESTs from Euglena gracilis is the easiest to implement. Indeed, assembling EST contigs with cap3 allowed us to obtain longer ESTs, thus providing better identifica- tion of protein fragments. Systems Biology Workshop of the Great Region University, Lultzhausen, Luxembourg, June 19–20, 2012 PhD student funded by

, Hervé D , Pierre M , Claire R , Denis Bunikonts Euglenozoa reds greens Figure 4: Taxonomic affiliation of protein fragments broken by E-value. 1e-50: E-value

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: , Hervé D , Pierre M , Claire R , Denis Bunikonts Euglenozoa reds greens Figure 4: Taxonomic affiliation of protein fragments broken by E-value. 1e-50: E-value

MITOCHONDRIAL PROTEOMICSOF A SECONDARY GREEN ALGA

Emilie PEREZ1,2, Hervé DEGAND3, Pierre MORSOMME3, Claire REMACLE2,Pierre CARDOL2, Denis BAURAIN1

1Phylogénomique des Eucaryotes, Institut de Botanique, Université de Liège, Belgium2Génétique des Microorganismes, Institut de Botanique, Université de Liège, Belgium

3SST/ISV, Institut des Sciences de la Vie, Université Catholique de Louvain, Belgium

Euglena gracilis,a secondary green alga

Figure 1: Primary endosymbiosis. Origin of photosynthetic cells.

DURING evolution, some eukaryotic lineages haveacquired the ability to photosynthesize through

endosymbiosis and conversion of the endosymbiontinto a plastid. The endosymbiosis is termed pri-mary when the endosymbiont is a cyanobacteriumwhereas it is termed secondary when it is a eukaryoticalga. Euglena gracilis derives from a secondary en-dosymbiosis between its phagotrophic ancestor anda green alga. It is thus a kind of repetition of the mucholder primary endosymbiosis.

Figure 2: Secondary endosymbiosis. Origin of Euglenids.

CONSIDERED as a secondary green alga, Euglenagracilis belongs to the phylum of euglenids, which

are part of the larger group Euglenozoa (supergroupExcavates). Also included in Euglenozoa are kineto-plastids, which encompass trypanosoma (the para-sites causing sleeping sickness, among other diseases).Algae such as Euglena gracilis are termed "complex"because they are genetic chimeras. Due to their par-ticular history, their genes may come from Euglenozoabut also from the green lineage (or possibly from otherlineages such as red algae).

THE principal metabolic pathways, especially thoseensuring cellular energetics, are mainly found in the

plastid and the mitochondrion. Our general objectiveis to study the interactions established between thesetwo organelles during secondary endosymbiosis andto determine the phylogenetic origin of the genes en-coding proteins involved in these interactions. As a firststep, we performed a high-throughput analysis of themitochondrial proteome of Euglena gracilis.

Methods

THE mass spectrometry analysis of proteins in sam-ples of mitochondria extracted from Euglena gra-

cilis allowed us to identify several hundred protein frag-ments. First, the peptides obtained by MS/MS werecompared to a translated database of publicly avail-able Euglena gracilis ESTs. Then, the ESTs matchingone or more peptides were annotated with BLASTagainst the NCBI non-redundant protein database.This strategy led to the identification of most proteinfragments obtained by mass spectrometry.

Subcellular localizationof identified protein fragments

0

50

100

150

200

250

1e-5

0

1e-2

0

1e-1

0

1e-0

3 1

# id

entif

ied

prot

ein

frag

men

ts

E-value

unknownothersmitochondrioncytoplasm

Figure 3: Subcellular localization of protein fragments broken byE-value. 1e-50: E-value <= 1e-50, 1e-20: 1e-50 < E-value <= 1e-20,1e-10: 1e-20 < E-value <= 1e-10, 1e-03: 1e-10 < E-value <= 1e-03and 1: 1e-03 < E-value <= 1.

WE can evaluate the quality of our analysis by fo-cusing on the subcellular localization of these

protein fragments. Here, we see that for the best E-values (< 1e-10), the vast majority of protein fragmentsare localized in the mitochondrion. In contrast, for low(> 1e-10) or non-significant E-values (> 1e-03), the mi-tochondrial subcellular localization is much rarer, thusindicating that the identification is probably incorrect.Furthermore, we observe that there are very few iden-tifications with non-significant E-values (about 70 ofnearly 500 identified protein fragments).

Taxonomic affiliationof identified protein fragments

0

50

100

150

200

250

1e-5

0

1e-2

0

1e-1

0

1e-0

3 1

# id

entif

ied

prot

ein

frag

men

ts

E-value

unknownprokaryotesothersunikontsEuglenozoaredsgreens

Figure 4: Taxonomic affiliation of protein fragments broken byE-value. 1e-50: E-value <= 1e-50, 1e-20: 1e-50 < E-value <= 1e-20,1e-10: 1e-20 < E-value <= 1e-10, 1e-03: 1e-10 < E-value <= 1e-03and 1: 1e-03 < E-value <= 1.

WHEN we examine the group of organisms corre-sponding to the first BLAST hits, results are incon-

clusive. Indeed, there are relatively few organisms be-longing to Euglenozoa, contrary to what is expected,and this trend persists over all ranges of E-value. Mor-ever, we observe a relatively large proportion of hitscorresponding to red algae and allies ("reds"), whilehits from green algae and land plants ("greens") aremuch less abundant. Though already reported in Eu-glena gracilis, this remains to explained.

Coverage of the mitochondrialmetabolic pathways

FINALLY, we searched for mitochondrial metabolicpathways present in our set of identified proteins.

oxidative phosphorylation & ATP synthesis

34

Krebs & glyoxylate

cycles

10

fatty acid & lipid

metabolisms

18

pyruvate decarboxylation

5

intermembrane space

matrix

inner membrane outer membrane

amino acid metabolism

12

photo-respiration

3

cellular protection

mechanisms

7

DNA regulation & translation

7

other pathways

7

indeterminate pathways

18 transporters

13

Figure 5: Coverage of the mitochondrial metabolic pathways.Number of identified proteins for each pathway.

FOR each pathway, we report the number of differentidentified proteins. Most mitochondrial metabolic

pathways are represented by many of their proteins.The most represented pathway is oxidative phospho-rylation associated with ATP synthesis (34 identifiedcomponents), which was expected since this is themost important pathway in terms of components (thefour complexes of the respiratory chain and ATP syn-thetase are mostly composed of several tens of pro-teins). Thus, we identify a total of 134 mitochondrialproteins involved in various metabolic pathways.

Preprocessing of the EST database

0

0.02

0.04

0.06

0.08

0.1

0 100 200 300 400 500 600 700

freq

uenc

y

length (aa)

translated ESTs length distribution

raw ESTsEST contigs

Figure 6: Translated ESTs length distribution. Lengths are thoseof the longest ORF for each EST.

AMONG the approaches likely to increase the num-ber of identified proteins in the mass spectrometry

analysis, the preprocessing of the public database ofESTs from Euglena gracilis is the easiest to implement.Indeed, assembling EST contigs with cap3 allowed usto obtain longer ESTs, thus providing better identifica-tion of protein fragments.

Systems Biology Workshop of the Great Region University, Lultzhausen, Luxembourg, June 19–20, 2012 PhD student funded by