28
BioMed Central Page 1 of 28 (page number not for citation purposes) BMC Genomics Open Access Research article Molecular phenotype of zebrafish ovarian follicle by serial analysis of gene expression and proteomic profiling, and comparison with the transcriptomes of other animals Anja Knoll-Gellida 1 , Michèle André 1 , Tamar Gattegno 2 , Jean Forgue 1 , Arie Admon 2 and Patrick J Babin* 1 Address: 1 Génomique et Physiologie des Poissons, UMR NUAGE, Université Bordeaux 1, 33405 Talence cedex, France and 2 Department of Biology, Technion-Israel Institute of Technology, Haifa 32000, Israel Email: Anja Knoll-Gellida - [email protected]; Michèle André - [email protected]; Tamar Gattegno - [email protected]; Jean Forgue - [email protected]; Arie Admon - [email protected]; Patrick J Babin* - [email protected] * Corresponding author Abstract Background: The ability of an oocyte to develop into a viable embryo depends on the accumulation of specific maternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) was carried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). The data obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published or available in public sequence databases. Results: Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags with an occurrence superior to one. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34% of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta- actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressed sequence tags sequencing approach revealed highly expressed transcripts that were not previously known to be expressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A higher sensitivity for the detection of transcripts with a characterized maternal genetic contribution was also demonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shock protein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, together with 40 S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg. Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive value with respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grown zebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokine and nothepsin. Conclusion: This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cells at the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarian expressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcripts or proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described provides groundwork for future experimental approaches aimed at identifying functionally important stored maternal transcripts and proteins involved in oogenesis and early stages of embryo development. Published: 09 March 2006 BMC Genomics 2006, 7:46 doi:10.1186/1471-2164-7-46 Received: 07 December 2005 Accepted: 09 March 2006 This article is available from: http://www.biomedcentral.com/1471-2164/7/46 © 2006 Knoll-Gellida et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

  • Upload
    vukiet

  • View
    218

  • Download
    0

Embed Size (px)

Citation preview

Page 1: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BioMed CentralBMC Genomics

ss

Open AcceResearch articleMolecular phenotype of zebrafish ovarian follicle by serial analysis of gene expression and proteomic profiling, and comparison with the transcriptomes of other animalsAnja Knoll-Gellida1, Michèle André1, Tamar Gattegno2, Jean Forgue1, Arie Admon2 and Patrick J Babin*1

Address: 1Génomique et Physiologie des Poissons, UMR NUAGE, Université Bordeaux 1, 33405 Talence cedex, France and 2Department of Biology, Technion-Israel Institute of Technology, Haifa 32000, Israel

Email: Anja Knoll-Gellida - [email protected]; Michèle André - [email protected]; Tamar Gattegno - [email protected]; Jean Forgue - [email protected]; Arie Admon - [email protected]; Patrick J Babin* - [email protected]

* Corresponding author

AbstractBackground: The ability of an oocyte to develop into a viable embryo depends on the accumulation of specificmaternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) wascarried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). Thedata obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published oravailable in public sequence databases.

Results: Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags with an occurrencesuperior to one. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34%of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta-actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressedsequence tags sequencing approach revealed highly expressed transcripts that were not previously known to beexpressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A highersensitivity for the detection of transcripts with a characterized maternal genetic contribution was alsodemonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shockprotein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, togetherwith 40 S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg.Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive valuewith respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grownzebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokineand nothepsin.

Conclusion: This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cellsat the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarianexpressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcriptsor proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described providesgroundwork for future experimental approaches aimed at identifying functionally important stored maternaltranscripts and proteins involved in oogenesis and early stages of embryo development.

Published: 09 March 2006

BMC Genomics 2006, 7:46 doi:10.1186/1471-2164-7-46

Received: 07 December 2005Accepted: 09 March 2006

This article is available from: http://www.biomedcentral.com/1471-2164/7/46

© 2006 Knoll-Gellida et al; licensee BioMed Central Ltd.This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Page 1 of 28(page number not for citation purposes)

Page 2: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

BackgroundFolliculogenesis and oogenesis include the formation ofovarian follicles, the initiation and completion of meiosis,and the accumulation of specific information and mole-cules such as RNAs, proteins, or imprinted genes in thefemale germ cells to sustain embryo development to thestage where zygotic gene activation takes over [1-3].

The zebrafish, Danio rerio, is currently the most popularfish model in developmental and genomic analyses and agenome sequencing project is currently underway [4]. Alarge number of expressed sequence tags (ESTs) is alreadyavailable [5]. A number of methods are currently used forgene expression profiling. They differ in scale, economy,and sensitivity. Delineation of the transcriptome of teleostfish ovaries has been evaluated using large-scale ESTsequencing [6] of cDNA libraries of zebrafish [7] andAtlantic salmon [8,9] or subtractive hybridisation ofcDNA libraries [10] of medaka [11] or rainbow trout [12]gonads. While these methods give an idea of transcriptabundance or enrichment in a specific tissue, a few genesexpressed at high levels usually represent a large propor-tion of the total transcripts and are thus more frequentlyrepresented in the EST database [13].

Serial analysis of gene expression (SAGE) based on theenumeration of directionally reliable short cDNAsequences (tags), provides qualitative as well as quantita-tive analysis of a large number of genes in a defined tissue[14,15]. This is a method of choice for discovering novelgenes and spliced variants. This technique has beenwidely applied in human studies and various SAGE tags/SAGE libraries have been generated from different cells/tissues, including human oocytes [16,17], thus enablingthe successful identification of differentially expressedgenes in normal physiological processes and pathologicalconditions [18].

The aim of this work was to profile the transcriptome offully-grown zebrafish follicles using the SAGE methodand compare it with the protein repertoire determined atthe same stage of oogenesis after one- (1D) and two-dimensional (2D)-polyacrylamide-gel electrophoresis(PAGE) protein fractionation and in-gel proteolysis, fol-lowed by tandem mass spectrometry (MS/MS) identifica-tion of the resulting peptides [19]. The database generatedwas compared with several vertebrate ovary/unfertilizedegg transcriptomes generated with the large-scale ESTsequencing approach in order to identify functionallyimportant maternal transcripts and proteins stored ingerm cells.

ResultsZebrafish fully-grown follicle transcriptomeAn essential step in SAGE library analysis is the unambig-uous assignment of each SAGE tag to the correspondingmRNA or EST sequence. This tag-to-gene mappingrequires an initial in silico SAGE tag extraction of virtualtags found in public EST/cDNA sequence databases. Exist-ing web sites [20-23] provide the correspondence betweenSAGE tags and transcripts. A limited number of specieshave been subjected to SAGE analysis, as seen on theSAGEmap web site [20,24], which presents SAGE tag toUniGene mapping for eighteen species. As no previousstudy had used the SAGE strategy in any fish species, therewas no in silico SAGE tag database available for zebrafish.A generic computer package named FISHTAG wasdesigned to extract virtual SAGE tags from UniGene andTIGR EST databases and generate a zebrafish in silico SAGEtag database, ZEBRATAG. Applying SAGE to the zebrafishfully-grown ovarian follicle generated a catalogue of27,486 sequenced tags, ZEBRAOV. The list of these exper-imental SAGE tags and their relative frequencies weredeposited in the NCBI gene expression and hybridizationarray data repository (GEO) [25] under accession numberGSE3679 (Additional file 1: Table 1). Analysing these tagsled to the identification of 11,399 unique transcripts,including 3,329 tags with an occurrence superior to one.ZEBRAOV was subdivided into eighty-three abundanceclasses, from 1 to 284 tag copies per tag species (Figure 1),according to frequency of occurrence.

Around 87% (2,898) and 67% (2,223) of ZEBRAOV non-singleton tags were assigned using FISHTAG to at least oneUniGene (Build#87) or TIGR (release 16.0) cluster,respectively. Fifty-eight transcripts, identified by theirSAGE tags, were expressed at over 0.15%, i.e. the numberof times a tag was counted in ZEBRAOV was at least 42,(Table 1), accounted for 17.34% of the mRNA populationidentified and represented forty-six of the most abundantclasses (Figure 1). Among these SAGE tags, thirty-sevenwere assigned, using the FISHTAG software package withUniGene database Build#87 as a reference, to at least oneUniGene cluster in a correct position of sense (R1) or anti-sense (R1cr) transcript sequences, which were annotatedor not. Nineteen tags were multiple-matched, i.e. match-ing over one UniGene cluster. Two tags did not match anyreference cDNA sequences in GenBank™/EBI Data Bankafter extraction of their SAGE tags at the first three posi-tions and could not, thus, be classified. Unmatched SAGEtags in the SAGE library could be due to the presence ofgenes in the zebrafish genome or spliced variant tran-scripts that had not previously been identified by ESTdata. Some of these unidentified transcripts were widelyexpressed in the zebrafish fully-grown follicle transcrip-tome (see Table 1).

Page 2 of 28(page number not for citation purposes)

Page 3: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

Table 1: Transcripts expressed at over 0.15% in zebrafish fully-grown ovarian follicle as evaluated by the presence of their corresponding SAGE tag

Tag N° SAGE-tag Nbr Rank GenBank Cluster Gene Description

1 CATGAAATAAACGC

284 R1 BM141044 Dr.12439 Transcribed locus similar to rhamnose-binding lectin

2 CATGTAAGACATCC

226 R1 BC045879 Dr.1109 bactin2 Beta-actin2

3 CATGCAGTGCTTTT

209 R1 BM531116 Dr.46793 h2b Transcribed locus similar to H2B histone family

4 CATGAAAAAAAAAA

153 [146]

5 CATGTCCTTTGTCT

142 R1 BC049475 Dr.47289 mt2 Metallothionein 2

6 CATGTGAACTGTTT

139 [7]

7 CATGACCGTTTTTA

130 R1 BC078260 Dr.26975 cldnd Claudin d

8 CATGCAAAGGAAAC

124 R1 BC055519 Dr.47296 Zgc:66160

9 CATGCTAGCCTAAT

122 R1cr BM861564 Dr.5628 Similar to contains element MSR1 repetitive element/ZP1-related

10 CATGGTGAAACCTG

119 R1 AF068772 Dr.31066 hsp90b Heat shock protein 90-beta

11 CATGTTGATGAGGG

113 R1 BC044190 Dr.729 ldhb Lactate dehydrogenase B4

12 CATGTAAAACCAAA

108 R1 AF057040 Dr.25213 bactin1 Beta-actin1

13 CATGTATTTCACTA

98 R1 BC045894 Dr.1043 atp5g ATP synthase, H+ trans., mit. F0 complex, subunit c (subunit 9)

14 CATGGATATTTAAC

94 R1 BI979482 Dr.47212 Transcribed locus, moderately similar to thymosin beta 10

15 CATGCATTTTCTAA

90 R1cr BM571854 Dr.19916 zp2.3 Egg envelope protein ZP2 variant B [zp2.3]

15 CATGCATTTTCTAA

90 R1 BC093133 Dr.23439 zp2.4 Zp2.4 protein

16 CATGACAACTAAAA

89 R1 BC071411 Dr.989 Zgc:56419 similar to stress-associated endo. reticulum protein 1

17 CATGGTGTGATGGA

82 [5]

18 CATGGAAAAAAAAA

80 [38]

19 CATGAGAAACGTTT

77 R1 BC067692 Dr.26977 zp3b Zona pellucida glycoprotein 3b

20 CATGGTGGTGTTGA

75 R1 NM_131331 Dr.30322 zp3 Zona pellucida glycoprotein 3

21 CATGCAATCATATT

75 R1 BM533765 Dr.13590 Transcribed locus, moderately similar to zygote arrest 1

22 CATGCAGAAAAATA

73 [5]

23 CATGGAAGTTGCTT

72 [6]

24 CATGCAAATGTTTT

71 R1 CK699865 Dr.29860 Transcribed locus

25 CATGAAGTTACTGT

70 [3]

26 CATGCAGTACTGCG

66 R1 AY561514 Dr.30264 rpl3 Ribosomal protein L3

27 CATGCCACTCTTTT

66 [2]

28 CATGCCATAAGTGC

66 R1 BC054596 Dr.1012 Retinol dehydrogenase 10

29 CATGATATTAATAA

65 [3]

Page 3 of 28(page number not for citation purposes)

Page 4: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

30 CATGATTTTTTGAA

65 [5]

31 CATGTAAAGTCGCC

65 R1 BM778414 Dr.1428 rps15a Ribosomal protein S15a

32 CATGGTCTACACGT

64 R1cr BI325627 Dr.7383 Zgc:73301 similar to cytochrome oxidase III

33 CATGTAAATTTGAA

64 [4]

34 CATGCTTGTTGGAA

62 R1 BQ260230 Dr.29755 Wu:fe38a04 similar to RL31_HUMAN 60S ribosomal protein L31

35 CATGGGATTCGGAC

61 R1 BM102431 Dr.23788 gstp1 Glutathione S-transferase pi

36 CATGGGGTGCTTTT

61 R1 CO351922 Dr.14015 Im:7140333 weakly similar to latrophilin; lectomedin-1

37 CATGACATCACATT

60 [9]

38 CATGTGCTCCATAC

59 R1cr BC095579 Dr.5555 ccna1 Cyclin A1

39 CATGGCATTTTTAG

58 n.m.

40 CATGAGGCTGTCTG

55 R1 BC055524 Dr.24903 Zgc:66168 similar to ubiquitin; 40S ribosomal protein S27a

41 CATGTGTTCTGTAT

55 R1 BC095694 Dr.5890 Zgc:112226 similar to cathepsin S

42 CATGTCGATGATGG

54 R1cr BM103997 Dr.26808 rpl10a Ribosomal protein L10a

43 CATGTTAATAAAAG

53 R1 NM_198809 Dr.5605 tubb2 Tubulin beta 2

44 CATGAGGAAAGCTG

52 R1 BC071384 Dr.1335 rpl36 Ribosomal protein L36

45 CATGGGCTTCGGTC

52 [4]

46 CATGTGCTGCTTGT

51 R1 BC045278 Dr.6496 fth1 Ferritin, heavy polypeptide 1

47 CATGATCAATGTGA

50 R1 BC075915 Dr.1209 ssr2 Signal sequence receptor, beta

48 CATGTGAGGGCTCT

49 [6]

49 CATGATATGTATTT

48 [3]

50 CATGAATTGGAGGG

47 n.m.

51 CATGTCTGTTTAAC

47 R1 BC049058 Dr.11308 rplp0 Ribosomal protein, large, P0

52 CATGAAGCCCATTA

46 R1 BC048027 Dr.4947 cirbp Cold inducible RNA binding protein

53 CATGGTTCCTTGGC

46 R1 BQ617300 Dr.1334 rps25 Ribosomal protein S25

54 CATGTATCAGTTAT

46 [3]

55 CATGTTAGTGTGAC

46 [6]

56 CATGTGAGCCAAAT

45 [3]

57 CATGTTGTTTTTGT

43 [5]

58 CATGTAAATGAGAT

42 R1 BC054688 Dr.12480 Clone IMAGE:5604190, weakly similar to Cd27 binding protein

Nbr, is the number of times a tag was counted in ZEBRAOV; Rank, is tag position at sense (R1) or antisense (R1cr) corresponding mRNA sequence; The number in brackets indicates the number of corresponding UniGene clusters in case of a tag matching with more than one transcript, i.e. a multiple matched tag; n.m., no matched tag; Cluster, cluster accession number of zebrafish UniGene database Build#87.

Table 1: Transcripts expressed at over 0.15% in zebrafish fully-grown ovarian follicle as evaluated by the presence of their corresponding SAGE tag (Continued)

Page 4 of 28(page number not for citation purposes)

Page 5: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

The most abundant tag in ZEBRAOV was recovered 284times, thus representing 1.033% of the mRNA populationevaluated by the SAGE method (Table 1). This SAGE tagwas identified four times on chromosome 22 genomiccontig (GenBank (gb) accession numbergb|NW_634208|) and in two deduced transcript variants,gb|XM_702755| and gb|XM_704598|, the latter beingnearly identical to the consensus sequence of full-lengthovary cDNA entries gb|BM141044| and gb|CO351300|.These transcripts are part of a large conserved protein fam-ily with at least thirteen transcript variants centred onlocus LOC561392 of chromosome 22. Deduced proteinsequence gb|BM141044| was 41.6% identical in 125amino acid overlap with the N-terminal part of humanlatrophilin-2 precursor protein (UniProt (up) accessionnumber up|O95490|). It contains the galactose/rham-nose-binding lectin domain found in numerous proteinswith sugar binding properties (Pfam accession numberPF02140) [26], including the two domains found inrhamnose-binding lectins in catfish (up|Q9PVW8|) andrainbow trout (up|Q9IB51|, up|Q9IB52|, up|Q9IB53|)eggs. It should be noted that UniGene (ug) cluster

number ug|Dr.12439| enclosing SAGE tag N°1 was in facta chimera cluster also containing EST sequences from het-erogeneous nuclear ribonucleoprotein K (hnrpk) tran-script gb|NM_212994| and, therefore, could not be usedas a valid reference UniGene cluster. hnrpk is in factlocated on chromosome 8 in the zebrafish genome(Ensembl Gene ID ENSDARG00000018914).

Components of the cytoskeleton, β-actin 2 (bactin2)(SAGE tag N°2), β-actin 1 (bactin1) (SAGE tag N°12),tubulin β-2 (tubb2) (SAGE tag N°43), and a transcriptsimilar to thymosin beta 10 (SAGE tag N°14), as well asclaudin d (cldnd) (SAGE tag N°7), a constituent of thetight junction complex, were among the most highly-expressed transcripts. A simultaneous high expression ofdifferent isoforms of several zona pellucida (ZP) glyco-protein transcripts was also observed with common(SAGE tag N°15, zp2.3 or zp2.4 variant proteins) or dis-tinct (SAGE tag N°19, zp3b, and SAGE tag N°20, zp3) tags.

A transcribed locus (gb|XM_701504|) similar to the H2B(h2b) histone family and derived from an annotated

Distribution of ZEBRAOV library SAGE tags by abundance classesFigure 1Distribution of ZEBRAOV library SAGE tags by abundance classes. The library was subdivided into 83 abundance classes, from 1 to 284 tag copies per tag species, according to the occurrence frequency of each tag species. Tag frequencies were plotted on a logarithmic scale on the x-axis against the number of SAGE tag species in each class on the y-axis.

1

10

100

1000

10000

110100500Tag frequency(numberof copies / tag species)

Num

bero

f SAG

E ta

g sp

ecie

s/ c

lass

Page 5 of 28(page number not for citation purposes)

Page 6: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

genome sequence (gb|NW_649547|) contained the thirdmost abundant annotated SAGE tag in the correct senseposition. The corresponding UniGene cluster

ug|Dr.46793| was a chimera cluster, while the correct ESTcluster of this transcribed portion of the zebrafish genomewas located in TIGR (TC291160). We also observed that a

Transcriptional activity of zebrafish fully-grown follicles and ovaries evaluated by the composition of the ZEBRAOV SAGE tag database and EST clusters in two selected cDNA librariesFigure 2Transcriptional activity of zebrafish fully-grown follicles and ovaries evaluated by the composition of the ZEBRAOV SAGE tag database and EST clusters in two selected cDNA libraries. The cumulative percentage of tag species in the SAGE tag library (solid line) or number of EST clusters in cDNA libraries ID.9767 (dotted line) and ID.15519 (dashed line) (representing the number of transcripts identified), were plotted in order of abundance on the x-axis against the cumulative percentage of the corresponding total tag or EST counts (representing the total amount of sampled transcriptome in SAGE or dbEST libraries) on the y-axis.

Cumulative% of SAGE tag species or EST clusters

Cum

ulat

ive%

of t

otal

SA

GE

tags

or E

STss

eque

nced

100

80

60

40

20

0100806040200

Page 6 of 28(page number not for citation purposes)

Page 7: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

Page 7 of 28(page number not for citation purposes)

Table 2: The most abundant annotated SAGE tags in zebrafish fully-grown follicles and their corresponding UniGene cluster occurrences in two selected zebrafish ovary cDNA libraries

Tag N° Cluster* Description SAGE % ID.9767 % ID.15519 %

A. Occurrences > 0.15% in ZEBRAOV and cDNA libraries ID.9767 and ID.155192 Dr.1109 Beta-actin2 0.822 0.317 0.7529 Dr.5628 Similar to contains element MSR1 repetitive

element/ZP1-related0.444 0.220 2.517

12 Dr.25213 Beta-actin1 0.393 0.326 0.69815 Dr.19916 Egg envelope protein ZP2 variant B [zp2.3] 0.327 0.397 2.61015 Dr.23439 Zp2.4 protein 0.327 0.740 11.36719 Dr.26977 Zona pellucida glycoprotein 3b 0.280 0.264 0.39920 Dr.30322 Zona pellucida glycoprotein 3 0.275 0.705 4.44443 Dr.5605 Tubulin beta 2 0.193 0.290 0.852

B. Occurrences > 0.15% in ZEBRAOV and cDNA library ID.97671 Dr.12439 Transcribed locus similar to rhamnose-

binding lectin1.033 2.530 0.046

3 Dr.46793 Transcribed locus similar to H2B histone family

0.760 0.652 0.008

7 Dr.26975 Claudin d 0.473 0.264 0.0238 Dr.47296 Zgc:66160 0.451 0.185 0.01514 Dr.47212 Transcribed locus, moderately similar to

thymosin beta 100.342 0.485 0.015

41 Dr.5890 Zgc:112226 similar to cathepsin S 0.200 0.194 0.04659 Dr.12480 Clone IMAGE:5604190, weakly similar to

Cd27 binding protein0.153 0.300 0.008

C. Occurrences > 0.15% in ZEBRAOV and cDNA library ID.1551926 Dr.30264 Ribosomal protein L3 0.240 0.053 0.16928 Dr.1012 Retinol dehydrogenase 10 0.240 0.062 0.43738 Dr.5555 Cyclin A1 0.215 0.123 0.706

D. Occurrences > 0.15% in ZEBRAOV5 Dr.47289 Metallothionein 2 0.517 0.053 0.03110 Dr.31066 Heat shock protein 90-beta 0.433 0.079 0.03111 Dr.729 Lactate dehydrogenase B4 0.411 0.035 0.04613 Dr.1043 ATP synthase, H+ transporting, mit. F0

complex, subunit 90.357 0.149 0

16 Dr.989 Zgc:56419 similar to stress-associated endo. reticulum protein 1

0.324 0.079 0.008

21 Dr.13590 Transcribed locus, moderately similar to zygote arrest 1

0.273 0.106 0

24 Dr.29860 Transcribed locus 0.258 0.053 031 Dr.1428 Ribosomal protein S15a 0.236 0.044 032 Dr.7383 Zgc:73301 similar to cytochrome oxidase III 0.233 0.018 034 Dr.29755 Wu:fe38a04 similar to RL31_HUMAN 60S

ribosomal protein L310.226 0.018 0

35 Dr.23788 Glutathione S-transferase pi 0.222 0.053 036 Dr.14015 Im:7140333 weakly similar to latrophilin;

lectomedin-10.222 0.106 0.015

40 Dr.24903 Zgc:66168 similar to ubiquitin; 40S ribosomal protein S27a

0.200 0.088 0.015

42 Dr.31131 Zgc:109948 similar to ribosomal protein L10a

0.196 0.062 0.008

44 Dr.1335 Ribosomal protein L36 0.189 0.062 046 Dr.6496 Ferritin, heavy polypeptide 1 0.186 0.123 0.01547 Dr.1209 Signal sequence receptor, beta 0.182 0.009 0.01551 Dr.11308 Ribosomal protein, large, P0 0.171 0.079 0.02352 Dr.4947 Cold inducible RNA binding protein 0.167 0.088 0.10053 Dr.1334 Ribosomal protein S25 0.167 0.035 0

*Zebrafish UniGene Build#87. The values indicated are the percentages of UniGene clusters as evaluated by the percentages of corresponding tags or EST counts in the mRNA population identified in ZEBRAOV SAGE tag database and dbEST libraries. Percentages at over 0.15% were bolded.

Page 8: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

set of eight transcripts for genes associated with S-, L-, andP0-type ribosomal proteins, individually expressed at over0.15%, represented a total of 1.63% of the total follicletranscripts evaluated by the SAGE method (Table 1).

SAGE tag N°5, with 142 tags sequenced, representing0.517% of the mRNA population, was found in the cor-rect sense position of the short (561 bp) transcript(gb|BC049475|) of metallothionein (mt2) on chromo-some 18 (Ensembl Gene ID ENSDARG00000041553). Alonger (1,739 bp) mt2 transcript (gb|NM_194273|) wasrecovered from the same UniGene cluster, ug|Dr.47289|.However, the corresponding SAGE tag (CATGAGTGTGA-GAT) from this long transcript was different from that ofthe short (CATGTCCTTTGTCT) transcript and was notdetected in ZEBRAOV. In addition to the mt2 short tran-script, another abundant transcript related to metal bind-ing, ferritin heavy polypeptide 1 (fth1) (SAGE tag N°46,gb|BC045278|), was also detected with a SAGE tag occur-rence of 51. Analysis of ZEBRAOV revealed the presence ofan additional highly-expressed tag with an occurrence of23, linked to a transcript similar to the ferritin heavy sub-unit found in zebrafish embryos (gb|CB360372|,gb|AI883498|) and testes (gb|BI672352|).

SAGE tag N°8 was part of an abundant transcript(gb|NM_199781|) encoding LOC327299, a hypotheticalprotein on zebrafish chromosome 3 (Dr.47296,zgc:66160) with similarities to another zebrafish tran-script (gb|XP_691610|), itself similar to the Xenopus pro-tein MGC68553 (gb|AAH60006|). The deduced proteinshave high sequence similarities with a mammaliandeduced protein of unknown function (e.g. human hypo-thetical protein LOC29035, gb|NP_054836|, and bovineCG4768-PA isoforms, gb|XP_611066|). Another abun-dant tag (SAGE tag N°24) was assigned to a transcribedlocus, while no significant sequence similarity wasdetected in any annotated sequence in GenBank™/EBIData Bank. However, significant similarity was detectedwith other expressed transcripts from teleost fish speciesPimephales promelas (gb|DT169547|), Oncorhynchus mykiss(gb|CX039674|), and Salmo salar (gb|CA037218|).

SAGE tag N°9 was included in cDNA EST clones (e.g.gb|BM861564|), which contains repetitive element MSR1in addition to a transcript similar to ZP1-related protein(gb|XM_686880|). This tag was recovered at least threetimes in a genome sequence of linkage group 11(gb|BX548064|).

The heat shock protein 90-beta (hsp90b) (SAGE tag N°10)transcript was also remarkably expressed, together with alocus similar to stress-associated endoplasmic reticulumprotein 1 (SAGE tag N°16), a cold inducible RNA binding

protein (cirbp) (SAGE tag N°52), and 40 S ribosomal pro-tein S27a, similar to ubiquitin (SAGE tag N°40).

The gene products for the following enzymes: lactatedehydrogenase B4 (ldhb) (SAGE tag N°11), subunit 9 ofATP synthase of mitochondrial F0 complex (atp5g) (SAGEtag N°13), retinol dehydrogenase 10 (rdh10) (SAGE tagN°28), a protein similar to cytochrome oxidase III (SAGEtag N°32), glutathione S-transferase pi (gstp1) (SAGE tagN°35), and a proteolytic enzyme similar to cathepsin S(SAGE tag N°41) were also recovered.

The UniGene clusters identified in ZEBRAOV using tags atR1 or R1cr positions and expressed in over 0.05% of thetotal transcript population of fully-grown follicles wereclassified according to the Gene Ontology (GO) [27] sys-tem, with help of updated information from the zebrafishinformation network (ZFIN) database [28]. The distribu-tion by molecular function GO terms was: 32.6% bindingproperties, including 22.2% nucleic acid binding; 11.1%structural molecule activity; 10.4% catalytic activity; 3.5%transporter activity; 2.1% signal transducer activity; 1.4%translation regulator activity; 0.7% translation initiationfactor activity; and 0.7% enzyme inhibitor activity, whilethe remaining 37.5% have an unknown postulatedmolecular functions. In addition to molecular functioncategories, ZEBRAOV was used to identify transcript mem-bers of specific biological processes or metabolic path-ways. For example, the following transcripts wereidentified in ZEBRAOV by their tags in R1 or R1cr positionand related to the cell cycle: cyclin A1 (ccna1) (SAGE tagN°38, Table 1), a transcribed locus moderately similar tozygote arrest 1 (SAGE tag N°21, Table 1), cyclin B1 (ccnb1,tag occurrence 25, gb|NM_131513|), cyclin B2 (ccnb2, tagoccurrence 13 for a long transcript, gb|BC045937| and tagoccurrence 4 for a short transcript, gb|AF365872|), cyclinG1 (ccng1, tag occurrence 3, gb|BC052125|), cyclin-dependent kinase 9 (cdk9, occurrence 3,gb|NM_212591|), and cyclin L1 (ccnl1, tag occurrence 2,gb|NM_199740|). As a second example, some of the tran-scripts encoded proteins related to the lipoprotein or fattyacid metabolisms, such as low density lipoprotein recep-tor-related protein associated protein 1 (lrpap1, tag occur-rence 4, gb|BC049517|), high density lipoprotein-binding protein (vigilin) (hdlbp, tag occurrence 4,gb|AI545520|), fatty acid binding protein 3 (fabp3, tagoccurrence 4, gb|NM_152961|), and fatty acid bindingprotein 7b (fabp7b, tag occurrence 2, gb|AY380814|).

Comparison of ZEBRAOV with zebrafish ovary cDNA librariesThe cumulative percentages of SAGE tag species or ESTclusters were plotted in order of abundance, according tothe cumulative percentages of tag or EST counts, provid-ing a comparative view of transcriptional activity (Figure

Page 8 of 28(page number not for citation purposes)

Page 9: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

Page 9 of 28(page number not for citation purposes)

Table 3: Recovery of selected genes with a characterized maternal genetic contribution in ZEBRAOV and zebrafish cDNA libraries ID.9767 and ID.15519*

Gene GenBank** SAGE tag Nbr Cluster*** % SAGE % ID. 9767 % ID. 15519

ccnb1 (cyclin B1) NM_131513 CATGGGTTGAGCAA

25 Dr.25226 0.091 0.229 1.589

pou5f1 (POU domain, class 5, trf 1)

NM_131112 CATGCTCACCCGCA

2 Dr.258 0.007 0.035 0.008

catnb (Beta-catenin 2) AF329680 CATGCAGAATAAAG

13 Dr.28281 0.047 0.018 n.d.

cdc25 (cdc25) CO352749 CATGTGTGTGTTCA

6 Dr.8228 0.022 0.035 n.d.

smad5 (MAD homolog 5) AF127920 CATGTATCTCTACT

1 Dr.20558 0.004 0.027 n.d.

zorba (Orb/CPEB-related) NM_131427 CATGCAGAGCTATT

14 Dr.614 0.051 n.d. 0.038

dazl (daz-like gene) NM_131524 CATGGGAGTTGTGG

2 Dr.8159 0.007 n.d. 0.015

oep (one-eyed pinhead) NM_131092 CATGTCATCTTAAC

1 Dr.581 0.004 n.d. 0.008

foxh1 (forkhead box H1) NM_131502 CATGAGAATGCAGT

1 Dr.7743 0.004 n.d. 0.008

acvr2b (activin receptor Iib)

AF069500 CATGCAAGACTGTG

2 Dr.21047 0.007 n.d. n.d.

pbx4 (pre-B-cell leukemia trf 4

NM_131447 CATGTAGGGTGTGT

1 Dr.4926 0.004 n.d. n.d.

snai1a (snail homolog 1a) NM_131066 CATGCTTTTTCTTT

1 Dr.15 0.004 n.d. n.d.

cth1 (cth1) NM_130939 CATGTTTAACCCTG

n.d. Dr.621 n.d. 0.044 0.092

alk8 (activin receptor-like kinase 8)

NM_131345 CATGTTTGTGTTGA

n.d. Dr.606 n.d. 0.018 n.d.

szl (sizzled) NM_181663 CATGCGCTAGTCAG

n.d. Dr.25497 n.d. n.d. n.d.

cugbp (bruno-like) BC076238 CATGAGGAGGACGG

n.d. Dr.16766 n.d. n.d. n.d.

gsc (goosecoid) NM_131017 CATGTACGATGGCA

n.d. Dr.289 n.d. n.d. n.d.

sox19 (SRY-box containing gene 19)

BC078221 CATGAGGTTCAAC

n.d. Dr.20910 n.d. n.d. n.d.

stat3 (signal trans. & act.trans. 3)

NM_131479 CATGTATATAGTGA

n.d. Dr.8150 n.d. n.d. n.d.

vasa (vasa homolog) NM_131057 CATGGCACAACAGG

n.d. Dr.559 n.d. n.d. n.d.

notch1a (notch homolog 1a)

NM_131441 CATGGCTCCGCCTA

n.d. Dr.11847 n.d. n.d. n.d.

notch1b (notch homolog 1b)

U57973 CATGAAATGAAACT

n.d. Dr.11846 n.d. n.d. n.d.

notch2 (notch homolog 2) AF334944 CATGTTCACCAAAC

n.d. Dr.604 n.d. n.d. n.d.

notch3 (notch homolog 3) NM_131549 CATGTGCCCAATGA

n.d. Dr.17815 n.d. n.d. n.d.

*The total number of cDNA clones sequenced was 576 to generate ZEBRAOV instead of 11,344 with cDNA library ID.9767 and 13,029 with cDNA library ID.15519. **GenBank accession number of the sequence from which the SAGE-tag was extracted. ***UniGene Build#87. All the tags were at R1 sense position with the exception of notch1a and notch1b transcripts where a Rn1 sense position was found. Nbr, is the number of times a tag was counted in ZEBRAOV. n.d., the corresponding transcript was non-detected. The values indicated are the percentages of UniGene clusters as evaluated by the percentages of corresponding tags or EST counts in the mRNA population identified in ZEBRAOV SAGE tag database and dbEST libraries.

Page 10: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

Page 10 of 28(page number not for citation purposes)

Localisation of selected maternal transcripts in zebrafish fully-grown follicles as revealed by whole-mount in situ hybridizationFigure 3Localisation of selected maternal transcripts in zebrafish fully-grown follicles as revealed by whole-mount in situ hybridization. A, Polarization of the oocyte along the animal-vegetal axis was visualized using two colour whole-mount in situ hybridization. Cyclin B1 (ccnb1) mRNAs were identified at animal pole (arrow) after probe labelling with fluorescein-labelled antisense riboprobes and Deleted AZoospermia-Like (dazl) transcripts located at vegetal pole (arrowhead) after probe labelling with digoxigenin-labelled antisense riboprobes. The hybridization signal is coloured red with fluorescein and dark brown to blue with digoxigenin. B, The animal pole localisation of transcripts similar to rhamnose-binding lectin at the fully-grown follicle stage (stage IV) was detected with digoxigenin-labelled antisense riboprobes. C, No staining signal was observed using sense riboprobes of transcripts similar to rhamnose-binding lectin. D-H, Stage-specific polarised distribution of metal-lothionein 2 (mt2) short transcript isoform in stage IB (D, E), II (F), early stage III (G), and stage IV (H) follicles. The met2 hybrid-ization signal detected at early stages was widely distributed in the ooplasma, whereas it concentrated at the animal pole from early stage III to the end of vitellogenesis. For stages III and IV, the animal pole, oriented toward top of page, is indicated by an arrow. Scale bar = 100 µm.

Page 11: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

Page 11 of 28(page number not for citation purposes)

Table 4: The most abundant annotated SAGE-tags of zebrafish fully-grown follicles and the presence of homologous UniGene clusters in vertebrate ovary/unfertilized egg cDNA libraries

Tag N° Cluster* Description Trout Salmon Fugu Xenopus Swine Mouse Human

A. Occurrences > 0.15% in ZEBRAOV and cDNA libraries ID.9767 and ID.155192 Dr.1109 Beta-actin2 Y N Y* Y Y Y Y*12 Dr.25213 Beta-actin19 Dr.5628 Similar to contains element MSR1 repetitive

element/ZP1-relatedN N N Y N N Y

15 Dr.19916 Egg envelope protein ZP2 variant B [zp2.3] N N N Y N N Y15 Dr.23439 Zp2.4 protein Y N N n.i. n.i. N N19 Dr.26977 Zona pellucida glycoprotein 3b Y* Y* N Y N N Y20 Dr.30322 Zona pellucida glycoprotein 343 Dr.5605 Tubulin beta 2 N Y n.i. Y Y* Y* Y*

B. Occurrences > 0.15% in ZEBRAOV and cDNA library ID.97671 Dr.12439 Transcribed locus similar to rhamnose-binding

lectin/Latrophilin-2Y* Y* n.i. Y N N Y

3 Dr.46793 Transcribed locus similar to H2B histone family n.i. n.i. n.i. N Y N Y7 Dr.26975 Claudin d N n.i. n.i. Y* Y N Y8 Dr.47296 Zgc:66160 N n.i. n.i. Y N Y N14 Dr.47212 Transcribed locus, moderately similar to thymosin

beta 10N n.i. n.i. Y Y N Y

41 Dr.5890 Zgc:112226 similar to cathepsin S N N n.i. N N N N59 Dr.12480 Clone IMAGE:5604190, weakly similar to Cd27

binding proteinN n.i. n.i. Y N N Y

C. Occurrences > 0.15% in ZEBRAOV and cDNA library ID.1551926 Dr.30264 Ribosomal protein L3 N N n.i. N N N Y*28 Dr.1012 Retinol dehydrogenase 10 N n.i. n.i. Y n.i. N Y38 Dr.5555 Cyclin A1 N n.i. n.i. Y* N N Y

D. Occurrences > 0.15% in ZEBRAOV5 Dr.47289 Metallothionein 2 n.i. Y N N N N N10 Dr.31066 Heat shock protein 90-beta N Y n.i. Y Y Y* Y*11 Dr.729 Lactate dehydrogenase B4 N N n.i. Y Y Y* Y*13 Dr.1043 ATP synthase, H+ transporting, mit.F0 complex,

subunit c (subunit 9)Y* Y* Y* Y Y N Y

16 Dr.989 Zgc:56419 similar to stress-associated endoplasmic reticulum protein 1

N n.i. n.i. Y Y N Y

21 Dr.13590 Transcribed locus, moderately similar to zygote arrest 1

Y n.i. N Y N Y N

24 Dr.29860 Transcribed locus N n.i. n.i. n.i. n.i. n.i. n.i.31 Dr.1428 Ribosomal protein S15a N N n.i. N n.i. Y Y32 Dr.7383 Zgc:73301 similar to cytochrome oxidase III n.i. n.i. n.i. n.i. n.i. n.i. n.i.34 Dr.29755 Wu:fe38a04 similar to RL31_HUMAN 60S

ribosomal protein L31N n.i. n.i. n.i. n.i. N n.i.

35 Dr.23788 Glutathione S-transferase pi N n.i. n.i. N N n.i. Y36 Dr.14015 Im:7140333 weakly similar to latrophilin;

lectomedin-1n.i. n.i. n.i. Y n.i. N Y

40 Dr.24903 Zgc:66168 similar to ubiquitin; 40S ribosomal protein S27a

Y Y Y* Y Y* N Y

42 Dr.26808 Ribosomal protein L10a N N N N Y Y Y*44 Dr.1335 Ribosomal protein L36 N N N Y Y N Y46 Dr.6496 Ferritin, heavy polypeptide 1 Y Y* Y Y Y* Y Y*47 Dr.1209 Signal sequence receptor, beta N n.i. n.i. Y n.i. N Y51 Dr.11308 Ribosomal protein, large, P0 N n.i. N Y n.i. n.i. Y*52 Dr.4947 Cold inducible RNA binding protein N Y n.i. Y Y Y Y53 Dr.1334 Ribosomal protein S25 Y n.i. Y Y n.i. Y Y

*UniGeneBuild#87. The homologous UniGene clusters in other vertebrate species were found with UniGene tool or Blast search. Y, homologous UniGene cluster identified in the library. Y*, homologous UniGene cluster identified in the library with an occurrence >0.15%. N, the homologous UniGene cluster was not detected in the library. n.i., an homologous UniGene cluster was not recovered in UniGene. The presence of homologous clusters in at least four tetrapods or at least five compared vertebrate species are underlined. The vertebrates dbEST cDNA libraries accession numbers used for comparison were: ID.15587 for rainbow trout (Oncorhynchus mykiss); ID.15459 for Atlantic salmon (Salmo salar); ID.11967 for Fugu (Takifugu rubripes); ID.9909 for Xenopus tropicalis; ID.15096 for swine (Sus scrofa), ID.10029, ID.14142, and ID.1389 for mouse (Mus musculus); ID.4908, ID.10552, and ID.5611 for human (Homo sapiens).

Page 12: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

2). A similar proportion of transcripts was observedbetween ZEBRAOV and cDNA library ID.9767, whilelibrary ID.15519 contained a few clusters with an over-representation of their EST numbers. At 50% of the cumu-lative percentage of tags or ESTs, there were 9% ZEBRAOVSAGE tags, numbered from 1 to 1026, and 12.5% ID.9767EST clusters, numbered from 1 to 348. In contrast, inlibrary ID.15519, the majority of the ESTs sequenced orig-inated from a small number of unique transcripts, i.e. 1%of the total clusters, numbered from 1 to 26, and repre-sented 50% of the ESTs in the library. The differencesobserved between the proportions of transcripts in thetwo cDNA libraries were also illustrated by the differencein the number of ESTs per transcript abundance class.Transcript numbers in libraries ID.15519 and ID.9767ranged from one copy/unique cluster in the least abun-dant class to 1,714 or 287 copies/unique cluster, in themost abundant class.

The occurrence of the most abundant annotated SAGE tagwas compared with their corresponding UniGene clusterrepresentation, evaluated by the number of ESTs, in thetwo selected zebrafish ovary cDNA libraries (Table 2).Eight clusters, including bactin2, bactin1, zp2.3, zp2.4,zp3b, zp3, and tubb2 were expressed over 0.15% in thethree transcriptome profiles (Table 2A). UniGene clusteraccession numbers Dr.30322 of zp3, Dr.5628, Dr.19916of zp2.3, and Dr.23439 of zp2.4. were overrepresented inlibrary ID.15519, accounting for over 11% of the tran-scripts identified in this cDNA library. Seven clustershigher than 0.15% were in both ZEBRAOV and ID.9767,e.g.: cldnd, transcripts similar to rhamnose-bindinglectins, H2B histone family, and cathepsin S (Table 2B).ZEBRAOV and ID.15519 had three clusters in common,i.e. ribosomal protein L3 (rpl3), rdh10, and ccna1 (Table2C). Twenty transcripts expressed at over 0.15% of thetotal mRNA population in ZEBRAOV were recoveredbelow this limit in the ovary cDNA libraries, while someof them were not expressed at all in library ID.15519(Table 2D). The largest discrepancies, evaluated by the dif-ference between the mean number of EST values obtainedwith the cDNA libraries and the expression level indicatedby SAGE tag frequency, were observed with mt2, hsp90b,and ldhb. The absence of expression in library ID.15519 ofsome UniGene clusters found in both library ID.9767 andZEBRAOV was illustrated with SAGE tag N°21 that waspart of a transcript (gb|BM533765/XM_682436/XM_703698|) moderately similar to zygote arrest 1,which contains sequences similar to the atypical PHDmotif found in the zygote arrest 1 gene from vertebratespecies including zebrafish zar1 (gb|BM533765|). Theattached UniGene cluster (ug|Dr.13590|) was highlyexpressed in ZEBRAOV, only moderately in libraryID.9767, and not in library ID.15519. It should be notedthat the zebrafish zar1 transcript attached to UniGene

cluster number Dr.12340 was found at high levels(0.370%) in library ID.9767 but not in ZEBRAOV orID.15519.

EST data from the two selected zebrafish ovary cDNAlibraries were also compared with ZEBRAOV to determinethe sensitivity for detecting transcripts with a character-ized maternal genetic contribution [29] (Table 3). Usingthe same maternal transcripts as the target sequences, theZEBRAOV database was around 1.8 times more sensitive,for the detection of these transcripts, than the EST methodpreviously used to describe the zebrafish ovary transcrip-tome. The number of selected transcripts with a maternalfactor role detected by the SAGE method was 12 out of atotal of 24, compared with 7 for libraries ID.9767 andID.15519. Only two maternal transcripts, ccnb1 and POUdomain class 5 trf 1 (pou5f1), were detected in ZEBRAOVand both libraries. Two transcripts, activin receptor IIb(acvr2b), and pre-B-cell leukemia trf 4 (pbx4) were foundonly in ZEBRAOV, while another two, cth1 (cth1) andactivin receptor-like kinase 8 (alk8), were only retained inthe cDNA libraries. It should be pointed out that thehigher sensitivity observed with the SAGE method wasobtained with around 21 times fewer sequenced cDNAclones, i.e. 576 to generate ZEBRAOV instead of 11,344for library ID.9767 and 13,029 for library ID.15519.However, the low-abundant maternal-effect vasahomolog (vasa) transcript that was not detected inZEBRAOV or in the ID.9767 and ID.15519 zebrafishdbEST libraries was identified at low levels in libraryID.9874 (ug|Dr.559|, 0.056%), even if the 3,544 ESTsequences in this library have only been classified in 1,475UniGene entries.

Polarised distribution of metallothionein 2 transcripts and those similar to rhamnose-binding lectins in zebrafish oocyteThe polarization of oocytes along the animal-vegetativeaxis was visualized with ccnb1 mRNAs, identified at theanimal pole, and Deleted AZoospermia-Like (dazl)located at the vegetative pole (Figure 3A). The ccnb1 tran-script was SAGE tag N°106, with a tag occurrence of 25,detected in both ovary cDNA libraries (Table 3). The dazltranscript was SAGE tag N°3122, with a tag occurrence of2, found only in library ID.15519 with two EST sequenceentries. Whole-mount in situ hybridization using an RNAlabelled probe that was potentially capable of hybridizingwith all rhamnose-binding lectin transcripts variants(SAGE tag N°1, tag occurrence 284), due to their very highsequence conservation, revealed preferential polarizationat the animal pole (Figure 3B). Whole-mount in situhybridization using a specific mt2 (SAGE tag N°5, tagoccurrence 142) riboprobe confirmed the presence of thismRNA in zebrafish ovarian follicles, although only fourand six EST sequence entries were found in libraries

Page 12 of 28(page number not for citation purposes)

Page 13: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

Table 5: Comparison of zebrafish follicle protein repertoire deduced from SAGE with the protein repertoire isolated by proteomic analysis

SAGE-tag Nbr* GenBank

Cluster**

Gene Protein name UniProt accession number

Gel piece ***

R1 or R1crCATGTAAGACATCC 226 BC04587

9Dr.1109 bactin2 Beta-actin2 Q7ZVI7, Q6NWJ6 VII, 13

CATGTAAAACCAAA 108 AF057040 Dr.25213 bactin1 Beta-actin1CATGTTGATGAGGG 113 BC04419

0Dr.729 ldhb Lactate dehydrogenase

B4Q9PVK4, Q803U5 IX

CATGCATTTTCTAA 90 BM571854

Dr.19916 zp2.3 Egg envelope protein ZP2 variant B [zp2.3]

Q98SS2 VII-IX, 10, 12–15

CATGCATTTTCTAA 90 BC093133

Dr.23439 zp2.4 Zp2.4 protein Q502R0 VII-IX, 10, 12–15

CATGAGAAACGTTT 77 BC067692

Dr.26977 zp3b Zona pellucida glycoprotein 3b

O12989, Q6NW78 VI, 12

CATGGTGGTGTTGA 75 NM_131331

Dr.30322 zp3 Zona pellucida glycoprotein 3

Q9PWC7, Q6NW70 V, VI, 8–12

CATGTTAATAAAAG 53 NM_198809

Dr.5605 tubb2 Tubulin, beta, 2 Q6TGT0, Q6P5M9, Q6IQJ2

VI, 7

CATGTCTGTTTAAC 47 BC049058

Dr.11308 rplp0 Ribosomal protein, large, P0

Q7ZUG3, Q6P5K3 VIII, IX

CATGCTTAAATTGA 40 BC095386

Dr.30628 gapd Gapdh protein (Fragment)

Q5XJ10 VIII, IX

CATGTTCTTTCTGT 36 BM778256

Dr.23435 zp2.2 Egg envelope protein ZP2 variant A [zp2.2]

Q98SS3 VII-IX, 10, 12–15

CATGCCACAGGGCA 35 CO917383

Dr.5705 rps3 Ribosomal protein S3 Q6TLG8 IX

CATGGCTGCCTTGG 34 NM_199795

Dr.23436 tuba4l Tubulin alpha 3; Alpha tubulin-like protein

Q802Y6, Q6TGS5 V, 7

CATGAGGATCGAGG 30 BI844143 Dr.25678 eno3 Enolase 3, (beta, muscle)

Q568G3, Q6TH14 VI, 9, 11, 12

CATGTACAAGGCTT 28 AY394968

Dr.18315 eef1g Elongation factor 1-gamma

Q6PE25, Q6NYN8 VI, 7, 8, 12

CATGTTCTCCCTGT 25 NM_001003844

Dr.28231 rpl6 60S ribosomal protein L6

Q6DRC6, Q567N5 X

CATGTGCTGAAAGT 16 NM_213058

Dr.26116 hspa5 Heat shock 70kDa protein 5

Q7SZD3, Q6P3L3 III

CATGCCCGGTGGAT 15 BC045841

Dr.24802 hspa8 Heat shock protein 8 Q7ZVJ1, Q6TEQ5, Q6NYR4

III

CATGGTGTAAGTGA 15 NM_212772

Dr.28850 tuba8l Tubulin, alpha 2 Q6TNP9 V, 7

CATGCTCAAAGATG 12 BG305824

Dr.25868 Hypothetical protein zgc:64133 (L-lactate dehydro. activity)

Q7T334 IX

CATGTCTTCTGAAA 12 CO352070

Dr.34146 zp3al Zona pellucida glycoprotein 3a, like

Q5TYP2 V, VI, 9–12, 14, 15

CATGTCTTCTGAAA 12 CO351980

Dr.25594 zp3a Zona pellucida glycoprotein 3a

Q5TYP5 V, VI, 9–12, 14, 15

CATGTTTTTTTTGA 11 BC056719

Dr.24206 cct4 Chaperonin containing TCP1, subunit 4 (Delta)

Q6PH46, Q6P123 V, 8

CATGTCTTGGTTAA 9 NM_200153

Dr.6969 Hypothetical protein zgc:55944 (aminopeptidase I activity)

Q803B5 8

CATGTGTTGATGAA 8 NM_199547

Dr.3463 ssb Sjogren syndrome antigen B (Autoantigen La)

Q7ZTI0 12

CATGTGAAGCACTT 8 NM_212718

Dr.18319 ZPA domain containing protein [si:dkeyp-50f7.2]

Q8JIX8, Q5TYX2 III

CATGAGAAGGAGGA 6 BC055516

Dr.20938 sod1 Superoxide dismutase [Cu-Zn]

O73872 22

Page 13 of 28(page number not for citation purposes)

Page 14: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

CATGTCTGAGCTGA 4 AY398308

Dr.4751 aldh2 Mitochondrial aldehyde dehydrogenase 2 family [aldh2l]

Q6TH48, Q6NWJ9 10, 11

CATGAACCACTCAA 4 BC050953

Dr.7644 cct2 Chaperonin containing TCP1, subunit 2 (Beta)

Q6PBW6 V

CATGACACAAAATA 4 BQ263267

Dr.25009 vg1 Vitellogenin 1 Q8JH36, Q8JH37, Q90YN8

I-XI, 1–7, 10–17, 21–24

CATGTGCTCAAAAT 4 BC045933

Dr.134 cct7 Chaperonin-containing T-complex protein 1 eta subunit

Q8JHG7 Q7ZTS3 7

CATGATTGTATGGT 3 BM860549

Dr.7343 Acyl-Coenzyme A dehydrogenase, C-4 to C-12 straight chain

Q7ZTH8, Q6NYM4, Q502R3

VII, 12

CATGATGTGGATCC 3 CK673334

Dr.28241 rpl7a Ribosomal protein L7a Q6PBZ1 X

CATGCATTGTATAT 3 BM101604

Dr.11543 Zgc:103482 Q5XJA5 VI, 12, 13

CATGCCGACCCTAC 3 BC083399

Dr.11543 Zgc:103482 Q5XJA5 VI, 12, 13

CATGTTCTCCCTCA 3 BC091659

Dr.10133 Zgc:110766 (translation elongation factor activity)

Q5BJ17 12, 13

CATGAAGATGGAAA 3 BQ285089

Dr.24685 Hyp.prot.zgc:111961(hydro.-translo.F-type ATPase complex)

Q4VBK0 VI

CATGTTTTGTACAA 2 NM_200174

Dr.8999 Hypothetical protein zgc:55702

Q7ZVR7 8

CATGTCACAGGCCA 2 CO919203

Dr.29735 naca Nascent polypeptide-associated complex alpha subunit

Q8JIU7 IX

CATGAAGCGGAGCT 2 BC059705

Dr.28202 Hypothetical protein zgc:73404 (Peptidase)

Q6NSN3, Q6PBH6 10

CATGCAAGTGTGAT 1 BI843130 Dr.20270 Wu:fi20c07 Hypothetical protein (Fragment) (Pentaxin)

Q7SZ53, Q5XJ77, Q4V8X3

17, 18

CATGCCTGATCTGT 1 NM_201052

Dr.1065 rpsa Ribosomal protein SA Q803F6 16, 20

CATGAGGGCTGGAA 1 NM_201290

Dr.6928 cct6a Chaperonin containing TCP1, subunit 6A (Zeta 1)

Q7ZYX4 IV, 7, 8

CATGTTGAACTCCA 1 BQ419723

Dr.28864 serpina1 Novel protein similar to Alpha-1 antiproteinase (Serpina1)

Q5SPJ1, Q5SPJ4 10, 11

R2 or R3CATGGAGATCTCTA 2 BC04443

2Dr.1051 dldh Dihydrolipoamide

dehydrogenaseQ803L1, Q6TNU6 8

CATGTGTGCACGCA 1 BC095354

Dr.39983 Hypothetical protein zgc:110655

Q503F1 IX

Rn1 or Rn2CATGTTAAAACGTG 11 CA47109

3Dr.7108 hspd1 Hspd1 protein (Heat

shock 60 kD protein 1)Q803B0 IV

CATGCTGAACGGAT 5 BC073176

Dr.5529 GDP dissociation inhibitor 3;GDP dissociation inhibitor 2

Q7ZVL9, Q6TNT9 VI, 11

CATGCACATCAGAT 1 CK400766

Dr.11310 tuba1 Tuba1 protein Q8AWD6, Q6NWK7, O42271

V, 7–9, 23

CATGATGAGACAAA 1 CO917509

Dr.6501 Sb:cb825 protein (Fragment)

Q803D7 V, 10

CATGCAAGTGTGTC 1 BC053271

Dr.28298 cct3 Chaperonin-containing TCP-1 complex gamma chain

Q8JHI7, Q7T2P2 IV

R1 or R1cr, multiple matchesCATGAAAAAAAAAA 153 BI879427 Dr.296 ckb Creatine kinase, brain Q7ZU04, Q8AY63,

Q5RIR2VII, 12–15

Table 5: Comparison of zebrafish follicle protein repertoire deduced from SAGE with the protein repertoire isolated by proteomic analysis (Continued)

Page 14 of 28(page number not for citation purposes)

Page 15: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

CATGACATCACATT 60 AY394971

Dr.664 tuba8l4 Similar to tubulin, alpha 1 (tubulin, alpha 8 like 4)

Q7ZVI6, Q6NWJ5 V, 7–9, 23

CATGGCCATTGAGG 4 BI671108 Dr.20576 Mitochondrial ATP synthase alpha subunit

Q6EE37 V, 7

CATGAAGAAAAAAA 3 AF254638 Dr.31985 vg3 Vitellogenin 3 (Fragment) [vg3]

Q9DFT9 II, VIII, 6, 21

CATGATCACTGGAA 3 BM182705

Dr.15108 LOC407641 protein (Fragment) translation elongation factor

Q7ZWA1 VI

CATGTATAAATTCC 2 NM_001002374

Dr.31390 Zgc:92082 Q6DHT4 8, 9

CATGTGTGTGTGAG 1 BG304242

Dr.7952 pkm2 Pyruvate kinase, muscle Q7ZVT2, Q6NXD1 7, 8, 9

CATGCAGGAGTTCA 1 CA496254

Dr.4724 eno1 Enolase 1, (Alpha) [zgc:73152]

Q6IQP5 8, 9, 11, 12

Non detectedCATGTGGCTTTCAA 0 NM_2004

46Dr.831 chia Eosinophil chemotactic

cytokineQ7SYA8, Q6TH31 IV

CATGCTGGACCAGC 0 NM_131804

Dr.10788 nots Nots protein Q5PR42, Q9DDE2 16, 20

*Nbr, is the number of times a tag was counted in ZEBRAOV. **Zebrafish UniGene Build#87. ***SDS-PAGE slices numbered I to XI or two-dimensional PAGE excised spot area numbered 1 to 24 (see Figure 4).

Table 5: Comparison of zebrafish follicle protein repertoire deduced from SAGE with the protein repertoire isolated by proteomic analysis (Continued)

Page 15 of 28(page number not for citation purposes)

Page 16: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

ID.15519 and ID.9767, respectively. As with rhamnose-binding lectins, colocalization of the mt2 transcript withccnb1 mRNA by two-colour whole-mount in situ hybrid-ization demonstrated a signal restricted to the animal poleof the oocyte (data not shown). The distribution of thistranscript was stage-dependent (Figure 3D–H). Thehybridization signal was homogeneously distributed instages I and II of oogenesis and restricted from early stageIII to the animal pole of the oocyte.

Comparison of ZEBRAOV with ovary/egg functional genomic data from other vertebrate and non-vertebrate speciesThe annotated transcripts expressed at over 0.15% inzebrafish fully-grown ovarian follicles (Table 1) that hadcorresponding UniGene clusters were used to search forhomologous Unigene clusters in the ovary/unfertilizedegg cDNA libraries currently available for seven vertebratespecies (Table 4). The homologous vertebrate UniGeneclusters were identified using UniGene homologous toolor found after BLAST [30] searches. Numerous genesexpressed in zebrafish had homologous counterparts inthe expression profile of human and Xenopus cDNA librar-ies and, to a lesser extent, in profiles available for othervertebrate species. Out of a total of thirty-five zebrafishgenes, twenty-six homologous UniGene clusters wereidentified in humans, including seven at over 0.15% ofthe mRNA population. Twenty four homologous Uni-Gene clusters were identified in Xenopus, including two atover 0.15% of the mRNA population. Due to the dupli-cated nature of the zebrafish genome, bactin2 and bactin1were duplicated forms of the mammal bactin gene. Thesehousekeeping genes, together with tubb2, were a commoncharacteristic of ovarian vertebrate abundant cytoskeletalprotein encoding clusters. Homologous flh1 transcriptwas remarkably expressed in all available vertebrateovary/egg transcriptomes, with high levels in human,swine, and salmon cDNA libraries. This transcript wasalso recovered at high levels in the ID.16098 dog (Canisfamiliaris) ovary dbEST library (ug|Cfa.1238|, 0.264%).hsp90b, ldhb, atp5g, and 40S ribosomal protein S27a, werealso some of the most expressed genes in the vertebrateovaries. The other abundant ZEBRAOV transcripts recov-ered in at least one vertebrate species at a homologouscluster frequency >0.15% were zp3 or related gene zp3b,cldnd, rpl3, and ccna1. The cirbp transcript was also widelydistributed, but in a lower relative proportion of mRNA.

The annotated transcripts of ZEBRAOV were also com-pared with published data obtained from human germi-nal vesicle (GV)-stage oocytes by PCR-SAGE [16]. Thishuman SAGE tag library has not been deposited at theGEO database. The published short-list of human tags waschecked against UniGene Build#187 using SAGEmaptools. Out of a total of 175 SAGE tags analysed, 81 were

identified in R1 sense position. This updated list was com-pared with the homologous UniGene clusters expressed atover 0.15% in ZEBRAOV. Three homologous clusterswere identified in the human oocyte catalogue, i.e. actingamma 1 (ACTG1)/actin beta (ACTB) (ug|Hs.514581|and ug|Hs.520640|), ZP glycoprotein 3 (ZP3)(ug|Hs.488877|), and heat shock 90 kDa protein 1, beta(HSPCB) (ug|Hs.509736|). These clusters were also recov-ered at high levels in SAGE tag libraries deposited onSAGEmap for human ovarian cancer cell lines (e.g. GEOaccession number GSM726). Comparison between theupdated annotated list of human GV oocyte andZEBRAOV clusters revealed that, in addition to ACTG1/ACTB, ZP3, and HSPCB, other homologous expressedclusters, identified by their SAGE tag at R1 sense positionand expressed above 0.01% of the total expressed tran-scripts, had also been detected in ZEBRAOV: tubulinalpha 6 (TUBA6) and beta 4 (TUBB4), programmed celldeath 5 (PDCD5), proliferating cell nuclear antigen(PCNA), barrier to autointegration factor 1 (BANF1), gua-nine nucleotide binding protein, beta polypeptide 2-like1 (GNB2L1), CD9 antigen (CD9), and glyceraldehyde-3-phosphate dehydrogenase (GAPD) were found.

The most expressed transcripts in ZEBRAOV (Table 1),including the conserved expressed transcripts of vertebrateovaries/unfertilized eggs (Table 4), were used to screenovary/egg dbEST or SAGE tag libraries from non-verte-brate species. As expected, some homologous housekeep-ing genes, which encode ribosomal proteins or proteinsresponsible for cell structure, were among the mosthighly-expressed genes in these libraries. For example,homologous transcript of 40 S ribosomal protein S27awas retrieved from the silkworm (Bombyx mori) egg SAGEtag library at 0.366% of the total mRNA population,together with transcripts of 60 S ribosomal protein L3(0.4%), and 60 S ribosomal large P0 (0.254%). In addi-tion, transcripts for beta-tubulin were retrieved from silk-worm (0.266%), nematode (Caenorhabditis elegans)(ug|Cel.10737|, 0.116%) and amphioxus (Branchiostomafloridae) (ug|Bfl.1179|, 0.051%) egg cDNA libraries.Other conserved homologous transcripts identified wererelated to the cell cycle, with cyclin A type in sea urchin(Strongylocentrotus purpuratus) (ug|Spu.227|, 0,499%),and amphioxus (ug|Bfl.5026|, 0,511%) eggs, or metalbinding, with ferritin protein genes in silkworm(0.319%), and amphioxus (ug|Bfl.851|, 0.106%) eggs, ormetallothionein in sea urchin (ug|Spu.96|, 0.066%) eggs.Catalytic activity transcripts for cytochrome c oxidase IIIin silkworm (0.228%), and lactate dehydrogenase innematode (ug|Cel.22829|, 0.022%) eggs, homologouswith transcripts expressed at over 0.15% in ZEBRAOV,were expressed in ovaries/eggs of non-vertebrate species.

Page 16 of 28(page number not for citation purposes)

Page 17: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

Comparison of zebrafish follicle protein repertoire deduced from SAGE with the protein repertoire isolated after proteomic analysisThe proteins extracted from fully-grown follicles wereresolved by 1D-SDS-PAGE or 2D-PAGE, subjected to in-gel tryptic digestion, and analysed by MS/MS. The proteinrepertoire determined was then compared with the reper-toire deduced from ZEBRAOV (Table 5). Forty-three outof a total of sixty proteins identified by proteomic analysiswere initially retrieved using 1D-SDS-PAGE fractionation,forty-one by 2D-PAGE fractionation, and twenty-fourwere common to both. Potential molecular functions ofthe proteins identified using proteomic analysis accordingto GO terms were: 26% structural molecule activity, 25%binding properties, including 3% nucleic acid binding,22% catalytic activity, 3% each for transporter, transla-tion-regulator, and enzyme-regulator activity, 2% each forsignal transducer, antioxidant, and electron transporteractivity, while the remaining 12% had no postulatedmolecular function. The three most abundant categorieswere: (i) structural molecules, represented by beta-actin,tubulin, and ZP variant isoforms, as well as ribosomalproteins; (ii) binding proteins, mostly chaperonins and

heat shock proteins; and (iii) proteins with a catalyticactivity, mostly oxidoreductases, like acyl-Coenzyme Adehydrogenase or enolases, and transferases, like creatinekinase and pyruvate kinase. Comparison of transcriptomeand proteome data indicated that forty-three proteinswere recovered with a corresponding transcript identifiedby an experimental SAGE tag at a correct, R1 or R1cr, posi-tion. Seven proteins were also found with a correspondingtag in R2/R3 or Rn1/Rn2 position, and eight with corre-sponding multiple-matched R1 or R1cr tags. Comparingtranscriptome and proteome also revealed a weak predic-tive value between mRNA and protein abundance. TheMS-based protein identification approach recoveredaround 23% of the proteins, including bactins, zp2.3,zp2.4, zp3, zp3b, tubb2, ldhb, and ribosomal protein large,P0 (rplp0), deduced from transcripts expressed at over0.15% and identified by an experimental SAGE tag in acorrect, R1 or R1cr, position (Table 1). All the proteinsidentified by proteomic analysis had at least one tran-script counterpart in ZEBRAOV, with two exceptions, eosi-nophil chemotactic cytokine (chia) and nothepsin (nots).UniGene cluster ug|Dr.831| of the chia transcript wasdetected three and five times in ovary cDNA banks

Table 6: Comparison of zebrafish follicle transcript occurrence of proteins identified by proteomic analysis, with the corresponding protein repertoire deduced from transcripts expressed at over 0.15% of the mRNA population in human ovary cDNA libraries

Zebrafish* Human** Human gene

Protein name

Zebrafish Human

SAGE ID.9767 ID.15519 ID.4908 ID.5611 ID.10552

Dr.1109 Hs.520640 ACTB Beta-actin 0.822 0.317 0.752 0.404 0.503 0.535Dr.25213 Hs.520640 ACTB Beta-actin 0.393 0.326 0.698 0.404 0.503 0.535Dr.729 Hs.446149 LDHB Lactate

dehydrogenase B

0.411 0.035 0.046 0.210 0.140 0.197

Dr.5605 Hs.433615 TUBB2 Tubulin, beta, 2

0.193 0.290 0.852 0.194 0.140 0.159

Dr.11308 Hs.448226 RPLP0 Ribosomal protein, large, P0

0.171 0.079 0.023 0.381 0.199 0.384

Dr.30628 Hs.544577 GAPDH Glyceraldehyde-3-phos. dehydro.

0.145 0.096 0.116 1.105 1.018 0.816

Dr.5705 Hs.546286 RPS3 Ribosomal protein S3

0.127 0.017 0.008 0.708 0.889 0.356

0Dr.18315 Hs.444467 EEF1G Elongation factor 1-gamma

0.101 0.052 0.089 0.700 0.304 0.525

Dr.28241 Hs.558380 RPL7A Ribosomal protein L7a

0.011 0.088 0.004 0.591 0.081 0.366

Dr.1065 Hs.558354 RPSA Ribosomal protein SA

0.003 0.088 0.008 0.342 0.187 0.103

Dr.7952 Hs.198281 PKM2 Pyruvate kinase, muscle

0.003 0 0.017 1.035 0.374 0.657

Dr.4724 Hs.517145 ENO1 Enolase 1 0.003 0 0 2.023 0.573 0.582

*Zebrafish UniGene Build#87. **Human UniGene Build#187. The values indicated are the percentages of species specific UniGene clusters as evaluated by the percentages of corresponding tags or EST counts in the mRNA population identified in ZEBRAOV SAGE tag database and dbEST libraries. Percentages at over 0.15% were bolded.

Page 17 of 28(page number not for citation purposes)

Page 18: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

ID.15519 and ID.9767, respectively. UniGene clusterug|Dr.10788| of nots transcript was not detected eitherzebrafish cDNA library used. Zebrafish chia encoded aprotein highly similar to zebrafish protein isoforms simi-lar to chitinase (gb|XP_708403.1|, gb|XP_686386|) orproteins encoding by multiple chitinase genes in rainbowtrout (gb|CAD59687|) and Japanese flounder(gb|BAD15061|) and, to a lesser extent, acidic mamma-lian chitinase precursor, e.g. up|Q9BZP6| in humans.Zebrafish nots encoded a protein similar to vertebrateaspartic-type endopeptidases, such as zebrafish cathepsinD (up|Q8JH28|) and human cathepsin E (up|P14091|).

Proteins synthesized by bactin1 and bactin2 were notresolved by the proteomic analysis, due to the amino acidsequence identity of these isoforms. However, corre-sponding differentially expressed transcripts were discrim-inated by a specific SAGE tag, due to divergent 3'-untranslated part sequences. On the contrary, a well con-served divergent 3'-untranslated part sequence led to acommon SAGE tag identified in zp2.3 and zp2.4 tran-scripts, while a distinct SAGE tag was recovered with zp2.2(Tables 1 and 5). The high sequence similarities of thesethree protein isoforms led to an unsolved protein identi-fication on the gel map produced by proteomic analysis.Identical peptide sequences were also identified withtubulin alpha isoforms and recovered in different areasafter 2D-PAGE fractionation. Sequence differences oradditional identified peptides made it possible to discrim-inate between tubulin alpha 1, alpha 8 like 4, alpha 2, andalpha 3. In all cases, specific SAGE tags were identified foreach tubulin transcript isoform, even if isoform alpha 8like 4 was identified using a multiple-matched tag.

Furthermore, comparison of transcriptome and proteomedata also revealed that two different SAGE tags, with anoccurrence of 3, were identified for the unannotateddeduced protein zgc:103482 (up|Q5XJA5|), due to thepresence of a 3'-untranslated region that could beextended by 127 bp, as revealed by the nucleotidesequence of dbEST clone gb|BM101604|. The uniquecommon translated region of both transcripts encoded aprotein that was part of the described proteomic profile.

Vitellogenin (VTG) derivatives were identified in most ofthe gel pieces excised from the 1D- or 2D-PAGE. It shouldbe noted that VTG 1 (vg1) SAGE tag (occurrence 4) andVTG 3 (vg3) multiple-matched SAGE tag (occurrence 3)were extracted from ZEBRAOV. Other annotated cleavedproteins revealed by the identification of specific peptidesequences with different mass values for the protein ofinterest were ZP glycoprotein forms, alpha 1 and alpha 8like 4 tubulins, enolase 3, elongation factor 1-gamma,and nothepsin. Annotated proteins with non-cleavageposttranslational modifications, predicted by variation of

the isoelectric point of the protein of interest, were: VTGderivatives, ZP glycoproteins, alpha 1 and alpha 8 like 4tubulins, enolase 3, elongation factor 1-gamma, mito-chondrial aldehyde dehydrogenase 2, chaperonin con-taining TCP1 subunit 6A, serpin a1, creatine kinase, andpyruvate kinase.

The zebrafish follicle protein repertoire determined byproteomic analysis (Table 5) and the corresponding tran-script levels inferred from ZEBRAOV or EST count fromovary libraries ID.9767 and ID.15519 was compared withthe protein repertoire deduced from transcripts expressedat over 0.15% in human ovary cDNA libraries ID.4908,ID.5611, and ID.10552 (Table 6). Some of the proteinsidentified in zebrafish fully-grown follicles have a high,i.e. bactins, tubb2, ldhb, and rplp0, or moderate, i.e. glycer-aldehyde-3-phosphate dehydrogenase (gapd), ribosomalproteins S3 (rps3), elongation factor 1-gamma (eef1g),transcript count of homologous UniGene clusters inhuman ovary cDNA libraries. However, human clustershomologous to ribosomal proteins L7a (rpl7a) and SA(rpsa), pyruvate kinase (pkm2), and enolase 1 alpha (eno1)were highly expressed in human cDNA libraries, while thehomologous transcripts were counted at very low levels inthe zebrafish transcriptome, even if these proteins hadbeen identified by proteomic analysis. It should be notedthat ZEBRAOV contained an additional enolase familymember homologous to enolase 3 beta (eno3), with amoderate expression level (ug|Dr.25678|, 0.109%), whileits human annotated counterpart (ug|Hs.224171| wasexpressed at very low levels or not in human cDNA librar-ies used for analyses. Enolase transcript (ug|Dm.18435|)was detected in Drosophila ovary dbEST libraries ID.1058(0.052%) and ID.1059 (0.102%).

DiscussionAs in other vertebrates [1], somatic gonadal cells inzebrafish surround a single oocyte to establish a follicle[31]. The entire folliculogenesis process, from primarygrowth to post-vitellogenic stage takes about ten days inzebrafish [32]. Since large numbers of follicles at differentdevelopmental stages are easily obtained year round inthis species, zebrafish offer an excellent alternative modelfor analysing some fundamental aspects of ovarian devel-opment and regulation [33], as well as identifying con-served maternal factors [29], which are important in earlystages of embryo development. This study analysed thetranscriptome and proteome of zebrafish fully-grownovarian follicles and compared these data with other ani-mal ovary/follicle/egg molecular phenotypes publishedor available in public sequence databases.

The delineation of the transcriptome of teleost fish ovarieshas already been evaluated using large-scale EST sequenc-ing of cDNA libraries [7-9], subtractive hybridisation [11],

Page 18 of 28(page number not for citation purposes)

Page 19: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

and microarray-based analyses [12]. These large-scalestrategies were used, together with digital differential dis-play analysis, to identify expressed genes specific to ova-ries/follicles/oocytes and early embryos in mice [34-39],bovines [40-42], rats [43,44], and humans [45]. Whilethese methods give an idea of transcript abundance orenrichment in a specific tissue, it has been demonstratedthat the SAGE method is reproducible [46], provides anunbiased, quantitative report of gene expression, that maybe correlated with microarray data [47,48], and seemsmore efficient than EST-based methods for discoveringnovel genes and spliced variant transcripts [13]. However,one limitation of the SAGE method is the presence of tran-scripts that produce multiple matched tags [[49-51], thisstudy]. SAGE has been applied to human oocytes [16,17],and silkworm eggs [52], and successfully identified differ-entially expressed genes in human ovarian carcinomasand normal ovarian surface epithelium [53]. Large-scaleanalyses of proteomes from mouse oolemmal proteins[54], matured pig oocyte proteins [55], microtubule-asso-ciated proteins from Xenopus egg extracts [56], and pro-teins extracted from Drosophila oocytes [57] have alsobeen carried out. To our knowledge, no previous studyhad analysed the transcriptome and proteome profiles ofsamples of ovarian origin at the same biological stage ona large-scale.

The transcript repertoire obtained using the SAGE methodis an accurate picture of gene expression on both qualita-tive and quantitative levels and gives a global expressionprofile of transcripts present in zebrafish fully-grownovarian follicles. Sequencing of 27,486 SAGE tags identi-fied 11,399 different tag species, classified into 3,437 Uni-Gene clusters with tags in position R1 or R1cr, including3,329 tag species with an occurrence greater than one.Comparative analysis of transcriptional activity, using theZEBRAOV SAGE tag database and dbEST libraries cur-rently available for zebrafish ovaries, revealed a globallysimilar pattern between ZEBRAOV and the ID.9767library (Figure 2). However, a clearly different quantita-tive pattern was obtained with library ID.15519, due to anover-representation of the number of EST sequencesattached to a small number of unique transcripts. Thisbias is commonly observed with the EST sequencingmethod [13]. Consequently, some of the abundant tran-scripts found by the SAGE method were not detected inlibrary ID.15519. For example, a transcript moderatelysimilar to zygote arrest 1 (ZAR1), with a domain similarto the atypical homeodomain (PHD) finger motif foundin ZAR1 in vertebrate species, including zebrafish [58], ishighly expressed in ZEBRAOV, moderately in libraryID.9767, and not in library ID.15519. In a second exam-ple, the transcript of signal sequence receptor beta (ssr2),also called translocon-associated protein beta, wasdetected at high levels with SAGE, and very low levels with

dbEST ovary library sequencing (Table 2). This protein ispart of the translocon-associated protein (TRAP) complexrequired for the translocation of nascent polypeptidesinto the lumen of the endoplasmic reticulum, and the cor-responding zebrafish ssr2 mRNA is maternally supplied tothe egg [59]. We also found the ZEBRAOV databasearound 1.8 times more sensitive than EST sequencing indetecting transcripts with a characterized maternal geneticcontribution (Table 3), even if the number of cDNAclones sequenced to generate the ZEBRAOV SAGE tagsdatabase was around 21 times lower. However, thisnumber is not sufficient to identify some of these tran-scripts, as demonstrated with the maternal-effect vasatranscript that was not detected in ZEBRAOV. The vasa-like genes are expressed in the germ cells of many animalspecies [60], including zebrafish oocytes and early-stageembryos [61,62]. Furthermore, the presence ofunmatched tags in the SAGE library generated from ovar-ian follicles indicates the presence of genes in thezebrafish genome or spliced variant transcripts that hadnot previously been identified by EST data. A broadersnapshot of gene expression was therefore obtained bySAGE, as previously reported [13,46]. It should bepointed out that some of these unidentified transcripts arelargely expressed in the zebrafish fully-grown follicle tran-scriptome.

Comparison of the transcriptome of zebrafish fully-grownfollicles as evaluated by the SAGE method with ovary/eggtranscriptomes available for other animal species revealedboth similarities and differences. SAGE revealed the pres-ence of several tags corresponding to novel transcripts,some highly expressed in the zebrafish fully-grown follicletranscriptome and well-conserved in vertebrates. Asexpected, some of the most abundant transcripts identi-fied in zebrafish, corresponding to some ribosomal pro-teins or translated to housekeeping genes, including beta-actins, and tubulins, or well known ovary-enriched pro-teins, like ZP protein isoforms or cyclins [7,63], are well-conserved in the ovarian transcriptomes of other fish spe-cies. Homologous highly-expressed transcripts were alsorecovered in mammals and Xenopus transcriptomes and,to a lesser extent, in silkworm, nematode, sea urchin, andamphioxus egg profiles [34,37] (Table 4, and Results sec-tion). Some of these transcripts are members of multigenefamilies that may be widely expressed in the zebrafishfully-grown follicle transcriptome, e.g., ZP protein iso-form transcripts. This high transcript level may berestricted to a few members of other gene families, as illus-trated with claudin genes. Claudins, the major tight junc-tion transmembrane proteins, are members of thetetraspanin protein superfamily that mediate cellularadhesion and migration [64]. Numerous claudin geneshave been identified in zebrafish [65] but only the cldnbtranscript was recovered at very high levels in ZEBRAOV

Page 19 of 28(page number not for citation purposes)

Page 20: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

and library ID.9767, and cldng in library ID.9767. Someof the claudin isoform transcripts of maternal origin arethen downregulated in the early stages of zebrafish embry-ogenesis [66].

The most abundant transcript in zebrafish fully-grown fol-licles belongs to a large conserved protein family contain-ing one domain with sequence similarities to thegalactose/rhamnose-binding lectin domain found innumerous proteins with sugar binding properties. Thisdomain was initially characterized in sea urchin(Anthocidaris crassispina) egg lectin (SUEL) [67]. It wasthen characterized in rhamnose-binding lectins from ofrainbow trout eggs (Oncorhynchus mykiss), which consistof two homologous SUEL domains repeated in tandem[68]. It has been suggested that this domain plays a rolefrom egg maturation to fertilization [69]. Rhamnose-binding lectin in catfish (Parasilurus asotus) is composedof three tandem-repeat domains homologous to the SUELlectin domain [70]. A cysteine-rich domain homologousto the SUEL protein has been also identified in the N-ter-minal part of mammalian latrophilin-2 precursor protein[71].

The SAGE approach also revealed numerous transcriptshighly expressed in zebrafish that were not previouslyknown to be significantly expressed by zebrafish ovaries,including mt2, hsp90b, ldhb, atp5g, fth1, cirbp, rplp0, and40S ribosomal protein S27a. The relative abundance ofmolecules stored in oocytes may differ between speciesbut some of the abundant transcripts found in zebrafishfollicles are common highly-expressed transcripts in verte-brate ovaries/unfertilized eggs (Table 4). It is noteworthythat almost all ribosomal protein transcripts identifiedfrom the SAGE tags, expressed at over 0.15%, were recov-ered below this limit from the two selected zebrafish ovarycDNA libraries. In some cases, these differences may berelated to the loss of these small size transcripts after sizeselection of cDNAs during construction of the libraries, aprocess that did not occur using the SAGE method.

Short mt2 transcript is a very good example of the quanti-tative as well as qualitative original data obtained afterSAGE analysis. We found that mt2 was very abundantlyexpressed in zebrafish oocytes, at a level ten times higherthan that previously inferred from analysis of zebrafishovary dbEST libraries (Table 2). This difference in mt2transcript levels may be due to the loss of this small sizetranscript during cDNA library construction. An enrich-ment of this transcript in fully-grown oocytes versus ova-ries is less plausible due to the asynchronousdevelopment of zebrafish ovaries, containing oocytes atdifferent stages in development [31]. In addition, whole-mount in situ hybridization demonstrated a strong stage-dependent mt2 polarized hybridization signal in the cyto-

plasm of zebrafish oocytes (Figure 3). These data are con-sistent with the metallothionein activity content ofzebrafish oocytes [72] and the presence of this transcriptbefore the mid-blastula transition of the embryo [73].Metallothionein transcripts were also recovered from seaurchin egg and salmon ovary dbEST libraries, as well aslizard ovarian follicles (Podarcis sicula), with the highestlevel in ovulated eggs [74]. Expression of the rat Mt2 geneis also strongly regulated during primordial follicle assem-bly and development in rat ovaries [44]. SAGE may alsohelp to distinguish between the expressions of several iso-forms at the 3'-end of a transcript. In the same UniGenemt2 cluster a second mt2 transcript, identified with its insilico SAGE tag, contained an identical sequence in thecoding region but a long untranslated 3'-part. This longtranscript was not expressed in zebrafish fully-grown fol-licles. It should be noted that differential expression of 3'-end transcript isoforms was easily identified using theSAGE method, as also demonstrated with the ccnb2 tran-script.

In addition to the mt2 short transcript, abundant tran-scripts of heavy chain ferritins, including fth1, related tometal binding, were also detected in ZEBRAOV. This is inaccordance with a disproportionately high number ofsalmon ovary assembled ESTs seen in GO categoriesrelated to heavy metal (copper, iron, and zinc) [9] and thepresence of ferritin H mRNA in rainbow trout (Oncorhyn-chus mykiss) eggs [75]. Homologous genes to zebrafishfth1 are expressed in all vertebrate ovary dbEST librariesavailable at UniGene, with very high relative levels insalmon, swine, dog, and human libraries. Ferritin-con-taining inclusions were demonstrated in yolk platelets ofschistosome (Schistosoma mansoni) [76], a species inwhich a female-specific yolk ferritin transcript is expressedat high levels in the vitellarium [77]. Ferritin also occursin amphibian [78,79] and snail [80] eggs. It should benoted that high-level expression of ferritin H-chain mRNAis observed in metastatic human ovarian tumours [81].

The second significant difference in transcript abundancebetween zebrafish fully-grown follicle transcriptomes asevaluated by SAGE and the profile defined in ovary cDNAlibraries concerned the hsp90β transcript (Table 2). Exten-sive molecular characterization, including zebrafish tran-scripts, and biochemical studies have revealed thatvertebrate members of the heat shock protein 90 (HSP90)family play a post-translational regulatory role within thecell by interacting with several important cellular signal-ling molecules and transcription factors, such as steroidreceptors, and modulating their activity [82]. Homolo-gous transcripts are highly expressed in mouse andhuman ovaries (Table 4) and a strong HSP90 immunore-activity was demonstrated in rat primordial germ cells[83]. This signal was also detected in both male and

Page 20 of 28(page number not for citation purposes)

Page 21: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

female pre-meiotic germ cells. HSP90 was also identifiedas one of the highly abundant proteins in mature mouseeggs and is strongly associated with the plasma membrane[54]. In addition, hsp83 transcript, the Drosophila homo-logue of the mammalian Hsp90 family of regulatorymolecular chaperones, is present at high levels throughthe end of oogenesis and both maternal and zygotic tran-scripts are spatially restricted during early embryo devel-opment [84]. All these data are consistent with the hightranscript level of hsp90β in zebrafish ovaries, whereas ahigh number of hsp90β cDNA clones was observed in thelibrary generated from testes but not ovaries [7]. The largediscrepancy in the relative level of hsp90β transcriptsobserved between the SAGE and cDNA approaches maybe related to an enrichment of this transcript in the termi-nal stages of folliculogenesis.

Other transcripts highly expressed in zebrafish folliclesand consistently represented in vertebrate ovarian tran-scriptomes are transcripts of ATP synthase, H+ transport-ing, mitochondrial F0 complex, subunit c (ATP5G), cold-inducible RNA-binding protein (CIRBP), and lactatedehydrogenase B4. atp5g is highly expressed in fish ova-ries and the encoded protein is one of the chains of thenonenzymatic membrane component (F0) of mitochon-drial ATPase in mitochondrial membrane. CIRBP appar-ently plays an essential role in cold-induced suppressionof cell proliferation [85]. One of the Xenopus CIRBPhomologues is a major RNA-binding protein in fully-grown oocytes and may be involved in translational regu-lation via modulation of oocyte ribosomal function [86].Lactate dehydrogenase B transcripts are widely distributedin animal ovarian transcriptomes, with high levels foundin mice and humans. It has been previously demonstratedthat lactate dehydrogenase B mRNA is one of the mostabundant transcripts in fully-grown mouse oocytes [87].Lactate dehydrogenase mRNA appears to be translatedefficiently during oocyte growth and then downregulatedduring maturation and after fertilization [88].

The egg is a transcriptionally inactive cell and, as such, isa storehouse of maternal mRNA and proteins required forfertilization and initiation of zygotic development. How-ever, many of the proteins comprising the animal egg pro-teome have yet to be identified, as very few large-scaleproteome analyses have been performed. As expected, thezebrafish follicle protein repertoire, determined by pro-teome analysis, identified ribosomal proteins, ZP familyprotein members, components of the cytoskeleton, chap-eronins, heat shock proteins, and VTG derivatives, butalso some proteins not previously reported in ovary pro-tein repertoires, e.g. a Sjogren syndrome antigen B homol-ogous protein (Table 5). This RNA-binding protein bindsto several small cytoplasmic RNA molecules, known as YRNAs, and may stabilize these RNAs, preventing degrada-

tion [89]. At least eight proteins, out of a total of thirty-eight deduced using the SAGE assigned transcript methodand expressed at over 0.15%, were identified by proteomeanalysis. The identification of abundant mRNAs withoutthe corresponding translated proteins may be due toinsufficient proteome delineation and/or the presence ofoocyte stage-specific maternal transcripts, stored insidethe oocyte cytoplasm and translated during early embryodevelopment. There were several proteins distributed inover one spot position after 1D-, 2D-PAGE separation andMS/MS (Table 5). While some of them, e.g. creatinekinase (CK), were present in closely isoelectric focusinglocated spots, suggesting the presence of isoforms or post-translational modifications, the distribution of otherspots, e.g. VTG derivatives, indicates a cleavage of precur-sor proteins with the presence of lower-molecular-weightderivative fragments. CKs play crucial roles in intracellularenergy transfer and expression of a CK brain-type isoen-zyme during oogenesis has been demonstrated in rodents[90,91]. A homologous transcript was also identified athigh levels in amphioxus eggs (ug|Bfl.4313|, 0,119%). ckbmRNA is shown to be maternally supplied in zebrafishembryos [92].

Comparison of the zebrafish follicle protein repertoirededuced from SAGE with the protein repertoire isolatedafter proteomic analysis revealed that some abundanttranscripts identified by their SAGE tags, but not previ-ously reported to be present in abundance in fish ovaries,had corresponding proteins. This was the case of lactatedehydrogenase B4 and, to a lesser extent, ribosomal pro-tein large P0 (Tables 2 and 5). Comparison also revealedthat bactin1 and bactin2 transcripts were differentiallyexpressed in zebrafish ovarian follicles, but their proteinsequences were not resolved due to the very high sequenceconservation of these duplicated gene copies. On theother hand, some ZP family protein members could bediscriminated on the protein level, while the same SAGEtags were generated with zp2.3 and zp2.4 or zp3a and zp3altranscripts, due to the high sequence similarities of the 3'-end untranslated part of these transcripts.

Oocyte growth, particularly in oviparous species, is char-acterized by intense deposition of RNAs and proteins, notnecessary of the same nature and origin. These maternalfactors can be stored for very long periods of time untiltheir use during embryonic development. Comparison oftranscriptome and proteome data revealed that transcriptlevels provide little predictive value with respect to theextend of protein abundance, taking into account the factthat the protein identification approach used detects rela-tively abundant proteins from the biological extract, whilethe mRNA abundance evaluated by SAGE tag frequencyvaried by over two orders of magnitude (Table 5). Tran-script profiling provides a measure of RNA abundance,

Page 21 of 28(page number not for citation purposes)

Page 22: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

which may be affected not only by transcription levels butalso by RNA processing and degradation. Moreover, notall transcripts are translated and RNA abundance may notcorrespond to protein levels. High transcription and trans-lation rates during folliculogenesis and oocyte growth arefollowed by differential translational silencing and degra-dation of many mRNA species, especially at the end of theoocyte growth phase [2,93]. The identification ofzebrafish follicle proteins, e.g. pyruvate kinase and eno-lase I, by proteome analysis, with very low correspondingtranscript levels but very high homologous transcriptcounts in human ovary transcriptomes used as an externalreference, suggests a downregulation of the quantity ofthese transcripts and storage of the proteins at the end ofzebrafish folliculogenesis.

A comparison of transcriptome and proteome datarevealed two proteins encoded by chia and nots withoutcorresponding transcripts in ZEBRAOV. chia is related tothe multiple chitinases genes identified in rainbow troutand Japanese flounder [94], as well as, to a lesser extent,the acidic mammalian chitinase precursor in humans[95]. While the molecular functions of these proteins arerelated to chitin binding and chitinase activity, the func-tionality and origin of the protein identified in zebrafishfully-grown follicles remains to be determined. However,the presence of a small amount of chia transcript in mul-tiple follicular stage zebrafish cDNA libraries ID.9767,ID.15519 supports a stage-specific transcription of thisgene during zebrafish folliculogenesis as previously dem-onstrated by the downregulation of the transcription ofsome fish maternal genes, e.g. VTG/very-low density lipo-protein receptor [96] at the end of oogenesis. A high rateof protein deposition has also been observed duringoocyte growth in oviparous species via a receptor-medi-ated endocytosis of exogenous precursors. The presence ofan abundant protein in the repertoire without a corre-sponding transcript in ZEBRAOV may be due to endocy-tosis of the protein from the plasma to the oocyte.Zebrafish vg1 and vg3 are mainly expressed in the liverand, to a lesser extent, in several non-liver tissues, includ-ing the adipocytes associated with several organs, such asovaries [97]. This may explain the presence of a limitedamount of vg1 and vg3 transcripts in ZEBRAOV (Table 5).These precursor proteins are synthesized outside oocytesduring vitellogenesis, specifically incorporated in theoocyte by receptor-mediated endocytosis and cleaved intoyolk proteins. The identification of nothepsin in zebrafishfully-grown follicles by proteome analysis although notranscript was detected in the ovary by Northern blot [98],EST sequencing of ovary cDNA libraries, or SAGE (thisstudy), strongly suggests an extraovarian origin for thisenzyme that may be present in the plasma of femalesundergoing vitellogenesis. Zebrafish nots encodes a paral-ogous aspartic proteinase related to endoproteolytic pro-

teinases, such as cathepsin D, cathepsin E, and pepsin.This gene is specifically expressed in the liver under estro-genic control [99]. The sexual dimorphic expression ofnots may be related to the reproductive process, like VTGprecursor processing, or other sex-specific proteins insidethe oocyte cytoplasm.

ConclusionThis study provides a complete sequence data set of mater-nal mRNA stored in zebrafish germ cells at the end of oog-enesis. This catalogue contains highly-expressedtranscripts that were not previously known to be signifi-cantly expressed in the fish ovaries, including some thatare part of a vertebrate ovarian expressed gene signature.Comparison of transcriptome and proteome data identi-fied downregulated transcripts or proteins potentiallyincorporated in the oocyte by endocytosis. The molecularphenotype described provides groundwork for futureexperimental approaches aimed at identifying function-ally important maternal transcripts and proteins involvedin oogenesis and early stages of embryo development.

MethodsIsolation of fully-grown ovarian folliclesZebrafish, Danio rerio, were obtained from our facilitiesand maintained at 28.5°C on a 12L:12D photoperiod.Salts (0.23 g/l Instant Ocean, Aquarium System, Inc, Men-tor, USA and 0.1 g/l CaSO4, 2H2O) were added to reverseosmosis (Optima 60, Veolia Water STI, Blagnac, France)purified water in order to ensure an optimal water quality.The zebrafish ovaries undergo asynchronous develop-ment and oocyte development is divided into five stages:I (primary growth), II (cortical alveolus or pre-vitello-genic), III (vitellogenic), IV (maturation), and V (matureegg) [31]. Sexually mature females were anaesthetized byimmersion in 2-phenoxyethanol (1/2000, v/v) and fully-grown follicles (diameter > 0.69 mm) [31] were obtainedby gently squeezing the abdomen. Histological analysisdemonstrated that, in some areas, the stripped oocyteswere covered with a single layer of granulosa cells,attached to the zona radiata (data not shown).

SAGE library constructionTotal RNAs were isolated from about 250 fully-grown fol-licles, isolated from five none in-bred mature females,using the NucleoSpin RNAII kit (Macherey-Nagel, Duren,Germany). Fifty micrograms of total RNAs were used togenerate the SAGE library using the I-SAGE kit (Invitro-gen, Cergy Pontoise, France). The SAGE library was con-structed according to the manufacturer's instructions,with minor modifications. Briefly, total RNAs were boundto magnetic Dynal oligo(dT) beads and cDNAs were syn-thesized directly on oligo(dT) beads. cDNAs were digestedwith NlaIII at 37°C for 1 h and divided into two equalparts, pools A and B. These pools were ligated at 16°C for

Page 22 of 28(page number not for citation purposes)

Page 23: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

2 h to specific adapters, adapX (5'-TTTGGATTTGCTGGT-GCAGTACAACTAGGCTTAATAGGGACATG-3') andadapY (5'-TTTCTGCTCGAATTCAAGCTTCTAACGATG-TACGGGGACATG-3'), containing the priming sites forPCR amplification at the 5'-end and the type IIS restrictionendonuclease BsmFI site at the 3'-end. The 3'-ends of theadapters were modified with an amino group to preventself-ligation. The two ligation products were then cleavedwith the tagging enzyme BsmFI at 65°C for 1 h. The result-ing tags from pools A and B were ligated at 16°C over-night in a 3 µl mixture to form ditag cassettes. The ligatedditag mixture was diluted 1:110 (v/v), and 1 µl was usedin a 50 µl PCR mixture. A total of 300 ditag PCR amplifi-cations were performed for 33 cycles using the followingprimers derived from adapX (5'GGATTTGCTGGTGCAGT-ACA-3') and adapY (5'-CTGCTCGAATTCAAGCTTCT-3'),respectively. Individual ditag PCR products (100 bp) werepurified on 12% (w/v) polyacrylamide gel. Adapters (40bp) were removed from ditags by NlaIII digestion at 37°Cfor 2 h 30 min and ditags without adapters (26 bp) werepurified on 12% (w/v) polyacrylamide gel. Purified ditags(26 bp) were ligated together at 16°C overnight andresolved in an 8% (w/v) polyacrylamide gel. Concatemerfractions ranging from 0.3 to 0.6 kb, 0.6 to 1.5 kb, andover 1.5 kb, were purified separately. The purified con-catemers were cloned into the SphI site of the pZErO-1plasmid (Invitrogen, Cergy Pontoise, France). The ligatedmixture was transformed into One Shot TOP10 electro-competent cells (Invitrogen, Cergy Pontoise, France). Pos-itive transformants were selected by plating on low-saltLuria-Bertani plates containing 50 µg/ml Zeocin and incu-bating at 37°C for 24 h. The concatemer sizes werescreened for 10% of total clones by colony PCR using Sp6forward and T7 reverse primers.

SAGE library sequencingHigh throughput sequencing reactions were carried out byGenome Express (Meylan, France). Automatic tag detec-tion, extraction, counting and quality control were under-taken by Skuld-Tech (Montpellier, France). Ditags longerthan 50 bp (2.97%), fewer than 24 bp (0.4%), or repeatedditags (2.69%) were discarded and not taken into accountfor tag number calculation. Contamination rates withlinker and ribosomal 18 S and 28 S RNA sequences were0.2% and 0.22%, respectively. The ZEBRAOV was gener-ated from 576 sequenced clones containing inserts origi-nating from the 0.6 to 1.5 kb concatemer fraction.Consequently, 96% of the sequenced clones have con-catemers with over 500 bp, resulting in an average SAGEtag number of 47 per concatemer.

Data analysis and tag-to-gene mappingTo our knowledge, we were the first to carry out SAGEanalysis on a fish species. Consequently, linking an exper-imental SAGE tag to an annotated transcript required the

development of a species-specific SAGE tag database fromthe tags extracted from ESTs or transcripts available for aparticular fish species. FISHTAG, a generic computer pack-age written in PERL and implemented on a UNIX worksta-tion, was designed for automatic extraction andannotation of SAGE tags from the sequences deposited inUniGene and TIGR public sequence databases (Rousselotet al., in preparation). Briefly, three 14-long nucleotide insilico tags, identified with the 5'-CATG ending sequence,were extracted from each reference transcript sequence atthe first three sense positions, starting from the 3'-end ofthe transcript, and named R1, R2, or R3 in the presence ofa polyadenylation signal and poly(A) tail. The tags werenamed Rn1, Rn2, Rn3 for EST sequences without a polya-denylation signal, a poly(A) tail, or both. Furthermore,the corresponding in silico tags of complementary reversedantisense sequences of deposited ESTs called "3'-reads"were also extracted and named R1cr, R2cr, or R3cr in thepresence of a polyadenylation signal and poly(A) tail. Thetags were named Rn1cr, Rn2cr, Rn3cr for EST sequenceswithout a polyadenylation signal, a poly(A) tail or both.A table was constructed and additional data, like EST clus-ter descriptions, GenBank™ annotations, and URL links tothe UniGene or TIGR sites were attached to each extractedin silico tag. This table was then used as a reference for tag-to-gene mapping by comparison between the experimen-tal SAGE tags and the EST-derived in silico extracted SAGEtags.

The zebrafish sequence data were downloaded from Uni-Gene and TIGR ftp sites [100,101]. There were 673,076public UniGene sequence entries (Build#87), including99,968 3'-read entries, in 31,681 UniGene clusters. Dataretrieved from TIGR contained 33,752 unique TC leadersequences (release 16.0). The EST sequences used andincluded in UniGene were from more than two-hundreddbEST libraries originating from at least twenty differentzebrafish tissues or developmental stages [102]. The in sil-ico zebrafish SAGE-tag database, ZEBRATAG, generatedusing FISHTAG, was used to combine the copy number ofeach experimental SAGE tag from the ZEBRAOV librarywith its annotation.

Ovary/egg cDNA library sequence data usedThe teleost fish ovary dbEST cDNA libraries currentlyavailable at the GenBank™ database and used to analysetheir EST sequences were: ID.15519 (NIH_ZGC_5) with13,029 EST sequences classified into 2,610 clusters andID.9767 (Gong zebrafish ovary), with 11,344 ESTsequences classified into 2,794 clusters, for zebrafish(Danio rerio) (UniGene Build#87), ID.15587 (AGENAErainbow trout normalized ovarian library) with 4,936 ESTsequences classified into 274 clusters for rainbow trout(Oncorhynchus mykiss) (UniGene Build#16), ID.11967(Fugu UT13 adult ovary) with 2,545 EST sequences classi-

Page 23 of 28(page number not for citation purposes)

Page 24: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

fied into 705 clusters for Fugu (Takifugu rubripes) (Uni-Gene Build#13), and ID.15459 (Atlantic salmon ovarycDNA library) with 2,269 EST sequences classified into849 clusters for salmon (Salmo salar) (UniGene Build#8).

The other vertebrate dbEST cDNA library accession num-bers used for comparison were ID.10552(NIH_MGC_109), with 10,652 sequences classified into2,973 UniGene entries, ID.4908 (NIH_MGC_9), with12,849 sequences classified into 3,289 UniGene entries,and ID.5611 (NIH_MGC_66) with 8,540 sequences clas-sified into 3,060 UniGene entries for human (Homo sapi-ens) ovaries (UniGene Build#187), ID.15096 (full-lengthenriched swine cDNA library, adult ovary) with 26,127sequences classified into 4,770 UniGene entries for swine(Sus scrofa) (UniGene Build#34), ID.10029 (NIA mouseunfertilized egg cDNA library, long) with 13,201sequences classified into 1,939 UniGene entries, ID.1389(mouse unfertilized egg cDNA library) with 3,098sequences classified into 1,228 UniGene entries, andID.14142 (NIA mouse unfertilized egg cDNA library, long1) with 6,443 sequences classified into 1,821 UniGene

entries for mouse (Mus musculus) oocytes (UniGeneBuild#148), and ID.9909 (XGC-egg) with 65,236sequences classified into 7,968 UniGene entries for Xeno-pus tropicalis (UniGene Build#26).

The other non-vertebrate dbEST cDNA library accessionnumbers used for comparison were ID.13749, with 6,003EST sequences classified into 1,775 clusters for unferti-lised sea urchin (Strongylocentrotus purpuratus) eggs (Uni-Gene Build#9), ID.17404, with 38,522 EST sequencesclassified into 3,720 clusters for amphioxus (Branchios-toma floridae) eggs (UniGene Build#1), and ID.1977, with53,273 EST sequences classified into 5,777 clusters for fer-tilized nematode (Caenorhabditis elegans) eggs (UniGeneBuild#25). In addition, a silkworm (Bombyx mori) eggSAGE tag library [52] was also used.

Whole-mount in situ hybridizationFollicles were fixed in 4% paraformaldehyde overnight at4°C, rinsed three times in phosphate buffered saline(PBS) buffer, twice in methanol, and stored at -20°C inmethanol until used. A first primer pair (sense oligonucle-

Electrophoretic pattern of zebrafish fully-grown follicle proteinsFigure 4Electrophoretic pattern of zebrafish fully-grown follicle proteins. A, SDS-polyacrylamide gradient 8–16% slab gel of total yolk proteins with increasing amounts of proteins deposited per lane. Eleven gel slices, numbered I to XI, were cut after staining the gel with See-Band Forte (GeBA). B, Two-dimensional polyacrylamide gel electrophoresis of total yolk proteins. Twenty-four spot areas, numbered 1 to 24, were excised after staining the gel with GeBA. In both procedures, gel pieces were in-gel digested with trypsin and the resulting peptides identified by mass spectrometry (see Table 5). The rectangles from which VTG derivatives were isolated are identified with a solid lined line while the rectangles with no derivatives (numbers 8, 9, and 20) are labelled with a broken line. M, relative molecular weights of the standards × 10-3.

Page 24 of 28(page number not for citation purposes)

Page 25: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

otide 5'-CAGCAGAATCATGCGCTC-3' and antisense oli-gonucleotide 5'-CAACATATGGACGACAGG-3') was usedto amplify a 611 bp of a cDNA fragment from nucleotides-10 to +601 (numbered from the translation initiatorcodon) of a transcript (GenBank™/EBI (gb) Data Bankaccession number gb|BM141044|) with similarities tohuman latrophilin-2, identified as a rhamnose-bindinglectin. A second primer pair (sense oligonucleotide 5'-GACTGGAACTTGCAACTG-3' and antisense oligonucle-otide 5'-GACGGTACAGGAAACAGAT-3') was used toamplify a 445 bp cDNA fragment from nucleotides +24 to+468 of mt2 short transcript (gb|BC049475|). A thirdprimer pair (sense oligonucleotide 5'-TATCCAGAAG-CATCGTCAGG-3' and antisense oligonucleotide 5'-CCT-TCACATCACACTCATGC-3') was used to amplify a 764bp cDNA fragment from nucleotides +57 to +820 of dazltranscript (gb|NM_131524|). A fourth primer pair (senseoligonucleotide 5'-AGGCTGCTTCAGGAGACC-3' andantisense oligonucleotide 5'-CCTAAAGAAGTGACGG-TACC-3') was used to amplify a 769 bp cDNA fragmentfrom nucleotides +550 to +1318 of ccnb1 transcript(gb|AB040435|). These amplified cDNA fragments wereused as templates to generate the corresponding RNAprobes. Both antisense- and sense-digoxigenin- and fluo-rescein-labelled RNA probes were obtained using T7 orSP6 RNA polymerase (Promega, France) and the digoxi-genin/fluoresceine RNA labelling mix (Roche, Germany),following the manufacturer's instructions. RNA probeswere purified using the ProbeQuant G-50 micro columns(Amersham Biosciences, England) and checked for purityby denaturing agarose gel electrophoresis. Whole-mountin situ hybridization was carried out as previouslydescribed http://zfin.org/zf_info/zfbook/chapt9/9.82.html, with minor modifications. The PBS buffer usedcontained 0.04% (w/v) KCl. Samples were treated with 20µg/ml proteinase K for 20 min and hybridized at 58°Cwith 50% formamide. Two-colour whole-mount in situhybridization was carried out with a fluorescein-labelledRNA probe to detect the most-strongly expressed gene,then with the digoxigenin-labelled RNA probe. Sampleswere mounted in 100% glycerol and observed under anEclipse E1000 Nikon microscope. Histology was per-formed according to the procedure previously described[103].

Protein extractionProteins were extracted from zebrafish fully-grown folli-cles using TRI Reagent™ (Sigma-Aldrich, U.S.A.) [104]. A100 µl aliquot of frozen follicles in L-15 was mixed with 1ml of cold TRI Reagent and incubated at room tempera-ture for 10 min. 200 µl chloroform were added andmixed, incubated at room temperature for 10 min andcentrifuged at 10,000 g at 4°C for 10 min. The proteinphase was separated from the DNA and RNA and the pro-teins precipitated with 300 µl 100% ethanol at room tem-

perature for 10 min. After centrifugation at 2,000 g at 4°Cfor 10 min, the supernatant containing the proteins wasprecipitated with isopropanol at room temperature for 10minutes, and then centrifuged at 10,000 g at 4°C for 10min. The protein pellet was washed twice with 2 ml 0.3 Mguanidine hydrochloride in 95% ethanol and incubatedfor 20 min at room temperature, then centrifuged at 6,500g, at 4°C for 10 min. The pellet was further washed threetimes with 10 ml 100% ethanol for 20 min at room tem-perature and centrifuged at 6,500 g, in 4°C for 10 min.The remaining salts were removed by adding 1.5 ml 80%cold acetone (-20°C) and centrifuging at 6,000 g at 4°Cfor 10 min three times. The air-dried pellet was used for1D- and 2D-PAGE.

1D-, 2D-PAGE and tandem mass spectrometryProtein pellets were dissolved in 2D-PAGE sample buffercontaining 7 M urea, 2 M thiourea, 2% CHAPS, 65 mMDTT, 1.25% ampholytes (pH 3–10, BioRad) with a traceof bromophenol blue. The protein amounts were assayedwith the Bradford reagent (BioRad, Hercules, U.S.A.). Forisoelectric focusing, 600 µg of each protein sample wereapplied to 11 cm IPG Strips (pH 3–10 NL, BioRad). Isoe-lectric focusing was at 70,000 Volt/hr with an IPG-PhorSystem (Amersham Biosciences), followed by second-dimension electrophoresis, using pre-cast Criterion Tris-HCl gels (4–20% Linear Gradient, BioRad). 1D SDS-PAGE (133 × 87 × 1 mm) was performed on a home-madepolyacrylamide gradient gel 8–16%. The gels were stainedwith See-Band Forte (GeBA). Images were taken with aFlourS imager (BioRad) and the images were analysedusing PDQuest software (BioRad). Spots were excised, in-gel digested with trypsin, and identified either by peptidemass fingerprinting and CID using a 4700 MALDI-TOF-TOF mass spectrometer (Applied Biosystems) or a nano-capillary RP-HPLC and ESI-QIT mass spectrometer (LCQ-Deca, ThermoFinnigan). The MS data was analysed usingSequest [105], Pep-Miner [106] and Mascot [107] soft-ware tools and searching the NCBInr or ZFIN Zebrafishdatabases. Each peptide identified was then manuallychecked against the corresponding Swiss-Prot/TrEMBL/PIR UniProt Data Bank entry at the UniProt web site[108]. The presence of a protein in zebrafish fully-grownfollicle protein extract was confirmed when at least onepeptide sequence perfectly matching the target UniProtprotein entry appeared in the proteome database at leasttwice.

Authors' contributionsA.K-G., carried out the genomics and bioinformatics stud-ies, A.M. participated in the genomics as well as themolecular biology experiments, T.G. carried out the pro-teomics studies, J.F. performed the in situ hybridizationexperiments, A.A. participated in the proteomic analysis,and P.J.B. carried out the computational analyses, coordi-

Page 25 of 28(page number not for citation purposes)

Page 26: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

nated the work, and drafted the manuscript. All authorsread and approved the final manuscript.

AcknowledgementsWe wish thank Celine Rousselot for her help in software tools develop-ment. We wish also to acknowledge David Piquemal for advice regarding SAGE data analyses. This work was supported by the Commission of the European Communities, grant Q5S-2002-00784 "CRYOCYTE" program "Quality of Life and Management of Living Resources".

References1. Matova N, Cooley L: Comparative aspects of animal oogenesis.

Dev Biol 2001, 231:291-320.2. Eichenlaub-Ritter U, Peschke M: Expression in in-vivo and in-

vitro growing and maturing oocytes: focus on regulation ofexpression at the translational level. Hum Reprod Update 2002,8:21-41.

3. Duranthon V, Renard J-P: Storage and functional recovery ofmolecules in oocytes. In Biology and Pathology of the Oocyte: its rolein Fertility and Reproduction Medicine Edited by: Trounson A, Gosden R.Cambridge University Press; 2003:81-112.

4. The Danio rerio Sequencing Project 2006 [http://www.sanger.ac.uk/Projects/D_rerio/].

5. Expressed Sequence Tags database 2006 [http://www.ncbi.nlm.nih.gov/dbEST/index.html].

6. Boguski MS, Lowe TM, Tolstoshev CM: dbEST – database for"expressed sequence tags". Nat Genet 1993, 4:332-333.

7. Zeng S, Gong Z: Expressed sequence tag analysis of expressedprofiles of zebrafish testis and ovary. Gene 2002, 294:45-53.

8. Davey GC, Caplice NC, Martin SA, Powell R: A survey of genes inthe Atlantic salmon (Salmo salar) as identified by expressedsequence tags. Gene 2001, 263:121-130. 269:229

9. Rise ML, von Schalburg KR, Brown GD, Mawer MA, Devlin RH,Kuipers N, Busby M, Beetz-Sargent M, Alberto R, Gibbs AR, Hunt P,Shukin R, Zeznik JA, Nelson C, Jones SR, Smailus DE, Jones SJ, ScheinJE, Marra MA, Butterfield YS, Stott JM, Ng SH, Davidson WS, KoopBF: Development and application of a salmonid EST databaseand cDNA microarray: data mining and interspecific hybrid-ization characteristics. Genome Res 2004, 14:478-490.

10. Wang Z, Brown DD: A gene expression screen. Proc Natl Acad SciUSA 1991, 88:11505-11509.

11. Kanamori A: Systematic identification of genes expressed dur-ing early oogenesis in medaka. Mol Reprod Dev 2000, 55:31-36.

12. von Schalburg KR, Rise ML, Brown GD, Davidson WS, Koop BF: Acomprehensive survey of the genes involved in maturationand development of the rainbow trout ovary. Biol Reprod 2005,72:687-699.

13. Chen J, Sun M, Lee S, Zhou G, Rowley JD, Wang SM: Identifyingnovel transcripts and novel genes in the human genome byusing novel SAGE tags. Proc Natl Acad Sci USA 2002,99:12257-12262.

14. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW: Serial analysisof gene expression. Science 1995, 270:368-369.

15. Gowda M, Jantasuriyarat C, Dean RA, Wang G: Robust-LongSAGE(RL-SAGE): a substantially improved LongSAGE method forgene discovery and transcriptome analysis. Plant Physiol 2004,134:890-897.

16. Neilson L, Andalibi A, Kang D, Coutifaris C, Strauss JF 3rd, StantonJA, Green DP: Molecular phenotype of the human oocyte byPCR-SAGE. Genomics 2000, 63:13-24.

17. Stanton JL, Bascand M, Fisher L, Quinn M, Macgregor A, Green DP:Gene expression profiling of human GV oocytes: an analysisof a profile obtained by Serial Analysis of Gene Expression(SAGE). J Reprod Immunol 2002, 53:193-201.

18. Tuteja R, Tuteja N: Serial analysis of gene expression: Applica-tions in human studies. J Biomed Biotechnol 2004, 2004:113-120.

19. Aebersold R, Mann M: Mass spectrometry-based proteomics.Nature 2003, 422:198-207.

20. Serial Analysis of gene Expression Tag to Gene Mapping,SAGEmap 2006 [http://www.ncbi.nlm.nih.gov/SAGE/].

21. Mouse SAGE Site 2006 [http://mouse.biomed.cas.cz/sage/].22. The Cancer Genome Anatomy Project. SAGE Genie 2006

[http://cgap.nci.nih.gov/SAGE].

23. Melbourne Brain Genome Project 2006 [http://www.mbgproject.org/].

24. Lash AE, Tolstoshev CM, Wagner L, Schuler GD, Strausberg RL, Rig-gins GJ, Altschul SF: SAGEmap: a public gene expressionresource. Genome Res 2000, 10:1051-1060.

25. Gene Expression Omnibus 2006 [http://www.ncbi.nlm.nih.gov/geo/].

26. Pfam 2006 [http://www.sanger.ac.uk/Software/Pfam/].27. The Gene Ontology 2006 [http://www.geneontology.org/].28. ZFIN. The Zebrafish Information Network 2006 [http://

zfin.org/cgi-bin/webdriver?MIval=aa-ZDB_home.apg].29. Pelegri F: Maternal factors in zebrafish development. Dev Dyn

2003, 228:535-554.30. McGinnis S, Madden TL: BLAST: at the core of a powerful and

diverse set of sequence analysis tools. Nucleic Acids Res2004:W20-W25.

31. Selman K, Wallace RA, Sarka A, Qi X: Stages of oocyte develop-ment in the zebrafish Brachydanio rerio. J Morphol 1993,218:203-224.

32. Wang Y, Ge W: Developmental profiles of activin βA, βB, andfollistatin expression in the zebrafish ovary: evidence fortheir differential roles during sexual maturation and ovula-tory cycle. Biol Reprod 2004, 71:2056-2064.

33. Ge W: Intrafollicular paracrine communication in thezebrafish ovary: The state of the art of an emerging modelfor the study of vertebrate folliculogenesis. Mol Cell Endocrinol2005, 237:1-10.

34. Stanton JL, Green DPL: A set of 840 mouse oocyte genes withwell-matched human homologues. Mol Hum Reprod 2001,7:521-543.

35. Zeng F, Schultz RM: Gene expression in mouse oocytes andpreimplantation embryos: use of suppression subtractivehybridization to identify oocyte- and embryo-specific genes.Biol Reprod 2003, 68:31-39.

36. Sharov AA, Piao Y, Matoba R, Dudekula DB, Qian Y, et al.: Tran-scriptome analysis of mouse stem cells and early embryos.PLoS Biol 2003, 1:e74.

37. Tanaka M, Hennebold JD, Miyakoshi K, Teranishi T, Ueno K, AdashiEY: The generation and characterization of an ovary-selec-tive cDNA library. Mol Cell Endocrinol 2003, 202:67-69.

38. Paillisson A, Dadé S, Callebaut I, Bontoux M, Dalbiès-Tran R, VaimanD, Monget P: Identification, characterization and metagen-ome analysis of oocyte-specific genes organized in clusters inthe mouse genome. BMC Genomics 2005, 6:76.

39. Small CL, Shima JE, Uzumcu M, Skinner M., Griswold MD: Profilinggene expression during the differentiation and developmentof the murine embryonic gonad. Biol Reprod 2005, 72:492-501.

40. Dalbiès-Tran R, Mermillod P: Use of heterologous complemen-tary DNA array screening to analyse bovine oocyte tran-scriptome and its evolution during in vitro maturation. BiolReprod 2003, 68:252-261.

41. Yao J, Ren X, Ireland JJ, Coussens PM, Smith TPL, Smith GW: Gen-eration of a bovine oocyte cDNA library and microarray:resources for identification of genes important for folliculardevelopment and early embryogenesis. Physiol Genomics 2004,19:84-92.

42. Pennetier S, Uzbekova S, Guyader-Joly C, Humblot P, Mermillot P,Dalbiès-Tran R: Genes preferentially expressed in bovineoocytes revealed by subtractive and suppressive hybridiza-tion. Biol Reprod 2005, 73:713-720.

43. Leo CP, Pisarska MD, Hsueh AJ: DNA array analysis of changesin preovulatory gene expression in the rat ovary. Biol Reprod2001, 65:269-276.

44. Kezele PR, Ague JM, Nilsson E, Skinner MK: Alterations in theovarian transcriptome during primordial follicle assemblyand development. Biol Reprod 2005, 72:241-255.

45. Stanton JL, Macgregor AB, Grenn DPL: Using expressed sequencetag databases to identify ovarian genes of interest. Mol CellEndocrinol 2002, 191:11-14.

46. Dinel S, Bolduc C, Belleau P, Boivin A, Yoshioka M, Calvo E, PiedboeufB, Snyder EE, Labrie F, St-Amand J: Reproducibility, bioinformaticanalysis and power of the SAGE method to evaluate changesin transcriptome. Nucleic Acids Res 2005, 33:e26.

47. Ibrahim AFM, Hedley PE, Cardle L, Kruger W, Marshall DF, Muehl-bauer GJ, Waugh R: A comparative analysis of transcript abun-

Page 26 of 28(page number not for citation purposes)

Page 27: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

dance using SAGE and Affymetrix arrays. Funct Integr Genomics2005, 5:163-174.

48. van Ruissen F, Ruijter JM, Schaaf GJ, Asgharnegad L, Zwijnenburg DA,Kool M, Bass F: Evaluation of the similarity of gene expressiondata estimated with SAGE and Affymetrix GeneChips. BMCGenomics 2005, 6:91.

49. Welle S, Bhatt K, Thornton CA: Inventory of high-abundancemRNAs in skeletal muscle of normal men. Genome Res 1999,9:506-513.

50. Lorenz WW, Dean JFD: SAGE profiling and demonstration ofdifferential gene expression along the axial developmentalgradient of lignifying xylem in loblolly pine (Pinus taeda). TreePhysiol 2002, 22:301-310.

51. Fizames C, Munos S, Cazettes C, Nacry P, Boucherez J, Gaymard F,Piquemal D, Delorme V, Commes TS, Doumas P, et al.: The Arabi-dopsis root transcriptome by serial analysis of gene expres-sion. Gene identification using the genome sequence. PlantPhysiol 2004, 134:67-80.

52. Huang J, Miao X, Jin W, Couble P, Mita K, Zhang Y, Liu W, Zhuang L,Shen Y, Keime C, Gandrillon O, Brouilly P, Briolay J, Zhao G, HuangY: Serial analysis of gene expression in the silkworm, Bombyxmori. Genomics 2005, 86:223-241.

53. Peters DG, Kudla DM, Deloia JA, Chu TJ, Fairfull L, Edwards RP, Fer-rel RE: Comparative gene expression analysis of ovarian car-cinoma and normal ovarian epithelium by serial analysis ofgene expression. Cancer Epidemiol Biomarkers Prev 2005,14:1717-1723.

54. Calvert ME, Digilio LC, Herr JC, Coonrod SA: Oolemmal pro-teomics – identification of highly abundant heat shock pro-teins and molecular chaperones in the mature mouse eggand their localization on the plasma membrane. Reprod BiolEndocrinol 2003, 1:27.

55. Ellederova Z, Halada P, Man P, Kubelka M, Motlik J, Kovarova H: Pro-tein patterns of pig oocytes during in vitro maturation. BiolReprod 2004, 71:1533-1539.

56. Liska AJ, Popov AV, Sunyaev S, Coughlin P, Habermann B, ShevchenkoA, Bork P, Karsenti E, Shevchenko A: Homology-based functionalproteomics by mass spectrometry: Application to the Xeno-pus microtubule-associated proteome. Proteomics 2004,4:2707-2721.

57. Nakahara K, Kim K, Sciulli C, Dowd SR, Minden JS, Carthew RW:Targets of microRNA regulation in the Drosophila oocyteproteome. Proc Natl Acad Sci USA 2005, 102:12023-12028.

58. Wu X, Wang P, Brown CA, Zilinski CA, Matzuk MM: Zygote arrest1 (Zar1) is an evolutionarily conserved gene expressed invertebrate ovaries. Biol Reprod 2003, 69:861-867.

59. Mangos S, Krawetz R, Kelly GM: The Translocon-AssociatedProtein beta (TRAPbeta) in zebrafish embryogenesis. I.Enhanced expression of transcripts in notochord and hatch-ing gland precursors. Mol Cell Biochem 2000, 215:93-101.

60. Raz E: The function and regulation of vasa-like genes in germ-cell development. Genome Biol 2000, 1:reviews1017.1-1017.6.

61. Bratt AK, Zandbergen T, Van de Water S, Goos HJTH, Zivkovic D:Characterization of zebrafish primordial germ cells: Mor-phology and early distribution of vasa RNA. Dev Dyn 1999,216:153-167.

62. Krovel AV, Olsen LC: Sexual dimorphic expression pattern ofa splice variant of zebrafish vasa during gonadal develop-ment. Dev Biol 2004, 271:190-197.

63. Wen C, Zhang Z, Ma W, Xu M, Wen Z, Peng J: Genome-wide iden-tification of female-enriched genes in zebrafish. Dev Dyn 2005,232:171-179.

64. Tsukita S, Furuse M: Claudin-based barrier in simple and strat-ified cellular sheets. Curr Opin Cell Biol 2002, 14:531-536.

65. Kollmar R, Nakamura SK, Kappler JA, Hudspeth AJ: Expression andphylogeny of claudins in vertebrate primordia. Proc Natl AcadSci USA 2001, 98:10196-10201.

66. Lo J, Lee S, Xu M, Liu F, Ruan H, Eun A, He Y, Ma W, Wang W, WenZ, Peng J: 15,000 unique zebrafish EST clusters and theirfuture use in microarray for profiling gene expression pat-terns during embryogenesis. Genome Res 2003, 13:455-466.

67. Ozeki Y, Matsui T, Suzuki M, Titani K: Amino acid sequence andmolecular characterization of a D-galactoside-specific lectinpurified from sea urchin (Anthocidaris crassispina) eggs. Bio-chemistry 1991, 30:2391-2394.

68. Tateno H, Saneyoshi A, Ogawa T, Muramoto K, Kamiya H, SaneyoshiM: Isolation and characterization of rhamnose-bindinglectins from eggs of steelhead trout (Oncorhynchus mykiss)homologous to low density lipoprotein receptor super-family. J Biol Chem 1998, 273:19190-19197.

69. Tateno H, Ogawa T, Muramoto K, Kamiya H, Hirai T, Saneyoshi M:A novel rhamnose-binding lectin family from eggs of steel-head trout (Oncorhynchus mykiss) with different structuresand tissue distribution. Biosci Biotechnol Biochem 2001,65:1328-1338.

70. Hosono M, Ishikawa K, Mineki K, Murayama K, Numata C, Ogawa Y,Takayanagi Y, Nitta K: Tandem repeat structure of rhamnose-binding lectin from catfish (Silurus asotus) eggs. Biochim BiophysActa 1999, 1472:668-675.

71. Lelianova VG, Davletov BA, Sterling A, Rahman MA, Grishin EV, TottyNF, Ushkaryov YA: Alpha-latrotoxin receptor, latrophilin, is anovel member of the secretin family of G protein-coupledreceptors. J Biol Chem 1997, 272:21504-21508.

72. Riggio M, Filosa S, Parisi E, Scudiero R: Changes in zinc, copperand metallothionein contents during oocyte growth andearly development of the teleost Danio rerio (zebrafish).Comp Biochem Physiol C Toxicol Pharmacol 2003, 135:191-196.

73. Chen WY, John JA, Lin CH, Lin HF, Wu SC, Lin CH, Chang CY:Expression of metallothionein gene during embryonic andearly larval development in zebrafish. Aquat Toxicol 2004,69:215-227.

74. Riggio M, Trinchella F, Filosa S, Parisi E, Scudeiro R: Accumulationof zinc, copper, and metallothionein mRNA in lizard ovaryproceeds without a concomitant increase in metallothioneincontent. Mol Reprod Dev 2003, 66:374-382.

75. Aegerter S, Jalabert B, Bobe J: Large scale real-time PCR analysisof mRNA abundance in rainbow trout eggs in relationshipwith egg quality and post-ovulatory ageing. Biol Reprod 2005,72:377-385.

76. Schussler P, Potters E, Winnen R, Bottke W, Kunz W: An isoformof ferritin as a component of protein yolk platelets in Schis-tosoma mansoni. Mol Reprod Dev 1995, 41:325-330.

77. Schussler P, Potters E, Winnen R, Michel A, Bottke W, Kunz W: Fer-ritin mRNAs in Schistosoma mansoni do not have iron-responsive elements for post-transcriptional regulation. EurJ Biochem 1996, 241:64-69.

78. Brown DD, Caston JD: Biochemistry of amphibian develop-ment. III. Identification of ferritin in the egg and earlyembryos of Rana pipiens. Dev Biol 1962, 5:445-451.

79. Kandror KV, Tsuprun VL, Stepanov AS: The main adenosine tri-phosphate-binding component of amphibian oocytes is ferri-tin. Mol Reprod Dev 1992, 31:48-54.

80. von Darl M, Harrisson PM, Bottke W: cDNA cloning and deducedamino acid sequence of two ferritins: soma ferritin and yolkferritin, from the snail Lymnaea stagnalis L. Eur J Biochem 1994,222:353-366.

81. Tripathi PK, Chatterjee SK: Elevated expression of ferritin H-chain mRNA in metastatic ovarian tumor. Cancer Invest 1996,14:518-526.

82. Krone PH, Sass JB, Lele Z: Heat shock protein gene expressionduring embryonic development of the zebrafish. Cell Mol LifeSci 1997, 53:122-129.

83. Ohsako S, Bunick D, Hayashi Y: Immunocytochemical observa-tion of the 90 KD heat shock protein (HSP90): high expres-sion in primordial and pre-meiotic germ cells of male andfemale rat gonads. J Histochem Cytochem 1995, 43:67-76.

84. Ding D, Parkhurst SM, Halsell SR, Lipshitz HD: Dynamic Hsp83RNA localization during Drosophila oogenesis and embryo-genesis. Mol Cell Biol 1993, 13:3773-3781.

85. Fujita J: Cold shock response in mammalian cells. J Mol MicrobiolBiotechnol 1999, 1:243-255.

86. Matsumoto K, Aoki K, Dohmae N, Takio K, Tsujimoto M: CIRP2, amajor cytoplasmic RNA-binding protein in Xenopus oocytes.Nucleic Acids Res 2000, 28:4689-4697.

87. Roller RJ, Kinloch RA, Hiraoka BY, et al.: Gene expression duringmammalian oogenesis and early embryogenesis: quantifica-tion of three messenger RNAs abundant in fully grownmouse oocytes. Development 1989, 106:251-262.

88. Cascio SM, Wasserman PM: Program of early development inthe mammals: post-translational control of a class of pro-

Page 27 of 28(page number not for citation purposes)

Page 28: BMC Genomics BioMed Central - Home - Springer · BioMed Central Page 1 of 28 ... BMC Genomics Research article Open Access Molecular phenotype of zebrafish ovar ian follicle by serial

BMC Genomics 2006, 7:46 http://www.biomedcentral.com/1471-2164/7/46

Publish with BioMed Central and every scientist can read your work free of charge

"BioMed Central will be the most significant development for disseminating the results of biomedical research in our lifetime."

Sir Paul Nurse, Cancer Research UK

Your research papers will be:

available free of charge to the entire biomedical community

peer reviewed and published immediately upon acceptance

cited in PubMed and archived on PubMed Central

yours — you keep the copyright

Submit your manuscript here:http://www.biomedcentral.com/info/publishing_adv.asp

BioMedcentral

teins synthesized by mouse oocytes and early embryos. DevBiol 1982, 89:397-408.

89. Deutscher SL, Harley JB, Keene JD: Molecular analysis of the 60-kDa human Ro ribonucleoprotein. Proc Natl Acad Sci USA 1988,85:9479-9483.

90. Iyengar MR, Iyengar CW, Chen HY, Brinster RL, Bornslaeger E,Schultz RM: Expression of creatine kinase isoenzyme duringoogenesis and embryogenesis in the mouse. Dev Biol 1983,96:263-268.

91. Naumoff PA, Stevenson PM: Creatine kinase, steroidogenesisand the developing ovarian follicle. Int J Biochem 1985,17:1363-1367.

92. Dickmeis T, Rastegar S, Aanstad P, Clark M, Fischer N, Plessy C, RosaF, Korzh V, Strähle U: Expression of brain subtype creatinekinase in the zebrafish embryo. Mech Dev 2001, 109:409-412.

93. Schultz RM: The molecular foundations of the maternal tozygotic transition in the preimplantation embryo. HumReprod Updat 2002, 8:323-331.

94. Kurokawa T, Uji S, Suzuki T: Molecular cloning of multiple chiti-nase genes in Japanese flounder, Paralichthys olivaceus. CompBiochem Physiol B Biochem Mol Biol 2004, 138:255-264.

95. Boot RG, Blommaart EF, Swart E, Ghauharali-van der Vlugt K, Bijl N,Moe C, Place A, Aerts JM: Identification of a novel acidic mam-malian chitinase distinct from chitotriosidase. J Biol Chem2001, 276:6770-6778.

96. Perrazolo L, Coward K, Davail B, Normand E, Tyler CR, Pakdel F,Schneider WJ, Le Meen F: Expression and localization of mes-senger ribonucleic acid for the vitellogenin receptor in ovar-ian follicles throughout oogenesis in the rainbow troutOncorhynchus mykiss. Biol Reprod 1999, 60:1057-1068.

97. Wang H, Tan JTT, Emelyanov A, Korzh V, Gong Z: Hepatic andextrahepatic expression of vitellogenin genes in thezebrafish, Danio rerio. Gene 2005, 356:91-100.

98. Riggio M, Scudiero R, Filosa S, Parisi E: Sex- and tissue-specificexpression of aspartic proteinases in Danio rerio (zebrafish).Gene 2000, 260:67-75.

99. Riggio M, Scudiero R, Filosa S, Parisi E: Oestrogen-inducedexpression of a novel liver-specific aspartic proteinase inDanio rerio (zebrafish). Gene 2002, 295:241-246.

100. Download UniGene 2006 [ftp://ftp.ncbi.nih.gov/repository/UniGene/].

101. TIGR download Zebrafish Gene Index 2006 [ftp://ftp.tigr.org/pub/data/tgi/Danio_rerio/].

102. NCBI UniGene Danio rerio Library Browser 2006 [http://www.ncbi.nlm.nih.gov/UniGene/lbrowse2.cgi?TAXID=7955].

103. Tingaud-Sequeira A, André M, Forgue J, Barthe C, Babin PJ: Expres-sion pattern of three estrogen receptor genes duringzebrafish (Danio rerio) development: evidence for a highexpression in neuromasts. Gene Expr Patterns 2004, 4:561-568.

104. Chomczynski P: A reagent for the single-step simultaneous iso-lation of RNA, DNA and proteins from cell and tissue sam-ples. Biotechniques 1993, 15:532-534. 536–537

105. Yates JR 3rd, Morgan SF, Gatlin CL, Griffin PR, Eng JK: Method tocompare collision-induced dissociation spectra of peptides:potential for library searching and subtractive analysis. AnalChem 1998, 70:3557-3565.

106. Beer I, Barnea E, Ziv T, Admon A: Improving large-scale pro-teomics by clustering of mass spectrometry data. Proteomics2004, 4:950-960.

107. Perkins DN, Pappin DJ, Creasy DM, Cottrell JS: Probability-basedprotein identification by searching sequence databases usingmass spectrometry data. Electrophoresis 1999, 20:3551-3567.

108. UniProt 2006 [http://www.ebi.uniprot.org/index.shtml].

Page 28 of 28(page number not for citation purposes)