Supplemental information
A. Sequence collection
B. qPCR calibration formula
Database subcollections and their taxonomy relationship
Taxon                                    Symbol   Database
Mollusca
  Gastropoda
    Patellogastropoda
      Lottia gigantea                    Lgi      Genome, EST
    Vetigastropoda
      Haliotis
        Haliotis asinina                 Has      EST
        Haliotis midae                   Hmi      Transcriptome
        Haliotis diversicolor            Hdiv     Transcriptome
        Haliotis discus                  Hdis     EST
    Caenogastropoda
      Littorina littorea                 Lli      mRNA
    Heterobranchia
      Aplysia californica                -        EST
  Bivalvia
    Heteroconchia
      Meretrix meretrix                  Mme      Transcriptome
    Pteriomorphia
      Ostreoida
        Crassostrea angulata             Can      Transcriptome
      Mytiloida
        Mytilus galloprovincialis        Mga      EST
  Cephalopoda
    Octopus vulgaris                     -        Transcriptome
Gene symbols and corresponding GenBank ID
SARP19-like genes
GenBank ID    Symbol
JU004909      Can-II1
JU034913      Can-II2
GD241786      Has-I1
EE676526      Hdis-I1
EG362622      Hdis-I5
DN763845      Hdis-I8
EE664153      Hdis-I9
JU063184      Hdiv-I1
JU069467      Hdiv-I2
JU066047      Hdiv-I3
JU063185      Hdiv-I4
JU071961      Hdiv-I5
JU078033      Hdiv-I6
GT866721      Hdiv-I7
JU063488      Hdiv-II
Cg 4092       Hmi-I2
Cg 5590       Hmi-I3
Cg 4147       Hmi-I4
Cg 19409      Hmi-I5
Cg 1131       Hmi-I6
Cg 9978       Hmi-I7
FC665937      LgiSARP19-1
FC706468      LgiSARP19-2
JI273633      Mme-II1
JI268876      Mme-II2
AF369698      SARP19

vdg3-like genes
GenBank ID    Symbol
GD272046      Has-I1
DY403113      Has-I2
GD241803      Has-I3
DY403155      Has-I4
GD241807      Has-I5
AY916060      Has-I10
EG362075      Hdis-I1
EE675817      Hdis-I10
EE664072      Hdis-I4
EG362106      Hdis-I5
EG362237      Hdis-I7
JU063200      Hdiv-I1
JU063213      Hdiv-I2
JU063214      Hdiv-I3
JU063212      Hdiv-I4
JU063203      Hdiv-I5
JU071971      Hdiv-I6
Cg 21259      Hmi-I3
Cg 22293      Hmi-I5
Cg 18717      Hmi-I7
Cg 22334      Hmi-I8
Cg 22249      Hmi-I9
FC598503      Lgi-vdg3-1
FC641104      Lgi-vdg3-2
FC714859      Lgi-vdg3-3
FL594967      Mga-II1
DQ268867      Mga-II2
FL595024      Mga-II3
AJ625851      Mga-II4
AJ625949      Mga-II5
SARP19 sequence collection - Step 1.
The amino acid sequences of H. diversicolor SARP19-I1 (GenBank: JU063184) and Littorina littorea SARP19 (GenBank: AAM20842) were searched against the GenBank NR protein database with BLASTp.
Hits were distributed across a wide taxonomic range, from nematodes to birds (FigS1). This wide distribution may result from the conserved EF-hand calcium-binding motifs.
However, some of the distances are too long to fit a monophyletic hypothesis, so the sequence collection needed a narrower focus.
FigS1. Hit pattern of SARP19-I1 against the GenBank NR protein database
SARP19 sequence collection - Step 2.
• EST or TSA sequence libraries of three abalones, H. diversicolor, H. discus and H. asinina (EST), were downloaded from NCBI.
• The CDSs of H. diversicolor SARP19-I1 and Littorina littorea SARP19 were used as tBLASTx queries against these mRNA libraries. Similar sequences were identified and used as queries in turn, and the search was repeated until no new sequence was added.
• Redundant sequences were manually removed, and CDSs were predicted by interpreting the BLAST results. Because sequencing errors can break a CDS, the obvious break points were manually corrected by adding or deleting single nucleotides to repair the CDS.
• These CDSs were searched with BLAST against the Lottia gigantea genome (http://genome.jgi-psf.org/pages/blast.jsf?db=Lotgi1) and EST library. New non-redundant CDSs were added to the collection.
• The putative amino acid sequences were aligned with ClustalW and then manually adjusted. Neighbor-joining trees were constructed with MEGA (FigS2).
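The "repeat until no new sequence is added" rule in Step 2 is a fixed-point search. The following is a minimal sketch of that loop only, not the procedure actually used for the slides: the function name `collect_until_stable`, the `search` callable and the toy hit table are all hypothetical stand-ins, and a real run would wrap a tBLASTx search with a similarity cutoff in place of the lookup table.

```python
from typing import Callable, Iterable, Set


def collect_until_stable(seeds: Iterable[str],
                         search: Callable[[str], Iterable[str]]) -> Set[str]:
    """Query with every newly found sequence until a full pass adds nothing new."""
    collected = set(seeds)
    frontier = list(collected)          # sequences not yet used as queries
    while frontier:
        query = frontier.pop()
        for hit in search(query):
            if hit not in collected:    # only new sequences are queued again
                collected.add(hit)
                frontier.append(hit)
    return collected


# Hypothetical stand-in for a similarity search: query ID -> IDs of similar sequences.
toy_hits = {
    "seedA": ["x1", "x2"],
    "x1": ["seedA", "x3"],
    "x2": [],
    "x3": ["x1"],
    "unrelated": ["other"],
}

found = collect_until_stable(["seedA"], lambda q: toy_hits.get(q, []))
print(sorted(found))  # ['seedA', 'x1', 'x2', 'x3']
```

The loop stops because each sequence enters the queue at most once, so the collection can only grow until the library is exhausted.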
FigS2. NJ tree of SARP19-like sequences collected from three abalones, Lottia gigantea and Littorina littorea. ○, H. diversicolor; ●, Haliotis discus; ■, Haliotis asinina; △, Lottia gigantea; ▲, Littorina littorea.
[Figure: NJ tree with bootstrap values; scale bar 0.2. Tip labels: JU063184, GD241786, EE676526, EG362622, JU071961, JU063185, JU078033, JU063488, JU069467, GT866721, EE664153, DN763845, JU066047, AF369698, FC665937, FC706468]
SARP19 sequence collection - Step 3.
• mRNA libraries of Crassostrea angulata and Meretrix meretrix were added. After the search, 12 more SARP19-like sequences were recruited.
• The NJ guide tree shows that the collection can be separated into two distinct groups (FigS3, S4).
• The average branch length differs markedly between Group A and Group B, which may imply that the two groups evolved under different constraints.
FigS3. NJ tree of 27 SARP19-like sequences collected from ○, H. diversicolor; ●, Haliotis discus; ■, Haliotis asinina; △, Lottia gigantea; ▲, Littorina littorea; ◇, Crassostrea angulata; ◆, Meretrix meretrix.
[Figure: NJ tree with bootstrap values; the sequences are labelled as Group A and Group B; scale bar 0.2]
FigS4. NJ tree of 27 SARP19-like sequences collected from ○, H. diversicolor; ●, Haliotis discus; ■, Haliotis asinina; △, Lottia gigantea; ▲, Littorina littorea; ◇, Crassostrea angulata; ◆, Meretrix meretrix.
[Figure: NJ tree with bootstrap values; the sequences are labelled as Group A and Group B; scale bar 0.2. Tip labels: JU063184, GD241786, EE676526, JU071961, EG362622, JU063185, JU078033, AF369698, FC665937, FC706468, JU069467, JU066047, DN763845, GT866721, EE664153, JU063488, JU004909, JU034913, JI268876, JI273633, JU076278, JU066084, JU066550, JU066548, JU090408, JT999542, JI284184]
SARP19 sequence collection - Step 4.
• More mRNA sequence libraries were added: Haliotis midae (BioProject: PRJNA79815), Aplysia californica (EST) and Octopus vulgaris (BioProject: PRJNA79361).
• After the search, 12 more mRNA sequences from H. midae were recruited. No SARP19-like sequence was found in the sea hare or octopus neural transcriptomes.
• The NJ guide tree was built as previously described (FigS5).
FigS5. NJ tree of 39 SARP19-like sequences. ○, Haliotis diversicolor; ●, H. discus; □, H. midae; ■, H. asinina; △, Lottia gigantea; ▲, Littorina littorea; ◇, Crassostrea angulata; ◆, Meretrix meretrix.
[Figure: NJ tree with bootstrap values; annotations mark Group A, Group B, the outgroup for Group A and the collection boundary; scale bar 0.2. Tip labels: JU063184, GD241786, EE676526, EG362622, JU071961, Cg 19409, JU063185, Cg 4147, JU078033, Cg 1131, AF369698, FC665937, FC706468, JU069467, Cg 4092, EE664153, DN763845, JU066047, Cg 5590, GT866721, Cg 9978, JI273633, JU034913, JI268876, JU063488, JU004909, JU076278, JU066084, JU066550, Cg 1279, Cg 1847, JU066548, Cg 3912, JU090408, Cg 7118, JT999542, JI284184, Cg 22570, Cg 20522]
SARP19 sequence collection - Step 5.
• Setting the collection boundary
– Best hit. All sequences from Group A have their best hits to other members of Group A. However, some sequences of Group B show an ambiguous best-hit pattern.
– To simplify the situation, a boundary was set as shown in FigS5. The ambiguous sequences could act as an outgroup for Group A.
• Finally, the 26 sequences of the collection were searched against the SWISSPROT, NR and NT databases to find any new sequences that fit the boundary. However, no new sequence qualified.
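The best-hit criterion in Step 5 can be read as a closure test: a group is self-contained when every member's best hit is another member. The sketch below illustrates that reading under stated assumptions. The function name `split_by_best_hit`, the sequence names and the best-hit table are hypothetical toy data, not values from the collection.

```python
from typing import Dict, Set, Tuple


def split_by_best_hit(best_hit: Dict[str, str],
                      group: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (members whose best hit lies inside the group, members whose best hit lies outside)."""
    inside = {seq for seq in group if best_hit.get(seq) in group}
    ambiguous = group - inside
    return inside, ambiguous


# Hypothetical best-hit table: query ID -> ID of its top-scoring non-self hit.
best_hit = {"a1": "a2", "a2": "a1", "a3": "a1", "b1": "b2", "b2": "a3"}
group_a = {"a1", "a2", "a3"}
group_b = {"b1", "b2"}

print(split_by_best_hit(best_hit, group_a)[1])  # set()   -> group A is closed under best hit
print(split_by_best_hit(best_hit, group_b)[1])  # {'b2'}  -> b2 hits into group A
```

In this toy case `b2` plays the role of the ambiguous Group B sequences, which the slides place outside the boundary as an outgroup for Group A.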
Vdg3 sequence collection - Step 1.
The amino acid sequence of H. diversicolor Vdg3-I1 was searched against the GenBank NR protein database with BLASTp.
The hit pattern was much simpler than that of SARP19 (FigS6). No conserved motif was found in the putative vdg3 proteins.
FigS6. Hit pattern of H. diversicolor vdg3-I1 against the GenBank NR protein database
Vdg3 sequence collection - Step 2.
• The CDSs of H. diversicolor vdg3-I1 and H. asinina vdg3 were searched against the previously constructed sequence libraries and the Lottia gigantea genome (http://genome.jgi-psf.org/pages/blast.jsf?db=Lotgi1).
• Similar sequences were identified and used as queries in turn, and the search was repeated until no new sequence was added.
• As in the SARP19 procedure, redundant sequences were removed, CDSs were predicted and patched, alignments and NJ trees were constructed, and collection boundaries were set by best hits or guide trees.
• Finally, the sequences of the collection were searched against the SWISSPROT, NR protein and NT nucleotide databases to recruit any missed sequences that fit the boundary.
• The NJ tree of the 30 vdg3-like proteins is shown in FigS7.
FigS7. NJ tree of 30 vdg3-like sequences. ○ H. diversicolor; ● H. discus; □ H. midae;
■ H. asinina; △ Lottia gigantea; ▲ Littorina littorea; ◇ Crassostrea angulata;
◆ Meretrix meretrix; ▼ Mytilus galloprovincialis.
qPCR calibration formula
• For a set of qPCR reactions in the same run, the fluorescence intensity (designated F) should take the same constant value (set as f) when each reaction reaches its threshold cycle number (Ct), i.e.,
$F_{Ct} = f$ (1)
• Because the fluorescence intensity F in a SYBR Green qPCR system is directly proportional to the amount of amplicon DNA,
$F = k \cdot N \cdot L$ (2)
where k is an unknown constant, N is the number of molecules and L is the amplicon length.
• Since
$N = N_0 \cdot E^{Ct}$ (3)
where $N_0$ is the initial number of molecules and E is the PCR efficiency, for each gene
$k \cdot N_{0,\mathrm{gene}} \cdot E_{\mathrm{gene}}^{Ct_{\mathrm{gene}}} \cdot L_{\mathrm{gene}} = f$ (4)
• Dividing (4) for the target gene by (4) for the control gene cancels k and f and gives the calibration formula
$\dfrac{N_{0,\mathrm{target}}}{N_{0,\mathrm{control}}} = \dfrac{E_{\mathrm{control}}^{Ct_{\mathrm{control}}} \cdot L_{\mathrm{control}}}{E_{\mathrm{target}}^{Ct_{\mathrm{target}}} \cdot L_{\mathrm{target}}}$ (5)
where OAZ1 was set as the control gene and $N_{0,\mathrm{control}}$ in each stage was set as 100.
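As a worked illustration of formula (5), the sketch below computes the initial copy number of a target gene relative to a control gene fixed at 100. The function name `relative_initial_copies` and all numeric inputs are hypothetical; the efficiencies, Ct values and amplicon lengths are made up for the example and are not measurements from this study.

```python
def relative_initial_copies(e_target: float, ct_target: float, len_target: float,
                            e_control: float, ct_control: float, len_control: float,
                            control_level: float = 100.0) -> float:
    """Apply formula (5): N0(target) = control_level * (E_c^Ct_c * L_c) / (E_t^Ct_t * L_t)."""
    ratio = (e_control ** ct_control * len_control) / (e_target ** ct_target * len_target)
    return control_level * ratio


# Equal efficiencies (E = 2) and equal amplicon lengths: the target crosses the
# threshold two cycles later, so it started with a quarter of the control's copies.
print(relative_initial_copies(2.0, 22, 150, 2.0, 20, 150))  # 25.0

# Same Ct values, but the target amplicon is half as long as the control amplicon:
# each target molecule gives half the signal, so the estimate doubles.
print(relative_initial_copies(2.0, 22, 100, 2.0, 20, 200))  # 50.0
```

The second call shows why the length term matters: without it, amplicons of different length would be compared as if each molecule contributed the same fluorescence.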