Page 1: Supplemental information

Supplemental information

A. Sequence collection
B. qPCR calibration formula

Page 2: Supplemental information

Database subcollections and their taxonomy relationship

Taxon                                 Symbol   Database
Mollusca
  Gastropoda
    Patellogastropoda
      Lottia gigantea                 Lgi      Genome, EST
    Vetigastropoda
      Haliotis
        Haliotis asinina              Has      EST
        Haliotis midae                Hmi      Transcriptome
        Haliotis diversicolor         Hdiv     Transcriptome
        Haliotis discus               Hdis     EST
    Caenogastropoda
      Littorina littorea              Lli      mRNA
    Heterobranchia
      Aplysia californica             -        EST
  Bivalvia
    Heteroconchia
      Meretrix meretrix               Mme      Transcriptome
    Pteriomorphia
      Ostreoida
        Crassostrea angulata          Can      Transcriptome
      Mytiloida
        Mytilus galloprovincialis     Mga      EST
  Cephalopoda
    Octopus vulgaris                  -        Transcriptome

Page 3: Supplemental information

Gene symbols and corresponding GenBank ID

SARP19-like genes             vdg3-like genes
GenBank ID   Symbol           GenBank ID   Symbol
JU004909     Can-II1          GD272046     Has-I1
JU034913     Can-II2          DY403113     Has-I2
GD241786     Has-I1           GD241803     Has-I3
EE676526     Hdis-I1          DY403155     Has-I4
EG362622     Hdis-I5          GD241807     Has-I5
DN763845     Hdis-I8          AY916060     Has-I10
EE664153     Hdis-I9          EG362075     Hdis-I1
JU063184     Hdiv-I1          EE675817     Hdis-I10
JU069467     Hdiv-I2          EE664072     Hdis-I4
JU066047     Hdiv-I3          EG362106     Hdis-I5
JU063185     Hdiv-I4          EG362237     Hdis-I7
JU071961     Hdiv-I5          JU063200     Hdiv-I1
JU078033     Hdiv-I6          JU063213     Hdiv-I2
GT866721     Hdiv-I7          JU063214     Hdiv-I3
JU063488     Hdiv-II          JU063212     Hdiv-I4
Cg 4092      Hmi-I2           JU063203     Hdiv-I5
Cg 5590      Hmi-I3           JU071971     Hdiv-I6
Cg 4147      Hmi-I4           Cg 21259     Hmi-I3
Cg 19409     Hmi-I5           Cg 22293     Hmi-I5
Cg 1131      Hmi-I6           Cg 18717     Hmi-I7
Cg 9978      Hmi-I7           Cg 22334     Hmi-I8
FC665937     LgiSARP19-1      Cg 22249     Hmi-I9
FC706468     LgiSARP19-2      FC598503     Lgi-vdg3-1
JI273633     Mme-II1          FC641104     Lgi-vdg3-2
JI268876     Mme-II2          FC714859     Lgi-vdg3-3
AF369698     SARP19           FL594967     Mga-II1
                              DQ268867     Mga-II2
                              FL595024     Mga-II3
                              AJ625851     Mga-II4
                              AJ625949     Mga-II5

Page 4: Supplemental information

SARP19 sequence collection - Step 1.

The amino acid sequences of H. diversicolor SARP19-I1 (GB: JU063184) and Littorina littorea SARP19 (GB: AAM20842) were used as BLASTp queries against the GenBank NR protein database.

Hits were distributed across a wide taxonomic range, from nematodes to birds (FigS1). Such a wide distribution may result from the conserved EF-hand calcium-binding motifs.

However, some of the distances are too long to fit a monophyletic hypothesis, so the sequence collection needed to be more focused.

Page 5: Supplemental information

FigS1. BLASTp hit pattern of SARP19-I1 against the GenBank NR protein database

Page 6: Supplemental information

SARP19 sequence collection - Step 2.

• EST or TSA sequence libraries of three abalones, H. diversicolor, H. discus and H. asinina (EST), were downloaded from NCBI.

• The CDSs of H. diversicolor SARP19-I1 and Littorina littorea SARP19 were used as tBLASTx queries against these mRNA libraries. Similar sequences were identified, and the search was repeated with the new hits until no new sequence was added.

• Redundant sequences were manually removed, and CDSs were predicted by interpreting the BLAST results. Because sequencing errors can cause breaks in a CDS, obvious break points were manually corrected by adding or deleting single nucleotides to repair the CDS.

• These CDSs were then searched by BLAST against the Lottia gigantea genome (http://genome.jgi-psf.org/pages/blast.jsf?db=Lotgi1) and EST library. New non-redundant CDSs were added to the collection.

• Alignments of the putative amino acid sequences were performed with ClustalW and then manually adjusted. Neighbor-joining (NJ) trees were constructed with MEGA (FigS2).
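
The repeat-until-nothing-new search in the second bullet is a fixed-point loop. The sketch below is only an illustration of that loop, not the original pipeline: `search` is a hypothetical stand-in for one tBLASTx run that returns the IDs of similar sequences, and the toy IDs are invented for the example.

```python
from typing import Callable, Iterable, Set


def recruit_until_stable(seeds: Iterable[str],
                         search: Callable[[str], Set[str]]) -> Set[str]:
    """Use every newly found sequence as a query until no new sequence is added.

    `search` is a hypothetical placeholder for one similarity search
    (for example a tBLASTx run filtered by some cutoff).
    """
    collection = set(seeds)
    pending = list(collection)          # sequences not yet used as queries
    while pending:
        query = pending.pop()
        for hit in search(query):
            if hit not in collection:   # only new sequences trigger another round
                collection.add(hit)
                pending.append(hit)
    return collection


# Toy similarity map (invented IDs, for illustration only).
toy_hits = {
    "seqA": {"seqB"},
    "seqB": {"seqA", "seqC"},
    "seqC": {"seqB"},
    "seqD": {"seqE"},
    "seqE": {"seqD"},
}


def toy_search(query: str) -> Set[str]:
    return toy_hits.get(query, set())


found = recruit_until_stable(["seqA"], toy_search)
print(sorted(found))
```

Starting from `seqA`, the loop reaches `seqB` and `seqC` but never the unrelated pair `seqD`/`seqE`, which mirrors how the collection stops growing once every hit is already known.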

Page 7: Supplemental information

FigS2. NJ tree of SARP19-like sequences collected from three abalones, Lottia gigantea and Littorina littorea. ○, H. diversicolor; ●, Haliotis discus; ■, Haliotis asinina; △, Lottia gigantea; ▲, Littorina littorea.

Tip labels (GenBank IDs): JU063184, GD241786, EE676526, EG362622, JU071961, JU063185, JU078033, JU063488, JU069467, GT866721, EE664153, DN763845, JU066047, AF369698, FC665937, FC706468. Scale bar: 0.2.

Page 8: Supplemental information

SARP19 sequence collection - Step 3.

• mRNA libraries of Crassostrea angulata and Meretrix meretrix were added. After the search, 12 more SARP19-like sequences were recruited.

• The NJ guide tree shows that the collection can be separated into two distinct groups (FigS3, S4).

• The average branch length clearly differs between Groups A and B, which may imply that the two groups evolved under different constraints.

Page 9: Supplemental information

FigS3. NJ tree of 27 SARP19-like sequences collected from ○, H. diversicolor; ●, Haliotis discus; ■, Haliotis asinina; △, Lottia gigantea; ▲, Littorina littorea; ◇, Crassostrea angulata; ◆, Meretrix meretrix.

Clade labels: Group A, Group B. Scale bar: 0.2.

Page 10: Supplemental information

FigS4. NJ tree of 27 SARP19-like sequences collected from ○, H. diversicolor; ●, Haliotis discus; ■, Haliotis asinina; △, Lottia gigantea; ▲, Littorina littorea; ◇, Crassostrea angulata; ◆, Meretrix meretrix.

Tip labels (GenBank IDs): JU063184, GD241786, EE676526, JU071961, EG362622, JU063185, JU078033, AF369698, FC665937, FC706468, JU069467, JU066047, DN763845, GT866721, EE664153, JU063488, JU004909, JU034913, JI268876, JI273633, JU076278, JU066084, JU066550, JU066548, JU090408, JT999542, JI284184.

Clade labels: Group A, Group B. Scale bar: 0.2.

Page 11: Supplemental information

SARP19 sequence collection - Step 4.

• More mRNA sequence libraries were added: Haliotis midae (BioProject: PRJNA79815), Aplysia californica (EST) and Octopus vulgaris (BioProject: PRJNA79361).

• After the search, 12 more mRNA sequences from H. midae were recruited. No SARP19-like sequence was found in the sea hare or octopus neural transcriptomes.

• The NJ guide tree was built as previously described (FigS5).

Page 12: Supplemental information

FigS5. NJ tree of 39 SARP19-like sequences. ○, Haliotis diversicolor; ●, H. discus; □, H. midae; ■, H. asinina; △, Lottia gigantea; ▲, Littorina littorea; ◇, Crassostrea angulata; ◆, Meretrix meretrix.

Tip labels (GenBank IDs or contig numbers, in extraction order): JU063184, GD241786, EE676526, EG362622, JU071961, Cg 19409, JU063185, Cg 4147, JU078033, Cg 1131, AF369698, FC665937, FC706468, JU069467, Cg 4092, EE664153, DN763845, JU066047, Cg 5590, GT866721, Cg 9978, JI273633, JU034913, JI268876, JU063488, JU004909, JU076278, JU066084, JU066550, Cg 1279, Cg 1847, JU066548, Cg 3912, JU090408, Cg 7118, JT999542, JI284184, Cg 22570, Cg 20522.

Figure annotations: Group A, Group B, Outgroup for Group A, Collection boundary. Scale bar: 0.2.

Page 13: Supplemental information

SARP19 sequence collection - Step 5.

• Setting the collection boundary

  – Best hits. All sequences from Group A have their best hits among other members of Group A. However, some sequences of Group B show an ambivalent best-hit pattern.

  – To simplify the situation, a boundary was set as shown in FigS5. The ambivalent sequences can serve as an outgroup for Group A.

• Finally, the 26 sequences of the collection were searched against the SWISSPROT, NR and NT databases to find any new sequences that fit within the boundary. No new sequence qualified.
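
The best-hit check used to set the boundary can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original procedure: it assumes that "ambivalent" means a sequence whose best hit falls on the other side of the Group A boundary, and the sequence names and best-hit table are invented.

```python
from typing import Dict, Set


def best_hit_pattern(best_hit: Dict[str, str],
                     group_a: Set[str]) -> Dict[str, str]:
    """Label each sequence by whether its best hit lies in its own group.

    Assumption for this sketch: a sequence is "ambivalent" when it and its
    best hit sit on opposite sides of the Group A boundary.
    """
    labels = {}
    for seq, hit in best_hit.items():
        same_side = (seq in group_a) == (hit in group_a)
        labels[seq] = "consistent" if same_side else "ambivalent"
    return labels


# Invented example: a1/a2 form Group A, b1/b2 lie outside it.
group_a = {"a1", "a2"}
toy_best_hit = {"a1": "a2", "a2": "a1", "b1": "b2", "b2": "a1"}

labels = best_hit_pattern(toy_best_hit, group_a)
ambivalent = sorted(s for s, lab in labels.items() if lab == "ambivalent")
print(ambivalent)
```

In this toy case only `b2` is flagged, because its best hit is a Group A member although it sits outside the group. Sequences flagged this way are the ones the slide treats as an outgroup.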

Page 14: Supplemental information

Vdg3 sequence collection - Step 1.

The amino acid sequence of H. diversicolor vdg3-I1 was used as a BLASTp query against the GenBank NR protein database.

The hit pattern was much simpler than that of SARP19 (FigS6). No conserved motif was found in the putative vdg3 proteins.

Page 15: Supplemental information

FigS6. BLASTp hit pattern of H. diversicolor vdg3-I1 against the GenBank NR protein database

Page 16: Supplemental information

Vdg3 sequence collection - Step 2.

• The CDSs of H. diversicolor vdg3-I1 and H. asinina vdg3 were searched against the previously constructed sequence libraries and the Lottia gigantea genome (http://genome.jgi-psf.org/pages/blast.jsf?db=Lotgi1).

• Similar sequences were identified, and the search was repeated with them until no new sequence was added.

• As in the previous procedure, redundancies were removed, CDSs were predicted and patched, alignments and NJ trees were constructed, and collection boundaries were set by best hits or guide trees.

• Finally, the sequences of the collection were searched against the SWISSPROT, NR protein and nt DNA databases to recruit any missed sequences that fit within the boundary.

• The NJ tree of 30 vdg3-like proteins is shown in FigS7.

Page 17: Supplemental information

FigS7. NJ tree of 30 vdg3-like sequences. ○ H. diversicolor; ● H. discus; □ H. midae;

■ H. asinina; △ Lottia gigantea; ▲ Littorina littorea; ◇ Crassostrea angulata;

◆ Meretrix meretrix; ▼ Mytilus galloprovincialis.

Page 18: Supplemental information

qPCR calibration formula

• For a set of qPCR reactions in the same run, the fluorescence intensity (designated F) should be a constant (set as f) when each reaction reaches its threshold cycle number (Ct), i.e.,

  $F_{Ct} = f$   (1)

• Because the fluorescence intensity F in a SYBR Green qPCR system is directly proportional to the amount of amplicon DNA,

  $F = k \cdot N \cdot L$   (2)

  where k is an unknown constant, N is the number of molecules and L is the amplicon length.

• Since

  $N = N_0 \cdot E^{Ct}$   (3)

  where $N_0$ is the initial number of molecules and E is the PCR efficiency, for each gene

  $k \cdot N_{0(\mathrm{gene})} \cdot E_{(\mathrm{gene})}^{Ct_{(\mathrm{gene})}} \cdot L_{(\mathrm{gene})} = f$   (4)

• Equating (4) for the target gene and the control gene cancels k and f, which gives the calibration formula

  $\dfrac{N_{0(\mathrm{target})}}{N_{0(\mathrm{control})}} = \dfrac{E_{(\mathrm{control})}^{Ct_{(\mathrm{control})}} \cdot L_{(\mathrm{control})}}{E_{(\mathrm{target})}^{Ct_{(\mathrm{target})}} \cdot L_{(\mathrm{target})}}$   (5)

  where OAZ1 was set as the control gene and $N_{0(\mathrm{control})}$ in each stage was set as 100.
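
As a worked illustration of the calibration formula, the Python sketch below evaluates it for one target gene. The function name, argument names and numeric inputs are hypothetical; E is treated as the per-cycle amplification factor (2 means perfect doubling), as the relation N = N0 • E^Ct implies.

```python
def initial_copy_number(e_target: float, ct_target: float, len_target: float,
                        e_control: float, ct_control: float, len_control: float,
                        n0_control: float = 100.0) -> float:
    """Calibrated N0 of the target gene, relative to N0(control).

    e_*   : per-cycle amplification factor E (2.0 = perfect doubling)
    ct_*  : threshold cycle Ct
    len_* : amplicon length L
    n0_control defaults to 100, the value assigned to the control gene.
    """
    control_term = (e_control ** ct_control) * len_control
    target_term = (e_target ** ct_target) * len_target
    return n0_control * control_term / target_term


# Hypothetical inputs: both genes amplify with E = 2 and have 100 bp amplicons,
# but the target crosses the threshold one cycle after the control.
one_cycle_later = initial_copy_number(2.0, 21, 100, 2.0, 20, 100)

# Hypothetical inputs: same Ct, but the target amplicon is twice as long,
# so each molecule contributes twice the fluorescence.
longer_amplicon = initial_copy_number(2.0, 20, 200, 2.0, 20, 100)

print(one_cycle_later, longer_amplicon)
```

Both hypothetical cases give half the control value: one extra cycle at E = 2 means half as many starting molecules, and a doubled amplicon length means the same fluorescence is reached with half as many molecules.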