

Identification of the Origin of Replication of Gram-Positive Anaerobic Bacterium Clostridium taeniosporum

Thompson, Jose §, Bode, Addys §, Shrenker, Natalie §, Hunicke-Smith, S. ¥, Blinkova, A. ¥, Walker, J.R. ¥, León, A.J. §, and Ginés-Candelaria, E. §

§ Miami Dade College-Wolfson Campus, Department of Natural Sciences, Health & Wellness, Miami, FL 33132
¥ University of Texas at Austin, Austin, TX 78712

ABSTRACT

Clostridium taeniosporum is a Gram-positive, nonpathogenic, anaerobic bacterium isolated from Crimean lake silt. It is a close relative of the toxigenic Clostridium botulinum Group II strains, with approximately 98% similarity. C. taeniosporum is unique in forming ribbon-like endospore appendages that are present even under the best of conditions (Iyer, 2008). Its genome is approximately 3,452.763 kb in length and was subdivided by sequencing and genome assembly into 18 scaffolds of various sizes. Scaffolds 1 through 12 were analyzed with the bioinformatics tool Ori-Finder to locate dnaA boxes, putative oriC sites, and indicator genes. Each scaffold was submitted individually sixteen times, once for each consensus DnaA box of the indicator bacteria: Escherichia coli, Chlamydiae, Prochlorales, Synechococcus, Haemophilus, Nocardia, Mycoplasma, Bradyrhizobiaceae, Burkholderia chrII, Burkholderia chrIII, Dehalococcoides, Flavobacteriaceae, Helicobacter, Nitrobacter, Thermotoga, and Vibrio chrII. Each query was matched against DoriC, a MySQL database of oriC regions, to search for dnaA boxes. Scaffold 10 had the greatest number of dnaA boxes and three putative oriC regions, which indicates that the origin of replication may be located in scaffold 10. Indicator genes in addition to dnaA, such as dnaN, parA, gidA, and recF, were also found in this scaffold using Ori-Finder. Protein homology analyses performed with protein BLAST confirmed the presence of the corresponding proteins in this scaffold. The latter result supports the hypothesis that the origin of replication in C. taeniosporum may be localized in scaffold 10. This project provides a fine structural analysis of a putative replication origin in C. taeniosporum.

MATERIALS & METHODS

Bioinformatics tools were used to locate the ori regions and dnaA boxes. Ori-Finder, from the Center of Bioinformatics at Tianjin University (TUBIC), was used to locate dnaA boxes in the eighteen scaffolds of the Clostridium taeniosporum genome. Scaffolds were obtained from the University of Texas and were analyzed with Geneious Pro v.5.5.6. Each scaffold was submitted individually sixteen times, once for each of the indicator bacterial dnaA boxes: Escherichia coli, Chlamydiae, Prochlorales, Synechococcus, Haemophilus, Nocardia, Mycoplasma, Bradyrhizobiaceae, Burkholderia (chromosome II), Burkholderia (chromosome III), Dehalococcoides, Flavobacteriaceae, Helicobacter, Nitrobacter, Thermotoga, and Vibrio (chromosome II). Scaffold 10 was resubmitted to Ori-Finder to locate possible indicator genes within the putative ori regions. Genomic DNA extractions were then performed to isolate C. taeniosporum DNA. Genomic DNA was also isolated from Escherichia coli and Bacillus subtilis as controls for this experiment. Agarose gel electrophoresis confirmed the quality of each genomic DNA extraction. To perform PCR on the putative region, a positive control had to be determined.
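The dnaA-box search described in the methods can be illustrated with a minimal Python sketch. This is not Ori-Finder's implementation. It assumes the E. coli consensus box TTATCCACA and a tolerance of one mismatch, and it scans both strands of a toy sequence. The names CONSENSUS_BOXES, BoxHit, and find_dnaa_boxes are hypothetical and introduced only for illustration.

```python
from typing import List, NamedTuple

# Assumption: the E. coli consensus DnaA box is taken as TTATCCACA.
# Consensus boxes for the other indicator bacteria would be added here.
CONSENSUS_BOXES = {"Escherichia coli": "TTATCCACA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


class BoxHit(NamedTuple):
    start: int        # 0-based start of the window on the forward strand
    strand: str       # "+" for forward strand, "-" for reverse strand
    sequence: str     # matched sequence as read on its own strand
    mismatches: int   # number of positions differing from the consensus


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_dnaa_boxes(scaffold: str, consensus: str,
                    max_mismatches: int = 1) -> List[BoxHit]:
    """Scan every window on both strands for near-matches to the consensus."""
    scaffold = scaffold.upper()
    k = len(consensus)
    hits = []
    for i in range(len(scaffold) - k + 1):
        window = scaffold[i:i + k]
        d_fwd = hamming(window, consensus)
        if d_fwd <= max_mismatches:
            hits.append(BoxHit(i, "+", window, d_fwd))
        rc = reverse_complement(window)
        d_rev = hamming(rc, consensus)
        if d_rev <= max_mismatches:
            hits.append(BoxHit(i, "-", rc, d_rev))
    return hits


# Toy sequence: a perfect box, a reverse-strand box, and a one-mismatch box.
toy = "GGG" + "TTATCCACA" + "CCC" + "TGTGGATAA" + "GGG" + "TTATCCAAA"
hits = find_dnaa_boxes(toy, CONSENSUS_BOXES["Escherichia coli"])
for hit in hits:
    print(hit)
```

Repeating the call for each consensus box and each scaffold mirrors the sixteen submissions per scaffold, and collecting the distinct hit sequences gives the set of unique box sequences.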

RESULTS

Submitting the scaffolds to Ori-Finder revealed potential dnaA-box homologies (see Table 1). In each of scaffolds 1, 8, and 11, we detected 3 possible dnaA-box matches with Haemophilus. Scaffold 3 yielded a match of 4 dnaA boxes with Chlamydiae. In scaffolds 5 and 7, three dnaA boxes were identified for Escherichia coli, Haemophilus, Mycoplasma, Dehalococcoides, and Nocardia (see Table 1 for the per-scaffold counts). Scaffold 2 shows homology of 4 dnaA boxes with Chlamydiae, Prochlorales, Synechococcus, and Mycoplasma. Scaffold 12 has 3 dnaA-box matches with Dehalococcoides and Haemophilus, and scaffold 4 shows no homology with any of the sixteen indicator bacteria. Scaffold 9 contains 11 dnaA-box matches with Haemophilus, four with Escherichia coli, and four with Mycoplasma. Scaffold 10 contains a total of 104 dnaA-box homologies. Of these, 20 matched Escherichia coli; 11 each matched Chlamydiae, Prochlorales, and Synechococcus; 10 matched Nocardia; 20 matched Haemophilus; and 21 matched Mycoplasma. Because of this high number of dnaA-box homologies, scaffold 10 was resubmitted to Ori-Finder to locate ori-indicator genes that, in addition to the dnaA boxes, would signal the location of a putative origin of replication. The results show 16 dnaA boxes around the putative ori region. Figure 1 shows the structure of the origin of replication in E. coli, oriC. Figure 2 shows the putative ori region and the annotated indicator genes discovered (dnaA, gidA, dnaN, parA, and recF), as well as fifteen unique dnaA-box sequences. The open reading frames around the region were also reevaluated using BLASTn and BLASTp. Proteins such as DNA polymerase III and DNA gyrase subunits were detected in the sequence flanking the putative origin (see Figure 2), along with other proteins that participate in DNA replication, including helicase and single-stranded DNA-binding protein (data not shown).
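The per-scaffold totals can be recomputed from the dnaA-box counts reported in Table 1. The short Python sketch below transcribes those counts (dnaA boxes only, not oriC regions) and sums them per scaffold. The variable names are illustrative and not part of the original analysis.

```python
# Scaffolds with at least one match, in the column order of Table 1.
SCAFFOLDS = [1, 2, 3, 5, 7, 8, 9, 10, 11, 12]

# dnaA-box counts per indicator bacterium, transcribed from Table 1.
DNAA_BOX_COUNTS = {
    "Escherichia coli": [0, 0, 0, 3, 0, 0, 4, 20, 0, 0],
    "Chlamydiae":       [0, 3, 4, 0, 0, 0, 0, 11, 0, 0],
    "Dehalococcoides":  [0, 0, 0, 0, 4, 0, 0, 0, 0, 3],
    "Prochlorales":     [0, 3, 0, 0, 0, 0, 0, 11, 0, 0],
    "Synechococcus":    [0, 3, 0, 0, 0, 0, 0, 11, 0, 0],
    "Haemophilus":      [3, 0, 0, 3, 3, 3, 11, 20, 3, 3],
    "Nocardia":         [0, 0, 0, 0, 3, 0, 0, 10, 0, 0],
    "Mycoplasma":       [0, 3, 0, 3, 0, 0, 4, 21, 0, 0],
}

# Sum each column to get the total number of dnaA-box matches per scaffold.
totals = {
    scaffold: sum(row[i] for row in DNAA_BOX_COUNTS.values())
    for i, scaffold in enumerate(SCAFFOLDS)
}

# The scaffold with the most matches is the strongest origin candidate.
best_scaffold = max(totals, key=totals.get)

for scaffold, total in totals.items():
    print(f"Scaffold {scaffold}: {total} dnaA-box matches")
print(f"Highest: scaffold {best_scaffold}")
```

The column for scaffold 10 sums to 104, the total quoted in the text, and the next highest scaffold (scaffold 9) has 19 matches.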

DISCUSSION

Figure 2 shows a tentative structure of the proposed origin of replication of Clostridium taeniosporum. Even though the structure varies from species to species, indicators such as high AT content, dnaA boxes (red), and the dnaN and dnaA genes (yellow) remain similar to those of other origins. The structure shown in Figure 2 is consistent with reports of similar origin structures in Gram-positive bacteria, where ori regions contain multiple repeats of dnaA boxes upstream and downstream of the main initiation activator gene, dnaA. As shown in Table 1, scaffold 10 contains the greatest number of dnaA boxes: 104 across all indicator bacterial genera examined. We identified 15 unique dnaA-box sequences, with the remaining 89 representing variations on these 15 sequences. The fine structure of the origin that we propose for scaffold 10 is supported by various reports of similar structures, which supports the hypothesis that the origin of replication is located in scaffold 10.
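High AT content, one of the origin indicators named in the discussion, can be profiled with a sliding window. The Python sketch below is illustrative only. The window and step sizes are arbitrary assumptions, the sequence is a toy example rather than scaffold 10, and the function names are hypothetical.

```python
from typing import List, Tuple


def at_content(seq: str) -> float:
    """Fraction of A and T bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def sliding_at_content(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """Return (start, AT fraction) for each full window along the sequence."""
    return [
        (start, at_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def richest_window(seq: str, window: int, step: int) -> Tuple[int, float]:
    """Return the (start, AT fraction) of the most AT-rich window."""
    return max(sliding_at_content(seq, window, step), key=lambda item: item[1])


# Toy sequence: GC-rich flanks around an AT-rich core.
toy = "GC" * 10 + "AT" * 10 + "GC" * 10
profile = sliding_at_content(toy, window=20, step=10)
peak = richest_window(toy, window=20, step=10)
print(profile)
print(peak)
```

On a real scaffold, windows where an AT-rich peak coincides with a cluster of dnaA boxes would be the regions to inspect more closely.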

Columns S1 to S12 are scaffolds 1 to 12. Entries are dnaA boxes / putative oriC regions; 0 indicates no match, and - indicates that no oriC count was given.

Indicator bacterium   S1    S2    S3    S5    S7    S8    S9     S10    S11   S12
Escherichia coli      0     0     0     3/1   0     0     4/1    20/3   0     0
Chlamydiae            0     3/1   4/1   0     0     0     0      11/3   0     0
Dehalococcoides       0     0     0     0     4/1   0     0      0      0     3/1
Prochlorales          0     3/1   0     0     0     0     0      11/3   0     0
Synechococcus         0     3/1   0     0     0     0     0      11/-   0     0
Haemophilus           3/1   0     0     3/1   3/1   3/1   11/3   20/3   3/1   3/1
Nocardia              0     0     0     0     3/1   0     0      10/3   0     0
Mycoplasma            0     3/1   0     3/1   0     0     4/1    21/3   0     0

REFERENCES/ACKNOWLEDGEMENTS

Jakimowicz, D. et al. 1998. Structural elements of the Streptomyces oriC region and their interactions with the DnaA protein. Microbiology 144:1281-1290.

Moriya, S. and N. Ogasawara. 1996. Mapping of the replication origin of the Bacillus subtilis chromosome by the two-dimensional gel method. Gene 176:81-84.

We would like to thank the National Science Foundation Advanced Technological Education Program (NSF ATE DUE 0802508, “The Biotechnology Research Learning Collaborative”, BRLC) and the US Department of Education HSI-STEM Program (P031C110190, STEM-TRACK) for their support of this research project.

Figure 1a. Architecture of a typical E. coli origin of replication. Indicators of origin structure are shown: the oriC AT-rich region (yellow), dnaA boxes (red), and dnaN (grey).

Figure 2. Geneious Pro v.5.5.6 view of scaffold 10 showing the architecture of the putative ori region. The annotated sequence shows the dnaA boxes (red), indicator genes and proteins (yellow), and the putative ori region (grey).

Table 1. dnaA boxes and putative oriC regions found in scaffolds 1-12 when matched against the dnaA boxes of the indicator bacteria. Only the scaffolds and bacteria with results are shown.

Figure 1b. Architecture of the origin of replication of Bacillus subtilis, a typical Gram-positive bacterium. Adapted from Jakimowicz et al., 1998. Arrows indicate the direction of the dnaA boxes. Other genetic markers surrounding the ori region are also indicated.

CONCLUSION

We have proposed a comprehensive structure for the origin of replication of Clostridium taeniosporum, localizing it to a region within genome scaffold 10. The protein-coding sequences around this structure are supported by protein homology analyses and by multiple reports describing a similar structure in many Gram-positive bacteria. The available evidence therefore supports the structure proposed here. Future efforts will be directed at confirming the viability of this structure (Blinkova, personal communication).