Stephen Koury
Lab Notebook
Organism: Kytococcus sedentarius DSM 20547 CP001686
Locus: Ksed_00010
[-] Basic Information
Module Instructions
DNA Coordinates
go to the Gene Page
DNA coordinates: 209..1729
DNA Sequence
go to the Gene Page
Nucleotide sequence (FASTA format; see module Quick Links for instructions)
>Ksed_00010 nucleotide sequence
GTGAGCCAGACCCCCGACGACCACGCCACCGCCATCTGGCAGGAGGCCATGGTCCACCTCCAGGGAGCAGGCCTGGCCCCGCGCGACATCGGGGTGCTCCGGCTGGCCACGCTCGTGGGTCTGCTGGAGGGCACTGCCCTGCTCGCGGTGAAGTACGACCACGTCAAGGACGCCGTCGAGGGGCACCTGCGCGAGGACGTGTCCACCGCCCTGGCGGAGGTCCTGGACCGTGACATCCGGCTGGCCGTCTCGGTGGACCCCGATGCGGTGAGCGCCGCCCAGGAGGAGGCCGCACCCCCGGCCCCGTCCCCGGCCGATGAGGACGACCCGGCCACAGGTGAGGGACCGTTGTCCACAGCTGTGGACGGAGCCGTGGAAAAGCACGAGGGAAGCAGTCCGGCACGTGCCGGGGAATCGGTGGCGCCGGCCACGACGGCCAGCCTGACGGCGACAAACTCCTCACCCGGTGTGGAGCGCGATTACTCCGCGCTGAACCACAAGTACACTTTCGACACCTTCGTGCTGGGGTCGTCGAACCGTTTCGCCCACGCCGCAGCGACCGCCGTGGCCGAAGCCCCCGCCCGCGCCTACAACCCGCTGTTCATCTACGGCGGATCAGGTCTGGGCAAGACCCACCTGTTGCACGCCATCGGCCACTACGCCCGCACCCTGGATTCCTCGGTGCGCGTGAAGTACGTGAACTCGGAGGAGTTCACCAACCAGTTCATCAACGCGGTCTCGGCCGGCCAGGCGAATGCCTTCCAGCGCCAGTACCGCGATGTGGACGTCCTGCTCATCGACGACATCCAGTTCCTGCAGGGCAAGGAGCAGACGATGGAGGAGTTCTTCCACACCTTCAACACCCTGCACAACAGCGAGAAGCAGATCGTCATCACCTCCGACCAGCCCCCGAAGAAGCTCAGTGGCTTCGCCGAGCGCATGCGCTCGCGTTTCGAGTGGGGTCTGCTCACCGACGTGCAGCCGCCGGACCTGGAGACCCGCATCGCGATCCTCCGGCGCAAGGCAGCGGCCGACAAGCTGGACATCCCCGATGACGTGCTCCACCTCATCGCGTCGAAGATCTCCTCGAACATCCGCGAGCTCGAGGGGGCCCTGACCCGGGTGACGGCCTTCGCGAGCCTGTCCGGGTCGCCCCTGGACGAGTACCTGGCCCGCACGGTGCTCAAGGACGTGATGCCCGGCGGTGACAGCGGCCAGATCACGCCCACGATGATCCTGGAGGAGACCGCGGGGTACTTCGTCATCTCCGTCGAGGAGATCCAGGGCGCCTCCCGCTCGCGCAACCTGACCCGGGCCCGGCAGATCGCCATGTACCTGTGCCGCGAGCTCACGGACCTCTCGCTGCCGAAGATCGGCAAGGAGTTCGGCGGC
CGCGACCACACGACCGTCATGCACGCCGAGCGCAAGATCAAGCAGCTGCTCGGGGAGGACCGCCGGGTCTACGACGAGGTGAGCGAGCTCACCAGCATCATCCGCAAGAAGGCGGCGCGCGGCCGCTGA
http://geni-act.org/lab_notebook/public/8ed2a45227a744df/ 11/7/15, 7:37 PM Page 1 of 39
Sequence length: 1521 nucleotides
Protein Sequence
go to the Gene Page
Amino acid sequence
>Ksed_00010_amino acid seq
MSQTPDDHATAIWQEAMVHLQGAGLAPRDIGVLRLATLVGLLEGTALLAVKYDHVKDAVEGHLREDVSTALAEVLDRDIRLAVSVDPDAVSAAQEEAAPPAPSPADEDDPATGEGPLSTAVDGAVEKHEGSSPARAGESVAPATTASLTATNSSPGVERDYSALNHKYTFDTFVLGSSNRFAHAAATAVAEAPARAYNPLFIYGGSGLGKTHLLHAIGHYARTLDSSVRVKYVNSEEFTNQFINAVSAGQANAFQRQYRDVDVLLIDDIQFLQGKEQTMEEFFHTFNTLHNSEKQIVITSDQPPKKLSGFAERMRSRFEWGLLTDVQPPDLETRIAILRRKAAADKLDIPDDVLHLIASKISSNIRELEGALTRVTAFASLSGSPLDEYLARTVLKDVMPGGDSGQITPTMILEETAGYFVISVEEIQGASRSRNLTRARQIAMYLCRELTDLSLPKIGKEFGGRDHTTVMHAERKIKQLLGEDRRVYDEVSELTSIIRKKAARGR
Sequence length: 506 aa
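The coordinates, nucleotide length and protein length recorded above can be cross-checked with a few lines of Python. This is a minimal sketch, not part of the GENI-ACT workflow: the helper names `gene_length` and `translate` are hypothetical, the codon table is the standard one, and treating a GTG or TTG in the first position as methionine is an assumption based on the usual bacterial convention. Only the first 30 bases of the recorded sequence are used as input.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Assumption: alternative bacterial start codons are read as Met at position 1.
BACTERIAL_STARTS = {"ATG", "GTG", "TTG"}


def gene_length(start, end):
    """Length in nucleotides of a gene given inclusive 1-based coordinates."""
    return end - start + 1


def translate(dna):
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        if i == 0 and codon in BACTERIAL_STARTS:
            aa = "M"
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


n_nt = gene_length(209, 1729)
print(n_nt)           # nucleotides, including the stop codon
print(n_nt // 3 - 1)  # codons minus the stop codon = amino acids
print(translate("GTGAGCCAGACCCCCGACGACCACGCCACC"))  # first 10 codons
```

The numbers agree with the notebook entries: 1521 nucleotides correspond to 507 codons, one of which is the stop codon, leaving 506 amino acids.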
[-] Sequence-based Similarity Data
Module Instructions
BLAST
go to BLAST at http://www.ncbi.nlm.nih.gov/blast
Gene product name (top hit)
Note that results from both the Swiss-Prot and nr databases have been recorded. The Swiss-Prot data for both top hits are shown in the "BLAST first hit" section of the notebook and the nr data are shown in the second hit section of the notebook.
Hit #1 Swiss-Prot Database - chromosomal replication initiator protein DnaA
Hit #2 Swiss-Prot Database - chromosomal replication initiator protein DnaA
Organism
Hit #1 Swiss-Prot Database - Kineococcus radiotolerans SRS30216
Hit #2 Swiss-Prot Database - Mycobacterium vanbaalenii PYR-1
Alignment Length
Hit #1 Swiss-Prot Database - 493
Hit #2 Swiss-Prot Database - 504
Score
See image files below for both SwissProt hits
E-value
See image files below for both SwissProt hits
Alignment of the top hit and the query sequence
SwissProt Database 2 top hits:
Gene product name (second hit)
Hit #1 nr Database - chromosomal replication initiator protein DnaA
Hit #2 nr Database - chromosomal replication initiator protein DnaA
Organism
Hit #1 nr Database - Ornithinimicrobium pekingense
Hit #2 nr Database - Serinicoccus profundi
Alignment Length
Hit #1 nr Database - 503
Hit #2 nr Database - 503
Score
See image files below for both nr hits
E-value
See image files below for both nr hits
Alignment of the second hit and the query sequence
CDD
click on the CDD search results at the top of the BLAST results page
COG number (top hit): COG0593
COG name: DnaA: Chromosomal replication initiation ATPase DnaA [Replication, recombination and repair]
E-value: 1.03e-153
COG number (second hit): No second hit found
COG name
E-value
T-Coffee
go to T-Coffee at http://www.ebi.ac.uk/Tools/msa/tcoffee
Sequences used for alignment
Note that the nr database BLAST results were used to select sequences for alignment, since those results contained a much larger number of good hits.
>[Kytococcus sedentarius DSM 20547]
MSQTPDDHATAIWQEAMVHLQGAGLAPRDIGVLRLATLVGLLEGTALLAVKYDHVKDAVEGHLREDVSTALAEVLDRDIR
LAVSVDPDAVSAAQEEAAPPAPSPADEDDPATGEGPLSTAVDGAVEKHEGSSPARAGESVAPATTASLTATNSSPGVERD
YSALNHKYTFDTFVLGSSNRFAHAAATAVAEAPARAYNPLFIYGGSGLGKTHLLHAIGHYARTLDSSVRVKYVNSEEFTN
QFINAVSAGQANAFQRQYRDVDVLLIDDIQFLQGKEQTMEEFFHTFNTLHNSEKQIVITSDQPPKKLSGFAERMRSRFEW
GLLTDVQPPDLETRIAILRRKAAADKLDIPDDVLHLIASKISSNIRELEGALTRVTAFASLSGSPLDEYLARTVLKDVMP
GGDSGQITPTMILEETAGYFVISVEEIQGASRSRNLTRARQIAMYLCRELTDLSLPKIGKEFGGRDHTTVMHAERKIKQL
LGEDRRVYDEVSELTSIIRKKAARGR
>gi|551300082|ref|WP_022920049.1| chromosomal replication initiator protein DnaA [Ornithinimicrobium pekingense]
MTSQSPAESAEVWQRVVSQLESQGVTARDRAFLRLTQLVGLLDTTALLAVPYQHTKETLETTLRQPIVDALAGELGHDVR
LAITVDEDLRRQVEDEGDPAPGPAVTEQVPSDPDRTPYRSNGAGPGEPRSDGHRTPSGAVQTASAEDARLNPKYTFDTFV
SGSSNRFAHAASLAVAESPARAYNPLFIYGESGLGKTHLLHAIGHYARSLYPGVRVRYVNSEEFTNDFINSIRDDKAGAF
QRRYRNVDFLLVDDIQFLQGKEQTVEEFFHTFNTLHNSEKQVVITSDQPPKRLSGFAERMRSRFEWGLLTDVQPPDLETR
IAILKKKAAQEGMQLPDEVLELIGSKISTNIRELEGALIRVTAFASLSSTPPDAALASHVLKDIIPNSESAAITVPTIMA
EVADYFQISNDDLCGTSRSRTLVNARQIAMYLCRELTDLSLPKIGQEFGGRDHTTVMHAERKIRQLIGERRALYDQITEL
TGIIRKASAR
>gi|497833122|ref|WP_010147278.1| chromosomal replication initiator protein DnaA [Serinicoccus profundi]
MSQPSTDSGDTWRRVVSELEDKGLGAREKAFLRLTTMVGVLDSTVLLAVPYPHTKEMLETTLRQPIVDLLSRELDREVRL
AITVDDDVRQRVEDEADDEADEDAQTRESLTRPASQPSSSAGAGVPGPSGNGIPRPATPAGPAVTGAADEARLNPKYSFD
TFVSGPSNRFAHAASLAVAESPARAYNPLFIYGESGLGKTHLLHAIGHYARKLYPGVRVRYVNSEEFTNDFINSIRDDKA
GAFQRRYRNVDFLLVDDIQFLQGKEQTVEEFFHTFNTLHNSEKQVVITSDQPPKRLSGFAERMRSRFEWGLLTDVQPPDL
ETRIAILRKKAAQEGMQLPDEVLEHIASRITTNIRELEGALIRVTAFASLSSQPADADLAAHVLKDIVPGSDTAQITVST
IIREVSEYFQISIDELCGTSRSRTLVNARQIAMYLCRELTDLSLPKIGQEFGGRDHTTVMHAERKIRAQIGERRALYDQI
AELTGTIRRASQR
>gi|737975618|ref|WP_035938084.1| chromosomal replication initiator protein DnaA [Knoellia aerolata] >gi|700180054|gb|KGN40755.1| chromosomal replication initiation protein [Knoellia aerolata DSM 18566]
MDQIWRTTLDALDSDGIPVQQRAFLSLARLVGLLDETALIAVPNDFTKDIVETRLRDRVTETLSSQLGHTVRLAVTVDSS
LGDVPVLDPPADAPSGSTTTEPRPAAGTEGDGRHAERRAELDGIALVEDDDDGDSSRTGRSVAHTRSPGALRPRPGVTVP
EQVELTRLNPKYTFDTFVIGASNRFANAAALAVAETPAKAYNPLFIYGESGLGKTHLLHAIGHYARNLFPHVKVRYVNSE
EFTNDFINSIRDDKAANFQRRYRDVDVLLIDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDLPPKLLSGFEERMRS
RFEWGLMTDVQPPDLETRIAILRKKAAQEKLSVPDDVLEFIASRISTNIRELEGALIRVTAFASLNRQPVDISLAEIVLK
DLIPHDSSSQITSATIMAQTAAYFGLTLEDLQGQSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADKK
IRQLMAERRAIYNQVTELTNRIKQQSR
>gi|914680953|ref|WP_050671265.1| chromosomal replication initiation protein [Luteipulveratus sp. C296001] >gi|912506872|gb|KNX38763.1| chromosomal replication initiation protein [Luteipulveratus sp. C296001]
MSEDKPDLAHVWHSTVMALEETGIAARDRAFLRLTRLVGLVDQTALLAVPFDHTKDILETSLREPVSSALSRLLDRDVRL
AVTVDPQLQQTVSASAEAKDAGIGEDEVAETDAEEAGTPAPALIRPVRASNRPPVEPVTDARLNPKYTFDTFVIGASNRF
AHAAAFAVAEAPAKSYNPLFVYGDSGLGKTHLLHAIGHYVRNLYPSMRVRYVNSEEFTNDFINSIRDDKASSFQKRYRDD
VDVLLIDDVQFLQGKDGTQEEFFHTFNALHDSEKQIVLTSDQPPKKLSGFADRMRSRFEWGLQTDVTPPDLETRIAILRK
KAIAERMNVPDDVLELIASKFSTNIRELEGALIRVMAFASLSQQPVDQQVAGYVLKDLVPSAGSSQITATLIMTKTAEYF
HVSVEELCGSSRSRTLVTARQIAMYLCRELTELSLPKIGQQFGGRDHTTVMHADRKIRQLIGERRPIYDQITELTGQITR
AAAG
>gi|497462772|ref|WP_009776970.1| chromosomal replication initiator protein DnaA [Janibacter sp. HTCC2649] >gi|84382082|gb|EAP97964.1| chromosomal replication initiator protein [Janibacter sp. HTCC2649]
MDQIWRTTLDALDSDGIPVQQRAFLSLAKLVGLLDETALIAVPNDFTKDIVETRLRDRVTETLSSQLGHDVRLAVTVDHS
LADVPVTIPADTTTVDGAGADQVPRTATTIGLEPGPADADGRRAKRRAELDGIALVEDDEGEDDSRNNGAIGRTRSPGAL
RPRPGATVPEQVELTRLNPKYTFDTFVIGASNRFANAAALAVAETPAKAYNPLFIYGESGLGKTHLLHAIGHYARNLYPH
VKVRYVNSEEFTNDFINSIRDDKAANFQRRYRDVDVLLIDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDLPPKLL
SGFEERMRSRFEWGLMTDVQPPDLETRIAILRKKAAQEKLSVPDDVLEFIASRISTNIRELEGALIRVTAFASLNRQPVD
ISLAEIVLKDLIPHDSANQITSATIMAQTAAYFGLTLEDLQGQSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDH
TTVMHADKKIRQLMAERRAIYNQVTELTNRIKQQSR
>gi|872700517|ref|WP_048552171.1| chromosomal replication initiation protein [Tetrasphaera japonica] >gi|665501165|emb|CCH80159.1| chromosomal replication initiator protein DnaA [Tetrasphaera japonica T1-X7]
MDYSRVWRTALDELDADGLPIQQRAFLSLARFVGLLDDTALIAVPNDYTKEFVETRVRQQLTQTLSNHAGRELRLAVTVD
PGLKAADEAEPTLGAPAEAWGPRPRAGIGPVDDEEDHEAQDVIDARGVDTFHIQRPTPGQRPVPELIENTRLNPKYTFDT
FVIGASNRFAHAAAVAVAEAPAKAYNPLFVYGDSGLGKTHLLHAIGHYARNLYPTVRVRYVNSEEFTNDFINSIRDDKAA
NFQSRYRDVDVLLIDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDVPPKLLSGFEERMRSRFEWGLLTDVQPPDLE
TRIAILRKKAIQERLSVPDDVLEFIASRISTNIRELEGALIRVTAFASLNRQPVDMSLAEIVLKDLIPSDGANQITSATI
MAQTATYFGLTVEDLQGASRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADKKIRQLMAERRAIYNQVT
ELTNRIKQQSR
>gi|656266264|ref|WP_029212190.1| chromosomal replication initiator protein DnaA [Arsenicicoccus bolidensis]
MTDAQVDVPRVWRDTLRALESGGISAQHRGFLRLSRLVGLLEGTALIAVPNDYTRDIVEKRIRTELVAALQEQLGRDVRL
AVTVDSSLELSEAEDRDDSTQRPGSPGEVPHVVTSSDPVRHDGERGGSDPRSTYGPRLVRDERVPRPVSTDASFGGERPY
ADPARHDGGPGHAPGIRPAQDDDADDDRELLAEGDGIREMFRKPFVPEDGRDAKLNPKYTFDTFVIGSSNRFAHAAAIAV
AEAPAKAYNPLFIYGGSGLGKTHLLHAIGHYAQQIYPNVRVKYVNSEEFTNDFINSIGANKASDFQRRYRDIDFLLIDDI
QFLQGKVQTQEEFFHTFNTLHNANKQVVITSDVAPKLLSGFEERMRSRFEWGLLTDVQPPDLETRIAILRKKAVQERMTA
PDDVLEFIASKISTNIRELEGALIRVTAFASLNHQGVDMTLAEVVLKDLIPADQTNQITPATIMAQTASYFGLTVDDLCG
TSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADRKIRELMSERRAIYNQVTELTNRIKSNPS
>gi|918414635|ref|WP_052465811.1| chromosomal replication initiation protein [Mobilicoccus sp. SIT2]
MSGSGEPPTTPAVDDPARIWGATLRALDQAGIPAPQRAFLRQAMLVGVLDTTALIAVPDDFTKEIVESRARDYLVKALTE
QVGREVRLAVTVDASLREQIAAAEQPVLEGYADPDSDVAAAEPPAASPTAAAPASPPVEDSPAPVGERRPRPAPTGRREE
SSQLNPKYTFDTFVIGASNRFAHAAAVAVAEAPAKAYNPLFVYGDSGLGKTHLLHAIGHYARNLYPHVKVKYVNSEEFTN
DFINSIRDDKAAGFQRRYRDMDVLLIDDIQFLQGKMQTQEEFFHTFNTLHNSNKQVVITSDVPPKQLSGFEERMRSRFEW
GLLTDVQPPDLETRIAILRKKAVQEHLALPDEVMEFIASRISTNIRELEGALIRVTAFANLNRQPVEMSLAEIVLKDLIP
DKESSQITASMIMGQTAAYFGLSIDDLCGSSRSRGLVTARQIAMYLCRELTELSLPKIGQQFGGRDHTTVMHADRKIRQL
MAERRSVYNQVTELTTRIRSQAS
>gi|503647989|ref|WP_013882065.1| chromosomal replication initiator protein DnaA [[Cellvibrio] gilvus] >gi|336102716|gb|AEI10535.1| chromosomal replication initiator protein DnaA [[Cellvibrio] gilvus ATCC 13127]
MAQDEELSRVWGHVVTTLEESPDITQRQLAFVRLAQPLGLLDGTIILAVGNEYTKEYLETKVRAEVTSALGSALGRDGRF
AITVDPSLVDDAPPAVRAMTSAPELGVVTDGTDERGAPNRTVPTDADTGRHERSPMLSESAEPTRPVRETASSRRPAAEP
ARLNPHYLFETFVIGSSNRFAHAAAVAVAEAPAKAYNPLFIYGDSGLGKTHLLHAIGHYAQNLYPSVRVRYVNSEEFTND
FINSISEGKAGAFQRRYREVDVLLIDDIQFLQGKEQTMEEFFHTFNTLHNANKQVVITSDLPPKQLNGFEDRMRSRFEWG
LITDVQPPDLETRIAILRKKAGGDNMQAPPDVLEYIASKISTNIRELEGALIRVTAFASLNRQQVDLSLAEIVLKDLITD
DQTTEITATQVIGQTAAYFGLSIEDLCGSSRSRVLVTARQIAMYLCRELTDLSLPKIGQAFGGRDHTTVMHANRKIRELM
AERRSIYNQVTELTNRIKQQSRG
>gi|739084600|ref|WP_036955733.1| chromosomal replication initiator protein DnaA [Promicromonosporaceae bacterium W15]
MAEVWSAARIQLENDPDVTPRQRGYVRLVAPLAHIDDTVFLKVSDEPIRSFIETNLRADLVGALAGVLGYEPKLAISVDP
DLQIADDDTDPAAPVYARPAPTTATSPAAQRMPTDEPDHGLSNPVGFGATVGLPEAMPAAPGSARAAQHAPRHAEPTAPL
GENSRLNPKYLFETFVIGASNRFAHAAAVAVAEAPAKAYNPLFIYGDSGLGKTHLLHAIGHYAQSLYPNVRVRYVNSEEF
TNDFINSIGEGKAGAFQRRYRDVDVLLIDDIQFLQGKEQTMEEFFHTFNALHNANKQVVITSDLPPKQLNGFEDRLRSRF
EWGLITDVQPPDLETRIAILRKKALQERLDAPDDVLEYIASRISTNIRELEGALIRVTAFANLNRQPVDQSLAEIVLKDL
ITDDDAGEITAASVIAQTAAYFGLTIDDLCGSSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHANRKIS
EQMAERRSIYNQVTELTSRIKQQHRG
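Before wrapped FASTA records like those above are submitted to an alignment tool, it can help to confirm that each one reassembles into a single sequence of the expected length. The following is a minimal sketch under stated assumptions: `parse_fasta` is a hypothetical helper (not a T-Coffee or NCBI function), the two demo records are shortened toy inputs, and each header is assumed to be unique.

```python
def parse_fasta(text):
    """Return {header: sequence} for FASTA text with wrapped sequence lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            records[header].append(line)
    # Join the wrapped lines of each record into one sequence string.
    return {name: "".join(parts) for name, parts in records.items()}


# Toy input: two short records, each wrapped over more than one line.
demo = """>seqA demo
MSQTP
DDHAT
>seqB demo
MTSQS
PAESA
EV
"""

for name, seq in parse_fasta(demo).items():
    print(name, len(seq))
```

Applied to the notebook's own records, the same check should report 506 residues for the Kytococcus sedentarius entry, matching the protein length recorded under Basic Information.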
Multiple sequence alignment
[Kytococcus                       MSQTPDDHATAIWQEAMVHLQ-GAGLAPRDIGVLRLATLVGLLEGTALLA
gi|478759289|emb|CCH68940.1|      MADA---SMTSVWVRILRALD-REGVSHQERAFLSITRLAGVLDETALIA
gi|497462772|ref|WP_009776970.1|  M--------DQIWRTTLDALD-SDGIPVQQRAFLSLAKLVGLLDETALIA
gi|497833122|ref|WP_010147278.1|  MSQ-PSTDSGDTWRRVVSELE-DKGLGAREKAFLRLTTMVGVLDSTVLLA
gi|503647989|ref|WP_013882065.1|  MAQD--EELSRVWGHVVTTLEESPDITQRQLAFVRLAQPLGLLDGTIILA
gi|551300082|ref|WP_022920049.1|  MTSQSPAESAEVWQRVVSQLE-SQGVTARDRAFLRLTQLVGLLDTTALLA
gi|656266264|ref|WP_029212190.1|  MTDA-QVDVPRVWRDTLRALE-SGGISAQHRGFLRLSRLVGLLEGTALIA
gi|665501165|emb|CCH80159.1|      MD------YSRVWRTALDELD-ADGLPIQQRAFLSLARFVGLLDDTALIA
gi|737975618|ref|WP_035938084.1|  M--------DQIWRTTLDALD-SDGIPVQQRAFLSLARLVGLLDETALIA
gi|750471801|ref|WP_040753674.1|  M--------TSVWVRILRALD-REGVSHQERAFLSITRLAGVLDETALIA
                                  *           *   :  *:   .:  :. ..: ::   *:*: * ::*

[Kytococcus                       VKYDHVKDAVEGHLREDVSTALAEVLDRDIRLAVSVDPDAVSAAQEEAAP
gi|478759289|emb|CCH68940.1|      VPNDFSKDIVETRLRGRISGHLTAELDRPLRLAVTVDPSLAEAEPLDLDA
gi|497462772|ref|WP_009776970.1|  VPNDFTKDIVETRLRDRVTETLSSQLGHDVRLAVTVDHSLADVPVTIPAD
gi|497833122|ref|WP_010147278.1|  VPYPHTKEMLETTLRQPIVDLLSRELDREVRLAITVDDDVRQRVEDEADD
gi|503647989|ref|WP_013882065.1|  VGNEYTKEYLETKVRAEVTSALGSALGRDGRFAITVDPSLVDDAPPAVRA
gi|551300082|ref|WP_022920049.1|  VPYQHTKETLETTLRQPIVDALAGELGHDVRLAITVDEDLRRQVEDEGD-
gi|656266264|ref|WP_029212190.1|  VPNDYTRDIVEKRIRTELVAALQEQLGRDVRLAVTVDSSLELSEAEDRDD
gi|665501165|emb|CCH80159.1|      VPNDYTKEFVETRVRQQLTQTLSNHAGRELRLAVTVDPGLKAADEAEPTL
gi|737975618|ref|WP_035938084.1|  VPNDFTKDIVETRLRDRVTETLSSQLGHTVRLAVTVDSSLGDVPVLDPPA
gi|750471801|ref|WP_040753674.1|  VPNDFSKDIVETRLRGRISGHLTAELDRPLRLAVTVDPSLAEAEPLDLDA
                                  *   . :: :*  :*  :   *    .:  *:*::** .

[Kytococcus                       -PAP----------SPADEDDPATGEGPLS--------------------
gi|478759289|emb|CCH68940.1|      HDSQDHSLATD--PQRATQDDPALVNGVADL----AVVD-----------
gi|497462772|ref|WP_009776970.1|  -TTTVDGAGADQVPRTATTI--GLEPGP----------------------
gi|497833122|ref|WP_010147278.1|  -EAD----------EDAQTRESL---------------------------
gi|503647989|ref|WP_013882065.1|  M-------TSA--PELGVVTDGTDERGAPNR----TV-P-----------
gi|551300082|ref|WP_022920049.1|  -PAP----------GPAVTEQVP---------------------------
gi|656266264|ref|WP_029212190.1|  STQRPGSPGEV--PHVVTSSDPVRHDGERGGSDPRSTYGPRLVRDERVPR
gi|665501165|emb|CCH80159.1|      -GAPAEA--WG--PRPRA--------------------------------
gi|737975618|ref|WP_035938084.1|  -DAPSGSTTTE--PRPAA----G---------------------------
gi|750471801|ref|WP_040753674.1|  HDSQDHSLATD--PQRATQDDPALVNGVADL----AVVD-----------

[Kytococcus                       -------------------------TAVDG--AVEKHEGSSPARA-GESV
gi|478759289|emb|CCH68940.1|      ---L----------DDPEVRRAQRRLELDGL-------DADETPM-PRAA
gi|497462772|ref|WP_009776970.1|  --------------ADADGRRAKRRAELDGIALVEDDEGEDDSRN-NGAI
gi|497833122|ref|WP_010147278.1|  -----------------------------------TRPASQPSSSAGAGV
gi|503647989|ref|WP_013882065.1|  --------------TDADTGR------------------HERSPMLSESA
gi|551300082|ref|WP_022920049.1|  -----------------------------------SDPDRTPYRSNGAGP
gi|656266264|ref|WP_029212190.1|  PVSTDASFGGERPYADP-ARHDGGPGHAPGIRPAQDDD-ADDDRE-LL-A
gi|665501165|emb|CCH80159.1|      -----------------------------GIGPVDDEEDHEAQDV-IDA-
gi|737975618|ref|WP_035938084.1|  --------------TEGDGRHAERRAELDGIALVEDDDDGDSSRT-GRSV
gi|750471801|ref|WP_040753674.1|  ---L----------DDPEVRRAQRRLELDGL-------DADETPM-PRAA

[Kytococcus                       A-PATTASL---TATNSSPGVERDYSALNHKYTFDTFVLGSSNRFAHAAA
gi|478759289|emb|CCH68940.1|      EPTGIPANLR-----RGSNPENVELTRLNPKYTFETFVIGASNRFAHAAA
gi|497462772|ref|WP_009776970.1|  GRTRSPGALRP-RPG-ATVPEQVELTRLNPKYTFDTFVIGASNRFANAAA
gi|497833122|ref|WP_010147278.1|  PGPSGNGIPRPATPAGPAVTGAADEARLNPKYSFDTFVSGPSNRFAHAAS
gi|503647989|ref|WP_013882065.1|  EPTR-PVRE----T-ASSRRPAAEPARLNPHYLFETFVIGSSNRFAHAAA
gi|551300082|ref|WP_022920049.1|  GEPRSDGHR---TPSGAVQTASAEDARLNPKYTFDTFVSGSSNRFAHAAS
gi|656266264|ref|WP_029212190.1|  EGDGIREMFR-----KPFVPEDGRDAKLNPKYTFDTFVIGSSNRFAHAAA
gi|665501165|emb|CCH80159.1|      RGVDTFHIQRP-TPGQRPVPELIENTRLNPKYTFDTFVIGASNRFAHAAA
gi|737975618|ref|WP_035938084.1|  AHTRSPGALRP-RPG-VTVPEQVELTRLNPKYTFDTFVIGASNRFANAAA
gi|750471801|ref|WP_040753674.1|  EPTGIPANLR-----RGSNPENVELTRLNPKYTFETFVIGASNRFAHAAA
                                                           : ** :* *:*** *.*****:**:

[Kytococcus                       TAVAEAPARAYNPLFIYGGSGLGKTHLLHAIGHYARTLDSSVRVKYVNSE
gi|478759289|emb|CCH68940.1|      TAVGETPAKAYNPLFIYGGSGLGKTHLLHAIGHYARSLYPNVKVRYVNSE
gi|497462772|ref|WP_009776970.1|  LAVAETPAKAYNPLFIYGESGLGKTHLLHAIGHYARNLYPHVKVRYVNSE
gi|497833122|ref|WP_010147278.1|  LAVAESPARAYNPLFIYGESGLGKTHLLHAIGHYARKLYPGVRVRYVNSE
gi|503647989|ref|WP_013882065.1|  VAVAEAPAKAYNPLFIYGDSGLGKTHLLHAIGHYAQNLYPSVRVRYVNSE
gi|551300082|ref|WP_022920049.1|  LAVAESPARAYNPLFIYGESGLGKTHLLHAIGHYARSLYPGVRVRYVNSE
gi|656266264|ref|WP_029212190.1|  IAVAEAPAKAYNPLFIYGGSGLGKTHLLHAIGHYAQQIYPNVRVKYVNSE
gi|665501165|emb|CCH80159.1|      VAVAEAPAKAYNPLFVYGDSGLGKTHLLHAIGHYARNLYPTVRVRYVNSE
gi|737975618|ref|WP_035938084.1|  LAVAETPAKAYNPLFIYGESGLGKTHLLHAIGHYARNLFPHVKVRYVNSE
gi|750471801|ref|WP_040753674.1|  TAVGETPAKAYNPLFIYGGSGLGKTHLLHAIGHYARSLYPNVKVRYVNSE
                                   **.*:**:******:** ****************: : . *:*:*****

[Kytococcus                       EFTNQFINAVSAGQANAFQRQYRDVDVLLIDDIQFLQGKEQTMEEFFHTF
gi|478759289|emb|CCH68940.1|      EFTNDFINSVRDGKAAEFQRRYRYVDVLLIDDIQFLQGKEQTQEEFFHTF
gi|497462772|ref|WP_009776970.1|  EFTNDFINSIRDDKAANFQRRYRDVDVLLIDDIQFLQGKVQTQEEFFHTF
gi|497833122|ref|WP_010147278.1|  EFTNDFINSIRDDKAGAFQRRYRNVDFLLVDDIQFLQGKEQTVEEFFHTF
gi|503647989|ref|WP_013882065.1|  EFTNDFINSISEGKAGAFQRRYREVDVLLIDDIQFLQGKEQTMEEFFHTF
gi|551300082|ref|WP_022920049.1|  EFTNDFINSIRDDKAGAFQRRYRNVDFLLVDDIQFLQGKEQTVEEFFHTF
gi|656266264|ref|WP_029212190.1|  EFTNDFINSIGANKASDFQRRYRDIDFLLIDDIQFLQGKVQTQEEFFHTF
gi|665501165|emb|CCH80159.1|      EFTNDFINSIRDDKAANFQSRYRDVDVLLIDDIQFLQGKVQTQEEFFHTF
gi|737975618|ref|WP_035938084.1|  EFTNDFINSIRDDKAANFQRRYRDVDVLLIDDIQFLQGKVQTQEEFFHTF
gi|750471801|ref|WP_040753674.1|  EFTNDFINSVRDGKAAEFQRRYRYVDVLLIDDIQFLQGKEQTQEEFFHTF
                                  ****:***::  .:*  ** :** :*.**:********* ** *******

[Kytococcus                       NTLHNSEKQIVITSDQPPKKLSGFAERMRSRFEWGLLTDVQPPDLETRIA
gi|478759289|emb|CCH68940.1|      NALHNANKQVVVTSDVAPKQLAGMEERLRSRLEWGLLTDVQPPDLETRIA
gi|497462772|ref|WP_009776970.1|  NTLHNANKQVVITSDLPPKLLSGFEERMRSRFEWGLMTDVQPPDLETRIA
gi|497833122|ref|WP_010147278.1|  NTLHNSEKQVVITSDQPPKRLSGFAERMRSRFEWGLLTDVQPPDLETRIA
gi|503647989|ref|WP_013882065.1|  NTLHNANKQVVITSDLPPKQLNGFEDRMRSRFEWGLITDVQPPDLETRIA
gi|551300082|ref|WP_022920049.1|  NTLHNSEKQVVITSDQPPKRLSGFAERMRSRFEWGLLTDVQPPDLETRIA
gi|656266264|ref|WP_029212190.1|  NTLHNANKQVVITSDVAPKLLSGFEERMRSRFEWGLLTDVQPPDLETRIA
gi|665501165|emb|CCH80159.1|      NTLHNANKQVVITSDVPPKLLSGFEERMRSRFEWGLLTDVQPPDLETRIA
gi|737975618|ref|WP_035938084.1|  NTLHNANKQVVITSDLPPKLLSGFEERMRSRFEWGLMTDVQPPDLETRIA
gi|750471801|ref|WP_040753674.1|  NALHNANKQVVVTSDVAPKQLAGMEERLRSRLEWGLLTDVQPPDLETRIA
                                  *:***::**:*:*** .** * *: :*:***:****:*************

[Kytococcus                       ILRRKAAADKLDIPDDVLHLIASKISSNIRELEGALTRVTAFASLSGSPL
gi|478759289|emb|CCH68940.1|      ILRKKAIHERLSVPDDVMEFIASRISTNIRELEGALIRVTAFANLNRQPV
gi|497462772|ref|WP_009776970.1|  ILRKKAAQEKLSVPDDVLEFIASRISTNIRELEGALIRVTAFASLNRQPV
gi|497833122|ref|WP_010147278.1|  ILRKKAAQEGMQLPDEVLEHIASRITTNIRELEGALIRVTAFASLSSQPA
gi|503647989|ref|WP_013882065.1|  ILRKKAGGDNMQAPPDVLEYIASKISTNIRELEGALIRVTAFASLNRQQV
gi|551300082|ref|WP_022920049.1|  ILKKKAAQEGMQLPDEVLELIGSKISTNIRELEGALIRVTAFASLSSTPP
gi|656266264|ref|WP_029212190.1|  ILRKKAVQERMTAPDDVLEFIASKISTNIRELEGALIRVTAFASLNHQGV
gi|665501165|emb|CCH80159.1|      ILRKKAIQERLSVPDDVLEFIASRISTNIRELEGALIRVTAFASLNRQPV
gi|737975618|ref|WP_035938084.1|  ILRKKAAQEKLSVPDDVLEFIASRISTNIRELEGALIRVTAFASLNRQPV
gi|750471801|ref|WP_040753674.1|  ILRKKAIHERLSVPDDVMEFIASRISTNIRELEGALIRVTAFANLNRQPV
                                  **::**  : :  * :*:. *.*:*::********* ******.*.

[Kytococcus                       DEYLARTVLKDVMPGGDSGQITPTMILEETAGYFVISVEEIQGASRSRNL
gi|478759289|emb|CCH68940.1|      DLALAEIVLRDLIPTE-GGEITSATIMAQTAAYFGLTLEDLRGSSRSRVL
gi|497462772|ref|WP_009776970.1|  DISLAEIVLKDLIPHDSANQITSATIMAQTAAYFGLTLEDLQGQSRSRVL
gi|497833122|ref|WP_010147278.1|  DADLAAHVLKDIVPGSDTAQITVSTIIREVSEYFQISIDELCGTSRSRTL
gi|503647989|ref|WP_013882065.1|  DLSLAEIVLKDLITDDQTTEITATQVIGQTAAYFGLSIEDLCGSSRSRVL
gi|551300082|ref|WP_022920049.1|  DAALASHVLKDIIPNSESAAITVPTIMAEVADYFQISNDDLCGTSRSRTL
gi|656266264|ref|WP_029212190.1|  DMTLAEVVLKDLIPADQTNQITPATIMAQTASYFGLTVDDLCGTSRSRVL
gi|665501165|emb|CCH80159.1|      DMSLAEIVLKDLIPSDGANQITSATIMAQTATYFGLTVEDLQGASRSRVL
gi|737975618|ref|WP_035938084.1|  DISLAEIVLKDLIPHDSSSQITSATIMAQTAAYFGLTLEDLQGQSRSRVL
gi|750471801|ref|WP_040753674.1|  DLALAEIVLRDLIPTE-GGEITSATIMAQTAAYFGLTLEDLRGSSRSRVL
                                  *  **  **:*::.      ** . :: :.: ** :: ::: * **** *

[Kytococcus                       TRARQIAMYLCRELTDLSLPKIGKEFGGRDHTTVMHAERKIKQLLGEDRR
gi|478759289|emb|CCH68940.1|      VNARQIAMYLCRELTSMSLPEIGKEFN-KDHTTVMHANKKIGQLMAERRA
gi|497462772|ref|WP_009776970.1|  VTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADKKIRQLMAERRA
gi|497833122|ref|WP_010147278.1|  VNARQIAMYLCRELTDLSLPKIGQEFGGRDHTTVMHAERKIRAQIGERRA
gi|503647989|ref|WP_013882065.1|  VTARQIAMYLCRELTDLSLPKIGQAFGGRDHTTVMHANRKIRELMAERRS
gi|551300082|ref|WP_022920049.1|  VNARQIAMYLCRELTDLSLPKIGQEFGGRDHTTVMHAERKIRQLIGERRA
gi|656266264|ref|WP_029212190.1|  VTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADRKIRELMSERRA
gi|665501165|emb|CCH80159.1|      VTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADKKIRQLMAERRA
gi|737975618|ref|WP_035938084.1|  VTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADKKIRQLMAERRA
gi|750471801|ref|WP_040753674.1|  VNARQIAMYLCRELTSMSLPEIGKEFN-KDHTTVMHANKKIGQLMAERRA
                                  . *************.:***:**: *. :********::**   :.* *

[Kytococcus                       VYDEVSELTSIIRKKAARGR
gi|478759289|emb|CCH68940.1|      IYNNVTELTGRIKQQS---R
gi|497462772|ref|WP_009776970.1|  IYNQVTELTNRIKQQS---R
gi|497833122|ref|WP_010147278.1|  LYDQIAELTGTIRRAS--QR
gi|503647989|ref|WP_013882065.1|  IYNQVTELTNRIKQQS--RG
gi|551300082|ref|WP_022920049.1|  LYDQITELTGIIRKAS--AR
gi|656266264|ref|WP_029212190.1|  IYNQVTELTNRIKSNP---S
gi|665501165|emb|CCH80159.1|      IYNQVTELTNRIKQQS---R
gi|737975618|ref|WP_035938084.1|  IYNQVTELTNRIKQQS---R
gi|750471801|ref|WP_040753674.1|  IYNNVTELTGRIKQQS---R
                                  :*::::***. *:  .
WebLogo
go to WebLogo at http://weblogo.berkeley.edu
Sequence Logo
Comments/observations: The N-terminal end of this protein appears to be reasonably well conserved. Positions 128 to 164 of the alignment consist largely of gaps, showing that gaps were inserted in this region to bring the sequences into register. The C-terminal end of the protein is highly conserved and contains several regions rich in hydrophobic amino acids.
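The observations about conserved and gapped regions can be quantified column by column. The sketch below is illustrative only: `conserved_columns` and `gap_fractions` are hypothetical helper names, and the input is the final 20-column block of the alignment as recorded in this notebook. A column counts as fully conserved when every sequence carries the same residue, which corresponds to the `*` symbol in a Clustal-style conservation line.

```python
def conserved_columns(rows):
    """0-based indices of columns where all rows share one residue (no gaps)."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    return [
        i for i, column in enumerate(zip(*rows))
        if len(set(column)) == 1 and column[0] != "-"
    ]


def gap_fractions(rows):
    """Fraction of gap characters ('-') in each alignment column."""
    return [column.count("-") / len(column) for column in zip(*rows)]


# Final block of the alignment, one string per sequence.
last_block = [
    "VYDEVSELTSIIRKKAARGR",
    "IYNNVTELTGRIKQQS---R",
    "IYNQVTELTNRIKQQS---R",
    "LYDQIAELTGTIRRAS--QR",
    "IYNQVTELTNRIKQQS--RG",
    "LYDQITELTGIIRKAS--AR",
    "IYNQVTELTNRIKSNP---S",
    "IYNQVTELTNRIKQQS---R",
    "IYNQVTELTNRIKQQS---R",
    "IYNNVTELTGRIKQQS---R",
]

print(conserved_columns(last_block))
print(gap_fractions(last_block)[16])
```

Running the same functions over every block would give a numeric profile of where the gaps and the fully conserved positions fall.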
[-] Cellular Localization Data
Module Instructions
Gram Stain
go to NCBI Pubmed at http://www.ncbi.nlm.nih.gov/pubmed
Gram stain of the microbe: Gram positive
TMHMM
go to TMHMM at http://www.cbs.dtu.dk/services/TMHMM
Number of predicted transmembrane helices: 0
Transmembrane topology graph
Comments/observations: No transmembrane helices were predicted by TMHMM, suggesting that Ksed_00010 does not reside within the cell membrane.
SignalP
go to SignalP at http://www.cbs.dtu.dk/services/SignalP
Signal peptide probability: No (0.131)
Most likely cleavage site (between position # and #)
N/A - no signal peptide is predicted, thus there is no predicted cleavage site.
Signal peptide graph
LipoP
Go to LipoP at http://www.cbs.dtu.dk/services/LipoP/
Best Prediction (options = cytoplasm, signal peptidase I, signal peptidase II)
This was not done for Kytococcus, since it applies only to Gram-negative bacteria and Kytococcus is Gram-positive.
Cleavage site, if predicted, after AA #
PSORT-B
go to PSORT-B at http://www.psort.org/psortb
Cytoplasmic score: 9.97
CytoplasmicMembrane score: 0.00
Cellwall score: 0.01
Periplasmic score: N/A
OuterMembrane score: N/A
Extracellular score: 0.02
PSORT-B final prediction
Phobius
go to Phobius at http://phobius.sbc.su.se
Phobius probability graph
The Phobius tool predicts neither a signal peptide nor transmembrane helices in the amino acid sequence of Ksed_00010. The statement that amino acids 1-506 are "non cytoplasmic" in the Phobius probability graph is meaningless in predicting location.
Hypothesis
Based on the results of these tools, where do you expect to find this protein?
TMHMM indicates that there are no transmembrane helices, SignalP reveals a low probability for a signal peptide, and PSORT-B predicts Ksed_00010 to be a cytoplasmic protein. Phobius corroborates the negative results for TMHs and a signal peptide. Taking all of these results into account, the final prediction for localization is that Ksed_00010 is a cytoplasmic protein.
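The reasoning behind this hypothesis can be written out as a small decision rule. This is only a sketch of the logic, not how any of the tools work internally: the function name `predict_localization`, the order of the checks, and both cutoffs (0.5 for the signal peptide probability, 7.5 for the PSORT-B score) are assumptions chosen for illustration. The input values are the ones recorded in this notebook.

```python
# PSORT-B scores recorded above (N/A compartments omitted).
psortb_scores = {
    "Cytoplasmic": 9.97,
    "CytoplasmicMembrane": 0.00,
    "Cellwall": 0.01,
    "Extracellular": 0.02,
}


def predict_localization(tm_helices, signal_peptide_prob, scores,
                         sp_cutoff=0.5, score_cutoff=7.5):
    """Combine TMHMM, SignalP and PSORT-B style evidence into one call.

    Both cutoffs are illustrative assumptions, not tool defaults.
    """
    if tm_helices > 0:
        return "membrane"
    if signal_peptide_prob >= sp_cutoff:
        return "secreted or cell wall"
    site, score = max(scores.items(), key=lambda item: item[1])
    return site if score >= score_cutoff else "unknown"


print(predict_localization(0, 0.131, psortb_scores))
```

With zero predicted helices, a signal peptide probability of 0.131 and a cytoplasmic score of 9.97, the rule arrives at the same cytoplasmic call as the hypothesis above.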
[-] Alternative Open Reading Frame
Module Instructions
DNA Coordinates
login to IMG/EDU and find the gene page for this gene
Proposed DNA coordinates (if different from those predicted by IMG)..
Explanation of choice
There is no Shine-Dalgarno sequence upstream of the predicted start codon for Ksed_00010. No better alternative start codon was seen when 99 bases of upstream sequence were added, but a start codon with a Shine-Dalgarno sequence was found downstream of the start codon called by the computer. This was tested as a potential alternative start codon (see image below).
The sequence tested was as follows:
>Ksed_00010_amino acid seq - downstream start codon
MSQTPDDHATAIWQEAMVHLQGAGLAPRDIGVLRLATLVGLLEGTALLAVKYDHVKDAVEGHLREDVSTALAEVLDRDIRLAVSVDPDAVSAAQEEAAPPAPSPADEDDPATGEGPLSTAVDGAVEKHEGSSPARAGESVAPATTASLTATNSSPGVERDYSALNHKYTFDTFVLGSSNRFAHAAATAVAEAPARAYNPLFIYGGSGLGKTHLLHAIGHYARTLDSSVRVKYVNSEEFTNQFINAVSAGQANAFQRQYRDVDVLLIDDIQFLQGKEQTMEEFFHTFNTLHNSEKQIVITSDQPPKKLSGFAERMRSRFEWGLLTDVQPPDLETRIAILRRKAAADKLDIPDDVLHLIASKISSNIRELEGALTRVTAFASLSGSPLDEYLARTVLKDVMPGGDSGQITPTMILEETAGYFVISVEEIQGASRSRNLTRARQIAMYLCRELTDLSLPKIGKEFGGRDHTTVMHAERKIKQLLGEDRRVYDEVSELTSIIRKKAARGR
The BLAST scores for the sequence starting with the downstream start codon were lower than those observed when the sequence beginning with the computer-called start codon was used, in either the Swiss-Prot or nr database. Therefore, there is no evidence to suggest that the computer-called start codon is incorrect.
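The search described above, looking for in-frame start codons and checking whether a Shine-Dalgarno-like motif lies just upstream, can be sketched as follows. Everything here is an assumption made for illustration: the function name `candidate_starts`, the motif set (AGGAGG and two five-base cores), the 4 to 12 base spacer window, and the toy input sequence. It is not the procedure used by IMG or by the notebook author.

```python
import re

START_CODONS = ("ATG", "GTG", "TTG")
# Assumed Shine-Dalgarno-like motifs; real ribosome binding sites vary.
SD_MOTIF = re.compile(r"AGGAGG|GGAGG|AGGAG")


def candidate_starts(dna, frame=0, min_spacer=4, max_spacer=12):
    """List (position, codon, has_sd) for in-frame start codons.

    has_sd is True when an SD-like motif ends between min_spacer and
    max_spacer bases upstream of the codon (window bounds are assumptions).
    """
    dna = dna.upper()
    hits = []
    for pos in range(frame, len(dna) - 2, 3):
        codon = dna[pos:pos + 3]
        if codon not in START_CODONS:
            continue
        window = dna[max(0, pos - max_spacer - 6):max(0, pos - min_spacer)]
        hits.append((pos, codon, bool(SD_MOTIF.search(window))))
    return hits


# Toy sequence: a GTG start with nothing upstream, then an ATG preceded
# by AGGAGG and a 6-base spacer.
toy = "GTGAGCCAGAGGAGGTCCACCATGGTCCAC"
for hit in candidate_starts(toy):
    print(hit)
```

A motif match alone would not settle the question; as in the notebook, each candidate would still be compared by BLAST score against the computer-called start.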
[-] Structure-based Evidence
Module Instructions
TIGRFAM
go to TIGRFAM at http://tigrblast.tigr.org/web-hmm
TIGRFAM number
First Hit: TIGR00362
Second Hit: TIGR03420
TIGRFAM name
First Hit: DnaA: chromosomal replication initiator protein DnaA
DnaA is involved in DNA biosynthesis, specifically the initiation of chromosome replication, and can also act as a transcription regulator. The C-terminal of the family hits the pfam bacterial DnaA (bac_dnaA) domain family. For a review, see Kaguni (2006).
Second Hit: DnaA_homol_Hda: DnaA regulatory inactivator Hda
Members of this protein family are Hda (Homologous to DnaA). These proteins are about half the length of DnaA and homologous over the length of Hda. In the model species Escherichia coli, the initiation of DNA replication requires DnaA bound to ATP rather than ADP; Hda helps facilitate the conversion of DnaA-ATP to DnaA-ADP.
Score
First Hit: 740.9
Second Hit: 13.9
E-value
First Hit: 3.8e-220
Second Hit: 1.2e-13
Pfam
go to Pfam at http://pfam.sanger.ac.uk/search
Pfam number (PF#####) for top hit
PF00308
Pfam name
Bac_DnaA
The following was copied from the wiki link for PF00308. "DnaA is a protein that activates initiation of DNA replication in bacteria.[1] It is a replication initiation factor which promotes the unwinding of DNA at oriC.[1] The onset of the initiation phase of DNA replication is determined by the concentration of DnaA.[1] DnaA accumulates during growth and then triggers the initiation of replication.[1] Replication begins with active DnaA binding to 9-mer (9-bp) repeats upstream of oriC.[1] Binding of DnaA leads to strand separation at the 13-mer repeats.[1] This binding causes the DNA to loop in preparation for melting open by the helicase DnaB.[1]"
Clan name
P-loop_NTPase
P-loop containing nucleoside triphosphate hydrolase superfamily. AAA family proteins often perform chaperone-like functions that assist in the assembly, operation, or disassembly of protein complexes [2].
This clan contains 198 families and the total number of domains in the clan is 5072865. The clan was built by DJ Studholme.
Clan number (CL####)
CL0023
Score: 329.1
E-value: 1.3e-98
Pairwise alignment
HMM logo
Key functional/structural residues (e.g. I2, W7, F13)
There are a number of amino acids that are predominant in the model HMM, as indicated by the large, wide letters at various positions in the logo (L1, F7, F10, P36, etc.). Comparing the pairwise alignment to the HMM logo, one can see that L1, F7 and F10 (among many others) are exact matches between the Ksed_00010 amino acid sequence and the major amino acid found at the same position in the HMM logo. There appears to be significant variability among the other proteins used to create this model HMM, as indicated by the large number of smaller letters under the predominant amino acids at those positions.
Pfam number (PF#####) for second hit
PF08299
Pfam name
Bac_DnaA_C
The following was copied from the wiki link for PF08299.
"The active form DnaA is bound to ATP.[1] Immediately after a cell has divided, the level of active DnaA within the cell is low.[1] Although the active form of DnaA requires ATP, the formation of the oriC/DnaA complex and subsequent DNA unwinding does not require ATP hydrolysis.[2]
The oriC site in E. coli has three AT rich 13 base pair regions (DUEs) followed by four 9 bp regions with the sequence TTAT(C or A)CA(C or A)A. Around 10 DnaA molecules bind to the 9 bp regions, which wrap around the proteins causing the DNA at the AT-rich region to unwind. There are 8 DnaA binding sites within oriC, to which DnaA binds with differential affinity. When DNA replication is about to commence, DnaA occupies all of the high and low affinity binding sites. The denatured AT-rich region allows for the recruitment of DnaB (helicase), which complexes with DnaC (helicase loader). DnaC helps the helicase to bind to and to properly accommodate the ssDNA at the 13 bp region; this is accomplished by ATP hydrolysis, after which DnaC is released. Single-strand binding proteins (SSBs) stabilize the single DNA strands in order to maintain the replication bubble. DnaB is a 5'→3' helicase, so it travels on the lagging strand. It associates with DnaG (a primase) to form the only primer for the leading strand and to add primers on the lagging strand. The interaction between DnaG and DnaB is necessary to control the longitude of Okazaki fragments on the lagging strand. DNA polymerase III is then able to start DNA replication.
DnaA contains two conserved regions: the first is located in the central part of the protein and corresponds to the ATP-binding domain, the second is located in the C-terminal half and is involved in DNA-binding.[4]"
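The 9 bp DnaA box consensus quoted above, TTAT(C or A)CA(C or A)A, translates directly into a regular expression. The sketch below scans both strands of a DNA string for that pattern. The function names and the toy sequence are hypothetical, and the consensus is the E. coli one from the quoted text; whether it applies unchanged to the Kytococcus sedentarius origin is not established here.

```python
import re

# E. coli DnaA box consensus from the quoted text: TTAT(C or A)CA(C or A)A.
# The lookahead lets overlapping boxes be reported as separate hits.
DNAA_BOX = re.compile(r"(?=(TTAT[CA]CA[CA]A))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase or lowercase DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_dnaa_boxes(dna):
    """Return sorted (start, strand, box) hits; start is 0-based on the input."""
    dna = dna.upper()
    hits = [(m.start(), "+", m.group(1)) for m in DNAA_BOX.finditer(dna)]
    for m in DNAA_BOX.finditer(reverse_complement(dna)):
        # Convert the position on the reverse strand back to input coordinates.
        start = len(dna) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Toy sequence with one box on each strand.
toy_origin = "GGTTATCCACAGGCCTGTGGATAACC"
for hit in find_dnaa_boxes(toy_origin):
    print(hit)
```

Reporting the strand alongside the position matters because DnaA boxes occur in both orientations within an origin.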
Clan name
HTH
Helix-turn-helix clan. This family contains a diverse range of mostly DNA-binding domains that contain a helix-turn-helix motif.
This clan contains 217 families and the total number of domains in the clan is 4270183. The clan was built by A Bateman.
Clan number (CL####)
CL0123
Score
101.8
E-value
1.4e-29
Pairwise alignment
HMM logo
Key functional/structural residues (e.g. I2, W7, F13)
I4, V8, R24, R32, I50, G51, T61 and V62 are all exact matches between the HTH domain in Ksed_00010 and the model HMM.
PDB
go to PDB at http://www.rcsb.org/pdb/home/home.do
PDB code
2Z4R
PDB name
Crystal structure of domain III from the Thermotoga maritima replication initiation protein DnaA
Alignment length
346
E-value
9.20639E-59
Pairwise alignment
[-] Enzymatic Function
Module Instructions
KEGG
go to KEGG at http://www.genome.jp/kegg/pathway.html
KEGG pathway ID
ko02020 Two-component system
The following was obtained from a PubMed search:
Annu Rev Biochem. 2000;69:183-215.
Two-component signal transduction. Stock AM, Robinson VL, Goudreau PN.
Abstract
Most prokaryotic signal-transduction systems and a few eukaryotic pathways use phosphotransfer schemes involving two conserved components, a histidine protein kinase and a response regulator protein. The histidine protein kinase, which is regulated by environmental stimuli, autophosphorylates at a histidine residue, creating a high-energy phosphoryl group that is subsequently transferred to an aspartate residue in the response regulator protein. Phosphorylation induces a conformational change in the regulatory domain that results in activation of an associated domain that effects the response. The basic scheme is highly adaptable, and numerous variations have provided optimization within specific signaling systems. The domains of two-component proteins are modular and can be integrated into proteins and pathways in a variety of ways, but the core structures and activities are maintained. Thus detailed analyses of a relatively small number of representative proteins provide a foundation for understanding this large family of signaling proteins.
Pathway map
Below is a portion of the general two-component system map that shows the location of DnaA.
Below is the same section of the two-component system map, but with the genes called in Kytococcus highlighted in green. DnaA has been called by the computer. Clicking on the DnaA hyperlink confirms that DnaA in Kytococcus is, in fact, Ksed_00010.
MetaCyc
go to MetaCyc at http://metacyc.org
Pathway map
Not applicable: Ksed_00010 is not an enzyme and is therefore not found in MetaCyc.
E.C. Number
go to ExPASy ENZYME at http://www.expasy.ch/enzyme/enzyme-search-ec.html
EC Number
Not applicable: Ksed_00010 is not an enzyme and therefore does not have an E.C. number or name.
EC Name
N/A
[-] Duplication and Degradation
Module Instructions
DNA Coordinates
Paralog gene product name
No significant paralog hits were found.
Percent identity
Alignment length
E-value
Pairwise alignment
Pseudogene
Use the instructions provided by your professor
Is this a pseudogene?
By the three criteria used in the project manual, there is no evidence that Ksed_00010 is a pseudogene.
[-] Horizontal Gene Transfer
Module Instructions
Phylogenetic Tree
go to Phylogeny.fr at http://www.phylogeny.fr
Phylogenetic tree
Sequences used to construct the tree:
>Kytococcus sedentarius DSM 20547] MSQTPDDHATAIWQEAMVHLQGAGLAPRDIGVLRLATLVGLLEGTALLAVKYDHVKDAVEGHLREDVSTALAEVLDRDIR LAVSVDPDAVSAAQEEAAPPAPSPADEDDPATGEGPLSTAVDGAVEKHEGSSPARAGESVAPATTASLTATNSSPGVERD YSALNHKYTFDTFVLGSSNRFAHAAATAVAEAPARAYNPLFIYGGSGLGKTHLLHAIGHYARTLDSSVRVKYVNSEEFTN QFINAVSAGQANAFQRQYRDVDVLLIDDIQFLQGKEQTMEEFFHTFNTLHNSEKQIVITSDQPPKKLSGFAERMRSRFEW GLLTDVQPPDLETRIAILRRKAAADKLDIPDDVLHLIASKISSNIRELEGALTRVTAFASLSGSPLDEYLARTVLKDVMP GGDSGQITPTMILEETAGYFVISVEEIQGASRSRNLTRARQIAMYLCRELTDLSLPKIGKEFGGRDHTTVMHAERKIKQL LGEDRRVYDEVSELTSIIRKKAARGR >Ornithinimicrobium pekingense] MTSQSPAESAEVWQRVVSQLESQGVTARDRAFLRLTQLVGLLDTTALLAVPYQHTKETLETTLRQPIVDALAGELGHDVR LAITVDEDLRRQVEDEGDPAPGPAVTEQVPSDPDRTPYRSNGAGPGEPRSDGHRTPSGAVQTASAEDARLNPKYTFDTFV SGSSNRFAHAASLAVAESPARAYNPLFIYGESGLGKTHLLHAIGHYARSLYPGVRVRYVNSEEFTNDFINSIRDDKAGAF QRRYRNVDFLLVDDIQFLQGKEQTVEEFFHTFNTLHNSEKQVVITSDQPPKRLSGFAERMRSRFEWGLLTDVQPPDLETR IAILKKKAAQEGMQLPDEVLELIGSKISTNIRELEGALIRVTAFASLSSTPPDAALASHVLKDIIPNSESAAITVPTIMA EVADYFQISNDDLCGTSRSRTLVNARQIAMYLCRELTDLSLPKIGQEFGGRDHTTVMHAERKIRQLIGERRALYDQITEL TGIIRKASAR >Serinicoccus profundi] MSQPSTDSGDTWRRVVSELEDKGLGAREKAFLRLTTMVGVLDSTVLLAVPYPHTKEMLETTLRQPIVDLLSRELDREVRL AITVDDDVRQRVEDEADDEADEDAQTRESLTRPASQPSSSAGAGVPGPSGNGIPRPATPAGPAVTGAADEARLNPKYSFD TFVSGPSNRFAHAASLAVAESPARAYNPLFIYGESGLGKTHLLHAIGHYARKLYPGVRVRYVNSEEFTNDFINSIRDDKA GAFQRRYRNVDFLLVDDIQFLQGKEQTVEEFFHTFNTLHNSEKQVVITSDQPPKRLSGFAERMRSRFEWGLLTDVQPPDL ETRIAILRKKAAQEGMQLPDEVLEHIASRITTNIRELEGALIRVTAFASLSSQPADADLAAHVLKDIVPGSDTAQITVST IIREVSEYFQISIDELCGTSRSRTLVNARQIAMYLCRELTDLSLPKIGQEFGGRDHTTVMHAERKIRAQIGERRALYDQI AELTGTIRRASQR >Knoellia aerolata] >gi|700180054|gb|KGN40755.1| chromosomal replication initiation protein [Knoellia aerolata DSM 18566] MDQIWRTTLDALDSDGIPVQQRAFLSLARLVGLLDETALIAVPNDFTKDIVETRLRDRVTETLSSQLGHTVRLAVTVDSS LGDVPVLDPPADAPSGSTTTEPRPAAGTEGDGRHAERRAELDGIALVEDDDDGDSSRTGRSVAHTRSPGALRPRPGVTVP EQVELTRLNPKYTFDTFVIGASNRFANAAALAVAETPAKAYNPLFIYGESGLGKTHLLHAIGHYARNLFPHVKVRYVNSE 
EFTNDFINSIRDDKAANFQRRYRDVDVLLIDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDLPPKLLSGFEERMRS
RFEWGLMTDVQPPDLETRIAILRKKAAQEKLSVPDDVLEFIASRISTNIRELEGALIRVTAFASLNRQPVDISLAEIVLK DLIPHDSSSQITSATIMAQTAAYFGLTLEDLQGQSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADKK IRQLMAERRAIYNQVTELTNRIKQQSR >Luteipulveratus sp. C296001] MSEDKPDLAHVWHSTVMALEETGIAARDRAFLRLTRLVGLVDQTALLAVPFDHTKDILETSLREPVSSALSRLLDRDVRL AVTVDPQLQQTVSASAEAKDAGIGEDEVAETDAEEAGTPAPALIRPVRASNRPPVEPVTDARLNPKYTFDTFVIGASNRF AHAAAFAVAEAPAKSYNPLFVYGDSGLGKTHLLHAIGHYVRNLYPSMRVRYVNSEEFTNDFINSIRDDKASSFQKRYRDD VDVLLIDDVQFLQGKDGTQEEFFHTFNALHDSEKQIVLTSDQPPKKLSGFADRMRSRFEWGLQTDVTPPDLETRIAILRK KAIAERMNVPDDVLELIASKFSTNIRELEGALIRVMAFASLSQQPVDQQVAGYVLKDLVPSAGSSQITATLIMTKTAEYF HVSVEELCGSSRSRTLVTARQIAMYLCRELTELSLPKIGQQFGGRDHTTVMHADRKIRQLIGERRPIYDQITELTGQITR AAAG >Janibacter sp. HTCC2649] MDQIWRTTLDALDSDGIPVQQRAFLSLAKLVGLLDETALIAVPNDFTKDIVETRLRDRVTETLSSQLGHDVRLAVTVDHS LADVPVTIPADTTTVDGAGADQVPRTATTIGLEPGPADADGRRAKRRAELDGIALVEDDEGEDDSRNNGAIGRTRSPGAL RPRPGATVPEQVELTRLNPKYTFDTFVIGASNRFANAAALAVAETPAKAYNPLFIYGESGLGKTHLLHAIGHYARNLYPH VKVRYVNSEEFTNDFINSIRDDKAANFQRRYRDVDVLLIDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDLPPKLL SGFEERMRSRFEWGLMTDVQPPDLETRIAILRKKAAQEKLSVPDDVLEFIASRISTNIRELEGALIRVTAFASLNRQPVD ISLAEIVLKDLIPHDSANQITSATIMAQTAAYFGLTLEDLQGQSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDH TTVMHADKKIRQLMAERRAIYNQVTELTNRIKQQSR >Tetrasphaera japonica T1-X7] MDYSRVWRTALDELDADGLPIQQRAFLSLARFVGLLDDTALIAVPNDYTKEFVETRVRQQLTQTLSNHAGRELRLAVTVD PGLKAADEAEPTLGAPAEAWGPRPRAGIGPVDDEEDHEAQDVIDARGVDTFHIQRPTPGQRPVPELIENTRLNPKYTFDT FVIGASNRFAHAAAVAVAEAPAKAYNPLFVYGDSGLGKTHLLHAIGHYARNLYPTVRVRYVNSEEFTNDFINSIRDDKAA NFQSRYRDVDVLLIDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDVPPKLLSGFEERMRSRFEWGLLTDVQPPDLE TRIAILRKKAIQERLSVPDDVLEFIASRISTNIRELEGALIRVTAFASLNRQPVDMSLAEIVLKDLIPSDGANQITSATI MAQTATYFGLTVEDLQGASRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADKKIRQLMAERRAIYNQVT ELTNRIKQQSR >Arsenicicoccus bolidensis] MTDAQVDVPRVWRDTLRALESGGISAQHRGFLRLSRLVGLLEGTALIAVPNDYTRDIVEKRIRTELVAALQEQLGRDVRL AVTVDSSLELSEAEDRDDSTQRPGSPGEVPHVVTSSDPVRHDGERGGSDPRSTYGPRLVRDERVPRPVSTDASFGGERPY 
ADPARHDGGPGHAPGIRPAQDDDADDDRELLAEGDGIREMFRKPFVPEDGRDAKLNPKYTFDTFVIGSSNRFAHAAAIAV AEAPAKAYNPLFIYGGSGLGKTHLLHAIGHYAQQIYPNVRVKYVNSEEFTNDFINSIGANKASDFQRRYRDIDFLLIDDI QFLQGKVQTQEEFFHTFNTLHNANKQVVITSDVAPKLLSGFEERMRSRFEWGLLTDVQPPDLETRIAILRKKAVQERMTA PDDVLEFIASKISTNIRELEGALIRVTAFASLNHQGVDMTLAEVVLKDLIPADQTNQITPATIMAQTASYFGLTVDDLCG TSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHADRKIRELMSERRAIYNQVTELTNRIKSNPS >Mobilicoccus sp. SIT2] MSGSGEPPTTPAVDDPARIWGATLRALDQAGIPAPQRAFLRQAMLVGVLDTTALIAVPDDFTKEIVESRARDYLVKALTE QVGREVRLAVTVDASLREQIAAAEQPVLEGYADPDSDVAAAEPPAASPTAAAPASPPVEDSPAPVGERRPRPAPTGRREE SSQLNPKYTFDTFVIGASNRFAHAAAVAVAEAPAKAYNPLFVYGDSGLGKTHLLHAIGHYARNLYPHVKVKYVNSEEFTN DFINSIRDDKAAGFQRRYRDMDVLLIDDIQFLQGKMQTQEEFFHTFNTLHNSNKQVVITSDVPPKQLSGFEERMRSRFEW GLLTDVQPPDLETRIAILRKKAVQEHLALPDEVMEFIASRISTNIRELEGALIRVTAFANLNRQPVEMSLAEIVLKDLIP DKESSQITASMIMGQTAAYFGLSIDDLCGSSRSRGLVTARQIAMYLCRELTELSLPKIGQQFGGRDHTTVMHADRKIRQL MAERRSVYNQVTELTTRIRSQAS >Cellvibrio] gilvus ATCC 13127] MAQDEELSRVWGHVVTTLEESPDITQRQLAFVRLAQPLGLLDGTIILAVGNEYTKEYLETKVRAEVTSALGSALGRDGRF AITVDPSLVDDAPPAVRAMTSAPELGVVTDGTDERGAPNRTVPTDADTGRHERSPMLSESAEPTRPVRETASSRRPAAEP ARLNPHYLFETFVIGSSNRFAHAAAVAVAEAPAKAYNPLFIYGDSGLGKTHLLHAIGHYAQNLYPSVRVRYVNSEEFTND FINSISEGKAGAFQRRYREVDVLLIDDIQFLQGKEQTMEEFFHTFNTLHNANKQVVITSDLPPKQLNGFEDRMRSRFEWG LITDVQPPDLETRIAILRKKAGGDNMQAPPDVLEYIASKISTNIRELEGALIRVTAFASLNRQQVDLSLAEIVLKDLITD DQTTEITATQVIGQTAAYFGLSIEDLCGSSRSRVLVTARQIAMYLCRELTDLSLPKIGQAFGGRDHTTVMHANRKIRELM AERRSIYNQVTELTNRIKQQSRG >Promicromonosporaceae bacterium W15] MAEVWSAARIQLENDPDVTPRQRGYVRLVAPLAHIDDTVFLKVSDEPIRSFIETNLRADLVGALAGVLGYEPKLAISVDP DLQIADDDTDPAAPVYARPAPTTATSPAAQRMPTDEPDHGLSNPVGFGATVGLPEAMPAAPGSARAAQHAPRHAEPTAPL GENSRLNPKYLFETFVIGASNRFAHAAAVAVAEAPAKAYNPLFIYGDSGLGKTHLLHAIGHYAQSLYPNVRVRYVNSEEF TNDFINSIGEGKAGAFQRRYRDVDVLLIDDIQFLQGKEQTMEEFFHTFNALHNANKQVVITSDLPPKQLNGFEDRLRSRF EWGLITDVQPPDLETRIAILRKKALQERLDAPDDVLEYIASRISTNIRELEGALIRVTAFANLNRQPVDQSLAEIVLKDL ITDDDAGEITAASVIAQTAAYFGLTIDDLCGSSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMHANRKIS 
EQMAERRSIYNQVTELTSRIKQQHRG >Nocardioides sp. Iso805N] METPPDNGQHDELGQAWQSVVAELQPHQRAWLTACQPEALHGTTVLVGVPNDFTRNQLEGRLRAELEDALSTRFGQEMRI AAIVKPELEDRPLEATTPAPAISDAPVAPIKPHIDLSTNRFGGSSHNDTPDSGGAAPNFQLPPLRHEAPETERTTSESRL NPKYTFETFVIGSSNRFPHAAAVAVAEAPGRAYNPLLVYGESGLGKTHLLHAIGHYVRSLYNGAKVRYVSSEEFTNEFIN AIRDDRQDRFKRKYRDIDVLLIDDIQFLEGKTQTQEEFFHTFNTLHNANKQIVLTSDRAPKRLEALEDRLRNRFEWGLIT DVQPPDVETRIAILRKKAAMERLKAPADVLEFIATKIQTNIRELEGALIRVTAFANLNRQEVDMTLAEIVLKDLIPEGGE
PEITAGLIIAQTAAYFGLSIDELTGPSRGRHLVMARQIAMYLCRELTDLSLPKIGQQFGGRDHTTVMYAERKINQLLAER RSVFNQVSELTNRVKMQARQA >Arthrobacter globiformis NBRC 12137] MTVDEANHANTVGSSWRRVVTLLEQDDRVSPRQRGFVILAQAQGLIGSTLLVAVPNELTREVLQTQVKDALDDALRNVFS DDIRCAIDVDTDLVPIHEEPEPAVELSLANDPSIEQKPQPMLPSTSHEFGRLNPKYVFDTFVIGSSNRFAHAAAVAVAEA PAKAYNPLFIYGDSGLGKTHLLHAIGHYARRLYSGIRVRYVNSEEFTNDFINSIRDDEGASFKTTYRNVDVLLIDDIQFL AGKDRTLEEFFHTFNSLHNNNKQVVITSDQPPKLLAGFEDRMKSRFEWGLLTDIQPPELETRIAILRKKALSEGLSAPDD ALEYIASKIASNIRELEGALIRVTAFASLNRQPVDVALAEMVLKDLITDDGAQEITSGQILQQTADYFKLSMEELCSKSR TRTLVTARQIAMYLCRELTDMSLPKIGQELGGRDHTTVIHADRKIRELMAERRVIYNQVTELTNRIKQQQRDS
The T-Coffee Alignment:
CLUSTAL W (1.83) multiple sequence alignment
Arsenicicoccus MTDA--------QVDVPRVWRDTLRALES-GGISAQHRGFLRLSRLVGLLArthrobacter MTVDE---AN-HANTVGSSWRRVVTLLEQDDRVSPRQRGFVILAQAQGLICellvibrio] MAQD---------EELSRVWGHVVTTLEESPDITQRQLAFVRLAQPLGLLJanibacter M---------------DQIWRTTLDALDS-DGIPVQQRAFLSLAKLVGLLKnoellia M---------------DQIWRTTLDALDS-DGIPVQQRAFLSLARLVGLLKytococcus MSQTP-------DDHATAIWQEAMVHLQG-AGLAPRDIGVLRLATLVGLLLuteipulveratus MSED--------KPDLAHVWHSTVMALEE-TGIAARDRAFLRLTRLVGLVMobilicoccus MSGSGEPPTTPAVDDPARIWGATLRALDQ-AGIPAPQRAFLRQAMLVGVLNocardioides METPP---DNGQHDELGQAWQSVVAELQ------PHQRAWLTACQPEALHOrnithinimicrobium MTSQS-------PAESAEVWQRVVSQLES-QGVTARDRAFLRLTQLVGLLPromicromonosporaceae M---------------AEVWSAARIQLENDPDVTPRQRGYVRLVAPLAHISerinicoccus MSQP--------STDSGDTWRRVVSELED-KGLGAREKAFLRLTTMVGVLTetrasphaera M-------------DYSRVWRTALDELDA-DGLPIQQRAFLSLARFVGLL * * . *: . . : .
Arsenicicoccus EGTALIAVPNDYTRDIVEKRIRTELVAALQEQLGRDVRLAVTVDSSLELSArthrobacter GSTLLVAVPNELTREVLQTQVKDALDDALRNVFSDDIRCAIDVDTDLVPICellvibrio] DGTIILAVGNEYTKEYLETKVRAEVTSALGSALGRDGRFAITVDPSLVDDJanibacter DETALIAVPNDFTKDIVETRLRDRVTETLSSQLGHDVRLAVTVDHSLADVKnoellia DETALIAVPNDFTKDIVETRLRDRVTETLSSQLGHTVRLAVTVDSSLGDVKytococcus EGTALLAVKYDHVKDAVEGHLREDVSTALAEVLDRDIRLAVSVDPDAVSALuteipulveratus DQTALLAVPFDHTKDILETSLREPVSSALSRLLDRDVRLAVTVDPQLQQTMobilicoccus DTTALIAVPDDFTKEIVESRARDYLVKALTEQVGREVRLAVTVDASLREQNocardioides GTTVLVGVPNDFTRNQLEGRLRAELEDALSTRFGQEMRIAAIVKPELEDROrnithinimicrobium DTTALLAVPYQHTKETLETTLRQPIVDALAGELGHDVRLAITVDEDLRRQPromicromonosporaceae DDTVFLKVSDEPIRSFIETNLRADLVGALAGVLGYEPKLAISVDPDLQIASerinicoccus DSTVLLAVPYPHTKEMLETTLRQPIVDLLSRELDREVRLAITVDDDVRQRTetrasphaera DDTALIAVPNDYTKEFVETRVRQQLTQTLSNHAGRELRLAVTVDPGLKAA * :: * :. :: : : * . : * *.
Arsenicicoccus EAEDR-DDSTQRPG-SPGEVPHVVT--SSDPVRHDGERGGSDPRSTYGPRArthrobacter HEE-P-EPAVE-----------LSL--ANDPS------------------Cellvibrio] ---APPA-VRAM---TS--APELGV--VTDGTDERG--------------Janibacter PVTI-PADTTTVDGAGADQVPRTATTIGLEPGPADA--------------Knoellia PVLDPPADAPS-----------GST--TTEPRPAAG----TE--------Kytococcus AQEEAAPPAPS-----------PAD--EDDPAT--G----E---------Luteipulveratus VSASAE--AKD-----------AGI--GEDEVA-----------------Mobilicoccus IAAAE-QPVL--EG-YADPDSDVAA--AEPPAASP---------------Nocardioides PLEAT-TPAPA-----------IS----DAPV------------------Ornithinimicrobium VEDEGDP-APG-----------PAV--TEQVPS-----------------Promicromonosporaceae DDDTDPA-APV----YA--RPAPTT--ATSPAA-----------------Serinicoccus VEDEADDEADE-----------DAQ--TRESLT-----------------Tetrasphaera DEAEPTLGAPA-----------EA----WGPRPRAG----I---------
Arsenicicoccus LVRDERVPRPVSTDASFGGERPYADPARHDGGPGHAPGIRPAQDDDADDDArthrobacter --------------------------------------------------Cellvibrio] --------------AP-NRTVP-------------------T-DA-DT--Janibacter -------------DGRRAKRRA------------ELDGIALVEDDEGEDDKnoellia ------------GDGRHAERRA------------ELDGIALVEDDDDGDSKytococcus -----------------GPLST---A---------VDG--AVEKHEGSSPLuteipulveratus -----------------ET-------------------------------Mobilicoccus --------------TAAAPASP---P---------------VED------Nocardioides -----------------APIKP---H---------IDL-S-TNRFGGSSHOrnithinimicrobium -----------------DPD---------------------------RTPPromicromonosporaceae ------------------QRMP-------------------T-DEPDHGLSerinicoccus -----------------RPA---------------------------SQPTetrasphaera ---------------------------------------GPVDDEEDHEA
Arsenicicoccus RE-----LLA-------------EGDGIR-EM-FRKPFVPEDGRDAKLNPArthrobacter --------------------------IEQKPQ-P--MLPSTSHEFGRLNPCellvibrio] -----------GRHERSPMLSESAEPTRPVR-ETA-SSRRPAAEPARLNPJanibacter SR---N-NGA---------IGRTRSPGALRPR-PG-ATVPEQVELTRLNPKnoellia SR---T-GRS---------VAHTRSPGALRPR-PG-VTVPEQVELTRLNPKytococcus AR---AGES-----------VAPATT---ASLTATNSSPGVERDYSALNHLuteipulveratus -D---AEEAG---------TPAP---ALIRPVRASNRPPVEPVTDARLNPMobilicoccus ---------------------SPAPVGERRPR-P--APTGRREESSQLNPNocardioides ND-----TPD---------SGGAAPNFQLPPLRHEAPETERTTSESRLNPOrnithinimicrobium YR---SNGAG---------PGEPRSDGHR---TPSGAVQTASAEDARLNPPromicromonosporaceae SNPVGF-GATVGLPEAMPAAPGSARAAQHAP-RHA-EPTAPLGENSRLNPSerinicoccus SS---SAGAG---------VPGPSGNGIPRPATPAGPAVTGAADEARLNP
Tetrasphaera QD-----VID---------ARGVDTFHIQRPT-PGQRPVPELIENTRLNP **
Arsenicicoccus KYTFDTFVIGSSNRFAHAAAIAVAEAPAKAYNPLFIYGGSGLGKTHLLHAArthrobacter KYVFDTFVIGSSNRFAHAAAVAVAEAPAKAYNPLFIYGDSGLGKTHLLHACellvibrio] HYLFETFVIGSSNRFAHAAAVAVAEAPAKAYNPLFIYGDSGLGKTHLLHAJanibacter KYTFDTFVIGASNRFANAAALAVAETPAKAYNPLFIYGESGLGKTHLLHAKnoellia KYTFDTFVIGASNRFANAAALAVAETPAKAYNPLFIYGESGLGKTHLLHAKytococcus KYTFDTFVLGSSNRFAHAAATAVAEAPARAYNPLFIYGGSGLGKTHLLHALuteipulveratus KYTFDTFVIGASNRFAHAAAFAVAEAPAKSYNPLFVYGDSGLGKTHLLHAMobilicoccus KYTFDTFVIGASNRFAHAAAVAVAEAPAKAYNPLFVYGDSGLGKTHLLHANocardioides KYTFETFVIGSSNRFPHAAAVAVAEAPGRAYNPLLVYGESGLGKTHLLHAOrnithinimicrobium KYTFDTFVSGSSNRFAHAASLAVAESPARAYNPLFIYGESGLGKTHLLHAPromicromonosporaceae KYLFETFVIGASNRFAHAAAVAVAEAPAKAYNPLFIYGDSGLGKTHLLHASerinicoccus KYSFDTFVSGPSNRFAHAASLAVAESPARAYNPLFIYGESGLGKTHLLHATetrasphaera KYTFDTFVIGASNRFAHAAAVAVAEAPAKAYNPLFVYGDSGLGKTHLLHA :* *:*** *.****.:**: ****:*.::****::** ***********
Arsenicicoccus IGHYAQQIYPNVRVKYVNSEEFTNDFINSIGANKASDFQRRYRD-IDFLLArthrobacter IGHYARRLYSGIRVRYVNSEEFTNDFINSIRDDEGASFKTTYRN-VDVLLCellvibrio] IGHYAQNLYPSVRVRYVNSEEFTNDFINSISEGKAGAFQRRYRE-VDVLLJanibacter IGHYARNLYPHVKVRYVNSEEFTNDFINSIRDDKAANFQRRYRD-VDVLLKnoellia IGHYARNLFPHVKVRYVNSEEFTNDFINSIRDDKAANFQRRYRD-VDVLLKytococcus IGHYARTLDSSVRVKYVNSEEFTNQFINAVSAGQANAFQRQYRD-VDVLLLuteipulveratus IGHYVRNLYPSMRVRYVNSEEFTNDFINSIRDDKASSFQKRYRDDVDVLLMobilicoccus IGHYARNLYPHVKVKYVNSEEFTNDFINSIRDDKAAGFQRRYRD-MDVLLNocardioides IGHYVRSLYNGAKVRYVSSEEFTNEFINAIRDDRQDRFKRKYRD-IDVLLOrnithinimicrobium IGHYARSLYPGVRVRYVNSEEFTNDFINSIRDDKAGAFQRRYRN-VDFLLPromicromonosporaceae IGHYAQSLYPNVRVRYVNSEEFTNDFINSIGEGKAGAFQRRYRD-VDVLLSerinicoccus IGHYARKLYPGVRVRYVNSEEFTNDFINSIRDDKAGAFQRRYRN-VDFLLTetrasphaera IGHYARNLYPTVRVRYVNSEEFTNDFINSIRDDKAANFQSRYRD-VDVLL ****.: : :*:**.******:***:: .. *: **: :*.**
Arsenicicoccus IDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDVAPKLLSGFEERMRArthrobacter IDDIQFLAGKDRTLEEFFHTFNSLHNNNKQVVITSDQPPKLLAGFEDRMKCellvibrio] IDDIQFLQGKEQTMEEFFHTFNTLHNANKQVVITSDLPPKQLNGFEDRMRJanibacter IDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDLPPKLLSGFEERMRKnoellia IDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDLPPKLLSGFEERMRKytococcus IDDIQFLQGKEQTMEEFFHTFNTLHNSEKQIVITSDQPPKKLSGFAERMRLuteipulveratus IDDVQFLQGKDGTQEEFFHTFNALHDSEKQIVLTSDQPPKKLSGFADRMRMobilicoccus IDDIQFLQGKMQTQEEFFHTFNTLHNSNKQVVITSDVPPKQLSGFEERMRNocardioides IDDIQFLEGKTQTQEEFFHTFNTLHNANKQIVLTSDRAPKRLEALEDRLROrnithinimicrobium VDDIQFLQGKEQTVEEFFHTFNTLHNSEKQVVITSDQPPKRLSGFAERMRPromicromonosporaceae IDDIQFLQGKEQTMEEFFHTFNALHNANKQVVITSDLPPKQLNGFEDRLRSerinicoccus VDDIQFLQGKEQTVEEFFHTFNTLHNSEKQVVITSDQPPKRLSGFAERMRTetrasphaera IDDIQFLQGKVQTQEEFFHTFNTLHNANKQVVITSDVPPKLLSGFEERMR :**:*** ** * ********:**: :**:*:*** .** * .: :*::
Arsenicicoccus SRFEWGLLTDVQPPDLETRIAILRKKAVQERMTAPDDVLEFIASKISTNIArthrobacter SRFEWGLLTDIQPPELETRIAILRKKALSEGLSAPDDALEYIASKIASNICellvibrio] SRFEWGLITDVQPPDLETRIAILRKKAGGDNMQAPPDVLEYIASKISTNIJanibacter SRFEWGLMTDVQPPDLETRIAILRKKAAQEKLSVPDDVLEFIASRISTNIKnoellia SRFEWGLMTDVQPPDLETRIAILRKKAAQEKLSVPDDVLEFIASRISTNIKytococcus SRFEWGLLTDVQPPDLETRIAILRRKAAADKLDIPDDVLHLIASKISSNILuteipulveratus SRFEWGLQTDVTPPDLETRIAILRKKAIAERMNVPDDVLELIASKFSTNIMobilicoccus SRFEWGLLTDVQPPDLETRIAILRKKAVQEHLALPDEVMEFIASRISTNINocardioides NRFEWGLITDVQPPDVETRIAILRKKAAMERLKAPADVLEFIATKIQTNIOrnithinimicrobium SRFEWGLLTDVQPPDLETRIAILKKKAAQEGMQLPDEVLELIGSKISTNIPromicromonosporaceae SRFEWGLITDVQPPDLETRIAILRKKALQERLDAPDDVLEYIASRISTNISerinicoccus SRFEWGLLTDVQPPDLETRIAILRKKAAQEGMQLPDEVLEHIASRITTNITetrasphaera SRFEWGLLTDVQPPDLETRIAILRKKAIQERLSVPDDVLEFIASRISTNI .****** **: **::*******::** : : * :.:. *.::: :**
Arsenicicoccus RELEGALIRVTAFASLNHQGVDMTLAEVVLKDLIPADQTNQITPATIMAQArthrobacter RELEGALIRVTAFASLNRQPVDVALAEMVLKDLITDDGAQEITSGQILQQCellvibrio] RELEGALIRVTAFASLNRQQVDLSLAEIVLKDLITDDQTTEITATQVIGQJanibacter RELEGALIRVTAFASLNRQPVDISLAEIVLKDLIPHDSANQITSATIMAQKnoellia RELEGALIRVTAFASLNRQPVDISLAEIVLKDLIPHDSSSQITSATIMAQKytococcus RELEGALTRVTAFASLSGSPLDEYLARTVLKDVMPGGDSGQITPTMILEELuteipulveratus RELEGALIRVMAFASLSQQPVDQQVAGYVLKDLVPSAGSSQITATLIMTKMobilicoccus RELEGALIRVTAFANLNRQPVEMSLAEIVLKDLIPDKESSQITASMIMGQNocardioides RELEGALIRVTAFANLNRQEVDMTLAEIVLKDLIPEGGEPEITAGLIIAQOrnithinimicrobium RELEGALIRVTAFASLSSTPPDAALASHVLKDIIPNSESAAITVPTIMAEPromicromonosporaceae RELEGALIRVTAFANLNRQPVDQSLAEIVLKDLITDDDAGEITAASVIAQSerinicoccus RELEGALIRVTAFASLSSQPADADLAAHVLKDIVPGSDTAQITVSTIIRETetrasphaera RELEGALIRVTAFASLNRQPVDMSLAEIVLKDLIPSDGANQITSATIMAQ ******* ** ***.*. : :* ****::. ** :: :
Arsenicicoccus TASYFGLTVDDLCGTSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRArthrobacter TADYFKLSMEELCSKSRTRTLVTARQIAMYLCRELTDMSLPKIGQELGGRCellvibrio] TAAYFGLSIEDLCGSSRSRVLVTARQIAMYLCRELTDLSLPKIGQAFGGRJanibacter TAAYFGLTLEDLQGQSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRKnoellia TAAYFGLTLEDLQGQSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRKytococcus TAGYFVISVEEIQGASRSRNLTRARQIAMYLCRELTDLSLPKIGKEFGGRLuteipulveratus TAEYFHVSVEELCGSSRSRTLVTARQIAMYLCRELTELSLPKIGQQFGGRMobilicoccus TAAYFGLSIDDLCGSSRSRGLVTARQIAMYLCRELTELSLPKIGQQFGGRNocardioides TAAYFGLSIDELTGPSRGRHLVMARQIAMYLCRELTDLSLPKIGQQFGGROrnithinimicrobium VADYFQISNDDLCGTSRSRTLVNARQIAMYLCRELTDLSLPKIGQEFGGRPromicromonosporaceae TAAYFGLTIDDLCGSSRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGRSerinicoccus VSEYFQISIDELCGTSRSRTLVNARQIAMYLCRELTDLSLPKIGQEFGGRTetrasphaera TATYFGLTVEDLQGASRSRVLVTARQIAMYLCRELTDLSLPKIGQQFGGR .: ** :: ::: . ** * *. *************::******: :***
Arsenicicoccus DHTTVMHADRKIRELMSERRAIYNQVTELTNRIKSNP---S
Arthrobacter DHTTVIHADRKIRELMAERRVIYNQVTELTNRIKQQQ-RDSCellvibrio] DHTTVMHANRKIRELMAERRSIYNQVTELTNRIKQQS--RGJanibacter DHTTVMHADKKIRQLMAERRAIYNQVTELTNRIKQQS---RKnoellia DHTTVMHADKKIRQLMAERRAIYNQVTELTNRIKQQS---RKytococcus DHTTVMHAERKIKQLLGEDRRVYDEVSELTSIIRKKAARGRLuteipulveratus DHTTVMHADRKIRQLIGERRPIYDQITELTGQITRAA--AGMobilicoccus DHTTVMHADRKIRQLMAERRSVYNQVTELTTRIRSQA---SNocardioides DHTTVMYAERKINQLLAERRSVFNQVSELTNRVKMQA-RQAOrnithinimicrobium DHTTVMHAERKIRQLIGERRALYDQITELTGIIRKAS--ARPromicromonosporaceae DHTTVMHANRKISEQMAERRSIYNQVTELTSRIKQQH--RGSerinicoccus DHTTVMHAERKIRAQIGERRALYDQIAELTGTIRRAS--QRTetrasphaera DHTTVMHADKKIRQLMAERRAIYNQVTELTNRIKQQS---R *****::*::** :.* * ::::::*** :
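As a rough check on how similar two rows of the alignment are, percent identity can be counted column by column. The sketch below makes several assumptions: the function name percent_identity is hypothetical, columns with a gap in either row are skipped (one of several conventions in use), and the two example rows are the Kytococcus and Knoellia rows copied from the block of the alignment above that begins KYTFDTF.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# One 50-column block of the alignment above (Kytococcus and Knoellia rows).
kytococcus = "KYTFDTFVLGSSNRFAHAAATAVAEAPARAYNPLFIYGGSGLGKTHLLHA"
knoellia = "KYTFDTFVIGASNRFANAAALAVAETPAKAYNPLFIYGESGLGKTHLLHA"
print(f"{percent_identity(kytococcus, knoellia):.1f}% identity over this block")
```

A single conserved block overstates similarity across the whole protein; running the same function over complete aligned rows gives a fairer figure.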
The Cladogram from T-COFFEE:
The Tree File in Newick Format from PhyML at Phylogeny.fr:
((((((Arsenicicoccus:0.363182,(Tetrasphaera:0.164269,(Knoellia:0.031567,Janibacter:0.053012)1.000000:0.157616)0.944000:0.077073)0.903000:0.074505,Mobilicoccus:0.285031)0.880000:0.062980,(Nocardioides:0.686282,((Promicromonosporaceae:0.278553,Cellvibrio:0.267551)0.969000:0.118671,Arthrobacter:0.550811)0.522000:0.056440)0.926000:0.080531)0.998000:0.210238,Luteipulveratus:0.307826)0.864000:0.084907,Kytococcus:0.618026)0.995000:0.205025,Serinicoccus:0.196835,Ornithinimicrobium:0.129401);
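The Newick string above stores each leaf as name:branch_length and each internal node as support:branch_length after a closing parenthesis. The sketch below pulls out only the leaf branch lengths with a regular expression. It is a simplification for illustration, not a full Newick parser: the function name leaf_branch_lengths is hypothetical, labels are assumed to be unquoted, and branch lengths are assumed to be plain decimals as in this tree. The example input is an excerpt of the tree above, closed with a semicolon.

```python
import re

# Excerpt of the tree above: the Tetrasphaera/Knoellia/Janibacter subtree.
subtree = "(Tetrasphaera:0.164269,(Knoellia:0.031567,Janibacter:0.053012)1.000000:0.157616);"

# A leaf label follows '(' or ','. Support values follow ')' and are therefore skipped.
LEAF = re.compile(r"[(,]\s*([^():,;\s]+):([0-9.]+)")


def leaf_branch_lengths(newick):
    """Map each leaf label to the length of the branch leading to it."""
    return {name: float(length) for name, length in LEAF.findall(newick)}


leaves = leaf_branch_lengths(subtree)
for name, length in sorted(leaves.items(), key=lambda item: item[1]):
    print(f"{name}\t{length}")

# For two sister leaves, the path between them is the sum of their two branches.
print("Knoellia-Janibacter distance:", leaves["Knoellia"] + leaves["Janibacter"])
```

Long terminal branches, such as the one leading to Kytococcus in the full tree, show up directly in this kind of listing.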
The phylogram from TreeDyn:
The radial version of the phylogenetic tree from TreeDyn:
Interpretation of phylogenetic tree
Judging by the genus names on the phylogenetic trees, the three genera that fall in a clade with Kytococcus are related to it at the level of the same order or family. Since few closer relatives of Kytococcus have had their genomes sequenced, there is no evidence suggesting that horizontal gene transfer of Ksed_00010 has occurred. In fact, nearly all of the genera in the tree belong to the same order, with the notable exception of Cellvibrio. Given that Ksed_00010 encodes a protein of DNA metabolism and that the E-values and scores of all 100 hits in the nr database are highly significant, DnaA is likely very highly conserved across microbial species, and it would be hard to find evidence of horizontal transfer as a result.
Ornithinimicrobium - same order
Serinicoccus - same order
Luteipulveratus - same family
Cellvibrio - different phylum
Promicromonosporaceae - same order
Arsenicicoccus - same order
Mobilicoccus - same order
Gene Context
login to IMG/ER and find the gene page for this gene
Ortholog Neighborhood Region of organism and examples of similarities or differences
Comment on the ortholog neighborhood regions
The neighborhoods in the image above are similar in the region immediately downstream of and including Ksed_00010, and the organisms shown are all as closely related to Kytococcus sedentarius as we can expect to find. Therefore there is no evidence of horizontal gene transfer from the ortholog neighborhood regions approach.
Chromosome Viewer GC Heat Map
go to the IMG Gene Detail page
Characteristic GC% of the genome
72%
Average GC% of the gene
Ksed_00010 has a GC content of 68%. The average GC content of the genome of Kytococcus sedentarius is 72%. Therefore the GC content of Ksed_00010 is similar to that of the genome as a whole, and there is no evidence of horizontal gene transfer using the %GC approach.
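The %GC comparison can be reproduced from the nucleotide sequence itself. The following is a minimal Python sketch; the function name gc_percent is hypothetical, ambiguous bases such as N are excluded from the denominator by assumption, and the fragment is only the first 30 nt of the Ksed_00010 sequence listed under Basic Information, so its value will differ from the whole-gene figure.

```python
def gc_percent(seq):
    """GC content as a percentage of the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / total


# First 30 nt of the Ksed_00010 nucleotide sequence (Basic Information section).
fragment = "GTGAGCCAGACCCCCGACGACCACGCCACC"
print(f"fragment GC: {gc_percent(fragment):.1f}%")

# Values reported in this notebook, in percent.
gene_gc, genome_gc = 68, 72
print(f"gene minus genome: {gene_gc - genome_gc} percentage points")
```

Passing the full gene sequence to gc_percent should come out close to the 68% reported by IMG, which makes a quick sanity check on the copied FASTA record.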
[-] RNA
Module Instructions
Rfam
go to Rfam at http://rfam.sanger.ac.uk
Rfam number (RF#####)
Not applicable. Ksed_00010 is not an RNA gene.
Rfam name
Score
E-value
Pairwise Alignment
[-] Proposed Annotation
Proposed Annotation
Enter the proposed annotation for the gene
chromosomal replication initiator protein DnaA
The top two BLAST hits in both the nr and Swiss-Prot databases are named chromosomal replication initiator protein DnaA; the top COG hit is DnaA: Chromosomal replication initiation ATPase DnaA [Replication, recombination and repair]; the top TIGRFAM hit is DnaA: chromosomal replication initiator protein DnaA; and the top Pfam hit is Bac_DnaA. All of these findings support the annotation of Ksed_00010 as chromosomal replication initiator protein DnaA, as indicated by the computer annotation.