Secondary metabolic symbiosis in shipworms (Teredinidae)

Authors: Marvin A. Altamia(a,b)*, Zhenjian Lin(c)*, Amaro E. Trindade-Silva(d,e), Iris Diana Uy(b,f), J. Reuben Shipway(g), Diego Veras Wilke(e), Gisela P. Concepcion(b,f), Daniel L. Distel(a), Eric W. Schmidt(c)**, Margo G. Haygood(c)**

(a) Ocean Genome Legacy Center, Department of Marine and Environmental Science, Northeastern University, Nahant, MA, USA
(b) The Marine Science Institute, University of the Philippines Diliman, Quezon City 1101, Philippines
(c) Department of Medicinal Chemistry, University of Utah
(d) Bioinformatic and Microbial Ecology Laboratory - BIOME, Federal University of Bahia, Salvador, Bahia, Brazil
(e) Drug Research and Development Center, Department of Physiology and Pharmacology, Federal University of Ceara, 60430275, Ceara, Brazil
(f) Philippine Genome Center, University of the Philippines Diliman, Quezon City 1101, Philippines
(g) Institute of Marine Science, School of Biological Sciences, University of Portsmouth, UK
* equal contribution authors
** co-corresponding authors

bioRxiv preprint doi: https://doi.org/10.1101/826933; this version posted October 31, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY 4.0 International license.

Significance

Shipworms play critical roles in recycling wood in the sea and in shaping mangrove habitats. Symbiotic bacteria supply the enzymes that they need for nutrition and wood degradation. Here, we show that the same nutritional symbionts also have an immense capacity to produce a multitude of diverse and likely novel bioactive secondary metabolites. The compounds likely support the ability of shipworms to degrade wood in marine environments and include a compound under investigation for its therapeutic potential. Because many of the symbionts can be cultivated, they provide a model for understanding how secondary metabolism impacts microbial symbiosis in animals.

Abstract

Shipworms, assisted by intracellular γ-proteobacteria in their gills, are the principal degraders of wood in the sea. Shipworm symbionts have been cultivated in the laboratory. The genomes of these symbionts, in addition to being replete with lytic enzymes capable of degrading wood and/or enzymes of thioautotrophic metabolism, are among the bacterial genomes richest in secondary metabolite genes. These cultivated symbionts represent the dominant species in the gills in diverse shipworm species. We investigated how the isolate secondary metabolites might impact the host animals: which bacterial pathways are present, how widely distributed they are, and how they vary. Focusing on 14 wood-eating shipworm specimens, we found between one and three major bacterial species in each gill, with each species comprising a complex mixture of closely related strains. The mixture allows the shipworm host to access a much more complex metabolism than can be afforded by a single symbiont strain. We analyzed sequences from 22 shipworm gill metagenomes from seven shipworm species and compared them with the genomes of 23 cultivated bacterial isolates from shipworm gills. Limiting our analyses to well-characterized biosynthetic gene families, we found more than 400 polyketide, nonribosomal peptide, and related biosynthetic gene clusters, only a handful of which resemble known ones, comprising over 100 gene cluster families (GCFs). Only four GCFs are found in all specimens investigated. Several GCFs exhibited a host species-specific distribution, but most occurred stochastically. Shipworms and their symbionts thus provide a model system for understanding the role of secondary metabolism in symbioses.
Introduction

Shipworms (Family Teredinidae) are bivalve mollusks found throughout the world’s oceans (1, 2). Many shipworms eat wood, assisted by cellulases from intracellular symbiotic γ-proteobacteria that inhabit their gills (Figure 1) (3-6). Others use sulfide metabolism, also relying on gill-dwelling γ-proteobacteria for sulfur oxidation (7). Shipworm gill symbionts of several different species are thus essential to shipworm nutrition and survival. One of the most remarkable features of the shipworm system is that wood digestion does not take place where the bacteria are located, so that the bacterial cellulase products are transferred from the gill to a nearly sterile cecum (8), where wood digestion occurs (Figure 1) (9). This enables the host shipworms to directly consume glucose and other sugars derived from wood lignocellulose and hemicellulose, rather than the less energetic fermentation products of cellulolytic gut microbes as found in other symbioses. Shipworm symbionts are also essential for nitrogen fixation in the low-nitrogen wood environment (10). Thus, shipworms have evolved structures and mechanisms enabling bacterial metabolism to support animal host nutrition.

While in many nutritional symbioses the bacteria are difficult to cultivate, shipworm gill symbiotic γ-proteobacteria have been brought into stable culture (5, 11, 12). This led to the discovery that these bacteria are exceptional sources of secondary metabolites (13). Of bacteria with sequenced genomes, the gill symbionts Teredinibacter turnerae T7901 and related strains are among the richest sources of biosynthetic gene clusters (BGCs), comparable in content to famous producers of commercial importance such as Streptomyces spp. (12-15). This implies that shipworms might be a good source of new compounds for drug discovery. Of equal importance, the symbiotic bacteria are crucial to survival of host shipworms, and bioactive secondary metabolites might play a role in shaping those symbioses.

The genome sequence of T. turnerae T7901 revealed nine complex polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS) BGCs (13), and more comprehensive analysis identifies up to 14 potential BGCs. One of these was shown to produce a novel catecholate siderophore, turnerbactin, which is crucial in obtaining iron and to the survival of the symbiont in nature (16). A second BGC synthesizes the borated polyketide tartrolons D/E, which are antibiotic and potently antiparasitic compounds (17). Both were detected in the extracts of shipworms, implying a potential role in producing the remarkable near sterility observed in the cecum (8). These data suggested specific roles for secondary metabolism in maintaining shipworm function.

T. turnerae T7901 is just one of multiple strains and species of γ-proteobacteria living intracellularly in various shipworm gills (3, 11), and thus these analyses just begin to describe shipworm secondary metabolism. Many shipworm species are generalists, consuming wood from a variety of sources (1, 18). Other wood-eaters, such as Dicyathifer mannii, Bactronophorus thoracites, and Neoteredo reynei, are specialists that live in the submerged branches, trunks and rhizomes of mangroves (19, 20). There, they play an important role in ecological processes in mangrove ecosystems, namely transferring large amounts of carbon fixed by mangroves to the marine environment (18). Several shipworm species, such as Kuphus polythalamius, live in other substrates. K. polythalamius is often found in sediment habitats (as well as in wood), where its gill symbionts are crucial to sulfide oxidation and carry out carbon fixation (7). K. polythalamius lacks significant amounts of cellulolytic symbionts such as T. turnerae, and instead contains Thiosocius teredinicola, which oxidizes sulfide and generates energy for the host (21). Other shipworms are found in solid rock and in seagrass (22, 23). Thus, gill symbionts vary, but in all cases the symbionts appear to be essential to the survival of shipworms.

While the potential of T. turnerae as an unexplored producer of secondary metabolites has been described (13, 15), the capacity of other shipworm symbionts is still largely unknown. Moreover, several pieces of data indicate that the BGCs found in cultivated isolates might also be found in shipworm gills, but their presence, distribution and variability in nature are unknown. Previous data include the detection of tartrolons and turnerbactins and their BGCs in shipworms (16, 17); an investigation of four isolate genomes and one metagenome that observed shared pathways (24); and an exploratory investigation of the metagenome of N. reynei gills and digestive tract that detected known T. turnerae BGCs as well as novel clusters (25). These findings left major questions about the origin, abundance, variability, distribution, and potential roles of shipworm secondary metabolites. Here, we use a comparative metagenomics approach to answer these questions.

Results and Discussion

Genomic data from shipworms and their symbiotic bacteria. Cellulolytic T. turnerae is relatively rich in secondary metabolism in comparison to sulfide-oxidizing T. teredinicola. Therefore, for this analysis we selected six species of wood-eating shipworms, comparing these to a seventh sulfide-oxidizing species, K. polythalamius. We compared gill metagenomes from 22 specimens comprising seven animal species with the genomes of 23 cultivated bacteria isolated from shipworms (Table S1). Of the animals obtained, we analyzed three specimens each of Bactronophorus thoracites, Kuphus spp., Neoteredo reynei, and Teredo sp., two specimens of Bankia sp., and five specimens of Bankia setacea. These animals were divided into three geographical regions (Figure 1): the Philippines (B. thoracites and D. mannii from Infanta, Quezon; Kuphus spp. from Mindanao and Mabini); Brazil (N. reynei from Rio de Janeiro, Teredo sp., and Bankia sp. from Ceará); and the United States (B. setacea). The purpose of sampling this range was to determine whether there are any geographical differences in gill symbionts. Most of the shipworms were obtained from mangrove wood, with the exception of B. setacea from unidentified found wood, and Kuphus spp. from both found wood and mud. The gill metagenomes of two specimens of mud-dwelling K. polythalamius from Mindanao were previously sequenced (7). Here, we sequenced and analyzed gill metagenomes from a third, wood-burrowing Kuphus sp. specimen from Mabini. Since the taxonomy of this specimen has yet to be fully resolved, we have not provided a species name for it. Nonetheless, its metagenome was very similar to that of the mud-dwelling K. polythalamius (see below). B. setacea was sequenced by the JGI with low coverage. The remaining specimens from the Philippines were sequenced at Utah with highest coverage data (HiSeq), while Brazilian specimens were sequenced in Brazil and are intermediate in coverage (MiSeq) (see Table S1 for metagenome statistics). Because of their lower coverage, B. setacea sequences were used only in a subset of the analyses described below.

γ-Proteobacteria of Orders Cellvibrionales (Teredinibacter and allies) and Chromatiales (Thiosocius and allies) are described symbionts that live intracellularly in the gills of shipworms. We selected 23 cellulolytic and sulfur-oxidizing isolates cultivated from shipworm tissue samples. Some of the strains were previously isolated, while others originated in our recent sample collections (Table S2). For comparison, we also included Agarilytica rhodophyticola 017 (26) (Ga0198945), a free-living bacterium that is closely related to shipworm strains and that has a sequenced genome. The strains were sequenced at the JGI. Six of these circular genomes were closed, while the remaining assemblies had between 2 and 141 scaffolds (see Table S2 for strain data and accession numbers). Two of these genomes, T. turnerae T7901 and T. teredinicola 2141T, were previously described (13, 21).

Examination of 16S rRNA gene sequences of the cultivated strains allowed us to identify the isolates with genomes most similar to those identified in metagenome sequences (Figures 2A and S1; Table S2). Of these, 3 are cellulolytic symbionts, while 2 consist of sulfide-oxidizing symbionts. Crucially, 11 of the strains belong to a single species, T. turnerae.

Each shipworm gill sample is dominated by 1-3 major bacterial species, with a large underlying strain variation. Metagenome sequencing was used to determine which bacterial species inhabit each shipworm gill. After initial analysis, we found that most of the metagenomic DNA from bacteria could be mapped to genome sequences from individual strains in our culture collection. Read counts were used to quantify the abundance of each species within each gill sample. This enabled us to accurately quantify the major symbiont species with high confidence. For this analysis, we removed B. setacea, which had low read coverage, and focused on the six species with good coverage (Fig. 2B). Each shipworm gill metagenome is dominated by one to three bacterial species, which are represented by cultivated isolates that we obtained from those same species. Three shipworm species are dominated by a single bacterial species, while three others have mixed symbiont communities. As previously reported, Kuphus spp. is dominated by the sulfur-oxidizer T. teredinicola, and N. reynei is dominated by the cellulolytic bacterium T. turnerae (7, 25). B. thoracites is dominated by strain 2753L. In contrast, Brazilian shipworms Bankia sp. and Teredo sp. contain mostly T. turnerae, but they also harbor species closely related to Cellvibrionales strain 1162T. D. mannii has the most diverse community, containing a mixture of T. turnerae and Cellvibrionales strain 2753L, and a significant fraction of the sulfide-oxidizing symbiont Chromatiales strain 2719K. In the metagenomes of all shipworm gills, there is also a complex mixture of minor, variable bacterial strains (~1-15% of total reads, shown in gray in Fig. 2B). Thus, the shipworm gill is a relatively simple system, dominated by a few species of intracellular symbiotic bacteria that are crucial to host nutrition and survival.
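The read-count quantification described above can be illustrated with a minimal sketch. It assumes that reads have already been mapped to reference genomes and counted per species. The species labels, the counts, the 15% cutoff for calling a species "major", and the function names are all hypothetical and are not taken from this study. Raw counts are used without normalizing for genome length, which a real analysis would likely need to consider.

```python
def relative_abundance(read_counts):
    """Return each species' share of all mapped reads (fractions sum to 1)."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no mapped reads to quantify")
    return {species: count / total for species, count in read_counts.items()}


def major_species(read_counts, min_fraction=0.15):
    """List species at or above min_fraction of mapped reads, most abundant first."""
    fractions = relative_abundance(read_counts)
    majors = [s for s, f in fractions.items() if f >= min_fraction]
    return sorted(majors, key=lambda s: fractions[s], reverse=True)


# Hypothetical mapped-read counts for one gill metagenome (illustrative only).
example_counts = {
    "species_A": 600_000,
    "species_B": 250_000,
    "species_C": 100_000,
    "minor_or_unassigned": 50_000,
}

for species, fraction in relative_abundance(example_counts).items():
    print(f"{species}: {fraction:.1%}")
print("major species:", major_species(example_counts))
```

The cutoff is a free parameter in this sketch; the text only states that one to three species dominate each gill and that minor strains account for roughly 1-15% of total reads.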
Underlying this simplicity, for each bacterial species present in a sample, there are multiple strain variants. We measured the prevalence of these variants using a previously reported method in which we look at single nucleotide polymorphisms in conserved genes such as DNA gyrase B at the individual read level (Figure S2) (7). The results are very similar to our previous report of bacterial strain variation in the gill metagenomes of K. polythalamius. The presence of multiple, closely related strain variants adds to the complexity of secondary metabolite pathways found in each host organism, since in each animal gill there are many more biosynthetic pathways than are found in individual sequenced isolates (see below).

The symbiont mixtures are representative of the animal lifestyles. K. polythalamius appears to thrive entirely on sulfide oxidation (7), as required in its sediment habitat, while the other shipworms contain various cellulolytic bacteria responsible for wood degradation. D. mannii likely has a more complex lifestyle, since it contains the sulfur-oxidizing bacterium strain 2719K and the cellulolytic species T. turnerae and strain 2753L. More complete descriptions of these symbioses will be published in articles focused on individual shipworm species, while here our focus is on analysis and comparison of secondary metabolism.

Shipworm symbionts and gill metagenomes are unusually rich in complex secondary metabolite pathways. The program antiSMASH (27) was used to comb the genomes for secondary metabolite BGCs. We refined the antiSMASH output to focus on BGCs that are well characterized to encode secondary metabolites: polyketide synthases (PKSs), nonribosomal peptide synthetases (NRPSs), siderophores, terpenes, homoserine lactones, and thiopeptides. Using these criteria, we identified 168 BGCs from 23 cultivated isolates and 401 BGCs from 22 shipworm gill metagenomes (Fig. 3). Because the genomes of cultivated isolates were well assembled, we could discern and analyze entire BGCs. By contrast, the animal metagenomes contained some large but many smaller contigs (N50s shown in Table S1), in which BGCs were often fragmented. These BGCs nearly universally originate from Order Cellvibrionales, with very few BGCs found in the sulfide-oxidizing Chromatiales strains. Thus, the cellulolytic shipworm symbionts are rich sources of diverse BGCs.

We found only five BGCs that were similar to previously identified clusters from outside of shipworms, based upon >70% of genes conserved in antiSMASH. The remainder appeared to be unknown or uncharacterized BGCs. This result reinforces that shipworm symbiotic bacteria are promising sources of new bioactive molecules. It further supports a previous analysis comparing genomes across domain Bacteria, which revealed that T. turnerae represents a notably rich, yet nearly untapped, source of new secondary metabolite genes (15).

To facilitate comparison between metagenomes, we grouped BGCs into gene cluster families (GCFs). This is a method that compares groups of genes that are all involved in a particular pathway, rather than comparing individual genes (28, 29). We set an identity threshold defined in “Methods,” leading to the identification of 122 GCFs comprising all 569 discrete BGCs in the genomes and metagenomes combined (Fig. 4 and Table S3). Given that these GCFs originated in a small handful of bacterial species, this is a large number that reinforces the role of strain variation in generating chemical diversity in shipworm gills.

A caveat is that we ignored antiSMASH hits from poorly characterized or uncharacterized biosynthetic pathway families, so that 122 GCFs represents a very conservative estimate of the chemical diversity present in shipworm gills. By paring down the genes to be analyzed to a set of well characterized BGCs, comparison became possible, but we also potentially missed some interesting pathways. As one example, we analyzed the genome of Chromatiales strain 2719K and discovered a gene cluster for tabtoxin (30, 31) or a related compound (Fig. 5). This cluster does not contain common PKS/NRPS elements and thus was excluded from the comparative analysis (for example, it is not a defined GCF as shown in Figures 4, 6, or 7). A key biosynthetic gene in the tabtoxin-like cluster was pseudogenous in strain 2719K, but the D. mannii gill metagenome contained an apparently functional pathway. Tabtoxin is an important β-lactam that is used by Pseudomonas in plant pathogenesis (32). Since tabtoxin prevents glutamine synthesis and leads to the accumulation of toxic ammonia in plants (33), it is tempting to speculate on how tabtoxin might impact this symbiosis, perhaps improving access to nitrogen.

Another caveat is that, although we identified a large number of GCFs from a small specimen set, even this is an underestimate. B. setacea metagenomes from Ocean Genome Legacy were included in our analysis for comparison, but they are vastly undersampled in comparison to the other species due to low sequence coverage, and thus we are missing most of their BGCs in the analysis.

Cultivated isolate GCFs are abundant components of the source gill metagenomes. We compared BGCs and GCFs between shipworms and bacteria. Of 401 BGCs identified in the metagenomes, 305 also had close relatives in cultivated isolates, indicating that ~75% of BGCs in the metagenomes are covered in our sequenced culture collection (Fig. 3). Conversely, of 168 isolate BGCs, 148 (90%) are found in the metagenomes. Thus, sequencing further cultivated isolates in our strain collections is likely to yield additional novel BGCs. Only 8 GCFs are widely distributed in 10 or more isolates, and these are mostly pathways that are universal or nearly universal in T. turnerae, which is overrepresented in our dataset (Figs. 6 and 7). In contrast to isolate genomes, in which we found many GCFs that occur in only a single genome, in the metagenomes most GCFs are found in multiple specimens. Only 24 out of 107 total GCFs are found only once in metagenomes (Fig. 4). Since different shipworm species in our specimen collection contain different groups of bacteria, this result reinforces the need to sample more individual specimens across the diversity of shipworm species to optimize the discovery of bioactive secondary metabolites.

To obtain a more refined view of BGC distribution, we first used the MultiGeneBlast (28) output to construct a similarity network (Fig. 6). The network provided an easily interpretable diagram of how GCFs are distributed in bacteria. However, two notable problems were observed in this diagram. First, from experience we have found the tartrolon BGC in nearly all T. turnerae strains. However, this BGC was observed in only a few of the T. turnerae-hosting shipworms via MultiGeneBlast. This is caused by a technical problem in assembly that we often see with large trans-AT pathways from complex samples. Second, because this method relies upon comparing larger contigs, we could not include the low-coverage B. setacea sequences from JGI in this analysis.

To remedy these problems, we used a second method that obtained GCFs from cultivated isolates and used those GCFs in tBLASTn searches of metagenome contigs (Fig. 7). This provided an orthogonal view of secondary metabolism in shipworms, revealing the presence of the tartrolon pathway, as well as other pathways that do not assemble well in metagenomes because of characteristics such as repetitive DNA sequences. It also enabled us to compare B. setacea with the other species. A weakness of this second method is that the closeness of relationships between BGCs is not readily discerned in terms of sequence similarity and genes present. Thus, these two methods provide different insights into BGCs in shipworm gills.

Using those data, focusing on the six shipworm species with good sequencing coverage, we could observe clear trends of widely shared GCFs, GCFs that were specific to shipworm and symbiont species, and stochastically occurring GCFs.

Widely shared GCFs. Four pathways (GCF_2, GCF_3, GCF_5, and GCF_8) were prevalent in wood-eating shipworms, regardless of sample location (Figs. 6 and 7). These GCFs were encoded in the genomes of T. turnerae and those of several other Cellvibrionales isolates (especially the pathway-rich 2753L), explaining their widespread distribution.

GCF_2 encodes an NRPS/trans-acyltransferase (trans-AT) PKS pathway, the chemical products of which are unknown. It is found in all shipworm specimens in this study and in all T. turnerae strains. It is also present in Cellvibrionales strain 2753L. This explains its presence in B. thoracites despite the absence of T. turnerae in this species. GCF_2 is synonymous with “region 3” described in the annotation of the T. turnerae T7901 genome (13).

The most prominently occurring pathway in shipworm gill metagenomes is GCF_3. It was identified in all gill metagenomes with cellulolytic symbionts, including the metagenome of specimen B. setacea BSG2. It occurs in all T. turnerae strains, as well as in Cellvibrionales strains 2753L and Bs08. It was first annotated as “region 1” in the T. turnerae T7901 genome and encodes an elaborate hybrid trans-AT PKS-NRPS pathway (13). Unlike all other GCFs identified in shipworm metagenomes and isolates, GCF_3 could be subdivided into at least three discrete categories, each of which included different gene content (Fig. 8). The first category, identified in T. turnerae T7901, encodes a PKS and a single NRPS, in addition to several potential modifying enzymes. Strain Bs08 contains a similar pathway, except with an additional two NRPS genes that presumably lead to additional amino acids on the resulting product. Cellvibrionales 2753L encoded the third pathway type, which was similar to that found in T7901 except with different flanking gene content. The presence of a single GCF that encodes similar but non-identical products suggests a dynamic pathway evolution within shipworms.
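The three distribution patterns named above (widely shared, host-specific, and stochastically occurring) can be expressed as a simple rule over a presence/absence table. The sketch below is only an illustration: the specimen IDs, host labels, GCF names, and the classification rule itself are hypothetical and are not the procedure used in this study.

```python
def classify_gcf(specimens_with_gcf, host_of):
    """Label one GCF by how its occurrences are spread over specimens.

    specimens_with_gcf: set of specimen IDs in which the GCF was detected.
    host_of: dict mapping every specimen ID to its host species.
    """
    hits = set(specimens_with_gcf)
    if not hits:
        return "absent"
    if hits == set(host_of):
        return "widely shared"  # detected in every specimen
    hosts_hit = {host_of[s] for s in hits}
    if len(hosts_hit) == 1:
        host = next(iter(hosts_hit))
        same_host = {s for s, h in host_of.items() if h == host}
        if len(same_host) > 1 and hits == same_host:
            return "host-specific"  # all specimens of exactly one host
    return "sporadic"  # everything else, including single occurrences


# Hypothetical specimens and detections (illustrative only).
host_of = {"A1": "host A", "A2": "host A", "B1": "host B", "B2": "host B"}
presence = {
    "GCF_x": {"A1", "A2", "B1", "B2"},
    "GCF_y": {"A1", "A2"},
    "GCF_z": {"B1"},
}

labels = {gcf: classify_gcf(hits, host_of) for gcf, hits in presence.items()}
print(labels)
```

With real data, detection depends on sequencing coverage, so a low-coverage specimen could make a widely shared family look sporadic; this is one reason the low-coverage B. setacea samples are treated separately in the text.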
GCF_5 encodes a combination of terpene cyclase and predicted arylpolyene biosynthetic genes, which were unrecognized in the initial BGC screening of the T. turnerae T7901 genome, since the arylpolyene pathways are recent discoveries. The GCF_5 biosynthetic product is unknown, although the cyclase and surrounding regions have all of the genes necessary to make and export hopanoids. In addition to occurring in all T. turnerae strains, GCF_5 is present in Cellvibrionales strains 1120W and 2753L. The pathway was detected in all wood-eating specimens except Teredo sp. TBF07 (Fig. 8).

GCF_8 is exemplified by the previously described turnerbactin BGC from T. turnerae T7901. Turnerbactin is a catecholate siderophore, crucial to iron acquisition in T. turnerae (16). The BGC for turnerbactin was previously identified and described as “region 7” in the T. turnerae T7901 genome. GCF_8 is present in all T. turnerae genomes sequenced here. Other Cellvibrionales strains, including 2753L from B. thoracites and Bs08 from B. setacea (neither of which contains T. turnerae), also encode turnerbactin-like siderophore synthesis. One specimen of B. thoracites contained GCF_8. Beyond bacterial iron acquisition, siderophores are also important in strain competition and potentially in host animal physiology (34, 35), possibly explaining the widespread distribution of GCF_8. From the clustering pattern in Figure 6, it is likely that GCF_8 comprises at least three different but related types of gene clusters. Thus, GCF_8 likely represents catecholate siderophores, but not necessarily turnerbactin.

Important GCFs widely distributed in T. turnerae-containing shipworms. In addition to the four GCFs described above that have a wide distribution, GCFs 1, 4, and 11 were found in all T. turnerae-containing shipworms. GCF_1 is a trans-AT PKS-NRPS pathway that appears to be split into two clusters in some shipworm isolates, including T. turnerae T7901, in which it was previously annotated as “region 4” and “region 5”. GCF_4 is the previously described “region 8” PKS-NRPS from T. turnerae T7901. Most notably, GCF_11 encodes tartrolon biosynthesis (17). Tartrolon is an antibiotic and potent antiparasitic agent isolated from culture broths of T. turnerae T7901 (17, 36, 37). It has also been identified in the cecum of the shipworm. It was proposed that the bacteria synthesize tartrolon in the gill, and that it is transferred to the cecum, where it may play a role in keeping the digestive tract free of bacteria (17).

GCFs specific to shipworms containing strain 2753L-like symbionts. D. mannii and B. thoracites contain abundant 2753L-like strains. Like T. turnerae T7901, the 2753L isolate genome encodes GCFs 2, 3, and 5. However, 2753L contains several GCFs not found in T. turnerae, including GCFs 6, 10, 12, 13, 14, 16, 30, and 31 (listed in order of their relative frequency of occurrence in samples). All of these GCFs are also found in D. mannii and B. thoracites gill metagenomes. These are PKS and NRPS clusters that lack close relatives according to antiSMASH annotation and thus have the potential to synthesize novel secondary metabolite classes.

GCFs specific to strain 1162T-like symbionts in shipworms from Brazil. All three Brazilian shipworm species contain symbiont genomes similar to our cultivated isolate, 1162T. This is likely due to the species sampled and not to geographical variation, since we obtained 1162T from a Philippines specimen of Lyrodus sp. In Bankia sp. and Teredo sp., where strain 1162T-like symbionts are a major component, there are several unique GCFs that originate in the strain 2753L-like symbionts (Fig. S4B). Because the strain 2753L-like symbiont from Brazil metagenomes is relatively distantly related to the cultivated isolate and has little overlap in terms of BGCs, we could not identify most of those full-length BGCs in the cultivated isolates. Thus, if these trends hold up through further sampling, strain 1162T-like symbionts may exhibit a high level of chemical diversity similar to that found in strain 2753L-like genomes. In addition, when comparing the metagenomes of Bankia sp. and Teredo sp., it is clear that the 1162T-like symbionts are not identical in these two shipworm species, and that they likely harbor different GCFs.

GCFs specific to sulfur-oxidizing shipworm symbionts. Kuphus and its symbionts contained relatively few BGCs, but strikingly two NRPS-containing GCFs have been universally found in the shipworm-associated sulfide-oxidizing species T. teredinicola. One of these, GCF_17, is shown in Fig. 7, where it is found in the symbiont metagenomes of Kuphus and D. mannii. It is clear that the cellulolytic symbionts are much richer in BGCs, and in addition the BGCs vary much more extensively.

Stochastic pathway occurrence in shipworm gills. Many pathways in both genomes and metagenomes were found only once or occurred relatively rarely, so that trends could not be discerned. In Fig. 7, only 18 GCFs found in more than one species of shipworm are displayed. The remaining 114 GCFs occur rarely or only once. This indicates that most biosynthetic pathways occur stochastically. This trend is reinforced in Fig. 6, where most GCFs in the diagram occur only once (single, unlinked spots). While several biosynthetic pathways are conserved and thus likely have an important conserved role in the symbiosis, most are not conserved. Further sampling of shipworm specimens, species, and cultivated isolates will yield many further, unanticipated BGCs.

Variability in conserved shipworm GCFs increases potential compound diversity. Even among conserved GCFs, there is variability in some of the pathways. This can be seen in the BGC network, where the relationships between clusters are displayed graphically (Fig. 6). For some of these pathways, we could observe differences in gene content consistent with different chemicals that would be produced. Examples include the universal GCF_3 and siderophore pathway GCF_8 described above.

Discussion. Marine invertebrates often use small molecules in chemical defense (38). The suite of defensive chemicals in an individual animal is usually fairly simple. The production of potently bioactive compounds in large abundance (>0.1% of animal dry weight) has enabled the discovery of tens of thousands of compounds, some of which are clinically used drugs (39). Because many of the compounds resemble bacterial metabolites, chemists speculated that the true origin of compounds might be bacterial. Later work relying on genetics and metagenomics identified uncultivated bacteria as key producers of defensive metabolites (40, 41). The symbioses appear to be obligate and species-specific. To date, none of these defensive symbionts has been stably cultivated, possibly because of the long reliance on host metabolism.
Ontheotherendofthespectrum,humansalsocontainmanybacteriathatproducepotentiallybioactivesecondarymetabolites(15,42).Mostofthesehavecultivatedrepresentatives.Thebacteriaarenotobligateandarehighlyvariablewithinourspecies.Becausetheresultingcompoundsarelikelypresentinlowabundancewithinaverycomplexmicrobiota,itisdifficulttostudythebiologyofsecondarymetabolisminhumans.Here,wedescribeasymbioticsystemthatisintermediatebetweenthesetwoextremes.Thestorybeganwiththebiologyofshipworms,inwhichcellulolyticbacteriawerelongknowntospecificallyinhabitgillsandhypothesizedtobethecauseofanevolutionarypaththatleadstowood-specializationinmostofthefamily,alongwithdrasticmorphophysiologicalmodifications(1,5,43).Thesesymbiontscouldbecultivated,althoughonlyrecentlyhavewebeenabletosamplethefullspectrumofmajorsymbiontspresentingills.TheunexpectedfindingthatT.turneraeT7901wasexceptionallyrichinBGCs–proportionatelydenserinBGCcontentthanStreptomycesspp.(13,15)–ledustoinvestigateshipwormsasasourceofnewbioactivecompounds.Likeaconsiderableportionofthehumanmicrobiota,shipwormsymbiontsareamenabletocultivation.Here,weshowthat,likedefensivesymbiosesinseveralmarineinvertebrates,thegillmicrobiotaisrelativelysimple,leadingtoarelativelydefinedsuiteofpotentialmetabolites.TheBGCsincultivatedisolatesarefaithfullyfoundasmajorpathwayswithintheshipwormgillmetagenome.Theabilitytoexperimentallymanipulatethegillcommunitywillprovideagoodsystemtounderstandtheroleofsecondarymetabolisminanimalsymbioses.OnlyfourGCFsarewidelyconservedincellulolyticshipworms.Twoofthesepathwayshaveobviouspotentialrolesinsymbiosis.Siderophoressuchasturnerbactinareimportantinsequesteringiron,buttheyarealsowidelyusedinbacterialcompetition(34,35).Hopanoidsareknowntobeimportantinsymbiosis(44).Twoofthepathways(GFC_2andGCF_3)areexcitingintermsofnewchemistryandbiology.Theseencodeverycomplexbiosyntheticpathwaystotrans-ATpolyketides(45).Suchcompoundsareoftenpotentlyactiveagainsthumandiseasetargets.Insymbiosesbetweenfungiandbacteria,trans-ATPKSproductsa
rehighlytoxicandlikelydefendtheholobiontfrompredation(46).Thus,isolationofthesecompoundsisapriorityofourproject.Manypathwaysarespecifictocertaincladesofshipwormsymbionts,andthereforetothehostanimalsthatharborthem.Thesealsoencodeverycomplexmetabolites,mostofwhicharepolyketides,nonribosomalpeptides,orhybridsofthetwo.TherearemanyothersuchcomplexpathwaysbeyondthePKS/NRPS,includingtabtoxin-likepathwaysinspecificshipworms.Mostofthesehaveyettobecharacterized,withtheexceptionofthetartrolonD/EpathwaythatisfoundinmanyT.turnerae-containingshipworms.Tartrolonsarepicomolarantiparasiticcompounds(37),implyingthattheymaybeimportantinshapingthemicrobialenvironmentinhosts.ThespecificityofBGCstohosttypessuggestthatthesepathwaysareimportanttothebiologyofsymbiosis.Finally,thereisalargedegreeofstochasticoccurrenceofbiosyntheticpathways.Chemicaldefensivetheorysuggeststhatdifferentiationordiversityofsecondarymetabolitesincreases
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 31, 2019. . https://doi.org/10.1101/826933doi: bioRxiv preprint
12
fitness (47). This may be especially important in highly biodiverse environments such as those found in mangrove habitats. This experimental system will enable us to examine the effects of these variable BGCs, as well as the common/universal BGCs, on the biology of the shipworm hosts.

In summary, here we show that the biosynthetic richness of cultivated shipworm symbionts is also found in the gills of the host animals. These results reveal potentially important chemical interactions that would affect a variety of marine ecosystems and a novel and underexplored source of bioactive metabolites for drug discovery.

Methods

Collection and processing of biological material. Shipworm samples (Tables S1 and S2) were collected from found wood. Briefly, infested wood was collected and transported immediately to the laboratory or stored in the shade until extraction (<1 day). Specimens were carefully extracted using woodworking tools to avoid damage. Extracted specimens were processed immediately or stored in individual containers of filtered seawater at 4 °C until processing. Specimens were checked for viability by siphon retraction in response to stimulation and by observation of heartbeat, and live specimens were selected. Specimens were assigned a unique code, photographed, and identified. Specimens were dissected using a dissecting stereoscope. Taxonomic vouchers (valves, pallets, and siphonal tissue for sequencing host phylogenetic markers) were retained and stored in 70% ethanol. The gill was dissected, rinsed with sterile seawater, and divided for bacterial isolation and metagenomic sequencing. Once dissected, the gill was processed immediately or flash-frozen in liquid nitrogen.

Bacterial DNA extraction and analysis. Teredinibacter turnerae strains (with T prefix) were isolated using the method described in Distel et al. 2002 (12), while Bankia setacea symbionts (with Bs prefix) were obtained using the technique indicated in O'Connor et al. 2014 (9). Sulfur-oxidizing symbionts were isolated using the protocol specified in Altamia et al. 2019 (21). For this study, additional T. turnerae and novel cellulolytic symbionts from Philippine specimens (with prefix PMS) were isolated (Tables S1 and S2). Briefly, dissected gill or ceca were homogenized in sterile 75% natural seawater buffered with 20 mM HEPES, pH 8.0, using a Dounce homogenizer. Tissue homogenates were either streaked on shipworm basal medium (SBM) cellulose (5) plates (1.0% Bacto Agar) or stabbed into soft agar (0.2% Bacto Agar) tubes and incubated at 25 °C until cellulolytic clearings developed. Cellulolytic bacterial colonies were subjected to several rounds of restreaking to ensure clonal selection. Contents of soft agar tubes with clearings were streaked on fresh cellulose plates to obtain single colonies. Pure colonies were then grown in 6 mL SBM cellulose liquid medium in 16 × 150 mm test tubes until the desired turbidity was observed. For long-term preservation of the isolates, turbid culture was mixed with 40% glycerol at a 1:1 ratio and frozen at -80 °C. Bacterial cells in the remaining liquid medium were pelleted by centrifugation at 8,000 g and then subjected to genomic DNA isolation. The small-subunit ribosomal (SSU) 16S rRNA gene of the isolates was then PCR amplified from the prepared genomic DNA using primers 27F and 1492R and sequenced. Phylogenetic analyses of 16S rRNA sequences were performed using programs implemented in Geneious,
bioRxiv preprint doi: https://doi.org/10.1101/826933; this version posted October 31, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY 4.0 International license.
version 10.2.3. Briefly, sequences were aligned using MAFFT (version 7.388) with the E-INS-i algorithm. The aligned sequences were trimmed manually, resulting in a final aligned dataset of 1,125 nucleotide positions. Phylogenetic analysis was performed using FastTree (version 2.1.11) using the GTR substitution model with optimized Gamma20 likelihood and rate categories per site set to 20.

Genomic DNA used for whole genome sequencing of novel isolates and select T. turnerae strains was prepared using the CTAB/phenol/chloroform DNA extraction method detailed in https://www.pacb.com/wp-content/uploads/2015/09/DNA-extraction-chlamy-CTAB-JGI.pdf. The purity of the extracted genomic DNA was then assessed spectrophotometrically using Nanodrop, and the quantity was estimated using agarose gel electrophoresis. Samples that passed the quality control steps were submitted to the Joint Genome Institute of the Department of Energy (JGI-DOE) for whole genome sequencing. The sequencing platform and assembly method used to generate the final isolate genome sequences used in this study are detailed in Table S1.

Metagenomic DNA extraction. Gill tissue samples from Philippine shipworm specimens (Table S2) were flash-frozen in liquid nitrogen and stored at -80 °C prior to processing. Bulk gill genomic DNA was purified with the Qiagen Blood and Tissue Genomic DNA Kit using the manufacturer's suggested protocol. Gill tissue samples from Brazil shipworm specimens were pulverized by flash-freezing in liquid nitrogen and submitted to metagenomic DNA purification by adapting a protocol previously optimized for total DNA extraction from cnidaria tissues (48, 49). Briefly, shipworm gills were carefully dissected (taking care to exclude tissue from other organs), submitted to a series of five washes with 3:1 sterile seawater/distilled water for removal of external contaminants, and macerated until powdered in liquid nitrogen. Powdered tissues (~150 mg) were then transferred to 2 mL microtubes containing 1 mL of lysis buffer [2% (m/v) cetyltrimethylammonium bromide (Sigma Aldrich), 1.4 M NaCl, 20 mM EDTA, 100 mM Tris-HCl (pH 8.0), with freshly added 5 μg proteinase K (v/v; Invitrogen), and 1% 2-mercaptoethanol (Sigma Aldrich)] and submitted to five freeze-thawing cycles (-80 °C to 65 °C). Proteins were extracted by washing twice with phenol:chloroform:isoamyl alcohol (25:24:1) and once with chloroform. Metagenomic DNA was precipitated with isopropanol and 5 M ammonium acetate, washed with 70% ethanol, and eluted in TE buffer (10 mM Tris-HCl, 1 mM EDTA). Metagenomic libraries were prepared using the Nextera XT DNA Sample Preparation Kit (Illumina) and sequenced with 600-cycle (300 bp paired-end runs) MiSeq Reagent Kits v3 chemistry (Illumina) on the MiSeq Desktop Sequencer.

Metagenome sequencing and assembly. Bankia setacea metagenomes were obtained from the JGI database (for accession numbers, see Table S1). Kuphus polythalamius gill metagenomes (KP2132G and KP2133G) were obtained from a previous study (7). Metagenomes from Kuphus sp. specimen KP3700G and from Dicyathifer mannii and Bactronophorus thoracites specimens were sequenced using an Illumina HiSeq 2000 sequencer with ~350 bp inserts and 125 bp paired-end runs at the Huntsman Cancer Institute's High Throughput Genomics Center at the University of
Utah. Illumina FASTQ reads were trimmed using Sickle (50) with the parameters (pe sanger -q 30 -l 125). The trimmed FASTQ files were converted to FASTA files and merged using the Perl script 'fq2fq' in the IDBA_ud package (51). Merged FASTA files were assembled using IDBA_ud with standard parameters at the Center for High Performance Computing at the University of Utah. For metagenome samples from Brazil, all Neoteredo reynei gill metagenomic samples previously analyzed (25) were re-sequenced here for greater coverage depth. Teredo sp. and Bankia sp. gill metagenomes were sequenced using Illumina MiSeq. The raw reads were assembled using either the metaSPAdes pipeline of SPAdes (52, 53) or IDBA-UD (51). Raw reads were merged using BBMerge (54). Non-merged reads were filtered and trimmed using FaQCs (55). Both merged and processed non-merged reads were used in assembly with the metaSPAdes pipeline.

Identification of bacterial sequences in metagenomic data. Assembly-assisted binning with MetaAnnotator (56) was used to sort trimmed reads and assembled contigs into clusters putatively representing single genomes. Each binned cluster was retrieved using SAMtools (57, 58). To identify bacterial clusters, genes for each bin were identified with Prodigal (59). Protein sequences for bins with coding density >50% were searched against the NCBI nr database with DIAMOND (60). Bins with 60% of the genes hitting bacterial subjects in the nr database were considered to originate from bacteria. Each bacterial bin was compared to the 23 shipworm isolate genomes using gANI and AF values (61). With a cut-off of AF >0.6 and gANI >0.95, the bacterial bins from each metagenome were mapped to cultivated bacterial genomes. Bins that mapped to a single bacterial genome were combined into a mega-bin. Reads mapping to each mega-bin were retrieved using MetaAnnotator. For metagenome samples from Brazil, structural and functional annotations were carried out using DFAST (62), including only contigs with length ≥500 bp. All metagenomes were binned using Autometa (63). First, each contig's taxonomic identity was predicted using make_taxonomy_table.py, including only contigs ≥1000 bp. Predicted bacterial and archaeal contigs were binned (with recruitment via supervised machine learning) using run_autometa.py.

Building BGC similarity networks.
BGCs were predicted from the bacterial contigs of each metagenome and from cultivated bacterial genomes using antiSMASH 4.0 (27). From the predictions, only BGCs for PKSs, NRPSs, siderophores, terpenes, homoserine lactones, and thiopeptides (as well as combinations of these biosynthetic enzyme families) were included in succeeding analyses. An all-versus-all comparison of these BGCs was performed using MultiGeneBlast (28) following the protocol previously reported (64). Bidirectional MultiGeneBlast BGC-to-BGC hits were considered to be reliable. In metagenome data, some truncated BGCs only showed single-directional correlation to a full-length BGC. Those single-directional hits were refined as follows: protein translations of all coding sequences from the BGCs were compared in an all-versus-all fashion using blastp search. Only protein hits that had at least 60% identity and at least 80% coverage to both query and subject were considered as valid hits. A single-directional MultiGeneBlast BGC-to-BGC hit was retained if there were at least n-2 proteins (n is the number of proteins in the truncated BGC) passing the
blastp refining. The remaining MultiGeneBlast hits were used to construct a network in Cytoscape (65). Finally, each BGC cluster (GCF) that had a relatively low number of bidirectional correlations was manually checked by examining the MultiGeneBlast alignment.

Occurrence of GCFs in metagenomes. Based on the GCFs identified in the previous step, the core biosynthetic proteins from each GCF were extracted and queried (NCBI tblastn) against each metagenome assembly. A threshold of query coverage >50% and identity >90% was applied to remove nonspecific hits, and the remaining hits, in combination with the MultiGeneBlast hits, were used to make the matrix of GCF occurrence in metagenomes.

Acknowledgments. All collections followed Nagoya Protocol requirements; Brazilian sampling was performed under SISBIO license number 48388, and genetic resources were accessed under the authorization of the Brazilian National System for the Management of Genetic Heritage and Associated Traditional Knowledge (SisGen permit number A2F0DA0). We thank the Genomics and Bioinformatics Center of the Drug Research and Development Center of the Federal University of Ceara for technical support. The work was completed under supervision of the Department of Agriculture-Bureau of Fisheries and Aquatic Resources, Philippines (DA-BFAR) in compliance with all required legal instruments and regulatory issuances covering the conduct of the research. All Philippine specimens were collected under Gratuitous Permit numbers FBP-0036-10, GP-0054-11, GP-0064-12, GP-0107-15, and GP-0140-17. We thank the governments and municipalities of the Philippines and Brazil for access and help. This work was also supported by the National Council of Technological and Scientific Development (CNPq) (http://cnpq.br) and by the Coordination for the Improvement of Higher Education Personnel (CAPES) (http://www.capes.gov.br) under the grant numbers 473030/2013-6 and 400764/2014-8 to AETS. Research reported in this publication was supported by the Fogarty International Center of the National Institutes of Health under Award Number U19TW008163. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The work was supported in part by US NOAA OER award #NA190AR0110303.

References
1. Distel DL, et al. (2011) Molecular phylogeny of Pholadoidea Lamarck, 1809 supports a single origin for xylotrophy (wood feeding) and xylotrophic bacterial endosymbiosis in Bivalvia. Mol Phylogenet Evol 61(2):245-254.
2. Turner RD (1966) A survey and illustrated catalogue of the Teredinidae (Mollusca: Bivalvia) (Harvard University Press, Cambridge).
3. Distel DL, Beaudoin DJ, & Morrill W (2002) Coexistence of multiple proteobacterial endosymbionts in the gills of the wood-boring Bivalve Lyrodus pedicellatus (Bivalvia: Teredinidae). Appl Environ Microbiol 68(12):6292-6299.
4. Luyten YA, Thompson JR, Morrill W, Polz MF, & Distel DL (2006) Extensive variation in intracellular symbiont community composition among members of a single population of the wood-boring bivalve Lyrodus pedicellatus (Bivalvia: Teredinidae). Appl Environ Microbiol 72(1):412-417.
5. Waterbury JB, Calloway CB, & Turner RD (1983) A cellulolytic nitrogen-fixing bacterium cultured from the gland of Deshayes in shipworms (Bivalvia: Teredinidae). Science 221(4618):1401-1403.
6. Ekborg NA, Morrill W, Burgoyne AM, Li L, & Distel DL (2007) CelAB, a multifunctional cellulase encoded by Teredinibacter turnerae T7902T, a culturable symbiont isolated from the wood-boring marine bivalve Lyrodus pedicellatus. Appl Environ Microbiol 73(23):7785-7788.
7. Distel DL, et al. (2017) Discovery of chemoautotrophic symbiosis in the giant shipworm Kuphus polythalamia (Bivalvia: Teredinidae) extends wooden-steps theory. Proc Natl Acad Sci U S A 114(18):E3652-E3658.
8. Betcher MA, et al. (2012) Microbial distribution and abundance in the digestive system of five shipworm species (Bivalvia: Teredinidae). PLoS One 7(9):e45309.
9. O'Connor RM, et al. (2014) Gill bacteria enable a novel digestive strategy in a wood-feeding mollusk. Proc Natl Acad Sci U S A 111(47):E5096-5104.
10. Lechene CP, Luyten Y, McMahon G, & Distel DL (2007) Quantitative imaging of nitrogen fixation by individual bacteria within animal cells. Science 317(5844):1563-1566.
11. Altamia MA, et al. (2014) Genetic differentiation among isolates of Teredinibacter turnerae, a widely occurring intracellular endosymbiont of shipworms. Mol Ecol 23(6):1418-1432.
12. Distel DL, Morrill W, MacLaren-Toussaint N, Franks D, & Waterbury J (2002) Teredinibacter turnerae gen. nov., sp. nov., a dinitrogen-fixing, cellulolytic, endosymbiotic gamma-proteobacterium isolated from the gills of wood-boring molluscs (Bivalvia: Teredinidae). Int J Syst Evol Microbiol 52(Pt 6):2261-2269.
13. Yang JC, et al. (2009) The complete genome of Teredinibacter turnerae T7901: an intracellular endosymbiont of marine wood-boring bivalves (shipworms). PLoS One 4(7):e6085.
14. Trindade-Silva AE, et al. (2009) Physiological traits of the symbiotic bacterium Teredinibacter turnerae isolated from the mangrove shipworm Neoteredo reynei. Genet Mol Biol 32(3):572-581.
15. Cimermancic P, et al. (2014) Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell 158(2):412-421.
16. Han AW, et al. (2013) Turnerbactin, a novel triscatecholate siderophore from the shipworm endosymbiont Teredinibacter turnerae T7901. PLoS One 8(10):e76151.
17. Elshahawi SI, et al. (2013) Boronated tartrolon antibiotic produced by symbiotic cellulose-degrading bacteria in shipworm gills. Proc Natl Acad Sci U S A 110(4):E295-304.
18. Voight JRR (2015) Xylotrophic bivalves: aspects of their biology and the impacts of humans. J. Molluscan Stud. 81:175-186.
19. Lopes SGBC, Domanseschi O, de Moraes DT, Morita M, & Meserani GDLC (2000) Functional anatomy of the digestive system of Neoteredo reynei (Bartsch, 1920) and Psiloteredo healdi (Bartsch, 1931) (Bivalvia: Teredinidae). The Evolutionary Biology of
the Bivalvia, eds Harper EM, Taylor JD, & Crame JA (Geological Society, London), Vol 177, pp 257-271.
20. Filho CS, Tagliaro CH, & Beasley CR (2008) Seasonal abundance of the shipworm Neoteredo reynei (Bivalvia, Teredinidae) in mangrove driftwood from a northern Brazilian beach. Iheringia. Série Zoologia 98:17-23.
21. Altamia MA, Shipway JR, Concepcion GP, Haygood MG, & Distel DL (2019) Thiosocius teredinicola gen. nov., sp. nov., a sulfur-oxidizing chemolithoautotrophic endosymbiont cultivated from the gills of the giant shipworm, Kuphus polythalamius. Int J Syst Evol Microbiol 69(3):638-644.
22. Shipway JR, et al. (2019) A rock-boring and rock-ingesting freshwater bivalve (shipworm) from the Philippines. Proc Biol Sci 286(1905):20190434.
23. Shipway JR, et al. (2016) Zachsia zenkewitschi (Teredinidae), a Rare and Unusual Seagrass Boring Bivalve Revisited and Redescribed. PLoS One 11(5):e0155269.
24. Elshahawi SI (2012) Isolation and biosynthesis of bioactive natural products produced by marine symbionts. PhD (Oregon Health & Science University, Portland).
25. Brito TL, et al. (2018) The gill-associated microbiome is the main source of wood plant polysaccharide hydrolases and secondary metabolite gene clusters in the mangrove shipworm Neoteredo reynei. PLoS One 13(11):e0200437.
26. Ling SK, Xia J, Liu Y, Chen GJ, & Du ZJ (2017) Agarilytica rhodophyticola gen. nov., sp. nov., isolated from Gracilaria blodgettii. Int J Syst Evol Microbiol 67(10):3778-3783.
27. Blin K, et al. (2017) antiSMASH 4.0: improvements in chemistry prediction and gene cluster boundary identification. Nucleic Acids Res 45(W1):W36-W41.
28. Medema MH, Takano E, & Breitling R (2013) Detecting sequence homology at the gene cluster level with MultiGeneBlast. Mol Biol Evol 30(5):1218-1223.
29. Adamek M, Spohn M, Stegmann E, & Ziemert N (2017) Mining Bacterial Genomes for Secondary Metabolite Gene Clusters. Methods Mol Biol 1520:23-47.
30. Kinscherf TG & Willis DK (2005) The biosynthetic gene cluster for the beta-lactam antibiotic tabtoxin in Pseudomonas syringae. J Antibiot (Tokyo) 58(12):817-821.
31. Kinscherf TG, Coleman RH, Barta TM, & Willis DK (1991) Cloning and expression of the tabtoxin biosynthetic region from Pseudomonas syringae. J Bacteriol 173(13):4124-4132.
32. Sinden SL & Durbin RD (1968) Glutamine synthetase inhibition: possible mode of action of wildfire toxin from Pseudomonas tabaci. Nature 219(5152):379-380.
33. Turner JG & Debbage JM (1982) Tabtoxin-induced symptoms are associated with the accumulation of ammonia formed during photorespiration. Physiol. Plant Pathol. 20:223-233.
34. Graf J & Ruby EG (2000) Novel effects of a transposon insertion in the Vibrio fischeri glnD gene: defects in iron uptake and symbiotic persistence in addition to nitrogen utilization. Mol Microbiol 37(1):168-179.
35. Holden VI & Bachman MA (2015) Diverging roles of bacterial siderophores during infection. Metallomics 7(6):986-995.
36. Irschik H, Schummer D, Gerth K, Hofle G, & Reichenbach H (1995) The tartrolons, new boron-containing antibiotics from a myxobacterium, Sorangium cellulosum. J Antibiot (Tokyo) 48(1):26-30.
37. O'Connor R & Schmidt EW (2018) PCT WO2018106966A1.
38. Schmidt EW (2008) Trading molecules and tracking targets in symbiotic interactions. Nat Chem Biol 4(8):466-473.
39. Molinski TF, Dalisay DS, Lievens SL, & Saludes JP (2009) Drug development from marine natural products. Nat Rev Drug Discov 8(1):69-85.
40. Morita M & Schmidt EW (2018) Parallel lives of symbionts and hosts: chemical mutualism in marine animals. Nat Prod Rep 35(4):357-378.
41. Piel J (2009) Metabolites from symbiotic bacteria. Nat Prod Rep 26(3):338-362.
42. Donia MS, et al. (2014) A systematic analysis of biosynthetic gene clusters in the human microbiome reveals a common family of antibiotics. Cell 158(6):1402-1414.
43. Popham JD & Dickson MR (1973) Bacterial associations in the teredo Bankia australis (Lamellibranchia: Mollusca). Mar. Biol. 19:338-340.
44. Kulkarni G, et al. (2015) Specific hopanoid classes differentially affect free-living and symbiotic states of Bradyrhizobium diazoefficiens. MBio 6(5):e01251-01215.
45. Nguyen T, et al. (2008) Exploiting the mosaic structure of trans-acyltransferase polyketide synthases for natural product discovery and pathway dissection. Nat Biotechnol 26(2):225-233.
46. Partida-Martinez LP & Hertweck C (2005) Pathogenic fungus harbours endosymbiotic bacteria for toxin production. Nature 437(7060):884-888.
47. Kursar TA, et al. (2009) The evolution of antiherbivore defenses and their contribution to species coexistence in the tropical tree genus Inga. Proc Natl Acad Sci U S A 106(43):18073-18078.
48. Costa-Lotufo LV, et al. (2018) Chemical profiling of two congeneric sea mat corals along the Brazilian coast: adaptive and functional patterns. Chem Commun (Camb) 54(16):1952-1955.
49. Garcia GD, et al. (2013) Metagenomic analysis of healthy and white plague-affected Mussismilia braziliensis corals. Microb Ecol 65(4):1076-1086.
50. Joshi NA & Fass JN (2011) Sickle: A sliding-window, adaptive, quality-based trimming tool for FASTQ files (Version 1.33).
51. Peng Y, Leung HC, Yiu SM, & Chin FY (2012) IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28(11):1420-1428.
52. Bankevich A, et al. (2012) SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19(5):455-477.
53. Nurk S, et al. (2013) Assembling Single-Cell Genomes and Mini-Metagenomes From Chimeric MDA Products. Journal of Computational Biology 20(10):714-737.
54. Bushnell B, Rood J, & Singer E (2017) BBMerge - Accurate paired shotgun read merging via overlap. PLoS One 12(10):e0185056.
55. Lo CC & Chain PS (2014) Rapid evaluation and quality control of next generation sequencing data with FaQCs. BMC Bioinformatics 15:366.
56. Wang Y, Leung H, Yiu S, & Chin F (2014) MetaCluster-TA: taxonomic annotation for metagenomic data based on assembly-assisted binning. BMC Genomics 15 Suppl 1:S12.
57. Li H, et al. (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078-2079.
58. Li H (2011) A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27(21):2987-2993.
59. Hyatt D, et al. (2010) Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119.
60. Buchfink B, Xie C, & Huson DH (2015) Fast and sensitive protein alignment using DIAMOND. Nat Methods 12(1):59-60.
61. Varghese NJ, et al. (2015) Microbial species delineation using whole genome sequences. Nucleic Acids Research 43(14):6761-6771.
62. Tanizawa Y, Fujisawa T, & Nakamura Y (2018) DFAST: a flexible prokaryotic genome annotation pipeline for faster genome publication. Bioinformatics 34(6):1037-1039.
63. Miller IJ, et al. (2019) Autometa: automated extraction of microbial genomes from individual shotgun metagenomes. Nucleic Acids Res 47(10):e57.
64. Lin Z, Kakule TB, Reilly CA, Beyhan S, & Schmidt EW (2019) Secondary Metabolites of Onygenales Fungi Exemplified by Aioliomyces pyridodomos. J Nat Prod 82(6):1616-1626.
65. Shannon P, et al. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13(11):2498-2504.
Figure 1. Top, diagram of generic shipworm anatomy. Insets are from Betcher et al., PLoS One, 2012, Figure 2, panels B and D, and are equal magnification (8). Red: signal from a fluorescent universal bacterial probe indicating large numbers of bacterial symbionts in the bacteriocytes of the gill, and paucity of bacteria in the cecum. Green is background fluorescence. Bottom, collection locations of specimens included in this study.
Figure 2. Cultivated bacterial isolates represent the major shipworm gill symbionts. A) Isolated bacteria analyzed in this study are shown in an abstracted schematic of a 16S rRNA phylogenetic tree. The complete tree with accurate branch lengths and bootstrap numbers is shown in Figure S1. T. turnerae, within Group 1, comprised 11 sequenced strains; for other groups, individual strains are shown. Each color indicates different bacteria appearing in the metagenomes in B. B) Species composition of the shipworm gill symbiont community based on shotgun metagenome sequence analysis. The y-axis indicates the percent of reads originating from each bacterial species, while the x-axis indicates individual shipworm specimens used in the study. Colors indicate the origin of bacterial reads; gray is minor, sporadic, unidentified strains. For more details, see Figure S1.
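The species composition in panel B rests on the bin-mapping step described in Methods (AF >0.6 and gANI >0.95 against the isolate genomes). A minimal sketch of that bookkeeping follows. It assumes gANI and AF values have already been computed elsewhere; the function names, the tie-breaking rule (highest gANI wins), and the choice to compute percentages over reads in bacterial bins are illustrative assumptions, not the authors' code.

```python
from collections import defaultdict

# Cut-offs quoted in Methods: AF > 0.6 and gANI > 0.95.
AF_CUTOFF = 0.6
GANI_CUTOFF = 0.95


def assign_bins(comparisons):
    """Map each bin to an isolate genome that passes both cut-offs.

    comparisons: iterable of (bin_id, isolate_id, gani, af) tuples.
    If several isolates pass for one bin, the highest gANI is kept
    (a tie-breaking assumption, not stated in the source).
    """
    best = {}
    for bin_id, isolate_id, gani, af in comparisons:
        if af > AF_CUTOFF and gani > GANI_CUTOFF:
            if bin_id not in best or gani > best[bin_id][1]:
                best[bin_id] = (isolate_id, gani)
    return {bin_id: isolate for bin_id, (isolate, _) in best.items()}


def read_percentages(bin_read_counts, bin_to_isolate):
    """Percent of reads per mega-bin; unmapped bins are pooled as 'unidentified'."""
    totals = defaultdict(int)
    for bin_id, n_reads in bin_read_counts.items():
        totals[bin_to_isolate.get(bin_id, "unidentified")] += n_reads
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: 100.0 * n / grand_total for name, n in totals.items()}
```

With hypothetical inputs, two bins that both map to T7901 are pooled into one mega-bin, while bins that fail either cut-off fall into the gray "unidentified" category of the plot.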
Figure 3. Most BGCs found in the metagenomes and in the bacterial isolate genomes are shared. 401 BGCs from metagenome sequences were compared to the bacterial isolate genomes, of which 305 could be found in isolates. Conversely, 148 of 168 BGCs from sequenced bacterial isolates could be found in the metagenomes. The shared numbers likely differ because the contigs assembled from the metagenome sequences were shorter on average, so that several metagenome fragments may map to a single BGC in an isolate.
Figure 4. GCFs found in A) bacterial genomes and B) gill metagenomes. A) A list of GCFs found in cultivated bacterial genomes is provided on the x-axis, while the number of times that the GCF occurs in different sequenced strains is shown on the y-axis. Colors indicate bacteria from Figure 2A. Because there are 11 isolates of T. turnerae in Group 1, the GCFs in this group (dark blue bars) are comparatively overrepresented in the diagram. B) GCFs (x-axis) found in each metagenome (y-axis) are shown. The inset expands a region containing the most common GCFs found in our specimens. Colors indicate shipworm host species. See Table S3 for a complete list of GCFs used in this figure.
Figure 5. A possible tabtoxin pathway is found in the D. mannii metagenome. Tabtoxin is a phytotoxic β-lactam initially discovered in Pseudomonas spp. (top). Strain 2719K contained a tabtoxin-like cluster that was pseudogenized (shown as an insertion in tabB; middle). A non-pseudogenized tabtoxin-like cluster was found in the D. mannii gill metagenome (bottom), supporting the observation that multiple variants of each symbiont genome are represented in each metagenome.
Figure 6. GCF distribution across shipworm species. Shown is a similarity network diagram, in which circles indicate individual BGCs from sequenced isolates (gray) and gill metagenomes (colors indicate species of origin; see legend). Lines indicate the MultiGeneBlast scores between identified BGCs, with thinner lines indicating a lower degree of similarity. For example, the cluster labeled "GCF_8" encodes the siderophore turnerbactin, the structure of which is shown at right. The main cluster, circled by a light blue oval, includes BGCs that are very similar to the originally described turnerbactin gene cluster. More distantly related BGCs, with fewer lines connecting them to the majority nodes in GCF_8, might represent other siderophores. The BGCs in GCF_11 likely all encode tartrolon D/E, a boronated polyketide shown at right. For detailed alignments of BGCs, see Fig. S3.
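The network in this figure is built from the MultiGeneBlast hits that survive the refinement rule given in Methods (protein hits with at least 60% identity and at least 80% coverage of both query and subject; a single-directional hit is kept when at least n-2 proteins of the truncated BGC pass). The sketch below illustrates that rule and one simple way to read families off the resulting network. Treating each connected component as a GCF is an assumption made here for illustration; the source describes a Cytoscape network followed by manual checking. All function names are hypothetical.

```python
def valid_protein_hit(identity, query_cov, subject_cov):
    """blastp filter from Methods: >=60% identity, >=80% coverage of both sequences."""
    return identity >= 60.0 and query_cov >= 80.0 and subject_cov >= 80.0


def keep_single_directional_hit(n_proteins_truncated, protein_hits):
    """Retain a one-way BGC-to-BGC hit if at least n-2 proteins pass the filter.

    protein_hits: dict mapping each protein ID of the truncated BGC to a list
    of (identity, query_cov, subject_cov) tuples against the full-length BGC.
    Note that for n <= 2 the rule is satisfied trivially.
    """
    passing = sum(
        1 for hits in protein_hits.values()
        if any(valid_protein_hit(*h) for h in hits)
    )
    return passing >= n_proteins_truncated - 2


def group_into_gcfs(bgc_ids, edges):
    """Connected components of the similarity network, via union-find.

    Assumption: one component = one GCF (no manual curation step here).
    """
    parent = {b: b for b in bgc_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for b in bgc_ids:
        groups.setdefault(find(b), set()).add(b)
    return sorted(groups.values(), key=lambda g: sorted(g))
```

A loosely connected node, such as the distant members of GCF_8 mentioned above, still lands in the same component under this scheme, which is one reason a manual check of sparsely linked families is useful.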
Figure 7. Pattern of occurrence of the most frequently observed GCFs in isolates and metagenomes. The values in each box indicate the BGC occurrence per specimen for each GCF. For example, GCF_5 occurs in two out of three Teredo sp. specimens. When the number is greater than 1, as found with GCF_3, this results from having more than one of the gene cluster types present in the metagenome (see Fig. 8). When the number equals 1, the GCF is found in all specimens sampled. These data were compiled from analysis of GCFs in individual bacterial strains and samples (see Fig. S4).
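The per-box values described in this legend can be read as the number of BGCs of a given GCF divided by the number of specimens sampled for that host species. The following is a minimal sketch under that reading; the record format, specimen IDs, and function name are hypothetical and only illustrate the arithmetic.

```python
from collections import defaultdict


def occurrence_per_specimen(bgc_records, specimens_by_species):
    """BGC occurrence per specimen for each (host species, GCF) pair.

    bgc_records: iterable of (specimen_id, gcf_id), one record per BGC found.
    specimens_by_species: dict of host species -> list of specimen IDs sampled.
    """
    species_of = {
        specimen: species
        for species, ids in specimens_by_species.items()
        for specimen in ids
    }
    counts = defaultdict(int)
    for specimen_id, gcf_id in bgc_records:
        counts[(species_of[specimen_id], gcf_id)] += 1
    return {
        (species, gcf): n / len(specimens_by_species[species])
        for (species, gcf), n in counts.items()
    }
```

For instance, a GCF found once in two of three specimens gives 2/3, while a GCF present as two cluster types in every specimen gives 2, matching the greater-than-1 case described for GCF_3.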
Figure 8. Three types of GCF_3 gene clusters are distributed in all cellulolytic shipworms in this study. tBLASTx was used to compare the clusters, demonstrating the presence of three closely related GCF_3 gene families found in all cellulolytic shipworm gills.
Supporting Information.

Supporting Tables: See attached docx files for Tables S1-S3.

Figure S1. Phylogeny of shipworm gill symbionts and related free-living bacteria based on an approximate maximum-likelihood tree of 16S rRNA sequences. The tree was reconstructed using 1,125 nucleotide positions employing the GTR substitution model in FastTree version 2.1.11 with optimized Gamma20 likelihood and rate categories per site set to 20. Support values are indicated for each node. The scale bar represents nucleotide substitution rate per site. Cultivatable shipworm symbionts and related bacteria are in boldface. The excerpted version of this tree is shown in Figure 2A.
[Figure S1 tree image not reproduced. Clade labels: Order Cellvibrionales, Order Chromatiales, Outgroup. Scale bar: 0.050. Node support values are omitted. Tip labels as extracted (order may not match the tree):]
T. turnerae PMS-991H.S.0a.06 from Lyrodus pedicellatus
T. turnerae T8513 from Teredo navalis (KF959891)
T. turnerae PMS-1133Y.S.0a.04 from Lyrodus sp.
T. turnerae PMS-1675L.S.0a.01 from Kuphus polythalamius
T. turnerae T0609 from Lyrodus pedicellatus (EU604079)
T. turnerae T7902 from Lyrodus pedicellatus (NR_027564)
T. turnerae T8402 from Teredora malleolus (KF959886)
T. turnerae T8412 from Lyrodus bipartitus (KF959887)
T. turnerae T7901 from Bankia gouldi (EU604078)
T. turnerae T8415 from Bankia gouldi (KF959888)
T. turnerae T8602 from Dicyathifer mannii (EU604077)
PMS-1120W.S.0a.04 from Teredo fulleri
PMS-2753L.S.0a.02 from Infanta Bactronophorus thoracites
PMS-2052S.S.stab0a.01 from Butuan Bactronophorus thoracites
Bsc2 from Bankia setacea (KJ836296)
OTU 07 from Bankia setacea (KJ836286)
OTU 11 from Bankia setacea (KJ836290)
OTU 06 from Bankia setacea (KJ836285)
OTU 10 from Bankia setacea (KJ836289)
Bs12 from Bankia setacea (KJ836295)
Bs08 from Bankia setacea (KJ836294)
Bs31 from Bankia setacea
Bs02 from Bankia setacea (KJ836293)
OTU 09 from Bankia setacea (KJ836288)
OTU 13 from Bankia setacea (KJ836292)
OTU 15 from Bankia setacea (KJ836284)
OTU 12 from Bankia setacea (KJ836291)
OTU 08 from Bankia setacea (KJ836287)
Endosymbiont RT17 of Lyrodus pedicellatus (DQ272304)
Endosymbiont RT18 of Lyrodus pedicellatus (DQ272313)
Symbiont LP3 of Lyrodus pedicellatus (AY150578)
Endosymbiont RT14 of Lyrodus pedicellatus (DQ272315)
Endosymbiont RT24 of Lyrodus pedicellatus (DQ272312)
Symbiont LP1 of Lyrodus pedicellatus (AY150183)
Endosymbiont RT20 of Lyrodus pedicellatus (DQ272307)
PMS-1162T.S.0a.05 from Lyrodus sp.
Agarilytica rhodophyticola 017 (KR610527)
PMS-1081L.S.0a.03 from Bankia sp.
Symbiont LP2 of Lyrodus pedicellatus (AY150184)
Saccharophagus degradans 2-40 (AF055269)
Cellvibrio japonicus NCIMB 10462 (AF452103)
Cellvibrio mixtus ACM 2601 (AF448515)
Sedimenticola thiotaurini SIP-G1 (JN882289)
Sedimenticola selenatireducens AK4OH1 (AF432145)
Thiosocius sp. PMS-2719K.STB50.0a.01 from Dicyathifer mannii
Thiosocius teredinicola PMS-2141T.STBD.0c.01a from Kuphus polythalamius (KY643661)
Endosymbiont Alviniconcha sp. Lau Basin (AB235229)
Endosymbiont of scaly-foot snail (AP012978)
Sulfur-oxidizing bacterium ODIII6 (AF170422)
Candidatus Thiobios zoothamnicoli (EU439003)
Ectosymbiont Zoothamnium niveum (AB544415)
Acidithiobacillus ferrooxidans ATCC 23270 (NC_011761)
Figure S2. Strain variation in shipworm gill symbiont bacterial species. This figure was made as previously reported for Kuphus symbionts (7), using DNA gyrase B in 10 bp frames and examining SNP variation. Different colors indicate reads with different SNPs along the gyrase sequence. The specific example shown is from the gill of B. thoracites specimen 2771. The results indicate that, even though Cellvibrionales 2753L is the major species present in B. thoracites 2771, there are at minimum 2 major and 4 minor strain variants related to 2753L. The y-axis represents the number of reads observed, while the x-axis indicates each 10 bp region.
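The frame-by-frame tally behind a plot like this one can be sketched as follows. The exact procedure is the one reported in reference 7; the version here is only an illustration that assumes reads are already aligned to the gyrase B reference without gaps, and the function names and the `min_reads` noise threshold are hypothetical.

```python
from collections import Counter

FRAME = 10  # frame width in bp, as in the legend


def frame_variants(aligned_reads, ref_length, frame=FRAME):
    """Count the distinct sequences seen in each frame of the reference.

    aligned_reads: iterable of (start, sequence) pairs, where start is the
    0-based reference position of the first base and the read is assumed to
    align without gaps (a simplification).
    Returns one Counter per frame, mapping frame sequence -> number of reads.
    """
    n_frames = ref_length // frame
    tallies = [Counter() for _ in range(n_frames)]
    for start, seq in aligned_reads:
        end = start + len(seq)
        first = -(-start // frame)  # first fully covered frame (ceiling division)
        for i in range(first, n_frames):
            lo, hi = i * frame, (i + 1) * frame
            if hi > end:
                break
            tallies[i][seq[lo - start:hi - start]] += 1
    return tallies


def min_strain_count(tallies, min_reads=2):
    """Lower bound on strain variants: the most variants seen in any one frame."""
    return max(
        (sum(1 for n in t.values() if n >= min_reads) for t in tallies),
        default=0,
    )
```

Each Counter corresponds to one x-axis position, and its values correspond to the stacked read counts on the y-axis; the frame with the most well-supported variants gives a minimum estimate of the number of co-occurring strains.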
Figure S3. Representative alignments showing actual data underlying the clusters shown in Figures 3, 4, 6, and 7. A) Representative alignment of GCF_3 from genomes and metagenomes. Three subtypes are indicated by red, blue, and green colors; for example, the NR03 metagenome contains two copies of the blue subtype. DM2858G and DM2722G contain blue and red subtypes. B) Alignment of GCF_2. C) Alignment of GCF_5. D) Alignment of GCF_8.
Figure S4. Occurrence of GCFs in individual samples, expanding what is shown in Fig. 7. A) GCFs found in bacterial strains. B) GCFs from individual shipworm specimens.

A
[Panel image not reproduced. Strain labels: Ga0198945, BS12, BS02, BS08, 2052S, 1162T, 2719K, 2141T, BSC2, BS31, T8602, 991H, T7901, T0609, T8412, T8402, T8415, T7902, 1133Y, T8513, 1675L, 1120W, 2753L. GCF labels: GCF_8, GCF_3, GCF_11, GCF_9, GCF_1, GCF_4, GCF_5, GCF_2, GCF_22, GCF_33, GCF_119, GCF_17, GCF_25, GCF_77, GCF_31, GCF_14, GCF_6, GCF_10, GCF_30, GCF_16, GCF_12, GCF_13, GCF_122, GCF_117, GCF_76, GCF_79, GCF_74, GCF_106, GCF_105, GCF_66, GCF_58, GCF_60, GCF_120, GCF_113, GCF_51, GCF_49, GCF_50, GCF_35, GCF_80, GCF_114, GCF_75, GCF_34, GCF_52, GCF_27, GCF_115, GCF_24, GCF_73, GCF_7, GCF_116, GCF_20, GCF_23. Legend: group (group1 to group6). Color scale: 0 to 1.]
Table S3. List of GCFs found in this study.
GCF_1 | cf_fatty_acid-t1pks-nrps | GCF_62 | terpene
GCF_2 | bacteriocin-transatpks-t1pks-nrps | GCF_63 | terpene
GCF_3 | cf_fatty_acid-transatpks-t1pks-nrps | GCF_64 | terpene
GCF_4 | t1pks-cf_saccharide-nrps | GCF_65 | terpene
GCF_5 | terpene-arylpolyene | GCF_66 | t1pks
GCF_6 | transatpks-cf_saccharide-nrps | GCF_67 | t1pks
GCF_7 | nrps | GCF_68 | t1pks
GCF_8 | cf_fatty_acid-nrps_(turnerbactin) | GCF_69 | t1pks
GCF_9 | t1pks | GCF_70 | t1pks-PUFA
GCF_10 | hserlactone-transatpks-nrps | GCF_71 | t1pks-nrps
GCF_11 | transatpks_(tartrolon) | GCF_72 | t1pks-nrps
GCF_12 | transatpks-nrps | GCF_73 | t1pks-nrps
GCF_13 | t1pks-nrps | GCF_74 | t1pks-nrps
GCF_14 | siderophore | GCF_75 | t1pks-nrps
GCF_15 | transatpks-otherks | GCF_76 | t1pks-cf_saccharide-nrps
GCF_16 | t1pks-nrps | GCF_77 | t1pks-cf_saccharide-nrps
GCF_17 | nrps | GCF_78 | t1pks-cf_fatty_acid
GCF_18 | transatpks | GCF_79 | siderophore
GCF_19 | t1pks | GCF_80 | nrps-transatpks-otherks
GCF_20 | nrps | GCF_81 | nrps
GCF_21 | transatpks | GCF_82 | nrps
GCF_22 | t1pks | GCF_83 | nrps
GCF_23 | siderophore | GCF_84 | nrps
GCF_24 | nrps | GCF_85 | nrps
GCF_25 | nrps | GCF_86 | nrps
GCF_26 | transatpks | GCF_87 | nrps
GCF_27 | transatpks-otherks-nrps | GCF_88 | nrps
GCF_28 | nrps | GCF_89 | nrps
GCF_29 | nrps | GCF_90 | nrps
GCF_30 | nrps | GCF_91 | nrps
GCF_31 | nrps | GCF_92 | nrps
GCF_32 | transatpks | GCF_93 | nrps
GCF_33 | transatpks-t1pks-nrps | GCF_94 | nrps
GCF_34 | thiopeptide-hserlactone | GCF_95 | nrps
GCF_35 | t1pks-cf_saccharide-nrps | GCF_96 | nrps
GCF_36 | t1pks-nrps | GCF_97 | nrps
GCF_37 | t1pks-nrps | GCF_98 | nrps
GCF_38 | t1pks-cf_saccharide-nrps | GCF_99 | nrps
GCF_39 | nrps | GCF_100 | nrps
GCF_40 | nrps | GCF_101 | nrps
GCF_41 | nrps | GCF_102 | nrps
GCF_42 | nrps | GCF_103 | nrps
GCF_43 | nrps | GCF_104 | nrps
GCF_44 | nrps | GCF_105 | nrps
GCF_45 | nrps | GCF_106 | nrps
GCF_46 | nrps | GCF_107 | nrps
GCF_47 | nrps | GCF_108 | nrps
GCF_48 | nrps | GCF_109 | nrps
GCF_49 | nrps | GCF_110 | nrps
GCF_50 | nrps | GCF_111 | nrps
GCF_51 | hserlactone-t1pks-nrps | GCF_112 | nrps
GCF_52 | cf_saccharide-nrps | GCF_113 | nrps
GCF_53 | transatpks | GCF_114 | nrps
GCF_54 | transatpks | GCF_115 | nrps
GCF_55 | transatpks | GCF_116 | nrps
GCF_56 | transatpks | GCF_117 | hserlactone-transatpks-cf_fatty_acid
GCF_57 | transatpks | GCF_118 | hserlactone-nrps
GCF_58 | transatpks-t1pks-nrps | GCF_119 | cf_saccharide-nrps
GCF_59 | transatpks-otherks | GCF_120 | cf_fatty_acid-t1pks
GCF_60 | transatpks-cf_saccharide | GCF_121 | bacteriocin-lantipeptide
GCF_61 | transatpks-cf_fatty_acid | GCF_122 | arylpolyene-nrps_(butunamide)
Table S1. Shipworm gill metagenomes used in this study.
# | Gill metagenome | PMS-ICBG sample codes | Source shipworm species | Location | Coordinates | Sequencing center | Sequencing platform | Assembler | Reads, post trim | Size in bp | No. of contigs | N50 | %GC | IMG Genome ID
1 | DM2722G | PMS-2722P | Dicyathifer mannii specimen PMS-2717Y | Infanta, Quezon, Philippines | N14.68367°, E121.63690° | Huntsman Cancer Institute, University of Utah | Illumina HiSeq 2000 | IDBA_ud | 187291588 | 1235295176 | 924064 | 2095 | 34.9 |
2 | BT2771G | PMS-2771X | Bactronophorus thoracites specimen PMS-2769U | Infanta, Quezon, Philippines | N14.68367°, E121.63690° | Huntsman Cancer Institute, University of Utah | Illumina HiSeq 2000 | IDBA_ud | 177392546 | 1056707310 | 813604 | 2023 | 35.4 |
3 | BT2849G | PMS-2849Y | Bactronophorus thoracites specimen PMS-2839H | Infanta, Quezon, Philippines | N14.68367°, E121.63690° | Huntsman Cancer Institute, University of Utah | Illumina HiSeq 2000 | IDBA_ud | 193099534 | 1059162705 | 814617 | 2024 | 35.4 |
4 | DM2858G | PMS-2858W | Dicyathifer mannii specimen PMS-2823T | Infanta, Quezon, Philippines | N14.68367°, E121.63690° | Huntsman Cancer Institute, University of Utah | Illumina HiSeq 2000 | IDBA_ud | 186697500 | 1236681788 | 928980 | 2083 | 34.9 |
5 | DM3770G | PMS-3770U | Dicyathifer mannii specimen PMS-3768S | Infanta, Quezon, Philippines | N14.68367°, E121.63690° | Huntsman Cancer Institute, University of Utah | Illumina HiSeq 2000 | IDBA_ud | 297553066 | 1328488478 | 1067922 | 1946 | 34.9 |
6 | BT3790G | PMS-3790S | Bactronophorus thoracites specimen PMS-3779S | Infanta, Quezon, Philippines | N14.68367°, E121.63690° | Huntsman Cancer Institute, University of Utah | Illumina HiSeq 2000 | IDBA_ud | 309554332 | 1127330840 | 927279 | 1873 | 35.4 |
10 | KP3700G | PMS-3700M | Kuphus sp. specimen PMS-3696Y (wood-boring) | Mabini, Batangas, Philippines | N13.75843°, E120.92586° | Huntsman Cancer Institute, University of Utah | Illumina HiSeq 2000 | IDBA_ud | 82015762 | 734092095 | 358482 | 4300 | 37.6 |
11 | KP2132G | PMS-2246K and PMS-2249P | Kuphus polythalamius specimen PMS-2132W (mud-dwelling) | Kalamansig, Sultan Kudarat, Philippines | N6.53631°, E124.048365° | Huntsman Cancer Institute, University of Utah | Illumina HiSeq 2000 | IDBA_ud | 318294870 | 772720664 | 424816 | 4530 | 37.6 |
12 | KP2133G | PMS-2157H, PMS-2116M, and PMS-2110W | Kuphus polythalamius specimen PMS-2133X (mud-dwelling) | Kalamansig, Sultan Kudarat, Philippines | N6.53631°, E124.048365° | Huntsman Cancer Institute, University of Utah | Illumina HiSeq 2000 | IDBA_ud | 329174268 | 795400237 | 500141 | 3879 | 37.4 |
13 | BSG1 | - | Bankia setacea | Puget Sound, Washington, USA | N47.85072°, W122.33843° | Joint Genome Institute - Department of Energy | Illumina HiSeq 2000 | SOAPdenovo, Newbler, and Minimus2 | - | 563042012 | 761912 | 985 | 35.0 | 3300000111
14 | BSG3 | - | Bankia setacea | Puget Sound, Washington, USA | N47.957498°, W122.529373° | Joint Genome Institute - Department of Energy | Illumina HiSeq 2000 | SOAPdenovo, Newbler, and Minimus2 | - | 620222960 | 648493 | 1550 | 34.9 | 3300000024
15 | BSG2 | - | Bankia setacea | Puget Sound, Washington, USA | N47.957498°, W122.529373° | Joint Genome Institute - Department of Energy | Illumina HiSeq 2000 | SOAPdenovo, Newbler, and Minimus2 | - | 540217764 | 793976 | 860 | 34.8 | 3300000110
17 | BSG4 | - | Bankia setacea | Puget Sound, Washington, USA | N47.85072°, W122.33843° | Joint Genome Institute - Department of Energy | Illumina HiSeq 2000 | SOAPdenovo, Newbler, and Minimus2 | - | 574332630 | 692986 | 1194 | 34.6 | 3300000107
19 | BS_sunk | - | Bankia setacea | Puget Sound, Washington, USA | N47.85072°, W122.33843° | Joint Genome Institute - Department of Energy | Illumina, 454 GS FLX Titanium | Newbler and Velvet | - | 26539887 | 38227 | 1943 | 45.2 | 2070309010
20 | NR01 | - | Neoteredo reynei | Coroa grande Mangrove - Sepetiba bay, Rio de Janeiro State, BR | 22.9081670° S, 43.8756390° W | CEGENBIO | Illumina MiSeq | SPAdes | 9224156 | 313630826 | 413893 | 779 | 37.3 |
21 | NR02 | - | Neoteredo reynei | Coroa grande Mangrove - Sepetiba bay, Rio de Janeiro State, BR | 22.9081670° S, 43.8756390° W | CEGENBIO | Illumina MiSeq | SPAdes | 18338062 | 416566737 | 468503 | 986 | 37.2 |
22 | NR03 | - | Neoteredo reynei | Coroa grande Mangrove - Sepetiba bay, Rio de Janeiro State, BR | 22.9081670° S, 43.8756390° W | CEGENBIO | Illumina MiSeq | SPAdes | 13078802 | 309408486 | 414159 | 769 | 38.2 |
23 | TBF02 | - | Teredo sp. | Environmental Preservation Area of Pacoti river, Ceará State, Brazil | S3.843111, W38.422695 (3°50'35.2"S 38°25'21.7"W) | CEGENBIO | Illumina MiSeq | SPAdes | 3565711 | 74108472 | 236018 | 1037 | 40.1 |
24 | TBF03 | - | Bankia sp. | Environmental Preservation Area of Pacoti river, Ceará State, Brazil | S3.843111, W38.422695 (3°50'35.2"S 38°25'21.7"W) | CEGENBIO | Illumina MiSeq | SPAdes | 2205607 | 33524312 | 75538 | 1123 | 41.2 |
25 | TBF05 | - | Bankia sp. | Environmental Preservation Area of Pacoti river, Ceará State, Brazil | S3.843111, W38.422695 (3°50'35.2"S 38°25'21.7"W) | CEGENBIO | Illumina MiSeq | SPAdes | 3632367 | 107179837 | 230494 | 995 | 37.4 |
26 | TBF07 | - | Teredo sp. | Environmental Preservation Area of Pacoti river, Ceará State, Brazil | S3.843111, W38.422695 (3°50'35.2"S 38°25'21.7"W) | CEGENBIO | Illumina MiSeq | SPAdes | 3731031 | 78684542 | 258368 | 965 | 38.6 |
27 | TBF09 | - | Teredo sp. | Environmental Preservation Area of Pacoti river, Ceará State, Brazil | S3.843111, W38.422695 (3°50'35.2"S 38°25'21.7"W) | CEGENBIO | Illumina MiSeq | SPAdes | 4029653 | 108441874 | 340072 | 948 | 38.1 |
Table S2: Shipworm symbiont genomes.
# | Code in the manuscript | Isolate name | Metabolic type | Host shipworm | Location | Coordinates | Sequencing center | Sequencing platform | Sequence assembler | Estimated genome size | No. of contigs/scaffolds | N50 | %GC | IMG Genome ID
1 | T7901 | T. turnerae strain T7901 | Cellulolytic | Bankia gouldi | Beaufort, North Carolina, USA | N34.71737°, W76.67198° | J. Craig Venter Institute | 454, Sanger | Celera Assembler and custom software | 5,193,164 | 1 (closed circular) | Not applicable | 50.89 | 2541046951
2 | T8415 | T. turnerae strain T8415 | Cellulolytic | Bankia gouldi | Fort Pierce, Florida, USA | N27.48063°, W80.30967° | JGI-DOE | Illumina | ALLPATHS | 5,158,349 | 50 | Scaffold N/L50: 5/398.1 kbp; Contig N/L50: 6/395.4 kbp | 50.78 | 2510917000
3 | T8602 | T. turnerae strain T8602 | Cellulolytic | Dicyathifer mannii | Townsville, Queensland, Australia | S19.27631°, E147.05784° | JGI-DOE | Illumina | ALLPATHS | 5,097,488 | 59 | Scaffold N/L50: 6/291.7 kbp; Contig N/L50: 2/291.7 kbp | 51.03 | 2513237135
4 | T7902 | T. turnerae strain T7902 | Cellulolytic | Lyrodus pedicellatus | Long Beach, California, USA | N33.76138°, W118.17281° | JGI-DOE | Illumina | ALLPATHS | 5,387,817 | 72 | Scaffold N/L50: 11/176.4 kbp; Contig N/L50: 11/176.4 kbp | 50.81 | 2513237099
5 | T8402 | T. turnerae strain T8402 | Cellulolytic | Teredora malleolus | Floating wood in the Atlantic Ocean | N38.30667°, W69.59333° | JGI-DOE | Illumina | Velvet (1.1.04) and ALLPATHS-LG | 5,166,130 | 27 | Scaffold N/L50: 6/348.4 kbp; Contig N/L50: 7/315.4 kbp | 50.86 | 2519899652
6 | T8412 | T. turnerae strain T8412 | Cellulolytic | Lyrodus bipartitus | Jim Island, Fort Pierce, Florida, USA | N27.476944°, W80.311944° | JGI-DOE | Illumina | Velvet (1.1.04) and ALLPATHS-LG | 5,147,360 | 58 | Scaffold N/L50: 10/205.3 kbp; Contig N/L50: 10/205.3 kbp | 51.07 | 2519899664
7 | T0609 | T. turnerae strain T0609 | Cellulolytic | Lyrodus pedicellatus | Long Beach, California, USA | N33.76138°, W118.17281° | JGI-DOE | Illumina | Velvet (1.1.04) and ALLPATHS-LG | 5,069,061 | 49 | Scaffold N/L50: 7/246.6 kbp; Contig N/L50: 7/246.6 kbp | 51.15 | 2519899663
8 | 991H | T. turnerae strain PMS-991H.S.0a.06 | Cellulolytic | Lyrodus pedicellatus specimen PMS-988W | Panglao, Bohol, Philippines | N9.54558°, E123.76030° | JGI-DOE | Illumina | ALLPATHS-LG | 5,279,031 | 13 | Scaffold N/L50: 2/1.8 Mbp; Contig N/L50: 3/888.4 kbp | 51.07 | 2524614873
9 | T8513 | T. turnerae strain T8513 | Cellulolytic | Teredo navalis | São Paulo, Brazil | S23.81992°, W45.40517° | JGI-DOE | Illumina | Velvet (1.1.04) and ALLPATHS-LG | 5,268,281 | 84 | Scaffold N/L50: 9/189.8 kbp; Contig: 8/189.8 kbp | 50.92 | 2523533596
10 | 1133Y | T. turnerae strain PMS-1133Y.S.0a.04 | Cellulolytic | Lyrodus sp. specimen PMS-1128S | Panglao, Bohol, Philippines | N9.59670°, E123.74990° | JGI-DOE | Illumina | ALLPATHS-LG | 5,134,977 | 6 | Scaffold N/L50: 1/3.2 Mbp; Contig N/L50: 4/607.0 kbp | 50.85 | 2540341229
11 | 1675L | T. turnerae strain PMS-1675L.S.0a.01 | Cellulolytic | Kuphus polythalamius specimen PMS-1672Y | Kalamansig, Sultan Kudarat, Philippines | N6.53631°, E124.04836° | JGI-DOE | PacBio | HGAP 2.1.1 | 5,283,781 | 1 (closed circular) | Not applicable | 51.05 | 2571042908
12 | 2753L | PMS-27553L.S.0a.02 | Cellulolytic | Bactronophorus thoracites specimen PMS-2749X | Infanta, Quezon, Philippines | N14.68367°, E121.63690° | JGI-DOE | PacBio | HGAP 2.1.1 | 6,056,039 | 2 | Scaffold N/L50: 1/4.4 Mbp | 47.96 | 2579779156
13 | 1120W | PMS-1120W.S.0a.04 | Cellulolytic | Teredo fulleri specimen PMS-1114L | Panglao, Bohol, Philippines | N9.59670°, E123.74990° | JGI-DOE | PacBio | HGAP 2.0.1 | 5,699,307 | 1 (closed circular) | Not applicable | 50.39 | 2558309032
14 | 2052S | PMS-2052S.S.stab0a.01 | Cellulolytic | Bactronophorus thoracites specimen PMS-1959H | Butuan, Agusan del Norte, Philippines | N8.98650°, E125.45768° | JGI-DOE | Illumina | ALLPATHS-LG | 5,635,926 | 3 | Scaffold N/L50: 1/5.6 Mbp; Contig: 3/981.6 kbp | 54.68 | 2541046951
15 | Bs12 | Bs12 | Cellulolytic | Bankia setacea | Puget Sound, Washington, USA | N47.95749°, W122.52937° | JGI-DOE | PacBio | HGAP 2.0.0 | 4,921,245 | 3 | Contig: 1/4.7 Mbp | 45.72 | 2545555829
16 | Bs08 | Bs08 | Cellulolytic | Bankia setacea | Puget Sound, Washington, USA | N47.95749°, W122.52937° | JGI-DOE | Illumina | Velvet v. DEC-2010 | 4,814,259 | 90 | Scaffold N/L50: 7/255.3 Mbp; Contig: 14/112.2 kbp | 47.18 | 2767802764
17 | Bsc2 | Bsc2 | Cellulolytic | Bankia setacea | Puget Sound, Washington, USA | N47.95749°, W122.52937° | New England Biolabs | PacBio | HGAP 2.0.1 | 5,414,953 | 10 | 4.2 Mbp | 47.31 | 2531839719
18 | Bs31 | Bs31 | Cellulolytic | Bankia setacea | Puget Sound, Washington, USA | N47.95749°, W122.52937° | JGI-DOE | PacBio | Velvet 1.1.04 and ALLPATHS-LG | 5,017,353 | 46 | Scaffold N/L50: 5/341.1 kbp; Contig: 8/260.1 kbp | 47.60 | 2528768159
19 | Bs02 | Bs02 | Cellulolytic | Bankia setacea | Puget Sound, Washington, USA | N47.95749°, W122.52937° | JGI-DOE | Illumina | Velvet v. DEC-2010 | 3,886,134 | 141 | Contig: 8/176.2 kbp | 47.76 | 2503982003
20 | 1162T | PMS-1162T.S.0a.05 | Cellulolytic | Lyrodus sp. specimen PMS-1157K | Talibon, Bohol, Philippines | N10.30748°, E124.40168° | JGI-DOE | Illumina and PacBio | ALLPATHS-LG | 4,404,964 | 1 (closed circular) | Not applicable | 47.72 | 2524614822
21 | 1081L | PMS-1081L.S.0a.03 | Agarolytic | Bankia sp. specimen PMS-1083P | Panglao, Bohol, Philippines | N9.59670°, E123.74990° | JGI-DOE | PacBio | HGAP 2.1.1 | 4,255,513 | 13 | Scaffold N/L50: 568.3 kbp | 53.67 | 2574179784
22 | 2141T | Thiosocius teredinicola PMS-2141T.STBD.0c.01a | Sulfur-oxidizing | Kuphus polythalamius specimen PMS-2133X | Kalamansig, Sultan Kudarat, Philippines | N6.53631°, E124.048365° | JGI-DOE | PacBio | HGAP 2.0.1 | 4,790,451 | 1 (closed circular) | Not applicable | 60.08 | 2751185674
23 | 2719K | Thiosocius sp. PMS-2719K.STB50.0a.01 | Sulfur-oxidizing | Dicyathifer mannii specimen PMS-2715W | Infanta, Quezon, Philippines | N14.68367°, E121.63690° | JGI-DOE | PacBio | HGAP 2.0.1 | 5,077,565 | 1 (closed circular) | Not applicable | 58.55 | 2574179721
24 | Ga0198945 | Agarilytica rhodophyticola strain 017 | Agarolytic | Associated with the seaweed Gracilaria blodgettii | Lingshui County, Hainan, China | N18.40828°, E110.0623° | JGI-DOE | Illumina and PacBio | SOAPdenovo 2.04; Celera Assembler 8.0 | 6,878,829 | 1 (closed circular) | Not applicable | 40.97 | 2751185671