MongoDB Advantages Over Relational Databases


Modern applications demand high scalability and ease of development from their datastores. Relational databases were once the only option, but a host of NoSQL alternatives is now available. This whitepaper discusses MongoDB's advantages and migration considerations.

Introduction

The relational database has been the foundation of enterprise data management for over thirty years. But the way we build and run applications today, coupled with unrelenting growth in new data sources and growing user loads, is pushing relational databases beyond their limits. As a result, many applications are beginning to migrate to NoSQL databases such as MongoDB, either completely or by moving to a hybrid model that uses both an RDBMS and MongoDB to persist different parts of the same system. This whitepaper discusses how MongoDB, the leading NoSQL database, offers a new and better approach to persisting data than relational databases.

Table of Contents

    MongoDB advantages
    Relational Database challenges
    MongoDB Solution
        Agility
        Performance
        Scalability
        High Availability
    Detailed Explanation
        From rigid tables to flexible and dynamic BSON documents
        Application integration
        Atomicity in MongoDB
        Migration tools from relational to MongoDB
        MongoDB MMS
    Conclusion

MongoDB advantages

  • Dynamic schema design enables rapid development without the lengthy up-front table and foreign-key design required when using a relational datastore.
  • Horizontal scaling on commodity hardware: a production site can manage several TB in a single collection (the counterpart of a table) without being limited by the addition of new fields or by growth.
  • Sharding enables linear scale-out growth without running out of budget.
  • Replica sets are quick to set up and help meet regulatory requirements with an easy multi-data-center disaster recovery (DRP) and high availability (HA) solution.
  • High write throughput: the MongoDB architecture suits systems that must sustain a heavy insert load. Transactional guarantees are still available through findAndModify (which is slower) and application-level two-phase commit.
  • MongoDB supports full-text search indexes, so no additional search infrastructure needs to be deployed.
  • Developer-oriented queries let developers write elegant queries.
  • A variety of indexes is available. Especially useful is the built-in location index, which can be used to build geospatial applications.
  • Map/Reduce: built-in support is available if you need it.

Relational Database challenges

Relational databases face the following broad categories of challenges:

1. Data types: They are not designed to support unstructured, semi-structured or polymorphic data.

2. Volume of data: When it comes to supporting petabytes of data, trillions of records and/or millions of queries per second, RDBMSs struggle because of inherent design limitations.

3. Agile development: An RDBMS is ill-suited to iterative development, because the data model has to be frozen before development can start. If that sounds like the waterfall model, it is. Short development cycles and new workloads can also pose challenges in any project or product that uses an RDBMS.

4. New architecture: RDBMSs were designed 30 years ago and do not have horizontal scaling built in. They struggle to scale using commodity servers and cloud computing.

MongoDB Solution

Agility

Much of the data we use today has complex structures that can be modeled and represented more efficiently using JSON (JavaScript Object Notation) documents, rather than tables. MongoDB stores JSON documents in a binary representation called BSON (Binary JSON). BSON encoding extends the popular JSON representation to include additional data types such as int, long and floating point.

By contrast, trying to map the object representation of the data to the tabular representation of an RDBMS slows down development. Adding Object Relational Mappers (ORMs) can create additional complexity by reducing the flexibility to evolve schemas and to optimize queries to meet new application requirements.

Performance

MongoDB achieves better performance by using in-memory caching, better data locality and in-place updates.

Scalability

MongoDB uses sharding, a method for storing data across multiple machines, to support deployments with very large data sets and high-throughput operations.
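The idea of spreading one collection across several machines can be sketched in a few lines. This is a toy model only, assuming a stable hash of a shard key picks the machine; `ShardedCollection`, `shard_for` and the `user-N` keys are hypothetical names for illustration, and a real MongoDB cluster handles routing and data placement internally rather than in application code.

```python
import hashlib
from collections import defaultdict


def shard_for(shard_key: str, num_shards: int) -> int:
    """Map a shard-key value to a shard index using a stable hash."""
    digest = hashlib.md5(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedCollection:
    """Toy stand-in for a sharded collection: one dict of documents per shard."""

    def __init__(self, num_shards: int) -> None:
        self.num_shards = num_shards
        self.shards = defaultdict(dict)  # shard index -> {key: document}

    def insert(self, key: str, document: dict) -> int:
        """Store the document on the shard chosen by its key; return that shard."""
        index = shard_for(key, self.num_shards)
        self.shards[index][key] = document
        return index

    def find(self, key: str):
        """Look up by shard key, which touches exactly one shard."""
        return self.shards[shard_for(key, self.num_shards)].get(key)


# Hypothetical data: 1000 user documents spread over three shards.
users = ShardedCollection(num_shards=3)
for n in range(1000):
    users.insert(f"user-{n}", {"name": f"User {n}"})

sizes = [len(users.shards[i]) for i in range(users.num_shards)]
```

Because the hash is stable, a lookup by key goes straight to the one shard that holds the document, which is what lets capacity grow by adding machines.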

High Availability

Replica sets provide high availability through automatic failover. Failover allows a secondary member to become primary if the primary is unavailable. In most situations, failover does not require manual intervention.

Detailed Explanation

The following sections expand on some of the points covered above.

From rigid tables to flexible and dynamic BSON documents

Much of the data we use today has complex structures that can be modeled and represented more efficiently using JSON (JavaScript Object Notation) documents, rather than tables. MongoDB stores JSON documents in a binary representation called BSON (Binary JSON). BSON encoding extends the popular JSON representation to include additional data types such as int, long and floating point. With sub-documents and arrays, JSON documents also align with the structure of objects at the application level. This makes it easy for developers to map the data used in the application to its associated document in the database.

By contrast, trying to map the object representation of the data to the tabular representation of an RDBMS slows down development. Adding Object Relational Mappers (ORMs) can create additional complexity by reducing the flexibility to evolve schemas and to optimize queries to meet new application requirements.

The project team should start the schema design process by considering the application's requirements. It should model the data in a way that takes advantage of the document model's flexibility. In schema migrations, it may be easy to mirror the relational database's flat schema to the document model. However, this approach negates the advantages enabled by the document model's rich, embedded data structures. For example, data that belongs to a parent-child relationship in two RDBMS tables would commonly be collapsed (embedded) into a single document in MongoDB.

With a well-designed document schema and atomicity at the document level, most JOINs become unnecessary. JOINs are a frequent source of performance problems in relational databases. Further advantages of this model are:

  • An aggregated document can be accessed with a single call to the database, rather than having to JOIN multiple tables to respond to a query. The MongoDB document is physically stored as a single object, requiring only a single read from memory or disk. RDBMS JOINs, on the other hand, require multiple reads from multiple physical locations.

  • As documents are self-contained, distributing the database across multiple nodes (a process called sharding) becomes simpler and makes it possible to achieve massive horizontal scalability on commodity hardware. The DBA no longer needs to worry about the performance penalty of executing cross-node JOINs (should they even be possible in the existing RDBMS) to collect data from different tables.
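To make the parent-child example concrete, the sketch below contrasts the two shapes of the same data. It assumes a hypothetical `customers`/`orders` schema and uses Python's built-in `sqlite3` module as a stand-in relational database; the document is shown as a plain Python dict in the JSON shape that would be stored, not as a call to a MongoDB driver.

```python
import json
import sqlite3

# Hypothetical relational schema: one parent table and one child table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        item TEXT,
        quantity INTEGER
    );
    INSERT INTO customers VALUES (1, 'Asha');
    INSERT INTO orders VALUES (10, 1, 'keyboard', 1), (11, 1, 'monitor', 2);
    """
)

# Relational read: parent and child rows are combined with a JOIN.
rows = conn.execute(
    """
    SELECT c.id, c.name, o.id, o.item, o.quantity
    FROM customers AS c JOIN orders AS o ON o.customer_id = c.id
    WHERE c.id = ? ORDER BY o.id
    """,
    (1,),
).fetchall()

# Document model: the same data collapsed into one self-contained document,
# with the child rows embedded as an array of sub-documents.
customer_doc = {
    "_id": rows[0][0],
    "name": rows[0][1],
    "orders": [
        {"order_id": order_id, "item": item, "quantity": quantity}
        for _, _, order_id, item, quantity in rows
    ],
}

print(json.dumps(customer_doc, indent=2))
```

Once the data is stored in the embedded shape, reading a customer together with its orders is a single lookup by `_id` rather than a JOIN across two tables.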

Application integration

Ease of use and developer productivity are among MongoDB's core design goals. One fundamental difference between a SQL-based RDBMS and MongoDB is that the MongoDB interface is implemented as methods (or functions) within the API of a specific programming language, as opposed to a completely separate language like SQL. This, coupled with the affinity between MongoDB's BSON document model and the data structures used in object-oriented programming, makes application integration simple. MongoDB has idiomatic drivers for the most popular languages, including over a dozen developed and supported by MongoDB (e.g., Java, Python, .NET, Perl) and more than 30 community-supported drivers.

Atomicity in MongoDB

Relational databases typically have well-developed features for data integrity, including ACID transactions and constraint enforcement. Rightly, users do not want to sacrifice data integrity as they move to new types of databases. With MongoDB, users can maintain many capabilities of relational databases, even though the technical implementation of those capabilities may be different; we have already seen this with JOINs.

MongoDB write operations are atomic at the document level, including the ability to update embedded arrays and sub-documents atomically. By embedding related fields within a single document, users get the same integrity guarantees as a traditional RDBMS, which has to synchronize costly ACID operations and maintain referential integrity across separate tables. Document-level atomicity in MongoDB ensures complete isolation as a document is updated; any errors cause the operation to roll back, and clients receive a consistent view of the document.

Despite the power of single-document atomic operations, there may be cases that require multi-document transactions. There are multiple approaches to this, including the findAndModify command, which allows a document to be updated atomically and returned in the same round trip. findAndModify is a powerful primitive on top of which users can build other, more complex transaction protocols. For example, users frequently build atomic soft-state locks, job queues, counters and state machines that can help coordinate more complex behaviors. Another alternative is to implement a two-phase commit to provide transaction-like semantics.
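The job-queue pattern mentioned above can be illustrated without a database. The sketch below is an in-memory imitation, assuming a lock as a stand-in for document-level atomicity; `ToyCollection` and its `find_and_modify` method are hypothetical and only mimic the "update and return in one step" behavior, not the actual driver API.

```python
import threading
from typing import Optional


class ToyCollection:
    """In-memory stand-in for a collection; a lock emulates atomic updates."""

    def __init__(self, documents: list) -> None:
        self._documents = documents
        self._lock = threading.Lock()

    def find_and_modify(self, query: dict, update: dict) -> Optional[dict]:
        """Atomically update the first matching document and return a copy."""
        with self._lock:
            for doc in self._documents:
                if all(doc.get(k) == v for k, v in query.items()):
                    doc.update(update)
                    return dict(doc)
        return None


# Hypothetical queue of 50 jobs, all initially unclaimed.
jobs = ToyCollection(
    [{"_id": n, "state": "queued", "owner": None} for n in range(50)]
)
claimed = []  # (worker name, job id) pairs


def worker(name: str) -> None:
    """Claim queued jobs one at a time until none are left."""
    while True:
        job = jobs.find_and_modify(
            {"state": "queued"}, {"state": "running", "owner": name}
        )
        if job is None:
            return  # queue drained
        claimed.append((name, job["_id"]))


threads = [threading.Thread(target=worker, args=(f"worker-{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because matching and updating happen as one atomic step, no two workers can claim the same job, which is the property a job queue built on this primitive relies on.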

Migration tools from relational to MongoDB

Many users create their own scripts, which transform source data into a hierarchical JSON structure that can be imported into MongoDB using the mongoimport tool. Extract Transform Load (ETL) tools are also commonly used when migrating data from relational databases to MongoDB. A number of ETL vendors, including Informatica, Pentaho and Talend, have developed MongoDB connectors that enable a workflow in which data is extracted from the source database, transformed into the target MongoDB schema, staged, and then loaded into document collections.

Many migrations involve running the existing RDBMS in parallel with the new MongoDB database and incrementally transferring production data:

1. As records are retrieved from the RDBMS, the application writes them back out to MongoDB in the required document schema.

2. Consistency checkers, for example using MD5 checksums, can be used to validate the migrated data.

3. All newly created or updated data is written to MongoDB only.
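A home-grown migration script along these lines might look like the sketch below. It assumes a hypothetical flat `customers` table, uses Python's built-in `sqlite3` module as the source RDBMS, and produces one JSON document per line, which is the default input format of mongoimport. The target schema, the helper names and the use of a sorted-key JSON serialization for the MD5 checksum are all illustrative choices, not a prescribed procedure.

```python
import hashlib
import json
import sqlite3


def canonical_md5(document: dict) -> str:
    """MD5 of a document serialized with sorted keys, so key order is irrelevant."""
    payload = json.dumps(document, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def row_to_document(row: sqlite3.Row) -> dict:
    """Transform one flat source row into the (hypothetical) target document schema."""
    return {
        "_id": row["id"],
        "name": row["name"],
        "address": {"city": row["city"], "country": row["country"]},
    }


# Hypothetical source table standing in for the existing RDBMS.
source = sqlite3.connect(":memory:")
source.row_factory = sqlite3.Row
source.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT, country TEXT);
    INSERT INTO customers VALUES (1, 'Asha', 'Pune', 'IN'), (2, 'Lee', 'Busan', 'KR');
    """
)

cursor = source.execute("SELECT * FROM customers ORDER BY id")
documents = [row_to_document(row) for row in cursor]

# Export step: one JSON document per line, ready to be written to a file.
export_lines = [json.dumps(doc) for doc in documents]

# Consistency check: re-read the export and compare checksums with the source.
source_sums = {doc["_id"]: canonical_md5(doc) for doc in documents}
migrated_sums = {d["_id"]: canonical_md5(d) for d in map(json.loads, export_lines)}
mismatches = [key for key in source_sums if source_sums[key] != migrated_sums.get(key)]
```

In a real migration the second set of checksums would be computed from documents read back out of MongoDB rather than from the export itself; an empty `mismatches` list then indicates that the migrated data matches the source.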

MongoDB MMS

The MongoDB Management and Monitoring Service (MMS for short), available in the cloud and on-premises, can be used for performance monitoring and for backups and restores.

Conclusion

MongoDB is the leading NoSQL database and is increasingly used in hybrid applications alongside RDBMSs such as MySQL and, in some cases, as a standalone database.