Advanced Database Project Report
Cloud Databases and Microsoft Azure INFO-H-415
18.12.2017
Marie Elisabeth Heinrich – 000457502
Jayanthi Kambayatughar – 000457113
Table of Contents
1 Introduction to Microsoft Azure and Azure SQL Cloud Database
1.1 Motivation
1.2 MS Azure Platform
1.2.1 Azure Cloud Computing Concept and Platform Architecture
1.3 Cloud Databases on MS Azure
1.3.1 General concept of Cloud Databases
1.3.2 Database as a Service (DBaaS) on MS Azure
1.3.3 Available Cloud Databases on MS Azure
1.4 MS Azure SQL Cloud DB
1.4.1 General introduction
1.4.2 Typical Use Cases
1.4.3 Evaluation of Azure SQL DB
1.5 Summarized Advantages and Disadvantages of Azure SQL
2 Performance benchmark for a real-life application example of a retail sales management process
2.1 Introduction to Use Case & Benchmark
2.2 Generation of test data
2.3 Specifications for test environment
2.4 Database creation
2.4.1 Create Database in Azure SQL
2.5 Querying Azure SQL database
2.5.1 Select sales data
2.5.2 Update product prices during promotions
2.5.3 Calculate total sales amount
2.5.4 Insert new Sales Records
2.6 Performance Benchmark
2.6.1 Performance
2.6.2 Database Cost Comparison
2.6.3 Further considerations for database evaluation
3 Final conclusion
4 Sources
List of figures
Figure 1: Magic Quadrant for Cloud Infrastructure as a Service [11]
Figure 2: Architectural overview IaaS, PaaS and SaaS [11]
Figure 3: Microsoft Azure Platform Architecture
Figure 4: Overview service tiers
Figure 5: Azure Pricing Options [7]
Figure 6: Azure Pricing Calculator [12]
Figure 7: Relational database ranking statistics as of December 2017 [8]
Figure 8: Azure Data Migration and Connection Workflow [9]
Figure 9: Physical distribution and application hierarchies
Figure 10: Create an Azure SQL database on the MS Azure platform
Figure 11: Selection of Service Tier for the Azure SQL database
Figure 12: Set Server firewall menu
Figure 13: Add Client IP to the Server Firewall
Figure 14: Setting up a connection from SSMS to Azure
Figure 15: T-SQL statement to create database schema
Figure 16: bcp command-line statement to load data into the database
Figure 17: Stored Procedure to insert sales records
Figure 18: Performance Test Results Select query 1.000.000 rows
Figure 19: Estimated memory allocation of SQL Azure per service tier [10]
Figure 20: Performance Test Results Join query 1.000.000 rows
Figure 21: Performance Test Results Select and Update query 1.000.000 rows
Figure 22: Azure SQL Query Performance Insights
Figure 23: Write Rate Comparison (MB/min) of different Azure SQL Service Tiers [10]
Figure 24: Summary of average performance test results
Figure 25: Difference between Azure SQL and SQL Server Performance Test Results in %
List of tables
Table 1: Advantages and Disadvantages of Cloud Databases
Table 2: Overview Cloud Databases on Azure
Table 3: Summarized Advantages and Disadvantages of Microsoft Azure
Table 4: Monthly cost estimation on-premise SQL Server
Table 5: Monthly cost estimation Azure SQL Database
1 Introduction to Microsoft Azure and Azure SQL Cloud Database
Cloud databases and Microsoft Azure are both large topics in themselves. To summarize and examine the most important factors to consider, this report first gives a top-down introduction from Cloud Computing to Microsoft Azure and then to the Azure SQL cloud database in section 1. Second, in section 2, a real-life OLTP application is implemented to conduct a performance benchmark study between a cloud database (Azure SQL) and a standard on-premise solution (SQL Server). The obtained results are then compared and discussed in section 2.6.
1.1 Motivation
Cloud Computing has become increasingly attractive in recent years. Research says that 24% of the total addressable IT market will be in the cloud by 2020 and that almost one out of five virtual machines is already running in the cloud, with drastic growth in both market and offerings. [1] According to Gartner's Magic Quadrant for Cloud Infrastructure as a Service (IaaS), published in June 2017 (refer to figure 1), there are currently two leaders in the market: Amazon Web Services (AWS) and Microsoft with Microsoft Azure (MS Azure). But what are the reasons for this trend, what makes these cloud infrastructures so outstanding, and why should a cloud database be considered?
Traditional databases have some major restrictions in common: their underlying hardware restrictions and their limited scalability in terms of CPU and memory size. As a steadily growing amount of computing power is needed to tackle the challenges that come with massive amounts of data, cloud databases quickly attracted more interest because they can be scaled to the required computing power on demand. This trend affects most of today's industries, from banking and insurance, which have always dealt with huge datasets, to engineering and healthcare, which see themselves confronted with the Internet of Things or advanced image recognition technologies that are disrupting their current business with new data-based business models and technologies.
Furthermore, Cloud Computing is beneficial not only for companies that have large variations in the workload of their database but also for those that are growing rapidly and need systems that can cope with the growth. Additionally, many cloud databases are offered "as a Service", which shifts considerations about database maintenance and set-up away from the user. This can help businesses bridge human resource shortages and eliminates the need for huge upfront investments in IT infrastructure.
However, various factors need to be taken into account when evaluating the use of cloud solutions. Therefore, we developed a use case and conducted a small benchmarking study in order to get familiar with the cloud environment and explore highlights and drawbacks of a cloud database.
Figure 1: Magic Quadrant for Cloud Infrastructure as a Service [11]
1.2 MS Azure Platform
This chapter gives a brief general introduction to Cloud Computing and highlights characteristics of cloud platforms and databases, in particular of MS Azure and Azure SQL.
1.2.1 Azure Cloud Computing Concept and Platform Architecture
MS Azure is not a single cloud database itself. Rather, it can be seen as an enterprise-ready platform with a comprehensive set of cloud services that can be managed over a network of globally distributed data centers, stretching from the US and Canada to China and South Asia as well as several locations in Europe.
1.2.1.1 Cloud infrastructure models
To understand the cloud service concept in more detail, we briefly introduce the most common cloud infrastructure models, which can be deployed depending on the level of ownership that shall be kept:
• Public Cloud: All offered services, including hardware, software and infrastructure, can be accessed over a public network (e.g. the internet) and are hosted and maintained by the respective cloud provider in a multi-tenant way.
• Private Cloud: The infrastructure is operated solely for a single organization, in contrast to the multi-tenant offering in a public cloud. Thus, it offers more control over the client's resources but can still leverage the advantages of cloud services.
• Hybrid Cloud: A secure, private connection between on-premise solutions or a private cloud and the public cloud is set up in order to strategically extend capabilities with cloud services. [2]
Azure itself is set up as a public cloud, but has also added private and hybrid capabilities and interfaces over the past years.
Other common terms used to differentiate cloud services are "Infrastructure as a Service" (IaaS), "Platform as a Service" (PaaS) and "Software as a Service" (SaaS). They also differ in the ownership of data, software and infrastructure, as shown in figure 2.
Figure 2: Architectural overview IaaS, PaaS and SaaS [11]
A typical Microsoft example for SaaS is Office 365, which extends the software to the cloud. Azure itself focuses on PaaS and SaaS capabilities. Generally speaking, PaaS targets developers who do not want to care about the underlying hardware or operating systems of their applications. An example for PaaS is Azure App Services. IaaS, in contrast, focuses on system administrators and offers a higher degree of ownership for the company but a lower degree of agility compared to PaaS.
1.2.1.2 Cloud Computing characteristics
On the basis of these concepts there are a few common basic characteristics of Cloud Computing offerings, defined by the National Institute of Standards and Technology, that are incorporated by MS Azure and highlight potential advantages of using a cloud architecture:
• Elasticity & self-service
  o Dynamic adjustment of services "on the fly" to suit actual demands and save costs
  o Supports short-term demand peaks, e.g. in case of product launches or sales promotions, or when utilizing more VMs for testing purposes
• Scalability
  o Vertical and horizontal scalability of all services (e.g. database performance, number of utilized VMs etc.), supporting a long-term strategy by allowing the IT infrastructure to grow along with the corporation
• Pooling
  o Computing (CPU, RAM), storage and network services are pooled for different stakeholders within a company or amongst several corporations and can be scaled on tap without requiring every corporation to build its own infrastructure for each service
• Pay-per-Use concept
  o Costs are calculated on a pay-per-use basis, which allows maximal efficiency of resource utilization
1.2.1.3 Azure platform architecture
As stated at the beginning of the chapter, Azure is a combination of different service offerings. This implies that there is no fixed architecture that applies to every use case. Instead, one may think of it as a service-oriented architecture that depends on the use case and may incorporate the following components:
• Identity as a Service to manage directories, e.g. Azure AD (Active Directory)
• Container as a Service to host and manage Windows Server containers, Docker containers or orchestration engines using Azure Container Services
• Virtual Machines as a Service with multi-tenant storage options
• Application Insights as a Service: telemetry systems that can be attached to cloud applications to monitor and gain insights into the application or to perform analytics (e.g. Azure Machine Learning, IoT Suite)
• Storage as a Service in the form of a managed disks model, traditional storage accounts or object storage offerings (Blob storage)
• Database as a Service: Azure offers its own databases like Azure SQL Database or the NoSQL DocumentDB as well as other database services like MySQL in App, MongoDB or HDInsight
• Recovery as a Service: Azure Site Recovery provides an orchestration tool for disaster recovery [3]
Please note that this list is not exhaustive. The full list of services can be found in the Azure product directory. As we are focusing on Azure SQL in this report, this list only serves as a first introductory overview, and only the database architecture of Azure SQL is discussed further in the subsequent chapters. An overview of the most prominent services mentioned above is shown in figure 3:
Figure 3: Microsoft Azure Platform Architecture
1.3 Cloud Databases on MS Azure
As we have seen in the last section, Cloud Computing comprises many different topics. Therefore, the next two sections guide us from the bigger scale, the overall platform, to one of its services (Cloud Databases, chapter 1.3.1 ff.) and then provide an in-depth analysis of one specific database: MS Azure SQL Database.
1.3.1 General concept of Cloud Databases
A cloud database in general is a database that can be built on and accessed through a cloud computing infrastructure platform as discussed in chapter 1.2. Access can be realized either through direct queries (e.g. SQL statements) or via API calls. From a structural and design perspective, cloud databases are similar to the ones deployed on corporate on-premise servers, and the database behavior in response to queries should be the same. The key differences are the scalable performance and the location where the database resides:
A cloud database can either be deployed on the company's IT infrastructure via a virtual machine or as a Database as a Service model, where it runs on the cloud provider's infrastructure. This implies, especially for larger cloud databases, that they can usually be referred to as distributed databases. Even if the database appears to the user as a single, coherent database, it is actually constructed out of a set of databases that can be distributed over several server locations. The database of a cloud provider can be accessed solely via the internet through a web interface or a vendor-provided application programming interface (API), in contrast to an on-premise database, which is connected to local users through an internal local area network (LAN). [4]
The following table highlights the key advantages and disadvantages of cloud databases:
Advantages
• Simultaneous access and modification of data from multiple users using a network
• No need to buy dedicated hardware (if the database is offered as a service)
• Can support SQL and NoSQL databases
• Reduced risk of data loss in case of a power outage, as the data is distributed and mirrored over multiple locations
Disadvantages
• May have a slightly slower response time than an on-premise deployment due to the routing through the internet instead of an internal LAN
• Computer security is always a major concern and risk of Cloud Computing; thus, only trusted providers should be chosen
Table 1: Advantages and Disadvantages of Cloud Databases
1.3.2 Database as a Service (DBaaS) on MS Azure
Based upon the above-mentioned characteristics of cloud databases, Azure offers several services and pricing options for its databases, which are discussed in the subsequent sections.
1.3.2.1 Offered cloud database services
Using the services of a cloud database provider brings several benefits, which are summarized below:
• Elimination of physical infrastructure: In a DBaaS environment, the service provider (Azure) is responsible for maintaining, setting up and operating the database software, leaving the DBaaS users responsible only for their data
• Cost savings due to reduced capital expenditures, decreased operating costs and less required physical space. In addition, cloud database technology is becoming more mature, which leads to a price decline.
• Instantaneous scalability "on the fly" without downtime: Azure can quickly offer additional fee-based capacity, throughput and access bandwidth in case of demand peaks or highly volatile demands. Furthermore, Azure provides so-called elastic pools: using a single database that is fully isolated and optimized for its workload is beneficial when performance is more or less predictable. In contrast, elastic pools can be used when facing volatile demands. Databases grouped together in one pool automatically scale up and down based on the demand. The collective performance of the pool is then managed rather than that of a single database.
• Performance and availability guarantees: Through a service level agreement (SLA), Azure is obligated to provide performance guarantees (typically quantifying minimum uptime availability and transaction response times). For the Azure SQL database, Microsoft ensures 99.99% availability in the SLA.
• Ease of access: Clients can access cloud databases from virtually everywhere through the internet and are no longer tied to a LAN
• Latest technology & security: To remain competitive, Azure continuously keeps its infrastructure up to date with security and other feature updates
• Failover support & availability: Failover support typically encompasses the operation of multiple mirror-image server and data storage facilities. If handled properly, data is kept secure through backups on remote servers in case of natural disasters, equipment failure or power outages. Additionally, Microsoft ensures 99.99% availability in its SQL database SLA
• Built-in intelligence: Integrated performance monitoring, alerting and tuning, such as automatic index management, adaptive query processing or intelligent threat detection to detect potentially harmful attempts to access sensitive data
• Security and compliance: Starting in May 2017, all newly created Azure SQL databases are automatically protected with transparent data encryption, dynamic data masking and row-level security [5]
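To make the 99.99% availability figure from the list above more tangible, the following minimal sketch converts an availability percentage into the maximum downtime it still permits. The function name and the choice of a 30-day month and a 365-day year are our own assumptions for illustration; the actual SLA terms define how availability is measured.

```python
def allowed_downtime_minutes(availability_percent: float, period_hours: float) -> float:
    """Maximum downtime in minutes that still meets the availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * period_hours * 60.0


# Assumption: a 30-day month and a 365-day year of continuous operation.
per_month = allowed_downtime_minutes(99.99, 30 * 24)
per_year = allowed_downtime_minutes(99.99, 365 * 24)
print(f"{per_month:.2f} min per month, {per_year:.2f} min per year")
```

Under these assumptions, an availability of 99.99% leaves only a few minutes of tolerated downtime per month.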
1.3.2.2 MS Azure Pricing Model
When using an Azure database, a user only pays for what he or she needs. As the pricing differs between databases, we focus on the pricing of Azure SQL used as a single database.
Pricing on Azure depends on the service tier, which in turn has different pricing levels. Billing is based on the hourly usage of the database. There are four major service tiers, namely Basic, Standard, Premium and Premium RS, which differ in the number of Database Transaction Units (DTUs), the included storage and the maximum storage. Depending on the business application, utilization and resource restrictions, one service tier has to be chosen up front, but it can be changed at any later stage. Another factor is the availability that Azure commits to in the Service Level Agreement (SLA) [6]. An overview of the included capacities and features of the pricing models is shown in figure 4.
Figure 5: Azure Pricing Options [7]
Figure 4: Overview service tiers
The measure for resource utilization is called Database Transaction Units (DTUs). It is a blended measure of CPU, memory and I/O (data and transaction log I/O), whose ratio was originally determined by an OLTP benchmark workload. If the workload exceeds the available amount of any of these resources, the throughput is throttled. The number of DTUs needed can be estimated using the Azure DTU calculator.
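The throttling rule described above can be read as "the most heavily used resource is the binding one". The following sketch illustrates only this reading; the exact blending Microsoft applies is not given in this report, and the function names and sample percentages are hypothetical.

```python
def dtu_utilisation(resource_usage_pct: dict) -> float:
    """Utilisation of the tier's budget, taken as the most heavily used resource (assumption)."""
    return max(resource_usage_pct.values())


def is_throttled(resource_usage_pct: dict) -> bool:
    """Throughput is throttled as soon as any single resource reaches its limit."""
    return dtu_utilisation(resource_usage_pct) >= 100.0


# Hypothetical sample: usage of each resource in percent of the service tier's limit.
sample = {"cpu": 35.0, "memory": 60.0, "data_io": 100.0, "log_io": 20.0}
print(dtu_utilisation(sample), is_throttled(sample))
```

In this sample the workload is throttled by data I/O even though CPU and memory are far from their limits, which is why a workload profile matters when choosing a service tier.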
As prices range from 0.0057 €/hour up to 18.14 €/hour for single databases and from 0.0851 €/hour up to 25.30 €/hour for elastic pools, they are not covered exhaustively here. Instead, please refer to the Azure pricing website for pricing details of all service tiers.
Alternatively, Azure provides a pricing calculator for all its products to calculate the total cost of the cloud service set-up upfront (refer to figure 6).
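Since billing is per hour of usage, the hourly prices quoted above can be turned into a rough monthly figure. The sketch below does this for the cheapest and the most expensive single-database price, assuming continuous use over a 30-day month; this assumption and the function name are ours, and the sketch is not a substitute for the pricing calculator.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, database running all the time


def monthly_cost_eur(hourly_rate_eur: float, hours: float = HOURS_PER_MONTH) -> float:
    """Rough monthly cost for a database billed per hour of usage."""
    return hourly_rate_eur * hours


# Hourly prices for single databases as quoted in this section.
cheapest = monthly_cost_eur(0.0057)
most_expensive = monthly_cost_eur(18.14)
print(f"{cheapest:.2f} EUR to {most_expensive:.2f} EUR per month")
```

The spread of more than three orders of magnitude shows how strongly the chosen service tier drives the total cost.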
1.3.3 Available Cloud Databases on MS Azure
Before going into an in-depth analysis of the Azure SQL database, the following table provides a brief overview of the cloud database options currently offered on MS Azure. In addition, Azure provides specialized services for time series data (Time Series Insights), search engine databases (Azure Search) and object storage (Blob Storage), and offers integration of third-party solutions like MongoDB or open source frameworks like Hadoop through HDInsight.
Database Type | Azure Offerings
Relational Database Management Systems | Azure SQL Database; Azure Database for MySQL and PostgreSQL; SQL Server Stretch Database
Key/Value stores | Azure Cosmos DB; Azure Redis Cache
Document & Graph Database | Azure Cosmos DB
Data Analytics | Azure Data Lake
Table 2: Overview Cloud Databases on Azure
Figure 6: Azure Pricing Calculator [12]
1.4 MS Azure SQL Cloud DB
Coming from the overall Azure platform, this chapter now focuses on one database available on the Azure platform: Azure SQL database.
1.4.1 General introduction
MS Azure SQL Database is a relational cloud database provided as a DBaaS (Database as a Service) by Microsoft Azure. It was released by Microsoft in 2010 and runs on the Microsoft Azure cloud computing platform. It is a managed database service, which means that it covers scalability, backup and high availability of the database, and it can be queried using T-SQL or .NET. It also includes built-in intelligence features which enable the database to learn query and workload patterns and to tune itself to maximize performance, reliability and data protection. It currently holds rank 15 in the relational DB-Engines ranking and rank 26 in the overall DB-Engines ranking.
Figure 7: Relational database ranking statistics as of December 2017 [8]
MS Azure SQL Database shares a common code base with SQL Server. At the database level, it supports most of the features of SQL Server and is compatible with SQL Server 2014 and 2016. The major feature differences between Azure SQL Database and SQL Server are at the instance level. Since 2015 it can be configured not only as a single database, but also within an elastic pool. An elastic pool logically groups several databases into one pool in which they can share resources. This allows a database to make use of currently unused resources of other databases in the pool via an exchange of eDTUs, which follow a similar concept to the normal DTUs.
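The benefit of sharing resources in an elastic pool can be illustrated by comparing the capacity needed when every database is sized for its own peak with the capacity needed when the pool is sized for the combined peak. The demand figures below are invented for illustration only, and real pools are chosen from fixed eDTU sizes rather than arbitrary values.

```python
# Hypothetical DTU demand of three databases over four time slots.
demand = {
    "db_a": [10, 50, 10, 5],
    "db_b": [5, 10, 50, 10],
    "db_c": [50, 5, 10, 10],
}

# Single databases: each one has to be provisioned for its own peak.
sum_of_peaks = sum(max(series) for series in demand.values())

# Elastic pool: only the peak of the combined demand has to be covered.
combined = [sum(slot) for slot in zip(*demand.values())]
peak_of_sum = max(combined)

print(sum_of_peaks, peak_of_sum)
```

Because the peaks of the three databases do not coincide, the pooled capacity requirement is noticeably lower than the sum of the individual peaks. If all databases peaked at the same time, pooling would bring no such saving.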
Figure 8 shows a typical workflow from the source to the consumption of Azure SQL. Structured, semi-structured or unstructured data can be stored in the database in the cloud and consumed from there by Business Intelligence apps like Microsoft Power BI or other web applications.
Figure 8: Azure Data Migration and Connection Workflow [9]
From an application perspective, the hierarchy of Azure SQL can be described as shown in the following figure:
Figure 9: Physical distribution and application hierarchies
Via the Azure platform, a database user creates an account and adds a subscription for one of the service tiers. All created SQL databases are deployed on one or several servers. The Azure servers are not actually physical servers or machines. Instead, they serve as a logical collection of databases that share the same user directories, firewall settings and various other general settings, which can be inspected in the system "master" database. When setting up a server and deploying the database, the user can only choose a region in which the physical servers are placed, which should ideally be as close as possible to the users' location to avoid latency issues. However, the exact geographical distribution of the physical database servers and their replicas is not visible to the user.
1.4.2 Typical Use Cases
The following are typical use cases of MS Azure SQL Database:
• Relational data storage for websites and cloud-based applications
• Business and consumer web and mobile applications, especially multi-tenant applications
• OLTP processes with relational data across many industries, e.g. retail or logistics, often combined with Data Warehousing services
• Outsourcing of the database management in order to let the company focus on value-added services in case of resource limitations, and storing data in the cloud for effective security and isolation
1.4.3 Evaluation of Azure SQL DB
We determined four factors that are important to consider when the database as a whole is to be evaluated.
• Performance: Recall that the performance of the database is measured in Database Transaction Units, a blended measure of CPU, memory and I/O. Microsoft guarantees a certain level of resources for a database and thereby provides a predictable level of performance. Microsoft also provides a "DTU Calculator" to estimate the number of DTUs needed. This flexibility is certainly a major advantage of Azure SQL, but it also makes it difficult to compare Azure SQL to other database systems, as the performance of the database depends on the service tier chosen and thus on the amount of money spent. Furthermore, Azure does not disclose the exact memory and CPU capacities or read/write rates equivalent to the DTUs. This may not be an issue if the database is used in day-to-day business, but it makes benchmark comparisons more challenging.
• Included Services: As mentioned in the previous sections, Azure SQL is offered as a "Database as a Service". Thus, all usual concerns about maintenance, security or replication are handled by Microsoft and no longer by the user. Several log statistics, alerts and performance optimizers complete the service package.
• Functionality & Complexity: As Azure SQL shares its code base with SQL Server, it follows most of SQL Server's design features, integrates well with existing systems and offers a familiar development environment for database administrators. The intuitive user interface of the Azure portal makes it easy to use by eliminating almost any need for hand-coding during the set-up. As it is embedded in the Azure cloud service platform, it can be used together with many of the offered services and thus be integrated into data flows of web applications, Business Intelligence or Big Data frameworks.
• Cost: The cost of using Azure is based on different factors. If we focus only on the cost of Azure SQL, it depends on the geographic region, the type and size of the database, the type of subscription and the DTUs consumed. Microsoft also provides a "Pricing calculator" to get an estimate of the price. In general, cloud databases have the great cost advantage of eliminating the need for an expensive server set-up (hardware and software), maintenance and replacement after a certain period of time. The exact cost advantage over an on-premise set-up depends on the desired performance, database size and workloads, but overall the cloud seems to be less costly, especially if all included services and future scalability perspectives are considered. In section 2.6.2 we make a rough cost estimation to give a first impression of the price range of Azure.
1.5 Summarized Advantages and Disadvantages of Azure SQL
The following table provides a summarized overview of the advantages and disadvantages of the database:
Performance
Advantages
• Scalable performance up to high levels
• Adjustable performance anytime
• Lower down-time
Disadvantages
• Low throughput and high execution time for lower service tiers
• Though supporting most of the features of SQL Server, some are still not implemented
Services
Advantages
• Development of data-driven applications and websites in any programming language, without need for managing the infrastructure
• Services like hardware maintenance, security, active directory, updates etc. are covered by Microsoft
• Built-in intelligence and performance monitoring tools support users in designing optimal database and application architectures
Functionality & Complexity
Advantages
• Flexible service plans meet the needs of both large and small businesses
• Easy connections using username and password, human-friendly user interface
Disadvantages
• The only way to send data to Azure is to upload or sync it across the network, which raises the need for appropriate bandwidth
• SQL Azure has limitations on usage and will throttle database connections under heavy loads
Cost
Advantages
• The higher the service tier, the higher the cost, but performance and resources scale as well
• No upfront cost for server set-up
• Cheap, low-performance service tier offerings for test deployments
Disadvantages
• Costs increase to quite expensive levels for higher service tiers
Table 3: Summarized Advantages and Disadvantages of Microsoft Azure
2 Performance benchmark for a real-life application example of a retail sales management process
In order to get a better understanding of database performance and usage in real-life applications, we developed a retail sales management process, which is described in section 2.1. We then created the corresponding datasets and set up two identical databases on SQL Server and Microsoft Azure, with a step-by-step guide for Azure SQL DB. In section 2.6 we conduct a performance benchmark between these databases, based on typical queries of the application.
2.1 Introduction to Use Case & Benchmark
In this application, a small company that owns three stores to sell its products would like to store its sales and product data. As of now, all transactional data is stored in small, local relational databases, which makes it impossible for the management to query the overall sales data or to apply general updates to the database, e.g. price changes of products due to promotions. Additionally, the number of orders for its products varies over time. For example, sales are very high during festivals and comparatively low on normal days, which raises the need for on-demand scalability of the database performance. Therefore, the management decided to invest in one central database for all sales data, but is still unsure whether a cloud or an on-premise set-up is the right choice. By investigating the performance of both systems, we analyze which one is more suitable for this use case from both a technical and a business perspective.
2.2 Generation of test data
To perform the benchmarking, two datasets were randomly generated, which differ only in the number of rows but have the same schemas. Both sets contain data entries for two tables, a sales table and a product table. The schemas for the relations are as follows:
• Sales {salesID, sdate, storeName, products, quantity}
• Products {productID, product_name, price}
The small dataset contains 10,000 rows, whereas the large dataset contains 1,000,000 rows in each relation. salesID and productID are unique primary keys, and products in the Sales relation is a foreign key referencing productID to identify the products sold in a sales transaction. For detailed attribute types and constraints, please refer to the code snippet in section 2.4.1.4.
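As an illustration of how such a dataset could be produced, the following sketch generates random rows for both relations and loads them into an in-memory SQLite database. It is not the generator used for the benchmark: SQLite merely stands in for SQL Server and Azure SQL, and the column types, store names, price range and date range are assumptions, since the actual attribute types are only given in section 2.4.1.4.

```python
import random
import sqlite3
from datetime import date, timedelta

STORES = ["Store A", "Store B", "Store C"]  # hypothetical names for the three stores


def generate_products(n, rng):
    """Rows for Products {productID, product_name, price} with random prices."""
    return [(pid, f"product_{pid}", round(rng.uniform(1.0, 100.0), 2))
            for pid in range(1, n + 1)]


def generate_sales(n, n_products, rng):
    """Rows for Sales {salesID, sdate, storeName, products, quantity}."""
    start = date(2017, 1, 1)  # assumed date range: the year 2017
    return [(sid,
             (start + timedelta(days=rng.randrange(365))).isoformat(),
             rng.choice(STORES),
             rng.randint(1, n_products),  # foreign key to an existing productID
             rng.randint(1, 10))
            for sid in range(1, n + 1)]


def build_database(n_rows, seed=42):
    """Create both tables in memory and fill each with n_rows random rows."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Products (productID INTEGER PRIMARY KEY, "
                 "product_name TEXT NOT NULL, price REAL NOT NULL)")
    conn.execute("CREATE TABLE Sales (salesID INTEGER PRIMARY KEY, "
                 "sdate TEXT NOT NULL, storeName TEXT NOT NULL, "
                 "products INTEGER NOT NULL REFERENCES Products(productID), "
                 "quantity INTEGER NOT NULL)")
    conn.executemany("INSERT INTO Products VALUES (?, ?, ?)",
                     generate_products(n_rows, rng))
    conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?, ?)",
                     generate_sales(n_rows, n_rows, rng))
    conn.commit()
    return conn


conn = build_database(10_000)  # size of the small dataset
sales_count = conn.execute("SELECT COUNT(*) FROM Sales").fetchone()[0]
total_sales = conn.execute(
    "SELECT SUM(s.quantity * p.price) FROM Sales s "
    "JOIN Products p ON p.productID = s.products").fetchone()[0]
print(sales_count, round(total_sales, 2))
```

Using a fixed seed keeps the generated data reproducible, so that the same rows can be loaded into both databases under comparison.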
2.3 Specifications for test environment
• SQL Server: Microsoft SQL Server Management Studio 2014
• Azure SQL Database, Subscription: Free Trial, Service Tier: Standard S1 (250 GB storage, 20 DTU)
• MacBook Pro, macOS High Sierra v.10.13.1, Processor 2.7 GHz Intel Core i5, 250 GB SSD, Memory 8 GB 1867 MHz DDR3, running a Parallels Desktop v.13.0.1 virtual machine with Microsoft Windows 10 and unlimited, scalable resource sharing on demand
2.4 Database creation
This section provides a how-to guide for setting up a SQL database on the Microsoft Azure platform.
2.4.1 Create Database in Azure SQL
The Azure SQL database can be deployed in three different ways:
• using SQL Server Management Studio (SSMS),
• using .NET, or
• migrating an existing SQL Server database.
In the following steps we show how to create the database using SSMS.
2.4.1.1 Create new SQL database on the Azure platform
The first step is to create the database in the Azure portal. To do so, we sign in to our Microsoft account and navigate to the SQL database repository, where we can add a new database as shown in figure 10. As a name we chose SalesApplication, and the subscription can be selected from the subscribed service level of the Azure account, which in our case is the free trial. Azure offers to group different databases in a resource group, which share the general database settings. This is particularly useful if the intention is to set up more than one database, as common settings, e.g. for firewalls, only have to be set up once. In our case, we created a new resource group and called it PerformanceTesting.
In the select source field, the user can choose between migrating an existing database from SSMS and creating a new one. If the user has not created a server on Azure before, a new one needs to be created, following the server set-up wizard. In our case we chose a server location in West Europe and named it pertest. The server configuration is important in order to be able to connect to and interact with the database later on. As we do not have many databases, we do not configure elastic pools for now.
The pricing tier is the most important part of the configuration, as it directly influences the performance of the database. The maximum service tier supported in the free account is the Standard S1 service tier with 250 GB storage and 20 DTUs, as shown in figure 11.
After entering all required information, press 'Create' to let Azure create the database. This process might take a few minutes, but can be monitored at any time in the notification center on the Azure portal status bar in the upper right corner of the screen.
Figure 10: Create an Azure SQL database on the MS Azure platform
Figure 11: Selection of Service Tier for the Azure SQL database
2.4.1.2 Set firewall settings
After creating the database, the firewall has to be configured in order to allow inbound and outbound connections to the database. To do so, open the newly created database on the Azure platform and navigate to Set Server Firewall in the menu bar at the top, as shown in figure 12.
Figure 12: Set Server firewall menu
Press 'Add Client IP' and enter your IP address to allow connections from your client, as shown in figure 13. If this step is skipped, SSMS will raise an access denied error when connecting to the database.
Please keep in mind that these settings have to be updated whenever the network over which the client connects to the database changes, as the client then usually receives a new IP address that has to be registered.
Figure 13: Add Client IP to the Server Firewall
2.4.1.3 Connecting to Azure via SSMS
The next step is to connect to the Azure SQL database from SSMS. Open the connection wizard in SSMS and type in the credentials shown in figure 14. The server name has to be of the form <servername>.database.windows.net. Select SQL Server Authentication, as Azure SQL Database does not accept Windows Authentication. Login and password are the ones chosen when creating the server on Azure. Navigate to the tab 'Connection Properties' and enter the SQL database name (here: SalesApplication) in the database field. Then press Connect. The Azure server and database should now show up in the connection manager in the same way as local databases do.
Figure 14: Setting up a connection from SSMS to Azure
2.4.1.4 Create database schema
So far the database is still empty. To create the schema specified in section 2.2, open a new query and run the following T-SQL statement, which sets the relation schema, attribute types, primary and foreign key constraints as well as the index. As the database will mainly be queried to look up sales for a specific store or product, we decided to introduce clustered indexes on the primary keys of both relations.
Figure 15: T-SQL statement to create database schema
After running these statements, the tables dbo.products and dbo.sales should appear in the Tables folder of the database.
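As the T-SQL statement is only available as a figure, the following Python sketch illustrates the same structure (two tables, primary keys and a foreign key from sales.products to products.productID) with the built-in sqlite3 module. SQLite only stands in for SQL Server here, and the column types and NOT NULL constraints are assumptions; this is not the statement shown in figure 15.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE products (
    productID    INTEGER PRIMARY KEY,  -- alias of the rowid: rows are stored in key order
    product_name TEXT    NOT NULL,
    price        REAL    NOT NULL
);
CREATE TABLE sales (
    salesID   INTEGER PRIMARY KEY,
    sdate     TEXT    NOT NULL,
    storeName TEXT    NOT NULL,
    products  INTEGER NOT NULL REFERENCES products(productID),
    quantity  INTEGER NOT NULL
);
""")

conn.execute("INSERT INTO products VALUES (1, 'product_1', 9.99)")
conn.execute("INSERT INTO sales VALUES (1, '2017-12-01', 'Store A', 1, 3)")

try:
    # productID 99 does not exist, so the foreign key rejects this row
    conn.execute("INSERT INTO sales VALUES (2, '2017-12-01', 'Store A', 99, 1)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables, fk_rejected)
```

In SQLite an INTEGER PRIMARY KEY determines the physical row order, which is the closest counterpart to the clustered primary key indexes described above.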
2.4.1.5 Populate the database with the application data
To populate the database, we use the bcp command-line utility.
Note that for the following steps the bcp and sqlcmd command-line utilities have to be installed on the client. Both are available as a free download from Microsoft.
Open the command prompt window on your client and execute a command following this structure:
bcp <TableName> in <path of file> -S <ServerName> -d <DatabaseName> -U <Username> -P <Password> -q -c -t ";"
Replace all <placeholders> with the respective information of your database and adjust ";" to the character that separates the entries in the .csv or .txt file containing the data to be inserted. For our application the command is the following:
Figure 16: bcp command-line statement to load data into the database
After uploading the product data, the sales data has to be uploaded following the same procedure, replacing the table name products with sales and adjusting the file path accordingly.
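When both tables and both databases have to be loaded, it can help to assemble the bcp call programmatically. The following Python sketch only builds the argument list from the command structure given above. The function name and the file name products.csv are hypothetical, and the placeholders are kept as placeholders.

```python
def bcp_import_args(table, data_file, server, database, user, password,
                    field_terminator=";"):
    """Build the argument list for a bcp bulk import (flags as in the text)."""
    return [
        "bcp", table, "in", data_file,
        "-S", server,            # server name
        "-d", database,          # target database
        "-U", user,
        "-P", password,
        "-q",                    # quoted identifiers
        "-c",                    # character data
        "-t", field_terminator,  # field separator used in the flat file
    ]


args = bcp_import_args(
    table="dbo.products",
    data_file="products.csv",                  # hypothetical file name
    server="<ServerName>.database.windows.net",
    database="SalesApplication",
    user="<Username>",
    password="<Password>",
)
print(args)
# To execute it on a client with bcp installed:
#   import subprocess; subprocess.run(args, check=True)
```

Passing the arguments as a list avoids shell quoting issues with the ";" separator. Loading the sales table only requires changing the table and data_file arguments.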
2.4.1.6 Additional notes & SQL Server set-up
As we are planning to conduct the benchmarking with two dataset sizes (10.000 and 1.000.000 rows respectively), we decided to set up two databases to avoid deleting and uploading data between the tests. Both databases are identical except for the number of rows. Therefore, we repeated all steps of the previous sections to create a database for the smaller dataset, which is referred to as SalesApplicationSmall hereinafter.
Furthermore, we assume that the reader is familiar with the database setup in SQL Server, so we do not explain the detailed steps here. For the benchmarking on SQL Server we used the same schema as shown in figure 15 and used the Import Wizard from SQL Server to import the datasets from the respective flat files into the databases. A detailed documentation can be found here.
2.5 Querying Azure SQL database
We identified the following typical use cases for our application. Each of them requires a different type of query, and all of these queries were executed during our performance test.
2.5.1 Select sales data
To start, we performed a full table scan by executing the following SQL statement:
SELECT COUNT(*) FROM dbo.sales WHERE quantity BETWEEN 0 AND 10000
In a real-life application, it is quite unlikely that a full table scan of all sales transactions is requested. Instead, a more specific WHERE clause would be added in order to display sales transactions for a certain store or time period. However, as we want to conduct our benchmark on the full set of rows, we decided to set the WHERE clause to a range that covers the full table for testing purposes.
2.5.2 Update product prices during promotions
A common event in our use case is setting promotions for the products. In this case, the price in the product relation needs to be adjusted by executing the following SQL statement:
UPDATE dbo.products SET price = 120
Following the same logic as above, we update all product records in our benchmark instead of affecting only a few rows with our query.
2.5.3 Calculate total sales amount
Another important query is the calculation of the total sales amount. For this, the sales and product relations have to be joined on the product ID, and the sales quantity has to be multiplied by the product price:
SELECT quantity * price AS SalesAmount FROM dbo.products p, dbo.sales s WHERE p.productID = s.products
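To make the result shape of this join concrete, the following Python sketch runs an equivalent query on three hand-made sales rows using the built-in sqlite3 module. The toy rows are invented for illustration, and the foreign key column is named products as in the schema of section 2.2. The query returns one amount per sales row, and wrapping the same expression in SUM() (an addition that is not part of the benchmark query) would give a single grand total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (productID INTEGER PRIMARY KEY, product_name TEXT, price REAL);
CREATE TABLE sales (salesID INTEGER PRIMARY KEY, sdate TEXT, storeName TEXT,
                    products INTEGER REFERENCES products(productID), quantity INTEGER);
INSERT INTO products VALUES (1, 'pen', 2.0), (2, 'book', 10.0);
INSERT INTO sales VALUES (1, '2017-12-01', 'Store A', 1, 5),
                         (2, '2017-12-01', 'Store B', 2, 3),
                         (3, '2017-12-02', 'Store A', 2, 1);
""")

# One amount per sales row, as in the benchmark query (explicit JOIN syntax).
per_sale = conn.execute("""
    SELECT s.salesID, s.quantity * p.price AS SalesAmount
    FROM sales AS s
    JOIN products AS p ON p.productID = s.products
    ORDER BY s.salesID
""").fetchall()

# Wrapping the same expression in SUM() yields a single grand total.
total = conn.execute("""
    SELECT SUM(s.quantity * p.price)
    FROM sales AS s
    JOIN products AS p ON p.productID = s.products
""").fetchone()[0]

print(per_sale)  # [(1, 10.0), (2, 30.0), (3, 10.0)]
print(total)     # 50.0
```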
2.5.4 Insert new Sales Records
To simulate the insertion of 10.000 new records (1.000.000 for SalesApplication), we used the following setup:
First, a new table called isales with the schema {salesID, sdate, storeName, products, quantity} was created in the database. To simulate the insertions, we copied the entries of the sales relation to the isales relation, using the following procedure:
Figure 17: Stored Procedure to insert sales records
2.6 Performance Benchmark
The benchmark follows the performance evaluation factors outlined in section 1.4.3. The results of the technical performance testing are shown and discussed in section 2.6.1. In a second step, in section 2.6.2, we evaluate the cost of both databases by putting the obtained performance results into perspective with the costs arising according to the Azure SQL cost calculator. Lastly, in section 2.6.3 we add a few qualitative considerations about the included services, functionality and flexibility of the database to round off our performance evaluation.
2.6.1 Performance
The goal of the performance testing is to determine whether SQL Server or Azure SQL performs better for the selected use case queries on the chosen service tier. To evaluate the database performance, we executed all queries listed in section 2.5 eleven times on each of the four databases (two databases each on SQL Server and Azure, one containing the large and one the small dataset). We ignored the first execution, as it usually takes significantly longer because nothing is cached yet, and took the average response time of the remaining ten executions (derived from the SQL Server client statistics).
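The measurement rule described above (eleven executions, first one discarded, mean of the remaining ten) can be sketched in Python as follows. Note that the report took its timings from the SQL Server client statistics, so the wall-clock timer used here is a stand-in. The function names and the 900 ms warm-up value are hypothetical, while the ten remaining values are the SQL Server select timings for the large dataset.

```python
import statistics
import time


def benchmark(run_query, executions=11, warmup=1):
    """Time run_query `executions` times and average all but the warm-up runs.

    Returns (average_ms, timings_ms), where timings_ms excludes the warm-up.
    """
    timings_ms = []
    for _ in range(executions):
        start = time.perf_counter()
        run_query()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    measured = timings_ms[warmup:]          # drop the first (cold-cache) run
    return statistics.mean(measured), measured


def average_without_warmup(timings_ms, warmup=1):
    """Same averaging rule applied to timings that were recorded elsewhere."""
    return statistics.mean(timings_ms[warmup:])


# Example: a hypothetical cold first run of 900 ms followed by the ten
# SQL Server select timings reported for the large dataset.
recorded = [900, 62, 62, 78, 109, 62, 62, 77, 62, 62, 62]
print(round(average_without_warmup(recorded), 1))  # 69.8
```

Any callable that executes one query, for example through a database driver, can be passed as run_query.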
After the final consolidation of the performance test results, we observed that the results for the large and the small dataset did not differ significantly in their overall trend, but only in their scale. Therefore, we hereinafter only discuss the results per query executed on the large dataset and discuss the differences to the small dataset in a subsequent section.
2.6.1.1 Performance Test Results: Select query
The following charts show the execution times of all 10 executions for the select query.
Figure 18: Performance Test Results Select query 1.000.000 rows
We observe that SQL Server outperforms Azure by running the query on average 5x faster. Surprised by this clear result, we tried to figure out the reason for the slow performance of Azure and took a closer look at the query. The query that was used actually tests the read rates of the databases. By using COUNT(*) we ensure that the results are not strongly affected by network or I/O issues, as only one integer needs to be returned, whereas the WHERE clause forces the database management system to scan through the whole table, as there is no index on the quantity column. Thus, as we do not send much data through the network, the volatility in both graphs might show the effect of choosing the best query plan to execute the query, where SQL Server performs more stably.
However, for the relative comparison we should keep in mind that we still compare hardware and network performance rather than DBMS performance, as the code base for both databases is SQL Server. SQL Server in this case was run on a notebook that has 8 GB of RAM and a 2.7 GHz processor, whereas Azure is only “equipped” with the resources reflecting 20 DTUs.
An Azure SQL benchmark that was conducted in a more professional environment, over longer periods of time and with heavier workloads, was able to estimate the memory underlying the different service tiers. Despite the fact that the benchmark is already 2 years old and service tiers might have changed, it still shows impressively how little memory is allocated to the database in the standard tiers and how it scales up with the premium tiers.
[Charts for figure 18: execution time in ms for executions 1 to 10 and their average]
SQL Server: 62, 62, 78, 109, 62, 62, 77, 62, 62, 62; Avg 69.8
Azure: 188, 745, 438, 203, 469, 484, 265, 359, 126, 281; Avg 355.8
In the graph, the green bars show the results of tests conducted in 2015 and the blue ones those of tests conducted in 2014. Together they reflect how Microsoft is shifting and adjusting resources under the surface of its platform. The chart makes it plausible why the processing time of Azure is generally slower, as the benchmark study has also shown the low resource assignment for CPU and read/write rates.
Figure 19: Estimated memory allocation of SQL Azure per service tier (10)
2.6.1.2 Performance Test Results: Join query
The following charts show the execution times of all 10 executions for the join query.
Figure 20: Performance Test Results Join query 1.000.000 rows
For these test results we first observe that the performance is much more stable than for the previous query. This might be due to the fact that the join is executed on the columns with clustered primary keys, which makes the execution more straightforward. Still, SQL Server again executes on average about 11x faster than Azure (1,062 ms versus 11,471 ms), in line with the results obtained above. We assume that the slower execution time of Azure might again be due to the low performance level assigned as well as to network latencies between the notebook and the Azure server.
[Charts for figure 20: execution time in ms for executions 1 to 10 and their average]
SQL Server: 1015, 1065, 969, 1061, 1249, 1218, 984, 1016, 1032, 1014; Avg 1062.3
Azure: 11036, 10907, 10238, 11625, 12153, 12243, 12407, 11156, 11814, 11133; Avg 11471.2
2.6.1.3 Performance Test Results: update and insert queries
The following charts show the execution times of all 10 executions for the update and insert queries.
Figure 21: Performance Test Results Insert and Update query 1.000.000 rows
[Charts for figure 21: execution time in ms for executions 1 to 10 and their average]
Insert, SQL Server: 2670, 4046, 2461, 3377, 2929, 2692, 2916, 2703, 3759, 3175; Avg 3072.8
Insert, Azure: 94999, 95765, 95453, 95467, 95407, 95468, 96907, 97344, 97840, 97302; Avg 96195.2
Update, SQL Server: 672, 703, 686, 687, 1093, 704, 671, 687, 671, 671; Avg 724.5
Update, Azure: 78156, 77907, 77703, 78000, 78328, 78301, 78312, 78093, 77454, 77969; Avg 78022.3
When considering the two queries above, the difference between Azure SQL and SQL Server reaches a new level. SQL Server outperforms Azure SQL by a factor of about 107 in the update query and about 31 in the insert query. We assume that two things could be essential determinants of this: the DTUs assigned to the used service tier, and the network.
To examine the first, we queried the Azure SQL system statistics of the database using the following SQL code:
SELECT AVG(avg_cpu_percent) AS 'Average CPU Utilization In Percent',
       MAX(avg_cpu_percent) AS 'Maximum CPU Utilization In Percent',
       AVG(avg_data_io_percent) AS 'Average Data IO In Percent',
       MAX(avg_data_io_percent) AS 'Maximum Data IO In Percent',
       AVG(avg_log_write_percent) AS 'Average Log Write Utilization In Percent',
       MAX(avg_log_write_percent) AS 'Maximum Log Write Utilization In Percent',
       AVG(avg_memory_usage_percent) AS 'Average Memory Usage In Percent',
       MAX(avg_memory_usage_percent) AS 'Maximum Memory Usage In Percent'
FROM sys.dm_db_resource_stats;
The outcome of this query can also be obtained by looking at the query performance insights statistics provided in the Azure portal, which revealed the following graph:
Figure 22: Azure SQL Query Performance Insights
At first sight we can see that we were right in our assumption of exceeding the DTU limits. Following the dark blue line, we can see that the graph clearly hits the 100% mark while the query was running and is cut off at this point. Azure SQL throttles the performance significantly once the DTU capacity is reached. Throughout the testing Azure also raised an alert that the workload is heavy and that a higher service tier should be considered. Recall that DTU is a measure combining CPU, memory and I/O performance metrics. We can see in the graph that in this case the CPU was not the main issue. The database statistics query revealed that both the memory and the log write percentages reached 100% during the query execution, which is clear evidence that the service tier was utilized to the maximum.
In addition, we wanted to retrieve the average write rate of Azure SQL during the insert. Therefore, we first queried the size of the sales table with the following SQL statement:
EXEC sp_spaceused N'dbo.sales';
GO
The statement returned a total table size of 52 MB. Figure 21 reports an average execution time on Azure of 78,022 ms for the update and 96,195 ms for the insert. Taking the 78,022 ms value, the write rate is equivalent to 0.669 MB/s or 40 MB/min; taking the insert average instead gives roughly 0.54 MB/s or 32 MB/min. When we compare this result with the service tier benchmark that we introduced earlier, we notice that the write rate from our small set-up is in line with the write rate for the S1 service tier that was derived during the service tier benchmark:
Figure 23: Write Rate Comparison (MB/min) of different Azure SQL Service Tiers (10)
We therefore conclude that our assumption that we reached the maximum of our service tier was correct and that this is the main reason for the slow performance. Hence, if the performance is to be increased, the service level needs to be increased, as can be seen in figure 23.
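The write rate arithmetic can be reproduced with a few lines of Python. This is only a sketch: the table size is the rounded 52 MB from the text (so the first result comes out at about 0.67 MB/s rather than exactly 0.669), and the second call uses the Azure insert average from the appendix for comparison.

```python
def write_rate(table_size_mb, elapsed_ms):
    """Return (MB per second, MB per minute) for a write of the given size."""
    mb_per_s = table_size_mb / (elapsed_ms / 1000.0)
    return mb_per_s, mb_per_s * 60.0


TABLE_SIZE_MB = 52.0          # size reported by sp_spaceused (rounded)

# 78,022 ms is the execution time used in the text.
per_s, per_min = write_rate(TABLE_SIZE_MB, 78022)
print(f"{per_s:.3f} MB/s, {per_min:.1f} MB/min")

# For comparison: the Azure insert average listed in the appendix.
per_s_ins, per_min_ins = write_rate(TABLE_SIZE_MB, 96195.2)
print(f"{per_s_ins:.3f} MB/s, {per_min_ins:.1f} MB/min")
```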
2.6.1.4 Result comparison of large and small dataset
The following charts show the average execution time obtained per query for the large and the small dataset and the respective difference in percent.
Figure 24: Summary of average performance test results
Figure 25: Difference between Azure SQL and SQL Server Performance Test Results in %
It can clearly be seen that the overall trends are the same for both datasets. However, considering the relative difference between the results obtained from the two databases, the results for the small dataset are much closer to each other for every query. This might mainly be due to the fact that the small dataset neither comes close to the limits of the CPU, I/O or memory capacities of Azure nor reaches the network bandwidth restrictions. We observe that for the small dataset the update instead of the insert query has the longest execution time, which might also be due to its limited size, as it is not affected by the network or DTU limitations. However, Azure still shows significantly worse performance than SQL Server, due to the overall DTU capacity, which is very low with only 20 DTUs for S1. Nevertheless, the fact that Azure performs worse in these tests does not mean that it is the wrong choice for our use case.
[Charts for figures 24 and 25: average execution time in ms per query (insert, update, join, select) for Azure and SQL Server, for the large and the small dataset; the underlying averages are listed in the appendix]
As mentioned earlier, Azure allows easy performance scaling by choosing a higher service tier. Thus, we can conclude that the retail company of our use case should choose a higher service tier in order to achieve results similar to or better than those of SQL Server.
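The relative differences between the two systems can be recomputed from the averages in the appendix. The following Python sketch does this for both datasets. It assumes that the second SELECT column of the appendix tables belongs to the select query of section 2.5.1, since its averages match figure 18; the dictionary layout and function name are illustrative.

```python
# Average execution times in ms, taken from the appendix tables.
averages_ms = {
    "large": {   # 1,000,000 rows
        "insert": {"azure": 96195.2, "sql_server": 3072.8},
        "update": {"azure": 78022.3, "sql_server": 724.5},
        "join":   {"azure": 11471.2, "sql_server": 1062.3},
        "select": {"azure": 355.8,   "sql_server": 69.8},
    },
    "small": {   # 10,000 rows
        "insert": {"azure": 276.2, "sql_server": 69.6},
        "update": {"azure": 285.6, "sql_server": 30.0},
        "join":   {"azure": 115.6, "sql_server": 30.8},
        "select": {"azure": 22.7,  "sql_server": 11.0},
    },
}


def slowdown_factors(dataset):
    """Return how many times slower Azure was than SQL Server, per query."""
    return {query: round(t["azure"] / t["sql_server"], 1)
            for query, t in averages_ms[dataset].items()}


for name in ("large", "small"):
    print(name, slowdown_factors(name))
```

For the large dataset this gives factors of roughly 31 (insert), 108 (update), 11 (join) and 5 (select), while all factors for the small dataset stay below 10, which matches the observation that the two systems are much closer on the small dataset.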
2.6.2 Database Cost Comparison
We already discussed that the performance of the database depends on the service tier chosen and can be scaled up significantly if one is willing to pay for it. In the previous section we observed that the S1 service tier is not sufficient to cope with workloads based on the larger dataset. To solve our use case, it is important to know what financial obligations both deployments (on premise and cloud) would bring along. We therefore estimated the cost for both options. These considerations are very rough and might differ significantly based on assumptions and real-life database usage. The aim here is only to provide an idea of the price range, which would then need to be calculated in detail upon deployment.
In general, we assume a need for high availability and security. For the on-premise solution we assume that the hardware for the server needs to be purchased but that the network infrastructure incl. firewalls, routers, wires etc. already exists. Spare parts needed for maintenance, the partial rental cost of the space for server rooms and depreciation are also not considered. We would need to set up two servers (one for the actual deployment, another for the replication), both equipped with Windows Server 2016 licenses and covered by a warranty and service agreement. The SQL Server license only needs to be bought for one server, as the second one won't be queried by any of the users. Both Microsoft licenses are priced per core covered, and the costs are in accordance with the prices shown on the respective Microsoft Store websites as of December 2017. The server costs were taken from an online server shop: www.servershop24.de. Costs were aggregated over a 3-year period and then broken down to monthly costs, as we assume that after 3 years the servers need to be replaced due to performance, scale and security considerations. Costs for Azure SQL were calculated for different service tiers using the cost calculator on the Azure platform, assuming 1 single database, 720 h/month availability, 750 GB storage, region West Europe and standard support.
On-premise SQL Server
Hardware: 2x Server (HP ProLiant DL360 Gen9 Server,
  2x Xeon E5-2628 v3 8-core 2.5 GHz, 32 GB RAM, 2x 300 GB SAS 10K)      6,200 €
Hardware Care Pack, 3-year 24/7 warranty                               1,320 €
Windows Server 2016 Datacenter Edition License (2x 16 cores)          12,310 €
SQL Server 2016 License (min. 4 cores per processor = 8 cores)        57,024 €
Total                                                                 76,854 €
Per month (over 3-year period)                                         2,135 €
Table 4: Monthly cost estimation on-premise SQL Server
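The totals in Table 4 follow from simple arithmetic, sketched below in Python. The cost figures are those of the table and the 36-month period corresponds to the assumed 3-year replacement cycle; the dictionary keys are illustrative names.

```python
# One-off costs in EUR, as listed in Table 4.
on_premise_costs_eur = {
    "hardware_2_servers": 6_200,
    "care_pack_3_years": 1_320,
    "windows_server_2016_datacenter": 12_310,
    "sql_server_2016_license": 57_024,
}
AMORTISATION_MONTHS = 36   # servers are assumed to be replaced after 3 years

total_eur = sum(on_premise_costs_eur.values())
monthly_eur = total_eur / AMORTISATION_MONTHS
license_share = (on_premise_costs_eur["windows_server_2016_datacenter"]
                 + on_premise_costs_eur["sql_server_2016_license"]) / total_eur

print(total_eur)                 # 76854
print(round(monthly_eur))        # 2135
print(f"{license_share:.0%}")    # 90%
```

The last line shows that the two Microsoft licenses account for about 90% of the estimated on-premise cost.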
Azure SQL Database                                  Monthly cost
Service Tier S1: 20 DTU (used in this benchmark)           277 €
Service Tier S4: 200 DTU                                   411 €
Service Tier P1: 125 DTU                                   585 €
Service Tier P2: 250 DTU                                   888 €
Service Tier P4: 500 DTU                                 1,467 €
Service Tier P6: 1000 DTU                                2,710 €
Table 5: Monthly cost estimation Azure SQL Database
We observe the high license cost for the on-premise set-up, which largely determines the overall monthly cost. Moreover, as we are talking about hardware, it is quite likely that these costs increase even further if spare parts or other additional maintenance work are needed that are not covered by the warranty. With respect to the Azure SQL DB cost, we can see that prices within the premium service tiers (P1 to P6 in the table) scale more or less linearly with the number of DTUs provided. In contrast, this does not seem to apply to a change between the standard and the premium service tier. From the table we can see that between the S4 and the P1 service tier the number of DTUs decreases whereas the price increases: the cost per DTU rises from about 2.06 € to 4.68 €, a cost-per-performance jump of roughly 130%. This indicates that a DTU does not refer to the same underlying memory, CPU and read/write rates across the two levels of service tiers and can only be used for comparisons within the same tier level. However, to determine the exact resources and scaling function, further in-depth benchmarking across all service tiers is needed. A first insight can be derived from figure 23, although this figure might have changed due to adjustments by Microsoft as well as newly added service tiers.
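The cost-per-DTU comparison can be recomputed from Table 5. The following Python sketch derives the monthly price per DTU for each tier and the relative jump between S4 and P1. With the table values the jump comes out at about 128%, which the text rounds to 130%.

```python
# Monthly price in EUR and DTUs per service tier, as listed in Table 5.
tiers = {
    "S1": (277, 20),
    "S4": (411, 200),
    "P1": (585, 125),
    "P2": (888, 250),
    "P4": (1467, 500),
    "P6": (2710, 1000),
}

eur_per_dtu = {name: price / dtu for name, (price, dtu) in tiers.items()}
for name, value in eur_per_dtu.items():
    print(f"{name}: {value:.2f} EUR per DTU")

# Relative increase in cost per DTU when moving from S4 to P1.
jump = eur_per_dtu["P1"] / eur_per_dtu["S4"] - 1
print(f"S4 -> P1: +{jump:.0%}")
```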
Nevertheless, for our use case we could see that SQL Server, which dealt well with the workload, was on average approx. 50x faster than Azure SQL. Translating this into DTUs, we estimate that we would need 20 DTU x 50 = 1000 DTUs in a standard service tier to achieve a good performance, or about 434 DTUs in a premium tier (1000 / 2.3), assuming that the 130% cost-per-performance jump reflects the same increase in terms of resources. Thus, we would target service tier P4 as a base for the next performance benchmark for our use case.
From a cost perspective, using Azure SQL with a P4 service tier would still save approx. 31% per month compared to the on-premise solution. In addition to all the service, flexibility and functionality features of the DB, the cost estimation tips the balance in favor of Azure SQL. However, this should only be seen as a first exploration of how many resources are behind a DTU in each service tier. It can be used as a basis for a new performance test, but should not be considered a fully reliable result of a performance benchmark.
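The estimate above can be traced step by step in Python. This sketch takes the 50x speed gap and the 130% jump as given assumptions from the text, uses the premium tier sizes and monthly prices from Tables 4 and 5, and picks the smallest premium tier that covers the estimated demand. The variable names are illustrative.

```python
S1_DTU = 20
SPEED_GAP = 50             # SQL Server assumed ~50x faster than Azure S1
PREMIUM_COST_JUMP = 1.30   # 130% higher cost per DTU, read as 2.3x the resources

standard_dtus_needed = S1_DTU * SPEED_GAP
premium_dtus_needed = standard_dtus_needed / (1 + PREMIUM_COST_JUMP)

premium_tiers = {"P1": 125, "P2": 250, "P4": 500, "P6": 1000}
# Smallest premium tier that covers the estimated demand.
target = min((dtu, name) for name, dtu in premium_tiers.items()
             if dtu >= premium_dtus_needed)[1]

P4_MONTHLY_EUR = 1467
ON_PREMISE_MONTHLY_EUR = 2135
saving = 1 - P4_MONTHLY_EUR / ON_PREMISE_MONTHLY_EUR

print(standard_dtus_needed)            # 1000
print(round(premium_dtus_needed))      # 435 (the text truncates to 434)
print(target)                          # P4
print(f"{saving:.0%}")                 # 31%
```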
2.6.3 Further considerations for database evaluation
Most conclusions so far were related to hardware specifications. But there are other factors, like the application design, which may also affect the performance. Therefore, we also need to keep the following questions in mind while designing a database:
• Are indexes created for the right set of attributes?
• Does the system avoid deadlocks and round-trips?
For these cases, Azure provides a performance monitor that is able to suggest new or better index structures.
Another important factor for our consideration is the cloud service landscape of Azure that comes with the Azure SQL database. In case the retail store considers developing an app or a website for its business, Azure would have all the necessary infrastructure already in place, whereas for SQL Server, all software and hardware for the development, testing and production environments would need to be bought. Additionally, as security is becoming more important and the technology landscape is changing at a rapid speed, it will be difficult for companies to keep up with the latest security patches, maintenance and technologies if the database is hosted on premise. In contrast, for a cloud database like Azure SQL, Microsoft takes care of all that and guarantees services and availability in the respective SLAs.
Last but not least, we mentioned in the introduction of the use case that the business plans to grow and thus will need to scale up its IT landscape accordingly, which is supported with one click for Azure SQL, but not for SQL Server.
3 Final conclusion
Summarizing the results obtained during our performance evaluation, it could clearly be seen that the resources (in terms of DTUs) assigned to the Standard S1 service tier were not sufficient to compete with our on-premise setup of SQL Server. However, as this is related to hardware resources and network capabilities, we cannot conclude that Azure SQL database in general performs worse than the SQL Server database. Rather, Azure SQL needs to be run on a higher service level in order to catch up with or even outperform SQL Server.
Nevertheless, taking into account the other benefits that come with the Azure SQL database, such as flexibility, scalability, availability, integration with other cloud services on the Azure platform and maintenance services, Azure SQL still remains a very attractive package. So, if a company wants to concentrate on its application instead of on hardware, maintenance, security, licenses, updates, troubleshooting, storage etc., then shifting its database to the cloud is a good option. For our use case, we suggest that the retail store owner opts for the cloud database and subscribes to a service tier which is suitable for the business. With this, the service tier can be adjusted dynamically whenever needed and the IT structure can be scaled up according to business growth.
Thinking even further, we are heading towards a world which is becoming ever more connected. Thus, it is indispensable for all businesses to think about the structure and data usage of their future company in order to invest, from the beginning, in the right technologies that are suitable in the long term.
4 Sources
(1) Cancila, M.; Toombs, D.; Waite, A. D.; Khnaser, E. 2017 Planning Guide for Cloud Computing. 2016, No. October 2016.
(2) IBM. What is a Cloud Database https://www.techopedia.com/definition/133/cloud-provider (accessed Dec 16, 2017).
(3) Microsoft. Directory Azure Services https://azure.microsoft.com/en-us/services/ (accessed Dec 16, 2017).
(4) Tupper, C. D. Distributed Databases http://linkinghub.elsevier.com/retrieve/pii/B978012385126000022X (accessed Dec 16, 2017).
(5) Microsoft. What is the Azure SQL Database service? | Microsoft Docs https://docs.microsoft.com/en-gb/azure/sql-database/sql-database-technical-overview (accessed Dec 16, 2017).
(6) Microsoft. SLA for SQL Database https://azure.microsoft.com/en-us/support/legal/sla/sql-database/v1_0/ (accessed Dec 16, 2017).
(7) Microsoft. Pricing - SQL Database https://azure.microsoft.com/en-us/pricing/details/sql-database/ (accessed Dec 16, 2017).
(8) DB-Engines. Relational DB-Engines Ranking https://db-engines.com/en/ranking/relational+dbms (accessed Dec 16, 2017).
(9) Microsoft. SQL Database Interfaces https://azure.microsoft.com/en-us/services/sql-database/ (accessed Dec 16, 2017).
(10) Unknown. Azure SQL Database Performance Benchmark https://cbailiss.wordpress.com/2015/01/31/azure-sql-database-v12-ga-performance-inc-cpu-benchmaring/ (accessed Dec 16, 2017).
(11) Leong, L.; Toombs, D.; Gill, B.; Petri, G.; Haynes, T. Magic Quadrant for Cloud Infrastructure as a Service. 2013, No. May, 25.
(12) Microsoft. SQL Database: What is a DTU? https://docs.microsoft.com/en-gb/azure/sql-database/sql-database-what-is-a-dtu (accessed Dec 16, 2017).
5 Appendix
Full result sets of the conducted performance benchmark tests in ms.

10000 ROWS
                                   AZURE SQL DATABASE                                 SQL DATABASE
ROUND      INSERT   UPDATE     JOIN   SELECT   SELECT   INSERT   UPDATE     JOIN   SELECT   SELECT
1             453      796       62      687       15       93       30       47       31        0
2             281      484       78     1016       30       46       31       15       62        2
3             187      141       78      750       30       93       31       31       15        0
4             405      125      158      829       15       62       46       31       47        1
5             156      110      439     1126       15       61       15       31       63        0
6             453      187       93     1360       31       94       30       31       46        5
7             203      171       47      876       14       62       41       15       31       15
8             327      421       78     1111       30       62       15       31       31       15
9             140      310       63      439       15       62       46       31       62       10
10            157      111       61      609       32       61       15       15       31        0
AVERAGE     276.2    285.6    115.6    880.3     22.7     69.6       30     30.8     41.9       11

1 MILLION ROWS
                                   AZURE SQL DATABASE                                 SQL DATABASE
ROUND      INSERT   UPDATE     JOIN   SELECT   SELECT   INSERT   UPDATE     JOIN   SELECT   SELECT
1           94999    78156    11036    51766      188     2670      672     1015     2596       62
2           95765    77907    10907    49509      745     4046      703     1065     2484       62
3           95453    77703    10238    55352      438     2461      686      969     2469       78
4           95467    78000    11625    51260      203     3377      687     1061     2083      109
5           95407    78328    12153    59648      469     2929     1093     1249     1953       62
6           95468    78301    12243    50511      469     2692      704     1218     2219       62
7           96907    78312    12407    53484      250     2916      671      984     3033       77
8           97344    78093    11156    53473      359     2703      687     1016     2192       62
9           97840    77454    11814    48164      126     3759      671     1032     2125       62
10          97302    77969    11133    48684      281     3175      671     1014     2019       62
AVERAGE   96195.2  78022.3  11471.2  52185.1    355.8   3072.8    724.5   1062.3   2317.3     69.8