DRIVESCALE-HDP REFERENCE ARCHITECTURE April 2017


Contents

1. Executive Summary
2. Audience and Scope
3. Glossary of Terms
4. DriveScale | Hortonworks Data Platform - Apache Hadoop Solution Overview
5. DriveScale Components Overview
   4.1 Hardware: DriveScale Adapter Chassis with DriveScale Controllers
   4.2 Software
   4.3 Conceptual diagram of DriveScale solution
6. Benefits of the DriveScale solution
7. Reference Architecture Details
   6.1 Physical Cluster Components and Configuration List
   6.2 Logical Cluster Topology
   6.3 Physical Cluster Topology
   6.4 Cluster Management
      6.4.1 Setting up DriveScale cluster
      6.4.2 Setting up Hortonworks Data Platform cluster
   6.5 Disk and Filesystem Layout
   6.6 OS Supportability/Compatibility Matrix
8. Rack Scalability
9. References
10. Bill of Materials
11. Conclusion

1. Executive Summary

This document is a high-level design reference architecture guide for implementing Hortonworks Data Platform on a DriveScale solution with industry-standard servers and JBODs. The reference architecture introduces all the high-level components, hardware, and software that are included in the stack. Each high-level component is then described individually. The reference architecture does not describe the Hortonworks Data Platform components or their applications.

DriveScale Technology Overview

DriveScale is leading the charge in bringing hyperscale computing capabilities to mainstream enterprises. Its composable data center architecture transforms rigid data centers into flexible and responsive scale-out deployments. Using DriveScale, data center administrators can deploy independent pools of commodity compute and storage resources, automatically discover available assets, and combine and recombine these resources as needed. The solution is provided through a set of on-premises and SaaS tools that coordinate between multiple levels of infrastructure. With DriveScale, Hadoop architects can more easily support Hadoop deployments of any size as well as other modern application workloads.

DriveScale provides hardware and software technology that allows separate deployment of compute and storage using commodity diskless servers and JBODs (Just a Bunch of Disks), with flexible binding of storage-to-compute resources in any ratio required by an application. As needs change, these bindings can be dissolved and reconfigured on demand, all under software control.

DriveScale technology acquires a deep understanding of the physical infrastructure and dynamics of a data center, which it uses to provide an integrated set of intelligence and automation tools that scale out data center infrastructure and greatly simplify and optimize the data center’s operations.

Hortonworks Data Platform Technology Overview

HDP is the industry's only true secure, enterprise-ready open source Apache™ Hadoop® distribution based on a centralized architecture (YARN). YARN (Yet Another Resource Negotiator) allocates resources among various applications and maximizes data ingestion by enabling enterprises to analyze data to support diverse use cases. YARN coordinates cluster-wide services for operations, data governance and security. HDP is interoperable with a broad ecosystem of data center and cloud providers, and provides centralized management and monitoring of clusters. With HDP, security and data governance are built into the platform. HDP addresses the complete needs of data-at-rest, powers real-time customer applications and delivers robust analytics that accelerate decision making and innovation.

2. Audience and Scope

This reference architecture guide is for Hadoop and IT architects who are responsible for the design and deployment of Apache Hadoop solutions on premises, as well as for Apache Hadoop administrators and architects and data center architects/engineers who collaborate with specialists in that space.

3. Glossary of Terms

Term | Description

DataNode | Worker nodes of the cluster to which the HDFS data is written.

HBA | Host Bus Adapter. An I/O controller that is used to interface a host with storage devices.

HDD | Hard Disk Drive.

HDFS | Apache™ Hadoop® Distributed File System.

High Availability (HA) | Configuration that addresses availability issues in a cluster. In a standard configuration, the NameNode is a single point of failure (SPOF). Each cluster has a single NameNode, and if that machine or process becomes unavailable, the cluster as a whole is unavailable until the NameNode is either restarted or brought up on a new host. The secondary NameNode does not provide failover capability. High Availability enables running two NameNodes in the same cluster: the active NameNode and the standby NameNode. The standby NameNode allows a fast failover to a new NameNode in case of machine crash or planned maintenance.

JBOD | Just a Bunch of Disks. JBOD is an alternative to using a RAID configuration. Rather than configuring drives to use a RAID level, the disks within the array are either spanned or treated as independent disks.

JobHistory Server | Process that archives job metrics and metadata. One per cluster.

NameNode | The metadata master of HDFS, essential for the integrity and proper functioning of the distributed filesystem.

NodeManager | The process that starts application processes and manages resources on the DataNodes.

NIC | Network Interface Card.

HDP™ | Hortonworks® Data Platform (this includes the Hadoop Distributed File System, HDFS).

PDU | Power Distribution Unit.

NTP | Network Time Protocol.

YARN | Yet Another Resource Negotiator, which is a software rewrite that decouples MapReduce's resource management and scheduling capabilities from the data processing component, enabling Hadoop to support more varied processing approaches and a broader array of applications. Included in HDP.

OS | Operating System.

RM | ResourceManager. The resource management component of YARN. This initiates application startup and controls scheduling on the DataNodes of the cluster (one instance per cluster).

ToR | Top of Rack.

ZK | ZooKeeper. A centralized service for maintaining configuration information, naming, and providing distributed synchronization and group services.

DSC | DriveScale Central. A web-based user interface to the DriveScale cloud that performs DriveScale account management. DSC is where you find and download the keys to enable installation of the DriveScale software, and then set up your DriveScale Management Domain(s) (DMDs).

DMD | DriveScale Management Domain(s). This is where you create your domain, select and configure the DMS nodes for the domain, and select a chassis (with its associated DriveScale Adapters, DSAs) for the domain.

DMS | DriveScale Management Server. This is the server that runs the bundle of software (service) that manages a set of physical resources to enable the DriveScale services. DriveScale Manager is the web-based user interface to the DMS.

DSA chassis | DriveScale Adapter chassis. A 1RU chassis that hosts 4 Ethernet-to-SAS controllers, serving as a bridge between the 10 Gbps Ethernet that connects compute resources and the JBODs full of commodity disks.

DSA controller | DriveScale Adapter controller. An Ethernet-to-SAS controller serving as a bridge between the 10 Gbps Ethernet that connects compute resources and the JBODs full of commodity disks.

MLAG | Multi-chassis Link Aggregation. MLAG is the ability of two or more switches to act as a single switch when forming link bundles.

4. DriveScale | Hortonworks Data Platform - Apache Hadoop Solution Overview

The DriveScale | Hortonworks Data Platform (HDP) solution is designed to address the changing requirements from customers for a more flexible and dynamic hardware infrastructure that provides significant cost and operational benefits. It is designed with composability as the primary goal, saving money, improving utilization and greatly simplifying the deployment of Hadoop clusters.

Hadoop is an Apache project being developed in the Java programming language by a global community of contributors. Yahoo! has been the largest contributor to this project, and uses Apache Hadoop extensively across its businesses. Core committers on the Hadoop project include employees from Cloudera, eBay, Facebook, Getopt, Hortonworks, Huawei, IBM, InMobi, INRIA, LinkedIn, MapR, Microsoft, Pivotal, Twitter, UC Berkeley, VMware, WANdisco, and Yahoo!, with contributions from many more individuals and organizations.

Although Hadoop is popular and widely used, installing, configuring, and running a production Hadoop cluster involves many considerations, including:

• Choosing the appropriate Hadoop software distribution and extensions
• Installing monitoring and management software
• Allocation of Hadoop services to physical nodes
• Selection of appropriate server hardware
• Rightsizing the storage configuration
• Implementing data locality
• Design of the network fabric
• Sizing and system scalability
• Overall performance

This is complicated by the need to understand the workloads that will be running on the cluster, the fast-moving pace of the core Hadoop project, and the challenges of managing a system designed to scale to thousands of nodes in a single instance.

Together, the DriveScale | Hortonworks Data Platform solution embodies all the hardware, software, resources and services needed to run Hadoop in a production environment. This end-to-end approach means that you can be in production with Hadoop in a shorter time than is typically possible with homegrown solutions.

To deliver the compute and storage power a data center needs, this solution is based on HDP (a secure, enterprise-ready open source Apache Hadoop distribution built on a centralized YARN architecture), DriveScale hardware and software, industry-standard servers, network switches, and JBODs built from commodity disk drives.

This solution includes components that span the entire solution stack:

• Reference architecture and best practices
• Optimized storage configurations
• Optimized network infrastructure
• HDP

It is designed to address the vast majority of Apache Hadoop use cases including, but not limited to:

• Big data analytics
• ETL offload
• Data warehouse optimization
• Batch processing of unstructured data
• Big data visualization
• Search and predictive analysis

5. DriveScale Components Overview

The DriveScale system is composed of one hardware component and four software components:

4.1 Hardware: DriveScale Adapter Chassis with DriveScale Controllers

This is a 1U appliance with adapters that connect to servers via 10 Gb Ethernet interfaces and to JBODs via SAS interfaces.

Figure 1: DriveScale Adapter

4.2 Software

There are four principal components of the DriveScale software:

a) DriveScale Management Server (DMS)

• The server running the DMS software bundle is called the DMS node.
• A typical deployment consists of three DMS systems clustered for high availability (HA).
• The software manages and configures resources and contains the inventory/configuration information repository and database:
  o Inventory: DMSs, DS Adapters, switches, JBOD chassis, disks, server nodes
  o Configuration: node templates, cluster templates, configured clusters
  o DMS Database: used as a message bus to communicate with the endpoints.

b) DriveScale Server Agent

• The DriveScale Server Agent discovery action provides inventory for hardware and servers, and creates mappings between server nodes and the disks they consume.

c) DriveScale Central (DSC)

Cloud-based software management portal that acts as the:
  o software distribution repository for subscribers
  o DriveScale keys repository
  o centralized log file repository
  o user documentation repository
  o license manager

d) DriveScale Adapter Firmware

• At the heart of the cluster where the DMS is running, the firmware on the processor enables the JBODs to be mapped to the servers and used as local drives.

4.3 Conceptual diagram of DriveScale solution

Figure 2: DriveScale Cluster components overview

6. Benefits of the DriveScale solution

The DriveScale solution for next-generation scale-out architecture disaggregates servers in clusters into pools of independent compute and storage resources. DriveScale disaggregates storage from compute nodes at the rack level, allowing data centers to buy commodity diskless server nodes and JBODs from their preferred vendors and install them in racks. DriveScale’s advanced software manages the orchestration of servers and clusters from the pools of drives and compute nodes through a GUI built on a RESTful API. Administrators can provision, decommission, and re-provision servers and clusters dynamically, as needed. With DriveScale, the minimum cluster size is the minimum size of a Hadoop cluster.

• Software Defined or Elastic infrastructure for Hadoop cluster
  With the DriveScale solution, all the compute and storage resources are shared and can be deployed at will. Administrators can provision clusters in minutes instead of days and get significantly better utilization of their hardware by using DriveScale’s Infrastructure Provisioning and Management tool. Administrators can build a single resource pool or multiple small resource pools for Hadoop applications. These resource pools can be modified on demand to quickly respond to changing workload needs.

• Integration
  DriveScale’s solution works seamlessly with the data center’s existing best-in-class or commodity server and JBOD technology.

• Scalability
  With the DriveScale solution, the data center can start with a small cluster with few compute nodes and a single JBOD, and later scale storage and compute separately as the cluster starts to run out of resources, without causing any disruption.

• Improved utilization
  With the DriveScale solution, a data center can replace the servers and drives separately, thereby maximizing the lifetime of the hardware infrastructure.

• Simplify everything
  Customers can choose to keep their existing best-in-class equipment, or add commodity hardware to build a composable Hadoop infrastructure that can easily be modified. Adding or removing compute or storage capacity is done with just a few clicks.

7. Reference Architecture Details

6.1 Physical Cluster Components and Configuration List

The following table lists the physical components for the cluster.

Component | Configuration | Description | Quantity

DriveScale Adapter Chassis | DHCP, jumbo frame enabled | 1U appliance with adapters that connect to servers via Ethernet, and to JBODs via SAS. | 1

DriveScale Adapter controller | DHCP, jumbo frame enabled | Provides the data network. | 4 for each chassis

DriveScale Management Server (DMS) | DMS running on a VM | Manages and configures the nodes and DriveScale cluster and also stores the inventory/configuration repository of all hardware in the cluster. | Min 1; for HA, 3 DMSs should be configured as master and slave

Servers | 2-socket CPU and memory per the individual Hadoop cluster requirements | Commodity x86 servers that house all the NodeManager, compute instances and DriveScale agents. | Min 1 NameNode + 3 DataNodes

HDD for Servers | 2 drives configured in RAID 1 | The internal drives are used for OS install. | 2 for each server

NICs | Dual-port 10 Gbps Ethernet NICs. The connector type depends on the network design; could be SFP+ or Twinax. | Provides the data network. | Min 1 for each server

JBOD | Default configuration | Houses the drives, with dual IO controllers. | Min 1

HDD for JBOD | Default configuration | Drives to house the data for the cluster. | Depending on the cluster requirements

ToR 10G switch | LLDP, MLAG, Jumbo Frame 9K configured | Provides data network connectivity. | 2 for each rack

ToR 1G switch | Default configuration | Provides management network connectivity. | 1 for each rack

Ambari Server | Ambari server running on a VM | Manages, configures and monitors the HDP Hadoop cluster. | 1 for each environment
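The jumbo-frame setting listed in the table above can be spot-checked on a Linux server before the cluster is built. The following is a minimal sketch and not part of the DriveScale or HDP tooling: it reads the MTU that Linux exposes under /sys/class/net/<interface>/mtu and reports interfaces below a threshold. The 9000-byte threshold (one common reading of "9K") and the interface names are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: "Jumbo Frame 9K" is read here as an MTU of at least 9000 bytes.
JUMBO_MTU = 9000


def read_mtu(iface, sysfs_root="/sys/class/net"):
    """Return the MTU that Linux reports for one network interface."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def interfaces_without_jumbo(ifaces, sysfs_root="/sys/class/net"):
    """Return the interfaces whose MTU is below the jumbo-frame threshold."""
    return [i for i in ifaces if read_mtu(i, sysfs_root) < JUMBO_MTU]


if __name__ == "__main__":
    # Hypothetical interface names; substitute the data-network NIC ports.
    for name in ("eth0", "eth1"):
        try:
            print(name, read_mtu(name))
        except FileNotFoundError:
            print(name, "not present on this host")
```

The same check would have to be repeated on the switch side using the switch vendor's own tools, which this sketch does not cover.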

6.2 Logical Cluster Topology

The minimum requirements to build out the cluster are:

● 1 NameNode
● 4 DataNodes
● 1 DriveScale Adapter Chassis
● 1 DriveScale Management Server (DMS)
● 2 10G switches
● 1 1G switch
● 1 JBOD chassis with drives
● 1 Ambari server
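As an illustration, the minimum list above can be written down as data and compared against a planned inventory. This is only a sketch: the dictionary keys and the `shortfalls` helper are hypothetical names chosen for this example, and the counts are transcribed from the list.

```python
# Minimum component counts, transcribed from the list above.
# The key names are hypothetical labels chosen for this example.
MINIMUMS = {
    "namenode": 1,
    "datanode": 4,
    "dsa_chassis": 1,
    "dms": 1,
    "switch_10g": 2,
    "switch_1g": 1,
    "jbod": 1,
    "ambari_server": 1,
}


def shortfalls(inventory):
    """Return {component: missing_count} for every unmet minimum."""
    return {
        name: need - inventory.get(name, 0)
        for name, need in MINIMUMS.items()
        if inventory.get(name, 0) < need
    }


if __name__ == "__main__":
    planned = dict(MINIMUMS, datanode=3, switch_10g=1)
    print(shortfalls(planned))
```

An empty result means the planned inventory meets every minimum; it says nothing about whether the hardware is sized correctly for a given workload.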

This reference architecture is built on 1 NameNode and 4 DataNodes with 1 JBOD and 60 drives of 1, 2 or 3 TB HDD. The following table lists the configurations of the servers and number of drives used.

Component | Configuration | Description | Quantity

NameNode | 2-socket 20-core CPU, 256 GB RAM, 10 GbE Intel NIC with 2 internal HDDs for OS and 4 high-capacity HDDs mounted from the JBOD. | NameNode hosts the HDP name services and DriveScale agents. | 1

DataNodes | 2-socket 16-core CPU, 256 GB RAM, 10 GbE Intel NIC with 2 internal HDDs for OS and 8 high-capacity HDDs mounted from the JBOD. | DataNodes house the HDFS DataNodes and YARN NodeManagers, any additional required services and DriveScale agents. | 4

Notes:
- Customers with higher (or lower) compute needs can acquire bigger (or smaller) data nodes configured with CPU and memory that fit the specific requirements of their applications.
- Similarly, depending on the data requirements, customers can add or remove disk drives to match the specific needs of their applications.
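The drive counts in this section can be turned into a rough capacity estimate. The sketch below is an illustration under stated assumptions, not a sizing tool: it counts only the drives mounted on the DataNodes as HDFS capacity, uses decimal terabytes, ignores filesystem and operating system overhead, and assumes the HDFS default replication factor of 3, which this document does not specify.

```python
# Drive allocation taken from section 6.2 of this document.
JBOD_DRIVES = 60
NAMENODES, DRIVES_PER_NAMENODE = 1, 4
DATANODES, DRIVES_PER_DATANODE = 4, 8

# Assumption: HDFS default replication factor; not stated in this document.
HDFS_REPLICATION = 3


def drives_in_use():
    """Number of JBOD drives bound to the NameNode and DataNodes."""
    return NAMENODES * DRIVES_PER_NAMENODE + DATANODES * DRIVES_PER_DATANODE


def hdfs_capacity_tb(drive_tb, replication=HDFS_REPLICATION):
    """Return (raw_tb, usable_tb) counting DataNode drives only."""
    raw = DATANODES * DRIVES_PER_DATANODE * drive_tb
    return raw, raw / replication


if __name__ == "__main__":
    used = drives_in_use()
    print(f"{used} of {JBOD_DRIVES} JBOD drives bound, {JBOD_DRIVES - used} unassigned")
    for size_tb in (1, 2, 3):
        raw, usable = hdfs_capacity_tb(size_tb)
        print(f"{size_tb} TB drives: raw {raw} TB, about {usable:.1f} TB after replication")
```

With the numbers in this section, 36 of the 60 drives are bound to servers, which leaves the remaining drives in the JBOD available for later assignment.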

6.3 Physical Cluster Topology

Figure 3: DriveScale lab architecture with 1x DSA Chassis (4x Adapters in use), 1x JBOD, 1 NameNode and 4 DataNodes

6.4 Cluster Management

This section details the steps for setting up a DriveScale-enabled Hadoop cluster using Ambari Server.

6.4.1 Setting up DriveScale cluster

Before installing Ambari Server or using an existing install of Ambari Server, you must complete the following tasks for setting up the DriveScale solution:

1. Rack and install the DriveScale Adapter chassis and controllers (DSAs) using the documentation provided by DriveScale.
2. Rack and install the JBOD using the documentation provided by the vendor.
3. Rack and install the servers using the documentation provided by the vendor.
4. Create a RAID 1 configuration for the internal HDD on the server and install the OS on all the other servers.
5. Install and configure DriveScale Management Server (DMS) either as a VM or on a standalone server.
6. Set up DSA configuration from the DMS.
7. Install and configure DriveScale agents on the master and data nodes.
8. Create master/data node and cluster templates with the required drives using DMS.
9. Create the cluster from the template using DMS.
10. Ensure that the DriveScale cluster is up and running before proceeding.

Figure 4: Physical components overview from DMS UI

Figure 5: Logical cluster status from DMS UI

Figure 6: Logical cluster details overview from DMS UI

Figure 7: Logical cluster server details from DMS UI

6.4.2 Setting up Hortonworks Data Platform cluster

1. After the successful completion of the steps above, install Ambari Server using the HDP getting-ready-to-install and installation guides.
2. The following services were set up for this reference architecture.

Figure 8: Installed Hadoop services details from Ambari UI Dashboard

3. Ensure that the name and data nodes are up and running with the right assigned roles and storage.

Figure 9: Hosts and roles overview from Ambari UI Hosts section

6.5 Disk and Filesystem Layout

Node/Role | Disk and Filesystem Layout | Description

Management/Master | Ext4 | 1/2/3 TB drives are mounted from the JBODs

YARN NodeManager nodes | Ext4 | 1/2/3 TB drives are mounted from the JBODs
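One way to confirm the layout above on a running node is to inspect the kernel's mount table. The sketch below parses text in the /proc/mounts format (device, mount point, filesystem type, options) and lists data mounts that are not ext4. The `/data` mount-point prefix is a hypothetical convention used only for this example; this document does not say where the JBOD drives are mounted.

```python
from pathlib import Path


def non_ext4_data_mounts(mounts_text, prefix="/data"):
    """Return (mount_point, fs_type) for data mounts that are not ext4.

    mounts_text is text in /proc/mounts format. The prefix is a
    hypothetical mount-point convention, not one defined by this document.
    """
    offenders = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type = fields[:3]
        if mount_point.startswith(prefix) and fs_type != "ext4":
            offenders.append((mount_point, fs_type))
    return offenders


if __name__ == "__main__":
    mounts = Path("/proc/mounts")
    if mounts.exists():
        print(non_ext4_data_mounts(mounts.read_text()))
```

An empty list means every mount under the chosen prefix is ext4; it does not verify that the expected number of drives is mounted.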

6.6 OS Supportability/Compatibility Matrix

OS | DMS | Server Nodes

CentOS/RHEL 6.x | X | X

CentOS/RHEL 7.x | X | X

Ubuntu 14.04 | X | X
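The matrix above is small enough to encode as a lookup for use in a pre-install check. This is a sketch only: the role labels `dms` and `server` and the `is_supported` helper are hypothetical names, and the matching rules (major version for CentOS/RHEL, release number for Ubuntu) are this example's interpretation of "6.x", "7.x" and "14.04".

```python
# Rows of the compatibility matrix above; each value is the set of roles marked "X".
# "dms" and "server" are hypothetical labels for the two matrix columns.
SUPPORTED = {
    ("centos", "6"): {"dms", "server"},
    ("rhel", "6"): {"dms", "server"},
    ("centos", "7"): {"dms", "server"},
    ("rhel", "7"): {"dms", "server"},
    ("ubuntu", "14.04"): {"dms", "server"},
}


def is_supported(distro, version, role):
    """Return True if the matrix marks this OS release for the given role."""
    distro = distro.lower()
    parts = version.split(".")
    # Ubuntu rows match on the release (e.g. 14.04); CentOS/RHEL on the major version.
    key = ".".join(parts[:2]) if distro == "ubuntu" else parts[0]
    return role in SUPPORTED.get((distro, key), set())


if __name__ == "__main__":
    print(is_supported("CentOS", "6.7", "dms"))
    print(is_supported("Ubuntu", "16.04", "server"))
```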

8. Rack Scalability

Customers can scale beyond one rack in a straightforward manner to expand their compute and storage resources as application needs grow. Customers can change or maintain the compute-to-storage ratio for the new racks or an existing rack. For every new JBOD addition, a new DriveScale Adapter chassis with four controllers must be added as well. Since drives are assigned from within the rack to servers in the rack, scaling is achieved by simply adding more racks with servers, DriveScale Adapters, switches and JBODs.
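The scaling rule in this section lends itself to simple arithmetic. The sketch below combines it with the per-rack switch counts from the component table in section 6.1 (two 10G ToR switches and one 1G ToR switch for each rack). The `RackPlan` type, the `plan` function and the `jbods_per_rack` parameter are hypothetical names for this illustration; server counts are left out because they depend on the workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackPlan:
    racks: int
    jbods: int
    dsa_chassis: int
    dsa_controllers: int
    tor_10g_switches: int
    tor_1g_switches: int


def plan(racks, jbods_per_rack=1):
    """Count infrastructure for a multi-rack build-out.

    Rules taken from this document: one DriveScale Adapter chassis with
    four controllers per JBOD, two 10G and one 1G ToR switch per rack.
    jbods_per_rack is a hypothetical planning parameter.
    """
    if racks < 1 or jbods_per_rack < 1:
        raise ValueError("racks and jbods_per_rack must be at least 1")
    jbods = racks * jbods_per_rack
    return RackPlan(
        racks=racks,
        jbods=jbods,
        dsa_chassis=jbods,
        dsa_controllers=jbods * 4,
        tor_10g_switches=racks * 2,
        tor_1g_switches=racks,
    )


if __name__ == "__main__":
    print(plan(racks=3))
```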

Figure 10: Hosts and roles overview from Ambari UI Hosts section

9. References

1. HDP installation prerequisites: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_installing_manually_book/content/ch_getting_ready_chapter.html

2. Automated Install with Ambari: https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.1.0/bk_Installing_HDP_AMB/content/index.html

3. DriveScale documentation for racking and installation, which is provided by DriveScale

4. YARN definition: http://searchdatamanagement.techtarget.com

10. Bill of Materials

Server Components | Quantity

Intel E5-2665, 2.40 GHz, 16C CPU | 2

16 GB DIMMs | 16

Intel 10 GbE SFP+ NIC | 1

JBOD Components | Quantity

QUANTA JB4602 JBOD | 1

IO Controllers | 2

Seagate HDD | 60

Switch | Quantity

CISCO NEXUS 5548 10 GbE switch | 2

D-Link DGs-1518-28 1 GbE switch | 1

DriveScale Components | Quantity

DriveScale Adapter Chassis | 1

DriveScale Adapter | 4

Software | Version

CentOS | 6.7

DriveScale Adapter | 1.2.0.1

HDP | 2.3

11. Conclusion

The DriveScale-HDP solution reference architecture guide is designed to provide an overview of the combined solutions and the key components employed. The reference architecture also outlines the advantages of compute and storage disaggregation with the DriveScale-HDP solution.

About DriveScale

DriveScale is leading the charge in bringing hyperscale computing capabilities to mainstream enterprises. Its composable data center architecture transforms rigid data centers into flexible and responsive scale-out deployments. Using DriveScale, data center administrators can deploy independent pools of commodity compute and storage resources, automatically discover available assets, and combine and recombine these resources as needed. The solution is provided via a set of on-premises and SaaS tools that coordinate between multiple levels of infrastructure. With DriveScale, companies can more easily support Hadoop deployments of any size as well as other modern application workloads. DriveScale was founded by a team with deep roots in IT architecture that has built enterprise-class systems such as Cisco UCS and Sun UltraSparc. Based in Sunnyvale, California, the company was founded in 2013. Investors include Pelion Venture Partners, Nautilus Venture Partners and Ingrasys, a wholly owned subsidiary of Foxconn. For more information, visit www.drivescale.com or follow us on Twitter at @DriveScale_Inc.

About Hortonworks

Hortonworks is a leading innovator in the industry, creating, distributing and supporting enterprise-ready open data platforms and modern data applications. Our mission is to manage the world’s data. We have a single-minded focus on driving innovation in open source communities such as Apache Hadoop, NiFi, and Spark. We, along with our 1600+ partners, provide the expertise, training and services that allow our customers to unlock transformational value for their organizations across any line of business. Our connected data platforms power modern data applications that deliver actionable intelligence from all data: data-in-motion and data-at-rest. Visit us at hortonworks.com. We are Powering the Future of Data™.