High Performance Computing with doAzureParallel
Using Azure as your Parallel-Backend for Embarrassingly Parallel Work
JS Tan, Microsoft

Azure Big Compute

Azure Infrastructure

• Commodity, most value for cost
• Fast processors, higher memory-to-core ratio, SSDs
• Most memory, Intel Xeon processors
• HPC / low-latency VMs for compute-intensive workloads
• GPU-enabled VMs for visualization/compute
• Fast processors, lower memory-to-core ratio, SSDs

What is Batch?

[Diagram: an app submits many individual tasks; the tasks are assigned to many computers/VMs]

Scenarios

• A quant back-testing portfolio strategies
• A data scientist optimizing their model and parameter tuning
• A life-science researcher doing genome sequencing

What do they have in common?

• Scale: computationally expensive work; need to scale up in order to get results back quickly
• Minimal IT management: the user is the domain specialist, not an IT specialist
• Elastic compute: temporary need for a lot of capacity
• Cost effective: low-cost strategies are important!

+ They are all probably using R…

doAzureParallel is...
An R package that uses Azure as a parallel backend for popular open-source tools such as foreach, caret, dplyr, etc.

foreach using doAzureParallel

foreach (i = 1:100) %dopar% {
  myParallelAlgorithm(...)
}

Microsoft Azure
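
The loop on this slide can be tried locally before any cluster exists. The following is a minimal sketch, assuming the `foreach` package is installed; it registers foreach's built-in sequential backend as a stand-in for Azure, and `my_parallel_algorithm` is a hypothetical placeholder for the slide's `myParallelAlgorithm(...)`.

```r
library(foreach)

# Stand-in backend so the sketch runs on one machine. With doAzureParallel,
# this registration step would point %dopar% at an Azure Batch cluster instead.
registerDoSEQ()

# Hypothetical placeholder for the slide's myParallelAlgorithm(...):
# each iteration is independent of the others (embarrassingly parallel).
my_parallel_algorithm <- function(i) {
  set.seed(i)          # per-iteration seed keeps results reproducible
  mean(rnorm(1000))
}

# The loop body is unchanged regardless of which backend is registered.
results <- foreach(i = 1:100, .combine = c) %dopar% {
  my_parallel_algorithm(i)
}

length(results)
```

Because the iterations share no state, the same loop can be handed to a different backend without editing its body.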

doAzureParallel on Azure Batch

Azure Batch is a platform service that provides easy job scheduling and cluster management, allowing applications or algorithms to run in parallel at scale.

• Capacity on demand; jobs on demand
• Autoscale (more on this later)
• Minimal cluster management (node failure, install, etc.)
• Hardware choice: use any VM size
• Pay by the minute
• Cost effective: no charge for using it, you only pay for the VMs
• More cost effective: low-priority VMs (more on this later)

If you want to run jobs using elastic compute, Batch is a great fit!

Scale

• From 1 to 10,000 VMs for a cluster
• From 1 to millions of tasks
• Your selection of hardware:
  • General compute VMs (A-Series/D-Series)
  • Memory/storage optimized (G-Series)
  • Compute optimized (F-Series)
  • GPU enabled (N-Series)
• Results from computing the Mandelbrot set when scaling up:

[Chart: local machine compared with 5, 10, and 20 parallel workers]
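
To make the Mandelbrot workload concrete, here is a small sketch of how such a computation can be split into independent tasks with `foreach`. It is an illustration only, not the benchmark behind the chart: the grid size, iteration cap, and the helper `mandelbrot_iters` are assumptions, and the sequential backend stands in for a real cluster.

```r
library(foreach)
registerDoSEQ()  # local stand-in; a cluster backend would be registered here

# Escape-time count for one complex point c0 (hypothetical helper).
mandelbrot_iters <- function(c0, max_iter = 50L) {
  z <- 0 + 0i
  for (k in seq_len(max_iter)) {
    z <- z * z + c0
    if (Mod(z) > 2) return(k)   # point escaped after k iterations
  }
  max_iter                      # treated as inside the set
}

xs <- seq(-2, 1, length.out = 60)      # real axis
ys <- seq(-1.5, 1.5, length.out = 40)  # imaginary axis

# One task per column of the image; columns do not depend on each other,
# so they can be spread over any number of workers.
grid <- foreach(x = xs, .combine = cbind) %dopar% {
  vapply(ys,
         function(y) mandelbrot_iters(complex(real = x, imaginary = y)),
         integer(1))
}

dim(grid)
```

Choosing the task granularity (one column here) is a design decision: fewer, larger tasks reduce scheduling overhead, while more, smaller tasks spread more evenly across workers.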

Minimal Cluster Management

• Abstract away complex Azure/cloud concepts
• Zero IT-level management
• Work entirely in RStudio
• Monitor/debug your jobs directly in RStudio
• Manage your cluster and multiple jobs directly in RStudio
• The results of your distributed, large-scale work can be returned directly to your R session

Minimal code change

• Minimal code change to use doAzureParallel
• Easy to use, and you can get started in just a few lines of code
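
As a rough idea of what "a few lines of code" can look like, the sketch below wraps the setup steps in a function so that nothing contacts Azure until it is called. The function names are given as I recall them from the package README and may differ between versions; the file names and the wrapper `run_on_azure` are hypothetical.

```r
# Sketch only: defining this function does not contact Azure.
# Calling it requires the doAzureParallel package, an Azure Batch account,
# and two JSON config files; the file names here are placeholders.
run_on_azure <- function(credentials_file = "credentials.json",
                         cluster_file = "cluster.json") {
  doAzureParallel::setCredentials(credentials_file)      # account keys
  cluster <- doAzureParallel::makeCluster(cluster_file)  # provision the pool
  on.exit(doAzureParallel::stopCluster(cluster))         # delete pool on exit
  doAzureParallel::registerDoAzureParallel(cluster)      # %dopar% backend

  `%dopar%` <- foreach::`%dopar%`
  foreach::foreach(i = 1:100) %dopar% {
    sqrt(i)  # placeholder for real work
  }
}
```

The loop itself is ordinary `foreach` code; only the registration lines are specific to Azure.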

Elastic Compute

• Compute on demand
• Create/delete your cluster as you need
• Autoscaling pool = maximizing cloud elasticity
• Long-running batch jobs / overnight
• Daily scheduled work: pre-provision the cluster so it's ready for you at the beginning of the day
• Bursty work
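
The idea behind an autoscaling pool can be illustrated with a toy rule: size the pool to the pending work, clamped between a minimum and a maximum. This is an assumption-laden sketch of the concept, not Azure Batch's actual autoscale formula; `target_nodes` and its parameters are hypothetical.

```r
# Toy autoscale rule (illustrative only, not Azure Batch's formula):
# request enough nodes for the queued tasks, within fixed bounds.
target_nodes <- function(pending_tasks, tasks_per_node, min_nodes, max_nodes) {
  needed <- ceiling(pending_tasks / tasks_per_node)
  max(min_nodes, min(max_nodes, needed))
}

idle  <- target_nodes(pending_tasks = 0,    tasks_per_node = 4, min_nodes = 1, max_nodes = 20)
busy  <- target_nodes(pending_tasks = 30,   tasks_per_node = 4, min_nodes = 1, max_nodes = 20)
burst <- target_nodes(pending_tasks = 1000, tasks_per_node = 4, min_nodes = 1, max_nodes = 20)

c(idle = idle, busy = busy, burst = burst)
```

With these inputs the pool shrinks to its minimum when idle, grows with the queue, and is capped at the maximum during a burst.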

Cost Effective

• Low-priority = (extremely) low costs
• Provisioning VMs from Azure's surplus capacity at up to an 80% discount
• Your Azure cluster can contain both regular (dedicated) VMs and low-priority VMs

[Diagram: my local R session connected to Azure Batch, which runs dedicated VMs and low-priority VMs at up to 80% discount]
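
The effect of mixing dedicated and low-priority VMs can be worked through with simple arithmetic. The sketch below assumes the full 80% discount applies and uses a made-up hourly price; `cluster_cost` and all numbers are hypothetical, not Azure pricing.

```r
# Cost of a mixed pool, assuming low-priority VMs get the full discount.
# price_per_vm_hour is a made-up figure, not an Azure price.
cluster_cost <- function(dedicated, low_priority, hours,
                         price_per_vm_hour, discount = 0.80) {
  dedicated_cost    <- dedicated * hours * price_per_vm_hour
  low_priority_cost <- low_priority * hours * price_per_vm_hour * (1 - discount)
  dedicated_cost + low_priority_cost
}

# Ten VMs for eight hours under three different mixes.
all_dedicated    <- cluster_cost(10, 0,  hours = 8, price_per_vm_hour = 0.10)
mixed            <- cluster_cost(2,  8,  hours = 8, price_per_vm_hour = 0.10)
all_low_priority <- cluster_cost(0,  10, hours = 8, price_per_vm_hour = 0.10)

c(all_dedicated = all_dedicated, mixed = mixed, all_low_priority = all_low_priority)
```

Keeping a couple of dedicated VMs costs more than going fully low-priority, but it guarantees some capacity if the low-priority nodes are taken back.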

Cost Effective: More about Low Priority

When should I use it?
• Long-running work that can be broken into smaller pieces, and work that doesn't have a strict time limit to complete
• Experimentation, testing, evaluating models

What you need to know when using it:
• Possibility that Azure
  • will not allocate your VMs, OR
  • that it will take some or all of the capacity back
• If a node is pre-empted
  • Azure Batch will replace your node for you
  • Azure Batch will reschedule your work so that your job can successfully complete
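
The rescheduling behaviour described above can be pictured with a toy simulation: a task whose node is pre-empted goes back on the queue and is tried again until it finishes. This only illustrates the idea and does not model how Azure Batch is implemented; the pre-emption probability and `run_with_requeue` are assumptions.

```r
set.seed(42)

# Toy model: each attempt is pre-empted with probability preempt_prob,
# in which case the task is put back at the end of the queue.
run_with_requeue <- function(n_tasks, preempt_prob) {
  pending  <- seq_len(n_tasks)
  attempts <- integer(n_tasks)
  while (length(pending) > 0) {
    task    <- pending[1]
    pending <- pending[-1]
    attempts[task] <- attempts[task] + 1L
    if (runif(1) < preempt_prob) {
      pending <- c(pending, task)   # node lost: reschedule the task
    }
  }
  attempts                          # attempts needed per task
}

attempts <- run_with_requeue(n_tasks = 20, preempt_prob = 0.3)
sum(attempts)
```

Every task eventually completes; pre-emption shows up only as extra attempts, which is why work without a strict deadline suits low-priority capacity.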

Low Priority Scenarios

[Three charts of Azure Batch pool capacity over time; legend: Dedicated, Low-priority, Preempted]

• Lowest cost
• Lower cost + guaranteed baseline capacity
• Lower cost + maintaining capacity with autoscale

Questions?

www.github.com/azure/doazureparallel

https://aka.ms/earl2017

What's new with doAzureParallel?

• Low-priority support
• Richer job management experience
• Resource files to preload data
• Parameter tuning integration with caret
• Simple connector to Azure Blob Storage

R + Azure Batch

So what R workloads work great on Azure Batch?
• Simulation-based work (VaR calculation, back-testing, Monte Carlo simulations, financial modelling)
• Parameter tuning / model evaluation (grid search, random search, cross-validation, etc.)
• Computing against data / ETL jobs / data-prep jobs

What industries/verticals might be interested in using this?
• Financial services
• Education & research
• Sports analytics
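
The parameter-tuning workload listed above maps naturally onto `foreach`, because every (parameter, fold) pair can be evaluated independently. The following is a minimal sketch under stated assumptions: it tunes a polynomial degree on the built-in `mtcars` data with 5-fold cross-validation, uses the sequential backend as a local stand-in, and the helper `design` is hypothetical.

```r
library(foreach)
registerDoSEQ()  # local stand-in; a cluster backend would be registered here

x <- mtcars$hp / 100   # rescaled predictor
y <- mtcars$mpg

set.seed(1)
folds <- sample(rep(1:5, length.out = length(y)))  # fold label per row
grid  <- expand.grid(degree = 1:3, fold = 1:5)     # one row per task

# Polynomial design matrix with columns x^0, x^1, ..., x^degree.
design <- function(x, degree) outer(x, 0:degree, `^`)

# Each grid row is an independent task: fit on four folds, score on the fifth.
cv_errors <- foreach(g = seq_len(nrow(grid)), .combine = rbind) %dopar% {
  degree   <- grid$degree[g]
  held_out <- folds == grid$fold[g]
  fit  <- lm.fit(design(x[!held_out], degree), y[!held_out])
  pred <- design(x[held_out], degree) %*% fit$coefficients
  data.frame(degree = degree, fold = grid$fold[g],
             mse = mean((y[held_out] - pred)^2))
}

# Average the held-out error per degree and pick the smallest.
mean_mse    <- aggregate(mse ~ degree, data = cv_errors, FUN = mean)
best_degree <- mean_mse$degree[which.min(mean_mse$mse)]
```

A larger grid simply adds rows to `grid`, which is what makes this kind of search a good fit for elastic capacity.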

doAzureParallel (since initial release)

• Initial release in March
• Grassroots strategy
• End-user focused
• Financial services targeted; key messaging has been around simulation-based work
• Interest from the field
• Feedback

[Diagram: my local R session connected to Azure Batch, which runs dedicated VMs and low-priority VMs at up to 80% discount]