Comparative Analysis of Cloud based Machine Learning Platforms
Amazon ML, Azure ML, Databricks Cloud
Third Eye Consulting Services & Solutions LLC. thirdeyecss.com | [email protected] | @thirdeyecss | 408-462-5257

Webinar - Comparative Analysis of Cloud based Machine Learning Platforms


Disclaimer

• Third Eye is a direct vendor to Microsoft, Amazon & Google.

• Third Eye has implemented numerous Big Data projects for them over the last 3 years.

• Third Eye is NOT a reseller of the cloud services of these companies.

• Third Eye does NOT financially benefit from making any of the following recommendations.

• This work is purely meant for a technical evaluation of the ML platforms and should not be construed for any other purposes.

Comparison Approach: What do we look for in an online Machine Learning Platform?

Data Preparation

• Data ingestion (out-of-the-box support of data sources) & data export
• Data cleaning, transformation, visualization

Data Selection

• Feature selection/engineering

Algorithms

• Which algorithms are supported out of the box? Modify or create new ones?
• Saving/comparing results

Optimize

• E.g. identify the optimal parameter settings for algorithms

Knowledge

Meet the Contestants

• Amazon ML
• Azure ML
• Databricks Cloud

Amazon ML

• A relatively limited entry in terms of capabilities/algorithms offered.
• Appears targeted at existing AWS customers who want to do some basic ML investigations without requiring significant expertise.

Amazon ML

• Supported capabilities are described in use-case terminology, as opposed to names of algorithms:

– Fraud detection
– Content personalization
– Document classification
– Customer churn prediction
– Relevancy modeling for marketing
– Recommendations

Amazon ML

• Available algorithms:
– Binary and multiclass classification
– Regression

• Limited or no customizability: the algorithms are already implemented and chosen for you, e.g. binary classification is implemented via logistic regression.
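
To make the logistic regression remark concrete, here is a minimal from-scratch sketch of a binary classifier trained by gradient descent. It is not Amazon ML's implementation; the function names, learning rate, epoch count, and toy data are all hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the average log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, x, threshold=0.5):
    """Return class 1 when the predicted probability reaches the threshold."""
    return 1 if sigmoid(w * x + b) >= threshold else 0

# Hypothetical toy data: one feature, small values labeled 0 and large values labeled 1.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
```

A hosted service that picks the algorithm for you hides exactly these choices (learning rate, iterations, threshold), which is the trade-off the slide describes.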

Amazon ML

Available Performance Metrics

• Binary AUC: The binary MLModel uses the Area Under the Curve (AUC) technique to measure performance.

• Regression RMSE: The regression MLModel uses the Root Mean Square Error (RMSE) technique to measure performance. RMSE measures the difference between predicted and actual values for a single variable.

• Multiclass Avg F-Score: The multiclass MLModel uses the F1 score technique to measure performance.
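
The three metrics above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the service's own code: the function names are hypothetical, AUC is computed by pairwise ranking (ties count as half), and the multiclass score assumes macro averaging, that is, an unweighted mean of per-class F1 scores.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def macro_f1(actual, predicted):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    classes = sorted(set(actual) | set(predicted))
    f1s = []
    for c in classes:
        tp = sum(1 for a, p in zip(actual, predicted) if a == c and p == c)
        fp = sum(1 for a, p in zip(actual, predicted) if a != c and p == c)
        fn = sum(1 for a, p in zip(actual, predicted) if a == c and p != c)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)
```

Lower RMSE is better, while AUC and F1 both range from 0 to 1 with higher values indicating a better model.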

Amazon ML

• Data ingestion/integration
– This is their strongest use case: easy integration with AWS storage media
  • S3, EBS
  • Redshift
  • RDS

Azure ML

• Introduced February this year

• But do not let its relative youthfulness be a distraction: this is a feature-rich offering

• Has a different approach: a more serious/rich set of algorithms and configurations is made available.

• Default/canned algorithms are still available for those newer to Machine Learning

Azure ML

• More comprehensive selection of representative algorithms
• Provides more selections for the algorithms as well as tuning knobs

Azure ML

• First-class usability:
– Tutorials
– Walkthroughs
– Videos
– Integrated Development Environment
  • Azure ML Studio
– Documentation

Azure ML

• Process tools
– Select the data processing, modeling, or prediction activity manually

Azure ML

• Or follow the suggested workflow

Azure ML

• The wizards are field-name and data-type aware

Azure ML

• Data Preparation stages


Azure ML

• Workflow visualization

Azure ML

• View prediction results

Azure ML

• Workflow entries allow viewing/setting detailed configuration/parameters

Azure ML

• Workflow entries allow ad-hoc operations

Azure ML

• Point-and-click access to useful/popular public datasets

Azure ML

• Support for the popular "Notebooks" structures

Azure ML: More choices

• Regression:
– Linear, Bayesian, Neural Network, Decision Forest, Boosted Decision Tree, Poisson

• Binary Classification:
– SVM, Perceptron, LR, Bayes, NNs, Decision Forest

• Multiclass:
– LR, NN, Decision Forest/Jungle, One-vs-All

• Anomaly Detection:
– SVM, PCA

• Clustering:
– K-means
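
As one illustration of an algorithm from the list above, the following is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It is not Azure ML's implementation; the function name, the choice to seed centroids with the first k points, the fixed iteration count, and the sample points are all assumptions made for the example.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, assignment

# Hypothetical sample: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, assignment = kmeans(points, k=2)
```

In a point-and-click tool, the number of clusters and the initialization strategy are typically the kind of settings exposed as tuning parameters.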

Azure ML: Available Algorithms

Databricks Cloud

• Spark has joined Hadoop as a de facto industry standard for distributed computing
• Rapidly approaching the popularity of Hadoop
– And supplanting it if/when organizations can make the switch
• Databricks is the spin-off of the Berkeley AMPLab, the original creators of Spark
• Databricks staff include a large fraction of the Spark core committers
• And an even larger proportion of the key decision makers/"shepherds"
– Including the spark.ml/MLlib shepherds
• Cloud-based availability of Spark, including Spark SQL and spark.ml
• Access to capabilities of Spark MLlib, Spark DataFrames/SQL, Streaming, and Resilient Distributed Datasets
• Notebooks approach: Scala, Python, Java, and R

Spark Ecosystem

Databricks Cloud

• The online offering was announced July 2014 at Spark Summit
• Purpose statement: ease of workflow management for Data Scientists

Databricks Cloud

• The Databricks Cloud approach: Notebooks
• R, Python, Java, Scala

Databricks Cloud: Notebooks

• Notebooks are Data Scientists' friends
• A standard/typically preferred approach to doing their work
– Experiment with data
– Perform ad-hoc visualizations
– Communicate/share results with colleagues
– Or even publish them

• Wide spectrum of sophistication levels available:
– Simply use existing libraries
– Develop new algorithms from scratch


Wrap-up/Summary

Three general types of approaches (not mutually exclusive)

Point and Click (as well as backend APIs)
• Amazon ML
• Azure ML

APIs-Only
• Google Prediction API

Notebooks
• Azure ML
• Databricks Cloud

Wrap-up/Summary

Amazon ML may be sufficient for:
– Customers that already have data residing in AWS
– Cases where simpler/fewer options are acceptable

Azure ML has a strong usability and workflow approach and provides a reasonable cross section of algorithms for casual & intermediate users

Databricks Cloud has the most comprehensive offering:
– Variety, performance, configurability of algorithms
– Richness of the capabilities of the Notebooks
– Options/configurability of the hosting clusters/environment

THANK YOU!

Ask your questions here: http://info.thirdeyecss.com/ask_your_question
