35
S91030 - Hybrid Machine Learning with the Kubeflow Pipelines and RAPIDS Sina Chavoshi

Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

S91030 - Hybrid Machine Learning with the Kubeflow Pipelines and RAPIDS

Sina Chavoshi

Page 2: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

The right approach for the right problem

Building blocks Platform Solutions

Cloud AI Strategy:

Page 3: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

The right approach for the right problem

Building blocks Platform Solutions

Cloud AI Strategy:

Page 4: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Building BlocksSight Language Conversation

Page 5: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

The right approach for the right problem

Building blocks Platform Solutions

Cloud AI Strategy:

Page 6: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Solutions / Contact Center

Customer

Phone

Chat

Contact Center Provider

Contact Center Interface

Virtual Agent

AgentAssist

Knowledge Base(PDF/HTML)

BackendFulfillment

Virtual Agent

Agent

Google Cloud Contact Center AI

Page 7: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

The right approach for the right problem

Building blocks Platform Solutions

Cloud AI Strategy:

Page 8: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Cloud AI PlatformData pipeline

Cloud Dataprep

BigQuery

Cloud Dataflow

Cloud Dataproc

Model development

Cloud ML Engine

Model deployment and management

Cloud ML Engine

CloudKubernetes Engine

Tools

Jupyter Notebooks

Services

ASL

Community

Kubeflow

Page 9: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Building & deploying real-life ML applications is hard and costly because of lack of tooling that covers end-to-end ML development & deployment.

Page 10: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

In addition to the actual ML...

ML Code

Page 11: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

You have to worry about so much more.

Configuration

Data Collection

Data Verification

Feature Extraction

Process Management Tools

Analysis Tools

Machine Resource

Management

Serving Infrastructure

Monitoring

ML Code

Source: Sculley et al.: Hidden Technical Debt in Machine Learning Systems

Page 12: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

02

03

AI problems today

Problems SolutionsDeploymentBrittle, opinionated infrastructure that is hard to productionize and breaks between cloud and on-prem

TalentMachine Learning expertise is scarce

CollaborationDifficult to find, leverage existing solutions

Reusable pipelines

01

02

03

Page 13: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

01: Kubeflow

Scalable ML services on Kubernetes

Easy to get started• Out-of-box support for top frameworks

– pytorch, caffe, tf and xgboost• Kubernetes manages dependencies, resources

Swappable & scalable• Library of ML services• GPU support• Massive scale

Meet customer where they are• GCP• On-prem with Cisco

Cloud

On-prem

Training

ML microservices

Predict

Training Predict

Page 14: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Product Overview

RAPIDS

Page 15: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

THE BIG PROBLEM IN DATA SCIENCE

All Data

ETL

Manage Data

Structured Data Store

Data Preparation

Training

Model Training Visualization

Evaluate

Scoring

Deploy

Slow Training Times for Data Scientists

Page 16: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

RAPIDS — OPEN GPU DATA SCIENCESoftware Stack Python

Data PreparationcuDF

Graph AnalyticscuGRAPH

Model TrainingcuML

CUDA

PYTHON

APACHE ARROW on GPU Memory

DAS

K/SP

ARK

DEEP LEARNINGFRAMEWORKS

CUDNN

RAPIDS

CUMLCUDF CUGRAPH

Page 17: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

BENCHMARKScuML — XGBoost End-to-End

cuIO/cuDF — Load and Data Preparation

Benchmark

200GB CSV dataset; Data preparation includes joins, variable transformations.

CPU Cluster Configuration

CPU nodes (61 GiB of memory, 8 vCPUs, 64-bit platform), Apache Spark

DGX Cluster Configuration

5x DGX-1 on InfiniBand network

Time in seconds — Shorter is better

cuIO / cuDF (Load and Data Preparation) Data Conversion XGBoost

Page 18: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

AI Hub & Pipelines: Fast & simple adoption of AI

5. PublishUpload & share pipelines running best

within your org or publicly.

1. Search & DiscoverFind best-of-breed solutions on the AI

Hub which leverage Cloud AI solutions

2. DeployQuick 1-click implementation of ML pipelines onto Google Cloud Platform .

4. Run in productionDeploy customized pipelines

in production.

3. CustomizeExperiment and adjustment out-of-the-box pipelines to custom use cases.

Network effect

The Flywheel of AI Adoption

Page 19: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

02: Reusable Pipelines

Enable developers to build custom ML applications by easily “stitching” and connecting various components.

• Reuse instead of reimplement or reinvent• Discover, learn and replicate successful pipelines

Page 20: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

What constitutes a Kubeflow Pipeline

● Containerized implementations of ML Tasks○ Containers provide portability, repeatability and

encapsulation○ A task can be single node or *distributed* ○ A containerized task can invoke other services

● Specification of the sequence of steps○ Specified via Python SDK

● Input Parameters○ A “Job” = Pipeline invoked w/ specific parameters

Page 21: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

03: AI Hub at a glance

All AI content in one placeQuick discovery of plug & play AI pipelines & other content built by teams across Google and by partners and customers.

Fast & simple implementation of AI on GCPOne-click deployment of AI pipelines via Kubeflow on GCP as the go-to platform for AI + hybrid & on premise.

Enterprise-grade internal & external sharingFoster reuse by sharing deployable AI pipelines & othercontent privately within organizations & publicly.

1

2

3

Page 22: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Mission

The one place for everything AI, from experimentation to production.

Page 23: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Public and private AI Hub

By Google

Unique AI assets by Google

By partners

Created, shared & monetized by anyone

By customers

Content shared securely within and with other organizations

Public content + Private content

AutoML, TPUs, Cloud AI Platform, etc.

Page 24: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Kubeflow Pipelines enable Workflow

orchestrationRapid reliable

experimentationShare, re-use &

compose

Page 25: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Demo

Page 26: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Visual depiction of pipeline topology

Page 27: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

View all current and historical runs, grouped as “Experiments”

Page 28: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Rich visualizations of metrics

Page 29: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Clone an existing pipeline

Page 30: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Access to all config params, inputs and outputs for each run

Page 31: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Update parameters and submit

Page 32: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Easy comparison of Runs

Page 33: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

Easy comparison of Runs

Page 34: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost
Page 35: Pipelines and RAPIDS Learning with the Kubeflow S91030 - Hybrid … · 2019-03-29 · DASK/SPARK DEEP LEARNING FRAMEWORKS CUDNN RAPIDS CUDF CUML CUGRAPH. BENCHMARKS cuML — XGBoost

That’s a wrap.