40
Simplifying ML Workflows with Apache Beam & TensorFlow Extended Tyler Akidau @takidau Software Engineer at Google Apache Beam PMC +

Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Simplifying ML Workflows with Apache Beam & TensorFlow ExtendedTyler Akidau@takidau

Software Engineer at GoogleApache Beam PMC

+

Page 2: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Apache BeamPortable data-processing pipelines

+

Page 3: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Example pipelines

+

Java

Python

Page 4: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Cross-language Portability Framework

+

Language B SDK

Language A SDK

Language C SDK

Runner 1 Runner 3Runner 2

The Beam Model

Language A Language CLanguage B

The Beam Model

Page 5: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Python compatible runners

+

Direct runner (local machine):

Google Cloud Dataflow:

Apache Flink:

Apache Spark:

Now

Now

Q2-Q3

Q3-Q4

Page 6: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

TensorFlow ExtendedEnd-to-end machine learning in production

+

Page 7: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

“Doing ML in production is hard.”

-Everyone who has ever tried

Page 8: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Because, in addition to the actual ML...

ML Code

+

Page 9: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

...you have to worry about so much more.

Configuration

Data Collection

Data Verification

Feature Extraction

Process Management Tools

Analysis Tools

Machine Resource

Management

Serving Infrastructure

Monitoring

Source: Sculley et al.: Hidden Technical Debt in Machine Learning Systems

ML Code

+

Page 10: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

In this talk, I will...

+

Page 11: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

In this talk, I will...

Show you how to apply transformations...

TensorFlowTransform

Show you how to apply transformations...

+

Page 12: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

In this talk, we will...

Show you how to apply transformations... ... consistently between Training and Serving

TensorFlowTransform

TensorFlowEstimators

TensorFlowServing

+

Page 13: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

In this talk, we will...

Introduce something new...

TensorFlowTransform

TensorFlowEstimators

TensorFlowServing

TensorFlowModel

Analysis

+

Page 14: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

TensorFlow TransformConsistent In-Graph Transformations in Training and Serving

+

Page 15: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Typical ML Pipeline

batch processing

During training

“live” processing

During serving

data request

+

Page 16: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Typical ML Pipeline

batch processing

During training

“live” processing

During serving

data request

+

Page 17: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

TensorFlow Transform

tf.Transform batch processing

During training

transform as tf.Graph

During serving

data request

+

Page 18: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Defining a preprocessing function in TF Transform

def preprocessing_fn(inputs): x = inputs['X'] ... return { "A": tft.bucketize( tft.normalize(x) * y), "B": tensorflow_fn(y, z), "C": tft.ngrams(z) }

Many operations available for dealing with text and numeric features, user can define their own.

X Y Z

+

Page 19: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Defining a preprocessing function in TF Transform

def preprocessing_fn(inputs): x = inputs['X'] ... return { "A": tft.bucketize( tft.normalize(x) * y), "B": tensorflow_fn(y, z), "C": tft.ngrams(z) }

Many operations available for dealing with text and numeric features, user can define their own.

X Y Z

mean stddev

normalize

+

Page 20: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Defining a preprocessing function in TF Transform

def preprocessing_fn(inputs): x = inputs['X'] ... return { "A": tft.bucketize( tft.normalize(x) * y), "B": tensorflow_fn(y, z), "C": tft.ngrams(z) }

Many operations available for dealing with text and numeric features, user can define their own.

X Y Z

mean stddev

normalizemultiply

+

Page 21: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

X Y Z

mean stddev

normalizemultiply

Defining a preprocessing function in TF Transform

def preprocessing_fn(inputs): x = inputs['X'] ... return { "A": tft.bucketize( tft.normalize(x) * y), "B": tensorflow_fn(y, z), "C": tft.ngrams(z) }

Many operations available for dealing with text and numeric features, user can define their own.

quantiles

bucketize

A+

Page 22: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

X Y Z

mean stddev

normalizemultiply

quantiles

bucketize

A

Defining a preprocessing function in TF Transform

def preprocessing_fn(inputs): x = inputs['X'] ... return { "A": tft.bucketize( tft.normalize(x) * y), "B": tensorflow_fn(y, z), "C": tft.ngrams(z) }

Many operations available for dealing with text and numeric features, user can define their own.

B+

Page 23: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

X Y Z

mean stddev

normalizemultiply

quantiles

bucketize

A B

Defining a preprocessing function in TF Transform

def preprocessing_fn(inputs): x = inputs['X'] ... return { "A": tft.bucketize( tft.normalize(x) * y), "B": tensorflow_fn(y, z), "C": tft.ngrams(z) }

Many operations available for dealing with text and numeric features, user can define their own.

C+

Page 24: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

mean stddev

normalizemultiply

quantiles

bucketize

Analyzers

Reduce (full pass)

Implemented as a distributed data pipeline

Transforms

Instance-to-instance (don’t change batch dimension)

Pure TensorFlow

+

Page 25: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Analyzenormalize

multiply

bucketize

constant tensors

data

mean stddev

normalizemultiply

quantiles

bucketize

Page 26: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

What can be done with TF Transform?

Pretty much anything.

tf.Transform batch processing

+

Page 27: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

What can be done with TF Transform?

Anything that can be expressed as a TensorFlow GraphPretty much anything.

tf.Transform batch processing Serving Graph

+

Page 28: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Scale to ... Bag of Words / N-Grams

Bucketization Feature Crosses

Some common use-cases...

Page 29: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Apply another TensorFlow Model

Some common use-cases...

Scale to ... Bag of Words / N-Grams

Bucketization Feature Crosses

Page 30: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

github.com/tensorflow/transform

Page 31: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

TensorFlow Model AnalysisScaleable, sliced, and full-pass metrics

Introducing…

+

Page 32: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

Let’s Talk about Metrics...

● How accurate?● Converged model?● What about my TB sized eval set?● Slices / subsets?● Across model versions?

+

Page 33: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

ML Fairness: analyzing model mistakes by subgroup

Specificity (False Positive Rate)

Sens

itivi

ty (T

rue

Posi

tive

Rate

)

ROC CurveAll groups

Learn more at ml-fairness.com

Page 34: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

ML Fairness: analyzing model mistakes by subgroup

Specificity (False Positive Rate)

Sens

itivi

ty (T

rue

Posi

tive

Rate

)

ROC CurveAll groups

Group A

Group B

Learn more at ml-fairness.com

Page 35: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

ML Fairness: understand the failure modes of your models

Page 36: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

ML Fairness: Learn More

ml-fairness.com

Page 37: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

How does it work?

...estimator = DNNLinearCombinedClassifier(...)estimator.train(...)

estimator.export_savedmodel( serving_input_receiver_fn=serving_input_fn)

tfma.export.export_eval_savedmodel ( estimator=estimator, eval_input_receiver_fn=eval_input_fn)...

Inference Graph (SavedModel)

SignatureDef

+

Page 38: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

...estimator = DNNLinearCombinedClassifier(...)estimator.train(...)

estimator.export_savedmodel( serving_input_receiver_fn=serving_input_fn)

tfma.export.export_eval_savedmodel ( estimator=estimator, eval_input_receiver_fn=eval_input_fn)...

How does it work?

Eval Graph (SavedModel)

SignatureDefEval Metadata

Inference Graph (SavedModel)

SignatureDef

+

Page 39: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

github.com/tensorflow/model-analysis

Page 40: Simplifying ML Workflows with Apache Beam & …...Summary Apache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise

SummaryApache Beam: Data-processing framework the runs locally and scales to massive data, in the Cloud (now) and soon on-premise via Flink (Q2-Q3) and Spark (Q3-Q4). Powers large-scale data processing in the TF libraries below.

tf.Transform: Consistent in-graph transformations in training and serving.

tf.ModelAnalysis: Scalable, sliced, and full-pass metrics.

+