ML Workshop 1: A New Architecture for Machine Learning Logistics

© 2017 MapR Technologies 1

Machine Learning Model Management

The working of the rendezvous framework

Contact Information

Ted Dunning, PhD

Chief Application Architect, MapR Technologies

Committer, PMC member, board member, ASF

O’Reilly author

Twitter @Ted_Dunning

Traditional View

Traditional View: This isn’t the whole story

90% of the effort in successful machine learning isn’t in the training or model dev…

It’s the logistics

Rendezvous Architecture

[Diagram: request → Input → Model 1, Model 2, Model 3 → Scores → Rendezvous → Results → response]

What We Ultimately Want

[Diagram: request → Model → response]

But This Isn’t The Answer

[Diagram: request → Load balancer → Model 1, Model 2, Model 3 → response]

First Try with Streams

[Diagram: request → Input → Model 1, Model 2, Model 3 → response?]

First Rendezvous

[Diagram: request → Input → Model 1, Model 2, Model 3 → Scores → Rendezvous → Results → response]

Some Key Points

• Note that all models see identical inputs

• All models run in production setting

• All models send scores to same stream

• The rendezvous server decides which scores to ignore

• Roll forward, roll back, correlated comparison are all now trivial
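
The points above can be sketched in a few lines. This is a minimal illustration, not the framework's actual code: the three lambda "models", the record layout, and the `rendezvous` helper are assumptions made up for the example. It shows every model scoring the identical request, all scores landing in one stream, and roll forward reduced to changing which score the rendezvous keeps.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-ins for three deployed models (assumptions for illustration).
MODELS: Dict[str, Callable[[dict], float]] = {
    "model-1": lambda request: 0.10 * request["x"],
    "model-2": lambda request: 0.20 * request["x"],
    "model-3": lambda request: 0.30 * request["x"],
}


def score_all(request: dict) -> List[dict]:
    """Every model reads the identical request; all scores go to one stream (a list here)."""
    return [
        {"messageId": request["messageId"], "model": name, "score": model(request)}
        for name, model in MODELS.items()
    ]


def rendezvous(scores: List[dict], preferred: str) -> Optional[dict]:
    """Keep the preferred model's score and ignore the others."""
    for record in scores:
        if record["model"] == preferred:
            return record
    return None


request = {"messageId": "m-1", "x": 10.0}
scores = score_all(request)

live = rendezvous(scores, preferred="model-1")
# Rolling forward is only a change of preference; model-2 was already running.
rolled_forward = rendezvous(scores, preferred="model-2")
```

Because model-2 was already scoring production traffic, switching to it (or back) needs no deployment step, only a different choice in the rendezvous.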

Reality Check, Injecting External State

[Diagram: request → Raw → Add external data (drawing on Database and The world) → Input → Model 1, Model 2, Model 3]

Recording Raw Data (as it really was)

[Diagram: Input → Decoy, Model 2, Model 3; Model 2 and Model 3 → Scores; Decoy → Archive]

Quality & Reproducibility of Input Data is Important!

• Recording raw-ish data is really a big deal

– Data as seen by a model is worth gold

– Data reconstructed later often has time-machine leaks

– Databases were made for updates, streams are safer

• Raw data is useful for non-ML cases as well (think flexibility)

• Decoy model records training data as seen by models under development & evaluation
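
A decoy can be very small. The sketch below is an assumption-laden illustration rather than the deck's implementation: the `Decoy` class, its `score` method, and the use of a Python list as the archive are all invented for the example. It shows why recording at the moment of scoring matters: a later update to the same record does not change what was archived.

```python
import json
from typing import List


class Decoy:
    """Sits where a model would, but only records the inputs it is given."""

    def __init__(self, archive: List[str]) -> None:
        # A plain list stands in for the archive stream or file (an assumption).
        self.archive = archive

    def score(self, message: dict) -> None:
        # Serialize right away so later changes to `message` cannot rewrite history.
        self.archive.append(json.dumps(message, sort_keys=True))


archive: List[str] = []
decoy = Decoy(archive)

message = {"messageId": "m-1", "balance": 100}
decoy.score(message)
message["balance"] = 250  # a later update, as a database row might receive

recorded = json.loads(archive[0])  # still shows the balance the models saw
```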

Canary for Comparison

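[Diagram: Input feeds the Real model, the Canary, and the Decoy; the Real model produces the Result; ∆ marks the comparison between Real model and Canary; the Decoy writes to the Archive]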

What Does the Canary Do?

• The canary is a real model, but is very rarely updated

• The canary results are almost never used for decisioning

• The virtue of the canary is stability

• Comparing to the canary results gives insight into new models
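
One way to picture the comparison is as a per-request difference between a candidate model and the canary. The function names and the sample scores below are assumptions for illustration only; any real comparison would use whatever metric suits the model.

```python
from statistics import mean
from typing import Dict, List


def canary_deltas(canary: List[float], candidate: List[float]) -> List[float]:
    """Per-request difference (candidate minus canary) on identical inputs."""
    if len(canary) != len(candidate):
        raise ValueError("scores must be paired request by request")
    return [new - old for old, new in zip(canary, candidate)]


def summarize(deltas: List[float]) -> Dict[str, float]:
    """Two simple summaries of how far the candidate drifts from the canary."""
    return {
        "mean_delta": mean(deltas),
        "max_abs_delta": max(abs(d) for d in deltas),
    }


# Made-up scores for three requests seen by both models.
canary_scores = [0.5, 0.25, 0.75]
candidate_scores = [0.5, 0.5, 0.25]

deltas = canary_deltas(canary_scores, candidate_scores)
summary = summarize(deltas)
```

Because the canary rarely changes, a shift in these summaries points at the new model or at the input data, not at the baseline.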

Isolated Development With Stream Replication

[Diagram, Production: request → Raw → Add external data (The world) → Input → Model 1, Model 2, Model 3, with Internal 1, Internal 2, Internal 3]

[Diagram, Development: replicated Raw → New external data → Input → Model 4, with Internal 4]

[Diagram, built up over three slides]

Step 1: Raw → Features / profiles → Input → Decoy, m1, m2, m3; Decoy → Archive; m1, m2, m3 → Scores

Step 2: adds Scores → Rendezvous → Results

Step 3: adds two Metrics outputs

Some Details

• Inside the rendezvous server

– Message contents … highlight return address

– Rendezvous mailbox

– Schedule ideas

• Inside a model container

– Identical inputs make scaling easy

– Nearly stateless models

– Streaming shims, latency rig

Message Content

• Input request contains request data plus administrivia

{
  timestamp: 1501020498314,
  messageId: "2a5f2b61fdd848d7954a51b49c2a9e2c",
  return: "proxy-217",
  provenance: { ... },
  diagnostics: { ... },
  ... application specific data here ...
}
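
A message of this shape could be assembled as follows. This is a sketch under assumptions: the `make_request` helper is hypothetical, the timestamp is taken to be milliseconds since the epoch, and the message ID is taken to be a random UUID in hex form, which matches the look of the example above but is not confirmed by it.

```python
import time
import uuid


def make_request(return_address: str, payload: dict) -> dict:
    """Wrap application data with the administrative fields shown above."""
    return {
        **payload,                               # application-specific data
        "timestamp": int(time.time() * 1000),    # assumed: milliseconds since epoch
        "messageId": uuid.uuid4().hex,           # assumed: 32 hex characters
        "return": return_address,                # where the response should be sent
        "provenance": {},                        # left empty in this sketch
        "diagnostics": {},                       # left empty in this sketch
    }


msg = make_request("proxy-217", {"amount": 12})
```

The administrative keys are written after the payload so that application data cannot overwrite the return address or message ID.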

Rendezvous Schedules

• Simple part

– Up to deadline, accept preferred models

– Up to next deadline, accept more models

– Near final deadline, accept default answer

• But also some probabilistic choice

• And also consider external experimental control

– Inject as external state

– Use in rendezvous to select model result

– Open question how much power to expose
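
The simple part of a schedule can be sketched as a list of deadlines, each widening the set of acceptable models, with a default answer at the end. The deadlines, model names, default answer, and the `choose` function are all assumptions chosen for illustration; probabilistic choice and external experimental control are left out.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical schedule: (deadline in ms, models acceptable before that deadline),
# listed in order of preference.
SCHEDULE: List[Tuple[int, List[str]]] = [
    (50, ["model-1"]),
    (80, ["model-1", "model-2"]),
]
FINAL_DEADLINE_MS = 100
DEFAULT_ANSWER = {"model": "default", "score": 0.0}


def choose(elapsed_ms: int, arrived: Dict[str, float]) -> Optional[dict]:
    """Return a result to send now, or None to keep waiting."""
    # Past the last tier, keep accepting the widest set of models.
    acceptable = SCHEDULE[-1][1]
    for deadline, models in SCHEDULE:
        if elapsed_ms < deadline:
            acceptable = models
            break
    for name in acceptable:
        if name in arrived:
            return {"model": name, "score": arrived[name]}
    if elapsed_ms >= FINAL_DEADLINE_MS:
        return dict(DEFAULT_ANSWER)  # nothing usable arrived in time
    return None


early = choose(10, {"model-2": 0.7})                   # model-2 not yet acceptable
later = choose(60, {"model-2": 0.7})                   # second tier accepts model-2
both = choose(60, {"model-1": 0.4, "model-2": 0.7})    # preferred model wins
timeout = choose(100, {})                              # fall back to the default
```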

The rendezvous server is simpler than it looks at first

Model Life Cycle

• Developer / modeler produces container spec

– And uses this to build their development article

• QA inspects container spec

– And uses this to build a test article

• Security inspects container spec

– And uses this to build final artifact

• Important to use tools like Grafeas to inspect supply chain

http://bit.ly/grafeas

• Important that each step be inspectable

Almost all of the framework scales by trivial parallelism

Scaling Up

• Note about streams

– At millions of updates per server, the streams aren’t part of the streaming question

• Scaling up state injection

– Partition raw input, replicate state injector

– Beware external throughput limits

– State injection does avoid duplicate queries

• Scaling up models

– Stateless models allow trivial scaling

– Sequence state typically also trivial to scale

• Scaling up the rendezvous

– Match partition on raw and scores

– Replicate trivially
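
Matching partitions on raw and scores can be illustrated by routing both streams with the same function of the message ID, so that each rendezvous replica sees the request and all of its scores. The helper names, the partition count, and the choice of CRC32 as the hash are assumptions for this sketch, not details from the deck.

```python
import zlib
from collections import defaultdict
from typing import Dict, List


def partition_for(message_id: str, num_partitions: int) -> int:
    """Stable partition choice; crc32 gives the same answer in every process."""
    return zlib.crc32(message_id.encode("utf-8")) % num_partitions


def route(records: List[dict], num_partitions: int) -> Dict[int, List[dict]]:
    """Group records by the partition their messageId maps to."""
    routed: Dict[int, List[dict]] = defaultdict(list)
    for record in records:
        routed[partition_for(record["messageId"], num_partitions)].append(record)
    return routed


NUM_PARTITIONS = 4
raw = [{"messageId": f"m-{i}"} for i in range(20)]
scores = [{"messageId": f"m-{i}", "model": "m1", "score": 0.5} for i in range(20)]

# Both streams use the same routing, so partition p of raw lines up with
# partition p of scores and one rendezvous replica can own each partition.
raw_by_partition = route(raw, NUM_PARTITIONS)
scores_by_partition = route(scores, NUM_PARTITIONS)
```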

In-place update of the framework via modified Chandy-Lamport

Transition Message

[Diagram sequence over three slides showing Raw, Features / profiles, and Input; a second Features / profiles box appears in the second and third slides]

Summary: This is easy-ish

Well, it isn’t real hard

First Rendezvous

[Diagram: request → Input → Model 1, Model 2, Model 3 → Scores → Rendezvous → Results → response]

Additional Resources

O’Reilly report by Ted Dunning & Ellen Friedman © March 2017

Read free courtesy of MapR:

https://mapr.com/geo-distribution-big-data-and-analytics/

O’Reilly book by Ted Dunning & Ellen Friedman © March 2016

Read free courtesy of MapR:

https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/

O’Reilly book by Ted Dunning & Ellen Friedman © June 2014

Read free courtesy of MapR:

https://mapr.com/practical-machine-learning-new-look-anomaly-detection/

O’Reilly book by Ellen Friedman & Ted Dunning © February 2014

Read free courtesy of MapR:

https://mapr.com/practical-machine-learning/

by Ellen Friedman 8 Aug 2017 on MapR blog:

https://mapr.com/blog/tensorflow-mxnet-caffe-h2o-which-ml-best/

by Ted Dunning 13 Sept 2017 in InfoWorld:

https://www.infoworld.com/article/3223688/machine-learning/machine-learning-skills-for-software-engineers.html

New book: Machine Learning Logistics: Model Management in the Real World

O’Reilly book by Ellen Friedman & Ted Dunning © Sept 2017

Download free from MapR

http://info.mapr.com/2017_Content_Machine-Learning-Logistics_eBook_Prereg_RegistrationPage.html

Going to Strata Data NYC? Book will be released 26 Sept 2017:

Visit MapR booth for free book signings or to talk about logistics

Please support women in tech – help build girls’ dreams of what they can accomplish

© Ellen Friedman 2015

#womenintech #datawomen

Q&A

@mapr

ENGAGE WITH US

@Ted_Dunning