© 2017 MapR Technologies 1
Machine Learning Model Management
The working of the rendezvous framework
Contact Information
Ted Dunning, PhD
Chief Application Architect, MapR Technologies
Committer, PMC member, board member, ASF
O’Reilly author
Twitter @Ted_Dunning
90% of the effort in successful machine learning isn’t in the training or model dev…
It’s the logistics
Rendezvous Architecture
[Diagram elements: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, response, Results]
But This Isn’t The Answer
[Diagram elements: request, Load balancer, Model 1, Model 2, Model 3, response]
First Rendezvous
[Diagram elements: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, response, Results]
Some Key Points
• Note that all models see identical inputs
• All models run in a production setting
• All models send scores to same stream
• The rendezvous server decides which scores to ignore
• Roll forward, roll back, correlated comparison are all now trivial
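The points above can be sketched in a few lines. This is a minimal illustration, not the talk's implementation: the `Score` record, the `score_all` and `rendezvous` helpers, and the toy lambda models are all hypothetical names, and a plain list stands in for the scores stream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    message_id: str   # ties the score back to the original request
    model: str        # which model produced this score
    value: float


def score_all(message_id, features, models):
    """Run every model on the identical input and collect the scores in one
    list, which stands in for the shared scores stream."""
    return [Score(message_id, name, fn(features)) for name, fn in models.items()]


def rendezvous(scores, preference):
    """Return the score from the most preferred model that answered.
    All other scores are ignored for the response."""
    by_model = {s.model: s for s in scores}
    for name in preference:
        if name in by_model:
            return by_model[name]
    return None


# Toy models (assumptions for illustration only).
models = {
    "model-1": lambda x: 0.1 * x,
    "model-2": lambda x: 0.2 * x,
    "model-3": lambda x: 0.3 * x,
}
scores = score_all("req-1", 10.0, models)

current = rendezvous(scores, ["model-2", "model-1"])      # model-2 is live
rolled_back = rendezvous(scores, ["model-1", "model-2"])  # roll back: reorder only
```

Because every model has already scored the same request, rolling forward or back in this sketch is only a change to the preference list; no model has to be redeployed.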
Reality Check, Injecting External State
[Diagram elements: request, Raw, Add external data, Input, Database, The world, Model 1, Model 2, Model 3]
Recording Raw Data (as it really was)
[Diagram elements: Input, Decoy, Model 2, Model 3, Scores, Archive]
Quality & Reproducibility of Input Data is Important!
• Recording raw-ish data is really a big deal
– Data as seen by a model is worth gold
– Data reconstructed later often has time-machine leaks
– Databases were made for updates, streams are safer
• Raw data is useful for non-ML cases as well (think flexibility)
• Decoy model records training data as seen by models under
development & evaluation
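A decoy can be sketched as something that consumes inputs like any other model but only archives them. The `Decoy` class, the `replay` helper, and the `features`/`balance` fields below are assumptions made for illustration; an in-memory text buffer stands in for the archive.

```python
import io
import json


class Decoy:
    """Looks like a model to the input stream but produces no score.
    It only archives each input exactly as the models saw it."""

    def __init__(self, archive):
        self.archive = archive  # any writable text file-like object

    def score(self, message):
        # Serialize immediately so later changes to the data cannot leak in.
        self.archive.write(json.dumps(message, sort_keys=True) + "\n")
        return None  # a decoy never contributes a score


def replay(archive_text):
    """Read archived inputs back for training or evaluation."""
    return [json.loads(line) for line in archive_text.splitlines() if line]


archive = io.StringIO()
decoy = Decoy(archive)
msg = {"messageId": "a1", "timestamp": 1501020498314, "features": {"balance": 42.0}}
decoy.score(msg)
msg["features"]["balance"] = 99.0  # a later update must not change the record

recorded = replay(archive.getvalue())
```

The replayed record still holds the value the models saw at scoring time, which is the property that reconstructing inputs from an updated database tends to lose.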
What Does the Canary Do?
• The canary is a real model, but is very rarely updated
• The canary results are almost never used for decisioning
• The virtue of the canary is stability
• Comparing to the canary results gives insight into new models
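One way to compare against the canary is to pair scores by request and measure how far each model sits from the canary on the same inputs. The sketch below assumes scores arrive as `(message_id, model, value)` tuples and uses mean absolute difference; both the tuple layout and the choice of metric are illustrative assumptions.

```python
def compare_to_canary(scores, canary="canary"):
    """scores: iterable of (message_id, model, value) tuples.
    Returns {model: mean absolute difference from the canary},
    computed only over requests that the canary also scored."""
    by_request = {}
    for message_id, model, value in scores:
        by_request.setdefault(message_id, {})[model] = value

    totals, counts = {}, {}
    for per_model in by_request.values():
        if canary not in per_model:
            continue  # nothing to compare against for this request
        for model, value in per_model.items():
            if model == canary:
                continue
            totals[model] = totals.get(model, 0.0) + abs(value - per_model[canary])
            counts[model] = counts.get(model, 0) + 1
    return {m: totals[m] / counts[m] for m in totals}


scores = [
    ("r1", "canary", 0.50), ("r1", "m2", 0.75),
    ("r2", "canary", 0.25), ("r2", "m2", 0.50),
    ("r3", "m2", 0.90),  # no canary score for r3, so it is skipped
]
drift = compare_to_canary(scores)
```

Since the canary rarely changes, a shift in this number over time points at the new model or at the input data rather than at the baseline.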
Isolated Development With Stream Replication
[Diagram elements. Production: request, Raw, Add external data, Input, Model 1, Model 2, Model 3, Internal 1, Internal 2, Internal 3, The world. Development: Raw, New external data, Input, Model 4, Internal 4]
[Diagram elements: Input, Raw, Features / profiles, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results]
[Diagram elements: Input, Raw, Features / profiles, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results, Metrics (two instances)]
Some Details
• Inside the rendezvous server
– Message contents … highlight return address
– Rendezvous mailbox
– Schedule ideas
• Inside a model container
– Identical inputs make scaling easy
– Nearly stateless models
– Streaming shims, latency rig
Message Content
• Input request contains request data plus administrivia
{
  timestamp: 1501020498314,
  messageId: "2a5f2b61fdd848d7954a51b49c2a9e2c",
  return: "proxy-217",
  provenance: { ... },
  diagnostics: { ... },
  ... application specific data here ...
}
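A request envelope with these fields might be built as follows. The `make_request` helper and the `amount` payload field are hypothetical. The sketch also assumes the timestamp is milliseconds since the epoch and the message id is a 32-character hex string, which is what the sample values suggest but the slide does not state.

```python
import json
import time
import uuid


def make_request(return_address, payload, provenance=None, diagnostics=None):
    """Wrap application data with the bookkeeping fields from the slide."""
    message = {
        "timestamp": int(time.time() * 1000),  # assumed: milliseconds since epoch
        "messageId": uuid.uuid4().hex,          # 32 hex characters
        "return": return_address,               # where the response should go
        "provenance": provenance or {},
        "diagnostics": diagnostics or {},
    }
    # Application-specific fields; in this sketch they may overwrite the
    # bookkeeping keys above if the names collide.
    message.update(payload)
    return message


request = make_request("proxy-217", {"amount": 12.5})
encoded = json.dumps(request)
```

The return address travels with the message, so whichever component finally answers knows where to send the response without any shared session state.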
Rendezvous Schedules
• Simple part
– Up to deadline, accept preferred models
– Up to next deadline, accept more models
– Near final deadline, accept default answer
• But also some probabilistic choice
• And also consider external experimental control
– Inject as external state
– Use in rendezvous to select model result
– Open question how much power to expose
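The simple part of the schedule can be sketched as a staged lookup. The `choose` function, the millisecond deadlines, and the model names are assumptions for illustration; probabilistic choice and external experimental control are left out.

```python
def choose(arrived, elapsed_ms, schedule, default):
    """arrived: {model: score} received so far for one request.
    schedule: list of (deadline_ms, acceptable_models) with ascending
    deadlines, each later stage accepting a wider set of models.
    Returns (model, score), or None to keep waiting."""
    for deadline_ms, acceptable in schedule:
        if elapsed_ms <= deadline_ms:
            for model in acceptable:      # listed in order of preference
                if model in arrived:
                    return model, arrived[model]
            return None                   # nothing acceptable yet, keep waiting
    return "default", default             # past the final deadline


schedule = [
    (50, ["m1"]),              # up to 50 ms accept only the preferred model
    (80, ["m1", "m2", "m3"]),  # up to 80 ms accept more models
]

early = choose({"m2": 0.4}, 30, schedule, default=0.0)   # m2 not acceptable yet
later = choose({"m2": 0.4}, 60, schedule, default=0.0)   # m2 now acceptable
too_late = choose({}, 90, schedule, default=0.0)         # fall back to default
```

A slow preferred model therefore degrades the answer in steps rather than delaying the response past its deadline.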
Model Life Cycle
• Developer / modeler produces container spec
– And uses this to build their development article
• QA inspects container spec
– And uses this to build a test article
• Security inspects container spec
– And uses this to build final artifact
• Important to use tools like Grafeas to inspect supply chain
http://bit.ly/grafeas
• Important that each step be inspectable
Scaling Up
• Note about streams
– At millions of updates per server, the streams aren’t part of the streaming question
• Scaling up state injection
– Partition raw input, replicate state injector
– Beware external throughput limits
– State injection does avoid duplicate queries
• Scaling up models
– Stateless models allow trivial scaling
– Sequence state typically also trivial to scale
• Scaling up the rendezvous
– Match partition on raw and scores
– Replicate trivially
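Matching partitions on raw and scores can be sketched by keying both streams on the message id with the same stable hash. The `partition_for` and `route` helpers and the use of CRC32 are assumptions for illustration; a real stream platform would apply its own partitioner.

```python
import zlib


def partition_for(message_id, num_partitions):
    """Stable partition assignment: the same messageId always maps to the
    same partition, whichever stream the record is written to."""
    return zlib.crc32(message_id.encode("utf-8")) % num_partitions


def route(records, num_partitions):
    """Group records (dicts with a 'messageId' key) by partition."""
    partitions = {p: [] for p in range(num_partitions)}
    for record in records:
        partitions[partition_for(record["messageId"], num_partitions)].append(record)
    return partitions


raw = [{"messageId": f"req-{i}"} for i in range(100)]
scores = [{"messageId": f"req-{i}", "model": m}
          for i in range(100) for m in ("m1", "m2")]

raw_parts = route(raw, 4)
score_parts = route(scores, 4)
```

With this layout a rendezvous replica that owns partition p sees the raw request and every score for it, so replicas never need to talk to each other.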
Additional Resources
O’Reilly report by Ted Dunning & Ellen Friedman © March 2017
Read free courtesy of MapR:
https://mapr.com/geo-distribution-big-data-and-analytics/
O’Reilly book by Ted Dunning & Ellen Friedman
© March 2016
Read free courtesy of MapR:
https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/
Additional Resources
O’Reilly book by Ted Dunning & Ellen Friedman
© June 2014
Read free courtesy of MapR:
https://mapr.com/practical-machine-learning-new-look-anomaly-detection/
O’Reilly book by Ellen Friedman & Ted Dunning
© February 2014
Read free courtesy of MapR:
https://mapr.com/practical-machine-learning/
Additional Resources
by Ellen Friedman 8 Aug 2017 on MapR blog:
https://mapr.com/blog/tensorflow-mxnet-caffe-h2o-which-ml-best/
by Ted Dunning 13 Sept 2017 in InfoWorld:
https://www.infoworld.com/article/3223688/machine-learning/machine-learning-skills-for-software-engineers.html
New book: Machine Learning Logistics
Model Management in the Real World
O’Reilly book by Ellen Friedman & Ted Dunning © Sept 2017
Download free from MapR
http://info.mapr.com/2017_Content_Machine-Learning-Logistics_eBook_Prereg_RegistrationPage.html
Going to Strata Data NYC? Book will be released 26 Sept 2017:
Visit MapR booth for free book signings or to talk about logistics
Please support women in tech – help build
girls’ dreams of what they can accomplish
© Ellen Friedman 2015 #womenintech #datawomen