Managing Machine Learning
David Murgatroyd - VP, Engineering
@dmurga
Problem: I can’t fully specify the behavior I want.
Solution: Machine Learning
Where does machine learning fit in the technology universe?
Valuable
... a star of the Data Science orchestra. - John Mount, Win-Vector

Central
... the new algorithms ... at the heart of most of what computer science does. - Hal Daumé III, U. Maryland Professor

Last Resort
... for cases when the desired behavior cannot be effectively expressed in software logic without dependency on external data. - D. Sculley et al., Google
Where does machine learning fit in developing technology?
[Diagram: development board with columns “Stuff to do”, “Stuff to do now” and “Demonstrable Value”]
How does machine learning affect value demonstration?
● Distill business goal into a repeatable, balanced metric.
● Measure on the most representative data you can get.
● Distinguish intrinsic errors from implementation bugs.
● Let your customer override the model when they absolutely must get some answer.
Demonstrable Value
Distill business goal into a repeatable, balanced metric.
Business goals in our example:
● fewer incorrect candidates sent to analysts for review
● no increased volume of work for analysts
● confidence to help analysts prioritize

Example metric: area under an error trade-off curve based on confidence, constrained to a maximum volume. Sometimes called an ‘overall evaluation criterion’ (OEC).

Note that the more skewed the OEC (e.g., if the number of positives varies by day and season), the more samples are required to be sure of statistical significance.
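The slide names the metric but not its computation, so here is a minimal sketch of one way such an OEC could be computed; the function, argument names and exact curve are illustrative assumptions, not the talk’s definition:

```python
import numpy as np

def oec(confidences, labels, max_volume):
    """Hypothetical OEC: error-vs-volume trade-off swept over a confidence
    threshold, truncated at the analysts' capacity.
    labels: 1 = correct candidate, 0 = incorrect. Lower is better."""
    order = np.argsort(-np.asarray(confidences, dtype=float))  # most confident first
    wrong = (np.asarray(labels)[order] == 0).astype(float)
    volumes = np.arange(1, len(wrong) + 1)
    error_rate = np.cumsum(wrong) / volumes   # error rate among the top-k sent
    keep = volumes <= max_volume              # business constraint: max volume
    # Mean error rate over all volumes up to the cap -- a discrete
    # 'area under the trade-off curve', normalized to [0, 1].
    return error_rate[keep].mean()
```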
Measure on the most representative data you can get.
Considerations when selecting data:
● online vs. offline: A/B test in production with feature flags (one or two variables at a time, agile-y) vs. a stable data set
● implicit vs. explicit: implicit can correlate more with value but omits unseen states
● broad vs. targeted: if explicitly annotating, consider targeting based on diagnostic value or on where systems disagree

Resist the temptation to ‘clean’ data -- you may kill it. Instead, include normalization in your model, as in the sketch below.
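For instance, in scikit-learn terms (an illustration, not the talk’s stack), normalization can live inside the model object, so raw data flows in unchanged and the identical transformation runs at train and serve time:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Normalization is a step inside the model rather than a destructive
# pre-pass over the data set itself.
model = make_pipeline(StandardScaler(), LogisticRegression())
# model.fit(X_raw, y_train); model.predict(X_raw_new)
```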
Distinguish intrinsic errors from implementation bugs.
Distinction:
● Error: incorrect output from a model despite the model being correctly implemented.
● Bug: incorrect implementation; the system does something other than what was intended.

Useful for managing expectations about quality and the effort required to improve or fix.

Providing an explanation for the output can help make this distinction.
Let your customer override the model when they absolutely must get some answer.
Varieties of overrides:
● Always give this answer.
● Never give this answer.

Overrides can apply to sub-models or to the system overall.

Beware the potential for ‘whack-a-mole’.

Feel sad every time they use it.
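A minimal sketch of such an override layer, with names and structure of my own invention (the talk doesn’t prescribe an implementation):

```python
class OverridableModel:
    """Hypothetical wrapper: customer-managed 'always'/'never' lists take
    precedence over the model for the cases where some answer is a must."""
    def __init__(self, model, always=None, never=None):
        self.model = model
        self.always = dict(always or {})   # input key -> forced answer
        self.never = set(never or ())      # answers that may never be returned

    def predict(self, x, key):
        if key in self.always:             # "Always give this answer."
            return self.always[key]
        answer = self.model.predict(x)
        if answer in self.never:           # "Never give this answer."
            return None                    # abstain / fall back
        return answer
```

Every entry that lands in those lists is a data point about where the model falls short -- hence the sadness.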
Where does machine learning fit in developing technology?
[Diagram: development board with columns “Stuff to do”, “Stuff to do now” and “Demonstrable Value”]
How does machine learning affect team organization?
Machine Learning Expert
Spectrum of options between:
● integrating machine learning expertise in every team that needs it, and
● separating it into an independent, specialist team.
Option 1: integrated teams with cross-team interest groups
Encourages alignment with business goals.
Challenges machine learning collaboration, depth and reuse.
Best for small, diverse products.
Option 2: independent machine learning team delivering models
Encourages machine learning collaboration, depth and reuse.
Challenges alignment with business goals.
Best for products with large, complex model(s).
How does machine learning affect iteration structure?
Pros for shorter iterations:
● More, simpler experiments are better than fewer, more complex ones.
● The value of machine learning leads to a high cost of delay.

Pros for longer iterations:
● Innovation takes deep thinking.
● More time to control the creation of technical debt.
Where does machine learning fit in developing technology?
[Diagram: development board with columns “Stuff to do”, “Stuff to do now” and “Demonstrable Value”]
How does machine learning affect chunks of work?
● Focus on experiments following the scientific method: hypothesis, measurement and error analysis.
● Continuously test for regression versus expected measurements.
● Decouple functional tests from model variations.
Stuff to do now
Focus on experiments with hypothesis, measurement and analysis.
Continuously test for regression versus expected measurements.
With machine learning’s dependence on data, changing anything changes everything. This makes it the “high-interest credit card of technical debt”.

Determine what counts as a significant change, including looking at the aggregate effect across different data sets.
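One way this could look as a continuous test; the names, thresholds and the `model.score` convention are assumptions for illustration:

```python
import json

TOLERANCE = 0.005       # per-data-set noise floor (assumed)
AGG_TOLERANCE = 0.002   # tighter floor for the average across sets (assumed)

def test_no_regression(model, datasets, expected_path="expected_metrics.json"):
    """Compare current metrics against recorded expectations, per data set
    and in aggregate: a small uniform drop everywhere can matter as much
    as one large drop on a single set."""
    with open(expected_path) as f:
        expected = json.load(f)
    deltas = {name: model.score(X, y) - expected[name]
              for name, (X, y) in datasets.items()}
    per_set = {n: d for n, d in deltas.items() if d < -TOLERANCE}
    assert not per_set, f"Per-set regressions: {per_set}"
    mean_delta = sum(deltas.values()) / len(deltas)
    assert mean_delta > -AGG_TOLERANCE, f"Aggregate regression: {mean_delta:.4f}"
```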
Decouple functional tests from model variations.
Options:
● Black-box style: enforce “can’t be wrong” (“earmark”) input/output pairs. Might lead to spurious test failures.
● Clear-box style: use a mock implementation of the model that produces expected answers.

Both styles are sketched below.
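A sketch of both styles using Python’s unittest; the pipeline and loader functions are hypothetical stand-ins for your own entry points:

```python
import unittest
from unittest import mock

def run_pipeline(model, text):
    """Hypothetical plumbing around the model: pre/post-processing lives here."""
    return model.predict(text.strip().lower())

class FunctionalTests(unittest.TestCase):
    # Black-box style: "can't be wrong" ("earmark") input/output pairs run
    # against the real model; a retrain that breaks one fails the suite,
    # occasionally spuriously.
    EARMARKS = [("clearly positive example", "positive"),
                ("clearly negative example", "negative")]

    def test_earmarks(self):
        model = load_production_model()  # hypothetical loader
        for text, expected in self.EARMARKS:
            self.assertEqual(run_pipeline(model, text), expected)

    # Clear-box style: a mock model with canned answers, so the plumbing is
    # tested independently of whichever model variant is trained today.
    def test_plumbing_with_mock_model(self):
        fake = mock.Mock()
        fake.predict.return_value = "positive"
        self.assertEqual(run_pipeline(fake, "any input"), "positive")
```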
Where does machine learning fit in developing technology?
[Diagram: development board with columns “Stuff to do”, “Stuff to do now” and “Demonstrable Value”]
How does machine learning affect prioritization?
Stuff to do
● Do we need more training data?
● Do we need a richer representation of our data?
● Do we need a combination of models?
● How much could improving a sub-component of the model help?
● What development milestones should we target?
Do we need more training data?
The learning curve implies that adding training data should bring the test error down closer to the desired level.
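A sketch of how to read that off a learning curve with scikit-learn; the data set here is a synthetic stand-in:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=2000, random_state=0)  # stand-in data

sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 8), cv=5)

# If validation error is still falling as training size grows, more data
# should help; if the curves have converged above the desired error level,
# more data alone won't close the gap.
for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:5d}  train_err={1 - tr:.3f}  val_err={1 - va:.3f}")
```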
Do we need a richer representation of our data?
The learning curve implies adding data won’t help, but a richer data representation may.

That could mean more features, identified by someone with domain expertise analyzing errors -- though remember that more features often means less speed.

It could also require a new model, if the domain information identified is not representable in the existing one.
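As a toy illustration of the first case (the feature and the helper are invented for this sketch):

```python
import numpy as np

def add_domain_features(X, documents):
    """Hypothetical enrichment: append a feature a domain expert proposed
    after error analysis -- here, a made-up 'contains a number' flag.
    Extra features can buy accuracy at the cost of speed; measure both."""
    has_number = np.array(
        [[any(tok.isdigit() for tok in doc.split())] for doc in documents],
        dtype=float)
    return np.hstack([X, has_number])
```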
Do we need a combination of models?
The learning curve implies the model is overfitting the training set.

Consider training multiple models on random subsets of the data and combining them at runtime to decrease variance while retaining low bias -- presuming you can spend the compute.
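Bagging is the classic form of this; a scikit-learn sketch, where the subset size and model count are arbitrary choices:

```python
from sklearn.ensemble import BaggingClassifier

# Each of 25 models trains on a random 80% subset of the data (the default
# base model is a low-bias, high-variance decision tree); averaging their
# votes at runtime cuts variance while keeping bias low -- at 25x the compute.
ensemble = BaggingClassifier(n_estimators=25, max_samples=0.8, n_jobs=-1)
# ensemble.fit(X_train, y_train); ensemble.predict(X_test)
```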
How much could improving a sub-component of the model help?
Build an ‘oracle’ for the sub-component -- something that takes perfect output from data.

Annotate some test data to get that perfect output to feed the oracle.

Measure the overall system with the oracle turned on.
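In code, this ceiling analysis might look like the following; all names are hypothetical, and `downstream(...)` stands for running only the stages after the sub-component:

```python
def score_with_oracle(downstream, test_items, gold_subcomponent_output):
    """The gap between this score and the current end-to-end score bounds
    how much improving the sub-component could ever help."""
    correct = 0
    for item in test_items:
        perfect = gold_subcomponent_output[item.id]  # annotated 'oracle' output
        correct += (downstream(perfect, item) == item.label)
    return correct / len(test_items)
```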
What development milestones should we target?
Make it…
● Glued together with some rules (Prototype)
● Function (Alpha)
● Measurable & inspectable (early Beta)
● Accurate, not slow, a nice demo, documented & configurable (late Beta)
● Simple & fast (GA)
● Handle new kinds of input (post-GA)
Questions?
[Diagram: development board with columns “Stuff to do”, “Stuff to do now” and “Demonstrable Value”]
Suggested questions:
● Say more about integrating domain expertise?
● Say more about online vs. offline testing?
● How to manage acquiring data?
● How to recruit machine learning folks?
● What bad habits can ML enable?
Where can I try your stuff? api.rosette.com
You hiring? Yes - basistech.com/careers/
@dmurga
Appendix
Manage Technical Debt
Data is even less visible than code -- it blurs boundaries.
Recruiting machine learning experts
Who:
◦ expertise in sequence models > expertise in the domain
◦ depth in a specific model > breadth over many

Where to find them:
◦ local network: meet-ups, LinkedIn
◦ academic conferences
◦ communities (e.g., Kaggle, users of ML tools)

How to attract them:
◦ explain the purpose & uniqueness of the problem
Online vs. offline evaluation
Online (e.g., A/B):
● Individual decisions must not be mission-critical.
● Need enough use to get sufficient statistics in a short time.
● Helps motivate aligning production and development environments.
● If the model is updated online, validate it against offline data periodically to watch out for drift.
● Usually focused on extrinsic or distant measures.

Offline:
● Always have some of this for long-term protection against regression.
● May be required for intrinsic measurement.
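The “sufficient statistics” point is checkable; here is a minimal two-proportion z-test sketch for an A/B comparison (a standard test, not anything the talk specifies):

```python
from math import sqrt
from statistics import NormalDist

def ab_significant(successes_a, n_a, successes_b, n_b, alpha=0.05):
    """Two-proportion z-test: is variant A's success rate distinguishable
    from B's, given the traffic each arm received?"""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha, p_value

# e.g., ab_significant(480, 5000, 520, 5000) -> (False, ~0.18): not enough use yet
```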
Parts of machine learning fit all four kinds of epistemology:

Epistemology    Exact sciences      Experimental sciences   Engineering   Art
Example         Theoretical C.S.    Physics                 Software      Management
Deals with ...  Theorems            Theories                Artifacts     People
Truth is ...    Forever             Temporary               “It works”    In the eye of the beholder
In ML ...       Learning theory     Model & measure         Systems       Users

This is great, as long as we don’t confuse one kind of work for another. (This table is an expansion of one in Bottou’s ICML 2015 talk.)