49
Joseph E. Gonzalez Co-director of the RISE Lab [email protected] AI-Systems Machine Learning Lifecycle

AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Joseph E. GonzalezCo-director of the RISE Lab

[email protected]

AI-Systems

Machine LearningLifecycle

Page 2: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Objectives For TodayØ Introduce the machine learning lifecycle

Ø Challenges and OpportunitiesØ Science vs Engineering

Ø Review Key Concepts in ReadingsØ HyperparametersØ Model Pipelines, Features, and Feature EngineeringØ Warm Starting and Fine TuningØ Feedback Loops, Retraining and Continuous Training

Ø Important context for papers and what to expect

Page 3: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development

TrainedModelsTraining Pipelines

LiveData

Training

Validation

End UserApplication

Query

Prediction

Prediction Service

Inference

Feedback

Logic

DataScientist

DataEngineer

DataEngineer

What is the Machine Learning Lifecycle?

Page 4: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development

DataScientist

Identifying potential sources of data

Joining data from multiple sources

Addressing missing values and outliers

Plotting trends to identify anomalies

Help!

Page 5: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development

DataScientist

DataEngineer

Identifying potential sources of data

Joining data from multiple sources

Addressing missing values and outliers

Plotting trends to identify anomalies

Help! Happy to JOIN

Page 6: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks
Page 7: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Andrej Karpathy (Tesla Auto Pilot Team)

How many of you have ever worked with real data?

Page 8: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development

DataScientist

DataEngineer

Identifying potential sources of data

Joining data from multiple sources

Addressing missing values and outliers

Plotting trends to identify anomalies

Help! Happy to JOIN

Page 9: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development

DataScientist

Building informative features functions

Designing new model architectures

Tuning hyperparameters

Validating prediction accuracy

I was born for this!

Page 10: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Features and Feature Engineering

Ø Features: properties or characteristic of the input

Ø Click Prediction Example:

Information about User and Content

Probability user will click on the content

f✓(x) ! y<latexit sha1_base64="kp7AsP9kZQuPLpvt7dZHXdUVdHg=">AAACA3icbVDLSsNAFJ3UV62vqjvdDBahbkpSBV0W3bisYB/QlDCZTpqhk0mYuVFDKbjxV9y4UMStP+HOv3H6WGjrgQuHc+7l3nv8RHANtv1t5ZaWV1bX8uuFjc2t7Z3i7l5Tx6mirEFjEau2TzQTXLIGcBCsnShGIl+wlj+4GvutO6Y0j+UtZAnrRqQvecApASN5xYPAcyFkQMoPJ9hVvB8CUSq+xxn2iiW7Yk+AF4kzIyU0Q90rfrm9mKYRk0AF0brj2Al0h0QBp4KNCm6qWULogPRZx1BJIqa7w8kPI3xslB4OYmVKAp6ovyeGJNI6i3zTGREI9bw3Fv/zOikEF90hl0kKTNLpoiAVGGI8DgT3uGIURGYIoYqbWzENiSIUTGwFE4Iz//IiaVYrzmmlenNWql3O4sijQ3SEyshB56iGrlEdNRBFj+gZvaI368l6sd6tj2lrzprN7KM/sD5/ACPhlzE=</latexit>

Model:

Simple Logistic Model:(”Perceptron”)* f✓(x) = �

dX

k=1

✓k�k(x)

!, �(t) =

1

1 + e�t<latexit sha1_base64="SqWN/4ok0C/nt6dhNMngKpnl4WQ=">AAACTnicbVFNi9NAGJ7Ur1q/qh69DBahi1qSVdDLwqIXjyvY3YVON7yZvGmGzCRx5o1YQn6hF/Hmz/DiQRGdtjnori8MPDwf8/FMUmvlKAy/BoNLl69cvTa8Prpx89btO+O7945d1ViJc1npyp4m4FCrEuekSONpbRFMovEkKV5v9JMPaJ2qyne0rnFpYFWqTEkgT8VjzGJBORJMP+7xAy6cWhkQGjOaCteYuC0Oou4s5TtTXHBR5youvFtYtcpp74l430DaB6e03SSzINuoa6PHeNY+pa6Lx5NwFm6HXwRRDyasn6N4/EWklWwMliQ1OLeIwpqWLVhSUmM3Eo3DGmQBK1x4WIJBt2y3dXT8kWdSnlXWr5L4lv070YJxbm0S7zRAuTuvbcj/aYuGspfLVpV1Q1jK3UFZozlVfNMtT5VFSXrtAUir/F25zMGXQf4HRr6E6PyTL4Lj/Vn0bLb/9vnk8FVfx5A9YA/ZlEXsBTtkb9gRmzPJPrFv7Af7GXwOvge/gt876yDoM/fZPzMY/gGr7bOA</latexit>

* Technically the original perceptron used a 0/1 non-linearity but this is a common abuse of terminology.

Features function extract numeric properties from x

Useful features?

Ø User Features:Ø age, gender, and click history

Ø Product Features: Ø Price, popularity, and description…

Ø Combined (Cross) features:Ø I(20 < age < 30, male, “xbox” in desc)…

e.g., Recurrent NN output..

e.g., Language Model Embedding

Hand coded features

Page 11: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Additional Notes on FeaturesØ Feature Joins: combine multiple data source in a feature

Ø Feature Reuse: good features can aid in many tasksØ Example: product embeddings, user tags, …

Ø Predictions as Features: predictions for one task (e.g., products in an image) can be useful features for another (e.g., ad targeting)

Ø Feature Tables/Caches: features are often pre-computed and cachedØ Requires tracking data and compute and feature versions

Ø Dynamic Features: features can often be modified faster than modelsØ Useful for addressing fast changing dynamics (e.g., user preferences can be

encoded in click history features).Ø Issue: resulting potential covariate shift can be problematic

Page 12: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

HyperparametersØ the parameters and more generally configuration details

that are not directly determined through trainingØ set by hand or tuned using cross validationØ why not learn directly?

Ø Find the Hyperparameters:

argmin✓

1

n

nX

i=1

L↵ (f✓(xi), yi) + �R(✓)<latexit sha1_base64="CrHjiu7TGVFcK3XWcHN8L8AJBUk=">AAACTnicbVFNbxMxFPSmfITwFcqRi0WElAgU7Rak9lKpggsHDgWRtlIcVm+dt1mrtndlv0WNVvsLuSBu/AwuHEAInI8DtIxkaTQzT34eZ5VWnuL4a9TZuXb9xs3urd7tO3fv3e8/2D3xZe0kTmSpS3eWgUetLE5IkcazyiGYTONpdv5q5Z9+ROdVad/TssKZgYVVuZJAQUr7KMAtuDDKpo2gAglakTuQTdI2thW+NmmjDpP2g+VvUgG6KkBozGmYp5v48CJVo2d8mSrh1KKgEX/KhQ4LzIG/G24yo7Q/iMfxGvwqSbZkwLY4TvtfxLyUtUFLUoP30ySuaNaAIyU1tj1Re6xAnsMCp4FaMOhnzbqOlj8JypznpQvHEl+rf080YLxfmiwkDVDhL3sr8X/etKb8YNYoW9WEVm4uymvNqeSrbvlcOZSkl4GAdCrsymUBoU0KP9ALJSSXn3yVnOyNk+fjvbcvBkcvt3V02SP2mA1ZwvbZEXvNjtmESfaJfWM/2M/oc/Q9+hX93kQ70XbmIfsHne4f48u0pA==</latexit>

Objective:

Architecture:

Stochastic Gradient Descent

✓1<latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit><latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit><latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit><latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit>

✓2<latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit><latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit><latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit><latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit>

✓̂<latexit sha1_base64="6WifpBZdSRyY2pUiuBMF5KWI8MI=">AAAJzHicpZZfb9s2EMDV7l/s/Wm7Pe5FmBGgBdLAcuMleStWBGjRoPG2Ji1mGgFFn2SiFCWQdGqP0Ws/Ql+3536jfptRlJxIVANoGwHbJ9797s4n3klhxqhUw+HHW7c/+/yLL7/a6vW//ubb7+7cvff9mUyXgsApSVkqXodYAqMcThVVDF5nAnASMngVvnlS6F9dgJA05S/VOoNZgmNOI0qwMlsILbDSSC1A4fz87mC4ezgMDoM93wiPxsNxYISDw73Rz4Ef7A7tGnjVmpzf2/qA5ilZJsAVYVjKaTDM1ExjoShhkPfRUkKGyRscw9SIHCcgZ9omnfvbZmfuR6kwH658u1snNE6kXCehsUywWkhXV2x+SjddquhgpinPlgo4KQNFS+ar1C8q4M+pAKLY2giYCGpy9ckCC0yUqVO/EaZMtW8W4vCWpEmC+VwjHMp8Gsw0YhCpy0GABI0X6jJvWhGg7NoMseLaHwR+aY2E1TeRiKUmXo2xG3WotHATylK2ZmmcT0eG3EFFTcJIH9P8XA+C3Pq67w9GlZcHTlTz55OUU+LgixvpJn6ccsj18bkOcsfxMeWR1SAjqHVL/YxHL1KR1GpZfl//4cvq50YXJxy6ughaZRPrXJv7VNzkqYjDmTanfX9/PB7vDKvTXgiPRqP9g7wVWqy6w6sWXLZcZwdVi7peIJOUpfxf+NkQ7UMd0wswntDOJdpx4kg1N5OorHF1OFBGc39zNDalfuA74ESEDWrSjTlygynK5mDoDvgRb7JPu4SkV4kaQReWuWvzHAS3Js875LBqpHBUpTC9RmZtRJbtd8XY5rsCRzXQJd8K1ZltoGdYUMwJNLI1mx3yfcbnDaqaE3XEIX7ZFBApWCltL20Ri/L1t7f9h/91FfDvoHyeKvtUc5sEM5lXiYb6N7eJKFcQm8l/bfKHayJB1Yay3gx85NpdAJHm0QGVK4KZPnNtTjktGhaFNCbLrFVV8wyqlNhVLiuyjdEKazOS/gm12VibisUA+L9Ff5ImGYMVVWvnVtP4xEY9QQtbEP0wyFTe6hv3hND4pR1zNmErtnutzZwkEFeMFT/F9Pvm7WbzCuPfLJyNdoPhbvDr3uDxpHrP2fJ+9H7y7nuBt+899p56E+/UI17mvff+8v7uveipnu7lpentWxXzg9dYvXf/AJL+WCU=</latexit><latexit sha1_base64="6WifpBZdSRyY2pUiuBMF5KWI8MI=">AAAJzHicpZZfb9s2EMDV7l/s/Wm7Pe5FmBGgBdLAcuMleStWBGjRoPG2Ji1mGgFFn2SiFCWQdGqP0Ws/Ql+3536jfptRlJxIVANoGwHbJ9797s4n3klhxqhUw+HHW7c/+/yLL7/a6vW//ubb7+7cvff9mUyXgsApSVkqXodYAqMcThVVDF5nAnASMngVvnlS6F9dgJA05S/VOoNZgmNOI0qwMlsILbDSSC1A4fz87mC4ezgMDoM93wiPxsNxYISDw73Rz4Ef7A7tGnjVmpzf2/qA5ilZJsAVYVjKaTDM1ExjoShhkPfRUkKGyRscw9SIHCcgZ9omnfvbZmfuR6kwH658u1snNE6kXCehsUywWkhXV2x+SjddquhgpinPlgo4KQNFS+ar1C8q4M+pAKLY2giYCGpy9ckCC0yUqVO/EaZMtW8W4vCWpEmC+VwjHMp8Gsw0YhCpy0GABI0X6jJvWhGg7NoMseLaHwR+aY2E1TeRiKUmXo2xG3WotHATylK2ZmmcT0eG3EFFTcJIH9P8XA+C3Pq67w9GlZcHTlTz55OUU+LgixvpJn6ccsj18bkOcsfxMeWR1SAjqHVL/YxHL1KR1GpZfl//4cvq50YXJxy6ughaZRPrXJv7VNzkqYjDmTanfX9/PB7vDKvTXgiPRqP9g7wVWqy6w6sWXLZcZwdVi7peIJOUpfxf+NkQ7UMd0wswntDOJdpx4kg1N5OorHF1OFBGc39zNDalfuA74ESEDWrSjTlygynK5mDoDvgRb7JPu4SkV4kaQReWuWvzHAS3Js875LBqpHBUpTC9RmZtRJbtd8XY5rsCRzXQJd8K1ZltoGdYUMwJNLI1mx3yfcbnDaqaE3XEIX7ZFBApWCltL20Ri/L1t7f9h/91FfDvoHyeKvtUc5sEM5lXiYb6N7eJKFcQm8l/bfKHayJB1Yay3gx85NpdAJHm0QGVK4KZPnNtTjktGhaFNCbLrFVV8wyqlNhVLiuyjdEKazOS/gm12VibisUA+L9Ff5ImGYMVVWvnVtP4xEY9QQtbEP0wyFTe6hv3hND4pR1zNmErtnutzZwkEFeMFT/F9Pvm7WbzCuPfLJyNdoPhbvDr3uDxpHrP2fJ+9H7y7nuBt+899p56E+/UI17mvff+8v7uveipnu7lpentWxXzg9dYvXf/AJL+WCU=</latexit><latexit sha1_base64="6WifpBZdSRyY2pUiuBMF5KWI8MI=">AAAJzHicpZZfb9s2EMDV7l/s/Wm7Pe5FmBGgBdLAcuMleStWBGjRoPG2Ji1mGgFFn2SiFCWQdGqP0Ws/Ql+3536jfptRlJxIVANoGwHbJ9797s4n3klhxqhUw+HHW7c/+/yLL7/a6vW//ubb7+7cvff9mUyXgsApSVkqXodYAqMcThVVDF5nAnASMngVvnlS6F9dgJA05S/VOoNZgmNOI0qwMlsILbDSSC1A4fz87mC4ezgMDoM93wiPxsNxYISDw73Rz4Ef7A7tGnjVmpzf2/qA5ilZJsAVYVjKaTDM1ExjoShhkPfRUkKGyRscw9SIHCcgZ9omnfvbZmfuR6kwH658u1snNE6kXCehsUywWkhXV2x+SjddquhgpinPlgo4KQNFS+ar1C8q4M+pAKLY2giYCGpy9ckCC0yUqVO/EaZMtW8W4vCWpEmC+VwjHMp8Gsw0YhCpy0GABI0X6jJvWhGg7NoMseLaHwR+aY2E1TeRiKUmXo2xG3WotHATylK2ZmmcT0eG3EFFTcJIH9P8XA+C3Pq67w9GlZcHTlTz55OUU+LgixvpJn6ccsj18bkOcsfxMeWR1SAjqHVL/YxHL1KR1GpZfl//4cvq50YXJxy6ughaZRPrXJv7VNzkqYjDmTanfX9/PB7vDKvTXgiPRqP9g7wVWqy6w6sWXLZcZwdVi7peIJOUpfxf+NkQ7UMd0wswntDOJdpx4kg1N5OorHF1OFBGc39zNDalfuA74ESEDWrSjTlygynK5mDoDvgRb7JPu4SkV4kaQReWuWvzHAS3Js875LBqpHBUpTC9RmZtRJbtd8XY5rsCRzXQJd8K1ZltoGdYUMwJNLI1mx3yfcbnDaqaE3XEIX7ZFBApWCltL20Ri/L1t7f9h/91FfDvoHyeKvtUc5sEM5lXiYb6N7eJKFcQm8l/bfKHayJB1Yay3gx85NpdAJHm0QGVK4KZPnNtTjktGhaFNCbLrFVV8wyqlNhVLiuyjdEKazOS/gm12VibisUA+L9Ff5ImGYMVVWvnVtP4xEY9QQtbEP0wyFTe6hv3hND4pR1zNmErtnutzZwkEFeMFT/F9Pvm7WbzCuPfLJyNdoPhbvDr3uDxpHrP2fJ+9H7y7nuBt+899p56E+/UI17mvff+8v7uveipnu7lpentWxXzg9dYvXf/AJL+WCU=</latexit><latexit sha1_base64="6WifpBZdSRyY2pUiuBMF5KWI8MI=">AAAJzHicpZZfb9s2EMDV7l/s/Wm7Pe5FmBGgBdLAcuMleStWBGjRoPG2Ji1mGgFFn2SiFCWQdGqP0Ws/Ql+3536jfptRlJxIVANoGwHbJ9797s4n3klhxqhUw+HHW7c/+/yLL7/a6vW//ubb7+7cvff9mUyXgsApSVkqXodYAqMcThVVDF5nAnASMngVvnlS6F9dgJA05S/VOoNZgmNOI0qwMlsILbDSSC1A4fz87mC4ezgMDoM93wiPxsNxYISDw73Rz4Ef7A7tGnjVmpzf2/qA5ilZJsAVYVjKaTDM1ExjoShhkPfRUkKGyRscw9SIHCcgZ9omnfvbZmfuR6kwH658u1snNE6kXCehsUywWkhXV2x+SjddquhgpinPlgo4KQNFS+ar1C8q4M+pAKLY2giYCGpy9ckCC0yUqVO/EaZMtW8W4vCWpEmC+VwjHMp8Gsw0YhCpy0GABI0X6jJvWhGg7NoMseLaHwR+aY2E1TeRiKUmXo2xG3WotHATylK2ZmmcT0eG3EFFTcJIH9P8XA+C3Pq67w9GlZcHTlTz55OUU+LgixvpJn6ccsj18bkOcsfxMeWR1SAjqHVL/YxHL1KR1GpZfl//4cvq50YXJxy6ughaZRPrXJv7VNzkqYjDmTanfX9/PB7vDKvTXgiPRqP9g7wVWqy6w6sWXLZcZwdVi7peIJOUpfxf+NkQ7UMd0wswntDOJdpx4kg1N5OorHF1OFBGc39zNDalfuA74ESEDWrSjTlygynK5mDoDvgRb7JPu4SkV4kaQReWuWvzHAS3Js875LBqpHBUpTC9RmZtRJbtd8XY5rsCRzXQJd8K1ZltoGdYUMwJNLI1mx3yfcbnDaqaE3XEIX7ZFBApWCltL20Ri/L1t7f9h/91FfDvoHyeKvtUc5sEM5lXiYb6N7eJKFcQm8l/bfKHayJB1Yay3gx85NpdAJHm0QGVK4KZPnNtTjktGhaFNCbLrFVV8wyqlNhVLiuyjdEKazOS/gm12VibisUA+L9Ff5ImGYMVVWvnVtP4xEY9QQtbEP0wyFTe6hv3hND4pR1zNmErtnutzZwkEFeMFT/F9Pvm7WbzCuPfLJyNdoPhbvDr3uDxpHrP2fJ+9H7y7nuBt+899p56E+/UI17mvff+8v7uveipnu7lpentWxXzg9dYvXf/AJL+WCU=</latexit>

Training Algorithm

u(t) �u(t�1) + ⌘X

i2Br✓ (L↵ (f✓(xi), yi))

����✓(t)

✓(t+1) ✓(t) + u(t)<latexit sha1_base64="Kar8vNpG/pAtXc/8bKFltdauNyw=">AAACwnicbVHLbtQwFHXCqwyPDrBkYzECzagwSlok2FEVFixYFIlpK42H6MbjJKaOk9o3wCjkJ9n1b3AeEqXDlaIcn3NsH98bl0paDIJLz79x89btOzt3R/fuP3i4O370+MQWleFiwQtVmLMYrFBSiwVKVOKsNALyWInT+Px9q59+F8bKQn/BTSlWOaRaJpIDOioaX1Zf6ynOGvqCMiUSBGOKH5TFAoF20qvQiXuUtQSzVR7VkjKpKcsBMw6qPmoayjTECiKGWWdrD5p+ihioMoN+lQzi9GckZy/pJpLMyDTD2fBzV8o0Ncy4sBjVvbmP5o5no7/EXriV9orZRR1eFI0nwTzoim6DcAATMtRxNP7N1gWvcqGRK7B2GQYlrmowKLkSzYhVVpTAzyEVSwc15MKu6m4EDX3umDVNCuM+jbRjr+6oIbd2k8fO2TbOXtda8n/assLk7aqWuqxQaN5flFSKYkHbedK1NIKj2jgA3EiXlfIMDHB0Ux+5JoTXn7wNTvbn4cF8//PryeHR0I4d8pQ8I1MSkjfkkHwkx2RBuPfOE572Cv+D/82/8G1v9b1hzxPyT/m//gD+Ptlr</latexit>

Architecture is sometimes treated as

separate from hyperparameters

Can be learned…

Page 13: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development Technologies

Page 14: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model DevelopmentWhat is the output of

Trained Model

Reports & Dashboards

(insights …)

Bad Idea

Page 15: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Why is it a Bad Idea to directly produce trained models from model development?

With just a trained model we are unable to1. retrain models with new data2. track data and code for debugging3. capture dependencies for deployment4. audit training for compliance (e.g., GDPR)

Page 16: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model DevelopmentWhat is the output of

Trained Models

Reports & Dashboards

(insights …)

Bad Idea

Page 17: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model DevelopmentWhat is the output of Reports & Dashboards

(insights …)

Training Pipelines

Page 18: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Training Pipelines Capture the Code and Data DependenciesØ Description of how to train the model from data sources

TrainingData

Training Pipelines

TrainedModels

Training Pipelines à Code Trained Models à Binaries

SoftwareEngineering

Analogy

Page 19: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model DevelopmentWhat is the output of Reports & Dashboards

(insights …)

Training Pipelines

Page 20: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development

TrainedModelsTraining Pipelines

LiveData

Training

Validation

DataEngineer

DataScientist

Page 21: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

TrainedModelsTraining Pipelines

LiveData

Training

Validation

DataEngineer

Training models at scaleon live data

Retraining on new data

Automatically validateprediction accuracy

Manage model versioning

Requires minimal expertise in machine learning

Page 22: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

TrainedModelsTraining Pipelines

LiveData

Training

Validation

DataEngineer

TechnologiesApache

Airflow

Workflow Management:

Scalable Training:

Page 23: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Warm Start Training

Stochastic Gradient Descent

✓1<latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit><latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit><latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit><latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit>

✓2<latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit><latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit><latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit><latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit>

Page 24: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Warm Starts Training

Stochastic Gradient Descent

✓1<latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit><latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit><latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit><latexit sha1_base64="cbM0eSmGxMtQkVBDiiBAEOk4Olo=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ4H8zuD4f7QLr8tBJUw8Ko1nt/d+YAWKVklwBVhWMppMMzUTGOhKGGQ99FKQobJGxzD1IgcJyBn2iac+7tmZ+FHqTAfrny7Wyc0TqTcJKGxTLBaSldXbH5MN12p6GimKc9WCjgpA0Ur5qvUL67eX1ABRLGNETAR1OTqkyUWmChTo34jTJlq3yzE4S1JkwTzhUY4lPk0mGnEIFKXgwAJGi/VZd60IkDZtRlixX9/EPilNRJW30Qilpp4NcZu1KHSwk0oS9mGpXE+HRlyDxU1CSN9QvO5HgS59XXfH4wqLw+cqObik5RT4uDLT9JN/CTlkOuTuQ5yx/EJ5ZHVICOoTUv9lEe/piKp1bL8vr7gy+rnky5OOXR1EbTKJja5NvepuMlTEYczPdw/Ojw8ODjY2x77QvhxNDo8yluhxbo7vG7Btue6Oyh7tOUFMklZyv+Fny3RPtQxvQDjCe1doj0njlQLM4XKGleHA2U097dHY1vqB74DjkXYoMbdmGM3mKJsAYbugB/zJvukS0h6lagRdGGZuzbPQXBr8rxDDutGCsdVCtNrZNZGZNl+V4xtvitwVANd8q1QndkGeoYFxZxAI1uz2SHfp3zRoKo5UUcc4pdtAZGCtdL2ry1iUb7+7q7/8L+uAv4NlM9TZZ9obpNgJvMq0VC/dJuIcgWxmfzXJr+7JhJUbSjr7cBHrt0FEGkeHVC5IpjpM9dmwmnRsCikMVllraqaZ1ClxK5yVZFtjFZYm5H0HdRmY20qFgPg/xb9cZpkDNZUbZxbTeNTG/UULW1B9MMgU3mrb9wTQuNXdszZhK3Y7rU2c5pAXDFW/BjT75u3m8B9l2kLk9H+z/vBi58Gj8bVa86Od8/7wbvvBd6h98h74o29iUc85v3pvff+6j3rZb2L3ro0vXmjYr73Gqv3xz/lK1WD</latexit>

✓2<latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit><latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit><latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit><latexit sha1_base64="YwKmc936ERHJ7kguU7c+8BuB0wk=">AAAJx3icpZZfb9s2EMDVrtti70/b7bEvwowALZAGlrcg2VuxIkD/BIvb1Wkw0zAo+iQTpSiBpFO7jB72FfbafYJ+o32bUZScSFQLaBsB22fe/e5OJ95JYcaoVMPh3zdufnbr8y++3On1v/r6m29v37n73ZlMV4LAhKQsFechlsAoh4miisF5JgAnIYPX4ZvHhf71BQhJU/5KbTKYJTjmNKIEK7N1jtQSFJ6P5ncGw/2hXX5bCCph4FVrPL+78wEtUrJKgCvCsJTTYJipmcZCUcIg76OVhAyTNziGqRE5TkDOtE0493fNzsKPUmE+XPl2t05onEi5SUJjmWC1lK6u2PyYbrpS0dFMU56tFHBSBopWzFepX1y9v6ACiGIbI2AiqMnVJ0ssMFGmRv1GmDLVvlmIw1uSJgnmC41wKPNpMNOIQaQuBwESNF6qy7xpRYCyazPEiv/+IPBLaySsvolELDXxaozdqEOlhZtQlrINS+N8OjLkHipqEkb6hOZzPQhy6+u+PxhVXh44Uc3FJymnxMGXn6Sb+EnKIdcncx3kjuMTyiOrQUZQm5b6KY9+TUVSq2X5fX3Bl9XPJ12ccujqImiVTWxybe5TcZOnIg5nerh/dHh4cHCwtz32hfDjaHR4lLdCi3V3eN2Cbc91d1D2aMsLZJKylP8LP1uifahjegHGE9q7RHtOHKkWZgqVNa4OB8po7m+PxrbUD3wHHIuwQY27McduMEXZAgzdAT/mTfZJl5D0KlEj6MIyd22eg+DW5HmHHNaNFI6rFKbXyKyNyLL9rhjbfFfgqAa65FuhOrMN9AwLijmBRrZms0O+T/miQVVzoo44xC/bAiIFa6XtX1vEonz93V3/4X9dBfwbKJ+nyj7R3CbBTOZVoqF+6TYR5QpiM/mvTX53TSSo2lDW24GPXLsLINI8OqByRTDTZ67NhNOiYVFIY7LKWlU1z6BKiV3lqiLbGK2wNiPpO6jNxtpULAbA/y364zTJGKyp2ji3msanNuopWtqC6IdBpvJW37gnhMav7JizCVux3Wtt5jSBuGKs+DGm3zdvN4H7LtMWJqP9n/eDFz8NHo2r15wd7573g3ffC7xD75H3xBt7E494zPvTe+/91XvWy3oXvXVpevNGxXzvNVbvj38A7odVhA==</latexit>

New training data arrives and changes the loss surface.1)Instead of starting over from random weights start at previous solution.

2)

Works well if data is changing slowly.

More challenging for model changes.

Page 25: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Additional Thoughts on Warm Starting

Ø A form of transfer learning across time.

Ø Useful for situations where new data is arrivingØ Data distribution is not changing rapidly (but changing…)

Ø Issues:Ø Need a mechanism to set learning rates appropriately

Ø Typically start much smallerØ Could get stuck in suboptimal solution for non-convex settings

Ø Though this is true in generalØ Catastrophic forgetting: if you only train on new data may

degrade model on old dataØ Can address by continuing to train on old data

Page 26: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Fine TuningØ Using small learning rates to train pre-trained or partially

pre-trained model for a new dataset or prediction task.Ø enables both faster training and improved accuracy

f✓old<latexit sha1_base64="9J8cpGLrDUi3EN5qfrRMtrsWlOM=">AAAB/nicbVBNS8NAEN3Ur1q/ouLJS7AInkpSBT0WvXisYD+gKWGznbRLNx/sTsQSAv4VLx4U8erv8Oa/cdvmoK0PBh7vzTAzz08EV2jb30ZpZXVtfaO8Wdna3tndM/cP2ipOJYMWi0Usuz5VIHgELeQooJtIoKEvoOOPb6Z+5wGk4nF0j5ME+iEdRjzgjKKWPPMo8DIXR4DUcxEeMYvFIM89s2rX7BmsZeIUpEoKND3zyx3ELA0hQiaoUj3HTrCfUYmcCcgrbqogoWxMh9DTNKIhqH42Oz+3TrUysIJY6orQmqm/JzIaKjUJfd0ZUhypRW8q/uf1Ugyu+hmPkhQhYvNFQSosjK1pFtaAS2AoJppQJrm+1WIjKilDnVhFh+AsvrxM2vWac16r311UG9dFHGVyTE7IGXHIJWmQW9IkLcJIRp7JK3kznowX4934mLeWjGLmkPyB8fkDYaOWZw==</latexit>

Small Change + Additional Data

Warm Starting

f✓new<latexit sha1_base64="Kzt75IVowchRD2VRZe0wTVsYZh0=">AAAB/nicbVBNS8NAEN34WetXVTx5CRbBU0mqoMeiF48V7Ae0IWy2k3bpZhN2J2oJBf+KFw+KePV3ePPfuG1z0NYHA4/3ZpiZFySCa3Scb2tpeWV1bb2wUdzc2t7ZLe3tN3WcKgYNFotYtQOqQXAJDeQooJ0ooFEgoBUMryd+6x6U5rG8w1ECXkT7koecUTSSXzoM/ayLA0DqdxEeMZPwMB77pbJTcaawF4mbkzLJUfdLX91ezNIIJDJBte64ToJeRhVyJmBc7KYaEsqGtA8dQyWNQHvZ9PyxfWKUnh3GypREe6r+nshopPUoCkxnRHGg572J+J/XSTG89DIukxRBstmiMBU2xvYkC7vHFTAUI0MoU9zcarMBVZShSaxoQnDnX14kzWrFPatUb8/Ltas8jgI5IsfklLjkgtTIDamTBmEkI8/klbxZT9aL9W59zFqXrHzmgPyB9fkDclyWcg==</latexit>

Change in the prediction task

x y

x z

x z x z x z

Warm Starting

Add new top layer(s)

Training just the top layer

Training alllayers

“Learned Features”

Page 27: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development

TrainedModelsTraining Pipelines

LiveData

Training

Validation

DataEngineer

DataScientist

Open Problems

Context & Composition

Page 28: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

How, What, & Who?Ø How was the model or data created?

Ø What is the latest or best version?

Ø Who is responsible? (blame...)

Context

Training Pipelines

PartialSolution

Track relationships between1. Code versions2. Model & Data versions3. People (versions?)

Page 29: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Models are being composed to solve new problems

Composition

Cuteness Detector

Cute!

Page 30: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Models are being composed to solve new problems

Composition

Puppy Detector

Ball Detector

Cuteness Detector

Yes

Yes

Cute!

Page 31: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Models are being composed to solve new problems

Composition

Puppy Detector

Ball Detector

Cuteness Detector

Yes

Yes

Cute!

Wrong but helpful…

Still Correct

Page 32: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Models are being composed to solve new problems

Composition

Puppy Detector

Ball Detector

Cuteness Detector

Yes

Yes

Cute!

Wrong but helpful…

Still Correct

DataScientist

Page 33: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Models are being composed to solve new problems

Composition

Puppy Detector

Ball Detector

Cuteness Detector

No

Yes

Not Cute!

Degradation in accuracy

DataScientist

Reasonable Improvement

Page 34: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Models are being composed to solve new problems

Composition

Puppy Detector

Ball Detector

Cuteness Detector

No

Yes

Not Cute!

Degradation in accuracy

DataScientist

Reasonable Improvement

Need to track composition and validate end-to-end accuracy.

Need unit and integration testing for models.

Page 35: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development

TrainedModelsTraining Pipelines

LiveData

Training

Validation

DataEngineer

DataScientist

Page 36: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

TrainedModelsTraining Pipelines

LiveData

Training

Validation

End UserApplication

Query

Prediction

Prediction Service

Inference

Feedback

Logic

DataEngineer

DataEngineer

Page 37: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

End UserApplication

Query

Prediction

Prediction Service

Inference

Feedback

Logic

DataEngineer

Goal: make predictions in ~10ms under bursty load

Complicated by Deep Neural Networksè New ML Algorithms and Systems

Page 38: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

End UserApplication

Query

Prediction

Prediction Service

Inference

Feedback

Logic

DataEngineer

Technologies

Page 39: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

TrainedModelsTraining Pipelines

LiveData

Training

Validation

End UserApplication

Query

Prediction

Prediction Service

Inference

Feedback

Logic

Ø Model updates: retrained as new data arrivesØ Periodically: leverage batch processing and validation

Ø Model could be out-of-date for extended periods of timeØ Continuously (online learning): most fresh model

Ø Needs validation, learning rates? … complicated

Ø Feature updates: new data may change featuresØ Example: update click history for a users à new predictionsØ Can be more robust than online learning

Incorporating Feedback

Page 40: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Ø Models can bias the data they collectØ Example: content recommendationØ Future models may reflect earlier model bias

Ø Exploration – Exploitation Trade-offØ Exploration: observe diverse outcomesØ Exploitation: leverage model to take

predicted best action

Ø SolutionsØ Randomization (ε-greedy): occasionally ignore the modelØ Bandit Algorithms/Thompson Sampling: optimally balance

exploration and exploitation à active area of research

Feedback Cycles

Page 41: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

OfflineTraining

Data

DataCollection

Cleaning &Visualization

Feature Eng. &Model Design

Training &Validation

Model Development

TrainedModelsTraining Pipelines

LiveData

Training

Validation

End UserApplication

Query

Prediction

Prediction Service

Inference

Feedback

Logic

DataScientist

DataEngineer

DataEngineer

Machine Learning Lifecycle

We will cover each phase in more detail throughout the semester but this week we focus on managing the entire process.

Page 42: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Objectives For TodayØ Introduce the machine learning lifecycle

Ø Challenges and OpportunitiesØ Science vs Engineering

Ø Review Key Concepts in ReadingsØ HyperparametersØ Model Pipelines, Features, and Feature EngineeringØ Warm Starting and Fine TuningØ Feedback Loops, Retraining and Continuous Training

Ø Important Context for Papers and what to expect.

Page 43: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Reading for the WeekØ Hidden Technical Debt in Machine Learning Systems

Ø NeurIPS’15, widely citedØ Provides an overview of the challenges from Google

Ø TFX: A TensorFlow-Based Production-Scale Machine Learning PlatformØ KDD’17, now part of https://www.tensorflow.org/tfx (sort of)Ø Googles solution to the challenges in the first paper

Ø Towards Unified Data and Lifecycle Management for Deep LearningØ ICDE’17, Video DemoØ An alternative database community solution

Page 44: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Related Systems EffortsØ Doing Machine Learning the Uber Way: Five Lessons From the

First Three Years of Michelangelo

Ø Introducing FBLearner Flow: Facebook’s AI backbone

Ø KubeFlow: Kubernetes Pipeline Orchestration Framework

Ø DeepBird: Twitters ML Deployment Framework

Ø Mlflow: A System to Accelerate the Machine Learning Lifecycle

Ø Data Engineering Bulletin on the Machine Learning Lifecycle Ø Full disclosure: I was the editor

Page 45: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Hidden Technical Debtin Machine Learning SystemsØ Technical Debt: long term development and maintenance

costs incurred by expedient design decisions

Ø Key Idea: machine learning deployments often incur substantial technical debt (compared to traditional software)

Ø Contribution: this paper characterizes the forms of technical debt and alludes to possible compensating actions

Figure 1: Only a small fraction of real-world ML systems is composed of the ML code, as shownby the small black box in the middle. The required surrounding infrastructure is vast and complex.

Static Analysis of Data Dependencies. In traditional code, compilers and build systems performstatic analysis of dependency graphs. Tools for static analysis of data dependencies are far lesscommon, but are essential for error checking, tracking down consumers, and enforcing migrationand updates. One such tool is the automated feature management system described in [12], whichenables data sources and features to be annotated. Automated checks can then be run to ensure thatall dependencies have the appropriate annotations, and dependency trees can be fully resolved. Thiskind of tooling can make migration and deletion much safer in practice.

4 Feedback Loops

One of the key features of live ML systems is that they often end up influencing their own behaviorif they update over time. This leads to a form of analysis debt, in which it is difficult to predict thebehavior of a given model before it is released. These feedback loops can take different forms, butthey are all more difficult to detect and address if they occur gradually over time, as may be the casewhen models are updated infrequently.

Direct Feedback Loops. A model may directly influence the selection of its own future trainingdata. It is common practice to use standard supervised algorithms, although the theoretically correctsolution would be to use bandit algorithms. The problem here is that bandit algorithms (such ascontextual bandits [9]) do not necessarily scale well to the size of action spaces typically required forreal-world problems. It is possible to mitigate these effects by using some amount of randomization[3], or by isolating certain parts of data from being influenced by a given model.

Hidden Feedback Loops. Direct feedback loops are costly to analyze, but at least they pose astatistical challenge that ML researchers may find natural to investigate [3]. A more difficult case ishidden feedback loops, in which two systems influence each other indirectly through the world.

One example of this may be if two systems independently determine facets of a web page, such asone selecting products to show and another selecting related reviews. Improving one system maylead to changes in behavior in the other, as users begin clicking more or less on the other componentsin reaction to the changes. Note that these hidden loops may exist between completely disjointsystems. Consider the case of two stock-market prediction models from two different investmentcompanies. Improvements (or, more scarily, bugs) in one may influence the bidding and buyingbehavior of the other.

5 ML-System Anti-Patterns

It may be surprising to the academic community to know that only a tiny fraction of the code inmany ML systems is actually devoted to learning or prediction – see Figure 1. In the language ofLin and Ryaboy, much of the remainder may be described as “plumbing” [11].

It is unfortunately common for systems that incorporate machine learning methods to end up withhigh-debt design patterns. In this section, we examine several system-design anti-patterns [4] thatcan surface in machine learning systems and which should be avoided or refactored where possible.

4

Page 46: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

TFX: A TensorFlow-Based Production-Scale Machine Learning Platform

Ø Describes solutions to many of the problem outlined in the technical debt paper.

Ø Key Idea: Adapt best practices for software development to address machine learning lifecycleØ empathetic to the reality of “machine learning developers”

Ø Contributions: actual system, interesting ideas around data and model validation, schema enforcement, and meaningful errors.

Page 47: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Towards Unified Data and Lifecycle Management for Deep Learning

Ø Describes a system (ModelHub) for managing, querying, and manipulating models and their related metadata.

Ø Key Idea(?): Model lifecycle management combines code and data (parameters) à a natural API would then combine version control commands with SQL-like querying.

Ø Solution: Combines a git-like client API with a SQL-like querying interface to enable basic actions and more complex queries. Ø Leverages optimizations to store model weights more efficiently.

Ø (necessary?)

Page 48: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

What to think about when readingØ How does the work differentiate between engineering

and research challenges?

Ø What innovations in machine learning are needed?

Ø What are the key research challenges proposed and addressed?

Ø Are the proposed solutions too opinionatedØ Would they require top down mandates for adoption?Ø Would you use these systems?Ø Are they sufficiently flexible to support innovation

Page 49: AI-Systems Machine Learning Lifecycle · Additional Notes on Features ØFeature Joins:combine multiple data source in a feature ØFeature Reuse: good features can aid in many tasks

Done!