Pavlos Protopapas, Institute for Applied Computational Science, Harvard

AC215: Advanced Practical Data Science, MLOps

Lecture 15: App Design, Setup & Code Organization


Outline

1. Recap
2. Motivation
3. App Design
4. Setup & Code Organization


Recap: Isolate work into reusable containers?

[Figure: three boxes, each labeled "Container"]


Before you build your App

• You do NOT want to build your entire app in one container
• Start thinking of functionality that can be isolated
• Identify components that can be containerized

How do we do this?

Review: Problem Definition

Pavlos likes to go mushroom picking in the forest. It is a fun activity, and a rewarding one, since some mushrooms are edible. The problem is that the forest where Pavlos picks mushrooms has many varieties of poisonous mushrooms. Some are easy to recognize, but for others he needs help with identification.

Review: Proposed Solution

Pavlos will have his phone with him when he is in the forest. What if he could just take a picture of a mushroom and an app could tell him what type of mushroom it is and whether it is poisonous or not?

Review: Proposed Solution

• Pavlos likes to go to the forest for mushroom picking
• Some mushrooms can be poisonous
• Help build an app to identify the mushroom type and whether it is poisonous or not

Credit: Nikolas Protopapas


Review: Project Scope

Proof of Concept (POC)

● Scrape mushroom data
● Verify images
● Experiment on some baseline models
● Verify new unseen mushrooms are predicted by the model(s)
● Visualize model activations to analyse what the model is seeing

Prototype

● Create a mockup of screens to see what the app could look like
● Deploy one model to FastAPI to serve model predictions as an API

Minimum Viable Product (MVP)

● Create App to identify Mushrooms
● API Server for uploading images and predicting using best model

Slide annotation: Using Streamlit

Review: Project Workflow

Stages: POC, Prototype, MVP

Steps shown on the workflow diagram:

- Data Collection
- Build Baseline Models
- Project, Containers, Deployment & Scaling Setup
- Build Mushroom App
- Setup Experiment Tracking
- Build Better Models


Review: Process Flow

Google / Bing

Data Collection

- Scrape mushroom images
- Organize / Save image
- Verify images

App

- Upload Image
- Make Prediction
- View Results

Data / Model Store

- Images
- Labels
- Models
- Metrics

Colab Notebooks

- EDA
- Train model
- Evaluate

Containers marked on the diagram: Frontend Container, Backend Container, Data Collection Container, Persistent Store
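The "Verify images" step is not spelled out on the slide. A minimal sketch, assuming verification means checking that each downloaded file really starts with a JPEG or PNG file signature. The function names `detect_image_type` and `filter_valid_images` are hypothetical, and a real check might also decode the image.

```python
from typing import Dict, Optional

# File signatures (magic bytes) for the two formats checked here.
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def detect_image_type(data: bytes) -> Optional[str]:
    """Return "jpeg" or "png" if the bytes start with a known signature, else None."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return None


def filter_valid_images(downloads: Dict[str, bytes]) -> Dict[str, str]:
    """Map each file name to its detected type, dropping files that are not images."""
    valid = {}
    for name, data in downloads.items():
        kind = detect_image_type(data)
        if kind is not None:
            valid[name] = kind
    return valid
```

Scraped results often include HTML error pages saved with an image extension, which is the case this check is meant to catch.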

Mushroom App: Identifying Components

• Script to download images from Google
• Persistent storage for data and models
• Backend APIs
• Frontend App



App Design

● In a regular software app you have code and data.

● In an AI App, in addition you have models to perform tasks

● We will follow a structured approach to design and develop an AI App

● The design will consist of the following components:

○ Screenflow & Wireframes

○ Solution Architecture

○ Technical Architecture


Screenflow & Wireframes

Start with brainstorming ideas on whiteboard/paper

[Wireframe: "Mushroom Identifier" app screens. The upload screen has an UPLOAD button with the prompt "Upload a photo or take a picture"; the result screen shows "Amanita: 98.5% POISONOUS".]

Solution Architecture

● Helps to identify the building blocks in an App

● Start by asking how your App will address the Problem Statement

● Identify the following:

○ The Process being performed by the user

○ The code Execution blocks required to fulfil the Process

○ The State required during the life cycle of the App


Solution Architecture

Process (People): Developers, Users, Data Scientists

● Collect data from Google Image search
● EDA
● Model training/tuning
● User can upload image
● View prediction results
● Build App

Execution (Code): Frontend, Backend, Model Training

● Take an image and apply the same preprocessing
● Use the best model to make prediction
● Return results to user
● Poisonous or not
● Track best model

State (Source, Data, Models): Source Control, Database, Data / Models

● Save images to a common store
● Save model weights
● Information on preprocessing
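The execution steps on this slide (apply the same preprocessing, use the best model, return the result and whether it is poisonous) could be wired together roughly as follows. This is a sketch under assumptions: the metric name `val_accuracy`, the `POISONOUS` lookup, the stand-in `dummy_model`, and all function names are hypothetical and not from the lecture.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical lookup of which labels are poisonous (illustrative only).
POISONOUS = {"amanita": True, "oyster": False}


def pick_best_model(tracked: Dict[str, Dict[str, float]], metric: str = "val_accuracy") -> str:
    """Return the name of the tracked model with the highest value of `metric`."""
    return max(tracked, key=lambda name: tracked[name][metric])


def preprocess(pixels: List[int]) -> List[float]:
    """Scale raw 0-255 pixel values to 0-1, the same way at training and serving time."""
    return [p / 255.0 for p in pixels]


def predict(
    pixels: List[int], model: Callable[[List[float]], Dict[str, float]]
) -> Tuple[str, float, bool]:
    """Preprocess, score with the model, and return (label, probability, poisonous)."""
    scores = model(preprocess(pixels))
    label = max(scores, key=scores.get)
    return label, scores[label], POISONOUS[label]


def dummy_model(features: List[float]) -> Dict[str, float]:
    """Stand-in for a trained model: brighter images score higher for "amanita"."""
    brightness = sum(features) / len(features)
    return {"amanita": brightness, "oyster": 1.0 - brightness}
```

Keeping `preprocess` as a single shared function is one way to honor the "same preprocessing" bullet: training code and the backend would both import it instead of re-implementing it.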


Solution Architecture

[Diagram, three layers:
Process: Upload picture, view predictions; EDA + Model training; Develop App
Execution: Frontend (Mushroom App); Colab (Notebooks); Backend (API Service, Data Collector, Model Tracking)
State: Model Store; Database; Image Store; Source Control
Connection labels: Human Interactions, HTTP, Protocol specific, HTTP / SSH]

Solution Architecture Summary

● Process
○ Developers build App
○ Users can upload pictures and view predictions
○ Data Scientists perform model training

● Colab
○ Web based hosted notebook solution from Google with access to GPUs for model training

● Frontend
○ User friendly single page app with capabilities to upload an image and view prediction results

● Backend
○ API server
○ Data collector
○ Model Tracking

● State
○ Source control to store/version code
○ Database to store the prediction metrics or other metadata
○ Image store for the raw image file
○ Models and model artifacts store
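As one illustration of the State layer, the sketch below records a prediction and its metadata in a database. It uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in; the table name `predictions`, its columns, and the function names are assumptions, not the course's schema.

```python
import sqlite3


def create_store() -> sqlite3.Connection:
    """Open an in-memory database with one table for prediction metadata."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE predictions ("
        "id INTEGER PRIMARY KEY, image_path TEXT, label TEXT, "
        "probability REAL, poisonous INTEGER)"
    )
    return conn


def save_prediction(conn, image_path, label, probability, poisonous) -> int:
    """Insert one prediction row and return its generated id."""
    cur = conn.execute(
        "INSERT INTO predictions (image_path, label, probability, poisonous) "
        "VALUES (?, ?, ?, ?)",
        (image_path, label, probability, int(poisonous)),
    )
    conn.commit()
    return cur.lastrowid
```

Only the path to the image is stored in the row; the raw image file itself would live in the image store, matching the split between Database and Image store in the summary.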


Tutorials

Building Solution Architecture for your Project


Technical Architecture

● Helps design and develop an AI App

● High level view from development to deployment

● Illustrates interactions between components/containers

● Blueprint of the system

○ Helps team members understand the big picture

○ Helps onboard new team members

Building a Technical Architecture

Developers (IDE / CLI, Containers)

● Use IDE (VSCode), CLI to build app
● All development is containerized

Data Scientists (Browser)

● Use Colab/JupyterHub
● EDA & Modeling done using browser

Users (Browser)

● Access the App using a browser
● Upload images and view prediction results


Technical Architecture

[Diagram:
Users (Browser): HTTP 80
Data Scientists (Browser): Colab Notebooks
Developers (IDE / CLI, Containers): GitHub Source Control
GCP:
- GCS Bucket: Data & Models
- Google Container Registry: API Service Image, Mushroom App Image, Download Collector Image
- GCE Persistent Volume: Database Disk, NFS
- Single Compute Instance / Kubernetes Cluster: NGINX Container, Mushroom App Container (HTTP 3000), API Service Container (HTTP 9000, between the NGINX and API Service containers), Database Container (TCP/IP 5432)
Remaining connections are labeled HTTPS 443]

App is ready!

Well, not really, we need to build it

Pavlos says the arrows are completely random 😂
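The port labels on the diagram suggest a reverse-proxy layout. A minimal sketch of that routing, assuming NGINX listens on port 80 and forwards requests to the frontend on 3000 or the API service on 9000. The `/api` path prefix, the host names `frontend` and `api-service`, and the `route` function are hypothetical, chosen only for illustration.

```python
# Ports taken from the diagram labels; host names are illustrative.
UPSTREAMS = {
    "api": ("api-service", 9000),
    "frontend": ("frontend", 3000),
}
API_PREFIX = "/api"  # assumed prefix, not stated on the slide


def route(path: str) -> str:
    """Return the upstream URL a proxy on port 80 would forward `path` to."""
    if path == API_PREFIX or path.startswith(API_PREFIX + "/"):
        host, port = UPSTREAMS["api"]
    else:
        host, port = UPSTREAMS["frontend"]
    return f"http://{host}:{port}{path}"
```

With this layout only port 80 needs to be exposed to users; the 3000, 9000, and 5432 ports stay internal to the compute instance or cluster.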

Technical Architecture Summary

● Source Control
○ GitHub

● Google Cloud Platform (GCP)
○ GCP will be used for deployment

● Google Container Registry
○ GCR to host all the container images

● GCS Buckets
○ Storage buckets for models and model artifacts
○ Image store

● GCE Persistent Volume
○ Database store

● Compute Instance
○ Hosting single instance of all containers

● Kubernetes Cluster
○ Kubernetes cluster will be used to deploy a scalable version of the app on GCP


Setup & Code Organization

1. Create a root project folder mushroom-app
2. Organize containers into sub folders
   a. api-service
   b. data-collector
   c. frontend-simple
3. Setup containers, mount folders for
   a. Persistent storage
   b. Secrets (to store GCP account keys)
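The folder layout on this slide can be scaffolded with a short script. A sketch only: the container folder names come from the slide, while the `persistent-folder` and `secrets` folder names, their placement next to the project root, and the `scaffold` function are assumptions made for illustration.

```python
from pathlib import Path
from typing import List

CONTAINERS = ["api-service", "data-collector", "frontend-simple"]  # from the slide
MOUNTS = ["persistent-folder", "secrets"]  # hypothetical names for the mounted folders


def scaffold(parent: Path, project: str = "mushroom-app") -> List[Path]:
    """Create the project root, one sub folder per container, and the mount folders."""
    root = parent / project
    created = []
    for name in CONTAINERS:
        created.append(root / name)
    # Assumed placement: mounted folders sit next to the project root,
    # so account keys stay outside the source tree.
    for name in MOUNTS:
        created.append(parent / name)
    for folder in created:
        folder.mkdir(parents=True, exist_ok=True)
    return created
```

Calling `scaffold(Path("."))` would create `mushroom-app/` with its three container folders, plus the two mount folders alongside it.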


Tutorials

Mushroom App - Setup & Code Organization


THANK YOU
