It’s All About Me: From Big Data Models to Personalized Experience
Yao Morin, Ph.D.


Page 1: It’s all about me_ From big data models to personalized experience Presentation

It’s All About Me: From Big Data Models to Personalized Experience

Yao Morin, Ph.D.


Go from this…


… to this …


• 30 million users filed their taxes with TurboTax

• 5 million used desktop
• 25 million used online

• TurboTax is 25 years old
• Roots as a desktop app (and old)


SERVICES


Business Logic and TurboTax

• Hard-coded business logic
• Fixed UI flow
• Domain knowledge embedded


Experience A

Experience B

We know what you PREFER


We serve up what’s RELEVANT to you


We know when you need HELP


How can we tailor the experience just for YOU?


Marriage between Data Science and Dynamic and Responsive Frontend


What is Data Science?

It is a multidisciplinary study that incorporates techniques and theories from many fields, such as statistics, mathematics, artificial intelligence, and data engineering.

• Answers questions based on data instead of assumptions
• Extracts meaning from data and explains phenomena
• Uncovers patterns from data and develops predictive models


From business problems to models

• E2E goals definition
• Model KPI, input/output definition
• Model creation and offline evaluation
• Online model coding & validation
• Integration/experience QA
• Online evaluation
• Result analysis

• Training/test set preprocessing
• Algorithm & method selection
• Model training/parameters selection
• KPI measurement/accuracy assessment


Data model building cycle

• Training/test set preprocessing
• Algorithm & method selection
• Model training/parameters selection
• KPI measurement/accuracy assessment


Identify data

• Features: what information do you have
  • From data inventory and/or domain experts
  • Examples: demographic, behavioral, or geographic data, etc.
• Labels (for supervised learning): what you want to predict
  • What kind of products to recommend
  • Whether a customer buys a product
  • How a customer reacts to an experience


Pre-processing data

• “Encoding” categorical data
  • ZIP code, feelings, occupations
  • Dummy coding, bucketing, and others
• Imputation: “filling in” missing data
  • ML estimation, stochastic regression, multiple imputation
• Other cleaning
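
The pre-processing steps above can be illustrated with a minimal Python sketch. The function names, the first-digit ZIP bucketing, and the sample values are assumptions made for illustration, and mean imputation is used here as a simpler stand-in for the imputation methods the slide lists.

```python
from statistics import mean


def dummy_code(value, categories):
    """Dummy-code (one-hot encode) a categorical value against a fixed category list."""
    return [1 if value == category else 0 for category in categories]


def bucket_zip(zip_code):
    """Bucket a ZIP code by its first digit (an illustrative bucketing rule)."""
    return zip_code[0]


def impute_mean(values):
    """Fill in missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical sample data
occupations = ["teacher", "engineer", "nurse"]
encoded = dummy_code("engineer", occupations)  # [0, 1, 0]
bucket = bucket_zip("94043")                   # "9"
filled = impute_mean([40.0, None, 60.0])       # [40.0, 50.0, 60.0]
```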


Model training

Learning the relationship between features and labels through data


Not this kind of relationship


But this kind of relationship

Labels = f(Features)

Regressors, classifiers, etc.


Model evaluation

Evaluate model performance against model-specific performance metrics with hold-out data, and iterate on

• Model type
• Hyperparameters
• Features
• …


Example: Training a model

• User data and labels are separated into training and validation sets
• Training set: preprocessing, then model training (Random Forest), producing the model
• Validation set: preprocessing, then model validation (FP/FN), producing the metric
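
The split, train, and validate steps in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic, the function names are hypothetical, and a single-threshold classifier (a decision stump) stands in for the Random Forest named on the slide so the sketch needs no third-party library.

```python
import random


def split(rows, validation_fraction=0.25, seed=0):
    """Shuffle the rows and separate them into training and validation sets."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - validation_fraction))
    return rows[:cut], rows[cut:]


def train_stump(train):
    """Pick the feature threshold that makes the fewest errors on the training set."""
    best_threshold, best_errors = None, None
    for threshold in sorted({x for x, _ in train}):
        errors = sum((x >= threshold) != label for x, label in train)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def validate(threshold, validation):
    """Count false positives and false negatives on the hold-out set."""
    fp = sum(1 for x, label in validation if x >= threshold and not label)
    fn = sum(1 for x, label in validation if x < threshold and label)
    return {"false_positives": fp, "false_negatives": fn}


# Synthetic (feature, label) pairs: the label is True when the feature is at least 50.
data = [(x, x >= 50) for x in range(0, 100, 5)]
train, validation = split(data)
model = train_stump(train)            # the learned threshold plays the role of "Model"
metric = validate(model, validation)  # FP/FN counts play the role of "Metric"
print(model, metric)
```

Swapping the stump for a real Random Forest would change only `train_stump` and the prediction rule inside `validate`; the split and the FP/FN bookkeeping stay the same.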


Advantages of data models

To have a dynamic, personalized experience, we need to decide algorithmically what to show out of a large variety of possible experiences. Data models solve this:

- Connect user data to user preferences
- Machine learning is automated and handles the complexity


Limitations of data models

• Uncertainties
  • May not be suitable when applications require 100% accuracy
  • May need to build in safeguards for applications that require high accuracy
• Vulnerable to inaccurate, missing, or insufficient data
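
One possible form of safeguard is to act on a model's decision only when the model is confident, and otherwise fall back to a fixed default. The sketch below assumes the model exposes a confidence score; the function name, the threshold value, and the experience names are all hypothetical.

```python
def choose_experience(prediction, confidence, threshold=0.9, default="standard_flow"):
    """Use the model's prediction only when its confidence clears the threshold."""
    if confidence >= threshold:
        return prediction
    # Low confidence: fall back to the fixed, non-personalized experience.
    return default


confident_choice = choose_experience("simplified_flow", confidence=0.95)
fallback_choice = choose_experience("simplified_flow", confidence=0.60)
```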


Traditional process flow

User Requests
• Send information about the user

Logic
• Dispatcher
• If… else… logic blocks

Pages
• Static flow
• Static pages
• Hide/show DOM elements


Dynamic process flow

User Requests
• Send information about the user

Model Service Platform
• Hosts models
• Processes user requests based on user data received

Player
• Consumes the received decision and generates the final user experience
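
The division of work in the dynamic process flow can be sketched as follows. This is an illustrative outline, not the actual TurboTax implementation: the class and function names, the `seconds_on_page` field, and the rule-based stand-in for a trained model are all assumptions.

```python
from typing import Callable, Dict

Model = Callable[[dict], str]


class ModelServicePlatform:
    """Hosts models and turns user data into a decision."""

    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}

    def host(self, name: str, model: Model) -> None:
        self._models[name] = model

    def decide(self, name: str, user_data: dict) -> str:
        return self._models[name](user_data)


def player(decision: str) -> str:
    """Consume the received decision and generate the final user experience."""
    experiences = {
        "show_help": "Need a hand? Chat with an expert.",
        "no_help": "Continue to the next step.",
    }
    return experiences[decision]


def help_model(user_data: dict) -> str:
    # Stand-in for a trained model: a real one would be learned from data.
    return "show_help" if user_data.get("seconds_on_page", 0) > 120 else "no_help"


platform = ModelServicePlatform()
platform.host("help", help_model)

# The user request sends information about the user; the player renders the result.
decision = platform.decide("help", {"seconds_on_page": 300})
print(player(decision))
```

The player knows nothing about how the decision was made, so a model can be retrained or replaced without touching the rendering code.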


Design With Data Science Mindset

Not Static
• Data science and static do not mix
• Do not hardcode paths/pages

Configurable
• Data science works well with configurable components
• Use templates

Scalability
• Experiences should support large amounts of variability
• Use templates (again!)

Maintainability
• A refresh of design should not break underlying logic
• Build experiences with separation of logic and design


How do we apply Data Science to TurboTax UI?


Dynamic Views

Traditional Dynamic UI
Dynamic Data + Static Templates = Dynamic Site

Truly Dynamic UI
Dynamic Data + Dynamic Semantic Templates = Dynamic Site

{ type: template }
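
One reading of the `{ type: template }` idea is a lookup from a semantic type to a template, so that the data decides which template renders it. The sketch below follows that reading; the type names, the markup, and the `render` function are assumptions for illustration only.

```python
from string import Template

# Hypothetical registry: semantic type -> template.
TEMPLATES = {
    "question": Template("<h2>$title</h2><input name='$field'>"),
    "tip": Template("<aside>$text</aside>"),
}


def render(view: dict) -> str:
    """Look up the template by the view's type and fill it with the view's data."""
    template = TEMPLATES[view["type"]]
    return template.substitute(view["data"])


# The page is plain data; which templates appear, and in what order, is decided at run time.
page = [
    {"type": "tip", "data": {"text": "You may qualify for a deduction."}},
    {"type": "question", "data": {"title": "Did you move this year?", "field": "moved"}},
]
html = "".join(render(view) for view in page)
print(html)
```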


Dynamic Flow

Statically Defined Routes/States
• Relationships between pages are pre-determined
• Entry points into the app are pre-determined
• All flow and variation in the application is hard coded

Dynamic Finite State Machine
• Relationships among data are pre-determined
• Entry points are determined dynamically
• Flow through the application is completely data driven
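
A data-driven flow of this kind might be approximated as below. Only the relationships among data are declared; the entry point and the order of states follow from whatever data is already known. The state names, fields, and helper functions are hypothetical and are not taken from the presentation.

```python
# Each hypothetical state declares the field it collects and the data it depends on.
STATES = {
    "ask_marital_status": {"collects": "married", "needs": {}},
    "ask_spouse_income": {"collects": "spouse_income", "needs": {"married": True}},
    "ask_dependents": {"collects": "dependents", "needs": {}},
}


def next_state(user_data):
    """Return the first state whose field is still missing and whose needs are met."""
    for name, spec in STATES.items():
        if spec["collects"] in user_data:
            continue
        if all(user_data.get(key) == value for key, value in spec["needs"].items()):
            return name
    return None  # nothing left to collect


def run(user_data, answers):
    """Walk the flow, recording which states were visited."""
    data = dict(user_data)
    visited = []
    while (state := next_state(data)) is not None:
        visited.append(state)
        field = STATES[state]["collects"]
        data[field] = answers[field]
    return visited


# A new user enters at the first question; the spouse question is skipped.
new_user = run({}, {"married": False, "dependents": 2})
# A returning user known to be married enters directly at the spouse question.
returning_user = run({"married": True}, {"spouse_income": 50000, "dependents": 0})
```

No route table appears anywhere: adding a state means adding one entry to `STATES`, and the flow adjusts on its own.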


FUEGO

• Data science model enabled
• Semantically defined dynamic experiences
• Dynamic application flow
• Device agnostic representation of the UI
• Device specific applications to render the UI