A New Approach to Building Big Data Apps Product Recommendations using Fast Data and Machine...


Real-Time Product Recommendations For Retail

February 22, 2018

Agenda

• About us
• Recommenders: A Boost to Business
• How it Works?
• A Recipe for Success!
• Realtime recommender in Action
• Summary
• Q&A

Nathan Trueblood

VP, Product Management

DataTorrent Inc.

Kanad Dixit

VP, Data Science

Mindstix Labs Inc.

DataTorrent

• We help large enterprises build, deploy, and operate fast data analytic applications

• Our software makes it easy to deliver business outcomes using the latest innovation in data science and machine learning

Mindstix

• We are Digital Innovation partners to global retail powerhouses

• Revenue growth enabler through highly customized solutions in Data Science and Machine Learning

• Conversational Commerce expertise to improve customer engagement and retention

Introductions

Mindstix … A Next Gen Solutions Provider

Enterprise Mobility
• Mobile Commerce
• Sales Enablement
• AR & VR Experiences
• Executive Insights

Experience Design
• Information Architecture
• Visual and Interaction Design
• Voice and Bot Interfaces
• Omni-channel Experience

Data Science & Engineering
• Big Data Analytics
• Conversational Commerce
• AI & ML Recommenders
• Voice Enabled Experience

Cloud Engineering
• Cloud-native Architecture
• Micro Services and Containers
• Cloud-scale Solutions
• Cloud Migration

Strategy Consulting

Technology Consulting

Implementations & Engineering

DevOps, Cloud Ops, Data Ops

Robust platform to assemble, deliver, and operate real time big data applications

Experts in developing custom Big Data Analytics solutions for Global Retailers

Best-of-breed Big Data Solutions for the Retail Industry Using the Latest Innovation in Data Science

Product Recommendations For Retail

Recommenders: A Boost For Business
Personalized Recommendations Work For Retail

Big Data = Big Gains for Retail Industry

• In 2016, total retail sales across the globe reached $22.049 trillion, up 6.0% from the previous year. Sales are estimated to top $27 trillion in 2020

• The global retail analytics market was worth $2.25 billion in 2015 and is projected to reach $7.47 billion by 2022, an 18.7% CAGR

• Global data generation in retail is expected to grow at a rate of 60% annually

Common Use Cases:

• Personalized Recommendations
• Demand Forecasting
• Forecasting Consumer Trends
• Utilizing Market Basket Analysis
• Social Media Analytics
• Personalized Experience
• Customer 360° Analysis
• Pricing Optimization

Customer Expectations In The Digital Economy

• Know Me
• Show Me You Know Me
• Enable Me
• Value Me

… all in Real-time

“Today, marketplace success is not only defined by a company’s ability to cater to customer needs, but also to intelligently predict and fulfill those needs before the customer even asks. Half of consumers say they are likely to switch brands if a company doesn’t anticipate their needs.”

SalesForce.com “State of the Connected Customer”

Recommendations Work for Retail

• Personalization can reduce acquisition costs by as much as 50%, lift revenues by 5 to 15%, and increase the efficiency of marketing spend by 10 to 30% (McKinsey)

• 94% of companies agree that “personalization is critical to current and future success” (Econsultancy)

• 35% of Amazon consumer purchases come from product recommendations generated by personalized recommendation algorithms (McKinsey)

Source: Barilliance

How Does It Work?

Recommendations On Amazon’s Landing Page

Recommendations On Clicking A Product

Recommendations On Adding To Cart

Beer & Diapers

2008

Men aged 30 to 40, shopping between 5pm and 7pm on Fridays, who purchased diapers were most likely to also have beer in their carts.

Now

Predicting pregnancy before parents know about it.

Predict Your Patterns

Recommender Evolution

Recommendation Systems

Content Based

Hybrid

Collaborative

Content Based vs Collaborative 101

Mindstix Recommender Capabilities

• Graph Based
• Deep Learning
• Machine Learning

Beyond The Line Of (Web)Site

Recommenders for High Touch Boutiques

Recommenders for Sales Enablement

Real-time recommenders for last mile retail

Other industries also have huge growth potential by using personalized recommenders

In Today’s Menu

• Collaborative Filtering
• Hybrid Content Based
• Custom Collaborative Filtering

SparkML-Based Collaborative Filtering

Value Addition for Retailers

• Users who rate similar products similarly tend to have similar buying patterns; predicting the items they are likely to buy can boost revenue

• Using SparkML Library to predict user ratings for a product based on ratings given by other users

• Predicted ratings are used to determine whether a product should be recommended
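The underlying logic above can be illustrated with a minimal sketch. It is not the Spark ML implementation: it uses a simpler neighborhood-based predictor (similarity-weighted deviations from each user's mean rating) as a stand-in, purely to show how a rating is predicted from other users' ratings and then compared with a threshold. The sample data, function names, and the 4.0 threshold are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical ratings: user -> {product: rating on a 1-5 scale}.
RATINGS = {
    "alice": {"tent": 5, "stove": 4, "heels": 1},
    "bob": {"tent": 5, "stove": 5, "heels": 1, "kayak": 5},
    "carol": {"tent": 1, "stove": 2, "heels": 5, "kayak": 1},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def similarity(ratings, a, b):
    """Mean-centered cosine similarity over the products both users rated."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    mean_a, mean_b = mean(ratings[a].values()), mean(ratings[b].values())
    dot = sum((ratings[a][i] - mean_a) * (ratings[b][i] - mean_b) for i in common)
    norm_a = sqrt(sum((ratings[a][i] - mean_a) ** 2 for i in common))
    norm_b = sqrt(sum((ratings[b][i] - mean_b) ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(ratings, user, item):
    """Predict a rating from how other users deviated from their own means."""
    base = mean(ratings[user].values())
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = similarity(ratings, user, other)
        num += sim * (theirs[item] - mean(theirs.values()))
        den += abs(sim)
    return base if den == 0 else base + num / den


def should_recommend(ratings, user, item, threshold=4.0):
    """Recommend only unrated products whose predicted rating clears the threshold."""
    return item not in ratings[user] and predict_rating(ratings, user, item) >= threshold
```

Here `should_recommend(RATINGS, "alice", "kayak")` is true because alice's ratings track bob's (who liked the kayak) and run opposite to carol's (who did not).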

Underlying Logic

Collaborative Filtering

Advantages
• Good at recommending items to users based on their purchase history
• Works well with large item and user spaces
• Works well for cross-domain recommendations

Disadvantages
• Cold start problem when little or no purchase history is available
• The trained model is static, not real-time
• When a new user or item is added, the model must be retrained before it can use that data
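One common way to soften the cold start problem is to fall back to popularity when a user has too little history. The sketch below is an illustrative assumption rather than anything described in the deck: `personalized` stands for any trained recommender passed in as a function, and `min_history` is a hypothetical cutoff.

```python
from collections import Counter


def popular_items(purchases, top_n):
    """Most purchased items across all users, ties broken alphabetically."""
    counts = Counter(item for items in purchases.values() for item in items)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


def recommend(user, purchases, personalized, min_history=3, top_n=2):
    """Use the personalized model only when the user has enough history."""
    history = purchases.get(user, [])
    if len(history) >= min_history:
        return personalized(user)[:top_n]
    owned = set(history)
    # Ask for extra popular items so that filtering out owned ones still leaves top_n.
    candidates = popular_items(purchases, top_n + len(owned))
    return [item for item in candidates if item not in owned][:top_n]


# Hypothetical purchase histories; "new" has no history yet.
PURCHASES = {
    "u1": ["tent", "stove", "lantern"],
    "u2": ["tent", "stove"],
    "u3": ["tent"],
    "new": [],
}
```

With this data, a brand-new user gets the best sellers, while `u1` is routed to the personalized model.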

TensorFlow Based Hybrid Recommender

Value Addition for Retailer

• Personalized predictions can boost sales revenue significantly

• Basket size prediction helps companies prepare for upcoming campaigns

Underlying Logic

• Neural Network trained to predict user order size and recommended product

• User orders, product category and user purchase history used as features for training
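The shape of such a network can be sketched as a shared hidden layer feeding two output heads, one for order size and one for the recommended product. This is a pure-Python forward pass with random, untrained weights, written only to show the structure; it does not use TensorFlow, and the layer sizes, class name, and feature layout are assumptions.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class TwoHeadNet:
    """Shared hidden layer with an order-size head and a product head."""

    def __init__(self, n_features, n_hidden, n_products, seed=0):
        rng = random.Random(seed)
        self.hidden = init_layer(n_features, n_hidden, rng)
        self.size_head = init_layer(n_hidden, 1, rng)
        self.product_head = init_layer(n_hidden, n_products, rng)

    def forward(self, features):
        h = relu(dense(features, *self.hidden))
        order_size = dense(h, *self.size_head)[0]            # regression output
        product_probs = softmax(dense(h, *self.product_head))  # one probability per product
        return order_size, product_probs


# Hypothetical feature vector: one-hot product category followed by purchase counts.
net = TwoHeadNet(n_features=6, n_hidden=4, n_products=3)
order_size, product_probs = net.forward([1, 0, 0, 2, 0, 1])
```

In a real system the weights would be learned from historical orders; the untrained outputs here carry no meaning beyond their shapes.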

Neural Network based Recommenders

Advantages
• Works well even where features may not be well defined
• Unsupervised learning lets neural networks find structure in the data on their own
• Generally, neural network based models are more accurate than the alternatives

Disadvantages
• Neural networks require large amounts of data and significant training time
• Requires specialized hardware such as GPUs and TPUs

Graph DB Based Collaborative Filtering

User Purchase History

Recommended Products for User

Graph Based Recommender Using Neo4J

Value Addition for Retailers
• User purchase and browsing patterns updated in real time, increasing relevance
• Easy personalization and customization possible with the underlying graph data structure
• Additional patterns like recent click history can be implemented easily

Underlying Logic
• Graph data structure used to identify similar purchase patterns from user buying history
• Based on orders from similar users, new products are recommended to the user
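The traversal described above (user to purchased products, to other users who bought them, to those users' other products) can be sketched with an in-memory graph. This is plain Python rather than Neo4j, and the data, the overlap-based scoring, and the function name are illustrative assumptions.

```python
from collections import Counter

# Hypothetical bipartite purchase graph: user -> set of purchased products.
PURCHASES = {
    "u1": {"tent", "stove"},
    "u2": {"tent", "stove", "lantern"},
    "u3": {"tent", "kayak"},
    "u4": {"heels"},
}


def recommend(purchases, user, top_n=3):
    """Score products bought by users whose purchases overlap with this user's."""
    owned = purchases.get(user, set())
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(owned & items)  # shared purchases = strength of similarity
        if overlap == 0:
            continue
        for item in items - owned:
            scores[item] += overlap
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


before = recommend(PURCHASES, "u1")
PURCHASES["u1"].add("kayak")  # a new purchase edge takes effect immediately
after = recommend(PURCHASES, "u1")
```

Because there is no separately trained model, adding a purchase changes the next recommendation right away, which is the real-time property the slide highlights.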

Graph Based Recommendations

Advantages
• Easy to adapt to new data models
• Highly flexible for real-time prediction, like user click relationships

Disadvantages
• Needs migration of existing data to a graph based backend
• Graph based systems are still maturing in terms of distributed computing and deployment

A Recipe For Success

Key Requirements Of Product Recommendation

• Ability to process large volumes of data in-motion and at rest

• Response to consumer interactions in real-time, with microsecond response

• Advanced analytics to gain insights on behavior and effectiveness

• Always-on, enterprise-grade analytical platform

• End-user experience when making recommendations

DataTorrent RTS Processing Platform

• Apache Kafka for event delivery

• Apache Apex for stream data processing

• Spark for machine learning model training

• HDFS for persistent storage

Run on-premises, cloud, or anywhere your data lives

To achieve operability at scale, components intentionally selected & hardened

DataTorrent’s Apoxi™ Framework

Operationalize streaming analytic applications to drive business outcomes for competitive advantage

• Tightly integrated, open-source software components

• Assembly, optimization, lifecycle, and management

• Enterprise integrations & tooling for 24x7 operations

best-of-breed components

business applications

enterprise management & operations

Apoxi™

Insight & Action

Data

do-it-yourself w/o Apoxi™

Proof of Concept

Data

Before:

After:

Supporting the Connected Retail Business
Amplify competitive advantage with real-time insight and action

Layers: Interaction Channels, Ingestion Services, Real-time Enterprise Data Services, Real-time Business Applications, Enterprise Data Stores

Social, Web, Mobile, Sensor, POS

Revenue, CRM, Inventory

Anomaly Detection, Recommendation (Product + Service), Fraud Prevention, Trend Alerting, Customer 360, Credit Risk Scoring, Content Personalization, Predictive Maintenance, Inventory Stockouts, Security Breach Prevention, Service Call Avoidance, Quality Alerting, Campaign Tracking, Financial Risk Monitoring, Outage Detection, Location Tracking / GPS Fencing, Supply Chain Monitoring

Ingestion, Preparation, Compliance Tracking, Archival / Sync, Personalized Offer, Enrichment, Failure Prediction, Fulfillment Tracking, Business KPI Alerting

Insight, Action, Outcome

Data Science Innovation: CEP Rule Engine, Machine Learning, AI

Machine Learning Operational Architecture

Machine Learning Value Delivery: Low Latency and High Throughput ML Scoring Pipeline
• RT Ingestion & Feature Extraction
• RT Feature Enrichment, Optimization & Aggregation
• RT Event ML Scoring
• RT Decisions
• Metadata Lookup
• Ingest transactions to Data Lake
• Ingest Score outcome to Data Lake

Machine Learning Model Innovation
• Data Lake
• Batch Feature Engineering
• Batch Machine Learning
• ML Model
• Continuous model update

Feature Repository (Features, New Features), Model Repository, Lookup Stores

How Does DataTorrent Help?
Operationalizing ML innovation to drive business outcomes


Support for Recommendations
Operationalizing Machine Learning innovation to drive business outcomes

DataTorrent makes it easy to put ANY recommendation engine into production
• SparkML
• TensorFlow
• GraphDB
• And more … easily

Recommenders in Action
Putting it all together to drive better outcomes

Demonstration Overview

E-commerce / mobile / shopping application

Web, Mobile, POS

Page View Events

Recommender

Recommendations

All Shopping / Page View Events

Online Analytics Service

Trend Visualizations

Query / Explore

Persist & Train

• Step 1: Model Training
• Step 2: Event Scoring
• Step 3: Recommendations
• Step 4: Visualization

Step 1: Model Training
Spark ML based Collaborative Filtering

Implementation Logic
• Spark MLlib recommender using the Alternating Least Squares (ALS) algorithm
• Recommends products for users based on products liked by other similar users

Dataset
• Amazon product reviews: 142.8 million reviews spanning May 1996 - July 2014
• Data used for the model: user ratings (user, item, rating, timestamp)

Accuracy & Effectiveness
• Measured & visualized by DataTorrent
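To make the Alternating Least Squares idea concrete, here is a small pure-Python sketch. It is not Spark MLlib and makes simplifying assumptions: explicit ratings only, a plain L2 penalty `reg`, a tiny hand-made dataset, and hypothetical function names. Each half-step holds one side's factors fixed and solves a small regularized least-squares problem for every row of the other side.

```python
import random


def solve(a, b):
    """Solve the small dense system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def update(fixed, observed, rank, reg):
    """Recompute one side's factors while the other side stays fixed."""
    result = {}
    for key, pairs in observed.items():
        a = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for other, rating in pairs:
            v = fixed[other]
            for i in range(rank):
                b[i] += rating * v[i]
                for j in range(rank):
                    a[i][j] += v[i] * v[j]
        result[key] = solve(a, b)  # normal equations: (V^T V + reg I) x = V^T r
    return result


def als(ratings, rank=2, reg=0.1, iters=20, seed=0):
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for user, item, rating in ratings:
        by_user.setdefault(user, []).append((item, rating))
        by_item.setdefault(item, []).append((user, rating))
    items = {i: [rng.random() for _ in range(rank)] for i in by_item}
    users = {}
    for _ in range(iters):
        users = update(items, by_user, rank, reg)
        items = update(users, by_item, rank, reg)
    return users, items


def predict(users, items, user, item):
    return sum(x * y for x, y in zip(users[user], items[item]))


def objective(ratings, users, items, reg):
    """Squared error on observed ratings plus the L2 penalty that ALS minimizes."""
    err = sum((r - predict(users, items, u, i)) ** 2 for u, i, r in ratings)
    penalty = sum(x * x for vec in list(users.values()) + list(items.values()) for x in vec)
    return err + reg * penalty


# Hypothetical (user, item, rating) triples, mirroring the fields listed above.
RATINGS = [
    ("u1", "tent", 5), ("u1", "stove", 4), ("u1", "heels", 1),
    ("u2", "tent", 5), ("u2", "stove", 5), ("u2", "kayak", 5),
    ("u3", "heels", 5), ("u3", "tent", 1), ("u3", "kayak", 1),
    ("u4", "stove", 4), ("u4", "kayak", 4), ("u4", "heels", 2),
]
users, items = als(RATINGS)
```

After training, `predict(users, items, "u1", "kayak")` estimates a rating for a product `u1` never rated. A user or item absent from the training data has no factors, which is the retraining limitation noted earlier in the deck.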

Step 2: Model Delivery to DataTorrent
Spark ML model runs as DataTorrent analytical Operator

Spark ML Model Delivered as Apache Apex Operator

Operator provides:
• Fault tolerance
• Scalability/Partitioning
• Processing Semantics

Platform provides:
• Operational management
• Application & System Metrics
• Analyst and DevOps Visualizations

A successful real-time application combines multiple micro-data services to solve a business problem

Step 3: Recommendation Application

• Ingests page view events – at volume

• Enriches to make data recommendation ready

• Emits recommendation events – low latency

• Provides metrics and visualizations to show overall effectiveness

DataTorrent end-to-end analytic pipeline for Retail product/service recommendations
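The stages listed for the recommendation application (ingest, enrich, emit, measure) can be sketched as chained Python generators. This is an illustration of the data flow only, not Apache Apex or DataTorrent code; the event fields, catalog, and `RELATED` lookup (a stand-in for a trained model) are all hypothetical.

```python
def enrich(events, catalog):
    """Attach catalog attributes; drop events for unknown products."""
    for event in events:
        product = catalog.get(event["product_id"])
        if product is not None:
            yield {**event, "category": product["category"]}


def score(events, related_by_category, top_n=2):
    """Emit one recommendation event per enriched page view."""
    for event in events:
        candidates = related_by_category.get(event["category"], [])
        recs = [p for p in candidates if p != event["product_id"]][:top_n]
        yield {"user_id": event["user_id"], "recommendations": recs}


def run_pipeline(events, catalog, related_by_category):
    """Wire the stages together and count events for basic metrics."""
    metrics = {"ingested": 0, "emitted": 0}

    def ingest(source):
        for event in source:
            metrics["ingested"] += 1
            yield event

    emitted = []
    for rec in score(enrich(ingest(events), catalog), related_by_category):
        metrics["emitted"] += 1
        emitted.append(rec)
    return emitted, metrics


CATALOG = {
    "tent": {"category": "camping"},
    "stove": {"category": "camping"},
    "heels": {"category": "shoes"},
}
RELATED = {"camping": ["tent", "stove", "lantern"], "shoes": ["sandals"]}
EVENTS = [
    {"user_id": "u1", "product_id": "tent"},
    {"user_id": "u2", "product_id": "unknown"},
    {"user_id": "u3", "product_id": "heels"},
]
emitted, metrics = run_pipeline(EVENTS, CATALOG, RELATED)
```

Generators process one event at a time, which loosely mirrors how a streaming operator handles tuples as they arrive rather than in batches.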

Step 4: Measuring Recommender Effectiveness
DataTorrent’s Online Analytics Service Computes & Charts Trends

Business: are we getting expected results?

Effectiveness of Recommendations

Technical: are we meeting the SLA?

Performance of application

Overall: How does this compare to before?

Observing real-time and historical trends
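The business and technical questions above map naturally onto two kinds of metric. As a hedged sketch, the code below computes a click-through rate for recommendations and a 95th-percentile latency checked against an SLA. The event schema, the 100 ms SLA, and the choice of these particular metrics are assumptions, not something the deck specifies.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def effectiveness_report(events, sla_ms=100.0):
    """Summarize business effectiveness (CTR) and technical performance (p95 latency)."""
    shown = [e for e in events if e["type"] == "shown"]
    clicked = [e for e in events if e["type"] == "clicked"]
    latencies = [e["latency_ms"] for e in shown]
    p95 = percentile(latencies, 95) if latencies else None
    return {
        "ctr": len(clicked) / len(shown) if shown else 0.0,
        "p95_latency_ms": p95,
        "sla_met": p95 is not None and p95 <= sla_ms,
    }


# Hypothetical events: four recommendations shown, one of them clicked.
EVENTS = [
    {"type": "shown", "latency_ms": 10},
    {"type": "shown", "latency_ms": 20},
    {"type": "shown", "latency_ms": 30},
    {"type": "shown", "latency_ms": 200},
    {"type": "clicked"},
]
report = effectiveness_report(EVENTS)
```

Running the same report over successive time windows would give the real-time and historical trend comparison the slide refers to.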

Summary


• Personalized recommendations are key to ANY digital commerce

• You need to employ a variety of data science techniques

• DataTorrent RTS has the run-time for production deployment

• Mindstix has the retail, data science and DataTorrent expertise

• Together, we can get your business quickly to a profitable outcome

Questions And How We Can Help

Talk to us!
• https://www.datatorrent.com/contact-us/
• datascience@mindstix.com

Check Out DataTorrent AppFactory
• Applications & building blocks
• https://datatorrent.com/appfactory

Download software
• https://datatorrent.com/download/

Learn more
• Webinars: https://datatorrent.com/webinars/
• Videos: https://youtube.com/user/DataTorrent
• Slides: http://slideshare.net/DataTorrent/presentations

DataTorrent – www.datatorrent.com

Mindstix Labs – www.mindstix.com

Thank You!

Backup & Technical Content

About DataTorrent

• Formed in 2012 by Yahoo Big Data and Hadoop creators.

• Developed Apache Apex, an event-based stream processing engine, the industry's only production-grade platform designed to ingest, transform, and analyze every data type in real time.

• Developed RTS, an easy-to-use management and monitoring UI featuring real-time dashboards, automated fault tolerance, and real-time predictive models.

• Developed a rich set of pre-built operators, or building blocks, to assist in developing micro data services and applications.
