Large-Scale Machine Learning for E-commerce

Ankur Datta, PhD

RIT-Boston Manager

ankur.datta@rakuten.com

Data: Structured or Unstructured

Structured Data Unstructured Data

Product Catalog is Unstructured Data

Two Key Tasks:

Organizing and Searching Product Catalog

Source: http://www.macosxautomation.com/automator/examples/ex04/index.html

Organizing Product Catalog

Product Catalog → Machine Learning → Taxonomy

Application: Organize information for browsing / search / data analysis

Organizing Product Catalog using Classification

Input (product titles):

Lips Too Women's 'Too Sliver' Patent Casual Shoes Size 6

10 Crosby Women's Ynez Pump

1883 by Wolverine Women's Maisie Oxford Tan/Taupe Leather/Suede

1803 Women's 'Nome' Crocodile Dress Shoes Size 9

Machine Learning Model

Output (taxonomy categories):

Women’s Shoes: Comfort Shoes, Pumps, Sneakers, Flats

Decision Tree

Machine Learning Model: Many Decision Trees

Combined decision for x: $w_1 f_1(x) + w_2 f_2(x) + \dots + w_M f_M(x)$
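
The weighted combination on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the system described in the talk: the three keyword "trees" (f1, f2, f3), their rules, and the weights are hypothetical, chosen only to show how M tree outputs are combined with weights w_1..w_M. Because the outputs here are category labels, the weighted sum is applied as a weighted vote per label.

```python
from collections import defaultdict

# Hypothetical "trees": each maps a product title to a category label.
# Real decision trees would be learned from data; these keyword rules
# only stand in for f_1(x), ..., f_M(x).
def f1(title):
    return "Pumps" if "pump" in title.lower() else "Flats"

def f2(title):
    return "Sneakers" if "sneaker" in title.lower() else "Pumps"

def f3(title):
    return "Pumps" if "heel" in title.lower() else "Flats"

def combined_decision(title, trees, weights):
    """Weighted vote: add w_m to the label predicted by tree m."""
    scores = defaultdict(float)
    for tree, weight in zip(trees, weights):
        scores[tree(title)] += weight
    # Return the label with the largest total weight.
    return max(scores, key=scores.get)

trees = [f1, f2, f3]
weights = [0.5, 0.3, 0.2]  # assumed weights w_1..w_M
label = combined_decision("10 Crosby Women's Ynez Pump", trees, weights)
```

Changing the weights changes which trees dominate the vote, which is the role the w_m play in the combined decision.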

Our Large-Scale Machine Learning System for Classification

1. Normalize text

2. Extract features

3. Many levels of Decision Trees serve as classification models
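
Steps 1 and 2 above might look roughly like the following sketch. The talk does not say which normalization rules or features are used, so everything here is an assumption: lowercasing, folding curly apostrophes, stripping punctuation, and counting unigrams and bigrams. The function names normalize and extract_features are hypothetical.

```python
import re
import unicodedata
from collections import Counter

def normalize(title):
    """Lowercase a title, fold curly apostrophes, and split into clean tokens."""
    text = unicodedata.normalize("NFKC", title).lower().replace("\u2019", "'")
    # Replace anything that is not a letter, digit, or apostrophe with a space.
    text = re.sub(r"[^a-z0-9']+", " ", text)
    # Drop quote marks wrapped around a token, e.g. 'nome' -> nome.
    tokens = [tok.strip("'") for tok in text.split()]
    return [tok for tok in tokens if tok]

def extract_features(title):
    """Bag of unigram and bigram counts (an assumed feature set)."""
    tokens = normalize(title)
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)

features = extract_features("1803 Women's 'Nome' Crocodile Dress Shoes Size 9")
```

A feature dictionary of this kind is what a tree-based classifier in step 3 would consume, typically after being mapped to a fixed-length vector.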

Classification Results of New Product Titles

Product Title: Cross-Front Peplum Layered Dress

General: Women’s Clothing > Clothing

More Specific: Party & Cocktail Dresses > Dresses > Women’s Clothing > Clothing

Product Title: Cut-Out Leather Platform Wedge Espadrilles

General: Shoes

More Specific: Pumps > Women’s Shoes > Shoes
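
One way to produce both a general and a more specific category is to classify top-down, one taxonomy level at a time. The sketch below assumes that approach; the talk does not confirm it. The taxonomy fragment and the keyword sets are hypothetical stand-ins for trained per-level models, and descent stops as soon as no child category matches.

```python
import re

# Hypothetical taxonomy fragment: parent -> children.
TAXONOMY = {
    "ROOT": ["Clothing", "Shoes"],
    "Clothing": ["Women's Clothing", "Men's Clothing"],
    "Women's Clothing": ["Dresses", "Tops"],
    "Dresses": ["Party & Cocktail Dresses", "Casual Dresses"],
    "Shoes": ["Women's Shoes", "Men's Shoes"],
}

# Hypothetical keyword sets standing in for a trained classifier per node.
KEYWORDS = {
    "Clothing": {"dress", "blouse", "tie"},
    "Shoes": {"sneaker", "pump", "espadrilles"},
    "Women's Clothing": {"dress", "blouse", "peplum"},
    "Men's Clothing": {"tie"},
    "Dresses": {"dress"},
    "Tops": {"blouse"},
    "Party & Cocktail Dresses": {"peplum", "layered"},
    "Casual Dresses": {"tee"},
    "Women's Shoes": {"pump", "espadrilles"},
    "Men's Shoes": {"brogue"},
}

def pick_child(node, tokens):
    """Return the child with the most keyword hits, or None if no child matches."""
    best, best_hits = None, 0
    for child in TAXONOMY.get(node, []):
        hits = len(KEYWORDS[child] & tokens)
        if hits > best_hits:
            best, best_hits = child, hits
    return best

def classify(title):
    """Descend the taxonomy level by level; stop when no child matches."""
    tokens = set(re.findall(r"[a-z0-9]+", title.lower()))
    path, node = [], "ROOT"
    while True:
        node = pick_child(node, tokens)
        if node is None:
            break
        path.append(node)
    # Most specific category first, matching the paths shown above.
    return " > ".join(reversed(path))

result = classify("Cross-Front Peplum Layered Dress")
```

Stopping early yields a general category only, which is one way a system could fall back to a coarse label when the deeper levels are uncertain.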

Somewhere in US on Wed, 13 Apr 2016 15:59:47 GMT ….

Compact desktop computer → Lenovo thinkcentre → Lenovo all in one → Page 2 → Page 3 → Purchase!

Ideal Situation

Current Situation

?

How do we find the most relevant products for a search query?

Text-based search alone does not do the job!

Learning to Rank

Machine Learning Model that learns to rank search results

Source: http://blog.csdn.net/eastmount/article/details/43080791

query, document, relevance

Relevance based on text alone is not enough!

What else can we use?

How about user-behavior signals?

User’s behavior signals

click, add, buy
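
A minimal sketch of how such signals could feed a learning-to-rank model follows. The talk does not name the algorithm, so this uses a swapped-in technique: a linear scorer trained with a pairwise logistic loss (RankNet-style). The product names, the feature layout [text_match, click_rate, add_rate, buy_rate], the relevance labels, and all numbers are made up for illustration.

```python
import math

# Hypothetical data for one query ("40inch tv"): each product has features
# [text_match, click_rate, add_rate, buy_rate] and a graded relevance label.
products = {
    "40-inch LED TV":     ([0.9, 0.30, 0.10, 0.05], 2),
    "TV wall mount 40in": ([0.9, 0.05, 0.01, 0.00], 0),
    "40-inch smart TV":   ([0.8, 0.25, 0.08, 0.04], 2),
    "HDMI cable for TV":  ([0.7, 0.02, 0.00, 0.00], 0),
}

def score(w, x):
    """Linear relevance score: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_pairwise(items, epochs=200, lr=0.5):
    """Pairwise logistic loss: push score(better) above score(worse)."""
    w = [0.0] * 4
    pairs = [(a, b) for a, ya in items for b, yb in items if ya > yb]
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [p - q for p, q in zip(better, worse)]
            # Gradient of log(1 + exp(-w.diff)) with respect to w.diff.
            g = -1.0 / (1.0 + math.exp(score(w, diff)))
            w = [wi - lr * g * di for wi, di in zip(w, diff)]
    return w

w = train_pairwise(list(products.values()))
ranked = sorted(products, key=lambda name: score(w, products[name][0]), reverse=True)
```

With these made-up numbers the two televisions end up ranked above the accessories, even though the wall mount has the same text-match score as the top television. That is the kind of case where text relevance alone falls short.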

Results of Learning to Rank

Search Query: “40inch tv”

Regular Text Search | Search with User-Signals and Learning to Rank

Not relevant

Not relevant

Summary

• E-commerce data is primarily unstructured data

– Product catalog, merchant and item reviews, search queries

• Proper organization and precise search of this data are necessary for a good customer experience

– We built machine learning models for large-scale classification of product catalogs

– Also, we are learning from user behavior to improve our search relevance
