Page 1: All You Need to Know to Build a

All You Need to Know to Build a Product Knowledge Graph

Nasser Zalmout, Amazon

Chenwei Zhang, Amazon

Xian Li, Amazon

Yan Liang, Amazon

Xin Luna Dong, Amazon → Facebook

Page 2: All You Need to Know to Build a

Outline

• Overview and Introduction (20 min)
• Knowledge Extraction (40 min)
• Knowledge Cleaning (25 min)
• Break (20 min)
• Ontology Mining (25 min)
• Applications (20 min)
• Conclusion and Future Directions (10 min)

Page 3: All You Need to Know to Build a

Overview and Introduction

Overview and Introduction 20 min

Knowledge Extraction

Knowledge Cleaning

Q&A

Break

Ontology Mining

Applications

Conclusion and Future Directions

Q&A

Page 4: All You Need to Know to Build a

Knowledge Graph Example for 2 Songs

[Figure: a knowledge graph for two songs. Entities: mid345, mid346, mid127, mid128, mid129. Relationships: artist, song_writer, genre, name, birth_date, type. Values: “Shake it off”, “Love Story”, “Taylor Alison Swift”, “Taylor Swift”, “Country pop”, “Dance-pop”, “Pop”, 12/13/1989. Entity types: Recording, Genre. Legend: entity type, entity, relationship.]
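A graph like this one can be written down as a list of (subject, relationship, object) triples. The following is a minimal Python sketch of that idea, not the authors' data model. The slide text lists the node IDs and the values but not how they pair up, so the assignment of mid345/mid346 to the songs, mid127 to the artist, and mid128/mid129 to genres is an assumption, and the helper names `build_index`, `objects`, and `songs_by_artist_name` are hypothetical.

```python
from collections import defaultdict

# (subject, relationship, object) triples.
# ASSUMPTION: the pairing of node IDs with values is illustrative only;
# the slide lists IDs and values but not which belongs to which.
TRIPLES = [
    ("mid345", "type", "Recording"),
    ("mid345", "name", "Shake it off"),
    ("mid345", "artist", "mid127"),
    ("mid345", "genre", "mid128"),
    ("mid346", "type", "Recording"),
    ("mid346", "name", "Love Story"),
    ("mid346", "artist", "mid127"),
    ("mid346", "song_writer", "mid127"),
    ("mid346", "genre", "mid129"),
    ("mid127", "name", "Taylor Swift"),
    ("mid127", "name", "Taylor Alison Swift"),
    ("mid127", "birth_date", "12/13/1989"),
    ("mid128", "type", "Genre"),
    ("mid128", "name", "Dance-pop"),
    ("mid129", "type", "Genre"),
    ("mid129", "name", "Country pop"),
]


def build_index(triples):
    """Map (subject, relationship) to the list of objects it points to."""
    index = defaultdict(list)
    for subject, relationship, obj in triples:
        index[(subject, relationship)].append(obj)
    return index


def objects(index, subject, relationship):
    """Return all objects for a subject/relationship pair (empty if none)."""
    return list(index.get((subject, relationship), []))


def songs_by_artist_name(triples, artist_name):
    """Follow artist edges backwards: artist name -> artist node -> song names."""
    index = build_index(triples)
    names = []
    for subject, relationship, obj in triples:
        if relationship == "artist" and artist_name in objects(index, obj, "name"):
            names.extend(objects(index, subject, "name"))
    return sorted(names)
```

An entity can carry several names (here, both “Taylor Swift” and “Taylor Alison Swift”), which is why the index maps each key to a list rather than to a single value.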

Page 5: All You Need to Know to Build a

Product Graph Example for 2 Songs

[Figure: the two-song graph with entities mid345, mid346, mid127, mid128, mid129; relationships artist, song_writer, genre, name, birth_date; the same song, artist, and genre names and birth date as in the Knowledge Graph Example for 2 Songs; entity type Genre.]

Page 6: All You Need to Know to Build a

Product Graph Example for 2 Songs

[Figure: the two-song graph (entities mid345, mid346, mid127, mid128, mid129; relationships artist, song_writer, genre, name, birth_date, type) extended with product entities mid567, mid568, mid569, mid570, mid571. The added entities are attached through product relationships and carry ASIN values B0035QUXWR, B0067XLIG8, B0035QUXWQ, B0067XLIG4. Added entity types: Release, Track.]
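The step from a generic knowledge graph to a product graph can be sketched as adding sellable items that point at existing entities. The Python sketch below is illustrative only. Which ASIN and which type (Release or Track) belongs to which product node, which song each product refers to, and the direction of the product relationship are all assumptions, because the slide text does not preserve them. The function name `asins_for_song` is hypothetical.

```python
# Song entities from the generic graph.
# ASSUMPTION: the ID-to-name pairing is illustrative.
SONG_NAMES = {"mid345": "Shake it off", "mid346": "Love Story"}

# Product-side triples.
# ASSUMPTION: the pairing of nodes, ASINs, types, and songs is illustrative,
# as is the edge direction (product node -> song entity).
PRODUCT_TRIPLES = [
    ("mid567", "type", "Track"),
    ("mid567", "ASIN", "B0035QUXWR"),
    ("mid567", "product", "mid345"),
    ("mid568", "type", "Release"),
    ("mid568", "ASIN", "B0067XLIG8"),
    ("mid568", "product", "mid345"),
    ("mid569", "type", "Track"),
    ("mid569", "ASIN", "B0035QUXWQ"),
    ("mid569", "product", "mid346"),
]


def asins_for_song(song_name, song_names=SONG_NAMES, triples=PRODUCT_TRIPLES):
    """Return the sorted ASINs of all products linked to a song with this name."""
    song_ids = {mid for mid, name in song_names.items() if name == song_name}
    product_ids = {s for s, r, o in triples if r == "product" and o in song_ids}
    return sorted(o for s, r, o in triples if r == "ASIN" and s in product_ids)
```

One song can map to several products (for example a single track and an album release), so the lookup returns a list rather than a single ASIN.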

Page 7: All You Need to Know to Build a

Product Graph Example for 2 Products

Page 8: All You Need to Know to Build a

Use Case I: Providing Information

Page 9: All You Need to Know to Build a

Use Case II: Providing Choices

Page 10: All You Need to Know to Build a

Use Case III: Improving Search

Page 11: All You Need to Know to Build a

Use Case III: Improving Search

Page 12: All You Need to Know to Build a

Use Case III: Improving Search

Page 13: All You Need to Know to Build a

Use Case IV: Improving Recommendation

Page 14: All You Need to Know to Build a

Product Graph vs. Knowledge Graph

[Figure: three candidate relationships between a Generic KG and a PG (product graph), labeled (A), (B), and (C).]

Page 15: All You Need to Know to Build a

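Product Graph vs. Knowledge Graph

[Figure: the candidate relationships (A), (B), and (C) between a Generic KG and a PG, followed by a diagram of a Generic KG and a Product KG with the labels "Hardline, softline, consumables, etc." and "Movie, Music, Book, etc."]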

Page 16: All You Need to Know to Build a

But, Is The Problem Harder?

[Figure: a Generic KG and a Product KG, with the labels "Hardline, softline, consumables, etc." and "Movie, Music, Book, etc."]

We focus on retail products in this tutorial.

Page 17: All You Need to Know to Build a

Challenges in Building Product Graph I

❑Sparse and noisy structured data

Page 18: All You Need to Know to Build a

Challenges in Building Product Graph II

❑Extremely complex domains

❑How to identify the millions of product types?

❑How to organize types into a taxonomy tree?

[Figure: sellers’ view vs. buyers’ view]

Page 19: All You Need to Know to Build a

Challenges in Building Product Graph III

❑Big variety across product types

❑Different attributes apply to different product types

❑Different value vocabularies and different patterns

Page 21: All You Need to Know to Build a

Scale Up in 3 Dimensions

Thousands of attributes

Millions of categories

Hundreds of languages

Big challenge: Limited training labels for large-scale, rich data

Page 22: All You Need to Know to Build a

Can We Build A Self-Driving Product Understanding System?

Page 23: All You Need to Know to Build a

Our Goal: Self-Driving Product Knowledge Collection

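Catalog:

Product     Type     Flavor   Color
Product 1   Snacks   Cherry
Product 2   Candy    ?        ?
Product 3   Candy    Choc.    Gold

Taxonomy (levels, top to bottom): Grocery; Snacks, Drinks; Candy

[Figure: the Catalog, the Taxonomy, and User logs feed AutoKnow, which produces a Product KG. In the Product KG the taxonomy levels are Grocery; Snacks, Drinks; Candy, Pretzels. Prod. 1, Prod. 2, and Prod. 3 carry hasType, flavor, and color relationships, with the color value Gold and a synonym relationship between Choc. and Chocolate.]

Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.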
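The catalog on this slide can be read as a sparse grid in which every unfilled cell is a candidate for automated knowledge collection. The Python sketch below lists those cells and computes a simple fill rate. It only illustrates the problem setup and is not part of AutoKnow. Treating the blank Color of Product 1 as missing (in addition to the two cells marked "?") is an assumption, and the function names are hypothetical.

```python
# Catalog rows from the slide; None marks an unfilled cell.
# ASSUMPTION: Product 1's blank Color is treated the same as a "?" cell.
CATALOG = {
    "Product 1": {"Type": "Snacks", "Flavor": "Cherry", "Color": None},
    "Product 2": {"Type": "Candy", "Flavor": None, "Color": None},
    "Product 3": {"Type": "Candy", "Flavor": "Choc.", "Color": "Gold"},
}


def missing_cells(catalog):
    """Return (product, attribute) pairs whose value is not filled in."""
    return [
        (product, attribute)
        for product, row in catalog.items()
        for attribute, value in row.items()
        if value is None
    ]


def fill_rate(catalog):
    """Fraction of catalog cells that already hold a value."""
    cells = [value for row in catalog.values() for value in row.values()]
    return sum(value is not None for value in cells) / len(cells)
```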

Page 24: All You Need to Know to Build a

Our Goal: Self-Driving Product Knowledge Collection

[Figure: the AutoKnow overview (Catalog table with Product 1, Product 2, Product 3; Taxonomy; User logs in, Product KG out), with three kinds of output highlighted in the Product KG: New Type, New Value, Corrected Value.]

Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.

Page 25: All You Need to Know to Build a

Amazon AutoKnow: Self-Driving Product Knowledge Collection

[Figure: the AutoKnow overview (Catalog table with Product 1, Product 2, Product 3; Taxonomy; User logs in, Product KG out).]

• #Types ↑ 3X
• Defect rate ↓ up to 68 percentage points

Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.

Page 26: All You Need to Know to Build a

Amazon AutoKnow: Self-Driving Product Knowledge Collection

Input Data
• PT Taxonomy
• Catalog
• Behavioral Signals (e.g., search logs, reviews, Q&A)

Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.

Page 27: All You Need to Know to Build a

Amazon AutoKnow: Self-Driving Product Knowledge Collection

Input Data
• PT Taxonomy
• Catalog
• Behavioral Signals (e.g., search logs, reviews, Q&A)

Ontology Suite
• Taxonomy Enrichment
• Relation Discovery

Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.

Page 28: All You Need to Know to Build a

Amazon AutoKnow: Self-Driving Product Knowledge Collection

Input Data
• PT Taxonomy
• Catalog
• Behavioral Signals (e.g., search logs, reviews, Q&A)

Ontology Suite
• Taxonomy Enrichment
• Relation Discovery

Data Suite
• Data Imputation
• Data Cleaning
• Synonym Discovery

Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.

Page 29: All You Need to Know to Build a

Amazon AutoKnow: Self-Driving Product Knowledge Collection

Input Data
• PT Taxonomy
• Catalog
• Behavioral Signals (e.g., search logs, reviews, Q&A)

Ontology Suite
• Taxonomy Enrichment
• Relation Discovery

Data Suite
• Data Imputation
• Data Cleaning
• Synonym Discovery

Broad Graph
• {product, attribute, value}
• {value, synonym, value}

Ontology

Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.
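The two triple shapes named for the Broad Graph, {product, attribute, value} and {value, synonym, value}, can be combined so that products using different surface forms of a value are retrieved together. The Python sketch below shows one way to do that and is not the AutoKnow implementation. The Choc./Chocolate synonym and Prod. 3's values come from the slides. The flavor given to Prod. 2, the decision to treat the object of a synonym triple as the canonical form, and the helper names `canonicalize` and `products_with` are assumptions.

```python
# {product, attribute, value} triples.
BROAD_GRAPH = [
    # ASSUMPTION: the slides do not show which product carries this flavor edge.
    ("Prod. 2", "flavor", "Chocolate"),
    ("Prod. 3", "flavor", "Choc."),
    ("Prod. 3", "color", "Gold"),
]

# {value, synonym, value} triples.
# ASSUMPTION: the object of the triple is treated as the canonical form.
SYNONYMS = [("Choc.", "synonym", "Chocolate")]


def canonicalize(product_triples, synonym_triples):
    """Rewrite every value to its canonical form when a synonym is known."""
    canonical = {value: target for value, _, target in synonym_triples}
    return [(p, a, canonical.get(v, v)) for p, a, v in product_triples]


def products_with(product_triples, synonym_triples, attribute, value):
    """Find products whose attribute equals the value, after synonym rewriting."""
    normalized = canonicalize(product_triples, synonym_triples)
    return sorted(p for p, a, v in normalized if a == attribute and v == value)
```

Without the synonym triple, a lookup for flavor "Chocolate" would miss Prod. 3, whose catalog value is the abbreviation "Choc.".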

Page 30: All You Need to Know to Build a

Self Driving to Navigate a Large Space

• Automatic: Fully ML-based
• Annotation free: Weak learning based on existing Catalog data and user behavior
• One-size-fits-all: Few taxonomy-aware models
• Self guidance: Identify important attributes and categories to focus efforts

Page 31: All You Need to Know to Build a

Key Intuition I. Learning with Limited Labels Generated from Existing Catalog Data

Catalog:

Product     Type     Flavor   Color
Product 1   Snacks   Cherry
Product 2   Candy    ?        ?
Product 3   Candy    Choc.    Gold

Taxonomy (levels, top to bottom): Grocery; Snacks, Drinks; Candy

Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.
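One common way to generate training labels from existing catalog data is distant supervision: a value already recorded in the catalog is matched against a product's text to produce token-level tags. The Python sketch below illustrates that general idea with BIO tags. The slide does not say that AutoKnow generates labels in exactly this way, and the product titles, the whitespace tokenizer, and the function names are hypothetical.

```python
def _normalize(token):
    """Lowercase a token and drop surrounding commas and periods."""
    return token.lower().strip(",.")


def bio_tags(title, attribute, value):
    """Tag the first occurrence of a catalog value inside a product title.

    Returns (token, tag) pairs, where the tag is B-<attribute> for the first
    matched token, I-<attribute> for the following ones, and O elsewhere.
    """
    tokens = title.split()
    tags = ["O"] * len(tokens)
    value_tokens = [_normalize(t) for t in value.split()]
    n = len(value_tokens)
    if n == 0:
        return list(zip(tokens, tags))
    normalized = [_normalize(t) for t in tokens]
    for i in range(len(tokens) - n + 1):
        if normalized[i:i + n] == value_tokens:
            tags[i] = f"B-{attribute}"
            for j in range(i + 1, i + n):
                tags[j] = f"I-{attribute}"
            break
    return list(zip(tokens, tags))
```

For a hypothetical title such as "Gold Foil Dark Chocolate Candy, 12 oz" and the catalog value "Dark Chocolate" for flavor, the two matching tokens receive `B-flavor` and `I-flavor`, and every other token receives `O`. Labels produced this way are noisy, since a catalog value may be wrong or may appear in the text with a different meaning.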

Page 32: All You Need to Know to Build a

Key Intuition II. Rich Customer Behavior

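[Figure: two linked views. Product KG: taxonomy levels Grocery; Snacks, Drinks; Candy, Pretzels, with Prod. 1, Prod. 2, and Prod. 3 attached by hasType, flavor, and color relationships (color value Gold; synonym relationship between Choc. and Chocolate). Customer behavior: Query 1, Query 2, Query 3 and Prod. 1, Prod. 2, Prod. 3, connected by purchase, co-purchase, co-view, and mention relationships.]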

Page 33: All You Need to Know to Build a

Key Intuition III. Product Categories as First-Class Citizen in Modeling

[Figure: a taxonomy with levels Grocery; Snacks, Drinks; Candy, Pretzels, and Prod. 1, Prod. 2, Prod. 3 attached to it by hasType relationships.]

Page 34: All You Need to Know to Build a

Key Intuition IV. Leverage Graph Structure

• Taxonomy is a tree
• KG is a graph
• Customer behavior forms a graph

[Figure: three panels. Taxonomy: levels Grocery; Snacks, Drinks; Candy, Pretzels. KG: Prod. 1, Prod. 2, Prod. 3 with hasType, flavor, and color relationships (color value Gold; synonym relationship between Choc. and Chocolate). Customer behavior: Query 1, Query 2, Query 3 and Prod. 1, Prod. 2, Prod. 3, connected by purchase, co-purchase, and co-view relationships.]
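Because the taxonomy is a tree, every product type has exactly one path to the root, which makes it simple to collect all products under a given type. The Python sketch below shows this with a parent map. The product-to-type assignments follow the catalog table used earlier in the deck. Placing Candy and Pretzels under Snacks is an assumption (the slide text preserves only the levels), and the function names are hypothetical.

```python
# Child -> parent edges of the taxonomy tree.
# ASSUMPTION: Candy and Pretzels are placed under Snacks.
PARENT = {
    "Snacks": "Grocery",
    "Drinks": "Grocery",
    "Candy": "Snacks",
    "Pretzels": "Snacks",
}

# hasType edges, following the catalog table (Product 1: Snacks; 2 and 3: Candy).
HAS_TYPE = {"Prod. 1": "Snacks", "Prod. 2": "Candy", "Prod. 3": "Candy"}


def path_to_root(product_type, parent=PARENT):
    """Return the type followed by all of its ancestors, ending at the root."""
    path = [product_type]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def products_under(product_type, has_type=HAS_TYPE, parent=PARENT):
    """Products whose own type is the given type or one of its descendants."""
    return sorted(
        product
        for product, assigned in has_type.items()
        if product_type in path_to_root(assigned, parent)
    )
```

The same ancestor path gives a natural way to share information between a fine-grained type with few products and its better-populated parents.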

Page 35: All You Need to Know to Build a

Key Intuition V. Leverage Different Modals

Page 36: All You Need to Know to Build a

Key Techniques in AutoKnow

Page 37: All You Need to Know to Build a

Deliver the Data Business

1,000,000,000

Page 38: All You Need to Know to Build a

Deliver the Data Business

1
• High-precision models

Page 39: All You Need to Know to Build a

Deliver the Data Business

1,000
• High-precision models
• E2E pipeline + AutoML to reduce modeling cost

Page 40: All You Need to Know to Build a

Deliver the Data Business

1,000,000
• High-precision models
• E2E pipeline + AutoML to reduce modeling cost
• Scale-up to reduce #models: 100s attributes, 1000s categories, 10s languages

Page 41: All You Need to Know to Build a

Deliver the Data Business

1,000,000,000
• High-precision models
• E2E pipeline + AutoML to reduce modeling cost
• Scale-up to reduce #models: 100s attributes, 1000s categories, 10s languages
• Higher yield from multi-modal models

Page 42: All You Need to Know to Build a

Tutorial Structure

[Figure: the AutoKnow architecture, annotated with the tutorial sections Sec 2, Sec 3, Sec 4, and Sec 5. Applications.]

Input Data
• PT Taxonomy
• Catalog
• Behavioral Signals (e.g., search logs, reviews, Q&A)

Ontology Suite
• Taxonomy Enrichment
• Relation Discovery

Data Suite
• Data Imputation
• Data Cleaning
• Synonym Discovery

Broad Graph
• {product, attribute, value}
• {value, synonym, value}

Ontology

Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.

Page 43: All You Need to Know to Build a

Section Structure

• Problem Definition: What are unique challenges for PG beyond generic KGs?

• Short answer (key intuition): What are key intuitions for building product KGs?

• Long answer (details): What are practical tips?

• Reflection (short answer): Can we apply the techniques to other domains?

Page 44: All You Need to Know to Build a

Key Questions We Answer in This Tutorial

• Q1. What are the unique challenges in building a product knowledge graph, and what are the solutions?

• Q2. Are these techniques applicable to building knowledge graphs for other domains?

• Q3. What are practical tips for taking this to production?