Rakuten Inc. RIT. Masaya Mori, Nov. 7th, 2012. Big Data Challenges and Issues at an E-commerce Company: Expectations for Machine Learning

Source: ibisml.org/archive/ibis2012/ibis2012-mori.pdf


  • Rakuten Inc. RIT. Masaya Mori Nov. 7th, 2012

    E-commerce

  • 2

    Rakuten Open Data

    IT

    http://rit.rakuten.co.jp/rdr/index.html

  • 3

    Introduction

    Introduction

    SuperDB

    BigData

  • 4

    Introduction

    Masaya Mori

    Twitter: @emasha

  • 5

    Rakuten Group

    Introduction

    SuperDB

    BigData

  • 6

    3,2097,615

    1997217

    IPO 2000419

    1,079201112

    3,7992011

    7562011

  • 7

    13 EC

    2 (Tokyo, New York)

    FreeCause (USA), LinkShare (USA), Tradoria (Germany)

  • 8

    Next Reality

    R&D

    Tokyo & NY

  • 9

    Personalize Platform Recommender Engine

    (working on) Data Mining, NLP, Semantic Web

    Recommender Platform

    SPDB: item DB, user DB, purchase history DB, page-view history DB

    [recommender logic] Collaborative filtering, retargeting, basket

    Search Tech

    Global Catalogue Creation Noise Detection

    Next E-Commerce Platform
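
The recommender slide above names collaborative filtering over purchase history. As a rough illustration only, the sketch below shows item-based collaborative filtering with cosine similarity on a tiny in-memory purchase table. The function names, the user and item ids, and the data are all hypothetical; the deck does not describe how Rakuten's platform actually computes recommendations.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(purchases):
    """Cosine similarity between items, based on which users bought them.

    purchases: dict mapping a user id to the set of item ids they bought.
    Returns a dict mapping (item_a, item_b) to a similarity in (0, 1].
    """
    item_users = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            item_users[item].add(user)

    sims = {}
    items = sorted(item_users)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            common = len(item_users[a] & item_users[b])
            if common:
                s = common / sqrt(len(item_users[a]) * len(item_users[b]))
                sims[(a, b)] = sims[(b, a)] = s
    return sims


def recommend(user, purchases, sims, top_n=3):
    """Score items the user has not bought by similarity to items they own."""
    owned = purchases[user]
    scores = defaultdict(float)
    for (a, b), s in sims.items():
        if a in owned and b not in owned:
            scores[b] += s
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical purchase history (stand-in for a purchase history DB).
purchases = {
    "u1": {"dvd", "book"},
    "u2": {"dvd", "book", "game"},
    "u3": {"book", "game"},
    "u4": {"dvd"},
}
sims = item_similarities(purchases)
print(recommend("u4", purchases, sims))
```

At production scale the pairwise loop would be replaced by a distributed job; the nested loop here is only meant to make the similarity definition visible.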

  • 10

    SuperDB

    BigData

    Introduction

  • 11

    Amazon,

    Pandora Radio

    PDCA

  • 12

    SuperDB SuperDB

    BigData

    Introduction

  • 13

    E-Commerce Portal and Media

    Telecommunications

    Securities

    Credit Card

    Professional Sports

    Banking

    E-money

  • 14

    78,000,000+ 800,000,000+ 68,000,000+ 3,000,000+ 1 37,000+ 60,000+ . 1Access Log etc

  • 15

  • 16

    Rakuten runs many businesses and therefore holds many kinds of business data. The data is highly diverse.

    We aggregate this data into one large data warehouse (DWH):

    Rakuten Super DB

    It is a core asset that generates revenue.

  • 17

    DB

    Mosaic

  • 18

    [Figure: diagram with elements labeled A through J]

  • 19

    CVR

  • 20

    TOHO

    Recommender Platform

    DB

    DB for service

  • 21

    DVD

  • 22

    SuperDB

    BigData

    Introduction

  • 23

    DB DB

  • 24

    NLP

    Global Catalogue Creation Noise Detection

    DB

  • 25

    PDCA

    PDCA

    Plan (Hypothesis)

    Do (Learning)

    Check (Understanding)

    Action (Prediction)

  • 26

    SuperDB

    BigData

    Introduction

  • 27

    K-Means, pLSI, LSH (Locality-Sensitive Hashing)

    Collaborative Filtering Basket Analysis

    Text Matching Clustering

    Cluster Coefficient
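
Of the techniques listed on this slide, locality-sensitive hashing is the one that is hardest to picture from the name alone. The following is a minimal sketch of one common variant, MinHash with banding, applied to text matching on short titles. It is an assumption that this variant is relevant here; the slide names only LSH. The function names, parameters, and sample titles are hypothetical.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Character k-grams of a lowercased string."""
    text = text.lower()
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token, salted by a seed."""
    digest = hashlib.md5(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(tokens, num_hashes=64):
    """One minimum per seeded hash function; tokens must be non-empty."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions approximates Jaccard similarity."""
    return sum(1 for a, b in zip(sig_a, sig_b) if a == b) / len(sig_a)


def candidate_pairs(signatures, bands=16):
    """Pairs of names whose signatures agree on at least one whole band."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(list)
    for name, sig in signatures.items():
        for b in range(bands):
            buckets[(b, tuple(sig[b * rows:(b + 1) * rows]))].append(name)
    pairs = set()
    for names in buckets.values():
        for i, first in enumerate(names):
            for second in names[i + 1:]:
                pairs.add(tuple(sorted((first, second))))
    return pairs


# Hypothetical product titles.
titles = {
    "a": "Genuine leather wallet, black",
    "b": "Genuine leather wallet, black",
    "c": "Stainless steel water bottle",
}
signatures = {name: minhash_signature(shingles(t)) for name, t in titles.items()}
print(candidate_pairs(signatures))
```

Only candidate pairs need an exact comparison afterwards, which is what makes the approach attractive when the catalogue is too large for all-pairs matching.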

  • 28

  • 29

    CRF

  • 30

  • 31

    CRF

    6040

    CatID: 2034500167

  • 32

    IP, SVM / Passive-Aggressive
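
For readers unfamiliar with the passive-aggressive family named on this slide, the sketch below shows the PA-I online update for binary classification: the weights stay unchanged when an example is classified with margin at least 1, and otherwise move just far enough to fix it, capped by an aggressiveness parameter C. The task, feature names, and labels are made up for illustration; the slide does not say what the classifier is applied to or which variant is used.

```python
def dot(w, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(f, 0.0) * v for f, v in x.items())


def pa_update(w, x, y, C=1.0):
    """One PA-I step. y must be +1 or -1. Mutates and returns w."""
    loss = max(0.0, 1.0 - y * dot(w, x))  # hinge loss
    if loss > 0.0:
        norm_sq = sum(v * v for v in x.values())
        if norm_sq > 0.0:
            tau = min(C, loss / norm_sq)  # step size, capped by C
            for f, v in x.items():
                w[f] = w.get(f, 0.0) + tau * y * v
    return w


def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1


# Hypothetical bag-of-words examples: +1 = acceptable, -1 = not acceptable.
examples = [
    ({"free": 1.0, "click": 1.0}, -1),
    ({"genuine": 1.0, "leather": 1.0}, 1),
    ({"free": 1.0, "bonus": 1.0}, -1),
    ({"leather": 1.0, "bag": 1.0}, 1),
]

w = {}
for _ in range(5):  # a few passes over the toy stream
    for x, y in examples:
        pa_update(w, x, y)

print([predict(w, x) for x, _ in examples])
```

Because each update touches only the features present in the current example, the method suits streams of sparse, high-dimensional data.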

  • 33

    SNS SOM EM OK/NG

    FFNN

    No Image

  • 34

    RSGP

  • 35

  • 36

    BigData

    SuperDB

    BigData

    Introduction

  • 37

    Along with this, processing the data is becoming increasingly difficult.

  • 38

    Big Data

    It is getting more and more difficult to handle.

  • 39

    Hadoop, NoSQL

    OSS

  • 40

    DB

    1/1

    1/300GB

    M/R

    1

    70

    RAN DB

    Calculate

    Rakuten Product

  • 41

    Batch

    Batch

    NGS Hive

    Shared Hadoop Cluster

    dictionary batch server

    Batch

    NGS common platform for hive

    suggest batch server

    Dictionary Index

    Suggest Index

    update search index

    update search index

    sync analyzed data

    Hive

    300GB
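
The batch slides above mention M/R jobs, Hive, and a suggest index built by a batch server. As a loose, single-process illustration of that kind of pipeline, the sketch below counts search queries in map and reduce phases and then builds a small prefix-to-suggestions index. Everything in it is an assumption: the log format, the function names, and the index layout are hypothetical, and the real system runs on a Hadoop cluster rather than in one Python process.

```python
from collections import defaultdict
from itertools import groupby


def map_phase(log_lines):
    """Emit (query, 1) for every non-empty, normalized query line."""
    for line in log_lines:
        query = line.strip().lower()
        if query:
            yield query, 1


def reduce_phase(pairs):
    """Sort by key (the shuffle step), then sum the counts per query."""
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)


def build_suggest_index(query_counts, max_prefix=3, top_n=2):
    """Map each prefix (up to max_prefix chars) to its most frequent queries."""
    index = defaultdict(list)
    for query, count in query_counts:
        for n in range(1, min(max_prefix, len(query)) + 1):
            index[query[:n]].append((count, query))
    return {
        prefix: [q for _, q in sorted(entries, key=lambda e: (-e[0], e[1]))[:top_n]]
        for prefix, entries in index.items()
    }


# Hypothetical query log, one query per line.
log = ["camera", "Camera", "camera bag", "cable", "camera", "cable"]
counts = list(reduce_phase(map_phase(log)))
index = build_suggest_index(counts)
print(index["ca"])
```

In Hive the counting step would typically be a single GROUP BY query; the explicit map and reduce functions are kept here only to show what such a query does underneath.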

  • 42

  • 43

    For closing SuperDB

    BigData

    Introduction

  • 44

    Rakuten Open Data

    IT

    http://rit.rakuten.co.jp/rdr/index.html
