Rakuten Inc. RIT. Masaya Mori Nov. 7th, 2012
E-commerce
2
Rakuten Open Data
IT
http://rit.rakuten.co.jp/rdr/index.html
3
Introduction
SuperDB
BigData
4
Introduction
Masaya Mori
Twitter: @emasha
5
Rakuten Group
Introduction
SuperDB
BigData
6
n n 3,2097,615n 1997217nIPO 2000419n 1,079201112n3,7992011n 7562011
7
13 EC
2Tokyo, New York)
FreeCause (USA), LinkShare (USA), Tradoria (Germany)
8
Next Reality - -
R&D
Tokyo & NY
9
Personalize Platform Recommender Engine
(working on) Data Mining, NLP, Semantic Web
Recommender Platform
SPDB: item DB, user DB, purchase history DB, page-view history DB
[recommender logic] Collaborative filtering, retargeting, basket
Search Tech
Global Catalogue Creation Noise Detection
Next E-Commerce Platform
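The recommender logic named on this slide (collaborative filtering over a purchase history DB) can be illustrated with a minimal item-based sketch. The sample data, the names `item_similarities` and `recommend`, and the choice of cosine similarity are assumptions for illustration only; the slides do not say which variant the Recommender Platform uses.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical purchase history: user -> set of purchased item IDs.
PURCHASES = {
    "u1": {"camera", "sd_card", "tripod"},
    "u2": {"camera", "sd_card"},
    "u3": {"sd_card", "tripod"},
    "u4": {"novel"},
}

def item_similarities(purchases):
    """Cosine similarity between items over binary purchase vectors."""
    buyers = defaultdict(int)  # item -> number of buyers
    co = defaultdict(int)      # (item_a, item_b) -> number of co-buyers
    for items in purchases.values():
        for item in items:
            buyers[item] += 1
        for a, b in combinations(sorted(items), 2):
            co[(a, b)] += 1
    sims = defaultdict(dict)
    for (a, b), n in co.items():
        s = n / sqrt(buyers[a] * buyers[b])
        sims[a][b] = s
        sims[b][a] = s
    return sims

def recommend(user, purchases, sims, top_n=3):
    """Score items the user has not bought by summed similarity to owned items."""
    owned = purchases[user]
    scores = defaultdict(float)
    for item in owned:
        for other, s in sims.get(item, {}).items():
            if other not in owned:
                scores[other] += s
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

SIMS = item_similarities(PURCHASES)
```

With this toy data, `recommend("u2", PURCHASES, SIMS)` suggests the tripod, because both items u2 already owns were co-purchased with it by other users.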
10
SuperDB
BigData
Introduction
11
Amazon,
Pandora Radio
PDCA
12
SuperDB
BigData
Introduction
13
E-Commerce Portal and Media
Telecommunications
Securities
Credit Card
Professional Sports
Banking
E-money
14
78,000,000+ 800,000,000+ 68,000,000+ 3,000,000+ 1 37,000+ 60,000+ . 1Access Log etc
15
16
Rakuten runs many businesses, so it holds many kinds of business data. The data is highly diversified.
We aggregate this data into one large data warehouse (DWH):
Rakuten Super DB
It is a core asset that generates revenue for us.
17
DB
Mosaic
18
19
CVR
20
TOHO
Recommender Platform
DB
DB for service
21
DVD
22
SuperDB
BigData
Introduction
23
DB DB
24
NLP
Global Catalogue Creation Noise Detection
DB
25
PDCA
Plan (Hypothesis)
Do (Learning)
Check (Understanding)
Action (Prediction)
26
SuperDB
BigData
Introduction
27
K-Means, pLSI, LSH (Locality-Sensitive Hashing)
Collaborative Filtering, Basket Analysis
Text Matching, Clustering
Clustering Coefficient
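Of the techniques listed on this slide, basket analysis is the easiest to show in a few lines. The sketch below computes support, confidence and lift for item pairs; the baskets, the name `pair_rules`, and the `min_support` threshold are hypothetical and are not taken from the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: one set of item IDs per order.
BASKETS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk"},
]

def pair_rules(baskets, min_support=0.25):
    """Return {(a, b): (support, confidence, lift)} for each rule a -> b."""
    n = len(baskets)
    item_count = Counter(i for b in baskets for i in b)
    pair_count = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = {}
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue  # drop pairs that rarely occur together
        for x, y in ((a, b), (b, a)):
            confidence = c / item_count[x]            # P(y | x)
            lift = confidence / (item_count[y] / n)   # P(y | x) / P(y)
            rules[(x, y)] = (support, confidence, lift)
    return rules

RULES = pair_rules(BASKETS)
```

A lift above 1 (as for butter -> bread in this toy data) means the two items appear together more often than independent purchases would predict.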
28
29
CRF
30
31
CRF
6040
CatID: 2034500167
32
IP, SVM / Passive-Aggressive
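The slide names Passive-Aggressive learning next to SVM without further detail. As a rough illustration, here is a minimal binary PA-I learner in plain Python. The toy data, the names `pa_train` and `pa_predict`, and the parameters `C` and `epochs` are assumptions; the slides do not describe the features or the task setup actually used.

```python
def pa_train(samples, n_features, C=1.0, epochs=5):
    """Train a binary Passive-Aggressive (PA-I) classifier.

    samples: list of (x, y) with x a list of floats and y in {-1, +1}.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in samples:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            loss = max(0.0, 1.0 - margin)   # hinge loss
            norm_sq = sum(xi * xi for xi in x)
            if loss == 0.0 or norm_sq == 0.0:
                continue                    # passive: margin already satisfied
            tau = min(C, loss / norm_sq)    # aggressive step, capped by C (PA-I)
            w = [wi + tau * y * xi for wi, xi in zip(w, x)]
    return w

def pa_predict(w, x):
    """Return +1 or -1 from the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

# Hypothetical toy data: feature 0 indicates class +1, feature 1 class -1.
DATA = [([1.0, 0.0], 1), ([0.0, 1.0], -1), ([1.0, 0.2], 1), ([0.1, 1.0], -1)]
W = pa_train(DATA, n_features=2)
```

Because each update uses only the current example, this kind of learner can process a stream of labeled items one at a time, which is a common reason to prefer it over batch SVM training on large data.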
33
SNS SOM EM OK/NG
FFNN
No Image
34
RSGP
35
36
SuperDB
BigData
Introduction
37
Along with this, processing the data is becoming increasingly difficult.
38
Big Data
It is getting more and more difficult to handle.
39
Hadoop, NoSQL
OSS
40
DB
1/1
1/300GB
M/R
1
70
RAN DB
Calculate
Rakuten Product
41
Batch
NGS Hive (NGS common platform for Hive)
Shared Hadoop Cluster
dictionary batch server: Dictionary Index, update search index
suggest batch server: Suggest Index, update search index
sync analyzed data
300GB
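The batch diagram mentions a suggest batch server that produces a Suggest Index from data analyzed on Hive and Hadoop. One plausible shape for such a job, sketched here as an in-process map and reduce in plain Python, is to emit every prefix of each logged query and keep the most frequent queries per prefix. The log sample and the names `map_phase`, `reduce_phase` and `top_k` are hypothetical; this is not the Hadoop or Hive API and not necessarily how the Rakuten batch works.

```python
from collections import Counter, defaultdict

# Hypothetical search-log sample; a real batch would read this from Hive.
QUERY_LOG = [
    "camera", "camera bag", "camera", "card",
    "car seat", "camera bag", "camera",
]

def map_phase(queries):
    """Emit (prefix, query) for every prefix of every logged query."""
    for q in queries:
        for end in range(1, len(q) + 1):
            yield q[:end], q

def reduce_phase(pairs, top_k=3):
    """Group by prefix and keep the top_k most frequent queries."""
    grouped = defaultdict(Counter)
    for prefix, query in pairs:
        grouped[prefix][query] += 1
    index = {}
    for prefix, counts in grouped.items():
        # Most frequent first; ties broken alphabetically for stable output.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        index[prefix] = [q for q, _ in ranked[:top_k]]
    return index

SUGGEST_INDEX = reduce_phase(map_phase(QUERY_LOG))
```

Looking up `SUGGEST_INDEX["ca"]` then returns the candidate completions for that prefix, ordered by how often each query appeared in the log.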
42
43
For closing
SuperDB
BigData
Introduction
44
Rakuten Open Data
IT
http://rit.rakuten.co.jp/rdr/index.html