Reporting: From MySQL to Hadoop/Hive
Manuel Aldana
Classifieds = Kleinanzeigen
Ad = Kleinanzeige
April 2009 - Sept 2009
Launch of Discovery by DLR, CC license: http://flic.kr/p/9kRijr
8M Ads
11M unique visitors per month
1B page views per month
http://upload.wikimedia.org/wikipedia/commons/0/0b/Relief_Map_of_Germany.svg
Devil / Angel by Alper Cugun, CC license: http://flic.kr/p/5tXxze
Reporting back then...
Diagram (steps 1 to 3): Batch Jobs, Apache POI, Email
Problems followed...
Respects of Residents of Adenau by chuckbiscuito, CC license: http://flic.kr/p/eLk2o
New Curbside Recycling in Richland by Colleen Lane, CC license: http://flic.kr/p/7GKY6c
Sucks!!!
http://flic.kr/p/7GKY6c
http://www.icons-land.com
Diagram (steps 1 to 3): Batch Job; step 3: pull
MapReduce, HDFS
Log file (JSON)
SQL subset (via JDBC), batch job
nightly copyFromLocal
- Model?
pilot log book by buttersweet, CC license: http://flic.kr/p/8Dstb
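A minimal sketch of what a JSON log file without a fixed model could look like: one self-describing JSON object per line, so that line-oriented batch jobs can split the file anywhere. The event type `ad_created` and the fields `ts`, `ad_id`, `category` and `bundesland` are hypothetical names chosen for illustration; the slides do not show the actual log format.

```python
import io
import json
from datetime import datetime, timezone


def format_event(event_type, payload, now=None):
    """Serialize one event as a single JSON line.

    Field names ("ts", "type") are assumptions for this sketch,
    not the format used in the talk.
    """
    now = now or datetime.now(timezone.utc)
    record = {"ts": now.strftime("%Y-%m-%dT%H:%M:%SZ"), "type": event_type}
    record.update(payload)
    # json.dumps escapes newlines inside strings, so each record stays on one line.
    # sort_keys gives stable output; compact separators save space.
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


def append_event(stream, event_type, payload, now=None):
    """Append one event to an open text stream, terminated by a newline."""
    stream.write(format_event(event_type, payload, now) + "\n")


if __name__ == "__main__":
    buf = io.StringIO()  # stands in for the real log file
    append_event(buf, "ad_created",
                 {"ad_id": 1, "category": "Bikes", "bundesland": "Berlin"})
    print(buf.getvalue(), end="")
```

Because every line carries its own field names, new attributes can be added later without migrating old log files; the cost is that readers must tolerate missing fields.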
Diagram (steps 1 to 3): Batch Job; step 3: pull
Cats: Seasonality
Bikes: Seasonality
New Ads By Bundesland on last Sunday
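A report such as "new ads by Bundesland on last Sunday" boils down to a filter plus a grouped count. The sketch below mimics that as a map step and a reduce step in plain Python over JSON log lines. It is an illustration only, not the job used in the talk: the event type `ad_created` and the fields `ts` and `bundesland` are assumed names, and "last Sunday" is taken to mean the most recent Sunday strictly before the given day.

```python
import json
from collections import Counter
from datetime import date, timedelta


def last_sunday(today):
    """Return the most recent Sunday strictly before `today`."""
    # date.weekday(): Monday == 0 ... Sunday == 6
    days_back = (today.weekday() + 1) % 7 or 7
    return today - timedelta(days=days_back)


def map_line(line, target_day):
    """Map step: emit (bundesland, 1) for each ad created on target_day.

    "type", "ts" and "bundesland" are hypothetical field names.
    """
    event = json.loads(line)
    created_on = event.get("ts", "")[:10]  # "YYYY-MM-DD" prefix of the timestamp
    if event.get("type") == "ad_created" and created_on == target_day.isoformat():
        yield event.get("bundesland", "unknown"), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def new_ads_by_bundesland(lines, today):
    """Count new ads per Bundesland for the Sunday before `today`."""
    day = last_sunday(today)
    pairs = (pair
             for line in lines if line.strip()  # skip blank lines
             for pair in map_line(line, day))
    return reduce_counts(pairs)
```

Calling `new_ads_by_bundesland(open_log_file, date.today())` on a day's worth of log lines returns a plain dictionary keyed by Bundesland. In Hive the same report would be a `SELECT ... GROUP BY` over a table backed by these log files.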
Tug of War by toffehoffe, CC license: http://flic.kr/p/nD2nk
Binoculars Portrait by gerlos, CC license: http://flic.kr/p/5KGg5B
Hadoop v. 0.20.1
Hive v. 0.9.0
22 Nodes
11 Reporting Jobs
1TB Overall
5GB Daily
Log file (JSON)
HDFS
nightly copyFromLocal
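The nightly copy step could be scripted roughly as follows. This is a sketch under assumptions, not the setup from the talk: the file naming scheme `events-YYYY-MM-DD.json` and both directory arguments are hypothetical, and the target HDFS directory is assumed to exist already. The only real interface used is the `hadoop fs -copyFromLocal <localsrc> <dst>` shell command.

```python
import subprocess
from datetime import date


def build_copy_command(local_dir, hdfs_dir, day):
    """Build the `hadoop fs -copyFromLocal` argument list for one day's log.

    The events-<date>.json naming is an assumption made for this sketch.
    """
    name = f"events-{day.isoformat()}.json"
    local_path = f"{local_dir}/{name}"
    # Assumes hdfs_dir already exists in HDFS.
    hdfs_path = f"{hdfs_dir}/{name}"
    return ["hadoop", "fs", "-copyFromLocal", local_path, hdfs_path]


def nightly_copy(local_dir, hdfs_dir, day, runner=subprocess.run):
    """Run the copy; `runner` is injectable so the logic can be tested offline."""
    cmd = build_copy_command(local_dir, hdfs_dir, day)
    # check=True makes subprocess.run raise CalledProcessError on a non-zero exit,
    # so a failed upload does not pass silently.
    return runner(cmd, check=True)
```

A scheduler such as cron could call `nightly_copy("/var/log/app", "/data/logs", yesterday)` once per night; passing the argument list directly (instead of a shell string) avoids quoting problems with unusual paths.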
Mac vs. PC by skruk, CC license: http://flic.kr/p/8sdk8Z