Page 1: Treasure Data and Heroku

Treasure Data and Heroku

Masahiro Nakagawa

Heroku Meetup #8 TreasureData + Waza Report!! Thu, 04 Apr 2013

Friday, April 5, 13

Page 2: Treasure Data and Heroku

Who are you?

Masahiro Nakagawa
• @repeatedly

Treasure Data, Inc.
• Senior Software Engineer, since 2012/11

Open Source projects
• D Programming Language
• MessagePack: D, Python, etc...
• Fluentd: Core, mongo, etc...
• etc...

Page 3: Treasure Data and Heroku

Introduction to Treasure Data

Page 4: Treasure Data and Heroku

Company Overview

Silicon Valley-based Company
• All Founders are Japanese
  • Hironobu Yoshikawa
  • Kazuki Ohta
  • Sadayuki Furuhashi

OSS Enthusiasts
• MessagePack, Fluentd, etc.

Page 5: Treasure Data and Heroku

Investors
• Bill Tai
• Naren Gupta - Nexus Ventures, Director of Red Hat, TIBCO
• Othman Laraki - Former VP Growth at Twitter
• James Lindenbaum, Adam Wiggins, Orion Henry - Heroku Founders
• Anand Babu Periasamy, Hitesh Chellani - Gluster Founders
• Yukihiro “Matz” Matsumoto - Creator of Ruby
• Dan Scheinman - Director of Arista Networks
• Jerry Yang - Founder of Yahoo!
• and 10 more people

Page 6: Treasure Data and Heroku

Treasure Data = Cloud + Big Data

[Figure: market map. Axes: Data Volume (with a threshold at 1 billion entries or 10 TB) and On-Premise vs. Cloud. Segment labels: Lightweight RDBMS, Enterprise RDBMS, DB2, Traditional Data Warehouse, Database-as-a-service, Big Data-as-a-Service. Market-size labels: $10B market, $34B market. Source: © 2012 Forrester Research, Inc. Reproduction Prohibited]

Page 7: Treasure Data and Heroku

Why Cloud? ‘Time’ is Money

[Figure: Customer Value plotted against Time, starting at Sign-up or PO. Curves: Ideal Expectation; Reality (On-Premise); AWS (or hosted Hadoops). Labels: HW/SW Selection, PoC, Deploy...; Maintain; Upgrade; Obsolete over time; EC2, EMR, Redshift, S3; Step-by-step manual integrations]

Page 8: Treasure Data and Heroku

Full Stack Support for Big Data Reporting

Our best-in-class architecture and operations team ensure the integrity and availability of your data.

Data from almost any source can be securely and reliably uploaded using td-agent in streaming or batch mode.

Our SQL, REST, JDBC, ODBC and command-line interfaces support all major query tools and approaches.

You can store gigabytes to petabytes of data efficiently and securely in our cloud-based columnar datastore.

Page 9: Treasure Data and Heroku

Product

Data Collection
• Sources: Web Log, App Log, Sensor, RDBMS, CRM, ERP
• Streaming Upload: Open-Source Log Collector, used by 2,000+ companies (incl. LinkedIn, etc.); >60 billion / month
• Bulk Upload / Parallel Upload: Bulk Loader for CSV / TSV, MySQL, Postgres, Oracle, etc.

Data Warehouse
• Columnar Storage + Hadoop MapReduce
• 250bil+ records, 2mil+ jobs

Data Analysis
• SQL (HiveQL) or Pig
• REST, JDBC / ODBC
• BI Tools: Tableau, QlikView, Excel, etc.
• Dashboard
• Result push: Custom App, RDBMS, FTP, etc.

Value Proposition: “Time-to-Answer”
• 20bil+, 2 weeks, UK/Austria
• 3bil+, 3 weeks, Singapore
• 2 weeks, US
• 2 weeks, US
• 3 weeks, Japan

Multi-Tenant: Single Code for Everyone - no code modification, Improving the Platform Faster.

Page 10: Treasure Data and Heroku

Customer Use Cases

Page 11: Treasure Data and Heroku

Our Customers – Fortune Global 500 leaders and start-ups including:

Page 12: Treasure Data and Heroku

Example in AdTech: MobFox

1. Europe’s largest independent mobile ad exchange.

2. 20 billion ad impressions/month (circa Jan. 2013)

3. Serving ads for 15,000+ mobile apps (circa Jan. 2013)

4. Needed Big Data Analytics infrastructure ASAP.

Page 13: Treasure Data and Heroku

Two Weeks From Start to Finish!

Page 14: Treasure Data and Heroku

Viki.com: “Global Hulu”

Page 15: Treasure Data and Heroku

Viki.com Before

• Hard to manage Hadoop
• Complicated data collection

Page 16: Treasure Data and Heroku

Viki.com After

• No more Hadoop maintenance
• Versatile data collector, td-agent

Page 17: Treasure Data and Heroku

Our Usage

Page 18: Treasure Data and Heroku

https://console.treasure-data.com/

Page 20: Treasure Data and Heroku

Other usage
• Staging environment
• Internal testing application
• Proxy server for the services we use

Page 21: Treasure Data and Heroku

Heroku integration

Page 24: Treasure Data and Heroku

Heroku addons

https://addons.heroku.com/provider/resources/technical/how/overview

Page 25: Treasure Data and Heroku

Page 27: Treasure Data and Heroku

Using Heroku addon

Setup “td” command
• Install via td-toolbelt or rubygems

Setup “td” heroku plugin
• heroku plugins:install https://github.com/treasure-data/heroku-td.git

Add ‘td’ gem to your Gemfile
• or STDOUT log collecting

“heroku td” is now available for Treasure Data
• “heroku td xxx”: xxx is the same as “td” command

https://devcenter.heroku.com/articles/treasure-data

Page 28: Treasure Data and Heroku

Just STDOUT

Use STDOUT to collect event logs
• No libraries needed
• Logs are forwarded via Heroku syslog drain

Format
• @[db_name.table_name] json_in_one_line
• Ruby:

puts '@[service.users] {"name":"D", "via":"Phobos"}'
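The format above can be wrapped in a small helper so that the JSON always stays on one line. What follows is a minimal sketch and not part of the td gem: the names `td_stdout_line`, `parse_td_stdout_line`, and `TD_LINE` are hypothetical, and the parser is only an assumption about how a receiver might split such a line, included so the format can be checked in tests.

```ruby
require 'json'

# Hypothetical helper (not part of the td gem): formats one event in the
# "@[db_name.table_name] json_in_one_line" shape shown on the slide.
def td_stdout_line(database, table, record)
  "@[#{database}.#{table}] #{JSON.generate(record)}"
end

# Hypothetical pattern for the same shape; an assumption for testing only,
# not a description of how Treasure Data parses drained logs.
TD_LINE = /\A@\[(?<db>[^.\]]+)\.(?<table>[^\]]+)\]\s+(?<json>\{.*\})\z/

# Returns a hash with :database, :table and :record, or nil when the line
# does not match the format.
def parse_td_stdout_line(line)
  m = TD_LINE.match(line.chomp)
  return nil unless m

  { database: m[:db], table: m[:table], record: JSON.parse(m[:json]) }
end

# Flush every line immediately so events are not held in a buffer.
$stdout.sync = true
puts td_stdout_line('service', 'users', { 'name' => 'D', 'via' => 'Phobos' })
```

Running this prints `@[service.users] {"name":"D","via":"Phobos"}`. `JSON.generate` escapes embedded newlines, so a record never spills onto a second log line.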

http://blog.treasure-data.com/post/41886298790/just-stdout-the-simplest-most-flexible-way-to-collect

Page 29: Treasure Data and Heroku

Page 30: Treasure Data and Heroku

Conclusion

Treasure Data
• Cloud based Big-data analytics platform
• Provide Machete for Big data reporting

Heroku and Treasure Data
• Treasure Data addon
  • easy to integrate with your Heroku app
• STDOUT log collecting with Heroku syslog drain

Page 31: Treasure Data and Heroku

Big Data for the Rest of Us

www.treasure-data.com | @TreasureData
