YaC, Moscow, September 19, 2011
Alex Kozlov, Cloudera Inc.
Managing a Zoo: Tools for Managing and
Monitoring Distributed Systems from
Cloudera
Agenda
• About Cloudera and myself
• Background info
  – Data, data everywhere
  – Corporate data management, distributed systems, functional languages
  – Hadoop ecosystem
• Distributed system maintenance
  – Installation/Updating/Monitoring
  – Fixed images
  – Standard configuration management tools
• Our solution
  – Partial failures
  – Node cast
  – …
• Implementation
• What’s next
©2011 Cloudera, Inc. All Rights Reserved.
About Cloudera
Founded in the summer of 2008.
Cloudera’s mission is to help organizations profit from all of their data. We deliver the industry-standard platform which consolidates, stores and processes any kind of data, from any source, at scale. We make it possible to do more powerful analysis of more kinds of data, at scale, than ever before. With Cloudera, you get better insight into your customers, partners, vendors and business.
Introduction
# whoami
alexvk
– Graduated from the Physics Department of Moscow State University (MSU),
Stanford University
– Worked at SGI, HP, Turn
– Senior Solutions Architect, Cloudera, Inc.
# whoru
– Sysadmin
– IT Manager
– TechOps
– Data Scientist
– Researcher
– Developer
– CTO?
…
– Just curious?
Data, data everywhere
We are storing a lot more data:
– 1 click on an average web site generates about 100 lines of logs (somewhere)
– 1 additional attribute/integer (8 bytes) means 1 TB/day of data (from an ex-Google employee)
– 40–80 PB stores are becoming common
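A quick sanity check of the arithmetic above in Python:

```python
# Back-of-the-envelope check: how many logged events per day does
# "1 extra 8-byte attribute ≈ 1 TB/day" imply?
BYTES_PER_ATTRIBUTE = 8
TERABYTE = 10 ** 12  # decimal TB

events_per_day = TERABYTE // BYTES_PER_ATTRIBUTE
print(f"{events_per_day:,} events/day")  # 125,000,000,000

# At ~100 log lines per click, that volume corresponds to on the
# order of a billion clicks per day through the log pipeline.
clicks_per_day = events_per_day // 100
print(f"{clicks_per_day:,} clicks/day")  # 1,250,000,000
```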
Corporate data management
• Traditional
  – EDW, centralized (SPoF)
  – Fixed set of queries (sales/revenue by quarter, etc.)
  – ETL pipeline taking up to 24 hours to run
• Future
  – Data from any source, of any kind, at scale
  – Flexible insights
    • The value is not known beforehand
    • Multiple facets of deep, exhaustive analysis
  – Interactive: 5-minute delay from click to insight
The Origins of Hadoop
– 2002: Open-source web crawler project (Nutch) created by Doug Cutting
– 2004: Google publishes the MapReduce and GFS papers
– 2006: Open-source MapReduce and HDFS project (Hadoop) created by Doug Cutting
– 2008: Yahoo! runs a 4,000-node Hadoop cluster; Hadoop tops the terabyte sort benchmark; Facebook creates Hive, adding SQL support
– 2010: Cloudera releases CDH3 and Cloudera Enterprise
– 2011: Cloudera releases Cloudera Enterprise 3.5 & SCM Express
What is Hadoop?
• HDFS + MapReduce = Hadoop
• Hadoop is an ecosystem (HBase + friends)
• Distributed storage
• Moving computation to the data
• A new model for fault tolerance
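The HDFS + MapReduce pairing above can be illustrated with word count, the canonical MapReduce example, sketched here in plain Python (this shows the programming model only, not the Hadoop API):

```python
from collections import defaultdict

# Minimal sketch of the MapReduce programming model: map each record
# to key/value pairs, shuffle by key, then reduce each key's values.

def map_fn(line):
    for word in line.split():
        yield word, 1

def reduce_fn(word, counts):
    return word, sum(counts)

def map_reduce(records, map_fn, reduce_fn):
    shuffled = defaultdict(list)          # the "shuffle" phase
    for record in records:
        for key, value in map_fn(record):
            shuffled[key].append(value)
    return dict(reduce_fn(k, v) for k, v in sorted(shuffled.items()))

print(map_reduce(["a b a", "b c"], map_fn, reduce_fn))
# {'a': 2, 'b': 2, 'c': 1}
```

In a real cluster the shuffle happens over the network and the map/reduce calls run on the nodes that hold the data, which is exactly the "moving computation to the data" point above.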
CDH Overview
The #1 commercial and non-commercial Apache Hadoop distribution.
– Coordination: Apache ZooKeeper
– Data Integration: Apache Flume, Apache Sqoop
– Fast Read/Write Access: Apache HBase
– Languages / Compilers: Apache Pig, Apache Hive
– Workflow Scheduling: Apache Oozie
– Metadata: Apache Hive
– File System Mount: FUSE-DFS
– UI Framework: HUE
– SDK: HUE SDK
Landscape
• In 2008 (Cloudera founded)
  – 3–5 companies, mostly in the social networking space, using Hadoop in production
  – A lot of interest, but mostly for the wrong reasons
  – Biggest applications just smart log processing
  – Largest installation in the 10s of PB
• In 2011
  – 100s of paying clients
  – 2–3x growth in Hadoop conference attendance year-over-year
  – HBase, Oozie, Mahout
  – Lots of research (Spark, Mesos, low-latency DFS/MR, graph algorithms)
  – Largest installations in the 100s of PB
Problem
• Handling large data in distributed systems is uniquely challenging from an operational perspective.
• Traditional approaches are valuable, but insufficient. Domain knowledge is vital.
• Operational concerns need support within the frameworks themselves.
Datacenter(s) as a computer
• Existing tools do not generalize well
  – Partial failure (how many machines might fail before the datacenter becomes non-operational? … about 50%)
  – Hadoop-like metrics (data locality, # of slots, heartbeat delays)
  – Installation and lifecycle management
  – Heterogeneous nodes
• The ultimate user wants to USE the system, not CONFIGURE it
  – let insight = [ for i in my_smart_algos -> data |> i ]
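The pseudocode above could read in Python as follows, where `my_smart_algos` is the slide's hypothetical list of analysis functions:

```python
# Python rendering of the slide's pseudocode: the user wants to pipe
# the data through each of their analyses, not to configure a cluster.
# `my_smart_algos` is an illustrative list of analysis functions.
data = range(10)

my_smart_algos = [
    lambda d: sum(d),           # total
    lambda d: max(d) - min(d),  # spread
]

insight = [algo(data) for algo in my_smart_algos]
print(insight)  # [45, 9]
```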
Why not machine images
• Machines have complex state (config, local data)
  – Hard unless the state is trivial
  – Images need (rolling) upgrades
  – Machines can change (multiple) roles
Why not config management tools
• They assume M services running on N machines, not X services running in the “cloud”
  – Very bad with “partial failures”
  – Don’t understand Hadoop-specific “state”
  – Don’t understand Hadoop-specific metrics
Our solution
• Managing partial failure
  – Cluster is still usable if x% of nodes fail (but might have data loss if 3 nodes fail at the same time)
  – “Running with concerning health”
• Node cast
  – Every node can be multiple things (think zoo: it can be a tiger or a lion or a monkey)
• Finding nodes like this one, or jobs like this one
  – Nodes are grouped according to functionality (datanode, tasktracker, regionserver, namenode, jobtracker)
  – Find jobs that are similar to a given one and track outliers
• Drill down for Hadoop-specific diagnostics
  – Workflow -> Jobs -> Tasks -> Attempts
Details
• Written in Python and Django
• Each node runs an “SCM agent”
• Dial-in mode
• The agent makes a best effort to keep the prescribed service(s) running
• All state is managed by the “server”
• Diagnostics are passed via heartbeats
• Centralized configuration management
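The dial-in loop above could be caricatured as follows (all names are illustrative, not the actual SCM code): the agent heartbeats its observed state to the server, receives the prescribed services in reply, and makes a best effort to converge:

```python
# Caricature of the agent/server dial-in model. The server is the
# single source of truth; the agent only reports and reconciles.

def heartbeat(server, observed):
    # The real agent attaches diagnostics here; the server replies
    # with the services this node is supposed to run.
    return server.prescribed_services(observed)

def reconcile(prescribed, running):
    for svc in prescribed - running:
        print(f"starting {svc}")
        running.add(svc)
    for svc in running - prescribed:
        print(f"stopping {svc}")
        running.discard(svc)

class FakeServer:
    """Stand-in for the SCM server's centralized configuration."""
    def prescribed_services(self, observed):
        return {"datanode", "tasktracker"}

running = {"datanode", "regionserver"}
reconcile(heartbeat(FakeServer(), running), running)
print(sorted(running))  # ['datanode', 'tasktracker']
```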
Services
Services (partial failure)
New service
SCM node selection
Visualising drill-down
Job matching
• Requires that we build up a rich model of job performance over time
• Surprisingly subtle problem: how do we know when two jobs are the same?
• Periodic jobs offer more clues: time of day, submitting user, map class, reduce class
• Query jobs are more difficult
  – For Hive, for example, query string analysis can tell us something
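One way the clues above could feed a matcher is a simple fingerprint tuple; this is an illustrative sketch, not the production algorithm:

```python
# Illustrative fingerprint for periodic jobs, built from the clues
# listed above: submitting user, map/reduce class, hour of day.
def job_fingerprint(job):
    return (job["user"], job["map_class"],
            job["reduce_class"], job["submit_hour"])

nightly_run = {"user": "etl", "map_class": "LogParseMapper",
               "reduce_class": "CountReducer", "submit_hour": 2}
tonights_run = dict(nightly_run)  # same pipeline, next night

# Jobs with equal fingerprints are candidates for "jobs like this one".
print(job_fingerprint(nightly_run) == job_fingerprint(tonights_run))  # True
```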
Job matching
Diagnosing job performance
• OK, your job really is slow. What now?
• The major cause of slowness, as seen by customers, is skew
• Two predominant types of skew:
  – Environmental skew: identical tasks run differently depending on where they run. Breaks the MR notion of homogeneity; causes severe slowdown.
  – Workload skew: supposedly identical tasks have vastly differing amounts of work to do.
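A simple way to flag workload skew of the kind described above is to look for task durations far above the mean; the threshold and data here are illustrative:

```python
from statistics import mean, stdev

# Flag tasks whose duration sits far above the mean (here, more than
# 2 standard deviations): likely doing much more work than their
# "identical" siblings. The z threshold is illustrative.
def skewed_tasks(durations, z=2.0):
    mu, sigma = mean(durations), stdev(durations)
    return [d for d in durations if sigma and (d - mu) / sigma > z]

task_secs = [30, 31, 29, 32, 30, 240]  # one straggler
print(skewed_tasks(task_secs))  # [240]
```

Distinguishing workload skew from environmental skew would additionally require comparing tasks by the node they ran on, since environmental skew correlates with location rather than input size.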
Visualising skew
What’s next
• Cloudera Enterprise 3.5 & Hadoop Express (June 2011, SCM & SCM Express)
• Cloudera Enterprise 3.7 on the way
Questions?
Do not hesitate to email me: alexvk at cloudera dot com