Proprietary & Confidential. Copyright © 2014. Hado’ops or Had’oops 1 We’re Hiring rocketfuel.com/careers Kishore Kumar Yellamraju Abhijit Pol

Hado"ops" or Had"oops"

Page 1: Hado"ops" or Had"oops"

Kishore Kumar Yellamraju, Abhijit Pol

We're hiring: rocketfuel.com/careers

The Web Is Monetized By Advertising

Delivery Methods

» Display
» Video
» Mobile
» Social

Overview

[Diagram: ad-serving flow between the Browser, Publishers, the Ad Exchange (some exchange partners), and the Rocket Fuel Platform, with Data Partners and Advertisers alongside. Numbered steps: 1 Page Request, 2 Ad Request, 3 Bid Request, 4 Bid & Ad, 5 Rocket Fuel Winning Ad, 6 Ad Served. Platform components: Real-time Bidder (Automated Decisions), Models, Data Store. Labeled flows: User Segments, User Engagements, Ads & Budget, Model Scores, Events, Optimize, Refresh learning.]

Real Time Bidding and Serving

Always buying the best impressions & serving the best ad

[Figure: grid of per-impression bid prices. Signals shown: Site/Page, Geo/Weather, Time of Day, Brand Affinity, User.]

Real Time Bidding and Serving

[Figure: impression scorecards for three campaign goals: leads & sales, coupon downloads, and brand awareness. Each scorecard rates one impression on Demo, Brand Affinity, Time of Day, Geo/Weather, Site/Page, Ad Position, In-Market Behavior, and Response, and is marked with a check or an X.]

Overview

[Diagram: the ad-serving flow from the earlier Overview slide, repeated.]

Throughput

Requests per day:

» Facebook likes: 5 B
» Searches on Google: 6 B
» Bid requests considered by Rocket Fuel: 45 B
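As a rough back-of-the-envelope check, the daily bid-request total can be turned into an average per-second rate. This sketch assumes the "45 B" figure means 45 billion requests and that traffic is spread evenly over the day, which real traffic is not, so peak rates would be higher.

```python
# Back-of-the-envelope: average request rate implied by a daily total.
# Assumes an even spread over 24 hours; real traffic has peaks and troughs.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def avg_requests_per_second(requests_per_day: float) -> float:
    """Return the mean requests per second for a given daily request count."""
    return requests_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    bid_requests_per_day = 45e9  # the "45 B" figure from the slide
    rate = avg_requests_per_second(bid_requests_per_day)
    print(f"{rate:,.0f} requests/s on average")
```

Under those assumptions the average comes to a little over half a million bid requests per second.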

Latency

Time (ms):

» Blink of an eye: 400
» SF to Tokyo network round trip: 100
» One beat of a hummingbird's wing: 20
» Lookup in Blackbird: 2

Architecture and Scale

» Datacenters
» Scale
» Growth
» Architecture

Data Center Expansion

» abc

Data Center Design

• Racks custom built at Rocket Fuel
• Leased space/bandwidth in colocation facilities

Hadoop Server: 20 2U servers (85kW)

Bidders: 40 2-U Twin 2 servers (17kW)

Rocket Fuel Scale

» 34474 CPU processor cores
  – 2655 servers
  – 1874 Teraflops of computing
» 188 Terabytes of memory
  – 13X the memory of IBM computer Watson that played Jeopardy
» 42 PB of storage
  – 106X the data volume of the entire Library of Congress

Hadoop at Rocket Fuel

» 1400 servers
» 15K Disks
» 15K Cores
» 90 TB
» 30K MR slots
» 12K daily MR jobs

Growth

In 1 year:

» 200 servers to 1400 servers
» 5 PB to 41 PB (8x)

Data Architecture 3.0

Hadoop Setup

[Diagram: HA HDFS with an Active NN and a Standby NN, each with a ZKFC; a QJM / ZK quorum; a JT; and six DN/TT worker nodes.]

Hardware profiles shown:

» 6x2TB disks, 2x6 core, 196 GB RAM, 2x1G NIC
» 12x3TB disks, 2x6 core, 64 GB RAM, 10G NIC
» Same as DNs, with a dedicated disk for ZK or JN

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Maintenance is Not Easy

Puppet + Infradb

Automation is key

Puppet and Infradb

» Automate as much as you can
» Adding a slave node to a Hadoop cluster: < 120 seconds
» Bringing up a new Hadoop cluster: < 500 seconds
» MR slots are automatically determined based on hardware config

Isn't it cool?

Just define once
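The slides do not say how MR slots are derived from the hardware config. As a minimal sketch of the idea, the rule below bounds the slot count by both CPU and memory. Every number in it (memory per slot, reserved memory, two slots per core, a 2:1 map-to-reduce split) and the names `derive_slots` and `Slots` are assumptions for illustration, not Rocket Fuel's Puppet or Infradb logic.

```python
# Hypothetical slot-sizing rule; all ratios here are illustrative assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class Slots:
    map_slots: int
    reduce_slots: int


def derive_slots(cores: int, ram_gb: int, gb_per_slot: int = 2,
                 reserved_gb: int = 8) -> Slots:
    """Pick MR slot counts bounded by both CPU and memory."""
    if cores < 1 or ram_gb < 1:
        raise ValueError("cores and ram_gb must be positive")
    # Memory ceiling: keep some RAM back for the OS and the DN/TT daemons.
    by_memory = max((ram_gb - reserved_gb) // gb_per_slot, 1)
    # CPU ceiling: assume at most two task slots per core.
    by_cpu = cores * 2
    total = max(min(by_memory, by_cpu), 2)
    # Assumed 2:1 split between map and reduce slots.
    reduce_slots = max(total // 3, 1)
    return Slots(map_slots=total - reduce_slots, reduce_slots=reduce_slots)


if __name__ == "__main__":
    # A worker with 2x6 cores and 64 GB RAM.
    print(derive_slots(cores=12, ram_gb=64))
```

Keeping the rule in one function is what makes "define once" work: a new hardware profile only needs its core and memory counts recorded.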

Performance Tuning

No issues when the cluster is small. Problems start when it grows.

Performance Tuning

HDFS
» dfs.datanode.handler.count
» dfs.namenode.handler.count
» dfs.datanode.max.transfer.threads
» dfs.image.transfer.timeout

MR
» mapred.reduce.parallel.copies
» mapreduce.reduce.shuffle.parallelcopies
» mapred.job.tracker.handler.count
» io.sort.mb
» io.sort.factor
» IMP: MAPREDUCE-2026

ZK
» maxClientCnxns

JVM
» -XX:+UseConcMarkSweepGC
» -XX:CMSFullGCsBeforeCompaction=1
» -XX:CMSInitiatingOccupancyFraction=60

» ha-timeoutms
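To show how the CMS flags might be applied, the sketch below builds an export line for `hadoop-env.sh`. The three GC flags come from the slide. The heap size, the helper name `daemon_opts_line`, and the choice of `HADOOP_NAMENODE_OPTS` as the target variable are assumptions for illustration.

```python
# Build a hadoop-env.sh export line from the CMS flags listed on the slide.
# The heap size and the target variable are assumptions for illustration.
CMS_FLAGS = (
    "-XX:+UseConcMarkSweepGC",
    "-XX:CMSFullGCsBeforeCompaction=1",
    "-XX:CMSInitiatingOccupancyFraction=60",
)


def daemon_opts_line(env_var: str, heap_gb: int, flags=CMS_FLAGS) -> str:
    """Return an `export VAR="..."` line that prepends heap and GC flags."""
    if heap_gb < 1:
        raise ValueError("heap_gb must be at least 1")
    opts = " ".join((f"-Xmx{heap_gb}g",) + tuple(flags))
    # Keep whatever the variable already held by ending with "$VAR".
    return f'export {env_var}="{opts} ${env_var}"'


if __name__ == "__main__":
    print(daemon_opts_line("HADOOP_NAMENODE_OPTS", heap_gb=64))
```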

We Have an Issue

» MAPREDUCE-5351
» MAPREDUCE-5508
» keep.failed.task.files=true

JT OOM

Number of instances of the "JobInProgress" class = number of users who submitted jobs × mapred.jobtracker.completeuserjobs.maximum

» mapred.jobtracker.completeuserjobs.maximum
» mapred.jobtracker.retirejob.interval
» mapred.jobtracker.retiredjobs.cache.size
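The relation on this slide can be worked through numerically to see why the JobTracker heap grows with the number of users. In the sketch below, the user count, the per-user limit, and the memory per retained job are made-up example values. The per-job footprint in particular would have to be measured, for instance from a heap dump.

```python
# Estimate how many completed JobInProgress objects the JobTracker retains,
# following the relation on the slide. All inputs below are example values.
def retained_jobs(users_with_jobs: int, completeuserjobs_maximum: int) -> int:
    """users who submitted jobs x mapred.jobtracker.completeuserjobs.maximum"""
    if users_with_jobs < 0 or completeuserjobs_maximum < 0:
        raise ValueError("inputs must be non-negative")
    return users_with_jobs * completeuserjobs_maximum


def estimated_heap_mb(jobs: int, mb_per_job: float) -> float:
    """Rough heap estimate; mb_per_job is an assumption to be measured."""
    return jobs * mb_per_job


if __name__ == "__main__":
    jobs = retained_jobs(users_with_jobs=200, completeuserjobs_maximum=100)
    print(jobs, "retained jobs")
    print(estimated_heap_mb(jobs, mb_per_job=0.5), "MB of heap (rough)")
```

Lowering the per-user limit shrinks the product directly, which is presumably why the three retirement settings are listed together.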

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Monitoring

Wall of Ops

Monitoring

» hadoop.namenode.CallQueueLength
» hadoop.jobtracker.jvm.memheapusedm

Don't fly blind; you will crash.

MR Workload Monitoring

Network Monitoring

Don't blame the network; monitor it instead. A network mesh can be a mess.

Alerting

Monitoring is not enough; you need better alerting.

Alerts

>> Checking whether NN and JT are up is a no-brainer
>> Reduce alert noise by having summary/aggregate alerts
>> We rely heavily on custom scripts that query JMX for NN and JT

http://hostname:port/jmx

qry=Hadoop:service=NameNode,name=NameNodeInfo
» NameDirStatuses, DeadNodes, NumberOfMissingBlocks

qry=Hadoop:service=NameNode,name=FSNamesystemState
» FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks

qry=hadoop:service=JobTracker,name=JobTrackerInfo
» Blacklisted TTs, jobs, slots_used, ThreadCount

qry=java.lang:type=Memory
» Used JVM, free JVM, etc.
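A custom JMX check of the kind described here can be sketched in a few lines. The example assumes the `/jmx` servlet returns JSON of the form `{"beans": [...]}` and that `FSState` reads `Operational` on a healthy NameNode. The thresholds and the function names (`first_bean`, `fetch_bean`, `namenode_alerts`) are hypothetical.

```python
# Sketch of a JMX-based NameNode check. Thresholds and names are assumptions.
import json
from typing import List
from urllib.request import urlopen


def first_bean(payload: str) -> dict:
    """Extract the first bean from the servlet's {"beans": [...]} JSON."""
    beans = json.loads(payload).get("beans", [])
    if not beans:
        raise LookupError("no bean matched the query")
    return beans[0]


def fetch_bean(base_url: str, query: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/jmx?qry=<query> and return the first matching bean."""
    with urlopen(f"{base_url}/jmx?qry={query}", timeout=timeout) as resp:
        return first_bean(resp.read().decode("utf-8"))


def namenode_alerts(state: dict, max_dead: int = 0,
                    max_under_replicated: int = 1000) -> List[str]:
    """Turn FSNamesystemState attributes into a list of alert messages."""
    alerts = []
    if state.get("FSState") != "Operational":
        alerts.append(f"FSState is {state.get('FSState')!r}")
    dead = int(state.get("NumDeadDataNodes", 0))
    if dead > max_dead:
        alerts.append(f"{dead} dead DataNodes (limit {max_dead})")
    under = int(state.get("UnderReplicatedBlocks", 0))
    if under > max_under_replicated:
        alerts.append(
            f"{under} under-replicated blocks (limit {max_under_replicated})")
    return alerts
```

A caller would pass something like `fetch_bean("http://namenode.example.com:50070", "Hadoop:service=NameNode,name=FSNamesystemState")` to `namenode_alerts`, where the host name is a placeholder. Returning every finding in one list makes it easy to send a single summary alert rather than one page per attribute.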

MR Workload Alerting

» Monitoring MR workload and alert
  – In-house tool that uses the "houdah" ruby gem monitors:
  – Long-running jobs, jobs with more map tasks, blacklisted TTs with more failure counts, etc.
» Collect details and auto-restart blacklisted TTs
» Parse the JT logfile for rogue jobs
» Parse the JT log and collect all job-related info
» White-elephant or hraven could help
» Parse the scheduler HTML page or use the metrics page:
  http://<JT-hostname>:50030/scheduler?advanced
  http://<JT-hostname>:50030/metrics
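The in-house tool itself is not shown in the slides. As a minimal sketch of the kind of rule it might apply, the code below flags long-running jobs and jobs with unusually many map tasks. The `RunningJob` fields and both thresholds are hypothetical, and collecting the job records (from the JT log, the metrics page, or a gem such as houdah) is left out.

```python
# Hypothetical workload rule: flag long-running or oversized MR jobs.
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RunningJob:
    job_id: str
    user: str
    runtime_s: int
    map_tasks: int


def flag_jobs(jobs: Iterable[RunningJob], max_runtime_s: int = 6 * 3600,
              max_map_tasks: int = 50_000) -> List[str]:
    """Return one message per job that breaks a threshold."""
    flagged = []
    for job in jobs:
        reasons = []
        if job.runtime_s > max_runtime_s:
            reasons.append("long running")
        if job.map_tasks > max_map_tasks:
            reasons.append("too many map tasks")
        if reasons:
            flagged.append(f"{job.job_id} ({job.user}): {', '.join(reasons)}")
    return flagged
```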

Multi Tenancy

» Modeling
» OPS
» ETL
» Ad-hoc

Scheduling

No scheduler is perfect unless you understand and tune it properly.

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP: Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram: INW Data Cluster and LSV Data Cluster; US, EU, and HK Serving Clusters; workloads: Modeling, Reporting, User Queries, Research, Ad-hoc Queries; also shown: Processed Data and Amazon Backup.]

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages life cycle of container
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31TB RAM, 8500 disks, 8500 cores
» Primary use case: Map-Reduce
» No more static slots
» Tez, Spark, Storm are in the race

YAY

Obligatory "we are hiring" slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com

  • Hado"ops" or Had"oops"
  • The Web Is Monetized By Advertising
  • Delivery Methods
  • Overview
  • Always buying the best impressions & serving the best ad
  • Real Time Bidding and Serving
  • Overview (2)
  • Throughput
  • Latency
  • Architecture and Scale
  • Data Center Expansion
  • Data Center Design
  • Rocket Fuel Scale
  • Hadoop at Rocket Fuel
  • Growth
  • Data Architecture 3.0
  • Hadoop Setup
  • Operations
  • Maintenance is Not Easy
  • Puppet and Infradb
  • Performance Tuning
  • Performance Tuning (2)
  • We Have an Issue
  • JT OOM
  • Operations (2)
  • Monitoring
  • Monitoring (2)
  • MR Workload Monitoring
  • Network Monitoring
  • Alerting
  • Alerts
  • MR Workload Alerting
  • Multi Tenancy
  • Scheduling
  • Operations (3)
  • BCP
  • Data BCP Cluster
  • YARN
  • YARN at Rocket Fuel
  • Obligatory "we are hiring" slide
  • Slide 41
Page 2: Hado"ops" or Had"oops"

Proprietary amp Confidential Copyright copy 2014

The Web Is Monetized By Advertising

Proprietary amp Confidential Copyright copy 2014

Delivery Methods

raquoDisplayraquoVideoraquoMobileraquoSocial

Proprietary amp Confidential Copyright copy 2014

6 Ad Served

User Segments

3 Bid Reques

t

Overview

Publishers

2 Ad Request

1 Page Request

4 Bid amp Ad

User Engagemen

ts

Data Partners

Advertisers

Browser

Some Exchange Partners

Ad Exchange

Optimize

Rocket Fuel Platform

Real-time BidderAutomated Decisions

Models

Refresh learning

Data Store

Ads ampBudget

ModelScores

Events

5 RocketfuelWinning Ad

Proprietary amp Confidential Copyright copy 2014

$238965$06782$17234

$009$178964$16782$17234$0809$242125

$211$126

$2178$2056$0809$242125

$211$126$278$156

$1809$242125

$211$126$278$056$242125

$211$126$278

$0756$0809$242125

$211$126$278

$1256$1809$242125

$211$126$278

$0586$2009

125$211$126$278$156

$000

[ + ][ + ]

SitePageGeoWeatherTime of DayBrand AffinityUser

Always buying the best impressions amp serving the best ad

Real Time Bidding and Serving

Proprietary amp Confidential Copyright copy 2014

GoalLeadsamp sales

GoalCoupondownloads

GoalBrandawareness

SitePageGeoWeatherTime of DayBrand AffinityDemo

Impression ScorecardDemoBrand AffinityTime of DayGeoWeatherSitePageAd PositionIn-marketBehaviorResponse

Impression ScorecardDemoBrand AffinityTime of DayGeoWeatherSitePageAd PositionIn-MarketBehaviorResponse X

Impression ScorecardDemoBrand AffinityTime of DayGeoWeatherSitePageAd PositionIn-MarketBehaviorResponse

+100+40-20+20+15+10+40+35

+97

+40-70-20+10+15-25-40-18

+07

+10-10-20+20+10-35-25+10

+14

Real Time Bidding and Serving

Xuuml

Proprietary amp Confidential Copyright copy 2014

6 Ad Served

User Segments

3 Bid Reques

t

Overview

Publishers

2 Ad Request

1 Page Request

4 Bid amp Ad

User Engagemen

ts

Data Partners

Advertisers

Browser

Some Exchange Partners

Ad Exchange

Optimize

Rocket Fuel Platform

Real-time BidderAutomated Decisions

Models

Refresh learning

Data Store

Ads ampBudget

ModelScores

Events

5 RocketfuelWinning Ad

Proprietary amp Confidential Copyright copy 2014

Facebook likes

Searches on Google

Bid Requests Considered by Rocketfuel

5 B

6 B

45 B

Requests per day

Throughput

Proprietary amp Confidential Copyright copy 2014

Blink of an eye

SF to Tokyo network round trip

One beat of a hummindbirds wing

Look up in Blackbird

400

100

20

2

Time (ms)

Latency

Proprietary amp Confidential Copyright copy 2014

Architecture and Scale

raquoDatacentersraquoScaleraquoGrowthraquoArchitecture

Proprietary amp Confidential Copyright copy 2014

Data Center Expansion

raquoabc

Proprietary amp Confidential Copyright copy 2014

Data Center Design

bull Racks custom built at Rocket Fuelbull Leased spacebandwidth in colocation facilities

Hadoop Server20 2U servers (85kW)

Bidders40 2-U Twin 2 servers (17kW)

Proprietary amp Confidential Copyright copy 2014

Rocket Fuel Scale

raquo34474 CPU processor coresndash2655 serversndash1874 Teraflops of computing

raquo188 Terabytes of memoryndash13X the memory of IBM computer Watson that

played Jeopardy

raquo42PB Petabytes of storagendash106X the data volume of the entire Library of

Congress

Proprietary amp Confidential Copyright copy 2014

Hadoop at Rocket Fuel

raquo 1400 servers

raquo 15K Disks

raquo 15K Cores

raquo 90 TB

raquo 30K MR slots

raquo 12K daily MR jobs

Proprietary amp Confidential Copyright copy 2014

200 Servers 1400 Servers

1 Year

5 PB

41 PB8x

Growth

Proprietary amp Confidential Copyright copy 2014

Data Architecture 30

Proprietary amp Confidential Copyright copy 2014

Hadoop Setup

QJM ZK Quorum

raquo 6x2TB Disksraquo 2x6 coreraquo 196 GB RAMraquo 2x1G NIC

raquo 12x3TB Disksraquo 2x6 coreraquo 64 GB RAMraquo 10G NIC

raquo same as DNrsquosraquo Dedicated disk

to ZK or JN

JT

Standby NN

ZKFCZKFC

Active NN

DNTT

DNTT

DNTT

DNTT

DNTT

DNTT

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Puppet+

Infradb

Automation is key

Maintenance is Not Easy

Proprietary amp Confidential Copyright copy 2014

Puppet and Infradb

raquo Automate as much as you canraquo Adding a slave node to Hadoop cluster lt 120 secondsraquo Bringing up a new Hadoop cluster lt 500 secondsraquo MR slots are automatically determined based on hardware config

Isnrsquot it cool

Just define once

Proprietary amp Confidential Copyright copy 2014

No issues when cluster is small Problems starts when it grows

Performance Tuning

Proprietary amp Confidential Copyright copy 2014

dfsdatanodehandlercount dfsnamenodehandlercount

dfsdatanodemaxtransferthreads dfsimagetransfertimeout

mapredreduceparallelcopies

mapredjobtrackerhandlercount

iosortmbiosortfactor

maxClientCnxns ZK

HDFS

MR

IMP MAPREDUCE-2026

-XX+UseConcMarkSweepGC

-XXCMSFullGCsBeforeCompaction=1

-XXCMSInitiatingOccupancyFraction=60

ha-timeoutms

JVM

Performance Tuning

mapreducereduceshuffleparallelcopies

Proprietary amp Confidential Copyright copy 2014

MAPREDUCE-5351

MAPREDUCE-5508

keepfailedtaskfiles=true

We Have an Issue

Proprietary amp Confidential Copyright copy 2014

instances of JobInProgressrdquo class = no of users submitted jobs X mapredjobtrackercompleteuserjobsmaximum

mapredjobtrackercompleteuserjobsmaximum mapredjobtrackerretirejobinterval

mapredjobtrackerretiredjobscachesize

JT OOM

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Monitoring

Wall of Ops

Proprietary amp Confidential Copyright copy 2014

Monitoring

hadoopnamenodeCallQueueLength hadoopjobtrackerjvmmemheapusedm

Donrsquot fly blind you will crash

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Proprietary amp Confidential Copyright copy 2014

Alerting

Monitoring is not enough need better Alerting

Proprietary amp Confidential Copyright copy 2014

Alerts

httphostnameportjmx

qry=Hadoopservice=NameNodename=NameNodeInfo

gtgt Checking whether NN and JT are up is a no brainer gtgt Reduce alert noise by having summaryaggregate alerts gtgt We heavily rely on custom scripts that query jmx for NN and JT

qry=hadoopservice=JobTrackername=JobTrackerInfo

NameDirStatuses DeadNodes NumberOfMissingBlocks

qry=Hadoopservice=NameNodename=FSNamesystemState

FSState CapacityRemaining NumDeadDataNodes UnderReplicatedBlocks

Blacklisted TTrsquos jobs slots_used ThreadCount

qry=javalangtype=Memory

Used jvm free jvm etc

Proprietary amp Confidential Copyright copy 2014

MR Workload Alerting

raquo Monitoring MR workload and alertndash In-house tool that use ldquohoudahrdquo ruby gem monitorsndash Long running jobs jobs with more map tasks blacklisted TTrsquos

with more failure counts etchellip

raquo Collect details and auto-restart blacklisted TTrsquosraquo Parse the JT logfile for rouge jobsraquo Parse the JT log and collects all Job related inforaquo White-elephant or hraven could helpraquo Parse the scheduler html page or use metrics page httpltJT-hostnamegt50030scheduleradvanced httpltJT-hostnamegt50030metrics

Proprietary amp Confidential Copyright copy 2014

Modeling

OPS

ETL

Ad-hoc

Multi Tenancy

Proprietary amp Confidential Copyright copy 2014

No Scheduler is perfect unless you understand and tune it properly

Scheduling

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

BCP

raquo BCP Business Continuity Planraquo Near real time reporting over 15+ TB of daily dataraquo Freshness of models trained over petabytes of data

Proprietary amp Confidential Copyright copy 2014

Data BCP Cluster

INW Data

Cluster

US Serving Clusters

EU Serving Clusters

HK Serving Clusters

Modeling

Reporting

User Queries

Amazon BackupLSV Data

Cluster

USEUHK Serving Clusters

Research

Ad-hoc Queries

Processed Data

Proprietary amp Confidential Copyright copy 2014

YARN

JobTracker

raquo Resource Manager - Global resource scheduler - Hierarchical queues - Application management

raquo Node Manager - Per-machine agent - Manages life cycle of container - Container resource monitoring

raquo Application Master - Per-application - Manages application scheduling and task execution

Proprietary amp Confidential Copyright copy 2014

YARN at Rocket FueI

raquo Yarn is in production raquo 700+ nodesraquo 31TB RAM 8500 disks 8500 cores raquo Primary use case Map-Reduceraquo No more static slotsraquo Tez Spark Storm are in race

YAY

Proprietary amp Confidential Copyright copy 2014

Obligatory ldquowe are hiringrdquo slide

httprocketfuelcomcareers

Proprietary amp Confidential Copyright copy 2014

THANKS

kishorerocketfuelcomapolrocketfuelcom

  • Hadorsquoopsrsquo or Hadrsquooopsrsquo 1
  • The Web Is Monetized By Advertising
  • Delivery Methods
  • Overview
  • Always buying the best impressions amp serving the best ad
  • Real Time Bidding and Serving
  • Overview (2)
  • Throughput
  • Latency
  • Architecture and Scale
  • Data Center Expansion
  • Data Center Design
  • Rocket Fuel Scale
  • Hadoop at Rocket Fuel
  • Growth
  • Data Architecture 30
  • Hadoop Setup
  • Operations
  • Maintenance is Not Easy
  • Puppet and Infradb
  • Performance Tuning
  • Performance Tuning (2)
  • We Have an Issue
  • JT OOM
  • Operations (2)
  • Monitoring
  • Monitoring (2)
  • MR Workload Monitoring
  • Network Monitoring
  • Alerting
  • Alerts
  • MR Workload Alerting
  • Multi Tenancy
  • Scheduling
  • Operations (3)
  • BCP
  • Data BCP Cluster
  • YARN
  • YARN at Rocket FueI
  • Obligatory ldquowe are hiringrdquo slide
  • Slide 41
Page 3: Hado"ops" or Had"oops"

Proprietary amp Confidential Copyright copy 2014

Delivery Methods

raquoDisplayraquoVideoraquoMobileraquoSocial

Proprietary amp Confidential Copyright copy 2014

6 Ad Served

User Segments

3 Bid Reques

t

Overview

Publishers

2 Ad Request

1 Page Request

4 Bid amp Ad

User Engagemen

ts

Data Partners

Advertisers

Browser

Some Exchange Partners

Ad Exchange

Optimize

Rocket Fuel Platform

Real-time BidderAutomated Decisions

Models

Refresh learning

Data Store

Ads ampBudget

ModelScores

Events

5 RocketfuelWinning Ad

Proprietary amp Confidential Copyright copy 2014

$238965$06782$17234

$009$178964$16782$17234$0809$242125

$211$126

$2178$2056$0809$242125

$211$126$278$156

$1809$242125

$211$126$278$056$242125

$211$126$278

$0756$0809$242125

$211$126$278

$1256$1809$242125

$211$126$278

$0586$2009

125$211$126$278$156

$000

[ + ][ + ]

SitePageGeoWeatherTime of DayBrand AffinityUser

Always buying the best impressions amp serving the best ad

Real Time Bidding and Serving

Proprietary amp Confidential Copyright copy 2014

GoalLeadsamp sales

GoalCoupondownloads

GoalBrandawareness

SitePageGeoWeatherTime of DayBrand AffinityDemo

Impression ScorecardDemoBrand AffinityTime of DayGeoWeatherSitePageAd PositionIn-marketBehaviorResponse

Impression ScorecardDemoBrand AffinityTime of DayGeoWeatherSitePageAd PositionIn-MarketBehaviorResponse X

Impression ScorecardDemoBrand AffinityTime of DayGeoWeatherSitePageAd PositionIn-MarketBehaviorResponse

+100+40-20+20+15+10+40+35

+97

+40-70-20+10+15-25-40-18

+07

+10-10-20+20+10-35-25+10

+14

Real Time Bidding and Serving

Xuuml

Proprietary amp Confidential Copyright copy 2014

6 Ad Served

User Segments

3 Bid Reques

t

Overview

Publishers

2 Ad Request

1 Page Request

4 Bid amp Ad

User Engagemen

ts

Data Partners

Advertisers

Browser

Some Exchange Partners

Ad Exchange

Optimize

Rocket Fuel Platform

Real-time BidderAutomated Decisions

Models

Refresh learning

Data Store

Ads ampBudget

ModelScores

Events

5 RocketfuelWinning Ad

Proprietary amp Confidential Copyright copy 2014

Facebook likes

Searches on Google

Bid Requests Considered by Rocketfuel

5 B

6 B

45 B

Requests per day

Throughput

Proprietary amp Confidential Copyright copy 2014

Blink of an eye

SF to Tokyo network round trip

One beat of a hummindbirds wing

Look up in Blackbird

400

100

20

2

Time (ms)

Latency

Proprietary amp Confidential Copyright copy 2014

Architecture and Scale

raquoDatacentersraquoScaleraquoGrowthraquoArchitecture

Proprietary amp Confidential Copyright copy 2014

Data Center Expansion

raquoabc

Proprietary amp Confidential Copyright copy 2014

Data Center Design

bull Racks custom built at Rocket Fuelbull Leased spacebandwidth in colocation facilities

Hadoop Server20 2U servers (85kW)

Bidders40 2-U Twin 2 servers (17kW)

Proprietary amp Confidential Copyright copy 2014

Rocket Fuel Scale

raquo34474 CPU processor coresndash2655 serversndash1874 Teraflops of computing

raquo188 Terabytes of memoryndash13X the memory of IBM computer Watson that

played Jeopardy

raquo42PB Petabytes of storagendash106X the data volume of the entire Library of

Congress

Proprietary amp Confidential Copyright copy 2014

Hadoop at Rocket Fuel

raquo 1400 servers

raquo 15K Disks

raquo 15K Cores

raquo 90 TB

raquo 30K MR slots

raquo 12K daily MR jobs

Proprietary amp Confidential Copyright copy 2014

200 Servers 1400 Servers

1 Year

5 PB

41 PB8x

Growth

Proprietary amp Confidential Copyright copy 2014

Data Architecture 30

Proprietary amp Confidential Copyright copy 2014

Hadoop Setup

QJM ZK Quorum

raquo 6x2TB Disksraquo 2x6 coreraquo 196 GB RAMraquo 2x1G NIC

raquo 12x3TB Disksraquo 2x6 coreraquo 64 GB RAMraquo 10G NIC

raquo same as DNrsquosraquo Dedicated disk

to ZK or JN

JT

Standby NN

ZKFCZKFC

Active NN

DNTT

DNTT

DNTT

DNTT

DNTT

DNTT

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Puppet+

Infradb

Automation is key

Maintenance is Not Easy

Proprietary amp Confidential Copyright copy 2014

Puppet and Infradb

raquo Automate as much as you canraquo Adding a slave node to Hadoop cluster lt 120 secondsraquo Bringing up a new Hadoop cluster lt 500 secondsraquo MR slots are automatically determined based on hardware config

Isnrsquot it cool

Just define once

Proprietary amp Confidential Copyright copy 2014

No issues when cluster is small Problems starts when it grows

Performance Tuning

Proprietary amp Confidential Copyright copy 2014

dfsdatanodehandlercount dfsnamenodehandlercount

dfsdatanodemaxtransferthreads dfsimagetransfertimeout

mapredreduceparallelcopies

mapredjobtrackerhandlercount

iosortmbiosortfactor

maxClientCnxns ZK

HDFS

MR

IMP MAPREDUCE-2026

-XX+UseConcMarkSweepGC

-XXCMSFullGCsBeforeCompaction=1

-XXCMSInitiatingOccupancyFraction=60

ha-timeoutms

JVM

Performance Tuning

mapreducereduceshuffleparallelcopies

Proprietary amp Confidential Copyright copy 2014

MAPREDUCE-5351

MAPREDUCE-5508

keepfailedtaskfiles=true

We Have an Issue

Proprietary amp Confidential Copyright copy 2014

instances of JobInProgressrdquo class = no of users submitted jobs X mapredjobtrackercompleteuserjobsmaximum

mapredjobtrackercompleteuserjobsmaximum mapredjobtrackerretirejobinterval

mapredjobtrackerretiredjobscachesize

JT OOM

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Monitoring

Wall of Ops

Proprietary amp Confidential Copyright copy 2014

Monitoring

hadoopnamenodeCallQueueLength hadoopjobtrackerjvmmemheapusedm

Donrsquot fly blind you will crash

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Proprietary amp Confidential Copyright copy 2014

Alerting

Monitoring is not enough need better Alerting

Proprietary amp Confidential Copyright copy 2014

Alerts

httphostnameportjmx

qry=Hadoopservice=NameNodename=NameNodeInfo

gtgt Checking whether NN and JT are up is a no brainer gtgt Reduce alert noise by having summaryaggregate alerts gtgt We heavily rely on custom scripts that query jmx for NN and JT

qry=hadoopservice=JobTrackername=JobTrackerInfo

NameDirStatuses DeadNodes NumberOfMissingBlocks

qry=Hadoopservice=NameNodename=FSNamesystemState

FSState CapacityRemaining NumDeadDataNodes UnderReplicatedBlocks

Blacklisted TTrsquos jobs slots_used ThreadCount

qry=javalangtype=Memory

Used jvm free jvm etc

Proprietary amp Confidential Copyright copy 2014

MR Workload Alerting

raquo Monitoring MR workload and alertndash In-house tool that use ldquohoudahrdquo ruby gem monitorsndash Long running jobs jobs with more map tasks blacklisted TTrsquos

with more failure counts etchellip

raquo Collect details and auto-restart blacklisted TTrsquosraquo Parse the JT logfile for rouge jobsraquo Parse the JT log and collects all Job related inforaquo White-elephant or hraven could helpraquo Parse the scheduler html page or use metrics page httpltJT-hostnamegt50030scheduleradvanced httpltJT-hostnamegt50030metrics

Proprietary amp Confidential Copyright copy 2014

Modeling

OPS

ETL

Ad-hoc

Multi Tenancy

Proprietary amp Confidential Copyright copy 2014

No Scheduler is perfect unless you understand and tune it properly

Scheduling

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

BCP

raquo BCP Business Continuity Planraquo Near real time reporting over 15+ TB of daily dataraquo Freshness of models trained over petabytes of data

Proprietary amp Confidential Copyright copy 2014

Data BCP Cluster

INW Data

Cluster

US Serving Clusters

EU Serving Clusters

HK Serving Clusters

Modeling

Reporting

User Queries

Amazon BackupLSV Data

Cluster

USEUHK Serving Clusters

Research

Ad-hoc Queries

Processed Data

Proprietary amp Confidential Copyright copy 2014

YARN

JobTracker

raquo Resource Manager - Global resource scheduler - Hierarchical queues - Application management

raquo Node Manager - Per-machine agent - Manages life cycle of container - Container resource monitoring

raquo Application Master - Per-application - Manages application scheduling and task execution

Proprietary amp Confidential Copyright copy 2014

YARN at Rocket FueI

raquo Yarn is in production raquo 700+ nodesraquo 31TB RAM 8500 disks 8500 cores raquo Primary use case Map-Reduceraquo No more static slotsraquo Tez Spark Storm are in race

YAY

Proprietary amp Confidential Copyright copy 2014

Obligatory ldquowe are hiringrdquo slide

httprocketfuelcomcareers

Proprietary amp Confidential Copyright copy 2014

THANKS

kishorerocketfuelcomapolrocketfuelcom

  • Hadorsquoopsrsquo or Hadrsquooopsrsquo 1
  • The Web Is Monetized By Advertising
  • Delivery Methods
  • Overview
  • Always buying the best impressions amp serving the best ad
  • Real Time Bidding and Serving
  • Overview (2)
  • Throughput
  • Latency
  • Architecture and Scale
  • Data Center Expansion
  • Data Center Design
  • Rocket Fuel Scale
  • Hadoop at Rocket Fuel
  • Growth
  • Data Architecture 30
  • Hadoop Setup
  • Operations
  • Maintenance is Not Easy
  • Puppet and Infradb
  • Performance Tuning
  • Performance Tuning (2)
  • We Have an Issue
  • JT OOM
  • Operations (2)
  • Monitoring
  • Monitoring (2)
  • MR Workload Monitoring
  • Network Monitoring
  • Alerting
  • Alerts
  • MR Workload Alerting
  • Multi Tenancy
  • Scheduling
  • Operations (3)
  • BCP
  • Data BCP Cluster
  • YARN
  • YARN at Rocket FueI
  • Obligatory ldquowe are hiringrdquo slide
  • Slide 41
Page 4: Hado"ops" or Had"oops"

Proprietary amp Confidential Copyright copy 2014

6 Ad Served

User Segments

3 Bid Reques

t

Overview

Publishers

2 Ad Request

1 Page Request

4 Bid amp Ad

User Engagemen

ts

Data Partners

Advertisers

Browser

Some Exchange Partners

Ad Exchange

Optimize

Rocket Fuel Platform

Real-time BidderAutomated Decisions

Models

Refresh learning

Data Store

Ads ampBudget

ModelScores

Events

5 RocketfuelWinning Ad

Proprietary amp Confidential Copyright copy 2014

$238965$06782$17234

$009$178964$16782$17234$0809$242125

$211$126

$2178$2056$0809$242125

$211$126$278$156

$1809$242125

$211$126$278$056$242125

$211$126$278

$0756$0809$242125

$211$126$278

$1256$1809$242125

$211$126$278

$0586$2009

125$211$126$278$156

$000

[ + ][ + ]

SitePageGeoWeatherTime of DayBrand AffinityUser

Always buying the best impressions amp serving the best ad

Real Time Bidding and Serving

Real Time Bidding and Serving

[Figure: three impression scorecards.]

Goals: Leads & sales; Coupon downloads; Brand awareness

Scorecard attributes: Demo, Brand Affinity, Time of Day, Geo/Weather, Site/Page, Ad Position, In-Market Behavior, Response

Scores as listed on the slide:
» +100, +40, -20, +20, +15, +10, +40, +35; shown beneath: +97
» +40, -70, -20, +10, +15, -25, -40, -18; shown beneath: +07
» +10, -10, -20, +20, +10, -35, -25, +10; shown beneath: +14

Overview (2)

[Diagram: the same ad request and bidding flow as on the Overview slide.]

Throughput

Requests per day:
» Facebook likes: 5 B
» Searches on Google: 6 B
» Bid requests considered by Rocketfuel: 45 B

Latency

Time (ms):
» Blink of an eye: 400
» SF to Tokyo network round trip: 100
» One beat of a hummingbird's wing: 20
» Lookup in Blackbird: 2

Architecture and Scale

» Datacenters
» Scale
» Growth
» Architecture

Data Center Expansion

» abc

Data Center Design

• Racks custom built at Rocket Fuel
• Leased space/bandwidth in colocation facilities

Hadoop servers: 20 2U servers (85kW)
Bidders: 40 2-U Twin 2 servers (17kW)

Rocket Fuel Scale

» 34474 CPU processor cores
  – 2655 servers
  – 1874 Teraflops of computing
» 188 Terabytes of memory
  – 13X the memory of IBM computer Watson that played Jeopardy
» 42 PB of storage
  – 106X the data volume of the entire Library of Congress

Hadoop at Rocket Fuel

» 1400 servers
» 15K Disks
» 15K Cores
» 90 TB
» 30K MR slots
» 12K daily MR jobs

Growth

Over 1 year:
» Servers: 200 to 1400
» Storage: 5 PB to 41 PB (8x)

Data Architecture 30

Hadoop Setup

[Diagram: HA cluster layout.]

Components: Active NN and Standby NN (each with a ZKFC), JT, QJM, ZK Quorum, and DN/TT worker nodes.

Hardware callouts:
» 6x2TB disks, 2x6 core, 196 GB RAM, 2x1G NIC
» 12x3TB disks, 2x6 core, 64 GB RAM, 10G NIC
» Same as DN's, with a dedicated disk to ZK or JN

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Maintenance is Not Easy

Puppet + Infradb

Automation is key

Puppet and Infradb

» Automate as much as you can
» Adding a slave node to a Hadoop cluster: < 120 seconds
» Bringing up a new Hadoop cluster: < 500 seconds
» MR slots are automatically determined based on hardware config

Isn't it cool?

Just define once
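As an illustration of the "MR slots are automatically determined based on hardware config" bullet, here is a minimal Python sketch. The deck does not show the actual rule used in Puppet/Infradb, so everything here is an assumption: the `NodeHardware` type, the reserved-resource defaults, the 2 GB-per-slot figure, and the 70/30 map/reduce split are hypothetical. Only the two `mapred.tasktracker.*.tasks.maximum` property names are standard MRv1 settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeHardware:
    """Hypothetical per-node inventory record (e.g. as an inventory DB might store it)."""
    cores: int
    ram_gb: int


def mr_slots(hw, reserved_cores=2, reserved_ram_gb=8,
             ram_per_slot_gb=2, map_percent=70):
    """Derive MRv1 slot counts from hardware. All defaults are illustrative."""
    usable_cores = max(hw.cores - reserved_cores, 1)
    usable_ram_gb = max(hw.ram_gb - reserved_ram_gb, ram_per_slot_gb)
    # Allow up to two slots per usable core, capped by available memory.
    total = max(min(usable_cores * 2, usable_ram_gb // ram_per_slot_gb), 2)
    map_slots = max(total * map_percent // 100, 1)
    reduce_slots = max(total - map_slots, 1)
    return {
        "mapred.tasktracker.map.tasks.maximum": map_slots,
        "mapred.tasktracker.reduce.tasks.maximum": reduce_slots,
    }


# A worker shaped like the DN/TT callout on the Hadoop Setup slide (2x6 core, 64 GB RAM).
print(mr_slots(NodeHardware(cores=12, ram_gb=64)))
```

Keeping the rule in one function means a new hardware model only needs an inventory entry, which is one reading of "just define once".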

Performance Tuning

No issues when the cluster is small. Problems start when it grows.

Performance Tuning

HDFS
» dfs.datanode.handler.count
» dfs.namenode.handler.count
» dfs.datanode.max.transfer.threads
» dfs.image.transfer.timeout

MR
» mapred.reduce.parallel.copies
» mapreduce.reduce.shuffle.parallelcopies
» mapred.job.tracker.handler.count
» io.sort.mb
» io.sort.factor
» IMP: MAPREDUCE-2026

ZK
» maxClientCnxns

JVM
» -XX:+UseConcMarkSweepGC
» -XX:CMSFullGCsBeforeCompaction=1
» -XX:CMSInitiatingOccupancyFraction=60

Also listed: ha-timeoutms
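The properties above end up in Hadoop's `*-site.xml` files. As a rough sketch of how such overrides could be generated from one definition, the Python snippet below renders a dictionary into the `<configuration>` format. The helper name `render_site_xml` and every numeric value are placeholders for illustration, not the settings used in the talk.

```python
import xml.etree.ElementTree as ET


def render_site_xml(props):
    """Render {name: value} pairs as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Placeholder values only; tune against your own cluster size and workload.
hdfs_overrides = {
    "dfs.namenode.handler.count": 64,
    "dfs.datanode.handler.count": 16,
    "dfs.datanode.max.transfer.threads": 8192,
}

print(render_site_xml(hdfs_overrides))
```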

We Have an Issue

» MAPREDUCE-5351
» MAPREDUCE-5508
» keep.failed.task.files=true

JT OOM

Instances of the “JobInProgress” class = (number of users who submitted jobs) × mapred.jobtracker.completeuserjobs.maximum

» mapred.jobtracker.completeuserjobs.maximum
» mapred.jobtracker.retirejob.interval
» mapred.jobtracker.retiredjobs.cache.size
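To make the formula above concrete, here is a small worked calculation in Python. The user count, the two values tried for `mapred.jobtracker.completeuserjobs.maximum`, and the per-job heap footprint are all assumed numbers chosen for illustration; the real footprint of a `JobInProgress` object depends on task counts and counters.

```python
def retained_jobs(users_with_jobs, completeuserjobs_maximum):
    """Upper bound on completed JobInProgress objects the JT keeps in memory."""
    return users_with_jobs * completeuserjobs_maximum


def heap_estimate_mb(job_count, mb_per_job):
    """Rough heap held by retained jobs, given an assumed per-job footprint."""
    return job_count * mb_per_job


users = 50       # assumed number of distinct submitting users
mb_per_job = 2   # assumed average footprint per retained job, in MB

for limit in (100, 25):
    jobs = retained_jobs(users, limit)
    print(limit, jobs, heap_estimate_mb(jobs, mb_per_job))
```

Under these assumptions, dropping the per-user limit from 100 to 25 cuts retained jobs from 5000 to 1250, which is why the three properties listed on the slide are the levers for a JobTracker that runs out of memory.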

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Monitoring

Wall of Ops

Monitoring

» hadoopnamenodeCallQueueLength
» hadoopjobtrackerjvmmemheapusedm

Don't fly blind; you will crash.

MR Workload Monitoring

Network Monitoring

Don't blame the network; monitor it instead. A network mesh can be a mess.

Alerting

Monitoring is not enough; you need better alerting.

Alerts

» Checking whether NN and JT are up is a no-brainer
» Reduce alert noise by having summary/aggregate alerts
» We heavily rely on custom scripts that query JMX for NN and JT

http://hostname:port/jmx

qry=Hadoop:service=NameNode,name=NameNodeInfo
    NameDirStatuses, DeadNodes, NumberOfMissingBlocks

qry=Hadoop:service=NameNode,name=FSNamesystemState
    FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks

qry=hadoop:service=JobTracker,name=JobTrackerInfo
    Blacklisted TT's, jobs, slots_used, ThreadCount

qry=java.lang:type=Memory
    Used JVM, free JVM, etc.
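The custom JMX scripts mentioned above are not shown in the deck. The Python sketch below is one possible shape for such a check, assuming the `/jmx` servlet returns JSON with a top-level `beans` list. The function names, thresholds, and the canned sample bean are hypothetical; the attribute names come from the FSNamesystemState list above.

```python
import json
from urllib.request import urlopen

FS_STATE_QUERY = "Hadoop:service=NameNode,name=FSNamesystemState"


def fetch_bean(host, port, query):
    """Return the first bean matching `query` from a daemon's /jmx servlet."""
    url = f"http://{host}:{port}/jmx?qry={query}"
    with urlopen(url, timeout=10) as response:
        return json.load(response)["beans"][0]


def check_fs_state(bean, max_dead_nodes=5, max_under_replicated=1000):
    """Return a list of problem strings for one FSNamesystemState bean."""
    problems = []
    if bean.get("FSState") != "Operational":
        problems.append(f"FSState is {bean.get('FSState')}")
    if bean.get("NumDeadDataNodes", 0) > max_dead_nodes:
        problems.append(f"{bean['NumDeadDataNodes']} dead DataNodes")
    if bean.get("UnderReplicatedBlocks", 0) > max_under_replicated:
        problems.append(f"{bean['UnderReplicatedBlocks']} under-replicated blocks")
    return problems


def summarize(problems):
    """Collapse all problems into one aggregate alert line, or None if healthy."""
    if not problems:
        return None
    return f"NameNode: {len(problems)} problem(s): " + "; ".join(problems)


# Canned bean so the logic can be tried without a live cluster.
sample = {"FSState": "Operational", "NumDeadDataNodes": 7, "UnderReplicatedBlocks": 12}
print(summarize(check_fs_state(sample)))
```

Against a live NameNode, `check_fs_state(fetch_bean(host, port, FS_STATE_QUERY))` would replace the canned sample. Emitting one summary line per daemon, rather than one alert per metric, follows the "summary/aggregate alerts" advice.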

MR Workload Alerting

» Monitoring MR workload and alert
  – In-house tool that uses the “houdah” Ruby gem monitors:
  – Long-running jobs, jobs with more map tasks, blacklisted TT's with more failure counts, etc.
» Collect details and auto-restart blacklisted TT's
» Parse the JT logfile for rogue jobs
» Parse the JT log and collect all job-related info
» White-elephant or hraven could help
» Parse the scheduler HTML page or use the metrics page
  – http://<JT-hostname>:50030/scheduler?advanced
  – http://<JT-hostname>:50030/metrics
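The in-house tool is not shown in the deck (and is described as Ruby-based). As a loose Python sketch of the kind of rule it might apply, the snippet below flags long-running jobs and jobs with unusually many map tasks. The `JobSnapshot` type, its fields, and both thresholds are hypothetical; in practice the inputs would come from the JobTracker, its logs, or the scheduler and metrics pages listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSnapshot:
    """Hypothetical view of one running job at polling time."""
    job_id: str
    user: str
    runtime_minutes: int
    map_tasks: int


def flag_jobs(jobs, max_runtime_minutes=240, max_map_tasks=50000):
    """Map job_id -> list of reasons for every job that breaks a threshold."""
    flagged = {}
    for job in jobs:
        reasons = []
        if job.runtime_minutes > max_runtime_minutes:
            reasons.append("long-running")
        if job.map_tasks > max_map_tasks:
            reasons.append("too many map tasks")
        if reasons:
            flagged[job.job_id] = reasons
    return flagged


jobs = [
    JobSnapshot("job_001", "etl", runtime_minutes=30, map_tasks=1200),
    JobSnapshot("job_002", "adhoc", runtime_minutes=600, map_tasks=90000),
]
print(flag_jobs(jobs))
```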

Multi Tenancy

Modeling, OPS, ETL, Ad-hoc

Scheduling

No scheduler is perfect unless you understand and tune it properly.

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP: Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram: data BCP layout.]

Labels: INW Data Cluster; US Serving Clusters; EU Serving Clusters; HK Serving Clusters; Modeling; Reporting; User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research; Ad-hoc Queries; Processed Data.

YARN

JobTracker is replaced by:

» Resource Manager
  – Global resource scheduler
  – Hierarchical queues
  – Application management
» Node Manager
  – Per-machine agent
  – Manages life cycle of container
  – Container resource monitoring
» Application Master
  – Per-application
  – Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31TB RAM, 8500 disks, 8500 cores
» Primary use case: Map-Reduce
» No more static slots
» Tez, Spark, Storm are in the race

YAY
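"No more static slots" can be illustrated with a little arithmetic: under YARN, the number of tasks a node runs at once follows from the memory given to the NodeManager and the size each container requests. The Python sketch below assumes memory-only scheduling and uses made-up sizes. The property names in the comments are standard YARN/MRv2 settings, but the values are not taken from the talk.

```python
def containers_by_memory(nm_memory_mb, container_mb):
    """How many containers of one size fit in a NodeManager's memory budget."""
    return nm_memory_mb // container_mb


nm_memory_mb = 49152  # assumed yarn.nodemanager.resource.memory-mb (48 GB)
map_mb = 2048         # assumed mapreduce.map.memory.mb
reduce_mb = 4096      # assumed mapreduce.reduce.memory.mb

print("map-only capacity:", containers_by_memory(nm_memory_mb, map_mb))
print("reduce-only capacity:", containers_by_memory(nm_memory_mb, reduce_mb))
```

The same node can run any mix of map and reduce containers that fits the budget, whereas MRv1 fixed the map and reduce slot counts per TaskTracker in advance.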

Obligatory “we are hiring” slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com
apol@rocketfuel.com

Page 7: Hado"ops" or Had"oops"

Proprietary amp Confidential Copyright copy 2014

6 Ad Served

User Segments

3 Bid Reques

t

Overview

Publishers

2 Ad Request

1 Page Request

4 Bid amp Ad

User Engagemen

ts

Data Partners

Advertisers

Browser

Some Exchange Partners

Ad Exchange

Optimize

Rocket Fuel Platform

Real-time BidderAutomated Decisions

Models

Refresh learning

Data Store

Ads ampBudget

ModelScores

Events

5 RocketfuelWinning Ad

Proprietary amp Confidential Copyright copy 2014

Facebook likes

Searches on Google

Bid Requests Considered by Rocketfuel

5 B

6 B

45 B

Requests per day

Throughput

Proprietary amp Confidential Copyright copy 2014

Blink of an eye

SF to Tokyo network round trip

One beat of a hummindbirds wing

Look up in Blackbird

400

100

20

2

Time (ms)

Latency

Proprietary amp Confidential Copyright copy 2014

Architecture and Scale

raquoDatacentersraquoScaleraquoGrowthraquoArchitecture

Proprietary amp Confidential Copyright copy 2014

Data Center Expansion

raquoabc

Proprietary amp Confidential Copyright copy 2014

Data Center Design

bull Racks custom built at Rocket Fuelbull Leased spacebandwidth in colocation facilities

Hadoop Server20 2U servers (85kW)

Bidders40 2-U Twin 2 servers (17kW)

Proprietary amp Confidential Copyright copy 2014

Rocket Fuel Scale

raquo34474 CPU processor coresndash2655 serversndash1874 Teraflops of computing

raquo188 Terabytes of memoryndash13X the memory of IBM computer Watson that

played Jeopardy

raquo42PB Petabytes of storagendash106X the data volume of the entire Library of

Congress

Proprietary amp Confidential Copyright copy 2014

Hadoop at Rocket Fuel

raquo 1400 servers

raquo 15K Disks

raquo 15K Cores

raquo 90 TB

raquo 30K MR slots

raquo 12K daily MR jobs

Proprietary amp Confidential Copyright copy 2014

200 Servers 1400 Servers

1 Year

5 PB

41 PB8x

Growth

Proprietary amp Confidential Copyright copy 2014

Data Architecture 30

Proprietary amp Confidential Copyright copy 2014

Hadoop Setup

QJM ZK Quorum

raquo 6x2TB Disksraquo 2x6 coreraquo 196 GB RAMraquo 2x1G NIC

raquo 12x3TB Disksraquo 2x6 coreraquo 64 GB RAMraquo 10G NIC

raquo same as DNrsquosraquo Dedicated disk

to ZK or JN

JT

Standby NN

ZKFCZKFC

Active NN

DNTT

DNTT

DNTT

DNTT

DNTT

DNTT

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Puppet+

Infradb

Automation is key

Maintenance is Not Easy

Proprietary amp Confidential Copyright copy 2014

Puppet and Infradb

raquo Automate as much as you canraquo Adding a slave node to Hadoop cluster lt 120 secondsraquo Bringing up a new Hadoop cluster lt 500 secondsraquo MR slots are automatically determined based on hardware config

Isnrsquot it cool

Just define once

Proprietary amp Confidential Copyright copy 2014

No issues when cluster is small Problems starts when it grows

Performance Tuning

Proprietary amp Confidential Copyright copy 2014

dfsdatanodehandlercount dfsnamenodehandlercount

dfsdatanodemaxtransferthreads dfsimagetransfertimeout

mapredreduceparallelcopies

mapredjobtrackerhandlercount

iosortmbiosortfactor

maxClientCnxns ZK

HDFS

MR

IMP MAPREDUCE-2026

-XX+UseConcMarkSweepGC

-XXCMSFullGCsBeforeCompaction=1

-XXCMSInitiatingOccupancyFraction=60

ha-timeoutms

JVM

Performance Tuning

mapreducereduceshuffleparallelcopies

Proprietary amp Confidential Copyright copy 2014

MAPREDUCE-5351

MAPREDUCE-5508

keepfailedtaskfiles=true

We Have an Issue

Proprietary amp Confidential Copyright copy 2014

instances of JobInProgressrdquo class = no of users submitted jobs X mapredjobtrackercompleteuserjobsmaximum

mapredjobtrackercompleteuserjobsmaximum mapredjobtrackerretirejobinterval

mapredjobtrackerretiredjobscachesize

JT OOM

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Monitoring

Wall of Ops

Proprietary amp Confidential Copyright copy 2014

Monitoring

hadoopnamenodeCallQueueLength hadoopjobtrackerjvmmemheapusedm

Donrsquot fly blind you will crash

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Proprietary amp Confidential Copyright copy 2014

Alerting

Monitoring is not enough need better Alerting

Proprietary amp Confidential Copyright copy 2014

Alerts

httphostnameportjmx

qry=Hadoopservice=NameNodename=NameNodeInfo

gtgt Checking whether NN and JT are up is a no brainer gtgt Reduce alert noise by having summaryaggregate alerts gtgt We heavily rely on custom scripts that query jmx for NN and JT

qry=hadoopservice=JobTrackername=JobTrackerInfo

NameDirStatuses DeadNodes NumberOfMissingBlocks

qry=Hadoopservice=NameNodename=FSNamesystemState

FSState CapacityRemaining NumDeadDataNodes UnderReplicatedBlocks

Blacklisted TTrsquos jobs slots_used ThreadCount

qry=javalangtype=Memory

Used jvm free jvm etc

MR Workload Alerting

» Monitoring MR workload and alerting
  – In-house tool that uses the “houdah” Ruby gem monitors:
  – Long-running jobs, jobs with more map tasks, blacklisted TTs with more failure counts, etc.
» Collect details and auto-restart blacklisted TTs
» Parse the JT log file for rogue jobs
» Parse the JT log and collect all job-related info
» White Elephant or hRaven could help
» Parse the scheduler HTML page or use the metrics page
  http://<JT-hostname>:50030/scheduler?advanced
  http://<JT-hostname>:50030/metrics
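The workload checks listed on this slide (long-running jobs, jobs with many map tasks) amount to threshold rules over per-job records. The sketch below illustrates that idea only: the JobInfo record, its fields, and the default thresholds are all hypothetical, and it does not reproduce the in-house tool or the houdah gem's API.

```python
from dataclasses import dataclass


@dataclass
class JobInfo:
    """Hypothetical per-job record, e.g. collected from the JT log or metrics page."""
    job_id: str
    runtime_minutes: float
    map_tasks: int


def flag_jobs(jobs, max_runtime_minutes=120, max_map_tasks=10000):
    """Return {job_id: [reasons]} for jobs that exceed either threshold."""
    flagged = {}
    for job in jobs:
        reasons = []
        if job.runtime_minutes > max_runtime_minutes:
            reasons.append("long running")
        if job.map_tasks > max_map_tasks:
            reasons.append("too many map tasks")
        if reasons:
            flagged[job.job_id] = reasons
    return flagged


if __name__ == "__main__":
    jobs = [
        JobInfo("job_a", runtime_minutes=30, map_tasks=500),
        JobInfo("job_b", runtime_minutes=300, map_tasks=20000),
    ]
    print(flag_jobs(jobs))
```

Returning the reasons per job, rather than one alert per rule, keeps the output easy to roll up into a single summary notification.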

Multi Tenancy

Modeling

OPS

ETL

Ad-hoc

Scheduling

No scheduler is perfect unless you understand it and tune it properly

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP: Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram labels: INW Data Cluster; US, EU, and HK Serving Clusters; Modeling; Reporting; User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research; Ad-hoc Queries; Processed Data]

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages life cycle of containers
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, and Storm are in the race

YAY

Obligatory “we are hiring” slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com

Page 11: Hado"ops" or Had"oops"

Proprietary amp Confidential Copyright copy 2014

Data Center Expansion

raquoabc

Proprietary amp Confidential Copyright copy 2014

Data Center Design

bull Racks custom built at Rocket Fuelbull Leased spacebandwidth in colocation facilities

Hadoop Server20 2U servers (85kW)

Bidders40 2-U Twin 2 servers (17kW)

Proprietary amp Confidential Copyright copy 2014

Rocket Fuel Scale

raquo34474 CPU processor coresndash2655 serversndash1874 Teraflops of computing

raquo188 Terabytes of memoryndash13X the memory of IBM computer Watson that

played Jeopardy

raquo42PB Petabytes of storagendash106X the data volume of the entire Library of

Congress

Proprietary amp Confidential Copyright copy 2014

Hadoop at Rocket Fuel

raquo 1400 servers

raquo 15K Disks

raquo 15K Cores

raquo 90 TB

raquo 30K MR slots

raquo 12K daily MR jobs

Proprietary amp Confidential Copyright copy 2014

200 Servers 1400 Servers

1 Year

5 PB

41 PB8x

Growth

Proprietary amp Confidential Copyright copy 2014

Data Architecture 30

Proprietary amp Confidential Copyright copy 2014

Hadoop Setup

QJM ZK Quorum

raquo 6x2TB Disksraquo 2x6 coreraquo 196 GB RAMraquo 2x1G NIC

raquo 12x3TB Disksraquo 2x6 coreraquo 64 GB RAMraquo 10G NIC

raquo same as DNrsquosraquo Dedicated disk

to ZK or JN

JT

Standby NN

ZKFCZKFC

Active NN

DNTT

DNTT

DNTT

DNTT

DNTT

DNTT

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Puppet+

Infradb

Automation is key

Maintenance is Not Easy

Proprietary amp Confidential Copyright copy 2014

Puppet and Infradb

raquo Automate as much as you canraquo Adding a slave node to Hadoop cluster lt 120 secondsraquo Bringing up a new Hadoop cluster lt 500 secondsraquo MR slots are automatically determined based on hardware config

Isnrsquot it cool

Just define once

Proprietary amp Confidential Copyright copy 2014

No issues when cluster is small Problems starts when it grows

Performance Tuning

Proprietary amp Confidential Copyright copy 2014

dfsdatanodehandlercount dfsnamenodehandlercount

dfsdatanodemaxtransferthreads dfsimagetransfertimeout

mapredreduceparallelcopies

mapredjobtrackerhandlercount

iosortmbiosortfactor

maxClientCnxns ZK

HDFS

MR

IMP MAPREDUCE-2026

-XX+UseConcMarkSweepGC

-XXCMSFullGCsBeforeCompaction=1

-XXCMSInitiatingOccupancyFraction=60

ha-timeoutms

JVM

Performance Tuning

mapreducereduceshuffleparallelcopies

Proprietary amp Confidential Copyright copy 2014

MAPREDUCE-5351

MAPREDUCE-5508

keepfailedtaskfiles=true

We Have an Issue

Proprietary amp Confidential Copyright copy 2014

instances of JobInProgressrdquo class = no of users submitted jobs X mapredjobtrackercompleteuserjobsmaximum

mapredjobtrackercompleteuserjobsmaximum mapredjobtrackerretirejobinterval

mapredjobtrackerretiredjobscachesize

JT OOM

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Monitoring

Wall of Ops

Proprietary amp Confidential Copyright copy 2014

Monitoring

hadoopnamenodeCallQueueLength hadoopjobtrackerjvmmemheapusedm

Donrsquot fly blind you will crash

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Proprietary amp Confidential Copyright copy 2014

Alerting

Monitoring is not enough need better Alerting

Proprietary amp Confidential Copyright copy 2014

Alerts

httphostnameportjmx

qry=Hadoopservice=NameNodename=NameNodeInfo

gtgt Checking whether NN and JT are up is a no brainer gtgt Reduce alert noise by having summaryaggregate alerts gtgt We heavily rely on custom scripts that query jmx for NN and JT

qry=hadoopservice=JobTrackername=JobTrackerInfo

NameDirStatuses DeadNodes NumberOfMissingBlocks

qry=Hadoopservice=NameNodename=FSNamesystemState

FSState CapacityRemaining NumDeadDataNodes UnderReplicatedBlocks

Blacklisted TTrsquos jobs slots_used ThreadCount

qry=javalangtype=Memory

Used jvm free jvm etc

Proprietary amp Confidential Copyright copy 2014

MR Workload Alerting

raquo Monitoring MR workload and alertndash In-house tool that use ldquohoudahrdquo ruby gem monitorsndash Long running jobs jobs with more map tasks blacklisted TTrsquos

with more failure counts etchellip

raquo Collect details and auto-restart blacklisted TTrsquosraquo Parse the JT logfile for rouge jobsraquo Parse the JT log and collects all Job related inforaquo White-elephant or hraven could helpraquo Parse the scheduler html page or use metrics page httpltJT-hostnamegt50030scheduleradvanced httpltJT-hostnamegt50030metrics

Proprietary amp Confidential Copyright copy 2014

Modeling

OPS

ETL

Ad-hoc

Multi Tenancy

Proprietary amp Confidential Copyright copy 2014

No Scheduler is perfect unless you understand and tune it properly

Scheduling

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

BCP

raquo BCP Business Continuity Planraquo Near real time reporting over 15+ TB of daily dataraquo Freshness of models trained over petabytes of data

Proprietary amp Confidential Copyright copy 2014

Data BCP Cluster

[Diagram labels: INW Data Cluster; US, EU, and HK Serving Clusters; Modeling; Reporting; User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research; Ad-hoc Queries; Processed Data]

YARN

JobTracker

» Resource Manager
- Global resource scheduler
- Hierarchical queues
- Application management

» Node Manager
- Per-machine agent
- Manages the life cycle of containers
- Container resource monitoring

» Application Master
- Per-application
- Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, and Storm are in the race

YAY

Obligatory “we are hiring” slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com
apol@rocketfuel.com

  • Hado"ops" or Had"oops"
  • The Web Is Monetized By Advertising
  • Delivery Methods
  • Overview
  • Always buying the best impressions & serving the best ad
  • Real Time Bidding and Serving
  • Overview (2)
  • Throughput
  • Latency
  • Architecture and Scale
  • Data Center Expansion
  • Data Center Design
  • Rocket Fuel Scale
  • Hadoop at Rocket Fuel
  • Growth
  • Data Architecture 30
  • Hadoop Setup
  • Operations
  • Maintenance is Not Easy
  • Puppet and Infradb
  • Performance Tuning
  • Performance Tuning (2)
  • We Have an Issue
  • JT OOM
  • Operations (2)
  • Monitoring
  • Monitoring (2)
  • MR Workload Monitoring
  • Network Monitoring
  • Alerting
  • Alerts
  • MR Workload Alerting
  • Multi Tenancy
  • Scheduling
  • Operations (3)
  • BCP
  • Data BCP Cluster
  • YARN
  • YARN at Rocket Fuel
  • Obligatory “we are hiring” slide
  • Slide 41

Data Center Design

• Racks custom-built at Rocket Fuel
• Leased space/bandwidth in colocation facilities

Hadoop Server: 20 2U servers (85kW)

Bidders: 40 2-U Twin 2 servers (17kW)

Rocket Fuel Scale

» 34474 CPU processor cores
– 2655 servers
– 1874 Teraflops of computing

» 188 Terabytes of memory
– 13X the memory of the IBM computer Watson that played Jeopardy

» 42 PB of storage
– 106X the data volume of the entire Library of Congress

Hadoop at Rocket Fuel

» 1400 servers
» 15K disks
» 15K cores
» 90 TB
» 30K MR slots
» 12K daily MR jobs

Growth

» 200 servers to 1400 servers in 1 year
» 5 PB to 41 PB (8x)

Data Architecture 30


Hadoop Setup

QJM / ZK Quorum

» 6x2TB disks
» 2x6 core
» 196 GB RAM
» 2x1G NIC

» 12x3TB disks
» 2x6 core
» 64 GB RAM
» 10G NIC

» Same as DNs
» Dedicated disk to ZK or JN

[Diagram labels: JT; Standby NN; ZKFC; ZKFC; Active NN; DN/TT (x6)]

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Maintenance is Not Easy

Puppet + Infradb

Automation is key

Puppet and Infradb

» Automate as much as you can
» Adding a slave node to a Hadoop cluster: < 120 seconds
» Bringing up a new Hadoop cluster: < 500 seconds
» MR slots are automatically determined based on hardware config

Isn’t it cool?

Just define once
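The slide does not say how MR slots are derived from the hardware config. As a rough illustration of the idea, the sketch below caps the slot count by both cores and usable memory and then splits it between map and reduce slots. The function name, every default (`task_heap_gb`, `reserved_gb`, `map_fraction`), and the heuristic itself are assumptions, not the rule used at Rocket Fuel.

```python
def mr_slots(cores, ram_gb, task_heap_gb=2, reserved_gb=8, map_fraction=0.67):
    """Derive (map_slots, reduce_slots) for one node from its hardware.

    Hypothetical heuristic: one slot per core, limited by how many task
    heaps fit in RAM after reserving memory for the OS and daemons.
    """
    by_cpu = cores
    by_mem = max((ram_gb - reserved_gb) // task_heap_gb, 0)
    total = max(int(min(by_cpu, by_mem)), 1)
    # Keep at least one slot of each kind, even on very small nodes.
    maps = max(int(round(total * map_fraction)), 1)
    reduces = max(total - maps, 1)
    return maps, reduces


if __name__ == "__main__":
    # A 2x6-core node with 64 GB RAM, as in the Hadoop Setup slide.
    print(mr_slots(cores=12, ram_gb=64))
```

Computing the values from facts about the node is what lets a node definition be written once and reused across hardware generations.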


Performance Tuning

No issues when the cluster is small. Problems start when it grows.

Performance Tuning

HDFS
dfs.datanode.handler.count
dfs.namenode.handler.count
dfs.datanode.max.transfer.threads
dfs.image.transfer.timeout

MR
mapred.reduce.parallel.copies
mapreduce.reduce.shuffle.parallelcopies
mapred.job.tracker.handler.count
io.sort.mb
io.sort.factor
IMP: MAPREDUCE-2026

ZK
maxClientCnxns

JVM
-XX:+UseConcMarkSweepGC
-XX:CMSFullGCsBeforeCompaction=1
-XX:CMSInitiatingOccupancyFraction=60

ha-timeoutms

We Have an Issue

MAPREDUCE-5351
MAPREDUCE-5508
keep.failed.task.files=true

JT OOM

Instances of the “JobInProgress” class = number of users who submitted jobs × mapred.jobtracker.completeuserjobs.maximum

mapred.jobtracker.completeuserjobs.maximum
mapred.jobtracker.retirejob.interval
mapred.jobtracker.retiredjobs.cache.size
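The formula on this slide is easy to turn into a back-of-the-envelope estimate. The sketch below applies it directly; the function names, the example user count, and especially the per-job memory figure are hypothetical and would need to be measured on a real JobTracker heap.

```python
def retained_job_objects(active_users, completeuserjobs_maximum):
    """Upper bound on completed JobInProgress objects the JT keeps in memory."""
    return active_users * completeuserjobs_maximum


def estimated_heap_mb(active_users, completeuserjobs_maximum, mb_per_job):
    """Rough heap held by retained jobs, given an assumed per-job footprint."""
    return retained_job_objects(active_users, completeuserjobs_maximum) * mb_per_job


if __name__ == "__main__":
    # Made-up inputs: 200 users, 100 retained jobs per user, 0.5 MB per job.
    print(retained_job_objects(200, 100))
    print(estimated_heap_mb(200, 100, 0.5))
    # Lowering the per-user limit shrinks the bound proportionally.
    print(retained_job_objects(200, 10))
```

Because the bound scales with the number of submitting users, a multi-tenant cluster can run the JT out of heap even when each user stays within the per-user limit.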


Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Monitoring

Wall of Ops


Monitoring

hadoop.namenode.CallQueueLength
hadoop.jobtracker.jvm.memheapusedm

Don’t fly blind; you will crash.

MR Workload Monitoring

Page 16: Hado"ops" or Had"oops"

Proprietary amp Confidential Copyright copy 2014

Data Architecture 30

Proprietary amp Confidential Copyright copy 2014

Hadoop Setup

QJM ZK Quorum

raquo 6x2TB Disksraquo 2x6 coreraquo 196 GB RAMraquo 2x1G NIC

raquo 12x3TB Disksraquo 2x6 coreraquo 64 GB RAMraquo 10G NIC

raquo same as DNrsquosraquo Dedicated disk

to ZK or JN

JT

Standby NN

ZKFCZKFC

Active NN

DNTT

DNTT

DNTT

DNTT

DNTT

DNTT

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Puppet+

Infradb

Automation is key

Maintenance is Not Easy

Proprietary amp Confidential Copyright copy 2014

Puppet and Infradb

raquo Automate as much as you canraquo Adding a slave node to Hadoop cluster lt 120 secondsraquo Bringing up a new Hadoop cluster lt 500 secondsraquo MR slots are automatically determined based on hardware config

Isnrsquot it cool

Just define once

Proprietary amp Confidential Copyright copy 2014

No issues when cluster is small Problems starts when it grows

Performance Tuning

Proprietary amp Confidential Copyright copy 2014

dfsdatanodehandlercount dfsnamenodehandlercount

dfsdatanodemaxtransferthreads dfsimagetransfertimeout

mapredreduceparallelcopies

mapredjobtrackerhandlercount

iosortmbiosortfactor

maxClientCnxns ZK

HDFS

MR

IMP MAPREDUCE-2026

-XX+UseConcMarkSweepGC

-XXCMSFullGCsBeforeCompaction=1

-XXCMSInitiatingOccupancyFraction=60

ha-timeoutms

JVM

Performance Tuning

mapreducereduceshuffleparallelcopies

Proprietary amp Confidential Copyright copy 2014

MAPREDUCE-5351

MAPREDUCE-5508

keepfailedtaskfiles=true

We Have an Issue

Proprietary amp Confidential Copyright copy 2014

instances of JobInProgressrdquo class = no of users submitted jobs X mapredjobtrackercompleteuserjobsmaximum

mapredjobtrackercompleteuserjobsmaximum mapredjobtrackerretirejobinterval

mapredjobtrackerretiredjobscachesize

JT OOM

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Monitoring

Wall of Ops

Monitoring

» hadoop.namenode.CallQueueLength
» hadoop.jobtracker.jvm.memheapusedm

Don't fly blind; you will crash.

MR Workload Monitoring

Network Monitoring

Don't blame the network; monitor it instead. A network mesh can be a mess.

Alerting

Monitoring is not enough; you need better alerting.

Alerts

>> Checking whether NN and JT are up is a no-brainer
>> Reduce alert noise by having summary/aggregate alerts
>> We heavily rely on custom scripts that query JMX for NN and JT

http://hostname:port/jmx

» qry=Hadoop:service=NameNode,name=NameNodeInfo
  NameDirStatuses, DeadNodes, NumberOfMissingBlocks
» qry=Hadoop:service=NameNode,name=FSNamesystemState
  FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks
» qry=hadoop:service=JobTracker,name=JobTrackerInfo
  Blacklisted TTs, jobs, slots_used, ThreadCount
» qry=java.lang:type=Memory
  Used JVM memory, free JVM memory, etc.
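A custom check of this kind might look like the sketch below. It assumes the `/jmx` servlet returns JSON with a top-level `beans` array, and it rolls every finding into one summary message to keep alert noise down. The function names, the port in the example, and the under-replication threshold are assumptions; the deck does not show the actual scripts.

```python
import json
from typing import List


def jmx_url(host: str, port: int, bean: str) -> str:
    """Build the JMX servlet URL for one MBean query."""
    return f"http://{host}:{port}/jmx?qry={bean}"


def fs_state_findings(payload: str, max_under_replicated: int = 1000) -> List[str]:
    """Inspect an FSNamesystemState response and list anything worth alerting on."""
    beans = json.loads(payload).get("beans", [])
    if not beans:
        return ["FSNamesystemState bean missing from JMX response"]
    bean = beans[0]
    findings = []
    if bean.get("FSState") != "Operational":
        findings.append(f"FSState is {bean.get('FSState')}")
    if bean.get("NumDeadDataNodes", 0) > 0:
        findings.append(f"{bean['NumDeadDataNodes']} dead DataNodes")
    # The threshold is a hypothetical value; pick one that matches your cluster.
    if bean.get("UnderReplicatedBlocks", 0) > max_under_replicated:
        findings.append(f"{bean['UnderReplicatedBlocks']} under-replicated blocks")
    return findings


def summary_alert(findings: List[str]) -> str:
    """Collapse all findings into a single aggregate alert line."""
    return "OK" if not findings else "NN ALERT: " + "; ".join(findings)


if __name__ == "__main__":
    print(jmx_url("namenode.example.com", 50070,
                  "Hadoop:service=NameNode,name=FSNamesystemState"))
    sample = '{"beans": [{"FSState": "Operational", "NumDeadDataNodes": 2, "UnderReplicatedBlocks": 10}]}'
    print(summary_alert(fs_state_findings(sample)))
```

In practice the payload would be fetched from the URL with an HTTP client; parsing is kept separate here so the check can be exercised against a canned response.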

MR Workload Alerting

» Monitor the MR workload and alert
  – An in-house tool that uses the "houdah" Ruby gem monitors:
  – Long-running jobs, jobs with more map tasks, blacklisted TTs with more failure counts, etc.
» Collect details and auto-restart blacklisted TTs
» Parse the JT logfile for rogue jobs
» Parse the JT log and collect all job-related info
» White-elephant or hraven could help
» Parse the scheduler HTML page or use the metrics page:
  http://<JT-hostname>:50030/scheduler?advanced
  http://<JT-hostname>:50030/metrics
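Once job details have been collected, the flagging step itself can be very small. The sketch below is not the in-house tool from the slide; the `JobInfo` record, its fields, and both thresholds are hypothetical, and the job list is assumed to have been gathered already from the JobTracker by whatever means is available.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobInfo:
    job_id: str
    user: str
    runtime_s: int
    map_tasks: int


def flag_jobs(jobs: List[JobInfo], max_runtime_s: int = 6 * 3600,
              max_map_tasks: int = 50_000) -> List[str]:
    """Return one line per job that runs too long or launches too many maps."""
    findings = []
    for job in jobs:
        reasons = []
        if job.runtime_s > max_runtime_s:
            reasons.append(f"running {job.runtime_s // 3600}h")
        if job.map_tasks > max_map_tasks:
            reasons.append(f"{job.map_tasks} map tasks")
        if reasons:
            findings.append(f"{job.job_id} ({job.user}): " + ", ".join(reasons))
    return findings


if __name__ == "__main__":
    sample = [
        JobInfo("job_1", "etl", 7 * 3600, 100),
        JobInfo("job_2", "adhoc", 60, 10),
    ]
    for line in flag_jobs(sample):
        print(line)
```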

Multi Tenancy

» Modeling
» OPS
» ETL
» Ad-hoc

Scheduling

No scheduler is perfect unless you understand it and tune it properly.

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP: Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram labels: INW Data Cluster; LSV Data Cluster; US, EU, and HK Serving Clusters; Amazon Backup; Modeling; Reporting; User Queries; Research; Ad-hoc Queries; Processed Data]

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages the life cycle of containers
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31 TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, and Storm are in the race

YAY

Obligatory "we are hiring" slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com
apol@rocketfuel.com

  • Hado"ops" or Had"oops"
  • The Web Is Monetized By Advertising
  • Delivery Methods
  • Overview
  • Always buying the best impressions & serving the best ad
  • Real Time Bidding and Serving
  • Overview (2)
  • Throughput
  • Latency
  • Architecture and Scale
  • Data Center Expansion
  • Data Center Design
  • Rocket Fuel Scale
  • Hadoop at Rocket Fuel
  • Growth
  • Data Architecture 3.0
  • Hadoop Setup
  • Operations
  • Maintenance is Not Easy
  • Puppet and Infradb
  • Performance Tuning
  • Performance Tuning (2)
  • We Have an Issue
  • JT OOM
  • Operations (2)
  • Monitoring
  • Monitoring (2)
  • MR Workload Monitoring
  • Network Monitoring
  • Alerting
  • Alerts
  • MR Workload Alerting
  • Multi Tenancy
  • Scheduling
  • Operations (3)
  • BCP
  • Data BCP Cluster
  • YARN
  • YARN at Rocket Fuel
  • Obligatory "we are hiring" slide
  • Slide 41
Page 21: Hado"ops" or Had"oops"

Proprietary amp Confidential Copyright copy 2014

No issues when cluster is small Problems starts when it grows

Performance Tuning

Proprietary amp Confidential Copyright copy 2014

dfsdatanodehandlercount dfsnamenodehandlercount

dfsdatanodemaxtransferthreads dfsimagetransfertimeout

mapredreduceparallelcopies

mapredjobtrackerhandlercount

iosortmbiosortfactor

maxClientCnxns ZK

HDFS

MR

IMP MAPREDUCE-2026

-XX+UseConcMarkSweepGC

-XXCMSFullGCsBeforeCompaction=1

-XXCMSInitiatingOccupancyFraction=60

ha-timeoutms

JVM

Performance Tuning

mapreducereduceshuffleparallelcopies

Proprietary amp Confidential Copyright copy 2014

MAPREDUCE-5351

MAPREDUCE-5508

keepfailedtaskfiles=true

We Have an Issue

Proprietary amp Confidential Copyright copy 2014

instances of JobInProgressrdquo class = no of users submitted jobs X mapredjobtrackercompleteuserjobsmaximum

mapredjobtrackercompleteuserjobsmaximum mapredjobtrackerretirejobinterval

mapredjobtrackerretiredjobscachesize

JT OOM

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Monitoring

Wall of Ops

Proprietary amp Confidential Copyright copy 2014

Monitoring

hadoopnamenodeCallQueueLength hadoopjobtrackerjvmmemheapusedm

Donrsquot fly blind you will crash

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Proprietary amp Confidential Copyright copy 2014

Alerting

Monitoring is not enough need better Alerting

Proprietary amp Confidential Copyright copy 2014

Alerts

httphostnameportjmx

qry=Hadoopservice=NameNodename=NameNodeInfo

gtgt Checking whether NN and JT are up is a no brainer gtgt Reduce alert noise by having summaryaggregate alerts gtgt We heavily rely on custom scripts that query jmx for NN and JT

qry=hadoopservice=JobTrackername=JobTrackerInfo

NameDirStatuses DeadNodes NumberOfMissingBlocks

qry=Hadoopservice=NameNodename=FSNamesystemState

FSState CapacityRemaining NumDeadDataNodes UnderReplicatedBlocks

Blacklisted TTrsquos jobs slots_used ThreadCount

qry=javalangtype=Memory

Used jvm free jvm etc

Proprietary amp Confidential Copyright copy 2014

MR Workload Alerting

raquo Monitoring MR workload and alertndash In-house tool that use ldquohoudahrdquo ruby gem monitorsndash Long running jobs jobs with more map tasks blacklisted TTrsquos

with more failure counts etchellip

raquo Collect details and auto-restart blacklisted TTrsquosraquo Parse the JT logfile for rouge jobsraquo Parse the JT log and collects all Job related inforaquo White-elephant or hraven could helpraquo Parse the scheduler html page or use metrics page httpltJT-hostnamegt50030scheduleradvanced httpltJT-hostnamegt50030metrics

Proprietary amp Confidential Copyright copy 2014

Modeling

OPS

ETL

Ad-hoc

Multi Tenancy

Scheduling

No scheduler is perfect unless you understand and tune it properly.

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP: Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram labels]
» INW Data Cluster: US Serving Clusters, EU Serving Clusters, HK Serving Clusters; Modeling, Reporting, User Queries
» LSV Data Cluster: US/EU/HK Serving Clusters; Research, Ad-hoc Queries
» Amazon Backup
» Processed Data

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages life cycle of containers
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31 TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, and Storm are in the race

YAY

Obligatory “we are hiring” slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com

  • Hado’ops’ or Had’oops’
  • The Web Is Monetized By Advertising
  • Delivery Methods
  • Overview
  • Always buying the best impressions & serving the best ad
  • Real Time Bidding and Serving
  • Overview (2)
  • Throughput
  • Latency
  • Architecture and Scale
  • Data Center Expansion
  • Data Center Design
  • Rocket Fuel Scale
  • Hadoop at Rocket Fuel
  • Growth
  • Data Architecture 30
  • Hadoop Setup
  • Operations
  • Maintenance is Not Easy
  • Puppet and Infradb
  • Performance Tuning
  • Performance Tuning (2)
  • We Have an Issue
  • JT OOM
  • Operations (2)
  • Monitoring
  • Monitoring (2)
  • MR Workload Monitoring
  • Network Monitoring
  • Alerting
  • Alerts
  • MR Workload Alerting
  • Multi Tenancy
  • Scheduling
  • Operations (3)
  • BCP
  • Data BCP Cluster
  • YARN
  • YARN at Rocket Fuel
  • Obligatory “we are hiring” slide
  • Slide 41
Page 22: Hado"ops" or Had"oops"

Performance Tuning

HDFS
  dfs.datanode.handler.count
  dfs.namenode.handler.count
  dfs.datanode.max.transfer.threads
  dfs.image.transfer.timeout

MR
  mapred.reduce.parallel.copies
  mapreduce.reduce.shuffle.parallelcopies
  mapred.job.tracker.handler.count
  io.sort.mb, io.sort.factor
  IMP: MAPREDUCE-2026

ZK
  maxClientCnxns
  ha-timeoutms

JVM
  -XX:+UseConcMarkSweepGC
  -XX:CMSFullGCsBeforeCompaction=1
  -XX:CMSInitiatingOccupancyFraction=60

We Have an Issue

MAPREDUCE-5351
MAPREDUCE-5508
keep.failed.task.files=true

JT OOM

Instances of “JobInProgress” class = number of users who submitted jobs × mapred.jobtracker.completeuserjobs.maximum

mapred.jobtracker.completeuserjobs.maximum
mapred.jobtracker.retirejob.interval
mapred.job.tracker.retiredjobs.cache.size
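
To see why this product can exhaust the JobTracker heap, the formula on this slide can be worked through with a few lines of arithmetic. This is only an illustration: the user count, the per-user limit, and the per-job memory footprint below are hypothetical numbers, not measurements from the talk.

```python
def retained_job_objects(num_users, completeuserjobs_maximum):
    """Upper bound on completed JobInProgress objects kept in JT memory.

    Mirrors the slide's formula: users who submitted jobs multiplied by
    mapred.jobtracker.completeuserjobs.maximum.
    """
    return num_users * completeuserjobs_maximum


def estimated_heap_mb(num_users, completeuserjobs_maximum, mb_per_job):
    """Rough heap estimate; mb_per_job is a hypothetical average footprint."""
    return retained_job_objects(num_users, completeuserjobs_maximum) * mb_per_job


# Example with made-up inputs: 200 users, 100 retained jobs per user,
# and an assumed 0.5 MB per retained job object.
print(retained_job_objects(200, 100))
print(estimated_heap_mb(200, 100, 0.5))
```

Lowering the per-user limit shrinks the bound linearly, which is why the slide lists it alongside the retire-interval and retired-jobs cache settings.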

Page 28: Hado"ops" or Had"oops"

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Proprietary amp Confidential Copyright copy 2014

Alerting

Monitoring is not enough need better Alerting

Proprietary amp Confidential Copyright copy 2014

Alerts

httphostnameportjmx

qry=Hadoopservice=NameNodename=NameNodeInfo

gtgt Checking whether NN and JT are up is a no brainer gtgt Reduce alert noise by having summaryaggregate alerts gtgt We heavily rely on custom scripts that query jmx for NN and JT

qry=hadoopservice=JobTrackername=JobTrackerInfo

NameDirStatuses DeadNodes NumberOfMissingBlocks

qry=Hadoopservice=NameNodename=FSNamesystemState

FSState CapacityRemaining NumDeadDataNodes UnderReplicatedBlocks

Blacklisted TTrsquos jobs slots_used ThreadCount

qry=javalangtype=Memory

Used jvm free jvm etc

Proprietary amp Confidential Copyright copy 2014

MR Workload Alerting

» Monitoring MR workload and alert
  – In-house tool that uses the "houdah" Ruby gem monitors
  – Long-running jobs, jobs with more map tasks, blacklisted TTs with more failure counts, etc.
» Collect details and auto-restart blacklisted TTs
» Parse the JT log file for rogue jobs
» Parse the JT log and collect all job-related info
» White Elephant or hRaven could help
» Parse the scheduler HTML page or use the metrics page:
  http://<JT-hostname>:50030/scheduler?advanced
  http://<JT-hostname>:50030/metrics
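A rough Python sketch of the rule-based workload check this slide describes. It does not use the houdah gem or parse JobTracker logs; it assumes job details have already been collected into simple records. The JobInfo fields and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class JobInfo:
    """Hypothetical record for one MapReduce job (field names are assumptions)."""
    job_id: str
    user: str
    runtime_minutes: float
    map_tasks: int


def flag_jobs(jobs, max_runtime_minutes=240, max_map_tasks=50000):
    """Return (job_id, user, reasons) for every job that breaks a rule.

    Thresholds are placeholders; tune them per queue or per tenant.
    """
    alerts = []
    for job in jobs:
        reasons = []
        if job.runtime_minutes > max_runtime_minutes:
            reasons.append("long running")
        if job.map_tasks > max_map_tasks:
            reasons.append("too many map tasks")
        if reasons:
            alerts.append((job.job_id, job.user, reasons))
    return alerts
```

Grouping all reasons for a job into one alert entry follows the same idea as summary alerts: one notification per offending job instead of one per rule.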

Multi Tenancy

[Diagram labels: Modeling, OPS, ETL, Ad-hoc]

Scheduling

No scheduler is perfect unless you understand and tune it properly.

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP: Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram labels: INW Data Cluster; US, EU, and HK Serving Clusters; Modeling; Reporting; User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research; Ad-hoc Queries; Processed Data]

YARN

JobTracker

» Resource Manager
  – Global resource scheduler
  – Hierarchical queues
  – Application management
» Node Manager
  – Per-machine agent
  – Manages life cycle of containers
  – Container resource monitoring
» Application Master
  – Per-application
  – Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31 TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, and Storm are in the race

YAY

Obligatory "we are hiring" slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com

  • Hadorsquoopsrsquo or Hadrsquooopsrsquo 1
  • The Web Is Monetized By Advertising
  • Delivery Methods
  • Overview
  • Always buying the best impressions amp serving the best ad
  • Real Time Bidding and Serving
  • Overview (2)
  • Throughput
  • Latency
  • Architecture and Scale
  • Data Center Expansion
  • Data Center Design
  • Rocket Fuel Scale
  • Hadoop at Rocket Fuel
  • Growth
  • Data Architecture 30
  • Hadoop Setup
  • Operations
  • Maintenance is Not Easy
  • Puppet and Infradb
  • Performance Tuning
  • Performance Tuning (2)
  • We Have an Issue
  • JT OOM
  • Operations (2)
  • Monitoring
  • Monitoring (2)
  • MR Workload Monitoring
  • Network Monitoring
  • Alerting
  • Alerts
  • MR Workload Alerting
  • Multi Tenancy
  • Scheduling
  • Operations (3)
  • BCP
  • Data BCP Cluster
  • YARN
  • YARN at Rocket FueI
  • Obligatory ldquowe are hiringrdquo slide
  • Slide 41