Deploying Kafka at Dropbox
Alternately: how to handle 10,000,000 QPS in one cluster (but don't)
The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
Your Speakers
• Mark Smith <[email protected]>
formerly of Google, Bump, StumbleUpon, etc
likes small airplanes and not getting paged
• Sean Fellows <[email protected]>
formerly of Google
likes corgis and distributed systems
Dropbox
• Over 500 million signups
• Exabyte scale storage system
• Multiple hardware locations + AWS
Log Events
• Wide distribution (1,000 categories)
• Several do >1M QPS each + long tail
• About 200TB/day (raw)
• Payloads range from empty to 15MB JSON blobs
Current System
• Existing system based on Scribe + HDFS
• Aggregate to single destination for analytics
• Powers Hive and standard map-reduce type analytics
Want: real-time stream processing!
Initial Design
• One big cluster
• 20 brokers: 96GB RAM, 16x2TB disk, JBOD config
• ZK ensemble run separately (5 members)
• Kafka 0.8.2 from GitHub
• LinkedIn configuration recommendations
Unexpected Catastrophes
• Disk failures, or disks reaching 100% full
• Repair is manual, and data won't expire unless the replica is caught up
• Crash looping, controller load
• Simultaneous restarts
• Even with graceful shutdown, recovery is sometimes very bad (even on 0.9!)
• Rebalancing is dangerous
• Saturates disks; partitions fall out of ISRs, go offline, etc.
System Errors
• Controller issues
• Sometimes goes AWOL during, e.g., big rebalances
• Can have multiple controllers (during serial operations)
• Cascading OOMs
• Too many connections
Lack of Tooling
• Usually left to the reader
• Few best practices
• But we love Kafka Manager
• More to come later!
Newer Clients
• State of Go/Python clients
• Bad behavior at scale
• Laserbeaming: tight retry loops without backoff
• Too many connections == OOM
• Good clients take time
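The retry misbehavior above is usually fixed with capped exponential backoff plus jitter, so thousands of clients don't laserbeam a recovering broker at the same instant. A minimal sketch (the base and cap values are illustrative assumptions, not Dropbox's actual settings):

```python
import random

def backoff_delay(attempt, base=0.1, cap=30.0):
    """Capped exponential backoff with full jitter.

    Spreads retries out in time so a fleet of clients does not
    hammer a recovering broker in lockstep.
    """
    # Exponential growth, capped so delays stay bounded.
    exp = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly in [0, exp) to decorrelate clients.
    return random.uniform(0, exp)
```

Full jitter (rather than a fixed exponential delay) is what breaks up the synchronized retry waves that cause connection pileups and OOMs.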
Bad Configs
• Many, many tunables -- lots of rope
• Unclean leader election
• Preferred leader automation
• Disk threads (thanks Gwen!)
• Little modern documentation on running at scale
• Todd Palino helped us out early, though, so thank you!
Hardware
• Hardware RAID 10
• ~25TB usable/box (spinning rust)
• During broker replacement, p99 commit latency dropped from 200ms to 10ms!
• Failure tolerance, full disk protection
• Canary cluster
Monitoring
• MPS vs. QPS (metadata requests inflate QPS!)
• Bad Stuff graph
• Disk utilization/latency
• Heap usage
• Number of controllers
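A "Bad Stuff" graph works by collapsing a handful of cluster gauges into one number that should always be zero. A minimal sketch, assuming hypothetical metric names (the real metric sources would be JMX gauges or your metrics pipeline):

```python
def bad_stuff(metrics):
    """Aggregate cluster health gauges into one 'Bad Stuff' count.

    `metrics` is a dict of cluster-wide gauges (names here are
    hypothetical). Any nonzero result means something needs attention.
    """
    problems = 0
    problems += metrics.get("offline_partitions", 0)
    problems += metrics.get("under_replicated_partitions", 0)
    # Exactly one active controller is healthy; 0 or 2+ is a problem.
    problems += abs(metrics.get("active_controllers", 1) - 1)
    return problems
```

Alerting on a single always-zero signal is simpler to reason about at 3 a.m. than a dashboard of independent thresholds.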
Tooling
• Rolling restarter (health checks!)
• Rate limited partition rebalancer (MPS)
• Config verifier/enforcer
• Coordinated consumption (pre-0.9)
• Auditing framework
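The rolling restarter above can be sketched as a loop that restarts one broker at a time and blocks until the cluster reports healthy before moving on. Here `restart` and `under_replicated_partitions` are hypothetical stand-ins for whatever restart mechanism and metrics source you have:

```python
import time

def rolling_restart(brokers, restart, under_replicated_partitions,
                    poll_secs=1, timeout_secs=600):
    """Restart brokers one at a time, gated on a cluster health check.

    `restart(broker)` restarts a single broker; the callable
    `under_replicated_partitions()` returns the cluster-wide URP count.
    Both are supplied by the caller (hypothetical interfaces).
    """
    for broker in brokers:
        restart(broker)
        deadline = time.monotonic() + timeout_secs
        # Wait until every partition is back in its ISR (URP == 0)
        # before touching the next broker.
        while under_replicated_partitions() > 0:
            if time.monotonic() > deadline:
                raise RuntimeError(f"{broker} did not recover in time")
            time.sleep(poll_secs)
```

The health gate is the whole point: restarting a second broker while partitions are still under-replicated is how even "graceful" restarts turn into offline partitions.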
Customer Culture
• Topics : organization :: partitions : scale
• Do not hash to partitions
• No ordering requirements
• Namespaces and ownership are required
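"Do not hash to partitions" amounts to assigning messages round-robin (or randomly) instead of by key, which is only safe because customers accept no ordering requirements. A minimal sketch of such a partitioner:

```python
import itertools

def round_robin_partitioner(num_partitions):
    """Return a partitioner that cycles through partition ids.

    Key-hashing pins hot keys to single partitions; with no ordering
    requirement, round-robin spreads load evenly across partitions and
    lets a topic scale out by adding partitions without resharding.
    """
    counter = itertools.cycle(range(num_partitions))
    # The key is accepted but deliberately ignored.
    return lambda key=None: next(counter)
```

Example: `p = round_robin_partitioner(3)` yields partitions 0, 1, 2, 0, 1, 2, ... regardless of keys.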
Success!
• Kafka goes fast (18M+ MPS on 20 brokers)
• Multiple parallel consumption
• Low latency (at high produce rates)
• 0.9 is leaps ahead of 0.8.2 (upgrade!)
• Supportable by a small team (at our scale)
The Future
• Big is fun but has problems
• Open source our tooling
• Moving towards replication
• Automatic up-partitioning and rebalancing
• Expanding auditing to clients
• Improving low-volume latencies
Deploying Kafka at Dropbox
• Mark Smith <[email protected]>
• Sean Fellows <[email protected]>
We would love to talk with other people who are running Kafka at similar
scales. Email us!
And... questions! (If we have time.)