Guozhang Wang Kafka Meetup Beijing, April 15, 2017
Apache Kafka Development Experience and Lesson Learned
A Short History of Kafka
2010.10: First commit of Kafka
2011.07: Enters Apache Incubator
  Release 0.7.0: compression, mirror-maker
2012.10: Graduated to top-level project
  Release 0.8.0: intra-cluster replication
2014.11: Confluent founded
  Release 0.8.2: new producer, quota
  Release 0.9.0: Kafka Connect, new consumer, security
  Release 0.10.0: Kafka Streams, timestamps, rack awareness
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
Example: Pub-Sub Messaging
[Diagram: Tracking, Logs / Metrics; Apache Kafka; Hadoop / DW]
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
Example: Centralized Data Pipeline
[Diagram: KV-Store, Doc-Store, RDBMS, Tracking, Logs / Metrics; Apache Kafka; Hadoop / DW, Monitoring, Rec. Engine, Social Graph, Searching, Security]
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
a distributed and replicated log.. [VLDB 2015]
Example: Data Store Geo-Replication
[Diagram: Region 1 and Region 2, each with User Apps, Local Stores, and Apache Kafka. Edge labels: write, read, append log, mirroring, apply log]
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
a distributed and replicated log.. [VLDB 2015]
a unified data integration stack.. [CIDR 2015]
Example: Async. Micro-Services
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
a distributed and replicated log.. [VLDB 2015]
a unified data integration stack.. [CIDR 2015]
All of them!
Kafka: Streaming Platform
Publish / Subscribe
  Move data around as online streams
Store
  Source-of-truth continuous data
Process
  React / process data in real-time
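The three roles on this slide (publish / subscribe, store, process) can all be read as operations on one append-only log. The following is a minimal single-process sketch of that idea, not Kafka's implementation: `TinyLog`, `append`, `read_from`, and `count_words` are hypothetical names, and partitioning, replication, and persistence are left out.

```python
from typing import Dict, List, Tuple


class TinyLog:
    """A single-partition, in-memory append-only log (illustrative only)."""

    def __init__(self) -> None:
        self._records: List[str] = []

    def append(self, value: str) -> int:
        """Publish: append a record and return its offset."""
        self._records.append(value)
        return len(self._records) - 1

    def read_from(self, offset: int, max_records: int = 100) -> List[Tuple[int, str]]:
        """Subscribe: return (offset, value) pairs starting at `offset`."""
        chunk = self._records[offset:offset + max_records]
        return [(offset + i, value) for i, value in enumerate(chunk)]


def count_words(log: TinyLog, start: int = 0) -> Dict[str, int]:
    """Process: fold over the stored records from a chosen offset."""
    counts: Dict[str, int] = {}
    for _, value in log.read_from(start):
        for word in value.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


log = TinyLog()
log.append("page view")
log.append("page click")

# Two independent readers: each chooses its own position in the same log.
reader_a = log.read_from(0)
reader_b = log.read_from(1)
```

Because records stay in the log after being read, a new reader can start from offset 0 and replay everything, which is what makes the log usable as a store and as input for processing.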
How did we get here?
Lesson 1: Build evolvable systems
Upgrading Your Kafka Cluster is like ..
Kafka @ LI
Release from trunk
  Push frequency: daily
Staging cluster
  Full traffic of production
  Full monitoring / alerting
Production
  Ramp-up
  Prepare to roll-back anytime
Kafka: Evolvable System
Zero down-time
  Maintenance outage? No such thing.
All protocols versioned
  Brokers can talk to older versioned clients
  And vice versa since 0.10.2!
One should do no more than rolling bounces
  Staging before production
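"All protocols versioned" is what lets old and new brokers and clients coexist during a rolling bounce. One way to picture it: each side supports a range of versions per request type, and the two agree on the highest version they share. The sketch below illustrates that rule only; the function names and the version numbers are hypothetical and are not taken from Kafka's actual protocol tables.

```python
from typing import Dict, Optional, Tuple

VersionRange = Tuple[int, int]  # (min_version, max_version), both inclusive


def pick_version(client: VersionRange, broker: VersionRange) -> Optional[int]:
    """Return the highest version both sides support, or None if the ranges are disjoint."""
    lowest = max(client[0], broker[0])
    highest = min(client[1], broker[1])
    return highest if lowest <= highest else None


def negotiate(client_apis: Dict[str, VersionRange],
              broker_apis: Dict[str, VersionRange]) -> Dict[str, Optional[int]]:
    """Pick a version per API; APIs the broker does not advertise map to None."""
    result: Dict[str, Optional[int]] = {}
    for api, client_range in client_apis.items():
        broker_range = broker_apis.get(api)
        result[api] = pick_version(client_range, broker_range) if broker_range else None
    return result


# Hypothetical version ranges, made up for illustration.
new_client = {"produce": (0, 3), "fetch": (0, 4), "create_topics": (0, 1)}
old_broker = {"produce": (0, 2), "fetch": (0, 2)}

chosen = negotiate(new_client, old_broker)
```

Here the newer client falls back to version 2 for `produce` and `fetch`, and learns that `create_topics` is unavailable on this broker instead of failing on an unknown request.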
Server-Client Compatibility
[Diagram: version labels 0.10.0, 0.8.2, 0.10.2, 0.8.0, 0.9.0, 0.8.2, 0.10.1, 0.8.1]
Lesson 2: What gets measured gets fixed
Audit Trail @ LI
.. and a LOT of Them
Metrics Reporter
JMXReporter
  Included in AK: sensor / metrics -> mbean / attributes
KafkaReporter
  Send metrics back to Kafka!
More..
  Ganglia, Graphite, statsd ..
Confluent Enterprise: Delivery Tracking
Confluent Enterprise: Cluster Health
Lesson 3: APIs stay forever
The Story of KAFKA-1481
Hey, we should stop allowing dashes / underscores in MBean names since they are used in hostnames / topics / etc.
Makes sense, let's do this!
DONE! (a few lines of key changes)
OMG what happened?
We need to tell our SREs to change their monitoring metric names now..
That's N clusters in M data centers ..
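The story has two halves: a delimiter that also appears inside the values makes a name ambiguous, and fixing the name breaks everyone who matched on the old string. The sketch below illustrates both with made-up naming schemes; `legacy_name` and `tagged_name` are hypothetical and do not reproduce the actual MBean names before or after KAFKA-1481.

```python
from typing import Dict


def legacy_name(topic: str, metric: str) -> str:
    """Hypothetical old style: fields joined with '-' into one string."""
    return f"{topic}-{metric}"


def tagged_name(topic: str, metric: str) -> str:
    """Hypothetical new style: explicit key=value pairs."""
    return f"name={metric},topic={topic}"


def parse_tagged(name: str) -> Dict[str, str]:
    """Unambiguous as long as the values contain no ',' or '='."""
    return dict(pair.split("=", 1) for pair in name.split(","))


# Half 1: with '-' as the delimiter, two different inputs collide.
collides = legacy_name("a-b", "c") == legacy_name("a", "b-c")
still_collides = tagged_name("a-b", "c") == tagged_name("a", "b-c")

# Half 2: a monitoring rule written against the old string stops matching.
old = legacy_name("page-views", "BytesInPerSec")
new = tagged_name("page-views", "BytesInPerSec")
rule_still_matches = (old == new)
```

The rename is a few lines of code, but every dashboard and alert keyed on the old string is part of the API surface too, which is the point of "APIs stay forever".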
Lesson 4: Services need gatekeepers
One naughty client can bother everyone ..
[Diagram: version labels 0.10.0, 0.8.2, 0.10.2, 0.8.0, 0.9.0, 0.8.2, 0.10.1, 0.8.1]
Multi-tenancy Services
Security from ground up [Release 0.9.0+]
  Authentication
  Authorization
Resources under control [Release 0.9.0+]
  Quota on bytes rate / request rate on clientId / GroupId etc
  Quota on CPU resources
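A byte-rate quota per client id can be sketched as follows: track bytes per client over a window, and when a client exceeds its allowance, return a delay that slows it down without affecting anyone else. This is a simplified illustration under stated assumptions, not the broker's actual quota code; `ByteRateQuota`, `record`, the fixed window, and the delay formula are all hypothetical choices made for the example. Time is passed in explicitly so the behavior is deterministic.

```python
from typing import Dict, Tuple


class ByteRateQuota:
    """Per-client byte-rate quota over a fixed window (simplified sketch)."""

    def __init__(self, quota_bytes_per_sec: float, window_sec: float = 1.0) -> None:
        self.quota = quota_bytes_per_sec
        self.window = window_sec
        # client_id -> (window_start_time, bytes_seen_in_window)
        self._usage: Dict[str, Tuple[float, int]] = {}

    def record(self, client_id: str, nbytes: int, now: float) -> float:
        """Record traffic and return how many seconds to delay this client."""
        start, used = self._usage.get(client_id, (now, 0))
        if now - start >= self.window:
            start, used = now, 0  # the old window expired; start a fresh one
        used += nbytes
        self._usage[client_id] = (start, used)

        if used <= self.quota * self.window:
            return 0.0
        # Delay long enough that the bytes seen so far, spread over
        # (window + delay), average out to the quota.
        return used / self.quota - self.window


quota = ByteRateQuota(quota_bytes_per_sec=1000.0, window_sec=1.0)
polite = quota.record("polite-client", 500, now=0.0)
naughty = quota.record("naughty-client", 3000, now=0.0)
```

The well-behaved client gets no delay while the client that sent three times its allowance is told to wait, so one naughty client no longer bothers everyone.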
Lesson 5: Ecosystems are the key
Remember Hadoop-Producer/Consumer?
What Should Go into AK?
Container Image?
Kafka Manager GUI?
REST Proxy APIs?
Operation Tooling?
Other Language Clients?
Schema Registry?
Hadoop / etc Integration?
Stream Processing?
Layered Architecture, not Monolithic
Messaging, Cross-DC, Data Schemas, Fast ETL, Search & Query, Processing, Consoles
Apache Kafka, Enterprise Offerings, OS Eco-Systems, Third-party Tools, Framework Impls
API, coding
Full stack evaluation
Operations, debugging,
Simple is Beautiful
Take-aways
Build evolvable systems
What gets measured gets fixed
APIs stay forever
(Multi-tenant) Services need gatekeepers
Ecosystems are the key
THANKS!
Guozhang Wang | @guozhangwang
Kafka Summit 2017 @ NYC & SF