Apache Kafka
A Distributed Streaming Platform
StreamProcessing.be - Belgium Wednesday, 18th January 2017
<paolo@confluent.io>
https://www.confluent.io/blog/stream-data-platform-1/
Industry shift from Big Data to Fast Data and Stream Processing
Apache Kafka APIs and UNIX analogy
$ cat < in.txt | grep "apache" | tr a-z A-Z > out.txt
Connect APIs
Producer/Consumer APIs
Streams APIs
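One reading of the analogy (an assumption, since the slides convey the mapping visually): the Connect APIs play the role of reading in.txt and writing out.txt, the Producer/Consumer APIs play the role of the pipes, and the Streams APIs play the role of grep and tr. A minimal plain-Java sketch of that last stage follows. It uses only java.util.stream and is not the Kafka Streams API; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class UnixPipelineAnalogy {

    // Mirrors `grep "apache" | tr a-z A-Z` on an in-memory list of lines.
    // Note: toUpperCase upper-cases every letter, whereas `tr a-z A-Z`
    // only touches ASCII a-z, so this is an approximation.
    static List<String> process(List<String> lines) {
        return lines.stream()
                .filter(line -> line.contains("apache"))      // grep "apache"
                .map(line -> line.toUpperCase(Locale.ROOT))   // tr a-z A-Z
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-in for in.txt; a real pipeline would read an unbounded stream.
        List<String> in = List.of("apache kafka", "something else", "the apache way");
        process(in).forEach(System.out::println);
    }
}
```

The filter-then-map shape is the part that carries over: a stream processing application expresses the same two steps, but over a continuous stream of records rather than a finite file.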
Streams APIs part of Apache Kafka
http://kafka.apache.org/documentation/streams
Build applications, not clusters
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka-streams</artifactId>
  <version>0.10.1.1</version>
</dependency>
Spot the difference(s)
How do I run in production?
Like any other Java application...
Typical High Level Architecture
Real-time Data Ingestion
Stream Processing
Storage
Data Publishing / Visualization
How many clusters do you count?
NoSQL (Cassandra, HBase, Couchbase, MongoDB, …) or Elasticsearch, Solr, …
Storm, Flink, Spark Streaming, Ignite, Akka Streams, Apex, …
HDFS, NFS, Ceph, GlusterFS, Lustre, …
Apache Kafka
Simplicity is the ultimate sophistication
Apache Kafka Distributed Streaming Platform
Publish & Subscribe to streams of data like a messaging system
Store streams of data safely in a distributed replicated cluster
Process streams of data efficiently and in real-time
Apache Kafka and Streams APIs benefits
• Build applications, not clusters
• Native integration with Apache Kafka
• Elastic, fast, distributed, fault-tolerant, secure
• Scalable: S, M, L, XL, XXL
• Run everywhere: from containers to cloud
• Streams (with KStream) and tables (with KTable)
• Local state replicated to Kafka for fault-tolerance
• Windowing and event-time semantics out of the box
• Supports late-arriving and out-of-order events
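Two of the ideas above can be sketched in plain Java, as an illustration only. This is not the Kafka Streams API, and the class and method names are hypothetical. The first method folds a stream of updates into a table where the latest value per key wins, which is the intuition behind pairing KStream with KTable. The second counts events per fixed-size window using each event's own timestamp, so an event that arrives out of order still lands in the window it belongs to. Timestamps are assumed to be non-negative milliseconds.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StreamTableSketch {

    // Folds a stream of key/value updates into a table: latest value per key wins.
    static Map<String, String> toTable(List<Map.Entry<String, String>> updates) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Map.Entry<String, String> update : updates) {
            table.put(update.getKey(), update.getValue());
        }
        return table;
    }

    // Counts events per tumbling window, keyed by window start time.
    // The window is chosen from the event timestamp, not from arrival order.
    static Map<Long, Integer> countPerWindow(List<Long> eventTimesMs, long windowSizeMs) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long ts : eventTimesMs) {
            long windowStart = ts - (ts % windowSizeMs);
            counts.merge(windowStart, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> updates = new ArrayList<>();
        updates.add(Map.entry("alice", "Brussels"));
        updates.add(Map.entry("bob", "Ghent"));
        updates.add(Map.entry("alice", "Antwerp")); // overwrites alice's earlier value
        System.out.println(toTable(updates));

        // The event at 4_000 ms arrives after the one at 12_000 ms,
        // yet it is still counted in the window starting at 0.
        System.out.println(countPerWindow(List.of(1_000L, 12_000L, 4_000L), 10_000L));
    }
}
```

A real stream processor also has to decide how long to keep a window open for late events and where to keep the table state. The bullets above say that the Streams APIs provide this out of the box, with local state replicated to Kafka.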
Apache Kafka adoption across the industry… everybody loves simplicity!
References
• http://kafka.apache.org/
• http://kafka.apache.org/documentation/streams
• http://docs.confluent.io/
• http://docs.confluent.io/current/streams/
• http://blog.confluent.io/
• http://github.com/confluentinc/examples
• http://github.com/apache/kafka/tree/trunk/streams
The easiest way to get started
https://www.confluent.io/download/
SIMPLICITY
WE
YOUR FEEDBACK!
Discount code: kafcom17
Use the Apache Kafka community discount code to get $50 off
www.kafka-summit.org
Kafka Summit New York: May 8
Kafka Summit San Francisco: August 28