Robust Stream Processing with Apache Flink
Jamie Grier, @jamiegrier


Page 1: Robust Stream Processing With Apache Flink

Robust Stream Processing with Apache Flink

Jamie Grier
@jamiegrier

Page 2: Robust Stream Processing With Apache Flink

Who am I?

• Director of Applications Engineering at data Artisans
• Previously working on streaming computation at Twitter, Gnip and Boulder Imaging
• Involved in various kinds of stream processing for about a decade
• High-speed video, social media streaming, general frameworks for stream processing

Page 3: Robust Stream Processing With Apache Flink

Overview

• What is Apache Flink?
• What is Stateful Stream Processing?
• Windowed computation over streams
• Robust Time Handling (Event Time vs Processing Time)
• Robust Failure Handling
• Robust Planned Downtime Handling
• Robust Reprocessing

Page 4: Robust Stream Processing With Apache Flink

What is Apache Flink?

Apache Flink is an open source platform for distributed stream and batch data processing.

Page 5: Robust Stream Processing With Apache Flink

What is Apache Flink?

Page 6: Robust Stream Processing With Apache Flink

Stream Processing

[Diagram: Data Stream → Your Code → Data Stream]

Page 7: Robust Stream Processing With Apache Flink

Stateful Stream Processing

[Diagram: Data Stream → Your Code (with State) → Data Stream]
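To make the idea on this slide concrete, here is a minimal sketch in plain Python, not the Flink API. It assumes a stream of `(key, value)` pairs and a hypothetical function `running_sum` that keeps one running total per key. The point is only that "your code" remembers something between events.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def running_sum(events: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    """For each (key, value) event, emit the key's running total so far."""
    # Per-key state that survives from one event to the next.
    state: Dict[str, int] = defaultdict(int)
    for key, value in events:
        state[key] += value
        yield key, state[key]


# Hypothetical input stream, used only for illustration.
stream = [("a", 1), ("b", 5), ("a", 2)]
for key, total in running_sum(stream):
    print(key, total)
```

A stateless operator would emit the same output for `("a", 2)` no matter what came before it. Here the result depends on the earlier `("a", 1)`, which is why the state has to be preserved for results to stay correct.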

Page 8: Robust Stream Processing With Apache Flink

More Complex Example

[Diagram: RabbitMQ, Files, Kafka; Filter, Map, Join / Sum; InfluxDB, C*]

Page 9: Robust Stream Processing With Apache Flink

Distributed and Parallel Deployment

[Diagram: MapR Streams, Files, Kafka; Filter, Parse, Join / Sum; InfluxDB, C*]

Page 10: Robust Stream Processing With Apache Flink

Benchmarking on HPC Cluster

10 machines with 40 GigE

Throughput: 72 million msgs/sec

Page 11: Robust Stream Processing With Apache Flink

Robust Stream Processing with Apache Flink

Page 12: Robust Stream Processing With Apache Flink

Code Example!

Page 13: Robust Stream Processing With Apache Flink

Amplifier Function

[Diagram: Data Stream and Control Stream → Amplifier (State*) → Amplified Stream]

*State: Amplification factors for each key
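The amplifier on this slide might be sketched as follows. This is plain Python rather than Flink code, and it is not the talk's actual example. The message encoding (`"control"` / `"data"` tags), the function name `amplify`, and the default factor of 1.0 for keys with no control message yet are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical encoding of the two input streams merged into one sequence:
#   ("control", key, factor)  sets the amplification factor for a key
#   ("data", key, value)      is a data point to be amplified
Message = Tuple[str, str, float]


def amplify(
    messages: Iterable[Message], default_factor: float = 1.0
) -> Iterator[Tuple[str, float]]:
    """Multiply each data value by the latest factor seen for its key."""
    factors: Dict[str, float] = {}  # the state: amplification factor per key
    for source, key, number in messages:
        if source == "control":
            factors[key] = number  # update state, emit nothing
        elif source == "data":
            yield key, number * factors.get(key, default_factor)
        else:
            raise ValueError(f"unknown stream tag: {source!r}")


# Hypothetical interleaving of the two streams.
messages = [
    ("data", "a", 2.0),     # no factor for "a" yet, so the default applies
    ("control", "a", 3.0),
    ("data", "a", 2.0),     # now amplified by 3.0
    ("data", "b", 4.0),     # "b" never received a control message
]
print(list(amplify(messages)))
```

The output for a data point depends on which control messages arrived before it, so the factors are exactly the kind of state that must survive failures and restarts.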

Page 14: Robust Stream Processing With Apache Flink

Windowing

Page 15: Robust Stream Processing With Apache Flink

Processing Time vs Event Time

Page 16: Robust Stream Processing With Apache Flink

Windowing in Processing Time

[Figure: events 0 to 9 shown on two axes, Processing Time and Event Time]

Page 17: Robust Stream Processing With Apache Flink

Windowing in Event Time

[Figure: events 0 to 9 shown on an Event Time axis]

Page 18: Robust Stream Processing With Apache Flink

Processing Time = Errors!

Page 19: Robust Stream Processing With Apache Flink

Event Time = Accuracy
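The difference between the two windowing slides can be illustrated with a small sketch. It is plain Python, not the Flink windowing API, and the events, their arrival times, and the 5-second window size are invented for illustration. Each event carries the time it happened (event time) and the time it reached the processor (processing time). The same tumbling-window count is computed over each.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (event_time, arrival_time) in seconds. Hypothetical data: some events
# arrive out of order and the last one is delayed.
Event = Tuple[int, int]


def tumbling_counts(timestamps: Iterable[int], size: int) -> Dict[int, int]:
    """Count timestamps per tumbling window, keyed by window start time."""
    counts: Dict[int, int] = defaultdict(int)
    for ts in timestamps:
        counts[ts - ts % size] += 1  # start of the window containing ts
    return dict(counts)


events: List[Event] = [
    (0, 0), (1, 1), (2, 2), (4, 3), (3, 4),
    (5, 6), (6, 5), (7, 7), (8, 8), (9, 12),  # event 9 arrives late
]

by_event_time = tumbling_counts((e for e, _ in events), size=5)
by_processing_time = tumbling_counts((a for _, a in events), size=5)

print("event time:     ", by_event_time)
print("processing time:", by_processing_time)
```

Grouped by event time, every window holds the five events that actually occurred in it. Grouped by arrival time, the delayed event lands in a later window, so the counts depend on delivery delays rather than on the data. A real streaming system additionally needs a way to decide when an event-time window is complete (Flink uses watermarks for this). This offline sketch sidesteps that by working on a finite list.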

Page 20: Robust Stream Processing With Apache Flink

Failure Handling

Page 21: Robust Stream Processing With Apache Flink

Downtime Handling

Page 22: Robust Stream Processing With Apache Flink

Data Reprocessing

Page 23: Robust Stream Processing With Apache Flink

We’re Hiring!

http://data-artisans.com/careers

Page 24: Robust Stream Processing With Apache Flink

Questions?

Page 25: Robust Stream Processing With Apache Flink

Thanks!