
Stream Processing

Marco Serafini

COMPSCI 590S Lecture 10


Stream Processing vs. Batching
• Advantages of stream processing
  • Near real-time results
  • Do not need to accumulate data for processing
  • Time series analysis
  • Streaming operators typically require less memory
• Disadvantages of stream processing
  • Some operators are harder to implement with streaming
    • Especially if we want operator state to be constant
    • E.g. Find median
  • Stream algorithms are often approximations
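To make the median example concrete, here is a minimal sketch of an approximate streaming median that keeps a single number of state, compared with the exact median that needs all the data. The nudge-up/nudge-down rule is one simple estimator of this kind (sometimes called a frugal estimator); the names `frugal_median`, `step` and `start` are illustrative and not from the slides.

```python
import statistics


def frugal_median(stream, step=1.0, start=0.0):
    """Approximate the median of a stream with constant state (one float)."""
    estimate = start
    for x in stream:
        # Move the estimate one step toward each incoming value.
        if x > estimate:
            estimate += step
        elif x < estimate:
            estimate -= step
    return estimate


def exact_median(values):
    """Exact median: needs the whole dataset accumulated first."""
    return statistics.median(values)
```

The estimate only approaches the true median over time, which is the accuracy cost of keeping the operator state constant.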


Example: Lambda Architecture

• Compromise between accuracy and freshness
• Requires maintaining 2 platforms and 2 implementations

[Figure: Lambda Architecture diagram. Labels: incoming data stream; persistent store (e.g. Kafka); stream processing (real-time, update approximate results); batch processing (periodically, recreate accurate results); result of analysis (e.g. index).]
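A toy sketch of the idea, assuming the analysis is a per-key count: a speed layer updates results as events arrive, and a periodic batch job recomputes them from the persistent log. The class `LambdaCounts` and its methods are hypothetical names. In this toy the speed layer happens to be exact; in a real deployment it would typically be the approximate side.

```python
from collections import Counter


class LambdaCounts:
    def __init__(self):
        self.log = []                # stands in for the persistent store
        self.batch_view = Counter()  # accurate results from the last batch run
        self.speed_view = Counter()  # real-time updates since the last batch run

    def ingest(self, key):
        self.log.append(key)         # store the event for later batch runs
        self.speed_view[key] += 1    # update the real-time result immediately

    def run_batch(self):
        # Periodically recreate accurate results from the full log,
        # then discard the real-time results they supersede.
        self.batch_view = Counter(self.log)
        self.speed_view = Counter()

    def query(self, key):
        return self.batch_view[key] + self.speed_view[key]
```

Even in this small form, the counting logic exists twice (once per layer), which is the maintenance cost the slide points out.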


Unified Approaches
• Batching on top of streaming
  • Use stream processing system
  • Message-at-a-time infrastructure
  • Add barriers for batching on top of it
  • Example: Apache Flink
• Streaming on top of (micro-)batching
  • Group incoming streaming tuples into micro batches
  • Process them by running very frequent batch jobs
  • Example: Apache Spark Streaming
• Today we look at the former approach
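The second approach (streaming on top of micro-batching) can be sketched as follows. Grouping by tuple count instead of by time interval is a simplification, and `micro_batches` and `run_batch_job` are hypothetical names, not a Spark API.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Group incoming tuples into small batches of at most batch_size."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def run_batch_job(batch):
    """Stand-in for a batch job: here it just sums the batch."""
    return sum(batch)


# One small batch job per micro-batch.
results = [run_batch_job(b) for b in micro_batches(range(1, 8), 3)]
```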


Dataflow Graph
• DAG of (possibly stateful) operators
• Data streams connecting them
  • FIFO channels
• Partitioning
  • Operators are parallelized into subtasks
  • Streams are split into partitions
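One common way to split a stream into partitions is to hash a key, so that all records with the same key reach the same subtask in their original order. This is a sketch under that assumption; `partition_for` and `split_stream` are illustrative names.

```python
import zlib


def partition_for(key, num_subtasks):
    # crc32 is used instead of hash() because Python randomizes
    # string hashes per process, which would make routing unstable.
    return zlib.crc32(str(key).encode("utf-8")) % num_subtasks


def split_stream(records, num_subtasks):
    """Route (key, value) records to one partition per subtask."""
    partitions = [[] for _ in range(num_subtasks)]
    for key, value in records:
        partitions[partition_for(key, num_subtasks)].append((key, value))
    return partitions
```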


Streaming Operators
• Streaming operators vs. Batching operators
  • Input is unbounded, output is continuously refreshed
  • Stateful: Accumulate state over time
  • Example: average of integers is updated incrementally
• Determinism
  • Operators are deterministic
  • Relative order of input messages from multiple channels is not
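The incremental average mentioned above can be sketched as a stateful operator that keeps only a count and a sum, and refreshes its output on every input. `RunningAverage` and `on_event` are illustrative names.

```python
class RunningAverage:
    """Stateful streaming operator: state is (count, total), not the inputs."""

    def __init__(self):
        self.count = 0
        self.total = 0

    def on_event(self, value):
        # Update the accumulated state, then emit the refreshed output.
        self.count += 1
        self.total += value
        return self.total / self.count


op = RunningAverage()
outputs = [op.on_event(v) for v in [4, 8, 6]]
```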


Different Types of Time
• Event time: associated with the event itself, immutable
• Processing time: associated with the system, mutable
• Event-time processing
  • Requires reordering since event time ≠ processing time
  • Example: process all events generated in the interval [from, to)
• Low watermarks
  • Problem: how to know that I got all events until event-time T?
  • Watermark is a special message telling us that
  • With multiple inputs, use minimum low watermark
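A minimal sketch of the last rule: an operator with several inputs tracks the latest watermark seen on each channel and uses the minimum as its own low watermark. `WatermarkTracker` is a hypothetical helper.

```python
class WatermarkTracker:
    def __init__(self, input_channels):
        # No watermark seen yet on any channel.
        self.latest = {ch: float("-inf") for ch in input_channels}

    def on_watermark(self, channel, event_time):
        """Record a watermark and return the operator's low watermark."""
        # Watermarks on a channel never move backwards.
        self.latest[channel] = max(self.latest[channel], event_time)
        # Only the slowest input bounds what can still arrive.
        return min(self.latest.values())
```

Until every input has reported at least one watermark, the minimum stays at negative infinity, so nothing downstream is triggered.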


Windowing
• Execute operators on a window of the stream
  • E.g. average temperature in the last 10 minutes
• Types of windows: sliding, tumbling, punctuations,…
• Implemented as
  • Assigner: maps event to window
  • Trigger: decides when to compute on a window
  • Evictor: decides what to remove from a window
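The three roles can be sketched for tumbling windows with a watermark trigger. This is not Flink's API: `TumblingWindowAverage` is a hypothetical class, times are in seconds, a watermark W is assumed to mean that no later event has time below W, and eviction is simplified to dropping a window once it has fired.

```python
from collections import defaultdict


class TumblingWindowAverage:
    def __init__(self, size):
        self.size = size
        self.windows = defaultdict(list)  # window start -> buffered values

    def assign(self, event_time):
        """Assigner: map an event time to the start of its window."""
        return event_time - (event_time % self.size)

    def on_event(self, event_time, value):
        self.windows[self.assign(event_time)].append(value)

    def on_watermark(self, watermark):
        """Trigger: compute every window that ends at or before the watermark."""
        results = []
        for start in sorted(self.windows):
            if start + self.size <= watermark:
                values = self.windows.pop(start)  # evict the fired window
                results.append((start, sum(values) / len(values)))
        return results
```

With `size=600`, this computes the average temperature per 10-minute window, matching the example on the slide.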


Fault Tolerance


Fault Tolerance
• In Spark and MapReduce, operators are stateless
  • They get a dataset as input, produce a dataset as output
  • Their local state is lost after output is produced
• Flink: stateful operators, accumulate info over time
  • Cannot rerun the whole stream from beginning
• Solution: Periodic checkpointing of stateful operators
  • Export API to application to define state to be checkpointed
  • Need coordinated checkpointing
  • Flink: Simplified Chandy-Lamport algorithm


Chandy-Lamport Protocol
• Assumptions
  • Originator process starts it
  • FIFO channels
  • One checkpoint at a time
• Goal: checkpoint state + all in-flight messages
• Algorithm
  • Originator checkpoints its state and sends checkpoint marker
  • Upon receiving checkpoint marker for the first time
    • Checkpoint and send checkpoint marker on each outgoing channel
    • Record subsequent messages on each incoming channel until a checkpoint marker arrives on that channel


Flink’s Checkpointing Protocol
• Asynchronous Barrier Snapshotting
  • Operators form DAG so no need to record in-flight messages
  • Limits the amount of state to be recorded
  • Does not pause computation
• Upon receiving checkpoint marker from input channel
  • Block input channel
  • If all input channels blocked
    • Take snapshot (asynchronously, with multi-versioning)
    • Send checkpoint marker to output channel
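The marker-handling steps above can be sketched for a single operator that sums its inputs. This is an illustration, not Flink's implementation: `AligningOperator` is a hypothetical class, the snapshot is taken synchronously for simplicity, and records arriving on a blocked channel are held back and replayed after the snapshot.

```python
class AligningOperator:
    def __init__(self, input_channels):
        self.inputs = set(input_channels)
        self.blocked = set()   # channels whose marker has already arrived
        self.buffered = []     # records held back on blocked channels
        self.state = 0         # operator state: running sum
        self.snapshots = []    # (checkpoint_id, state) pairs
        self.output = []       # everything sent downstream, in order

    def on_record(self, channel, value):
        if channel in self.blocked:
            self.buffered.append((channel, value))
        else:
            self.state += value
            self.output.append(value)

    def on_barrier(self, channel, checkpoint_id):
        self.blocked.add(channel)              # block this input channel
        if self.blocked == self.inputs:        # all input channels blocked
            self.snapshots.append((checkpoint_id, self.state))
            self.output.append(("barrier", checkpoint_id))
            self.blocked.clear()               # unblock and replay
            pending, self.buffered = self.buffered, []
            for ch, value in pending:
                self.on_record(ch, value)
```

Because held-back records are applied only after the snapshot, the snapshot reflects exactly the records that preceded the marker on every input, so no in-flight messages need to be recorded.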


Control and Coordination


Control vs. Data Messages
• Control messages are injected in event stream
• Checkpoint markers
  • Inserting them in stream helps consistent snapshot
• Watermarks for windowing
  • Inserting them in stream allows triggering windows
• Coordination barriers
  • Inserting them in stream allows marking event before-after barrier


Implementing Batch on Streaming
• DataSet abstraction in Flink
• Batches implemented without master
• Example
  • Q: How to implement map-reduce on Flink?
  • A: Control messages
    • Mappers send an “eof” marker to each reducer when done
    • Reducers do not process until they receive markers from all mappers
    • Quadratic number of control messages; using master would require linear
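A sketch of the reducer side of this scheme, assuming the reduce function is a sum per key. `Reducer` and `control_message_counts` are hypothetical names, and the master-based count assumes each mapper notifies the master once and the master notifies each reducer once.

```python
from collections import defaultdict


class Reducer:
    def __init__(self, num_mappers):
        self.num_mappers = num_mappers
        self.eof_seen = set()            # mappers that have sent "eof"
        self.groups = defaultdict(list)  # key -> buffered values

    def on_message(self, mapper_id, message):
        if message == "eof":
            self.eof_seen.add(mapper_id)
        else:
            key, value = message
            self.groups[key].append(value)

    def ready(self):
        """True once every mapper has sent its eof marker."""
        return len(self.eof_seen) == self.num_mappers

    def reduce(self):
        if not self.ready():
            raise RuntimeError("still waiting for eof markers")
        return {key: sum(values) for key, values in self.groups.items()}


def control_message_counts(num_mappers, num_reducers):
    without_master = num_mappers * num_reducers  # every mapper to every reducer
    with_master = num_mappers + num_reducers     # assumed: via a single master
    return without_master, with_master
```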


More Details


Query Optimization
• Common in DBs with relational operators
• Harder with User Defined Functions
  • Arbitrary operators
  • No knowledge of complexity of operations
  • No knowledge of cardinality of intermediate results


Common DS Tricks in Flink
• Backpressure
  • If receiver operator cannot process inputs fast enough…
  • ... Block or slow down senders (recursively if needed)
• Intermediate buffer pools (queues)
  • Decouple communication from consuming messages
• Conflicting requirements
  • Throughput: batch output messages, don’t send one by one
  • Latency: send messages asap
  • Tradeoff: Send when either
    • Max batch size reached (e.g. 1 kB) or
    • Timeout (e.g. 5 milliseconds)
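The size-or-timeout tradeoff can be sketched as an output buffer that flushes when either condition holds. `OutputBuffer` is a hypothetical class; the current time is passed in explicitly so the behavior is deterministic, and a real system would also need a timer to call `maybe_flush` when no new message arrives.

```python
class OutputBuffer:
    def __init__(self, send, max_bytes=1024, timeout_s=0.005):
        self.send = send            # callback that ships a batch downstream
        self.max_bytes = max_bytes  # e.g. 1 kB
        self.timeout_s = timeout_s  # e.g. 5 milliseconds
        self.pending = []
        self.pending_bytes = 0
        self.oldest = None          # arrival time of the oldest pending message

    def add(self, message, now):
        if not self.pending:
            self.oldest = now
        self.pending.append(message)
        self.pending_bytes += len(message)
        self.maybe_flush(now)

    def maybe_flush(self, now):
        if not self.pending:
            return
        full = self.pending_bytes >= self.max_bytes
        timed_out = now - self.oldest >= self.timeout_s
        if full or timed_out:
            self.send(self.pending)
            self.pending, self.pending_bytes, self.oldest = [], 0, None
```

A larger `max_bytes` favors throughput, while a smaller `timeout_s` bounds the latency added by batching.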


Off-Heap Memory Management
• Java memory management
  • Objects allocated on heap managed by JVM
  • Simple but garbage collection is unpredictable
• Off-Heap Memory Management
  • C-style memory management
  • Key platform-internal data structures kept off-heap (efficient)
  • Applications are still written in Java (easy to code)
  • Common in Java-based platforms
  • One of the main changes in Spark 2.0, great speedup


Exercise


Exercise: Online Store
• Two input streams
  • One has purchases: <userID, time, itemID>
  • The other has ad impressions for an item: <time, itemID>
• Design application that
  • Correlates ad impressions with users
  • Correlated = purchase happens within 10 seconds after ad


Solution
• Streaming operator
  • Receives both streams
  • Partitioned by itemID
  • Join by itemID and return <userID, itemID, ad_time>
• Watermarks
  • Both input streams emit a low watermark every second
  • Minimum low watermark triggers the window
• Flink operators
  • Assigner: map tuples to 10-second sliding windows
  • Trigger: watermark
  • Evictor: remove purchases that precede ads
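One way to sketch this solution outside Flink is a join operator for a single partition. The assumptions here are not from the slides: `AdPurchaseJoin` is a hypothetical class, times are in seconds, "within 10 seconds" includes both ends of the interval, and the watermark passed in is already the minimum over both streams and promises that no later event has a time at or below it.

```python
from collections import defaultdict

WINDOW = 10  # seconds


class AdPurchaseJoin:
    def __init__(self):
        self.ads = defaultdict(list)        # itemID -> [ad_time]
        self.purchases = defaultdict(list)  # itemID -> [(userID, time)]

    def on_ad(self, time, item_id):
        self.ads[item_id].append(time)

    def on_purchase(self, user_id, time, item_id):
        self.purchases[item_id].append((user_id, time))

    def on_watermark(self, watermark):
        """Emit <userID, itemID, ad_time> for ads whose window has closed."""
        results = []
        for item_id in list(self.ads):
            times = self.ads[item_id]
            closed = [t for t in times if t + WINDOW <= watermark]
            self.ads[item_id] = [t for t in times if t + WINDOW > watermark]
            for ad_time in closed:
                for user_id, p_time in self.purchases.get(item_id, []):
                    if ad_time <= p_time <= ad_time + WINDOW:
                        results.append((user_id, item_id, ad_time))
        # Evict purchases that precede every ad that can still be matched.
        for item_id in list(self.purchases):
            self.purchases[item_id] = [
                (u, t) for u, t in self.purchases[item_id]
                if t > watermark - WINDOW
            ]
        return results
```

For example, an ad for an item at time 100 followed by a purchase of that item at time 105 is reported once the watermark reaches 110.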
