Kostas Tzoumas (@kostas_tzoumas)
Strata + Hadoop World NYC 2016, September 29, 2016
Apache Flink®: State of the Union and What's Next
What I'd like to talk about
Some highlights from Flink Forward 2016
Streaming ecosystem evolution and Flink
What's coming up in Flink
Original creators of Apache Flink®
Providers of the dA Platform, the supported Flink distribution
Flink Forward 2016
7 sponsors
Speaker organizations
Retail, e-commerce
• Better product recommendations
• Process monitoring
• Inventory management
Finance
• Differentiation via tech
• Push-based products
• Fraud detection
Telco, IoT, Infrastructure
• Infrastructure monitoring
• Anomaly detection
Internet & mobile
• Personalization
• User behavior monitoring
• Analytics
30 Flink applications in production for more than one year. 10 billion events (2TB) processed daily
Complex jobs of > 30 operators running 24/7, processing 30 billion events daily, maintaining state of 100s of GB with exactly-once guarantees
Largest job has > 20 operators, runs on > 5000 vCores in 1000-node cluster, processes millions of events per second
Streaming ecosystem and Flink
Streaming technology is enabling the obvious: continuous processing on data that is continuously produced
Hint: you already have streaming data
[Diagram: collect, log, analyze, query; app state; history log]
(Aside: streaming and "batch")
[Figure: a timeline of hourly timestamps (2016-3-1 12:00 am, 1:00 am, 2:00 am, …, 2016-3-11 10:00 pm, 11:00 pm, 2016-3-12 12:00 am, 1:00 am, 2:00 am, 3:00 am, …) divided into partitions]
Stream (low latency)
Batch (bounded stream)
Stream (high latency)
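The aside treats a batch as a bounded stream. A minimal Python sketch of that idea (not Flink code): the hypothetical `count_per_hour` function accepts any iterable of `(timestamp, value)` events, so the same logic can run over a finite slice or be fed from an endless source. The function names, the event layout, and the 20-minute spacing are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta
from itertools import islice


def count_per_hour(events):
    """Count events per hour; works for any iterable of (datetime, value)."""
    counts = Counter()
    for ts, _value in events:
        counts[ts.replace(minute=0, second=0, microsecond=0)] += 1
    return counts


def unbounded_stream(start, step):
    """Hypothetical endless source: one event every `step`, forever."""
    ts = start
    while True:
        yield ts, "event"
        ts += step


start = datetime(2016, 3, 1, 0, 0)
stream = unbounded_stream(start, timedelta(minutes=20))

# A "batch" is just a bounded slice of the stream: here, the first 6 events.
batch = list(islice(stream, 6))
batch_counts = count_per_hour(batch)
```

The processing function never needs to know whether its input ends; only the caller decides where the bound is.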
What is Flink's unique contribution in the streaming data ecosystem?
Before Flink, users had to make hard choices between volume, latency, and accuracy
Flink eliminates these tradeoffs
10s of millions of events per second for stateful applications
Sub-second latency, as low as single-digit milliseconds
Accurate computation results
A broader definition of accuracy: the results that I want when I want them
1. Accurate under failures and downtime
2. Accurate under out of order data
3. Results when you need them
4. Accurate modeling of the world
1. Failures and downtime
• Checkpoints & savepoints
• Exactly-once guarantees
2. Out of order and late data
• Event time support
• Watermarks
3. Results when you need them
• Low latency
• Triggers
4. Accurate modeling
• True streaming engine
• Sessions and flexible windows
5. Batch + streaming
• One engine
• Dedicated APIs
6. Reprocessing
• High throughput, event time support, and savepoints
7. Ecosystem
• Rich connector ecosystem and 3rd party packages
8. Community support
• One of the most active projects with over 200 contributors
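Item 2 in the list names event time and watermarks as the tools for handling out-of-order and late data. The following is a minimal Python sketch of that idea, not Flink's API: events carry their own timestamps, a watermark trails the largest timestamp seen by a fixed bound, and a window is emitted once the watermark passes its end. The function name, the integer timestamps, and the policy of dropping events that arrive after their window has closed are assumptions chosen for illustration.

```python
from collections import defaultdict


def tumbling_event_time_counts(events, window_size, max_out_of_orderness):
    """Count events per event-time window, emitting a window once the
    watermark passes its end. `events` is an iterable of integer event
    timestamps in arrival order, which may differ from timestamp order."""
    open_windows = defaultdict(int)   # window start -> count so far
    emitted = []                      # (window start, count) in emission order
    dropped_late = 0
    max_seen = None

    for ts in events:
        max_seen = ts if max_seen is None else max(max_seen, ts)
        watermark = max_seen - max_out_of_orderness

        start = ts - (ts % window_size)
        if start + window_size <= watermark:
            dropped_late += 1         # window already closed: event is too late
        else:
            open_windows[start] += 1

        # Close every window whose end is at or before the watermark.
        for w in sorted(open_windows):
            if w + window_size <= watermark:
                emitted.append((w, open_windows.pop(w)))

    # End of input: flush whatever is still open.
    for w in sorted(open_windows):
        emitted.append((w, open_windows.pop(w)))
    return emitted, dropped_late


# Arrival order differs from timestamp order; 3 arrives after its window closed.
arrivals = [1, 4, 2, 11, 7, 12, 25, 3]
windows, late = tumbling_event_time_counts(arrivals, window_size=10,
                                           max_out_of_orderness=5)
```

In this run the event with timestamp 7 arrives after 11 but is still counted in the first window, because the watermark has not yet passed that window's end. The event with timestamp 3 arrives after the window has been emitted and is counted as late.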
flink run -s <savepoint> <job>
Having a dependable framework enables more stateful applications to run as streaming applications
What's coming up in Flink
• Provide state of the art streaming capabilities (✔)
• Operate in the largest infrastructures of the world
• Open up to a wider set of enterprise users
• Broaden the scope of stream processing
Flink's unique combination of features
• Low latency
• High throughput
• Well-behaved flow control (back pressure)
• Consistency
• Works on real-time and historic data
• Performance
• Event Time
• APIs
• Libraries
• Stateful Streaming
• Savepoints (replays, A/B testing, upgrades, versioning)
• Exactly-once semantics for fault tolerance
• Windows & user-defined state
• Flexible windows (time, count, session, roll-your own)
• Complex Event Processing
• Fluent API
• Out-of-order events
• Fast and large out-of-core state
Flink v1.1
• Connectors
• Metric System
• (Stream) SQL
• Session Windows
• Library enhancements
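Session windows group events by activity rather than by fixed time boundaries. A minimal Python sketch of the concept, not Flink's implementation: a new session starts whenever the distance to the previous event exceeds a gap. The function name and the choice to treat a distance exactly equal to the gap as part of the same session are assumptions made here.

```python
def session_windows(timestamps, gap):
    """Group event timestamps into sessions: a new session starts whenever
    the distance to the previous event exceeds `gap` (assumed convention)."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still within the current session
        else:
            sessions.append([ts])     # gap exceeded: open a new session
    return sessions


# Timestamps may arrive in any order; sessions are defined on event time.
sessions = session_windows([3, 1, 20, 4, 50, 22], gap=5)
```

Sorting the whole input is only possible for bounded data. A streaming engine would instead merge sessions incrementally as events arrive.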
Flink v1.1 + current threads
Flink v1.1
• Connectors
• Session Windows
• (Stream) SQL
• Library enhancements
• Metric System
Current threads
• Metrics & Visualization
• Dynamic Scaling
• Savepoint compatibility
• Checkpoints to savepoints
• More connectors
• Stream SQL
• Windows
• Large state Maintenance
• Fine grained recovery
• Side in-/outputs
• Window DSL
• Security
• Mesos & others
• Dynamic Resource Management
• Authentication
• Queryable State
Flink v1.1 + current threads, grouped by theme
• Operations
• Ecosystem
• Application Features
• Broader Audience
Security / Authentication
No unauthorized data access
Secured clusters with Kerberos-based authentication
• Kafka, ZooKeeper, HDFS, YARN, HBase, …
No unencrypted traffic between Flink processes
• RPC, Data Exchange, Web UI
Largely contributed by
Prevent malicious users from hooking into Flink jobs
Checkpoints / Savepoints
Recover a running job into a new job
Recover a running job onto a new cluster
Application state backwards compatibility
• Flink 1.0 made the APIs backwards compatible
• Now making the savepoints backwards compatible
• Applications can be moved to newer versions of Flink even when state backends or internals change
[Figure: v1.x, v1.y, v2.0]
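Recovering a running job into a new job can be illustrated with a toy model. This is a sketch under stated assumptions, not Flink's checkpointing algorithm: the hypothetical `CountingJob` reads from a replayable log, and a snapshot captures the read position together with the operator state, so a new job restored from it processes every record exactly once. All class and method names are invented for illustration.

```python
import copy


class CountingJob:
    """Toy stateful job: counts words read from a replayable log."""

    def __init__(self, log, offset=0, counts=None):
        self.log = log                    # replayable source, here a list
        self.offset = offset              # next position to read
        self.counts = dict(counts or {})  # operator state

    def process(self, n):
        """Consume up to n records, updating state and offset together."""
        for word in self.log[self.offset:self.offset + n]:
            self.counts[word] = self.counts.get(word, 0) + 1
            self.offset += 1

    def snapshot(self):
        """Capture source position and state as one consistent unit."""
        return {"offset": self.offset, "counts": copy.deepcopy(self.counts)}

    @classmethod
    def restore(cls, log, snap):
        """Start a new job from a snapshot; the log is replayed from there."""
        return cls(log, snap["offset"], snap["counts"])


log = ["a", "b", "a", "c", "a", "b"]

job = CountingJob(log)
job.process(3)
savepoint = job.snapshot()
job.process(2)             # progress after the snapshot is discarded below

new_job = CountingJob.restore(log, savepoint)
new_job.process(len(log))  # replays the records after the snapshot position
```

Because the offset and the counts are saved together, the work done after the snapshot is simply redone by the new job, and no record is counted twice.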
Dynamic scaling
Changing load bears changing resource requirements
• Need to adjust parallelism of running streaming jobs
Re-scaling stateless operators is trivial
Re-scaling stateful operators is hard (windows, user state)
• Efficiently re-shard state
[Figure: workload and resources over time]
Re-scaling Flink jobs preserves exactly-once guarantees
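One common way to re-shard keyed state efficiently is to hash keys into a fixed number of key groups and move whole groups between workers when parallelism changes. The slide does not say which mechanism is used, so the Python sketch below only illustrates that general technique. `MAX_PARALLELISM`, the CRC32 hash, and all function names are assumptions.

```python
import zlib

MAX_PARALLELISM = 128  # fixed number of key groups (assumed value)


def key_group(key):
    """Stable hash of the key into one of MAX_PARALLELISM key groups."""
    return zlib.crc32(key.encode("utf-8")) % MAX_PARALLELISM


def owner(group, parallelism):
    """Map a key group to a worker index; groups form contiguous ranges."""
    return group * parallelism // MAX_PARALLELISM


def shard(state, parallelism):
    """Split keyed state {key: value} across `parallelism` workers."""
    workers = [{} for _ in range(parallelism)]
    for key, value in state.items():
        workers[owner(key_group(key), parallelism)][key] = value
    return workers


def rescale(workers, new_parallelism):
    """Re-shard by reassigning whole key groups; values are never changed."""
    merged = {}
    for w in workers:
        merged.update(w)
    return shard(merged, new_parallelism)


state = {f"user-{i}": i for i in range(1000)}
two = shard(state, 2)
four = rescale(two, 4)
```

Because each worker owns a contiguous range of key groups, scaling from 2 to 4 workers means every new worker reads its state from exactly one old worker. This sketch merges everything in memory for brevity; a real system would move only the affected ranges.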
Cluster management
Series of improvements to seamlessly interoperate with various cluster managers
• YARN, Mesos, Docker, Standalone, …
Driven by
Mesos integration contributed by
and
Stream SQL
SQL is the standard high-level query language
A natural way to open up streaming to more people
Problem: There is no Streaming SQL standard
• At least beyond the basic operations
• Challenging: Incorporate windows and time semantics
Flink community working with Apache Calcite to draft a new model
State in stream processing
Stateless Streaming (Apache Storm)
Stateful Streaming (Apache Samza)
Accurate Stateful Streaming (Apache Flink)
State sizes in Flink today: 10s of gigabytes per operator
How to scale this to many terabytes?
• Queryable State
• Data driven triggers over large state
Large-state streaming
How to scale the stream processor state?
… and maintain fast checkpoint intervals?
… and have very fast recovery on machine failures?
More and more database techniques coming into Flink
I wrote a book!
Get it at mapr.com/introduction-to-apache-flink
@kostas_tzoumas | @ApacheFlink | @dataArtisans
Thank you! We are hiring!