This is a work in progress of a talk for the Scala User Group in Tokyo. It touches on basics and some ideas behind Reactive Streams as well as the implementation shipped by Akka.
Konrad `@ktosopl` Malawski
typesafe.com geecon.org
Java.pl / KrakowScala.pl sckrk.com / meetup.com/Paper-Cup @ London
GDGKrakow.pl meetup.com/Lambda-Lounge-Krakow
hAkker @
Streams
“You cannot enter the same river twice” ~ Heraclitus
http://en.wikiquote.org/wiki/Heraclitus
Streams
Real Time Stream Processing
When you attach “late” to a Publisher, you may miss initial elements: it’s a river of data.
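The “river” behaviour can be sketched in plain Scala. This is only an illustration: `HotPublisherSketch` and `HotSource` are hypothetical names, not Reactive Streams or Akka types, and the sketch ignores back-pressure entirely.

```scala
import scala.collection.mutable.ListBuffer

object HotPublisherSketch {
  // A "hot" source: each element goes only to whoever is subscribed when it is emitted.
  final class HotSource[A] {
    private val subscribers = ListBuffer.empty[A => Unit]

    def subscribe(onNext: A => Unit): Unit = {
      subscribers += onNext
      ()
    }

    def emit(a: A): Unit = subscribers.foreach(s => s(a))
  }

  // Returns what the early and the late subscriber each received.
  def demo(): (List[Int], List[Int]) = {
    val source = new HotSource[Int]
    val early = ListBuffer.empty[Int]
    val late = ListBuffer.empty[Int]

    source.subscribe { a => early += a; () }
    source.emit(1)
    source.emit(2)
    source.subscribe { a => late += a; () } // attaches "late"
    source.emit(3)

    (early.toList, late.toList)
  }
}
```

The late subscriber only ever sees the elements emitted after it attached; nothing is replayed for it.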
Reactive Streams: Goals
1. Back-pressured Asynchronous Stream processing
2. Standard implemented by many libraries
Reactive Streams - Who?
http://reactive-streams.org
Kaazing Corp. rxJava @ Netflix,
reactor @ Pivotal (SpringSource), vert.x @ Red Hat,
Twitter, akka-streams @ Typesafe,
spray @ Spray.io, Oracle,
java (?) – Doug Lea - SUNY Oswego …
Reactive Streams - Inter-op
http://reactive-streams.org
We want to make different implementations co-operate with each other.
The different implementations “talk to each other” using the Reactive Streams protocol.
The Reactive Streams SPI is NOT meant to be a user-facing API. You should use one of the implementing libraries.
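To get a feel for how low-level the SPI is, here is a hand-written Subscriber. It is a sketch under assumptions: it uses the JDK’s `java.util.concurrent.Flow` interfaces and `SubmissionPublisher` (JDK 9 or newer), which mirror the Reactive Streams interfaces; it is not Akka code, and `FlowSpiSketch` and `collect` are made-up names.

```scala
import java.util.concurrent.{CountDownLatch, Flow, SubmissionPublisher, TimeUnit}
import scala.collection.mutable.ListBuffer

object FlowSpiSketch {
  // Subscribes by hand, requesting one element at a time, and returns what arrived.
  def collect(n: Int): List[Int] = {
    val received = ListBuffer.empty[Int]
    val done = new CountDownLatch(1)
    val publisher = new SubmissionPublisher[Int]()

    publisher.subscribe(new Flow.Subscriber[Int] {
      private var subscription: Flow.Subscription = null

      def onSubscribe(s: Flow.Subscription): Unit = {
        subscription = s
        s.request(1) // signal demand for the first element
      }

      def onNext(item: Int): Unit = {
        received += item
        subscription.request(1) // ask for the next element only after handling this one
      }

      def onError(t: Throwable): Unit = done.countDown()

      def onComplete(): Unit = done.countDown()
    })

    (1 to n).foreach(i => publisher.submit(i)) // submit blocks while the subscriber's buffer is full
    publisher.close()                          // onComplete follows once pending elements are delivered
    done.await(5, TimeUnit.SECONDS)
    received.toList
  }
}
```

Every signal (subscription, demand, completion, failure) has to be handled by hand, which is the bookkeeping the implementing libraries take care of.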
Back-pressure? Push + NACK model (a)
Use bounded buffer, drop messages + require re-sending
Kernel does this! Routers do this!
(TCP)
Just push – not safe when Slow Subscriber
Just pull – too slow when Fast Subscriber
Back-pressure? RS: Dynamic Push/Pull
Solution:
Dynamic adjustment (Reactive Streams)
Slow Subscriber sees its buffer can take 3 elements. Publisher will never overflow the Subscriber’s buffer.
Fast Publisher will send at most 3 elements. This is pull-based back-pressure.
Fast Subscriber can issue more Request(n), before more data arrives!
Back-pressure? RS: Accumulate demand
Total demand of elements is safe to publish. Subscriber’s buffer will not overflow.
Back-pressure? RS: Requesting “a lot”
Fast Subscriber can request “a lot” from Publisher. This is effectively “publisher push”, and is really fast. Buffer size is known and this is safe.
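The demand bookkeeping behind Request(n) can be sketched as a small single-threaded model. The names `DemandSketch` and `SketchPublisher` are hypothetical; a real implementation also has to deal with asynchrony, re-entrant requests, completion and errors.

```scala
import scala.collection.mutable.ListBuffer

object DemandSketch {
  // Emits elements only while the subscriber has outstanding demand.
  final class SketchPublisher[A](elements: Iterator[A], onNext: A => Unit) {
    private var demand = 0L // total demand signalled and not yet fulfilled

    // The subscriber signals it can take `n` more elements; demand accumulates.
    def request(n: Long): Unit = {
      require(n > 0, "demand must be positive")
      demand += n
      while (demand > 0 && elements.hasNext) {
        demand -= 1
        onNext(elements.next())
      }
    }

    def outstandingDemand: Long = demand
  }

  // Returns what the subscriber had received after the first and after the second request.
  def demo(): (List[Int], List[Int]) = {
    val received = ListBuffer.empty[Int]
    // An endless source: only demand limits how much gets published.
    val publisher = new SketchPublisher[Int](Iterator.from(1), a => { received += a; () })

    publisher.request(3) // buffer has room for 3 elements
    val afterFirst = received.toList
    publisher.request(2) // two slots freed up, so ask for two more
    (afterFirst, received.toList)
  }
}
```

Requesting a very large `n` up front turns the same loop into plain “publisher push”, which is the fast path for a subscriber that can keep up.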
Akka
Akka is a high-performance concurrency library for Scala and Java.
At its core it focuses on the Actor Model:
An Actor can only:
• Send / receive messages
• Create Actors
• Change its behaviour
Akka has multiple modules:
Akka-camel: integration
Akka-remote: remote actors
Akka-cluster: clustering
Akka-persistence: CQRS / Event Sourcing
Akka-streams: stream processing
…
Akka Streams – Linear Flow
FlowFrom[Double].map(_.toInt). [...]
No Source attached yet. “Pipe ready to work with Doubles”.
implicit val sys = ActorSystem("tokyo-sys")
It’s the world in which Actors live. Akka Streams uses Actors, so it needs an ActorSystem.
implicit val mat = FlowMaterializer()
Contains logic on HOW to materialise the stream. Can be pure Actors, or (in the future) Apache Spark.
You can configure its buffer sizes etc. (Or implement your own materialiser, e.g. “run on Spark”.)
val foreachSink = ForeachSink[Int](println)
val mf = FlowFrom(1 to 3).withSink(foreachSink).run()
run() uses the implicit FlowMaterializer. It can also be passed explicitly:
val mf = FlowFrom(1 to 3).withSink(foreachSink).run()(mat)
val mf = FlowFrom[Int].
  map(_ * 2).
  withSink(ForeachSink(println)) // needs source,
                                 // can NOT run
val f = FlowFrom[Int].
  map(_ * 2).
  withSink(ForeachSink(i => println(s"i = $i"))) // needs Source to run

f.withSource(IterableSource(1 to 10)).run()
Akka Streams – Flows are reusable
f.withSource(IterableSource(1 to 10)).run()
f.withSource(IterableSource(1 to 100)).run()
f.withSource(IterableSource(1 to 1000)).run()
Akka Streams <-> Actors – Advanced
val subscriber = system.actorOf(Props[SubStreamParent], "parent")

FlowFrom(1 to 100).
  map(_.toString).
  filter(_.length == 2).
  drop(2).
  groupBy(_.last).
  publishTo(ActorSubscriber(subscriber))
Each “group” is a stream too! “Stream of Streams”.
groupBy(_.last).
GroupBy keys each element by its last character: “12” goes to group ‘2’, “13” to group ‘3’ etc. (“10” and “11” were already removed by drop(2)).
It offers (groupKey, subStreamFlow) to Subscriber
It can then start children, to handle the sub-flows!
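The grouping itself can be checked on a strict collection. This is only a sketch of the operator chain with Scala’s standard collections, not Akka Streams: there are no sub-streams here, just a `Map` from key to elements. `GroupBySketch` is a made-up name.

```scala
object GroupBySketch {
  // Same operators as on the slide, applied to a Range instead of a stream.
  val groups: Map[Char, Seq[String]] =
    (1 to 100)
      .map(_.toString)
      .filter(_.length == 2) // keeps "10" to "99"
      .drop(2)               // drops "10" and "11"
      .groupBy(_.last)       // the key is the last character, a Char

  // The first element that lands in the given group, if the group exists.
  def firstOf(key: Char): Option[String] = groups.get(key).flatMap(_.headOption)
}
```

There are ten groups, one per final digit, and the key type is `Char` rather than `String`.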
An ordinary Akka Actor will consume SubStream offers.
class SubStreamParent extends ActorSubscriber
  with ImplicitFlowMaterializer
  with ActorLogging {

  override def requestStrategy = OneByOneRequestStrategy

  override def receive = {
    // groupBy(_.last) on Strings yields Char keys
    case OnNext((groupId: Char, subStream: FlowWithSource[_, _])) =>
      val subSub = context.actorOf(Props[SubStreamSubscriber],
        s"sub-$groupId")
      subStream.publishTo(ActorSubscriber(subSub))
  }
}
class SubStreamSubscriber extends ActorSubscriber {

  override def requestStrategy = OneByOneRequestStrategy

  override def receive = {
    case OnNext(n: String) => println(s"n = $n")
  }
}
Akka Streams – GraphFlow
// first define some pipeline pieces
val f1 = FlowFrom[Input].map(_.toIntermediate)
val f2 = FlowFrom[Intermediate].map(_.enrich)
val f3 = FlowFrom[Enriched].filter(_.isImportant)
val f4 = FlowFrom[Intermediate].mapFuture(_.enrichAsync)

// then add input and output placeholders
val in = SubscriberSource[Input]
val out = PublisherSink[Enriched]
val b3 = Broadcast[Int]("b3")
val b7 = Broadcast[Int]("b7")
val b11 = Broadcast[Int]("b11")
val m8 = Merge[Int]("m8")
val m9 = Merge[Int]("m9")
val m10 = Merge[Int]("m10")
val m11 = Merge[Int]("m11")
val in3 = IterableSource(List(3))
val in5 = IterableSource(List(5))
val in7 = IterableSource(List(7))
// First layer
in7 ~> b7
b7 ~> m11
b7 ~> m8

in5 ~> m11

in3 ~> b3
b3 ~> m8
b3 ~> m10
// Second layer
m11 ~> b11
b11 ~> FlowFrom[Int].grouped(1000) ~> resultFuture2
b11 ~> m9
b11 ~> m10

m8 ~> m9
// Third layer
m9 ~> FlowFrom[Int].grouped(1000) ~> resultFuture9
m10 ~> FlowFrom[Int].grouped(1000) ~> resultFuture10
val resultFuture2 = FutureSink[Seq[Int]]
val resultFuture9 = FutureSink[Seq[Int]]
val resultFuture10 = FutureSink[Seq[Int]]

val g = FlowGraph { implicit b =>
  // ...
  m10 ~> FlowFrom[Int].grouped(1000) ~> resultFuture10
  // ...
}.run()

Await.result(g.getSinkFor(resultFuture2), 3.seconds).sorted should be(List(5, 7))
Sinks and Sources are “keys” which can be addressed within the graph
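To see why resultFuture2 ends up holding 5 and 7, the wiring can be traced with plain lists. This sketch assumes a simplified model: Broadcast is a list reused by several consumers, Merge is list concatenation, and results are sorted because a real merge gives no ordering guarantee. `GraphSketch` and its `result…` values are hypothetical names and no Akka is involved.

```scala
object GraphSketch {
  val in3 = List(3)
  val in5 = List(5)
  val in7 = List(7)

  // First layer
  val b7 = in7        // in7 ~> b7
  val b3 = in3        // in3 ~> b3
  val m11 = b7 ++ in5 // b7 ~> m11, in5 ~> m11
  val m8 = b7 ++ b3   // b7 ~> m8,  b3 ~> m8

  // Second layer
  val b11 = m11       // m11 ~> b11
  val m9 = b11 ++ m8  // b11 ~> m9,  m8 ~> m9
  val m10 = b3 ++ b11 // b3 ~> m10, b11 ~> m10

  // Sinks; grouped(1000) gathers everything into one Seq since there are fewer than 1000 elements
  val result2 = b11.sorted
  val result9 = m9.sorted
  val result10 = m10.sorted
}
```

In this model the 7 reaches m9 twice, once via b11 and once via m8.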
val g = FlowGraph {}
FlowGraph is immutable and safe to share and re-use! Think of it as “the description” which then gets “run”.
Available Sources
• FutureSource
• IterableSource
• IteratorSource
• PublisherSource
• SubscriberSource
• ThunkSource
• TickSource (timer based)
• … easy to add your own!
0.7 early preview
Available operations
• buffer
• collect
• concat
• conflate
• drop / dropWithin
• take / takeWithin
• filter
• fold
• foreach
• groupBy
• grouped
• map
• onComplete
• prefixAndTail
• broadcast
• merge / “generalised merge”
• zip
• … possible to add your own!
Available Sinks
• BlackHoleSink
• FoldSink
• ForeachSink
• FutureSink
• OnCompleteSink
• PublisherSink / FanoutPublisherSink
• SubscriberSink
• … easy to add your own!
Links
1. http://akka.io
2. http://reactive-streams.org
3. https://groups.google.com/group/akka-user
Thank you very much! Questions? http://akka.io
ktoso @ typesafe.com twitter: ktosopl github: ktoso team blog: letitcrash.com