30
Spark and Spark Streaming @ Netflix Kedar Sadekar & Monal Daxini

Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Embed Size (px)

Citation preview

Page 1: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Spark and Spark Streaming @ Netflix Kedar Sadekar & Monal Daxini

Page 2: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Mission •  Enable rapid pace of innovation for

Algorithm Engineers

•  Business Value – More A/B tests

Page 3: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Experiments

Users with plays Feature selection Large sample size

Turn back time Multiple ideas

Page 4: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Use Cases

Feature Selection

Feature Generation

Model Training

Metric Evaluation

Page 5: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Use Cases

More Users

Faster Iteration

InteractiveTurn back time

Page 6: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Solution – Netflix BDAS

Netflix BDAS

Notebooks• Zeppelin / iPython

• Inline Graphs / REPL

Prana Sidecar• Netflix Ecosystem

Berkeley BDAS• Faster Compute• Scale Users• Access to S3 / Hive

Page 7: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Netflix BDAS - Features •  Simplicity

–  Individual cluster •  Prana - Netflix ecosystem

–  Automatic configuration –  Classloader isolation –  Discovery & healthcheck

Page 8: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Netflix BDAS - Features •  Ad-hoc experimentation •  Time machine functionality •  Access to Hive data and micro services from single place

–  Access to multiple AWS buckets (S3)

Page 9: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Netflix BDAS – Sample Deployment

Page 10: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Wins •  8X the number of users

•  5x - 9x faster

•  Interactive

•  Turn back time

Page 11: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Learnings •  Easy to bring down an online system •  Almost killed 1000’s of ETL jobs

- hive metastore update •  Too many systems and configuration •  Playing catch up with libraries and tools

- Hadoop, iPython, Zeppelin •  Scala / Spark learning curve •  Debugging

- files open, no resources etc.

Page 12: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Increased Adoption Adoption increasing amongst teams •  Multiple Algorithmic Eng. teams •  Personalization Infrastructure •  Marketing •  Security •  A/B Test Engineering

Page 13: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Looking Ahead •  Spark-R / Dataframes support •  Multi-tenancy

–  Job specific configurations •  Debuggability •  Newer notebooks •  Spark Streaming

–  Lambda Architecture –  Real-time algorithms (trending now)

Page 14: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Netflix Streaming Event Data Pipeline

Monal Daxini

Page 15: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Netflix Event Data Pipeline

Event Streams

Stream processing

Page 16: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Publish

Collect

Move

Process

Events @ Cloud Scale

Page 17: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

450 Billion Events per Day

8 Million (17 GB) per sec peak

Page 18: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

S3 EMR

Event Producer

Fronting Kafka

Consumer Kafka

At least Once Processing

Stream Consumers

Mantis

450 Billion events / day

Page 19: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

S3 EMR

Event Producer

Fronting Kafka

Consumer Kafka

At least Once Processing

Stream Consumers

Mantis

450 Billion events / day

Page 20: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

S3 EMR

Event Producer

Fronting Kafka

Consumer Kafka

At least Once Processing

Stream Consumers

Mantis

450 Billion events / day

Page 21: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

S3 EMR

Event Producer

Fronting Kafka

Consumer Kafka

Stream Consumers

Mantis

450 Billion events / day

At least Once Processing

Page 22: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Backpressure

Direct API for Kafka

Improve Cloud Multi-tenancy

What’s Missing?

+

ì

Page 23: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Backpressure + JobScheduler JobGenerator DStreamGraph

Driver

00 : 00 00 : 01 00 : 02 00 : 03 00 : 04 00 : 05 00 : 06 00 : 07 00 : 08 00 : 09 00 : 10

Unbounded …

Ops Queue

Page 24: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Backpressure

SPARK-7398 – Add backpressure to Spark streaming

SPARK-6691 – Add dynamic rate limiter to Spark streaming

+

Page 25: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Backpressure + Backpressure implementation slated for

Spark 1.5 release

Page 26: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Direct API for Kafka

Spark 1.3

Kafka Integration

Spark 1.2 Receiver based Kafka Integration

2x Faster

Page 27: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Direct API for Kafka

Prefetch messages

Connection reuse (pooling)

Enhance

Page 28: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Cloud Multi-tenancy ì

Cloud Scheduler

Mesos Framework

Mesos Slave

Docker

Executor

Task Task

Docker

Executor

Task Task

Mesos Slave

Docker

Spark Driver

Improve

Page 29: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)

Next?

Measuring Spark Streaming Latencies

Spark Streaming Cloud Multi-tenancy

Page 30: Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)