Mesos meetup @ Shutterstock

Analytics on Mesos w/ Spark, Docker

April 14, 2015 @ Shutterstock

Brenden Matthews @brndnmtthws

© 2015 Mesosphere, Inc.

Agenda

• DCOS Cloud Early Access update

• Spark with Docker on Marathon demo

• Cluster resizing with AWS Auto Scaling Groups

• Use case discussion

• Q&A

DCOS Cloud

This video shows the DCOS Cloud provisioning process with AWS CloudFormation templates on EC2.

DEMO

Spark with Docker

• Spark scheduler runs on Marathon, within Docker container

• Spark tasks run atop Mesos

• Spark worker tasks are Docker containers too
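As a rough illustration of the layout above, the sketch below builds a Marathon app definition that runs a scheduler inside a Docker container, plus the Spark properties that point worker tasks at a Docker image. The app id, image name, start command, and master URL are placeholders and not taken from the talk; the `spark.mesos.executor.docker.image` property is assumed to be supported by the Spark version in use.

```python
import json


def spark_scheduler_app(image, cmd, cpus=1.0, mem=1024):
    """Build a Marathon app definition that runs one scheduler in Docker."""
    return {
        "id": "/spark-scheduler",  # hypothetical app id
        "cmd": cmd,                # hypothetical start command inside the image
        "cpus": cpus,
        "mem": mem,                # memory in MiB
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {"image": image, "network": "HOST"},
        },
    }


def spark_worker_conf(mesos_master, image):
    """Spark properties so that worker tasks on Mesos run in Docker too."""
    return {
        "spark.master": mesos_master,
        "spark.mesos.executor.docker.image": image,
    }


# Placeholder values, for illustration only.
app = spark_scheduler_app("example/spark:latest", "./run-scheduler.sh")
conf = spark_worker_conf("mesos://zk://zk.example.com:2181/mesos",
                         "example/spark:latest")

print(json.dumps(app, indent=2))
print(json.dumps(conf, indent=2))
```

The printed app definition is the kind of JSON document that would be posted to Marathon's `/v2/apps` endpoint; the properties would go into the Spark configuration of the job being submitted.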

[Diagram: Schedulers and Workers]

Launch Spark

• Install Spark from the DCOS Universe

• Install HDFS from the DCOS Universe

DEMO

Run TeraSort benchmark

• Submit TeraGen to generate 100m of data

• Submit TeraSort to sort 100m of data

• Submit TeraValidate to validate TeraSort output

• Resize cluster w/ Auto Scaling Groups
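The generate, sort, and validate steps above can be mimicked at toy scale in plain Python. This is only a single-process sketch of what the three jobs do, not the benchmark code run in the demo: it assumes the usual TeraSort record layout of 100-byte rows with a 10-byte key, and the function names are made up for illustration.

```python
import random

KEY_LEN = 10     # bytes of sort key per record
VALUE_LEN = 90   # bytes of payload per record (100-byte rows in total)


def teragen(num_records, seed=0):
    """Generate records with random keys and the row id as payload."""
    rng = random.Random(seed)
    records = []
    for i in range(num_records):
        key = bytes(rng.randrange(256) for _ in range(KEY_LEN))
        value = str(i).encode("ascii").ljust(VALUE_LEN, b".")
        records.append(key + value)
    return records


def terasort(records):
    """Sort records by their key prefix."""
    return sorted(records, key=lambda r: r[:KEY_LEN])


def teravalidate(records):
    """Check that keys are in non-decreasing order."""
    return all(a[:KEY_LEN] <= b[:KEY_LEN]
               for a, b in zip(records, records[1:]))


data = teragen(1000)
output = terasort(data)
print("records:", len(output), "sorted:", teravalidate(output))
```

In the real benchmark each step is a distributed job reading from and writing to HDFS, so the sort involves a shuffle across workers rather than a single in-memory `sorted` call.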

DEMO
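For the resize step, one possible approach is to change the desired capacity of the Auto Scaling Group that holds the worker nodes. The sketch below only builds the AWS CLI command for that and does not run it; the group name and target size are placeholders, and the talk does not say which tool was used in the demo.

```python
def resize_command(group_name, desired_capacity):
    """Build the AWS CLI call that sets an Auto Scaling Group's size."""
    if desired_capacity < 0:
        raise ValueError("desired_capacity must be non-negative")
    return [
        "aws", "autoscaling", "set-desired-capacity",
        "--auto-scaling-group-name", group_name,
        "--desired-capacity", str(desired_capacity),
    ]


# Hypothetical group name and size, for illustration only.
cmd = resize_command("mesos-workers", 10)
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute it, assuming the AWS CLI is installed and has credentials. New instances would then need to register with the Mesos master before Spark can use them.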

Use Cases

• Analytics

  • Data warehousing (Spark SQL, with Hive compatibility)

  • One-off analysis (copy to S3)

• Machine learning

  • Spark includes the excellent Machine Learning Library (MLlib)

• Stream processing

  • Spark Streaming w/ Kafka

Questions?

EOF
