LetsHang Ali S. AlShehab Insight Data Engineering, 2015


Page 1: Ali AlShehab Insight Demo LetsHang

LetsHang

Ali S. AlShehab Insight Data Engineering, 2015

Page 2: Ali AlShehab Insight Demo LetsHang

MOTIVATION & DEMO

“Wish I knew you were there”

A tool that helps friends hang out

Real-time location map

Batch Analysis: When are my friends available?
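One way the batch question could be answered is to count, per friend, how often they appear at each hour of the day and report the most frequent hours. The sketch below is a minimal illustration under assumptions: the `(user_id, hour_of_day)` record shape and the function name `busiest_hours` are hypothetical, and treating "most records" as "available" is an interpretation, not something the deck states.

```python
from collections import Counter, defaultdict

def busiest_hours(records, top_n=1):
    """Map each user to the hour(s) of day with the most location records.

    records: iterable of (user_id, hour_of_day) pairs. This schema is an
    assumption for illustration; the deck does not show the real one.
    """
    per_user = defaultdict(Counter)
    for user_id, hour in records:
        per_user[user_id][hour] += 1
    # most_common returns (hour, count) pairs, highest count first.
    return {
        user: [hour for hour, _ in counts.most_common(top_n)]
        for user, counts in per_user.items()
    }

sample_records = [("u1", 18), ("u1", 18), ("u1", 9), ("u2", 12)]
availability = busiest_hours(sample_records)
print(availability)  # prints {'u1': [18], 'u2': [12]}
```

In a real batch job the same grouping would run over the full history rather than an in-memory list.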

Page 3: Ali AlShehab Insight Demo LetsHang

PIPELINE

[Pipeline diagram. Components: User Data, Message Broker, Camus, HDFS, Batch Processing, Stream Processing, Data Store, API, UI]
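To make the streaming path more concrete, the sketch below mimics in plain Python what a stream-processing step might do with messages from the broker: keep only the most recent location per user, which is what a real-time map would read. The message fields (`user_id`, `lat`, `lon`, `ts`) and the function `latest_locations` are hypothetical; the deck does not show the actual schema or bolt logic, and an in-memory dict stands in for the data store.

```python
import json

def latest_locations(messages):
    """Return {user_id: record}, keeping the newest record per user.

    messages: iterable of JSON strings. The field names are assumptions
    for illustration, not the deck's actual schema.
    """
    store = {}  # stands in for the streaming table in the data store
    for raw in messages:
        rec = json.loads(raw)
        current = store.get(rec["user_id"])
        # Overwrite only if this record is at least as new as the stored one.
        if current is None or rec["ts"] >= current["ts"]:
            store[rec["user_id"]] = rec
    return store

sample = [
    json.dumps({"user_id": "u1", "lat": 37.42, "lon": -122.08, "ts": 100}),
    json.dumps({"user_id": "u2", "lat": 37.77, "lon": -122.41, "ts": 101}),
    json.dumps({"user_id": "u1", "lat": 37.44, "lon": -122.16, "ts": 105}),
]
result = latest_locations(sample)
print(result["u1"]["ts"])  # prints 105
```

Keying the store by user means late or duplicate messages cannot replace a newer position.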

Page 4: Ali AlShehab Insight Demo LetsHang

DATA FLOW

[Figure panels: Generated Data; Cassandra Table (Batch); Cassandra Table (Streaming)]

Page 5: Ali AlShehab Insight Demo LetsHang

METRICS

Storm [Real-time Latency]:

Process ID     Latency (ms)
Kafka Spout    18.476
_Acker         0.007
My Bolt        6.863
Total          25.339

Kafka Manager [Throughput]:

Rate             Mean      1 Min
Bytes in /sec    1.25 m    1.3 m
Bytes out /sec   3.75 m    3.9 m

[4.68 GB/Hr]
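As a quick check of these figures: the bracketed hourly volume follows from the 1-minute inbound rate, assuming "m" means megabytes and decimal units (1 GB = 1000 MB), and the reported Storm total equals the sum of the spout and bolt latencies. A minimal sketch of that arithmetic, with illustrative variable names:

```python
# Assumption: "m" in the Kafka Manager figures means megabytes (decimal units).
bytes_in_mb_per_sec = 1.3   # 1-minute inbound rate from the slide
seconds_per_hour = 3600

gb_per_hour = bytes_in_mb_per_sec * seconds_per_hour / 1000
print(f"{gb_per_hour:.2f} GB/hr")  # prints 4.68 GB/hr

# Storm latencies in milliseconds, as reported on the slide.
latency_ms = {"Kafka Spout": 18.476, "_Acker": 0.007, "My Bolt": 6.863}
spout_plus_bolt = latency_ms["Kafka Spout"] + latency_ms["My Bolt"]
print(f"{spout_plus_bolt:.3f} ms")  # prints 25.339 ms
```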

Page 6: Ali AlShehab Insight Demo LetsHang

CLUSTER SETUP

HDFS Name Node, Kafka Broker, Spark Master, Storm Nimbus, Data Consumer

HDFS Data Node 1, Kafka Broker, Data Producer, Spark Worker, Storm Supervisor

HDFS Data Node 2, Kafka Broker, Spark Worker, Storm Supervisor

HDFS Data Node 3, Kafka Broker, Spark Worker, Storm Supervisor

Cassandra Seed

Cassandra

Cassandra

m4.large

m3.medium

Page 7: Ali AlShehab Insight Demo LetsHang

CHALLENGES

•  Stitching technologies together:
    •  Pyleus Framework [Kafka – Storm]
    •  Cassandra Driver
    •  PySpark TargetHolding Package

•  Memory monitoring and allocation

•  Front-End rendering optimization

Page 8: Ali AlShehab Insight Demo LetsHang

ABOUT ME

B.Sc. in EECS – MIT

M.Eng. in EECS – MIT