35
Introduction to Zeppelin Lee moon soo, NFLabs [email protected]

Apache Zeppelin 소개

  • Upload
    kslug

  • View
    1.464

  • Download
    2

Embed Size (px)

Citation preview

Page 1: Apache Zeppelin 소개

Introduction to Zeppelin

Lee moon soo, NFLabs [email protected]

Page 2: Apache Zeppelin 소개

me

• Lee moon soo, aka. ‘moon’

[email protected] Committer and PPMC of Zeppelin

[email protected] Co-founder of NFLabs

Page 3: Apache Zeppelin 소개

Agenda

• Why do you like Zeppelin?

• Why your project likes Zeppelin?

• Roadmap

• QnA

Page 4: Apache Zeppelin 소개

What is Zeppelin?

Page 5: Apache Zeppelin 소개

Let’s see demo

Page 6: Apache Zeppelin 소개

How Zeppelin started

Page 7: Apache Zeppelin 소개

What gets measured, gets managed - Peter Drucker

A price of light is less than the cost of darkness - Arthur C. Nielsen , Founder of ACNielsen

War is ninety percent information - Napoleon Bonaparate

In God we trust, all others must bring data - W. Edwards Deming

Data = Understanding

Page 8: Apache Zeppelin 소개

For data analysis, we need tool

Page 9: Apache Zeppelin 소개

But couldn’t find one i like

ImpalaDrill

Hive tajoPig

Cloudera-ML

MLLib

MRQL

Page 10: Apache Zeppelin 소개

Decided to make one

Really good one

Page 11: Apache Zeppelin 소개

Good analytics environment

Analytical language Many Libraries Interactive Visualization Sharing

Page 12: Apache Zeppelin 소개

The First attempt 2012~2013

Page 13: Apache Zeppelin 소개

It’s got graphic REPL, deployment, search, import tool

But failed, because

Page 14: Apache Zeppelin 소개

It wasn’t widely used It wasn’t opensource

Page 15: Apache Zeppelin 소개

Second attempt 2013~2014

Opensourced graphic REPL from commercial product

The first version of Zeppelin

Page 16: Apache Zeppelin 소개

Second attempt 2013~2014

Not widely used

It was slow, difficult to use, …

Page 17: Apache Zeppelin 소개

Third attempt 2014~

After few weeks of study, decided to rewrite Zeppelin

Graphic REPl -> Notebook with Apache Spark integration

Page 18: Apache Zeppelin 소개

Third attempt 2014~

Next week, beautifulized

Page 19: Apache Zeppelin 소개

Why do you like Zeppelin?

Page 20: Apache Zeppelin 소개

Web Based

web framework

d3.js

Visualization

Language

Package management

bower

Build

Page 21: Apache Zeppelin 소개

Notebook

Code

Result

Code

Result

결과가 텝과 뉴라인으로 구분된 테이블 형식이면 자동으로 테이블로 포메팅

Notebook

Page 22: Apache Zeppelin 소개

Notebook

Data Visualize

Page 23: Apache Zeppelin 소개

Pivot

Pivot

Page 24: Apache Zeppelin 소개

Dynamic Form Creation

Dynamic Form

Page 25: Apache Zeppelin 소개

REST

Web socket

Synchronized

Sharing

Page 26: Apache Zeppelin 소개

Zeppelin simplifies data analysis

Page 27: Apache Zeppelin 소개

Why do your project likes Zeppelin?

Page 28: Apache Zeppelin 소개

….

Interpreters Spark PySpark SparkSQL Hive Mysql (JDBC) Markdown Shell

Easy to extend

Page 29: Apache Zeppelin 소개

Zeppelin Interpreter Architecture

Classloader

InterpreterGroup

Interpreter Interpreter

Server

Client

HTTP Rest / Websocket

Classloader

InterpreterGroup

Spark SparkSQL Dep

Page 30: Apache Zeppelin 소개

Classloader

InterpreterGroup

Interpreter Interpreter

Server

Client

HTTP Rest / Websocket

InterpreterGroup

Interpreter Interpreter …

Seprate JVM process

Thrift

Zeppelin Interpreter Architecture

Page 31: Apache Zeppelin 소개

public abstract void open();

public abstract void close();

public abstract InterpreterResult interpret(String st, InterpreterContext context);

public abstract void cancel(InterpreterContext context);

public abstract int getProgress(InterpreterContext context);

public abstract List<String> completion(String buf, int cursor);

public abstract FormType getFormType();

public Scheduler getScheduler();

Implementing new Interpreter

Must have

Good to have

More controls

Page 32: Apache Zeppelin 소개

Roadmap• Integration with more distributed processing framework • Flink, Ignite, Tajo, etc..

• Output message streaming • Ability to create rich GUI

Page 33: Apache Zeppelin 소개

ImpalaDrill

Hive tajoPig

Cloudera-ML

MLLib

MRQL

Page 34: Apache Zeppelin 소개

QnA