Wissbi A toolset for distributed event processing http://lunastorm.github.io/wissbi/

Page 1: Wissbi osdc pdf

Wissbi: A toolset for distributed event processing

http://lunastorm.github.io/wissbi/

Page 2: Wissbi osdc pdf

Batch Processing

http://www.yilan-local-life-news.org.tw/joinus_6/eq-detail.php?idNo=20

Page 3: Wissbi osdc pdf

Streaming Processing

Page 4: Wissbi osdc pdf

Successful Stories

• "Wissbi is an easy-to-use tool for distributed event processing" -- Scott Wang, Zillians

• "Wissbi provides great utilities for logging, debugging, and monitoring your distributed system" -- an engineer surnamed Wang, formerly of Trend Micro

• "Wissbi lets you easily manage your data workflow in a pipe and filter style with intuitive commands" -- lunastorm, Open Source Developer

Page 5: Wissbi osdc pdf

Wissbi: Past and Present

• Wissbi is a highly available and scalable distributed message routing framework like ZeroMQ that took a different path from traditional MQ middlewares.

• The toolkit supports an elegant and intuitive integration model, making event-driven applications as easy to write as "$ tail -f log | grep error > error.log".

• Scott Wang a.k.a. lunastorm, Sr. Engineer at Zillians

• Trend Message Exchange (TME) is a highly available and scalable distributed message routing framework that took a different path from traditional MQ middlewares.

• TME's client toolkit, MIST, supports an elegant and intuitive integration model, making event-driven applications as easy to write as "$ tail -f log | grep error > error.log".

• Scott Wang a.k.a. lunastorm, Sr. Engineer at Trend Micro

Page 6: Wissbi osdc pdf

Origin

Page 7: Wissbi osdc pdf

I need a messaging system

Page 8: Wissbi osdc pdf
Page 9: Wissbi osdc pdf
Page 10: Wissbi osdc pdf
Page 11: Wissbi osdc pdf

Losses exceed 40 million; disbanding not ruled out

Page 12: Wissbi osdc pdf

Deployment Guide

System Requirements

• A working ZooKeeper deployment; a version newer than 3.3.2 is required
• Hostname and hostname resolution must be set correctly on each machine. Running hostname -i should return an IP other than 127.0.0.1
• Create a user account TME on each machine, for example, by executing useradd -m TME
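The hostname requirement is easy to check before installing anything. The following is a minimal sketch, not part of the TME packages; the function name is_routable_ip is made up for illustration, and it only looks at the first address that hostname -i prints.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with TME): succeed only if the given
# address is neither empty nor a loopback address.
is_routable_ip() {
    case "$1" in
        ""|127.*|::1) return 1 ;;  # empty or loopback: resolution is misconfigured
        *)            return 0 ;;
    esac
}

# hostname -i may print several addresses; check the first one.
ip=$(hostname -i 2>/dev/null | awk '{print $1}') || ip=""

if is_routable_ip "$ip"; then
    echo "OK: hostname resolves to $ip"
else
    echo "WARNING: hostname resolves to '$ip'; fix /etc/hosts or DNS first"
fi
```

Run it on every machine of the deployment; any WARNING line points at a host whose name resolution needs fixing first.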

Standalone Deployment

Standalone means that all of the TME components (including ZooKeeper) are deployed on the same host. This is mostly for development purposes. Because all default configurations work in standalone mode, you only have to install the packages and bring up the daemons to test and develop.

CentOS

You will need the following 3rd party packages which are not in the official repository:

1. jdk (get RPM from http://www.oracle.com/technetwork/java/javase/downloads/index.html)
2. monit (http://pkgs.org/search/?keyword=monit)
3. nodejs (for portal, http://pkgs.org/search/?keyword=nodejs)
4. ruby (for portal, https://github.com/lunastorm/ruby19_centos/downloads; must be at least 1.9.2, and you can build it from https://github.com/imeyer/ruby-1.9.2-rpm)
5. ruby-bundler (for portal, https://github.com/lunastorm/ruby19_centos/downloads)

Then you can follow the following steps to install:

1. Install Sun's JDK first
2. Download the dependency RPMs mentioned above
3. Download the TME RPM binaries and place them in the same folder as the dependencies
4. yum --nogpgcheck install *.rpm

Ubuntu

1. Grab all the deb files you would like to install
2. sudo dpkg -i tme-*.deb
3. sudo apt-get update
4. sudo apt-get -f install

TME web portal only supports Ruby 1.9.2+, and Ubuntu 10.04 only ships Ruby 1.9.1

Follow these steps to install Ruby 1.9.2 with RVM:

1. aptitude install build-essential libssl-dev libreadline5 libreadline5-dev zlib1g zlib1g-dev
2. bash -s stable < <(curl -s https://raw.github.com/wayneeseguin/rvm/master/binscripts/rvm-installer)
3. source /etc/profile.d/rvm.sh
4. rvm install 1.9.2 ; rvm default 1.9.2
5. Edit /opt/trend/tme/conf/portal-web/portal-web-conf.sh , add "source /etc/profile.d/rvm.sh"

The web portal requires a JavaScript runtime installed. For example, you can install Node.js on the machine.

On All Distributions

Finally, execute /opt/trend/tme/bin/create_zookeeper_nodes.sh 127.0.0.1:2181 /tme2 to initialize the essential information on ZooKeeper before you start the components.

Distributed Deployment

You can choose to deploy different components on different machines of different hardware specs. Typically, the brokers are required to be deployed on more powerful machines than others. They will handle a large amount of client connections and deliver messages so they consume more memory and use more CPU. On the other hand, the clients need not use too much computing power, but it depends on the applications' needs. The administration packages can be deployed on multiple machines for redundancy.

1. First, you will probably have a distributed ZooKeeper set up, for example, zk1.mydomain:2181,zk2.mydomain:2181,zk3.mydomain:2181

2. Follow the same way described in the standalone guide above to install the packages on the machines of your choice.

3. Initialize the essential information on ZooKeeper. Execute:

/opt/trend/tme/bin/create_zookeeper_nodes.sh zk1.mydomain:2181,zk2.mydomain:2181,zk3.mydomain:2181 /tme_root_prefix

The first argument is the ZooKeeper quorum, and the second argument is a prefix path on ZooKeeper chosen by you. Separating the prefix paths on ZooKeeper lets multiple TME environments coexist in one ZooKeeper deployment. Remember these two

Service Start / Stop

You can choose one of the following ways to start or stop the components.

service and chkconfig

1. sudo service tme-broker {start / stop / restart}
2. sudo service tme-mistd {start / stop / restart}
3. sudo service tme-graph-editor {start / stop / restart}
4. sudo service tme-portal-collector {start / stop / restart}
5. sudo service tme-portal-web {start / stop / restart}

If you wish to start the services upon boot, then you can use chkconfig to turn on the services.

monit

If you have installed and enabled monit, then you can use it to ensure the services are running.

Under the configuration folders of the components, there are monit watchdog scripts that can be modified to fulfill your need, for example, send a notification when a daemon stops working.

You can use the helper scripts to start / stop the components:

1. sudo /opt/trend/tme/bin/{install|remove}_tme-broker.sh
2. sudo /opt/trend/tme/bin/{install|remove}_tme-mistd.sh
3. sudo /opt/trend/tme/bin/{install|remove}_tme-graph-editor.sh
4. sudo /opt/trend/tme/bin/{install|remove}_tme-portal-collector.sh
5. sudo /opt/trend/tme/bin/{install|remove}_tme-portal-web.sh

Verification

Messaging

After MIST daemon is started and configured correctly, execute mist-session --list to show session information:

$ mist-session -l
0 sessions
0 connections

You should get a response like the one above. If the MIST daemon is not started correctly, you may receive the following error response:

$ mist-session -l
Error connecting to MIST daemon!

If the MIST daemon is running correctly, you can send your first Hello World message. Execute the following to send a message to a queue named test:

$ session_id=`mist-session` && echo 'Hello World!' | mist-encode --wrap test --line | mist-sink $session_id --attach ; mist-session --destroy $session_id
destroyed 1453444792

Then execute the following to receive one message from the queue named test:

$ session_id=`mist-session` && mist-source $session_id --mount test && mist-source $session_id --attach --limit 1 | mist-decode --line ; mist-session --destroy $session_id
exchange queue:test mounted
Hello World!
destroyed 1453444796

Congratulations! You can now transmit messages.

Portal

Open a browser to access http://**portal.host**:**portal.port** to see if it shows correctly.

Graph Editor

I don’t always read the deployment guide

When I screw up something

I call the developers

Page 13: Wissbi osdc pdf
Page 14: Wissbi osdc pdf

Rethink Your Problem

Page 15: Wissbi osdc pdf

X

X

Do I really need a broker?

Do I really need synchronization?

Page 16: Wissbi osdc pdf

What Do I Need?

• A message publisher

• A message subscriber

• A directory service for publishers to know where the subscribers are

Page 17: Wissbi osdc pdf

[Diagram: a Directory Service table with the columns Name and Address, holding the names Foo and Bar with no addresses yet, next to a Message Subscriber.]

Page 18: Wissbi osdc pdf

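[Diagram: the Directory Service table (columns Name and Address, names Foo and Bar) now holds the address 192.168.0.1:5566 of a Message Subscriber. A Message Publisher asks the Directory Service to "Get all subscribers", then uses "Send Message" to reach the subscriber.]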

Page 19: Wissbi osdc pdf

Directory Services

• DNS

  • First attempt

• LDAP

  • $&@?!&@!?

• ...

Page 20: Wissbi osdc pdf

What if

I use directory for directory service?

Page 21: Wissbi osdc pdf

Using Filesystem for Directory Service

• Store metadata on the filesystem

• Follows the philosophy "Everything is a file"

• Use standard Unix commands to manage it

• mkdir, touch, rm, ln, ...

• Just like /proc and /sys
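To make the idea concrete, here is a minimal sketch of a directory service built from nothing but mkdir, touch, rm, and ls. The layout (one directory per name, one empty file per subscriber address) and the function names are assumptions for illustration, not necessarily how Wissbi lays out its metadata.

```shell
#!/bin/sh
# Hypothetical layout, assumed for this sketch only:
#   <metadata dir>/<name>/<host:port>   (an empty file marks one subscriber)

# register_subscriber META_DIR NAME ADDRESS
register_subscriber() {
    mkdir -p "$1/$2" && touch "$1/$2/$3"
}

# unregister_subscriber META_DIR NAME ADDRESS
unregister_subscriber() {
    rm -f "$1/$2/$3"
}

# list_subscribers META_DIR NAME: print one address per line
list_subscribers() {
    ls "$1/$2" 2>/dev/null
}

# Demo in a throwaway metadata directory.
meta=$(mktemp -d)
register_subscriber "$meta" Foo 192.168.0.1:5566
register_subscriber "$meta" Foo 192.168.0.2:5566
list_subscribers "$meta" Foo          # both addresses, one per line
unregister_subscriber "$meta" Foo 192.168.0.2:5566
```

A publisher's "get all subscribers" step is then just a directory listing, and an operator can inspect or repair the state with the same commands.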

Page 22: Wissbi osdc pdf

Using Filesystem for Directory Service

• You get authorization for free

• chown, chmod, ...

• Good for testing

• Launch any number of clusters just by using different metadata directories
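Both points can be tried out with standard tools. The sketch below assumes a made-up layout of <metadata dir>/<name>/<host:port> and a hypothetical helper mode_of; it only shows that two metadata directories are fully isolated and that ordinary permission bits decide who may change the entries.

```shell
#!/bin/sh
# Hypothetical layout, assumed for illustration: <metadata dir>/<name>/<host:port>

# Each "cluster" is just its own metadata directory.
cluster_a=$(mktemp -d)
cluster_b=$(mktemp -d)

mkdir -p "$cluster_a/Foo" "$cluster_b/Foo"
touch "$cluster_a/Foo/192.168.0.1:5566"
touch "$cluster_b/Foo/192.168.0.1:7788"

# Authorization with plain Unix permissions: only the owner may add or
# remove entries under Foo in cluster A; everyone else may only list them.
chmod 755 "$cluster_a/Foo"

# Print the permission string of a directory, e.g. drwxr-xr-x
mode_of() {
    ls -ld "$1" | cut -c1-10
}

echo "cluster A: $(ls "$cluster_a/Foo")"
echo "cluster B: $(ls "$cluster_b/Foo")"
echo "mode of A/Foo: $(mode_of "$cluster_a/Foo")"
```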

Page 23: Wissbi osdc pdf

Distributed, you say?

Page 24: Wissbi osdc pdf

[Diagram: the Directory Service table (Name: Foo, Address: 192.168.0.1:5566) is shared between the Message Subscriber and the Message Publisher over NFS.]

Page 25: Wissbi osdc pdf

A management approach even your grandma could use (if your grandma can use Unix...)

Page 26: Wissbi osdc pdf

Related Work

Page 27: Wissbi osdc pdf

Plumber (Program)

• The plumber, in the Plan 9 from Bell Labs and Inferno operating systems, is a mechanism for reliable uni- or multicast inter-process communication of formatted textual messages. It uses the Plan 9 network file protocol, 9p, rather than a special-purpose IPC mechanism.

• Any number of clients may listen on a named port (a file) for messages. Ports and port routing are defined by plumbing rules. These rules are dynamic. Each listening program receives a copy of matching messages. For example, if the data /sys/lib/plumb/basic is plumbed with the standard rules, it is sent to the edit port. The port will write a copy of the message to each listener. In this case, all running editors will interpret this message as a file name, and open the file.

• The plumber is the 9P file server that provides this service. Clients may use libplumb to format messages. Since the messages are 9P, they are network transparent.

http://en.wikipedia.org/wiki/Plumber_(program)

Page 28: Wissbi osdc pdf

Design Philosophy

Page 29: Wissbi osdc pdf

Minimum Dependency

$ ldd wissbi-pub
linux-vdso.so.1
libpthread.so.0
libstdc++.so.6
libm.so.6
libgcc_s.so.1
libc.so.6
/lib64/ld-linux-x86-64.so.2

The only dependency is a compiler which supports C++11!

Page 30: Wissbi osdc pdf

Easy to Test

Slacking off QQ

Page 31: Wissbi osdc pdf

Basic Commands

• wissbi-sub [message source]

• wissbi-pub [message destination]

Page 32: Wissbi osdc pdf

Live Demo

http://ascii.io/a/2913

Page 33: Wissbi osdc pdf

[Diagram: wissbi-sub --Pipe--> Filter --Pipe--> wissbi-pub]

Write your filter with any language you like
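Because a filter only reads lines on stdin and writes lines on stdout, it can be tested without any messaging in place. Below is a minimal sketch in plain sh; the name error_filter is made up, and the wiring shown in the comment simply combines the wissbi-sub and wissbi-pub commands from the Basic Commands slide with assumed source and destination names.

```shell
#!/bin/sh
# Hypothetical filter: pass through only the lines that contain "error".
# Assumed wiring in a real pipeline (names test.in / test.out are examples):
#   wissbi-sub test.in | error_filter | wissbi-pub test.out
error_filter() {
    while IFS= read -r line; do
        case "$line" in
            *error*) printf '%s\n' "$line" ;;
        esac
    done
}

# Local test without any messaging involved.
printf 'all good\nerror: disk full\n' | error_filter
```

The same function body could just as well be a script in Python, Ruby, or anything else that can read stdin line by line.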

Page 34: Wissbi osdc pdf

[Diagram: three stages, each of the form wissbi-sub --Pipe--> Filter --Pipe--> wissbi-pub.]

Page 35: Wissbi osdc pdf

You Will Have Multiple Data Pipelines

[Diagram: six sub -> Filter -> pub stages arranged into multiple pipelines.]

Page 36: Wissbi osdc pdf

Daemonize Your Filters!

• Programming with config

• Daemon start / stop / restart / status

• start / stop upon boot / shutdown

• Watchdogs

Page 37: Wissbi osdc pdf

<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Mort Bay Consulting//DTD Configure//EN" "http://jetty.mortbay.org/configure.dtd">

<!-- =============================================================== -->
<!-- Configure the Jetty Server                                      -->
<!--                                                                 -->
<!-- Documentation of this file format can be found at:              -->
<!-- http://docs.codehaus.org/display/JETTY/jetty.xml                -->
<!--                                                                 -->
<!-- =============================================================== -->

<Configure id="Server" class="org.mortbay.jetty.Server">

    <!-- =========================================================== -->
    <!-- Server Thread Pool                                          -->
    <!-- =========================================================== -->
    <Set name="ThreadPool">
      <!-- Default bounded blocking threadpool -->
      <New class="org.mortbay.thread.BoundedThreadPool">
        <Set name="minThreads">10</Set>
        <Set name="maxThreads">50</Set>
        <Set name="lowThreads">25</Set>
      </New>

      <!-- New queued blocking threadpool : better scalability
      <New class="org.mortbay.thread.QueuedThreadPool">
        <Set name="minThreads">10</Set>
        <Set name="maxThreads">25</Set>
        <Set name="lowThreads">5</Set>
        <Set name="SpawnOrShrinkAt">2</Set>
      </New>
      -->

Programming with Config: the Java edition

Page 38: Wissbi osdc pdf

# How many filter instances will be run in parallel
WISSBI_FILTER_COUNT="1"

# How to run the filter
WISSBI_FILTER_CMD="sed --unbuffered -e \"s/^/[ / ; s/$/ ]/\""
# If you run multiple instances in parallel, you can use the instance id \$i in the command
# WISSBI_FILTER_CMD="sed --unbuffered -e \"s/^/\$i: [ / ; s/$/ ]/\""

# The message source's name, leave it empty if the filter is a message generator
WISSBI_FILTER_SOURCE="test.in"

# The message sink's name, leave it empty if the filter is a message terminal
WISSBI_FILTER_SINK="test.out"

WISSBI_FILTER_LOG_PREFIX="/tmp/filter-example"
WISSBI_FILTER_PID_PREFIX="/tmp/filter-example"

# If WISSBI_DEBUG_DUMP is set, message recording will be enabled, and 50 messages
# before the filter is terminated is dumped to the specified file.
# If WISSBI_DEBUG_DUMP is set to empty, then a random dump filename will be used
#WISSBI_DEBUG_DUMP=""

wissbi filter example
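The filter command in this example can be tried on its own to see what it does to each message. The sketch below wraps the same sed expressions in two shell functions; the function names are made up, the GNU-only --unbuffered flag is left out so it also runs with other sed implementations, and passing the instance id as an argument is an assumption of this sketch (the config uses \$i instead).

```shell
#!/bin/sh
# Same substitution as WISSBI_FILTER_CMD: wrap every line in "[ " and " ]".
bracket_filter() {
    sed -e 's/^/[ /; s/$/ ]/'
}

# Variant with an instance id prefix, mirroring the commented-out command.
# Here the id is passed as $1 (an assumption made for this sketch).
tagged_bracket_filter() {
    sed -e "s/^/$1: [ /; s/\$/ ]/"
}

printf 'hello\nworld\n' | bracket_filter
# prints:
# [ hello ]
# [ world ]

printf 'hello\n' | tagged_bracket_filter 2
# prints:
# 2: [ hello ]
```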

Page 39: Wissbi osdc pdf

The easiest way to build a distributed pipeline

Page 40: Wissbi osdc pdf

Live Demo: A Calculator Service

http://ascii.io/a/2914

Page 41: Wissbi osdc pdf

The Wissbi Ecosystem

Page 42: Wissbi osdc pdf

Log Collector

#!/bin/sh
WISSBI_FILTER_COUNT="1"
if [ -e "./wissbi_log_collector.sh" ]
then
    WISSBI_FILTER_CMD="./wissbi_log_collector.sh"
else
    WISSBI_FILTER_CMD="/usr/bin/wissbi_log_collector.sh"
fi
WISSBI_FILTER_SOURCE=""
WISSBI_FILTER_SINK="wissbi.log"
WISSBI_FILTER_LOG_PREFIX="/tmp/wsblogcollectord"
WISSBI_FILTER_PID_PREFIX="/tmp/wsblogcollectord"
. /usr/bin/wissbi_filter_template.sh

Page 43: Wissbi osdc pdf

[Diagram: machines with /var/log/syslog and /var/log/postfix, and machines with /var/log/apache/*, each run "tail -F | collector | wissbi-pub" to publish log lines to wissbi.log, which in turn feeds solr / splunk / hadoop / ...]
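The collector stage in the middle of that pipe can be as small as a loop that labels each line with where it came from. The following is a hypothetical stand-in, not the actual wissbi_log_collector.sh; the output format (host, source file, original line) and the host name web01 are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical collector: prefix every log line with a host name and the
# log file it was read from, so downstream consumers can tell sources apart.
# Assumed wiring, following the diagram:
#   tail -F /var/log/syslog | collector "$(hostname)" /var/log/syslog | wissbi-pub wissbi.log
collector() {
    while IFS= read -r line; do
        printf '%s %s %s\n' "$1" "$2" "$line"
    done
}

# Local test with a fixed host name instead of a live tail.
printf 'GET /index.html 200\n' | collector web01 /var/log/apache/access.log
```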

Page 44: Wissbi osdc pdf

Metric Collector

• Ganglia Integration

• Write message count into rrd files

• Also uses wissbi as the transporting tool

• Demo if we have time...

Page 45: Wissbi osdc pdf
Page 46: Wissbi osdc pdf

Very useful metric to scale your server farm

http://aws.amazon.com

Page 47: Wissbi osdc pdf

Current Limitation

• Message size < 4k

• One listening port is occupied per consumer
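Given the size limit, it can help to guard a pipeline with a filter that drops oversized lines before they are published. The sketch below is hypothetical: it assumes one message per line and reads "4k" as 4096 bytes without the trailing newline, which may differ from how Wissbi actually counts.

```shell
#!/bin/sh
# Hypothetical guard: pass through only lines shorter than the limit
# (default 4096 bytes) and report the number of dropped lines on stderr.
# LC_ALL=C makes awk count bytes rather than multi-byte characters.
drop_oversized() {
    LC_ALL=C awk -v max="${1:-4096}" '
        length($0) < max { print; next }
        { dropped++ }
        END { if (dropped) print dropped " oversized message(s) dropped" > "/dev/stderr" }
    '
}

# Local test: the short line passes through unchanged.
printf 'short message\n' | drop_oversized
```

Placed just before wissbi-pub in a pipe, such a guard keeps one oversized record from becoming a delivery problem further down the line.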

Page 48: Wissbi osdc pdf

Platform Support

• Ubuntu 12.04 with gcc 4.6.3

• BSD branch with clang++ on the way...

• OSX (waiting on the one above)

• Windows

• Wat?

Page 49: Wissbi osdc pdf

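Conclusion

• The poor man's abalone -- jiukong (small abalone)

• The poor man's bird's nest -- white fungus

• The poor man's atomic bomb -- biochemical weapons

• The poor man's Message Bus -- Wissbi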

• Minimize operating effort and cost

• Maximize workflow flexibility

Page 50: Wissbi osdc pdf

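              TME       Wissbi

Spokesperson  None      Chow Yun-fat, Wu Bai, labor heroes

Language      Java      Energy-saving, carbon-cutting C++

Operation     Complex   Simple

Version       2.5       A humble 0.1

Compiled by the Apple Dog Daily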

Page 51: Wissbi osdc pdf

Thank You