51
Getting Rid of Zookeeper MesosCon Asia 2016 1 Jay Guo Software Developer @IBM [email protected] Kapil Arya Mesos Committer @Mesosphere [email protected]

Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Embed Size (px)

Citation preview

Page 1: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Getting Rid of Zookeeper

MesosCon Asia 2016

1

Jay GuoSoftware Developer @[email protected]

Kapil AryaMesos Committer @Mesosphere

[email protected]

Page 2: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Motivation

2

Page 3: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Zookeeper is...

● Mature● Feature-rich● ...

3

Page 4: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

But!

● Primitive K/V store○ Provide your own tooling for other abstractions!

● Heavy● Hard dependencies● Language binding instead of RESTfull API● ...

4

Page 5: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

It’s all about having options!

5

● Chocolate

● Strawberry

● Vanilla

● ...

Page 6: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA:An Overview

6

Page 7: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

High Availability

First of all, we need a Distributed Key-Value storage...

7

Page 8: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA

● At least three Mesos Masters● One leading Master

○ Leader election○ Leader detection

● Replicated Log

Zookeeper as the distributed key-value store

8

ZKZKZK

MesosMaster

MesosMaster

MesosMaster

leader

Zookeeper cluster

Mesos Agents / Frameworks

Page 9: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Leader Election

● All Masters “contend” to be the leader!

9

ZKZKZK

MesosMaster

MesosMaster

MesosMaster

Contend

Contend Contend

Page 10: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Leader Election

● All Masters “contend” to be the leader!

● Only one succeeds; others fail

10

ZKZKZK

MesosMaster

MesosMaster

MesosMaster

Fail

Fail Success

Page 11: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Leader Election

● All Masters “contend” to be the leader!

● Only one succeeds; others fail ● We have a leading Masters!

11

ZKZKZK

MesosMaster

MesosMaster

MesosMaster

Watch

Watch Hold

Page 12: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Losing a Leader

● Suppose the leading Master is “lost”

12

ZKZKZK

MesosMaster

MesosMaster

MesosMaster

Watch

Watch

Master connection lost

Page 13: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Losing a Leader

● Suppose the leading Master is “lost”

● All other Masters are notified

13

ZKZKZK

MesosMaster

MesosMaster

MesosMaster

Notify

Notify

Master connection lost

Page 14: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Losing a Leader

● Suppose the leading Master is “lost”

● All other Masters are notified● The remaining Masters

contend again

14

ZKZKZK

MesosMaster

MesosMaster

Contend

Contend

MesosMaster

Page 15: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Losing a Leader

● Suppose the leading Master is “lost”

● All other Masters are notified● The remaining Masters

contend again● One of them succeeds

15

ZKZKZK

MesosMaster

MesosMaster

Success

Fail

MesosMaster

Page 16: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Losing a Leader

● Suppose the leading Master is “lost”

● All other Masters are notified● The remaining Masters

contend again● One of them succeeds

● A new leader is elected!

16

ZKZKZK

MesosMaster

MesosMaster

Watch

Hold

MesosMaster

Page 17: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

What about Agents/Frameworks?

17

Page 18: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Leader Detection

● Framework/Agent connects to Zookeeper to “detect” about the current leading Master

18

ZKZKZK

MesosMaster

MesosMaster

MesosMaster

Watch

Watch Hold

Mesos Agents/ Frameworks

Detect

Page 19: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Leader Detection

● Framework/Agent connects to Zookeeper to “detect” about the current leading Master

● Zookeeper provides Master’s location○ I.e. IP:Port

19

ZKZKZK

MesosMaster

MesosMaster

MesosMaster

Watch

Watch Hold

IP:Port

Mesos Agents/ Frameworks

Page 20: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Leader Detection

● Framework/Agent connects to Zookeeper to “detect” the current leading Master

● Zookeeper provides Master’s location

● Framework/Agent connectsto the “leader”

20

ZKZKZK

MesosMaster

MesosMaster

MesosMaster

Watch

Watch Hold

Connect

Mesos Agents/ Frameworks

Page 21: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

What about Replicated Log?

Replicated Log lets you create replicated fault-tolerant append-only logs. The Mesos master uses Replicated Log to store cluster state in a replicated, durable way.

21

Page 22: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Replicated Log

● Each replica registers its pid into ZK and maintain the presence.

22

ZKZKZK

Replica

Replica

Register & hold

Register & hold

Page 23: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Replicated Log

23

● Each replica registers its pid into ZK and maintain the presence.

● When new replica joins the cluster, existing ones get notified and get to know the pid of new replica.

ZKZKZK

Replica

ReplicaReplica

Notified with info of new replica

registerNotified with info of new replica

Page 24: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Mesos HA: Replicated Log

24

● Each replica registers its own pid into ZK and maintain the presence.

● When new replica joins the cluster, existing ones get notified and get to know the pid of new replica.

● Every replica knows all nodes in the cluster and do Paxos.

ZKZKZK

Replica

ReplicaReplica

Paxos

Paxos

Paxos

Page 25: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

ReplacingZookeeper

25

? = ZK Etcd Consul|||| ...

ZKZK?

MesosMaster

MesosMaster

MesosMaster

leader

DistributedKV Store

Page 26: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

● Master Contender for leader election

Three Key Components

26

ZKZK?

MesosMaster

Contender

DistributedKV Store

bool contend();

Page 27: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

● Master Contender for leader election

● Master Detector for discovery

Three Key Components

27

ZKZK?

MesosMaster

ContenderDetector

DistributedKV Store

bool contend();

MasterInfo detect(MasterInfo previous);

Page 28: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

● Master Contender for leader election

● Master Detector for discovery

● PIDGroup for initialization

Three Key Components

28

bool contend();

MasterInfo detect(MasterInfo previous);

void initialize(pid_t pid); ZKZK?

MesosMaster

ContenderDetector PIDGroup

DistributedKV Store

Page 29: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

A Case for Modularization!

29

● Already a clear-cut interfaces between:○ Master and Contender○ Agent and Detector○ Framework and Detector

● For new distributed KV store implementation, we just write the module without having to modify Mesos itself! ZKZK?

MesosMaster

ContenderDetector PIDGroup

DistributedKV Store

Page 30: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Let’s Talk about Modules!

30

Page 31: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

MesosModules

● Module/Plugin/Extension● Add/replace a Mesos component

○ Isolators○ Authenticators○ …

● Hook modules:○ Listen to interesting events○ Modify/enhance certain code paths○ Prepare/enhance task environment○ ...

31

Page 32: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

● Compiled as shared libraries○ E.g., libmesos_network_overlay.so

● Specified when launching Master/Agent/Frameworkmesos-agent.sh <master-parameters>

--modules=file:///path/to/modules.json

--isolation=”my_isolator”

● Gets loaded during initialization○ E.g., the ”my_isolator” isolator will be loaded into the Agent to

provide task isolation

How are Modules Used?

32

Page 33: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Community Modules

33

I just wrote a Mesos module that provides a really cute feature.

How do I make it useful for others!

Page 34: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Modules are Tricky!

34

● Developing● Building● Testing● Using● Hosting

● How can we make it all better for community?

Page 35: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Writing Modules

● Doesn’t require intimate Mesos knowledge○ Just the details of the subsystem being implemented (e.g., Isolators)

● Familiarity with Mesos model is required○ E.g., libprocess, events, futures and promises, etc.

● Closely tied with Mesos version○ To ensure mutual compatibility

35

Page 36: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Building Modules: Issues

● Build Mesos first!○ Install all Mesos dependencies○ Takes a long time to build○ Version dependencies

36

Page 37: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Building Modules: Good News!

● Starting Mesos 1.0 release, pre-compiled Mesos deb/rpm packages contain everything needed to build modules

37

Page 38: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Testing Modules

38

I just wrote a simple Mesos module that provide a cute feature and I know how to build it!

Can I write unit tests for it?

Page 39: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Testing Modules

● Key questions:○ How to get good test coverage?○ How can we solicit help from community?

● Good news!○ Efforts on the way to create a “libmesos_test” library that can be used to

create/run gmock style tests just like with Mesos itself.

39

Page 40: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

How do we, as a community, make third-party modules available for general consumption?

While making sure the developers and consumers can seamlessly test/integrate into their environments!

Community-Driven Modules

40

Page 41: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Community Modules: Proposal

● A central registry that contains pointers:○ E.g., github.com/mesos/modules ○ Each module (or a set of related modules) in its own repository

● Make Mesos version-specific binary rpm/deb modules available○ E.g., lib_my_module_<module-version>_<mesos_version>.so

41

Page 42: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Module CI: Coming Soon!

● Builds binary packages for every registered module○ Across a given set of Mesos versions○ Work-in-progress!

● Automatic build/testing for upcoming Mesos release○ Catch incompatibilities sooner!

● Run tests!

42

Page 43: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Let’s take a look at Etcd!

43

Page 44: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Etcd:A Distributed KV Store

● HTTP API (no language bindings)● May already exist in your environments

44

Page 45: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Etcd in a Mesos Cluster

45

● Create Etcd-specific modules for:○ Master detector○ Master Contender○ PIDGroup

● No need to modify/rebuild Mesos

ZKZK

MesosMaster

ContenderDetector PIDGroup

DistributedKV Store

Etcd

Page 46: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Again, it’s all about having options!

46

● Chocolate

● Strawberry

● Vanilla

● ...

Page 47: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Again, it’s all about having options!

47

● ChocolateZookeeper

● StrawberryEtcd

● VanillaConsul

● ...

Page 48: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Demo!

48

Page 49: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Module CI:A Glimpse!

49

Page 50: Getting Rid of Zookeeper - · PDF fileZookeeper to “detect” the current leading Master Zookeeper provides Master’s location ... Module/Plugin/Extension Add/replace a Mesos component

Acknowledgments!

● Shuai Lin● Cody Maloney● Benjamin Hindman● Joseph Wu

50