Service Discovery in Distributed Systems



IVAN VOROSHILIN, @vibneiro

Bio

Ivan Voroshilin - Computer geek

My Interests:

- Distributed systems
- Architecture
- Functional languages
- Algorithms

Service Discovery? Let’s recall

OMG! This is an unmanageable mess!

Google

Twitter

Netflix

Facebook

Why?

- I don’t have to restart a service to update its configuration
- I don’t have to redeploy anything when a service is moved to another box
- I don’t have to know whether a needed service is alive at the moment
- Load balancing might also be taken care of for you by Service Discovery
- Service Discovery takes responsibility for storing configs for you in one place

Agenda

- Main properties of Service Discovery
- Popular Open-Source Solutions: Differences, Trade-offs
- Bonus

Service Discovery: The main components

The main idea of Service Discovery in simple terms

Announce/LookUp

Service Directory (Consistent, Highly-available)

- Register Services (a.k.a. Announce)
- Discover Services (a.k.a. LookUp)
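The announce/lookup idea can be sketched as a tiny in-memory directory. This is only an illustration: the class name `ServiceDirectory`, its methods, and the addresses are hypothetical, and a real directory would be replicated across several nodes to be consistent and highly available.

```python
from collections import defaultdict


class ServiceDirectory:
    """Toy in-memory directory; a real one would be replicated across nodes."""

    def __init__(self):
        # service name -> set of (host, port) endpoints
        self._endpoints = defaultdict(set)

    def announce(self, name, host, port):
        """Register (announce) one instance of a service."""
        self._endpoints[name].add((host, port))

    def withdraw(self, name, host, port):
        """Remove an instance, e.g. on shutdown."""
        self._endpoints[name].discard((host, port))

    def lookup(self, name):
        """Discover (look up) all known instances of a service."""
        return sorted(self._endpoints.get(name, ()))


directory = ServiceDirectory()
directory.announce("billing", "10.0.0.5", 8080)
directory.announce("billing", "10.0.0.6", 8080)
print(directory.lookup("billing"))
```

Clients only need the service name; the boxes behind it can change without any client being redeployed.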

CAP?

- Consistency
- Availability
- Partition Tolerance

2 out of 3 properties: either CP or AP. No other options.

Issues when developing Service Discovery

- Fault Tolerance
- Data Consistency
- Distributed Locks
- Leader Election (It is not trivial!)

Review of Open Source Projects

Project Name           Implementor       Year of birth
Chubby                 Google            2006
ZooKeeper [*]          Apache            2007
Doozerd                Blake Mizerany    2010
My in-house solution   Deutsche Bank     2012
Eureka [*]             Netflix           2012
ETCD [*]               CoreOS            2013
SmartStack             Airbnb            2013
Serf                   HashiCorp         2014
SkyDNS                 Erik St. Martin   2014
Consul [*]             HashiCorp         2014

Name resolution and DNS

Announce/Lookup?! Sounds like DNS...

- Scales out badly
- Split-brain
- Too simple

Architectural trade-offs

- ZooKeeper
- ETCD
- Consul
- Eureka
- SmartStack
- SkyDNS
- Serf

Service Discovery Open Source Solutions

2 Categories of solutions

- General purpose solutions: ZooKeeper, ETCD
- Solely Service Discovery solutions: Eureka, Serf, Consul

Apache ZooKeeper

- Consensus protocol: ZAB, CP
- Written in Java
- Language bindings: Java and C API
- Key-value store based on ephemeral nodes
- Clients need to handle any load balancing or failover themselves
- On any non-quorum side, reads and writes will return an error
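The quorum behaviour of a CP store under a partition can be sketched as below. This is a simulation of the majority rule only, not ZooKeeper code: the class `QuorumNode`, its fields, and the key paths are hypothetical names chosen for illustration.

```python
class QuorumNode:
    """One member of a CP ensemble, as seen during a network partition."""

    def __init__(self, ensemble_size, reachable_peers):
        self.ensemble_size = ensemble_size
        # Peers this node can still talk to, not counting itself.
        self.reachable_peers = reachable_peers
        self._data = {}

    def has_quorum(self):
        # A strict majority of the ensemble, including this node.
        return (self.reachable_peers + 1) > self.ensemble_size // 2

    def write(self, key, value):
        if not self.has_quorum():
            raise ConnectionError("no quorum: write rejected")
        self._data[key] = value

    def read(self, key):
        if not self.has_quorum():
            raise ConnectionError("no quorum: read rejected")
        return self._data[key]


# A 5-node ensemble split 3 / 2 by a partition.
majority_side = QuorumNode(ensemble_size=5, reachable_peers=2)
minority_side = QuorumNode(ensemble_size=5, reachable_peers=1)

majority_side.write("/services/billing/1", "10.0.0.5:8080")
print(majority_side.read("/services/billing/1"))

try:
    minority_side.write("/services/billing/2", "10.0.0.6:8080")
except ConnectionError as err:
    print(err)
```

The minority side gives up availability to stay consistent, which is the CP choice from the CAP slide.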

Apache ZooKeeper

Used in Hadoop, Mesos, Kafka, Netflix

Distributed lock for coordination

- Heavyweight
- Difficult to use
- Not ideal for Service Discovery
- Ephemeral nodes are not reliable

Netflix Eureka

- Eventually consistent, AP
- Broadcast async replication among servers, no quorum
- HTTP REST API
- Written in Java
- Java smart client with round-robin balancer
- Caching of server entries on the client side
- Add/remove nodes online
- TTL refresh mechanism
- Used for AWS cloud
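Two of the ideas listed for Eureka, TTL refresh and a caching round-robin client, can be sketched together. This is not the Eureka client API: `LeaseRegistry`, `RoundRobinClient`, the 30-second TTL, and the fake clock are all assumptions made for a deterministic demo.

```python
import itertools


class LeaseRegistry:
    """Server side: instances stay registered only while they renew a lease."""

    def __init__(self, ttl_seconds, clock):
        self.ttl = ttl_seconds
        self.clock = clock          # callable returning the current time in seconds
        self._expires_at = {}       # instance address -> lease expiry time

    def heartbeat(self, instance):
        # Registering and renewing are the same operation here.
        self._expires_at[instance] = self.clock() + self.ttl

    def instances(self):
        now = self.clock()
        return sorted(i for i, t in self._expires_at.items() if t > now)


class RoundRobinClient:
    """Client side: caches the server list and rotates through it."""

    def __init__(self, registry):
        self.registry = registry
        self.refresh()

    def refresh(self):
        # Between refreshes the cache may be stale (eventual consistency).
        self._cached = self.registry.instances()
        self._cycle = itertools.cycle(self._cached)

    def next_instance(self):
        if not self._cached:
            raise LookupError("no instances in the local cache")
        return next(self._cycle)


now = [0.0]                                   # fake clock for a deterministic demo
registry = LeaseRegistry(ttl_seconds=30, clock=lambda: now[0])
registry.heartbeat("10.0.0.5:8080")
registry.heartbeat("10.0.0.6:8080")

client = RoundRobinClient(registry)
print([client.next_instance() for _ in range(3)])

now[0] = 20.0
registry.heartbeat("10.0.0.5:8080")           # only one instance renews its lease
now[0] = 40.0                                 # the other lease has expired
client.refresh()
print([client.next_instance() for _ in range(2)])
```

Because the client balances from its own cache, it keeps working when the registry is briefly unreachable, at the price of possibly calling an instance that has already gone away.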

ETCD

- Inspired by ZooKeeper
- Consensus protocol: RAFT, CP
- Written in Go
- HTTP REST API
- Key-value: store data in directories
- TTLs for key expiration
- Clients need to handle any load balancing or failover themselves
- Watch a key or directory for changes and react to the new values
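Watching a key or directory can be sketched with a small in-memory store that notifies callbacks. This only simulates the pattern and does not talk to ETCD: `WatchableStore`, its methods, and the key layout are hypothetical, and key TTLs are left out to keep the sketch short.

```python
class WatchableStore:
    """Toy key-value store with directory-style keys and change watchers."""

    def __init__(self):
        self._data = {}
        self._watchers = []      # list of (prefix, callback)

    def watch(self, prefix, callback):
        """Call callback(action, key, value) for changes under prefix."""
        self._watchers.append((prefix, callback))

    def _notify(self, action, key, value):
        for prefix, callback in self._watchers:
            if key == prefix or key.startswith(prefix.rstrip("/") + "/"):
                callback(action, key, value)

    def set(self, key, value):
        self._data[key] = value
        self._notify("set", key, value)

    def delete(self, key):
        if key in self._data:
            del self._data[key]
            self._notify("delete", key, None)


events = []
store = WatchableStore()
store.watch("/services/billing", lambda *event: events.append(event))

store.set("/services/billing/node1", "10.0.0.5:8080")
store.set("/services/search/node1", "10.0.0.9:9000")   # outside the watched directory
store.delete("/services/billing/node1")
print(events)
```

A client that reacts to these events can update its list of backends without polling.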

Consul

- Consensus protocol: RAFT, CP
- HTTP REST API or even DNS interface
- Built-in DNS server as a service registry
- Security: TLS, ACLs
- Comprehensive solution for Service Discovery
- Specialized health checks, besides just heartbeats
- Optional read consistency mode: stale (when leader is unavailable)
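The difference between a plain heartbeat and specialized health checks can be illustrated by filtering a lookup result on check status. The sample below is hand-written and heavily trimmed; its shape follows my recollection of Consul's `/v1/health/service/<name>` response, so treat the field names as an assumption to verify against the Consul documentation. The helper `healthy_endpoints` is hypothetical.

```python
import json

# Hand-written sample, shaped like a reply from Consul's
# /v1/health/service/<name> endpoint (heavily trimmed; values are made up).
SAMPLE_RESPONSE = json.loads("""
[
  {"Node": {"Node": "box1", "Address": "10.0.0.5"},
   "Service": {"Service": "billing", "Address": "", "Port": 8080},
   "Checks": [{"Name": "Serf Health Status", "Status": "passing"},
              {"Name": "HTTP /health", "Status": "passing"}]},
  {"Node": {"Node": "box2", "Address": "10.0.0.6"},
   "Service": {"Service": "billing", "Address": "", "Port": 8080},
   "Checks": [{"Name": "Serf Health Status", "Status": "passing"},
              {"Name": "HTTP /health", "Status": "critical"}]}
]
""")


def healthy_endpoints(entries):
    """Keep only instances whose checks are all passing."""
    result = []
    for entry in entries:
        if all(check["Status"] == "passing" for check in entry["Checks"]):
            # An empty service address means "use the node's address".
            host = entry["Service"]["Address"] or entry["Node"]["Address"]
            result.append((host, entry["Service"]["Port"]))
    return result


print(healthy_endpoints(SAMPLE_RESPONSE))
```

In the sample, box2 is still alive at the node level but its application-level check is failing, so it is left out of the result.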

What should a comprehensive Service Discovery do?

None of the solutions mentioned answers the following question:

How do you integrate your existing landscape with a service discovery solution without hassle? It doesn’t matter whether the applications are third-party or in-house.

What about closed third-party systems?

Language bindings?

Need to write client code to adapt?

Service Discovery in Docker environment

Meet “Side-kick” processes (introduced in the SmartStack and Serf solutions)

[Diagram: Client application -> HAProxy -> Backend 1 / Backend 2 (HTTP/TCP proxied). Each backend service process runs next to a side-kick that checks its health and announces it to the ETCD cluster with a TTL. The client side watches the ETCD cluster to discover backends.]
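The client-side half of the side-kick pattern, turning discovered backends into proxy configuration, can be sketched as follows. The function names and addresses are hypothetical; the rendered text uses standard HAProxy `backend`, `balance`, and `server ... check` directives, and a real side-kick would also write the file and reload HAProxy.

```python
def render_haproxy_backend(name, endpoints):
    """Render an HAProxy backend section for the discovered endpoints."""
    lines = [f"backend {name}", "    balance roundrobin"]
    for index, (host, port) in enumerate(sorted(endpoints), start=1):
        # "check" asks HAProxy to health-check the server itself as well.
        lines.append(f"    server {name}{index} {host}:{port} check")
    return "\n".join(lines) + "\n"


def on_directory_change(name, endpoints, previous_config):
    """Return (new_config, changed); a real side-kick would write the file
    and reload HAProxy only when changed is True."""
    new_config = render_haproxy_backend(name, endpoints)
    return new_config, new_config != previous_config


config, changed = on_directory_change(
    "billing", [("10.0.0.6", 8080), ("10.0.0.5", 8080)], previous_config="")
print(config)
```

The application itself only ever talks to the local proxy, so closed third-party systems need no client library or code changes.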

Summary

No one-size-fits-all solution

The right architecture or Open-Source solution directly depends on clear requirements

Thanks for your attention

blog: www.ivoroshilin.com

Q & A
