Mitchell Hashimoto: Building Robust Systems w/ Service Discovery & Configuration


Description

There is no scenario in the future where we have fewer servers. Whether you consider a server a physical machine, a virtual machine, or even a container, the number of each is growing at an extremely fast rate. In this view of the world, it is becoming increasingly important to build robust systems that can ideally run anywhere, recover from crashes, distribute load, and so on. In this talk, I discuss these problems and show how a powerful system for service discovery and configuration can get you a fairly robust system without additional modifications. Equipped with this knowledge, it becomes much easier to imagine migrating legacy and new infrastructures over to this modern world of many commodity machines.

https://twitter.com/mitchellh
http://mitchellh.com


Building Robust Systems With Consul

I'm Mitchell Hashimoto
Also known as @mitchellh

HashiCorp
Towards a Software Managed Datacenter

Vagrant
http://www.vagrantup.com

Packer
http://www.packer.io

Serf
http://www.serfdom.io

Consul
http://www.consul.io

Consul

Take a Step Back
Taking a look at the big picture.

[Diagrams: first a single node running a few services; then a hypervisor hosting several nodes, each packed with services; then containers on those nodes adding yet another layer of services.]

Modern Ops
More everything, more problems.

• Where is service foo?
• Is service foo healthy/available?
• What is service foo's configuration?
• Where is the service foo leader?

Meta:

What happens when the thing that answers these questions is unavailable?

Robust Systems
Stem from the ability to answer these questions.

Practical Goals

• Start services in any order
• Destroy services with confidence
• Restart servers safely
• Reconfigure services easily

• Where is service foo?
• Is service foo healthy/available?
• What is service foo's configuration?
• Where is the service foo leader?

Where is service foo?

Maybe here: 127.0.0.1
Maybe close: 10.0.1.35
Maybe there: foo.foohost.com

Is service foo healthy/available?

Yes: Great!
No: Avoid or handle gracefully.

What is service foo’s configuration?

Access information, supported features, enabled/disabled.

What is my configuration?

Expect it to be modifiable.

Where is the service foo leader or best choice?

Locality, master/slave, versions.

Meta: Is the thing answering these questions stable/available?

Critical infrastructure component, you want “yes” as often as possible.

Robust! Can find services, can avoid and handle unhealthy services, can be configured externally, and can trust that it can retrieve all of this information.

Practical Goals

• Start services in any order
• Destroy services with confidence
• Restart servers safely
• Reconfigure services easily

Consul

Solution Attempts
In a world… before Consul…

Manual/Hardcoded
• Doesn't scale with services/nodes
• Not resilient to failures
• Localized visibility/auditability
• Manual locality of services

Config Mgmt Problem
• Slow to react to changes
• Not resilient to failures
• Not really configurable by developers
• Locality, monitoring, etc. manual

LB Fronted Services
• Introduces different SPOF
• How does LB find service addresses/configure?
• Solves some problems, though.

ZooKeeper
• Complicated
• Heavy clients
• Building block, very manual

Consul

Service Discovery

Where is service foo?

Service Discovery

$ dig web-frontend.service.consul. +short
10.0.3.89
10.0.1.46

$ curl http://localhost:8500/v1/catalog/service/web-frontend
[{
  "Node": "node-e818f1",
  "Address": "10.0.3.89",
  "ServiceID": "web-frontend",
  …
}]

Service Discovery

• DNS is legacy-friendly. No application changes required.

• HTTP returns rich metadata.
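
As a sketch of the richer side (assuming a stock agent, whose DNS interface listens on 127.0.0.1:8600), an SRV query returns ports and target nodes, not just addresses:

$ dig @127.0.0.1 -p 8600 web-frontend.service.consul. SRV +short
# each answer: priority, weight, port, and target node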

Failure Detection

Is service foo healthy/available?


• DNS won't return unhealthy services or nodes.

• HTTP has endpoints to list health state of catalog.
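
For example (a minimal sketch against a local agent, reusing the service name from earlier), the health endpoints expose check state directly:

$ curl http://localhost:8500/v1/health/service/web-frontend
# nodes providing the service, each with its health checks

$ curl http://localhost:8500/v1/health/state/critical
# every check currently in the critical state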

Key/Value Storage

What is the config of service foo?

Key/Value Storage

$ curl -X PUT -d 'bar' http://localhost:8500/v1/kv/foo
true

$ curl http://localhost:8500/v1/kv/foo?raw
bar

Key/Value Storage

• Highly available storage of configuration.

• Turn knobs without big configuration management process.
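
One way to react to a turned knob (a sketch; 42 stands in for whatever X-Consul-Index the previous read returned) is a blocking query, which holds the request open until the key changes or the wait time elapses:

$ curl -v http://localhost:8500/v1/kv/foo
# note the X-Consul-Index response header, e.g. 42

$ curl "http://localhost:8500/v1/kv/foo?index=42&wait=30s"
# returns as soon as foo changes, or after ~30s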

Multi-Datacenter

Multi-Datacenter

$ dig web-frontend.service.singapore.consul. +short
10.3.3.33
10.3.1.18

$ dig web-frontend.service.germany.consul. +short
10.7.3.41
10.7.1.76

Multi-Datacenter

$ curl "http://localhost:8500/v1/kv/foo?raw&dc=asia"
true

$ curl "http://localhost:8500/v1/kv/foo?raw&dc=eu"
false

Multi-Datacenter

• Local by default
• Can query other datacenters however you may need to.
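
The dc parameter works beyond the KV store too; as a sketch (datacenter name borrowed from the dig examples above), a catalog query can target a remote datacenter and the local servers forward the request:

$ curl "http://localhost:8500/v1/catalog/service/web-frontend?dc=germany"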

Web UI

• Node, service, health check, and K/V management and visibility for every datacenter in a single UI.
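
A rough sketch of enabling it (in Consul releases of this era the UI shipped as a separate asset bundle, and -ui-dir points at wherever you unpacked it; the path is illustrative):

$ consul agent -server -data-dir /tmp/consul -ui-dir /opt/consul-ui
# then browse to http://localhost:8500/ui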

Operations
Consul Availability / Scalability

The Meta Question

Architecture

Server Cluster
• 3, 5, 7 servers
• (n/2) + 1 for availability
• Replicated writes
• Automatic leader election, leader forwarding
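
A minimal sketch of standing up such a cluster (the address is made up; -bootstrap-expect tells each server to wait for three peers before electing a leader):

$ consul agent -server -bootstrap-expect 3 \
    -data-dir /var/consul -join 10.0.1.10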

Lightweight Clients
• Ephemeral state
• Health checks
• Optional (but recommended). Legacy machines don't need them.
• Automatic request forwarding to servers.
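
A client agent, by contrast, is just this (a sketch; 10.0.1.10 stands in for any existing cluster member):

$ consul agent -data-dir /var/consul -join 10.0.1.10
# no -server flag: ephemeral state, local health checks, request forwarding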

Cheap Gossip
• Health check and membership info
• Very cheap
• No guaranteed reliability, but only used for data that can be lost
• (See Serf)
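
The gossip pool is what backs membership queries on any agent:

$ consul members
# name, address, and status (alive/failed/left) of every known agent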

Multi-DC
• Independent server clusters
• Request forwarding
• WAN gossip for membership
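
Joining datacenters is a one-time gossip operation between server clusters (a sketch; the remote server address is hypothetical):

$ consul join -wan 10.7.0.5
$ consul members -wan
# servers from every joined datacenter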

General Points: Servers
• (n+1)/2 servers for write availability
• More servers means higher write latency because of replication. Throughput marginally affected.
• Can leave/add at will, keeping in mind min. node requirement.
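
Removing a server gracefully is a matter of telling its local agent to leave, so peers see a departure rather than a failure:

$ consul leave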

General Points: Clients
• Clients can be removed/added at will without issue.
• Clients don't currently affect read/write throughput in a meaningful way.
• Although technically optional, they're highly recommended for delegated health checks.
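
Because every client fronts the full HTTP API, a service can register itself through its local agent (a sketch; the name and port are made up):

$ curl -X PUT -d '{"Name": "web", "Port": 80}' \
    http://localhost:8500/v1/agent/service/register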

Throughput

• On virtualized cloud systems with spinning disks: thousands of reads and writes per second

• In practice, you won't hit the read/write limit.

Scalable and available. Consul’s architecture makes it incredibly scalable and highly unlikely to become unavailable.

Robust Systems
Consul configured, monitored, discovered

• Consul KV for configuration.
• Consul DNS for service coupling/discovery.
• Consul Health Checks for monitoring.

Consul KV: Configuration

Consul KV: Configuration

$ envconsul -reload myapp/config bin/myapp
…

Consul KV: Configuration

• envconsul turns K/V pairs into environment variables and restarts the process on change.

• No application changes!
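
End to end, that looks roughly like this (the key name under myapp/config is hypothetical):

$ curl -X PUT -d '5' http://localhost:8500/v1/kv/myapp/config/MAX_WORKERS
true

$ envconsul myapp/config env
MAX_WORKERS=5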

Consul DNS: Service Discovery

$ envconsul myapp/config env
ELASTICSEARCH_HOST=elasticsearch.service.consul.
POSTGRESQL_HOST=master.postgresql.service.consul.
REDIS_HOST=redis.service.consul.

Consul DNS: Service Discovery

• Configuration to point to other services uses DNS.

• No application changes!

Consul Health Checks: Monitoring

$ cat /etc/consul.d/web.json
{
  "check": {
    "name": "http",
    "script": "curl localhost:80",
    "interval": "5s"
  }
}


• Simple shell scripts (UNIXy)
• Logged output
• Won't show as a result in service discovery queries if failing.
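
A check can also hang directly off a service definition in the same config directory (a sketch in the script-check style shown above; file name and values are illustrative):

$ cat /etc/consul.d/web-service.json
{
  "service": {
    "name": "web",
    "port": 80,
    "check": {
      "script": "curl localhost:80",
      "interval": "10s"
    }
  }
}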

Robust! Add/remove services, reconfigure services, see global state of services without complicated logic. And without modifying application code.

Thank You

http://www.consul.io
