Text of Couchbase at LinkedIn: Couchbase Connect 2015
1. Couchbase at LinkedIn, 2015. Michael Kehoe and Brian Cory
Sherwin, LinkedIn.
2. Overview: The LinkedIn Story; Development & Operations;
Operational Tooling; LinkedIn's Couchbase as a Database;
Questions.
3. Michael Kehoe: Site Reliability Engineer (SRE) at LinkedIn;
SRE for Profile & Higher-Education; member of CBVT; B.E.
(Electrical Engineering) from the University of Queensland,
Australia.
4. The LinkedIn Story: Founded in 2002, LinkedIn has grown into
the world's largest professional social media network. Offices in
24 countries; available in 23 languages. Over 360M members.
Revenue of $638M in Q1 2015.
5. In-Memory storage needs (The LinkedIn Story): At our scale, it
becomes challenging to scale data systems. Read-scaling becomes
important. Applicable use-cases: simple cache store; pre-warmed;
read-through; temporary data storage for de-duping; potential for
Source of Truth (SoT) store.
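
To make the pre-warmed and read-through use-cases on this slide concrete, here is a minimal sketch in Python. It is not LinkedIn's implementation: the class name `ReadThroughCache`, the `loader` callback, and the `member:` keys are all assumptions made for illustration, and a plain dict stands in for the in-memory store.

```python
from typing import Callable, Dict, Optional


class ReadThroughCache:
    """Illustrative read-through cache; a dict stands in for the in-memory store."""

    def __init__(self, loader: Callable[[str], Optional[str]]) -> None:
        self._loader = loader  # hypothetical call into the source-of-truth store
        self._data: Dict[str, str] = {}
        self.misses = 0

    def prewarm(self, items: Dict[str, str]) -> None:
        # Load known entries before the cache starts serving traffic.
        self._data.update(items)

    def get(self, key: str) -> Optional[str]:
        if key in self._data:
            return self._data[key]
        # Cache miss: read through to the backing store and keep the result.
        self.misses += 1
        value = self._loader(key)
        if value is not None:
            self._data[key] = value
        return value


# Hypothetical backing data, used only to exercise the sketch.
backing_store = {"member:1": "Alice", "member:2": "Bob"}
cache = ReadThroughCache(backing_store.get)
cache.prewarm({"member:1": "Alice"})
```

A pre-warmed key is served without touching the backing store, while any other key costs one read-through on first access and none afterwards.
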
6. Enter Couchbase (The LinkedIn Story): Until 2012, we were only
using Memcached as a non-SoT in-memory store. However, it had some
drawbacks: long cache warm-up times; no partitioning/sharing (had
to write our own); cold-cache restarts; difficult to move data
across hosts/clusters/datacentres.
7. Enter Couchbase (The LinkedIn Story): Evaluated systems to
replace Memcached: Mongo, Redis, and others. Couchbase had
advantages: drop-in replacement for Memcached; built-in
replication and cluster expansion; memory latency for operations;
asynchronous writes to disk; utilizes some of the development
infrastructure we've built.
8. Coding (Development & Operations): Memcached configured with
Spring and implements a caching Java interface. Implemented with
Couchbase Native Client. Developer just replaces the
Spring
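
The idea on this slide, application code written against a caching interface so that the backend can be swapped through configuration, can be sketched as follows. The slide describes a Java interface wired with Spring; this Python version only illustrates the same pattern. Every name in it (`Cache`, `DictCache`, `FakeMemcachedCache`, `FakeCouchbaseCache`, `BACKENDS`, `build_cache`, `remember_headline`) is hypothetical, and both backends are dict-backed stand-ins rather than real client libraries.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class Cache(ABC):
    """Hypothetical caching interface the application codes against."""

    @abstractmethod
    def get(self, key: str) -> Optional[str]:
        """Return the cached value, or None on a miss."""

    @abstractmethod
    def put(self, key: str, value: str) -> None:
        """Store a value under the key."""


class DictCache(Cache):
    """Dict-backed stand-in; a real backend would wrap a client library."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class FakeMemcachedCache(DictCache):
    backend = "memcached"


class FakeCouchbaseCache(DictCache):
    backend = "couchbase"


# Plays the role of the wiring configuration: the one place that names a backend.
BACKENDS = {"memcached": FakeMemcachedCache, "couchbase": FakeCouchbaseCache}


def build_cache(backend: str) -> Cache:
    return BACKENDS[backend]()


def remember_headline(cache: Cache, member_id: str, headline: str) -> Optional[str]:
    # Application code sees only the Cache interface, never the backend class.
    cache.put(f"headline:{member_id}", headline)
    return cache.get(f"headline:{member_id}")
```

Switching from `build_cache("memcached")` to `build_cache("couchbase")` changes the backend without touching `remember_headline`, which is the property the slide relies on.
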
9. Operations (Development & Operations): Hadoop jobs build warm
cache data. Tools partition the data and load it into Couchbase
offline. Apply deltas when brought online. Clean, warm caches
ready when needed.
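
The flow above (build a snapshot offline, partition it, then apply deltas when the cache comes online) might look roughly like this. This is a sketch under assumptions: the CRC32-modulo partitioning is chosen here for illustration and is not claimed to be what LinkedIn's tools or Couchbase use, and `partition_for`, `partition_snapshot`, and `apply_deltas` are hypothetical names. A delta value of `None` is taken to mean "delete".

```python
import zlib
from typing import Dict, List, Optional


def partition_for(key: str, num_partitions: int) -> int:
    # Assumed scheme for illustration: stable hash of the key, modulo partition count.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_snapshot(snapshot: Dict[str, str], num_partitions: int) -> List[Dict[str, str]]:
    # Split an offline-built snapshot into one dict per partition.
    parts: List[Dict[str, str]] = [{} for _ in range(num_partitions)]
    for key, value in snapshot.items():
        parts[partition_for(key, num_partitions)][key] = value
    return parts


def apply_deltas(parts: List[Dict[str, str]], deltas: Dict[str, Optional[str]]) -> None:
    # Replay changes made since the snapshot was built; None means the key was deleted.
    for key, value in deltas.items():
        target = parts[partition_for(key, len(parts))]
        if value is None:
            target.pop(key, None)
        else:
            target[key] = value


snapshot = {"member:1": "v1", "member:2": "v1", "member:3": "v1"}
partitions = partition_snapshot(snapshot, 4)
apply_deltas(partitions, {"member:1": "v2", "member:2": None, "member:4": "v1"})
```

Because the same function routes both snapshot entries and deltas, an update always lands in the partition that already holds the key.
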
10. Operational Tooling: In order to use Couchbase efficiently as
SREs, we need the following: provisioning; installation;
monitoring & alerting; infrastructure visibility.
11. Provisioning (Operational Tooling): Provisioning flow: seek
estimated usage statistics for the cluster (size of data to be
stored, QPS, redundancy needs); calculate cluster sizing
(currently done via a spreadsheet with a template, moving into an
in-house application); request hardware for cluster(s).
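
As a worked example of the "calculate cluster sizing" step, the sketch below turns the three inputs named on the slide (data size, QPS, redundancy) into a node count. The slide does not give the spreadsheet's formula, so the formula here is an assumption, as are the function name `estimate_nodes` and the parameters `ram_per_node_gb`, `usable_fraction`, and `qps_per_node`.

```python
import math


def estimate_nodes(
    data_gb: float,
    replicas: int,
    qps: float,
    ram_per_node_gb: float,
    usable_fraction: float,
    qps_per_node: float,
) -> int:
    """Rough node count; every constant is an input, none is a recommendation."""
    # Memory needed for the active copy plus each replica copy,
    # inflated because only part of a node's RAM holds data.
    memory_gb = data_gb * (1 + replicas) / usable_fraction
    nodes_for_memory = math.ceil(memory_gb / ram_per_node_gb)
    nodes_for_qps = math.ceil(qps / qps_per_node)
    # Each copy of the data must live on a distinct node.
    minimum_for_redundancy = replicas + 1
    return max(nodes_for_memory, nodes_for_qps, minimum_for_redundancy)


# Hypothetical request: 200 GB of data, 1 replica, 150k QPS.
nodes = estimate_nodes(
    data_gb=200, replicas=1, qps=150_000,
    ram_per_node_gb=64, usable_fraction=0.8, qps_per_node=50_000,
)
```

In this hypothetical request memory is the binding constraint: 200 GB with one replica at 80% usable RAM needs about 500 GB, which rounds up to 8 nodes of 64 GB, while the QPS target alone would need only 3.
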
12. Installation (Operational Tooling): Current system: enter
cluster metadata into our management system (Yahoo range); use a
SALT module to install & configure the cluster. Future system: use
the same metadata system; use SALT States to install and configure
the cluster. Benefits of the new system: it's possible to have
state enforcement; use SALT Pillars to encrypt cluster/bucket
passwords.
14. Monitoring & Alerting (Operational Tooling): We run a daemon
on each Couchbase Server that collects metrics every minute via a
Couchbase Library API. Use cluster metadata from range to build a
dashboard definition file via a Jinja template & Python.
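
The second half of this slide, rendering a dashboard definition from cluster metadata, can be sketched as below. The slide names Jinja; to keep the example dependency-free this sketch swaps in `string.Template` from the Python standard library instead. The output format, the function `render_dashboard`, the cluster and bucket names, and the metric names are all hypothetical.

```python
from string import Template
from typing import List

# One line of the (made-up) dashboard definition format per graph.
PANEL = Template("graph: $cluster/$bucket/$metric")


def render_dashboard(cluster: str, buckets: List[str], metrics: List[str]) -> str:
    # Expand cluster metadata into one graph entry per bucket and metric.
    lines = [f"dashboard: {cluster}"]
    for bucket in buckets:
        for metric in metrics:
            lines.append(
                PANEL.substitute(cluster=cluster, bucket=bucket, metric=metric)
            )
    return "\n".join(lines)


definition = render_dashboard("cb-example", ["sessions"], ["ops", "mem_used"])
```

Generating the definition from metadata means a newly provisioned cluster or bucket gets its graphs without anyone editing a dashboard by hand.
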
18. Management (Operational Tooling): We want to see a world-view
of all the clusters that we run. Having bucket/cluster/server-level
statistics is useful. Having a view of who owns each
cluster/bucket is useful.
25. Conclusions: Couchbase fits into our existing infrastructure.
We have good management and monitoring of the clusters. Rich set
of tooling we extended for our environment. Starting to expand our
use from a cache to a store for internal tooling.
26. LinkedIn's Couchbase as a Database. Brian Cory Sherwin, Site
Reliability Engineer, LinkedIn.
27. The Agenda: our use case and requirements; why we chose
Couchbase vs MySQL; pitfalls encountered.
29. Using Couchbase as a Workflow Backend: AutoRemediation! A job
execution platform to remediate operations issues. Database
backend for state tracking of a workflow engine.
31. Why Couchbase? Couchbase as a database; document store; views
for indexing; data resiliency; replication; simplicity.
32. Why not MySQL? Upfront cost in creating the schema; rapidly
changing documents; number of columns; consistent incremental
updates.
33. Pitfalls using Couchbase: ACID implications; durability and
consistency; concurrency; different and new tech.
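
The concurrency pitfall is easiest to see with a workflow-state document that two workers update at once. Couchbase offers compare-and-swap (CAS) values for optimistic locking; the sketch below simulates that idea with an in-memory store and does not use the Couchbase SDK. The names `InMemoryStore`, `CasMismatch`, and `advance`, and the `job:1` document, are hypothetical, and nothing here is taken from the AutoRemediation code.

```python
from typing import Dict, Tuple


class CasMismatch(Exception):
    """Raised when a document changed since it was read."""


class InMemoryStore:
    """Simulated document store: each key maps to (cas, document)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Tuple[int, dict]] = {}

    def insert(self, key: str, doc: dict) -> int:
        if key in self._docs:
            raise KeyError(f"{key} already exists")
        self._docs[key] = (1, dict(doc))
        return 1

    def get(self, key: str) -> Tuple[int, dict]:
        cas, doc = self._docs[key]
        return cas, dict(doc)  # copy so callers cannot mutate stored state

    def replace(self, key: str, doc: dict, cas: int) -> int:
        current, _ = self._docs[key]
        if current != cas:
            raise CasMismatch(key)  # someone else wrote first
        self._docs[key] = (current + 1, dict(doc))
        return current + 1


def advance(store: InMemoryStore, key: str, new_state: str, retries: int = 3) -> dict:
    # Read, modify, write; on a CAS mismatch re-read and try again.
    for _ in range(retries):
        cas, doc = store.get(key)
        doc["state"] = new_state
        try:
            store.replace(key, doc, cas)
            return doc
        except CasMismatch:
            continue
    raise CasMismatch(key)
```

A writer holding a stale CAS value is rejected instead of silently overwriting a newer workflow state, which is the kind of guarantee a relational transaction would otherwise provide.
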
34. Questions? If you want to learn more about AutoRemediation:
http://www.meetup.com/Auto-Remediation-and-Event-Driven-Automation/