Couchbase @ LivePerson
Ido Shilon | 7/22/2013 | [email protected]
Agenda
● Connection before content
● Who LivePerson is
● Data & Technology @ LivePerson
● What is the use case (Real time analytics)
● The Story
● Why Couchbase was chosen
● Architecture & Code
● Production
● What did we learn so far
● Q&A
Company
● The leading intelligent customer
engagement platform
● 8,500 customers around the globe
● 8 of the top 10 Fortune 500 companies
● Doing SaaS since 1999 (when it was
called ASP)
Mission
Creating Meaningful Customer Connections
LivePerson is…
Typical Engagements
Technology @ LivePerson
● Application stack
○ JVM heavy
○ Linux on commodity servers
○ Private cloud based on OpenStack
Data @ LivePerson
● 1.8 billion visitors monitored (sessions) per month
● 20 million connections per month
● ~0.5 TB of compressed data is loaded into Hadoop each day
Kafka & Storm
The LivePerson use case
Visitor List
Real time Analytics for LivePerson's customers
What do we need it for ?
● Provide visibility to LivePerson's customers on their online visitors
● Used by thousands of agents (call centers) around the world
The story - once upon a time
Agents Console (Java app)
Visitor events/data
Web site Visitors
Monitoring Events
● Stateful web servers (vertical scalability)
● Multi-purpose servers
● Hold all visitor state in memory
DC 1 / DC 2
And then the story continues
Web Agent (Web app) ???
Web site Visitors
Monitoring Events
● Multiple web servers (horizontal scalability)
● SOA architecture
● All visitor state is streamed to the event bus
Kafka & Storm (event bus)
Send events
Possible solutions we considered
Why we picked Couchbase
● Performance: high throughput, really fast
● Resilient solution
● Linear scale
● Schemaless !!!
● Searchable (queries)
● Supports both K/V & document store
● Cross data center replication
● Simplicity (quick dev and roll out)
● We needed a solution !!!
System architecture
Components: Visitor (Browser/Mobile), Kafka (stream event processing),
Visitor Feed - Storm Topology, Couchbase, Visitor Feed API,
Visitor Monitoring Service, Customer Representative

(1) Visitor browsing
(2) Visitor events
(3) Analyze relevant events and persist
(4) Write event to user document
(5) Get visitors list (every 3 sec)
(6) Return relevant visitors
(7) Return relevant visitors
Data design & numbers
● Document = User
● Document structure:
○ Each document contains 15-20 attributes, in addition to 3 lists of sub-attributes
○ Each document contains the account id (multi-tenant db)
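A visitor document along these lines might look roughly as follows; accountId and visitStartTime.fieldValue are taken from the view code later in the deck, while the remaining field names are purely illustrative:

```json
{
  "accountId": "qa15020713",
  "visitStartTime": { "fieldValue": 1374451200000 },
  "state": 0,
  "currentPage": "/products/checkout",
  "visitedPages": ["/home", "/products"],
  "chatLines": [],
  "campaigns": []
}
```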
Data design & numbers
● Numbers
○ Avg doc size - 10 KB
○ Avg key size - 10 characters
○ 5 secondary indexes
● Throughput (final rollout)
○ ~1 M concurrent documents/visitors
○ ~100K ops/sec (heavy on insert/update)
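These numbers imply a sizeable working set. As a rough back-of-the-envelope check (a sketch only, ignoring document metadata and replica copies):

```java
public class WorkingSetEstimate {
    public static void main(String[] args) {
        long docs = 1_000_000L;          // ~1 M concurrent visitor documents
        long avgDocBytes = 10 * 1024L;   // ~10 KB average document size
        long workingSetBytes = docs * avgDocBytes;
        double gb = workingSetBytes / (1024.0 * 1024 * 1024);
        // Roughly 9.5 GB of raw document data before any overhead
        System.out.printf("~%.1f GB of raw document data%n", gb);
    }
}
```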
Insert/Update
public Visitor getVisitor(String visitorSessionId) {
    dalMetrics.addCouchbaseReadTotalCount();
    try {
        // Fetch the raw JSON document by key and deserialize it
        String visitorDoc = (String) client.get(visitorSessionId);
        return gson.fromJson(visitorDoc, Visitor.class);
    } catch (Exception e) {
        LOG.debug("Failed to retrieve or convert visitor: " + e.getMessage());
        dalMetrics.addCouchbaseReadErrorCount();
        throw new RuntimeException(e);
    }
}

public void setVisitorWithFields(String rtSessionId, Visitor visitor) {
    try {
        // Serialize the visitor and store it under the session id with the default TTL
        client.set(rtSessionId, defaultTtl, gson.toJson(visitor));
    } catch (Exception e) {
        LOG.error("Error occurred while updating visitor fields: " + e.getMessage());
    }
}
Views/ Design doc
Use the view to set the keys and sorting
function (doc, meta) {
  var order = ...; // derive the visitor's sort order from the document
  if (order != null && doc.accountId && doc.visitStartTime) {
    emit([doc.accountId, order, doc.visitStartTime.fieldValue], null);
  }
}
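The emitted compound key is what gives the view its sort order: array keys are collated element by element, so rows come back grouped by account, then by order/state, then by visit start time. A minimal Java sketch of that element-wise comparison (illustrative only, not the server's full JSON collation):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CompoundKeyOrder {
    // Compare [accountId, order, timestamp] keys element by element,
    // mirroring how the view index sorts emitted array keys.
    static final Comparator<Object[]> KEY_ORDER = Comparator
            .comparing((Object[] k) -> (String) k[0])
            .thenComparingInt(k -> (Integer) k[1])
            .thenComparingLong(k -> (Long) k[2]);

    public static void main(String[] args) {
        List<Object[]> keys = Arrays.asList(
                new Object[]{"qa15020713", 1, 1374451300000L},
                new Object[]{"qa15020713", 0, 1374451200000L},
                new Object[]{"acct42", 0, 1374451100000L});
        keys.sort(KEY_ORDER);
        // acct42 sorts first, then qa15020713 with order 0 before order 1
        for (Object[] k : keys) {
            System.out.println(Arrays.toString(k));
        }
    }
}
```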
Retrieve data
.../_view/by_accountid_state_timestamp?limit=10&skip=0&startkey=
["qa15020713", 0 ]&endkey=["qa15020713" , 9 ]
ComplexKey startKey = null;
// Create a new view query
Query query = new Query();
query.setIncludeDocs(true); // Include the full document in each row
if (startValueToFilterBy != null) {
    startKey = ComplexKey.of(accountId, startValueToFilterBy);
}
if (endValueToFilterBy == null && startKey != null) {
    query.setKey(startKey);
} else {
    ComplexKey endKey = ComplexKey.of(accountId, endValueToFilterBy);
    query.setRange(startKey, endKey);
}
if (limit > 0) {
    query.setLimit(limit);
}
if (skip > 0) {
    query.setSkip(skip);
}
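The limit/skip pair above (and the ?limit=10&skip=0 query string in the URL) is plain page arithmetic. A small sketch, assuming zero-based page numbers; note that skip-based paging scans and discards skipped index rows, so very deep pages get slower:

```java
public class ViewPaging {
    // Translate a zero-based page number into the view's skip parameter.
    static int skipFor(int page, int pageSize) {
        return page * pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 10;
        System.out.println(skipFor(0, pageSize)); // 0  -> ?limit=10&skip=0
        System.out.println(skipFor(3, pageSize)); // 30 -> ?limit=10&skip=30
    }
}
```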
Cross data center replication options
1. Unidirectional replication to replicate the data to our DR data center
2. (Future) Bi-directional replication, where each data center holds a portion of the entire traffic
Insights :
○ Keyspace is the same (in both scenarios) - avoid conflicts
○ Impact on the cluster size - from 5 nodes to 7-8 nodes in each cluster
What did we learn so far?
● Delete docs
○ Use TTL instead of delete
○ Use a longer TTL if possible
● In our use case the working set is around 100% - RAM and SSD are key factors in scalability
● Move to production ASAP, even for staging !!
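One TTL detail worth noting when following the "longer TTL" advice: Couchbase inherits the memcached convention where an expiry value greater than 30 days (2,592,000 seconds) is interpreted as an absolute Unix timestamp rather than a relative offset. A hedged helper sketch (ttlFor is our own illustrative name, not an SDK call):

```java
public class TtlHelper {
    static final long THIRTY_DAYS_SECONDS = 30L * 24 * 60 * 60; // 2,592,000

    // Return a TTL value safe to pass to client.set(): relative offsets up
    // to 30 days pass through unchanged; anything longer is converted to an
    // absolute Unix timestamp, per the memcached expiry rules.
    static int ttlFor(long relativeSeconds, long nowEpochSeconds) {
        if (relativeSeconds <= THIRTY_DAYS_SECONDS) {
            return (int) relativeSeconds;
        }
        return (int) (nowEpochSeconds + relativeSeconds);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000;
        System.out.println(ttlFor(3600, now));               // 1 hour: stays 3600
        System.out.println(ttlFor(60L * 24 * 60 * 60, now)); // 60 days: absolute timestamp
    }
}
```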
Couchbase in LP - Additional Use cases
● Session state
● Cross-session state
● Caching layer - Memcached style
Thank You