Performance of NoSQL
By Nicholas Fung
NoSQL Overview
Not Only SQL
ACID properties not guaranteed
BASE properties
Basically Available
Soft state
Eventual consistency
Used for storing massive amounts of unstructured, complex data
Can be used for storing semi-structured data
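As a rough illustration of the BASE properties above, the following toy model shows a write reaching one replica first and the others only after a later propagation step. It is a sketch, not how any particular NoSQL system works: the `Replica` class, the `sync` function, and the last-write-wins rule are all assumptions made for this example.

```python
# Toy model of eventual consistency: a write is accepted by one replica
# and reaches the others later. All names here are illustrative.

class Replica:
    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Last-write-wins (an assumption): keep the value with the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(replicas):
    """Propagation pass: every replica receives every other replica's entries."""
    for source in replicas:
        for key, (timestamp, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, timestamp)


replicas = [Replica(), Replica(), Replica()]
replicas[0].write("user:1", "alice", timestamp=1)  # only replica 0 has seen the write

stale_read = replicas[2].read("user:1")      # soft state: replica 2 has not caught up
sync(replicas)
converged_read = replicas[2].read("user:1")  # all replicas agree after propagation
```

Before `sync` runs, a read from replica 2 misses the write; afterwards every replica returns the same value, which is the sense in which the system is consistent "eventually".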
NoSQL Database Models
Column Store
Store columns consisting of a unique name, value, and
timestamp
Document Store
Store objects as documents (frequently JSON)
Key-value Store
Associative array that stores key-value pairs
Graph Store
Store data in a graph structure
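To make the four models above more concrete, here is a minimal sketch that expresses one hypothetical record in each of them using only built-in Python structures. The field names, keys, and helper functions are invented for illustration and do not correspond to any specific product's storage format or API.

```python
# One hypothetical "tweet" record expressed in each of the four models.

# Column store: each column is a (name, value, timestamp) triple under a row key.
column_store = {
    "tweet:1": [
        ("user", "alice", 1700000000),
        ("text", "hello", 1700000000),
    ]
}

# Document store: the whole object is kept as one JSON-like document.
document_store = {
    "tweet:1": {"user": "alice", "text": "hello", "tags": ["nosql"]}
}

# Key-value store: an associative array from a key to an opaque value.
key_value_store = {"tweet:1": '{"user": "alice", "text": "hello"}'}

# Graph store: nodes plus labeled edges between them.
graph_store = {
    "nodes": {"alice": {"type": "user"}, "tweet:1": {"type": "tweet"}},
    "edges": [("alice", "POSTED", "tweet:1")],
}


def column_value(store, row_key, column_name):
    """Return the value of the most recent column with the given name."""
    matches = [c for c in store[row_key] if c[0] == column_name]
    return max(matches, key=lambda c: c[2])[1] if matches else None


def neighbors(store, node, label):
    """Return the nodes reachable from `node` over edges with the given label."""
    return [dst for src, lbl, dst in store["edges"] if src == node and lbl == label]


author = column_value(column_store, "tweet:1", "user")
posted = neighbors(graph_store, "alice", "POSTED")
```

The same information is reachable in every model; what changes is the unit of access: a single column, a whole document, an opaque value, or a traversal along an edge.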
Twitter Case Study
Twitter API records tweets as JSON objects
Tested DBMS:
Microsoft SQL Server 2014: Enterprise edition, no native
JSON support
PostgreSQL 9.4.1: Open-source SQL, JSON support added in
late 2013
MongoDB 3.0.1: Document store, native JSON support as
Binary JSON (BSON)
Redis 2.8: Key-value store, no native JSON support, but JSON can
be stored as a value
Three workloads with different read/write ratios were tested:
Workload A: 50/50 R/W
Workload B: 95/5 R/W
Workload C: 100% read
Note: Redis crashed when handling 25+ million tweets
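The three read/write mixes above can be sketched as a small operation generator. This is only an illustration of what a ratio such as 95/5 means in practice; the `WORKLOADS` table, the function names, and the seeded random draw are assumptions and are not taken from the benchmark used in the study.

```python
import random

# Read fraction for each workload (A: 50/50, B: 95/5, C: read-only).
WORKLOADS = {"A": 0.50, "B": 0.95, "C": 1.00}


def generate_operations(workload, count, seed=0):
    """Return `count` operations, each "read" or "write", drawn at the workload's ratio."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    read_fraction = WORKLOADS[workload]
    return ["read" if rng.random() < read_fraction else "write" for _ in range(count)]


def read_share(operations):
    """Fraction of the operations that are reads."""
    return operations.count("read") / len(operations)


ops_a = generate_operations("A", 10_000)
ops_b = generate_operations("B", 10_000)
ops_c = generate_operations("C", 10_000)
```

Calling `read_share` on each list gives values near 0.5 and 0.95 for workloads A and B and exactly 1.0 for workload C, since `rng.random()` always returns a value below 1.0.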
EHR Case Study
Designing a new Electronic Health Record (EHR) system
using NoSQL
Tested DBMS:
MongoDB 2.2: Document store
Cassandra 2.0: Column store
Riak 1.4: Key-value store
Typical Workload: 80/20 R/W ratio
Graphs depicted are based on strong consistency
EHR Case Study: Read/Write
Cassandra performed the best overall
Riak performed the second best overall
MongoDB performed the worst overall
Scalability must also account for latency
EHR Case Study: Latencies
There is a trade-off between performance and
operation latency
Cassandra has the best overall performance at the cost of the
highest average latencies
Example: At 32 client threads
Riak’s read latency is 20% lower than Cassandra’s
MongoDB’s write latency is 25% lower than Cassandra’s
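One way to read the percentages above is as a relative difference measured against Cassandra's latency. The short sketch below shows that arithmetic; the function name and the millisecond values are hypothetical and chosen only to make the calculation easy to follow. They are not measurements from the study.

```python
def percent_lower(latency, baseline):
    """How much lower `latency` is than `baseline`, as a percentage of the baseline."""
    return (baseline - latency) * 100 / baseline


# Hypothetical latencies in milliseconds, for illustration only.
cassandra_read_ms = 10.0
riak_read_ms = 8.0

riak_read_advantage = percent_lower(riak_read_ms, cassandra_read_ms)
```

Under this reading, a store with an 8 ms read latency is 20% lower than one at 10 ms. The comparison is always relative to the baseline, so it says nothing about the absolute latencies involved.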
References
Comparing NoSQL to an SQL DB
https://dl-acm-org.proxy.lib.sfu.ca/citation.cfm?id=2500047
Performance evaluation of Twitter datasets on SQL and NoSQL DBMS
http://proxy.lib.sfu.ca/login?url=https://search.ebscohost.com/login.aspx?direct=true&db=aph&AN=119563760&site=ehost-live
Performance Evaluation of NoSQL Databases
https://dl-acm-org.proxy.lib.sfu.ca/citation.cfm?id=2694731#