Performance of NoSQL
By Nicholas Fung
NoSQL Overview
Not Only SQL
ACID properties not guaranteed
BASE properties
Basically Available
Soft state
Eventual consistency
Used for storing massive amounts of unstructured, complex data
Can be used for storing semi-structured data
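To make the "eventual consistency" bullet concrete, here is a minimal sketch of how a replica can briefly return stale data. The class and method names (EventuallyConsistentStore, sync) are hypothetical and do not come from any of the databases discussed; real systems replicate asynchronously over a network rather than through an explicit call.

```python
class EventuallyConsistentStore:
    """Toy model: writes land on one replica and reach the others later."""

    def __init__(self, replica_count=2):
        # Each replica is just an independent dictionary.
        self.replicas = [{} for _ in range(replica_count)]
        # Writes accepted by replica 0 but not yet copied elsewhere.
        self.pending = []

    def write(self, key, value):
        # The write is acknowledged as soon as one replica has it.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica_index):
        # A read served by a lagging replica may return stale data (None here).
        return self.replicas[replica_index].get(key)

    def sync(self):
        # Stand-in for background replication: copy pending writes everywhere.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()
```

Reading from replica 1 immediately after a write returns nothing; after sync() every replica agrees, which is the "eventual" part of the guarantee.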
NoSQL Database Models
Column Store
Store columns consisting of a unique name, value, and timestamp
Document Store
Store objects as documents (frequently JSON)
Key-value Store
Associative array stores key-value pairs
Graph Store
Store data in a graph structure
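The four models above can be illustrated with plain Python structures. This is only a rough sketch of the shape of the data in each model, not of how any particular database stores it; all keys, values, and the followers_of helper are made up for illustration.

```python
import time

# Column store: each column is a (name, value, timestamp) triple under a row key.
column_store = {
    "user:1": [
        ("name", "Ada", time.time()),
        ("city", "London", time.time()),
    ]
}

# Document store: a whole nested object (often JSON) is stored per key.
document_store = {
    "user:1": {"name": "Ada", "address": {"city": "London"}, "tags": ["admin"]}
}

# Key-value store: an associative array; the value is opaque to the store.
key_value_store = {"user:1:name": "Ada"}

# Graph store: nodes plus labelled edges between them.
graph_store = {
    "nodes": {"ada": {"kind": "user"}, "bob": {"kind": "user"}},
    "edges": [("ada", "follows", "bob")],
}


def followers_of(graph, node):
    """Return every node that has a 'follows' edge pointing at `node`."""
    return [src for (src, rel, dst) in graph["edges"]
            if rel == "follows" and dst == node]
```

For example, followers_of(graph_store, "bob") walks the edge list and returns the nodes that follow "bob", a relationship query that the other three shapes do not express directly.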
Twitter Case Study
Twitter API records tweets as JSON objects
Tested DBMS:
Microsoft SQL Server 2014: Enterprise edition, no native JSON support
PostgreSQL 9.4.1: Open-source SQL, JSON support added in late 2013
MongoDB 3.0.1: Document store, native JSON support as Binary JSON (BSON)
Redis 2.8: Key-value store, no native JSON support, but JSON text can be stored as a value
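The difference between "native JSON support" and "allows storage" can be sketched as follows. A plain dictionary stands in for a key-value store that only holds strings; the tweet fields, key format, and helper names are hypothetical and are not taken from the Twitter API or from the study.

```python
import json

# A made-up tweet-like object; real tweets carry many more fields.
tweet = {"id": 1, "user": {"screen_name": "example"}, "text": "hello"}

# Stand-in for a key-value store whose values are opaque strings.
kv = {}


def put_tweet(store, tweet):
    # The store cannot interpret JSON, so the client serializes it first.
    store[f"tweet:{tweet['id']}"] = json.dumps(tweet)


def get_tweet_field(store, tweet_id, field):
    # Reading one field means fetching and parsing the whole value client-side.
    raw = store[f"tweet:{tweet_id}"]
    return json.loads(raw)[field]
```

A document store with native JSON support could index or query the "text" field on the server, whereas in this sketch all of that work falls to the client.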
Tested three workloads with different read/write ratios:
Workload A: 50/50 R/W
Workload B: 95/5 R/W
Workload C: 100% read
Note: Redis crashed when handling 25+ million tweets
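A workload with a given read/write ratio can be approximated by choosing each operation at random. The sketch below runs against an in-memory dictionary rather than a real DBMS; the read fractions mirror workloads A, B, and C, while the key space, operation count, and seed are arbitrary assumptions.

```python
import random

# Fraction of operations that are reads in each workload.
WORKLOADS = {"A": 0.50, "B": 0.95, "C": 1.00}


def run_workload(name, count=10_000, seed=0):
    """Issue `count` random operations and return (reads, writes)."""
    rng = random.Random(seed)   # seeded so runs are repeatable
    store = {}                  # stand-in for the database under test
    reads = writes = 0
    for i in range(count):
        key = f"tweet:{rng.randrange(1000)}"
        if rng.random() < WORKLOADS[name]:
            store.get(key)          # read (may miss if the key is absent)
            reads += 1
        else:
            store[key] = {"id": i}  # write
            writes += 1
    return reads, writes
```

In a real benchmark the dictionary calls would be replaced by client calls to the database, and each call would be timed.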
EHR Case Study
Designing a new Electronic Health Record (EHR) system
using NoSQL
Tested DBMS:
MongoDB 2.2: Document store
Cassandra 2.0: Column store
Riak 1.4: Key-value store
Typical Workload: 80/20 R/W ratio
Graphs depicted are based on strong consistency
EHR Case Study: Read/Write
Cassandra performed the best overall
Riak performed the second best overall
MongoDB performed the worst overall
Scalability must also account for latency
EHR Case Study: Latencies
There is a trade-off between performance and operation latency
Cassandra has the best overall performance at the cost of the highest average latencies
Example: At 32 client threads
Riak’s read latency is 20% faster than Cassandra’s
MongoDB’s write latency is 25% faster than Cassandra’s
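The distinction between overall performance (throughput) and per-operation latency can be sketched with two small helpers. This assumes "X% faster" means X% lower latency, which the slides do not state explicitly; the function names are hypothetical and any numbers passed in are placeholders, not measurements from the study.

```python
def summarize(latencies_ms, wall_seconds):
    """Return (throughput in ops/sec, mean latency in ms) for one run."""
    throughput = len(latencies_ms) / wall_seconds
    mean_latency = sum(latencies_ms) / len(latencies_ms)
    return throughput, mean_latency


def percent_lower(candidate_ms, baseline_ms):
    """How much lower the candidate's latency is, as a percentage of the baseline."""
    return (baseline_ms - candidate_ms) / baseline_ms * 100.0
```

With many client threads, a system can complete more operations per second in total while each individual operation waits longer, which is why both figures need to be reported.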
References
Comparing NoSQL to an SQL DB
https://dl-acm-org.proxy.lib.sfu.ca/citation.cfm?id=2500047
Performance evaluation of Twitter datasets on SQL and NoSQL DBMS
http://proxy.lib.sfu.ca/login?url=https://search.ebscohost.com/login.aspx?direct=true&db=aph&AN=119563760&site=ehost-live
Performance Evaluation of NoSQL Databases
https://dl-acm-org.proxy.lib.sfu.ca/citation.cfm?id=2694731#