Key-value databases in practice
Matteo Baglini, Software Developer
http://it.linkedin.com/in/matteobaglini
http://github.com/bmatte
www.dotnettoscana.org
2
What is Redis?
«Advanced key-value store. It is often
referred to as a data structure server»
3
Key-Value
Key Value
page:index <html><head>[...]
user:123:session xDrSdEwd4dSlZkEkj+
user:123:avatar 77u/PD94bWwgdm+
Everything is a «blob»
Commands, primarily, can GET and SET the values
4
Advanced Key-Value
Key                 Value                                        Type
page:index          <html><head>[...]                            String
events:timeline     { «Joe logged», «File X Uploaded», … }       List
logged:today        { 1, 2, 3, 4, 5 }                            Set
user:123:profile    time => 10927353, username => bmatte         Hash
game:leaderboard    joe ~ 1.3483, smith ~ 293.45, fred ~ 83.22   Sorted Set
Different «data type/structure»
Rich set of specialized commands
5
Advanced Key-Value
Everything is stored in memory
Screamingly fast performance
Persistent via snapshot or append-only log file
Replication (only Master/Slave)
Extensible via embedded scripting engine (Lua)
Rich set of client libraries
High availability (In progress)
◦ Cluster (Fault tolerance, Multi-Node consistency)
◦ Sentinel (Monitoring, Notification, Automatic failover)
6
Project
Created by Salvatore Sanfilippo (@antirez)
First «public release» in March 2009.
Since 2010 sponsored by VMware.
Initially written to improve the performance of LLOOGG, the web analytics product of his startup.
7
Project
Written in ANSI C
No external dependencies
Single thread (asynchronous evented I/O)
Works on all POSIX-like systems
Unofficial builds exist for Windows
Open source, BSD licensed
Community (list, IRC & wiki)
8
Manifesto
1. A DSL for Abstract Data Types.
2. Memory storage is #1.
3. Fundamental data structures for a fundamental API.
4. Code is like a poem.
5. We're against complexity.
6. Two levels of API.
7. We optimize for joy.
9
Getting Started
10
Install
Latest stable version (2.6.*)
11
Install
Latest unstable version (2.9.7)
12
Server
13
Configuration
14
Client
15
Telnet
16
Data Types
17
Strings
18
Strings
Any blob will do
(A value can be at most 512 MB)
19
Strings
Operations on strings holding an integer
20
Strings Commands
21
Strings Patterns
Sharing state across processes
◦ Distributed lock, Incremental ID, Time series, User session.
Web Analytics
◦ User visits (day, week, month), Feature Tracking.
Caching
◦ String values can hold arbitrary data.
Rate limiting
◦ Limit number of API calls/minute.
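The rate-limiting pattern above can be sketched with a counter that expires. This is a minimal sketch, assuming a fixed one-minute window; `FakeStringStore` and `allow_request` are hypothetical names, and the store is an in-memory stand-in that only imitates the Redis INCR and EXPIRE commands, not a real client.

```python
import time


class FakeStringStore:
    """In-memory stand-in for a Redis server (INCR and EXPIRE only)."""

    def __init__(self, clock=time.time):
        self._values = {}
        self._deadlines = {}
        self._clock = clock

    def _drop_if_expired(self, key):
        deadline = self._deadlines.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._deadlines.pop(key, None)

    def incr(self, key):
        # Like INCR: a missing key counts as 0 before incrementing.
        self._drop_if_expired(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        # Like EXPIRE: set a time to live on an existing key.
        if key in self._values:
            self._deadlines[key] = self._clock() + seconds


def allow_request(store, client_id, limit, window_seconds=60):
    """Return True while the client is within `limit` calls per window."""
    key = f"rate:{client_id}"
    count = store.incr(key)
    if count == 1:
        # First call in this window: start the countdown.
        store.expire(key, window_seconds)
    return count <= limit


fake_now = [0.0]  # controllable clock so the demo does not sleep
store = FakeStringStore(clock=lambda: fake_now[0])
first_window = [allow_request(store, "user:123", limit=2) for _ in range(3)]
fake_now[0] = 61.0  # one minute later the counter has expired
after_expiry = allow_request(store, "user:123", limit=2)
```

Against a real server the two calls would be separate commands, so a production version would probably need to make the increment and the expiry atomic.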
22
Keys
23
Expiration
Any item in Redis can be made to expire after or at a certain time.
24
Keys Commands
25
Lists
26
Lists
Sequence of string values
27
Lists
Sequence of string values
(Max length is 2^32 - 1 elements)
28
Lists
Prevent indefinite growth
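One common way to keep a list bounded is to push and then trim. The sketch below models that idea on a plain Python list; `lpush`, `ltrim` and `record_event` are hypothetical helpers that imitate the Redis LPUSH and LTRIM commands for non-negative indexes only.

```python
def lpush(items, value):
    """Like LPUSH: insert at the head and return the new length."""
    items.insert(0, value)
    return len(items)


def ltrim(items, start, stop):
    """Like LTRIM with non-negative indexes: keep start..stop inclusive."""
    items[:] = items[start:stop + 1]


def record_event(timeline, event, max_len=3):
    """Add the newest event and drop anything beyond max_len entries."""
    lpush(timeline, event)
    ltrim(timeline, 0, max_len - 1)


timeline = []
for event in ["a", "b", "c", "d", "e"]:
    record_event(timeline, event)
# timeline now holds only the three most recent events, newest first
```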
29
Lists Commands
30
Lists Patterns
Events Store or Notification
◦ Logs, Social Network Timelines, Notifications.
Fixed Data
◦ Last N activities.
Message Passing
◦ Durable MQ, Job Queue.
Circular list
31
Sets
32
Sets
Unordered set of unique values
33
Sets
Unordered set of unique values
(Max number of members is 2^32 - 1)
34
Sets
You can do unions, intersections, and differences of sets in very short time.
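Python's built-in sets behave much like the operations named above, so they make a quick mental model. The data below is made up for illustration; the comments name the Redis command each operator corresponds to (SINTER, SUNION, SDIFF).

```python
# Hypothetical "friends" sets for two users.
friends_of_anna = {"bob", "carl", "dora"}
friends_of_eric = {"carl", "dora", "finn"}

common = friends_of_anna & friends_of_eric      # SINTER: in both sets
everyone = friends_of_anna | friends_of_eric    # SUNION: in either set
only_anna = friends_of_anna - friends_of_eric   # SDIFF: in the first only
```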
35
Sets Commands
36
Sets Patterns
Web Analytics
◦ Unique Page View, IP addresses visiting.
Relations
◦ Friends, Followers, Tags.
Caching Result
◦ Store result of expensive intersection of data.
37
Sorted Set
38
Sorted Sets
Ordered set of unique values
39
Sorted Sets
Access by rank
40
Sorted Sets
Access by score
41
Sorted Sets Commands
42
Sorted Sets Patterns
Web Analytics
◦ Online users, Most visited pages.
Leaderboard
◦ Show top N.
Order by data
◦ Maintain a set of ordered data like users by age.
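Access by rank and by score, and the leaderboard pattern, can be modelled with a dictionary that maps each member to its score. This is a rough sketch: `zadd`, `zrange_by_rank` and `zrange_by_score` are hypothetical helpers that imitate ZADD, ZRANGE/ZREVRANGE and ZRANGEBYSCORE for non-negative indexes, and the scores reuse the game:leaderboard example from the earlier slide.

```python
def zadd(zset, score, member):
    """Like ZADD: add the member or update its score."""
    zset[member] = float(score)


def _ordered(zset, reverse=False):
    # Order by score, breaking ties by member name.
    return sorted(zset.items(), key=lambda kv: (kv[1], kv[0]), reverse=reverse)


def zrange_by_rank(zset, start, stop, reverse=False):
    """Members whose rank is in start..stop inclusive."""
    return [member for member, _ in _ordered(zset, reverse)[start:stop + 1]]


def zrange_by_score(zset, low, high):
    """Members whose score is in low..high inclusive, lowest first."""
    return [member for member, score in _ordered(zset) if low <= score <= high]


leaderboard = {}
zadd(leaderboard, 1.3483, "joe")
zadd(leaderboard, 293.45, "smith")
zadd(leaderboard, 83.22, "fred")

top_two = zrange_by_rank(leaderboard, 0, 1, reverse=True)  # show top N
mid_range = zrange_by_score(leaderboard, 1, 100)
```

A real sorted set keeps its members ordered on every write instead of sorting on every read as this model does.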
43
Hashes
44
Hashes
Key → Value map (as value)
45
Hashes
Set attributes
(Store up to 2^32 - 1 field-value pairs)
46
Hashes
Get attributes
47
Hashes Commands
48
Hashes Patterns
Storing Objects
◦ Hashes are maps between string fields and string values, so they are the perfect data type to represent objects.
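To make the object-storage idea concrete, a hash can be pictured as a dictionary nested inside the key space. The sketch below is an in-memory model only; `hset`, `hget` and `hgetall` are hypothetical helpers named after the Redis commands HSET, HGET and HGETALL, and the profile fields come from the user:123:profile example earlier in the deck.

```python
keyspace = {}  # stand-in for the server: key -> {field: value}


def hset(key, field, value):
    """Like HSET: set one field; values are stored as strings."""
    keyspace.setdefault(key, {})[field] = str(value)


def hget(key, field):
    """Like HGET: one field, or None if the key or field is missing."""
    return keyspace.get(key, {}).get(field)


def hgetall(key):
    """Like HGETALL: a copy of every field-value pair."""
    return dict(keyspace.get(key, {}))


hset("user:123:profile", "username", "bmatte")
hset("user:123:profile", "time", 10927353)

username = hget("user:123:profile", "username")
profile = hgetall("user:123:profile")
```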
49
Persistence
50
Snapshot (RDB)
Dump data to disk after certain conditions are met
51
Snapshot (RDB)
Pros:
◦ RDB is a very compact single file.
◦ RDB files are perfect for backups.
◦ RDB is very good for disaster recovery.
◦ RDB allows faster restarts with big datasets.
◦ RDB maximizes performance (background I/O via fork(2)).
Cons:
◦ RDB is NOT good if you need to minimize the chance of data loss in case Redis stops working (for example after a power outage).
◦ Fork can be time consuming if the dataset is very big.
52
Append-only (AOF)
Append all write operations to a log
53
Append-only (AOF)
Durability depends on fsync(2) policy
54
Append-only (AOF)
Pros:
◦ AOF is much more durable.
◦ AOF is an append-only log: no seeks, no corruption problems (for example after a power outage).
◦ AOF contains a log of all the operations one after the other in an easy to understand and parse format.
Cons:
◦ AOF files are usually bigger than the equivalent RDB.
◦ AOF can be slower than RDB depending on the exact fsync policy.
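The append-only idea can be illustrated with a toy store that logs every write and rebuilds its state by replaying the log. This is only a sketch: `AppendOnlyStore` is a hypothetical class, the log lives in a Python list rather than on disk, and the one-line-per-command text format (with keys assumed to contain no spaces) is a simplification, not the format Redis writes.

```python
class AppendOnlyStore:
    """Toy key-value store whose state can be rebuilt from its write log."""

    def __init__(self, log=None):
        self.log = [] if log is None else log
        self.data = {}
        for line in self.log:
            # Recovery: replay every logged write in order.
            self._apply(line)

    def _apply(self, line):
        op, key, *rest = line.split(" ", 2)
        if op == "SET":
            self.data[key] = rest[0]
        elif op == "DEL":
            self.data.pop(key, None)

    def set(self, key, value):
        line = f"SET {key} {value}"
        self.log.append(line)  # log first, then change memory
        self._apply(line)

    def delete(self, key):
        line = f"DEL {key}"
        self.log.append(line)
        self._apply(line)


live = AppendOnlyStore()
live.set("page:index", "<html>")
live.set("tmp", "1")
live.delete("tmp")

# Simulate a restart: a fresh store replays a copy of the log.
recovered = AppendOnlyStore(log=list(live.log))
```

The log keeps growing even when keys are deleted, which hints at why AOF files tend to be bigger than the equivalent snapshot.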
55
What should I use?
Use both persistence methods if you want a degree of data safety comparable to what any RDBMS can provide you.
If you care a lot about your data, but can still live with a few minutes of data loss in case of disasters, you can simply use RDB alone.
There are many users using AOF alone, but we discourage it, since having an RDB snapshot from time to time is a great idea for database backups and for faster restarts.
56
C# Clients
57
C# clients
Rich set of clients
58
C# clients
59
BookSleeve
60
Code
61
Transactions
62
Transactions
Multiple commands (ACID)
63
Transactions Commands
64
Transactions Patterns
Classic scenario
◦ Multi atomic commands.
Optimistic locking
◦ Check and Set (CAS Pattern): write only if not changed.
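The check-and-set pattern can be sketched with a version counter per key: a write goes through only if nothing touched the key since it was read. `VersionedStore` and its methods are hypothetical names for an in-memory model of the idea behind WATCH, MULTI and EXEC; it is not how a Redis client is called.

```python
class VersionedStore:
    """In-memory model of optimistic locking on single keys."""

    def __init__(self):
        self._data = {}
        self._versions = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
        self._versions[key] = self._versions.get(key, 0) + 1

    def watch(self, key):
        """Remember the key's current version (the idea behind WATCH)."""
        return self._versions.get(key, 0)

    def exec_if_unchanged(self, key, watched_version, new_value):
        """Write only if the key was not modified since watch()."""
        if self._versions.get(key, 0) != watched_version:
            return False  # someone else wrote first: caller should retry
        self.set(key, new_value)
        return True


store = VersionedStore()
store.set("balance", 100)

version = store.watch("balance")
seen = store.get("balance")
store.set("balance", 50)  # a concurrent writer sneaks in
first_try = store.exec_if_unchanged("balance", version, seen - 10)

version = store.watch("balance")  # retry with fresh data
seen = store.get("balance")
second_try = store.exec_if_unchanged("balance", version, seen - 10)
```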
65
Publish Subscribe
66
PubSub
Provide 1-N messaging
67
PubSub
Subscribe to multiple channels, decoupled from the key space
68
PubSub
Publish on some channel
69
PubSub
Subscriber getting notified
70
PubSub Commands
71
PubSub Patterns
Message Passing
◦ Distributed message-oriented systems, Event-Driven Architecture, Service Bus.
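The 1-N messaging flow from the previous slides can be modelled in a single process. `Broker` is a hypothetical class; its methods are named after the SUBSCRIBE and PUBLISH commands, and `publish` returns the number of subscribers reached, which is an assumption chosen to resemble what PUBLISH reports.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for channel-based publish/subscribe."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(channel, message)
        return len(callbacks)


broker = Broker()
inbox_a, inbox_b = [], []

# One subscriber can listen on several channels.
broker.subscribe("news", lambda ch, msg: inbox_a.append((ch, msg)))
broker.subscribe("alerts", lambda ch, msg: inbox_a.append((ch, msg)))
broker.subscribe("news", lambda ch, msg: inbox_b.append((ch, msg)))

delivered_news = broker.publish("news", "hello")
delivered_none = broker.publish("sports", "nobody listens here")
```

Messages published to a channel with no subscribers are simply dropped in this model.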
72
Code
73
Replication
74
Replication
One master replicates to multiple slaves
75
Replication
The slave sends the SYNC command and the master transfers the database file to the slave
76
Replication
Slaves can perform only read operations
77
Replication Patterns
Scalability
◦ Multiple slaves for read-only queries.
Redundancy
◦ Data replication.
Slave of Slave
◦ Graph-like structure for more scalability and redundancy.
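The replication patterns above can be pictured with a small in-memory model: a new slave copies the master's data once, then receives every later write, and it rejects writes of its own. `Node` and `ReadOnlyError` are hypothetical names, and the model propagates writes synchronously, which is a simplification of real replication.

```python
class ReadOnlyError(Exception):
    """Raised when a client tries to write to a slave."""


class Node:
    def __init__(self, master=None):
        self.data = {}
        self.master = master
        self.replicas = []
        if master is not None:
            self.data = dict(master.data)  # initial full copy
            master.replicas.append(self)

    def set(self, key, value):
        if self.master is not None:
            raise ReadOnlyError("slaves accept reads only")
        self._apply(key, value)

    def _apply(self, key, value):
        self.data[key] = value
        for replica in self.replicas:
            replica._apply(key, value)  # forward down the chain

    def get(self, key):
        return self.data.get(key)


master = Node()
master.set("a", "1")
slave = Node(master=master)   # receives "a" during the initial copy
sub_slave = Node(master=slave)  # slave of slave
master.set("b", "2")

try:
    slave.set("c", "3")
    write_rejected = False
except ReadOnlyError:
    write_rejected = True
```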
78
Performance
79
Performance
~50K read/write operations per second.
~100K read/write ops per second on a regular EC2 instance.
Screamingly fast performance
80
Performance
redis-benchmark tool on an Ubuntu virtual machine: ~36K rps
81
Application Architecture
82
Infrastructure
Application Server
SQL Server
Redis
83
Who is using Redis?
84
Finally
85
This is Redis
«I see Redis definitely more as a flexible tool than as a solution specialized to solve
a specific problem: his mixed soul of cache, store, and messaging server shows
this very well»
Salvatore Sanfilippo
86
Resources
http://redis.io/
http://github.com/antirez/redis
http://groups.google.com/group/redis-db
That’s all!