Mixing Causal Consistency and
Asynchronous Replication for Large
Neo4j Clusters
Dr. Jim Webber
Chief Scientist, Neo4j
Leads to a social graph
Graphs underpin ML to
defeat some cancers
NASA’s Orion mission to Mars:
2 years shaved from planning schedule
Motivation
Why do we need clusters of Neo4j?
Massive Throughput
Data Redundancy
High Availability
Error! 503: Service Unavailable
✓
3.0: Massive Throughput, Data Redundancy, High Availability
3.1 Causal Clustering: Bigger Clusters, Consensus Commit, Built-in load balancing
Motivation
Can you reason about consistency?
Register / Login: You need to log in to continue your purchase!
Register: Username: jim_w, Password: ********, Create Account
Login: Username: jim_w, Password: ********, Login
Expected: Login Successful, then Purchase
Actual: No account found! Try again 𝙓
Three critical issues: Fault Tolerance, Scale, and Correctness
Design Trade-off
Availability vs. Reliability
Roles for safety and scale
Divide and conquer complexity
Core
• Small group of Neo4j databases
• Fault-tolerant Consensus Commit
• Responsible for data safety
Writing to the Core Cluster
[Animation: the Neo4j Driver sends CREATE (:User {...}) to the Neo4j Cluster; Core members acknowledge the write (✓) one after another, and the driver receives Success.]
Raft Protocol
Non-Blocking Consensus for Humans
https://github.com/ongardie/raftscope
Raft in a Nutshell
• Raft keeps logs tied together (geddit?)
• Logs contain entries for both the data and cluster membership
• Entries are appended and subsequently committed if a simple majority agree
• Implication: majority agree with the log as proposed
• Anyone can call an election: highest term (logical clock) wins, followed by
highest committed, followed by highest appended
• Appended but uncommitted entries can be truncated, but this is safe
(transaction aborted)
@heidiann360
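The commit rule and the election ordering in the bullets above can be sketched in plain Java. This is only an illustrative model of the rules as stated on the slide, not Neo4j's Raft implementation; the names RaftSketch, Candidate, isCommitted, and winner are hypothetical.

```java
import java.util.Comparator;

public class RaftSketch {
    /** Smallest number of members that forms a simple majority. */
    public static int majority(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    /** An appended entry counts as committed once a simple majority holds it. */
    public static boolean isCommitted(int membersHoldingEntry, int clusterSize) {
        return membersHoldingEntry >= majority(clusterSize);
    }

    /** Hypothetical snapshot of a member's state at election time. */
    public static final class Candidate {
        public final String name;
        public final long term;      // logical clock
        public final long committed; // index of the highest committed entry
        public final long appended;  // index of the highest appended entry

        public Candidate(String name, long term, long committed, long appended) {
            this.name = name;
            this.term = term;
            this.committed = committed;
            this.appended = appended;
        }
    }

    /** Ordering from the slide: highest term, then highest committed, then highest appended. */
    public static final Comparator<Candidate> ELECTION_ORDER = Comparator
            .comparingLong((Candidate c) -> c.term)
            .thenComparingLong(c -> c.committed)
            .thenComparingLong(c -> c.appended);

    /** Returns whichever candidate ranks higher; ties go to the first argument. */
    public static Candidate winner(Candidate a, Candidate b) {
        return ELECTION_ORDER.compare(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        Candidate a = new Candidate("a", 3, 10, 12);
        Candidate b = new Candidate("b", 3, 11, 11);
        // Same term, so the higher committed index decides.
        System.out.println("winner: " + winner(a, b).name);
        System.out.println("3 of 5 committed? " + isCommitted(3, 5));
    }
}
```

With five Core members, three acknowledgements are enough to commit, so the cluster keeps accepting writes with two members down.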
Consensus Log → Committed Transactions →
Updated Graph
[Figure: Neo4j Raft implementation, showing each member's consensus log and transaction log]
Consensus log: stores both committed and uncommitted transactions. Uncommitted entries may differ between members.
Transactions are only appended to the transaction log when committed according to Raft.
Transaction log: the same transactions appear in the same order on all members.
Transactions are applied, updating the graph.
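The relationship between the two logs can be modelled in a few lines of Java. This is a simplified sketch, assuming string-valued transactions and a commit index supplied from outside; CoreMemberSketch and its methods are hypothetical names, not Neo4j classes.

```java
import java.util.ArrayList;
import java.util.List;

public class CoreMemberSketch {
    // Consensus log: committed and uncommitted entries.
    private final List<String> consensusLog = new ArrayList<>();
    // Transaction log: committed entries only, in commit order.
    private final List<String> transactionLog = new ArrayList<>();
    // Index of the last committed consensus entry (-1 means none yet).
    private int commitIndex = -1;

    /** Appends an entry that is not yet committed. */
    public void append(String tx) {
        consensusLog.add(tx);
    }

    /** Called once consensus says entries up to newCommitIndex are committed. */
    public void commitUpTo(int newCommitIndex) {
        int last = Math.min(newCommitIndex, consensusLog.size() - 1);
        while (commitIndex < last) {
            commitIndex++;
            // Only now does the entry reach the transaction log,
            // from where it would be applied to the graph.
            transactionLog.add(consensusLog.get(commitIndex));
        }
    }

    /** Drops uncommitted entries from fromIndex onward; committed entries are never removed. */
    public void truncateUncommitted(int fromIndex) {
        int start = Math.max(fromIndex, commitIndex + 1);
        while (consensusLog.size() > start) {
            consensusLog.remove(consensusLog.size() - 1);
        }
    }

    public List<String> transactionLog() {
        return new ArrayList<>(transactionLog);
    }

    public int consensusLogSize() {
        return consensusLog.size();
    }

    public static void main(String[] args) {
        CoreMemberSketch member = new CoreMemberSketch();
        member.append("tx0");
        member.append("tx1");
        member.append("tx2");      // appended, never committed
        member.commitUpTo(1);
        member.truncateUncommitted(0);
        System.out.println(member.transactionLog());
    }
}
```

Truncation only ever touches the uncommitted tail, which is why it is safe: nothing that reached the transaction log can be lost.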
Core
• Small group of Neo4j databases
• Fault-tolerant Consensus Commit
• Responsible for data safety
Read Replicas
• For massive query throughput
• Read-only replicas
• Not involved in Consensus Commit
• Disposable, suitable for auto-scaling
Propagating updates to the Read Replicas
[Animation: the Neo4j Driver sends a Write to the Neo4j Cluster; the update is then propagated to the Read Replicas.]
Reading from the Read Replicas
[Animation: the Neo4j Driver sends a Read, which is served by the Read Replicas.]
Core: updating the graph
Read Replicas: querying the graph (queries, analysis, reporting)
ESTATE=$(neo-workbench estate add database -p Local -b core-block -s 3)
neo-workbench estate add database -p Local -b edge-block -s 10 $ESTATE
neo-workbench database install -m Core \
--package-uri file:///Users/jim/Downloads/neo4j-enterprise-3.1.1-unix.tar.gz \
-b core-block $ESTATE
neo-workbench database install -m Read_Replica \
--package-uri file:///Users/jim/Downloads/neo4j-enterprise-3.1.1-unix.tar.gz \
-b edge-block $ESTATE
neo-workbench database start $ESTATE
Building an App
Computer science meets software dev
[Diagram: App Server, Neo4j Driver, Bolt protocol]
Java
<dependency>
    <groupId>org.neo4j.driver</groupId>
    <artifactId>neo4j-java-driver</artifactId>
</dependency>
Python
pip install neo4j-driver
.NET
PM> Install-Package Neo4j.Driver
JavaScript
npm install neo4j-driver
https://neo4j.com/developer/language-guides
bolt://
GraphDatabase.driver( "bolt://aServer" )
bolt+routing://
GraphDatabase.driver( "bolt+routing://aCoreServer" )
Bootstrap: specify any core server to route load across the whole cluster
[Diagram labels: Application Server, Neo4j Driver; Max, Jim, Jane, Mark]
Routed write statements
Driver driver = GraphDatabase.driver( "bolt+routing://aCoreServer" );
// A WRITE session is routed to a server that accepts writes.
try ( Session session = driver.session( AccessMode.WRITE ) )
{
    try ( Transaction tx = session.beginTransaction() )
    {
        tx.run( "MERGE (user:User {userId: {userId}})",
                parameters( "userId", userId ) );
        tx.success();
    }
}
Routed read queries
Driver driver = GraphDatabase.driver( "bolt+routing://aCoreServer" );
// A READ session can be routed away from the server handling writes.
try ( Session session = driver.session( AccessMode.READ ) )
{
    try ( Transaction tx = session.beginTransaction() )
    {
        tx.run( "MATCH (user:User {userId: {userId}})-[*]-(:Product) RETURN *",
                parameters( "userId", userId ) );
        tx.success();
    }
}
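What routing by access mode amounts to can be shown with a toy routing table. This is a rough sketch only: the real driver obtains its routing information from the cluster, and the class RoutingSketch, the server names, and the round-robin choice of reader are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoutingSketch {
    public enum Mode { READ, WRITE }

    // Hypothetical: the single server currently accepting writes.
    private final String writer;
    // Hypothetical: servers that can answer read queries.
    private final List<String> readers;
    private final AtomicInteger next = new AtomicInteger();

    public RoutingSketch(String writer, List<String> readers) {
        if (readers.isEmpty()) {
            throw new IllegalArgumentException("at least one reader is required");
        }
        this.writer = writer;
        this.readers = new ArrayList<>(readers);
    }

    /** Writes go to the writer; reads are spread across the readers in turn. */
    public String route(Mode mode) {
        if (mode == Mode.WRITE) {
            return writer;
        }
        int i = Math.floorMod(next.getAndIncrement(), readers.size());
        return readers.get(i);
    }

    public static void main(String[] args) {
        RoutingSketch table = new RoutingSketch(
                "core-1", Arrays.asList("replica-1", "replica-2"));
        System.out.println(table.route(Mode.WRITE));
        System.out.println(table.route(Mode.READ));
        System.out.println(table.route(Mode.READ));
    }
}
```

The application only declares its intent (AccessMode.READ or AccessMode.WRITE); picking a server stays inside the driver.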
Consistency models
Can you read what you write?
Cluster members are slightly “ahead” of or “behind” each other
[Figure: five members' logs; one has reached transaction 11, the others are at 10]
If I query this server, I won’t see the updates from transaction 11.
If I query this server, I’ll see all updates from all committed transactions.
Create Account as jim_w, then Login: No account found! Try again 𝙓
A few moments later...
Login as jim_w: Login Successful ✓ Purchase
Q: Why didn’t this work?
A: Eventual Consistency
[Animation: App Server A's driver sends CREATE (:User) for Create Account, and transaction 11 appears on one cluster member while the others are still at 9 or 10. App Server B's driver then sends MATCH (:User) for Login, and the query reaches a member that does not yet have transaction 11.]
Bookmark
• Session token
• String (for portability)
• Opaque to application
• Represents ultimate user’s most recent view of the graph
• More capabilities to come
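One way to picture what a bookmark buys you is a server that holds a read until it has caught up to the bookmarked transaction. The Java below is a minimal sketch of that idea under loud assumptions: the bookmark is modelled as a plain transaction id (the real bookmark is an opaque string), and BookmarkSketch, Replica, canServe, and awaitBookmark are hypothetical names.

```java
public class BookmarkSketch {
    /** Hypothetical server that tracks the id of the last transaction it has applied. */
    public static final class Replica {
        private long lastAppliedTx;

        public Replica(long lastAppliedTx) {
            this.lastAppliedTx = lastAppliedTx;
        }

        /** Records that a transaction has been applied and wakes any waiting readers. */
        public synchronized void apply(long txId) {
            lastAppliedTx = Math.max(lastAppliedTx, txId);
            notifyAll();
        }

        /** True if this server already reflects the bookmarked transaction. */
        public synchronized boolean canServe(long bookmarkTx) {
            return lastAppliedTx >= bookmarkTx;
        }

        /** Waits until the bookmark is reached; returns false on timeout or interrupt. */
        public synchronized boolean awaitBookmark(long bookmarkTx, long timeoutMillis) {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (lastAppliedTx < bookmarkTx) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;
                }
                try {
                    wait(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        Replica replica = new Replica(10);           // behind: has not seen transaction 11
        System.out.println(replica.canServe(11));
        replica.apply(11);                           // the update arrives
        System.out.println(replica.awaitBookmark(11, 100));
    }
}
```

Without a bookmark the read is answered from whatever state the server happens to have, which is exactly the eventual-consistency failure in the login example.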
Let’s try again, with Causal Consistency
[Animation: App Server A's driver sends CREATE (:User) for Create Account, producing transaction 11. App Server B's driver then sends MATCH (:User) for Login, and the query is served by a cluster member that already has transaction 11.]
Obtain bookmark
String bookmark;
try ( Session session = driver.session( AccessMode.WRITE ) )
{
    try ( Transaction tx = session.beginTransaction() )
    {
        tx.run( "CREATE (user:User {userId: {userId}, passwordHash: {passwordHash}})",
                parameters( "userId", userId, "passwordHash", passwordHash ) );
        tx.success();
    }
    // Read the bookmark after the transaction has closed.
    bookmark = session.lastBookmark();
}
Use a bookmark
try ( Session session = driver.session( AccessMode.READ ) )
{
    try ( Transaction tx = session.beginTransaction( bookmark ) )
    {
        tx.run( "MATCH (user:User {userId: {userId}}) RETURN *",
                parameters( "userId", userId ) );
        tx.success();
    }
}
Takeaways
• Different roles for safety and scale
• Supports large clusters, multi-DC
• Drivers work with cluster to route to best instances
• All wrapped in causal consistency: you always read (at least) your own writes
Thanks for listening
Dr. Jim Webber
Chief Scientist, Neo4j