Modelling Data in Neo4j for beginners, common mistakes, frequently asked questions, hardware sizing and a few extra tips
GraphAwareTM
by Michal Bachman
a few best practices and lessons learned
Modelling Data in Neo4j
Ride-sharing website
History of rides
Friendships from Facebook
Aim: build trust between users
Example Domain
There is no single correct way.
Modelling Data as Graphs
Graphs are very whiteboard friendly.
Modelling Data as Graphs
[Diagram, built up over several slides: User nodes (Michael, Laura, Peter, Alice, Jenny) and FRIEND_OF relationships. Rides are first drawn as DROVE relationships, then given a date property (2014-01-29, 2014-01-29, 2014-01-27), then supplemented with RODE_TOGETHER relationships.]
[Diagram: the same domain with each ride as a node. Two Ride nodes (date: 2014-01-29, from: “London”, to: “Nottingham”; date: 2014-01-27, from: “Brighton”, to: “Hastings”) are linked to User nodes by DRIVER and PASSENGER relationships; the FRIEND_OF relationships remain.]
Model the important concepts in your domain as nodes; you will gain flexibility.
Nodes vs. Relationships
[Diagram: the Ride model again, extended with RATED relationships that carry a rating property (rating: 5, rating: 3).]
a common mistake
Bidirectional Relationships
[Diagram: a DEFEATED relationship between Czech Republic and Sweden. A DEFEATED_BY relationship in the opposite direction is already implied by it, so adding one is redundant.]
Ice Hockey (Implied Relationship)
[Diagram: Neo Technology and GraphAware linked by PARTNER. The slides first show one PARTNER relationship in each direction, then a single PARTNER relationship, which is enough.]
Company Partnership (Naturally Bidirectional)
In Neo4j, the speed of traversal does not depend on the direction of the relationships being traversed.
Traversal Speed
Why?
Node Record in the Node Store (9 bytes)
  inUse flag (first bit)
  next relationship (35 bits)
  next property (36 bits)
Relationship Record in the Relationship Store (33 bytes)
  inUse flag (first bit), second bit unused
  first node (35 bits)
  second node (35 bits)
  type (16 bits)
  first node's previous relationship (35 bits)
  first node's next relationship (35 bits)
  second node's previous relationship (35 bits)
  second node's next relationship (35 bits)
  next property (36 bits)
Neo4j Data Layout
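The record layout suggests why direction does not affect traversal speed: each relationship record points to both of its nodes and to the next relationship in each node's chain, so a chain can be walked from either end. The following is a minimal Java sketch of that idea only. It is an in-memory imitation, not Neo4j's actual code; the class and method names (RelationshipChainSketch, connect, neighbours) are hypothetical, and the "previous" pointers, types and property pointers of the real records are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified, in-memory imitation of the record layout; NOT Neo4j's actual code. */
class RelationshipChainSketch {
    static final int NONE = -1;

    /** A relationship record: both end nodes plus a "next" pointer for each node's chain. */
    static final class RelRecord {
        final int firstNode;
        final int secondNode;
        int firstNext = NONE;   // next relationship in the first node's chain
        int secondNext = NONE;  // next relationship in the second node's chain

        RelRecord(int firstNode, int secondNode) {
            this.firstNode = firstNode;
            this.secondNode = secondNode;
        }
    }

    // Node records: each holds only a pointer to the first relationship in its chain.
    private final int[] firstRelOfNode;
    private final List<RelRecord> rels = new ArrayList<>();

    RelationshipChainSketch(int nodeCount) {
        firstRelOfNode = new int[nodeCount];
        Arrays.fill(firstRelOfNode, NONE);
    }

    /** Stores a relationship directed from -> to and prepends it to both nodes' chains. */
    int connect(int from, int to) {
        RelRecord rel = new RelRecord(from, to);
        rel.firstNext = firstRelOfNode[from];
        rel.secondNext = firstRelOfNode[to];
        int id = rels.size();
        rels.add(rel);
        firstRelOfNode[from] = id;
        firstRelOfNode[to] = id;
        return id;
    }

    /** Walks a node's chain; an outgoing and an incoming record each cost one hop. */
    List<Integer> neighbours(int node) {
        List<Integer> result = new ArrayList<>();
        int id = firstRelOfNode[node];
        while (id != NONE) {
            RelRecord rel = rels.get(id);
            boolean outgoing = rel.firstNode == node;
            result.add(outgoing ? rel.secondNode : rel.firstNode);
            id = outgoing ? rel.firstNext : rel.secondNext;
        }
        return result;
    }

    public static void main(String[] args) {
        RelationshipChainSketch graph = new RelationshipChainSketch(3);
        graph.connect(0, 1); // stored as 0 -> 1
        graph.connect(2, 0); // stored as 2 -> 0
        System.out.println(graph.neighbours(0)); // node 0 reaches both 1 and 2
        System.out.println(graph.neighbours(1)); // node 1 reaches 0 against the stored direction
    }
}
```

In this sketch the only difference between following a relationship with or against its direction is which of the two pointer fields is read, which is the point the slide makes.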
Neo4j APIs allow developers to completely ignore relationship direction when querying the graph.
Traversal APIs
MATCH (neo)-[:PARTNER]->(partner)   // outgoing only
MATCH (neo)<-[:PARTNER]-(partner)   // incoming only
MATCH (neo)-[:PARTNER]-(partner)    // either direction
Cypher
If a relationship has a different quality in each direction, model it as two relationships!
Heads Up!
[Diagram: Geeky Guy LOVES Girl; Girl DOESN’T CARE ABOUT Geeky Guy.]
[Diagram: the Ride model again. Instead of a RATED relationship with a numeric rating property (5, 3), the rating can be expressed by the relationship type itself: HATED, DISLIKED, NEUTRAL, LIKED or LOVED. The two example ratings become LOVED and NEUTRAL.]
performance comparison
Qualifying Relationships
[Diagram: the Ride model with RATED relationships qualified by a rating property (rating: 5, rating: 3).]
Qualifying by Properties
START ride=node({id})
MATCH (ride)<-[r:RATED]-(passenger)
WHERE r.rating > 3
RETURN passenger
Who liked the ride? (Cypher)
for (Relationship r : ride.getRelationships(INCOMING, RATED)) {
    if ((int) r.getProperty("rating") > 3) {
        Node passenger = r.getStartNode();
        // do something with it
    }
}
Who liked the ride? (Java)
[Diagram: the Ride model with ratings expressed as relationship types (LOVED, NEUTRAL).]
Qualifying by Relationship Type
START ride=node({id})
MATCH (ride)<-[r:LIKED|LOVED]-(passenger)
RETURN passenger
Who liked the ride? (Cypher)
for (Relationship r : ride.getRelationships(INCOMING, LIKED, LOVED)) {
    Node passenger = r.getStartNode();
    // do something with it
}
Who liked the ride? (Java)
[Diagram: the Ride model qualified by relationship type (LOVED, NEUTRAL).]
Winner!
Other interesting info?
frequently asked question
Hardware Sizing
[Diagram: Neo4j architecture. Labels shown: HDD, Record Files, Transaction Log, Operating System, JVM, Neo4j, Object Cache, Core API, Other APIs, Transaction Management, File System Cache, Nodes, Relationships, Properties, Relationship Types.]
Neo4j Architecture
> cd data
> ls -ah
drwxr-xr-x 5 bachmanm wheel 170B 19 Oct 12:56 index
-rw-r--r-- 1 bachmanm wheel 31K 19 Oct 12:56 messages.log
-rw-r--r-- 1 bachmanm wheel 69B 19 Oct 12:56 neostore
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.id
-rw-r--r-- 1 bachmanm wheel 8.8K 19 Oct 12:56 neostore.nodestore.db
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.nodestore.db.id
-rw-r--r-- 1 bachmanm wheel 39M 19 Oct 12:56 neostore.propertystore.db
-rw-r--r-- 1 bachmanm wheel 153B 19 Oct 12:56 neostore.propertystore.db.arrays
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.propertystore.db.arrays.id
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.propertystore.db.id
-rw-r--r-- 1 bachmanm wheel 43B 19 Oct 12:56 neostore.propertystore.db.index
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.propertystore.db.index.id
-rw-r--r-- 1 bachmanm wheel 140B 19 Oct 12:56 neostore.propertystore.db.index.keys
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.propertystore.db.index.keys.id
-rw-r--r-- 1 bachmanm wheel 154B 19 Oct 12:56 neostore.propertystore.db.strings
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.propertystore.db.strings.id
-rw-r--r-- 1 bachmanm wheel 31M 19 Oct 12:56 neostore.relationshipstore.db
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.relationshipstore.db.id
-rw-r--r-- 1 bachmanm wheel 38B 19 Oct 12:56 neostore.relationshiptypestore.db
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.relationshiptypestore.db.id
-rw-r--r-- 1 bachmanm wheel 140B 19 Oct 12:56 neostore.relationshiptypestore.db.names
-rw-r--r-- 1 bachmanm wheel 9B 19 Oct 12:56 neostore.relationshiptypestore.db.names.id
Disk Space
node 14B
relationship 33B
property 41B
Disk Space (Example)
1,000 nodes x 14B = 13.7 kB
1,000,000 rels x 33B = 31.5 MB
2,010,000 props x 41B = 78.6 MB
TOTAL 110.1 MB
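The arithmetic in this example can be reproduced with a few lines of Java. This is a rough sketch that uses the per-record sizes quoted on the slides (14B, 33B, 41B) and assumes binary units (1 MB = 1024 x 1024 bytes), which is what the slide's figures are consistent with. The class and method names are hypothetical, and the estimate ignores indexes, logs and the string and array stores.

```java
import java.util.Locale;

/** Back-of-the-envelope store size estimate from the per-record sizes on the slides. */
class DiskSpaceEstimate {
    static final long NODE_BYTES = 14;
    static final long RELATIONSHIP_BYTES = 33;
    static final long PROPERTY_BYTES = 41;

    /** Total bytes taken by node, relationship and property records. */
    static long storeBytes(long nodes, long relationships, long properties) {
        return nodes * NODE_BYTES
                + relationships * RELATIONSHIP_BYTES
                + properties * PROPERTY_BYTES;
    }

    /** Converts bytes to binary megabytes (assumption: 1 MB = 1024 * 1024 bytes). */
    static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // The example from the slide: 1,000 nodes, 1,000,000 relationships, 2,010,000 properties.
        long bytes = storeBytes(1_000, 1_000_000, 2_010_000);
        System.out.println(String.format(Locale.ROOT, "%d bytes = %.1f MB", bytes, toMegabytes(bytes)));
    }
}
```

For the slide's dataset this comes to 115,424,000 bytes, which is about 110.1 MB and matches the total above.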
How about low level cache? Any guesses?
Low Level Cache
Same as disk space
Low Level Cache
High Level Cache
node 344B
relationship 208B
property 116B
...
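To get a feel for what these object sizes mean, the same kind of estimate can be made for the heap. The sketch below is an illustration under loud assumptions: it covers only the three sizes listed here (the list on the slide is elided), it assumes every node, relationship and property is held in the high level cache at once, and the dataset in main is the 1,000-node example used for disk space. The class and method names are hypothetical.

```java
/** Rough heap estimate for the high level (object) cache, using the sizes on the slide. */
class ObjectCacheEstimate {
    static final long NODE_BYTES = 344;
    static final long RELATIONSHIP_BYTES = 208;
    static final long PROPERTY_BYTES = 116;

    /** Bytes of heap needed if everything is cached as objects (an assumption). */
    static long heapBytes(long nodes, long relationships, long properties) {
        return nodes * NODE_BYTES
                + relationships * RELATIONSHIP_BYTES
                + properties * PROPERTY_BYTES;
    }

    public static void main(String[] args) {
        long bytes = heapBytes(1_000, 1_000_000, 2_010_000);
        System.out.println(bytes + " bytes, roughly " + bytes / (1024 * 1024) + " MB");
    }
}
```

Under these assumptions the example dataset needs 441,504,000 bytes, roughly 421 MB, or about four times its size on disk.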
Other interesting info?
Cypher is great!
Cypher is improving
But don’t be afraid of writing some Java
Java API vs. Cypher
Experiment
Measure
Analyse
Ask
Conclusion
www.graphaware.com @graph_aware
Thanks!
Next meetup
• The transport graph
  – Roads, Nodes and Automobiles (Jacqui Read)
  – Transport Network Route Finding Using A Graph (Ian Cartwright & Ben Earlham)
• 26th February 2014
• Here!
Graph Databases, by Ian Robinson, Jim Webber & Emil Eifrem (compliments of Neo Technology)
Take me to the pub…