An Introduction to Doctor Who (and Neo4j) @iansrobinson #neo4j

Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson


DESCRIPTION

2011-11-01 | 10:40 AM - 11:40 AM

Doctor Who is the world’s longest running science-fiction TV series. Battling Daleks, Cybermen and Sontarans, and always accompanied by his trusted human companions, the last Time Lord has saved Earth from destruction more times than you’ve cursed Maven. Neo4j is the world’s leading open source graph database. Designed to interrogate densely connected data with lightning speed, it lets you traverse millions of nodes in a fraction of the time it takes to run a multi-join SQL query. When these two meet, the result is an entertaining introduction to the complex history of a complex hero, and a rapid survey of the elegant APIs of a delightfully simple graph database. Armed only with a data store packed full of geeky Doctor Who facts, by the end of this session we’ll have you answering questions of the Doctor Who universe like a die-hard fan.


Page 1: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

An Introduction to Doctor Who (and Neo4j)

 @iansrobinson  

#neo4j  

Page 2: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Neo4j  is  a  Graph  Database  

Page 3: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Property  Graph  

Page 4: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Property  Graph  
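As a rough illustration of the property graph idea: nodes and relationships both carry key-value properties, and every relationship is directed and has a type. The sketch below is a plain in-memory model, not Neo4j's API; the class and method names (`PropertyGraphSketch`, `relateTo`, `namesOf`, `buildDoctor`) are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a property graph; not Neo4j's API.
final class PropertyGraphSketch {

    // A node is a bag of key-value properties plus its outgoing relationships.
    static final class Node {
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();

        Node set(String key, Object value) {
            properties.put(key, value);
            return this;
        }

        // Relationships are directed and typed.
        Relationship relateTo(Node end, String type) {
            Relationship relationship = new Relationship(this, end, type);
            outgoing.add(relationship);
            return relationship;
        }
    }

    // A relationship connects two nodes and can carry properties of its own.
    static final class Relationship {
        final Node start;
        final Node end;
        final String type;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Relationship(Node start, Node end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Names of the nodes reached from 'start' over relationships of one type.
    static List<String> namesOf(Node start, String type) {
        List<String> names = new ArrayList<>();
        for (Relationship relationship : start.outgoing) {
            if (relationship.type.equals(type)) {
                names.add(String.valueOf(relationship.end.properties.get("name")));
            }
        }
        return names;
    }

    // The Doctor and two enemies, as in the slides' example data.
    static Node buildDoctor() {
        Node doctor = new Node().set("name", "The Doctor");
        Node daleks = new Node().set("name", "Daleks");
        Node cybermen = new Node().set("name", "Cybermen");
        doctor.relateTo(daleks, "enemy");
        doctor.relateTo(cybermen, "enemy");
        return doctor;
    }

    public static void main(String[] args) {
        System.out.println(namesOf(buildDoctor(), "enemy")); // prints [Daleks, Cybermen]
    }
}
```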

Page 5: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Neo4j  

Page 6: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Page 7: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

32 billion nodes
32 billion relationships
64 billion properties

Page 8: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Page 9: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

Community  

Advanced  

Enterprise  

Page 10: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

[Figure: sample Doctor Who graph. Relationship labels shown: stole from, loves, enemy, appeared in, companion. Episode nodes shown: A Good Man Goes to War, Victory of the Daleks.]

Page 11: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

GraphDatabaseService db = new EmbeddedGraphDatabase("/data/drwho");

Page 12: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

GraphDatabaseService db = new EmbeddedGraphDatabase("/data/drwho");

Node theDoctor = db.createNode();
theDoctor.setProperty("name", "The Doctor");

Node daleks = db.createNode();
daleks.setProperty("name", "Daleks");
Node cybermen = db.createNode();
cybermen.setProperty("name", "Cybermen");

Page 13: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

GraphDatabaseService db = new EmbeddedGraphDatabase("/data/drwho");

Node theDoctor = db.createNode();
theDoctor.setProperty("name", "The Doctor");

Node daleks = db.createNode();
daleks.setProperty("name", "Daleks");
Node cybermen = db.createNode();
cybermen.setProperty("name", "Cybermen");

theDoctor.createRelationshipTo(daleks, DynamicRelationshipType.withName("enemy"));
theDoctor.createRelationshipTo(cybermen, DynamicRelationshipType.withName("enemy"));

[Figure: The Doctor linked to Daleks and to Cybermen by two "enemy" relationships]

Page 14: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

GraphDatabaseService db = new EmbeddedGraphDatabase("/data/drwho");
Transaction tx = db.beginTx();
try {
    // Create the nodes and set their properties
    Node theDoctor = db.createNode();
    theDoctor.setProperty("name", "The Doctor");

    Node daleks = db.createNode();
    daleks.setProperty("name", "Daleks");
    Node cybermen = db.createNode();
    cybermen.setProperty("name", "Cybermen");

    // Connect them with typed relationships
    theDoctor.createRelationshipTo(daleks, DynamicRelationshipType.withName("enemy"));
    theDoctor.createRelationshipTo(cybermen, DynamicRelationshipType.withName("enemy"));

    // Mark the transaction as successful
    tx.success();
} finally {
    tx.finish();
}
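The try/finally shape above matters because, in the Neo4j 1.x embedded API, `finish()` commits only if `success()` was called first and rolls back otherwise. The toy model below mimics that behaviour in plain Java so it can be run without a database; `TxSketch`, `Store`, `Tx` and `run` are hypothetical names, not Neo4j classes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for the success()/finish() idiom; all names are hypothetical.
final class TxSketch {

    static final class Store {
        final List<String> committed = new ArrayList<>();

        Tx beginTx() {
            return new Tx(this);
        }
    }

    static final class Tx {
        private final Store store;
        private final List<String> pending = new ArrayList<>();
        private boolean successful = false;

        Tx(Store store) {
            this.store = store;
        }

        void write(String value) {
            pending.add(value);
        }

        // Marks the transaction as OK to commit; nothing is committed yet.
        void success() {
            successful = true;
        }

        // Commits if success() was called, otherwise discards pending writes.
        void finish() {
            if (successful) {
                store.committed.addAll(pending);
            }
            pending.clear();
        }
    }

    // Runs one unit of work; 'fail' simulates an exception before success().
    static List<String> run(boolean fail) {
        Store store = new Store();
        Tx tx = store.beginTx();
        try {
            tx.write("The Doctor");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            tx.success();
        } catch (IllegalStateException e) {
            // Swallowed for the demo; finish() below discards the write.
        } finally {
            tx.finish();
        }
        return store.committed;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [The Doctor]
        System.out.println(run(true));  // prints []
    }
}
```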

[Figure: The Doctor linked to Daleks and to Cybermen by two "enemy" relationships]

Page 15: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

How  do  I  query  the  data?  

http://opfm.jpl.nasa.gov/

http://news.xinhuanet.com

Page 16: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Dalek Props
http://www.dalek6388.co.uk/

Page 17: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

[Figure: graph around the episode title:Power of the Daleks. Nodes: species:Dalek, props:Daleks, name:Dalek Six-5, name:Dalek 7, name:Dalek 2, name:Dalek 1, type:shoulders, type:skirt. Relationships: APPEARED_IN, USED_IN, MEMBER_OF.]

Page 18: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

[Figure: ORIGINAL_PROP relationships between props. Nodes: name:Dalek Six-5, name:Dalek 6, name:Dalek 5, name:Dalek 1, name:Dalek 2, name:Dalek 7.]

Page 19: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

[Figure: props across episodes. Episodes: title:Power of the Daleks, title:The Daleks, title:The Dalek Invasion of Earth. Props: name:Dalek Two-1, name:Dalek One-5, name:Dalek Six-7, name:Dalek Six-5, name:Dalek 1, name:Dalek 2, name:Dalek 7.]

Page 20: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Supply  Chain  Traceability  

Page 21: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Traversal  Framework  

• Visits (and returns) nodes based on traversal description

• Powerful; can customize:
  – Relationships followed
  – Branch selection policy
  – Node/relationship uniqueness constraints
  – Path evaluators
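To make the customization points above concrete, here is a minimal sketch of a depth-first traversal that follows only chosen relationship types and asks an evaluator whether to include or prune each path. It is a plain in-memory stand-in, not the Neo4j traversal framework: `TraversalSketch`, `Hop`, `connect` and `sample` are hypothetical, hops are stored already oriented in the direction of travel, and no uniqueness check is applied, so it assumes the walk cannot loop forever.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical in-memory stand-in for a traversal description; not Neo4j's API.
final class TraversalSketch {

    enum Evaluation { INCLUDE_AND_PRUNE, EXCLUDE_AND_CONTINUE }

    // One step of a path, already oriented in the direction of travel.
    static final class Hop {
        final String type;
        final String target;

        Hop(String type, String target) {
            this.type = type;
            this.target = target;
        }
    }

    private final Map<String, List<Hop>> hops = new HashMap<>();

    TraversalSketch connect(String from, String type, String to) {
        hops.computeIfAbsent(from, k -> new ArrayList<>()).add(new Hop(type, to));
        return this;
    }

    // Depth-first walk from 'start' that follows only the given relationship
    // types and returns the node sequence of every path the evaluator includes.
    List<List<String>> traverse(String start, Set<String> followed,
                                Function<List<Hop>, Evaluation> evaluator) {
        List<List<String>> included = new ArrayList<>();
        List<String> nodes = new ArrayList<>(Collections.singletonList(start));
        walk(nodes, new ArrayList<>(), followed, evaluator, included);
        return included;
    }

    private void walk(List<String> nodes, List<Hop> path, Set<String> followed,
                      Function<List<Hop>, Evaluation> evaluator,
                      List<List<String>> included) {
        if (evaluator.apply(path) == Evaluation.INCLUDE_AND_PRUNE) {
            included.add(new ArrayList<>(nodes)); // keep this path
            return;                               // and stop expanding it
        }
        String current = nodes.get(nodes.size() - 1);
        for (Hop hop : hops.getOrDefault(current, Collections.emptyList())) {
            if (!followed.contains(hop.type)) {
                continue; // relationship type not part of the description
            }
            nodes.add(hop.target);
            path.add(hop);
            walk(nodes, path, followed, evaluator, included);
            path.remove(path.size() - 1);
            nodes.remove(nodes.size() - 1);
        }
    }

    // Builds a tiny illustrative graph and returns the included paths.
    static List<List<String>> sample() {
        TraversalSketch graph = new TraversalSketch()
            .connect("species:Dalek", "ENEMY_OF", "name:The Doctor") // not followed
            .connect("species:Dalek", "APPEARED_IN", "title:Power of the Daleks")
            .connect("title:Power of the Daleks", "USED_IN", "props:Daleks")
            .connect("props:Daleks", "MEMBER_OF", "name:Dalek 7")
            .connect("name:Dalek 7", "COMPOSED_OF", "skirt (Dalek 7)")
            .connect("name:Dalek 7", "COMPOSED_OF", "shoulders (Dalek 7)")
            .connect("skirt (Dalek 7)", "ORIGINAL_PROP", "name:Dalek 7")
            .connect("shoulders (Dalek 7)", "ORIGINAL_PROP", "name:Dalek 7");
        Set<String> followed = new HashSet<>(Arrays.asList(
            "APPEARED_IN", "USED_IN", "MEMBER_OF", "COMPOSED_OF", "ORIGINAL_PROP"));
        // Include a path only when its last hop is ORIGINAL_PROP, then prune.
        return graph.traverse("species:Dalek", followed, path ->
            !path.isEmpty() && path.get(path.size() - 1).type.equals("ORIGINAL_PROP")
                ? Evaluation.INCLUDE_AND_PRUNE
                : Evaluation.EXCLUDE_AND_CONTINUE);
    }

    public static void main(String[] args) {
        for (List<String> path : sample()) {
            System.out.println(path);
        }
    }
}
```

Pruning as soon as an ORIGINAL_PROP hop is seen is what keeps the walk finite here even though the final hop lands on a node that is already on the path.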

Page 22: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

Node theDaleks = database.index().forNodes("species").get("name", "Dalek").getSingle();
Traverser traverser = Traversal.description()
    .depthFirst()
    .relationships(Rels.APPEARED_IN, Direction.OUTGOING)
    .relationships(Rels.USED_IN, Direction.INCOMING)
    .relationships(Rels.MEMBER_OF, Direction.INCOMING)
    .relationships(Rels.COMPOSED_OF, Direction.OUTGOING)
    .relationships(Rels.ORIGINAL_PROP, Direction.OUTGOING)
    .evaluator(new Evaluator() {
        @Override
        public Evaluation evaluate(Path path) {
            if (path.lastRelationship() != null
                    && path.lastRelationship().isType(Rels.ORIGINAL_PROP)) {
                return Evaluation.INCLUDE_AND_PRUNE;
            }
            return Evaluation.EXCLUDE_AND_CONTINUE;
        }
    })
    .uniqueness(Uniqueness.NONE)
    .traverse(theDaleks);

Description

Page 23: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

Node theDaleks = database.index().forNodes("species").get("name", "Dalek").getSingle();

Index Lookup

Page 24: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

Traverser traverser = Traversal.description()
    .depthFirst()

Branching

Page 25: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

    .relationships(Rels.APPEARED_IN, Direction.OUTGOING)
    .relationships(Rels.USED_IN, Direction.INCOMING)
    .relationships(Rels.MEMBER_OF, Direction.INCOMING)
    .relationships(Rels.COMPOSED_OF, Direction.OUTGOING)
    .relationships(Rels.ORIGINAL_PROP, Direction.OUTGOING)

Relationships

Page 26: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

    .evaluator(new Evaluator() {
        @Override
        public Evaluation evaluate(Path path) {
            if (path.lastRelationship() != null
                    && path.lastRelationship().isType(Rels.ORIGINAL_PROP)) {
                return Evaluation.INCLUDE_AND_PRUNE;
            }
            return Evaluation.EXCLUDE_AND_CONTINUE;
        }
    })

Evaluator

Page 27: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

    .uniqueness(Uniqueness.NONE)

Uniqueness

Page 28: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

    .traverse(theDaleks);

Traverse

Page 29: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

Iterable<Path>, each with path.startNode() and path.endNode()

species:Dalek -[APPEARED_IN]-> title:Power of the Daleks <-[USED_IN]- props:Daleks <-[MEMBER_OF]- name:Dalek 7 -[COMPOSED_OF]-> type:skirt -[ORIGINAL_PROP]-> name:Dalek 7

species:Dalek -[APPEARED_IN]-> title:Power of the Daleks <-[USED_IN]- props:Daleks <-[MEMBER_OF]- name:Dalek 7 -[COMPOSED_OF]-> type:shoulder -[ORIGINAL_PROP]-> name:Dalek 7

species:Dalek -[APPEARED_IN]-> title:Power of the Daleks <-[USED_IN]- props:Daleks <-[MEMBER_OF]- name:Dalek Six-5 -[COMPOSED_OF]-> type:skirt -[ORIGINAL_PROP]-> name:Dalek 5

Page 30: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Cypher

• Declarative graph pattern matching language
  – Humane regular expressions for graphs

• New in 1.4
  – Continuing, rapid feature development

• Supports queries
  – Including aggregation, ordering and limits
  – Mutating operations coming in 1.6

Page 31: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

start daleks=node:species(species='Dalek')
match daleks-[:APPEARED_IN]->episode<-[:USED_IN]-
      ()<-[:MEMBER_OF]-()-[:COMPOSED_OF]->
      part-[:ORIGINAL_PROP]->originalprop
return originalprop.name, part.type, count(episode.title)
order by count(episode.title) desc
limit 1

Cypher Query

Page 32: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

start daleks=node:species(species='Dalek')

Index Lookup

Page 33: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

match daleks-[:APPEARED_IN]->episode<-[:USED_IN]-
      ()<-[:MEMBER_OF]-()-[:COMPOSED_OF]->
      part-[:ORIGINAL_PROP]->originalprop

Match Nodes & Relationships

Page 34: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

match daleks-[:APPEARED_IN]->episode<-[:USED_IN]-
      ()<-[:MEMBER_OF]-()-[:COMPOSED_OF]->
      part-[:ORIGINAL_PROP]->originalprop

[Figure: the pattern drawn as a graph. Named nodes: daleks, episode, part, originalprop. Relationships: APPEARED_IN, USED_IN, MEMBER_OF, COMPOSED_OF, ORIGINAL_PROP.]

Page 35: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

return originalprop.name, part.type, count(episode.title)
order by count(episode.title) desc
limit 1

Return Values
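One way to read the return, order by and limit clauses: every pattern match yields a row, rows are grouped by the non-aggregated columns (originalprop.name and part.type), the episodes in each group are counted, and only the largest group is kept. The sketch below reproduces that grouping with Java streams over a handful of rows. The rows are invented for illustration and are not real prop history, and `HardestWorkingPartSketch`, `MATCHES` and `top` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration of group, count, order and limit; not Cypher itself.
final class HardestWorkingPartSketch {

    // One row per pattern match: {originalprop.name, part.type, episode.title}.
    // These rows are made up for the example.
    static final List<String[]> MATCHES = Arrays.asList(
        new String[] {"Dalek 1", "shoulders", "The Daleks"},
        new String[] {"Dalek 1", "shoulders", "The Dalek Invasion of Earth"},
        new String[] {"Dalek 1", "shoulders", "Power of the Daleks"},
        new String[] {"Dalek 7", "skirt", "Power of the Daleks"},
        new String[] {"Dalek 5", "skirt", "Power of the Daleks"});

    // Groups rows by (name, type), counts them, and keeps the largest group.
    static String top(List<String[]> matches) {
        Map<String, Long> counts = matches.stream()
            .collect(Collectors.groupingBy(
                row -> row[0] + " / " + row[1], // grouping key: name and part type
                Collectors.counting()));        // count(episode.title)
        return counts.entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
            .limit(1)                           // limit 1
            .map(entry -> entry.getKey() + " / " + entry.getValue())
            .findFirst()
            .orElse("no matches");
    }

    public static void main(String[] args) {
        System.out.println(top(MATCHES)); // prints Dalek 1 / shoulders / 3
    }
}
```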

Page 36: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

CypherParser parser = new CypherParser();
ExecutionEngine engine = new ExecutionEngine(universe.getDatabase());

String cql = "start daleks=node:species(species='Dalek')"
    + " match daleks-[:APPEARED_IN]->episode<-[:USED_IN]-()<-[:MEMBER_OF]-()"
    + "-[:COMPOSED_OF]->part-[:ORIGINAL_PROP]->originalprop"
    + " return originalprop.name, part.type, count(episode.title)"
    + " order by count(episode.title) desc limit 1";

Query query = parser.parse(cql);
ExecutionResult result = engine.execute(query);

String originalProp = result.javaColumnAs("originalprop.name").next().toString();
String part = result.javaColumnAs("part.type").next().toString();
int episodeCount = (Integer) result.javaColumnAs("count(episode.title)").next();

In Code

Page 37: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

CypherParser parser = new CypherParser();
ExecutionEngine engine = new ExecutionEngine(universe.getDatabase());

Setup

Page 38: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

Query query = parser.parse(cql);
ExecutionResult result = engine.execute(query);

Parse & Execute

Page 39: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

String originalProp = result.javaColumnAs("originalprop.name").next().toString();
String part = result.javaColumnAs("part.type").next().toString();
int episodeCount = (Integer) result.javaColumnAs("count(episode.title)").next();

Results

Page 40: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

In Webadmin

Page 41: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

The  Hardest  Working  Prop  Part  

http://www.dalek6388.co.uk/

Dalek  One’s  shoulders  

Page 42: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Download  

http://neo4j.org/download/
http://www.springsource.org/spring-data/neo4j

Page 43: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Tutorial  

https://github.com/jimwebber/neo4j-tutorial

Page 44: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

#neo4j  

Dematerialize…  

@iansrobinson  [email protected]  

 


Page 48: Big Data & Cloud | An Introduction to Neo4j (and Doctor Who) | Ian Robinson

match (daleks)-[:APPEARED_IN]->(episode)<-[:USED_IN]-
      ()<-[:MEMBER_OF]-()-[:COMPOSED_OF]->
      (part)-[:ORIGINAL_PROP]->(originalprop)

[Figure: the pattern drawn as a graph. Named nodes: daleks, episode, part, originalprop. Relationships: APPEARED_IN, USED_IN, MEMBER_OF, COMPOSED_OF, ORIGINAL_PROP.]