TinkerPop: a story of graphs, DBs, and graph DBs Joshua Shinavier and James Thornton Texas Linux Festival June 13th, 2014

DESCRIPTION

intro to TinkerPop and the Aurelius Graph Cluster for the Graph DB Workshop, Texas Linux Festival 2014


Page 1: TinkerPop: a story of graphs, DBs, and graph DBs

TinkerPop: a story of graphs, DBs, and graph DBs

Joshua Shinavier and James Thornton

Texas Linux Festival, June 13th, 2014

Page 2: TinkerPop: a story of graphs, DBs, and graph DBs

Once, there was a thing

Page 3: TinkerPop: a story of graphs, DBs, and graph DBs

v(1)

Let’s call it a vertex

Page 4: TinkerPop: a story of graphs, DBs, and graph DBs

The vertex had some metadata

v(1)

name: “Graph DB workshop”

Page 5: TinkerPop: a story of graphs, DBs, and graph DBs

We’ll call that a property

v(1)

name: “Graph DB workshop”

You are here.

Page 6: TinkerPop: a story of graphs, DBs, and graph DBs

In fact, the vertex had multiple properties

v(1)

name: “Graph DB workshop”
type: “Event”

Page 7: TinkerPop: a story of graphs, DBs, and graph DBs

The properties were of various types

v(1)

name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
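As a rough illustration in plain Python (not TinkerPop API), the vertex on this slide can be pictured as an id plus a key/value map whose values have different types. The sketch assumes that `starts` and `ends` are Unix timestamps in milliseconds, which the slide does not state; the dictionary layout and the `to_utc` helper are made up for this example.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for v(1): an id plus a map of typed properties.
vertex = {
    "id": 1,
    "properties": {
        "name": "Graph DB workshop",  # string
        "type": "Event",              # string
        "starts": 1402682400000,      # integer (assumed: epoch milliseconds)
        "ends": 1402696800000,        # integer (assumed: epoch milliseconds)
    },
}


def to_utc(millis):
    """Interpret an epoch-milliseconds value as a UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


props = vertex["properties"]
starts_at = to_utc(props["starts"])
duration_hours = (props["ends"] - props["starts"]) / 3_600_000

print(starts_at.isoformat(), duration_hours)
```

Under that assumption the event starts at 18:00 UTC on June 13th, 2014 and lasts four hours.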

Page 8: TinkerPop: a story of graphs, DBs, and graph DBs

Our vertex was not alone

v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000

v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000

Page 9: TinkerPop: a story of graphs, DBs, and graph DBs

Thus, an edge

[Figure: v(1) and v(2), with the same properties as on the previous slide, now joined by an edge]

Page 10: TinkerPop: a story of graphs, DBs, and graph DBs

The edge was directed…

[Figure: v(1) and v(2) as on the previous slide; the edge between them now has a direction]

Page 11: TinkerPop: a story of graphs, DBs, and graph DBs

…and labeled

[Figure: v(1) and v(2) as on the previous slide; the directed edge between them is now labeled partOf]

Page 12: TinkerPop: a story of graphs, DBs, and graph DBs

The label types the relationship

[Figure: v(1) and v(2) joined by the partOf edge, as on the previous slide]

You are here, too.

Page 13: TinkerPop: a story of graphs, DBs, and graph DBs

More vertices joined the fun…

[Figure: v(1), v(2), and their partOf edge as on the previous slide, plus two new vertices and two more partOf edges]

v(3)
name: “Chef Workshop”
type: “Event”
starts: 1402664400000
ends: 1402696800000

v(4)
name: “Canonical Charm School”
type: “Event”
starts: 1402664400000
ends: 1402696800000

Page 14: TinkerPop: a story of graphs, DBs, and graph DBs

More labels, too

[Figure: the graph from the previous slide, plus two Software vertices and two hasTopic edges]

v(7)
name: “TinkerPop suite”
type: “Software”

v(8)
name: “Aurelius Graph Cluster”
type: “Software”

Page 15: TinkerPop: a story of graphs, DBs, and graph DBs

Now it was a labeled multigraph

[Figure: the graph from the previous slide, plus two Person vertices and two presentedBy edges]

v(5)
name: “James Thornton”
type: “Person”
githubId: “espeed”

v(6)
name: “Joshua Shinavier”
type: “Person”
githubId: “joshsh”

Page 16: TinkerPop: a story of graphs, DBs, and graph DBs

A few more edges

[Figure: the graph from the previous slide, with three new edges labeled contributesTo]

Page 17: TinkerPop: a story of graphs, DBs, and graph DBs

Some edges also had properties

[Figure: the complete graph, with eight vertices]

v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000

v(2)
name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000

v(3)
name: “Chef Workshop”
type: “Event”
starts: 1402664400000
ends: 1402696800000

v(4)
name: “Canonical Charm School”
type: “Event”
starts: 1402664400000
ends: 1402696800000

v(5)
name: “James Thornton”
type: “Person”
githubId: “espeed”

v(6)
name: “Joshua Shinavier”
type: “Person”
githubId: “joshsh”

v(7)
name: “TinkerPop suite”
type: “Software”

v(8)
name: “Aurelius Graph Cluster”
type: “Software”

[Edges, by label: partOf (three), presentedBy (two), hasTopic (two), contributesTo (three)]

[Edge properties shown: weight: 0.2, weight: 0.8, weight: 1.0]

Page 18: TinkerPop: a story of graphs, DBs, and graph DBs

We call this a Property Graph

[Figure: the same graph as on the previous slide]
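The model built up over the preceding slides can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the Blueprints API: the class and method names are invented here, and because the slide text does not preserve which edge connects which vertices, the edge endpoints and the pairing of `weight: 1.0` with a particular edge are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    out_v: int   # id of the vertex the edge leaves
    label: str   # the label types the relationship
    in_v: int    # id of the vertex the edge points at
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy directed, labeled multigraph with properties on vertices and edges."""

    def __init__(self):
        self.vertices = {}
        self.edges = []  # a list, so parallel edges are allowed

    def add_vertex(self, vertex_id, **properties):
        self.vertices[vertex_id] = Vertex(vertex_id, properties)
        return self.vertices[vertex_id]

    def add_edge(self, out_v, label, in_v, **properties):
        edge = Edge(out_v, label, in_v, properties)
        self.edges.append(edge)
        return edge

    def out(self, vertex_id, label):
        """Vertices reached by following outgoing edges with the given label."""
        return [self.vertices[e.in_v] for e in self.edges
                if e.out_v == vertex_id and e.label == label]


g = PropertyGraph()
g.add_vertex(1, name="Graph DB workshop", type="Event")
g.add_vertex(2, name="Texas Linux Fest", type="Event")
g.add_vertex(6, name="Joshua Shinavier", type="Person", githubId="joshsh")
g.add_vertex(7, name="TinkerPop suite", type="Software")
g.add_edge(1, "partOf", 2)                     # assumed direction
g.add_edge(6, "contributesTo", 7, weight=1.0)  # assumed endpoints and weight

parents = [v.properties["name"] for v in g.out(1, "partOf")]
print(parents)
```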

Page 19: TinkerPop: a story of graphs, DBs, and graph DBs

Many graph DB data models are variations on this theme

[Figure: the same graph as on the previous slide]

Page 20: TinkerPop: a story of graphs, DBs, and graph DBs

Neo4j

Page 21: TinkerPop: a story of graphs, DBs, and graph DBs

OrientDB

Page 22: TinkerPop: a story of graphs, DBs, and graph DBs

Sparksee*

* the graph database previously known as DEX

Page 23: TinkerPop: a story of graphs, DBs, and graph DBs

etc.

Page 24: TinkerPop: a story of graphs, DBs, and graph DBs

Enter Blueprints

• single Property Graph API supported by diverse graph database backends

• choose your favorite, but avoid vendor lock-in

• Blueprints : graph DB :: JDBC : RDBMS

• implementations, “ouplementations”, test suites, and helper utilities are built on top
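The JDBC analogy can be illustrated with a small sketch: application code is written once against an abstract graph interface, and the storage behind it can be swapped. Everything named here (`Graph`, `DictGraph`, `EdgeListGraph`, `build_and_query`) is hypothetical and written in plain Python; the real Blueprints API is a Java library and is considerably richer.

```python
from abc import ABC, abstractmethod


class Graph(ABC):
    """Hypothetical minimal graph API shared by every backend."""

    @abstractmethod
    def add_vertex(self, vertex_id, **properties): ...

    @abstractmethod
    def add_edge(self, out_id, label, in_id): ...

    @abstractmethod
    def out(self, vertex_id, label): ...


class DictGraph(Graph):
    """Backend 1: adjacency lists keyed by (vertex, label)."""

    def __init__(self):
        self._props = {}
        self._adj = {}

    def add_vertex(self, vertex_id, **properties):
        self._props[vertex_id] = properties

    def add_edge(self, out_id, label, in_id):
        self._adj.setdefault((out_id, label), []).append(in_id)

    def out(self, vertex_id, label):
        return list(self._adj.get((vertex_id, label), []))


class EdgeListGraph(Graph):
    """Backend 2: one flat list of (out, label, in) triples."""

    def __init__(self):
        self._props = {}
        self._edges = []

    def add_vertex(self, vertex_id, **properties):
        self._props[vertex_id] = properties

    def add_edge(self, out_id, label, in_id):
        self._edges.append((out_id, label, in_id))

    def out(self, vertex_id, label):
        return [i for (o, l, i) in self._edges if o == vertex_id and l == label]


def build_and_query(graph):
    """Application code: depends only on the abstract Graph interface."""
    graph.add_vertex(1, name="Graph DB workshop")
    graph.add_vertex(2, name="Texas Linux Fest")
    graph.add_edge(1, "partOf", 2)
    return graph.out(1, "partOf")


results = [build_and_query(backend) for backend in (DictGraph(), EdgeListGraph())]
print(results)
```

The same `build_and_query` function runs unchanged against both backends, which is the sense in which a common API avoids lock-in.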

Page 25: TinkerPop: a story of graphs, DBs, and graph DBs

Blueprints implementations

Page 26: TinkerPop: a story of graphs, DBs, and graph DBs

Now we need a query language…

• build it on the Blueprints API

• query over any Blueprints-compatible DB

• make it path-like, with side-effects

• match abstract traversals through the graph, filtering, ranking, and mutating as you go

• make it interactive. How about a REPL?

Page 27: TinkerPop: a story of graphs, DBs, and graph DBs

Enter Gremlin

• a domain-specific language for traversing graphs

• Turing-complete, permits access to the full JDK

• has been adapted to various JVM languages

• Gremlin : graph DB :: SQL : RDBMS… sort of

Page 28: TinkerPop: a story of graphs, DBs, and graph DBs

Think “pipes and filters”
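One way to picture "pipes and filters" is as a chain of lazy stages, each consuming the stream produced by the one before it. The sketch below uses Python generators on a toy copy of the example graph; it is not Gremlin or the Pipes library, the function names are invented, and the assumption that v(3) and v(4) are partOf v(2) is an inference from the slides.

```python
# Toy data: vertex properties and (out, label, in) edges. Endpoints are assumed.
VERTICES = {
    1: {"name": "Graph DB workshop", "type": "Event"},
    2: {"name": "Texas Linux Fest", "type": "Event"},
    3: {"name": "Chef Workshop", "type": "Event"},
    4: {"name": "Canonical Charm School", "type": "Event"},
}
EDGES = [(1, "partOf", 2), (3, "partOf", 2), (4, "partOf", 2)]


def in_vertices(ids, label):
    """Pipe: for each incoming id, emit the tails of matching edges pointing at it."""
    for vid in ids:
        for out_id, edge_label, in_id in EDGES:
            if in_id == vid and edge_label == label:
                yield out_id


def where(ids, predicate):
    """Filter: pass through only ids whose properties satisfy the predicate."""
    for vid in ids:
        if predicate(VERTICES[vid]):
            yield vid


def values(ids, key):
    """Pipe: replace each vertex id with one of its property values."""
    for vid in ids:
        yield VERTICES[vid][key]


# Start at the festival, walk partOf edges backwards, keep workshops, emit names.
pipeline = values(
    where(in_vertices(iter([2]), "partOf"),
          lambda p: p["name"].lower().endswith("workshop")),
    "name",
)
names = list(pipeline)
print(names)
```

Nothing is computed until the final `list(...)` pulls items through the chain, one element at a time.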

Page 29: TinkerPop: a story of graphs, DBs, and graph DBs

The rest of the TinkerPop family

• Pipes: dataflow framework. The basis of Gremlin

• Frames: Java bean framework for graphs

• Furnace: Property Graph algorithms

• Rexster: high-performance graph database server

Page 30: TinkerPop: a story of graphs, DBs, and graph DBs

TinkerPop is…

• a developer group creating an open-source graph DB stack

• a community of users and third-party implementors

• a foundation for building high-performance graph applications of any size

• model some data on your laptop

• build massive clustered applications

• open source, BSD licensed

Page 31: TinkerPop: a story of graphs, DBs, and graph DBs

A detailed guide to the rest of this workshop

• intro to the Aurelius Graph Cluster

• demos of graph tools and concepts

• guided installation of tools

• preview of TinkerPop3

Page 32: TinkerPop: a story of graphs, DBs, and graph DBs

Thanks!

Page 33: TinkerPop: a story of graphs, DBs, and graph DBs

The Aurelius Graph Cluster

Page 34: TinkerPop: a story of graphs, DBs, and graph DBs

In TinkerPop…

• we adapt various graph DBs to a unified API

• they become Property Graph databases

Page 35: TinkerPop: a story of graphs, DBs, and graph DBs

With AGC…

• we adapt various high-performance databases to the Titan API

• they become graph databases

Page 36: TinkerPop: a story of graphs, DBs, and graph DBs

Take your pick of CAP

Page 37: TinkerPop: a story of graphs, DBs, and graph DBs

Titan highlights

• graphs, transactions scale with the number of machines in a cluster

• geo, numeric range, and full text search for vertices and edges

• support for either of two indexing backends

• ElasticSearch, Lucene

• native support for Blueprints, Rexster

Page 38: TinkerPop: a story of graphs, DBs, and graph DBs

Dealing with supernodes

• Titan’s vertex-centric indices permit ordered querying from a vertex

• e.g. retrieve “knows” edges… in order of “since” timestamp

• iterates efficiently, even if there are thousands of edges
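The general idea behind a vertex-centric index can be sketched as follows: for each vertex and edge label, keep the edges sorted by one property, so that an ordered or range query touches only the relevant slice instead of scanning every edge. This is a toy model in plain Python with invented names and sample data; the slides do not describe how Titan actually stores or maintains these indices.

```python
import bisect
from collections import defaultdict


class VertexCentricIndex:
    """Toy index: per (vertex, label), edges are kept sorted by one property."""

    def __init__(self, sort_key):
        self.sort_key = sort_key
        self._keys = defaultdict(list)     # (vertex, label) -> sorted key values
        self._targets = defaultdict(list)  # parallel list of neighbour ids

    def add_edge(self, out_id, label, in_id, **props):
        key = props[self.sort_key]
        slot = (out_id, label)
        pos = bisect.bisect_right(self._keys[slot], key)
        self._keys[slot].insert(pos, key)
        self._targets[slot].insert(pos, in_id)

    def ordered(self, out_id, label):
        """All neighbours over this label, in order of the sort key."""
        return list(self._targets.get((out_id, label), []))

    def between(self, out_id, label, lo, hi):
        """Neighbours whose sort key lies in [lo, hi), found by binary search."""
        slot = (out_id, label)
        keys = self._keys.get(slot, [])
        start = bisect.bisect_left(keys, lo)
        end = bisect.bisect_left(keys, hi)
        return self._targets.get(slot, [])[start:end]


# Hypothetical sample data: "knows" edges from vertex 1 with a "since" year.
index = VertexCentricIndex(sort_key="since")
index.add_edge(1, "knows", 10, since=2009)
index.add_edge(1, "knows", 11, since=2012)
index.add_edge(1, "knows", 12, since=2010)

in_order = index.ordered(1, "knows")
recent = index.between(1, "knows", 2010, 2013)
print(in_order, recent)
```

With the edges pre-sorted, the range lookup costs a binary search plus the size of the answer, which is what keeps a vertex with very many edges manageable.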

Page 39: TinkerPop: a story of graphs, DBs, and graph DBs

What about Faunus?

Page 40: TinkerPop: a story of graphs, DBs, and graph DBs

Faunus…

• is a Hadoop-based graph analytics engine

• in Titan 0.5 will simply be called Titan/Hadoop

• adds support for global distributed graph operations

• applies (a subset of) Gremlin in a breadth-first fashion
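Breadth-first evaluation means that every traverser takes a given step before any of them takes the next one, which suits batch processing over a whole graph. A minimal sketch, in plain Python rather than Faunus or Gremlin: the frontier is a counter of how many traversers sit on each vertex, and each step rewrites the entire frontier in one pass over the edges. The function names and the partOf endpoints for v(3) and v(4) are assumptions.

```python
from collections import Counter

# (out, label, in) triples from the example graph; endpoints are assumed.
EDGES = [(1, "partOf", 2), (3, "partOf", 2), (4, "partOf", 2)]


def step_out(frontier, label):
    """Advance every traverser along outgoing edges with this label, all at once."""
    advanced = Counter()
    for out_id, edge_label, in_id in EDGES:
        if edge_label == label and out_id in frontier:
            advanced[in_id] += frontier[out_id]
    return advanced


def step_in(frontier, label):
    """Advance every traverser backwards along incoming edges with this label."""
    advanced = Counter()
    for out_id, edge_label, in_id in EDGES:
        if edge_label == label and in_id in frontier:
            advanced[out_id] += frontier[in_id]
    return advanced


# One traverser starts on each sub-event.
frontier = Counter({1: 1, 3: 1, 4: 1})
after_out = step_out(frontier, "partOf")   # all three arrive at the festival
after_in = step_in(after_out, "partOf")    # each fans back out to every sub-event
print(after_out, after_in)
```

The counts track how many paths end at each vertex without materialising the paths themselves; each step is a single scan of the edge data.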

Page 41: TinkerPop: a story of graphs, DBs, and graph DBs

Faunus inputs and outputs

• Hadoop SequenceFile format (in/out)

• Titan graph DB (in/out)

• GraphSON format (in/out)

• Rexster (in)

• RDF (in)

• Gremlin scripts (in/out)

Page 42: TinkerPop: a story of graphs, DBs, and graph DBs

Demo time

Page 43: TinkerPop: a story of graphs, DBs, and graph DBs

TinkerPop3

Page 44: TinkerPop: a story of graphs, DBs, and graph DBs

What’s new in TP3

• new Gremlin implementation which makes good use of Java 8 closures, enables introspection and optimization of traversals

• new OLAP API with support for message passing systems like Giraph, Hama, Faunus, etc.

• revamped I/O utilities with support for GraphSON, GraphML, and GremlinKryo

• new server model, incl. remote execution of scripts via WebSocket API, server plugin support, customizable serialization formats
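For readers unfamiliar with message-passing graph computation, the style can be sketched as a loop of supersteps: in each one, every vertex reads the messages sent to it in the previous superstep, updates its own state, and sends messages to its neighbours. The example below labels connected components by propagating the smallest vertex id. It is a toy written in plain Python with an invented function name and sample data, not the TinkerPop3 OLAP API or the API of any of the systems listed above.

```python
def component_labels(vertices, edges, max_steps=10):
    """Label each vertex with the smallest id reachable from it (undirected)."""
    neighbours = {v: set() for v in vertices}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    state = {v: v for v in vertices}
    # Superstep 0: every vertex announces its own id to its neighbours.
    inbox = {v: [state[u] for u in neighbours[v]] for v in vertices}

    for _ in range(max_steps):
        outbox = {v: [] for v in vertices}
        changed = False
        for v in vertices:
            best = min(inbox[v], default=state[v])
            if best < state[v]:
                # Adopt the smaller label and tell the neighbours about it.
                state[v] = best
                changed = True
                for u in neighbours[v]:
                    outbox[u].append(best)
        if not changed:
            break  # no vertex changed its state, so the computation has converged
        inbox = outbox
    return state


# Hypothetical sample graph with two components: {1, 2, 3} and {4, 5}.
labels = component_labels([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)])
print(labels)
```

Each vertex only ever looks at its own state and its inbox, which is what makes this style straightforward to distribute across machines.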

Page 45: TinkerPop: a story of graphs, DBs, and graph DBs

Gremlitron

• Blueprints, Pipes, and Gremlin are all integrated in TinkerPop3

• Frames obsoleted by Gremlin DSLs

• Furnace is Gremlin OLAP

• Rexster is Gremlin Server

Page 46: TinkerPop: a story of graphs, DBs, and graph DBs

Try it out

• at:

• https://github.com/tinkerpop/tinkerpop3

• mailing list:

• https://groups.google.com/forum/gremlin-users

• we welcome your feedback and/or PRs