AURELIUS THINKAURELIUS.COM Matthias Broecheler @mbroecheler December 2nd, MMXIV Titan:db Graphs at Scale #TitanDB

Titan: Scaling Graphs and TinkerPop3

DESCRIPTION

Titan is a scalable graph database that can distribute and query graph data across multiple machines. This presentation gives a general introduction to graph computing and to Titan in particular. It also covers recent developments in Titan 0.9 and TinkerPop 3.

Page 1: Titan: Scaling Graphs and TinkerPop3

AURELIUS THINKAURELIUS.COM

Matthias Broecheler @mbroecheler December 2nd, MMXIV

Titan:db Graphs at Scale #TitanDB

Page 2: Titan: Scaling Graphs and TinkerPop3

software is eating the world

Page 3: Titan: Scaling Graphs and TinkerPop3

software is eating the world

data driven

Page 4: Titan: Scaling Graphs and TinkerPop3

Tabular Relational

Page 5: Titan: Scaling Graphs and TinkerPop3

Tabular Relational

NoSQL

Flexibility Handling

Page 6: Titan: Scaling Graphs and TinkerPop3

Tabular Relational

NoSQL

Flexibility Handling

Data Model Query Language

Page 7: Titan: Scaling Graphs and TinkerPop3

Tabular Relational

NoSQL

Graph Flexibility Handling

Data Model Query Language

Data Model Query Language

Simplicity Flexibility Handling

Page 8: Titan: Scaling Graphs and TinkerPop3

User Product

productid: 52235 name: cup price: 12.55

userid: matt email: matt@ password: 12345

Page 9: Titan: Scaling Graphs and TinkerPop3

User

userid   email    password
matt     matt@    12345
john     john@    qwerty
billy    billy@   abcde

Product

productid   name    price
52235       cup     12.55
42215       spoon   7.22
24529       knife   5.32

Page 10: Titan: Scaling Graphs and TinkerPop3

User Product

productid: 52235 name: cup price: 12.55

userid: matt email: matt@ password: 12345

buy time: 9/5/14

Page 11: Titan: Scaling Graphs and TinkerPop3

User

userid   email    password
matt     matt@    12345
john     john@    qwerty
billy    billy@   abcde

Product

productid   name    price
52235       cup     12.55
42215       spoon   7.22
24529       knife   5.32

Buy

userid   productid   time
matt     52235       9/5/14
billy    42215       8/7/14
billy    42215       8/7/14

Page 12: Titan: Scaling Graphs and TinkerPop3
Page 13: Titan: Scaling Graphs and TinkerPop3

g.V.has('email','matt@').out('buy').valueMap

What did Matt buy?

Page 14: Titan: Scaling Graphs and TinkerPop3

SELECT Product.*
FROM User
INNER JOIN Buy
  ON User.userid = Buy.userid
INNER JOIN Product
  ON Buy.productid = Product.productid
WHERE User.email = 'matt@'

What did Matt buy?
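To make the contrast between the join above and the traversal on Page 13 concrete, here is a minimal Python sketch over the rows from the User, Product and Buy slides. The function names and the in-memory dictionaries are illustrative assumptions, not Titan or SQL engine internals.

```python
# Rows as listed on the User, Product and Buy slides.
users = {"matt": "matt@", "john": "john@", "billy": "billy@"}   # userid -> email
products = {52235: "cup", 42215: "spoon", 24529: "knife"}       # productid -> name
buys = [("matt", 52235, "9/5/14"),
        ("billy", 42215, "8/7/14"),
        ("billy", 42215, "8/7/14")]                              # (userid, productid, time)

def bought_via_join(email):
    """Relational style: match keys across User, Buy and Product."""
    return [products[pid]
            for uid, user_email in users.items() if user_email == email
            for buyer, pid, _ in buys if buyer == uid]

# Graph style: every user keeps its outgoing 'buy' edges next to it.
out_buy = {}
for buyer, pid, when in buys:
    out_buy.setdefault(buyer, []).append((pid, when))

def bought_via_traversal(email):
    """Find the start vertex by email, then follow its 'buy' edges."""
    return [products[pid]
            for uid, user_email in users.items() if user_email == email
            for pid, _ in out_buy.get(uid, [])]

print(bought_via_join("matt@"), bought_via_traversal("matt@"))
```

Both functions return the same answer; the difference is that the traversal only touches the edges stored with the starting vertex instead of scanning the whole Buy table.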

Page 15: Titan: Scaling Graphs and TinkerPop3

SELECT TOP (5) [t14].[ProductName]
FROM (SELECT COUNT(*) AS [value], [t13].[ProductName]
      FROM [customers] AS [t0]
      CROSS APPLY (SELECT [t9].[ProductName]
                   FROM [orders] AS [t1]
                   CROSS JOIN [order details] AS [t2]
                   INNER JOIN [products] AS [t3] ON [t3].[ProductID] = [t2].[ProductID]
                   CROSS JOIN [order details] AS [t4]
                   INNER JOIN [orders] AS [t5] ON [t5].[OrderID] = [t4].[OrderID]
                   LEFT JOIN [customers] AS [t6] ON [t6].[CustomerID] = [t5].[CustomerID]
                   CROSS JOIN ([orders] AS [t7]
                               CROSS JOIN [order details] AS [t8]
                               INNER JOIN [products] AS [t9] ON [t9].[ProductID] = [t8].[ProductID])
                   WHERE NOT EXISTS(SELECT NULL AS [EMPTY]
                                    FROM [orders] AS [t10]
                                    CROSS JOIN [order details] AS [t11]
                                    INNER JOIN [products] AS [t12] ON [t12].[ProductID] = [t11].[ProductID]
                                    WHERE [t9].[ProductID] = [t12].[ProductID]
                                      AND [t10].[CustomerID] = [t0].[CustomerID]
                                      AND [t11].[OrderID] = [t10].[OrderID])
                     AND [t6].[CustomerID] <> [t0].[CustomerID]
                     AND [t1].[CustomerID] = [t0].[CustomerID]
                     AND [t2].[OrderID] = [t1].[OrderID]
                     AND [t4].[ProductID] = [t3].[ProductID]
                     AND [t7].[CustomerID] = [t6].[CustomerID]
                     AND [t8].[OrderID] = [t7].[OrderID]) AS [t13]
      WHERE [t0].[CustomerID] = N'ALFKI'
      GROUP BY [t13].[ProductName]) AS [t14]
ORDER BY [t14].[value] DESC

Page 16: Titan: Scaling Graphs and TinkerPop3

g.V.has('customerId',cid).as('customer')
  .out('orderedProduct').as('products')
  .in('orderedProduct').except('customer')
  .out('orderedProduct').except('products')
  .productName.groupCount.cap
  .map{it.sort{-it.value}}.next()[0..<5]

http://sql2gremlin.com/
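The recommendation traversal above walks customer -> products -> other customers -> their other products and counts how often each product is reached. The following Python sketch mirrors those steps on a tiny in-memory data set; the customer names, products and the `recommend` function are hypothetical and only illustrate the counting logic.

```python
from collections import Counter

# Hypothetical data: customer -> set of ordered products.
ordered = {
    "alice": {"cup", "spoon"},
    "bob": {"cup", "knife", "plate"},
    "carol": {"spoon", "knife"},
}

def recommend(customer, top=5):
    """Rank products bought by customers who share a product with `customer`."""
    mine = ordered[customer]
    counts = Counter()
    for other, theirs in ordered.items():
        shared = len(mine & theirs)          # paths from customer to `other`
        if other == customer or shared == 0:
            continue
        for product in theirs - mine:        # skip products already ordered
            counts[product] += shared        # one count per traversal path
    return [name for name, _ in counts.most_common(top)]

print(recommend("alice"))
```

Counting one hit per path matches what a path-based groupCount does: a product reached through several shared purchases ranks higher.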

Page 17: Titan: Scaling Graphs and TinkerPop3

TinkerPop 3 coming soon

Generic Graph API

Dataflow Processing

Traversal Language

Object-Graph Mapper

Graph Algorithms

Graph Server

http://www.tinkerpop.com/docs/3.0.0.M6/

Page 18: Titan: Scaling Graphs and TinkerPop3

label: demigod name: Hercules

label: god name: Jupiter

label: monster name: Cerberus

father father

mother brother

brother battled

pet

time:12

label: god name: Pluto age: 4000

label: god name: Neptune age: 4500

label: human name: Alcmene age: 45

label: titan name: Saturn age: 10000

label: monster name: Hydra

battled

time: 2

Page 19: Titan: Scaling Graphs and TinkerPop3

AURELIUS THINKAURELIUS.COM

Titan:db Architecture & Internals titandb.io

Page 20: Titan: Scaling Graphs and TinkerPop3

Architecture Analogy

MyISAM

Page 21: Titan: Scaling Graphs and TinkerPop3

Flexible Persistence

Partitionability

Availability Consistency

Page 22: Titan: Scaling Graphs and TinkerPop3

Graph Schema

- Labels & Keys
- Vertex Property Cardinality
  - Single, Set, List
- Edge Multiplicity
  - Multi, Simple, Many2One, One2Many, One2One
- Constraints & Consistency
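As a rough illustration of what the edge multiplicity settings constrain, the sketch below checks a list of edges of one label against each setting. The interpretation used here (Many2One allows one outgoing edge per vertex, One2Many one incoming edge per vertex, Simple one edge per vertex pair) follows the usual reading of these names; the function itself is hypothetical and not part of Titan's API.

```python
VALID = {"MULTI", "SIMPLE", "MANY2ONE", "ONE2MANY", "ONE2ONE"}

def violates_multiplicity(edges, multiplicity):
    """edges: (out_vertex, in_vertex) pairs that all share ONE edge label."""
    if multiplicity not in VALID:
        raise ValueError(multiplicity)
    pairs, outs, ins = set(), set(), set()
    for out_v, in_v in edges:
        # SIMPLE: at most one edge between any pair of vertices.
        if multiplicity == "SIMPLE" and (out_v, in_v) in pairs:
            return True
        # MANY2ONE / ONE2ONE: at most one outgoing edge per vertex.
        if multiplicity in ("MANY2ONE", "ONE2ONE") and out_v in outs:
            return True
        # ONE2MANY / ONE2ONE: at most one incoming edge per vertex.
        if multiplicity in ("ONE2MANY", "ONE2ONE") and in_v in ins:
            return True
        pairs.add((out_v, in_v))
        outs.add(out_v)
        ins.add(in_v)
    return False

# A 'mother' label would typically be Many2One: one mother per child.
print(violates_multiplicity([("hercules", "alcmene")], "MANY2ONE"))
```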

Page 23: Titan: Scaling Graphs and TinkerPop3

I. Navigate Memory

Page 24: Titan: Scaling Graphs and TinkerPop3

Sequential Data Access

Page 25: Titan: Scaling Graphs and TinkerPop3

Vertex Representation

5

Property

Property

Out-Edge

In-Edge

Out-Edge

In-Edge

In-Edge

row indices for fast vertex-centric queries

byte order sorting

cell = column + value

row key
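The layout on this slide (one row per vertex, cells sorted by column bytes) can be sketched in a few lines of Python. The type prefixes, the column encoding and the `Row` class are assumptions made for illustration and do not reproduce Titan's actual on-disk format; the sample edges use names from the mythology graph on Page 18.

```python
import bisect
import struct

# Hypothetical one-byte type prefixes; they only need a fixed sort order.
PROPERTY, OUT_EDGE, IN_EDGE = b"\x00", b"\x01", b"\x02"

def column(kind, name, sort_key=0):
    """Build a column whose byte order matches (kind, name, sort_key) order."""
    # Big-endian unsigned integers compare bytewise in numeric order.
    return kind + name.encode("ascii") + b"\x00" + struct.pack(">Q", sort_key)

class Row:
    """All cells of one vertex, kept sorted by column bytes."""
    def __init__(self, key):
        self.key = key          # row key, e.g. the vertex id
        self.columns = []       # sorted list of column byte strings
        self.values = {}        # column -> value

    def put(self, col, value):
        if col not in self.values:
            bisect.insort(self.columns, col)
        self.values[col] = value

    def slice(self, start, end):
        """Return the cells with start <= column < end as one contiguous run."""
        lo = bisect.bisect_left(self.columns, start)
        hi = bisect.bisect_left(self.columns, end)
        return [(c, self.values[c]) for c in self.columns[lo:hi]]

row = Row(key=5)
row.put(column(PROPERTY, "name"), "Hercules")
row.put(column(OUT_EDGE, "battled", 12), "Cerberus")
row.put(column(OUT_EDGE, "battled", 2), "Hydra")
row.put(column(OUT_EDGE, "father"), "Jupiter")

# All out-edges labeled 'battled' form one sorted, sequential slice.
battled = row.slice(OUT_EDGE + b"battled\x00", OUT_EDGE + b"battled\x01")
print([value for _, value in battled])
```

Because the sort key is encoded big-endian, the two `battled` cells come back ordered by time (2 before 12) without any extra sorting step.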

Page 26: Titan: Scaling Graphs and TinkerPop3

Vertex-Centric Index

- Create a vertex-centric index per edge label or property key to optimize for retrieval patterns
  - Through Titan's management API
- Supports multiple indexes

name: Hercules surname: Mighty label: user

time:24

bought

title: “Muscle building for beginners” label: product

[bought, time, DESC]

Page 27: Titan: Scaling Graphs and TinkerPop3

v

time: 1

fought fought father

mother

battled battled battled

battled

time: 3 time: 5

time: 9

v

Page 28: Titan: Scaling Graphs and TinkerPop3

v

time: 1

father

mother

battled battled battled

battled

time: 3 time: 5

time: 9

v.outE()

Page 29: Titan: Scaling Graphs and TinkerPop3

v

time: 1

battled battled battled

battled

time: 3 time: 5

time: 9

v.outE('battled')

Page 30: Titan: Scaling Graphs and TinkerPop3

v

time: 1

battled battled

time: 3

v.outE('battled')
  .has('time',lt,5)
  .inV
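The narrowing shown across Pages 27 to 30 can be imitated with a sorted list and binary search. This is a minimal sketch assuming the out-edges are sorted ascending by (label, time); the placeholder time 0 for `father` and `mother` and the `out_e` helper are assumptions, and a DESC index such as [bought, time, DESC] would simply reverse the order within a label.

```python
import bisect

# Out-edges of v as (label, time), sorted the way the index would store them.
out_edges = sorted([
    ("battled", 1), ("battled", 3), ("battled", 5), ("battled", 9),
    ("father", 0), ("mother", 0),   # time 0 is a placeholder, not from the slides
])

def out_e(label=None, time_lt=None):
    """Return out-edges, optionally restricted by label and by time < time_lt."""
    if label is None:
        return list(out_edges)                      # v.outE()
    lo = bisect.bisect_left(out_edges, (label,))   # first edge with this label
    upper = (label, float("inf")) if time_lt is None else (label, time_lt)
    hi = bisect.bisect_left(out_edges, upper)      # first edge past the range
    return out_edges[lo:hi]

print(out_e("battled", time_lt=5))
```

Each extra constraint shrinks the slice by moving a boundary found with binary search, so the cost depends on the size of the result rather than on the vertex's total degree.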

Page 31: Titan: Scaling Graphs and TinkerPop3

Faster Querying

- Multi-Query
  - Bulk querying reduces latency
  - Currently manual
- Order in vertex-centric query
- Adjacency-constraint
  - Faster edge retrievals

=> Titan Query Optimizer

Page 32: Titan: Scaling Graphs and TinkerPop3

Graph Index

- Composite internal index
  - Can be unique
- Label specific index
- Multiple external indexes

label: user name: Hercules surname: Mighty

time:24

bought

label: product title: “Muscle building for beginners”

Page 33: Titan: Scaling Graphs and TinkerPop3

g.E.has('location', WITHIN, Geoshape.circle(38,24,50))
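To show what a "within circle" predicate evaluates, here is a small Python sketch using the haversine great-circle distance. It assumes the circle arguments are latitude, longitude and a radius in kilometres; the helper names and the sample points are hypothetical, and a real deployment would delegate this test to the external index backend rather than compute it per edge.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def within_circle(point, center, radius_km):
    """True if `point` lies inside the circle around `center`."""
    return haversine_km(*point, *center) <= radius_km

center = (38.0, 24.0)
print(within_circle((37.97, 23.72), center, 50))   # a point roughly 25 km away
print(within_circle((40.64, 22.94), center, 50))   # a point roughly 300 km away
```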

Full text & Geo Search

Page 34: Titan: Scaling Graphs and TinkerPop3

Token Ring (BOP)

Graph Partitioning

- Assigns ids to map vertices into "optimal" token range
- Maintains virtual partitions

Vertical Partitioning = divide communities

Page 35: Titan: Scaling Graphs and TinkerPop3

Vertex Partitioning

Horizontal Partitioning = split super vertices

Each partition contains one “part” of a partitioned vertex
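The two partitioning ideas on Pages 34 and 35 can be sketched together: ids carry their partition so that a byte-ordered token ring keeps a partition's vertices contiguous, and a super vertex is split by grouping its edges by the partition of the neighbor. The bit layout and function names below are assumptions for illustration only, not Titan's actual id scheme.

```python
from collections import defaultdict

COUNTER_BITS = 32  # hypothetical layout: high bits = partition, low bits = counter

def make_id(partition, counter):
    """Pack the partition into the high bits so ids sort by partition first."""
    return (partition << COUNTER_BITS) | counter

def partition_of(vertex_id):
    """Recover the partition from an id built by make_id."""
    return vertex_id >> COUNTER_BITS

def split_super_vertex(neighbor_ids):
    """Group a partitioned vertex's edges by the partition of each neighbor."""
    parts = defaultdict(list)
    for nid in neighbor_ids:
        parts[partition_of(nid)].append(nid)
    return dict(parts)

neighbors = [make_id(0, 1), make_id(2, 5), make_id(0, 9)]
print(split_super_vertex(neighbors))
```

With this grouping, each partition holds the "part" of the super vertex whose edges point to vertices stored in that same partition, so a traversal step stays local.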

Page 36: Titan: Scaling Graphs and TinkerPop3

Combined Partitioning

Page 37: Titan: Scaling Graphs and TinkerPop3

[Figure: the mythology graph from Page 18 (vertices Hercules, Jupiter, Cerberus, Pluto, Neptune, Alcmene, Saturn, Hydra)]

OLTP

Page 38: Titan: Scaling Graphs and TinkerPop3

[Figure: the mythology graph from Page 18 (vertices Hercules, Jupiter, Cerberus, Pluto, Neptune, Alcmene, Saturn, Hydra)]

OLAP

Page 39: Titan: Scaling Graphs and TinkerPop3

Faunus Work Flow

hdfs://user/ubuntu/

output/job-0/

output/job-1/

output/job-2/ { graph*

sideeffect*

g.V.out.out.count()

Compressed HDFS Graphs

- stored in sequence files
- variable length encoding
- prefix compression
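The two compression techniques named here can be illustrated briefly. The slide does not say which exact schemes Faunus uses, so the sketch below assumes a common 7-bits-per-byte variable-length integer encoding and a simple shared-prefix scheme over sorted keys; all function names are hypothetical.

```python
def encode_varint(n):
    """Encode a non-negative int with 7 data bits per byte, low bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(data):
    """Decode one integer produced by encode_varint."""
    n = shift = 0
    for byte in data:
        n |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    return n

def prefix_compress(sorted_keys):
    """Store each key as (bytes shared with the previous key, remaining suffix)."""
    prev, out = b"", []
    for key in sorted_keys:
        shared = 0
        while shared < min(len(prev), len(key)) and prev[shared] == key[shared]:
            shared += 1
        out.append((shared, key[shared:]))
        prev = key
    return out

def prefix_decompress(entries):
    """Rebuild the original keys from prefix_compress output."""
    prev, keys = b"", []
    for shared, suffix in entries:
        prev = prev[:shared] + suffix
        keys.append(prev)
    return keys

keys = [b"battled:1", b"battled:3", b"father"]
print(encode_varint(300), prefix_compress(keys))
```

Small numbers take one byte instead of eight, and sorted keys that share a long prefix store only their differing tails.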

Page 40: Titan: Scaling Graphs and TinkerPop3

Degree Distribution

GitHub Network

g.V.sideEffect{
  it.degree = it.out('follows').count()
}.degree.groupCount
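The same computation on a single machine is a two-step count: compute each vertex's out-degree, then count how many vertices have each degree. The follower data below is hypothetical and only stands in for the GitHub network.

```python
from collections import Counter

# Hypothetical 'follows' adjacency: user -> users they follow.
follows = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": [],
    "d": ["c"],
}

def degree_distribution(adjacency):
    """Map each out-degree to the number of vertices that have it."""
    return Counter(len(targets) for targets in adjacency.values())

print(degree_distribution(follows))
```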

Page 41: Titan: Scaling Graphs and TinkerPop3

Degree Distribution

P(k) ~ k^(-γ)

γ = 2.2
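Taking logarithms of a power law gives a straight line, log P(k) = -γ log k + c, so γ can be read off as the negated slope on a log-log plot. The sketch below estimates it with an ordinary least-squares fit; this is one simple estimator chosen for illustration (the slides do not say how γ = 2.2 was obtained), and the input distribution is synthetic.

```python
import math

def estimate_gamma(distribution):
    """Negated least-squares slope of log P(k) against log k."""
    points = [(math.log(k), math.log(p))
              for k, p in distribution.items() if k > 0 and p > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return -sxy / sxx

# Synthetic distribution that follows k^(-2.2) exactly.
synthetic = {k: 1000 * k ** -2.2 for k in range(1, 50)}
print(round(estimate_gamma(synthetic), 3))
```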

Page 42: Titan: Scaling Graphs and TinkerPop3

Complete Architecture

Users & Apps

Event Processing

OLTP OLAP

Q

Continuous Computing

Interactive Query

Batch Query

(ETL)

Page 43: Titan: Scaling Graphs and TinkerPop3

Use Cases

Page 44: Titan: Scaling Graphs and TinkerPop3

http://arli.us/magazinaluiza

Page 45: Titan: Scaling Graphs and TinkerPop3

Security

Fraud

http://arli.us/cisco-sec1

Page 46: Titan: Scaling Graphs and TinkerPop3

© Sean York @ Pearson Education

http://arli.us/edu-planet-scale

Page 47: Titan: Scaling Graphs and TinkerPop3

http://arli.us/musicgraphintro

Music Graph

Knowledge Graph

Page 48: Titan: Scaling Graphs and TinkerPop3

http://bit.ly/WPTitanSEAGraph

Page 49: Titan: Scaling Graphs and TinkerPop3

Map+

When to use Graph Databases?

NoSQL

K V V V V

Persistence

Tabular

Data-driven

Graph

Page 50: Titan: Scaling Graphs and TinkerPop3

Map+

When to use Graph Databases?

NoSQL

K V V V V

Persistence

Tabular

Data-driven

Graph

Complexity

Page 51: Titan: Scaling Graphs and TinkerPop3

AURELIUS THINKAURELIUS.COM

@AURELIUSGRAPHS
