DESCRIPTION
Titan is a scalable graph database that can distribute and query graph data across multiple machines. This presentation provides a general introduction to graph computing and to Titan in particular. It also covers recent developments for Titan 0.9 and TinkerPop 3.
AURELIUS THINKAURELIUS.COM
Matthias Broecheler @mbroecheler December 2nd, MMXIV
Titan:db Graphs at Scale #TitanDB
software is eating the world
Data Driven
Tabular Relational
NoSQL
Graph
Data Model, Query Language
Simplicity, Flexibility, Handling
User, Product
User vertex: userid: matt, email: matt@, password: 12345
Product vertex: productid: 52235, name: cup, price: 12.55
buy edge (User to Product): time: 9/5/14

User
userid   email    password
matt     matt@    12345
john     john@    qwerty
billy    billy@   abcde

Product
productid   name    price
52235       cup     12.55
42215       spoon   7.22
24529       knife   5.32

Buy
userid   productid   time
matt     52235       9/5/14
billy    42215       8/7/14
billy    42215       8/7/14
What did Matt buy?

Gremlin:
g.V.has('email','matt@').out('buy').valueMap

SQL:
SELECT Product.*
FROM User
INNER JOIN Buy ON User.userid = Buy.userid
INNER JOIN Product ON Buy.productid = Product.productid
WHERE User.email = 'matt@'
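The comparison above can be tried out with a minimal sketch, assuming the three tables from the slides are loaded into an in-memory SQLite database. The dictionary-based adjacency list and the helper `bought_by` are hypothetical stand-ins for the graph model; they are not Titan or Gremlin code.

```python
import sqlite3

# In-memory copy of the User, Product and Buy tables shown on the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (userid TEXT PRIMARY KEY, email TEXT, password TEXT);
CREATE TABLE Product (productid INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE Buy (userid TEXT, productid INTEGER, time TEXT);
INSERT INTO User VALUES
  ('matt', 'matt@', '12345'), ('john', 'john@', 'qwerty'), ('billy', 'billy@', 'abcde');
INSERT INTO Product VALUES
  (52235, 'cup', 12.55), (42215, 'spoon', 7.22), (24529, 'knife', 5.32);
INSERT INTO Buy VALUES
  ('matt', 52235, '9/5/14'), ('billy', 42215, '8/7/14'), ('billy', 42215, '8/7/14');
""")

# Relational version: two joins through the Buy table.
sql_rows = conn.execute("""
SELECT Product.*
FROM User
INNER JOIN Buy ON User.userid = Buy.userid
INNER JOIN Product ON Buy.productid = Product.productid
WHERE User.email = 'matt@'
""").fetchall()

# Graph-style version (hypothetical structures): follow 'buy' edges from the user vertex.
users = {"matt": {"email": "matt@"}, "john": {"email": "john@"}, "billy": {"email": "billy@"}}
products = {
    52235: {"name": "cup", "price": 12.55},
    42215: {"name": "spoon", "price": 7.22},
    24529: {"name": "knife", "price": 5.32},
}
buy_edges = {"matt": [52235], "billy": [42215, 42215]}


def bought_by(email):
    """Return the property maps of all products bought by the user with this email."""
    return [
        products[pid]
        for uid, props in users.items()
        if props["email"] == email
        for pid in buy_edges.get(uid, [])
    ]


print(sql_rows)
print(bought_by("matt@"))
```

Both versions return the cup; the graph version reaches it by following edges rather than joining through a separate table.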
What did Matt buy?
SELECT TOP (5) [t14].[ProductName] FROM (SELECT COUNT(*) AS [value], [t13].[ProductName] FROM [customers] AS [t0] CROSS APPLY (SELECT [t9].[ProductName] FROM [orders] AS [t1] CROSS JOIN [order details] AS [t2] INNER JOIN [products] AS [t3] ON [t3].[ProductID] = [t2].[ProductID] CROSS JOIN [order details] AS [t4] INNER JOIN [orders] AS [t5] ON [t5].[OrderID] = [t4].[OrderID] LEFT JOIN [customers] AS [t6] ON [t6].[CustomerID] = [t5].[CustomerID] CROSS JOIN ([orders] AS [t7] CROSS JOIN [order details] AS [t8] INNER JOIN [products] AS [t9] ON [t9].[ProductID] = [t8].[ProductID]) WHERE NOT EXISTS(SELECT NULL AS [EMPTY] FROM [orders] AS [t10] CROSS JOIN [order details] AS [t11] INNER JOIN [products] AS [t12] ON [t12].[ProductID] = [t11].[ProductID] WHERE [t9].[ProductID] = [t12].[ProductID] AND [t10].[CustomerID] = [t0].[CustomerID] AND [t11].[OrderID] = [t10].[OrderID]) AND [t6].[CustomerID] <> [t0].[CustomerID] AND [t1].[CustomerID] = [t0].[CustomerID] AND [t2].[OrderID] = [t1].[OrderID] AND [t4].[ProductID] = [t3].[ProductID] AND [t7].[CustomerID] = [t6].[CustomerID] AND [t8].[OrderID] = [t7].[OrderID]) AS [t13] WHERE [t0].[CustomerID] = N'ALFKI' GROUP BY [t13].[ProductName]) AS [t14] ORDER BY [t14].[value] DESC
g.V.has('customerId',cid).as('customer')
  .out('orderedProduct').as('products')
  .in('orderedProduct').except('customer')
  .out('orderedProduct').except('products')
  .productName.groupCount.cap
  .map{it.sort{-it.value}}.next()[0..<5]
http://sql2gremlin.com/
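The traversal above walks from a customer to their products, on to other customers who ordered the same products, and then to products the first customer has not ordered yet. A rough Python sketch of that logic follows, assuming a hypothetical `ordered` dictionary in place of the graph; the data and the function name `recommend` are illustrative only.

```python
from collections import Counter

# Hypothetical toy data: customer -> set of ordered products.
ordered = {
    "alice": {"cup", "spoon"},
    "bob": {"cup", "knife", "plate"},
    "carol": {"spoon", "knife"},
    "dave": {"bowl"},
}


def recommend(cid, ordered, limit=5):
    """Rank products ordered by customers who share at least one product with cid."""
    mine = ordered[cid]
    counts = Counter()
    for other, theirs in ordered.items():
        if other == cid:
            continue  # corresponds to except('customer')
        shared = mine & theirs
        if not shared:
            continue  # no path from cid to this customer
        for product in theirs - mine:  # corresponds to except('products')
            # One traversal path per shared product, as in the groupCount step.
            counts[product] += len(shared)
    return [product for product, _ in counts.most_common(limit)]


print(recommend("alice", ordered))
```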
TinkerPop 3 coming soon
Generic Graph API
Dataflow Processing
Traversal Language
Object-Graph Mapper
Graph Algorithms
Graph Server
http://www.tinkerpop.com/docs/3.0.0.M6/
[Figure: example property graph]
Vertices:
- label: demigod, name: Hercules
- label: god, name: Jupiter
- label: god, name: Pluto, age: 4000
- label: god, name: Neptune, age: 4500
- label: human, name: Alcmene, age: 45
- label: titan, name: Saturn, age: 10000
- label: monster, name: Cerberus
- label: monster, name: Hydra
Edge labels: father, mother, brother, pet, battled (with property time: 12 and time: 2)
Titan:db Architecture & Internals titandb.io
Architecture Analogy
MyISAM
Flexible Persistence
Partitionability
Availability Consistency
Graph Schema
- Labels & Keys
- Vertex Property Cardinality
  - Single, Set, List
- Edge Multiplicity
  - Multi, Simple, Many2One, One2Many, One2One
- Constraints & Consistency
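To make the edge multiplicity options concrete, here is a minimal sketch of how such constraints could be checked when an edge is added. It assumes the usual reading of the five settings (Simple: at most one edge of a label between a pair of vertices; Many2One: at most one outgoing edge of the label per vertex; One2Many: at most one incoming edge; One2One: both). The `Multiplicity` enum and `EdgeStore` class are hypothetical and are not Titan's API.

```python
from enum import Enum


class Multiplicity(Enum):
    MULTI = "multi"
    SIMPLE = "simple"
    MANY2ONE = "many2one"
    ONE2MANY = "one2many"
    ONE2ONE = "one2one"


class EdgeStore:
    """Toy edge list that rejects edges violating the label's multiplicity."""

    def __init__(self, schema):
        self.schema = schema  # edge label -> Multiplicity
        self.edges = []       # (out_vertex, label, in_vertex)

    def add_edge(self, out_v, label, in_v):
        m = self.schema[label]
        same_label = [(o, i) for o, l, i in self.edges if l == label]
        if m is Multiplicity.SIMPLE and (out_v, in_v) in same_label:
            raise ValueError("duplicate %s edge between %s and %s" % (label, out_v, in_v))
        if m in (Multiplicity.MANY2ONE, Multiplicity.ONE2ONE):
            if any(o == out_v for o, _ in same_label):
                raise ValueError("%s already has an outgoing %s edge" % (out_v, label))
        if m in (Multiplicity.ONE2MANY, Multiplicity.ONE2ONE):
            if any(i == in_v for _, i in same_label):
                raise ValueError("%s already has an incoming %s edge" % (in_v, label))
        self.edges.append((out_v, label, in_v))


store = EdgeStore({"mother": Multiplicity.MANY2ONE, "battled": Multiplicity.MULTI})
store.add_edge("hercules", "mother", "alcmene")
store.add_edge("hercules", "battled", "hydra")
store.add_edge("hercules", "battled", "hydra")  # allowed: MULTI
```

With this schema, a second `mother` edge out of `hercules` would raise a `ValueError`, while repeated `battled` edges are accepted.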
I. Navigate Memory
Sequential Data Access
Vertex Representation
[Figure: vertex 5 stored as a sequence of entries: Property, Property, Out-Edge, In-Edge, Out-Edge, In-Edge, In-Edge]
Row indices for fast vertex-centric queries
byte order sorting
cell = column + value
row key
Vertex-Centric Index
- Create a vertex-centric index per edge label or property key to optimize for retrieval patterns
  - Through Titan's management API
- Supports multiple indexes
[Figure: user vertex (label: user, name: Hercules, surname: Mighty) connected by a bought edge (time: 24) to a product vertex (label: product, title: "Muscle building for beginners")]
Vertex-centric index: [bought, time, DESC]
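The idea behind an index such as [bought, time, DESC] can be sketched as keeping each vertex's edges sorted by label and descending time, so that a query for recent edges of one label becomes a range lookup instead of a full scan. The class `VertexEdges` and its methods are hypothetical names for illustration, not Titan internals.

```python
import bisect


class VertexEdges:
    """Edges of a single vertex, kept sorted by (label, time descending)."""

    def __init__(self):
        self._keys = []   # sort keys: (label, -time)
        self._edges = []  # parallel list of (label, time, other_vertex)

    def add(self, label, time, other):
        key = (label, -time)  # negating time yields descending order
        pos = bisect.bisect_right(self._keys, key)
        self._keys.insert(pos, key)
        self._edges.insert(pos, (label, time, other))

    def newest(self, label, since):
        """Edges with this label and time >= since, newest first."""
        lo = bisect.bisect_left(self._keys, (label, float("-inf")))
        hi = bisect.bisect_right(self._keys, (label, -since))
        return self._edges[lo:hi]


hercules = VertexEdges()
hercules.add("bought", 24, "muscle-book")
hercules.add("bought", 3, "sandals")
hercules.add("bought", 10, "club")
hercules.add("viewed", 30, "shield")

print(hercules.newest("bought", 10))
```

The lookup touches only the contiguous slice of matching edges, which is the property that matters for vertices with very many incident edges.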
[Figure: narrowing down the edges of a vertex v step by step]
v: all incident edges (fought, fought, father, mother, and four battled edges with time: 1, 3, 5, 9)
v.outE(): outgoing edges only (father, mother, and four battled edges with time: 1, 3, 5, 9)
v.outE('battled'): battled edges only (time: 1, 3, 5, 9)
v.outE('battled').has('time',lt,5).inV: battled edges with time below 5 (time: 1, 3)
Faster Querying
- Multi-Query
  - Bulk querying reduces latency
  - Currently manual
- Order in vertex-centric query
- Adjacency-constraint
  - Faster edge retrievals
-> Titan Query Optimizer
Graph Index
- Composite internal index
  - Can be unique
- Label specific index
- Multiple external indexes
[Figure: user vertex (label: user, name: Hercules, surname: Mighty) connected by a bought edge (time: 24) to a product vertex (label: product, title: "Muscle building for beginners")]
g.E.has('location',WITHIN,Geoshape.circle(38,24,50))
Full text & Geo Search
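What a "within circle" predicate computes can be illustrated with a small sketch. It assumes that the circle arguments are latitude, longitude, and a radius in kilometers, and it filters edges by great-circle distance in plain Python. In Titan this work is delegated to an external index backend; the functions and sample data here are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_circle(edges, lat, lon, radius_km):
    """Keep edges whose 'location' property lies inside the circle."""
    return [e for e in edges if haversine_km(lat, lon, *e["location"]) <= radius_km]


# Hypothetical edges with a (lat, lon) location property.
edges = [
    {"id": 1, "location": (38.1, 23.7)},  # roughly 30 km from the center
    {"id": 2, "location": (40.0, 22.0)},  # well over 200 km away
]

print([e["id"] for e in within_circle(edges, 38, 24, 50)])
```

A linear scan like this is only for illustration; an external geo index answers the same question without touching every edge.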
Token Ring (BOP)
Graph Partitioning
- Assigns ids to map vertices into "optimal" token range
- Maintains virtual partitions
Vertical Partitioning = divide communities
Vertex Partitioning
Horizontal Partitioning = split super vertices
Each partition contains one “part” of a partitioned vertex
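As a rough illustration of splitting a super vertex, the sketch below places each edge of a high-degree vertex in the partition of its neighbor, so every partition ends up holding one "part" of that vertex's adjacency list. The hash-based placement, the partition count, and the function names are assumptions made for this example and do not describe Titan's actual id assignment.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed cluster size for the example


def partition_of(vertex_id):
    """Deterministically map an ordinary vertex to a partition (hypothetical scheme)."""
    return zlib.crc32(vertex_id.encode("utf-8")) % NUM_PARTITIONS


def place_edges(super_vertex, neighbors):
    """Store each edge of the super vertex in the partition of its neighbor."""
    parts = defaultdict(list)
    for neighbor in neighbors:
        parts[partition_of(neighbor)].append((super_vertex, neighbor))
    return dict(parts)


followers = ["fan%d" % i for i in range(20)]
placed = place_edges("celebrity", followers)

for partition, part_edges in sorted(placed.items()):
    print(partition, len(part_edges))
```

A traversal that starts at a follower can then find its edge to the super vertex locally, without reading the full adjacency list from a single machine.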
Combined Partitioning
OLTP
OLAP
Faunus Work Flow
hdfs://user/ubuntu/
  output/job-0/
  output/job-1/
  output/job-2/ { graph*, sideeffect* }
g.V.out.out.count()
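The traversal g.V.out.out.count() counts all two-step paths in the graph. A single-machine sketch of the same computation, assuming a hypothetical adjacency-list dictionary, looks like this; Faunus would run the equivalent as a chain of Hadoop jobs over the HDFS files.

```python
def two_hop_count(out_edges):
    """Count paths v -> w -> x, i.e. the result of g.V.out.out.count()."""
    return sum(
        len(out_edges.get(w, ()))  # number of continuations from w
        for v in out_edges
        for w in out_edges[v]
    )


# Hypothetical toy graph: vertex -> list of out-neighbors.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}

print(two_hop_count(graph))
```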
Compressed HDFS Graphs
- stored in sequence files
- variable length encoding
- prefix compression
Degree Distribution
GitHub Network
g.V.sideEffect{ it.degree = it.out('follows').count() }.degree.groupCount
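The degree computation above can be mirrored in a few lines of Python: compute each vertex's number of outgoing follows edges, then count how many vertices share each degree. The `follows` dictionary is made-up sample data, not the GitHub network.

```python
from collections import Counter


def degree_distribution(follows):
    """Map out-degree k to the number of vertices with exactly k 'follows' edges."""
    return Counter(len(targets) for targets in follows.values())


# Hypothetical sample: user -> list of users they follow.
follows = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["a"]}

print(degree_distribution(follows))
```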
Degree Distribution
P(k) ~ k^(-γ)
γ = 2.2
Complete Architecture
[Figure: Users & Apps and Event Processing feeding OLTP and OLAP; workloads: Continuous Computing, Interactive Query, Batch Query (ETL)]
Use Cases
http://arli.us/magazinaluiza
Security
Fraud
http://arli.us/cisco-sec1
© Sean York @ Pearson Education
http://arli.us/edu-planet-scale
http://arli.us/musicgraphintro
Music Graph
Knowledge Graph
http://bit.ly/WPTitanSEAGraph
Map+
When to use Graph Databases?
NoSQL
K V V V V
Persistence
Tabular
Data-driven
Graph
Complexity