Trinity: A Distributed Graph Engine on a Memory Cloud
Speaker: LIN Qian
http://www.comp.nus.edu.sg/~linqian/
Graph applications
Online query processing → low latency
Offline graph analytics → high throughput
Online queries
Random data access, e.g., BFS, sub-graph matching, …
Offline computations
Performed iteratively
Insight: Keeping the graph in memory
at least the topology
Trinity
Online query + Offline analytics
Random data access problem in large graph computation
→ Globally addressable distributed memory
Random access abstraction
Belief
High-speed networks are more available
DRAM is cheaper
→ In-memory solutions become practical
“Trinity itself is not a system that comes with comprehensive built-in graph computation modules.”
Trinity cluster
Stack of Trinity system modules
User-defined: graph schema, communication protocols, computation paradigms
Memory cloud
Partition memory space into trunks
Hashing
Memory trunks
2^p trunks on m machines, 2^p > m
1. Trunk-level parallelism
2. Efficient hashing
Hashing
Key-value store
p-bit hash value i ∈ [0, 2^p − 1]
Inner trunk hash table
Data partitioning and addressing
Benefits: scalability, fault tolerance
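The two-level addressing described above can be sketched as follows. This is an illustrative assumption-laden sketch, not Trinity's actual API: the names (`addressing_table`, `trunk_of`, `locate`, `put`) and the round-robin trunk placement are invented for illustration.

```python
# Sketch of Trinity-style two-level addressing: a cell id is hashed to a
# p-bit trunk id; a (replicated) addressing table maps trunk -> machine;
# an inner-trunk hash table maps cell id -> (offset, size) in the trunk.
P = 8                                   # 2^p memory trunks in the cloud
MACHINES = 3                            # m machines, 2^p > m

# Global addressing table: trunk id -> machine id (round-robin assumed).
addressing_table = [t % MACHINES for t in range(2 ** P)]

def trunk_of(cell_id: int) -> int:
    """Hash a cell id to a p-bit value i in [0, 2^p - 1]."""
    return hash(cell_id) % (2 ** P)

def locate(cell_id: int) -> int:
    """Machine that currently owns the cell's trunk."""
    return addressing_table[trunk_of(cell_id)]

# Inner-trunk hash tables: cell id -> (offset, size) within the trunk.
trunk_index = {t: {} for t in range(2 ** P)}

def put(cell_id: int, payload: bytes, next_offset: dict) -> None:
    t = trunk_of(cell_id)
    off = next_offset.get(t, 0)
    trunk_index[t][cell_id] = (off, len(payload))
    next_offset[t] = off + len(payload)

offsets = {}
put(42, b"cell-42-bytes", offsets)
```

Because the addressing table, not the hash function, decides trunk placement, trunks can be moved between machines (for load balancing or recovery) by updating the table only — which is where the scalability and fault-tolerance benefits come from.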
Modeling graph
Cell = value + schema
Represent a graph node as a cell
TSL (Trinity Specification Language)
Object-oriented cell manipulation
Data integration
Network communication
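TSL declarations generate object-oriented accessors over cells; the Python dataclass below is only a loose analogue of the "value + schema" idea (TSL itself is a separate specification language, and the field names here are invented for illustration).

```python
# Illustrative analogue of a cell schema: the schema is the class
# definition, the value is a concrete instance; access is object-style
# rather than raw byte manipulation.
from dataclasses import dataclass, field
from typing import List

@dataclass
class GraphNodeCell:
    cell_id: int
    inlinks: List[int] = field(default_factory=list)
    outlinks: List[int] = field(default_factory=list)

node = GraphNodeCell(cell_id=7, outlinks=[1, 2, 3])
node.inlinks.append(5)      # schema-aware, object-oriented manipulation
```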
Online queries
Traversal based
New paradigm
Vertex-centric offline analytics
Restrictive vertex-centric model
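The vertex-centric model can be sketched as a Pregel-style superstep: each vertex folds its incoming messages into its own value, then sends messages along its out-edges. This is a minimal illustration, not Trinity's interface; all names are assumptions.

```python
# One BSP superstep of a vertex-centric computation (illustrative).
from collections import defaultdict

def superstep(graph, values, inbox):
    """graph: vertex -> out-neighbours; values: vertex -> float;
    inbox: vertex -> list of incoming messages from the last superstep."""
    outbox = defaultdict(list)
    for v, neighbours in graph.items():
        if inbox[v]:
            values[v] += sum(inbox[v])             # vertex-local update
        for u in neighbours:                        # message passing
            outbox[u].append(values[v] / max(len(neighbours), 1))
    return outbox

graph = {0: [1], 1: [0, 2], 2: []}
values = {v: 1.0 for v in graph}
out = superstep(graph, values, defaultdict(list))
```

The model is "restrictive" in the sense that all computation and communication must be expressed from a single vertex's viewpoint, one superstep at a time.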
Message passing optimization
Create a bipartite partition of the local graph
Buffer hub vertices
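The buffering idea above can be sketched as follows. This only shows the local combining step: messages from many local senders to the same high-degree ("hub") remote vertex are merged before leaving the machine, so a hub receives one aggregated message per machine instead of one per sender. The degree threshold and all names are assumptions for illustration; the actual optimization is driven by the bipartite partition of the local graph.

```python
# Combine locally-produced messages destined for hub vertices
# (illustrative sketch, not Trinity's implementation).
from collections import defaultdict

HUB_DEGREE = 1000    # assumed threshold for treating a vertex as a hub

def flush(messages, degree):
    """messages: list of (dst_vertex, value) produced on this machine."""
    combined = defaultdict(float)
    direct = []
    for dst, val in messages:
        if degree.get(dst, 0) >= HUB_DEGREE:
            combined[dst] += val          # buffer + combine for hubs
        else:
            direct.append((dst, val))     # low-degree: send as-is
    return direct, dict(combined)

degree = {99: 5000, 7: 3}
direct, hubs = flush([(99, 1.0), (99, 2.0), (7, 0.5)], degree)
# hubs == {99: 3.0}; direct == [(7, 0.5)]
```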
A new paradigm for offline analytics
1. Aggregate answers from local computations
2. Employ probabilistic inference
Circular memory management
• Aims to avoid memory gaps among a large number of key-value pairs
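A toy version of the idea: treat a trunk as a fixed byte array used circularly and always append cells at the head, so free space stays contiguous instead of fragmenting into gaps between key-value pairs. This sketch is a deliberately simplified assumption (the real mechanism must also handle concurrent access and reclaim the space of overwritten cells).

```python
# Toy circular-trunk allocator (illustrative, single-threaded).
class CircularTrunk:
    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.head = 0                       # next append position
        self.index = {}                     # cell id -> (offset, length)

    def append(self, cell_id: int, payload: bytes) -> None:
        n = len(payload)
        if self.head + n > len(self.buf):   # wrap around at the end
            self.head = 0
        self.buf[self.head:self.head + n] = payload
        self.index[cell_id] = (self.head, n)
        self.head += n

    def read(self, cell_id: int) -> bytes:
        off, n = self.index[cell_id]
        return bytes(self.buf[off:off + n])

trunk = CircularTrunk(64)
trunk.append(1, b"hello")
trunk.append(1, b"hello world")   # enlarged cell re-appended at the head
```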
Fault tolerance
Heartbeat-based failure detection
BSP: checkpointing
Async.: “periodic interruption”
Performance
Performance (cont.)