View
224
Download
0
Tags:
Embed Size (px)
Citation preview
P2P Databases
P2P Today
napster
gnutellamorpheus
kazaa
bearshare seti@home
folding@home
ebay
limewire
icq
fiorana
mojo nation
jxta
united devicesopen cola
uddi
process tree
can
chord
ocean store
farsite
pastry
tapestry
?grove
netmeeting
freenet
popular power
aim
jabber
bittorrentedonkey
Object representation and storageAttributes :
Name , Artist, Album , Genre
Objects
Pointer to object
P2P vs. Distributed DBMS
• Transactions• Distributed Query Optimization• Interoperation of heterogeneous data
sources• Reliability/failure of nodes
Complex features do not scale
Traditional DDBMS Issues:
P2P vs. Distributed DBMSExample application: file-sharing• Simple data model and query language
– No complex query optimization– Easy interoperation
• No guarantee on quality of results– Individual site availability unimportant
• Local updates– No transactions– Network partitions OK
Simple Amenable to large-scale network of PCs
Example: file sharing
• Challenge #1: Performance– Asking everyone
is expensive!– If I am smart, I
only need to ask one peer
– How can I be smart?
?
?
??
File X?
Search in P2P
• System can control:– Connections made by users/topology– Data placement– Query type
• Tight control: “Structured” – Efficient, comprehensive
• Loose control: “Unstructured”– Inefficient, not comprehensive, simple, expressive– Used in real life
Both are useful to study
Centralized
• Napster model• Benefits:
– Efficient search– Limited bandwidth
usage– No per-node state
• Drawbacks:– Central point of failure– Limited scale
Bob Alice
JaneJudy
http://www.snocap.com/
Unstructured – Query Flooding
= forward query
= processed query
= query source
= found result
= forward response
Problems with unstructured
• Inefficient– Query messages are flooded– Even if routing is intelligent, worst case load is still
O(n), where n is # nodes in system
• Not comprehensive– If I do not get a result for my query, is it because
none exists?
• (Of course, many optimizations are possible…)
Structured systems address these problems
Distributed Hash Table (DHTs)
• Model:– Key/Object pair, the key is hashed to get an ID– Example:
• Objects are files• The key is the content of the file• The ID is the hash of the file contents
• Single operation: Lookup(ID)– Input: integer ID– Output: the object with the corresponding ID
Identifiers
• IDs are m-bit integers• Nodes are also assigned IDs
– Commonly assigned by hashing a node’s IP address, although many problems with this
• An object is stored on the node with the smallest ID greater than the object’s ID– This node is called the successor of the object’s
ID– IDs are arranged on a circle, so 0 > 2m-1
Data Placement
0
1
2
3
45
6
7m = 3 Nodes:
• 0• 1• 3
Data:• 1• 2• 6
1
2
2
6
6
Connections
0
1
2
3
45
6
7
“Finger pointers”
Distance• 20
• 21
….
• 2m-1
Query• Lookup(objectID)
– objectID is typically the ID of the object you are looking for, but not necessarily
• Approach:– Find the predecessor of the object
• I.e. the node with the largest ID that is smaller than the object ID
– Return the successor of the predecessor
Query Example
• Say node 0 wants to find the object with ID = 7
• For simplicity, we will assume a node exists at every ID in the space
Query Example
0
1
3
45
6
7
2
Node 0: Lookup(7)
Node 0: FindPred (7)
Query Example
0
1
3
45
6
7
2
Node 4: FindPred(7)
Query Example
0
1
3
45
6
7
2
Node 6: FindPred(7)
Node 6 is predecessorReturn successor node 7
Query characteristics
• With high probability, a query can be answered by contacting O(log N) nodes– N total nodes in the networkEfficient!
• Also notice: if an object with the ID exists in the network, it will be foundComprehensive!
• State is also O(log N) in size
Query characteristics
• Note that finger pointers are not required for correct operation– Only successor pointers are needed– But then cost of query increases
• O(N) in worst case
Advantages of Structured?
• Scalability/Efficiency – load grows with O(log N)
• Comprehensiveness
Disadvantages? (cont)
• Availability of Data– If a node dies suddenly, what happens to
the data it was storing?– MUST replicate data across multiple nodes
• Query Language– How can we express keyword queries
efficiently?– Many useful applications require different
languages
Magnolia
Current approach: Hash each keyword separately and store pointers at h(keyword)
Seven
Innovation
Myths
h(some)
h(innovation)
h(myths)
1100100101
“Seven Innovation Myths” 1100100101h(title) “Innovation”
Resulting Distribution
Prefix hashing
………….
m’
m bits
Innovation
hP(innovation) hP = m’ bit hash function
Partitions network into ~ n/2m’ separate sibling groups
n = nodes, m’ partitioning factorFor m’=12, n= 1 million, ~ 256 nodes will share same prefix Assumption: h is uniformly distributed
100
Prefix Hashing
100
Innovation
Balanced over the sibling group
Sibling group ID=100
Balancing
All siblings in a group share the same prefix
Random Sibling
InsertKeyword hP SiblingGroup ID
Locate a sibling node via SIFT
Lookup
Keyword
O(1)
Group Broadcast or Multicast
Replies
Advantages
• Good Balancing Properties
Advantages
• Low Traffic Load on nodes for popular queries
• Quick Lookup
• Popularity Ranking of Objects
• Distributed Replication for resilience
Implementing Magnolia
• Developed on top of a chord clone written in Python– If you’re going to write a peer-to-peer app, why
not leverage existing modules and libraries?
• Challenge: How do we implement group-based stores and queries without requiring additional network maintenance?
Chord’s Finger Table
• A chord node maintains a finger table of M IP’s pointing to nodes ahead of it in the ring.– A pointer at index i is the successor of node id +
(2^i-1). This lets us reach any node in the network in O(log M) hops
• We use the M’ most significant bits in a node’s id to indicate it’s group. We want to reach any group in O(log M’) hops.– Do we need another table?– Nope. The last M’ entries in our finger table
provide this.
Talking to Siblings
• How do we propagate queries through the group?
• Naïve solution: send to our predecessor and successor.
• A better solution: We can send a query throughout the group by treating the sibling group as a tree.
Sibling Tree
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0
1 8
2
3
15
4
5
6 7
9
10 11
12
13 14
0+1
1+1
2+1
0+2^3
8+2^21+2^2
2+2^1 5+2^1 9+2^1 12+2^1
14+2^0
5+1 9+1
8+1
12+1
N/N’ = 16; M/M’ = 4
Every edge can be found in the finger table!
Sibling Tree Problems
• Problems:– Not every possible node will exist – Not every node will have results to report– The query maker needs to know when the search is
done
• But we’re okay!– Nodes can determine if a child sub-tree is dead– Even if a child node in our sibling table is of a higher ID
than expected• its sub-tree contains all existing descendents of the expected id• we can predict when a child is in a sibling our ancestor’s tree
Bigger Problems
• What if a pointer in our finger table fails?– We either have to find the successor to it’s id or
fail to query the sub-tree
• What if the lowest ID node isn’t the root of our tree?– Some of our edges won’t be in our finger table
Popularity queries
Yulania , Demo
BitTorrent
SplitStream