
Secure Peer-to-Peer File Sharing

Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan

http://www.pdos.lcs.mit.edu/chord

MIT Laboratory for Computer Science


SFS: a secure global file system

• One name space for all files

• Global deployment

• Security over untrusted networks

[Figure: an Oxygen client (H21), another client (H21), and a server at MIT; example pathname /global/mit/kaashoek/sfs]


SFS results

• Research: how to do server authentication?
– Self-certifying pathnames
– Flexible key management

• Complete system available
– www.fs.net
– 65,000 lines of C++ code
– Toolkit for file system research

• System used inside and outside MIT

• Ported to iPAQ


New direction: peer-to-peer file sharing

• How to build distributed systems without centrally-managed servers?

• Many Oxygen technologies are peer-to-peer
– INS, SFS/Chord, Grid

• Chord is a new elegant primitive for building peer-to-peer applications


Peer-to-Peer Properties

• Main advantage: decentralized
– No single point of failure in the system
– More robust to random faults or adversaries
– No need for central administration

• Main disadvantage: decentralized
– All failures equally important: no “clients”
– Difficult to coordinate use of resources
– No opportunity for central administration


Peer-to-Peer Challenges

• Load balancing
– No node should be overloaded

• Coordination
– Agree globally on who is responsible for what

• Dynamic network/fault tolerance
– Readjust responsibility as peers come and go

• Scalability
– Resources per peer must be negligible


Peer-to-peer sharing example

• Internet users share music files
– Share disk storage and network bandwidth

– 10Gb for 1 hour/day continuous 400Mb

[Figure: users' machines connected through the Internet]


Key Primitive: Lookup

[Figure: peers issuing insert and find operations]


Chord: a P2P Routing Primitive

• Lookup is the key problem
– Given identifier, find responsible machine

• Lookup is not easy:
– Gnutella scales badly: too much lookup work
– Freenet is imprecise: lookups can fail

• Chord lookup provides:
– Good naming semantics and efficiency
– Elegant base for layered features


Chord Architecture

• Interface:
– Lookup(ID) → IP-address
– ID might be node name, document ID, etc.
– Get IP address of node responsible for ID
– Application decides what to do with IP address

• Chord consists of
– Consistent hashing to assign IDs to nodes
– Efficient routing protocol to find right node
– Fast join/leave protocol


Chord Properties

• Log(n) lookup messages and table space
– Log(1,000,000) ~ 20 (log base 2)

• Well-defined location for each ID
– No search required

• Natural load balance

• Minimal join/leave disruption

• Does not store documents
– But document store layers easily on top


Assignment of Responsibility


Consistent Hashing


Consistent Hashing

[Figure: identifier circle with nodes at 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]

Each node picks random point on identifier circle


Consistent Hashing

Hash document ID to identifier circle

[Figure: the same identifier circle, with a document hashed to identifier 49]


Consistent Hashing

Assign ID to “successor” node on circle

Assign doc with hash 49 to node 51

[Figure: the same identifier circle, with identifier 49 assigned to node 51]
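
The successor rule on the last three slides can be sketched in a few lines of Python. This is a minimal sketch, not the Chord implementation: the 6-bit circle (64 points) and the node IDs are read off the figure, and the use of SHA-1 in the hypothetical helper `hash_to_id` is an assumption made only for illustration.

```python
import bisect
import hashlib

M = 6                      # assumed identifier width: 2**6 = 64 points on the circle
RING = 2 ** M
NODES = [0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]  # node IDs as read off the figure

def hash_to_id(name: str) -> int:
    """Map an arbitrary name onto the identifier circle (SHA-1 is an assumed choice)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING

def successor(ident: int, nodes=NODES) -> int:
    """Return the first node at or after `ident`, wrapping past the top of the circle."""
    i = bisect.bisect_left(nodes, ident % RING)
    return nodes[i % len(nodes)]

print(successor(49))   # the document with hash 49 lands on node 51
print(successor(61))   # identifiers past node 60 wrap around to node 0
```

Because only the sorted order of node IDs matters, adding or removing one node changes the owner of only the identifiers in that node's own segment.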


Load Balance

• Each node responsible for circle segment between it and previous node

• But random node positions mean previous node is close

• So no node responsible for too much

[Figure: segment for node 31, running from node 22 to node 31]
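
As a rough illustration of the segment idea, the sketch below computes how many identifiers each node owns on the example circle. The 64-point circle and the node IDs are taken from the figures and are assumptions of this sketch; `segment_sizes` is a hypothetical helper, not part of Chord.

```python
def segment_sizes(nodes, ring_size):
    """Map each node to the number of identifiers in (previous node, node]."""
    nodes = sorted(nodes)
    sizes = {}
    for i, node in enumerate(nodes):
        prev = nodes[i - 1]                      # index -1 wraps to the last node
        # A lone node owns the whole circle, hence the `or ring_size` fallback.
        sizes[node] = (node - prev) % ring_size or ring_size
    return sizes

NODES = [0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]   # node IDs as read off the figure
sizes = segment_sizes(NODES, 64)
print(sizes[31])             # node 31 owns identifiers 23..31
print(max(sizes.values()))   # the largest share on this example circle
```

On this example the shares range from 4 to 9 identifiers out of 64, so no single node carries a large fraction of the circle.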


Dynamic Network

• To know appropriate successor, must know identifiers of all nodes on circle

• Requires lots of state per node

• And state must be kept current

• Requires huge number of messages when a node joins or leaves


Successor Pointers

• Each node keeps track of successor on circle

• To find objects, walk around circle using successor pointers

• When node joins, notify one node to update successor

• Problem: slow!

[Figure: identifier circle with nodes at 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60, each pointing to its successor]
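
To see why successor pointers alone are slow, here is a minimal sketch of a lookup that only follows successor pointers. The node IDs come from the figure; the function names are hypothetical, and the successor table is built centrally here purely for illustration.

```python
RING = 64
NODES = [0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]   # node IDs as read off the figure
# Each node knows only the next node clockwise.
SUCC = {n: NODES[(i + 1) % len(NODES)] for i, n in enumerate(NODES)}

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b      # the interval wraps past 0

def walk_lookup(start, ident):
    """Follow successor pointers from `start`; return (responsible node, hops taken)."""
    node, hops = start, 0
    while not in_half_open(ident, node, SUCC[node]):
        node = SUCC[node]
        hops += 1
    return SUCC[node], hops

print(walk_lookup(0, 49))    # walks 6, 13, 18, ... before reaching node 51
```

The hop count grows linearly with the number of nodes between the start and the target, which is the cost that fingers are meant to remove.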


Fingers

• Each node keeps carefully chosen “fingers”: shortcuts around circle

• For distant ID, shortcut covers much distance

• Result:
– fast lookups
– small tables

[Figure: identifier circle with nodes at 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60, with finger shortcuts across the circle]


Powers of 2

• Node at ID n stores fingers to nodes at IDs n+1/2, n+1/4, n+1/8, n+1/16, ... (fractions of the way around the circle)

• log(n) fingers needed for n nodes

• Key fact: whatever current node, some power of 2 is halfway to target

• Distance to target halves in each step

• log(n) steps suffice to reach target

• log(1,000,000) ~ 20
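
The halving argument can be made concrete with a small simulation. This is a sketch under stated assumptions: a 6-bit circle with the node IDs from the earlier figures, finger i pointing at the first node at or after n + 2**i, and finger tables computed centrally rather than learned over the network. The function names are hypothetical.

```python
import bisect
import math

M = 6                      # assumed identifier width: 64 points on the circle
RING = 2 ** M
NODES = [0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]   # node IDs as read off the figures

def successor(ident):
    """First node at or after `ident`, wrapping around the circle."""
    i = bisect.bisect_left(NODES, ident % RING)
    return NODES[i % len(NODES)]

def between(x, a, b, inclusive_right=False):
    """Circular interval test: x in (a, b), or (a, b] if inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b

def fingers(n):
    """Finger i is the first node at or after n + 2**i (1/64, ..., 1/4, 1/2 of the circle)."""
    return [successor(n + 2 ** i) for i in range(M)]

def finger_lookup(start, ident):
    """Return (responsible node, forwarding hops) using the farthest non-overshooting finger."""
    node, hops = start, 0
    while True:
        succ = successor(node + 1)
        if between(ident, node, succ, inclusive_right=True):
            return succ, hops
        for f in reversed(fingers(node)):      # try the longest shortcut first
            if between(f, node, ident):
                node = f
                break
        else:
            node = succ                        # fall back to the plain successor
        hops += 1

print(finger_lookup(0, 49))         # 0 -> 36 -> 47, then node 51 is responsible
print(math.log2(1_000_000))         # just under 20
```

On this circle the lookup from node 0 for identifier 49 needs two forwarding hops, compared with eight when only successor pointers are followed.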


Chord Lookups

[Figure: a lookup hopping across the identifier circle with nodes at 0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]


Node Join Operations

• Integrate into routing mechanism
– New node finds successor (via lookup)
– Determines fingers (more lookups)
– Total: O(log^2(n)) time to join network

• Takes responsibility for certain objects from successor
– Upcall for application dependent reaction
– E.g., may copy documents from other node
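
The hand-off of objects on join can be sketched with a toy, centralized model of the circle. Everything here is an illustrative assumption: the `Ring` class, its method names, and the in-memory dictionaries stand in for what would be separate machines, and finger maintenance is left out.

```python
import bisect

RING = 64   # assumed 6-bit identifier circle, as in the figures

class Ring:
    """Toy model: one object holds every node's key/value store."""

    def __init__(self, nodes):
        self.nodes = sorted(nodes)
        self.store = {n: {} for n in self.nodes}   # node ID -> {key ID: value}

    def successor(self, ident):
        i = bisect.bisect_left(self.nodes, ident % RING)
        return self.nodes[i % len(self.nodes)]

    def put(self, key, value):
        self.store[self.successor(key)][key] = value

    def join(self, new):
        succ = self.successor(new)          # step 1: find successor via lookup
        bisect.insort(self.nodes, new)
        self.store[new] = {}
        # Step 2: take over the keys that now map to the new node.
        for key in list(self.store[succ]):
            if self.successor(key) == new:
                self.store[new][key] = self.store[succ].pop(key)
        return succ

ring = Ring([0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60])
ring.put(49, "doc-49")
ring.put(51, "doc-51")
took_from = ring.join(50)
print(took_from, ring.store[50], ring.store[51])
```

Only the successor (node 51 here) is affected by the join; every other node keeps exactly the keys it had.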


Fault Tolerance

• Node failures cause 2 problems:
– Lost data
– Corrupted routing (fingers cut off)

• Data solution: replicate
– Place copies of data at adjacent nodes
– If successor fails, next node becomes successor

• Finger solution: alternate paths
– If finger lost, use different (shorter) finger
– Lookups still fast
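
The data half of this slide can be illustrated with a short sketch: copies go to the next few nodes on the circle, so when the owner fails the next live node already holds the value. The replica count r=3, the function names, and the shared `store` dictionary are assumptions of this sketch.

```python
import bisect

RING = 64   # assumed 6-bit identifier circle, as in the figures

def successors(nodes, ident, r):
    """Return up to r distinct nodes clockwise from `ident`; the first is the owner."""
    nodes = sorted(nodes)
    i = bisect.bisect_left(nodes, ident % RING)
    return [nodes[(i + k) % len(nodes)] for k in range(min(r, len(nodes)))]

def put(store, nodes, key, value, r=3):
    """Store the value at the owner and at the next r-1 nodes."""
    for n in successors(nodes, key, r):
        store.setdefault(n, {})[key] = value

def get(store, live_nodes, key):
    """Ask whichever live node is currently the successor of `key`."""
    owner = successors(live_nodes, key, 1)[0]
    return store.get(owner, {}).get(key)

nodes = [0, 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]
store = {}
put(store, nodes, 49, "doc-49")                  # copies at 51, 60 and 0
live = [n for n in nodes if n != 51]             # node 51 fails
print(get(store, live, 49))                      # node 60 is now the successor
```

With r copies, the value is lost only if all r consecutive nodes fail before the copies are re-created.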


File sharing with Chord

[Figure: a client and two servers, each running a Key/Value layer on top of Chord; the client app (e.g. browser) issues get(key) and put(k, v), and the Key/Value layer issues lookup(id)]

• Fault tolerance: store values at r successors

• Hot documents: cache values along Chord lookup path

• Authentication: self-certifying names (SFS)
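
One simple way to make a name verify its own content in a key/value layer is to use a hash of the value as its key, so a client can check whatever an untrusted server returns. The sketch below shows that idea only; it is an assumption for illustration and is not claimed to be how SFS constructs its self-certifying names. The `KeyValueStore` class and the choice of SHA-1 are hypothetical.

```python
import hashlib

class KeyValueStore:
    """Toy content-addressed store: the key is the SHA-1 hash of the value."""

    def __init__(self):
        self._data = {}

    def put(self, value: bytes) -> str:
        key = hashlib.sha1(value).hexdigest()
        self._data[key] = value
        return key

    def get(self, key: str) -> bytes:
        value = self._data[key]
        # The reader re-hashes the value, so a tampered copy is detected.
        if hashlib.sha1(value).hexdigest() != key:
            raise ValueError("value does not match its key")
        return value

kv = KeyValueStore()
key = kv.put(b"hello, chord")
print(kv.get(key))
```

Because the check needs nothing but the key itself, it works the same way for values fetched from a replica or from a cache along the lookup path.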


Chord Status

• Working Chord implementation

• SFSRO file system layered on top

• Prototype deployed at 12 sites around world

• Understand design tradeoffs


Open Issues

• Network proximity

• Malicious data insertion

• Malicious Chord table information

• Anonymity

• Keyword search and indexing


Chord Summary

• Chord provides distributed lookup
– Efficient, low-impact join and leave

• Flat key space allows flexible extensions

• Good foundation for peer-to-peer systems

http://www.pdos.lcs.mit.edu/chord