46
An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research Pittsburgh

An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

  • View
    219

  • Download
    0

Embed Size (px)

Citation preview

Page 1: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

An Architecture for Internet Data Transfer

Niraj Tolia

Michael Kaminsky*, David G. Andersen, and Swapnil Patil

Carnegie Mellon University and *Intel Research Pittsburgh

Page 2: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 20062

Innovation in Data Transfer is Hard

• Imagine: You have a novel data transfer technique• How do you deploy?

1. Update HTTP. Talk to IETF. Modify Apache, IIS, Firefox, Netscape, Opera, IE, Lynx, Wget, …

2. Update SMTP. Talk to IETF. Modify Sendmail, Postfix, Outlook…

3. Give up in frustration

2001 2006

2002 2003 2004 2005

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001 2006

2002 2003 2004 2005

2001Bittorrent

2005Avalanche

2001 2006

2002 2003 2004 2005

2001Bittorrent

2005Avalanche

2005ALM-FastReplica

2001 2006

2002 2003 2004 2005

2001Bittorrent

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2001CAN

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2001CAN

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2001CAN

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2001CAN

2001LBFS

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003Value-Caching

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2001CAN

2001LBFS

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003CASPER

2003Value-Caching

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2001CAN

2001LBFS

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003CASPER

2003Value-Caching

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2003Riverbed

2001CAN

2001LBFS

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003CASPER

2003Value-Caching

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2003Riverbed

2001CAN

2001LBFS

2004Lookaside Caching

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003CASPER

2003Value-Caching

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2004Segank

2003Riverbed

2001CAN

2001LBFS

2004Lookaside Caching

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003CASPER

2003Value-Caching

2004USPS

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2004Segank

2003Riverbed

2001CAN

2001LBFS

2004Lookaside Caching

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003CASPER

2003Value-Caching

2004USPS

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2004Segank

2003Riverbed

2001CAN

2001LBFS

2004Lookaside Caching

2004BlueFS

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003CASPER

2003Value-Caching

2002IBP

2004USPS

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2004Segank

2003Riverbed

2001CAN

2001LBFS

2004Lookaside Caching

2004BlueFS

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003CASPER

2003Value-Caching

2002IBP

2004USPS

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2004Segank

2003DataStaging

2003Riverbed

2001CAN

2001LBFS

2004Lookaside Caching

2004BlueFS

2001 2006

2002 2003 2004 2005

2001Bittorrent

2001Chord

2001Pastry

2004OpenDHT

2003CASPER

2003Value-Caching

2002IBP

2003Bullet

2004USPS

2005Avalanche

2005CoBlitz

2005ALM-FastReplica

2004Segank

2003DataStaging

2003Riverbed

2001CAN

2001LBFS

2004Lookaside Caching

2004BlueFS

Page 3: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 20063

Barriers to Innovation in Data Transfer

• Applications bundle:• Content Negotiation: What data to send

– Naming (URLs, directories, …)– Languages– Identification– …

• Data Transfer: Getting the bits across

• Both are tightly coupled (e.g., HTTP, SMTP)• Hinders innovation and evolution of new

services

Page 4: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 20064

Solution: A Data Transfer Service

• Decouple content negotiation from data transfer• Applications perform negotiation as before• But hand data objects to the Transfer Service

• The Transfer Service is shared by applications

Application ProtocolSender Receiver

Xfer Service Xfer Service

and Data

Data

Page 5: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 20065

Application Protocol

Extensible Transfer Architecture

Sender Receiver

Xfer Service

USBKeychain

Xfer Service

LocalCache

USBKeychain

Bittorrent

Application-independent cache

Non-networked transfers New network features

Bittorrent

Plugins

Page 6: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 20066

Transfer Service BenefitsApps. can reuse available transfer techniques

• No reimplementation needed

Easier deployment of new technologies• Applications need no modification

Provides for cross-application sharing• Can interpose on all data transfers

Handles transient disconnections

Page 7: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 20067

Outline

• Motivation• Data Oriented Transfer (DOT) service• Evaluation• Open Issues and Future Work• Conclusion

Page 8: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 20068

Xfer ServiceXfer Service

Sender Receiver

10,000 Foot View of Transfers using DOT

Request File X

put(X) read() data

• How does the transfer service name data?• How does the transfer service locate data?

??

??

Page 9: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 20069

• Application defined names are not portable• Use content-naming for globally unique names• Objects represented by an OID

• Objects are further sub-divided into “chunks”

• Each OID corresponds to a list of descriptors• Descriptor lists allow for partial transfers

FileFileFileFile

Desc3Desc3

DOT: Object Naming

Foo.txtFoo.txtFoo.txtFoo.txt OIDOID

File Cryptographic Hash

Desc1Desc1

Desc2Desc2

Page 10: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200610

DOT: Object Location

• Data transfers in DOT are receiver driven• Receiver has better idea of available resources

• Senders specify ‘hints’ - potential data locations– dot://sender.example.com:12000/– dht://opendht.org/– …

Page 11: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200611

Xfer ServiceXfer Service

Sender Receiver

A Transfer using DOT

Request File X

OID, Hints

put(X) OID, Hints get(OID, Hints) read() data

TransferPlugins

Page 12: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200612

(1) ApplicationAPI

DOT’s Modular Architecture

DOT

Application

Network

Local Storage

StoragePlugin

TransferPlugin

(3) Storage Plugin API

(2) Transfer Plugin API

Page 13: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200613

DOTNetworkTransfer

PluginMultiPath

Plugin

Transfer Plugin API

• Simple API• get_descriptor_list( OID, hints )• get_chunks( descriptor_list, hints )• cancel_chunks( chunk_list )

• Transfer plugin chaining is easy• e.g., multipath plugin Transfer

Plugin

TransferPlugin

Page 14: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200614

Implementation• In C++ using libasync event-driven library• One storage plugin:

– In-memory hash tables, disk backed.

• Three transfer plugins:• Default Xfer-Xfer plugin• Portable Storage plugin• Multipath plugin

• Applications• gcp, an scp-like tool for file transfers• A DOT-enabled Postfix email server

– Included a socket-like adapter library

Page 15: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200615

Xfer

Current DOT Prototype

NET( DSL )

Xfer

NET Multi-path

NETwireless

SENDER

RECEIVER

USBUSB

InternetInternetcache

Application-independent cache

Non-networked transfers Multipath and Mirror support

NET

MIRROR

Xfer Plugins

Page 16: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200616

Outline

• Motivation• Data Oriented Transfer (DOT) service• Evaluation• Open Issues and Future Work • Conclusion

Page 17: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200617

Evaluation

• Standard file transfer• Portable Storage• Multi-Path• Case Study: Postfix Email Server

• Capture and analysis of email trace• Evaluation of DOT-enabled SMTP server• Integration effort

Page 18: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200618

Standard File Transfer Setup

• Two DOT-enabled machines• Network Emulator

• Evaluate various b/w + delay combinations

• Use gcp for the file transfers • Used 40MB, 4MB, 400KB, 40KB, 4KB files

• Presenting 40MB here

Network Emulator

Page 19: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200619

Standard File Transfer

• Overhead: hashing, extra RTT• No noticeable overheads with latency

0.5

3.6

1.6

3.7

0.7

3.7

1.2

4.1

0.0

1.0

2.0

3.0

4.0

5.0

1000 Mb/s 100 Mb/s

Tra

ns

fer

Tim

e (

se

c)

- 4

0 M

Bwget scp scp w/o encr. gcp

Page 20: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200620

Portable Storage Experiment

• 255 MB transfer over emulated DSL• Based on Virtual Machine transfers at Carnegie Mellon• DOT preemptively copies data onto Flash drive

• Wait 5 minutes, plug flash drive into receiver• Two drive speeds

• 8MB/s - 1GB• 20MB/s - 2GB

2 Mbit/s

Page 21: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200621

Portable Storage Results

.. 1126s(~ 19 min)

Device Inserted

Page 22: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200622

Multipath Plugin: Load Balancing

• Varied capacity + delay of experimental links• Compare fastest link alone with multipath plugin

on both links; what speedup?

• Transferred 40MB file• 128 KB socket buffer sizes

Gigabit

Experimental links

Network Emulator

Page 23: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200623

Multipath Plugin is Effective

Link 1 Link 2 Single Multipath Savings

100/0100/0

10/03.59

1.90

3.54

47%

1.4%

100/66

100/66

10/66

1/66

43.33

23.20

22.97

38.25

46%

47%

12%

-40 MB @ 100Mbit/s ideal: 3.2 seconds

-Multipath plugin nearly doubles throughput

- TCP effects dominate. Pipe not full.

- Multipath plugin doubles by adding second stream. Actual capacity irrelevant.

Gigabit

Link 1

Link 2

Page 24: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200624

Postfix Email Trace Replay

• Generated 10,000 email messages from trace• Random data matched to chunk hash data• Preserves some similarity between messages• Replayed through Postfix to a single local server

• Postfix disk bound… DOT CPU overhead negligible• Savings due to duplication within emails

Program Seconds Bytes Sent

Postfix 468 172 MB

Postfix + DOT 468 117 MB (68%)

Page 25: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200625

Postfix Integration

• Integrated DOT with the Postfix mail server

• 1 part-time week, 1 student new to Postfix• Includes time to write generic adapter library

Program LoC Added LoC %

GTC Lib -- 421

Postfix 70,824 184 0.3%

smtpd 6,413 107 1.7%

smtp 3,378 71 2.1%

Page 26: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200626

Discussion on Deployment

• Application Resilience• DOT is a service - it’s outside the control of the

application.• Our Postfix falls back to normal SMTP if

– No Transfer Service contact– Transfer keeps failing

• In the short term, a simple fallback is encouraged. However, this could interfere with some functions

– DOT-based virus scanner…• In the long term, DOT would be a part of a

system’s core infrastructure

Page 27: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200627

Future Work

• Security• Application encrypts before DOT

- No block-based caching, reuse, mirroring, …• No encryption

- Resembles the status quo• In progress: Convergent encryption

– Requires integration with DOT chunking

• Application Preferences• Encryption, QoS, priorities, …

– DOT might benefit from application input• Need an extensible way to express these

Page 28: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200628

Conclusion

• DOT separates app. logic from data transfer• Makes it easier to extend both

• Architecture works well• Overhead low (especially in wide-area)• Major benefits

• Caching• Flexibility to implement new transfer techniques

Page 29: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200629

Backup Slides

Page 30: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200630

Normal SMTP

SMTP Client ServerEHLO

250 Hello

MAIL FROM: user…

DATA

250 OK

Page 31: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200631

DOT-Enabled SMTP

X-DOT-DATA

SMTP Client ServerEHLO

250 Hello

MAIL FROM: user…

(OID+Hints)

250 OKXfer Service

Page 32: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200632

Convergent Encryption

FileFileFileFile

Hash3Hash3

Hash1Hash1

Hash2Hash2

• Chunki is encrypted using Hashi

• All identical cleartext blocks will map to the same encrypted block

• Hashi is further encrypted using a private key

Page 33: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200633

Standard File Transfer

Page 34: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200634

Mail Server Evaluation

• Trace: 159 days at low volume academic mail server• 458,861 messages• hash, size of: message, headers, body• Message chunks

• hash and size of each chunk• Static chunking and Rabin fingerprinting

(Content-based block division)

Page 35: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200635

DOT chunk caching benefits email

Method Total Bytes Percent Bytes

SMTP Default 6800 MB -

DOT body 5876 MB 86.41%

Rabin body 5056 MB 74.35%

Rabin whole 5496 MB 80.81%

Page 36: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200636

Default GTC-GTC Transfer Protocol

ReceiverGET_DESCRIPTORS(OID)

Desc list 1,2,…

GET_CHUNKS(…)

Sender

Chunk 1Chunk 2

GET_CHUNKS(…)

• GTC-GTC protocol mirrors transfer plugins• Implemented as RPC calls• (Fetches are actually pipelined)

Page 37: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200637

Rabin Fingerprinting

4 7 8 2 8

File Data

Rabin Fingerprints

Given Value - 8

Natural Boundary Natural Boundary

Hash 1 Hash 2

Page 38: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200638

Rabin Fingerprinting: Examples of Edits

1. Original File

2. Addition in chunk– Changes only one hash

3. Addition creating a new breakpoint

4. Deletion changing size of chunkFigure from “A Low-bandwidth Network File System”

Page 39: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200639

DOT Objects Naming

• Objects represented by an OID• Divided into “chunks”, each with a descriptor

• Each OID corresponds to a list of descriptors • Data is fetched using descriptor lists• Supports partial transfers

Page 40: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200640

Innovation in Data Transfer is Hard

• Imagine: You have a novel data transfer technique• Say… Bittorrent, a P2P protocol for sharing large files

• How do you deploy?1.Update HTTP. Talk to IETF. Modify Apache, IIS,

Firefox, Netscape, Opera, IE, Lynx, Wget, …

2.Update SMTP. Talk to IETF. Modify Sendmail, Postfix, Exchange, Mail.app, Eudora, …

3.Give up in frustration

Page 41: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200641

(1) ApplicationAPI

DOT’s Modular Architecture

DOT

Application

Network

Local Storage

StoragePlugin

TransferPlugin

TransferPlugin

TransferPlugin

(3) Storage Plugin API

(2) Transfer Plugin API

Page 42: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200642

DOT Plugins

• Multipath plugin• List of sub-plugins• Balances load

• Portable storage plugin• Sender: Copies new data onto USB flash device• Receiver: Scans USB flash device for blocks

– naïve filesystem layout, unoptimized• but effective

Page 43: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200643

Portable Storage Results

.. 1126s(~ 19 min)

Device Inserted

Page 44: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200644

DOT email chunk caching evaluation

• Step 1: Caching analysis (infinite cache)• SMTP default: What was really sent• DOT body: Whole-body only caching, headers

sent separately• Rabin body: Headers sent separately, rabin

fingerprint chunking of body• Rabin whole: Headers+body chunked together

• Easiest to implement for application. Just send data…

• Step 2: Trace replay through Postfix

Page 45: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200645

Related Work• BEEP• Proxy-based data interposition approaches

– RON, X-Bone, OCALA

• Other Content-Addressable Systems– Bittorrent, DHTs, DTNs, EMC’s Centera

• Using Content-Addressability to save on data transfers– CASPER, LBFS, Rhea et al., Spring et al.

• Portable Storage– Lookaside Caching, BlueFS

• Other transfer protocols– GridFTP, IBP, HTTP, etc.

Page 46: An Architecture for Internet Data Transfer Niraj Tolia Michael Kaminsky*, David G. Andersen, and Swapnil Patil Carnegie Mellon University and *Intel Research

Niraj Tolia © May 200646

Transfer plugin API

• get_descriptors( OID, hints )• get_chunks( descriptor, hints )• cancel_chunks( chunk_list )

• Hints specify a plugin + data• gtc://sender.example.com:12000/• dht://opendht.org/• …

• Transfer plugin chaining is easy• e.g., multipath plugin