32
SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

Embed Size (px)

Citation preview

Page 1: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

SureMailNotification Overlay for Email Reliability

Sharad AgarwalVenkat Padmanabhan

Dilip A. Joseph

8 March 2006

Page 2: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 2

Outline

• Email loss problem

• Design philosophy

• SureMail design

• SureMail robustness to security attacks

• SureMail implementation

Page 3: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 3

What is Email Loss?

• Email loss : sent email not received

• Silent email loss– Loss w/o notification (no bounceback / DSN)

• Why?– Aggressive spam filters

• 90% corp. emails thrown away (blacklist)• AOL’s strict whitelist rules (must send 100/day)• Bouncebacks contribute to spam

– Complex mail architecture upgrades / failures• SMTP reliability is per hop, not end-to-end

Page 4: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 4

How Much Email Loss?

• Even loss of 1 email / user / year is bad– If it’s an important email

• To really measure loss– Monitor many users’ send & receive habits– Count how many sent emails not received– Count how many bouncebacks received– Difficult to find enough willing participants that

email each other across multiple domains

Page 5: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 5

Prior Work

• “The State of the Email Address”– Afergan & Beverly, ACM CCR 01.2005

• Rely on bouncebacks; similar to “dictionary” attack• 25% of tested domains send bouncebacks• 1 sender• 0.1% to 5% loss, across 1468 servers, 571 domains

• “Email dependability”– Lang, UNSW B.E. thesis 11.2004

• 40 accounts, 16 domains receive emails from 1 sender• Empty body, sequence number as subject• 0.69% silent loss

Page 6: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 6

Our Email Loss Study

• Methodology– Controller composes email, sends– Our code for SMTP sending– Outlook for receiving (both inbox & junk mail)– Parse sent and received emails into SQL DB– Match on {sender,receiver,subject,attachment}– Heuristics for parsing bouncebacks

• Want– Many sending, receiving accounts– Real email content

Page 7: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 7

Experiment Details

• Email accounts– 36 send, 42 receive– Junk filters off if possible

• Email subject & body– Enron corpus subset– 1266 emails w/o spam

• Email attachment– 70% no attachment– jpg,gif,ppt,doc,pdf,zip,htm– marketing,technical,funny

Domain Type Receive Sendmicrosoft.com Exchange 1 1fusemail.com IMAP 2 2aim.com IMAP 2 2yahoo.co.uk POP 2 2yahoo.com POP 2 2hotmail.com HTTP 1 Xgawab.com POP 2 2bluebottle.com POP 2 2orcon.net.nz POP 2 2nerdshack.com POP 2 2gmail.com POP 2 2eecs.berkeley.edu IMAP 2 2cs.columbia.edu IMAP 2 2cc.gatech.edu POP 2 2nms.lcs.mit.edu POP 2 2cs.princeton.edu POP 2 2cs.ucla.edu POP 2 Xcubinlab.ee.mu.oz.au POP 2 Xusc.edu POP 2 2cs.utexas.edu POP 2 2cs.waterloo.ca POP 2 2cs.wisc.edu IMAP 2 2

Page 8: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 8

Email Loss ResultsStart Date 11/18/2005End Date 1/11/2006Days 54Emails sent 138944Emails received 144949Emails lost 2530Total loss rate 1.82%Bouncebacks received 982Matched bouncebacks 878Unmatched bouncebacks 104Emails lost silently 1548Silent loss rate 1.11%Hard failures 565Conservative silent loss 0.71%

Page 10: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 10

Loss Rates by AttachmentAttachment Size Type Emails Emails Loss

(B) Sent Received Rate(none) 96631 1062 1.11nag_jpg2.jpg 48634 JPEG 1489 28 1.9Nehru_01.jpg 21192 JPEG 2632 55 2.1home_main.gif 35106 GIF 2723 65 2.4phd050305s.gif 65077 GIF 2905 43 1.5ActiveXperts_Network_Monitor2.ppt 105984 MSPowerPoint 2935 43 1.5vijay.ppt 48640 MSPowerPoint 1327 18 1.4CfpA4v10.doc 55808 MS Word 2900 43 1.5CHANG_1587051095.doc 25088 MS Word 2711 15 0.6SSA_PRODUCTSTRATEGY_final.doc 91136 MS Word 2822 33 1.234310344.pdf 32211 PDF 2867 25 0.9f1040v.pdf 47875 PDF 2805 42 1.5CHANG_1587051095.zip 4987 Zip 2940 19 0.6SSA_PRODUCTSTRATEGY_final.zip 43511 Zip 2776 64 2.3IMC_2005_-_Call_for_Papers.htm 10377 HTML 2773 32 1.2ActivePerl_5.8_-_Online_Docs___Getting_Started.htm24598 HTML 2757 23 0.8BBC_NEWS___Entertainment___Space_date_set_for_Scotty's_ashes.htm32776 HTML 2951 42 1.4

• Nothing stands out

Page 11: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 11

Loss Rates by Subject/Body

• ~50-250 emails sent per subject

• Without 35% case : loss rate 1.82% to 1.79%

0

5

10

15

20

25

30

35

40

1 60 119

178

237

296

355

414

473

532

591

650

709

768

827

886

945

1004

1063

1122

1181

1240

Subject

Lo

ss %

Page 12: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 12

Summary of Findings

• Email loss rates are high– 1.82% loss– 0.71% conservative silent loss ( 1 / 140 )

• Difficult to disambiguate cause of loss– Difference between domains (filters or servers?)– No difference between mailboxes– No difference between attachments– Only 1 body had abnormally high loss

Page 13: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 13

Outline

• Email loss problem

• Design philosophy

• SureMail design

• SureMail robustness to security attacks

• SureMail implementation

Page 14: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 14

We Found Email Loss; Now What?

• Can try to fix email architecture, but– Hard to know exactly what is problem– Spam filters continually evolve; not perfect– Some architectures are very complicated– How many email systems are out there?– The current system mostly works

Page 15: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 15

Fixing the Architecture

• Improve email delivery infrastructure– more reliable servers

• e.g., cluster-based (Porcupine [Saito ’00])

– server-less systems • e.g., DHT-based (POST [Mislove ’03])

– total switchover might be risky

• “Smarter” spam filtering– moving target mistakes inevitable

– non-content-based filtering still needed to cope with spam load

Page 16: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 16

Email Notifications

• DSN / bouncebacks– Most spam filters don’t generate DSN on drop– Bogus DSNs due to spam w/ bogus sender– Some MTAs block DSN for privacy– MTA crash may not generate DSN– No DSN for loss between MTA and MUA

• MDN / read receipts– Expose private info (when read, when online)– Can help spammers

Page 17: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 17

Notification Design Requirements

• Cause minimal MTA/MUA disruption

• Cause minimal user disruption

• Preserve asynchronous operation

• Preserve user privacy

• Preserve repudiability

• Maintain spam and virus defenses

• Minimize traffic overhead

Page 18: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 18

Outline

• Email loss problem

• Design philosophy

• SureMail design

• SureMail robustness to security attacks

• SureMail implementation

Page 19: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 19

SureMail Design Requirements

• Cause minimal MTA/MUA disruption• No MTA modification; no Outlook modification

• Cause minimal user disruption• User notified only on loss

• Preserve asynchronous operation• Preserve user privacy

• Only receiver is notified of loss

• Preserve repudiability• No PKI / authentication

• Maintain spam and virus defenses• Emails not modified

• Minimize traffic overhead• 85 byte notification per email

Page 20: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 20

Basic Operation

• Sender S sends email to receiver R– S also posts notification to overlay

• R periodically downloads new email– R also downloads notifications from overlay

• Notification without matching email loss– delay : median 26s, mean 276s, max 36.6 hrs

Page 21: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 21

SureMail Overview

Sender S Recipient R

You’ve Lost Mail!

Request lost message

Dreg=H2(R)

Reg

iste

r

Dnot=H1(R) Verify

GetNotifications

H1(Mnew),

H1(Mold),T,

MAC([T,H1(Mnew)]

,H2(Mold))

Page 22: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 22

SureMail Overview

• Emails, MTAs, MUAs unmodified

• Parallel notification overlay system– Decentralized; limited collusion– Agnostic to actual implementation

• end-host-based (e.g., always-on user desktops) • infrastructure-based (e.g., “NX servers”)

• Prevent notification snooping & spam– Email based registration– Reply based shared secret

Page 23: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 23

Email-Based Registration

• Goal: prevent hijacking of R’s notifications– Only R can receive emails sent to R– Limited collusion among notification nodes

• One-time operation for initial registration– R sends registration request to H2(R), H3(R)– H2(R), H3(R) email registration secrets to R

• To retrieve notifications at H1(R)– R uses registration secrets with H1(R); H1(R)

verifies with H2(R) H3(R), sends back notifications– Neither H1(R), H2(R), H3(R) can associate

notifications with R, unless they collude

Page 24: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 24

Reply-Based Shared Secret

• Goal: prevent notification spoofing & spam– Only R & S know their email conversations– S rarely converses with spammers

• Reply detection– S sends Mold to R, R replies with M’old

– S uses H(Mold) to “prove” identity to R in future

• Notification for Mnew from S to R– H1(Mnew),H1(Mold),T,MAC([T,H1(Mnew)],H2(Mold))

– Only R can identify S– Shared secret can be continually refreshed

Page 25: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 25

Attacks Defeated by Design

• X cannot retrieve H1(R) notifications

• H1(R) cannot identify R

• H2(R), H3(R) cannot see R’s notifications– If they don’t collude; can increase to 3 nodes

• X, H(R) cannot identify S

• X, H(R) cannot learn Mnew, Mold

• X cannot annoy R with bogus notifications

• X cannot masquerade post to H1(R) as S

Page 26: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 26

First Time Sender

• What if FTS email is lost?

• FTS & spammer generally indistinguishable

• But perhaps FTS knows I who knows R– Email networks have small world properties

– I makes shared secret SI with all known parties

– FTS sends email to R• Posts multiple notifications

• One for every SI it has learned

Page 27: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 27

Other Issues

• Reply-detection:– “in-reply-to” header may not always help– indirect checks based on text similarity

• Reducing overhead:– post notifications only for “important” emails– delay posting in hope of receiving implicit ACK (reply) or

NACK (bounce-back)

• Mobility: – reply-based shared secret can be regenerated– web-mail

• Can support mailing lists

Page 28: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 28

Outline

• Email loss problem

• Design philosophy

• SureMail design

• SureMail robustness to security attacks

• SureMail implementation

Page 29: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 29

SureMail Implementation

• Reply detection heuristic for shared secret

• Notification service– Centralized server running– Chord based DHT running

• Notification posting, retrieving– Grab in/out bound email via Outlook MAPI call

• No modification to Outlook binaries

– XML notification put/get commands– Simple Win32 GUI

Page 30: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 30

SureMail GUI• Client UI will see emails, will post & retrieve notifications• E.g. running on two machines [email protected] and [email protected]

Lost!Not lost

Page 31: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 31

Notification Results

Start Date 1/11/2006End Date 2/8/2006Days 29Emails sent 19435Emails lost 653Total loss rate 3.36%Bouncebacks received 406Matched bouncebacks 378Unmatched bouncebacks 28Emails lost silently 247Silent loss rate 1.27%Hard failures 70Conservative silent loss 0.91%Notifications received 19435

Page 32: SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

8.MAR.2006 32

Summary• Email does get lost!

– ~40 accounts, 158000 emails, 0.71%-0.91% silent loss• SureMail

– Client based – unmodified email, servers, clients; no PKI– User intervention only on lost email– Keeps repudiability, privacy, asynchronous, spam & virus

defense• Separate notification overlay robust

– Simple, small message format– No virus, malware, spam filters needed– Provides failure independence

• Status– ACM Hotnets 05; ACM Sigcomm 06 submission– Prototype implementation