B@bel: Leveraging Email Delivery for Spam Mitigation


Gianluca Stringhini, Manuel Egele, Apostolis Zarras, Thorsten Holz, Christopher Kruegel, and Giovanni Vigna. B@bel: Leveraging Email Delivery for Spam Mitigation. University of California, Santa Barbara; Ruhr-University Bochum. USENIX Security 2012.


B@bel: Leveraging Email Delivery for Spam Mitigation

USENIX Security 2012

Gianluca Stringhini, Manuel Egele, Apostolis Zarras, Thorsten Holz,

Christopher Kruegel, and Giovanni Vigna

University of California, Santa Barbara; Ruhr-University Bochum

李佳恆 leegoder@gmail.com

Outline

Introduction

Background

Approach

Evaluation

Conclusion

Introduction

KASPERSKY LAB. Spam Report: April 2012.

Email spam accounts for more than 77% of all email traffic

https://www.securelist.com/en/analysis/204792230/Spam_Report_April_2012

SYMANTEC CORP. State of spam & phishing report

http://www.symantec.com/business/theme.jsp?themeid=state_of_spam

About 85% of world-wide spam traffic is sent by botnets

Traditional spam detection systems

1. Content analysis

2. Origin-based

E.g., blacklists

============ New approach ============

Focus on the email delivery mechanism

(How messages are sent by spammers)

Background

SMTP

[Figure: SMTP mail flow, from Wikipedia. Mail user agent (MUA), e.g., Outlook; mail transfer agent (MTA), e.g., msa.hinet; e.g., Hotmail]

SMTP Conversation

SMTP

Reply:220 msr5.hinet.net ESMTP Sendmail 8.14.2/8.14.2; Sun, 29 Jul 2012 17:38:35 +0800 (CST)

EHLO adl.com

Reply:250-msr5.hinet.net Hello 114-34-35-96.HINET-IP.hinet.net [114.34.35.96], pleased to meet you

MAIL FrOm:<dada@msa.hinet.net>

Reply:250 2.1.0 <dada@msa.hinet.net>... Sender ok

rCpt tO: <leegoder@gmail.com>

Reply:250 2.1.5 <leegoder@gmail.com>... Recipient ok

Data

Reply:354 Enter mail, end with "." on a line by itself

SubJECT : HI i am dada

YOYOYO

test !!!`~~~~

...

.

Reply:250 2.0.0 q6T9cZtc012399 Message accepted for delivery

SMTP

The SMTP RFC defines 14 commands.

Each command code consists of four alphabetic characters and is case-insensitive

One or more space characters separate command codes and argument fields

All commands are terminated by a line terminator (<CR><LF>)

SMTP replies: three-digit status code + space + description

(one line, e.g., 250 OK)

RFC 821
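
The syntax rules above can be sketched as two small parsers. This is an illustrative sketch only, not code from the paper: the names `COMMAND_RE`, `REPLY_RE`, `parse_command`, and `parse_reply` are hypothetical, and only the one-line reply form mentioned on the slide is handled (multi-line replies such as `250-...` are rejected).

```python
import re

# Command code: four alphabetic characters, case-insensitive, optionally
# followed by one or more spaces and an argument field.
COMMAND_RE = re.compile(r"^([A-Za-z]{4})(?: +(.*))?$")
# One-line reply: three-digit status code, a space, then a description.
REPLY_RE = re.compile(r"^(\d{3}) (.*)$")


def parse_command(line: bytes):
    """Return (code, argument) for a <CR><LF>-terminated command, else None."""
    if not line.endswith(b"\r\n"):
        return None
    match = COMMAND_RE.match(line[:-2].decode("ascii", errors="replace"))
    if match is None:
        return None
    # Normalize only the command code; the argument is kept verbatim.
    return match.group(1).upper(), match.group(2) or ""


def parse_reply(line: bytes):
    """Return (status, description) for a one-line reply, else None."""
    if not line.endswith(b"\r\n"):
        return None
    match = REPLY_RE.match(line[:-2].decode("ascii", errors="replace"))
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

For example, `parse_command(b"rCpt tO: <leegoder@gmail.com>\r\n")` yields the code `RCPT` with the argument left untouched, which matters later because spacing and casing in the argument are part of what distinguishes dialects.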

Approach

SMTP Dialects

Different clients might implement the SMTP protocol in slightly different ways.

1. The RFCs do not always prescribe a single format (e.g., EHLO vs. HELO)

2. Clients using different extensions might add different parameters

3. Servers accept commands that do not comply with the strict SMTP definitions

Learning Dialects

Passive observation

A set of SMTP conversations

Each conversation is a sequence of <reply, command> pairs

E.g., <220 hinet.net, EHLO adl.com>
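
A minimal sketch of how a logged conversation could be turned into <reply, command> pairs. It assumes the log format used in the SMTP Conversation slide, where server lines start with `Reply:` and every other line comes from the client; the function name `to_pairs` is hypothetical, and lines sent after a 354 reply are treated as message content rather than commands.

```python
def to_pairs(transcript):
    """Pair each server reply with the client command that follows it."""
    pairs = []
    last_reply = None
    in_body = False  # True while the client is sending message content
    for line in transcript:
        if line.startswith("Reply:"):
            last_reply = line[len("Reply:"):].strip()
            # After "354" the client sends the message body, not commands.
            in_body = last_reply.startswith("354")
        elif in_body or last_reply is None:
            continue
        else:
            pairs.append((last_reply, line.strip()))
            last_reply = None
    return pairs
```

Applied to a log that starts with `Reply:220 hinet.net` followed by `EHLO adl.com`, the first pair is `("220 hinet.net", "EHLO adl.com")`, matching the example on the slide.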

Active probing

Send specifically-crafted replies to a client

And observe its responses

Active probing

Standard SMTP replies (e.g., send error)

Additional SMTP replies (e.g., send twice)

Out-of-order SMTP replies

Missing replies (never send a reply to a command)

Compliant replies (e.g., hOsT)

Incorrect replies (e.g., 9999)

Incorrectly terminated replies (e.g., <CR><CR>)
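
The reply variations listed above could be generated roughly as follows. This is a sketch under assumptions, not the probing code used in the paper: `probe_variants` and the dictionary keys are hypothetical names, the "compliant" variant is modeled as a case change, and out-of-order replies are omitted because they depend on the whole conversation rather than on a single reply.

```python
def probe_variants(code: int, text: str) -> dict:
    """Map each variation name to the list of raw replies the server would send."""
    standard = f"{code} {text}\r\n".encode("ascii")
    return {
        "standard": [standard],
        "additional": [standard, standard],  # the same reply sent twice
        "missing": [],  # never answer the command
        "compliant": [f"{code} {text.swapcase()}\r\n".encode("ascii")],
        "incorrect": [f"9999 {text}\r\n".encode("ascii")],  # invalid status code
        "incorrectly_terminated": [f"{code} {text}\r\r".encode("ascii")],
    }
```

A learner would send one variant at a time and record how the client reacts, for example whether it retries, disconnects, or carries on as if nothing happened.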

Regular expressions

MAIL FROM:<dada@msa.hinet.net>

MAIL FROM:gaga@msa.hinet.net

=> MAIL FROM:<email-addr>

Mail From :gaga@msa.hinet.net

=> Mail From :<email-addr>

E.g., <220 hinet.net, EHLO adl.com> => <220 hostname, EHLO domain>
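
A rough sketch of the generalization step, assuming that message-specific fields are replaced by placeholders while casing and spacing are preserved. The regular expressions and the names `EMAIL_RE`, `HOST_RE`, `command_template`, and `reply_template` are illustrative assumptions, not the patterns used by B@bel.

```python
import re

# An email address, with optional surrounding angle brackets.
EMAIL_RE = re.compile(r"<?[^\s<>@:]+@[^\s<>@:]+>?")
# A dotted host name whose last label is alphabetic (so "2.1.0" is not a host).
HOST_RE = re.compile(r"\b[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}\b")


def command_template(command: str) -> str:
    """Replace addresses and greeting domains, keeping casing and spacing."""
    out = EMAIL_RE.sub("<email-addr>", command)
    code, sep, arg = out.partition(" ")
    if code.upper() in ("EHLO", "HELO"):
        arg = HOST_RE.sub("domain", arg)
    return code + sep + arg


def reply_template(reply: str) -> str:
    """Replace host names in a server reply with a placeholder."""
    return HOST_RE.sub("hostname", reply)
```

Because only the variable fields are replaced, `MAIL FROM:<email-addr>` and `Mail From :<email-addr>` remain distinct templates, which is the property the dialect comparison relies on.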

wiki

State machine

<reply, command> corresponds to <transition, state>

E.g., <220 hostname, EHLO domain>

[Figure: example dialect state machines, labeled "spam" and "Gmail"]
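
One way to picture the dialect state machine is as a nested mapping in which states are command templates and transitions are labeled with reply templates. The sketch below assumes that representation; `learn_dialect` and the artificial `START` state are hypothetical, and the learning procedure in the paper may differ.

```python
from collections import defaultdict


def learn_dialect(conversations):
    """Return {state: {reply_template: set of next command templates}}."""
    transitions = defaultdict(lambda: defaultdict(set))
    for pairs in conversations:
        state = "START"  # artificial initial state before any command
        for reply, command in pairs:
            # The reply labels the transition; the command is the next state.
            transitions[state][reply].add(command)
            state = command
    # Convert to plain dicts so missing keys raise instead of being created.
    return {s: {r: set(c) for r, c in t.items()} for s, t in transitions.items()}
```

Feeding it the single conversation `[("220 hostname", "EHLO domain"), ("250 OK", "MAIL FROM:<email-addr>")]` produces a machine with one transition out of `START` and one out of `EHLO domain`.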

Decision State Machine

WOLF, W. An Algorithm for Nearly-Minimal Collapsing of Finite-State Machine Networks. ICCAD (1990).

Making a decision

<reply, command>

E.g., <220 hostname, EHLO domain> ...

E.g., <220 hostname, HELO domain>, <250 OK, MAIL FROM:<email-addr>> ...

E.g., <220 hostname, HELO domain>, <250 OK, RSET> ...

[Figure: decision outcomes for the example conversations, labeled C3, C2, and unknown]
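
The decision step can be approximated by narrowing a set of candidate dialects as each <reply, command> pair is observed. This is a deliberately simplified stand-in, not the decision state machine built with Wolf's algorithm: each dialect is reduced to the set of pairs it can produce, and the assignment of pairs to the labels C2 and C3 in the usage note is hypothetical.

```python
def decide(dialects, observed):
    """Return the sorted names of all dialects consistent with the observed pairs.

    An empty list means the conversation matches no known dialect (unknown);
    more than one name means the observation is not yet conclusive.
    """
    candidates = set(dialects)
    for pair in observed:
        candidates = {name for name in candidates if pair in dialects[name]}
        if not candidates:
            break  # nothing left to narrow down
    return sorted(candidates)
```

With `C2` containing `("250 OK", "RSET")` and `C3` containing `("250 OK", "MAIL FROM:<email-addr>")`, and both containing `("220 hostname", "HELO domain")`, the greeting alone leaves both candidates, the second pair settles the decision, and an `EHLO` greeting matches neither.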

The Botnet Feedback Mechanism

Some spammers take server feedback into account

e.g., recipient address does not exist

Cutwail: 35% of the email addresses did not exist [38]

Providing false responses to spam emails.

[38] http://www.iseclab.org/papers/cutwail-LEET11.pdf
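
A sketch of what providing a false response might look like on the server side, assuming the sender's dialect has already been identified. The function `rcpt_reply`, its parameters, and the reply texts are illustrative assumptions; only the idea of reporting a nonexistent recipient to a bot comes from the slides.

```python
def rcpt_reply(sender_dialect: str, bot_dialects: set, recipient_exists: bool) -> str:
    """Choose the reply to RCPT TO, lying to clients that speak a bot dialect."""
    if sender_dialect in bot_dialects or not recipient_exists:
        # To a bot this claims the address is invalid even when it exists,
        # so the botnet's feedback removes a valid address from its list.
        return "550 5.1.1 User unknown"
    return "250 2.1.5 Recipient ok"
```

Legitimate senders still receive truthful replies, so the poisoned feedback only reaches clients already classified as bots.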

Evaluation

Environment

B@bel

1. Virtual machine zoo

2. Gateway

3. Learner => decision FSM =>

4. Decision maker

Evaluating Dialects for Classification

Run B@bel

Training set (13 legitimate, 91 malware)

Legitimate MUAs and MTAs are distinct from bots

Legitimate MUAs and MTAs all speak distinct dialects (except for Outlook Express and Windows Live Mail)

91 malware samples: 48 dialects

Samples with the same dialect belong to the same family

Evaluating Dialects for Spam Detection

Run B@bel

SMTP conversations for 621,919 email messages (40 days)

7,114 bot samples [4] >> bad dialects

MUAs + MTAs + webmail >> good dialects

Passive spam detection

Decision machine does not recognize the conversation >> mark as spam

621,919 emails (all)

260,074 spam, 218,675 ham, 143,170 undecided (??)
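
As a quick consistency check, the three categories add up to the total number of messages. The short calculation below uses only the figures from the slide; the variable names are arbitrary.

```python
total = 621919
counts = {"spam": 260074, "ham": 218675, "undecided": 143170}

# The three categories should account for every observed message.
assert sum(counts.values()) == total

# Share of each category, in percent of all messages.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
```

The `shares` dictionary shows how much of the traffic the decision machine could classify either way and how much remained undecided.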

Verification

True positives

IP blacklists (30) + domain resolution

99.32% true positives

False negatives

21% false negatives

(misused webmail accounts, dedicated MTAs)

(half are legitimate MTAs)

Limitations and Evasion

Evading dialect detection:

Use an existing open-source SMTP engine (CDO)

But spambots are built for performance

Bagle (a spam bot): 20 ms per email

CDO (Windows): 200 ms per email

(CDO: Collaboration Data Objects library)

Conclusion

We introduced a novel way to detect and mitigate spam emails

We studied how the feedback mechanism used by botnets can be poisoned

Empirical results confirm that our approach can be used to detect and mitigate spam emails.

THANKS
