Multiple classifier systems under attack
Battista Biggio, Giorgio Fumera, Fabio Roli
Dept. of Electrical and Electronic Eng., Univ. of Cagliari
http://prag.diee.unica.it
9th International Workshop on Multiple Classifier Systems
Outline
● Adversarial classification
● MCSs in adversarial classification tasks
● Some experimental results
Adversarial classification
[Figure: two adversarial classification tasks. Spam filtering: a legitimate e-mail ("Subject: MCS2010 Suggested tours. Dear MCS 2010 Participant, attached please find the offers we negotiated with the travel agency...") and a spam e-mail ("Subject: Need affordable Drugs?? Order from Canadian Pharmacy & Save You Money. We are having Specials Hot Promotion this week!..."). Biometric verification: a genuine claim ("I am John Smith") and an impostor claim ("I am Bob Brown"), matched against a template database (J. Smith, B. Brown).]

Two pattern classes: legitimate, malicious

Examples:
● Biometric verification and recognition
● Intrusion detection in computer networks
● Spam filtering
● Network traffic identification
● ...
Adversarial classification
[Figure: attacks on the two tasks. Spam filtering, attack by bad word obfuscation and good word insertion: the spam e-mail "Subject: Need affordable Drugs?? Order from Canadian Pharmacy & Save You Money. We are having Specials Hot Promotion this week!..." is rewritten as "Subject: Need affordab1e D r u g s?? Order from (anadian Ph@rmacy & S@ve You Money. We are having Specials H0t Promotion this week!...", with legitimate-looking sentences appended ("Don't you guys ever read a paper? Moyer's a gentleman now..."). Biometric verification, attack by fingerprint spoofing: the impostor ("I am Bob Brown") submits a spoofed fingerprint to be matched against the template database (J. Smith, B. Brown).]
Adversarial classification
Main issues:
● vulnerabilities of pattern recognition systems
● performance evaluation under attack
● design of pattern recognition systems robust to attacks
Multiple classifier systems in adversarial environments
[Figure: a multimodal biometric verification system. The impostor's claim "I am Bob Brown" is processed by multiple matchers against the template database (J. Smith, B. Brown); their scores are combined by a fusion rule into an Accepted/Rejected decision.]

Multimodal biometric systems: more accurate than unimodal ones
Multiple classifier systems in adversarial environments
[Figure: the same multimodal biometric system, again attacked by the impostor claiming "I am Bob Brown".]

Multimodal biometric systems: more accurate than unimodal ones.
And also more robust to attacks (?)

Analogous claims in other applications (spam filtering, network intrusion detection, etc.)
Aim of our work
Main issues in adversarial classification:
● vulnerabilities of pattern recognition systems
● performance evaluation under attack
● design of pattern recognition systems robust to attacks

Our goal: to investigate whether and how MCSs can improve the robustness of pattern recognition systems under attack
Linear classifiers under attack
[Figure: the spam message "Buy viagra!" is represented by the feature vector x = [1 0 1 0 0 0 0 0 ...]; its camouflaged version "Buy vi4gr4!", padded with legitimate text ("Did you ever play that game when you were a kid where the little plastic hippo tries to gobble up all your marbles?"), is represented by x' = [1 0 0 0 1 0 0 1 ...].]

The adversary exploits some knowledge on
● the features
● the classifier's decision function

An example: spam filtering, linear classifiers
f(x) = sign { ω1x1 + ω2x2 + ... + ωNxN + ω0 }
xi ∈ {0,1}; f(x) = +1: spam; f(x) = -1: legitimate
Linear classifiers under attack
The adversary exploits some knowledge on
● the features
● the classifier's decision function

f(x) = sign { ω1x1 + ω2x2 + ... + ωNxN + ω0 }

Example weights: buy: 0.5, viagra: 2.0, kid: -0.5, game: -2.0; bias ω0 = -0.9

Buy viagra!         0.5 + 2.0 - 0.9 = 1.6 > 0: spam
Buy vi4gr4!         0.5 - 0.9 = -0.4 < 0: legitimate
Buy viagra! game    0.5 + 2.0 - 2.0 - 0.9 = -0.4 < 0: legitimate
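The toy example above can be reproduced in a few lines. A minimal sketch, using the slide's toy word weights (the function names are ours, not part of the original filter):

```python
# Toy linear spam filter: score(x) = sum of the weights of the words
# present in the message, plus the bias w0; classified spam iff score > 0.
WEIGHTS = {"buy": 0.5, "viagra": 2.0, "kid": -0.5, "game": -2.0}
W0 = -0.9

def score(words):
    """Linear discriminant: per-word weights plus the bias term."""
    return sum(WEIGHTS.get(w, 0.0) for w in words) + W0

def classify(words):
    return "spam" if score(words) > 0 else "legitimate"

# Original spam is detected: 0.5 + 2.0 - 0.9 = 1.6 > 0.
print(classify(["buy", "viagra"]))           # spam
# Bad word obfuscation: "vi4gr4" falls outside the vocabulary, 0.5 - 0.9 = -0.4.
print(classify(["buy", "vi4gr4"]))           # legitimate
# Good word insertion: "game" contributes its large negative weight.
print(classify(["buy", "viagra", "game"]))   # legitimate
```

Both attacks succeed with a single word change because one weight (viagra: 2.0, game: -2.0) dominates the score.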
Linear classifiers under attack
Possible strategy to improve the robustness of linear classifiers: keep the weights as uniform as possible (Kolcz and Teo, 6th Conf. on Email and Anti-Spam, CEAS 2009)
f(x) = sign { ω1x1 + ω2x2 + ... + ωNxN + ω0 }

Example weights: buy: 1.0, viagra: 1.5, kid: -1.0, game: -1.5; bias ω0 = -0.9

Buy viagra!           1.0 + 1.5 - 0.9 = 1.6 > 0: spam
Buy vi4gr4!           1.0 - 0.9 = 0.1 > 0: spam
Buy viagra! game      1.0 + 1.5 - 1.5 - 0.9 = 0.1 > 0: spam
Buy viagra! kid game  1.0 + 1.5 - 1.0 - 1.5 - 0.9 = -0.9 < 0: legitimate
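The effect of weight uniformity can be quantified as the minimum number of word changes a worst-case adversary needs. A sketch under the slide's toy weights (greedy attack, our own helper): each move either obfuscates a present positive-weight word or inserts an absent negative-weight word.

```python
def min_changes_to_evade(weights, w0, message):
    """Greedy count of word obfuscations/insertions needed to drive a
    spam message's score to 0 or below, assuming the adversary knows
    the weights (worst case for the classifier)."""
    present = set(message)
    # Score reduction offered by each possible move: obfuscating a
    # present word with weight w > 0 removes w; inserting an absent
    # word with weight w < 0 subtracts |w|.
    gains = sorted(
        [w for word, w in weights.items() if word in present and w > 0] +
        [-w for word, w in weights.items() if word not in present and w < 0],
        reverse=True)
    score = sum(weights[w] for w in present if w in weights) + w0
    changes = 0
    for g in gains:
        if score <= 0:
            break
        score -= g
        changes += 1
    return changes

nonuniform = {"buy": 0.5, "viagra": 2.0, "kid": -0.5, "game": -2.0}
uniform    = {"buy": 1.0, "viagra": 1.5, "kid": -1.0, "game": -1.5}

print(min_changes_to_evade(nonuniform, -0.9, ["buy", "viagra"]))  # 1
print(min_changes_to_evade(uniform, -0.9, ["buy", "viagra"]))     # 2
```

With the non-uniform weights one change suffices (obfuscate "viagra" or insert "game"); with the more uniform weights the adversary needs two, matching the slide's arithmetic.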
Ensembles of linear classifiers under attack
Do randomisation-based MCS techniques result in more uniform weights of linear base classifiers?
● bagging
● random subspace method
● ...

(accuracy-robustness trade-off)
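Whether randomisation actually yields more uniform weights is the empirical question studied here, but one identity makes the question well posed: an average-of-scores ensemble of linear classifiers is itself a linear classifier whose weight vector is the average of the base weight vectors, so its uniformity can be measured directly. A minimal numpy sketch (toy data; least-squares base learners stand in for the SVM/LR classifiers used in the experiments):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 10 Boolean features; label driven by features 0 and 1.
X = rng.integers(0, 2, size=(200, 10)).astype(float)
y = np.where(X[:, 0] + X[:, 1] > 0.5, 1.0, -1.0)

def fit_linear(X, y):
    """Least-squares linear discriminant; returns weights plus bias."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

# Bagging: train each base classifier on a bootstrap replicate.
ensemble = []
for _ in range(10):
    idx = rng.integers(0, len(X), size=len(X))
    ensemble.append(fit_linear(X[idx], y[idx]))

# Averaging the base scores equals scoring with the averaged weights:
w_avg = np.mean(ensemble, axis=0)
Xb = np.hstack([X, np.ones((len(X), 1))])
scores_fused = np.mean([Xb @ w for w in ensemble], axis=0)
assert np.allclose(scores_fused, Xb @ w_avg)
```

The same reduction applies to the random subspace method if each base classifier's weight vector is zero-padded on the features it did not see.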
Experimental setting (1)
● Spam filtering task
● TREC 2007 data set (20,000 out of > 75,000 e-mails, 2/3 spam)
● Features: bag of words (word occurrence), > 360,000 features
● Base linear classifiers: SVM, Logistic Regression
● MCS:
   ● ensemble size: 3, 5, 10
   ● bagging: 20%, 100% training samples
   ● RSM: 20%, 50%, 80% feature subset sizes
● 5 runs
● Evaluation of performance under attack: worst-case BWO/GWI attack, for m obfuscated/added words (m = "attack strength")
Performance measure
[Figure: a Receiver Operating Characteristic (ROC) curve, TP vs. FP, with the area under the curve for FP in [0, 0.1] (AUC10%) highlighted.]

TP = Prob [ f(X) = Malicious | Y = Malicious ]
FP = Prob [ f(X) = Malicious | Y = Legitimate ]
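AUC10% can be computed by integrating the empirical ROC staircase up to FP = 0.1. A sketch (our own helper, normalised by 0.1 so a perfect classifier scores 1; the normalisation used in the experiments may differ):

```python
import numpy as np

def auc10(scores_pos, scores_neg, fp_max=0.1):
    """Area under the ROC curve restricted to FP in [0, fp_max],
    divided by fp_max so that a perfect classifier scores 1."""
    thresholds = np.unique(np.concatenate([scores_pos, scores_neg]))[::-1]
    tp = np.array([np.mean(scores_pos >= t) for t in thresholds])
    fp = np.array([np.mean(scores_neg >= t) for t in thresholds])
    tp = np.concatenate([[0.0], tp, [1.0]])
    fp = np.concatenate([[0.0], fp, [1.0]])
    # Interpolate the curve at fp_max and integrate up to that point.
    tp_at_max = np.interp(fp_max, fp, tp)
    keep = fp <= fp_max
    fp_part = np.concatenate([fp[keep], [fp_max]])
    tp_part = np.concatenate([tp[keep], [tp_at_max]])
    # Trapezoidal rule over the clipped curve.
    area = np.sum((fp_part[1:] - fp_part[:-1]) * (tp_part[1:] + tp_part[:-1]) / 2.0)
    return area / fp_max

# A perfectly separating classifier attains AUC10% = 1.
print(auc10(np.array([0.9, 0.8, 0.7]), np.array([0.1, 0.2, 0.3])))  # 1.0
```

Restricting the integration range matters in spam filtering because only the low-FP region of the ROC is operationally relevant.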
Measure of weights uniformity
Kolcz and Teo, 6th Conf. on Email and Anti-Spam (CEAS 2009)

F(K) = (sum of the absolute values of the top-K weights) / (sum of the absolute values of all weights), K = 1, ..., N

[Figure: F(K) plotted against K for three weight distributions |ω1|, ..., |ωN|: for the most uniform weights F(K) grows linearly from 0 to 1; for the least uniform weights F(K) rises steeply towards 1 at small K.]
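The F(K) measure is just a normalised cumulative sum of the sorted absolute weights. A minimal sketch (the function name is ours):

```python
import numpy as np

def weight_uniformity(w):
    """F(K) of Kolcz and Teo: the fraction of the total absolute weight
    mass carried by the K largest-magnitude weights, K = 1, ..., N.
    The closer F(K) stays to K/N, the more uniform the weights."""
    mags = np.sort(np.abs(np.asarray(w, dtype=float)))[::-1]
    return np.cumsum(mags) / mags.sum()

# Perfectly uniform weights: F(K) = K/N.
print(weight_uniformity([1.0, 1.0, 1.0, 1.0]))
# A single dominant weight: F(K) jumps to 1 immediately.
print(weight_uniformity([4.0, 0.0, 0.0, 0.0]))
```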
Results (1)
[Figure: performance under attack (AUC10%) as a function of the number of obfuscated/added words (attack strength m).]
Experimental setting (2)
● SpamAssassin
● About N = 900 Boolean “tests” x1, x2, ..., xN, xi ∈ {0,1}
● Decision function: f(x) = sign { ω1x1 + ω2x2 + ... + ωNxN + ω0 },
   f(x) = +1: spam; f(x) = -1: legitimate
● Default weights: machine learning + manual tuning
● Evaluation of performance under attack: evasion of the worst m tests (m = "attack strength")
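The worst-case evasion above can be sketched directly: an adversary who knows the weights suppresses the m triggered tests whose weights push the score most towards spam. The weights and the fired-test set below are made-up toy values, not actual SpamAssassin scores.

```python
def score_after_evasion(weights, w0, fired, m):
    """Score of a message after evading the m worst (largest positive
    weight) tests among those it fires."""
    contribs = sorted((weights[i] for i in fired), reverse=True)
    for _ in range(m):
        # Only evading a positive-weight test can lower the score.
        if contribs and contribs[0] > 0:
            contribs.pop(0)
    return sum(contribs) + w0

weights = [2.5, 1.8, 1.2, 0.4, -0.3]   # toy per-test weights
fired = [0, 1, 3]                      # tests triggered by a spam message
print(score_after_evasion(weights, -1.0, fired, 0))  # 2.5 + 1.8 + 0.4 - 1.0
print(score_after_evasion(weights, -1.0, fired, 2))  # 0.4 - 1.0, now negative
```

Sweeping m from 0 upwards traces the classifier's score degradation as a function of attack strength, which is how the curves in the next slide are produced.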
Results (2)
[Figure: performance under attack (AUC10%) as a function of the number of evaded tests (attack strength m).]
Conclusions
● Adversarial classification: which roles can MCSs play?
● This work:
   ● linear classifiers
   ● attacks based on some knowledge about features and decision function (case study: spam filtering)
● Future work: investigating MCSs with different applications, base classifiers, kinds of attacks, ...