Uncovering Social Network Sybils in the Wild

Zhi Yang (Peking University), Christo Wilson (UC Santa Barbara), Xiao Wang (Peking University), Tingting Gao (Renren Inc.), Ben Y. Zhao (UC Santa Barbara), Yafei Dai (Peking University)


2

Sybils on OSNs
• Large OSNs are attractive targets for…
  ▫ Spam dissemination
  ▫ Theft of personal information
• Sybil, sɪbəl, Noun: a fake account that attempts to create many friendships with honest users
  ▫ Friendships are a precursor to other malicious activity
  ▫ Does not include benign fakes
• Research has identified malicious Sybils on OSNs
  ▫ Twitter [CCS 2010]
  ▫ Facebook [IMC 2010]

3

Understanding Sybil Behavior
• Prior work has focused on spam
  ▫ Content, dynamics, campaigns
  ▫ Includes compromised accounts
• Open question: What is the behavior of Sybils in the wild?
  ▫ Important for evaluating Sybil detectors
• Partnership with largest OSN in China: Renren
  ▫ Leverage ground-truth data on 560K Sybils
  ▫ Develop measurement-based, real-time Sybil detector
  ▫ Deployed, caught additional 100K Sybils in 6 months

4

Outline
• Introduction
• Sybils on Renren
• Sybil Analysis
• Conclusion

5

Sybils on Renren
• Renren is the oldest and largest OSN in China
  ▫ 160M users
  ▫ Facebook’s Chinese twin
• Ad-hoc Sybil detectors
  ▫ Threshold-based spam traps
  ▫ Keyword and URL blacklists
  ▫ Crowdsourced account flagging
• 560K Sybils banned as of August 2010

6

Sybil Detection 2.0
• Developed improved Sybil detector for Renren
  ▫ Analyzed ground-truth data on existing Sybils
  ▫ Identified four reliable Sybil indicators
    1. Friend Request Frequency
    2. Outgoing Friend Requests Accepted
    3. Incoming Friend Requests Accepted
    4. Clustering Coefficient
• Evaluated threshold and SVM detectors
  ▫ Similar accuracy for both
  ▫ Deployed threshold, less CPU intensive, real-time

Detector     Sybil     Non-Sybil
SVM          98.99%    99.34%
Threshold    98.68%    99.5%
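A threshold detector over the four indicators can be sketched roughly as below. This is a minimal illustration, not Renren's deployed rule: the field names, the cut-off values, and the direction of each comparison (Sybils send requests often, have few of them accepted, accept most incoming requests, and have sparsely connected friends) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AccountFeatures:
    """The four indicators named on the slide, for one account."""
    friend_requests_per_hour: float  # friend request frequency
    outgoing_accept_ratio: float     # fraction of sent requests that were accepted
    incoming_accept_ratio: float     # fraction of received requests this account accepted
    clustering_coefficient: float    # how interconnected the account's friends are


# Hypothetical cut-offs for illustration only; not Renren's deployed values.
MAX_REQUESTS_PER_HOUR = 20.0
MIN_OUTGOING_ACCEPT_RATIO = 0.5
MAX_INCOMING_ACCEPT_RATIO = 0.9
MIN_CLUSTERING_COEFFICIENT = 0.01


def looks_like_sybil(f: AccountFeatures) -> bool:
    """Flag an account only when all four thresholded indicators agree."""
    return (
        f.friend_requests_per_hour > MAX_REQUESTS_PER_HOUR
        and f.outgoing_accept_ratio < MIN_OUTGOING_ACCEPT_RATIO
        and f.incoming_accept_ratio > MAX_INCOMING_ACCEPT_RATIO
        and f.clustering_coefficient < MIN_CLUSTERING_COEFFICIENT
    )
```

A rule of this shape needs only a handful of comparisons per account, which is consistent with the slide's point that the threshold detector is less CPU intensive than the SVM and can run in real time.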

7

Detection Results
• Caught 100K Sybils in the first six months
  ▫ Vast majority are spammers
  ▫ Many banned before generating content
• Low false positive rate
  ▫ Use customer complaint rate as signal
  ▫ Complaints evaluated by humans
  ▫ 25 real complaints per 3000 bans (<1%)

Spammers attempted to recover banned Sybils by complaining to Renren customer support!

More details in the paper

8

Outline
• Introduction
• Sybils on Renren
• Sybil Analysis
• Conclusion

9

Community-based Sybil Detectors
• Prior work on decentralized OSN Sybil detectors
  ▫ SybilGuard, SybilLimit, SybilInfer, SumUp
  ▫ Key assumption: Sybils form tight-knit communities

[Diagram labels: Edges Between Sybils; Attack Edges]

10

Do Sybils Form Connected Components?

[Figure: CDF (%) of degree, log-scale x-axis, with three curves: Sybils, Edges Between Sybils Only; Sybils, All Edges; Normal Users]

• 80% have degree = 0: no edges to other Sybils!
• Vast majority of Sybils blend completely into the social graph
• Few communities to detect
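The "Edges Between Sybils Only" measurement can be illustrated with a short sketch. It assumes the graph is available as a list of unique undirected `(u, v)` pairs and the ground-truth Sybils as a set; both names, and the toy data in the usage note, are hypothetical.

```python
def sybil_only_degrees(edges, sybils):
    """Return {sybil: number of its friends that are also Sybils}."""
    degree = {s: 0 for s in sybils}
    for u, v in edges:
        # Count an edge only when both endpoints are Sybils.
        if u != v and u in sybils and v in sybils:
            degree[u] += 1
            degree[v] += 1
    return degree


def fraction_isolated(edges, sybils):
    """Fraction of Sybils with no edge to any other Sybil (degree = 0)."""
    degrees = sybil_only_degrees(edges, sybils)
    return sum(1 for d in degrees.values() if d == 0) / len(degrees)
```

On a toy graph with Sybils `s1` to `s5` where only `s1` and `s2` are friends with each other, `fraction_isolated` returns 0.6. The slide reports the corresponding figure on Renren as 80%.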

11

Can Sybil Components be Detected?

[Figure: log-log plot, both axes from 1 to 10000; x-axis: Edges Between Sybils; y-axis: Attack Edges]

• Sybil components are internally sparse
• Not amenable to community detection
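The comparison behind this plot can be sketched as follows: group Sybils into connected components using only Sybil-to-Sybil edges, then count each component's internal edges against its attack edges (edges from a Sybil to a non-Sybil). The input layout (a list of unique undirected pairs plus a set of Sybil IDs) and the function name are assumptions for illustration, and the union-find used here is simply one convenient way to find components.

```python
from collections import defaultdict


def sybil_components(edges, sybils):
    """Per-component size, Sybil-to-Sybil edge count, and attack edge count."""
    parent = {s: s for s in sybils}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge components along edges whose endpoints are both Sybils.
    for u, v in edges:
        if u in sybils and v in sybils:
            parent[find(u)] = find(v)

    stats = defaultdict(lambda: {"size": 0, "sybil_edges": 0, "attack_edges": 0})
    for s in sybils:
        stats[find(s)]["size"] += 1
    for u, v in edges:
        u_sybil, v_sybil = u in sybils, v in sybils
        if u_sybil and v_sybil:
            stats[find(u)]["sybil_edges"] += 1
        elif u_sybil or v_sybil:
            # Exactly one endpoint is a Sybil: an attack edge.
            stats[find(u if u_sybil else v)]["attack_edges"] += 1
    return list(stats.values())
```

A component whose `attack_edges` count is far larger than its `sybil_edges` count is the internally sparse pattern the slide describes, which runs against the tight-knit assumption made by community-based detectors.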

12

Sybil Cluster Analysis

[Figure labels: Sybil Accounts; Edges Between Sybils; Creation Order]

• Are edges between Sybils formed intentionally?
  ▫ Temporal analysis indicates random formation
• How are random edges between Sybils formed?
  ▫ Surveyed Sybil management tools: Renren Marketing Assistant V1.0, Renren Super Node Collector V1.0, Renren Almighty Assistant V5.8
  ▫ Biased sampling for friend request targets
  ▫ Likelihood of Sybils inadvertently friending each other is high

More details in the paper

13

Outline
• Introduction
• Sybils on Renren
• Sybil Analysis
• Conclusion

14

Conclusion
• First look at Sybils in the wild
  ▫ Ground-truth from inside a large OSN
  ▫ Deployed detector is still active
• Sybils are quite sophisticated
  ▫ Cheap labor → very realistic fakes
  ▫ Created and managed by hand
• Need for new, decentralized Sybil detectors
  ▫ Results may not generalize beyond Renren
  ▫ Evaluation on other large OSNs

15

Questions?

Slides and paper available at http://www.cs.ucsb.edu/~bowlin

Christo Wilson
UC Santa Barbara

P.S.: I’m on the job market…

16

Backup Slides
Only use in case of emergency!

17

Creation of Edges Between Sybils

[Figure labels: Sybil Accounts; Edges Between Sybils; Creation Order]

The majority of edges between Sybils form randomly

18

Friend Target Selection

[Figure: CDF (%) of degree, x-axis from 0 to 1000, with two curves: All Users; Sybil Friend Request Targets]

• High degree nodes are often Sybils!
• Sybils unknowingly friend each other
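The effect of biased target selection can be illustrated with a small calculation. The sketch below assumes one specific form of bias, sampling targets in proportion to their degree; the slides say only that sampling is biased toward high-degree targets, so the weighting, the function name, and the toy numbers are all hypothetical.

```python
def sybil_hit_probability(degrees, sybils, degree_biased):
    """Probability that one sampled friend-request target is a Sybil.

    degrees: {account: friend count}; sybils: set of Sybil accounts.
    degree_biased=False samples uniformly over accounts;
    degree_biased=True samples each account in proportion to its degree.
    """
    def weight(account):
        return degrees[account] if degree_biased else 1

    total = sum(weight(a) for a in degrees)
    return sum(weight(a) for a in degrees if a in sybils) / total


# Toy population: two high-degree Sybils among four ordinary users.
toy_degrees = {"s1": 400, "s2": 300, "u1": 100, "u2": 100, "u3": 50, "u4": 50}
toy_sybils = {"s1", "s2"}
uniform = sybil_hit_probability(toy_degrees, toy_sybils, degree_biased=False)
biased = sybil_hit_probability(toy_degrees, toy_sybils, degree_biased=True)
```

In this toy population, a uniformly chosen target is a Sybil one time in three, while a degree-weighted target is a Sybil with probability 0.7. Under these assumptions, a tool that favors popular accounts will often send requests to other Sybils without intending to.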