Email Alias Detection Using Social Network Analysis Ralf Holzer, Bradley Malin, Latanya Sweeney LinkKDD 2005 Advisor: Dr. Koh Jia-Ling Reporter: Che-Wei, Liang Date: 2008/08/14


Email Alias Detection Using Social Network Analysis Ralf Holzer, Bradley Malin, Latanya Sweeney

LinkKDD 2005

Advisor: Dr. Koh Jia-Ling
Reporter: Che-Wei, Liang

Date: 2008/08/14


Outline

• Introduction
• Alias Detection Method
  – Data Representation
  – Ranking Algorithms
• Experiment


Introduction

• Individuals use aliases for various communication purposes

• Alias detection
  – Useful to both legitimate and illegitimate applications
  – Important to understand the extent to which the process can be automated


Introduction

• Aliases listed on the same webpage can indicate that some form of relationship exists between them

• Many people use several email addresses
  – This paper attempts to determine which email addresses correspond to the same entity



Data Representation

• Let S represent the set of sources
• Modeled as an undirected graph G = (I, E)
  – I is the set of unique email addresses
  – C_ab = |e_ab| denotes the number of sources associated with the edge connecting a and b
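
The representation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `build_graph` and `collocation_count`, the toy addresses, and the rule that an edge exists whenever two addresses appear in the same source are not taken from the slides.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(sources):
    """Map each undirected edge {a, b} to the set of sources where a and b co-occur.

    `sources` maps a source id (e.g. a web page) to the email addresses found on it.
    Assumption: two addresses are connected when they share at least one source.
    """
    edge_sources = defaultdict(set)
    for source_id, aliases in sources.items():
        # De-duplicate addresses within a source, then link every pair.
        for a, b in combinations(sorted(set(aliases)), 2):
            edge_sources[frozenset((a, b))].add(source_id)
    return dict(edge_sources)


def collocation_count(edge_sources, a, b):
    """C_ab: the number of sources associated with the edge between a and b."""
    return len(edge_sources.get(frozenset((a, b)), ()))


# Hypothetical toy data, not from the paper's data set.
sources = {
    "page1": ["alice@cs.example", "alice@mail.example", "bob@cs.example"],
    "page2": ["alice@cs.example", "alice@mail.example"],
    "page3": ["bob@cs.example", "carol@cs.example"],
}
edges = build_graph(sources)
```

Storing the set of sources per edge, rather than only the count, keeps enough information to derive both C_ab and per-source sizes later.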


Ranking Algorithms

• Ranking method
  – Top-k list of possible aliases
  – Shortest path algorithm
    • Uses geodesic distance to generate a ranking of nodes closest to a given originating node

• Relationship strength is augmented with
  – Number of aliases on a source
  – Number of collocations of aliases
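
A minimal sketch of the top-k ranking idea, assuming an unweighted graph stored as an adjacency dictionary. The function name `geodesic_ranking`, the toy graph, and the alphabetical tie-breaking are illustrative choices, not details from the paper.

```python
from collections import deque


def geodesic_ranking(adjacency, origin, k):
    """Return the k nodes closest to `origin` by hop count, as (node, distance) pairs."""
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:  # first visit in BFS gives the shortest hop count
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    # Rank from lowest to highest distance; ties are broken alphabetically (assumption).
    ranked = sorted((d, n) for n, d in dist.items() if n != origin)
    return [(n, d) for d, n in ranked[:k]]


# Hypothetical toy graph: a-b, a-c, b-d, d-e.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b", "e"},
    "e": {"d"},
}
top3 = geodesic_ranking(adjacency, "a", 3)
```

Nodes that are unreachable from the originating node never enter `dist`, so they are simply absent from the ranking.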


Ranking Algorithms

• Geodesic distance
  – Length of the shortest path from a to b
  – Potential aliases are ranked from lowest to highest geodesic distance

• Multiple Collocation
  – Two aliases that collocate on more than one webpage have a stronger relationship
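
One way to fold multiple collocation into the shortest-path ranking is to make an edge cheaper the more sources support it. The sketch below assumes an edge cost of 1 / C_ab and uses Dijkstra's algorithm; that cost function and the name `weighted_ranking` are assumptions for illustration and may differ from the weighting used in the paper.

```python
import heapq


def weighted_ranking(counts, origin, k):
    """Rank nodes by weighted shortest-path distance from `origin`.

    `counts` maps an edge (a, b) to C_ab, the number of sources where a and b collocate.
    Assumed edge cost: 1 / C_ab, so more collocations mean a shorter edge.
    """
    adjacency = {}
    for (a, b), c in counts.items():
        cost = 1.0 / c
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))

    dist = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in adjacency.get(node, ()):
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))

    ranked = sorted((d, n) for n, d in dist.items() if n != origin)
    return [(n, d) for d, n in ranked[:k]]


# Hypothetical counts: a and c collocate on four sources, a and b on only one.
counts = {("a", "b"): 1, ("a", "c"): 4, ("c", "d"): 2}
ranking = weighted_ranking(counts, "a", 3)
```

In this toy graph, d is two hops from a but reachable through frequently collocated pairs, so it ranks ahead of b, which is one hop away on a single shared source. Plain geodesic distance would order them the other way.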


Ranking Algorithms

• Source Size
  – Strength between two aliases is inversely correlated with the number of aliases in a source

• Combined
  – Integrates both of the previous assumptions
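
The two assumptions can be illustrated with simple edge strengths. The formulas here are assumptions chosen for clarity, not the paper's definitions: each shared source contributes 1 / (number of aliases on it); the source-size strength takes the largest such contribution, and the combined strength sums the contributions so that repeated collocation also counts. The function name `edge_strengths` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def edge_strengths(sources):
    """Return (size_only, combined) strength dictionaries keyed by sorted alias pairs."""
    size_only = defaultdict(float)
    combined = defaultdict(float)
    for aliases in sources.values():
        unique = sorted(set(aliases))
        if len(unique) < 2:
            continue  # a source with one alias creates no edges
        contribution = 1.0 / len(unique)  # smaller sources give stronger evidence
        for pair in combinations(unique, 2):
            size_only[pair] = max(size_only[pair], contribution)
            combined[pair] += contribution  # also rewards multiple collocations
    return dict(size_only), dict(combined)


# Hypothetical sources: two small pages and one large listing.
sources = {
    "small1": ["a", "b"],
    "small2": ["a", "b"],
    "large": ["a", "b", "c", "d"],
}
size_only, combined = edge_strengths(sources)
```

To reuse a shortest-path ranking, a strength could be converted into an edge cost, for example by taking its reciprocal; that conversion is likewise an assumption.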


Experiment

• Derived from CMU web pages
  – 1978 distinct email aliases

• Data Set Statistics


Experiment

• Geodesic Alias Distances
