Contribution and Proposed Solution Sequence-Based Features Collective Classification with Reports Results of Classification Using Reports Collective Spammer Detection in Evolving Multi-Relational Social Networks Shobeir Fakhraei 1,2 , James Foulds 2, Madhusudana Shashanka 3 , and Lise Getoor 2 Problem Statement Data Graph Structure Features 7 1 University of Maryland, College Park, MD, USA 2 University of California, Santa Cruz, CA, USA 3 if(we) Inc. (Currently Niara Inc., CA, USA) 8 • We have a time-stamped multi- relational social network with legitimate users and spammers. • links = actions at time t (e.g. profile view, message, or poke). • Task: Snapshot of the social network + Labels of already identified spammers Find other spammers in the network. Motivation • Spam is pervasive in social networks. • Traditional approaches don't work well: • Spammers can manipulate content-based approaches. E.g., change patterns, split malicious content across messages. • Content may not be available due to privacy reasons. • Spammers have more ways to interact with users in social networks compared to email and the web. • A data sample from Tagged.com, including all active users and their activities in a specific timeframe. • Tagged is a social network for meeting new people with multiple methods for users to interact. • It was founded in 2004 and has over 300 million registered members. • Use only the multi-relational meta- data for spammer detection: • Graph Structure. • Action Sequences. • Collectively refine user generated abuse reports. In each relation graph we compute: • PageRank: Score for each node based on number and quality of links to it. • Degree: Total degree, in-degree, and out-degree of each node. • k-Core: Centrality measure via recursive pruning of the least connected vertices. • Graph Coloring: Assignment of colors to vertices, where no two adjacent vertices share the same color. • Connected Components: Group of vertices with a path between each. • Triangle Count: Number of triangles the vertex participates in. • Sequential k-gram Features: Short sequence segment of k consecutive actions, to capture the order of events. • Mixture of Markov Models: Also called chain-augmented or tree- augmented naive Bayes model to capture longer sequences. HL-MRFs and Probabilistic Soft Logic • Hinge-loss Markov random fields (HL-MRFs) are a general class of conditional, continuous probabilistic models. • Probabilistic soft logic (PSL) uses a first-order logical syntax as a templating language for HL- MRFs. • General rules: • Rule satisfaction: • Predicates have soft truth values between [0,1] • Distance from satisfaction: • Most probable explanation (MPE) by optimizing: Graph Structure and Sequence-Based Results Users can report abusive behavior, but the reports contain a lot of noise. • Model using only reports: • Model using reports and credibility of the reporter: • Model using reports, credibility of the reporter, and collective reasoning: • Complete framework includes graph structure and sequence features, and three demographic features (i.e., age, gender, and time since registration). • We used Graphlab Create for feature extraction and classification with Gradient-Boosted Decision Trees. Precision-Recall ROC

Contribution and Proposed Solution Sequence-Based Features Collective Classification with Reports Results of Classification Using Reports Collective Spammer

Download PPTX Report

Upload
lynne-simpson
View
223
Download
0

Tags:

Embed Size (px)

Citation preview

Contribution and Proposed Solution

Sequence-Based Features

Collective Classification with Reports

Results of Classification Using Reports

Collective Spammer Detection in Evolving Multi-Relational Social Networks

Shobeir Fakhraei1,2, James Foulds2, Madhusudana Shashanka3, and Lise Getoor2

Problem Statement

Data

Graph Structure Features

1University of Maryland, College Park, MD, USA2University of California, Santa Cruz, CA, USA

3if(we) Inc. (Currently Niara Inc., CA, USA)

• We have a time-stamped multi-relational social network with legitimate users and spammers.

• links = actions at time t (e.g. profile view, message, or poke).

• Task: Snapshot of the social network + Labels of already identified spammers

Find other spammers in the network.

Motivation

• Spam is pervasive in social networks. • Traditional approaches don't work well:

• Spammers can manipulate content-based approaches. E.g., change patterns,

split malicious content across messages. • Content may not be available due to

privacy reasons.• Spammers have more ways to interact with

users in social networks compared to email and the web.

• A data sample from Tagged.com, including all active users and their activities in a specific timeframe.• Tagged is a social network for meeting new

people with multiple methods for users to interact.

• It was founded in 2004 and has over 300 million registered members.

• Use only the multi-relational meta-data for spammer detection:• Graph Structure.• Action Sequences.

• Collectively refine user generated abuse reports.

In each relation graph we compute:

• PageRank: Score for each node based on number and quality of links to it.

• Degree: Total degree, in-degree, and out-degree of each node.

• k-Core: Centrality measure via recursive pruning of the least connected vertices.

• Graph Coloring: Assignment of colors to vertices, where no two adjacent vertices share the same color.

• Connected Components: Group of vertices with a path between each.

• Triangle Count: Number of triangles the vertex participates in.

• Sequential k-gram Features: Short sequence segment of k consecutive actions, to capture the order of events.

• Mixture of Markov Models: Also called chain-augmented or tree-augmented naive Bayes model to capture longer sequences.

HL-MRFs and Probabilistic Soft Logic

• Hinge-loss Markov random fields (HL-MRFs) are a general class of conditional, continuous probabilistic models.

• Probabilistic soft logic (PSL) uses a first-order logical syntax as a templating language for HL-MRFs.

• General rules:

• Rule satisfaction:

• Predicates have soft truth values between [0,1]

• Distance from satisfaction:

• Most probable explanation (MPE) by optimizing:

Graph Structure and Sequence-Based Results

Users can report abusive behavior, but the reports contain a lot of noise.

• Model using only reports:

• Model using reports and credibility of the reporter:

• Model using reports, credibility of the reporter, and collective reasoning:

• Complete framework includes graph structure and sequence features, and three demographic features (i.e., age, gender, and time since registration).

• We used Graphlab Create for feature extraction and classification with Gradient-Boosted Decision Trees.

Precision-Recall ROC

Sparsification and Sampling of Networks for Collective Classification

Documents

Social Honeypots: Making Friends With A Spammer Near You · Social Honeypots: Making Friends With A Spammer Near You Steve Webb College of Computing Georgia Institute of Technology

Documents

Collective Agreement Template - MGEU · 6767 except those covered by other collective agreements and those ... this collective agreement, ... classification within the bargaining

Documents

Paco, el spammer de la esquina

Documents

Advanced Mail. Computer Center, CS, NCTU 2 Introduction SPAM vs. non-SPAM Mail sent by spammer vs. non-spammer Problem of SPAM mail Over 99% of E-mails

Documents

Collective Classification with Relational Dependency Networks

Documents

Hemorrhoid - Prince of Songkla Universitymedinfo2.psu.ac.th/surgery/Collective review/2561/6... · (arteriovenous fistula)(14) ... รุนแรงโดยใช Goligher’s classification:

Documents

Detecting Opinion Spammer Groups and Spam Targets through Community Discovery … · 2018. 2. 5. · To the best of our knowledge, this is the ﬁrst attempt to identify opinion spammer

Documents

Collective Spammer Detection in Evolving Multi-Relational ...Collective Classification with Reports Results of Classification Using Reports Collective Spammer Detection in Evolving

Documents

COLLECTIVE AGREEMENT between the INTERIOR … · employees' union (bcgeu) ... training and service career policy ... classification specification attachment - road classification

Documents

Emailez sans spammer

Documents

N4DS V01X09 Convention collective, classification et code ...€¦ · N4DS V01X09 Convention collective, classification et code Métier BTP Dernière mise à jour le 11 août 2014

Documents

Collective Classification for Packed Executable Identification - CEAS 2011

Technology

How does facebook tag you as a spammer- Facebook Marketing Tips

Technology

Paco, el spammer de la esquina (#smpbcn)

Technology

Spammer Presentation Hope Sievert2

Technology

How To Be A Twitter Spammer

Entertainment & Humor

Case-Based Collective Inference for Maritime Object ... · approach on maritime river traffic video captured by our system. We found that case-based collective classification significantly

Documents

COLLECTIVE AGREEMENT · 7.02 (a) The Company and the Union recognize the following classification groups: CLASSIFICATION GROUP A: Tool & Die Marker and Machine Tool Builder Leader

Documents

The Role of Job Classification in Collective Bargaining

Documents

Web2.0 Spammer @ World: Follow me on Twitter!!!...Web2.0 Spammer @ World: Follow me on Twitter!!! Alexandru Cătălin Coşoi Senior Researcher / AntiSpam Laboratory BitDefender Twitter

Documents

Social Honeypots: Making Friends With A Spammer Near You

Documents

A PRACTICAL GUIDE TO THE CLASSIFICATION OF ... classification of financial instruments under IAS 32’ (the Guide). The Guide reflects the collective experience of Grant Thornton International’s

Documents

Social Spammer Detection: A Multi-Relational Embedding ...Social Spammer Detection: A Multi-Relational Embedding Approach Jun Yin 1; 2, Zili Zhou 3, Shaowu Liu , Zhiang Wu?, and Guandong

Documents

[WEBINAR] De spammer a rockstar en 4 pasos

Business

UNIT-III Social Groups –Groups and Classification ...cms.gcg11.ac.in/attachments/article/212/social psychology unit 3... · Collective Behaviour The term "collective behavior" was

Documents

COLLECTIVE CLASSIFICATION OF SOCIAL NETWORK SPAM · COLLECTIVE CLASSIFICATION OF SOCIAL NETWORK SPAM by JONATHAN BROPHY A THESIS ... CURRICULUM VITAE NAME OF AUTHOR: Jonathan Brophy

Documents

Web2.0 Spammer @ World: Follow me on Twitter!!! · Web2.0 Spammer @ World: Follow me on Twitter!!! Alexandru Cătălin Coşoi Senior Researcher / AntiSpam Laboratory BitDefender

Documents

KNOWLEDGE AREA CLASSIFICATION SYSTEM FOR RESEARCH ...€¦ · This document reflects the collective work of the CSREES Program Classification Steering Committee and the Knowledge

Documents

Deep Collective Classification in Heterogeneous Information …web.cs.wpi.edu/~xkong/publications/papers/ · 2020-02-17 · Deep Collective Classification in Heterogeneous Information

Documents