Shebuti & Leman Stony Brook University Shebuti Rayana Leman Akoglu

Event Detection and Characterization in Dynamic Graphs


Page 1: Event Detection and Characterization in Dynamic Graphs

Shebuti & Leman

Stony Brook University

Shebuti Rayana, Leman Akoglu

Page 2: Event Detection and Characterization in Dynamic Graphs

Network intrusion

Healthcare fraud

Credit card fraud

Tax evasion

Event Detection & Characterization in Dynamic Graphs

& Many More…

Page 3: Event Detection and Characterization in Dynamic Graphs


Problem: Given a sequence of graphs,

Q1. Event detection: find the time points at which the graph changes significantly

Q2. Characterization: find (top k) nodes / edges / regions that change the most


Page 4: Event Detection and Characterization in Dynamic Graphs


Main framework

Compute graph similarity/distance scores

Find unusual occurrences in time series
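The two framework steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the distance (sum of absolute edge-weight differences between consecutive adjacency matrices), the median/MAD outlier rule, the threshold of 3.5, and the function names `distance_series` and `flag_unusual` are all assumptions chosen for brevity.

```python
import numpy as np

def distance_series(snapshots):
    """Distance between each pair of consecutive adjacency matrices.

    Assumed distance: sum of absolute edge-weight differences.
    """
    return np.array([np.abs(b - a).sum()
                     for a, b in zip(snapshots, snapshots[1:])])

def flag_unusual(series, threshold=3.5):
    """Indices whose robust z-score (median/MAD based) exceeds the threshold."""
    series = np.asarray(series, dtype=float)
    med = np.median(series)
    mad = np.median(np.abs(series - med))
    if mad == 0:
        # Degenerate case: at least half of the values equal the median.
        return np.flatnonzero(series > med)
    robust_z = 0.6745 * (series - med) / mad
    return np.flatnonzero(robust_z > threshold)
```

Any graph similarity or distance measure could replace the edit-style distance here; only the resulting time series matters for the second step.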

[Figure: graph snapshots along a time axis]

Page 5: Event Detection and Characterization in Dynamic Graphs

Flow of Ensemble Approach Event Detection in Dynamic Graphs

Ensemble Algorithms
• Eigen Behavior based Event Detection (EBED)
• Probabilistic Approach (PTSAD)
• SPIRIT

Consensus Method
• Rank based
• Score based

Results
• Dataset 1: Challenge Network flow Data
• Dataset 2: New York Times News Corpus

Page 6: Event Detection and Characterization in Dynamic Graphs

Event Detection

Consensus Rank Merging
• Rank based
  • Inverse Rank
  • Kemeny Young
• Score based
  • Unification (avg, max)
  • Mixture Model (avg, max)
• Final Ensemble (Inverse Rank)

Characterization

Page 7: Event Detection and Characterization in Dynamic Graphs


Numerous algorithms for event detection

Hard to decide which one will work well for a specific data set

Our goal: design an ensemble approach that may not give the best result but is “better” than most base algorithms

Challenges:

Different scores/scales

Different merging approaches


Page 8: Event Detection and Characterization in Dynamic Graphs


Extract “typical behavior” (eigen-behavior) of nodes/edges

eigen-behavior ≡ principal eigen-vector

Compare eigen-behavior over time

Score the time ticks depending on the amount of change in behavior from the previous time tick.

Mark the ones with high score as anomalous.

[Figure: T × N matrix of node time series. Feature: Degree]

Page 9: Event Detection and Characterization in Dynamic Graphs

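[Figure: data cube with axes Nodes, Features (egonet), and Time (T). For one feature (degree) it reduces to a T × N matrix. Sliding windows (W) over the matrix give the eigen-behavior at t and the earlier eigen-behaviors, whose right singular vector is the past pattern.]

Change-score metric: $Z = 1 - u^{T} r$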
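The eigen-behavior scoring can be sketched as below. This is a hedged illustration rather than the authors' code. It reads u as the eigen-behavior at time t and r as the past pattern, which is an interpretation of the figure labels. The use of the node-to-node correlation matrix inside each window, the window lengths `w` and `w_past`, the absolute value around the dot product (eigenvector signs are arbitrary), and the function names are all assumptions.

```python
import numpy as np

def eigen_behavior(window):
    """Principal eigenvector of the node-node correlation matrix of a (w x N) window."""
    centered = window - window.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0             # constant series: avoid division by zero
    z = centered / std
    corr = z.T @ z / window.shape[0]
    _, vecs = np.linalg.eigh(corr)  # eigenvalues come back in ascending order
    return vecs[:, -1]

def ebed_scores(X, w=10, w_past=3):
    """Change score Z per time tick for a (T x N) feature matrix X."""
    T = X.shape[0]
    past, scores = [], np.zeros(T)
    for end in range(w, T + 1):
        u = eigen_behavior(X[end - w:end])
        if len(past) >= w_past:
            # Past pattern r: first right singular vector of the stacked
            # (w_past x N) matrix of earlier eigen-behaviors.
            _, _, vt = np.linalg.svd(np.vstack(past[-w_past:]),
                                     full_matrices=False)
            r = vt[0]
            # abs() is an assumption: it makes Z insensitive to sign flips.
            scores[end - 1] = 1.0 - abs(u @ r)
        past.append(u)
    return scores
```

Time ticks with the largest `scores` would then be the candidates to mark as anomalous.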


Page 10: Event Detection and Characterization in Dynamic Graphs

Individual node/edge time series modeled with distributions:

Poisson

Zero-inflated Poisson

Hurdle Process

▪ Hurdle Component: Bernoulli & Markov Chain

▪ Count Component: Zero-truncated Poisson

Model Selection: AIC, log likelihood, Vuong’s test and log gain

Find single-sided p-value as the probability of observing a count as extreme as v [P(X ≥ v)]
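The single-sided p-value can be sketched for two of the listed models. This is an illustration under assumptions: the hurdle component here is a plain Bernoulli (the Markov chain variant is omitted), the zero-truncated Poisson rate is fitted by matching the mean of the positive counts with bisection, and all function names are hypothetical.

```python
import math

def poisson_tail(v, lam):
    """P(X >= v) for X ~ Poisson(lam)."""
    if v <= 0:
        return 1.0
    pmf, cdf = math.exp(-lam), 0.0
    for k in range(v):              # accumulate P(X = 0), ..., P(X = v - 1)
        cdf += pmf
        pmf *= lam / (k + 1)
    return max(0.0, 1.0 - cdf)

def fit_hurdle(counts):
    """Return (pi, lam): pi = P(X > 0), lam = zero-truncated Poisson rate."""
    positives = [c for c in counts if c > 0]
    pi = len(positives) / len(counts)
    if not positives:
        return pi, 0.0
    mean_pos = sum(positives) / len(positives)
    if mean_pos <= 1.0:
        return pi, 1e-9             # all positive counts are 1: lam tends to 0
    lo, hi = 1e-9, mean_pos         # solve lam / (1 - exp(-lam)) = mean_pos
    for _ in range(100):
        mid = (lo + hi) / 2
        if mid / (1 - math.exp(-mid)) < mean_pos:
            lo = mid
        else:
            hi = mid
    return pi, (lo + hi) / 2

def hurdle_tail(v, pi, lam):
    """P(X >= v) under the Bernoulli hurdle + zero-truncated Poisson model."""
    if v <= 0:
        return 1.0
    if pi == 0:
        return 0.0
    return pi * poisson_tail(v, lam) / (1 - math.exp(-lam))
```

A small p-value for the observed count v at a time tick marks that node or edge as unusually active under the fitted model.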


Page 11: Event Detection and Characterization in Dynamic Graphs


Page 12: Event Detection and Characterization in Dynamic Graphs


Streaming Pattern dIscoveRy in multIple Time-series (SPIRIT) [Papadimitriou et al. 2005]

Discovers trends: whenever the trend changes, it introduces a new hidden variable, and removes one when it is no longer needed

Detects anomalous points in trends

Node weights change at each step

At a change point the node which has highest weight is most anomalous
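The weight-tracking step at the core of SPIRIT can be sketched as below. This is a simplified illustration under assumptions: it tracks a single hidden variable only and omits the rule that adds and removes hidden variables. The function name `track_one_trend`, the forgetting factor `lam=0.96`, and the initial energy `d0` are illustrative choices, not values from the slides.

```python
import numpy as np

def track_one_trend(X, lam=0.96, d0=0.01):
    """Track one hidden variable over the rows of a (T x N) matrix X.

    Returns the hidden-variable series y (length T) and the node weights
    after each step (T x N).
    """
    T, N = X.shape
    w = np.zeros(N)
    w[0] = 1.0                      # start from a unit vector
    d = d0                          # running energy of the hidden variable
    y_hist = np.zeros(T)
    w_hist = np.zeros((T, N))
    for t in range(T):
        x = X[t]
        y = w @ x                   # projection onto current weights
        d = lam * d + y * y         # exponentially forgotten energy
        e = x - y * w               # reconstruction error
        w = w + (y / d) * e         # move weights toward the data direction
        y_hist[t] = y
        w_hist[t] = w
    return y_hist, w_hist
```

With `y, W = track_one_trend(X)`, the nodes at a change point `t` can be listed by weight magnitude with `np.argsort(-np.abs(W[t]))`, largest first.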


Page 13: Event Detection and Characterization in Dynamic Graphs


Event Detection

Characterization

Page 14: Event Detection and Characterization in Dynamic Graphs

Rank based
• Inverse Rank
• Kemeny Young [J. Kemeny 1959]

Score based
• Unification [Zimek et al. 2011] - avg & max
• Mixture Model [Jing et al. 2006] - avg & max

Final Ensemble: inverse rank
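Two of the merging options can be sketched as below. These are assumptions about the details, not the authors' exact procedure: inverse rank is taken as the mean of 1/rank across lists, and unification is shown with Gaussian scaling (an error-function transform of standardized scores), which is one common way to map heterogeneous outlier scores to [0, 1]. Function names are hypothetical.

```python
import math
import numpy as np

def ranks_from_scores(scores):
    """Rank 1 goes to the highest score."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

def inverse_rank_merge(rank_lists):
    """Consensus score per item: mean of 1/rank across the rank lists."""
    inv = 1.0 / np.asarray(rank_lists, dtype=float)
    return inv.mean(axis=0)

def gaussian_unify(scores):
    """Map raw scores to [0, 1] with Gaussian scaling (assumed variant)."""
    s = np.asarray(scores, dtype=float)
    sigma = s.std()
    if sigma == 0:
        return np.zeros_like(s)
    z = (s - s.mean()) / (sigma * math.sqrt(2))
    return np.maximum(0.0, np.array([math.erf(v) for v in z]))

def unification_merge(score_lists, how="avg"):
    """Combine unified scores across detectors by average or maximum."""
    unified = np.vstack([gaussian_unify(s) for s in score_lists])
    return unified.mean(axis=0) if how == "avg" else unified.max(axis=0)
```

Sorting items by the merged score in descending order gives the final rank list in either case.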

[Figure: RankList1 to RankList3 and ScoreList1 to ScoreList3 feed the Consensus step, which outputs FinalRankList]

Page 15: Event Detection and Characterization in Dynamic Graphs


We were given a “Cyber Challenge Network” from NGAS R&T Space Park

Simulated cyber network traffic

10 days of activity

125 hosts

To-from information with timestamps

Find “suspicious” events and the entities associated with the corresponding events in Challenge Network.
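Turning to-from records into a per-tick degree feature might look like the sketch below. The record layout `(src, dst, timestamp_in_seconds)`, the host names, and the function name `degree_series` are hypothetical. The 10-minute tick follows the sample rate reported with the results, and counting distinct neighbors as the degree is an assumption.

```python
from collections import defaultdict
import numpy as np

def degree_series(records, hosts, tick_seconds=600):
    """Build a (ticks x hosts) matrix of per-tick degrees.

    records: list of (src, dst, timestamp_in_seconds) tuples.
    Degree = number of distinct hosts a host talks to within the tick.
    """
    index = {h: i for i, h in enumerate(hosts)}
    t0 = min(ts for _, _, ts in records)
    t1 = max(ts for _, _, ts in records)
    n_ticks = int((t1 - t0) // tick_seconds) + 1
    neighbors = [defaultdict(set) for _ in range(n_ticks)]
    for src, dst, ts in records:
        tick = int((ts - t0) // tick_seconds)
        neighbors[tick][src].add(dst)
        neighbors[tick][dst].add(src)
    X = np.zeros((n_ticks, len(hosts)))
    for tick, per_host in enumerate(neighbors):
        for host, peers in per_host.items():
            X[tick, index[host]] = len(peers)
    return X
```

Ten days at 10-minute resolution gives 1440 time ticks, so the resulting matrix has one row per tick and one column per host.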


Page 16: Event Detection and Characterization in Dynamic Graphs

[Figure: event scores per time tick for the three base algorithms. Feature: Degree. Panels: Eigen-behaviors (Z-score), Probabilistic Approach (1 – norm. (sum p-value)), SPIRIT (projection). x-axis: time tick.]

Page 17: Event Detection and Characterization in Dynamic Graphs

[Figure: node scores at time tick 376 for the three base algorithms. Panels: Eigen-behaviors (relative activity change), Probabilistic Approach (normalized |log(p)|), SPIRIT (projection weight). x-axis: nodes.]

Page 18: Event Detection and Characterization in Dynamic Graphs

Algorithm                            Sample rate (10 min)

Base Algorithms
  EBED                               0.8333
  PTSAD                              0.5722
  SPIRIT                             0.7292

Consensus Rank Merging Algorithms
  Inverse Rank (1/R)                 1.0000
  Kemeny Young                       0.8095
  Unification (avg)                  0.8056
  Unification (max)                  0.7255
  Mixture model (avg)                0.1684
  Mixture model (max)                0.1684

Final Ensemble (1/R)                 0.8667

Average Precision Table (Feature: Degree)
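For reference, average precision over a ranked list can be computed as in the sketch below. It uses the standard definition (mean of the precision values at the ranks where true events appear). Whether the tables were computed with exactly this variant is an assumption, and the function name is illustrative.

```python
def average_precision(ranked_items, relevant):
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for k, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            total += hits / k       # precision at rank k
    return total / len(relevant) if relevant else 0.0
```

A detector that ranks every true event ahead of all other time ticks scores 1.0; each false alarm placed above a true event lowers the value.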

Page 19: Event Detection and Characterization in Dynamic Graphs

Algorithm                            Event at 376    Event at 1126

Base Algorithms
  EBED                               1.0000          1.0000
  PTSAD                              1.0000          0.2500
  SPIRIT                             0.3026          0.0213

Consensus Rank Merging Algorithms
  Inverse Rank (1/R)                 1.0000          0.5000
  Kemeny Young                       1.0000          0.2000
  Unification (avg)                  1.0000          1.0000
  Unification (max)                  0.8333          1.0000
  Mixture model (avg)                1.0000          1.0000
  Mixture model (max)                1.0000          1.0000

Final Ensemble (1/R)                 1.0000          1.0000

Average Precision Table for Node anomalies Feature: Degree [Sample rate 10 min]

Page 20: Event Detection and Characterization in Dynamic Graphs


Page 21: Event Detection and Characterization in Dynamic Graphs


~8 years (Jan 2000 - July 2007) of articles published in the New York Times

Graph links: Co-mention of named entities (people, places, organizations)

Sample rate: 1 week

No ground truth
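Building weekly co-mention graphs could be sketched as below. The input layout (one `(date, entity_list)` pair per article), the use of ISO calendar weeks, and the function names are assumptions. The weighted degree computed here is the sum of co-mention counts on a node's edges.

```python
from collections import Counter, defaultdict
from datetime import date
from itertools import combinations

def weekly_comention_graphs(articles):
    """Map (ISO year, ISO week) -> Counter of co-mention counts per entity pair."""
    graphs = defaultdict(Counter)
    for day, entities in articles:
        week = tuple(day.isocalendar()[:2])
        # Each unordered pair of distinct entities in an article is one co-mention.
        for a, b in combinations(sorted(set(entities)), 2):
            graphs[week][(a, b)] += 1
    return graphs

def weighted_degree(edge_weights):
    """Sum of edge weights incident to each entity."""
    deg = Counter()
    for (a, b), w in edge_weights.items():
        deg[a] += w
        deg[b] += w
    return deg
```

Applying `weighted_degree` to each week's graph yields one feature value per entity per week, the kind of time series the base detectors operate on.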

Big Events detected:

January 2001 – George W. Bush takes office as US president

September 11, 2001 – Terrorist attack on the WTC

February 1, 2003 – Space Shuttle Columbia Disaster


Page 22: Event Detection and Characterization in Dynamic Graphs

[Figure: event scores over time on the New York Times data. Feature: Weighted Degree. Panels: Eigen-behaviors (Z-score), Probabilistic Approach (1 – norm. (sum p-value)), SPIRIT (projection). Annotated events: 2001 election, 9/11 WTC attack, Columbia disaster.]

Page 23: Event Detection and Characterization in Dynamic Graphs


Page 24: Event Detection and Characterization in Dynamic Graphs


Heterogeneous detectors

different scores

different effectiveness (depending on dataset)

Ensemble for event detection on dynamic graphs

Multiple consensus (merging) approaches

two-phase consensus finding


Page 25: Event Detection and Characterization in Dynamic Graphs


Near-future: Robust consensus by automatically selecting effective base algorithms

Challenge: no ground truth

Near-future: real-time detection

Event detection under diverse data sources (e.g., news media, social media, the Web, …)

Challenges: different entity types, different time granularity, entity resolution


Page 26: Event Detection and Characterization in Dynamic Graphs


Judge a man by his questions rather than his answers. - Voltaire


Event Detection

Characterization


http://www.cs.stonybrook.edu/~datalab/