Structure2Vec: Deep Learning for
Security Analytics over Graphs
Le Song
Principal Engineer, AI Department, Ant Financial
Associate Professor, College of Computing
Georgia Institute of Technology
2
Alibaba Ecosystem
Complex networks are everywhere
3
User
Device
GPS
Account
Phone
Card
Shop
Company
Company
Shop
Device
GPS
Card
WIFI Phone No.
Account
Fake account detection
Fake account can increase system level risk
Need to shut down fake accounts
4
Account – device network
Bad
guy
Good
guy
device
New accounts in a month
millions of nodes and millions of edges.
RecallP
recision
detection comparisonRule Structure2vec
Cash-out detection
Cash-out for credit loans create vicious cycles
Need to detect cash-out transactions
5
Shop Owner
Account
Alipay
Account 1
Normal Bank
Account
Alipay
Account 2
Use $1000
credit
Same
person
Transfer
$800 cash
Transfer
$800 cash
Recall
Pre
cisi
on
Structure2vec
node2vec
+ xgboost
Network of millions of transactions a day
detection comparison
Fundamental challenge
6
(city=Atlanta) AND (age=40)
(neighbor=A) XOR (neighbor=B)
(bought=car) OR (time<3 years)
Explosive combinations!
Manual feature designNetworks
Timing
Various apps
Learn
Representation?
Representation Learning
via Structure2Vec
𝑋6
𝑋1
𝑋2 𝑋3
𝑋4
𝑋5
Structure2Vec embedding (mean field)
𝜇2(0)
𝜇1(0)
𝜇6(0)
𝜇3(0)
𝜇5(0)
𝜇4(0)
Obtain embedding via
iterative update algorithm:
1. Initialize 𝜇𝑖(0)
= 0, ∀ 𝑖
2. Iterate 𝑇 times
𝜇𝑖(𝑡)
← 𝜎 𝑊1𝑋𝑖 + 𝑊2
𝑗∈𝒩 𝑖
𝜇𝑗(𝑡−1)
, ∀𝑖
𝜇1(1)
8
Parameterized
as neural network
[Dai, et al. 2016]
Structure2Vec embedding (mean field)
Obtain embedding via
iterative update algorithm:
1. Initialize 𝜇𝑖(0)
= 0, ∀ 𝑖
2. Iterate 𝑇 times
𝜇𝑖(𝑡)
← 𝜎 𝑊1𝑋𝑖 + 𝑊2
𝑗∈𝒩 𝑖
𝜇𝑗(𝑡−1)
, ∀𝑖
9
Parameterized
as neural network
𝑓 𝑓 , 𝑓
Supervised
Learning
Generative
Models
Reinforcement
Learning
𝜇𝑖(𝑇) 𝜇𝑖
(𝑇) 𝜇𝑘𝑇
𝑘∈𝑆𝜇𝑗(𝑇)
𝑋6
𝑋1
𝑋2 𝑋3
𝑋4
𝑋5
𝜇2(1)
𝜇1(1)
𝜇6(1)
𝜇3(1)
𝜇5(1)
𝜇4(1)
Fake account detection
Fake account can increase system level risk
Need to shut down fake accounts
10
Account – device network
Bad
guy
Good
guy
device
New accounts in a month
roughly millions of nodes and millions of edges.
RecallP
recision
detection comparisonRule Structure2vec
Fake account pattern
11
Fake accountNormal account
Device
Aggregation
Activity
Aggregation
Cross-platform code similarity detection
Key problem is to learn code graph presentation across platforms
12[Xu et al. 2017]
Twin-networks to extract features
Codes from the same function should be similar across platforms
13
Structure2vec
New tools for representation learning over graphs
(city=Atlanta) AND (age=40)
(browser=IE) XOR (system=Linux)
(bought=car) OR (usage<3 years)
Explosive combinations!
Graphical Models
𝑋 ⊥ 𝑌 | 𝑍
Life is great
Deep learning
𝑦 = 𝑓(𝑥)
Help
Manual algorithm design
14
𝑋6
𝑋1
𝑋2 𝑋3
𝑋4
𝑋5
Embed belief propagation
Approximate embedding of
𝑝 𝐻𝑖 𝑥𝑗 ↦ 𝜇𝑖
via fixed point update
1. Initialize 𝜇𝑖𝑗 , ∀ 𝑖, 𝑗
2. Iterate 𝑇 times
3. Aggregate 𝜇𝑖 = 𝑊3 ℓ∈𝒩 𝑖 𝜇ℓ𝑖(𝑇)
, ∀ 𝑖
𝜇𝑖𝑗(𝑡)
← 𝜎 𝑊1𝑋𝑖 + 𝑊2
ℓ∈𝒩 𝑖 \j
𝜇ℓ𝑖(𝑡−1)
, ∀(𝑖, 𝑗)
𝑓 𝑓 , 𝑓
Supervised
Learning
Generative
Models
Reinforcement
Learning 15
Parameterized
as neural network
Dynamic Networks
who will do
what and when?
Dynamic processes over networks
ChristineAliceDavid Jacob
BookTowelShoe
𝑅user
item
≈
matrix factorization
𝑈
𝑉
17
Unroll: time-varying dependency structuretime
𝑡0
𝑡2
𝑡1
𝑡3
Represent
𝑋1 𝑋2 𝑋3
𝑋4
𝑋5
𝐻8 𝐻9
𝐻6 𝐻7
𝐻4
𝐻1
𝐻5
𝐻2 𝐻3
𝑋6
LVM
𝐺 = (𝒱, ℇ)
user/item
raw features
Interaction
time/context
18[Dai, et al. 2016]
Embed filtering/forward message passingtime
𝑡0
𝑡2
𝑡1
𝑡3
Represent
𝑋1 𝑋2 𝑋3
𝑋4
𝑋5
𝐻8 𝐻9
𝐻6 𝐻7
𝐻4
𝐻1
𝐻5
𝐻2 𝐻3
𝑋6
LVM
𝐺 = (𝒱, ℇ)
user/item
raw features
Interaction
time/context
19
Cash-out detection
Cash-out for credit loans create vicious cycles
Need to detect cash-out transactions
20
Shop Owner
Account
Alipay
Account 1
Normal Bank
Account
Alipay
Account 2
Use $1000
credit
Same
person
Transfer
$800 cash
Transfer
$800 cash
Recall
Pre
cisi
on
Structure2vec
node2vec
+ xgboost
Network of millions of transactions a day
detection comparison
GDELT database
Events in news media
subject – relation – object
and time
Total archives span >215 years,
trillion of events
Time-varying dependency structure
Temporal knowledge graph:
What will happen next?
21[Trivedi, et al. 2017]
Enemy’s
friend
is an
enemy
CAIRO CROATIA
MANCHESTER
PROTESTOR
NEW ZEALAND
SOMALIA
TEHRAN
SINGAPORE
<ASSAULT : 06-Jul-2015>
CO
NS
ULT
CO
OP
ER
ATE
<0
6-Ju
n-2
01
5>
<2
9-Ju
n-2
01
5>
(predicted event)
<2
8-M
ay-2
01
5>
FIGH
TFIG
HT
<0
5-M
ay-2
01
5>
22
COLOMBIA OTTAWA
NEW DELHI
BELGIUM
LIBYA
VENEZUELA
UAE
GAUTEMALA
<MATERIAL COOP. : 02-Jul-2015>
CO
NS
ULT
PR
OV
IDE
AID
<2
6-Ju
n-2
01
5>
<0
3-Feb
-20
15
>
(predicted event)
<1
6-M
ay-2
01
5>
FIGH
TTR
AD
E C
OO
P.
<2
8-M
ar-2
01
5>
Friends’
friend
is a friend,
common
enemy
strengthen
the tie
23
Reinforcement Learning
Combinatorial optimization over graphs
NP-hard
problems
Application Optimization problem
Fraudster: virus spreading
Analysts: community discovery
Platforms: resource scheduling
Minimum vertex/set cover
Maximum cut
Traveling salesman
25
0
Greedy algorithm for minimum vertex cover
2 - approximation for minimum vertex cover
Repeat till all edges covered:
• Select uncovered edge with largest total degree
Manually
designed rule.
Can we learn
from data?
1
1
00
0
0 0
0 0
0
NP-hard problems
26
Greedy algorithm as Markov decision process
min𝑥𝑖∈ 0,1
𝑖∈𝓥
𝑥𝑖
𝑠. 𝑡. 𝑥𝑖 + 𝑥𝑗 ≥ 1, ∀ 𝑖, 𝑗 ∈ 𝓔
Repeat:
1. Compute total degree of each
uncovered edge
2. Select both ends of uncovered
edge with largest total degree
Until all edges are covered
Minimum vertex cover: smallest number of nodes to cover all edges
Reward: 𝑟𝑡 = −1
State 𝑆: current selected nodes
Action value function: 𝑄(𝑆, 𝑖)
Greedy policy:
𝑖∗ = 𝑎𝑟𝑔𝑚𝑎𝑥𝑖 𝑄(𝑆, 𝑖)
Update state 𝑆
27
Embedding for state-action value function
pick best
node
Greedy action
𝑖∗ = 𝑎𝑟𝑔𝑚𝑎𝑥𝑖 𝑄(𝑆, 𝑖)
State-action value function 𝑄 𝑆, 𝑖
= 𝜃1𝜎(𝜃2 𝑗∈𝑉 𝜇𝑗 + 𝜃3 𝜇𝑖)
aggregated
embedding
individual
embedding
𝑓 ,
Reinforcement
Learning
1. Problem graph
2.
Model &
Structure2Vec
3. Q-function4. Train
28[Dai et al. 2017]
What algorithm is learned?
Learned algorithm balances between
• degree of the picked node and
• fragmentation of the graph
Structure2Vec Node greedy Edge greedy
29
Summary
30
Networks
Timing
Various apps
Structure2Vec
Representation
CNN and RNN are for
images and sequences
Supervised
Learning
Generative
Models
Reinforcement
Learning
We are hiring!