Complex Networks and their Analysis with Random Walks
Daniel R. Figueiredo, Federal University of Rio de Janeiro, Brazil
Konstantin Avrachenkov, INRIA Sophia Antipolis, France
ITC 27 – Tutorial – Ghent, Belgium
ITC 2015
What is Complex Networks About?
Understand how and why things are connected, and the consequences of connectedness
ᴏ “Connected things” → Networks
ᴏ “How, why, consequences” → Complex
Network connectedness takes center stage
ᴏ necessary to understand phenomena we are yet to understand
The network is not complex!
Objectives and Organization
First contact
ᴏ “networks everywhere”
Empirical findings of networks
ᴏ important commonalities
Mathematical models for networks
Introduction to random walk
ᴏ simple yet profound
Applications of random walks
ᴏ ranking, sampling, clustering, etc.
Schedule: Daniel (1h, 1h), break, Kostia (.5h, 1.5h)
Networks
What is a network?
Set of points connected by line segments
[Figure: example network with nodes a, b, c, d, e, f]
Bureaucratic definition!
Networks, another definition
Abstraction to encode relationships among pairs of objects
ᴏ objects → network nodes
ᴏ relationships → network edges
Mathematical abstraction to represent structures
Networks aka graphs (or vice-versa)
Social Networks
Objects: people
Relationship: be connected on Facebook
[Figure: network over Carlos, Marcos, Maria, Bruno, Ana, Rodrigo, Carol, Pedro]
Another relationship: have kissed (in real life!)
[Figure: a different network over the same eight people]
Different relationships encoded on same set of objects
Collaboration Networks
Objects: professors
Relationship: co-authorship of papers
[Figure: co-authorship network, PESC/COPPE, DBLP 2009]
Importance of structure?
Sexual Contact Networks
Objects: people
Relationship: sexual contact
Importance of structure?
Web Network
Information network (webpage is unit of info)
Asymmetric relationship: hyperlinks
20+ billion webpages (worldwidewebsize.com)
Importance of structure?
Internet
Network of networks (AS Level Graph)
40+ thousand nodes (AS)
Importance of structure?
Neural Network
C. elegans (roundworm)
Simple nervous system
Neural network
ᴏ 302 neurons
ᴏ fully mapped in 70s
ᴏ Nobel Prize in 2002
Importance of structure?
openworm.org
Protein Interaction Network
Proteins of virus EBV (circles) and proteins of humans (squares)
Importance of structure?
Talking About Networks
How to describe a network?
Draw it! “A picture is worth a thousand words”
ᴏ what if network is large, 10K+ nodes?
Adjacency matrix: simple structural description
ᴏ $a_{ij} = 1$, if nodes i and j are related
ᴏ $a_{ij} = 0$, otherwise
Encodes all relationships
Generalizes to asymmetric and weighted relationships
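A minimal sketch of the adjacency matrix description, in plain Python. The five-node edge list and the function name `adjacency_matrix` are hypothetical, chosen only for illustration; nodes are assumed to be labeled 0..n-1.

```python
def adjacency_matrix(n, edges, directed=False):
    """Return an n x n 0/1 matrix for nodes labeled 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1          # i and j are related
        if not directed:
            a[j][i] = 1      # symmetric relationship
    return a

# Hypothetical example network (not taken from the slides).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
A = adjacency_matrix(5, edges)
for row in A:
    print(row)
# [0, 1, 1, 0, 0]
# [1, 0, 1, 0, 0]
# [1, 1, 0, 1, 0]
# [0, 0, 1, 0, 1]
# [0, 0, 0, 1, 0]
```

Passing `directed=True` gives the asymmetric case; a weighted variant would store the weight instead of 1.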
Adjacency Matrix
Problem with describing network with adjacency matrix?
Too much information, too little intuition!
ᴏ e.g., adjacency matrix of web
Adjacency matrix is DNA of network
ᴏ do you describe people by their DNA?
Structural Summaries!
Understand structure at an intuitive level, capturing its essence
Network Characteristics
Which are the important characteristics?
Characteristics provide structural information on network
ᴏ size, density, degrees, distances, clustering, centrality, homophily, etc.
Depends on object of study
ᴏ like genes in DNA, we have yet to fully know their meaning
Important Characteristics
Two properties that make a structural characteristic important
1) predicts behavior of a particular process
2) influences various processes
Best example: network degree distribution
1) determines random walk behavior
2) influences search and epidemics
Yet to discover powerful characteristics of networks
ᴏ e.g., eigenvalue of adjacency matrix
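One well-known instance of degrees determining random walk behavior (offered here as an illustration, not necessarily the exact statement the slide has in mind): for a simple random walk on a connected undirected network, the distribution $\pi_i = d_i / 2m$ is stationary. The sketch below checks this exactly with fractions and then compares it with visit frequencies from a simulated walk. The example edge list and all function names are hypothetical.

```python
import random
from fractions import Fraction

# Hypothetical connected, undirected example network.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = {}
for i, j in edges:
    adj.setdefault(i, []).append(j)
    adj.setdefault(j, []).append(i)

m = len(edges)
# Candidate stationary distribution: degree divided by 2m.
pi = {i: Fraction(len(adj[i]), 2 * m) for i in adj}

def one_step(dist):
    """Push a probability distribution through one random walk step."""
    new = {i: Fraction(0) for i in adj}
    for i, mass in dist.items():
        for j in adj[i]:
            new[j] += mass / len(adj[i])  # uniform choice among neighbors
    return new

is_stationary = one_step(pi) == pi  # exact check, no rounding

def visit_frequencies(steps, seed=0):
    """Fraction of steps a simulated walk spends at each node."""
    rng = random.Random(seed)
    node = 1
    visits = {i: 0 for i in adj}
    for _ in range(steps):
        node = rng.choice(adj[node])
        visits[node] += 1
    return {i: visits[i] / steps for i in adj}

freq = visit_frequencies(200_000)
for i in sorted(adj):
    print(i, float(pi[i]), round(freq[i], 3))
```

The simulated frequencies should land close to `pi`, with the highest-degree node visited most often.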
Nodes and Edges
Number of nodes in a network
ᴏ n = |V| (cardinality of set of nodes)
Number of edges (present relationships)
ᴏ m = |E|
Largest number of edges a network can have?
ᴏ $\binom{n}{2} = \frac{n(n-1)}{2} \le n^2$
ᴏ number of unordered pairs in a set of n objects
ᴏ every node is connected to every other node
Average Degree and Density
Degree: number of edges incident on node
ᴏ for asymmetric relationships, in-degree different from out-degree
ᴏ $d_i$: degree of node i
Average degree
ᴏ $g = \frac{1}{n} \sum_{i \in V} d_i$
ᴏ note that every edge has two end points: $\sum_{i \in V} d_i = 2m$
ᴏ $g = 2m/n$
Density: fraction of edges present in network
ᴏ $\rho = \frac{m}{\binom{n}{2}} = \frac{2m}{n(n-1)} = \frac{g}{n-1}$
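The degree, average degree and density definitions can be sketched in a few lines of Python. The edge list and the function name `degree_summary` are hypothetical; nodes are assumed to be labeled 0..n-1 and the network undirected with no repeated edges.

```python
from math import comb

def degree_summary(n, edges):
    """Return (degree list, average degree g, density rho)."""
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1       # every edge has two end points,
        degree[j] += 1       # so it adds 1 to two degrees
    m = len(edges)
    g = 2 * m / n            # average degree
    rho = m / comb(n, 2)     # fraction of possible edges present
    return degree, g, rho

# Hypothetical example network (not taken from the slides).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
degree, g, rho = degree_summary(5, edges)
print(degree, g, rho)  # [2, 2, 3, 2, 1] 2.0 0.5
```

Here the degrees sum to 10 = 2m, and the density 0.5 equals g/(n-1) = 2/4, matching the identities above.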
Degree Distribution
Empirical distribution of node degree
ᴏ fraction of nodes with degree k
ᴏ $p_k = \frac{\text{number of nodes with degree } k}{\text{total number of nodes}}$, for k = 0, 1, …, n-1
Empirical CCDF – Complementary Cumulative Distribution Function
ᴏ fraction of nodes with degree greater than k
ᴏ $1 - \sum_{i=0}^{k} p_i$
Let D be a random variable representing node degree (node chosen uniformly at random)
ᴏ $P[D = k] = p_k$
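A small sketch of the empirical degree distribution and its CCDF, computed from a list of node degrees. The degree list and the function name `degree_distribution` are hypothetical. The CCDF is computed by counting nodes with degree greater than k, which is equivalent to one minus the cumulative sum and avoids floating-point drift.

```python
from collections import Counter

def degree_distribution(degrees):
    """Return (p, ccdf) as lists indexed by k = 0, 1, ..., n-1."""
    n = len(degrees)
    counts = Counter(degrees)
    # p[k]: fraction of nodes with degree exactly k
    p = [counts.get(k, 0) / n for k in range(n)]
    # ccdf[k]: fraction of nodes with degree strictly greater than k
    ccdf = []
    remaining = n
    for k in range(n):
        remaining -= counts.get(k, 0)
        ccdf.append(remaining / n)
    return p, ccdf

# Hypothetical degrees of a five-node network.
degrees = [2, 2, 3, 2, 1]
p, ccdf = degree_distribution(degrees)
print(p)     # [0.0, 0.2, 0.6, 0.2, 0.0]
print(ccdf)  # [1.0, 0.8, 0.2, 0.0, 0.0]
```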
Example
Empirical degree distribution?
[Figure: example network with its empirical PDF, P[D = k], and CCDF, P[D > k], plotted against degree k = 1, …, 4]
Distances
Path: sequence of edges on the network
Distance: length of shortest path between pair of nodes
[Figure: example network with nodes 1, 2, 3, 4, 5, 6, 7]
d(1,7) = ?
d(5,2) = ?
note: (many) more than 1 shortest path
Average distance of network
ᴏ across all (unordered) pairs
ᴏ $d = \frac{\sum_{i,j \in V} d(i,j)}{\binom{n}{2}}$, with the sum taken over unordered pairs
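Distances in an unweighted network can be computed with breadth-first search, and the average distance then follows by summing over unordered pairs. The sketch below assumes a connected, undirected network; the edge list is a hypothetical example (the network in the slide's figure is not recoverable here), and the function names are illustrative.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_distance(adj):
    """Mean of d(i, j) over all unordered pairs; assumes a connected network."""
    nodes = sorted(adj)
    dist = {u: bfs_distances(adj, u) for u in nodes}
    pairs = list(combinations(nodes, 2))
    return sum(dist[u][v] for u, v in pairs) / len(pairs)

# Hypothetical example network.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = {}
for i, j in edges:
    adj.setdefault(i, set()).add(j)
    adj.setdefault(j, set()).add(i)

print(bfs_distances(adj, 1)[5])  # 3
print(average_distance(adj))     # 1.7
```

Running one search per node is fine for small examples; for large networks the average distance is usually estimated from a sample of source nodes.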
Clustering Coefficient
Local transitivity
ᴏ (i,j) neighbors, (j,k) neighbors → (i,k) neighbors?
Local transitivity forms triangles
Two popular ways to measure it
1) Local measure: Fraction of edges among neighbors of a node
2) Global measure: Fraction of two hop paths that are triangles
Related but not identical for most networks
ᴏ global measure preferred
Local Clustering
Defined for each network node
Fraction of edges among neighbors of node
ᴏ $C_i = \frac{E_i}{\binom{d_i}{2}} = \frac{2 E_i}{d_i (d_i - 1)}$
ᴏ $E_i$: number of edges among neighbors of i
ᴏ $\binom{d_i}{2}$: max number of edges among neighbors of a node with degree $d_i$
Network clustering: average clustering across all nodes
ᴏ $C = \frac{1}{n} \sum_{i \in V} C_i$
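A sketch of both clustering measures on a small hypothetical network, to show that the average of the local values and the global fraction of closed two-hop paths generally differ. Setting $C_i = 0$ for nodes with degree below 2 is a convention assumed here; the slides do not say how such nodes are handled. Function names are illustrative.

```python
from itertools import combinations

def linked_neighbor_pairs(adj, i):
    """E_i: number of edges among the neighbors of node i."""
    return sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])

def local_clustering(adj, i):
    d = len(adj[i])
    if d < 2:
        return 0.0  # assumed convention: no neighbor pairs, so C_i = 0
    return 2 * linked_neighbor_pairs(adj, i) / (d * (d - 1))

def average_clustering(adj):
    """Local measure averaged across all nodes."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)

def global_clustering(adj):
    """Fraction of two-hop paths (centered at any node) that close a triangle."""
    two_hop = sum(len(adj[i]) * (len(adj[i]) - 1) // 2 for i in adj)
    closed = sum(linked_neighbor_pairs(adj, i) for i in adj)
    return closed / two_hop if two_hop else 0.0

# Hypothetical example network: one triangle (1, 2, 3) with a tail 3-4-5.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = {}
for i, j in edges:
    adj.setdefault(i, set()).add(j)
    adj.setdefault(j, set()).add(i)

print(local_clustering(adj, 3))   # 1 of 3 neighbor pairs linked
print(average_clustering(adj))    # (1 + 1 + 1/3 + 0 + 0) / 5
print(global_clustering(adj))     # 3 closed out of 6 two-hop paths
```

On this example the averaged local measure is 7/15 while the global measure is 1/2, so the two are related but not identical.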
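Example
$C_i = \frac{E_i}{\binom{d_i}{2}}$
ᴏ $E_i$: number of edges among neighbors of i
ᴏ $\binom{d_i}{2}$: max number of edges among neighbors of a node with degree $d_i$
$C = \frac{1}{n} \sum_{i \in V} C_i$
[Figure: example network with nodes 1, 2, 3, 4, 5, 6, 7]
C1 = ?
C4 = ?
C5 = ?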
Three Important Aspects
“Small world” effect
“My friends are also friends” effect
“not born equal” effect
Fundamental and present in various real networks
ᴏ only recently observed (birth of network science)
“Small world” Effect
“It's a small world, after all”
Average distance is very small
Even on very large networks
ᴏ Web graph – $10^8$ nodes
ᴏ Actor collaboration network – $10^6$ nodes
ᴏ Facebook friendship – $10^9$ nodes
ᴏ average distances shown: 7.5, 3, 4.5
“Six degrees of separation”
ᴏ distance between two people in the world
Popularized by Milgram's experiment in the 60's
Principle extends beyond social networks
“My friends are also friends” Effect
If A has relationship with B and C, then B and C have large chance of also being related
Two hop paths tend to become triangles on networks
Induces high clustering coefficient
ᴏ AS graph – $10^4$ nodes: clustering 0.39 (randomly paired: 0.00056)
ᴏ Facebook – $10^9$ nodes: clustering 0.14 (randomly paired: 0.00000026)
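A rough consistency check on the “randomly paired” values, under the assumption (not stated on the slide) that they refer to a network with the same n and m whose edges are placed uniformly at random. In that case two neighbors of a node are linked with probability equal to the density, so the expected clustering is approximately $\rho = g/(n-1)$. Using the AS graph's average degree of 5.9 quoted elsewhere in this tutorial and the order-of-magnitude size $n \approx 10^4$:

```latex
\[
C_{\text{random}} \approx \rho = \frac{g}{n-1}
\approx \frac{5.9}{10^{4}} \approx 5.9 \times 10^{-4}
\]
```

This is the same order of magnitude as the 0.00056 reported for the AS graph, and roughly three orders of magnitude below the observed clustering of 0.39.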
“Not born equal” effect
Node degrees are very unequal
most very small, few extremely large
similar to distribution of wealth in some countries (like Brazil)
Heavy tailed degree distribution
AS Graph – $10^4$ nodes
avg degree: 5.9
max degree: ~2100
Citation network – $10^6$ nodes
avg degree: 8.6
max degree: ~9000
The Puzzle
Fact I: Various networks have peculiar structural properties
degree distribution, distances, clustering, etc
Fact II: Various networks have similar structural properties
Web, Facebook, neural network, citation, etc
“Million dollar question”
Why?
by and large still searching for a satisfactory answer!