Class 3: Random Graphs Network Science: Random Graphs 2012 Prof. Albert-László Barabási Dr. Baruch Barzel, Dr. Mauro Martino


Page 1: Class 3: Random Graphs

Class 3: Random Graphs

Network Science: Random Graphs 2012

Prof. Albert-László Barabási, Dr. Baruch Barzel, Dr. Mauro Martino

Page 2: Class 3: Random Graphs

RANDOM NETWORK MODEL


Page 3: Class 3: Random Graphs

Erdös-Rényi model (1960)

Connect with probability p

p=1/6 N=10

<k> ~ 1.5

Pál Erdös (1913-1996)

Alfréd Rényi (1921-1970)

RANDOM NETWORK MODEL

Page 4: Class 3: Random Graphs

RANDOM NETWORK MODEL

Definition:

A random graph is a graph of N labeled nodes where each pair of nodes is connected with a preset probability p.

We will call it G(N, p).
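The definition translates almost directly into code. The sketch below is a minimal illustration, not part of the original slides; the function name `gnp_random_graph` and the `seed` parameter are assumptions chosen for this example.

```python
import itertools
import random


def gnp_random_graph(n, p, seed=None):
    """Return the edge list of one realization of G(N, p)."""
    rng = random.Random(seed)
    edges = []
    # Visit each of the N(N-1)/2 node pairs once and keep it with probability p.
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            edges.append((u, v))
    return edges


edges = gnp_random_graph(10, 1 / 6, seed=1)
print(len(edges), "links; average degree", 2 * len(edges) / 10)
```

Running it with different seeds gives different realizations of the same G(N, p), which is why N and p alone do not pin down a single network.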


Page 5: Class 3: Random Graphs

RANDOM NETWORK MODEL

p=1/6 N=12


Page 6: Class 3: Random Graphs

RANDOM NETWORK MODEL

p=0.03 N=100


Page 7: Class 3: Random Graphs

RANDOM NETWORK MODEL

N and p do not uniquely define the network: we can have many different realizations of it. How many?

N=10 p=1/6

The probability of forming a particular graph G(N,p) with L links is

$$P(G(N,p)) = p^{L}(1-p)^{\frac{N(N-1)}{2}-L}$$

That is, each graph G(N,p) appears with probability P(G(N,p)).

Page 8: Class 3: Random Graphs

RANDOM NETWORK MODEL

P(L): the probability of having exactly L links in a network of N nodes with connection probability p:

$$P(L) = \binom{\frac{N(N-1)}{2}}{L}\, p^{L}(1-p)^{\frac{N(N-1)}{2}-L}$$

N(N-1)/2: the maximum number of links in a network of N nodes.

The binomial coefficient: the number of different ways we can choose L links among all potential links.

Binomial distribution...
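As a quick numerical check of the binomial form of P(L), the sketch below sums the distribution for the N=10, p=1/6 example. The helper name `prob_links` is an assumption made for this illustration.

```python
from math import comb


def prob_links(n, p, links):
    """P(L): probability that G(N, p) has exactly `links` links."""
    max_links = n * (n - 1) // 2  # number of node pairs
    return comb(max_links, links) * p**links * (1 - p) ** (max_links - links)


n, p = 10, 1 / 6
max_links = n * (n - 1) // 2
total = sum(prob_links(n, p, l) for l in range(max_links + 1))
mean = sum(l * prob_links(n, p, l) for l in range(max_links + 1))
print(total, mean)  # total should be close to 1, mean close to p * N(N-1)/2 = 7.5
```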

Page 9: Class 3: Random Graphs

MATH TUTORIAL the mean of a binomial distribution

There is a faster way using generating functions, see: http://planetmath.org/encyclopedia/BernoulliDistribution2.html
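The slide's own derivation did not survive extraction. One standard route to the mean, shown here as a sketch using the identity $x\binom{N}{x} = N\binom{N-1}{x-1}$, is:

```latex
\langle x \rangle
  = \sum_{x=0}^{N} x \binom{N}{x} p^{x} (1-p)^{N-x}
  = Np \sum_{x=1}^{N} \binom{N-1}{x-1} p^{x-1} (1-p)^{N-x}
  = Np \,\bigl(p + (1-p)\bigr)^{N-1}
  = Np
```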

Page 10: Class 3: Random Graphs

MATH TUTORIAL the variance of a binomial distribution

http://keral2008.blogspot.com/2008/10/derivation-of-mean-and-variance-of.html
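The derivation on this slide was an image and is not recoverable. A common way to obtain the variance, given here as a sketch and not necessarily the route the slide took, goes through the factorial moment $\langle x(x-1)\rangle$:

```latex
\langle x(x-1) \rangle
  = \sum_{x=2}^{N} x(x-1) \binom{N}{x} p^{x} (1-p)^{N-x}
  = N(N-1)p^{2} \sum_{x=2}^{N} \binom{N-2}{x-2} p^{x-2} (1-p)^{N-x}
  = N(N-1)p^{2}

\sigma_x^{2}
  = \langle x^{2} \rangle - \langle x \rangle^{2}
  = N(N-1)p^{2} + Np - N^{2}p^{2}
  = Np(1-p)
```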

Page 11: Class 3: Random Graphs

MATH TUTORIAL the variance of a binomial distribution

http://keral2008.blogspot.com/2008/10/derivation-of-mean-and-variance-of.html

Page 12: Class 3: Random Graphs

MATH TUTORIAL Binomial Distribution: The bottom line

$$P(x) = \binom{N}{x} p^{x}(1-p)^{N-x}, \qquad \langle x\rangle = Np, \qquad \sigma_x^{2} = Np(1-p)$$

http://keral2008.blogspot.com/2008/10/derivation-of-mean-and-variance-of.html

Page 13: Class 3: Random Graphs

RANDOM NETWORK MODEL

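P(L): the probability of having a network with exactly L links

$$P(L) = \binom{\frac{N(N-1)}{2}}{L}\, p^{L}(1-p)^{\frac{N(N-1)}{2}-L}$$

• The average number of links <L> in a random graph: $\langle L\rangle = p\,\frac{N(N-1)}{2}$

• The standard deviation of the number of links: $\sigma_L = \sqrt{p(1-p)\frac{N(N-1)}{2}}$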
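To get a feel for these two quantities, the sketch below evaluates the binomial mean p·N(N-1)/2 and standard deviation sqrt(p(1-p)·N(N-1)/2) for the two example networks shown earlier (N=12, p=1/6 and N=100, p=0.03). The function name `link_statistics` is an assumption for this illustration.

```python
from math import sqrt


def link_statistics(n, p):
    """Mean and standard deviation of the number of links in G(N, p)."""
    max_links = n * (n - 1) / 2
    mean = p * max_links
    std = sqrt(p * (1 - p) * max_links)
    return mean, std


for n, p in [(12, 1 / 6), (100, 0.03)]:
    mean, std = link_statistics(n, p)
    print(f"N={n}, p={p:.3f}: <L>={mean:.1f}, sigma={std:.1f}")
```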

Page 14: Class 3: Random Graphs

DEGREE DISTRIBUTION OF A RANDOM GRAPH

$$P(k) = \binom{N-1}{k}\, p^{k}(1-p)^{N-1-k}$$

• Select k nodes from N-1: $\binom{N-1}{k}$

• Probability of having k edges: $p^{k}$

• Probability of missing the remaining N-1-k edges: $(1-p)^{N-1-k}$

As the network size increases, the distribution becomes increasingly narrow: we are increasingly confident that the degree of a node is in the vicinity of <k>.
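The three factors (choose k neighbors out of N-1, k links present, N-1-k links absent) combine into a binomial degree distribution. The sketch below evaluates it for the N=10, p=1/6 example; `degree_prob` is a name chosen for this illustration.

```python
from math import comb


def degree_prob(n, p, k):
    """Binomial probability that a given node of G(N, p) has degree k."""
    return comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


n, p = 10, 1 / 6
norm = sum(degree_prob(n, p, k) for k in range(n))
mean_degree = sum(k * degree_prob(n, p, k) for k in range(n))
print(norm, mean_degree)  # norm close to 1, mean close to p * (N - 1) = 1.5
```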

Page 15: Class 3: Random Graphs

DEGREE DISTRIBUTION OF A RANDOM GRAPH

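For large N and small k, we can use the following approximations:

$$\binom{N-1}{k} \approx \frac{(N-1)^{k}}{k!}, \qquad (1-p)^{N-1-k} \approx e^{-\langle k\rangle} \qquad \text{for } k \ll N,$$

with <k> = p(N-1), so that $P(k) \approx e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$.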

Page 16: Class 3: Random Graphs

DEGREE DISTRIBUTION OF A RANDOM GRAPH

[Figure: degree distribution P(k) as a function of k]

Page 17: Class 3: Random Graphs

DEGREE DISTRIBUTION OF A RANDOM NETWORK

Exact result (binomial distribution): $P(k) = \binom{N-1}{k}\, p^{k}(1-p)^{N-1-k}$

Large N limit (Poisson distribution): $P(k) = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$

Probability Distribution Function (PDF)
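How quickly the exact binomial form approaches the Poisson limit can be checked numerically. The sketch below fixes <k> = 3 (an arbitrary value chosen for illustration) and measures the largest gap between the two distributions over k = 0..9 as N grows; the function names are assumptions.

```python
from math import comb, exp, factorial


def binomial_pk(n, p, k):
    """Exact degree distribution of G(N, p)."""
    return comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


def poisson_pk(mean_k, k):
    """Large-N (Poisson) approximation with the same average degree."""
    return exp(-mean_k) * mean_k**k / factorial(k)


def max_gap(n, mean_k, k_max=10):
    """Largest |binomial - Poisson| difference for k = 0 .. k_max - 1."""
    p = mean_k / (n - 1)
    return max(abs(binomial_pk(n, p, k) - poisson_pk(mean_k, k)) for k in range(k_max))


for n in (10, 100, 10_000):
    print(n, max_gap(n, 3.0))
```

The gap should shrink as N increases at fixed <k>, which is the sense in which the Poisson form is the large-N limit.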

Page 18: Class 3: Random Graphs

What does it mean? Continuum formalism:

If we consider a network with average degree <k> then the probability to have a node whose degree exceeds a degree k0 is:

For example, with <k>=10,
• the probability to find a node whose degree is at least twice the average degree is 0.00158826.
• the probability to find a node whose degree is at least ten times the average degree is 1.79967152 × 10^-13.
• the probability to find a node whose degree is less than a tenth of the average degree is 0.00049.
See http://www.stud.feec.vutbr.cz/~xvapen02/vypocty/po.php

• The probability of seeing a node with very high or very low degree is exponentially small.
• Most nodes have comparable degrees.
• The larger the size of a random network, the more similar are the node degrees.

What does it mean? Discrete formalism:

NODES HAVE COMPARABLE DEGREES IN RANDOM NETWORKS
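In the discrete formalism these probabilities are plain sums over the Poisson distribution. The sketch below assumes a Poisson degree distribution and reads "exceeds k0" as P(k > k0); under that reading it reproduces the first quoted value and roughly the third. The function name is an assumption, and because the tail is computed as 1 minus the cumulative sum, it is not suitable for extremely small tails.

```python
from math import exp


def poisson_cdf(mean_k, k_max):
    """P(k <= k_max) for a Poisson degree distribution with mean `mean_k`."""
    term = exp(-mean_k)  # P(k = 0)
    total = term
    for k in range(1, k_max + 1):
        term *= mean_k / k  # P(k) = P(k - 1) * <k> / k
        total += term
    return total


mean_k = 10
print(1 - poisson_cdf(mean_k, 20))  # P(k > 20): degree above twice the average
print(poisson_cdf(mean_k, 1))       # P(k <= 1): degree at most a tenth of the average
```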

Page 19: Class 3: Random Graphs

NO OUTLIERS IN A RANDOM SOCIETY

According to sociological research, for a typical individual k ~ 1,000.

The probability to find an individual with degree k > 2,000 is 10^-27.

Given that N ~ 10^9, the chance of finding an individual with 2,000 acquaintances is so tiny that such nodes are virtually nonexistent in a random society.

A random society would consist mainly of average individuals, with everyone having roughly the same number of friends.

It would lack outliers, individuals that are either highly popular or reclusive.

Page 20: Class 3: Random Graphs

SIX DEGREES small worlds

Frigyes Karinthy, 1929
Stanley Milgram, 1967

[Figure labels: Peter, Jane, Sarah, Ralph]

Page 21: Class 3: Random Graphs

SIX DEGREES 1929: Frigyes Karinthy

Frigyes Karinthy (1887-1938), Hungarian writer

“Look, Selma Lagerlöf just won the Nobel Prize for Literature, thus she is bound to know King Gustav of Sweden, after all he is the one who handed her the Prize, as required by tradition. King Gustav, to be sure, is a passionate tennis player, who always participates in international tournaments. He is known to have played Mr. Kehrling, whom he must therefore know for sure, and as it happens I myself know Mr. Kehrling quite well.”

"The worker knows the manager in the shop, who knows Ford; Ford is on friendly terms with the general director of Hearst Publications, who last year became good friends with Arpad Pasztor, someone I not only know, but to the best of my knowledge a good friend of mine. So I could easily ask him to send a telegram via the general director telling Ford that he should talk to the manager and have the worker in the shop quickly hammer together a car for me, as I happen to need one."

1929: Minden másképpen van (Everything is Different), Láncszemek (Chains)

Page 22: Class 3: Random Graphs

SIX DEGREES 1967: Stanley Milgram

HOW TO TAKE PART IN THIS STUDY

1. ADD YOUR NAME TO THE ROSTER AT THE BOTTOM OF THIS SHEET, so that the next person who receives this letter will know who it came from.

2. DETACH ONE POSTCARD. FILL IT AND RETURN IT TO HARVARD UNIVERSITY. No stamp is needed. The postcard is very important. It allows us to keep track of the progress of the folder as it moves toward the target person.

3. IF YOU KNOW THE TARGET PERSON ON A PERSONAL BASIS, MAIL THIS FOLDER DIRECTLY TO HIM (HER). Do this only if you have previously met the target person and know each other on a first name basis.

4. IF YOU DO NOT KNOW THE TARGET PERSON ON A PERSONAL BASIS, DO NOT TRY TO CONTACT HIM DIRECTLY. INSTEAD, MAIL THIS FOLDER (POST CARDS AND ALL) TO A PERSONAL ACQUAINTANCE WHO IS MORE LIKELY THAN YOU TO KNOW THE TARGET PERSON. You may send the folder to a friend, relative or acquaintance, but it must be someone you know on a first name basis.


Page 23: Class 3: Random Graphs

SIX DEGREES 1991: John Guare

"Everybody on this planet is separated by only six other people. Six degrees of separation. Between us and everybody else on this planet. The president of the United States. A gondolier in Venice…. It's not just the big names. It's anyone. A native in a rain forest. A Tierra del Fuegan. An Eskimo. I am bound to everyone on this planet by a trail of six people. It's a profound thought. How every person is a new door, opening up into other worlds."


Page 24: Class 3: Random Graphs

WWW: 19 DEGREES OF SEPARATION

Image by Matthew Hurst
Blogosphere

Page 25: Class 3: Random Graphs

DISTANCES IN RANDOM GRAPHS

Random graphs tend to have a tree-like topology with almost constant node degrees.

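• nr. of first neighbors: $N_1 \approx \langle k\rangle$

• nr. of second neighbors: $N_2 \approx \langle k\rangle^{2}$

• nr. of neighbors at distance d: $N_d \approx \langle k\rangle^{d}$

• estimate maximum distance ($d_{max}$, where $N(d_{max}) = N$): $\langle k\rangle^{d_{max}} \approx N$, so $d_{max} \approx \frac{\log N}{\log\langle k\rangle}$

log N << N (small world)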
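The estimate follows from requiring that the number of nodes reachable within d_max steps, roughly <k>^d_max, equals N. A small sketch, with an assumed function name and illustrative inputs, makes the logarithmic growth concrete:

```python
from math import log


def estimated_max_distance(n, mean_k):
    """Estimate d_max from <k>**d_max = N, i.e. d_max = log N / log <k>."""
    return log(n) / log(mean_k)


# Social-network scale quoted earlier in the deck: N ~ 10^9, <k> ~ 1,000.
print(estimated_max_distance(1e9, 1000))
# A small sparse network for comparison.
print(estimated_max_distance(100, 3))
```

Because N enters only through its logarithm, multiplying the network size by a thousand adds just a few steps to the estimate.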

Page 26: Class 3: Random Graphs

$$l_{max} = \frac{\log N}{\log\langle k\rangle}$$

Given the huge differences in scope, size, and average degree, the agreement is excellent. See Table 3.2.

DISTANCES IN RANDOM GRAPHS compare with real data


Page 27: Class 3: Random Graphs

EVOLUTION OF A RANDOM GRAPH


Page 28: Class 3: Random Graphs

Until now we focused on the static properties of a random graph with a fixed p value.

What happens when we vary the parameter p?

EVOLUTION OF A RANDOM NETWORK

GOTO http://cs.gmu.edu/~astavrou/random.html

Choose Nodes=100. Note that p goes up in increments of 0.001. For N=100, L = pN(N-1)/2 ≈ p × 5,000, i.e. each increment adds about 5 new links.

Page 29: Class 3: Random Graphs
Page 30: Class 3: Random Graphs

EVOLUTION OF A RANDOM NETWORK

As <k> increases: disconnected nodes → NETWORK.

How does this transition happen?

Page 31: Class 3: Random Graphs

I: Subcritical, <k> < 1

II: Critical, <k> = 1

III: Supercritical, <k> > 1

IV: Connected, <k> > ln N

[Figure: example networks with N=100 at <k>=0.5, <k>=1, <k>=3, <k>=5; horizontal axis: <k>]

Page 32: Class 3: Random Graphs

I: Subcritical

<k> < 1, p < p_c = 1/N

No giant component.

N-L isolated clusters; the cluster size distribution is exponential.

The largest cluster is a tree of size ~ ln N; its share of the network, ln N/N, tends to 0 as N tends to infinity.

Page 33: Class 3: Random Graphs

II: Critical

<k> = 1, p = p_c = 1/N

Unique giant component: N_G ~ N^(2/3)

It contains a vanishing fraction of all nodes, N_G/N ~ N^(-1/3).

Small components are trees; the GC has loops.

A jump in the cluster size:
N = 1,000: ln N ~ 6.9; N^(2/3) ~ 100
N = 7 × 10^9: ln N ~ 22; N^(2/3) ~ 3,659,250
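The size of the jump between the subcritical scale (ln N) and the critical scale (N^(2/3)) is easy to tabulate. The sketch below is illustrative only; the function name is an assumption.

```python
from math import log


def largest_cluster_scales(n):
    """Return (ln N, N^(2/3)): largest-cluster scale below and at the critical point."""
    return log(n), n ** (2 / 3)


for n in (1_000, 7e9):
    sub, crit = largest_cluster_scales(n)
    print(f"N={n:.3g}: ln N = {sub:.1f}, N^(2/3) = {crit:,.0f}")
```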

Page 34: Class 3: Random Graphs

III: Supercritical

<k> > 1, p > p_c = 1/N

Example: <k>=3

Unique giant component: N_G ~ (p - p_c)N (the GC has a finite fraction of nodes). The GC has loops.

Cluster size distribution: exponential

Page 35: Class 3: Random Graphs

IV: Connected

<k> > ln N, p > (ln N)/N

Example: <k>=5

Only one cluster: N_G = N. The GC is dense.

Cluster size distribution: none
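The four regimes can be observed in a small simulation. The sketch below is one possible implementation, not the tool referenced in the slides: it samples G(N, p) with p = <k>/(N-1) and tracks components with a simple union-find structure. The function name, N=500, and the seed are assumptions for illustration.

```python
import random


def largest_component_fraction(n, mean_k, seed=None):
    """Fraction of nodes in the largest component of one G(N, p) sample."""
    rng = random.Random(seed)
    p = mean_k / (n - 1)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                parent[find(u)] = find(v)  # merge the two components

    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n


for mean_k in (0.5, 1, 3, 5):
    print(mean_k, largest_component_fraction(500, mean_k, seed=0))
```

For a single sample the numbers fluctuate, especially near <k> = 1, so averaging over several seeds gives a cleaner picture of the transition.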

Page 36: Class 3: Random Graphs

I: Subcritical, <k> < 1

II: Critical, <k> = 1

III: Supercritical, <k> > 1

IV: Connected, <k> > ln N

[Figure: example networks with N=100 at <k>=0.5, <k>=1, <k>=3, <k>=5; horizontal axis: <k>]

Page 37: Class 3: Random Graphs
Page 38: Class 3: Random Graphs

CLUSTERING COEFFICIENT

Since edges are independent and have the same probability p, the probability that two neighbors of a node are linked to each other is also p:

$$C = p = \frac{\langle k\rangle}{N-1} \approx \frac{\langle k\rangle}{N}$$

The clustering coefficient of random graphs is small. For fixed average degree, C decreases with the system size N.

Eq. 13.47 from Newman 2010: this is valid for random networks only, with arbitrary degree distribution.
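The prediction that C is approximately <k>/N can be compared with a direct measurement on sampled graphs. The sketch below is a simple illustration; the function name, the convention C_i = 0 for nodes with fewer than two neighbors, and the chosen sizes are assumptions.

```python
import itertools
import random


def average_clustering(n, p, seed=None):
    """Average local clustering coefficient of one G(N, p) sample."""
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            neighbors[u].add(v)
            neighbors[v].add(u)

    coefficients = []
    for u in range(n):
        k = len(neighbors[u])
        if k < 2:
            coefficients.append(0.0)  # convention: C_i = 0 when k_i < 2
            continue
        # Count links among the neighbors of u.
        links = sum(
            1
            for a, b in itertools.combinations(neighbors[u], 2)
            if b in neighbors[a]
        )
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / n


mean_k = 10
for n in (100, 200, 400):
    print(n, average_clustering(n, mean_k / (n - 1), seed=0), mean_k / n)
```

At fixed <k>, both the measured value and the estimate <k>/N should fall as N grows, although individual samples fluctuate.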

Page 39: Class 3: Random Graphs

• Degree distribution: Binomial, Poisson (exponential tails)

• Clustering coefficient: Vanishing for large network sizes

• Average distance among nodes: Logarithmically small

Erdös-Rényi MODEL (1960)


Page 40: Class 3: Random Graphs

HISTORICAL NOTE


Page 41: Class 3: Random Graphs

1951, Rapoport and Solomonoff:

First systematic study of a random graph. Demonstrates the phase transition.

Natural systems: neural networks; the social networks of physical contacts (epidemics); genetics.

Why do we call it the Erdös-Rényi random model?

HISTORICAL NOTE

Anatol Rapoport (1911-2007)

Edgar N. Gilbert (b. 1923)

1959: G(N,p)

Page 42: Class 3: Random Graphs

Erdös-Rényi MODEL (1960)


Page 43: Class 3: Random Graphs

Are real networks like random graphs?


Page 44: Class 3: Random Graphs

As quantitative data about real networks became available, we can compare their topology with the predictions of random graph theory.

Note that once we have N and <k> for a random network, we can derive every measurable property from them. Indeed, we have:

Average path length: $\langle l\rangle \approx \frac{\log N}{\log\langle k\rangle}$

Clustering Coefficient: $C = \frac{\langle k\rangle}{N}$

Degree Distribution: $P(k) = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$

ARE REAL NETWORKS LIKE RANDOM GRAPHS?

Page 45: Class 3: Random Graphs

Real networks have short distances, like random graphs.

Prediction: Data:

PATH LENGTHS IN REAL NETWORKS


Page 46: Class 3: Random Graphs

Prediction: Data:

C_rand underestimates the clustering coefficient of real networks by orders of magnitude.

CLUSTERING COEFFICIENT

Page 47: Class 3: Random Graphs

Prediction:

Data:

(a) Internet; (b) Movie Actors; (c) Coauthorship, high energy physics; (d) Coauthorship, neuroscience

THE DEGREE DISTRIBUTION

Page 48: Class 3: Random Graphs

As quantitative data about real networks became available, we can compare their topology with the predictions of random graph theory.

Note that once we have N and <k> for a random network, we can derive every measurable property from them. Indeed, we have:

Average path length: $\langle l\rangle \approx \frac{\log N}{\log\langle k\rangle}$

Clustering Coefficient: $C = \frac{\langle k\rangle}{N}$

Degree Distribution: $P(k) = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$

ARE REAL NETWORKS LIKE RANDOM GRAPHS?

Page 49: Class 3: Random Graphs

(A) Problems with the random network model:
• the degree distribution differs from that of real networks
• the giant component in most real networks does NOT emerge through a phase transition
• the clustering coefficient in most systems does not vanish, contrary to the model's prediction.

(B) Most important: we need to ask ourselves, are real networks random?

The answer is simply: NO

Hence it is IRRELEVANT.

There is no network in nature that we know of that would be described by the random network model.

(A) WRONG (B) IRRELEVANT

IS THE RANDOM GRAPH MODEL RELEVANT TO REAL SYSTEMS?

Page 50: Class 3: Random Graphs

It is the reference model for the rest of the class.

It will help us calculate many quantities that can then be compared to the real data, so that we understand to what degree a particular property is the result of some random process.

Organizing principles: patterns that are shared by a large number of real networks, yet deviate from the predictions of the random network model.

In order to identify these, we need to understand what a particular property would look like if it were driven entirely by random processes.

While WRONG and IRRELEVANT, it will turn out to be extremely USEFUL!

IF IT IS WRONG AND IRRELEVANT, WHY DID WE DEVOTE A FULL CLASS TO IT?

Page 51: Class 3: Random Graphs

NETWORK DATA: SCIENCE COLLABORATION NETWORKS

Erdös: 1,400 papers, 507 coauthors

EN = Erdös number

Einstein: EN=2
Paul Samuelson: EN=5

….

ALB: EN=3

Page 52: Class 3: Random Graphs

NETWORK DATA: SCIENCE COLLABORATION NETWORKS

Collaboration Network:
Nodes: Scientists
Links: Joint publications

Physical Review: 1893-2009.

N=449,673
L=4,707,958

See also Stanford Large Network database http://snap.stanford.edu/data/#canets.
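With N and L in hand, the random-graph predictions from earlier in the class can be evaluated for this dataset. The sketch below uses only the two numbers on the slide and the formulas <k> = 2L/N, <l> ≈ log N / log <k>, and C ≈ <k>/N; the function name is an assumption, and the results are what a random network of this size would show, not measured properties of the real collaboration network.

```python
from math import log


def random_graph_predictions(n_nodes, n_links):
    """Random-network predictions for average degree, path length and clustering."""
    mean_k = 2 * n_links / n_nodes          # each link contributes to two degrees
    path_length = log(n_nodes) / log(mean_k)
    clustering = mean_k / n_nodes
    return mean_k, path_length, clustering


# Physical Review collaboration network, values taken from the slide.
mean_k, path_length, clustering = random_graph_predictions(449_673, 4_707_958)
print(f"<k> = {mean_k:.2f}, predicted <l> = {path_length:.2f}, predicted C = {clustering:.2e}")
```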

Page 53: Class 3: Random Graphs
