COMS 6998-06 Network Theory Week 2: September 15, 2010 Dragomir R. Radev Wednesdays, 6:10-8 PM 325 Pupin Terrace Fall 2010


Page 1: COMS 6998-06 Network Theory Week 2: September 15, 2010

COMS 6998-06 Network Theory
Week 2: September 15, 2010

Dragomir R. Radev
Wednesdays, 6:10-8 PM

325 Pupin Terrace
Fall 2010

Page 2: COMS 6998-06 Network Theory Week 2: September 15, 2010

(3) Random graphs

Page 3: COMS 6998-06 Network Theory Week 2: September 15, 2010

Statistical analysis of networks

• We want to be able to describe the behavior of networks under certain assumptions.

• The behavior is described by the diameter, clustering coefficient, degree distribution, size of the largest connected component, the presence and count of complete subgraphs, etc.

• For statistical analysis, we need to introduce the concept of a random graph.

Page 4: COMS 6998-06 Network Theory Week 2: September 15, 2010

Erdos-Renyi model

• A very simple model with several variants.

• We fix n and connect each candidate edge with probability p. This defines an ensemble $G_{n,p}$.

• The two examples shown on the slide are specific instances of $G_{10,0.2}$.

In other models, the number of edges m is fixed. There are also versions in which some graphs are more likely than others, etc.

Try Pajek
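
A minimal sketch of how one instance of this ensemble could be sampled, assuming the variant in which every candidate edge is included independently with probability p. The function name `sample_gnp` and the `seed` parameter are illustrative choices, not part of the course material.

```python
import itertools
import random


def sample_gnp(n, p, seed=None):
    """Return the edge list of one graph drawn from G(n, p).

    Nodes are labeled 0..n-1. Each of the n(n-1)/2 candidate edges
    is kept independently with probability p.
    """
    rng = random.Random(seed)
    return [
        (i, j)
        for i, j in itertools.combinations(range(n), 2)
        if rng.random() < p
    ]


if __name__ == "__main__":
    # Two different instances of G(10, 0.2), as on the slide.
    print(sample_gnp(10, 0.2, seed=1))
    print(sample_gnp(10, 0.2, seed=2))
```

Fixing the seed makes an instance reproducible; different seeds give different members of the same ensemble.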

Page 5: COMS 6998-06 Network Theory Week 2: September 15, 2010

Erdos-Renyi model

• We are interested in the computation of specific properties of E-R random graphs.

• The number of candidate edges is:

$$\binom{n}{2} = \frac{n(n-1)}{2}$$

• The actual number of edges m is on average:

$$\langle m \rangle = \frac{p\,n(n-1)}{2}$$

• We will look at the actual distribution in a bit.
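
As a worked example of the two counts on this slide (n(n-1)/2 candidate edges, and p times that many edges on average), the sketch below evaluates them for the $G_{10,0.2}$ ensemble mentioned earlier. The function names are hypothetical helpers introduced only for illustration.

```python
def candidate_edges(n):
    """Number of candidate (undirected, non-loop) edges on n nodes."""
    return n * (n - 1) // 2


def expected_edges(n, p):
    """Average number of edges when each candidate edge has probability p."""
    return p * n * (n - 1) / 2


if __name__ == "__main__":
    n, p = 10, 0.2
    print("candidate edges:", candidate_edges(n))
    print("expected edges: ", expected_edges(n, p))
```

For n = 10 there are 45 candidate edges, so with p = 0.2 a typical instance has about 9 edges.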

Page 6: COMS 6998-06 Network Theory Week 2: September 15, 2010

Properties

• The expected value of a Poisson-distributed random variable is equal to λ and so is its variance.

• The mode of a Poisson-distributed random variable with non-integer λ is equal to floor(λ), which is the largest integer less than or equal to λ. When λ is a positive integer, the modes are λ and λ − 1.
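
These properties can be checked numerically. The sketch below builds a truncated Poisson probability table and reports its mean, variance, and mode(s). The helper name `poisson_summary` and the truncation point `k_max` are assumptions made for illustration; the truncation is harmless only when λ is well below `k_max`.

```python
import math


def poisson_summary(lam, k_max=100):
    """Return (mean, variance, modes) of a Poisson(lam) table truncated at k_max."""
    # Build p(0), p(1), ... with the recurrence p(k) = p(k-1) * lam / k,
    # which avoids computing large factorials.
    pmf = [math.exp(-lam)]
    for k in range(1, k_max + 1):
        pmf.append(pmf[-1] * lam / k)

    mean = sum(k * q for k, q in enumerate(pmf))
    variance = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    peak = max(pmf)
    modes = [k for k, q in enumerate(pmf) if math.isclose(q, peak, rel_tol=1e-12)]
    return mean, variance, modes


if __name__ == "__main__":
    for lam in (2.5, 3.0):
        print(lam, poisson_summary(lam))
```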

Page 7: COMS 6998-06 Network Theory Week 2: September 15, 2010

Degree distribution

• The probability p(k) that a node has degree k is binomial:

$$p(k) = \binom{n-1}{k} p^k (1-p)^{n-1-k}$$

• In practice, this is the Poisson distribution for large n (n >> kz), where λ = z is the mean degree:

$$p(k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

• Average degree = λ = 2m/n = p(n-1) ≈ pn
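
To see how close the binomial degree distribution is to its Poisson approximation for large n, the following sketch compares the two term by term. The function names and the example values n = 1000, p = 0.005 are assumptions chosen for illustration only.

```python
import math


def binomial_degree_pmf(k, n, p):
    """Probability that a node in G(n, p) has degree k (n - 1 possible neighbors)."""
    return math.comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


def poisson_degree_pmf(k, mean_degree):
    """Poisson approximation with the same mean degree."""
    return mean_degree**k * math.exp(-mean_degree) / math.factorial(k)


if __name__ == "__main__":
    n, p = 1000, 0.005
    mean_degree = p * (n - 1)
    for k in range(11):
        b = binomial_degree_pmf(k, n, p)
        q = poisson_degree_pmf(k, mean_degree)
        print(f"k={k:2d}  binomial={b:.5f}  poisson={q:.5f}")
```

`math.comb` requires Python 3.8 or later.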

Page 8: COMS 6998-06 Network Theory Week 2: September 15, 2010

Giant component size

• Let v be the number of nodes that are not in the giant component. Then u = v/n is the fraction of the graph outside of the giant component.

• If a node is outside of the giant component, its k neighbors are too. The probability of this happening is $u^k$. Averaging over the Poisson degree distribution $p_k$ with mean degree λ:

$$u = \sum_{k=0}^{\infty} p_k u^k = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda u)^k}{k!} = e^{\lambda(u-1)}$$

• Let S = 1 - u. We now have

$$S = 1 - e^{-\lambda S}$$

For λ < 1, the only non-negative solution is S = 0. For λ > 1 (after the phase transition), S = 0 is still a solution, but the positive solution is the size of the giant component.

• At the phase transition, the component sizes are distributed according to a power law with exponent 5/2.
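
The self-consistency condition for the giant component, S = 1 - exp(-λS) with λ the mean degree, has no closed-form solution, but it can be solved numerically. The sketch below uses plain fixed-point iteration starting from S = 1; this is one simple choice of method, not necessarily the one used in the course, and the function name and tolerances are assumptions.

```python
import math


def giant_component_fraction(mean_degree, tol=1e-12, max_iter=10_000):
    """Solve S = 1 - exp(-mean_degree * S) by fixed-point iteration.

    Starting from S = 1 makes the iteration settle on the largest
    non-negative solution. Convergence is slow very close to
    mean_degree = 1, where the returned value is only approximate.
    """
    s = 1.0
    for _ in range(max_iter):
        s_next = 1.0 - math.exp(-mean_degree * s)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s


if __name__ == "__main__":
    for lam in (0.5, 1.5, 2.0, 4.0):
        print(f"mean degree {lam}: S = {giant_component_fraction(lam):.4f}")
```

Below the transition the iteration collapses to zero; above it, the result grows quickly toward 1 as the mean degree increases.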

Page 9: COMS 6998-06 Network Theory Week 2: September 15, 2010

Giant component size

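• Similarly, one can prove that the mean size ⟨s⟩ of the component to which a randomly chosen vertex outside the giant component belongs is:

$$\langle s \rangle = \frac{1}{1 - \lambda + \lambda S}$$

[Newman 2003]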

Page 10: COMS 6998-06 Network Theory Week 2: September 15, 2010

Diameter

• A given vertex i has $N_i^{(1)}$ first neighbors. The expected value of this number is λ.

• But we also know that λ ≈ pn.

• Now move to $N_i^{(2)}$. This is the number of second neighbors of i. Let's make the assumption that these are the neighbors of the first neighbors. So,

$$N_i^{(2)} \approx \lambda N_i^{(1)} \approx \lambda^2$$

• What does this remind you of?

• When must the procedure end?

Page 11: COMS 6998-06 Network Theory Week 2: September 15, 2010

Diameter (cont’d)

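At all distances:

$$N_i^{(D)} \approx \lambda^D$$

For D equal to the diameter of the graph:

$$n \approx N_i^{(1)} + N_i^{(2)} + \cdots + N_i^{(D)} \approx \lambda^D$$

In other words (after taking a logarithm):

$$D \approx \frac{\log n}{\log \lambda}$$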
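
The logarithmic estimate above (diameter ≈ log n divided by the log of the mean degree) is easy to evaluate. The sketch below is only that heuristic, not an exact diameter computation; the function name is a hypothetical helper, and the estimate is only meaningful for a mean degree greater than 1.

```python
import math


def er_diameter_estimate(n, mean_degree):
    """Heuristic diameter of an E-R graph: log(n) / log(mean_degree)."""
    if mean_degree <= 1:
        raise ValueError("the estimate assumes a mean degree greater than 1")
    return math.log(n) / math.log(mean_degree)


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(n, round(er_diameter_estimate(n, 10), 2))
```

Multiplying n by a thousand adds only a few steps to the estimate, which is the small-world behavior discussed next.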

Page 12: COMS 6998-06 Network Theory Week 2: September 15, 2010

Are E-R graphs realistic?

• They have small-world properties (diameter is logarithmic in the size of the graph).

• But they have a low clustering coefficient. For example, for Internet autonomous systems, compare the measured value of 0.30 with 0.0004 for the corresponding random graph [Pastor-Satorras and Vespignani].

• And unrealistic degree distributions.

• Not to mention skinny tails.

Page 13: COMS 6998-06 Network Theory Week 2: September 15, 2010

Clustering coefficient

• Given a vertex i and two of its neighbors j and k, what is the probability that the graph contains an edge between j and k?

• $C_i$ = #triangles at i / #triples at i

• C = average over all $C_i$

• Typical value in real graphs can be as high as 50% [Newman 2002].

• In random graphs, C = p (ignoring the fact that j and k share a neighbor, i).
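
A minimal sketch of the definition above: the local coefficient is the number of triangles at a vertex divided by the number of neighbor pairs (triples) centered on it, and C is the average over all vertices. The adjacency-dictionary representation, the function names, and the convention that vertices with fewer than two neighbors get a coefficient of 0 are assumptions made for this example.

```python
def local_clustering(adj, i):
    """Triangles at i divided by triples centered at i.

    adj maps each vertex to the set of its neighbors (undirected graph,
    comparable vertex labels).
    """
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed here: no triples means C_i = 0
    triples = k * (k - 1) // 2
    triangles = sum(
        1 for a in neighbors for b in neighbors if a < b and b in adj[a]
    )
    return triangles / triples


def average_clustering(adj):
    """Average of the local coefficients over all vertices."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)


if __name__ == "__main__":
    # A triangle 0-1-2 with an extra vertex 3 attached to vertex 0.
    adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
    print({i: local_clustering(adj, i) for i in adj})
    print(average_clustering(adj))
```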

Page 14: COMS 6998-06 Network Theory Week 2: September 15, 2010

Some real networks

• From Newman 2002:

Network                      n           Mean degree z   Cc      Cc for random graph
Internet (AS level)          6,374       3.8             0.24    0.00060
WWW (sites)                  153,127     35.2            0.11    0.00023
Power grid                   4,941       2.7             0.080   0.00054
Biology collaborations       1,520,251   15.5            0.081   0.000010
Mathematics collaborations   253,339     3.9             0.15    0.000015
Film actor collaborations    449,913     113.4           0.20    0.00025
Company directors            7,673       14.4            0.59    0.0019
Word co-occurrence           460,902     70.1            0.44    0.00015
Neural network               282         14.0            0.28    0.049
Metabolic network            315         28.3            0.59    0.090
Food web                     134         8.7             0.22    0.065
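
The random-graph column of this table appears to be consistent with C = p ≈ z/n, where z is the mean degree. The sketch below recomputes that value for three of the rows above and compares it with the measured coefficient; the function name is a hypothetical helper and the z/n reading of the column is an inference from the numbers, not something stated on the slide.

```python
def random_graph_clustering(n, mean_degree):
    """Clustering coefficient of an E-R graph with the same n and mean degree."""
    return mean_degree / n


# (n, mean degree z, measured Cc) copied from three rows of the table.
ROWS = {
    "Internet (AS level)": (6374, 3.8, 0.24),
    "Power grid": (4941, 2.7, 0.080),
    "Food web": (134, 8.7, 0.22),
}

if __name__ == "__main__":
    for name, (n, z, measured) in ROWS.items():
        predicted = random_graph_clustering(n, z)
        print(f"{name}: measured {measured}, random graph {predicted:.5f}")
```

The gap between the two values, several orders of magnitude for the large networks, is the mismatch the previous slides point out.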

Page 15: COMS 6998-06 Network Theory Week 2: September 15, 2010

[Newman 2002]

Page 16: COMS 6998-06 Network Theory Week 2: September 15, 2010

Graphs with predetermined degree sequences

• Bender and Canfield introduced this concept.

• For a given degree sequence, give the same statistical weight to all graphs in the ensemble.

• Generate a random sequence in proportion to the predefined sequence.

• Note that the sum of degrees must be even.
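
One common way to draw a graph with a prescribed degree sequence is stub matching (often called the configuration model): give each vertex as many half-edges as its degree, shuffle them, and pair them up. The sketch below assumes that approach, which the slide does not name explicitly; it can produce self-loops and repeated edges, and the function name is an illustrative choice.

```python
import random


def configuration_model(degrees, seed=None):
    """Pair up half-edges ("stubs") at random to realize a degree sequence.

    Returns a list of (u, v) pairs. Self-loops and multi-edges are possible;
    this sketch does not remove or reject them.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of degrees must be even")
    rng = random.Random(seed)
    # Vertex v contributes degrees[v] stubs.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list form the edges.
    return list(zip(stubs[0::2], stubs[1::2]))


if __name__ == "__main__":
    print(configuration_model([3, 2, 2, 1], seed=0))
```

The parity check at the top is the condition noted in the last bullet: every edge uses exactly two stubs, so an odd total cannot be paired.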

Page 17: COMS 6998-06 Network Theory Week 2: September 15, 2010

(4) Software

Page 18: COMS 6998-06 Network Theory Week 2: September 15, 2010

List of packages

• Pajek: http://vlado.fmf.uni-lj.si/pub/networks/pajek/

• Jung: http://jung.sourceforge.net/

• Guess: http://graphexploration.cond.org/

• Networkx: https://networkx.lanl.gov/wiki

• Pynetconv: http://pynetconv.sourceforge.net/

• Clairlib: http://www.clairlib.org

• UCINET

Page 19: COMS 6998-06 Network Theory Week 2: September 15, 2010