Web as Graph – Empirical Studies

The Structure and Dynamics of Networks

Chapter-3

• Broder et al., Graph Structure in the Web. Computer Networks 33, 309-320 (2000).

• Faloutsos et al., On Power-Law Relationships of the Internet Topology. SIGCOMM 1999.

• M. E. J. Newman, The Structure of Scientific Collaboration Networks. PNAS 98(2), 404-409 (2001).

How does it help?

• Answer questions like:
– What does the Internet look like?
– Are there any topological properties that don’t change over time?
– What will it look like a year from now?
– How can I generate Internet-like graphs for my simulations?
• Designing crawl strategies on the Web.
• Understanding the sociology of content creation on the Web.
• Analyzing the behavior of Web algorithms that use link information.
• Predicting the evolution of Web structures such as bipartite cores.
• Predicting the emergence of new phenomena in the Web graph.

Power Law

$y \propto x^{-k}$, where $k > 1$

[Figure: power-law plot and the corresponding log-log plot]
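A power law looks like a straight line on a log-log plot: taking logarithms of y = c·x^(-k) gives log y = log c - k·log x, so the slope of the line is -k. The following is a minimal sketch of reading the exponent off that line with an ordinary least-squares fit. The function name and the synthetic data are assumptions made for illustration, and a plain log-log regression is only a rough estimator on noisy empirical data.

```python
import math

def fit_power_law_exponent(xs, ys):
    """Fit a straight line to (log x, log y); return (slope, intercept)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Ordinary least squares on the log-transformed points.
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic (not measured) data that follows y = 5 * x**(-2) exactly.
xs = list(range(1, 51))
ys = [5.0 * x ** -2.0 for x in xs]

slope, intercept = fit_power_law_exponent(xs, ys)
k_estimate = -slope               # exponent k of the power law
c_estimate = math.exp(intercept)  # constant of proportionality
print(k_estimate, c_estimate)
```

On real measurements the points scatter around the line, so the fitted slope is an estimate rather than an exact exponent.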

Scale-free network: “Scale-free networks' structure and dynamics are independent of the system's size N, the number of nodes the system has.”

-Wikipedia

Examples

• Access statistics of web pages
• Number of times users at a single site access particular pages
• PageRank of web pages
• In-degree and out-degree of web pages
• Amazon’s online store
• Library book records
• All preferential attachment models
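Preferential attachment is one way to generate graphs with heavy-tailed degree distributions for simulations. The sketch below is one simple variant, not a model taken from the cited papers: each new node links to `m_links` distinct existing nodes chosen with probability proportional to their current degree. The function name, parameters, and the fully connected starting core are assumptions made for illustration.

```python
import random

def preferential_attachment_graph(n_nodes, m_links, seed=None):
    """Return an undirected edge list grown by degree-proportional attachment."""
    rng = random.Random(seed)
    edges = []
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list picks a node with probability proportional to degree.
    endpoints = []
    # Assumed starting point: a fully connected core of m_links + 1 nodes.
    for u in range(m_links + 1):
        for v in range(u + 1, m_links + 1):
            edges.append((u, v))
            endpoints.extend((u, v))
    for new_node in range(m_links + 1, n_nodes):
        targets = set()
        while len(targets) < m_links:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new_node, target))
            endpoints.extend((new_node, target))
    return edges

edges = preferential_attachment_graph(200, 2, seed=1)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print(len(edges), max(degree.values()))
```

Nodes that join early tend to collect many more links than late arrivals, which is what produces the heavy tail.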

Power Laws

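Rank Exponent: $d_v \propto r_v^{\mathcal{R}}$, where $d_v$ is the outdegree of node $v$ and $r_v$ is its rank when nodes are sorted by decreasing outdegree.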

Power Laws

Outdegree Exponent: $f_d \propto d^{\mathcal{O}}$, where $f_d$ is the number of nodes with outdegree $d$.
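Both exponents are estimated from simple tabulations of the outdegree sequence: (rank, outdegree) pairs for the rank exponent, and (outdegree, frequency) pairs for the outdegree exponent. The sketch below builds both tables; the function name and the toy outdegree list are assumptions for illustration, not data from the paper.

```python
from collections import Counter

def rank_and_frequency(out_degrees):
    """Return (rank, outdegree) pairs and (outdegree, frequency) pairs."""
    # Rank 1 is the node with the largest outdegree; ties get consecutive ranks.
    ranked = sorted(out_degrees, reverse=True)
    rank_pairs = list(enumerate(ranked, start=1))
    # Frequency of an outdegree d is the number of nodes having outdegree d.
    freq_pairs = sorted(Counter(out_degrees).items())
    return rank_pairs, freq_pairs

# Toy outdegree sequence (hypothetical, eight nodes).
out_degrees = [1, 1, 1, 1, 2, 2, 3, 5]
rank_pairs, freq_pairs = rank_and_frequency(out_degrees)
print(rank_pairs)
print(freq_pairs)
```

Plotting either table on log-log axes and fitting a line gives an estimate of the corresponding exponent as the slope.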

Web Graph

Scientific Collaboration Network

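Similar to the Erdős number: two scientists are connected if they have co-authored a paper, and distance is counted in co-authorship steps.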
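An Erdős-number-style distance is a shortest-path length in the co-authorship graph, which breadth-first search computes directly. This is a minimal sketch; the author labels and the `coauthors` mapping are hypothetical and not drawn from Newman's databases.

```python
from collections import deque

def collaboration_distances(coauthors, source):
    """Breadth-first search: co-authorship distance from `source` to each reachable author."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        author = queue.popleft()
        for other in coauthors.get(author, ()):
            if other not in dist:
                dist[other] = dist[author] + 1
                queue.append(other)
    return dist

# Hypothetical co-authorship lists (each link is listed in both directions).
coauthors = {
    "Erdos": ["A", "B"],
    "A": ["Erdos", "C"],
    "B": ["Erdos"],
    "C": ["A", "D"],
    "D": ["C"],
    "E": [],  # no co-authors, so unreachable from "Erdos"
}
distances = collaboration_distances(coauthors, "Erdos")
print(distances)
```

Authors missing from the result have no path to the source, which corresponds to an undefined (infinite) number.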

Observations

• Newman’s data does not follow a pure power law
– Presumably because the study was conducted over a finite time window of 5 years.
• In most of the databases, the largest connected group contains around 80-90% of all authors
– Authors are very highly connected.
– There is no immediate danger of fragmentation.
• Six degrees of separation holds.
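The share of authors in the largest connected group can be measured by finding connected components and dividing the size of the biggest one by the total number of authors. The sketch below assumes an undirected graph stored as a dictionary in which every author appears as a key; the function name and the toy data are hypothetical.

```python
def largest_component_fraction(adjacency):
    """Fraction of nodes that lie in the largest connected component."""
    seen = set()
    largest = 0
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal of the component that contains `start`.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        largest = max(largest, size)
    return largest / len(adjacency)

# Toy network (hypothetical): a chain of 8 authors plus one separate pair.
adjacency = {i: [] for i in range(10)}
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (8, 9)]:
    adjacency[u].append(v)
    adjacency[v].append(u)

fraction = largest_component_fraction(adjacency)
print(fraction)
```

A value in the 0.8 to 0.9 range would match the observation that most authors belong to a single connected group.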
