Network theory David Lusseau BIOL4062/5062 d.lusseau@dal.ca

Outline

Today: basics of graph theory and network statistics

8 March: incorporating uncertainty, network models

13 March: community structure

Suggested readings:

Newman M.E.J. 2003. The structure and function of complex networks. SIAM Review 45: 167-256

What is a network

Set of objects (vertices) with connections (edges)

Represented by an adjacency matrix or a list

Adjacency matrix:

    1  2  3  4
1   0  0  0  1
2   0  0  1  0
3   0  1  0  1
4   1  0  0  1

Edge list:

v1    v2      weight
Hal   John    5
John  George  10
Liz   Hal     2
Beth  Liz     1
Beth  John    20
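The two representations carry the same information. A minimal sketch of converting the edge list above into a weighted adjacency matrix follows; the function name `edge_list_to_matrix` is hypothetical, and the edges are assumed to be undirected, which the slide does not state.

```python
# Edge list copied from the slide: (v1, v2, weight).
edges = [
    ("Hal", "John", 5),
    ("John", "George", 10),
    ("Liz", "Hal", 2),
    ("Beth", "Liz", 1),
    ("Beth", "John", 20),
]


def edge_list_to_matrix(edges):
    """Return (sorted vertex names, weighted adjacency matrix as nested lists)."""
    names = sorted({v for v1, v2, _ in edges for v in (v1, v2)})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for v1, v2, w in edges:
        i, j = index[v1], index[v2]
        matrix[i][j] = w
        matrix[j][i] = w  # assumption: edges are undirected, so the matrix is symmetric
    return names, matrix


names, matrix = edge_list_to_matrix(edges)
for name, row in zip(names, matrix):
    print(name, row)
```

For a directed network, drop the symmetric assignment so that row i, column j only records the edge from i to j.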

Types of networks

Undirected graph (weighted or not)

Directed graph (weighted or not)

Cyclic (contains cycles)

Acyclic (no cycles)

Hypergraph (one edge can join more than two vertices)

Undirected graph

Directed graph

cyclic

acyclic?

Cycle: <(a,b),(b,c),(c,a)>
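Whether a directed graph is cyclic can be checked with a depth-first search. The sketch below is one common way to do it, not necessarily the approach used in the course; `has_cycle` is a hypothetical name, and the graph is assumed to be given as (source, target) pairs.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (source, target) pairs has a cycle."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
        successors.setdefault(v, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current search path, finished
    state = {v: WHITE for v in successors}

    def visit(u):
        state[u] = GREY
        for v in successors[u]:
            if state[v] == GREY:  # edge back to a vertex on the current path
                return True
            if state[v] == WHITE and visit(v):
                return True
        state[u] = BLACK
        return False

    return any(state[v] == WHITE and visit(v) for v in successors)


cyclic = has_cycle([("a", "b"), ("b", "c"), ("c", "a")])  # the cycle from the slide
acyclic = has_cycle([("a", "b"), ("b", "c"), ("a", "c")])
print(cyclic, acyclic)
```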

Hypergraph

Meyers et al. 2004 J. Theor. Biol.

Some terminology

Component: set of interconnected vertices (s)

(in- and out- components in a directed graph)

Giant component: the largest component in the graph (S)
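Components of an undirected graph can be found with a breadth-first search from each unvisited vertex. This is a minimal sketch on a made-up six-vertex graph; the function name `components` and the example data are illustrative only.

```python
from collections import deque


def components(vertices, edges):
    """Return the components of an undirected graph as a list of vertex sets."""
    neighbours = {v: set() for v in vertices}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    seen, result = set(), []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), {start}
        while queue:
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in seen:
                    seen.add(v)
                    members.add(v)
                    queue.append(v)
        result.append(members)
    return result


# Hypothetical example: three components of sizes 3, 2 and 1.
vertices = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
parts = components(vertices, edges)
giant = max(parts, key=len)  # the giant component is the largest one
print(parts, giant)
```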

Some terminology

Degree: number of edges connected to a vertex (k)

(in- and out- degrees in a directed graph)

Geodesic path: the shortest path through the network from one vertex to another (l)

Diameter: length of the longest geodesic path (d)

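Figure labels (vertices v and edges e per component): v=7, e=9; v=19, e=27; v=3, e=2; v=1, e=0

Figure labels (degree): k=0; k=4; kin=4, kout=4; kin=2, kout=3; kin=2, kout=1

Figure labels (paths): l(a,b)=2; Component 4: d(4)=5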
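Degree, geodesic length and diameter can all be computed from a neighbour list and a breadth-first search. The sketch below uses a made-up path graph, not the graph from the figure; all function names are hypothetical, and the diameter function assumes the graph is a single component.

```python
from collections import deque


def build_neighbours(edges):
    """Neighbour sets of an undirected graph given as (u, v) pairs."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    return neighbours


def geodesic_lengths(neighbours, source):
    """Length of the shortest path from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbours[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(neighbours):
    """Longest geodesic; assumes the graph is a single component."""
    return max(max(geodesic_lengths(neighbours, s).values()) for s in neighbours)


# Hypothetical path graph: a - c - b - d - e
neighbours = build_neighbours([("a", "c"), ("c", "b"), ("b", "d"), ("d", "e")])
k_c = len(neighbours["c"])                      # degree of c
l_ab = geodesic_lengths(neighbours, "a")["b"]   # geodesic length l(a,b)
d = diameter(neighbours)
print(k_c, l_ab, d)
```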

Other centrality measures

Betweenness

Eigenvector

Reach

Clustering coefficient

Betweenness and bottleneck

Number of geodesic paths passing through a vertex

Example (figure: network with vertices A, B, C, D, E)

Betweenness of B = 1 + 1 + 1 = 3

Betweenness of D = ½ + ½ = 1
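The fractional counts above arise when a pair of vertices is joined by more than one geodesic: each geodesic then contributes an equal share. A minimal sketch of that counting rule follows. The five-vertex graph is made up (it is not the graph from the figure), the function names are hypothetical, and each unordered pair of vertices is assumed to be counted once.

```python
from collections import deque
from itertools import combinations


def bfs_counts(neighbours, source):
    """Return (dist, sigma): geodesic length and number of geodesics from source."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbours[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u lies on a geodesic to v
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(neighbours):
    """Share of geodesics through each vertex, summed over unordered pairs."""
    info = {s: bfs_counts(neighbours, s) for s in neighbours}
    score = {v: 0.0 for v in neighbours}
    for s, t in combinations(neighbours, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue  # s and t are in different components
        for v in neighbours:
            if v in (s, t) or v not in dist_s:
                continue
            if dist_s[v] + dist_t[v] == dist_s[t]:  # v is on some s-t geodesic
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score


# Hypothetical graph: a square a-b-c-d with a pendant vertex e attached to a.
neighbours = {
    "a": {"b", "d", "e"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c"},
    "e": {"a"},
}
score = betweenness(neighbours)
print(score)
```

In this example b lies on one of the two geodesics between a and c and on one of the two between c and e, so its betweenness is ½ + ½ = 1, while a is the bottleneck for every path to e.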

Eigenvector

Eigenvector centrality: the eigenvector of the adjacency matrix associated with its dominant eigenvalue

e_i integrates the connectivity of i (its degree) and the connectivity of its neighbours
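One simple way to approximate this eigenvector is power iteration: repeatedly replace each vertex's score by the sum of its neighbours' scores and rescale. The sketch below is an illustration under assumptions, not the method used by any particular package: the graph is taken to be connected and undirected, the function name is hypothetical, and the iteration multiplies by (A + I) rather than A, which keeps the same eigenvectors but avoids oscillation on bipartite graphs.

```python
def eigenvector_centrality(neighbours, iterations=200):
    """Approximate the dominant eigenvector, scaled so the largest entry is 1."""
    x = {v: 1.0 for v in neighbours}
    for _ in range(iterations):
        # One multiplication by (A + I): own score plus the neighbours' scores.
        new = {v: x[v] + sum(x[u] for u in neighbours[v]) for v in neighbours}
        norm = max(new.values())
        x = {v: value / norm for v, value in new.items()}
    return x


# Hypothetical star graph: hub h connected to leaves p, q, r.
star = {"h": {"p", "q", "r"}, "p": {"h"}, "q": {"h"}, "r": {"h"}}
centrality = eigenvector_centrality(star)
print(centrality)
```

For the star, the hub scores 1 and each leaf scores 1/√3: a leaf has only one neighbour, but that neighbour is well connected.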

Reach

Number of vertices that can be reached in k steps, as a proportion of the vertices in the network

Typically k is 2 or fewer steps

A centrality measure that also integrates link redundancy (are your friends only talking to your friends?)
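Reach can be sketched as a breadth-first search that stops after a fixed number of steps. Two details here are assumptions, since the slide does not fix them: the starting vertex is not counted as reached, and the proportion is taken over the other N - 1 vertices. The function name `reach` is hypothetical.

```python
from collections import deque


def reach(neighbours, source, steps=2):
    """Proportion of the other vertices reachable from source within `steps` steps."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == steps:
            continue  # do not expand beyond the step limit
        for v in neighbours[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Assumption: exclude the source itself from numerator and denominator.
    return (len(dist) - 1) / (len(neighbours) - 1)


# Hypothetical path graph: a - b - c - d - e
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
reach_a = reach(path, "a")
reach_c = reach(path, "c")
print(reach_a, reach_c)
```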

Clustering coefficient

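C = (3 × number of triangles) / (number of connected triples)

Example: 1 triangle, 8 connected triples: C = (3 × 1)/8 = 3/8

Each triangle contributes to 3 triples

Local clustering coefficient: C_i = (number of triangles connected to i) / (number of triples connected to i)

Figure labels (local values): 3/3 = 1; 3/3 = 1; 0/1 = 0; 0/1 = 0; 3/6 = 0.5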
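Both coefficients can be computed by checking, for every pair of neighbours of a vertex, whether that pair is itself connected. The sketch below uses a made-up graph chosen to give the same counts as the global example (1 triangle, 8 connected triples); it is not the graph from the figure. Returning 0 for vertices with fewer than two neighbours is an assumption, and the function names are hypothetical.

```python
from itertools import combinations


def local_clustering(neighbours, i):
    """Closed pairs of neighbours of i divided by all pairs of neighbours of i."""
    k = len(neighbours[i])
    triples = k * (k - 1) // 2
    if triples == 0:
        return 0.0  # assumption: undefined cases are reported as 0
    closed = sum(1 for u, v in combinations(neighbours[i], 2) if v in neighbours[u])
    return closed / triples


def global_clustering(neighbours):
    """C = 3 * triangles / connected triples."""
    triples = sum(len(n) * (len(n) - 1) // 2 for n in neighbours.values())
    # Each triangle is found once per corner, so this sum equals 3 * triangles.
    closed = sum(
        1
        for i in neighbours
        for u, v in combinations(neighbours[i], 2)
        if v in neighbours[u]
    )
    return closed / triples if triples else 0.0


# Hypothetical graph: h is linked to a, b, c, d, and a is linked to b.
graph = {
    "h": {"a", "b", "c", "d"},
    "a": {"h", "b"},
    "b": {"h", "a"},
    "c": {"h"},
    "d": {"h"},
}
C = global_clustering(graph)
C_local = {v: local_clustering(graph, v) for v in graph}
print(C, C_local)
```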

Dealing with weighted matrices

First option: do not deal with them

Ignore the weight of the edges

Transform the weighted matrices into binary matrices

Meaningful measures: w_ij > expected by chance

Significance and relevance to hypotheses

Keep edge (i, j) if $w_{ij} > w$, where $w$ is the chosen cutoff weight
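Turning a weighted matrix into a binary one amounts to comparing every weight with a cutoff. A minimal sketch, assuming a strict "greater than" rule and a cutoff supplied by the user; the function name `binarise` and the example weights are made up, and choosing the cutoff (for example from the weight expected by chance) is left to the analysis.

```python
def binarise(weights, cutoff):
    """Return a 0/1 matrix with 1 wherever the weight exceeds the cutoff."""
    return [[1 if w > cutoff else 0 for w in row] for row in weights]


# Hypothetical weighted matrix for three vertices.
weights = [
    [0, 5, 2],
    [5, 0, 10],
    [2, 10, 0],
]
binary = binarise(weights, cutoff=4)
print(binary)
```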

Extending to weighted matrices

Retrieve more information

Relevance of binary matrix statistics

strength ↔ degree:

$s_i = \sum_{j=1}^{N} a_{ij}$
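Strength is to a weighted network what degree is to a binary one: the sum of the weights of a vertex's edges rather than their count. A short sketch using the weighted edge list shown earlier in these slides, with the edges assumed to be undirected; the function name is hypothetical.

```python
# Weighted edge list from the earlier slide: (v1, v2, weight).
edges = [
    ("Hal", "John", 5),
    ("John", "George", 10),
    ("Liz", "Hal", 2),
    ("Beth", "Liz", 1),
    ("Beth", "John", 20),
]


def degree_and_strength(edges):
    """Count edges (degree) and sum edge weights (strength) for every vertex."""
    degree, strength = {}, {}
    for v1, v2, w in edges:
        for v in (v1, v2):  # assumption: undirected, so both ends are credited
            degree[v] = degree.get(v, 0) + 1
            strength[v] = strength.get(v, 0) + w
    return degree, strength


degree, strength = degree_and_strength(edges)
for v in degree:
    print(v, degree[v], strength[v])
```

Hal and Beth both have degree 2, but their strengths are 7 and 21, which is the extra information the weights carry.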

Some examples of real world networks

Social networks

Contact networks

Food webs

Man-made networks (internet, electricity grid)

Metabolite interactions

…

High school dating

Bearman et al. 2004 Am. J. Soc.

Graph by M.E.J. Newman

High school friendship

Moody 2001 Am. J. Soc.

Internet

Cheswick, Bell Labs

Food web: Caribbean coral reef system

Human protein-protein interactions

Chinnaiyan et al. 2005 Nature Biotech

Tools for network analyses

Ucinet/Netdraw (http://www.analytictech.com/)

Socprog (http://myweb.dal.ca/hwhitehe/social.htm)

Pajek (http://vlado.fmf.uni-lj.si/pub/networks/pajek/)

Jung (JAVA) (http://jung.sourceforge.net)

SNA (R package) (http://erzuli.ss.uci.edu/R.stuff)

Net.Linux (Linux OS) (http://pil.phys.uniroma1.it/%7Eservedio/software.html)

Visualising large graphs

Graphviz (http://www.graphviz.org)

Yed (http://www.yworks.com/en/products_yed_about.htm)
