Lecture series (Ringvorlesung) of the Research School Business & Economics (RSBE), University of Siegen, Germany, June 28, 2011. Ralf Klamma, RWTH Aachen University.
Lehrstuhl Informatik 5 (Informationssysteme)
Prof. Dr. M. Jarke
I5-KL-111010-1
TeLLNet
GALA
Network Flow and Network Formation: A Social Network Analysis Perspective
Ralf Klamma
Informatik 5 (DBIS), RWTH Aachen University
Lecture series (Ringvorlesung) of the Research School Business & Economics (RSBE), Siegen
June 28, 2011
Agenda
Network Science
Network Flow
Network Formation
Conclusions and Outlook
RWTH Aachen University
• 1,250 spin-off businesses have created around 30,000 jobs in the greater Aachen region over the past 20 years.
• IDEA League
• Germany’s Excellence Initiative: 3 clusters of excellence, a graduate school and the institutional strategy “RWTH Aachen 2020: Meeting Global Challenges”
• 260 institutes in 9 faculties, among Europe’s leading institutions for science and research
• Currently around 31,400 students are enrolled in over 100 academic programs
• Over 5,000 of them are international students hailing from 120 different countries
Community Information Systems Research Group
Established at the DBIS chair, RWTH Aachen University
3 postdocs, 7 PhD students, plus paid student workers and thesis workers
Network Science
Questions within Network Science
How well positioned is an agent to receive and disseminate information?
– experts (centrality measures) [Wasserman & Faust, 1997]
Do users communicate only within their groups, or with some agents from other groups as well?
– innovation stars (boundary spanners, brokers, high betweenness centrality) [Burt, 2005]
Who and what affects an agent?
– influence networks [Lewis, 2008]
Which groups/communities does an agent belong to?
– community mining [Clauset et al., 2004]
Executive Board Networks: TheyRule.net
A prototype as of 2004
What is the connection between Motorola and Whirlpool?
What does the network of academic institutes and companies look like?
Who rules 3M, Motorola, AT&T, Coca-Cola, PepsiCo, and McDonald's?
Spread of Contagion
Source: orgnet.com
Network Science Paradigms
Merging of analytic and engineering paradigms
In an analytic discipline
– To find laws (computing paradigms)
– To generate phenomena
– To explain observed phenomena
In an engineering discipline
– To realize and implement the paradigms of networks
– To understand the cases in which particular technologies should be used
– To store network data efficiently (MediaBase)
Communication serves a purpose
– Scientific disciplines
– Commerce
– Entertainment
– Politics
Web Science: The Long Tail & Fragments
The Web is a scale-free, fragmented network
– The power law (Pareto distribution etc.)
– 95% of users are located in the Long Tail (communities)
– Trust- and passion-based cooperation
[Figure: fragmented structure of the Web with IN Continent, Central Core, OUT Continent, Tunnels, Tendrils, and Islands]
[Barabasi, 2002]
[Anderson, 2006]
Principal Analytic Approach
Interdisciplinary multidimensional model of networks
– Social network analysis (SNA) defines measures for social relations
– Actor-network theory (ANT) connects human and media agents
– The i* framework defines strategic goals and dependencies
– The theory of media transcriptions studies cross-media knowledge
[Figure: Media Networks, linking the following elements]
– Social software: Wiki, Blog, Podcast, IM, Chat, Email, Newsgroup, …
– Network of artifacts: Microcontent, Blog entry, Message, Burst, Thread, Comment, Conversation, Feedback (Rating)
– Network of members: Communities of practice
– i*-Dependencies (Structural, Cross-media)
– Members (Social Network Analysis: Centrality, Efficiency)
MediaBase
Collection of social software artifacts with parameterized Perl scripts
– Mailing lists
– Newsletters
– Web sites
– RSS feeds
– Blogs
Database support by IBM DB2, eXist, Oracle, ...
Web interface based on Firefox plugin, Plone/Zope, widgets, ...
Visualization strategies
– Tree maps
– Cross-media graphs
Klamma et al.: Pattern-Based Cross Media Social Network Analysis for Technology Enhanced Learning in Europe, EC-TEL 2006
Network Flow
Fundamentals: Definitions of Network
A network Γ = (N, L), where N = {1, 2, ..., n} is a (finite) set of nodes (vertices) and L ⊆ N × N is a set of links (edges)
Assumed:
– Unweighted
– No multiple links
  => at most one link exists between two given nodes
  => two nodes joined by a link are neighbors (adjacent)
– Directed or undirected
Fundamentals of networks
Definitions in a Network
Degree of a node: number of incoming and outgoing links
$z_i = |\{ j \in N : ij \in L \}|$
A path is a sequence of nodes v0, …, vn-1 with (vi, vi+1) ∈ L for 0 ≤ i < n-1; equivalently, a path is a set of connected links
Length of a path: number of links on the path
A path is a simple path if all vertices on the path are pairwise different
A cycle is a path with v0 = vn-1 and length n ≥ 2
A subnetwork of a network Γ = (N, L) is a graph Γ’ = (N’, L’) with N’ ⊆ N and L’ ⊆ L
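The definitions above can be tried out on a small example. The following is a minimal sketch, assuming links are stored as a set of directed (i, j) pairs; the network and the function names are hypothetical and not taken from the slides. The cycle check reads "length ≥ 2" as "at least two links".

```python
# Hypothetical example network (not from the slides): directed links as (i, j) pairs.
N = {0, 1, 2, 3}
L = {(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)}

def degree(i, links):
    """Degree of node i: number of outgoing plus incoming links."""
    outgoing = sum(1 for (a, _) in links if a == i)
    incoming = sum(1 for (_, b) in links if b == i)
    return outgoing + incoming

def is_path(nodes, links):
    """True if every consecutive pair of nodes is joined by a link."""
    return all((nodes[k], nodes[k + 1]) in links for k in range(len(nodes) - 1))

def path_length(nodes):
    """Length of a path: number of links on it."""
    return len(nodes) - 1

def is_simple_path(nodes, links):
    """A path is simple if all its vertices are pairwise different."""
    return is_path(nodes, links) and len(set(nodes)) == len(nodes)

def is_cycle(nodes, links):
    """A path that returns to its start and uses at least two links (assumed reading)."""
    return is_path(nodes, links) and nodes[0] == nodes[-1] and path_length(nodes) >= 2

print(degree(0, L))                     # two outgoing links plus one incoming link
print(is_simple_path([0, 1, 2, 3], L))
print(is_cycle([0, 1, 2, 3, 0], L))
```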
Representation of Networks
Adjacency matrix representation
An n × n matrix A, where
$a_{ij} = 1$ if $(i, j) \in L$, and $a_{ij} = 0$ otherwise
Neighborhood
$\mathcal{N}_i \equiv \{ j \in N : (i, j) \in L \}$
Any network is the collection of neighborhoods
$\Gamma = \{ \mathcal{N}_i \}_{i \in N}$
Boolean Adjacency Matrix Example
For network Γ1, the adjacency matrix is as follows: true = 1 if there exists a link between two nodes, false = 0 otherwise

     0  1  2  3  4
0    0  1  0  1  0
1    1  0  0  1  0
2    0  0  1  0  1
3    0  0  0  0  1
4    0  0  1  0  0

[Figure: network Γ1 with nodes 0 to 4; the matrix is annotated with "Incoming degree" and "Outgoing degree"]
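As a small illustration, the matrix of the example can be loaded and queried directly. This sketch assumes the rows and columns as they read in the flattened slide text, and it assumes that row i lists the links leaving node i, so that row sums are outgoing degrees and column sums are incoming degrees.

```python
# Adjacency matrix of the example network, as read from the slide text.
A = [
    [0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
]
n = len(A)

# Link set L recovered from the matrix: (i, j) is a link when a_ij = 1.
links = {(i, j) for i in range(n) for j in range(n) if A[i][j] == 1}

# Assumed convention: row i holds the links leaving i, column j the links entering j.
out_degree = [sum(A[i]) for i in range(n)]
in_degree = [sum(A[i][j] for i in range(n)) for j in range(n)]

# Neighborhood of each node: the set of nodes it links to.
neighborhoods = {i: {j for j in range(n) if A[i][j] == 1} for i in range(n)}

print(out_degree)
print(in_degree)
print(neighborhoods[0])
```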
Important Types of Degree Distribution
For any network Γ, its (kth-order) degree distribution p(·) specifies, for each k = 0, 1, …, n-1, the fraction of nodes that have degree k:
$p(k) = \frac{1}{n} \, |\{ i \in N : z_i = k \}|$
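A minimal sketch of the degree distribution, assuming an undirected network stored as a dictionary of neighbor sets; the star network used here is a hypothetical example.

```python
from collections import Counter

def degree_distribution(neighbors):
    """Return [p(0), p(1), ..., p(n-1)], where p(k) is the fraction of nodes with degree k."""
    n = len(neighbors)
    counts = Counter(len(nbrs) for nbrs in neighbors.values())
    return [counts.get(k, 0) / n for k in range(n)]

# Hypothetical star network: node 0 is linked to nodes 1, 2 and 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
p = degree_distribution(star)
print(p)
```

Three of the four nodes have degree 1 and one node has degree 3, so p(1) = 0.75 and p(3) = 0.25.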
Network Characteristics: Geodesic Distances
The geodesic distance d(i, j) is defined as the minimum number of links that connect i and j; if no such path exists, d(i, j) = +∞
The distribution specifying the fraction of node pairs at distance r:
$\varpi(r) = \frac{|\{ (i, j) \in N \times N : d(i, j) = r \}|}{n(n-1)}$, where $\sum_{r > 0} \varpi(r) = 1$
The average network distance:
$\bar{d} = \sum_{0 < r < \infty} r \, \varpi(r)$
The diameter of the network:
$\hat{d} = \max \{ r : \varpi(r) > 0 \}$
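These quantities can be computed with a breadth-first search from every node. The sketch below assumes an unweighted network stored as a dictionary of neighbor sets; the four-node path is a hypothetical example, and the diameter is taken over finite distances only.

```python
from collections import deque
import math

def geodesic_distances(neighbors, source):
    """Breadth-first search: minimum number of links from source to every node."""
    dist = {v: math.inf for v in neighbors}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if dist[v] == math.inf:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_distribution(neighbors):
    """Fraction of ordered node pairs (i, j), i != j, at each distance r."""
    n = len(neighbors)
    counts = {}
    for i in neighbors:
        for j, d in geodesic_distances(neighbors, i).items():
            if j != i:
                counts[d] = counts.get(d, 0) + 1
    return {r: c / (n * (n - 1)) for r, c in counts.items()}

# Hypothetical path network 0 - 1 - 2 - 3.
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
w = distance_distribution(path4)

# Average distance and diameter, restricted to finite distances.
average_distance = sum(r * f for r, f in w.items() if r < math.inf)
diameter = max(r for r, f in w.items() if f > 0 and r < math.inf)
print(w, average_distance, diameter)
```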
Network Characteristics: Density
Network Characteristics: Closeness & Clustering
The total distance of node i:
$\sum_{j \in N} d(i, j)$
The closeness is defined as:
$c(i) \equiv \frac{1}{\sum_{j \in N} d(i, j)}$
For each node i having at least two neighbors, clustering:
$C_i \equiv \frac{|\{ jk \in L : ij \in L \wedge ik \in L \}|}{z_i (z_i - 1) / 2}$
For each node j having fewer than two neighbors:
$C_j = 0$
Clustering index of the network Γ:
$C = \frac{1}{n} \sum_{i=1}^{n} C_i$
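A minimal sketch of closeness and clustering, assuming a connected, undirected network stored as a dictionary of neighbor sets. The example network (a triangle with one pendant node) and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def bfs(neighbors, source):
    """Geodesic distances from source (assumes a connected network)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(neighbors, i):
    """Inverse of the total distance from i to all other nodes."""
    return 1 / sum(bfs(neighbors, i).values())

def clustering(neighbors, i):
    """Fraction of pairs of i's neighbors that are themselves linked."""
    z = len(neighbors[i])
    if z < 2:
        return 0.0
    linked = sum(1 for j, k in combinations(neighbors[i], 2) if k in neighbors[j])
    return linked / (z * (z - 1) / 2)

def clustering_index(neighbors):
    """Average clustering over all nodes of the network."""
    return sum(clustering(neighbors, i) for i in neighbors) / len(neighbors)

# Hypothetical network: triangle 0-1-2 plus node 3 attached to node 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(closeness(g, 2), clustering(g, 2), clustering_index(g))
```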
Network Characteristics: Cohesiveness & Betweenness
Given a network Γ = (N, L), let M ⊂ N; for each node i ∈ M, the fraction of its connections that stay within M is
$H_i(M) = \frac{|\{ ij \in L : j \in M \}|}{z_i}$
The overall cohesiveness of the set M is defined as
$H(M) = \min_{i \in M} H_i(M)$
If the network Γ is connected, let v(j, k) be the number of shortest paths between j and k for each j ≠ k, and v_i(j, k) the number of those paths that pass through i; the betweenness of node i is
$b_i \equiv \sum_{j \neq k} \frac{v_i(j, k)}{v(j, k)}$
Shortest-path Betweenness: Example
Shortest-path betweenness:
$b_i \equiv \sum_{j \neq k} \frac{v_i(j, k)}{v(j, k)}$
Nodes A and B will have high (shortest-path) betweenness in this configuration, while node C will not
A measure of the extent to which an actor has control over information flowing between others
In a network in which flow is entirely or at least mostly along geodesic paths, the betweenness of a node measures how much flow will pass through that particular node
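The betweenness sum can be evaluated by counting shortest paths with a breadth-first search. The sketch below assumes a connected, undirected network, sums over ordered pairs (j, k), and excludes pairs in which i itself is an endpoint (an assumption; the slide does not say). The network is a hypothetical stand-in for the figure: A and B bridge two groups, and C sits inside one group.

```python
from collections import deque

def bfs_counts(neighbors, source):
    """Distances from source and the number of shortest paths to each node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(neighbors, i):
    """Sum over ordered pairs (j, k) of the share of shortest j-k paths through i."""
    info = {s: bfs_counts(neighbors, s) for s in neighbors}
    dist_i, sigma_i = info[i]
    total = 0.0
    for j in neighbors:
        for k in neighbors:
            if j == k or i in (j, k):
                continue
            dist_j, sigma_j = info[j]
            # i lies on a shortest j-k path exactly when the distances add up.
            if dist_j[i] + dist_i[k] == dist_j[k]:
                total += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return total

# Hypothetical network: group {C, D} hangs on A, group {E, F} hangs on B, A-B is a bridge.
g = {
    "A": {"C", "D", "B"}, "B": {"A", "E", "F"},
    "C": {"A", "D"}, "D": {"A", "C"},
    "E": {"B", "F"}, "F": {"B", "E"},
}
print(betweenness(g, "A"), betweenness(g, "B"), betweenness(g, "C"))
```

Every shortest path between the two groups crosses both A and B, so both score high, while C lies on no shortest path between other nodes.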
Flow Betweenness
Flow betweenness of a node i is defined as the amount of flow through node i when the maximum flow is transmitted from s to t, averaged over all s and t:
$b_i^{mf} \equiv \sum_{s, t \in N,\; s \neq i,\; t \neq i,\; f_{st} > 0} \frac{f_{st}(i)}{f_{st}}$
where $f_{st}$ is the maximum flow from s to t and $f_{st}(i)$ is the part of that flow passing through i
While calculating flow betweenness, vertices A and B will get high scores while vertex C will not
Case: AERCS - Recommendation of Venues for Young Computer Scientists
DBLP (http://www.informatik.uni-trier.de/~ley/db/)
- 788,259 author names
- 1,226,412 publications
- 3,490 venues (conferences, workshops, journals)
CiteSeerX (http://citeseerx.ist.psu.edu/)
- 7,385,652 publications
- 22,735,240 citations
- Over 4 million author names
Combination
- Canopy clustering [McCallum 2000]
- Result: 864,097 matched pairs
- On average: venues cite 2306 times and are cited 2037 times
Pham, Klamma, Jarke: Development of Computer Science Disciplines – A Social Network Analysis Approach, SNAM, 2011
Properties of Collaboration and Citation Graphs of Venues
User-based CF: Author Clustering
Data: DBLP
Perform 2 test cases for the years 2005 and 2006
- Clustering of co-authorship networks
- Prediction of the venue
Clustering algorithm
- Density-based algorithm [Clauset 2004]
- Obtained modularity: 0.829 and 0.82
Cluster size distribution follows a power law
User-based CF: Precision and Recall
Precision for 1000 randomly chosen authors
Precision computed at the 11 standard recall levels 0%, 10%, …, 100%
Results
- Clustering performs better
- Improvement is not significant
- Better efficiency
Further improvement
- Different networks: citation
- Overlapping clustering
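Precision at the 11 standard recall levels is commonly computed as interpolated precision in information retrieval. The sketch below assumes that convention (the slides do not state which variant was used); the ranked venue list and the relevant set are hypothetical.

```python
def eleven_point_precision(ranked, relevant):
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0."""
    hits = 0
    points = []  # (recall, precision) after each position of the ranking
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / rank))
    levels = [k / 10 for k in range(11)]
    # Interpolation: best precision reached at any recall >= the given level.
    return [max((p for r, p in points if r >= level), default=0.0) for level in levels]

# Hypothetical recommendation list for one author and the venues actually attended.
ranked = ["v1", "v2", "v3", "v4"]
relevant = {"v1", "v3"}
print(eleven_point_precision(ranked, relevant))
```

Averaging these 11 values per author, and then over all sampled authors, gives one precision-recall curve per recommendation method.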
Item-based CF: Venue Network Creation and Clustering
Knowledge network
- Aggregate bibliographic coupling counts at venue level
- Undirected graph G(V, E), where V: venues, E: edges weighted by cosine similarity
  $C_{i,j} = \frac{B_i \cdot B_j}{\|B_i\|_2 \times \|B_j\|_2} = \frac{\sum_{k=1}^{n} B_{k,i} B_{k,j}}{\sqrt{\sum_{k=1}^{n} B_{k,i}^2} \, \sqrt{\sum_{k=1}^{n} B_{k,j}^2}}$
- Threshold: $C_{i,j} \geq 0.1$
- Clustering: density-based algorithm [Newman 2004, Clauset 2004]
- Network visualization: force-directed paradigm [Fruchterman 1991]
Knowledge flow network
- Aggregate bibliographic coupling counts at venue level
- Threshold: citation counts >= 50
Domains from Microsoft Academic Search (http://academic.research.microsoft.com/)
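A minimal sketch of the edge weighting and thresholding step, assuming each venue is described by a vector of bibliographic coupling counts. The venue names and count vectors are hypothetical; only the cosine formula and the 0.1 threshold come from the slide.

```python
import math

THRESHOLD = 0.1  # edges are kept when the cosine similarity is at least 0.1

def cosine(b_i, b_j):
    """Cosine similarity between two coupling count vectors."""
    dot = sum(x * y for x, y in zip(b_i, b_j))
    norm = math.sqrt(sum(x * x for x in b_i)) * math.sqrt(sum(y * y for y in b_j))
    return dot / norm if norm else 0.0

# Hypothetical coupling count vectors, one per venue.
B = {
    "venue_a": [3, 0, 2],
    "venue_b": [1, 0, 1],
    "venue_c": [0, 5, 0],
}

# Undirected weighted edges: each unordered pair of venues is considered once.
edges = {}
venues = sorted(B)
for idx, u in enumerate(venues):
    for v in venues[idx + 1:]:
        c = cosine(B[u], B[v])
        if c >= THRESHOLD:
            edges[(u, v)] = c

print(edges)
```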
Knowledge Network: The Visualization
Interdisciplinary Venues: Top Betweenness Centrality
High Prestige Series: Top PageRank
Network Formation
Case: TeLLNet - SNA for European Teachers' Lifelong Learning
How to manage and handle large scale data on social networks?
How to analyse social network data in order to develop teachers’ competence, e.g. to facilitate a better project collaboration?
How to make the network visualization useful for teachers’ lifelong learning?
Analysis and Visualization of Lifelong Learner Data
Performance Data on Projects
Network Structures and Patterns
Network Formation Strategies
Homophily: love of the same [LaMe54, MSK01]
– similar socio-economic status
– thinking in a similar way
Contagion
– being influenced by others
How to represent strategies for lifelong learners?
Game Theory Basics
Every situation as a game [Borel38, NeMo44]
A player makes decisions in a game
Players choose best strategies based on payoff functions
Payoffs: motivations of players
A strategy defines a set of moves or actions a player will follow in a given game (mixed strategy, pure strategy)
Game Theory
A game is a tuple $G = \langle N, (A_i)_{i \in N}, (u_i)_{i \in N} \rangle$, where
N is a nonempty, finite set of players
Each player $i \in N$ has
1. a set of actions (strategy space) $A_i$
2. a payoff function $u_i : A \to \mathbb{R}$
3. a payoff matrix

                          Player B chooses white   Player B chooses black
Player A chooses white    1,1                      1,0
Player A chooses black    0,1                      0,0
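The payoff matrix above can be written down as a small game and searched for action pairs in which each player's choice is a best response to the other's (a pure-strategy Nash equilibrium in standard terminology). This is a minimal sketch; the data layout and function names are assumptions made for illustration.

```python
from itertools import product

actions = ["white", "black"]

# Payoffs (u_A, u_B) for each action profile (action of A, action of B), from the matrix.
payoff = {
    ("white", "white"): (1, 1),
    ("white", "black"): (1, 0),
    ("black", "white"): (0, 1),
    ("black", "black"): (0, 0),
}

def is_mutual_best_response(a, b):
    """True if neither player can gain by changing only their own action."""
    u_a, u_b = payoff[(a, b)]
    a_cannot_improve = all(payoff[(alt, b)][0] <= u_a for alt in actions)
    b_cannot_improve = all(payoff[(a, alt)][1] <= u_b for alt in actions)
    return a_cannot_improve and b_cannot_improve

equilibria = [(a, b) for a, b in product(actions, repeat=2) if is_mutual_best_response(a, b)]
print(equilibria)
```

In this matrix each player earns 1 by choosing white regardless of the other player, so white is the better choice for both.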
Network Formation Games
Social networks are formed by individual decisions
– Cost: write an e-mail
– Utility: cooperate with others
Social networks between pupils
– Cost: make a joke
– Utility: get appreciation from others
Lifelong learner networks
– Cost: take a learning course
– Utility: find learners with a similar way of reasoning
Network Formation
Set of agents $N = \{1, \dots, n\}$, which are the actors of a network; $i$ and $j$ are typical members of the set
A strategy of an agent $i \in N$ is a vector
$a_i = (a_{i,1}, \dots, a_{i,i-1}, a_{i,i+1}, \dots, a_{i,n})$
where $a_{i,j} \in \{0, 1\}$ for each $j \in N \setminus \{i\}$
Actors $i$ and $j$ are connected if $a_{i,j} = 1$
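A minimal sketch of how a profile of strategy vectors induces a network, assuming each strategy is stored as a dictionary from the other agents to 0 or 1. The four-agent profile is hypothetical.

```python
N = [1, 2, 3, 4]

# Hypothetical strategy profile: strategies[i][j] == 1 means agent i links to agent j.
strategies = {
    1: {2: 1, 3: 0, 4: 0},
    2: {1: 0, 3: 1, 4: 0},
    3: {1: 0, 2: 0, 4: 0},
    4: {1: 1, 2: 0, 3: 0},
}

def is_valid_strategy(i, a_i, agents):
    """A strategy has one 0/1 entry for every other agent and none for i itself."""
    return set(a_i) == set(agents) - {i} and all(v in (0, 1) for v in a_i.values())

def induced_links(profile):
    """Links of the network formed by the profile: (i, j) whenever a_ij = 1."""
    return {(i, j) for i, a_i in profile.items() for j, a_ij in a_i.items() if a_ij == 1}

assert all(is_valid_strategy(i, a_i, N) for i, a_i in strategies.items())
links = induced_links(strategies)
print(sorted(links))
```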
Nash Network: Win-Win Situation
Every agent changes its strategy until all agents are satisfied with their strategies and none would benefit from changing its strategy (the network is stable): Nash equilibrium
A network is a Nash network if each agent is in Nash equilibrium
Chosen strategies defeat others for the good of all players [Nash51, FuTi91]
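The process described above, in which agents keep revising their strategies until nobody wants to change, can be sketched as best-response dynamics. Everything numerical here is an assumption made for illustration: the similarity scores, the link cost, and the payoff rule (an agent benefits from a partner if either side forms the link, and pays the cost only for links it forms itself) are hypothetical and are not the payoff definition used in TeLLNet.

```python
from itertools import product

agents = [0, 1, 2]
# Hypothetical similarity scores and link cost (illustrative values only).
similarity = {(0, 1): 0.8, (0, 2): 0.2, (1, 2): 0.6}
COST = 0.5

def sim(i, j):
    return similarity[(min(i, j), max(i, j))]

def payoff(i, profile):
    """Benefit from every connected partner minus the cost of links i forms itself."""
    benefit = sum(sim(i, j) for j in agents
                  if j != i and (profile[i][j] == 1 or profile[j][i] == 1))
    return benefit - COST * sum(profile[i].values())

def best_response(i, profile):
    """Try every strategy of agent i; keep the current one unless another is strictly better."""
    others = [j for j in agents if j != i]
    best, best_u = profile[i], payoff(i, profile)
    for choice in product([0, 1], repeat=len(others)):
        candidate = dict(zip(others, choice))
        u = payoff(i, {**profile, i: candidate})
        if u > best_u:
            best, best_u = candidate, u
    return best

def best_response_dynamics(profile, max_rounds=100):
    """Let agents revise in turn until no agent changes its strategy."""
    for _ in range(max_rounds):
        changed = False
        for i in agents:
            new = best_response(i, profile)
            if new != profile[i]:
                profile = {**profile, i: new}
                changed = True
        if not changed:
            return profile  # stable: every agent plays a best response
    return profile  # round limit reached; stability is not guaranteed

empty = {i: {j: 0 for j in agents if j != i} for i in agents}
stable = best_response_dynamics(empty)
print(stable)
```

With these values the dynamics stop at a network in which agent 0 links to agent 1 and agent 1 links to agent 2; the weak tie between agents 0 and 2 is not worth its cost to either side.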
Epistemic Frame for TeLLNet
Identity
• the way members of a community see themselves in the community
• institution role, country
Skills
• tasks community members perform
• languages, subjects, and tools from projects
Knowledge
• the understanding shared by members of a community
• languages, subjects
Values
• beliefs of members
• experiences from projects (partners)
Epistemology
• warrants that justify members’ actions as legitimate
• quality labels, prizes, European quality labels
Multi-Agent Simulation System
A multi-agent system is a collection of heterogeneous and diverse intelligent agents that interact with each other and their environment [SiAi08]
– Recommendations: Yenta [Foner97], looking for users with similar interests based on data from Web media
– Market-binding mechanisms: looking for the best item (a reward agent, a set of items and user agents) [WMJe05]
– Team formation: forming teams for performing a task in a dynamic environment [GaJa05]
Multi-Agent Simulation Questions
Which kind of behavior can be expected under arbitrarily given parameter combinations and initial conditions?
Which kind of behavior will a given target system display in the future?
Which state will the target system reach in the future?
[Troitzsch2000]
[Figure: panels for 2008, 2009, and 2010]
Agent Based Simulation
Heterogeneous, autonomous and pro-active actors, such as human-centered systems
– Agents are capable of acting without human intervention
– Agents possess goal-directed behavior
– Each agent has its own incentives and motives
Suited for modeling organizations: most work is based on cooperation and communication
[Gazendam, 1993]
Inputs for simulation model
Agent = Teacher
Teacher properties:
– Languages
– Subjects
– Country
– Institution role
– Any awards? (European Quality Label or Prize)
Project properties:
– Languages
– Tools
– Subjects
– Number of pupils in a project
– Age of pupils in a project
– Any award? (Quality Label)
Network Formation Game Simulation
Payoff definition: the payoff matrix is calculated dynamically based on the Epistemic Frame vector:
– teachers' subjects, subjects of projects (experiences)
– teachers' languages, languages of projects (experiences)
– tools used in projects (experiences)
– countries past collaborators are coming from (beliefs)
– ...
Strategy definition: homophily or contagion
Looking for a suitable network for a teacher, not for a suitable partner!
Nash Equilibrium for Network Formation
Finding a Nash Equilibrium (NE) is NP-hard
Computer scientists deal with finding appropriate techniques for calculating NE with a lot of agents
We are not interested in the best solution but in a better solution
Conclusions & Outlook
Network Science is an interdisciplinary approach between computer science and other disciplines
MediaBase framework based on modeling & reflection support
Two case studies
– Network Flow: analysis and visualization of large digital libraries; identification of basic flow parameters
– Network Formation: analysis and visualization of large learner networks; performance indicators and visual analytics
Application of tools to entrepreneurial problems: Causation and Effectuation (Excellence Project OBIP at RWTH Aachen University)
Researching network dynamics by time series analysis and multi-agent simulation