Using Social Network Analysis to Understand
Web 2.0 Communications
Sam Stewart, Syed Sibte Raza Abidi
NICHE Research Group, Faculty of Computer Science
Dalhousie University, Halifax, Canada
[email protected]/∼sstewart
September 18, 2011
Sam Stewart (Dal) SNA and Med 2.0 September 18, 2011 1 / 29
For more information on visualization tool:
- Stewart S. and Sibte Raza Abidi S. (2011). UNDERSTANDING MEDICINE 2.0 - Social Network Analysis and the VECoN System. In Proceedings of the International Conference on Health Informatics, pages 70-79. DOI: 10.5220/0003167100700079
Introduction
Experiential Healthcare Knowledge
Experiential knowledge exists in a variety of modalities
- Clinical case studies, problem-based discussions between clinicians, experience-based insights, diagnostic heuristics ...
There are key issues facing the use of this knowledge in healthcare
- How to formulate a community of practitioners to create this knowledge?
- How to extract and share this knowledge?
- How to assign value to the knowledge being shared, especially with respect to clinical decision making?
Medicine 2.0
Our research investigates the use of Web 2.0 tools in facilitating experiential knowledge sharing, translation and validation
Web 2.0 tools: online discussion forums, medical mailing lists, blogs, social networking websites, ...
These tools provide virtual communities for knowledge exchange and knowledge validation
We want to explore the knowledge sharing dynamics of Web 2.0 communities
- We will do this using Social Network Analysis (SNA)
Project Outline
This project focuses on the online communication patterns of the Pediatric Pain Mailing List (PPML)
- 700 pediatric pain practitioners from around the world share their clinical experiences and seek advice
Not a strong example of Web 2.0 data
- Structurally, mailing list data and discussion forum data are very similar
- Already a strong community between the members (both professionally and on the mailing list)
Methods
Social Network Analysis
The objective of SNA is to understand the underlying social structure of a communication network
It leverages principles of graph theory to represent people and the ties between them
It focuses on analyzing the structures that emerge out of relations between actors, rather than the attributes of actors themselves
1 vs 2 Mode Data
Traditional network analysis is on 1-mode data
- 1 set of actors, edges are the relations between them
This project studies 2-mode networks
- 2 types of actors, and the ties are between types
- Our data links a user to a thread if that user communicated on that thread
Because many SNA methods are designed for 1-mode networks, it is necessary to create a 1-mode network out of our two-mode data
- A valued link between two users records how many threads they both communicated on
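The projection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `project_to_one_mode` and the toy thread data are hypothetical, and each thread is assumed to be given as a list of the users who posted on it.

```python
from collections import defaultdict
from itertools import combinations


def project_to_one_mode(memberships):
    """Turn 2-mode data (thread -> users) into valued 1-mode ties.

    Returns a dict mapping an ordered user pair (u, v), u < v, to the
    number of threads on which both users communicated.
    """
    weights = defaultdict(int)
    for users in memberships.values():
        # Every pair of distinct users on the same thread shares one thread.
        for u, v in combinations(sorted(set(users)), 2):
            weights[(u, v)] += 1
    return dict(weights)


# Hypothetical toy data: three threads, four users.
threads = {
    "t1": ["a", "b", "c"],
    "t2": ["a", "b"],
    "t3": ["c", "d"],
}
one_mode = project_to_one_mode(threads)
print(one_mode)
```

Users a and b appear together on two threads, so their tie has value 2, while a and d never share a thread and have no tie.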
Methods Centrality
Centrality
Centrality measures provide insight into the most important actors in the network
We used three different centrality measures
- Degree
- Closeness
- Betweenness
They will provide both user-level information about the most important users and general network-level information
Degree Centrality
Degree centrality is simply measured as the number of ties an actor has
Degree can be normalized to a [0,1] scale by dividing it by its maximum possible value
Results:
Actor   2M Deg   Norm
121     42       0.1772
167     41       0.1730
066     36       0.1519
055     35       0.1477
170     31       0.1308

Actor   1M Deg   Norm
167     85       0.3602
170     75       0.3178
066     67       0.2839
128     66       0.2797
055     59       0.2500
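As a rough check of the normalization, the sketch below divides the reported 1-mode degrees by n - 1, the largest degree an actor can have in a 1-mode network of n actors. The value n = 237 is an assumption: it is the sum of the block sizes (199 + 18 + 20) reported with the image matrix later in these slides, not a figure stated on this slide. The function name is illustrative.

```python
def normalized_degree(degree, n_actors):
    """Divide a 1-mode degree by its maximum possible value, n_actors - 1."""
    return degree / (n_actors - 1)


# Assumption: 237 actors, inferred from the block sizes 199 + 18 + 20.
N_ACTORS = 237

# Top five 1-mode degrees as reported in the results table.
one_mode_degrees = {"167": 85, "170": 75, "066": 67, "128": 66, "055": 59}

normalized = {
    actor: round(normalized_degree(deg, N_ACTORS), 4)
    for actor, deg in one_mode_degrees.items()
}
print(normalized)
```

Under that assumption the rounded values agree with the Norm column of the 1-mode table, for example 85 / 236 ≈ 0.3602.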
Degree Results
There are actors that are quite active in the network
With a maximum normalized 2-mode degree of 17.7%, no single actor is present in all the threads
The normalized 1-mode degrees are higher: the most active user has communicated with ≈ 36% of the other users
[Figure: two histograms. "Distribution of two-mode Degrees" (x: two-mode degree, y: frequency) and "Distribution of Actor Degrees" (x: actor degree, y: frequency)]
Closeness Centrality
An actor is “close” if they are within a few steps of every other member of the network
A network with high closeness values means that information can propagate through the network quickly
Actor   Closeness
167     0.5915
170     0.5742
128     0.5579
066     0.5540
055     0.5527
[Figure: histogram "Closeness in Actor Network" (x: closeness, y: frequency)]
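One common way to turn "within a few steps of every other member" into a number is the normalized closeness score (n - 1) divided by the sum of shortest-path distances to all other actors. The sketch below uses that definition on a hypothetical toy path graph. The slides do not state which variant was used, so this is an assumption, and the code also assumes a connected, unweighted network with at least two actors (tie values are ignored).

```python
from collections import deque


def closeness(adj, source):
    """Normalized closeness of `source` in a connected, unweighted graph.

    adj maps each node to the set of its neighbours (undirected ties).
    """
    # Breadth-first search gives shortest-path distances in steps.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # (n - 1) / total distance: 1.0 means one step from everyone.
    return (len(adj) - 1) / sum(dist.values())


# Hypothetical toy network: a path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
scores = {v: closeness(path, v) for v in path}
print(scores)
```

The interior actors b and c score 0.75 and the end actors a and d score 0.5, reflecting that the ends are further from the rest of the network.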
Betweenness Centrality
Betweenness centrality is a measure of how important a node is as a hub of information
Low betweenness scores mean that no one controls the information flow through the network
Actor   Betweenness
167     0.107
170     0.093
066     0.080
128     0.063
035     0.063
[Figure: histogram "Distribution of Actor Betweenness scores" (x: normalized betweenness, y: frequency)]
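For readers who want to see how a normalized betweenness score can be computed, the sketch below applies Brandes' algorithm to an undirected, unweighted graph and divides by the number of actor pairs that exclude the node, (n - 1)(n - 2) / 2. Whether the results above were produced this way is not stated in the slides, so treat it as an illustration on a hypothetical star network; the function and variable names are assumptions.

```python
from collections import deque


def betweenness(adj):
    """Normalized betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Back-propagate path dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(adj)
    # Each undirected pair was counted from both ends, so dividing by
    # (n - 1)(n - 2) is the same as halving and dividing by the pair count.
    scale = (n - 1) * (n - 2)
    return {v: (b / scale if scale else 0.0) for v, b in score.items()}


# Hypothetical star network: every shortest path between leaves uses the hub.
star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
star_scores = betweenness(star)
print(star_scores)
```

In the star the hub scores 1.0 and the leaves 0.0, the extreme case of one actor controlling all information flow. Top scores around 0.1, as in the table above, are far from that extreme.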
Centrality Conclusions
The centrality measures indicate a healthy and active network
- Low degree and betweenness scores indicate that there is not a single user or set of users dominating the network
- High closeness scores indicate that users are all closely connected to one another
Note that the same actors are near the top of each ranking
Though they don’t dominate the network, there are power users present
1-Mode Degree   2-Mode Degree   Closeness   Betweenness
167             121             167         167
170             167             170         170
066             066             128         066
128             055             066         128
055             170             055         035
056             035             035         179
184             148             184         020
035             179             121         121
020             184             042         184
121             020             020         266
179             128             056         055
042             224             045         056
254             102             015         015
224             146             179         224
045             015             077         096
Methods Subgroup Analysis
Subgroup Analysis
With 700 users and over 13 000 messages on the network, there is too much information to present all messages at once
The idea of subgroup analysis is to group similar actors together, and only study the communications within groups, or between groups
Also called cluster analysis, there are a number of methods for determining the clusters
- Going to look at structural equivalence
Structural Equivalence
Structural equivalence helps identify nodes that occupy similar roles in the network
Two nodes are structurally equivalent if they have exactly the same ties to the same other nodes
True structural equivalence is rare, so we measure approximate equivalence using Hamming/Euclidean distance
Develop a similarity matrix between all users
If we cluster users together hierarchically we create a dendrogram
Cutting the dendrogram results in disparate clusters (a blockmodel)
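The steps above can be sketched on a toy network. The code below measures Hamming distance between the tie profiles of two actors (ignoring their tie to each other, a common convention) and merges clusters with naive complete-linkage agglomeration, stopping at k clusters, which plays the role of cutting the dendrogram. The linkage rule, the binary treatment of ties, and all names are assumptions made for illustration; the slides do not specify them.

```python
def hamming(adj, u, v):
    """Number of third parties tied to exactly one of u and v."""
    return sum(
        1
        for w in adj
        if w not in (u, v) and ((w in adj[u]) != (w in adj[v]))
    )


def cluster(adj, k):
    """Naive complete-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[v] for v in sorted(adj)]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest
                # pairwise distance between their members.
                d = max(
                    hamming(adj, u, v)
                    for u in clusters[i]
                    for v in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy network: a and b are each tied to c and d, and nothing else.
toy = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b"},
}
blocks = cluster(toy, 2)
print(blocks)
```

Here a and b are exactly structurally equivalent (distance 0), as are c and d, so the two-cluster cut recovers those pairs as blocks.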
[Figure: dendrogram "Hierarchical Clustering of the Actor Network"]
Analyzing the Blockmodel
We are interested in the communication patterns both within and between blocks
The best partitioning of the actors breaks the network into one large group and two small groups
The image matrix presents the communication densities between and within the three blocks
             B1        B2        B3
B1 (n=199)   0.04497   0.08124   0.07538
B2 (n=18)    0.08124   0.92157   0.12778
B3 (n=20)    0.07538   0.12778   1.00000
The two small groups have very high internal densities and some communication between them; the large group has low internal density and little communication with the two other groups
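Each cell of an image matrix is a density: the number of ties present divided by the number of ties possible, within a block or between two blocks. The sketch below computes these for a hypothetical toy network with binary, undirected ties. How valued ties were handled in the analysis above is not stated, so this binary treatment is an assumption, as are the names.

```python
def block_density(adj, block_a, block_b):
    """Fraction of possible ties present within one block or between two.

    Different blocks are assumed to be disjoint.
    """
    if set(block_a) == set(block_b):
        members = list(block_a)
        # Within a block: unordered pairs of distinct members.
        possible = len(members) * (len(members) - 1) / 2
        present = sum(
            1
            for i, u in enumerate(members)
            for v in members[i + 1:]
            if v in adj[u]
        )
    else:
        # Between blocks: every member of one paired with every member of the other.
        possible = len(block_a) * len(block_b)
        present = sum(1 for u in block_a for v in block_b if v in adj[u])
    return present / possible if possible else 0.0


# Hypothetical toy network: a triangle (a, b, c), a pair (d, e), one bridge a - d.
net = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
b1, b2 = ["a", "b", "c"], ["d", "e"]
image = {
    ("B1", "B1"): block_density(net, b1, b1),
    ("B1", "B2"): block_density(net, b1, b2),
    ("B2", "B2"): block_density(net, b2, b2),
}
print(image)
```

Both toy blocks are fully connected internally (density 1.0), while only one of the six possible ties between them exists, the same dense-within, sparse-between pattern the image matrix shows for B2 and B3.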
Structural Equivalence Results
The structural equivalence results have isolated two potential subgroups of interest in the network
- Dataset only contains names and email addresses: nothing to differentiate between two groups
- Investigation of common threads amongst the blocks revealed nothing
- Full survey of the group could reveal common group attributes (research ongoing)
Could also investigate clustering directly from the two-mode network
Methods VECoN
Visualizing Social Networks
The objective of the VECoN system is
- To provide the users with an overview of the structure of the mailing list
- To provide SNA results to the users with the hope of improving their knowledge translation practices
- To provide a novel network navigation tool
VECoN is not an analysis system
- Many great network analysis tools exist: UCINET and Netdraw, GUESS, Gephi, SocialAction, R, ...
- Goal is to provide end users with a graph visualization to accompany their traditional network navigation methods
Current VECoN Status
The visualization is in its early stages
- Node layout needs to be fixed
- Clustering needs to be improved
- More centrality measures need to be added
- Connection to the actual conversations needs to be implemented
The project demonstrates the potential for graph-based visualizations to improve the navigation and understanding of communication networks from a user’s point of view
Conclusion
Conclusions
Experiential healthcare knowledge is vital
Web 2.0 technologies provide tools for sharing knowledge and establishing virtual communities of practice
It is vital that we understand how these communities function
SNA provides tools for understanding how online communication networks function
Future Work
Research is currently being conducted to apply these methods to a discussion forum
Need to quantify contribution to the conversation (it is currently a binary measure)
Develop knowledge seekers and knowledge sharers
Roll out the visualization tool to users
Acknowledgement
This work is carried out with the aid of a grant from the International Development Research Centre, Ottawa, Canada.
The authors would like to acknowledge Dr. Allen Finley for his contributions to the PPML and his ongoing support of this research.
Questions?