2012 IEEE International Conference on Social Computing
What Makes Communities Tick? Community Health Analysis Using Role Compositions
Matthew Rowe (1) and Harith Alani (2)
(1) School of Computing and Communications, Lancaster University, Lancaster, UK
(2) Knowledge Media Institute, The Open University, Milton Keynes, UK
2012 ASE/IEEE International Conference on Social Computing, Amsterdam, The Netherlands
http://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem
[email protected]
Managing Online Communities
Many businesses provide online communities to:
• Increase customer loyalty
• Raise brand awareness
• Spread word-of-mouth
• Facilitate idea generation
Online communities incur significant investment in terms of:
• Money spent on hosting and bandwidth
• Time and effort for maintenance
Community managers monitor community ‘health’ to:
• Ensure longevity
• Enable value generation
However, the notion of ‘health’ is hard to pin down
What makes Communities Tick? Community Health Analysis using Role Compositions
The Need for Interpretation
Online communities are dynamic behavioural ecosystems
• Users in communities can be defined by their roles, i.e. exhibiting similar collective behaviour
• Prevalent behaviour can impact upon community members and health
Management of communities is helped by:
• Understanding the relation between behaviour and health
  • How user behaviour changes are associated with health
  • Encouraging users to modify behaviour, in turn affecting health, e.g. content recommendation to specific users
• Predicting health changes
  • Enables early decision making on community policy
Can we accurately and effectively detect positive and negative changes in community health from its composition of behavioural roles?
Outline
SAP Community Network
Community Health Indicators
Measuring Role Compositions:
• Measuring user behaviour
• Inferring behaviour roles
• Mining behaviour roles
Experiments:
• Health Indicator Regression
• Health Change Detection
Findings and Conclusions
SAP Community Network
Collection of SAP forums in which users discuss:
• Software development
• SAP Products
• Usage of SAP tools
Points system for awarding best answers
• Enables development of user reputation
Provided with a dataset covering 33 communities:
• Spanning 2004 - 2011
• 95,200 threads
• 421,098 messages
  • 78,690 were allocated points
• 32,942 users
[Figure: Post Count per year, 2004 to 2011]
Community Health Indicators
From the literature there is no single agreed measure of ‘community health’
• Multi-faceted nature: loyalty, participation, activity, social capital
• Different communities and platforms look at different indicators
Indicator 1: Churn Rate (loyalty)
• The proportion of users who participate in a community for the final time
Indicator 2: User Count (participation)
• The number of participating users in the community
Indicator 3: Seeds-to-Non-Seeds Posts Proportion (activity)
• The proportion of seed posts (i.e. thread starters that receive a reply) to non-seeds (i.e. no reply)
Indicator 4: Clustering Coefficient (social capital)
• The average of users’ clustering coefficients within the largest strongly connected component
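As a rough illustration of the first three indicators, the sketch below computes them for one time window from a flat list of posts. The record layout (`user`, `time`, `is_thread_starter`, `has_reply`) and the function name are assumptions made for this example, not the SCN dataset schema, and a user is treated as churned when their last post in the whole dataset falls inside the window.

```python
def health_indicators(posts, window_start, window_end):
    """Compute churn rate, user count and seeds-to-non-seeds for one window.

    Each post is a dict with hypothetical keys: 'user', 'time',
    'is_thread_starter' and 'has_reply'.
    """
    in_window = [p for p in posts if window_start <= p["time"] < window_end]
    users = {p["user"] for p in in_window}

    # Time of each user's last post anywhere in the dataset.
    last_seen = {}
    for p in posts:
        last_seen[p["user"]] = max(last_seen.get(p["user"], p["time"]), p["time"])

    # Assumption: a user churns in this window if their final post lies inside it.
    churned = {u for u in users if last_seen[u] < window_end}
    churn_rate = len(churned) / len(users) if users else 0.0

    # Seeds are thread starters that received a reply; non-seeds did not.
    starters = [p for p in in_window if p["is_thread_starter"]]
    seeds = sum(1 for p in starters if p["has_reply"])
    non_seeds = len(starters) - seeds
    seeds_to_non_seeds = seeds / non_seeds if non_seeds else None

    return {
        "churn_rate": churn_rate,
        "user_count": len(users),
        "seeds_to_non_seeds": seeds_to_non_seeds,
    }
```

Under this definition every user active in the final window of a dataset counts as churned, so the last window would normally be excluded from analysis.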
Measuring Role Compositions I: Modelling and Measuring User Behaviour
According to existing literature, user behaviour can be defined using 6 dimensions: (Hautz et al., 2010), (Nolker and Zhou, 2005), (Zhu et al., 2009), (Zhu et al., 2011)
• Focus Dispersion
  Measure: Forum entropy of the user
• Engagement
  Measure: Out-degree proportioned by potential maximal out-degree
• Popularity
  Measure: In-degree proportioned by potential maximal in-degree
• Contribution
  Measure: Proportion of thread replies created by the user
• Initiation
  Measure: Proportion of threads that were initiated by the user
• Content Quality
  Measure: Average points per post awarded to the user
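Three of these measures are simple enough to sketch directly. The code below is a minimal illustration under stated assumptions: the entropy uses base-2 logarithms (the slides do not give the base), the "potential maximal" degree is taken to be the number of other users in the community, and all function names are hypothetical.

```python
import math
from collections import Counter


def focus_dispersion(forum_ids):
    """Shannon entropy (in bits, an assumption) of the forums a user posted in."""
    counts = Counter(forum_ids)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalised_degree(degree, n_users):
    """Degree divided by the assumed maximum of n_users - 1 possible neighbours.

    Pass the out-degree for Engagement or the in-degree for Popularity.
    """
    return degree / (n_users - 1) if n_users > 1 else 0.0


def content_quality(points_awarded, n_posts):
    """Average points per post awarded to the user."""
    return points_awarded / n_posts if n_posts else 0.0
```

A user who posts in a single forum gets a dispersion of 0, while a user who splits posts evenly over two forums gets 1 bit.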
Measuring Role Compositions II: Inferring Roles
1. Construct features for community users at a given time step
2. Derive bins using equal frequency binning
   Popularity-low cutoff = 0.5, Initiation-high cutoff = 0.4
3. Use skeleton rule base to construct rules using bin levels
   Popularity = low, Initiation = high -> roleA
   Popularity < 0.5, Initiation > 0.4 -> roleA
4. Apply rules to infer user roles and community composition
5. Repeat 1-4 for following time steps
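Steps 2 to 4 above can be sketched as follows. This is an illustrative outline rather than the authors' implementation: the cutoff rule, the three level names and the helper names are assumptions, and `roleA` is the placeholder role from the slide.

```python
from collections import Counter


def equal_frequency_cutoffs(values, n_bins=3):
    """Return n_bins - 1 cutoffs that split the sorted values into equal-sized bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def level(value, cutoffs, names=("low", "mid", "high")):
    """Map a continuous feature value to its bin level."""
    for name, cut in zip(names, cutoffs):
        if value < cut:
            return name
    return names[-1]


def infer_role(user_levels, rules):
    """Return the first role whose conditions all match the user's levels."""
    for conditions, role in rules:
        if all(user_levels.get(dim) == lvl for dim, lvl in conditions.items()):
            return role
    return None


def composition(roles):
    """Proportion of users holding each role (unmatched users are left out)."""
    counts = Counter(r for r in roles if r is not None)
    return {role: count / len(roles) for role, count in counts.items()}


# Skeleton rule from the slide: Popularity = low, Initiation = high -> roleA
rules = [({"Popularity": "low", "Initiation": "high"}, "roleA")]
```

Because the cutoffs are re-derived at every time step, the same skeleton rule can resolve to different numeric thresholds as the community changes.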
Measuring Role Compositions III: Mining Roles (Skeleton rule base compilation)
1. Select the tuning segment
2. Discover correlated behaviour dimensions
   Removed Engagement and Contribution, kept Popularity (Pearson r > 0.75, p < 0.01)
3. Cluster users into behavioural groups
4. Derive role labels for clusters
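For step 3, the paper excerpt that follows selects the clustering method and number of clusters by the average silhouette coefficient. A minimal sketch of that scoring is given here; it assumes Euclidean distance, gives singleton clusters a score of 0 by convention, and takes candidate clusterings as precomputed label lists (the clustering algorithms themselves are not reimplemented). The names `silhouette` and `best_clustering` are hypothetical.

```python
import math


def silhouette(points, labels):
    """Average silhouette coefficient of a clustering (Euclidean distance assumed)."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_distance(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = mean_distance(i, own)  # cohesion: mean distance within own cluster
        b = min(  # separation: nearest other cluster by mean distance
            mean_distance(i, members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_clustering(points, candidates):
    """Pick the candidate (name -> label list) with the highest average silhouette."""
    return max(candidates, key=lambda name: silhouette(points, candidates[name]))
```

In practice the candidates would be keyed by algorithm and k, for example one entry per (method, k) pair for k from 2 to 30.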
thereby separating users based on their behaviour and discovering distinct roles on the platform. We ran three different unsupervised clustering algorithms: Expectation-Maximization (EM), K-means and Hierarchical Clustering, over the 6-months’ tuning segment. The model selection phase not only requires choosing the correct clustering method but also selecting the optimum number of clusters to use - providing this value as a parameter $k$. To judge the best model - i.e. cluster method and number of clusters - we measure the cohesion and separation of a given clustering as follows: For each clustering algorithm ($\Psi$) we iteratively increase the number of clusters ($k$) to use where $2 \leq k \leq 30$. At each increment of $k$ we record the silhouette coefficient produced by $\Psi$, this is defined for a given element ($i$) in a given cluster as:

$$s_i = \frac{b_i - a_i}{\max(a_i, b_i)} \qquad (3)$$

Where $a_i$ denotes the average distance to all other items in the same cluster and $b_i$ is given by calculating the average distance with all other items in each other distinct cluster and then taking the minimum distance. The value of $s_i$ ranges between $-1$ and $1$ where the former indicates a poor clustering where distinct items are grouped together and the latter indicates perfect cluster cohesion and separation. To derive the silhouette coefficient ($s(\Psi(k))$) for the entire clustering we take the average silhouette coefficient of all items. We find that the best clustering model and number of clusters to use is K-means with 11 clusters. We found that for smaller cluster numbers ($k = [3, 8]$) each clustering algorithm achieves comparable performance, however as we begin to increase the cluster numbers K-means improves while the two remaining algorithms produce worse cohesion and separation.

3) Deriving Role Labels: Provided with the most cohesive and separated clustering of users we then derive role labels for each cluster. Role label derivation first involves inspecting the dimension distribution in each cluster and aligning the distribution with a level mapping (i.e. low, mid, high). This enables the conversion of continuous dimension ranges into discrete values which our rule-based approach requires in the Skeleton Rule Base. To perform this alignment we assess the distribution of each dimension and derive boundary points for the three feature levels using an equal-frequency binning approach. The distribution of each dimension is shown in Figure 2 for each of the 11 induced clusters together with the level boundaries. We assess the distribution of each feature for each cluster against the levels derived from the equal-frequency binning of each feature, thereby generating a feature-to-level mapping. This mapping is shown in Table II where certain clusters are combined together as they have the same feature-to-level mapping patterns - i.e. 2,5 and 8,9.

Fig. 2. Boxplots of the feature distributions in each of the 11 clusters. Feature distributions are matched against the feature levels derived from equal-frequency binning.

TABLE II. Mapping of cluster dimensions to levels. The clusters are ordered from low patterns to high patterns to aid legibility.

Cluster  Dispersion  Initiation  Quality  Popularity
1        L           L           L        L
0        L           M           H        L
6        L           H           M        M
10       L           H           M        H
4        L           H           H        M
2,5      M           H           L        H
8,9      M           H           H        H
7        H           H           L        H
3        H           H           H        H

In order to derive the role labels for each cluster we use a maximum-entropy decision tree to divide the clusters into branches that maximise the dispersion of dimension levels. Figure 3 shows the separation of the clusters from a complete grouping into a single cluster, or merged clusters in the case of 2,5 and 8,9, in each leaf. To perform the separation at a given decision node, we measure the entropy of the dimensions and their levels across the clusters, we then choose the dimension with the largest entropy. This is defined formally as:

$$H(dim) = -\sum_{level}^{|levels|} p(level \mid dim) \log p(level \mid dim) \qquad (4)$$

Fig. 3. Maximum-entropy decision tree used to segment the clusters into minimal-distance paths. The paths are used to generate the role labels for each respective cluster.

We perform this process until single clusters, or the previously merged clusters, are in each leaf node and then use the path to the root node to derive the label. For instance, for cluster 0 the path from the root node to the leaf node is quality=high, dispersion=low, initiation=medium, thereby deriving the role label Focussed Expert Participant for the cluster. In the label, focussed describes the focus dispersion of the role - i.e. it is low and therefore not distributed, expert
[Fig. 2 panels: per-cluster boxplots of Dispersion, Initiation, Quality and Popularity]
• 1 - Focussed Novice
• 2,5 - Mixed Novice
• 7 - Distributed Novice
• 3 - Distributed Expert
• 8,9 - Mixed Expert
• 0 - Focussed Expert Participant
• 4 - Focussed Expert Initiator
• 6 - Knowledgeable Member
• 10 - Knowledgeable Sink
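The root split of the maximum-entropy decision tree can be checked against Table II with a few lines of code. The sketch below treats each row of Table II (with 2,5 and 8,9 as merged rows) as one item, uses natural logarithms, and breaks ties by list order; all three choices are assumptions, since the excerpt does not state them.

```python
import math
from collections import Counter

DIMENSIONS = ["Dispersion", "Initiation", "Quality", "Popularity"]

# Level patterns copied from Table II (L = low, M = mid, H = high).
TABLE_II = {
    "1": "LLLL", "0": "LMHL", "6": "LHMM", "10": "LHMH", "4": "LHHM",
    "2,5": "MHLH", "8,9": "MHHH", "7": "HHLH", "3": "HHHH",
}
ROWS = [dict(zip(DIMENSIONS, pattern)) for pattern in TABLE_II.values()]


def level_entropy(rows, dim):
    """Entropy of the level distribution of one dimension across the given rows."""
    counts = Counter(row[dim] for row in rows)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def split_dimension(rows, dims):
    """Choose the dimension with the largest entropy (ties broken by list order)."""
    return max(dims, key=lambda dim: level_entropy(rows, dim))


root = split_dimension(ROWS, DIMENSIONS)
```

Under these assumptions the root split falls on Quality, which agrees with the path quoted for cluster 0 (quality=high first).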
Experiment 1: Health Indicator Regression
Managing online communities is helped by understanding the relation between behaviour and health
Experimental Setup
• Induced Linear Regression Models for each Health Indicator and Community
• Using a time-series dataset
• Independent variables: 9 roles with composition proportions as values at a given time point
  E.g. @ t = k: Mixed Expert = 0.05, Distributed Novice = 0.51, etc.
• Dependent variable: health indicator (e.g. churn rate) at the same time point
  E.g. @ t = k: Churn Rate = 0.21
PCA of each community health indicator model using the model’s coefficients
• Look for a common health composition pattern
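To make the regression setup concrete, here is a small ordinary least squares sketch in plain Python, with role proportions as predictors and a health indicator as the response. It is not the authors' code: the solver (normal equations with Gauss-Jordan elimination) and the function name are choices made for this example. If every user holds exactly one role, all nine proportions sum to one and are collinear with an intercept, so the example fits only two role columns.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept, solved via the normal equations.

    X: list of rows of role proportions; y: health indicator per time step.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Build the augmented system [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear predictors?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


# Hypothetical time series: two role proportions per time step.
role_proportions = [(0.1, 0.5), (0.2, 0.3), (0.4, 0.4), (0.3, 0.1), (0.5, 0.2)]
churn_rate = [0.3 - 0.4 * a + 0.2 * b for a, b in role_proportions]
coefficients = fit_linear_regression(role_proportions, churn_rate)
```

The per-community coefficient vectors produced this way are what the slide's PCA step would then compare across communities.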
Experiment 1: Health Indicator Regression Results
Common Health Composition Pattern
• Churn Rate: Differences for Focussed Expert Participant & Mixed Expert, similarities for Focussed Expert Initiators (decrease in role correlated with increase in churn rate)
• User Count: Differences for Focussed Expert Initiators, commonalities for knowledgeable roles
• Seeds-to-Non-Seeds: Similar effects for Focussed Expert Initiators and Participants, and Distributed Experts (all decrease in role correlated with increased proportion)
• Clustering Coefficient: no common patterns
Idiosyncratic Health Composition Pattern
• Divergence patterns between outlier communities
No general pattern exists that describes the relation between roles and health
[Figure: PCA of the per-community regression model coefficients (PC1 against PC2), one panel each for Churn Rate, User Count, Seeds / Non-seeds Prop and Clustering Coefficient; points are labelled with forum IDs]
Experiment 2: Health Change Detection
Can we accurately and effectively detect positive and negative changes in community health from its composition of behavioural roles?
Experimental Setup
• Binary classification of indicator change
  • At t=k+1: predict increase or decrease in health indicator from t=k
• Time-ordered dataset:
  • Features @ t=k+1: 9 roles with composition proportions as values
  • Class @ t=k+1: positive (if increase from t=k), negative (if decrease)
  • Divide dataset into 80/20 split maintaining time-ordering
• Tested using a logistic regression classifier
  • Platform-level model
  • Community-specific model
Evaluated using Matthews Correlation Coefficient (MCC) and Area under the ROC Curve (AUC)
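The data preparation and one of the evaluation measures in this setup can be sketched as below. The helper names are hypothetical, an unchanged indicator is treated as the negative class (the slide only mentions increase and decrease), and the classifier itself is left out.

```python
import math


def change_labels(indicator):
    """Class at t=k+1: 1 if the indicator rose from t=k, otherwise 0."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(indicator, indicator[1:])]


def time_ordered_split(rows, train_fraction=0.8):
    """Split without shuffling so the test rows are always later in time."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (0 when undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0
```

MCC is 1 for perfect predictions, 0 for predictions no better than chance, and -1 for fully inverted predictions, which makes it suitable when the two classes are unbalanced.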
Experiment 2: Health Change Detection Results
results indicate using the role composition information, even in the outlier communities, provides sufficient information to outperform the random guesser baseline for all health measures except the User Count for forum 353. We also find that for the 412 and 414 central forums we achieve poorer performance than the baseline for the User Count and Clustering Coefficient.
TABLE IV. Performance of detecting health changes using a logistic regression model induced: across the entire platform (Table IV(a)), per-forum (Table IV(b)) and for specific central and outlier forums (Table IV(c)). In this latter case we report the Matthews correlation coefficient and the F1 score.

(a) Platform
Class                   MCC      Prec   Recall  F1     AUC
Churn                   0.047    0.573  0.630   0.531  0.590
User Count              0.035    0.591  0.646   0.522  0.598
Seeds / Non-seeds       0.078    0.592  0.640   0.566  0.617
Clustering Coefficient  0.077.   0.591  0.641   0.581  0.647
Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1

(b) Per-forum
Class                   MCC      Prec   Recall  F1     AUC
Churn                   0.110**  0.618  0.634   0.619  0.569
User Count              0.175**  0.652  0.661   0.650  0.589
Seeds / Non-seeds       0.163*   0.637  0.657   0.639  0.589
Clustering Coefficient  0.089**  0.624  0.642   0.626  0.568
Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1

(c) Forum Specific Results. MCC / F1
                        Central                                          Outliers
Class                   252             412             414              353             419             50
Churn                   0.105 / 0.564   0.042 / 0.621   0.284 / 0.700    -0.076 / 0.543  0.173 / 0.633   0.092 / 0.585
User Count              0.088 / 0.543   0.580 / 0.903   -0.106 / 0.701   0.279 / 0.648   0.299 / 0.667   0.343 / 0.693
Seeds / Non-seeds       0.117 / 0.575   0.339 / 0.717   0.189 / 0.744    0.007 / 0.519   0.265 / 0.632   0.400 / 0.811
Clustering Coefficient  0.057 / 0.536   -0.043 / 0.568  0.353 / 0.727    0.156 / 0.582   0.127 / 0.568   0.282 / 0.641
1) Results: Health Danger Detection: Thus far we have assessed how well our detection models work in both class settings (i.e. increase and decrease). We now move to a scenario in which we wish to detect health dangers, and in doing so provide warnings to community managers of the likely reduction in health of their communities. To do this we set the class label in our prediction models to be the bad health signifier as follows: Churn Rate = Increase, User Count = Decrease, Seeds to Non-seeds Proportion = Decrease and Clustering Coefficient = Decrease.

Figure 6 shows the Receiver Operator Characteristic Curves for the per-forum logistic regression models. The curves indicate we can outperform the random classifier for all forums apart from 5 for the Churn Rate and User Count, all but 6 for the Seeds to Non-seeds Proportion and all but 8 forums for the Clustering Coefficient, demonstrating the variation in performance that we achieve across the communities. The forums that we consistently performed poorly on were 265 (SAP Business One Product Development) and 319 (Best Practice and Benchmarking), achieving worse performance than the random baseline for all health indicators. The reduction in accuracy could be caused by the roles detected in the community not befitting its nature, where instead, conversation and discussion-driven roles are assumed - similar to the roles used in our previous work [22]. In general, the results demonstrate the effectiveness of using role composition information to detect when a forum’s health will degrade. Using this information the managers of such forums can now identify when the users of their communities change their behaviour in a way that could negatively affect the health of their community.
VII. DISCUSSION AND FUTURE WORK

The findings from our analyses have identified interesting health composition patterns and the lack of a global composition pattern for the entirety of SCN. Although it has been argued that the choice of metrics should be dependent on community objectives [3], there are hardly any studies that demonstrate this. SCN is a Question Answering platform, and hence its objective is providing responses to questions. When inspecting the patterns learnt for the Seeds to Non-seeds Proportion indicator we find that across communities a decrease in Focussed Experts is correlated with an increase in the indicator. This implies that users knowledgeable on specific topics actually increase the number of unanswered posts. It could be that the questions asked by such expert users cannot be solved by most community users.

Marin et al. analysed 11 Linux support communities and found a global positive correlation of core users and network cohesion [4]. They defined core users as those whose out-degree is greater than the community average out-degree plus one standard deviation. In our work we account for a multitude of roles that users may assume, rather than just one set. Our analysis also showed that such global patterns are harder to identify when the communities differ in topic and nature. For instance an increase in Knowledgeable Member for 252 and 353 was linked with an increase in the Clustering Coefficient, while being the converse for 412 and 414. We also found that Knowledgeable Sink was associated with increased social capital for 252, 353, 419 and 50. This latter role is closest to the notion of core users that we find on SCN given its high Popularity (see footnote 4).

In this paper we focused our study on 4 popular health indicators, and several more can be added next. However, one pertinent question is whether codependencies exist between the health indicators. Our future work will explore the correlation between these indicators. It could be the case that certain metrics are redundant, while others are salient. Similarly, it is possible that some of these metrics are more representative of health of some communities than others. Linked to this avenue of exploration is the creation of a single health metric, or index similar to [7], that provides community managers with a basic observable indicator.

Our analysis identified various correlations between behaviour roles and health metrics. Next we need to study causation dynamics to better understand the influence and sequence of events that lead to a health metric, or a behaviour role, changing.

VIII. CONCLUSIONS

Assessing the health of online communities provides managers and operators with information about the condition of their community and how it is acting. Tying such assessments to the implicit behaviour within online communities, and the

Footnote 4: We found Engagement (normalised out-degree) and Popularity (normalised in-degree) to have a significant positive correlation when mining roles on SCN (Pearson correlation coefficient = 0.926, p-value < 0.001)
[Figure 6: ROC curves (TPR against FPR) for the per-forum logistic regression models, one panel each for Churn Rate, User Count, Seeds / Non-seeds Prop and Clustering Coefficient]
Per-forum models outperform platform models for each health indicator
• Demonstrates the need to assess and understand communities individually
• We also yield good performance for outlier communities
ROC Curves surpass baseline for:
• Churn rate: 20/25 forums
• User Count: 20/25 forums
• Seeds-to-Non-Seeds: 19/25 forums
• Clustering Coefficient: 17/25 forums
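Counts such as "20/25 forums" come from comparing each forum's ROC curve with the random baseline. One simple way to make that comparison is to check whether the area under the curve exceeds 0.5. The sketch below computes AUC as the probability that a positive example is scored above a negative one, which is a standard equivalent of the area under the ROC curve. The function names and the 0.5 threshold rule are assumptions for illustration, not necessarily how the counts on the slide were derived.

```python
def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def forums_above_baseline(per_forum):
    """Count forums whose AUC beats the random baseline of 0.5.

    per_forum maps a forum id to a (y_true, scores) pair.
    """
    return sum(1 for y_true, scores in per_forum.values() if auc(y_true, scores) > 0.5)
```

For danger detection the positive class would be the bad-health direction for the indicator in question, for example an increase for Churn Rate and a decrease for User Count.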
Findings and Conclusions
No global composition pattern for the entirety of SCN
• Identified key differences as to ‘What makes Communities tick’
• Decrease in Focussed Experts correlated with an increase in Seeds-to-Non-Seeds
• (Marin et al., 2009) found a correlation between increase in Core Users and Network Cohesion
• We found a correlation between an increase in Knowledgeable Sinks and Social Capital
Accurate detection of community health change is possible using role composition information
• Significantly outperformed baseline models
• Per-forum models outperformed platform-level models
Future Work:
• Explore co-dependencies between health indicators
• Application of our approach over different communities and platforms, e.g. IBM Connections, Boards.ie
Questions?
Web: http://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem
Email: [email protected]
Twitter: @mattroweshow