What makes communities tick? Community health analysis using role compositions

Matthew Rowe (1) and Harith Alani (2)
(1) School of Computing and Communications, Lancaster University, Lancaster, UK
(2) Knowledge Media Institute, The Open University, Milton Keynes, UK
2012 ASE/IEEE International Conference on Social Computing, Amsterdam, The Netherlands
http://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem
[email protected]

Managing Online Communities

  Many businesses provide online communities to:
    Increase customer loyalty
    Raise brand awareness
    Spread word-of-mouth
    Facilitate idea generation

  Online communities incur significant investment in terms of:
    Money spent on hosting and bandwidth
    Time and effort for maintenance

  Community managers monitor community ‘health’ to:
    Ensure longevity
    Enable value generation

  However, the notion of ‘health’ is hard to pin down


The Need for Interpretation

  Online communities are dynamic behavioural ecosystems
    Users in communities can be defined by their roles
      i.e. exhibiting similar collective behaviour
    Prevalent behaviour can impact upon community members and health

  Management of communities is helped by:
    Understanding the relation between behaviour and health
      How user behaviour changes are associated with health
    Encouraging users to modify behaviour, in turn affecting health
      e.g. content recommendation to specific users
    Predicting health changes
      Enables early decision making on community policy

  Can we accurately and effectively detect positive and negative changes in community health from its composition of behavioural roles?



Outline



  SAP Community Network

  Community Health Indicators

  Measuring Role Compositions:
    Measuring user behaviour
    Inferring behaviour roles
    Mining behaviour roles

  Experiments:
    Health Indicator Regression
    Health Change Detection

  Findings and Conclusions

SAP Community Network

  Collection of SAP forums in which users discuss:
    Software development
    SAP Products
    Usage of SAP tools

  Points system for awarding best answers
    Enables development of user reputation

  Provided with a dataset covering 33 communities:
    Spanning 2004 - 2011
    95,200 threads
    421,098 messages
      78,690 were allocated points
    32,942 users


[Figure: Post Count per year, 2004 to 2011]

Community Health Indicators

  From the literature there is no single agreed measure of ‘community health’
    Multi-faceted nature: loyalty, participation, activity, social capital
    Different communities and platforms look at different indicators

  Indicator 1: Churn Rate (loyalty)
    The proportion of users who participate in a community for the final time

  Indicator 2: User Count (participation)
    The number of participating users in the community

  Indicator 3: Seeds-to-Non-Seeds Posts Proportion (activity)
    The proportion of seed posts (i.e. thread starters that receive a reply) to non-seeds (i.e. no reply)

  Indicator 4: Clustering Coefficient (social capital)
    The average of users’ clustering coefficients within the largest strongly connected component
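As a rough illustration of the first three indicators, the sketch below computes them for one time window from a flat list of posts. The `Post` record, the integer `window` index and the sample data are assumptions made for this example only; the slides do not specify the data layout or window length. The clustering coefficient is left out because it needs a reply graph.

```python
from collections import namedtuple

# Hypothetical record layout: one row per post, with a coarse time-window index.
Post = namedtuple("Post", "user thread is_starter window")

def health_indicators(posts, window):
    """Return (churn_rate, user_count, seeds_to_non_seeds) for one window."""
    active = {p.user for p in posts if p.window == window}

    # Last window in which each user is ever seen posting.
    last_seen = {}
    for p in posts:
        last_seen[p.user] = max(last_seen.get(p.user, p.window), p.window)
    churned = {u for u in active if last_seen[u] == window}
    churn_rate = len(churned) / len(active) if active else 0.0

    # Threads started in this window, split by whether any reply exists.
    started = {p.thread for p in posts if p.is_starter and p.window == window}
    replied = {p.thread for p in posts if not p.is_starter}
    seeds = len(started & replied)
    non_seeds = len(started - replied)
    ratio = seeds / non_seeds if non_seeds else float("inf")
    return churn_rate, len(active), ratio

# Toy data (invented): cat posts only in window 0, so cat churns there.
posts = [
    Post("ann", "t1", True, 0), Post("bob", "t1", False, 0),
    Post("cat", "t2", True, 0),
    Post("ann", "t3", True, 1), Post("bob", "t3", False, 1),
]
churn, users, seed_ratio = health_indicators(posts, window=0)
```

Note that under this definition every active user in the final observed window counts as churned, so the last window of a dataset would normally be excluded.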



Measuring Role Compositions I: Modelling and Measuring User Behaviour

  According to existing literature, user behaviour can be defined using 6 dimensions:
    (Hautz et al., 2010), (Nolker and Zhou, 2005), (Zhu et al., 2009), (Zhu et al., 2011)

  Focus Dispersion
    Measure: Forum entropy of the user

  Engagement
    Measure: Out-degree as a proportion of the potential maximal out-degree

  Popularity
    Measure: In-degree as a proportion of the potential maximal in-degree

  Contribution
    Measure: Proportion of thread replies created by the user

  Initiation
    Measure: Proportion of threads that were initiated by the user

  Content Quality
    Measure: Average points per post awarded to the user
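The measures above are simple ratios and an entropy, and might be computed along the following lines. This is a sketch under assumptions: the logarithm base (2), the use of `n_users - 1` as the "potential maximal" degree, and the forum names in the example are all choices made here, not details given in the slides.

```python
import math
from collections import Counter

def forum_entropy(forum_ids):
    """Focus dispersion: Shannon entropy (bits) of a user's posts over forums."""
    counts = Counter(forum_ids)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

def normalised_degree(degree, n_users):
    """Engagement / popularity: degree over the assumed maximum, n_users - 1."""
    return degree / (n_users - 1) if n_users > 1 else 0.0

def initiation(threads_started, total_threads):
    """Proportion of threads that the user initiated."""
    return threads_started / total_threads if total_threads else 0.0

def content_quality(points_per_post):
    """Average points awarded per post."""
    return sum(points_per_post) / len(points_per_post) if points_per_post else 0.0

# Hypothetical users: one posts in a single forum, one splits evenly over two.
focused = forum_entropy(["abap", "abap", "abap", "abap"])
spread = forum_entropy(["abap", "abap", "hana", "hana"])
```

A user who posts in a single forum has entropy 0, which corresponds to the "focussed" end of the Focus Dispersion dimension.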



Measuring Role Compositions II: Inferring Roles

  1. Construct features for community users at a given time step

  2. Derive bins using equal frequency binning
       Popularity-low cutoff = 0.5, Initiation-high cutoff = 0.4

  3. Use skeleton rule base to construct rules using bin levels
       Popularity = low, Initiation = high -> roleA
       Popularity < 0.5, Initiation > 0.4 -> roleA

  4. Apply rules to infer user roles and community composition

  5. Repeat 1-4 for following time steps
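Steps 1 to 4 can be sketched for a single time step as follows. The three-level binning, the rule format, the "unassigned" fallback and the toy feature values are assumptions for illustration; only the example rule (Popularity = low, Initiation = high -> roleA) comes from the slide.

```python
from collections import Counter

def equal_frequency_cutoffs(values, n_bins=3):
    """Cut points that split the sorted values into n_bins equal-sized groups."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * i) // n_bins] for i in range(1, n_bins)]

def to_level(value, cutoffs):
    """Map a value to 'low', 'mid' or 'high' given two ascending cut points."""
    low_cut, high_cut = cutoffs
    if value < low_cut:
        return "low"
    return "mid" if value < high_cut else "high"

def infer_composition(features, rules):
    """features: {user: {dimension: value}}. Returns (roles, composition)."""
    dims = sorted({d for row in features.values() for d in row})
    cutoffs = {d: equal_frequency_cutoffs([row[d] for row in features.values()])
               for d in dims}
    roles = {}
    for user, row in features.items():
        levels = {d: to_level(row[d], cutoffs[d]) for d in dims}
        # First rule whose every condition matches the user's levels wins.
        roles[user] = next(
            (label for cond, label in rules
             if all(levels[d] == lv for d, lv in cond.items())),
            "unassigned",
        )
    counts = Counter(roles.values())
    return roles, {role: n / len(roles) for role, n in counts.items()}

# Skeleton rule taken from the slide; feature values below are invented.
RULES = [({"popularity": "low", "initiation": "high"}, "roleA")]
features = {
    "u1": {"popularity": 0.1, "initiation": 0.9},
    "u2": {"popularity": 0.2, "initiation": 0.1},
    "u3": {"popularity": 0.5, "initiation": 0.5},
    "u4": {"popularity": 0.6, "initiation": 0.4},
    "u5": {"popularity": 0.8, "initiation": 0.2},
    "u6": {"popularity": 0.9, "initiation": 0.8},
}
roles, composition = infer_composition(features, RULES)
```

Because the cut points are re-derived from the users present at each time step, the same rule can map to different numeric thresholds over time, which is the point of keeping the rule base as a skeleton of levels rather than fixed values.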



Measuring Role Compositions III: Mining Roles (Skeleton rule base compilation)



  1. Select the tuning segment

  2. Discover correlated behaviour dimensions
       Removed Engagement and Contribution, kept Popularity (Pearson r > 0.75, p < 0.01)

  3. Cluster users into behavioural groups

  4. Derive role labels for clusters

thereby separating users based on their behaviour and discovering distinct roles on the platform. We ran three different unsupervised clustering algorithms: Expectation-Maximization (EM), K-means and Hierarchical Clustering, over the 6-months’ tuning segment. The model selection phase not only requires choosing the correct clustering method but also selecting the optimum number of clusters to use - providing this value as a parameter $k$. To judge the best model - i.e. cluster method and number of clusters - we measure the cohesion and separation of a given clustering as follows: for each clustering algorithm ($\Psi$) we iteratively increase the number of clusters ($k$) to use, where $2 \le k \le 30$. At each increment of $k$ we record the silhouette coefficient produced by $\Psi$; this is defined for a given element ($i$) in a given cluster as:

$$ s_i = \frac{b_i - a_i}{\max(a_i, b_i)} \qquad (3) $$

Where $a_i$ denotes the average distance to all other items in the same cluster and $b_i$ is given by calculating the average distance with all other items in each other distinct cluster and then taking the minimum distance. The value of $s_i$ ranges between $-1$ and $1$, where the former indicates a poor clustering where distinct items are grouped together and the latter indicates perfect cluster cohesion and separation. To derive the silhouette coefficient $s(\Psi(k))$ for the entire clustering we take the average silhouette coefficient of all items. We find that the best clustering model and number of clusters to use is K-means with 11 clusters. We found that for smaller cluster numbers (k = [3, 8]) each clustering algorithm achieves comparable performance; however, as we begin to increase the cluster numbers K-means improves while the two remaining algorithms produce worse cohesion and separation.

3) Deriving Role Labels: Provided with the most cohesive and separated clustering of users we then derive role labels for each cluster. Role label derivation first involves inspecting the dimension distribution in each cluster and aligning the distribution with a level mapping (i.e. low, mid, high). This enables the conversion of continuous dimension ranges into discrete values which our rule-based approach requires in the Skeleton Rule Base. To perform this alignment we assess the distribution of each dimension and derive boundary points for the three feature levels using an equal-frequency binning approach. The distribution of each dimension is shown in Figure 2 for each of the 11 induced clusters together with the level boundaries. We assess the distribution of each feature for each cluster against the levels derived from the equal-frequency binning of each feature, thereby generating a feature-to-level mapping. This mapping is shown in Table II where certain clusters are combined together as they have the same feature-to-level mapping patterns - i.e. 2,5 and 8,9.

Fig. 2. Boxplots of the feature distributions in each of the 11 clusters. Feature distributions are matched against the feature levels derived from equal-frequency binning.

TABLE II. MAPPING OF CLUSTER DIMENSIONS TO LEVELS. THE CLUSTERS ARE ORDERED FROM LOW PATTERNS TO HIGH PATTERNS TO AID LEGIBILITY.

Cluster   Dispersion   Initiation   Quality   Popularity
1         L            L            L         L
0         L            M            H         L
6         L            H            M         M
10        L            H            M         H
4         L            H            H         M
2,5       M            H            L         H
8,9       M            H            H         H
7         H            H            L         H
3         H            H            H         H

In order to derive the role labels for each cluster we use a maximum-entropy decision tree to divide the clusters into branches that maximise the dispersion of dimension levels. Figure 3 shows the separation of the clusters from a complete grouping into a single cluster, or merged clusters in the case of 2,5 and 8,9, in each leaf. To perform the separation at a given decision node, we measure the entropy of the dimensions and their levels across the clusters; we then choose the dimension with the largest entropy. This is defined formally as:

$$ H(dim) = -\sum_{level}^{|levels|} p(level \mid dim) \log p(level \mid dim) \qquad (4) $$

Fig. 3. Maximum-entropy decision tree used to segment the clusters into minimal-distance paths. The paths are used to generate the role labels for each respective cluster.

We perform this process until single clusters, or the previously merged clusters, are in each leaf node and then use the path to the root node to derive the label. For instance, for cluster 0 the path from the root node to the leaf node is quality=high, dispersion=low, initiation=medium, thereby deriving the role label Focussed Expert Participant for the cluster. In the label, focussed describes the focus dispersion of the role - i.e. it is low and therefore not distributed, expert

[Figure: per-cluster boxplots, one panel each for Dispersion, Initiation, Quality and Popularity (x-axis: Cluster)]
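The silhouette-based model selection described in the excerpt can be illustrated with a small, dependency-free sketch. It implements the per-item score and the average over all items; Euclidean distance, the score of 0 for singleton clusters and the toy points are assumptions made here, as the excerpt does not state them.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items() if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Two well-separated pairs of points; the labelling decides the score.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(points, [0, 0, 1, 1])   # neighbours grouped together
bad = silhouette(points, [0, 1, 0, 1])    # distant points grouped together
```

Model selection would then evaluate this score for each candidate algorithm and each k in the range given in the excerpt and keep the combination with the highest value. The function needs at least two clusters, which matches the lower bound of k = 2.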

•  1 - Focussed Novice
•  2,5 - Mixed Novice
•  7 - Distributed Novice
•  3 - Distributed Expert
•  8,9 - Mixed Expert
•  0 - Focussed Expert Participant
•  4 - Focussed Expert Initiator
•  6 - Knowledgeable Member
•  10 - Knowledgeable Sink
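The labels above come from root-to-leaf paths in the maximum-entropy decision tree. The sketch below applies the entropy criterion of equation (4) to the levels listed in Table II, treating each merged cluster as one row. The recursive splitting procedure, the base-2 logarithm and the tie-breaking by column order are assumptions of this sketch, so only the root split and the cluster 0 path (both stated in the text) should be read as matching the source.

```python
import math
from collections import Counter

DIMS = ["Dispersion", "Initiation", "Quality", "Popularity"]

# Levels copied from Table II; merged clusters (2,5 and 8,9) count as one row.
TABLE_II = {
    "1":   dict(zip(DIMS, "LLLL")),
    "0":   dict(zip(DIMS, "LMHL")),
    "6":   dict(zip(DIMS, "LHMM")),
    "10":  dict(zip(DIMS, "LHMH")),
    "4":   dict(zip(DIMS, "LHHM")),
    "2,5": dict(zip(DIMS, "MHLH")),
    "8,9": dict(zip(DIMS, "MHHH")),
    "7":   dict(zip(DIMS, "HHLH")),
    "3":   dict(zip(DIMS, "HHHH")),
}

def level_entropy(rows, dim):
    """Entropy of the level distribution of one dimension across the rows."""
    counts = Counter(row[dim] for row in rows)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

def tree_paths(table, dims, prefix=()):
    """Split on the highest-entropy dimension until one cluster per leaf."""
    if len(table) <= 1 or not dims:
        return {name: prefix for name in table}
    rows = list(table.values())
    best = max(dims, key=lambda d: level_entropy(rows, d))  # ties: first in dims
    rest = [d for d in dims if d != best]
    paths = {}
    for lv in sorted({row[best] for row in rows}):
        subset = {n: r for n, r in table.items() if r[best] == lv}
        paths.update(tree_paths(subset, rest, prefix + ((best, lv),)))
    return paths

root = max(DIMS, key=lambda d: level_entropy(list(TABLE_II.values()), d))
paths = tree_paths(TABLE_II, DIMS)
```

With these inputs the root split falls on Quality, and cluster 0 receives the path Quality=H, Dispersion=L, Initiation=M, which agrees with the example given for Focussed Expert Participant.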

Experiment 1: Health Indicator Regression

  Managing online communities is helped by understanding the relation between behaviour and health

  Experimental Setup
    Induced Linear Regression Models for each Health Indicator and Community
    Using a time-series dataset
      Independent variables: 9 roles with composition proportions as values at a given time point
        E.g. @ t = k: Mixed Expert = 0.05, Distributed Novice = 0.51, etc.
      Dependent variable: health indicator (e.g. churn rate) at the same time point
        E.g. @ t = k: Churn Rate = 0.21
    PCA of each community health indicator model using the model’s coefficients
      Look for a common health composition pattern



Experiment 1: Health Indicator Regression Results

  Common Health Composition Pattern
    Churn Rate: Differences for Focussed Expert Participant & Mixed Expert, similarities for Focussed Expert Initiators (decrease in role correlated with increase in churn rate)
    User Count: Differences for Focussed Expert Initiators, commonalities for knowledgeable roles
    Seeds-to-Non-Seeds: Similar effects for Focussed Expert Initiators and Participants, and Distributed Experts (all decrease in role correlated with increased proportion)
    Clustering Coefficient: no common patterns

  Idiosyncratic Health Composition Pattern
    Divergence patterns between outlier communities

  No general pattern exists that describes the relation between roles and health


[Figure: PCA plots (PC1 vs PC2) of the per-forum regression models, one panel each for Churn Rate, User Count, Seeds / Non-seeds Prop and Clustering Coefficient; points are labelled with forum IDs]

Experiment 2: Health Change Detection

  Can we accurately and effectively detect positive and negative changes in community health from its composition of behavioural roles?

  Experimental Setup
    Binary classification of indicator change
      At t=k+1: predict increase or decrease in health indicator from t=k
    Time-ordered dataset:
      Features @ t=k+1: 9 roles with composition proportions as values
      Class @ t=k+1: positive (if increase from t=k), negative (if decrease)
      Divide dataset into 80/20 split maintaining time-ordering
    Tested using a logistic regression classifier
      Platform-level model
      Community-specific model
    Evaluated using Matthews Correlation Coefficient (MCC) and Area under the ROC Curve (AUC)
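The data preparation and the MCC evaluation in this setup can be sketched without any classifier library. The helper names, the treatment of "no change" as a decrease and the sample churn series are assumptions for illustration; the logistic regression step itself is not shown.

```python
import math

def change_labels(series):
    """Label step k+1 as 1 if the indicator rose from step k, otherwise 0."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(series, series[1:])]

def time_ordered_split(rows, train_fraction=0.8):
    """Keep chronological order: earliest rows train, latest rows test."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Invented churn-rate series for one forum, one value per time step.
churn = [0.21, 0.25, 0.20, 0.22, 0.30, 0.28, 0.31, 0.29, 0.33, 0.35, 0.30]
labels = change_labels(churn)                # one label per step k+1
train, test = time_ordered_split(labels)     # 80/20, no shuffling
```

Splitting by position rather than at random keeps every test step later than every training step, so the model is never evaluated on a period it has effectively already seen.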



Experiment 2: Health Change Detection Results



results indicate using the role composition information, even in the outlier communities, provides sufficient information to outperform the random guesser baseline for all health measures except the User Count for forum 353. We also find that for the 412 and 414 central forums we achieve poorer performance than the baseline for the User Count and Clustering Coefficient.

TABLE IV. PERFORMANCE OF DETECTING HEALTH CHANGES USING A LOGISTIC REGRESSION MODEL INDUCED: ACROSS THE ENTIRE PLATFORM (TABLE IV(A)), PER-FORUM (TABLE IV(B)) AND FOR SPECIFIC CENTRAL AND OUTLIER FORUMS (TABLE IV(C)). IN THIS LATTER CASE WE REPORT THE MATTHEWS CORRELATION COEFFICIENT AND THE F1 SCORE.

(a) Platform
Class                    MCC       Prec    Recall  F1      AUC
Churn                    0.047     0.573   0.630   0.531   0.590
User Count               0.035     0.591   0.646   0.522   0.598
Seeds / Non-seeds        0.078     0.592   0.640   0.566   0.617
Clustering Coefficient   0.077.    0.591   0.641   0.581   0.647
Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1

(b) Per-forum
Class                    MCC       Prec    Recall  F1      AUC
Churn                    0.110**   0.618   0.634   0.619   0.569
User Count               0.175**   0.652   0.661   0.650   0.589
Seeds / Non-seeds        0.163*    0.637   0.657   0.639   0.589
Clustering Coefficient   0.089**   0.624   0.642   0.626   0.568
Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1

(c) Forum Specific Results. MCC / F1
                         Central                                             Outliers
Class                    252              412              414               353              419              50
Churn                    0.105 / 0.564    0.042 / 0.621    0.284 / 0.700     -0.076 / 0.543   0.173 / 0.633    0.092 / 0.585
User Count               0.088 / 0.543    0.580 / 0.903    -0.106 / 0.701    0.279 / 0.648    0.299 / 0.667    0.343 / 0.693
Seeds / Non-seeds        0.117 / 0.575    0.339 / 0.717    0.189 / 0.744     0.007 / 0.519    0.265 / 0.632    0.400 / 0.811
Clustering Coefficient   0.057 / 0.536    -0.043 / 0.568   0.353 / 0.727     0.156 / 0.582    0.127 / 0.568    0.282 / 0.641

1) Results: Health Danger Detection: Thus far we have assessed how well our detection models work in both class settings (i.e. increase and decrease). We now move to a scenario in which we wish to detect health dangers, and in doing so provide warnings to community managers of the likely reduction in health of their communities. To do this we set the class label in our prediction models to be the bad health signifier as follows: Churn Rate = Increase, User Count = Decrease, Seeds to Non-seeds Proportion = Decrease and Clustering Coefficient = Decrease.

Figure 6 shows the Receiver Operating Characteristic curves for the per-forum logistic regression models. The curves indicate we can outperform the random classifier for all forums apart from 5 for the Churn Rate and User Count, all but 6 for the Seeds to Non-seeds Proportion and all but 8 forums for the Clustering Coefficient, demonstrating the variation in performance that we achieve across the communities. The forums that we consistently performed poorly on were 265 (SAP Business One Product Development) and 319 (Best Practice and Benchmarking), achieving worse performance than the random baseline for all health indicators. The reduction in accuracy could be caused by the roles detected in the community not befitting its nature, where instead conversation and discussion-driven roles are assumed - similar to the roles used in our previous work [22]. In general, the results demonstrate the effectiveness of using role composition information to detect when a forum’s health will degrade. Using this information the managers of such forums can now identify when the users of their communities change their behaviour in a way that could negatively affect the health of their community.

VII. DISCUSSION AND FUTURE WORK

The findings from our analyses have identified interesting health composition patterns and the lack of a global composition pattern for the entirety of SCN. Although it has been argued that the choice of metrics should be dependent on community objectives [3], there are hardly any studies that demonstrate this. SCN is a Question Answering platform, and hence its objective is providing responses to questions. When inspecting the patterns learnt for the Seeds to Non-seeds Proportion indicator we find that across communities a decrease in Focussed Experts is correlated with an increase in the indicator. This implies that users knowledgeable on specific topics actually increase the number of unanswered posts. It could be that the questions asked by such expert users cannot be solved by most community users.

Marin et al. analysed 11 Linux support communities and found a global positive correlation of core users and network cohesion [4]. They defined core users as those whose out-degree is greater than the community average out-degree plus one standard deviation. In our work we account for a multitude of roles that users may assume, rather than just one set. Our analysis also showed that such global patterns are harder to identify when the communities differ in topic and nature. For instance, an increase in Knowledgeable Member for 252 and 353 was linked with an increase in the Clustering Coefficient, while the converse held for 412 and 414. We also found that Knowledgeable Sink was associated with increased social capital for 252, 353, 419 and 50. This latter role is closest to the notion of core users that we find on SCN given its high Popularity (see footnote 4).

In this paper we focused our study on 4 popular health indicators, and several more can be added next. However, one pertinent question is whether codependencies exist between the health indicators. Our future work will explore the correlation between these indicators. It could be the case that certain metrics are redundant, while others are salient. Similarly, it is possible that some of these metrics are more representative of health of some communities than others. Linked to this avenue of exploration is the creation of a single health metric, or index similar to [7], that provides community managers with a basic observable indicator.

Our analysis identified various correlations between behaviour roles and health metrics. Next we need to study causation dynamics to better understand the influence and sequence of events that lead to a health metric, or a behaviour role, changing.

VIII. CONCLUSIONS

Assessing the health of online communities provides managers and operators with information about the condition of their community and how it is acting. Tying such assessments to the implicit behaviour within online communities, and the

Footnote 4: We found Engagement (normalised out-degree) and Popularity (normalised in-degree) to have a significant positive correlation when mining roles on SCN (Pearson correlation coefficient = 0.926, p-value < 0.001)

[Figure: ROC curves (FPR vs TPR) for the per-forum models, one panel each for Churn Rate, User Count, Seeds / Non-seeds Prop and Clustering Coefficient]

  Per-forum models outperform platform models for each health indicator
    Demonstrates the need to assess and understand communities individually
    We also yield good performance for outlier communities

  ROC Curves surpass baseline for:
    Churn rate: 20/25 forums
    User Count: 20/25 forums
    Seeds-to-Non-Seeds: 19/25 forums
    Clustering Coefficient: 17/25 forums

Findings and Conclusions

  No global composition pattern for the entirety of SCN
    Identified key differences as to ‘What makes Communities tick’
    Decrease in Focussed Experts correlated with an increase in Seeds-to-Non-Seeds
    (Marin et al., 2009) found a correlation between increase in Core Users and Network Cohesion
      We found a correlation between an increase in Knowledgeable Sinks and Social Capital

  Accurate detection of community health change is possible using role composition information
    Significantly outperformed baseline models
    Per-forum models outperformed platform-level models

  Future Work:
    Explore co-dependencies between health indicators
    Application of our approach over different communities and platforms
      E.g. IBM Connections, Boards.ie



Questions?

Web: http://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem
Email: [email protected]
Twitter: @mattroweshow
