
Choosing the Right Crowd: Expert Finding in Social Networks (EDBT 2013)



Page 1

CHOOSING THE RIGHT CROWD

EXPERT FINDING IN SOCIAL NETWORKS

Alessandro Bozzon

Marco Brambilla

Stefano Ceri

Matteo Silvestri

Giuliano Vesci

Politecnico di Milano

Dipartimento di Elettronica, Informazione e BioIngegneria

Page 2

Problems and terms

• Human Computation: computation carried out by groups of humans (examples: collaborative filtering, online auctions, tagging, games with a purpose)

• Crowd-sourcing: the process of building a human computation using computers as organizers, by organizing the computation as several tasks (possibly with dependencies) performed by humans

• Crowd-searching: a specific task consisting of searching for information

• Crowd-sourcing platform: a software system for managing tasks, capable of organizing tasks, assigning them to humans, and assembling and processing the returned results (such as Amazon Mechanical Turk, Doodle)

• Social platform: a platform where humans perform social interactions (such as Facebook, Twitter, LinkedIn)

Page 3

The market

Page 4

Why Crowd-search?

• People do not fully trust web search

• They want to get direct feedback from people

• They expect recommendations, insights, opinions, reassurance

Page 5

And given that crowds spend time on social networks…

• Our proposal is to use social networks and Q&A websites as crowd-searching platforms, in addition to crowdsourcing platforms

• Example: search tasks

Page 6

From social workers to communities

• Issues and problems:

• Motivation of the responders

• Intensity of the asker's social activity

• Topic appropriateness

• Timing of the post (hour of the day, day of the week)

• Context and language barriers

Page 7

Crowd-searching after conventional search

• From search results to feedback from friends and experts

[Diagram: initial query → Search System → Human Search System → Social Platforms]

Page 8

Example: Find your next job (exploration)

Page 9

Example: Find your job (social invitation)

Page 10

Example: Find your job (social invitation)

Selected data items can be transferred to the crowd question

Page 11

Find your job (response submission)

Page 12

CrowdSearcher results (in the loop)

Page 13

WWW 2012 – THE MODEL

Page 14

Task management problems

Typical crowdsourcing problems:

• Task splitting: the input data collection is too complex relative to the cognitive capabilities of users.

• Task structuring: the query is too complex or too critical to be executed in one shot.

• Task routing: a query can be distributed according to the values of some attribute of the collection.

Plus:

• Platform/community assignment: a task can be assigned to different communities or social platforms based on its focus

Page 15

Task Design

• What are the input objects of the crowd interaction?

• Should they have a schema (a set of fields, each defined by a name and a type)?

• Which operations should the crowd perform?
  • Like, label, comment, add new instances, verify/modify data, order, etc.

• How should the task be split into micro-tasks assigned to each person? How should a specific object be assigned to each person?

• How should the results of the micro-tasks be aggregated?
  • Sum, average, majority voting, etc.

• Which execution interface should be used?

A minimal task-definition sketch follows.
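To make these ingredients concrete, here is a minimal sketch of a task definition; the class and field names are illustrative assumptions, not the actual CrowdSearcher API.

```python
from dataclasses import dataclass

@dataclass
class SchemaField:
    name: str          # e.g. "name", "address"
    type: str          # e.g. "string", "image", "number"

@dataclass
class Task:
    operation: str                    # "like", "tag", "classify", "add", ...
    schema: list[SchemaField]         # structure of the input objects
    objects: list[dict]               # input objects conforming to the schema
    objects_per_microtask: int = 3    # splitting parameter
    aggregation: str = "majority"     # "sum", "average", "majority", ...

# Example: a "like" task over restaurant objects
restaurants = Task(
    operation="like",
    schema=[SchemaField("name", "string"), SchemaField("address", "string")],
    objects=[{"name": "Trattoria X", "address": "Piazza Leonardo da Vinci 32, Milano"}],
)
```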

Page 16

Operations

• In a Task, performers are required to execute logical operations on input objects
  • e.g. Locate the faces of the people appearing in the following 5 images

• CrowdSearcher offers pre-defined operation types:

• Like: ask a performer to express a preference (true/false)
  • e.g. Do you like this picture?

• Comment: ask a performer to write a description / summary / evaluation
  • e.g. Can you summarize the following text using your own words?

• Tag: ask a performer to annotate an object with a set of tags
  • e.g. How would you label the following image?

• Classify: ask a performer to classify an object within a closed set of alternatives
  • e.g. Would you classify this tweet as pro-right, pro-left, or neutral?

• Add: ask a performer to add a new object conforming to the specified schema
  • e.g. Can you list the name and address of good restaurants near Politecnico di Milano?

• Modify: ask a performer to verify/modify the content of one or more input objects
  • e.g. Is this wine from Cinque Terre? If not, where does it come from?

• Order: ask a performer to order the input objects
  • e.g. Order the following books according to your taste
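As a compact reference, the operation types above can be summarized in code; this enum is an illustrative rendering of the slide's list, not part of the CrowdSearcher implementation.

```python
from enum import Enum

class OperationType(Enum):
    """Pre-defined operation types, as listed on the slide."""
    LIKE = "express a true/false preference on an object"
    COMMENT = "write a description / summary / evaluation"
    TAG = "annotate an object with a set of tags"
    CLASSIFY = "classify an object within a closed set of alternatives"
    ADD = "add a new object conforming to the task schema"
    MODIFY = "verify or modify the content of input objects"
    ORDER = "order the input objects"
```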

Page 17

Splitting Strategy

• Given N objects in the task:

• Which objects should appear in each MicroTask?
• How many objects in each MicroTask?
• How often should an object appear across MicroTasks?
• Which objects cannot appear together?
• Should objects always be presented in the same order?

A minimal splitting function is sketched below.
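This sketch answers the questions above with two knobs (objects per micro-task and per-object redundancy); the function and parameter names are illustrative assumptions.

```python
import random

def split(objects, per_microtask=3, redundancy=2, seed=None):
    """Schedule each object `redundancy` times into micro-tasks of
    `per_microtask` objects each, in randomized order."""
    rng = random.Random(seed)
    pool = [obj for obj in objects for _ in range(redundancy)]
    rng.shuffle(pool)  # avoid always presenting objects in the same order
    # NOTE: a real strategy would also forbid duplicates within one
    # micro-task and encode "cannot appear together" constraints.
    return [pool[i:i + per_microtask] for i in range(0, len(pool), per_microtask)]

microtasks = split(["img1", "img2", "img3", "img4"], per_microtask=2)
```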

Page 18

Assignment Strategy

• Given a set of MicroTasks, which performers are assigned to them?

• Online assignment: MicroTasks dynamically assigned to performers
  • First come / first served
  • Based on a choice of the performer

• Offline assignment: MicroTasks statically assigned to performers
  • Based on performers' priority
  • Based on matching

• Invitation
  • Send an email to a mailing list
  • Publish a HIT on Mechanical Turk (dynamic)
  • Create a new challenge in your game
  • Publish a post/tweet on your social network profile
  • Publish a post/tweet on your friends' profiles

Both assignment modes are sketched below.
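A hedged sketch of the two assignment modes; the data structures and the `priority` function are assumptions for illustration.

```python
from collections import deque

def assign_online(microtasks, performer_stream):
    """First-come / first-served: hand the next micro-task to whoever shows up."""
    queue = deque(microtasks)
    assignment = {}
    for performer in performer_stream:
        if not queue:
            break
        assignment[performer] = queue.popleft()
    return assignment

def assign_offline(microtasks, performers, priority):
    """Static assignment: micro-tasks pre-allocated by performer priority."""
    ranked = sorted(performers, key=priority, reverse=True)
    return dict(zip(ranked, microtasks))

# Toy usage: rank performers by an (arbitrary) priority function
offline = assign_offline(["mt1", "mt2"], ["anna", "bob"], priority=len)
```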

Page 19

Deployment: search on the social network

• Multi-platform deployment

[Diagram: a generated query template reaches the community/crowd either as an embedded application inside the social/crowd platform (reusing its native behaviours) or as an external, standalone application that talks to the platform through its API]

Page 20

Deployment: search on the social network

• Multi-platform deployment

Page 21

Deployment: search on the social network

• Multi-platform deployment

Page 22

Deployment: search on the social network

• Multi-platform deployment

Page 23

Deployment: search on the social network

• Multi-platform deployment

Page 24

Crowdsearch experiments

• About 150 users

• Two classes of experiments:

• Random questions on fixed topics of interest (e.g. restaurants in the vicinity of Politecnico, famous 2011 songs, top-quality EU soccer teams)

• Questions independently submitted by the users

• Different invitation strategies:

• Random invitation

• Explicit selection of responders by the asker

• Outcome:

• 175 like and insert queries

• 1536 invitations to friends

• 230 answers

• 95 questions (~55%) got at least one answer

Page 25

Experiments: Manual and random questions

Page 26

Experiments: Interest and relationship

• Manually written and assigned questions consistently receive more responses over time

Page 27

Experiments: Query type

• Engagement depends on the difficulty of the task

• Like vs. Add tasks:

Page 28

Experiment: Social platform

• The role of the question enactment platform

• Facebook vs. Doodle

Page 29

Experiment: Posting time

• The role of the question enactment platform

• Facebook vs. Doodle

Page 30

EDBT 2013

Page 31

Problem

• Ranking the members of a social group according to the level of knowledge that they have about a given topic

• Application: crowd selection (for Crowd Searching or Sourcing)

• Available data:
  • User profiles
  • The behavioral trace that users leave behind through their social activities

Page 32

Considered Features

• User Profiles
  • Plus linked Web pages

• Social Relationships
  • Facebook friendship
  • Twitter mutual following relationship
  • LinkedIn connections

• Resource Containers
  • Groups, Facebook pages
  • Linked pages
  • Users followed by a given user are also resource containers

• Resources
  • Material published in resource containers

Page 33

Feature Organization Meta-Model
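The meta-model figure is not reproduced here; below is a minimal sketch of its concepts, using the names from the surrounding slides. The class and relationship names paraphrase the slides and are assumptions, not the paper's exact schema.

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    text: str                        # material published on the platform

@dataclass
class ResourceContainer:             # e.g. a group, a Facebook page, a linked page
    resources: list[Resource] = field(default_factory=list)

@dataclass
class UserProfile:
    owned_resources: list[Resource] = field(default_factory=list)        # owns/creates/annotates
    containers: list[ResourceContainer] = field(default_factory=list)    # relatedTo
    follows: list["UserProfile"] = field(default_factory=list)           # friendship / following
```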

Page 34

Example (Facebook)

Page 35

Example (Twitter)

Page 36

Resource Distance

• Objects in the social graph are organized according to their distance from the user profile

• Why? Privacy, computational cost, platform access constraints

Distance  Resource paths
0         Expert Candidate profile
1         Expert Candidate owns/creates/annotates Resource
1         Expert Candidate relatedTo Resource Container
1         Expert Candidate follows UserProfile
2         Expert Candidate follows UserProfile relatedTo Resource Container
2         Expert Candidate relatedTo Resource Container contains Resource
2         Expert Candidate follows UserProfile owns/creates/annotates Resource
2         Expert Candidate follows UserProfile follows UserProfile
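Using the meta-model classes sketched earlier, collecting the objects at each distance is a bounded graph traversal; this is an illustrative sketch, not the paper's implementation.

```python
def objects_by_distance(candidate):
    """Follow the distance-1 and distance-2 paths from the table above."""
    d1 = candidate.owned_resources + candidate.containers + candidate.follows
    d2 = []
    for container in candidate.containers:
        d2 += container.resources                                  # relatedTo -> contains
    for user in candidate.follows:
        d2 += user.owned_resources + user.containers + user.follows
    return {0: [candidate], 1: d1, 2: d2}
```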

Page 37

Distance interpretation

(Same distance table as on the previous slide.)

Page 38

Resource Processing

• Extraction from social network APIs

• Extraction of text from linked Web pages
  • Alchemy Text Extraction APIs

• Language identification

• Text processing
  • Sanitization, tokenization, stopword removal, lemmatization

• Entity extraction and disambiguation
  • TagMe

A toy version of the text-processing steps is sketched below.
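A toy sketch of the text-processing steps (sanitization, tokenization, stopword removal); the Alchemy and TagMe calls are external services and are deliberately omitted here.

```python
import re

STOPWORDS = {"the", "a", "an", "and", "of", "in", "to"}  # toy list

def process(text: str) -> list[str]:
    """Sanitize, tokenize, and drop stopwords
    (lemmatization and TagMe entity linking omitted)."""
    text = re.sub(r"<[^>]+>|http\S+", " ", text)       # strip markup and URLs
    tokens = re.findall(r"[a-z]+", text.lower())       # tokenize
    return [t for t in tokens if t not in STOPWORDS]   # remove stopwords

print(process("Watching the <b>Champions League</b> final!"))
# -> ['watching', 'champions', 'league', 'final']
```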

Page 39

Method – Resource Score

• tf(t,r): term frequency of term t in resource r; irf(t): inverse resource frequency of t

• ef(e,r): entity frequency of entity e in resource r; eir(e): inverse entity frequency of e

• we(e,r): relevance (weight) of entity e in resource r (entity component weighting)

[The score formula image is not reproduced; it combines a term component and an entity component in TF-IDF style.]
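From the definitions above, a hedged reconstruction of the missing formula; the exact combination used in the paper may differ:

```latex
score(r, q) \;=\; \sum_{t \in q} tf(t, r)\cdot irf(t) \;+\; \sum_{e \in q} w_e(e, r),
\qquad w_e(e, r) \;\propto\; ef(e, r)\cdot eir(e)
```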

Page 40

Method: Expert Score

• Experts are ranked according to score(q, ex)

• Ingredients (from the formula image, not reproduced here): the resource score, a resource weight for the given expertise, and a window size limiting how many resources of each candidate are considered
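A hedged reading of how these ingredients combine; an assumption consistent with the labels above, not necessarily the paper's exact formula:

```latex
score(q, ex) \;=\; \sum_{r \,\in\, W(ex)} w(r, ex)\cdot score(r, q)
```

where W(ex) is the window of top resources considered for candidate ex (the window size) and w(r, ex) is the resource weight for the given expertise.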

Page 41

Dataset

• 7 kinds of expertise: Computer Engineering, Location, Movies & TV, Music, Science, Sport, Technology & Videogames

• 40 volunteer users (on Facebook, Twitter, and LinkedIn)

• 330,000 resources (70% with URLs to external resources)

• Ground truth created through self-assessment
  • For each expertise, a vote on a 7-point Likert scale
  • Experts: users with self-assessed expertise above the average

Page 42

Distribution of Expertise and Resources

• High number of resources on Facebook and Twitter

• Higher number of users on Facebook

• Average expertise ~3.5 / 7
  • High Music and Sport expertise
  • Low Location expertise

Page 43

Metrics

• We obtain lists of candidate experts and assess them against the ground truth, using:

• For precision:
  • Mean Average Precision (MAP)
  • 11-Point Interpolated Average Precision (11-P)

• For ranking:
  • Mean Reciprocal Rank (MRR) – for the first relevant result
  • Normalized Discounted Cumulative Gain (nDCG) – for more results; can be computed @N over the first N results
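For reference, minimal implementations of the two ranking metrics over binary relevance flags (a sketch, not the paper's evaluation code):

```python
import math

def mrr(ranked_runs):
    """Mean Reciprocal Rank: each run is a list of 0/1 relevance
    flags in rank order."""
    return sum(1.0 / (run.index(1) + 1) if 1 in run else 0.0
               for run in ranked_runs) / len(ranked_runs)

def ndcg_at(run, n):
    """Binary nDCG@N for one ranked list of 0/1 relevance flags."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(run[:n]))
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(n, sum(run))))
    return dcg / ideal if ideal else 0.0

print(mrr([[0, 1, 0], [1, 0, 0]]))   # -> 0.75
print(ndcg_at([0, 1, 1, 0], n=3))
```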

Page 44

Metrics improve with resources

• But this comes at a cost

Page 45

The friendship relationship is not useful

• Inspecting friends' resources does not improve the metrics!

Page 46

Social Network Analysis

Comparison of the results obtained with all the social networks combined, or separately with Facebook, Twitter, and LinkedIn.

Page 47

Main Results

• Profiles are less effective than level-1 resources
  • Resources produced by others help in describing each individual's expertise

• Twitter is the most effective social network for expertise matching – it sometimes outperforms all the other social networks
  • Twitter is most effective in Computer Engineering, Science, Technology & Videogames, and Sport

• Facebook is effective in Location, Sport, Movies & TV, and Music

• LinkedIn is never very helpful in locating expertise

Page 48

WWW 2013 – Reactive Crowdsourcing

Page 49

Main Message

• Crowd-sourcing should be dynamically adapted

• The best way to do so is through active rules

• Four kinds of rules: execution / object / performer / task control

• Guaranteed termination

• Extensibility

A sketch of such an active rule follows.
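A hedged sketch of an active (event-condition-action) control rule in this spirit; the Rule class and the example context keys are illustrative assumptions, not the paper's rule language.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    event: str                          # e.g. "answer_received"
    condition: Callable[[dict], bool]   # guard over the execution context
    action: Callable[[dict], None]      # control action to fire

# Performer-control rule: drop performers whose agreement with the
# majority answer falls below a threshold (hypothetical context keys).
ban_low_agreement = Rule(
    event="answer_received",
    condition=lambda ctx: ctx["performer_agreement"] < 0.3,
    action=lambda ctx: ctx["banned"].add(ctx["performer"]),
)

def dispatch(rules, event, ctx):
    """Fire every rule whose event matches and whose condition holds."""
    for rule in rules:
        if rule.event == event and rule.condition(ctx):
            rule.action(ctx)
```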