Clustering Crowds

Hiroshi Kajino (1), Yuta Tsuboi (2), Hisashi Kashima (1)

1: The University of Tokyo
2: IBM Research - Tokyo

July 16th, 2013, AAAI-13

*H. Kajino and H. Kashima were supported by the FIRST program.


Page 2: 20130716 aaai13-short

Outline

• Motivation and Problem Setting

Quality control problem of crowdsourcing

• Existing Method

Learning from a crowd-generated training set

• Proposed Method

Focusing on the similarity between workers

• Experimental Results

Robust estimation can be realized

• Conclusion

Page 4: 20130716 aaai13-short

Crowdsourcing

• Crowdsourcing: a system for accessing a large crowd of workers

Pros: human intelligence tasks can be processed at low cost

Cons: the abilities of the workers are unknown

⇒ The quality of the results is not guaranteed

Crowdsourcing gives access to a large but unknown workforce

[Figure: interaction between a requester and a worker. 1. The requester requests tasks; 2. the worker returns results; 3. the requester pays rewards.]

Page 5: 20130716 aaai13-short

Task in Machine Learning Community

• Task: is the picture a bird?

Pros: easy to construct a large training set at low cost

Cons: the quality of the labels is not guaranteed

A large but low-quality training set can be obtained easily

[Figure: example pictures ranging from easy to difficult, each with Yes/No labels given by a superior worker and an inferior worker, shown next to the true labels (unobservable).]

Page 6: 20130716 aaai13-short

Task in Machine Learning Community

Goal: overcome this difficulty (crowd labels of unguaranteed quality)

Page 7: 20130716 aaai13-short

Problem Setting

• Input

– Feature vector: $x_i \in \mathbb{R}^D$ ($i = 1, \dots, I$)

– Worker: $j \in \{1, 2, \dots, J\}$

– Crowd label: $y_{ij} \in \{0, 1\}$

• Output

– Classifier $w_0 \in \mathbb{R}^D$ (logistic regression model)

Note: ground-truth labels are not used

• Common Approach:

1. Model the relation between $w_0$ and $\{y_{ij}\}$

2. Infer the model to obtain $w_0$

Estimate a classifier from crowd-generated data

[Figure: a classifier $w_0$ deciding "bird or not" for a picture.]
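The problem setting can be made concrete with a small sketch. It is illustrative only: the names `sigmoid`, `predict_proba`, `features`, and `crowd_labels`, the toy numbers, and the choice to store labels sparsely in a dictionary keyed by `(i, j)` (on the assumption that not every worker labels every instance) are assumptions, not taken from the slides.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    """P(label = 1 | x) under a logistic regression model with weights w."""
    return sigmoid(sum(wd * xd for wd, xd in zip(w, x)))

# Toy input: I = 3 instances with D = 2 features, J = 2 workers.
features = [[1.0, 0.5], [-0.5, 2.0], [0.0, -1.0]]             # x_i
crowd_labels = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (2, 1): 0}   # (i, j) -> y_ij

# A candidate classifier w_0; in the slides this is the quantity to estimate.
w0 = [2.0, -1.0]
probabilities = [predict_proba(w0, x) for x in features]
```

Here `w0` is fixed by hand only to show how a classifier is applied; the methods discussed in the talk estimate it from `features` and `crowd_labels` alone.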

Page 9: 20130716 aaai13-short

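Existing Method

• Personal Classifier (PC) Method [Kajino+, 12]

– Each worker j is modeled as a classifier $w_j$ ($= w_0$ + noise)

Aggregate the "personal classifiers" to obtain $w_0$

[Figure: graphical model of the PC method. The true classifier $w_0$ has the prior distribution $\mathcal{N}(w_0 \mid 0, \eta^{-1} I)$; the personal classifiers $w_1, w_2, w_3$ of workers j = 1, 2, 3 produce the crowd labels $y_{i1}, y_{i2}, y_{i3}$. A legend distinguishes known from unknown quantities.]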
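As a rough illustration of the generative view on this slide, the sketch below draws a true classifier from the prior $\mathcal{N}(w_0 \mid 0, \eta^{-1} I)$, perturbs it to obtain one personal classifier per worker, and samples crowd labels from each personal classifier. The Gaussian form of the worker noise, its precision `lam`, and all function and variable names are assumptions made for illustration; the slide itself only states $w_j = w_0$ + (noise).

```python
import math
import random

def sample_pc_model(features, num_workers, eta, lam, rng):
    """Sample (w0, personal classifiers, crowd labels) from an assumed PC-style model."""
    dim = len(features[0])
    # Prior on the true classifier: w0 ~ N(0, eta^-1 I).
    w0 = [rng.gauss(0.0, 1.0 / math.sqrt(eta)) for _ in range(dim)]
    personal = []
    labels = {}
    for j in range(num_workers):
        # Assumed worker noise: w_j ~ N(w0, lam^-1 I).
        wj = [w + rng.gauss(0.0, 1.0 / math.sqrt(lam)) for w in w0]
        personal.append(wj)
        for i, x in enumerate(features):
            score = sum(a * b for a, b in zip(wj, x))
            p = 1.0 / (1.0 + math.exp(-score))  # logistic regression model
            labels[(i, j)] = 1 if rng.random() < p else 0
    return w0, personal, labels

rng = random.Random(0)
features = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(5)]
w0, personal, labels = sample_pc_model(features, num_workers=3, eta=1.0, lam=10.0, rng=rng)
```

A larger `lam` keeps the personal classifiers closer to `w0`, which corresponds to workers who agree more closely with the true classifier.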

Page 10: 20130716 aaai13-short

Existing Method

• Parameter estimation = MAP estimation

– Parameters: $w_0$, $W = \{w_j\}_{j=1}^{J}$

– Solve the convex optimization problem:

$\min_{w_0, W}$ [objective not recoverable from the extracted text; its terms are labeled "prior", "model-relation", and "loss for PCs" (logistic loss)]

Parameter estimation = optimizing a convex function
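The three term labels on this slide (prior, model-relation, loss for PCs) suggest an objective of the following shape. This is a sketch under stated assumptions, not the formula from the slide: it assumes Gaussian worker noise with a precision $\lambda$ (a symbol introduced here), writes $\mathcal{I}_j$ for the set of instances labeled by worker $j$ (also introduced here), and takes $\ell$ to be the logistic loss with $\sigma(z) = 1 / (1 + e^{-z})$. The prior term follows from the prior $\mathcal{N}(w_0 \mid 0, \eta^{-1} I)$ given earlier.

```latex
\min_{w_0, W}\;
\underbrace{\frac{\eta}{2}\,\lVert w_0 \rVert_2^2}_{\text{prior}}
\;+\;
\underbrace{\frac{\lambda}{2}\sum_{j=1}^{J} \lVert w_j - w_0 \rVert_2^2}_{\text{model-relation}}
\;+\;
\underbrace{\sum_{j=1}^{J}\sum_{i \in \mathcal{I}_j}
  \ell\bigl(y_{ij},\, \sigma(w_j^{\top} x_i)\bigr)}_{\text{loss for PCs}},
\qquad
\ell(y, p) = -y \log p - (1 - y)\log(1 - p)
```

Each term is convex in $(w_0, W)$, which is consistent with the slide's statement that parameter estimation amounts to optimizing a convex function.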

Page 11: 20130716 aaai13-short

Existing Method: Discussion

• Personal Classifier Method [Kajino+, 2012]

Number of parameters per worker = D

Pros: global optimum

Cons: poor performance when there are few data per worker

• Clustered Personal Classifier Method (proposed)

Pros: global optimum & moderate performance

Key: fuse similar workers to decrease the degrees of freedom

Estimation can be unstable for the PC method

Page 13: 20130716 aaai13-short

Proposed Method: Idea

• Analysis on workers [Welinder+, 2010]

“Notice how the annotators’ decision planes fall roughly into three clusters”

– Clustering workers is a reasonable idea

(The phrase and the picture are quoted from "The Multidimensional Wisdom of Crowds" by Welinder+, NIPS 2010)

Similarity between workers can be observed in real data

Page 14: 20130716 aaai13-short

Proposed Method: Formulation

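• Clustered Personal Classifier (CPC) Method

– The model-relation term finds and fuses similar workers

→ Cuts down the degrees of freedom

[Equation: CPC objective with its model-relation term; not recoverable from the extracted text. An annotation notes that the term forces $w_j = w_k$, and that μ controls the strength of clustering.]

[Equation: model-relation term of the PC method, shown for comparison (cf.); not recoverable from the extracted text.]

Fuse similar workers to cut down the degrees of freedom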
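To illustrate why a model-relation term can fuse workers, the sketch below contrasts two penalties. The first, a sum of squared distances to $w_0$, is an assumed form of the PC coupling. The second, a sum of unsquared pairwise distances scaled by μ, is the kind of term used in convex clustering; it is shown as an assumption about the general shape of such a term, not as the exact formulation on the slide. All function names and toy vectors are hypothetical.

```python
import math

def distance(u, v):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def squared_coupling(w0, workers):
    """Assumed PC-style term: sum_j ||w_j - w_0||^2 (smooth everywhere)."""
    return sum(distance(w, w0) ** 2 for w in workers)

def fusion_penalty(vectors, mu):
    """Assumed fusion term: mu * sum_{j<k} ||w_j - w_k|| (unsquared norms)."""
    total = 0.0
    for j in range(len(vectors)):
        for k in range(j + 1, len(vectors)):
            total += distance(vectors[j], vectors[k])
    return mu * total

# Two identical workers and one different worker (toy values).
workers = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
fused_cost = fusion_penalty(workers, mu=1.0)  # the identical pair adds nothing
```

Because an unsquared norm is not differentiable at zero, a penalty of this kind can make minimizers satisfy $w_j = w_k$ exactly, whereas a squared penalty only pulls vectors toward each other. Fused workers share one parameter vector, which is how the degrees of freedom go down.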

Page 16: 20130716 aaai13-short

Experiments on Synthetic Data

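• Synthetic data (J = I = 10; workers are spammers (random workers) or experts)

(Left) Dimension = 2: PC method = CPC method (comparable performance)

(Right) Dimension = 10: PC method < CPC method (the CPC method performs better)

Robust performance on a small data set

[Figure: two panels (left: dimension 2, right: dimension 10) comparing the proposed and existing methods; horizontal axis: percentage of spammers; an arrow marks the "better" direction.]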
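A synthetic setup of this kind could be generated along the following lines. This is a sketch under assumptions, not the protocol used in the talk: the slide only says that spammers are random workers, so the noiseless behavior of experts, the Gaussian features, the classifier `w_true`, and every name below are hypothetical.

```python
import random

def make_synthetic_labels(features, w_true, num_workers, spammer_ratio, rng):
    """Return (labels, is_spammer) for an assumed spammer/expert setup."""
    num_spammers = round(num_workers * spammer_ratio)
    is_spammer = [j < num_spammers for j in range(num_workers)]
    labels = {}
    for j in range(num_workers):
        for i, x in enumerate(features):
            if is_spammer[j]:
                labels[(i, j)] = rng.randint(0, 1)       # spammer: random answer
            else:
                score = sum(a * b for a, b in zip(w_true, x))
                labels[(i, j)] = 1 if score > 0 else 0   # assumed noiseless expert
    return labels, is_spammer

rng = random.Random(0)
dim, num_instances, num_workers = 2, 10, 10   # J = I = 10 as on the slide
features = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_instances)]
w_true = [1.0, -1.0]
labels, is_spammer = make_synthetic_labels(features, w_true, num_workers, 0.3, rng)
```

Sweeping `spammer_ratio` from 0 to 1 would give the kind of horizontal axis shown in the figure (percentage of spammers).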

Page 17: 20130716 aaai13-short

Experiments on Real Data: Performance

• Performance test on real data [Finin+, 10]

– NER task (decide whether each word is a named entity or not)

– Dimension = 161,901, number of instances = 17,747, number of workers = 42

The proposed method achieves the highest F-measure among the compared methods

Method | Precision | Recall | F-measure
CPC method (proposed) | 0.647 | 0.716 | 0.680
PC method (existing) | 0.637 | 0.721 | 0.677
LC method (existing) | 0.625 | 0.732 | 0.675
AOC method (existing) | 0.680 | 0.670 | 0.675
MV method (existing) | 0.686 | 0.651 | 0.668
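The F-measure column can be cross-checked against the precision and recall columns. The sketch below assumes the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the slide does not state explicitly; the function and variable names are illustrative.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# (precision, recall, reported F-measure) as listed on the slide.
results = {
    "CPC": (0.647, 0.716, 0.680),
    "PC": (0.637, 0.721, 0.677),
    "LC": (0.625, 0.732, 0.675),
    "AOC": (0.680, 0.670, 0.675),
    "MV": (0.686, 0.651, 0.668),
}
recomputed = {name: f_measure(p, r) for name, (p, r, _) in results.items()}
best = max(recomputed, key=recomputed.get)
```

Under that assumption, the recomputed values agree with the reported ones to within about 0.001, a gap that rounding of the inputs can explain, and the CPC method has the highest value.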

Page 18: 20130716 aaai13-short

Experiments on Real Data: Clustering

• Workers are clustered hierarchically by increasing μ

• An outlier worker can be detected without "honey pots"

The clustering result indicates the existence of an outlier worker

[Figure: hierarchical clustering of the workers; horizontal axis: strength of clustering (= μ), increasing to the right; an annotation reads "Precision: 0.454, Recall: 0.857".]
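One way to read clusters off a fitted model is to group workers whose estimated classifiers coincide. The sketch below is a hypothetical post-processing step, not something described in the talk: the function `group_fused`, the tolerance `tol`, and the toy estimates are all assumptions.

```python
import math

def group_fused(vectors, tol=1e-6):
    """Group indices whose vectors coincide within tol (greedy, first match wins)."""
    clusters = []  # each cluster is a list of worker indices
    for j, w in enumerate(vectors):
        for cluster in clusters:
            rep = vectors[cluster[0]]  # compare against the cluster's first member
            if math.sqrt(sum((a - b) ** 2 for a, b in zip(w, rep))) <= tol:
                cluster.append(j)
                break
        else:
            clusters.append([j])  # no existing cluster matched
    return clusters

# Hypothetical estimates for four workers at some fixed mu.
estimated = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [-0.4, 1.3]]
clusters = group_fused(estimated)
singletons = [c[0] for c in clusters if len(c) == 1]
```

Repeating this for a sequence of increasing μ values would trace how clusters merge. A worker who remains a singleton while the others have fused is only a candidate outlier and would still need to be inspected.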

Page 20: 20130716 aaai13-short

Conclusion

• Problem Setting

– Learning from redundant, variable-quality training data

• Problem of the PC Method

– The number of parameters is relatively large

– Estimation is unstable when each worker provides little data

• Proposed Method (CPC Method)

– Cuts the degrees of freedom by fusing similar workers

• Experimental Results

– More robust estimation on small data sets

– Valid as a method of “mining” workers

Introducing similarities between workers is beneficial
