By Tatiana Cristea Supervised by Lora Aroyo (VU) & Robert-Jan Sips (IBM)

Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data

DESCRIPTION

Crowdsourcing is a significant source of data that needs to be analyzed and interpreted. These analysis tasks influence the quality of the output as well as the efficiency of the process. Visualization has proved to be an effective way of dealing with large amounts of data. In this paper we propose a visualization analytics model in the context of the CrowdTruth framework and the CrowdTruth metrics for optimizing the crowdsourcing process and improving its data quality. The requirements for the dynamic, scalable and interactive visualizations were derived from the literature and from interviews with users of the framework.

Page 1: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data

Page 2:

Visualizations for quality assessment of crowdsourced data

Noisy crowdsourced data

Quality data

Page 3:

Current practices: based on the consensus of workers

CrowdTruth metrics: consider disagreement informative

Page 4:

Select from the list the objects depicted in the image:

Balloon Flower Human Car Ghost Person

Can you identify the low quality worker(s)?

Balloon Flower Human Car Ghost Person

Balloon Flower Human Car Ghost Person

Worker 1 Worker 2 Worker 3

Unclear image (content unit)

Page 5:

Select from the list the objects depicted in the image:

Can you identify the low quality worker(s)?

Balloon Flower Human Car Ghost Person

Worker 1 Balloon Flower Human Car Ghost Person

Worker 2 Balloon Flower Human Car Ghost Person

Worker 3

Not clearly separable answers

Page 6:

Select from the list the objects depicted in the image:

Can you identify the low quality workers?

Balloon Flower Human Car Ghost Person

Worker 1 Balloon Flower Human Car Ghost Person

Worker 2 Balloon Flower Human Car Ghost Person

Worker 3

Low-quality workers
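The idea behind slides 4 to 6, spotting workers whose answers disagree with everyone else's, can be sketched in a few lines. This is only an illustration, not the CrowdTruth implementation: the selections below are hypothetical (the checked boxes did not survive in the slide text), and the score used here, the cosine similarity between one worker's answer vector and the summed vectors of the other workers on the same unit, is an assumed simplification.

```python
import math

# Annotation options shown on the slide.
OPTIONS = ["Balloon", "Flower", "Human", "Car", "Ghost", "Person"]


def to_vector(selected):
    """Encode a set of selected options as a 0/1 vector over OPTIONS."""
    return [1 if option in selected else 0 for option in OPTIONS]


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is empty or all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def worker_agreement(answers):
    """Score each worker against the summed answers of all other workers."""
    vectors = {worker: to_vector(sel) for worker, sel in answers.items()}
    scores = {}
    for worker, vec in vectors.items():
        rest = [v for w, v in vectors.items() if w != worker]
        others = [sum(column) for column in zip(*rest)]
        scores[worker] = cosine(vec, others)
    return scores


# Hypothetical selections for one image (assumption, not taken from the slides).
answers = {
    "Worker 1": {"Balloon", "Human"},
    "Worker 2": {"Balloon", "Person"},
    "Worker 3": {"Car", "Ghost"},
}
scores = worker_agreement(answers)
lowest = min(scores, key=scores.get)
```

A low score on a single unit is weak evidence on its own: as slides 4 and 5 point out, the same disagreement can come from an unclear image or from options that are not clearly separable, so such a score would need to be read together with unit and annotation quality.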

Page 7:

How good is the unit for the specific task?

How well did the worker understand the task?

Are the annotation options clear and separable?

Unit, Annotation, Worker
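The first question, how good a unit is for the task, can be illustrated with a small sketch. It assumes a simplified reading of disagreement-based scoring: all workers' selections on a unit are summed into annotation counts, each annotation is scored by its cosine with that count vector, and the unit's clarity is the highest such score. The function names and the sample judgments are hypothetical and are not the framework's API.

```python
import math
from collections import Counter


def unit_vector(judgments):
    """Sum all workers' selections on one unit into per-annotation counts."""
    counts = Counter()
    for selected in judgments.values():
        counts.update(selected)
    return counts


def unit_annotation_score(counts, annotation):
    """Cosine between the unit's count vector and a single annotation."""
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return counts[annotation] / norm if norm else 0.0


def unit_clarity(counts):
    """Highest unit-annotation score; low values suggest an ambiguous unit."""
    return max((unit_annotation_score(counts, a) for a in counts), default=0.0)


# Hypothetical judgments (assumption): workers mostly agree on the first unit
# and pick three different options on the second.
clear_unit = {"w1": {"Balloon"}, "w2": {"Balloon"}, "w3": {"Balloon", "Person"}}
unclear_unit = {"w1": {"Balloon"}, "w2": {"Ghost"}, "w3": {"Flower"}}

clear_score = unit_clarity(unit_vector(clear_unit))
unclear_score = unit_clarity(unit_vector(unclear_unit))
```

Under these assumptions the unit with agreeing workers scores close to 1, while the unit where every worker chose a different option scores much lower, which is the kind of signal a unit-level visualization could surface.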

Page 8:

[Diagram: Unit, Annotation and Worker elements shown for each job (JOB 1, JOB 2, ..., JOB N) and grouped across jobs into units, workers and annotations]

Page 9:

Visualization approach for quality assessment of crowdsourced data:

a) at aggregate level

b) at a specific level

c) and in the context of their interdependencies

Page 10:
Page 11:

Extracted through interviews

Visualization of properties, statistics and metrics of:

a single job/unit/worker

a collection of jobs/units/workers

Functional requirements:

Filtering and sorting

Support for detection of outliers

Visualization of connected workers, content units and jobs

Support for comparative analysis

Support for navigation between connected elements, etc.
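The filtering, sorting and outlier-detection requirements can be sketched on the data side as follows. The rule used here, flagging scores more than k population standard deviations below the mean, is an assumption chosen for illustration, as are the function name, the default k and the sample scores; the slides do not state which rule the visualizations apply.

```python
from statistics import mean, pstdev


def flag_outliers(scores, k=1.0):
    """Return (name, score) pairs below mean - k * stdev, lowest score first."""
    if not scores:
        return []
    values = list(scores.values())
    threshold = mean(values) - k * pstdev(values)
    # Sort ascending so the most suspicious entries come first.
    ranked = sorted(scores.items(), key=lambda item: item[1])
    # Filter down to the entries under the threshold.
    return [(name, score) for name, score in ranked if score < threshold]


# Hypothetical worker quality scores (assumption, not results from the study).
worker_scores = {"w1": 0.82, "w2": 0.79, "w3": 0.85, "w4": 0.21, "w5": 0.80}
flagged = flag_outliers(worker_scores)
```

The same function could be applied to unit or job scores, since it only expects a mapping from a name to a number.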

Page 12:

DEMO TOUR

Page 13:

We evaluated the design with 9 people

Different levels of experience with crowdsourcing tasks

Page 14:

useful in:

the assessment of quality

deep analysis of the data

But…

Page 15:

The amount of information was a (little) bit overwhelming…

Page 16:

The interactions are great!

… if you know about them

Page 17:

The time dimension is not always present…

Page 18:

Create user profiles

Decouple the visualization component and provide it as a separate plugin

Add the time dimension to the visualizations
