Crowdsourcing is a significant source of data that needs to be analyzed and interpreted. How these tasks are carried out influences both the quality of the output and the efficiency of the process. Visualization has proven to be an effective way of dealing with large amounts of data. In this paper we propose a visual analytics model, built on the CrowdTruth framework and the CrowdTruth metrics, for optimizing the crowdsourcing process and improving its data quality. The requirements for the dynamic, scalable, and interactive visualizations were extracted from the literature and from interviews with users of the framework.
Visualizations for quality assessment of crowdsourced data
By Tatiana Cristea
Supervised by Lora Aroyo (VU) & Robert-Jan Sips (IBM)
Noisy crowdsourced data → quality data
Current practices: based on the consensus of workers
CrowdTruth metrics: consider disagreement informative
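The core idea of treating disagreement as a signal can be sketched with a toy metric: encode each worker's selections on a unit as a binary vector over the annotation options and measure how similar the workers' vectors are. This is a minimal illustration of the idea only, not the CrowdTruth framework's actual implementation; all names here are made up for the example.

```python
from itertools import combinations
import math

# Annotation options from the example task in the slides.
OPTIONS = ["Balloon", "Flower", "Human", "Car", "Ghost", "Person"]

def to_vector(answers):
    """Encode a worker's selected options as a binary vector over OPTIONS."""
    return [1 if opt in answers else 0 for opt in OPTIONS]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def unit_clarity(worker_answers):
    """Mean pairwise cosine similarity between worker vectors on one unit.
    High values: workers agree (clear unit); low values: disagreement,
    which is informative rather than mere noise."""
    vectors = [to_vector(a) for a in worker_answers.values()]
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Three workers annotate one image; worker3 deviates from the other two.
answers = {
    "worker1": {"Human", "Person"},
    "worker2": {"Human", "Person"},
    "worker3": {"Balloon", "Ghost"},
}
print(round(unit_clarity(answers), 2))  # → 0.33
```

A low clarity score on its own does not say *why* workers disagree; the next slides show that an unclear image, ambiguous options, or low-quality workers all produce disagreement, which is what the visualizations help tease apart.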
Select from the list the objects depicted in the image:
Can you identify the low-quality worker(s)?
Worker 1: Balloon / Flower / Human / Car / Ghost / Person
Worker 2: Balloon / Flower / Human / Car / Ghost / Person
Worker 3: Balloon / Flower / Human / Car / Ghost / Person
[Each worker's actual selections are shown as highlights in the original slide]
Unclear image (content unit)
Select from the list the objects depicted in the image:
Can you identify the low-quality worker(s)?
Worker 1: Balloon / Flower / Human / Car / Ghost / Person
Worker 2: Balloon / Flower / Human / Car / Ghost / Person
Worker 3: Balloon / Flower / Human / Car / Ghost / Person
[Each worker's actual selections are shown as highlights in the original slide]
Not clearly separable answers
Select from the list the objects depicted in the image:
Can you identify the low-quality workers?
Worker 1: Balloon / Flower / Human / Car / Ghost / Person
Worker 2: Balloon / Flower / Human / Car / Ghost / Person
Worker 3: Balloon / Flower / Human / Car / Ghost / Person
[Each worker's actual selections are shown as highlights in the original slide]
Low-quality workers
How good is the unit for the specific task?
How well did the worker understand the task?
Are the annotation options clear and separable?
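Questions like these can be approximated with simple disagreement-driven scores. Below is a minimal sketch (illustrative names and logic, not the CrowdTruth implementation) of a worker quality score: for each unit, compare the worker's answer vector against the summed vectors of all other workers, then average over units. A worker who consistently deviates from the crowd scores low.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def worker_quality(units, worker):
    """Average similarity between `worker`'s answer vector and the summed
    vectors of all other workers, across the units the worker annotated."""
    scores = []
    for vectors in units:  # vectors: {worker_id: binary vector over options}
        if worker not in vectors:
            continue
        others = [v for w, v in vectors.items() if w != worker]
        summed = [sum(col) for col in zip(*others)]
        scores.append(cosine(vectors[worker], summed))
    return sum(scores) / len(scores)

# Two units, three workers; worker3 consistently deviates from the rest.
units = [
    {"worker1": [0, 0, 1, 0, 0, 1],
     "worker2": [0, 0, 1, 0, 0, 1],
     "worker3": [1, 0, 0, 0, 1, 0]},
    {"worker1": [0, 1, 0, 0, 0, 0],
     "worker2": [0, 1, 0, 0, 0, 1],
     "worker3": [0, 0, 0, 1, 1, 0]},
]
for w in ["worker1", "worker2", "worker3"]:
    print(w, round(worker_quality(units, w), 2))
```

Note that a low score alone cannot distinguish a bad worker from an unclear unit; that is exactly why the deck argues for inspecting units, workers, and annotations together.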
[Diagram: the Unit-Worker-Annotation triangle, repeated for JOB 1, JOB 2, ..., JOB N; each job connects its own units, workers, and annotations]
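The Unit-Worker-Annotation structure that each job carries could be modeled, for illustration, with a few small classes. These names and fields are assumptions made for the sketch and do not mirror the CrowdTruth framework's actual data model.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Annotation:
    """One worker's answer on one content unit within a job."""
    worker_id: str
    unit_id: str
    value: frozenset  # the options the worker selected

@dataclass
class Job:
    """A crowdsourcing job: a bag of annotations linking units and workers."""
    job_id: str
    annotations: list = field(default_factory=list)

    def workers(self):
        return {a.worker_id for a in self.annotations}

    def units(self):
        return {a.unit_id for a in self.annotations}

job = Job("job1")
job.annotations.append(Annotation("worker1", "image1", frozenset({"Human", "Person"})))
job.annotations.append(Annotation("worker2", "image1", frozenset({"Human"})))
print(sorted(job.workers()), sorted(job.units()))
```

Keeping the three entity types explicit is what lets a visualization navigate from a suspicious unit to the workers who annotated it, and from a suspicious worker to every unit they touched.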
Visualization approach for quality assessment of crowdsourced data:
a) at aggregate level
b) at a specific level
c) and in the context of their interdependencies
Extracted through interviews:
Visualization of properties, statistics, and metrics of:
- a single job / unit / worker
- a collection of jobs / units / workers
Functional requirements:
- filtering and sorting
- support for detection of outliers
- visualization of connected workers, content units, and jobs
- support for comparative analysis
- support for navigation between connected elements, etc.
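The outlier-detection requirement can be served by very simple statistics once quality scores exist. As a hedged sketch (a plain z-score rule with made-up scores, not the method the framework uses), one could flag workers whose score falls far below the mean:

```python
import statistics

def flag_outliers(scores, z_threshold=1.5):
    """Flag workers whose quality score lies more than `z_threshold`
    population standard deviations below the mean. Illustrative rule only."""
    values = list(scores.values())
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # everyone scored the same; nothing stands out
    return sorted(w for w, s in scores.items() if (mean - s) / stdev > z_threshold)

# Hypothetical quality scores for five workers; w5 clearly lags behind.
scores = {"w1": 0.81, "w2": 0.78, "w3": 0.80, "w4": 0.79, "w5": 0.10}
print(flag_outliers(scores))  # → ['w5']
```

In a visualization, such flagged workers would be the natural entry points for the filtering, sorting, and navigation features listed above.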
DEMO TOUR
We evaluated the design with 9 participants with different levels of experience with crowdsourcing tasks.
Found useful in:
- the assessment of quality
- deep analysis of the data
But….
The amount of information was a (little) bit overwhelming…
The interactions are great!
… if you know about them
The time dimension is not always present…
Future work:
- Create user profiles
- Decouple the visualization component and provide it as a separate plugin
- Add the time dimension to the visualizations