A Hero’s Journey Through Metadata Quality
Nikos Palavitsinis
Computer Technology Institute & Press “Diophantus”
EdReNe Seminar, April 24-25, 2017
The 12 Stages of The Hero's Journey
◦ The Twelve Stage Hero's Journey is a popular story structure derived from the Monomyth in Joseph Campbell's book The Hero With A Thousand Faces and adapted by Christopher Vogler
1. The Ordinary World
2. The Call to Adventure
3. Refusal of the Call
4. Meeting with the Mentor
5. Crossing the Threshold
6. Tests, Allies & Enemies
7. Approach
8. The Ordeal
9. The Reward
10. The Road Back
11. The Resurrection
12. Return with the Elixir
Metadata sit at the center of every repository project
Their quality heavily impacts the services offered on top of the content
Numerous projects and initiatives offer access to resources
◦ BUT in some cases, the quality of metadata cannot fully support the interesting things we want users to do with the content
Some kind of polarity that causes stress
“Something shakes up the situation so the hero must face the beginning of a change”
No.  Mandatory Elements                      Completeness
1    General / Title                         99.8%
2    …/ Language                             93.9%
3    …/ Description                          94.8%
4    Rights / Cost                           15.7%
5    …/ Copyright And Other Restrictions     16.0%
Date: September 2009
Annotated Objects: 6.653 objects (≈60% of total)
No.  Recommended Elements                    Completeness
6    General / Keyword                       12.8%
7    LifeCycle / Contribute / Role           11.5%
8    Educational / Intended End User Role    12.8%
9    …/ Context                              10.2%
10   …/ Typical Age Range                    3.8%
11   Rights / Description                    7.7%
12   Classification                          11.8%
No.  Optional Elements                       Completeness
13   General / Coverage                      0.2%
14   …/ Structure                            7.9%
15   LifeCycle / Status                      0.3%
16   Educational / Interactivity Type        0.3%
17   …/ Interactivity Level                  0.3%
18   …/ Semantic Density                     0.2%
19   …/ Difficulty                           0.1%
20   …/ Typical Learning Time                0%
21   …/ Language                             0.3%
22   …/ Description                          1.5%
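The percentages in the tables above can be read as element completeness: the share of annotated records in which an element holds a value. The slides do not show how the figures were produced, so the following is only a minimal sketch of one common way to compute them. It assumes records are available as plain dictionaries keyed by element name; the function name `element_completeness` and the sample records are hypothetical.

```python
from collections import Counter


def element_completeness(records, elements):
    """Return {element: percentage of records with a non-empty value}."""
    filled = Counter()
    for record in records:
        for element in elements:
            value = record.get(element)
            # Assumption: None, empty strings and empty lists/tuples
            # all count as "not filled in".
            if value not in (None, "", [], ()):
                filled[element] += 1
    total = len(records)
    return {e: (100.0 * filled[e] / total if total else 0.0) for e in elements}


# Hypothetical sample records, not taken from the actual repository.
records = [
    {"General/Title": "Composting basics", "General/Keyword": ["compost"]},
    {"General/Title": "Soil health", "General/Keyword": []},
    {"General/Title": "", "Rights/Cost": "no"},
    {"General/Title": "Crop rotation"},
]
elements = ["General/Title", "General/Keyword", "Rights/Cost"]
completeness = element_completeness(records, elements)

for name, pct in completeness.items():
    print(f"{name}: {pct:.1f}%")
```

Counting only non-empty values keeps placeholder entries from inflating the figures; a stricter check (for example, stripping whitespace) could be added under the same structure.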
Not an option in EU-funded projects, months away from the 1st review meeting
“The hero comes across a seasoned traveler of the worlds who gives him or her training, equipment, or advice that will help on the journey”
Nikos Manouselis
Erik Duval (†)
Initiative: Organic.Edunet
Topics: Organic Agriculture, Agroecology, Environmental Education
Resources: >10.000
Content Providers: 8
Languages: 10
Content Type: Mainly image & text with some videos
“Hero commits to leaving the Ordinary World and entering a new region or condition with unfamiliar rules and values”
◦ Lack of completeness,
◦ Redundant metadata,
◦ Lack of clarity,
◦ Incorrect use of elements,
◦ Structural inconsistency,
◦ Inaccurate representation
“Hero is tested and sorts out allegiances in the Special World”
Colleagues
Experts
Forms
Processes
Methods
“Hero and newfound allies prepare for the major challenge in the Special world”
MQACP
Metadata Quality Assurance Certification Process
Base Metadata Schema
Metadata Understanding Questionnaire
Metadata Quality Assessment Grid
Excel Completeness Assessment Tool
Good & Bad Metadata Practices Guide
“Hero confronts death or faces his or her greatest fear. Out of the moment of death comes a new life”
Metadata Understanding Session
◦ Workshop format
◦ Printed form, application profile
◦ Easiness, usefulness and obligation
◦ Questionnaire, Likert Scale
Hands-on exercise on metadata annotation
◦ Sample of resources
◦ Annotated on paper
◦ Support from Metadata Expert
Metadata Design Phase
Hands-on annotation experiment
◦ Domain experts go digital
◦ Testing both metadata and system
Review of metadata records
◦ Metadata experts test the AP
◦ Refining the definitions of elements wherever needed, judging from real usage data
◦ Core metadata quality criteria
Testing Phase
Peer-review experiment
◦ Large scale
◦ Representative sample
◦ Everyone reviews everyone
◦ Metadata Peer Review Grid
◦ Online process
◦ Accuracy; Completeness; Consistency; Objectiveness; Appropriateness; Correctness; Overall
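The peer-review grid lends itself to a simple per-criterion aggregation. The slides do not state the scoring scale or how reviews were combined, so the sketch below rests on assumptions: each review is a dictionary of integer scores on a 1 to 5 scale, and a criterion's result is the mean score expressed as a percentage of the maximum. `MAX_SCORE`, `summarize_reviews` and the sample reviews are hypothetical.

```python
from statistics import mean

CRITERIA = [
    "Completeness", "Accuracy", "Consistency", "Objectiveness",
    "Appropriateness", "Correctness", "Overall",
]
MAX_SCORE = 5  # assumed scale; the slides do not specify it


def summarize_reviews(reviews):
    """Mean score per criterion, as a percentage of MAX_SCORE."""
    summary = {}
    for criterion in CRITERIA:
        # Skip reviews that left this criterion blank.
        scores = [r[criterion] for r in reviews if criterion in r]
        if scores:
            summary[criterion] = round(100 * mean(scores) / MAX_SCORE, 1)
    return summary


# Two hypothetical reviews of the same metadata record.
reviews = [
    {"Completeness": 5, "Accuracy": 4, "Consistency": 4, "Objectiveness": 5,
     "Appropriateness": 3, "Correctness": 5, "Overall": 4},
    {"Completeness": 4, "Accuracy": 4, "Consistency": 3, "Objectiveness": 4,
     "Appropriateness": 4, "Correctness": 4, "Overall": 4},
]
summary = summarize_reviews(reviews)
print(summary)
```

With everyone reviewing everyone, each record collects several such reviews, so averaging per criterion dampens the effect of a single harsh or lenient reviewer.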
Calibration Phase
Analysis of usage data from tools
◦ Mainly completeness here
Metadata Quality Certification concept
◦ Concept of validation as a quality mark
Critical Mass Phase
Regular Completeness Analysis
◦ Monthly
◦ Automated
Online Peer-Review of records
◦ Fixed
Informal Mechanisms
◦ Prizes
◦ Awards
◦ Social
Regular Operation Phase
“Hero takes possession of the treasure won by facing death. There is celebration, but also danger of losing the treasure again”
Completeness measurement:
◦ Education case: Significant improvement in 17 of the 22 LOM elements used
◦ Improvement ranging from 37% to 87% (average: 61%)
Perception of metadata elements predicted, in most cases, their future use
Peer Review of 6 metadata quality metrics
◦ Completeness, Accuracy, Consistency, Objectiveness, Appropriateness, Correctness
◦ Results kept improving through the MQACP application
Case      Completeness  Accuracy  Consistency  Objectiveness  Appropriateness  Correctness  Overall score
Learning  85%           83%       78%          90%            74%              90%          77%
“Hero leaves the Special World to make sure the treasure is brought home”
The MQACP was applied continuously throughout the duration of the Organic.Edunet case
Also, it was validated in two cases of other types of repositories
Initiative: Natural Europe
Topics: Minerals, Fossils, Plants, etc.
Resources: >15.000
Content Providers: 6
Languages: 10
Content Type: Mainly image & text with a significant number of videos
Initiative: VOA3R
Topics: Agriculture, Veterinary, Horticulture, Aquaculture, Viticulture, etc.
Resources: >71.316 (2.500.000)
Content Providers: 9
Languages: 9
Content Type: Mainly text with some images
“At the climax, the hero is severely tested once more on the threshold of home. Hero is purified by a last sacrifice, another moment of death and rebirth, but on a higher and more complete level”
◦ Cultural case: Significant improvement in 9 elements, from 26% to 65% (average: 47%)
◦ Research case: Significant improvement in 3 DC elements, from 23% to 56% (average: 44%)
Case      Completeness  Accuracy  Consistency  Objectiveness  Appropriateness  Correctness  Overall score
Culture   +9%           +19%      +8%          +11%           +24%             +3%          +18%
Research  +5%           +8%       +4%          +13%           +47%             +8%          +20%
“The hero returns home or continues the journey, bearing some element of the treasure that has the power to transform the world”
Metadata Understanding Session
◦ Different domains have different metadata needs
Mandatory: Learning 19%, Research 34%, Culture 72%
Recommended: Learning 30%, Research 17%, Culture 7%
Optional: Learning 51%, Research 49%, Culture 21%
MQACP
1. MQACP had a positive effect on the resulting quality of the metadata records
2. The MQACP was transferable with minor adjustments from case to case
3. MQACP produced comparable results in terms of completeness
Case      Average  Median  Mandatory  Recommended  Optional  Elements
Learning  67%      79.1%   96.4%      81.5%        42.3%     22
Cultural  74.1%    76.7%   99.4%      80.2%        56.7%     26
Research  85.2%    96.3%   95.4%      80.4%        91.2%     15
4. The cost involved in MQACP is comparable in terms of magnitude for different repositories
Case      People  Methods  Resources  Hours  Hours per person
Learning  49      5        11.000     134.7  2.8
Cultural  45      8        15.000     166.5  3.7
Research  58      6        75.000     154.8  2.7
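The "hours per person" column appears to be total hours divided by the number of people involved. As a quick check under that assumption, the arithmetic can be reproduced as follows (the variable names are illustrative only):

```python
# (people, total hours) per case, copied from the cost table.
cases = {
    "Learning": (49, 134.7),
    "Cultural": (45, 166.5),
    "Research": (58, 154.8),
}

# Assumed definition: hours per person = total hours / people.
hours_per_person = {
    name: hours / people for name, (people, hours) in cases.items()
}

for name, value in hours_per_person.items():
    print(f"{name}: {value:.2f} hours per person")
```

The recomputed values agree with the table to within about 0.05 hours, so any difference looks like rounding.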
5. The improvement in metadata record completeness is comparable with the costs
Case      Average improvement for elements  Elements in AP  Hours  % of average improvement per hour spent
Learning  +55.7%                            22              134.7  0.43%
Cultural  +28.8%                            26              166.5  0.17%
Research  +23.7%                            15              154.8  0.15%
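The last column reads as the average completeness improvement divided by the hours spent. A short sketch under that assumption, with illustrative variable names:

```python
# (average improvement in percentage points, total hours) per case,
# copied from the table above.
cases = {
    "Learning": (55.7, 134.7),
    "Cultural": (28.8, 166.5),
    "Research": (23.7, 154.8),
}

# Assumed definition: improvement per hour = average improvement / hours.
improvement_per_hour = {
    name: improvement / hours for name, (improvement, hours) in cases.items()
}

for name, value in improvement_per_hour.items():
    print(f"{name}: {value:.2f}% per hour")
```

Under this definition the Cultural and Research cases reproduce the table's figures, while the Learning case comes out at roughly 0.41% rather than 0.43%, so the original figure may have been derived from slightly different inputs.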
Quality Seals for Metadata
Metadata Quality in Really Large Collections
◦ The Europeana case - Peter Kiraly
◦ Automated Metadata Quality Assessment
◦ GEANT
DLF-AIG Metadata Working Group
◦ Guidelines, best practices, tools & workflows around the evaluation and assessment of metadata used by and for digital libraries and repositories
Crowdsourcing Metadata Generation
◦ Gamification
Supervisors
◦ Salvador Sanchez-Alonso; Nikos Manouselis
Information Engineering Research Unit Personnel
◦ Miguel Refusta; Alberto Abian and wider IERU team
Agro-Know colleagues
◦ Effie Tsiflidou; Charalampos Thanopoulos; Vassilis Protonotarios and wider AK team
Colleagues
◦ Hannes Ebner, Charalampos Karagiannidis, Jad Najjar
Family & Friends