Metadata Harvesting and Validation
Bram Vandeputte, K.U.Leuven
slideshare
• http://www.slideshare.net/bramvandeputte
Overview
• Validation Service
• Online Validation Service
• OAI-PMH
• Harvesting Infrastructure
Validation Service
• Interoperability: Application Profile (AP)
• Manual check: very time consuming
• Need a tool for enforcing an AP => validation scheme
  • A set of validation rules
  • Reusable & extendable
Validation Service
• Best practices derived from previous projects such as MELT and MACE
• Reusable: modular + inheritance possible
• Components:
  • XML schema: structure
  • Schematron:
    • mandatory/conditional elements
    • empty fields
    • vocabularies (auto generated)
    • ...
  • vCard component
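As a rough illustration of the kinds of checks listed above, the following Python sketch runs three rule types (mandatory element, empty field, vocabulary) over a small XML record. The service itself relies on XML Schema and Schematron; plain ElementTree stands in for them here, and the element names, the vocabulary, and all function names are assumptions made for the example, not the ASPECT rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; element names are illustrative, not the real LOM binding.
RECORD = """
<record>
  <title>Photosynthesis</title>
  <language></language>
  <type>video</type>
</record>
"""

MANDATORY = ["title", "language", "identifier"]    # assumed mandatory elements
VOCABULARY = {"type": {"text", "image", "audio"}}  # assumed controlled vocabulary


def check_mandatory(root):
    """Report mandatory elements that are absent from the record."""
    return [f"missing element: {name}" for name in MANDATORY if root.find(name) is None]


def check_empty(root):
    """Report elements that are present but carry no text and no children."""
    return [
        f"empty element: {el.tag}"
        for el in root.iter()
        if len(el) == 0 and not (el.text or "").strip()
    ]


def check_vocabulary(root):
    """Report values that fall outside the controlled vocabulary."""
    errors = []
    for name, allowed in VOCABULARY.items():
        for el in root.findall(name):
            value = (el.text or "").strip()
            if value and value not in allowed:
                errors.append(f"value '{value}' not in vocabulary for {name}")
    return errors


def validate(xml_text):
    """Run all three checks and return the combined list of messages."""
    root = ET.fromstring(xml_text)
    return check_mandatory(root) + check_empty(root) + check_vocabulary(root)
```

Calling `validate(RECORD)` reports one problem per rule type: the absent `identifier`, the empty `language`, and the out-of-vocabulary `type` value.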
Validation Service
• Terminology:
  • Validation Component
  • Validation Scheme
  • Validation Scheme URI: http://aspect-project.org/validation/ASPECTv1.0/core
Validation Service
• Component: atomic block that performs one specific validation check
• Scheme: collection of components that ensures validity against a whole AP
• URI: unique identifier of a scheme
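One way to picture the three terms is as a small data model: components are atomic checks, a scheme collects components under a URI, and a scheme may extend another. The sketch below is an assumption about how such a model could look, not the service's actual implementation. The class names, the toy checks, the full URI of the recommended scheme, and the idea that it extends the core scheme are all illustrative.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A check takes the record text and returns a list of error messages.
Check = Callable[[str], List[str]]


@dataclass
class ValidationComponent:
    """Atomic block that performs one specific validation check."""
    name: str
    check: Check


@dataclass
class ValidationScheme:
    """Collection of components identified by a URI; may extend another scheme."""
    uri: str
    components: List[ValidationComponent] = field(default_factory=list)
    extends: Optional["ValidationScheme"] = None

    def all_components(self) -> List[ValidationComponent]:
        # Inherited components come first, then the scheme's own.
        inherited = self.extends.all_components() if self.extends else []
        return inherited + self.components

    def validate(self, record: str) -> List[str]:
        errors = []
        for component in self.all_components():
            errors.extend(f"{component.name}: {msg}" for msg in component.check(record))
        return errors


# Toy checks, for illustration only.
not_empty = ValidationComponent("not-empty", lambda r: [] if r.strip() else ["record is empty"])
has_title = ValidationComponent("has-title", lambda r: [] if "<title>" in r else ["no title"])

core = ValidationScheme("http://aspect-project.org/validation/ASPECTv1.0/core", [not_empty])
recommended = ValidationScheme(
    "http://aspect-project.org/validation/ASPECTv1.0/recommended",  # assumed URI
    [has_title],
    extends=core,  # assumed relationship
)
```

Validating against `recommended` runs the inherited `not-empty` check first and then `has-title`, which is the reuse-through-inheritance idea in miniature.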
Validation Service
Diagram: validation schemes and the validation components they build on.
• Labels: LOM loose, lomloose.xsd, vcard validator, empty attribute fields, vocabulary bank, core schematron rules, recommended schematron rules, ASPECTv1.0/core, ASPECTv1.0/recommended, ASPECT, IMS ILOX
• Legend: uses, extends, validationScheme, validation component
Online Validation Service demo
Validation against the LRE AP (refer to the LRE AP document)
OAI-PMH
• Client-server model
• Pull mechanism
• Options:
  • selective harvesting (date and set)
  • incremental harvesting
• Metadata-agnostic
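Incremental harvesting can be sketched as bookkeeping around the date bounds of selective harvesting: the upper bound of one successful run becomes the lower bound of the next. The function names and the in-memory `state` dictionary below are assumptions for illustration. A real harvester would persist this state, and `now` is assumed to be a UTC time.

```python
from datetime import datetime, timezone


def next_harvest_window(state, target, now):
    """Return the from/until bounds for the next incremental harvest of a target.

    state maps a target name to the 'until' bound of its last successful run.
    now is assumed to be expressed in UTC.
    """
    until = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    # The first harvest of a target has no lower bound, so everything is pulled.
    return {"from": state.get(target), "until": until}


def record_success(state, target, window):
    """After a successful run, the upper bound becomes the next lower bound."""
    state[target] = window["until"]
```

The state is only advanced after a run succeeds, so a failed harvest is simply repeated over the same window.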
OAI-PMH
• Verbs: Identify, ListRecords, GetRecord
• Parameters:
  • baseUrl
  • from & until date
  • metadataPrefix
  • sets
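The parameters above map directly onto the query string of an OAI-PMH request. The sketch below builds a `ListRecords` URL using the protocol's argument names (`verb`, `metadataPrefix`, `from`, `until`, `set`). The function name, the example base URL, and the `oai_lom` prefix are assumptions for illustration.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL; optional arguments are omitted when unset."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical target: selective harvesting by date and set.
url = list_records_url("http://example.org/oai", "oai_lom", from_date="2010-01-01", set_spec="physics")
```

Leaving out `from_date`, `until_date`, and `set_spec` yields a full harvest of the target.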
Harvest Component
• Multiple targets
• Each target has its own properties (sets, date granularity, metadataPrefix, ...)
• Storing metadata (SPI, filesystem, APP, ...)
• Extra features:
  • Incremental harvesting
  • Harvesting scheduling
  • Metadata validation + reporting
  • OAI-PMH target validation
  • (User-friendly) GUI
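Per-target properties could be modelled as a small configuration record. The sketch below is hypothetical: the class and field names are assumptions, and only the two granularity strings come from the OAI-PMH protocol itself. It shows why date granularity has to be stored per target: the same moment must be formatted differently depending on what the target supports.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class HarvestTarget:
    """Settings for one OAI-PMH target (illustrative field names)."""
    base_url: str
    metadata_prefix: str
    sets: List[str] = field(default_factory=list)
    granularity: str = "YYYY-MM-DD"  # or "YYYY-MM-DDThh:mm:ssZ"

    def format_date(self, moment: datetime) -> str:
        """Format a from/until bound at the granularity this target supports."""
        if self.granularity == "YYYY-MM-DD":
            return moment.strftime("%Y-%m-%d")
        return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


# Two hypothetical targets with different properties.
day_target = HarvestTarget("http://example.org/oai", "oai_lom")
second_target = HarvestTarget(
    "http://example.net/oai", "oai_dc", sets=["physics"], granularity="YYYY-MM-DDThh:mm:ssZ"
)
```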
The Harvest component
Diagram: External Repository, OAI-PMH, LOM records, ARIADNE Harvester, validation service, Validation Msg, harvester log, SPI, SQI, OAI, ASPECT Repository.
• Invalid records: discarded, or identifier recorded for next harvesting
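The handling of invalid records described for this diagram (discard, or record the identifier for the next harvest) can be sketched as a single loop. Everything here is an assumption made for illustration: the function name, the `(identifier, metadata)` pairs, and the `validate` and `store` callables standing in for the validation service and the repository publishing step.

```python
def process_harvest(records, validate, store, retry_ids, discard_invalid=False):
    """Validate each harvested record; store valid ones, handle invalid ones per policy.

    records: iterable of (identifier, metadata) pairs
    validate: callable returning a list of error messages (empty means valid)
    store: callable that publishes a valid record to the target repository
    retry_ids: set collecting identifiers to request again on the next harvest
    """
    log = []
    for identifier, metadata in records:
        errors = validate(metadata)
        if not errors:
            store(identifier, metadata)
            retry_ids.discard(identifier)  # a previously invalid record is now fine
            log.append((identifier, "stored", []))
        elif discard_invalid:
            log.append((identifier, "discarded", errors))
        else:
            retry_ids.add(identifier)
            log.append((identifier, "retry", errors))
    return log
```

The returned log keeps the validation messages next to each identifier, which is the raw material for a harvester log and for later reporting.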
Diagram: the harvesting flow (LOM, OAI-PMH, SQI, SPI, Validation Msg) with steps numbered 1 to 6.
Harvester screenshot or live demo
Validation Reports
• After harvesting, a report is generated and put online
• The report has 4 “levels”:
  • full log (incl. metadata)
  • reporting log
  • grouped errors
  • error summary
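The two most condensed report levels can be derived from a per-record log. The sketch below assumes a log of `(identifier, errors)` pairs, which is an invented format, and shows one plausible reading of "grouped errors" (records listed per message) and "error summary" (a count per message). The function names are illustrative.

```python
from collections import Counter, defaultdict


def grouped_errors(log):
    """Group record identifiers by error message."""
    groups = defaultdict(list)
    for identifier, errors in log:
        for message in errors:
            groups[message].append(identifier)
    return dict(groups)


def error_summary(log):
    """Count how many records show each error message."""
    return Counter(message for _, errors in log for message in set(errors))


# Hypothetical harvest log: record identifier plus its validation messages.
sample_log = [
    ("a", ["missing title"]),
    ("b", ["missing title", "empty language"]),
    ("c", []),
]
```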
Questions?
References
• SPI: http://ariadne.cs.kuleuven.be/lomi/index.php/SimplePublishingInterface
• IEEE LOM: http://ltsc.ieee.org/wg12/
• OAI-PMH: http://www.openarchives.org/