Metadata Harvesting and Validation (Bram Vandeputte, K.U.Leuven)

Page 1: Metadata Harvesting And Validation

Metadata Harvesting and Validation

Bram Vandeputte
K.U.Leuven

Page 2: Metadata Harvesting And Validation

slideshare

• http://www.slideshare.net/bramvandeputte

Page 3: Metadata Harvesting And Validation

Overview

• Validation Service
• Online Validation Service
• OAI-PMH
• Harvesting Infrastructure

Page 4: Metadata Harvesting And Validation

Validation Service

• Interoperability: Application Profile (AP)
• Manual check: very time-consuming
• Need a tool for enforcing an AP => validation scheme
• A set of validation rules
• Reusable & extendable

Best practices derived from previous projects such as MELT and MACE.

Reusable: modular, with inheritance possible.

Page 5: Metadata Harvesting And Validation

Validation Service

• Components:
  • XML schema: structure
  • Schematron:
    • mandatory/conditional elements
    • empty fields
    • vocabularies (auto-generated)
    • ...
  • vCard component
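To make the "empty fields" check concrete, here is a minimal sketch in Python. It is an illustration only: the slides describe this check as a Schematron rule, while this version uses the standard library's `xml.etree.ElementTree`. The function name `find_empty_fields` and the sample record are assumptions, not part of the validation service.

```python
import xml.etree.ElementTree as ET


def find_empty_fields(xml_text: str) -> list[str]:
    """Return the paths of leaf elements that carry no text."""
    root = ET.fromstring(xml_text)
    empty = []

    def walk(element, path):
        children = list(element)
        # A leaf element with no (or whitespace-only) text counts as empty.
        if not children and not (element.text or "").strip():
            empty.append(path)
        for child in children:
            walk(child, f"{path}/{child.tag}")

    walk(root, root.tag)
    return empty


# Hypothetical, namespace-free record used only for illustration.
record = "<lom><general><title>Intro</title><language></language></general></lom>"
print(find_empty_fields(record))  # ['lom/general/language']
```

A real LOM record uses XML namespaces, so the reported tags would carry the namespace URI in braces.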

Page 6: Metadata Harvesting And Validation

Validation Service

• Terminology:
  • Validation Component: atomic block that performs one specific validation check
  • Validation Scheme: collection of components that ensures validity against a whole AP
  • Validation Scheme URI: unique identifier of a scheme, e.g. http://aspect-project.org/validation/ASPECTv1.0/core
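The relation between components, schemes, and scheme URIs can be sketched as a small data model. This is a hedged illustration, not the service's implementation: the class names, the dictionary-based record, the two check functions, and the `example.org` URI are all assumptions. Only the ASPECTv1.0/core URI comes from the slide.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ValidationComponent:
    """Atomic block: one specific check that returns error messages."""
    name: str
    check: Callable[[dict], list]


@dataclass
class ValidationScheme:
    """Collection of components, identified by a URI, optionally extending another scheme."""
    uri: str
    components: list = field(default_factory=list)
    extends: Optional["ValidationScheme"] = None

    def all_components(self):
        # Inherited components run first, then the scheme's own.
        inherited = self.extends.all_components() if self.extends else []
        return inherited + self.components

    def validate(self, record):
        messages = []
        for component in self.all_components():
            messages.extend(f"{component.name}: {m}" for m in component.check(record))
        return messages


def require_title(record):  # hypothetical check
    return [] if record.get("title") else ["title is mandatory"]


def no_empty_fields(record):  # hypothetical check
    return [f"{key} is empty" for key, value in record.items() if value == ""]


core = ValidationScheme(
    uri="http://aspect-project.org/validation/ASPECTv1.0/core",
    components=[ValidationComponent("mandatory", require_title)],
)
recommended = ValidationScheme(
    uri="http://example.org/validation/recommended",  # hypothetical URI
    components=[ValidationComponent("empty-fields", no_empty_fields)],
    extends=core,
)
print(recommended.validate({"title": "", "language": "en"}))
```

Because `recommended` extends `core`, a record validated against it is also checked by every core component, which is the reuse-through-inheritance idea from the earlier slide.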

Page 7: Metadata Harvesting And Validation

Validation Service

Page 8: Metadata Harvesting And Validation

Validation Service

[Figure: validation schemes and the validation components they use. Labels in the diagram: LOM loose, lomloose.xsd, vcard validator, empty attribute fields, ASPECTv1.0/core, vocabulary bank, core schematron rules, ASPECTv1.0/recommended, recommended schematron rules, ASPECT, IMS ILOX. Legend: validationScheme, validation component; arrows: uses, extends.]

Page 9: Metadata Harvesting And Validation

Validation Service

!

Page 10: Metadata Harvesting And Validation

Online Validation Service demo

Page 11: Metadata Harvesting And Validation
Page 12: Metadata Harvesting And Validation

Validation against the LRE AP: refer to the LRE AP document

Page 13: Metadata Harvesting And Validation

OAI-PMH

• Client-server model
• Pull mechanism
• Options:
  • selective harvesting (date and set)
  • incremental harvesting
• Metadata-agnostic

Page 14: Metadata Harvesting And Validation

OAI-PMH

• Verbs: Identify, ListRecords, GetRecord
• Parameters:
  • baseUrl
  • from & until date
  • metadataPrefix
  • sets
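The verbs and parameters above combine into a plain HTTP request. The following sketch builds a `ListRecords` URL with the standard query arguments `verb`, `metadataPrefix`, `from`, `until`, and `set`. The function name, the base URL, the `oai_lom` prefix, and the set name are placeholders chosen for illustration.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None,
                     until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL with optional selective-harvesting arguments."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical target: base URL, prefix, and set name are placeholders.
url = list_records_url("http://example.org/oai", "oai_lom",
                       from_date="2010-01-01", set_spec="physics")
print(url)
# http://example.org/oai?verb=ListRecords&metadataPrefix=oai_lom&from=2010-01-01&set=physics
```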

Page 15: Metadata Harvesting And Validation

Harvest Component

• Multiple targets
• Each target has separate properties (sets, date granularity, metadataPrefix, ...)
• Storing metadata (SPI, filesystem, APP, ...)
• Extra features:
  • incremental harvesting
  • harvest scheduling
  • metadata validation + reporting
  • OAI-PMH target validation
  • (user-friendly) GUI
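Per-target properties and incremental harvesting can be sketched together: each target remembers when it was last harvested, and the next request only asks for records changed since then. This is a sketch under assumptions, not the ARIADNE harvester's code. The class and method names are hypothetical; the two granularity strings are the ones defined by OAI-PMH.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class HarvestTarget:
    """One OAI-PMH target with its own properties (hypothetical model)."""
    base_url: str
    metadata_prefix: str
    set_spec: Optional[str] = None
    granularity: str = "YYYY-MM-DD"  # or "YYYY-MM-DDThh:mm:ssZ"
    last_harvest: Optional[datetime] = None

    def format_date(self, moment):
        if self.granularity == "YYYY-MM-DD":
            return moment.strftime("%Y-%m-%d")
        return moment.strftime("%Y-%m-%dT%H:%M:%SZ")

    def next_request(self, now):
        """Request parameters for the next harvest; 'from' is set only after a first run."""
        params = {"verb": "ListRecords", "metadataPrefix": self.metadata_prefix}
        if self.set_spec:
            params["set"] = self.set_spec
        if self.last_harvest is not None:
            params["from"] = self.format_date(self.last_harvest)
        params["until"] = self.format_date(now)
        return params

    def mark_harvested(self, now):
        self.last_harvest = now


target = HarvestTarget("http://example.org/oai", "oai_lom")  # placeholder values
first_run = datetime(2010, 3, 1, tzinfo=timezone.utc)
first = target.next_request(first_run)    # full harvest: no 'from'
target.mark_harvested(first_run)
second = target.next_request(datetime(2010, 3, 8, tzinfo=timezone.utc))
print(first)
print(second)
```

The first request harvests everything up to the run date; the second only covers the week since the previous run.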

Page 16: Metadata Harvesting And Validation

The Harvest component

[Figure, built up over several slides: the harvesting workflow in six numbered steps. Labels in the diagram: External Repository, OAI, OAI-PMH, LOM, ARIADNE Harvester, validation service, Validation Msg, harvester log, ASPECT Repository, SQI, SPI.]

Invalid records: discarded, or the identifier is recorded for the next harvest.
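The workflow in the diagram can be sketched as a single loop: each harvested record goes to the validation service, valid records are stored, and invalid ones are logged and either discarded or remembered for the next harvest. This is a minimal sketch. The function signature, the callbacks, and the in-memory stand-ins for the repository and the log are assumptions made for illustration.

```python
def harvest(records, validate, store, log, retry_ids, discard_invalid=False):
    """Validate each (identifier, metadata) pair and store only the valid ones.

    Invalid records are always logged; unless discard_invalid is set,
    their identifiers are kept in retry_ids for the next harvest.
    """
    stored = 0
    for identifier, metadata in records:
        messages = validate(metadata)
        if not messages:
            store(identifier, metadata)
            stored += 1
            continue
        log.append((identifier, messages))
        if not discard_invalid:
            retry_ids.add(identifier)
    return stored


# In-memory stand-ins for the repository, the log, and the validation service.
repository, harvester_log, retry = {}, [], set()


def validate(metadata):  # hypothetical check: a title is required
    return [] if metadata.get("title") else ["title is mandatory"]


records = [("oai:example:1", {"title": "Intro"}), ("oai:example:2", {})]
count = harvest(records, validate, repository.__setitem__, harvester_log, retry)
print(count, sorted(repository), harvester_log, retry)
```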

Page 29: Metadata Harvesting And Validation

Harvester screenshot or live demo

Page 30: Metadata Harvesting And Validation
Page 31: Metadata Harvesting And Validation
Page 32: Metadata Harvesting And Validation
Page 33: Metadata Harvesting And Validation

Validation Reports

• After harvesting, a report is generated and put online
• The report has 4 "levels":
  • full log (incl. metadata)
  • reporting log
  • grouped errors
  • error summary
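The two most condensed report levels, grouped errors and the error summary, can be derived from the per-record log. The sketch below assumes a log of `(identifier, messages)` pairs; that format, the function names, and the sample entries are illustrative assumptions and not the harvester's actual report format.

```python
from collections import Counter, defaultdict


def group_errors(log):
    """Map each error message to the identifiers of the records that triggered it."""
    grouped = defaultdict(list)
    for identifier, messages in log:
        for message in messages:
            grouped[message].append(identifier)
    return dict(grouped)


def error_summary(log):
    """Count how often each error message occurs across the whole harvest."""
    return Counter(message for _, messages in log for message in messages)


# Hypothetical log entries: (record identifier, validation messages).
report_log = [
    ("oai:example:2", ["title is mandatory"]),
    ("oai:example:5", ["title is mandatory", "language is empty"]),
]
print(group_errors(report_log))
print(error_summary(report_log))
```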

Page 34: Metadata Harvesting And Validation

• Questions?