Europeana Newspaper metadata LIBER2013

Europeana Newspapers

Munich Workshop

WP5 Metadata – Structural Metadata

Munich, 26th June 2013

Günter Mühlberger, Innsbruck University

WP5 leader

This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the Competitiveness and Innovation Framework Programme by the European Community http://ec.europa.eu/ict_psp

Problem statement

• Europeana Newspapers
  • 15 libraries from several European countries
  • 10 million newspaper pages for refinement (OCR, OLR)
  • Need to be delivered to Europeana

• Approach
  • Currently no standard format available
  • Unify the delivery format
  • Create a METS/ALTO Profile
  • Create tools in order to ease creation of ENMAP objects

ENMAP

• Implementation
  • More than 3 million pages already processed
  • Workflow is fully scalable: up to 100,000 pages can be processed per day (OCR and ENMAP creation)

• Public release
  • ENMAP (Europeana Newspaper METS ALTO Profile) available to the public
  • Planned for October 2013
  • Accompanying information
  • Examples
  • Feedback is highly welcome
  • Final release is planned for 2014

Structural Metadata

• Structural elements
  • Title section, headline, advertisement, illustration, caption, running title (column title), page number, continuation note, imprint, etc.

• Text types (genres)
  • Breaking news, short news, book review, theatre review, obituary, family notice, job announcement, weather forecast, novel, poem, ...

Rationale

• Why do we need these data?
  • Increase granularity and information
  • Improve search services (faceted search)
  • Support crowd-based services (apply these metadata)
  • Instruct service providers

• Other standards in the field?
  • TEI (Text Encoding Initiative) provides a first starting point, but its objectives are different (edition vs. library use)
  • Best-practice models of other libraries
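To illustrate how structural metadata could support faceted search, here is a minimal Python sketch. It is not part of ENMAP: the `Block` record, the helper functions and the sample data are hypothetical, and the labels are simply borrowed from the lists of structural elements and text types given earlier in the deck.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


# Hypothetical record for one labelled region of a newspaper page.
# The field names are assumptions for illustration, not ENMAP element names.
@dataclass(frozen=True)
class Block:
    page: int
    element: str                     # structural element, e.g. "headline"
    text_type: Optional[str] = None  # genre, e.g. "obituary"; None if not assigned


def facet_counts(blocks, field):
    """Count how often each value of `field` occurs, skipping unassigned values."""
    values = (getattr(b, field) for b in blocks)
    return Counter(v for v in values if v is not None)


def filter_blocks(blocks, **criteria):
    """Keep only the blocks whose fields match every given criterion."""
    return [
        b for b in blocks
        if all(getattr(b, key) == value for key, value in criteria.items())
    ]


# Made-up sample data, not taken from any real newspaper.
blocks = [
    Block(1, "headline", "short news"),
    Block(1, "advertisement"),
    Block(2, "headline", "obituary"),
    Block(2, "illustration"),
    Block(2, "caption"),
]

element_facet = facet_counts(blocks, "element")
obituaries = filter_blocks(blocks, text_type="obituary")
```

A search interface could display `element_facet` as clickable counts and use `filter_blocks` to narrow the result list once a user selects a value.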

ENMAP Structural Map

• Objectives
  • Contribute to some standardisation in this field
  • Set up a list of these elements
  • Gather feedback from libraries
  • Provide definitions and examples
  • Include a first version within ENMAP

Thank you for your attention!

Günter Mühlberger <[email protected]>