Gabriele Messmer Bavarian State Library Munich, Germany Digitization Workflow of the Bavarian State Library


Gabriele Messmer

Bavarian State Library

Munich, Germany

Digitization Workflow of the Bavarian State Library


Gabriele Messmer: Digitization Workflow - © Bayerische Staatsbibliothek 2011

Digitization process at a glance


ERaTO

ERaTO: a tool to create, fill in and print an order form that informs the patron about the estimated price


Order Form for Digitization on Demand (DoD)


ERaTO


ERaTO


ZEND

ZEND = Zentrale Erfassungs- und NachweisDatenbank (central capture and reference database) [digital asset management system]

Mapping of the entire production processes to a modular system

Different service providers (scanning, text capture) can supply unlimited data to ZEND

Workflow control

Every object of the BSB that is to be digitized follows the ZEND workflow

Time and cost reduction through extensive automation


ZEND at a glance

[Diagram; recoverable labels follow]

Production system: Digitisation on Demand, project-oriented digitisation, conservation purposes

Order

ZEND: administration of all metadata (bibliographic, technical, administrative), automatic image processing, OCR

Definitive file name, URN

Digitised object

Exchange of metadata

Catalog (OPAC), Network of Bavarian Libraries, digitisation portals (BLO, ZvDD, Chronicon, ...), search engines, OAI

URN resolving (XEpicur), DNB

In-house-only publication, online (web) publication

Archival storage


Modules of the ZEND Workflow Tool

Module groups: ZEND Administration, ZEND Enduser Interface, ZEND Workflow Modules, ZEND Data

Search/Browse: Title / Full Text

PDF-Download

Viewer

Repository: Structural Information, Images, Full Text

Database: Object Information, Metadata

Archival Storage

ERaTO: Order Tool

DaVeDi: Workflow Database

Image Conversion

XML-Editor: Deep Indexing

Interfaces: OAI, Z39.50

Data Management: via TSM-Client

Catalog Enrichment: Resolving Link

URN Generator: Resolving Link

Administration: Access & Administration

Subject Portals

Monitoring: Access & Error Log

Retrieval from Archive

Archival Storage (TSM)

Collection Management

OCR Processing

DoD Services: Order & Billing

Data Harvester

Repository Maintenance

RSS Feeds, Order Management

Metsificator: METS Export


Scanning process – step by step

1. Transport of the original objects to the Scanning Center

2. Conservational check

3. Preparation for the scanning process

4. Image capture

5. Quality control

6. Indexing and OCR

7. Storage and digital long-term preservation

8. Publication on the web / Access
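The eight steps form a fixed sequence that a workflow tool can enforce. A minimal sketch of that idea, assuming each object simply advances one step at a time; the `Step` enum and the `next_step` helper are illustrative names and not part of ZEND:

```python
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """The eight steps listed on the slide, in order."""
    TRANSPORT = 1
    CONSERVATION_CHECK = 2
    PREPARATION = 3
    IMAGE_CAPTURE = 4
    QUALITY_CONTROL = 5
    INDEXING_AND_OCR = 6
    STORAGE_AND_PRESERVATION = 7
    PUBLICATION = 8


def next_step(current: Step) -> Optional[Step]:
    """Return the step that follows `current`, or None after the last one."""
    if current is Step.PUBLICATION:
        return None
    return Step(current + 1)
```

For example, `next_step(Step.IMAGE_CAPTURE)` yields `Step.QUALITY_CONTROL`, so an object cannot be published before it has passed quality control.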


ZEND – login screen


Digitization order in ZEND


Digitization order in ZEND


Import of the bibliographic metadata into ZEND via Z39.50 from the local catalogue system


Data import via Z39.50 into ZEND


… and at the same time generation of the definitive file name and allocation of a URN


ZEND – Process slip with barcode


Digitization parameters

Focus of in-house digitization

Manuscripts

Music manuscripts and printed music

Old books (6th to 16th centuries)

Rare and expensive books

Production parameters

24 bit

High resolution (300 to 600 ppi, depending on the size of the original)

ICC profiles

TIFF uncompressed
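These parameters determine the storage needed per page. A rough sketch of the arithmetic, counting only the raw pixel data (TIFF headers and embedded ICC profiles add a little more); the A4 page size in the usage note is an assumed example, and `uncompressed_image_bytes` is an illustrative name:

```python
def uncompressed_image_bytes(width_mm: float, height_mm: float,
                             ppi: int, bits_per_pixel: int = 24) -> int:
    """Raw pixel data size of an uncompressed scan, in bytes."""
    # Convert the physical size to pixels (25.4 mm per inch).
    width_px = round(width_mm / 25.4 * ppi)
    height_px = round(height_mm / 25.4 * ppi)
    # 24 bit colour means 3 bytes per pixel.
    return width_px * height_px * bits_per_pixel // 8
```

An A4 page (210 x 297 mm) at 300 ppi comes to 26,099,520 bytes, roughly 26 MB; at 600 ppi the same page needs about four times as much, roughly 104 MB.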


Scanning speed

Scanning speed depends on

Desired reproduction quality

Preservation requirements

Age and state of the original (difficult fixing)

Format/binding of the original


Conservational requirements at BSB

Institute of Book and Manuscript Conservation (IBR)

The scanning devices have to follow the book requirements and not vice versa!

General requirements: lighting of the object and room climate

Specific requirement: use the scanning device most appropriate to the original work


Scanning devices of Munich Digitization Center

20 scanners (state of technology 2006-2010), among them

4 automated scanning devices

1 thermographic scanner for watermarks

Hasselblad digital camera


Munich Scanning Center


Camera Work Table from Graz/Austria


Scanning by robots – the VD16 Project

Challenge: to scan old books with scan-robots

Project cofinanced by the German Research Foundation

Start: December 2007

Equipment 2011: 4 scan-robots with 800 pages/h each (old books)

ScanRobot: EU "ICT" Grand Prize Winner at the CeBIT 2007
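The equipment figures on this slide (4 scan-robots at 800 pages/h each) give a combined nominal rate that can be used for rough planning. A small sketch of that arithmetic, assuming all robots run in parallel at the stated rate without interruption; the function names and the 100,000-page job are illustrative:

```python
def pages_per_hour(robots: int = 4, rate_per_robot: int = 800) -> int:
    """Combined nominal throughput of all scan-robots."""
    return robots * rate_per_robot


def hours_needed(pages: int, robots: int = 4, rate_per_robot: int = 800) -> float:
    """Nominal machine hours to scan `pages` with all robots in parallel."""
    return pages / pages_per_hour(robots, rate_per_robot)
```

With the slide's figures the combined rate is 3,200 pages/h, so a hypothetical job of 100,000 pages would take 31.25 nominal hours; handling, preparation and quality control are not included.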


Scanning process

Choice of the most appropriate scanner

Setting up a scan job on the scanner PC

Image capture

Finalizing the scan job

Moving the digitized images to a folder called "Abholfach" (pick-up box)


Quality control and approval

Quality control

Entering structural data

Approval

Return of the original work to the ordering library department or to the stacks


Automatic processing after scanning is finalized

OCR processing of the images (optional)

Creation of an index file for the web presentation of the images

Automatic production of image formats for the web presentation (JPG, PDF etc.)

Creation of the browsing structure for the object (ToC-Editor)

Storage of all the data at the Leibniz Supercomputing Centre for long-term preservation


ZEND: XML ToC (Table of Contents) editor

Quality control and correction (image size & orientation, etc.)

Flagging the structural elements of the document (e.g. frontispiece, chapter titles, images) allows navigation inside the digital object

Subsequent "activation" and instant availability on the WWW
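Flagged structural elements can be stored as a small XML document that maps each element to the image where it starts, which is what makes navigation inside the object possible. The slides do not show ZEND's actual XML format, so the element and attribute names below (`toc`, `entry`, `type`, `label`, `image`) and the object identifier are hypothetical; this is only a sketch of the idea:

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Optional, Tuple


def build_toc(object_id: str, entries: Iterable[Tuple[str, str, int]]) -> str:
    """Serialize (type, label, first_image) tuples as a ToC document."""
    root = ET.Element("toc", {"object": object_id})
    for kind, label, first_image in entries:
        ET.SubElement(root, "entry", {
            "type": kind,               # e.g. "frontispiece", "chapter"
            "label": label,             # text shown in the navigation
            "image": str(first_image),  # image number where the element starts
        })
    return ET.tostring(root, encoding="unicode")


def image_for(toc_xml: str, label: str) -> Optional[int]:
    """Look up the image number a viewer should jump to for `label`."""
    root = ET.fromstring(toc_xml)
    for entry in root.iter("entry"):
        if entry.get("label") == label:
            return int(entry.get("image"))
    return None
```

A viewer could then call `image_for(toc_xml, "Chapter 1")` to jump straight to the first image of that chapter.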


ZEND: ToC editor


DILL - Parma - Gabriele Messmer


Finished version for Web presentation


Immediate availability inside the local catalogue (OPAC)


Archival storage of the data


Digitization workflow – archiving report


Challenge OCR output


Cited from:

http://www.abbyy.de/recognition_server/brochure_en/


Challenge OCR output


Challenge OCR output


Fulltext search in books digitized by Google


Contact

[email protected]