Image Workflow Processes
Elspeth Haston, Robert Cubey, Martin Pullan & David J Harris
Large scale digitisation programmes are becoming more common, resulting in:
Large numbers of files – potentially nearly 3,000,000 for Edinburgh herbarium (E)
High quality images
Large file size – c. 150MB each
Images captured with minimal data records
These images need to be managed and made available, and the scale is too large for entirely manual processes
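As a rough illustration of the scale, the two figures quoted above can be multiplied out. This is a back-of-envelope sketch only: it assumes decimal units (1 TB = 1,000,000 MB) and treats both numbers as the approximate upper estimates given here, not as measured totals.

```python
# Rough storage estimate from the figures quoted above (both approximate).
IMAGES = 3_000_000       # "potentially nearly 3,000,000" files
MB_PER_IMAGE = 150       # "c. 150MB each"
MB_PER_TB = 1_000_000    # assumption: decimal units


def total_terabytes(images: int, mb_per_image: int) -> float:
    """Return the total size in terabytes for equally sized images."""
    return images * mb_per_image / MB_PER_TB


print(total_terabytes(IMAGES, MB_PER_IMAGE))  # 450.0
```

On these assumptions the full-resolution files alone would come to roughly 450 TB.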
[Workflow diagram, with steps labelled User or System: Capture Image, Edit Image, Save Image, Dropbox, Image polling & metadata capture, OCR, QC, Create jpg & zoomify, Serve Online, Save tiff & raw, Archive]
Image workflow being developed at RBGE incorporating:
image capture
automated image processing
metadata recording
optical character recognition
quality control
image streaming online
archiving
Capture Image
Edit Image
Save Image
Image captured using digital camera or scanner
Image edited in Leaf Capture software and/or Adobe Photoshop
Images saved into folders
batches consisting of ¼ day’s work are checked for quality prior to being transferred
A series of dropbox folders is used to enable parallel processing
an internal folder structure contains the equipment and operator names which form part of the metadata
The image management system polls the dropbox folders
any new image files are registered in a MySQL database and the metadata (equipment, operator, date, etc.) are recorded
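The polling step described above could be sketched roughly as follows. This is not the RBGE implementation: the folder layout (root/equipment/operator/file), the table and column names, and the function name poll_dropbox are all hypothetical, the recorded date is assumed to be the registration time, and SQLite from the Python standard library stands in for MySQL so the example is self-contained.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical layout: <dropbox_root>/<equipment>/<operator>/<image file>
SCHEMA = """
CREATE TABLE IF NOT EXISTS image (
    path TEXT PRIMARY KEY,
    equipment TEXT,
    operator TEXT,
    registered_at TEXT
)
"""


def poll_dropbox(root: Path, db: sqlite3.Connection) -> int:
    """Register files not yet in the database; return how many were added."""
    db.execute(SCHEMA)
    added = 0
    for path in sorted(root.glob("*/*/*")):
        if not path.is_file():
            continue
        already = db.execute(
            "SELECT 1 FROM image WHERE path = ?", (str(path),)
        ).fetchone()
        if already:
            continue  # registered on an earlier poll
        # Equipment and operator come from the folder names (assumed layout).
        equipment, operator = path.relative_to(root).parts[:2]
        registered_at = datetime.now(timezone.utc).isoformat()
        db.execute(
            "INSERT INTO image VALUES (?, ?, ?, ?)",
            (str(path), equipment, operator, registered_at),
        )
        added += 1
    db.commit()
    return added
```

Calling poll_dropbox repeatedly (for example from a scheduled job) registers only files that have appeared since the previous call.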
Additional modular components
OCR
A copy of the image is processed using ABBYY Optical Character Recognition (OCR) software
the text is recorded in the MySQL database to facilitate searching
a PDF is available to help users carry out additional data entry from the image
QC
We are developing a quality control checking process
it provides an interface for a user to open images and record a quality assessment
it enables corrections, appending or overwriting as appropriate
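For the OCR module, storing the recognised text so that it can be searched might look something like the sketch below. It does not call ABBYY; the text is assumed to have been produced already. The table name, column names and both function names are hypothetical, and SQLite again stands in for MySQL.

```python
import sqlite3


def store_ocr_text(db: sqlite3.Connection, image_id: str, text: str) -> None:
    """Save (or overwrite) the OCR text recorded for one image."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS ocr_text "
        "(image_id TEXT PRIMARY KEY, text TEXT)"
    )
    db.execute(
        "INSERT OR REPLACE INTO ocr_text VALUES (?, ?)", (image_id, text)
    )
    db.commit()


def search_ocr_text(db: sqlite3.Connection, term: str) -> list:
    """Return the ids of images whose OCR text contains the search term."""
    rows = db.execute(
        "SELECT image_id FROM ocr_text WHERE text LIKE ? ORDER BY image_id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

A plain substring match is the simplest option; a full-text index would be a natural alternative for a collection of this size.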
The image management system creates a jpg and a zoomify version of the image files
The tiff and the raw files are saved into a zip folder
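Bundling the tiff and raw files for one image into a zip file could be sketched as below, using only the Python standard library. The function name, the choice to name the zip after the tiff, and the decision to store the files uncompressed are assumptions made for illustration; creating the jpg and zoomify derivatives is not shown.

```python
import zipfile
from pathlib import Path


def archive_masters(tiff: Path, raw: Path, out_dir: Path) -> Path:
    """Write the tiff and raw files for one image into a single zip file."""
    zip_path = out_dir / f"{tiff.stem}.zip"
    # ZIP_STORED (no compression) is an assumption, not the source's setting.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for src in (tiff, raw):
            zf.write(src, arcname=src.name)
    return zip_path
```

The returned path is what a later step would record and pass on for archiving.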
The zoomify files are served online, enabling users to zoom in and examine the specimen in detail
The zip folders comprising the tiff and the raw file are then archived onto tape and external hard drives
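Zoomable viewers generally work by cutting the image into small tiles at several resolutions, so that only the tiles in view need to be sent to the user. The sketch below illustrates that general idea with a generic tile pyramid; it is not Zoomify's exact tiling scheme, and the 256-pixel tile size, the halving between levels and the example image dimensions are all assumptions.

```python
import math

TILE = 256  # assumption: square tiles of 256 pixels


def pyramid_levels(width: int, height: int, tile: int = TILE) -> list:
    """Return (width, height, tile_count) per zoom level, largest first."""
    levels = []
    while True:
        cols = math.ceil(width / tile)
        rows = math.ceil(height / tile)
        levels.append((width, height, cols * rows))
        if cols == 1 and rows == 1:
            break  # the whole image now fits in a single tile
        # Halve the resolution for the next, more zoomed-out level.
        width = math.ceil(width / 2)
        height = math.ceil(height / 2)
    return levels


# Hypothetical 8000 x 6000 pixel image
for level in pyramid_levels(8000, 6000):
    print(level)
```

Under these assumptions the example image needs six levels and just over a thousand tiles in total, each of which is small enough to load quickly.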
The location of each file is also recorded in the MySQL database
The image workflow system at RBGE has now processed over 130,000 images.
the modular system is flexible, but each new module may require access to the archived tiff files, and some reprocessing may be necessary
it has proved unfeasible to maintain the tiff and raw files on a server
during the development of the workflow, backlogs built up; these can have a large impact on image management and on the curation of the collections
The workflow is enabling us to manage the images effectively:
the system helps with the integration of digitisation and curation in the herbarium
requests for images and data are easily managed and users will shortly be able to download images and data directly
the modular element will allow us to incorporate a georeferencing tool
the workflow is allowing us to manage several large digitisation projects in an integrated system
Thank you