Digital Text and Libraries
Michael Popham
DOI Meeting, Oxford, June 2006
Ranganathan’s laws of library science
1. Books are for use
2. Every reader his book
3. Every book its reader
4. Save the time of the reader
5. A library is a growing organism
(Ranganathan, 1931)
Libraries and digital texts
…as purchasers of digital texts
– from publishers, aggregator services
…as producers of digital texts
– digitized from analogue originals, analogue surrogates
…as custodians of digital texts
– purchased and licensed material
– institutional repositories, digital assets created in-house
– acquired e-MSS and personal digital collections
Libraries and digital texts – the challenges
…as purchasers of digital texts
– we have to work with what we’re sold/what’s available
…as producers of digital texts
– we have to work with what we’ve got
…as custodians of digital texts
– we have to work with what we’re given
Thomas Bodley’s Vision
Bodleian founded 1602
Universal library
Bodley’s “Republic of Letters”
Legal deposit privilege since 1610
60% of Bodleian readers not members of Oxford University
Bodleian Library
400 staff
Budget of £14m (€20.5m)
Stock 8 million items
45,000 registered users
120 miles (192km) of shelving
123,000 monograph items and 194,000 serial items added each year
Oxford University Library Services
> 660 staff (600 FTE)
40 libraries, including the Bodleian
Budget > £25m (€37m)
Total bookstock: 11 million items
156 miles (250km) of shelving, including repository space
The “Digital Library” at Oxford
1960s Machine-readable texts for scholarly purposes
1976 Oxford Text Archive founded
1980s Networked databases and CD-ROMs
1990s Libraries on the web, e-journals etc.
2001 Oxford Digital Library (ODL)
2005 ELISO (Electronic Library & Information Service)
Google/Oxford partnership
An affecting and sublime! scene, or, : The great captain going to head his armies
Oxford-Google Project: what to digitize?
Direct discussions with Google since 2003
Win/win situation for both parties
Extensive collection of out-of-copyright (and mostly out-of-print) material identified
– Oxford differs from other partners in this aspect of our agreement
– Decision made to begin with the 19th century material
– Looking at approximately 1+ million items
Overview of workflow
[Flowchart. Steps shown: Selection; Suitable for digitization? (Y/N); Reshelve; Fast-track; Slow-track; Digitize; QA (Y/N); Generate deliverables; Store outputs; Update OULS OPAC; Update Google.print index]
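One possible reading of the workflow steps named on this slide, written as a minimal Python sketch. The step labels come from the slide. The order of the branches, the `Volume` record and the `process` function are assumptions made for illustration, since the arrows of the original flowchart are not recoverable. In particular, the slide does not show what happens after a failed QA check, so the sketch simply stops there.

```python
from dataclasses import dataclass
from enum import Enum


class Track(Enum):
    FAST = "Fast-track"
    SLOW = "Slow-track"


@dataclass
class Volume:
    """Hypothetical record for one candidate item; not part of the original slides."""
    shelfmark: str
    suitable: bool            # answer to "Suitable for digitization?"
    track: Track = Track.FAST
    passes_qa: bool = True    # answer at the QA decision point


def process(volume: Volume) -> list[str]:
    """Return the sequence of workflow steps a volume passes through."""
    steps = ["Selection"]
    if not volume.suitable:
        # Assumed reading: unsuitable items go straight back to the shelf.
        steps.append("Reshelve")
        return steps
    steps.append(volume.track.value)
    steps.append("Digitize")
    steps.append("QA")
    if not volume.passes_qa:
        # The slide does not show the target of the "N" branch; stop here.
        steps.append("QA failed")
        return steps
    steps += [
        "Generate deliverables",
        "Store outputs",
        "Update OULS OPAC",
        "Update Google.print index",
    ]
    return steps


if __name__ == "__main__":
    for step in process(Volume("example-1", suitable=True, track=Track.SLOW)):
        print(step)
```

Returning the list of steps, rather than performing any real action, keeps the sketch independent of whatever systems were actually used at Oxford or Google.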
Outputs and outcomes
Large raw colour images from digitization process
Per volume, OULS receives:
– JPEG2000 (probably), and TIFFs
– Uncorrected OCR
Audit of production process
There are quality control processes at Google & Oxford
Deliverable images (to be hosted by Google in the first instance) linked to OPAC records
Ongoing software/hardware developments to improve the process
Challenges that lie ahead…
Building the local infrastructure to manage and deliver the Oxford Digital Copy of the data
Investigating ways to exploit the data, e.g.:
– Correcting OCR files, adding additional markup
– (Re-)structuring the data – moving beyond a simple search and page-turning presentation
– Completing/extending volumes and collections
– Automatic collation, authorship attribution, stylistic analysis
– …and many, many more(?!)
Raising the bar of what is possible, and end-users’ expectations about what we can deliver
Feel the Fear….
©opyright and IPR
Threat to (Scholarly) e-Publishers
Proliferating plagiarism
Encouraging poor research
Scope creep, scalability, data deluge
Preservation and access
Useful links
http://www.bodley.ox.ac.uk/google/
http://books.google.com/googlebooks/library.html