Organization, Clarity, and Sanity: Digitization for the Future On a Shoestring

Or

Digital Library Development From the Bottom Up

Jody L. DeRidder

jlderidder@ua.edu

University of Alabama Libraries

Image courtesy of Life Magazine

Libraries organize information… primarily books.

Trinity College Library, Dublin, as captured by Candida Höfer in her book Libraries (Thames and Hudson, UK: 2005).

Photo credit: Flickr user "Libby", used with permission (creative commons)

If libraries organize books… Why not digital files??

It’s all information!

A digital object may belong in MANY potential virtual collections…

… but it originated from ONE SINGLE ANALOG collection. Provenance trumps all!

Slavery, African Americans, Sheet Music, Tombigbee River, Southern History … and more

“Gum Tree Canoe,” Published by G.P. Reed (Boston: 1847). Wade Hall collection of Southern History and Culture, Hoole Special Collections, University of Alabama Libraries.

Bringing Order to Chaos

University of Alabama Libraries

Holder ID: u0003

Collection ID: 0000023

Item ID: 0000007

Sequence ID: 0005

Archival File: u0003_0000023_0000007_0005.tif

1) Clarity

2) Low cost

3) Simple

4) Extensible

u0003_0001980_0000001 is the first digitized item in the MSS 1980 collection

HOLDER ID: u0003

COLLECTION ID: 0001980
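
The naming scheme can be checked mechanically. The following is a minimal sketch, not the library's actual tooling: the regular expression is an assumption inferred from the examples on these slides (a holder ID such as u0003, then underscore-separated numeric segments), and the names `IDENTIFIER`, `ObjectId`, and `parse_identifier` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, inferred from the slide examples: a holder ID such as
# "u0003", then numeric segments joined by underscores. Segment widths are
# left flexible because the examples in the deck vary in length.
IDENTIFIER = re.compile(
    r"^(?P<holder>[a-z]\d+)"
    r"_(?P<collection>\d+)"
    r"_(?P<item>\d+)"
    r"(?:_(?P<sequence>\d+))?"   # sequence is absent for item-level IDs
    r"(?:\.\w+)?$"               # optional file extension, e.g. ".tif"
)


class ObjectId(NamedTuple):
    holder: str
    collection: str
    item: str
    sequence: Optional[str]


def parse_identifier(name: str) -> ObjectId:
    """Split an identifier or file name into its named segments."""
    match = IDENTIFIER.match(name)
    if match is None:
        raise ValueError(f"not a recognized identifier: {name!r}")
    return ObjectId(
        match["holder"], match["collection"], match["item"], match["sequence"]
    )
```

Calling `parse_identifier("u0003_0000023_0000007_0005.tif")` yields the four segments shown on the earlier slide; an item-level ID such as u0003_0001980_0000001 parses with no sequence.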

The Digitization Working Area…

Collection folders are named for the collection identifier. Allowed subfolders include:

Admin, Metadata, Scans, Transcripts

Compound objects have their own subfolders for pages, named for the item.
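
A folder convention like this is easy to police with a small script. This is only a sketch under assumptions: it checks the top level of one collection folder against the four allowed names above, and the function name `unexpected_subfolders` is hypothetical. It does not look inside the subfolders, where compound objects keep their per-item page folders.

```python
from pathlib import Path

# The four subfolder names allowed in a collection folder, per the slide.
ALLOWED_SUBFOLDERS = {"Admin", "Metadata", "Scans", "Transcripts"}


def unexpected_subfolders(collection_dir: Path) -> list:
    """Return names of top-level subfolders that are not on the allowed list.

    Plain files are ignored; only directories are checked.
    """
    return sorted(
        entry.name
        for entry in collection_dir.iterdir()
        if entry.is_dir() and entry.name not in ALLOWED_SUBFOLDERS
    )
```

An empty result means the collection folder follows the convention.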

And a Collection Folder in the Working Area

An Example of the Lowest-Cost Model: The Alabama Digital Preservation Network http://www.adpn.org/

http://www.lockss.org/

Lots of Copies Keep Stuff Safe!!

storage area

Simple, Clear Hierarchical Organization:

Holder ID, Collection ID, Item ID, Sequence ID

u0003 slide

Identification, Organization and Consistency

Each segment of numbers (Holder ID, Collection ID, Item ID, Sequence ID) is used in the directory structure.

The directory for u0003_0000003_0002_001.tif is simply:

u0003/0000003/0002/001/
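
Because the directory is just the file name split on underscores, the mapping takes a few lines. A minimal sketch, with the function name `directory_for` being hypothetical:

```python
from pathlib import PurePosixPath


def directory_for(filename: str) -> PurePosixPath:
    """Map an archival file name to its directory, one level per segment."""
    stem = filename.rsplit(".", 1)[0]        # drop the extension, if any
    return PurePosixPath(*stem.split("_"))   # each segment becomes a folder
```

For the file named above, `directory_for("u0003_0000003_0002_001.tif")` gives u0003/0000003/0002/001.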

Dropping the Technical Metadata in… where it belongs

Makes METS creation a Piece of Cake!

(and redundant!)

Using FITS, the File Information Tool Set developed by Harvard which encapsulates JHOVE, DROID, ExifTool and other tools: http://code.google.com/p/fits/
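
One way to script this step is to build a FITS command per archival file. This is a sketch under assumptions: it uses the `fits.sh` launcher with its `-i` (input) and `-o` (output) options, assumes the launcher is on the PATH, and writes the output next to the archival file with a `.fits.xml` suffix. That output naming and the function name `fits_command` are hypothetical, not taken from the slides.

```python
from pathlib import PurePosixPath


def fits_command(archive_file: str) -> list:
    """Build a FITS command that writes technical metadata beside the file.

    Assumption: the output is named <stem>.fits.xml in the same directory.
    """
    source = PurePosixPath(archive_file)
    target = source.parent / (source.stem + ".fits.xml")
    return ["fits.sh", "-i", str(source), "-o", str(target)]
```

The returned list can be passed to `subprocess.run(command, check=True)` on a machine where FITS is installed.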

Bringing Content Up to the Level Of the WEB!!!

Greater Usability and Access == Longer Life

Images … ImageMagick: http://www.imagemagick.org (it’s free!)

Protected archive area:

u0003/0000023/0000007/0005/u0003_0000023_0000007_0005.tif

Web accessible area:

u0003/0000023/0000007/0005/ (thumb, mid-, and large-size derivatives)
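
Generating the derivatives can be scripted around ImageMagick. The sketch below only builds the commands. The pixel sizes, the JPEG output format, the `_thumb`/`_mid`/`_large` file name suffixes, and the function name `derivative_commands` are all assumptions for illustration; the slides do not specify them. It uses ImageMagick's `convert` with `-resize`, which fits the image within the given box while keeping its aspect ratio (newer ImageMagick releases prefer the `magick` command).

```python
from pathlib import PurePosixPath

# Hypothetical maximum edge lengths in pixels; the slides give no sizes.
DERIVATIVE_SIZES = {"thumb": 128, "mid": 1024, "large": 2048}


def derivative_commands(archive_file: str, web_dir: str) -> list:
    """Build one ImageMagick command per derivative size."""
    source = PurePosixPath(archive_file)
    commands = []
    for label, edge in DERIVATIVE_SIZES.items():
        # Assumed naming: <archival stem>_<label>.jpg in the web directory.
        target = PurePosixPath(web_dir) / f"{source.stem}_{label}.jpg"
        commands.append(
            ["convert", str(source), "-resize", f"{edge}x{edge}", str(target)]
        )
    return commands
```

Each command can then be run with `subprocess.run(command, check=True)`, leaving the archival TIFF untouched in the protected area.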

Audio … LAME: http://lame.sourceforge.net

OCR … TESSERACT: http://code.google.com/p/tesseract-ocr/

http://acumen.lib.ua.edu

ACCESS! Via Acumen

(also free!)

XML agnostic

No ingest

No metadata modifications

All content easily accessible

Open to search engines

Jody L. DeRidder, jlderidder@ua.edu

UA Digital Services wiki: http://intranet.lib.ua.edu/groups/Digital_Services