Digital Libraries with Greenstone: an open source solution Tod Olson - University of Chicago Fred Miller - Illinois Wesleyan University Curtis Kelch - Illinois Wesleyan University Copyright Tod Olson, Fred Miller, and Curtis Kelch 2004. This work is the intellectual property of the authors. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.


Digital Libraries with Greenstone: an open source solution

Tod Olson - University of Chicago

Fred Miller - Illinois Wesleyan University

Curtis Kelch - Illinois Wesleyan University

Copyright Tod Olson, Fred Miller, and Curtis Kelch 2004. This work is the intellectual property of the authors. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.


Digital Libraries with Greenstone

• Introduction

• About digital libraries

• Greenstone overview

• Examples

• Future

• Live demos

• Q & A


The World of Digital Libraries

• Access to Digital Collections
  – Text, images, audio, video
  – Searching and metadata

• Digital libraries versus repositories
  – Access and preservation

• Digital Preservation Tutorial: http://www.library.cornell.edu/iris/tutorial/dpm/


Sorting Out the Ingredients

• Raw materials

• User interface

• Elements of organization

• Building the collection


Greenstone

New Zealand Digital Library Project at the University of Waikato
• with UNESCO, Human Info NGO

International, every continent

Examples:
• Academic
  – Digitization projects
  – Classes on digital libraries
• Non-academic
  – UNESCO humanitarian documentation


Greenstone features

• Works with existing documents
  – Imports several formats
• Searching: full text and metadata
  – Dublin Core, custom metadata
• Browse
• Structured documents
  – Indexing, access
• Extensible & customizable
• Open source software (GPL)


Greenstone Architecture

[Figure: Receptionists communicate through the Protocol with Collection Servers; each Collection has its own Import step and DB & Indexes. Redrawn from Witten & Bainbridge, How to Build a Digital Library, p. 356]


Greenstone Architecture

Receptionist
• Provides user interface
• Accept user input
• Send to appropriate collection server
• Accept results
• Dynamic page generation

Collection Server
• Handle collection content
• Search and filter information
• Return results
• Multiple collections
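The division of labor listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Greenstone's actual code: the class names, the in-memory dictionaries, and the substring search are all hypothetical stand-ins for the real receptionist, collection server, and protocol.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class CollectionServer:
    """Hypothetical server: holds collection content and answers searches."""
    # Assumed storage layout: collection name -> {document id -> text}
    collections: dict = field(default_factory=dict)

    def search(self, collection: str, term: str) -> list:
        """Return ids of documents in `collection` whose text contains `term`."""
        docs = self.collections.get(collection, {})
        return sorted(doc_id for doc_id, text in docs.items()
                      if term.lower() in text.lower())


@dataclass
class Receptionist:
    """Hypothetical receptionist: takes user input, routes it, renders a page."""
    servers: list

    def handle_query(self, collection: str, term: str) -> str:
        # Send the request to whichever server holds the collection.
        for server in self.servers:
            if collection in server.collections:
                return self.render(collection, term,
                                   server.search(collection, term))
        return f"<p>Unknown collection: {escape(collection)}</p>"

    @staticmethod
    def render(collection: str, term: str, hits: list) -> str:
        """Generate the result page dynamically from the returned hits."""
        items = "".join(f"<li>{escape(doc_id)}</li>" for doc_id in hits)
        return f"<h1>{escape(collection)}: {escape(term)}</h1><ul>{items}</ul>"


server = CollectionServer({"chopin": {"op28": "Preludes, Op. 28",
                                       "op10": "Etudes, Op. 10"}})
receptionist = Receptionist([server])
page = receptionist.handle_query("chopin", "preludes")
print(page)
```

Keeping the search logic in the server and the page generation in the receptionist mirrors the split on the slide: one server can hold multiple collections, and a receptionist can front several servers.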


Building Collections

[Figure: source documents (HTML, PDF, ???) → Import → GSAF → Build → DB & Indexes]


Building collections

• Create a collection framework
  – or work with an old collection

• Select documents

• Import documents
  – Converts to internal XML format (GSAF)

• Build collection
  – Creates search indexes and browse listings
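The import and build steps above can be pictured with a toy two-stage pipeline. This is a minimal sketch, assuming plain dictionaries in place of GSAF files and a simple word-to-document index in place of Greenstone's real indexes; the function names and sample documents are hypothetical.

```python
import re
from collections import defaultdict


def import_documents(sources: dict) -> list:
    """Import step: convert each source into one common internal record.

    Stands in for conversion to GSAF; the tag stripping is deliberately crude.
    """
    records = []
    for filename, raw in sources.items():
        match = re.search(r"<title>(.*?)</title>", raw, re.I | re.S)
        title = match.group(1).strip() if match else filename
        text = re.sub(r"<[^>]+>", " ", raw)  # drop HTML-style tags
        records.append({"id": filename, "title": title, "text": text})
    return records


def build_collection(records: list) -> tuple:
    """Build step: create a full-text search index and a browse listing."""
    index = defaultdict(set)
    for record in records:
        for word in re.findall(r"\w+", record["text"].lower()):
            index[word].add(record["id"])
    # Browse list: (title, id) pairs ordered by title, ignoring case.
    browse = sorted(((r["title"], r["id"]) for r in records),
                    key=lambda pair: pair[0].lower())
    return dict(index), browse


sources = {
    "b.html": "<html><title>Nocturnes</title><body>Night pieces</body></html>",
    "a.txt": "Plain text about mazurkas",
}
index, browse = build_collection(import_documents(sources))
print(index["mazurkas"])  # ids of documents containing the word
print(browse)             # (title, id) pairs for the browse listing
```

Separating the two stages means new source formats only touch the import step, while the build step always works on the same internal representation.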


GSAF: internal XML format

Section:
• Description
  – Metadata fields
• Content
  – Text, internal markup, images
• Section
  – No limit in number or depth

Hierarchical documents: sections nest in a tree structure


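GSAF: internal XML format

<Section>
  <Description>
    <Metadata name="Title" value="…"/>
  </Description>
  <Content>[Text, images, links, etc.]</Content>
  <Section>
    <Description>
      <Metadata name="Title" …/>
    </Description>
    <Content>…</Content>
    <Section>…</Section>
  </Section>
  <Section>…</Section>
  <Section>…</Section>
</Section>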
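To make the nesting concrete, here is a small sketch that builds a section tree with Python's standard xml.etree.ElementTree. It follows the simplified attribute layout shown on the slide, which may differ in detail from real Greenstone archive files; the helper names and sample titles are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def make_section(title: str, content: str, children=()) -> ET.Element:
    """Build one Section with a Description, Content, and nested Sections."""
    section = ET.Element("Section")
    description = ET.SubElement(section, "Description")
    ET.SubElement(description, "Metadata", name="Title", value=title)
    ET.SubElement(section, "Content").text = content
    section.extend(children)  # nested Sections follow the Content
    return section


def depth(section: ET.Element) -> int:
    """Depth of the section tree (a lone Section has depth 1)."""
    return 1 + max((depth(child) for child in section.findall("Section")),
                   default=0)


def titles(section: ET.Element) -> list:
    """Section titles in document order."""
    return [m.get("value") for m in section.iter("Metadata")
            if m.get("name") == "Title"]


doc = make_section("Score", "cover text", [
    make_section("Page 1", "page1.jpg"),
    make_section("Page 2", "page2.jpg", [make_section("Note", "marginalia")]),
])
print(depth(doc))
print(titles(doc))
print(ET.tostring(doc, encoding="unicode"))
```

Because every level is just another Section, the same two recursive helpers work no matter how deep or wide the document is.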


Config file: collect.cfg

Collection-specific configuration file, collect.cfg, specifies:

• File types to import
• Indexes and browse lists
  – Document or section level
  – Paragraph (text index only)
• Display of results and browse listings
• Document displays
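As a rough illustration of the four kinds of settings listed above, the sketch below parses a small configuration text into a dictionary. The sample lines are an assumption modeled on Greenstone 2 conventions (plugins for file types, indexes, classifiers for browse lists, format strings for display) and are not taken from the slides; check directive and plugin names against a real collect.cfg before relying on them.

```python
import shlex
from collections import defaultdict

# Assumed sample text, written for this sketch only.
SAMPLE_CFG = '''
# file types to import
plugin    HTMLPlug
plugin    PDFPlug
# search indexes, at document or section level
indexes   document:text section:Title
# browse list
classify  AZList -metadata Title
# display of search results
format    SearchVList "<td>[link][icon][/link]</td><td>[Title]</td>"
'''


def parse_cfg(text: str) -> dict:
    """Group each directive's arguments under the directive name."""
    settings = defaultdict(list)
    for line in text.splitlines():
        tokens = shlex.split(line, comments=True)  # honors quotes and '#'
        if tokens:
            settings[tokens[0]].append(tokens[1:])
    return dict(settings)


cfg = parse_cfg(SAMPLE_CFG)
print([args[0] for args in cfg["plugin"]])
print(cfg["indexes"][0])
```

A directive such as plugin can appear several times, which is why each name maps to a list of argument lists.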


Chopin Early Editions

Over 400 early edition Chopin scores, 1830s to 1880s

Target audience: music scholars and musicians.

On the web as page-turnable JPEG images. Online in March 2003.

Currently 374 scores in the online collection.

Usage: nearly 100 hits per day; more than 30% of use is international.


Build overview

[Figure: catalog records, scanned images, and structural metadata → METS → XSLT → Greenstone Archive Format → Greenstone digital library software. Stage labels: human processing; XML-based automated processing.]


METS to GSAF

METS:
• dmdSec
  – MODS: Title, …
• fileSec
  – page1.jpg
  – page2.jpg
• structMap
  – div: Score
    – div: Page 1
    – div: Page 2

GSAF:
• Section
  – Description
    – Metadata: Title, …
  – Content: Title, …
  – Section
    – Content: Page 1, page1.jpg
  – Section
    – Content: Page 2, page2.jpg
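The project performed this mapping with XSLT. The sketch below shows the same idea in Python with xml.etree.ElementTree instead, purely as an illustration: the sample METS document, its IDs, and the title "Sample score" are invented for the example, and the output follows the simplified Section layout from the slides.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Invented sample input; a real METS file carries far more detail.
SAMPLE_METS = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:mods="http://www.loc.gov/mods/v3"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:dmdSec ID="dmd1"><mets:mdWrap MDTYPE="MODS"><mets:xmlData>
    <mods:mods><mods:titleInfo>
      <mods:title>Sample score</mods:title>
    </mods:titleInfo></mods:mods>
  </mets:xmlData></mets:mdWrap></mets:dmdSec>
  <mets:fileSec><mets:fileGrp>
    <mets:file ID="f1"><mets:FLocat LOCTYPE="URL" xlink:href="page1.jpg"/></mets:file>
    <mets:file ID="f2"><mets:FLocat LOCTYPE="URL" xlink:href="page2.jpg"/></mets:file>
  </mets:fileGrp></mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="Score">
      <mets:div LABEL="Page 1"><mets:fptr FILEID="f1"/></mets:div>
      <mets:div LABEL="Page 2"><mets:fptr FILEID="f2"/></mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""


def mets_to_sections(mets_xml: str) -> ET.Element:
    """Map dmdSec, fileSec, and structMap onto nested Section elements."""
    root = ET.fromstring(mets_xml)
    title = root.findtext(".//mods:title", default="", namespaces=NS)
    href = f"{{{NS['xlink']}}}href"
    # fileSec: file ID -> image location
    files = {f.get("ID"): f.find("mets:FLocat", NS).get(href)
             for f in root.iterfind(".//mets:file", NS)}

    top = ET.Element("Section")  # structMap div "Score"
    description = ET.SubElement(top, "Description")
    ET.SubElement(description, "Metadata", name="Title", value=title)
    ET.SubElement(top, "Content").text = title

    score = root.find("mets:structMap/mets:div", NS)
    for page in score.findall("mets:div", NS):  # one Section per page div
        image = files[page.find("mets:fptr", NS).get("FILEID")]
        section = ET.SubElement(top, "Section")
        ET.SubElement(section, "Content").text = f"{page.get('LABEL')} {image}"
    return top


gsaf = mets_to_sections(SAMPLE_METS)
print(ET.tostring(gsaf, encoding="unicode"))
```

The structMap drives the shape of the output, while the dmdSec and fileSec are only looked up to fill in titles and image names, which is what makes the METS to GSAF mapping feel natural.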


Greenstone benefits for Chopin

• Robust, mature system
• Recovered time in project
  – Fast to bring up
  – UI out of the box
  – Dynamic page generation
  – Incremental customization
• XML compliant
  – Natural mapping from METS to GSAF


The Argus Digital Collection

• Illinois Wesleyan Student Newspaper
  – 1894 to 2000

• Preservation and Access

• Image PDF versus full text

• Web interface for building metadata

• Customized searches


Argus Metadata Maintenance


Argus Search


Argus Issue “front door”


Ongoing work: Greenstone

• Greenstone Librarian Interface (GLI)

• Greenstone 3


Greenstone Librarian Interface (GLI)

• Collection management
  – Informed by work at GS sites
  – Assist collection designer
  – Support all phases of collection build process
  – Do not specify workflow

• Java-based GUI tool
  – Formerly called the “Gatherer”

• 2 yrs in development
  – Beta sites: Bangalore and elsewhere

• Training sessions
  – UNESCO sessions in Asia, Africa
  – JCDL 2004 tutorial


Greenstone 3

GS2: mature, 5+ yrs., wide deployment
  – Constraints: support legacy systems
  – Other technologies have matured: Java, XML

GS3: rewrite in Java, XML, XSLT
• Distributed architecture, SOAP
• METS as internal format
  – Group assembled for Greenstone METS profile(s)
• OAI support planned
• 1 year in dev; alpha testing in lab


Links & Further Information

Greenstone: http://www.greenstone.org/

Chopin Early Editions: http://chopin.lib.uchicago.edu/

Argus Digital Collection: http://www.iwu.edu/library/services/argus1.htm

Argus Greenstone Documentation: http://www.iwu.edu/~ckelch/ArgusProjectDoc12.pdf

Witten & Bainbridge. How to Build a Digital Library. Morgan Kaufmann, 2003.


More about Greenstone…