The Open Archives Initiative
Marshall Breeding, Director for Innovative Technologies and Research, Vanderbilt University
http://staffweb.library.vanderbilt.edu/breeding
Redefining Libraries: Web 2.0 and other Challenges, May 2007, Xiamen, China


Page 1: The Open Archives Initiative Marshall Breeding Director for Innovative Technologies and Research Vanderbilt University

The Open Archives Initiative

Marshall Breeding
Director for Innovative Technologies and Research
Vanderbilt University
http://staffweb.library.vanderbilt.edu/breeding

Redefining Libraries:Web 2.0 and other Challenges

May 2007 Xiamen, China

Page 2

From www.openarchives.org

Standards for Repository Interoperability

The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. OAI has its roots in the open access and institutional repository movements. Continued support of this work remains a cornerstone of the Open Archives program. Over time, however, the work of OAI has expanded to promote broad access to digital resources for eScholarship, eLearning, and eScience.

Page 3

Model for federated search

Harvest metadata from multiple repositories
Index metadata on a central service
Offer a search interface on the central service
Deliver access to digital objects on the original repositories
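The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the repository names, record fields, and example.org URLs are invented, and the harvest step is replaced by data that is already in memory.

```python
from collections import defaultdict

# Hypothetical metadata, as it might look after harvesting two repositories.
HARVESTED = {
    "repo-a": [
        {"id": "a1", "title": "Metadata harvesting basics",
         "url": "http://repo-a.example.org/a1"},
        {"id": "a2", "title": "Digital library services",
         "url": "http://repo-a.example.org/a2"},
    ],
    "repo-b": [
        {"id": "b1", "title": "Harvesting theses and dissertations",
         "url": "http://repo-b.example.org/b1"},
    ],
}


def build_index(harvested):
    """Central index: map each title word to the records that contain it."""
    index = defaultdict(list)
    for repo, records in harvested.items():
        for record in records:
            for word in set(record["title"].lower().split()):
                index[word].append((repo, record))
    return index


def search(index, term):
    """Search the central index; results link back to the original repositories."""
    return sorted(record["url"] for _repo, record in index.get(term.lower(), []))


index = build_index(HARVESTED)
hits = search(index, "harvesting")
```

The search never contacts the repositories; only the links in the results lead back to them.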

Page 4

Organization / Structure

Data Providers
– Contribute metadata and content
– Run a module that responds to requests for metadata
Service Providers
– Harvest metadata from one or more repositories
– Operate services that request metadata from data providers

Page 5

Not MetaSearch

Avoids pitfalls of “distributed-query” approach to federated search
Centralized, consolidated indexes allow for fast searching, sorting, ranking
Not dependent on persistent connections with multiple repositories

Page 6

Open Archives Initiative

Initiated in the pre-print servers for scholarly articles and research papers
Need to provide federated search model rather than having each researcher search each repository separately
Value-added services
Santa Fe convention – Oct 1999
Need to develop a protocol to create interoperability among many different types of repositories, each with different types of documents and metadata

Page 7

OAI-PMH

Open Archives Initiative Protocol for Metadata Harvesting
XML protocol for harvesting metadata
Uses unqualified Dublin Core as default
Each community can specify its own metadata formats and profiles
A simple protocol designed to have a low threshold of difficulty for implementation
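As an illustration of the default format, the sketch below reads the unqualified Dublin Core fields out of one metadata block using only the Python standard library. The two namespace URIs are the ones used for the oai_dc format; the sample record and the function name are invented for this example.

```python
import xml.etree.ElementTree as ET

DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# Invented sample record in the oai_dc (unqualified Dublin Core) format.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An invented sample title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:creator>Roe, Richard</dc:creator>
  <dc:date>2007-05-01</dc:date>
</oai_dc:dc>"""


def dublin_core_fields(xml_text):
    """Collect Dublin Core elements into {name: [values]}; elements may repeat."""
    root = ET.fromstring(xml_text)
    prefix = "{" + DC_NAMESPACE + "}"
    fields = {}
    for element in root:
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            fields.setdefault(name, []).append(element.text)
    return fields


record = dublin_core_fields(SAMPLE)
```

Values are kept as lists because any Dublin Core element may occur more than once in a record.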

Page 8

OAI-PMH Requests / Response

Identify
ListIdentifiers
ListMetadataFormats
ListRecords
ListSets
GetRecord
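Each of these requests is an ordinary HTTP request whose `verb` parameter names the operation. The sketch below builds such request URLs; the base URL and the record identifier are hypothetical, and the helper is a simplification that checks only the required arguments and does not cover resumption-token requests.

```python
from urllib.parse import urlencode

# Hypothetical base URL of a data provider.
BASE_URL = "http://repo.example.org/oai"

# Required arguments for each of the six verbs (optional ones are not checked).
REQUIRED = {
    "Identify": (),
    "ListMetadataFormats": (),
    "ListSets": (),
    "ListIdentifiers": ("metadataPrefix",),
    "ListRecords": ("metadataPrefix",),
    "GetRecord": ("identifier", "metadataPrefix"),
}


def build_request(base_url, verb, **arguments):
    """Return the request URL for a verb, rejecting unknown or incomplete requests."""
    if verb not in REQUIRED:
        raise ValueError(f"unknown verb: {verb}")
    missing = [name for name in REQUIRED[verb] if name not in arguments]
    if missing:
        raise ValueError(f"{verb} is missing: {', '.join(missing)}")
    return base_url + "?" + urlencode({"verb": verb, **arguments})


identify_url = build_request(BASE_URL, "Identify")
get_record_url = build_request(
    BASE_URL, "GetRecord",
    identifier="oai:example.org:1", metadataPrefix="oai_dc",
)
```

The `oai_dc` prefix selects the unqualified Dublin Core format that every repository is expected to support.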

Page 9

A standard protocol

Initial 1.1 version in July 2001
Version 2.0 completed in June 2002

Page 10

Operational considerations

Initial contact of an OAI harvester to a repository may involve a massive transfer of records
Resumption token provides a mechanism for the repository to interrupt the transfer
– Harvester may come back at a later time to resume the transfer
Subsequent requests involve only added and changed records
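The harvesting loop implied by these points can be sketched as follows. To keep the example runnable without a network, `fake_repository` is an invented in-memory stand-in for the HTTP request, and its token format is made up; real tokens are opaque strings chosen by the repository. The parameter names `verb`, `metadataPrefix`, `from`, and `resumptionToken` are the protocol's own.

```python
# Invented records: (identifier, datestamp).
RECORDS = [
    ("oai:example.org:1", "2007-01-10"),
    ("oai:example.org:2", "2007-02-15"),
    ("oai:example.org:3", "2007-03-20"),
    ("oai:example.org:4", "2007-04-25"),
    ("oai:example.org:5", "2007-05-30"),
]
PAGE_SIZE = 2


def fake_repository(params):
    """Stand-in for an HTTP request; returns (records, resumption_token)."""
    if "resumptionToken" in params:
        from_date, offset = params["resumptionToken"].split("|")
        offset = int(offset)
    else:
        from_date, offset = params.get("from", ""), 0
    matching = [r for r in RECORDS if r[1] >= from_date]
    page = matching[offset:offset + PAGE_SIZE]
    next_offset = offset + PAGE_SIZE
    token = f"{from_date}|{next_offset}" if next_offset < len(matching) else None
    return page, token


def harvest(fetch, from_date=None):
    """Request pages until no resumption token is returned."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date:
        params["from"] = from_date  # only records added or changed since then
    harvested, requests = [], 0
    while True:
        page, token = fetch(params)
        requests += 1
        harvested.extend(page)
        if not token:
            return harvested, requests
        # A resumption request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}


full, full_requests = harvest(fake_repository)
update, update_requests = harvest(fake_repository, from_date="2007-04-01")
```

In this toy setup the initial harvest takes three requests for five records, while the later update needs a single request for the two records dated after the cut-off.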

Page 11

OAI in action

Widely implemented
Part of the standard infrastructure for pre-print servers, institutional repositories, etc.
– DSpace
– Fedora
Pragmatic approach for any application needing to transfer records between systems
Some service providers use additional protocols

Page 12

Examples

Networked Digital Library of Theses and Dissertations
The American South
– AmericanSouth.org

Page 13

Implementations

Tools available in many different programming languages and environments
Built into many digital library applications

Page 14

For detailed information:

http://www.openarchives.org/
Understanding the Protocol for Metadata Harvesting of the Open Archives Initiative (Marshall Breeding, Computers in Libraries, Sept 2002)
http://www.librarytechnology.org/ltg-displaytext.pl?RC=9944