CLOCKSS, LOCKSS & Barrels of Stuff: Libraries and Publishers share the Task Peter Burnhill EDINA, University of Edinburgh ICOLC, Rome, 13 October 2006

CLOCKSS, LOCKSS & Barrels of Stuff: Libraries and Publishers share the Task

Peter Burnhill
EDINA, University of Edinburgh

ICOLC, Rome, 13 October 2006

Overview

1. Introduction

2. What is CLOCKSS?

3. CLOCKSS Project

4. Progress Report

5. Recap & look to the future

6. How you can engage with CLOCKSS

1. Introduction

• Director, EDINA National Data Centre (1 of 2 in UK), www.edina.ac.uk
  – Based at University of Edinburgh
  – Designated & largely funded by the JISC
  – I could say a lot more about EDINA (if given half a chance…)

• SUNCAT (national serials union catalogue), OpenURL Router, Onix for Serials, etc

• Member of Information Services Directorate (my other 50%), University of Edinburgh
  – Research-led university in Scotland’s capital city
  – Active in CURL & SCURL
  – Contributes to JISC work
  – Keen to engage in international initiatives

Could also speak of the Digital Curation Centre …

• But here to speak of CLOCKSS, a University, Research Library, Learned Society and Publisher initiative

2. What is CLOCKSS?

Public good solution to problem of global significance

How to preserve & ensure continuing access to electronic scholarly content

• Partly an organisational solution: collaboration and shared governance between Libraries & Publishers

• Partly a technical solution: LOCKSS technology

In short, it’s Controlled use of LOCKSS

So, what is LOCKSS?

“Lots of Copies Keep Stuff Safe”

Digital Preservation Infrastructure

Decentralized, Peer to Peer, Continuous Content Audit & Repair

“computers chattering away to one another across the Internet”

Open Source
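
As a rough illustration of what "continuous content audit & repair" between peers can mean, here is a minimal Python sketch. It is not the LOCKSS polling protocol: the `Peer` class, the `audit_and_repair` function and the simple majority vote over SHA-256 digests are assumptions made purely for illustration.

```python
import hashlib
from collections import Counter


class Peer:
    """A hypothetical preservation box holding its own copy of each item."""

    def __init__(self, name, store):
        self.name = name
        self.store = dict(store)  # item id -> bytes

    def digest(self, item_id):
        # Hash of this peer's copy; peers compare digests, not full content.
        return hashlib.sha256(self.store[item_id]).hexdigest()


def audit_and_repair(peers, item_id):
    """Compare digests across peers and repair copies that lose the vote.

    Returns the names of the peers that were repaired.
    """
    votes = Counter(p.digest(item_id) for p in peers)
    winning_digest, count = votes.most_common(1)[0]
    if count <= len(peers) // 2:
        # No clear majority: do nothing rather than spread a bad copy.
        return []
    good = next(p for p in peers if p.digest(item_id) == winning_digest)
    repaired = []
    for p in peers:
        if p.digest(item_id) != winning_digest:
            p.store[item_id] = good.store[item_id]  # fetch a good copy
            repaired.append(p.name)
    return repaired


peers = [
    Peer("box-a", {"article-1": b"original text"}),
    Peer("box-b", {"article-1": b"original text"}),
    Peer("box-c", {"article-1": b"original t3xt"}),  # simulated bit rot
]
repaired = audit_and_repair(peers, "article-1")
```

In this toy run the damaged copy on `box-c` is outvoted by the two matching copies and replaced. Running the audit again finds nothing to repair.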

3. The CLOCKSS Project

Two-year (2006 -) demonstrator project that intends:
• open reporting of progress & outcome
• a public demonstration that this solution really can be trusted for the long term
• scalability in terms of publisher content & library deployment

The project was first funded by its participants, now with additional NDIIPP grant support from the Library of Congress to assist reporting.

Who is in CLOCKSS?

Consortium acting on behalf of the wider community of libraries and publishers
• was 6 Libraries and 6 Publishers
  – Including learned societies acting as publishers
• now 7 Libraries and 12 Publishers
• will be …. sufficient to cover the bases

Commitment based on stewardship of libraries & responsibility of publishers

Libraries

• University of Edinburgh
• New York Public Library
• Indiana University
• Rice University
• Stanford University
• University of Virginia
• + OCLC (recently joined as the 7th)

• Aim to add more to cover ‘tectonic plates’ of all types of geography

Publishers

• Blackwell Publishing
• Elsevier
• Nature Publishing Group
• Oxford University Press
• SAGE Publications
• Springer
• Taylor and Francis
• John Wiley & Sons
• American Chemical Society
• American Medical Association
• American Physiological Society
• Institute of Physics
• + aim to add all the rest …

Equal Partners

Librarians have made a strategic decision, with publishers, that retains their role as stewards, as memory institutions

Publishers have made a strategic decision to trust and engage those libraries, committing to the prospect of continuing access

Both are exploring social and technical models over an initial two-year period, working to build a full-scale production system

Costs of the initiative are shared equally between the parties, with additional funds to support audit & reporting from NDIIPP (National Digital Information Infrastructure and Preservation Program), administered by the US Library of Congress

Agreed Mission

“CLOCKSS is a not-for-profit community partnership between publishers and libraries that is developing a distributed, validated, comprehensive archive that preserves and ensures continuing access to electronic scholarly content”

Community Governance

• Governed by both library and publisher partners
• Each partner represents an organization, but collectively the partners represent each sector
  – Libraries & Publishers (& Learned Societies?)
• No single point of failure or institutional interest will prevent long-term governance

• Consensus driven, united for support of scholarly communication over the long term

Complementary to territorial arrangements for legal deposit

Format Migration

Ingest format from publishers (during the project) of both/either:
1. as delivered to the Web
2. as XML source files

Access format is “on the fly”
• When content is requested
• Process is transparent to the reader

http://www.dlib.org/dlib/january05/rosenthal/01rosenthal.html

Reduce the cost of ingest
• allowing more material to be preserved (value for money)

Postpone costs of migration
• taking advantage both of the time value of money, and of the technology cost curve

Migrate material upon reader request
• vastly lowering amount of content that needs to be processed

Allow what the reader sees to be the result of best available technology at time of access

Preserve the original look-and-feel (which can be a large part of the value)
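
The migrate-on-access idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the LOCKSS implementation: the `CONVERTERS` registry, the `serve` function and the made-up `text/x-old` format are all hypothetical, and the converter is a toy.

```python
# Hypothetical registry: (stored format, wanted format) -> converter function.
CONVERTERS = {
    ("text/x-old", "text/plain"): lambda data: data.decode("latin-1").encode("utf-8"),
}


def serve(stored_bytes, stored_format, accepted_formats):
    """Return (bytes, format) for a reader, migrating only when needed.

    The preserved copy is never modified; conversion happens per request.
    """
    if stored_format in accepted_formats:
        return stored_bytes, stored_format  # reader can use the original
    for wanted in accepted_formats:
        convert = CONVERTERS.get((stored_format, wanted))
        if convert is not None:
            return convert(stored_bytes), wanted  # migrate on the fly
    # No converter available: fall back to the original bytes.
    return stored_bytes, stored_format


archived = "caf\xe9".encode("latin-1")  # bytes as ingested, kept untouched

body, fmt = serve(archived, "text/x-old", ["text/plain"])
```

Because nothing is converted at ingest, a better converter added to the registry later would benefit every subsequent request, and content that is never requested is never processed.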

4. Progress Report

• We are up and running, with two LOCKSS boxes per Library Partner, and one for observation at each Publisher

• We are ingesting content from the Publishers
• The (C)LOCKSS boxes are chattering away
• We meet via teleconference on a weekly basis
• We are readying to simulate our first ‘trigger event’
• We are beginning to report and preparing to audit

• The CLOCKSS is ticking …

Incoming email, 9 October 2004

Dear CLOCKSS technical group,

We are pleased to announce the release of more content to the CLOCKSS network.

Today we have released additional content for the four previously released titles, three are from Oxford University Press and one from SAGE Publications:
• Age and Ageing (OUP)
• Environment & Urbanization (SAGE)
• Journal of Experimental Botany (OUP)
• Toxicological Sciences (OUP)

CLOCKSS libraries

The CLOCKSS system works differently from the LOCKSS system. The titles will be automatically configured for your CLOCKSS boxes. You NEED NOT DO ANYTHING -- the content will be automatically ingested and preserved. … we will be paying close attention to make sure that harvesting and auditing is going smoothly. In the future we will transition some of this responsibility [to] each institution hosting CLOCKSS boxes. We will send details on how to do that in the future. We are working to develop and test plugins for titles that currently have manifest pages. We will be releasing these in the near future.

Thomas S. [email protected] Director & Technical Manager,

5. Recap & look to the future

• What’s the Problem?

• What are the Threats?

• Need for Public Good

• Trust in Library Stewardship

• Purpose of CLOCKSS

• Strategies

• Transition to full production

What’s the Problem?

The coming of the digital and the Web accidentally changed the business relationship between librarians and publishers.

• With rare exceptions, libraries no longer take physical custody of the content, but provide access to web materials

• This has disrupted the role libraries have played in society for hundreds of years as trusted keepers of information and culture

• There is concern that what is now digital may cease to be available

Our digital cultural and intellectual heritage is at risk.

What are the Threats?

Continuous and Abrupt changes

• Technology: storage media, hardware, software, formats
• Commerce: here one day, gone the next
• Organizations (even Institutions): shifting priorities, politics, staffing
• Natural disasters
• Human folly: errors and attacks

Need for Public Good

To ensure
• Content kept safe on behalf of scholarly community
• Global access to content on a continuing basis
• The ‘trigger event’

Establish a self-sustaining large dark archive
• That keeps costs low, with revenue to sustain operations and access

Trust in Library Stewardship

Decentralized, to gain leverage from existing infrastructure of libraries

• The libraries hold content, act as custodians
• On trigger event
  – Board ensures content becomes available again
• No single point of failure, nor institutional interest, will prevent provision of access

• Libraries are here for the long term

Transition to full production

Review strategies

• Replication - more copies are safer
• Migration - move copies forward in time
• Transparency - open source software
• Diversity - no single point of failure
• Audit - to confirm data really is preserved
• Sustainable economics - cost effective processes, more materials preserved per Euro/Dollar/Pound
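
To make "more copies are safer" concrete, here is a small worked calculation in Python. It assumes each copy is lost independently with the same probability over some period, which is what the diversity strategy tries to approximate. The function name and the 10% figure are illustrative assumptions, not CLOCKSS data.

```python
def prob_all_copies_lost(p_loss_per_copy, copies):
    """Probability that every copy is lost, assuming independent failures."""
    if not 0.0 <= p_loss_per_copy <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    return p_loss_per_copy ** copies


# Illustrative only: a 10% chance of losing any single copy.
one_copy = prob_all_copies_lost(0.1, 1)
seven_copies = prob_all_copies_lost(0.1, 7)
```

If failures are correlated (same site, same administrator, same software bug), the real figure is worse than this calculation suggests, which is why replication is paired with diversity and audit.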

Review Vision

Comprehensive
• How much is all?
  – Need to define scope and ambition
• Once content is in the archive it stays in the archive
  – Stock and flow
• Will be available in perpetuity for use by the (dark) archive
  – Need to ensure access by ‘loss’ trigger
  – Need to investigate ‘end-licence’ trigger of back runs?
• Will serve as a secure backup to world-wide e-copies of material

Implications of these 4 statements are being investigated during the two-year project

Purpose of CLOCKSS

to preserve content over time, & ensure there is always a prospect of service access

Being …

comprehensive of all electronic scholarly content
• with keen focus on published journal articles & the like

globally secure against regional disaster
• of all types: natural, commercial, political
• through deployment across ‘tectonic plates’ of all forms of geography

6. How you can engage with CLOCKSS

www.lockss.org/clockss/talkback

Be a clockss-watcher!

Let us have ideas & feedback; register interest to be an Associate; undertake advocacy

In turn, we will use your support to build the community archive to preserve and ensure continuing access to electronic scholarly content.

(Controlled) Lots of Copies

(to) Keep Stuff Safe

Publisher buy-in, shared governance

Many geographically distributed sites
• guard against catastrophic failure, natural or man-made

Many independently administered repositories
• guard against “insider attacks”

Many open source software contributors (eyes and minds)
• guard against technical arrogance

I should end here

I do have additional slides on:

• Look and Feel to Readers

• Fancy graphics

But I should close & invite questions

Join Us

Look and Feel to Readers

• When content is served to the user from a LOCKSS or CLOCKSS box, the look and feel is as close as possible to what the publisher published

Preserve content & presentation

Comprehensive

• Once content is in the archive it stays in the archive

• Will be available in perpetuity for use by the archive

• Will serve as a secure backup to world-wide e-copies of the material