Peter Tindemans, Geneva, 10-07-06 1

Creating a pragmatic pan-European framework for permanent access to the records of science

Dr. Peter Tindemans

Chairman Task Force Permanent Access

Summary

Three ICT infrastructures: networks, high-performance computing, GRIDs

A fourth will affect science as profoundly: an ‘infrastructure’ to provide long-term preservation of and access (“P&A”) to the Records of Science (and digital heritage in general)

What does it look like? How should we build it?

Major stakeholders from science, libraries and archives offer their strategic commitment to national governments and the EU to create sufficient momentum in 3-5 years

Overview

1. Background: including Records of Science in Digital Heritage: Task Force Permanent Access.

2. The problem and range of technical solutions required

3. High-level, strategic, pragmatic approach: need and essence

4. What does the ‘European Digital Infrastructure for Preservation of and Access to Records of Science’ look like?

5. Alliance in the making

6. Financial challenge

7. Some issues

1. Background

Documents (+ images): libraries (+ NASA, ESA)

Audiovisual media, cultural heritage: broadcasting organisations, museums, ...

Scientific and operational data: labs, communities, scientists, service providers, ...

Hence ‘curation’

Digital libraries, digital archives, digital repositories

Preservation or perennial access

Background in terms of process and getting political attention

Political attention for preservation focused on cultural heritage and on libraries (legal deposit)

Recently, inclusion of records of science in digital cultural heritage evolved from ‘records of history of science’ to ‘records of science in operation’; this concerns not just scope, but also nature of records: ‘data’ next to ‘documents’ (and other cultural physical artefacts)

A particular culmination point was the EU Conference “Permanent Access to the Records of Science” (National Library of the Netherlands, KB; Netherlands EU Presidency), 1 November 2004, The Hague.

Participants agreed on the need to create a European infrastructure for long-term preservation of and permanent access to the records of science. The KB was urged to create a Task Force.

Composition Task Force

Bertil Andersson, Chief Executive, European Science Foundation

Lynne Brindley, Chief Executive, The British Library

Wim van Drimmelen, Director General, Koninklijke Bibliotheek

Norbert Kroo, Secretary-General, Hungarian Academy of Sciences

Wolffried Stucky, professor, Institute of Applied Informatics and Formal Description Methods, Karlsruhe University; curator, Max Planck Institute of Computer Science, Germany

Malcolm Read, Executive Secretary, Joint Information Systems Committee, UK

Vincenzo Beruti, ESA/ESRIN

John Wood, Chief Executive, Council for the Central Laboratory of the Research Councils, UK

Peter Hendriks, Board, Springer Science and Business Media; Executive Board, International Association of Scientific, Technical and Medical Publishers

Tomas Lidman, Director General, The National Archives of Sweden

Peter Tindemans, chair, on behalf of the Koninklijke Bibliotheek

(The composition reflects both ‘data’ and ‘documents’.)

2. Problem

Science angle

Individual scientist: maintaining and accessing databases built up by individuals (e.g. the Maddison database on GDP); requirements of journals and funders with regard to supplementary or original data; a new cultural paradigm challenges individuals, universities, funders, etc.

Large research organisations and communities (CERN, ESA, ...): volume of data

European social sciences data archives

Libraries and archives angle (“perennial storage”)

Acidification threatened paper; obsolescence and volume explosion jeopardise digital heritage

Other cultural heritage organisations

Similar to libraries and archives

Dimensions of the problem

Technical: digital data are unstable and perishable; migration (or other techniques to ensure permanent accessibility when software and hardware change); constant care and intervention; interoperability; volume

Economic: cost estimates; business model; public-good nature; relation to Open Access; build into normal R&D funding model

Digital rights/access management

Organisational, including a ‘data model of the world’ (producing and (re-)using data)

Range of solutions required: R&D&D programme for curation in general, preservation in particular

Storing Petabytes to 100s of Petabytes to Exabytes, surviving changes in hardware and software technologies, retrieving information

Standardised approach to describe information (metadata) and management of information as successive ‘virtualisation’ layers (hardware, data, knowledge, workflows, trust, management) to enable fully automated, distributed solutions

Complex dynamic datasets and databases

Legal solutions for digital access and rights management

Economic business models based on value-chain analysis and public-good aspects

Technical tools, e.g. to overcome ‘museum of old ICT technologies’, and cumbersome migration

Digital preservation methods suggested (Thibodeau, 2002)

What has happened?

Documents (+images): research libraries (esp. US), deposit libraries (esp. Europe: BL, KB)

Data: individual labs, scientists, networks of archives

Some national efforts: UK, e.g. JISC, Digital Preservation Coalition (+ Digital Curation Centre), Germany (NESTOR) emerging, Netherlands

Some EU-funded projects: scattered, focus often on co-ordination

KB: back-up arrangements with several large scientific publishers

Some global co-operation, some efforts at standardisation

3. High-level, strategic, pragmatic approach: need

Increasing awareness among experts, sometimes institutions about size and complexity of problem of P&A.

Many projects, standards, good practices, etc.

But:

No recognition in Europe that preserving and making accessible the digital heritage on very long time scales is a strategic issue for organisations (with few exceptions), governments, as well as many private-sector parties

No financing mechanism

In the USA, since 2002, the National Digital Information Infrastructure and Preservation Program: Library of Congress working together with NSF, research libraries, archives, etc.; 100 M$ to start with; recently NARA got 300 M$ (emphasis still on documents).

High-level approach: essence

1. Make ‘digital heritage’ stakeholders understand at ‘board level’ the economic and cultural importance of P&A for their strategic development

2. Involve public and private parties: essential to find a business model based on private and user interests and cost allocations; public infrastructure: important ‘public good’ aspect

3. Adopt non-technical ‘model of world’ as basis for the ‘infrastructure’

4. Adopt practical way ahead

a. Where is highest impact possible?

b. Involve initially not too few, not too many stakeholders

c. Connect to ongoing activities: don’t replace, but integrate responsibilities

a. Highest impact: focus on Records of Science, taken in broad sense

Two worlds

1. Cultural heritage. Begins to include digital heritage: UNESCO; ‘memory institutions’: archives, deposit libraries, museums. Science = ‘records of history of science’. Politically increasingly visible: UNESCO, EU

2. Records of science. S&T in digital age: ‘data’ next to ‘documents’; small part to spill into traditional archives. ‘Science’ = S + T; NSE + BMS + SSH. Large-scale data collection for operational services and science (meteorology, GIS, census, …); experiments + observations + simulations + surveys + census and poll + history records; data also includes ‘enriched’ and ‘curated’ data: “knowledge preservation”

Gearing up best done by focusing on ‘Records of Science’

Greatest momentum: inherent needs of scientific community and organisations; high ‘specific mass’ (including financial mass)

Covers broad field: academic and deposit libraries, scientific publishers straddle two worlds; archives linked to e.g. historical, social and economic sciences

b. Not too few, not too many

Common European approaches:

‘Call for tender’ for projects

All-inclusive approach: all stakeholders from 25 member states plus Commission, resolutions, communications, agency, ….

Instead focus on critical mass of stakeholders and focused action, i.e.:

Emphasis on preservation (though preservation cannot be separated from building digital collections)

Aim to create ‘infrastructure’

Aim to create growing consensus among and conditions for ‘communities’ and organisations and their particular preservation projects

4. Model of the world

Framework of conditions and rules of conduct

‘Communities’ produce science (particle physics; social sciences; astronomy/space science; geophysics/oceanography/earth sciences/earth observation; ..). They are different, but have similar structural elements to house the “Record of Science”:

In some disciplines, short-term role of individual researchers

‘Laboratories’

Specialised data providers

Specialised publishers or web-based archives

Specialised research libraries

A cross-cutting horizontal structure exists too:

Scientific publishers, multidisciplinary open archives

Academic research libraries

Deposit libraries

Conventional archives

All are digital archives or repositories in digital world

[Diagram: Communities A, B and C each comprise labs, special data providers, special publishers and special research libraries, each with community-specific provisions. A horizontal layer of general scientific publishers, general open archives, academic research libraries, deposit libraries and conventional archives cuts across all communities, with cross-disciplinary, cross-community conditions, mechanisms and provisions.]

Transform into framework (‘infrastructure’) of real life organisations and operating conditions (for interoperability and collaboration)

1. Identify set of core physical digital archives in limited number of initial communities, and in horizontal layer (“critical mass” and ‘high specific mass’ are essential criteria)

2. These must be OAIS-compliant to ensure proper archiving, interoperability and long-term preservation

3. Framework for metadata, framework for persistent identifiers, and a number of registries

4. Cost-effective preservation methods and services must be available

5. Common framework of principles and guidelines for management of access and rights (underlying the technical tools to implement this framework)

6. Financial mechanism for developing and testing implementation tools, techniques and services

7. a. Certification service providers, accredited according to b. a common European accreditation mechanism.
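The role of persistent identifiers and registries in item 3 can be illustrated with a minimal sketch. It assumes a purely in-memory registry; the class names, the `pid:example/0001` identifier and the archive locations are all hypothetical and do not come from the presentation or from any existing identifier scheme. The point it shows is only that an identifier stays fixed while the storage location behind it changes, for example after a migration.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveRecord:
    """Minimal descriptive metadata plus the current location of one object."""
    title: str
    creator: str
    location: str  # current storage location; expected to change over time


@dataclass
class IdentifierRegistry:
    """Hypothetical in-memory registry mapping persistent identifiers to records."""
    records: dict[str, ArchiveRecord] = field(default_factory=dict)

    def register(self, pid: str, record: ArchiveRecord) -> None:
        # An identifier is assigned once and never reused.
        if pid in self.records:
            raise ValueError(f"identifier already registered: {pid}")
        self.records[pid] = record

    def relocate(self, pid: str, new_location: str) -> None:
        # A migration changes the location, never the identifier.
        self.records[pid].location = new_location

    def resolve(self, pid: str) -> str:
        # Users cite the identifier; the registry supplies the current location.
        return self.records[pid].location


registry = IdentifierRegistry()
registry.register(
    "pid:example/0001",  # hypothetical identifier
    ArchiveRecord("Survey dataset", "Example Lab", "archive-a/objects/0001"),
)
registry.relocate("pid:example/0001", "archive-b/objects/0001")
resolved = registry.resolve("pid:example/0001")
```

A production registry would add persistence, access control and richer metadata; the sketch only separates the stable name from the changeable location.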

5. An Alliance in the making

Aims

Establish wide consensus on framework (‘infrastructure’) for LTPA; initial focus on science

Accelerate significantly creation of its main building blocks

Work with national governments and EU to strengthen European strategies, policies and their implementation

Strengthen role of European parties world-wide

Articulate and maintain ongoing R&D&D programme

3-5 years

A “Rolling Stone”

Tasks

Assisting communities: initial core set and others

Enhancing and consolidating consensus on the building blocks of the ‘infrastructure’

Helping establish European funding mechanism

Helping establish European accreditation mechanism

Liaising with national governments and EU

Promoting sustainable business models

Raising awareness: funding bodies, professional societies, universities, …..

Core Alliance Partners

European Science Foundation

Some of most active libraries: British Library, KB

Some major scientific organisations: ESF, ESA, CERN, EMBL (EIROFORUM), CCLRC, Max Planck Gesellschaft, CESSDA are among those approached

Association of Scientific, Technical and Medical Publishers

Some major national archives

JISC, ‘National coalitions’ for P&A, where they exist: UK, Germany, ….

Corporate associate members (e.g. ICT industry): ‘Customer-contractor’ principle

Strengthening emerging consensus; building on what is being done

Conceptualisation and standardisation, e.g.:

OAIS

Dublin Core Metadata Initiative (but still very much library/document-oriented)

Draft Audit Checklist for Certification of Trusted Digital Repositories (RLG, NARA plus European experts)

Practical development and implementation, e.g.:

Several EU-funded projects (but too much focus on co-ordination); important new ones: DRIVER, CASPAR (with e.g. CCLRC, ESA-ESRIN)

Strong national projects (but in few countries only); e.g. DARE (Netherlands)

Public-private agreements (e.g. libraries and publishers)

Audit and Certification of Digital Archives Project (CRL) to test audit 3 archives

6. Financial model

1. Need to create European, but strongly distributed infrastructure;

2. Need to make Europe visible, strong partner in global efforts

Therefore:

Partners continue current efforts and investments

Partners contribute to establish small European organisation to co-ordinate Alliance efforts

‘100 M€’ for the real action, for a European funding mechanism: not to be disbursed by the Alliance; for developments in communities the Alliance will work with; to create the enabling conditions

Leveraging national and further European funding (central + decentralised funding totals to build this infrastructure are much, much higher than 100 M€)

Practicalities about the Alliance

Members: leading national or international organisations

Strategic allies: national coalitions or competence networks; commercial companies or vendors

Board; Director and some staff

Office in Brussels (at ESF’s COST office?)

Budget 3 years: 1.8 M€

Per partner: ~ 75 k€
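The budget figures leave the number of contributing partners implicit. The sketch below works through the arithmetic under two assumptions that are not stated on the slide: that partner contributions are the only source of income, and that the ~75 k€ figure is either an annual contribution or a total over the three years. Both readings are computed because the slide does not say which is meant.

```python
TOTAL_BUDGET_EUR = 1_800_000  # "Budget 3 years: 1.8 M€" (from the slide)
PER_PARTNER_EUR = 75_000      # "Per partner: ~ 75 k€" (from the slide)
YEARS = 3


def implied_partners(total: float, per_partner: float, years: int, per_year: bool) -> float:
    """Number of partners needed to cover the total budget.

    per_year=True treats per_partner as an annual contribution,
    per_year=False treats it as a single contribution for the whole period.
    """
    contribution = per_partner * years if per_year else per_partner
    return total / contribution


# Reading 1 (assumption): 75 k€ per partner per year
partners_if_annual = implied_partners(TOTAL_BUDGET_EUR, PER_PARTNER_EUR, YEARS, per_year=True)

# Reading 2 (assumption): 75 k€ per partner over all three years
partners_if_total = implied_partners(TOTAL_BUDGET_EUR, PER_PARTNER_EUR, YEARS, per_year=False)

print(partners_if_annual, partners_if_total)
```

Under the annual reading the budget corresponds to 8 partners, under the whole-period reading to 24.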

Workplan

Year 1

Interface EU (FP7), ESFRI, national organisations

Facilitate information sharing about preservation approaches and support infrastructure (standards, authentication, registries, metadata capture mechanisms, ..)

Gathering cost information

Involvement in on-going drafting of archive certification standard

Identifying resources for science-driven interoperability as potential basis for automated interoperability

Year 2 (apart from continuing interfacing)

Shared persistent identifiers scheme

Prototype interoperable search and discovery tools supported by common data models

Certification standard ready for submission to ISO; preliminary work on accreditation organisation

Some alignment of operating practices and use of Digital Rights Management and Authentication and Authorisation systems

Year 3 (apart from continuing interfacing)

Prototyping and testbed activities to put some into production use (e.g. applications to find and combine data and relevant publication material, supported by shared catalogues and data models, single sign-on access to non-public data, etc.)

Co-operation on large-scale storage solutions (exabyte)

Finalise business model for accreditation system

Year 4 + (only after evaluation)

Advanced development of interoperable virtualisation layers

7. Some Issues

‘Raw’ data, so far much focus on documents

Model of world primarily based on international communities, or on national approaches (cf. networking with NRENs, connected via GEANT)

ALLIANCE to be set up as corporation or e.g. as consortium

How to get EU involved in a strategic (not individual project-based) and internally co-ordinated approach?