Digital Curation Centre
a centre of expertise in data curation and preservation

David Giaretta, Associate Director (Development)
for Chris Rusbridge (Director)

Funders:

Development


• DCC: Why?

• DCC: What?

• DCC: Where?

• DCC: Who?

• DCC: How and When? – Progress

• How can we help – each other?


Time for Digital Curation

• problem of the moment
– fragility of digital information recognised
– data curation & data deluge in e-science/research
– longevity of digital heritage & research investment
• re-examining ‘Communication’ in ICT
– Internet and GRID: communication across space, with utmost accuracy
– Digital Curation: communication across time, with utmost accuracy
• ensure Content travels despite turbulence of IT
– agree strategies & methods for digital preservation


Unifying Themes for the DCC

• ‘data as evidence’
– for understanding and decision
– for one or more designated communities
• ‘archival responsibility’
– at one or more institutional levels
– institutional policies & individuals’ competence
– legal compliance & agreement on procedures
• turn ‘open access’ into ‘continuing access’
• turn costs into investment
– valuing flow of benefit from re-usable assets


• DCC: Why?

• DCC: What?

• DCC: Where?

• DCC: Who?

• DCC: How and When? – Progress

• How can we help – each other?


Digital Curation

• preservation and use/interoperability
– preservation is interoperability with the future
• bits and information
• libraries and science data
• digital information rendered for human reading and automated processing
– may be transient distinction but…


Aims & Objectives for the DCC

‘quality improvement in data curation & digital preservation’

initial focus: data as evidence for scholarly conclusions

wider remit: scholarly communication & eLearning

• ‘excellence in research & excellence in service’
• working with repositories, rather than being one
• ‘connecting communities’ via Associates Network
– universities & research institutes
– scientific data tradition & document tradition
– international & cross-sectoral


• DCC: Why?

• DCC: What?

• DCC: Where?

• DCC: Who?

• DCC: How and When? – Progress

• How can we help – each other?


Organisation to Engage & Collaborate

(Diagram: the DCC sites and the groups they engage with)

• sites shown: UKOLN, U of Edinburgh, CCLRC, U of Glasgow, U of Edinburgh
• groups engaged: industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; curation organisations, eg DPC
• Collaborative Associates Network of Data Organisations


Organisation to Engage & Collaborate

(Diagram: the DCC functions and the groups they engage with)

• functions shown: community support & outreach; research; development co-ordination; service definition & delivery; management & admin support
• groups engaged: industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; curation organisations, eg DPC
• Collaborative Associates Network of Data Organisations


• DCC: Why?

• DCC: What?

• DCC: Where?

• DCC: Who?

• DCC: How and When? – Progress

• How can we help – each other?


Organisation to Succeed

Phase One leadership over first eight months of funding

• Community Support & Outreach
– Led by Dr Liz Lyon (UKOLN, University of Bath)
• Service Definition & Delivery
– Led by Professor Seamus Ross (HATII [ERPANET], University of Glasgow)
• Development
– Led by Dr David Giaretta (Astronomical Software & Services, CCLRC)
• Research
– Led by Professor Peter Buneman (Informatics, University of Edinburgh)
• Management & Co-ordination
– Director Chris Rusbridge
– Peter Burnhill had been Phase One Director

‘Ex Portfolio’: Malcolm Atkinson (NeSC)


(Diagram: organisations and organisation types; the layout is not recoverable, so each name is listed once.)

Organisation types: Research Councils; HEIs & FE; Research Institutes; International Collaborations; Standards Bodies; Council for Museums, Archives & Libraries

Organisations: CCLRC, UKOLN, UofG, UofE, CMS-Bristol, NIEeS, RG, Durham, WT-CFG, Leicester, IC, Maastricht, Oxford, Dutch NA, Swiss NA, Urbino, UNC, Salzburg, SDSC, NEODC, CEH, RI, NCS, RLG, Innogen, NHS, Capri, NTUA, INRIA, HUJ, UPC, Max-Planck, MIMAS, IASSIST, LDC, ACM, Data Archive, EDG, GridPP, EGEE, Cambridge, Jodrell Bank, DLI (US), DPC, DELOS, ESA, NASA, NARA, CNES, BNSC, TU Vienna, UPenn, EBI, MRC HGU, Kyoto, USC, GSK, Roslin, IBM Almaden, JHU, CSIRO, Caltech, CDS, ESO, OCLC, AHDS, Microsoft, IBM, Oracle, BT, STK, BADC, BODC, IVOA, ILRT, RDN, So’ton, OAI, NOF, NLA, NeSC


• DCC: Why?

• DCC: What?

• DCC: Where?

• DCC: Who?

• DCC: How and When? – Progress

• How can we help – each other?


Outreach

• User interviews and focus groups

• Internet Journal

• Web presence (http://www.dcc.ac.uk) and Portal

• DPC membership and collaboration

• Associates Network

• DCC Conference (Sept 29-30)

• PV2005 Conference (Nov 21-23)


Engage Communities of Practice

• with those who have responsibility
• … to invoke/provoke good practices
– appraisal & retention/disposal
– logical & physical integrity: authenticity/security
• place research in productive research domains
– eg Informatics, Law School, e-Science ...
• work on the ‘R&D’, create services of relevance
– achieve ‘virtuous circle’
– turn products of research into tools for use


Services

• Advisory service and Help desk

• Site visits and case studies

• Curation Manual and Briefings

• Tools and testbeds

• Standards watch

• Certification

• Training


Development

• OAIS fundamentals
• Registries/Repositories for Representation Information
– offering a repository of tools and technical information, a focal point for digital curators
– metadata standards
• Testbeds
– for testing and evaluating tools, methods, standards and policies in realistic settings
• Certification
– standards


OAIS Reference Model – Functional Model

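(Diagram: OAIS functional entities, Figure 4-1 of the OAIS Reference Model)

• environment: PRODUCER, CONSUMER, MANAGEMENT
• functional entities: Ingest, Archival Storage, Data Management, Administration, Preservation Planning, Access
• flows: SIP (Producer to Ingest); AIP (Ingest to Archival Storage, Archival Storage to Access); Descriptive Info (Ingest to Data Management, Data Management to Access); queries, result sets and orders (between Consumer and Access); DIP (Access to Consumer)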
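
The package flow in the functional model above can be illustrated with a small sketch. This is not DCC or OAIS code: the class `ToyArchive`, its method names and the use of a SHA-256 digest as both identifier and fixity value are assumptions made only for illustration. It shows a Producer's SIP becoming an AIP in Archival Storage, with Descriptive Info going to Data Management, and a Consumer's query and order returning a DIP.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class SIP:
    """Submission Information Package sent by a Producer."""
    content: bytes
    description: str


@dataclass
class AIP:
    """Archival Information Package held in Archival Storage."""
    aip_id: str
    content: bytes
    fixity: str  # SHA-256 of the content, recorded at ingest (illustrative choice)


@dataclass
class DIP:
    """Dissemination Information Package delivered to a Consumer."""
    content: bytes
    description: str


class ToyArchive:
    """Hypothetical, in-memory stand-in for Ingest, Storage, Data Management and Access."""

    def __init__(self):
        self.archival_storage = {}  # aip_id -> AIP
        self.data_management = {}   # aip_id -> Descriptive Info

    def ingest(self, sip):
        """Ingest: turn a SIP into an AIP and record its Descriptive Info."""
        digest = hashlib.sha256(sip.content).hexdigest()
        aip_id = digest[:12]
        self.archival_storage[aip_id] = AIP(aip_id, sip.content, digest)
        self.data_management[aip_id] = sip.description
        return aip_id

    def query(self, term):
        """Access: return the ids whose Descriptive Info mentions the term."""
        return [i for i, d in self.data_management.items() if term.lower() in d.lower()]

    def order(self, aip_id):
        """Access: check fixity, then build a DIP from the stored AIP."""
        aip = self.archival_storage[aip_id]
        if hashlib.sha256(aip.content).hexdigest() != aip.fixity:
            raise ValueError("fixity check failed")
        return DIP(aip.content, self.data_management[aip_id])


archive = ToyArchive()
aip_id = archive.ingest(SIP(b"ra,dec\n10.5,-3.2\n", "Star catalogue, CSV"))
dip = archive.order(archive.query("star")[0])
```

Administration and Preservation Planning are deliberately left out; a real archive would also carry Representation Information and preservation metadata with each AIP.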


Representation Net
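
In OAIS terms, Representation Information may itself need further Representation Information, and the chain stops at what the designated community already understands. A minimal sketch of walking such a net through a registry follows; the registry layout, the function name `representation_net` and the example entries are hypothetical and do not describe the DCC registry.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical registry: name -> (description, names of further Representation Information)
Registry = Dict[str, Tuple[str, List[str]]]


def representation_net(registry: Registry, name: str, known: Set[str],
                       _seen: Optional[Set[str]] = None) -> List[str]:
    """List, depth first, the registry entries needed to interpret `name`.

    `known` is what the designated community already understands, so the
    walk stops there. Entries already visited are skipped, which also
    protects against cycles in the registry.
    """
    seen = set() if _seen is None else _seen
    if name in known or name in seen:
        return []
    if name not in registry:
        raise KeyError(f"no Representation Information registered for {name!r}")
    seen.add(name)
    _, dependencies = registry[name]
    needed = [name]
    for dep in dependencies:
        needed += representation_net(registry, dep, known, seen)
    return needed


registry: Registry = {
    "FITS": ("FITS standard document", ["PDF", "English"]),
    "PDF": ("PDF specification", ["English"]),
}
needed = representation_net(registry, "FITS", known={"English"})
```

Here a community that reads English still needs the entries for FITS and PDF; a community with an empty knowledge base would hit the `KeyError`, which flags a gap in the registry.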


Layered Model from OAIS


Knowledge Based Persistent Archive

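(Diagram: knowledge-based persistent archive; labels recovered from the figure, exact layout not recoverable)

• levels: Knowledge, Information, Data
• columns: Ingest Services, Management, Access Services
• access layer: (Topic Maps / Buckets / Model-based Access)
• data handling layer: (Data Handling System - SRB / FTP / HTTP)
• repositories: Knowledge Repository for Rules; Information Repository; Storage (Replicas, Persistent IDs)
• content labels: Relationships Between Concepts; Attributes, Semantics; Fields, Containers, Folders
• query types: Knowledge or Topic-Based Query / Browse; Attribute-based Query; Feature-based Query
• interface labels: MCAT/HDF, Grids, XML DTD, SDLIP, XTM DTD, Rules - KQL
• bottom labels: Process, Infrastructure, Process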


Working with Others

• Digital Library Federation
• The National Archives
• Global Grid Forum
• NARA
• Library of Congress
• Research Libraries Group
• Digital Preservation Coalition
• JISC community
• E-Science Community
• Associates Network
• …and many many more

Development info – see http://dev.dcc.rl.ac.uk for details of the Wiki and email list, open to all


Research

• To draw together the various functions of curation, from the traditional archival functions to the maintenance and publication of evolving knowledge as seen in scientific databases.

• To identify through direct research collaboration, and through interaction with the service arm of DCC, the key projects in which research is needed.

• To conduct research in areas already identified by the partners as crucial to digital curation.

• To institute two-way conduits between research and service in which practical issues can be drawn to the attention of researchers and the products of research can be tested in practice.


Current research priorities

• Data integration and publication
• Performance and optimisation
• Annotation
• Appraisal and long-term preservation
• Socio-economic and legal context: rights, responsibilities and viability
• Cost-benefit analysis of the data curation process
• Security: safe and effective data analysis environments
• Automation of metadata extraction
• Visitors Programme and Seminar Series


How can we help - each other?

• No one knows how to “do” curation properly

• There is an overlap between DCC and other JISC projects

• We can help each other


The DCC can:

• act as a clearing house for ideas
• help to coordinate efforts
• undertake longer term research
• take particular products and generalise/package them for general use
• provide some common infrastructure elements