Roles & Responsibilities. We Must All Be Curators Now: from Ingest to Service Delivery, in Data Library & National Data Centre

Peter Burnhill Director, EDINA JISC National Data Centre, University of Edinburgh, Scotland UK


Page 1: Peter Burnhill Director, EDINA  JISC National Data Centre,  University of Edinburgh, Scotland UK


We Must All Be Curators Now

from Ingest to Service Delivery, in Data Library & National Data Centre

Peter Burnhill

Director, EDINA

JISC National Data Centre, University of Edinburgh, Scotland UK

10 October 2006

Roles & Responsibilities

Page 2

Three different voices / roles

1. Director, EDINA National Data Centre
– serving researchers, lecturers and students across the UK
* so something about what EDINA is & what EDINA does
– EDINA is funded by the JISC
* so something about the JISC & the JISC IE

2. A time-served data person & fellow professional, from the University of Edinburgh
– building on the past, planning for the future

3. A substitute for another guy …
– trying to make sense of what is going on
– working towards shared understanding
– proposing a framework of verbs & nouns

Page 3

Joint Information Systems Committee (JISC) …

… of all the UK funding councils for higher and further education

Mission:

“world-class leadership in the innovative use of ICT for support of education & research”

(ICT: Information and Communication Technology)

Income mix of ‘top-slice’ recurrent funding + capital grants

Page 4

Funding Councils, the JISC and EDINA

UK National Data Centres

Higher Ed funding councils

Further Ed funding bodies (Learning & Skills Council)

Research Councils as ‘Partners’

NDCs are now HEFCE-related bodies

Page 5

organisational infrastructure for JISC Services

• UKERNA – runs Joint Academic Network (JANET)

• EDINA & MIMAS – national data centres

+

• Arts & Humanities Data Service (AHDS)

• Economic and Social Data Service (ESDS)

+

• UKOLN; Centre for Educational Technology Interoperability Standards (CETIS); Digital Curation Centre (DCC); British Universities Film & Video Council (BUFVC); Technical Advisory Service on Images (TASI); Open Source Advisory Service; Nat. Centre for Text Mining; Plagiarism Advisory Service

• JISC Legal/Monitoring/…TechDis ; Regional Support Centres; UK Access Management / Athens

* most located in universities across UK *

Page 6

What is EDINA?

• A National Data Centre, designated by the JISC in 1995/96
– based on Edinburgh University Data Library, est. 1983/84

Mission to enhance productivity of research, learning & teaching in UK higher and further education

• part of JISC Information Environment
– Keywords have been Accessibility/Outreach/Inter-working/Inter-operability …

• range of development projects and 24/7 services
– Geo-spatial, about which more later ..
– Scholarly communication & Multimedia
* films & images; spoken word
– Infrastructure for Digital Library
* certificates; rights; middleware
* SDSS -> UK Access Management Federation

• And the name, what’s that stand for?
– Edinburgh Data Information Access
– ‘Edina’ is the poetic name for Edinburgh …

Page 7

Delivering online services, 24/7 …

http://edina.ac.uk


Page 9

Biog: as data person these past 25+ years …

• Moved to the University of Edinburgh in 1979
– formerly science staff at Social Science Research Council (ESRC), 1974/77
– then medical statistician at Queen Charlotte’s Maternity Hospital, 1978/79

• first as statistician & researcher (& senior lecturer)
– with Scottish Education Data Archive, from 1979
* making survey data at Govt-funded research centre (CES)
– from design, data creation and documentation, onto analysis
* as survey methodologist in Edinburgh Survey Methodology Group

• then recruited to do R&D for service delivery
– setting up & managing Edinburgh University Data Library, 1984 -
– Co-director, ESRC Regional Research Laboratory, Scotland 1986/90
* early days of Geographical Information Systems (GIS)
* member of Data Task Force, Inter-Agency Global Env. Change
– European Secretary (1993/95); President (1996/2001) of IASSIST
* international assoc. for (social science) data librarians and archivists

• Now EDINA & IS Directorate at Univ. of Edinburgh

Was Set-up Director for Digital Curation Centre, 2003/4 to 2004/5

Page 10

• Scottish Education Data Archive, late 1970s – mid ‘80s
– Database of surveys of school leavers & cohorts of young people (16-19)
* derived data, trend datasets over time, changing classifiers (eg Social Class)
– integrating data from different sources, eg census ‘small area’ statistics
– made available online but under ‘privileged’ not ‘open’ access

• Edinburgh University Data Library, mid-‘80s & on
– Wider variety of datasets, obtained from others, often via others
* A ‘local’ library of datasets
* Easing access to data held elsewhere (eg UKDA)
– made available online across ERCC wide area network and beyond
* building databases, sometimes with special software

• ESRC Regional Research Laboratory, Scotland 1986/90
– early days of Geographical Information Systems (GIS)
– Integrating ‘large-scale’ data, much geographic or geo-spatial

• EDINA national data centre, mid-1990s & on
– National online access to wider range of reference and source data
* obtained under licence
– required value-added ‘curation’
* Digimap as but one example

… maybe I’ve been a ‘data curator’ all along

Page 11

one example of ‘data curation’

[Figure: OS digital data shown as raw fixed-width records, e.g. “11000152100913Playing Field 0901103 120001016400000%…”, with panels labelled:]

Software + default rules

Software + application of cartographic skill/rules

Value added component

Page 12

• Scottish Education Data Archive, late 1970s – mid ‘80s
– Database of surveys of school leavers & cohorts of young people (16-19)
* derived data, trend datasets over time, changing classifiers (eg Social Class)
– integrating data from different sources, eg census ‘small area’ statistics
– made available online but under ‘privileged’ not ‘open’ access

• Edinburgh University Data Library, mid-‘80s & on
– Wider variety of datasets, obtained from others, often via others
* A ‘local’ library of datasets
* Easing access to data held elsewhere (eg UKDA)

• ESRC Regional Research Laboratory, Scotland 1986/90
– early days of Geographical Information Systems (GIS)
– Integrating ‘large-scale’ data, much geographic or geo-spatial

• EDINA national data centre, mid-1990s & on
– National online access to wider range of reference and source data
* obtained under licence
– required value-added ‘curation’
* Digimap as but one example
– national repositories of digital content: Jorum, GRADE, TheDepot

• Digital Curation Centre, 2004 & 2005
– strategic role: ‘data curation’ & ‘digital preservation’
– even wider range of databases (e-science), held by others

… maybe I’ve been a ‘data curator’ all along

Page 13

[Diagram labels: Data Provider (e.g. Ordnance Survey); Licensing Agent (JISC Collections); HE & FE funding councils; Institution (Licence); Value-added Service Provider; end user (staff/student); access; payments marked £, ££, £]

Authorising Institutions for free-at-point of use

Key role for Authentication (is-member of Institution) and Authorisation (is-licensed Institution)
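
The two checks named above can be sketched in a few lines. This is only an illustration under stated assumptions: the `User` and `Licence` types, the institution names and the `may_access` helper are hypothetical, and nothing here reflects the actual interfaces of Athens, Shibboleth or any EDINA service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """An end user (staff or student) claiming a home institution."""
    name: str
    institution: str  # hypothetical identifier for the home institution


@dataclass(frozen=True)
class Licence:
    """A licence for one service, listing the institutions that signed it."""
    service: str
    licensed_institutions: frozenset


def is_member(user: User, members_by_institution: dict) -> bool:
    """Authentication: is this person a member of the institution claimed?"""
    return user.name in members_by_institution.get(user.institution, set())


def is_licensed(user: User, licence: Licence) -> bool:
    """Authorisation: has the user's institution licensed the service?"""
    return user.institution in licence.licensed_institutions


def may_access(user: User, licence: Licence, members_by_institution: dict) -> bool:
    """Access is granted only when both checks pass."""
    return is_member(user, members_by_institution) and is_licensed(user, licence)


# Illustrative data only (assumed names, not real records).
members = {"Univ A": {"alice"}, "College B": {"bob"}}
digimap = Licence("Digimap", frozenset({"Univ A"}))

alice_ok = may_access(User("alice", "Univ A"), digimap, members)      # member, licensed
bob_ok = may_access(User("bob", "College B"), digimap, members)       # member, not licensed
mallory_ok = may_access(User("mallory", "Univ A"), digimap, members)  # licensed, not a member
```

Keeping the two questions as separate functions mirrors the slide: the institution vouches for membership, while the licence decides which institutions are entitled to the service.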

Page 14

EDINA as national data centre

• http://edina.ac.uk

• 50% direct funding from JISC for delivering services
– Good reputation for helpdesk, user interfaces, FAQs etc

– 24/7, 99% uptime

• 50% is extra awarded for Development activity

– Developing services; developing JISC IE; working with Researchers

– Acknowledged project competence for R&D

• Strategic role as Geographic Data Centre

– For JISC (Digimap etc), for ESRC (UKBORDERS)

– Building Spatial Data Infrastructure with NERC and internationally (OGC)
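
As a rough reading of the ‘24/7, 99% uptime’ figure on this slide, the arithmetic below shows how much downtime such a target would allow. The slide does not say how uptime is measured, so the measurement periods and the function name are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # non-leap year, assumed for simplicity
HOURS_PER_WEEK = 7 * 24


def allowed_downtime_hours(uptime_fraction: float, period_hours: float) -> float:
    """Hours of downtime permitted in a period at a given uptime fraction."""
    return period_hours * (1.0 - uptime_fraction)


# 99% uptime on a 24/7 service leaves 1% of all hours as possible downtime.
yearly_downtime = allowed_downtime_hours(0.99, HOURS_PER_YEAR)   # about 87.6 hours
weekly_downtime = allowed_downtime_hours(0.99, HOURS_PER_WEEK)   # about 1.68 hours
```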

Page 15

Existing Geo-data Services

Page 16

Where are we with GIS?

• University of Edinburgh & its Data Library have long-running interest & experience
– Geography Department (Coppock/Hotson; Waugh/GIMMS) & PLU
* first MSc GIS course, and much else
– ESRC Regional Research Laboratory for Scotland, 1987-
– Launch of UKBORDERS in 1994

• EDINA has continued and extended that for geo-spatial data
– JISC eLib project: access to Ordnance Survey mapping, 1996-
– Launch of Digimap service, 2000 -
– Extension of UKBORDERS, 2001 -

• ‘Shared Services’ provision
– Go-Geo! (geo-data portal)
– geoXwalk
– GRADE: Geospatial Repository for Academic Deposit and Extraction

• Not all (only a fraction) of geo-referenced data at EDINA

• Strategic importance of interoperability
– GI web services

• Interested in furthering the use of GI data across disciplines
– Geo-parsing & mark-up; geo-finding; geoXwalk (vocabularies)

Page 17

Disciplinary data-centres

* Something’s special about the spatial *

EDINA role as Geographic Data Centre?

Slide ‘borrowed’ from Liz Lyon, & curated ..

Page 18

2. Getting back to Problem Statement

‘roles & responsibilities’

Some Thoughts, and Questions…

• What resources, and how should we share?
– What are ‘scholarly resources’?

• What is special about scholarship?

• What is different about digital?

• Who should do what?
– A division of labour that leverages
* ‘responsibility’ and ‘expertise’ for curation
* Means of service delivery

I. Find our place – in old and new geography

• ‘words, numbers, pictures, sounds all to be digital & accessed from afar’

Page 19

Scholarship: Services and Stewardship

• Services, in support of scholarship
– Libraries have traditionally focussed on the formal part of scholarly communication
– Relevance: searching strategies
– new challenges: how to cope with digital everything?

• Stewardship
– Was ‘Special Collections’, now ‘Collections, inc. the digital’
– Ensuring provenance & continuing access
* Digital curation, preservation & archiving
* Sharing with future scholarship
* Sharing with wider world

• Research
– What do researchers do, and what do they want/need?
– eScience, Data, and ‘scholar workstation’ and the VRE

• Learning and Teaching
– What do students need?
– What do teachers/lecturers need?
– e-learning and the VLE (virtual learning environment)

Page 20

Infrastructure to support four ‘demand-side’ verbs

discover: information object of interest
e.g. article referenced in database, A&I, eToC, etc

locate: organisation offering service
e.g. library (union catalogue/OPAC) or document delivery service

request: use of service
via payment of money or privilege of membership

access: object of interest
via personal visit, document delivery, online access

based on MODELS workshops (UKOLN/JISC eLib)
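
The four verbs can be read as a small pipeline, sketched below. Everything here is hypothetical: the `Service` class, the function names and the sample records are invented for illustration and are not taken from the MODELS workshops or from any JISC service.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """An organisation offering a service on some objects (hypothetical model)."""
    name: str
    holdings: dict                                 # object id -> content
    members: set = field(default_factory=set)      # users with membership privilege

    def request(self, user: str, paid: bool = False) -> bool:
        """Request: granted by privilege of membership or payment of money."""
        return paid or user in self.members

    def access(self, object_id: str) -> str:
        """Access: deliver the object of interest."""
        return self.holdings[object_id]


def discover(index: dict, term: str) -> list:
    """Discover: ids of objects whose description mentions the term."""
    return [oid for oid, text in index.items() if term.lower() in text.lower()]


def locate(services: list, object_id: str) -> list:
    """Locate: the services able to supply a discovered object."""
    return [s for s in services if object_id in s.holdings]


# Illustrative data only.
index = {"art-1": "Survey methodology article", "art-2": "GIS data curation article"}
library = Service("Library", {"art-2": "full text of art-2"}, members={"student"})
docdel = Service("Document delivery", {"art-1": "full text of art-1"})

found = discover(index, "curation")                              # discover
suppliers = locate([library, docdel], found[0])                  # locate
granted = suppliers[0].request("student")                        # request
delivered = suppliers[0].access(found[0]) if granted else None   # access
```

Discovery works on descriptions while location works on holdings, which is why the two steps can be served by different organisations.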

Page 21

Simplified workflow

[Diagram labels: Discover; Locate; Access; Use; ‘Publish*’ (*Issue); Fit for purpose?; Curate]

Page 22

Dataset publishing

• Re-examine concept of Dataset Publishing (Callahan, Johnson, and Shelley 1996)
– analogous to publishing papers
– rewards for publishing datasets (e.g. promotion, RAE)
– procedures (e.g. standards to use, peer review) & resources to manage procedures
* Should minimise time and effort required
– need tools to assist in creation, maintenance and dissemination of dataset descriptions

• Means of ‘putting’ into a public/community
– Deposit and Share are too cosy
– to ‘publicate’, to issue

• Terms of access and use
– Open?
– Privilege of membership
– Payment of money

Page 23

Repositories of digital content

• So what is a digital repository?
– I like (user) verbs, not (supply-side) nouns …

• A repository is a noun that meets a set of (user) verbs/tasks, by supporting delivery of [services] for a given/designated client community:
– Put [ingest service]
– Keep-safe [storage service]
– Get [access service]

Motivation:

• for the record? preservation; prospect of access

• for re-use? curation; current access

• Can we say, “Behind every great service, there is a wonderful managed repository”?

No, not if access service does not have corresponding ingest service.
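
One way to picture the three verbs is as methods on a single object, as in the sketch below. The `Repository` class is hypothetical, and using a SHA-256 checksum as the ‘keep-safe’ check is an assumption chosen for illustration; the slide does not prescribe any mechanism.

```python
import hashlib


class Repository:
    """Toy in-memory repository offering put / keep-safe / get (hypothetical)."""

    def __init__(self):
        self._store = {}      # object id -> bytes
        self._checksums = {}  # object id -> SHA-256 recorded at ingest

    def put(self, object_id: str, content: bytes) -> None:
        """Put [ingest service]: accept content and record a fixity value."""
        self._store[object_id] = content
        self._checksums[object_id] = hashlib.sha256(content).hexdigest()

    def keep_safe(self, object_id: str) -> bool:
        """Keep-safe [storage service]: compare stored bytes with the ingest checksum."""
        current = hashlib.sha256(self._store[object_id]).hexdigest()
        return current == self._checksums[object_id]

    def get(self, object_id: str) -> bytes:
        """Get [access service]: return content only if it was ingested and is intact."""
        if object_id not in self._store:
            raise KeyError(f"{object_id} was never put into this repository")
        if not self.keep_safe(object_id):
            raise ValueError(f"{object_id} failed its fixity check")
        return self._store[object_id]


repo = Repository()
repo.put("survey-1979", b"school leavers survey data")
intact = repo.keep_safe("survey-1979")
fetched = repo.get("survey-1979")
```

In this sketch `get` can only succeed for something that went through `put`, which is the point of the answer above: an access service with no corresponding ingest service is not backed by a managed repository.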

Page 24

Repositories & OAIS Reference Model

?? In a classic Repository, the DIP is the same as the SIP ??

In a data centre, and many data libraries, it rarely is.

[Figure 4-1, OAIS functional entities. Labels: PRODUCER; SIP; Ingest; AIP; Archival Storage; Descriptive Info; Data Management; Access; queries; result sets; orders; DIP; CONSUMER; Administration; Preservation Planning; MANAGEMENT]
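
The contrast between a ‘classic’ repository and a data centre can be sketched with the three OAIS package types. The class and function names below are hypothetical, and the upper-casing step merely stands in for real value-added processing such as rendering map data; it is not how any actual service works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SIP:
    """Submission Information Package: what the Producer hands over."""
    content: str


@dataclass(frozen=True)
class AIP:
    """Archival Information Package: content plus descriptive information."""
    content: str
    descriptive_info: dict


@dataclass(frozen=True)
class DIP:
    """Dissemination Information Package: what the Consumer receives."""
    content: str


def ingest(sip: SIP, descriptive_info: dict) -> AIP:
    """Ingest: wrap the submission with descriptive info for archival storage."""
    return AIP(sip.content, dict(descriptive_info))


def access_as_is(aip: AIP) -> DIP:
    """'Classic' repository: disseminate exactly what was submitted."""
    return DIP(aip.content)


def access_with_value_added(aip: AIP) -> DIP:
    """Data-centre style: derive a new product from the archived content.

    The title prefix and upper-casing are placeholders for real processing.
    """
    return DIP(f"{aip.descriptive_info['title']}: {aip.content.upper()}")


sip = SIP("raw feature records")
aip = ingest(sip, {"title": "Playing Field"})
classic_dip = access_as_is(aip)
curated_dip = access_with_value_added(aip)
```

Here `classic_dip` carries the same content as the SIP, while `curated_dip` does not, which is the situation the slide describes for data centres and many data libraries.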

Page 25

Support for Research & research-led learning

• Data, software and facilities
– Data as ‘evidence’
– Data curation and digital preservation: continuing access

• Data Archives and Data Libraries
– Social surveys, and much more
– IASSIST
* International Association for data professionals (1972 -)
* Members in Philippines and Vietnam

• Census Programme
– Small area statistics [MIMAS]
– UKBORDERS (boundaries for thematic mapping) [EDINA]

• EDINA Digimap Collection
– Topographic mapping data, from national mapping agency
– Marine & Geological mapping data

• then there is the challenge of scientific visualisation, and observational images and documentary films!

Page 26

Scholarly Communication

1. Access to commercial services & resources
– Consortium licensing
– ‘local’ hosting licensed data at National Data Centres (NDCs)

2. Focus on community-generated resources
– Union catalogues (& links to ILL/docdel) - SUNCAT
– digital library developments
– Open Access repositories
* “Put it in The Depot” (www.depot.ac.uk)

3. Need for Access Control as Middleware development
– Shibboleth framework, developed as part of Internet2
* UK Access Management Federation for Education & Research
* Managed by UKERNA, based on work by EDINA SDSS
– replacing vendor’s UserID & password with community scheme

Page 27

Scholarly Communication

Author

Reader

writes to be recognised by peer community & for institutional Research Assessment Exercise (RAE) purposes

… perhaps to be read

Key User (Reader) Verbs:

Discover article of interest
Locate service on those articles
Request permission to use service
Access to service/article

(content of) article is the ‘information object of desire’

Page 28

Scholarly Communication (simple model: focus on article-length work published in journals)

[Diagram labels: Author (article); Publisher (article, issue, serial); Library (serial); Licence; Reader (article); £]

Libraries and Publishers provide framework … the traditional ‘middleware’/‘infrastructure’

... with Licence(s) for electronic (online) and print (on-shelf)

P.Burnhill, EDINA/JISC, 2005

Page 29

Scholarly Communication & Open Access (Access to article-length work)

[Diagram labels: Author (article); Publisher (article, issue, serial); learned society; Library (serial); Licence; Reader (article); peer review; peer exchange; ILL/docdel; repositories; E-prints; free2web access; ‘Open Access’; ‘Digital Preservation’; Licensed Online Access; Institutional arrangement; ££]

Informal: ‘invisible college’ and the ‘gift economy’

Forma£ economy

Page 30

Research Data

Creator

Researcher

Generates (curates) data for own purpose, or as part of team

… wants/has to ‘put’ it somewhere for use by others

(perhaps to be recognised by a peer community)

Key User (Researcher) Verbs:

Discover data of interest
Locate service on that data with documentation on provenance etc
Request permission to use service
Access to service/data

Evidential value of data in analysis as ‘object of desire’

Page 31

Data (simple model)

[Diagram labels: Creator (dataset); Data Centre (database); (Data) Library; Licence; Researcher (data); £ ??]

who provides framework? … the ‘middleware’/‘infrastructure’

... with what kind of Licence(s) for access?

P.Burnhill, EDINA/JISC, 2006

Page 32

Doing Data

[Diagram labels: Creator (dataset); Data Centre; learned society; Institution; Licence; Researcher; peer review; peer exchange; repositories; datasets; free2web access; ‘Open Access’; ‘Digital Preservation’; Authorised Online Access; Institutional arrangement; ££]

Informal: ‘invisible college’ and the ‘gift economy’

Forma£ economy

Page 33

All Curators Now …

Thank you

[email protected]

http://edina.ac.uk

http://jisc.ac.uk

Page 34

JISC Information Environment Architecture

(Idealised) Technical Infrastructure for Services

Andy Powell, 2005

Page 35

Disciplinary data-centres

* Something’s special about the spatial *

EDINA has role as Geographic Data Centre

Slide ‘borrowed’ from Liz Lyon, & curated ..

Page 36

Support for Research & research-led learning

• Data, software and facilities
– Data as ‘evidence’
– Data curation and digital preservation: continuing access
* Digital Curation Centre established (Edinburgh-led)

• Data Archives and Data Libraries
– Social surveys, and much more
– IASSIST
* International Association for data professionals (1972 -)
* Members in Philippines and Vietnam

• Census Programme
– Small area statistics [MIMAS]
– UKBORDERS (boundaries for thematic mapping) [EDINA]

• EDINA Digimap Collection
– Topographic mapping data, from national mapping agency
– Marine & Geological mapping data
– I could say very much more about Digimap!!

• And then there are images and documentary films!

Page 38

Focus on community-generated resources

1. ‘traditional ground for libraries’
– Union catalogues (& links to ILL/docdel) – SUNCAT
– [Ask me about SUNCAT]

2. ‘digital library developments’
* Resource Discovery Network
* Inter-operability – not just http, but m2m interfaces
* Digitisation
– Newspapers, NewsFilm, Manuscripts …
– DIWAN: digitising Islamic Materials in UK university collections

3. New challenge: Open Access repositories
* International development – UK active
* Institutional Repositories
– ‘put it in The Depot’ – www.depot.ac.uk [not yet launched]

need Access Management Federation for Education & Research
– Shibboleth framework, developed as part of Internet2