Jisc Research Data Discovery Service Project
Christopher Brown
February 2016
Background and Context
» Phase 1 – Digital Curation Centre (DCC) piloted an approach to a registry service aggregating metadata for research data held within UK universities and national, discipline-specific data centres
» Phase 2 – will build on this pilot work, aiming to lay firm foundations for a UK Research Data Discovery service that enables the discovery of UK research data and meets Jisc’s customer requirements. Includes a service operation plan and business case for its delivery into the future.
» Project page – http://jisc.ac.uk/rd/projects/uk-research-data-discovery
» Project blog – http://rdds.jiscinvolve.org/wp/
» #jiscRDDS
Benefits and justification
» Discovery of research data within HEIs and Data Centres
› Increased visibility and transparency of research data
› Promotes their research / researchers
› Re-use of data
› Validation of research
› Improves research data management practices, including metadata creation
› Adoption of standards
› Satisfy research councils’ policies
› Funders can ensure research isn’t being repeated
› Potential increase in cross-disciplinary and cross-institutional research
› Links to other research artefacts
Project Team
» Catherine Grout – Project Director (Jisc)
» Christopher Brown – Project Manager (Jisc)
» Mark Winterbottom – Technical Developer (Jisc)
» Dom Fripp – Metadata Developer (Jisc)
» Ade Stevenson – Technical Innovations Coordinator (Jisc Manchester)
» Veerle van den Eynden – Data Centre Engagement (UKDS)
» Diana Sisu – HEI Engagement (DCC)
What the project is delivering?
» Develop and agree sector requirements for a UK Research Data Discovery Service.
» Develop key use cases, both for human and system interfaces, for the use of the discovery service. These will be developed through the lifetime of the project to test and guide its scope, architecture and functional needs.
» Ensure development is steered by the UK research data community through the project’s governance structure.
» Evaluate the ANDS and CKAN software as a potential solution to developing a research data discovery service.
» Collaborate closely with the HEIs and Data Centres from phase 1, where they are willing to participate, to ensure their metadata is harvested and successfully imported into the registry through an easy-to-use interface.
What the project is delivering?
» Identify and finalise the agreement on the metadata standard/profile that is appropriate for a successful cross-disciplinary service.
» Identify the architecture that a UK service could operate on and develop a functioning service instance, including the opportunity for institutions to have a localised view into the registry.
» Ingest metadata into a functioning service instance for all participating Data Centres and HEIs.
» Develop the business case for a UK Research Data Discovery Service including evidence based market research and cost-benefit analysis.
» Establish and run stakeholder groups to engage with the community to understand their needs and to help to build an effective solution.
What the project is delivering?
» Evaluate the role of this service as providing institutional infrastructure for data discovery and how it works with universities.
» Ensure the service and user interface have undergone comprehensive usability tests.
» Produce toolkits, guidance and advice, as appropriate, on implementation.
» Clear articulation of where the UK Research Data Discovery Service sits within other elements of research data infrastructure.
Who’s it for? Participating pilots
HEIs
» University of Hull
» University of St Andrews
» University of Glasgow
» Oxford Brookes University
» University of Edinburgh
» University of Oxford
» University of Southampton
» University of Leeds
» University of Lincoln
Data Centres
» Archaeology Data Service
» Cambridge Crystallographic Data Centre
» ISIS/ICAT - STFC
» UK Data Service
» Visual Arts Data Service
» NERC
Who’s it for? Governance Structure
» Oversight Group
› Representatives from Jisc and partner organisations to lead the direction of the project.
» User Group
› Not researchers but people sharing catalogues and submitting data.
» Technical & Metadata Advisory Group
› Looking at the service from a technical standpoint. Scope includes consideration of issues such as handling duplicates, deletions, choice of crosswalks for support, QA of crosswalks, transport mechanisms (e.g. OAI-PMH), and other relevant issues. Comprised of developers and architects within the project, plus developers from ANDS and CKAN and relevant technical experts in participating data centres and HEIs.
› Advising on the development of the metadata schema, including the necessary and desirable metadata elements to achieve discovery functionality and which conventions should be adopted when using these and other relevant issues. Comprised of metadata experts from within the project and relevant metadata experts in participating data centres and HEIs.
» User Group – researchers
› As the overall aim of the project is production of a service to provide improved discoverability of research data for reuse in research, it is critical that we provide a mechanism for researchers to interact with and feedback on the development of the service. This may be achieved by representative bodies and / or nomination of researchers by project partner institutions.
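The Technical & Metadata Advisory Group's scope mentions deletions and OAI-PMH as a transport mechanism. As a minimal sketch only, and not the project's actual harvester, the snippet below shows one way a harvested OAI-PMH `ListRecords` response could be split into live and deleted records, relying on the protocol's `status="deleted"` header attribute. The sample XML, the `example.ac.uk` identifiers and the function name `split_live_and_deleted` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical, hand-written ListRecords response (not real project data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.ac.uk:1</identifier>
        <datestamp>2016-01-10</datestamp>
      </header>
      <metadata/>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.ac.uk:2</identifier>
        <datestamp>2016-01-12</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def split_live_and_deleted(xml_text):
    """Return (live_ids, deleted_ids) from an OAI-PMH ListRecords response."""
    root = ET.fromstring(xml_text)
    live, deleted = [], []
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        # A repository that tracks deletions marks them on the record header.
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    return live, deleted


live_ids, deleted_ids = split_live_and_deleted(SAMPLE_RESPONSE)
print("live:", live_ids)
print("deleted:", deleted_ids)
```

A registry could use the deleted identifiers to withdraw stale entries on the next harvest. This only works for sources that actually report deletions, which is one reason the topic sits with the advisory group.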
Who’s it for? Use cases for different actors
Researcher
» Discover datasets
» Discover related objects/resources
» Find data across disciplines by location
» Find exemplar data to inspire my research
» Targeted search for topical data
» Visual search for data
» Find linked open data
» Understand metadata quality
» Understand data quality
» Show research impact
Project/research manager
» Reporting to funders
» Find research outputs of my institution
Funder
» Return on investment
Data repository
» Show repository impact
» Metadata rights respected
» Show licence and rights of data
» Index to external services
» Force refresh of registry content
Machine
» Harvestable registry
» Show relationships between resources
System manager
» No duplicate records
» Harvest datasets
» Update platform software
Metadata flows
Current status
» High level evaluation of ANDS and CKAN with report complete
» CKAN selected as solution and installed for testing http://ckan.data.alpha.jisc.ac.uk/
» Test instance of CKAN (alpha) with data harvested from HEIs/Data Centres
» HEI and Data Centres Requirements Reports delivered by DCC/UKDS
» Statement of Requirements extracted from use cases and agreed with Advisory Groups
» Finalised membership of Advisory Groups (Technical & Metadata, User, Researcher)
» Five rounds of Advisory Group meetings convened
» Second F2F Workshop for participants and advisory group members – 18 Feb
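Since CKAN was selected as the platform, harvested records in a test instance would normally be searchable through CKAN's standard Action API (`package_search`). The following is a hedged sketch of how a client might build a search URL and summarise a response. The helper names and the sample JSON are hypothetical, no request is actually sent, and the alpha instance may no longer be available at the address above.

```python
import json
from urllib.parse import urlencode, urljoin


def build_search_url(base_url, query, rows=10):
    """Build a CKAN Action API package_search URL (base_url must end in '/')."""
    params = urlencode({"q": query, "rows": rows})
    return urljoin(base_url, "api/3/action/package_search") + "?" + params


def summarise_results(response_text):
    """Return (total_count, [dataset titles]) from a package_search response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    return result["count"], [pkg["title"] for pkg in result["results"]]


# Hypothetical response body, shaped like CKAN's package_search output.
SAMPLE_BODY = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{"name": "example-dataset", "title": "Example dataset"}],
    },
})

url = build_search_url("http://ckan.data.alpha.jisc.ac.uk/", "climate data", rows=5)
count, titles = summarise_results(SAMPLE_BODY)
print(url)
print(count, titles)
```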
Issues
» Quality of metadata exposed by different HEIs and Data Centres
» Diversity in mandatory and optional metadata fields
» Copyright of metadata, for example Visual Arts Data Service (VADS), and whether or not all metadata can be licensed via a CC0 licence
» Updates of metadata in the registry to handle deletions/changes
» Usability of the discovery service
» Browse and Search – how best to present results. Ranking of datasets.
» Ensure all requirements have been clearly defined, for example quality of data, links to other artefacts, duplicate records (data archived in >1 repository).
» Alpha site is public so ensure feedback is tracked but focus on user requirements from participants
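One of the issues above is duplicate records, where the same data is archived in more than one repository. The slides do not say how the project resolves this. As an illustrative sketch only, the code below merges records that share a DOI after normalising its form, and leaves records without a DOI untouched. The record fields (`doi`, `title`, `source`) and both function names are assumptions, not the project's schema.

```python
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(value):
    """Lower-case a DOI and strip common resolver or 'doi:' prefixes."""
    doi = value.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def merge_duplicates(records):
    """Group records by normalised DOI, collecting every source repository."""
    merged = {}
    for record in records:
        doi = record.get("doi")
        # Records without a DOI get a key of their own, so they are never merged.
        key = normalise_doi(doi) if doi else ("no-doi", record["source"], record["title"])
        entry = merged.setdefault(key, {"title": record["title"], "sources": []})
        if record["source"] not in entry["sources"]:
            entry["sources"].append(record["source"])
    return list(merged.values())


# Hypothetical harvested records: the first two describe the same dataset.
harvested = [
    {"doi": "10.1234/ABC", "title": "Survey data", "source": "HEI repository"},
    {"doi": "https://doi.org/10.1234/abc", "title": "Survey data", "source": "Data centre"},
    {"doi": None, "title": "Field notes", "source": "HEI repository"},
]

for entry in merge_duplicates(harvested):
    print(entry["title"], entry["sources"])
```

Matching on DOI alone is deliberately conservative. Datasets without persistent identifiers would need fuzzier matching on title or creator, and that carries a risk of false merges.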
Current Focus
» The current focus is on the following areas:
› System testing – matching functionality against requirements
› Requirements – signing off main (“must have”) requirements and ensuring user needs are met. Finalising less well defined requirements
› Metadata – finalising the core metadata schema with participants / advisory groups
› Scope of datasets – ensuring there is agreement on what datasets are harvested
– Primary – the primary purpose of the service is to include data created in ac.uk domains and that this data is available for reuse
– Secondary – this may refer to sources of data, not produced by UK researchers, in .ac.uk
– Exploratory/Developmental – other sources not part of current development work
Timeline
Milestones 2015
April-June, July, September-October, November, December

- Project plan
- Grant letters
- Initial Workshop
- Advisory Groups
- User Stories

- Metadata format defined
- Prototype RDDS development
- Call for proposal (Inst’al Implem)

- Test harvesting
- RDDS initial testing
- RDDS prototyping

- Requirements gathered
- User stories -> Use Cases (refined/prioritised)

- High level evaluation
- CKAN selected as platform
- Reqs defined from use cases

- Metadata standard format of service defined
- Service proto
- Call for HEIs to pilot inst’al impl.

- Test metadata defined and harvested
- Iterative development
- Initial testing

- User Stories refined
- Advisory Groups setup (Tech & Metadata, User, Researcher)

- Technical Evaluation Report
- CKAN installation
- Requirements defined

- Data Centre Reqs Report
- HEI Reqs Report
- Use Cases
Timeline
Milestones 2016
» January – RDDS prototyping; RDDS testing; Metadata format / standards
› Iterative development
› Testing
› Metadata format defined, supported formats agreed, export format
» February-March – Metadata Tech Report; Metadata Records/Stores; Institutional Implementation Reports
› Metadata records harvested from pilots
› Pilots have metadata stores for harvesting
› HEI/Sector/Use cases reports (Inst. Imp)
» April-May – Business Case; Working service software implementation
› Options and costs for running a sustainable service
› Implementation as a service ready for deployment
» June-August – Data Centre / HEI Pilots Implementation Reports
› Implementation reports from Data Centres and HEIs
» September – Service Operational Spec; Localised implementation and report
› Spec on running as a service
› Localised institutional implementation / deployment
Find out more…
Christopher Brown
Senior Co-design Manager, [email protected] @chriscb
Except where otherwise noted, this work is licensed under CC-BY-NC-ND