There cannot be many mature products where development meetings have not been interrupted with a rueful declaration that to make further progress “you wouldn’t start from here”. This encapsulates one key difference between the architect and the engineer: the engineer is prepared to work with the set of tools provided, while the architect prefers to start with a blank sheet of paper or an open space. In building research data repositories on two different software platforms, Microsoft SharePoint and EPrints, the DataPool Project is working somewhere between these extremes. Which approach will prove to be the more resilient for research data management (RDM)? In this talk we will look at the relevant factors.
To architect or engineer? Lessons from DataPool on building RDM repositories
Steve Hitchcock, JISC DataPool Project
9th DCC Research Data Management Forum (RDMF9), Cambridge, 14-15 November 2012
Why architecting?
http://datapool.soton.ac.uk
DataPool architecture (SharePoint)
Peter Hancock, iSolutions, University of Southampton
DataPool Building Capacity, Developing Skills, Supporting Researchers
JISCMRD Progress Workshop 24-25 October 2012 Nottingham
Byatt, D., Hitchcock, S., White, W.
Policy and guidance, Data repository, Training
http://datapool.soton.ac.uk/
SharePoint
EPrints 3.3
EPrints data apps
Support for Data Management Plans e.g.
3-layer metadata
Capture/share with external sources, e.g. SWORD-ARM
Informed by
Developing/working with
Progress
Assign DataCite DOIs
Graduate & staff training services
Large-scale data storage
University Strategic Research Groups
Doctoral Training Centres
IDMB Surveys of data practices among academics
Case studies +
• Imaging, 3D
• Geodata
• ++
Data repository platforms
• DataFlow
• MS SharePoint
• EPrints
Other platforms available
• DSpace, CKAN, data.bris, etc.
Architected
Engineered
From a data repository perspective
Implementations of DataFlow Model
DataStage
SWORD
Curated repository/archive
DataFlow: two data deposit motivations for creators: want to (practice), need to (policy)
Two-stage architecture DataBank
Addresses Dropbox effect for data producers
EPrints
DSpace QMUL
DataStage: Upload file
DataStage was developed at the University of Oxford
DataStage screenshots courtesy JISC Kaptur project http://www.vads.ac.uk/kaptur/
Thanks to Carlos Silva
DataStage: Submit as data package
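The "submit as data package" step can be pictured as bundling working files with a small manifest before handing them to a curated repository. The following is a minimal sketch only: the function name `build_data_package` and the `manifest.json` layout are hypothetical illustrations, not DataStage's actual package format or API.

```python
import io
import json
import zipfile


def build_data_package(files: dict, metadata: dict) -> bytes:
    """Bundle files and a manifest into an in-memory zip archive.

    `files` maps file names to their byte content. The manifest layout
    is a hypothetical illustration, not the real DataStage format.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Write data files in a stable (sorted) order.
        for name in sorted(files):
            archive.writestr(name, files[name])
        # Record descriptive metadata and the file list alongside the data.
        manifest = {"metadata": metadata, "files": sorted(files)}
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()
```

The resulting bytes could then be passed to whatever deposit mechanism the repository offers (for example a SWORD client), which is outside the scope of this sketch.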
3-layer metadata model
Takeda et al., 6th IDCC, Dec. 2010, available from http://eprints.soton.ac.uk/169533/
JISC Institutional Data Management Blueprint (IDMB) Project, University of Southampton
SharePoint user interface 1: project
SharePoint user interface 2: data
+ fields for format, keywords
Prof. Simon Cox (Engineering) on SharePoint
“The concept that formed part of SP thinking (at Southampton) from the very inception … that ability to use SP as a way to manage or at least collaborate as part of a 5-10 year programme of work.
“The other side is what we’re doing with intellectual property and what we’re offering for students. I chair a group design project, and every single student has said ‘I just do it all on Dropbox’. The same is happening with our research. So I think we have at least to provide a level of service and a level of integration between our research experience and our teaching experience. Would these people go to Southampton rather than University of Nowhereshire on the Web or the University of Google or the University of Dropbox? These are deep questions for us.”
ePrints Soton: Item type: Dataset
Currently EPrints v3.2, customised to ePrints Soton
Dataset Item Type from 2007
ePrints Soton: start to deposit Dataset
EPrints data apps
Apps available from EPrints Bazaar http://bazaar.eprints.org/
Apps work with EPrints v3.3 or later
EPrints (test repo) DataShare enabled
App by Tim Brody, EPrints + DataPool
EPrints (test repo) Data Core enabled
App by Patrick McSweeney
Data Core “adds a few fields and doesn’t remove any fields from the eprint object. It creates an alternate workflow for datasets which is much smaller than a normal eprints workflow.”
EPrints (test repo) Data Core enabled 2
App by Patrick McSweeney
Essex Research Data metadata profile aims
“Using metadata schema relevant to UK HE and research data (DataCite, INSPIRE and DDI 2.1), we have developed a basic metadata profile suited to describing research data generated at institutions with disciplinary diversity. The inclusion of fields like Funder and Grant number will ensure future harvesting and linking opportunities (like RCUK Research Outcome Systems). The metadata also suits the EPSRC data registry requirements.”
http://researchdataessex.posterous.com/repository-beta-metadata-profile-released
EPrints: Essex Research Data repository
EPrints v3.3.10, customised to Essex Research Data
http://researchdata.essex.ac.uk/
Screenshots courtesy JISC Research Data @Essex project
Thanks to Louise Corti, Tom Ensom, Alexis Wolton
Essex Research Data record
Essex Research Data: observations
• Assumes data deposit, so no selection of EPrints Item Type
• No selection of e.g. Creative Commons licence, just copyright
• Requirement for Time Period suggests a particular type of data is expected
• Fields for Geographic info (not required) suggest a particular type of data is expected
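The observations above turn on which fields a profile marks as required. A minimal sketch of how such a profile might be expressed and checked follows. It is illustrative only: the field identifiers and the `missing_required` helper are hypothetical, and only the Time Period (required) and Geographic (optional) flags come from the observations; the other flags are assumptions, not the actual Essex profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileField:
    name: str
    required: bool


# Hypothetical profile. Only time_period (required) and geographic
# (optional) reflect the observations; the rest are assumed for illustration.
PROFILE = [
    ProfileField("title", True),
    ProfileField("creator", True),
    ProfileField("time_period", True),
    ProfileField("geographic", False),
    ProfileField("funder", False),
    ProfileField("grant_number", False),
    ProfileField("copyright", False),
]


def missing_required(record: dict) -> list:
    """Return names of required profile fields that are absent or empty."""
    return [f.name for f in PROFILE if f.required and not record.get(f.name)]
```

For example, a record supplying only a title and creator would be reported as missing `time_period`, which is the point of the third observation: a required field encodes an expectation about the kind of data deposited.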
Architects and surroundings
“On one plot aggressively crystalline blocks by Rogers Stirk Harbour are going up, their diamond shapes having nothing in particular to do with anything around them. On another Foster and Partners have designed a series of curving, stepped, blobby things, of the kind usually designed to take advantage of views on the Med or the Gulf, but are here facing each other like rows of daleks. Again, it shows little interest in anything around it.”
R. Moore, Utopia on Thames, Observer, 11 Nov 2012
Nine Elms, London
usembassylondon
Open access repository interoperability
Confederation of Open Access Repositories (COAR)
Dublin Core, CRIS-CERIF
OpenAIRE, RepositoryNet+, Rioxx
RCUK: Research Outcomes System, Gateway to Research, REF
Is there the same current debate about interoperability of data repositories?
COAR on OA interoperability
Specific initiatives designed to support interoperability: AuthorClaim, CRIS-OAR, DataCite, DINI Certificate for Document and Publication Services, DOI, DRIVER, Handle System, KE Usage Statistics Guidelines, OAI-ORE, OAI-PMH, OA-Statistik, OA Repository Junction, OpenAIRE, ORCID, PersID, PIRUS, SURE, SWORD, and UK RepositoryNet+.
COAR, The Current State of Open Access Repository Interoperability (2012), 26 Oct. 2012 v.02
MT @gknight2000 (Gareth Knight) Lincoln's CKan instance impressive bit.ly/QQd1au Doesn't appear to support OAIPMH or preservation function #jiscmrd
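OAI-PMH, listed by COAR and noted as missing from the CKAN instance above, is queried through plain HTTP requests that carry a `verb` parameter. A minimal sketch of building such a request follows. The base URL is a placeholder (repository.example.org is not a real endpoint) and the helper name `oai_pmh_url` is hypothetical; the `ListRecords` verb and the `oai_dc` metadata prefix are defined by the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode


def oai_pmh_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Placeholder endpoint, for illustration only.
url = oai_pmh_url(
    "https://repository.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
)
print(url)
# https://repository.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
```

A harvester would fetch this URL and parse the XML response; a repository platform that does not answer such requests cannot be harvested this way, which is the gap the tweet points to.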
What next for DataPool repositories?
SharePoint
• User test and feedback sessions scheduled, will direct further development
EPrints apps (1 or 2 of the following, initially)
• Develop app based on Essex data repository, providing other repositories with a 1-click install of this profile
• Build interoperability (I/O) apps: e.g. Data Management Plans, Dropbox
• Automate record capture for producers of large-scale, regular data outputs