Active research management and sharing
Chair: Professor David De Roure, OERC
01/05/2023
Introduction
Chair: Professor David De Roure, OERC
Open Science Framework
Jeffrey Spies (@jeffspies)
Center for Open Science | University of Virginia
Mission
Increase openness, integrity, and reproducibility of scientific research.
Problem
The gap between scholarly values and practices.
Incentives for individual success are focused on getting it published, not getting it right.
Nosek, Spies, & Motyl, 2012
Key incentive = Publication
What is published?
• Novel results
• Positive results
• Clean results
What is not?
• Replications
• Negative/null results
• Mixed evidence
Norms vs. counternorms
• Communality (open sharing) vs. Secrecy (closed)
• Universalism (evaluate research on own merit) vs. Particularism (evaluate research by reputation)
• Disinterestedness (motivated by knowledge and discovery) vs. Self-interestedness (treat science as a competition)
• Organized skepticism (consider all new evidence, even against one’s prior work) vs. Organized dogmatism (invest career promoting one’s own theories, findings)
• Quality vs. Quantity
Anderson, Martinson, & DeVries, 2007
Problem
The gap between scholarly values and practices.
Solution
Openness.
Openness
• Increases process transparency
• Increases accountability
• Facilitates reproducibility
• Facilitates metascience
• Fosters collaboration
• Fosters inclusivity
• Fosters innovation
• Protects against lock-in
• Open Content + APIs + Open Source Services
Openness is a means to increase research quality and efficiency.
Solution
Technology can operationalize and incentivize scholarly values.
Openness is the solution; workflow makes it practical.
[Figure: workflow cycle: Search/Discovery → Develop Idea → Design Study → Collect Data → Store Data → Analyze Data → Write Report → Publish Report → Search/Discovery]
Let experts be experts.
Scholars should focus on scholarship.
[Figure: the same workflow cycle with OSF at its center]
OSF Application Framework
• Workflow
• Authentication
• Permissions
• File Storage
• File Rendering
• Meta-database
• Persistence
• Integrations
• Search
• SHARE
osf.io, journals, registries, preprint servers, grants management, university systems
http://osf.io
Collaboration
Documentation
Archiving
Version Control
Other Features
• Granular permissions
• Analytics dashboards
• Forking
• Persistent, citable identifiers
• Persistent content
• Project snapshotting (i.e., registration)
Connects Services Researchers Use
Now
OpenSesame
Soon
29 grants to develop open tools and services: https://cos.io/pr/2015-09-24/
http://osf.io/prereg
Society for the Study of Autism Registry
<menu>
  <title>Society for the Study of...</title>
  <osf-search />
  <osf-login />
</menu>
<society-questionnaire>
  Author: <osf-user name />
  Question 1: <osf-metadata />
  Analysis code: <osf-file upload />
</society-questionnaire>
Registries
Preprint servers
Coming soon: http://osf.io/preprints
Substantive experts shouldn’t deal with…
• Workflow integration
• Authentication
• Permissions
• File storage
• File rendering
• Database
– Metadata/Annotations/Commenting
• Persistence
• External service integrations
• Search
• SHARE Data
…or scaling these.
@jeffspies | [email protected]
Find this presentation at http://osf.io/u4gs7
Active research from lab to publication
Prof. Simon Coles ([email protected])
Director, UK National Crystallography Service
What is “active” here?
Continual management throughout the whole experimental process
Immediate feedback during the experiment to inform next steps or direction
Operating a Service
[Figure: service lifecycle. Labels: Proposal, Approval, Submission, Experiment, Analysis, Publication, Admin, Reporting]
Working Across Facilities
[Figure: research lifecycle (Conceive Research, Propose, Experiment, Analyse, Publish) across NCS User, NCS and Central Facility. Stage labels: Proposal, Approval, Submission, Experiment, Analysis, Publication; Proposal, Approval, Schedule, Experiment, Archive, Analyse, Publish]
LIMS: Lifecycle Management
[Figure: lifecycle managed by the LIMS. Labels: Proposal, Approval, Submission, Experiment, Analysis, Publication, Admin, Reporting]
Getting Mobile in the Lab
Package & Sample Tube
Bulk Sample & Manipulation
Mounted Sample
Trial Diffraction Pattern
Unstructured data
Structured Data
http://ecrystals.chem.soton.ac.uk
In the context of ‘traditional’ publishing
• ELN as Supplementary Information for conventional publication (Chemistry Central Journal 2013, 7:182)
Can we make metadata do more for us (actively)?
Formal frameworks for real-time capture required…
A semantic framework for chemistry
• Describes and relates different types of process information
• elnItemManifest: high-level semantic description of ELN record
• Core Scientific Metadata model
• SIMS
• Reaction procedures: S88
• Analytical data: Allotrope Foundation
elnItemManifest
• Layered metadata model for description, export & packaging
• This is the first (information) layer – leads into knowledge
• Published through Dial-a-Molecule at http://wp.me/p2JoQ6-xF & in J. ChemInf 2013, 5:52
Core Scientific Metadata model as a Starting Point
• Doesn’t cover all, but…
• Forms the basis for extensions:
– To derived data
– To laboratory based science
– To secondary analysis data
– To preservation information
– To publication data
[Figure: Core Scientific Metadata model entities: Investigation, Publication, Keyword, Topic, Investigator, Sample, Sample Parameter, Dataset, Dataset Parameter, Datafile, Datafile Parameter, Related Datafile, Parameter, Authorisation]
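The entities listed for the Core Scientific Metadata model can be read as a containment hierarchy. The Python sketch below is a minimal illustration only. It assumes (the slide does not state this) that an Investigation holds Samples and Datasets, that a Dataset holds Datafiles, and that each level carries free-form parameters. All class names, field names and example values are hypothetical and do not reproduce the model's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Datafile:
    """A single file; 'parameters' stands in for Datafile Parameter."""
    name: str
    parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class Dataset:
    """A group of datafiles; 'parameters' stands in for Dataset Parameter."""
    name: str
    parameters: Dict[str, str] = field(default_factory=dict)
    datafiles: List[Datafile] = field(default_factory=list)


@dataclass
class Sample:
    """A physical sample; 'parameters' stands in for Sample Parameter."""
    name: str
    parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class Investigation:
    """Top-level record (assumed containment, see lead-in)."""
    title: str
    investigators: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)


def count_datafiles(investigation: Investigation) -> int:
    """Count every datafile across all datasets of an investigation."""
    return sum(len(ds.datafiles) for ds in investigation.datasets)


# Hypothetical example record; every value is invented for illustration.
example = Investigation(
    title="Example crystal structure determination",
    investigators=["A. Researcher"],
    samples=[Sample("crystal-001", {"colour": "pale yellow"})],
    datasets=[
        Dataset(
            name="diffraction-run-1",
            parameters={"temperature_K": "100"},
            datafiles=[Datafile("frame_0001.img"), Datafile("frame_0002.img")],
        )
    ],
)
print(count_datafiles(example))
```

Keeping parameters as plain key/value pairs at each level is one simple way to leave room for the extensions the slides mention (derived data, laboratory science, preservation information) without changing the core classes.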
SIMS: Sample Information Management System
• A standard/format for crystallographic sample and experiment data management and archival
• Supported by CrystalClear and NCS Portal, providing interaction between facility, instruments and CIF, ImgCIF etc.
Standards for reactions: S88
• Group arising from Dial-a-Molecule, consisting of Mettler Toledo, Pfizer, GlaxoSmithKline, AstraZeneca, Johnson & Johnson, Southampton University, NextMove and Royal Society of Chemistry, looking to:
– Provide guidance for S88 implementations for synthetic organic chemistry reaction procedures
– Provide example set
– Agree on controlled vocabularies for elements
– Generate a schema
– IUPAC uptake?
16 (0.101 g, 0.132 mmol) and Cu(OTf)2 (0.006 g, 0.01 mmol) were added to a round-bottom flask under a N2 atmosphere. In a separate vial, 2 (0.155 g, 0.753 mmol) was dissolved in C2H4Cl2 (1.3 mL) and transferred to the reaction flask. CF3CO2H (0.030 mL, 3 equiv) was added to the reaction mixture, which was refluxed at 100 °C for 1 h. The reaction mixture was washed with saturated NaHCO3 (15 mL) and extracted with C2H4Cl2 (3 x 5 mL). The organic fractions were collected, dried (MgSO4), and filtered to give a dark red solution. The solvent was removed, and the product was purified by column chromatography (SiO2, 30:70 CH2Cl2 : hexane) to yield 17 as a pale yellow powder (0.096 g, 68% yield).
Allotrope Foundation
• Standards for analytical process in (pharma) industry
Research Data Alliance
• Chemistry data interest group
• Joint RDA/IUPAC Charter drafted
– Characterise chemical data types
– Leverage to establish standards
– Examine workflows in disciplines interacting with chemistry
– Cultivate a sharing culture
What about process?
Recording process
• Plan (Prospective provenance)
• Enactment (Retrospective provenance)
• Realisation
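The difference between a plan and its enactment can be sketched in code. The Python fragment below is an illustration under assumed names only: `PlannedStep`, `EnactedStep`, `steps_not_enacted` and the step names are hypothetical and are not taken from oreChem or any ELN. The plan lists what should happen, the enactment records what actually happened, and comparing the two shows what is still outstanding.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlannedStep:
    """Prospective provenance: a step the method says should happen."""
    name: str


@dataclass(frozen=True)
class EnactedStep:
    """Retrospective provenance: a step that was actually carried out."""
    name: str
    operator: str
    note: str = ""


def steps_not_enacted(plan: List[PlannedStep],
                      enactment: List[EnactedStep]) -> List[str]:
    """Return planned step names with no enactment record, in plan order."""
    enacted_names = {step.name for step in enactment}
    return [step.name for step in plan if step.name not in enacted_names]


# Hypothetical plan and partial enactment, invented for illustration.
plan = [
    PlannedStep("mount sample"),
    PlannedStep("collect diffraction data"),
    PlannedStep("solve structure"),
]
enactment = [
    EnactedStep("mount sample", operator="operator-1"),
    EnactedStep("collect diffraction data", operator="operator-1",
                note="trial pattern looked weak"),
]
outstanding = steps_not_enacted(plan, enactment)
print(outstanding)
```

The free-text `note` field is where tacit, in-the-moment observations would sit alongside the formal record.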
oreChem Plan for eCrystals
• Machine-readable representation of methodology
• Describes requirements for software and data products
CREAM: Collaboration for Research Enhancement using Active Metadata
• How to collect and use metadata actively to capture tacit information
• Active metadata: assemblage of metadata and annotations used actively within the process that generates it (capable of being reused by another process).
• Central Facilities; Chemistry; Geosciences; Art; Music…
• Uptake: CODATA; Research Data Alliance
https://blog.soton.ac.uk/cream/
Managing active research in the University of Edinburgh
Robin Rice, University of Edinburgh (@sparrowbarley)
Elements of the presentation
• Funded & unfunded research, PGR students, collaborators
• Managing & support for …
– research grants
– research outputs
– research data
• Simplified research lifecycle
– (before – during – after)
New Research Management and Administration System (grants)
Worktribe - Empowering and supporting research administrators and investigators from idea through costing, approvals, award, post-award management and closure.
• A recent requirements and procurement project focused on delivery of a new 'Worktribe Research Management' system to support improved pre-award and post-award business processes across the University.
– 3 pilot schools / institutes November 2015
– Go-live across university April 2016
• The new 'Worktribe Research Management' system is integrated with the existing Finance, HR and PURE systems.
• https://www.projects.ed.ac.uk/programme/rmas
Managing research outputs: guidance for authors; OJS
1. Make your work Open Access
2. Use your name consistently
3. Use an ORCID identifier
4. Institutional affiliation
5. Acknowledge your funder
6. Statement on research data
7. Cite the DOI and OA links
8. Claim your digital space
‘Ensure your research reaches the widest possible global audience, is eligible for submission in research assessment exercises, and fulfils funder requirements.’
Managing Research Data (from researcher training)
Day-to-day management of data
• Type, format, volume of data, chosen software for long-term access, existing data, file naming, structure, versioning, quality assurance process.
• Information needed for the data to be read and interpreted in future, metadata standards, methodology, definition of variables, format & file type of data.
• Restrict access to data, risks to data security, appropriate methods to transfer / share data, encryption.
• Secure & sufficient storage for active data, regular backups, disaster recovery.
• Make data publicly available (where possible) at the end of a project, license data, any restrictions on sharing, access controls?
• Select data to keep, decide how long data will be kept, in which repository, costs involved in long-term storage?
Who manages active data?
• RDM Policy (2011) sets out roles and responsibilities for researchers & the institution
– Researchers are responsible for their work
– Enabling role of institution
• Services, including stewardship
• Monitoring compliance
• Principal Investigators
• Research Institutes & Schools
From RDM programme to Research Data Service…
• RDM programme a result of both bottom-up (‘action group’) activity and top-down policy implementation
• Services had various providers and their purpose and names were confusing
• New single service has SO & SOM with Virtual Team across Information Services
Before
After (Simplified data lifecycle)
Tools & support for working with data
Finding and analysing data
• What is Data Library & Consultancy?
• The Data Library & Consultancy team assists researchers to discover and use datasets for analysis, learning and teaching.
• Data librarians are available to help you find answers to data-related questions.
• Tools include a data catalogue and the Survey Documentation and Analysis online data browser
Storing data
• What is DataStore?
• DataStore is file storage for active research data, and is available to all research staff and postgraduate research students (PGRs).
• DataStore provides a free individual allocation for each researcher, as well as shared group spaces. Additional capacity of virtually any size is available.
Transferring data
• What is DataSync?
• DataSync is a tool to synchronise and share research data with collaborators. It has an app to synchronise data to computers and mobile devices, and a web interface to allow access to data from any web browser.
• Data can be shared with anyone who has an email address, via the web interface.
Versioning software
• What is Subversion?
• Subversion is a version control tool which allows users to store code. It is also available as an extension called SourcEd which provides a web based collaboration tool integrated with your repository.
• When documents stored in a Subversion repository are updated the old versions are kept so you can revert if necessary. The service also allows multiple people to collaborate on documents.
Data management support
One-to-one support is available on the following areas of RDM:
• Writing and reviewing DMPs;
• Creating SOPs for metadata collection and publication;
• Creating SOPs for good data management practice;
• Choosing a data repository and preparing data for deposit and publication
Training - Online
• MANTRA: MANTRA is a free, non-credit, self-paced course designed for postgraduate students and early career researchers which provides guidelines for good practice in research data management
• Research Data Management and Sharing MOOC: This free five-week course - created by the Universities of Edinburgh and North Carolina - is designed to reach learners across disciplines and continents.
MANTRA sessions by country 2012-2016
Sessions by city: January – June 2016
MOOC’s contents
• Understanding Research Data
• Data Management Planning
• Working with Data
• Sharing Data
• Archiving Data
Training workshops
• Creating a data management plan for your grant application
• Managing your research data: why is it important and what should you do?
• Working with personal & sensitive research data
• Good practice in research data management
• Handling data using SPSS
RDM training take-up, 2014-2016
Making research available: FAIR principles and FORCE11
David De Roure (@dder)
Director, University of Oxford e-Research Centre
A Brief History of Force11
● 2008/2009:
– Elsevier Grand Challenge
● 2010:
– Found & connected to Phil Bourne
– Planned Dagstuhl meeting
● 2011:
– January: Beyond the PDF, San Diego: 97 attendees
– August: Force11 at Dagstuhl: 34 attendees
– November: Manifesto is published >> Force11!
● 2012: Funding: Sloan
● 2013: Beyond the PDF2, Amsterdam:
– 148 attendees, great discussion
● 2014: Working groups take off:
– Data Citation Principles Working group
– Resource Identifier Working group
● 2015: Force15, Oxford:
– 257 attendees
● 2016: Force2016, Portland, Oregon
www.force11.org
The Joint Declaration of Data Citation Principles
PREAMBLE. Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record.
In support of this assertion, and to encourage good practice, we offer a set of guiding principles for data within scholarly literature, another dataset, or any other research object.
1. IMPORTANCE. Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications [1].
2. CREDIT AND ATTRIBUTION. Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data [2].
3. EVIDENCE. In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited [3].
4. UNIQUE IDENTIFICATION. A data citation should include a persistent method for identification that is machine-actionable, globally unique, and widely used by a community [4].
5. ACCESS. Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data [5].
6. PERSISTENCE. Unique identifiers, and metadata describing the data and its disposition, should persist -- even beyond the lifespan of the data they describe [6].
7. SPECIFICITY AND VERIFIABILITY. Data citations should facilitate identification of, access to, and verification of the specific data that support a claim. Citations or citation metadata should include information about provenance and fixity sufficient to facilitate verifying that the specific timeslice, version and/or granular portion of data retrieved subsequently is the same as was originally cited [7].
8. INTEROPERABILITY AND FLEXIBILITY. Data citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities [8].
Endorse principles: www.force11.org/datacitation
Authors: Data Citation Synthesis Group, force11.org/datacitation/workinggroup
References: force11.org/dcreference
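Principle 7 asks for fixity information sufficient to verify that data retrieved later is the same as what was cited, but it does not prescribe a mechanism. One common approach, shown here purely as an assumption, is to record a cryptographic checksum alongside the citation. The Python sketch below uses SHA-256 from the standard library; the function names and the sample data are hypothetical.

```python
import hashlib


def fixity_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the data bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_citation(data: bytes, cited_digest: str) -> bool:
    """Check retrieved bytes against the digest recorded at citation time."""
    return fixity_digest(data) == cited_digest.lower()


# Hypothetical dataset contents, invented for illustration.
original = b"sample_id,value\n1,0.132\n"
cited = fixity_digest(original)          # stored with the citation metadata

altered = b"sample_id,value\n1,0.133\n"  # a later, silently changed copy
print(matches_citation(original, cited))
print(matches_citation(altered, cited))
```

A checksum only establishes that the bytes are unchanged. Identifying which version or timeslice was cited still depends on the persistent identifier and provenance metadata the other principles describe.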
https://www.force11.org/fairprinciples
A diverse set of stakeholders - representing academia, industry, funding agencies, and scholarly publishers - have come together to design and jointly endorse a concise and measurable set of principles for those wishing to enhance the reusability of their data holdings.
Including, but not limited to: European Open Science Cloud – High Level Expert Group
These put emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals.
NOTE: The Principles are high-level; they do not suggest any specific technology, standard, or implementation solution.
Exemplar implementations, their (level of) FAIRness and resulting value-added
• Findable
• Accessible
• Interoperable
• Reusable
Thanks to Susanna Sansone