Active research management and sharing


Active research management and sharing
Chair: Professor David De Roure, OERC

01/05/2023


Introduction
Chair: Professor David De Roure, OERC

Open science framework
Jeff Spies, Centre for Open Science

Open Science Framework
Jeffrey Spies

@jeffspies

Center for Open Science | University of Virginia


Mission
Increase openness, integrity, and reproducibility of scientific research.

Problem
The gap between scholarly values and practices.

Incentives for individual success are focused on getting it published, not getting it right.

Nosek, Spies, & Motyl, 2012

Key incentive = Publication

What is published?
• Novel results
• Positive results
• Clean results

What is not?
• Replications
• Negative/Nulls
• Mixed evidence

Norms
• Communality: open sharing
• Universalism: evaluate research on own merit
• Disinterestedness: motivated by knowledge and discovery
• Organized skepticism: consider all new evidence, even against one’s prior work
• Quality

Counternorms
• Secrecy: closed
• Particularism: evaluate research by reputation
• Self-interestedness: treat science as a competition
• Organized dogmatism: invest career promoting one’s own theories, findings
• Quantity

Anderson, Martinson, & DeVries, 2007


Problem
The gap between scholarly values and practices.

Solution
Openness.

Openness
• Increases process transparency
• Increases accountability
• Facilitates reproducibility
• Facilitates metascience
• Fosters collaboration
• Fosters inclusivity
• Fosters innovation
• Protects against lock-in
• Open Content + APIs + Open Source Services

Openness is a means to increase research quality and efficiency.

Solution
Technology can operationalize and incentivize scholarly values.

Openness is the solution; workflow makes it practical.

[Figure: research workflow cycle: Search/Discovery, Develop Idea, Design Study, Collect Data, Store Data, Analyze Data, Write Report, Publish Report]

Let experts be experts.

Scholars should focus on scholarship.


OSF

OSF Application Framework

• Workflow
• Authentication
• Permissions
• File Storage
• File Rendering
• Meta-database
• Persistence
• Integrations
• Search
• SHARE

osf.io
journals
registries
preprint servers
grants management
university systems

http://osf.io

Collaboration
Documentation
Archiving
Version Control

Other Features
• Granular permissions
• Analytics dashboards
• Forking
• Persistent, citable identifiers
• Persistent content
• Project snapshotting (i.e., registration)

Connects Services Researchers Use

Now
OpenSesame

Soon
29 grants to develop open tools and services: https://cos.io/pr/2015-09-24/


Let experts be experts.


http://osf.io/prereg


Society for the Study of Autism Registry

<menu>
  <title>Society for the Study of...</title>
  <osf-search />
  <osf-login />
</menu>
<society-questionnaire>
  Author: <osf-user name />
  Question 1: <osf-metadata />
  Analysis code: <osf-file upload />
</society-questionnaire>


Let experts be experts.


registries


preprint servers

Coming soon: http://osf.io/preprints

Substantive experts shouldn’t deal with…

• Workflow integration
• Authentication
• Permissions
• File storage
• File rendering
• Database
  – Metadata/Annotations/Commenting
• Persistence
• External service integrations
• Search
• SHARE Data

…or scaling these:

• Workflow integration
• Authentication
• Permissions
• File storage
• File rendering
• Database
  – Metadata/Annotations/Commenting
• Persistence
• External service integrations
• Search
• SHARE Data

@jeffspies | jeff@cos.io

Find this presentation at http://osf.io/u4gs7

Active research from lab to publication

Prof. Simon Coles (s.j.coles@soton.ac.uk)

Director, UK National Crystallography Service

What is “active” here?
• Continual management throughout the whole experimental process
• Immediate feedback during the experiment to inform next steps or direction

Operating a Service

[Figure: service lifecycle. Labels: Proposal, Approval, Submission, Experiment, Analysis, Publication, Admin, Reporting]

Working Across Facilities

[Figure: research lifecycle (Conceive Research, Propose, Experiment, Analyse, Publish) shown across NCS User, NCS and Central Facility. Stage labels: Proposal, Approval, Submission, Experiment, Analysis, Publication; and Proposal, Approval, Schedule, Experiment, Archive, Analyse, Publish]

LIMS: Lifecycle Management

[Figure: lifecycle stages managed by the LIMS. Labels: Proposal, Approval, Submission, Experiment, Analysis, Publication, Admin, Reporting]

Getting Mobile in the Lab


Package & Sample Tube

Bulk Sample & Manipulation

Mounted Sample

Trial Diffraction Pattern


Unstructured data

Structured Data


http://ecrystals.chem.soton.ac.uk

In the context of ‘traditional’ publishing
• ELN as Supplementary Information for conventional publication (Chemistry Central Journal 2013, 7:182)

Can we make metadata do more for us (actively)?

Formal frameworks for real-time capture required…

A semantic framework for chemistry
• Describes and relates different types of process information

elnItemManifest: high-level semantic description of ELN record
• Core Scientific Metadata model
• SIMS
• Reaction Procedures: S88
• Analytical data: Allotrope Foundation

elnItemManifest
• Layered metadata model for description, export & packaging
• This is the first (information) layer, which leads into knowledge
• Published through Dial-a-Molecule at http://wp.me/p2JoQ6-xF & in J. ChemInf 2013, 5:52

Core Scientific Metadata model as a Starting Point

• Doesn’t cover all, but…
• Forms the basis for extensions:
  - To derived data
  - To laboratory based science
  - To secondary analysis data
  - To preservation information
  - To publication data

[Figure: Core Scientific Metadata model entities: Investigation, Publication, Keyword, Topic, Investigator, Sample, Sample Parameter, Dataset, Dataset Parameter, Datafile, Datafile Parameter, Related Datafile, Parameter, Authorisation]
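The entity names on this slide (Investigation, Sample, Dataset, Datafile, Parameter and so on) suggest a nested data structure. The following is a minimal Python sketch of how such a hierarchy could be represented. The class fields and the containment relationships (an investigation holds datasets, a dataset holds datafiles and may point to a sample) are assumptions made for illustration and are not taken from the Core Scientific Metadata model specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    """A named value attached to a sample, dataset or datafile."""
    name: str
    value: str
    units: Optional[str] = None


@dataclass
class Datafile:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Sample:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    sample: Optional[Sample] = None
    datafiles: List[Datafile] = field(default_factory=list)
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    investigators: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)

    def datafile_names(self) -> List[str]:
        """Flatten the hierarchy into a list of datafile names."""
        return [f.name for ds in self.datasets for f in ds.datafiles]


def example_investigation() -> Investigation:
    """Build a hypothetical investigation with one sample and two files."""
    crystal = Sample("crystal-1", [Parameter("colour", "pale yellow")])
    frames = Dataset(
        "diffraction-run-1",
        sample=crystal,
        datafiles=[Datafile("frame_0001.img"), Datafile("frame_0002.img")],
        parameters=[Parameter("temperature", "100", "K")],
    )
    return Investigation(
        title="Example structure determination",
        investigators=["A. Researcher"],
        keywords=["crystallography"],
        datasets=[frames],
    )
```

Extensions such as derived data or publication data could be added as further classes that reference these ones, which is the sense in which the model acts as a starting point.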

SIMS: Sample Information Management System
• A standard/format for crystallographic sample and experiment data management and archival
• Supported by CrystalClear and NCS Portal, providing interaction between facility, instruments and CIF, ImgCIF etc.

Standards for reactions: S88

• Group arising from Dial-a-Molecule consisting of Mettler Toledo, Pfizer, GlaxoSmithKline, AstraZeneca, Johnson & Johnson, Southampton University, NextMove, Royal Society of Chemistry looking to:
  – Provide guidance for S88 implementations for synthetic organic chemistry reaction procedures
  – Provide example set
  – Agree on controlled vocabularies for elements
  – Generate a schema
  – IUPAC uptake?


16 (0.101 g, 0.132 mmol) and Cu(OTf)2 (0.006 g, 0.01 mmol) were added to a round-bottom flask under a N2 atmosphere. In a separate vial, 2 (0.155 g, 0.753 mmol) was dissolved in C2H4Cl2 (1.3 mL) and transferred to the reaction flask. CF3CO2H (0.030 mL, 3 equiv) was added to the reaction mixture, which was refluxed at 100 °C for 1 h. The reaction mixture was washed with saturated NaHCO3 (15 mL) and extracted with C2H4Cl2 (3 x 5 mL). The organic fractions were collected, dried (MgSO4), and filtered to give a dark red solution. The solvent was removed, and the product was purified by column chromatography (SiO2, 30:70 CH2Cl2 : hexane) to yield 17 as a pale yellow powder (0.096 g, 68% yield).
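To show what turning such a free-text procedure into structured data might look like, here is a minimal Python sketch that encodes the paragraph above as operations made up of phases, loosely following the operation/phase levels of the S88 procedural model. The class names, action names and parameter keys are hypothetical; they are not the schema or controlled vocabulary the group is working on. Once the quantities are structured, simple consistency checks become possible, for example the molar mass implied by each recorded mass and amount.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

Value = Union[str, float, int]


@dataclass
class Phase:
    """Smallest step of the procedure (hypothetical encoding)."""
    action: str
    parameters: Dict[str, Value] = field(default_factory=dict)


@dataclass
class Operation:
    """A named group of phases."""
    name: str
    phases: List[Phase] = field(default_factory=list)


def implied_molar_mass(mass_g: float, amount_mmol: float) -> float:
    """Molar mass in g/mol implied by a mass in g and an amount in mmol."""
    return mass_g / (amount_mmol / 1000.0)


# The quantities below are copied from the procedure text above.
PROCEDURE: List[Operation] = [
    Operation("charge", [
        Phase("set_atmosphere", {"gas": "N2"}),
        Phase("add", {"material": "16", "mass_g": 0.101, "amount_mmol": 0.132}),
        Phase("add", {"material": "Cu(OTf)2", "mass_g": 0.006, "amount_mmol": 0.01}),
        Phase("add", {"material": "2", "mass_g": 0.155, "amount_mmol": 0.753,
                      "solvent": "C2H4Cl2", "solvent_volume_ml": 1.3}),
        Phase("add", {"material": "CF3CO2H", "volume_ml": 0.030, "equivalents": 3}),
    ]),
    Operation("react", [
        Phase("reflux", {"temperature_c": 100, "duration_h": 1}),
    ]),
    Operation("work_up", [
        Phase("wash", {"material": "saturated NaHCO3", "volume_ml": 15}),
        Phase("extract", {"solvent": "C2H4Cl2", "portions": 3, "volume_ml_each": 5}),
        Phase("dry", {"agent": "MgSO4"}),
        Phase("filter"),
        Phase("remove_solvent"),
    ]),
    Operation("purify", [
        Phase("column_chromatography",
              {"stationary_phase": "SiO2", "eluent": "30:70 CH2Cl2 : hexane"}),
    ]),
]


def implied_molar_masses(procedure: List[Operation]) -> Dict[str, float]:
    """Map each material that has both a mass and an amount to its implied molar mass."""
    result: Dict[str, float] = {}
    for operation in procedure:
        for phase in operation.phases:
            p = phase.parameters
            if "mass_g" in p and "amount_mmol" in p:
                result[str(p["material"])] = implied_molar_mass(
                    float(p["mass_g"]), float(p["amount_mmol"]))
    return result
```

Calling `implied_molar_masses(PROCEDURE)` returns one value per weighed material, which a tool could compare against known molar masses to flag transcription errors.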

Allotrope Foundation

• Standards for analytical process in (pharma) industry


Research Data Alliance

• Chemistry data interest group
• Joint RDA/IUPAC Charter drafted
  – Characterise chemical data types
  – Leverage to establish standards
  – Examine workflows in disciplines interacting with chemistry
  – Cultivate a sharing culture

What about process?

Recording process

• Plan (Prospective provenance)
• Enactment (Retrospective provenance)
• Realisation
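The difference between a plan (prospective provenance) and its enactment (retrospective provenance) can be illustrated with a short Python sketch. Everything here is hypothetical: the step names, the record fields and the comparison function are assumptions for illustration, not part of oreChem or any ELN.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PlannedStep:
    """One step of the plan (prospective provenance)."""
    name: str


@dataclass(frozen=True)
class EnactedStep:
    """One step that was actually carried out (retrospective provenance)."""
    name: str
    started: str   # ISO 8601 timestamp recorded at run time
    operator: str


def compare(plan: List[PlannedStep],
            enactment: List[EnactedStep]) -> Tuple[List[str], List[str]]:
    """Return (planned steps never enacted, enacted steps not in the plan)."""
    planned = [s.name for s in plan]
    enacted = [s.name for s in enactment]
    skipped = [n for n in planned if n not in enacted]
    unplanned = [n for n in enacted if n not in planned]
    return skipped, unplanned


# Hypothetical plan and run log, for illustration only.
PLAN = [
    PlannedStep("mount sample"),
    PlannedStep("trial diffraction"),
    PlannedStep("full data collection"),
]

RUN = [
    EnactedStep("mount sample", "2016-01-01T09:00:00", "operator-1"),
    EnactedStep("trial diffraction", "2016-01-01T09:20:00", "operator-1"),
    EnactedStep("remount sample", "2016-01-01T09:40:00", "operator-1"),
]
```

Comparing the two records shows where the experiment departed from the plan, which is the kind of information that is lost if only the final result is recorded.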

oreChem Plan for eCrystals
• Machine-readable representation of methodology
• Describes requirements for software and data products

CREAM: Collaboration for Research Enhancement using Active Metadata
• How to collect and use metadata actively to capture tacit information
• Active metadata: assemblage of metadata and annotations used actively within the process that generates it (capable of being reused by another process).
• Central Facilities; Chemistry; Geosciences; Art; Music…
• Uptake: CODATA; Research Data Alliance

https://blog.soton.ac.uk/cream/

Managing active research in the university
Robin Rice, University of Edinburgh

Managing active research in the University of Edinburgh

Robin Rice University of Edinburgh

(@sparrowbarley)

Elements of the presentation

• Funded & unfunded research, PGR students, collaborators
• Managing & support for …
  – research grants
  – research outputs
  – research data
• Simplified research lifecycle
  – (before – during – after)

New Research Management and Administration System (grants)

Worktribe - Empowering and supporting research administrators and investigators from idea through costing, approvals, award, post-award management and closure.
• A recent requirements and procurement project focused on delivery of a new 'Worktribe Research Management' system to support improved pre-award and post-award business processes across the University.
  – 3 pilot schools / institutes November 2015
  – Go-live across university April 2016
• The new 'Worktribe Research Management' system is integrated with the existing Finance, HR and PURE systems.
• https://www.projects.ed.ac.uk/programme/rmas

Managing research outputs: guidance for authors; OJS

1. Make your work Open Access
2. Use your name consistently
3. Use an ORCID identifier
4. Institutional affiliation
5. Acknowledge your funder
6. Statement on research data
7. Cite the DOI and OA links
8. Claim your digital space

‘Ensure your research reaches the widest possible global audience, is eligible for submission in research assessment exercises, and fulfils funder requirements.’

Managing Research Data (from researcher training)

• Type, format and volume of data, chosen software for long-term access, existing data, file naming, structure, versioning, quality assurance process.
• Information needed for the data to be read and interpreted in future, metadata standards, methodology, definition of variables, format & file type of data.
• Restrict access to data, risks to data security, appropriate methods to transfer / share data, encryption.
• Secure & sufficient storage for active data, regular backups, disaster recovery.
• Make data publicly available (where possible) at the end of a project, license data, any restrictions on sharing, access controls?
• Select data to keep, decide how long data will be kept, in which repository, costs involved in long-term storage?

Day-to-day management of data

Who manages active data?

• RDM Policy (2011) sets out roles and responsibilities for researchers & the institution
  – Researchers are responsible for their work
  – Enabling role of institution
• Services, including stewardship
• Monitoring compliance
• Principal Investigators
• Research Institutes & Schools

From RDM programme to Research Data Service…
• RDM programme a result of both bottom-up (‘action group’) activity and top-down policy implementation
• Services had various providers and their purpose and names were confusing
• New single service has SO & SOM with Virtual Team across Information Services

Before

After (Simplified data lifecycle)

Tools & support for working with data


Finding and analysing data

• What is Data Library & Consultancy?
• The Data Library & Consultancy team assists researchers to discover and use datasets for analysis, learning and teaching.
• Data librarians are available to help you find answers to data-related questions.
• Tools include a data catalogue and the Survey Documentation and Analysis online data browser

Storing data

• What is DataStore?
• DataStore is file storage for active research data, and is available to all research staff and postgraduate research students (PGRs).
• DataStore provides a free individual allocation for each researcher, as well as shared group spaces. Additional capacity of virtually any size is available.

Transferring data

• What is DataSync?
• DataSync is a tool to synchronise and share research data with collaborators. It has an app to synchronise data to computers and mobile devices, and a web interface to allow access to data from any web browser.
• Data can be shared with anyone who has an email address, via the web interface.

Versioning software

• What is Subversion?
• Subversion is a version control tool which allows users to store code. It is also available as an extension called SourcEd which provides a web based collaboration tool integrated with your repository.
• When documents stored in a Subversion repository are updated the old versions are kept so you can revert if necessary. The service also allows multiple people to collaborate on documents.

Data management support

One-to-one support is available on the following areas of RDM:
• Writing and reviewing DMPs;
• Creating SOPs for metadata collection and publication;
• Creating SOPs for good data management practice;
• Choosing a data repository and preparing data for deposit and publication

Training - Online

• MANTRA: MANTRA is a free, non-credit, self-paced course designed for postgraduate students and early career researchers which provides guidelines for good practice in research data management

• Research Data Management and Sharing MOOC: This free five-week course - created by the Universities of Edinburgh and North Carolina - is designed to reach learners across disciplines and continents.


MANTRA sessions by country 2012-2016

Sessions by city: January – June 2016

www.coursera.org/learn/data-management #RDMSmooc

MOOC’s contents

• Understanding Research Data
• Data Management Planning
• Working with Data
• Sharing Data
• Archiving Data

Training workshops

• Creating a data management plan for your grant application
• Managing your research data: why is it important and what should you do?
• Working with personal & sensitive research data
• Good practice in research data management
• Handling data using SPSS

RDM training take-up, 2014-2016


THANKS
R.Rice@ed.ac.uk

Making research available - FAIR principles and Force 11
David De Roure, Oxford e-Research Centre

David De Roure @dder

Making research available: FAIR principles and FORCE11

DIRECTOR, UNIVERSITY OF OXFORD E-RESEARCH CENTRE

A Brief History of Force11
● 2008/2009:
  – Elsevier Grand challenge
● 2010:
  – Found & connected to Phil Bourne
  – Planned Dagstuhl meeting
● 2011:
  – January: Beyond the PDF, San Diego: 97 Attendees
  – August: Force11 at Dagstuhl: 34 attendees
  – November: Manifesto is published >> Force11!
● 2012: Funding Sloan
● 2013: Beyond the PDF2, Amsterdam:
  – 148 attendees, great discussion
● 2014: Working groups take off:
  – Data Citation Principles Working group
  – Resource Identifier Working group
● 2015: Force15, Oxford:
  – 257 attendees
● 2015: Force 2016, Portland, Oregon

www.force11.org

PREAMBLE. Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record.

In support of this assertion, and to encourage good practice, we offer a set of guiding principles for data within scholarly literature, another dataset, or any other research object.

1. IMPORTANCE. Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications [1].

2. CREDIT AND ATTRIBUTION. Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data [2].

3. EVIDENCE. In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited [3].

4. UNIQUE IDENTIFICATION. A data citation should include a persistent method for identification that is machine-actionable, globally unique, and widely used by a community [4].

5. ACCESS. Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data [5].

6. PERSISTENCE. Unique identifiers, and metadata describing the data and its disposition, should persist -- even beyond the lifespan of the data they describe [6].

7. SPECIFICITY AND VERIFIABILITY. Data citations should facilitate identification of, access to, and verification of the specific data that support a claim. Citations or citation metadata should include information about provenance and fixity sufficient to facilitate verifying that the specific timeslice, version and/or granular portion of data retrieved subsequently is the same as was originally cited [7].

8. INTEROPERABILITY AND FLEXIBILITY. Data citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities [8].
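Principle 7 asks for fixity information that lets a reader verify that the data retrieved later is the same as the data originally cited. One common way to do this is to store a cryptographic checksum alongside the citation. The sketch below uses SHA-256 from Python's standard library; the record layout, the function names and the placeholder identifier are assumptions for illustration and are not prescribed by the principles.

```python
import hashlib
from typing import Dict


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_citation_record(identifier: str, version: str, data: bytes) -> Dict[str, str]:
    """Bundle an identifier and version with a checksum of the cited bytes."""
    return {
        "identifier": identifier,
        "version": version,
        "sha256": sha256_hex(data),
    }


def matches_citation(record: Dict[str, str], retrieved: bytes) -> bool:
    """Check that retrieved bytes are identical to the bytes that were cited."""
    return sha256_hex(retrieved) == record["sha256"]


# Hypothetical dataset contents and a placeholder identifier.
ORIGINAL = b"sample_id,value\nA1,0.101\nA2,0.155\n"
RECORD = make_citation_record("doi:10.0000/example-dataset", "v1", ORIGINAL)
```

A checksum only detects change; it says nothing about which version is correct, so it complements rather than replaces the version and provenance information the principle also calls for.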

DC 1

ENDORSE PRINCIPLES: WWW.FORCE11.ORG/DATACITATION

AUTHORS: Data Citation Synthesis Group FORCE11.org/datacitation/workinggroup
REFERENCES: FORCE11.ORG/DCREFERENCE

The Joint Declaration of Data Citation Principles

https://www.force11.org/fairprinciples

A diverse set of stakeholders - representing academia, industry, funding agencies, and scholarly publishers - have come together to design and jointly endorse a concise and measurable set of principles, for those wishing to enhance the reusability of their data holdings

Including, but not limited to:

European Open Science Cloud – High Level Expert Group

These put emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals

NOTE: The Principles are high-level; do not suggest any specific technology, standard, or implementation-solution

Exemplar implementations, their (level of) FAIRness and resulting value-added

• Findable
• Accessible
• Interoperable
• Reusable

Thanks to Susanna Sansone
