19
Andrew Hufton Managing Editor, Data Nature Publishing Group Scientific Data An open-access, peer-reviewed platform for data-focused publications

Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

Andrew Hufton!Managing Editor, Data!Nature Publishing Group!!!

!!

Scientific Data An open-access, peer-reviewed platform for data-focused publications"

Page 2: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

2 2

Data, data, data"Depositions of datasets in archives continue to grow, surpassing journal articles !in biomedical research!!

Growth of biomedical research publications (red; current total >19 million), alongside the accumulation of research data, including nucleic acid sequences (black; current total ~163 million), computer-annotated protein sequences (magenta; current total 9 million), manually annotated protein sequences (green; current total 500,000) and protein structures (blue; current total 60,000)!

Source: Biochemical Journal 2009 424, 317-333 - Teresa K. Attwood, Douglas B. Kell and others."

Page 3: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

3 3

An illustrative example…."

DNA Structure 1953 1 Page

2 Authors 1 Figure no data

Page 4: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

4 4

Human Genome 2001 62 Pages, 150 Authors,

49 Figure, 27 tables

An illustrative example…."

Page 5: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

5 5

Existing challenges"

•  Data producers do not necessarily get appropriate credit for their work!

•  The peer review process at research journals is not ideal for ensuring data release and data standards!

•  Data and info about datasets often ends in supp. material!

•  Valuable datasets are not released !

Page 6: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

6 6

A Role for Data Publications ""

GigaScience F1000R ZooKeys Ecological Archives and soon Scientific Data…

Page 7: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

7 7

Why more data?"•  To ensure that the scientific

process is transparent and reproducible!

•  To accelerate new discoveries, and reduce unnecessarily repeated work!

!!

Page 8: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

8 8

Release

Discover Reuse

DATA

Page 9: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

9 9

If we want scientists to release their data, we need to provide a credit mechanism"

Page 10: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

10 10

If we want scientists to release their data, we need to provide a credit mechanism"

PUBLICATION

Page 11: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

11 11

If want released data to be reusable, we need critical evaluation to verify experimental rigor and the completeness of the description"

Page 12: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

12 12

If want released data to be reusable, we need critical evaluation to verify experimental rigor and the completeness of the description"

PEER REVIEW

Page 13: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

13 13

If we want scientists to be able to find datasets that will accelerate their research, then datasets need to be searchable & discoverable"

Page 14: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

14 14

If we want scientists to be able to find datasets that will accelerate their research, then datasets need to be searchable & discoverable"

STANDARDIZED, CURATED DESCRIPTIONS

Page 15: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

15 15 15

The Data Publicationseamless integration between research findings and data!

SciData DD

Structured content

ISA_tab

Datasets

Interoperate with Community resources

Code Workflows

Advanced Search and Discovery functions

SciData DD

Structured content

SciData DD

Structured content

SciData DD

Structured content

Nature Methods

Scientific Reports

Nature Genetics

Machine-readable exports

Page 16: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

16

Use Example"

SciData DD

Structured content

SciData DD

Structured content

SciData DD

Structured content

SciData DD

Structured content

SciData DD

Structured content

SciData DD

Structured content

SciData DD

Structured content

NPG Journal

Reports Expression Fingerprint in Bone Cancer

Details on underlying expression data, help reader download and reanalyze data

Browse other Bone Cancer Expression Datasets

Data Descriptor Search

KIDNEY

HUMAN

MICROARRAY

tissue

organism

technology

Is the pattern observed in other similar samples?

Is the pattern observed in the cancer I am studying?

Export Datasets and Reanalyze

Export Datasets and Reanalyze

Page 17: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

17 17

Scientific Data Concept Overview"Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

Rapid peer-review and publication evaluate the technical quality of the procedures used to generate the data, the reuse value of the resulting datasets, the completeness of the data description, and alignment of with existing community standards. ! !Reuse: complete and standardized descriptions help others discover and reuse your data, and discover your related research articles. ! !Diverse data-types: Scientific Data will initially focus on experimental and observational datasets from the life, biomedical and environmental science communities, but will be open to content from a wide range of experimental disciplines.! !Cooperation with a broad set of public, community-recognized repositories will promote existing standards, and integration with generalist scientific repositories like Dryad and FigShare will make it easy to deposit diverse data-types. ! !Open access: Data Descriptors will be released freely to the public under an open access license.! !Complementary: Data Descriptors can be used to describe both datasets that are associated with a traditional research article, as well as standalone datasets.!

Page 18: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

18 18

Focused Scope complements both journal articles and repository records"

Includes!–  Highly detailed, reproducible methods descriptions!–  Quality control & technical validation experiments!–  Searchable, machine-readable meta-data!

Does Not Include!–  In depth analysis or tests of hypotheses!–  New scientific conclusions!–  Exploratory analysis (e.g. clustering)!

Page 19: Andrew Hufton Managing Editor, Data - HMS LINCS Project€¦ · 17 Scientific Data Concept Overview" Credit: citeable, peer-reviewed mechanism for recording credit for dataset creation.!

19 19

Thanks!"

Andrew Hufton [email protected]!Managing Editor, Data!!