Beyond supplementary material: sharing data effectively through repositories and data journals

Andrew L Hufton
Managing Editor, Scientific Data

andrew.hufton@nature.com

1. Science is about trust (with evidence)

A. L. Hufton, Lausanne 2017

Opus Majus: “Theories supplied by reason should be verified by sensory data, aided by instruments, and corroborated by trustworthy witnesses.”

Opus Tertium: “The strongest argument proves nothing so long as the conclusions are not verified by experience.”

Roger Bacon, 1214–1292

Backing up your claims with evidence has always been a key part of science

Without proper data handling things can go wrong…

Raising reporting standards for data description

(a) Western blot of cell lysates of control and Rac1-siRNA-treated MTLn3 cells, blotted for Rac1 and β-actin. A representative image is shown from 3 blots. (b) MTLn3 cells transfected with control or Rac1 siRNA and plated on Alexa-405-conjugated gelatin overnight. Arrows point to invadopodia and sites of degradation. Scale bars, 10 μm. Representative image sets are shown from 50 image sets each for the control and Rac1 siRNA. (c) Quantification of mean degradation area per cell from b, including Rac1 inhibitor NSC23766 treatment at 100 μM. n = 60 fields for each condition, pooled from 5 independent experiments; error bars are s.e.m. Student’s t-test was used. **P = 0.00022, ^^P = 0.011639. Uncropped images of blots are shown in Supplementary Fig. 9.

Nature Cell Biology 16, 571–583 (2014) doi:10.1038/ncb2972

Checklist to improve figure legends and reporting: http://www.nature.com/authors/policies/checklist.pdf

• statement of replication
• definition of n
• definition of statistical tests
• raw source data

Large-scale data sharing is also not new: whole fields have been born from effective data sharing

• Macroeconomics grew from the sharing of government data with economists
• Meteorology and climate science: There has been international sharing of weather data for over 100 years
• Bioinformatics: Made possible by open sharing of protein structure and DNA sequence data

Sieber, J. E. Data Sharing in Historical Perspective (2015) http://www.socialsciencespace.com/2015/09/data-sharing-in-historical-perspective/
Hufton, A. L. Sharing the structures doi:10.1038/nature13369 (2014)

Data sharing advances science

Data sharing is in the public interest

• We encourage our authors to share data related to public health emergencies as soon as possible, even prior to peer review or publication.

Nature 518, 477–479 (2015)

2. Data sharing: everyone’s doing it

The current situation

• Most researchers are sharing data, and using the data of others
• Direct contact between researchers (on request) is a common way of sharing data
• Repositories are the second most common method of sharing

Kratz JE, Strasser C (2015) Researcher Perspectives on Publication and Peer Review of Data. PLoS ONE 10(2): e0117619.

Fundamental sharing policy for Nature and the Nature research journals

An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims. A condition of publication in a Nature journal is that authors are required to make materials, data, code, and associated protocols promptly available to readers without undue qualifications. Any restrictions on the availability of materials or information must be disclosed to the editors…[and]…in the submitted manuscript.

Supporting data must be made available to editors and peer-reviewers at the time of submission for the purposes of evaluating the manuscript.

See http://www.nature.com/authors/policies/availability.html

Sharing upon request has problems

[Figure: data are extant (percent of publications, axis from 0% to 50%) plotted against age of publication (years, axis from 0 to 25).]

Replotted from: Vines et al. Current Biology (2014) doi:10.1016/j.cub.2013.11.014

Raw data at Dryad doi:10.5061/dryad.q3g37

Data-access practices strengthened in Nature journals

• Clear preference for sharing large datasets via public repositories.
• Enforce data deposition in fields where there is strong community consensus
• List of public data repositories now maintained by Scientific Data
• Encourage authors to publish Data Descriptors at Scientific Data
  • before, with or after the analysis paper
  • editors work with authors
• Data availability statements required

3. Share your data in a useful manner

Open data is about more than disclosure: it must be “FAIR”

• Findable
• Accessible
• Interoperable
• Re-usable

How do you make your data useful?

Wilkinson et al. Sci. Data doi:10.1038/sdata.2016.18 (2016)

Publish it as supplementary information

• Often sufficient for small datasets
• Generally stable over time
• Generally compatible with a range of information types

But,
• Often poorly curated and not always thoroughly reviewed; quality varies significantly
• Not often machine-readable
• Hard to cite separately, lack of credit

Poor data layout

      A           B        C        D    E
 1                Group1   Group2
 2    Day 0
 3    Sodium      139      142
 4    Potassium   3.3      4.8
 5    Chloride    100      108
 6    BUN         18       18
 7    Creatine    1.2      1.2
 8    Uric acid   5.5*     6.2*
 9    Day 7
10    Sodium      140      146
11    Potassium   3.4      5.1
12    Chloride    97       108

S1Sh.cuo

• Meaningless column titles
• Special characters can cause text mining errors
• No units
• Unhelpful document name
• Undefined abbreviation
• Formatting for information that should be in metadata
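The point about special characters can be illustrated with a small sketch. It assumes a naive text-mining script that converts every cell with `float()`. The helper names `naive_parse` and `parse_cell`, and the idea of keeping the asterisk as a separate flag, are illustrative assumptions and do not come from the slides.

```python
def naive_parse(cell):
    """Convert a cell to a number the way a simple script might."""
    return float(cell)


def parse_cell(cell):
    """Split a cell such as '5.5*' into (value, flag).

    Hypothetical helper: storing the marker as a separate flag is an
    assumed convention, not something prescribed by the slides.
    """
    text = cell.strip()
    number = text.rstrip("*")      # drop trailing asterisks only
    flag = text[len(number):]      # the stripped marker, e.g. "*"
    return float(number), flag


# Cells taken from the poorly laid out table (uric acid and sodium rows).
cells = ["5.5*", "6.2*", "139"]

failures = []
for cell in cells:
    try:
        naive_parse(cell)
    except ValueError:
        # float("5.5*") raises ValueError, so the value is lost or mis-read.
        failures.append(cell)

parsed = [parse_cell(cell) for cell in cells]
print(failures)
print(parsed)
```

In this sketch the two flagged cells fail the naive conversion, while the number and its marker survive once they are kept apart, which is one way of moving the annotation out of the data values.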


Better data layout

      A           B     C         D         E       F
 1    Parameter   Day   Control   Treated   Units   P
 2    Sodium      0     139       142       mEq/l   0.82
 3    Sodium      7     140       146       mEq/l   0.70
 4    Sodium      14    140       158       mEq/l   0.03
 5    Sodium      21    143       160       mEq/l   0.02
 6    Potassium   0     3.3       4.8       mEq/l   0.06
 7    Potassium   7     3.4       5.1       mEq/l   0.07
 8    Potassium   14    3.7       4.7       mEq/l   0.10
 9    Potassium   21    3.1       3.6       mEq/l   0.52
10    Chloride    0     100       108       mEq/l   0.56
11    Chloride    7     97        108       mEq/l   0.68
12    Chloride    14    101       106       mEq/l   0.79

Table_S1_Shanghai_blood.xls
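A layout with one observation per row and named columns can be read directly by a script. The sketch below assumes the first rows of the table have been exported as CSV text (the slide shows an .xls file). The variable names and the 0.05 threshold are illustrative assumptions.

```python
import csv
import io

# Assumed CSV export of the first five data rows of the better layout.
TIDY_CSV = """Parameter,Day,Control,Treated,Units,P
Sodium,0,139,142,mEq/l,0.82
Sodium,7,140,146,mEq/l,0.70
Sodium,14,140,158,mEq/l,0.03
Sodium,21,143,160,mEq/l,0.02
Potassium,0,3.3,4.8,mEq/l,0.06
"""

# Each row becomes a dict keyed by the column titles in the header row.
rows = list(csv.DictReader(io.StringIO(TIDY_CSV)))

# Select parameter/day pairs whose P value is below an assumed 0.05 cut-off.
significant = [
    (row["Parameter"], int(row["Day"]))
    for row in rows
    if float(row["P"]) < 0.05
]
print(significant)
```

Because the units and the day are ordinary columns, no cell formatting or row position has to be interpreted. The same filter on the poor layout would need hand-written rules for the "Day 0" and "Day 7" separator rows.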


Know the relevant standards in your community

• Many communities have developed specific guidelines for reporting certain kinds of data
• Check journal guidelines to see what is required.
• Can help you format your data in a useful manner
• Browse information on over 600 reporting standards (https://biosharing.org)
• Find standards that are relevant to your type of data

4. Find the right repository for your data

What to look for in a data repository

• Quality curation
• A commitment to long-term preservation
• Features that support collaborative analysis
• Features that allow you to keep data private until you are ready to publish.
• Investigate data archiving options at your institution

Browse our recommended data repositories online.
• We currently list more than 90 repositories, across the biological, physical and social sciences
• We advise authors on the best place to store their data

http://www.nature.com/sdata/policies/repositories

5. Publish your data

The data paper

• A clear, peer-reviewed description of data, to maximize usage
• Citable publications that give credit for reusable data
• Several journals publish data paper formats, including GigaScience, Earth System Science Data, and at Nature, Scientific Data

Get Credit for Sharing Your Data: Publications will be indexed and citeable.

Open-access: Articles are published by default under a Creative Commons Attribution licence (CC BY). Each publication is supported by CC0 metadata.

Focused on Data Reuse: All the information others need to reuse the data; no interpretative analysis or hypothesis testing.

Peer-reviewed: Rigorous peer-review focused on technical data quality and reuse value.

Promoting Community Data Repositories: Not a new data repository; data stored in community data repositories.

Launched in May 2014

The Data Descriptor article-type

Data Descriptor: focus on data reuse

Sections:
• Title
• Abstract
• Background & Summary
• Methods
• Data Records
• Technical Validation
• Usage Notes
• Figures & Tables
• References
• Data Citations

• Detailed descriptions of the methods and technical analyses supporting the quality of the measurements.
• Does not contain tests of new scientific hypotheses

A Data Descriptor

• Screen results and in-depth analysis published in 2011 at PNAS
• Full screen data published at Scientific Data in 2014
• Data at figshare
• Data Descriptor cited 89 times according to Google Scholar!

Neuroscience

• Previously unpublished dataset
• Data in OpenfMRI
• Code in GitHub
• Cited 39 times according to Google Scholar
• Authors ran an analysis challenge after publication

Think beyond your own data

• cite the data of others when you use it
• be a data-minded peer-reviewer
• use data responsibly

Get the most from your data

• Preserve it
• Encourage reuse
• Get credit
• Encourage others to do the same

Now launched!

Visit nature.com/scientificdata

Email scientificdata@nature.com

Tweet @ScientificData

Managing Editor: Andrew L. Hufton, andrew.hufton@nature.com

Honorary Academic Editor: Susanna-Assunta Sansone

Data Curation Editor: Varsha Khodiyar

Senior Publishing Assistant: Joseph Salter

Head of Data Publishing: Iain Hrynaszkiewicz

Thanks!
