Page 1: Enhancing Access to Data in Scholarly Research

Enhancing Access to Data in Scholarly Research

Changing Cultures, Building Standards

Linda Beebe, Senior Director, PsycINFO

Page 2: Enhancing Access to Data in Scholarly Research

ICSTI Annual Meeting 2012

About 12 years ago Supplemental Materials emerged with a bang!

Page 3: Enhancing Access to Data in Scholarly Research

Technology allowed us to add almost anything outside the article. . .

And authors and publishers did:
◦ Text (extended methodology sections, bibliographies, survey results, derivations. . .)
◦ Tables and figures
◦ Multimedia
◦ Gene sequences, protein structures, chemical compounds, structures, 3-D images
◦ Computer programs: algorithms, code, executables
◦ Datasets, and raw research data

Page 4: Enhancing Access to Data in Scholarly Research

We had rapid, unplanned growth.

◦ No standards
◦ Very different cultures and practices from one discipline to another
◦ Inconsistent identifiers
◦ Poor metadata
◦ Lack of discovery tools
◦ Abuse of readers and reviewers

Page 5: Enhancing Access to Data in Scholarly Research

NISO-NFAIS Recommended Practices (Nearing Final Publication)

Business Policies & Practices cover selecting, editing, hosting, assuring discoverability, referencing, packaging, maintaining links, providing context, and preserving.

Technical Recommendations emphasize metadata, persistent identifiers, preservation, packaging and exchange.
◦ Bi-directional linking using DOIs, with emphasis on persistent linking reliability.
◦ Flexibility and simplicity to support either a simple approach or the most detailed and granular metadata.
◦ Clear definitions of metadata elements.
◦ Attention to preservation and migration, including saving of objects along the migration chain.

Page 6: Enhancing Access to Data in Scholarly Research

Today the buzz is around raw data.

◦ The following 2 slides from Howard Ratner are a good reminder of the growth.
◦ Borrowed with permission from his talk at the December 2011 STM Innovations meeting.
◦ Ideas generated by the STM Future Lab Committee.

Page 7: Enhancing Access to Data in Scholarly Research

Important Topic #1: API Platforms* (New Access to Content)

*API platforms for third-party developers available at Elsevier, Springer, NPG, IEEE (search). Getting ready for launch: IoPP, T&F, CABI. Many more expected to follow.

Diagram labels:
◦ NEW ACCESS TO CONTENT
◦ Curiosity driven R&D
◦ GRANULARITY OF CONTENT
◦ SEMANTICS
◦ LET THE OUTSIDE WORLD IN
◦ OUR CONTENT YOUR WAY
◦ CREATE CROSS-PUBLISHER STANDARDS
◦ Common metadata
◦ Full text formats
◦ HTML5
◦ API PLATFORMS
◦ XHTML
◦ THIRD PARTY APPS
◦ App store
◦ LINKED DATA
◦ LINKED OPEN DATA
◦ RDF
◦ MOBILE PRODUCTIVITY
◦ MULTI-DEVICE PRODUCTIVITY
◦ Seamlessly linked platforms
◦ M-commerce
◦ MOBILE
◦ Transmedia items
◦ Voice Activation

Page 8: Enhancing Access to Data in Scholarly Research

Important Topic #2: Research Data (New Presentations for Re-use)

Diagram labels:
◦ RESEARCH DATA
◦ DATA OBJECTS ARE FIRST CLASS RESEARCH OBJECTS
◦ MAKE DATA INTERACTIVE
◦ share the actual workflow of the researcher?
◦ graphics represent data sets; how to open them up?
◦ ACTIONABLE DATA
◦ DATA CREATION
◦ What formats do users want?
◦ COMMON STANDARDS
◦ AUTHORING TOOLS
◦ how to treat supplemental files to journals?
◦ Guidelines for: reuse and sharing; incentives and barriers; editorial policies
◦ Discoverability of data
◦ BIG DATA
◦ Deep Linking
◦ REPOSITORIES
◦ DATACITE
◦ Bibliographic tools
◦ User behaviour
◦ Mendeley
◦ CiteSeer
◦ ColWiz
◦ ReadCUBE
◦ Data journal

Page 9: Enhancing Access to Data in Scholarly Research

Different Cultures & Practices

◦ “Hard sciences” such as Physics and Chemistry: long history of handling supplemental material and requiring access to data.
◦ Disciplines that study human subjects (psychology, sociology, health sciences): far less likely to have such practices.
◦ There is growing interest in standards and other support for data deposits and access.

Page 10: Enhancing Access to Data in Scholarly Research

The Divide on Data Deposits

Study of Matter
◦ AAAS: must deposit in approved repository.
◦ ACS: must submit data and deposit.
◦ AGU: must deposit data in approved repository.
◦ ASPB: must submit to journal.

Study of Humans
◦ APA: to date only expected to supply for verification.
◦ APS: no requirements.
◦ ASA: no requirements posted.
◦ AAA: no requirements posted.

Page 11: Enhancing Access to Data in Scholarly Research

In the “softer” sciences, increased quantities of data are scattered on laptops, in file drawers, on the web—all in danger of being lost, even thrown away.

Question: how do we preserve these data and make them available for further research?

Page 12: Enhancing Access to Data in Scholarly Research

Actually, there are many questions

◦ What constitutes data?
◦ What must the author do to it?
◦ Who will maintain it?
◦ What about confidentiality?
◦ How does one cite data?
◦ . . . And many more

Page 13: Enhancing Access to Data in Scholarly Research

What constitutes data?

◦ Webster's: factual information (as measurements or statistics) used as a basis for reasoning, discussion, or calculation.
◦ Chaim Zins (2006): statistical observations and other recordings or collections of evidence.
◦ NSF: any information that can be stored in digital form and accessed electronically, including, but not limited to, numeric data, text, publications, sensor data streams, video, audio, algorithms, software, models and simulations, images, etc.
◦ Altman & King: systematic compilation of measurements for machine reading; must be systematically organized and described.

Page 14: Enhancing Access to Data in Scholarly Research

Report on Integration of Data and Publications, October 17, 2011. Susan Reilly, Wouter Schallier, Sabine Schrimpf, Eefke Smit, and Max Wilkinson. Retrieved 10/11/2012 from http://www.stm-assoc.org/2011_12_5_ODE_Report_On_Integration_of_Data_and_Publications.pdf

Page 15: Enhancing Access to Data in Scholarly Research

What must an author do?

Replication standard: sufficient information to enable a third party to replicate with no additional information from the author (King 1995). So authors must:
◦ Provide clear metadata.
◦ Code consistently and list coding instructions.
◦ Explain how data were used.
◦ Provide all raw data.
◦ Organize data in a way that can be used by others.

Making data available requires a different workflow and more work, but makes for a better scientist.

Page 16: Enhancing Access to Data in Scholarly Research

Who will maintain the data?

◦ Natural sciences: many options, such as Crystallography, ChemStar, ChemSpider, PubChem, PANGAEA.
◦ Life sciences: Protein Data Bank, now one entity with data from former banks in US, Europe, and Japan. Also Dryad, National Biological Information Infrastructure.
◦ Not so many options in Social Sciences.

Page 17: Enhancing Access to Data in Scholarly Research

Two options in Social Sciences

Inter-university Consortium for Political and Social Research (ICPSR)
◦ U of Michigan
◦ Data deposit and management
◦ Publication-Related Archive quickly available, but ICPSR does not process.

Institute for Quantitative Social Science (IQSS) Dataverse Network
◦ Harvard
◦ Maintains dataverses (individual repositories).
◦ Delivers formal persistent citations.

Page 18: Enhancing Access to Data in Scholarly Research

What about confidentiality and attribution?

IQSS Dataverse Network terms and conditions (paraphrased):
◦ Agree not to use materials to obtain information that could ID subjects in any way, produce links that could ID them, or do anything that could constitute invasion of privacy or breach of confidentiality.
◦ Also, will not download or use in any way prohibited by applicable law.
◦ And will always include the bibliographic citation for the data in any publication that references the data.

Page 19: Enhancing Access to Data in Scholarly Research

And how does one cite data?

Like any citation, it must contain basic elements that identify the dataset as unique: Title, Author, Date, Version, Persistent Identifier.

DataCite, the organization that manages DOIs for data, recommends: Creator (PublicationYear): Title. Version. Publisher. ResourceType. Identifier.

Example: Irino, T; Tada, R (2009): Chemical and mineral compositions of sediments from ODP Site 127‐797. Geological Institute, University of Tokyo. http://dx.doi.org/10.1594/PANGAEA.726855
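The citation pattern on this slide can be sketched in a few lines of Python. This is only an illustration: the class name `DatasetCitation`, its field names, and the `render` method are assumptions made for the example, not part of any DataCite tool. Version and resource type are treated as optional because the slide's own example omits them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the citation elements named on the slide."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    identifier: str
    version: Optional[str] = None        # optional, absent in the slide's example
    resource_type: Optional[str] = None  # optional, absent in the slide's example

    def render(self) -> str:
        # Creator (PublicationYear): Title. Version. Publisher. ResourceType. Identifier
        parts = [f"{'; '.join(self.creators)} ({self.year}): {self.title}."]
        if self.version:
            parts.append(f"{self.version}.")
        parts.append(f"{self.publisher}.")
        if self.resource_type:
            parts.append(f"{self.resource_type}.")
        parts.append(self.identifier)
        return " ".join(parts)


# The dataset cited on the slide, rebuilt from its elements.
example = DatasetCitation(
    creators=["Irino, T", "Tada, R"],
    year=2009,
    title="Chemical and mineral compositions of sediments from ODP Site 127-797",
    publisher="Geological Institute, University of Tokyo",
    identifier="http://dx.doi.org/10.1594/PANGAEA.726855",
)
print(example.render())
```

Keeping the elements as separate fields, rather than one free-text string, is what lets a publisher or repository check that none of the identifying elements is missing before the citation is printed.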

Page 20: Enhancing Access to Data in Scholarly Research

Another issue: fixity. . .

How do we know the data have not changed?

Altman & King (2007) advocated the Universal Numeric Fingerprint (UNF): a short fixed-length string of numbers and characters.

Example: UNF: 3: ZNQRI1405389xOBffg?== in which the 3 is the version number and the suffix is the fingerprint. If that number changes, the set is a new version of the data.
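The idea behind a fixity fingerprint can be illustrated with a simplified stand-in. The sketch below is not the UNF algorithm itself: it assumes a plain SHA-256 hash over values rendered to a fixed number of significant digits, and the names `normalize`, `fingerprint`, and the `DEMO` label are invented for the example. It shows only the general principle that the same data always yields the same short string, and changed data yields a different one.

```python
import base64
import hashlib
from typing import Iterable, Union

Value = Union[int, float, str]


def normalize(value: Value, digits: int = 7) -> str:
    """Render a value canonically so that equal data produces equal text."""
    if isinstance(value, (int, float)):
        # Fixed significant digits: 1, 1.0 and 1.00 all become the same text.
        return format(float(value), f".{digits - 1}e")
    return str(value)


def fingerprint(values: Iterable[Value], label: str = "DEMO", version: int = 1) -> str:
    """Return a short fixed-length fingerprint for a sequence of values."""
    digest = hashlib.sha256()
    for value in values:
        digest.update(normalize(value).encode("utf-8"))
        digest.update(b"\x00")  # separator keeps ["ab", "c"] distinct from ["a", "bc"]
    # Truncate to 16 bytes so the encoded suffix always has the same length.
    suffix = base64.b64encode(digest.digest()[:16]).decode("ascii")
    return f"{label}:{version}:{suffix}"


original = fingerprint([1, 2.0, "control"])
same_data = fingerprint([1.0, 2, "control"])
edited = fingerprint([1, 2.5, "control"])
print(original == same_data)  # representation differs, data does not
print(original == edited)     # one value changed
```

Hashing a normalized rendering rather than the raw file is the design choice that matters here: the fingerprint then follows the data, not the file format the data happens to be stored in.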

Page 21: Enhancing Access to Data in Scholarly Research

The importance of citations. . .

◦ Just like citing other sources of information: encourages findability, credits the creator, makes any impact trackable.
◦ Promotes more and better science, as it enables reuse and verification of data.
◦ Rewards the data producer, which may encourage others to deposit data.

Page 22: Enhancing Access to Data in Scholarly Research

Some Advocates for a Culture of Data Citation

◦ DataCite: very international, with members around the world (CDL & Purdue are US members; Microsoft & ICPSR are associates).
◦ CODATA: International Council for Science, Committee on Data for Science & Technology.
◦ International Association for Social Science Information Services & Technology.
◦ Data-PASS: Data Preservation Alliance for the Social Sciences, a membership organization of archives and research centers to date.

Page 23: Enhancing Access to Data in Scholarly Research

Joint STM-DataCite Statement

◦ Linkability and Citability of Research Data
◦ Responsibilities for researchers, data archives, publishers
◦ Co-responsibility for bi-directional linking between datasets and publications using persistent identifiers
◦ Support for data reuse
◦ Issued in June 2012
◦ Joined by CrossRef in July
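Bi-directional linking between datasets and publications can be pictured as an index that is written in both directions at once. The sketch below is a toy in-memory version under stated assumptions: the class `LinkRegistry` and its methods are invented for illustration, and the identifiers are made-up placeholders, not real DOIs. It does not represent how DataCite, CrossRef, or any publisher actually stores links.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class LinkRegistry:
    """Hypothetical in-memory index of publication-dataset links."""

    def __init__(self) -> None:
        self._datasets_by_publication: DefaultDict[str, Set[str]] = defaultdict(set)
        self._publications_by_dataset: DefaultDict[str, Set[str]] = defaultdict(set)

    def link(self, publication_id: str, dataset_id: str) -> None:
        # Record both directions in one step so neither side can go missing.
        self._datasets_by_publication[publication_id].add(dataset_id)
        self._publications_by_dataset[dataset_id].add(publication_id)

    def datasets_for(self, publication_id: str) -> List[str]:
        # .get avoids creating empty entries for unknown identifiers.
        return sorted(self._datasets_by_publication.get(publication_id, set()))

    def publications_for(self, dataset_id: str) -> List[str]:
        return sorted(self._publications_by_dataset.get(dataset_id, set()))


# Placeholder identifiers, invented for this example.
registry = LinkRegistry()
registry.link("doi:10.5555/example.article.1", "doi:10.5555/example.dataset.1")
registry.link("doi:10.5555/example.article.2", "doi:10.5555/example.dataset.1")

print(registry.datasets_for("doi:10.5555/example.article.1"))
print(registry.publications_for("doi:10.5555/example.dataset.1"))
```

A reader starting from the article can reach the data, and a reader starting from the dataset can find every article that used it, which is the property the statement asks researchers, archives, and publishers to maintain together.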

Page 24: Enhancing Access to Data in Scholarly Research

Need for collaboration among all major participants in the Research Cycle

◦ Researcher
◦ Institution
◦ Funder
◦ Publisher
◦ Data Manager

Page 25: Enhancing Access to Data in Scholarly Research

Funder/Research/Publishing Connections

◦ Funder mandates for data sharing plans encourage new thinking from some disciplines.
◦ Connection with the publications is needed.
◦ FundRef: new initiative within CrossRef.
◦ Collaboration between publishers and funders to make connections between grants and resulting publications.
◦ Pilot for publishers to create and submit standard metadata with funder name and grant number.
◦ Working group includes several publishers and funders. http://www.crossref.org/fundref/index.html

Page 26: Enhancing Access to Data in Scholarly Research

ORCID: Another Example

◦ Established to solve the name ambiguity problem in scholarly communications by creating a registry of persistent unique identifiers for individual researchers.
◦ Provides an open and transparent linking mechanism between ORCID, other identifiers, and research objects: pubs, grants, patents, etc.
◦ Governed by a board representing all stakeholders.
◦ Launching this month. http://about.orcid.org/

Page 27: Enhancing Access to Data in Scholarly Research

And VIVO still another . . .

◦ Designed to facilitate information exchange about research and scholarship.
◦ Funded by NIH, National Center for Research Resources.
◦ Initially, 7 academic institutions, but growing.
◦ APA has instance: www.vivo.apa.org
◦ Semantic web of information to support interconnectedness and trust, and to support maintenance of research data.

Page 28: Enhancing Access to Data in Scholarly Research

Data standards, changing cultures, new infrastructures will help us avoid the tumult we’ve experienced with supplemental materials.

Page 29: Enhancing Access to Data in Scholarly Research

In Psychology: a new model for data sharing at APA

◦ Past expectation: psychologists do not withhold data and will share for verification of results.
◦ New expectation: authors must agree to share data.

Page 30: Enhancing Access to Data in Scholarly Research

New Journal Open in Every Regard

Page 31: Enhancing Access to Data in Scholarly Research

Sharing broadly a rare event in the past

Psychologists worried:
◦ Potential nefarious uses: unscrupulous people could twist the data or hector people the author is trying to help.
◦ Well-intentioned but inept secondary analysis: they might get it wrong!
◦ Loss of potential publications for self: I haven’t written all my articles from this data!
◦ But most common fear: loss of academic credit for what may be years of data collection.

Page 32: Enhancing Access to Data in Scholarly Research

A radical change for psychology. . .

Archives of Scientific Psychology is very different from other APA journals in 4 regards:
◦ Authors must submit data to APA or approved repository.
◦ Journal is electronic only.
◦ It is an open access/author pays model.
◦ Authors must submit two methods sections: 1 scientific and 1 in lay language.

Page 33: Enhancing Access to Data in Scholarly Research

Data Collaboration Most Radical Aspect

Authors sign a Collaboration Agreement specifying that others may reuse their data.

Researchers who wish to reuse the data must sign a Collaboration Agreement stating:
1. They will not do anything to reveal identity of subjects.
2. They will not engage in “gotcha” publishing: running analyses to prove the author wrong and publishing the results.
3. They will offer the original data collector co-authorship.

Page 34: Enhancing Access to Data in Scholarly Research

APA’s Goals

◦ Change the paradigm for use and reuse of data in psychological research by assuring full attribution and credit for the original creator of the data.
◦ Contribute to the culture of transparency and prevention of fraud in science.
◦ Maintain APA’s high standards for peer-reviewed literature and contributions to science.

The jury is still out, but manuscripts are coming in.

Page 35: Enhancing Access to Data in Scholarly Research

As all the participants in the scholarly communications process work to enhance access to data, there undoubtedly will be more revolutionary changes.

Page 36: Enhancing Access to Data in Scholarly Research

Thanks for Listening!

By building standards, we can change cultures.

Linda Beebe, Senior Director, PsycINFO
American Psychological Association
[email protected]
www.apa.org/pubs/index.aspx