ASIS&T Data Research Access and Preservation Summit: Legal and Social Implications of Shared Collections. Phoenix, AZ, April 10, 2010. Melissa Cragin, Center for Informatics Research in Science and Scholarship, GSLIS – UIUC.

Small science research and the data sharing strata


Page 1: Small science research and  the data sharing strata

ASIS&T DATA RESEARCH ACCESS AND PRESERVATION SUMMIT

-LEGAL AND SOCIAL IMPLICATIONS OF SHARED COLLECTIONS-

PHOENIX, AZ

APRIL 10, 2010

MELISSA CRAGIN

CENTER FOR INFORMATICS RESEARCH IN SCIENCE AND SCHOLARSHIP

GSLIS – UIUC

Small science research and the data sharing strata

Page 2: Small science research and  the data sharing strata

What are the social contexts under which research communities assemble to share and manage data?

Small Science

Investigating Data Communities

Data Curation Profiles

Implications for Data Management Systems

Page 3: Small science research and  the data sharing strata

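20-80 Rule: The small are big!

Total grants: 9347
Total dollars: $2,137,636,716

|               | 20%                  | 80%           |
|---------------|----------------------|---------------|
| Number Grants | 1869                 | 7478          |
| Total Dollars | $1,199,088,125       | $938,548,595  |
| Range         | $6,892,810-$350,000  | $350,000-$831 |

Heidorn, P. B. (2008).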
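The split on this slide can be checked with a few lines of arithmetic. The sketch below is in Python, which is an assumed choice since the slides contain no code. It takes the figures exactly as printed and computes each group's share of the dollars and its mean award size. The variable and function names are illustrative only.

```python
# Figures as printed on the slide (Heidorn, 2008); all names are illustrative.
total_dollars = 2_137_636_716
groups = {
    "20%": {"grants": 1869, "dollars": 1_199_088_125},
    "80%": {"grants": 7478, "dollars": 938_548_595},
}


def summarize(groups, total_dollars):
    """Return each group's share of total dollars and its mean award size."""
    summary = {}
    for label, g in groups.items():
        summary[label] = {
            "dollar_share": g["dollars"] / total_dollars,
            "mean_award": g["dollars"] / g["grants"],
        }
    return summary


summary = summarize(groups, total_dollars)
for label, s in summary.items():
    print(f"{label} of grants: {s['dollar_share']:.1%} of dollars, "
          f"mean award ${s['mean_award']:,.0f}")
```

On these figures the 80% of grants at or below $350,000 account for roughly 44% of the dollars, which is the sense in which "the small are big".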

Page 4: Small science research and  the data sharing strata

Small science in flux

Traditional features
- single PI (often)
- often dependent on graduate students
- ad hoc data management systems
- idiosyncratic sharing practices
- “success” dependent on using one’s own data

But…
- may be producing all digital data
- may be working at community level
- may be conducting “data-driven” science
- may be producing very large data sets
- reference data sets needed even in small, specialized data communities

See, for example, Borgman, Wallis, & Enyedi (2007); Cragin et al. (in press)

Page 5: Small science research and  the data sharing strata

Data Communities and Collections

institutional repository across fields

scholar created primary source collections

disciplinary resource

local, cross-departmental

geographically based, cross-disciplinary resource

national cultural heritage collection aggregation

national data cyberinfrastructure paradigm

Page 6: Small science research and  the data sharing strata

Data Communities

While the conceptualization of ‘data community’ is still in a developmental phase, we have clearly identified at least three kinds:

1.) sub-disciplines focused on particular kinds of data that support specific measurements or analysis;

2.) scientists focused on a specific research problem (this is often interdisciplinary in nature);

3.) researchers working to develop and use a shared, community-level data collection (i.e., a “Resource Collection,” NSB, 2005)

Page 7: Small science research and  the data sharing strata

Data Curation Profiles – Data Conservancy

The Data Curation Profile
- supports the ongoing development of system and policy
- assists in the facilitation of workflows and services for particular data types focal to use within the data community or sub-discipline
- provides crucial information; can serve as documentation for collection policy, including selection, appraisal and retention guidelines

A template is currently in draft form and will be tested and revised in two stages of “use” tests:
- Production
- Application

Page 8: Small science research and  the data sharing strata

Salient Features

- Description of Data
- Required Contextual Information
- Applicable Standards
  - Links to formal, area-specific metadata, ontologies, etc.
- Intellectual Property & Access “Rules”
  - Data owners
  - Terms of use
  - Attribution
- Anticipated User Support
- Re-appraisal Schedule
- Data provenance
- Version control
- Format migration
- Workflow for Ingest & Maintenance of this “kind” of data
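One way to picture these salient features is as fields of a single record per data type. The sketch below is a hypothetical Python representation, not the project's actual template. Every class, field, and method name is an assumption made for illustration. The one example value is taken from the Geology entry in the "Examples of what, and when" slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AccessRules:
    """Intellectual property and access "rules" (hypothetical structure)."""
    data_owners: List[str] = field(default_factory=list)
    terms_of_use: str = ""
    attribution: str = ""


@dataclass
class DataCurationProfile:
    """Hypothetical record mirroring the salient features on the slide."""
    description_of_data: str
    required_contextual_information: str = ""
    applicable_standards: List[str] = field(default_factory=list)
    access_rules: AccessRules = field(default_factory=AccessRules)
    anticipated_user_support: str = ""
    reappraisal_schedule: str = ""
    data_provenance: str = ""
    version_control: str = ""
    format_migration: str = ""
    ingest_and_maintenance_workflow: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """List the sections that are still empty in this draft profile."""
        return [
            name
            for name, value in vars(self).items()
            if value in ("", []) or value == AccessRules()
        ]


# A draft profile with only the data description filled in.
profile = DataCurationProfile(
    description_of_data=(
        "averaged sensor and hand-collected sample data; photographs"
    ),
)
print(profile.missing_sections())
```

A check such as `missing_sections` is one possible aid while a profile is being drafted, since it shows which features still need input from the data producers.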

Page 9: Small science research and  the data sharing strata

Data Curation Profiles Project

- research data management / metadata workflow
- policies for archiving and access
- system requirements for managing data in a repository
- librarians' roles and skill sets to support archiving and sharing

- Biochemistry
- Biology
- Civil Engineering
- Electrical Engineering
- Food Sciences
- Earth and Atmospheric Sciences
- Soil Science

- Anthropology
- Geology
- Plant Sciences
- Kinesiology
- Speech and Hearing
- Earth and Atmospheric Sciences
- Soil Science

Purdue University Libraries, D. Scott Brandt, PI; IMLS # LG-06-070032-07

Page 10: Small science research and  the data sharing strata

Examples of what, and when

| Field | Specific Research Area | Form to be shared | Formats | Type of data set | Size | Shared when? |
|---|---|---|---|---|---|---|
| Atmospheric science | severe weather modeling | compressed output of the model | Vis5D | 1 file / dataset | 10-100 MB | 4-6 month embargo |
| Agronomy | water quality, drainage, and plant growth | cleaned and reviewed sensor and hand-collected sample data | .xls | approx. 100 files | ~1 MB each, up to 20 MB | After publication |
| Geology | rock, water and microbes | averaged sensor and hand-collected sample data; photographs | .xls; jpg | 1 file; images | < 1 MB | After publication |
| Civil Engineering | traffic movement | cleaned and normalized sensor data | MySQL (PostgreSQL) | 1 database | approx. 1000 K/day | 1 month to 1 year embargo |

Page 11: Small science research and  the data sharing strata

Profiling data complexities & differences

Data characteristics compared for Crystallography and Geobiology

Type
- Crystallography:
  1. “Raw data”: most information rich, long-term value for re-use
  …
  4. “CIF file” (crystallography exchange): most commonly shared data type
- Geobiology:
  1. “Reduced spreadsheet” (table with average values for multiple observations): most often requested by others

Intellectual Property/Data Owners
- Crystallography: service model (provide a service to chemists by solving crystal structures). Ownership of the data is ambiguous and requires negotiation before data “hand-off”.
- Geobiology: depends on source of funding (governmental and private grants, gov. institutions, industry). Ownership of and rights to the data range from full to very limited; some long-term “embargoes”.

Accessibility
- Crystallography: field-wide repositories; many journals require deposit of CIF files; OAI-PMH tools becoming available for CIF files.
- Geobiology: difficult and ad hoc; well-known researchers receive direct requests for data, often based on publications.

Page 12: Small science research and  the data sharing strata

Sharing Practices

Distinguishing private exchange from open sharing
- exchange: sharing amongst collaborators is a primary concern, often with significant barriers
- (more) open access: limited by need for control and reward system, but also

Sharing with wider “publics” is conditioned by both data management pressures and personal experience
- the “known person – cost” algorithm
- incidents of misuse

What is most easily or willingly shared is not always the data that has the most re-use value

Page 13: Small science research and  the data sharing strata

Distinguishing Public from Private Sharing

“Supplying” data – functional privacy
- targeted transfer of data to and from current collaborators or close colleagues
- distribution on request
- evaluation

Data exposure – “publication”
- general dissemination activity
- makes data accessible to the wider public

Page 14: Small science research and  the data sharing strata

Implications for concerns with misuse

Misuse incidents experienced by the scientists in this study:
- influenced their views on the appeal of data sharing
- decreased their willingness to share
- increased cynicism about data sharing initiatives
- had a real impact on their behavior

- misappropriation
- lack of enforcement of institutional agreements

Page 15: Small science research and  the data sharing strata

THANK YOU
