Scientists’ Data and Information Practices and Needs:
A Baseline Assessment & Implications for Libraries
Carol Tenopir, University of Tennessee, and Mike Frame, USGS
Co-Leaders of the DataONE Usability & Assessment Working Group
June 15, 2011
UC3 Summer Webinar Series
Provide universal access to data about life on earth and the environment that sustains it
1. Build on existing cyberinfrastructure
2. Create new cyberinfrastructure
3. Support new communities of practice
Scientists
Data Managers
Public Officials
Citizen-scientists
Libraries & Librarians
Students & Teachers
Assessment-stakeholders
Publishers
Baseline Assessment of Scientists (2010)
n=1329; n=1317
Primary Discipline
social sciences: 15%
computer science/engineering: 9%
physical sciences: 12%
environmental sciences & ecology: 36%
atmospheric science: 4%
biology: 14%
medicine: 2%
other: 7%
Primary Work Sector
academic: 80%
government: 13%
others: 8%
Meet the Scientists: Joe & Mabel
Joe is a biodiversity scientist employed by a government agency. He acts as a program manager and consultant. Joe oversees collection of new data in the field and also manages historical data from other providers. Joe has data from a variety of different projects conducted over the years.
Mabel is an academic environmental scientist. She collects and records data in the field on a variety of specimen variables and environmental impacts. Mabel has a data set related to her personal research interests, as well as data collected for a university museum collection.
Data Types
experiment: 54%
observational: 48%
data models: 38%
biotic survey: 34%
abiotic survey: 33%
remote-sensed abiotic: 27%
remote-sensed biotic: 20%
social science survey: 19%
interviews: 15%
Other: 6%
Current Sharing Practices
share my data with others: 75%
place at least some of my data into a central data repository: 78%
place all of my data into a central data repository: 41%
Many are interested in sharing data (percent agree)
Willing to place all of my data into a central data repository with no restrictions: 41%
Appropriate to create new datasets from shared data: 76%
Willing to place at least some of my data into a central data repository with no restrictions: 78%
Willing to share data across a broad group of researchers: 81%
Joe & Mabel: About Sharing Data
“If NBII required anyone who extracted data through the portal to also share data with the portal, then a resounding yes.”
“I’m interested in having data available to researchers interested in larger questions, particularly climate change questions.”
“We are torn between putting it out there for everyone and worry about suffering the risk of something bad happening with it. Saddest thing would be if the data loses its use, where it isn’t shared.”
“I don’t think I would be opposed to it. It would not be a decision I would make personally; we would have to have permission to share.”
Gap Between Willingness to Share and Accessibility
place at least some of my data into a central data repository: 78%
place all of my data into a central data repository: 41%
Others can access my data easily: 36%
Interest in Data Sharing
use other researchers' datasets if their datasets were easily accessible: 84%
willing to share data across a broad group of researchers: 81%
it is appropriate to create new datasets from shared data: 76%
Conditions on data sharing (percent agree)
Reprints of articles: 70%
Reciprocal sharing agreement: 72%
Opportunity to collaborate: 81%
Acknowledge provider/funder: 93%
Formally cite provider/funder: 95%
More challenges (percent agree)
Lack of funding: 40%
Insufficient time to make data available: 54%
No place to put data: 24%
Don't have the rights to make the data public: 24%
More challenges (percent agree; two data series, legend not recoverable)
Lack of funding: 43%, 40%
Insufficient time to make data available: 62%, 54%
No place to put data: 24%, 24%
Don't have the rights to make the data public: 18%, 24%
Joe & Mabel: About Restrictions & Conditions to Sharing Data
“We want to make sure that those of us who have been involved in gathering the data get appropriate recognition for it.”
“If someone were to ask about rare or endangered plants, I would limit that information to appropriate people: natural heritage, universities and federal agencies.”
“We will share it with people who want to use the data for restoration or research. If a consultant wants data to make money, then we are hesitant to hand it out.”
“Is there a mechanism by which we can know when our data is being used? Knowing how valuable we are to the general public comes from the use of our data.”
3. There are different needs, attitudes, and practices between scientists who work in government agencies and those who work in academia.
“I am satisfied with …” (percent agree/strongly agree, Government vs. Academic)
Items: the process for cataloging/describing data; the tools for preparing my documentation; tools and technical support for data management during the life of the project; formal established process to store data beyond the project
Bar values in extraction order (assignment to items and to Government/Academic not recoverable): 62%, 46%, 40%, 35%; 48%, 34%, 52%, 53%
Responsibilities for Data
• Academic respondents are more likely to have sole responsibility for approving access to some or all of their datasets.
– Academic 83%, Government 63%
Organizational Involvement
• Government respondents are more likely to agree their organization was involved in:
– “managing data during the life of the project”
• Government 52%, Academic 39%
– “storing data beyond the life of the project”
• Government 53%, Academic 46%
Joe & Mabel: The View from Government & Academic Organizations
“If other people are using my data then I somehow need to report that. I need to know how it’s being used and if any publications result.”
“I don’t have anything I’m keeping private. I’m willing to put it all out there.”
“I don’t have the authority to make decisions about data sharing.”
“Our data sharing policy makes it difficult for us to withhold parts of the datasets we receive. As a result, some data contributors only share sub-sets of their data.”
4. Scientists' skill levels, and their use of and access to appropriate tools, vary across the data life cycle.
What metadata standard do you currently use? (number of respondents)
DIF: 12
DwC: 21
DC: 26
EML: 95
FGDC: 95
Open GIS: 96
ISO: 97
My Lab: 266
none: 676
Joe & Mabel: About Metadata
“We are currently redoing all of our collection databases at the museum. We are building an in-house system. We looked at available standards and decided to write our own.”
“For my research, very little metadata has been created. For metadata associated with the museum collection, Darwin Core has been used.”
“For contemporary sets, the person who submits the data also submits a metadata record. We create another record representing what we think it is. We have one version of the data, submitter may have a version they keep on their website. We want to be able to show that these are two different things.”
“We write FGDC records.”
My organization provides…
% Government % Academic
Training on best practices 23 21
Funds for data management long-term 27 20
Funds for data management short-term 34 29
Tools and technical support for data management long-term 39 34
Tools and technical support for data management short-term 48 43
Joe & Mabel: Looking for Assistance
“It is cumbersome to put those data sets together, but only because it is important. If there were ways to automate some of that information collection out of the data sets, it would help.”
“Maximum utility of the data would require geo-referencing of the data. We would need help geo-referencing the part of the collection that isn’t geo-referenced.”
“Ideally, we would like for our research results to be disseminated in a way that’s accessible and digestible to not just academics but to everybody.”
“Manpower. We need more people to handle these sorts of things.”
Data Life Cycle Scientist Challenges
Data life cycle stages: Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, Analyze
Are there standards?
How do I preserve my data?
What tools do I use?
Will I get credit for my work?
How much will it cost?
What is a data management plan?
Who can help me?
What is metadata?
Where do I preserve my data?
Future Assessments
Year 1 Year 2 Year 3 Year 4 Year 5
Scientists: BL Scientists: FU
Librarians: BL Librarians: FU
Policy Makers: BL Policy Makers: FU
Educators: BL Educators: FU
Library Policies: BL Library Policies: FU
Library and Librarian Surveys
• Library (1 per library) current practices
• Librarian (individuals) attitudes and perceptions
• Started with ARL libraries (spring and summer 2011; 38 library responses and 223 librarians so far)
• Will expand to other North American academic libraries and librarians
Librarian & Library Assessment
Data life cycle stages: Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, Analyze
Stewardship role (select & deselect)?
Are RDS a priority?
Role in partnering with researcher?
Level of knowledge & skills?
Is there an agency repository that accepts data?
Level of participation with data?
Role of librarian discovering data?
Level of involvement with metadata?
Role of the librarian to help preservation?
Library Survey
Research Data Services (RDS)
- Research data reference/consultation services to researchers are provided by individual discipline librarians (33%) or dedicated data librarians (17%) or a combination of both (50%).
- Almost half of the libraries (45%) do not have policies and/or procedures associated with research data services.
Librarian Survey
– Distributed to 950 librarians
– Science, data, metadata, scholarly communication, digital collection, electronic resources librarians
– 223 people replied to at least one question
Librarian Survey
• Interact with faculty, students, or staff in support of RDS: 28% Yes-integral part, 41% Yes-occasionally, 32% No (n=221)
• With faculty or staff consultation on … (n=192; n=194; n=193)
Librarian Survey
• Outreach and collaboration w/ other RDS
– Off campus: 61% never, 34% few times a year (n=157)
– On campus: 51% never, 34% few times a year (n=157)
• Participation in … about RDS
– informal discussion groups: daily 2%, once a week 6%, once a month 20%, few times a year 49%, never 24% (n=158)
– working groups/professional groups: daily 3%, once a week 8%, once a month 12%, few times a year 40%, never 39% (n=158)
– policy development: daily 4%, once a week 4%, once a month 9%, few times a year 34%, never 50% (n=158)
– strategic planning: daily 3%, once a week 4%, once a month 11%, few times a year 40%, never 42% (n=156)
Librarian Survey
Most important motivation to be involved in RDS
RDS are important to subject discipline I support: 25%
RDS is primary responsibility: 23%
personal interest in RDS: 16%
My job includes facilitating data contributions to our institutional repository: 14%
My job includes metadata creation, training, and/or management: 13%
Other: 9%
My research includes RDS: 2%
Next steps
• Follow-up to ARL libraries and librarians
• Expand scope to other academic libraries
• Federal libraries/librarians
• Data Managers
• Other Working Groups looking at citizen scientists and UG educators