Task Force on Research Data Implementation Summary Report December 2014 Prepared by: Ron Jantz Yu-Hung Lin Aletia Morgan Laura Palumbo, Chair Minglu Wang Krista White Ryan Womack Yingting Zhang Yini Zhu Revised January 2015

Introduction:

In 2013, the Office of Science and Technology Policy mandated that the direct results of

research funded by federal agencies with research budgets of more than one hundred million

dollars be made publicly accessible.1 This followed the 2011 policy change by the National

Science Foundation, which required researchers to submit a data management plan outlining how

their funded data would be managed, shared, and preserved.2 As a result, researchers are

seeking easy and effective ways to share, access, and preserve research data in order to

comply with these mandates.

Academic libraries have begun to fill the demand for digital repositories which allow their

researchers’ data to be discoverable, accessible, and preserved for the long term. Rutgers

University Libraries are poised to offer exceptional research data services, and in July 2014 the

Task Force on Research Data Implementation began work in order to “establish an

administrative and evaluation framework for the deposit of research data” in accordance with the

Libraries’ and the University’s Strategic Plans.3 The Task Force was charged with the

completion of ten items to prepare for the ongoing and efficient acceptance of research data (see

Task Force Charge, Appendix A). The following sections of this report will address each of these

task items individually.

Environmental Scan of Institutional and Data Repositories

1. Review the administrative structure of other data repositories that might serve as models.

2. Review the evaluation process for technical, legal, and confidential issues involving data

deposit at other institutions that might serve as models.

The Task Force completed a review of thirty-seven repositories to assess their administrative

structure, and their evaluation processes for technical, legal, and confidential issues in fulfillment

of the first two task items of our charge. The repositories were evaluated based on the

Association of Research Libraries Systems and Procedures Exchange Center: Research Data

Management Services (ARL SPEC) Kit 334 (July 2013), which “…surveys ARL member

libraries on their activities related to access, management, and archiving of research data at their

institutions.”

The Task Force developed a set of thirty-four review criteria to analyze the Research Data

Management (RDM) systems of the reviewed institutions, which were categorized into five

areas: Research Data Management Services (RDMS); Data Archiving Services; RDM Service

Staffing; Partnerships; and Research Data Policy. These criteria were reviewed based on publicly

available information from the repositories’ and libraries’ websites, and the findings were

summarized in an Interim Report, dated October 2014. Since the Interim Report was written,

additional information was sought from selected repository managers via phone conversations

and is included in this report where relevant to the remaining items of our charge. (See Task

Force on Research Data Implementation Interim Report, October 2014.)

1 http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf

2 http://www.nsf.gov/bfa/dias/policy/dmp.jsp

3 Excerpts from the Task Force’s Interim Report are used throughout the Final Report where relevant.

Following are summarized excerpts from the Interim Report. From our research, we discovered

that:

• Almost all of the institutions reviewed provided research data management training and consulting, typically in data management plan preparations. This is an area to be leveraged to increase library visibility and to establish additional connections with research faculty.

• About half of the reviewed repositories were operated by libraries, and many worked in collaboration with outside units and offices such as the Office of the Vice President for Research, and the Office of Information Technology.

• The number of research data management service staff members was dependent on each institution’s funding and culture. Staffing numbers ranged in size, from one or two staff to as many as eighteen at one institution.

• Most repositories place the responsibility for the evaluation of data on the principal investigator. Only two of the reviewed repositories placed curation responsibilities exclusively with librarians, although others used teams including librarians.

• Over three-fourths of the reviewed repositories allowed self-deposit of data or self-deposit and mediated deposit.

• Data deposit agreements were common, and most shared a similar format. Depositors typically needed to agree that they were legally allowed to deposit the data for public access; that the data does not contain any personal or sensitive information; that the depositor holds the institution harmless from any liability incurred as a result of the deposit and public access of the data; and that the repository may enact certain described operations in order to provide for data discovery, maintenance, and preservation.

• Privacy and security issues were typically addressed by agreements wherein the depositor stated that the data was free of any confidential or sensitive data; and by stripping of identifying information, or in some cases by encryption. Responsibility for the protection of confidential data was placed with the principal investigator or the researcher depositor.

• Information about repository storage capacity was limited. Restrictions to file sizes and file types were more prevalent, with offerings ranging from 10 – 500 GB free of charge; and acceptance of most standard file types associated with open source and widely used proprietary software was common.

• Funding models for storage and preservation of research data have not been established for many repositories, although a few did provide information about costs of services. These are described below.

We found that the institutions reviewed were at varying stages with regard to acceptance of

research data. A few were not accepting data or had a limited number of datasets, but many had

respectable quantities of data, and some were well-established, data-exclusive repositories.

Eighteen institutions had repositories operated by libraries, and many worked in collaboration

with outside units and offices such as the Office of the Vice President for Research, and the

Office of Information Technology. Staff responsible for the repositories’ activities varied by the

size of the institution, with the largest data management teams at two institutions consisting of

fifteen to eighteen members. Collaborative efforts with units outside of the libraries and a team

approach in general seem to make the most sense for larger institutions.

Most repositories place the responsibility of the evaluation of data on the principal investigator.

Only two repositories placed curation responsibilities exclusively with librarians, although others

used teams including librarians. Over three-fourths of the reviewed repositories allowed self-

deposit of data or self-deposit and mediated deposit, similar to the process used by RUcore in

acceptance of research documents, albeit with additional forms and guidance. Data deposit

agreements were common, and most shared a similar format as noted above. Responsibility for

the protection of confidential data was placed with the principal investigator or the researcher

depositor. Based on our findings, we believe that researcher responsibility for legal issues and

self-deposit make the most sense from a liability and efficiency standpoint, with case-by-case

exceptions.

Information about repository storage capacity was limited. Restrictions to file sizes and file types

were more prevalent, with offerings ranging from 10 – 500 GB free of charge; and acceptance of

most standard file types associated with open source and widely used proprietary software was

common. Funding models for storage and preservation of research data do not appear to have

been established for most repositories, although a few did provide information about costs.

Ongoing funding for data preservation and repository growth is an important aspect of data

management services that will need to be addressed in order to create sustainable systems,

particularly where staff time is a factor.

Although the cost of storage is relatively low, some repositories have fees based on storage

volume, possibly as a way to quantify their services. At the mission-driven, non-institutional

organizations (ICPSR, Dryad, Odum), a more explicit funding arrangement is specified. ICPSR

funds its operations through grant-funding for major subject areas, with some chargebacks to

users for deposit. Dryad has some tiers of pricing, with the lowest price set at $65 per deposit.

Odum also charges fees tailored to each project.

Among university repositories, a few specify limits to the free service offered. The University of

Edinburgh allows up to 500GB of space for each researcher for free. Michigan indicates there

will be charges for extra metadata work by librarians. Stanford recommends that grant proposals

include IT costs, and that it may charge for data over 10GB in the future (not at present).

Syracuse also mentions an evolving funding model. Princeton and Berkeley charge strictly by

size. Princeton charges $0.006/MB (or $6/GB) as a one-time charge. Berkeley charges

$0.14/month for each GB stored.
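The two size-based models above differ in shape: one is a one-time fee and the other accrues monthly. The following is a minimal Python sketch of that arithmetic, using only the rates quoted in this report. The function and constant names are illustrative and are not drawn from either institution's systems, and the sketch assumes the rates stay fixed over time.

```python
# Rates as quoted in this report; all names are illustrative assumptions.
PRINCETON_ONE_TIME_PER_GB = 6.00   # dollars per GB, charged once
BERKELEY_PER_GB_MONTH = 0.14       # dollars per GB per month


def princeton_cost(size_gb: float) -> float:
    """One-time charge for depositing size_gb gigabytes."""
    return PRINCETON_ONE_TIME_PER_GB * size_gb


def berkeley_cost(size_gb: float, months: int) -> float:
    """Cumulative charge for storing size_gb gigabytes for the given months."""
    return BERKELEY_PER_GB_MONTH * size_gb * months


def break_even_months() -> float:
    """Months of storage at which the monthly model equals the one-time fee."""
    return PRINCETON_ONE_TIME_PER_GB / BERKELEY_PER_GB_MONTH


if __name__ == "__main__":
    size_gb = 100
    print(f"One-time model, {size_gb} GB: ${princeton_cost(size_gb):,.2f}")
    print(f"Monthly model, {size_gb} GB for 10 years: ${berkeley_cost(size_gb, 120):,.2f}")
    print(f"Break-even: {break_even_months():.1f} months")
```

Under these rates the break-even point does not depend on dataset size: it falls at roughly 43 months, so for retention periods of ten years the monthly model would cost noticeably more per GB than the one-time fee.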

The Johns Hopkins model is unusual, partly because it was designed from the start to become self-

funding once initial grants ran out. For small collections, a $1600 charge is standard. For large

collections (2TB or more), 2% of the total grant funding is billed to support the data repository.
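The Johns Hopkins tiers described above can be sketched as a simple rule. This is an illustration only: the report does not say how collections between "small" and 2TB are treated, so the sketch assumes that anything under 2TB is charged the standard small-collection fee. All names are hypothetical.

```python
LARGE_COLLECTION_TB = 2.0       # threshold quoted in the report
SMALL_COLLECTION_FEE = 1600.0   # dollars, standard charge for small collections
LARGE_COLLECTION_RATE = 0.02    # share of total grant funding


def johns_hopkins_charge(size_tb: float, total_grant_funding: float) -> float:
    """Estimate the repository charge under the tiers described in the report.

    Assumption: every collection under 2 TB counts as "small".
    """
    if size_tb >= LARGE_COLLECTION_TB:
        return LARGE_COLLECTION_RATE * total_grant_funding
    return SMALL_COLLECTION_FEE


if __name__ == "__main__":
    print(johns_hopkins_charge(0.5, 500_000))  # small collection
    print(johns_hopkins_charge(3.0, 500_000))  # large collection
```

Under this reading, the percentage-based charge exceeds the flat fee only for grants larger than $80,000, since 2% of $80,000 is $1600.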

Finally, Purdue has the most detailed funding model. Central university funding pays for the

following free allocations: 10GB for 3 years for trial projects, 1GB for 10 years for a small

publication, and 100GB for 10 years for a grant-funded project or publication. Additional space

is billed per GB on a yearly, or a 10-year basis. (See https://purr.purdue.edu/about/pricing for full

details.)

Our review also revealed that services in addition to the technical, legal, and administrative

aspects of data ingest, such as data management training and consulting, are flourishing in

institutions even without data capabilities in their repositories. This is an area that can be

leveraged for enhanced library visibility in anticipation of the acceptance of data into RUcore.

The above findings from our Interim Report guided the completion of the remaining items of our

charge.

Rutgers Major Stakeholders in Research Data: Policies and Practices

3. Consult with appropriate major stakeholders to ensure that RUL workflows and practices facilitate and

do not conflict with policies and practices of those departments, especially the office of the Vice President

for Research, and Research and Sponsored Programs.

Although Rutgers currently has no University-wide Research Data Policy, we

assumed that a data policy would be in place in the near future, and would resemble the data

policies we have seen at other institutions. Based on our research, the commonalities in the best

policies seem to be that the university owns the data; the principal investigator is responsible for

making sure it is preserved; and protocols exist in the event the PI leaves the institution. We

allowed this to guide our conversations with Rutgers research offices.

Several members of the Research Data Task Force have periodic meetings with campus

stakeholders in the disposition of research data as part of their normal responsibilities, including

regular meetings with Terri Kinzy (Associate Vice President for Research Administration) and

Eileen Murphy (Director of Research Development) since they assumed their new leadership

positions in the Office of Research & Economic Development (ORED) in late 2013. The

Research Data Manager, Aletia Morgan, and Data Librarian, Ryan Womack, initially met with

them in December to introduce them to RUL, RUcore, and the existing and potential RUL data

management services available to campus researchers. Since then, Aletia has continued to meet

with them periodically, and as available, she and Yingting Zhang have been regular attendees at

the monthly Research Facilitators meetings sponsored by ORED.

Given the evolution in this relationship, we were able to talk with Eileen and Terri specifically

regarding the RUL Research Data Task Force, and its goal of finalizing service processes and

expectations that can make RUL a trusted partner in the management, preservation, sharing, and

reuse of research data. They are excited about the potential for the Libraries to facilitate

researcher compliance with funder requirements. Their primary piece of advice as we move

forward with the service is to “keep it simple”. They were emphatic that if we make the

submission process cumbersome and judgmental, researchers will simply go elsewhere with their

data.

A key question critical to the acceptance of research data is whether the researcher has the

authority as data custodian to determine whether datasets are appropriate for preservation

services, or whether the library’s workflow process should include a detailed assessment of each

dataset with respect to the quality, human subject protection, ownership (copyright), and

commercial value of any data. Eileen recommended that we speak directly to Judith Neubauer,

Associate Vice President for Research Regulatory Affairs in ORED, and that our workflow

should be crafted to be consistent with her guidance. On September 2, Aletia Morgan and Laura

Palumbo met with Judy Neubauer, Eileen Murphy and Terri Kinzy to discuss the role of the

Libraries with regard to the assessment of research data.

Judy Neubauer was very helpful, and was clear in her belief that the researcher is the ultimate

custodian of the data generated from research, and should be the final arbiter regarding the

suitability for data deposit in any sharing resource or repository, whether in RUcore or

elsewhere. We discussed the current absence of a University Research Data Policy, and while

she did not see this as an impediment to the work of the Libraries, she did say that it should be a

very simple document that defines the researcher as the responsible party regarding the use of

any data generated from funded research.

We also met with the leadership of the Office of Research and Sponsored Programs (ORSP),

another important stakeholder in the research process. On November 14, Laura Palumbo and Aletia

Morgan met with Diane Ambrose and Cassandra Burrows (Senior Associate Director and

Assistant Director, respectively). Our primary question was whether their unit would want to be

notified and given any right of approval for any dataset deposits. Additionally, we hoped to

identify processes that would improve our awareness of projects that could yield data for RUcore

deposit, and ways to promote our services among researchers.

Our conversation was very helpful. In brief, Diane and Cassandra felt it unnecessary for us to

notify any ORSP staff when researchers deposit data with RUcore. Certainly, their staff would

be available to respond to any questions we have, but they would not expect this to happen often.

We progressed from this issue to ways to promote awareness of RUcore among the research

community. Their recommendation is that we present updates on RUcore and the RUL research

data services to ORSP staff periodically, to ensure that they are aware of what we can do to help

researchers in their data management efforts, including development of data management plans.

Further, we talked about developing processes to promote RUcore to researchers at the time of

funding commitments, and also sharing a list of awards with us periodically, so that we can reach

out to researchers with projects that might yield data suitable for RUcore preservation.

On November 13, Yingting Zhang had a conversation with Donna Hoagland, Director of the

Institutional Review Board for New Brunswick-Piscataway, about working with the Libraries on

issues related to human subject data, and she was supportive of a future collaboration with us.

Additionally, Laura Palumbo and Aletia Morgan met with Paula Bistak (Executive Director,

Human Subjects Protection Program) and Michelle Gibel Watkinson (Senior IRB

Administrator), to discuss the question of how or whether the IRB would want to review any

RUcore data submissions; and to understand whether IRB approvals typically address the

protection, sharing, maintenance or disposition of any research data. This will obviously not be

an issue during the initial implementation, as no projects with human subject privacy issues will

be accepted. But as we move forward, we will want to make sure that we are working in

accordance with IRB requirements. Paula and Michelle were supportive of further collaboration

between the Libraries and their offices, and it was suggested that the Libraries work with them to

develop guidance for researchers regarding the levels of sharing permitted through RUcore, from

public access, to restricted access to a defined user group, or preservation only.

The charge for the task force to consult with the leaders of the Rutgers research enterprise

yielded valuable information. The relationship we had already forged with these offices

encouraged candid discussions that will help us ensure that our final service recommendations

and ultimate implementation practices are consistent with their standards and expectations.

Principles for Prioritization of Data Deposit Projects

4. Establish principles for prioritization of data deposit projects based on RUL strategic priorities. This

should include a definition of various types of potential projects to ensure that we have the resources both

to host and to sustain projects, i.e., federal grants, non-grant funded research, etc.

Research data to be accepted into RUcore will demonstrate the scholarly and scientific research

being conducted by Rutgers’ researchers. This data will contribute to the advancement of

knowledge and research in diverse subject areas, including the sciences, health sciences, social

sciences, and humanities, as described in the Libraries’ Strategic Plan. By accepting and publicly

sharing research data, we will highlight the unique ability of the Libraries to facilitate discovery,

access, and reuse of Rutgers University’s research. Research data will be accepted insofar as the

Libraries have “…the resources, including, but not limited to, expertise, technology, and funding,

to support the project, both initially and ongoing.” (Digital Projects Evaluation Process, Rutgers

University Libraries, March 2013).

Research data may be the result of unfunded as well as grant-funded research, to allow for a

broad spectrum of research areas to be included; however projects which require data deposit to

comply with funder mandates may be given preference. Working with Rutgers’ researchers, the

Libraries will provide access to Rutgers’ research data. The Principal Investigator or Primary

Responsible Researcher (hereafter referred to as the Responsible Researcher, a term meant to

include both titles) will be responsible for ensuring that the data can be shared publicly in

accordance with University policies and with Federal and other funders’ directives, and in

compliance with any legal restrictions. Through a deposit agreement, they will attest that by

sharing the data they will not be in violation of any confidentiality agreements, copyright laws,

or other laws, and will hold Rutgers University Libraries harmless from any damages resulting

from the sharing or misuse of the data. (See the following Evaluation of Data Projects for

Deposit for information concerning the deposit agreement.)

The Task Force has developed Research Data Service Guidelines for acceptance of data into

RUcore. These guidelines were drafted assuming a staged approach, with the initial

implementation of mediated ingest consisting of data projects that involve no human or animal

subjects and no commercial interests, and that are typically less than 100 GB of data volume per project,

although larger quantities may be considered. We believe that we currently have the staffing,

storage, and system requirements to accept these projects immediately.
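The initial-implementation criteria described above can be read as a short screening rule. The following Python sketch is one possible reading, not the Task Force's actual intake procedure; the class, field names, and result labels are hypothetical. Because the guidelines say larger quantities "may be considered", exceeding 100 GB leads here to review rather than rejection.

```python
from dataclasses import dataclass

# Threshold taken from the guidelines described above.
TYPICAL_MAX_GB = 100


@dataclass
class DataProject:
    """Hypothetical summary of a proposed deposit (field names are illustrative)."""
    has_human_subjects: bool
    has_animal_subjects: bool
    has_commercial_interests: bool
    size_gb: float


def initial_stage_decision(project: DataProject) -> str:
    """Classify a project against the initial mediated-deposit criteria."""
    if (project.has_human_subjects
            or project.has_animal_subjects
            or project.has_commercial_interests):
        # Outside the initial guidelines: left for the full implementation
        # or handled case by case as a special data project.
        return "defer or special project"
    if project.size_gb >= TYPICAL_MAX_GB:
        # Larger quantities may be considered, so flag for review.
        return "case-by-case review"
    return "accept for mediated deposit"


if __name__ == "__main__":
    small = DataProject(False, False, False, size_gb=12)
    print(initial_stage_decision(small))
```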

During the initial implementation of data acceptance, the deposit process will be mediated by

members of the RUL Research Data Team. Researchers will work directly with a Project

Manager from the RUL Research Data Team who will guide the researcher through the deposit

process and see it through to completion (see RUL Research Data Team and Data Deposit

Workflow for more information). Data projects outside the guidelines for the initial

implementation, such as those with human subjects, would be considered in the full

implementation of research data services, or on a case-by-case basis as a special data project. We

envision development of a full implementation of data acceptance that would allow researchers

to self-deposit data, in addition to providing mediated deposit when necessary. (See the

following Evaluation of Data Projects for Deposit for more information about self-deposit of

data.)

Please see Appendix B for the Research Data Service Guidelines, which outlines acceptance of

data projects for initial mediated deposit; the subsequent self-deposit and mediated full

implementation; and special projects which will be assessed on a case-by-case basis.

Evaluation of Data Projects for Deposit

5. Develop a framework for evaluation for data deposit in RUcore that includes a questionnaire or series

of questionnaires to be used for each data deposit, covering technical, legal, and confidential criteria

(similar to the Digital Projects Evaluation Process approved by Cabinet in March 2013).

The Task Force has created separate high-level criteria intake questionnaires for the initial

acceptance of mediated data projects, and for the full implementation of data acceptance, which

also includes self-deposit. The questionnaires for each stage of data acceptance ensure that the

requirements of the Guidelines are met, and are to be signed by the Responsible Researcher.

Once the questionnaire has been completed and it has been determined that the high-level criteria

are met, an application form is completed by the Responsible Researcher to establish a minimum

amount of metadata. During mediated data deposit, the questions would be asked of a researcher

by the appropriate member of the RUL Research Data Team, and/or a subject liaison. (See

Proposed RUL Research Data Team.) Once the project application is complete, the Responsible

Researcher would sign a Data Deposit Agreement, allowing RUL to accept the data.

The data deposit agreements reviewed during our environmental scan of data repositories

typically state that the Responsible Researcher is responsible for ensuring that they are legally

allowed to deposit the data for public access; that the data does not contain any personal or

sensitive information; that the depositor holds the institution harmless from any liability incurred

as a result of the deposit and public access of the data; and that the repository may enact certain

described operations in order to provide for data discovery, maintenance, and preservation.

The Task Force drafted a suggested data deposit agreement adapted from a previously created

document prepared by the Copyright Librarian. We have made suggested modifications based on

our review of existing deposit agreements, as well as data policies found at peer and aspirant

institutions, with the assumption that similar research data policy would be adopted by Rutgers.

In short, we found that most data policies assert that the University owns the research data; that

the Principal Investigator or Responsible Researcher is the custodian of that data; and that

the data would remain with the institution should the Responsible Researcher

leave. The suggested deposit agreement will need to be modified to align with a University data

policy, when such a policy is adopted.

During the full implementation of data services, data deposit may be automated as well as

mediated. Mediated data deposit will still be an option for researchers needing assistance, and for

projects which are very large, complex, or which would require infrastructure modifications, i.e.

a special research data project. For self-deposited data, the forms would be online and would

require NetID authentication and an electronic signature. Guideline questions would be affirmed

by the researcher, preliminary metadata entered, and the deposit agreement accepted. This self-

deposit process would include a waiting period, during which time the RUL Research Data Team

would review documents before the data becomes public.

Discussions with repository director Lisa Johnston at the University of Minnesota reveal that

their waiting period is two days for self-deposit, during which time a cursory review of file types

is performed. A conversation with Purdue’s PURR manager Scott Brandt indicates that no

checking of self-deposited data is done; the researcher alone is responsible for compliance with

any restrictions on sharing the data, limiting the library’s liability for these issues. This is also the

case at Penn State, where Mike Giarlo indicated that uploaded data becomes “live” immediately.

For RUcore, we believe that the appropriate time frame for review of the high-level criteria

intake questionnaires would be five working days. The time frame for further review of

applications and documentation will depend on the complexity of the project.
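The proposed five-working-day review window can be turned into a due date with a few lines of Python. This is a sketch under stated assumptions: working days are taken to be Monday through Friday, university holidays are ignored, and the function names are illustrative rather than part of any RUcore system.

```python
from datetime import date, timedelta

REVIEW_WORKING_DAYS = 5  # proposed time frame for questionnaire review


def add_working_days(start: date, working_days: int) -> date:
    """Return the date that falls the given number of working days after start.

    Assumption: working days are Monday through Friday; holidays are ignored.
    """
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def review_due(submitted: date) -> date:
    """Latest date by which the intake questionnaire review should finish."""
    return add_working_days(submitted, REVIEW_WORKING_DAYS)


if __name__ == "__main__":
    print(review_due(date(2015, 1, 5)))  # a Monday submission
```

A questionnaire submitted on Monday, January 5, 2015 would be due the following Monday, January 12; one submitted on a Friday would be due the next Friday.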

Should researchers need assistance with copyright issues, guidance would be available through

referral to the Copyright Librarian. Researchers needing assistance with issues concerning

intellectual property and commercial interests would be directed to seek advice from the Office

of Technology Commercialization. Researchers with human or animal subjects’ data would be

directed by the Research Data Manager to a suitable repository for data sharing during the initial

implementation of data acceptance.

For both mediated and self-deposited data in RUcore, the responsibility for compliance with any

legal restrictions would lie with the Principal Investigator/Responsible Researcher. They will

assume liability for determining if their data is free from any copyright or intellectual property

constraints, sensitive or confidential information, any restrictions on public accessibility, or any

other legal and ethical issues which might prevent their depositing and sharing the data publicly.

The Libraries will be exempt from liability by not assuming responsibility for these

determinations.

Please see Appendix C for the initial implementation Mediated Research Data Projects

Questionnaire; Appendix D for the Flowchart for the Mediated Data Projects

Questionnaire; Appendix E for the full implementation Self-Deposit, Mediated, and Special

Research Data Projects Questionnaire; Appendix F for the Full Implementation Flowchart

for Mediated and Self-Deposited Data Projects; and Appendix G for the Research Data

Project Application for the Responsible Researcher. Also included in Appendix H is a

proposed Research Data Deposit Agreement, which can be modified when a Rutgers

University Data Policy becomes available.

Evaluation Guidelines for Subject Liaisons

6. Develop a corresponding guide on evaluation criteria to provide clarity to subject librarians. (Similar

to the Deed of Gift Explanation in the RUL Deed of Gift).

Guidelines have been drafted to better enable subject liaisons who have completed the

RUresearch training course “Supporting Faculty Research Data Needs” to work directly with

researchers in assisting with data deposit. If a subject liaison has not been trained, they will work

with an appropriate Project Manager from the RUL Research Data Team until such time as they

are able to manage a data project without assistance. Project Managers will provide assistance to

researchers with forms and referrals to other personnel or offices if necessary, and enter metadata

into RUcore. (For more details about the Project Manager’s duties, please see Data Deposit

Workflow.)

Please see Appendix I for the General Guidelines for Librarians advising on Research Data

Projects for RUcore deposit. Project Managers also have available to them more detailed

information about consulting with researchers regarding data in the RUresearch Sakai site.

Proposed RUL Research Data Team

7. Recommend assignments for functional responsibility in the area of data deposit.

Based on our review of other repositories, and given the background and experience of current

staff, we believe that RUL has the staffing capacity to accept research data immediately. After

consideration of the staffing arrangements at other institutions, we propose the following

organizational structure for a RUL Research Data Team, which would serve researchers at all

Rutgers locations.

The RUL Data Team should consist of existing Libraries personnel, who are already well

qualified for the review and acceptance of research data. The Team should be led by a full-time

Data Manager, whose time is one hundred percent attributable to the activities related to data

acceptance in RUcore. The Research Data Manager would be responsible for the overall leadership

of the Research Data Service, including participation in national and international organizations

related to Research Data. The Research Data Manager would also coordinate the work of

student interns from SCI and elsewhere, and work with others assisting on data projects (e.g.,

departmental graduate students, postdocs, and research fellows).

We envision a Data Team that consists of two parts: a Core Data Team, who will be responsible

for preliminary review of data projects and who will also serve as Project Managers when

appropriate; and an Expanded Data Team, who will act as Project Managers and oversee data

projects to their completion. The Core Data Team as a whole could meet on a weekly basis, if

projects are awaiting review. If there are issues with rights, commercialization, sensitive

information, or other legal issues, the Responsible Researcher will be referred to the appropriate

personnel or office for guidance, such as the Copyright Librarian, the Repository Collection

Librarian, the Office of Technology Commercialization, or the Institutional Review Board, for

resolution of any issues before the data project is accepted.

The Expanded Data Team would meet as needed to review prioritization and scheduling of

projects, and members of the expanded team who have been assigned to specific data projects

would meet separately as needed to move these projects forward. These are the personnel whom

we would assign to each group:

Core Data Team

Research Data Manager

Data Librarians- 2

Chemistry & Physics Librarian/Science Data Specialist

Metadata Librarian for Continuing Resources, Scholarship and Data

Digital Data Curator

Digital Library Architect

Expanded Data Team-includes the above plus:

Digital Humanities Librarians- 2

Health Sciences Librarians- 2

Social Science Librarian

Physical Sciences Librarian, Newark

Life Sciences Librarian, Newark

With the exception of the Research Data Manager, these personnel would be responsible for data

duties on a part-time basis, and as RUL Research Data Services grow, additional staffing and

resources should be allocated to the Research Data Service. Job descriptions may need to

accommodate changes in time spent on data in the future. We should seek to include

representatives on the data team from Camden as well.

Data Deposit Workflow

8. Chart a workflow for the data deposit evaluation process.

The preceding guidelines and questionnaires establish the process for evaluation of data for

deposit. These have been drafted so that they can be adapted to accommodate a self-deposit

process, where the responsible researcher would be enabled to answer relevant questions about

his or her data directly in an online form. On the following pages are a flowchart which provides

an overview of the workflow of the data deposit process, and narratives which describe the

process in more detail.

RUL Research Data Project Workflow

Project Workflow for Initial Implementation: Mediated Deposit

Potential data management projects will come to the attention of the RUL Research Data Team

from any of several sources, including the researcher, research assistant or agent, another

librarian, or subsequent to earlier Research Data Management Plan support. The Research Data

Manager (RDM) will typically work with the Responsible Researcher (RR) to identify the basic

characteristics of the project as specified in the high level criteria intake Questionnaire, although

this may be done by a member of the Core Data Team if this is where the researcher has made

initial contact.

The RDM (or initially contacted member of the Core Data Team) will review the attributes of the

project as identified in the intake Questionnaire to determine its likely suitability for the initial

implementation of RUL Research Data Services. If the project appears to conform to the

specifications of the initial implementation of the Research Data Service, the RDM or Team

member will notify the rest of the Core Data Team. The group will identify an appropriate

Project Manager (PM) from the Data Team to lead the further review and implementation of the

project.

The Project Manager will ensure that the appropriate departmental Liaison Librarian is aware of

the project (if not already part of the Data Team), and will work with the researcher to complete

the Project Application, which will supply metadata and collection information necessary for

RUcore ingest. With the Project Application complete, the PM will review the project with the

Data Team, who will confirm that the project is suitable for deposit in the initial implementation.

Ideally the entire review process will not take more than two weeks. The Data Team may bring

in other librarians or staff members as needed, who may participate in the project as an

opportunity for shared learning.

The PM will notify the researcher of acceptance of the data project, and the researcher will then

be required to sign a Data Deposit Agreement. After the Responsible Researcher signs the Data

Deposit Agreement, the PM will collect the data and any other additional documents or resources

not already provided. The PM will then work with the other members of the RUL Data Team,

including metadata specialists as needed, to complete the ingest of the data and documentation

into RUcore, and will enter metadata into WMS. The minimum amount of metadata which will

be required for ingest consists of those identifiers which have been described in the Project Application.

Additional metadata may be available, and the PM will enter this metadata as required. Once the

collection is created and the ingest is complete, the project data will be available on RUcore.

The datasets and links to all related events such as scholarly publications will be available for

public access and download.
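
The mediated deposit steps described above can be read as an ordered sequence. The following is a minimal sketch of that sequence, not an implementation of any RUcore or WMS feature; the stage names and the `next_stage` helper are hypothetical labels chosen here to mirror the narrative.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical labels for the mediated deposit steps in the narrative."""
    INTAKE_QUESTIONNAIRE = 1      # RDM and Responsible Researcher complete the intake Questionnaire
    CORE_TEAM_REVIEW = 2          # Core Data Team is notified and reviews suitability
    PROJECT_MANAGER_ASSIGNED = 3  # a Project Manager is identified from the Data Team
    PROJECT_APPLICATION = 4       # PM and researcher complete the Project Application
    DATA_TEAM_CONFIRMATION = 5    # Data Team confirms the project is suitable for deposit
    ACCEPTANCE_NOTIFICATION = 6   # PM notifies the researcher of acceptance
    DEPOSIT_AGREEMENT_SIGNED = 7  # Responsible Researcher signs the Data Deposit Agreement
    DATA_COLLECTED = 8            # PM collects the data and remaining documents
    INGEST_AND_METADATA = 9       # data is ingested and metadata is entered
    AVAILABLE_IN_RUCORE = 10      # datasets are available for public access


def next_stage(stage):
    """Return the stage that follows, or None once the data is available."""
    if stage is Stage.AVAILABLE_IN_RUCORE:
        return None
    return Stage(stage.value + 1)
```

A project that is found unsuitable at either review step would leave this sequence and follow the referral path described in the next paragraph.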

If the project does not appear to be suitable for the initial implementation, either based on the

responses to the intake Questionnaire or the information gathered in the more formal Project

Application process, the RDM or PM will communicate with the Core Data Team to review

whether the project might meet the guidelines for acceptance of data during the full

implementation, or whether it is a candidate for treatment as a special project. Proposed special

projects will be reviewed by the Associate University Librarian for Digital Library Systems for

acceptance. If there are unresolved rights issues, technical issues, or if the project is otherwise

outside of the criteria for the initial implementation, the RDM or PM will assist with referrals to

appropriate personnel to resolve these issues, and attend further consultations as necessary.

Project Workflow for Full Implementation: Mediated or Self-Deposit

The full implementation of the RUL Research Data Service will offer two routes for research

data to come into RUcore: mediated and self-deposit. The mediated project process in the full

implementation will be the same as was described previously for the initial implementation, with

the additional capability of accepting datasets with more diverse characteristics and varied

requirements for controlled access. If the researcher desires a mediated process, that workflow

will be followed.

The development of tools for self-deposit of research datasets will be modeled after the existing

Faculty Deposit for scholarly publications. The online self-deposit process will require the

depositor to provide information about the project by completing the Questionnaire, the

Application Form, and the Deposit Agreement.

If the data is self-deposited, an automatic notification will be sent to the RUL Core Data Team,

who will review the completed Questionnaire, Application, signed Deposit Agreement, and data

files to determine if the project can be accepted into RUcore directly, whether it will require

mediated deposit, or if it should be considered a special data project. The researcher will receive

online confirmation that the data has been successfully uploaded, that the project is being

reviewed, and that notification of the outcome of the review will be sent within five working

days.

The Core Data Team will determine who will be the Project Manager (PM) for the duration of

the project. The PM will be a member of the Expanded Data Team, who may decide to include a

subject specialist or other librarian or staff member to assist with the project. If the Data Team

determines that there are unresolved rights issues, technical issues, or issues concerning sensitive

data, the PM will contact the researcher with referrals to appropriate personnel to resolve these

issues, and will also work with the researcher as needed to resolve any outstanding issues. The

PM will collect any outstanding documentation.

If the project does not appear to have unresolved issues, the data will be given a brief review to

check for de-identification of confidential or sensitive information if applicable. The data will

also be checked to verify that descriptive documentation is provided in the form of a “README” file, so that

researchers will be able to understand and use the data files; that file names are not

nonsensical; that the file types can be accepted into RUcore; that the files can be opened and read

in the appropriate application; that there is sufficient supplementary documentation provided

such as codebooks or questionnaires; and that any URLs are persistent. This brief review for

completeness should take no more than five working days. After that time the researcher will be

notified regarding the acceptance of the project, and given the name of the RUL Project Manager who

will become the primary contact for questions concerning the data project.
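
Parts of the brief completeness review described above could be automated before a Project Manager looks at the deposit. The sketch below is illustrative only: the report does not list the file types RUcore accepts, so `ACCEPTED_EXTENSIONS` is a hypothetical placeholder, and the test for a "nonsensical" file name is an assumed heuristic (very short names, or names with no letters). Checks such as de-identification, whether files open in the appropriate application, and URL persistence still require human review.

```python
from pathlib import Path

# Hypothetical placeholder: the report does not specify accepted file types.
ACCEPTED_EXTENSIONS = {".csv", ".txt", ".pdf", ".xml", ".json"}


def review_deposit(file_names):
    """Return a list of human-readable issues found in a deposit's file list."""
    issues = []
    names = [Path(n).name for n in file_names]

    # A README file must accompany the data so others can understand it.
    if not any(n.lower().startswith("readme") for n in names):
        issues.append("missing README file")

    for n in names:
        if n.lower().startswith("readme"):
            continue
        p = Path(n)
        # File types must be ones the repository can accept (assumed set above).
        if p.suffix.lower() not in ACCEPTED_EXTENSIONS:
            issues.append(f"file type not accepted: {n}")
        # Assumed heuristic for names that convey no meaning.
        if len(p.stem) < 3 or not any(c.isalpha() for c in p.stem):
            issues.append(f"file name may not be meaningful: {n}")
    return issues
```

For example, `review_deposit(["README.txt", "survey_responses.csv", "codebook.pdf"])` returns an empty list, while a deposit consisting only of `001.dat` would be flagged for a missing README, an unaccepted file type, and an uninformative file name.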

The PM will then work with the other members of the RUL Data Team to complete the ingest of

the data and documentation into RUcore, and will enter metadata into WMS. The minimum

amount of metadata which will be required for ingest consists of those identifiers which have been

described in the Project Application. Additional metadata may be necessary, and the PM or other

subject specialist or staff member will enter this metadata as required, but this should not delay

the ingest and release of data through RUcore. Additional metadata for project completeness

should be entered within two to four weeks after publication of the data in RUcore.

Rutgers Major Stakeholders in Research Data

9. Determine the major stakeholders at Rutgers who need to be familiar with the RUL data deposit

process.

In addition to creating a high-quality data preservation and sharing service that meets the needs

of our institutional researchers, it is important to recognize that outreach and communication are

a critically important part of the initiative. Researchers are typically busy people; they are

initially unlikely to reach out to the library as a provider of data services, since we are still

thought of by many as “the book people”. In a recent conversation with an IT professional in

SAS we heard, “RUcore is the best-kept secret on campus.” This is something we need to see

change. Fortunately, the Open Access initiative and new SOAR portal to RUcore offer an

opportunity to raise the profile of the Libraries and RUcore in the minds of faculty members.

The RUL Research Data Service will allow the Libraries to become a valued partner in

compliance with research funder data management requirements.

Stakeholders for the RUL Research Data Service include many university organizations. These

include:

Office of Research & Economic Development

Office of Research & Sponsored Programs

Institutional Review Boards

Office of Information Technology

Research faculty

Unit research facilitators

Department staff charged with supporting researchers

Academic & research unit computing staff

RUL librarians & IT staff

Peer institutions

The task force can envision three different levels of awareness that we need to develop among

our various stakeholders.

Recognition: The RUL Research Data Team would seek to create a University-wide awareness of

the existence of the RUL Research Data Service, and its potential to support research data

management through Data Management Planning as well as data preservation and sharing.

Outreach activities to support this level of awareness could include content in the RUL web site,

the ORED Newsletter, regular Rutgers email newsletters, plus periodic presentations to

established regular meetings of academic departments, OIT, and other university units. Special

marketing and promotional materials such as bookmarks, brochures, flyers, QR codes, and

digital displays could be created. Additionally, it is clear that any data service needs to be

publicized through outside journal articles and conference proceedings, which will create

additional visibility for RUL and Rutgers in general.

Implementation: Once we begin implementation beyond support for Research Data Management

Plan development, researchers and their support administrators will need clear documentation

and educational materials to support the actual process of data preservation. During the initial

implementation, the nature of datasets we will be accepting for ingest is straightforward and can

be easily described to potential depositors through our general outreach activities noted above, as

well as through more detailed presentations about the actual project processing. As we progress

toward full implementation, we will need to develop additional documents to help researchers

identify datasets that require and are suitable for preservation and/or sharing. Additionally,

informational materials should be developed and sent to all researchers as they deposit scholarly

articles to encourage them to connect their data with their publications.

Development and Support: For the long-term success of RUL Research Data Services, it will be

critical that the RUL Research Data Team maintain and enhance relations with key university

stakeholders such as ORED, IRB, and Academic Unit Deans. We must prove the value and

effectiveness of our services in research data management, preservation, and sharing, while

meeting the expectations for ease of use and efficiency. The development of a regular reporting

process by which we will inform these units of RUcore metrics that include collection access,

dataset citations, and measures of researcher satisfaction, will invite recognition and support.

Ultimately, our goal is to develop and implement communication vehicles and feedback

mechanisms to ensure that RUcore is accepted as a high-quality data repository for research

products.

RUcore Technical Support for Research Data

10. Consult with CISC to assess RUcore hardware and software infrastructure to support immediate,

three year and five year needs.

On November 26, the Guidelines for Research Data Services were reviewed with the Cyber-

Infrastructure Steering Committee (CISC), and projections of storage needs were discussed. As a

result, suggestions were incorporated into the Guidelines included in this report. The Task Force

proposed the following projected storage capacity needs, which were approved by CISC.

For the initial implementation of mediated data acceptance, which would last for

approximately one year, the storage needs are not anticipated to exceed 2 Terabytes

(TB).

For the first three years, the storage needs would be up to 20 TB.

For the first five years, it is not expected that storage needs would exceed 100 TB.

It was suggested that very large data projects which might exceed the existing available storage

capacity should be considered, and that additional storage could be purchased for the researcher

for a fee. Storage fees could be established as a reasonable method of quantifying data services

provided to the researcher.

Conclusions and Recommendations:

Researchers are working to comply with federal mandates to make funded research publicly

accessible, and are seeking easy and efficient methods of safely sharing and preserving their

data. There has been a rapid advancement of academic libraries into this arena, in an effort to

help researchers fulfill the requirements for public access to federally funded research. In

addition to institutional repositories, data specific repositories such as Dryad and ICPSR

continue to grow. Academic libraries with institutional repositories see the opportunity to

become part of the research workflow, and are actively promoting their research data services to

their communities. We believe that Rutgers University Libraries currently have the expertise,

experience, and system capabilities to accept research data in RUcore immediately, as evidenced

by the development of the pilot data portal in 2010, and the acceptance of pilot data projects

starting in 2012. Further, if we do not soon establish ourselves as participants in the sharing and

preservation of research data, we will be left behind as researchers find other ways to comply

with funding requirements.

We would seek to replicate the successes of our peer and aspirant institutions in the acceptance

of research data. Our review of institutional and data repositories found similarities in the way in

which others are facing the challenge of providing research data services, and in their research

data policies. We assumed that a policy similar to those we reviewed would be adopted by

Rutgers, and allowed this assumption to guide our thinking about data acceptance. Based on our research, we

found the commonalities in the best data policies seem to be that the university owns the data;

the Principal Investigator is the custodian of the data; and protocols exist in the event the

Principal Investigator leaves the institution. The Interim Report submitted by the Task Force in

October 2014 reviews institutional data policies in more detail, and provides examples of data

policies that were considered to have been thorough and well-written.

We also found that most of the institutional and data repositories we reviewed offered self-

deposit of data, or both self-deposit and mediated data deposit. Of the thirty-seven repositories

we reviewed, the following institutions offer self-deposit or a combination of self-deposit and

mediated deposit, with CIC institutions shown in boldface:

University of California at Berkeley

University of California at San Diego (iDASH)

University of California at San Francisco

Columbia University

Cornell University (eCommons)

Dryad

University of Edinburgh

University of Guelph

Harvard University

ICPSR

University of Illinois at Chicago

University of Illinois at Urbana-Champaign

Indiana University

University of Iowa

Johns Hopkins University

University of Maryland

University of Michigan

University of Minnesota

MIT

New York University

Ohio State University

University of Oregon

Penn State University

University of Pittsburgh

Purdue University

Stanford University

Syracuse University (QDR)

University of Texas

University of Virginia

University of Wisconsin-Madison

The Task Force believes that we should allow self-deposit of data as many of our peer and

aspirant institutions have done, and as the Libraries are already doing with scholarly articles.

With the acceptance and coming implementation of the Rutgers Open Access Policy, we can

soon expect to receive data associated with published articles in a self-deposit process. Self-

deposit is already familiar to researchers through submittals to the government’s PubMed

Central, and other established data repositories. Self-deposit of data, we feel, is a way of heeding

the advice we received from ORED with regard to data deposit: “Keep it simple.”

In order to obtain more details about the deposit processes reviewed in the Interim Report, Aletia

Morgan and Laura Palumbo had subsequent conversations with repository managers at the

University of Maryland, the University of Minnesota, Purdue University, and Penn State

University to verify procedures and staffing. They found that Maryland and Minnesota do

cursory reviews of self-deposited data for formatting issues or other obvious problems, with the

University of Minnesota aiming to complete this review within two days. Penn State allows self-

deposit with immediate visibility of uploaded data, and without review of any kind. Only

minimal metadata is required, and the researchers create their own metadata exclusively. Purdue

also performs no review of deposited data, leaving full responsibility for compliance with any

restrictions up to the researcher. While the Task Force recommends a brief review for obvious

problems and technical issues, we support the notion that by reviewing data in detail we would

accept responsibility for any errors and legal violations.

Staffing of data service teams at these institutions was also discussed, and it was determined that

a limited number of full-time staff typically work with teams of librarians and others with part-

time data responsibilities to accomplish the tasks associated with data acceptance. Time spent on

data related tasks varied, based on how much data the repository is currently accepting. It was

discovered that some staff at Purdue receive funding from grant money to accomplish their work.

We feel that RUL currently has sufficient well-trained staff to accept research data into RUcore,

as described previously.

In order to create a sustainable service, funding should be sought once the Libraries have begun

to accept research data on a regular basis. The most logical source of this funding would be from

the Office of Research and Economic Development, whose goal it is to help researchers obtain

grants and comply with funder directives. Some additional funding can be achieved by

establishing fees for additional storage capacity, which can be passed on to funders by

incorporation into grant proposals. However, because storage is relatively inexpensive, this

probably will not be a major source of income. If the Libraries can establish research data

acceptance as a core service, funding can be provided through budgeting from departments who

would benefit from this service.

Efforts should be made to maintain the relationships we have established with the Office of

Research and Economic Development, the Office of Research and Sponsored Programs, and the

Institutional Review Boards. We should continue to work to establish integration of data

acceptance in RUcore into their workflows, so that researchers are aware of the availability of

RUcore for sharing and preservation of research data; and of the related services that RUL can

provide, such as the preparation of data management plans and consulting on data projects.

These offices indicated that they are happy to work with the Libraries to establish workflows for

our mutual benefit.

The acceptance of research data into RUcore is an important service to faculty which would

highlight the expertise of the Libraries, and which would allow us to establish deeper

relationships with our research communities. It could also become a source of funding as a core

service to researchers. However, research data services must be easy to use in order to be of

value to time-pressed researchers, and to be seen as worthy of financial support. RUL research

data services will not be used if they do not become more visible soon. Researchers are obligated

to comply with funding mandates, and will find ways to do so without the Libraries if we do not

take action. We propose the immediate acceptance of mediated data projects as described in this

report, which will provide the basis for further learning and expansion of our data services. With

the benefit of this additional experience, and the resulting deepening relationships with research

faculty, we will be ready for the establishment of seamless online data acceptance, such as is

being done by our peer and aspirant institutions.

Appendices

Appendix A: Task Force on Research Data Implementation Charge

Appendix B: Research Data Service Guidelines

Appendix C: Mediated Research Data Projects Questionnaire

Appendix D: Flowchart for Mediated Data Projects Questionnaire

Appendix E: Self-Deposit, Mediated, and Special Research Data Projects Questionnaire

Appendix F: Full Implementation Flowchart for Mediated and Self-Deposited Data Projects

Appendix G: Research Data Project Application for the Responsible Researcher

Appendix H: Research Data Deposit Agreement

Appendix I: General Guidelines for Librarians advising on Research Data Projects for RUcore

Deposit

Appendix A: Task Force on Research Data Implementation Charge

Rutgers Libraries Task Force on Research Data Implementation

The task force on Research Data Implementation is charged with establishing an administrative and

evaluation framework for the deposit of research data in RUcore. This implementation process will

inform the development of a university data policy by the office of General Counsel working with our

Libraries Copyright and Licensing Librarian.

The task force should involve other individuals as necessary to do its work, and engage at the outset with

the office of the Vice President for Research and the Office of Research and Sponsored Programs to

ensure that the implementation addresses issues of importance to the research faculty and appropriate

administrative offices. The task force should also liaise with the Committee on Scholarly Communication

through its chair, Laura Mullen.

Janice Pilch, Copyright and Licensing Librarian, will work separately on drafting a data policy. When

your draft implementation plan is ready, Janice can review it from the copyright perspective. We believe

this two part process will work effectively.

This is a Cabinet Task Force under the joint leadership of Grace Agnew and Melissa Just who will

oversee and guide its work on behalf of Cabinet. We expect the plan to be completed no later than

December 2014, and Cabinet would expect a progress report mid-way through the process.

The charge to the Task Force is to:

1. Review the administrative structure of other data repositories that might serve as models.

2. Review the evaluation process for technical, legal, and confidential issues involving data deposit at

other institutions that might serve as models.

3. Consult with appropriate major stakeholders to ensure that RUL workflows and practices facilitate

and do not conflict with policies and practices of those departments, especially the office of the Vice

President for Research, and Research and Sponsored Programs.

4. Establish principles for prioritization of data deposit projects based on RUL strategic priorities. This

should include a definition of various types of potential projects to ensure that we have the resources

both to host and to sustain projects, i.e., federal grants, non-grant funded research, etc.

5. Develop a framework for evaluation for data deposit in RUcore that includes a questionnaire or series

of questionnaires to be used for each data deposit, covering technical, legal, and confidential criteria

(similar to the Digital Projects Evaluation Process approved by Cabinet in March 2013).

6. Develop a corresponding guide on evaluation criteria to provide clarity to subject librarians. (Similar

to the Deed of Gift Explanation in the RUL Deed of Gift).

7. Recommend assignments for functional responsibility in the area of data deposit.

8. Chart a workflow for the data deposit evaluation process.

9. Determine the major stakeholders at Rutgers who need to be familiar with the RUL data deposit

process.

10. Consult with CISC to assess RUcore hardware and software infrastructure to support immediate,

three year and five year needs.

Task Force Members:

Laura Palumbo, Chair

Ron Jantz

Yu-Hung Lin

Aletia Morgan

Minglu Wang

Krista White

Ryan Womack

Yingting Zhang

Yini Zhu

6/26/14

Appendix B

Research Data Service Guidelines

Research data to be accepted into RUcore will demonstrate the scholarly and scientific research

being conducted by Rutgers’ researchers. This data will contribute to the advancement of

knowledge and research in diverse subject areas, including the sciences, health sciences, social

sciences, and humanities, as described in the Libraries’ Strategic Plan. By accepting and publicly

sharing research data, we will highlight the unique ability of the Libraries to facilitate discovery,

access, and reuse of Rutgers University’s research.

Research data may be the result of unfunded as well as grant-funded research, to allow for a

broad spectrum of research areas to be included; however, projects which require data deposit to

comply with funder mandates may be given preference. Working with Rutgers’ researchers, the

Libraries will provide access to Rutgers’ research data. The Principal Investigator or Primary

Responsible Researcher, hereafter to be referred to as the Responsible Researcher and which

shall be meant to include both titles, will be responsible for assuring that the data can be shared

publicly in accordance with University policies, Federal and other funders’ directives, and is in

compliance with any legal restrictions.

Research data will be accepted insofar as the Libraries have “…the resources, including, but not

limited to, expertise, technology, and funding, to support the project, both initially and ongoing.”

(Digital Projects Evaluation Process, Rutgers University Libraries, March 2013).

We anticipate a phased approach to acceptance of data, in order to be able to scale services as

projects become more complex. During the initial implementation of data acceptance, the deposit

process will be mediated by members of the RUL Research Data Team. Researchers will work

directly with a trained Project Manager, who will guide the researcher through the deposit

process and see it through to completion. For the initial implementation of data acceptance, we

recommend the following guidelines:

Initial Implementation: Mediated Data Acceptance

• All projects will be considered, regardless of funding status. Projects which require data deposit to comply with funder mandates may be given preference.

• One of the Responsible Researchers (or co-PIs) must be Rutgers faculty or staff. This Responsible Researcher must initiate the data deposit process.

• Datasets which are associated with Rutgers graduate students’ deposited electronic theses and dissertations will be accepted.

• Short-term embargoes will be allowed, such as until the publication of a book or article.

• The Responsible Researcher will be responsible for determining that the data is appropriate for public sharing. Through a deposit agreement, they will attest that by sharing the data they will not be in violation of any confidentiality agreements, copyright laws, or other laws, and will hold Rutgers University Libraries harmless from any damages resulting from the sharing or misuse of the data.

• Total data volume will be 100 GB or less per project without storage fees. Projects requiring more than 100 GB will be reviewed by the Associate University Librarian for Digital Library Systems and considered on a case-by-case basis.

• Data which requires media transformation or digitization will be referred to the Repository Collection Librarian for necessary transformation/digitization.

• The data will not be derived from human or animal subjects. Sensitive or confidential data, if de-identified, will be accepted later in the full implementation of data services.

• The data will not be the result of research conducted on behalf of or in conjunction with any outside commercial interests. Projects which involve commercial interests will be accepted later during the full implementation of data services.

• Research data which would require technical or system modifications within RUcore will be considered on a case-by-case basis.

Full Implementation: Self-Deposited and Mediated Data Acceptance

After the initial acceptance of data as described above, self-deposit of data by the Responsible

Researcher is recommended, provided that the necessary storage capacity, staffing, and technical

components are in place. During the full implementation of data services, researchers would

have the option to use online self-deposit forms, or to choose mediated deposit as was offered in

the initial implementation. Self-deposit will require the researcher to affirm that high-level

criteria are met; the Data Team will review that affirmation. If these criteria are met, Data Deposit

and Application Forms will then be completed online, and additional internal reviews will be

performed. Once the internal review has been completed, the data will be made accessible in

accordance with any restrictions which have been placed on the data (see Evaluation of Data

Projects for Deposit and Data Deposit Workflow for details).

• Datasets that meet the initial implementation requirements will continue to be accepted in the full implementation of data services.

• Data involving human and animal subjects will be permitted with proper de-identification; IRB and other approvals must be in place. The Responsible Researcher will be responsible for determining that confidentiality and any other legal requirements are met.

• Projects requiring up to 500 GB of storage will be accepted, and fees may be assessed accordingly.

• Projects which involve commercial interests will be considered.

We anticipate that the acceptance of research data into RUcore will be an evolving process,

during which Rutgers Libraries will adapt and grow as new challenges are presented. We cannot

foresee what new developments might arise; however, we believe that RUL will be agile enough

to accommodate changing research data needs. Special, complex, and/or very large

research data projects will be considered on a case-by-case basis.

Special Research Data Projects, to be assessed on a per case basis

• Data projects which will require extensive staff time to develop may be accepted; fees may be assessed.

• Data which will require the purchase of additional storage capacity will be considered, and storage costs will be assessed.

• Projects which will require system modifications will be considered.

• Data which requires media transformation or digitization will be considered.
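
The storage thresholds in these guidelines can be read as a simple tiered rule. The sketch below is illustrative only and is not part of the Task Force's recommendations: the names `Phase` and `storage_review` are hypothetical, and treating anything above 500 GB in the full implementation as a "special project" is an assumption drawn from the special-project items on additional storage capacity. The guidelines say "100 GB or less", while the intake questionnaire asks whether the volume is "less than 100 GB"; the sketch follows the guidelines.

```python
from enum import Enum


class Phase(Enum):
    """Service phase named in the guidelines (hypothetical enum)."""
    INITIAL = "initial implementation"
    FULL = "full implementation"


FREE_LIMIT_GB = 100   # accepted without storage fees
FULL_LIMIT_GB = 500   # accepted in the full implementation; fees may be assessed


def storage_review(volume_gb: float, phase: Phase) -> str:
    """Return the storage outcome suggested by the guidelines for one project."""
    if volume_gb < 0:
        raise ValueError("volume_gb must be non-negative")
    if volume_gb <= FREE_LIMIT_GB:
        return "accept without storage fees"
    if phase is Phase.INITIAL:
        # More than 100 GB during the initial implementation is reviewed case by case.
        return "refer to AUL for Digital Library Systems (case by case)"
    if volume_gb <= FULL_LIMIT_GB:
        return "accept; fees may be assessed"
    # Assumption: larger projects fall under the special-project items.
    return "special project (case by case); storage costs will be assessed"


if __name__ == "__main__":
    for gb, phase in [(80, Phase.INITIAL), (250, Phase.INITIAL),
                      (250, Phase.FULL), (900, Phase.FULL)]:
        print(f"{gb} GB, {phase.value}: {storage_review(gb, phase)}")
```

In practice these decisions rest with Libraries staff; the sketch only makes the thresholds explicit.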

Appendix C

Mediated Research Data Projects Questionnaire

During the initial implementation of Research Data Services, data deposit will be a mediated

process. Researchers will work in conjunction with members of the proposed Libraries’ Research

Data Team, and other appropriate personnel in RUL. Members of the Research Data Team will

guide the researcher through the process of data deposit, and will begin by soliciting answers to

the following high-level criteria intake questionnaire.

The responsibility for compliance with any legal restrictions lies with the Principal

Investigator/Responsible Researcher (hereafter referred to as the Responsible Researcher). They

will assume liability for determining if their data is free from any copyright or intellectual

property constraints, sensitive or confidential information, any restrictions on public

accessibility, or any other legal and ethical issues which might prevent their depositing and

sharing the data publicly. The Libraries do not assume responsibility for these determinations

and will therefore be exempt from liability for them.

Guidance is available to researchers needing assistance with copyright issues from the Copyright

Librarian. Researchers needing assistance with issues concerning intellectual property and

commercial interests should seek advice from the Office of Technology Commercialization.

Researchers with human or animal subjects’ data should contact the Research Data Manager for

advice concerning a suitable repository for data sharing during the initial implementation of data

acceptance.

This high-level criteria intake questionnaire can be sent to the Responsible Researcher as a

pre-consultation step, or completed at the initial consultation with the Data Project Manager. This

information will determine whether the project can be considered for deposit under the

Guidelines for Research Data Services during the initial implementation of data acceptance.

Please have the Responsible Researcher answer the following questions, and sign the completed

form:

1. Is the Responsible Researcher Rutgers faculty or staff? For datasets associated with

Rutgers electronic theses and dissertations, please contact the Repository Collection

Librarian, Rhonda Marker, at [email protected].

2. Are all data in digital format? For projects requiring digitization, please contact the

Repository Collection Librarian at [email protected].

3. Is the project without human or animal subjects? Data involving human or animal

subjects will be accepted in the full implementation of data services. Please contact the

Research Data Manager, Aletia Morgan at [email protected] for guidance

concerning an appropriate repository for human or animal subjects’ data.

4. Is the project independent of support or participation of outside commercial interests?

Projects with outside commercial interests will be considered during the full

implementation of data services. For assistance with commercial issues, please consult

with the Office of Technology Commercialization at 848-932-0115 or

[email protected]

5. Have all copyright, licensing, and other legal restrictions been met? If unsure, please

consult with the Office of Technology Commercialization at 848-932-0115 or

[email protected]; and the Copyright Librarian, Janice Pilch at

[email protected].

6. May the data be made publicly accessible? We can accept data with short-term

embargoes, such as until the publication of a book or article; for projects that require

permanent preservation without public access, please contact the Research Data Manager

at [email protected].

7. Is the data the final version intended for public release? We cannot offer working storage

for ongoing research projects.

8. Is the data volume less than 100 GB for this project? If not, please consult with the

Repository Collection Librarian at [email protected].

9. Is the research funded? If so, a copy of the funded grant application will need to be

provided. The grant application is for internal use only and will not be made publicly

accessible, unless so desired.

10. If all of the above conditions are met, the Project Manager should assist the researcher, as

needed, with completion of the Research Data Project Application for the Responsible

Researcher. If the above questions cannot be affirmed, please consult the Research Data

Manager at [email protected] to see if the data can be accepted at this time.

All of the above questions have been correctly and truthfully answered in the affirmative.

Responsible Researcher Date
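
For staff who wish to track intake responses electronically, the gating questions above (1 through 8) can be sketched as a small screening routine. This is an illustration under stated assumptions, not a tool described in this report: `IntakeAnswers`, `REFERRALS`, and `screen` are hypothetical names, and the referral text paraphrases the questionnaire. Question 9 (funding) is left out because it does not block acceptance; it only determines whether a copy of the grant application is requested.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class IntakeAnswers:
    """Yes/no answers to questions 1-8 (True means 'answered in the affirmative')."""
    rutgers_faculty_or_staff: bool
    all_data_digital: bool
    no_human_or_animal_subjects: bool
    no_commercial_interests: bool
    legal_restrictions_met: bool
    publicly_accessible: bool      # immediately or after a short-term embargo
    final_version: bool
    under_100_gb: bool


# Whom to consult when a question cannot be affirmed (paraphrased from the questionnaire).
REFERRALS = {
    "rutgers_faculty_or_staff": "Repository Collection Librarian (theses and dissertations)",
    "all_data_digital": "Repository Collection Librarian (digitization)",
    "no_human_or_animal_subjects": "Research Data Manager (appropriate repository)",
    "no_commercial_interests": "Office of Technology Commercialization",
    "legal_restrictions_met": "Office of Technology Commercialization and Copyright Librarian",
    "publicly_accessible": "Research Data Manager (preservation without public access)",
    "final_version": "Research Data Manager (working storage is not offered)",
    "under_100_gb": "Repository Collection Librarian (data volume)",
}


def screen(answers: IntakeAnswers) -> List[str]:
    """Return referrals for every unaffirmed question; an empty list means proceed."""
    return [who for field, who in REFERRALS.items() if not getattr(answers, field)]


if __name__ == "__main__":
    all_yes = IntakeAnswers(True, True, True, True, True, True, True, True)
    print(screen(all_yes))                                  # nothing to resolve
    print(screen(replace(all_yes, all_data_digital=False)))  # one referral
```

An empty result corresponds to item 10: the Project Manager helps the researcher complete the Research Data Project Application.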

Appendix D: Flowchart for Mediated Data Projects Questionnaire

START

1. Is the Responsible Researcher Rutgers Faculty or Staff?
   Yes: Go to step 2.
   No: Is the researcher a grad student with a thesis or dissertation?
      Yes: See the Repository Collection Librarian.
      No: Data cannot be accepted; contact RDM for alternatives.

2. Is the data in digital format?
   Yes: Go to step 3.
   No: See the Repository Collection Librarian.

3. Are there human or animal subjects?
   No: Go to step 4.
   Yes: This data can be accepted during the Full Implementation of RD Services. See the RDM for further instructions.

4. Are there commercial interests?
   No: Go to step 5.
   Yes: This data can be accepted during the Full Implementation of RD Services. See the Office of Technology Commercialization for assistance.

5. Have all copyright, licensing, and other legal restrictions been met?
   Yes: Go to step 6.
   No: See the Copyright Librarian and/or the Office of Technology Commercialization for assistance.

6. Can the data be shared publicly now?
   Yes: Go to step 7.
   No: Can the data be shared eventually (i.e. after an embargo period)?
      Yes: Go to step 7.
      No: Contact the RDM for more information.

7. Is the data the final version for public release?
   Yes: Go to step 8.
   No: Working storage is not an option. Contact the RDM for assistance.

8. Is the data volume less than 100 GB for this project?
   Yes: Go to step 9.
   No: Contact the RDM. Project will be considered by AUL for Digital Library Systems.

9. Is the research funded?
   Yes: Provide a copy of your grant documents for internal use, then go to step 10.
   No: Go to step 10.

10. Sign form. Proceed to application form and deposit agreement.

Appendix E

Self-Deposit, Mediated, and Special Research Data Projects Questionnaire

The responsibility for compliance with any legal restrictions lies with the Principal

Investigator/Responsible Researcher (hereafter referred to as the Responsible Researcher). The

Responsible Researcher will assume liability for determining if their data is free from any

copyright or intellectual property constraints, sensitive or confidential information, any

restrictions on public accessibility, or any other legal and ethical issues which might prevent

depositing and sharing the data publicly.

Guidance is available to researchers needing assistance with copyright issues from the Copyright

Librarian. Researchers needing assistance with issues concerning intellectual property and

commercial interests should seek advice from the Office of Technology Commercialization.

Those with questions about data from human or animal subjects should contact the appropriate

Institutional Review Board.

Please answer the following questions:

1. Is the Responsible Researcher Rutgers faculty or staff? For datasets associated with

Rutgers electronic theses and dissertations, please contact the Repository Collection

Librarian, Rhonda Marker, at [email protected].

2. Are all data in digital format? For projects requiring digitization, please contact the

Repository Collection Librarian at [email protected].

3. Has IRB approval been obtained for projects which contain human or animal subjects?

Can you provide copies of all IRB approvals from all institutions? These documents can

be restricted from public access as necessary. Has all information directly identifying

the subjects been removed as required by the IRB, and have any other measures necessary

to prevent the disclosure of the subjects’ identities been taken? Prior to depositing this data,

you will be required to sign an agreement attesting that you have taken all actions

necessary to permit you to lawfully share the data through RUcore.

4. Have all copyright, licensing, and other legal restrictions been met? All commercial

interests must be disclosed. For questions about these issues, please consult with the

Office of Technology Commercialization at 848-932-0115 or [email protected];

and the Copyright Librarian, Janice Pilch at [email protected].

5. If the materials include photographs, audio recordings, or audiovisual recordings, do you

have signed release forms for use of a person’s image or voice? Copies of signed release

forms must be provided, but will be restricted from public access.

6. May the data be made publicly accessible? We can accept data with short-term

embargoes, such as until the publication of a book or article; for projects that require

permanent preservation without public access, please contact the Research Data Manager

at [email protected].

7. Is the data the final version intended for public release? We cannot offer working storage

for ongoing research projects.

8. Is the data volume less than 500 GB for this project? If not, please consult with the

Repository Collection Librarian at [email protected].

9. Is the research funded? If so, a copy of the funded grant application will need to be

provided. This document is for internal use only and will not be made publicly accessible.

10. If all of the above conditions are met, the Responsible Researcher, in addition to

providing all necessary documentation, will be required to fill out a project application

and sign a Data Deposit Agreement, prior to upload of data. If the above questions cannot

be affirmed, please consult the Research Data Manager at [email protected] to

see if the data can be accepted at this time.

All of the above questions have been correctly and truthfully answered in the affirmative.

Responsible Researcher Date

Appendix F: Full Implementation Flowchart for Mediated and Self-Deposited Data

Projects

START

1. Is the Responsible Researcher Rutgers Faculty or Staff?
   Yes: Go to step 2.
   No: Is the researcher a grad student with a thesis or dissertation?
      Yes: See the Repository Collection Librarian.
      No: Data cannot be accepted; contact RDM for alternatives.

2. Is the data in digital format?
   Yes: Go to step 3.
   No: See the Repository Collection Librarian.

3. Are there human or animal subjects?
   No: Go to step 5.
   Yes: Has IRB approval been obtained?
      Yes: Go to step 4.
      No: Contact your local IRB.

4. Has all personal, confidential, or sensitive information been removed?
   Yes: Go to step 5.
   No: Contact ORED and RDM for assistance.

5. Have all copyright, licensing, and other legal restrictions been met?
   Yes: Go to step 6.
   No: See the Copyright Librarian and/or the Office of Technology Commercialization for assistance.

6. Have all commercial interests been disclosed?
   Yes: Go to step 7.
   No: See the Copyright Librarian and/or the Office of Technology Commercialization for assistance.

7. Have release forms been obtained for images or audiovisual recordings?
   Yes or N/A: Go to step 8.
   No: Copies of signed release forms must be provided (see Appendix E, question 5).

8. Can the data be shared publicly now?
   Yes: Go to step 9.
   No: Can the data be shared eventually (i.e. after an embargo period)?
      Yes: Go to step 9.
      No: Contact the RDM for more information.

9. Is the data the final version for public release?
   Yes: Go to step 10.
   No: Working storage is not an option. Contact the RDM for assistance.

10. Is the data volume less than 500 GB for this project?
   Yes: Go to step 11.
   No: Contact the RDM. Project will be considered by AUL for Digital Library Systems.

11. Is the research funded?
   Yes: Provide a copy of your grant documents for internal use, then go to step 12.
   No: Go to step 12.

12. Sign form. Proceed to application form and deposit agreement.

Appendix G

Research Data Project Application for the Responsible Researcher

Please complete the following information about your research data:

1. Project Title

2. Name of Principal Investigator/Responsible Researcher, Department and lab group, e-

mail address and Rutgers NetID. The Principal Investigator/Responsible Researcher must

be affiliated with Rutgers.

3. List the contributors whose names will be associated with the project. Please list

departments and lab groups, e-mail addresses and Rutgers NetIDs. If not Rutgers

affiliates, for each researcher please list his or her current institutional association and

department, e-mail address, phone number, and physical address. Briefly describe their

role(s) in the project.

4. An abstract describing the project.

5. Keywords which will help identify your data domain(s).

6. Any links or additional resources associated with the project.

7. Documentation to allow users to understand the nature of your data, even if they are not

familiar with your subject area. A README file should be included which describes

your data files. Other supplementary files such as codebooks or questionnaires should

also be provided.

8. A list of files or data objects and the format of these files/objects. All data objects must

be in digital format; for data which requires digitization please contact the Repository

Collection Librarian, Rhonda Marker, at [email protected].

9. If your data contains copyrighted material, please provide proof of permission to share

the data publicly via RUcore.

10. Funding sources for the project, if any. Please provide a copy of your grant application

and approval for internal use.

11. Any constraint on sharing the data not previously described, such as an embargo period.

If your data will be restricted to use only by Rutgers researchers who meet certain

criteria, please describe the archiving and re-use conditions you believe are appropriate to

your data. We will contact you to discuss this further.

12. Any approvals obtained or in process, which are not already provided.

(For Full Implementation of data services projects only)

13. If your project contains data from human or animal subjects, please describe the level of

the sensitivity of the data. Indicate if the data contains confidential information, and

describe the methods you used to anonymize your data. You must submit your IRB

application and approval along with the data.

I have provided all of the above information as the Responsible Researcher for this data project.

All of the information is correct, and I have not knowingly withheld or misrepresented any

information.

Responsible Researcher Date

Appendix H

Research Data Deposit Agreement

As the Responsible Researcher and custodian of this research data (“Work”), I hereby grant to

Rutgers University Libraries the non-exclusive right to retain, reproduce, and distribute the

deposited work in whole or in part, in and from its electronic format, without fee. This agreement

does not represent a transfer of copyright to Rutgers University Libraries.

Rutgers University Libraries may make and keep more than one copy of the Work for purposes

of security, backup, preservation, and access, and may migrate the Work to any medium or

format for the purpose of preservation and access in the future. Rutgers University Libraries will

not make any alteration, other than as allowed by this agreement, to the Work.

I represent and warrant to Rutgers University Libraries that the Work is my original work. I also

represent that the Work does not, to the best of my knowledge, infringe or violate any rights of

others. It does not contain any confidential or sensitive information.

I further represent and warrant that I have obtained all necessary rights to permit Rutgers

University Libraries to reproduce and distribute the Work and that any third-party owned content

is clearly identified and acknowledged within the Work.

By granting this license, I acknowledge that I have read and agreed to the terms of this

agreement and all related RUcore and Rutgers policies at

http://rucore.libraries.rutgers.edu/policies/ and http://policies.rutgers.edu/ . I hold Rutgers

University Libraries harmless from any damages incurred as a result of public sharing and/or

reuse of this data.

Name (please print)____________________________________________________________________

Signature ___________________________________________ Date ____________________________

Address ______________________________________________________________________________

City __________________________________________ State ____________ Zip _________________

Telephone Number (____) _____________ E-mail: _________________________________________

Author and Title of Work Department and School

_____________________________________________________________________________________

Co-Researchers Department and School

_____________________________________________________________________________________

Appendix I

General Guidelines for Librarians advising on Research Data Projects for RUcore deposit

This document uses the term Responsible Researcher (RR) to indicate the lead researcher

responsible for the data project under consideration. In the case of a grant-funded project, the

Responsible Researcher is the Principal Investigator. If the project is not grant-funded, the

Responsible Researcher is the person responsible for creating the data or supervising data

collection.

About the Responsible Researcher

1. Librarians may be approached by various levels of personnel who have worked on a data

project, including graduate students, administrative staff, and others. While these contacts may

be useful for providing information about the project, and may ultimately be involved in the

data transfer and review, the librarian must determine who the Responsible Researcher is for

the project. Only the RR can vouch for the ultimate veracity and compliance of the data, and

RUL will need the RR to participate in the deposit process. The RR for the project must be

currently employed by Rutgers.

2. Joint research projects with non-Rutgers partners must proceed with contact through the

Rutgers RR. The Rutgers RR must consider whether rights and permissions are jointly held,

and whether it is acceptable to all parties for any non-Rutgers or shared data to be deposited at

Rutgers.

Copyright, Commercial Interests, Sensitive Information, and other rights issues

3. Copyright to data is held by the University, but the RR as the creator and steward of the data

has the right to share the data in accordance with funder requirements and in furtherance of

scholarly communication. The RR should be familiar with these policy issues and be able to

affirm that they are the person with the right to deposit the data (in anticipation of an

accepted data policy). Guidance may be obtained from the Copyright Librarian.

4. If data is based on other data sources (e.g., extracts from commercial databases, data shared

under a use agreement), the RR should verify that the data to be shared does not violate any

usage terms. The RR should provide information about the usage terms in these cases.

5. If data includes audiovisual materials involving people, the RR must provide copies of the

release forms or consent agreements for the study participants.

6. The RR should be prepared to submit grant documents and IRB approval documents along

with the data submission. These materials will not be publicly shared, but are necessary for

RUL staff to ensure that the data will be properly maintained over time in accordance with

the original grant and IRB terms.

7. The librarian should obtain from the RR information about any related publications, and offer

to have these deposited in RUcore.

8. If a project involves potential patent or commercialization issues, the RR should verify that

the public release of data is permissible and does not interfere with potential

commercialization. The RR should consult with the Office of Technology

Commercialization in such cases. In practice, the RR will likely be well aware of such issues

and have been working with OTC from the start of the project.

9. If the data involves human subjects or other sensitive information, the RR should be sure that

the data for public release has no personally identifiable information (PII) or similar issues.

If the project has been through IRB review, many of these issues will already have been

considered, and the distribution of public release data should be discussed in the IRB

documents. HIPAA regulations would apply to health data, and FERPA regulations would

apply to educational data, but even in cases where no statutory obligations hold, ethical

concerns would prohibit certain kinds of data release. The RR should consider these issues

in the preparation of data for widespread public dissemination.

Storage, Access, and Files

10. The RR should understand that RUcore does not provide working storage, and that data

provided should be the final form intended for release. To a limited degree, we can

accommodate versioning of files that change over the course of a project, but each version

should be intended for release.

11. The RR should understand that RUcore is primarily intended for public data distribution.

Embargoes of data for a limited time (e.g., until publication of a related article/book) can be

accommodated. There should be a compelling reason for data to be subject to longer term

embargoes.

12. Similarly, data access can be restricted to specific user groups, such as Rutgers users or

individual users who have signed a usage agreement, but there should be a compelling reason

for these restrictions.

13. The RR should be encouraged to deposit data in widely used, open formats when available.

These formats reach a wider audience and improve the prospects for long-term preservation.

Data in proprietary or unusual formats will be accepted, however, if that is necessary to

support the researchers’ workflow. All data should be accompanied by a “ReadMe” file, so

that users have sufficient information to be able to understand and use the data.

14. Data deposit is intended for “born digital” files or files that are already digitized. If the RR has

analog materials that need to be converted prior to deposit, the librarian should advise the RR

about the digitization process offered by the Libraries.

15. The librarian should ascertain the approximate total file size and number of files that the RR

intends to deposit. If a large quantity of storage space is needed, the librarian can advise on

the costs of storage. If there are a large number of files which will require distinct metadata,

the librarian should consult with the Head of Central Technical Services to estimate the

complexity of the project. Multiple files can be bundled into zip files, and RUcore’s

Directory Ingest tool can be used to represent complex structures, so the absolute quantity of

files is less important than whether or not the project will require extra staff time for

metadata work.
