heather-coates
Creating a usable plan to handle your data
Practical Data Management Planning
Data Topics Workshop Series: Fall 2014
Meet & Greet
• First Name
• Program or Department
• Are you currently involved with a research project?
• Individually or as part of a team?
Scenario
Four years after your article is published, a researcher in your field contacts you with questions about the integrity of the data.
• Can you find the files supporting your published findings?
• Can you access and view the files?
• Can you justify your rationale for the procedures based on your documentation?
• Can someone pick up your research and build on it?
Agenda
• The Big Picture
• Practical Strategies
• Activities
Timeline
• Presentation: 10-15 minutes
• Activity 1:
  • Individual Work: 10 minutes
  • Small Group Discussion: 5 minutes
  • Large Group Discussion: 5 minutes
• Activity 2:
  • Individual Work: 10 minutes
  • Small Group Discussion: 5 minutes
  • Large Group Discussion: 5 minutes
• Review | Q&A
Photo courtesy of www.carboafrica.net
Data is collected from sensors, sensor networks, remote sensing, observations, and more - this calls for increased attention to data management and stewardship
Data Deluge
Photo courtesy of http://modis.gsfc.nasa.gov/
Photo courtesy of http://www.futurlec.com
CC image by CIMMYT on Flickr
Image collected by Viv Hutchinson
CC image by tajai on Flickr
• Natural disaster
• Facilities infrastructure failure
• Storage failure
• Server hardware/software failure
• Application software failure
• External dependencies (e.g. PKI failure)
• Format obsolescence
• Legal encumbrance
• Human error
• Malicious attack by human or automated agents
• Loss of staffing competencies
• Loss of institutional commitment
• Loss of financial stability
• Changes in user expectations and requirements
The World of Data Around Us: Data Loss
CC image by Sharyn Morrow on Flickr
CC image by momboleum on Flickr
Poor Data Management Affects Everyone
“MEDICARE PAYMENT ERRORS NEAR $20B” (CNN), December 2004
Miscoding and billing errors from doctors and hospitals totaled $20,000,000,000 in FY2003 (9.3% error rate). The error rate measured claims that were paid despite being medically unnecessary, inadequately documented, or improperly coded. In some instances, Medicare asked health care providers for medical records to back up their claims and got no response. The survey did not document instances of alleged fraud. This error rate was actually an improvement over the previous fiscal year (9.8% error rate).
“AUDIT: JUSTICE STATS ON ANTI-TERROR CASES FLAWED” (AP), February 2007
The Justice Department Inspector General found that only two of 26 sets of data concerning terrorism attacks were accurate. The Justice Department uses these statistics to argue for its budget. The Inspector General said the data “appear to be the result of decentralized and haphazard methods of collections … and do not appear to be intentional.”
“OOPS! TECH ERROR WIPES OUT ALASKA INFO” (AP), March 2007
A technician managed to delete the data and backup for the $38 billion Alaska oil revenue fund, money received by residents of the State. Correcting the errors cost the State an additional $220,700 (which of course was taken off the receipts to Alaska residents).
Policy Drivers
• Funding agencies
  • Increased impact of funding dollars
  • Reduce redundant data collection
  • Further scientific research
• Research Communities
  • Enhance use and value of existing data
  • Address big challenges
Personal Benefits
• Efficiency
• Safety
• Quality
• Reputation
• Compliance
Data management is an investment in your research.
Humphrey, Knowledge Creation Cycle
Practical DMPs
• ARE NOT
  • a checkbox on a funding proposal
  • NOR a statement about why you can’t share the data underlying published findings
• ARE
  • Customized to address the issues most relevant to your research
  • A guide to help you plan out how you will deal with your data before you have a messy pile of bits on your hands
Good Data Management Practices
• Begin with the end in mind OR Produce report-ready outputs
• Plan, test, revise, plan, test, revise…implement
• Include all stakeholders in the design of the protocol, data collection tools, data management plan, etc.
• Document, document, document
  • Specify documents required for reproducible research
  • Facilitates clear communication and shared understanding throughout the project
• Specify roles and responsibilities from the beginning
Activity 1: Data Inventory
• On your own
  • Read the case study
  • Fill in as much of the data inventory as you can
• In small groups (2-3 people), share and discuss
  • Your inventories
  • What information is missing from the case study?
  • What information might you need that isn’t listed in the table?
• Whole group discussion
  • How might this be helpful throughout various phases of the project?
  • Do you have all of this information documented for your own projects?
  • If not, what are the barriers?
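For projects whose files already sit in a folder, part of a data inventory can be drafted automatically. The Python sketch below is only an illustration and is not part of the workshop materials: the column names (relative_path, format, size_bytes, modified_utc, sha256) and the function names are assumptions chosen for this example, and the workshop's own inventory table may ask for different information.

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(root):
    """Describe every file under root (hypothetical column names)."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # skip directories
        info = path.stat()
        modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
        rows.append({
            "relative_path": path.relative_to(root).as_posix(),
            "format": path.suffix.lower().lstrip(".") or "unknown",
            "size_bytes": info.st_size,
            "modified_utc": modified.isoformat(timespec="seconds"),
            "sha256": sha256_of(path),
        })
    return rows


def write_inventory(rows, csv_path):
    """Save the inventory rows as a CSV file with a header row."""
    fields = ["relative_path", "format", "size_bytes", "modified_utc", "sha256"]
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```

Calling write_inventory(build_inventory("my_project"), "inventory.csv") would list the files in a folder named my_project (a placeholder name). Information a script cannot know, such as who collected each file or what consent covers it, still has to be added by hand.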
Activity 2: Starting a storage & backup plan
• On your own
  • Read the case study again
  • Identify primary storage location
  • Identify backup storage locations
• In small groups (2-3 people), share and discuss
  • Identify potential ethical or legal issues that could impact data storage
  • Identify risks of selected storage locations
• Whole group discussion
  • Identify resources @ IUPUI for various types of security needs
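Once primary and backup locations are chosen, it helps to confirm that each backup copy really matches the original. The Python sketch below is one possible way to do that under simple assumptions: the backup locations are ordinary folders reachable from your computer, and the function name back_up is made up for this example. It is not an IUPUI service or a recommended tool.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(primary, backup_dirs):
    """Copy one file into each backup folder and verify every copy.

    Returns a dict mapping each copy's path to True when its checksum
    matches the primary file, False otherwise.
    """
    primary = Path(primary)
    expected = sha256_of(primary)
    results = {}
    for backup_dir in backup_dirs:
        backup_dir = Path(backup_dir)
        backup_dir.mkdir(parents=True, exist_ok=True)
        copy_path = backup_dir / primary.name
        shutil.copy2(primary, copy_path)  # copies content and timestamps
        results[str(copy_path)] = sha256_of(copy_path) == expected
    return results
```

A matching checksum shows only that the copy was identical when it was made. Whether a location is appropriate for sensitive data is a separate question that the ethical and legal discussion above has to answer.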
Resources
1. DataONE. DataONE Education Module: Data Management. Retrieved December 2013 from http://www.dataone.org/sites/all/documents/L01_DataManagement.pptx
2. Cook, B. (2013). NACP All Investigator Meeting: Data Management Practices for Early Career Scientists. Presented February 3, 2013. Retrieved from http://daac.ornl.gov/NACP_AIM_2013/NACP_AIM_Agenda.html
3. Society for Clinical Data Management. (2013). Good Clinical Data Management Practices. Washington, D.C.