rebecca-reznik-zellen
Data Management Basics workshop slides for
Data Management Basics
A Workshop for Graduate Students
March 1, 2013
WHY MANAGE DATA?
1. Funders Require It
• National Institutes of Health: Data Sharing Policy (2003)
  • All grants funded at $500K or above must include a Data Sharing Plan
• National Science Foundation: Data Management Plan Requirement (2011)
  • All proposals must submit a 2 pp supplementary “Data Management Plan” to describe how projects will comply with NSF data sharing policy
• National Endowment for the Humanities: Sustainability and Data Management Plans Requirement (2012)
  • Digital Humanities Implementation Grants must include a plan to discuss how data will be managed, disseminated, and preserved
• OSTP Directive to Funding Agencies (2013)
  • Federal agencies with more than $100M in R&D expenditures must ensure that published results of federally funded research are freely available to the public within one year of publication, including data
National Science Foundation
• Data Management Plan Requirement
  • How projects will conform to NSF data sharing policy
  • Flexible
    • “The plan should reflect best practices in your area of research, and should be appropriate to the data you generate.”
• Directorate for Social, Behavioral and Economic Sciences
  • Discipline-specific guidelines
    • Archeology (Digital Archeological Record)
    • Economics (American Economic Association)
• Universals (for the NSF Universe)
  • What data are generated by your research?
  • What is your plan for managing the data?
2. It Makes Life Easier
• For you…
  • Increases efficiency
    • Easier to understand the data collected throughout the life cycle of the project
    • Easier to find the data that you need throughout the life cycle of the project
  • Satisfies applicable legal obligations
  • Addresses preservation, documentation, verification issues
  • Helps reviewers understand the characteristics of your data
  • Increases citation rates for articles
• For others…
  • Provides continuity – other researchers can build on your data
  • Enhances longevity and usability
  • Facilitates new discoveries
  • Supports open access
3. It’s the Right Thing To Do
Responsible Conduct of Research/Research Ethics
• Data Acquisition, Management, Sharing and Ownership
  • Using the appropriate research method
  • Providing attention to detail
  • Obtaining appropriate permissions
  • Recording data accurately and securely
  • Maintaining data so that it can confirm research findings, establish priority, and be reanalyzed by other researchers
  • Storing data so that it stays confidential, is secure from physical and electronic damage, destruction or theft, and is kept for the time frame dictated by sponsor and University policies
Compliance
• Research using Human Subjects (Institutional Review Board)
SMART DATA PRACTICES
Naming Your Files
Organizing Your Data
Backup and Storage
Post-Project Considerations
Organizing Your Data
• Getting Started
  • Consider your goals
    • What do you want to get out of managing your data?
    • What is the most efficient way to organize your data?
  • Figure out your criteria for keeping data
  • Think about where you want your data to end up
filename = chief identifier for a research data file
File naming and labeling
Organization
Context
Consistency
Some potential components for your file naming strategy
• Version number
• Date of creation
• Name of creator
• Description of content
• Name of individual/research team/department
• Publication date
• Project number
Organizing Your Data
W. E. B. Du Bois, Niagara delegate meeting, Boston, 1907. W. E. B. Du Bois Papers (MS 312). Special Collections and University Archives, University Libraries, University of Massachusetts Amherst
Organizing Your Data
• Let’s Clean Up Those File Names
  • abcdefghijklmnopqrstuvwxyz.jpg
    • doesn’t make much sense, does it?
  • How about:
    • 20120925_credo_du_bois_rrz_001.jpg
  • And I put it in a directory called:
    • credo_du_bois
Organizing Your Data
• Why this structure?
  • Oh, I just made it up! But I’m going to be consistent
    • 20120925 = date I found the image
    • credo = database/collection where I found the image
    • du_bois = image subject
    • rrz = my initials (I am working in a group!)
    • 001 = an accession number (I made that up, too, but I’ll continue to use that schema)
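A scheme like this can also be generated rather than typed, so every name comes out with the same components in the same order. The following is a minimal Python sketch, assuming the five components from the slide; the function name `build_filename` and its parameters are hypothetical and not part of the workshop.

```python
from datetime import date


def build_filename(found_on, collection, subject, initials, accession, ext="jpg"):
    """Assemble date_collection_subject_initials_accession.ext (hypothetical helper)."""
    stamp = found_on.strftime("%Y%m%d")   # e.g. 20120925
    number = f"{accession:03d}"           # zero-padded accession number, e.g. 001
    return f"{stamp}_{collection}_{subject}_{initials}_{number}.{ext}"


print(build_filename(date(2012, 9, 25), "credo", "du_bois", "rrz", 1))
# prints 20120925_credo_du_bois_rrz_001.jpg
```

Zero-padding the accession number keeps 002 ahead of 010 when the files are listed alphabetically.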
BAD naming practices
• Using generic data file names that may conflict when moved from one location to another
• Failing to think about scale
• Using special characters in a filename such as: & * % $ £ ] { ! @
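Special characters are easy to spot automatically before files are shared or moved. The sketch below is one possible check in Python. The "safe" set (ASCII letters, digits, underscore, hyphen, period) is an assumption that is stricter than the slide, since it also flags spaces, and the name `problem_characters` is hypothetical.

```python
import re

# Assumption: only ASCII letters, digits, underscore, hyphen and period count as safe.
# The slide lists & * % $ £ ] { ! @ as examples; this pattern also flags spaces.
UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def problem_characters(filename):
    """Return the distinct characters in filename that fall outside the safe set."""
    return sorted(set(UNSAFE.findall(filename)))


for name in ["20120925_credo_du_bois_rrz_001.jpg", "results & notes!.xls"]:
    print(name, problem_characters(name))
```

An empty list means the name passes the check. Anything else shows which characters to replace.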
Versioning
• Use ordinal numbers (1,2,3) for major version changes and the decimal for minor changes: v1, v1.1, v2.6
• Beware of using confusing labels: revision, final, final2, definitive_copy
• Discard or delete obsolete versions
• Use an auto-backup facility (if available) rather than saving or archiving multiple versions
• Turn on versioning or tracking in collaborative documents or storage utilities such as Wikis, GoogleDocs, etc.
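Labels in the v1, v1.1, v2.6 style can be compared and incremented mechanically, which is harder with labels like final2. Here is a short Python sketch, assuming labels always look like `v<major>` or `v<major>.<minor>`; the helper names `version_key` and `next_version` are hypothetical.

```python
def version_key(label):
    """Turn 'v1', 'v1.1' or 'v2.6' into a (major, minor) tuple of integers."""
    major, _, minor = label.lstrip("v").partition(".")
    return int(major), int(minor or 0)


def next_version(label, major_change=False):
    """Return the label that follows the given one (hypothetical helper)."""
    major, minor = version_key(label)
    if major_change:
        return f"v{major + 1}"
    return f"v{major}.{minor + 1}"


labels = ["v2.6", "v10", "v1.1", "v1"]
print(sorted(labels, key=version_key))          # ['v1', 'v1.1', 'v2.6', 'v10']
print(next_version("v1.1"))                     # v1.2
print(next_version("v1.1", major_change=True))  # v2
```

Sorting on the numeric key matters because a plain alphabetical sort would place v10 before v2.6.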
Quiz! File naming by date
What is the best filename?
A. 2012-09-25_Attachment
B. 25 September 2012 Attachment
C. 25092012attch
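One property worth weighing when comparing date formats is how the names sort. A quick Python check, using made-up dates, shows that year-month-day strings sort chronologically while day-month-year strings do not.

```python
from datetime import date

# Made-up example dates, deliberately out of order.
days = [date(2012, 9, 25), date(2012, 10, 3), date(2011, 12, 31)]

iso_names = sorted(d.isoformat() for d in days)          # YYYY-MM-DD
dmy_names = sorted(d.strftime("%d%m%Y") for d in days)   # DDMMYYYY

print(iso_names)  # ['2011-12-31', '2012-09-25', '2012-10-03']
print(dmy_names)  # ['03102012', '25092012', '31122011']
```

The first list is in calendar order. The second puts October 2012 ahead of December 2011.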
Quiz! File naming by description
What is the best filename?
A. dubois_great_barrington_recent_20120925_old version.docx
B. 2012-09-25_dubois_great_barrington_V1.docx
C. FFTX_2365498_old.docx
Organizing Your Data
• Organizational methods
  • Hierarchical
  • Tag-based
• Retrieval
  • Location-based
  • Search-based
“Very little skill is needed to actually be organized and efficient…. just the consciousness to put this file or folder in the right place.”
Organizing Your Data
Use folders!
DuBois
DuBois_Images
DuBois_Images/1868-1898/
DuBois_Images/1898-1928/
DuBois_Letters
DuBois_Letters/1868-1898/
DuBois_Letters/1898-1928/
DuBois_Newspapers/
etc.
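A folder layout like this can be created in one step so that every project starts from the same skeleton. The sketch below uses Python's `pathlib` in a temporary directory. Nesting everything under a top-level `DuBois` folder is an assumption about the slide's diagram, and `make_tree` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path

# Assumption: the slide's folders all sit under one top-level "DuBois" folder.
FOLDERS = [
    "DuBois/DuBois_Images/1868-1898",
    "DuBois/DuBois_Images/1898-1928",
    "DuBois/DuBois_Letters/1868-1898",
    "DuBois/DuBois_Letters/1898-1928",
    "DuBois/DuBois_Newspapers",
]


def make_tree(root):
    """Create the folder skeleton under root and return the folders that now exist."""
    root = Path(root)
    for relative in FOLDERS:
        (root / relative).mkdir(parents=True, exist_ok=True)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    for folder in make_tree(tmp):
        print(folder)
```

Because `exist_ok=True` is set, running it again on an existing project leaves the folders already there untouched.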
Archive what you don’t or won’t need
• Decide what your final data sets are
• Once your project is over, weed out obsolete data and decide what you want to keep for the long-term
• Move files and folders to an ‘Archive’ or ‘Old files’ folder
  • z_archive
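Moving material into an archive folder can be scripted so that nothing is overwritten by accident. This is a minimal Python sketch, assuming the archive lives inside the project folder under the `z_archive` name from the slide; the function `move_to_archive` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def move_to_archive(item, project_root, archive_name="z_archive"):
    """Move a file or folder into <project_root>/<archive_name>, refusing to overwrite."""
    archive_dir = Path(project_root) / archive_name
    archive_dir.mkdir(exist_ok=True)
    target = archive_dir / Path(item).name
    if target.exists():
        raise FileExistsError(f"{target} already exists in the archive")
    shutil.move(str(item), str(target))
    return target


# Demonstration in a throwaway directory with a made-up file.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    draft = project / "2012-09-25_dubois_great_barrington_V1.docx"
    draft.write_text("obsolete draft")
    archived = move_to_archive(draft, project)
    print(archived.relative_to(project).as_posix())
    print(draft.exists())  # False: the file now lives only in the archive
```

A name starting with `z_` will usually sort to the bottom of an alphabetical folder listing, which keeps the archive out of the way.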
Backup and Storage
January 2011: “Stolen laptop contains cancer cure data”
Backup and Storage
• Backup is an essential component of data management
  • Protect against accidental or malicious data loss
  • Restore original data
• Keep 3 copies: Original, External Local, External Remote
• Consider
  • How much?
  • How frequently?
  • Which media?
  • Synchronization
• Test your system
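One way to test a backup system is to confirm that each copy is still byte-for-byte identical to the original. The Python sketch below compares SHA-256 checksums. The file names are made up, the "remote" copy is simulated with a local file, and `sha256_of` and `verify_copies` are hypothetical helper names.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(original, copies):
    """Map each copy's file name to True if its checksum matches the original."""
    expected = sha256_of(original)
    return {Path(copy).name: sha256_of(copy) == expected for copy in copies}


# Demonstration with throwaway files; the remote copy is damaged on purpose.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / "original.csv"
    original.write_text("id,value\n1,42\n")
    local = root / "external_local.csv"
    remote = root / "external_remote.csv"
    shutil.copy2(original, local)
    shutil.copy2(original, remote)
    remote.write_text("id,value\n1,41\n")  # simulate silent corruption
    print(verify_copies(original, [local, remote]))
```

A check like this only shows that the copies match today. It does not replace an occasional full restore.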
Backup and Storage
• Accessibility of data depends on storage media and file format
  • Vulnerable to deterioration
  • Become obsolete over time
• Plan for disruption
• Consider
  • Non-proprietary file formats
  • Different media types in storage strategy
  • Migrate data
  • Unencrypted, uncompressed
Backup and Storage
• Security
  • Encryption can be used for safely moving or storing files
    • Encrypting files on storage devices (flash drives)
    • Encryption during file transfer (e.g., WinSCP)
    • Encrypted storage services
• Deleting Data
  • Weed out obsolete data and decide what you want to keep for the long-term
  • Deleting a file does not necessarily erase its contents from the storage device
• Other things to Consider
  • How will the data be used?
  • Who pays for storage?
Post-Project Activities
• Publication? Sharing?
• Intellectual Property
  • Copyright
    • Creative Commons
• Platforms?
  • ScholarWorks@UMass Amherst
  • ICPSR
• Copyright & Information Policy Librarian: Laura Quilter [email protected]
Data Management is About Planning
Data management will:
• Prevent bad things from happening to your data;
• Make you a more efficient researcher;
• Prepare you for grant management.
Collection Description
Storage and Backup Access
Data Management Plans
NSF
• The types of data;
• The standards to be used for data and metadata format and content;
• The policies for access and sharing;
• The policies and provisions for re-use, re-distribution, and the production of derivatives; and
• The plans for archiving and for preservation of access.
RESOURCES
Planning
• Data Working Group (email [email protected])
  • Digital projects
  • Long-term preservation
  • Assessment
• Web resources
  • UMass Amherst Libraries: General Resources (http://guides.library.umass.edu/datamanagement)
  • Discipline-specific
• Your faculty
• Your mentors
• Your professional associations
• Industry partners
• Public engagement
Backup and Storage
• Storage
  • Udrive (http://www.oit.umass.edu/udrive)
  • Departmental servers
  • CDs/DVDs/external hard drives
• Filesharing (see http://chronicle.com/blogs/profhacker/protecting-your-data/37350)
  • Dropbox
  • Google Docs
• Cloud Storage
  • Amazon Web Services
  • Rackspace
  • Microsoft Azure
  • Sugar Sync
• Additional Information
  • MIT on Backups and Security http://libraries.mit.edu/guides/subjects/data-management/backups.html
  • UK Data Archive on Data Storage http://www.data-archive.ac.uk/create-manage/storage
  • UK Preservation Office “Caring for CDs and DVDs” http://www.bl.uk/blpac/pdf/cd.pdf
Tools
Information Management
• Devonthink http://www.devontechnologies.com
• Yojimbo http://www.barebones.com/products/yojimbo
• EverNote http://www.evernote.com/about/home.php
• Scribe (Mac, Windows, Free) http://chnm.gmu.edu/tools/scribe/
• Springpad http://springpadit.com/home
Citation Management
• Mendeley http://www.mendeley.com/features/
• Zotero http://www.zotero.org/
• RefWorks http://guides.library.umass.edu/refworksatumass
Desktop Search Tools
• Windows Search http://www.microsoft.com/en-us/download/details.aspx?id=23
• UltraSearch http://www.jam-software.com/ultrasearch/
• Locate 32 http://locate32.cogit.net/
Tagging Tools
• Tabbles http://tabbles.net/
• TaggTool http://www.taggtool.com/index.php
• TaggedFrog http://lunarfrog.com/taggedfrog/
Tool Directories
• Bamboo DiRT http://dirt.projectbamboo.org/
• CHNM Research + Tools http://chnm.gmu.edu/research-and-tools/
Sources
• MIT Data Management (http://libraries.mit.edu/guides/subjects/data-management/)
• UK Data Archive (http://www.data-archive.ac.uk/)
• MANTRA (http://datalib.edina.ac.uk/mantra/organisingdata.html)
• Creating Order from Chaos: 9 Great Ideas for Managing Your Computer Files (http://www.makeuseof.com/tag/creating-order-chaos-9-great-ideas-managing-computer-files/)
• Research Information Management: Tools for the Humanities (http://sudamih.oucs.ox.ac.uk/docs/Generic%20Courses/Tools%20for%20the%20Humanities%20course%20book.docx)
Questions/contact