Metadata considerations for the curation and preservation of research data Matt Carruthers Metadata Projects Librarian University of Michigan




Traditional view: Focus on end products of research (articles, books, reports, etc.)


Reproducible research is hard to come by

(http://www.nytimes.com/2015/08/28/science/many-social-science-findings-not-as-strong-as-claimed-study-says.html?_r=0)


Grant-funding institutions are beginning to mandate that recipients have plans to manage and preserve their data sets.


• Alfred P. Sloan Foundation
• National Science Foundation
• Department of Energy
• Gulf of Mexico Research Initiative
• Gordon and Betty Moore Foundation
• Institute of Museum and Library Services
• Department of Education
• Joint Fire Science Program
• National Endowment for the Humanities
• National Institutes of Health
• National Oceanic and Atmospheric Administration
• U.S. Geological Survey
• U.S. Department of Agriculture


Deep Blue Data

University of Michigan Research Data Repository (UMRDR)


What is metadata?

“Data about data”


1. Metadata describes the content, quality, condition, and other characteristics of data.

2. Metadata is standardized, structured information about an object that facilitates functions associated with that object. (Discovery, management, rights and access control, reuse.)


Describing collections of digital objects

• Digital collections

• Online finding aids

• Digital Exhibits

Encoded Archival Description (EAD)

Text Encoding Initiative (TEI)

Dublin Core

Metadata Object Description Schema (MODS)

Metadata Encoding and Transmission Standard (METS)
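
To make one of the standards above concrete, here is a minimal sketch of a Dublin Core description serialized as XML with Python's standard library. The `record` wrapper element, the `build_dc_record` helper, and all field values are hypothetical and chosen only for illustration; a real repository will have its own required elements and container format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build a flat record from a dict of Dublin Core element name -> value(s).

    The <record> wrapper is an assumption for this sketch, not part of
    Dublin Core itself.
    """
    root = ET.Element("record")
    for name, values in fields.items():
        if isinstance(values, str):
            values = [values]  # allow a single value or a list of values
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return root


# Hypothetical data set description.
record = build_dc_record({
    "title": "Example survey data set",
    "creator": ["Doe, Jane", "Roe, Richard"],
    "date": "2015",
    "type": "Dataset",
    "rights": "CC BY 4.0",
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Repeating an element (two `dc:creator` entries here) is allowed in Dublin Core, which is why the helper accepts either a single string or a list.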


New things to consider:

• How do you represent context in a digital environment?

• How do you facilitate long-term preservation of digital files with vastly different characteristics?

• How do you track changes to digital objects over time?


Levels of Metadata and Documentation:

Study-level: provides an overview of the research context and design, data collection methods, data preparation and results or findings, etc.

Data-level: provides labelling and documentation of individual data items, such as names and descriptions of variables, and explanations of codes and classification schemes used. It can be embedded within a data collection or recorded in an accompanying document.
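
The two levels can be sketched as plain data structures. The example below is illustrative only: the study, the variable names, and the code lists are all hypothetical, and the `decode` helper is an assumption made for this sketch rather than part of any standard.

```python
import json

# Study-level documentation: context, design, and methods (hypothetical study).
study_level = {
    "title": "Hypothetical commuting survey",
    "research_context": "How residents of one city travel to work",
    "data_collection_method": "Online questionnaire",
    "data_preparation": "Incomplete responses removed",
}

# Data-level documentation: one entry per variable, with labels and code lists.
data_level = {
    "age": {"label": "Age of respondent", "unit": "years"},
    "mode": {
        "label": "Main mode of travel",
        "codes": {"1": "Car", "2": "Bus", "3": "Bicycle", "9": "No answer"},
    },
}


def decode(variable, raw_value):
    """Translate a coded value into its documented meaning, if a code list exists."""
    codes = data_level[variable].get("codes")
    if codes is None:
        return raw_value  # uncoded variable: the value is stored as-is
    return codes.get(str(raw_value), f"undocumented code {raw_value}")


# Recording both levels in an accompanying document (here, one JSON string).
documentation = json.dumps({"study": study_level, "variables": data_level}, indent=2)

print(decode("mode", 2))
print(decode("mode", 7))
```

Without the data-level code list, a value such as `2` in the `mode` column cannot be interpreted by anyone other than the original researcher.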


What difference does metadata make?

• Scholarly Communication
  o Fight the “Digital Data Deluge”

• Increase the “long tail” of research data

• Potential for increase in data citations

• Meet funding agency requirements


Training Resources and Guides:

• Digital Curation Centre disciplinary metadata standards:
  o http://www.dcc.ac.uk/resources/metadata-standards

• Managing Research Data 101 – Documentation and Metadata (MIT):
  o http://libraries.mit.edu/data-management/store/documentation/

• Data Management Course Module for Graduate Students (University of Minnesota Libraries):
  o https://www.lib.umn.edu/datamanagement/workshops (particularly Module 3)

• MANTRA (Research Data Management Training): Documentation, Metadata, Citation (University of Edinburgh):
  o http://datalib.edina.ac.uk/mantra/documentation_metadata_citation/

• “Practical guidance for anyone working with research data”: Chapters 4 and 5 (UK Data Service):
  o http://ukdataservice.ac.uk/manage-data/handbook.aspx

• School of Data:
  o http://schoolofdata.org/courses/

• Metadata Guide Working Level (Australian National Data Service):
  o http://ands.org.au/guides/metadata-working.html

• Create & Manage Data: Documenting Your Data (UK Data Service):
  o http://www.data-archive.ac.uk/create-manage/document

• Guide to writing “ReadMe” style metadata (Cornell University):
  o http://data.research.cornell.edu/content/readme

• “Understanding Metadata” (National Information Standards Organization):
  o http://www.niso.org/publications/press/UnderstandingMetadata.pdf

• Best Practices in Creating Metadata (ICPSR):
  o http://www.icpsr.umich.edu/icpsrweb/content/deposit/guide/chapter3docs.html


Tools for Metadata Creation:

For lists of discipline-specific metadata tools, visit: http://www.dcc.ac.uk/resources/metadata-standards/tools

• Automated extraction of technical metadata from files:
  o File Information Tool Set (FITS): http://projects.iq.harvard.edu/fits
    • The File Information Tool Set (FITS) identifies, validates and extracts technical metadata for a wide range of file formats.
  o JHOVE: http://sourceforge.net/projects/jhove/
    • JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.
  o JHOVE2: https://bitbucket.org/jhove2/main/wiki/Home
    • JHOVE2 is open source software for format-aware characterization of digital objects.
  o Exiftool: http://www.sno.phy.queensu.ca/~phil/exiftool/
    • ExifTool is a platform-independent Perl library plus a command-line application for reading, writing and editing meta information in a wide variety of files. ExifTool is also available as a stand-alone Windows executable and a Macintosh OS X package.
  o Apache Tika: http://tika.apache.org/
    • The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more.

• Adding metadata to Microsoft documents:
  o Microsoft Document Properties: http://www.lib.cam.ac.uk/dataman/resources/Cambridge_documentproperties_factsheet.pdf
    • The Document Properties feature in Microsoft Office applications such as Word, PowerPoint, Access or Excel allows you to attach information about your document to the file.
  o Colectica for Excel: http://www.colectica.com/software/colecticaforexcel
    • Colectica for Microsoft Excel is a free tool to document your spreadsheet data using the open standard for data documentation.
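
To give a sense of what "technical metadata" means in practice, the sketch below collects a few basic properties of a file using only Python's standard library. It is not FITS, JHOVE, ExifTool, or Tika, and it performs no format validation; the `technical_metadata` function and its field names are hypothetical, chosen for illustration.

```python
import hashlib
import mimetypes
import os
import tempfile
from datetime import datetime, timezone


def technical_metadata(path):
    """Collect basic technical metadata for one file (illustrative sketch only)."""
    stat = os.stat(path)
    # Guess the MIME type from the file extension; this does not inspect content.
    mime_type, _ = mimetypes.guess_type(path)
    # Compute a SHA-256 checksum in chunks so large files are not read at once.
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            sha256.update(chunk)
    return {
        "filename": os.path.basename(path),
        "size_bytes": stat.st_size,
        "mime_type_guess": mime_type,
        "modified_utc": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
        "sha256": sha256.hexdigest(),
    }


# Demonstration on a small temporary file with made-up content.
with tempfile.TemporaryDirectory() as workdir:
    sample_path = os.path.join(workdir, "sample.txt")
    with open(sample_path, "wb") as handle:
        handle.write(b"id,value\n1,42\n")
    for key, value in technical_metadata(sample_path).items():
        print(f"{key}: {value}")
```

The checksum is the piece most relevant to preservation: recomputing it later and comparing the result is a simple way to detect that a file has changed. The dedicated tools listed above go much further, identifying formats from file content and validating files against format specifications.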