Ensuring data quality
Mapping outcomes for quality assurance & control
Data Topics Workshop Series: Fall 2014
Meet & Greet
• First Name
• Program or Department
• Current role in a research project
Heather Coates
Digital Scholarship & Data Management Librarian
Liaison to the Fairbanks School of Public Health
[email protected]
Timeline
• The Big Picture
• Practical Strategies
• Activities
• Presentation: 10 minutes
• Discussion: Defining quality
• Discussion: Mapping outcomes
• Review | Q&A
Agenda
Scenario
Four years after your article is published, a researcher in your field contacts you with questions about the integrity of the data.
• Can you find the files supporting your published findings?
• Can you access and view the files?
• Can you justify your rationale for the procedures based on your documentation?
• Can someone pick up your research and build on it?
Goals
• Recognize the need for quality standards.
• Begin to define quality standards for your research.
• Identify quality assurance and quality control activities.
Data Integrity
• Data have integrity if they have been maintained without unauthorized alteration or destruction.
• Data integrity also means that the data have a complete or whole structure. (http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Data_integrity.html)
Data Quality
• Fitness for use (depends on context of your questions)
• Data quality is the most important aspect of data management
• Ensured by
  • Sufficient resources and expertise
  • Paying close attention to the design of data collection instruments
  • Creating appropriate entry, validation, and reporting processes
  • Ongoing QC processes
  • Understanding the data collected
Chapman, 2005
Source: Dept of Biostatistics – Data Management, IUSM
Data Quality Standards
• Check data for its logical consistency.
• Check data for reasonableness.
• Ensure adherence to sound estimation methodologies.
• Ensure adherence to monetary submission standards for stolen and recovered property.
• Ensure that other statistical edit functions are processed within established parameters.
FBI: http://www.fbi.gov/about-us/cjis/ucr/data_quality_guidelines
Source: Dept of Biostatistics – Data Management, IUSM
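The first two standards above (logical consistency and reasonableness) can be sketched as automated checks. This is only an illustration: the field names, the plausibility limits, and the `check_record` helper are hypothetical, and real limits depend on the study population.

```python
from datetime import date

# Hypothetical plausibility limits; adjust to the study population.
AGE_RANGE = (0, 120)
BMI_RANGE = (10.0, 80.0)


def check_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []

    # Reasonableness: each value should fall inside a plausible range.
    if not AGE_RANGE[0] <= record["age"] <= AGE_RANGE[1]:
        problems.append(f"age {record['age']} outside {AGE_RANGE}")
    if not BMI_RANGE[0] <= record["bmi"] <= BMI_RANGE[1]:
        problems.append(f"bmi {record['bmi']} outside {BMI_RANGE}")

    # Logical consistency: related fields should agree with each other.
    if record["visit_date"] < record["enrollment_date"]:
        problems.append("visit_date is earlier than enrollment_date")

    return problems


good = {"age": 42, "bmi": 23.5,
        "enrollment_date": date(2014, 9, 1), "visit_date": date(2014, 10, 1)}
bad = {"age": 142, "bmi": 23.5,
       "enrollment_date": date(2014, 9, 1), "visit_date": date(2014, 8, 1)}

for name, record in (("good", good), ("bad", bad)):
    print(name, check_record(record))
```

Returning a list of problems, rather than stopping at the first one, lets a reviewer see every issue in a record at once.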
Discussion: Defining Data Quality
Define data quality standards for the following variables:
• Age, BMI
• Life satisfaction scale
• Number of close friends
• Blood draw, bone fossil, water sample
• Satellite image, photograph,
Defining QA/QC
• Strategies for preventing errors from entering a dataset
• Activities to ensure quality of data before collection
• Activities that involve monitoring and maintaining the quality of data during the study
QA/QC Before Collection
• Define & enforce standards
  • Formats
  • Codes
  • Measurement units
  • Metadata
• Assign responsibility for data quality
  • Be sure assigned person is educated in QA/QC
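One way to define and enforce standards before collection is to write them down as a machine-readable codebook and check each row against it. The sketch below is an assumption-laden example: the variables, allowed codes, unit, limits, and the `validate` helper are all hypothetical, and it assumes numeric values have already been parsed as numbers.

```python
from datetime import datetime

# Hypothetical codebook: allowed codes, measurement unit with limits, date format.
CODEBOOK = {
    "sex": {"codes": {"F", "M", "U"}},
    "weight_kg": {"unit": "kg", "min": 1.0, "max": 400.0},
    "visit_date": {"date_format": "%Y-%m-%d"},
}


def validate(row):
    """Return a list of codebook violations for one row (empty if none)."""
    errors = []
    for name, rule in CODEBOOK.items():
        if name not in row:
            errors.append(f"{name}: missing")
            continue
        value = row[name]
        # Codes: the value must be one of the agreed codes.
        if "codes" in rule and value not in rule["codes"]:
            errors.append(f"{name}: {value!r} is not an allowed code")
        # Measurement units: the value must fit the limits stated in that unit.
        if "min" in rule and not rule["min"] <= value <= rule["max"]:
            errors.append(f"{name}: {value} {rule['unit']} is out of range")
        # Formats: dates must parse with the agreed format string.
        if "date_format" in rule:
            try:
                datetime.strptime(value, rule["date_format"])
            except ValueError:
                errors.append(f"{name}: {value!r} does not match the date format")
    return errors


print(validate({"sex": "F", "weight_kg": 70.0, "visit_date": "2014-10-01"}))
print(validate({"sex": "female", "weight_kg": 70.0, "visit_date": "10/01/2014"}))
```

Keeping the rules in one dictionary means the same codebook can double as documentation for the team.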
Quality Assurance v. Control
• QA: set of processes, procedures, and activities that are initiated prior to data collection to ensure the expected level of quality will be reached and data integrity will be maintained.
• QC: a system for verifying and maintaining a desired level of quality in a product or service.
http://c2.com/cgi/wiki?QualityAssuranceIsNotQualityControl
Quality Assurance in Practice
• CRF (data collection instrument) review & validation
• System/process testing & validation
• Training, education, communication of a team
• Standard Operating Procedures, Standard Operating Guidelines
• Site audits
Source: Dept of Biostatistics – Data Management, IUSM
Quality Control in Practice
• Set of processes, procedures, and activities associated with monitoring, detection, and action during and after data collection.
• Examples:
  • Errors in individual data fields
  • Systematic errors
  • Violation of protocol
  • Staff performance issues
  • Fraud or scientific misconduct
Source: Dept of Biostatistics – Data Management, IUSM
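As a rough illustration of the difference between errors in individual fields and systematic errors, the sketch below flags missing or out-of-range values and then summarizes the error rate per site. The data, the site labels, the Celsius limits, and both helper functions are hypothetical; an isolated error shows up as a low rate, while a site where every value fails may point to a systematic problem such as a unit mix-up.

```python
from collections import Counter


def out_of_range(rows, low, high):
    """Rows whose value is missing or falls outside [low, high]."""
    return [r for r in rows
            if r["value"] is None or not low <= r["value"] <= high]


def error_rate_by_site(rows, low, high):
    """Share of flagged rows per site, as a number between 0 and 1."""
    totals = Counter(r["site"] for r in rows)
    errors = Counter(r["site"] for r in out_of_range(rows, low, high))
    return {site: errors[site] / totals[site] for site in totals}


# Hypothetical body temperatures, expected in degrees Celsius.
rows = (
    [{"site": "A", "value": v} for v in (36.5, 36.8, 73.0, 37.0)]   # one typo
    + [{"site": "B", "value": v} for v in (36.6, 36.9, 37.1, 36.7)]
    + [{"site": "C", "value": v} for v in (97.7, 98.2, 98.6)]       # likely Fahrenheit
)

print(error_rate_by_site(rows, low=30.0, high=45.0))
```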
General themes: GCDMP
• Plan, test, revise, test, revise, test…implement
• All stakeholders should be involved in designing protocol, data collection tools, data management plan, etc.
• Document, document, document
• Rule: the bigger and more complex the study (sites, data, people), the more planning you need
Relevant practices from GCDMP
• Specify documents required for reproducible research at various levels
  • Institutional: SOP
  • Study: protocol, manual of procedures, data management plan, statistical analysis plan
• Documentation serves practical purposes, among them a shared understanding of the project, and benefits the team immediately
• Specify roles and responsibilities from the beginning
Begin with the end in mind
Produce report-ready outputs
Collect data in a way that enables efficient data entry, processing, validation, analysis, and reporting
Enabled by standardized data collection tools
Mapping research & data outcomes
• Review the instructions
• Review the example (on screen)
• Discussion
Resources
1. Department of Biostatistics – Data Management Team, Indiana University School of Medicine (2013). Data Management including REDCap. (provided via email)
2. Chapman, A. D. 2005. Principles of Data Quality, version 1.0. Report for the Global Biodiversity Information Facility, Copenhagen. ISBN 87-92020-03-8. http://www.gbif.org/resources/2829
3. DataONE Education Module: Data Quality Control and Assurance. DataONE. From http://www.dataone.org/sites/all/documents/L05_DataQualityControlAssurance.pptx
4. Good Clinical Data Management Practices (2013). Available at http://www.scdm.org/sitecore/content/be-bruga/scdm/Publications/gcdmp.aspx