The Study Database
Robert R. Kelley, Ph.D.
Clinical and Translational Research Support Center
Outline
• Overview
• Designing the Database
• Building the Database

Overview
• The Study Database:
  – Tool(s) used to store data electronically for queries and data analysis

Components of the Study Database
• Case Report Form
• Study Database
• Analysis Software
• Query and Monitoring Software

Designing the Study Database

The Case Report Form
Good database design starts with a good Case Report Form
• Should be driven by the study protocol
• Should support the study statistical plan
• Should be divided into logical sections or multiple forms

Carefully consider variable names
Example Case Report Form fields: Case ID, Procedure Type, Physician, Comments, Temperature, Diastolic BP, Systolic BP
• Use only lower case: case id, procedure type, physician
• Use underscores to separate words: case_id, procedure_type
• Abbreviate when possible: dbp, temp
• Consider prefixing: exam_temp
• Use the shortest variable name you can!
• Be consistent!
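The naming rules above can be applied mechanically. The following is a minimal sketch, not part of the original slides: the function name to_variable_name and the abbreviation table are hypothetical, and "sbp" for Systolic BP is an assumption by analogy with "dbp".

```python
import re

# Hypothetical abbreviation table; a real study would define its own.
# "sbp" is an assumption made by analogy with the slide's "dbp".
ABBREVIATIONS = {
    "temperature": "temp",
    "diastolic bp": "dbp",
    "systolic bp": "sbp",
}


def to_variable_name(label: str, prefix: str = "") -> str:
    """Turn a CRF field label into a lower-case, underscore-separated name."""
    name = label.strip().lower()
    # Abbreviate when the whole label has a known short form.
    name = ABBREVIATIONS.get(name, name)
    # Collapse any run of non-alphanumeric characters into one underscore.
    name = re.sub(r"[^a-z0-9]+", "_", name).strip("_")
    return f"{prefix}_{name}" if prefix else name


labels = ["Case ID", "Procedure Type", "Physician", "Temperature", "Diastolic BP"]
names = [to_variable_name(label) for label in labels]
print(names)
print(to_variable_name("Temperature", prefix="exam"))
```

Generating names from one function, rather than typing them by hand, is one way to keep them consistent across forms.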

Coding levels
Example Case Report Form fields: Case ID, Sex, Congestive Heart Failure, Comments, Diabetes, Antibiotic, Diagnosis
• For binary data, always code with 1/0
  – 1 = Male, 0 = Female
• For Yes/No, always code with 1/0
  – 1 = Yes, 0 = No
• For multiple choice data, code 1...N
  – 1, Beta-lactam monotherapy only
  – 2, Beta-lactam + macrolide combination only
  – 3, Beta-lactam + quinolone combination only
  – 4, Quinolone monotherapy only
• Don’t forget to code for NA/UNKNOWN
  – 1 = Yes, 0 = No, -1 = Unknown (UK)
• Use alphanumeric codes when they make sense
  – blmono, Beta-lactam monotherapy only
  – blmac, Beta-lactam + macrolide combination only
  – blq, Beta-lactam + quinolone combination only
  – quin, Quinolone monotherapy only
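As an illustration of these coding rules, the sketch below stores the levels as lookup tables and converts Case Report Form answers into codes. The codes come from the slide; the function name encode, and the choice to raise an error on unrecognized answers instead of treating them as unknown, are assumptions.

```python
# Codes taken from the slide; -1 for unknown follows its "-1 UK" example.
UNKNOWN = -1
YES_NO = {"Yes": 1, "No": 0}
ANTIBIOTIC = {
    "Beta-lactam monotherapy only": 1,
    "Beta-lactam + macrolide combination only": 2,
    "Beta-lactam + quinolone combination only": 3,
    "Quinolone monotherapy only": 4,
}


def encode(answer, levels):
    """Return the code for a CRF answer.

    A missing or blank answer becomes UNKNOWN. An answer that is present
    but not in the coding levels raises ValueError, so that data entry
    errors are not silently recorded as unknown (a design assumption).
    """
    if answer is None or answer.strip() == "":
        return UNKNOWN
    key = answer.strip().lower()
    for label, code in levels.items():
        if label.lower() == key:
            return code
    raise ValueError(f"unrecognized answer: {answer!r}")


print(encode("Yes", YES_NO))
print(encode(None, YES_NO))
print(encode("Quinolone monotherapy only", ANTIBIOTIC))
```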

The Data Dictionary
Can be a simple spreadsheet with:
• Variable name
• Data type
• Coding levels
• Coding restrictions
• Description
• Units

An example...
Variable Name | Description              | Data Type | Coding Levels    | Restrictions          | Units
case_id       | Case Identifier          | Integer   |                  | >0                    | NA
age           | Age                      | Integer   | Continuous       | >18                   | Years
sex           | Sex                      | Integer   | 1,Male 0,Female  | NA                    | NA
doa           | Date of Arrival          | Date      | Short Date       | >Study Start          | NA
dod           | Date of Discharge        | Date      | Short Date       | >Study Start and >doa | NA
icu           | Admitted to ICU          | Integer   | 1,Yes 0,No       | NA                    | NA
rul           | Right Upper Lobe         | Integer   | 1,Yes 0,No       | NA                    | NA
rml           | Right Middle Lobe        | Integer   | 1,Yes 0,No       | NA                    | NA
rll           | Right Lower Lobe         | Integer   | 1,Yes 0,No       | NA                    | NA
lul           | Left Upper Lobe          | Integer   | 1,Yes 0,No       | NA                    | NA
lll           | Left Lower Lobe          | Integer   | 1,Yes 0,No       | NA                    | NA
pe            | Pleural Effusion         | Integer   | 1,Yes 0,No       | NA                    | NA
resp_rate     | Respiratory Rate         | Integer   | Continuous       | >60                   | Breaths/Minute
temp          | Temperature              | Integer   | Continuous       | >35 and <40.5         | Celsius
psi           | Pneumonia Severity Index | Integer   | Continuous       | 0-350                 | NA
Notes on the columns:
• Variable Name: defined according to rules
• Description: can be as detailed as necessary
• Data Type: this will often be database dependent
• Coding Levels: very important for consistency
• Restrictions: derived from protocol and study definitions
• Units: very important, especially for lab values
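A data dictionary like this one can also drive automated checks on entered data. The sketch below is an assumption about how that might look, not something shown in the slides: the DictionaryEntry class and validate function are hypothetical names, and only a few of the variables from the example are included.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DictionaryEntry:
    name: str
    description: str
    allowed: Optional[set] = None                     # coding levels, if categorical
    check: Optional[Callable[[float], bool]] = None   # restriction, if any
    units: str = "NA"


# A subset of the example dictionary, with restrictions written as functions.
DATA_DICTIONARY = [
    DictionaryEntry("case_id", "Case Identifier", check=lambda v: v > 0),
    DictionaryEntry("age", "Age", check=lambda v: v > 18, units="Years"),
    DictionaryEntry("icu", "Admitted to ICU", allowed={0, 1}),
    DictionaryEntry("temp", "Temperature", check=lambda v: 35 < v < 40.5, units="Celsius"),
    DictionaryEntry("psi", "Pneumonia Severity Index", check=lambda v: 0 <= v <= 350),
]


def validate(record: dict) -> list:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for entry in DATA_DICTIONARY:
        if entry.name not in record:
            problems.append(f"{entry.name}: missing")
            continue
        value = record[entry.name]
        if entry.allowed is not None and value not in entry.allowed:
            problems.append(f"{entry.name}: {value!r} is not a valid code")
        if entry.check is not None and not entry.check(value):
            problems.append(f"{entry.name}: {value!r} violates restriction")
    return problems


good = {"case_id": 1, "age": 54, "icu": 0, "temp": 37.2, "psi": 90}
bad = {"case_id": 2, "age": 17, "icu": 3, "temp": 37.0}
print(validate(good))
print(validate(bad))
```

Keeping the restrictions next to the variable definitions means the checks and the dictionary cannot drift apart.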

Building the Study Database

Types of Studies from a Data Perspective
• Large Studies: > 100 cases, > 20 variables
• Small Studies: < 100 cases, < 20 variables

Tools for Smaller Studies
• Spreadsheet Applications
  – MS Excel
  – OpenOffice Calc
  – Apple Numbers
• Personal Database Applications
  – MS Access
  – OpenOffice Base
  – FileMaker Pro

Excel
Strengths
• Many people in science and medicine are already familiar with Excel
• Supports descriptive statistics and calculations
• Provides rudimentary data visualization
• Easy to back up and have multiple copies for safekeeping
Limitations
• No capability for metadata
• Need to encrypt manually
• Does not provide authentication, authorization, or an audit trail
• Inadvertent data corruption
• Depending on version, may be limited to 256 columns (variables)

Access
Strengths
• Provides a front-end for easy form creation
• Supports descriptive statistics and calculations
• Provides rudimentary data visualization
• Easy to back up and have multiple copies for safekeeping
Limitations
• Must understand the basics of database design to be able to build effective databases
• Need to encrypt manually
• Does not provide authentication, authorization, or an audit trail
• Programming help may be required for complicated features

Tools for Larger Studies
Database / Applications
• OpenClinica
• LabKey
• REDCap
• Medidata Rave
• Custom Software
Unlike Excel and Access, these tools require infrastructure and specialized skills to maintain.

Tools for Larger Studies
Infrastructure
• Licenses for software (much more expensive than Excel/Access)
• Technical skills to install and maintain the system
• A place to “host” your database
Recommendation: Hire a dedicated person to administer this system or contract it out.

OpenClinica
• Designed for capturing data for clinical trials
• Has paid and free versions
• Can be hosted locally or by Akaza Research LLC
• Supports exporting data to:
  – HTML
  – Tab-Delimited
  – SPSS Syntax and Data
  – CDISC ODM XML 1.3 and 1.2
  – CDISC OpenClinica Extension 1.3 and 1.2
Specifications
• Windows Server 2008 or Linux Variant
• Java
• Java Application Server (Apache)
• Database: PostgreSQL 8.4 or Oracle 10.2g
https://www.openclinica.com
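Once data has been exported in a tab-delimited format, it can be loaded for analysis with a few lines of code. The sketch below uses Python's standard csv module on a small made-up sample. The column names and values are hypothetical, every column is assumed to hold an integer code, and the text does not reproduce the actual layout of an OpenClinica export file.

```python
import csv
import io

# Hypothetical export contents; real column names depend on the study,
# and a real export would be read from a file instead of a string.
EXPORT_TEXT = "case_id\tage\ticu\n1\t54\t0\n2\t71\t1\n"


def read_tab_delimited(text):
    """Parse tab-delimited text into a list of dicts with integer values."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    # Assumes every column holds an integer, as in the sample above.
    return [{name: int(value) for name, value in row.items()} for row in reader]


rows = read_tab_delimited(EXPORT_TEXT)
print(rows)
```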

LabKey
• Designed for storing lab data such as:
  – Specimen data
  – Gene Sequences
  – Flow Cytometry Data
  – Proteomics Data
  – Any Assay Data
• It also supports storing other clinical data
• Has paid and free versions
• Can be hosted locally or by Akaza Research LLC
• Supports exporting data to:
  – Excel
  – Text
  – JavaScript
  – R
  – SAS
Specifications
• Windows Server 2008 or Linux Variant
• Java
• Java Application Server (Apache)
• Database: PostgreSQL 8.4 or Microsoft SQL Server
https://www.labkey.org/

Medidata Rave
• A comprehensive commercial clinical data management system
• Very flexible
• Expensive and requires significant technical support
• Supports exporting data to:
  – SAS
  – CSV
  – XML
Specifications
• Windows Server 2008 or Linux Variant
• Java
• Java Application Server (Apache)
• Database: PostgreSQL 8.4 or Microsoft SQL Server
http://www.mdsol.com/products/rave_overview.htm

REDCap
• A comprehensive open source clinical data management system
• Supports creating:
  – Surveys
  – Data Entry Forms (eCRFs)
  – Longitudinal Studies
• Free, but requires significant technical support
• Supports exporting data to:
  – Excel
  – SPSS
  – SAS
  – R
  – Stata
Specifications
• Windows Server 2008 or Linux Variant
• PHP
• Database: MySQL Server
• SMTP email server
http://project-redcap.org/

Custom Software
• If your requirements are extremely particular, you may want to build your own system
• Can be time consuming and resource intensive
• NOT RECOMMENDED

Questions?