DESCRIPTION
This presentation gives a brief introduction to some of the more common statistical analyses we run into while working with medical residents. The point is to make the audience familiar with these statistics rather than to calculate them, so it is well suited to journal clubs or other EBM-related sessions. By the end of this presentation, students should be able to: • Define parametric and descriptive statistics • Compare and contrast three primary classes of parametric statistics (relationships, group differences, and repeated measures) with regard to when and why to use each • Link parametric statistics with their non-parametric equivalents • Identify the benefits and risks associated with using multivariate statistics • Match research scenarios with the appropriate parametric statistics. The presentation is accompanied by the following handout: http://slidesha.re/1178weg
COMMONLY USED STATISTICS IN
MEDICAL RESEARCH PART I
Tiffany Smith
Patrick Barlow
Statistical and Research Design Consultants, OMERAD
WHY LEARN HOW TO READ STATISTICS?
Several studies have reported that the error rate in reporting and/or interpreting statistics in the medical literature is between 30% and 90% (Novak et al., 2006).
Understanding basic statistical concepts will allow you to become a more critical consumer of the medical literature, and ultimately be able to produce better research and make better clinical decisions.
THREE MAJOR CATEGORIES OF STATISTICAL TESTS:
Descriptive Statistics
Parametric Statistics
Non-Parametric Statistics
SOME KEY TERMS
Null Hypothesis
Alternative Hypothesis
Mean
Standard Deviation
Correlation
Confidence Interval
INITIAL QUESTIONS
Fit the statistics to the research question, not the other way around!
First, ask yourself, “Am I interested in…
  Describing a sample or outcome?”
  Looking at how groups differ?”
  Looking at how outcomes are related?”
  Looking at changes over time?”
Second, “How am I measuring my outcomes?”
Finally, “How is my outcome distributed in the sample?”
IN THIS PRESENTATION
Descriptive Statistics
Parametric Statistics
  Common tests of relationships
    Pearson r
    Linear/multiple regression
  Common tests of group differences
    Independent t-test
    Between subjects analysis of variance (ANOVA)
  Common tests of repeated measures
    Dependent t-test
    Within subjects ANOVA
Activity
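The outline above pairs each kind of research question with two tests. As a rough sketch, that pairing can be written as a small lookup. The function name `suggest_test` and the goal labels are hypothetical, and the sketch covers only the parametric tests named in this presentation, assuming their assumptions are met.

```python
def suggest_test(goal, count):
    """Suggest one of the parametric tests listed in this presentation.

    goal  -- "relationship", "group_difference", or "repeated_measures"
             (hypothetical labels chosen for this sketch)
    count -- number of predictors, groups, or time points, respectively
    """
    if goal == "relationship":
        if count < 1:
            raise ValueError("need at least one predictor")
        # One predictor and one outcome: a bivariate correlation.
        return "Pearson r" if count == 1 else "Linear/multiple regression"
    if goal == "group_difference":
        if count < 2:
            raise ValueError("need at least two groups")
        return "Independent t-test" if count == 2 else "Between subjects ANOVA"
    if goal == "repeated_measures":
        if count < 2:
            raise ValueError("need at least two time points")
        return "Dependent t-test" if count == 2 else "Within subjects ANOVA"
    raise ValueError(f"unknown goal: {goal!r}")


print(suggest_test("group_difference", 3))
print(suggest_test("repeated_measures", 2))
```

This is a simplification: how the outcome is measured and distributed (the second and third questions above) still has to be checked before settling on a test.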
DESCRIPTIVE STATISTICS
Numbers used to describe the sample
They do not actually test any hypotheses (or yield any p-values)
Types:
  Measures of Center: Mean, Median, Mode
  Measures of Spread: Quartiles, Standard Deviation, Range, Variance
  Frequencies
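For readers who want to see these descriptive statistics computed, here is a minimal sketch using Python's standard `statistics` module. The sample (number of medications for nine patients) is made up for illustration.

```python
import statistics
from collections import Counter

# Hypothetical sample: number of medications taken by nine patients.
medications = [2, 3, 3, 4, 5, 5, 5, 7, 11]

# Measures of center
center = {
    "mean": statistics.mean(medications),
    "median": statistics.median(medications),
    "mode": statistics.mode(medications),
}

# Measures of spread (variance and SD use the sample formulas, n - 1)
spread = {
    "range": max(medications) - min(medications),
    "variance": statistics.variance(medications),
    "standard deviation": statistics.stdev(medications),
    "quartiles": statistics.quantiles(medications, n=4),
}

# Frequencies: how many patients take each number of medications
frequencies = Counter(medications)

print(center)
print(spread)
print(sorted(frequencies.items()))
```

None of these numbers tests a hypothesis; they only summarize the sample.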
PARAMETRIC STATISTICS
Most powerful type of statistics we use
Researchers must make sure their data meet a number of assumptions (or parameters) before these tests can be used properly.
Some key assumptions:
  Normality
  Independence of observations
In research, you generally want to use parametric statistics whenever these assumptions are met.
COMMON PARAMETRIC TESTS OF RELATIONSHIPS
Pearson r correlation
Linear/Multiple Regression
TESTS OF RELATIONSHIPS: PEARSON r CORRELATION
What is it?
  A statistical analysis that tests the relationship between two continuous variables.
Commonly Associated Terms:
  Bivariate correlation, relationship, r-value, scatterplot, association, direction, magnitude.
WHAT TO INTERPRET?
Strong Relationship: |r| ≥ .50
Moderate Relationship: |r| ≈ .30
Weak Relationship: |r| ≈ .10
No Relationship: |r| ≈ .00
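As an illustration of the benchmarks above, the sketch below computes Pearson r from raw data and attaches a strength label. The benchmarks (.50, .30, .10, .00) come from the slide; the cutoffs between them (.20 and .05) are an assumption made for this sketch, as are the function names and the data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def describe_strength(r):
    """Label the magnitude of r; the sign only gives the direction."""
    size = abs(r)
    if size >= 0.50:
        return "strong"
    if size >= 0.20:   # assumed cutoff between the .10 and .30 benchmarks
        return "moderate"
    if size >= 0.05:   # assumed cutoff between the .00 and .10 benchmarks
        return "weak"
    return "none"


# Hypothetical data for five patients.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.2f} ({describe_strength(r)})")
```

The label depends only on the size of r, so r = -.60 and r = .60 are equally strong relationships in opposite directions.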
IMPORTANCE OF GRAPHS
Anscombe, F.J. (1973). Graphs in Statistical Analysis. The American Statistician, 27, 17-21.
Each of Anscombe's four datasets has a Pearson correlation of r = .82 and is statistically significant, yet their scatterplots look very different.
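The point about Anscombe's datasets can be checked numerically. The sketch below recomputes Pearson r for each of the four datasets; the values are transcribed from Anscombe's published quartet and are worth checking against the paper before reuse. The helper `pearson_r` is written out here so the block stands alone.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
anscombe = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

correlations = {name: pearson_r(x, y) for name, (x, y) in anscombe.items()}
for name, r in correlations.items():
    print(f"Dataset {name}: r = {r:.2f}")
```

The four correlations agree to two decimals, which is why the r-value alone cannot tell you whether a straight line is a sensible summary; only a scatterplot shows the curve in dataset II or the single influential point in dataset IV.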
EXAMPLE
What you read:
  Study found a relationship between age and number of medications an individual is taking, r = .35, p = .03.
What to interpret:
  Results show r = .35, p = .03, R2 = .12
How to interpret:
  There is a moderate, significant positive relationship between age and number of medications an individual is taking. As age increases, number of medications also increases.
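The R2 value in this example is not reported separately in the study text; it is simply the square of r. A short check of that arithmetic:

```python
r = 0.35               # correlation reported in the example
r_squared = r ** 2     # proportion of variance the two variables share

print(f"R2 = {r_squared:.2f}")
```

So age accounts for roughly 12% of the variability in number of medications, which helps put a statistically significant r into perspective.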
LINEAR/MULTIPLE REGRESSION
What is it?
  A statistical analysis that tests the relationship between multiple predictor variables and one continuous outcome variable.
  Predictors: Any number of continuous or dichotomous variables, e.g. age, anxiety, SES
  Outcome: 1 continuous variable, e.g. ER visits per month
Commonly Associated Terms:
  Multivariate, beta weight, R2-value, model, forward/backward regression, sequential/hierarchical regression, standard/simultaneous regression, statistical/stepwise regression.
INTERPRETATION
What to interpret?
  p-values (<.05)
  R2 value: magnitude of the relationship
  B/beta weights: B/beta < 0 = negative relationship (protective effect), B/beta > 0 = positive relationship.
How to interpret?
  B(β) is positive (e.g. 1.25): as the predictor increases by 1 unit (e.g. from 1 lb to 2 lb), the outcome variable also increases by B (LDL cholesterol increases by 1.25 mg/dl).
  B(β) is negative (e.g. -1.25): as the predictor increases by 1 unit (e.g. from 1 lb to 2 lb), the outcome variable decreases by the size of B (LDL decreases by 1.25 mg/dl).
  The standardized β is read the same way, but in standard deviation units rather than the original units.
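To make the "one unit increase" reading of B concrete, here is a minimal sketch of ordinary least squares with a single predictor. The function name `fit_line` is hypothetical, and the weight and LDL values are invented so that the slope comes out at 1.25, matching the example above.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + B * x; returns (intercept, B)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: body weight (lb) and LDL cholesterol (mg/dl).
weight = [150, 160, 170, 180, 190]
ldl = [100.0, 112.5, 125.0, 137.5, 150.0]

intercept, b = fit_line(weight, ldl)

# Predicted LDL for two patients who differ by exactly one pound.
at_170 = intercept + b * 170
at_171 = intercept + b * 171
print(f"B = {b:.2f}; one extra pound changes predicted LDL by {at_171 - at_170:.2f} mg/dl")
```

With several predictors the arithmetic is more involved, but each B is read the same way, with the other predictors held constant.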
EXAMPLE
What you read:

Table 3: Predictors of Number of Surgical Site Infections

Predictor         Regression Coefficient¹       p-value²
                  B (SE)         β
Length of Stay    .25 (.06)      .30            <.001
Age               -.75 (.05)     -.45           <.001

¹ B = unstandardized coefficient, SE = standard error, and β = standardized coefficient
² Overall: F(2, 317) = 17.19, p < .001, R = .31, R2 = .10

What to interpret:
  “B’s” for each predictor: LoS = .25 and Age = -.75
  p-value of each predictor: both <.001
  p-value for the model: <.001
  R2 value for the model: .10
EXAMPLE CONTINUED
How to interpret:
  Overall: Both length of stay and age significantly predict a patient’s number of surgical site infections, and account for 10% of the variance.
  For Length of Stay: For every additional day a patient spends in the hospital, their number of surgical site infections increases by .25.
  For Age: For every additional year of age, a patient’s number of surgical site infections decreases by .75.
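The interpretation above is just multiplication by the unstandardized coefficients in Table 3. A small sketch, using only the B values and R from the table (the function name `predicted_change` and the dictionary keys are hypothetical; the table gives no intercept, so only changes in the prediction can be computed, not the predictions themselves):

```python
# Unstandardized coefficients (B) from Table 3.
coefficients = {"length_of_stay_days": 0.25, "age_years": -0.75}


def predicted_change(predictor, increase):
    """Change in predicted infections when one predictor rises by
    `increase` units and the other predictor is held constant."""
    return coefficients[predictor] * increase


extra_days = predicted_change("length_of_stay_days", 4)
extra_years = predicted_change("age_years", 2)
print(f"4 more days in hospital: {extra_days:+.2f} predicted infections")
print(f"2 more years of age:     {extra_years:+.2f} predicted infections")

# R2 is the square of the multiple correlation R reported under the table.
model_r = 0.31
print(f"R2 = {model_r ** 2:.2f}")
```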
REMEMBER: CORRELATION DOES NOT IMPLY CAUSATION
COMMON PARAMETRIC TESTS OF GROUP DIFFERENCES
Independent t-test
Between Subjects Analysis of Variance (ANOVA)
TESTS OF GROUP DIFFERENCES: INDEPENDENT t-TEST
What is it?
  Tests the difference between two groups on a single, continuous dependent variable.
Commonly associated terms:
  Two-sample t-test, Student’s t-test, means, group means, standard deviations, mean differences, group difference, confidence interval, group comparison.
INTERPRETATION
What to interpret?
  p-values (<.05)
  Mean differences and standard deviations
  Confidence intervals
How to interpret?
  There is a significant difference between the two groups where one group has a significantly higher/lower score on the dependent variable than the other.
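For those curious where the numbers behind an independent t-test come from, the sketch below computes the mean difference, the t statistic, and the degrees of freedom using the classic pooled-variance (Student's) formula. The data and the function name `independent_t` are hypothetical. Turning t into a p-value needs the t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) would normally be used for that step.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (mean difference, t statistic, degrees of freedom)
    for two independent groups, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Pooled variance: each group's variance weighted by its df.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff, mean_diff / standard_error, n_a + n_b - 2


# Hypothetical readmission counts for two clinics, six patients each.
clinic_a = [0, 1, 0, 1, 1, 0]
clinic_b = [1, 2, 1, 2, 2, 1]

mean_diff, t, df = independent_t(clinic_a, clinic_b)
print(f"mean difference = {mean_diff:.2f}, t({df}) = {t:.2f}")
```

A negative t here only reflects the order of subtraction (clinic A minus clinic B); the reader still needs the p-value and the group means to interpret the result.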
EXAMPLE What you read:
Patients admitted to “academic” hospital clinics (M=.50, SD=.40) had lower average 90-day readmissions than patients seen by non-academic clinics (M=1.5, SD=.75), p = .02.
What to interpret: _____________________________ _____________________________ _____________________________
How to interpret: _______________________________________________
_________________________________________
BETWEEN SUBJECTS ANALYSIS OF VARIANCE (ANOVA)
What is it?
  Tests the difference among more than two groups on a single, continuous variable.
  Post-hoc tests are required to examine where the differences are.
Commonly associated terms:
  F-test, interactions, post-hoc tests (Tukey HSD, Bonferroni, Scheffé, Dunnett).
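Of the post-hoc procedures listed, the Bonferroni correction is the simplest to show: each pairwise p-value is multiplied by the number of comparisons (capped at 1) before being compared with .05. A minimal sketch with made-up p-values; the function name `bonferroni` is hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p-values for three pairwise comparisons.
comparisons = ["A vs B", "A vs C", "B vs C"]
raw = [0.01, 0.04, 0.30]
adjusted = bonferroni(raw)

for label, p_raw, p_adj in zip(comparisons, raw, adjusted):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"{label}: raw p = {p_raw:.2f}, adjusted p = {p_adj:.2f} ({verdict})")
```

In this made-up case the A vs C comparison looks significant before adjustment and is not afterwards, which is why it matters whether a paper reports adjusted or unadjusted post-hoc p-values.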
INTERPRETATION
What to interpret?
  p-values (<.05)
    Main effect: shows overall significance
    Post-hoc tests: show specific group differences
  Mean differences, standard deviations
How to interpret?
  Main Effect: There was an overall significant difference among the groups of the independent variable on the dependent variable.
  Post-Hoc: Same interpretation as an independent t-test
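The main-effect F value reported for a between subjects ANOVA is the ratio of between-group variability to within-group variability. The sketch below computes it by hand for invented data; the function name `one_way_f` is hypothetical, and in practice a package routine (for example `scipy.stats.f_oneway`) would also supply the p-value.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way between subjects ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Within-group sum of squares: scores around their own group mean.
    ss_within = sum(
        (score - mean) ** 2
        for group, mean in zip(groups, group_means)
        for score in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical side-effect counts for patients on three drugs.
drug_a = [3, 4, 5]
drug_b = [6, 7, 8]
drug_c = [4, 5, 6]

f, df_between, df_within = one_way_f([drug_a, drug_b, drug_c])
print(f"F({df_between}, {df_within}) = {f:.2f}")
```

A large F says only that the group means differ somewhere; the post-hoc comparisons identify which pairs differ.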
EXAMPLE
What you read:
  A researcher looks at differences in the number of side effects patients had on three different drugs (A, B, and C).
  Main effect: Overall F = 20.10, p = .01
  Post-hoc: Comparison of Drug “A” to Drug “B” shows average side effects to be 4 (SD = 2.5) and 7 (SD = 4.8), respectively, p = .04.
What to interpret: _____________________________ _____________________________
How to interpret: ________________________________________________________
________________________________________ ________________________________________________________
________________________________________
COMMON PARAMETRIC TESTS OF REPEATED MEASURES
Dependent t-test
Within Subjects Analysis of Variance (ANOVA)
DEPENDENT t-TEST
What is it?
  Tests the differences for one group between two time-points or matched pairs
Commonly Associated Terms:
  Pre and posttest, matched pairs, paired samples, time.
What to interpret?
  p-values (<.05)
  Mean change between measurements (i.e. over time or between pairs)
How to interpret?
  There is a significant difference between the pretest and posttest where the score on the posttest was significantly higher/lower on the dependent variable than the pretest.
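A dependent t-test works on the change score for each person (or matched pair) rather than on the two sets of scores separately. A minimal sketch with invented pretest and posttest values; the function name `dependent_t` is hypothetical, and the p-value step again needs a t distribution from a statistics package.

```python
import math
import statistics


def dependent_t(pre, post):
    """Return (mean change, t statistic, degrees of freedom) for paired scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post must be paired and have at least 2 cases")
    # One change score per person: posttest minus pretest.
    changes = [after - before for before, after in zip(pre, post)]
    mean_change = statistics.mean(changes)
    standard_error = statistics.stdev(changes) / math.sqrt(len(changes))
    return mean_change, mean_change / standard_error, len(changes) - 1


# Hypothetical readmission counts for four patients before and after a program.
pretest = [3, 2, 4, 5]
posttest = [2, 1, 2, 3]

mean_change, t, df = dependent_t(pretest, posttest)
print(f"mean change = {mean_change:.2f}, t({df}) = {t:.2f}")
```

Because each person serves as their own comparison, the degrees of freedom are the number of pairs minus one, not the total number of measurements.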
EXAMPLE What you read:
An article shows a difference in average number of COPD-related readmissions before (M=1.5, SD=2.0) and after (M=.05, SD=.90) a patient education intervention, p=.08.
What to interpret: _____________________________ _____________________________
How to interpret: _____________________________________________
_______________________________________________________________________________________
REPEATED MEASURES ANOVA
What is it?
  A statistical analysis that tests differences of one group between two or more time-points or matched pairs (e.g. pretest, posttest, & follow-up or treatment “A” patient, treatment “B” matched patient, & placebo matched patient).
Commonly Associated Terms:
  Multiple time-points/matched pairs, repeated measures, post-hoc.
What to interpret?
  Main effect: p-values
  Post-hoc: p-values, mean change, direction of change.
How to interpret:
  Main Effect: There was an overall significant difference among the time points/matched pairs on the dependent variable.
  Post-Hoc: Same as a dependent t-test.
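The main-effect F for a repeated measures ANOVA differs from the between subjects version in one step: variability due to stable differences between people is removed from the error term. The sketch below shows that partition for a small invented dataset (one row per patient, one column per time point). The function name `repeated_measures_f` is hypothetical, and the sketch ignores the sphericity assumption that software normally checks.

```python
def repeated_measures_f(scores):
    """Return (F, df_time, df_error) for a one-way repeated measures ANOVA.

    scores -- list of rows, one per subject, each with one score per time point
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of time points
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    time_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_time = n * sum((m - grand_mean) ** 2 for m in time_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    # Error is what remains after removing time and subject differences.
    ss_error = ss_total - ss_time - ss_subjects

    df_time = k - 1
    df_error = (n - 1) * (k - 1)
    f = (ss_time / df_time) / (ss_error / df_error)
    return f, df_time, df_error


# Hypothetical scores for three patients at pretest, posttest, and follow-up.
scores = [
    [1, 2, 3],
    [2, 4, 3],
    [3, 3, 6],
]

f, df_time, df_error = repeated_measures_f(scores)
print(f"F({df_time}, {df_error}) = {f:.2f}")
```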
EXAMPLE
What you read:
  An article shows a difference in average number of COPD-related readmissions before (M=1.5, SD=2.0), after (M=.05, SD=.90), and six months following (M=0.80, SD=3.0) a patient education intervention. Main effect: Overall F=3.59, p=.12.
What to interpret:
  p-value = .12, not statistically significant
  Mean change = 1.45 fewer readmissions at post-intervention
How to interpret:
  The number of COPD-related readmissions did not significantly change among any of the three time points.
FINAL THOUGHTS
Other Types of ANOVAs
Conclusion
OTHER ANOVAS
Mixed ANOVA: Used when comparing more than one group over more than one time-point on a measure
  Example – Males vs. females, before and after smoking cessation intervention – Average cigarettes per day
Factorial ANOVA: Comparing two or more separate independent variables on one dependent variable.
  Example – Where the patient was seen (UTH, HSM, or UFP), AND whether or not the diabetes regimen was intensified – Average readmissions
Analysis of covariance (ANCOVA): Examining the differences among groups while controlling for an additional variable
  Example – Whether or not the diabetes regimen was intensified, controlling for baseline A1C – Average readmissions
All of these methods can be used to test interaction effects
WHY BE SO COMPLICATED?
Using complicated statistics gives the researcher several advantages:
  Reduced statistical error
  Ability to look at complex relationships
  Can control for confounders
  Allows for a more complete and in-depth interpretation of the phenomenon.
No phenomenon you study exists in a vacuum!
Questions?
HANDOUT SPECIFICS
Test Name
Commonly Associated Terms
  Those that are bolded are terms specific to the test in question
What to interpret
  What to look for to understand the relevance/importance
  p-values, confidence, mean differences, effect size, etc.
How to interpret
  Provides test-specific ways to interpret results
Non-Parametric Equivalent (where applicable)
Remember:
Just because a finding is not significant does not mean that it is not meaningful. You should always consider the effect size and context of the research when making a decision about whether or not any finding is clinically
relevant.
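One common way to put a number on effect size for a two-group comparison is Cohen's d, the mean difference expressed in pooled standard deviation units. Cohen's d is not named in this presentation; it is used here only as a familiar example. The sketch computes it from summary statistics of the kind an article reports, using the readmission figures from the independent t-test example, and assumes the two groups are the same size because the group sizes are not given.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d from summary statistics, assuming equal group sizes."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Academic clinics: M = .50, SD = .40; non-academic clinics: M = 1.5, SD = .75.
d = cohens_d(0.50, 0.40, 1.5, 0.75)
print(f"Cohen's d = {d:.2f}")
```

The sign shows the direction of the difference and the absolute value its size; whether a difference of that size matters clinically is still a judgment about the research context.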
ACTIVITY
Work together (in pairs) to answer the questions on the handout using your “Commonly Used Statistics” resource.
Be prepared to share how you found your answers.