Advanced Statistics Presentation outline: Regression Risks & Ratios Survival Analysis Survival Curves Sensitivity & Specificity


Page 1: Advanced Statistics Presentation outline

Advanced Statistics

Presentation outline:

Regression

Risks & Ratios

Survival Analysis

Survival Curves

Sensitivity & Specificity

Page 2: Advanced Statistics Presentation outline

Regression

Allows you to use changes in independent variables (IVs) to predict changes in a dependent variable (DV)

– Accomplished through a regression equation

• Uses y-intercept and slope to create a straight line through the scatter plot

– IVs must have a linear relationship (i.e., must be correlated) with the DV

– IVs are not manipulated, but may have multiple levels
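The regression equation described above can be sketched for the single-IV case. This is a minimal illustration, assuming ordinary least squares on made-up data; the function name `fit_line` and the numbers are hypothetical and not taken from the presentation.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x for one IV."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by variance of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: IV values and observed DV values
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
print(f"predicted DV = {intercept:.2f} + {slope:.2f} * IV")
```

With several IVs the same idea applies, but the coefficients are normally estimated with a statistics package rather than by hand.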

Page 3: Advanced Statistics Presentation outline

Example of regression analysis:

• Suicide rate regressed on antidepressant prescription rate, controlling for factors like age, sex, race, and income.

Page 4: Advanced Statistics Presentation outline
Page 5: Advanced Statistics Presentation outline

Interpreting a Multiple Regression

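R (multiple correlation coefficient; measure of the relationship between the IVs and the DV)

– Range 0 to +1 (a simple bivariate correlation r ranges from -1 to +1)

R² (coefficient of determination)

– Proportion of variability in the DV explained by the model (1 minus the ratio of residual to total variability)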

B and β

– Unstandardized and standardized regression coefficients

– Associated t and p

Can include interaction terms

ANOVA used to determine overall fit
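As a rough illustration of the coefficient of determination, the sketch below computes R² as 1 minus the ratio of residual to total sum of squares. The observed and predicted values are hypothetical, and `r_squared` is an illustrative helper, not part of any particular package.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed DV values and predictions from a fitted line
observed = [2.1, 3.9, 6.2, 7.8, 10.1]
predicted = [0.05 + 1.99 * x for x in [1, 2, 3, 4, 5]]

print(round(r_squared(observed, predicted), 3))
```

A model that predicts every observation exactly gives R² = 1, and a model that always predicts the mean of the DV gives R² = 0.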

Page 6: Advanced Statistics Presentation outline

Factors That Create Misleading Results

Susceptible to all the same misinterpretations as correlations

• Only captures linear relationships

• Restricted range, skewed distributions, outliers & extreme groups

Other considerations:

• Correlation between IVs should not be too large (multicollinearity)

Page 7: Advanced Statistics Presentation outline

Logistic Regression

Used for prediction of a dichotomous DV

Assesses impact of IVs and interactions on the DV, similar to linear regression

Page 8: Advanced Statistics Presentation outline

Example of Logistic Regression

• Physician’s referral for cardiac catheterization was regressed on patient’s sex and race after controlling for other confounders.

• African Americans had 40% lower odds of recommendation relative to the odds for Whites.

Page 9: Advanced Statistics Presentation outline

Interpreting Logistic Regression

Each IV will have a Wald chi-square statistic and associated p-value

Uses the Hosmer-Lemeshow goodness-of-fit test to evaluate the overall model; unlike most tests, you are looking for a non-significant (ns) result, which indicates adequate fit

Provides Odds Ratio and corresponding CI for each IV in the equation

Classification table for sensitivity and specificity calculation
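The link between a logistic regression coefficient and the reported odds ratio can be sketched as follows. The coefficient and standard error are hypothetical, chosen so that the odds ratio is 0.6 (that is, 40% lower odds); the Wald chi-square is taken as the squared ratio of coefficient to standard error, and the confidence interval assumes the usual normal approximation.

```python
import math


def odds_ratio_summary(b, se, z=1.96):
    """Convert a logistic coefficient b (log odds) into an odds ratio with CI."""
    odds_ratio = math.exp(b)
    ci_low = math.exp(b - z * se)
    ci_high = math.exp(b + z * se)
    wald_chi2 = (b / se) ** 2  # Wald chi-square with 1 degree of freedom
    return odds_ratio, ci_low, ci_high, wald_chi2


# Hypothetical values, not taken from the study on the slide
b = math.log(0.6)  # coefficient on the log-odds scale
se = 0.2           # assumed standard error

odds_ratio, ci_low, ci_high, wald_chi2 = odds_ratio_summary(b, se)
percent_change = (odds_ratio - 1.0) * 100.0  # negative means lower odds

print(f"OR = {odds_ratio:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
print(f"change in odds = {percent_change:.0f}%, Wald chi-square = {wald_chi2:.2f}")
```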

Page 10: Advanced Statistics Presentation outline

Measures of Risk

Absolute risk

– Probability of developing a disease within a specified time frame

Relative risk (RR)

– The probability that a member of an exposed group (or group with a specific behavior, gene, etc.) will develop a disease relative to the probability that a member of an unexposed group will develop that same disease
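A small worked example may make the two measures concrete. The counts below are hypothetical; absolute risk is computed as cases divided by group size over the follow-up period, and relative risk as the ratio of the two absolute risks.

```python
def absolute_risk(cases, group_size):
    """Probability of developing the disease within the follow-up period."""
    return cases / group_size


def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return absolute_risk(cases_exposed, n_exposed) / absolute_risk(
        cases_unexposed, n_unexposed
    )


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease
risk_exposed = absolute_risk(30, 100)
risk_unexposed = absolute_risk(10, 100)
rr = relative_risk(30, 100, 10, 100)

print(f"risk exposed = {risk_exposed:.2f}, risk unexposed = {risk_unexposed:.2f}")
print(f"relative risk = {rr:.1f}")  # about 3: exposed group has triple the risk
```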

Page 11: Advanced Statistics Presentation outline

Measures of Ratio

Odds ratio

– Odds are the likelihood of an event occurring versus the likelihood of it not occurring; the odds ratio compares the odds in one group with the odds in another

– Example: If the odds ratio (smokers vs. non-smokers) for developing lung cancer is 2.16, smokers have 116% higher odds of developing lung cancer than non-smokers.

Hazard ratio

– Measure of how often a particular event happens in one group compared to how often it happens in another group, over time.

– Example: A hazard ratio (Group 1 vs. Group 2) of 1.6 implies that, at any given time, Group 1 has a 60% higher rate (hazard) of developing the outcome.
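The arithmetic behind an odds ratio and its "percent higher odds" reading can be sketched as below. The 2x2 counts are hypothetical; the only figure taken from the slide is the odds ratio of 2.16, which corresponds to (2.16 - 1) x 100 = 116% higher odds.

```python
def odds_ratio(cases_exposed, noncases_exposed, cases_unexposed, noncases_unexposed):
    """Odds of the event in the exposed group divided by odds in the unexposed group."""
    odds_exposed = cases_exposed / noncases_exposed
    odds_unexposed = cases_unexposed / noncases_unexposed
    return odds_exposed / odds_unexposed


def percent_change_in_odds(or_value):
    """Express a ratio as a percent increase (positive) or decrease (negative)."""
    return (or_value - 1.0) * 100.0


# Hypothetical 2x2 table: 30 cases / 70 non-cases exposed, 10 / 90 unexposed
or_example = odds_ratio(30, 70, 10, 90)
print(f"odds ratio = {or_example:.2f}")

# Reading the slide's example: an odds ratio of 2.16 means 116% higher odds
print(f"{percent_change_in_odds(2.16):.0f}% higher odds")
```

The same percent conversion applies to a hazard ratio: a value of 1.6 corresponds to a 60% higher hazard.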

Page 12: Advanced Statistics Presentation outline

Survival Analysis

Uses

– Duration – time from randomization to relapse

– Time to development of a condition

– Survival – time from randomization until death

Censors data when

– the critical event has not yet occurred

– the patient is lost to follow-up

– other interventions are offered

– the event occurred but from an unrelated cause

Page 13: Advanced Statistics Presentation outline

Survival Analysis: the tests

Kaplan-Meier analysis provides mean and median time to event, accounting for censoring, with CI for one or more groups

Tests for equality of survival distributions between groups (log-rank test and associated p-value)

Can use Cox regression to control for covariates

Provides hazard ratios with CI

Page 14: Advanced Statistics Presentation outline

Survival Curves: The Kaplan-Meier Estimate

Nonparametric estimate of the survivor function

Accommodates incomplete follow-up (censoring)

– Censored data:

• A patient is mathematically removed at the end of their follow-up time (usually denoted by a vertical tick mark on the curve)

• When a patient is censored, the number at risk for the next interval is reduced

Estimate of absolute risk

Will always have a “staircase” appearance

Page 15: Advanced Statistics Presentation outline

Example of survival analysis:

• Kaplan-Meier survival curves were constructed to compare survival probability between the treatment and control groups.

• Cox regression analysis was performed to identify associations between patient characteristics and survival days.

Page 16: Advanced Statistics Presentation outline

Survival Curves: The Kaplan-Meier Estimate

Helps you clarify your thinking about

– Treatment

– Prognosis

Think of Kaplan-Meier curve as a “movie” rather than a “snapshot”

Avoid focusing on one point on the curve; it’s the entire curve that tells the story

Page 17: Advanced Statistics Presentation outline

Survival Curves: The Kaplan-Meier Estimate

Small N can be very misleading

– A sample of 50 to 100 patients can give you the lay of the land

– More than 100 patients will more adequately represent the true range of possible experience

Typical endpoint is survival but alternative endpoints can be:

– Disease Free Survival

– Progression Free Survival

– Response Duration

Page 18: Advanced Statistics Presentation outline

Survival Curves: The Kaplan-Meier Estimate

Log-rank Test

– Typical test to compare two groups of Kaplan-Meier Survival Curves

• Log-Rank statistic: variety of names, similar results

– Mantel-Haenszel Chi Square Statistic

– Cox-Mantel log-rank statistic

– Mantel log-rank statistic

– or simply the SES: approximate chi-square test for significance between observed vs. expected number of events

• Hazard Ratio with 95% CI

– Similar to Odds ratio

– Demonstrates numerically, differences in curves on a plot
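The observed-versus-expected idea behind the log-rank test can be sketched for two groups. This is a minimal illustration of the standard two-group log-rank chi-square (1 degree of freedom), assuming hypothetical follow-up times; `logrank_chi2` is an illustrative helper, and in practice a statistics package would be used.

```python
def logrank_chi2(times1, events1, times2, events2):
    """Two-group log-rank chi-square. events are 1 (event) or 0 (censored)."""
    all_times = times1 + times2
    all_events = events1 + events2
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed1 = expected1 = variance = 0.0
    for t in event_times:
        # Patients still under follow-up just before time t
        n1 = sum(1 for x in times1 if x >= t)
        n2 = sum(1 for x in times2 if x >= t)
        # Events occurring exactly at time t
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e)
        n, d = n1 + n2, d1 + d2

        observed1 += d1
        expected1 += d * n1 / n  # events expected in group 1 if curves were equal
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)

    return (observed1 - expected1) ** 2 / variance


# Hypothetical data: group 1 fails early, group 2 fails late, no censoring
group1_times, group1_events = [1, 2, 3], [1, 1, 1]
group2_times, group2_events = [4, 5, 6], [1, 1, 1]

chi2 = logrank_chi2(group1_times, group1_events, group2_times, group2_events)
print(f"log-rank chi-square = {chi2:.2f}")  # compare with 3.84 for p < 0.05, 1 df
```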

Page 19: Advanced Statistics Presentation outline

Survival Curves: The Kaplan-Meier Estimate

Page 20: Advanced Statistics Presentation outline

Survival Curves: The Kaplan-Meier Estimate

Page 21: Advanced Statistics Presentation outline

Survival Curves: The Kaplan-Meier Estimate

Interval     # At Risk at       # Censored       # At Risk at     # Who Died at    Proportion Surviving  Cumulative Survival
(Start-End)  Start of Interval  During Interval  End of Interval  End of Interval  This Interval         at End of Interval
0-1          7                  0                7                1                6/7 = 0.86            0.86
1-4          6                  2                4                1                3/4 = 0.75            0.86 * 0.75 = 0.64
4-10         3                  1                2                1                1/2 = 0.5             0.86 * 0.75 * 0.5 = 0.32
10-12        1                  0                1                0                1/1 = 1.0             0.86 * 0.75 * 0.5 * 1.0 = 0.32
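The cumulative survival column of the table can be reproduced with a short sketch. It assumes, as the table does, that the number at risk when a death occurs is the count remaining after censored patients are removed; `kaplan_meier` is an illustrative helper name. Carrying full precision through the multiplication, the final cumulative survival rounds to 0.32.

```python
def kaplan_meier(intervals):
    """Cumulative survival from (number at risk, number of deaths) per interval."""
    cumulative = 1.0
    curve = []
    for at_risk, deaths in intervals:
        # Proportion surviving this interval, multiplied into the running product
        cumulative *= (at_risk - deaths) / at_risk
        curve.append(cumulative)
    return curve


# (at risk at end of interval, deaths at end of interval), taken from the table
intervals = [(7, 1), (4, 1), (2, 1), (1, 0)]

curve = kaplan_meier(intervals)
print([round(s, 2) for s in curve])  # [0.86, 0.64, 0.32, 0.32]
```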

Page 22: Advanced Statistics Presentation outline

Life Tables

Expression of death rates of a particular population during a particular time

– Probability of death within certain age ranges or counts of events within a time period

– Types:

• Population (based on census or large scale survey)

• Cohort (longitudinal follow-up with a specific group)

• Clinical

Page 23: Advanced Statistics Presentation outline

Example of survival analysis:

• Time to cardiovascular death and other fatal cardiovascular outcomes was compared for the two groups in terms of relative risk and hazard ratio.

Page 24: Advanced Statistics Presentation outline

Hazard Function Curves

Page 25: Advanced Statistics Presentation outline

Sensitivity and Specificity

Terms used to evaluate a clinical test

Sensitivity and specificity are properties of the test itself and do not depend on the prevalence of disease in the population tested

Positive and negative predictive values are useful when considering the value of a test to a clinician

– They are dependent on the prevalence of the disease in the population of interest

The sensitivity and specificity of a quantitative test are dependent on the cut-off value above or below which the test is positive

– In general, the higher the sensitivity, the lower the specificity, and vice versa

Receiver operating characteristic (ROC) curves plot the true positive rate (sensitivity) against the false positive rate (1 - specificity) for all cut-off values

– The area under the curve of a perfect test is 1.0 and that of a useless test, no better than tossing a coin, is 0.5
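How a ROC curve and its area arise from cut-off values can be sketched as follows. The test scores and disease labels are hypothetical, a result is assumed to be positive when the score is at or above the cut-off, and the area is approximated with the trapezoid rule.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) at every cut-off. labels: 1 = diseased."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # cut-off above every score: nothing is called positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoid rule over consecutive ROC points."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2.0
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


# Hypothetical perfect test: every diseased patient scores above every healthy one
perfect_auc = area_under_curve(roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))

# Hypothetical useless test: the score carries no information about disease
useless_auc = area_under_curve(roc_points([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))

print(perfect_auc, useless_auc)  # 1.0 0.5
```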

Page 26: Advanced Statistics Presentation outline

Sensitivity and Specificity

Sensitivity: If a person has a disease, how often will the test be positive (true positive rate)?

– If the test is highly sensitive and the test result is negative, you can be nearly certain that the person does not have the disease

– Rules out disease (when the result is negative)

Specificity: If a person does not have the disease, how often will the test be negative (true negative rate)?

– If the test result for a highly specific test is positive, you can be nearly certain that the person actually has the disease

– Rules in disease with a high degree of confidence (when the result is positive)

Page 27: Advanced Statistics Presentation outline

Fundamental terms for understanding Sensitivity & Specificity

True positive: the patient has the disease and the test is positive.

False positive: the patient does not have the disease but the test is positive.

True negative: the patient does not have the disease and the test is negative.

False negative: the patient has the disease but the test is negative.

Sensitivity = true positives / (true positives + false negatives)

Specificity = true negatives / (true negatives + false positives)

Predictive value for a positive result (PV+):

PV+ asks "If the test result is positive, what is the probability that the patient actually has the disease?"

PV+ = true positives / (true positives + false positives)

Predictive value for a negative result (PV-):

PV- asks "If the test result is negative, what is the probability that the patient does not have the disease?"

PV- = true negatives / (true negatives + false negatives)

Page 28: Advanced Statistics Presentation outline

Diagnostic Test Design: sensitivity & specificity

Page 29: Advanced Statistics Presentation outline

Example for ROC:

• Sensitivity, specificity, positive and negative predictive values, and area under the receiver operating characteristic (ROC) curve were compared for procalcitonin, C-reactive protein, and leucocyte count.

Page 30: Advanced Statistics Presentation outline

Thank You