BASIC STATISTICS

BASIC BIOSTATISTICS

Diane Flynn, LTC, MC

Colin Greene, LTC, MC

Objectives

• Overview of Biostatistical Terms and Concepts
• Application of Statistical Tests

Why Use Statistics?

Descriptive Statistics
• identify patterns
• lead to hypothesis generation

Inferential Statistics
• distinguish true differences from random variation
• allow hypothesis testing

Why Use Statistics?

Cardiovascular Mortality in Males

[Figure: SMR by decade ('35-'44 through '75-'84) for Bangor and Roseto; vertical axis 0 to 1.2. Source: AJPH 1992]

Types of Data

Numerical
• Continuous
• Discrete

Categorical
• Ordinal
• Nominal

Descriptive Statistics

• Identifies patterns in the data
• Identifies outliers
• Guides choice of statistical test

Percentage of Specimens Testing Positive for RSV

Region      Jul  Aug  Sep  Oct  Nov  Dec  Jan  Feb  Mar  Apr  May  Jun
South         2    2    5    7   20   30   15   20   15    8    4    3
Northeast     2    3    5    3   12   28   22   28   22   20   10    9
West          2    2    3    3    5    8   25   27   25   22   15   12
Midwest       2    2    3    2    4   12   12   12   10   19   15    8
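As a small illustration of using descriptive statistics to spot a pattern, the sketch below finds the peak month(s) for each region in the RSV table. The values are copied from the table; the names `MONTHS`, `RSV_PERCENT_POSITIVE`, and `peak_months` are made up for this example.

```python
# Values copied from the RSV table; names are illustrative only.
MONTHS = ["Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
          "Jan", "Feb", "Mar", "Apr", "May", "Jun"]

RSV_PERCENT_POSITIVE = {
    "South":     [2, 2, 5, 7, 20, 30, 15, 20, 15, 8, 4, 3],
    "Northeast": [2, 3, 5, 3, 12, 28, 22, 28, 22, 20, 10, 9],
    "West":      [2, 2, 3, 3, 5, 8, 25, 27, 25, 22, 15, 12],
    "Midwest":   [2, 2, 3, 2, 4, 12, 12, 12, 10, 19, 15, 8],
}


def peak_months(percentages):
    """Return every month whose percentage equals the maximum (ties included)."""
    peak = max(percentages)
    return [month for month, value in zip(MONTHS, percentages) if value == peak]


for region, values in RSV_PERCENT_POSITIVE.items():
    print(region, peak_months(values), max(values))
```

Listing the peaks side by side makes it easy to see that they do not fall in the same month in every region, which is the kind of pattern that can prompt a hypothesis.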

Descriptive Statistics

[Figure: line chart, Percentage of Specimens Testing Positive for RSV, 1998-99; vertical axis 0 to 35, horizontal axis Jul through Jul; series: South, Northeast, West, Midwest]

Describing the Data with Numbers

Measures of Central Tendency
• MEAN -- average
• MEDIAN -- middle value
• MODE -- most frequently observed value(s)
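The three measures can be computed with Python's standard `statistics` module. This is a minimal sketch; the scores are hypothetical values chosen for illustration, not data from the course.

```python
from statistics import mean, median, multimode

# Hypothetical exam scores (made up for illustration).
scores = [70, 85, 85, 90, 95, 100, 70]

average = mean(scores)        # sum of values divided by the count
middle = median(scores)       # middle value once the scores are sorted
most_common = multimode(scores)  # every value tied for most frequent

print(average, middle, most_common)
```

`multimode` is used instead of `mode` because a data set can have more than one most frequently observed value, as this one does.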

Distribution of Course Grades

[Figure: bar chart of Number of Students (0 to 14) by Grade (A, A-, B+, B, B-, C+, C, C-, D+, D, D-, F)]

Describing the Data with Numbers

Measures of Dispersion
• RANGE
• STANDARD DEVIATION
• SKEWNESS

Measures of Dispersion

• RANGE -- spread from the lowest to the highest value
• STANDARD DEVIATION -- how closely values cluster around the mean value
• SKEWNESS -- refers to the symmetry of the curve
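The sketch below computes all three measures for two small hypothetical data sets. The skewness formula used here (third central moment divided by the second central moment raised to the 3/2 power) is one common definition; the slides do not say which definition the authors had in mind.

```python
from statistics import mean, stdev


def value_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)


def skewness(values):
    """Moment coefficient of skewness: 0 for symmetric data,
    negative when the long tail is on the left."""
    m = mean(values)
    n = len(values)
    second_moment = sum((x - m) ** 2 for x in values) / n
    third_moment = sum((x - m) ** 3 for x in values) / n
    return third_moment / second_moment ** 1.5


# Hypothetical data sets (made up for illustration).
symmetric = [1, 2, 3, 4, 5]
left_tailed = [1, 8, 9, 9, 10]

for name, data in [("symmetric", symmetric), ("left_tailed", left_tailed)]:
    print(name, value_range(data), round(stdev(data), 2), round(skewness(data), 2))
```

`stdev` is the sample standard deviation (divides by n - 1); `statistics.pstdev` gives the population version.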

Standard Deviation

[Figure: two distribution curves, Curve A and Curve B, illustrating different standard deviations]

Skewness

[Figure: Curve A and Curve B; the negative skew curve is shown with its mode, median, and mean marked]

The Normal Distribution

• Mean = median = mode
• Skew is zero
• 68% of values fall within 1 SD of the mean
• 95% of values fall within 2 SDs of the mean

[Figure: normal curve with the mean, median, and mode at the center and 1 and 2 SDs marked]
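The 68% and 95% figures can be checked with `statistics.NormalDist` from the Python standard library. The helper name `fraction_within` is made up for this sketch.

```python
from statistics import NormalDist


def fraction_within(k, mu=0.0, sigma=1.0):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


print(round(fraction_within(1), 4))  # about 0.68
print(round(fraction_within(2), 4))  # about 0.95
```

The result does not depend on the particular mean or SD. The 2 SD figure is slightly above 95%; exactly 95% corresponds to about 1.96 SDs.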

Inferential Statistics

Used to determine the likelihood that a conclusion based on data from a sample is true

Terms

p value: the probability that a difference at least as large as the one observed could have occurred by chance alone, if there were truly no difference

Hypertension Trial

Drug    Baseline mean SBP    Follow-up mean SBP
A       150                  130
B       150                  125
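One way to see what a p value measures is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference at least as large as the one observed. The slide gives only the group means, so the individual readings below are hypothetical, chosen so that their means match the follow-up values of 130 and 125. The permutation test itself is an illustrative choice, not a method named in the slides.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation p value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign readings to groups at random
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_permutations


# Hypothetical follow-up SBP readings (made up; means are 130 and 125).
drug_a = [128, 135, 126, 132, 131, 129, 133, 126]
drug_b = [124, 127, 122, 128, 126, 123, 125, 125]

p = permutation_p_value(drug_a, drug_b)
print(p)
```

With different hypothetical readings (more spread, or fewer patients) the same 5 mmHg difference could give a much larger p value, which is why the means alone do not settle the question.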

Terms

confidence interval: the range of values we can be reasonably certain includes the true value

30 Day % Mortality

Study       IC STK    Control    p       N
Khaja        5.0      10.0       0.55     40
Anderson     4.2      15.4       0.19     50
Kennedy      3.7      11.2       0.02    250

95% Confidence Intervals

[Figure: 95% confidence intervals for Khaja (n=40), Anderson (n=50), and Kennedy (n=250), plotted on a horizontal axis from -.40 to .20]
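A rough sketch of how such intervals can be computed is shown below, using the normal-approximation (Wald) interval for a difference in two proportions. Two assumptions are made that the slides do not state: the patients in each study are split equally between the two arms, and the Wald method is used. The numbers it produces are therefore illustrative and need not match the plotted intervals.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(p_treated, n_treated, p_control, n_control, level=0.95):
    """Wald confidence interval for (treated risk - control risk)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    diff = p_treated - p_control
    se = sqrt(p_treated * (1 - p_treated) / n_treated
              + p_control * (1 - p_control) / n_control)
    return diff - z * se, diff + z * se


# Mortality proportions from the table; equal arm sizes are an assumption.
khaja = risk_difference_ci(0.050, 20, 0.100, 20)
kennedy = risk_difference_ci(0.037, 125, 0.112, 125)

print("Khaja  ", khaja)
print("Kennedy", kennedy)
```

Under these assumptions the small study's interval is wide and includes zero, while the large study's interval is narrow and excludes zero, which is consistent with the p values in the table.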

Types of Errors

                          Truth
Conclusion       No difference         Difference
No difference    correct               TYPE II ERROR (β)
Difference       TYPE I ERROR (α)      correct

Power = 1 - β
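The relationship between the Type I error rate (alpha), the Type II error rate (beta), and power can be sketched for a two-sided, two-sample z-test with a known standard deviation. The choice of a z-test and the example numbers (a true difference of 5, SD of 10, 50 per group) are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(true_difference, sd, n_per_group, alpha=0.05):
    """Probability that a two-sided, two-sample z-test concludes 'difference'."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se = sd * sqrt(2 / n_per_group)  # standard error of the difference in means
    shift = true_difference / se
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


# When the truth is "no difference", the rejection rate is the Type I error rate.
type_i_rate = z_test_power(0, sd=10, n_per_group=50)

# When a real difference exists, the rejection rate is the power.
power = z_test_power(5, sd=10, n_per_group=50)
beta = 1 - power  # Type II error rate

print(round(type_i_rate, 3), round(power, 3), round(beta, 3))
```

Increasing `n_per_group` raises the power (lowers beta) while alpha stays fixed, which is why sample size appears among the questions to ask when choosing a test.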

What Test Do I Use?

1. What type of data?

2. How many samples?

3. Are the data normally distributed?

4. What is the sample size?
