Download ppt - Kin 304 Descriptive Statistics & the Normal Distribution

Kin 304Descriptive Statistics & the Normal Distribution

Normal DistributionMeasures of Central TendencyMeasures of VariabilityPercentiles

Z-ScoresArbitrary Scores & ScalesSkewnes & Kurtosis

Descriptive Statistics

Descriptive

Describe the characteristics of your sample Descriptive statistics

– Measures of Central Tendency Mean, Mode, Median

– Skewness, Kurtosis

– Measures of Variability Standard Deviation & Variance

– Percentiles, Scores in comparison to the Normal distribution


Normal Frequency Distribution

-4 -3 -2 -1 0 1 2 3 4

34.13% 34.13%

13.59%13.59%2.15% 2.15%

Mean Mode Median

68.26%

95.44%


Measures of Central Tendency

When do you use mean, mode or median?– Height– Skinfolds (positively skewed)– House prices in Vancouver– Measurements (criterion value)

Objective testAnthropometryVertical jump100m run time


Measures of Variability

Standard DeviationVariance = Standard Deviation2

Range (approx. = ±3 SDs)Nonparametric

– Quartiles (25%ile, 75%ile)– Interquartile distance


Central Limit Theorem

If a sufficiently large number of random samples of the same size were drawn from an infinitely large population, and the mean (average) was computed for each sample, the distribution formed by these averages would be normal.

Using the Normal Distribution

Standard Error of the Mean

SEMSD

n

Describes how confident you are that the mean of the sample is the mean of the population


Z- Scores

s

XXZ

)( Score = 24

Norm Mean = 30

Norm SD = 4

Z-score = (24 – 30) / 4 = -1.5


Internal or External Norm

Internal Norm

A sample of subjects are measured. Z-scores are calculated based upon mean and sd of the sample.

Mean = 0, sd = 1

External Norm

A sample of subjects are measured. Z-scores are calculated based upon mean and sd of an external normative sample (national, sport specific etc.)

Mean = ?, sd = ? (depends upon how the sample differs from the external norm.

Testing Normality

Standardizing Data

Transforming data into standard scores Useful in eliminating units of measurements

HT

186.0182.0

178.0174.0

170.0166.0

162.0158.0

154.0150.0

146.0

800

600

400

200

0

Std. Dev = 6.22

Mean = 161.0

N = 5782.00

Zscore(HT)

4.003.50

3.002.50

2.001.50

1.00.50

0.00-.50

-1.00-1.50

-2.00-2.50

700

600

500

400

300

200

100

0

Std. Dev = 1.00

Mean = 0.00

N = 5782.00

Testing Normality

Standardizing Data

Standardizing does not change the distribution of the data.

WT

125.0120.0

115.0110.0

105.0100.0

95.090.0

85.080.0

75.070.0

65.060.0

55.050.0

45.0

700

600

500

400

300

200

100

0

Std. Dev = 11.14

Mean = 61.9

N = 5704.00

Zscore(WT)

5.505.00

4.504.00

3.503.00

2.502.00

1.501.00

.500.00

-.50-1.00

-1.50

800

600

400

200

0

Std. Dev = 1.00

Mean = 0.00

N = 5704.00

Z-scores allow measurements from tests with different units to be combined. But beware: Higher z-scores are not necessarily better performances

Variablez-scores for

profile Az-scores for

profile B

Sum of 5 Skinfolds (mm) 1.5 -1.5*

Grip Strength (kg) 0.9 0.9

Vertical Jump (cm) -0.8 -0.8

Shuttle Run (sec) 1.2 -1.2*

Overall Rating 0.7 -0.65

*Z-scores are reversed because lower skinfold and shuttle run scores are regarded as better performances

-1 0 1 2

Sum of 5Skinfolds

(mm)

GripStrength

(kg)

VerticalJump(cm)

ShuttleRun (sec)

OverallRating

z-scores

-2 -1 0 1 2

Sum of 5Skinfolds

(mm)

GripStrength

(kg)

VerticalJump(cm)

ShuttleRun (sec)

OverallRating

z-scores

Test Profile A Test Profile B


Percentile: The percentage of the population that lies at or below that score

-4 -3 -2 -1 0 1 2 3 4

34.13% 34.13%

13.59%13.59%2.15% 2.15%

Mean Mode Median

68.26%

95.44%


Cumulative Frequency Distribution

Area under the Standard Normal Curve

What percentage of the population is above or below a given z-score or between two given z-scores?

Percentage between 0 and -1.5

43.32%

Percentage above -1.5

50 + 43.32% = 93.32%

-4 -3 -2 -1 0 1 2 3 4


Predicting Percentiles from Mean and sdassuming a normal distribution

PercentileZ-score for

Percentile

Predicted Percentile value

based upon Mean = 170Sd = 10

5 -1,645 153.55

25 -0.675 163.25

50 0 170

75 +0.675 176.75

95 +1.645 186.45


T-scores– Mean = 50, sd

= 10

Hull scores– Mean = 50, sd

= 14

Arbitrary Scores & Scales


All Scores based upon z-scores

Z-score = +1.25

T-Score = 50 + (+1.25 x 10) = 62.5

Hull Score = 50 + (+1.25 x 14) = 67.5

Z-score = -1.25

T-Score = 50 + (-1.25 x 10) = 37.5

Hull Score = 50 + (-1.25 x 14) = 32.5


Skewness

Many variables in Kinesiology are positively skewed


Kurtosis

Normal distribution is mesokurtic

Testing Normality

Skewness & Kurtosis

Skewness is a measure of symmetry, or more accurately, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point.

Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with a high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. A uniform distribution would be the extreme case

Testing Normality

Coefficient of Skewness

Where: X = mean, N = sample size, s = standard deviation

Normal Distribution: Skewness = 0

31

3

)1(

)(

sN

XXskewness

N

i i

Testing Normality

Coefficient of Kurtosis

41

4

)1(

)(

sN

XXkurtosis

N

i i

Where: X = mean, N = sample size, s = standard deviation

Normal Distribution: Kurtosis = 3

HT

186.0182.0

178.0174.0

170.0166.0

162.0158.0

154.0150.0

146.0

Height (Women)800

600

400

200

0

Std. Dev = 6.22

Mean = 161.0

N = 5782.00

WT

125.0120.0

115.0110.0

105.0100.0

95.090.0

85.080.0

75.070.0

65.060.0

55.050.0

45.0

Weight (Women)700

600

500

400

300

200

100

0

Std. Dev = 11.14

Mean = 61.9

N = 5704.00


5704 61.9210 .1474 11.1361 1.297 .032 2.643 .065

5782 161.0457 8.183E-02 6.2225 .092 .032 .090 .064

5362 75.7820 .3961 29.0066 1.043 .033 1.299 .067

5347

WT

HT

S5SF

Valid N (listwise)

Statistic Statistic Std. Error Statistic Statistic Std. Error Statistic Std. Error

N Mean Std.Deviation

Skewness Kurtosis

S5SF

220.0200.0

180.0160.0

140.0120.0

100.080.0

60.040.0

20.0

Sum of 5 Skinfolds (Women)1000

800

600

400

200

0

Std. Dev = 29.01

Mean = 75.8

N = 5362.00

Testing Normality

Normal Probability Plots

Correlation of observed with expected cumulative probability is a measure of the deviation from normal

Normal P-P Plot of WT

Observed Cum Prob

1.00.75.50.250.00

Exp

ect

ed

Cu

m P

rob

1.00

.75

.50

.25

0.00

Normal P-P Plot of HT

Observed Cum Prob

1.00.75.50.250.00

Exp

ect

ed

Cu

m P

rob

1.00

.75

.50

.25

0.00

T-scores and Osteoporosis

To diagnose osteoporosis, clinicians measure a patient’s bone mineral density (BMD) and then express the patient’s BMD in terms of standard deviations above or below the mean BMD for a “young normal” person of the same sex and ethnicity.

Although they call this standardized score a T-score, it is really just a Z-score where the reference mean and standard deviation come from an external population (i.e., young normal adults of a given sex and ethnicity).


lyoungnorma

lyoungnormapatient

SD

BDMBDMscoreT

)(

Classification using t-scores

T-scores are used to classify a patient’s BMD into one of three categories: – T-scores of -1.0 indicate normal bone density– T-scores between -1.0 and -2.5 indicate low bone

mass (“osteopenia”)– T-scores -2.5 indicate osteoporosis

Decisions to treat patients with osteoporosis medication are based, in part, on T-scores.

http://www.nof.org/sites/default/files/pdfs/NOF_ClinicianGuide2009_v7.pdf