Kin 304Descriptive Statistics & the Normal Distribution
Normal DistributionMeasures of Central TendencyMeasures of VariabilityPercentiles
Z-ScoresArbitrary Scores & ScalesSkewnes & Kurtosis
Descriptive Statistics
Descriptive
Describe the characteristics of your sample Descriptive statistics
– Measures of Central Tendency Mean, Mode, Median
– Skewness, Kurtosis
– Measures of Variability Standard Deviation & Variance
– Percentiles, Scores in comparison to the Normal distribution
Descriptive Statistics
Normal Frequency Distribution
-4 -3 -2 -1 0 1 2 3 4
34.13% 34.13%
13.59%13.59%2.15% 2.15%
Mean Mode Median
68.26%
95.44%
Descriptive Statistics
Measures of Central Tendency
When do you use mean, mode or median?– Height– Skinfolds (positively skewed)– House prices in Vancouver– Measurements (criterion value)
Objective testAnthropometryVertical jump100m run time
Descriptive Statistics
Measures of Variability
Standard DeviationVariance = Standard Deviation2
Range (approx. = ±3 SDs)Nonparametric
– Quartiles (25%ile, 75%ile)– Interquartile distance
Descriptive Statistics
Central Limit Theorem
If a sufficiently large number of random samples of the same size were drawn from an infinitely large population, and the mean (average) was computed for each sample, the distribution formed by these averages would be normal.
Using the Normal Distribution
Standard Error of the Mean
SEMSD
n
Describes how confident you are that the mean of the sample is the mean of the population
Using the Normal Distribution
Z- Scores
s
XXZ
)( Score = 24
Norm Mean = 30
Norm SD = 4
Z-score = (24 – 30) / 4 = -1.5
Using the Normal Distribution
Internal or External Norm
Internal Norm
A sample of subjects are measured. Z-scores are calculated based upon mean and sd of the sample.
Mean = 0, sd = 1
External Norm
A sample of subjects are measured. Z-scores are calculated based upon mean and sd of an external normative sample (national, sport specific etc.)
Mean = ?, sd = ? (depends upon how the sample differs from the external norm.
Testing Normality
Standardizing Data
Transforming data into standard scores Useful in eliminating units of measurements
HT
186.0182.0
178.0174.0
170.0166.0
162.0158.0
154.0150.0
146.0
800
600
400
200
0
Std. Dev = 6.22
Mean = 161.0
N = 5782.00
Zscore(HT)
4.003.50
3.002.50
2.001.50
1.00.50
0.00-.50
-1.00-1.50
-2.00-2.50
700
600
500
400
300
200
100
0
Std. Dev = 1.00
Mean = 0.00
N = 5782.00
Testing Normality
Standardizing Data
Standardizing does not change the distribution of the data.
WT
125.0120.0
115.0110.0
105.0100.0
95.090.0
85.080.0
75.070.0
65.060.0
55.050.0
45.0
700
600
500
400
300
200
100
0
Std. Dev = 11.14
Mean = 61.9
N = 5704.00
Zscore(WT)
5.505.00
4.504.00
3.503.00
2.502.00
1.501.00
.500.00
-.50-1.00
-1.50
800
600
400
200
0
Std. Dev = 1.00
Mean = 0.00
N = 5704.00
Z-scores allow measurements from tests with different units to be combined. But beware: Higher z-scores are not necessarily better performances
Variablez-scores for
profile Az-scores for
profile B
Sum of 5 Skinfolds (mm) 1.5 -1.5*
Grip Strength (kg) 0.9 0.9
Vertical Jump (cm) -0.8 -0.8
Shuttle Run (sec) 1.2 -1.2*
Overall Rating 0.7 -0.65
*Z-scores are reversed because lower skinfold and shuttle run scores are regarded as better performances
-1 0 1 2
Sum of 5Skinfolds
(mm)
GripStrength
(kg)
VerticalJump(cm)
ShuttleRun (sec)
OverallRating
z-scores
-2 -1 0 1 2
Sum of 5Skinfolds
(mm)
GripStrength
(kg)
VerticalJump(cm)
ShuttleRun (sec)
OverallRating
z-scores
Test Profile A Test Profile B
Descriptive Statistics
Percentile: The percentage of the population that lies at or below that score
-4 -3 -2 -1 0 1 2 3 4
34.13% 34.13%
13.59%13.59%2.15% 2.15%
Mean Mode Median
68.26%
95.44%
Using the Normal Distribution
Cumulative Frequency Distribution
Area under the Standard Normal Curve
What percentage of the population is above or below a given z-score or between two given z-scores?
Percentage between 0 and -1.5
43.32%
Percentage above -1.5
50 + 43.32% = 93.32%
-4 -3 -2 -1 0 1 2 3 4
Using the Normal Distribution
Predicting Percentiles from Mean and sdassuming a normal distribution
PercentileZ-score for
Percentile
Predicted Percentile value
based upon Mean = 170Sd = 10
5 -1,645 153.55
25 -0.675 163.25
50 0 170
75 +0.675 176.75
95 +1.645 186.45
Using the Normal Distribution
T-scores– Mean = 50, sd
= 10
Hull scores– Mean = 50, sd
= 14
Arbitrary Scores & Scales
Using the Normal Distribution
All Scores based upon z-scores
Z-score = +1.25
T-Score = 50 + (+1.25 x 10) = 62.5
Hull Score = 50 + (+1.25 x 14) = 67.5
Z-score = -1.25
T-Score = 50 + (-1.25 x 10) = 37.5
Hull Score = 50 + (-1.25 x 14) = 32.5
Descriptive Statistics
Skewness
Many variables in Kinesiology are positively skewed
Descriptive Statistics
Kurtosis
Normal distribution is mesokurtic
Testing Normality
Skewness & Kurtosis
Skewness is a measure of symmetry, or more accurately, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point.
Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with a high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. A uniform distribution would be the extreme case
Testing Normality
Coefficient of Skewness
Where: X = mean, N = sample size, s = standard deviation
Normal Distribution: Skewness = 0
31
3
)1(
)(
sN
XXskewness
N
i i
Testing Normality
Coefficient of Kurtosis
41
4
)1(
)(
sN
XXkurtosis
N
i i
Where: X = mean, N = sample size, s = standard deviation
Normal Distribution: Kurtosis = 3
HT
186.0182.0
178.0174.0
170.0166.0
162.0158.0
154.0150.0
146.0
Height (Women)800
600
400
200
0
Std. Dev = 6.22
Mean = 161.0
N = 5782.00
WT
125.0120.0
115.0110.0
105.0100.0
95.090.0
85.080.0
75.070.0
65.060.0
55.050.0
45.0
Weight (Women)700
600
500
400
300
200
100
0
Std. Dev = 11.14
Mean = 61.9
N = 5704.00
Descriptive Statistics
5704 61.9210 .1474 11.1361 1.297 .032 2.643 .065
5782 161.0457 8.183E-02 6.2225 .092 .032 .090 .064
5362 75.7820 .3961 29.0066 1.043 .033 1.299 .067
5347
WT
HT
S5SF
Valid N (listwise)
Statistic Statistic Std. Error Statistic Statistic Std. Error Statistic Std. Error
N Mean Std.Deviation
Skewness Kurtosis
S5SF
220.0200.0
180.0160.0
140.0120.0
100.080.0
60.040.0
20.0
Sum of 5 Skinfolds (Women)1000
800
600
400
200
0
Std. Dev = 29.01
Mean = 75.8
N = 5362.00
Testing Normality
Normal Probability Plots
Correlation of observed with expected cumulative probability is a measure of the deviation from normal
Normal P-P Plot of WT
Observed Cum Prob
1.00.75.50.250.00
Exp
ect
ed
Cu
m P
rob
1.00
.75
.50
.25
0.00
Normal P-P Plot of HT
Observed Cum Prob
1.00.75.50.250.00
Exp
ect
ed
Cu
m P
rob
1.00
.75
.50
.25
0.00
T-scores and Osteoporosis
To diagnose osteoporosis, clinicians measure a patient’s bone mineral density (BMD) and then express the patient’s BMD in terms of standard deviations above or below the mean BMD for a “young normal” person of the same sex and ethnicity.
Although they call this standardized score a T-score, it is really just a Z-score where the reference mean and standard deviation come from an external population (i.e., young normal adults of a given sex and ethnicity).
Descriptive Statistics
lyoungnorma
lyoungnormapatient
SD
BDMBDMscoreT
)(
Classification using t-scores
T-scores are used to classify a patient’s BMD into one of three categories: – T-scores of -1.0 indicate normal bone density– T-scores between -1.0 and -2.5 indicate low bone
mass (“osteopenia”)– T-scores -2.5 indicate osteoporosis
Decisions to treat patients with osteoporosis medication are based, in part, on T-scores.
http://www.nof.org/sites/default/files/pdfs/NOF_ClinicianGuide2009_v7.pdf