Applying the Normal Distribution: Z-Scores
Chapter 3.5 – Tools for Analyzing Data
Mathematics of Data Management (Nelson)
MDM 4U
Comparing Data
Consider the following two students:
Student 1: MDM 4U, Mr. Lieff, Semester 1, 2004-2005. Mark = 84%, class mean x̄ = 74, standard deviation σ = 8
Student 2: MDM 4U, Mr. Lieff, Semester 2, 2005-2006. Mark = 83%, class mean x̄ = 70, standard deviation σ = 9.8
Can we compare the two students fairly when the mark distributions are different?
Mark Distributions for Each Class
[Figure: mark distributions for Semester 1, 2004-05 (axis from 50 to 98, centred on 74) and Semester 2, 2005-06 (axis from 40.6 to 99.4, centred on 70)]
Comparing Distributions
It is difficult to compare two distributions when they have different characteristics
For example, the two histograms have different means and standard deviations
z-scores allow us to make the comparison
[Figure: two histograms titled "Collection 1 Histogram", each showing Count; variable a ranges from 1 to 8 and variable b from 4 to 11]
The Standard Normal Distribution
A distribution with a mean of zero and a standard deviation of one: X~N(0,1²)
Each element of any normal distribution can be translated to the same place on a Standard Normal Distribution using the z-score of the element
The z-score is the number of standard deviations the piece of data is below or above the mean
If the z-score is positive, the data lies above the mean; if negative, below
z = (x - x̄) / σ
Standardizing
The process of reducing a normal distribution to the standard normal distribution N(0,1²) is called standardizing
Remember that a standardized normal distribution has a mean of 0 and a standard deviation of 1
Example 1
For the distribution X~N(10,2²), determine the number of standard deviations each value lies above or below the mean:
a. x = 7
z = (7 - 10)/2 = -1.5
7 is 1.5 standard deviations below the mean
b. x = 18.5
z = (18.5 - 10)/2 = 4.25
18.5 is 4.25 standard deviations above the mean (anything beyond 3 is an outlier)
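The arithmetic in Example 1 can be checked with a few lines of Python. This is only a minimal sketch: the function name `z_score` is an illustrative choice, not something from the textbook.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# X ~ N(10, 2²), the distribution used in Example 1
for x in (7, 18.5):
    z = z_score(x, mean=10, sd=2)
    side = "above" if z > 0 else "below"
    print(f"x = {x}: z = {z:+.2f} ({abs(z):.2f} standard deviations {side} the mean)")
```

The sign of the result carries the direction, so the same function covers both parts of the example.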
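Example continued…
[Figure: normal curve for X~N(10,2²) with the axis marked at 6, 8, 10, 12, 14 and 16, and the values 7 and 18.5 plotted. Bands on each side of the mean hold 34%, 13.5% and 2.35% of the data; 95% lies within 2 standard deviations and 99.7% within 3]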
Standard Deviation
A recent math quiz offered the following data
The z-scores offer a way to compare scores among members of the class, find out how many had a mark greater than yours, indicate position in the class, etc.
mean = 68.0, standard deviation = 10.9
[Figure: Test 1 Histogram, Count against marks from 40 to 90]
Example 2:
Suppose your mark was 64. Compare your mark to the rest of the class
z = (64 - 68.0)/10.9 = -0.37
Using the z-score table on page 398, we get 0.3557 or 35.6%
So 35.6% of the class has a mark less than or equal to yours
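The table lookup in Example 2 can be reproduced without the printed table. The sketch below assumes the standard identity Φ(z) = ½(1 + erf(z/√2)) for the area under the standard normal curve; the names `standard_normal_cdf` and `share_at_or_below` are illustrative.

```python
import math

def standard_normal_cdf(z):
    """Fraction of a standard normal distribution at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean, sd = 68.0, 10.9                # class statistics from the quiz data
mark = 64
z = round((mark - mean) / sd, 2)     # rounded to 2 places, as a printed table requires
share_at_or_below = standard_normal_cdf(z)
print(f"z = {z}, share at or below = {share_at_or_below:.4f}")
```

Rounding z first mimics the table lookup; skipping the rounding would shift the answer slightly.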
Example 3: Percentiles
The kth percentile is the data value that is greater than k% of the population
If another student has a mark of 75, what percentile is this student in?
z = (75 - 68)/10.9 = 0.64
From the table on page 398 we get 0.7389 or 73.9%, so the student is in the 74th percentile: their mark is greater than 74% of the others
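The percentile in Example 3 can be sketched with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `percentile_of` is an assumption made for illustration.

```python
from statistics import NormalDist

def percentile_of(mark, mean, sd):
    """Percent of a normal population scoring at or below the given mark."""
    z = round((mark - mean) / sd, 2)   # 2 decimal places, matching a z-table lookup
    return 100 * NormalDist().cdf(z)   # NormalDist() defaults to N(0, 1²)

p = percentile_of(75, mean=68.0, sd=10.9)
print(f"{p:.1f}% of the class is at or below 75, so roughly percentile {round(p)}")
```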
Example 4: Ranges
Now find the percent of data between a mark of 60 and 80
For 60: z = (60 - 68)/10.9 = -0.73 gives 23.3%
For 80: z = (80 - 68)/10.9 = 1.10 gives 86.4%
86.4% - 23.3% = 63.1%
So 63.1% of the class is between a mark of 60 and 80
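The range calculation in Example 4 is the difference of two cumulative areas. A minimal sketch, assuming `statistics.NormalDist` (Python 3.8 or later) and a hypothetical helper `percent_between`; because it subtracts unrounded table values, the last digit can differ slightly from the 63.1% obtained by rounding each percentage first.

```python
from statistics import NormalDist

def percent_between(low, high, mean, sd):
    """Percent of a normal population with values between low and high."""
    std = NormalDist()                       # standard normal, N(0, 1²)
    z_low = round((low - mean) / sd, 2)      # rounded as for a z-table lookup
    z_high = round((high - mean) / sd, 2)
    return 100 * (std.cdf(z_high) - std.cdf(z_low))

share = percent_between(60, 80, mean=68.0, sd=10.9)
print(f"About {share:.1f}% of the class scored between 60 and 80")
```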
Back to the two students...
Student 1: z = (84 - 74)/8 = 1.25
Student 2: z = (83 - 70)/9.8 = 1.327
Student 2 has the lower mark, but a higher z-score!
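The comparison of the two students can be sketched as follows. The dictionary layout and variable names are illustrative assumptions; the marks, means and standard deviations are the ones given for the two classes.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

students = {
    "Student 1": {"mark": 84, "mean": 74, "sd": 8},     # Semester 1, 2004-2005
    "Student 2": {"mark": 83, "mean": 70, "sd": 9.8},   # Semester 2, 2005-2006
}

z_scores = {name: z_score(s["mark"], s["mean"], s["sd"]) for name, s in students.items()}
for name, z in z_scores.items():
    print(f"{name}: z = {z:.3f}")

# The student furthest above their own class mean, in standard deviations
best = max(z_scores, key=z_scores.get)
print(f"{best} did better relative to their own class")
```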
Exercises
read through the examples on pages 180-185
try page 186 #2-5, 7, 8, 10
Mathematical Indices
Chapter 3.6 – Tools for Analyzing Data
Mathematics of Data Management (Nelson)
MDM 4U
What is an Index?
An index is an arbitrarily defined number that provides a measure of scale
Indices are used to indicate a value so that we can make comparisons, but they do not represent an actual measurement or quantity
1) BMI – Body Mass Index
A mathematical formula created to determine whether a person’s mass puts them at risk for health problems
BMI = m / h², where m = mass (kg), h = height (m)
Standard / Metric BMI Calculator: http://nhlbisupport.com/bmi/bmicalc.htm
Underweight: below 18.5
Normal: 18.5 - 24.9
Overweight: 25.0 - 29.9
Obese: 30.0 and above
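The BMI formula (mass in kilograms divided by the square of height in metres) and the category table can be sketched in Python. The person in the example (70 kg, 1.75 m) is hypothetical, and treating each category as running up to the next threshold (for example, Normal as 18.5 up to but not including 25.0) is an assumption made to close the gaps between the listed ranges.

```python
def bmi(mass_kg, height_m):
    """Body Mass Index: mass in kilograms divided by height in metres squared."""
    return mass_kg / height_m ** 2

def bmi_category(value):
    """Map a BMI value to the categories listed in the table (thresholds assumed half-open)."""
    if value < 18.5:
        return "Underweight"
    if value < 25.0:
        return "Normal"
    if value < 30.0:
        return "Overweight"
    return "Obese"

example = bmi(70, 1.75)   # hypothetical person: 70 kg, 1.75 m
print(f"BMI = {example:.1f} ({bmi_category(example)})")
```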
2) Slugging Percentage
Baseball is the most statistically analyzed sport in the world
A number of indices are used to measure the value of a player
Batting Average (AVG) measures a player’s ability to get on base (hits / at bats)
Slugging percentage (SLG) also takes into account the number of bases that a player earns (total bases / at bats)
SLG = TB / AB, where TB = 1B + 2*2B + 3*3B + 4*HR
and 1B = singles, 2B = doubles, 3B = triples, HR = home runs
Slugging Percentage Example
e.g. DH Frank Thomas, Toronto Blue Jays
http://sports.espn.go.com/mlb/players/stats?playerId=2370
2006 Statistics: 466 AB, 126 H, 11 2B, 0 3B, 39 HR
Since H = 1B + 2B + 3B + HR, total bases can be written as TB = H + 2B + 2*3B + 3*HR
SLG = (H + 2B + 2*3B + 3*HR) / AB
= (126 + 11 + 2*0 + 3*39) / 466
= 254 / 466
= 0.545 (3 decimal places)
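The slugging percentage calculation can be sketched in Python. Singles are not listed in the stat line, so the sketch derives them from hits; the function names are illustrative.

```python
def total_bases(hits, doubles, triples, home_runs):
    """TB = 1B + 2*2B + 3*3B + 4*HR, with singles derived from total hits."""
    singles = hits - doubles - triples - home_runs
    return singles + 2 * doubles + 3 * triples + 4 * home_runs

def slugging(hits, doubles, triples, home_runs, at_bats):
    """SLG = total bases / at bats."""
    return total_bases(hits, doubles, triples, home_runs) / at_bats

# 2006 statistics from the example: 466 AB, 126 H, 11 2B, 0 3B, 39 HR
tb = total_bases(126, 11, 0, 39)
slg = slugging(126, 11, 0, 39, at_bats=466)
print(f"TB = {tb}, SLG = {slg:.3f}")
```

Deriving singles first and then weighting each hit type gives the same total as the shortcut H + 2B + 2*3B + 3*HR.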
Moving Average
Used when time-series data show a great deal of fluctuation (e.g. long-term trend of a stock)
Takes the average of the previous n values
e.g. 5-Day Moving Average
cannot calculate until the 5th day
value for Day 5 is the average of Days 1-5
value for Day 6 is the average of Days 2-6
e.g. Look up a stock symbol at http://ca.finance.yahoo.com
Click Charts, then Technical chart, then n-Day Moving Average
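An n-day moving average can be sketched as a sliding window over the data. The closing prices below are made-up values for illustration only, and returning `None` for days before the window is full is a design choice of this sketch.

```python
def moving_average(values, n):
    """n-value moving average; entries before the n-th value are None."""
    result = []
    for i in range(len(values)):
        if i + 1 < n:
            result.append(None)                  # not enough data yet
        else:
            window = values[i + 1 - n : i + 1]   # the n most recent values
            result.append(sum(window) / n)
    return result

# Hypothetical closing prices for Days 1 to 7
prices = [10.0, 12.0, 11.0, 13.0, 14.0, 12.0, 15.0]
for day, avg in enumerate(moving_average(prices, 5), start=1):
    print(f"Day {day}: {'n/a' if avg is None else f'{avg:.2f}'}")
```

With n = 5 the first value appears on Day 5 (the average of Days 1-5), and the Day 6 value averages Days 2-6.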
Exercises
read pp. 189-192
1a (odd), 2-3 ac, 4 (alt: calculate SLG for 3 players on your favourite team for 2007), 8, 9, 11