1
Normal Distributions
Heibatollah Baghi, and
Mastee Badii, integrated with another anonymous PowerPoint by
Linda Imhoff
2
The Normal Curve
• A mathematical model, or an idealized conception, of the form a distribution might take under certain circumstances.
– The mean of sufficiently large samples from almost any distribution is approximately Normally distributed (Central Limit Theorem)
– Many observations (height of adults, weight of children in California, intelligence) have approximately Normal distributions
• Shape
– Bell-shaped graph, with most of the data in the middle
– Symmetric, with mean, median, and mode at the same point
3
Percent of Values Within One Standard Deviation
68.26% of Cases
4
Percent of Values Within Two Standard Deviations
95.44% of Cases
5
Percent of Values Within Three Standard Deviations
99.72% of Cases
The Normal Distribution
[Figure: normal density f(X) plotted against X, centered at μ with spread σ]
Changing μ shifts the distribution left or right.
Changing σ increases or decreases the spread.
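For reference, the curve on this slide can be written in the standard form of the normal density. The formula is not printed on the original slide; it is the usual textbook expression, shown here with the slide's own symbols:

```latex
f(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}
```

μ enters only through the difference X − μ, which is why changing it slides the curve left or right without altering its shape. σ rescales that difference, so a larger σ widens the curve and lowers its peak.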
7
Percent of Values Greater than 1 Standard Deviation
8
Percent of Values Greater than -2 Standard Deviations
The beauty of the normal curve:
No matter what μ and σ are, the area between μ-σ and μ+σ is about 68%; the area between μ-2σ and μ+2σ is about 95%; and the area between μ-3σ and μ+3σ is about 99.7%. Almost all values fall within 3 standard deviations.
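The claim can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `area_within` and the three (μ, σ) pairs are chosen for illustration only.

```python
from statistics import NormalDist

def area_within(mu: float, sigma: float, k: float) -> float:
    """Area under Normal(mu, sigma) between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Three arbitrary normal curves (illustrative choices of mu and sigma)
curves = [(0, 1), (500, 50), (127.8, 15.5)]

for mu, sigma in curves:
    areas = [round(area_within(mu, sigma, k), 4) for k in (1, 2, 3)]
    print(f"mu={mu}, sigma={sigma}: areas within 1, 2, 3 SD = {areas}")
```

Every curve reports the same three areas, roughly 0.683, 0.954, and 0.997, whatever μ and σ are.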
10
Percent of Values Greater than +2 Standard Deviations
11
X̄ ± 1S contains about 68% of the scores
X̄ ± 2S contains about 95% of the scores
X̄ ± 3S contains about 99% of the scores
Data in Normal Distribution
12
Properties Of Normal Curve
• Normal curves are symmetrical.
• Normal curves are unimodal.
• Normal curves have a bell-shaped form.
• Mean, median, and mode all have the same value.
Are my data normally distributed?
1. Look at the histogram! Does it appear bell shaped?
2. Compute descriptive summary measures: are the mean, median, and mode similar?
3. Do 2/3 of observations lie within 1 std dev of the mean? Do 95% of observations lie within 2 std dev of the mean?
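Checks 2 and 3 are easy to automate. The following is a minimal sketch, not a formal normality test; the function name `empirical_rule_check` is hypothetical, and the data are simulated rather than taken from the slides. The mode is left out because it is rarely informative for continuous measurements.

```python
import random
from statistics import mean, median, stdev

def empirical_rule_check(data):
    """Summarize data and report the share of values within 1 and 2 SDs of the mean."""
    m = mean(data)
    s = stdev(data)  # sample standard deviation
    n = len(data)
    within_1 = sum(abs(x - m) <= s for x in data) / n
    within_2 = sum(abs(x - m) <= 2 * s for x in data) / n
    return {
        "mean": m,
        "median": median(data),
        "sd": s,
        "within_1_sd": within_1,  # compare with about 2/3
        "within_2_sd": within_2,  # compare with about 95%
    }

# Simulated sample for illustration only (not real measurements)
rng = random.Random(0)
sample = [rng.gauss(127.8, 15.5) for _ in range(120)]

report = empirical_rule_check(sample)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

If the mean and median are close and the two shares are near 2/3 and 95%, the data are at least consistent with a normal shape. The histogram is still worth a look.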
How good is rule for real data?
Check some example data:
Mean weight of the women = 127.8 lbs
Standard deviation (SD) = 15.5 lbs
[Histogram: percent of runners by weight in pounds (80 to 160); mean 127.8, with 1-SD bounds at 112.3 and 143.3]
68% of 120 = 0.68 x 120 ≈ 82 runners
In fact, 79 runners fall within 1 SD (15.5 lbs) of the mean.
[Histogram: percent of runners by weight in pounds (80 to 160); mean 127.8, with 2-SD bounds at 96.8 and 158.8]
95% of 120 = 0.95 x 120 = 114 runners
In fact, 115 runners fall within 2 SDs of the mean.
[Histogram: percent of runners by weight in pounds (80 to 160); mean 127.8, with 3-SD bounds at 81.3 and 174.3]
99.7% of 120 = 0.997 x 120 = 119.6 runners
In fact, all 120 runners fall within 3 SDs of the mean.
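The arithmetic behind these three slides can be reproduced in a few lines. This sketch uses only the figures quoted above (mean 127.8, SD 15.5, 120 runners, and observed counts of 79, 115, and 120); the helper name `sd_band` is an assumption made for illustration.

```python
def sd_band(mean: float, sd: float, k: int) -> tuple:
    """Lower and upper bounds of the interval mean ± k standard deviations."""
    return (mean - k * sd, mean + k * sd)

mean_wt, sd_wt, n = 127.8, 15.5, 120       # values quoted on the slides
rule = {1: 0.68, 2: 0.95, 3: 0.997}        # shares predicted by the 68-95-99.7 rule
observed = {1: 79, 2: 115, 3: 120}         # counts reported on the slides

for k, share in rule.items():
    low, high = sd_band(mean_wt, sd_wt, k)
    expected = share * n
    print(f"{k} SD: {low:.1f} to {high:.1f} lbs, "
          f"expected {expected:.1f}, observed {observed[k]}")
```

The observed counts stay within a few runners of the predicted ones, which is about as close as real data of this size usually gets.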
Example
• Suppose SAT scores roughly follow a normal distribution in the U.S. population of college-bound students (with range restricted to 200-800), and the average math SAT is 500 with a standard deviation of 50. Then:
– About 68% of students will have scores between 450 and 550
– About 95% will be between 400 and 600
– About 99.7% will be between 350 and 650
Example
• BUT…
• What if you wanted to know the math SAT score corresponding to the 90th percentile (=90% of students are lower)?
P(X≤Q) = .90
$$\int_{200}^{Q} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^{2}}\,dx = 0.90$$
Solve for Q?….Yikes!
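In practice nobody solves this by hand. As a sketch, Python's standard-library `statistics.NormalDist` can invert the cumulative distribution directly. It ignores the truncation at 200, which is harmless here because 200 lies six standard deviations below the mean.

```python
from statistics import NormalDist

# Math SAT model from the example: mean 500, SD 50
sat = NormalDist(mu=500, sigma=50)

# Score Q such that P(X <= Q) = 0.90, i.e. the 90th percentile
q90 = sat.inv_cdf(0.90)
print(f"90th percentile: {q90:.1f}")

# Sanity check: the area below Q should come back as 0.90
print(f"P(X <= Q) = {sat.cdf(q90):.4f}")
```

The answer comes out at roughly 564, about 1.28 standard deviations above the mean.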
20
Standard Scores
• One use of the normal curve is to explore Standard Scores. Standard Scores are expressed in standard deviation units, making it much easier to compare variables measured on different scales.
• There are many kinds of Standard Scores. The most common standard score is the ‘z’ score.
• A ‘z’ score states the number of standard deviations by which the original score lies above or below the mean of a normal curve.
21
The Z Score
• The normal curve is not a single curve but a family of curves, each of which is determined by its mean and standard deviation.
• In order to work with a variety of normal curves, we cannot have a table for every possible combination of means and standard deviations.
22
The Z Score
• What we need is a standardized normal curve which can be used for any normally distributed variable. Such a curve is called the Standard Normal Curve.
$$z = \frac{X_i - \bar{X}}{S}$$
The Standard Normal Distribution (Z)
All normal distributions can be converted into the standard normal curve by subtracting the mean and dividing by the standard deviation:
$$Z = \frac{X - \mu}{\sigma}$$
Somebody calculated all the integrals for the standard normal and put them in a table! So we never have to integrate!
Even better, computers now do all the integration.
24
The Standard Normal Curve
• The Standard Normal Curve (z distribution) is the distribution of normally distributed standard scores with mean equal to zero and a standard deviation of one.
• A z score is nothing more than a figure, which represents how many standard deviation units a raw score is away from the mean.
25
Example Z Score
• For scores above the mean, the z score has a positive sign. Example + 1.5z.
• Below the mean, the z score has a minus sign. Example - 0.5z.
• Calculate Z score for blood pressure of 140 if the sample mean is 110 and the standard deviation is 10
• Z = (140 – 110) / 10 = 3
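The same calculation as a small function. This is an illustrative sketch; the name `z_score` is not from the slides. Note that the subtraction must happen before the division.

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """Number of standard deviations by which raw lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd

# Blood pressure example from the slide: raw 140, mean 110, SD 10
bp_z = z_score(140, 110, 10)
print(bp_z)

# A raw score below the mean gives a negative z score
print(z_score(105, 110, 10))
```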
26
Comparing Scores from Different Distributions
• Interpreting a raw score requires additional information about the entire distribution. In most situations, we need some idea about the mean score and an indication of how much the scores vary.
• For example, assume that an individual took two tests in reading and mathematics. The reading score was 32 and mathematics was 48. Is it correct to say that performance in mathematics was better than in reading?
27
Z Scores Help in Comparisons
• Not without additional information. One method to interpret the raw score is to transform it to a z score.
• The advantage of the z score transformation is that it takes into account both the mean value and the variability in a set of raw scores.
28
Did Sara improve?
• Score in pretest was 18 and post test was 42
• Sara’s score did increase, from 18 to 42.
• But her relative position in the class decreased.

                     Pretest   Post test
Observation          18        42
Mean                 17        49
Standard deviation   3         49
Z score              0.33      -0.14
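The comparison in the table can be reproduced as follows. This sketch takes the means and standard deviations exactly as printed on the slide; the dictionary layout is just one convenient way to hold them.

```python
# Figures as printed on the slide
tests = {
    "pretest":  {"score": 18, "mean": 17, "sd": 3},
    "posttest": {"score": 42, "mean": 49, "sd": 49},
}

# z = (score - mean) / sd for each test
z = {name: (t["score"] - t["mean"]) / t["sd"] for name, t in tests.items()}

for name, value in z.items():
    print(f"{name}: z = {value:.2f}")

# The raw score went up, but the position relative to the class went down
print("raw score improved:", tests["posttest"]["score"] > tests["pretest"]["score"])
print("relative position improved:", z["posttest"] > z["pretest"])
```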
29
Area When Score is Known
• For a normal distribution with mean of 100 and standard deviation of 20, what proportion of cases fall below 80?
• ~16%
30
Score When Area Is Known
• For a normal distribution with mean of 100 and standard deviation of 20, find the score that separates the upper 20% of the cases from the lower 80%
• Answer = 116.8
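Both directions, from score to area and from area to score, are one call each with Python's standard-library `statistics.NormalDist`. A minimal sketch using the mean of 100 and standard deviation of 20 from these two slides:

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=20)

# Area when the score is known: proportion of cases below 80 (z = -1)
below_80 = dist.cdf(80)
print(f"Proportion below 80: {below_80:.4f}")

# Score when the area is known: cutoff with 80% of cases below it,
# which separates the upper 20% from the lower 80%
cutoff_top20 = dist.inv_cdf(0.80)
print(f"Cutoff for upper 20%: {cutoff_top20:.1f}")
```

The first result is close to 0.16 and the second to 116.8, matching the answers above.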
31
Transforming Standard Scores
• Sometimes it is more convenient to work with standard scores that do not have negative numbers or decimals.
• Standard scores can be transformed to have any desired mean and standard deviation.
• SAT and GRE scores are transformed scores (similar to z) with a mean of 500 and an SD of 100: (z x 100) + 500
• Widely used cognitive and personality tests (e.g., the Wechsler IQ test) are standardized to have a mean of 100 and an SD of 15: (z x 15) + 100
32
Transforming a raw score of 12 on Behavioral Problem Index
• Age 5: Mean: 10.0 SD: 2.0
• Age 6: Mean: 12.0 SD: 3.0
• Age 7: Mean: 14.0 SD: 3.0
• Age 5: Z = (12-10) / 2 = 1.0
• Age 6: Z = (12-12) / 3 = 0.0
• Age 7: Z = (12-14) / 3 = -0.67
• Age 5: Standard Score (mean 100, SD 15) = (1.0 x 15) + 100 = 115
• Age 6: Standard Score (mean 100, SD 15) = (0.0 x 15) + 100 = 100
• Age 7: Standard Score (mean 100, SD 15) = (-0.67 x 15) + 100 = 90
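The two-step conversion (raw score to z, then z to a scale with mean 100 and SD 15) can be sketched as below. The function name `to_standard_score` and the `norms` dictionary are illustrative; the age norms are those listed on the slide.

```python
def to_standard_score(raw, mean, sd, new_mean=100.0, new_sd=15.0):
    """Convert a raw score to z, then rescale z to the target mean and SD."""
    z = (raw - mean) / sd
    return z * new_sd + new_mean

# Behavioral Problem Index norms from the slide: age -> (mean, SD)
norms = {5: (10.0, 2.0), 6: (12.0, 3.0), 7: (14.0, 3.0)}

# The same raw score of 12 means something different at each age
scores = {age: round(to_standard_score(12, m, s)) for age, (m, s) in norms.items()}
for age, score in scores.items():
    print(f"Age {age}: standard score {score}")
```

A raw score of 12 is above average for a 5-year-old, exactly average for a 6-year-old, and below average for a 7-year-old.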
35
Other Standard Scores
• A T score is created from a z score simply by multiplying each standard deviation unit by 10 to get rid of the decimals, and then adding 50 to each of these scores to get rid of the negatives.
• Now the mean becomes 50 ([10*0] + 50 = 50).
• Plus 1 z becomes 60 ([10*1] + 50 = 60).
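As a short sketch, the T score is the same linear rescaling with a multiplier of 10 and an offset of 50. The function name `t_score` and the list of z values are illustrative.

```python
def t_score(z: float) -> float:
    """T score: multiply z by 10, then add 50."""
    return 10 * z + 50

# A few z scores and their T-score equivalents
for z in (-2.0, -0.5, 0.0, 1.0, 1.5):
    print(f"z = {z:+.1f}  ->  T = {t_score(z):.0f}")
```

Any z score between -5 and +5 maps to a T score between 0 and 100, so negative values essentially never appear in practice.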
36
Multiple Transformation of Data
37
The Normal Curve & Probability
• The normal curve also is central to many aspects of inferential statistics. This is because the normal curve can be used to answer questions concerning the probability of events.
• For example, by knowing that 16% of adults have a Wechsler IQ greater than 115 (z = +1.00), one can state the probability (p) of randomly selecting from the adult population a person whose IQ is greater than 115.
• You are correct if you suspect that p is .16.
38
Data on the IQ Scores of 1000 Sixth-Grade Children
39
The Normal Curve & Probability
• The mean of the distribution is 100 and the SD is 15
• What is the probability that a randomly selected student from this population would have an IQ score of 115 or greater?
• Approximately .16
• 16 percent of the total area under the curve in the distribution
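This probability is the area in the upper tail, which can be computed as one minus the cumulative area below 115. A minimal sketch with Python's standard-library `statistics.NormalDist`, assuming the IQ scores are exactly normal with the stated mean and SD:

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# P(IQ >= 115): upper-tail area beyond z = +1
p_115_or_more = 1 - iq.cdf(115)
print(f"P(IQ >= 115) = {p_115_or_more:.4f}")

# Expected number of such children in a group of 1000
expected_children = p_115_or_more * 1000
print(f"Expected out of 1000: about {expected_children:.0f}")
```

The tail area is a little under .16, so roughly 159 of the 1000 children would be expected to score 115 or higher.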