1 Normal Distributions Heibatollah Baghi, and Mastee Badii, integrated with another anonymous PowerPoint by Linda Imhoff


Normal Distributions

Heibatollah Baghi and Mastee Badii, integrated with another anonymous PowerPoint by Linda Imhoff


The Normal Curve

• A mathematical model: an idealized conception of the form a distribution might take under certain circumstances.

– The mean of a large sample from almost any distribution is approximately Normally distributed (Central Limit Theorem)

– Many observations (height of adults, weight of children in California, intelligence) have approximately Normal distributions

• Shape

– Bell-shaped graph, most of the data in the middle

– Symmetric, with mean, median and mode at the same point
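
The Central Limit Theorem bullet can be illustrated with a small simulation. The sketch below is not from the original slides: it assumes an exponential population (an arbitrary skewed choice), a sample size of 50, and a hypothetical helper named `sample_means`. It checks that the sample means pile up around the population mean of 1.0 in a roughly bell-shaped way.

```python
import random
import statistics

def sample_means(n_samples, sample_size, seed=0):
    """Return the means of n_samples samples drawn from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

means = sample_means(n_samples=2000, sample_size=50)
center = statistics.fmean(means)
spread = statistics.stdev(means)

# Share of sample means within one standard deviation of their center;
# for a normal distribution this share is about 0.68.
within_1sd = sum(abs(m - center) <= spread for m in means) / len(means)
print(round(center, 2), round(spread, 2), round(within_1sd, 2))
```

The individual exponential values are strongly skewed, yet the means of 50 of them should land close to the 68% pattern described on the following slides.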


Percent of Values Within One Standard Deviation

68.26% of Cases


Percent of Values Within Two Standard Deviations

95.44% of Cases


Percent of Values Within Three Standard Deviations

99.72% of Cases


The Normal Distribution

[Figure: normal curve f(X) plotted against X, with the mean μ and the standard deviation σ marked]

Changing μ shifts the distribution left or right.

Changing σ increases or decreases the spread.
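
For reference, the curve on this slide is the normal density. The formula below is the standard textbook form, written with the slide's symbols; it is added here as a reminder and was not on the original slide.

$$f(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}$$

X enters only through (X − μ)/σ, which is why changing μ moves the peak without changing the shape, while changing σ stretches or squeezes the curve around that peak.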


Percent of Values Greater than 1 Standard Deviation


Percent of Values Greater than -2 Standard Deviations


The beauty of the normal curve:

No matter what μ and σ are, the area between μ-σ and μ+σ is about 68%; the area between μ-2σ and μ+2σ is about 95%; and the area between μ-3σ and μ+3σ is about 99.7%. Almost all values fall within 3 standard deviations.
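
These three areas can be checked numerically. The sketch below uses the standard identity that the area within k standard deviations of the mean equals erf(k/√2); the function name `area_within` is a hypothetical helper, not something from the slides.

```python
import math

def area_within(k):
    """Area under any normal curve between mu - k*sigma and mu + k*sigma."""
    # P(|X - mu| <= k*sigma) = erf(k / sqrt(2)); mu and sigma cancel out,
    # which is why the rule holds for every normal distribution.
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```

The last digits differ slightly from the 68.26%, 95.44% and 99.72% quoted on the earlier slides, which appear to come from rounded table entries.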


Percent of Values Greater than +2 Standard Deviations


Data in Normal Distribution

(X̄ ± 1S) contains about 68% of the scores

(X̄ ± 2S) contains about 95% of the scores

(X̄ ± 3S) contains about 99% of the scores


Properties Of Normal Curve

• Normal curves are symmetrical.

• Normal curves are unimodal.

• Normal curves have a bell-shaped form.

• Mean, median, and mode all have the same value.


Are my data normally distributed?

1. Look at the histogram! Does it appear bell shaped?

2. Compute descriptive summary measures: are mean, median, and mode similar?

3. Do 2/3 of observations lie within 1 std dev of the mean? Do 95% of observations lie within 2 std dev of the mean?
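
Checks 2 and 3 in the list above can be sketched in a few lines. The function name `normality_checks` and the sample values are made up for illustration; they are not data from the slides, and passing these checks only suggests rough normality rather than proving it.

```python
import statistics

def normality_checks(data):
    """Return the summary measures used in the informal normality checks."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    n = len(data)
    within_1 = sum(abs(x - mean) <= sd for x in data) / n
    within_2 = sum(abs(x - mean) <= 2 * sd for x in data) / n
    return {
        "mean": mean,
        "median": statistics.median(data),
        "share_within_1sd": within_1,  # roughly 0.68 if the data are normal
        "share_within_2sd": within_2,  # roughly 0.95 if the data are normal
    }

# Hypothetical sample, invented for this example only.
sample = [98, 104, 110, 112, 115, 118, 120, 121, 123, 125,
          126, 128, 129, 131, 133, 135, 138, 141, 146, 155]
checks = normality_checks(sample)
print(checks)
```

A histogram (check 1) is still worth drawing, since two very different shapes can share similar summary numbers.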


How good is the rule for real data?

Check some example data:

The mean weight of the women = 127.8 lbs

The standard deviation (SD) = 15.5 lbs


[Histogram: percent of runners (vertical axis, 0 to 25) by weight in POUNDS (horizontal axis, 80 to 160); marked at 112.3, 127.8 and 143.3]

68% of 120 = .68 x 120 = ~82 runners

In fact, 79 runners fall within 1 SD (15.5 lbs) of the mean.


[Histogram: percent of runners (vertical axis, 0 to 25) by weight in POUNDS (horizontal axis, 80 to 160); marked at 96.8, 127.8 and 158.8]

95% of 120 = .95 x 120 = ~114 runners

In fact, 115 runners fall within 2 SDs of the mean.


[Histogram: percent of runners (vertical axis, 0 to 25) by weight in POUNDS (horizontal axis, 80 to 160); marked at 81.3, 127.8 and 174.3]

99.7% of 120 = .997 x 120 = 119.6 runners

In fact, all 120 runners fall within 3 SDs of the mean.


Example

• Suppose SAT scores roughly follow a normal distribution in the U.S. population of college-bound students (with range restricted to 200-800), and the average math SAT is 500 with a standard deviation of 50. Then:

– About 68% of students will have scores between 450 and 550

– About 95% will be between 400 and 600

– About 99.7% will be between 350 and 650


Example

• BUT…

• What if you wanted to know the math SAT score corresponding to the 90th percentile (=90% of students are lower)?

P(X≤Q) = .90

$$\int_{200}^{Q} \frac{1}{(50)\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^{2}}\,dx = .90$$

Solve for Q?….Yikes!
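
In practice nobody solves this integral by hand. A minimal sketch using Python's standard-library `statistics.NormalDist` (available from Python 3.8) is shown below. It assumes the untruncated normal model with mean 500 and SD 50; the lower limit of 200 is six standard deviations below the mean, so the area it cuts off is negligible.

```python
from statistics import NormalDist

# Math SAT model from the slide: mean 500, standard deviation 50.
sat = NormalDist(mu=500, sigma=50)

# inv_cdf(p) returns the score Q with P(X <= Q) = p.
q90 = sat.inv_cdf(0.90)
print(round(q90, 1))  # about 564.1
```

Under these assumptions, a student at the 90th percentile scores roughly 564.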


Standard Scores

• One use of the normal curve is to explore Standard Scores. Standard Scores are expressed in standard deviation units, making it much easier to compare variables measured on different scales.

• There are many kinds of Standard Scores. The most common standard score is the ‘z’ score.

• A ‘z’ score states the number of standard deviations by which the original score lies above or below the mean of a normal curve.


The Z Score

• The normal curve is not a single curve but a family of curves, each of which is determined by its mean and standard deviation.

• In order to work with a variety of normal curves, we cannot have a table for every possible combination of means and standard deviations.


The Z Score

• What we need is a standardized normal curve which can be used for any normally distributed variable. Such a curve is called the Standard Normal Curve.

$$z = \frac{X_i - \bar{X}}{S}$$


The Standard Normal Distribution (Z)

All normal distributions can be converted into the standard normal curve by subtracting the mean and dividing by the standard deviation:

$$Z = \frac{X - \mu}{\sigma}$$

Somebody calculated all the integrals for the standard normal and put them in a table! So we never have to integrate!

Even better, computers now do all the integration.


The Standard Normal Curve

• The Standard Normal Curve (z distribution) is the distribution of normally distributed standard scores with mean equal to zero and a standard deviation of one.

• A z score is nothing more than a number that represents how many standard deviation units a raw score is away from the mean.


Example Z Score

• For scores above the mean, the z score has a positive sign. Example: z = +1.5.

• Below the mean, the z score has a minus sign. Example: z = -0.5.

• Calculate the Z score for a blood pressure of 140 if the sample mean is 110 and the standard deviation is 10

• Z = (140 – 110) / 10 = 3


Comparing Scores from Different Distributions

• Interpreting a raw score requires additional information about the entire distribution. In most situations, we need some idea about the mean score and an indication of how much the scores vary.

• For example, assume that an individual took two tests in reading and mathematics. The reading score was 32 and mathematics was 48. Is it correct to say that performance in mathematics was better than in reading?


Z Scores Help in Comparisons

• Not without additional information. One method to interpret the raw score is to transform it to a z score.

• The advantage of the z score transformation is that it takes into account both the mean value and the variability in a set of raw scores.


Did Sara improve?

• Score in pretest was 18 and post test was 42

• Sara’s score did increase, from 18 to 42.

• But her relative position in the class decreased.

                     Pretest   Post test
Observation          18        42
Mean                 17        49
Standard deviation   3         49
Z score              0.33      -0.14
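
The two z scores in the table can be reproduced directly from the definition. The helper name `z_score` is hypothetical; the numbers are the ones given for Sara on this slide.

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Values taken from Sara's pretest and post test.
pre = z_score(18, mean=17, sd=3)
post = z_score(42, mean=49, sd=49)
print(round(pre, 2), round(post, 2))  # 0.33 -0.14
```

The raw score rose by 24 points, but the sign flip from positive to negative shows Sara moved from slightly above the class mean to slightly below it.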


Area When Score is Known

• For a normal distribution with mean of 100 and standard deviation of 20, what proportion of cases fall below 80?

• ~16%


Score When Area Is Known

• For a normal distribution with mean of 100 and standard deviation of 20, find the score that separates the upper 20% of the cases from the lower 80%

• Answer = 116.8
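
Both lookups, area from a score and score from an area, can be sketched with the standard-library `statistics.NormalDist` instead of a printed z table. This assumes the same normal model with mean 100 and SD 20 used in the two questions.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=20)

# Area when the score is known: proportion of cases below 80 (z = -1).
below_80 = dist.cdf(80)

# Score when the area is known: the 80th percentile separates the
# upper 20% of cases from the lower 80%.
cutoff = dist.inv_cdf(0.80)

print(round(below_80, 4), round(cutoff, 1))  # 0.1587 116.8
```

`cdf` goes from score to area and `inv_cdf` goes from area back to score, which mirrors reading a z table in the two directions.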


Transforming Standard Scores

• Sometimes it is more convenient to work with standard scores that do not have negative numbers or decimals.

• Standard scores can be transformed to have any desired mean and standard deviation.

• SAT and GRE are transformed scores (similar to z) with a mean of 500 and an SD of 100: (z x 100) + 500

• Widely used cognitive and personality tests (e.g., the Wechsler IQ test) are standardized to have a mean of 100 and an SD of 15: (z x 15) + 100


Transforming a raw score of 12 on Behavioral Problem Index

• Age 5: Mean: 10.0 SD: 2.0

• Age 6: Mean: 12.0 SD: 3.0

• Age 7: Mean: 14.0 SD: 3.0

• Age 5: Z = (12-10) / 2 = 1.0

• Age 6: Z = (12-12) / 3 = 0.0

• Age 7: Z = (12-14) / 3 = -0.67

• Age 5: Standard Score (mean 100, SD 15) = (1.0 x 15) + 100 = 115

• Age 6: Standard Score (mean 100, SD 15) = (0.0 x 15) + 100 = 100

• Age 7: Standard Score (mean 100, SD 15) = (-0.67 x 15) + 100 = 90
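
The two-step transformation (raw score to z, then z to a scale with mean 100 and SD 15) can be sketched as a single function. The name `standard_score` and its default arguments are illustrative choices; the age-group means and SDs are the ones listed on this slide.

```python
def standard_score(raw, mean, sd, new_mean=100, new_sd=15):
    """Convert a raw score to a z score, then rescale to the target mean and SD."""
    z = (raw - mean) / sd
    return new_mean + new_sd * z

# Age-group norms (mean, SD) for the Behavioral Problem Index, from the slide.
norms = {5: (10.0, 2.0), 6: (12.0, 3.0), 7: (14.0, 3.0)}
for age, (mean, sd) in norms.items():
    print(age, round(standard_score(12, mean, sd)))
# 5 115
# 6 100
# 7 90
```

The same raw score of 12 is high for a five-year-old, average for a six-year-old, and below average for a seven-year-old, which is exactly what the rescaled scores make visible.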


Other Standard Scores

• A T score is created from a z score simply by multiplying each standard deviation unit by 10 to get rid of the decimals, and then adding 50 to each of these scores to get rid of the negatives.

• Now the mean becomes 50 ([10*0] + 50 = 50).

• A z score of +1 becomes 60 ([10*1] + 50 = 60).


Multiple Transformation of Data


The Normal Curve & Probability

• The normal curve also is central to many aspects of inferential statistics. This is because the normal curve can be used to answer questions concerning the probability of events.

• For example, by knowing that 16% of adults have a Wechsler IQ greater than 115 (z = +1.00), one can state the probability (p) of randomly selecting from the adult population a person whose IQ is greater than 115.

• You are correct if you suspect that p is .16.


Data on the IQ Scores of 1000 Sixth Grade Children


The Normal Curve & Probability

• The mean of the distribution is 100 and the SD is 15

• What is the probability that a randomly selected student from this population would have an IQ score of 115 or greater?

• Approximately .16

• 16 percent of the total area under the curve in the distribution
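
The .16 figure is the area to the right of z = +1, and it can be confirmed with a short sketch. This assumes IQ is modeled as exactly normal with mean 100 and SD 15, which real data only approximate.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# P(IQ >= 115) is the area under the curve to the right of z = +1.
p_above_115 = 1 - iq.cdf(115)
print(round(p_above_115, 4))  # 0.1587
```

Rounded to two decimals this gives the .16 quoted above.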