Understandable Statistics, Seventh Edition, by Brase and Brase. Prepared by Lynn Smith, Gloucester County College. Chapter Three: Averages and Variation.
Copyright (C) 2002 Houghton Mifflin Company. All rights reserved. 1
Understandable Statistics
Seventh Edition
By Brase and Brase
Prepared by: Lynn Smith
Gloucester County College
Chapter Three
Averages and Variation
Measures of Central Tendency
• Mode
• Median
• Mean
The Mode
the value or property that occurs most frequently in the data
Find the mode:
6, 7, 2, 3, 4, 6, 2, 6
The mode is 6.
Find the mode:
6, 7, 2, 3, 4, 5, 9, 8
There is no mode for this data.
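The two mode examples can be checked with a short Python sketch. The function name `modes` is illustrative, not from the text, and the rule "no mode when every value occurs only once" follows the convention used on the slides. Non-empty data is assumed.

```python
from collections import Counter


def modes(data):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        # Every value occurs once, so nothing occurs "most frequently".
        return []
    return sorted(value for value, count in counts.items() if count == top)


print(modes([6, 7, 2, 3, 4, 6, 2, 6]))  # [6]
print(modes([6, 7, 2, 3, 4, 5, 9, 8]))  # []
```

Returning a list lets the same function report bimodal data, where two values tie for the highest count.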
The Median
the central value of an ordered distribution
To find the median of raw data:
• Order the data from smallest to largest.
• For an odd number of data values, the median is the middle value.
• For an even number of data values, the median is found by dividing the sum of the two middle values by two.
Find the median:
Data: 5, 2, 7, 1, 4, 3, 2
Rearrange: 1, 2, 2, 3, 4, 5, 7
The median is 3.
Find the median:
Data: 31, 57, 12, 22, 43, 50
Rearrange: 12, 22, 31, 43, 50, 57
The median is the average of the middle two values: $\frac{31 + 43}{2} = 37$
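The median procedure for raw data can be sketched in Python as follows. The function name `median` is an illustrative choice, and the input is assumed to be a non-empty list of numbers.

```python
def median(data):
    """Median of raw data: middle value, or average of the two middle values."""
    ordered = sorted(data)           # order from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                   # odd count: the middle value
        return ordered[mid]
    # even count: sum of the two middle values divided by two
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 2, 7, 1, 4, 3, 2]))       # 3
print(median([31, 57, 12, 22, 43, 50]))    # 37.0
```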
The Mean
The mean of a collection of data is found by:
• summing all the entries
• dividing by the number of entries
$\text{mean} = \frac{\text{sum of all entries}}{\text{number of entries}}$
Find the mean:
6, 7, 2, 3, 4, 5, 2, 8
$\text{mean} = \frac{6 + 7 + 2 + 3 + 4 + 5 + 2 + 8}{8} = \frac{37}{8} = 4.625 \approx 4.6$
Sigma Notation
• The symbol $\Sigma$ means “sum the following.”
• $\Sigma$ is the Greek letter (capital) sigma.
Notations for mean
Sample mean: $\bar{x}$ (“x bar”)
Population mean: $\mu$ (Greek letter mu)
Find the measure of center
• Find the mode:
2,3,4,1,2,3,4,5,2,5,5,8,9,9,10
• Find the median:
5,3,4,10,2,3,9,5,2,5,2,8,4,9,1
• Find the mean:
10, 9, 2, 9, 3, 8, 4, 5, 1, 5, 2, 2, 3, 5, 4
There are two modes: 2 and 5 (bimodal data).
The median (middle value): 4
The mean is the sum of the numbers, 72, divided by the count, 15: 72 / 15 = 4.8
These lists all contain the same values! Which is the best measure of center to report?
Number of entries in a set of data
• If the data represents a sample, the number of entries = n.
• If the data represents an entire population, the number of entries = N.
Sample mean
$\bar{x} = \frac{\sum x}{n}$
Population mean
$\mu = \frac{\sum x}{N}$
Resistant Measure
a measure that is not influenced by extremely high or low data values
Which is less resistant?
• Mean
• Median
The mean is less resistant. It can be made arbitrarily large by increasing the size of one value.
The median is more resistant. It will not be heavily influenced by large or small values.
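A small Python sketch illustrates the difference in resistance. It reuses the eight values from the earlier mean example and, as a hypothetical change that is not in the text, replaces the largest value 8 with 800.

```python
from statistics import mean, median

data = [6, 7, 2, 3, 4, 5, 2, 8]
# Hypothetical extreme value: the 8 is replaced by 800.
with_outlier = [6, 7, 2, 3, 4, 5, 2, 800]

print(mean(data), median(data))                  # 4.625 4.5
print(mean(with_outlier), median(with_outlier))  # 103.625 4.5
```

The single extreme value pulls the mean far upward while the median does not move.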
Midrange
a measure of center that is the midpoint between the smallest and largest values of a set of data
Compute the midrange:
36,20,18,60,17,15,20,32,25,30
• Order the list from smallest to largest:
15, 17, 18, 20, 20, 25, 30, 32, 36, 60
• $\text{Midrange} = \frac{\max + \min}{2} = \frac{60 + 15}{2} = 37.5$
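The midrange calculation is short enough to sketch directly in Python. The function name `midrange` is illustrative; sorting is not actually required, since only the minimum and maximum are used.

```python
def midrange(data):
    """Midpoint between the largest and smallest values."""
    return (max(data) + min(data)) / 2


print(midrange([36, 20, 18, 60, 17, 15, 20, 32, 25, 30]))  # 37.5
```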
Trimmed Mean
a measure of center that is more resistant than the mean but is still sensitive to specific data values
To calculate a (5 or 10%) trimmed mean
• Order the data from smallest to largest.
• Delete the bottom 5 or 10% of the data.
• Delete the same percent from the top of the data.
• Compute the mean of the remaining 90 or 80% of the data.
Compute a 10% trimmed mean:
15, 17, 18, 20, 20, 25, 30, 32, 36, 60
• Delete the top and bottom 10% (one value from each end for every 10 data values).
• New data list:
17, 18, 20, 20, 25, 30, 32, 36
• 10% trimmed mean = $\frac{\sum x}{n} = \frac{198}{8} = 24.75 \approx 24.8$
Trimmed Mean Guidelines
1. For a 5% trimmed mean:
i. Data sets with n = 3 to 20: drop the min and max.
ii. Data sets with n = 21 to 40: drop the lowest 2 and highest 2 values, etc.
2. For a 10% trimmed mean:
i. Data sets with n = 11 to 20: drop the lowest 2 and highest 2 values.
ii. Data sets with n = 21 to 30: drop the lowest 3 and highest 3 values, etc.
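A minimal Python sketch of the trimmed mean follows. The function name `trimmed_mean` is illustrative. It assumes the number of values dropped from each end is the trim percentage of n rounded up, which reproduces the drop counts in the guidelines above, and it assumes the data set is large enough that some values remain after trimming.

```python
import math


def trimmed_mean(data, percent):
    """Mean after dropping `percent` % of the values from each end."""
    ordered = sorted(data)
    n = len(ordered)
    # Assumption: round the number of dropped values up, as in the guidelines.
    k = math.ceil(n * percent / 100)
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


data = [15, 17, 18, 20, 20, 25, 30, 32, 36, 60]
print(trimmed_mean(data, 10))  # 24.75
```

Other texts and software may round the drop count differently, so results can differ for sample sizes that are not multiples of 10 or 20.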
Weighted Mean
another measure of center that is more resistant than the mean but is still sensitive to specific data frequencies
OR
an average calculated where some of the numbers are assigned more importance or weight
Weighted Average
$\text{Weighted Average} = \frac{\sum xw}{\sum w}$
where $w$ = the weight of the data value $x$.
To calculate a weighted mean
• Order the data from least frequent to most frequent.
• Multiply each value by its frequency or percentage.
• Add the products and divide by the total frequency or total percentage (100% or 1.00).
Compute a weighted mean:
Using our syllabus grading system, you have a test average of 77, quiz avg. of 91, homework avg. of 85, and classwork avg. of 100.
• 45% × 77
• 25% × 91
• 20% × 85
• 10% × 100
• Weighted average = $\frac{\sum xw}{\sum w} = \frac{45(77) + 25(91) + 20(85) + 10(100)}{100} = \frac{8440}{100} = 84.4$
Compute the Weighted Average:
• Midterm grade = 92
• Term Paper grade = 80
• Final exam grade = 88
• Midterm weight = 25%
• Term paper weight = 25%
• Final exam weight = 50%
Compute the Weighted Average:
                x     w      xw
• Midterm       92    .25    23
• Term Paper    80    .25    20
• Final exam    88    .50    44
  Total               1.00   87
$\text{Weighted Average} = \frac{\sum xw}{\sum w} = \frac{87}{1.00} = 87$
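Both weighted-average examples can be reproduced with a short Python sketch. The function name `weighted_average` is illustrative. Dividing by the sum of the weights means the weights may be given as proportions (summing to 1.00) or as percentages (summing to 100).

```python
def weighted_average(values, weights):
    """Sum of x*w divided by the sum of the weights."""
    total = sum(x * w for x, w in zip(values, weights))
    return total / sum(weights)


# Midterm, term paper, final exam with proportion weights.
print(weighted_average([92, 80, 88], [0.25, 0.25, 0.50]))     # 87.0
# Test, quiz, homework, classwork averages with percentage weights.
print(weighted_average([77, 91, 85, 100], [45, 25, 20, 10]))  # 84.4
```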
Mean of Grouped Data (Frequency Table)
• Make a frequency table.
• Compute the midpoint (x) for each class.
• Count the number of entries in each class (f).
• Sum the f values to find n, the total number of entries in the distribution.
• Treat each entry of a class as if it falls at the class midpoint.
Calculation of the mean of grouped data
Ages       f    x (midpoint)    xf
30 - 34    4    32              128
35 - 39    5    37              185
40 - 44    2    42              84
45 - 49    9    47              423
$\sum f = 20$, $\sum xf = 820$
Mean of Grouped Data
$\bar{x} = \frac{\sum xf}{n} = \frac{\sum xf}{\sum f} = \frac{820}{20} = 41.0$
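The grouped-data mean can be sketched in Python as below. The function name `grouped_mean` and the `(lower, upper, frequency)` tuple layout are illustrative assumptions; each entry is treated as if it falls at its class midpoint.

```python
def grouped_mean(classes):
    """Mean of grouped data given (lower, upper, frequency) for each class."""
    sum_xf = 0
    n = 0
    for lower, upper, f in classes:
        x = (lower + upper) / 2   # class midpoint
        sum_xf += x * f
        n += f                    # n is the sum of the frequencies
    return sum_xf / n


ages = [(30, 34, 4), (35, 39, 5), (40, 44, 2), (45, 49, 9)]
print(grouped_mean(ages))  # 41.0
```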
When do I use that?
• Use the mode or midrange when the data appears to be uniform or unimodal but not symmetric; especially when there are extreme values to the left and right in the graph
• Use the median when the data appears to be skewed or bimodal and symmetric; especially when there are extreme values to the left or right in the graph
• Use the mean when the data appears to be somewhat symmetrical and unimodal; especially when the graph has very few extreme values
Measures of Variation
• Range
• Interquartile Range
• Standard Deviation (Variance)
The Range
the difference between the largest and smallest values of a distribution
(measure of spread most closely associated with mode or midrange as the center)
Find the range:
10, 13, 17, 17, 18
The range = largest minus smallest
= 18 minus 10 = 8
Interquartile Range (IQR)
the difference between the “low middle” (Q1) and the “high middle” (Q3): the spread of the middle 50% of a distribution
(measure of spread most closely associated with median as the center)
• Percentiles that divide the data into fourths
• Q1 = 25th percentile
• Q2 = the median
• Q3 = 75th percentile
Quartiles
• Percentiles that divide the data into fourths
• Q1 = 25th percentile
• Q2 = the median
• Q3 = 75th percentile
Quartiles
[Figure: number line running from the lowest value to the highest value, marked at Q1, Median = Q2, and Q3]
Inter-quartile range = IQR = Q3 – Q1
Find the quartiles:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
The data has been ordered.
The median is 24.
Find the quartiles:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
For the data below the median (12 15 16 16 17 18 22 22 23), the median is the fifth of the nine values, 17.
17 is the first quartile.
Find the quartiles:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
For the data above the median, the median is 33.
33 is the third quartile.
Find the interquartile range:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
IQR = Q3 – Q1 = 33 – 17 = 16
The standard deviation
a measure of the average variation of the data entries from the mean
Standard deviation of a sample
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
n = sample size
$\bar{x}$ = mean of the sample
To calculate standard deviation of a sample
• Calculate the mean of the sample.
• Find the difference between each entry (x) and the mean. These differences will add up to zero.
• Square the deviations from the mean.
• Sum the squares of the deviations from the mean.
• Divide the sum by (n – 1) to get the variance.
• Take the square root of the variance to get the standard deviation.
The Variance
the square of the standard deviation
Variance of a Sample
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Find the standard deviation and variance
22, 26, 30
x     $x - \bar{x}$    $(x - \bar{x})^2$
22    –4               16
26    0                0
30    4                16
Sums: $\sum x = 78$, $\sum (x - \bar{x}) = 0$, $\sum (x - \bar{x})^2 = 32$
mean = 78 / 3 = 26
The variance
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{32}{2} = 16$
The standard deviation
$s = \sqrt{16} = 4$
Find the mean, the standard deviation and variance
4, 4, 5, 5, 7
x     $x - \bar{x}$    $(x - \bar{x})^2$
4     –1               1
4     –1               1
5     0                0
5     0                0
7     2                4
Sums: $\sum x = 25$, $\sum (x - \bar{x})^2 = 6$
mean = 25 / 5 = 5
The mean, the standard deviation and variance
Mean = 5
$\text{Variance} = s^2 = \frac{6}{4} = 1.5$
$\text{Standard deviation} = s = \sqrt{1.5} \approx 1.22$
Computation formula for sample standard deviation:
$s = \sqrt{\frac{SS_x}{n - 1}}$ where $SS_x = \sum x^2 - \frac{(\sum x)^2}{n}$
To find $\sum x^2$:
Square the x values, then add.
To find $(\sum x)^2$:
Sum the x values, then square.
Use the computing formulas to find s and s2
x     $x^2$
4     16
4     16
5     25
5     25
7     49
Sums: $\sum x = 25$, $\sum x^2 = 131$
n = 5
$(\sum x)^2 = 25^2 = 625$
$SS_x = 131 - \frac{625}{5} = 6$
$s^2 = \frac{6}{5 - 1} = 1.5$
$s = \sqrt{1.5} \approx 1.22$
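As a check that the defining formula and the computation formula agree, here is a minimal Python sketch. Both function names are illustrative, and a sample of at least two values is assumed.

```python
import math


def variance_by_definition(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def variance_by_computation(data):
    """SS_x = sum(x^2) - (sum x)^2 / n, then divide by n - 1."""
    n = len(data)
    ss_x = sum(x * x for x in data) - sum(data) ** 2 / n
    return ss_x / (n - 1)


data = [4, 4, 5, 5, 7]
print(variance_by_definition(data))                          # 1.5
print(variance_by_computation(data))                         # 1.5
print(round(math.sqrt(variance_by_computation(data)), 2))    # 1.22
```

The computation formula avoids a separate pass to find the mean, but with floating-point data it can lose precision when the values are large relative to their spread.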
Population Mean and Standard Deviation
$\mu = \frac{\sum x}{N}$
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
where
$\mu$ = population mean
$\sigma$ = population standard deviation
N = number of data values in the population
COEFFICIENT OF VARIATION:
a measurement of the relative variability (or consistency) of data
$CV = \frac{s}{\bar{x}} \cdot 100$ or $CV = \frac{\sigma}{\mu} \cdot 100$
CV is used to compare variability or consistency
A sample of newborn infants had a mean weight of 6.2 pounds with a standard deviation of 1 pound.
A sample of three-month-old children had a mean weight of 10.5 pounds with a standard deviation of 1.5 pounds.
Which (newborns or 3-month-olds) are more variable in weight?
To compare variability, compare Coefficient of Variation
For newborns: $CV = \frac{1}{6.2} \cdot 100 \approx 16\%$
For 3-month-olds: $CV = \frac{1.5}{10.5} \cdot 100 \approx 14\%$
Higher CV: more variable
Lower CV: more consistent
Newborns have the higher CV, so their weights are more variable.
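The comparison can be sketched in Python. The function name `coefficient_of_variation` is illustrative; it takes a standard deviation and a (nonzero) mean and returns the CV as a percentage.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100


newborns = coefficient_of_variation(1, 6.2)
three_month_olds = coefficient_of_variation(1.5, 10.5)
print(round(newborns), round(three_month_olds))  # 16 14
```

Although the 3-month-olds have the larger standard deviation in pounds, their variation relative to the mean is smaller.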
Use Coefficient of Variation
To compare two groups of data,
to answer:
Which is more consistent?
Which is more variable?
CHEBYSHEV'S THEOREM
For any set of data and for any number k greater than one, the proportion of the data that lies within k standard deviations of the mean is at least:
$1 - \frac{1}{k^2}$
CHEBYSHEV'S THEOREM for k = 2
According to Chebyshev’s Theorem, at least what fraction of the data falls within “k” (k = 2) standard deviations of the mean?
At least $1 - \frac{1}{2^2} = \frac{3}{4} = 75\%$ of the data falls within 2 standard deviations of the mean.
CHEBYSHEV'S THEOREM for k = 3
According to Chebyshev’s Theorem, at least what fraction of the data falls within “k” (k = 3) standard deviations of the mean?
At least $1 - \frac{1}{3^2} = \frac{8}{9} \approx 88.9\%$ of the data falls within 3 standard deviations of the mean.
CHEBYSHEV'S THEOREM for k = 4
According to Chebyshev’s Theorem, at least what fraction of the data falls within “k” (k = 4) standard deviations of the mean?
At least $1 - \frac{1}{4^2} = \frac{15}{16} \approx 93.8\%$ of the data falls within 4 standard deviations of the mean.
Using Chebyshev’s Theorem
A mathematics class completes an examination and it is found that the class mean is 77 and the standard deviation is 6.
According to Chebyshev's Theorem, between what two values would at least 75% of the grades be?
Mean = 77 Standard deviation = 6
At least 75% of the grades would be in the interval:
$\bar{x} - 2s$ to $\bar{x} + 2s$
77 – 2(6) to 77 + 2(6)
77 – 12 to 77 + 12
65 to 89
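Chebyshev's bound and the matching interval can be sketched in Python. The function names `chebyshev_bound` and `chebyshev_interval` are illustrative, not from the text.

```python
def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


def chebyshev_interval(mean, sd, k):
    """Interval from mean - k*sd to mean + k*sd."""
    return mean - k * sd, mean + k * sd


print(chebyshev_bound(2))            # 0.75
print(chebyshev_bound(4))            # 0.9375
print(chebyshev_interval(77, 6, 2))  # (65, 89)
```

The bound holds for any data set, which is why it is weaker than rules that assume a particular distribution shape.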
Mean and Standard Deviation of Grouped Data
• Make a frequency table.
• Compute the midpoint (x) for each class.
• Count the number of entries in each class (f).
• Sum the f values to find n, the total number of entries in the distribution.
• Treat each entry of a class as if it falls at the class midpoint.
Sample Mean for a Frequency Distribution
x = class midpoint
$\bar{x} = \frac{\sum xf}{n}$
Sample Standard Deviation for a Frequency Distribution
$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}}$
Computation Formula for Standard Deviation for a Frequency Distribution
$s = \sqrt{\frac{SS_x}{n - 1}}$ where $SS_x = \sum x^2 f - \frac{(\sum xf)^2}{n}$
Calculation of the standard deviation of grouped data
Mean = 41
Ages       f    x     x – mean    (x – mean)²    (x – mean)² f
30 - 34    4    32    –9          81             324
35 - 39    5    37    –4          16             80
40 - 44    2    42    1           1              2
45 - 49    9    47    6           36             324
$\sum f = 20$, $\sum (x - \bar{x})^2 f = 730$
Calculation of the standard deviation of grouped data
$\sum f = n = 20$
$\sum (x - \bar{x})^2 f = 730$
$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} = \sqrt{\frac{730}{20 - 1}} = \sqrt{38.42} \approx 6.20$
Computation Formula for Standard Deviation
f    x     xf     x²f
4    32    128    4096
5    37    185    6845
2    42    84     3528
9    47    423    19881
$\sum f = 20$, $\sum xf = 820$, $\sum x^2 f = 34350$
Computation Formula for Standard Deviation for a Frequency Distribution
$SS_x = \sum x^2 f - \frac{(\sum xf)^2}{n} = 34350 - \frac{820^2}{20} = 730$
$s = \sqrt{\frac{SS_x}{n - 1}} = \sqrt{\frac{730}{20 - 1}} \approx 6.20$
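The grouped-data computation formula can be sketched in Python. The function name `grouped_sd` is illustrative; it takes the class midpoints and their frequencies as parallel lists and assumes at least two entries in total.

```python
import math


def grouped_sd(midpoints, freqs):
    """Sample standard deviation of grouped data via the computation formula."""
    n = sum(freqs)
    sum_xf = sum(x * f for x, f in zip(midpoints, freqs))
    sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))
    ss_x = sum_x2f - sum_xf ** 2 / n   # SS_x
    return math.sqrt(ss_x / (n - 1))


s = grouped_sd([32, 37, 42, 47], [4, 5, 2, 9])
print(round(s, 2))  # 6.2
```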
Percentiles
For any whole number P (between 1 and 99), the Pth percentile of a distribution is a value such that P% of the data fall at or below it.
The percent falling above the Pth percentile will be (100 – P)%.
Percentiles
[Figure: number line from the lowest value to the highest value, marked at P40, with 40% of the data below P40 and 60% of the data above it]
Computing Quartiles
• Order the data from smallest to largest.
• Find the median, the second quartile.
• Find the median of the data falling below Q2. This is the first quartile.
• Find the median of the data falling above Q2. This is the third quartile.
Find the quartiles:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
The data has been ordered.
The median is 24.
Find the quartiles:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
For the data below the median, the median is 17.
17 is the first quartile.
Find the quartiles:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
For the data above the median, the median is 33.
33 is the third quartile.
Find the interquartile range:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
IQR = Q3 – Q1 = 33 – 17 = 16
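The quartile procedure used in this example (median of the lower half and median of the upper half, excluding the overall median when the count is odd) can be sketched in Python. The function names are illustrative. Statistical software often uses other percentile conventions, so its quartiles may differ slightly from these.

```python
def median_of_sorted(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                 # data falling below Q2
    # For an odd count, skip the median itself when forming the upper half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median_of_sorted(lower), median_of_sorted(ordered), median_of_sorted(upper)


data = [12, 15, 16, 16, 17, 18, 22, 22, 23, 24,
        25, 30, 32, 33, 33, 34, 41, 45, 51]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 17 24 33
print(q3 - q1)     # 16
```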
Five-Number Summary of Data
• Lowest value
• First quartile
• Median
• Third quartile
• Highest value
Box-and-Whisker Plot
a graphical presentation of the five-number summary of data
Making a Box-and-Whisker Plot
• Draw a vertical scale including the lowest and highest values.
• To the right of the scale, draw a box from Q1 to Q3.
• Draw a solid line through the box at the median.
• Draw lines (whiskers) from Q1 to the lowest and from Q3 to the highest values.
Construct a Box-and-Whisker Plot:
12 15 16 16 17 18 22 22
23 24 25 30 32 33 33 34
41 45 51
Lowest = 12 Q1 = 17
median = 24 Q3 = 33
Highest = 51
Box-and-Whisker Plot
[Figure: box-and-whisker plot drawn beside a vertical scale from 10 to 60, with Lowest = 12, Q1 = 17, median = 24, Q3 = 33, Highest = 51]