1
Review
• Sections 2.1-2.4
• Descriptive Statistics
– Qualitative (Graphical)
– Quantitative (Graphical)
– Summation Notation
– Qualitative (Numerical)
• Central Measures (mean, median, mode and modal class)
• Shape of the Data
• Measures of Variability
3
Outlier
A data measurement which is unusually large or small compared to the rest of the data.
Usually from:
– Measurement or recording error
– Measurement from a different population
– A rare, chance event
4
Advantages/Disadvantages Mean
• Disadvantages
– is sensitive to outliers
• Advantages
– always exists
– very common
– nice mathematical properties
5
Advantages/Disadvantages Median
• Disadvantages
– does not take all data into account
• Advantages
– always exists
– easily calculated
– not affected by outliers
– nice mathematical properties
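The mean's sensitivity to outliers, and the median's resistance to them, can be sketched with Python's standard `statistics` module. The data values here are made up purely for illustration and do not come from the course.

```python
import statistics

# Hypothetical data: five typical values, then the same values plus one outlier.
typical = [10, 12, 13, 14, 16]
with_outlier = typical + [100]

mean_before = statistics.mean(typical)          # 65 / 5 = 13
mean_after = statistics.mean(with_outlier)      # 165 / 6 = 27.5
median_before = statistics.median(typical)      # middle value, 13
median_after = statistics.median(with_outlier)  # (13 + 14) / 2 = 13.5

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

One extreme value roughly doubles the mean while the median barely moves.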
6
Advantages/Disadvantages Mode
• Disadvantages
– does not always exist (each data value may occur only once)
– sometimes there is more than one
• Advantages
– appropriate for qualitative data
7
Review
A data set is skewed if one tail of the distribution has more extreme observations than the other.
http://www.shodor.org/interactivate/activities/SkewDistribution/
8
Review
Skewed to the right: The mean is bigger than the median.
[Figure: right-skewed distribution, with the median M to the left of the mean x̄]
9
Review
Skewed to the left: The mean is less than the median.
[Figure: left-skewed distribution, with the mean x̄ to the left of the median M]
10
Review
When the data is symmetric, the mean and median are equal.
[Figure: symmetric distribution, with the median M and the mean x̄ at the same point]
11
Numerical Measures of Variability
These measure the variability or spread of the data.
[Figure: three relative frequency histograms, each with the mean x̄ and median M marked. The first two take values from 0 to 5; the third takes values from 0 to 7. The vertical axis shows relative frequency, from 0.1 to 0.5.]
15
Numerical Measures of Variability
These measure the variability, spread or relative standing of the data.
– Range
– Standard Deviation
– Percentile Ranking
– Z-score
16
Range
The range of quantitative data is denoted R and is given by:
R = Maximum – Minimum
In the previous examples the first two graphs have a range of 5 and the third has a range of 7.
18
Range
R = Maximum – Minimum
Disadvantages:
– Since the range uses only two values in the sample, it is very sensitive to outliers.
– It gives no idea of how much data lies in the center of the data set.
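As a small illustration of the first disadvantage, the sketch below computes R = Maximum – Minimum for two made-up data sets that differ only in their largest value. The function name `data_range` and both lists are hypothetical, chosen for this example.

```python
def data_range(values):
    """Return R = maximum - minimum for a non-empty list of numbers."""
    return max(values) - min(values)

# Hypothetical data: the second list differs only in its largest value.
compact = [2, 3, 3, 4, 4, 5]
stretched = [2, 3, 3, 4, 4, 50]

print(data_range(compact))    # 5 - 2 = 3
print(data_range(stretched))  # 50 - 2 = 48
```

Five of the six values are identical, yet the range changes from 3 to 48, and neither number says anything about where the bulk of the data sits.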
19
What else?
We want a measure which shows how far away most of the data points are from the mean.
One option is to keep track of the average distance each point is from the mean.
21
Mean Deviation
The Mean Deviation is a measure of dispersion which calculates the distance between each data point and the mean, and then finds the average of these distances.
$$\text{Mean Deviation} = \frac{\sum |x_i - \bar{x}|}{n}$$
22
Mean Deviation
Advantages: The mean deviation takes into account all values in the sample.
Disadvantages: The absolute value signs are very cumbersome in mathematical equations.
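A minimal sketch of the mean deviation in Python, applied to the ten-value sample used in this section's worked example. The function name `mean_deviation` is an illustrative choice, not a library function.

```python
def mean_deviation(values):
    """Average absolute distance of each value from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

sample = [2, 4, 3, 2, 5, 2, 1, 4, 5, 2]

# The mean is 3, the absolute deviations sum to 12, so the result is 12 / 10.
print(mean_deviation(sample))
```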
23
Standard Deviation
The sample variance, denoted by s², is:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
The sample standard deviation is
$$s = \sqrt{s^2}.$$
The sample standard deviation is much more commonly used as a measure of variability.
25
Example
Let the following be data from a sample:
2, 4, 3, 2, 5, 2, 1, 4, 5, 2.
Find:
a) The range
b) The standard deviation of this sample.
26
Sample: 2, 4, 3, 2, 5, 2, 1, 4, 5, 2.
a) The range
$$R = 5 - 1 = 4$$
b) The standard deviation of this sample.
$$\bar{x} = \frac{2+4+3+2+5+2+1+4+5+2}{10} = \frac{30}{10} = 3$$
x_i            2   4   3   2   5   2   1   4   5   2
(x_i − x̄)     -1   1   0  -1   2  -1  -2   1   2  -1
(x_i − x̄)²     1   1   0   1   4   1   4   1   4   1
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{1+1+0+1+4+1+4+1+4+1}{10 - 1} = \frac{18}{9} = 2$$
Standard Deviation:
$$s = \sqrt{s^2} = \sqrt{2} \approx 1.41$$
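The same calculation can be followed step by step in Python. This is a sketch that mirrors the table (deviations, squared deviations, division by n − 1) and then cross-checks the result against the standard library's `statistics.variance` and `statistics.stdev`, which also use the n − 1 divisor. The variable names are illustrative.

```python
import math
import statistics

sample = [2, 4, 3, 2, 5, 2, 1, 4, 5, 2]

n = len(sample)
mean = sum(sample) / n                            # 30 / 10 = 3.0
data_range = max(sample) - min(sample)            # 5 - 1 = 4
squared_devs = [(x - mean) ** 2 for x in sample]  # third row of the table
variance = sum(squared_devs) / (n - 1)            # 18 / 9 = 2.0
std_dev = math.sqrt(variance)                     # about 1.41

# Cross-check against the standard library's sample formulas.
assert math.isclose(variance, statistics.variance(sample))
assert math.isclose(std_dev, statistics.stdev(sample))

print(data_range, mean, variance, round(std_dev, 2))
```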
34
More Standard Deviation
There is a “short cut” formula for finding the variance and the standard deviation:
$$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1}$$
Use this to find the standard deviation of the previous example:
x_i     2   4   3   2   5   2   1   4   5   2     sum = 30
x_i²    4  16   9   4  25   4   1  16  25   4     sum = 108
$$s^2 = \frac{108 - \frac{30^2}{10}}{10 - 1} = \frac{108 - 90}{9} = 2$$
$$s = \sqrt{s^2} = \sqrt{2} \approx 1.41$$
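A short sketch of the short cut formula in Python, which needs only the sum of the values and the sum of their squares. The function name `shortcut_variance` is hypothetical.

```python
import math

def shortcut_variance(values):
    """Sample variance via s^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return (sum_x_squared - sum_x ** 2 / n) / (n - 1)

sample = [2, 4, 3, 2, 5, 2, 1, 4, 5, 2]
s_squared = shortcut_variance(sample)  # (108 - 900 / 10) / 9 = 2.0
s = math.sqrt(s_squared)

print(s_squared, round(s, 2))
```

The short cut is convenient by hand or on a calculator. In floating-point code it can lose precision when the values are large compared with their spread, so the definitional formula is usually the safer choice there.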
44
More Standard Deviation
Like the mean, we are also interested in the population variance (i.e. your sample is the whole population) and the population standard deviation.
The population variance and standard deviation are denoted σ² and σ respectively.
****The formula for population variance is slightly different from the formula for sample variance: it divides by n instead of n − 1. Here μ is the population mean.
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{n} = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n}$$
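The effect of the two divisors can be seen with Python's `statistics` module, which provides `variance`/`stdev` for samples (divide by n − 1) and `pvariance`/`pstdev` for populations (divide by n). As an assumption for illustration only, the ten values from the earlier example are treated here first as a sample and then as a whole population.

```python
import statistics

data = [2, 4, 3, 2, 5, 2, 1, 4, 5, 2]

sample_var = statistics.variance(data)       # 18 / 9  (n - 1 divisor)
population_var = statistics.pvariance(data)  # 18 / 10 (n divisor)
sample_sd = statistics.stdev(data)           # square root of the sample variance
population_sd = statistics.pstdev(data)      # square root of the population variance

print(sample_var, population_var)
print(round(sample_sd, 2), round(population_sd, 2))
```

The population figures are always a little smaller than the sample figures for the same data, and the gap shrinks as n grows.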
46
Example - Calculator
Find the mean, median, mode, range and standard deviation for the following sample of data:
2.3, 2.5, 2.6, 2.7, 3.0, 3.4,
3.4, 3.5, 3.5, 3.5, 3.7, 3.8
Use your calculator
47
Using your Calculator
• Change the calculator to statistics mode (SD if you have it).
• Enter a data value and then press the data key.
• Keep entering data, pressing the data key after each value, until complete.
• To obtain the summary data, find the x̄ key for the sample mean and the s key or σn-1 key to display the sample standard deviation.
49
Example - Calculator
Find the mean, median, mode, range and standard deviation for the following sample of data:
2.3, 2.5, 2.6, 2.7, 3.0, 3.4,
3.4, 3.5, 3.5, 3.5, 3.7, 3.8
Answer:
Mode = 3.5
M = 3.4
Range = 1.5
s = 0.51
x̄ = 3.16
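For readers without a statistics calculator, the same five summaries can be obtained with Python's standard `statistics` module. This is only a sketch of one way to check the answers; the variable names are illustrative.

```python
import statistics

data = [2.3, 2.5, 2.6, 2.7, 3.0, 3.4,
        3.4, 3.5, 3.5, 3.5, 3.7, 3.8]

mean = statistics.mean(data)
median = statistics.median(data)    # average of the 6th and 7th sorted values
mode = statistics.mode(data)        # most frequent value
data_range = max(data) - min(data)
s = statistics.stdev(data)          # sample standard deviation (n - 1 divisor)

print(f"mean={mean:.2f} median={median} mode={mode} "
      f"range={data_range:.1f} s={s:.2f}")
```

Rounded to the same number of places, these values agree with the answers listed above.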
50
Example – Using Standard Deviation
Here are eight test scores from a previous Stats 201 class:
35, 59, 70, 73, 75, 81, 84, 86.
The mean and standard deviation are 70.4 and 16.7, respectively.
We wish to know if any of our data points are outliers, that is, whether they don’t fit with the general trend of the rest of the data.
To find this we calculate the number of standard deviations each point is from the mean.
To simplify things for now, work out which data points are within
a) one standard deviation from the mean, i.e.
$$(\bar{x} - s,\ \bar{x} + s) = (70.4 - 16.7,\ 70.4 + 16.7) = (53.7,\ 87.1)$$
59, 70, 73, 75, 81, 84, 86
b) two standard deviations from the mean, i.e.
$$(\bar{x} - 2s,\ \bar{x} + 2s) = (70.4 - 2(16.7),\ 70.4 + 2(16.7)) = (37.0,\ 103.8)$$
59, 70, 73, 75, 81, 84, 86
c) three standard deviations from the mean, i.e.
$$(\bar{x} - 3s,\ \bar{x} + 3s) = (70.4 - 3(16.7),\ 70.4 + 3(16.7)) = (20.3,\ 120.5)$$
35, 59, 70, 73, 75, 81, 84, 86
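The interval checks can be sketched in Python as follows. The helper `within` is a hypothetical name; it uses the unrounded mean and standard deviation, which give the same memberships as the rounded values 70.4 and 16.7.

```python
import statistics

scores = [35, 59, 70, 73, 75, 81, 84, 86]
mean = statistics.mean(scores)  # 70.375, reported as 70.4
s = statistics.stdev(scores)    # about 16.7

def within(k):
    """Scores lying strictly inside (mean - k*s, mean + k*s)."""
    return [x for x in scores if mean - k * s < x < mean + k * s]

for k in (1, 2, 3):
    print(k, within(k))

# Distance of each score from the mean, measured in standard deviations.
distances = {x: round((x - mean) / s, 2) for x in scores}
print(distances)
```

The score of 35 is the only one that falls outside the two-standard-deviation interval, which makes it the natural candidate to examine as a possible outlier.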