Slide 1
Baby Leo’s 4-month “Healthy Baby” check-up reported the following:
1) He is in the 90th percentile for weight
2) He is in the 95th percentile for head circumference
3) He is in the 100th percentile for height.
Interpret these scores. What do they mean?
Slide 3
Key Concept
This section discusses outliers, then introduces a new statistical graph called a boxplot, which is helpful for visualizing the distribution of data.
Slide 4
Definition
Exploratory Data Analysis (EDA): the process of using statistical tools (such as graphs, measures of center, and measures of variation) to investigate data sets in order to understand their important characteristics.
Slide 5
Definition
An outlier is a value that is located very far away from almost all of the other values: an extreme value that falls outside the general pattern of the data.
• Not all outliers are errors.
• An outlier can have a dramatic effect on the mean.
• An outlier can have a dramatic effect on the standard deviation.
• An outlier can have a dramatic effect on the scale of the histogram, so that the true nature of the distribution is totally obscured.
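The effect of a single outlier on the mean and standard deviation can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the values are illustrative (an assumption for this example), with 165 playing the role of the outlier.

```python
import statistics

# Illustrative values (an assumption for this sketch); 165 is the outlier.
with_outlier = [1, 2, 3, 4, 8, 165]
without_outlier = [1, 2, 3, 4, 8]

def describe(values):
    """Return (mean, sample standard deviation) of the values."""
    return statistics.mean(values), statistics.stdev(values)

mean_with, sd_with = describe(with_outlier)
mean_without, sd_without = describe(without_outlier)

# Print both versions side by side to see how far one value moves them.
print(f"without outlier: mean = {mean_without:.2f}, s = {sd_without:.2f}")
print(f"with outlier:    mean = {mean_with:.2f}, s = {sd_with:.2f}")
```

One extreme value is enough to pull the mean well above every other data value and to inflate the standard deviation many times over.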
Slide 6
Definitions
For a set of data, the 5-number summary consists of the minimum value; the first quartile, Q1; the median (or second quartile, Q2); the third quartile, Q3; and the maximum value.
A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.
(Calculator: with/without potential outlier, boxplot/modified boxplot)
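As a rough illustration, the 5-number summary can be computed in a few lines of Python. This sketch assumes one common quartile convention: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the middle value left out of both halves when the count is odd. Other software may use a different convention and report slightly different quartiles. The function name `five_number_summary` and the sample values are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for at least two values.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data; the middle value is excluded from both halves
    when the number of values is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]

# Illustrative data only (not taken from the slides).
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]
print(five_number_summary(sample))
```

The box of a boxplot would span Q1 to Q3 with a line at the median, and the line (whiskers) would run from the minimum to the maximum.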
Slide 8
Modified Boxplots
Some statistical packages provide modified boxplots, which represent outliers as special points.
A data value is an outlier if it is
above Q3 by an amount greater than 1.5 × IQR
or
below Q1 by an amount greater than 1.5 × IQR,
where IQR = Q3 − Q1 is the interquartile range.
Use the outlier criterion on our dataset 1 2 3 4 8 165 to identify any outliers.
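The 1.5 × IQR criterion can be sketched in Python as follows. Two assumptions are made here: the slide's dataset is read as the six values 1, 2, 3, 4, 8, 165, and quartiles are taken as the medians of the lower and upper halves of the sorted data (other quartile conventions may move the fences slightly). The helper names `quartiles` and `find_outliers` are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves (assumed convention)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(upper)

def find_outliers(values, k=1.5):
    """Return (outliers, (lower_fence, upper_fence)) using the k x IQR rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return outliers, (lower_fence, upper_fence)

# Dataset as read from the slide (assumed spacing of the values).
data = [1, 2, 3, 4, 8, 165]
outliers, fences = find_outliers(data)
print("fences:", fences)
print("outliers:", outliers)
```

Under these assumptions Q1 = 2 and Q3 = 8, so IQR = 6 and the fences sit at 2 − 9 = −7 and 8 + 9 = 17; only 165 falls outside them.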
Slide 9
Modified Boxplot Construction
A modified boxplot is constructed with these specifications:
• A special symbol (such as an asterisk) is used to identify outliers.
• The solid horizontal line extends only as far as the minimum data value that is not an outlier and the maximum data value that is not an outlier.
Slide 11
Do male doctors perform more C-sections than female doctors? A study in Switzerland examined the number of C-sections performed in one year by a sample of male and female doctors.
Male doctor data:
20 25 25 27 28 31 33 34 36 37 44 50 59 85 86
Min Q1 M Q3 Max
Female doctor data:
5 7 10 14 18 19 25 29 31 33
Min Q1 M = 18.5 Q3 Max
• Are there any outliers?
• What do the modified boxplots tell you?
• Give some overall observations.
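One way to check hand calculations for this comparison is a short Python sketch that computes, for each group, the 5-number summary, any outliers under the 1.5 × IQR rule, and the whisker ends a modified boxplot would use. It assumes quartiles are the medians of the lower and upper halves of the sorted data (which agrees with the slide's M = 18.5 for the female doctors); the function name `modified_boxplot_stats` is hypothetical.

```python
from statistics import median

male = [20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86]
female = [5, 7, 10, 14, 18, 19, 25, 29, 31, 33]

def modified_boxplot_stats(values):
    """Return the 5-number summary, outliers, and whisker ends for the data."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    # Assumed convention: quartiles are medians of the two halves,
    # skipping the middle value when n is odd.
    q1 = median(data[:half])
    q3 = median(data[half + 1:] if n % 2 else data[half:])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if low <= x <= high]
    return {
        "five_number": (data[0], q1, median(data), q3, data[-1]),
        "outliers": [x for x in data if x < low or x > high],
        # Whiskers stop at the most extreme values that are not outliers.
        "whiskers": (inside[0], inside[-1]),
    }

for label, counts in (("male", male), ("female", female)):
    print(label, modified_boxplot_stats(counts))
```

Comparing the two printed summaries side by side gives the numbers needed to sketch both modified boxplots on a common scale.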
Slide 12
You do!
Here are measured reaction times (in seconds) in a test of driving skills:
2.4 2.5 2.8 2.0 2.4 2.9 3.2 3.5 2.7 2.7 2.8 2.7
1) Find the five-number summary.
2) Draw the modified boxplot. Are there any outliers?
Slide 15
Statdisk: Pulse rates
Compare male and female pulse rates:
1) Center
2) Variation
3) 5-number summary
4) Are there any outliers?