Introduction to the Practice of Statistics
Sixth Edition
Chapter 1: Looking at Data—Distributions
Copyright © 2009 by W. H. Freeman and Company
David S. Moore • George P. McCabe • Bruce A. Craig
Statistics is the science of collecting, organizing and interpreting numerical facts, which we call data.
The study and collection of data are important in the work of many professions, so training in the science of statistics is valuable preparation for a variety of careers.
We will learn data analysis and collection approaches within various application contexts. Context and state are important. Remember that the goal of statistics is not calculation for its own sake but gaining understanding from numbers. That understanding has to be useful within its context.
You cannot learn statistics without practice. So practice, practice and practice!! Be prepared to solve problems from data!
The gain will be worth the pain. We will learn it together.
Let us start with some concepts so we sound like professionals. Any set of data contains information about some group of individuals. The information is organized in variables.
Figure 1.1 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company
For example, we can organize your statistics class scores (variables) by your student ID (individual). The purpose is for me to check the grades to make sure everyone is doing well (context).
Some variables use numerical data (scores) and some use letters (grades). Statisticians came up with three ways to describe your data.
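To make the individual / variable idea concrete, here is a minimal sketch. The student IDs, scores and grades are made up for illustration; they are not from the textbook.

```python
from collections import Counter

# Hypothetical records: each key is an individual (student ID),
# each field is a variable recorded for that individual.
students = {
    "S001": {"score": 91, "grade": "A"},
    "S002": {"score": 78, "grade": "C"},
    "S003": {"score": 85, "grade": "B"},
    "S004": {"score": 93, "grade": "A"},
}

# A numerical variable supports arithmetic such as averaging.
average_score = sum(s["score"] for s in students.values()) / len(students)

# A variable recorded as letters is summarized by counting each value.
grade_counts = Counter(s["grade"] for s in students.values())

print(average_score)
print(dict(grade_counts))
```

The point is only that the kind of variable decides which summaries make sense: we average the scores but count the grades.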
Definition, pg 4
The distribution seems like a different animal! How do we describe the distribution of our data? Before we answer this question, let’s first plot our data. Plotting the data is the most intuitive way to learn something from it.
Figure 1.3
Bar chart, pie chart
- One variable at a time
- Takes too much screen space
Table 1.1
Table format
Figure 1.4
Histogram: yeah… I learned this in 7th grade. A histogram breaks the range of values of a variable into classes and displays only the count or percent of the observations that fall into each class.
Important: you can choose any convenient number of classes, but you should always choose classes of equal width.
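Here is a small sketch of the equal-width idea. The scores are made up, the function name `equal_width_histogram` is my own, and it assumes the data are not all the same value (otherwise the class width would be zero).

```python
def equal_width_histogram(values, num_classes):
    """Count observations in num_classes classes of equal width."""
    low, high = min(values), max(values)
    width = (high - low) / num_classes
    counts = [0] * num_classes
    for v in values:
        # The maximum value goes into the last class instead of a new one.
        index = min(int((v - low) / width), num_classes - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(num_classes + 1)]
    return edges, counts


# Made-up scores, for illustration only.
scores = [55, 62, 68, 71, 73, 75, 78, 80, 84, 85, 88, 91, 95]
edges, counts = equal_width_histogram(scores, 4)

# Print a text histogram: one star per observation in each class.
for i, count in enumerate(counts):
    print(f"{edges[i]:.0f} to {edges[i + 1]:.0f}: {'*' * count}")
```

Try changing `num_classes` to see how the picture changes while every class keeps the same width.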
Definition, pg 9
OK. We talked a lot about stemplots. I will just leave two graphs on the next two pages. I trust you know how to get there. Note that a stemplot is not good for large samples.
Table 1.2
Figure 1.5
Definition, pg 18
Table 1.4
Figure 1.10
Definition, pg 15
OK. We learned how to plot data. Now back to the distribution. This slide is very important: it tells you what to look for in a graph or a data source. We will learn how to get these numbers.
Definition, pg 31
Definition, pg 32–33
Definition, pg 35
Definition, pg 37
Figure 1.19
Definition, pg 38a
Definition, pg 38b
Definition, pg 39
Definition, pg 40–41
Figure 1.21
Definition, pg 43a
Definition, pg 43b
Definition, pg 45
You surely know this from high school.
Definition, pg 46–47
1.3 Density Curves and Normal Distributions
Figure 1.24
You know what the blue bars represent, right? A histogram. Fitting a smooth curve to the histogram gives us the density curve.
Characteristics:
- Is always on or above the horizontal axis (a histogram uses frequency, so it cannot be negative)
- Has area exactly 1 underneath it (it shows the distribution of *all* data)
Definition, pg 56
Definition, pg 57
More characteristics
Figure 1.28
One particular type of density curve is the normal curve, which describes a normal distribution N(μ, σ) with mean μ and standard deviation σ.
*All* normal distributions have the same overall shape. The exact density curve for a particular normal distribution is specified by giving its mean and its standard deviation.
Figure 1.30
An example of a normal distribution, N(64.5, 2.5)
The 68-95-99.7 rule
Definition, pg 59
Figure 1.29
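If you want to see where 68, 95 and 99.7 come from, here is a quick check using Python's standard library. It reuses the N(64.5, 2.5) example from the slide; the variable names are my own.

```python
from statistics import NormalDist

# The example distribution from the slide: mean 64.5, standard deviation 2.5.
example = NormalDist(mu=64.5, sigma=2.5)

# Proportion of the distribution within k standard deviations of the mean.
proportions = {}
for k in (1, 2, 3):
    low = example.mean - k * example.stdev
    high = example.mean + k * example.stdev
    proportions[k] = example.cdf(high) - example.cdf(low)
    print(f"within {k} std of the mean: {proportions[k]:.4f}")
```

The three proportions come out close to 0.68, 0.95 and 0.997, and they do not depend on which mean and standard deviation you pick.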
Standard Normal Distribution: N(0, 1)
Notice the mean is 0 and the standard deviation is 1.
Definition, pg 61
We can do a linear transformation to get a standardized distribution.
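The linear transformation is z = (x - μ) / σ, and going back is x = μ + zσ. A minimal sketch, with helper names of my own choosing and the N(64.5, 2.5) example from earlier:

```python
def standardize(x, mu, sigma):
    """Convert an observation x from N(mu, sigma) into a z score."""
    return (x - mu) / sigma


def unstandardize(z, mu, sigma):
    """Convert a z score back to the original data scale."""
    return mu + z * sigma


# Using N(64.5, 2.5); the observation 67 is just an illustrative value.
z = standardize(67, mu=64.5, sigma=2.5)
x_back = unstandardize(z, mu=64.5, sigma=2.5)

print(z)       # one standard deviation above the mean
print(x_back)  # back on the original scale
```

An observation exactly at the mean always gets z = 0, which is why the standardized distribution is centered at 0.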
Definition, pg 62
Figure 1.31
When x is the mean, the cumulative proportion to the left will be 50%.
Figure 1.32
Check Table A in your textbook. Given a z score of 1.47, what is the area to the left of this z score?
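You can check your table lookup with software. This sketch uses the standard library's `NormalDist`, whose `cdf` gives the area to the left of a value; rounding to four decimals mimics a typical printed table.

```python
from statistics import NormalDist

# Standard normal distribution N(0, 1).
standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve to the left of z = 1.47.
area_left = standard_normal.cdf(1.47)

print(round(area_left, 4))
```

The area to the right of z is simply 1 minus this value, because the total area under a density curve is 1.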
Figure 1.33
How high must a score be to land in the top 10% (the blue block), given a distribution of N(505, 110)?
Two steps to solve the problem:
(1) Use Table A to get the z value. Look up 0.9 in the table, which is the proportion to the left of z (the yellow block). This gives z = 1.28.
(2) Un-standardize: transform z back to the data scale using the linear transformation. (x - 505) / 110 = 1.28, so x = 645.8.
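The same two steps can be sketched in software, assuming the N(505, 110) distribution stated in the question. `inv_cdf` plays the role of reading Table A backwards. The variable names are my own.

```python
from statistics import NormalDist

mu, sigma = 505, 110

# Step 1: find the z value with 90% of the area to its left.
z_exact = NormalDist().inv_cdf(0.90)

# Step 2: un-standardize, x = mu + z * sigma.
x_exact = mu + z_exact * sigma

# The same calculation with the table value z = 1.28.
x_table = mu + 1.28 * sigma

print(f"z = {z_exact:.4f}")
print(f"x (software) = {x_exact:.2f}")
print(f"x (Table A)  = {x_table:.1f}")
```

The software answer is slightly higher than the table answer because the table rounds z to two decimals. Either is fine for this class.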
Review Chapter 1
- Five-number summary
- Mean / std
- Histogram
- Density curve
- Normal distribution, its mean and std
- Z score, 68-95-99.7 rule
- Plot your data (stemplot, bar chart, pie chart, time plot)