
Introduction to the Practice of Statistics: ivcl.cs.usm.edu/Courses/Statistics2011S/notes/IPS_6e_Ch...



Introduction to the Practice of Statistics

Sixth Edition

Chapter 1: Looking at Data—Distributions

Copyright © 2009 by W. H. Freeman and Company

David S. Moore • George P. McCabe Bruce A. Craig

Statistics is the science of collecting, organizing and interpreting numerical facts, which we call data.

The study and collection of data are important in the work of many professions, so training in the science of statistics is valuable preparation for a variety of careers.

We will learn data analysis and collection approaches within various application contexts. Context and state are important. Remember that the goal of statistics is not calculation for its own sake but gaining understanding of certain information from numbers. Such information has to be useful within that context.

You cannot learn statistics without practice. So practice, practice and practice!! Be prepared to solve problems from data!

The gain will be worth the pain. We will learn it together.

Let us start with some concepts so we sound like professionals. Any set of data contains information about some group of individuals. The information is organized in variables.

Figure 1.1 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

For example, we can organize your statistics class scores (variables) by your student ID (individual). The purpose is for me to check the grades to make sure everyone is doing well (context).

Some variables use numerical data (scores) and some use letters (grades). Statisticians came up with three ways to describe your data.

Definition, pg 4 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

The distribution seems like a different animal! How do we describe the distribution of our data? Before we answer this question, let's first plot our data. Plotting the data is the most intuitive way to learn some information.

Figure 1.3 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Bar chart Pie chart

- One variable at a time
- Takes too much screen space


Table 1.1 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Table format

Figure 1.4 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Histogram: yeah… I learned this in 7th grade. A histogram breaks the range of values of a variable into classes and displays only the count or percent of the observations that fall into each class.

Important: you can choose any convenient number of classes, but you should always choose classes of equal width.
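To make the equal-width rule concrete, here is a minimal sketch that counts observations per class and prints a text histogram. The function name `histogram_counts` and the scores are made up for this illustration; they are not from the textbook.

```python
def histogram_counts(values, low, width, num_classes):
    """Count how many values fall in each equal-width class.

    Class i covers the interval [low + i*width, low + (i+1)*width).
    Values outside all classes are ignored.
    """
    counts = [0] * num_classes
    for v in values:
        index = int((v - low) // width)
        if 0 <= index < num_classes:
            counts[index] += 1
    return counts


# Hypothetical exam scores, invented for this example.
scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 90, 94, 99]

# Five classes, each 10 points wide, starting at 50.
counts = histogram_counts(scores, low=50, width=10, num_classes=5)

for i, count in enumerate(counts):
    left = 50 + 10 * i
    print(f"{left}-{left + 10}: {'*' * count}")
```

Because every class has the same width, the bar lengths can be compared directly. With unequal widths, a wider class would look taller simply because it covers more values.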

Definition, pg 9 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

OK. We talked a lot about stemplots. I will just leave two graphs on the next two pages. I trust you know how to get there. Note that a stemplot is not good for large samples.

Table 1.2 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Figure 1.5 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 18 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Table 1.4 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company


Figure 1.10 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 15 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

OK. We learned how to plot data. Now let's get back to the distribution. This slide is very important: it tells you what to look for in a graph or a data source. We will learn how you get these numbers.

Definition, pg 31 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 32–33 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 35 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 37 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company


Figure 1.19 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 38a Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 38b Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 39 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 40–41 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Figure 1.21 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company


Definition, pg 43a Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 43b Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 45 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

You surely know this from high school.

Definition, pg 46–47 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

1.3 Density Curve and Normal Distributions

Figure 1.24 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

You know what the blue bars represent, right? – Histogram. Simply fitting a smooth curve to the histogram, we get the density curve.

Characteristics:
- Is always on or above the horizontal axis (a histogram uses frequency, so it cannot be negative)
- Has area exactly 1 underneath it (it shows the distribution of *all* data)
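Both characteristics can be checked numerically. The sketch below uses the bell-shaped normal curve (introduced later in this chapter) as the example density and approximates the area with the trapezoid rule. The helper names and the choice of the interval from -8 to 8 are assumptions made for this illustration.

```python
import math


def normal_density(x, mu=0.0, sigma=1.0):
    """Height of the normal density curve with mean mu and std sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b (trapezoid rule)."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


# The curve is practically zero beyond 8 standard deviations from the mean.
area = area_under(normal_density, -8.0, 8.0)

# Smallest height on a grid of points: it should never be negative.
lowest = min(normal_density(-8.0 + 0.01 * i) for i in range(1601))

print(f"area = {area:.6f}, lowest height >= 0: {lowest >= 0}")
```

The area comes out as 1 up to rounding error, and the curve never drops below the horizontal axis.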


Definition, pg 56 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Definition, pg 57 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

More characteristics

Figure 1.28 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

One particular type of density curve is a normal curve that follows a normal distribution: N(μ, σ)

*All* normal distributions have the same overall shape. The exact density curve for a particular normal distribution is specified by giving its mean and its standard deviation.

Figure 1.30 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

An example of a normal distribution, N(64.5, 2.5)

The 68-95-99.7 rule

Definition, pg 59 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Figure 1.29 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company
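As a quick check of the 68-95-99.7 rule, the following sketch computes the proportion within 1, 2, and 3 standard deviations of the mean for the N(64.5, 2.5) example. It uses `NormalDist` from Python's standard `statistics` module; the variable names are chosen for this illustration.

```python
from statistics import NormalDist

# The example distribution from the slides: mean 64.5, standard deviation 2.5.
dist = NormalDist(mu=64.5, sigma=2.5)

shares = {}
for k in (1, 2, 3):
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    # Proportion between low and high = area to the left of high
    # minus area to the left of low.
    shares[k] = dist.cdf(high) - dist.cdf(low)
    print(f"within {k} std: {low:.1f} to {high:.1f} -> {shares[k]:.4f}")
```

The three proportions are about 0.6827, 0.9545, and 0.9973. The same numbers result for any mean and standard deviation, which is why the rule applies to all normal distributions.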

Standardized Normal Distribution: N(0, 1)

Notice the mean is 0 and the standard deviation is 1.


Definition, pg 61 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

We can do a linear transformation to get a standardized distribution.
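The linear transformation is z = (x - μ) / σ: subtract the mean, then divide by the standard deviation. A minimal sketch, reusing the N(64.5, 2.5) example from earlier (the function name `standardize` is made up for this illustration):

```python
def standardize(x, mu, sigma):
    """Return the z-score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


mu, sigma = 64.5, 2.5

z_at_mean = standardize(64.5, mu, sigma)  # a value equal to the mean
z_above = standardize(67.0, mu, sigma)    # one standard deviation above
z_below = standardize(62.0, mu, sigma)    # one standard deviation below

print(z_at_mean, z_above, z_below)
```

A value at the mean gets z = 0, and values one standard deviation above or below get z = 1 and z = -1. That is why the standardized distribution has mean 0 and standard deviation 1.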

Definition, pg 62 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Figure 1.31 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

When x is the mean, the cumulative proportion to the left will be 50%.

Figure 1.32 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

Check Table A in your textbook. Given a z score of 1.47, what is the area to the left of this z score?
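If you want to check a Table A lookup, the area to the left of a z score can be computed from the error function in Python's standard `math` module. This is only a sketch for verifying table values; the function name `standard_normal_cdf` is chosen for this illustration.

```python
import math


def standard_normal_cdf(z):
    """Area under the N(0, 1) density curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


area_left = standard_normal_cdf(1.47)
print(round(area_left, 4))  # 0.9292
```

So about 92.92% of the observations lie to the left of z = 1.47, and the remaining 7.08% lie to the right.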

Figure 1.33 Introduction to the Practice of Statistics, Sixth Edition © 2009 W.H. Freeman and Company

How high must a value be to fall in the top 10% (the blue block), given a distribution of N(505, 110)?

Two steps to solve the problem.

(1) Use Table A to get the z value: look for 0.9 in the table, which is the proportion to the left of z (the yellow block). This gives z = 1.28.

(2) Un-standardize: transform z back to the data scale using the linear transformation. (x - 505) / 110 = 1.28, so x = 505 + 1.28 × 110 = 645.8.
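The same two steps can be sketched in Python, with `NormalDist.inv_cdf` from the standard `statistics` module playing the role of the backward lookup in Table A. The variable names are chosen for this illustration.

```python
from statistics import NormalDist

# The distribution from the problem: mean 505, standard deviation 110.
scores = NormalDist(mu=505, sigma=110)

# Step 1: find the z value with proportion 0.90 to its left.
z = NormalDist().inv_cdf(0.90)

# Step 2: un-standardize with x = mean + z * std.
x = scores.mean + scores.stdev * z

print(f"z = {z:.4f}, x = {x:.1f}")
```

The computed z is about 1.2816 rather than the rounded table value 1.28, so x comes out near 646.0 instead of 645.8. The small gap is only rounding in the table.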

Review Chapter 1

- Five-number summary
- Mean / std
- Histogram
- Density curve
- Normal distribution, its mean and std
- Z score, 68-95-99.7 rule
- Plot your data (stem plot, bar chart, pie chart, time plot)