Upload
ovaizowais
View
225
Download
0
Embed Size (px)
Citation preview
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 1/723
This text was adapted by The Saylor Foundation
under a Creative Commons Attribution-
NonCommercial-ShareAlike 3. !icense without
attribution as re"uested by the work#s ori$inal
creator or licensee.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 2/723
Preface
This book is meant to be a textbook for a standard one-semester introductory statistics course for
general education students. Our motivation for writing it is twofold: 1.) to provide a low-costalternative to many existing popular textbooks on the market and !.) to provide a "uality textbook
on the sub#ect with a focus on the core material of the course in a balanced presentation.
The high cost of textbooks has spiraled out of control in recent years. The high fre"uency at which
new editions of popular texts appear puts a tremendous burden on students and faculty alike$ as well
as the natural environment. %gainst this background we set out to write a "uality textbook with
materials such as examples and exercises that age well with time and that would therefore notre"uire fre"uent new editions. Our vision resonates well with the publisher&s business model which
includes free digital access$ reduced paper prints$ and easy customi'ation by instructors if additional
material is desired.
Over time the core content of this course has developed into a well-defined body of material that is
substantial for a one-semester course. The authors believe that the students in this course are best
served by a focus on the core material and not by an exposure to a plethora of peripheral topics.Therefore in writing this book we have sought to present material that comprises fully a central body
of knowledge that is defined according to convention$ realistic expectation with respect to course
duration and students& maturity level$ and our professional #udgment and experience. (e believe
that certain topics$ among them oisson and geometric distributions and the normal approximation
to the binomial distribution *particularly with a continuity correction) are distracting in nature.
Other topics$ such as nonparametric methods$ while important$ do not belong in a first course in
statistics. %s a result we envision a smaller and less intimidating textbook that trades some extended
and unnecessary topics for a better focused presentation of the central material.
Textbooks for this course cover a wide range in terms of simplicity and complexity. +ome popular
textbooks emphasi'e the simplicity of individual concepts to the point of lacking the coherence of an
overall network of concepts. Other textbooks include overly detailed conceptual and computational
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 3/723
discussions and as a result repel students from reading them. The authors believe that a successful
book must strike a balance between the two extremes$ however difficult it may be. %s a conse"uence
the overarching guiding principle of our writing is to seek simplicity but to preserve the coherence of
the whole body of information communicated$ both conceptually and computationally. (e seek to
remind ourselves *and others) that we teach ideas$ not #ust step-by-step algorithms$ but ideas thatcan be implemented by straightforward algorithms.
,n our experience most students come to an introductory course in statistics with a calculator that
they are familiar with and with which their proficiency is more than ade"uate for the course material.
,f the instructor chooses to use technological aids$ either calculators or statistical software such as
initab or +++$ for more than mere arithmetical computations but as a significant component of
the course then effective instruction for their use will re"uire more extensive written instruction thana mere paragraph or two in the text. iven the plethora of such aids available$ to discuss a few of
them would not provide sufficiently wide or detailed coverage and to discuss many would digress
unnecessarily from the conceptual focus of the book. The overarching philosophy of this textbook is
to present the core material of an introductory course in statistics for non-ma#ors in a complete yet
streamlined way. uch room has been intentionally left for instructors to apply their own
instructional styles as they deem appropriate for their classes and educational goals. (e believe that
the whole matter of what technological aids to use$ and to what extent$ is precisely the type of
material best left to the instructor&s discretion.
%ll figures with the exception of /igure 1.1 0The rand icture of +tatistics0$/igure !.1 0+tem and
eaf 2iagram0$ /igure !.! 0Ordered +tem and eaf 2iagram0$/igure !.13 0The 4ox lot0$ /igure 15.6
0inear 7orrelation 7oefficient 0$ /igure 15.8 0The +imple inear odel 7oncept0$ and the
unnumbered figure in 9ote !.85 0xample 1;0 of 7hapter ! 02escriptive +tatistics0 were generated
using %T%4$ copyright !515.
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 4/723
Chapter %
&ntroduction
,n this chapter we will introduce some basic terminology and lay the groundwork for the course. (e will explain in general terms what statistics and probability are and the problems that these two
areas of study are designed to solve.
% 'asic (e)nitions and Concepts
LEARNN! "#$E%&'E
1 &o lear( the bas)c *e+()t)o(s ,se* )( stat)st)cs a(* so-e of )ts key co(cepts.
(e begin with a simple example. There are millions of passenger automobiles in the <nited +tates.
(hat is their average value= ,t is obviously impractical to attempt to solve this problem directly by
assessing the value of every single car in the country$ adding up all those numbers$ and then dividing
by however many numbers there are. ,nstead$ the best we can do would be to estimate the average.
One natural way to do so would be to randomly select some of the cars$ say !55 of them$ ascertain
the value of each of those cars$ and find the average of those !55 numbers. The set of all those
millions of vehicles is called the population of interest$ and the number attached to each one$ its
value$ is a measurement . The average value is a parameter: a number that describes a characteristic
of the population$ in this case monetary worth. The set of !55 cars selected from the population is
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 5/723
called a sample$ and the !55 numbers$ the monetary values of the cars we selected$ are the sample
data. The average of the data is called a statistic: a number calculated from the sample data. This
example illustrates the meaning of the following definitions.
e+()t)o(
A population is any specific collection of objects of interest. A sample is any subset or subcollection of
the population, including the case that the sample consists of the whole population, in which case it is
termed a census.
e+()t)o(
A measurement is a number or attribute computed for each member of a population or of a sample.
The measurements of sample elements are collectively called the sample data.
e+()t)o(
A parameter is a number that summarizes some aspect of the population as a whole. A statistic is a
number computed from the sample data.
7ontinuing with our example$ if the average value of the cars in our sample was >?$38@$ then it seems
reasonable to conclude that the average value of all cars is about >?$38@. ,n reasoning this way we
have drawn an inference about the population based on information obtained from the sample. ,ngeneral$ statistics is a study of data: describing properties of the data$ which is called descriptive
statistics$ and drawing conclusions about a population of interest from information extracted from a
sample$ which is called inferential statistics. 7omputing the single number >?$38@ to summari'e the
data was an operation of descriptive statistics using it to make a statement about the population was
an operation of inferential statistics.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 6/723
e+()t)o(
Statistics is a collection of methods for collecting, displaying, analyzing, and drawing conclusions from data.
e+()t)o(
Descriptive statistics is the branch of statistics that involves organizing, displaying, and describing
data.
e+()t)o(
Inferential statistics is the branch of statistics that involves drawing conclusions about a population
based on information contained in a sample taken from that population.
The measurement made on each element of a sample need not be numerical. ,nthe case of
automobiles$ what is noted about each car could be its color$ its make$ its body type$ and so on. +uch
data are categorical or qualitative$ as opposed to numerical or quantitative data such as value or age.
This is a general distinction.
e+()t)o(
Qualitative data are measurements for which there is no natural numerical scale, but which consist of
attributes, labels, or other nonnumerical characteristics.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 7/723
e+()t)o(
Quantitative data are numerical measurements that arise from a natural numerical scale.
Aualitative data can generate numerical sample statistics. ,n the automobile example$ for instance$
we might be interested in the proportion of all cars that are less than six years old. ,n our same
sample of !55 cars we could note for each car whether it is less than six years old or not$ which is a
"ualitative measurement. ,f 1@! cars in the sample are less than six years old$ which is 5.?; or ?;B$
then we would estimate the parameter of interest$ the population proportion$ to be about the same as
the sample statistic$ the sample proportion$ that is$ about 5.?;.
The relationship between a population of interest and a sample drawn from that population isperhaps the most important concept in statistics$ since everything else rests on it. This relationship is
illustrated graphically in /igure 1.1 0The rand icture of +tatistics0. The circles in the large box
represent elements of the population. ,n the figure there was room for only a small number of them
but in actual situations$ like our automobile example$ they could very well number in the millions.
The solid black circles represent the elements of the population that are selected at random and that
together form the sample. /or each element of the sample there is a measurement of interest$
denoted by a lower case x *which we have indexed asx1,…,xnto tell them apart) these measurements
collectively form the sample data set. /rom the data we may calculate various statistics. To anticipate
the notation that will be used later$ we might compute the sample mean x−and the sample
proportion p$ and take them as approximations to the population mean *this is the lower case
reek letter mu$ the traditional symbol for this parameter) and the population proportion p$
respectively. The other symbols in the figure stand for other parameters and statistics that we will
encounter.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 8/723
!igure "." The #rand $icture of %tatistics
*+, TA*+AA,S
• Stat)st)cs )s a st,*y of *ata: *escr)b)(g propert)es of *ata *escr)pt)4e
stat)st)cs5 a(* *raw)(g co(cl,s)o(s abo,t a pop,lat)o( base* o( )(for-at)o(
)( a sa-ple )(fere(t)al stat)st)cs5.
• &he *)st)(ct)o( betwee( a pop,lat)o( together w)th )ts para-eters a(* a
sa-ple together w)th )ts stat)st)cs )s a f,(*a-e(tal co(cept )( )(fere(t)al
stat)st)cs.
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 9/723
• (for-at)o( )( a sa-ple )s ,se* to -ake )(fere(ces abo,t the pop,lat)o( fro-
wh)ch the sa-ple was *raw(.
++/C&S+S
1 E7pla)( what )s -ea(t by the ter- population.
2 E7pla)( what )s -ea(t by the ter- sample.
3 E7pla)( how a sa-ple *)8ers fro- a pop,lat)o(.
E7pla)( what )s -ea(t by the ter- sample data.
0 E7pla)( what a parameter )s.
E7pla)( what a statistic )s.
!)4e a( e7a-ple of a pop,lat)o( a(* two *)8ere(t character)st)cs that -ay be of
)(terest.
6 escr)be the *)8ere(ce betwee( descriptive statistics a(* inferential statistics. ll,strate
w)th a( e7a-ple.
9 *e(t)fy each of the follow)(g *ata sets as e)ther a pop,lat)o( or a sa-ple:
a &he gra*e po)(t a4erages !PAs5 of all st,*e(ts at a college.
b &he !PAs of a ra(*o-ly selecte* gro,p of st,*e(ts o( a college ca-p,s.
c &he ages of the ()(e S,pre-e %o,rt $,st)ces of the U()te* States o( $a(,ary 1
162.
* &he ge(*er of e4ery seco(* c,sto-er who e(ters a -o4)e theater.
e &he le(gths of Atla(t)c croakers ca,ght o( a +sh)(g tr)p to the beach.
1; *e(t)fy the follow)(g -eas,res as e)ther <,a(t)tat)4e or <,al)tat)4e:
a &he 3; h)gh=te-perat,re rea*)(gs of the last 3; *ays.
b &he scores of ; st,*e(ts o( a( E(gl)sh test.
c &he bloo* types of 12; teachers )( a -)**le school.
* &he last fo,r *)g)ts of soc)al sec,r)ty (,-bers of all st,*e(ts )( a class.
e &he (,-bers o( the >erseys of 03 football players o( a tea-.
11 *e(t)fy the follow)(g -eas,res as e)ther <,a(t)tat)4e or <,al)tat)4e:
a &he ge(*ers of the +rst ; (ewbor(s )( a hosp)tal o(e year.
b &he (at,ral ha)r color of 2; ra(*o-ly selecte* fash)o( -o*els.
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 10/723
c &he ages of 2; ra(*o-ly selecte* fash)o( -o*els.
* &he f,el eco(o-y )( -)les per gallo( of 2; (ew cars p,rchase* last -o(th.
e &he pol)t)cal a?l)at)o( of 0;; ra(*o-ly selecte* 4oters.
12 A researcher w)shes to est)-ate the a4erage a-o,(t spe(t per perso( by 4)s)tors to a
the-e park. @e takes a ra(*o- sa-ple of forty 4)s)tors a(* obta)(s a( a4erage of 26
per perso(.
a Bhat )s the pop,lat)o( of )(terestC
b Bhat )s the para-eter of )(terestC
c #ase* o( th)s sa-ple *o we k(ow the a4erage a-o,(t spe(t per perso( by
4)s)tors to the parkC E7pla)( f,lly.
13 A researcher w)shes to est)-ate the a4erage we)ght of (ewbor(s )( So,th A-er)ca )(
the last +4e years. @e takes a ra(*o- sa-ple of 230 (ewbor(s a(* obta)(s a( a4erage
of 3.2 k)logra-s.
a Bhat )s the pop,lat)o( of )(terestC
b Bhat )s the para-eter of )(terestC
c #ase* o( th)s sa-ple *o we k(ow the a4erage we)ght of (ewbor(s )( So,th
A-er)caC E7pla)( f,lly.
1 A researcher w)shes to est)-ate the proport)o( of all a*,lts who ow( a cell pho(e. @e
takes a ra(*o- sa-ple of 102 a*,ltsD 1296 of the- ow( a cell pho(e he(ce
1296102 F .63 or abo,t 63G ow( a cell pho(e.
a Bhat )s the pop,lat)o( of )(terestC
b Bhat )s the para-eter of )(terestC
c Bhat )s the stat)st)c )(4ol4e*C
* #ase* o( th)s sa-ple *o we k(ow the proport)o( of all a*,lts who ow( a cell
pho(eC E7pla)( f,lly.
10 A soc)olog)st w)shes to est)-ate the proport)o( of all a*,lts )( a certa)( reg)o( who ha4e
(e4er -arr)e*. ( a ra(*o- sa-ple of 132; a*,lts 10 ha4e (e4er -arr)e* he(ce
10132; F .11 or abo,t 11G ha4e (e4er -arr)e*.
a Bhat )s the pop,lat)o( of )(terestC
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 11/723
b Bhat )s the para-eter of )(terestC
c Bhat )s the stat)st)c )(4ol4e*C
* #ase* o( th)s sa-ple *o we k(ow the proport)o( of all a*,lts who ha4e (e4er
-arr)e*C E7pla)( f,lly.
1 a. Bhat -,st be tr,e of a sa-ple )f )t )s to g)4e a rel)able est)-ate of the 4al,e
of a part)c,lar
pop,lat)o( para-eterC
b. Bhat -,st be tr,e of a sa-ple )f )t )s to g)4e certain k(owle*ge of the 4al,e
of a part)c,lar
pop,lat)o( para-eterC
ANS+/S
1 A pop,lat)o( )s the total collect)o( of ob>ects that are of )(terest )( a stat)st)cal st,*y.
3 A sa-ple be)(g a s,bset )s typ)cally s-aller tha( the pop,lat)o(. ( a stat)st)cal st,*y
all ele-e(ts of a sa-ple are a4a)lable for obser4at)o( wh)ch )s (ot typ)cally the case
for a pop,lat)o(.
0 A para-eter )s a 4al,e *escr)b)(g a character)st)c of a pop,lat)o(. ( a stat)st)cal st,*y
the 4al,e of a para-eter )s typ)cally ,(k(ow(.
All c,rre(tly reg)stere* st,*e(ts at a part)c,lar college for- a pop,lat)o(. &wo
pop,lat)o( character)st)cs of )(terest co,l* be the a4erage !PA a(* the proport)o( of
st,*e(ts o4er 23 years.
9 a. Pop,lat)o(.
b. Sa-ple.
c Pop,lat)o(.
* Sa-ple.
e Sa-ple.
Saylor URL: http://www.saylor.org/books Saylor.org11
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 12/723
11 a. H,al)tat)4e.
b. H,al)tat)4e.
c H,a(t)tat)4e.
* H,a(t)tat)4e.
e H,al)tat)4e.
13 a. All (ewbor( bab)es )( So,th A-er)ca )( the last +4e years.
b. &he a4erage b)rth we)ght of all (ewbor( bab)es )( So,th A-er)ca )( the last +4e
years. c. No (ot e7actly b,t we k(ow the appro7)-ate 4al,e of
the a4erage.
10 a. All a*,lts )( the reg)o(.
b. &he proport)o( of the a*,lts )( the reg)o( who ha4e (e4er -arr)e*.
c. &he proport)o( co-p,te* fro- the sa-ple ;.1.
*. No (ot e7actly b,t we k(ow the appro7)-ate 4al,e of the proport)o(.
0 1verview
LEARNN! "#$E%&'E
1 &o obta)( a( o4er4)ew of the -ater)al )( the te7t.
The example we have given in the first section seems fairly simple$ but there are some significant
problems that it illustrates. (e have supposed that the !55 cars of the sample had an average value
of >?$38@ *a number that is precisely known)$ and concluded that the population has an average of
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 13/723
about the same amount$ although its precise value is still unknown. (hat would happen if someone
were to take another sample of exactly the same si'e from exactly the same population= (ould he get
the same sample average as we did$ >?$38@= %lmost surely not. ,n fact$ if the investigator who took
the second sample were to report precisely the same value$ we would immediately become suspicious
of his result. The sample average is an example of what is called a random variable: a number that varies from trial to trial of an experiment *in this case$ from sample to sample)$ and does so in a way
that cannot be predicted precisely. Candom variables will be a central ob#ect of study for us$
beginning in 7hapter 6 02iscrete Candom Dariables0.
%nother issue that arises is that different samples have different levels of reliability. (e have
supposed that our sample of si'e !55 had an average of >?$38@. ,f a sample of si'e 1$555 yielded an
average value of >@$?3!$ then we would naturally regard this latter number as likely to be a betterestimate of the average value of all cars. Eow can this be expressed= %n important idea that we will
develop in 7hapter @ 0stimation0 is that of the confidence interval : from the data we will construct
an interval of values so that the process has a certain chance$ say a F8B chance$ of generating an
interval that contains the actual population average. Thus instead of reporting a single estimate$
>?$38@$ for the population mean$ we would say that we are F8B certain that the true average is
within >155 of our sample mean$ that is$ between >?$!8@ and >?$68@$ the number >155 having been
computed from the sample data #ust like the sample mean >?$38@ was. This will automatically
indicate the reliability of the sample$ since to obtain the same chance of containing the unknown
parameter a large sample will typically produce a shorter interval than a small one will. 4ut unless
we perform a census$ we can never be completely sure of the true average value of the population the
best that we can do is to make statements of probability$ an important concept that we will begin to
study formally in 7hapter 3 04asic 7oncepts of robability0.
+ampling may be done not only to estimate a population parameter$ but to test a claim that is made
about that parameter. +uppose a food package asserts that the amount of sugar in one serving of the
product is 16 grams. % consumer group might suspect that it is more. Eow would they test the
competing claims about the amount of sugar$ 16 grams versus more than 16 grams= They might take
a random sample of perhaps !5 food packages$ measure the amount of sugar in one serving of each
one$ and average those amounts. They are not interested in the true amount of sugar in one serving
in itself their interest is simply whether the claim about the true amount is accurate. +tated another
way$ they are sampling not in order to estimate the average amount of sugar in one serving$ but to
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 14/723
see whether that amount$ whatever it may be$ is larger than 16 grams. %gain because one can have
certain knowledge only by taking a census$ ideas of probability enter into the analysis. (e will
examine tests of hypotheses beginning in 7hapter ? 0Testing Eypotheses0.
+everal times in this introduction we have used the term Grandom sample.H enerally the value of
our data is only as good as the sample that produced it. /or example$ suppose we wish to estimate
the proportion of all students at a large university who are females$ which we denote by p. ,f we
select 85 students at random and !@ of them are female$ then a natural estimate is p≈ p-27 50-
0.54or 86B. Eow much confidence we can place in this estimate depends not only on the si'e of the
sample$ but on its "uality$ whether or not it is truly random$ or at least truly representative of the
whole population. ,f all 85 students in our sample were drawn from a 7ollege of 9ursing$ then the
proportion of female students in the sample is likely higher than that of the entire campus. ,f all 85
students were selected from a 7ollege of ngineering +ciences$ then the proportion of students in the
entire student body who are females could be underestimated. ,n either case$ the estimate would be
distorted or biased. ,n statistical practice an unbiased sampling scheme is important but in most
cases not easy to produce. /or this introductory course we will assume that all samples are either
random or at least representative.
IEJ &AIEABAJ
• Stat)st)cs co-p,te* fro- sa-ples 4ary ra(*o-ly fro- sa-ple to sa-ple.
%o(cl,s)o(s -a*e abo,t pop,lat)o( para-eters are state-e(ts of probab)l)ty.
3 2resentation o (ata
LEARNN! "#$E%&'E
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 15/723
1 &o lear( two ways that *ata w)ll be prese(te* )( the te7t.
,n this book we will use two formats for presenting data sets. The first is a data list$ which is an
explicit listing of all the individual measurements$ either as a display with space between the
individual measurements$ or in set notation with individual measurements separated by commas.
EKAPLE 1
&he *ata obta)(e* by -eas,r)(g the age of 21 ra(*o-ly selecte* st,*e(ts e(rolle* )(
fresh-a( co,rses at a ,()4ers)ty co,l* be prese(te* as the *ata l)st
18 18 19 19 19 18 22 20 18 18 17
19 18 24 18 20 18 21 20 17 19
or )( set (otat)o( as
{18,18,19,19,19,18,22,20,18,18,17,19,18,24,18,20,18,21,20,17,19}
% data set can also be presented by means of a data frequency table$ a table in which
each distinct value x is listed in the first row and its frequency f $ which is the number of times the
value x appears in the data set$ is listed below it in the second row.
EKAPLE 2
&he *ata set of the pre4)o,s e7a-ple )s represe(te* by the *ata fre<,e(cy table
x %4 %5 %6 0 0% 00 07
0 5 8 3 % % %
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 16/723
The data fre"uency table is especially convenient when data sets are large and the number of distinct
values is not too large.
IEJ &AIEABAJ
• ata sets ca( be prese(te* e)ther by l)st)(g all the ele-e(ts or by g)4)(g a table
of 4al,es a(* fre<,e(c)es.
EKER%SES
1 L)st all the -eas,re-e(ts for the *ata set represe(te* by the follow)(g *ata fre<,e(cy
table.
x 0% 00 00 07 08
% 8 9 7 0
2 L)st all the -eas,re-e(ts for the *ata set represe(te* by the follow)(g *ata fre<,e(cy
table.
x 64 65 66 % %% %0 %0 %8
4 8 0 7 0 0 % %
3 %o(str,ct the *ata fre<,e(cy table for the follow)(g *ata set.
22 25 22 27 24 23
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 17/723
26 24 22 24 26
%o(str,ct the *ata fre<,e(cy table for the follow)(g *ata set.
{1,5,2,3,5,1,4,4,4,3,2,5,1,3,2,
1,1,1,2}
ANSBERS
1 M31323232323233333333333333333030.
3
x 00 03 07 08 09 04
3 % 3 % 0 %
Chapter 0(escriptive Statistics
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 18/723
%s described in 7hapter 1 0,ntroduction0$ statistics naturally divides into two branches$ descriptive
statistics and inferential statistics. Our main interest is in inferential statistics$ as shown in /igure 1.1
0The rand icture of +tatistics0 in 7hapter 1 0,ntroduction0. 9evertheless$ the starting point for
dealing with a collection of data is to organi'e$ display$ and summari'e it effectively. These are the
ob#ectives of descriptive statistics$ the topic of this chapter.
0.% Three 2opular (ata (isplays
!+A/N&N: 1';+CT&<+
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 19/723
1 &o lear( to )(terpret the -ea()(g of three graph)cal represe(tat)o(s of sets of
*ata: ste- a(* leaf *)agra-s fre<,e(cy h)stogra-s a(* relat)4e fre<,e(cy
h)stogra-s.
% well-known adage is that Ga picture is worth a thousand words.H This saying proves true when it
comes to presenting statistical information in a data set. There are many effective ways to present
data graphically. The three graphical tools that are introduced in this section are among the most
commonly used and are relevant to the subse"uent presentation of the material in this book.
Stem and !ea (ia$rams
+uppose 35 students in a statistics class took a test and made the following scores:
86 80 25 77 73 76 100 90 69 9390 83 70 73 73 70 90 83 71 95
40 58 68 69 100 78 87 97 92 74
Eow did the class do on the test= % "uick glance at the set of 35 numbers does not immediately give a
clear answer. Eowever the data set may be reorgani'ed and rewritten to make relevant information more
visible. One way to do so is to construct a stem and leaf diagram as shown in . The numbers in the tens
place$ from ! through F$ and additionally the number 15$ are the Gstems$H and are arranged in numerical
order from top to bottom to the left of a vertical line. The number in the units place in each measurement
is a Gleaf$H and is placed in a row to the right of the corresponding stem$ the number in the tens place ofthat measurement. Thus the three leaves F$ ?$ and F in the row headed with the stem ; correspond to the
three exam scores in the ;5s$ ;F *in the first row of data)$ ;? *in the third row)$ and ;F *also in the third
row). The display is made even more useful for some purposes by rearranging the leaves in numerical
order$ as shown in . ither way$ with the data reorgani'ed certain information of interest becomes
apparent immediately. There are two perfect scores three students made scores under ;5 most students
scored in the @5s$ ?5s and F5s and the overall average is probably in the high @5s or low ?5s.
igure &." %tem and 'eaf (iagram
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 20/723
!igure &.& )rdered %tem and 'eaf (iagram
,n this example the scores have a natural stem *the tens place) and leaf *the ones place). One could spread
the diagram out by splitting each tens place number into lower and upper categories. /or example$ all the
scores in the ?5s may be represented on two separate stems$ lower ?5s and upper ?5s:
5 3 3
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 21/723
5 9 4
The definitions of stems and leaves are flexible in practice. The general purpose of a stem and leaf
diagram is to provide a "uick display of how the data are distributed across the range of their values some
improvisation could be necessary to obtain a diagram that best meets that goal.
9ote that all of the original data can be recovered from the stem and leaf diagram. This will not be true in
the next two types of graphical displays.
Fre"uency =isto$rams
The stem and leaf diagram is not practical for large data sets$ so we need a different$ purely graphical way
to represent data. % frequency histogram is such a device. (e will illustrate it using the same data set
from the previous subsection. /or the 35 scores on the exam$ it is natural to group the scores on the
standard ten-point scale$ and count the number of scores in each group. Thus there are two 155s$ seven
scores in the F5s$ six in the ?5s$ and so on. (e then construct the diagram shown in by drawing for each
group$ or class$ a vertical bar whose length is the number of observations in that group. ,n our example$
the bar labeled 155 is ! units long$ the bar labeled F5 is @ units long$ and so on. (hile the individual data
values are lost$ we know the number in each class. This number is called the frequency of the class$
hence the name fre"uency histogram.
!igure &.* !requency +istogram
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 22/723
The same procedure can be applied to any collection of numerical data. Observations are grouped into
several classes and the fre"uency *the number of observations) of each class is noted. These classes are
arranged and indicated in order on the hori'ontal axis *called the x -axis)$ and for each group a vertical
bar$ whose length is the number of observations in that group$ is drawn. The resulting display is a
fre"uency histogram for the data. The similarity in and is apparent$ particularly if you imagine turning the
stem and leaf diagram on its side by rotating it a "uarter turn counterclockwise.
,n general$ the definition of the classes in the fre"uency histogram is flexible. The general purpose of a
fre"uency histogram is very much the same as that of a stem and leaf diagram$ to provide a graphical
display that gives a sense of data distribution across the range of values that appear. (e will not discuss
the process of constructing a histogram from data since in actual practice it is done automatically with
statistical software or even handheld calculators.
/elative Fre"uency =isto$rams,n our example of the exam scores in a statistics class$ five students scored in the ?5s. The number 8 is
the frequency of the group labeled G?5s.H +ince there are 35 students in the entire statistics class$ the
proportion who scored in the ?5s is 8I35. The number 8I35$ which could also be expressed as0.16≈.1667$
or as 1;.;@B$ is the relative frequency of the group labeled G?5s.H very group *the @5s$ the ?5s$ and
so on) has a relative fre"uency. (e can thus construct a diagram by drawing for each group$ or class$ a
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 23/723
vertical bar whose length is the relative fre"uency of that group. /or example$ the bar for the ?5s will have
length 8I35 unit$ not 8 units. The diagram is a relative frequency histogram for the data$ and is
shown in . ,t is exactly the same as the fre"uency histogram except that the vertical axis in the relative
fre"uency histogram is not fre"uency but relative fre"uency.
!igure &. -elative !requency +istogram
The same procedure can be applied to any collection of numerical data. 7lasses are selected$ the relative
fre"uency of each class is noted$ the classes are arranged and indicated in order on the hori'ontal axis$
and for each class a vertical bar$ whose length is the relative fre"uency of the class$ is drawn. The resulting
display is a relative fre"uency histogram for the data. % key point is that now if each vertical bar has width
1 unit$ then the total area of all the bars is 1 or 155B.
%lthough the histograms in and have the same appearance$ the relative fre"uency histogram is more
important for us$ and it will be relative fre"uency histograms that will be used repeatedly to
represent data in this text. To see why this is so$ reflect on what it is that you are actually seeing in
the diagrams that "uickly and effectively communicates information to you about the data. ,t is
the relative sizes of the bars. The bar labeled G@5sH in either figure takes up 1I3 of the total area of all
the bars$ and although we may not think of this consciously$ we perceive the proportion 1I3 in the
figures$ indicating that a third of the grades were in the @5s. The relative fre"uency histogram is
important because the labeling on the vertical axis reflects what is important visually: the relative
si'es of the bars.
(hen the si'e n of a sample is small only a few classes can be used in constructing a relative
fre"uency histogram. +uch a histogram might look something like the one in panel *a) of . ,f the
sample si'e n were increased$ then more classes could be used in constructing a relative fre"uency
Saylor URL: http://www.saylor.org/books Saylor.org23
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 24/723
histogram and the vertical bars of the resulting histogram would be finer$ as indicated in panel *b)
of . /or a very large sample the relative fre"uency histogram would look very fine$ like the one in *c)
of. ,f the sample si'e were to increase indefinitely then the corresponding relative fre"uency
histogram would be so fine that it would look like a smooth curve$ such as the one in panel *d) of .
!igure &. %ample %ize and -elative !requency +istograms
,t is common in statistics to represent a population or a very large data set by a smooth curve. ,t is
good to keep in mind that such a curve is actually #ust a very fine relative fre"uency histogram in
which the exceedingly narrow vertical bars have disappeared. 4ecause the area of each such vertical
bar is the proportion of the data that lies in the interval of numbers over which that bar stands$ this
means that for any two numbers a and b$ the proportion of the data that lies between the two
numbers a and b is the area under the curve that is above the interval *a$b) in the hori'ontal axis.
This is the area shown in . ,n particular the total area under the curve is 1$ or 155B.
!igure &./ A 0ery !ine -elative !requency +istogram
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 25/723
*+, TA*+AA,S
• !raph)cal represe(tat)o(s of large *ata sets pro4)*e a <,)ck o4er4)ew of the
(at,re of the *ata.
• A pop,lat)o( or a 4ery large *ata set -ay be represe(te* by a s-ooth c,r4e. &h)s
c,r4e )s a 4ery +(e relat)4e fre<,e(cy h)stogra- )( wh)ch the e7cee*)(gly (arrow
4ert)cal bars ha4e bee( o-)tte*.
• Bhe( a c,r4e *er)4e* fro- a relat)4e fre<,e(cy h)stogra- )s ,se* to *escr)be a
*ata set the proport)o( of *ata w)th 4al,es betwee( two (,-bers a a(* b )s the
area ,(*er the c,r4e betwee( a a(* b as )ll,strate* )( )g,re 2. QA 'ery )(e
Relat)4e re<,e(cy @)stogra-Q.
Saylor URL: http://www.saylor.org/books Saylor.org20
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 26/723
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 27/723
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 28/723
Saylor URL: http://www.saylor.org/books Saylor.org26
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 29/723
Saylor URL: http://www.saylor.org/books Saylor.org29
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 30/723
Saylor URL: http://www.saylor.org/books Saylor.org3;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 31/723
Saylor URL: http://www.saylor.org/books Saylor.org31
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 32/723
0.0 >easures o Central !ocation
LEARNN! "#$E%&'ES
1 &o lear( the co(cept of the ce(ter of a *ata set.
2 &o lear( the -ea()(g of each of three -eas,res of the ce(ter of a *ata setTthe
-ea( the -e*)a( a(* the -o*eTa(* how to co-p,te each o(e.
This section could be titled Gthree kinds of averages of a data set.H %ny kind of GaverageH is meant to
be an answer to the "uestion G(here do the data center=H ,t is thus a measure of the central location
Saylor URL: http://www.saylor.org/books Saylor.org32
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 33/723
of the data set. (e will see that the nature of the data set$ as indicated by a relative fre"uency
histogram$ will determine what constitutes a good answer. 2ifferent shapes of the histogram call for
different measures of central location.
The >ean
The first measure of central location is the usual GaverageH that is familiar to everyone. ,n the formula in
the following definition we introduce the standard summation notation J$ where J is the capital reek
letter sigma. ,n general$ the notation J followed by a second mathematical symbol means to add up all the
values that the second symbol can take in the context of the problem. Eere is an example to illustrate this.
,n the definition we follow the convention of using lowercase n to denote the number of
measurements in a sample$ which is called the sample size.
Saylor URL: http://www.saylor.org/books Saylor.org33
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 34/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 35/723
Saylor URL: http://www.saylor.org/books Saylor.org30
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 36/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 37/723
,n the examples above the data sets were described as samples. Therefore the means were sample means$
denoted by x . ,f the data come from a census$ so that there is a measurement for every element of the
population$ then the mean is calculated by exactly the same process of summing all the measurements
and dividing by how many of them there are$ but it is now the population mean and is denoted by $ the
lower case reek letter mu.
The mean of two numbers is the number that is halfway between them. /or example$ the average of the
numbers 8 and 1@ is *8 K 1@) L ! M 11$ which is ; units above 8 and ; units below 1@. ,n this sense the
average 11 is the GcenterH of the data set N8$1@. /or larger data sets the mean can similarly be regarded as
the GcenterH of the data.
The >edian
To see why another concept of average is needed$ consider the following situation. +uppose we are
interested in the average yearly income of employees at a large corporation. (e take a random sample of
seven employees$ obtaining the sample data *rounded to the nearest hundred dollars$ and expressed in
thousands of dollars).
24.8 22.8 24. !"2.# 2#.2 !8.# 2$.%
The mean *rounded to one decimal place) is x- [email protected]$ but the statement Gthe average income of employees
at this corporation is >6@$655H is surely misleading. ,t is approximately twice what six of the seven
employees in the sample make and is nowhere near what any of them makes. ,t is easy to see what went
wrong: the presence of the one executive in the sample$ whose salary is so large compared to everyone
else&s$ caused the numerator in the formula for the sample mean to be far too large$ pulling the mean far
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 38/723
to the right of where we think that the average GoughtH to be$ namely around >!6$555 or >!8$555. The
number 1F!.8 in our data set is called an outlier$ a number that is far removed from most or all of the
remaining measurements. any times an outlier is the result of some sort of error$ but not always$ as is
the case here. (e would get a better measure of the GcenterH of the data if we were to arrange the data in
numerical order$
!8.# 22.8 2$.% 24. 24.8 2#.2 !"2.#
then select the middle number in the list$ in this case !6.;. The result is called the median of the data set$
and has the property that roughly half of the measurements are larger than it is$ and roughly half are
smaller. ,n this sense it locates the center of the data. ,f there are an even number of measurements in the
data set$ then there will be two middle elements when all are lined up in order$ so we take the mean of the
middle two as the median. Thus we have the following definition.
e+()t)o(
The sample median xPQ of a set of sample data for which there are an odd number of measurements is
the middle measurement when the data are arranged in numerical order. The sample median xPQ
of a
set of sample data for which there are an even number of measurements is the mean of the two middle
measurements when the data are arranged in numerical order.
The population median is defined in a similar way$ but we will not have occasion to refer to it again
in this text.
The median is a value that divides the observations in a data set so that 85B of the data are on its left
and the other 85B on its right. ,n accordance with $ therefore$ in the curve that represents the
distribution of the data$ a vertical line drawn at the median divides the area in two$ area 5.8 *85B of
the total area 1) to the left and area 5.8 *85B of the total area 1) to the right$ as shown in . ,n our
income example the median$ >!6$;55$ clearly gave a much better measure of the middle of the data
set than did the mean >6@$655. This is typical for situations in which the distribution is skewed.
*+kewness and symmetry of distributions are discussed at the end of this subsection.)
Saylor URL: http://www.saylor.org/books Saylor.org36
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 39/723
!igure &.1 The 2edian
Saylor URL: http://www.saylor.org/books Saylor.org39
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 40/723
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 41/723
The relationship between the mean and the median for several common shapes of distributions is shown
in . The distributions in panels *a) and *b) are said to be symmetric because of the symmetry that they
exhibit. The distributions in the remaining two panels are said to be skewed . ,n each distribution we have
drawn a vertical line that divides the area under the curve in half$ which in accordance with is located at
the median. The following facts are true in general:
a. (hen the distribution is symmetric$ as in panels *a) and *b) of $ the mean and the median are
e"ual.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 42/723
b. (hen the distribution is as shown in panel *c) of $ it is said to be skewed right . The mean has
been pulled to the right of the median by the long Gright tailH of the distribution$ the few relatively large
data values.
c. (hen the distribution is as shown in panel *d) of $ it is said to be skewed left . The mean has been
pulled to the left of the median by the long Gleft tailH of the distribution$ the few relatively small data
values.
!igure &.3 %kewness of -elative !requency +istograms
The >ode
erhaps you have heard a statement like GThe average number of automobiles owned by households
in the <nited +tates is 1.3@$H and have been amused at the thought of a fraction of an automobile
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 43/723
sitting in a driveway. ,n such a context the following measure for central location might make more
sense.
e+()t)o(
The sample mode of a set of sample data is the most frequently occurring value.
The population mode is defined in a similar way$ but we will not have occasion to refer to it again in
this text.
On a relative fre"uency histogram$ the highest point of the histogram corresponds to the mode of the
data set. illustrates the mode.
!igure &.4 2ode
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 44/723
/or any data set there is always exactly one mean and exactly one median. This need not be true of the
mode several different values could occur with the highest fre"uency$ as we will see. ,t could even happen
that every value occurs with the same fre"uency$ in which case the concept of the mode does not make
much sense.
EKAPLE 6
)(* the -o*e of the follow)(g *ata set.
−1 0 2 0
Sol,t)o(:
&he 4al,e ; )s -ost fre<,e(tly obser4e* a(* therefore the -o*e )s ;.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 45/723
EKAPLE 9
%o-p,te the sa-ple -o*e for the *ata of .
Sol,t)o(:
&he two -ost fre<,e(tly obser4e* 4al,es )( the *ata set are 1 a(* 2. &herefore -o*e
)s a set of two 4al,es: M12.
The mode is a measure of central location since most real-life data sets have moreobservations near the
center of the data range and fewer observations on the lower and upper ends. The value with the highest
fre"uency is often in the middle of the data range.
IEJ &AIEABAJ
&he -ea( the -e*)a( a(* the -o*e each a(swer the <,est)o( Bhere )s the ce(ter
of the *ata setC &he (at,re of the *ata set as )(*)cate* by a relat)4e fre<,e(cy
h)stogra- *eter-)(es wh)ch o(e g)4es the best a(swer.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 46/723
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 47/723
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 48/723
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 49/723
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 50/723
Saylor URL: http://www.saylor.org/books Saylor.org0;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 51/723
Saylor URL: http://www.saylor.org/books Saylor.org01
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 52/723
!A/:+ (ATA S+T ++/ C&S+S
26 Large ata Set 1 l)sts the SA& scores a(* !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a %o-p,te the -ea( a(* -e*)a( of the 1;;; SA& scores.
b %o-p,te the -ea( a(* -e*)a( of the 1;;; !PAs.29 Large ata Set 1 l)sts the SA& scores of 1;;; st,*e(ts.
http://www.1.7ls
a Regar* the *ata as ar)s)(g fro- a ce(s,s of all st,*e(ts at a h)gh school )( wh)ch the
SA& score of e4ery st,*e(t was -eas,re*. %o-p,te the pop,lat)o( -ea( μ.
b Regar* the +rst 20 obser4at)o(s as a ra(*o- sa-ple *raw( fro- th)s pop,lat)o(.
%o-p,te the sa-ple -ea( xP− a(* co-pare )t to μ.
c Regar* the (e7t 20 obser4at)o(s as a ra(*o- sa-ple *raw( fro- th)s pop,lat)o(.
%o-p,te the sa-ple -ea( xP− a(* co-pare )t to μ.
3; Large ata Set 1 l)sts the !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a Regar* the *ata as ar)s)(g fro- a ce(s,s of all fresh-a( at a s-all college at the e(* of
the)r +rst aca*e-)c year of college st,*y )( wh)ch the !PA of e4ery s,ch perso( was
-eas,re*. %o-p,te the pop,lat)o( -ea( μ.
b Regar* the +rst 20 obser4at)o(s as a ra(*o- sa-ple *raw( fro- th)s pop,lat)o(.
%o-p,te the sa-ple -ea( xP− a(* co-pare )t to μ.
c Regar* the (e7t 20 obser4at)o(s as a ra(*o- sa-ple *raw( fro- th)s pop,lat)o(.
%o-p,te the sa-ple -ea( xP− a(* co-pare )t to μ.
31 Large ata Sets A a(* # l)st the s,r4)4al t)-es )( *ays of 1; laboratory -)ce w)th
thy-)c le,ke-)a fro- o(set to *eath.
http://www..7ls
http://www.A.7ls
http://www.#.7ls
a %o-p,te the -ea( a(* -e*)a( s,r4)4al t)-e for all -)ce w)tho,t regar* to ge(*er.
b %o-p,te the -ea( a(* -e*)a( s,r4)4al t)-e for the 0 -ale -)ce separately recor*e*
)( Large ata Set A5.
c %o-p,te the -ea( a(* -e*)a( s,r4)4al t)-e for the 0 fe-ale -)ce separately
recor*e* )( Large ata Set #5.
Saylor URL: http://www.saylor.org/books Saylor.org02
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 53/723
Saylor URL: http://www.saylor.org/books Saylor.org03
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 54/723
0.3 >easures o <ariability
LEARNN! "#$E%&'ES
1 &o lear( the co(cept of the 4ar)ab)l)ty of a *ata set.
2 &o lear( how to co-p,te three -eas,res of the 4ar)ab)l)ty of a *ata set: the
ra(ge the 4ar)a(ce a(* the sta(*ar* *e4)at)o(.
ook at the two data sets in Table !.1 0Two 2ata +ets0 and the graphical representation of each$
called a dot plot $ in /igure !.15 02ot lots of 2ata +ets0.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 55/723
Table !.1 Two 2ata +ets
ata Set : ; 36 2 ; 39 39 3 ; 39 ;
ata Set : 3 ; 33 2 3 ; 3 0
!igure &."5 (ot $lots of (ata %ets
The two sets of ten measurements each center at the same value: they both have mean$ median$ and
mode 65. 9evertheless a glance at the figure shows that they are markedly different. ,n 2ata +et , the
measurements vary only slightly from the center$ while for 2ata +et ,, the measurements vary
greatly. Rust as we have attached numbers to a data set to locate its center$ we now wish to associate
to each data set numbers that measure "uantitatively how the data either scatter away from the
center or cluster close to it. These new "uantities are called measures of variability$ and we will
discuss three of them.
The /an$e
The first measure of variability that we discuss is the simplest.
(e)nition
The range of a data set is the number - defined by the formula
R=xmax−xmin
where xmaxis the largest measurement in the data set and xminis the smallest.
+A>2!+ %
)(* the ra(ge of each *ata set )( &able 2.1 Q&wo ata SetsQ.
Saylor URL: http://www.saylor.org/books Saylor.org00
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 56/723
Sol,t)o(:
or ata Set the -a7)-,- )s 3 a(* the -)()-,- )s 36 so the ra(ge )s R=43−38=5.
or ata Set the -a7)-,- )s a(* the -)()-,- )s 33 so the ra(ge )s R=47−33=14.
The range is a measure of variability because it indicates the si'e of the interval over which the data
points are distributed. % smaller range indicates less variability *less dispersion) among the data$
whereas a larger range indicates the opposite.
The <ariance and the Standard (eviation
The other two measures of variability that we will consider are more elaborate and also depend on
whether the data set is #ust a sample drawn from a much larger population or is the whole populationitself *that is$ a census).
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 57/723
%lthough the first formula in each case looks less complicated than the second$ the latter is easier to
use in hand computations$ and is called a shortcut formula.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 58/723
The student is encouraged to compute the ten deviations for 2ata +et , and verify that their s"uares
add up to !5$ so that the sample variance and standard deviation of 2ata +et , are the much smaller
numberss2=20/9=2.2¯ ands=√ 20/9≈1.49.
Saylor URL: http://www.saylor.org/books Saylor.org06
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 59/723
The sample variance has different units from the data. /or example$ if the units in the data set were
inches$ the new units would be inches s"uared$ or s"uare inches. ,t is thus primarily of theoretical
importance and will not be considered further in this text$ except in passing.
,f the data set comprises the whole population$ then the population standard deviation$
denoted 6 *the lower case reek letter sigma)$ and its s"uare$ the population variance 6 !
$ are definedas follows.
9ote that the denominator in the fraction is the full number of observations$ not that number
reduced by one$ as is the case with the sample standard deviation. +ince most data sets are samples$
we will always work with the sample standard deviation and variance.
/inally$ in many real-life situations the most important statistical issues have to do with comparingthe means and standard deviations of two data sets. /igure !.11 02ifference between Two 2ata
+ets0 illustrates how a difference in one or both of the sample mean and the sample standard
deviation are reflected in the appearance of the data set as shown by the curves derived from the
relative fre"uency histograms built using the data.
Saylor URL: http://www.saylor.org/books Saylor.org09
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 60/723
!igure &."" (ifference between Two (ata %ets
IEJ &AIEABAJ
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 61/723
&he ra(ge the sta(*ar* *e4)at)o( a(* the 4ar)a(ce each g)4e a <,a(t)tat)4e a(swer
to the <,est)o( @ow 4ar)able are the *ataC
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 62/723
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 63/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 64/723
!A/:+ (ATA S+T ++/ C&S+S
19 Large ata Set 1 l)sts the SA& scores a(* !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a %o-p,te the ra(ge a(* sa-ple sta(*ar* *e4)at)o( of the 1;;; SA& scores.
b %o-p,te the ra(ge a(* sa-ple sta(*ar* *e4)at)o( of the 1;;; !PAs.
2; Large ata Set 1 l)sts the SA& scores of 1;;; st,*e(ts.
http://www.1.7ls
a Regar* the *ata as ar)s)(g fro- a ce(s,s of all st,*e(ts at a h)gh school )( wh)ch the
SA& score of e4ery st,*e(t was -eas,re*. %o-p,te the pop,lat)o( ra(ge a(*
pop,lat)o( sta(*ar* *e4)at)o( σ .
b Regar* the +rst 20 obser4at)o(s as a ra(*o- sa-ple *raw( fro- th)s pop,lat)o(.
%o-p,te the sa-ple ra(ge a(* sa-ple sta(*ar* *e4)at)o( s a(* co-pare the- to the
pop,lat)o( ra(ge a(* σ .
c Regar* the (e7t 20 obser4at)o(s as a ra(*o- sa-ple *raw( fro- th)s pop,lat)o(.
%o-p,te the sa-ple ra(ge a(* sa-ple sta(*ar* *e4)at)o( s a(* co-pare the- to the
pop,lat)o( ra(ge a(* σ .
21 Large ata Set 1 l)sts the !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a Regar* the *ata as ar)s)(g fro- a ce(s,s of all fresh-a( at a s-all college at the e(* of
the)r +rst aca*e-)c year of college st,*y )( wh)ch the !PA of e4ery s,ch perso( was
-eas,re*. %o-p,te the pop,lat)o( ra(ge a(* pop,lat)o( sta(*ar* *e4)at)o( σ .
b Regar* the +rst 20 obser4at)o(s as a ra(*o- sa-ple *raw( fro- th)s pop,lat)o(.
%o-p,te the sa-ple ra(ge a(* sa-ple sta(*ar* *e4)at)o( s a(* co-pare the- to the
pop,lat)o( ra(ge a(* σ .
c Regar* the (e7t 20 obser4at)o(s as a ra(*o- sa-ple *raw( fro- th)s pop,lat)o(.
%o-p,te the sa-ple ra(ge a(* sa-ple sta(*ar* *e4)at)o( s a(* co-pare the- to the
pop,lat)o( ra(ge a(* σ .
22 Large ata Sets A a(* # l)st the s,r4)4al t)-es )( *ays of 1; laboratory -)ce w)th
thy-)c le,ke-)a fro- o(set to *eath.
http://www..7lshttp://www.A.7ls
http://www.#.7ls
a %o-p,te the ra(ge a(* sa-ple sta(*ar* *e4)at)o( of s,r4)4al t)-e for all -)ce w)tho,t
regar* to ge(*er.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 65/723
b %o-p,te the ra(ge a(* sa-ple sta(*ar* *e4)at)o( of s,r4)4al t)-e for the 0 -ale -)ce
separately recor*e* )( Large ata Set A5.
c %o-p,te the ra(ge a(* sa-ple sta(*ar* *e4)at)o( of s,r4)4al t)-e for the 0 fe-ale
-)ce separately recor*e* )( Large ata Set #5. o yo, see a *)8ere(ce )( the res,lts
for -ale a(* fe-ale -)ceC oes )t appear to be s)g()+ca(tC
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 66/723
0.7 /elative 2osition o (ata
!+A/N&N: 1';+CT&<+S
1 &o lear( the co(cept of the relat)4e pos)t)o( of a( ele-e(t of a *ata set.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 67/723
2 &o lear( the -ea()(g of each of two -eas,res the perce(t)le ra(k a(* the z =
score of the relat)4e pos)t)o( of a -eas,re-e(t a(* how to co-p,te each o(e.
3 &o lear( the -ea()(g of the three <,art)les assoc)ate* to a *ata set a(* how to
co-p,te the-.
&o lear( the -ea()(g of the +4e=(,-ber s,--ary of a *ata set how to co(str,ctthe bo7 plot assoc)ate* to )t a(* how to )(terpret the bo7 plot.
(hen you take an exam$ what is often as important as your actual score on the exam is the way your
score compares to other students& performance. ,f you made a @5 but the average score *whether the
mean$ median$ or mode) was ?8$ you did relatively poorly. ,f you made a @5 but the average score
was only 88 then you did relatively well. ,n general$ the significance of one observed value in a data
set strongly depends on how that value compares to the other observed values in a data set.
Therefore we wish to attach to each observed value a number that measures its relative position.
2ercentiles and ?uartiles
%nyone who has taken a national standardi'ed test is familiar with the idea of being given both a score on
the exam and a Gpercentile rankingH of that score. Sou may be told that your score was ;!8 and that it is
the ?8th percentile. The first number tells how you actually did on the exam the second says that ?8B of
the scores on the exam were less than or e"ual to your score$ ;!8.
(e)nition
#iven an observed value x in a data set $ x is the &th percentile of the data if the percentage of the data
that are less than or equal to x is $. The number $ is the percentile ran' of x .
+A>2!+ %3
Bhat perce(t)le )s the 4al,e 1.39 )( the *ata set of te( !PAs co(s)*ere* )( Note 2.12
QE7a-ple 3Q )( Sect)o( 2.2 Qeas,res of %e(tral Locat)o(QC Bhat perce(t)le )s the
4al,e 3.33C
Sol,t)o(:
&he *ata wr)tte( )( )(creas)(g or*er are
1.39 1.76 1.90 2.12 2.53 2.71 3.00 3.33 3.71 4.00
&he o(ly *ata 4al,e that )s less tha( or e<,al to 1.39 )s 1.39 )tself. S)(ce 1 )s 11; .
1; or 1;G of 1; the 4al,e 1.39 )s the 1;th perce(t)le. E)ght *ata 4al,es are less tha(
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 68/723
or e<,al to 3.33. S)(ce 6 )s 61; .6; or 6;G of 1; the 4al,e 3.33 )s the 6;th
perce(t)le.
The $ th percentile cuts the data set in two so that approximately $ B of the data lie below it
and(100−P)B of the data lie above it. ,n particular$ the three percentiles that cut the data into
fourths$ as shown in /igure !.1! 02ata 2ivision by Auartiles0$ are called the quartiles. The following
simple computational definition of the three "uartiles works well in practice.
!igure &."& (ata (ivision by 7uartiles
(e)nition
!or any data set8
1 The second quartile Q2of the data set is its median.
! (efine two subsets8
1 the lower set8 all observations that are strictly less than Q2
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 69/723
! the upper set8 all observations that are strictly greater than Q2.
3 The first quartile Q1of the data set is the median of the lower set.
6 The third quartile Q3of the data set is the median of the upper set.
+A>2!+ %7
)(* the <,art)les of the *ata set of !PAs of Note 2.12 QE7a-ple 3Q )( Sect)o( 2.2
Qeas,res of %e(tral Locat)o(Q.
Sol,t)o(:
As )( the pre4)o,s e7a-ple we +rst l)st the *ata )( (,-er)cal or*er:
1.39 1.76 1.90 2.12 2.53 2.71 3.00 3.33 3.71 4.00
&h)s *ata set has n 1; obser4at)o(s. S)(ce 1; )s a( e4e( (,-ber the -e*)a( )s the
-ea( of the two -)**le obser4at)o(s: x=(2.53 + 2.71)/2=2.62. &h,s the seco(* <,art)le
)s Q2=2.62. &he lower a(* ,pper s,bsets are
Lower:L={1.39,1.76,1.90,2.12,2.53}
Upper:U={2.71,3.00,3.33,3.71,4.00}
Each has a( o** (,-ber of ele-e(ts so the -e*)a( of each )s )ts -)**le
obser4at)o(. &h,s the +rst <,art)le )s Q1=1.90 the -e*)a( of L a(* the th)r* <,art)le
)s Q3=3.33 the -e*)a( of U.
EKAPLE 10
A*>o)( the obser4at)o( 3.66 to the *ata set of the pre4)o,s e7a-ple a(* +(* the
<,art)les of the (ew set of *ata.
Sol,t)o(:
As )( the pre4)o,s e7a-ple we +rst l)st the *ata )( (,-er)cal or*er:
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 70/723
1.39 1.76 1.90 2.12 2.53 2.71 3.00 3.33 3.71 3.88 4.00
&h)s *ata set has 11 obser4at)o(s. &he seco(* <,art)le )s )ts -e*)a( the -)**le 4al,e
2.1. &h,s Q2=2.71. &he lower a(* ,pper s,bsets are (ow
Lower:L={1.39,1.76,1.90,2.12,2.53}
Upper: U= {3.00,3.33,3.71,3.88,4.00}
&he lower set L has -e*)a( the -)**le 4al,e 1.9; so Q1=1.90. &he ,pper set has
-e*)a( the -)**le 4al,e 3.1 so Q3=3.71.
,n addition to the three "uartiles$ the two extreme values$ the minimum x min and the maximum x max are
also useful in describing the entire data set. Together these five numbers are called the five(
number summary of the data set:
N x min$ 71$ 7!$ 73$ x max
The five-number summary is used to construct a bo) plot as in /igure !.13 0The 4ox lot0. ach of the
five numbers is represented by a vertical line segment$ a box is formed using the line segments
at Q! and Q$ as its two vertical sides$ and two hori'ontal line segments are extended from the vertical
segments marking Q! and Q$ to the ad#acent extreme values. *The two hori'ontal line segments are
referred to as Gwhiskers$H and the diagram is sometimes called a Gbox and whisker plot.H) (e caution the
reader that there are other types of box plots that differ somewhat from the ones we are constructing$
although all are based on the three "uartiles.
!igure &."* The 9ox $lot
9ote that the distance fromQ1toQ3is the length of the interval over which the middle half of the
data range. Thus it has the following special name.
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 71/723
e+()t)o(
The inter"uartile range *,AC) is the quantity
IQR=Q3−Q1
EKAPLE 1
%o(str,ct a bo7 plot a(* +(* the HR for the *ata )( Note 2. QE7a-ple 1Q.
Sol,t)o(:
ro- o,r work )( Note 2. QE7a-ple 1Q we k(ow that the +4e=(,-ber s,--ary )s
xmin=1.39 Q1=1.90 Q2=2.62 Q3=3.33 xmax=4.00
&he bo7 plot )s
&he )(ter<,art)le ra(ge )s IQR=3.33−1.90=1.43.
z-scores
%nother way to locate a particular observation x in a data set is to compute its distance from the mean in
units of standard deviation.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 72/723
The formulas in the definition allow us to compute the z -score when x is known. ,f the z -score is
known then x can be recovered using the corresponding inverse formulas
x=(x −)+sz or x=µ+σz
The z -score indicates how many standard deviations an individual observation x is from the center of
the data set$ its mean. ,f z is negative then x is below average. ,f z is 5 then x is e"ual to the average.
,f z is positive then x is above average. +ee /igure !.16.
!igure &." x :%cale versus z :%core
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 73/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 74/723
+A>2!+ %5
S,ppose the -ea( a(* sta(*ar* *e4)at)o( of the !PAs of all c,rre(tly reg)stere*
st,*e(ts at a college are μ 2.; a(* σ ;.0;. &he z =scores of the !PAs of two
st,*e(ts A(to()o a(* #eatr)ce are z=−0.62a(* z 1.26 respect)4ely. Bhat are the)r
!PAsC
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 75/723
Sol,t)o(:
Us)(g the seco(* for-,la r)ght after the *e+()t)o( of z =scores we co-p,te the !PAs
as
Antonio:x=µ+z σ=2.70+(−0.62)(0.50)=2.39
Beatrice:x=µ+z σ=2.70+(1.28)(0.50)=3.34
*+, TA*+AA,S
• &he perce(t)le ra(k a(* z =score of a -eas,re-e(t )(*)cate )ts relat)4e pos)t)o(
w)th regar* to the other -eas,re-e(ts )( a *ata set.
• &he three <,art)les *)4)*e a *ata set )(to fo,rths.
• &he +4e=(,-ber s,--ary a(* )ts assoc)ate* bo7 plot s,--ar)Ve the locat)o(
a(* *)str)b,t)o( of the *ata.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 76/723
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 77/723
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 78/723
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 79/723
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 80/723
Saylor URL: http://www.saylor.org/books Saylor.org6;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 81/723
Saylor URL: http://www.saylor.org/books Saylor.org61
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 82/723
Saylor URL: http://www.saylor.org/books Saylor.org62
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 83/723
Saylor URL: http://www.saylor.org/books Saylor.org63
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 84/723
30
E-)l)a a(* er*)(a(* took the sa-e fresh-a( che-)stry co,rse E-)l)a )( the fall
er*)(a(* )( the spr)(g. E-)l)a -a*e a( 63 o( the co--o( +(al e7a- that she took
o( wh)ch the -ea( was a(* the sta(*ar* *e4)at)o( 6. er*)(a(* -a*e a 9 o( the
co--o( +(al e7a- that he took wh)ch was -ore *)?c,lt s)(ce the -ea( was 0
a(* the sta(*ar* *e4)at)o( 12. &he o(e who has a h)gher z =score *)* relat)4ely better.
Bas )t E-)l)a or er*)(a(*C
3 Refer to the pre4)o,s e7erc)se. "( the +(al e7a- )( the sa-e co,rse the follow)(g
se-ester the -ea( )s 6 a(* the sta(*ar* *e4)at)o( )s 9. Bhat gra*e o( the e7a-
-atches E-)l)aWs perfor-a(ceC er*)(a(*WsC
3 Rose(cra(tV a(* !,)l*e(ster( are o( a we)ght=re*,c)(g *)et. Rose(cra(tV who we)ghs
16 lb belo(gs to a( age a(* bo*y=type gro,p for wh)ch the -ea( we)ght )s 10 lb a(*
the sta(*ar* *e4)at)o( )s 10 lb. !,)l*e(ster( who we)ghs 2; lb belo(gs to a( age a(*
bo*y=type gro,p for wh)ch the -ea( we)ght )s 10 lb a(* the sta(*ar* *e4)at)o( )s 2;
lb. Ass,-)(g z =scores are goo* -eas,res for co-par)so( )( th)s co(te7t who )s -ore
o4erwe)ght for h)s age a(* bo*y typeC
LAR!E A&A SE& EKER%SES
36 Large ata Set 1 l)sts the SA& scores a(* !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a %o-p,te the three <,art)les a(* the )(ter<,art)le ra(ge of the 1;;; SA& scores.
b %o-p,te the three <,art)les a(* the )(ter<,art)le ra(ge of the 1;;; !PAs.
39 Large ata Set 1; recor*s the scores of 2 st,*e(ts o( a stat)st)cs e7a-.
http://www.1;.7ls
a %o-p,te the +4e=(,-ber s,--ary of the *ata.
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 85/723
b escr)be )( wor*s the perfor-a(ce of the class o( the e7a- )( the l)ght of the res,lt )(
part a5.
; Large ata Sets 3 a(* 3A l)st the he)ghts of 1 c,sto-ers e(ter)(g a shoe store.
http://www.3.7ls
http://www.3A.7ls
a %o-p,te the +4e=(,-ber s,--ary of the he)ghts w)tho,t regar* to ge(*er.
b %o-p,te the +4e=(,-ber s,--ary of the he)ghts of the -e( )( the sa-ple.
c %o-p,te the +4e=(,-ber s,--ary of the he)ghts of the wo-e( )( the sa-ple.
1 Large ata Sets A a(* # l)st the s,r4)4al t)-es )( *ays of 1; laboratory -)ce w)th
thy-)c le,ke-)a fro- o(set to *eath.
http://www..7ls
http://www.A.7ls
http://www.#.7ls
a %o-p,te the three <,art)les a(* the )(ter<,art)le ra(ge of the s,r4)4al t)-es for all
-)ce w)tho,t regar* to ge(*er.
b %o-p,te the three <,art)les a(* the )(ter<,art)le ra(ge of the s,r4)4al t)-es for the 0
-ale -)ce separately recor*e* )( Large ata Set A5.
c %o-p,te the three <,art)les a(* the )(ter<,art)le ra(ge of the s,r4)4al t)-es for the 0
fe-ale -)ce separately recor*e* )( Large ata Set #5.
Saylor URL: http://www.saylor.org/books Saylor.org60
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 86/723
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 87/723
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 88/723
0.8 The +mpirical /ule and Chebyshev#sTheorem
!+A/N&N: 1';+CT&<+S
1 &o lear( what the 4al,e of the sta(*ar* *e4)at)o( of a *ata set )-pl)es abo,t how
the *ata scatter away fro- the -ea( as *escr)be* by the E-p)r)cal R,le a(*
%hebyshe4Ws &heore-.
2 &o ,se the E-p)r)cal R,le a(* %hebyshe4Ws &heore- to *raw co(cl,s)o(s abo,t a
*ata set.
Sou probably have a good intuitive grasp of what the average of a data set says about that data set. ,n
this section we begin to learn what the standard deviation has to tell us about the nature of the data
set.
The +mpirical /ule
(e start by examining a specific set of data. Table !.! 0Eeights of en0 shows the heights in inches of 155
randomly selected adult men. % relative fre"uency histogram for the data is shown in /igure !.18 0Eeights
of %dult en0. The mean and standard deviation of the data are$ rounded to two decimal
places$ x−=69.92 and s M 1.@5. ,f we go through the data and count the number of observations that are
within one standard deviation of the mean$ that is$ that are
between69.92−1.70=68.22and69.92+1.70=71.62inches$ there are ;F of them. ,f we count the number
of observations that are within two standard deviations of the mean$ that is$ that are
bet*een69.92−2(1.70)=66.52and69.92+2(1.70)=73.32inches$ there are F8 of them. %ll of the
measurements are within three standard deviations of the mean$ that is$
between69.92−3(1.70)=64.822and69.92+3(1.70)=75.02inches. These tallies are not coincidences$ but
are in agreement with the following result that has been found to be widely applicable.
Table !.! Eeights of en
68.7 72.3 71.3 72.5 70.6 68.2 70.1 68.4 68.6 70.6
73.7 70.5 71.0 70.9 69.3 69.4 69.7 69.1 71.5 68.6
70.9 70.0 70.4 68.9 69.4 69.4 69.2 70.7 70.5 69.9
69.8 69.8 68.6 69.5 71.6 66.2 72.4 70.7 67.7 69.1
Saylor URL: http://www.saylor.org/books Saylor.org66
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 89/723
68.8 69.3 68.9 74.8 68.0 71.2 68.3 70.2 71.9 70.4
71.9 72.2 70.0 68.7 67.9 71.1 69.0 70.8 67.3 71.8
70.3 68.8 67.2 73.0 70.4 67.8 70.0 69.5 70.1 72.0
72.2 67.6 67.0 70.3 71.2 65.6 68.1 70.8 71.4 70.2
70.1 67.5 71.3 71.5 71.0 69.1 69.5 71.1 66.8 71.8
69.6 72.7 72.8 69.6 65.9 68.0 69.7 68.7 69.8 69.7
!igure &." +eights of Adult 2en
&he E-p)r)cal R,le
,f a data set has an approximately bell-shaped relative fre"uency histogram$ then *see /igure !.1; 0The
mpirical Cule0)
1 approximately ;?B of the data lie within one standard deviation of the mean$ that is$ in the interval
with endpoints x −±sfor samples and with endpointsµ±σfor populations
! approximately F8B of the data lie within two standard deviations of the mean$ that is$ in the interval
with endpoints x −±2sfor samples and with endpointsµ±2σfor populations and
Saylor URL: http://www.saylor.org/books Saylor.org69
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 90/723
3 approximately FF.@B of the data lies within three standard deviations of the mean$ that is$ in the
interval with endpoints x −±3sfor samples and with endpointsµ±3σfor populations.
!igure &."/ The ;mpirical -ule
Saylor URL: http://www.saylor.org/books Saylor.org9;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 91/723
Two key points in regard to the mpirical Cule are that the data distribution must be approximately bell:
shaped and that the percentages are only approximately true. The mpirical Cule does not apply to data
sets with severely asymmetric distributions$ and the actual percentage of observations in any of the
intervals specified by the rule could be either greater or less than those given in the rule. (e see this with
the example of the heights of the men: the mpirical Cule suggested ;? observations between ;?.!! and
@1.;! inches but we counted ;F.
Saylor URL: http://www.saylor.org/books Saylor.org91
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 92/723
Saylor URL: http://www.saylor.org/books Saylor.org92
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 93/723
Figure 2.17Distribution of eig!ts
EKAPLE 2;
Scores o( H tests ha4e a bell=shape* *)str)b,t)o( w)th -ea( μ 1;; a(* sta(*ar*
*e4)at)o( σ 1;. )sc,ss what the E-p)r)cal R,le )-pl)es co(cer()(g )(*)4)*,als w)th
H scores of 11; 12; a(* 13;.
Sol,t)o(:
A sketch of the H *)str)b,t)o( )s g)4e( )( )g,re 2.16 Q)str)b,t)o( of H ScoresQ. &he
E-p)r)cal R,le states that
1 appro7)-ately 6G of the H scores )( the pop,lat)o( l)e betwee( 9; a(* 11;
2 appro7)-ately 90G of the H scores )( the pop,lat)o( l)e betwee( 6; a(* 12;
a(*
3 appro7)-ately 99.G of the H scores )( the pop,lat)o( l)e betwee( ; a(* 13;.
Saylor URL: http://www.saylor.org/books Saylor.org93
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 94/723
Figure 2.1"Distribution of #$ %cores
S)(ce 6G of the H scores l)e &it!in the )(ter4al fro- 9; to 11; )t -,st be the case
that 32G l)e outside that )(ter4al. #y sy--etry appro7)-ately half of that 32G or
1G of all H scores w)ll l)e abo4e 11;. f 1G l)e abo4e 11; the( 6G l)e below. Be
co(cl,*e that the H score 11; )s the 6th perce(t)le.
&he sa-e a(alys)s appl)es to the score 12;. S)(ce appro7)-ately 90G of all H scores
l)e w)th)( the )(ter4al for- 6; to 12; o(ly 0G l)e o,ts)*e )t a(* half of the- or 2.0G
of all scores are abo4e 12;. &he H score 12; )s th,s h)gher tha( 9.0G of all H
scores a(* )s <,)te a h)gh score.
#y a s)-)lar arg,-e(t o(ly 10/1;; of 1G of all a*,lts or abo,t o(e or two )( e4ery
tho,sa(* wo,l* ha4e a( H score abo4e 13;. &h)s fact -akes the score 13;
e7tre-ely h)gh.
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 95/723
Chebyshev#s Theorem
The mpirical Cule does not apply to all data sets$ only to those that are bell-shaped$ and even then is
stated in terms of approximations. % result that applies to every data set is known as 7hebyshev&s
Theorem.
%hebyshe4Ws &heore-
/or any numerical data set$
1 at least 3I6 of the data lie within two standard deviations of the mean$ that is$ in the interval with
endpoints x −±2sfor samples and with endpointsµ±2σfor populations
! at least ?IF of the data lie within three standard deviations of the mean$ that is$ in the interval with
endpoints x −±3sfor samples and with endpointsµ±3σfor populations
3 at least1−1/k2 of the data lie within k standard deviations of the mean$ that is$ in the interval with
endpoints x −±ksfor samples and with endpointsµ±kσfor populations$ where k is any positive whole
number that is greater than 1.
/igure !.1F 07hebyshev&s Theorem0 gives a visual illustration of 7hebyshev&s Theorem.
igure &."4 <hebyshev=s Theorem
Saylor URL: http://www.saylor.org/books Saylor.org90
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 96/723
,t is important to pay careful attention to the words Gat leastH at the beginning of each of the three parts.
The theorem gives the minimum proportion of the data which must lie within a given number of standard
deviations of the mean the true proportions found within the indicated regions could be greater than
what the theorem guarantees.
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 97/723
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 98/723
+A>2!+ 00
&he (,-ber of 4eh)cles pass)(g thro,gh a b,sy )(tersect)o( betwee( 6:;; a.-.
a(* 1;:;; a.-. was obser4e* a(* recor*e* o( e4ery week*ay -or()(g of the last
year. &he *ata set co(ta)(s n 201 (,-bers. &he sa-ple -ea( )s x −=725a(* the
sa-ple sta(*ar* *e4)at)o( )s s 20. *e(t)fy wh)ch of the follow)(g
state-e(ts must be tr,e.
1 "( appro7)-ately 90G of the week*ay -or()(gs last year the (,-ber of 4eh)cles
pass)(g thro,gh the )(tersect)o( fro- 6:;; a.-. to 1;:;; a.-. was betwee( 0
a(* 0.
2 "( at least 0G of the week*ay -or()(gs last year the (,-ber of 4eh)cles
pass)(g thro,gh the )(tersect)o( fro- 6:;; a.-. to 1;:;; a.-. was betwee( 0
a(* 0.
3 "( at least 169 week*ay -or()(gs last year the (,-ber of 4eh)cles pass)(g
thro,gh the )(tersect)o( fro- 6:;; a.-. to 1;:;; a.-. was betwee( 0 a(* 0.
"( at -ost 20G of the week*ay -or()(gs last year the (,-ber of 4eh)cles
pass)(g thro,gh the )(tersect)o( fro- 6:;; a.-. to 1;:;; a.-. was e)ther less
tha( 0 or greater tha( 0.
0 "( at -ost 12.0G of the week*ay -or()(gs last year the (,-ber of 4eh)cles
pass)(g thro,gh the )(tersect)o( fro- 6:;; a.-. to 1;:;; a.-. was less tha( 0.
"( at -ost 20G of the week*ay -or()(gs last year the (,-ber of 4eh)cles
pass)(g thro,gh the )(tersect)o( fro- 6:;; a.-. to 1;:;; a.-. was less tha( 0.
Sol,t)o(:
1 S)(ce )t )s (ot state* that the relat)4e fre<,e(cy h)stogra- of the *ata )s bell=
shape* the E-p)r)cal R,le *oes (ot apply. State-e(t 15 )s base* o( the
E-p)r)cal R,le a(* therefore )t -)ght (ot be correct.
2 State-e(t 25 )s a *)rect appl)cat)o( of part 15 of %hebyshe4Ws &heore-
beca,se (x −−2s,x −+2s)=(675,775).t -,st be correct.
3 State-e(t 35 says the sa-e th)(g as state-e(t 25 beca,se 0G of 201 )s
166.20 so the -)()-,- whole (,-ber of obser4at)o(s )( th)s )(ter4al )s 169.
&h,s state-e(t 35 )s *e+()tely correct.
State-e(t 5 says the sa-e th)(g as state-e(t 25 b,t )( *)8ere(t wor*s a(*
therefore )s *e+()tely correct.
Saylor URL: http://www.saylor.org/books Saylor.org96
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 99/723
0 State-e(t 5 wh)ch )s *e+()tely correct states that at -ost 20G of the t)-e
e)ther fewer tha( 0 or -ore tha( 0 4eh)cles passe* thro,gh the )(tersect)o(.
State-e(t 05 says that half of that 20G correspo(*s to *ays of l)ght tra?c. &h)s
wo,l* be correct )f the relat)4e fre<,e(cy h)stogra- of the *ata were k(ow( to be
sy--etr)c. #,t th)s )s (ot state*D perhaps all of the obser4at)o(s o,ts)*e the
)(ter4al 005 are less tha( 0. &h,s state-e(t 05 -)ght (ot be correct
State-e(t 5 )s *e+()tely correct a(* state-e(t 5 )-pl)es state-e(t 5: e4e(
)f e4ery -eas,re-e(t that )s o,ts)*e the )(ter4al 005 )s less tha( 0
wh)ch )s co(ce)4able s)(ce sy--etry )s (ot k(ow( to hol*5 e4e( so at -ost
20G of all obser4at)o(s are less tha( 0. &h,s state-e(t 5 -,st *e+()tely be
correct.
*+, TA*+AA,S
• &he E-p)r)cal R,le )s a( appro7)-at)o( that appl)es o(ly to *ata sets w)th a bell=
shape* relat)4e fre<,e(cy h)stogra-. t est)-ates the proport)o( of the
-eas,re-e(ts that l)e w)th)( o(e two a(* three sta(*ar* *e4)at)o(s of the
-ea(.
• %hebyshe4Ws &heore- )s a fact that appl)es to all poss)ble *ata sets. t *escr)bes
the -)()-,- proport)o( of the -eas,re-e(ts that l)e -,st w)th)( o(e two or
-ore sta(*ar* *e4)at)o(s of the -ea(.
++/C&S+S
'AS&C
1 State the E-p)r)cal R,le.
2 escr)be the co(*)t)o(s ,(*er wh)ch the E-p)r)cal R,le -ay be appl)e*.
3 State %hebyshe4Ws &heore-.
escr)be the co(*)t)o(s ,(*er wh)ch %hebyshe4Ws &heore- -ay be appl)e*.
0 A sa-ple *ata set w)th a bell=shape* *)str)b,t)o( has -ea( x −=6a(* sta(*ar*
*e4)at)o( s 2. )(* the appro7)-ate proport)o( of obser4at)o(s )( the *ata set that l)e:a betwee( a(* 6D
b betwee( 2 a(* 1;D
c betwee( ; a(* 12.
A pop,lat)o( *ata set w)th a bell=shape* *)str)b,t)o( has -ea( μ a(* sta(*ar*
*e4)at)o( σ 2. )(* the appro7)-ate proport)o( of obser4at)o(s )( the *ata set that l)e:
a betwee( a(* 6D
Saylor URL: http://www.saylor.org/books Saylor.org99
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 100/723
b betwee( 2 a(* 1;D
c betwee( ; a(* 12.
A pop,lat)o( *ata set w)th a bell=shape* *)str)b,t)o( has -ea( μ 2 a(* sta(*ar*
*e4)at)o( σ 1.1. )(* the appro7)-ate proport)o( of obser4at)o(s )( the *ata set that
l)e:
a abo4e 2D
b abo4e 3.1D
c betwee( 2 a(* 3.1.6 A sa-ple *ata set w)th a bell=shape* *)str)b,t)o( has -ea( x−=2a(* sta(*ar*
*e4)at)o( s 1.1. )(* the appro7)-ate proport)o( of obser4at)o(s )( the *ata set
that l)e:
a below X;.2D
b below 3.1Dc betwee( X1.3 a(* ;.9.
9 A pop,lat)o( *ata set w)th a bell=shape* *)str)b,t)o( a(* s)Ve ' 0;; has -ea( μ
2 a(* sta(*ar* *e4)at)o( σ 1.1. )(* the appro7)-ate (,-ber of obser4at)o(s )(
the *ata set that l)e:
a abo4e 2D
b abo4e 3.1D
c betwee( 2 a(* 3.1.
1; A sa-ple *ata set w)th a bell=shape* *)str)b,t)o( a(* s)Ve n 126 has -ea( x −=2a(*
sta(*ar* *e4)at)o( s 1.1. )(* the appro7)-ate (,-ber of obser4at)o(s )( the *ata
set that l)e:
a below X;.2D
b below 3.1D
c betwee( X1.3 a(* ;.9.
11 A sa-ple *ata set has -ea( x −=6a(* sta(*ar* *e4)at)o( s 2. )(* the -)()-,-
proport)o( of obser4at)o(s )( the *ata set that -,st l)e:
a betwee( 2 a(* 1;D
b betwee( ; a(* 12D
c betwee( a(* 6.
12 A pop,lat)o( *ata set has -ea( μ 2 a(* sta(*ar* *e4)at)o( σ 1.1. )(* the
-)()-,- proport)o( of obser4at)o(s )( the *ata set that -,st l)e:
a betwee( X;.2 a(* .2D
b betwee( X1.3 a(* 0.3.
Saylor URL: http://www.saylor.org/books Saylor.org1;;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 101/723
13 A pop,lat)o( *ata set of s)Ve ' 0;; has -ea( μ 0.2 a(* sta(*ar* *e4)at)o( σ
1.1. )(* the -)()-,- (,-ber of obser4at)o(s )( the *ata set that -,st l)e:
a betwee( 3 a(* .D
b betwee( 1.9 a(* 6.0.
1 A sa-ple *ata set of s)Ve n 126 has -ea(x −=2a(* sta(*ar* *e4)at)o( s 2. )(*
the -)()-,- (,-ber of obser4at)o(s )( the *ata set that -,st l)e:
a betwee( X2 a(* )(cl,*)(g X2 a(* 5D
b betwee( X a(* 6 )(cl,*)(g X a(* 65.
10 A sa-ple *ata set of s)Ve n 3; has -ea( x −=6a(* sta(*ar* *e4)at)o( s 2.
a Bhat )s the -a7)-,- proport)o( of obser4at)o(s )( the *ata set that ca( l)e
o,ts)*e the )(ter4al 21;5C
b Bhat ca( be sa)* abo,t the proport)o( of obser4at)o(s )( the *ata set that are
below 2C
c Bhat ca( be sa)* abo,t the proport)o( of obser4at)o(s )( the *ata set that are
abo4e 1;C
* Bhat ca( be sa)* abo,t the (,-ber of obser4at)o(s )( the *ata set that are
abo4e 1;C
1 A pop,lat)o( *ata set has -ea( μ 2 a(* sta(*ar* *e4)at)o( σ 1.1.
a Bhat )s the -a7)-,- proport)o( of obser4at)o(s )( the *ata set that ca( l)e
o,ts)*e the )(ter4al (−1.3,5.3)C
b Bhat ca( be sa)* abo,t the proport)o( of obser4at)o(s )( the *ata set that are
below X1.3C
c Bhat ca( be sa)* abo,t the proport)o( of obser4at)o(s )( the *ata set that are
abo4e 0.3C
A22!&CAT&1NS
1 Scores o( a +(al e7a- take( by 12;; st,*e(ts ha4e a bell=shape* *)str)b,t)o( w)th
-ea( 2 a(* sta(*ar* *e4)at)o( 9.
a Bhat )s the -e*)a( score o( the e7a-C
b Abo,t how -a(y st,*e(ts score* betwee( 3 a(* 61C
c Abo,t how -a(y st,*e(ts score* betwee( 2 a(* 9;C
* Abo,t how -a(y st,*e(ts score* below 0C
Saylor URL: http://www.saylor.org/books Saylor.org1;1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 102/723
16 Le(gths of +sh ca,ght by a co--erc)al +sh)(g boat ha4e a bell=shape* *)str)b,t)o( w)th
-ea( 23 )(ches a(* sta(*ar* *e4)at)o( 1.0 )(ches.
a Abo,t what proport)o( of all +sh ca,ght are betwee( 2; )(ches a(* 2 )(ches lo(gC
b Abo,t what proport)o( of all +sh ca,ght are betwee( 2; )(ches a(* 23 )(ches lo(gC
c Abo,t how lo(g )s the lo(gest +sh ca,ght o(ly a s-all fract)o( of a perce(t are lo(ger5C
19 @ockey p,cks ,se* )( profess)o(al hockey ga-es -,st we)gh betwee( 0.0 a(*
o,(ces. f the we)ght of p,cks -a(,fact,re* by a part)c,lar process )s bell=shape* has
-ea( 0.0 o,(ces a(* sta(*ar* *e4)at)o( ;.120 o,(ce what proport)o( of the p,cks
w)ll be ,sable )( profess)o(al ga-esC
2; @ockey p,cks ,se* )( profess)o(al hockey ga-es -,st we)gh betwee( 0.0 a(*
o,(ces. f the we)ght of p,cks -a(,fact,re* by a part)c,lar process )s bell=shape* a(*
has -ea( 0.0 o,(ces how large ca( the sta(*ar* *e4)at)o( be )f 99.G of the p,cks
are to be ,sable )( profess)o(al ga-esC
21 Spee*s of 4eh)cles o( a sect)o( of h)ghway ha4e a bell=shape* *)str)b,t)o( w)th -ea(
; -ph a(* sta(*ar* *e4)at)o( 2.0 -ph.
a f the spee* l)-)t )s 00 -ph abo,t what proport)o( of 4eh)cles are spee*)(gC
b Bhat )s the -e*)a( spee* for 4eh)cles o( th)s h)ghwayC
c Bhat )s the perce(t)le ra(k of the spee* 0 -phC
* Bhat spee* correspo(*s to the 1th perce(t)leC
22 S,ppose that as )( the pre4)o,s e7erc)se spee*s of 4eh)cles o( a sect)o( of h)ghway
ha4e -ea( ; -ph a(* sta(*ar* *e4)at)o( 2.0 -ph b,t (ow the *)str)b,t)o( of
spee*s )s ,(k(ow(.
a f the spee* l)-)t )s 00 -ph at least what proport)o( of 4eh)cles -,st
spee*)(gC
b Bhat ca( be sa)* abo,t the proport)o( of 4eh)cles go)(g 0 -ph or fasterC
23 A( )(str,ctor a((o,(ces to the class that the scores o( a rece(t e7a- ha* a bell=
shape* *)str)b,t)o( w)th -ea( 0 a(* sta(*ar* *e4)at)o( 0.
a Bhat )s the -e*)a( scoreC
Saylor URL: http://www.saylor.org/books Saylor.org1;2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 103/723
b Appro7)-ately what proport)o( of st,*e(ts )( the class score* betwee( ; a(*
6;C
c Appro7)-ately what proport)o( of st,*e(ts )( the class score* abo4e 60C
* Bhat )s the perce(t)le ra(k of the score 60C
2 &he !PAs of all c,rre(tly reg)stere* st,*e(ts at a large ,()4ers)ty ha4e a bell=shape*
*)str)b,t)o( w)th -ea( 2. a(* sta(*ar* *e4)at)o( ;.. St,*e(ts w)th a !PA below 1.0
are place* o( aca*e-)c probat)o(. Appro7)-ately what perce(tage of c,rre(tly
reg)stere* st,*e(ts at the ,()4ers)ty are o( aca*e-)c probat)o(C
20 &h)rty=s)7 st,*e(ts took a( e7a- o( wh)ch the a4erage was 6; a(* the sta(*ar*
*e4)at)o( was . A r,-or says that +4e st,*e(ts ha* scores 1 or below. %a( the
r,-or be tr,eC Bhy or why (otC
Saylor URL: http://www.saylor.org/books Saylor.org1;3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 104/723
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 105/723
Saylor URL: http://www.saylor.org/books Saylor.org1;0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 106/723
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 107/723
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 108/723
Chapter 3
'asic Concepts o 2robability
+uppose a polling organi'ation "uestions 1$!55 voters in order to estimate the proportion of all
voters who favor a particular bond issue. (e would expect the proportion of the 1$!55 voters in the
survey who are in favor to be close to the proportion of all voters who are in favor$ but this need not
be true. There is a degree of randomness associated with the survey result. ,f the survey result is
highly likely to be close to the true proportion$ then we have confidence in the survey result. ,f it is
not particularly likely to be close to the population proportion$ then we would perhaps not take the
survey result too seriously. The likelihood that the survey proportion is close to the population
proportion determines our confidence in the survey result. /or that reason$ we would like to be able
to compute that likelihood. The task of computing it belongs to the realm of probability$ which we
study in this chapter.
3.% Sample Spaces@ +vents@ and Their 2robabilities
LEARNN! "#$E%&'ES
1 &o lear( the co(cept of the sa-ple space assoc)ate* w)th a ra(*o- e7per)-e(t.
2 &o lear( the co(cept of a( e4e(t assoc)ate* w)th a ra(*o- e7per)-e(t.
3 &o lear( the co(cept of the probab)l)ty of a( e4e(t.
Sample Spaces and +vents
Colling an ordinary six-sided die is a familiar example of a random experiment $ an action for which all
possible outcomes can be listed$ but for which the actual outcome on any given trial of the experiment
cannot be predicted with certainty. ,n such a situation we wish to assign to each outcome$ such as rolling a
two$ a number$ called the probability of the outcome$ that indicates how likely it is that the outcome will
occur. +imilarly$ we would like to assign a probability to any event $ or collection of outcomes$ such as
rolling an even number$ which indicates how likely it is that the event will occur if the experiment is
Saylor URL: http://www.saylor.org/books Saylor.org1;6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 109/723
performed. This section provides a framework for discussing probability problems$ using the terms #ust
mentioned.
e+()t)o(
A random e)periment is a mechanism that produces a definite outcome that cannot be predicted
with certainty. The sample space associated with a random experiment is the set of all possible
outcomes. An event is a subset of the sample space.
e+()t)o(
An event ; is said to occur on a particular trial of the experiment if the outcome observed is an element
of the set ; .
+A>2!+ %
%o(str,ct a sa-ple space for the e7per)-e(t that co(s)sts of toss)(g a s)(gle co)(.
Sol,t)o(:
&he o,tco-es co,l* be labele* ! for hea*s a(* t for ta)ls. &he( the sa-ple space )s
the set S={h,t}.
+A>2!+ 0
%o(str,ct a sa-ple space for the e7per)-e(t that co(s)sts of roll)(g a s)(gle *)e. )(*
the e4e(ts that correspo(* to the phrases a( e4e( (,-ber )s rolle* a(* a (,-ber
greater tha( two )s rolle*.
Sol,t)o(:
&he o,tco-es co,l* be labele* accor*)(g to the (,-ber of *ots o( the top face of the
*)e. &he( the sa-ple space )s the set S={1,2,3,4,5,6}.
Saylor URL: http://www.saylor.org/books Saylor.org1;9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 110/723
&he o,tco-es that are e4e( are 2 a(* so the e4e(t that correspo(*s to the
phrase a( e4e( (,-ber )s rolle* )s the set M2 wh)ch )t )s (at,ral to *e(ote by
the letter (. Be wr)te E={2,4,6}.
S)-)larly the e4e(t that correspo(*s to the phrase a (,-ber greater tha( two )s
rolle* )s the set T={3,4,5,6}@ wh)ch we ha4e *e(ote* ) .
% graphical representation of a sample space and events is a +enn diagram$ as shown in /igure
3.1 0Denn 2iagrams for Two +ample +paces0 for 9ote 3.; 0xample 10 and 9ote 3.@ 0xample !0.
,n general the sample space % is represented by a rectangle$ outcomes by points within the
rectangle$ and events by ovals that enclose the outcomes that compose them.
ure *." 0enn (iagrams for Two %ample %paces
+A>2!+ 3
A ra(*o- e7per)-e(t co(s)sts of toss)(g two co)(s.
a. %o(str,ct a sa-ple space for the s)t,at)o( that the co)(s are )(*)st)(g,)shable
s,ch as two bra(* (ew pe(()es.
b. %o(str,ct a sa-ple space for the s)t,at)o( that the co)(s are *)st)(g,)shable s,ch as
o(e a pe((y a(* the other a ()ckel.
Sol,t)o(:
a. After the co)(s are tosse* o(e sees e)ther two hea*s wh)ch co,l* be labele* 2htwo ta)ls wh)ch co,l* be labele* 2t or co)(s that *)8er wh)ch co,l* be labele* d. &h,s a
sa-ple space )s S={2h,2t,d}.
b. S)(ce we ca( tell the co)(s apart there are (ow two ways for the co)(s to *)8er: the
pe((y hea*s a(* the ()ckel ta)ls or the pe((y ta)ls a(* the ()ckel hea*s. Be ca(
label each o,tco-e as a pa)r of letters the +rst of wh)ch )(*)cates how the pe((y
Saylor URL: http://www.saylor.org/books Saylor.org11;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 111/723
la(*e* a(* the seco(* of wh)ch )(*)cates how the ()ckel la(*e*. A sa-ple space )s
the( S′={hh,ht,th,tt}.
% device that can be helpful in identifying all possible outcomes of a random experiment$ particularly one
that can be viewed as proceeding in stages$ is what is called a tree diagram. ,t is described in the
following example.
EKAPLE
%o(str,ct a sa-ple space that *escr)bes all three=ch)l* fa-)l)es accor*)(g to the
ge(*ers of the ch)l*re( w)th respect to b)rth or*er.
Sol,t)o(:
&wo of the o,tco-es are two boys the( a g)rl wh)ch we -)ght *e(ote bbg a(* a
g)rl the( two boys wh)ch we wo,l* *e(ote gbb.%learly there are -a(y o,tco-es
a(* whe( we try to l)st all of the- )t co,l* be *)?c,lt to be s,re that we ha4e
fo,(* the- all ,(less we procee* syste-at)cally. &he tree *)agra- show(
)()g,re 3.2 Q&ree )agra- or &hree=%h)l* a-)l)esQ g)4es a syste-at)c
approach.
Figure *.2)ree Diagram For )!ree+,!ild Families
Saylor URL: http://www.saylor.org/books Saylor.org111
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 112/723
&he *)agra- was co(str,cte* as follows. &here are two poss)b)l)t)es for the +rst
ch)l* boy or g)rl so we *raw two l)(e seg-e(ts co-)(g o,t of a start)(g po)(t
o(e e(*)(g )( a b for boy a(* the other e(*)(g )( a g for g)rl. or each of
these two poss)b)l)t)es for the +rst ch)l* there are two poss)b)l)t)es for the seco(*
ch)l* boy or g)rl so fro- each of the b a(* g we *raw two l)(e seg-e(ts o(e
seg-e(t e(*)(g )( a b a(* o(e )( a g. or each of the fo,r e(*)(g po)(ts (ow )(
the *)agra- there are two poss)b)l)t)es for the th)r* ch)l* so we repeat the
process o(ce -ore.
&he l)(e seg-e(ts are calle* branches of the tree. &he r)ght e(*)(g po)(t of each
bra(ch )s calle* a node. &he (o*es o( the e7tre-e r)ght are the )nal nodesD to
each o(e there correspo(*s a( o,tco-e as show( )( the +g,re.
ro- the tree )t )s easy to rea* o8 the e)ght o,tco-es of the e7per)-e(t so the
sa-ple space )s rea*)(g fro- the top to the botto- of the +(al (o*es )( the tree
Saylor URL: http://www.saylor.org/books Saylor.org112
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 113/723
S={bbb,bbg,bgb,bgg,gbb,gbg,ggb,ggg}
2robability
e+()t)o(
The probability of an outcome e in a sample space % is a number p between 5 and " that measures
the likelihood that e will occur on a single trial of the corresponding random experiment. The value p M
5 corresponds to the outcome e being impossible and the value p M 1 corresponds to the outcome e being
certain.
e+()t)o(
The probability of an event A is the sum of the probabilities of the individual outcomes of which it is
composed. >t is denoted P(A).
The following formula expresses the content of the definition of the probability of an event:
,f an event ; isE={e1,e2,…,ek}$ then
$ * ; )M $ *e1)K $ *e!)K Y Y Y K $ *ek)
/igure 3.3 0+ample +paces and robability0 graphically illustrates the definitions.
!igure *.* %ample %paces and $robability
Saylor URL: http://www.saylor.org/books Saylor.org113
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 114/723
+ince the whole sample space % is an event that is certain to occur$ the sum of the probabilities of all
the outcomes must be the number 1.
,n ordinary language probabilities are fre"uently expressed as percentages. /or example$ we would
say that there is a @5B chance of rain tomorrow$ meaning that the probability of rain is 5.@5. (e will
use this practice here$ but in all the computational formulas that follow we will use the form 5.@5 and
not @5B.
+A>2!+ 8
A co)( )s calle* bala(ce* or fa)r )f each s)*e )s e<,ally l)kely to la(* ,p. Ass)g( a
probab)l)ty to each o,tco-e )( the sa-ple space for the e7per)-e(t that co(s)sts of
toss)(g a s)(gle fa)r co)(.
Sol,t)o(:
B)th the o,tco-es labele* ! for hea*s a(* t for ta)ls the sa-ple space )s the
set S={h,t}.S)(ce the o,tco-es ha4e the sa-e probab)l)t)es wh)ch -,st a** ,p to 1
each o,tco-e )s ass)g(e* probab)l)ty 1/2.
+A>2!+ 9
A *)e )s calle* bala(ce* or fa)r )f each s)*e )s e<,ally l)kely to la(* o( top. Ass)g( a
probab)l)ty to each o,tco-e )( the sa-ple space for the e7per)-e(t that co(s)sts of
toss)(g a s)(gle fa)r *)e. )(* the probab)l)t)es of the e4e(ts (: a( e4e( (,-ber )s
rolle* a(* ) : a (,-ber greater tha( two )s rolle*.
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org11
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 115/723
B)th o,tco-es labele* accor*)(g to the (,-ber of *ots o( the top face of the *)e the
sa-ple space )s the set S={1,2,3,4,5,6}.S)(ce there are s)7 e<,ally l)kely o,tco-es
wh)ch -,st a** ,p to 1 each )s ass)g(e* probab)l)ty 1/.
+A>2!+ 4
&wo fa)r co)(s are tosse*. )(* the probab)l)ty that the co)(s -atch ).e. e)ther
both la(* hea*s or both la(* ta)ls.
Sol,t)o(:
( Note 3.6 QE7a-ple 3Q we co(str,cte* the sa-ple space S={2h,2t,d}for the
s)t,at)o( )( wh)ch the co)(s are )*e(t)cal a(* the sa-ple space S′={hh,ht,th,tt}for the
s)t,at)o( )( wh)ch the two co)(s ca( be tol* apart.
&he theory of probab)l)ty *oes (ot tell ,s !o& to ass)g( probab)l)t)es to the
o,tco-es o(ly what to *o w)th the- o(ce they are ass)g(e*. Spec)+cally ,s)(g
sa-ple space % -atch)(g co)(s )s the e4e(t M ={2h,2t} wh)ch has probab)l)ty P(2h)
+P(2t).Us)(g sa-ple space S′ -atch)(g co)(s )s the e4e(t M ′={hh,tt} wh)ch has
probab)l)ty P(hh)+P(tt).( the phys)cal worl* )t sho,l* -ake (o *)8ere(ce whether
the co)(s are )*e(t)cal or (ot a(* so we wo,l* l)ke to ass)g( probab)l)t)es to the
o,tco-es so that the (,-bers P(M )a(* P(M ′)are the sa-e a(* best -atch what
we obser4e whe( act,al phys)cal e7per)-e(ts are perfor-e* w)th co)(s that
see- to be fa)r. Act,al e7per)e(ce s,ggests that the o,tco-es )( S′ are e<,ally
l)kely so we ass)g( to each probab)l)ty 1 a(* the(
P(M ′)=P(hh)+P(tt)=1/4+1/4=1/2
S)-)larly fro- e7per)e(ce appropr)ate cho)ces for the o,tco-es )( % are:
P(2h)=1/4 P(2t)=1/4 P(d)=1/2
wh)ch g)4e the sa-e +(al a(swer
P(M )=P(2h)+P(2t)=1/4+1/4=1/2
The previous three examples illustrate how probabilities can be computed simply by counting
when the sample space consists of a finite number of e"ually likely outcomes. ,n some situations
the individual outcomes of any sample space that represents the experiment are unavoidably
Saylor URL: http://www.saylor.org/books Saylor.org110
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 116/723
une"ually likely$ in which case probabilities cannot be computed merely by counting$ but the
computational formula given in the definition of the probability of an event must be used.
+A>2!+ 5
&he break*ow( of the st,*e(t bo*y )( a local h)gh school accor*)(g to race a(*
eth()c)ty )s 01G wh)te 2G black 11G @)spa()c G As)a( a(* 0G for all others.
A st,*e(t )s ra(*o-ly selecte* fro- th)s h)gh school. &o select ra(*o-ly
-ea(s that e4ery st,*e(t has the sa-e cha(ce of be)(g selecte*.5 )(* the
probab)l)t)es of the follow)(g e4e(ts:
a -: the st,*e(t )s black
a. : the st,*e(t )s -)(or)ty that )s (ot wh)te5
b. ': the st,*e(t )s (ot black.
Sol,t)o(:
&he e7per)-e(t )s the act)o( of ra(*o-ly select)(g a st,*e(t fro- the st,*e(t
pop,lat)o( of the h)gh school. A( ob4)o,s sa-ple space )s S={w,b,h,a,o}.S)(ce 01G
of the st,*e(ts are wh)te a(* all st,*e(ts ha4e the sa-e cha(ce of be)(g
selecte* P(w)=0.51 a(* s)-)larly for the other o,tco-es. &h)s )(for-at)o( )s
s,--ar)Ve* )( the follow)(g table:
EKAPLE 9
&he st,*e(t bo*y )( the h)gh school co(s)*ere* )( Note 3.16 QE7a-ple 6Q -ay be
broke( *ow( )(to te( categor)es as follows: 20G wh)te -ale 2G wh)te fe-ale
12G black -ale 10G black fe-ale G @)spa()c -ale 0G @)spa()c fe-ale 3G
Saylor URL: http://www.saylor.org/books Saylor.org11
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 117/723
As)a( -ale 3G As)a( fe-ale 1G -ale of other -)(or)t)es co-b)(e* a(* G
fe-ale of other -)(or)t)es co-b)(e*. A st,*e(t )s ra(*o-ly selecte* fro- th)s
h)gh school. )(* the probab)l)t)es of the follow)(g e4e(ts:
a. -: the st,*e(t )s black
b. MF: the st,*e(t )s -)(or)ty fe-ale
c. FN : the st,*e(t )s fe-ale a(* )s (ot black.
Sol,t)o(:
Now the sa-ple space )s S={wm ,bm ,hm ,am ,om ,wf,bf,hf,af,of}. &he )(for-at)o( g)4e( )(
the e7a-ple ca( be s,--ar)Ve* )( the follow)(g table calle* a t&o+&a/
contingenc/ table:
:ender
/ace +thnicity
hite 'lack =ispanic Asian 1thers
ale ;.20 ;.12 ;.; ;.;3 ;.;1
e-ale ;.2 ;.10 ;.;0 ;.;3 ;.;
a. S)(ce B={bm ,bf}@ P(B)=P(bm )+P(bf)=0.12+0.15=0.27.
b. S)(ce MF={bf,hf,af,of}@
P(M )=P(bf)+P(hf)+P(af)+P(of)=0.15+0.05+0.03+0.04=0.27
c. S)(ce FN ={wf,hf,af,of}@
P(FN )=P(wf)+P(hf)+P(af)+P(of)=0.26+0.05+0.03+0.04=0.38
*+, TA*+AA,S
• &he sa-ple space of a ra(*o- e7per)-e(t )s the collect)o( of all poss)bleo,tco-es.
• A( e4e(t assoc)ate* w)th a ra(*o- e7per)-e(t )s a s,bset of the sa-ple space.
• &he probab)l)ty of a(y o,tco-e )s a (,-ber betwee( ; a(* 1. &he probab)l)t)es of
all the o,tco-es a** ,p to 1.
Saylor URL: http://www.saylor.org/books Saylor.org11
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 118/723
• &he probab)l)ty of a(y e4e(t 0 )s the s,- of the probab)l)t)es of the o,tco-es
)( 0.
EKER%SES
#AS%
1 A bo7 co(ta)(s 1; wh)te a(* 1; black -arbles. %o(str,ct a sa-ple space for the
e7per)-e(t of ra(*o-ly *raw)(g o,t w)th replace-e(t two -arbles )( s,ccess)o( a(*
(ot)(g the color each t)-e. &o *raw w)th replace-e(t -ea(s that the +rst -arble )s
p,t back before the seco(* -arble )s *raw(.5
2 A bo7 co(ta)(s 1 wh)te a(* 1 black -arbles. %o(str,ct a sa-ple space for the
e7per)-e(t of ra(*o-ly *raw)(g o,t w)th replace-e(t three -arbles )( s,ccess)o(
a(* (ot)(g the color each t)-e. &o *raw w)th replace-e(t -ea(s that each -arble )s
p,t back before the (e7t -arble )s *raw(.5
3 A bo7 co(ta)(s 6 re* 6 yellow a(* 6 gree( -arbles. %o(str,ct a sa-ple space for the
e7per)-e(t of ra(*o-ly *raw)(g o,t w)th replace-e(t two -arbles )( s,ccess)o( a(*
(ot)(g the color each t)-e.
A bo7 co(ta)(s re* yellow a(* gree( -arbles. %o(str,ct a sa-ple space for the
e7per)-e(t of ra(*o-ly *raw)(g o,t w)th replace-e(t three -arbles )( s,ccess)o(
a(* (ot)(g the color each t)-e.
0 ( the s)t,at)o( of E7erc)se 1 l)st the o,tco-es that co-pr)se each of the follow)(g
e4e(ts.
a At least o(e -arble of each color )s *raw(.
Saylor URL: http://www.saylor.org/books Saylor.org116
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 119/723
b No wh)te -arble )s *raw(.
( the s)t,at)o( of E7erc)se 2 l)st the o,tco-es that co-pr)se each of the follow)(g
e4e(ts.
a At least o(e -arble of each color )s *raw(.
b No wh)te -arble )s *raw(.
c ore black tha( wh)te -arbles are *raw(.
( the s)t,at)o( of E7erc)se 3 l)st the o,tco-es that co-pr)se each of the follow)(g
e4e(ts.
a No yellow -arble )s *raw(.
b &he two -arbles *raw( ha4e the sa-e color.
c At least o(e -arble of each color )s *raw(.
6 ( the s)t,at)o( of E7erc)se l)st the o,tco-es that co-pr)se each of the follow)(g
e4e(ts.
a No yellow -arble )s *raw(.
b &he three -arbles *raw( ha4e the sa-e color.
c At least o(e -arble of each color )s *raw(.
9 Ass,-)(g that each o,tco-e )s e<,ally l)kely +(* the probab)l)ty of each e4e(t )(
E7erc)se 0.
1; Ass,-)(g that each o,tco-e )s e<,ally l)kely +(* the probab)l)ty of each e4e(t )(
E7erc)se .
11 Ass,-)(g that each o,tco-e )s e<,ally l)kely +(* the probab)l)ty of each e4e(t )(
E7erc)se .
12 Ass,-)(g that each o,tco-e )s e<,ally l)kely +(* the probab)l)ty of each e4e(t )(
E7erc)se 6.
13 A sa-ple space )s S={a,b,c,d,e}.*e(t)fy two e4e(ts
as U={a,b,d}a(* V={b,c,d}.S,ppose P(a)a(* P(b)are each ;.2 a(* P(c)a(* P(d)are each ;.1.
a eter-)(e what P(e)-,st be.
Saylor URL: http://www.saylor.org/books Saylor.org119
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 120/723
b )(* P(U).
c )(* P(V).
1 A sa-ple space )s S={u,v,w,x}.*e(t)fy two e4e(ts
as A={v,w}a(* B={u,w,x}.S,ppose P(u)=0.22@ P(w)=0.36 a(* P(x)=0.27.
a eter-)(e what P(v)-,st be.
b )(* P(A).
c )(* P(B).
APPL%A&"NS
1 &he sa-ple space that *escr)bes all three=ch)l* fa-)l)es accor*)(g to the ge(*ers of the
ch)l*re( w)th respect to b)rth or*er was co(str,cte* )( Note 3.9 QE7a-ple Q. *e(t)fy
the o,tco-es that co-pr)se each of the follow)(g e4e(ts )( the e7per)-e(t of select)(g
a three=ch)l* fa-)ly at ra(*o-.
Saylor URL: http://www.saylor.org/books Saylor.org12;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 121/723
a At least o(e ch)l* )s a g)rl.
b At -ost o(e ch)l* )s a g)rl.
c All of the ch)l*re( are g)rls.
* E7actly two of the ch)l*re( are g)rls.
e &he +rst bor( )s a g)rl.
16 &he sa-ple space that *escr)bes three tosses of a co)( )s the sa-e as the o(e
co(str,cte* )( Note 3.9 QE7a-ple Q w)th boy replace* by hea*s a(* g)rl replace*
by ta)ls. *e(t)fy the o,tco-es that co-pr)se each of the follow)(g e4e(ts )( the
e7per)-e(t of toss)(g a co)( three t)-es.
a &he co)( la(*s hea*s -ore ofte( tha( ta)ls.
b &he co)( la(*s hea*s the sa-e (,-ber of t)-es as )t la(*s ta)ls.
c &he co)( la(*s hea*s at least tw)ce.
* &he co)( la(*s hea*s o( the last toss.
19 Ass,-)(g that the o,tco-es are e<,ally l)kely +(* the probab)l)ty of each e4e(t )(
E7erc)se 1.
2; Ass,-)(g that the o,tco-es are e<,ally l)kely +(* the probab)l)ty of each e4e(t )(
E7erc)se 16.
A((&T&1NA! ++/C&S+S
21 &he follow)(g two=way co(t)(ge(cy table g)4es the break*ow( of the pop,lat)o( )( a
part)c,lar locale accor*)(g to age a(* tobacco ,sage:
Age
Tobacco Use
Smoker Non-smoker
Under 30 0.05 0.20
Over 30 0.20 0.55
A perso( )s selecte* at ra(*o-. )(* the probab)l)ty of each of the follow)(g e4e(ts.
a &he perso( )s a s-oker.b &he perso( )s ,(*er 3;.
c &he perso( )s a s-oker who )s ,(*er 3;.
22 &he follow)(g two=way co(t)(ge(cy table g)4es the break*ow( of the pop,lat)o( )( a
part)c,lar locale accor*)(g to party a?l)at)o( 0 - , or 'one5 a(* op)()o( o( a bo(*
)ss,e:
Saylor URL: http://www.saylor.org/books Saylor.org121
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 122/723
Affiliation
Opinion
Favors Opposes Undecided
A 0.12 0.09 0.07
B 0.16 0.12 0.14C 0.04 0.03 0.06
None 0.08 0.06 0.03
A perso( )s selecte* at ra(*o-. )(* the probab)l)ty of each of the follow)(g e4e(ts.
a &he perso( )s a?l)ate* w)th party -.
b &he perso( )s a?l)ate* w)th so-e party.
c &he perso( )s )( fa4or of the bo(* )ss,e.
* &he perso( has (o party a?l)at)o( a(* )s ,(*ec)*e* abo,t the bo(* )ss,e.
23 &he follow)(g two=way co(t)(ge(cy table g)4es the break*ow( of the pop,lat)o( of
-arr)e* or pre4)o,sly -arr)e* wo-e( beyo(* ch)l*=bear)(g age )( a part)c,lar locale
accor*)(g to age at +rst -arr)age a(* (,-ber of ch)l*re(:
Age
Number of Children
0 or ! " or #ore
Under 20 0.02 0.14 0.08
20–29 0.07 0.37 0.11
30 and above 0.10 0.10 0.01
A wo-a( )s selecte* at ra(*o-. )(* the probab)l)ty of each of the follow)(g e4e(ts.
a &he wo-a( was )( her twe(t)es at her +rst -arr)age.
b &he wo-a( was 2; or ol*er at her +rst -arr)age.
c &he wo-a( ha* (o ch)l*re(.
* &he wo-a( was )( her twe(t)es at her +rst -arr)age a(* ha* at least three
ch)l*re(.
e
2 &he follow)(g two=way co(t)(ge(cy table g)4es the break*ow( of the pop,lat)o( of
a*,lts )( a part)c,lar locale accor*)(g to h)ghest le4el of e*,cat)o( a(* whether or (ot
the )(*)4)*,al reg,larly takes *)etary s,pple-e(ts:
Saylor URL: http://www.saylor.org/books Saylor.org122
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 123/723
$ducation
Use of Supplements
Takes %oes Not Take
No High !hoo" #i$"o%a 0.04 0.06
High !hoo" #i$"o%a 0.06 0.44Undergrad&a'e #egree 0.09 0.28
(rad&a'e #egree 0.01 0.02
A( a*,lt )s selecte* at ra(*o-. )(* the probab)l)ty of each of the follow)(g e4e(ts.
a &he perso( has a h)gh school *)plo-a a(* takes *)etary s,pple-e(ts
reg,larly.
b &he perso( has a( ,(*ergra*,ate *egree a(* takes *)etary s,pple-e(ts
reg,larly.
c &he perso( takes *)etary s,pple-e(ts reg,larly.
* &he perso( *oes (ot take *)etary s,pple-e(ts reg,larly.
LAR!E A&A SE& EKER%SES
20 Large ata Sets a(* A recor* the res,lts of 0;; tosses of a co)(. )(* the relat)4e
fre<,e(cy of each o,tco-e 1 2 3 0 a(* . oes the co)( appear to be bala(ce*
or fa)rC
http://www..7ls
http://www.A.7ls
2 Large ata Sets A a(* # recor* res,lts of a ra(*o- s,r4ey of 2;; 4oters )( each of
two reg)o(s )( wh)ch they were aske* to e7press whether they prefer %a(*)*ate 0for a
U.S. Se(ate seat or prefer so-e other ca(*)*ate.
a )(* the probab)l)ty that a ra(*o-ly selecte* 4oter a-o(g these ;; prefers
%a(*)*ate 0.
b )(* the probab)l)ty that a ra(*o-ly selecte* 4oter a-o(g the 2;; who l)4e )(
Reg)o( 1 prefers %a(*)*ate 0 separately recor*e* )( Large ata Set A5.
Saylor URL: http://www.saylor.org/books Saylor.org123
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 124/723
c )(* the probab)l)ty that a ra(*o-ly selecte* 4oter a-o(g the 2;; who l)4e )(
Reg)o( 2 prefers %a(*)*ate 0 separately recor*e* )( Large ata Set #5.
http://www..7ls
http://www.A.7ls
http://www.#.7ls
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 125/723
Saylor URL: http://www.saylor.org/books Saylor.org120
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 126/723
3.0 Complements@ &ntersections@ and Bnions
LEARNN! "#$E%&'ES
1 &o lear( how so-e e4e(ts are (at,rally e7press)ble )( ter-s of other e4e(ts.
2 &o lear( how to ,se spec)al for-,las for the probab)l)ty of a( e4e(t that )s
e7presse* )( ter-s of o(e or -ore other e4e(ts.
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 127/723
+ome events can be naturally expressed in terms of other$ sometimes simpler$ events.
Complements
e+()t)o(
The complement of an event A in a sample space % $ denoted Ac$ is the collection of all outcomes
in % that are not elements of the set A. >t corresponds to negating any description in words of the
event A.
EKAPLE 1;
&wo e4e(ts co((ecte* w)th the e7per)-e(t of roll)(g a s)(gle *)e are (: the (,-ber
rolle* )s e4e( a(* ) : the (,-ber rolle* )s greater tha( two. )(* the co-ple-e(t of each.
Sol,t)o(:
( the sa-ple space S={1,2,3,4,5,6}the correspo(*)(g sets of o,tco-es
are E={2,4,6}and T={3,4,5,6}. &he co-ple-e(ts are Ec={1,3,5}a(* Tc={1,2}.
( wor*s the co-ple-e(ts are *escr)be* by the (,-ber rolle* )s (ot e4e( a(* the
(,-ber rolle* )s (ot greater tha( two. "f co,rse eas)er *escr)pt)o(s wo,l* be the
(,-ber rolle* )s o** a(* the (,-ber rolle* )s less tha( three.
,f there is a ;5B chance of rain tomorrow$ what is the probability of fair weather= The obvious
answer$ 65B$ is an instance of the following general rule.
2robability /ule or Complements
P Ac)=1−P(A)
This formula is particularly useful when finding the probability of an event
EKAPLE 11
)(* the probab)l)ty that at least o(e hea*s w)ll appear )( +4e tosses of a fa)r co)(.
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 128/723
Sol,t)o(:
*e(t)fy o,tco-es by l)sts of +4e !s a(* t s s,ch as tthtta(* hhttt.Altho,gh )t )s
te*)o,s to l)st the- all )t )s (ot *)?c,lt to co,(t the-. &h)(k of ,s)(g a tree
*)agra- to *o so. &here are two cho)ces for the +rst toss. or each of these there
are two cho)ces for the seco(* toss he(ce 2×2=4o,tco-es for two tosses. or
each of these fo,r o,tco-es there are two poss)b)l)t)es for the th)r* toss
he(ce 4×2=8o,tco-es for three tosses. S)-)larly there are 8×2=16o,tco-es for
fo,r tosses a(* +(ally 16×2=32o,tco-es for +4e tosses.
Let *e(ote the e4e(t at least o(e hea*s. &here are -a(y ways to obta)( at least
o(e hea*s b,t o(ly o(e way to fa)l to *o so: all ta)ls. &h,s altho,gh )t )s *)?c,lt to
l)st all the o,tco-es that for- )t )s easy to wr)te Oc={ttttt}.S)(ce there are 32
e<,ally l)kely o,tco-es each has probab)l)ty 1/32 so P(Oc)=1/32
he(ce P(O)=1−1/32≈0.97or abo,t a 9G cha(ce.
&ntersection o +vents
(e)nitionThe intersection of events A and 9$ denoted A ) 9$ is the collection of all outcomes that are elements
of both of the sets A and 9. >t corresponds to combining descriptions of the two events using the word
?and.@
To say that the event A ) 9 occurred means that on a particular trial of the experiment
both A and 9 occurred. % visual representation of the intersection of events A and 9 in a sample
space % is given in /igure 3.6 0The ,ntersection of vents 0. The intersection corresponds to the
shaded lens-shaped region that lies within both ovals.
Saylor URL: http://www.saylor.org/books Saylor.org126
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 129/723
!igure *. The >ntersection of ;vents Aand 9
Saylor URL: http://www.saylor.org/books Saylor.org129
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 130/723
Saylor URL: http://www.saylor.org/books Saylor.org13;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 131/723
e+()t)o(
;vents A and 9 are mutually e)clusive if they have no elements in common.
/or A and 9 to have no outcomes in common means precisely that it is impossible for both A and 9 to
occur on a single trial of the random experiment. This gives the following rule.
Probab)l)ty R,le for ,t,ally E7cl,s)4e E4e(ts
vents A and 9 are mutually exclusive if and only if
P(A∩B)=0
%ny event A and its complement Ac are mutually exclusive$ but A and 9 can be mutually exclusive without
being complements.
EKAPLE 1
Saylor URL: http://www.saylor.org/books Saylor.org131
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 132/723
( the e7per)-e(t of roll)(g a s)(gle *)e +(* three cho)ces for a( e4e(t 0 so that the
e4e(ts 0 a(* (: the (,-ber rolle* )s e4e( are -,t,ally e7cl,s)4e.
Sol,t)o(:
S)(ce E={2,4,6}a(* we wa(t 0 to ha4e (o ele-e(ts )( co--o( w)th ( a(y e4e(t that
*oes (ot co(ta)( a(y e4e( (,-ber w)ll *o. &hree cho)ces are M130 the
co-ple-e(t (c the o**s5 M13 a(* M0.
Bnion o +vents
e+()t)o(
The union of events A and 9$ denoted A Z 9$ is the collection of all outcomes that are elements of one
or the other of the sets A and 9$ or of both of them. >t corresponds to combining descriptions of the two
events using the word ?or.@
To say that the event A Z 9 occurred means that on a particular trial of the experiment
either A or 9 occurred *or both did). % visual representation of the union of events A and 9 in a sample
space % is given in /igure 3.8 0The <nion of vents 0. The union corresponds to the shaded region.
!igure *. The nion of ;vents A and 9
Saylor URL: http://www.saylor.org/books Saylor.org132
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 133/723
EKAPLE 10
( the e7per)-e(t of roll)(g a s)(gle *)e +(* the ,()o( of the e4e(ts (: the (,-ber
rolle* )s e4e( a(* ) : the (,-ber rolle* )s greater tha( two.
Sol,t)o(:
S)(ce the o,tco-es that are )( e)ther E={2,4,6}or T={3,4,5,6}or both5 are 2 3 0 a(*
EZT={2,3,4,5,6}.Note that a( o,tco-e s,ch as that )s )( both sets )s st)ll l)ste* o(ly
o(ce altho,gh str)ctly speak)(g )t )s (ot )(correct to l)st )t tw)ce5.
( wor*s the ,()o( )s *escr)be* by the (,-ber rolle* )s e4e( or )s greater tha( two.
E4ery (,-ber betwee( o(e a(* s)7 e7cept the (,-ber o(e )s e)ther e4e( or )s greater
tha( two correspo(*)(g to ( Z ) g)4e( abo4e.
EKAPLE 1
A two=ch)l* fa-)ly )s selecte* at ra(*o-. Let - *e(ote the e4e(t that at least o(e
ch)l* )s a boy let D *e(ote the e4e(t that the ge(*ers of the two ch)l*re( *)8er a(*
let *e(ote the e4e(t that the ge(*ers of the two ch)l*re( -atch. )(* - Z D a(* B
M .
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org133
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 134/723
A sa-ple space for th)s e7per)-e(t )s S={bb,bg,gb,gg} where the +rst letter *e(otes
the ge(*er of the +rstbor( ch)l* a(* the seco(* letter *e(otes the ge(*er of the
seco(* ch)l*. &he e4e(ts - D a(* are
B={bb,bg,gb} D={bg,gb} M ={bb,gg}
Each o,tco-e )( D )s alrea*y )( - so the o,tco-es that are )( at least o(e or the
other of the sets - a(* D )s >,st the set - )tself: BD={bb,bg,gb}=B.
E4ery o,tco-e )( the whole sa-ple space % )s )( at least o(e or the other of the
sets - a(* so B M ={bb,bg,gb,gg}=S.
The following ,dditive -ule of &robability is a useful formula for calculating the probability
ofAZB.
Additive /ule o 2robability
P(A B)=P(A)+P(B)−P(A∩B)
The next example$ in which we compute the probability of a union both by counting and by using the
formula$ shows why the last term in the formula is needed.
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 135/723
Saylor URL: http://www.saylor.org/books Saylor.org130
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 136/723
EKAPLE 16
A t,tor)(g ser4)ce spec)al)Ves )( prepar)(g a*,lts for h)gh school e<,)4ale(ce tests.
A-o(g all the st,*e(ts seek)(g help fro- the ser4)ce 3G (ee* help )( -athe-at)cs
3G (ee* help )( E(gl)sh a(* 2G (ee* help )( both -athe-at)cs a(* E(gl)sh. Bhat
)s the perce(tage of st,*e(ts who (ee* help )( e)ther -athe-at)cs or E(gl)shC
Sol,t)o(:
-ag)(e select)(g a st,*e(t at ra(*o- that )s )( s,ch a way that e4ery st,*e(t has
the sa-e cha(ce of be)(g selecte*. Let *e(ote the e4e(t the st,*e(t (ee*s help
)( -athe-at)cs a(* let ( *e(ote the e4e(t the st,*e(t (ee*s help )( E(gl)sh. &he
)(for-at)o( g)4e( )s that P(M )=0.63@ P(E)=0.34@ a(* P(M ∩E)=0.27. &he A**)t)4e R,le of
Probab)l)ty g)4es
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 137/723
P(M E)=P(M )+P(E)−P(M ∩E)=0.63+0.34−0.27=0.70
9ote how the nave reasoning that if ;3B need help in mathematics and 36B need help in nglish
then ;3 plus 36 or F@B need help in one or the other gives a number that is too large. The percentage
that need help in both sub#ects must be subtracted off$ else the people needing help in both are
counted twice$ once for needing help in mathematics and once again for needing help in nglish. The
simple sum of the probabilities would work if the events in "uestion were mutually exclusive$ for
thenP(A∩B)is 'ero$ and makes no difference.
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 138/723
Saylor URL: http://www.saylor.org/books Saylor.org136
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 139/723
Saylor URL: http://www.saylor.org/books Saylor.org139
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 140/723
*+, TA*+AA,
•
&he probab)l)ty of a( e4e(t that )s a co-ple-e(t or ,()o( of e4e(ts of k(ow(probab)l)ty ca( be co-p,te* ,s)(g for-,las.
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 141/723
Saylor URL: http://www.saylor.org/books Saylor.org11
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 142/723
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 143/723
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 144/723
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 145/723
R S T
M 0.09 0.25 0.19
N 0.31 0.16 0.00
a P(R)@ P(S)@ P(R∩S).
b P(M )@ P(N )@ P(M ∩N ).
c P(RS).
d P(Rc).
e eter-)(e whether or (ot the e4e(ts ' a(* % are -,t,ally e7cl,s)4eD the e4e(ts ' a(* ) .A22!&CAT&1NS
11 ake a state-e(t )( or*)(ary E(gl)sh that *escr)bes the co-ple-e(t of each e4e(t *o
(ot s)-ply )(sert the wor* (ot5.
a ( the roll of a *)e: +4e or -ore.
b ( a roll of a *)e: a( e4e( (,-ber.
c ( two tosses of a co)(: at least o(e hea*s.
* ( the ra(*o- select)o( of a college st,*e(t: Not a fresh-a(.
12 ake a state-e(t )( or*)(ary E(gl)sh that *escr)bes the co-ple-e(t of each e4e(t *o
(ot s)-ply )(sert the wor* (ot5.
a ( the roll of a *)e: two or less.
b ( the roll of a *)e: o(e three or fo,r.
c ( two tosses of a co)(: at -ost o(e hea*s.
* ( the ra(*o- select)o( of a college st,*e(t: Ne)ther a fresh-a( (or a se()or.
%3 &he sa-ple space that *escr)bes all three=ch)l* fa-)l)es accor*)(g to the ge(*ers of the
ch)l*re( w)th respect to b)rth or*er )s
S={bbb,bbg,bgb,bgg,gbb,gbg,ggb,ggg}.
or each of the follow)(g e4e(ts )( the e7per)-e(t of select)(g a three=ch)l* fa-)ly at
ra(*o- state the co-ple-e(t of the e4e(t )( the s)-plest poss)ble ter-s the( +(* the
o,tco-es that co-pr)se the e4e(t a(* )ts co-ple-e(t.
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 146/723
a At least o(e ch)l* )s a g)rl.
b At -ost o(e ch)l* )s a g)rl.
c All of the ch)l*re( are g)rls.
* E7actly two of the ch)l*re( are g)rls.
e &he +rst bor( )s a g)rl.
1 &he sa-ple space that *escr)bes the two=way class)+cat)o( of c)t)Ve(s accor*)(g to
ge(*er a(* op)()o( o( a pol)t)cal )ss,e )s
S={mf,ma,mn,ff,fa,fn},
where the +rst letter *e(otes ge(*er m: -ale f : fe-ale5 a(* the seco(* op)()o( f :
for a: aga)(st n: (e,tral5. or each of the follow)(g e4e(ts )( the e7per)-e(t of
select)(g a c)t)Ve( at ra(*o- state the co-ple-e(t of the e4e(t )( the s)-plest
poss)ble ter-s the( +(* the o,tco-es that co-pr)se the e4e(t a(* )ts co-ple-e(t.
a &he perso( )s -ale.
b &he perso( )s (ot )( fa4or.
c &he perso( )s e)ther -ale or )( fa4or.
* &he perso( )s fe-ale a(* (e,tral.
10 A to,r)st who speaks E(gl)sh a(* !er-a( b,t (o other la(g,age 4)s)ts a reg)o( of
Slo4e()a. f 30G of the res)*e(ts speak E(gl)sh 10G speak !er-a( a(* 3G speak both
E(gl)sh a(* !er-a( what )s the probab)l)ty that the to,r)st w)ll be able to talk w)th a
ra(*o-ly e(co,(tere* res)*e(t of the reg)o(C
1 ( a certa)( co,(try 3G of all a,to-ob)les ha4e a)rbags 2G ha4e a(t)=lock brakes
a(* 13G ha4e both. Bhat )s the probab)l)ty that a ra(*o-ly selecte* 4eh)cle w)ll ha4e
both a)rbags a(* a(t)=lock brakesC
1 A -a(,fact,rer e7a-)(es )ts recor*s o4er the last year o( a co-po(e(t part
rece)4e* fro- o,ts)*e s,ppl)ers. &he break*ow( o( so,rce s,ppl)er 0 s,ppl)er -5
a(* <,al)ty : h)gh U: ,sable D: *efect)4e5 )s show( )( the two=way co(t)(ge(cy
table.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 147/723
H U D
A 0.6937 0.0049 0.0014
B 0.2982 0.0009 0.0009
&he recor* of a part )s selecte* at ra(*o-. )(* the probab)l)ty of each of the follow)(g
e4e(ts.
a &he part was *efect)4e.
b &he part was e)ther of h)gh <,al)ty or was at least ,sable )( two ways: )5 by a**)(g
(,-bers )( the table a(* ))5 ,s)(g the a(swer to a5 a(* the Probab)l)ty R,le for
%o-ple-e(ts.
c &he part was *efect)4e a(* ca-e fro- s,ppl)er -.
* &he part was *efect)4e or ca-e fro- s,ppl)er - )( two ways: by +(*)(g the cells )( the
table that correspo(* to th)s e4e(t a(* a**)(g the)r probab)l)t)es a(* ))5 ,s)(g the
A**)t)4e R,le of Probab)l)ty.
16. (*)4)*,als w)th a part)c,lar -e*)cal co(*)t)o( were class)+e* accor*)(g to the prese(ce
) 5 or abse(ce '5 of a pote(t)al to7)( )( the)r bloo* a(* the o(set of the co(*)t)o( (:
early : -)*ra(ge L: late5. &he break*ow( accor*)(g to th)s class)+cat)o( )s show( )( the
two=way co(t)(ge(cy table.
E M L
T 0.012 0.124 0.013
N 0.170 0.638 0.043
"(e of these )(*)4)*,als )s selecte* at ra(*o-. )(* the probab)l)ty of each of the
follow)(g e4e(ts.
a &he perso( e7per)e(ce* early o(set of the co(*)t)o(.
b &he o(set of the co(*)t)o( was e)ther -)*ra(ge or late )( two ways: )5 by
a**)(g (,-bers )( the table a(* ))5 ,s)(g the a(swer to a5 a(* the
Probab)l)ty R,le for %o-ple-e(ts.
c &he to7)( )s prese(t )( the perso(Ws bloo*.* &he perso( e7per)e(ce* early o(set of the co(*)t)o( a(* the to7)( )s prese(t
)( the perso(Ws bloo*.
e &he perso( e7per)e(ce* early o(set of the co(*)t)o( or the to7)( )s prese(t )(
the perso(Ws bloo* )( two ways: )5 by +(*)(g the cells )( the table that
correspo(* to th)s e4e(t a(* a**)(g the)r probab)l)t)es a(* ))5 ,s)(g the
A**)t)4e R,le of Probab)l)ty.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 148/723
19 &he break*ow( of the st,*e(ts e(rolle* )( a ,()4ers)ty co,rse by class F :
fresh-a( So: sopho-ore : >,()or Se: se()or5 a(* aca*e-)c -a>or %: sc)e(ce
-athe-at)cs or e(g)(eer)(g L: l)beral arts : other5 )s show( )( the two=way
class)+cat)o( table.
#a&or
Class
F So J Se
S 92 42 20 13
L 368 167 80 53
O 460 209 100 67
A st,*e(t e(rolle* )( the co,rse )s selecte* at ra(*o-. A*>o)( the row a(* col,-(
totals to the table a(* ,se the e7pa(*e* table to +(* the probab)l)ty of each of the
follow)(g e4e(ts.
a &he st,*e(t )s a fresh-a(.
b &he st,*e(t )s a l)beral arts -a>or.
c &he st,*e(t )s a fresh-a( l)beral arts -a>or.
* &he st,*e(t )s e)ther a fresh-a( or a l)beral arts -a>or.
e &he st,*e(t )s (ot a l)beral arts -a>or.
2; &he table relates the respo(se to a f,(*=ra)s)(g appeal by a college to )ts al,-() to
the (,-ber of years s)(ce gra*,at)o(.
'esponse
(ears Since )raduation
0*+ ,*!0 !*"+ Over "+
*o+i'ive 120 440 210 90
None 1380 3560 3290 910
A( al,-(,s )s selecte* at ra(*o-. A*>o)( the row a(* col,-( totals to the table a(*
,se the e7pa(*e* table to +(* the probab)l)ty of each of the follow)(g e4e(ts.
a &he al,-(,s respo(*e*.
b &he al,-(,s *)* (ot respo(*.
c &he al,-(,s gra*,ate* at least 21 years ago.
* &he al,-(,s gra*,ate* at least 21 years ago a(* respo(*e*.
A((&T&1NA! ++/C&S+S
21 &he sa-ple space for toss)(g three co)(s )s
S={hhh,hht,hth,htt,thh,tht,tth,ttt}
a L)st the o,tco-es that correspo(* to the state-e(t All the co)(s are hea*s.
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 149/723
b L)st the o,tco-es that correspo(* to the state-e(t Not all the co)(s are hea*s.
c L)st the o,tco-es that correspo(* to the state-e(t All the co)(s are (ot hea*s.
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 150/723
Saylor URL: http://www.saylor.org/books Saylor.org10;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 151/723
3.3 Conditional 2robability and &ndependent +vents
LEARNN! "#$E%&'ES
1 &o lear( the co(cept of a co(*)t)o(al probab)l)ty a(* how to co-p,te )t.
2 &o lear( the co(cept of )(*epe(*e(ce of e4e(ts a(* how to apply )t.
Saylor URL: http://www.saylor.org/books Saylor.org101
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 152/723
Conditional 2robability
+uppose a fair die has been rolled and you are asked to give the probability that it was a five. There are six
e"ually likely outcomes$ so your answer is 1I;. 4ut suppose that before you give your answer you are given
the extra information that the number rolled was odd. +ince there are only three odd numbers that arepossible$ one of which is five$ you would certainly revise your estimate of the likelihood that a five was
rolled from 1I; to 1I3. ,n general$ the revised probability that an event A has occurred$ taking into
account the additional information that another event 9 has definitely occurred on this trial of the
experiment$ is called the conditional probability of A given 9 and is denoted byP(A|B).The reasoning
employed in this example can be generali'ed to yield the computational formula in the following
definition.
(e)nition
The conditional probability of A given 9$ denoted P(A|B) is the probability that event A has
occurred in a trial of a random experiment for which it is known that event 9 has definitely occurred. >t
may be computed by means of the following formula8
Cule for 7onditional robability
P(A|B)=P(A∩B)/P(B)
+A>2!+ 0
A fa)r *)e )s rolle*.
a. )(* the probab)l)ty that the (,-ber rolle* )s a +4e g)4e( that )t )s o**.
b. )(* the probab)l)ty that the (,-ber rolle* )s o** g)4e( that )t )s a +4e.
Sol,t)o(:
&he sa-ple space for th)s e7per)-e(t )s the set S={1,2,3,4,5,6}co(s)st)(g of s)7 e<,ally
l)kely o,tco-es. Let F *e(ote the e4e(t a +4e )s rolle* a(* let *e(ote the e4e(t
a( o** (,-ber )s rolle* so that
F={5} and O={1,3,5}
Saylor URL: http://www.saylor.org/books Saylor.org102
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 153/723
Saylor URL: http://www.saylor.org/books Saylor.org103
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 154/723
Rust as we did not need the computational formula in this example$ we do not need it when the
information is presented in a two-way classification table$ as in the next example.
+A>2!+ 0%
( a sa-ple of 9;2 )(*)4)*,als ,(*er ; who were or ha* pre4)o,sly bee( -arr)e*
each perso( was class)+e* accor*)(g to ge(*er a(* age at +rst -arr)age. &he res,lts
are s,--ar)Ve* )( the follow)(g two=way class)+cat)o( table where the -ea()(g of
the labels )s:
• : -ale
• F : fe-ale
• (: a tee(ager whe( +rst -arr)e*
• 3 : )( o(eWs twe(t)es whe( +rst -arr)e*
• : )( o(eWs th)rt)es whe( +rst -arr)e*
E W H Total
M 43 293 114 450
F 82 299 71 452
,o'a" 125 592 185 902
&he (,-bers )( the +rst row -ea( that 3 people )( the sa-ple were -e( who were
+rst -arr)e* )( the)r tee(s 293 were -e( who were +rst -arr)e* )( the)r twe(t)es
11 -e( who were +rst -arr)e* )( the)r th)rt)es a(* a total of 0; people )( the
sa-ple were -e(. S)-)larly for the (,-bers )( the seco(* row. &he (,-bers )( the
last row -ea( that )rrespect)4e of ge(*er 120 people )( the sa-ple were -arr)e* )(
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 155/723
the)r tee(s 092 )( the)r twe(t)es 160 )( the)r th)rt)es a(* that there were 9;2 people
)( the sa-ple )( all. S,ppose that the proport)o(s )( the sa-ple acc,rately re[ect
those )( the pop,lat)o( of all )(*)4)*,als )( the pop,lat)o( who are ,(*er ; a(* who
are or ha4e pre4)o,sly bee( -arr)e*. S,ppose s,ch a perso( )s selecte* at ra(*o-.
a )(* the probab)l)ty that the )(*)4)*,al selecte* was a tee(ager at +rst
-arr)age.
b )(* the probab)l)ty that the )(*)4)*,al selecte* was a tee(ager at +rst
-arr)age g)4e( that the perso( )s -ale.
Sol,t)o(:
t )s (at,ral to let ( also *e(ote the e4e(t that the perso( selecte* was a tee(ager at
+rst -arr)age a(* to let *e(ote the e4e(t that the perso( selecte* )s -ale.
a. Accor*)(g to the table the proport)o( of )(*)4)*,als )( the sa-ple who were )(
the)r tee(s at the)r +rst -arr)age )s 120/9;2. &h)s )s the relat)4e fre<,e(cy of s,ch people
)( the pop,lat)o( he(ce P(E)=125/902≈0.139or abo,t 1G.
S)(ce )t )s k(ow( that the perso( selecte* )s -ale all the fe-ales -ay be
re-o4e* fro- co(s)*erat)o( so that o(ly the row )( the table correspo(*)(g
to -e( )( the sa-ple appl)es:
E W H Total
M 43 293 114 450
&he proport)o( of -ales )( the sa-ple who were )( the)r tee(s at the)r +rst -arr)age
)s 3/0;. &h)s )s the relat)4e fre<,e(cy of s,ch people )( the pop,lat)o( of -ales
he(ce P(E|M )=43/450≈0.096or abo,t 1;G.
,n the next example$ the computational formula in the definition must be used.
+A>2!+ 00
Saylor URL: http://www.saylor.org/books Saylor.org100
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 156/723
S,ppose that )( a( a*,lt pop,lat)o( the proport)o( of people who are both
o4erwe)ght a(* s,8er hyperte(s)o( )s ;.;9D the proport)o( of people who are (ot
o4erwe)ght b,t s,8er hyperte(s)o( )s ;.11D the proport)o( of people who are
o4erwe)ght b,t *o (ot s,8er hyperte(s)o( )s ;.;2D a(* the proport)o( of people
who are (e)ther o4erwe)ght (or s,8er hyperte(s)o( )s ;.6. A( a*,lt )s ra(*o-ly
selecte* fro- th)s pop,lat)o(.
a. )(* the probab)l)ty that the perso( selecte* s,8ers hyperte(s)o( g)4e( that he )s
o4erwe)ght.
b. )(* the probab)l)ty that the selecte* perso( s,8ers hyperte(s)o( g)4e( that he )s (ot
o4erwe)ght.
c. %o-pare the two probab)l)t)es >,st fo,(* to g)4e a( a(swer to the <,est)o( as to
whether o4erwe)ght people te(* to s,8er fro- hyperte(s)o(.
Sol,t)o(:
Let *e(ote the e4e(t the perso( selecte* s,8ers hyperte(s)o(. Let *e(ote
the e4e(t the perso( selecte* )s o4erwe)ght. &he probab)l)ty )(for-at)o( g)4e(
)( the proble- -ay be orga()Ve* )(to the follow)(g co(t)(ge(cy table:
O Oc
H 0.09 0.11
H c 0.02 0.78
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 157/723
&ndependent +vents
%lthough typically we expect the conditional probabilityP(A|B)to be different from the
probabilityP(A)of A$ it does not have to be different from P(A). (henP(A|B)=P(A) the occurrence
of 9 has no effect on the likelihood of A. (hether or not the event A has occurred is independent of
the event 9.
<sing algebra it can be shown that the e"ualityP(A|B)=P(A)holds if and only if the e"ualityP(A∩
B)=P(A)DP(B)holds$ which in turn is true if and only if P(B|A)=P(B).This is the basis for the
following definition.
(e)nition
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 158/723
;vents A and 9 are independent if
P(A∩B)=P(A)DP(B)
>f A and 9 are not independent then they are dependent.
The formula in the definition has two practical but exactly opposite uses:
1 ,n a situation in which we can compute all three probabilitiesP(A)P(B) andP(A∩B) it is used to
check whether or not the events A and 9 are independent:
o ,fP(A∩B)=P(A)DP(B) then A and 9 are independent.
o ,fP(A∩B)≠P(A)DP(B) then A and 9 are not independent.
! ,n a situation in which each ofP(A)andP(B)can be computed and it is known that A and 9 are
independent$ then we can computeP(A∩B) by multiplying
togetherP(A)andP(B)/P(A∩B)=P(A)DP(B).
+A>2!+ 03
A s)(gle fa)r *)e )s rolle*. Let A={3}a(* B={1,3,5}.Are 0 a(* - )(*epe(*e(tC
Sol,t)o(:
( th)s e7a-ple we ca( co-p,te all three probab)l)t)es P(A)=1/6@ P(B)=1/2 a(* P(A∩
B)=P({3})=1/6.S)(ce the pro*,ct P(A)DP(B)=(1/6)(1/2)=1/12)s (ot the sa-e (,-ber as P(A∩
B)=1/6 the e4e(ts 0 a(* - are (ot )(*epe(*e(t.
+A>2!+ 07
&he two=way class)+cat)o( of -arr)e* or pre4)o,sly -arr)e* a*,lts ,(*er ; accor*)(g
to ge(*er a(* age at +rst -arr)age )( Note 3.6 QE7a-ple 21Q pro*,ce* the table
E W H Total
M 43 293 114 450
Saylor URL: http://www.saylor.org/books Saylor.org106
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 159/723
E W H Total
F 82 299 71 452
,o'a" 125 592 185 902
eter-)(e whether or (ot the e4e(ts F : fe-ale a(* (: was a tee(ager at +rst
-arr)age are )(*epe(*e(t.
+A>2!+ 08a(y *)ag(ost)c tests for *etect)(g *)seases *o (ot test for the *)sease *)rectly
b,t for a che-)cal or b)olog)cal pro*,ct of the *)sease he(ce are (ot perfectly
rel)able. &he sensitivit/ of a test )s the probab)l)ty that the test w)ll be pos)t)4e
whe( a*-)()stere* to a perso( who has the *)sease. &he h)gher the se(s)t)4)ty
the greater the *etect)o( rate a(* the lower the false (egat)4e rate.
Saylor URL: http://www.saylor.org/books Saylor.org109
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 160/723
S,ppose the se(s)t)4)ty of a *)ag(ost)c proce*,re to test whether a perso( has a
part)c,lar *)sease )s 92G. A perso( who act,ally has the *)sease )s teste* for )t
,s)(g th)s proce*,re by two )(*epe(*e(t laborator)es.
a. Bhat )s the probab)l)ty that both test res,lts w)ll be pos)t)4eC
b. Bhat )s the probab)l)ty that at least o(e of the two test res,lts w)ll be pos)t)4eC
Sol,t)o(:
a. Let 01 *e(ote the e4e(t the test by the +rst laboratory )s pos)t)4e a(*
let 02 *e(ote the e4e(t the test by the seco(* laboratory )s pos)t)4e.
S)(ce 01 a(* 02 are )(*epe(*e(t
P(A1 ∩A2)=P(A1)DP(A2)=0.92×0.92=0.8464
b. Us)(g the A**)t)4e R,le for Probab)l)ty a(* the probab)l)ty >,st
co-p,te*
P(A1 A2)=P(A1)+P(A2)−P(A1 ∩A2)=0.92+0.92−0.8464=0.9936
+A>2!+ 09
&he speci4cit/ of a *)ag(ost)c test for a *)sease )s the probab)l)ty that the test w)ll be
(egat)4e whe( a*-)()stere* to a perso( who *oes (ot ha4e the *)sease. &he h)gher
the spec)+c)ty the lower the false pos)t)4e rate.
S,ppose the spec)+c)ty of a *)ag(ost)c proce*,re to test whether a perso( has a
part)c,lar *)sease )s 69G.
a. A perso( who *oes (ot ha4e the *)sease )s teste* for )t ,s)(g th)s proce*,re.
Bhat )s the probab)l)ty that the test res,lt w)ll be pos)t)4eC
b. A perso( who *oes (ot ha4e the *)sease )s teste* for )t by two )(*epe(*e(t
laborator)es ,s)(g th)s proce*,re. Bhat )s the probab)l)ty that both test res,lts w)ll be
pos)t)4eC
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 161/723
a. Let - *e(ote the e4e(t the test res,lt )s pos)t)4e. &he co-ple-e(t of - )s
that the test res,lt )s (egat)4e a(* has probab)l)ty the spec)+c)ty of the test ;.69.
&h,s
P(B)=1−P(Bc)=1−0.89=0.11.
b. Let -1 *e(ote the e4e(t the test by the +rst laboratory )s pos)t)4e a(*
let -2 *e(ote the e4e(t the test by the seco(* laboratory )s pos)t)4e.
S)(ce -1 a(* -2 are )(*epe(*e(t by part a5 of the e7a-ple
P(B1∩B2)=P(B1)DP(B2)=0.11×0.11=0.0121.
The concept of independence applies to any number of events. /or example$ three events A$ 9$
and < are independent if P(A∩B∩C)=P(A)DP(B)DP(C).9ote carefully that$ as is the case with
#ust two events$ this is not a formula that is always valid$ but holds precisely when the events in
"uestion are independent.
Saylor URL: http://www.saylor.org/books Saylor.org11
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 162/723
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 163/723
2robabilities on Tree (ia$rams
+ome probability problems are made much simpler when approached using a tree diagram. The next
example illustrates how to place probabilities on a tree diagram and use it to solve a problem.
EKAPLE 26
A >ar co(ta)(s 1; -arbles black a(* 3 wh)te. &wo -arbles are *raw( w)tho,t
replace-e(t wh)ch -ea(s that the +rst o(e )s (ot p,t back before the seco(* o(e )s
*raw(.
a. Bhat )s the probab)l)ty that both -arbles are blackC
b. Bhat )s the probab)l)ty that e7actly o(e -arble )s blackC
c. Bhat )s the probab)l)ty that at least o(e -arble )s blackC
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 164/723
A tree *)agra- for the s)t,at)o( of *raw)(g o(e -arble after the other w)tho,t
replace-e(t )s show( )( )g,re 3. Q&ree )agra- for raw)(g &wo arblesQ. &he
c)rcle a(* recta(gle w)ll be e7pla)(e* later a(* sho,l* be )g(ore* for (ow.
Figure *.5)ree Diagram for Dra&ing )&o arbles
&he (,-bers o( the two left-ost bra(ches are the probab)l)t)es of gett)(g e)ther
a black -arble o,t of 1; or a wh)te -arble 3 o,t of 1; o( the +rst *raw. &he
(,-ber o( each re-a)()(g bra(ch )s the probab)l)ty of the e4e(t correspo(*)(g to
the (o*e o( the r)ght e(* of the bra(ch occ,rr)(g g)4e( that the e4e(t
correspo(*)(g to the (o*e o( the left e(* of the bra(ch has occ,rre*. &h,s for
the top bra(ch co((ect)(g the two #s )t )s P(B2|B1)@ where -1 *e(otes the e4e(t
the +rst -arble *raw( )s black a(* -2 *e(otes the e4e(t the seco(* -arble
*raw( )s black. S)(ce after *raw)(g a black -arble o,t there are 9 -arbles left
of wh)ch are black th)s probab)l)ty )s /9.
&he (,-ber to the r)ght of each +(al (o*e )s co-p,te* as show( ,s)(g the
pr)(c)ple that )f the for-,la )( the %o(*)t)o(al R,le for Probab)l)ty )s -,lt)pl)e*
by P(B)@ the( the res,lt )s
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 165/723
P(B∩A)=P(B)DP(A|B)
a. &he e4e(t both -arbles are black )s B1 ∩B2 a(* correspo(*s to the top r)ght
(o*e )( the tree wh)ch has bee( c)rcle*. &h,s as )(*)cate* there )t )s ;..
b. &he e4e(t e7actly o(e -arble )s black correspo(*s to the two (o*es of the tree
e(close* by the recta(gle. &he e4e(ts that correspo(* to these two (o*es are
-,t,ally e7cl,s)4e: black followe* by wh)te )s )(co-pat)ble w)th wh)te followe* by
black. &h,s )( accor*a(ce w)th the A**)t)4e R,le for Probab)l)ty we -erely a** the
two probab)l)t)es (e7t to these (o*es s)(ce what wo,l* be s,btracte* fro- the s,-
)s Vero. &h,s the probab)l)ty of *raw)(g e7actly o(e black -arble )( two tr)es
)s 0.23+0.23=0.46.
&he e4e(t at least o(e -arble )s black correspo(*s to the three (o*es of the
tree e(close* by e)ther the c)rcle or the recta(gle. &he e4e(ts that correspo(*
to these (o*es are -,t,ally e7cl,s)4e so as )( part b5 we -erely a** the
probab)l)t)es (e7t to these (o*es. &h,s the probab)l)ty of *raw)(g at least o(e
black -arble )( two tr)es )s 0.47+0.23+0.23=0.93.
"f co,rse th)s a(swer co,l* ha4e bee( fo,(* -ore eas)ly ,s)(g the Probab)l)ty
Law for %o-ple-e(ts s)-ply s,btract)(g the probab)l)ty of the co-ple-e(tary
e4e(t two wh)te -arbles are *raw( fro- 1 to obta)( 1−0.07=0.93.
%s this example shows$ finding the probability for each branch is fairly straightforward$ since we
compute it knowing everything that has happened in the se"uence of steps so far. Two principles that
are true in general emerge from this example:
2robabilities on Tree (ia$rams
1 The probability of the event corresponding to any node on a tree is the product of the numbers on the
uni"ue path of branches that leads to that node from the start.
! ,f an event corresponds to several final nodes$ then its probability is obtained by adding the numbers
next to those nodes.
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 166/723
*+, TA*+AA,S
• A co(*)t)o(al probab)l)ty )s the probab)l)ty that a( e4e(t has occ,rre* tak)(g )(to
acco,(t a**)t)o(al )(for-at)o( abo,t the res,lt of the e7per)-e(t.
• A co(*)t)o(al probab)l)ty ca( always be co-p,te* ,s)(g the for-,la )( the
*e+()t)o(. So-et)-es )t ca( be co-p,te* by *)scar*)(g part of the sa-ple space.
• &wo e4e(ts 0 a(* - are )(*epe(*e(t )f the probab)l)ty P(A∩B)of the)r
)(tersect)o( 0 \ - )s e<,al to the pro*,ct P(A)DP(B)of the)r )(*)4)*,al probab)l)t)es.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 167/723
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 168/723
a &he probab)l)ty that the car* *raw( )s re*.
b &he probab)l)ty that the car* )s re* g)4e( that )t )s (ot gree(.
c &he probab)l)ty that the car* )s re* g)4e( that )t )s (e)ther re* (or yellow.
* &he probab)l)ty that the car* )s re* g)4e( that )t )s (ot a fo,r.
1; A spec)al *eck of 1 car*s has that are bl,e yellow gree( a(* re*. &he fo,r
car*s of each color are (,-bere* fro- o(e to fo,r. A s)(gle car* )s *raw( at ra(*o-.
)(* the follow)(g probab)l)t)es.
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 169/723
a &he probab)l)ty that the car* *raw( )s a two or a fo,r.
b &he probab)l)ty that the car* )s a two or a fo,r g)4e( that )t )s (ot a o(e.
c &he probab)l)ty that the car* )s a two or a fo,r g)4e( that )t )s e)ther a two or a
three.
* &he probab)l)ty that the car* )s a two or a fo,r g)4e( that )t )s re* or gree(.
11 A ra(*o- e7per)-e(t ga4e r)se to the two=way co(t)(ge(cy table show(. Use )t to
co-p,te the probab)l)t)es )(*)cate*.
R S
A 0.12 0.18
B 0.28 0.42
a P(A)@ P(R)@ P(A∩R).
b #ase* o( the a(swer to a5 *eter-)(e whether or (ot the e4e(ts 0 a(* 6 are
)(*epe(*e(t.
c #ase* o( the a(swer to b5 *eter-)(e whether or (otP(A|R)ca( be pre*)cte* w)tho,t
a(y co-p,tat)o(. f so -ake the pre*)ct)o(. ( a(y case co-p,te P(A|R),s)(g the R,le
for %o(*)t)o(al Probab)l)ty.
12 A ra(*o- e7per)-e(t ga4e r)se to the two=way co(t)(ge(cy table show(. Use )t to
co-p,te the probab)l)t)es )(*)cate*.
R S A 0.13 0.07
B 0.61 0.19
a P(A)@ P(R)@ P(A∩R).
b #ase* o( the a(swer to a5 *eter-)(e whether or (ot the e4e(ts 0 a(* 6 are
)(*epe(*e(t.
c #ase* o( the a(swer to b5 *eter-)(e whether or (ot P(A|R)ca( be pre*)cte*
w)tho,t a(y co-p,tat)o(. f so -ake the pre*)ct)o(. ( a(y case co-p,te P(A|
R),s)(g the R,le for %o(*)t)o(al Probab)l)ty.
13 S,ppose for e4e(ts 0 a(* - )( a ra(*o- e7per)-e(t P(A)=0.70a(* P(B)=0.30.%o-p,te
the )(*)cate* probab)l)ty or e7pla)( why there )s (ot e(o,gh )(for-at)o( to *o so.
a P(A∩B).
b P(A∩B)@ w)th the e7tra )(for-at)o( that 0 a(* - are )(*epe(*e(t.
c P(A∩B)@ w)th the e7tra )(for-at)o( that 0 a(* - are -,t,ally e7cl,s)4e.
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 170/723
1 S,ppose for e4e(ts 0 a(* - co((ecte* to so-e ra(*o-
e7per)-e(t P(A)=0.50a(* P(B)=0.50.%o-p,te the )(*)cate* probab)l)ty or e7pla)( why
there )s (ot e(o,gh )(for-at)o( to *o so.
a P(A∩B).
b P(A∩B)@ w)th the e7tra )(for-at)o( that 0 a(* - are )(*epe(*e(t.
c P(A∩B)@ w)th the e7tra )(for-at)o( that 0 a(* - are -,t,ally e7cl,s)4e.
10 S,ppose for e4e(ts 0 - a(* , co((ecte* to so-e ra(*o- e7per)-e(t 0 - a(* , are
)(*epe(*e(t a(* P(A)=0.88@ P(B)=0.65@ and P(C)=0.44.%o-p,te the )(*)cate* probab)l)ty or
e7pla)( why there )s (ot e(o,gh )(for-at)o( to *o so.
a P(A∩B∩C)
b P Ac∩Bc∩Cc)
1 S,ppose for e4e(ts 0 - a(* , co((ecte* to so-e ra(*o- e7per)-e(t 0 - a(* , are
)(*epe(*e(t a(* P(A)=0.95@ P(B)=0.73 a(* P(C)=0.62.%o-p,te the )(*)cate* probab)l)ty or
e7pla)( why there )s (ot e(o,gh )(for-at)o( to *o so.
a P(A∩B∩C)
b P Ac ∩Bc
∩Cc)
A22!&CAT&1NS
1 &he sa-ple space that *escr)bes all three=ch)l* fa-)l)es accor*)(g to the ge(*ers of the
ch)l*re( w)th respect to b)rth or*er )s
S={bbb,bbg,bgb,bgg,gbb,gbg,ggb,ggg}
( the e7per)-e(t of select)(g a three=ch)l* fa-)ly at ra(*o- co-p,te each of the
follow)(g probab)l)t)es ass,-)(g all o,tco-es are e<,ally l)kely.
a &he probab)l)ty that the fa-)ly has at least two boys.
b &he probab)l)ty that the fa-)ly has at least two boys g)4e( that (ot all of the
ch)l*re( are g)rls.
c &he probab)l)ty that at least o(e ch)l* )s a boy.
* &he probab)l)ty that at least o(e ch)l* )s a boy g)4e( that the +rst bor( )s a
g)rl.
16 &he follow)(g two=way co(t)(ge(cy table g)4es the break*ow( of the pop,lat)o( )( a
part)c,lar locale accor*)(g to age a(* (,-ber of 4eh)c,lar -o4)(g 4)olat)o(s )( the past
three years:
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 171/723
Age
iolations
0 !.
Under 21 0.04 0.06 0.02
21–40 0.25 0.16 0.0141–60 0.23 0.10 0.02
60 0.08 0.03 0.00
A perso( )s selecte* at ra(*o-. )(* the follow)(g probab)l)t)es.
a &he perso( )s ,(*er 21.
b &he perso( has ha* at least two 4)olat)o(s )( the past three years.
c &he perso( has ha* at least two 4)olat)o(s )( the past three years g)4e( that
he )s ,(*er 21.
* &he perso( )s ,(*er 21 g)4e( that he has ha* at least two 4)olat)o(s )( the
past three years.
e eter-)(e whether the e4e(ts the perso( )s ,(*er 21 a(* the perso( has
ha* at least two 4)olat)o(s )( the past three years are )(*epe(*e(t or (ot.
19 &he follow)(g two=way co(t)(ge(cy table g)4es the break*ow( of the pop,lat)o( )( a
part)c,lar locale accor*)(g to party a?l)at)o( 0 - , or 'one5 a(* op)()o( o( a bo(*
)ss,e:
Affiliation
Opinion
Favors Opposes Undecided
A 0.12 0.09 0.07
B 0.16 0.12 0.14
C 0.04 0.03 0.06
None 0.08 0.06 0.03
A perso( )s selecte* at ra(*o-. )(* each of the follow)(g probab)l)t)es.
a &he perso( )s )( fa4or of the bo(* )ss,e.
b &he perso( )s )( fa4or of the bo(* )ss,e g)4e( that he )s a?l)ate* w)thparty 0.
c &he perso( )s )( fa4or of the bo(* )ss,e g)4e( that he )s a?l)ate* w)th
party -.
2; &he follow)(g two=way co(t)(ge(cy table g)4es the break*ow( of the pop,lat)o( of
patro(s at a grocery store accor*)(g to the (,-ber of )te-s p,rchase* a(* whether
or (ot the patro( -a*e a( )-p,lse p,rchase at the checko,t co,(ter:
Saylor URL: http://www.saylor.org/books Saylor.org11
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 172/723
Number of /tems
/mpulse urchase
#ade Not #ade
e/ 0.01 0.19
an 0.04 0.76A patro( )s selecte* at ra(*o-. )(* each of the follow)(g probab)l)t)es.
a &he patro( -a*e a( )-p,lse p,rchase.
b &he patro( -a*e a( )-p,lse p,rchase g)4e( that the total (,-ber of )te-s
p,rchase* was -a(y.
c eter-)(e whether or (ot the e4e(ts few p,rchases a(* -a*e a( )-p,lse
p,rchase at the checko,t co,(ter are )(*epe(*e(t.
21 &he follow)(g two=way co(t)(ge(cy table g)4es the break*ow( of the pop,lat)o( of
a*,lts )( a part)c,lar locale accor*)(g to e-ploy-e(t type a(* le4el of l)fe )(s,ra(ce:
$mplo1ment T1pe
2evel of /nsurance
2o3 #edium 4igh
Un+i""ed 0.07 0.19 0.00
e%i-+i""ed 0.04 0.28 0.08
i""ed 0.03 0.18 0.05
*roe++iona" 0.01 0.05 0.02
A( a*,lt )s selecte* at ra(*o-. )(* each of the follow)(g probab)l)t)es.
a &he perso( has a h)gh le4el of l)fe )(s,ra(ce.b &he perso( has a h)gh le4el of l)fe )(s,ra(ce g)4e( that he *oes (ot ha4e a
profess)o(al pos)t)o(.
c &he perso( has a h)gh le4el of l)fe )(s,ra(ce g)4e( that he has a profess)o(al
pos)t)o(.
* eter-)(e whether or (ot the e4e(ts has a h)gh le4el of l)fe )(s,ra(ce a(*
has a profess)o(al pos)t)o( are )(*epe(*e(t.
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 173/723
2 A -a( has two l)ghts )( h)s well ho,se to keep the p)pes fro- freeV)(g )( w)(ter. @e
checks the l)ghts *a)ly. Each l)ght has probab)l)ty ;.;;2 of b,r()(g o,t before )t )s
checke* the (e7t *ay )(*epe(*e(tly of the other l)ght5.
a f the l)ghts are w)re* )( parallel o(e w)ll co(t)(,e to sh)(e e4e( )f the other
b,r(s o,t. ( th)s s)t,at)o( co-p,te the probab)l)ty that at least o(e l)ght w)ll
co(t)(,e to sh)(e for the f,ll 2 ho,rs. Note the greatly )(crease* rel)ab)l)ty of
the syste- of two b,lbs o4er that of a s)(gle b,lb.
b f the l)ghts are w)re* )( ser)es (e)ther o(e w)ll co(t)(,e to sh)(e e4e( )f o(ly
o(e of the- b,r(s o,t. ( th)s s)t,at)o( co-p,te the probab)l)ty that at least
o(e l)ght w)ll co(t)(,e to sh)(e for the f,ll 2 ho,rs. Note the sl)ghtly
*ecrease* rel)ab)l)ty of the syste- of two b,lbs o4er that of a s)(gle b,lb.
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 174/723
20 A( acco,(ta(t has obser4e* that 0G of all cop)es of a part)c,lar two=part for- ha4e
a( error )( Part a(* 2G ha4e a( error )( Part . f the errors occ,r )(*epe(*e(tly
+(* the probab)l)ty that a ra(*o-ly selecte* for- w)ll be error=free.
2 A bo7 co(ta)(s 2; screws wh)ch are )*e(t)cal )( s)Ve b,t 12 of wh)ch are V)(c coate* a(*
6 of wh)ch are (ot. &wo screws are selecte* at ra(*o- w)tho,t replace-e(t.
a )(* the probab)l)ty that both are V)(c coate*.b )(* the probab)l)ty that at least o(e )s V)(c coate*.
A((&T&1NA! ++/C&S+S
2 E4e(ts 0 a(* - are -,t,ally e7cl,s)4e. )(* P(A|B).
26 &he c)ty co,(c)l of a part)c,lar c)ty )s co-pose* of +4e -e-bers of party 0 fo,r
-e-bers of party - a(* three )(*epe(*e(ts. &wo co,(c)l -e-bers are ra(*o-ly
selecte* to for- a( )(4est)gat)4e co--)ttee.
a )(* the probab)l)ty that both are fro- party 0.
b )(* the probab)l)ty that at least o(e )s a( )(*epe(*e(t.
c )(* the probab)l)ty that the two ha4e *)8ere(t party a?l)at)o(s that )s (ot
both 0 (ot both - a(* (ot both )(*epe(*e(t5.
29 A basketball player -akes ;G of the free throws that he atte-pts e7cept that )f he
has >,st tr)e* a(* -)sse* a free throw the( h)s cha(ces of -ak)(g a seco(* o(e go
*ow( to o(ly 3;G. S,ppose he has >,st bee( awar*e* two free throws.
a )(* the probab)l)ty that he -akes both.
b )(* the probab)l)ty that he -akes at least o(e. A tree *)agra- co,l* help.5
3; A( eco(o-)st w)shes to ascerta)( the proport)o( p of the pop,lat)o( of )(*)4)*,al
ta7payers who ha4e p,rposely s,b-)tte* fra,*,le(t )(for-at)o( o( a( )(co-e ta7
ret,r(. &o tr,ly g,ara(tee a(o(y-)ty of the ta7payers )( a ra(*o- s,r4ey ta7payers
<,est)o(e* are g)4e( the follow)(g )(str,ct)o(s.
1 l)p a co)(.
2 f the co)( la(*s hea*s a(swer Jes to the <,est)o( @a4e yo, e4er
s,b-)tte* fra,*,le(t )(for-at)o( o( a ta7 ret,r(C e4e( )f yo, ha4e
(ot.
3 f the co)( la(*s ta)ls g)4e a tr,thf,l Jes or No a(swer to the
<,est)o( @a4e yo, e4er s,b-)tte* fra,*,le(t )(for-at)o( o( a ta7
ret,r(C
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 175/723
&he <,est)o(er )s (ot tol* how the co)( la(*e* so he *oes (ot k(ow )f a Jes a(swer )s
the tr,th or )s g)4e( o(ly beca,se of the co)( toss.
a Us)(g the Probab)l)ty R,le for %o-ple-e(ts a(* the )(*epe(*e(ce of the co)(
toss a(* the ta7payersW stat,s +ll )( the e-pty cells )( the two=way
co(t)(ge(cy table show(. Ass,-e that the co)( )s fa)r. Each cell e7cept thetwo )( the botto- row w)ll co(ta)( the ,(k(ow( proport)o( or probab)l)ty5 p.
Status
Coin
robabilit14 T
ra&d p
No ra&d
*robabi"i' 1
b &he o(ly )(for-at)o( that the eco(o-)st sees are the e(tr)es )( the follow)(g
table:
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 176/723
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 177/723
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 178/723
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 179/723
Chapter 7
(iscrete /andom <ariables
,t is often the case that a number is naturally associated to the outcome of a random experiment: thenumber of boys in a three-child family$ the number of defective light bulbs in a case of 155 bulbs$ the
length of time until the next customer arrives at the drive-through window at a bank. +uch a number
varies from trial to trial of the corresponding experiment$ and does so in a way that cannot be
predicted with certainty hence$ it is called a random variable. ,n this chapter and the next we study
such variables.
7.% /andom <ariables
!+A/N&N: 1';+CT&<+S
1 &o lear( the co(cept of a ra(*o- 4ar)able.
2 &o lear( the *)st)(ct)o( betwee( *)screte a(* co(t)(,o,s ra(*o- 4ar)ables.
(e)nition
A random variable is a numerical quantity that is generated by a random experiment.
(e will denote random variables by capital letters$ such as B or C $ and the actual values that they can
take by lowercase letters$ such as x and z .
Table 6.1 0/our Candom Dariables0 gives four examples of random variables. ,n the second example$ the
three dots indicates that every counting number is a possible value for B . %lthough it is highly unlikely$ for
example$ that it would take 85 tosses of the coin to observe heads for the first time$ nevertheless it is
conceivable$ hence the number 85 is a possible value. The set of possible values is infinite$ but is still at
least countable$ in the sense that all possible values can be listed one after another. ,n the last two
examples$ by way of contrast$ the possible values cannot be individually listed$ but take up a wholeinterval of numbers. ,n the fourth example$ since the light bulb could conceivably continue to shine
indefinitely$ there is no natural greatest value for its lifetime$ so we simply place the symbol ∞ for infinity
as the right endpoint of the interval of possible values.
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 180/723
Table 6.1 /our Candom Dariables
+xperiment Number X
2ossible <alues
o X
Roll two fa)r *)ce
S,- of the (,-ber of *ots o(
the top faces
2 3 0 6 9
1; 11 12
l)p a fa)r co)( repeate*ly
N,-ber of tosses ,(t)l the co)(
la(*s hea*s 1 2 3 ]
eas,re the 4oltage at a(
electr)cal o,tlet 'oltage -eas,re* 116 ^ 122
"perate a l)ght b,lb ,(t)l )t
b,r(s o,t &)-e ,(t)l the b,lb b,r(s o,t ; _ `
e+()t)o(
A random variable is called discrete if it has either a finite or a countable number of possible values. A
random variable is called continuous if its possible values contain a whole interval of numbers.
The examples in the table are typical in that discrete random variables typically arise from a counting
process$ whereas continuous random variables typically arise from a measurement.
*+, TA*+AA,S
• A ra(*o- 4ar)able )s a (,-ber ge(erate* by a ra(*o- e7per)-e(t.
• A ra(*o- 4ar)able )s calle* discrete )f )ts poss)ble 4al,es for- a +()te or
co,(table set.
• A ra(*o- 4ar)able )s calle* continuous )f )ts poss)ble 4al,es co(ta)( a whole
)(ter4al of (,-bers.
++/C&S+S
'AS&C
Saylor URL: http://www.saylor.org/books Saylor.org16;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 181/723
1 %lass)fy each ra(*o- 4ar)able as e)ther *)screte or co(t)(,o,s.
a &he (,-ber of arr)4als at a( e-erge(cy roo- betwee( -)*()ght a(* :;;
a.-.
b &he we)ght of a bo7 of cereal labele* 16 o,(ces.
c &he *,rat)o( of the (e7t o,tgo)(g telepho(e call fro- a b,s)(ess o?ce.
* &he (,-ber of ker(els of popcor( )( a 1=po,(* co(ta)(er.
e &he (,-ber of appl)ca(ts for a >ob.
2 %lass)fy each ra(*o- 4ar)able as e)ther *)screte or co(t)(,o,s.
a &he t)-e betwee( c,sto-ers e(ter)(g a checko,t la(e at a reta)l store.
b &he we)ght of ref,se o( a tr,ck arr)4)(g at a la(*+ll.
c &he (,-ber of passe(gers )( a passe(ger 4eh)cle o( a h)ghway at r,sh ho,r.
* &he (,-ber of cler)cal errors o( a -e*)cal chart.
e &he (,-ber of acc)*e(t=free *ays )( o(e -o(th at a factory.
3 %lass)fy each ra(*o- 4ar)able as e)ther *)screte or co(t)(,o,s.
a &he (,-ber of boys )( a ra(*o-ly selecte* three=ch)l* fa-)ly.
b &he te-perat,re of a c,p of co8ee ser4e* at a resta,ra(t.
c &he (,-ber of (o=shows for e4ery 1;; reser4at)o(s -a*e w)th a co--erc)al
a)rl)(e.
* &he (,-ber of 4eh)cles ow(e* by a ra(*o-ly selecte* ho,sehol*.
e &he a4erage a-o,(t spe(t o( electr)c)ty each $,ly by a ra(*o-ly selecte*
ho,sehol* )( a certa)( state.
%lass)fy each ra(*o- 4ar)able as e)ther *)screte or co(t)(,o,s.
a &he (,-ber of patro(s arr)4)(g at a resta,ra(t betwee( 0:;; p.-. a(* :;;
p.-.
b &he (,-ber of (ew cases of )([,e(Va )( a part)c,lar co,(ty )( a co-)(g
-o(th.
c &he a)r press,re of a t)re o( a( a,to-ob)le.
* &he a-o,(t of ra)( recor*e* at a( a)rport o(e *ay.
Saylor URL: http://www.saylor.org/books Saylor.org161
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 182/723
e &he (,-ber of st,*e(ts who act,ally reg)ster for classes at a ,()4ers)ty (e7t
se-ester.
0 *e(t)fy the set of poss)ble 4al,es for each ra(*o- 4ar)able. ake a reaso(able
est)-ate base* o( e7per)e(ce where (ecessary.5
a &he (,-ber of hea*s )( two tosses of a co)(.
b &he a4erage we)ght of (ewbor( bab)es bor( )( a part)c,lar co,(ty o(e -o(th.
c &he a-o,(t of l)<,)* )( a 12=o,(ce ca( of soft *r)(k.
* &he (,-ber of ga-es )( the (e7t Borl* Ser)es best of ,p to se4e( ga-es5.
e &he (,-ber of co)(s that -atch whe( three co)(s are tosse* at o(ce.
*e(t)fy the set of poss)ble 4al,es for each ra(*o- 4ar)able. ake a reaso(able
est)-ate base* o( e7per)e(ce where (ecessary.5a &he (,-ber of hearts )( a +4e=car* ha(* *raw( fro- a *eck of 02 car*s that
co(ta)(s 13 hearts )( all.
b &he (,-ber of p)tches -a*e by a start)(g p)tcher )( a -a>or leag,e baseball
ga-e.
c &he (,-ber of break*ow(s of c)ty b,ses )( a large c)ty )( o(e week.
* &he *)sta(ce a re(tal car re(te* o( a *a)ly rate )s *r)4e( each *ay.
e &he a-o,(t of ra)(fall at a( a)rport (e7t -o(th.
ANS+/S
1 a. *)screte
a co(t)(,o,s
b co(t)(,o,s
c *)screte
* *)screte
3
a *)screte
b co(t)(,o,s
c *)screte
* *)screte
e co(t)(,o,s
Saylor URL: http://www.saylor.org/books Saylor.org162
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 183/723
0
a {0.1.2}
b a( )(ter4al (a,b)a(swers 4ary5
c a( )(ter4al (a,b)a(swers 4ary5
* M0
e M23
7.0 2robability (istributions or (iscrete/andom <ariables
!+A/N&N: 1';+CT&<+S
1 &o lear( the co(cept of the probab)l)ty *)str)b,t)o( of a *)screte ra(*o- 4ar)able.
2 &o lear( the co(cepts of the -ea( 4ar)a(ce a(* sta(*ar* *e4)at)o( of a *)screte
ra(*o- 4ar)able a(* how to co-p,te the-.
2robability (istributions %ssociated to each possible value x of a discrete random variable B is the probabilityP(x)that B will take
the value x in one trial of the experiment.
(e)nition
The probability distribution of a discrete random variable B is a list of each possible value
of B together with the probability that B takes that value in one trial of the experiment.
The probabilities in the probability distribution of a random variable B must satisfy the following two
conditions:
1 ach probabilityP(x)must be between 5 and 1:0≤P(x)≤1.
! The sum of all the probabilities is 1:ΣP(x)=1.
Saylor URL: http://www.saylor.org/books Saylor.org163
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 184/723
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 185/723
Saylor URL: http://www.saylor.org/books Saylor.org160
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 186/723
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 187/723
Figure 8.29robabilit/ Distribution for )ossing )&o Fair Dice
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 188/723
The >ean and Standard (eviation o a (iscrete /andom
<ariable
e+()t)o(
The mean Dalso called the e)pected value E of a discrete random variable B is the number
µ=E(X)=Σx P(x)
The mean of a random variable may be interpreted as the average of the values assumed by the random
variable in repeated trials of the experiment.
Saylor URL: http://www.saylor.org/books Saylor.org166
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 189/723
Saylor URL: http://www.saylor.org/books Saylor.org169
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 190/723
The concept of expected value is also basic to the insurance industry$ as the following simplified
example illustrates.
Saylor URL: http://www.saylor.org/books Saylor.org19;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 191/723
EKAPLE 0
A l)fe )(s,ra(ce co-pa(y w)ll sell a 2;;;;; o(e=year ter- l)fe )(s,ra(ce pol)cy to a(
)(*)4)*,al )( a part)c,lar r)sk gro,p for a pre-),- of 190. )(* the e7pecte* 4al,e to
the co-pa(y of a s)(gle pol)cy )f a perso( )( th)s r)sk gro,p has a 99.9G cha(ce of
s,r4)4)(g o(e year.
Sol,t)o(:
Let : *e(ote the (et ga)( to the co-pa(y fro- the sale of o(e s,ch pol)cy. &here are
two poss)b)l)t)es: the )(s,re* perso( l)4es the whole year or the )(s,re* perso( *)es
before the year )s ,p. Apply)(g the )(co-e -)(,s o,tgo pr)(c)ple )( the for-er case
the 4al,e of : )s 190 X ;D )( the latter case )t )s 195−200,000=−199,805.S)(ce the
probab)l)ty )( the +rst case )s ;.999 a(* )( the seco(* case )s 1−0.9997=0.0003 the
probab)l)ty *)str)b,t)o( for : )s:
Saylor URL: http://www.saylor.org/books Saylor.org191
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 192/723
Saylor URL: http://www.saylor.org/books Saylor.org192
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 193/723
Saylor URL: http://www.saylor.org/books Saylor.org193
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 194/723
%o-p,te each of the follow)(g <,a(t)t)es.
a. a.
b. P(0).
c. 9 : ;5.
*. 9 : ;5.
e. P(X≤−2).
f. &he -ea( μ of : .
g. &he 4ar)a(ce σ2of : .
h. &he sta(*ar* *e4)at)o( σ of : .
Sol,t)o(:
a. S)(ce all probab)l)t)es -,st a** ,p to 1 a=1−(0.2+0.5+0.1)=0.2.
b. )rectly fro- the table P(0)=0.5.
c. ro- the table P(X>0)=P(1)+P(4)=0.2+0.1=0.3.
d. ro- the table P(X≥0)=P(0)+P(1)+P(4)=0.5+0.2+0.1=0.8.
e. S)(ce (o(e of the (,-bers l)ste* as poss)ble 4al,es for : )s less tha( or e<,al to
X2 the e4e(t : ^ X2 )s )-poss)ble so 9 : ^ X25 ;.
f. Us)(g the for-,la )( the *e+()t)o( of μ
µ=Σx P(x)=(−1)D0.2+0D0.5+1D0.2+4D0.1=0.4
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 195/723
*+, TA*+AA,S
• &he probab)l)ty *)str)b,t)o( of a *)screte ra(*o- 4ar)able : )s a l)st)(g of each
poss)ble 4al,e take( by : alo(g w)th the probab)l)ty P(x)that : takes that 4al,e
)( o(e tr)al of the e7per)-e(t.
• &he -ea( μ of a *)screte ra(*o- 4ar)able : )s a (,-ber that )(*)cates the
a4erage 4al,e of : o4er (,-ero,s tr)als of the e7per)-e(t. t )s co-p,te* ,s)(g
the for-,la µ=Σx P(x).
• &he 4ar)a(ce σ2a(* sta(*ar* *e4)at)o( σ of a *)screte ra(*o- 4ar)able : are
(,-bers that )(*)cate the 4ar)ab)l)ty of : o4er (,-ero,s tr)als of the e7per)-e(t.
&hey -ay be co-p,te* ,s)(g the for-,la σ2= Σx2 P(x) ]−µ2 tak)(g the s<,are root
to obta)( σ .
Saylor URL: http://www.saylor.org/books Saylor.org190
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 196/723
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 197/723
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 198/723
Saylor URL: http://www.saylor.org/books Saylor.org196
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 199/723
1; Let : *e(ote the (,-ber of t)-es a fa)r co)( la(*s hea*s )( three tosses. %o(str,ct
the probab)l)ty *)str)b,t)o( of : .
11 )4e tho,sa(* lottery t)ckets are sol* for 1 each. "(e t)cket w)ll w)( 1;;; two t)ckets
w)ll w)( 0;; each a(* te( t)ckets w)ll w)( 1;; each. Let : *e(ote the (et ga)( fro-
the p,rchase of a ra(*o-ly selecte* t)cket.
a %o(str,ct the probab)l)ty *)str)b,t)o( of : .
b %o-p,te the e7pecte* 4al,e E(X)of : . (terpret )ts -ea()(g.
Saylor URL: http://www.saylor.org/books Saylor.org199
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 200/723
c %o-p,te the sta(*ar* *e4)at)o( σ of : .
12 Se4e( tho,sa(* lottery t)ckets are sol* for 0 each. "(e t)cket w)ll w)( 2;;; two
t)ckets w)ll w)( 0; each a(* +4e t)ckets w)ll w)( 1;; each. Let : *e(ote the (et ga)(
fro- the p,rchase of a ra(*o-ly selecte* t)cket.
a %o(str,ct the probab)l)ty *)str)b,t)o( of : .b %o-p,te the e7pecte* 4al,e E(X)of : . (terpret )ts -ea()(g.
c %o-p,te the sta(*ar* *e4)at)o( σ of : .
13 A( )(s,ra(ce co-pa(y w)ll sell a 9;;;; o(e=year ter- l)fe )(s,ra(ce pol)cy to a(
)(*)4)*,al )( a part)c,lar r)sk gro,p for a pre-),- of 6. )(* the e7pecte* 4al,e to
the co-pa(y of a s)(gle pol)cy )f a perso( )( th)s r)sk gro,p has a 99.2G cha(ce of
s,r4)4)(g o(e year.
1 A( )(s,ra(ce co-pa(y w)ll sell a 1;;;; o(e=year ter- l)fe )(s,ra(ce pol)cy to a(
)(*)4)*,al )( a part)c,lar r)sk gro,p for a pre-),- of 36. )(* the e7pecte* 4al,e to
the co-pa(y of a s)(gle pol)cy )f a perso( )( th)s r)sk gro,p has a 9.20G cha(ce of
s,r4)4)(g o(e year.
10 A( )(s,ra(ce co-pa(y est)-ates that the probab)l)ty that a( )(*)4)*,al )( a part)c,lar
r)sk gro,p w)ll s,r4)4e o(e year )s ;.9620. S,ch a perso( w)shes to b,y a 10;;;; o(e=
year ter- l)fe )(s,ra(ce pol)cy. Let , *e(ote how -,ch the )(s,ra(ce co-pa(y charges
s,ch a perso( for s,ch a pol)cy.a %o(str,ct the probab)l)ty *)str)b,t)o( of : . &wo e(tr)es )( the table w)ll
co(ta)(,.5
b %o-p,te the e7pecte* 4al,e E(X)of : .
c eter-)(e the 4al,e , -,st ha4e )( or*er for the co-pa(y to break e4e( o(
all s,ch pol)c)es that )s to a4erage a (et ga)( of Vero per pol)cy o( s,ch
pol)c)es5.
* eter-)(e the 4al,e , -,st ha4e )( or*er for the co-pa(y to a4erage a (et
ga)( of 20; per pol)cy o( all s,ch pol)c)es.
1 A( )(s,ra(ce co-pa(y est)-ates that the probab)l)ty that a( )(*)4)*,al )( a part)c,lar r)sk
gro,p w)ll s,r4)4e o(e year )s ;.99. S,ch a perso( w)shes to b,y a 0;;; o(e=year ter-
l)fe )(s,ra(ce pol)cy. Let , *e(ote how -,ch the )(s,ra(ce co-pa(y charges s,ch a
perso( for s,ch a pol)cy.
a %o(str,ct the probab)l)ty *)str)b,t)o( of : . &wo e(tr)es )( the table w)ll
co(ta)(,.5
Saylor URL: http://www.saylor.org/books Saylor.org2;;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 201/723
b %o-p,te the e7pecte* 4al,e E(X)of : .
c eter-)(e the 4al,e , -,st ha4e )( or*er for the co-pa(y to break e4e( o(
all s,ch pol)c)es that )s to a4erage a (et ga)( of Vero per pol)cy o( s,ch
pol)c)es5.
* eter-)(e the 4al,e , -,st ha4e )( or*er for the co-pa(y to a4erage a (etga)( of 10; per pol)cy o( all s,ch pol)c)es.
1 A ro,lette wheel has 36 slots. &h)rty=s)7 slots are (,-bere* fro- 1 to 3D half of the- are
re* a(* half are black. &he re-a)()(g two slots are (,-bere* ; a(* ;; a(* are gree(. (
a 1 bet o( re* the bettor pays 1 to play. f the ball la(*s )( a re* slot he rece)4es back
the *ollar he bet pl,s a( a**)t)o(al *ollar. f the ball *oes (ot la(* o( re* he loses h)s
*ollar. Let : *e(ote the (et ga)( to the bettor o( o(e play of the ga-e.
a %o(str,ct the probab)l)ty *)str)b,t)o( of : .
b %o-p,te the e7pecte* 4al,e E(X)of : a(* )(terpret )ts -ea()(g )( the
co(te7t of the proble-.
c %o-p,te the sta(*ar* *e4)at)o( of : .
16 A ro,lette wheel has 36 slots. &h)rty=s)7 slots are (,-bere* fro- 1 to 3D the re-a)()(g
two slots are (,-bere* ; a(* ;;. S,ppose the (,-ber ;; )s co(s)*ere* (ot to be
e4e( b,t the (,-ber ; )s st)ll e4e(. ( a 1 bet o( e4e( the bettor pays 1 to play. f
the ball la(*s )( a( e4e( (,-bere* slot he rece)4es back the *ollar he bet pl,s a(
a**)t)o(al *ollar. f the ball *oes (ot la(* o( a( e4e( (,-bere* slot he loses h)s *ollar.
Let : *e(ote the (et ga)( to the bettor o( o(e play of the ga-e.
a %o(str,ct the probab)l)ty *)str)b,t)o( of : .b %o-p,te the e7pecte* 4al,e E(X)of : a(* e7pla)( why th)s ga-e )s (ot
o8ere* )( a cas)(o where ; )s (ot co(s)*ere* e4e(5.
c %o-p,te the sta(*ar* *e4)at)o( of : .
Saylor URL: http://www.saylor.org/books Saylor.org2;1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 202/723
Saylor URL: http://www.saylor.org/books Saylor.org2;2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 203/723
Saylor URL: http://www.saylor.org/books Saylor.org2;3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 204/723
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 205/723
Saylor URL: http://www.saylor.org/books Saylor.org2;0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 206/723
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 207/723
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 208/723
7.3 The 'inomial (istribution
LEARNN! "#$E%&'ES
1 &o lear( the co(cept of a b)(o-)al ra(*o- 4ar)able.
2 &o lear( how to recog()Ve a ra(*o- 4ar)able as be)(g a b)(o-)al ra(*o- 4ar)able.
The experiment of tossing a fair coin three times and the experiment of observing the genders
according to birth order of the children in a randomly selected three-child family are completely
different$ but the random variables that count the number of heads in the coin toss and the number
Saylor URL: http://www.saylor.org/books Saylor.org2;6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 209/723
of boys in the family *assuming the two genders are e"ually likely) are the same random variable$ the
one with probability distribution
% histogram that graphically illustrates this probability distribution is given in/igure 6.6 0robability
2istribution for Three 7oins and Three 7hildren0. (hat is common to the two experiments is that we
perform three identical and independent trials of the same action$ each trial has only two outcomes
*heads or tails$ boy or girl)$ and the probability of success is the same number$ 5.8$ on every trial. The
random variable that is generated is called the binomial random variable *ith parameters n M
3 and p M 5.8. This is #ust one case of a general situation.
!igure . $robability (istribution for Three <oins and Three <hildren
(e)nition
Saylor URL: http://www.saylor.org/books Saylor.org2;9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 210/723
%uppose a random experiment has the following characteristics.
1 There are n identical and independent trials of a common procedure.
! There are exactly two possible outcomes for each trial, one termed ?success@ and the other ?failure.@
3 The probability of success on any one trial is the same number p.
Then the discrete random variable B that counts the number of successes in the n trials is the binomial
random variable *ith parameters nand p. Fe also say that B has a binomial distribution *ith
parameters n and p.
The following four examples illustrate the definition. 9ote how in every case GsuccessH is the outcome
that is counted$ not the outcome that we prefer or think is better in some sense.
1 % random sample of 1!8 students is selected from a large college in which the proportion of students
who are females is 8@B. +uppose B denotes the number of female students in the sample. ,n this
situation there are n M 1!8 identical and independent trials of a common procedure$ selecting a
student at random there are exactly two possible outcomes for each trial$ GsuccessH *what we are
counting$ that the student be female) and GfailureH and finally the probability of success on any one
trial is the same number pM 5.8@. B is a binomial random variable with parameters n M 1!8 and p M
5.8@.
! % multiple-choice test has 18 "uestions$ each of which has five choices. %n unprepared student taking
the test answers each of the "uestions completely randomly by choosing an arbitrary answer from the
five provided. +uppose B denotes the number of answers that the student gets right. B is a binomial
random variable with parameters n M 18 and p=1/5=0.20.
3 ,n a survey of 1$555 registered voters each voter is asked if he intends to vote for a candidate Titania
Aueen in the upcoming election. +uppose B denotes the number of voters in the survey who intend to
vote for Titania Aueen. B is a binomial random variable with n M 1555 and p e"ual to the true
proportion of voters *surveyed or not) who intend to vote for Titania Aueen.
6 %n experimental medication was given to 35 patients with a certain medical condition.
+uppose B denotes the number of patients who develop severe side effects. B is a binomial random
variable with n M 35 and p e"ual to the true probability that a patient with the underlying condition
will experience severe side effects if given that medication.
2robability Formula or a 'inomial /andom <ariable
Often the most difficult aspect of working a problem that involves the binomial random variable is
recogni'ing that the random variable in "uestion has a binomial distribution. Once that is known$
probabilities can be computed using the following formula.
Saylor URL: http://www.saylor.org/books Saylor.org21;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 211/723
Saylor URL: http://www.saylor.org/books Saylor.org211
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 212/723
Saylor URL: http://www.saylor.org/books Saylor.org212
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 213/723
Figure 8.;9robabilit/ Distribution of t!e -inomial 6andom <ariable in 'ote 8.2 >(ample
7>
Saylor URL: http://www.saylor.org/books Saylor.org213
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 214/723
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 215/723
Special Formulas or the >ean and Standard (eviation
o a 'inomial /andom <ariable
+ince a binomial random variable is a discrete random variable$ the formulas for its mean$ variance$
and standard deviation given in the previous section apply to it$ as we #ust saw in 9ote 6.!F
0xample @0 in the case of the mean. Eowever$ for the binomial random variable there are much
simpler formulas.
The Cumulative 2robability (istribution o a 'inomial
/andom <ariable
,n order to allow a broader range of more realistic problems 7hapter 1! 0%ppendix0 contains
probability tables for binomial random variables for various choices of the parameters n and p. These
tables are not the probability distributions that we have seen so far$ but are cumulative probability
distributions. ,n the place of the probability P(x)the table contains the probability
Saylor URL: http://www.saylor.org/books Saylor.org210
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 216/723
P(X≤x)=P(0)+P(1)+D D D +P(x)
This is illustrated in /igure 6.; 07umulative robabilities0. The probability entered in the tablecorresponds to the area of the shaded region. The reason for providing a cumulative table is that in
practical problems that involve a binomial random variable typically the probability that is sought is
of the form P(X≤x)orP(X≥x).The cumulative table is much easier to use for
computingP(X≤x)since all the individual probabilities have already been computed and added. The
one table suffices for bothP(X≤x)orP(X≥x)and can be used to readily obtain probabilities of the
formP(x) too$ because of the following formulas. The first is #ust the robability Cule for
7omplements.
!igure ./ <umulative $robabilities
,f B is a discrete random variable$ then
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 217/723
P(X≥x)=1−P(X≤x−1) and P(x)=P(X≤x)−P(X≤x−1)
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 218/723
b &he st,*e(t -,st g,ess correctly o( at least ;G of the <,est)o(s wh)ch
)s 0.60Y10=6<,est)o(s. &he probab)l)ty so,ght )s not P(6)a( easy -)stake to
-ake5 b,t
P(X≥6)=P(6)+P(7)+P(8)+P(9)+P(10)
(stea* of co-p,t)(g each of these +4e (,-bers ,s)(g the for-,la a(* a**)(g the-
we ca( ,se the table to obta)(
P(X≥6)=1−P(X≤5)=1−0.6230=0.3770
wh)ch )s -,ch less work a(* of s,?c)e(t acc,racy for the s)t,at)o( at ha(*.
EKAPLE 1;
A( appl)a(ce repa)r-a( ser4)ces +4e wash)(g -ach)(es o( s)te each *ay. "(e=
th)r* of the ser4)ce calls re<,)re )(stallat)o( of a part)c,lar part.
a. &he repa)r-a( has o(ly o(e s,ch part o( h)s tr,ck to*ay. )(* the probab)l)ty that
the o(e part w)ll be e(o,gh to*ay that )s that at -ost o(e wash)(g -ach)(e he ser4)ces
w)ll re<,)re )(stallat)o( of th)s part)c,lar part.
b. )(* the -)()-,- (,-ber of s,ch parts he sho,l* take w)th h)- each *ay )( or*er
that the probab)l)ty that he ha4e e(o,gh for the *ays ser4)ce calls )s at least 90G.
Sol,t)o(:
Let : *e(ote the (,-ber of ser4)ce calls to*ay o( wh)ch the part )s re<,)re*.
&he( : )s a b)(o-)al ra(*o- 4ar)able w)th para-eters n 0 a(* p=1 3=0.35−.
a. Note that the probab)l)ty )( <,est)o( )s (otP(1) b,t rather 9 : ^ 15. Us)(g the
c,-,lat)4e *)str)b,t)o( table )( %hapter 12 QAppe(*)7Q
P(X≤1)=0.4609
b. &he a(swer )s the s-allest (,-ber s,ch that the table e(try P(X≤x))s at
least ;.90;;. S)(ce P(X≤2)=0.7901)s less tha( ;.90 two parts are (ot e(o,gh.
Saylor URL: http://www.saylor.org/books Saylor.org216
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 219/723
S)(ce P(X≤3)=0.9547)s as large as ;.90 three parts w)ll s,?ce at least 90G of the
t)-e. &h,s the -)()-,- (ee*e* )s three.
*+, TA*+AA,S
• &he *)screte ra(*o- 4ar)able : that co,(ts the (,-ber of s,ccesses )( n
)*e(t)cal )(*epe(*e(t tr)als of a proce*,re that always res,lts )( e)ther of two
o,tco-es s,ccess or fa)l,re a(* )( wh)ch the probab)l)ty of s,ccess o( each
tr)al )s the sa-e (,-ber p )s calle* the b)(o-)al ra(*o- 4ar)able w)th
para-eters n a(* p.
• &here )s a for-,la for the probab)l)ty that the b)(o-)al ra(*o- 4ar)able w)th
para-eters n a(* p w)ll take a part)c,lar 4al,e .
• &here are spec)al for-,las for the -ea( 4ar)a(ce a(* sta(*ar* *e4)at)o( of the
b)(o-)al ra(*o- 4ar)able w)th para-eters n a(* p that are -,ch s)-pler tha(
the ge(eral for-,las that apply to all *)screte ra(*o- 4ar)ables.
• %,-,lat)4e probab)l)ty *)str)b,t)o( tables whe( a4a)lable fac)l)tate co-p,tat)o(
of probab)l)t)es e(co,(tere* )( typ)cal pract)cal s)t,at)o(s.
'AS&C
1 eter-)(e whether or (ot the ra(*o- 4ar)able : )s a b)(o-)al ra(*o- 4ar)able. f so
g)4e the 4al,es of n a(* p. f (ot e7pla)( why (ot.
a : )s the (,-ber of *ots o( the top face of fa)r *)e that )s rolle*.
b : )s the (,-ber of hearts )( a +4e=car* ha(* *raw( w)tho,t replace-e(t5
fro- a well=sh,de* or*)(ary *eck.
c : )s the (,-ber of *efect)4e parts )( a sa-ple of te( ra(*o-ly selecte* parts
co-)(g fro- a -a(,fact,r)(g process )( wh)ch ;.;2G of all parts are
*efect)4e.
* : )s the (,-ber of t)-es the (,-ber of *ots o( the top face of a fa)r *)e )s
e4e( )( s)7 rolls of the *)e.
e : )s the (,-ber of *)ce that show a( e4e( (,-ber of *ots o( the top face
whe( s)7 *)ce are rolle* at o(ce.
2 eter-)(e whether or (ot the ra(*o- 4ar)able : )s a b)(o-)al ra(*o- 4ar)able. f so
g)4e the 4al,es of n a(* p. f (ot e7pla)( why (ot.
a : )s the (,-ber of black -arbles )( a sa-ple of 0 -arbles *raw( ra(*o-ly
a(* w)tho,t replace-e(t fro- a bo7 that co(ta)(s 20 wh)te -arbles a(* 10
black -arbles.
Saylor URL: http://www.saylor.org/books Saylor.org219
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 220/723
b : )s the (,-ber of black -arbles )( a sa-ple of 0 -arbles *raw( ra(*o-ly
a(* w)th replace-e(t fro- a bo7 that co(ta)(s 20 wh)te -arbles a(* 10 black
-arbles.
c : )s the (,-ber of 4oters )( fa4or of propose* law )( a sa-ple 12;; ra(*o-ly
selecte* 4oters *raw( fro- the e(t)re electorate of a co,(try )( wh)ch 30G ofthe 4oters fa4or the law.
* : )s the (,-ber of +sh of a part)c,lar spec)es a-o(g the (e7t te( la(*e* by
a co--erc)al +sh)(g boat that are -ore tha( 13 )(ches )( le(gth whe( 1G
of all s,ch +sh e7cee* 13 )(ches )( le(gth.
e : )s the (,-ber of co)(s that -atch at least o(e other co)( whe( fo,r co)(s
are tosse* at o(ce.
3 : )s a b)(o-)al ra(*o- 4ar)able w)th para-eters n 12 a(* p ;.62. %o-p,te the
probab)l)ty )(*)cate*.
a P(11)
b P(9)
c P(0)
d P(13)
: )s a b)(o-)al ra(*o- 4ar)able w)th para-eters n 1 a(* p ;.. %o-p,te the
probab)l)ty )(*)cate*.
a P(14)
b P(4)
c P(0)
d P(20)
0 : )s a b)(o-)al ra(*o- 4ar)able w)th para-eters n 0 p ;.0. Use the tables
)(%hapter 12 QAppe(*)7Q to co-p,te the probab)l)ty )(*)cate*.
a 9 : ^ 35
Saylor URL: http://www.saylor.org/books Saylor.org22;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 221/723
b 9 : 35
c P(3)
d P(0)
e P(5)
6 : )s a b)(o-)al ra(*o- 4ar)able w)th para-eters n 0 p=0.35−.Use the table
)(%hapter 12 QAppe(*)7Q to co-p,te the probab)l)ty )(*)cate*.
a 9 : ^ 25
b 9 : 25
c P(2)
d P(0)
e P(5)
: )s a b)(o-)al ra(*o- 4ar)able w)th the para-eters show(. Use the tables )(%hapter
12 QAppe(*)7Q to co-p,te the probab)l)ty )(*)cate*.
a n 1; p ;.20 9 : ^ 5
b n 1; p ;.0 9 : ^ 5
c n 10 p ;.0 9 : ^ 5
* n 10 p ;.0 P(12)
e n 10 p=0.6−@ P(10≤X≤12)
6 : )s a b)(o-)al ra(*o- 4ar)able w)th the para-eters show(. Use the tables )(
%hapter 12 QAppe(*)7Q to co-p,te the probab)l)ty )(*)cate*.
a n 0 p ;.;0 9 : ^ 15
b n 0 p ;.0 9 : ^ 15
c n 1; p ;.0 9 : ^ 05
* n 1; p ;.0 P(12)
e n 1; p=0.6−@ P(5≤X≤8)
Saylor URL: http://www.saylor.org/books Saylor.org221
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 222/723
9 : )s a b)(o-)al ra(*o- 4ar)able w)th the para-eters show(. Use the spec)al for-,las
to co-p,te )ts -ea( μ a(* sta(*ar* *e4)at)o( σ .
a n 6 p ;.3
b n p ;.62
c n 12;; p ;.
* n 21;; p ;.2
1; : )s a b)(o-)al ra(*o- 4ar)able w)th the para-eters show(. Use the spec)al for-,las to
co-p,te )ts -ea( μ a(* sta(*ar* *e4)at)o( σ .
a n 1 p ;.00
b n 63 p ;.;0
c n 90 p ;.30
* n 10; p ;.9
Saylor URL: http://www.saylor.org/books Saylor.org222
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 223/723
1 A co)( )s be(t so that the probab)l)ty that )t la(*s hea*s ,p )s 2/3. &he co)( )s tosse* te(
t)-es.
Saylor URL: http://www.saylor.org/books Saylor.org223
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 224/723
a )(* the probab)l)ty that )t la(*s hea*s ,p at -ost +4e t)-es.
b )(* the probab)l)ty that )t la(*s hea*s ,p -ore t)-es tha( )t la(*s ta)ls ,p.
APPL%A&"NS
1 A( E(gl)sh=speak)(g to,r)st 4)s)ts a co,(try )( wh)ch 3;G of the pop,lat)o( speaks
E(gl)sh. @e (ee*s to ask so-eo(e *)rect)o(s.
a )(* the probab)l)ty that the +rst perso( he e(co,(ters w)ll be able to speak
E(gl)sh.
b &he to,r)st sees fo,r local people sta(*)(g at a b,s stop. )(* the probab)l)ty
that at least o(e of the- w)ll be able to speak E(gl)sh.
16 &he probab)l)ty that a( egg )( a reta)l package )s cracke* or broke( )s ;.;20.
a )(* the probab)l)ty that a carto( of o(e *oVe( eggs co(ta)(s (o eggs that are
e)ther cracke* or broke(.
b )(* the probab)l)ty that a carto( of o(e *oVe( eggs has )5 at least o(e that )s
e)ther cracke* or broke(D ))5 at least two that are cracke* or broke(.
c )(* the a4erage (,-ber of cracke* or broke( eggs )( o(e *oVe( carto(s.
19 A( appl)a(ce store sells 2; refr)gerators each week. &e( perce(t of all p,rchasers of a
refr)gerator b,y a( e7te(*e* warra(ty. Let : *e(ote the (,-ber of the (e7t 2;p,rchasers who *o so.
a 'er)fy that : sat)s+es the co(*)t)o(s for a b)(o-)al ra(*o- 4ar)able a(* +(* n
a(* p.
b )(* the probab)l)ty that : )s Vero.
c )(* the probab)l)ty that : )s two three or fo,r.
* )(* the probab)l)ty that : )s at least +4e.
2; A*4erse grow)(g co(*)t)o(s ha4e ca,se* 0G of grapefr,)t grow( )( a certa)( reg)o( to
be of )(fer)or <,al)ty. !rapefr,)t are sol* by the *oVe(.
a )(* the a4erage (,-ber of )(fer)or <,al)ty grapefr,)t per bo7 of a *oVe(.
b A bo7 that co(ta)(s two or -ore grapefr,)t of )(fer)or <,al)ty w)ll ca,se a
stro(g a*4erse c,sto-er react)o(. )(* the probab)l)ty that a bo7 of o(e *oVe(
grapefr,)t w)ll co(ta)( two or -ore grapefr,)t of )(fer)or <,al)ty.
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 225/723
21 &he probab)l)ty that a =o,(ce ske)( of a *)sco,(t worste* we)ght k()tt)(g yar( co(ta)(s
a k(ot )s ;.20. !o(er)l b,ys te( ske)(s to crochet a( afgha(.
a )(* the probab)l)ty that )5 (o(e of the te( ske)(s w)ll co(ta)( a k(otD ))5 at
-ost o(e w)ll.
b )(* the e7pecte* (,-ber of ske)(s that co(ta)( k(ots.
c )(* the -ost l)kely (,-ber of ske)(s that co(ta)( k(ots.
22 "(e=th)r* of all pat)e(ts who ,(*ergo a (o(=)(4as)4e b,t ,(pleasa(t -e*)cal test
re<,)re a se*at)4e. A laboratory perfor-s 2; s,ch tests *a)ly. Let : *e(ote the (,-ber
of pat)e(ts o( a(y g)4e( *ay who re<,)re a se*at)4e.
a 'er)fy that : sat)s+es the co(*)t)o(s for a b)(o-)al ra(*o- 4ar)able a(* +(* n
a(* p.
b )(* the probab)l)ty that o( a(y g)4e( *ay betwee( +4e a(* ()(e pat)e(ts w)ll
re<,)re a se*at)4e )(cl,*e +4e a(* ()(e5.
c )(* the a4erage (,-ber of pat)e(ts each *ay who re<,)re a se*at)4e.
* Us)(g the c,-,lat)4e probab)l)ty *)str)b,t)o( for : )( %hapter 12 QAppe(*)7Q
+(* the -)()-,- (,-ber x minof *oses of the se*at)4e that sho,l* be o( ha(*
at the start of the *ay so that there )s a 99G cha(ce that the laboratory w)ll
(ot r,( o,t.
23 Abo,t 2G of al,-() g)4e -o(ey ,po( rece)4)(g a sol)c)tat)o( fro- the college or
,()4ers)ty fro- wh)ch they gra*,ate*. )(* the a4erage (,-ber -o(etary g)fts a
college ca( e7pect fro- e4ery 2;;; sol)c)tat)o(s )t se(*s.
2 "f all college st,*e(ts who are el)g)ble to g)4e bloo* abo,t 16G *o so o( a reg,lar
bas)s. Each -o(th a local bloo* ba(k se(*s a( appeal to g)4e bloo* to 20; ra(*o-ly
selecte* st,*e(ts. )(* the a4erage (,-ber of appeals )( s,ch -a)l)(gs that are -a*e
to st,*e(ts who alrea*y g)4e bloo*.
20 Abo,t 12G of all )(*)4)*,als wr)te w)th the)r left ha(*s. A class of 13; st,*e(ts -eets )(
a classroo- w)th 13; )(*)4)*,al *esks e7actly 1 of wh)ch are co(str,cte* for people
who wr)te w)th the)r left ha(*s. )(* the probab)l)ty that e7actly 1 of the st,*e(ts
e(rolle* )( the class wr)te w)th the)r left ha(*s.
2 A tra4ell)(g sales-a( -akes a sale o( 0G of h)s calls o( reg,lar c,sto-ers. @e -akes
fo,r sales calls each *ay.
Saylor URL: http://www.saylor.org/books Saylor.org220
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 226/723
a %o(str,ct the probab)l)ty *)str)b,t)o( of : the (,-ber of sales -a*e each
*ay.
b )(* the probab)l)ty that o( a ra(*o-ly selecte* *ay the sales-a( w)ll -ake
a sale.
c Ass,-)(g that the sales-a( -akes 2; sales calls per week +(* the -ea(a(* sta(*ar* *e4)at)o( of the (,-ber of sales -a*e per &ee? .
2 A corporat)o( has a*4ert)se* hea4)ly to try to )(s,re that o4er half the a*,lt pop,lat)o(
recog()Ves the bra(* (a-e of )ts pro*,cts. ( a ra(*o- sa-ple of 2; a*,lts 1
recog()Ve* )ts bra(* (a-e. Bhat )s the probab)l)ty that 1 or -ore people )( s,ch a
sa-ple wo,l* recog()Ve )ts bra(* (a-e )f the act,al proport)o( p of all a*,lts who
recog()Ve the bra(* (a-e were o(ly ;.0;C
A((&T&1NA! ++/C&S+S
26 Bhe( *roppe* o( a har* s,rface a th,-btack la(*s w)th )ts sharp po)(t to,ch)(g the
s,rface w)th probab)l)ty 2/3D )t la(*s w)th )ts sharp po)(t *)recte* ,p )(to the a)r w)th
probab)l)ty 1/3. &he tack )s *roppe* a(* )ts la(*)(g pos)t)o( obser4e* 10 t)-es.
a )(* the probab)l)ty that )t la(*s w)th )ts po)(t )( the a)r at least t)-es.
b f the e7per)-e(t of *ropp)(g the tack 10 t)-es )s *o(e repeate*ly what )s
the a4erage (,-ber of t)-es )t la(*s w)th )ts po)(t )( the a)rC
29 A profess)o(al proofrea*er has a 96G cha(ce of *etect)(g a( error )( a p)ece of wr)tte(
work other tha( -)sspell)(gs *o,ble wor*s a(* s)-)lar errors that are -ach)(e
*etecte*5. A work co(ta)(s fo,r errors.
a )(* the probab)l)ty that the proofrea*er w)ll -)ss at least o(e of the-.
b Show that two s,ch proofrea*ers work)(g )(*epe(*e(tly ha4e a 99.9G
cha(ce of *etect)(g a( error )( a p)ece of wr)tte( work.
c )(* the probab)l)ty that two s,ch proofrea*ers work)(g )(*epe(*e(tly w)ll
-)ss at least o(e error )( a work that co(ta)(s fo,r errors.
3; A -,lt)ple cho)ce e7a- has 2; <,est)o(sD there are fo,r cho)ces for each <,est)o(.
a A st,*e(t g,esses the a(swer to e4ery <,est)o(. )(* the cha(ce that he
g,esses correctly betwee( fo,r a(* se4e( t)-es.
b )(* the -)()-,- score the )(str,ctor ca( set so that the probab)l)ty that a
st,*e(t w)ll pass >,st by g,ess)(g )s 2;G or less.
31 ( sp)te of the re<,)re-e(t that all *ogs boar*e* )( a ke((el be )(oc,late* the
cha(ce that a healthy *og boar*e* )( a clea( well=4e(t)late* ke((el w)ll *e4elop
ke((el co,gh fro- a carr)er )s ;.;;6.
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 227/723
a f a carr)er (ot k(ow( to be s,ch of co,rse5 )s boar*e* w)th three other *ogs
what )s the probab)l)ty that at least o(e of the three healthy *ogs w)ll *e4elop
ke((el co,ghC
b f a carr)er )s boar*e* w)th fo,r other *ogs what )s the probab)l)ty that at least
o(e of the fo,r healthy *ogs w)ll *e4elop ke((el co,ghC
c &he patter( e4)*e(t fro- parts a5 a(* b5 )s that )f K+1*ogs are boar*e*
together o(e a carr)er a(* @ healthy *ogs the( the probab)l)ty that at least
o(e of the healthy *ogs w)ll *e4elop ke((el co,gh )s P(X≥1)=1−(0.992)K
where : )s the b)(o-)al ra(*o- 4ar)able that co,(ts the (,-ber of healthy
*ogs that *e4elop the co(*)t)o(. E7per)-e(t w)th *)8ere(t 4al,es of @ )( th)s
for-,la to +(* the -a7)-,- (,-ber K+1of *ogs that a ke((el ow(er ca(
boar* together so that )f o(e of the *ogs has the co(*)t)o( the cha(ce that
a(other *og w)ll be )(fecte* )s less tha( ;.;0.
32 (4est)gators (ee* to *eter-)(e wh)ch of ;; a*,lts ha4e a -e*)cal co(*)t)o( that
a8ects 2G of the a*,lt pop,lat)o(. A bloo* sa-ple )s take( fro- each of the
)(*)4)*,als.
a Show that the e7pecte* (,-ber of *)sease* )(*)4)*,als )( the gro,p of ;; )s
12 )(*)4)*,als.
b (stea* of test)(g all ;; bloo* sa-ples to +(* the e7pecte* 12 *)sease*
)(*)4)*,als )(4est)gators gro,p the sa-ples )(to ; gro,ps of 1; each -)7 a
l)ttle of the bloo* fro- each of the 1; sa-ples )( each gro,p a(* test each of
the ; -)7t,res. Show that the probab)l)ty that a(y s,ch -)7t,re w)ll co(ta)(
the bloo* of at least o(e *)sease* perso( he(ce test pos)t)4e )s abo,t ;.16.
c #ase* o( the res,lt )( b5 show that the e7pecte* (,-ber of -)7t,res that
test pos)t)4e )s abo,t 11. S,ppos)(g that )(*ee* 11 of the ; -)7t,res test
pos)t)4e the( we k(ow that (o(e of the 9; perso(s whose bloo* was )( the
re-a)()(g 9 sa-ples that teste* (egat)4e has the *)sease. Be ha4e
el)-)(ate* 9; perso(s fro- o,r search wh)le perfor-)(g o(ly ; tests.5
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 228/723
Saylor URL: http://www.saylor.org/books Saylor.org226
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 229/723
Saylor URL: http://www.saylor.org/books Saylor.org229
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 230/723
Chapter 8
Continuous /andom <ariables
%s discussed in +ection 6.1 0Candom Dariables0 in 7hapter 6 02iscrete Candom Dariables0$ a random variable is called continuous if its set of possible values contains a whole interval of decimal numbers.
,n this chapter we investigate such random variables.
8.% Continuous /andom <ariables
LEARNN! "#$E%&'ES
1 &o lear( the co(cept of the probab)l)ty *)str)b,t)o( of a co(t)(,o,s ra(*o-
4ar)able a(* how )t )s ,se* to co-p,te probab)l)t)es.
2 &o lear( bas)c facts abo,t the fa-)ly of (or-ally *)str)b,te* ra(*o- 4ar)ables.
The 2robability (istribution o a Continuous /andom
<ariable
/or a discrete random variable B the probability that B assumes one of its possible values on a single trial
of the experiment makes good sense. This is not the case for a continuous random variable. /or example$suppose B denotes the length of time a commuter #ust arriving at a bus stop has to wait for the next bus. ,f
buses run every 35 minutes without fail$ then the set of possible values of B is the interval denoted[0,30]
the set of all decimal numbers between 5 and 35. 4ut although the number @.!11F1; is a possible value
of B $ there is little or no meaning to the concept of the probability that the commuter will wait precisely
@.!11F1; minutes for the next bus. ,f anything the probability should be 'ero$ since if we could
meaningfully measure the waiting time to the nearest millionth of a minute it is practically inconceivable
that we would ever get exactly @.!11F1; minutes. ore meaningful "uestions are those of the form: (hat
is the probability that the commuterUs waiting time is less than 15 minutes$ or is between 8 and 15
minutes= ,n other words$ with continuous random variables one is concerned not with the event that the
variable assumes a single particular value$ but with the event that the random variable assumes a value in
a particular interval.
e+()t)o(
Saylor URL: http://www.saylor.org/books Saylor.org23;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 231/723
The probability distribution of a continuous random variable B is an assignment of
probabilities to intervals of decimal numbers using a function f(x)$ called a density function$ in the
following way8 the probability that B assumes a value in the interval [a,b]is equal to the area of the
region that is bounded above by the graph of the equation y=f(x)$bounded below by the x:axis, and
bounded on the left and right by the vertical lines through a and b$ as illustrated in !igure ."
G$robability #iven as Area of a -egion under a <urveG .
!igure ." $robability #iven as Area of a -egion under a <urve
This definition can be understood as a natural outgrowth of the discussion in+ection !.1.3 0Celative
/re"uency Eistograms0 in 7hapter ! 02escriptive +tatistics0. There we saw that if we have in view a
population *or a very large sample) and make measurements with greater and greater precision$ then
as the bars in the relative fre"uency histogram become exceedingly fine their vertical sides merge
and disappear$ and what is left is #ust the curve formed by their tops$ as shown in /igure !.8 0+ample
+i'e and Celative /re"uency Eistograms0 in 7hapter ! 02escriptive +tatistics0. oreover the total
area under the curve is 1$ and the proportion of the population with measurements between two
numbersa and b is the area under the curve and between a and b$ as shown in /igure !.; 0% Dery
Saylor URL: http://www.saylor.org/books Saylor.org231
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 232/723
/ine Celative /re"uency Eistogram0 in 7hapter ! 02escriptive +tatistics0. ,f we think of B as a
measurement to infinite precision arising from the selection of any one member of the population at
random$ thenP(a<X<b)is simply the proportion of the population with measurements
between a and b$ the curve in the relative fre"uency histogram is the density function for B $ and we
arrive at the definition #ust above.
very density functionf(x)must satisfy the following two conditions:
1 /or all numbers x f(x)≥0$ so that the graph of y=f(x)never drops below the x -axis.
! The area of the region under the graph ofy=f(x)and above the x -axis is 1.
4ecause the area of a line segment is 5$ the definition of the probability distribution of a continuous
random variable implies that for any particular decimal number$ say a$ the probability
that B assumes the exact value a is 5. This property implies that whether or not the endpoints of an
interval are included makes no difference concerning the probability of the interval.
/or any continuous random variable B :
P(a≤X≤b)=P(a<X≤b)=P(a≤X<b)=P(a<X<b)
EKAPLE 1
A ra(*o- 4ar)able : has the ,()for- *)str)b,t)o( o( the )(ter4al [0,1]: the *e(s)ty
f,(ct)o( )s f(x)=1)f )s betwee( ; a(* 1 a(* f(x)=0for all other 4al,es of as
show( )( )g,re 0.2 QU()for- )str)b,t)o( o( Q.
Figure ;.2Uniform Distribution on [0,1]
Saylor URL: http://www.saylor.org/books Saylor.org232
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 233/723
a. )(* 9 : ;.05 the probab)l)ty that : ass,-es a 4al,e greater tha( ;.0.
b. )(* 9 : ;.25 the probab)l)ty that : ass,-es a 4al,e less tha( or e<,al
to ;.2.
c. )(* 9;. _ : _ ;.5 the probab)l)ty that : ass,-es a 4al,e betwee( ;.
a(* ;..
Sol,t)o(:
a. 9 : ;.05 )s the area of the recta(gle of he)ght 1 a(* base le(gth 1−0.75=0.25
he(ce )s base×height=(0.25)Y(1)=0.25.See )g,re 0.3 QProbab)l)t)es fro- the U()for- )str)b,t)o(
o( Qa5.
b. 9 : ;.25 )s the area of the recta(gle of he)ght 1 a(* base le(gth 0.2−0=0.2 he(ce
)s base×height=(0.2)Y(1)=0.2.See )g,re 0.3 QProbab)l)t)es fro- the U()for- )str)b,t)o( o(
Qb5.
c. 9;. _ : _ ;.5 )s the area of the recta(gle of he)ght 1 a(* le(gth 0.7−0.4=0.3 he(ce
)s base×height=(0.3)Y(1)=0.3.See )g,re 0.3 QProbab)l)t)es fro- the U()for- )str)b,t)o( o(Qc5.
Figure ;.*9robabilities from t!e Uniform Distribution on [0,1]
Saylor URL: http://www.saylor.org/books Saylor.org233
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 234/723
EKAPLE 2
A -a( arr)4es at a b,s stop at a ra(*o- t)-e that )s w)th (o regar* for the
sche*,le* ser4)ce5 to catch the (e7t b,s. #,ses r,( e4ery 3; -)(,tes w)tho,t fa)l
he(ce the (e7t b,s w)ll co-e a(y t)-e *,r)(g the (e7t 3; -)(,tes w)th e4e(ly
*)str)b,te* probab)l)ty a ,()for- *)str)b,t)o(5. )(* the probab)l)ty that a b,s w)ll
co-e w)th)( the (e7t 1; -)(,tes.
Sol,t)o(:
&he graph of the *e(s)ty f,(ct)o( )s a hor)Vo(tal l)(e abo4e the )(ter4al fro- ; to
3; a(* )s the =a7)s e4erywhere else. S)(ce the total area ,(*er the c,r4e -,st
be 1 the he)ght of the hor)Vo(tal l)(e )s 1/3;. See )g,re 0. QProbab)l)ty of
Ba)t)(g At ost 1; )(,tes for a #,sQ. &he probab)l)ty so,ght )s P(0≤X≤10).#y
*e+()t)o( th)s probab)l)ty )s the area of the recta(g,lar reg)o( bo,(*e* abo4e by
the hor)Vo(tal l)(e f(x)=1/30 bo,(*e* below by the =a7)s bo,(*e* o( the left by
the 4ert)cal l)(e at ; the / =a7)s5 a(* bo,(*e* o( the r)ght by the 4ert)cal l)(e at
1;. &h)s )s the sha*e* reg)o( )( )g,re 0. QProbab)l)ty of Ba)t)(g At ost 1;
)(,tes for a #,sQ. ts area )s the base of the recta(gle t)-es )ts
he)ght 10D(1/30)=1/3. &h,s P(0≤X≤10)=1/3.
Saylor URL: http://www.saylor.org/books Saylor.org23
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 235/723
Figure ;.89robabilit/ of 3aiting 0t ost 1A inutes for a -us
Saylor URL: http://www.saylor.org/books Saylor.org230
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 236/723
!igure . 9ell <urves with 6 H 5.& and (ifferent 0alues of
The value of 6 determines whether the bell curve is tall and thin or short and s"uat$ sub#ect always to
the condition that the total area under the curve be e"ual to 1. This is shown in /igure 8.; 04ell
7urves with 0$ where we have arbitrarily chosen to center the curves at M ;.
!igure ./ 9ell <urves with H / and (ifferent 0alues of 6
Saylor URL: http://www.saylor.org/books Saylor.org23
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 237/723
e+()t)o(
The probability distribution corresponding to the density function for the bell curve with
parameters and 6 is called the normal distribution with mean and standard deviation 6 .
e+()t)o(
A continuous random variable whose probabilities are described by the normal distribution with
mean and standard deviation 6 is called a normally distributed random variable or a
normal random variable for short, with mean and standard deviation 6 .
/igure 8.@ 02ensity /unction for a 9ormally 2istributed Candom Dariable with ean 0 shows the
density function that determines the normal distribution with mean and standard deviation 6 . (e
repeat an important fact about this curve:
Saylor URL: http://www.saylor.org/books Saylor.org23
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 238/723
The density curve for the normal distribution is symmetric about the mean.
!igure .1 (ensity !unction for a Iormally (istributed -andom 0ariable with 2ean and %tandard
(eviation 6
EKAPLE 3
@e)ghts of 20=year=ol* -e( )( a certa)( reg)o( ha4e -ea( 9.0 )(ches a(*
sta(*ar* *e4)at)o( 2.09 )(ches. &hese he)ghts are appro7)-ately (or-ally
*)str)b,te*. &h,s the he)ght : of a ra(*o-ly selecte* 20=year=ol* -a( )s a (or-al
ra(*o- 4ar)able w)th -ea( μ 9.0 a(* sta(*ar* *e4)at)o( σ 2.09. Sketch a
<,al)tat)4ely acc,rate graph of the *e(s)ty f,(ct)o( for : . )(* the probab)l)ty that
a ra(*o-ly selecte* 20=year=ol* -a( )s -ore tha( 9.0 )(ches tall.
Sol,t)o(:
&he *)str)b,t)o( of he)ghts looks l)ke the bell c,r4e )( )g,re 0.6 Qe(s)ty ,(ct)o(
for @e)ghts of 20=Jear="l* e(Q. &he )-porta(t po)(t )s that )t )s ce(tere* at )ts
-ea( 9.0 a(* )s sy--etr)c abo,t the -ea(.
Saylor URL: http://www.saylor.org/books Saylor.org236
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 239/723
Figure ;."Densit/ Function for eig!ts of 2;+Bear+ld en
S)(ce the total area ,(*er the c,r4e )s 1 by sy--etry the area to the r)ght of
9.0 )s half the total or ;.0. #,t th)s area )s prec)sely the probab)l)ty 9 :
9.05 the probab)l)ty that a ra(*o-ly selecte* 20=year=ol* -a( )s -ore tha(
9.0 )(ches tall.
Be w)ll lear( how to co-p,te other probab)l)t)es )( the (e7t two sect)o(s.
*+, TA*+AA,S
• or a co(t)(,o,s ra(*o- 4ar)able : the o(ly probab)l)t)es that are co-p,te* are
those of : tak)(g a 4al,e )( a spec)+e* )(ter4al.
• &he probab)l)ty that : take a 4al,e )( a part)c,lar )(ter4al )s the sa-e whether or
(ot the e(*po)(ts of the )(ter4al are )(cl,*e*.
• &he probab)l)ty P(a<X<b)@ that : take a 4al,e )( the )(ter4al fro- a to b )s the area
of the reg)o( betwee( the 4ert)cal l)(es thro,gh a a(* b abo4e the =a7)s a(*
below the graph of a f,(ct)o( f(x)calle* the *e(s)ty f,(ct)o(.
• A (or-ally *)str)b,te* ra(*o- 4ar)able )s o(e whose *e(s)ty f,(ct)o( )s a bell
c,r4e.
Saylor URL: http://www.saylor.org/books Saylor.org239
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 240/723
• E4ery bell c,r4e )s sy--etr)c abo,t )ts -ea( a(* l)es e4erywhere abo4e the =
a7)s wh)ch )t approaches asy-ptot)cally arb)trar)ly closely w)tho,t to,ch)(g5.
EKER%SES
#AS%
1 A co(t)(,o,s ra(*o- 4ar)able : has a ,()for- *)str)b,t)o( o( the )(ter4al [5,12].Sketch the
graph of )ts *e(s)ty f,(ct)o(.
2 A co(t)(,o,s ra(*o- 4ar)able : has a ,()for- *)str)b,t)o( o( the )(ter4al [−3,3].Sketch
the graph of )ts *e(s)ty f,(ct)o(.
3 A co(t)(,o,s ra(*o- 4ar)able : has a (or-al *)str)b,t)o( w)th -ea( 1;; a(* sta(*ar*
*e4)at)o( 1;. Sketch a <,al)tat)4ely acc,rate graph of )ts *e(s)ty f,(ct)o(.
A co(t)(,o,s ra(*o- 4ar)able : has a (or-al *)str)b,t)o( w)th -ea( 3 a(* sta(*ar*
*e4)at)o( 2.0. Sketch a <,al)tat)4ely acc,rate graph of )ts *e(s)ty f,(ct)o(.
0 A co(t)(,o,s ra(*o- 4ar)able : has a (or-al *)str)b,t)o( w)th -ea( 3. &he probab)l)ty
that : takes a 4al,e greater tha( 6; )s ;.212. Use th)s )(for-at)o( a(* the sy--etry of
the *e(s)ty f,(ct)o( to +(* the probab)l)ty that : takes a 4al,e less tha( . Sketch the
*e(s)ty c,r4e w)th rele4a(t reg)o(s sha*e* to )ll,strate the co-p,tat)o(.
A co(t)(,o,s ra(*o- 4ar)able : has a (or-al *)str)b,t)o( w)th -ea( 19. &he
probab)l)ty that : takes a 4al,e greater tha( 16; )s ;.1. Use th)s )(for-at)o( a(* the
sy--etry of the *e(s)ty f,(ct)o( to +(* the probab)l)ty that : takes a 4al,e less tha(
106. Sketch the *e(s)ty c,r4e w)th rele4a(t reg)o(s sha*e* to )ll,strate the
co-p,tat)o(.
A co(t)(,o,s ra(*o- 4ar)able : has a (or-al *)str)b,t)o( w)th -ea( 0;.0. &he
probab)l)ty that : takes a 4al,e less tha( 0 )s ;.. Use th)s )(for-at)o( a(* the
sy--etry of the *e(s)ty f,(ct)o( to +(* the probab)l)ty that : takes a 4al,e greater
tha( . Sketch the *e(s)ty c,r4e w)th rele4a(t reg)o(s sha*e* to )ll,strate the
co-p,tat)o(.
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 241/723
6 A co(t)(,o,s ra(*o- 4ar)able : has a (or-al *)str)b,t)o( w)th -ea( 12.20. &he
probab)l)ty that : takes a 4al,e less tha( 13 )s ;.62. Use th)s )(for-at)o( a(* the
sy--etry of the *e(s)ty f,(ct)o( to +(* the probab)l)ty that : takes a 4al,e greater
tha( 11.0;. Sketch the *e(s)ty c,r4e w)th rele4a(t reg)o(s sha*e* to )ll,strate the
co-p,tat)o(.
9 &he +g,re pro4)*e* shows the *e(s)ty c,r4es of three (or-ally *)str)b,te* ra(*o-
4ar)ables : 0 : - a(* : ,. &he)r sta(*ar* *e4)at)o(s )( (o part)c,lar or*er5 are 10
a(* 2;. Use the +g,re to )*e(t)fy the 4al,es of the -ea(s µA µB a(* µC a(* sta(*ar*
*e4)at)o(s σA σB a(* σC of the three ra(*o- 4ar)ables.
1; &he +g,re pro4)*e* shows the *e(s)ty c,r4es of three (or-ally *)str)b,te* ra(*o-
4ar)ables : 0 : - a(* : ,. &he)r sta(*ar* *e4)at)o(s )( (o part)c,lar or*er5 are 2; 0
a(* 1;. Use the +g,re to )*e(t)fy the 4al,es of the -ea(s µA µB a(* µC a(* sta(*ar*
*e4)at)o(s σA σB a(* σC of the three ra(*o- 4ar)ables.
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 242/723
A22!&CAT&1NS
11 ogberrys alar- clock )s battery operate*. &he battery co,l* fa)l w)th e<,al probab)l)ty
at a(y t)-e of the *ay or ()ght. E4ery *ay ogberry sets h)s alar- for :3; a.-. a(*
goes to be* at 1;:;; p.-. )(* the probab)l)ty that whe( the clock battery +(ally *)es )t
w)ll *o so at the -ost )(co(4e()e(t t)-e betwee( 1;:;; p.-. a(* :3; a.-.
12 #,ses r,(()(g a b,s l)(e (ear es*e-o(as ho,se r,( e4ery 10 -)(,tes. B)tho,t
pay)(g atte(t)o( to the sche*,le she walks to the (earest stop to take the b,s to tow(.
)(* the probab)l)ty that she wa)ts -ore tha( 1; -)(,tes.
13 &he a-o,(t : of ora(ge >,)ce )( a ra(*o-ly selecte* half=gallo( co(ta)(er 4ar)es
accor*)(g to a (or-al *)str)b,t)o( w)th -ea( o,(ces a(* sta(*ar* *e4)at)o( ;.20
o,(ce.
a Sketch the graph of the *e(s)ty f,(ct)o( for : .
b Bhat proport)o( of all co(ta)(ers co(ta)( less tha( a half gallo( o,(ces5C
E7pla)(.
c Bhat )s the -e*)a( a-o,(t of ora(ge >,)ce )( s,ch co(ta)(ersC E7pla)(.
1 &he we)ght : of grass see* )( bags -arke* 0; lb 4ar)es accor*)(g to a (or-al
*)str)b,t)o( w)th -ea( 0; lb a(* sta(*ar* *e4)at)o( 1 o,(ce ;.;20 lb5.a Sketch the graph of the *e(s)ty f,(ct)o( for : .
b Bhat proport)o( of all bags we)gh less tha( 0; po,(*sC E7pla)(.
c Bhat )s the -e*)a( we)ght of s,ch bagsC E7pla)(.
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 243/723
8.0 The Standard Normal (istribution
LEARNN! "#$E%&'ES
1 &o lear( what a sta(*ar* (or-al ra(*o- 4ar)able )s.
2 &o lear( how to ,se )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ to co-p,te
probab)l)t)es relate* to a sta(*ar* (or-al ra(*o- 4ar)able.
e+()t)o(
A standard normal random variable is a normally distributed random variable with mean M
5 and standard deviation 6 M 1. >t will always be denoted by the letter C .
Saylor URL: http://www.saylor.org/books Saylor.org23
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 244/723
The density function for a standard normal random variable is shown in /igure 8.F 02ensity 7urve
for a +tandard 9ormal Candom Dariable0.
!igure .4 (ensity <urve for a %tandard Iormal -andom 0ariable
To compute probabilities for C we will not work with its density function directly but instead read
probabilities out of /igure 1!.! 07umulative 9ormal robability0 in 7hapter 1! 0%ppendix0. The
tables are tables of cumulative probabilities their entries are probabilities of the form P(Z<z).The
use of the tables will be explained by the following series of examples.
+A>2!+ 7
)(* the probab)l)t)es )(*)cate* where as always C *e(otes a sta(*ar* (or-al
ra(*o- 4ar)able.a. 9 C _ 1.65.
b. 9 C _ X;.205.
Sol,t)o(:
a. )g,re 0.1; Q%o-p,t)(g Probab)l)t)es Us)(g the %,-,lat)4e &ableQ shows how th)s
probab)l)ty )s rea* *)rectly fro- the table w)tho,t a(y co-p,tat)o( re<,)re*. &he *)g)ts )(
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 245/723
the o(es a(* te(ths places of 1.6 (a-ely 1. are ,se* to select the appropr)ate row of
the tableD the h,(*re*ths part of 1.6 (a-ely ;.;6 )s ,se* to select the appropr)ate
col,-( of the table. &he fo,r *ec)-al place (,-ber )( the )(ter)or of the table that l)es )(
the )(tersect)o( of the row a(* col,-( selecte* ;.93; )s the probab)l)ty
so,ght: P(Z<1.48)=0.9306.
Figure ;.1A,omputing 9robabilities Using t!e ,umulative )able
b. &he -)(,s s)g( )( X;.20 -akes (o *)8ere(ce )( the proce*,reD the table )s ,se*
)( e7actly the sa-e way as )( part a5: the probab)l)ty so,ght )s the (,-ber that )s )( the
)(tersect)o( of the row w)th hea*)(g X;.2 a(* the col,-( w)th hea*)(g ;.;0 the (,-ber
;.;13. &h,s 9 C _ X;.205 ;.;13.
EKAPLE 0
)(* the probab)l)t)es )(*)cate*.
a. 9 C 1.;5.
b. 9 C X1.;25.
Sol,t)o(:
a. #eca,se the e4e(ts C 1.; a(* C ^ 1.; are co-ple-e(ts the Probab)l)ty
R,le for %o-ple-e(ts )-pl)es that
P(Z>1.60)=1−P(Z≤1.60)
S)(ce )(cl,s)o( of the e(*po)(t -akes (o *)8ere(ce for the co(t)(,o,s ra(*o-
4ar)able C P(Z≤1.60)=P(Z<1.60)@ wh)ch we k(ow how to +(* fro- the table. &he
Saylor URL: http://www.saylor.org/books Saylor.org20
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 246/723
(,-ber )( the row w)th hea*)(g 1. a(* )( the col,-( w)th hea*)(g ;.;; )s
;.902. &h,s P(Z<1.60)=0.9452so
P(Z>1.60)=1−P(Z≤1.60)=1−0.9452=0.0548
)g,re 0.11 Q%o-p,t)(g a Probab)l)ty for a R)ght @alf=L)(eQ)ll,strates the )*eas
geo-etr)cally. S)(ce the total area ,(*er the c,r4e )s 1 a(* the area of the
reg)o( to the left of 1.; )s fro- the table5 ;.902 the area of the reg)o( to
the r)ght of 1.; -,st be 1−0.9452=0.0548.
Figure ;.11,omputing a 9robabilit/ for a 6ig!t alf+Line
b &he -)(,s s)g( )( X1.;2 -akes (o *)8ere(ce )( the proce*,reD the table )s
,se* )( e7actly the sa-e way as )( part a5. &he (,-ber )( the )(tersect)o( of
the row w)th hea*)(g X1.; a(* the col,-( w)th hea*)(g ;.;2 )s ;.1039. &h)s
-ea(s that P(Z<−1.02)=P(Z≤−1.02)=0.1539 he(ce
P(Z>−1.02)=1−P(Z≤−1.02)=1−0.1539=0.8461
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 247/723
Figure ;.12 ,omputing a 9robabilit/ for an #nterval of Finite Lengt!
b &he proce*,re for +(*)(g the probab)l)ty that C takes a 4al,e )( a +()te
)(ter4al whose e(*po)(ts ha4e oppos)te s)g(s )s e7actly the sa-e proce*,re
,se* )( part a5 a(* )s )ll,strate* )( )g,re 0.13 Q%o-p,t)(g a Probab)l)ty for
a( (ter4al of )()te Le(gthQ. ( sy-bols the co-p,tat)o( )s
P(−2.55<Z<0.09)==P(Z<0.09)−P(Z<−2.55)
=0.5359−0.0054=0.5305
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 248/723
Figure ;.1* ,omputing a 9robabilit/ for an #nterval of Finite Lengt!
The next example shows what to do if the value of C that we want to look up in the table is not
present there.
EKAPLE
)(* the probab)l)t)es )(*)cate*.
a. P(1.13<Z<4.16).
b. P(−5.22<Z<2.15).
Sol,t)o(:
a. Be atte-pt to co-p,te the probab)l)ty e7actly as )( Note 0.2; QE7a-ple
Q by look)(g ,p the (,-bers 1.13 a(* .1 )( the table. Be obta)( the 4al,e ;.6;6
for the area of the reg)o( ,(*er the *e(s)ty c,r4e to left of 1.13 w)tho,t a(y proble-
b,t whe( we go to look ,p the (,-ber .1 )( the table )t )s (ot there. Be ca( see
fro- the last row of (,-bers )( the table that the area to the left of .1 -,st be so
close to 1 that to fo,r *ec)-al places )t ro,(*s to 1.;;;;. &herefore
P(1.13<Z<4.16)=1.0000−0.8708=0.1292
b. S)-)larly here we ca( rea* *)rectly fro- the table that the area ,(*er the
*e(s)ty c,r4e a(* to the left of 2.10 )s ;.962 b,t X0.22 )s too far to the left o(
the (,-ber l)(e to be )( the table. Be ca( see fro- the +rst l)(e of the table that
Saylor URL: http://www.saylor.org/books Saylor.org26
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 249/723
the area to the left of X0.22 -,st be so close to ; that to fo,r *ec)-al places )t
ro,(*s to ;.;;;;. &herefore
P(−5.22<Z<2.15)=0.9842−0.0000=0.9842
The final example of this section explains the origin of the proportions given in the mpirical Cule.
+A>2!+ 5
)(* the probab)l)t)es )(*)cate*.
a. P(−1<Z<1).
b. P(−2<Z<2).
c. P(−3<Z<3).
Sol,t)o(:
a. Us)(g the table as was *o(e )( Note 0.2; QE7a-ple Qb5 we obta)(
P(−1<Z<1)=0.8413−0.1587=0.6826
S)(ce C has -ea( ; a(* sta(*ar* *e4)at)o( 1 for C to take a 4al,e betwee( X1
a(* 1 -ea(s that C takes a 4al,e that )s w)th)( o(e sta(*ar* *e4)at)o( of the-ea(. ",r co-p,tat)o( shows that the probab)l)ty that th)s happe(s )s abo,t
;.6 the proport)o( g)4e( by the E-p)r)cal R,le for h)stogra-s that are -o,(*
shape* a(* sy--etr)cal l)ke the bell c,r4e.
b. Us)(g the table )( the sa-e way
P(−2<Z<2)=0.9772−0.0228=0.9544
&h)s correspo(*s to the proport)o( ;.90 for *ata w)th)( two sta(*ar* *e4)at)o(s
of the -ea(.
c. S)-)larly
Saylor URL: http://www.saylor.org/books Saylor.org29
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 250/723
P(−3<Z<3)=0.9987−0.0013=0.9974
wh)ch correspo(*s to the proport)o( ;.99 for *ata w)th)( three sta(*ar*
*e4)at)o(s of the -ea(.
IEJ &AIEABAJS
• A sta(*ar* (or-al ra(*o- 4ar)able C )s a (or-ally *)str)b,te* ra(*o- 4ar)able
w)th -ea( μ ; a(* sta(*ar* *e4)at)o( σ 1.
• Probab)l)t)es for a sta(*ar* (or-al ra(*o- 4ar)able are co-p,te* ,s)(g)g,re
12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ.
Saylor URL: http://www.saylor.org/books Saylor.org20;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 251/723
Saylor URL: http://www.saylor.org/books Saylor.org201
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 252/723
Saylor URL: http://www.saylor.org/books Saylor.org202
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 253/723
Saylor URL: http://www.saylor.org/books Saylor.org203
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 254/723
Saylor URL: http://www.saylor.org/books Saylor.org20
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 255/723
8.3 2robability Computations or :eneral Normal /andom
<ariables
LEARNN! "#$E%&'E
1 &o lear( how to co-p,te probab)l)t)es relate* to a(y (or-al ra(*o- 4ar)able.
,f B is any normally distributed normal random variable then /igure 1!.! 07umulative 9ormal
robability0 can also be used to compute a probability of the form P(a<X<b) by means of the
following e"uality.
Saylor URL: http://www.saylor.org/books Saylor.org200
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 256/723
The new endpoints(a−µ)/σand(b−µ)/σare the z -scores of a and b as defined in +ection
!.6.! in 7hapter ! 02escriptive +tatistics0.
/igure 8.16 0robability for an ,nterval of /inite ength0 illustrates the meaning of the e"uality
geometrically: the two shaded regions$ one under the density curve for B and the other under the
density curve for C $ have the same area. ,nstead of drawing both bell curves$ though$ we will always
draw a single generic bell-shaped curve with both an x -axis and a z -axis below it.
!igure ." $robability for an >nterval of !inite 'ength
Saylor URL: http://www.saylor.org/books Saylor.org20
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 257/723
+A>2!+ 6
Let : be a (or-al ra(*o- 4ar)able w)th -ea( μ 1; a(* sta(*ar* *e4)at)o( σ
2.0. %o-p,te the follow)(g probab)l)t)es.
a. 9 : _ 15.
b. P(8<X<14).
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org20
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 258/723
Saylor URL: http://www.saylor.org/books Saylor.org206
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 259/723
EKAPLE 1;
&he l)fet)-es of the trea* of a certa)( a,to-ob)le t)re are (or-ally *)str)b,te*
w)th -ea( 30;; -)les a(* sta(*ar* *e4)at)o( 0;; -)les. )(* the probab)l)ty
that the trea* l)fe of a ra(*o-ly selecte* t)re w)ll be betwee( 3;;;; a(* ;;;;
-)les.
Saylor URL: http://www.saylor.org/books Saylor.org209
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 260/723
Sol,t)o(:
Let : *e(ote the trea* l)fe of a ra(*o-ly selecte* t)re. &o -ake the (,-bers
eas)er to work w)th we w)ll choose tho,sa(*s of -)les as the ,()ts. &h,s μ
3.0 σ .0 a(* the proble- )s to co-p,te P(30<X<40).)g,re 0.1 QProbab)l)ty%o-p,tat)o( for &)re &rea* BearQ )ll,strates the follow)(g co-p,tat)o(:
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 261/723
EKAPLE 11
Scores o( a sta(*ar*)Ve* college e(tra(ce e7a-)(at)o( ,((5 are (or-ally
*)str)b,te* w)th -ea( 01; a(* sta(*ar* *e4)at)o( ;. A select)4e ,()4ers)ty
co(s)*ers for a*-)ss)o( o(ly appl)ca(ts w)th ,(( scores o4er 0;. )(*
perce(tage of all )(*)4)*,als who took the ,(( who -eet the ,()4ers)tys ,((
re<,)re-e(t for co(s)*erat)o( for a*-)ss)o(.
Sol,t)o(:
Let : *e(ote the score -a*e o( the ,(( by a ra(*o-ly selecte* )(*)4)*,al.
&he( : )s (or-ally *)str)b,te* w)th -ea( 01; a(* sta(*ar* *e4)at)o( ;. &he
probab)l)ty that : l)e )( a part)c,lar )(ter4al )s the sa-e as the proport)o( of all
e7a- scores that l)e )( that )(ter4al. &h,s the sol,t)o( to the proble- )s 9 :
0;5 e7presse* as a perce(tage. )g,re 0.16 QProbab)l)ty %o-p,tat)o( for E7a-
ScoresQ )ll,strates the follow)(g co-p,tat)o(:
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 262/723
*+, TA*+AA, • Probab)l)t)es for a ge(eral (or-al ra(*o- 4ar)able are co-p,te* ,s)(g)g,re 12.2
Q%,-,lat)4e Nor-al Probab)l)tyQ after co(4ert)(g =4al,es to z =scores.
++/C&S+S
'AS&C
1 : )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( 0 a(* sta(*ar* *e4)at)o( . )(*
the probab)l)ty )(*)cate*.
a 9 : _ 09.05
b 9 : _ .25
c 9 : 02.25
* 9 : ;5
2 : )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( X20 a(* sta(*ar* *e4)at)o( .
)(* the probab)l)ty )(*)cate*.
a 9 : _ X2.25
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 263/723
b 9 : _ X1.65
c 9 : X33.15
* 9 : X1.05
3 : )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( 112 a(* sta(*ar* *e4)at)o(
10. )(* the probab)l)ty )(*)cate*.
a P(100<X<125)
b P(91<X<107)
c P(118<X<160)
: )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( 2 a(* sta(*ar* *e4)at)o( 22.
)(* the probab)l)ty )(*)cate*.
a P(78<X<127)
b P(60<X<90)
c P(49<X<71)
0 : )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( 0;; a(* sta(*ar* *e4)at)o(
20. )(* the probab)l)ty )(*)cate*.
a 9 : _ ;;5
b P(466<X<625)
: )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( ; a(* sta(*ar* *e4)at)o( ;.0.
)(* the probab)l)ty )(*)cate*.
a 9X.;2 _ : _ 3.625
b 9 : .115
: )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( 10 a(* sta(*ar* *e4)at)o( 1.
Use )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ to +(* the +rst probab)l)ty l)ste*.
)(* the seco(* probab)l)ty ,s)(g the sy--etry of the *e(s)ty c,r4e. Sketch the
*e(s)ty c,r4e w)th rele4a(t reg)o(s sha*e* to )ll,strate the co-p,tat)o(.
a 9 : _ 125 9 : 165
b 9 : _ 15 9 : 15
c 9 : _ 11.205 9 : 16.05
* 9 : _ 12.5 9 : 1.335
6 : )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( 1;; a(* sta(*ar* *e4)at)o(
1;. Use )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ to +(* the +rst probab)l)ty l)ste*.
Saylor URL: http://www.saylor.org/books Saylor.org23
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 264/723
)(* the seco(* probab)l)ty ,s)(g the sy--etry of the *e(s)ty c,r4e. Sketch the
*e(s)ty c,r4e w)th rele4a(t reg)o(s sha*e* to )ll,strate the co-p,tat)o(.
a 9 : _ 6;5 9 : 12;5
b 9 : _ 05 9 : 1205
c 9 : _ 6.005 9 : 110.05
* 9 : _ .25 9 : 122.065
9 : )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( a(* sta(*ar* *e4)at)o( 13.
&he probab)l)ty that : takes a 4al,e )( the ,()o( of )(ter4als (−∞,67−a] [67+a,∞)w)ll be
*e(ote* P(X≤67−a orX≥67+a).Use )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ to +(*
the follow)(g probab)l)t)es of th)s type. Sketch the *e(s)ty c,r4e w)th rele4a(t reg)o(s
sha*e* to )ll,strate the co-p,tat)o(. #eca,se of the sy--etry of the *e(s)ty c,r4e
yo, (ee* to ,se )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ o(ly o(e t)-e for eachpart.
a P(X<57 orX>77)
b P(X<47 orX>87)
c P(X<49 orX>85)
d P(X<37 orX>97)
1; : )s a (or-ally *)str)b,te* ra(*o- 4ar)able w)th -ea( 266 a(* sta(*ar* *e4)at)o( .
&he probab)l)ty that : takes a 4al,e )( the ,()o( of )(ter4als (−∞,288−a] [288+a,∞)w)ll
be *e(ote* P(X≤288−a orX≥288+a).Use )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ to
+(* the follow)(g probab)l)t)es of th)s type. Sketch the *e(s)ty c,r4e w)th rele4a(t
reg)o(s sha*e* to )ll,strate the co-p,tat)o(. #eca,se of the sy--etry of the *e(s)ty
c,r4e yo, (ee* to ,se )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ o(ly o(e t)-e for
each part.
a P(X<278 orX>298)
b P(X<268 orX>308)
c P(X<273 orX>303)
d P(X<280 orX>296)
A22!&CAT&1NS
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 265/723
11 &he a-o,(t : of be4erage )( a ca( labele* 12 o,(ces )s (or-ally *)str)b,te* w)th -ea(
12.1 o,(ces a(* sta(*ar* *e4)at)o( ;.;0 o,(ce. A ca( )s selecte* at ra(*o-.
a )(* the probab)l)ty that the ca( co(ta)(s at least 12 o,(ces.
b )(* the probab)l)ty that the ca( co(ta)(s betwee( 11.9 a(* 12.1 o,(ces.
12 &he le(gth of gestat)o( for sw)(e )s (or-ally *)str)b,te* w)th -ea( 11 *ays a(*sta(*ar* *e4)at)o( ;.0 *ay. )(* the probab)l)ty that a l)tter w)ll be bor( w)th)( o(e *ay
of the -ea( of 11.
13 &he systol)c bloo* press,re : of a*,lts )( a reg)o( )s (or-ally *)str)b,te* w)th -ea( 112
-- @g a(* sta(*ar* *e4)at)o( 10 -- @g. A perso( )s co(s)*ere* prehyperte(s)4e )f
h)s systol)c bloo* press,re )s betwee( 12; a(* 13; -- @g. )(* the probab)l)ty that the
bloo* press,re of a ra(*o-ly selecte* perso( )s prehyperte(s)4e.
1 @e)ghts : of a*,lt wo-e( are (or-ally *)str)b,te* w)th -ea( 3. )(ches a(* sta(*ar*
*e4)at)o( 2.1 )(ches. Ro-eo who )s 9.20 )(ches tall w)shes to *ate o(ly wo-e( who
are shorter tha( he b,t w)th)( )(ches of h)s he)ght. )(* the probab)l)ty that the (e7t
wo-a( he -eets w)ll ha4e s,ch a he)ght.
10 @e)ghts : of a*,lt -e( are (or-ally *)str)b,te* w)th -ea( 9.1 )(ches a(* sta(*ar*
*e4)at)o( 2.92 )(ches. $,l)et who )s 3.20 )(ches tall w)shes to *ate o(ly -e( who are
taller tha( she b,t w)th)( )(ches of her he)ght. )(* the probab)l)ty that the (e7t -a(
she -eets w)ll ha4e s,ch a he)ght.
1 A reg,lat)o( hockey p,ck -,st we)gh betwee( 0.0 a(* o,(ces. &he we)ghts : of
p,cks -a*e by a part)c,lar process are (or-ally *)str)b,te* w)th -ea( 0.0 o,(ces
a(* sta(*ar* *e4)at)o( ;.11 o,(ce. )(* the probab)l)ty that a p,ck -a*e by th)s
process w)ll -eet the we)ght sta(*ar*.
1 A reg,lat)o( golf ball -ay (ot we)gh -ore tha( 1.2; o,(ces. &he we)ghts : of golf
balls -a*e by a part)c,lar process are (or-ally *)str)b,te* w)th -ea( 1.31 o,(ces
a(* sta(*ar* *e4)at)o( ;.;9 o,(ce. )(* the probab)l)ty that a golf ball -a*e by th)s
process w)ll -eet the we)ght sta(*ar*.
16 &he le(gth of t)-e that the battery )( @)ppolytas cell pho(e w)ll hol* e(o,gh charge
to operate acceptably )s (or-ally *)str)b,te* w)th -ea( 20. ho,rs a(* sta(*ar*
*e4)at)o( ;.32 ho,r. @)ppolyta forgot to charge her pho(e yester*ay so that at the
-o-e(t she +rst w)shes to ,se )t to*ay )t has bee( 2 ho,rs 16 -)(,tes s)(ce the
Saylor URL: http://www.saylor.org/books Saylor.org20
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 266/723
pho(e was last f,lly charge*. )(* the probab)l)ty that the pho(e w)ll operate
properly.
19 &he a-o,(t of (o(=-ortgage *ebt per ho,sehol* for ho,sehol*s )( a part)c,lar
)(co-e bracket )( o(e part of the co,(try )s (or-ally *)str)b,te* w)th -ea( 2630;
a(* sta(*ar* *e4)at)o( 320. )(* the probab)l)ty that a ra(*o-ly selecte* s,ch
ho,sehol* has betwee( 2;;;; a(* 3;;;; )( (o(=-ortgage *ebt.
2; #)rth we)ghts of f,ll=ter- bab)es )( a certa)( reg)o( are (or-ally *)str)b,te* w)th
-ea( .120 lb a(* sta(*ar* *e4)at)o( 1.29; lb. )(* the probab)l)ty that a ra(*o-ly
selecte* (ewbor( w)ll we)gh less tha( 0.0 lb the h)stor)c *e+()t)o( of pre-at,r)ty.
21 &he *)sta(ce fro- the seat back to the fro(t of the k(ees of seate* a*,lt -ales )s
(or-ally *)str)b,te* w)th -ea( 23.6 )(ches a(* sta(*ar* *e4)at)o( 1.22 )(ches. &he
*)sta(ce fro- the seat back to the back of the (e7t seat forwar* )( all seats o(
a)rcraft [ow( by a b,*get a)rl)(e )s 2 )(ches. )(* the proport)o( of a*,lt -e( [y)(g
w)th th)s a)rl)(e whose k(ees w)ll to,ch the back of the seat )( fro(t of the-.
22 &he *)sta(ce fro- the seat to the top of the hea* of seate* a*,lt -ales )s (or-ally
*)str)b,te* w)th -ea( 3.0 )(ches a(* sta(*ar* *e4)at)o( 1.39 )(ches. &he *)sta(ce
fro- the seat to the roof of a part)c,lar -ake a(* -o*el car )s ;.0 )(ches. )(* the
proport)o( of a*,lt -e( who whe( s)tt)(g )( th)s car w)ll ha4e at least o(e )(ch of
hea*roo- *)sta(ce fro- the top of the hea* to the roof5.
A&"NAL EKER%SES
23 &he ,sef,l l)fe of a part)c,lar -ake a(* type of a,to-ot)4e t)re )s (or-ally *)str)b,te*
w)th -ea( 00;; -)les a(* sta(*ar* *e4)at)o( 90; -)les.
a )(* the probab)l)ty that s,ch a t)re w)ll ha4e a ,sef,l l)fe of betwee( 0;;;
a(* 06;;; -)les.
b @a-let b,ys fo,r s,ch t)res. Ass,-)(g that the)r l)fet)-es are )(*epe(*e(t
+(* the probab)l)ty that all fo,r w)ll last betwee( 0;;; a(* 06;;; -)les. f
so the best t)re w)ll ha4e (o -ore tha( 1;;; -)les left o( )t whe( the +rst t)re
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 267/723
fa)ls.5 @)(t: &here )s a b)(o-)al ra(*o- 4ar)able here whose 4al,e of p co-es
fro- part a5.
2 A -ach)(e pro*,ces large faste(ers whose le(gth -,st be w)th)( ;.0 )(ch of 22 )(ches.
&he le(gths are (or-ally *)str)b,te* w)th -ea( 22.; )(ches a(* sta(*ar* *e4)at)o( ;.1
)(ch.
a )(* the probab)l)ty that a ra(*o-ly selecte* faste(er pro*,ce* by the
-ach)(e w)ll ha4e a( acceptable le(gth.
b &he -ach)(e pro*,ces 2; faste(ers per ho,r. &he le(gth of each o(e )s
)(specte*. Ass,-)(g le(gths of faste(ers are )(*epe(*e(t +(* the probab)l)ty
that all 2; w)ll ha4e acceptable le(gth. @)(t: &here )s a b)(o-)al ra(*o-
4ar)able here whose 4al,e of p co-es fro- part a5.
20 &he le(gths of t)-e take( by st,*e(ts o( a( algebra pro+c)e(cy e7a- )f (ot force* to
stop before co-plet)(g )t5 are (or-ally *)str)b,te* w)th -ea( 26 -)(,tes a(* sta(*ar*
*e4)at)o( 1.0 -)(,tes.
a )(* the proport)o( of st,*e(ts who w)ll +()sh the e7a- )f a 3;=-)(,te t)-e
l)-)t )s set.
b S)7 st,*e(ts are tak)(g the e7a- to*ay. )(* the probab)l)ty that all s)7 w)ll
+()sh the e7a- w)th)( the 3;=-)(,te l)-)t ass,-)(g that t)-es take( by
st,*e(ts are )(*epe(*e(t. @)(t: &here )s a b)(o-)al ra(*o- 4ar)able here
whose 4al,e of p co-es fro- part a5.
2 @e)ghts of a*,lt -e( betwee( 16 a(* 3 years of age are (or-ally *)str)b,te* w)th -ea(
9.1 )(ches a(* sta(*ar* *e4)at)o( 2.92 )(ches. "(e re<,)re-e(t for e(l)st-e(t )( the
-)l)tary )s that -e( -,st sta(* betwee( ; a(* 6; )(ches tall.
a )(* the probab)l)ty that a ra(*o-ly electe* -a( -eets the he)ght
re<,)re-e(t for -)l)tary ser4)ce.
b &we(ty=three -e( )(*epe(*e(tly co(tact a recr,)ter th)s week. )(* the
probab)l)ty that all of the- -eet the he)ght re<,)re-e(t. @)(t: &here )s a
b)(o-)al ra(*o- 4ar)able here whose 4al,e of p co-es fro- part a5.
2 A reg,lat)o( hockey p,ck -,st we)gh betwee( 0.0 a(* o,(ces. ( a( alter(at)4e
-a(,fact,r)(g process the -ea( we)ght of p,cks pro*,ce* )s 0.0 o,(ce. &he we)ghts of
p,cks ha4e a (or-al *)str)b,t)o( whose sta(*ar* *e4)at)o( ca( be *ecrease* by
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 268/723
)(creas)(gly str)(ge(t a(* e7pe(s)4e5 co(trols o( the -a(,fact,r)(g process. )(* the
-a7)-,- allowable sta(*ar* *e4)at)o( so that at -ost ;.;;0 of all p,cks w)ll fa)l to -eet
the we)ght sta(*ar*. @)(t: &he *)str)b,t)o( )s sy--etr)c a(* )s ce(tere* at the -)**le of
the )(ter4al of acceptable we)ghts.5
26 &he a-o,(t of gasol)(e : *el)4ere* by a -etere* p,-p whe( )t reg)sters 0 gallo(s )s a
(or-ally *)str)b,te* ra(*o- 4ar)able. &he sta(*ar* *e4)at)o( σ of : -eas,res the
prec)s)o( of the p,-pD the s-aller σ )s the s-aller the 4ar)at)o( fro- *el)4ery to *el)4ery.
A typ)cal sta(*ar* for p,-ps )s that whe( they show that 0 gallo(s of f,el has bee(
*el)4ere* the act,al a-o,(t -,st be betwee( .9 a(* 0.;3 gallo(s wh)ch correspo(*s
to be)(g o8 by at -ost abo,t half a c,p5. S,ppos)(g that the -ea( of : )s 0 +(* the
largest that σ ca( be so that 9.9 _ : _ 0.;35 )s 1.;;;; to fo,r *ec)-al places whe(
co-p,te* ,s)(g )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ wh)ch -ea(s that the
p,-p )s s,?c)e(tly acc,rate. @)(t: &he z =score of 0.;3 w)ll be the s-allest 4al,e of C sothat )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ g)4es P(Z<z)=1.0000.E
Saylor URL: http://www.saylor.org/books Saylor.org26
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 269/723
Saylor URL: http://www.saylor.org/books Saylor.org29
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 270/723
8.7 Areas o Tails o (istributions
!+A/N&N: 1';+CT&<+
1 &o lear( how to +(* for a (or-al ra(*o- 4ar)able : a(* a( area a the
4al,e x*of : so that P(X<x*)=aor that P(X>x*)=a wh)che4er )s re<,)re*.
e+()t)o(
The left tail of a density curve y=f(x)of a continuous random variable Bcut off by a
value x*of B is the region under the curve that is to the left of x*$ as shown by the shading
in !igure ."4 G-ight and 'eft Tails of a (istributionG DaE. The right tail cut off by x*is defined
similarly, as indicated by the shading in !igure ."4 G-ight and 'eft Tails of a (istributionG DbE.
!igure ."4 -ight and 'eft Tails of a (istribution
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 271/723
The probabilities tabulated in /igure 1!.! 07umulative 9ormal robability0 are areas of left tails in
the standard normal distribution.
Tails o the Standard Normal (istribution
%t times it is important to be able to solve the kind of problem illustrated by /igure 8.!5. (e have a
certain specific area in mind$ in this case the area 5.51!8 of the shaded region in the figure$ and we want
to find the valuez*of C that produces it. This is exactly the reverse of the kind of problems encountered so
far. ,nstead of knowing a value z*of C and finding a corresponding area$ we know the area and want to
findz*.,n the case at hand$ in the terminology of the definition #ust above$ we wish to find the
valuez*that cuts off a left tail of area 5.51!8 in the standard normal distribution.
The idea for solving such a problem is fairly simple$ although sometimes its implementation can be a bit
complicated. ,n a nutshell$ one reads the cumulative probability table for C in reverse$ looking up the
relevant area in the interior of the table and reading off the value of C from the margins.
!igure .&5 C 0alue that $roduces a Jnown Area
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 272/723
EKAPLE 12
)(* the 4al,e z*of C as *eter-)(e* by )g,re 0.2;: the 4al,e z*that c,ts o8 a left
ta)l of area ;.;120 )( the sta(*ar* (or-al *)str)b,t)o(. ( sy-bols +(* the
(,-ber z*s,ch that P(Z<z*)=0.0125.
Sol,t)o(:
&he (,-ber that )s k(ow( ;.;120 )s the area of a left ta)l a(* as alrea*y
-e(t)o(e* the probab)l)t)es tab,late* )( )g,re 12.2 Q%,-,lat)4e Nor-al
Probab)l)tyQ are areas of left ta)ls. &h,s to sol4e th)s proble- we (ee* o(ly search
)( the )(ter)or of )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ for the (,-ber
;.;120. t l)es )( the row w)th the hea*)(g X2.2 a(* )( the col,-( w)th the
hea*)(g ;.;. &h)s -ea(s that 9 C _ X2.25 ;.;120 he(ce z*=−2.24.
+A>2!+ %3
)(* the 4al,e z*of C as *eter-)(e* by )g,re 0.21: the 4al,e z*that c,ts o8 a
r)ght ta)l of area ;.;20; )( the sta(*ar* (or-al *)str)b,t)o(. ( sy-bols +(* the
(,-ber z*s,ch that P(Z>z*)=0.0250.
Fiigure ;.21 C <alue t!at 9roduces a @no&n 0rea
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 273/723
Sol,t)o(:
&he )-porta(t *)st)(ct)o( betwee( th)s e7a-ple a(* the pre4)o,s o(e )s that here
)t )s the area of a rig!t ta)l that )s k(ow(. ( or*er to be able to ,se )g,re 12.2
Q%,-,lat)4e Nor-al Probab)l)tyQ we -,st +rst +(* that area of the left ta)l c,t o8
by the ,(k(ow( (,-ber z*.S)(ce the total area ,(*er the *e(s)ty c,r4e )s 1 that
area )s 1−0.0250=0.9750. &h)s )s the (,-ber we look for )( the )(ter)or of )g,re 12.2
Q%,-,lat)4e Nor-al Probab)l)tyQ. t l)es )( the row w)th the hea*)(g 1.9 a(* )( the
col,-( w)th the hea*)(g ;.;. &herefore z*=1.96.
e+()t)o(
Saylor URL: http://www.saylor.org/books Saylor.org23
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 274/723
The value of the standard normal random variable C that cuts off a right tail of area c is denoted z c. 9y
symmetry, value of C that cuts off a left tail of area c is −zc. %ee !igure .&& GThe Iumbers G .
!igure .&&The Iumbers z c and −zc
+A>2!+ %7
)(* z.01a(* −z.01 the 4al,es of C that c,t o8 r)ght a(* left ta)ls of area ;.;1 )( the
sta(*ar* (or-al *)str)b,t)o(.
Sol,t)o(:
S)(ce −z.01c,ts o8 a left ta)l of area ;.;1 a(* )g,re 12.2 Q%,-,lat)4e Nor-al
Probab)l)tyQ )s a table of left ta)ls we look for the (,-ber ;.;1;; )( the )(ter)or of
the table. t )s (ot there b,t falls betwee( the two (,-bers ;.;1;2 a(* ;.;;99 )(
the row w)th hea*)(g X2.3. &he (,-ber ;.;;99 )s closer to ;.;1;; tha( ;.;1;2 )s
so for the h,(*re*ths place )( −z.01we ,se the hea*)(g of the col,-( that co(ta)(s
;.;;99 (a-ely ;.;3 a(* wr)te −z.01≈−2.33.
&he a(swer to the seco(* half of the proble- )s a,to-at)c: s)(ce −z.01=−2.33 we
co(cl,*e )--e*)ately that z.01=2.33.
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 275/723
Be co,l* >,st as well ha4e sol4e* th)s proble- by look)(g for z.01+rst a(* )t )s
)(str,ct)4e to rework the proble- th)s way. &o beg)( w)th we -,st +rst s,btract
;.;1 fro- 1 to +(* the area 1−0.0100=0.9900of the left ta)l c,t o8 by the ,(k(ow(
(,-ber z.01.See )g,re 0.23 Q%o-p,tat)o( of the N,-ber Q. &he( we search for
the area ;.99;; )( )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ. t )s (ot there b,t
falls betwee( the (,-bers ;.9696 a(* ;.99;1 )( the row w)th hea*)(g 2.3. S)(ce
;.99;1 )s closer to ;.99;; tha( ;.9696 )s we ,se the col,-( hea*)(g abo4e )t
;.;3 to obta)( the appro7)-at)o( z.01≈2.33. &he( +(ally −z.01≈−2.33.
Figure ;.2*,omputation of t!e 'umber z.01
Tails o :eneral Normal (istributions
The problem of finding the valuex*of a general normally distributed random variable B that cuts off
a tail of a specified area also arises. This problem may be solved in two steps.
+uppose B is a normally distributed random variable with mean and standard deviation 6 . To find the
valuex*of B that cuts off a left or right tail of area c in the distribution of B :
Saylor URL: http://www.saylor.org/books Saylor.org20
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 276/723
1 find the valuez*of C that cuts off a left or right tail of area c in the standard normal distribution
! z*is the z -score ofx*0 computex*using the destandardi'ation formula
x*=µ+z*σ
+A>2!+ %8
)(* x*s,ch that P(X<x*)=0.9332@ where : )s a (or-al ra(*o- 4ar)able w)th
-ea( μ 1; a(* sta(*ar* *e4)at)o( σ 2.0.
Sol,t)o(:
All the )*eas for the sol,t)o( are )ll,strate* )( )g,re 0.2 Q&a)l of a Nor-ally
)str)b,te* Ra(*o- 'ar)ableQ. S)(ce ;.9332 )s the area of a left ta)l we ca(
+(* z*s)-ply by look)(g for ;.9332 )( the )(ter)or of )g,re 12.2 Q%,-,lat)4e
Nor-al Probab)l)tyQ. t )s )( the row a(* col,-( w)th hea*)(gs 1.0 a(* ;.;;
he(ce z*=1.50. &h,s x*)s 1.0; sta(*ar* *e4)at)o(s abo4e the -ea( so
x*=µ+z*σ=10+1.50D2.5=13.75.
Figure ;.28)ail of a 'ormall/ Distributed 6andom <ariable
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 277/723
+A>2!+ %9
)(* x*s,ch that P(X>x*)=0.65@ where : )s a (or-al ra(*o- 4ar)able w)th -ea( μ
10 a(* sta(*ar* *e4)at)o( σ 12.
Sol,t)o(:
&he s)t,at)o( )s )ll,strate* )( )g,re 0.20 Q&a)l of a Nor-ally )str)b,te* Ra(*o-
'ar)ableQ. S)(ce ;.0 )s the area of a r)ght ta)l we +rst s,btract )t fro- 1 to
obta)( 1−0.65=0.35 the area of the co-ple-e(tary left ta)l. Be +(* z*by look)(g for
;.30;; )( the )(ter)or of )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ. t )s (ot
prese(t b,t l)es betwee( table e(tr)es ;.302; a(* ;.363. &he e(try ;.363 w)th
row a(* col,-( hea*)(gs X;.3 a(* ;.;9 )s closer to ;.30;; tha( the other e(try
)s so z*≈−0.39. &h,s x*)s ;.39 sta(*ar* *e4)at)o(s below the -ea( so
x*=µ+z*σ=175+(−0.39)D12=170.32
Figure ;.2;)ail of a 'ormall/ Distributed 6andom <ariable
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 278/723
+A>2!+ %4
Scores o( a sta(*ar*)Ve* college e(tra(ce e7a-)(at)o( ,((5 are (or-ally
*)str)b,te* w)th -ea( 01; a(* sta(*ar* *e4)at)o( ;. A select)4e ,()4ers)ty
*ec)*es to g)4e ser)o,s co(s)*erat)o( for a*-)ss)o( to appl)ca(ts whose ,((
scores are )( the top 0G of all ,(( scores. )(* the -)()-,- score that -eets th)s
cr)ter)o( for ser)o,s co(s)*erat)o( for a*-)ss)o(.
Sol,t)o(:
Let : *e(ote the score -a*e o( the ,(( by a ra(*o-ly selecte* )(*)4)*,al.
&he( : )s (or-ally *)str)b,te* w)th -ea( 01; a(* sta(*ar* *e4)at)o( ;. &he
probab)l)ty that : l)e )( a part)c,lar )(ter4al )s the sa-e as the proport)o( of all
e7a- scores that l)e )( that )(ter4al. &h,s the -)()-,- score that )s )( the top
0G of all ,(( )s the score x*that c,ts o8 a r)ght ta)l )( the *)str)b,t)o( of : of area
;.;0 0G e7presse* as a proport)o(5. See )g,re 0.2 Q&a)l of a Nor-ally
)str)b,te* Ra(*o- 'ar)ableQ.
Figure ;.25)ail of a 'ormall/ Distributed 6andom <ariable
S)(ce ;.;0;; )s the area of a r)ght ta)l we +rst s,btract )t fro- 1 to
obta)( 1−0.0500=0.9500 the area of the co-ple-e(tary left ta)l. Be +(* z*=z.05by
look)(g for ;.90;; )( the )(ter)or of )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ. t
Saylor URL: http://www.saylor.org/books Saylor.org26
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 279/723
)s (ot prese(t a(* l)es e7actly half=way betwee( the two (earest e(tr)es that are
;.990 a(* ;.90;0. ( the case of a t)e l)ke th)s we w)ll always a4erage the
4al,es of C correspo(*)(g to the two table e(tr)es obta)()(g here the
4al,e z*=1.645.Us)(g th)s 4al,e we co(cl,*e that x*)s 1.0 sta(*ar* *e4)at)o(s
abo4e the -ea( so
x*=µ+z*σ=510+1.645D60=608.7
+A>2!+ %5
All boys at a -)l)tary school -,st r,( a +7e* co,rse as fast as they ca( as part of
a phys)cal e7a-)(at)o(. )()sh)(g t)-es are (or-ally *)str)b,te* w)th -ea( 29
-)(,tes a(* sta(*ar* *e4)at)o( 2 -)(,tes. &he -)**le 0G of all +()sh)(g t)-es
are class)+e* as a4erage. )(* the ra(ge of t)-es that are a4erage +()sh)(gt)-es by th)s *e+()t)o(.
Sol,t)o(:
Let : *e(ote the +()sh t)-e of a ra(*o-ly selecte* boy. &he( : )s (or-ally
*)str)b,te* w)th -ea( 29 a(* sta(*ar* *e4)at)o( 2. &he probab)l)ty that : l)e )( a
part)c,lar )(ter4al )s the sa-e as the proport)o( of all +()sh t)-es that l)e )( that
)(ter4al. &h,s the s)t,at)o( )s as show( )( )g,re 0.2 Q)str)b,t)o( of &)-es to
R,( a %o,rseQ. #eca,se the area )( the -)**le correspo(*)(g to a4erage t)-es
)s ;.0 the areas of the two ta)ls a** ,p to 1 X ;.0 ;.20 )( all. #y the
sy--etry of the *e(s)ty c,r4e each ta)l -,st ha4e half of th)s total or area ;.120
each. &h,s the fastest t)-e that )s a4erage has z =score −z.125 wh)ch by )g,re
12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ )s X1.10 a(* the slowest t)-e that )s
a4erage has z =score z.125=1.15. &he fastest a(* slowest t)-es that are st)ll
co(s)*ere* a4erage are
x fast=µ+(−z.125)σ=29+(−1.15)D2=26.7
a(*
x slow=µ+z.125σ=29+(1.15)D2=31.3
Figure ;.27Distribution of )imes to 6un a ,ourse
Saylor URL: http://www.saylor.org/books Saylor.org29
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 280/723
A boy has a( a4erage +()sh)(g t)-e )f he r,(s the co,rse w)th a t)-e betwee(
2. a(* 31.3 -)(,tes or e<,)4ale(tly betwee( 2 -)(,tes 2 seco(*s a(* 31
-)(,tes 16 seco(*s.
*+, TA*+AA,S
• &he proble- of +(*)(g the (,-ber z*so that the probab)l)ty P(Z<z*))s a spec)+e*
4al,e c )s sol4e* by look)(g for the (,-ber c )( the )(ter)or of )g,re 12.2
Q%,-,lat)4e Nor-al Probab)l)tyQ a(* rea*)(g z*fro- the -arg)(s.
• &he proble- of +(*)(g the (,-ber z*so that the probab)l)ty P(Z>z*))s a spec)+e*
4al,e c )s sol4e* by look)(g for the co-ple-e(tary probab)l)ty 1−c)( the )(ter)or
of )g,re 12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ a(* rea*)(g z*fro- the -arg)(s.
• or a (or-al ra(*o- 4ar)able : w)th -ea( μ a(* sta(*ar* *e4)at)o( σ the
proble- of +(*)(g the (,-ber x*so that P(X<x*))s a spec)+e* 4al,e c or so
that P(X>x*))s a spec)+e* 4al,e c5 )s sol4e* )( two steps: 15 sol4e the
correspo(*)(g proble- for C w)th the sa-e 4al,e of c thereby obta)()(g the z =
score z* of x*D 25 +(* x*,s)(g x*=µ+z*Yσ.
• &he 4al,e of C that c,ts o8 a r)ght ta)l of area c )( the sta(*ar* (or-al
*)str)b,t)o( )s *e(ote* z c.
Saylor URL: http://www.saylor.org/books Saylor.org26;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 281/723
Saylor URL: http://www.saylor.org/books Saylor.org261
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 282/723
9 : )s a (or-ally *)str)b,te* ra(*o- 4ar)able : w)th -ea( 10 a(* sta(*ar* *e4)at)o(
;.20. )(* the 4al,es L a(* 6 of : that are sy--etr)cally locate* w)th respect to the
-ea( of : a(* sat)sfy 9 L _ : _ 65 ;.6;. @)(t. )rst sol4e the correspo(*)(g
proble- for C .5
1; : )s a (or-ally *)str)b,te* ra(*o- 4ar)able : w)th -ea( 26 a(* sta(*ar* *e4)at)o( 3..
)(* the 4al,es L a(* 6 of : that are sy--etr)cally locate* w)th respect to the -ea(
of : a(* sat)sfy 9 L _ : _ 65 ;.0. @)(t. )rst sol4e the correspo(*)(g proble-
for C .5
A22!&CAT&1NS
11 Scores o( a (at)o(al e7a- are (or-ally *)str)b,te* w)th -ea( 362 a(* sta(*ar*
*e4)at)o( 2.
a )(* the score that )s the 0;th perce(t)le.
b )(* the score that )s the 9;th perce(t)le.
12 @e)ghts of wo-e( are (or-ally *)str)b,te* w)th -ea( 3. )(ches a(* sta(*ar*
*e4)at)o( 2. )(ches.
Saylor URL: http://www.saylor.org/books Saylor.org262
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 283/723
a )(* the he)ght that )s the 1;th perce(t)le.
b )(* the he)ght that )s the 6;th perce(t)le.
13 &he -o(thly a-o,(t of water ,se* per ho,sehol* )( a s-all co--,()ty )s (or-ally
*)str)b,te* w)th -ea( ;9 gallo(s a(* sta(*ar* *e4)at)o( 06 gallo(s. )(* the three
<,art)les for the a-o,(t of water ,se*.1 &he <,a(t)ty of gasol)(e p,rchase* )( a s)(gle sale at a cha)( of +ll)(g stat)o(s )( a
certa)( reg)o( )s (or-ally *)str)b,te* w)th -ea( 11. gallo(s a(* sta(*ar* *e4)at)o( 2.6
gallo(s. )(* the three <,art)les for the <,a(t)ty of gasol)(e p,rchase* )( a s)(gle sale.
10 Scores o( the co--o( +(al e7a- g)4e( )( a large e(roll-e(t -,lt)ple sect)o( co,rse
were (or-ally *)str)b,te* w)th -ea( 9.30 a(* sta(*ar* *e4)at)o( 12.93. &he
*epart-e(t has the r,le that )( or*er to rece)4e a( A )( the co,rse h)s score -,st be )(
the top 1;G of all e7a- scores. )(* the -)()-,- e7a- score that -eets th)s
re<,)re-e(t.
1 &he a4erage +()sh)(g t)-e a-o(g all h)gh school boys )( a part)c,lar track e4e(t )( a
certa)( state )s 0 -)(,tes 1 seco(*s. &)-es are (or-ally *)str)b,te* w)th sta(*ar*
*e4)at)o( 12 seco(*s.
a &he <,al)fy)(g t)-e )( th)s e4e(t for part)c)pat)o( )( the state -eet )s to be set
so that o(ly the fastest 0G of all r,((ers <,al)fy. )(* the <,al)fy)(g t)-e.
@)(t: %o(4ert seco(*s to -)(,tes.5
b ( the wester( reg)o( of the state the t)-es of all boys r,(()(g )( th)s e4e(t
are (or-ally *)str)b,te* w)th sta(*ar* *e4)at)o( 12 seco(*s b,t w)th -ea( 0
-)(,tes 22 seco(*s. )(* the proport)o( of boys fro- th)s reg)o( who <,al)fy
to r,( )( th)s e4e(t )( the state -eet.
1 &ests of a (ew t)re *e4elope* by a t)re -a(,fact,rer le* to a( est)-ate* -ea( trea*
l)fe of 30; -)les a(* sta(*ar* *e4)at)o( of 112; -)les. &he -a(,fact,rer w)ll
a*4ert)se the l)fet)-e of the t)re for e7a-ple a 0;;;; -)le t)re5 ,s)(g the largest
4al,e for wh)ch )t )s e7pecte* that 96G of the t)res w)ll last at least that lo(g.
Ass,-)(g t)re l)fe )s (or-ally *)str)b,te* +(* that a*4ert)se* 4al,e.
16 &ests of a (ew l)ght le* to a( est)-ate* -ea( l)fe of 1321 ho,rs a(* sta(*ar*
*e4)at)o( of 1; ho,rs. &he -a(,fact,rer w)ll a*4ert)se the l)fet)-e of the b,lb ,s)(g
the largest 4al,e for wh)ch )t )s e7pecte* that 9;G of the b,lbs w)ll last at least that
lo(g. Ass,-)(g b,lb l)fe )s (or-ally *)str)b,te* +(* that a*4ert)se* 4al,e.
19 &he we)ghts : of eggs pro*,ce* at a part)c,lar far- are (or-ally *)str)b,te* w)th
-ea( 1.2 o,(ces a(* sta(*ar* *e4)at)o( ;.12 o,(ce. Eggs whose we)ghts l)e )( the
-)**le 0G of the *)str)b,t)o( of we)ghts of all eggs are class)+e* as -e*),-. )(*
Saylor URL: http://www.saylor.org/books Saylor.org263
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 284/723
the -a7)-,- a(* -)()-,- we)ghts of s,ch eggs. &hese we)ghts are e(*po)(ts of
a( )(ter4al that )s sy--etr)c abo,t the -ea( a(* )( wh)ch the we)ghts of 0G of the
eggs pro*,ce* at th)s far- l)e.5
2; &he le(gths : of har*woo* [oor)(g str)ps are (or-ally *)str)b,te* w)th -ea( 26.9
)(ches a(* sta(*ar* *e4)at)o( .12 )(ches. Str)ps whose le(gths l)e )( the -)**le
6;G of the *)str)b,t)o( of le(gths of all str)ps are class)+e* as a4erage=le(gth
str)ps. )(* the -a7)-,- a(* -)()-,- le(gths of s,ch str)ps. &hese le(gths are
e(*po)(ts of a( )(ter4al that )s sy--etr)c abo,t the -ea( a(* )( wh)ch the le(gths
of 6;G of the har*woo* str)ps l)e.5
21 All st,*e(ts )( a large e(roll-e(t -,lt)ple sect)o( co,rse take co--o( )(=class
e7a-s a(* a co--o( +(al a(* s,b-)t co--o( ho-ework ass)g(-e(ts. %o,rse
gra*es are ass)g(e* base* o( st,*e(ts +(al o4erall scores wh)ch are appro7)-ately
(or-ally *)str)b,te*. &he *epart-e(t ass)g(s a % to st,*e(ts whose scores co(st)t,te
the -)**le 2/3 of all scores. f scores th)s se-ester ha* -ea( 2.0 a(* sta(*ar*
*e4)at)o( .1 +(* the )(ter4al of scores that w)ll be ass)g(e* a %.
22 Researchers w)sh to )(4est)gate the o4erall health of )(*)4)*,als w)th ab(or-ally h)gh
or low le4els of gl,cose )( the bloo* strea-. S,ppose gl,cose le4els are (or-ally
*)str)b,te* w)th -ea( 9 a(* sta(*ar* *e4)at)o( 6.0 -g/* ℓ a(* that (or-al )s
*e+(e* as the -)**le 9;G of the pop,lat)o(. )(* the )(ter4al of (or-al gl,cose
le4els that )s the )(ter4al ce(tere* at 9 that co(ta)(s 9;G of all gl,cose le4els )(the pop,lat)o(.
A((&T&1NA! ++/C&S+S
23 A -ach)(e for +ll)(g 2=l)ter bottles of soft *r)(k *el)4ers a( a-o,(t to each bottle that
4ar)es fro- bottle to bottle accor*)(g to a (or-al *)str)b,t)o( w)th sta(*ar* *e4)at)o(
;.;;2 l)ter a(* -ea( whate4er a-o,(t the -ach)(e )s set to *el)4er.
a f the -ach)(e )s set to *el)4er 2 l)ters so the -ea( a-o,(t *el)4ere* )s 2
l)ters5 what proport)o( of the bottles w)ll co(ta)( at least 2 l)ters of soft *r)(kC
b )(* the -)()-,- sett)(g of the -ea( a-o,(t *el)4ere* by the -ach)(e so
that at least 99G of all bottles w)ll co(ta)( at least 2 l)ters.
2 A (,rsery has obser4e* that the -ea( (,-ber of *ays )t -,st *arke( the e(4)ro(-e(t
of a spec)es po)(sett)a pla(t *a)ly )( or*er to ha4e )t rea*y for -arket )s 1 *ays.
S,ppose the le(gths of s,ch per)o*s of *arke()(g are (or-ally *)str)b,te* w)th sta(*ar*
*e4)at)o( 2 *ays. )(* the (,-ber of *ays )( a*4a(ce of the pro>ecte* *el)4ery *ates of
Saylor URL: http://www.saylor.org/books Saylor.org26
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 285/723
the pla(ts to -arket that the (,rsery -,st beg)( the *a)ly *arke()(g process )( or*er
that at least 90G of the pla(ts w)ll be rea*y o( t)-e. Po)(sett)as are so lo(g=l)4e* that
o(ce rea*y for -arket the pla(t re-a)(s salable )(*e+()tely.5
Saylor URL: http://www.saylor.org/books Saylor.org260
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 286/723
Chapter 9
Samplin$ (istributions
% statistic$ such as the sample mean or the sample standard deviation$ is a number computed from asample. +ince a sample is random$ every statistic is a random variable: it varies from sample to
sample in a way that cannot be predicted with certainty. %s a random variable it has a mean$ a
standard deviation$ and a probability distribution. The probability distribution of a statistic is called
itssampling distribution. Typically sample statistics are not ends in themselves$ but are computed in
order to estimate the corresponding population parameters$ as illustrated in the grand picture of
statistics presented in /igure 1.1 0The rand icture of +tatistics0 in 7hapter 1 0,ntroduction0.
This chapter introduces the concepts of the mean$ the standard deviation$ and the sampling
distribution of a sample statistic$ with an emphasis on the sample mean x −.
9.% The >ean and Standard (eviation o the Sample >ean
LEARNN! "#$E%&'ES
1 &o beco-e fa-)l)ar w)th the co(cept of the probab)l)ty *)str)b,t)o( of the sa-ple
-ea(.
2 &o ,(*ersta(* the -ea()(g of the for-,las for the -ea( a(* sta(*ar* *e4)at)o(
of the sa-ple -ea(.
+uppose we wish to estimate the mean of a population. ,n actual practice we would typically take
#ust one sample. ,magine however that we take sample after sample$ all of the same si'e n$ and
compute the sample mean x −of each one. (e will likely get a different value of x −each time. The
sample mean x −is a random variable: it varies from sample to sample in a way that cannot be
predicted with certainty. (e will write X −− when the sample mean is thought of as a random
variable$ and write x −for the values that it takes. The random variable X −−has a mean$
Saylor URL: http://www.saylor.org/books Saylor.org26
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 287/723
denotedµX −−$ and a standard deviation$ denotedσX −−.Eere is an example with such a small
population and small sample si'e that we can actually write down every single sample.
EKAPLE 1
A row)(g tea- co(s)sts of fo,r rowers who we)gh 102 10 1; a(* 1 po,(*s.
)(* all poss)ble ra(*o- sa-ples w)th replace-e(t of s)Ve two a(* co-p,te the
sa-ple -ea( for each o(e. Use the- to +(* the probab)l)ty *)str)b,t)o( the
-ea( a(* the sta(*ar* *e4)at)o( of the sa-ple -ea( X −−.
Sol,t)o(
&he follow)(g table shows all poss)ble sa-ples w)th replace-e(t of s)Ve two
alo(g w)th the -ea( of each:
Sample >ean Sample >ean Sample >ean Sample >ean
102 102 102 10 102 10 1; 102 10 1 102 106
102 10 10 10 10 10 1; 10 106 1 10 1;
102 1; 10 10 1; 106 1; 1; 1; 1 1; 12
102 1 106 10 1 1; 1; 1 12 1 1 1
Saylor URL: http://www.saylor.org/books Saylor.org26
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 288/723
Saylor URL: http://www.saylor.org/books Saylor.org266
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 289/723
Saylor URL: http://www.saylor.org/books Saylor.org269
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 290/723
*+, TA*+AA,S
• &he sa-ple -ea( )s a ra(*o- 4ar)ableD as s,ch )t )s wr)tte( X−− a(* x− sta(*s for
)(*)4)*,al 4al,es )t takes.
• As a ra(*o- 4ar)able the sa-ple -ea( has a probab)l)ty *)str)b,t)o( a -ea( µX−−
a(* a sta(*ar* *e4)at)o( σX−−.
• &here are for-,las that relate the -ea( a(* sta(*ar* *e4)at)o( of the sa-ple
-ea( to the -ea( a(* sta(*ar* *e4)at)o( of the pop,lat)o( fro- wh)ch the
sa-ple )s *raw(.
++/C&S+S
1 Ra(*o- sa-ples of s)Ve 220 are *raw( fro- a pop,lat)o( w)th -ea( 1;; a(* sta(*ar*
*e4)at)o( 2;. )(* the -ea( a(* sta(*ar* *e4)at)o( of the sa-ple -ea(.
2 Ra(*o- sa-ples of s)Ve are *raw( fro- a pop,lat)o( w)th -ea( 32 a(* sta(*ar*
*e4)at)o( 0. )(* the -ea( a(* sta(*ar* *e4)at)o( of the sa-ple -ea(.
3 A pop,lat)o( has -ea( 0 a(* sta(*ar* *e4)at)o( 12.
a Ra(*o- sa-ples of s)Ve 121 are take(. )(* the -ea( a(* sta(*ar* *e4)at)o(
of the sa-ple -ea(.
Saylor URL: http://www.saylor.org/books Saylor.org29;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 291/723
b @ow wo,l* the a(swers to part a5 cha(ge )f the s)Ve of the sa-ples were ;;
)(stea* of 121C
A pop,lat)o( has -ea( 0.0 a(* sta(*ar* *e4)at)o( 1.;2.
a Ra(*o- sa-ples of s)Ve 61 are take(. )(* the -ea( a(* sta(*ar* *e4)at)o(
of the sa-ple -ea(.b @ow wo,l* the a(swers to part a5 cha(ge )f the s)Ve of the sa-ples were 20
)(stea* of 61C
9.0 The Samplin$ (istribution o the Sample >ean
LEARNN! "#$E%&'ES
1 &o lear( what the sa-pl)(g *)str)b,t)o( of X −− )s whe( the sa-ple s)Ve )s large.
2 &o lear( what the sa-pl)(g *)str)b,t)o( of X −− )s whe( the pop,lat)o( )s (or-al.
The Central !imit Theorem
,n 9ote ;.8 0xample 10 in +ection ;.1 0The ean and +tandard 2eviation of the +ample ean0 we
constructed the probability distribution of the sample mean for samples of si'e two drawn from the
population of four rowers. The probability distribution is:
Saylor URL: http://www.saylor.org/books Saylor.org291
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 292/723
Saylor URL: http://www.saylor.org/books Saylor.org292
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 293/723
Eistograms illustrating these distributions are shown in /igure ;.! 02istributions of the +ample
ean0.
Saylor URL: http://www.saylor.org/books Saylor.org293
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 294/723
!igure /.& (istributions of the %ample 2ean
%s n increases the sampling distribution of X −−evolves in an interesting way: the probabilities on
the lower and the upper ends shrink and the probabilities in the middle become larger in relation to
them. ,f we were to continue to increase nthen the shape of the sampling distribution would become
smoother and more bell-shaped.
(hat we are seeing in these examples does not depend on the particular population distributions
involved. ,n general$ one may start with any distribution and the sampling distribution of the sample
mean will increasingly resemble the bell-shaped normal curve as the sample si'e increases. This is
the content of the 7entral imit Theorem.
Saylor URL: http://www.saylor.org/books Saylor.org29
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 295/723
Saylor URL: http://www.saylor.org/books Saylor.org290
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 296/723
The dashed vertical lines in the figures locate the population mean. Cegardless of the distribution of
the population$ as the sample si'e is increased the shape of the sampling distribution of the sample
mean becomes increasingly bell-shaped$ centered on the population mean. Typically by the time the
sample si'e is 35 the distribution of the sample mean is practically the same as a normal distribution.
The importance of the 7entral imit Theorem is that it allows us to make probability statements
about the sample mean$ specifically in relation to its value in comparison to the population mean$ as
Saylor URL: http://www.saylor.org/books Saylor.org29
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 297/723
we will see in the examples. 4ut to use the result properly we must first reali'e that there are two
separate random variables *and therefore two probability distributions) at play:
1 B $ the measurement of a single element selected at random from the population the distribution
of B is the distribution of the population$ with mean the population mean and standard deviationthe population standard deviation 6
! X−−$ the mean of the measurements in a sample of si'e n the distribution of X−−is its sampling
distribution$ with meanµX−−=µand standard deviationσX−−=σ/n√.
Saylor URL: http://www.saylor.org/books Saylor.org29
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 298/723
Saylor URL: http://www.saylor.org/books Saylor.org296
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 299/723
Normally (istributed 2opulations
The 7entral imit Theorem says that no matter what the distribution of the population is$ as long as
the sample is Glarge$H meaning of si'e 35 or more$ the sample mean is approximately normally
distributed. ,f the population is normal to begin with then the sample mean also has a normal
distribution$ regardless of the sample si'e.
/or samples of any si'e drawn from a normally distributed population$ the sample mean is normally
distributed$ with meanµX −−=µand standard deviationσX−−=σ/√n$ where n is the sample si'e.
The effect of increasing the sample si'e is shown in /igure ;.6 02istribution of +ample eans for a
9ormal opulation0.
Saylor URL: http://www.saylor.org/books Saylor.org299
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 300/723
!igure /. (istribution of %ample 2eans for a Iormal $opulation
Saylor URL: http://www.saylor.org/books Saylor.org3;;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 301/723
Saylor URL: http://www.saylor.org/books Saylor.org3;1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 302/723
*+, TA*+AA,S
• Bhe( the sa-ple s)Ve )s at least 3; the sa-ple -ea( )s (or-ally *)str)b,te*.
• Bhe( the pop,lat)o( )s (or-al the sa-ple -ea( )s (or-ally *)str)b,te*
regar*less of the sa-ple s)Ve.
Saylor URL: http://www.saylor.org/books Saylor.org3;2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 303/723
++/C&S+S
'AS&C
1 A pop,lat)o( has -ea( 126 a(* sta(*ar* *e4)at)o( 22.
a )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 3.
b )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 3 w)ll be w)th)( 1;
,()ts of the pop,lat)o( -ea( that )s betwee( 116 a(* 136.
2 A pop,lat)o( has -ea( 102 a(* sta(*ar* *e4)at)o( 2.
a )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 1;;.
b )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 1;; w)ll be w)th)( 1;;
,()ts of the pop,lat)o( -ea( that )s betwee( 12 a(* 12.
3 A pop,lat)o( has -ea( 3.0 a(* sta(*ar* *e4)at)o( 2.0.
a )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 3;.
b )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 3; w)ll be less tha( 2.
A pop,lat)o( has -ea( 6. a(* sta(*ar* *e4)at)o( .3.
a )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve .
b )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve w)ll be less tha(
..
0 A (or-ally *)str)b,te* pop,lat)o( has -ea( 20. a(* sta(*ar* *e4)at)o( 3.3.
a )(* the probab)l)ty that a s)(gle ra(*o-ly selecte* ele-e(t : of the
pop,lat)o( e7cee*s 3;.
b )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 9.
c )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 9 *raw( fro- th)s
pop,lat)o( e7cee*s 3;.
A (or-ally *)str)b,te* pop,lat)o( has -ea( 0. a(* sta(*ar* *e4)at)o( 12.1.
a )(* the probab)l)ty that a s)(gle ra(*o-ly selecte* ele-e(t : of the
pop,lat)o( )s less tha( 0.
b )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 1.
c )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 1 *raw( fro- th)s
pop,lat)o( )s less tha( 0.
A pop,lat)o( has -ea( 00 a(* sta(*ar* *e4)at)o( 30.
a )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 0;.
Saylor URL: http://www.saylor.org/books Saylor.org3;3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 304/723
b )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 0; w)ll be -ore tha(
0;.
6 A pop,lat)o( has -ea( 1 a(* sta(*ar* *e4)at)o( 1..
a )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 6;.
b )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 6; w)ll be -ore tha(
1..
9 A (or-ally *)str)b,te* pop,lat)o( has -ea( 121 a(* sta(*ar* *e4)at)o( 122.
a )(* the probab)l)ty that a s)(gle ra(*o-ly selecte* ele-e(t : of the
pop,lat)o( )s betwee( 11;; a(* 13;;.
b )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 20.
c )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 20 *raw( fro- th)s
pop,lat)o( )s betwee( 11;; a(* 13;;.
1; A (or-ally *)str)b,te* pop,lat)o( has -ea( 06;; a(* sta(*ar* *e4)at)o( 0;.
a )(* the probab)l)ty that a s)(gle ra(*o-ly selecte* ele-e(t : of the
pop,lat)o( )s betwee( 0;;; a(* 06;;;.
b )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 1;;.
c )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 1;; *raw( fro- th)s
pop,lat)o( )s betwee( 0;;; a(* 06;;;.
11 A pop,lat)o( has -ea( 2 a(* sta(*ar* *e4)at)o( .
a )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 0.
b )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 0 w)ll *)8er fro- the
pop,lat)o( -ea( 2 by at least 2 ,()ts that )s )s e)ther less tha( ; or -ore
tha( . @)(t: "(e way to sol4e the proble- )s to +rst +(* the probab)l)ty of
the co-ple-e(tary e4e(t.5
12 A pop,lat)o( has -ea( 12 a(* sta(*ar* *e4)at)o( 1.0.
a )(* the -ea( a(* sta(*ar* *e4)at)o( of X−− for sa-ples of s)Ve 9;.
b )(* the probab)l)ty that the -ea( of a sa-ple of s)Ve 9; w)ll *)8er fro- the
pop,lat)o( -ea( 12 by at least ;.3 ,()t that )s )s e)ther less tha( 11. or
Saylor URL: http://www.saylor.org/books Saylor.org3;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 305/723
-ore tha( 12.3. @)(t: "(e way to sol4e the proble- )s to +rst +(* the
probab)l)ty of the co-ple-e(tary e4e(t.5
A22!&CAT&1NS
13 S,ppose the -ea( (,-ber of *ays to ger-)(at)o( of a 4ar)ety of see* )s 22 w)th
sta(*ar* *e4)at)o( 2.3 *ays. )(* the probab)l)ty that the -ea( ger-)(at)o( t)-e of a
sa-ple of 1; see*s w)ll be w)th)( ;.0 *ay of the pop,lat)o( -ea(.
1 S,ppose the -ea( le(gth of t)-e that a caller )s place* o( hol* whe( telepho()(g a
c,sto-er ser4)ce ce(ter )s 23.6 seco(*s w)th sta(*ar* *e4)at)o( . seco(*s. )(* the
probab)l)ty that the -ea( le(gth of t)-e o( hol* )( a sa-ple of 12;; calls w)ll be w)th)(
;.0 seco(* of the pop,lat)o( -ea(.
10 S,ppose the -ea( a-o,(t of cholesterol )( eggs labele* large )s 16 -)ll)gra-s w)th
sta(*ar* *e4)at)o( -)ll)gra-s. )(* the probab)l)ty that the -ea( a-o,(t of
cholesterol )( a sa-ple of 1 eggs w)ll be w)th)( 2 -)ll)gra-s of the pop,lat)o( -ea(.
1 S,ppose that )( o(e reg)o( of the co,(try the -ea( a-o,(t of cre*)t car* *ebt per
ho,sehol* )( ho,sehol*s ha4)(g cre*)t car* *ebt )s 1020; w)th sta(*ar* *e4)at)o(
120. )(* the probab)l)ty that the -ea( a-o,(t of cre*)t car* *ebt )( a sa-ple of
1;; s,ch ho,sehol*s w)ll be w)th)( 3;; of the pop,lat)o( -ea(.
1 S,ppose spee*s of 4eh)cles o( a part)c,lar stretch of roa*way are (or-ally *)str)b,te*
w)th -ea( 3. -ph a(* sta(*ar* *e4)at)o( 1. -ph.
a )(* the probab)l)ty that the spee* : of a ra(*o-ly selecte* 4eh)cle )s
betwee( 30 a(* ; -ph.
b )(* the probab)l)ty that the -ea( spee* X−− of 2; ra(*o-ly selecte* 4eh)cles
)s betwee( 30 a(* ; -ph.
16 a(y sharks e(ter a state of to()c )--ob)l)ty whe( )(4erte*. S,ppose that )( a
part)c,lar spec)es of sharks the t)-e a shark re-a)(s )( a state of to()c )--ob)l)ty whe(
)(4erte* )s (or-ally *)str)b,te* w)th -ea( 11.2 -)(,tes a(* sta(*ar* *e4)at)o( 1.1
-)(,tes.
a f a b)olog)st )(*,ces a state of to()c )--ob)l)ty )( s,ch a shark )( or*er to
st,*y )t +(* the probab)l)ty that the shark w)ll re-a)( )( th)s state for
betwee( 1; a(* 13 -)(,tes.
b Bhe( a b)olog)st w)shes to est)-ate the -ea( t)-e that s,ch sharks stay
)--ob)le by )(*,c)(g to()c )--ob)l)ty )( each of a sa-ple of 12 sharks +(*
the probab)l)ty that -ea( t)-e of )--ob)l)ty )( the sa-ple w)ll be betwee( 1;
a(* 13 -)(,tes.
Saylor URL: http://www.saylor.org/books Saylor.org3;0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 306/723
19 S,ppose the -ea( cost across the co,(try of a 3;=*ay s,pply of a ge(er)c *r,g )s
.06 w)th sta(*ar* *e4)at)o( .6. )(* the probab)l)ty that the -ea( of a sa-ple of
1;; pr)ces of 3;=*ay s,ppl)es of th)s *r,g w)ll be betwee( 0 a(* 0;.
2; S,ppose the -ea( le(gth of t)-e betwee( s,b-)ss)o( of a state ta7 ret,r( re<,est)(g a
ref,(* a(* the )ss,a(ce of the ref,(* )s *ays w)th sta(*ar* *e4)at)o( *ays. )(*the probab)l)ty that )( a sa-ple of 0; ret,r(s re<,est)(g a ref,(* the -ea( s,ch t)-e
w)ll be -ore tha( 0; *ays.
21 Scores o( a co--o( +(al e7a- )( a large e(roll-e(t -,lt)ple=sect)o( fresh-a( co,rse
are (or-ally *)str)b,te* w)th -ea( 2. a(* sta(*ar* *e4)at)o( 13.1.
a )(* the probab)l)ty that the score : o( a ra(*o-ly selecte* e7a- paper )s
betwee( ; a(* 6;.
b )(* the probab)l)ty that the -ea( score X−− of 36 ra(*o-ly selecte* e7a-
papers )s betwee( ; a(* 6;.
22 S,ppose the -ea( we)ght of school ch)l*re(Ws bookbags )s 1. po,(*s w)th sta(*ar*
*e4)at)o( 2.2 po,(*s. )(* the probab)l)ty that the -ea( we)ght of a sa-ple of 3;
bookbags w)ll e7cee* 1 po,(*s.
23 S,ppose that )( a certa)( reg)o( of the co,(try the -ea( *,rat)o( of +rst -arr)ages that
e(* )( *)4orce )s .6 years sta(*ar* *e4)at)o( 1.2 years. )(* the probab)l)ty that )( a
sa-ple of 0 *)4orces the -ea( age of the -arr)ages )s at -ost 6 years.
2 #orach)o eats at the sa-e fast foo* resta,ra(t e4ery *ay. S,ppose the t)-e : betwee(
the -o-e(t #orach)o e(ters the resta,ra(t a(* the -o-e(t he )s ser4e* h)s foo* )s
(or-ally *)str)b,te* w)th -ea( .2 -)(,tes a(* sta(*ar* *e4)at)o( 1.3 -)(,tes.
a )(* the probab)l)ty that whe( he e(ters the resta,ra(t to*ay )t w)ll be at least
0 -)(,tes ,(t)l he )s ser4e*.
b )(* the probab)l)ty that a4erage t)-e ,(t)l he )s ser4e* )( e)ght ra(*o-ly
selecte* 4)s)ts to the resta,ra(t w)ll be at least 0 -)(,tes.
A((&T&1NA! ++/C&S+S
20 A h)gh=spee* pack)(g -ach)(e ca( be set to *el)4er betwee( 11 a(* 13 o,(ces of a
l)<,)*. or a(y *el)4ery sett)(g )( th)s ra(ge the a-o,(t *el)4ere* )s (or-ally *)str)b,te*
w)th -ea( so-e a-o,(t μ a(* w)th sta(*ar* *e4)at)o( ;.;6 o,(ce. &o cal)brate the
-ach)(e )t )s set to *el)4er a part)c,lar a-o,(t -a(y co(ta)(ers are +lle* a(* 20co(ta)(ers are ra(*o-ly selecte* a(* the a-o,(t they co(ta)( )s -eas,re*. )(* the
probab)l)ty that the sa-ple -ea( w)ll be w)th)( ;.;0 o,(ce of the act,al -ea( a-o,(t
be)(g *el)4ere* to all co(ta)(ers.
2 A t)re -a(,fact,rer states that a certa)( type of t)re has a -ea( l)fet)-e of ;;;;
-)les. S,ppose l)fet)-es are (or-ally *)str)b,te* w)th sta(*ar* *e4)at)o( σ= 3,500-)les.
Saylor URL: http://www.saylor.org/books Saylor.org3;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 307/723
a )(* the probab)l)ty that )f yo, b,y o(e s,ch t)re )t w)ll last o(ly 0;;; or
fewer -)les. f yo, ha* th)s e7per)e(ce )s )t part)c,larly stro(g e4)*e(ce that
the t)re )s (ot as goo* as cla)-e*C
b A co(s,-er gro,p b,ys +4e s,ch t)res a(* tests the-. )(* the probab)l)ty
that a4erage l)fet)-e of the +4e t)res w)ll be 0;;; -)les or less. f the -ea()s so low )s that part)c,larly stro(g e4)*e(ce that the t)re )s (ot as goo* as
cla)-e*C
Saylor URL: http://www.saylor.org/books Saylor.org3;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 308/723
.3 &he Sa-ple Proport)o(
LEARNN! "#$E%&'ES
1 &o recog()Ve that the sa-ple proport)o( Pˆ)s a ra(*o- 4ar)able.
2 &o ,(*ersta(* the -ea()(g of the for-,las for the -ea( a(* sta(*ar* *e4)at)o(
of the sa-ple proport)o(.
3 &o lear( what the sa-pl)(g *)str)b,t)o( of P )s whe( the sa-ple s)Ve )s large.
Often sampling is done in order to estimate the proportion of a population that has a specific
characteristic$ such as the proportion of all items coming off an assembly line that are defective or
the proportion of all people entering a retail store who make a purchase before leaving. The
population proportion is denoted p and the sample proportion is denoted p.Thus if in reality 63B of
people entering a store make a purchase before leaving$ p M 5.63 if in a sample of !55 people
entering the store$ @? make a purchase$ p=78/200=0.39.
The sample proportion is a random variable: it varies from sample to sample in a way that cannot be
predicted with certainty. Diewed as a random variable it will be written P .,t has a mean µP and
a standard deviation σP.Eere are formulas for their values.
Saylor URL: http://www.saylor.org/books Saylor.org3;6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 309/723
Saylor URL: http://www.saylor.org/books Saylor.org3;9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 310/723
/igure ;.8 02istribution of +ample roportions0 shows that when p M 5.1 a sample of si'e 18 is too
small but a sample of si'e 155 is acceptable. /igure ;.; 02istribution of +ample roportions for
0 shows that when p M 5.8 a sample of si'e 18 is acceptable.
Saylor URL: http://www.saylor.org/books Saylor.org31;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 311/723
!igure /. (istribution of %ample $roportions
!igure /./ (istribution of %ample $roportions for p H 5. and n H "
Saylor URL: http://www.saylor.org/books Saylor.org311
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 312/723
Saylor URL: http://www.saylor.org/books Saylor.org312
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 313/723
+A>2!+ 5
A( o(l)(e reta)ler cla)-s that 9;G of all or*ers are sh)ppe* w)th)( 12 ho,rs of
be)(g rece)4e*. A co(s,-er gro,p place* 121 or*ers of *)8ere(t s)Ves a(* at
*)8ere(t t)-es of *ayD 1;2 or*ers were sh)ppe* w)th)( 12 ho,rs.
a %o-p,te the sa-ple proport)o( of )te-s sh)ppe* w)th)( 12 ho,rs.
b %o(+r- that the sa-ple )s large e(o,gh to ass,-e that the sa-ple
proport)o( )s (or-ally *)str)b,te*. Use p ;.9; correspo(*)(g to the
ass,-pt)o( that the reta)lerWs cla)- )s 4al)*.
c Ass,-)(g the reta)lerWs cla)- )s tr,e +(* the probab)l)ty that a sa-ple of
s)Ve 121 wo,l* pro*,ce a sa-ple proport)o( so low as was obser4e* )(
th)s sa-ple.
* #ase* o( the a(swer to part c5 *raw a co(cl,s)o( abo,t the reta)lerWs
cla)-.
Saylor URL: http://www.saylor.org/books Saylor.org313
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 314/723
Saylor URL: http://www.saylor.org/books Saylor.org31
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 315/723
Saylor URL: http://www.saylor.org/books Saylor.org310
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 316/723
Saylor URL: http://www.saylor.org/books Saylor.org31
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 317/723
Saylor URL: http://www.saylor.org/books Saylor.org31
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 318/723
A22!&CAT&1NS
13 S,ppose that 6G of all -ales s,8er so-e for- of color bl)(*(ess. )(* the probab)l)ty
that )( a ra(*o- sa-ple of 20; -e( at least 1;G w)ll s,8er so-e for- of color
bl)(*(ess. )rst 4er)fy that the sa-ple )s s,?c)e(tly large to ,se the (or-al *)str)b,t)o(.
1 S,ppose that 29G of all res)*e(ts of a co--,()ty fa4or a((e7at)o( by a (earby
-,()c)pal)ty. )(* the probab)l)ty that )( a ra(*o- sa-ple of 0; res)*e(ts at least 30G
w)ll fa4or a((e7at)o(. )rst 4er)fy that the sa-ple )s s,?c)e(tly large to ,se the (or-al
*)str)b,t)o(.
10 S,ppose that 2G of all cell pho(e co((ect)o(s by a certa)( pro4)*er are *roppe*. )(*
the probab)l)ty that )( a ra(*o- sa-ple of 10;; calls at -ost ; w)ll be *roppe*. )rst
4er)fy that the sa-ple )s s,?c)e(tly large to ,se the (or-al *)str)b,t)o(.
1 S,ppose that )( 2;G of all tra?c acc)*e(ts )(4ol4)(g a( )(>,ry *r)4er *)stract)o( )( so-e
for- for e7a-ple cha(g)(g a ra*)o stat)o( or te7t)(g5 )s a factor. )(* the probab)l)ty
that )( a ra(*o- sa-ple of 20 s,ch acc)*e(ts betwee( 10G a(* 20G )(4ol4e *r)4er
*)stract)o( )( so-e for-. )rst 4er)fy that the sa-ple )s s,?c)e(tly large to ,se the
(or-al *)str)b,t)o(.
1 A( a)rl)(e cla)-s that 2G of all )ts [)ghts to a certa)( reg)o( arr)4e o( t)-e. ( a ra(*o-
sa-ple of 3; rece(t arr)4als 19 were o( t)-e. Jo, -ay ass,-e that the (or-al
*)str)b,t)o( appl)es.a %o-p,te the sa-ple proport)o(.
b Ass,-)(g the a)rl)(eWs cla)- )s tr,e +(* the probab)l)ty of a sa-ple of s)Ve 3;
pro*,c)(g a sa-ple proport)o( so low as was obser4e* )( th)s sa-ple.
16 A h,-a(e soc)ety reports that 19G of all pet *ogs were a*opte* fro- a( a()-al shelter.
Ass,-)(g the tr,th of th)s assert)o( +(* the probab)l)ty that )( a ra(*o- sa-ple of 6;
Saylor URL: http://www.saylor.org/books Saylor.org316
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 319/723
pet *ogs betwee( 10G a(* 2;G were a*opte* fro- a shelter. Jo, -ay ass,-e that the
(or-al *)str)b,t)o( appl)es.
19 ( o(e st,*y )t was fo,(* that 6G of all ho-es ha4e a f,(ct)o(al s-oke *etector.
S,ppose th)s proport)o( )s 4al)* for all ho-es. )(* the probab)l)ty that )( a ra(*o-
sa-ple of ;; ho-es betwee( 6;G a(* 9;G w)ll ha4e a f,(ct)o(al s-oke *etector. Jo,-ay ass,-e that the (or-al *)str)b,t)o( appl)es.
2; A state )(s,ra(ce co--)ss)o( est)-ates that 13G of all -otor)sts )( )ts state are
,()(s,re*. S,ppose th)s proport)o( )s 4al)*. )(* the probab)l)ty that )( a ra(*o- sa-ple
of 0; -otor)sts at least 0 w)ll be ,()(s,re*. Jo, -ay ass,-e that the (or-al
*)str)b,t)o( appl)es.
21 A( o,ts)*e +(a(c)al a,*)tor has obser4e* that abo,t G of all *oc,-e(ts he e7a-)(es
co(ta)( a( error of so-e sort. Ass,-)(g th)s proport)o( to be acc,rate +(* the
probab)l)ty that a ra(*o- sa-ple of ;; *oc,-e(ts w)ll co(ta)( at least 3; w)th so-e
sort of error. Jo, -ay ass,-e that the (or-al *)str)b,t)o( appl)es.
22 S,ppose G of all ho,sehol*s ha4e (o ho-e telepho(e b,t *epe(* co-pletely o( cell
pho(es. )(* the probab)l)ty that )( a ra(*o- sa-ple of 0; ho,sehol*s betwee( 20
a(* 30 w)ll ha4e (o ho-e telepho(e. Jo, -ay ass,-e that the (or-al *)str)b,t)o(
appl)es.
A((&T&1NA! ++/C&S+S
23 So-e co,(tr)es allow )(*)4)*,al packages of prepackage* goo*s to we)gh less tha( what
)s state* o( the package s,b>ect to certa)( co(*)t)o(s s,ch as the a4erage of all
packages be)(g the state* we)ght or greater. S,ppose that o(e re<,)re-e(t )s that at
-ost G of all packages -arke* 0;; gra-s ca( we)gh less tha( 9; gra-s. Ass,-)(g
that a pro*,ct act,ally -eets th)s re<,)re-e(t +(* the probab)l)ty that )( a ra(*o-
sa-ple of 10; s,ch packages the proport)o( we)gh)(g less tha( 9; gra-s )s at least 3G.
Jo, -ay ass,-e that the (or-al *)str)b,t)o( appl)es.
2 A( eco(o-)st w)shes to )(4est)gate whether people are keep)(g cars lo(ger (ow tha( )(
the past. @e k(ows that +4e years ago 36G of all passe(ger 4eh)cles )( operat)o( were
at least te( years ol*. @e co--)ss)o(s a st,*y )( wh)ch 320 a,to-ob)les are ra(*o-ly
sa-ple*. "f the- 132 are te( years ol* or ol*er.
a )(* the sa-ple proport)o(.
b )(* the probab)l)ty that whe( a sa-ple of s)Ve 320 )s *raw( fro- a
pop,lat)o( )( wh)ch the tr,e proport)o( )s ;.36 the sa-ple proport)o( w)ll be
as large as the 4al,e yo, co-p,te* )( part a5. Jo, -ay ass,-e that the
(or-al *)str)b,t)o( appl)es.
c !)4e a( )(terpretat)o( of the res,lt )( part b5. s there stro(g e4)*e(ce that
people are keep)(g the)r cars lo(ger tha( was the case +4e years agoC
Saylor URL: http://www.saylor.org/books Saylor.org319
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 320/723
20 A state p,bl)c health *epart-e(t w)shes to )(4est)gate the e8ect)4e(ess of a ca-pa)g(
aga)(st s-ok)(g. @)stor)cally 22G of all a*,lts )( the state reg,larly s-oke* c)gars or
c)garettes. ( a s,r4ey co--)ss)o(e* by the p,bl)c health *epart-e(t 29 of 10;;
ra(*o-ly selecte* a*,lts state* that they s-oke reg,larly.
a )(* the sa-ple proport)o(.b )(* the probab)l)ty that whe( a sa-ple of s)Ve 10;; )s *raw( fro- a
pop,lat)o( )( wh)ch the tr,e proport)o( )s ;.22 the sa-ple proport)o( w)ll be
(o larger tha( the 4al,e yo, co-p,te* )( part a5. Jo, -ay ass,-e that the
(or-al *)str)b,t)o( appl)es.
c !)4e a( )(terpretat)o( of the res,lt )( part b5. @ow stro(g )s the e4)*e(ce that
the ca-pa)g( to re*,ce s-ok)(g has bee( e8ect)4eC
2 ( a( e8ort to re*,ce the pop,lat)o( of ,(wa(te* cats a(* *ogs a gro,p of 4eter)(ar)a(s
set ,p a low=cost spay/(e,ter cl)()c. At the )(cept)o( of the cl)()c a s,r4ey of pet ow(ers
)(*)cate* that 6G of all pet *ogs a(* cats )( the co--,()ty were spaye* or (e,tere*.
After the low=cost cl)()c ha* bee( )( operat)o( for three years that +g,re ha* r)se( to
6G.
a Bhat )(for-at)o( )s -)ss)(g that yo, wo,l* (ee* to co-p,te the probab)l)ty
that a sa-ple *raw( fro- a pop,lat)o( )( wh)ch the proport)o( )s 6G
correspo(*)(g to the ass,-pt)o( that the low=cost cl)()c ha* ha* (o e8ect5 )s
as h)gh as 6GC
b I(ow)(g that the s)Ve of the or)g)(al sa-ple three years ago was 10; a(* that
the s)Ve of the rece(t sa-ple was 120 co-p,te the probab)l)ty -e(t)o(e* )(
part a5. Jo, -ay ass,-e that the (or-al *)str)b,t)o( appl)es.
c !)4e a( )(terpretat)o( of the res,lt )( part b5. @ow stro(g )s the e4)*e(ce that
the prese(ce of the low=cost cl)()c has )(crease* the proport)o( of pet *ogs
a(* cats that ha4e bee( spaye* or (e,tere*C
2 A( or*)(ary *)e )s fa)r or bala(ce* )f each face has a( e<,al cha(ce of la(*)(g o( top
whe( the *)e )s rolle*. &h,s the proport)o( of t)-es a three )s obser4e* )( a large (,-ber
of tosses )s e7pecte* to be close to 1/ or 0.16−.S,ppose a *)e )s rolle* 2; t)-es a(*
shows three o( top 3 t)-es for a sa-ple proport)o( of ;.10.
a )(* the probab)l)ty that a fa)r *)e wo,l* pro*,ce a proport)o( of ;.10 or less. Jo,
-ay ass,-e that the (or-al *)str)b,t)o( appl)es.
b !)4e a( )(terpretat)o( of the res,lt )( part b5. @ow stro(g )s the e4)*e(ce that the
*)e )s (ot fa)rC
c S,ppose the sa-ple proport)o( ;.10 ca-e fro- roll)(g the *)e 2;; t)-es )(stea*
of o(ly 2; t)-es. Rework part a5 ,(*er these c)rc,-sta(ces.
Saylor URL: http://www.saylor.org/books Saylor.org32;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 321/723
* !)4e a( )(terpretat)o( of the res,lt )( part c5. @ow stro(g )s the e4)*e(ce that the
*)e )s (ot fa)rC
Saylor URL: http://www.saylor.org/books Saylor.org321
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 322/723
Saylor URL: http://www.saylor.org/books Saylor.org322
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 323/723
Chapter 4
+stimation
,f we wish to estimate the mean of a population for which a census is impractical$ say the averageheight of all 1?-year-old men in the country$ a reasonable strategy is to take a sample$ compute its
meanx−$ and estimate the unknown number by the known numberx−./or example$ if the average
height of 155 randomly selected men aged 1? is @5.; inches$ then we would say that the average
height of all 1?-year-old men is *at least approximately) @5.; inches.
stimating a population parameter by a single number like this is called point estimation in the
case at hand the statistic x −is a point estimate of the parameter . The terminology arises because
a single number corresponds to a single point on the number line.
% problem with a point estimate is that it gives no indication of how reliable the estimate is. ,n
contrast$ in this chapter we learn about interval estimation. ,n brief$ in the case of estimating a
population mean we use a formula to compute from the data a number ; $ called
the margin of error of the estimate$ and form the interval [x −−E,x−+E]. (e do this in such a way
that a certain proportion$ say F8B$ of all the intervals constructed from sample data by means of this
formula contain the unknown parameter . +uch an interval is called
a "#1 confidence interval f or .
7ontinuing with the example of the average height of 1?-year-old men$ suppose that the sample of
155 men mentioned above for which x−=70.6inches also had sample standard deviation s M 1.@
inches. ,t then turns out that ; M 5.33 and we would state that we are F8B confident that the average
height of all 1?-year-old men is in the interval formed by 70.6±0.33inches$ that is$ the average is
between @5.!@ and @5.F3 inches. ,f the sample statistics had come from a smaller sample$ say a
sample of 85 men$ the lower reliability would show up in the F8B confidence interval being longer$
Saylor URL: http://www.saylor.org/books Saylor.org323
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 324/723
hence less precise in its estimate. ,n this example the F8B confidence interval for the same sample
statistics but with n M 85 is70.6±0.47inches$ or from @5.13 to @1.5@ inches.
4.% !ar$e Sample +stimation o a 2opulation >ean
LEARNN! "#$E%&'ES
1 &o beco-e fa-)l)ar w)th the co(cept of a( )(ter4al est)-ate of the pop,lat)o(
-ea(.
2 &o ,(*ersta(* how to apply for-,las for a co(+*e(ce )(ter4al for a pop,lat)o(
-ea(.
Saylor URL: http://www.saylor.org/books Saylor.org32
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 325/723
/igure @.! 07omputer +imulation of 65 F8B 7onfidence ,ntervals for a ean0shows the intervals
generated by a computer simulation of drawing 65 samples from a normally distributed population
and constructing the F8B confidence interval for each one. (e expect that about (0.05)(40)=2of the
intervals so constructed would fail to contain the population mean $ and in this simulation two of the
intervals$ shown in red$ do.
Saylor URL: http://www.saylor.org/books Saylor.org320
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 326/723
!igure 1.& <omputer %imulation of 5 4K <onfidence >ntervals for a 2ean
,t is standard practice to identify the level of confidence in terms of the area αin the two tails of the
distribution of X^−− when the middle part specified by the level of confidence is taken out. This is
shown in /igure @.3$ drawn for the general situation$ and in /igure @.6$ drawn for F8B confidence.
Cemember from +ection 8.6.1 0Tails of the +tandard 9ormal 2istribution0 in 7hapter 8 07ontinuousCandom Dariables0 that the z -value that cuts off a right tail of area c is denoted z c. Thus the number
1.F;5 in the example isz.025$ which iszα 2forα=1−0.95=0.05.
!igure 1.*
Saylor URL: http://www.saylor.org/books Saylor.org32
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 327/723
100(1−α)α/2.
!igure 1.
α/2=0.025.
Saylor URL: http://www.saylor.org/books Saylor.org32
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 328/723
Saylor URL: http://www.saylor.org/books Saylor.org326
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 329/723
EKAPLE 2
Use )g,re 12.3 Q%r)t)cal 'al,es of Q to +(* the (,-ber zα/2(ee*e* )( co(str,ct)o(
of a co(+*e(ce )(ter4al:
a. whe( the le4el of co(+*e(ce )s 9;GD
Saylor URL: http://www.saylor.org/books Saylor.org329
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 330/723
b. whe( the le4el of co(+*e(ce )s 99G.
Sol,t)o(:
a. ( the (e7t sect)o( we w)ll lear( abo,t a co(t)(,o,s ra(*o- 4ar)able that has a
probab)l)ty *)str)b,t)o( calle* the St,*e(t t =*)str)b,t)o(. )g,re 12.3 Q%r)t)cal 'al,es of
Q g)4es the 4al,e t c that c,ts o8 a r)ght ta)l of area c for *)8ere(t 4al,es of c. &he last l)(e
of that table the o(e whose hea*)(g )s the sy-bol ∞ for )(+()ty a(* [z] g)4es the
correspo(*)(g z =4al,e z c that c,ts o8 a r)ght ta)l of the sa-e area c. ( part)c,lar z ;.;0 )s
the (,-ber )( that row a(* )( the col,-( w)th the hea*)(g t ;.;0. Be rea* o8 *)rectly
that z0.05=1.645.
b. ( )g,re 12.3 Q%r)t)cal 'al,es of Q z ;.;;0 )s the (,-ber )( the last row a(* )( the col,-(
hea*e* t ;.;;0 (a-ely 2.0.
/igure 1!.3 07ritical Dalues of 0 can be used to find z c only for those values of cfor which there is a
column with the heading t c appearing in the table otherwise we must use /igure 1!.! 07umulative
9ormal robability0 in reverse. 4ut when it can be done it is both faster and more accurate to use the
last line of /igure 1!.3 07ritical Dalues of 0 to find z c than it is to do so using /igure 1!.! 07umulative
9ormal robability0 in reverse.
Saylor URL: http://www.saylor.org/books Saylor.org33;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 331/723
Saylor URL: http://www.saylor.org/books Saylor.org331
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 332/723
++/C&S+S'AS&C
1 A ra(*o- sa-ple )s *raw( fro- a pop,lat)o( of k(ow( sta(*ar* *e4)at)o( 11.3.
%o(str,ct a 9;G co(+*e(ce )(ter4al for the pop,lat)o( -ea( base* o( the )(for-at)o(
g)4e( (ot all of the )(for-at)o( g)4e( (ee* be ,se*5.
a n 3 x−=105.2 s 11.2
b n 1;; x−=105.2 s 11.2
2 A ra(*o- sa-ple )s *raw( fro- a pop,lat)o( of k(ow( sta(*ar* *e4)at)o( 22.1.
%o(str,ct a 90G co(+*e(ce )(ter4al for the pop,lat)o( -ea( base* o( the )(for-at)o(
g)4e( (ot all of the )(for-at)o( g)4e( (ee* be ,se*5.
a n 121 x−=82.4 s 21.9
b n 61 x−=82.4 s 21.9
3 A ra(*o- sa-ple )s *raw( fro- a pop,lat)o( of ,(k(ow( sta(*ar* *e4)at)o(. %o(str,ct
a 99G co(+*e(ce )(ter4al for the pop,lat)o( -ea( base* o( the )(for-at)o( g)4e(.
a n 9 x−=17.1 s 2.1
Saylor URL: http://www.saylor.org/books Saylor.org332
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 333/723
b n 19 x−=17.1 s 2.1
A ra(*o- sa-ple )s *raw( fro- a pop,lat)o( of ,(k(ow( sta(*ar* *e4)at)o(.
%o(str,ct a 96G co(+*e(ce )(ter4al for the pop,lat)o( -ea( base* o( the
)(for-at)o( g)4e(.
a n 220 x−=92.0 s 6.
b n x−=92.0 s 6.
0 A ra(*o- sa-ple of s)Ve 1 )s *raw( fro- a pop,lat)o( whose *)str)b,t)o( -ea(
a(* sta(*ar* *e4)at)o( are all ,(k(ow(. &he s,--ary stat)st)cs are x−=58.2a(* s
2..
a %o(str,ct a( 6;G co(+*e(ce )(ter4al for the pop,lat)o( -ea( μ.
b %o(str,ct a 9;G co(+*e(ce )(ter4al for the pop,lat)o( -ea( μ.
c %o--e(t o( why o(e )(ter4al )s lo(ger tha( the other.
A ra(*o- sa-ple of s)Ve 20 )s *raw( fro- a pop,lat)o( whose *)str)b,t)o( -ea(
a(* sta(*ar* *e4)at)o( are all ,(k(ow(. &he s,--ary stat)st)cs are x−=1011a(* s
3.
a %o(str,ct a 9;G co(+*e(ce )(ter4al for the pop,lat)o( -ea( μ.
b %o(str,ct a 99G co(+*e(ce )(ter4al for the pop,lat)o( -ea( μ.
c %o--e(t o( why o(e )(ter4al )s lo(ger tha( the other.
APPL%A&"NS
A go4er(-e(t age(cy was charge* by the leg)slat,re w)th est)-at)(g the le(gth of t)-e
)t takes c)t)Ve(s to +ll o,t 4ar)o,s for-s. &wo h,(*re* ra(*o-ly selecte* a*,lts were
t)-e* as they +lle* o,t a part)c,lar for-. &he t)-es re<,)re* ha* -ea( 12.6 -)(,tes
w)th sta(*ar* *e4)at)o( 1. -)(,tes. %o(str,ct a 9;G co(+*e(ce )(ter4al for the -ea(
t)-e take( for all a*,lts to +ll o,t th)s for-.
6 o,r h,(*re* ra(*o-ly selecte* work)(g a*,lts )( a certa)( state )(cl,*)(g those who
worke* at ho-e were aske* the *)sta(ce fro- the)r ho-e to the)r workplace. &he
a4erage *)sta(ce was 6.6 -)les w)th sta(*ar* *e4)at)o( 2.; -)les. %o(str,ct a 99G
Saylor URL: http://www.saylor.org/books Saylor.org333
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 334/723
co(+*e(ce )(ter4al for the -ea( *)sta(ce fro- ho-e to work for all res)*e(ts of th)s
state.
9 "( e4ery passe(ger 4eh)cle that )t tests a( a,to-ot)4e -agaV)(e -eas,res at tr,e
spee* 00 -ph the *)8ere(ce betwee( the tr,e spee* of the 4eh)cle a(* the spee*
)(*)cate* by the spee*o-eter. or 3 4eh)cles teste* the -ea( *)8ere(ce was X1.2
-ph w)th sta(*ar* *e4)at)o( ;.2 -ph. %o(str,ct a 9;G co(+*e(ce )(ter4al for the -ea(
*)8ere(ce betwee( tr,e spee* a(* )(*)cate* spee* for all 4eh)cles.
1; A corporat)o( -o()tors t)-e spe(t by o?ce workers brows)(g the web o( the)r
co-p,ters )(stea* of work)(g. ( a sa-ple of co-p,ter recor*s of 0; workers the
a4erage a-o,(t of t)-e spe(t brows)(g )( a( e)ght=ho,r work *ay was 2.6 -)(,tes
w)th sta(*ar* *e4)at)o( 6.2 -)(,tes. %o(str,ct a 99.0G co(+*e(ce )(ter4al for the
-ea( t)-e spe(t by all o?ce workers )( brows)(g the web )( a( e)ght=ho,r *ay.
11 A sa-ple of 20; workers age* 1 a(* ol*er pro*,ce* a( a4erage le(gth of t)-e w)th
the c,rre(t e-ployer >ob te(,re5 of . years w)th sta(*ar* *e4)at)o( 3.6 years.
%o(str,ct a 99.9G co(+*e(ce )(ter4al for the -ea( >ob te(,re of all workers age* 1 or
ol*er.
12 &he a-o,(t of a part)c,lar b)oche-)cal s,bsta(ce relate* to bo(e break*ow( was
-eas,re* )( 3; healthy wo-e(. &he sa-ple -ea( a(* sta(*ar* *e4)at)o( were 3.3
(a(ogra-s per -)ll)l)ter (g/-L5 a(* 1. (g/-L. %o(str,ct a( 6;G co(+*e(ce )(ter4al
for the -ea( le4el of th)s s,bsta(ce )( all healthy wo-e(.
13 A corporat)o( that ow(s apart-e(t co-ple7es w)shes to est)-ate the a4erage le(gth of
t)-e res)*e(ts re-a)( )( the sa-e apart-e(t before -o4)(g o,t. A sa-ple of 10; re(tal
co(tracts ga4e a -ea( le(gth of occ,pa(cy of 3. years w)th sta(*ar* *e4)at)o( 1.2
years. %o(str,ct a 90G co(+*e(ce )(ter4al for the -ea( le(gth of occ,pa(cy of
apart-e(ts ow(e* by th)s corporat)o(.
Saylor URL: http://www.saylor.org/books Saylor.org33
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 335/723
1 &he *es)g(er of a garbage tr,ck that l)fts roll=o,t co(ta)(ers -,st est)-ate the -ea(
we)ght the tr,ck w)ll l)ft at each collect)o( po)(t. A ra(*o- sa-ple of 320 co(ta)(ers of
garbage o( c,rre(t collect)o( ro,tes y)el*e* x−=75.3lb s 12.6 lb. %o(str,ct a 99.6G
co(+*e(ce )(ter4al for the -ea( we)ght the tr,cks -,st l)ft each t)-e.
10 ( or*er to est)-ate the -ea( a-o,(t of *a-age s,sta)(e* by 4eh)cles whe( a *eer )s
str,ck a( )(s,ra(ce co-pa(y e7a-)(e* the recor*s of 0; s,ch occ,rre(ces a(*
obta)(e* a sa-ple -ea( of 260 w)th sa-ple sta(*ar* *e4)at)o( 221. %o(str,ct a
90G co(+*e(ce )(ter4al for the -ea( a-o,(t of *a-age )( all s,ch acc)*e(ts.
1 ( or*er to est)-ate the -ea( %" cre*)t score of )ts -e-bers a cre*)t ,()o( sa-ples
the scores of 90 -e-bers a(* obta)(s a sa-ple -ea( of 36.2 w)th sa-ple sta(*ar*
*e4)at)o( .2. %o(str,ct a 99G co(+*e(ce )(ter4al for the -ea( %" score of all of )ts
-e-bers.
Saylor URL: http://www.saylor.org/books Saylor.org330
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 336/723
Saylor URL: http://www.saylor.org/books Saylor.org33
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 337/723
LAR!E A&A S E& EKE R%SES
23 Large ata Set 1 recor*s the SA& scores of 1;;; st,*e(ts. Regar*)(g )t as a ra(*o-
sa-ple of all h)gh school st,*e(ts ,se )t to co(str,ct a 99G co(+*e(ce )(ter4al for the
-ea( SA& score of all st,*e(ts.
http://www.1.7ls
Saylor URL: http://www.saylor.org/books Saylor.org33
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 338/723
2 Large ata Set 1 recor*s the !PAs of 1;;; college st,*e(ts. Regar*)(g )t as a ra(*o-
sa-ple of all college st,*e(ts ,se )t to co(str,ct a 90G co(+*e(ce )(ter4al for the
-ea( !PA of all st,*e(ts.
http://www.1.7ls
20 Large ata Set 1 l)sts the SA& scores of 1;;; st,*e(ts.
http://www.1.7ls
a Regar* the *ata as ar)s)(g fro- a ce(s,s of all st,*e(ts at a h)gh school )(
wh)ch the SA& score of e4ery st,*e(t was -eas,re*. %o-p,te the pop,lat)o(
-ea( μ.
b Regar* the +rst 3 st,*e(ts as a ra(*o- sa-ple a(* ,se )t to co(str,ct a
99G co(+*e(ce for the -ea( μ of all 1;;; SA& scores. oes )t act,ally
capt,re the -ea( μC
2 Large ata Set 1 l)sts the !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a Regar* the *ata as ar)s)(g fro- a ce(s,s of all fresh-a( at a s-all college at the
e(* of the)r +rst aca*e-)c year of college st,*y )( wh)ch the !PA of e4ery s,ch
perso( was -eas,re*. %o-p,te the pop,lat)o( -ea( μ.
b Regar* the +rst 3 st,*e(ts as a ra(*o- sa-ple a(* ,se )t to co(str,ct a 90G
co(+*e(ce for the -ea( μ of all 1;;; !PAs. oes )t act,ally capt,re the -ea( μC
Saylor URL: http://www.saylor.org/books Saylor.org336
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 339/723
4.0 Small Sample +stimation o a 2opulation>ean
!+A/N&N: 1';+CT&<+S
1 &o beco-e fa-)l)ar w)th St,*e(tWs t =*)str)b,t)o(.
2 &o ,(*ersta(* how to apply a**)t)o(al for-,las for a co(+*e(ce )(ter4al for a
pop,lat)o( -ea(.
The confidence interval formulas in the previous section are based on the 7entral imit Theorem$ the
statement that for large samples X −−is normally distributed with mean and standard
deviationσ/√n. (hen the population mean is estimated with a small sample *n V 35)$ the 7entral
imit Theorem does not apply. ,n order to proceed we assume that the numerical population from
which the sample is taken has a normal distribution to begin with. ,f this condition is satisfied then
Saylor URL: http://www.saylor.org/books Saylor.org339
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 340/723
when the population standard deviation 6 is known the old formula x−±zα 2 σ √n)can still be
used to construct a100(1−α)1 confidence interval for .
,f the population standard deviation is unknown and the sample si'e n is small then when we
substitute the sample standard deviation s for 6 the normal approximation is no longer valid. The
solution is to use a different distribution$ called Students t(
distribution *ith n−1degrees of freedom. +tudent&s t -distribution is very much like the
standard normal distribution in that it is centered at 5 and has the same "ualitative bell shape$ but it
has heavier tails than the standard normal distribution does$ as indicated by /igure @.8 0+tudent&s 0$
in which the curve *in brown) that meets the dashed vertical line at the lowest point is the t -
distribution with two degrees of freedom$ the next curve *in blue) is the t -distribution with five
degrees of freedom$ and the thin curve *in red) is the standard normal distribution. %s also indicated
by the figure$ as the sample si'e n increases$ +tudent&s t -distribution ever more closely resembles thestandard normal distribution. %lthough there is a different t -distribution for every value of n$ once the
sample si'e is 35 or more it is typically acceptable to use the standard normal distribution instead$ as
we will always do in this text.
!igure 1. %tudent=s t :(istribution
Rust as the symbol z c stands for the value that cuts off a right tail of area c in the standard normal
distribution$ so the symbol t c stands for the value that cuts off a right tail of area c in the standard
normal distribution. This gives us the following confidence interval formulas.
Saylor URL: http://www.saylor.org/books Saylor.org3;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 341/723
Saylor URL: http://www.saylor.org/books Saylor.org31
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 342/723
Saylor URL: http://www.saylor.org/books Saylor.org32
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 343/723
7ompare 9ote @.F 0xample 60 in +ection @.1 0arge +ample stimation of a opulation
ean0 and 9ote @.1; 0xample ;0. The summary statistics in the two samples are the same$ but the
F5B confidence interval for the average % of all students at the university in 9ote @.F 0xample
60 in +ection @.1 0arge +ample stimation of a opulation ean0$(2.63,2.79) is shorter than the
F5B confidence interval(2.45,2.97) in 9ote @.1; 0xample ;0. This is partly because in 9ote @.F
Saylor URL: http://www.saylor.org/books Saylor.org33
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 344/723
0xample 60 the sample si'e is larger there is more information pertaining to the true value of in
the large data set than in the small one.
*+, TA*+AA,S
• ( select)(g the correct for-,la for co(str,ct)o( of a co(+*e(ce )(ter4al for a
pop,lat)o( -ea( ask two <,est)o(s: )s the pop,lat)o( sta(*ar* *e4)at)o( σ k(ow(
or ,(k(ow( a(* )s the sa-ple large or s-allC
• Be ca( co(str,ct co(+*e(ce )(ter4als w)th s-all sa-ples o(ly )f the pop,lat)o( )s
(or-al.
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 345/723
Saylor URL: http://www.saylor.org/books Saylor.org30
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 346/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 347/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 348/723
Saylor URL: http://www.saylor.org/books Saylor.org36
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 349/723
Saylor URL: http://www.saylor.org/books Saylor.org39
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 350/723
Saylor URL: http://www.saylor.org/books Saylor.org30;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 351/723
Saylor URL: http://www.saylor.org/books Saylor.org301
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 352/723
Saylor URL: http://www.saylor.org/books Saylor.org302
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 353/723
4.3 !ar$e Sample +stimation o a 2opulation2roportion
!+A/N&N: 1';+CT&<+
1 &o ,(*ersta(* how to apply the for-,la for a co(+*e(ce )(ter4al for a pop,lat)o(
proport)o(.
+ince from +ection ;.3 0The +ample roportion0 in 7hapter ; 0+ampling 2istributions0 we know the
mean$ standard deviation$ and sampling distribution of the sample proportion p$ the ideas of the
previous two sections can be applied to produce a confidence interval for a population proportion.
Eere is the formula.
Saylor URL: http://www.saylor.org/books Saylor.org303
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 354/723
*+, TA*+AA,S
• Be ha4e a s)(gle for-,la for a co(+*e(ce )(ter4al for a pop,lat)o( proport)o(
wh)ch )s 4al)* whe( the sa-ple )s large.
• &he co(*)t)o( that a sa-ple be large )s (ot that )ts s)Ve n be at least 3; b,t that
the *e(s)ty f,(ct)o( +t )(s)*e the )(ter4al [0,1].
Saylor URL: http://www.saylor.org/books Saylor.org30
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 355/723
Saylor URL: http://www.saylor.org/books Saylor.org300
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 356/723
a !)4e a po)(t est)-ate of the proport)o( p of all people who co,l* rea* wor*s
*)sg,)se* )( th)s way.
b Show that the sa-ple )s (ot s,?c)e(tly large to co(str,ct a co(+*e(ce
)(ter4al for the proport)o( of all people who co,l* rea* wor*s *)sg,)se* )( th)sway.
6 ( a ra(*o- sa-ple of 9;; a*,lts 2 *e+(e* the-sel4es as 4egetar)a(s.
a !)4e a po)(t est)-ate of the proport)o( of all a*,lts who wo,l* *e+(e
the-sel4es as 4egetar)a(s.
Saylor URL: http://www.saylor.org/books Saylor.org30
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 357/723
b 'er)fy that the sa-ple )s s,?c)e(tly large to ,se )t to co(str,ct a co(+*e(ce
)(ter4al for that proport)o(.
c %o(str,ct a( 6;G co(+*e(ce )(ter4al for the proport)o( of all a*,lts who
wo,l* *e+(e the-sel4es as 4egetar)a(s.
9 ( a ra(*o- sa-ple of 20; e-ploye* people 1 sa)* that they br)(g work ho-e w)ththe- at least occas)o(ally.
a !)4e a po)(t est)-ate of the proport)o( of all e-ploye* people who br)(g work
ho-e w)th the- at least occas)o(ally.
b %o(str,ct a 99G co(+*e(ce )(ter4al for that proport)o(.
1; ( a ra(*o- sa-ple of 120; ho,sehol* -o4es 622 were -o4es to a locat)o( w)th)( the
sa-e co,(ty as the or)g)(al res)*e(ce.
a !)4e a po)(t est)-ate of the proport)o( of all ho,sehol* -o4es that are to a
locat)o( w)th)( the sa-e co,(ty as the or)g)(al res)*e(ce.
b %o(str,ct a 96G co(+*e(ce )(ter4al for that proport)o(.
11 ( a ra(*o- sa-ple of 12 h)p replace-e(t or re4)s)o( s,rgery proce*,res
(at)o(w)*e 12 pat)e(ts *e4elope* a s,rg)cal s)te )(fect)o(.
a !)4e a po)(t est)-ate of the proport)o( of all pat)e(ts ,(*ergo)(g a h)p
s,rgery proce*,re who *e4elop a s,rg)cal s)te )(fect)o(.
b 'er)fy that the sa-ple )s s,?c)e(tly large to ,se )t to co(str,ct a co(+*e(ce
)(ter4al for that proport)o(.
c %o(str,ct a 90G co(+*e(ce )(ter4al for the proport)o( of all pat)e(ts
,(*ergo)(g a h)p s,rgery proce*,re who *e4elop a s,rg)cal s)te )(fect)o(.
12 ( a certa)( reg)o( prepackage* pro*,cts labele* 0;; g -,st co(ta)( o( a4erage at least
0;; gra-s of the pro*,ct a(* at least 9;G of all packages -,st we)gh at least 9;
gra-s. ( a ra(*o- sa-ple of 3;; packages 266 we)ghe* at least 9; gra-s.
a !)4e a po)(t est)-ate of the proport)o( of all packages that we)gh at least 9;
gra-s.
b 'er)fy that the sa-ple )s s,?c)e(tly large to ,se )t to co(str,ct a co(+*e(ce
)(ter4al for that proport)o(.
c %o(str,ct a 99.6G co(+*e(ce )(ter4al for the proport)o( of all packages that
we)gh at least 9; gra-s.
Saylor URL: http://www.saylor.org/books Saylor.org30
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 358/723
10 ( or*er to est)-ate the proport)o( of e(ter)(g st,*e(ts who gra*,ate w)th)( s)7 years
the a*-)()strat)o( at a state ,()4ers)ty e7a-)(e* the recor*s of ;; ra(*o-ly selecte*
st,*e(ts who e(tere* the ,()4ers)ty s)7 years ago a(* fo,(* that 312 ha* gra*,ate*.
a !)4e a po)(t est)-ate of the s)7=year gra*,at)o( rate the proport)o( of e(ter)(g
st,*e(ts who gra*,ate w)th)( s)7 years.
b Ass,-)(g that the sa-ple )s s,?c)e(tly large co(str,ct a 96G co(+*e(ce
)(ter4al for the s)7=year gra*,at)o( rate.
1 ( a ra(*o- sa-ple of 23;; -ortgages take( o,t )( a certa)( reg)o( last year 16
were a*>,stable=rate -ortgages.
Saylor URL: http://www.saylor.org/books Saylor.org306
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 359/723
a !)4e a po)(t est)-ate of the proport)o( of all -ortgages take( o,t )( th)s reg)o(
last year that were a*>,stable=rate -ortgages.
b Ass,-)(g that the sa-ple )s s,?c)e(tly large co(str,ct a 99.9G co(+*e(ce
)(ter4al for the proport)o( of all -ortgages take( o,t )( th)s reg)o( last year that
were a*>,stable=rate -ortgages.1 ( a research st,*y )( cattle bree*)(g 109 of 23 cows )( se4eral her*s that were )(
estr,s were *etecte* by -ea(s of a( )(te(s)4e o(ce a *ay o(e=ho,r obser4at)o( of the
her*s )( early -or()(g.
a !)4e a po)(t est)-ate of the proport)o( of all cattle )( estr,s who are *etecte* by
th)s -etho*.
b Ass,-)(g that the sa-ple )s s,?c)e(tly large co(str,ct a 9;G co(+*e(ce
)(ter4al for the proport)o( of all cattle )( estr,s who are *etecte* by th)s -etho*.
16 A s,r4ey of 2120; ho,sehol*s co(cer()(g telepho(e ser4)ce ga4e the res,lts show( )(
the table.
2andline No 2andline
e"" $hone 12474 5844
No !e"" $hone 2529 403
a !)4e a po)(t est)-ate for the proport)o( of all ho,sehol*s )( wh)ch there )s a cell
pho(e b,t (o la(*l)(e.
b Ass,-)(g the sa-ple )s s,?c)e(tly large co(str,ct a 99.9G co(+*e(ce )(ter4al
for the proport)o( of all ho,sehol*s )( wh)ch there )s a cell pho(e b,t (o la(*l)(e.
c !)4e a po)(t est)-ate for the proport)o( of all ho,sehol*s )( wh)ch there )s (o
telepho(e ser4)ce of e)ther k)(*.
* Ass,-)(g the sa-ple )s s,?c)e(tly large co(str,ct a 99.9G co(+*e(ce )(ter4al
for the proport)o( of all all ho,sehol*s )( wh)ch there )s (o telepho(e ser4)ce of
e)ther k)(*.
A((&T&1NA! ++/C&S+S
19 ( a ra(*o- sa-ple of 9;; a*,lts 2 *e+(e* the-sel4es as 4egetar)a(s. "f these 2
29 were wo-e(.
a !)4e a po)(t est)-ate of the proport)o( of all self=*escr)be* 4egetar)a(s who
are wo-e(.
b 'er)fy that the sa-ple )s s,?c)e(tly large to ,se )t to co(str,ct a co(+*e(ce
)(ter4al for that proport)o(.
Saylor URL: http://www.saylor.org/books Saylor.org309
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 360/723
c %o(str,ct a 9;G co(+*e(ce )(ter4al for the proport)o( of all all self=*escr)be*
4egetar)a(s who are wo-e(.
2; A ra(*o- sa-ple of 160 college soccer players who ha* s,8ere* )(>,r)es that res,lte*
)( loss of play)(g t)-e was -a*e w)th the res,lts show( )( the table. (>,r)es are
class)+e* accor*)(g to se4er)ty of the )(>,ry a(* the co(*)t)o( ,(*er wh)ch )t wass,sta)(e*.
#inor #oderate Serious
*ra!'i!e 48 20 6
(a%e 62 32 17
a !)4e a po)(t est)-ate for the proport)o( p of all )(>,r)es to college soccer
players that are s,sta)(e* )( pract)ce.
b %o(str,ct a 90G co(+*e(ce )(ter4al for the proport)o( p of all )(>,r)es to
college soccer players that are s,sta)(e* )( pract)ce.
c !)4e a po)(t est)-ate for the proport)o( p of all )(>,r)es to college soccer
players that are e)ther -o*erate or ser)o,s.
21 &he bo*y -ass )(*e7 #5 was -eas,re* )( 12;; ra(*o-ly selecte* a*,lts w)th
the res,lts show( )( the table.
6#/
Under 78+ 78+*!+ Over !+
en 36 165 315
o%en 75 274 335
a !)4e a po)(t est)-ate for the proport)o( of all -e( whose # )s o4er 20.
b Ass,-)(g the sa-ple )s s,?c)e(tly large co(str,ct a 99G co(+*e(ce )(ter4al for the
proport)o( of all -e( whose # )s o4er 20.
c !)4e a po)(t est)-ate for the proport)o( of all a*,lts regar*less of ge(*er whose # )s
o4er 20.
* Ass,-)(g the sa-ple )s s,?c)e(tly large co(str,ct a 99G co(+*e(ce )(ter4al for the
proport)o( of all a*,lts regar*less of ge(*er whose # )s o4er 20.
Saylor URL: http://www.saylor.org/books Saylor.org3;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 361/723
Saylor URL: http://www.saylor.org/books Saylor.org31
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 362/723
Saylor URL: http://www.saylor.org/books Saylor.org32
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 363/723
Saylor URL: http://www.saylor.org/books Saylor.org33
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 364/723
4.7 Sample Sie Considerations
LEARNN! "#$E%&'E
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 365/723
1 &o lear( how to apply for-,las for est)-at)(g the s)Ve sa-ple that w)ll be (ee*e*
)( or*er to co(str,ct a co(+*e(ce )(ter4al for a pop,lat)o( -ea( or proport)o(
that -eets g)4e( cr)ter)a.
+ampling is typically done with a set of clear ob#ectives in mind. /or example$ an economist might
wish to estimate the mean yearly income of workers in a particular industry at F5B confidence and
to within >855. +ince sampling costs time$ effort$ and money$ it would be useful to be able to
estimate the smallest si'e sample that is likely to meet these criteria.
Saylor URL: http://www.saylor.org/books Saylor.org30
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 366/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 367/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 368/723
Saylor URL: http://www.saylor.org/books Saylor.org36
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 369/723
There is a dilemma here: the formula for estimating how large a sample to take contains the
number p$ which we know only after we have taken the sample. There are two ways out of this
dilemma. Typically the researcher will have some idea as to the value of the population proportion p$
hence of what the sample proportion p is likely to be. /or example$ if last month 3@B of all voters
thought that state taxes are too high$ then it is likely that the proportion with that opinion this month
will not be dramatically different$ and we would use the value 5.3@ for p in the formula.
The second approach to resolving the dilemma is simply to replace p in the formula by 5.8. This is
because if p is large then1− p is small$ and vice versa$ which limits their product to a maximum
value of 5.!8$ which occurs when p=0.5.This is called the most conservative estimate$ since it
gives the largest possible estimate of n.
Saylor URL: http://www.saylor.org/books Saylor.org39
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 370/723
Saylor URL: http://www.saylor.org/books Saylor.org3;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 371/723
*+, TA*+AA,S• f the pop,lat)o( sta(*ar* *e4)at)o( σ )s k(ow( or ca( be est)-ate* the( the
-)()-,- sa-ple s)Ve (ee*e* to obta)( a co(+*e(ce )(ter4al for the pop,lat)o(
-ea( w)th a g)4e( -a7)-,- error of the est)-ate a(* a g)4e( le4el of co(+*e(ce
ca( be est)-ate*.
• &he -)()-,- sa-ple s)Ve (ee*e* to obta)( a co(+*e(ce )(ter4al for a pop,lat)o(
proport)o( w)th a g)4e( -a7)-,- error of the est)-ate a(* a g)4e( le4el of
co(+*e(ce ca( always be est)-ate*. f there )s pr)or k(owle*ge of the pop,lat)o(
proport)o( p the( the est)-ate ca( be sharpe(e*.
++/C&S+S
'AS&C
1 Est)-ate the -)()-,- sa-ple s)Ve (ee*e* to for- a co(+*e(ce )(ter4al for the -ea(
of a pop,lat)o( ha4)(g the sta(*ar* *e4)at)o( show( -eet)(g the cr)ter)a g)4e(.
a σ 3; 90G co(+*e(ce ( 1;
b σ 3; 99G co(+*e(ce ( 1;
Saylor URL: http://www.saylor.org/books Saylor.org31
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 372/723
c σ 3; 90G co(+*e(ce ( 0
2 Est)-ate the -)()-,- sa-ple s)Ve (ee*e* to for- a co(+*e(ce )(ter4al for the -ea(
of a pop,lat)o( ha4)(g the sta(*ar* *e4)at)o( show( -eet)(g the cr)ter)a g)4e(.
a σ 90G co(+*e(ce ( 1
b σ 99G co(+*e(ce ( 1c σ 90G co(+*e(ce ( ;.0
3 Est)-ate the -)()-,- sa-ple s)Ve (ee*e* to for- a co(+*e(ce )(ter4al for the
proport)o( of a pop,lat)o( that has a part)c,lar character)st)c -eet)(g the cr)ter)a
g)4e(.
a p F ;.3 6;G co(+*e(ce ( ;.;0
b p F ;.3 9;G co(+*e(ce ( ;.;0
c p F ;.3 6;G co(+*e(ce ( ;.;1
Est)-ate the -)()-,- sa-ple s)Ve (ee*e* to for- a co(+*e(ce )(ter4al for the
proport)o( of a pop,lat)o( that has a part)c,lar character)st)c -eet)(g the cr)ter)a
g)4e(.
a p F ;.61 90G co(+*e(ce ( ;.;2
b p F ;.61 99G co(+*e(ce ( ;.;2
c p F ;.61 90G co(+*e(ce ( ;.;1
0 Est)-ate the -)()-,- sa-ple s)Ve (ee*e* to for- a co(+*e(ce )(ter4al for the
proport)o( of a pop,lat)o( that has a part)c,lar character)st)c -eet)(g the cr)ter)a
g)4e(.
a 6;G co(+*e(ce ( ;.;0
b 9;G co(+*e(ce ( ;.;0
c 6;G co(+*e(ce ( ;.;1
Est)-ate the -)()-,- sa-ple s)Ve (ee*e* to for- a co(+*e(ce )(ter4al for the
proport)o( of a pop,lat)o( that has a part)c,lar character)st)c -eet)(g the cr)ter)a
g)4e(.
a 90G co(+*e(ce ( ;.;2
b 99G co(+*e(ce ( ;.;2
c 90G co(+*e(ce ( ;.;1
A22!&CAT&1NS
Saylor URL: http://www.saylor.org/books Saylor.org32
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 373/723
A software e(g)(eer w)shes to est)-ate to w)th)( 0 seco(*s the -ea( t)-e that a (ew
appl)cat)o( takes to start ,p w)th 90G co(+*e(ce. Est)-ate the -)()-,- s)Ve sa-ple
re<,)re* )f the sta(*ar* *e4)at)o( of start ,p t)-es for s)-)lar software )s 12 seco(*s.
6 A real estate age(t w)shes to est)-ate to w)th)( 2.0; the -ea( reta)l cost per s<,are
foot of (ewly b,)lt ho-es w)th 6;G co(+*e(ce. @e est)-ates the sta(*ar* *e4)at)o( of
s,ch costs at 0.;;. Est)-ate the -)()-,- s)Ve sa-ple re<,)re*.
9 A( eco(o-)st w)shes to est)-ate to w)th)( 2 -)(,tes the -ea( t)-e that e-ploye*
perso(s spe(* co--,t)(g each *ay w)th 90G co(+*e(ce. "( the ass,-pt)o( that the
sta(*ar* *e4)at)o( of co--,t)(g t)-es )s 6 -)(,tes est)-ate the -)()-,- s)Ve sa-ple
re<,)re*.
1; A -otor cl,b w)shes to est)-ate to w)th)( 1 ce(t the -ea( pr)ce of 1 gallo( of reg,lar
gasol)(e )( a certa)( reg)o( w)th 96G co(+*e(ce. @)stor)cally the 4ar)ab)l)ty of pr)ces )s
-eas,re* by σ=$0.03.Est)-ate the -)()-,- s)Ve sa-ple re<,)re*.
11 A ba(k w)shes to est)-ate to w)th)( 20 the -ea( a4erage -o(thly bala(ce )( )ts
check)(g acco,(ts w)th 99.6G co(+*e(ce. Ass,-)(g σ=$250 est)-ate the -)()-,- s)Ve
sa-ple re<,)re*.
12 A reta)ler w)shes to est)-ate to w)th)( 10 seco(*s the -ea( *,rat)o( of telepho(e
or*ers take( at )ts call ce(ter w)th 99.0G co(+*e(ce. ( the past the sta(*ar* *e4)at)o(
of call le(gth has bee( abo,t 1.20 -)(,tes. Est)-ate the -)()-,- s)Ve sa-ple
re<,)re*. #e caref,l to e7press all the )(for-at)o( )( the sa-e ,()ts.5
13 &he a*-)()strat)o( at a college w)shes to est)-ate to w)th)( two perce(tage po)(ts the
proport)o( of all )ts e(ter)(g fresh-e( who gra*,ate w)th)( fo,r years w)th 9;G
co(+*e(ce. Est)-ate the -)()-,- s)Ve sa-ple re<,)re*.1 A cha)( of a,to-ot)4e repa)r stores w)shes to est)-ate to w)th)( +4e perce(tage po)(ts
the proport)o( of all passe(ger 4eh)cles )( operat)o( that are at least +4e years ol* w)th
96G co(+*e(ce. Est)-ate the -)()-,- s)Ve sa-ple re<,)re*.
10 A( )(ter(et ser4)ce pro4)*er w)shes to est)-ate to w)th)( o(e perce(tage po)(t the
c,rre(t proport)o( of all e-a)l that )s spa- w)th 99.9G co(+*e(ce. Last year the
proport)o( that was spa- was 1G. Est)-ate the -)()-,- s)Ve sa-ple re<,)re*.
Saylor URL: http://www.saylor.org/books Saylor.org33
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 374/723
1 A( agro(o-)st w)shes to est)-ate to w)th)( o(e perce(tage po)(t the proport)o( of a
(ew 4ar)ety of see* that w)ll ger-)(ate whe( pla(te* w)th 90G co(+*e(ce. A typ)cal
ger-)(at)o( rate )s 9G. Est)-ate the -)()-,- s)Ve sa-ple re<,)re*.
1 A char)table orga()Vat)o( w)shes to est)-ate to w)th)( half a perce(tage po)(t the
proport)o( of all telepho(e sol)c)tat)o(s to )ts *o(ors that res,lt )( a g)ft w)th 9;G
co(+*e(ce. Est)-ate the -)()-,- sa-ple s)Ve re<,)re* ,s)(g the )(for-at)o( that )(
the past the respo(se rate has bee( abo,t 3;G.
16 A go4er(-e(t age(cy w)shes to est)-ate the proport)o( of *r)4ers age* 12 who
ha4e bee( )(4ol4e* )( a tra?c acc)*e(t )( the last year. t w)shes to -ake the est)-ate
to w)th)( o(e perce(tage po)(t a(* at 9;G co(+*e(ce. )(* the -)()-,- sa-ple s)Ve
re<,)re* ,s)(g the )(for-at)o( that se4eral years ago the proport)o( was ;.12.A((&T&1NA! ++/C&S+S
19 A( eco(o-)st w)shes to est)-ate to w)th)( s)7 -o(ths the -ea( t)-e betwee( sales of
e7)st)(g ho-es w)th 90G co(+*e(ce. Est)-ate the -)()-,- s)Ve sa-ple re<,)re*. (
h)s e7per)e(ce 4)rt,ally all ho,ses are re=sol* w)th)( ; -o(ths so ,s)(g the E-p)r)cal
R,le he w)ll est)-ate σ by o(e=s)7th the ra(ge or 40/6=6.7.
2; A w)l*l)fe -a(ager w)shes to est)-ate the -ea( le(gth of +sh )( a large lake to w)th)(
o(e )(ch w)th 6;G co(+*e(ce. Est)-ate the -)()-,- s)Ve sa-ple re<,)re*. ( h)s
e7per)e(ce 4)rt,ally (o +sh ca,ght )( the lake )s o4er 23 )(ches lo(g so ,s)(g the
E-p)r)cal R,le he w)ll est)-ate σ by o(e=s)7th the ra(ge or 23/6=3.8.
21 Jo, w)sh to est)-ate the c,rre(t -ea( b)rth we)ght of all (ewbor(s )( a certa)( reg)o(
to w)th)( 1 o,(ce 1/1 po,(*5 a(* w)th 90G co(+*e(ce. A sa-ple w)ll cost ;; pl,s
1.0; for e4ery (ewbor( we)ghe*. Jo, bel)e4e the sta(*ar* *e4)at)o(s of we)ght to be
(o -ore tha( 1.20 po,(*s. Jo, ha4e 20;; to spe(* o( the st,*y.
a %a( yo, a8or* the sa-ple re<,)re*C
b f (ot what are yo,r opt)o(sC
22 Jo, w)sh to est)-ate a pop,lat)o( proport)o( to w)th)( three perce(tage po)(ts at 90G
co(+*e(ce. A sa-ple w)ll cost 0;; pl,s 0; ce(ts for e4ery sa-ple ele-e(t -eas,re*. Jo, ha4e 1;;; to spe(* o( the st,*y.
a %a( yo, a8or* the sa-ple re<,)re*C
b f (ot what are yo,r opt)o(sC
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 375/723
Saylor URL: http://www.saylor.org/books Saylor.org30
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 376/723
Chapter 5
Testin$ =ypotheses
% manufacturer of emergency e"uipment asserts that a respirator that it makes delivers pure air for
@8 minutes on average. % government regulatory agency is charged with testing such claims$ in this
case to verify that the average time is not less than @8 minutes. To do so it would select a random
sample of respirators$ compute the mean time that they deliver pure air$ and compare that mean to
the asserted time @8 minutes.
,n the sampling that we have studied so far the goal has been to estimate a population parameter.
4ut the sampling done by the government agency has a somewhat different ob#ective$ not so much
to estimate the population mean as totest an assertionWor a hypothesisWabout it$ namely$ whether
it is as large as @8 or not. The agency is not necessarily interested in the actual value of $ #ust
whether it is as claimed. Their sampling is done to perform a test of hypotheses$ the sub#ect of this
chapter.
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 377/723
5.% The +lements o =ypothesis Testin$
LEARNN! "#$E%&'ES
1 &o ,(*ersta(* the log)cal fra-ework of tests of hypotheses.
2 &o lear( bas)c ter-)(ology co((ecte* w)th hypothes)s test)(g.
3 &o lear( f,(*a-e(tal facts abo,t hypothes)s test)(g.
Types o =ypotheses
% hypothesis about the value of a population parameter is an assertion about its value. %s in the
introductory example we will be concerned with testing the truth of two competing hypotheses$ only one
of which can be true.
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 378/723
(e)nitionThe null hypothesis, denoted + 5, is the statement about the population parameter that is assumed to
be true unless there is convincing evidence to the contrary.
The alternative hypothesis, denoted + a, is a statement about the population parameter that is
contradictory to the null hypothesis, and is accepted as true only if there is convincing evidence in favor
of it.
(e)nition
3ypothesis testing is a statistical procedure in which a choice is made between a null hypothesis and
an alternative hypothesis based on information in a sample.
The end result of a hypotheses testing procedure is a choice of one of the following two possible
conclusions:
1 Ce#ect + 5 *and therefore accept + a)$ or
! /ail to re#ect + 5 *and therefore fail to accept + a).
The null hypothesis typically represents the status "uo$ or what has historically been true. ,n the
example of the respirators$ we would believe the claim of the manufacturer unless there is reason not
to do so$ so the null hypotheses is H 0:µ=75.The alternative hypothesis in the example is the
contradictory statementH a:µ<75.The null hypothesis will always be an assertion containing an
e"uals sign$ but depending on the situation the alternative hypothesis can have any one of three
forms: with the symbol GV$H as in the example #ust discussed$ with the symbol GX$H or with the symbol
GYH The following two examples illustrate the latter two cases.
+A>2!+ %
A p,bl)sher of college te7tbooks cla)-s that the a4erage pr)ce of all har*bo,(*
college te7tbooks )s 12.0;. A st,*e(t gro,p bel)e4es that the act,al -ea( )s
h)gher a(* w)shes to test the)r bel)ef. State the rele4a(t (,ll a(* alter(at)4e
hypotheses.
Sol,t)o(:
&he *efa,lt opt)o( )s to accept the p,bl)sherWs cla)- ,(less there )s co-pell)(g
e4)*e(ce to the co(trary. &h,s the (,ll hypothes)s )s H 0:µ=127.50.S)(ce the st,*e(t
Saylor URL: http://www.saylor.org/books Saylor.org36
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 379/723
gro,p th)(ks that the a4erage te7tbook pr)ce )s greater tha( the p,bl)sherWs +g,re
the alter(at)4e hypothes)s )( th)s s)t,at)o( )s H a:µ>127.50.
+A>2!+ 0
&he rec)pe for a bakery )te- )s *es)g(e* to res,lt )( a pro*,ct that co(ta)(s 6
gra-s of fat per ser4)(g. &he <,al)ty co(trol *epart-e(t sa-ples the pro*,ct
per)o*)cally to )(s,re that the pro*,ct)o( process )s work)(g as *es)g(e*. State
the rele4a(t (,ll a(* alter(at)4e hypotheses.
Sol,t)o(:
&he *efa,lt opt)o( )s to ass,-e that the pro*,ct co(ta)(s the a-o,(t of fat )t was
for-,late* to co(ta)( ,(less there )s co-pell)(g e4)*e(ce to the co(trary. &h,s
the (,ll hypothes)s )s H 0:µ=8.0.S)(ce to co(ta)( e)ther -ore fat tha( *es)re* or to
co(ta)( less fat tha( *es)re* are both a( )(*)cat)o( of a fa,lty pro*,ct)o( process
the alter(at)4e hypothes)s )( th)s s)t,at)o( )s that the -ea( )s dierent fro- 6.;
so H a:µ≠8.0.
,n 9ote ?.? 0xample 10$ the textbook example$ it might seem more natural that the publisher&s
claim be that the average price is at most >[email protected]$ not exactly >[email protected]. ,f the claim were made this
way$ then the null hypothesis would beH 0:µ≤127.50$ and the value >[email protected] given in the example
would be the one that is least favorable to the publisher&s claim$ the null hypothesis. ,t is always true
that if the null hypothesis is retained for its least favorable value$ then it is retained for every other
value.
Thus in order to make the null and alternative hypotheses easy for the student to distinguish$ in
every example and problem in this text we will always present one of the two competing claims about
the value of a parameter with an e"uality. The claim expressed with an equality is the null
hypothesis. This is the same as always stating the null hypothesis in the least favorable light. +o in the
introductory example about the respirators$ we stated the manufacturer&s claim as Gthe average is @8minutesH instead of the perhaps more natural Gthe average is at least @8 minutes$H essentially
reducing the presentation of the null hypothesis to its worst case.
The first step in hypothesis testing is to identify the null and alternative hypotheses.
Saylor URL: http://www.saylor.org/books Saylor.org39
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 380/723
The !o$ic o =ypothesis Testin$
%lthough we will study hypothesis testing in situations other than for a single population mean *for
example$ for a population proportion instead of a mean or in comparing the means of two different
populations)$ in this section the discussion will always be given in terms of a single population
mean .
The null hypothesis always has the form H 0:µ=µ0for a specific number µ0 *in the respirator
exampleµ0=75$ in the textbook exampleµ0=127.50$ and in the baked goods example µ0=8.0). +ince
the null hypothesis is accepted unless there is strong evidence to the contrary$ the test procedure is
based on the initial assumption that + 5 is true. This point is so important that we will repeat it in a
display:
The test procedure is based on the initial assumption that + 5 is true.
!igure 3." The (ensity <urve for X−−if + 5 >s True
Saylor URL: http://www.saylor.org/books Saylor.org36;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 381/723
Think of the respirator example$ for which the null hypothesis isH0:µ=75$ the claim that the average
time air is delivered for all respirators is @8 minutes. ,f the sample mean is @8 or greater then we
certainly would not re#ect + 5 *since there is no issue with an emergency respirator delivering air even
longer than claimed).
,f the sample mean is slightly less than @8 then we would logically attribute the difference to
sampling error and also not re#ect + 5 either.
Dalues of the sample mean that are smaller and smaller are less and less likely to come from a
population for which the population mean is @8. Thus if the sample mean is far less than @8$ say
around ;5 minutes or less$ then we would certainly re#ect + 5$ because we know that it is highly
unlikely that the average of a sample would be so low if the population mean were @8. This is the rare
event criterionfor re#ection: what we actually observed X^−−<605 would be so rare an event if M @8
were true that we regard it as much more likely that the alternative hypothesis V @8 holds.
,n summary$ to decide between + 5 and + a in this example we would select a Gre6ection regionH of
values sufficiently far to the left of @8$ based on the rare event criterion$ and re#ect + 5 if the sample
mean X−−lies in the re#ection region$ but not re#ect + 5 if it does not.
Saylor URL: http://www.saylor.org/books Saylor.org361
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 382/723
The /eGection /e$ion
ach different form of the alternative hypothesis + a has its own kind of re#ection region:
1 if *as in the respirator example) + a has the formHa:µ<µ0$ we re#ect + 5 ifx−is far to the left ofµ0$ that is$
to the left of some number < $ so the re#ection region has the form of an interval *Z[$< \
! if *as in the textbook example) + a has the formHa:µ>µ0$ we re#ect + 5 ifx−is far to the right ofµ0$ that
is$ to the right of some number < $ so the re#ection region has the form of an interval ]< $[)
3 if *as in the baked good example) + a has the formHa:µ≠µ0$ we re#ect + 5 ifx−is far away fromµ0in
either direction$ that is$ either to the left of some number < or to the right of some other number < $so the re#ection region has the form of the union of two intervals *Z[$< \Z]< $[).
The key issue in our line of reasoning is the "uestion of how to determine the number < or
numbers < and < $ called the critical value or critical values of the statistic$ that determine the
re#ection region.
The key issue in our line of reasoning is the "uestion of how to determine the number < or
numbers < and < $ called the critical value or critical values of the statistic$ that determine the
re#ection region.
e+()t)o(
The critical value or critical values of a test of hypotheses are the number or numbers that determine
the rejection region.
+uppose the re#ection region is a single interval$ so we need to select a single number < . Eere is the
procedure for doing so. (e select a small probability$ denotedα$ say 1B$ which we take as our
definition of Grare event:H an event is GrareH if its probability of occurrence is less than α.*,n all the
examples and problems in this text the value ofα will be given already.) The probability that X^−
Saylor URL: http://www.saylor.org/books Saylor.org362
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 383/723
− takes a value in an interval is the area under its density curve and above that interval$ so as shown
in /igure ?.! *drawn under the assumption that + 5 is true$ so that the curve centers at µ0) the critical
value < is the value of X^−− that cuts off a tail areaαin the probability density curve of X^−−. (hen
the re#ection region is in two pieces$ that is$ composed of two intervals$ the total area above both of
them must beα$ so the area above each one is α/2$ as also shown in /igure ?.!.
!igure 3.&
Saylor URL: http://www.saylor.org/books Saylor.org363
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 384/723
Figure ".*6eEection 6egion for t!e ,!oice α=0.10
Saylor URL: http://www.saylor.org/books Saylor.org36
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 385/723
&he *ec)s)o( proce*,re )s: take a sa-ple of s)Ve 0 a(* co-p,te the sa-ple
-ea( x−.f x− )s e)ther .69 gra-s or less or 6.11 gra-s or -ore the( re>ect the
hypothes)s that the a4erage a-o,(t of fat )( all ser4)(gs of the pro*,ct )s 6.;
gra-s )( fa4or of the alter(at)4e that )t )s *)8ere(t fro- 6.; gra-s. "therw)se *o
(ot re>ect the hypothes)s that the a4erage a-o,(t )s 6.; gra-s.
&he reaso()(g )s that )f the tr,e a4erage a-o,(t of fat per ser4)(g were 6.;
gra-s the( there wo,l* be less tha( a 1;G cha(ce that a sa-ple of s)Ve 0 wo,l*
pro*,ce a -ea( of e)ther .69 gra-s or less or 6.11 gra-s or -ore. @e(ce )f that
happe(e* )t wo,l* be -ore l)kely that the 4al,e 6.; )s )(correct always ass,-)(g
that the pop,lat)o( sta(*ar* *e4)at)o( )s ;.10 gra-5.
4ecause the re#ection regions are computed based on areas in tails of distributions$ as shown
in /igure ?.!$ hypothesis tests are classified according to the form of the alternative hypothesis in the
following way.
e+()t)o(
Saylor URL: http://www.saylor.org/books Saylor.org360
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 386/723
>f + a has the form µ≠µ0the test is called a t*o(tailed test.
>f + a has the form µ<µ0the test is called a left(tailed test.
>f + a has the form µ>µ0the test is called a right(tailed test.
;ach of the last two forms is also called a one(tailed test.
Two Types o +rrors
The format of the testing procedure in general terms is to take a sample and use the information it
contains to come to a decision about the two hypotheses. %s stated before our decision will always be
either
1 re#ect the null hypothesis + 5 in favor of the alternative + a presented$ or
! do not re#ect the null hypothesis + 5 in favor of the alternative + a presented.
There are four possible outcomes of hypothesis testing procedure$ as shown in the following table:
,r&e 'a'e o Na'&re
H 0 i+ 'r&e H 0 i+ a"+e
O&r #e!i+ion
#o no' ree!' H 0 orre!' de!i+ion ,$e error
ee!' H 0 ,$e error orre!' de!i+ion
%s the table shows$ there are two ways to be right and two ways to be wrong. Typically to
re#ect + 5 when it is actually true is a more serious error than to fail to re#ect it when it is false$ so the
former error is labeled GType ,H and the latter error GType ,,.H
(e)nition
>n a test of hypotheses, a Type , error is the decision to reject + 5 when it is in fact true. A Type ,, error is
the decision not to reject + 5 when it is in fact not true.
Saylor URL: http://www.saylor.org/books Saylor.org36
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 387/723
<nless we perform a census we do not have certain knowledge$ so we do not know whether our
decision matches the true state of nature or if we have made an error. (e re#ect + 5 if what we observe
would be a GrareH event if + 5 were true. 4ut rare events are not impossible: they occur with
probabilityα.Thus when + 5 is true$ a rare event will be observed in the proportionαof repeated
similar tests$ and + 5 will be erroneously re#ected in those tests. Thus αis the probability that in
following the testing procedure to decide between + 5 and + a we will make a Type , error.
e+()t)o(
The number αthat is used to determine the rejection region is called the level of significance of the test. >t
is the probability that the test procedure will result in a Type > error.
The probability of making a Type ,, error is too complicated to discuss in a beginning text$ so we will say
no more about it than this: for a fixed sample si'e$ choosingαsmaller in order to reduce the chance of
making a Type , error has the effect of increasing the chance of making a Type ,, error. The only way to
simultaneously reduce the chances of making either kind of error is to increase the sample si'e.
Standardiin$ the Test Statistic
Eypotheses testing will be considered in a number of contexts$ and great unification as well as
simplification results when the relevant sample statistic is standardized by subtracting its mean from it
and then dividing by its standard deviation. The resulting statistic is called a standardized test statistic. ,nevery situation treated in this and the following two chapters the standardi'ed test statistic will have
either the standard normal distribution or +tudent&s t -distribution.
e+()t)o(
Saylor URL: http://www.saylor.org/books Saylor.org36
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 388/723
A standardized test statistic for a hypothesis test is the statistic that is formed by subtracting from
the statistic of interest its mean and dividing by its standard deviation.
Saylor URL: http://www.saylor.org/books Saylor.org366
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 389/723
Saylor URL: http://www.saylor.org/books Saylor.org369
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 390/723
very instance of hypothesis testing discussed in this and the following two chapters will have a
re#ection region like one of the six forms tabulated in the tables above.
9o matter what the context a test of hypotheses can always be performed by applying the following
systematic procedure$ which will be illustrated in the examples in the succeeding sections.
Syste-at)c @ypothes)s &est)(g Proce*,re: %r)t)cal 'al,e
Approach
1 ,dentify the null and alternative hypotheses.
! ,dentify the relevant test statistic and its distribution.
3 7ompute from the data the value of the test statistic.
6 7onstruct the re#ection region.
8 7ompare the value computed in +tep 3 to the re#ection region constructed in +tep 6 and make a
decision. /ormulate the decision in the context of the problem$ if applicable.
The procedure that we have outlined in this section is called the G7ritical Dalue %pproachH tohypothesis testing to distinguish it from an alternative but e"uivalent approach that will be
introduced at the end of +ection ?.3 0The Observed +ignificance of a Test0.
*+, TA*+AA,S
• A test of hypotheses )s a stat)st)cal process for *ec)*)(g betwee( two co-pet)(g
assert)o(s abo,t a pop,lat)o( para-eter.
• &he test)(g proce*,re )s for-al)Ve* )( a +4e=step proce*,re.
++/C&S+S
1 State the (,ll a(* alter(at)4e hypotheses for each of the follow)(g s)t,at)o(s. &hat )s
)*e(t)fy the correct (,-ber µ0 a(* wr)te H0:µ=µ0 a(* the appropr)ate a(alogo,s e7press)o(
for a.5
Saylor URL: http://www.saylor.org/books Saylor.org39;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 391/723
a &he a4erage $,ly te-perat,re )( a reg)o( h)stor)cally has bee( .0. Perhaps
)t )s h)gher (ow.
b &he a4erage we)ght of a fe-ale a)rl)(e passe(ger w)th l,ggage was 10
po,(*s te( years ago. &he AA bel)e4es )t to be h)gher (ow.
c &he a4erage st)pe(* for *octoral st,*e(ts )( a part)c,lar *)sc)pl)(e at a state,()4ers)ty )s 10. &he *epart-e(t cha)r-a( bel)e4es that the (at)o(al
a4erage )s h)gher.
* &he a4erage roo- rate )( hotels )( a certa)( reg)o( )s 62.03. A tra4el age(t
bel)e4es that the a4erage )( a part)c,lar resort area )s *)8ere(t.
e &he a4erage far- s)Ve )( a pre*o-)(ately r,ral state was 9. acres. &he
secretary of agr)c,lt,re of that state asserts that )t )s less to*ay.
2 State the (,ll a(* alter(at)4e hypotheses for each of the follow)(g s)t,at)o(s. &hat )s
)*e(t)fy the correct (,-ber µ0 a(* wr)te H0:µ=µ0 a(* the appropr)ate a(alogo,s e7press)o(
for a.5
a &he a4erage t)-e workers spe(t co--,t)(g to work )( 'ero(a +4e years ago
was 36.2 -)(,tes. &he 'ero(a %ha-ber of %o--erce asserts that the
a4erage )s less (ow.
b &he -ea( salary for all -e( )( a certa)( profess)o( )s 06291. A spec)al
)(terest gro,p th)(ks that the -ea( salary for wo-e( )( the sa-e profess)o(
)s *)8ere(t.
c &he accepte* +g,re for the ca8e)(e co(te(t of a( 6=o,(ce c,p of co8ee )s 133
-g. A *)et)t)a( bel)e4es that the a4erage for co8ee ser4e* )( a local
resta,ra(ts )s h)gher.
* &he a4erage y)el* per acre for all types of cor( )( a rece(t year was 11.9
b,shels. A( eco(o-)st bel)e4es that the a4erage y)el* per acre )s *)8ere(t th)s
year.
e A( )(*,stry assoc)at)o( asserts that the a4erage age of all self=*escr)be* [y
+sher-e( )s 2.6 years. A soc)olog)st s,spects that )t )s h)gher.
3 escr)be the two types of errors that ca( be -a*e )( a test of hypotheses.
U(*er what c)rc,-sta(ce )s a test of hypotheses certa)( to y)el* a correct *ec)s)o(C
Saylor URL: http://www.saylor.org/books Saylor.org391
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 392/723
5.0 !ar$e Sample Tests or a 2opulation >ean
!+A/N&N: 1';+CT&<+S
1 &o lear( how to apply the +4e=step test proce*,re for a test of hypotheses
co(cer()(g a pop,lat)o( -ea( whe( the sa-ple s)Ve )s large.
2 &o lear( how to )(terpret the res,lt of a test of hypotheses )( the co(te7t of the
or)g)(al (arrate* s)t,at)o(.
Saylor URL: http://www.saylor.org/books Saylor.org392
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 393/723
Saylor URL: http://www.saylor.org/books Saylor.org393
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 394/723
EKAPLE
t )s hope* that a (ewly *e4elope* pa)( rel)e4er w)ll -ore <,)ckly pro*,ce
percept)ble re*,ct)o( )( pa)( to pat)e(ts after -)(or s,rger)es tha( a sta(*ar*
pa)( rel)e4er. &he sta(*ar* pa)( rel)e4er )s k(ow( to br)(g rel)ef )( a( a4erage of
3.0 -)(,tes w)th sta(*ar* *e4)at)o( 2.1 -)(,tes. &o test whether the (ew pa)(
Saylor URL: http://www.saylor.org/books Saylor.org39
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 395/723
rel)e4er works -ore <,)ckly tha( the sta(*ar* o(e 0; pat)e(ts w)th -)(or
s,rger)es were g)4e( the (ew pa)( rel)e4er a(* the)r t)-es to rel)ef were recor*e*.
&he e7per)-e(t y)el*e* sa-ple -ea( x −=3.1-)(,tes a(* sa-ple sta(*ar*
*e4)at)o( s 1.0 -)(,tes. s there s,?c)e(t e4)*e(ce )( the sa-ple to )(*)cate
at the 0G le4el of s)g()+ca(ce that the (ewly *e4elope* pa)( rel)e4er *oes
*el)4er percept)ble rel)ef -ore <,)cklyC
Sol,t)o(:
Be perfor- the test of hypotheses ,s)(g the +4e=step proce*,re g)4e( at the e(*
of Sect)o( 6.1 Q&he Ele-e(ts of @ypothes)s &est)(gQ.
• Step 1. &he (at,ral ass,-pt)o( )s that the (ew *r,g )s (o better tha( the ol*
o(e b,t -,st be pro4e* to be better. &h,s )f μ *e(otes the a4erage t)-e ,(t)l
all pat)e(ts who are g)4e( the (ew *r,g e7per)e(ce pa)( rel)ef the hypothes)s
test )s
Saylor URL: http://www.saylor.org/books Saylor.org390
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 396/723
percept)ble rel)ef fro- pa)( ,s)(g the (ew pa)( rel)e4er )s s-aller tha( the
a4erage t)-e for the sta(*ar* pa)( rel)e4er.
Figure ".;6eEection 6egion and )est %tatistic for 'ote ".27 >(ample 8>
Saylor URL: http://www.saylor.org/books Saylor.org39
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 397/723
+A>2!+ 8
A cos-et)cs co-pa(y +lls )ts best=sell)(g 6=o,(ce >ars of fac)al crea- by a(
a,to-at)c *)spe(s)(g -ach)(e. &he -ach)(e )s set to *)spe(se a -ea( of 6.1
o,(ces per >ar. U(co(trollable factors )( the process ca( sh)ft the -ea( away
fro- 6.1 a(* ca,se e)ther ,(*er+ll or o4er+ll both of wh)ch are ,(*es)rable. (
s,ch a case the *)spe(s)(g -ach)(e )s stoppe* a(* recal)brate*. Regar*less of
the -ea( a-o,(t *)spe(se* the sta(*ar* *e4)at)o( of the a-o,(t *)spe(se*
always has 4al,e ;.22 o,(ce. A <,al)ty co(trol e(g)(eer ro,t)(ely selects 3; >ars
fro- the asse-bly l)(e to check the a-o,(ts +lle*. "( o(e occas)o( the sa-ple
-ea( )s x−=8.2o,(ces a(* the sa-ple sta(*ar* *e4)at)o( )s s ;.20 o,(ce.
eter-)(e )f there )s s,?c)e(t e4)*e(ce )( the sa-ple to )(*)cate at the 1G le4el
of s)g()+ca(ce that the -ach)(e sho,l* be recal)brate*.
Sol,t)o(:
• Step 1. &he (at,ral ass,-pt)o( )s that the -ach)(e )s work)(g properly. &h,s
)f μ *e(otes the -ea( a-o,(t of fac)al crea- be)(g *)spe(se* the
hypothes)s test )s
H 0:µ = 8.1
vs.H a:µ=≠8.1@ α=0.01
Saylor URL: http://www.saylor.org/books Saylor.org39
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 398/723
Figure ".56eEection 6egion and )est %tatistic for 'ote ".2" >(ample ;>
Saylor URL: http://www.saylor.org/books Saylor.org396
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 399/723
*+, TA*+AA,S
• &here are two for-,las for the test stat)st)c )( test)(g hypotheses abo,t a
pop,lat)o( -ea( w)th large sa-ples. #oth test stat)st)cs follow the sta(*ar*
(or-al *)str)b,t)o(.
•
&he pop,lat)o( sta(*ar* *e4)at)o( )s ,se* )f )t )s k(ow( otherw)se the sa-plesta(*ar* *e4)at)o( )s ,se*.
• &he sa-e +4e=step proce*,re )s ,se* w)th e)ther test stat)st)c.
++/C&S+S
'AS&C
1 )(* the re>ect)o( reg)o( for the sta(*ar*)Ve* test stat)st)c5 for each hypothes)s test.
a H0:µ=274s. Ha:µ<27 α=0.05.
b H0:µ=524s. Ha:µ≠52 α=0.05.
c H0:µ=−1054s. Ha:µ>−105 α=0.10.
* H0:µ=78.84s. Ha:µ≠78.8 α=0.10.
2 )(* the re>ect)o( reg)o( for the sta(*ar*)Ve* test stat)st)c5 for each hypothes)s test.
a H0:µ=174s. Ha:µ<17 α=0.01.
b H0:µ=8804s. Ha:µ≠880 α=0.01.
c H0:µ=−124s. Ha:µ>−12 α=0.05.
* H0:µ=21.14s. Ha:µ≠21.1 α=0.05.
3 )(* the re>ect)o( reg)o( for the sta(*ar*)Ve* test stat)st)c5 for each hypothes)s test.
*e(t)fy the test as left=ta)le* r)ght=ta)le* or two=ta)le*.
a H0:µ=1414s. Ha:µ<141 α=0.20.
Saylor URL: http://www.saylor.org/books Saylor.org399
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 400/723
b H0:µ=−544s. Ha:µ<−54 α=0.05.
c H0:µ=98.64s. Ha:µ≠98.6 α=0.05.
* H0:µ=3.84s. Ha:µ>3.8 α=0.001.
)(* the re>ect)o( reg)o( for the sta(*ar*)Ve* test stat)st)c5 for each hypothes)s test.
*e(t)fy the test as left=ta)le* r)ght=ta)le* or two=ta)le*.
a H0:µ=−624s. Ha:µ≠−62 α=0.005.
b H0:µ=734s. Ha:µ>73 α=0.001.
c H0:µ=11244s. Ha:µ<1124 α=0.001.
* H0:µ=0.124s. Ha:µ≠0.12 α=0.001.
0 %o-p,te the 4al,e of the test stat)st)c for the )(*)cate* test base* o( the
)(for-at)o( g)4e(.
a &est)(g H0:µ=72.24s. Ha:µ>72.2 σ ,(k(ow( n 00 x−=75.1 s 9.20
b &est)(g H0:µ=584s. Ha:µ>58 σ 1.22 n ; x−=58.5 s 1.29
c &est)(g H0:µ=−19.54s. Ha:µ<−19.5 σ ,(k(ow( n 3; x−=−23.2 s 9.00
* &est)(g H0:µ=8054s. Ha:µ≠805 σ 3.0 n 0 x−=818 s 3.2
%o-p,te the 4al,e of the test stat)st)c for the )(*)cate* test base* o( the
)(for-at)o( g)4e(.
a &est)(g H0:µ=3424s. Ha:µ<342 σ 11.2 n ; x−=339 s 1;.3
b &est)(g H0:µ=1054s. Ha:µ>105 σ 0.3 n 6; x−=107 s 0.1
c &est)(g H0:µ=−13.54s. Ha:µ≠−13.5 σ ,(k(ow( n 32 x−=−13.8 s 1.0
* &est)(g H0:µ=284s. Ha:µ≠28 σ ,(k(ow( n 6 x−=27.8 s 1.3
Perfor- the )(*)cate* test of hypotheses base* o( the )(for-at)o( g)4e(.
a &est H0:µ=2124s. Ha:µ<212 α=0.10 σ ,(k(ow( n 3 x−=211.2 s 2.2
b &est H0:µ=−184s. Ha:µ>−18 α=0.05 σ 3.3 n x−=−17.2 s 3.1
c &est H0:µ=244s. Ha:µ≠24 α=0.02 σ ,(k(ow( n 0; x−=22.8 s 1.9
6 Perfor- the )(*)cate* test of hypotheses base* o( the )(for-at)o( g)4e(.
a &est H0:µ=1054s. Ha:µ>105 α=0.05 σ ,(k(ow( n 3; x−=108 s .2
b &est H0:µ=21.64s. Ha:µ<21.6 α=0.01 σ ,(k(ow( n 6 x−=20.5 s 3.9
c &est H0:µ=−3754s. Ha:µ≠−375 α=0.01 σ 16.0 n 31 x−=−388 s 16.;
Saylor URL: http://www.saylor.org/books Saylor.org;;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 401/723
A22!&CAT&1NS
9 ( the past the a4erage le(gth of a( o,tgo)(g telepho(e call fro- a b,s)(ess o?ce has
bee( 13 seco(*s. A -a(ager w)shes to check whether that a4erage has *ecrease*
after the )(tro*,ct)o( of pol)cy cha(ges. A sa-ple of 1;; telepho(e calls pro*,ce* a
-ea( of 133 seco(*s w)th a sta(*ar* *e4)at)o( of 30 seco(*s. Perfor- the rele4a(ttest at the 1G le4el of s)g()+ca(ce.
1; &he go4er(-e(t of a( )-po4er)she* co,(try reports the -ea( age at *eath a-o(g
those who ha4e s,r4)4e* to a*,lthoo* as .2 years. A rel)ef age(cy e7a-)(es 3;
ra(*o-ly selecte* *eaths a(* obta)(s a -ea( of 2.3 years w)th sta(*ar* *e4)at)o( 6.1
years. &est whether the age(cyWs *ata s,pport the alter(at)4e hypothes)s at the 1G
le4el of s)g()+ca(ce that the pop,lat)o( -ea( )s less tha( .2.
11 &he a4erage ho,sehol* s)Ve )( a certa)( reg)o( se4eral years ago was 3.1 perso(s. A
soc)olog)st w)shes to test at the 0G le4el of s)g()+ca(ce whether )t )s *)8ere(t (ow.
Perfor- the test ,s)(g the )(for-at)o( collecte* by the soc)olog)st: )( a ra(*o- sa-ple
of 0 ho,sehol*s the a4erage s)Ve was 2.96 perso(s w)th sa-ple sta(*ar* *e4)at)o(
;.62 perso(.
12 &he reco--e(*e* *a)ly calor)e )(take for tee(age g)rls )s 22;; calor)es/*ay. A
(,tr)t)o()st at a state ,()4ers)ty bel)e4es the a4erage *a)ly calor)c )(take of g)rls )( that
state to be lower. &est that hypothes)s at the 0G le4el of s)g()+ca(ce aga)(st the (,ll
hypothes)s that the pop,lat)o( a4erage )s 22;; calor)es/*ay ,s)(g the follow)(g sa-ple
*ata: n 3 x−= 2,150 s 2;3.
13 A( a,to-ob)le -a(,fact,rer reco--e(*s o)l cha(ge )(ter4als of 3;;; -)les. &o
co-pare act,al )(ter4als to the reco--e(*at)o( the co-pa(y ra(*o-ly sa-ples
recor*s of 0; o)l cha(ges at ser4)ce fac)l)t)es a(* obta)(s sa-ple -ea( 302 -)les w)th
sa-ple sta(*ar* *e4)at)o( 36 -)les. eter-)(e whether the *ata pro4)*e s,?c)e(t
e4)*e(ce at the 0G le4el of s)g()+ca(ce that the pop,lat)o( -ea( )(ter4al betwee( o)l
cha(ges e7cee*s 3;;; -)les.
1 A -e*)cal laboratory cla)-s that the -ea( t,r(=aro,(* t)-e for perfor-a(ce of a
battery of tests o( bloo* sa-ples )s 1.66 b,s)(ess *ays. &he -a(ager of a large -e*)cal
pract)ce bel)e4es that the act,al -ea( )s larger. A ra(*o- sa-ple of 0 bloo* sa-ples
y)el*e* -ea( 2.;9 a(* sa-ple sta(*ar* *e4)at)o( ;.13 *ay. Perfor- the rele4a(t test atthe 1;G le4el of s)g()+ca(ce ,s)(g these *ata.
10 A grocery store cha)( has as o(e sta(*ar* of ser4)ce that the -ea( t)-e c,sto-ers wa)t
)( l)(e to beg)( check)(g o,t (ot e7cee* 2 -)(,tes. &o 4er)fy the perfor-a(ce of a store
the co-pa(y -eas,res the wa)t)(g t)-e )( 3; )(sta(ces obta)()(g -ea( t)-e 2.1
-)(,tes w)th sta(*ar* *e4)at)o( ;. -)(,te. Use these *ata to test the (,ll hypothes)s
Saylor URL: http://www.saylor.org/books Saylor.org;1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 402/723
that the -ea( wa)t)(g t)-e )s 2 -)(,tes 4ers,s the alter(at)4e that )t e7cee*s 2
-)(,tes at the 1;G le4el of s)g()+ca(ce.
1 A -agaV)(e p,bl)sher tells pote(t)al a*4ert)sers that the -ea( ho,sehol* )(co-e of )ts
reg,lar rea*ersh)p )s 10;;. A( a*4ert)s)(g age(cy w)shes to test th)s cla)- aga)(st
the alter(at)4e that the -ea( )s s-aller. A sa-ple of ; ra(*o-ly selecte* reg,larrea*ers y)el*s -ea( )(co-e 096;; w)th sta(*ar* *e4)at)o( 060;. Perfor- the
rele4a(t test at the 1G le4el of s)g()+ca(ce.
1 A,thors of a co-p,ter algebra syste- w)sh to co-pare the spee* of a (ew
co-p,tat)o(al algor)th- to the c,rre(tly )-ple-e(te* algor)th-. &hey apply the (ew
algor)th- to 0; sta(*ar* proble-sD )t a4erages 6.1 seco(*s w)th sta(*ar* *e4)at)o(
;.1 seco(*. &he c,rre(t algor)th- a4erages 6.21 seco(*s o( s,ch proble-s. &est at
the 1G le4el of s)g()+ca(ce the alter(at)4e hypothes)s that the (ew algor)th- has a
lower a4erage t)-e tha( the c,rre(t algor)th-.
16 A ra(*o- sa-ple of the start)(g salar)es of 30 ra(*o-ly selecte* gra*,ates w)th
bachelorWs *egrees last year ga4e sa-ple -ea( a(* sta(*ar* *e4)at)o( 12;2 a(*
21 respect)4ely. &est whether the *ata pro4)*e s,?c)e(t e4)*e(ce at the 0G le4el
of s)g()+ca(ce to co(cl,*e that the -ea( start)(g salary of all gra*,ates last year )s
less tha( the -ea( of all gra*,ates two years before 3069.
A((&T&1NA! ++/C&S+S
19 &he -ea( ho,sehol* )(co-e )( a reg)o( ser4e* by a cha)( of cloth)(g stores )s 60;.
( a sa-ple of ; c,sto-ers take( at 4ar)o,s stores the -ea( )(co-e of the c,sto-ers
was 010;0 w)th sta(*ar* *e4)at)o( 602.a &est at the 1;G le4el of s)g()+ca(ce the (,ll hypothes)s that the -ea(
ho,sehol* )(co-e of c,sto-ers of the cha)( )s 60; aga)(st that
alter(at)4e that )t )s *)8ere(t fro- 60;.
b &he sa-ple -ea( )s greater tha( 60; s,ggest)(g that the act,al -ea( of
people who patro()Ve th)s store )s greater tha( 60;. Perfor- th)s test also
at the 1;G le4el of s)g()+ca(ce. &he co-p,tat)o( of the test stat)st)c *o(e )(
part a5 st)ll appl)es here.5
2; &he labor charge for repa)rs at a( a,to-ob)le ser4)ce ce(ter are base* o( a sta(*ar*
t)-e spec)+e* for each type of repa)r. &he t)-e spec)+e* for replace-e(t of ,()4ersal
>o)(t )( a *r)4e shaft )s o(e ho,r. &he -a(ager re4)ews a sa-ple of 3; s,ch repa)rs. &he
a4erage of the act,al repa)r t)-es )s ;.6 ho,r w)th sta(*ar* *e4)at)o( ;.32 ho,r.
a &est at the 1G le4el of s)g()+ca(ce the (,ll hypothes)s that the act,al -ea(
t)-e for th)s repa)r *)8ers fro- o(e ho,r.
Saylor URL: http://www.saylor.org/books Saylor.org;2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 403/723
b &he sa-ple -ea( )s less tha( o(e ho,r s,ggest)(g that the -ea( act,al t)-e
for th)s repa)r )s less tha( o(e ho,r. Perfor- th)s test also at the 1G le4el of
s)g()+ca(ce. &he co-p,tat)o( of the test stat)st)c *o(e )( part a5 st)ll appl)es
here.5
!A/:+ (ATA S+T ++/C &S+S
21 Large ata Set 1 recor*s the SA& scores of 1;;; st,*e(ts. Regar*)(g )t as a ra(*o-
sa-ple of all h)gh school st,*e(ts ,se )t to test the hypothes)s that the pop,lat)o(
-ea( e7cee*s 101; at the 1G le4el of s)g()+ca(ce. &he (,ll hypothes)s )s that μ
101;.5
http://www.1.7ls
22 Large ata Set 1 recor*s the !PAs of 1;;; college st,*e(ts. Regar*)(g )t as a ra(*o-
sa-ple of all college st,*e(ts ,se )t to test the hypothes)s that the pop,lat)o( -ea( )s
less tha( 2.0; at the 1;G le4el of s)g()+ca(ce. &he (,ll hypothes)s )s that μ 2.0;.5
http://www.1.7ls
23 Large ata Set 1 l)sts the SA& scores of 1;;; st,*e(ts.
http://www.1.7ls
a Regar* the *ata as ar)s)(g fro- a ce(s,s of all st,*e(ts at a h)gh school )(
wh)ch the SA& score of e4ery st,*e(t was -eas,re*. %o-p,te the pop,lat)o(
-ea( μ.
b Regar* the +rst 0; st,*e(ts )( the *ata set as a ra(*o- sa-ple *raw( fro-
the pop,lat)o( of part a5 a(* ,se )t to test the hypothes)s that the pop,lat)o(
-ea( e7cee*s 101; at the 1;G le4el of s)g()+ca(ce. &he (,ll hypothes)s )sthat μ 101;.5
c s yo,r co(cl,s)o( )( part b5 )( agree-e(t w)th the tr,e state of (at,re wh)ch
by part a5 yo, k(ow5 or )s yo,r *ec)s)o( )( errorC f yo,r *ec)s)o( )s )( error )s
)t a &ype error or a &ype errorC
2 Large ata Set 1 l)sts the !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a Regar* the *ata as ar)s)(g fro- a ce(s,s of all fresh-a( at a s-all college at
the e(* of the)r +rst aca*e-)c year of college st,*y )( wh)ch the !PA of e4ery
s,ch perso( was -eas,re*. %o-p,te the pop,lat)o( -ea( μ.
b Regar* the +rst 0; st,*e(ts )( the *ata set as a ra(*o- sa-ple *raw( fro-
the pop,lat)o( of part a5 a(* ,se )t to test the hypothes)s that the pop,lat)o(
-ea( )s less tha( 2.0; at the 1;G le4el of s)g()+ca(ce. &he (,ll hypothes)s )s
that μ 2.0;.5
Saylor URL: http://www.saylor.org/books Saylor.org;3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 404/723
c s yo,r co(cl,s)o( )( part b5 )( agree-e(t w)th the tr,e state of (at,re wh)ch
by part a5 yo, k(ow5 or )s yo,r *ec)s)o( )( errorC f yo,r *ec)s)o( )s )( error )s
)t a &ype error or a &ype errorC
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 405/723
5.3 The 1bserved Si$ni)cance o a Test
LEARNN! "#$E%&'ES
1 &o lear( what the obser4e* s)g()+ca(ce of a test )s.
2 &o lear( how to co-p,te the obser4e* s)g()+ca(ce of a test.
3 &o lear( how to apply the p=4al,e approach to hypothes)s test)(g.
The 1bserved Si$ni)cance
The conceptual basis of our testing procedure is that we re#ect + 5 only if the data that we obtained would
constitute a rare event if + 5 were actually true. The level of significanceαspecifies what is meant by
Grare.H The observed significance of the test is a measure of how rare the value of the test statistic that we
have #ust observed would be if the null hypothesis were true. That is$ the observed significance of the test
#ust performed is the probability that$ if the test were repeated with a new sample$ the result of the new
test would be at least as contrary to + 5 and in support of + a as what was observed in the original test.
Saylor URL: http://www.saylor.org/books Saylor.org;0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 406/723
e+()t)o(
The observed significance or p(value of a specific test of hypotheses is the probability, on the
supposition that + 5 is true, of obtaining a result at least as contrary to + 5 and in favor of + a as the result
actually observed in the sample data.
Think back to 9ote ?.!@ 0xample 60 in +ection ?.! 0arge +ample Tests for a opulation
ean0 concerning the effectiveness of a new pain reliever. This was a left-tailed test in which the value of
the test statistic was Z1.??;. To be as contrary to + 5 and in support of + a as the resultZ=−1.886actually
observed means to obtain a value of the test statistic in the interval(−∞,−1.886].Counding Z1.??; to
Z1.?F$ we can read directly from /igure 1!.! 07umulative 9ormal
robability0 that P(Z≤−1.89)=0.0294.Thus the p-value or observed significance of the test in 9ote ?.!@
0xample 60 is 5.5!F6 or about 3B. <nder repeated sampling from this population$ if + 5 were true then
only about 3B of all samples of si'e 85 would give a result as contrary to + 5 and in favor of + a as the
sample we observed. 9ote that the probability 5.5!F6 is the area of the left tail cut off by the test statistic
in this left-tailed test.
%nalogous reasoning applies to a right-tailed or a two-tailed test$ except that in the case of a two-tailed
test being as far from 5 as the observed value of the test statistic but on the opposite side of 5 is #ust as
contrary to + 5 as being the same distance away and on the same side of 5$ hence the corresponding tail
area is doubled.
Computational (e)nition o the 1bserved Si$ni)canceo a Test o =ypothesesThe observed significance of a test of hypotheses is the area of the tail of the distribution cut off by the
test statistic *times two in the case of a two-tailed test).
+A>2!+ 9
%o-p,te the obser4e* s)g()+ca(ce of the test perfor-e* )( Note 6.26 QE7a-ple
0Q)( Sect)o( 6.2 QLarge Sa-ple &ests for a Pop,lat)o( ea(Q.
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 407/723
Sol,t)o(:
&he 4al,e of the test stat)st)c was z 2.9; wh)ch by )g,re 12.2 Q%,-,lat)4e
Nor-al Probab)l)tyQ c,ts o8 a ta)l of area ;.;; as show( )( )g,re 6. QArea of the
&a)l for Q. S)(ce the test was two=ta)le* the obser4e* s)g()+ca(ce )s 2×0.0064=0.0128.
Figure ".7 0rea of t!e )ail for 'ote ".*8 >(ample 5>
The p-value Approach to =ypothesis Testin$
,n 9ote ?.!@ 0xample 60 in +ection ?.! 0arge +ample Tests for a opulation ean0 the test was
performed at the 8B level of significance: the definition of GrareH event was probability α=0.05or less.
(e saw above that the observed significance of the test was p M 5.5!F6 or about 3B.
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 408/723
+ince p=0.0294<0.05=α*or 3B is less than 8B)$ the decision turned out to be to re#ect: what was
observed was sufficiently unlikely to "ualify as an event so rare as to be regarded as *practically)
incompatible with + 5.
,n 9ote ?.!? 0xample 80 in +ection ?.! 0arge +ample Tests for a opulation ean0 the test was
performed at the 1B level of significance: the definition of GrareH event was probability α=0.01or less.
The observed significance of the test was computed in 9ote ?.36 0xample ;0 as p M 5.51!? or about
1.3B. +ince p=0.0128>0.01=α*or 1.3B is greater than 1B)$ the decision turned out to be not to re#ect.
The event observed was unlikely$ but not sufficiently unlikely to lead to re#ection of the null
hypothesis.
The reasoning #ust presented is the basis for a slightly different but e"uivalent formulation of the
hypothesis testing process. The first three steps are the same as before$ but instead of using αto
compute critical values and construct a re#ection region$ one computes the p-value p of the test and
compares it toα$ re#ecting + 5 if p≤αand not re#ecting if p>α.
Systematic =ypothesis Testin$ 2rocedureH p-<alueApproach1 ,dentify the null and alternative hypotheses.
! ,dentify the relevant test statistic and its distribution.
3 7ompute from the data the value of the test statistic.
6 7ompute the p-value of the test.
8 7ompare the value computed in +tep 6 to significance levelαand make a decision:
re#ect + 5 if p≤αand do not re#ect + 5 if p>α./ormulate the decision in the context of the problem$ ifapplicable.
Saylor URL: http://www.saylor.org/books Saylor.org;6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 409/723
Saylor URL: http://www.saylor.org/books Saylor.org;9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 410/723
+A>2!+ 5
r. Prospero has bee( teach)(g Algebra fro- a part)c,lar te7tbook at Re-ote
sle @)gh School for -a(y years. "4er the years st,*e(ts )( h)s Algebra classesha4e co(s)ste(tly score* a( a4erage of o( the e(* of co,rse e7a- E"%5. &h)s
year r. Prospero ,se* a (ew te7tbook )( the hope that the a4erage score o( the
E"% test wo,l* be h)gher. &he a4erage E"% test score of the st,*e(ts who
took Algebra fro- r. Prospero th)s year ha* -ea( 9. a(* sa-ple sta(*ar*
*e4)at)o( .1. eter-)(e whether these *ata pro4)*e s,?c)e(t e4)*e(ce at the
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 411/723
1G le4el of s)g()+ca(ce to co(cl,*e that the a4erage E"% test score )s h)gher
w)th the (ew te7tbook.
Sol,t)o(:
• Step 1. Let μ be the tr,e a4erage score o( the E"% e7a- of all r. ProsperoWs
st,*e(ts who take the Algebra co,rse w)th the (ew te7tbook. &he (at,ral
state-e(t that wo,l* be ass,-e* tr,e ,(less there were stro(g e4)*e(ce to
the co(trary )s that the (ew book )s abo,t the sa-e as the ol* o(e. &he
alter(at)4e wh)ch )t takes e4)*e(ce to establ)sh )s that the (ew book )s
better wh)ch correspo(*s to a h)gher 4al,e of μ. &h,s the rele4a(t test )s
H0:µ = 67
vs.Ha:µ >67@ α=0.01
Saylor URL: http://www.saylor.org/books Saylor.org11
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 412/723
Figure ".)est %tatistic for 'ote ".*7 >(ample ">
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 413/723
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 414/723
Figure ".1A)est %tatistic for 'ote ".*" >(ample >
*+, TA*+AA,S
• &he obser4e* s)g()+ca(ce or p=4al,e of a test )s a -eas,re of how )(co(s)ste(t
the sa-ple res,lt )s w)th ; a(* )( fa4or of a.
• &he p=4al,e approach to hypothes)s test)(g -ea(s that o(e -erely co-pares
the p=4al,e to α)(stea* of co(str,ct)(g a re>ect)o( reg)o(.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 415/723
• &here )s a syste-at)c +4e=step proce*,re for the p=4al,e approach to hypothes)s
test)(g.
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 416/723
a Perfor- the rele4a(t test of hypotheses at the 2;G le4el of s)g()+ca(ce ,s)(g
the cr)t)cal 4al,e approach.
b %o-p,te the obser4e* s)g()+ca(ce of the test.
c Perfor- the test at the 2;G le4el of s)g()+ca(ce ,s)(g the p=4al,e approach.
Jo, (ee* (ot repeat the +rst three steps alrea*y *o(e )( part a5.
9 &he -ea( score o( a 20=po)(t place-e(t e7a- )( -athe-at)cs ,se* for the past two
years at a large state ,()4ers)ty )s 1.3. &he place-e(t coor*)(ator w)shes to test
whether the -ea( score o( a re4)se* 4ers)o( of the e7a- *)8ers fro- 1.3. She g)4es
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 417/723
the re4)se* e7a- to 3; e(ter)(g fresh-e( early )( the s,--erD the -ea( score )s 1.
w)th sta(*ar* *e4)at)o( 2..
a Perfor- the test at the 1;G le4el of s)g()+ca(ce ,s)(g the cr)t)cal 4al,e
approach.
b %o-p,te the obser4e* s)g()+ca(ce of the test.c Perfor- the test at the 1;G le4el of s)g()+ca(ce ,s)(g the p=4al,e approach. Jo,
(ee* (ot repeat the +rst three steps alrea*y *o(e )( part a5.
1; &he -ea( )(crease )( wor* fa-)ly 4ocab,lary a-o(g st,*e(ts )( a o(e=year fore)g(
la(g,age co,rse )s 0 wor* fa-)l)es. ( or*er to est)-ate the e8ect of a (ew type of
class sche*,l)(g a( )(str,ctor -o()tors the progress of ; st,*e(tsD the sa-ple -ea(
)(crease )( wor* fa-)ly 4ocab,lary of these st,*e(ts )s 02 wor* fa-)l)es w)th sa-ple
sta(*ar* *e4)at)o( 16 wor* fa-)l)es.
a &est at the 0G le4el of s)g()+ca(ce whether the -ea( )(crease w)th the (ew
class sche*,l)(g )s *)8ere(t fro- 0 wor* fa-)l)es ,s)(g the cr)t)cal 4al,e
approach.
b %o-p,te the obser4e* s)g()+ca(ce of the test.
c Perfor- the test at the 0G le4el of s)g()+ca(ce ,s)(g the p=4al,e approach. Jo,
(ee* (ot repeat the +rst three steps alrea*y *o(e )( part a5.
11 &he -ea( y)el* for har* re* w)(ter wheat )( a certa)( state )s .6 b,/acre. ( a p)lot
progra- a -o*)+e* grow)(g sche-e was )(tro*,ce* o( 30 )(*epe(*e(t plots. &he
res,lt was a sa-ple -ea( y)el* of 0. b,/acre w)th sa-ple sta(*ar* *e4)at)o( 1.
b,/acre a( appare(t )(crease )( y)el*.
a &est at the 0G le4el of s)g()+ca(ce whether the -ea( y)el* ,(*er the (ew
sche-e )s greater tha( .6 b,/acre ,s)(g the cr)t)cal 4al,e approach.
b %o-p,te the obser4e* s)g()+ca(ce of the test.
c Perfor- the test at the 0G le4el of s)g()+ca(ce ,s)(g the p=4al,e approach. Jo,
(ee* (ot repeat the +rst three steps alrea*y *o(e )( part a5.
12 &he a4erage a-o,(t of t)-e that 4)s)tors spe(t look)(g at a reta)l co-pa(yWs ol* ho-e
page o( the worl* w)*e web was 23. seco(*s. &he co-pa(y co--)ss)o(s a (ew ho-e
page. "( )ts +rst *ay )( place the -ea( t)-e spe(t at the (ew page by 26 4)s)tors
was 23.0 seco(*s w)th sta(*ar* *e4)at)o( 0.1 seco(*s.
a &est at the 0G le4el of s)g()+ca(ce whether the -ea( 4)s)t t)-e for the (ew
page )s less tha( the for-er -ea( of 23. seco(*s ,s)(g the cr)t)cal 4al,e
approach.
b %o-p,te the obser4e* s)g()+ca(ce of the test.
c Perfor- the test at the 0G le4el of s)g()+ca(ce ,s)(g the p=4al,e approach. Jo,
(ee* (ot repeat the +rst three steps alrea*y *o(e )( part a5.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 418/723
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 419/723
5.7 Small Sample Tests or a 2opulation >ean
LEARNN! "#$E%&'E
1 &o lear( how to apply the +4e=step test proce*,re for test of hypotheses
co(cer()(g a pop,lat)o( -ea( whe( the sa-ple s)Ve )s s-all.
,n the previous section hypotheses testing for population means was described in the case of large
samples. The statistical validity of the tests was insured by the 7entral imit Theorem$ with
essentially no assumptions on the distribution of the population. (hen sample si'es are small$ as is
often the case in practice$ the 7entral imit Theorem does not apply. One must then impose stricter
assumptions on the population to give statistical validity to the test procedure. One commonassumption is that the population from which the sample is taken has a normal probability
distribution to begin with. <nder such circumstances$ if the population standard deviation is known$
then the test statistic x−−µ0) σ √n)still has the standard normal distribution$ as in the previous
two sections. ,f 6 is unknown and is approximated by the sample standard deviation s$ then the
resulting test statistic x−−µ0) s √n)follows +tudent&s t -distribution withn−1degrees of freedom.
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 420/723
!igure 3."" (istribution of the %tandardized Test %tatistic and the -ejection -egion
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 421/723
The p-value of a test of hypotheses for which the test statistic has +tudent&s t -distribution can be
computed using statistical software$ but it is impractical to do so using tables$ since that would
re"uire 35 tables analogous to /igure 1!.! 07umulative 9ormal robability0$ one for each degree of
freedom from 1 to 35./igure 1!.3 07ritical Dalues of 0 can be used to approximate the p-value of such
a test$ and this is typically ade"uate for making a decision using the p-value approach to hypothesis
testing$ although not always. /or this reason the tests in the two examples in this section will be
made following the critical value approach to hypothesis testing summari'ed at the end of +ection ?.1
0The lements of Eypothesis Testing0$ but after each one we will show how the p-value approach
could have been used.
EKAPLE 1;
&he pr)ce of a pop,lar te(()s racket at a (at)o(al cha)( store )s 19. Port)a bo,ght
+4e of the sa-e racket at a( o(l)(e a,ct)o( s)te for the follow)(g pr)ces:
155 179 175 175 161
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 422/723
Ass,-)(g that the a,ct)o( pr)ces of rackets are (or-ally *)str)b,te* *eter-)(e
whether there )s s,?c)e(t e4)*e(ce )( the sa-ple at the 0G le4el of s)g()+ca(ce to
co(cl,*e that the a4erage pr)ce of the racket )s less tha( 19 )f p,rchase* at a(
o(l)(e a,ct)o(.
Sol,t)o(:
• Step 1. &he assert)o( for wh)ch e4)*e(ce -,st be pro4)*e* )s that the a4erage
o(l)(e pr)ce μ )s less tha( the a4erage pr)ce )( reta)l stores so the hypothes)s
test )s
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 423/723
(−∞,−2.132].
• Step 0. As show( )( )g,re 6.12 QRe>ect)o( Reg)o( a(* &est Stat)st)c for Q the
test stat)st)c falls )( the re>ect)o( reg)o(. &he *ec)s)o( )s to re>ect ;. ( the
co(te7t of the proble- o,r co(cl,s)o( )s:
&he *ata pro4)*e s,?c)e(t e4)*e(ce at the 0G le4el of s)g()+ca(ce to
co(cl,*e that the a4erage pr)ce of s,ch rackets p,rchase* at o(l)(e a,ct)o(s )s
less tha( 19.
Saylor URL: http://www.saylor.org/books Saylor.org23
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 424/723
Figure ".126eEection 6egion and )est %tatistic for 'ote ".82 >(ample 1A>
To perform the test in 9ote ?.6! 0xample 150 using the p-value approach$ look in the row in /igure 1!.3
07ritical Dalues of 0 with the headingdf=4and search for the two t -values that bracket the unsigned value
!.18! of the test statistic. They are !.13! and !.@@;$ in the columns with headings t 5.585 and t 5.5!8. They cut
off right tails of area 5.585 and 5.5!8$ so because !.18! is between them it must cut off a tail of area
between 5.585 and 5.5!8. 4y symmetry Z!.18! cuts off a left tail of area between 5.585 and 5.5!8$ hence
the p-value corresponding tot=−2.152is between 5.5!8 and 5.58. %lthough its precise value is unknown$
it must be less than α=0.05$ so the decision is to re#ect + 5.
+A>2!+ %%
A s-all co-po(e(t )( a( electro()c *e4)ce has two s-all holes where a(other t)(y
part )s +tte*. ( the -a(,fact,r)(g process the a4erage *)sta(ce betwee( the two
holes -,st be t)ghtly co(trolle* at ;.;2 -- else -a(y ,()ts wo,l* be *efect)4e
a(* waste*. a(y t)-es thro,gho,t the *ay <,al)ty co(trol e(g)(eers take a
s-all sa-ple of the co-po(e(ts fro- the pro*,ct)o( l)(e -eas,re the *)sta(ce
betwee( the two holes a(* -ake a*>,st-e(ts )f (ee*e*. S,ppose at o(e t)-e
fo,r ,()ts are take( a(* the *)sta(ces are -eas,re* as
0.021 0.019 0.023 0.020
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 425/723
eter-)(e at the 1G le4el of s)g()+ca(ce )f there )s s,?c)e(t e4)*e(ce )( the
sa-ple to co(cl,*e that a( a*>,st-e(t )s (ee*e*. Ass,-e the *)sta(ces of
)(terest are (or-ally *)str)b,te*.
Sol,t)o(:
• Step 1. &he ass,-pt)o( )s that the process )s ,(*er co(trol ,(less there )s
stro(g e4)*e(ce to the co(trary. S)(ce a *e4)at)o( of the a4erage *)sta(ce to
e)ther s)*e )s ,(*es)rable the rele4a(t test )s
Saylor URL: http://www.saylor.org/books Saylor.org20
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 426/723
co(cl,s)o( )s:
&he *ata *o (ot pro4)*e s,?c)e(t e4)*e(ce at the 1G le4el of s)g()+ca(ce to
co(cl,*e that the -ea( *)sta(ce betwee( the holes )( the co-po(e(t *)8ers
fro- ;.;2 --.
Figure ".1*6eEection 6egion and )est %tatistic for 'ote ".8* >(ample 11>
To perform the test in 9ote ?.63 0xample 110 using the p-value approach$ look in the row
in /igure 1!.3 07ritical Dalues of 0 with the headingdf=3and search for the two t -values that
bracket the value 5.?@@ of the test statistic. %ctually 5.?@@ is smaller than the smallest number in
the row$ which is 5.F@?$ in the column with heading t 5.!55. The value 5.F@? cuts off a right tail of
area 5.!55$ so because 5.?@@ is to its left it must cut off a tail of area greater than 5.!55. Thus
the p-value$ which is the double of the area cut off *since the test is two-tailed)$ is greater than
5.655. %lthough its precise value is unknown$ it must be greater than α=0.01$ so the decision is not
to re#ect + 5.
*+, TA*+AA,S
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 427/723
• &here are two for-,las for the test stat)st)c )( test)(g hypotheses abo,t a
pop,lat)o( -ea( w)th s-all sa-ples. "(e test stat)st)c follows the sta(*ar*
(or-al *)str)b,t)o( the other St,*e(tWs t =*)str)b,t)o(.
• &he pop,lat)o( sta(*ar* *e4)at)o( )s ,se* )f )t )s k(ow( otherw)se the sa-ple
sta(*ar* *e4)at)o( )s ,se*.• E)ther +4e=step proce*,re cr)t)cal 4al,e or p=4al,e approach )s ,se* w)th e)ther
test stat)st)c.
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 428/723
a &est H0:µ=2504s. Ha:µ>250 α=0.05.
b Est)-ate the obser4e* s)g()+ca(ce of the test )( part a5 a(* state a *ec)s)o( base* o(
the p=4al,e approach to hypothes)s test)(g.
6 A ra(*o- sa-ple of s)Ve 12 *raw( fro- a (or-al pop,lat)o( y)el*e* the follow)(g
res,lts: x−=86.2 s ;.3.
a &est H0:µ=85.54s. Ha:µ≠85.5 α=0.01.
b Est)-ate the obser4e* s)g()+ca(ce of the test )( part a5 a(* state a *ec)s)o(
base* o( the p=4al,e approach to hypothes)s test)(g.
Saylor URL: http://www.saylor.org/books Saylor.org26
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 429/723
A22!&CAT&1NS
9 Researchers w)sh to test the e?cacy of a progra- )(te(*e* to re*,ce the le(gth of
labor )( ch)l*b)rth. &he accepte* -ea( labor t)-e )( the b)rth of a +rst ch)l* )s 10.3
ho,rs. &he -ea( le(gth of the labors of 13 +rst=t)-e -others )( a p)lot progra- was 6.6ho,rs w)th sta(*ar* *e4)at)o( 3.1 ho,rs. Ass,-)(g a (or-al *)str)b,t)o( of t)-es of
labor test at the 1;G le4el of s)g()+ca(ce test whether the -ea( labor t)-e for all
wo-e( follow)(g th)s progra- )s less tha( 10.3 ho,rs.
1; A *a)ry far- ,ses the so-at)c cell co,(t S%%5 report o( the -)lk )t pro4)*es to a
processor as o(e way to -o()tor the health of )ts her*. &he -ea( S%% fro- +4e sa-ples
of raw -)lk was 20;;;; cells per -)ll)l)ter w)th sta(*ar* *e4)at)o( 30;; cell/-l. &est
whether these *ata pro4)*e s,?c)e(t e4)*e(ce at the 1;G le4el of s)g()+ca(ce to
co(cl,*e that the -ea( S%% of all -)lk pro*,ce* at the *a)ry e7cee*s that )( the
pre4)o,s report 21;20; cell/-l. Ass,-e a (or-al *)str)b,t)o( of S%%.
11 S)7 co)(s of the sa-e type are *)sco4ere* at a( archaeolog)cal s)te. f the)r we)ghts o(
a4erage are s)g()+ca(tly *)8ere(t fro- 0.20 gra-s the( )t ca( be ass,-e* that the)r
pro4e(a(ce )s (ot the s)te )tself. &he co)(s are we)ghe* a(* ha4e -ea( .3 g w)th
sa-ple sta(*ar* *e4)at)o( ;.16 g. Perfor- the rele4a(t test at the ;.1G 1/1;th of 1G5
le4el of s)g()+ca(ce ass,-)(g a (or-al *)str)b,t)o( of we)ghts of all s,ch co)(s.
12 A( eco(o-)st w)shes to *eter-)(e whether people are *r)4)(g less tha( )( the past. (
o(e reg)o( of the co,(try the (,-ber of -)les *r)4e( per ho,sehol* per year )( the past
was 16.09 tho,sa(* -)les. A sa-ple of 10 ho,sehol*s pro*,ce* a sa-ple -ea( of
1.23 tho,sa(* -)les for the last year w)th sa-ple sta(*ar* *e4)at)o( .; tho,sa(*
-)les. Ass,-)(g a (or-al *)str)b,t)o( of ho,sehol* *r)4)(g *)sta(ces per year perfor-
the rele4a(t test at the 0G le4el of s)g()+ca(ce.
13 &he reco--e(*e* *a)ly allowa(ce of )ro( for fe-ales age* 190; )s 16 -g/*ay. A
caref,l -eas,re-e(t of the *a)ly )ro( )(take of 10 wo-e( y)el*e* a -ea( *a)ly )(take of
1.2 -g w)th sa-ple sta(*ar* *e4)at)o( . -g.
a Ass,-)(g that *a)ly )ro( )(take )( wo-e( )s (or-ally *)str)b,te* perfor- the
test that the act,al -ea( *a)ly )(take for all wo-e( )s *)8ere(t fro- 16
-g/*ay at the 1;G le4el of s)g()+ca(ce.
Saylor URL: http://www.saylor.org/books Saylor.org29
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 430/723
b &he sa-ple -ea( )s less tha( 16 s,ggest)(g that the act,al pop,lat)o( -ea(
)s less tha( 16 -g/*ay. Perfor- th)s test also at the 1;G le4el of s)g()+ca(ce.
&he co-p,tat)o( of the test stat)st)c *o(e )( part a5 st)ll appl)es here.5
1 &he target te-perat,re for a hot be4erage the -o-e(t )t )s *)spe(se* fro- a 4e(*)(g
-ach)(e )s 1;. A sa-ple of te( ra(*o-ly selecte* ser4)(gs fro- a (ew -ach)(e,(*ergo)(g a pre=sh)p-e(t )(spect)o( ga4e -ea( te-perat,re 13 w)th sa-ple
sta(*ar* *e4)at)o( .3.
a Ass,-)(g that te-perat,re )s (or-ally *)str)b,te* perfor- the test that the
-ea( te-perat,re of *)spe(se* be4erages )s *)8ere(t fro- 1; at the 1;G
le4el of s)g()+ca(ce.
b &he sa-ple -ea( )s greater tha( 1; s,ggest)(g that the act,al pop,lat)o(
-ea( )s greater tha( 1;. Perfor- th)s test also at the 1;G le4el of
s)g()+ca(ce. &he co-p,tat)o( of the test stat)st)c *o(e )( part a5 st)ll appl)es
here.5
10 &he a4erage (,-ber of *ays to co-plete reco4ery fro- a part)c,lar type of k(ee
operat)o( )s 123. *ays. ro- h)s e7per)e(ce a phys)c)a( s,spects that ,se of a top)cal
pa)( -e*)cat)o( -)ght be le(gthe()(g the reco4ery t)-e. @e ra(*o-ly selects the
recor*s of se4e( k(ee s,rgery pat)e(ts who ,se* the top)cal -e*)cat)o(. &he t)-es to
total reco4ery were:
Saylor URL: http://www.saylor.org/books Saylor.org3;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 431/723
2;;;; at the 1;G le4el of s)g()+ca(ce. Ass,-e that the SP% follows a (or-al
*)str)b,t)o(.
16 "(e water <,al)ty sta(*ar* for water that )s *)scharge* )(to a part)c,lar type of strea-
or po(* )s that the a4erage *a)ly water te-perat,re be at -ost 16%. S)7 sa-ples take(
thro,gho,t the *ay ga4e the *ata:
16.8 21.5 19.1 12.8 18.0 20.7
&he sa-ple -ea( x−=18.15e7cee*s 16 b,t perhaps th)s )s o(ly sa-pl)(g error.
eter-)(e whether the *ata pro4)*e s,?c)e(t e4)*e(ce at the 1;G le4el of
s)g()+ca(ce to co(cl,*e that the -ea( te-perat,re for the e(t)re *ay e7cee*s 16%.
A((&T&1NA! ++/C&S+S
19 A calc,lator has a b,)lt=)( algor)th- for ge(erat)(g a ra(*o- (,-ber accor*)(g to the
sta(*ar* (or-al *)str)b,t)o(. &we(ty=+4e (,-bers th,s ge(erate* ha4e -ea( ;.10 a(*
Saylor URL: http://www.saylor.org/books Saylor.org31
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 432/723
sa-ple sta(*ar* *e4)at)o( ;.9. &est the (,ll hypothes)s that the -ea( of all (,-bers
so ge(erate* )s ; 4ers,s the alter(at)4e that )t )s *)8ere(t fro- ; at the 2;G le4el of
s)g()+ca(ce. Ass,-e that the (,-bers *o follow a (or-al *)str)b,t)o(.
2; At e4ery sett)(g a h)gh=spee* pack)(g -ach)(e *el)4ers a pro*,ct )( a-o,(ts that 4ary
fro- co(ta)(er to co(ta)(er w)th a (or-al *)str)b,t)o( of sta(*ar* *e4)at)o( ;.12 o,(ce. &o co-pare the a-o,(t *el)4ere* at the c,rre(t sett)(g to the *es)re* a-o,(t .1
o,(ce a <,al)ty )(spector ra(*o-ly selects +4e co(ta)(ers a(* -eas,res the co(te(ts
of each obta)()(g sa-ple -ea( 3.9 o,(ces a(* sa-ple sta(*ar* *e4)at)o( ;.1;
o,(ce. &est whether the *ata pro4)*e s,?c)e(t e4)*e(ce at the 0G le4el of s)g()+ca(ce
to co(cl,*e that the -ea( of all co(ta)(ers at the c,rre(t sett)(g )s less tha( .1
o,(ces.
21 A -a(,fact,r)(g co-pa(y rece)4es a sh)p-e(t of 1;;; bolts of (o-)(al shear stre(gth
30; lb. A <,al)ty co(trol )(spector selects +4e bolts at ra(*o- a(* -eas,res the
shear stre(gth of each. &he *ata are:
4,320 4,290 4,360 4,350 4,320
a Ass,-)(g a (or-al *)str)b,t)o( of shear stre(gths test the (,ll hypothes)s
that the -ea( shear stre(gth of all bolts )( the sh)p-e(t )s 30; lb 4ers,s the
alter(at)4e that )t )s less tha( 30; lb at the 1;G le4el of s)g()+ca(ce.
b Est)-ate the p=4al,e obser4e* s)g()+ca(ce5 of the test of part a5.
c %o-pare the p=4al,e fo,(* )( part b5 to α=0.10a(* -ake a *ec)s)o( base* o(
the p=4al,e approach. E7pla)( f,lly.
22 A l)terary h)stor)a( e7a-)(es a (ewly *)sco4ere* *oc,-e(t poss)bly wr)tte( by "bero(
&hese,s. &he -ea( a4erage se(te(ce le(gth of the s,r4)4)(g ,(*)sp,te* works of
"bero( &hese,s )s 6.2 wor*s. &he h)stor)a( co,(ts wor*s )( se(te(ces betwee( +4e
s,ccess)4e 1;1 per)o*s )( the *oc,-e(t )( <,est)o( to obta)( a -ea( a4erage se(te(ce
le(gth of 39. wor*s w)th sta(*ar* *e4)at)o( .0 wor*s. &h,s the sa-ple s)Ve )s +4e.5
a eter-)(e )f these *ata pro4)*e s,?c)e(t e4)*e(ce at the 1G le4el of
s)g()+ca(ce to co(cl,*e that the -ea( a4erage se(te(ce le(gth )( the
*oc,-e(t )s less tha( 6.2.
b Est)-ate the p=4al,e of the test.
c #ase* o( the a(swers to parts a5 a(* b5 state whether or (ot )t )s l)kely that
the *oc,-e(t was wr)tte( by "bero( &hese,s.
Saylor URL: http://www.saylor.org/books Saylor.org32
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 433/723
Saylor URL: http://www.saylor.org/books Saylor.org33
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 434/723
5.8 !ar$e Sample Tests or a 2opulation2roportion
!+A/N&N: 1';+CT&<+S
1 &o lear( how to apply the +4e=step cr)t)cal 4al,e test proce*,re for test of
hypotheses co(cer()(g a pop,lat)o( proport)o(.
2 &o lear( how to apply the +4e=step p=4al,e test proce*,re for test of hypotheses
co(cer()(g a pop,lat)o( proport)o(.
4oth the critical value approach and the p-value approach can be applied to test hypotheses about a
population proportion p. The null hypothesis will have the form H0: p= p0for some specific
number p5 between 5 and 1. The alternative hypothesis will be one of the three
ine"ualities p< p0 p> p0 or p≠ p0 for the same number p5 that appears in the null hypothesis.
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 435/723
Saylor URL: http://www.saylor.org/books Saylor.org30
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 436/723
!igure 3." (istribution of the %tandardized Test %tatistic and the -ejection -egion
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 437/723
• Step 0. As show( )( )g,re 6.10 QRe>ect)o( Reg)o( a(* &est Stat)st)c for Q the
test stat)st)c falls )( the re>ect)o( reg)o(. &he *ec)s)o( )s to re>ect ;. ( the
co(te7t of the proble- o,r co(cl,s)o( )s:
&he *ata pro4)*e s,?c)e(t e4)*e(ce at the 0G le4el of s)g()+ca(ce to
co(cl,*e that a -a>or)ty of a*,lts prefer the co-pa(yWs be4erage to that of
the)r co-pet)torWs.
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 438/723
Figure ".1;6eEection 6egion and )est %tatistic for 'ote ".87 >(ample 12>
Saylor URL: http://www.saylor.org/books Saylor.org36
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 439/723
Saylor URL: http://www.saylor.org/books Saylor.org39
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 440/723
• Step 0. As show( )( )g,re 6.1 QRe>ect)o( Reg)o( a(* &est Stat)st)c for Q the
test stat)st)c *oes (ot fall )( the re>ect)o( reg)o(. &he *ec)s)o( )s (ot to
re>ect ;. ( the co(te7t of the proble- o,r co(cl,s)o( )s:
&he *ata *o (ot pro4)*e s,?c)e(t e4)*e(ce at the 1;G le4el of s)g()+ca(ce to
co(cl,*e that the proport)o( of (ewbor(s who are -ale *)8ers fro- the h)stor)c
proport)o( )( t)-es of eco(o-)c recess)o(.
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 441/723
Figure ".156eEection 6egion and )est %tatistic for 'ote ".8" >(ample 1*>
EKAPLE 1
Perfor- the test of Note 6. QE7a-ple 12Q ,s)(g the p=4al,e approach.
Sol,t)o(:
Be alrea*y k(ow that the sa-ple s)Ve )s s,?c)e(tly large to 4al)*ly perfor- the
test.
• Steps 13 of the +4e=step proce*,re *escr)be* )( Sect)o( 6.3.2
Q&he Q ha4e alrea*y bee( *o(e )( Note 6. QE7a-ple 12Q so we
w)ll (ot repeat the- here b,t o(ly say that we k(ow that the test)s r)ght=ta)le* a(* that 4al,e of the test stat)st)c )s C 1.69.
• Step . S)(ce the test )s r)ght=ta)le* the p=4al,e )s the area ,(*er
the sta(*ar* (or-al c,r4e c,t o8 by the obser4e* test stat)st)c z
1.69 as )ll,strate* )()g,re 6.1. #y )g,re 12.2 Q%,-,lat)4e
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 442/723
Nor-al Probab)l)tyQ that area a(* therefore the p=4al,e
)s 1−0.9633=0.0367.
• Step 0. S)(ce the p=4al,e )s less tha( α=0.05the *ec)s)o( )s to
re>ect ;.
Figure ".179+<alue for 'ote ".8 >(ample 18>
+A>2!+ %8
Perfor- the test of Note 6.6 QE7a-ple 13Q ,s)(g the p=4al,e approach.
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 443/723
Be alrea*y k(ow that the sa-ple s)Ve )s s,?c)e(tly large to 4al)*ly perfor- the
test.
• Steps 13 of the +4e=step proce*,re *escr)be* )( Sect)o( 6.3.2 Q&he Q ha4e
alrea*y bee( *o(e )( Note 6.6 QE7a-ple 13Q. &hey tell ,s that the test )s two=
ta)le* a(* that 4al,e of the test stat)st)c )s C 1.02.
• Step . S)(ce the test )s two=ta)le* the p=4al,e )s the *o,ble of the area ,(*er the
sta(*ar* (or-al c,r4e c,t o8 by the obser4e* test stat)st)c z 1.02. #y)g,re
12.2 Q%,-,lat)4e Nor-al Probab)l)tyQ that area )s 1−0.9382=0.0618 as )ll,strate*
)( )g,re 6.16 he(ce the p=4al,e )s 2×0.0618=0.1236.
• Step 0. S)(ce the p=4al,e )s greater tha( α=0.10the *ec)s)o( )s (ot to re>ect;.
Figure ".1"9+<alue for 'ote ".;A >(ample 1;>
*+, TA*+AA,S
• &here )s o(e for-,la for the test stat)st)c )( test)(g hypotheses abo,t a
pop,lat)o( proport)o(. &he test stat)st)c follows the sta(*ar* (or-al *)str)b,t)o(.
• E)ther +4e=step proce*,re cr)t)cal 4al,e or p=4al,e approach ca( be ,se*.
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 444/723
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 445/723
APPL%A&"NS
11 )4e years ago 3.9G of ch)l*re( )( a certa)( reg)o( l)4e* w)th so-eo(e other tha( a
pare(t. A soc)olog)st w)shes to test whether the c,rre(t proport)o( )s *)8ere(t. Perfor-
the rele4a(t test at the 0G le4el of s)g()+ca(ce ,s)(g the follow)(g *ata: )( a ra(*o-
sa-ple of 209 ch)l*re( 119 l)4e* w)th so-eo(e other tha( a pare(t.
12 &he go4er(-e(t of a part)c,lar co,(try reports )ts l)teracy rate as 02G. A
(o(go4er(-e(tal orga()Vat)o( bel)e4es )t to be less. &he orga()Vat)o( takes a ra(*o-
sa-ple of ;; )(hab)ta(ts a(* obta)(s a l)teracy rate of 2G. Perfor- the rele4a(t test
at the ;.0G o(e=half of 1G5 le4el of s)g()+ca(ce.
13 &wo years ago 2G of ho,sehol* )( a certa)( co,(ty reg,larly part)c)pate* )( recycl)(g
ho,sehol* waste. &he co,(ty go4er(-e(t w)shes to )(4est)gate whether that proport)o(
has )(crease* after a( )(te(s)4e ca-pa)g( pro-ot)(g recycl)(g. ( a s,r4ey of 9;;
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 446/723
ho,sehol*s reg,larly part)c)pate )( recycl)(g. Perfor- the rele4a(t test at the 1;G
le4el of s)g()+ca(ce.
1 Pr)or to a spec)al a*4ert)s)(g ca-pa)g( 23G of all a*,lts recog()Ve* a part)c,lar
co-pa(yWs logo. At the close of the ca-pa)g( the -arket)(g *epart-e(t co--)ss)o(e*
a s,r4ey )( wh)ch 311 of 12;; ra(*o-ly selecte* a*,lts recog()Ve* the logo.
eter-)(e at the 1G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t e4)*e(ce
to co(cl,*e that -ore tha( 23G of all a*,lts (ow recog()Ve the co-pa(yWs logo.
10 A report +4e years ago state* that 30.0G of all state=ow(e* br)*ges )( a part)c,lar state
were *e+c)e(t. A( a*4ocacy gro,p took a ra(*o- sa-ple of 1;; state=ow(e* br)*ges
)( the state a(* fo,(* 33 to be c,rre(tly rate* as be)(g *e+c)e(t. &est whether the
c,rre(t proport)o( of br)*ges )( s,ch co(*)t)o( )s 30.0G 4ers,s the alter(at)4e that )t )s
*)8ere(t fro- 30.0G at the 1;G le4el of s)g()+ca(ce.
1 ( the pre4)o,s year the proport)o( of *epos)ts )( check)(g acco,(ts at a certa)( ba(k
that were -a*e electro()cally was 0G. &he ba(k w)shes to *eter-)(e )f the proport)o(
)s h)gher th)s year. t e7a-)(e* 2;;;; *epos)t recor*s a(* fo,(* that 921 were
electro()c. eter-)(e at the 1G le4el of s)g()+ca(ce whether the *ata pro4)*e
s,?c)e(t e4)*e(ce to co(cl,*e that -ore tha( 0G of all *epos)ts to check)(g acco,(ts
are (ow be)(g -a*e electro()cally.
1 Accor*)(g to the e*eral Po4erty eas,re 12G of the U.S. pop,lat)o( l)4es )( po4erty.
&he go4er(or of a certa)( state bel)e4es that the proport)o( there )s lower. ( a sa-ple
of s)Ve 100; 13 were )-po4er)she* accor*)(g to the fe*eral -eas,re.
a &est whether the tr,e proport)o( of the stateWs pop,lat)o( that )s
)-po4er)she* )s less tha( 12G at the 0G le4el of s)g()+ca(ce.
1 %o-p,te the obser4e* s)g()+ca(ce of the test.
16 A( )(s,ra(ce co-pa(y states that )t settles 60G of all l)fe )(s,ra(ce cla)-s w)th)( 3;
*ays. A co(s,-er gro,p asks the state )(s,ra(ce co--)ss)o( to )(4est)gate. ( a
sa-ple of 20; l)fe )(s,ra(ce cla)-s 2;3 were settle* w)th)( 3; *ays.
a &est whether the tr,e proport)o( of all l)fe )(s,ra(ce cla)-s -a*e to th)s
co-pa(y that are settle* w)th)( 3; *ays )s less tha( 60G at the 0G le4el of
s)g()+ca(ce.
b %o-p,te the obser4e* s)g()+ca(ce of the test.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 447/723
19 A spec)al )(terest gro,p asserts that 9;G of all s-okers bega( s-ok)(g before age 16.
( a sa-ple of 60; s-okers 6 bega( s-ok)(g before age 16.
a &est whether the tr,e proport)o( of all s-okers who bega( s-ok)(g before
age 16 )s less tha( 9;G at the 1G le4el of s)g()+ca(ce.
b %o-p,te the obser4e* s)g()+ca(ce of the test.
2; ( the past 6G of a garageWs b,s)(ess was w)th for-er patro(s. &he ow(er of the
garage sa-ples 2;; repa)r )(4o)ces a(* +(*s that for o(ly 11 of the- the patro( was a
repeat c,sto-er.
a &est whether the tr,e proport)o( of all c,rre(t b,s)(ess that )s w)th repeat
c,sto-ers )s less tha( 6G at the 1G le4el of s)g()+ca(ce.
b %o-p,te the obser4e* s)g()+ca(ce of the test.
A&"NAL EKER%SES
21 A r,le of th,-b )s that for work)(g )(*)4)*,als o(e=<,arter of ho,sehol* )(co-e sho,l*
be spe(t o( ho,s)(g. A +(a(c)al a*4)sor bel)e4es that the a4erage proport)o( of )(co-e
spe(t o( ho,s)(g )s -ore tha( ;.20. ( a sa-ple of 3; ho,sehol*s the -ea( proport)o(
of ho,sehol* )(co-e spe(t o( ho,s)(g was ;.260 w)th a sta(*ar* *e4)at)o( of ;.;3.
Perfor- the rele4a(t test of hypotheses at the 1G le4el of s)g()+ca(ce. @)(t: &h)s
e7erc)se co,l* ha4e bee( prese(te* )( a( earl)er sect)o(.
22 ce crea- )s legally re<,)re* to co(ta)( at least 1;G -)lk fat by we)ght. &he
-a(,fact,rer of a( eco(o-y )ce crea- w)shes to be close to the legal l)-)t he(ce
pro*,ces )ts )ce crea- w)th a target proport)o( of ;.1; -)lk fat. A sa-ple of +4e
co(ta)(ers y)el*e* a -ea( proport)o( of ;.;9 -)lk fat w)th sta(*ar* *e4)at)o( ;.;;2.
&est the (,ll hypothes)s that the -ea( proport)o( of -)lk fat )( all co(ta)(ers )s ;.1;
aga)(st the alter(at)4e that )t )s less tha( ;.1; at the 1;G le4el of s)g()+ca(ce.
Ass,-e that the proport)o( of -)lk fat )( co(ta)(ers )s (or-ally *)str)b,te*. @)(t: &h)s
e7erc)se co,l* ha4e bee( prese(te* )( a( earl)er sect)o(.
LAR!E A&A S E& EKE R%SES
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 448/723
23 Large ata Sets a(* A l)st the res,lts of 0;; tosses of a *)e. Let p *e(ote the
proport)o( of all tosses of th)s *)e that wo,l* res,lt )( a +4e. Use the sa-ple *ata to test
the hypothes)s that p )s *)8ere(t fro- 1/ at the 2;G le4el of s)g()+ca(ce.
http://www..7ls
http://www.A.7ls
2 Large ata Set recor*s res,lts of a ra(*o- s,r4ey of 2;; 4oters )( each of two
reg)o(s )( wh)ch they were aske* to e7press whether they prefer %a(*)*ate 0 for a U.S.
Se(ate seat or prefer so-e other ca(*)*ate. Use the f,ll *ata set ;; obser4at)o(s5 to
test the hypothes)s that the proport)o( p of all 4oters who prefer %a(*)*ate 0 e7cee*s
;.30. &est at the 1;G le4el of s)g()+ca(ce.
http://www..7ls
20 L)(es 2 thro,gh 03 )( Large ata Set 11 )s a sa-ple of 030 real estate sales )( a
certa)( reg)o( )( 2;;6. &hose that were foreclos,re sales are )*e(t)+e* w)th a 1 )( the
seco(* col,-(. Use these *ata to test at the 1;G le4el of s)g()+ca(ce the hypothes)s
that the proport)o( p of all real estate sales )( th)s reg)o( )( 2;;6 that were foreclos,re
sales was less tha( 20G. &he (,ll hypothes)s )s H0: p=0.25.5
http://www.11.7ls
2 L)(es 03 thro,gh 11; )( Large ata Set 11 )s a sa-ple of 0; real estate sales )( a
certa)( reg)o( )( 2;1;. &hose that were foreclos,re sales are )*e(t)+e* w)th a 1 )( the
seco(* col,-(. Use these *ata to test at the 0G le4el of s)g()+ca(ce the hypothes)s
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 449/723
that the proport)o( p of all real estate sales )( th)s reg)o( )( 2;1; that were foreclos,re
sales was greater tha( 23G. &he (,ll hypothes)s )s H0: p=0.23.5
http://www.11.7ls
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 450/723
Saylor URL: http://www.saylor.org/books Saylor.org0;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 451/723
Chapter 6
Two-Sample 2roblems
The previous two chapters treated the "uestions of estimating and making inferences about a
parameter of a single population. ,n this chapter we consider a comparison of parameters that
belong to two different populations. /or example$ we might wish to compare the average income of
all adults in one region of the country with the average income of those in another region$ or we
might wish to compare the proportion of all men who are vegetarians with the proportion of all
women who are vegetarians.
(e will study construction of confidence intervals and tests of hypotheses in four situations$
depending on the parameter of interest$ the si'es of the samples drawn from each of the populations$
and the method of sampling. (e also examine sample si'e considerations.
Saylor URL: http://www.saylor.org/books Saylor.org01
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 452/723
6.% Comparison o Two 2opulation >eansH!ar$e@ &ndependent Samples
!+A/N&N: 1';+CT&<+S
1 &o ,(*ersta(* the log)cal fra-ework for est)-at)(g the *)8ere(ce betwee( the
-ea(s of two *)st)(ct pop,lat)o(s a(* perfor-)(g tests of hypotheses co(cer()(g
those -ea(s.
2 &o lear( how to co(str,ct a co(+*e(ce )(ter4al for the *)8ere(ce )( the -ea(s of
two *)st)(ct pop,lat)o(s ,s)(g large )(*epe(*e(t sa-ples.
3 &o lear( how to perfor- a test of hypotheses co(cer()(g the *)8ere(ce betwee(
the -ea(s of two *)st)(ct pop,lat)o(s ,s)(g large )(*epe(*e(t sa-ples.
+uppose we wish to compare the means of two distinct populations. /igure F.1 0,ndependent
+ampling from Two opulations0 illustrates the conceptual framework of our investigation in this
Saylor URL: http://www.saylor.org/books Saylor.org02
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 453/723
and the next section. ach population has a mean and a standard deviation. (e arbitrarily label one
population as opulation 1 and the other as opulation !$ and subscript the parameters with the
numbers 1 and ! to tell them apart. (e draw a random sample from opulation 1 and label the
sample statistics it yields with the subscript 1. (ithout reference to the first sample we draw a
sample from opulation ! and label its sample statistics with the subscript !.
!igure 4." >ndependent %ampling from Two $opulations
(e)nition
%amples from two distinct populations are independent if each one is drawn without reference to the
other, and has no connection with the other.
Saylor URL: http://www.saylor.org/books Saylor.org03
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 454/723
EKAPLE 1
&o co-pare c,sto-er sat)sfact)o( le4els of two co-pet)(g cable tele4)s)o(
co-pa()es 1 c,sto-ers of %o-pa(y 1 a(* 300 c,sto-ers of %o-pa(y 2 were
ra(*o-ly selecte* a(* were aske* to rate the)r cable co-pa()es o( a +4e=po)(t
scale w)th 1 be)(g least sat)s+e* a(* 0 -ost sat)s+e*. &he s,r4ey res,lts are
s,--ar)Ve* )( the follow)(g table:
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 455/723
Company % Company 0
n1=174 n2=355
x−1=3.51 x−2=3.24
s1=0.51 s2=0.52
%o(str,ct a po)(t est)-ate a(* a 99G co(+*e(ce )(ter4al for µ1−µ2 the *)8ere(ce
)( a4erage sat)sfact)o( le4els of c,sto-ers of the two co-pa()es as -eas,re* o(
th)s +4e=po)(t scale.
Sol,t)o(:
&he po)(t est)-ate of µ1−µ2 )s
x−1−x−2=3.51−3.24=0.27.
Saylor URL: http://www.saylor.org/books Saylor.org00
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 456/723
=ypothesis Testin$
Eypotheses concerning the relative si'es of the means of two populations are tested using the same
critical value and p-value procedures that were used in the case of a single population. %ll that is
needed is to know how to express the null and alternative hypotheses and to know the formula for
the standardi'ed test statistic and the distribution that it follows.
The null and alternative hypotheses will always be expressed in terms of the difference of the two
population means. Thus the null hypothesis will always be written
H 0:µ1−µ2=D0
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 457/723
where (5 is a number that is deduced from the statement of the situation. %s was the case with a
single population the alternative hypothesis can take one of the three forms$ with the same
terminology:
Form o Ha Terminolo$y
Ha:µ1−µ2<D0 Left=ta)le*
Ha:µ1−µ2>D0 R)ght=ta)le*
Ha:µ1−µ2≠D0 &wo=ta)le*
%s long as the samples are independent and both are large the following formula for the standardi'ed
test statistic is valid$ and it has the standard normal distribution. *,n the relatively rare case that both
population standard deviations σ1andσ2are known they would be used instead of the sample
standard deviations.)
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 458/723
Saylor URL: http://www.saylor.org/books Saylor.org06
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 459/723
re>ect ;. ( the co(te7t of the proble- o,r co(cl,s)o( )s:
&he *ata pro4)*e s,?c)e(t e4)*e(ce at the 1G le4el of s)g()+ca(ce to
co(cl,*e that the -ea( c,sto-er sat)sfact)o( for %o-pa(y 1 )s h)gher tha(
that for %o-pa(y 2.
Saylor URL: http://www.saylor.org/books Saylor.org09
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 460/723
EKAPLE 3
Perfor- the test of Note 9. QE7a-ple 2Q ,s)(g the p=4al,e approach.
Sol,t)o(:
&he +rst three steps are )*e(t)cal to those )( Note 9. QE7a-ple 2Q.
• Step . &he obser4e* s)g()+ca(ce or p=4al,e of the test )s the area of the r)ght ta)l
of the sta(*ar* (or-al *)str)b,t)o( that )s c,t o8 by the test stat)st)c C 0.6.
&he (,-ber 0.6 )s too large to appear )( )g,re 12.2 Q%,-,lat)4e Nor-al
Probab)l)tyQ wh)ch -ea(s that the area of the left ta)l that )t c,ts o8 )s 1.;;;; to
fo,r *ec)-al places. &he area that we seek the area of the rig!t ta)l )s
therefore 1−1.0000=0.0000to fo,r *ec)-al places. See )g,re 9.3. &hat )s p -
value=0.0000to fo,r *ec)-al places. &he act,al 4al,e )s appro7)-ately 0.000 000 007.E
Figure .*9+<alue for 'ote .7 >(ample *>
• Step 0. S)(ce ;.;;;; _ ;.;1 p -value<αso the *ec)s)o( )s to re>ect the (,ll
hypothes)s:
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 461/723
&he *ata pro4)*e s,?c)e(t e4)*e(ce at the 1G le4el of s)g()+ca(ce to
co(cl,*e that the -ea( c,sto-er sat)sfact)o( for %o-pa(y 1 )s h)gher tha(
that for %o-pa(y 2.
*+, TA*+AA,S
• A po)(t est)-ate for the *)8ere(ce )( two pop,lat)o( -ea(s )s s)-ply the
*)8ere(ce )( the correspo(*)(g sa-ple -ea(s.
• ( the co(te7t of est)-at)(g or test)(g hypotheses co(cer()(g two pop,lat)o(
-ea(s large sa-ples -ea(s that bot! sa-ples are large.
• A co(+*e(ce )(ter4al for the *)8ere(ce )( two pop,lat)o( -ea(s )s co-p,te*
,s)(g a for-,la )( the sa-e fash)o( as was *o(e for a s)(gle pop,lat)o( -ea(.
• &he sa-e +4e=step proce*,re ,se* to test hypotheses co(cer()(g a s)(gle
pop,lat)o( -ea( )s ,se* to test hypotheses co(cer()(g the *)8ere(ce betwee(
two pop,lat)o( -ea(s. &he o(ly *)8ere(ce )s )( the for-,la for the sta(*ar*)Ve*
test stat)st)c.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 462/723
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 463/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 464/723
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 465/723
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 466/723
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 467/723
A22!&CAT&1NS
13 ( or*er to )(4est)gate the relat)o(sh)p betwee( -ea( >ob te(,re )( years a-o(g
workers who ha4e a bachelorWs *egree or h)gher a(* those who *o (ot ra(*o- sa-ples
of each type of worker were take( w)th the follow)(g res,lts.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 468/723
n x− s
:a!he"or;+ degree or higher 155 5.2 1.3
No degree 210 5.0 1.5
a %o(str,ct the 99G co(+*e(ce )(ter4al for the *)8ere(ce )( the pop,lat)o(
-ea(s base* o( these *ata.
b &est at the 1G le4el of s)g()+ca(ce the cla)- that -ea( >ob te(,re a-o(g
those w)th h)gher e*,cat)o( )s greater tha( a-o(g those w)tho,t aga)(st the
*efa,lt that there )s (o *)8ere(ce )( the -ea(s.
c %o-p,te the obser4e* s)g()+ca(ce of the test.
1 Recor*s of ; ,se* passe(ger cars a(* ; ,se* p)ck,p tr,cks (o(e ,se* co--erc)ally5
were ra(*o-ly selecte* to )(4est)gate whether there was a(y *)8ere(ce )( the -ea(
t)-e )( years that they were kept by the or)g)(al ow(er before be)(g sol*. or cars the
-ea( was 0.3 years w)th sta(*ar* *e4)at)o( 2.2 years. or p)ck,p tr,cks the -ea( was
.1 years w)th sta(*ar* *e4)at)o( 3.; years.
a %o(str,ct the 90G co(+*e(ce )(ter4al for the *)8ere(ce )( the -ea(s base*
o( these *ata.
b &est the hypothes)s that there )s a *)8ere(ce )( the -ea(s aga)(st the (,ll
hypothes)s that there )s (o *)8ere(ce. Use the 1G le4el of s)g()+ca(ce.
c %o-p,te the obser4e* s)g()+ca(ce of the test )( part b5.
10 ( pre4)o,s years the a4erage (,-ber of pat)e(ts per ho,r at a hosp)tal e-erge(cy
roo- o( weeke(*s e7cee*e* the a4erage o( week*ays by .3 4)s)ts per ho,r. A hosp)tal
a*-)()strator bel)e4es that the c,rre(t weeke(* -ea( e7cee*s the week*ay -ea( by
fewer tha( .3 ho,rs.
a %o(str,ct the 99G co(+*e(ce )(ter4al for the *)8ere(ce )( the pop,lat)o(
-ea(s base* o( the follow)(g *ata *er)4e* fro- a st,*y )( wh)ch 3; weeke(*
a(* 3; week*ay o(e=ho,r per)o*s were ra(*o-ly selecte* a(* the (,-ber of
(ew pat)e(ts )( each recor*e*.
n x− s
eeend+ 30 13.8 3.1
eeda+ 30 8.6 2.7
b &est at the 0G le4el of s)g()+ca(ce whether the c,rre(t weeke(* -ea(
e7cee*s the week*ay -ea( by fewer tha( .3 pat)e(ts per ho,r.
c %o-p,te the obser4e* s)g()+ca(ce of the test.
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 469/723
1 A soc)olog)st s,r4eys 0; ra(*o-ly selecte* c)t)Ve(s )( each of two co,(tr)es to
co-pare the -ea( (,-ber of ho,rs of 4ol,(teer work *o(e by a*,lts )( each. A-o(g
the 0; )(hab)ta(ts of L)ll)p,t the -ea( ho,rs of 4ol,(teer work per year was 02 w)th
sta(*ar* *e4)at)o( 11.6. A-o(g the 0; )(hab)ta(ts of #lef,sc, the -ea( (,-ber of
ho,rs of 4ol,(teer work per year was 3 w)th sta(*ar* *e4)at)o( .2.a %o(str,ct the 99G co(+*e(ce )(ter4al for the *)8ere(ce )( -ea( (,-ber of
ho,rs 4ol,(teere* by all res)*e(ts of L)ll)p,t a(* the -ea( (,-ber of ho,rs
4ol,(teere* by all res)*e(ts of #lef,sc,.
b &est at the 1G le4el of s)g()+ca(ce the cla)- that the -ea( (,-ber of ho,rs
4ol,(teere* by all res)*e(ts of L)ll)p,t )s -ore tha( te( ho,rs greater tha( the
-ea( (,-ber of ho,rs 4ol,(teere* by all res)*e(ts of #lef,sc,.
c %o-p,te the obser4e* s)g()+ca(ce of the test )( part b5.
1 A ,()4ers)ty a*-)()strator asserte* that ,pperclass-e( spe(* -ore t)-e st,*y)(g
tha( ,(*erclass-e(.
a &est th)s cla)- aga)(st the *efa,lt that the a4erage (,-ber of ho,rs of st,*y
per week by the two gro,ps )s the sa-e ,s)(g the follow)(g )(for-at)o(
base* o( ra(*o- sa-ples fro- each gro,p of st,*e(ts. &est at the 1G le4el of
s)g()+ca(ce.
n x− s
U$$er!"a++%en 35 15.6 2.9
Under!"a++%en 35 12.3 4.1
b %o-p,te the obser4e* s)g()+ca(ce of the test.
16 A( k)(es)olog)st cla)-s that the rest)(g heart rate of -e( age* 16 to 20 who e7erc)se
reg,larly )s -ore tha( +4e beats per -)(,te less tha( that of -e( who *o (ot
e7erc)se reg,larly. e( )( each category were selecte* at ra(*o- a(* the)r rest)(g
heart rates were -eas,re* w)th the res,lts show(.
n x− s
eg&"ar e<er!i+e 40 63 1.0
No reg&"ar e<er!i+e 30 71 1.2a Perfor- the rele4a(t test of hypotheses at the 1G le4el of s)g()+ca(ce.
b %o-p,te the obser4e* s)g()+ca(ce of the test.
19 %h)l*re( )( two ele-e(tary school classroo-s were g)4e( two 4ers)o(s of the sa-e
test b,t w)th the or*er of <,est)o(s arra(ge* fro- eas)er to -ore *)?c,lt )(
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 470/723
'ers)o( 0 a(* )( re4erse or*er )( 'ers)o( -. Ra(*o-ly selecte* st,*e(ts fro- each
class were g)4e( 'ers)o( 0 a(* the rest 'ers)o( -. &he res,lts are show( )( the table.
n x− s
=er+ion A 31 83 4.6=er+ion B 32 78 4.3
a %o(str,ct the 9;G co(+*e(ce )(ter4al for the *)8ere(ce )( the -ea(s of the
pop,lat)o(s of all ch)l*re( tak)(g 'ers)o( 0 of s,ch a test a(* of all ch)l*re(
tak)(g 'ers)o( - of s,ch a test.
b &est at the 1G le4el of s)g()+ca(ce the hypothes)s that the 0 4ers)o( of the
test )s eas)er tha( the - 4ers)o( e4e( tho,gh the <,est)o(s are the sa-e5.
c %o-p,te the obser4e* s)g()+ca(ce of the test.
2; &he ,()c)pal &ra(s)t A,thor)ty wa(ts to k(ow )f o( week*ays -ore passe(gers r)*e
the (orthbo,(* bl,e l)(e tra)( towar*s the c)ty ce(ter that *eparts at 6:10 a.-. or the
o(e that *eparts at 6:3; a.-. &he follow)(g sa-ple stat)st)cs are asse-ble* by the
&ra(s)t A,thor)ty.
n x− s
8>15 a.%. 'rain 30 323 41
8>30 a.%. 'rain 45 356 45
a %o(str,ct the 9;G co(+*e(ce )(ter4al for the *)8ere(ce )( the -ea( (,-ber
of *a)ly tra4ellers o( the 6:10 tra)( a(* the -ea( (,-ber of *a)ly tra4ellers o(the 6:3; tra)(.
b &est at the 0G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t
e4)*e(ce to co(cl,*e that -ore passe(gers r)*e the 6:3; tra)(.
c %o-p,te the obser4e* s)g()+ca(ce of the test.
21 ( co-par)(g the aca*e-)c perfor-a(ce of college st,*e(ts who are a?l)ate* w)th
frater()t)es a(* those -ale st,*e(ts who are ,(a?l)ate* a ra(*o- sa-ple of
st,*e(ts was *raw( fro- each of the two pop,lat)o(s o( a ,()4ers)ty ca-p,s.
S,--ary stat)st)cs o( the st,*e(t !PAs are g)4e( below.
n x− s
ra'erni' 645 2.90 0.47
Unai"ia'ed 450 2.88 0.42
22 &est at the 0G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t e4)*e(ce to
co(cl,*e that there )s a *)8ere(ce )( a4erage !PA betwee( the pop,lat)o( of
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 471/723
frater()ty st,*e(ts a(* the pop,lat)o( of ,(a?l)ate* -ale st,*e(ts o( th)s ,()4ers)ty
ca-p,s.
23 ( co-par)(g the aca*e-)c perfor-a(ce of college st,*e(ts who are a?l)ate* w)th
soror)t)es a(* those fe-ale st,*e(ts who are ,(a?l)ate* a ra(*o- sa-ple of
st,*e(ts was *raw( fro- each of the two pop,lat)o(s o( a ,()4ers)ty ca-p,s.S,--ary stat)st)cs o( the st,*e(t !PAs are g)4e( below.
n x− s
orori' 330 3.18 0.37
Unai"ia'ed 550 3.12 0.41
2 &est at the 0G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t e4)*e(ce to
co(cl,*e that there )s a *)8ere(ce )( a4erage !PA betwee( the pop,lat)o( of soror)ty
st,*e(ts a(* the pop,lat)o( of ,(a?l)ate* fe-ale st,*e(ts o( th)s ,()4ers)ty
ca-p,s.
20 &he ow(er of a profess)o(al football tea- bel)e4es that the leag,e has beco-e -ore
o8e(se or)e(te* s)(ce +4e years ago. &o check h)s bel)ef 32 ra(*o-ly selecte*
ga-es fro- o(e yearWs sche*,le were co-pare* to 32 ra(*o-ly selecte* ga-es
fro- the sche*,le +4e years later. S)(ce -ore o8e(se pro*,ces -ore po)(ts per
ga-e the ow(er a(alyVe* the follow)(g )(for-at)o( o( po)(ts per ga-e ppg5.
n x− s
$$g $revio&+" 32 20.62 4.17 $$g re!en'" 32 22.05 4.01
2 &est at the 1;G le4el of s)g()+ca(ce whether the *ata o( po)(ts per ga-e pro4)*e
s,?c)e(t e4)*e(ce to co(cl,*e that the ga-e has beco-e -ore o8e(se or)e(te*.
2 &he ow(er of a profess)o(al football tea- bel)e4es that the leag,e has beco-e -ore
o8e(se or)e(te* s)(ce +4e years ago. &o check h)s bel)ef 32 ra(*o-ly selecte*
ga-es fro- o(e yearWs sche*,le were co-pare* to 32 ra(*o-ly selecte* ga-es
fro- the sche*,le +4e years later. S)(ce -ore o8e(se pro*,ces -ore o8e(s)4e yar*s
per ga-e the ow(er a(alyVe* the follow)(g )(for-at)o( o( o8e(s)4e yar*s per ga-e
oypg5.
n x− s
o$g $revio&+" 32 316 40
o$g re!en'" 32 336 35
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 472/723
26 &est at the 1;G le4el of s)g()+ca(ce whether the *ata o( o8e(s)4e yar*s per ga-e
pro4)*e s,?c)e(t e4)*e(ce to co(cl,*e that the ga-e has beco-e -ore o8e(se
or)e(te*.
!A/:+ (ATA S+T ++/C &S+S20 Large ata Sets 1A a(* 1# l)st the SA& scores for 1;;; ra(*o-ly selecte* st,*e(ts.
e(ote the pop,lat)o( of all -ale st,*e(ts as Pop,lat)o( 1 a(* the pop,lat)o( of all
fe-ale st,*e(ts as Pop,lat)o( 2.
http://www.1A.7ls
http://www.1#.7ls
a Restr)ct)(g atte(t)o( to >,st the -ales +(* n1 x−1 a(* s1. Restr)ct)(g atte(t)o(
to >,st the fe-ales +(* n2 x−2 a(* s2.
b Let µ1*e(ote the -ea( SA& score for all -ales a(* µ2the -ea( SA& score for
all fe-ales. Use the res,lts of part a5 to co(str,ct a 9;G co(+*e(ce )(ter4al
for the *)8ere(ce µ1−µ2.
c &est at the 0G le4el of s)g()+ca(ce the hypothes)s that the -ea( SA& scores
a-o(g -ales e7cee*s that of fe-ales.
2 Large ata Sets 1A a(* 1# l)st the !PAs for 1;;; ra(*o-ly selecte* st,*e(ts. e(ote
the pop,lat)o( of all -ale st,*e(ts as Pop,lat)o( 1 a(* the pop,lat)o( of all fe-ale
st,*e(ts as Pop,lat)o( 2.
http://www.1A.7ls
http://www.1#.7ls
a Restr)ct)(g atte(t)o( to >,st the -ales +(* n1 x−1 a(* s1. Restr)ct)(g atte(t)o(
to >,st the fe-ales +(* n2 x−2 a(* s2.
b Let µ1*e(ote the -ea( !PA for all -ales a(* µ2the -ea( !PA for all fe-ales.
Use the res,lts of part a5 to co(str,ct a 90G co(+*e(ce )(ter4al for the
*)8ere(ce µ1−µ2.
c &est at the 1;G le4el of s)g()+ca(ce the hypothes)s that the -ea( !PAs
a-o(g -ales a(* fe-ales *)8er.
2 Large ata Sets A a(* # l)st the s,r4)4al t)-es for 0 -ale a(* 0 fe-ale laboratory
-)ce w)th thy-)c le,ke-)a. e(ote the pop,lat)o( of all s,ch -ale -)ce as Pop,lat)o( 1
a(* the pop,lat)o( of all s,ch fe-ale -)ce as Pop,lat)o( 2.
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 473/723
http://www.A.7ls
http://www.#.7ls
a Restr)ct)(g atte(t)o( to >,st the -ales +(* n1x−1
a(* s1. Restr)ct)(g atte(t)o(to >,st the fe-ales +(* n2 x−2 a(* s2.
b Let µ1*e(ote the -ea( s,r4)4al for all -ales a(* µ2the -ea( s,r4)4al t)-e for
all fe-ales. Use the res,lts of part a5 to co(str,ct a 99G co(+*e(ce )(ter4al
for the *)8ere(ce µ1−µ2.
c &est at the 1G le4el of s)g()+ca(ce the hypothes)s that the -ea( s,r4)4al
t)-e for -ales e7cee*s that for fe-ales by -ore tha( 162 *ays half a year5.
* %o-p,te the obser4e* s)g()+ca(ce of the test )( part c5.
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 474/723
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 475/723
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 476/723
6.0 Comparison o Two 2opulation >eansHSmall@ &ndependent Samples
!+A/N&N: 1';+CT&<+S
1 &o lear( how to co(str,ct a co(+*e(ce )(ter4al for the *)8ere(ce )( the -ea(s of
two *)st)(ct pop,lat)o(s ,s)(g s-all )(*epe(*e(t sa-ples.
2 &o lear( how to perfor- a test of hypotheses co(cer()(g the *)8ere(ce betwee(
the -ea(s of two *)st)(ct pop,lat)o(s ,s)(g s-all )(*epe(*e(t sa-ples.
(hen one or the other of the sample si'es is small$ as is often the case in practice$ the 7entral imit
Theorem does not apply. (e must then impose conditions on the population to give statistical
validity to the test procedure. (e will assume that both populations from which the samples are
taken have a normal probability distribution and that their standard deviations are e"ual.
Con)dence &ntervals
(hen the two populations are normally distributed and have e"ual standard deviations$ the following
formula for a confidence interval for µ1−µ2is valid.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 477/723
EKAPLE
A software co-pa(y -arkets a (ew co-p,ter ga-e w)th two e7per)-e(tal
packag)(g *es)g(s. es)g( 1 )s se(t to 11 storesD the)r a4erage sales the +rst
-o(th )s 02 ,()ts w)th sa-ple sta(*ar* *e4)at)o( 12 ,()ts. es)g( 2 )s se(t to
storesD the)r a4erage sales the +rst -o(th )s ,()ts w)th sa-ple sta(*ar*
*e4)at)o( 1; ,()ts. %o(str,ct a po)(t est)-ate a(* a 90G co(+*e(ce )(ter4al for
the *)8ere(ce )( a4erage -o(thly sales betwee( the two package *es)g(s.
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 478/723
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 479/723
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 480/723
Saylor URL: http://www.saylor.org/books Saylor.org6;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 481/723
&he *ata *o (ot pro4)*e s,?c)e(t e4)*e(ce at the 1G le4el of s)g()+ca(ce to
co(cl,*e that the -ea( sales per -o(th of the two *es)g(s are *)8ere(t.
EKAPLE
Perfor- the test of Note 9.13 QE7a-ple 0Q ,s)(g the p=4al,e approach.
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org61
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 482/723
&he +rst three steps are )*e(t)cal to those )( Note 9.13 QE7a-ple 0Q.
• Step . #eca,se the test )s two=ta)le* the obser4e* s)g()+ca(ce or p=4al,e of
the test )s the *o,ble of the area of the r)ght ta)l of St,*e(tWst =*)str)b,t)o(
w)th 10 *egrees of free*o- that )s c,t o8 by the test stat)st)c ) 1.;;. Be
ca( o(ly appro7)-ate th)s (,-ber. Look)(g )( the row of )g,re 12.3 Q%r)t)cal
'al,es of Q hea*e* df=15 the (,-ber 1.;; )s betwee( the (,-bers ;.6 a(*
1.31 correspo(*)(g tot ;.2;; a(* t ;.1;;.
&he area c,t o8 by t ;.6 )s ;.2;; a(* the area c,t o8 by t 1.31 )s
;.1;;. S)(ce 1.;; )s betwee( ;.6 a(* 1.31 the area )t c,ts o8 )s betwee(
;.2;; a(* ;.1;;. &h,s the p=4al,e s)(ce the area -,st be *o,ble*5 )s betwee(
;.;; a(* ;.2;;.
• Step 0. S)(ce p>0.200>0.01 p>α so the *ec)s)o( )s (ot to re>ect the (,ll
hypothes)s:
&he *ata *o (ot pro4)*e s,?c)e(t e4)*e(ce at the 1G le4el of s)g()+ca(ce to
co(cl,*e that the -ea( sales per -o(th of the two *es)g(s are *)8ere(t.
IEJ &AIE ABAJS
• ( the co(te7t of est)-at)(g or test)(g hypotheses co(cer()(g two pop,lat)o(
-ea(s s-all sa-ples -ea(s that at least one sa-ple )s s-all. ( part)c,lar
e4e( )f o(e sa-ple )s of s)Ve 3; or -ore )f the other )s of s)Ve less tha( 3; the
for-,las of th)s sect)o( -,st be ,se*.
• A co(+*e(ce )(ter4al for the *)8ere(ce )( two pop,lat)o( -ea(s )s co-p,te*
,s)(g a for-,la )( the sa-e fash)o( as was *o(e for a s)(gle pop,lat)o( -ea(.
Saylor URL: http://www.saylor.org/books Saylor.org62
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 483/723
Saylor URL: http://www.saylor.org/books Saylor.org63
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 484/723
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 485/723
Saylor URL: http://www.saylor.org/books Saylor.org60
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 486/723
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 487/723
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 488/723
Saylor URL: http://www.saylor.org/books Saylor.org66
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 489/723
Saylor URL: http://www.saylor.org/books Saylor.org69
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 490/723
Saylor URL: http://www.saylor.org/books Saylor.org9;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 491/723
Saylor URL: http://www.saylor.org/books Saylor.org91
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 492/723
Saylor URL: http://www.saylor.org/books Saylor.org92
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 493/723
6.3 Comparison o Two 2opulation >eansH2aired Samples
!+A/N&N: 1';+CT&<+S
1 &o lear( the *)st)(ct)o( betwee( )(*epe(*e(t sa-ples a(* pa)re* sa-ples.
2 &o lear( how to co(str,ct a co(+*e(ce )(ter4al for the *)8ere(ce )( the -ea(s of
two *)st)(ct pop,lat)o(s ,s)(g pa)re* sa-ples.
3 &o lear( how to perfor- a test of hypotheses co(cer()(g the *)8ere(ce )( the
-ea(s of two *)st)(ct pop,lat)o(s ,s)(g pa)re* sa-ples.
+uppose chemical engineers wish to compare the fuel economy obtained by two different
formulations of gasoline. +ince fuel economy varies widely from car to car$ if the mean fuel economy
of two independent samples of vehicles run on the two types of fuel were compared$ even if one
formulation were better than the other the large variability from vehicle to vehicle might make any
difference arising from difference in fuel difficult to detect. Rust imagine one random sample having
many more large vehicles than the other. ,nstead of independent random samples$ it would make
more sense to select pairs of cars of the same make and model and driven under similar
circumstances$ and compare the fuel economy of the two cars in each pair. Thus the data would look
something like Table F.1 0/uel conomy of airs of Dehicles0$ where the first car in each pair is
Saylor URL: http://www.saylor.org/books Saylor.org93
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 494/723
operated on one formulation of the fuel *call it Type 1 gasoline) and the second car is operated on the
second *call it Type ! gasoline).
Table F.1 /uel conomy of airs of Dehicles
#ake and #odel Car Car !
:&i! ?aro++e 17.0 17.0
#odge =i$er 13.2 12.9
Honda -@ 35.3 35.4
H&%%er H 3 13.6 13.2
?e<&+ A 32.7 32.5
aBda A-9 18.4 18.1
aab 9-3 22.5 22.5
,oo'a oro""a 26.8 26.7
=o"vo A 90 15.1 15.0
The first column of numbers form a sample from opulation 1$ the population of all cars operated on
Type 1 gasoline the second column of numbers form a sample from opulation !$ the population of
all cars operated on Type ! gasoline. ,t would be incorrect to analy'e the data using the formulas
from the previous section$ however$ since the samples were not drawn independently.
(hat is correct is to compute the difference in the numbers in each pair *subtracting in the same
order each time) to obtain the third column of numbers as shown in Table F.! 0/uel conomy of
airs of Dehicles0 and treat the differences as the data. %t this point$ the new sample ofdifferencesd1=0.0,…,d9=0.1in the third column of Table F.! 0/uel conomy of airs of Dehicles0 may
be considered as a random sample of si'e n M F selected from a population with mean µd=µ1−µ2.This
approach essentially transforms the paired two-sample problem into a one-sample problem as
discussed in the previous two chapters.
Table F.! /uel conomy of airs of Dehicles
#ake and #odel Car Car ! %ifference
:&i! ?aro++e 17.0 17.0 0.0
#odge =i$er 13.2 12.9 0.3
Honda -@ 35.3 35.4 C0.1
H&%%er H 3 13.6 13.2 0.4
?e<&+ A 32.7 32.5 0.2
aBda A-9 18.4 18.1 0.3
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 495/723
#ake and #odel Car Car ! %ifference
aab 9-3 22.5 22.5 0.0
,oo'a oro""a 26.8 26.7 0.1
=o"vo A 90 15.1 15.0 0.1
9ote carefully that although it does not matter what order the subtraction is done$ it must be done in
the same order for all pairs. This is why there are both positive and negative "uantities in the third
column of numbers in Table F.! 0/uel conomy of airs of Dehicles0.
Con)dence &ntervals
(hen the population of differences is normally distributed the following formula for a confidence interval
forµd=µ1−µ2is valid.
EKAPLE
Us)(g the *ata )( &able 9.1 Q,el Eco(o-y of Pa)rs of 'eh)clesQ co(str,ct a po)(t
est)-ate a(* a 90G co(+*e(ce )(ter4al for the *)8ere(ce )( a4erage f,el
eco(o-y betwee( cars operate* o( &ype 1 gasol)(e a(* cars operate* o( &ype 2
gasol)(e.
Saylor URL: http://www.saylor.org/books Saylor.org90
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 496/723
Sol,t)o(:
Be ha4e referre* to the *ata )( &able 9.1 Q,el Eco(o-y of Pa)rs of
'eh)clesQbeca,se that )s the way that the *ata are typ)cally prese(te* b,t we
e-phas)Ve that w)th pa)re* sa-pl)(g o(e )--e*)ately co-p,tes the *)8ere(ces
as g)4e( )( &able 9.2 Q,el Eco(o-y of Pa)rs of 'eh)clesQ a(* ,ses the *)8ere(ces
as the *ata.
&he -ea( a(* sta(*ar* *e4)at)o( of the *)8ere(ces are
=ypothesis Testin$
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 497/723
Testing hypotheses concerning the difference of two population means using paired difference
samples is done precisely as it is done for independent samples$ although now the null and
alternative hypotheses are expressed in terms ofµdinstead ofµ1−µ2.Thus the null hypothesis will
always be written
H 0:µd=D0
The three forms of the alternative hypothesis$ with the terminology for each case$ are:
Form ofHa Terminolog1
Ha:µd<D0 ?e'-'ai"ed
Ha:µd>D0 igh'-'ai"ed
Ha:µd≠D0 ,/o-'ai"ed
The same conditions on the population of differences that was re"uired for constructing a confidence
interval for the difference of the means must also be met when hypotheses are tested. Eere is the
standardi'ed test statistic that is used in the test.
EKAPLE 6
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 498/723
Us)(g the *ata of &able 9.2 Q,el Eco(o-y of Pa)rs of 'eh)clesQ test the hypothes)s
that -ea( f,el eco(o-y for &ype 1 gasol)(e )s greater tha( that for &ype 2
gasol)(e aga)(st the (,ll hypothes)s that the two for-,lat)o(s of gasol)(e y)el*
the sa-e -ea( f,el eco(o-y. &est at the 0G le4el of s)g()+ca(ce ,s)(g the
cr)t)cal 4al,e approach.
Sol,t)o(:
&he o(ly part of the table that we ,se )s the th)r* col,-( the *)8ere(ces.
• Step 1. S)(ce the *)8ere(ces were co-p,te* )( the or*er Type 1 mpg− Type 2 mpg
better f,el eco(o-y w)th &ype 1 f,el correspo(*s to µd=µ1−µ2>0. &h,s the test )s
H 0:µd = 0
vs.H a:µd>0@ α=0.05
f the *)8ere(ces ha* bee( co-p,te* )( the oppos)te or*er the( the
alter(at)4e hypotheses wo,l* ha4e bee( H a:µd<0.E
Saylor URL: http://www.saylor.org/books Saylor.org96
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 499/723
Figure .;6eEection 6egion and )est %tatistic for 'ote .2A >(ample ">
Saylor URL: http://www.saylor.org/books Saylor.org99
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 500/723
&he *ata pro4)*e s,?c)e(t e4)*e(ce at the 0G le4el of s)g()+ca(ce to co(cl,*e that
the -ea( f,el eco(o-y pro4)*e* by &ype 1 gasol)(e )s greater tha( that for &ype 2
gasol)(e.
EKAPLE 9Perfor- the test of Note 9.2; QE7a-ple 6Q ,s)(g the p=4al,e approach.
Sol,t)o(:
&he +rst three steps are )*e(t)cal to those )( Note 9.2; QE7a-ple 6Q.
•Step . #eca,se the test )s o(e=ta)le* the obser4e* s)g()+ca(ce or p=4al,e ofthe test )s >,st the area of the r)ght ta)l of St,*e(tWs t =*)str)b,t)o( w)th 6
*egrees of free*o- that )s c,t o8 by the test stat)st)c ) 2.;;. Be ca( o(ly
appro7)-ate th)s (,-ber. Look)(g )( the row of )g,re 12.3 Q%r)t)cal 'al,es of
Q hea*e* df=8 the (,-ber 2.;; )s betwee( the (,-bers 2.3; a(* 2.69
correspo(*)(g tot ;.;20 a(* t ;.;1;.
&he area c,t o8 by t 2.3; )s ;.;20 a(* the area c,t o8 by t 2.69 )s;.;1;. S)(ce 2.;; )s betwee( 2.3; a(* 2.69 the area )t c,ts o8 )s betwee(
;.;20 a(* ;.;1;. &h,s the p=4al,e )s betwee( ;.;20 a(* ;.;1;. ( part)c,lar )t
)s less tha( ;.;20. See )g,re 9..
Figure .59+<alue for 'ote .21 >(ample >
Saylor URL: http://www.saylor.org/books Saylor.org0;;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 501/723
• Step 0. S)(ce ;.;20 _ ;.;0 p<α so the *ec)s)o( )s to re>ect the (,ll hypothes)s:
&he *ata pro4)*e s,?c)e(t e4)*e(ce at the 0G le4el of s)g()+ca(ce to co(cl,*e that
the -ea( f,el eco(o-y pro4)*e* by &ype 1 gasol)(e )s greater tha( that for &ype 2
gasol)(e.
The paired two-sample experiment is a very powerful study design. ,t bypasses many unwanted
sources of Gstatistical noiseH that might otherwise influence the outcome of the experiment$ and
focuses on the possible difference that might arise from the one factor of interest.
,f the sample is large *meaning that n _ 35) then in the formula for the confidence interval we may
replacetα/2 byzα/2./or hypothesis testing when the number of pairs is at least 35$ we may use the same
statistic as for small samples for hypothesis testing$ except now it follows a standard normal
distribution$ so we use the last line of /igure 1!.3 07ritical Dalues of 0 to compute critical values$
and p-values can be computed exactly with /igure 1!.! 07umulative 9ormal robability0$ not merely
estimated using /igure 1!.3 07ritical Dalues of 0.
*+, TA*+AA,S
• Bhe( the *ata are collecte* )( pa)rs the *)8ere(ces co-p,te* for each pa)r are
the *ata that are ,se* )( the for-,las.
Saylor URL: http://www.saylor.org/books Saylor.org0;1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 502/723
• A co(+*e(ce )(ter4al for the *)8ere(ce )( two pop,lat)o( -ea(s ,s)(g pa)re*
sa-pl)(g )s co-p,te* ,s)(g a for-,la )( the sa-e fash)o( as was *o(e for a
s)(gle pop,lat)o( -ea(.
•
• &he sa-e +4e=step proce*,re ,se* to test hypotheses co(cer()(g a s)(gle
pop,lat)o( -ea( )s ,se* to test hypotheses co(cer()(g the *)8ere(ce betwee(
two pop,lat)o( -ea(s ,s)(g pa)r sa-pl)(g. &he o(ly *)8ere(ce )s )( the for-,la
for the sta(*ar*)Ve* test stat)st)c.
Saylor URL: http://www.saylor.org/books Saylor.org0;2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 503/723
Saylor URL: http://www.saylor.org/books Saylor.org0;3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 504/723
4ouse Count1 )overnment rivate Compan1
1 217 219
2 350 338
3 296 291
4 237 237
5 237 235
6 272 269
Saylor URL: http://www.saylor.org/books Saylor.org0;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 505/723
4ouse Count1 )overnment rivate Compan1
7 257 239
8 277 275
9 312 32010 335 335
a !)4e a po)(t est)-ate for the *)8ere(ce betwee( the -ea( pr)4ate appra)sal of
all s,ch ho-es a(* the go4er(-e(t appra)sal of all s,ch ho-es.
b %o(str,ct the 99G co(+*e(ce )(ter4al base* o( these *ata for the *)8ere(ce.
c &est at the 1G le4el of s)g()+ca(ce the hypothes)s that appra)se* 4al,es by
the co,(ty go4er(-e(t of all s,ch ho,ses )s greater tha( the appra)se* 4al,es
by the pr)4ate appra)sal co-pa(y.
6 ( or*er to c,t costs a w)(e pro*,cer )s co(s)*er)(g ,s)(g *,o or 1 1 corks )( place
of f,ll (at,ral woo* corks b,t )s co(cer(e* that )t co,l* a8ect b,yersWs percept)o( of
the <,al)ty of the w)(e. &he w)(e pro*,cer sh)ppe* e)ght pa)rs of bottles of )ts best
yo,(g w)(es to e)ght w)(e e7perts. Each pa)r )(cl,*es o(e bottle w)th a (at,ral woo*
cork a(* o(e w)th a *,o cork. &he e7perts are aske* to rate the w)(es o( a o(e to te(
scale h)gher (,-bers correspo(*)(g to h)gher <,al)ty. &he res,lts are:
9ine $:pert %uo Cork 9ood Cork
1 8.5 8.5
2 8.0 8.5
3 6.5 8.0
4 7.5 8.5
5 8.0 7.5
6 8.0 8.0
7 9.0 9.0
8 7.0 7.5
a !)4e a po)(t est)-ate for the *)8ere(ce betwee( the -ea( rat)(gs of the w)(e
whe( bottle* are seale* w)th *)8ere(t k)(*s of corks.
b %o(str,ct the 9;G co(+*e(ce )(ter4al base* o( these *ata for the *)8ere(ce.
c &est at the 1;G le4el of s)g()+ca(ce the hypothes)s that o( the a4erage *,o
corks *ecrease the rat)(g of the w)(e.
9 E(g)(eers at a t)re -a(,fact,r)(g corporat)o( w)sh to test a (ew t)re -ater)al for
)(crease* *,rab)l)ty. &o test the t)res ,(*er real)st)c roa* co(*)t)o(s (ew fro(t t)res are
Saylor URL: http://www.saylor.org/books Saylor.org0;0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 506/723
-o,(te* o( each of 11 co-pa(y cars o(e t)re -a*e w)th a pro*,ct)o( -ater)al a(* the
other w)th the e7per)-e(tal -ater)al. After a +7e* per)o* the 11 pa)rs were -eas,re*
for wear. &he a-o,(t of wear for each t)re )( --5 )s show( )( the table:
Car roduction $:perimental
1 5.1 5.0
2 6.5 6.5
3 3.6 3.1
4 3.5 3.7
5 5.7 4.5
6 5.0 4.1
7 6.4 5.3
8 4.7 2.6
9 3.2 3.0
10 3.5 3.5
11 6.4 5.1
a !)4e a po)(t est)-ate for the *)8ere(ce )( -ea( wear.
b %o(str,ct the 99G co(+*e(ce )(ter4al for the *)8ere(ce base* o( these *ata.
c &est at the 1G le4el of s)g()+ca(ce the hypothes)s that the -ea( wear w)th
the e7per)-e(tal -ater)al )s less tha( that for the pro*,ct)o( -ater)al.
1; A -arr)age co,(selor a*-)()stere* a test *es)g(e* to -eas,re o4erall co(te(t-e(t to
3; ra(*o-ly selecte* -arr)e* co,ples. &he scores for each co,ple are g)4e( below. A
h)gher (,-ber correspo(*s to greater co(te(t-e(t or happ)(ess.
Couple 4usband 9ife
1 47 44
2 44 46
3 49 44
4 53 44
5 42 43
6 45 45
7 48 47
8 45 44
Saylor URL: http://www.saylor.org/books Saylor.org0;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 507/723
Couple 4usband 9ife
9 52 44
10 47 42
11 40 3412 45 42
13 40 43
14 46 41
15 47 45
16 46 45
17 46 41
18 46 41
19 44 45
20 45 43
21 48 38
22 42 46
23 50 44
24 46 51
25 43 45
26 50 40
27 46 46
28 42 41
29 51 41
30 46 47
a &est at the 1G le4el of s)g()+ca(ce the hypothes)s that o( a4erage -e( a(*
wo-e( are (ot e<,ally happy )( -arr)age.
b &est at the 1G le4el of s)g()+ca(ce the hypothes)s that o( a4erage -e( are
happ)er tha( wo-e( )( -arr)age.
LAR!E A&A S E& EKE R%SES
Saylor URL: http://www.saylor.org/books Saylor.org0;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 508/723
11 Large ata Set 0 l)sts the scores for 20 ra(*o-ly selecte* st,*e(ts o( pract)ce SA&
rea*)(g tests before a(* after tak)(g a two=week SA& preparat)o( co,rse. e(ote the
pop,lat)o( of all st,*e(ts who ha4e take( the co,rse as Pop,lat)o( 1 a(* the pop,lat)o(
of all st,*e(ts who ha4e (ot take( the co,rse as Pop,lat)o( 2.
http://www.0.7ls
a %o-p,te the 20 *)8ere(ces )( the or*er after− before the)r -ea( d− a(* the)r
sa-ple sta(*ar* *e4)at)o( sd.
b !)4e a po)(t est)-ate for µd=µ1−µ2 the *)8ere(ce )( the -ea( score of all
st,*e(ts who ha4e take( the co,rse a(* the -ea( score of all who ha4e (ot.
c %o(str,ct a 96G co(+*e(ce )(ter4al for µd.
* &est at the 1G le4el of s)g()+ca(ce the hypothes)s that the -ea( SA& score
)(creases by at least te( po)(ts by tak)(g the two=week preparat)o( co,rse.
12 Large ata Set 12 l)sts the scores o( o(e ro,(* for 0 ra(*o-ly selecte* -e-bers at a
golf co,rse +rst ,s)(g the)r ow( or)g)(al cl,bs the( two -o(ths later after ,s)(g (ew
cl,bs w)th a( e7per)-e(tal *es)g(. e(ote the pop,lat)o( of all golfers ,s)(g the)r ow(
or)g)(al cl,bs as Pop,lat)o( 1 a(* the pop,lat)o( of all golfers ,s)(g the (ew style cl,bs
as Pop,lat)o( 2.
http://www.12.7ls
a %o-p,te the 0 *)8ere(ces )( the or*er original clubs− new clubs the)r -ea( d− a(*
the)r sa-ple sta(*ar* *e4)at)o( sd.
b !)4e a po)(t est)-ate for µd=µ1−µ2 the *)8ere(ce )( the -ea( score of all
golfers ,s)(g the)r or)g)(al cl,bs a(* the -ea( score of all golfers ,s)(g the
(ew k)(* of cl,bs.
c %o(str,ct a 9;G co(+*e(ce )(ter4al for µd.
* &est at the 1G le4el of s)g()+ca(ce the hypothes)s that the -ea( golf score
*ecreases by at least o(e stroke by ,s)(g the (ew k)(* of cl,bs.
Saylor URL: http://www.saylor.org/books Saylor.org0;6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 509/723
13 %o(s)*er the pre4)o,s proble- aga)(. S)(ce the *ata set )s so large )t )s reaso(able to
,se the sta(*ar* (or-al *)str)b,t)o( )(stea* of St,*e(tWs t =*)str)b,t)o( w)th *egrees
of free*o-.
a %o(str,ct a 9;G co(+*e(ce )(ter4al for µd,s)(g the sta(*ar* (or-al
*)str)b,t)o( -ea()(g that the for-,la )s d−±zα/2sdn−−√.&he co-p,tat)o(s *o(e
)( part a5 of the pre4)o,s proble- st)ll apply a(* (ee* (ot be re*o(e.5 @ow
*oes the res,lt obta)(e* here co-pare to the res,lt obta)(e* )( part c5 of the
pre4)o,s proble-C
b &est at the 1G le4el of s)g()+ca(ce the hypothes)s that the -ea( golf score
*ecreases by at least o(e stroke by ,s)(g the (ew k)(* of cl,bs ,s)(g the
sta(*ar* (or-al *)str)b,t)o(. All the work *o(e )( part *5 of the pre4)o,s
proble- appl)es e7cept the cr)t)cal 4al,e )s (ow zα )(stea* of tα or the p=4al,e
ca( be co-p,te* e7actly )(stea* of o(ly appro7)-ate* )f yo, ,se* the p=
4al,e approach5.5 @ow *oes the res,lt obta)(e* here co-pare to the res,lt
obta)(e* )( part c5 of the pre4)o,s proble-C
c %o(str,ct the 99G co(+*e(ce )(ter4als for µd,s)(g both the t-a(* z-
*)str)b,t)o(s. @ow -,ch *)8ere(ce )s there )( the res,lts (owC
Saylor URL: http://www.saylor.org/books Saylor.org0;9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 510/723
Saylor URL: http://www.saylor.org/books Saylor.org01;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 511/723
6.7 Comparison o Two 2opulation2roportions
!+A/N&N: 1';+CT&<+S
1 &o lear( how to co(str,ct a co(+*e(ce )(ter4al for the *)8ere(ce )( the
proport)o(s of two *)st)(ct pop,lat)o(s that ha4e a part)c,lar character)st)c of
)(terest.
2 &o lear( how to perfor- a test of hypotheses co(cer()(g the *)8ere(ce )( the
proport)o(s of two *)st)(ct pop,lat)o(s that ha4e a part)c,lar character)st)c of
)(terest.
+uppose we wish to compare the proportions of two populations that have a specific characteristic$
such as the proportion of men who are left-handed compared to the proportion of women who are
left-handed. /igure F.@ 0,ndependent +ampling from Two opulations ,n Order to 7ompare
roportions0 illustrates the conceptual framework of our investigation. ach population is divided
into two groups$ the group of elements that have the characteristic of interest *for example$ being
left-handed) and the group of elements that do not. (e arbitrarily label one population as
opulation 1 and the other as opulation !$ and subscript the proportion of each population that
possesses the characteristic with the number 1 or ! to tell them apart. (e draw a random sample
from opulation 1 and label the sample statistic it yields with the subscript 1. (ithout reference to
the first sample we draw a sample from opulation ! and label its sample statistic with the subscript
!.
!igure 4.1 >ndependent %ampling from Two $opulations >n )rder to <ompare $roportions
Saylor URL: http://www.saylor.org/books Saylor.org011
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 512/723
Our goal is to use the information in the samples to estimate the difference p1− p2in the
two population proportions and to make statistically valid inferences about it.
Con)dence &ntervals
+ince the sample proportion p1computed using the sample drawn from opulation 1 is a good estimator
of population proportion p1 of opulation 1 and the sample proportion p2computed using the sample
drawn from opulation ! is a good estimator of population proportion p! of opulation !$ a reasonable
point estimate of the difference p1− p2 is p 1− p 2.,n order to widen this point estimate into a confidence
interval we suppose that both samples are large$ as described in +ection @.3 0arge +ample stimation of a
opulation roportion0 in 7hapter @ 0stimation0 and repeated below. ,f so$ then the following formula
for a confidence interval for p1− p2is valid.
Saylor URL: http://www.saylor.org/books Saylor.org012
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 513/723
Saylor URL: http://www.saylor.org/books Saylor.org013
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 514/723
Saylor URL: http://www.saylor.org/books Saylor.org01
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 515/723
Saylor URL: http://www.saylor.org/books Saylor.org010
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 516/723
The three forms of the alternative hypothesis$ with the terminology for each case$ are:
Form ofHa Terminolog1
Ha: p1− p2<D0 ?e'-'ai"ed
Ha: p1− p2>D0 igh'-'ai"ed
Ha: p1− p2≠D0 ,/o-'ai"ed
Saylor URL: http://www.saylor.org/books Saylor.org01
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 517/723
%s long as the samples are independent and both are large the following formula for the standardi'ed
test statistic is valid$ and it has the standard normal distribution.
EKAPLE 11
Us)(g the *ata of Note 9.20 QE7a-ple 1;Q test whether there )s s,?c)e(t
e4)*e(ce to co(cl,*e that p,bl)c web access to the )(spect)o( recor*s has
)(crease* the proport)o( of pro>ects that passe* o( the +rst )(spect)o( by -ore
tha( 0 perce(tage po)(ts. Use the cr)t)cal 4al,e approach at the 1;G le4el of
s)g()+ca(ce.
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org01
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 518/723
• Step 1. &ak)(g )(to acco,(t the label)(g of the pop,lat)o(s a( )(crease )(
pass)(g rate at the +rst )(spect)o( by -ore tha( 0 perce(tage po)(ts after
p,bl)c access o( the web -ay be e7presse* as p2> p1+0.05 wh)ch by algebra )s
the sa-e as p1− p2<−0.05. &h)s )s the alter(at)4e hypothes)s. S)(ce the (,ll
hypothes)s )s always e7presse* as a( e<,al)ty w)th the sa-e (,-ber o( the
r)ght as )s )( the alter(at)4e hypothes)s the test )s
Saylor URL: http://www.saylor.org/books Saylor.org016
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 519/723
• &he *ata pro4)*e s,?c)e(t e4)*e(ce at the 1;G le4el of s)g()+ca(ce to
co(cl,*e that the rate of pass)(g o( the +rst )(spect)o( has )(crease* by
-ore tha( 0 perce(tage po)(ts s)(ce recor*s were p,bl)cly poste* o( the web.
Figure ."6eEection 6egion and )est %tatistic for 'ote .27 >(ample 11>
+A>2!+ %0
Perfor- the test of Note 9.2 QE7a-ple 11Q ,s)(g the p=4al,e approach.
Sol,t)o(:
&he +rst three steps are )*e(t)cal to those )( Note 9.2 QE7a-ple 11Q.
• Step . #eca,se the test )s left=ta)le* the obser4e* s)g()+ca(ce or p=4al,e of the
test )s >,st the area of the left ta)l of the sta(*ar* (or-al *)str)b,t)o( that )s c,t
o8 by the test stat)st)c Z=−1.770.ro- )g,re 12.2 Q%,-,lat)4e Nor-al
Probab)l)tyQ the area of the left ta)l *eter-)(e* by X1. )s ;.;36. &he p=4al,e )s
;.;36.
Saylor URL: http://www.saylor.org/books Saylor.org019
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 520/723
• Step 0. S)(ce the p=4al,e ;.;36 )s less tha( α=0.10 the *ec)s)o( )s to re>ect the
(,ll hypothes)s: &he *ata pro4)*e s,?c)e(t e4)*e(ce at the 1;G le4el of
s)g()+ca(ce to co(cl,*e that the rate of pass)(g o( the +rst )(spect)o( has
)(crease* by -ore tha( 0 perce(tage po)(ts s)(ce recor*s were p,bl)cly poste*
o( the web.
/inally a common misuse of the formulas given in this section must be mentioned. +uppose a large
pre-election survey of potential voters is conducted. ach person surveyed is asked to express a
preference between$ say$ 7andidate % and 7andidate 4. *erhaps Gno preferenceH or GotherH are also
choices$ but that is not important.) ,n such a survey$ estimators p A and p B of p Aand p B can be
calculated. ,t is important to reali'e$ however$ that these two estimators were not calculated from two
independent samples. (hile p A− p B may be a reasonable estimator of pA− pB the formulas for
confidence intervals and for the standardi'ed test statistic given in this section are not valid for data
obtained in this manner.
*+, TA*+AA,S
• A co(+*e(ce )(ter4al for the *)8ere(ce )( two pop,lat)o( proport)o(s )s co-p,te*
,s)(g a for-,la )( the sa-e fash)o( as was *o(e for a s)(gle pop,lat)o( -ea(.
• &he sa-e +4e=step proce*,re ,se* to test hypotheses co(cer()(g a s)(gle
pop,lat)o( proport)o( )s ,se* to test hypotheses co(cer()(g the *)8ere(ce
betwee( two pop,lat)o( proport)o(s. &he o(ly *)8ere(ce )s )( the for-,la for the
sta(*ar*)Ve* test stat)st)c.
Saylor URL: http://www.saylor.org/books Saylor.org02;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 521/723
Saylor URL: http://www.saylor.org/books Saylor.org021
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 522/723
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 523/723
Saylor URL: http://www.saylor.org/books Saylor.org023
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 524/723
Saylor URL: http://www.saylor.org/books Saylor.org02
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 525/723
Saylor URL: http://www.saylor.org/books Saylor.org020
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 526/723
b Test H 0: p1− p2=0.30vs. H a: p1− p2≠0.30I α=0.10@
n1=7500@ p1=0.664
n2=1000@ p2=0.319
Saylor URL: http://www.saylor.org/books Saylor.org02
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 527/723
A22!&CAT&1NS
( all the re-a)()(g e7ercs)ses the sa-ples are s,?c)e(tly large so th)s (ee* (ot
be checke*5.
13 'oters )( a part)c,lar c)ty who )*e(t)fy the-sel4es w)th o(e or the other of two pol)t)cal
part)es were ra(*o-ly selecte* a(* aske* )f they fa4or a proposal to allow c)t)Ve(s w)th
proper l)ce(se to carry a co(ceale* ha(*g,( )( c)ty parks. &he res,lts are:
art1 A art1 6
a%$"e +iBe n 150 200
N&%ber in avor x 90 140
a !)4e a po)(t est)-ate for the *)8ere(ce )( the proport)o( of all -e-bers of
Party A a(* all -e-bers of Party # who fa4or the proposal.
b %o(str,ct the 90G co(+*e(ce )(ter4al for the *)8ere(ce base* o( these *ata.
c &est at the 0G le4el of s)g()+ca(ce the hypothes)s that the proport)o( of all
-e-bers of Party A who fa4or the proposal )s less tha( the proport)o( of all
-e-bers of Party # who *o.
* %o-p,te the p=4al,e of the test.
1 &o )(4est)gate a poss)ble relat)o( betwee( ge(*er a(* ha(*e*(ess a ra(*o- sa-ple of
32; a*,lts was take( w)th the follow)(g res,lts:
#en 9omen
a%$"e +iBe n 168 152
N&%ber o "e'-handed x 24 9
a !)4e a po)(t est)-ate for the *)8ere(ce )( the proport)o( of all -e( who are
left=ha(*e* a(* the proport)o( of all wo-e( who are left=ha(*e*.
b %o(str,ct the 90G co(+*e(ce )(ter4al for the *)8ere(ce base* o( these *ata.
c &est at the 0G le4el of s)g()+ca(ce the hypothes)s that the proport)o( of -e(
who are left=ha(*e* )s greater tha( the proport)o( of wo-e( who are.
* %o-p,te the p=4al,e of the test.
10 A local school boar* -e-ber ra(*o-ly sa-ple* pr)4ate a(* p,bl)c h)gh school teachers
)( h)s *)str)ct to co-pare the proport)o(s of Nat)o(al #oar* %ert)+e* N#%5 teachers )(
the fac,lty. &he res,lts were:
rivate Schools ublic Schools
a%$"e +iBe n 80 520
Saylor URL: http://www.saylor.org/books Saylor.org02
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 528/723
rivate Schools ublic Schools
*ro$or'ion o N: 'ea!her+ p 0.175 0.150
a !)4e a po)(t est)-ate for the *)8ere(ce )( the proport)o( of all teachers )(
area p,bl)c schools a(* the proport)o( of all teachers )( pr)4ate schools whoare Nat)o(al #oar* %ert)+e*.
b %o(str,ct the 9;G co(+*e(ce )(ter4al for the *)8ere(ce base* o( these *ata.
c &est at the 1;G le4el of s)g()+ca(ce the hypothes)s that the proport)o( of all
p,bl)c school teachers who are Nat)o(al #oar* %ert)+e* )s less tha( the
proport)o( of pr)4ate school teachers who are.
* %o-p,te the p=4al,e of the test.
1 ( profess)o(al basketball ga-es the fa(s of the ho-e tea- always try to *)stract free
throw shooters o( the 4)s)t)(g tea-. &o )(4est)gate whether th)s tact)c )s act,ally
e8ect)4e the free throw stat)st)cs of a profess)o(al basketball player w)th a h)gh free
throw perce(tage were e7a-)(e*. ,r)(g the e(t)re last seaso( th)s player ha* 0
free throws 2; )( ho-e ga-es a(* 23 )( away ga-es. &he res,lts are s,--ar)Ve*
below.
4ome A3a1
a%$"e +iBe n 420 236
ree 'hro/ $er!en' p 81.5D 78.8D
a !)4e a po)(t est)-ate for the *)8ere(ce )( the proport)o( of free throws -a*eat ho-e a(* away.
b %o(str,ct the 9;G co(+*e(ce )(ter4al for the *)8ere(ce base* o( these *ata.
c i&est at the 1;G le4el of s)g()+ca(ce the hypothes)s that there e7)sts a ho-e
a*4a(tage )( free throws.
* %o-p,te the p=4al,e of the test.
1 Ra(*o-ly selecte* -)**le=age* people )( both %h)(a a(* the U()te* States were aske*
)f they bel)e4e* that a*,lts ha4e a( obl)gat)o( to +(a(c)ally s,pport the)r age* pare(ts.
&he res,lts are s,--ar)Ve* below.
China USA
a%$"e +iBe n 1300 150
N&%ber o e+ x 1170 110
&est at the 1G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t e4)*e(ce to
co(cl,*e that there e7)sts a c,lt,ral *)8ere(ce )( att)t,*e regar*)(g th)s <,est)o(.
Saylor URL: http://www.saylor.org/books Saylor.org026
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 529/723
16 A -a(,fact,rer of walk=beh)(* p,sh -owers rece)4es ref,rb)she* s-all e(g)(es fro-
two (ew s,ppl)ers 0 a(* -. t )s (ot ,(co--o( that so-e of the ref,rb)she* e(g)(es
(ee* to be l)ghtly ser4)ce* before they ca( be +tte* )(to -owers. &he -ower
-a(,fact,rer rece(tly rece)4e* 1;; e(g)(es fro- each s,ppl)er. ( the sh)p-e(t fro- 0
13 (ee*e* f,rther ser4)ce. ( the sh)p-e(t fro- - 1; (ee*e* f,rther ser4)ce. &est atthe 1;G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t e4)*e(ce to co(cl,*e
that there e7)sts a *)8ere(ce )( the proport)o(s of e(g)(es fro- the two s,ppl)ers
(ee*)(g ser4)ce.
LAR!E A&A S E& EKE R%SES
19 Large ata Sets A a(* # recor* res,lts of a ra(*o- s,r4ey of 2;; 4oters )( each of
two reg)o(s )( wh)ch they were aske* to e7press whether they prefer %a(*)*ate 0for a
U.S. Se(ate seat or prefer so-e other ca(*)*ate. Let the pop,lat)o( of all 4oters )(
reg)o( 1 be *e(ote* Pop,lat)o( 1 a(* the pop,lat)o( of all 4oters )( reg)o( 2 be *e(ote*
Pop,lat)o( 2. Let p1 be the proport)o( of 4oters )( Pop,lat)o( 1 who prefer %a(*)*ate 0
a(* p2 the proport)o( )( Pop,lat)o( 2 who *o.
http://www.A.7ls
http://www.#.7ls
a )(* the rele4a(t sa-ple proport)o(s p1a(* p2.
b %o(str,ct a po)(t est)-ate for p1− p2.
c %o(str,ct a 90G co(+*e(ce )(ter4al for p1− p2.
* &est at the 0G le4el of s)g()+ca(ce the hypothes)s that the sa-e proport)o(
of 4oters )( the two reg)o(s fa4or %a(*)*ate 0 aga)(st the alter(at)4e that a
larger proport)o( )( Pop,lat)o( 2 *o.
2; Large ata Set 11 recor*s the res,lts of sa-ples of real estate sales )( a certa)( reg)o(
)( the year 2;;6 l)(es 2 thro,gh 035 a(* )( the year 2;1; l)(es 03 thro,gh 11;5.
oreclos,re sales are )*e(t)+e* w)th a 1 )( the seco(* col,-(. Let all real estate sales )(
the reg)o( )( 2;;6 be Pop,lat)o( 1 a(* all real estate sales )( the reg)o( )( 2;1; be
Pop,lat)o( 2.
Saylor URL: http://www.saylor.org/books Saylor.org029
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 530/723
http://www.11.7ls
a Use the sa-ple *ata to co(str,ct po)(t est)-ates p1a(* p2of the
proport)o(s p1 a(* p2 of all real estate sales )( th)s reg)o( )( 2;;6 a(* 2;1;
that were foreclos,re sales. %o(str,ct a po)(t est)-ate of p1− p2.
b Use the sa-ple *ata to co(str,ct a 9;G co(+*e(ce for p1− p2.
c &est at the 1;G le4el of s)g()+ca(ce the hypothes)s that the proport)o( of
real estate sales )( the reg)o( )( 2;1; that were foreclos,re sales was greater
tha( the proport)o( of real estate sales )( the reg)o( )( 2;;6 that were
foreclos,re sales. &he *efa,lt )s that the proport)o(s were the sa-e.5
Saylor URL: http://www.saylor.org/books Saylor.org03;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 531/723
Saylor URL: http://www.saylor.org/books Saylor.org031
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 532/723
6.8 Sample Sie Considerations
!+A/N&N: 1';+CT&<+
1 &o lear( how to apply for-,las for est)-at)(g the s)Ve sa-ples that w)ll be
(ee*e* )( or*er to co(str,ct a co(+*e(ce )(ter4al for the *)8ere(ce )( two
pop,lat)o( -ea(s or proport)o(s that -eets g)4e( cr)ter)a.
%s was pointed out at the beginning of +ection @.6 0+ample +i'e 7onsiderations0in 7hapter @
0stimation0$ sampling is typically done with definite ob#ectives in mind. /or example$ a physician
might wish to estimate the difference in the average amount of sleep gotten by patients suffering a
certain condition with the average amount of sleep got by healthy adults$ at F5B confidence and to
within half an hour. +ince sampling costs time$ effort$ and money$ it would be useful to be able to
estimate the smallest si'e samples that are likely to meet these criteria.
Saylor URL: http://www.saylor.org/books Saylor.org032
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 533/723
Saylor URL: http://www.saylor.org/books Saylor.org033
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 534/723
Saylor URL: http://www.saylor.org/books Saylor.org03
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 535/723
Saylor URL: http://www.saylor.org/books Saylor.org030
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 536/723
Saylor URL: http://www.saylor.org/books Saylor.org03
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 537/723
Saylor URL: http://www.saylor.org/books Saylor.org03
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 538/723
Saylor URL: http://www.saylor.org/books Saylor.org036
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 539/723
*+, TA*+AA,S
• f the pop,lat)o( sta(*ar* *e4)at)o(s σ1a(* σ2are k(ow( or ca( be est)-ate*
the( the -)()-,- e<,al s)Ves of )(*epe(*e(t sa-ples (ee*e* to obta)( a
co(+*e(ce )(ter4al for the *)8ere(ce µ1−µ2 )( two pop,lat)o( -ea(s w)th a g)4e(
-a7)-,- error of the est)-ate ( a(* a g)4e( le4el of co(+*e(ce ca( be
est)-ate*.
• f the sta(*ar* *e4)at)o( σdof the pop,lat)o( of *)8ere(ces )( pa)rs *raw( fro-
two pop,lat)o(s )s k(ow( or ca( be est)-ate* the( the -)()-,- (,-ber of
sa-ple pa)rs (ee*e* ,(*er pa)re* *)8ere(ce sa-pl)(g to obta)( a co(+*e(ce
)(ter4al for the *)8ere(ce µd=µ1−µ2 )( two pop,lat)o( -ea(s w)th a g)4e( -a7)-,-
error of the est)-ate ( a(* a g)4e( le4el of co(+*e(ce ca( be est)-ate*.
• &he -)()-,- e<,al sa-ple s)Ves (ee*e* to obta)( a co(+*e(ce )(ter4al for the
*)8ere(ce )( two pop,lat)o( proport)o(s w)th a g)4e( -a7)-,- error of theest)-ate a(* a g)4e( le4el of co(+*e(ce ca( always be est)-ate*. f there )s pr)or
k(owle*ge of the pop,lat)o( proport)o(s p1 a(* p2 the( the est)-ate ca( be
sharpe(e*.
++/C&S+S
'AS&C
1 Est)-ate the co--o( sa-ple s)Ve n of e<,ally s)Ve* )(*epe(*e(t sa-ples (ee*e* to
est)-ate µ1−µ2as spec)+e* whe( the pop,lat)o( sta(*ar* *e4)at)o(s are as show(.
a 9;G co(+*e(ce to w)th)( 3 ,()ts σ1=10a(* σ2=7
b 99G co(+*e(ce to w)th)( ,()ts σ1=6.8a(* σ2=9.3
c 90G co(+*e(ce to w)th)( 0 ,()ts σ1=22.6a(* σ2=31.8
2 Est)-ate the co--o( sa-ple s)Ve n of e<,ally s)Ve* )(*epe(*e(t sa-ples (ee*e* to
est)-ate µ1−µ2as spec)+e* whe( the pop,lat)o( sta(*ar* *e4)at)o(s are as show(.
a 6;G co(+*e(ce to w)th)( 2 ,()ts σ1=14a(* σ2=23
b 9;G co(+*e(ce to w)th)( ;.3 ,()ts σ1=1.3a(* σ2=0.8
c 99G co(+*e(ce to w)th)( 11 ,()ts σ1=42a(* σ2=37
3 Est)-ate the (,-ber n of pa)rs that -,st be sa-ple* )( or*er to est)-ate µd=µ1−µ2as
spec)+e* whe( the sta(*ar* *e4)at)o( sd of the pop,lat)o( of *)8ere(ces )s as show(.
a 6;G co(+*e(ce to w)th)( ,()ts σd=26.5
b 90G co(+*e(ce to w)th)( ,()ts σd=12
Saylor URL: http://www.saylor.org/books Saylor.org039
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 540/723
c 9;G co(+*e(ce to w)th)( 0.2 ,()ts σd=11.3
Est)-ate the (,-ber n of pa)rs that -,st be sa-ple* )( or*er to est)-ate µd=µ1−µ2as
spec)+e* whe( the sta(*ar* *e4)at)o( sd of the pop,lat)o( of *)8ere(ces )s as show(.
a 9;G co(+*e(ce to w)th)( 2; ,()ts σd=75.5
b 90G co(+*e(ce to w)th)( 11 ,()ts σd=31.4
c 99G co(+*e(ce to w)th)( 1.6 ,()ts σd=4
0 Est)-ate the -)()-,- e<,al sa-ple s)Ves n1=n2(ecessary )( or*er to est)-ate p1− p2as
spec)+e*.
a 6;G co(+*e(ce to w)th)( ;.;0 +4e perce(tage po)(ts5
1 whe( (o pr)or k(owle*ge of p1 or p2 )s a4a)lable
2 whe( pr)or st,*)es )(*)cate that p1≈0.20a(* p2≈0.65
b 9;G co(+*e(ce to w)th)( ;.;2 two perce(tage po)(ts5
1 whe( (o pr)or k(owle*ge of p1 or p2 )s a4a)lable
2 whe( pr)or st,*)es )(*)cate that p1≈0.75a(* p2≈0.63
c 90G co(+*e(ce to w)th)( ;.1; te( perce(tage po)(ts5
1 whe( (o pr)or k(owle*ge of p1 or p2 )s a4a)lable
2 whe( pr)or st,*)es )(*)cate that p1≈0.11a(* p2≈0.37
Est)-ate the -)()-,- e<,al sa-ple s)Ves n1=n2(ecessary )( or*er to
est)-ate p1− p2as spec)+e*.
a 6;G co(+*e(ce to w)th)( ;.;2 two perce(tage po)(ts5
a whe( (o pr)or k(owle*ge of p1 or p2 )s a4a)lable
b whe( pr)or st,*)es )(*)cate that p1≈0.78a(* p2≈0.65
b 9;G co(+*e(ce to w)th)( ;.;0 two perce(tage po)(ts5
a whe( (o pr)or k(owle*ge of p1 or p2 )s a4a)lable
b whe( pr)or st,*)es )(*)cate that p1≈0.12a(* p2≈0.24
c 90G co(+*e(ce to w)th)( ;.1; te( perce(tage po)(ts5
a whe( (o pr)or k(owle*ge of p1 or p2 )s a4a)lable
b whe( pr)or st,*)es )(*)cate that p1≈0.14a(* p2≈0.21
A22!&CAT&1NS
A( e*,cat)o(al researcher w)shes to est)-ate the *)8ere(ce )( a4erage scores of
ele-e(tary school ch)l*re( o( two 4ers)o(s of a 1;;=po)(t sta(*ar*)Ve* test at 99G
co(+*e(ce a(* to w)th)( two po)(ts. Est)-ate the -)()-,- e<,al sa-ple s)Ves
Saylor URL: http://www.saylor.org/books Saylor.org0;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 541/723
(ecessary )f )t )s k(ow( that the sta(*ar* *e4)at)o( of scores o( *)8ere(t 4ers)o(s of
s,ch tests )s .9.
6 A ,()4ers)ty a*-)()strator w)shes to est)-ate the *)8ere(ce )( -ea( gra*e po)(t
a4erages a-o(g all -e( a?l)ate* w)th frater()t)es a(* all ,(a?l)ate* -e( w)th 90G
co(+*e(ce a(* to w)th)( ;.10. t )s k(ow( fro- pr)or st,*)es that the sta(*ar**e4)at)o(s of gra*e po)(t a4erages )( the two gro,ps ha4e co--o( 4al,e ;.. Est)-ate
the -)()-,- e<,al sa-ple s)Ves (ecessary to -eet these cr)ter)a.
9 A( a,to-ot)4e t)re -a(,fact,rer w)shes to est)-ate the *)8ere(ce )( -ea( wear of t)res
-a(,fact,re* w)th a( e7per)-e(tal -ater)al a(* or*)(ary pro*,ct)o( t)re w)th 9;G
co(+*e(ce a(* to w)th)( ;.0 --. &o el)-)(ate e7tra(eo,s factors ar)s)(g fro- *)8ere(t
*r)4)(g co(*)t)o(s the t)res w)ll be teste* )( pa)rs o( the sa-e 4eh)cles. t )s k(ow( fro-
pr)or st,*)es that the sta(*ar* *e4)at)o(s of the *)8ere(ces of wear of t)res co(str,cte*
w)th the two k)(*s of -ater)als )s 1.0 --. Est)-ate the -)()-,- (,-ber of pa)rs )(
the sa-ple (ecessary to -eet these cr)ter)a.
1; &o assess to the relat)4e happ)(ess of -e( a(* wo-e( )( the)r -arr)ages a -arr)age
co,(selor pla(s to a*-)()ster a test -eas,r)(g happ)(ess )( -arr)age ton ra(*o-ly
selecte* -arr)e* co,ples recor* the the)r test scores +(* the *)8ere(ces a(* the(
*raw )(fere(ces o( the poss)ble *)8ere(ce. Let µ1a(* µ2be the tr,e a4erage le4els of
happ)(ess )( -arr)age for -e( a(* wo-e( respect)4ely as -eas,re* by th)s test.
S,ppose )t )s *es)re* to +(* a 9;G co(+*e(ce )(ter4al for est)-at)(g µd=µ1−µ2to w)th)(
two test po)(ts. S,ppose f,rther that fro- pr)or st,*)es )t )s k(ow( that the sta(*ar*
*e4)at)o( of the *)8ere(ces )( test scores )s σd≈10.Bhat )s the -)()-,- (,-ber of
-arr)e* co,ples that -,st be )(cl,*e* )( th)s st,*yC
11 A >o,r(al)st pla(s to )(ter4)ew a( e<,al (,-ber of -e-bers of two pol)t)cal part)es to
co-pare the proport)o(s )( each party who fa4or a proposal to allow c)t)Ve(s w)th a
proper l)ce(se to carry a co(ceale* ha(*g,( )( p,bl)c parks. Let p1 a(* p2 be the tr,e
proport)o(s of -e-bers of the two part)es who are )( fa4or of the proposal. S,ppose )t )s
*es)re* to +(* a 90G co(+*e(ce )(ter4al for est)-at)(g p1− p2to w)th)( ;.;0. Est)-ate the
-)()-,- e<,al (,-ber of -e-bers of each party that -,st be sa-ple* to -eet these
cr)ter)a.
12 A -e-ber of the state boar* of e*,cat)o( wa(ts to co-pare the proport)o(s of Nat)o(al#oar* %ert)+e* N#%5 teachers )( pr)4ate h)gh schools a(* )( p,bl)c h)gh schools )( the
state. @)s st,*y pla( calls for a( e<,al (,-ber of pr)4ate school teachers a(* p,bl)c
school teachers to be )(cl,*e* )( the st,*y. Let p1 a(* p2 be these proport)o(s. S,ppose
)t )s *es)re* to +(* a 99G co(+*e(ce )(ter4al that est)-ates p1− p2to w)th)( ;.;0.
Saylor URL: http://www.saylor.org/books Saylor.org01
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 542/723
a S,ppos)(g that both proport)o(s are k(ow( fro- a pr)or st,*y to be
appro7)-ately ;.10 co-p,te the -)()-,- co--o( sa-ple s)Ve (ee*e*.
b %o-p,te the -)()-,- co--o( sa-ple s)Ve (ee*e* o( the s,ppos)t)o( that
(oth)(g )s k(ow( abo,t the 4al,es of p1 a(* p2.
Saylor URL: http://www.saylor.org/books Saylor.org02
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 543/723
Chapter %
Correlation and /e$ression
Our interest in this chapter is in situations in which we can associate to each element of a population
or sample two measurements x and y$ particularly in the case that it is of interest to use the value
of x to predict the value of y. /or example$ the population could be the air in automobile
garages$ x could be the electrical current produced by an electrochemical reaction taking place in a
carbon monoxide meter$ and y the concentration of carbon monoxide in the air. ,n this chapter we
will learn statistical methods for analy'ing the relationship between variables x and y in this context.
% list of all the formulas that appear anywhere in this chapter are collected in the last section for ease
of reference.
Saylor URL: http://www.saylor.org/books Saylor.org03
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 544/723
%.% !inear /elationships 'etween <ariables
LEARNN! "#$E%&'E
1 &o lear( what )t -ea(s for two 4ar)ables to e7h)b)t a relat)o(sh)p that )s close to
l)(ear b,t wh)ch co(ta)(s a( ele-e(t of ra(*o-(ess.
The following table gives examples of the kinds of pairs of variables which could be of interest from a
statistical point of view.
x y
Pre*)ctor or )(*epe(*e(t 4ar)able Respo(se or *epe(*e(t 4ar)able
&e-perat,re )( *egrees %els),s &e-perat,re )( *egrees ahre(he)t
Area of a ho,se s<.ft.5 'al,e of the ho,se
Age of a part)c,lar -ake a(* -o*el car Resale 4al,e of the car
A-o,(t spe(t by a b,s)(ess o( a*4ert)s)(g )( a
year Re4e(,e rece)4e* that year
@e)ght of a 20=year=ol* -a( Be)ght of the -a(
The first line in the table is different from all the rest because in that case and no other the
relationship between the variables is deterministic: once the value of x is known the value of y is
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 545/723
completely determined. ,n fact there is a formula for y in terms of x :y=95x+32.7hoosing several
values for x and computing the corresponding value for y for each one using the formula gives the
table
Saylor URL: http://www.saylor.org/books Saylor.org00
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 546/723
!igure "5." $lot of <elsius and !ahrenheit Temperature $airs
The relationship between x and y in the temperature example is deterministic because once the value
of x is known$ the value of y is completely determined. ,n contrast$ all the other relationships listed in
the table above have an element of randomness in them. 7onsider the relationship described in the
last line of the table$ the height x of a man aged !8 and his weight y. ,f we were to randomly select
several !8-year-old men and measure the height and weight of each one$ we might obtain a collection
of(x,y)pairs something like this:
(68,151) (69,146) (70,157) (70,164) (71,171) (72,160)
(72,163)(72,180)(73,170)(73,175)(74,178)(75,188)
% plot of these data is shown in /igure 15.! 0lot of Eeight and (eight airs0. +uch a plot is called
a scatter diagram or scatter plot. ooking at the plot it is evident that there exists a linear
relationship between height x and weight y$ but not a perfect one. The points appear to be following a
line$ but not exactly. There is an element of randomness present.
!igure "5.& $lot of +eight and Feight $airs
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 547/723
,n this chapter we will analy'e situations in which variables x and y exhibit such a linear relationship
with randomness. The level of randomness will vary from situation to situation. ,n the introductory
example connecting an electric current and the level of carbon monoxide in air$ the relationship is
almost perfect. ,n other situations$ such as the height and weights of individuals$ the connection
between the two variables involves a high degree of randomness. ,n the next section we will see how
to "uantify the strength of the linear relationship between two variables.
*+, TA*+AA,S
• &wo 4ar)ables a(* / ha4e a *eter-)()st)c l)(ear relat)o(sh)p )f po)(ts plotte*
fro- (x,y)pa)rs l)e e7actly alo(g a s)(gle stra)ght l)(e.
• ( pract)ce )t )s co--o( for two 4ar)ables to e7h)b)t a relat)o(sh)p that )s close to
l)(ear b,t wh)ch co(ta)(s a( ele-e(t poss)bly large of ra(*o-(ess.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 548/723
++/C&S+S
'AS&C
1 A l)(e has e<,at)o( y=0.5x+2.
a P)ck +4e *)st)(ct =4al,es ,se the e<,at)o( to co-p,te the correspo(*)(g / =
4al,es a(* plot the +4e po)(ts obta)(e*.
b !)4e the 4al,e of the slope of the l)(eD g)4e the 4al,e of the / =)(tercept.
2 A l)(e has e<,at)o( y=x−0.5.
a P)ck +4e *)st)(ct =4al,es ,se the e<,at)o( to co-p,te the correspo(*)(g / =
4al,es a(* plot the +4e po)(ts obta)(e*.
b !)4e the 4al,e of the slope of the l)(eD g)4e the 4al,e of the / =)(tercept.
3 A l)(e has e<,at)o( y=−2x+4.
a P)ck +4e *)st)(ct =4al,es ,se the e<,at)o( to co-p,te the correspo(*)(g / =
4al,es a(* plot the +4e po)(ts obta)(e*.b !)4e the 4al,e of the slope of the l)(eD g)4e the 4al,e of the / =)(tercept.
A l)(e has e<,at)o( y=−1.5x+1.
a P)ck +4e *)st)(ct =4al,es ,se the e<,at)o( to co-p,te the correspo(*)(g / =
4al,es a(* plot the +4e po)(ts obta)(e*.
b !)4e the 4al,e of the slope of the l)(eD g)4e the 4al,e of the / =)(tercept.
0 #ase* o( the )(for-at)o( g)4e( abo,t a l)(e *eter-)(e how / w)ll cha(ge )(crease
*ecrease or stay the sa-e5 whe( )s )(crease* a(* e7pla)(. ( so-e cases )t -)ght
be )-poss)ble to tell fro- the )(for-at)o( g)4e(.
a &he slope )s pos)t)4e.
b &he / =)(tercept )s pos)t)4e.
c &he slope )s Vero.
#ase* o( the )(for-at)o( g)4e( abo,t a l)(e *eter-)(e how / w)ll cha(ge )(crease
*ecrease or stay the sa-e5 whe( )s )(crease* a(* e7pla)(. ( so-e cases )t -)ght
be )-poss)ble to tell fro- the )(for-at)o( g)4e(.
a &he / =)(tercept )s (egat)4e.
b &he / =)(tercept )s Vero.
c &he slope )s (egat)4e.
A *ata set co(s)sts of e)ght (x,y)pa)rs of (,-bers:
(0,12)(2,15)(4,16)(5,14)(8,22)(13,24)(15,28)(20,30)
a Plot the *ata )( a scatter *)agra-.
b #ase* o( the plot e7pla)( whether the relat)o(sh)p betwee( a(* / appears
to be *eter-)()st)c or to )(4ol4e ra(*o-(ess.
Saylor URL: http://www.saylor.org/books Saylor.org06
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 549/723
c #ase* o( the plot e7pla)( whether the relat)o(sh)p betwee( a(* / appears
to be l)(ear or (ot l)(ear.
6 A *ata set co(s)sts of te( (x,y)pa)rs of (,-bers:
(3,20)(5,13)(6,9)(8,4)(11,0)(12,0)(14,1)(17,6)(18,9)(20,16)
a Plot the *ata )( a scatter *)agra-.b #ase* o( the plot e7pla)( whether the relat)o(sh)p betwee( a(* / appears
to be *eter-)()st)c or to )(4ol4e ra(*o-(ess.
c #ase* o( the plot e7pla)( whether the relat)o(sh)p betwee( a(* / appears
to be l)(ear or (ot l)(ear.
9 A *ata set co(s)sts of ()(e (x,y)pa)rs of (,-bers:
(8,16)(9,9)(10,4)(11,1)(12,0)(13,1)(14,4)(15,9)(16,16)
a Plot the *ata )( a scatter *)agra-.
b #ase* o( the plot e7pla)( whether the relat)o(sh)p betwee( a(* / appears
to be *eter-)()st)c or to )(4ol4e ra(*o-(ess.
c #ase* o( the plot e7pla)( whether the relat)o(sh)p betwee( a(* / appears
to be l)(ear or (ot l)(ear.
1; A *ata set co(s)sts of +4e (x,y)pa)rs of (,-bers:
(0,1) (2,5) (3,7) (5,11) (8,17)
a Plot the *ata )( a scatter *)agra-.
b #ase* o( the plot e7pla)( whether the relat)o(sh)p betwee( a(* / appears
to be *eter-)()st)c or to )(4ol4e ra(*o-(ess.
c #ase* o( the plot e7pla)( whether the relat)o(sh)p betwee( a(* / appearsto be l)(ear or (ot l)(ear.
A22!&CAT&1NS
11 At ; a part)c,lar ble(* of a,to-ot)4e gasol)(e we)ghts .1 lb/gal. &he we)ght / of
gasol)(e o( a ta(k tr,ck that )s loa*e* w)th gallo(s of gasol)(e )s g)4e( by the l)(ear
e<,at)o(
y=6.17x
a E7pla)( whether the relat)o(sh)p betwee( the we)ght / a(* the a-o,(t of
gasol)(e )s *eter-)()st)c or co(ta)(s a( ele-e(t of ra(*o-(ess.
b Pre*)ct the we)ght of gasol)(e o( a ta(k tr,ck that has >,st bee( loa*e* w)th
0; gallo(s of gasol)(e.
Saylor URL: http://www.saylor.org/books Saylor.org09
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 550/723
12 &he rate for re(t)(g a -otor scooter for o(e *ay at a beach resort area )s 20 pl,s 3;
ce(ts for each -)le the scooter )s *r)4e(. &he total cost / )( *ollars for re(t)(g a scooter
a(* *r)4)(g )t -)les )s
y=0.30x+25
a E7pla)( whether the relat)o(sh)p betwee( the cost / of re(t)(g the scooter for
a *ay a(* the *)sta(ce that the scooter )s *r)4e( that *ay )s *eter-)()st)c or
co(ta)(s a( ele-e(t of ra(*o-(ess.
b A perso( )(te(*s to re(t a scooter o(e *ay for a tr)p to a( attract)o( 1 -)les
away. Ass,-)(g that the total *)sta(ce the scooter )s *r)4e( )s 3 -)les
pre*)ct the cost of the re(tal.
13 &he pr)c)(g sche*,le for labor o( a ser4)ce call by a( ele4ator repa)r co-pa(y )s 10;
pl,s 0; per ho,r o( s)te.
a Br)te *ow( the l)(ear e<,at)o( that relates the labor cost / to the (,-ber of
ho,rs that the repa)r-a( )s o( s)te.
b %alc,late the labor cost for a ser4)ce call that lasts 2.0 ho,rs.
1 &he cost of a telepho(e call -a*e thro,gh a lease* l)(e ser4)ce )s 2.0 ce(ts per -)(,te.
a Br)te *ow( the l)(ear e<,at)o( that relates the cost / )( ce(ts5 of a call to )ts
le(gth .
b %alc,late the cost of a call that lasts 23 -)(,tes.
!A/:+ (ATA S+T ++/C &S+S
10 Large ata Set 1 l)sts the SA& scores a(* !PAs of 1;;; st,*e(ts. Plot the scatter
*)agra- w)th SA& score as the )(*epe(*e(t 4ar)able 5 a(* !PA as the *epe(*e(t
4ar)able / 5. %o--e(t o( the appeara(ce a(* stre(gth of a(y l)(ear tre(*.
http://www.1.7ls
1 Large ata Set 12 l)sts the golf scores o( o(e ro,(* of golf for 0 golfers +rst ,s)(g the)r
ow( or)g)(al cl,bs the( ,s)(g cl,bs of a (ew e7per)-e(tal *es)g( after two -o(ths of
fa-)l)ar)Vat)o( w)th the (ew cl,bs5. Plot the scatter *)agra- w)th golf score ,s)(g the
or)g)(al cl,bs as the )(*epe(*e(t 4ar)able 5 a(* golf score ,s)(g the (ew cl,bs as the
*epe(*e(t 4ar)able / 5. %o--e(t o( the appeara(ce a(* stre(gth of a(y l)(ear tre(*.
Saylor URL: http://www.saylor.org/books Saylor.org00;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 551/723
http://www.12.7ls
1 Large ata Set 13 recor*s the (,-ber of b)**ers a(* sales pr)ce of a part)c,lar type of
a(t)<,e gra(*father clock at ; a,ct)o(s. Plot the scatter *)agra- w)th the (,-ber of
b)**ers at the a,ct)o( as the )(*epe(*e(t 4ar)able 5 a(* the sales pr)ce as the
*epe(*e(t 4ar)able / 5. %o--e(t o( the appeara(ce a(* stre(gth of a(y l)(ear tre(*.
http://www.13.7ls
Saylor URL: http://www.saylor.org/books Saylor.org001
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 552/723
%.0 The !inear Correlation CoeJcient
LEARNN! "#$E%&'E
1 &o lear( what the l)(ear correlat)o( coe?c)e(t )s how to co-p,te )t a(* what )t
tells ,s abo,t the relat)o(sh)p betwee( two 4ar)ables a(* / .
Saylor URL: http://www.saylor.org/books Saylor.org002
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 553/723
/igure 15.3 0inear Celationships of Darying +trengths0 illustrates linear relationships between two
variables x and y of varying strengths. ,t is visually apparent that in the situation in panel *a)$ x could
serve as a useful predictor of y$ it would be less useful in the situation illustrated in panel *b)$ and in
the situation of panel *c) the linear relationship is so weak as to be practically nonexistent. The linear
correlation coefficient is a number computed directly from the data that measures the strength of the
linear relationship between the two variables x and y.
!igure "5.* 'inear -elationships of 0arying %trengths
Saylor URL: http://www.saylor.org/books Saylor.org003
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 554/723
! ,f r` is near 5 *that is$ if r is near 5 and of either sign) then the linear relationship
between x and y is weak.
Saylor URL: http://www.saylor.org/books Saylor.org00
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 555/723
!igure "5. 'inear <orrelation <oefficient -
ay particular attention to panel *f) in /igure 15.6 0inear 7orrelation 7oefficient 0. ,t shows a
perfectly deterministic relationship between x and y$ butr=0 because the relationship is not linear.
*,n this particular case the points lie on the top half of a circle.)
+A>2!+ %
%o-p,te the l)(ear correlat)o( coe?c)e(t for the he)ght a(* we)ght pa)rs plotte*
)( )g,re 1;.2 QPlot of @e)ght a(* Be)ght Pa)rsQ.
Sol,t)o(:
E4e( for s-all *ata sets l)ke th)s o(e co-p,tat)o(s are too lo(g to *o co-pletely
by ha(*. ( act,al pract)ce the *ata are e(tere* )(to a calc,lator or co-p,ter a(*
a stat)st)cs progra- )s ,se*. ( or*er to clar)fy the -ea()(g of the for-,las we
w)ll *)splay the *ata a(* relate* <,a(t)t)es )( tab,lar for-. or each (x,y)pa)r we
Saylor URL: http://www.saylor.org/books Saylor.org000
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 556/723
co-p,te three (,-bers: 2 xy a(* / 2 as show( )( the table pro4)*e*. ( the last
l)(e of the table we ha4e the s,- of the (,-bers )( each col,-(. Us)(g the- we
co-p,te:
x y x ! xy y!
68 151 4624 10268 22801
69 146 4761 10074 21316
70 157 4900 10990 24649
70 164 4900 11480 26896
71 171 5041 12141 29241
72 160 5184 11520 25600
72 163 5184 11736 26569
72 180 5184 12960 32400
73 170 5329 12410 28900
73 175 5329 12775 30625
74 178 5476 13172 31684
75 188 5625 14100 35344
E 859 2003 61537 143626 336025
Saylor URL: http://www.saylor.org/books Saylor.org00
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 557/723
IEJ &AIE ABAJS
• &he l)(ear correlat)o( coe?c)e(t -eas,res the stre(gth a(* *)rect)o( of the l)(ear
relat)o(sh)p betwee( two 4ar)ables a(* / .
• &he s)g( of the l)(ear correlat)o( coe?c)e(t )(*)cates the *)rect)o( of the l)(ear
relat)o(sh)p betwee( a(* / .
• Bhe( r )s (ear 1 or X1 the l)(ear relat)o(sh)p )s stro(gD whe( )t )s (ear ; the
l)(ear relat)o(sh)p )s weak.
EKER%SES
#AS%
B)th the e7cept)o( of the e7erc)ses at the e(* of Sect)o( 1;.3 Qo*ell)(g L)(ear
Relat)o(sh)ps w)th Ra(*o-(ess Prese(tQ the +rst #as)c e7erc)se )( each of the
follow)(g sect)o(s thro,gh Sect)o( 1;. QEst)-at)o( a(* Pre*)ct)o(Q ,ses the *ata
fro- the +rst e7erc)se here the seco(* #as)c e7erc)se ,ses the *ata fro- the seco(*
e7erc)se here a(* so o( a(* s)-)larly for the Appl)cat)o( e7erc)ses. Sa4e yo,r
co-p,tat)o(s *o(e o( these e7erc)ses so that yo, *o (ot (ee* to repeat the- later.
Saylor URL: http://www.saylor.org/books Saylor.org00
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 558/723
Saylor URL: http://www.saylor.org/books Saylor.org006
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 559/723
Saylor URL: http://www.saylor.org/books Saylor.org009
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 560/723
Saylor URL: http://www.saylor.org/books Saylor.org0;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 561/723
Saylor URL: http://www.saylor.org/books Saylor.org01
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 562/723
Saylor URL: http://www.saylor.org/books Saylor.org02
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 563/723
Saylor URL: http://www.saylor.org/books Saylor.org03
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 564/723
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 565/723
Saylor URL: http://www.saylor.org/books Saylor.org00
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 566/723
http://www.1.7ls
3; Large ata Set 12 l)sts the golf scores o( o(e ro,(* of golf for 0 golfers +rst ,s)(g
the)r ow( or)g)(al cl,bs the( ,s)(g cl,bs of a (ew e7per)-e(tal *es)g( after two
-o(ths of fa-)l)ar)Vat)o( w)th the (ew cl,bs5. %o-p,te the l)(ear correlat)o(
coe?c)e(t r . %o-pare )ts 4al,e to yo,r co--e(ts o( the appeara(ce a(* stre(gth of
a(y l)(ear tre(* )( the scatter *)agra- that yo, co(str,cte* )( the seco(* large *ata
set proble- for Sect)o( 1;.1 QL)(ear Relat)o(sh)ps #etwee( 'ar)ablesQ.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 567/723
http://www.12.7ls
31 Large ata Set 13 recor*s the (,-ber of b)**ers a(* sales pr)ce of a part)c,lar type
of a(t)<,e gra(*father clock at ; a,ct)o(s. %o-p,te the l)(ear correlat)o(
coe?c)e(t r . %o-pare )ts 4al,e to yo,r co--e(ts o( the appeara(ce a(* stre(gth of
a(y l)(ear tre(* )( the scatter *)agra- that yo, co(str,cte* )( the th)r* large *ata
set proble- for Sect)o( 1;.1 QL)(ear Relat)o(sh)ps #etwee( 'ar)ablesQ.
http://www.13.7ls
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 568/723
%.3 >odellin$ !inear /elationships with /andomness 2resent
LEARNN! "#$E%&'E
Saylor URL: http://www.saylor.org/books Saylor.org06
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 569/723
1 &o lear( the fra-ework )( wh)ch the stat)st)cal a(alys)s of the l)(ear relat)o(sh)p
betwee( two 4ar)ables a(* / w)ll be *o(e.
,n this chapter we are dealing with a population for which we can associate to each element two
measurements$ x and y. (e are interested in situations in which the value of x can be used to draw
conclusions about the value of y$ such as predicting the resale value y of a residential house based on
its si'e x . +ince the relationship between x and y is not deterministic$ statistical procedures must be
applied. /or any statistical procedures$ given in this book or elsewhere$ the associated formulas are
valid only under specific assumptions. The set of assumptions in simple linear regression are a
mathematical description of the relationship between x and y. +uch a set of assumptions is known as
a model.
/or each fixed value of x a sub-population of the full population is determined$ such as the collection
of all houses with !$155 s"uare feet of living space. /or each element of that sub-population there is a
measurement y$ such as the value of any !$155-s"uare-foot house. etE(y)denote the mean of all
the y-values for each particular value of x .E(y)can change from x -value to x -value$ such as the mean
value of all !$155-s"uare-foot houses$ the *different) mean value for all !$855-s"uare foot-houses$
and so on.
Our first assumption is that the relationship between x and the mean of they-values in the sub-
population determined by x is linear. This means that there exist numbersβ1andβ0such that
E(y)=β1x+β0
This linear relationship is the reason for the word GlinearH in Gsimple linear regressionH below. *The
word GsimpleH means that y depends on only one other variable and not two or more.)
Our next assumption is that for each value of x the y-values scatter about the meanE(y)according to
a normal distribution centered atE(y)and with a standard deviation 6 that is the same for every
Saylor URL: http://www.saylor.org/books Saylor.org09
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 570/723
value of x . This is the same as saying that there exists a normally distributed random variable L with
mean 5 and standard deviation 6 so that the relationship between x and y in the whole population is
y=β1x+β0+ε
Our last assumption is that the random deviations associated with different observations are
independent.
,n summary$ the model is:
S)-ple L)(ear Regress)o( o*el
/or each point(x,y)in data set the y-value is an independent observation of
y=β1x+β0+ε
whereβ1andβ0are fixed parameters and L is a normally distributed random variable with mean 5 and an
unknown standard deviation 6 .
The line with e"uationy=β1x+β0is called the population regression line.
/igure 15.8 0The +imple inear odel 7oncept0 illustrates the model. The symbols N(µ,σ2)denote a
normal distribution with mean and varianceσ2$ hence standard deviation 6 .
!igure "5. The %imple 'inear 2odel <oncept
Saylor URL: http://www.saylor.org/books Saylor.org0;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 571/723
,t is conceptually important to view the model as a sum of two parts:
y=β1x+β0+ε
1 Deterministic &art. The first partβ1x+β0is the e"uation that describes the trend
in y as x increases. The line that we seem to see when we look at the scatter diagram is an
approximation of the liney=β1x+β0.There is nothing random in this part$ and therefore it is calledthe deterministic part of the model.
! -andom &art. The second part L is a random variable$ often called the error term or the noise. This
part explains why the actual observed values of y are not exactly on but fluctuate near a line.
,nformation about this term is important since only when one knows how much noise there is in the
data can one know how trustworthy the detected trend is.
There are three parameters in this model: β0$β1$ and 6 . ach has an important interpretation$
particularlyβ1and 6 . The slope parameterβ1represents the expected change in y brought about by a
unit increase in x . The standard deviation 6 represents the magnitude of the noise in the data.
There are procedures for checking the validity of the three assumptions$ but for us it will be sufficient
to visually verify the linear trend in the data. ,f the data set is large then the points in the scatter
Saylor URL: http://www.saylor.org/books Saylor.org01
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 572/723
diagram will form a band about an apparent straight line. The normality of L with a constant
standard deviation corresponds graphically to the band being of roughly constant width$ and with
most points concentrated near the middle of the band.
/ortunately$ the three assumptions do not need to hold exactly in order for the procedures and
analysis developed in this chapter to be useful.
IEJ &AIEABAJ
• Stat)st)cal proce*,res are 4al)* o(ly whe( certa)( ass,-pt)o(s are 4al)*. &he
ass,-pt)o(s ,(*erly)(g the a(alyses *o(e )( th)s chapter are graph)cally
s,--ar)Ve* )( )g,re 1;.0 Q&he S)-ple L)(ear o*el %o(ceptQ.
++/C&S+S
1 State the three ass,-pt)o(s that are the bas)s for the S)-ple L)(ear Regress)o( o*el.
2 &he S)-ple L)(ear Regress)o( o*el )s s,--ar)Ve* by the e<,at)o(
y=β1x+β0+ε
*e(t)fy the *eter-)()st)c part a(* the ra(*o- part.
3 s the (,-ber β1)( the e<,at)o( y=β1x+β0a stat)st)c or a pop,lat)o( para-eterC E7pla)(.
s the (,-ber σ )( the S)-ple L)(ear Regress)o( o*el a stat)st)c or a pop,lat)o(
para-eterC E7pla)(.
0 escr)be what to look for )( a scatter *)agra- )( or*er to check that the ass,-pt)o(s of
the S)-ple L)(ear Regress)o( o*el are tr,e.
&r,e or false: the ass,-pt)o(s of the S)-ple L)(ear Regress)o( o*el -,st hol* e7actly
)( or*er for the proce*,res a(* a(alys)s *e4elope* )( th)s chapter to be ,sef,l.
ANS+/S
1
a &he -ea( of / )s l)(early relate* to .
b or each g)4e( / )s a (or-al ra(*o- 4ar)able w)th -ea( β1x+β0a(*
sta(*ar* *e4)at)o( σ .
c All the obser4at)o(s of / )( the sa-ple are )(*epe(*e(t.
3 β1)s a pop,lat)o( para-eter.
0 A l)(ear tre(*.
Saylor URL: http://www.saylor.org/books Saylor.org02
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 573/723
%.7 The !east S"uares /e$ression !ine
LEARNN! "#$E%&'ES
1 &o lear( how to -eas,re how well a stra)ght l)(e +ts a collect)o( of *ata.
2 &o lear( how to co(str,ct the least s<,ares regress)o( l)(e the stra)ght l)(e that
best +ts a collect)o( of *ata.
3 &o lear( the -ea()(g of the slope of the least s<,ares regress)o( l)(e.
&o lear( how to ,se the least s<,ares regress)o( l)(e to est)-ate the respo(se
4ar)able / )( ter-s of the pre*)ctor 4ar)able .
:oodness o Fit o a Strai$ht !ine to (ata
Once the scatter diagram of the data has been drawn and the model assumptions described in the
previous sections at least visually verified *and perhaps the correlation coefficient r computed to
"uantitatively verify the linear trend)$ the next step in the analysis is to find the straight line that best fits
the data. (e will explain how to measure how well a straight line fits a collection of points by examining
how well the liney=12x−1fits the data set
Saylor URL: http://www.saylor.org/books Saylor.org03
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 574/723
To each point in the data set there is associated an Gerror$H the positive or negative vertical distance
from the point to the line: positive if the point is above the line and negative if it is below the line.
The error can be computed as the actual y-value of the point minus the y-value y that is GpredictedH
by inserting the x -value of the data point into the formula for the line:
error at data point (x,y)=(truey)−(predictedy)=y−y
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 575/723
The computation of the error for each of the five points in the data set is shown in Table 15.1 0The
rrors in /itting 2ata with a +traight ine0.
Table 15.1 The rrors in /itting 2ata with a +traight ine
x y y=12x−1 y−y (y−y)2
2 ; ; ; ;
2 1 ; 1 1
2 2 ; ;
6 3 3 ; ;
1; 3 X1 1
j = = = ; 2
% first thought for a measure of the goodness of fit of the line to the data would be simply to add the
errors at every point$ but the example shows that this cannot work well in general. The line does not
fit the data perfectly *no line can)$ yet because of cancellation of positive and negative errors the sum
of the errors *the fourth column of numbers) is 'ero. ,nstead goodness of fit is measured by the sum
of the s"uares of the errors. +"uaring eliminates the minus signs$ so no cancellation can occur. /or
the data and line in /igure 15.; 0lot of the /ive-oint 2ata and the ine 0 the sum of the s"uared
errors *the last column of numbers) is !. This number measures the goodness of fit of the line to the
data.
e+()t)o(
The goodness of fit of a line y=mx+bto a set of n pairs (x,y)of numbers in a sample is the sum of the
squared errors
Σ(y−y)2
Dn terms in the sum, one for each data pairE.
Saylor URL: http://www.saylor.org/books Saylor.org00
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 576/723
The !east S"uares /e$ression !ine
iven any collection of pairs of numbers *except when all the x -values are the same) and the
corresponding scatter diagram$ there always exists exactly one straight line that fits the data better
than any other$ in the sense of minimi'ing the sum of the s"uared errors. ,t is called the least squares
regression line. oreover there are formulas for its slope and y-intercept.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 577/723
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 578/723
&A#LE 1;.2 &@E ERR"RS N &&N! A&A B&@ &@ E
LEAS& SHUARES RE!RESS"N LNE
x y y=0.34375x−0.125 y−y (y−y)2
2 ; ;.020 X;.020 ;.31;20
2 1 ;.020 ;.30 ;.191;20
2 1.930 ;.;20 ;.;;39;20
6 3 2.20; ;.30; ;.1;20;;
1; 3 3.3120 X;.3120 ;.;9020
EKAPLE 3
Saylor URL: http://www.saylor.org/books Saylor.org06
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 579/723
&able 1;.3 Qata o( Age a(* 'al,e of Use* A,to-ob)les of a Spec)+c ake a(*
o*elQ shows the age )( years a(* the reta)l 4al,e )( tho,sa(*s of *ollars of a
ra(*o- sa-ple of te( a,to-ob)les of the sa-e -ake a(* -o*el.
a. %o(str,ct the scatter *)agra-.
b. %o-p,te the l)(ear correlat)o( coe?c)e(t r . (terpret )ts 4al,e )( the co(te7t of the
proble-.
c. %o-p,te the least s<,ares regress)o( l)(e. Plot )t o( the scatter *)agra-.
*. (terpret the -ea()(g of the slope of the least s<,ares regress)o( l)(e )( the co(te7t
of the proble-.
e. S,ppose a fo,r=year=ol* a,to-ob)le of th)s -ake a(* -o*el )s selecte* at ra(*o-.
Use the regress)o( e<,at)o( to pre*)ct )ts reta)l 4al,e.
f. S,ppose a 2;=year=ol* a,to-ob)le of th)s -ake a(* -o*el )s selecte* at ra(*o-.
Use the regress)o( e<,at)o( to pre*)ct )ts reta)l 4al,e. (terpret the res,lt.g. %o--e(t o( the 4al)*)ty of ,s)(g the regress)o( e<,at)o( to pre*)ct the pr)ce of a
bra(* (ew a,to-ob)le of th)s -ake a(* -o*el.
&A#LE 1;.3 A&A "N A!E AN 'A LU E " USE
AU&""#LES " A SPE%% AIE AN "EL
2 3 3 3 0 0 0
/ 26. 2.6 2.; 3;.0 23.6 2. 23.6 2;. 21. 22.1
Sol,t)o(:
a. &he scatter *)agra- )s show( )( )g,re 1;. QScatter )agra- for Age a(* 'al,e
of Use* A,to-ob)lesQ.
Figure 1A.7%catter Diagram for 0ge and <alue of Used 0utomobiles
Saylor URL: http://www.saylor.org/books Saylor.org09
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 580/723
Saylor URL: http://www.saylor.org/books Saylor.org06;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 581/723
* S)(ce we k(ow (oth)(g abo,t the a,to-ob)le other tha( )ts age we ass,-e
that )t )s of abo,t a4erage 4al,e a(* ,se the a4erage 4al,e of all fo,r=year=ol*
4eh)cles of th)s -ake a(* -o*el as o,r est)-ate. &he a4erage 4al,e )s s)-ply
the 4al,e of y obta)(e* whe( the (,-ber )s )(serte* for )( the least
s<,ares regress)o( e<,at)o(:
e
y=−2.05(4)+32.83=24.63
wh)ch correspo(*s to 23;.
Saylor URL: http://www.saylor.org/books Saylor.org061
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 582/723
* Now we )(sert x=20)(to the least s<,ares regress)o( e<,at)o( to obta)(
y=−2.05(20)+32.83=−8.17
wh)ch correspo(*s to X61;. So-eth)(g )s wro(g here s)(ce a (egat)4e-akes (o se(se. &he error arose fro- apply)(g the regress)o( e<,at)o( to a
4al,e of (ot )( the ra(ge of =4al,es )( the or)g)(al *ata fro- two to s)7
years.
Apply)(g the regress)o( e<,at)o( y=β1x+β0to a 4al,e of o,ts)*e the ra(ge
of =4al,es )( the *ata set )s calle* etrapolation. t )s a( )(4al)* ,se of the
regress)o( e<,at)o( a(* sho,l* be a4o)*e*.
e &he pr)ce of a bra(* (ew 4eh)cle of th)s -ake a(* -o*el )s the 4al,e of the
a,to-ob)le at age ;. f the 4al,e x=0)s )(serte* )(to the regress)o( e<,at)o( the
res,lt )s always β0 the / =)(tercept )( th)s case 32.63 wh)ch correspo(*s to
3263;. #,t th)s )s a case of e7trapolat)o( >,st as part f5 was he(ce th)s res,lt
)s )(4al)* altho,gh (ot ob4)o,sly so. ( the co(te7t of the proble- s)(ce
a,to-ob)les te(* to lose 4al,e -,ch -ore <,)ckly )--e*)ately after they are
p,rchase* tha( they *o after they are se4eral years ol* the (,-ber 3263; )s
probably a( ,(*erest)-ate of the pr)ce of a (ew a,to-ob)le of th)s -ake a(*
-o*el.
/or emphasis we highlight the points raised by parts *f) and *g) of the example.
e+()t)o(
The process of using the least squares regression equation to estimate the value of y at a value of x that
does not lie in the range of the x:values in the data set that was used to form the regression line is
called extrapolation. >t is an invalid use of the regression equation that can lead to errors, hence should
be avoided.
The Sum o the S"uared +rrors SSE
,n general$ in order to measure the goodness of fit of a line to a set of data$ we must compute the
predicted y-value y at every point in the data set$ compute each error$ s"uare it$ and then add up all the
Saylor URL: http://www.saylor.org/books Saylor.org062
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 583/723
s"uares. ,n the case of the least s"uares regression line$ however$ the line that best fits the data$ the sum
of the s"uared errors can be computed directly from the data using the following formula.
The sum of the s"uared errors for the least s"uares regression line is denoted by SSE.,t can be computed
using the formula
SSE=SSyy−β1Ssxy
Saylor URL: http://www.saylor.org/books Saylor.org063
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 584/723
Saylor URL: http://www.saylor.org/books Saylor.org06
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 585/723
SSE=SSyy−β 1SSxy=87.781−(−2.05)(−28.7)=28.946
*+, TA*+AA,S
• @ow well a stra)ght l)(e +ts a *ata set )s -eas,re* by the s,- of the s<,are*
errors.
• &he least s<,ares regress)o( l)(e )s the l)(e that best +ts the *ata. ts slope a(* / =
)(tercept are co-p,te* fro- the *ata ,s)(g for-,las.
Saylor URL: http://www.saylor.org/books Saylor.org060
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 586/723
• &he slope β1of the least s<,ares regress)o( l)(e est)-ates the s)Ve a(* *)rect)o(
of the -ea( cha(ge )( the *epe(*e(t 4ar)able / whe( the )(*epe(*e(t
4ar)able )s )(crease* by o(e ,()t.
• &he s,- of the s<,are* errors SSE of the least s<,ares regress)o( l)(e ca( be
co-p,te* ,s)(g a for-,la w)tho,t ha4)(g to co-p,te all the )(*)4)*,al errors.
EKER%SES
#AS%
or the #as)c a(* Appl)cat)o( e7erc)ses )( th)s sect)o( ,se the co-p,tat)o(s that
were *o(e for the e7erc)ses w)th the sa-e (,-ber )(Sect)o( 1;.2 Q&he L)(ear
%orrelat)o( %oe?c)e(tQ.
1 %o-p,te the least s<,ares regress)o( l)(e for the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he
L)(ear %orrelat)o( %oe?c)e(tQ.
2 %o-p,te the least s<,ares regress)o( l)(e for the *ata )( E7erc)se 2 of Sect)o( 1;.2 Q&he
L)(ear %orrelat)o( %oe?c)e(tQ.
3 %o-p,te the least s<,ares regress)o( l)(e for the *ata )( E7erc)se 3 of Sect)o( 1;.2 Q&he
L)(ear %orrelat)o( %oe?c)e(tQ.
%o-p,te the least s<,ares regress)o( l)(e for the *ata )( E7erc)se of Sect)o( 1;.2 Q&heL)(ear %orrelat)o( %oe?c)e(tQ.
0 or the *ata )( E7erc)se 0 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b %o-p,te the s,- of the s<,are* errors SSE ,s)(g the *e+()t)o( Σ(y−y)2.
c %o-p,te the s,- of the s<,are* errors SSE ,s)(g the for-,la SSE=SSyy−β1SSxy.
or the *ata )( E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b %o-p,te the s,- of the s<,are* errors SSE ,s)(g the *e+()t)o( Σ(y−y)2.
c %o-p,te the s,- of the s<,are* errors SSE ,s)(g the for-,la SSE=SSyy−β1SSxy.
Saylor URL: http://www.saylor.org/books Saylor.org06
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 587/723
%o-p,te the least s<,ares regress)o( l)(e for the *ata )( E7erc)se of Sect)o( 1;.2
Q&he L)(ear %orrelat)o( %oe?c)e(tQ.
6 %o-p,te the least s<,ares regress)o( l)(e for the *ata )( E7erc)se 6 of Sect)o( 1;.2
Q&he L)(ear %orrelat)o( %oe?c)e(tQ.
9 or the *ata )( E7erc)se 9 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b %a( yo, co-p,te the s,- of the s<,are* errorsSSE ,s)(g the
*e+()t)o(Σ(y−y)2C E7pla)(.
c %o-p,te the s,- of the s<,are* errors SSE ,s)(g the for-,la SSE=SSyy−β1SSxy.
1; or the *ata )( E7erc)se 1; of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b %a( yo, co-p,te the s,- of the s<,are* errorsSSE ,s)(g the *e+()t)o(
Σ(y−y)2C E7pla)(.
c %o-p,te the s,- of the s<,are* errors SSE ,s)(g the for-,la SSE=SSyy−β1SSxy.
APPL%A&"NS
11 or the *ata )( E7erc)se 11 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b "( a4erage how -a(y (ew wor*s *oes a ch)l* fro- 13 to 16 -o(ths ol* lear(
each -o(thC E7pla)(.
c Est)-ate the a4erage 4ocab,lary of all 1=-o(th=ol* ch)l*re(.
12 or the *ata )( E7erc)se 12 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b "( a4erage how -a(y a**)t)o(al feet are a**e* to the brak)(g *)sta(ce for
each a**)t)o(al 1;; po,(*s of we)ghtC E7pla)(.
c Est)-ate the a4erage brak)(g *)sta(ce of all cars we)gh)(g 3;;; po,(*s.
13 or the *ata )( E7erc)se 13 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
Saylor URL: http://www.saylor.org/books Saylor.org06
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 588/723
a %o-p,te the least s<,ares regress)o( l)(e.
b Est)-ate the a4erage rest)(g heart rate of all ;=year=ol* -e(.
c Est)-ate the a4erage rest)(g heart rate of all (ewbor( baby boys. %o--e(t
o( the 4al)*)ty of the est)-ate.
1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b Est)-ate the a4erage wa4e he)ght whe( the w)(* )s blow)(g at 1; -)les per
ho,r.
c Est)-ate the a4erage wa4e he)ght whe( there )s (o w)(* blow)(g. %o--e(t
o( the 4al)*)ty of the est)-ate.
10 or the *ata )( E7erc)se 10 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b "( a4erage for each a**)t)o(al tho,sa(* *ollars spe(t o( a*4ert)s)(g how
*oes re4e(,e cha(geC E7pla)(.
c Est)-ate the re4e(,e )f 20;; )s spe(t o( a*4ert)s)(g (e7t year.
1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b "( a4erage for each a**)t)o(al )(ch of he)ght of two=year=ol* g)rl what )s the
cha(ge )( the a*,lt he)ghtC E7pla)(.
c Pre*)ct the a*,lt he)ght of a two=year=ol* g)rl who )s 33 )(ches tall.
1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b %o-p,te SSE ,s)(g the for-,la SSE=SSyy−β1SSxy.
c Est)-ate the a4erage +(al e7a- score of all st,*e(ts whose co,rse a4erage
>,st before the e7a- )s 60.
16 or the *ata )( E7erc)se 16 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b %o-p,te SSE ,s)(g the for-,la SSE=SSyy−β1SSxy.
Saylor URL: http://www.saylor.org/books Saylor.org066
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 589/723
c Est)-ate the (,-ber of acres that wo,l* be har4este* )f 9; -)ll)o( acres of
cor( were pla(te*.
19 or the *ata )( E7erc)se 19 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b (terpret the 4al,e of the slope of the least s<,ares regress)o( l)(e )( the
co(te7t of the proble-.
c Est)-ate the a4erage co(ce(trat)o( of the act)4e )(gre*)e(t )( the bloo* )(
-e( after co(s,-)(g 1 o,(ce of the -e*)cat)o(.
2; or the *ata )( E7erc)se 2; of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b (terpret the 4al,e of the slope of the least s<,ares regress)o( l)(e )( the
co(te7t of the proble-.
c Est)-ate the age of a( oak tree whose g)rth +4e feet o8 the gro,(* )s 92
)(ches.
21 or the *ata )( E7erc)se 21 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b &he 26=*ay stre(gth of co(crete ,se* o( a certa)( >ob -,st be at least 32;;
ps). f the 3=*ay stre(gth )s 13;; ps) wo,l* we a(t)c)pate that the co(crete
w)ll be s,?c)e(tly stro(g o( the 26th *ayC E7pla)( f,lly.
22 or the *ata )( E7erc)se 22 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o-p,te the least s<,ares regress)o( l)(e.
b f the power fac)l)ty )s calle* ,po( to pro4)*e -ore tha( 90 -)ll)o( watt=ho,rs
to-orrow the( e(ergy w)ll ha4e to be p,rchase* fro- elsewhere at a
pre-),-. &he forecast )s for a( a4erage te-perat,re of 2 *egrees. Sho,l*
the co-pa(y pla( o( p,rchas)(g power at a pre-),-C
Saylor URL: http://www.saylor.org/books Saylor.org069
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 590/723
LAR!E A&A S E& EKE R%SES
20 Large ata Set 1 l)sts the SA& scores a(* !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a %o-p,te the least s<,ares regress)o( l)(e w)th SA& score as the )(*epe(*e(t
4ar)able 5 a(* !PA as the *epe(*e(t 4ar)able / 5.
b (terpret the -ea()(g of the slope β1of regress)o( l)(e )( the co(te7t of
proble-.
c %o-p,te SSE the -eas,re of the goo*(ess of +t of the regress)o( l)(e to the
sa-ple *ata.
Saylor URL: http://www.saylor.org/books Saylor.org09;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 591/723
* Est)-ate the !PA of a st,*e(t whose SA& score )s 130;.
2 Large ata Set 12 l)sts the golf scores o( o(e ro,(* of golf for 0 golfers +rst ,s)(g the)r
ow( or)g)(al cl,bs the( ,s)(g cl,bs of a (ew e7per)-e(tal *es)g( after two -o(ths of
fa-)l)ar)Vat)o( w)th the (ew cl,bs5.
http://www.12.7ls
a %o-p,te the least s<,ares regress)o( l)(e w)th scores ,s)(g the or)g)(al cl,bs
as the )(*epe(*e(t 4ar)able 5 a(* scores ,s)(g the (ew cl,bs as the
*epe(*e(t 4ar)able / 5.
b (terpret the -ea()(g of the slope β1of regress)o( l)(e )( the co(te7t of
proble-.
c %o-p,te SSE the -eas,re of the goo*(ess of +t of the regress)o( l)(e to the
sa-ple *ata.
* Est)-ate the score w)th the (ew cl,bs of a golfer whose score w)th the ol*
cl,bs )s 3.
2 Large ata Set 13 recor*s the (,-ber of b)**ers a(* sales pr)ce of a part)c,lar type of
a(t)<,e gra(*father clock at ; a,ct)o(s.
http://www.13.7ls
a %o-p,te the least s<,ares regress)o( l)(e w)th the (,-ber of b)**ers prese(t
at the a,ct)o( as the )(*epe(*e(t 4ar)able 5 a(* sales pr)ce as the
*epe(*e(t 4ar)able / 5.
b (terpret the -ea()(g of the slope β1of regress)o( l)(e )( the co(te7t of
proble-.
c %o-p,te SSE the -eas,re of the goo*(ess of +t of the regress)o( l)(e to the
sa-ple *ata.
* Est)-ate the sales pr)ce of a clock at a( a,ct)o( at wh)ch the (,-ber of
b)**ers )s se4e(.
Saylor URL: http://www.saylor.org/books Saylor.org091
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 592/723
Saylor URL: http://www.saylor.org/books Saylor.org092
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 593/723
%.8 Statistical &nerences About β%
!+A/N&N: 1';+CT&<+S
1 &o lear( how to co(str,ct a co(+*e(ce )(ter4al forβ1 the slope of the pop,lat)o(
regress)o( l)(e.
2 &o lear( how to test hypotheses regar*)(g β1.
The parameterβ1$ the slope of the population regression line$ is of primary importance in regression
analysis because it gives the true rate of change in the mean E(y)in response to a unit increase in the
predictor variable x . /or every unit increase in x the mean of the response variable y changes
byβ1units$ increasing ifβ1>0and decreasing ifβ1<0. (e wish to construct confidence intervals
forβ1and test hypotheses about it.
Con)dence &ntervals or β%
The slope β1of the least s"uares regression line is a point estimate ofβ1. % confidence interval forβ1is
given by the following formula.
Saylor URL: http://www.saylor.org/books Saylor.org093
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 594/723
Saylor URL: http://www.saylor.org/books Saylor.org09
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 595/723
Saylor URL: http://www.saylor.org/books Saylor.org090
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 596/723
years ol* we are 9;G co(+*e(t that for each a**)t)o(al year of age the a4erage
4al,e of s,ch a 4eh)cle *ecreases by betwee( 11;; a(* 3;;;.
Saylor URL: http://www.saylor.org/books Saylor.org09
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 597/723
Testin$ =ypotheses About β%
Eypotheses regardingβ1can be tested using the same five-step procedures$ either the critical value
approach or the p-value approach$ that were introduced in +ection ?.1 0The lements of Eypothesis
Testing0 and +ection ?.3 0The Observed +ignificance of a Test0 of 7hapter ? 0Testing Eypotheses0. The
null hypothesis always has the formH0:β1=B0 where 95 is a number determined from the statement of the
problem. The three forms of the alternative hypothesis$ with the terminology for each case$ are:
Form o Ha Terminolo$y
Ha:β1<B0 Left=ta)le*
Ha:β1>B0 R)ght=ta)le*
Ha:β1≠B0 &wo=ta)le*
The value 'ero for 95 is of particular importance since in that case the null hypothesis is H0:β1=0$ which
corresponds to the situation in which x is not useful for predicting y. /or ifβ1=0then the population
regression line is hori'ontal$ so the mean E(y)is the same for every value of x and we are #ust as well off inignoring x completely and approximating y by its average value. iven two variables x and y$ the burden
of proof is that x is useful for predicting y$ not that it is not. Thus the phrase Gtest whether x is useful for
prediction of y$H or words to that effect$ means to perform the test
H 0:β1=0 vs.H a:β1≠0
Saylor URL: http://www.saylor.org/books Saylor.org09
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 598/723
Saylor URL: http://www.saylor.org/books Saylor.org096
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 599/723
• Step 0. As show( )( )g,re 1;.9 QRe>ect)o( Reg)o( a(* &est Stat)st)c for Q the
test stat)st)c falls )( the re>ect)o( reg)o(. &he *ec)s)o( )s to re>ect ;. ( the
co(te7t of the proble- o,r co(cl,s)o( )s:
&he *ata pro4)*e s,?c)e(t e4)*e(ce at the 2G le4el of s)g()+ca(ce to
co(cl,*e that the slope of the pop,lat)o( regress)o( l)(e )s (o(Vero so
that )s ,sef,l as a pre*)ctor of / .
Saylor URL: http://www.saylor.org/books Saylor.org099
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 600/723
Figure 1A.6eEection 6egion and )est %tatistic for 'ote 1A.** >(ample ">
+A>2!+ 6
A car sales-a( cla)-s that a,to-ob)les betwee( two a(* s)7 years ol* of the
-ake a(* -o*el *)sc,sse* )( Note 1;.19 QE7a-ple 3Q )( Sect)o( 1;. Q&he Least
S<,ares Regress)o( L)(eQ lose -ore tha( 11;; )( 4al,e each year. &est th)s
cla)- at the 0G le4el of s)g()+ca(ce.
Sol,t)o(:
Be w)ll perfor- the test ,s)(g the cr)t)cal 4al,e approach.
• Step 1. ( ter-s of the 4ar)ables a(* / the sales-a(Ws cla)- )s that )f )s
)(crease* by 1 ,()t o(e a**)t)o(al year )( age5 the( / *ecreases by -ore
tha( 1.1 ,()ts -ore tha( 11;;5. &h,s h)s assert)o( )s that the slope of the
pop,lat)o( regress)o( l)(e )s (egat)4e a(* that )t )s -ore (egat)4e tha( X1.1.
( sy-bols β1<−1.1.S)(ce )t co(ta)(s a( )(e<,al)ty th)s has to be the alter(at)4e
Saylor URL: http://www.saylor.org/books Saylor.org;;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 601/723
hypotheses. &he (,ll hypothes)s has to be a( e<,al)ty a(* ha4e the sa-e
(,-ber o( the r)ght ha(* s)*e so the rele4a(t test )s
&he *ata pro4)*e s,?c)e(t e4)*e(ce at the 0G le4el of s)g()+ca(ce to
co(cl,*e that 4eh)cles of th)s -ake a(* -o*el a(* )( th)s age ra(ge lose -ore
tha( 11;; per year )( 4al,e o( a4erage.
Saylor URL: http://www.saylor.org/books Saylor.org;1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 602/723
Figure 1A.1A6eEection 6egion and )est %tatistic for 'ote 1A.*8 >(ample >
IEJ &AIE ABAJS
• &he para-eter β1 the slope of the pop,lat)o( regress)o( l)(e )s of pr)-ary
)(terest beca,se )t *escr)bes the a4erage cha(ge )( / w)th respect to ,()t
)(crease )( .
• &he stat)st)c β 1 the slope of the least s<,ares regress)o( l)(e )s a po)(t est)-ate
of β1.%o(+*e(ce )(ter4als for β1ca( be co-p,te* ,s)(g a for-,la.
• @ypotheses regar*)(g β1are teste* ,s)(g the sa-e +4e=step proce*,res
)(tro*,ce* )( %hapter 6 Q&est)(g @ypothesesQ.
EKER%SES
#AS%
or the #as)c a(* Appl)cat)o( e7erc)ses )( th)s sect)o( ,se the co-p,tat)o(s that were
*o(e for the e7erc)ses w)th the sa-e (,-ber )( Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ a(* Sect)o( 1;. Q&he Least S<,ares Regress)o( L)(eQ.
Saylor URL: http://www.saylor.org/books Saylor.org;2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 603/723
1 %o(str,ct the 90G co(+*e(ce )(ter4al for the slope β1of the pop,lat)o( regress)o( l)(e
base* o( the sa-ple *ata set of E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ.
2 %o(str,ct the 9;G co(+*e(ce )(ter4al for the slope β1of the pop,lat)o( regress)o( l)(e
base* o( the sa-ple *ata set of E7erc)se 2 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ.
3 %o(str,ct the 9;G co(+*e(ce )(ter4al for the slope β1of the pop,lat)o( regress)o( l)(e
base* o( the sa-ple *ata set of E7erc)se 3 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ.
%o(str,ct the 99G co(+*e(ce )(ter4al for the slope β1of the pop,lat)o( regress)o(
E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ.
0 or the *ata )( E7erc)se 0 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ test at the
1;G le4el of s)g()+ca(ce whether )s ,sef,l for pre*)ct)(g / that )s whether β1≠05.
or the *ata )( E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ test at the
0G le4el of s)g()+ca(ce whether )s ,sef,l for pre*)ct)(g / that )s whether β1≠05.
%o(str,ct the 9;G co(+*e(ce )(ter4al for the slope β1of the pop,lat)o( regress)o( l)(e
base* o( the sa-ple *ata set of E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ.
6 %o(str,ct the 90G co(+*e(ce )(ter4al for the slope β1of the pop,lat)o( regress)o( l)(e
base* o( the sa-ple *ata set of E7erc)se 6 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ.
9 or the *ata )( E7erc)se 9 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ test at the
1G le4el of s)g()+ca(ce whether )s ,sef,l for pre*)ct)(g / that )s whether β1≠05.
1; or the *ata )( E7erc)se 1; of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ test at
the 1G le4el of s)g()+ca(ce whether )s ,sef,l for pre*)ct)(g / that )s whether β1≠05.
A22!&CAT&1NS
Saylor URL: http://www.saylor.org/books Saylor.org;3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 604/723
11 or the *ata )( E7erc)se 11 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQco(str,ct
a 9;G co(+*e(ce )(ter4al for the -ea( (,-ber of (ew wor*s ac<,)re* per -o(th by
ch)l*re( betwee( 13 a(* 16 -o(ths of age.
12 or the *ata )( E7erc)se 12 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQco(str,ct
a 9;G co(+*e(ce )(ter4al for the -ea( )(crease* brak)(g *)sta(ce for each a**)t)o(al1;; po,(*s of 4eh)cle we)ght.
13 or the *ata )( E7erc)se 13 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ test at
the 1;G le4el of s)g()+ca(ce whether age )s ,sef,l for pre*)ct)(g rest)(g heart rate.
1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ test at
the 1;G le4el of s)g()+ca(ce whether w)(* spee* )s ,sef,l for pre*)ct)(g wa4e he)ght.
10 or the s)t,at)o( *escr)be* )( E7erc)se 10 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ
a %o(str,ct the 90G co(+*e(ce )(ter4al for the -ea( )(crease )( re4e(,e per
a**)t)o(al tho,sa(* *ollars spe(t o( a*4ert)s)(g.
b A( a*4ert)s)(g age(cy tells the b,s)(ess ow(er that for e4ery a**)t)o(al
tho,sa(* *ollars spe(t o( a*4ert)s)(g re4e(,e w)ll )(crease by o4er 20;;;.
&est th)s cla)- wh)ch )s the alter(at)4e hypothes)s5 at the 0G le4el of
s)g()+ca(ce.
c Perfor- the test of part b5 at the 1;G le4el of s)g()+ca(ce.
* #ase* o( the res,lts )( b5 a(* c5 how bel)e4able )s the a* age(cyWs cla)-C
&h)s )s a s,b>ect)4e >,*ge-e(t.5
1 or the s)t,at)o( *escr)be* )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ
a %o(str,ct the 9;G co(+*e(ce )(ter4al for the -ea( )(crease )( he)ght per
a**)t)o(al )(ch of le(gth at age two.
b t )s cla)-e* that for g)rls each a**)t)o(al )(ch of le(gth at age two -ea(s
-ore tha( a( a**)t)o(al )(ch of he)ght at -at,r)ty. &est th)s cla)- wh)ch )s the
alter(at)4e hypothes)s5 at the 1;G le4el of s)g()+ca(ce.
1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ test at
the 1;G le4el of s)g()+ca(ce whether co,rse a4erage before the +(al e7a- )s ,sef,l for
pre*)ct)(g the +(al e7a- gra*e.
16 or the s)t,at)o( *escr)be* )( E7erc)se 16 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ a( agro(o-)st cla)-s that each a**)t)o(al -)ll)o( acres pla(te* res,lts )(
-ore tha( 0;;;; a**)t)o(al acres har4este*. &est th)s cla)- at the 1G le4el of
s)g()+ca(ce.
19 or the *ata )( E7erc)se 19 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ test at
the 1/1;th of 1G le4el of s)g()+ca(ce whether )g(or)(g all other facts s,ch as age a(*
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 605/723
bo*y -ass the a-o,(t of the -e*)cat)o( co(s,-e* )s a ,sef,l pre*)ctor of bloo*
co(ce(trat)o( of the act)4e )(gre*)e(t.
2; or the *ata )( E7erc)se 2; of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ test at
the 1G le4el of s)g()+ca(ce whether for each a**)t)o(al )(ch of g)rth the age of the tree
)(creases by at least two a(* o(e=half years.21 or the *ata )( E7erc)se 21 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a %o(str,ct the 90G co(+*e(ce )(ter4al for the -ea( )(crease )( stre(gth at 26
*ays for each a**)t)o(al h,(*re* ps) )(crease )( stre(gth at 3 *ays.
b &est at the 1/1;th of 1G le4el of s)g()+ca(ce whether the 3=*ay stre(gth )s
,sef,l for pre*)ct)(g 26=*ay stre(gth.
22 or the s)t,at)o( *escr)be* )( E7erc)se 22 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ
a %o(str,ct the 99G co(+*e(ce )(ter4al for the -ea( *ecrease )( e(ergy
*e-a(* for each o(e=*egree *rop )( te-perat,re.
b A( e(g)(eer w)th the power co-pa(y bel)e4es that for each o(e=*egree
)(crease )( te-perat,re *a)ly e(ergy *e-a(* w)ll *ecrease by -ore tha( 3.
-)ll)o( watt=ho,rs. &est th)s cla)- at the 1G le4el of s)g()+ca(ce.
!A/:+ (ATA S+T ++/C &S+S
23 Large ata Set 1 l)sts the SA& scores a(* !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a %o-p,te the 9;G co(+*e(ce )(ter4al for the slope β1of the pop,lat)o(
regress)o( l)(e w)th SA& score as the )(*epe(*e(t 4ar)able 5 a(* !PA as the*epe(*e(t 4ar)able / 5.
b &est at the 1;G le4el of s)g()+ca(ce the hypothes)s that the slope of the
pop,lat)o( regress)o( l)(e )s greater tha( ;.;;1 aga)(st the (,ll hypothes)s
that )t )s e7actly ;.;;1.
2 Large ata Set 12 l)sts the golf scores o( o(e ro,(* of golf for 0 golfers +rst ,s)(g the)r
ow( or)g)(al cl,bs the( ,s)(g cl,bs of a (ew e7per)-e(tal *es)g( after two -o(ths of
fa-)l)ar)Vat)o( w)th the (ew cl,bs5.
http://www.12.7ls
a %o-p,te the 90G co(+*e(ce )(ter4al for the slope β1of the pop,lat)o(
regress)o( l)(e w)th scores ,s)(g the or)g)(al cl,bs as the )(*epe(*e(t
4ar)able 5 a(* scores ,s)(g the (ew cl,bs as the *epe(*e(t 4ar)able / 5.
b &est at the 1;G le4el of s)g()+ca(ce the hypothes)s that the slope of the
pop,lat)o( regress)o( l)(e )s *)8ere(t fro- 1 aga)(st the (,ll hypothes)s that
)t )s e7actly 1.
Saylor URL: http://www.saylor.org/books Saylor.org;0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 606/723
20 Large ata Set 13 recor*s the (,-ber of b)**ers a(* sales pr)ce of a part)c,lar type of
a(t)<,e gra(*father clock at ; a,ct)o(s.
http://www.13.7ls
a %o-p,te the 90G co(+*e(ce )(ter4al for the slopeβ1
of the pop,lat)o(regress)o( l)(e w)th the (,-ber of b)**ers prese(t at the a,ct)o( as the
)(*epe(*e(t 4ar)able 5 a(* sales pr)ce as the *epe(*e(t 4ar)able / 5.
b &est at the 1;G le4el of s)g()+ca(ce the hypothes)s that the a4erage sales
pr)ce )(creases by -ore tha( 9; for each a**)t)o(al b)**er at a( a,ct)o(
aga)(st the *efa,lt that )t )(creases by e7actly 9;.
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 607/723
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 608/723
.9 The CoeJcient o (etermination
LEARNN! "#$E%&'E
1 &o lear( what the coe?c)e(t of *eter-)(at)o( )s how to co-p,te )t a(* what )t
tells ,s abo,t the relat)o(sh)p betwee( two 4ar)ables a(* / .
,f the scatter diagram of a set of (x,y)pairs shows neither an upward or downward trend$ then the
hori'ontal line y=y−fits it well$ as illustrated in /igure 15.11. The lack of any upward or downward
trend means that when an element of the population is selected at random$ knowing the value of the
measurement x for that element is not helpful in predicting the value of the measurement y.
!igure "5.""
y=y−
,f the scatter diagram shows a linear trend upward or downward then it is useful to compute the least
s"uares regression line y=β1x+β0and use it in predicting y. /igure 15.1! 0+ame +catter 2iagram
Saylor URL: http://www.saylor.org/books Saylor.org;6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 609/723
with Two %pproximating ines0 illustrates this. ,n each panel we have plotted the height and weight
data of +ection 15.1 0inear Celationships 4etween Dariables0. This is the same scatter plot as /igure
15.! 0lot of Eeight and (eight airs0$ with the average value line y=y−superimposed on it in the
left panel and the least s"uares regression line imposed on it in the right panel. The errors are
indicated graphically by the vertical line segments.
!igure "5."& %ame %catter (iagram with Two Approximating 'ines
Saylor URL: http://www.saylor.org/books Saylor.org;9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 610/723
EKAPLE 1;
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 611/723
&he
4al,e of
,se*
4eh)cles
of the-ake
a(*
-o*el
*)sc,sse* )( Note 1;.19 QE7a-ple 3Q )( Sect)o( 1;. Q&he Least S<,ares Regress)o(
L)(eQ4ar)es w)*ely. &he -ost e7pe(s)4e a,to-ob)le )( the sa-ple )( &able 1;.3 Qata
o( Age a(* 'al,e of Use* A,to-ob)les of a Spec)+c ake a(* o*elQ has 4al,e
3;0;; wh)ch )s (early half aga)( as -,ch as the least e7pe(s)4e o(e wh)ch )s
worth 2;;;. )(* the proport)o( of the 4ar)ab)l)ty )( 4al,e that )s acco,(te* for by
the l)(ear relat)o(sh)p betwee( age a(* 4al,e.
Sol,t)o(:
&he proport)o( of the 4ar)ab)l)ty )( 4al,e / that )s acco,(te* for by the l)(ear
relat)o(sh)p betwee( )t a(* age )s g)4e( by the coe?c)e(t of *eter-)(at)o( r 2. S)(ce
the correlat)o( coe?c)e(t r was alrea*y co-p,te* )( Note 1;.19 QE7a-ple
3Q as r=−0.819 r2=(−0.819)2=0.671.Abo,t G of the 4ar)ab)l)ty )( the 4al,e of th)s 4eh)cle
ca( be e7pla)(e* by )ts age.
EKAPLE 11
Use each of the three for-,las for the coe?c)e(t of *eter-)(at)o( to co-p,te )ts
4al,e for the e7a-ple of ages a(* 4al,es of 4eh)cles.
Sol,t)o(:
Saylor URL: http://www.saylor.org/books Saylor.org11
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 612/723
( Note 1;.19 QE7a-ple 3Q )( Sect)o( 1;. Q&he Least S<,ares Regress)o( L)(eQ we
co-p,te* the e7act 4al,es
The coefficient of determination r! can always be computed by s"uaring the correlation coefficient r if
it is known. %ny one of the defining formulas can also be used. Typically one would make the choice
based on which "uantities have already been computed. (hat should be avoided is trying to
compute r by taking the s"uare root of r!$ if it is already known$ since it is easy to make a sign error
this way. To see what can go wrong$ supposer2=0.64.Taking the s"uare root of a positive number
with any calculating device will always return a positive result. The s"uare root of 5.;6 is 5.?.
Eowever$ the actual value of r might be the negative number Z5.?.
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 613/723
*+, TA*+AA,S
• &he coe?c)e(t of *eter-)(at)o( r 2 est)-ates the proport)o( of the 4ar)ab)l)ty )(
the 4ar)able / that )s e7pla)(e* by the l)(ear relat)o(sh)p betwee( / a(* the
4ar)able .
• &here are se4eral for-,las for co-p,t)(g r 2. &he cho)ce of wh)ch o(e to ,se ca(
be base* o( wh)ch <,a(t)t)es ha4e alrea*y bee( co-p,te* so far.
EKER%SES
#AS%
or the #as)c a(* Appl)cat)o( e7erc)ses )( th)s sect)o( ,se the co-p,tat)o(s that
were *o(e for the e7erc)ses w)th the sa-e (,-ber )(Sect)o( 1;.2 Q&he L)(ear
%orrelat)o( %oe?c)e(tQ Sect)o( 1;. Q&he Least S<,ares Regress)o( L)(eQ
a(* Sect)o( 1;.0 QStat)st)cal (fere(ces Abo,t Q.
1 or the sa-ple *ata set of E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=β1SSxy/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that e7erc)se.
2 or the sa-ple *ata set of E7erc)se 2 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=β1SSxy/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that e7erc)se.
3 or the sa-ple *ata set of E7erc)se 3 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=β1SSxy/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that e7erc)se.
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 614/723
or the sa-ple *ata set of E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=β1SSxy/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that e7erc)se.
0 or the sa-ple *ata set of E7erc)se 0 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=β1SSxy/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that e7erc)se.
or the sa-ple *ata set of E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=β1SSxy/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that e7erc)se.
or the sa-ple *ata set of E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=(SSyy−SSE)/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that
e7erc)se.
6 or the sa-ple *ata set of E7erc)se 6 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=(SSyy−SSE)/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that
e7erc)se.
9 or the sa-ple *ata set of E7erc)se 9 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=(SSyy−SSE)/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that
e7erc)se.
1; or the sa-ple *ata set of E7erc)se 9 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o(
%oe?c)e(tQ +(* the coe?c)e(t of *eter-)(at)o( ,s)(g the
for-,la r2=(SSyy−SSE)/SSyy.%o(+r- yo,r a(swer by s<,ar)(g r as co-p,te* )( that
e7erc)se.
APPL%A&"NS
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 615/723
11 or the *ata )( E7erc)se 11 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ co-p,te
the coe?c)e(t of *eter-)(at)o( a(* )(terpret )ts 4al,e )( the co(te7t of age a(*
4ocab,lary.12 or the *ata )( E7erc)se 12 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ co-p,te
the coe?c)e(t of *eter-)(at)o( a(* )(terpret )ts 4al,e )( the co(te7t of 4eh)cle we)ght
a(* brak)(g *)sta(ce.13 or the *ata )( E7erc)se 13 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ co-p,te
the coe?c)e(t of *eter-)(at)o( a(* )(terpret )ts 4al,e )( the co(te7t of age a(* rest)(g
heart rate. ( the age ra(ge of the *ata *oes age see- to be a 4ery )-porta(t factor
w)th regar* to heart rateC1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ co-p,te
the coe?c)e(t of *eter-)(at)o( a(* )(terpret )ts 4al,e )( the co(te7t of w)(* spee* a(*
wa4e he)ght. oes w)(* spee* see- to be a 4ery )-porta(t factor w)th regar* to wa4e
he)ghtC10 or the *ata )( E7erc)se 10 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ +(* the
proport)o( of the 4ar)ab)l)ty )( re4e(,e that )s e7pla)(e* by le4el of a*4ert)s)(g.1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ +(* the
proport)o( of the 4ar)ab)l)ty )( a*,lt he)ght that )s e7pla)(e* by the 4ar)at)o( )( le(gth at
age two.1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ co-p,te
the coe?c)e(t of *eter-)(at)o( a(* )(terpret )ts 4al,e )( the co(te7t of co,rse a4erage
before the +(al e7a- a(* score o( the +(al e7a-.16 or the *ata )( E7erc)se 16 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ co-p,te
the coe?c)e(t of *eter-)(at)o( a(* )(terpret )ts 4al,e )( the co(te7t of acres pla(te*
a(* acres har4este*.19 or the *ata )( E7erc)se 19 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ co-p,te
the coe?c)e(t of *eter-)(at)o( a(* )(terpret )ts 4al,e )( the co(te7t of the a-o,(t of
the -e*)cat)o( co(s,-e* a(* bloo* co(ce(trat)o( of the act)4e )(gre*)e(t.2; or the *ata )( E7erc)se 2; of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ co-p,te
the coe?c)e(t of *eter-)(at)o( a(* )(terpret )ts 4al,e )( the co(te7t of tree s)Ve a(*
age.21 or the *ata )( E7erc)se 21 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ +(* the
proport)o( of the 4ar)ab)l)ty )( 26=*ay stre(gth of co(crete that )s acco,(te* for by
4ar)at)o( )( 3=*ay stre(gth.
22 or the *ata )( E7erc)se 22 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ +(* theproport)o( of the 4ar)ab)l)ty )( e(ergy *e-a(* that )s acco,(te* for by 4ar)at)o( )(
a4erage te-perat,re.
LAR!E A&A SE & EKE R%SES
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 616/723
23 Large ata Set 1 l)sts the SA& scores a(* !PAs of 1;;; st,*e(ts. %o-p,te the
coe?c)e(t of *eter-)(at)o( a(* )(terpret )ts 4al,e )( the co(te7t of SA& scores a(*
!PAs.
http://www.1.7ls
2 Large ata Set 12 l)sts the golf scores o( o(e ro,(* of golf for 0 golfers +rst ,s)(g the)r
ow( or)g)(al cl,bs the( ,s)(g cl,bs of a (ew e7per)-e(tal *es)g( after two -o(ths of
fa-)l)ar)Vat)o( w)th the (ew cl,bs5. %o-p,te the coe?c)e(t of *eter-)(at)o( a(*
)(terpret )ts 4al,e )( the co(te7t of golf scores w)th the two k)(*s of golf cl,bs.
http://www.12.7ls
20 Large ata Set 13 recor*s the (,-ber of b)**ers a(* sales pr)ce of a part)c,lar type of
a(t)<,e gra(*father clock at ; a,ct)o(s. %o-p,te the coe?c)e(t of *eter-)(at)o( a(*
)(terpret )ts 4al,e )( the co(te7t of the (,-ber of b)**ers at a( a,ct)o( a(* the pr)ce of
th)s type of a(t)<,e gra(*father clock.
http://www.13.7ls
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 617/723
%.4 +stimation and 2rediction
LEARNN! "#$E%&'ES
1 &o lear( the *)st)(ct)o( betwee( est)-at)o( a(* pre*)ct)o(.
2 &o lear( the *)st)(ct)o( betwee( a co(+*e(ce )(ter4al a(* a pre*)ct)o( )(ter4al.
3 &o lear( how to )-ple-e(t for-,las for co-p,t)(g co(+*e(ce )(ter4als a(*
pre*)ct)o( )(ter4als.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 618/723
7onsider the following pairs of problems$ in the context of 9ote 15.1F 0xample 30 in +ection 15.6
0The east +"uares Cegression ine0$ the automobile age and value example.
1
1 stimate the average value of all four-year-old automobiles of this make and
model.
! 7onstruct a F8B confidence interval for the average value of all four-year-old
automobiles of this make and model.
!
1 +hylock intends to buy a four-year-old automobile of this make and model next
week. redict the value of the first such automobile that he encounters.
! 7onstruct a F8B confidence interval for the value of the first such automobile
that he encounters.
The method of solution and answer to the first "uestion in each pair$ *1a) and *!a)$ are the
same. (hen we set x e"ual to 6 in the least s"uares regression
e"uation y=−2.05x+32.83that was computed in part *c) of 9ote 15.1F 0xample
30 in +ection 15.6 0The east +"uares Cegression ine0$ the number returned$
y=−2.05(4)+32.83=24.63
which corresponds to value >!6$;35$ is an estimate of precisely the number sought in
"uestion *1a): the meanE(y)of all y values when x M 6. +ince nothing is known about the firstfour-year-old automobile of this make and model that +hylock will encounter$ our best guess
as to its value is the mean value E(y)of all such automobiles$ the number !6.;3 or >!6$;35$
computed in the same way.
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 619/723
The answers to the second part of each "uestion differ. ,n "uestion *1b) we are trying to
estimate a population parameter: the mean of the all the y-values in the sub-population
picked out by the value x M 6$ that is$ the average value of all four-year-old automobiles. ,n
"uestion *!b)$ however$ we are not trying to capture a fixed parameter$ but the value of the
random variable y in one trial of an experiment: examine the first four-year-old car +hylockencounters. ,n the first case we seek to construct a confidence interval in the same sense that
we have done before. ,n the second case the situation is different$ and the interval
constructed has a different name$ prediction interval. ,n the second case we are trying to
GpredictH where a the value of a random variable will take its value.
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 620/723
a. x p is a particular value of x that lies in the range of x -values in the data set used to construct the
least s"uares regression line
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 621/723
b. y pis the numerical value obtained when the least s"uare regression e"uation is evaluated atx=x p
and
c. the number of degrees of freedom fortα/2isdf=n−2.
The assumptions listed in +ection 15.3 0odelling inear Celationships with Candomness resent0 must
hold.
EKAPLE 12
Us)(g the sa-ple *ata of Note 1;.19 QE7a-ple 3Q )( Sect)o( 1;. Q&he Least S<,ares
Regress)o( L)(eQ recor*e* )( &able 1;.3 Qata o( Age a(* 'al,e of Use* A,to-ob)les
of a Spec)+c ake a(* o*elQ co(str,ct a 90G co(+*e(ce )(ter4al for the a4erage
4al,e of all three=a(*=o(e=half=year=ol* a,to-ob)les of th)s -ake a(* -o*el.
Sol,t)o(:
Sol4)(g th)s proble- )s -erely a -atter of +(*)(g the 4al,es of y p αa(* tα 2@ sε@ x−
a(* SSxx a(* )(sert)(g the- )(to the co(+*e(ce )(ter4al for-,la g)4e( >,st abo4e.
ost of these <,a(t)t)es are alrea*y k(ow(. ro- Note 1;.19 QE7a-ple 3Q )( Sect)o(
1;. Q&he Least S<,ares Regress)o( L)(eQ SSxx=14a(* x−=4.ro- Note 1;.31
QE7a-ple Q )(Sect)o( 1;.0 QStat)st)cal (fere(ces Abo,t Q sε=1.902169814.
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 622/723
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 623/723
IEJ &AIE ABAJS
• A co(+*e(ce )(ter4al )s ,se* to est)-ate the -ea( 4al,e of / )( the s,b=
pop,lat)o( *eter-)(e* by the co(*)t)o( that ha4e so-e spec)+c 4al,e p.
• &he pre*)ct)o( )(ter4al )s ,se* to pre*)ct the 4al,e that the ra(*o- 4ar)able / w)ll
take whe( has so-e spec)+c 4al,e p.
EKER%SES
#AS%
or the #as)c a(* Appl)cat)o( e7erc)ses )( th)s sect)o( ,se the co-p,tat)o(s that were
*o(e for the e7erc)ses w)th the sa-e (,-ber )( pre4)o,s sect)o(s.
Saylor URL: http://www.saylor.org/books Saylor.org23
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 624/723
1 or the sa-ple *ata set of E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( .
b %o(str,ct the 9;G co(+*e(ce )(ter4al for that -ea( 4al,e.
2 or the sa-ple *ata set of E7erc)se 2 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( .
b %o(str,ct the 9;G co(+*e(ce )(ter4al for that -ea( 4al,e.
3 or the sa-ple *ata set of E7erc)se 3 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( .
b %o(str,ct the 90G co(+*e(ce )(ter4al for that -ea( 4al,e.
or the sa-ple *ata set of E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( 2.
b %o(str,ct the 6;G co(+*e(ce )(ter4al for that -ea( 4al,e.
0 or the sa-ple *ata set of E7erc)se 0 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( 1.
b %o(str,ct the 6;G co(+*e(ce )(ter4al for that -ea( 4al,e.
or the sa-ple *ata set of E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( 0.
b %o(str,ct the 90G co(+*e(ce )(ter4al for that -ea( 4al,e.
or the sa-ple *ata set of E7erc)se of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( .
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 625/723
b %o(str,ct the 99G co(+*e(ce )(ter4al for that -ea( 4al,e.
c s )t 4al)* to -ake the sa-e est)-ates for 12C E7pla)(.
6 or the sa-ple *ata set of E7erc)se 6 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( 12.
b %o(str,ct the 6;G co(+*e(ce )(ter4al for that -ea( 4al,e.
c s )t 4al)* to -ake the sa-e est)-ates for ;C E7pla)(.
9 or the sa-ple *ata set of E7erc)se 9 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( ;.
b %o(str,ct the 9;G co(+*e(ce )(ter4al for that -ea( 4al,e.
c s )t 4al)* to -ake the sa-e est)-ates for x=−1C E7pla)(.
1; or the sa-ple *ata set of E7erc)se 9 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the -ea( 4al,e of / )( the s,b=pop,lat)o(
*eter-)(e* by the co(*)t)o( 6.
b %o(str,ct the 90G co(+*e(ce )(ter4al for that -ea( 4al,e.
c s )t 4al)* to -ake the sa-e est)-ates for ;C E7pla)(.
APPL%A&"NS
11 or the *ata )( E7erc)se 11 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the a4erage (,-ber of wor*s )( the 4ocab,lary of
16=-o(th=ol* ch)l*re(.
b %o(str,ct the 90G co(+*e(ce )(ter4al for that -ea( 4al,e.
c s )t 4al)* to -ake the sa-e est)-ates for two=year=ol*sC E7pla)(.
12 or the *ata )( E7erc)se 12 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the a4erage brak)(g *)sta(ce of a,to-ob)les that
we)gh 320; po,(*s.
b %o(str,ct the 6;G co(+*e(ce )(ter4al for that -ea( 4al,e.
c s )t 4al)* to -ake the sa-e est)-ates for 0;;;=po,(* a,to-ob)lesC E7pla)(.
Saylor URL: http://www.saylor.org/books Saylor.org20
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 626/723
13 or the *ata )( E7erc)se 13 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the rest)(g heart rate of a -a( who )s 30 years ol*.
b "(e of the -e( )( the sa-ple )s 30 years ol* b,t h)s rest)(g heart rate )s (ot
what yo, co-p,te* )( part a5. E7pla)( why th)s )s (ot a co(tra*)ct)o(.
c %o(str,ct the 9;G co(+*e(ce )(ter4al for the -ea( rest)(g heart rate of all
30=year=ol* -e(.
1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the wa4e he)ght whe( the w)(* spee* )s 13 -)les per
ho,r.
b "(e of the w)(* spee*s )( the sa-ple )s 13 -)les per ho,r b,t the he)ght of
wa4es that *ay )s (ot what yo, co-p,te* )( part a5. E7pla)( why th)s )s (ot a
co(tra*)ct)o(.
c %o(str,ct the 9;G co(+*e(ce )(ter4al for the -ea( wa4e he)ght o( *ays
whe( the w)(* spee* )s 13 -)les per ho,r.
10 or the *ata )( E7erc)se 10 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a &he b,s)(ess ow(er )(te(*s to spe(* 20;; o( a*4ert)s)(g (e7t year. !)4e a(
est)-ate of (e7t yearWs re4e(,e base* o( th)s fact.
b %o(str,ct the 9;G pre*)ct)o( )(ter4al for (e7t yearWs re4e(,e base* o( the
)(te(t to spe(* 20;; o( a*4ert)s)(g.
1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a A two=year=ol* g)rl )s 32.3 )(ches lo(g. Pre*)ct her a*,lt he)ght.
b %o(str,ct the 90G pre*)ct)o( )(ter4al for the g)rlWs a*,lt he)ght.
1 or the *ata )( E7erc)se 1 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a Lo*o4)co has a 6. a4erage )( h)s phys)cs class >,st before the +(al. !)4e a
po)(t est)-ate of what h)s +(al e7a- gra*e w)ll be.
b E7pla)( whether a( )(ter4al est)-ate for th)s proble- )s a co(+*e(ce )(ter4al
or a pre*)ct)o( )(ter4al.
c #ase* o( yo,r a(swer to b5 co(str,ct a( )(ter4al est)-ate for Lo*o4)coWs
+(al e7a- gra*e at the 9;G le4el of co(+*e(ce.
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 627/723
16 or the *ata )( E7erc)se 16 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a &h)s year 6.2 -)ll)o( acres of cor( were pla(te*. !)4e a po)(t est)-ate of the
(,-ber of acres that w)ll be har4este* th)s year.
b E7pla)( whether a( )(ter4al est)-ate for th)s proble- )s a co(+*e(ce )(ter4al
or a pre*)ct)o( )(ter4al.
c #ase* o( yo,r a(swer to b5 co(str,ct a( )(ter4al est)-ate for the (,-ber of
acres that w)ll be har4este* th)s year at the 99G le4el of co(+*e(ce.
19 or the *ata )( E7erc)se 19 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a !)4e a po)(t est)-ate for the bloo* co(ce(trat)o( of the act)4e )(gre*)e(t of
th)s -e*)cat)o( )( a -a( who has co(s,-e* 1.0 o,(ces of the -e*)cat)o( >,st
rece(tly.
b !rat)a(o >,st co(s,-e* 1.0 o,(ces of th)s -e*)cat)o( 3; -)(,tes ago.
%o(str,ct a 90G pre*)ct)o( )(ter4al for the co(ce(trat)o( of the act)4e
)(gre*)e(t )( h)s bloo* r)ght (ow.
2; or the *ata )( E7erc)se 2; of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a Jo, -eas,re the g)rth of a free=sta(*)(g oak tree +4e feet o8 the gro,(* a(*
obta)( the 4al,e 12 )(ches. @ow ol* *o yo, est)-ate the tree to beC
b %o(str,ct a 9;G pre*)ct)o( )(ter4al for the age of th)s tree.
21 or the *ata )( E7erc)se 21 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a A test cyl)(*er of co(crete three *ays ol* fa)ls at 10; ps). Pre*)ct what the
26=*ay stre(gth of the co(crete w)ll be.
b %o(str,ct a 99G pre*)ct)o( )(ter4al for the 26=*ay stre(gth of th)s co(crete.
c #ase* o( yo,r a(swer to b5 what wo,l* be the -)()-,- 26=*ay stre(gth
yo, co,l* e7pect th)s co(crete to e7h)b)tC
22 or the *ata )( E7erc)se 22 of Sect)o( 1;.2 Q&he L)(ear %orrelat)o( %oe?c)e(tQ
a &o-orrowWs a4erage te-perat,re )s forecast to be 03 *egrees. Est)-ate the
e(ergy *e-a(* to-orrow.
b %o(str,ct a 99G pre*)ct)o( )(ter4al for the e(ergy *e-a(* to-orrow.
c #ase* o( yo,r a(swer to b5 what wo,l* be the -)()-,- *e-a(* yo, co,l*
e7pectC
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 628/723
LAR!E A&A S E& EKE R%SES
23 Large ata Set 1 l)sts the SA& scores a(* !PAs of 1;;; st,*e(ts.
http://www.1.7ls
a !)4e a po)(t est)-ate of the -ea( !PA of all st,*e(ts who score 130; o( the
SA&.
b %o(str,ct a 9;G co(+*e(ce )(ter4al for the -ea( !PA of all st,*e(ts who
score 130; o( the SA&.
2 Large ata Set 12 l)sts the golf scores o( o(e ro,(* of golf for 0 golfers +rst ,s)(g the)r
ow( or)g)(al cl,bs the( ,s)(g cl,bs of a (ew e7per)-e(tal *es)g( after two -o(ths offa-)l)ar)Vat)o( w)th the (ew cl,bs5.
http://www.12.7ls
a &h,r)o a4erages 2 strokes per ro,(* w)th h)s ow( cl,bs. !)4e a po)(t
est)-ate for h)s score o( o(e ro,(* )f he sw)tches to the (ew cl,bs.
b E7pla)( whether a( )(ter4al est)-ate for th)s proble- )s a co(+*e(ce )(ter4al
or a pre*)ct)o( )(ter4al.
c #ase* o( yo,r a(swer to b5 co(str,ct a( )(ter4al est)-ate for &h,r)oWs score
o( o(e ro,(* )f he sw)tches to the (ew cl,bs at 9;G co(+*e(ce.
20 Large ata Set 13 recor*s the (,-ber of b)**ers a(* sales pr)ce of a part)c,lar type of
a(t)<,e gra(*father clock at ; a,ct)o(s.
http://www.13.7ls
a &here are se4e( l)kely b)**ers at the 'ero(a a,ct)o( to*ay. !)4e a po)(t
est)-ate for the pr)ce of s,ch a clock at to*ayWs a,ct)o(.
b E7pla)( whether a( )(ter4al est)-ate for th)s proble- )s a co(+*e(ce )(ter4al
or a pre*)ct)o( )(ter4al.
Saylor URL: http://www.saylor.org/books Saylor.org26
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 629/723
c #ase* o( yo,r a(swer to b5 co(str,ct a( )(ter4al est)-ate for the l)kely sale
pr)ce of s,ch a clock at to*ayWs sale at 90G co(+*e(ce.
Saylor URL: http://www.saylor.org/books Saylor.org29
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 630/723
%.5 A Complete +xample
LEARNN! "#$E%&'E
1 &o see a co-plete l)(ear correlat)o( a(* regress)o( a(alys)s )( a pract)cal sett)(g
as a cohes)4e whole.
,n the preceding sections numerous concepts were introduced and illustrated$ but the analysis was
broken into dis#oint pieces by sections. ,n this section we will go through a complete example of the
use of correlation and regression analysis of data from start to finish$ touching on all the topics ofthis chapter in se"uence.
,n general educators are convinced that$ all other factors being e"ual$ class attendance has a
significant bearing on course performance. To investigate the relationship between attendance and
Saylor URL: http://www.saylor.org/books Saylor.org3;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 631/723
performance$ an education researcher selects for study a multiple section introductory statistics
course at a large university. ,nstructors in the course agree to keep an accurate record of attendance
throughout one semester. %t the end of the semester !; students are selected a random. /or each
student in the sample two measurements are taken: x $ the number of days the student was absent$
andy$ the student&s score on the common final exam in the course. The data are summari'ed in Table15.6 0%bsence and +core 2ata0.
Table 15.6 %bsence and +core 2ata
Absences Score Absences Score
x y x y
2 1
29 0 3
2 9 66
3 ; 96
2 9 1 99
1 ; 69
; 66 1 9
; 92 3 9;
00 1 9;
; 3 6
2 6; 1 6
2 0 3 6;
1 3 1 6
Saylor URL: http://www.saylor.org/books Saylor.org31
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 632/723
% scatter plot of the data is given in /igure 15.13 0lot of the %bsence and xam +core airs0. There
is a downward trend in the plot which indicates that on average students with more absences tend to
do worse on the final examination.
!igure "5."* $lot of the Absence and ;xam %core $airs
The trend observed in /igure 15.13 0lot of the %bsence and xam +core airs0 as well as the fairly
constant width of the apparent band of points in the plot makes it reasonable to assume a
relationship between x and y of the form
y=β1x+β0+ε
whereβ1andβ0are unknown parameters and L is a normal random variable with mean 'ero and
unknown standard deviation 6 . 9ote carefully that this model is being proposed for the population of
all students taking this course$ not #ust those taking it this semester$ and certainly not #ust those in
the sample. The numbersβ1$β0$ and 6 are parameters relating to this large population.
/irst we perform preliminary computations that will be needed later. The data are processed in Table
15.8 0rocessed %bsence and +core 2ata0.
Saylor URL: http://www.saylor.org/books Saylor.org32
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 633/723
Saylor URL: http://www.saylor.org/books Saylor.org33
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 634/723
The statisticsεestimates the standard deviation 6 of the normal random variable L in the model. ,ts
meaning is that among all students with the same number of absences$ the standard deviation of
their scores on the final exam is about 1!.1 points. +uch a large value on a 155-point exam means
that the final exam scores of each sub-population of students$ based on the number of absences$ are
highly variable.
The si'e and sign of the slope β1=−5.23indicate that$ for every class missed$ students tend to score
about 8.!3 fewer points lower on the final exam on average. +imilarly for every two classes missed
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 635/723
students tend to score on average2×5.23=10.46fewer points on the final exam$ or about a letter
grade worse on average.
+ince 5 is in the range of x -values in the data set$ the y-intercept also has meaning in this problem. ,t
is an estimate of the average grade on the final exam of all students who have perfect attendance. The
predicted average of such students is β0=91.24.
4efore we use the regression e"uation further$ or perform other analyses$ it would be a good idea to
examine the utility of the linear regression model. (e can do this in two ways: 1) by computing the
correlation coefficient r to see how strongly the number of absences x and the score y on the final
exam are correlated$ and !) by testing the null hypothesisH0:β1=0*the slope of
the population regression line is 'ero$ so x is not a good predictor of y) against the natural
alternativeHa:β1<0*the slope of the population regression line is negative$ so final exam scores y go
down as absences x go up).
Saylor URL: http://www.saylor.org/books Saylor.org30
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 636/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 637/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 638/723
or about 6FB. Thus although there is a significant correlation between attendance and performance
on the final exam$ and we can estimate with fair accuracy the average score of students who miss a
certain number of classes$ nevertheless less than half the total variation of the exam scores in the
sample is explained by the number of absences. This should not come as a surprise$ since there are
many factors besides attendance that bear on student performance on exams.
*+, TA*+AA,
• t )s a goo* )*ea to atte(* class.
++/C&S+S
Saylor URL: http://www.saylor.org/books Saylor.org36
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 639/723
&he e7erc)ses )( th)s sect)o( are ,(relate* to those )( pre4)o,s sect)o(s.
1 &he *ata g)4e the a-o,(t of s)l)co[,or)*e )( the water -g/L5 a(* the a-o,(t / of
lea* )( the bloo*strea- μg/*L5 of te( ch)l*re( )( 4ar)o,s co--,()t)es w)th a(* w)tho,t
-,()c)pal water. Perfor- a co-plete a(alys)s of the *ata )( a(alogy w)th the *)sc,ss)o(
)( th)s sect)o( that )s -ake a scatter plot *o prel)-)(ary co-p,tat)o(s +(* the leasts<,ares regress)o( l)(e +(* SSE sε a(* r a(* so o(5. ( the hypothes)s test ,se as the
alter(at)4e hypothes)s β1>0 a(* test at the 0G le4el of s)g()+ca(ce. Use co(+*e(ce le4el
90G for the co(+*e(ce )(ter4al for β1.%o(str,ct 90G co(+*e(ce a(* pre*)ct)o(s
)(ter4als at x p=2at the e(*.
Saylor URL: http://www.saylor.org/books Saylor.org39
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 640/723
http://www.3.7ls
http://www.3A.7ls
Separate o,t fro- Large ata Set 3A >,st the *ata o( -e( a(* *o a co-plete
a(alys)s w)th shoe s)Ve as the )(*epe(*e(t 4ar)able 5 a(* he)ght as the *epe(*e(t
4ar)able / 5. Use α=0.05a(* x p=10whe(e4er appropr)ate.
http://www.3A.7ls
0 Separate o,t fro- Large ata Set 3A >,st the *ata o( wo-e( a(* *o a co-plete
a(alys)s w)th shoe s)Ve as the )(*epe(*e(t 4ar)able 5 a(* he)ght as the *epe(*e(t
4ar)able / 5. Use α=0.05a(* x p=10whe(e4er appropr)ate.
http://www.3A.7ls
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 641/723
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 642/723
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 643/723
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 644/723
Chapter %%
Chi-S"uare Tests and F-Tests
,n previous chapters you saw how to test hypotheses concerning population means and populationproportions. The idea of testing hypotheses can be extended to many other situations that involve
different parameters and use different test statistics. (hereas the standardi'ed test statistics that
appeared in earlier chapters followed either a normal or +tudent t -distribution$ in this chapter the
tests will involve two other very common and useful distributions$ the chi-s"uare and the ! -
distributions. The chi(square distribution arises in tests of hypotheses concerning the
independence of two random variables and concerning whether a discrete random variable follows a
specified distribution. The 7(distribution arises in tests of hypotheses concerning whether or not
two population variances are e"ual and concerning whether or not three or more population means
are e"ual.
%%.% Chi-S"uare Tests or &ndependence
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 645/723
!+A/N&N: 1';+CT&<+S
1 &o ,(*ersta(* what ch)=s<,are *)str)b,t)o(s are.
2 &o ,(*ersta(* how to ,se a ch)=s<,are test to >,*ge whether two factors are
)(*epe(*e(t.
Chi-S"uare (istributions
%s you know$ there is a whole family of t -distributions$ each one specified by a parameter called
the degrees of freedom$ denoteddf.+imilarly$ all the chi-s"uare distributions form a family$ and each of
its members is also specified by a parameterdf$ the number of degrees of freedom. 7hi is a reek letter
denoted by the symbolχand chi-s"uare is often denoted byχ2./igure 11.1 0any 0 shows several chi-
s"uare distributions for different degrees of freedom. % chi-s"uare random variable is a random variable
that assumes only positive values and follows a chi-s"uare distribution.
!igure ""." 2anyχ2 (istributions
e+()t)o(
The value of the chi:square random variable χ2with df=kthat cuts off a right tail of area c is
denoted χ2cand is called a critical value. %ee !igure "".&.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 646/723
!igure "".&χ2c >llustrated
/igure 1!.6 07ritical Dalues of 7hi-+"uare 2istributions0 gives values ofχ2cfor various values of c and
under several chi-s"uare distributions with various degrees of freedom.
Tests or &ndependence
Eypotheses tests encountered earlier in the book had to do with how the numerical values of two
population parameters compared. ,n this subsection we will investigate hypotheses that have to do with
whether or not two random variables take their values independently$ or whether the value of one has a
relation to the value of the other. Thus the hypotheses will be expressed in words$ not mathematical
symbols. (e build the discussion around the following example.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 647/723
There is a theory that the gender of a baby in the womb is related to the baby&s heart rate: baby girls tend
to have higher heart rates. +uppose we wish to test this theory. (e examine the heart rate records of 65
babies taken during their mothers& last prenatal checkups before delivery$ and to each of these 65
randomly selected records we compute the values of two random measures: 1) gender and !) heart rate. ,n
this context these two random measures are often called factors. +ince the burden of proof is that heart
rate and gender are related$ not that they are unrelated$ the problem of testing the theory on baby gender
and heart rate can be formulated as a test of the following hypotheses:
HO: Baby gender and baby heart rate are independent
vs H. a: Baby gender and baby heart rate arenot independent
The factor gender has two natural categories or levels: boy and girl. (e divide the second factor$
heart rate$ into two levels$ low and high$ by choosing some heart rate$ say 168 beats per minute$ as
the cutoff between them. % heart rate below 168 beats per minute will be considered low and 168 and
above considered high. The 65 records give rise to a ! !contingency table. 4y ad#oining row totals$
column totals$ and a grand total we obtain the table shown as Table 11.1 04aby ender and Eeart
Cate0. The four entries in boldface type are counts of observations from the sample of nM 65. There
were 11 girls with low heart rate$ 1@ boys with low heart rate$ and so on. They form the core of the
expanded table.
Table 11.1 4aby ender and Eeart Cate
4eart 'ate
2o3 4igh 'o3 Total
(ender
(ir" ; 18
:o ; + 22
o"&%n ,o'a" 28 12 ,o'a" F 40
,n analogy with the fact that the probability of independent events is the product of the probabilities
of each event$ if heart rate and gender were independent then we would expect the number in each
core cell to be close to the product of the row total - and column total < of the row and column
containing it$ divided by the sample si'e n. 2enoting such an expected number of observations ; $
these four expected values are:
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 648/723
• 1st row and 1st column:E=(R×C)/n=18×28/40=12.6
• 1st row and !nd column:E=(R×C)/n=18×12/40=5.4
• !nd row and 1st column:E=(R×C)/n=22×28/40=15.4
• !nd row and !nd column:E=(R×C)/n=22×12/40=6.6
(e update Table 11.1 04aby ender and Eeart Cate0 by placing each expected value in its
corresponding core cell$ right under the observed value in the cell. This gives the updated table Table
11.! 0<pdated 4aby ender and Eeart Cate0.
Table 11.! <pdated 4aby ender and Eeart Cate
=eart /ate
!ow =i$h /ow Total
!e(*er
!)rl O=11E=12.6 O=7E=5.4 6 16
#oy O=17E=15.4 O=5E=6.6 6 22
%ol,-( &otal , 26 , 12 n ;
% measure of how much the data deviate from what we would expect to see if the factors really were
independent is the sum of the s"uares of the difference of the numbers in each core cell$ or$ standardi'ing
by dividing each s"uare by the expected number in the cell$ the sumΣ(O−E)2
E. (e would re#ect the
null hypothesis that the factors are independent only if this number is large$ so the test is right-tailed. ,n
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 649/723
this example the random variableΣ(O−E)2 Ehas the chi-s"uare distribution with one degree of
freedom. ,f we had decided at the outset to test at the 15B level of significance$ the critical value defining
the re#ection region would be$ reading from /igure 1!.6 07ritical Dalues of 7hi-+"uare
2istributions0$χ2α=χ20.10=2.706$ so that the re#ection region would be the interval[2.706,∞). (hen
we compute the value of the standardi'ed test statistic we obtain
%s in the example each factor is divided into a number of categories or levels. These could arise
naturally$ as in the boy-girl division of gender$ or somewhat arbitrarily$ as in the high-low division of
heart rate. +uppose /actor 1 has > levels and /actor ! has M levels. Then the information from a
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 650/723
random sample gives rise to a general > M contingency table$ which with row totals$ column totals$
and a grand total would appear as shown in Table 11.3 0eneral 7ontingency Table0. ach cell may
be labeled by a pair of indices(i, j).Oijstands for the observed count of observations in the cell in
row i and column j $ -i for theithrow total and < j for the jthcolumn total. To simplify the notation we
will drop the indices so Table 11.3 0eneral 7ontingency Table0 becomes Table 11.6 0+implified
eneral 7ontingency Table0. 9evertheless it is important to keep in mind that the )s$ the -s and
the < s$ though denoted by the same symbols$ are in fact different numbers.
Table 11.3 eneral 7ontingency Table
Factor 0 !evels
% Y Y Y j Y Y Y J /ow Total
actor 1 Le4els
1 11 Y Y Y O1 j Y Y Y O1J 61
i Oi1 Y Y Y Oij Y Y Y OiJ 6i
# OI1 Y Y Y OIj Y Y Y OIJ 6#
%ol,-( &otal ,1 Y Y Y , E
Y Y Y , n
Table 11.6 +implified eneral 7ontingency Table
Factor 0 !evels
% Y Y Y j Y Y Y J /ow Total
actor 1 Le4els 1 Y Y Y Y Y Y 6
i Y Y Y Y Y Y 6
Saylor URL: http://www.saylor.org/books Saylor.org0;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 651/723
Factor 0 !evels
% Y Y Y j Y Y Y J /ow Total
# Y Y Y Y Y Y 6
%ol,-( &otal , Y Y Y , Y Y Y , n
%s in the example$ for each core cell in the table we compute what would be the expected number ; of
observations if the two factors were independent. ; is computed for each core cell *each cell with
an ) in it) of Table 11.6 0+implified eneral 7ontingency Table0 by the rule applied in the example:
Saylor URL: http://www.saylor.org/books Saylor.org01
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 652/723
Saylor URL: http://www.saylor.org/books Saylor.org02
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 653/723
EKAPLE 1
A researcher w)shes to )(4est)gate whether st,*e(tsW scores o( a college
e(tra(ce e7a-)(at)o( %EE5 ha4e a(y )(*)cat)4e power for f,t,re college
perfor-a(ce as -eas,re* by !PA. ( other wor*s he w)shes to )(4est)gate
whether the factors %EE a(* !PA are )(*epe(*e(t or (ot. @e ra(*o-ly
selects n 1;; st,*e(ts )( a college a(* (otes each st,*e(tWs score o( the
e(tra(ce e7a-)(at)o( a(* h)s gra*e po)(t a4erage at the e(* of the sopho-ore
year. @e *)4)*es e(tra(ce e7a- scores )(to two le4els a(* gra*e po)(t a4erages
)(to three le4els. Sort)(g the *ata accor*)(g to these *)4)s)o(s he for-s the
co(t)(ge(cy table show( as &able 11. Q%EE 4ers,s !PA %o(t)(ge(cy &ableQ )(
wh)ch the row a(* col,-( totals ha4e alrea*y bee( co-p,te*.
Saylor URL: http://www.saylor.org/books Saylor.org03
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 654/723
&A#LE 11. %EE 'ERSUS !PA %"N&N!EN%J &A#LE
:2A
K0.4 0.4 to 3.0 L3.0 /ow Total
%EE
<1800 38 %0 8 02
≥1800 9 07 %5 6
%ol,-( &otal 1 3 23 Total=100
&est at the 1G le4el of s)g()+ca(ce whether these *ata pro4)*e s,?c)e(t
e4)*e(ce to co(cl,*e that %EE scores )(*)cate f,t,re perfor-a(ce le4els of
)(co-)(g college fresh-e( as -eas,re* by !PA.
Sol,t)o(:
Be perfor- the test ,s)(g the cr)t)cal 4al,e approach follow)(g the ,s,al +4e=step -etho* o,tl)(e* at the e(* of Sect)o( 6.1 Q&he Ele-e(ts of @ypothes)s
&est)(gQ )( %hapter 6 Q&est)(g @ypothesesQ.
• Step 1. &he hypotheses are
H0: CEE and GPA are independent factors
vs.Ha: CEE and GPA are not independent factors
• Step 2. &he *)str)b,t)o( )s ch)=s<,are.
• Step 3. &o co-p,te the 4al,e of the test stat)st)c we -,st +rst co-p,te* the
e7pecte* (,-ber for each of the s)7 core cells the o(es whose e(tr)es are
bol*face5:
o 1st row a(* 1st col,-(: E=(R×C)/n=41×52/100=21.32
o 1st row a(* 2(* col,-(: E=(R×C)/n=36×52/100=18.72
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 655/723
o 1st row a(* 3r* col,-(: E=(R×C)/n=23×52/100=11.96
o 2(* row a(* 1st col,-(: E=(R×C)/n=41×48/100=19.68
o 2(* row a(* 2(* col,-(: E=(R×C)/n=36×48/100=17.28
o
2(* row a(* 3r* col,-(: E=(R×C)/n=23×48/100=11.04 &able 11. Q%EE 4ers,s !PA %o(t)(ge(cy &ableQ )s ,p*ate* to &able 11.
QUp*ate* %EE 4ers,s !PA %o(t)(ge(cy &ableQ.
Saylor URL: http://www.saylor.org/books Saylor.org00
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 656/723
• Step 0. S)(ce 31.0 9.21 the *ec)s)o( )s to re>ect the (,ll hypothes)s.
See )g,re 11.. &he *ata pro4)*e s,?c)e(t e4)*e(ce at the 1G le4el
of s)g()+ca(ce to co(cl,*e that %EE score a(* !PA are (ot
)(*epe(*e(t: the e(tra(ce e7a- score has pre*)ct)4e power.
Figure 11.8'ote 11. >(ample 1>
IEJ &AIEABAJS
• %r)t)cal 4al,es of a ch)=s<,are *)str)b,t)o( w)th *egrees of free*o- dfare fo,(*
)( )g,re 12. Q%r)t)cal 'al,es of %h)=S<,are )str)b,t)o(sQ.
• A ch)=s<,are test ca( be ,se* to e4al,ate the hypothes)s that two ra(*o-
4ar)ables or factors are )(*epe(*e(t.
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 657/723
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 658/723
Factor
2evel 2evel ! 'o3 Total
a!'or 2
?eve" 1 20 10 R
?eve" 2 15 5 R
?eve" 3 10 20 R
o"&%n ,o'a" C C n
a )(* the col,-( totals the row totals a(* the gra(* total n of the table.
Saylor URL: http://www.saylor.org/books Saylor.org06
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 659/723
b )(* the e7pecte* (,-ber ( of obser4at)o(s for each cell base* o( the ass,-pt)o( that
the two factors are )(*epe(*e(t that )s >,st ,se the for-,la E=(R×C)/n5.
c )(* the 4al,e of the ch)=s<,are test stat)st)c χ2.
* )(* the (,-ber of *egrees of free*o- of the ch)=s<,are test stat)st)c.
A22!&CAT&1NS9 A ch)l* psycholog)st bel)e4es that ch)l*re( perfor- better o( tests whe( they are g)4e(
perce)4e* free*o- of cho)ce. &o test th)s bel)ef the psycholog)st carr)e* o,t a(
e7per)-e(t )( wh)ch 2;; th)r* gra*ers were ra(*o-ly ass)g(e* to two gro,ps 0 a(* -.
Each ch)l* was g)4e( the sa-e s)-ple log)c test. @owe4er )( gro,p - each ch)l* was
g)4e( the free*o- to choose a te7t booklet fro- -a(y w)th 4ar)o,s *raw)(gs o( the
co4ers. &he perfor-a(ce of each ch)l* was rate* as 'ery !oo* !oo* a(* a)r. &he
res,lts are s,--ar)Ve* )( the table pro4)*e*. &est at the 0G le4el of s)g()+ca(ce
whether there )s s,?c)e(t e4)*e(ce )( the *ata to s,pport the psycholog)stWs bel)ef.
)roup
A B
*eror%an!e
=er (ood 32 29
(ood 55 61
air 10 13
1; ( regar* to w)(e tast)(g co-pet)t)o(s -a(y e7perts cla)- that the +rst glass of w)(e
ser4e* sets a refere(ce taste a(* that a *)8ere(t refere(ce w)(e -ay alter the relat)4e
ra(k)(g of the other w)(es )( co-pet)t)o(. &o test th)s cla)- three w)(es 0 - a(* ,
were ser4e* at a w)(e tast)(g e4e(t. Each perso( was ser4e* a s)(gle glass of each
w)(e b,t )( *)8ere(t or*ers for *)8ere(t g,ests. At the close each perso( was aske* to
(a-e the best of the three. "(e h,(*re* se4e(ty=two people were at the e4e(t a(*
the)r top p)cks are g)4e( )( the table pro4)*e*. &est at the 1G le4el of s)g()+ca(ce
whether there )s s,?c)e(t e4)*e(ce )( the *ata to s,pport the cla)- that w)(e e7pertsW
prefere(ce )s *epe(*e(t o( the +rst ser4e* w)(e.
Top ick
A B C
ir+' ("a++
A 12 31 27
B 15 40 21
C 10 9 7
11 s be)(g left=ha(*e* here*)taryC &o a(swer th)s <,est)o( 20; a*,lts are ra(*o-ly
selecte* a(* the)r ha(*e*(ess a(* the)r pare(tsW ha(*e*(ess are (ote*. &he res,lts are
Saylor URL: http://www.saylor.org/books Saylor.org09
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 660/723
s,--ar)Ve* )( the table pro4)*e*. &est at the 1G le4el of s)g()+ca(ce whether there )s
s,?c)e(t e4)*e(ce )( the *ata to co(cl,*e that there )s a here*)tary ele-e(t )(
ha(*e*(ess.
Number of arents 2eft-4anded
0 !
Handedne++
?e' 8 10 12
igh' 178 21 21
12 So-e ge(et)c)sts cla)- that the ge(es that *eter-)(e left=ha(*e*(ess also go4er(
*e4elop-e(t of the la(g,age ce(ters of the bra)(. f th)s cla)- )s tr,e the( )t wo,l* be
reaso(able to e7pect that left=ha(*e* people te(* to ha4e stro(ger la(g,age ab)l)t)es. A
st,*y *es)g(e* to te7t th)s cla)- ra(*o-ly selecte* 6; st,*e(ts who took the
!ra*,ate Recor* E7a-)(at)o( !RE5. &he)r scores o( the la(g,age port)o( of the
e7a-)(at)o( were class)+e* )(to three categor)es: lo& average a(* !ig! a(* the)r
ha(*e*(ess was also (ote*. &he res,lts are g)4e( )( the table pro4)*e*. &est at the 0G
le4el of s)g()+ca(ce whether there )s s,?c)e(t e4)*e(ce )( the *ata to co(cl,*e that
left=ha(*e* people te(* to ha4e stro(ger la(g,age ab)l)t)es.
)'$ $nglish Scores
2o3 Average 4igh
Handedne++
?e' 18 40 22
igh' 201 360 166
13 t )s ge(erally bel)e4e* that ch)l*re( bro,ght ,p )( stable fa-)l)es te(* to *o well )(
school. &o 4er)fy s,ch a bel)ef a soc)al sc)e(t)st e7a-)(e* 29; ra(*o-ly selecte*
st,*e(tsW recor*s )( a p,bl)c h)gh school a(* (ote* each st,*e(tWs fa-)ly str,ct,re a(*
aca*e-)c stat,s fo,r years after e(ter)(g h)gh school. &he *ata were the( sorte* )(to a
2 3 co(t)(ge(cy table w)th two factors. actor 1 has two le4els: graduated a(* did not
graduate. actor 2 has three le4els: no parent one parent a(* t&o parents. &he res,lts
are g)4e( )( the table pro4)*e*. &est at the 1G le4el of s)g()+ca(ce whether there )s
s,?c)e(t e4)*e(ce )( the *ata to co(cl,*e that fa-)ly str,ct,re -atters )( school
perfor-a(ce of the st,*e(ts.
Academic Status
)raduated %id Not )raduate
a%i" No $aren' 18 31
One $aren' 101 44
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 661/723
Academic Status
)raduated %id Not )raduate
,/o $aren'+ 70 26
1 A large -)**le school a*-)()strator w)shes to ,se celebr)ty )([,e(ce to e(co,rage
st,*e(ts to -ake health)er cho)ces )( the school cafeter)a. &he cafeter)a )s s)t,ate* at
the ce(ter of a( ope( space. E4ery*ay at l,(ch t)-e st,*e(ts get the)r l,(ch a(* a
*r)(k )( three separate l)(es lea*)(g to three separate ser4)(g stat)o(s. As a(
e7per)-e(t the school a*-)()strator *)splaye* a poster of a pop,lar tee( pop star
*r)(k)(g -)lk at each of the three areas where *r)(ks are pro4)*e* e7cept the -)lk )(
the poster )s *)8ere(t at each locat)o(: o(e shows wh)te -)lk o(e shows strawberry=
[a4ore* p)(k -)lk a(* o(e shows chocolate -)lk. After the +rst *ay of the e7per)-e(t
the a*-)()strator (ote* the st,*e(tsW -)lk cho)ces separately for the three l)(es. &he
*ata are g)4e( )( the table pro4)*e*. &est at the 1G le4el of s)g()+ca(ce whether there)s s,?c)e(t e4)*e(ce )( the *ata to co(cl,*e that the posters ha* so-e )-pact o( the
st,*e(tsW *r)(k cho)ces.
Student Choice
'egular Stra3berr1 Chocolate
*o+'er hoi!e
eg&"ar 38 28 40
'ra/berr 18 51 24
ho!o"a'e 32 32 53
LAR!E A&A SE& EKER%SE
10 Large ata Set 6 recor*s the res,lt of a s,r4ey of 3;; ra(*o-ly selecte* a*,lts who go
to -o4)e theaters reg,larly. or each perso( the ge(*er a(* preferre* type of -o4)e
were recor*e*. &est at the 0G le4el of s)g()+ca(ce whether there )s s,?c)e(t e4)*e(ce
)( the *ata to co(cl,*e that the factors ge(*er a(* preferre* type of -o4)e are
*epe(*e(t.
http://www.6.7ls
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 662/723
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 663/723
%%.0 Chi-S"uare 1ne-Sample :oodness-o-Fit Tests
LEARNN! "#$E%&'E
1 &o ,(*ersta(* how to ,se a ch)=s<,are test to >,*ge whether a sa-ple +ts a
part)c,lar pop,lat)o( well.
+uppose we wish to determine if an ordinary-looking six-sided die is fair$ or balanced$ meaning that
every face has probability 1I; of landing on top when the die is tossed. (e could toss the die do'ens$
maybe hundreds$ of times and compare the actual number of times each face landed on top to the
expected number$ which would be 1I; of the total number of tosses. (e wouldn&t expect each
number to be exactly 1I; of the total$ but it should be close. To be specific$ suppose the die is
tossed n M ;5 times with the results summari'ed in Table 11.? 02ie 7ontingency Table0. /or ease of
reference we add a column of expected fre"uencies$ which in this simple example is simply a column
of 15s. The result is shown as Table 11.F 0<pdated 2ie 7ontingency Table0. ,n analogy with the
previous section we call this an GupdatedH table. % measure of how much the data deviate from what
we would expect to see if the die really were fair is the sum of the s"uares of the differences between
the observed fre"uency ) and the expected fre"uency ; in each row$ or$ standardi'ing by dividing
each s"uare by the expected number$ the sumΣ(O−E)2
E.,f we formulate the investigation as a
test of hypotheses$ the test is
H0: The die is fair
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 664/723
vs.Ha: The die isnot fair
Table 11.? 2ie 7ontingency Table
%ie alue Assumed %istribution Observed Fre<uenc1
1 1G6 9
2 1G6 15
3 1G6 9
4 1G6 8
5 1G6 6
6 1G6 13
Table 11.F <pdated 2ie 7ontingency Table
%ie alue Assumed %istribution Observed Fre<8 $:pected Fre<8
1 1G6 9 10
2 1G6 15 10
3 1G6 9 10
4 1G6 8 10
5 1G6 6 10
6 1G6 13 10
(e would re#ect the null hypothesis that the die is fair only if the numberΣ(O−E)2/Eis large$ so the test
is right-tailed. ,n this example the random variable Σ(O−E)2/Ehas the chi-s"uare distribution with five
degrees of freedom. ,f we had decided at the outset to test at the 15B level of significance$ the critical
value defining the re#ection region would be$ reading from /igure 1!.6 07ritical Dalues of 7hi-+"uare
2istributions0$χ2α=χ20.10=9.236$ so that the re#ection region would be the interval[9.236,∞). (hen we
compute the value of the standardi'ed test statistic using the numbers in the last two columns of Table
11.F 0<pdated 2ie 7ontingency Table0$ we obtain
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 665/723
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 666/723
Table 11.15 eneral 7ontingency Table
Factor !evels Assumed (istribution 1bserved Fre"uency
1 p1 1
2 p2 2
# p# #
Table 11.15 0eneral 7ontingency Table0 is updated to Table 11.11 0<pdated eneral 7ontingency
Table0 by adding the expected fre"uency for each value of B . To simplify the notation we drop indices
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 667/723
for the observed and expected fre"uencies and represent Table 11.11 0<pdated eneral 7ontingency
Table0 by Table 11.1! 0+implified <pdated eneral 7ontingency Table0.
Table 11.11 <pdated eneral 7ontingency Table
Factor !evels Assumed (istribution 1bserved Fre". +xpected Fre".
1 p1 1 (1
2 p2 2 (2
# p# # (#
Table 11.1! +implified <pdated eneral 7ontingency Table
Factor !evels Assumed (istribution 1bserved Fre". +xpected Fre".
1 p1 (
2 p2 (
# p# (
Eere is the test statistic for the general hypothesis based on Table 11.1! 0+implified <pdated eneral
7ontingency Table0$ together with the conditions that it follow a chi-s"uare distribution.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 668/723
EKAPLE 2
&able 11.13 QEth()c !ro,ps )( the %e(s,s JearQ shows the *)str)b,t)o( of 4ar)o,s
eth()c gro,ps )( the pop,lat)o( of a part)c,lar state base* o( a *ece(()al U.S.
ce(s,s. )4e years later a ra(*o- sa-ple of 20;; res)*e(ts of the state was
take( w)th the res,lts g)4e( )( &able 11.1 QSa-ple ata )4e Jears After the
%e(s,s JearQ alo(g w)th the probab)l)ty *)str)b,t)o( fro- the ce(s,s year5. &est
at the 1G le4el of s)g()+ca(ce whether there )s s,?c)e(t e4)*e(ce )( the sa-ple
to co(cl,*e that the *)str)b,t)o( of eth()c gro,ps )( th)s state +4e years after the
ce(s,s ha* cha(ge* fro- that )( the ce(s,s year.
&A#LE 11.13 E&@N% !R"UPS N &@E %ENSUS JEAR
+thnicity hite 'lack Amer.-&ndian =ispanic Asian 1thers
Proport)o( ;.3 ;.21 ;.;12 ;.;12 ;.;;6 ;.;;9
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 669/723
&A#LE 11.1 SAPLE A&A 'E JEARS A&ER &@E
%ENSUS JEAR
$thnicit1 Assumed %istribution Observed Fre<uenc1
hi'e 0.743 1732
:"a! 0.216 538
%eri!an-ndian 0.012 32
Hi+$ani! 0.012 42
+ian 0.008 133
O'her+ 0.009 23
Sol,t)o(:
Be test ,s)(g the cr)t)cal 4al,e approach.
• Step 1. &he hypotheses of )(terest )( th)s case ca( be e7presse* as
H0:The distribution of ethnic groups has not changed
vs.Ha: The distribution of ethnic groupshas changed
• Step 2. &he *)str)b,t)o( )s ch)=s<,are.
Step 3. &o co-p,te the 4al,e of the test stat)st)c we -,st +rst co-p,te the
e7pecte* (,-ber for each row of &able 11.1 QSa-ple ata )4e Jears After the
%e(s,s JearQ. S)(ce n 20;; ,s)(g the for-,la Ei=n× pia(* the 4al,es
of pi fro- e)ther &able 11.13 QEth()c !ro,ps )( the %e(s,s JearQ or &able 11.1
QSa-ple ata )4e Jears After the %e(s,s JearQ
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 670/723
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 671/723
*+, TA*+AA,
• &he chi-s"uare $oodness-o-)t test ca( be ,se* to e4al,ate the hypothes)s
that a sa-ple )s take( fro- a pop,lat)o( w)th a( ass,-e* spec)+c probab)l)ty
*)str)b,t)o(.
++/C&S+S
'AS&C
1 A *ata sa-ple )s sorte* )(to +4e categor)es w)th a( ass,-e* probab)l)ty *)str)b,t)o(.
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 672/723
Factor 2evels Assumed %istribution Observed Fre<uenc1
1 p1=0.1 10
2 p2=0.4 35
3 p3=0.4 45
4 p4=0.1 10
a )(* the s)Ve n of the sa-ple.
b )(* the e7pecte* (,-ber ( of obser4at)o(s for each le4el )f the sa-ple*
pop,lat)o( has a probab)l)ty *)str)b,t)o( as ass,-e* that )s >,st ,se the
for-,la Ei=n× pi5.
c )(* the ch)=s<,are test stat)st)c χ2.
* )(* the (,-ber of *egrees of free*o- of the ch)=s<,are test stat)st)c.
2 A *ata sa-ple )s sorte* )(to +4e categor)es w)th a( ass,-e* probab)l)ty *)str)b,t)o(.
Factor 2evels Assumed %istribution Observed Fre<uenc1
1 p1=0.3 23
2 p2=0.3 30
3 p3=0.2 19
4 p4=0.1 8
5 p5=0.1 10
a )(* the s)Ve n of the sa-ple.
b )(* the e7pecte* (,-ber ( of obser4at)o(s for each le4el )f the sa-ple*
pop,lat)o( has a probab)l)ty *)str)b,t)o( as ass,-e* that )s >,st ,se the
for-,la Ei=n× pi5.
c )(* the ch)=s<,are test stat)st)c χ2.
* )(* the (,-ber of *egrees of free*o- of the ch)=s<,are test stat)st)c.
A22!&CAT&1NS
3 Reta)lers of collect)ble postage sta-ps ofte( b,y the)r sta-ps )( large <,a(t)t)es by
we)ght at a,ct)o(s. &he pr)ces the reta)lers are w)ll)(g to pay *epe(* o( how ol* the
postage sta-ps are. a(y collect)ble postage sta-ps at a,ct)o(s are *escr)be* by the
proport)o(s of sta-ps )ss,e* at 4ar)o,s per)o*s )( the past. !e(erally the ol*er the
sta-ps the h)gher the 4al,e. At o(e part)c,lar a,ct)o( a lot of collect)ble sta-ps )s
a*4ert)se* to ha4e the age *)str)b,t)o( g)4e( )( the table pro4)*e*. A reta)l b,yer took a
sa-ple of 3 sta-ps fro- the lot a(* sorte* the- by age. &he res,lts are g)4e( )( the
Saylor URL: http://www.saylor.org/books Saylor.org2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 673/723
table pro4)*e*. &est at the 0G le4el of s)g()+ca(ce whether there )s s,?c)e(t e4)*e(ce
)( the *ata to co(cl,*e that the age *)str)b,t)o( of the lot )s *)8ere(t fro- what was
cla)-e* by the seller.
(ear Claimed %istribution Observed Fre<uenc1
:eore 1940 0.10 6
1940 'o 1959 0.25 15
1960 'o 1979 0.45 30
'er 1979 0.20 22
&he l)tter s)Ve of #e(gal t)gers )s typ)cally two or three c,bs b,t )t ca( 4ary betwee( o(e
a(* fo,r. #ase* o( lo(g=ter- obser4at)o(s the l)tter s)Ve of #e(gal t)gers )( the w)l* has
the *)str)b,t)o( g)4e( )( the table pro4)*e*. A Voolog)st bel)e4es that #e(gal t)gers )(
capt)4)ty te(* to ha4e *)8ere(t poss)bly s-aller5 l)tter s)Ves fro- those )( the w)l*. &o
4er)fy th)s bel)ef the Voolog)st searche* all *ata so,rces a(* fo,(* 31 l)tter s)Ve
recor*s of #e(gal t)gers )( capt)4)ty. &he res,lts are g)4e( )( the table pro4)*e*. &est at
the 0G le4el of s)g()+ca(ce whether there )s s,?c)e(t e4)*e(ce )( the *ata to co(cl,*e
that the *)str)b,t)o( of l)tter s)Ves )( capt)4)ty *)8ers fro- that )( the w)l*.
2itter Si=e 9ild 2itter %istribution Observed Fre<uenc1
1 0.11 41
2 0.69 243
3 0.18 274 0.02 5
0 A( o(l)(e shoe reta)ler sells -e(Ws shoes )( s)Ves 6 to 13. ( the past or*ers for the
*)8ere(t shoe s)Ves ha4e followe* the *)str)b,t)o( g)4e( )( the table pro4)*e*. &he
-a(age-e(t bel)e4es that rece(t -arket)(g e8orts -ay ha4e e7pa(*e* the)r
c,sto-er base a(* as a res,lt there -ay be a sh)ft )( the s)Ve *)str)b,t)o( for f,t,re
or*ers. &o ha4e a better ,(*ersta(*)(g of )ts f,t,re sales the shoe seller e7a-)(e*
1;; sales recor*s of rece(t or*ers a(* (ote* the s)Ves of the shoes or*ere*. &he
res,lts are g)4e( )( the table pro4)*e*. &est at the 1G le4el of s)g()+ca(ce whether
there )s s,?c)e(t e4)*e(ce )( the *ata to co(cl,*e that the shoe s)Ve *)str)b,t)o( of
f,t,re sales w)ll *)8er fro- the h)stor)c o(e.
Shoe Si=e ast Si=e %istribution 'ecent Si=e Fre<uenc1
8.0 0.03 25
8.5 0.06 43
Saylor URL: http://www.saylor.org/books Saylor.org3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 674/723
Shoe Si=e ast Si=e %istribution 'ecent Si=e Fre<uenc1
9.0 0.09 88
9.5 0.19 221
10.0 0.23 27210.5 0.14 150
11.0 0.10 107
11.5 0.06 51
12.0 0.05 37
12.5 0.03 35
13.0 0.02 11
A( o(l)(e shoe reta)ler sells wo-e(Ws shoes )( s)Ves 0 to 1;. ( the past or*ers for the
*)8ere(t shoe s)Ves ha4e followe* the *)str)b,t)o( g)4e( )( the table pro4)*e*. &he
-a(age-e(t bel)e4es that rece(t -arket)(g e8orts -ay ha4e e7pa(*e* the)r c,sto-er
base a(* as a res,lt there -ay be a sh)ft )( the s)Ve *)str)b,t)o( for f,t,re or*ers. &o
ha4e a better ,(*ersta(*)(g of )ts f,t,re sales the shoe seller e7a-)(e* 11 sales
recor*s of rece(t or*ers a(* (ote* the s)Ves of the shoes or*ere*. &he res,lts are g)4e(
)( the table pro4)*e*. &est at the 1G le4el of s)g()+ca(ce whether there )s s,?c)e(t
e4)*e(ce )( the *ata to co(cl,*e that the shoe s)Ve *)str)b,t)o( of f,t,re sales w)ll *)8er
fro- the h)stor)c o(e.
Shoe Si=e ast Si=e %istribution 'ecent Si=e Fre<uenc1
5.0 0.02 20
5.5 0.03 23
6.0 0.07 88
6.5 0.08 90
7.0 0.20 222
7.5 0.20 258
8.0 0.15 177
8.5 0.11 121
9.0 0.08 91
9.5 0.04 53
10.0 0.02 31
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 675/723
A chess ope()(g )s a se<,e(ce of -o4es at the beg)(()(g of a chess ga-e. &here are
-a(y well=st,*)e* (a-e* ope()(gs )( chess l)terat,re. re(ch efe(se )s o(e of the
-ost pop,lar ope()(gs for black altho,gh )t )s co(s)*ere* a relat)4ely weak ope()(g
s)(ce )t g)4es black probab)l)ty ;.3 of w)(()(g probab)l)ty ;.;0 of los)(g a(*
probab)l)ty ;.201 of *raw)(g. A chess -aster bel)e4es that he has *)sco4ere* a (ew4ar)at)o( of re(ch efe(se that -ay alter the probab)l)ty *)str)b,t)o( of the o,tco-e of
the ga-e. ( h)s -a(y (ter(et chess ga-es )( the last two years he was able to apply
the (ew 4ar)at)o( )( ga-es. &he w)(s losses a(* *raws )( the ga-es are g)4e( )(
the table pro4)*e*. &est at the 0G le4el of s)g()+ca(ce whether there )s s,?c)e(t
e4)*e(ce )( the *ata to co(cl,*e that the (ewly *)sco4ere* 4ar)at)o( of re(ch efe(se
alters the probab)l)ty *)str)b,t)o( of the res,lt of the ga-e.
'esult for 6lack robabilit1 %istribution Ne3 ariation 9ins
in 0.344 31
?o++ 0.405 25
#ra/ 0.251 21
6 &he epart-e(t of Parks a(* B)l*l)fe stocks a large lake w)th +sh e4ery s)7 years. t )s
*eter-)(e* that a healthy *)4ers)ty of +sh )( the lake sho,l* co(s)st of 1;G large-o,th
bass 10G s-all-o,th bass 1;G str)pe* bass 1;G tro,t a(* 2;G cat+sh. &herefore
each t)-e the lake )s stocke* the +sh pop,lat)o( )( the lake )s restore* to -a)(ta)( that
part)c,lar *)str)b,t)o(. E4ery three years the *epart-e(t co(*,cts a st,*y to see
whether the *)str)b,t)o( of the +sh )( the lake has sh)fte* away fro- the target
proport)o(s. ( o(e part)c,lar year a research gro,p fro- the *epart-e(t obser4e* a
sa-ple of 292 +sh fro- the lake w)th the res,lts g)4e( )( the table pro4)*e*. &est at the
0G le4el of s)g()+ca(ce whether there )s s,?c)e(t e4)*e(ce )( the *ata to co(cl,*e
that the +sh pop,lat)o( *)str)b,t)o( has sh)fte* s)(ce the last stock)(g.
Fish Tar$et (istribution Fish in Sample
Large-o,th #ass ;.1; 1
S-all-o,th #ass ;.10 9
Str)pe* #ass ;.1; 21
&ro,t ;.1; 22
%at+sh ;.2; 0
"ther ;.30 111
Saylor URL: http://www.saylor.org/books Saylor.org0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 676/723
!A/:+ (ATA S+T ++/ C&S+
9 Large ata Set recor*s the res,lt of 0;; tosses of s)7=s)*e* *)e. &est at the 1;G le4el
of s)g()+ca(ce whether there )s s,?c)e(t e4)*e(ce )( the *ata to co(cl,*e that the *)e
)s (ot fa)r or bala(ce*5 that )s that the probab)l)ty *)str)b,t)o( *)8ers fro-
probab)l)ty 1/ for each of the s)7 faces o( the *)e.
http://www..7ls
%%.3 F-tests or +"uality o Two <ariances
LEARNN! "#$E%&'ES
1 &o ,(*ersta(* what F =*)str)b,t)o(s are.
2 &o ,(*ersta(* how to ,se a( F =test to >,*ge whether two pop,lat)o( 4ar)a(ces
are e<,al.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 677/723
F-(istributions
%nother important and useful family of distributions in statistics is the family of ! -distributions. ach
member of the ! -distribution family is specified by a pair of parameters called degrees of freedom and
denoteddf1anddf2./igure 11.@ 0any 0 shows several ! -distributions for different pairs of degrees of
freedom. %n 7 random variable is a random variable that assumes only positive values and follows
an ! -distribution.
!igure "".1 2any ! :(istributions
The parameterdf1is often referred to as the numerator degrees of freedom and the parameterdf2as
the denominator degrees of freedom. ,t is important to keep in mind that they are not interchangeable.
/or example$ the ! -distribution with degrees of freedomdf1=3anddf2=8is a different distribution from
the ! -distribution with degrees of freedomdf1=8anddf2=3.
Saylor URL: http://www.saylor.org/books Saylor.org
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 678/723
e+()t)o(
The value of the ! random variable ! with degrees of freedom df1 and df!that cuts off a right tail of
area c is denoted ! c and is called a critical value. %ee !igure "".3.
!igure "".3 ! c >llustrated
Tables containing the values of ! c are given in 7hapter 11 07hi-+"uare Tests and 0. ach of the tables
is for a fixed collection of values of c$ either 5.F55$ 5.F85$ 5.F@8$ 5.FF5$ and 5.FF8 *yielding what are
called GlowerH critical values)$ or 5.558$ 5.515$ 5.5!8$ 5.585$ and 5.155 *yielding what are called
GupperH critical values). ,n each table critical values are given for various pairs *df1$df!). (e illustrate
the use of the tables with several examples.
+A>2!+ 3
S,ppose F )s a( F ra(*o- 4ar)able w)th *egrees of free*o- *f10 a(**f2. Use
the tables to +(*
a. F ;.1;
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 679/723
b. F ;.90
Sol,t)o(:
a &he col,-( hea*)(gs of all the tables co(ta)( *f10. Look for the table for
wh)ch ;.1; )s o(e of the e(tr)es o( the e7tre-e left a table of ,pper cr)t)cal 4al,es5
a(* that has a row hea*)(g *f2 )( the left -arg)( of the table. A port)o( of the
rele4a(t table )s pro4)*e*. &he e(try )( the )(tersect)o( of the col,-( w)th
hea*)(g *f10 a(* the row w)th the hea*)(gs ;.1; a(* *f2 wh)ch )s sha*e* )( the
table pro4)*e* )s the a(swer ;.1;.;0.
F Tail Area
df
! > > > + > > > df!
0.005 4 I I I I I I I I I 22.5 I I I
0.01 4 I I I I I I I I I 15.5 I I I
0.025 4 I I I I I I I I I 9.36 I I I
0.05 4 I I I I I I I I I 6.26 I I I
0.10 4 I I I I I I I I I 4.05 I I I
b Look for the table for wh)ch ;.90 )s o(e of the e(tr)es o( the e7tre-e
left a table of lower cr)t)cal 4al,es5 a(* that has a row hea*)(g *f2 )( theleft -arg)( of the table. A port)o( of the rele4a(t table )s pro4)*e*. &he e(try
)( the )(tersect)o( of the col,-( w)th hea*)(g *f10 a(* the row w)th the
hea*)(gs ;.90 a(* *f2 wh)ch )s sha*e* )( the table pro4)*e* )s the
a(swer ;.90;.19.
F Tail Area
d%
% 0 8 d0
;.9; m m m m m m m m m ;.26 m m m
;.90 m m m m m m m m m ;.19 m m m
;.90 m m m m m m m m m ;.1 m m m
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 680/723
F Tail Area
d%
% 0 8 d0
;.99 m m m m m m m m m ;.;9 m m m
;.990 m m m m m m m m m ;.; m m m
+A>2!+ 7
S,ppose )s a( F ra(*o- 4ar)able w)th *egrees of
free*o- *f12 a(**f22;. Let n;.;0. Use the tables to +(*
a. n
b. n2
c. 1Xn
*. 1Xn2
Sol,t)o(:
a. &he col,-( hea*)(gs of all the tables co(ta)( *f12. Look for the table for
wh)ch n;.;0 )s o(e of the e(tr)es o( the e7tre-e left a table of ,pper cr)t)cal
4al,es5 a(* that has a row hea*)(g *f22; )( the left -arg)( of the table. A port)o(of the rele4a(t table )s pro4)*e*. &he sha*e* e(try )( the )(tersect)o( of the col,-(
w)th hea*)(g *f12 a(* the row w)th the hea*)(gs ;.;0 a(* *f22; )s the
a(swer ;.;03.9.
F Tail Area
df
! > > >
df!
0.005 20 I I I 6.99 I I I
0.01 20 I I I 5.85 I I I
0.025 20 I I I 4.46 I I I
0.05 20 I I I 3.49 I I I
0.10 20 I I I 2.59 I I I
b. Look for the table for wh)ch n2;.;20 )s o(e of the e(tr)es o( the e7tre-e
left a table of ,pper cr)t)cal 4al,es5 a(* that has a row hea*)(g *f22; )( the left
Saylor URL: http://www.saylor.org/books Saylor.org6;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 681/723
-arg)( of the table. A port)o( of the rele4a(t table )s pro4)*e*. &he sha*e* e(try )(
the )(tersect)o( of the col,-( w)th hea*)(g *f12 a(* the row w)th the hea*)(gs
;.;20 a(* *f22; )s the a(swer ;.;20..
F Tail Area
df
! > > >
df!
0.005 20 I I I 6.99 I I I
0.01 20 I I I 5.85 I I I
0.025 20 I I I 4.46 I I I
0.05 20 I I I 3.49 I I I
0.10 20 I I I 2.59 I I I
3 Look for the table for wh)ch 1Xn;.90 )s o(e of the e(tr)es o( the
e7tre-e left a table of lower cr)t)cal 4al,es5 a(* that has a row
hea*)(g *f22; )( the left -arg)( of the table. A port)o( of the
rele4a(t table )s pro4)*e*. &he sha*e* e(try )( the )(tersect)o( of the
col,-( w)th hea*)(g *f12 a(* the row w)th the hea*)(gs ;.90
a(* *f22; )s the a(swer ;.90;.;0.
F Tail Area
df
! > > > df!
0.90 20 I I I 0.11 I I I
0.95 20 I I I 0.05 I I I
0.975 20 I I I 0.03 I I I
0.99 20 I I I 0.01 I I I
0.995 20 I I I 0.01 I I I
* Look for the table for wh)ch 1Xn2;.90 )s o(e of the e(tr)es o( the
e7tre-e left a table of lower cr)t)cal 4al,es5 a(* that has a row
hea*)(g *f22; )( the left -arg)( of the table. A port)o( of the rele4a(t table
)s pro4)*e*. &he sha*e* e(try )( the )(tersect)o( of the col,-( w)th
Saylor URL: http://www.saylor.org/books Saylor.org61
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 682/723
hea*)(g *f12 a(* the row w)th the hea*)(gs ;.90 a(* *f22; )s the
a(swer;.90;.;3.
F Tail Area
df
! > > >
df!
0.90 20 I I I 0.11 I I I
0.95 20 I I I 0.05 I I I
0.975 20 I I I 0.03 I I I
0.99 20 I I I 0.01 I I I
0.995 20 I I I 0.01 I I I
% fact that sometimes allows us to find a critical value from a table that we could not read otherwise
is:
,f /u*r$s) denotes the value of the ! -distribution with degrees of freedom df1Mr and df!Ms that cuts off a
right tail of area u$ then
ck511Xck5
+A>2!+ 8
Use the tables to +(*
a. F ;.;1 for a( F ra(*o- 4ar)able w)th *f113 a(* *f26
b. F ;.90 for a( F ra(*o- 4ar)able w)th *f1; a(* *f21;
Sol,t)o(:
a. &here )s (o table w)th *f113 b,t there )s o(e w)th *f16.&h,s we ,se the
fact that
;.;113651;.996135
Saylor URL: http://www.saylor.org/books Saylor.org62
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 683/723
Us)(g the rele4a(t table we +(* that ;.996135;.16
he(ce;.;11365;.16X10.00.
b. &here )s (o table w)th *f1; b,t there )s o(e w)th *f11;.&h,s we ,se the
fact that
;.90;1;51;.;201;;5
Us)(g the rele4a(t table we +(* that ;.;201;;53.31
he(ce;.90;1;53.31X1;.3;2.
F-Tests or +"uality o Two <ariances
,n 7hapter F 0Two-+ample roblems0 we saw how to test hypotheses about the difference between
two population means 1 and !. ,n some practical situations the difference between the population
standard deviations 1 and! is also of interest. +tandard deviation measures the variability of a
random variable. /or example$ if the random variable measures the si'e of a machined part in a
manufacturing process$ the si'e of standard deviation is one indicator of product "uality. % smaller
standard deviation among items produced in the manufacturing process is desirable since it
indicates consistency in product "uality.
/or theoretical reasons it is easier to compare the s"uares of the population standard deviations$ the
population variances 1! and !!. This is not a problem$ since 1M! precisely
when 1!M!!$ 1V! precisely when1!V!!$ and 1X! precisely when 1!X!!.
The null hypothesis always has the form E5:1!M!!. The three forms of the alternative hypothesis$
with the terminology for each case$ are:
Form o Ha Terminolo$y
@a:1222 R)ght=ta)le*
@a:12_22 Left=ta)le*
Saylor URL: http://www.saylor.org/books Saylor.org63
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 684/723
Form o Ha Terminolo$y
@a:12q22 &wo=ta)le*
Rust as when we test hypotheses concerning two population means$ we take a random sample from
each population$ of si'es n1 and n!$ and compute the sample standard deviations s1 and s!. ,n this
context the samples are always independent. The populations themselves must be normally
distributed.
&est Stat)st)c for @ypothes)s &ests %o(cer()(g the)8ere(ce#etwee( &wo Pop,lat)o( 'ar)a(ces
s12s22
,f the two populations are normally distributed and if E5:1!M!! is true then under independent
sampling ! approximately follows an ! -distribution with degrees of freedom df1Mn1Z1 and df!Mn!Z1.
% test based on the test statistic / is called an ! -test.
% most important point is that while the re#ection region for a right-tailed test is exactly as in every
other situation that we have encountered$ because of the asymmetry in the ! -distribution the critical
value for a left-tailed test and the lower critical value for a two-tailed test have the special forms
shown in the following table:
Terminolo$y Alternative =ypothesis /eGection /e$ion
R)ght=ta)le* @a:1222 n
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 685/723
Terminolo$y Alternative =ypothesis /eGection /e$ion
Left=ta)le* @a:12_22 ^1Xn
&wo=ta)le* @a:12q22 ^1Xn2 or n2
/igure 11.F 0Ce#ection Cegions: *a) Cight-Tailed *b) eft-Tailed *c) Two-Tailed0 illustrates these
re#ection regions.
!igure "".4 -ejection -egions8 DaE -ight:TailedN DbE 'eft:TailedN DcE Two:Tailed
The test is performed using the usual five-step procedure described at the end of +ection ?.1 0The
lements of Eypothesis Testing0 in 7hapter ? 0Testing Eypotheses0.
EKAPLE
Saylor URL: http://www.saylor.org/books Saylor.org60
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 686/723
"(e of the <,al)ty -eas,res of bloo* gl,cose -eter str)ps )s the co(s)ste(cy of
the test res,lts o( the sa-e sa-ple of bloo*. &he co(s)ste(cy )s -eas,re* by the
4ar)a(ce of the rea*)(gs )( repeate* test)(g. S,ppose two types of str)ps 0 a(* -
are co-pare* for the)r respect)4e co(s)ste(c)es. Be arb)trar)ly label the
pop,lat)o( of &ype 0str)ps Pop,lat)o( 1 a(* the pop,lat)o( of &ype - str)psPop,lat)o( 2. S,ppose 10 &ype 0 str)ps were teste* w)th bloo* *rops fro- a well=
shake( 4)al a(* 2; &ype - str)ps were teste* w)th the bloo* fro- the sa-e 4)al.
&he res,lts are s,--ar)Ve* )( &able 11.1 Q&wo &ypes of &est Str)psQ. Ass,-e the
gl,cose rea*)(gs ,s)(g &ype 0 str)ps follow a (or-al *)str)b,t)o( w)th
4ar)a(ce σ21a(* those ,s)(g &ype - str)ps follow a (or-al *)str)b,t)o( w)th
4ar)a(ce w)th σ22. &est at the 1;G le4el of s)g()+ca(ce whether the *ata pro4)*e
s,?c)e(t e4)*e(ce to co(cl,*e that the co(s)ste(c)es of the two types of str)ps
are *)8ere(t.
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 687/723
Saylor URL: http://www.saylor.org/books Saylor.org6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 688/723
Saylor URL: http://www.saylor.org/books Saylor.org66
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 689/723
IEJ &AIEABAJS
• %r)t)cal 4al,es of a( F =*)str)b,t)o( w)th *egrees of free*o- df1a(* df2are fo,(* )(
tables )( %hapter 12 QAppe(*)7Q.
• A( F =test ca( be ,se* to e4al,ate the hypothes)s of two )*e(t)cal (or-al
pop,lat)o( 4ar)a(ces.
Saylor URL: http://www.saylor.org/books Saylor.org69
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 690/723
Saylor URL: http://www.saylor.org/books Saylor.org9;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 691/723
Saylor URL: http://www.saylor.org/books Saylor.org91
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 692/723
Saylor URL: http://www.saylor.org/books Saylor.org92
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 693/723
A22!&CAT&1NS10 $apa(ese st,rgeo( )s a s,bspec)es of the st,rgeo( fa-)ly )(*)ge(o,s to $apa( a(* the
Northwest Pac)+c. ( a part)c,lar +sh hatchery (ewly hatche* baby $apa(ese st,rgeo(
are kept )( ta(ks for se4eral weeks before be)(g tra(sferre* to larger po(*s. )ssol4e*
o7yge( )( ta(k water )s 4ery t)ghtly -o()tore* by a( electro()c syste- a(* r)goro,sly
-a)(ta)(e* at a target le4el of .0 -)ll)gra-s per l)ter -g/l5. &he +sh hatchery looks to
Saylor URL: http://www.saylor.org/books Saylor.org93
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 694/723
,pgra*e the)r water -o()tor)(g syste-s for t)ghter co(trol of *)ssol4e* o7yge(. A (ew
syste- )s e4al,ate* aga)(st the ol* o(e c,rre(tly be)(g ,se* )( ter-s of the 4ar)a(ce )(
-eas,re* *)ssol4e* o7yge(. &h)rty=o(e water sa-ples fro- a ta(k operate* w)th the
(ew syste- were collecte* a(* 1 water sa-ples fro- a ta(k operate* w)th the ol*
syste- were collecte* all *,r)(g the co,rse of a *ay. &he sa-ples y)el* the follow)(g)(for-at)o(:
New Sample 1 :n1=31s21=0.0121
Old Sample 2:n2=16s22=0.0319
&est at the 1;G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t e4)*e(ce to
co(cl,*e that the (ew syste- w)ll pro4)*e a t)ghter co(trol of *)ssol4e* o7yge( )( the
ta(ks.
1 &he r)sk of )(4est)(g )( a stock )s -eas,re* by the 4olat)l)ty or the 4ar)a(ce )( cha(ges
)( the pr)ce of that stock. ,t,al f,(*s are baskets of stocks a(* o8er ge(erally lower
r)sk to )(4estors. )8ere(t -,t,al f,(*s ha4e *)8ere(t foc,ses a(* o8er *)8ere(t le4els
of r)sk. @)ppolyta )s *ec)*)(g betwee( two -,t,al f,(*s 0 a(* - w)th s)-)lar e7pecte*
ret,r(s. &o -ake a +(al *ec)s)o( she e7a-)(e* the a((,al ret,r(s of the two f,(*s
*,r)(g the last te( years a(* obta)(e* the follow)(g )(for-at)o(:
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 695/723
&est at the 1;G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t e4)*e(ce to
co(cl,*e that the (ew playl)st has e7pa(*e* the ra(ge of l)ste(er ages.
19 A laptop co-p,ter -aker ,ses battery packs s,ppl)e* by two co-pa()es 0a(* -.
Bh)le both bra(*s ha4e the sa-e a4erage battery l)fe betwee( charges L#%5 the
co-p,ter -aker see-s to rece)4e -ore co-pla)(ts abo,t shorter L#% tha( e7pecte*
for battery packs s,ppl)e* by co-pa(y -. &he co-p,ter -aker s,spects that th)s
co,l* be ca,se* by h)gher 4ar)a(ce )( L#% for #ra(* -. &o check that te( (ew
battery packs fro- each bra(* are selecte* )(stalle* o( the sa-e -o*els of laptops
a(* the laptops are allowe* to r,( ,(t)l the battery packs are co-pletely *)scharge*.
&he follow)(g are the obser4e* L#%s )( ho,rs.
Saylor URL: http://www.saylor.org/books Saylor.org90
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 696/723
!A/:+ (ATA S+T ++/C &S+S
21 Large ata Sets 1A a(* 1# recor* SA& scores for 19 -ale a(* 061 fe-ale st,*e(ts.
&est at the 1G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t e4)*e(ce to
co(cl,*e that the 4ar)a(ces of scores of -ale a(* fe-ale st,*e(ts *)8er.
http://www.1A.7ls
http://www.1#.7ls
22 Large ata Sets A a(* # recor* the s,r4)4al t)-es of 1; laboratory -)ce w)th
thy-)c le,ke-)a. &est at the 1;G le4el of s)g()+ca(ce whether the *ata pro4)*e
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 697/723
s,?c)e(t e4)*e(ce to co(cl,*e that the 4ar)a(ces of s,r4)4al t)-es of -ale -)ce a(*
fe-ale -)ce *)8er.
http://www..7ls
http://www.A.7lshttp://www.#.7ls
Saylor URL: http://www.saylor.org/books Saylor.org9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 698/723
%%.7 F-Tests in 1ne-ay AN1<A
LEARNN! "#$E%&'E
1 &o ,(*ersta(* how to ,se a( F =test to >,*ge whether se4eral pop,lat)o( -ea(s
are all e<,al.
,n 7hapter F 0Two-+ample roblems0 we saw how to compare two population meansµ1andµ2.,n
this section we will learn to compare three or more population means at the same time$ which is
often of interest in practical applications. /or example$ an administrator at a university may be
interested in knowing whether student grade point averages are the same for different ma#ors. ,n
another example$ an oncologist may be interested in knowing whether patients with the same type of
cancer have the same average survival times under several different competing cancer treatments.
,n general$ suppose there are J normal populations with possibly different means$µ1,µ2,…,µK$ but
all with the same varianceσ2.The study "uestion is whether all the J population means are the same. (e formulate this "uestion as the test of hypotheses
H0: µ1=µ2=Y Y Y =µK
vs.Ha: not allK population means are equal
Saylor URL: http://www.saylor.org/books Saylor.org96
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 699/723
To perform the test J independent random samples are taken from the J normal populations.
The J sample means$ the J sample variances$ and the J sample si'es are summari'ed in the table:
opulation Sample Si=e Sample #ean Sample ariance
1 n1 x
−1
s21
2 n2 x−2 s22
K nK x−K s2K
2efine the following "uantities:
Saylor URL: http://www.saylor.org/books Saylor.org99
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 700/723
EKAPLE 6
&he a4erage of gra*e po)(t a4erages !PAs5 of college co,rses )( a spec)+c -a>or
)s a -eas,re of *)?c,lty of the -a>or. A( e*,cator w)shes to co(*,ct a st,*y to
+(* o,t whether the *)?c,lty le4els of *)8ere(t -a>ors are the sa-e. or s,ch a
st,*y a ra(*o- sa-ple of -a>or gra*e po)(t a4erages !PA5 of 11 gra*,at)(g
se()ors at a large ,()4ers)ty )s selecte* for each of the fo,r -a>ors -athe-at)cs
E(gl)sh e*,cat)o( a(* b)ology. &he *ata are g)4e( )( &able 11.1 Q)?c,lty
Le4els of %ollege a>orsQ. &est at the 0G le4el of s)g()+ca(ce whether the *ata
co(ta)( s,?c)e(t e4)*e(ce to co(cl,*e that there are *)8ere(ces a-o(g the
a4erage -a>or !PAs of these fo,r -a>ors.
&A#LE 11.1 %UL&J LE'ELS " %"LLE!E A$"RS
>athematics +n$lish +ducation 'iolo$y
2.09 3. .;; 2.6
3.13 3.19 3.09 3.01
2.9 3.10 2.6; 2.0
2.0; 3.6 2.39 3.1
2.03 3.;3 3. 2.9
Saylor URL: http://www.saylor.org/books Saylor.org;;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 701/723
>athematics +n$lish +ducation 'iolo$y
3.29 2.1 3.09 2.32
2.03 3.2; 3. 2.06
3.1 3.3; 3. 3.21
2.; 3.0 3.13 3.23
3.66 3.20 3.;; 3.0
2. .;; 3. 3.22
Saylor URL: http://www.saylor.org/books Saylor.org;1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 702/723
Saylor URL: http://www.saylor.org/books Saylor.org;2
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 703/723
EKAPLE 9
A research laboratory *e4elope* two treat-e(ts wh)ch are bel)e4e* to ha4e the
pote(t)al of prolo(g)(g the s,r4)4al t)-es of pat)e(ts w)th a( ac,te for- of thy-)cle,ke-)a. &o e4al,ate the pote(t)al treat-e(t e8ects 33 laboratory -)ce w)th
thy-)c le,ke-)a were ra(*o-ly *)4)*e* )(to three gro,ps. "(e gro,p rece)4e*
&reat-e(t 1 o(e rece)4e* &reat-e(t 2 a(* the th)r* was obser4e* as a co(trol
gro,p. &he s,r4)4al t)-es of these -)ce are g)4e( )( &able 11.16 Q)ce S,r4)4al
&)-es )( aysQ. &est at the 1G le4el of s)g()+ca(ce whether these *ata pro4)*e
Saylor URL: http://www.saylor.org/books Saylor.org;3
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 704/723
s,?c)e(t e4)*e(ce to co(+r- the bel)ef that at least o(e of the two treat-e(ts
a8ects the a4erage s,r4)4al t)-e of -)ce w)th thy-)c le,ke-)a.
&A#LE 11.16 %E SUR' 'AL &ES N AJS
Treatment % Treatment 0 Control
1 0 61
2 3 9
0 2 9 3
6; 0 6 1
; 3 61 0
0 9 2 6
3 1
6 1 6
91
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 705/723
Saylor URL: http://www.saylor.org/books Saylor.org;0
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 706/723
*+, TA*+AA,
• A( F =test ca( be ,se* to e4al,ate the hypothes)s that the -ea(s of se4eral
(or-al pop,lat)o(s all w)th the sa-e sta(*ar* *e4)at)o( are )*e(t)cal.
++/C&S+S
'AS&C
1 &he follow)(g three ra(*o- sa-ples are take( fro- three (or-al pop,lat)o(s w)th
respect)4e -ea(s µ1 µ2 a(* µ3 a(* the sa-e 4ar)a(ce σ2.
Sample Sample ! Sample "
2 3 0
2 5 1
3 7 2
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 707/723
Sample Sample ! Sample "
5 1
3
a )(* the co-b)(e* sa-ple s)Ve n.
b )(* the co-b)(e* sa-ple -ea( x−.
c )(* the sa-ple -ea( for each of the three sa-ples.
* )(* the sa-ple 4ar)a(ce for each of the three sa-ples.
e )(* MST.
f )(* MSE.
$ )(* F=MST/MSE.
2 &he follow)(g three ra(*o- sa-ples are take( fro- three (or-al pop,lat)o(s w)th
respect)4e -ea(s µ1 µ2 a(* µ3 a(* a sa-e 4ar)a(ce σ2.
Sample Sample ! Sample "
0.0 1.3 0.2
0.1 1.5 0.2
0.2 1.7 0.3
0.1 0.5
0.0a )(* the co-b)(e* sa-ple s)Ve n.
b )(* the co-b)(e* sa-ple -ea(x−.
c )(* the sa-ple -ea( for each of the three sa-ples.
* )(* the sa-ple 4ar)a(ce for each of the three sa-ples.
e )(* MST.
f )(* MSE.
g )(* F=MST/MSE.
3 Refer to E7erc)se 1.
a )(* the (,-ber of pop,lat)o(s ,(*er co(s)*erat)o( @ .
b )(* the *egrees of free*o- df1=K−1a(* df2=n−K.
c or α=0.05 +(* Fα w)th the *egrees of free*o- co-p,te* abo4e.
* At α=0.05 test hypotheses
Saylor URL: http://www.saylor.org/books Saylor.org;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 708/723
A22!&CAT&1NS
0 &he oVart e8ect refers to a boost of a4erage perfor-a(ce o( tests for ele-e(tary
school st,*e(ts )f the st,*e(ts l)ste( to oVartWs cha-ber -,s)c for a per)o* of t)-e
)--e*)ately before the test. ( or*er to atte-pt to test whether the oVart e8ect
act,ally e7)sts a( ele-e(tary school teacher co(*,cte* a( e7per)-e(t by *)4)*)(g her
th)r*=gra*e class of 10 st,*e(ts )(to three gro,ps of 0. &he +rst gro,p was g)4e( a(
e(*=of=gra*e test w)tho,t -,s)cD the seco(* gro,p l)ste(e* to oVartWs cha-ber -,s)c
for 1; -)(,tesD a(* the th)r* gro,ps l)ste(e* to oVartWs cha-ber -,s)c for 2; -)(,tes
before the test. &he scores of the 10 st,*e(ts are g)4e( below:
)roup )roup ! )roup "
80 79 73
63 73 82
74 74 79
71 77 82
70 81 84Us)(g the AN"'A =test at α=0.10 )s there s,?c)e(t e4)*e(ce )( the *ata to s,ggest that
the oVart e8ect e7)stsC
&he oVart e8ect refers to a boost of a4erage perfor-a(ce o( tests for ele-e(tary
school st,*e(ts )f the st,*e(ts l)ste( to oVartWs cha-ber -,s)c for a per)o* of t)-e
)--e*)ately before the test. a(y e*,cators bel)e4e that s,ch a( e8ect )s (ot
Saylor URL: http://www.saylor.org/books Saylor.org;6
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 709/723
(ecessar)ly *,e to oVartWs -,s)c per se b,t rather a rela7at)o( per)o* before the test.
&o s,pport th)s bel)ef a( ele-e(tary school teacher co(*,cte* a( e7per)-e(t by
*)4)*)(g her th)r*=gra*e class of 10 st,*e(ts )(to three gro,ps of 0. St,*e(ts )( the +rst
gro,p were aske* to g)4e the-sel4es a self=a*-)()stere* fac)al -assageD st,*e(ts )(
the seco(* gro,p l)ste(e* to oVartWs cha-ber -,s)c for 10 -)(,tesD st,*e(ts )( theth)r* gro,p l)ste(e* to Sch,bertWs cha-ber -,s)c for 10 -)(,tes before the test. &he
scores of the 10 st,*e(ts are g)4e( below:
)roup )roup ! )roup "
79 82 80
81 84 81
80 86 71
89 91 90
86 82 86
&est ,s)(g the AN"'A F =test at the 1;G le4el of s)g()+ca(ce whether the *ata pro4)*e
s,?c)e(t e4)*e(ce to co(cl,*e that a(y of the three rela7at)o( -etho* *oes better tha(
the others.
Prec)s)o( we)gh)(g *e4)ces are se(s)t)4e to e(4)ro(-e(tal co(*)t)o(s. &e-perat,re a(*
h,-)*)ty )( a laboratory roo- where s,ch a *e4)ce )s )(stalle* are t)ghtly co(trolle* to
e(s,re h)gh prec)s)o( )( we)gh)(g. A (ewly *es)g(e* we)gh)(g *e4)ce )s cla)-e* to be
-ore rob,st aga)(st s-all 4ar)at)o(s of te-perat,re a(* h,-)*)ty. &o 4er)fy s,ch a
cla)- a laboratory tests the (ew *e4)ce ,(*er fo,r sett)(gs of te-perat,re=h,-)*)tyco(*)t)o(s. )rst two le4els of !ig! a(* lo& te-perat,re a(* two le4els
of !ig! a(* lo& h,-)*)ty are )*e(t)+e*. Let ) sta(* for te-perat,re a(* for h,-)*)ty.
&he fo,r e7per)-e(tal sett)(gs are *e+(e* a(* (ote* as ) 5: h)gh h)gh5 h)gh low5
low h)gh5 a(* low low5. A pre=cal)brate* sta(*ar* we)ght of 1 kg was we)ghe* by the
(ew *e4)ce fo,r t)-es )( each sett)(g. &he res,lts )( ter-s of error )( -)crogra-s -cg5
are g)4e( below:
?high@ high ?high@ lo3 ?lo3@ high ?lo3@ lo3
C1.50 11.47 C14.29 5.54C6.73 9.28 C18.11 10.34
11.69 5.58 C11.16 15.23
C5.72 10.80 C10.41 C5.69
Saylor URL: http://www.saylor.org/books Saylor.org;9
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 710/723
&est ,s)(g the AN"'A F =test at the 1G le4el of s)g()+ca(ce whether the *ata pro4)*e
s,?c)e(t e4)*e(ce to co(cl,*e that the -ea( we)ght rea*)(gs by the (ewly *es)g(e*
*e4)ce 4ary a-o(g the fo,r sett)(gs.
6 &o )(4est)gate the real cost of ow()(g *)8ere(t -akes a(* -o*els of (ew
a,to-ob)les a co(s,-er protect)o( age(cy followe* 1 ow(ers of (ew 4eh)cles offo,r pop,lar -akes a(* -o*els call the- TC HA NA a(* FT a(* kept a recor* of
each of the ow(erWs real cost )( *ollars for the +rst +4e years. &he +4e=year costs of
the 1 car ow(ers are g)4e( below:
TC 4A NA FT
8423 7776 8907 10333
7889 7211 9077 9217
8665 6870 8732 10540
7129 9747
7359 8677
&est ,s)(g the AN"'A F =test at the 0G le4el of s)g()+ca(ce whether the *ata pro4)*e
s,?c)e(t e4)*e(ce to co(cl,*e that there are *)8ere(ces a-o(g the -ea( real costs of
ow(ersh)p for these fo,r -o*els.
9 @elp)(g people to lose we)ght has beco-e a h,ge )(*,stry )( the U()te* States w)th
a((,al re4e(,e )( the h,(*re*s of b)ll)o( *ollars. Rece(tly each of the three -arket=
lea*)(g we)ght re*,c)(g progra-s cla)-e* to be the -ost e8ect)4e. A co(s,-er
research co-pa(y recr,)te* 33 people who w)she* to lose we)ght a(* se(t the- tothe three lea*)(g progra-s. After s)7 -o(ths the)r we)ght losses were recor*e*. &he
res,lts are s,--ar)Ve* below:
Statistic rog8 rog8 ! rog8 "
a%$"e ean x−1=10.65 x−2=8.90 x−3=9.33
a%$"e =arian!e s21=27.20 s22=16.86 s23=32.40
a%$"e iBe n1=11 n2=11 n3=11
&he -ea( we)ght loss of the co-b)(e* sa-ple of all 33 people wasx−=9.63. &est ,s)(g
the AN"'A F =test at the 0G le4el of s)g()+ca(ce whether the *ata pro4)*e s,?c)e(t
e4)*e(ce to co(cl,*e that so-e progra- )s -ore e8ect)4e tha( the others.
1; A lea*)(g phar-ace,t)cal co-pa(y )( the *)sposable co(tact le(ses -arket has always
take( for gra(te* that the sales of certa)( per)pheral pro*,cts s,ch as co(tact le(s
sol,t)o(s wo,l* a,to-at)cally go w)th the establ)she* bra(*s. &he lo(g=sta(*)(g c,lt,re
)( the co-pa(y has bee( that le(s sol,t)o(s wo,l* (ot -ake a s)g()+ca(t *)8ere(ce )(
Saylor URL: http://www.saylor.org/books Saylor.org1;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 711/723
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 712/723
Chapter %0
Appendix !igure "&." <umulative 9inomial $robability
Saylor URL: http://www.saylor.org/books Saylor.org12
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 713/723
Saylor URL: http://www.saylor.org/books Saylor.org13
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 714/723
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 715/723
!igure "&.& <umulative Iormal $robability
Saylor URL: http://www.saylor.org/books Saylor.org10
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 716/723
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 717/723
!igure "&.* <ritical 0alues of t
Saylor URL: http://www.saylor.org/books Saylor.org1
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 718/723
Saylor URL: http://www.saylor.org/books Saylor.org16
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 719/723
!igure "&. <ritical 0alues of <hi:%quare (istributions
Saylor URL: http://www.saylor.org/books Saylor.org19
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 720/723
!igure "&. pper <ritical 0alues of !:(istributions
Saylor URL: http://www.saylor.org/books Saylor.org2;
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 721/723
Saylor URL: http://www.saylor.org/books Saylor.org21
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 722/723
Saylor URL: http://www.saylor.org/books Saylor.org22
8/17/2019 Introductory Statistics.docx
http://slidepdf.com/reader/full/introductory-statisticsdocx 723/723
!igure "&./ 'ower <ritical 0alues of !:(istributions