
Stats Workbook for College Students

Author: Brenda Gunderson, Ph.D., 2013
License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License: http://creativecommons.org/licenses/by-nc-sa/3.0/
The University of Michigan Open.Michigan initiative has reviewed this material in accordance with U.S. Copyright Law and has tried to maximize your ability to use, share, and adapt it. The attribution key provides information about how you may share and adapt this material.
Copyright holders of content included in this material should contact open.michigan@umich.edu with any questions, corrections, or clarification regarding the use of content.

Some materials are used with permission from the copyright holders. You may need to obtain new permission to use those materials for other uses. This includes all content from:
Mind on Statistics, Utts/Heckard, 4th Edition, Cengage Learning, 2012. Text Only: ISBN 9781285135984. Bundled version: ISBN 9780538733489.
SPSS and its associated programs are trademarks of SPSS Inc. for its proprietary computer software. Other product names mentioned in this resource are used for identification purposes only and may be trademarks of their respective companies.
Attribution Key
For more information see: http://open.umich.edu/wiki/AttributionPolicy
Content the copyright holder, author, or law permits you to use, share and adapt:
Creative Commons Attribution-NonCommercial-Share Alike License
Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.
Make Your Own Assessment
Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright.
Public Domain – Ineligible: Works that are ineligible for copyright protection in the U.S. (17 USC §102(b)). *Laws in your jurisdiction may differ.
Content Open.Michigan has used under a Fair Use determination
Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act (17 USC § 107). *Laws in your jurisdiction may differ.
Our determination DOES NOT mean that all uses of this third-party content are Fair Uses and we DO NOT guarantee that your use of the content is Fair. To use this content you should conduct your own independent analysis to determine whether or not your use will be Fair.
Statistics 250
Weekly Labs, In-Lab Projects, Supplements,
and Old Exams for Review
Used in all lab sections of Stat 250
Dr. Brenda Gunderson, Department of Statistics, University of Michigan
Table of Contents
Note to Students and Supplements
Supplement 1: SPSS Commands Summary
Supplement 2: Notation Sheet
Supplement 3: Name That Scenario
Supplement 4: Editing Charts in SPSS
Supplement 5: Notes about SPSS t Procedures
Supplement 6: Interpretation Examples
Supplement 7: Summary of the Main t-Tests
Supplement 8: Regression Output in SPSS
Lab 3: Time Plots and Q-Q Plots
Lab 4: Probability and Random Variables
Lab 12: Simple Linear Regression
Lab 13: Chi-Square Tests
Old Exams for Review
Exam 1 Questions
Exam 2 Questions
Final Exam Questions
Note to Students
Welcome to Statistics 250 at the University of Michigan! This lab workbook is designed for you to use in lab and as extra preparation for exams. In the workbook you will find the following materials:
Supplemental Material – great summaries for reference throughout the term:
1 SPSS Commands Reference
2 Notation Sheet
3 Name That Scenario
4 Editing Charts in SPSS
5 Important Notes for Hypothesis Testing
6 Interpretation Examples
7 Summary of T-tests and Name That Scenario Practice for Means
8 Regression Output in SPSS
Weekly Labs (numbered 1 to 13) – each lab contains the following parts:
o Lab Background – objective and brief overview material, which is good to take a couple of minutes to read before you come to lab each week.
o Warm-Up Activity – quick questions for you to do before the In-Lab Project, usually a quick review of concepts you have seen in lecture.
o ILP (In-Lab Project) – one or more activities that you will work on in lab in groups. A copy of the ILP will be provided to each group for turning in at the end of the lab period.
o Cool-Down Activity – questions for you to do after the ILP for further reflection and application of the concepts covered in the ILP.
o Example Exam Questions – old exam questions on the lab topic(s) for additional practice.
Old Exams – complete sets of actual old exams for studying. Be sure to refer to CTools to see if any problems on these old exams are not relevant for your particular upcoming exam (due to differences in the semester schedule). This information, in addition to solutions, will be posted on CTools in the "Review Info" folder under the "Resources" tab closer to each exam date.
The Labs are designed to be interactive and to provide you with a complete example for each concept. Completing the corresponding PreLab assignment (a link to video instructions for PreLabs will be on CTools and the Stat 250 YouTube channel) and reading the upcoming lab background overview before lab each week is a good way to prepare for the various lab activities.
Good luck in Statistics 250! -- The Stat 250 Instructors and GSIs
Supplement 1: SPSS Commands Summary by Lab – For Quick Reference
Lab 2 – Descriptive Statistics
To open a data file after having SPSS already open: File > Open > Data
To produce a Histogram: Graphs > Legacy Dialogs > Histogram
To produce a Boxplot for a single variable with no groups: Graphs > Legacy Dialogs > Boxplot > Simple > Summaries for separate variables
To use Data Label Mode: From inside the Chart Editor, Elements > Data Label Mode
To generate Descriptive Statistics I: Analyze > Descriptive Statistics > Descriptives
To generate Descriptive Statistics II: Analyze > Descriptive Statistics > Frequencies
To produce a Bar Chart: Graphs > Legacy Dialogs > Bar > Simple > Summaries of separate variables
To Split (or unsplit) the data (get charts and statistics by group): Data > Split File
To produce Side-by-Side Boxplots: Graphs > Legacy Dialogs > Boxplot > Simple > Summaries for groups of cases
Lab 3 – Time Plots and Q-Q Plots
To produce a Sequence (Time) Plot: Analyze > Forecasting > Sequence Charts
To produce a Q-Q Plot: Analyze > Descriptive Statistics > Q-Q Plots
Lab 8 – One-Sample Confidence Intervals for a Population Mean
To produce a Confidence Interval for a population mean (Method I): Analyze > Descriptive Statistics > Explore > Statistics option
To produce a Confidence Interval for a population mean (Method II): Analyze > Compare Means > One-Sample T Test
Lab 8 – One-Sample t Procedures for a Population Mean
To perform a One-Sample Test for a population mean: Analyze > Compare Means > One-Sample T Test
Lab 9 – Paired t Procedures
To calculate a confidence interval for μD: Analyze > Compare Means > Paired-Samples T Test
To perform a Paired Test: Analyze > Compare Means > Paired-Samples T Test
To compute Differences: Transform > Compute
Lab 10 – Independent Samples t Procedures
To construct a confidence interval for μ1 - μ2: Analyze > Compare Means > Independent-Samples T Test
To perform a Two-Samples Test: Analyze > Compare Means > Independent-Samples T Test
Lab 11 – One-Way Analysis of Variance (ANOVA)
To perform an ANOVA: Analyze > Compare Means > One-Way ANOVA
Lab 12 – Simple Linear Regression
To produce a Scatterplot: Graphs > Legacy Dialogs > Scatter/Dot
To perform a Linear Regression: Analyze > Regression > Linear
To produce a Residual plot: Graphs > Legacy Dialogs > Scatter/Dot
Lab 13 – Chi-Square Tests
To weight cases by Counts: Data > Weight Cases
To perform a Goodness of Fit Test: Analyze > Nonparametric Tests > Chi-Square
To perform a Test of Independence: Analyze > Descriptive Statistics > Crosstabs
To perform a Test of Homogeneity: Analyze > Descriptive Statistics > Crosstabs
 
Supplement 2: Notation Sheet
The table below defines important notations, including that used by SPSS, which you will come across in the course. This is not an exhaustive list, but it is a fairly comprehensive overview of the "strange letters" used in the course. Note: Blank cells mean there is no corresponding notation.

Name | Population Notation | Sample Notation | SPSS Notation
Proportion | p | p̂ (p-hat) |
Standard deviation | σ (sigma) | s | Std. Deviation
Variance | σ² | s² | Variance
Sample size | | n | N

Confidence Intervals
Margin of error

Hypothesis Testing
Test statistics: z, t, F, χ² (chi-square)
Note: t, F, and χ² statistics have degrees of freedom (abbreviated df) associated with them. Look for these on your Formula Card.
Sum of squares for groups: SSG
Sum of squares for error: SSE
Mean square for groups (look in the column labeled Mean Square)

Regression
Response (dependent) variable: y; ŷ (y-hat)
Intercept: B (look in the row labeled (Constant))
Slope: β1 (beta-one); b1; B (look in the row labeled with the name of the x-variable)
r²: R Square
ε (error terms)
Supplement 3: Name That Scenario
The first thing to do in any research inference problem is determine what type of inference problem it is. This will help in deciding what procedure/formulas are appropriate to use. The following questions can help you determine the data scenario you are working with.
Please note: when answering "How many variables are there?", do not count the variable which defines the populations (if there is more than one population).
How many populations are there?
One / Two / More than two
How many variables are there?
One / Two
Categorical / Quantitative
Then use the following table to determine which type of inference would be appropriate for this scenario.
Note the corresponding parameter is in parentheses where appropriate.
Number of Populations by Number of Variables and Type

One variable, Categorical
  One population: 1-sample inference for population proportion (p) (Labs 5 and 6); Chi-square: Goodness of Fit (Lab 13)
  Two populations: 2 independent samples inference for the difference between 2 population proportions (p1 – p2)
One variable, Quantitative
  One population: 1-sample inference for population mean (μ) (Lab 8); Paired samples inference for a population mean difference (μD) (Lab 9)
  Two populations: 2 independent samples inference for the difference between 2 population means (μ1 - μ2) (Lab 10)
  More than two populations: ANOVA (μi – where there is one μi for each population) (Lab 11)
Two variables, Categorical
  Chi-square: Independence (Lab 13)
Two variables, Quantitative (relationship ...
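
The table lookup above can be sketched as a small Python dictionary. This is only an illustration: the function name `name_that_scenario` and the key layout are assumptions made for this sketch, and only the cells that are legible in the table are included.

```python
# Lookup built only from the table entries that survive above.
# Key: (number of populations, number of variables, variable type).
PROCEDURES = {
    ("one", 1, "categorical"): [
        "1-sample inference for a population proportion (p)",
        "Chi-square: Goodness of Fit",
    ],
    ("two", 1, "categorical"): [
        "2 independent samples inference for p1 - p2",
    ],
    ("one", 1, "quantitative"): [
        "1-sample inference for a population mean (mu)",
        "Paired samples inference for a population mean difference (mu_D)",
    ],
    ("two", 1, "quantitative"): [
        "2 independent samples inference for mu1 - mu2",
    ],
    ("more than two", 1, "quantitative"): [
        "ANOVA",
    ],
    ("one", 2, "categorical"): [
        "Chi-square: Independence",
    ],
}


def name_that_scenario(populations, variables, var_type):
    """Return the candidate procedures for a scenario (empty list if the cell is not covered)."""
    return PROCEDURES.get((populations, variables, var_type), [])


# Example: two populations, one quantitative response variable.
print(name_that_scenario("two", 1, "quantitative"))
```

When a cell lists two procedures (for example one population with one quantitative variable), the study design decides between them: paired observations point to the paired procedure.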
 
Supplement 4: Editing Charts in SPSS
Once we have a histogram (or any chart) made, we may wish to edit the chart (perhaps to change the color of the bars or change the number of class intervals). To do this, double click on the chart displayed in the output viewer window. This will open the chart in the SPSS Chart Editor window.
Suppose we want to change the color of the histogram bars from tan to a light green. To do this, double click on one of the tan bars and click on the Fill & Border tab in the properties window. Change the fill color to light green (click on the light green box color), click on Apply, and then Close.
To change some aspect of the x-axis, such as the scaling, double click on the x-axis and its corresponding properties box will appear with many tab options. The Scale tab can be used to change the endpoints and major increments of the x-axis value labels. You could adjust the minimum to 10,000 and the maximum to 150,000. Leaving the Auto box checked for the Major Increment option will let SPSS create the increment size. The Histogram Options tab would allow you to add a normal curve to a histogram as well as change the starting position and size of the bins (classes or intervals) in the histogram. In general, SPSS uses algorithms to produce a nice display of the data. These options are helpful if you have multiple plots that you would like to display using the same x-axis values so comparisons can be more easily made. You can also change the gray background color to white under the Properties window with the background highlighted. The Help button in the lower right corner of the Properties box can be selected to provide more details about any of the various options for that tab. Once you have finished customizing your chart, you can close out the Chart Editor.
There are alternate ways to get to the Properties box in order to customize your chart. Once you have double-clicked on your chart to open the Chart Editor, click once on the part of the chart that you wish to customize (so that it is highlighted). Then click on the Show Properties Window tab (it looks like a paint palette) in your menu. This will open the Properties box. Another alternative is to simply select Properties under the Edit menu in the Chart Editor. Also note that if you do not close the Properties box and you continue highlighting different parts of your chart, the Properties box updates so that you can customize those parts as you go.
For boxplots, if there are any points denoted as outliers, you can identify them by looking at their case label number in the default output. The Chart Editor provides a special mode for identifying individual cases whose data labels you want to display. This is the data label mode, and when you are in data label mode you can't change anything else in the chart. From the menus choose Elements > Data Label Mode. The cursor changes shape to indicate that you are in data label mode. Click the data element for which you want to display the case label. If there are overlapping data elements in the spot that you click, the Chart Editor displays the Select Data Element to Label dialog box. This dialog allows you to select the specific data element or elements for which you want to display data labels. The Chart Editor displays the data label in a default position related to the data element. When you are finished choosing data elements, from the menus choose Elements > Data Label Mode again, and the cursor changes back to the arrow to indicate that you are no longer in Data Label Mode.
The Options menu lets you customize your chart further. You may add a title or text box from this menu. Text boxes can appear anywhere in a chart. From the Chart Editor menus, select Options > Text Box or Options > Title, depending on which you want. For titles, the Chart Editor creates the title box and automatically positions it in the top center of the chart. Type the text and press Enter when you are finished typing. To enter line breaks, press Shift+Enter. If necessary, use the Text tab to format the text. For text boxes, you can drag and drop to reposition them. You may need to resize the graph so the text box will not cover up part of the graph. You can also copy the plots into MS Word or another text editor and then type in your name and title within the document.
Saving Output Boxes and Graphs
Images and other output from SPSS can often be copied and then pasted into a document by selecting the desired output, right-clicking, and choosing Copy.
1 To save an output box, such as a table of descriptive statistics, first have the location where you would like to store the output open. Then right-click on the output table and select Copy. The table can then be pasted into a document or text-field (such as those in your PreLab assignments or homework). If you are pasting into a Word document and the output does not appear to format correctly, it may be a good idea to choose Paste Special and paste as an image. Your Stats 250 GSIs prefer not to receive Word documents.
2 To save a graph, such as a boxplot or histogram, you will want to Export the graph as an image. Select the graph you wish to export, then select Export from the File menu. At the top, click the Selection button, select None (Graphics only) under Document, and then choose the file type at the bottom. For uploading to lecturebook.com, the extension "jpg" or "png" is required. Before completing the export command, be sure to give your file an informative name and note the location where the file will be saved.
3 To save an entire SPSS session, you can export output (all output in the viewer, charts only, or text only) in many possible formats (html, jpeg, bitmap, etc.). You first make the Viewer the active window and then select the Export command under the File menu.
You can also print the contents of your output viewer window (all output, text as well as charts) or any selected portion. Click on File from the menu bar and then choose Print. You could also just save your output file within SPSS by selecting the Save as option, giving a name for your file, and clicking OK.
Supplement 5: Notes about SPSS t Procedures
1 The reported p-value under the column heading of Sig. (2-tailed) is for a 2-sided test. For a one-sided test, you first divide the reported 2-tailed p-value in half (p/2). If the t-statistic is positive and the alternative hypothesis was upper-tailed (>), then p/2 is the p-value. If the t-statistic is negative and the alternative hypothesis was lower-tailed (<), then p/2 is the p-value. However, if the t-statistic is positive and the alternative hypothesis was lower-tailed (<), or if the t-statistic is negative and the alternative hypothesis was upper-tailed (>), then the p-value is 1 - p/2.

t-statistic is positive:
  Alternative is >, then p-value is sig/2
  Alternative is <, then p-value is 1 – (sig/2)
t-statistic is negative:
  Alternative is >, then p-value is 1 – (sig/2)
  Alternative is <, then p-value is sig/2

For example, consider a 2-sided test with an observed t statistic value of 0.948 and a p-value of 0.364. This 0.364 is actually the sum of two equal areas: one being the area to the right of 0.948 and the other being the area to the left of -0.948 under the t-distribution with df = 11 curve (see Figure 1 below).
If the alternative hypothesis had been upper-tailed (>), then the p-value would be only the area to the right (in the direction of extreme) of the observed t-statistic of 0.948, which is half of the two-sided p-value, or 0.182. But if the alternative hypothesis had been lower-tailed (<), the p-value would be the area to the left (in the direction of extreme) of the observed t-statistic of 0.948. So the p-value would be 1 – (0.364/2) = 0.818.
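
The halving rule in note 1 can be written as a short Python function. This is a minimal sketch: the function name and the labels "greater" and "less" for the upper-tailed and lower-tailed alternatives are choices made here, not SPSS terminology.

```python
def one_sided_p_value(sig_two_tailed, t_stat, alternative):
    """Convert the SPSS Sig. (2-tailed) value into a one-sided p-value.

    alternative is "greater" for an upper-tailed (>) alternative
    or "less" for a lower-tailed (<) alternative.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    half = sig_two_tailed / 2
    # The p-value is sig/2 when the sign of t agrees with the direction
    # of the alternative, and 1 - sig/2 otherwise.
    agrees = (t_stat > 0) == (alternative == "greater")
    return half if agrees else 1 - half


# The worked example above: t = 0.948, Sig. (2-tailed) = 0.364
print(round(one_sided_p_value(0.364, 0.948, "greater"), 3))
print(round(one_sided_p_value(0.364, 0.948, "less"), 3))
```

The two calls reproduce the two cases of the worked example, first the upper-tailed alternative and then the lower-tailed one.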
2 Sometimes SPSS will display a p-value of 0.000. Clearly the probability is not exactly zero. Rather, it is zero to 3 decimal places. Thus it is correct to say that the p-value is less than 0.0005, since anything greater, when rounded, would have resulted in a Sig. (2-tailed) p-value of 0.001 or more.
3 One can use a one-sample t 100(1 - α)% confidence interval to test a two-sided hypothesis at the α level by checking whether the hypothesized or null test value is contained in the interval. If it is, we cannot reject H0, but if not, we would reject H0. Recall that if you want to produce a confidence interval for the population mean μ, you must specify the test value to be 0 in the one-sample t-test dialog box. For example, a test value of 30 that would result in a 95% confidence interval of the difference given as (-0.25, 0.5) is actually a confidence interval for μ – 30. See Lab 5 for more details.
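
As a rough Python sketch of note 3, the two helpers below shift an SPSS "confidence interval of the difference" back to an interval for μ and then apply the contained-or-not rule. The function names and the numbers in the example are hypothetical and chosen only for illustration.

```python
def shift_interval(diff_lower, diff_upper, test_value):
    """Turn an interval for (mu - test_value) into an interval for mu."""
    return diff_lower + test_value, diff_upper + test_value


def two_sided_decision(ci_lower, ci_upper, null_value):
    """Decision for H0: mu = null_value against a two-sided alternative."""
    if ci_lower <= null_value <= ci_upper:
        return "cannot reject H0"
    return "reject H0"


# Hypothetical output: test value 30, interval of the difference (-1.2, 0.8)
lower, upper = shift_interval(-1.2, 0.8, 30)
print(round(lower, 2), round(upper, 2))
print(two_sided_decision(lower, upper, 30))
```

The confidence level of the interval and the significance level of the test must match, for example a 95% interval for a test at the 5% level.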
4 Supplement 7 provides an overview of the assumptions for all three t-tests that are presented in Labs 8, 9, and 10.
Supplement 6: Interpretation Examples
In 1980, Bausch and Lomb Corporation developed a new type of extended-life contact lens made of silicone, which it claimed had a useful life of more than 4 years. During the research and development period, a random sample of contact wearers was asked to wear the new contact lenses and record how long they lasted. The average useful life of the six pairs of lenses was 4.6 years with a standard deviation of 0.49 years.
a Interpretation of the Standard Deviation: The average distance of the observed useful lives of these lenses from their mean useful life of 4.6 years is about 0.49 years.
b Calculate the value of the standard error of the mean:
$$SE(\bar{X}) = \frac{s}{\sqrt{n}} = \frac{0.49}{\sqrt{6}} = 0.200$$
Interpretation: We would estimate that the average distance of the possible sample mean useful life values (obtained from repeated random samples of size n = 6 pairs of such lenses) from the population mean useful life μ to be roughly 0.20 years.
c Construct a 90% confidence interval for the population mean life of all such silicone-based lenses:
$$4.6 \pm (2.015)(0.200) \Rightarrow (4.197,\ 5.003)$$
Interpretation of the Interval: This interval provides a range of reasonable values for the population mean useful life μ. We would estimate the population mean useful life μ to be between 4.197 years and 5.003 years with 90% confidence.
Interpretation of the 90% Confidence Level: This interval was made with a method which, if repeated, would generate many 90% confidence intervals. We would expect 90% of these resulting intervals to contain the population mean life μ.
d State the hypotheses to test the claim made by Bausch and Lomb about their new contact lens; that is, test if the population mean useful life is more than 4 years.
$$H_0: \mu = 4 \quad \text{versus} \quad H_a: \mu > 4$$
The observed test statistic is
$$t = \frac{\bar{x} - \mu_0}{SE(\bar{X})} = \frac{4.6 - 4}{0.200} = 3.00$$
The p-value for this test is the probability of getting a t-test statistic at least as extreme as the observed test statistic, assuming the null hypothesis is true.
So we have the p-value is $P(T \ge 3.00)$, found under the t(5) distribution. This p-value turns out to be equal to 0.015.
Interpretation of the value of the test statistic t = 3.00 in terms of a distance: The observed sample mean was 3 average distances (i.e., 3 standard errors) above the hypothesized mean of 4.
Interpretation of the resulting p-value of 0.015: If the null hypothesis was true (the population mean useful life is just 4 years) and this procedure (study) was repeated many times, we would expect to see a t-test statistic value of 3.00 or larger in only 1.5% of the repetitions. Thus our data are somewhat unusual under the null hypothesis theory, providing evidence for the alternative theory that the population mean useful life is greater than 4 years.
e At a 10% significance level, what is the decision? Reject H0 since the p-value is less than 0.10.
f What is the conclusion? There is sufficient evidence to conclude that the population mean useful life of the new lenses is greater than 4 years.
NOTE: These interpretations can be extended to any test and confidence interval, adjusting for the different parameters, different directions of extreme, different test statistics, etc.
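
The arithmetic in parts b through d can be checked with a short Python sketch that uses only the standard library. The inputs (n = 6, sample mean 4.6, standard deviation 0.49, hypothesized mean 4, and the t* multiplier 2.015) are the values of this example; the tail area is found here by numerical integration of the t density, which is a choice made for this sketch and not how SPSS computes it.

```python
import math

n, sample_mean, sample_sd = 6, 4.6, 0.49
mu0 = 4.0          # hypothesized mean under H0
t_star = 2.015     # t* multiplier for 90% confidence with df = 5 (from a t table)

# Part b: standard error of the mean
se = sample_sd / math.sqrt(n)

# Part c: 90% confidence interval for the population mean
ci = (sample_mean - t_star * se, sample_mean + t_star * se)

# Part d: t-test statistic
t_stat = (sample_mean - mu0) / se


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=20000):
    """P(T >= t), using Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_a = total * h / 3
    upper = 0.5 - area_zero_to_a
    return upper if t >= 0 else 1 - upper


p_value = t_upper_tail(t_stat, n - 1)

print("SE      :", round(se, 3))
print("90% CI  :", tuple(round(v, 3) for v in ci))
print("t       :", round(t_stat, 2))
print("p-value :", round(p_value, 3))
```

Because the alternative is upper-tailed, the p-value is the area to the right of the observed t-statistic under the t(5) curve.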
Supplement 7: Summary of the Main t-Tests
The three inference scenarios presented in Labs 8, 9, and 10 are: one-sample t procedures, paired t procedures, and two independent samples t procedures. It is important to look at the data to determine if doing a particular t procedure is appropriate. That is, we need to check the assumptions. (Recall that checking assumptions is the second step in performing a hypothesis test.) The t procedures have the following assumptions:
1 Each sample is a random sample (the observations can be viewed as realizations of independent and identically distributed random variables). In the paired t procedures, the differences are assumed to be a random sample.
2 Each sample is drawn from a normal population; that is, the response variable has a normal distribution for each population. In the paired t procedures, the population of differences is assumed to have a normal distribution. In the two-sample case, both populations of responses are assumed to have normal distributions. You need normality of the underlying population for the response in order to have normality for the sample mean. In the case where you do not have a normal population, you can still have normality of the sample mean if you have a large enough sample size (most texts state at least 30). In Lab 7 (Sampling Distributions and the CLT), you worked with an applet that demonstrated the CLT using a large sample size of 25. Thus we will accept at least 25 as large enough for inference about means.
3 For the two independent samples t procedures, we also assume that the two samples are independent. We also need to assess whether the two population variances can be assumed equal in order to decide between the pooled and the unpooled t tests.
Graphical tools can be used to check these assumptions (see Labs 2 and 3 for more details about these various graphs).
Time Plots (or Sequence Plots): If your quantitative data have been gathered over time, then a time plot can be used to determine if the underlying process that generated that time-dependent data appears to be stable. So these plots can help us assess if the random sample assumption is reasonable. For the paired design problems, we assume our set of differences calculated from the paired observations (D1, D2, ..., Dn) are a random sample. So if these differences were obtained over time, they should be plotted against their order to see if they look like they came from one population of all differences (no changing mean or variability over time). Remember: Time or Sequence plots are useful for checking stability only when the data are ordered in some sense. If there is no inherent order to the data, a sequence plot should not be made.
Histograms: Histograms are especially useful for displaying the distribution of a quantitative response variable. You could make a histogram of the observations in a one-sample problem, of the differences in a matched pairs design, and of each of the two samples separately in the independent samples design. Examine the histogram for evidence of strong departures from normality, such as bimodality or extreme outliers. Since you are just plotting data (just a sample and not the entire population of responses), your histogram may not look perfectly bell-shaped or normal.
Q-Q plots: Q-Q plots (or quantile plots or normal probability plots) are generally better than histograms for assessing if a normal model is appropriate. If the points in a Q-Q plot fall approximately in a straight line (with a positive slope), then the normal model assumption is reasonable.
Boxplots: Boxplots are most useful for assessing the validity of the assumption of equality of population variances in the two independent samples design. We would see if the IQRs (shown graphically by the length of the boxes) are comparable and also compare the overall ranges. If they do have comparable lengths or sizes (they do not need to be lined up), then we have support that the equality of population variances assumption is reasonable. We would also want to compare the two sample standard deviations themselves, and Levene's test of equality of the two population variances may also be available.
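
The Q-Q plot check described above can be sketched numerically in Python. This is an illustration outside SPSS, under two assumptions made here: the plotting positions (i - 0.5)/n, and the use of the correlation between sorted data and theoretical normal quantiles as a rough stand-in for "approximately a straight line". The workbook itself judges the plot by eye and gives no numerical cutoff; both data sets below are hypothetical.

```python
from statistics import NormalDist, mean


def normal_qq_points(data):
    """Pair each sorted observation with a standard normal quantile."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))


def qq_correlation(data):
    """Correlation of the Q-Q points; values near 1 suggest a straight line."""
    points = normal_qq_points(data)
    qs = [q for q, _ in points]
    ys = [y for _, y in points]
    mq, my = mean(qs), mean(ys)
    sxy = sum((q - mq) * (y - my) for q, y in points)
    sxx = sum((q - mq) ** 2 for q in qs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: one roughly symmetric sample, one with an extreme outlier
symmetric = [4.1, 4.3, 4.5, 4.6, 4.7, 4.9, 5.1]
skewed = [1, 1, 1, 1, 2, 3, 50]
print(round(qq_correlation(symmetric), 3))
print(round(qq_correlation(skewed), 3))
```

The sample with the extreme outlier gives a noticeably lower correlation, which corresponds to points bending away from a straight line in the plot.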
Name That Scenario Practice for the Three t-Tests: Having just reviewed the three main t-test inference scenarios, you should understand the testing procedures and be able to interpret the results of a test. However, it is important to know when each scenario applies. Read each of the following inference scenarios and determine which of the three t-test procedures would be most appropriate: the one-sample t-test, the paired t-test, or the two-independent samples t-test.
1 A researcher is studying the effect of a new teaching technique for middle school students. One class of 30 students is taught using the new technique, and their mean score on a standardized test is compared to the mean score of another class of 27 students who were taught using the old technique.
2 A company claims that the economy size version of their product contains 32 ounces. A consumer group decides to test the claim by examining a random sample of 100 economy size boxes of the product, since they have received reports that the boxes contain less than the 32 ounces claimed.
3 At some universities, athletic departments have come under fire for low academic achievement among their athletes. An athletic director decides to test whether or not athletes do in fact have lower GPAs. A random sample of 200 student athletes and a random sample of 500 non-athlete students are taken, and their GPAs are recorded.
4 As part of a biology project, some high school students compare heart rates of 40 of their classmates before and after running a mile. They want to see if the heart rate of students their age is faster after running a mile than before, on average.
5 A hospital is studying patient costs; they decide to follow 500 surgery patients' hospital and medical bills for a year after surgery and compare them to the estimated costs provided to the patients before surgery. They want to see if the estimated and actual costs are comparable on average.
6 A chemical process requires that no more than 23 grams of an ingredient be added to a batch before the first hour of the process is complete. An analyst feels that, due to current settings, more than 23 grams may actually be added. If the analyst is correct, the settings need to be altered and recent batches recalled. A random sample of 25 batches is obtained from the machine that is supposed to add the ingredient. The measurements are used to test the analyst's claim.
Supplement 8: Regression Output in SPSS
There are four parts to the default regression output. Use the scroll bar at the right edge of the Output Window to scroll up to the top of the regression output. The first section just reminds you which variable was entered as the explanatory x variable; for this example, the explanatory variable is DNA.
The second section has the heading Model Summary. The Model Summary starts with the correlation between the two variables, R, which is the absolute value of the correlation coefficient r. You need to look at the sign of the slope of the regression line to determine if you need to put a minus sign in front of this value to correctly report the correlation coefficient. (The actual value of the correlation coefficient is also reported in the last section of regression output under the column heading Beta.) The correlation coefficient measures the strength of the linear association between the two variables. The closer it is to 1 or -1, the stronger the linear association. The square of the correlation, the R Square quantity, has a useful interpretation in regression. It is often called the coefficient of determination and measures the proportion of the variation in the response that can be explained by the linear regression of y on x. Thus it is a measure of how well the linear regression model fits the data. The Std. Error of the Estimate gives the value of s, the estimate of the population standard deviation σ.
Model Summary
a. Predictors: (Constant), DNA
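
The rule for recovering the signed correlation r from the Model Summary value R can be sketched in Python as follows. The function name is made up for this sketch, and the inputs R = 0.85 and slope 0.17 are illustrative values assumed here to correspond to the Beta and B entries of this example's Coefficients table.

```python
import math


def signed_correlation(R, slope):
    """Attach the sign of the estimated slope to the Model Summary value R."""
    return math.copysign(R, slope)


R = 0.85       # illustrative value of R (always non-negative in the output)
slope = 0.17   # illustrative estimated slope b1

r = signed_correlation(R, slope)
r_squared = r ** 2   # coefficient of determination (R Square)

print(r, round(r_squared, 4))
print(signed_correlation(R, -0.17))   # a negative slope would give a negative r
```

R Square is the same whether the slope is positive or negative, which is why the sign has to be read from the slope.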
 The third part o$ the output contains the >"<> table %or reression  used $or assessing i$ the slope is signicantly
2-
 
di@erent $rom - /ia an F   test The corresponding t &test ill be discussed rst and e return to this ;N)H; part later
[SPSS ANOVA table]
The last portion of the output falls under the heading Coefficients. In this section, the least squares estimates for the regression line are given. These estimated regression coefficients are found under the column labeled B. The estimated slope is next to the independent variable name (in this example it is DNA), and the estimated intercept is next to (Constant). So b0 is the coefficient for the variable (Constant), and b1 is the coefficient for the independent variable x in the model. The next column heading is Std. Error, which provides the corresponding standard error of each of the least squares estimates. Also produced in this table are the t-test statistics, in the column labeled t, and Sig., which reports the two-sided p-values for these t-test statistics.
[SPSS Coefficients table. Rows: (Constant), DNA. Dependent Variable: PLAQUE]
The t-statistic for the slope in the second row is a test of the significance of the model with x versus the model without x; that is, for testing H0: β1 = 0 versus Ha: β1 ≠ 0. The t-statistic for the y-intercept in the first row is a test of whether the y-intercept (β0) is different from zero. This test is not often of interest unless a value of 0 for the y-intercept is meaningful and of interest. For example, if x = amount of soap used and y = height of the suds, then an intercept value of 0 is meaningful, as no soap would lead to no suds. The column labeled Sig. gives the two-sided p-value for the corresponding hypothesis test.
SPSS also provides the information to calculate confidence intervals for the parameter estimates. The column labeled Std. Error provides standard errors (estimated standard deviations) of the parameter estimates and is the quantity that is multiplied by the appropriate t* value in computing the half-width of the confidence interval. Recall that you can request SPSS to produce
these confidence intervals for you using the Statistics button in the Regression dialog box.
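The half-width calculation described above is short enough to sketch in Python. The numbers below are hypothetical (they are not taken from the SPSS output), and the helper name slope_confidence_interval is made up for this illustration. The t* value must come from a t table for the appropriate degrees of freedom; 2.306 is the value for a 95% interval with 8 degrees of freedom.

```python
def slope_confidence_interval(b1, se_b1, t_star):
    """Return (lower, upper) for the interval b1 +/- t* x SE(b1)."""
    half_width = t_star * se_b1
    return b1 - half_width, b1 + half_width

# Hypothetical estimate and standard error, for illustration only.
low, high = slope_confidence_interval(b1=0.6, se_b1=0.1, t_star=2.306)
print(f"95% confidence interval for the slope: ({low:.4f}, {high:.4f})")
```

If the interval does not contain 0, that agrees with rejecting H0: β1 = 0 in the two-sided test at the matching significance level.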
Interpretation of estimated slope b1: According to our regression model, we estimate that increasing DNA by one unit has the effect of increasing the predicted plaque by 0.17 units.
Interpretation of r²: According to our model, 63% of the variation in plaque levels can be accounted for by its linear relationship with DNA.
Decision for test of a significant linear relationship: Since the p-value 0.002 is less than the significance level α = 0.05, we can reject the null hypothesis that the population slope β1 equals 0.
Conclusion: There is sufficient evidence to conclude that, in the linear model for plaque based on DNA, the population slope β1 does not equal zero. Hence, it appears that DNA is a significant linear predictor of plaque.
Let's return to the ANOVA table in the middle of the regression output.
[SPSS ANOVA table. Predictors: (Constant), DNA. Dependent Variable: PLAQUE]
The Regression Sum of Squares corresponds to the portion of the total variation in the data that is accounted for by the regression line. Everything that is left over and not accounted for by the regression line is placed in the Residual Sum of Squares category. Then dividing the sums of squares by their respective df (degrees of freedom) yields the Mean Squares.
Finally, the ratio of the Mean Squares provides the F statistic, which tests if the slope is significantly different from zero (i.e., whether there is a significant non-zero linear relationship between the two variables: H0: β1 = 0 versus Ha: β1 ≠ 0). The Sig. is the corresponding p-value for the F test of these hypotheses.
In simple linear regression, the t-test in the Coefficients output for the slope is equivalent to the ANOVA F-test. Notice that the square of the t-statistic for testing about the slope is equal to the F-statistic in the ANOVA table, and the corresponding p-values are the same.
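The equivalence between the slope t-test and the ANOVA F-test can be checked numerically. Here is a small sketch in Python, assuming made-up data (not the DNA and plaque data). It builds the sums of squares, the mean squares, the F statistic, and the slope t-statistic from scratch.

```python
import math

# Hypothetical (x, y) pairs, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

ss_regression = sum((fi - y_bar) ** 2 for fi in fitted)       # df = 1
ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # df = n - 2
ms_regression = ss_regression / 1
ms_residual = ss_residual / (n - 2)

f_stat = ms_regression / ms_residual
se_b1 = math.sqrt(ms_residual / sxx)   # standard error of the slope
t_stat = b1 / se_b1

print(f"F = {f_stat:.3f}, t = {t_stat:.3f}, t squared = {t_stat ** 2:.3f}")
```

For these values the F statistic and the squared t-statistic agree, which is the relationship described above.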
Checking the Simple Linear Regression Assumptions
Here is a summary of some graphical procedures that are useful in detecting departures from the assumptions underlying the simple linear regression model.
1. LINEARITY: Do a scatter plot of y versus x. The plot should appear to be roughly linear.
2. STABILITY: Do a sequence plot of the residuals. The plot should show no pattern indicating any trend in the mean or in the variance of the residuals. An example series plot is shown below. Remember that it is only appropriate to make sequence plots when there is some ordering present in the data.
3. NORMALITY: Examine a Q-Q plot of the residuals to check on the assumption of normality for the population (true) error terms. An example Q-Q plot is shown below.
4. CONSTANT STANDARD DEVIATION of the population (true) error terms: Make a plot of the residuals versus x. This plot is called a residual plot. The residuals represent what is left over after the linear model has been fit. The residual plot should be a random scatter of points in roughly a horizontal band with no apparent pattern. An example residual plot is shown at the right. Sometimes this plot can also reveal departures from linearity (i.e., that the regression analysis is not appropriate due to lack of a linear relationship).
Lab 1: Scavenger Hunt, Mean and Median
Objective: In this lab, you will visit several of the sites that will be used throughout the course of the semester and learn the locations of the important resources. In the second part of the lab, you will make use of an applet to explore ideas related to measures of center for a data set.
Overview: In this course, you will be using various forms of technology, including applets that will be useful for exploring statistical concepts. There will be a lot of places to visit, and the links for all these sites are available on CTools. In the first portion of the In-Lab Project, you will complete a Scavenger Hunt that will allow you to become familiar with the resources available on CTools. In the second part of the In-Lab Project, you will use the first applet to enhance your understanding of the mean and median.
Measures of Center: Measures of center are numerical values that tend to report the middle of a set of data. The two that we will focus on are the mean and the median.
1. Mean: The mean of a set of n observations is simply the sum of the observations divided by the number of observations, n.
2. Median: The median of a set of observations, ordered from smallest to largest, is the middle value. It will be the value such that (at least) half of the observations are less than or equal to that value and (at least) half the observations are greater than or equal to that value.
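The two definitions above translate directly into a few lines of Python. This is a minimal sketch with a made-up data set. Averaging the two middle values when n is even is the usual convention and is assumed here, since the definition above only describes "the middle value."

```python
def mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the observations once they are ordered."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even n: average the two middle values (assumed convention).
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical data containing one unusually large value.
data = [2, 3, 3, 4, 50]
print("mean:", mean(data))
print("median:", median(data))
```

Changing the 50 to a 5 moves the mean a great deal but leaves the median where it was.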
Measures of Variation or Spread: Measures of variation include the Interquartile Range (IQR) and standard deviation. These numerical summaries describe the amount of spread that is found among the data, with larger values indicating more variability or more spread.
1. Standard Deviation: Standard deviation is a measure of the spread of the observations from the mean. It is actually the square root of an average of the squared deviations of the observations from the mean. We can think of the standard deviation as approximately an average distance of the observations from the mean.
2. IQR: The IQR measures the spread of the middle 50% of the data. It is defined as the difference between the 3rd quartile (Q3) and the 1st quartile (Q1). These quartiles are also called the 75th and 25th percentiles, respectively. IQR = Q3 - Q1.
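Both measures of spread can be computed with Python's standard statistics module. The sketch below uses made-up data. Note that software packages use slightly different rules for locating quartiles; the default "exclusive" method of statistics.quantiles is used here as an assumption, and it may not match SPSS or a hand calculation on every data set.

```python
import statistics

# Hypothetical data, for illustration only.
data = [1, 2, 3, 4, 5, 6, 7]

sd = statistics.stdev(data)                   # sample standard deviation (divides by n - 1)
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartiles, default "exclusive" method
iqr = q3 - q1

print(f"standard deviation = {sd:.3f}")
print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
```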
Warm-Up: Categorical and Quantitative Variables
Take a few minutes to recall the two types of variables that have been introduced in class: categorical and quantitative. For each of the following variables, determine whether it is a categorical or quantitative variable. Recall that numerical summaries such as mean and median can only be computed for quantitative variables.
Cell Phone Model (iPhone, Android, ...)    Categorical    Quantitative
Points scored in a basketball game    Categorical    Quantitative
ILP: Scavenger Hunt
The first activity in this In-Lab Project is a Scavenger Hunt that will allow you to become familiar with the locations for all your important resources on CTools.
Task    Answer
 
1. Go to the CTools Site and find the formula card page in Resources:
What is the topic header at the top of page / of the formula card?
2. Go to the CTools Site and find the Data Sets folder in Resources:
What is the name of the last data set (when sorted alphabetically, a to z)?
3. Go to the CTools Site and find the Lab Info folder in Resources:
Go to the folder for your lab section and find your lab syllabus.
Look at the Grading Policy section. The In-Lab Projects will count what % toward your final course grade?
4. Go to the CTools Site and find the Stats 250 Prelab link:
Click on the link that will take you to the Prelab Sitemaker Site.
Click on LESSON 1. How many short videos do you have to watch for your Lesson 1?
5. Go to the CTools Site and find the Assignments link in the left menu:
Find the PRELAB LESSON 1 assignment.
When is your PRELAB LESSON 1 assignment due?
6. Go to the CTools Site and find the Link to the Online HW site (called Lecturebook):
If you have your HW tool subscription, log in and select your GSI. What is your lab section number?
Note: If you don't have your GSI selected (with the correct section number), your GSI cannot see your homework, and thus you will receive 0 points!
7. Go to the CTools Site and find the Lab Info Folder in Resources:
Go to the folder for Links to Lab Applets.
Click on LAB 1.
What is the title of the Applet?
For the second part of the ILP, you will work with this applet.
ILP: The Mean and the Median
The applet that you opened in the last step of the Scavenger Hunt portion of the ILP will now be used to help you discover how the shape of the distribution for a set of data can provide important details regarding the relationship between the mean and the median for that data. In this activity, you will observe the mean and the median for a variety of shapes of distributions.
Open the Lab 1 Descriptives Applet from the "Links to Lab Applets" folder on the Stat 250 CTools site (in the "Lab Info" folder under "Resources").
Alternatively, the original applet can be found at: http://onlinestatbook.com/stat_sim/descriptive/index.html
This web site contains a Java applet that will help you understand the relationship between the mean and the median.
1. Read the applet instructions.
2. Click Begin, and you will see a histogram of nine numbers: 3, 4, 4, 5, 5, 5, 6, 6, and 7.
This histogram shows a symmetric distribution. The summary in the upper left corner shows that the mean and the median are both equal to 5, the standard deviation is 1.15, and there is no skewness (note the skewness measure is 0).
3. Change the distribution so that it now has a positive skew by "painting" the histogram with the mouse. Does this correspond to a right or left skewed distribution? Which is bigger, the mean or the median?
4. Change the distribution so that it has a negative skew. Which direction is this distribution skewed? Now which is bigger, the mean or the median?
5. Try a few other distributions (uniform, u-shaped, etc.) and see how the mean and median compare. Comment on your findings here.
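The summary values for the starting histogram in step 2 can be reproduced with Python's statistics module. This is only a check on the nine starting numbers, not part of the applet. It also suggests (as an observation from the arithmetic, not a statement from the applet's documentation) that the reported 1.15 corresponds to the standard deviation formula that divides by n rather than by n - 1.

```python
import statistics

# The nine starting values shown in the applet's histogram.
data = [3, 4, 4, 5, 5, 5, 6, 6, 7]

mean = statistics.mean(data)
median = statistics.median(data)
pop_sd = statistics.pstdev(data)     # divides by n
sample_sd = statistics.stdev(data)   # divides by n - 1

print(f"mean = {mean}, median = {median}")
print(f"sd dividing by n = {pop_sd:.2f}, sd dividing by n - 1 = {sample_sd:.2f}")
```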
Summarize what you have learned about the relationship between the shape of a distribution and the mean and median.
Cool-Down #1: Which to Report
You have seen that the mean is more sensitive to outliers than the median. For a data set that contains several outliers, which measure of center would you choose to report? What measure of spread? Explain.
Cool-Down #2: Real World Example
You are the manager of a local grocery store who is put in charge of setting the prices for your stock. You will determine the prices for each product by examining the prices of your competitors in the neighborhood. Suppose your neighborhood consists mainly of chain store supermarkets along with 2 high-end grocery stores. You want to set your prices low enough to attract customers but high enough so you will make a profit. How would you use these measures of center to help you determine the prices?
Example Exam Question on Mean and Median
Tattoos: The Pew Research Center took a survey of Millennials, that is, young adults between the ages of 18 and 29. The survey looked at characteristics that researchers thought described Millennials: social networking, life priorities, and aspects of the respondents' appearances. One question asked how many tattoos respondents had, and there were 48 respondents who had at least one tattoo. Data for these 48 respondents are shown in the histogram below.
a. How could this histogram be described? Choose all that apply.
RIGHT SKEWED    LEFT SKEWED
SYMMETRIC    STABLE
BIMODAL    UNIMODAL
DECREASING TREND
b. If the mean number of tattoos for these Millennials is 3.01 tattoos, which of the following is a reasonable value for the median?
 2 34%$ B 6
c. Is the following statement true or false? Based on the histogram, we can be sure the range of the tattoo data is exactly 20.
TRUE    FALSE
 
d. The standard deviation is computed exactly to be 2.91. Write a sentence to interpret this standard deviation (in terms of an average distance) in the context of the problem.
Lab 2: Describing Data with Graphs and Numbers
Objective: In this lab, you will use some graphical and numerical tools to summarize the distribution for a quantitative variable or response: a histogram, a boxplot, mean, median, standard deviation, and interquartile range (IQR). You will also be introduced to side-by-side boxplots for comparing two or more distributions and bar charts for summarizing categorical data. These techniques can be very useful at the start of data analysis to get a feel for the data.
Overview: Two graphs that can be used to summarize the distribution for a single quantitative variable or response are a histogram and a boxplot. Each graph provides different information about the distribution. When used properly, graphs can be a very effective way to summarize data. Data on a single quantitative variable should first be examined graphically. The overall shape of the distribution and the existence of outliers can generally be used to assess if the data appear to be coming from a relatively homogeneous population. If so, then various numerical summaries may be used to characterize the center of the distribution (such as mean and median) and the spread of the distribution (such as the standard deviation and the IQR). For categorical variables, a bar chart can be used to display the number falling in each category (frequency distribution).
Histograms: A histogram displays the distribution of a quantitative variable by showing the frequency (count) or percent of the values that are in various classes. The classes are typically intervals of numbers that cover the full range of the variable. Histograms can be used to assess the symmetry and modality of a single distribution or for comparing the relative locations and shapes of several distributions.
Boxplots: One plot that can detect extreme observations or outliers is the boxplot. A boxplot is a graphical representation of the five-number summary, namely the minimum, first quartile, median, third quartile, and maximum of the data. The centerline of the box marks the median or the 50th percentile. The sides of the box show the first (lower) quartile Q1 and the third (upper) quartile Q3. Thus a boxplot shows the overall range (maximum - minimum) and the interquartile range (IQR = Q3 - Q1). A modified boxplot uses a rule for identifying values that are extraordinary compared to the others (outliers or outside values). Circles (o) are used to denote outliers and asterisks (*) to denote extreme outliers, if any are present. Any point below Q1 - (1.5 x IQR) or above Q3 + (1.5 x IQR) is considered an outlier. Extreme outliers are points below Q1 - (2 x IQR) or above Q3 + (2 x IQR). Box plots cannot tell you the shape of the distribution.
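The outlier rule for a modified boxplot is easy to apply by hand or in code. Below is a minimal Python sketch with made-up data. The helper name fences and the choice of quartile method (the default of statistics.quantiles) are assumptions for this illustration, so the quartiles may differ slightly from those SPSS reports.

```python
import statistics

def fences(values, multiplier):
    """Return (lower, upper) cutoffs: Q1 - multiplier x IQR and Q3 + multiplier x IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr

# Hypothetical data with one very large value.
data = [1, 2, 3, 4, 5, 6, 7, 30]

low, high = fences(data, 1.5)
outliers = [v for v in data if v < low or v > high]

print(f"outlier cutoffs: below {low} or above {high}")
print("flagged as outliers:", outliers)
```

Calling fences with a larger multiplier gives the cutoffs for extreme outliers in the same way.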
Side-by-side Boxplots: These plots are helpful for comparing two or more distributions with respect to the five-number summary. For example, suppose you are interested in comparing the distribution of a variable such as the salary of the employees of a certain company. If you have information on sex for the group, you might be interested in comparing the distribution of salary of females with respect to males. In this case, the side-by-side boxplot will be an important part of the descriptive analysis of the data set involved.
Bar Charts: One way to display the number or frequency distribution for a categorical variable is with a bar chart. A bar chart shows the percentage of items that fall into each category or value of a categorical variable. It displays a bar for each category, with the height of each bar equal to the number, the proportion, or the percentage of items in that category. If the categories have no inherent order, we could rearrange the bars in the graph in any way we like. In such cases, the shape of the bar graph would have no bearing on its interpretation.
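The counts and percentages behind a bar chart are just a frequency table. As a rough sketch in Python, assuming a made-up list of categorical responses, the table and a crude text version of the bars can be produced as follows.

```python
from collections import Counter

# Hypothetical categorical responses, for illustration only.
responses = ["iPhone", "Android", "iPhone", "Other", "iPhone", "Android"]

counts = Counter(responses)
total = sum(counts.values())

# One row per category: name, count, percentage, and a text "bar".
for category, count in counts.most_common():
    share = count / total
    print(f"{category:<8}{count:>3}  {share:6.1%}  {'#' * count}")
```

Because the categories have no inherent order, sorting the rows by count (as most_common does) is only one of many acceptable arrangements.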
Warm-Up: Matching
Match the graph or descriptive statistic to one of its primary uses (some may have more than one, and you may use an answer more than once).
____ i. Histogram    A. Measure of center not sensitive to outliers
____ ii. Bar Chart    B. Compare distributions (but not their shapes)
____ iii. Mean    C. Examine distribution of a categorical variable
____ iv. Median    D. Examine distribution of a quantitative variable
____ v. Side-by-side Boxplots    E. Measure of spread
____ vi. IQR    F. Measure of center sensitive to outliers
ILP: Visualizing and Exploring Data
In this In-Lab Project, you will learn how to create graphs and obtain descriptive statistics for a data set using SPSS.
Task: The data set employee data.sav contains information on employees at a company. Explore possible questions this data could be used to address. Create appropriate graphs and obtain descriptive statistics for current salary and discuss the results.
1. Log onto your computer. To obtain the data set, go to CTools and find "Datasets for Labs and HW" in the "Resources" folder. Select employee data.sav and save it to a directory of your choice (alternatively, you may open the data set directly, in which case you do not need to open SPSS after). Once you have saved the data set, go to Programs, followed by Statistics Packages Math Programs, and then select SPSS.
2. To open the employee data.sav data set from within SPSS, select the option Open an existing data source from the dialog box, with the More Files line highlighted, and click on OK. Change the directory to where you saved the data set, select employee data.sav, and click on the Open button. The data set will open, and you can view it (it works like an Excel spreadsheet).
3. The starting view of the data is the Data Editor window. Here you can see the variables in the data set and their values. The first variable you should see is ID.
What is the second variable present in the data set? What type of variable is it? What is the eighth variable present in the data set? What type of variable is it?
4. Brainstorm possible questions this data set may have been collected to address.
5. Focus on the variable current salary. What are some graphs that would be appropriate to make for this variable?
6. Create a histogram for current salary. Use the graphs menu: Graphs > Legacy Dialogs > Histogram, select (current) salary, and move it to the variable box. Editing details can be found in the Editing Charts in SPSS section (Supplement 4) of this workbook.
Note: All Statistics 250 homework and lab work will require that students provide an appropriate title and their name on each SPSS chart or output. For histograms, click on the Titles button, enter the information, and click on Continue.
Describe what the histogram shows about the distribution of current salary. A good description includes information regarding the shape, modality, and range of the data, along with possible outliers.
7. Obtain a boxplot for current salary. Use: Graphs > Legacy Dialogs > Boxplot > Simple > Summaries of separate variables. This is appropriate for one variable with no groups. Click on the Define button to open another dialog box that defines the variables for our analysis. Click once on salary to highlight it and then on the Boxes Represent arrow to select it.
Note: Boxplots do not have a Titles option. However, you may add a title via the Chart Editor. Double click the graph, and from the Chart Editor menus select Options > Title. The Chart Editor creates the text box and automatically positions it in the top center of the chart. Type the text and press Enter when you are finished typing. To enter line breaks, press Shift+Enter.
Describe what the boxplot shows about the distribution of current salary. What do the various lines on the boxplot represent?
8. Numerical summaries may also be obtained for any quantitative variable. To obtain the five-number summary, do Analyze > Descriptive Statistics > Frequencies and then choose the summary measures you want under the Statistics button. Fill in the basic summary measures for current salary (some require hand calculation).
Mean:    Standard Deviation:
Min:    Max:    Range (Max - Min):
9. To obtain numerical summaries or any graph (except boxplots) for current salary by minority status, we need to split the data file. Use Data > Split File and choose Organize output by groups. Doing this successfully does not produce any noticeable changes in the SPSS windows; there will only be two short lines of output in the Output window confirming the split. The grouping variable is minority classification. Obtain descriptive statistics for current salary by minority status (once the data is split, just generate descriptive statistics). List some of your findings below.
Minority:    Non-Minority:
10. Create histograms for both minorities and non-minorities. (Leave the data split and create histograms as before.) For each category, would it be appropriate to summarize the shape of the distribution of the current salary using descriptors such as skewed or symmetric? Why?
Important Note: When you are finished conducting analyses by group, you need to go back to the Split File dialog box and choose Analyze all cases, do not create groups. Again, the only change will be one line of output confirming the data has been "un-split."
11. Create side-by-side boxplots for current salary. The data file should NOT be split to create these. Use Graphs > Legacy Dialogs > Boxplot with Simple and Summaries for groups of cases. Minority Status is the variable for the category axis, and current salary is the variable.
How does the distribution for current salary compare for minorities versus non-minorities (based on the side-by-side boxplots, histograms, and descriptives)?
Cool-Down: Check Your Understanding about Std Dev
Recall the definition of Standard Deviation from Lab 1: the standard deviation is a measure of the spread of the observations from the mean. It is actually the square root of an average of the squared deviations of the observations from the mean. We can think of the standard deviation as approximately an average distance of the observations from the mean.
a. Suppose we are interested in learning about heights of Michigan students. We take a simple random sample of 100 students and find that the average height for this sample is 66 inches with a standard deviation of 2 inches. Below are some interpretations of this standard deviation. For each one, evaluate if it is a correct interpretation or say why it is incorrect.
1. The average distance between the height values and the mean height is roughly 2 inches.
Correct    Incorrect, because ________________________
2. The height values differ from the mean height by approximately 2 inches, on average.
Correct    Incorrect, because ________________________
3. The average distance between the height values is roughly 2 inches.
Correct    Incorrect, because ________________________
b. A student provided the following incorrect interpretation of standard deviation:
"68% of the height values are within 2 inches of the mean height."
Why is this interpretation incorrect in general? What graph of the height data would you make to check if the statement could be correct? What would you look for in the graph?
Example Exam Question on Boxplots
Fifty-five parents of grade-school children were recently interviewed regarding the breakfast habits in their family. One question asked was if their children take the time to eat a breakfast (recorded as breakfast status: Yes or No). The grades of the children in some core classes (e.g., reading, writing, math) were also recorded, and a standardized grade score (on a 10-point scale) was computed for each child. At the end of the study, it was discovered that the children who do take time to eat breakfast get higher grade scores than those who don't.
a. What type of study is this?    Experiment    Observational study
b. What is the response variable in this study? ____________________
c. What is the explanatory variable in this study? ____________________
d. What type of variable is the explanatory variable?    Categorical    Quantitative
Side-by-side boxplots of the children's standardized grade scores are provided.
[Figure: side-by-side boxplots of standardized grade score, grouped by "Do you have breakfast"]
e. What is (approx.) the lowest grade scored by a child who does have breakfast? ___________ points
f. What is (approx.) the IQR for the grade scores of children who do eat breakfast? ___________ points
g. Using one of the measures displayed in the boxplot, complete this sentence: The highest grade scored by one of the children not eating breakfast is (approx.) equal to the ______________________ for the children who do eat breakfast.
h. True or false: The symmetry in the boxplot for the children not eating breakfast implies that the histogram of the same data is also symmetric. Circle one:    True    False    Explain:
Lab 3: Time Plots and Q-Q Plots
Objective: In this lab, you will add to your set of graphical tools for examining data. The graphs you will examine include sequence (time) plots for data collected over time and Q-Q plots for checking whether a normal model is a reasonable distribution for a quantitative variable.
Overview: Lab 2 provided a summary of some graphical and numerical tools that can be used to summarize the distribution for quantitative and qualitative variables or responses. We may use those tools for the activities in this In-Lab Project, but we will also need to utilize the new tools described below. Note that these graphical tools are introduced solely in lab, not in lecture, so it will benefit you to read this overview thoroughly.
Time (Sequence) Plots: Data is often gathered over time. Employment rate, stock prices, and sales figures are just a few examples. When data is gathered over time, it is generally wise to examine the data plotted against time. Plots against time can reveal the main features of a time series: overall patterns and striking deviations from those patterns. Some overall patterns that may arise are:
A persistent, long-term rise or fall, called a trend (either increasing or decreasing).
A pattern that repeats itself at regular intervals of time, called seasonal variation.
A persistent, long-term increase or decrease in the variation of the observations, called a pattern in variation.
If data is collected over time, a time plot can be used to check the assumption of a random sample, which will be needed for inference procedures. As you have learned in your lecture notes on sampling, a random sample consists of independent and identically distributed (iid) observations. This means that the observations can be considered as all coming from the same parent population (with the same or identical distribution) and are independent of one another. With a sequence plot, you can check the identically distributed aspect of a random sample by looking for evidence of stability in the plot. Stability is supported when both the mean of the observations and the amount of variation among observations appear to be constant over time and there does not appear to be any pattern in the resulting plot.
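As a rough numerical companion to a sequence plot, you can compare the mean and the variation in the first half of a series with those in the second half. The Python sketch below uses a made-up series, and the helper name half_summaries is invented for this illustration. It is a crude check under those assumptions, not a replacement for looking at the plot.

```python
import statistics

def half_summaries(series):
    """Return ((mean, variance) of first half, (mean, variance) of second half)."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (
        (statistics.mean(first), statistics.variance(first)),
        (statistics.mean(second), statistics.variance(second)),
    )

# Hypothetical series, in time order, with a steady upward trend.
trending = [1, 2, 3, 4, 5, 6, 7, 8]

(first_mean, first_var), (second_mean, second_var) = half_summaries(trending)
print(f"first half:  mean = {first_mean}, variance = {first_var:.2f}")
print(f"second half: mean = {second_mean}, variance = {second_var:.2f}")
```

Here the variation is the same in both halves but the mean is not, so the series would not be considered stable.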
Q-Q Plots: Later in this class, we will see that the assumption of a normal model for a population of responses will be needed in order to perform certain inference procedures. Previously, we have seen that a histogram can be used to get an idea of the shape of a distribution. However, there are more sensitive tools for checking whether the shape is close to a normal (bell-shaped) model.
The best plot that can be used to check for normality is called a Q-Q Plot, which is a plot of the percentiles (or quantiles) of a standard normal distribution against the corresponding percentiles of the observed data. If the observations follow an approximately normal distribution, the resulting plot should be roughly a straight line with a positive slope. Deviations from this indicate possible departures from a normal distribution.
At the right is an example of a Q-Q Plot showing strong support that the data seem to come from a population with an approximately normal distribution.
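To see where the points on a Q-Q plot come from, the pairs of (standard normal quantile, observed value) can be computed directly. The Python sketch below uses made-up data. The helper name qq_points and the plotting position (i - 0.5)/n are assumptions for this illustration; statistical packages use several slightly different conventions.

```python
from statistics import NormalDist

def qq_points(values):
    """Pair each ordered observation with a standard normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting position (i - 0.5) / n is one common convention (assumed here).
    return [
        (std_normal.inv_cdf((i - 0.5) / n), value)
        for i, value in enumerate(ordered, start=1)
    ]

# Hypothetical observations, for illustration only.
points = qq_points([66, 64, 68, 65, 67])
for normal_quantile, observed in points:
    print(f"{normal_quantile:7.3f}  {observed}")
```

Plotting these pairs and looking for a roughly straight line is the check described above.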
The Q-Q plot on the left indicates the existence of two clusters of observations. The Q-Q plot in the center shows an example where the shape of the distribution appears to be skewed right. The Q-Q plot on the right shows evidence of an underlying distribution that has shorter tails compared to those of a normal distribution.
Note: It is only important that you can see the departures in the above graphs, and not as important to know if the departure implies skewed left versus skewed right and so on. A histogram would allow you to see the shape and type of departure from normality.
Finally, we consider an example Q-Q plot (shown at the right) that appears normal with the exception of one data point.
In this case, we would say the Q-Q plot shows evidence of an underlying distribution which is approximately normal except for one large outlier that should be further investigated.
Note that outliers could appear in either the upper or lower tail.
Warm-Up: Check Your Understanding
Before beginning the In-Lab Project, review your understanding of the key concepts related to time plots and Q-Q plots.
1. For each word pair, select the appropriate word(s) to complete the sentences.
We use sequence (or time) plots to check the
independent    identically distributed
part of the random sample assumption
by looking to see if the data appear to be
stable    normal
that is, have a constant mean and constant variation over time.
If there is any pattern in the observations over time, we
should    should not
make a histogram of the observations for further analysis.
2. If the time plot supports that we can consider our observations to be a random sample (that is, shows our underlying process appears to be stable), we could make a histogram to help assess if the model for our response in the underlying population is normally distributed. What other graph could we make to help assess this normality assumption? What would we hope to see in that graph to support that normality is reasonable?
ILP: Time Plot and QQ Plot Examples
In this first part of the In-Lab Project, you will look at more examples of time plots to help learn how to better "read" such graphs for assessing whether our data appear to be stable and support the random sample condition.
Task 1: Go to the Stat 250 Prelab Site and find the Time Series tab along the top. Download the timeseries.rdata file script you will be using in this part of the ILP. When you double click on this script file, it should open up the R program (on all campus machines, and you can download R to your computer free too).
1. Begin the program by entering the following command:
timeseries()
2. Select your sample size by entering a number between 1 and 10000.
3. Select if you would like to see an example of stable or unstable time plots.
4. If unstable time plots are selected, you will be asked questions to determine the type of unstable pattern you'd like to see.
5. Once your time plots have been created, you will be asked if you want to save your plots to the desktop as an image. Answer, and then you will again be asked to select a next sample size. Try out the various options and explore the various patterns of time plots that were mentioned in the introduction.
Sketch below a time plot which indicates both an increasing mean and a decreasing variance.
Task 2: Go to the Stat 250 Prelab Site and find the QQ Plot tab along the top. Download the qqplot.rdata file script you will be using in this part of the ILP. When you double click on this script file, it should open up the R program (on all campus machines, and you can download R to your computer free too).
1. Begin the program by entering the following command:
qqplot()
2 Select your sample siGe by entering a number beteen 1 and 1----
3 Select the type o$ distribution you ould lie an e"ample a ** plot be generated $rom
4 )nce your ** plots and the corresponding histograms ha/e been created you ill be ased i$ you ant to sa/e your plots to the destop as an image ;nser and then you ill again be ased to select a ne"t sample siGe Try creating ** plots $or many di@erent distributions and sample siGes
5 Setch belo one o$ the resulting ** plots and 7istograms $or a sample o$ 1--- obser/ations $rom a seed right distribution ;; plot7 8istora7
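To make concrete what a QQ plot compares, the following Python sketch pairs each ordered observation with the standard normal quantile at the same relative position. It is an illustration only, not the course's qqplot() R script: the plotting position (i - 0.5)/n, the helper names, the seed, and the use of an exponential sample as the "skewed right" example are all assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def normal_qq_points(sample):
    """Pair each ordered observation with a standard normal quantile."""
    n = len(sample)
    ordered = sorted(sample)
    standard_normal = NormalDist()
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def straightness(points):
    """Pearson correlation of the QQ points; values near 1 mean a straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(1)
normal_sample = [rng.gauss(0, 1) for _ in range(1000)]
skewed_sample = [rng.expovariate(1.0) for _ in range(1000)]  # skewed right

print("normal sample:", round(straightness(normal_qq_points(normal_sample)), 3))
print("skewed sample:", round(straightness(normal_qq_points(skewed_sample)), 3))
```

When the points fall close to a straight line, a normal model is plausible. A skewed right sample bends away from the line, so its correlation is noticeably lower.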
ILP: Time-Dependent Data
Background 1: The data set deathrate.sav contains the death rate (number of deaths per 100 million miles driven) taken at two-year intervals from 1960 to 2004.
Display and summarize this data in an appropriate and useful way. What do you see? Would it make sense to make a histogram of the death rates? The following steps will guide your thinking as you complete this task.
 
2. Why should a sequence plot be made to display this data?
3. Make a sequence plot for the data using Analyze > Forecasting > Sequence Charts. What does the graph show? Comment on whether you see any trend, seasonal variation, or pattern in variation in this graph.
4. Does the plot appear to be stable? What would you conclude if asked whether the data were a random sample of death rates?
5. Would it make sense to make a histogram of the death rates? Why or why not?
Background 2: The data set oldfaithful.sav contains the date and duration of eruptions (in minutes) of the Old Faithful geyser. The data was collected several times per day over 23 consecutive days. Display and summarize the data in an appropriate and useful way. What do you see? Does there appear to be any pattern to this process? The following steps will guide your thinking as you complete this task.
1. Make a time plot for the data using Analyze > Forecasting > Sequence Charts. What does the graph show? Are there any patterns to the process?
2. Does the plot appear to be stable? What would you conclude if asked whether the data were a random sample of eruptions?
ILP: Assessing Normality
You have discussed using a histogram to examine the shape of the distribution of a quantitative variable. If the histogram shows a fairly homogeneous, unimodal set of observations, we might like to assess whether a normal distribution is a reasonable model for the response. A better graph for assessing normality is a Q-Q plot. In this problem we will examine a few distributions and see what each corresponding Q-Q plot looks like.
Background 1: Suppose a study examined high school students and the relationship between IQ and GPA. Use the iq.sav dataset and examine the distribution of IQ.
1. Create a histogram and a Q-Q plot for the IQ values. Q-Q plots are created via Analyze > Descriptive Statistics > Q-Q Plots. Provide rough sketches.
Histogram:          Q-Q Plot:
2. Describe the shape of the resulting histogram.
3. Is a normal distribution a reasonable model for IQ scores in the population based on this Q-Q plot?
Background 2: We have previously explored the salary variable in the employee data.sav dataset. Now let's check whether the overall distribution of salary can be considered normal, and then whether the distribution of salary might be normal depending on minority status.
1. Create a histogram for the variable salary and describe its shape.
2. Create a Q-Q plot for salary using Analyze > Descriptive Statistics > Q-Q Plots. Based on the evidence of these graphs, is a normal distribution an appropriate model for current salary? Why or why not?
3. Create histograms and Q-Q plots separately for minorities and non-minorities (recall the Data > Split File command). Does the distribution of salary appear to be different for either group? Comment on both histograms and Q-Q plots.
4. For salary, is a normal model reasonable for either minorities or non-minorities?
Cool-Down: Matching Graphs
Match the corresponding histograms, boxplots, and Q-Q plots.
[Figure: three histograms, with the boxplots and Q-Q plots to be matched to them]
Which type of graph best shows the shape of the underlying distribution?
Which type of graph best shows if the underlying distribution appears to be normal (bell-curve)?
Example Exam Question on Sequence Plots and Q-Q Plots
A new method of measuring phosphorus levels in soil is under consideration. A sample of 11 soil specimens is analyzed using the new method. The time series (sequence plot) for the 11 observations is presented below.
a. Comment on the overall stability of these data based on this plot.
[Figure: sequence plot of the 11 observations; horizontal axis: Sequence number]
b. An assumption of many statistical inference methods is that the data follow a normal distribution. In the space provided below, sketch how the Q-Q plot would appear if a normal distribution were a good model for phosphorus levels.
Lab 4: Probability and Random Variables
Objective: The objective of this lab is to become familiar with using the models for random variables and to find the probabilities associated with the models you have learned. The probabilities we compute from these models (for example, p-values in testing theories) will help us make reasonable decisions. You will work with three random variables and the methods used to calculate probability for each variable. You will also become familiar with several concepts that allow for easier calculation of probabilities.
Overview: In this lab you will be introduced to several random variables and their models. These variables can be classified as one of two types: a discrete random variable, which has a finite or countable number of possible outcomes, and a continuous random variable, which can take any value in an interval (or collection of intervals).
• Independent Events: Two events A, B are said to be independent if knowing that one will occur (or has occurred) does not change the probability that the other occurs. In probability notation this can be expressed as P(A|B) = P(A).
• Mutually Exclusive: Two events A, B are mutually exclusive (or disjoint) if they do not contain any of the same outcomes. So their intersection is empty.
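The two definitions above are easy to confuse, so here is a small Python sketch that checks both for events on a single roll of a fair die. The die, the event names, and the helper functions are hypothetical and chosen only for illustration. Independence is checked with the product rule P(A and B) = P(A)P(B), which is equivalent to P(A|B) = P(A) whenever P(B) > 0.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (hypothetical example).
outcomes = set(range(1, 7))


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))


def is_independent(a, b):
    """Product rule check: P(A and B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)


def is_mutually_exclusive(a, b):
    """Events are disjoint when they share no outcomes."""
    return len(a & b) == 0


even = {2, 4, 6}
low = {1, 2}
odd = {1, 3, 5}

print(is_independent(even, low), is_mutually_exclusive(even, low))
print(is_independent(even, odd), is_mutually_exclusive(even, odd))
```

In this example "even" and "low" are independent but not mutually exclusive, while "even" and "odd" are mutually exclusive but not independent: knowing the roll is odd changes the probability of "even" from 1/2 to 0.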
Random Variables: A random variable assigns a number to each outcome of a random circumstance, or equivalently, a random variable assigns a number to each unit in a population. The distribution of a random variable is a model that shows us what values are possible for that particular random variable and how often those values are expected to occur (i.e., their probabilities). The model can be expressed as a function, table, or picture, depending on the type of variable it is.
We will consider two broad classes of random variables: discrete random variables and continuous random variables.
Discrete Random Variable: A discrete random variable X is a random variable with a finite or countable number of possible outcomes. The probability distribution function (pdf) for a discrete random variable X is a table or rule that assigns probabilities to the possible values of X.
Two conditions that must always apply to the probabilities for a discrete random variable are:
Condition 1: The sum of all of the individual probabilities must equal 1.
Condition 2: The individual probabilities must be between 0 and 1.
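The two conditions can be checked mechanically. The Python sketch below does so for a distribution stored as a dictionary from values to probabilities. The function name, the tolerance, and the example distributions are hypothetical, not taken from the lab.

```python
import math


def is_valid_distribution(probs, tol=1e-9):
    """Check both conditions for a discrete probability distribution.

    probs maps each possible value of X to its probability.
    """
    # Condition 2: every individual probability lies between 0 and 1.
    in_range = all(0 <= p <= 1 for p in probs.values())
    # Condition 1: the probabilities sum to 1 (allowing for rounding error).
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one


print(is_valid_distribution({0: 0.2, 1: 0.5, 2: 0.3}))  # both conditions hold
print(is_valid_distribution({0: 0.6, 1: 0.6}))          # sum is 1.2
print(is_valid_distribution({0: 1.2, 1: -0.2}))         # sums to 1, values out of range
```

The last example shows why both conditions are needed: the probabilities sum to 1, yet neither is a legitimate probability.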
Binomial Random Variable: One type of discrete random variable is the binomial random variable, which counts the number of times a certain event occurs out of a particular number of observations or trials of a random experiment.
A binomial experiment is defined by the following conditions:
1. There are n "trials", where n is determined in advance and is not a random value.
2. There are two possible outcomes on each trial, called a "success" (S) and a "failure" (F).
3. The outcomes are independent from one trial to the next.
4. The probability of a "success" remains the same from one trial to the next, and this probability is denoted by p. The probability of a "failure" is 1 - p for every trial.
A binomial random variable is defined as: X = number of successes in the n trials of a binomial experiment.
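Under the four conditions above, the probability of exactly k successes follows the standard binomial formula, C(n, k) p^k (1 - p)^(n - k). A minimal Python sketch is given here; the function name and the values n = 4 and p = 0.5 are hypothetical and used only to illustrate the calculation.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 4, 0.5  # hypothetical: four trials, success probability 0.5
for k in range(n + 1):
    print(k, binomial_pmf(k, n, p))

# "At least" probabilities are sums of individual probabilities.
print("P(X >= 3) =", sum(binomial_pmf(k, n, p) for k in range(3, n + 1)))
```

Summing the individual probabilities over all k from 0 to n gives 1, in line with Condition 1 for discrete random variables.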
Continuous Random Variable: A continuous random variable X takes on all possible values in an interval (or a collection of intervals). The way that we determine probabilities for continuous random variables differs in one important respect from how we determine probabilities for discrete random variables. For a discrete random variable, we can find the probability that the variable X exactly equals a specified value. We can't do this for a continuous random variable. Instead, we are only able to find the probability that X could take on values in an interval. We do this by determining the corresponding area under a curve called the probability density function of the random variable.
So the probability distribution of a continuous random variable is described by a density curve. The probability of an event is the area under the curve for the values of X that make up the event.
The probability model for a continuous random variable assigns probabilities to intervals.
Definition: A curve (or function) is called a Probability Density Curve if:
1. It lies on or above the horizontal axis.
2. Total area under the curve is equal to 1.
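As an illustration of "probability equals area under the curve", the Python sketch below approximates areas with a midpoint Riemann sum. The density f(x) = x/2 on [0, 2] is a made-up example, and the helper names are assumptions. It lies on or above the axis and has total area 1, so it satisfies both parts of the definition.

```python
def area_under(density, lower, upper, steps=10_000):
    """Approximate the area under a curve with a midpoint Riemann sum."""
    width = (upper - lower) / steps
    return sum(density(lower + (i + 0.5) * width) for i in range(steps)) * width


def triangle_density(x):
    """Hypothetical density: f(x) = x / 2 on [0, 2], zero elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0


print("total area:     ", area_under(triangle_density, 0, 2))
print("P(1 <= X <= 2): ", area_under(triangle_density, 1, 2))
```

The total area comes out as 1 (up to rounding), and the area between 1 and 2 is the probability that X falls in that interval, which is 0.75 for this density.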
Normal Random Variable: The family of normal distributions is very important because many variables have approximately this shape and form, and many statistics that we use in our inference methods are based on sums or averages, which generally have (approximately) a normal distribution.
A normal curve is symmetric, bell-shaped, and centered at the mean, and its spread is determined by the standard deviation. In fact, the points of inflection on each side of the mean mark the values which are one standard deviation away from the mean.
Standardized Scores: A normal distribution is indexed by its population mean and its population standard deviation. Recall that the standard deviation is a useful "yardstick" for measuring how far an individual value falls from the mean. The standardized score or z-score is the distance between the observed value and the mean, measured in terms of number of standard deviations. Values that are above the mean have positive z-scores, and values that are below the mean have negative z-scores.
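In symbols, the z-score described above is z = (observed value - mean) / standard deviation. A short Python sketch follows; the mean of 100 and standard deviation of 15 are hypothetical numbers chosen only to make the arithmetic easy.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


def from_z_score(z, mean, sd):
    """Convert a z-score back to the original scale."""
    return mean + z * sd


# Hypothetical population: mean 100, standard deviation 15.
print(z_score(130, 100, 15))      # two standard deviations above the mean
print(z_score(85, 100, 15))       # one standard deviation below the mean
print(from_z_score(2.0, 100, 15)) # back to the original scale
```

The sign tells you the side of the mean, and the magnitude tells you how many "yardsticks" away the value is.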
Normal Approximation to the Binomial Distribution: Computing binomial probabilities exactly can be tedious when the number of trials is large. The easier way involves using a normal distribution. The normal distribution can be used to approximate probabilities for other types of random variables, one being binomial random variables when the sample size n is large.
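To see how close the approximation can be, the Python sketch below compares an exact binomial probability with the value from a normal distribution that has the same mean, np, and standard deviation, sqrt(np(1 - p)). The values n = 100, p = 0.5, and k = 55 are hypothetical, and no continuity correction is applied, so treat it as a rough illustration rather than the course's prescribed procedure.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a binomial random variable."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) using a normal curve with matching mean and sd."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(k)


n, p, k = 100, 0.5, 55  # hypothetical values
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print("exact binomial:      ", round(exact, 4))
print("normal approximation:", round(approx, 4))
```

The two answers agree to within a few hundredths here. The agreement generally improves as n grows and degrades when p is close to 0 or 1.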
Expected Value: The expected value of a random variable is the mean value of the variable X in the sample space or population of possible outcomes. Expected value, denoted by E(X), can also be interpreted as the mean value that would be obtained from an infinite number of observations on the random variable.
Standard Deviation: The standard deviation can be viewed as approximately the average distance of the possible values of X from its mean.
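For a discrete random variable, both quantities can be computed directly from the probability table: E(X) is the probability-weighted sum of the values, and the standard deviation is the square root of the probability-weighted squared distances from E(X). The Python sketch below uses a made-up distribution; the function names and the numbers are assumptions for illustration.

```python
from math import sqrt


def expected_value(dist):
    """E(X): each value weighted by its probability."""
    return sum(x * p for x, p in dist.items())


def standard_deviation(dist):
    """Square root of the probability-weighted squared distances from E(X)."""
    mu = expected_value(dist)
    return sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))


# Hypothetical distribution: value -> probability.
dist = {0: 0.25, 1: 0.50, 2: 0.25}
print("E(X) =", expected_value(dist))
print("SD(X) =", standard_deviation(dist))
```

Here E(X) is 1, and the standard deviation is a little over 0.7, which matches the intuition that the possible values sit roughly that far from the mean on average.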
Warm-Up: Types of Variables
Today's typical undergraduate student is often characterized as preferring teamwork, experiential activities, and the use of technology. An ECAR (Educause Center for Applied Research) study was published on technology use among undergraduate students. The study used survey and interviewer data to create a portrait of today's students' experiences with and skill using information technology.
1. Listed below are some of the response variables that were measured in this study. For each of these, determine whether it is categorical, quantitative discrete, or quantitative continuous.
a. Technology ownership: Do you own a computer?
   categorical     quantitative discrete     quantitative continuous
b. Time (per week, in minutes) spent using a computer for writing documents (word processing).
   categorical     quantitative discrete     quantitative continuous
c. Of which social networking site(s) are you a member? (facebook, myspace, friendster, etc.)
   categorical     quantitative discrete     quantitative continuous
2. Identify the appropriate model for each variable. (Be complete.)

 
X has a ____________________ distribution.
b. Suppose that 45% of Michigan residents own dogs. Let X represent the number of Michigan residents with a dog in a random sample of 10 Michigan residents.
X has a ____________________ distribution.
c. Let X represent the score (in points out of 100) on a standardized exam. The model for X is shown below.
X has a ____________________ distribution.
Problem 1: Study on Smiling
In a recent study, people were observed for about 10 seconds in public places (e.g., malls and restaurants) to determine whether they smiled during the randomly chosen 10-second interval. The table shows the results for comparing males (group 1) and females (group 2).
a. What is the probability that a randomly selected person smiled?
b. The researcher would like to assess if smiling status is independent of gender.
i. To check for independence, the probability found in part (a) should be compared to which of the following probabilities?
P(smiled and male)     P(smiled given male)
P(male given smiled)     P(male)
ii. Find the probability selected above and circle the appropriate conclusion.
The probability ____________________
Thus it appears that smiling status    is    is not    independent of gender.
Problem 2: Summer Trip Length
Did high gas prices keep Americans from hitting the road this past summer? In a nationwide survey of adults, one variable measured was how many days vacationers spent driving on the road on their longest trip. Consider the following (partial) probability distribution for the random variable X = the number of days for the longest car trip.

X = k          4      5      6      7      8
Probability    0.10   0.20   0.25   ____   ____

a. Suppose the probability of 7 days is twice the probability of 8 days. Complete the probability distribution for X. Show your work.
 
Problem 3: Surviving Trees
A landscaping company claims that 90% of the trees they plant survive (defined as being still alive one year from planting). If a tree does not survive, the company will replace the tree with a new one.
a. A homeowner will have 5 trees planted in his yard by this landscaping company. You can consider these 5 trees to be a random sample of all trees planted by this company. If the company's claim is correct, what is the probability at least 4 trees will survive?
b. An Office Space Developer will have 200 trees planted all around a new office space building complex by this landscaping company. You can consider these 200 trees to be a random sample of all trees planted by this company. If the company's claim is correct, what is the probability that 30 or more trees will need to be replaced (i.e., not survive)?
Problem 4: How Much Time Do You Spend Studying?
A Washington Post article, "Is college too easy? As study time falls, debate rises" (May 21, 2012), stated that the amount of time college students actually study has dwindled from an average of 24 hours per week to about 15 hours (based on a survey).
A professor of statistics decided to ask all of his current semester students to report the number of hours per week they spend studying his course material (on a regular non-exam week). The mean for the female students was 10 hours and the standard deviation was 3.5 hours.
a. Consider the following interpretations of the standard deviation and clearly circle those that are correct.
On average, the number of hours spent studying statistics varied from the mean by about 3.5 hours.
The average distance between the number of hours spent studying statistics is roughly 3.5 hours.
The average number of hours spent studying statistics is about 3.5 hours away from the mean.
 
c. The male students had a lower mean and a larger standard deviation. Suppose Jake's response corresponds to a z-score of 2.1. Complete the sentence to explain what this tells us about the number of hours that Jake studies statistics per week. (Be as specific as you can.)
d. The professor was interested in how the results of his students compare to those taking a Chemistry class. The distribution of hours spent studying (per week) for students in the Chemistry class was reported as being approximately normal with a mean of 12 and a standard deviation of 3.
 
ii. Jing learns that she is in the top 30% of this distribution. Based on the distribution, Jing must study at least how many hours per week? Make a hand sketch of what you are trying to find to help show your work.
Cool-Down: True or False
Decide whether the following statements regarding probability and random variables are true or false.
1. If the time to wait for pharmacy help has a uniform distribution from 0 minutes to 30 minutes, then 33% of the customers are expected to wait more than 20 minutes.
True     False
2. If X has a Binomial(50, 0.7) distribution, then the criteria to use the normal approximation are met.
True     False
3. 68% of test scores are always within one standard deviation of the mean test score.
True     False
Example Exam Question on Probability and Random Variables
Suppose that the amount of time spent waiting for your bus to our campus each day is a uniform random variable between 0 and 20 minutes.
a. Sketch a picture of the model for waiting time for the bus. Provide labels for each axis and some values along each axis.
Let's define the following events:
• A is the event that you wait at least 10 minutes, that is, your waiting time is in the interval [10, 20].
• B is the event that you wait at most 15 minutes, that is, your waiting time is in the interval [0, 15].
• C is the event that you wait at most 10 minutes, that is, your waiting time is in the interval [0, 10].
Answer the following questions based on the information given. Show all work.
b. What is P(A)?
Final answer: ________________
Final answer: ________________
Final answer: ________________
Final answer: ________________
f. Are the events A and C mutually exclusive? Circle one:    Yes    No    Explain briefly.
Lab 5: Confidence Intervals (CI) for a Population Proportion
Objective: This module will help you better understand the ideas involved in confidence interval estimation, as well as how to interpret both the confidence level and confidence interval for a population proportion. You will construct one-sample confidence intervals for a population proportion and check that the conditions necessary for the interval are valid.
Overview: Since generally a population proportion is an unknown number, we are interested