
CHAPTER 1 Data Collection - Los Angeles Mission College 2017... · CHAPTER 1 Data Collection ... 5. Determine the level of measurement of a variable ... 8/23/2017 11:12:57 AM


CHAPTER 1 Data Collection

Section 1.1 Introduction to the Practice of Statistics

Objectives

1. Define statistics and statistical thinking
2. Explain the process of statistics
3. Distinguish between qualitative and quantitative variables
4. Distinguish between discrete and continuous variables
5. Determine the level of measurement of a variable

Objective 1 Define statistics and statistical thinking

• Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.

Data

• The information referred to in the definition is data. Data are a "fact or proposition used to draw a conclusion or make a decision." Data describe characteristics of an individual.
• A key aspect of data is that they vary.
• One goal of statistics is to describe and understand the sources of variability.

Objective 2 Explain the Process of Statistics

• A population consists of the entire group of individuals to be studied.
• A sample is a subset of the population that is being studied.
• An individual is a person or object that is a member of the population being studied.
• Descriptive statistics consist of organizing and summarizing data. Descriptive statistics describe data through numerical summaries, tables, and graphs.
• A statistic is a numerical summary based on a sample.
• Inferential statistics uses methods that take results from a sample, extend them to the population, and measure the reliability of the result.
• A parameter is a numerical summary of a population.

Parameter versus Statistic

Example: Suppose the percentage of all students on our campus who have a job is 84.9%.

a) What is the population? All students on our campus

b) Does the value 84.9% represent a parameter or a statistic? Parameter

c) Suppose a sample of 250 students is obtained, and from this sample we find that 86.4% have a job. Does the value 86.4% represent a parameter or a statistic? Statistic
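The parameter/statistic distinction in the example can be illustrated with a small simulation. This is only a sketch: the campus size of 10,000 students and the random seed are assumptions made for illustration; only the 84.9% figure and the sample size of 250 come from the example.

```python
import random

# Hypothetical campus: 10,000 students (assumed size), 84.9% of whom have a job.
POPULATION_SIZE = 10_000
employed = 8_490  # 84.9% of 10,000
population = [1] * employed + [0] * (POPULATION_SIZE - employed)

# Parameter: a numerical summary of the whole population.
parameter = sum(population) / len(population)

# Statistic: a numerical summary of a sample of 250 students.
rng = random.Random(1)  # fixed seed (assumption) so the run is repeatable
sample = rng.sample(population, 250)
statistic = sum(sample) / len(sample)

print(f"parameter = {parameter:.3f}")
print(f"statistic = {statistic:.3f}")
```

The parameter is fixed at 0.849 no matter how often the code runs, while the statistic changes with every new sample (try a different seed). That variation is what inferential statistics has to account for.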


Illustrating the Process of Statistics

Step 1: Identify the research objective.
Step 2: Collect the information needed to answer the question.
Step 3: Describe the data. Organize and summarize the information.
Step 4: Draw conclusions from the data.

Objective 3 Distinguish between Qualitative and Quantitative Variables

• Variables are the characteristics of the individuals within the population.
• Key Point: Variables vary. Consider the variable height. If all individuals had the same height, then obtaining the height of one individual would be sufficient in knowing the heights of all individuals. Of course, this is not the case. As researchers, we wish to identify the factors that influence variability.

Variables and Types of Data

Variables can be classified as qualitative or quantitative.

• Qualitative variables allow for classification of individuals based on some attribute or characteristic.
• Quantitative variables provide numerical measures of individuals. Arithmetic operations such as addition and subtraction can be performed on the values of the quantitative variable and provide meaningful results.

Objective 4 Distinguish between Discrete and Continuous Variables

Quantitative variables can be further classified into two groups.

• A discrete variable is a quantitative variable that has either a finite number of possible values or a countable number of possible values. The term "countable" means the values result from counting, such as 0, 1, 2, 3, and so on (e.g. # of books, # of desks).
• A continuous variable is a quantitative variable that has an infinite number of possible values it can take on and can be measured to any desired level of accuracy.

Classification of Variables

Example: Classify each variable as qualitative or quantitative. If the variable is quantitative, further classify it as discrete or continuous.

a) The number of heads obtained after flipping a coin five times. Quantitative, discrete

b) Weights of newborn babies in a hospital. Quantitative, continuous

c) Eye colors of students in Math 227. Qualitative
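To see why the number of heads in five coin flips is a discrete variable, it can help to enumerate every outcome and collect the values the variable can take. The sketch below does this with the standard library; it is an illustration of part (a), not part of the original handout.

```python
from itertools import product

# Every possible outcome of flipping a coin five times (2**5 = 32 outcomes).
outcomes = list(product("HT", repeat=5))

# The variable of interest: the number of heads in each outcome.
head_counts = sorted({outcome.count("H") for outcome in outcomes})

print(len(outcomes))   # 32
print(head_counts)     # [0, 1, 2, 3, 4, 5]
```

The variable takes only six possible values, all obtained by counting, so it is discrete. A baby's weight, by contrast, cannot be enumerated this way because it can fall anywhere in a range.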


Objective 5 Determine the Level of Measurement of a Variable

• Variables can also be classified by how they are categorized, counted, or measured.
• The level of measurement of the data is useful in deciding what procedure to take to apply statistics to real problems.

Four common types of measurement scales are used to classify variables:

• Nominal level of measurement - the values of the variable name, label, or categorize. In addition, the naming scheme does not allow for the values of the variable to be arranged in a ranked, or specific, order.
• Ordinal level of measurement - it has the properties of the nominal level of measurement and the naming scheme allows for the values of the variable to be arranged in a ranked, or specific, order.
• Interval level of measurement - it has the properties of the ordinal level of measurement and the differences in the values of the variable have meaning. A value of zero in the interval level of measurement does not mean the absence of the quantity. Arithmetic operations such as addition and subtraction can be performed on values of the variable.
• Ratio level of measurement - it has the properties of the interval level of measurement and the ratios of the values of the variable have meaning. A value of zero in the ratio level of measurement means the absence of the quantity. Arithmetic operations such as multiplication and division can be performed on the values of the variable.

Example: Classify each as nominal-level, ordinal-level, interval-level, or ratio-level data.

a) Sizes of cars. Ordinal

b) Nationality of each student. Nominal

c) IQ of each student. Interval

d) Weight. Ratio

Section 1.2 Observational Studies Versus Designed Experiments

Objective

1. Distinguish between an observational study and an experiment

Objective 1 Distinguish between an Observational Study and an Experiment

• An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables. That is, in an observational study, the researcher observes the behavior of the individuals in the study without trying to influence the outcome of the study.

If a researcher assigns the individuals in a study to a certain group, intentionally changes the value of the explanatory variable, and then records the value of the response variable for each group, the researcher is conducting a designed experiment.


Example: Cellular Phones and Brain Tumors (textbook, page 15)

In both studies, the goal of the research was to determine if radio frequencies from cell phones increase the risk of contracting brain tumors. Whether or not brain cancer was contracted is the response variable (dependent variable). The level of cell phone usage is the explanatory variable (independent variable). In research, we wish to determine how varying the amount of an explanatory variable affects the value of a response variable.

Confounding in a study occurs when the effects of two or more explanatory variables are not separated. Therefore, any relation that may exist between an explanatory (independent) variable and the response (dependent) variable may be due to some other variable or variables not accounted for in the study.

A lurking variable is an explanatory variable that was not considered in a study, but that affects the value of the response variable in the study. In addition, lurking variables are typically related to any explanatory variables considered in the study.

Example: Identify the explanatory variable and the response variable for the following studies.

a) Rats with cancer are divided into two groups. One group receives 5 mg of a medication that is used to fight cancer, and the other receives […]0 mg. After 2 years, the spread of the cancer is measured.
Explanatory variable: amount of medication
Response variable: spread of the cancer

b) A researcher wants to determine whether young couples who marry are more likely to gain weight than those who stay single.
Explanatory variable: marital status
Response variable: weight gain

A census is a list of all individuals in a population along with certain characteristics of each individual.

Section 1.3 Simple Random Sampling

Objective

1. Obtain a Simple Random Sample

Objective 1 Random sampling

A sample of size n from a population of size N is obtained through simple random sampling if every possible sample of size n has an equally likely chance of occurring. The sample is then called a simple random sample.

Example: Illustrating Simple Random Sampling

Suppose a study group consists of 5 students: Bob, Patricia, Mike, Jan, and Maria. 2 of the students must go to the board to demonstrate a homework problem.

List all possible samples of size 2 (without replacement).

(Bob, Patricia), (Bob, Mike), (Bob, Jan), (Bob, Maria), (Patricia, Mike), (Patricia, Jan), (Patricia, Maria), (Mike, Jan), (Mike, Maria), (Jan, Maria)
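The study-group example can be reproduced with Python's standard library. The sketch below enumerates every sample of size 2 and then draws one simple random sample; the variable names are illustrative and not from the handout.

```python
import random
from itertools import combinations

students = ["Bob", "Patricia", "Mike", "Jan", "Maria"]

# All possible samples of size 2, without replacement.
all_samples = list(combinations(students, 2))
for pair in all_samples:
    print(pair)
print(len(all_samples))  # 10

# One simple random sample of size 2: each of the 10 pairs is equally likely.
rng = random.Random()  # unseeded, so the chosen pair differs from run to run
chosen = rng.sample(students, 2)
print(chosen)
```

With N = 5 and n = 2 there are 10 possible samples, so under simple random sampling each pair has a 1-in-10 chance of being sent to the board.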


Section 1.4 Other Effective Sampling Methods

Objectives

1. Obtain a Stratified Sample
2. Obtain a Systematic Sample
3. Obtain a Cluster Sample

A stratified sample is one obtained by separating the population into homogeneous, non-overlapping groups called strata, and then obtaining a simple random sample from each stratum. The individuals within each stratum should be homogeneous (or similar) in some way.

A systematic sample is obtained by selecting every kth individual from the population. The first individual selected is a random number between 1 and k.

A cluster sample is obtained by selecting all individuals within a randomly selected collection or group of individuals.

Section 1.5 Bias in Sampling

Objectives

1. Explain the Sources of Bias in Sampling

Sources of Bias in Sampling

• If the results of the sample are not representative of the population, then the sample has bias.

Three Sources of Bias

1. Sampling Bias - occurs when the technique used to obtain the individuals to be in the sample tends to favor one part of the population over another.
2. Nonresponse Bias - exists when individuals selected to be in the sample who do not respond to the survey have different opinions from those who do.
3. Response Bias - exists when the answers on a survey do not reflect the true feelings of the respondent.

Types of Response Bias: 1) Interviewer error; 2) Misrepresented answers; 3) Words used in survey question; 4) Order of the questions or words within the question.
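The three sampling methods of Section 1.4 (stratified, systematic, and cluster) can be compared side by side in code. The following is a minimal sketch under stated assumptions: the population of 100 numbered students, the four sections of 25, the seed, and all function names are hypothetical. The same sections serve as strata in one function and as clusters in another purely to keep the example short.

```python
import random

rng = random.Random(0)  # fixed seed (assumption) for repeatability

# Hypothetical population: students 1..100, grouped into 4 sections of 25.
population = list(range(1, 101))
sections = {s: population[s * 25:(s + 1) * 25] for s in range(4)}

def stratified_sample(strata, per_stratum):
    """Simple random sample of `per_stratum` individuals from every stratum."""
    return [x for group in strata.values() for x in rng.sample(group, per_stratum)]

def systematic_sample(individuals, k):
    """Every k-th individual, starting at a random position between 1 and k."""
    start = rng.randint(1, k)  # 1-based position of the first pick
    return individuals[start - 1::k]

def cluster_sample(clusters, n_clusters):
    """All individuals within `n_clusters` randomly selected clusters."""
    picked = rng.sample(sorted(clusters), n_clusters)
    return [x for c in picked for x in clusters[c]]

print(stratified_sample(sections, 3))     # 3 students from each of the 4 sections
print(systematic_sample(population, 10))  # 10 students, spaced 10 apart
print(cluster_sample(sections, 1))        # every student in one whole section
```

The stratified sample always contains members of every group, while the cluster sample contains everyone from only the selected group.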


Section 1.6 The Design of Experiments

Objectives

1. Describe the Characteristics of an Experiment
2. Explain the Steps in Designing an Experiment
3. Explain the Completely Randomized Design
4. Explain the Matched-Pairs Design
5. Explain the Randomized Block Design

Objective 1. Describe the Characteristics of an Experiment

An experiment is a controlled study conducted to determine the effect that varying one or more explanatory variables, or factors, has on a response variable. Any combination of the values of the factors is called a treatment.

The experimental unit (or subject) is a person, object, or some other well-defined item upon which a treatment is applied.


A control group serves as a baseline treatment that can be used to compare to other treatments.

A placebo is an innocuous medication, such as a sugar tablet, that looks, tastes, and smells like the experimental medication.

Blinding refers to nondisclosure of the treatment an experimental unit is receiving.

A single-blind experiment is one in which the experimental unit (or subject) does not know which treatment he or she is receiving.

A double-blind experiment is one in which neither the experimental unit nor the researcher in contact with the experimental unit knows which treatment the experimental unit is receiving.

EXAMPLE The Characteristics of an Experiment

The English Department of a community college is considering adopting an online version of the freshman English course. To compare the new online course to the traditional course, an English Department faculty member randomly splits a section of her course. Half of the students receive the traditional course and the other half is given an online version. At the end of the semester, both groups will be given a test to determine which performed better.

(a) Who are the experimental units? The students in the class

(b) What is the population for which this study applies? All students who enroll in the class

(c) What are the treatments? Traditional vs. online instruction

(d) What is the response variable? Exam score

(e) Why can’t this experiment be conducted with blinding? Both the students and the instructor know which treatment they are receiving

Objective 2 Explain the Steps in Designing an Experiment

To design an experiment means to describe the overall plan in conducting the experiment.

Steps in Conducting an Experiment

Step 1: Identify the problem to be solved.

• Should be explicit
• Should provide the experimenter direction
• Should identify the response variable and the population to be studied.
• Often referred to as the claim.


Step 2: Determine the factors that affect the response variable.

• Once the factors are identified, it must be determined which factors are to be fixed at some predetermined level (the control), which factors will be manipulated, and which factors will be uncontrolled.

Step 3: Determine the number of experimental units.

Step 4: Determine the level of the predictor variables.

1. Control: There are two ways to control the factors.
(a) Fix their level at one predetermined value throughout the experiment. These are variables whose effect on the response variable is not of interest.
(b) Set them at predetermined levels. These are the factors whose effect on the response variable interests us. The combinations of the levels of these factors represent the treatments in the experiment.

2. Randomize: Randomize the experimental units to various treatment groups so that the effects of variables whose level cannot be controlled are minimized. The idea is that randomization “averages out” the effect of uncontrolled predictor variables.

Step 5: Conduct the Experiment

a) Replication occurs when each treatment is applied to more than one experimental unit. This helps to assure that the effect of a treatment is not due to some characteristic of a single experimental unit. It is recommended that each treatment group have the same number of experimental units.

b) Collect and process the data by measuring the value of the response variable for each replication. Any difference in the value of the response variable can then be attributed to differences in the level of the treatment.

Step 6: Test the claim.

• This is the subject of inferential statistics.
• Inferential statistics is a process in which generalizations about a population are made on the basis of results obtained from a sample. Provide a statement regarding the level of confidence in the generalization. Methods of inferential statistics are presented later in the text.

Objective 3 Explain the Completely Randomized Design

A completely randomized design is one in which each experimental unit is randomly assigned to a treatment.

EXAMPLE Designing an Experiment

The octane of fuel is a measure of its resistance to detonation, with a higher number indicating higher resistance. An engineer wants to know whether the level of octane in gasoline affects the gas mileage of an automobile. Assist the engineer in designing an experiment.

Step 1: The response variable is miles per gallon.

Step 2: Factors that affect miles per gallon: engine size, outside temperature, driving style, driving conditions, characteristics of car.

Step 3: We will use 12 cars, all of the same model and year.

Step 4: We list the variables and their level.

• Octane level - manipulated at 3 levels. Treatment A: 87 octane, Treatment B: 89 octane, Treatment C: 92 octane


• Engine size - fixed
• Temperature - uncontrolled, but will be the same for all 12 cars.
• Driving style/conditions - all 12 cars will be driven under the same conditions on a closed track - fixed.
• Other characteristics of car - all 12 cars will be the same model year; however, there is probably variation from car to car. To account for this, we randomly assign the cars to the octane level.

Step 5: Randomly assign 4 cars to the 87 octane, 4 cars to the 89 octane, and 4 cars to the 92 octane. Give each car 3 gallons of gasoline. Drive the cars until they run out of gas. Compute the miles per gallon.

Step 6: Determine whether any differences exist in miles per gallon.

Figure: Completely Randomized Design
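The random assignment in Step 5 of the octane example can be sketched in a few lines of Python. The car labels, the seed, and the helper `miles_per_gallon` are hypothetical names introduced for illustration; the 12 cars, the three octane levels, and the 3 gallons per car come from the example.

```python
import random

rng = random.Random(0)  # fixed seed (assumption) so the assignment is repeatable

cars = [f"car{i}" for i in range(1, 13)]   # 12 cars; labels are hypothetical
treatments = {"A": 87, "B": 89, "C": 92}    # octane levels from Step 4

# Shuffle the cars, then deal them out 4 per treatment.
shuffled = cars[:]
rng.shuffle(shuffled)
assignment = {
    label: shuffled[i * 4:(i + 1) * 4]
    for i, label in enumerate(treatments)
}

def miles_per_gallon(miles_driven, gallons=3):
    """Response variable: each car gets 3 gallons and is driven until empty."""
    return miles_driven / gallons

for label, group in assignment.items():
    print(label, treatments[label], group)
```

Shuffling once and slicing guarantees that every car receives exactly one treatment and that each treatment group has the same number of cars, which matches the replication advice in Step 5.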

Objective 4. Explain the Matched-Pairs Design

A matched-pairs design is an experimental design in which the experimental units are paired up. The pairs are matched up so that they are somehow related (that is, the same person before and after a treatment, twins, husband and wife, same geographical location, and so on). There are only two levels of treatment in a matched-pairs design.

EXAMPLE A Matched-Pairs Design

Xylitol has proven effective in preventing dental caries (cavities) when included in food or gum. A total of 75 Peruvian children were given milk with and without Xylitol and were asked to evaluate the taste of each. The researchers measured the children’s ratings of the two types of milk. (Source: Castillo JL, et al (2005) Children's acceptance of milk with Xylitol or Sorbitol for dental caries prevention. BMC Oral Health (5)6.)

(a) What is the response variable in this experiment? Rating

(b) Think of some of the factors in the study. Which are controlled? Which factor is manipulated? Age and gender of the children; milk with and without Xylitol is the factor that was manipulated

(c) What are the treatments? How many treatments are there? Milk with Xylitol and milk without Xylitol; 2

(d) What type of experimental design is this? Matched-pairs design

(e) Identify the experimental units. 75 Peruvian children


(f) Why would it be a good idea to randomly assign whether the child drinks the milk with Xylitol first or second? Remove any effect due to the order in which the milk is drunk.

(g) Do you think it would be a good idea to double-blind this experiment? Yes!
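The order randomization discussed in part (f) could be carried out as in the sketch below. The child ID numbers and the seed are assumptions for illustration; the study itself does not say how (or whether) the order was randomized.

```python
import random

rng = random.Random(0)  # fixed seed (assumption) for repeatability

children = range(1, 76)  # 75 children; the ID numbers are hypothetical

# For each child, randomly decide which milk is tasted first.
# Sampling both items from a two-item list yields a random ordering.
order = {
    child: rng.sample(["with Xylitol", "without Xylitol"], 2)
    for child in children
}

first_with = sum(1 for o in order.values() if o[0] == "with Xylitol")
print(order[1])
print(first_with, "of 75 children taste the Xylitol milk first")
```

Because each child receives both treatments, every child acts as his or her own match; randomizing the order keeps a "first drink" effect from being confused with a taste difference.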

Objective 5 Explain the Randomized Block Design

Grouping similar (homogeneous) experimental units together and then randomizing the experimental units within each group to a treatment is called blocking. Each group of homogeneous individuals is called a block.

Confounding occurs when the effect of two factors (explanatory variables) on the response variable cannot be distinguished.

A randomized block design is used when the experimental units are divided into homogeneous groups called blocks. Within each block, the experimental units are randomly assigned to treatments.

EXAMPLE A Randomized Block Design

Recall, the English Department is considering adopting an online version of the freshman English course. After some deliberation, the English Department thinks that there may be a difference in the performance of the men and women in the traditional and online courses. To accommodate any potential differences, they randomly assign half the 60 men to each of the two courses and they do the same for the 70 women.

This is a randomized block design where gender forms the block. This way, gender will not play a role in the value of the response variable, test score. We do not compare test results across gender.
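The blocked assignment in this example can be sketched as follows. The student labels and the seed are hypothetical; the block sizes (60 men, 70 women) and the two courses come from the example.

```python
import random

rng = random.Random(0)  # fixed seed (assumption) for repeatability

# Blocks from the example; the student labels are hypothetical.
blocks = {
    "men": [f"M{i}" for i in range(1, 61)],     # 60 men
    "women": [f"W{i}" for i in range(1, 71)],   # 70 women
}

assignment = {}
for gender, students in blocks.items():
    # Randomize within the block only, then split the block in half.
    shuffled = students[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    assignment[gender] = {
        "traditional": shuffled[:half],
        "online": shuffled[half:],
    }

for gender, groups in assignment.items():
    print(gender, {course: len(members) for course, members in groups.items()})
```

Randomizing separately inside each block puts 30 men and 35 women in each course, so the two treatment groups have the same gender mix by construction rather than by chance.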