Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz


Chapter 1

Collecting Data in Reasonable Ways

Created by Kathy Fritz


Data and conclusions based on data are everywhere:
• Newspapers
• Magazines
• On-line reports
• Professional publications

Should you eat garlic to prevent a cold?

Will eating cheese before going to bed help you sleep better?

Will doing tai chi one hour per week increase the effectiveness of your flu shot?

Should you believe what you read?

These types of questions are answered with data gathered from samples or from experiments.


Population – the entire collection of individuals or objects that you want to learn about

Sample – the part of the population that is selected for study


In the article, “The ‘CSI Effect’ – Does It Really Exist?” (National Institute of Justice [2008]: 1-7), the author speculates that watching crime scene investigation TV shows may be associated with the kind of high-tech evidence that jurors expect to see in criminal trials. Do people who watch such shows on a regular basis have higher expectations than those who do not watch them?

Observational study – a study in which the person conducting the study observes characteristics of a sample selected from one or more existing populations.

How would one go about answering this question?

To answer this question, one would need to select a sample of people who watch these shows and a sample of people who do not. Interview these people to determine their level of expectation of high-tech evidence in criminal cases.

This is called an observational study.

The goal of an observational study is to use data from the sample to learn about the corresponding population.


Suppose a chemistry teacher wants to see the effects on students’ test scores if the lab time were increased from 3 hours to 6 hours.

Experiment - a study in which the person conducting the study looks at how a response variable behaves under different experimental conditions.

Would you use an observational study to answer this question? Why or why not?

Questions in the form “What would happen when . . .?” or “What is the effect of . . .?” CANNOT be answered with data from an observational study.

An experiment MUST be used.


A big difference between an experiment and an observational study is . . .

• in an experiment, the person carrying out the study determines who will be in what experimental groups and what the experimental conditions will be

• in an observational study, the person carrying out the study does NOT determine who will be in what groups

In the example about increasing lab time in a chemistry class to see the effect on test scores, the teacher would determine which students are in the 3-hour lab and which are in the 6-hour lab.

In the example about the “CSI effect”, the researcher did NOT determine whether or not the people watched investigative TV shows.


Observational Studies

• Purpose is to collect data that will allow you to learn about a single population or about how two or more populations differ
• Allows you to answer questions like
  “What is the proportion of . . .?”
  “What is the average of . . .?”
  “Is there an association between . . .?”

Obtaining information about the entire population is called a census.

The “ideal” study would be to carry out a census.


Census versus Sample

Why might we prefer to select a sample rather than perform a census?

1. Measurements that require destroying the item
   • Measuring how long batteries last
   • Safety ratings of cars

2. Difficult to find the entire population
   • Length of fish in a lake

3. Limited resources (the most common reason to use a sample)
   • Time and money


When you answer questions like “What is the proportion of . . .?” or “What is the average of . . .?”, you are interested in the population characteristic.

A population characteristic is a number that describes the entire population.

A statistic is a number that describes a sample.

It is important that a sample be representative of the population.

How can we be sure that the sample is representative of the population?

One way is to take a simple random sample.


Simple Random Sample

A sample of size n is selected from the population in a way that ensures that every different possible sample of the desired size has the same chance of being selected.

The letter n is used to denote sample size: the number of individuals or objects in the sample.

Suppose you want to select 10 employees from all employees of a large design firm. What is the value of n?

To select a simple random sample, create a list (called a sampling frame) of all the employees in the firm. Number each employee with a unique number. Use a random digit table, a random number generator, or numbers selected from a hat to select the 10 employees for the sample.

In order to be a simple random sample, EVERY sample of size 10 MUST have an equal chance to occur. Thus, it is possible that 10 full-time, 10 part-time, or any combination of full-time and part-time employees are selected.
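The random number generator option can be sketched in a few lines of Python. This is only an illustration: the frame of 250 numbered employees, the function name `simple_random_sample`, and the seed are assumptions made for the example, not part of the slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Return n distinct items from the sampling frame.

    random.sample draws without replacement, and every subset of
    size n is equally likely to be returned.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: employees numbered 1 through 250.
employees = list(range(1, 251))

# n = 10, as in the design-firm example; the seed only makes the run repeatable.
srs = simple_random_sample(employees, n=10, seed=1)
print(sorted(srs))
```

Because the draw ignores any employee attribute, the sample could turn out to be all full-time, all part-time, or any mix.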


How to use a Random digit table

Suppose our design firm has 250 employees.

The following is part of the random digit table found in the back of your textbook:

Row   Digits
6     0 9 3 8 7 6 7 9 9 5 6 2 5 6 5 8 4 2 6 4
7     4 1 0 1 0 2 2 0 4 7 5 1 1 9 4 7 9 7 5 1
8     6 4 7 3 6 3 4 5 1 2 3 1 1 8 0 0 4 8 2 0
9     8 0 2 8 7 9 3 8 4 0 4 2 0 8 9 1 2 3 3 2

Number employees from 1 to 250. Select 3-digit numbers from the table. If the number is not within 1-250, ignore it.

The sample would be the employees that correspond to the selected numbers.
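The table-reading procedure can be traced with a short Python sketch. It uses the digits of rows 6 to 9 from the table above. Two choices here are assumptions rather than rules from the slides: reading continues from the end of one row into the next, and a number that has already been chosen is skipped (sampling without replacement). The function name `select_from_digits` is hypothetical.

```python
# Rows 6-9 of the random digit table (row labels removed, spaces dropped).
rows = [
    "09387679956256584264",
    "41010220475119479751",
    "64736345123118004820",
    "80287938404208912332",
]

def select_from_digits(digits, population_size, n, width=3):
    """Read consecutive width-digit numbers from a digit string.

    Keep a number only if it lies in 1..population_size and has not
    been chosen already; stop once n numbers have been kept.
    """
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == n:
            break
    return chosen

# Assumption: treat the four rows as one continuous stream of digits.
digits = "".join(rows)
picked = select_from_digits(digits, population_size=250, n=10)
print(picked)  # [93, 101, 22, 47, 164, 123, 118, 4, 42, 89]
```

Most 3-digit groups (876, 799, 562, ...) fall outside 1-250 and are ignored, which is why it takes almost all 80 digits to find 10 employees.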


Sampling with replacement

Sampling in which an individual or object, once selected, is put back into the population before the next selection.

This allows an object or individual to be selected more than once for a sample.

In practice, sampling with replacement is rarely used.


Sampling without replacement

Sampling in which an individual or object, once selected, is NOT put back into the population before the next selection.

That is, once an individual or object is selected, it is not replaced and cannot be selected again.

In practice, sampling without replacement is more common.

Although sampling with and without replacement are different, they can be treated as the same when the sample size n is relatively small compared to the population size (no more than 10% of the population).
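The difference between the two schemes, and the 10% guideline, can be illustrated with Python's standard `random` module. The population of 20 labelled individuals and the helper name `within_ten_percent` are assumptions made for this sketch.

```python
import random

rng = random.Random(0)

# Hypothetical population of 20 labelled individuals.
population = list(range(1, 21))

# With replacement: each draw is made from the full population,
# so the same individual can appear more than once.
with_replacement = rng.choices(population, k=10)

# Without replacement: a selected individual cannot be selected again.
without_replacement = rng.sample(population, k=10)

def within_ten_percent(n, population_size):
    """True when the sample is no more than 10% of the population."""
    return n <= 0.10 * population_size

print(with_replacement)
print(without_replacement)
print(within_ten_percent(10, 250))  # 10 of 250 employees: True
print(within_ten_percent(10, 20))   # 10 of 20 individuals: False
```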


Can a smaller sample size be representative of the population?

[Figure: the population and simple random samples of varying sizes]

Notice that the sample of size 50 is still representative of the population.

Although it is possible to obtain a simple random sample that is not representative of the population, this is likely ONLY when the sample size is very small.
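A small simulation can make this point concrete. The sketch below is only an illustration under stated assumptions: the population is 10,000 made-up values, "representative" is judged by how close the sample mean lands to the population mean, and the function name `spread_of_sample_means` is hypothetical.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of 10,000 numeric values.
population = [rng.gauss(50, 10) for _ in range(10_000)]

def spread_of_sample_means(n, repetitions=500):
    """Take many simple random samples of size n and report how much
    their means vary (standard deviation of the sample means)."""
    means = [statistics.mean(rng.sample(population, n))
             for _ in range(repetitions)]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(5)
spread_large = spread_of_sample_means(50)

print(f"population mean:            {statistics.mean(population):.2f}")
print(f"spread of means when n = 5:  {spread_small:.2f}")
print(f"spread of means when n = 50: {spread_large:.2f}")
```

Sample means from samples of size 50 cluster much more tightly around the population mean than those from samples of size 5, so an unrepresentative sample is far less likely at the larger size.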


Now let's take a look at other sampling methods.

Stratified Random Sample

The population is first divided into non-overlapping subgroups (called strata). Then separate simple random samples are selected from each subgroup (stratum).

Strata are subgroups that are similar (homogeneous) based upon some characteristic of the group members.

Suppose we wanted to estimate the average cost of malpractice insurance for doctors in a particular city. We could view the population of all doctors in this city as falling into one of four subgroups: (1) surgeons, (2) internists and family practitioners, (3) obstetricians, and (4) all other doctors.

Stratified random sampling is often easier to implement and is more cost effective than simple random sampling.

The real advantage of stratified sampling is that it often allows you to make more accurate inferences about a population than does simple random sampling.

In general, it is much easier to produce relatively accurate estimates of characteristics of a homogeneous group than of a heterogeneous group.
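The doctors example can be sketched in Python as a separate simple random sample within each stratum. The stratum sizes, the doctor labels, the 10% sampling fraction, and the function name `stratified_sample` are all invented for illustration; the slides give only the four subgroup names.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Select a separate simple random sample from each stratum.

    strata:      dict mapping stratum name -> list of members
    per_stratum: dict mapping stratum name -> number to select
    """
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum[name])
            for name, members in strata.items()}

# Hypothetical strata (sizes are assumptions, not data from the slides).
strata = {
    "surgeons": [f"surgeon_{i}" for i in range(1, 41)],
    "internists_family": [f"internist_{i}" for i in range(1, 61)],
    "obstetricians": [f"obstetrician_{i}" for i in range(1, 21)],
    "other": [f"other_{i}" for i in range(1, 81)],
}

# Assumption: sample 10% of each stratum.
per_stratum = {"surgeons": 4, "internists_family": 6,
               "obstetricians": 2, "other": 8}

selected = stratified_sample(strata, per_stratum, seed=3)
for name, doctors in selected.items():
    print(name, doctors)
```

Every stratum is guaranteed to appear in the sample, which a single simple random sample of 20 doctors would not guarantee.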


Cluster Sampling

Sometimes it is easier to select groups of individuals from a population than it is to select individuals themselves.

Cluster sampling divides the population of interest into nonoverlapping subgroups, called clusters. Clusters are then selected at random, and ALL individuals in the selected clusters are included in the sample.

Suppose that a large urban high school has 600 senior students, all of whom are enrolled in a first period homeroom. There are 24 senior homerooms, each with approximately 25 students. The school administrators want to select a sample of roughly 75 seniors to participate in a survey.

Which would be easier to do?

A) Find 75 randomly selected individuals

B) Find 3 randomly selected homerooms

Option B is easier: randomly select 3 senior homerooms and then include all the students in the selected homerooms in the sample.

The ideal situation for cluster sampling is when each cluster mirrors the characteristics of the population.
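The homeroom example can be sketched in Python. As a simplifying assumption, every homeroom is given exactly 25 students (the slide says "approximately 25"), and the student labels are made up.

```python
import random

rng = random.Random(7)

# 24 senior homerooms; assumption: exactly 25 students in each (600 total).
homerooms = {
    room: [f"room{room}_student{s}" for s in range(1, 26)]
    for room in range(1, 25)
}

# Select whole clusters at random ...
chosen_rooms = rng.sample(sorted(homerooms), k=3)

# ... and include ALL students in the selected clusters.
cluster_sample = [student
                  for room in chosen_rooms
                  for student in homerooms[room]]

print(chosen_rooms)
print(len(cluster_sample))  # 75
```

Only 3 random selections are needed instead of 75, which is what makes option B easier to carry out.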


Be careful not to confuse clustering and stratification!

Stratification
• Put the population into strata (homogeneous groups)
• Then randomly select individuals from each group to form the sample

Clustering
• Put the population into clusters
• Then randomly select entire clusters to form the sample


1 in k Systematic Sampling

Selects a sample from an ordered arrangement of a population by
• first choosing a starting point at random from the first k individuals
• then selecting every kth individual after that

Suppose you wish to select a sample of faculty members from the faculty phone directory. You would first randomly select a faculty member from the first 20 (k = 20) faculty members listed in the directory. Then select every 20th faculty member after that on the list.

This method works reasonably well as long as there are no repeating patterns in the population list.

k is often selected so that a certain sample size is produced.

Let N = population size and n = sample size.

k = N ÷ n
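A 1 in k systematic sample can be sketched in Python as follows. The directory of 400 faculty members is a made-up example chosen so that k = N ÷ n = 400 ÷ 20 = 20, matching the slide; the function name `systematic_sample` and the choice to round k down when N is not a multiple of n are assumptions.

```python
import random

def systematic_sample(frame, n, seed=None):
    """1 in k systematic sample from an ordered list.

    k = N // n (rounded down when N is not a multiple of n).
    A random start is chosen among the first k positions,
    then every k-th item after it is taken.
    """
    rng = random.Random(seed)
    k = len(frame) // n
    start = rng.randrange(k)      # 0-based position within the first k
    return frame[start::k][:n]

# Hypothetical faculty directory with N = 400 entries.
directory = [f"faculty_{i}" for i in range(1, 401)]

faculty_sample = systematic_sample(directory, n=20, seed=3)
print(faculty_sample)
```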


Convenience Sampling

Selecting individuals or objects that are easy or convenient to sample.

Suppose your statistics professor asked you to gather a sample of 20 students from your college. You survey 20 students in your next class which is music theory.

Will this sample be representative of the population of all students at your college? Why or why not?

Convenience sampling is rarely representative of the population, so DON’T USE IT!


Voluntary response is a type of convenience sampling which relies solely on individuals volunteering to be part of the study.

People who are motivated to volunteer responses often hold strong opinions. It is extremely unlikely that they are representative of the population!


Identify the sampling method

1) The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into 6 subgroups of similar types (small public, small private, medium public, medium private, large public, and large private). Then they randomly selected 3 colleges from each group.

Stratified random sample


Identify the sampling design

2) A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all who live on those blocks.

Cluster sample


Identify the sampling design

3) A local restaurant manager wants to survey customers about the service they receive. Each night the manager randomly chooses a number between 1 & 10. He then gives a survey to that customer, and to every 10th customer after them, to fill it out before they leave.

Systematic sampling


Consider the following example:

In 1936, Franklin Delano Roosevelt had been President for one term.  The magazine, The Literary Digest, predicted that Alf Landon would beat FDR in that year's election by 57 to 43 percent.  The Digest mailed over 10 million questionnaires to names drawn from lists of automobile and telephone owners, and over 2.3 million people responded - a huge sample.

At the same time, a young man named George Gallup sampled only 50,000 people and predicted that Roosevelt would win.  Gallup's prediction was ridiculed as naive. 

After all, the Digest had predicted the winner in every election since 1916, and had based its predictions on the largest response to any poll in history. 

But Roosevelt won with 62% of the vote.  The size of the Digest's error is staggering. 

This is a classic example of how bias affects the results of a sample!

Bias is the tendency for samples to differ from the corresponding population in some systematic way.


Sources of bias

Selection bias

• Occurs when the way the sample is selected systematically excludes some part of the population of interest

• May also occur if only volunteers or self-selected individuals are used in a study

Suppose you take a sample by randomly selecting names from the phone book – some groups will not have the opportunity of being selected!

• People with unlisted phone numbers – usually high-income families
• People without phone numbers – usually low-income families
• People with ONLY cell phones – usually young adults


Sources of bias

Measurement or Response bias

• Occurs when the method of observation tends to produce values that systematically differ from the true value in some way
  – Improperly calibrated scale is used to weigh items
  – Tendency of people not to be completely honest when asked about illegal behavior or unpopular beliefs
  – Appearance or behavior of the person asking the questions
  – Questions on a survey are worded in a way that tends to influence the response

Suppose we wanted to survey high school students on drug abuse and we used a uniformed police officer to interview each student in our sample – would we get honest answers?

People are asked if they can trust men with mustaches – the interviewer is a man with a mustache.

A Gallup survey sponsored by the American Paper Institute (Wall Street Journal, May 17, 1994) included the following question: “It is estimated that disposable diapers account for less than 2% of the trash in today’s landfills. In contrast, beverage containers, third-class mail and yard waste are estimated to account for about 21% of trash in landfills. Given this, in your opinion, would it be fair to tax or ban disposable diapers?”


Sources of bias

Nonresponse

• Occurs when responses are not obtained from all individuals selected for inclusion in the sample
• To minimize nonresponse bias, it is critical that a serious effort be made to follow up with individuals who did not respond to the initial request for information

How might this follow-up be done?

The phone rings – you answer. “Hello,” the person says, “do you have time for a survey about radio stations?” You hang up!


Sources of bias

Will increasing the sample size reduce the effects of bias in the study?

No, it does nothing to reduce bias if
• the method of selection is flawed
• non-response is high
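A short simulation shows why a larger sample does not repair a flawed selection method. Everything in this sketch is invented for illustration: a population of 200,000 voters in which 55% favor candidate A, and a sampling frame that reaches only a subgroup in which 40% do.

```python
import random

rng = random.Random(2024)

# Hypothetical population (1 = favors candidate A, 0 = does not).
in_frame = [1] * 40_000 + [0] * 60_000       # reachable by the survey: 40% favor A
not_in_frame = [1] * 70_000 + [0] * 30_000   # never reachable: 70% favor A

true_proportion = (sum(in_frame) + sum(not_in_frame)) / 200_000  # 0.55

def estimate(n):
    """Sample proportion from a random sample of size n drawn from the
    flawed frame, which systematically excludes part of the population."""
    return sum(rng.sample(in_frame, n)) / n

small = estimate(500)
large = estimate(50_000)

print(f"true proportion:        {true_proportion:.3f}")
print(f"estimate with n = 500:   {small:.3f}")
print(f"estimate with n = 50000: {large:.3f}")
```

The larger sample settles ever more precisely on about 0.40, the value for the reachable subgroup, and stays far from the true 0.55. More data from a biased method gives a more precise wrong answer.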


Identify one potential source of bias.

Suppose that you want to estimate the total amount of money spent by students on textbooks each semester at a local college. You collect register receipts for students as they leave the bookstore during lunch one day.

Convenience sampling – easy way to collect data
or
Selection bias – students who buy books from on-line bookstores are excluded.


Identify one potential source of bias.

To find the average value of a home in Plano, one averages the price of homes that are listed for sale with a realtor.

Selection bias – leaves out homes that are not for sale or homes that are listed with different realtors. (other answers are possible)


The article “What People Buy from Fast-Food Restaurants: Caloric Content and Menu Item Selection” (Obesity [2009]: 1369-1374) reported that the average number of calories consumed at lunch in New York City fast-food restaurants was 827. The researchers selected 267 fast-food locations at random. The paper states that at each of these locations “adult” customers were approached as they entered the restaurant and asked to provide their food receipt when exiting and to complete a brief survey.

Will this study result in data that is representative of the population?


Question: What is the population of interest?
Response: The people who eat at fast-food restaurants in NYC

Question: Was the sample selected in a reasonable way?
Response: Yes – because the researchers randomly selected the fast-food restaurants, this is a reasonable way to select the sample

Question: Is the sample likely to be representative of the population of interest?
Response: Yes – because the researchers randomly selected the fast-food restaurants, it is reasonable to regard the people eating at these locations as representative

Question: Are there any obvious sources of bias?
Response: Two potential sources of bias
• Response bias since customers were approached before they ordered
• Nonresponse – some people refused to participate


Experimental Design


Experiments

Suppose we are interested in determining the effect of room temperature on the performance on a first-semester calculus exam. So we decide to perform an experiment.

What variable will we “measure”?
the performance on a calculus exam

This is called the response variable.

Response variable – a variable that is not controlled by the experimenter and that is measured as part of the experiment

What variable will “explain” the results on the calculus exam?
the room temperature

This is called the explanatory variable.

Explanatory variables – those variables that have values that are controlled by the experimenter (also called factors)


Room temperature experiment continued . . .

We decide to use two temperature settings, 65° and 75°.

How many treatments would our experiment have?
the 2 treatments are the 2 temperature settings

Experimental condition – any particular combination of values of the explanatory variables (also called treatments)


Room temperature experiment continued . . .

Suppose we have 10 sections of first-semester calculus that have agreed to participate in our study.

On whom or what will we impose the treatments?
the 10 sections of calculus

These are our subjects or experimental units.

Experimental units – the smallest unit to which a treatment is applied.

Should the instructors of these sections be allowed to select the room temperature to which their sections are assigned?

No. An instructor would probably select the same temperature for all of his or her sections, so it would be difficult to tell if the scores are due to the temperature or to the instructor’s teaching style.

This is an example of a confounding variable.

Confounding variable – two variables are confounded when their effects on the response can NOT be distinguished.


Designing Strategies for Single Comparative Experiments

The goal of a single comparative experiment is to determine the effects of the treatment on the response variable.

To do this, you must consider other potential sources of variability in the response:
• Eliminate them
OR
• Ensure they produce chance-like variability


Room temperature experiment continued . . .

Remember – the explanatory variable is the room temperature setting, 65° and 75°. The response variable is the grade on the calculus exam.

Are there other variables that could affect the response?
• Time of day?
• Instructor?
• Textbook?
• Ability level of students?

In an experiment, these other variables need to be “controlled”.

Direct control is holding the other variables constant so that their effects are not confounded with those of the experimental conditions (treatments).

Can the experimenter control these other variables? If so, how?

What about the variables that the experimenter can’t directly control? What can be done to avoid confounding results?


Room temperature experiment continued . . .

The experimenter cannot control who the instructors are. Therefore, the instructors may be potentially confounding.

Another way to control a variable is to block by that variable.

We use each instructor as his/her own block. Then the sections of each instructor will be randomly assigned to the two treatments.


Room temperature experiment continued . . .

What about other variables that we cannot control directly or that we don’t even think about?

Random assignment should evenly spread all other variables that are not controlled directly across all treatment groups. We expect these variables to affect all the experimental groups in the same way; therefore, their effects are not confounding.


Room temperature experiment continued . . .

To randomly assign the 10 sections of first-semester calculus to the 2 treatment groups, we would first number the classes 1-10.

Place the numbers 1-10 on identical slips of paper and put them in a hat. Mix well.

Randomly select 5 numbers from the hat. Those will be the sections that have the room temperature set at 65°.

The remaining sections will have the room temperature set at 75°.

Treatment           Sections assigned
Treatment 1 (65°)   9, 7, 5, 8, 3
Treatment 2 (75°)   1, 2, 4, 6, 10

There are 10 sections, so each treatment is applied to 5 sections. This is called replication. Replication ensures that there is an adequate number of observations for each experimental condition.
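The slips-of-paper procedure can be mimicked in Python by shuffling the section numbers. This is a minimal sketch; the seed and variable names are assumptions, and a different seed produces a different (equally valid) assignment than the one on the slide.

```python
import random

rng = random.Random(11)

# Number the 10 calculus sections 1-10 (the "slips of paper").
sections = list(range(1, 11))

# Mix well: shuffle a copy so the original list is left untouched.
shuffled = sections[:]
rng.shuffle(shuffled)

# The first 5 drawn get 65 degrees; the remaining 5 get 75 degrees.
group_65 = sorted(shuffled[:5])
group_75 = sorted(shuffled[5:])

print("Treatment 1 (65 degrees):", group_65)
print("Treatment 2 (75 degrees):", group_75)
```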


Key Concepts of Experimental Design

Random assignment is a critical component of a good experiment. Random assignment removes the potential for confounding variables by creating equivalent experimental groups.

Direct control holds potential confounding variables constant so their effects are not confounded with the treatments.

Blocking uses potentially confounding variables to create groups (blocks) that are similar. All experimental conditions (treatments) are then tried in each block.

Replication ensures that there is an adequate number of observations for each treatment.


What to do with Potentially Confounding Variables

Potentially confounding variables that are known and incorporated into the experimental design:

Use:
• Direct control – hold potentially confounding variables fixed
• Blocking – allow for valid comparisons because each treatment is tried in each block

Potentially confounding variables that are NOT known or NOT incorporated into the experimental design:

Use:
• Random assignment


Types of Experimental Design

Completely Randomized Design

An experiment in which experimental units are randomly assigned to treatments is called a completely randomized experiment.

[Diagram: Experimental units → Random assignment → Treatment A → Measure response for A; Treatment B → Measure response for B → Compare treatments]


Randomized Block Design

An experiment that incorporates blocking by dividing the experimental units into homogeneous blocks and then randomly assigning the individuals within each block to treatments is called a randomized block experiment.

[Diagram: Experimental units → Create blocks → within Block 1 and within Block 2: Random assignment → Treatment A → Measure response for A; Treatment B → Measure response for B → Compare treatments for that block → Compare the results from the 2 blocks]


Can moving their hands help children learn math?

An experiment was conducted to compare two different methods for teaching children how to solve math problems of the form 3 + 2 + 8 = ___ + 8. One method involved having students point to the 3 + 2 on the left side of the equal sign with one hand and then point to the blank on the right side of the equal sign before filling in the blank to complete the equation. The other method did not involve using these hand gestures.

To compare the two methods, 128 children, ages 9 and 10, were randomly assigned to the two experimental conditions.

This is an example of what type of experimental design?
Completely Randomized Design


A Diagram of the math experiment:

This is a completely randomized experiment. The 128 children are randomly assigned into the two treatment groups.

[Diagram: 128 children → Random assignment → Math with hand gestures → Measure number correct on math test; Math without hand gestures → Measure number correct on math test → Compare number correct for those who used hand gestures and those who did not]


Can moving their hands help children learn math?

Suppose that you were worried that gender might also be related to performance on the math test. One possibility would be to use direct control of gender – use only boys or only girls. But then any conclusions can ONLY be generalized to the group that was used.

Another strategy is to incorporate blocking into the design. The researchers could create two blocks, one consisting of girls and one consisting of boys. Then, once the blocks are formed, randomly assign the girls to the two treatments and randomly assign the boys to the two treatments.

This is an example of a Randomized Block Design.


A Diagram of the math experiment:

The 128 students are blocked by gender. Then the students within each block are randomly assigned to the two treatments.

[Diagram: 128 children → Create blocks (47 boys, 81 girls) → within each block: Random assignment → Math with hand gestures → Measure number correct; Math with no hand gestures → Measure number correct → Compare number correct with hand gestures and without hand gestures for that block (boys, girls) → Compare number correct with hand gestures and without hand gestures]
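The blocked version of the math experiment can be sketched in Python. The block sizes (47 boys, 81 girls) come from the diagram; the child labels, the function name `assign_within_blocks`, and the choice to alternate treatments after shuffling (so group sizes within a block differ by at most one) are assumptions made for this illustration.

```python
import random
from collections import Counter

def assign_within_blocks(blocks, treatments, seed=None):
    """Randomized block assignment.

    Within each block, shuffle the units and then deal them out to the
    treatments in turn, so every treatment is tried in every block.
    Returns a dict: unit -> (block name, treatment).
    """
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = units[:]
        rng.shuffle(shuffled)
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block_name, treatments[i % len(treatments)])
    return assignment

# Blocks from the slide: 47 boys and 81 girls (labels are hypothetical).
blocks = {
    "boys": [f"boy_{i}" for i in range(1, 48)],
    "girls": [f"girl_{i}" for i in range(1, 82)],
}
treatments = ["hand gestures", "no hand gestures"]

assignment = assign_within_blocks(blocks, treatments, seed=5)
group_sizes = Counter(assignment.values())
for (block_name, treatment), size in sorted(group_sizes.items()):
    print(block_name, "|", treatment, "|", size)
```

Comparisons are then made within each block first (boys with boys, girls with girls), so gender cannot be confounded with the teaching method.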


Revisit the room temperature experiment . . .

In the room temperature experiment, we have only 2 treatment groups, 65° and 75°. We do NOT have a control group.

Control group – an experimental group that does NOT receive any treatment.

A control group
• allows the experimenter to assess how the response variable behaves when the treatment is not used.
• provides a baseline against which the treatment groups can be compared to determine whether the treatment had an effect.

In experiments that use human subjects, the use of a placebo may be necessary.

A placebo is something that is IDENTICAL to the treatment received, except it contains NO active ingredients.


Consider Anna, a waitress. She decides to perform an experiment to determine if writing “Thank you” on the receipt increases her tip percentage.

She plans on having two groups. For one group she will write “Thank you” on the receipt, and for the other group she will not.

Which of these is the control group?


Room temperature experiment continued . . .

Would the students in each section of calculus know to which treatment group, 65° or 75°, they were assigned?

If the students knew about the experiment, they would probably know which treatment group they were in.

So this experiment is probably NOT blinded.

An experiment in which the subjects do not know which treatment they were in is called a single-blind experiment.

A double-blind experiment is one in which neither the subjects nor the individuals who measure the response know which treatment is received.


Using Volunteers as Subjects in an Experiment

Remember – Using volunteers in observational studies is never a good idea!

However – It is common practice to use volunteers as subjects in an experiment.
• Random assignment of the volunteers to treatments allows for cause-and-effect conclusions
• But it limits the ability to generalize to the population


The ONLY way to show a cause-effect relationship is with a well-designed, well-controlled experiment!!!


Chilling newborns?

Researchers for the National Institute of Child Health and Human Development studied 208 infants whose brains were temporarily deprived of oxygen as a result of complications at birth (The New England Journal of Medicine, October 13, 2005). These babies were subjects in an experiment to determine if reducing body temperature for three days after birth improved their chances of surviving without brain damage. The experiment was summarized in a paper that stated “infants were randomly assigned to usual care (control group) or whole-body cooling.” The researchers recorded whether the infant survived (yes/no) and if the infant had brain damage (yes/no).

Let’s examine this experiment.

Page 56: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

Question: What question is the experiment trying to answer?
Response: Does chilling newborns whose brains were temporarily deprived of oxygen at birth improve the chance of surviving without brain damage?

Question: What are the experimental conditions (treatments) for the experiment?
Response: The experiment compared two experimental conditions: usual care and whole-body cooling.

Question: What is the response?
Response: This experiment used two response variables: survival and brain damage.

Question: What are the experimental units and how were they selected?
Response: The experimental units were the 208 newborns. The description of the experiment does not say how the newborns were selected, but it is unlikely they were a random sample.

Page 57: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

Table Continued . . .

Question: Does the design incorporate random assignment of experimental units to the different experimental conditions? If not, are there potentially confounding variables that would limit the use of these data?
Response: Yes, the babies were randomly assigned to one of the two experimental conditions.

Question: Does the experiment incorporate a control group and/or a placebo group? If not, would the experiment be improved by including them?
Response: In this experiment, the usual care group serves as a control group.

Question: Does the experiment involve blinding? If not, would the experiment be improved by making it single- or double-blind?
Response: This experiment did not involve blinding. It might have been possible to blind the person who makes the assessment of whether or not there was brain damage, but this probably isn’t necessary in this experiment.

Because this is a well-designed experiment, researchers were able to use the resulting data to conclude that cooling did reduce the risk of death and disability of infants deprived of oxygen at birth.

Page 58: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

In this study, the investigators followed 806 kids ages 2 to 4 and 704 kids ages 5 to 9 for 4 years. IQ was measured at the beginning of the study and again 4 years later. The researchers found that at the end of the study, the average IQ of the younger kids who were not spanked was 5 points higher than that of kids who were spanked. For the older group, the average IQ of kids who were not spanked was 2.8 points higher.

Spanking Lowers a Child’s IQ (Los Angeles Times)

Spanking can lower IQ (NBC4i, Columbus, Ohio)

Smacking hits kids’ IQ (newscientist.com)

These headlines all imply that spanking was the cause of the observed difference in IQ. Is this conclusion reasonable?

Page 59: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

Drawing Conclusions from Statistical Studies

Type of Conclusion: Generalize from sample to population
Reasonable When: Random selection is used to obtain the sample

Type of Conclusion: Change in response is caused by experimental conditions (cause-and-effect conclusion)
Reasonable When: There is random assignment of experimental units to experimental conditions

This table implies the following:

• For observational studies, it is not possible to reach cause-and-effect conclusions, but it is possible to generalize from the sample to the population of interest if the study design incorporated random selection.

• For experiments, it is possible to reach cause-and-effect conclusions if the study design incorporated random assignment to experimental conditions.

• If an experiment incorporates both random assignment to experimental conditions and random selection of experimental units from some population, it is possible to reach cause-and-effect conclusions and to generalize to the population from which the experimental units were selected.
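The rules above can be written as a small decision function. This is only a sketch of the table's logic: the function name `reasonable_conclusions` and its two boolean parameters are hypothetical, chosen for illustration.

```python
def reasonable_conclusions(random_selection, random_assignment):
    """Return the types of conclusion a study design supports.

    random_selection:  the sample was randomly selected from the population
    random_assignment: experimental units were randomly assigned to
                       experimental conditions (experiments only)
    """
    conclusions = []
    if random_selection:
        conclusions.append("generalize from sample to population")
    if random_assignment:
        conclusions.append("cause-and-effect")
    return conclusions


# Experiment with random assignment whose subjects were not randomly
# selected (for example, volunteers).
print(reasonable_conclusions(random_selection=False, random_assignment=True))

# Observational study based on a random sample.
print(reasonable_conclusions(random_selection=True, random_assignment=False))

# Observational study with no random selection: neither conclusion is supported.
print(reasonable_conclusions(random_selection=False, random_assignment=False))
```

The two checks are independent, which mirrors the table: random selection governs generalizing, and random assignment governs cause-and-effect conclusions.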

Page 60: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

In this study, the investigators followed 806 kids ages 2 to 4 and 704 kids ages 5 to 9 for 4 years. IQ was measured at the beginning of the study and again 4 years later. The researchers found that at the end of the study, the average IQ of the younger kids who were not spanked was 5 points higher than that of kids who were spanked. For the older group, the average IQ of kids who were not spanked was 2.8 points higher.

Spanking Lowers a Child’s IQ (Los Angeles Times)

Spanking can lower IQ (NBC4i, Columbus, Ohio)

Smacking hits kids’ IQ (newscientist.com)

These conclusions are not reasonable since this study has no random assignment of experimental units to experimental conditions.
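A small simulation can show why the lack of random assignment matters. Everything in the sketch below is hypothetical and is not taken from the study: it invents an unnamed lurking variable that makes spanking more likely and also lowers IQ by 5 points, while spanking itself has no effect on IQ at all. The probabilities 0.7 and 0.3 and the IQ distribution are assumptions chosen only for illustration.

```python
import random


def simulated_iq_gap(n=20000, seed=0):
    """Mean IQ of non-spanked kids minus mean IQ of spanked kids
    in a simulated observational study where spanking has NO effect."""
    rng = random.Random(seed)
    spanked_iq, not_spanked_iq = [], []

    for _ in range(n):
        # Hypothetical lurking (confounding) variable, present in half the kids.
        lurking = rng.random() < 0.5
        # The lurking variable makes spanking more likely ...
        spanked = rng.random() < (0.7 if lurking else 0.3)
        # ... and lowers IQ on its own. Spanking never enters this line.
        iq = rng.gauss(100, 15) - (5 if lurking else 0)
        (spanked_iq if spanked else not_spanked_iq).append(iq)

    def mean(values):
        return sum(values) / len(values)

    return mean(not_spanked_iq) - mean(spanked_iq)


gap = simulated_iq_gap()
print(f"Non-spanked minus spanked mean IQ: {gap:.1f} points")
```

Under these assumptions the non-spanked group comes out with the higher average IQ, even though spanking has no effect in the simulation. An observed difference in an observational study cannot, by itself, separate the effect of spanking from the effect of whatever else differs between the groups.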

Revisit . . .

Page 61: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

Avoid These Common Mistakes

Page 62: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

Avoid These Common Mistakes

1. Drawing a cause-and-effect conclusion from an observational study.

Don’t do this!

Don’t believe it when others do!

Page 63: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

Avoid These Common Mistakes

2. Generalizing results of an experiment that uses volunteers as subjects.

Only do this if it can be convincingly argued that the group of volunteers is representative of the population of interest.

Page 64: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

Avoid These Common Mistakes

3. Generalizing conclusions based on data from a poorly designed observational study.

Generalizing from a sample to a population is justified only when the sample is representative of the population. This would be the case if the sample was a random sample from the population, and there were no major potential sources of bias.

Page 65: Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

Avoid These Common Mistakes

4. Generalizing to a larger population from an observational study that used voluntary response or convenience sampling.

This is almost never reasonable!