
Simpson’s paradox

An association or comparison that holds for all of several groups can reverse direction when the data are combined (aggregated) to form a single group. This reversal is called Simpson’s paradox.

Example: Hospital death rates

              Hospital A   Hospital B
Died                  63           16
Survived            2037          784
Total               2100          800
% surv.            97.0%        98.0%

On the surface, Hospital B would seem to have a better record.

Patient condition is the lurking variable. Now see Ex. 2.40-2.41, p.144…

Patients in good condition
              Hospital A   Hospital B
Died                   6            8
Survived             594          592
Total                600          600
% surv.            99.0%        98.7%

Patients in poor condition
              Hospital A   Hospital B
Died                  57            8
Survived            1443          192
Total               1500          200
% surv.            96.2%        96.0%

But once patient condition is taken into account, we see that Hospital A in fact has a better record for both patient conditions (good and poor).
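The reversal can be checked directly from the counts in the hospital tables. The sketch below is only an illustration; the names `counts` and `survival_rate` are made up for this example. It computes the percent surviving within each patient condition and again after the two conditions are combined.

```python
# Counts copied from the hospital tables: (died, survived).
counts = {
    "good": {"A": (6, 594), "B": (8, 592)},
    "poor": {"A": (57, 1443), "B": (8, 192)},
}

def survival_rate(died, survived):
    """Percent of patients who survived."""
    return 100 * survived / (died + survived)

# Within each patient condition, compare the two hospitals.
by_condition = {
    cond: {h: survival_rate(*ds) for h, ds in hospitals.items()}
    for cond, hospitals in counts.items()
}

# Aggregate over patient condition, then compare again.
combined = {}
for h in ("A", "B"):
    died = sum(counts[c][h][0] for c in counts)
    survived = sum(counts[c][h][1] for c in counts)
    combined[h] = survival_rate(died, survived)

for cond, rates in by_condition.items():
    print(cond, {h: round(r, 1) for h, r in rates.items()})
print("combined", {h: round(r, 1) for h, r in combined.items()})
```

Hospital A has the higher rate in both the "good" and "poor" rows, yet the lower rate in the "combined" row. The reason is that A treats far more patients in poor condition (1500 of 2100) than B does (200 of 800).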


Does Association Imply Causation?

• Sometimes, but not always! Look at Example 2.42 on page 149 (Section 2.6, Explaining Causation) for several x, y variables where an association was found: some are causal, others are not.

• The figure below (Fig. 2.29) gives three possible scenarios explaining a found association between a response variable y and an explanatory variable x:


• Association between x and y can certainly be because changes in x cause y to change - but even when causation is present, there are still other variables possibly involved in the relationship. (See #1 in Ex. 2.42)

• Be careful of applying a causal relationship between x and y in one setting to a different setting: (#2 shows a causal relationship in rats - does it extend to humans?)

• Common response is an example of how a "lurking variable" can influence both x and y, creating the association between them (See #3)

• Confounding between two variables arises when their effects on the response cannot be distinguished from each other - the confounding variables can either be explanatory or lurking… (See #5)
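Common response can be illustrated with a small simulation. This is a sketch with made-up data, not something taken from the text: a hypothetical lurking variable `z` drives both `x` and `y`, and `x` has no effect on `y` at all. The two still come out associated, and the association disappears once the contribution of `z` is removed.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

n = 2000
z = [random.gauss(0, 1) for _ in range(n)]   # lurking variable
x = [zi + random.gauss(0, 1) for zi in z]    # x responds to z only
y = [zi + random.gauss(0, 1) for zi in z]    # y responds to z only, never to x

r_xy = pearson_r(x, y)

# Subtract z's contribution from both variables and correlate what is left.
r_without_z = pearson_r(
    [xi - zi for xi, zi in zip(x, z)],
    [yi - zi for yi, zi in zip(y, z)],
)

print(f"r(x, y) = {r_xy:.2f}")
print(f"r(x, y) with z removed = {r_without_z:.2f}")
```

In real observational data the lurking variable is usually not measured, so it cannot simply be subtracted out like this.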


Establishing causation

It appears that lung cancer is associated with smoking.

How do we know that both of these variables are not being affected by an unobserved third (lurking) variable?

For instance, what if there is a genetic predisposition that causes people to both get lung cancer and become addicted to smoking, but the smoking itself doesn’t CAUSE lung cancer?

We can evaluate the association using the following criteria:

1) The association is strong.

2) The association is consistent.

3) Higher doses are associated with stronger responses.

4) Alleged cause precedes the effect.

5) The alleged cause is plausible.

HW: read 2.6, go over all the examples in the section (esp. 2.43, 2.44) and then look at # 2.133-2.145


Obtaining data

• Available data are data that were produced in the past for some other purpose but that may help answer a present question inexpensively. The library and the Internet are sources of available data.

– Government statistical offices are the primary source for demographic, economic, and social data (visit the Fed-Stats site at www.fedstats.gov).

• Beware of drawing conclusions from our own experience or hearsay. Anecdotal evidence is based on haphazardly selected individual cases, which we tend to remember because they are unusual in some way. They also may not be representative of any larger group of cases.

• Some questions require data produced specifically to answer them. This leads to designing observational or experimental studies.


Observational study: Record data on individuals without attempting to influence the responses. We typically cannot prove cause & effect this way.

Example: Based on observations you make in nature, you suspect that female crickets choose their mates on the basis of their health. Observe health of male crickets that mated.

Experimental study: Deliberately impose a treatment on individuals and record their responses. Lurking variables can be controlled.

Example: Deliberately infect some males with intestinal parasites and see whether females tend to choose healthy rather than ill males.


– a sample is a collection of data drawn from a population, intended to represent the population from which it was drawn

– a census is an attempt to sample every individual in the population

– an experiment imposes a so-called treatment on individuals in order to observe their responses. This is in contrast to an observational study, which simply observes individuals and measures variables of interest without intervention

– go over Examples 3.4-3.5 in Ch. 3, Sample Surveys & Experiments


Terminology of experiments

• The individuals in an experiment are the experimental units. If they are human, we call them subjects.

• In an experiment, we do something to the subject and measure the response. The “something” we do is called a treatment; the explanatory variable being manipulated is called a factor, and the values of the factor are called its levels. Sometimes a treatment is a combination of levels of more than one factor.

– The factor may be the administration of a drug – the different dosages are its levels.

– One group of people may be placed on a diet/exercise program for six months (treatment), and their blood pressure (response variable) would be compared with that of people who did not diet or exercise. Two levels here: on diet, not on diet


• Go over example 3.8 (Section 3.1) and below – an example of a designed experiment with two factors and six treatments. Also see Ex. 3.9 (Section 3.1) for an example of an experiment not designed well... The lack of a control group causes the problem...


• If the experiment involves giving two different doses of a drug, we say that we are testing two levels of the factor.

• A response to a treatment is statistically significant if it is larger than you would expect by chance (due to random variation among the subjects). We will learn how to determine this later.

In a study of sickle cell anemia, 150 patients were given the drug hydroxyurea, and 150 were given a placebo (dummy pill). The researchers counted the episodes of pain in each subject. Identify:

• The subjects (patients, all 300)

• The factors / treatments (1 factor, 2 levels: hydroxyurea and placebo)

• And the response variable (episodes of pain)


• In principle, experiments can give good evidence for causation through what we call randomized controlled comparative experiments.

• The need for comparative experiments is shown in Example 3.9 – a control group is needed so the experimenter can control the effects of outside (lurking) variables

• The use of randomization is illustrated in Example 3.10 – a chance mechanism is used to divide the experimental units into groups to prevent bias.



• The logic behind randomized comparative experiments is given on p. 175:

– Randomization produces groups of subjects that should be similar in all respects before the treatments are applied

– Comparative design ensures that influences other than the treatment operate equally on all groups

– Therefore, differences in the response must be due either to the treatment or to chance in the random assignment of subjects to the groups.

• This leads to three basic principles of experimental design in the box on the same page…


• Control the effects of lurking variables on the response, usually by comparing two or more treatments

• Randomize – use a chance mechanism to assign experimental units to treatments. See the Table B of random digits discussed on the later slides…

• Repeat each treatment on many units to reduce chance variation in the results

• Then if you see differences in the response they are called statistically significant if they would rarely occur by chance
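The claim that randomization yields groups that are similar before treatment can be explored with a short simulation. The sketch below uses invented data: `ages` stands in for a hypothetical lurking variable, and `randomize` is a helper written for this example. It splits 40 subjects at random into two groups of 20 many times and records the gap in mean age between the groups.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical lurking variable: the age of each of 40 subjects.
ages = [random.randint(18, 65) for _ in range(40)]

def randomize(n_units, group_size):
    """Randomly split unit indices 0..n_units-1 into two groups."""
    units = list(range(n_units))
    random.shuffle(units)  # the chance mechanism
    return units[:group_size], units[group_size:]

def mean(values):
    return sum(values) / len(values)

# Repeat the random assignment many times and record the gap in mean age.
gaps = []
for _ in range(1000):
    treatment, control = randomize(len(ages), 20)
    gap = mean([ages[i] for i in treatment]) - mean([ages[i] for i in control])
    gaps.append(gap)

average_gap = mean(gaps)
print(f"average gap in mean age over 1000 randomizations: {average_gap:.2f}")
print(f"largest single gap: {max(abs(g) for g in gaps):.2f}")
```

Any single assignment can still produce a noticeable gap by chance, which is why the third principle calls for repeating each treatment on many units. On average, though, random assignment favors neither group.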


Caution about experimentation

The design of a study is biased if it systematically favors certain outcomes.

The best way to exclude biases in an experiment is to randomize the design. Both the individuals and treatments are assigned randomly.


Other ways to remove bias:

A double-blind experiment is one in which neither the subjects nor the experimenter knows which individuals got which treatment until the experiment is completed. The goal is to avoid forms of placebo effects and biases in interpretation.

The best way to make sure your conclusions are robust is to replicate your experiment: do it over. Replication helps ensure that particular results are not due to uncontrolled factors or errors of manipulation.


Designing “controlled” experiments

Sir Ronald Fisher, the “father of statistics”: He was sent to Rothamsted Agricultural Station in the United Kingdom to evaluate the success of various fertilizer treatments.

• Fisher found the data from experiments going on for decades to be basically worthless because of poor experimental design.

– Fertilizer had been applied to a field one year and not in another in order to compare the yield of grain produced in the two years. BUT

• It may have rained more, or been sunnier, in different years.

• The seeds used may have differed between years as well.

– Or fertilizer was applied to one field and not to a nearby field in the same year. BUT

• The fields might have different soil, water, drainage, and history of previous use.

Too many factors affecting the results were “uncontrolled.”


Fisher’s solution:

• In the same field and same year, apply fertilizer to randomly spaced plots within the field. Analyze plants from similarly treated plots together.

• This minimizes the effect of variation within the field in drainage and soil composition on yield, as well as controlling for weather.

[Figure: a field divided into plots, with the fertilized plots (F) scattered at random positions]

“Randomized comparative experiments”


A Table of Random Digits can be used to Randomize an Experiment

• any digit in any position in the table is equally likely to be any of 0, 1, 2, …, 9

• the digits in different positions are independent in the sense that the value of one has no influence on the value of any other

• any pair of random digits has the same chance of being picked as any other (00, 01, 02, … 99)

• any triple of random digits has the same chance of being picked as any other (000, 001, … 999)

• and so on…


• Now use Table B to randomly divide the 40 students in Ex. 3.10 into the two groups (phone 1 and phone 2 groups)

– Step 1: Label the experimental units with as few digits as possible

– Step 2: Decide on a protocol for how you will place the chosen units into the groups

– Step 3: Start anywhere in the Table and begin reading random digits. Matching them with labeled experimental units and following the protocol creates the groups.


Ex. 3.10: We need to randomly divide the 40 students into two groups of 20: those using the first cell phone and those using the second cell phone.

1. List and number (label) all available subjects (the group of 40).

2. Decide that the first 20 students chosen go to the phone 1 group; the remainder go to the phone 2 group (this is the protocol).

3. Scan Table B in groups of numbers that are two digits long. Match the digits with the labels and follow the protocol to form the groups.

45 46 71 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56
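The scanning step can be sketched in code using the row of digits shown above. Two assumptions are made here that the slide does not spell out: the students are labeled 01 through 40, and a label that comes up a second time is skipped. The function name `select_labels` is made up for this example.

```python
# The row of random digits shown above, already split into two-digit groups.
row = "45 46 71 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56"

def select_labels(pairs, n_units, group_size):
    """Read two-digit groups in order, keeping labels 01..n_units and skipping repeats."""
    chosen = []
    for pair in pairs:
        label = int(pair)
        # Assumption: labels run from 01 to n_units, so 00 and 41-99 are ignored.
        if 1 <= label <= n_units and label not in chosen:
            chosen.append(label)
        if len(chosen) == group_size:
            break
    return chosen

phone1_so_far = select_labels(row.split(), n_units=40, group_size=20)
print(phone1_so_far)  # [17, 9, 32, 22]
```

This one row supplies only four of the 20 students needed for the phone 1 group, so in practice you keep reading further rows of Table B until 20 distinct labels have been chosen. The remaining 20 students form the phone 2 group.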


• There are many types of experimental designs in use today in the sciences…read about these at the end of Section 3.1:

– Completely randomized: all experimental units are allocated at random among all treatments (Ex. 3.10)

– Block designs: A block is a group of experimental units or subjects known in advance to be similar in some way that is expected to affect the response to the treatments. Knowing this, the experimenter can create a block design, in which the random assignment of units is carried out separately within each block. See Examples 3.17-3.19.

– Matched pairs: This is a common design in which a block design is used to compare just two treatments. Sometimes each subject receives both treatments (acts as its own control), or there is a “before-after” design - see Example 3.16


Completely randomized designs

Completely randomized experimental designs: Individuals are randomly assigned to groups, then the groups are randomly assigned to treatments.


Block designs

In a block, or stratified, design, subjects are divided into groups, or blocks, prior to the experiment to test hypotheses about differences between the groups.

The blocking, or stratification, here is by gender.
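Random assignment carried out separately within each block can be sketched as follows. Everything in the example is hypothetical: the subject labels, the two treatment names, and the helper `block_randomize`. The blocking variable is gender, as on the slide.

```python
import random
from collections import Counter

def block_randomize(block_of, treatments, rng):
    """Randomly assign treatments separately within each block.

    block_of maps each subject to its block label.
    """
    blocks = {}
    for subject, block in block_of.items():
        blocks.setdefault(block, []).append(subject)

    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)  # the chance mechanism, applied inside one block
        for i, subject in enumerate(shuffled):
            # Deal treatments out in turn so each gets an equal share of the block.
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects: six women (W1..W6) and six men (M1..M6).
block_of = {f"W{i}": "women" for i in range(1, 7)}
block_of.update({f"M{i}": "men" for i in range(1, 7)})

assignment = block_randomize(block_of, ["treatment 1", "treatment 2"], random.Random(3))

# Count how many subjects in each block received each treatment.
tally = Counter((block_of[s], t) for s, t in assignment.items())
print(tally)
```

Because the shuffle happens inside each block, every treatment ends up with three women and three men. Gender therefore cannot be confounded with the treatment.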


Matched pairs designs

Matched pairs: Choose pairs of subjects that are closely matched, e.g., same sex, height, weight, age, and race. Within each pair, randomly assign who will receive which treatment.

It is also possible to just use a single person, and give the two treatments to this person over time in random order (“before”/“after”). In this case, the “matched pair” is just the same person at different points in time. Pre/post testing of a new teaching method is another example...

The most closely matched pair studies use identical twins.
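The within-pair randomization can be sketched in a few lines. The pair labels and treatment names below are hypothetical, as is the helper `assign_within_pairs`. For each pair, a random shuffle decides which member receives which of the two treatments.

```python
import random

def assign_within_pairs(pairs, treatments, rng):
    """For each matched pair, randomly decide who receives which treatment."""
    assigned = []
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # equivalent to a coin flip when there are two treatments
        assigned.append({first: order[0], second: order[1]})
    return assigned

# Hypothetical matched pairs, e.g. three sets of identical twins.
pairs = [("twin 1a", "twin 1b"), ("twin 2a", "twin 2b"), ("twin 3a", "twin 3b")]
assigned = assign_within_pairs(pairs, ("treatment 1", "treatment 2"), random.Random(7))

for pair in assigned:
    print(pair)
```

When one person serves as their own “pair”, the same shuffle can be read as the order in which that person receives the two treatments.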


• Read the Introduction & Section 3.1 - pay particular attention to all the Examples. Make sure you understand the terminology and the sketches of the types of designs... Also, make sure you can use Table B to perform a completely randomized design. Also, try to do each of the exercises that occur within the text of that section… then try # 3.17, 3.18, 3.23, 3.27, 3.30, 3.40, 3.44-3.46