Unit 2: Comparing Two Groups In Unit 1, we learned the basic process of statistical inference using tests and confidence intervals. We did all this by

Unit 2: Comparing Two Groups In Unit 1, we learned the basic process of statistical inference using tests and confidence intervals. We did all this by focusing on a single proportion. In Unit 2, we will take these ideas and extend them to comparing two groups. We will compare two proportions, two independent means, and paired data.

Chapter 5: Comparing Two Proportions 5.1: Descriptive (Two-Way Tables) 5.2: Inference with Simulation-Based Methods 5.3: Inference with Theory-Based Methods

Positive and Negative Perceptions Consider these two questions: Are you having a good year? Are you having a bad year? Do people answer each question in such a way that would indicated the same answer? (e.g. Yes for the first one and No for the second.)

Researchers questioned 30 students (randomly giving them one of the two questions). They then recorded if a positive or negative response was given. Is this an observational study or randomized experiment? Positive and Negative Perceptions

Observational units The 30 students Variables Question wording (good year or bad year) Perception of their year (positive or negative) Which is the explanatory and which is the response? Positive and Negative Perceptions

Individual Type of Question ResponseIndividual Type of Question Response 1Good YearPositive16Good YearPositive 2Good YearNegative17Bad YearPositive 3Bad YearPositive18Good YearPositive 4Good YearPositive19Good YearPositive 5Good YearNegative20Good YearPositive 6Bad YearPositive21Bad YearNegative 7Good YearPositive22Good YearPositive 8Good YearPositive23Bad YearNegative 9Good YearPositive24Good YearPositive 10Bad YearNegative25Bad YearNegative 11Good YearNegative26Good YearPositive 12Bad YearNegative27Bad YearNegative 13Good YearPositive28Good YearPositive 14Bad YearNegative29Bad YearPositive 15Good YearPositive30Bad YearNegative Raw Data in a Spreadsheet

A two-way table organizes data Summarizes two categorical variables Also called contingency table Are students more likely to give a positive response if they were given the good year question? Two-Way Tables Good YearBad YearTotal Positive response 15419 Negative response 3811 Total 181230

Conditional proportions will help us better determine if there is an association between the question asked and the type of response. We can see that those given the positive question were more likely to respond positively. Two-Way Tables Good YearBad YearTotal Positive response 15/18 0.834/12 0.3319 Negative response 3811 Total 181230

We can use segmented bar graphs to see this association. Remember that variables are associated if the conditional proportion of the outcomes for one group differ from the conditional proportion of outcomes in other groups. Segmented Bar Graphs

Those responding to the good year question were more likely to answer positively (83% to 33%) than those responding positively to the bad year question. The statistic we will be using to measure this is the difference in proportions. 0.83 - 0.33 = 0.50 higher for the good year question than the bad year question. Perceptions

In the next section we will conduct tests of significance to compare two proportions and I want to give you a preview of that here. We will assume there is no association between the variables (i.e. the two population proportions are the same) and decide if two sample proportions differ enough to conclude this would be very unlikely just by random chance. A Sneak Peak at Comparing Two Proportions: Simulation-Based Approach

Hypotheses Null Hypothesis: There is no association between which question is asked and the type of response. (The proportion of positive responses will be the same in each group. ) Alternative Hypothesis: There is an association between which question is asked and the type of response. (The proportion of positive responses will be different in each group. )

Results Good YearBad YearTotal Positive response 15 (83%)4 (33%)19 Negative response 3811 Total 181230 The difference in proportions of positive responses is 0.83 0.33 = 0.50. How likely is a difference this great or greater if the type of question asked made no difference in how the student would respond?

Random Reassignment Notice that 19 students gave a positive response. If the null hypothesis is true, these 19 would have given a positive response no matter which question was asked. Therefore, under a true null hypothesis, we can randomly place these 19 people into either group and they will still give a positive response. This replicates the random assignment that was done in the experiment. We will also keep constant the 18 that receive the positive question and 12 that receive the negative question. Good YearBad YearTotal Positive response random 19 Negative response random 11 Total181230

You can think about this random reassignment with the raw data as well. It doesnt matter which question was asked, the responses will be the same. Therefore, we can shuffle the type of question and leave the responses fixed. This is equivalent to keeping the same column and row totals and just shuffling the inside of the two-way table as described earlier. IndividualType of QuestionResponseIndividualType of QuestionResponse 1Good YearPositive16Good YearPositive 2Good YearNegative17Bad YearPositive 3Bad YearPositive18Good YearPositive 4Good YearPositive19Good YearPositive 5Good YearNegative20Good YearPositive 6Bad YearPositive21Bad YearNegative 7Good YearPositive22Good YearPositive 8Good YearPositive23Bad YearNegative 9Good YearPositive24Good YearPositive 10Bad YearNegative25Bad YearNegative 11Good YearNegative26Good YearPositive 12Bad YearNegative27Bad YearNegative 13Good YearPositive28Good YearPositive 14Bad YearNegative29Bad YearPositive 15Good YearPositive30Bad YearNegative

Random Reassignment I did this once and found a difference in the proportions of positive responses for the two questions of 0.50 0.83 = 0.33 Good YearBad YearTotal Positive response 9 (50%)10 (83%) 19 Negative response 92 11 Total 181230

Random Reassignment I did this again and found a difference in the proportions of positive responses for the two questions of 0.61 0.67 = 0.06 Good YearBad YearTotal Positive response 11 (61%)8 (67%) 19 Negative response 74 11 Total 181230

Random Reassignment I did this again and found a difference in the proportions of positive responses for the two questions of 0.67 0.58 = 0.09 Good YearBad YearTotal Positive response 12 (67%)7 (58%) 19 Negative response 65 11 Total 181230

Random Reassignment In my three randomizations, I have yet to see a difference in proportions that is as far away from zero as the observed difference of 0.5. Lets do some more randomizations to develop a null distribution.

Random Reassignment After 1000 randomizations, only 7 were as far away from zero as our observed proportion.

Conclusion Since we have a p-value of 7/1000 or 0.007, we can conclude the alternative hypothesis and say we have strong evidence that how the question is phrased affects the response.

Applets Lets look at how this is done in two applets Simulation for Two Proportions Simulation for Multiple Proportions

Exploration 5.1: Murderous Nurse?

Example 5.2: Swimming With Dolphins

Is swimming with dolphins therapeutic for patients suffering from clinical depression? Researchers recruited 30 subjects aged 18- 65 with a clinical diagnosis of mild to moderate depression. Discontinued antidepressants and psychotherapy 4 weeks prior to and throughout the experiment 30 subjects went to an island near Honduras Randomly assigned to two treatment groups Swimming with Dolphins

Both groups engaged in one hour of swimming and snorkeling each day. One group swam in the presence of dolphins and the other group did not. Participants in both groups had identical conditions except for the dolphins After 2 weeks, each subjects level of depression was evaluated, as it had been at the beginning of the study The response variable is if the subject achieved substantial reduction in depression. Swimming with Dolphins

Observational units The 30 subjects with mild to moderate depression. Explanatory variable Swimming with dolphins or not Response variable Reduction in depression or not Are the variables quantitative or categorical? Swimming with Dolphins

Is this study an observational study or an experiment? Are the subjects in this study a random sample from a larger population? Swimming with Dolphins

Results Swimming with Dolphins Dolphin group Control group Total Improved10 (67%)3 (20%)13 Did Not Improve51217 Total15 30

The difference in proportions of improvers is 0.67 0.20 = 0.47. There are two possible explanations for an observed difference of 0.47. A tendency to be more likely to improve with dolphins The 13 subjects were going to show improvement with or without dolphins and random chance assigned more improvers to the dolphins Swimming with Dolphins

Null hypothesis: Dolphins dont help Swimming with dolphins is not associated with substantial improvement in depression Alternative hypothesis: Dolphins help Swimming with dolphins increases the probability of substantial improvement in depression symptoms Swimming with Dolphins

Null Hypothesis: The probability someone exhibits substantial improvement after swimming with dolphins is the same as the probability someone exhibits substantial improvement after swimming without dolphins. Alternative Hypothesis: The probability someone exhibits substantial improvement after swimming with dolphins is higher than the probability someone exhibits substantial improvement after swimming without dolphins. Swimming with Dolphins

If the null hypothesis is true (dolphin therapy is not better) we would have 13 improvers and 17 non-improvers regardless of the group they were in. Any differences we see between groups arise solely from the randomness in the assignment to the groups. Swimming with Dolphins

We can perform this simulation with cards. 13 black cards represent the improvers 17 red cards represent the non-improvers We assume these outcomes would happen no matter which treatment group subjects were in. Shuffle the cards and put 15 in one pile (dolphin therapy) and 15 in another (control group) An improver is equally likely to be assigned to each group Swimming with Dolphins

In the actual study, there were 10 improvers (diff of 0.47) in the dolphin group. We conducted 3 simulations and got 8, 5, and 6 improvers in the dolphin therapy group. (notice the diff in proportions) Swimming with Dolphins

We did 1000 repetitions to develop a null distribution. Why is it centered at about 0? What does each dot represent? Swimming with Dolphins

A 95% confidence interval for the difference in the probability using the standard deviation from the null distribution is 0.467 + 2(0.178) = 0.467 + 0.356 or (0.111to 0.823) We are 95% confident that when allowed to swim with dolphins, the probability of improving is between 0.111 and 0.823 higher than when no dolphins are present. How does this interval back up our conclusion from the test of significance? Swimming with Dolphins

Can we say that the presence of dolphins caused this improvement? Since this was a randomized experiment, and assuming everything was identical between the groups, we have strong evidence that dolphins were the cause Can we generalize to a larger population? Maybe mild to moderately depressed 18-65 year old patients willing to volunteer for this study We have no evidence that random selection was used to find the 30 subjects. Swimming with Dolphins

Exploration 5.2: Contagious Yawns? MythBusters investigated this. 50 subjects were ushered into a small room by co-host Kari. She yawned as she ushered 34 in the room and for 16 she didnt yawn. We will assume she randomly decided who would received the yawns.

Comparing Two Proportions: Theory-Based Approach Section 5.3

Introduction Just as with a single proportion, we can often predict results of a simulation using a theory-based approach. The theory-based approach also gives a simpler way to generate a confidence intervals.

Smoking and Birth Gender

Smoking and Gender How does parents behavior affect the sex of their children? Fukuda et al., 2002 (Japan) found the following: 255 of 565 births (45.1%) where both parents smoked more than a pack a day were boys. 1975 of 3602 births (54.8%) where both parents did not smoke were boys. Other studies have shown a reduced male to female birth ratio where high concentrations of other environmental chemicals are present (e.g. industrial pollution, pesticides)

Smoking and Gender A segmented bar graph and 2-way table Lets compare the proportions to see if the difference is statistically significantly.

Smoking and Gender

What are the observational units in the study? What are the variables in this study? Which variable should be considered the explanatory variable and which the response variable? Can you draw cause-and-effect conclusions for this study?

Smoking and Gender OK to shuffle? In the last section we re-randomized subjects to treatment groups to simulate the null distribution. In this study the parents werent randomized to the treatment, since its observational, but we can still represent the null hypothesis of no association through randomization.

Smoking and Gender Use the 3S Strategy to asses the strength 1. Statistic: The proportion of boys born to nonsmokers minus boys born to smokers is 0.548 0.451 = 0.097.

Smoking and Gender 2. Simulate: Use the Multiple Proportions applet to simulate Many repetitions of shuffling the 2230 boys and 1937 girls to the 565 smoking and 3602 nonsmoking parents Calculate the difference in proportions of boys between the groups for each repetition. Shuffling simulates the null hypothesis of no association

Smoking and Gender 3. Strength of evidence: Nothing as extreme as our observed statistic ( 0.097 or 0.097) occurred in 5000 repetitions, How many SDs is 0.097 above the mean?

Smoking and Gender Notice the null distribution is centered at zero and is bell-shaped. This, along with its standard deviation can be predicted using normal distributions.

Smoking and Gender We can use either the Multiple Proportion applet or the Theory-Based Inference applet to find the p-value

Smoking and Gender

From our test of significance, do we expect 0 to be in the interval of plausible values for the difference in the population proportions?

Smoking and Gender Again, either applet can be used to determine a confidence interval. We are 95% confident that the probability of a boy baby is 0.053 to 0.141 higher for families where neither parent smokes compared to families with two smoking parents

Smoking and Gender We can also write the confidence interval in the form: statistic margin of error. Our statistic is the observed sample difference in proportions, 0.097. We can find the margin of error by subtracting the statistic (center) from the upper endpoint or 0.141 0.097 = 0.044. 0.097 0.044 Is the margin of error about the standard deviation?

Smoking and Gender How would the interval change if the confidence level was 99%?

Smoking and Gender Written as the statistic margin of error 0.097 0.058. Margin of error 0.058 for the 99% confidence interval 0.044 for the 95% confidence interval

Smoking and Gender

( 0.141, 0.053) or 0.097 0.044 instead of (0.053, 0.141) or 0.097 0.044 The negative signs indicate the probability of a boy born to smoking parents is lower than that for nonsmoking parents.

Smoking and Gender Validity Conditions of Theory-Based Same as with a single proportion. Should have at least 10 observations in each of the cells of the 2 x 2 table. Smoking Parents Non- smoking Parents Total Male 25519752230 Female 31016271937 Total 56536024167

Smoking and Gender The strong significant result in this study yielded quite a bit of press when it came out Soon other studies came out which found no relationship between smoking and gender One article argued that confounding variables like social factors, diet, environmental exposure or stress were the reason for different studys results. (These are all possible since it was an observational study.)

Formulas

Strength of Evidence As the proportions move farther away from each other, the strength of evidence increases. As sample size increases, the strength of evidence increases.

Lets run this previous test using both the Simulation-Based and the Theory- Based Applets. Donating Blood Exploration 5.3 Questions 1-14 (skip 2 and 5)

Documents

Unit 2: Comparing Two Groups In Unit 1, we learned the basic process of statistical inference using tests and confidence intervals. We did all this by