AP Statistics 5.1 Designing Samples


Page 1: AP Statistics

AP Statistics 5.1 Designing Samples

Page 2: AP Statistics

Learning Objectives:

Differentiate between an observational study and an experiment

Learn different types of sampling techniques

Use a random digit table to create an SRS

Understand different types of bias

Page 3: AP Statistics

Definitions

designs – arrangements or patterns for producing and collecting data

population – entire group of individuals that we want information about

sample – part of a population that we actually examine in order to gather information

Page 4: AP Statistics

Example: Problem 5.2 p. 248

Answers:

a) An individual is a person; the population is all adult US residents.

b) An individual is a household; the population is all US households.

c) An individual is a voltage regulator; the population is all the regulators in the last shipment.

Page 5: AP Statistics

Some questions that we need to answer as we design a study or experiment:

How many individuals must we collect data from? (sample size?)

How will we select the individuals to be studied?

If (as in many experiments) several groups of individuals are to receive different treatments, how will we form the groups?

Page 6: AP Statistics

Without a systematic design for producing data, we are subject to being misled by incomplete or haphazard data, or by confounding variables.

Page 7: AP Statistics

Observational Study vs. Experiment

Observational study (“passive”):

No attempt to influence responses

Not good for explaining cause/effect because of confounding variables

Collect a representative sample from the population

Experiment (“active”):

Deliberately imposes some treatment on individuals in order to observe the effect on their responses

Can link cause and effect

Assign (volunteer) subjects randomly to groups

Page 8: AP Statistics

Sampling Design for Observational Studies

Goal: To use information obtained from a “representative” sample to make inferences about the population from which the sample was taken; the only alternative is taking a census – not very practical!

We will not, as in an experiment, impose a treatment in order to observe the response, but we want to gather information about a large group of individuals. Design of the sample refers to the method used to choose the sample from the population. Poor sample design leads to misleading conclusions.

Page 9: AP Statistics

“Bad” Sampling Methods

voluntary response sample – consists of people who choose themselves by responding to a general appeal; voluntary response samples tend to be biased (especially towards the negative) because people who have a strong opinion are most likely to respond

Ex: call-in polls, internet quick polls, etc.
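The reasoning behind voluntary response bias can be checked with a little arithmetic. The sketch below is only an illustration: the group names, population shares, and response rates are all made-up assumptions, not data from any real poll. It shows how a group that is a minority of the population can dominate the replies when its members are much more likely to respond.

```python
def respondent_share(groups):
    """Given {group: (population_share, response_rate)}, return each
    group's share among the people who actually respond."""
    # Fraction of the whole population that responds, per group
    responding = {name: share * rate for name, (share, rate) in groups.items()}
    total = sum(responding.values())
    return {name: r / total for name, r in responding.items()}

# Hypothetical numbers, chosen only for illustration:
# 20% of the population feels strongly and 30% of them call in;
# 80% feel mildly (or not at all) and only 2% of them call in.
groups = {
    "strong opinion": (0.20, 0.30),
    "mild or no opinion": (0.80, 0.02),
}

shares = respondent_share(groups)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of respondents")
```

Under these assumed rates, people with strong opinions are one fifth of the population but roughly four fifths of the respondents.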

Page 10: AP Statistics

Example: Problem 5.3 p. 249

Answer: Only people with a strong opinion on the subject – strong enough that they are willing to spend the time and 50¢ – will respond to this advertisement.

Page 11: AP Statistics

convenience sample – “grab” the first n people available – not random!

Ex:

Page 12: AP Statistics

Probability Samples – “Good” Sampling Methods

simple random sample (SRS) – each individual in the population has an equal chance of being included in the sample and each subgroup of size n has an equal chance of being in the sample.

You can select an SRS by labeling all the individuals in the population with a number and then randomly selecting a sample (using a calculator or table of random digits).
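The label-then-select procedure for an SRS can be sketched in Python, using the standard library's random number generator in place of a calculator or random digit table. The population of 28 labels and the sample size of 6 are assumptions chosen for illustration, and `simple_random_sample` is a hypothetical helper name.

```python
import random

def simple_random_sample(labels, n, seed=None):
    """Draw n labels without replacement so that every subset of
    size n is equally likely (the defining property of an SRS)."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(list(labels), n)

# Hypothetical population: individuals labeled 1 through 28
labels = range(1, 29)
chosen = simple_random_sample(labels, 6, seed=1)
print(sorted(chosen))
```

Calling the function without a seed gives a fresh sample each time; fixing the seed is only useful for checking work.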

Page 13: AP Statistics

Example: 5.5 p. 252 (1st ed.)

Answer: I started with 01 and numbered the managers down the columns. I used a random digit table (RDT), picking six 2-digit numbers without repetition (ignoring 00 and numbers greater than 28).

The numbers I selected (in bold on the original slide) are 04, 10, 17, 19, 12, and 13:

Line 139: 55 58 89 94 04 70 70 84 10 98 43 56 35 69 34 48 39 45 17 19

Line 140: 12 97 51 32 58 13

Thus, the six managers chosen to be interviewed are: 04-Bonds, 10-Fleming, 17-Liao, 19-Naber, 12-Goel, and 13-Gomez.

Page 14: AP Statistics

Example: 5.19 p. 262

Answer: I numbered the bottles across the rows from 01 to 25. I used a random digit table (RDT), picking three 2-digit numbers without repetition (ignoring 00 and numbers greater than 25).

The chosen numbers (in bold on the original slide) are 12, 04, and 11; the second 04 is skipped as a repeat:

Line 111: 81 48 66 94 87 60 51 30 92 97 00 41 27 12 38 27 64 93 99 50

Line 112: 59 63 68 88 04 04 63 47 11

Thus, the three bottles chosen to be tested are 12-B0986, 04-A1101, and 11-A2220.
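The table-reading rule used in the two answers above (read two-digit groups in order, skip 00, skip labels above the largest label, skip repeats) can be written as a short Python sketch. The function name `pick_from_digit_table` is hypothetical; the digit strings are the table lines quoted on the slides.

```python
def pick_from_digit_table(digits, n_labels, sample_size):
    """Read two-digit groups left to right and keep the first
    `sample_size` distinct labels in the range 01..n_labels.
    Assumes n_labels is at most 99 (two-digit labels)."""
    stream = "".join(ch for ch in digits if ch.isdigit())  # drop spaces
    chosen = []
    for i in range(0, len(stream) - 1, 2):
        label = int(stream[i:i + 2])
        # Skip 00, labels beyond the population, and repeats
        if 1 <= label <= n_labels and label not in chosen:
            chosen.append(label)
            if len(chosen) == sample_size:
                break
    return chosen

# Lines 139 and 140, as quoted in the managers example (labels 01-28)
managers = pick_from_digit_table(
    "55 58 89 94 04 70 70 84 10 98 43 56 35 69 34 48 39 45 17 19 "
    "12 97 51 32 58 13",
    n_labels=28, sample_size=6)
print(managers)  # [4, 10, 17, 19, 12, 13]

# Lines 111 and 112, as quoted in the bottles example (labels 01-25)
bottles = pick_from_digit_table(
    "81 48 66 94 87 60 51 30 92 97 00 41 27 12 38 27 64 93 99 50 "
    "59 63 68 88 04 04 63 47 11",
    n_labels=25, sample_size=3)
print(bottles)  # [12, 4, 11]
```

If the table runs out before enough labels are found, the function returns a shorter list; in practice you would continue on the next line of the table.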

Page 15: AP Statistics

The rest of these probability samples give each individual, but not each subgroup, an equal chance of being selected.

Systematic random sampling – randomly select a starting place/number and then take every kth value/individual
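A minimal Python sketch of systematic sampling, under assumed numbers: a hypothetical population of 12 labeled individuals and k = 3. Because 12 is divisible by 3, each individual falls in exactly one of the 3 possible samples and so has a 1-in-3 chance of selection, yet only 3 of the 495 possible subgroups of size 4 can ever be chosen. That is why a systematic sample is not an SRS.

```python
import random
from math import comb

def systematic_sample(population, k, seed=None):
    """Pick a random starting position among the first k individuals,
    then take every kth individual after it."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # 0, 1, ..., k-1
    return population[start::k]

# Hypothetical population of 12 individuals, sampling every 3rd one
population = list(range(1, 13))
k = 3

# Enumerate every sample this method can possibly produce
all_possible = [population[start::k] for start in range(k)]
for s in all_possible:
    print(s)

print("possible systematic samples:", len(all_possible))
print("possible subgroups of size 4:", comb(12, 4))
```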

Page 16: AP Statistics

The following probability samples are used with populations that are very large and/or spread out:

stratified random sampling – break the population into two or more strata (groups – e.g. males and females), then take an SRS from each stratum (similar to blocking used in experiments); ensures that you include in the sample all types of individuals from the population (more representative sample)
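A stratified random sample can be sketched as one SRS per stratum. Everything concrete below is an assumption for illustration: the two strata, their sizes (30 and 20), the member labels, and the per-stratum sample sizes. `stratified_sample` is a hypothetical helper name.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take a separate SRS from each stratum.
    strata: {stratum name: list of members}
    sizes:  {stratum name: number to draw from that stratum}"""
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

# Hypothetical strata with made-up labels
strata = {
    "females": [f"F{i:02d}" for i in range(1, 31)],
    "males": [f"M{i:02d}" for i in range(1, 21)],
}
sizes = {"females": 3, "males": 2}

sample = stratified_sample(strata, sizes, seed=7)
print(sample)
```

Whatever the random draw, the sample always contains 3 females and 2 males, so a sample made up of only one stratum can never occur.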

Page 17: AP Statistics

Example: 5.23 p. 264

Answer: It is not an SRS, because some samples of size 250 have no chance of being selected. For example, using this method, it would be impossible to select a sample made up entirely of women.

Page 18: AP Statistics

multistage sampling – uses the idea of a cluster sample: randomly select a location, area, row, etc. and then include everything in that group in your sample; repeating the random selection in stages (for example, choosing areas and then rows within those areas) makes it a multistage sample.

Ex:
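The cluster idea can be sketched in Python as follows. The seating layout (six rows of five seats) is a made-up assumption, and `cluster_sample` is a hypothetical helper name. The sketch covers a single stage only: it randomly picks whole rows and includes everyone in them.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters whole clusters and include every
    individual in the chosen clusters."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), n_clusters)  # cluster names
    members = [person for name in picked for person in clusters[name]]
    return picked, members

# Hypothetical auditorium: rows A-F, five seats per row
rows = {f"row {r}": [f"{r}{seat}" for seat in range(1, 6)]
        for r in "ABCDEF"}

picked_rows, sample = cluster_sample(rows, 2, seed=3)
print(picked_rows)
print(sample)
```

The randomness here is applied to the rows, not to individual seats, so two people in the same row are always selected together or not at all.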

Page 19: AP Statistics

Surveys

Surveys are a common method of collecting data in an observational study. There are several problems that arise with surveys:

In order to choose a sample to survey, we need a complete and accurate list of the population, but in reality we rarely have one.

Page 20: AP Statistics

undercoverage – some groups in the population are left out in the process of choosing the sample; e.g. a survey given to households leaves out homeless people, people in prison, students in dorms; opinion polls over the phone leave out people without phones

Page 21: AP Statistics

nonresponse – the selected individual cannot be contacted/found or refuses to answer the questions. The nonresponse rate for surveys often reaches 30% or more.

Page 22: AP Statistics

response bias – bias caused by the behavior of the respondent or of the interviewer; e.g. respondents may lie if the questions deal with illegal or socially unacceptable behavior; the attitude of the interviewer may suggest that one answer is more desirable (therefore interviewers must be trained carefully to remain neutral)

Page 23: AP Statistics

wording of questions (wording effect) – the most important influence on the answers given to a survey; watch out for leading questions and difficult-to-understand questions

Page 24: AP Statistics

Example: 5.24 p. 264

Answer:

A) This question will likely elicit more responses against gun control (that is, more people will choose 2).

B) The phrasing of this question will tend to make people respond in favor of a nuclear freeze. Only one side of the issue is presented.

C) HUH? The wording of this question is too technical for most people to understand – and for those rare few that do understand, it is slanted towards supporting recycling. It could be rewritten to say something like: “Do you support economic incentives to promote recycling?”