Part III Gathering Data



Chapter 11 Understanding Randomness

Random
An event is random if we know what outcomes could happen but not which particular values did or will happen.

Random Numbers
“Hard to get”
Pseudorandom
Table of random digits

Pick a number from the next slide:

1 2 3 4
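The idea of pseudorandom digits can be sketched in Python. This is an illustration only and is not taken from the slides: the helper name `random_digits` and the seed value 11 are arbitrary assumptions. It shows that a generator started from the same seed reproduces the same "table" of digits, which is why such numbers are called pseudorandom.

```python
import random

def random_digits(seed, count):
    """Return `count` pseudorandom digits (0-9) from a generator started at `seed`."""
    rng = random.Random(seed)  # independent generator; the seed is an arbitrary choice
    return [rng.randint(0, 9) for _ in range(count)]

table_a = random_digits(seed=11, count=20)
table_b = random_digits(seed=11, count=20)

# Same seed, same digits: the sequence only looks random.
print(table_a == table_b)
print("".join(map(str, table_a)))
```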

Simulation
A simulation consists of a collection of things that happen at random. It is used to model real-world relative frequencies using random numbers.

Component
A situation that is repeated in the simulation. Each component has a set of possible outcomes.

Outcome
An individual result of a simulated component of a simulation.

Trial
The sequence of events that we are pretending will take place.

Step-by-step: page 295
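The terms above can be made concrete with a small simulation. The scenario is a hypothetical one chosen for illustration (it is not the textbook's step-by-step example): the component is one roll of a fair die, the outcome is the face that comes up, and a trial is the sequence of rolls needed to see the first 6. The seed and the number of trials are arbitrary assumptions.

```python
import random

def run_trial(rng):
    """One trial: repeat the component (a die roll) until the outcome is a 6."""
    rolls = 0
    while True:
        outcome = rng.randint(1, 6)  # one component, one outcome
        rolls += 1
        if outcome == 6:
            return rolls

rng = random.Random(295)  # arbitrary seed so the run is repeatable
results = [run_trial(rng) for _ in range(10_000)]

# The average over many trials estimates the real-world long-run value.
mean_rolls = sum(results) / len(results)
print(f"Average rolls needed: {mean_rolls:.2f}")
```

With many trials the average should settle near 6, the theoretical expected number of rolls.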

Chapter 12 Sample Surveys

Idea 1: Examine a part of the whole
Carefully select a smaller group (a sample) from the population.
A sample that does not represent the population in some important way is said to be biased.

Sample Surveys (cont.)

Idea 2: Randomize
Randomizing protects us from the influences of all the features of our population, even the ones that we may not have thought about.
It is the best defense against bias, because each individual is given a fair, random chance of selection.

Sample Surveys (cont.)

Idea 3: It’s the sample size
The fraction of the population that you have sampled doesn’t matter. It’s the sample size itself that’s important.
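Idea 3 can be checked with a rough simulation. Everything in this sketch is an assumption made for illustration: a population in which 60% would answer "yes", the population sizes, the sample sizes, and the helper name `sample_proportions`. It compares how much the sample proportion varies when the same sample size is drawn from populations of very different sizes, and then when the sample size is increased.

```python
import random
import statistics

def sample_proportions(pop_size, n, repeats, rng):
    """Draw `repeats` simple random samples of size n; return each sample proportion."""
    yes_count = pop_size * 6 // 10  # individuals 0..yes_count-1 answer "yes" (60%)
    props = []
    for _ in range(repeats):
        sample = rng.sample(range(pop_size), n)
        props.append(sum(1 for i in sample if i < yes_count) / n)
    return props

rng = random.Random(12)  # arbitrary seed
small_pop = sample_proportions(10_000, 100, 1_000, rng)     # n is 1% of the population
large_pop = sample_proportions(1_000_000, 100, 1_000, rng)  # n is 0.01% of the population
bigger_n = sample_proportions(1_000_000, 400, 1_000, rng)   # larger sample, tiny fraction

sd_small = statistics.stdev(small_pop)
sd_large = statistics.stdev(large_pop)
sd_bigger = statistics.stdev(bigger_n)
print(f"n=100, N=10,000:    spread {sd_small:.3f}")
print(f"n=100, N=1,000,000: spread {sd_large:.3f}")
print(f"n=400, N=1,000,000: spread {sd_bigger:.3f}")
```

The first two spreads should be nearly equal even though the sampled fractions differ by a factor of 100, while the third should be roughly half as large.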

Census
A sample that consists of the entire population.
Difficult to complete: not practical, too expensive.
Populations are not static.
Can be more complex than sampling.

Populations and Parameters

Population parameter
A numerical value that is part of a model for a population. We want to estimate these parameters from sampled data.

Sampling
When selecting a sample, we want it to be representative; that is, the statistics we compute from the sample should reflect the corresponding parameters accurately.

Simple Random Sample (SRS)
A sample of size n in which each combination of n elements has an equal chance of being selected.

Sampling Frame
A list of individuals from which the sample is drawn.
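Drawing a simple random sample from a sampling frame can be sketched as follows. The frame of 500 ID strings, the sample size of 25, and the seed are hypothetical values chosen for illustration. Python's `random.sample` selects without replacement, so every set of 25 IDs is equally likely.

```python
import random

# Hypothetical sampling frame: ID numbers for 500 individuals.
sampling_frame = [f"ID-{i:03d}" for i in range(1, 501)]

rng = random.Random(70)  # arbitrary seed for a repeatable draw
srs = rng.sample(sampling_frame, k=25)  # selection without replacement

print(sorted(srs)[:5])
```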

Other Sampling Designs

Stratified random sampling
A sampling design in which the population is divided into homogeneous subsets called strata, and random samples are drawn from each stratum.

Cluster Sampling
Random samples are drawn not directly from the population, but from groups called clusters. Used for convenience, practicality, or cost.
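The difference between the two designs can be sketched in code. The population of 300 students, the class years used as strata, and the dorms used as clusters are all hypothetical. In this sketch the cluster sample includes every member of each chosen cluster, which is one common variant and an assumption here.

```python
import random
from collections import defaultdict

rng = random.Random(78)  # arbitrary seed

# Hypothetical population: 300 students, each with a class year and a dorm.
years = ["first", "second", "third"]
population = [
    {"id": i, "year": years[i % 3], "dorm": f"dorm-{i % 10}"}
    for i in range(300)
]

def group_by(people, key):
    """Group the population into lists keyed by the given attribute."""
    groups = defaultdict(list)
    for person in people:
        groups[person[key]].append(person)
    return groups

# Stratified: draw a random sample of 10 from every stratum (class year).
strata = group_by(population, "year")
stratified = [p for members in strata.values() for p in rng.sample(members, 10)]

# Cluster: pick 2 whole dorms at random and take everyone in them.
clusters = group_by(population, "dorm")
chosen_dorms = rng.sample(sorted(clusters), 2)
cluster_sample = [p for dorm in chosen_dorms for p in clusters[dorm]]

print(len(stratified), len(cluster_sample), chosen_dorms)
```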

Other Sampling Designs (cont.)

Systematic Sample
A sample drawn by selecting individuals systematically from a sampling frame (e.g., every 10th person).

Multistage Sample
A sample that combines different sampling methods.
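A systematic sample of every 10th person can be sketched as follows. The frame of 200 numbered people, the helper name `systematic_sample`, and the seed are assumptions made for illustration. Choosing the starting point at random is a common practice and is assumed here; the slides do not specify it.

```python
import random

def systematic_sample(frame, step, rng):
    """Pick a random start in the first `step` entries, then take every `step`-th one."""
    start = rng.randrange(step)
    return frame[start::step]

frame = list(range(1, 201))  # hypothetical sampling frame of 200 people
sample = systematic_sample(frame, step=10, rng=random.Random(88))

print(sample)
```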

How to Sample Badly

Sample badly with volunteers
Voluntary response bias invalidates a survey.

Sample badly because of convenience
Convenience sampling: simply include the individuals who are at hand.

Sample from a bad sampling frame

Undercoverage
Some portion of the population is not sampled at all or has a smaller representation in the sample than it has in the population.

How to Sample Badly (cont.)

Nonresponse bias
Bias introduced when a large fraction of those sampled fail to respond.

Response bias
Influence arising from the design of the survey wording.

Look for biases before the survey. There is no way to recover from a biased sample or a survey that asks biased questions.

Sampling Variability
The difference from sample to sample, given that the samples are drawn at random.
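Sampling variability is easy to see by drawing several random samples from one population. The population below is made up for illustration (5,000 values generated around 100), and the sample size and seed are arbitrary assumptions. Each sample gives a slightly different mean even though all of them come from the same population.

```python
import random
import statistics

rng = random.Random(110)  # arbitrary seed

# Hypothetical population of 5,000 measurements.
population = [rng.gauss(100, 15) for _ in range(5_000)]

# Five random samples of 50, each giving a different sample mean.
sample_means = [statistics.mean(rng.sample(population, 50)) for _ in range(5)]

for m in sample_means:
    print(f"sample mean: {m:.1f}")
print(f"population mean: {statistics.mean(population):.1f}")
```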

Exercises

Page 325: #8, #14, #15

Chapter 13 Experiments

Investigative Study

Observational Studies
Researchers don’t assign choices.
No manipulation of the factors.

Retrospective Study
An observational study in which the researcher identifies the subjects and then collects data on their previous conditions or behaviors.

Prospective Study
An observational study in which the researcher identifies or selects the subjects and follows their future outcomes.

Experiment
Random assignment of subjects to treatments.
Explanatory variable: factor (manipulated)
Response variable: measurement
Experimental units: subjects, participants

Factor
A variable whose levels are controlled by the experimenter.
Levels of the factor

Treatments
All the combinations of the factors with their respective levels.
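The definition of treatments as all combinations of factor levels can be sketched with `itertools.product`. The two factors and their levels below are hypothetical and serve only as an example.

```python
from itertools import product

# Hypothetical two-factor experiment (names and levels are illustrative only).
factors = {
    "fertilizer": ["none", "half dose", "full dose"],
    "watering": ["daily", "every 3 days"],
}

# Each treatment is one combination of a level from every factor.
treatments = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for t in treatments:
    print(t)
```

Three levels of one factor and two of the other give 3 x 2 = 6 treatments.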

The Four Principles of Experimental Design

1 - Control
We need to control sources of variation other than the factors being studied (make the conditions similar for all treatment groups).

2 - Randomize
Assign the subjects randomly to the treatments to equalize the effects of unknown variation.

The Four Principles of Experimental Design (cont.)

3 - Replicate
Apply the treatments to several subjects.

4 - Block
Separate the subjects into blocks by identifiable attributes that can affect the outcome of the experiment.
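Randomizing, replicating, and blocking can be combined in one small sketch. The subjects, the age-group blocks, the treatment names, and the helper `assign_within_blocks` are all hypothetical. Subjects are shuffled separately inside each block and then dealt out to the treatments in turn, so each treatment is applied to several subjects in every block.

```python
import random

def assign_within_blocks(subjects, block_of, treatments, rng):
    """Randomly assign treatments separately inside each block."""
    blocks = {}
    for s in subjects:
        blocks.setdefault(block_of(s), []).append(s)
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # randomize the order within the block
        for i, s in enumerate(shuffled):
            assignment[s] = treatments[i % len(treatments)]  # deal out evenly
    return assignment

# Hypothetical subjects: 12 people blocked by age group.
subjects = [(f"subject-{i}", "under 40" if i < 6 else "40 and over") for i in range(12)]
assignment = assign_within_blocks(
    subjects,
    block_of=lambda s: s[1],
    treatments=["control", "new drug"],
    rng=random.Random(152),  # arbitrary seed
)

for subject, treatment in sorted(assignment.items()):
    print(subject, treatment)
```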

Designing an Experiment

Step-by-Step Page 335

Experiments

Control Treatment
A baseline treatment level that provides a basis for comparison.

Blinding
There are two main classes of individuals who can affect the outcome of the experiment:
Subjects and treatment administrators
Evaluators of the results

Single blinding: the individuals in one of these classes are blinded.
Double blinding: the individuals in both classes are blinded.

Experiments (cont.)

Placebos
A null treatment used to make sure that the observed effect is due to the treatment and not to the placebo effect.

Blocking
By blocking, we isolate the variability due to the differences between the blocks so that we can see the differences due to the treatment more clearly.

Confounding
When the levels of one factor are associated with the levels of another factor, we say that these two factors are confounded.

Exercises

Page 351: #10, #12