Sampling



Revisiting Sampling Concepts


• A population is all the possible members of a category

• Examples:

– the heights of every male or every female

– the temperature on every day since the beginning of time

– every person who ever has, and ever will, take a particular



• A sample is some subset of a population

• Examples:

– The heights of 10 students picked at random

– The participants in a drug trial

• Researchers seek to select samples that accurately reflect the broader population from which they are drawn.


Sample Statistics



Samples are drawn to infer something about the population

Reasons to Sample

Ideally, a decision maker would like to consider every item in the population, but:

• Contacting the whole population would be time-consuming, e.g. election polls

• The cost of such a study might be too high

• In many cases the whole population would be consumed if every part of it were tested (destructive testing)

• The sample results are adequate

Probability Vs Non Probability Sampling

Probability Sampling

• Drawing samples in a random manner

• Using random numbers

• Writing names on identical cards or slips and then drawing randomly

• Choosing every nth item of the population

• First dividing the population into homogeneous groups and then drawing samples randomly
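The selection methods listed above can be sketched in a few lines of Python. This is only an illustration under assumptions: the population of IDs 1 to 100, the two groups "low" and "high", and the function names are all hypothetical and are not taken from the slides.

```python
import random

def simple_random_sample(population, k, rng):
    """Draw k items without replacement; every subset of size k is equally likely."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Pick a random start in [0, step) and then take every step-th item."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, k_per_stratum, rng):
    """Draw k_per_stratum items at random from each homogeneous group."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

rng = random.Random(42)            # fixed seed so the run is repeatable
population = list(range(1, 101))   # hypothetical population: IDs 1..100

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strata = {"low": population[:50], "high": population[50:]}  # hypothetical groups
strat = stratified_sample(strata, 5, rng)

print("simple random:", srs)
print("systematic:   ", sys_sample)
print("stratified:   ", strat)
```

In each case the chance of an item being selected is known in advance, which is what separates these methods from the non-probability approaches.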


Non Probability Sampling

• man-on-the-street interviews

• call-in surveys

• readership surveys

• web surveys

Types of Variables

• Qualitative (Categorical)

• Quantitative (Numerical)

– Discrete

– Continuous

Sampling Error

• “Sampling error is simply the difference between the estimates obtained from the sample and the true population value.”

Sampling Error = $\bar{X} - \mu$

Where $\bar{X}$ = Mean of the Sample and $\mu$ = Mean of the Population
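As a small illustration of this definition, the sketch below builds a made-up population, draws one random sample of 10, and reports the sample mean minus the population mean. The population (simulated heights in cm) and all variable names are assumptions for the example only.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated heights in cm
population = [rng.gauss(170, 10) for _ in range(10_000)]
mu = mean(population)                # population mean

sample = rng.sample(population, 10)  # one simple random sample of size 10
x_bar = mean(sample)                 # sample mean

sampling_error = x_bar - mu
print(f"population mean = {mu:.2f}")
print(f"sample mean     = {x_bar:.2f}")
print(f"sampling error  = {sampling_error:+.2f}")
```

A different sample would give a different sampling error, which is why the distribution of the sample mean is of interest.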

Validity of Sampling Process

Sampling Distributions

• The distribution of a statistic calculated from all possible samples of size n drawn from a population is called a Sampling Distribution.

• Three things we want to know about any distribution:

– Central Tendency

– Dispersion

– Shape

Sampling Distribution of Means

• Suppose a population consists of three numbers: 1, 2 and 3

• All the possible samples of size 2 are drawn from the population (with replacement)

• Mean of the Pop (µ) = (1 + 2 + 3)/3 = 2

• Variance (σ²) = [(1 − 2)² + (2 − 2)² + (3 − 2)²]/3 = 2/3 ≈ 0.67

• Standard Deviation (σ) = √0.67 ≈ 0.82

Distribution of the Population

Sampling distribution of means

n = 2

Sample #   Sample   Sample Mean
1          1,1      1
2          1,2      1.5
3          1,3      2
4          2,1      1.5
5          2,2      2
6          2,3      2.5
7          3,1      2
8          3,2      2.5
9          3,3      3

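• Mean of the sampling distribution: $\mu_{\bar{X}} = 2 = \mu$

• Standard deviation of the sampling distribution: $\sigma_{\bar{X}} \approx 0.6$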
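The table of nine samples and its summary values can be checked by brute force. The sketch below assumes sampling with replacement (as the table implies, since 1,1 appears) and uses the population standard deviation, dividing by N, throughout.

```python
from itertools import product
from statistics import mean, pstdev

population = [1, 2, 3]
n = 2  # sample size

# All ordered samples of size n, drawn with replacement: 3**2 = 9 of them
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(population)               # population mean
sigma = pstdev(population)          # population standard deviation
mean_of_means = mean(sample_means)  # centre of the sampling distribution
sd_of_means = pstdev(sample_means)  # spread of the sampling distribution

for i, (s, m) in enumerate(zip(samples, sample_means), start=1):
    print(i, s, m)
print(f"population:    mean = {mu}, sd = {sigma:.2f}")
print(f"sample means:  mean = {mean_of_means}, sd = {sd_of_means:.2f}")
```

The sample means centre on the population mean, and their spread (about 0.58) is smaller than the population's (about 0.82).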

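• The population’s distribution has far more variability than that of the sample means

• As the sample size increases, the dispersion of the sampling distribution decreases; here, with n = 2, the standard deviation of the sample means is already smaller than that of the population: 0.6 < 0.8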

• The mean of the sampling distribution of ALL the sample means is equal to the true population mean.

• The standard deviation of a sampling distribution, called the Standard Error, is calculated as $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
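A minimal sketch of the standard error of the mean, assuming the usual formula σ/√n for independent draws (sampling with replacement, or a population much larger than the sample). The helper name `standard_error` is made up for this example.

```python
from math import sqrt
from statistics import pstdev

def standard_error(sigma, n):
    """Standard error of the mean for samples of size n (hypothetical helper)."""
    return sigma / sqrt(n)

sigma = pstdev([1, 2, 3])  # population standard deviation of the 1, 2, 3 example

for n in (2, 4, 16):
    print(f"n = {n:2d}  standard error = {standard_error(sigma, n):.3f}")
```

For n = 2 this agrees with the standard deviation of the nine sample means in the example with population 1, 2, 3, and it shrinks as n grows.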

Central Limit Theorem

• The variability of a sample mean decreases as the sample size increases

• If the population distribution is normal, so is the sampling distribution of the mean

• For ANY population with finite variance (regardless of its shape) the distribution of sample means will approach a normal distribution as n increases

• It can be demonstrated with the help of simulation.
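One possible simulation is sketched below. It assumes a strongly right-skewed population (exponential with mean 1, chosen only for illustration) and tracks how the skewness of the sample means shrinks toward 0, the value for a normal distribution, as n grows. The number of trials and the function names are arbitrary choices.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(1)  # fixed seed so the run is repeatable

def sample_means(n, trials=2000):
    """Means of `trials` samples of size n from an exponential population (mean 1)."""
    return [fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]

def skewness(values):
    """Population skewness: average cubed z-score (0 for a symmetric distribution)."""
    m, s = fmean(values), pstdev(values)
    return fmean(((v - m) / s) ** 3 for v in values)

results = {}
for n in (1, 5, 30, 100):
    means = sample_means(n)
    results[n] = (fmean(means), pstdev(means), skewness(means))
    centre, spread, skew = results[n]
    print(f"n = {n:3d}  mean = {centre:.3f}  sd = {spread:.3f}  skewness = {skew:.2f}")
```

The mean of the sample means should stay near 1, their standard deviation should fall roughly like 1/√n, and the skewness should move toward 0 as n increases.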

Central Limit Theorem Cont

• How large is a “large sample”?

• It depends upon the form of the distribution from which the samples were taken

• If the population distribution deviates greatly from normality, larger samples will be needed for the sampling distribution to approximate normality.

Implications of CLT

• A light bulb manufacturer claims that the life span of its light bulbs has a mean of 54 months and a standard deviation of 6 months. A consumer advocacy group tests 50 of them. Assuming the manufacturer’s claims are true, what is the probability that it finds a mean lifetime of less than 52 months?

Implications of CLT Cont

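• From the data we know that

• µ = 54 months, σ = 6 months, n = 50

• By the Central Limit Theorem

$\mu_{\bar{X}} = \mu = 54$

$\sigma_{\bar{X}} = \sigma / \sqrt{n} = 6 / \sqrt{50} \approx 0.85$

• To find $P(\bar{X} < 52)$, we need to convert to a z-score:

$z = (52 - 54) / 0.85 \approx -2.35$

• From the area table, the area between z = 0 and z = 2.35 is 0.4906

• Hence, the probability of this happening is 0.5 − 0.4906 = 0.0094.

• If the manufacturer’s claims are true, we are 99.06% certain that this will not happen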
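The light bulb calculation can be cross-checked without an area table. The sketch below uses Python's standard-library `statistics.NormalDist` and assumes, as the Central Limit Theorem suggests for n = 50, that the sample mean is approximately normal with mean µ and standard deviation σ/√n.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 54, 6, 50   # manufacturer's claimed mean and sd (months), sample size

se = sigma / sqrt(n)       # standard error of the sample mean
z = (52 - mu) / se         # z-score for a sample mean of 52 months

# P(sample mean < 52) under the normal approximation
p = NormalDist(mu=mu, sigma=se).cdf(52)

print(f"standard error = {se:.3f}")
print(f"z = {z:.3f}")
print(f"P(mean < 52) = {p:.4f}")
```

The result is a little over 0.009, close to the table-based 0.0094; the small gap comes from rounding the standard error and z before the table lookup.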

What can go wrong

• Statistics can be manipulated by intentionally taking biased samples


• Asking leading questions in interviews and questionnaires

• A survey which showed that 2 out of 3 dentists recommend a particular brand of toothpaste

• Sometimes there is non-response from a particular portion of the population, affecting the sampling design

How to do it right

• Make sure that the sample truly represents the population

• Use random selection methods where possible

• Avoid personal bias

• Avoid measurement bias

• Do not make any decisions about the population based on the samples until you have applied statistical inferential techniques to the sample.
