
Questionnaire Design and Surveys Sampling

The contents of this site are aimed at students who need to perform basic statistical

analyses on data from sample surveys, especially those in marketing science. Students

are expected to have a basic knowledge of statistics, such as descriptive statistics and

the concept of hypothesis testing.

 Professor Hossein Arsham 

To search the site, try Edit | Find in page [Ctrl + f]. Enter a word or phrase in the dialogue

box, e.g. "parameter" or "sampling". If the first appearance of the word/phrase is not what

 you are looking for, try Find Next. 

MENU 

1.  Introduction 

2.  Variance and Standard Deviation 

3.  What Is a Confidence Interval? 

4.  Questionnaire Design and Surveys Management 

5.  General Sampling Methods
6.  What Is the Margin of Error

7.  Sample Size Determination 

8.  Percentage: Estimation and Testing 

9.  Multilevel Statistical Models 

10. Surveys Sampling Routines 

11. Cronbach's Alpha (Coefficient Alpha) 

12. The Inter-Rater Reliability 

13. Instrumentality Theory 

14. Value Measurements Survey Instruments (Rokeach's Value Survey) 

15. Danger of Wrong Survey Design and the Interpretation of the Results
16. JavaScript E-labs Learning Objects

Companion Sites: 

  Business Statistics 

  Topics in Statistical Data Analysis 


  Excel For Statistical Data Analysis 

  Time Series Analysis and Business Forecasting 

  Computers and Computational Statistics 

  Probabilistic Modeling 

  Systems Simulation 

  Probability and Statistics Resources
  Compendium of Web Site Review

  The Business Statistics Online Course 

Introduction 

The main idea of statistical inference is to take a random sample from a population

and then to use the information from the sample to make inferences about particular

population characteristics such as the mean (measure of central tendency), the standard deviation (measure of spread) or the proportion of units in the population that

have a certain characteristic. Sampling saves money, time, and effort. Additionally, a

sample can, in some cases, provide as much information as a corresponding study that

would attempt to investigate an entire population: careful collection of data from a

sample will often provide better information than a less careful study that tries to look

at everything.

We must study the behavior of the mean of sample values from different specified

 populations. Because a sample examines only part of a population, the sample mean

will not exactly equal the corresponding mean of the population. Thus, an important consideration for those planning and interpreting sampling results is the degree to

which sample estimates, such as the sample mean, will agree with the corresponding

 population characteristic.

In practice, only one sample is usually taken (in some cases such as "survey data

analysis" a small "pilot sample" is used to test the data-gathering mechanisms and to

get preliminary information for planning the main sampling scheme). However, for

 purposes of understanding the degree to which sample means will agree with the

corresponding population mean, it is useful to consider what would happen if 10, or

50, or 100 separate sampling studies, of the same type, were conducted. How consistent would the results be across these different studies? If we could see that the

results from each of the samples would be nearly the same (and nearly correct!), then

we would have confidence in the single sample that will actually be used. On the other

hand, seeing that answers from the repeated samples were too variable for the needed

accuracy would suggest that a different sampling plan (perhaps with a larger sample

size) should be used.


A sampling distribution is used to describe the distribution of outcomes that one

would observe from replication of a particular sampling plan.

Know that estimates computed from one sample will be different from estimates that

would be computed from another sample.

Understand that estimates are expected to differ from the population characteristics

(parameters) that we are trying to estimate, but that the properties of sampling

distributions allow us to quantify, probabilistically, how they will differ.

Understand that different statistics have different sampling distributions with

distribution shapes depending on (a) the specific statistic, (b) the sample size, and (c)

the parent distribution.

Understand the relationship between sample size and the distribution of sample

estimates.

Understand that the variability in a sampling distribution can be reduced by increasing

the sample size.

See that in large samples, many sampling distributions can be approximated with a

normal distribution.
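To make these points concrete, the following small simulation is a sketch in Python (the skewed parent population, the sample sizes, and the number of replications are arbitrary choices for illustration). It draws repeated samples of each size and summarizes the resulting sampling distribution of the mean: the average of the sample means stays near the population mean, while their spread shrinks as the sample size grows.

import numpy as np

rng = np.random.default_rng(seed=1)
population = rng.exponential(scale=10.0, size=100_000)   # a skewed parent population (mean 10)

for n in (10, 50, 100):                                  # sample sizes for illustration
    means = [rng.choice(population, size=n, replace=False).mean()
             for _ in range(2000)]                       # 2000 replications of the sampling plan
    print(f"n={n:4d}  mean of sample means={np.mean(means):6.2f}  "
          f"SD of sample means={np.std(means):5.2f}")

# The standard deviation of the sample means (the standard error) shrinks
# roughly as 1/sqrt(n), and for larger n the distribution of the means is
# approximately normal even though the parent population is skewed.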

Variance and Standard Deviation 

Deviations about the mean of a population are the basis for most of the statistical tests

we will learn. Since we are measuring how widely a set of scores is dispersed about

the mean we are measuring variability. We can calculate the deviations about the

mean, and express it as variance or standard deviation. It is very important to have a

firm grasp of this concept because it will be a central concept throughout the course.

Both variance and standard deviation measure variability within a distribution.

Standard deviation is a number that indicates how much, on average, each of the

values in the distribution deviates from the mean (or center) of the distribution. Keep in mind that variance measures the same thing as standard deviation (dispersion of scores in a distribution). Variance, however, is the average of the squared deviations about

the mean. Thus, variance is the square of the standard deviation.

In terms of quality of goods/services, it is important to know that higher variation

means lower quality. Measuring the size of variation and its source is the statistician's


 job, while fixing it is the job of the engineer or the manager. Quality products and

services have low variation.
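As a small illustration of the definitions above, the following Python sketch (with made-up scores) computes the variance as the average squared deviation about the mean and the standard deviation as its square root:

scores = [12, 15, 9, 18, 11, 16, 14, 13]            # made-up scores

n = len(scores)
mean = sum(scores) / n
deviations = [x - mean for x in scores]

variance = sum(d * d for d in deviations) / n        # average squared deviation about the mean
std_dev = variance ** 0.5                            # standard deviation = square root of variance

print(f"mean={mean:.2f}  variance={variance:.2f}  standard deviation={std_dev:.2f}")
# For inference one usually divides by (n - 1) instead of n (the sample variance).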

What Is a Confidence Interval? 

In practice, a confidence interval is used to express the uncertainty in a quantity being

estimated. There is uncertainty because inferences are based on a random sample of

finite size from a population or process of interest. To judge the statistical procedure

we can ask what would happen if we were to repeat the same study, over and over,

getting different data (and thus different confidence intervals) each time.

Know that a confidence interval computed from one sample will be different from a

confidence interval computed from another sample.

Understand the relationship between sample size and width of confidence interval.

Know that sometimes the computed confidence interval does not contain the true

mean value (that is, it is incorrect) and understand how this coverage rate is related to

confidence level.
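A brief simulation makes the idea of coverage concrete. This is only a sketch in Python; the normal population, the 95% level, and the sample size are arbitrary choices for illustration.

import numpy as np

rng = np.random.default_rng(seed=2)
true_mean, sigma, n, reps = 50.0, 8.0, 25, 4000

covered, widths = 0, []
for _ in range(reps):
    sample = rng.normal(true_mean, sigma, size=n)
    half_width = 1.96 * sample.std(ddof=1) / np.sqrt(n)    # 95% interval, normal approximation
    lo, hi = sample.mean() - half_width, sample.mean() + half_width
    covered += (lo <= true_mean <= hi)
    widths.append(hi - lo)

print(f"coverage = {covered / reps:.3f}   average width = {np.mean(widths):.2f}")
# About 95% of the intervals contain the true mean; increasing n narrows the intervals.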

Questionnaire Design and Surveys Management 

This part of the course is aimed at students who need to perform basic

statistical analyses on data from sample surveys, especially those in marketing science. Students are expected to have a basic knowledge of

statistics such as descriptive statistics and the concept of hypothesis

testing.

When the sampling units are human beings, the main methods of

collecting information are:

 

face-to-face interviewing
postal surveys
telephone surveys
direct observation
Internet

The main questions are:


What is the purpose of the survey?

What kinds of questions would the survey be developed to answer?

What sorts of actions is the company considering based on the results of

the survey?

Step 1: Planning Questionnaire Research 

Consider the advantages and disadvantages of using questionnaires.

Prepare written objectives for the research.

Have your objectives reviewed by others.

Review the literature related to the objectives.

Determine the feasibility of administering your questionnaire to the

 population of interest.

Prepare a time-line.

Step 2. Conducting Item Try-Outs and an Item Analysis 

Have your items reviewed by others.

Conduct "think-aloud" with several people.

Carefully select individuals for think-aloud.

Consider asking about 10 individuals to write detailed responses on a

draft of your questionnaire.

Ask some respondents to respond to the questionnaire for an item

analysis. In the first stage of an item analysis, tally the number of respondents who selected each choice.

In the second stage of an item analysis, compare the responses of high

and low groups on individual items.

Step 3: Preparing a Questionnaire for Administration 

Write a descriptive title for the questionnaire.

Write an introduction to the questionnaire.

Group the items by content, and provide a subtitle for each group.

Within each group of items, place items with the same format together. At the end of the questionnaire, indicate what respondents should do

next.

Prepare an informed consent form, if needed.

If the questionnaire will be mailed to respondents, avoid having your

correspondence look like junk mail.

If the questionnaire will be mailed, consider including a token reward.


If the questionnaire will be mailed, write a follow-up letter.

If the questionnaire will be administered in person, consider preparing

written instructions for the administrator.

Step 4: Selecting a Sample of Respondents 

Identify the accessible population.

Avoid using samples of convenience.

Simple random sampling is a desirable method of sampling.

Systematic sampling is an acceptable method of sampling.

Stratification may reduce sampling errors.

Consider using random cluster sampling when every member of a

 population belongs to a group.

Consider using multistage sampling to select respondents from large

 populations.

Consider the importance of getting precise results when determining sample size.

Remember that using a large sample does not compensate for a bias in

sampling.

Consider sampling nonrespondents to get information on the nature of a bias.

The bias in the mean is the difference of the population means for respondents and nonrespondents multiplied by the population nonresponse rate, as in the short sketch below.
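A quick numerical illustration of this bias formula (all numbers are hypothetical):

# Hypothetical population figures
mean_respondents = 42.0      # population mean among those who would respond
mean_nonrespondents = 36.0   # population mean among those who would not respond
nonresponse_rate = 0.30      # fraction of the population that would not respond

bias = (mean_respondents - mean_nonrespondents) * nonresponse_rate
print(f"Expected bias in the respondent-only mean: {bias:.1f}")   # 1.8 units too high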

Step 5: Preparing Statistical Tables and Figures 

Prepare a table of frequencies.

Consider calculating percentages and arranging them in a table with the

frequencies.

For nominal data, consider constructing a bar graph.

Consider preparing a histogram to display a distribution of scores.

Consider preparing polygons if distributions of scores are to be

compared.

Step 6: Describing Averages and Variability 

Use the median as the average for ordinal data.

Consider using the mean as the average for equal interval data.

Use the median as the average for highly skewed, equal interval data.

Use the range very sparingly as the measure of variability.

If the median has been selected as the average, use the interquartile range as the measure of variability.

If the mean has been selected as the average, use the standard deviation

as the measure of variability.

Keep in mind that the standard deviation has a special relationship to the

normal curve that helps in its interpretation.

For moderately asymmetrical distributions the mode, median and mean satisfy the formula: mode = 3*median - 2*mean.

Step 7: Describing Relationships 

For the relationship between two nominal variables, prepare a

contingency table.

When groups have unequal numbers of respondents, include percentages

in contingency tables.

For the relationship between two equal interval variables, compute a

correlation coefficient. Interpret a Pearson r using the coefficient of determination.

For the relationship between a nominal variable and an equal interval

variable, examine differences among averages.

Step 8: Estimating Margins of Error 

It is extremely difficult, and often impossible, to evaluate the effects of a

 bias in sampling.

When evaluating a percentage, consider the standard error of a

percentage. When evaluating a mean, consider the standard error of the mean.

When evaluating a median, consider the standard error of the median.

Consider building confidence intervals, especially when comparing two

or more groups.

Step 9: Writing Reports of Questionnaire Research 

In an informal report, variations in the organization of the report are

 permitted.

Academic reports should begin with a formal introduction that cites literature.

The second section of academic reports should describe the research

methods.

The third section of academic reports should describe the results.

The last section of academic reports should be a discussion.

Acknowledge any weakness in your research methodology.


Missing Values on a Sensitive Topic 

A natural way to get answers is to, as much as possible, assure people

that the surveys are anonymous, and to find a way to make the

respondent at least minimally comfortable. So, according to the US General Accounting Office book, "Developing and Using Questionnaires" (Oct 1983), chapter 9, you should do the following:

1.  explain to the respondent the reasons for asking the questions,
2.  make response categories as broad as possible,
3.  word the question in a nonjudgmental style that avoids the appearance of censure, or, if possible, make the behavior in question appear to be socially acceptable,
4.  present the request in as factual a manner as possible,
5.  guarantee confidentiality or anonymity,
6.  make sure the respondent knows the info will not be used in any threatening way,
7.  explain how the info will be handled,

8.  avoid cross classification that will allow for pinpointing

responses.

Source of Errors 

1.  The use of an inadequate frame.

2.  A poorly designed questionnaire.

3.  Recording and measurement errors.
4.  Non-response problems.

For example, consider the following question: "Over the last twelve months would you say your health has on the whole been: Good? / Fairly good? / Not good?". The respondent is required to tick one of the 3 thus-labeled boxes.

What is wrong with the following:

It is the ONLY question on the form which asks about a matter of opinion rather than fact, but this distinction is not in any way represented in its layout or wording.

Whereas for a question about opinion there should be a response option of 'Don't Know', this is not provided. In some cases, such as on the Census Form, the Census advisory staff are adamant that the question must


 be answered. Thus a person with no opinion on the matter is in a

quandary and threatened with possible legal action.

This particular question is highly ambiguous as regards the qualitative

nature of what is being asked about (your health). Is one to respond in

terms of how one feels, how one can perform, comparisons with peer groups, comparisons with other periods of one's life, or what?

Relatively recent innovations surrounding the Internet have spawned

new ways for conducting surveys: most notably electronic mail (e-mail)

surveys and WWW surveys. While still in its infancy, it is clear that the

Internet is here to stay and this new medium is going to be used for

survey data collection. The main question is how the Internet can be

used for survey data collection by some effective and efficient design

considerations.

Survey Non-Sampling Errors: The widely used measure of the total

error in a survey estimate is the mean squared error (MSE). The MSE consists of two components: variance and the square of the bias. Survey researchers are able to obtain a good quantitative estimate of the

variance component of mean squared error. Unfortunately, the theory

and methods of estimating the bias (non-sampling error) component are

underdeveloped. Because the non-sampling error is usually much greater

than the sampling error in estimates from large sample surveys, it is

imperative that we learn more about it. In recent years the U.S. Bureau of Labor Statistics has measured various aspects of non-sampling error by means of behavioral science methods, among others.
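The decomposition mentioned above, MSE = variance + bias^2, can be illustrated with purely hypothetical numbers:

# Hypothetical components of total survey error for an estimate
sampling_variance = 4.0      # variance component (quantifiable from the sample design)
bias = 3.0                   # non-sampling error component (usually much harder to estimate)

mse = sampling_variance + bias ** 2
print(f"MSE = {sampling_variance} + {bias}**2 = {mse}")
# Here the bias contributes 9 of the 13 units of MSE, illustrating how
# non-sampling error can dominate the total error in a large survey.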

Further Reading:
Biemer P., and L. Lyberg, Introduction to Survey Quality, Wiley, 2003.
Lehtonen R., and E. Pahkinen, Practical Methods for Design and Analysis of Complex Surveys, Wiley, 2003.

General Sampling Techniques 

From the food you eat to the TV you watch, from political elections to school board actions, much of your life is regulated by the results of

sample surveys. In the information age of today and tomorrow, it is

increasingly important that sample survey design and analysis be

understood by many so as to produce good data for decision making and

to recognize questionable data when it arises. Relevant topics are:

Simple Random Sampling, Stratified Random Sampling, Cluster


Sampling, Systematic Sampling, Ratio and Regression Estimation,

Estimating a Population Size, Sampling a Continuum of Time, Area or

Volume, Questionnaire Design, Errors in Surveys.

A sample is a group of units selected from a larger group (the

population). By studying the sample it is hoped to draw valid conclusions about the larger group.

A sample is generally selected for study because the population is too

large to study in its entirety. The sample should be representative of the

general population. This is often best achieved by random sampling.

Also, before collecting the sample, it is important that the researcher

carefully and completely defines the population, including a description

of the members to be included.

Random Sampling: Random sampling of size n from a population of size N. An unbiased estimate for the variance of the sample mean x̄ is Var(x̄) = S^2 (1 - n/N)/n, where n/N is the sampling fraction. For a sampling fraction of less than 10% the finite population correction factor (N-n)/(N-1) is almost 1.

The total T is estimated by N·x̄; its variance is N^2 Var(x̄).

For 0, 1 (binary) type variables, the variation in the estimated proportion p is:

S^2 = p(1-p)(1 - n/N)/(n-1).

For a ratio r = Σxi/Σyi = x̄/ȳ, the variation for r is:

[(N-n)(Sx^2 + r^2 Sy^2 - 2 r Cov(x, y))] / [n(N-1) ȳ^2].
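As a hedged illustration of these simple-random-sampling estimates, here is a short Python sketch (the sample values and the population size N are made up):

import numpy as np

N = 5000                                         # assumed known population size
sample = np.array([12.0, 7.5, 9.0, 14.2, 10.1,
                   8.8, 11.3, 9.9, 13.0, 10.7])  # made-up SRS of size n = 10
n = len(sample)

xbar = sample.mean()
S2 = sample.var(ddof=1)                          # sample variance S^2
var_xbar = S2 * (1 - n / N) / n                  # Var(xbar) = S^2 (1 - n/N) / n
total_hat = N * xbar                             # estimated total T = N * xbar
var_total = N**2 * var_xbar                      # its variance N^2 Var(xbar)

print(f"mean estimate {xbar:.2f} (variance {var_xbar:.4f})")
print(f"total estimate {total_hat:.0f} (variance {var_total:.0f})")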

Stratified Sampling: Stratified sampling can be used whenever the

population can be partitioned into smaller sub-populations, each of

which is homogeneous according to the particular characteristic of

interest.

x̄s = Σ Wt·x̄t, over t = 1, 2, ..., L (strata), where Wt = Nt/N is the stratum weight and x̄t = Σ Xit/nt.

Its variance is:

Σ Wt^2 (Nt - nt) St^2 / [nt (Nt - 1)]

The population total T is estimated by N·x̄s; its variance is

Σ Nt^2 (Nt - nt) St^2 / [nt (Nt - 1)].

Since the survey usually measures several attributes for each population

member, it is impossible to find an allocation that is simultaneously

optimal for each of those variables. Therefore, in such a case we use the popular method of allocation which uses the same sampling fraction in each stratum. This yields the optimal allocation when the variances of the strata are all the same.

Determination of sample sizes (n) with regard to binary data: the smallest integer greater than or equal to:

[t^2 N p(1-p)] / [t^2 p(1-p) + ε^2 (N-1)]

with N being the size of the total number of cases, n being the sample size, ε the expected (margin of) error, t being the value taken from the t distribution corresponding to a certain confidence level, and p being the probability of an event.
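A direct transcription of this formula as a Python sketch (the values used for N, p, ε, and t are only examples):

import math

def binary_sample_size(N, p, eps, t=1.96):
    """Smallest integer n >= t^2 N p(1-p) / (t^2 p(1-p) + eps^2 (N-1))."""
    n = (t**2 * N * p * (1 - p)) / (t**2 * p * (1 - p) + eps**2 * (N - 1))
    return math.ceil(n)

# Example: population of 10,000, worst-case p = 0.5, expected error 0.05, 95% level
print(binary_sample_size(N=10_000, p=0.5, eps=0.05))   # about 370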

Cross-Sectional Sampling: a cross-sectional study is the observation of a

defined population at a single point in time or time interval. Exposure

and outcome are determined simultaneously.

Quota Sampling: Quota sampling is availability sampling, but with the

constraint that proportionality by strata be preserved. Thus the

interviewer will be told to interview so many white male smokers, so many black female nonsmokers, and so on, to improve the representativeness of the sample. Maximum variation sampling is a variant

of quota sampling, in which the researcher purposively and non-

randomly tries to select a set of cases, which exhibit maximal differences

on variables of interest. Further variations include extreme or deviant

case sampling or typical case sampling.

What is a statistical instrument? A statistical instrument is any process

that aims at describing a phenomenon by using any instrument or device; however, the results may be used as a control tool. Examples of statistical instruments are questionnaires and surveys sampling.

What is the grab sampling technique? The grab sampling technique is to take a relatively small sample over a very short period of time; the results obtained are usually instantaneous. However, Passive Sampling is a

technique where a sampling device is used for an extended time under


similar conditions. Depending on the desirable statistical investigation,

the Passive Sampling may be a useful alternative or even more

appropriate than grab sampling. However, a passive sampling technique

needs to be developed and tested in the field.

Further Reading:  Thompson S., Sampling , Wiley, 2002. 

What Is the Margin of Error 

Estimation is the process by which sample data are used to indicate the

value of an unknown quantity in a population.

Results of estimation can be expressed as a single value, known as a point estimate, or a range of values, referred to as a confidence interval.

Whenever we use point estimation, we calculate the margin of error

associated with that point estimation. For example, for the estimation of the population proportion by means of the sample proportion (P), the margin of error is often calculated as follows:

±1.96 [P(1-P)/n]^1/2

In newspapers and television reports on public opinion polls, the margin of error often appears in small print at the bottom of a table or screen. However, reporting the amount of error alone is not informative enough by itself; what is missing is the degree of confidence in the findings. The more important missing piece of information is the sample size n, that is, how many people participated in the survey: 100 or 100,000? By now, you know well that the larger the sample size, the more accurate the finding, right?

The reported margin of error is the margin of "sampling error". There are

many nonsampling errors that can and do affect the accuracy of polls.

Here we talk about sampling error. Note also that subgroups have larger sampling errors than the full sample. A careful report must include a statement such as the following: "Other

sources of error include but are not limited to, individuals refusing to

 participate in the interview and inability to connect with the selected

number. Every feasible effort is made to obtain a response and reduce

the error, but the reader (or the viewer) should be aware that some error

is inherent in all research."


If you have a yes/no question in a survey, you probably want to calculate

a proportion P of Yes's (or No's). Under simple random sampling survey,

the variance of P is P(1-P)/n, ignoring the finite population correction,

for large n, say over 30. Now a 95% confidence interval is

P - 1.96 [P(1-P)/n]^1/2 to P + 1.96 [P(1-P)/n]^1/2.

A conservative interval can be calculated, since P(1-P) takes its maximum value when P = 1/2. Replace 1.96 by 2, put P = 1/2, and you have a 95% conservative confidence interval of P ± 1/n^1/2. This approximation works well as long as P is not too close to 0 or 1. This useful approximation allows you to calculate approximate 95% confidence intervals.
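The same calculation, including the conservative 1/n^1/2 version, as a short Python sketch (the sample numbers are arbitrary):

import math

def proportion_ci(successes, n, z=1.96):
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)        # normal-approximation 95% interval
    return p - half, p + half

def conservative_ci(p_hat, n):
    half = 1 / math.sqrt(n)                      # worst case: replace 1.96 by 2 and P by 1/2
    return p_hat - half, p_hat + half

lo, hi = proportion_ci(420, 1000)                # e.g., 420 "yes" answers out of 1000
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
print("conservative 95% CI:", conservative_ci(0.42, 1000))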

References and Further Readings:
Casella G., and R. Berger, Statistical Inference, Wadsworth Pub. Co., 2001.
Kish L., Survey Sampling, Wiley, 1995.
Lehmann E., and G. Casella, Theory of Point Estimation, Springer Verlag, New York, 1998.
Levy P., and S. Lemeshow, Sampling of Populations: Methods and Applications, Wiley, 1999.

Sample Size Determination 

The question of how large a sample to take arises early in the planning of

any survey. This is an important question that should not be treated lightly. To take a larger sample than is needed to achieve the desired results is wasteful of resources, whereas very small samples often lead to results that are of no practical use for making good decisions. The main objective is to

obtain both a desirable accuracy and a desirable confidence level with

minimum cost.

Pilot Sample: A pilot or preliminary sample must be drawn from the

 population and the statistics computed from this sample are used in

determination of the sample size. Observations used in the pilot sample

may be counted as part of the final sample, so that the computed sample

size minus the pilot sample size is the number of observations needed to

satisfy the total sample size requirement.

People sometimes ask me, what fraction of the population do you need?

I answer, "It's irrelevant; accuracy is determined by sample size alone."

This answer has to be modified if the sample is a sizable fraction of the

 population.


For an item scored 0/1 for no/yes, the standard deviation of the item

scores is given by SD = [p(1-p)]^1/2, where p is the proportion obtaining

a score of 1, and N is the sample size.

The standard error of estimate SE (the standard deviation of the range of

possible p values based on the pilot sample estimate) is given by SE = SD/N^1/2. Thus, SE is at a maximum when p = 0.5, and the worst case

scenario occurs when 50% agree, 50% disagree.

The sample size, N, can then be expressed as the smallest integer greater than or equal to 0.25/SE^2.

Thus, for SE to be 0.01 (i.e. 1%), a sample size of 2500 would be

needed; 2%, 625; 3%, 278; 4%, 156; 5%, 100.

 Note, incidentally, that as long as the sample is a small fraction of thetotal population, the actual size of the population is entirely irrelevant for

the purposes of this calculation.

Sample sizes with regard to binary data:

n = [t^2 N p(1-p)] / [t^2 p(1-p) + ε^2 (N-1)]

with N being the size of the total number of cases, n being the sample size, ε the expected (margin of) error, t being the value taken from the t distribution corresponding to a certain confidence level, and p being the probability of an event.

For a finite population of size N, the standard error of the sample mean of a sample of size n is:

S [(N - n)/(nN)]^1/2

where S is the standard deviation.

There are several formulas for the sample size needed for a t-test. The

simplest one is

n = 2 (Zα + Zβ)^2 σ^2 / D^2

which underestimates the sample size, but is reasonable for large sample

sizes. A less inaccurate formula replaces the Z values with t values, and

requires iteration, since the df for the t distribution depends on the

sample size. The accurate formula uses a non-central t distribution and it

also requires iteration.


The simplest approximation is to replace the first Z value in the above

formula with the value from the studentized range statistic that is used to

derive Tukey's follow-up test. If you don't have sufficiently detailed

tables of the studentized range, you can approximate the Tukey follow-

up test using a Bonferroni correction. That is, change the first Z value to

Zα/k, where k is the number of comparisons.

 Neither of these solutions is exact and the exact solution is a bit messy.

But either of the above approaches is probably close enough, especially

if the resulting sample size is larger than (say) 30.
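A hedged sketch of the basic t-test sample-size calculation described above (Python, using scipy for the normal and t quantiles; the values of α, power, σ, and D are only examples):

from scipy import stats

def n_per_group(alpha, power, sigma, D):
    """Simple formula n = 2 (Z_alpha/2 + Z_beta)^2 sigma^2 / D^2, then refined
    by iterating with t quantiles (df = 2n - 2), as described above."""
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    n = 2 * (z_a + z_b) ** 2 * sigma ** 2 / D ** 2       # slight underestimate
    for _ in range(10):                                   # iterate with t values
        df = max(2 * n - 2, 1)
        t_a = stats.t.ppf(1 - alpha / 2, df)
        t_b = stats.t.ppf(power, df)
        n = 2 * (t_a + t_b) ** 2 * sigma ** 2 / D ** 2
    return int(n) + 1

# Example: detect a difference D = 5 with sigma = 10, alpha = 0.05, power = 0.80
print(n_per_group(0.05, 0.80, 10, 5))    # roughly 64 per group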

A better stopping rule for conventional statistical tests is as follows:

Test some minimum (pre-determined) number of subjects.

Stop if p-value is equal to or less than .01, or p-value equal to or greater

than .36; otherwise, run more subjects.

Obviously, another option is to stop if/when the number of subjects

 becomes too great for the effect to be of practical interest. This

procedure maintains α at about 0.05.

We may categorize probability proportional to size (PPS) sampling,

stratification, and ratio estimation (or any other form of model assisted

estimation) as tools that protect one from the results of a very unlucky

sample. The first two (PPS sampling and stratification) do this by

manipulation of the sampling plan (with PPS sampling conceptually a

limiting case of stratification). Model assisted estimation methods such

as ratio estimation serve the same purpose by introduction of ancillary

information into the estimation procedure. Which tools are preferable

depends, as others have said, on costs, availability of information that

allows use of these tools, and the potential payoffs (none of these will

help much if the stratification/PPS/ratio estimation variable is not well

correlated with the response variable of interest).

There are also heuristic methods for determination of sample size. For

example, in healthcare behavior and process measurement, sampling

criteria are designed for a 95% CI of 10 percentage points around a

population mean of 0.50. There is a heuristic rule: "If the number of

individuals in the target population is smaller than 50 per month, systems

do not use sampling procedures but attempt to collect data from all

individuals in the target population."


Further Readings:
Goldstein H., Multilevel Statistical Models, Halstead Press, 1995.
Kish L., G. Kalton, S. Heeringa, C. O'Muircheartaigh, and J. Lepkowski, Collected Papers of Leslie Kish, Wiley, 2002.
Kish L., Survey Sampling, Wiley, 1995.

Percentage: Estimation and Testing 

The following are two JavaScript applets that construct exact confidence

intervals and test of hypothesis with respect to proportion, percentage,

and binomial distribution with or without a finite population,

respectively.

Enter the needed information, and then click the Calculate button.

Application to the test of hypothesis: Notice that one may utilize a Confidence Interval (CI) for test of hypothesis purposes. Suppose

you wish to test the following general test of hypothesis:

H0: The population parameter is almost equal to a given claimed value,

against the alternative:

Ha: The population parameter is not even close to the claimed value.

The process of carrying out the above test of hypothesis at the α level of significance using a CI is as follows:

1.  Ignore the claimed value in the null hypothesis, for the time being.

2.  Construct a 100(1-α)% confidence interval based on the available

data.

3.  If the constructed CI does not contain the claimed value, then

there is enough evidence to reject the null hypothesis. Otherwise,

there is no reason to reject the null hypothesis.

Sample Size with Acceptable Absolute Precision: The following presents the widely used method for determining the sample size required for estimating a population mean or proportion.

Let us suppose we want an interval that extends δ units on either side of the estimator. We can write

δ = Absolute Precision = (reliability coefficient)·(standard error) = Zα/2 (S/n^1/2)

Solving for n gives n = (Zα/2 S/δ)^2.

You may like using Sample Size Determination Applet to check your

computations.

The first applet takes the sample size (n), the number of successes (m), and the required confidence level (1-α), and returns the point estimate together with the lower and upper confidence limits; the example inputs shown are n = 200 and m = 4.

Confidence Intervals for Finite Population

The second applet additionally takes the population size (N); the example inputs shown are N = 2000, n = 200, m = 4, and a confidence level of 0.95.
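The applets themselves are on the original site; as a rough stand-in, the following Python sketch computes an exact (Clopper-Pearson) interval for the infinite-population case, reproducing the example inputs n = 200 and m = 4:

from scipy import stats

def exact_binomial_ci(m, n, conf=0.95):
    """Clopper-Pearson (exact) confidence interval for a proportion."""
    alpha = 1 - conf
    lower = 0.0 if m == 0 else stats.beta.ppf(alpha / 2, m, n - m + 1)
    upper = 1.0 if m == n else stats.beta.ppf(1 - alpha / 2, m + 1, n - m)
    return m / n, lower, upper

point, lo, hi = exact_binomial_ci(m=4, n=200)
print(f"point estimate {point:.3f}, 95% CI ({lo:.4f}, {hi:.4f})")
# The finite-population version (e.g., N = 2000) would give a somewhat shorter interval.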

Multilevel Statistical Models 


Many kinds of data, including observational data collected in the human

and biological sciences, have a hierarchical or clustered structure. For

example, animal and human studies of inheritance deal with a natural

hierarchy where offspring are grouped within families. Offspring from

the same parents tend to be more alike in their physical and mental

characteristics than individuals chosen at random from the population at large.

Many designed experiments also create data hierarchies, for example

clinical trials carried out in several randomly chosen centers or groups of

individuals. Multilevel models are concerned only with the fact of such

hierarchies, not their provenance. We refer to a hierarchy as consisting of

units grouped at different levels. Thus offspring may be the level 1 units

in a 2-level structure where the level 2 units are the families; students

may be the level 1 units clustered within schools that are the level 2

units.

The existence of such data hierarchies is not accidental and should not be

ignored. Individual people differ as do individual animals and this

necessary differentiation is mirrored in all kinds of social activity where

the latter is often a direct result of the former, for example when students

with similar motivations or aptitudes are grouped in highly selective

schools or colleges. In other cases, the groupings may arise for reasons

less strongly associated with the characteristics of individuals, such as

the allocation of young children to elementary schools, or the allocation

of patients to different clinics. Once groupings are established, even if

their establishment is effectively random, they will tend to become

differentiated, and this differentiation implies that the group and its

members both influence and are influenced by the group membership.

To ignore this relationship risks overlooking the importance of group

effects, and may also render invalid many of the traditional statistical

analysis techniques used for studying data relationships.

A simple example will show its importance. A well known and

influential study of primary (elementary) school children carried out in

the 1970's claimed that children exposed to so called 'formal' styles of

teaching reading exhibited more progress than those who were not. The

data were analyzed using traditional multiple regression techniques,

which recognized only the individual children as the units of analysis

and ignored their groupings within teachers and into classes. The results

were statistically significant. Subsequently, it has been demonstrated that

when the analysis accounted properly for the grouping of children into


classes, the significant differences disappeared and the 'formally' taught

children could not be shown to differ from the others.

This re-analysis is the first important example of a multilevel analysis of

social science data. In essence what was occurring here was that the

children within any one classroom, because they were taught together, tended to be similar in their performance. As a result they provide rather

less information than would have been the case if the same number of

students had been taught separately by different teachers. In other words,

the basic unit for purposes of comparison should have been the teacher

not the student. The function of the students can be seen as providing,

for each teacher, an estimate of that teacher's effectiveness. Increasing

the number of students per teacher would increase the precision of those

estimates but not change the number of teachers being compared.

Beyond a certain point, simply increasing the numbers of students in this

way hardly improves things at all. On the other hand, increasing the

number of teachers to be compared, with the same or somewhat smaller

number of students per teacher, considerably improves the precision of

the comparisons.

Researchers have long recognized this issue. In education, for example,

there has been much debate about the so called 'unit of analysis' problem,

which is the one just outlined. Before multilevel modelling became well

developed as a research tool, the problems of ignoring hierarchical

structures were reasonably well understood, but they were difficult to

solve because powerful general purpose tools were unavailable. Special

 purpose software, for example for the analysis of genetic data, has been

available longer but this was restricted to 'variance components' models

and was not suitable for handling general linear models. Sample survey

workers have recognized this issue in another form. When population

surveys are carried out, the sample design typically mirrors the

hierarchical population structure, in terms of geography and household

membership. Elaborate procedures have been developed to take such

structures into account when carrying out statistical analyses.

Further Readings:
Goldstein H., Multilevel Statistical Models, Halstead Press, New York, 1995.
Longford N., Random Coefficient Models, Clarendon Press, Oxford, 1993.
These books cover a very wide range of applications and theory.

Surveys Sampling Routines 


Note: The following programs refer to Practical Methods for Design and Analysis of Complex Surveys, by R. Lehtonen and E. Pahkinen, Wiley, Chichester, 1995. See also L. Lyberg et al. (Editors), Survey Measurement and Process Quality, New York, Wiley, 1997.

Other software packages include Le Sphinx, CENVAR, CLUSTERS, Epi Info, Generalized Estimation System, Super CARP, Stata, SUDAAN, VPLX, WesVarPC, and OSIRIS IV.

TITLE Bernoulli sampling; PI=0.25, N=32

GET FILE (input dataset)

COMPUTE PI=0.25

COMPUTE EPSN=UNIF(1)

SELECT IF (EPSN LT PI)

WRITE OUTFILE = (output dataset)

TITLE Simple random sampling with replacement; n=8, N=32

GET FILE (input dataset)

COMPUTE L=L+ID

LEAVE L

COMPUTE E=L-ID

NUMERIC W(f2)

COMPUTE W=0

DO REPEAT A=A1-A8

IF (ID=1) A=UNIF(32)

LEAVE A

IF (E LT A AND A LE L) W=W + 1

END REPEAT

SELECT IF (W GT 0)

WRITE OUTFILE = (output dataset)

TITLE Simple random sampling without replacement; n=8, N=32


GET FILE (input dataset)

SAMPLE 8 FROM 32

WRITE OUTFILE = (output dataset)

TITLE Systematic sampling; n=8, sampling interval =4

MATRIX

COMPUTE RAND = TRUNC (4*UNIFORM(1,1))

COMPUTE INT=RAND*MAKE(32, 1, 1)

SAVE INT/OUTPUT=*/VAR=INT

END MATRIX

MATCH FILES FILE = (input dataset)/FILE=*

COMPUTE INDEX = MOD ($CASENUM, 4)

SELECT IF (INDEX=INT)

SAVE OUTFILE = (output dataset)/DROP=INDEX INT

The following routines are for PPS sampling (selection with probability proportional to size).

TITLE PPS Poisson sampling with expected size of 8

GET FILE (input dataset)

COMPUTE PI=8*HOU85/91753

COMPUTE EPSN=UNIF(1)

SELECT IF (EPSN LE PI)

WRITE OUTFILE = (output dataset)


TITLE PPS Sampling with replacement; n=8

GET FILE (input dataset)

COMPUTE L=L+HOU85

LEAVE L

COMPUTE E=L-HOU85

NUMERIC W(F2)

COMPUTE W=0

DO REPEAT A A1 TO A8

IF (ID=1) A=UNIF(91753)

LEAVE A

IF (E LT A AND A LE L) W=W+1

END REPEAT

SELECT IF (W GT 0)

WRITE OUTFILE = (output dataset)

TITLE PPS Systematic sampling n=8

GET FILE (input dataset)

COMPUTE #C=#C + 1

COMPUTE CASE = #C

COMPUTE #SN=8

COMPUTE #PN=91753

COMPUTE #INT=TRUNC (#PN/#SN)

COMPUTE #RAN= TRUNC (UNIFORM (#INT) +1)

DO IF CASE = 1

COMPUTE #COMP=#RAN

COMPUTE RAN=#RAN


END IF

COMPUTE SAMIND=0

LOOP IF #COMP LE CUMHOU85

+ COMPUTE SAMIND = SAMIND+1

+ COMPUTE #COMP=#COMP+#INT

END LOOP

EXECUTE.

WRITE OUTFILE= (output dataset)
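For readers without SPSS, here is a rough Python equivalent of a few of the routines above (Bernoulli, simple random, systematic, and PPS-with-replacement selection); the frame and the size measure are made up:

import numpy as np

rng = np.random.default_rng(seed=3)
N = 32
frame = np.arange(N)                           # unit identifiers 0..31
size = rng.integers(100, 1000, size=N)         # a made-up size measure (e.g., households)

# Bernoulli sampling: each unit included independently with probability PI = 0.25
bernoulli = frame[rng.random(N) < 0.25]

# Simple random sampling without replacement, n = 8
srswor = rng.choice(frame, size=8, replace=False)

# Systematic sampling with interval 4: random start, then every 4th unit
start = rng.integers(4)
systematic = frame[start::4]

# PPS sampling with replacement, n = 8: selection probability proportional to size
pps = rng.choice(frame, size=8, replace=True, p=size / size.sum())

print("Bernoulli:", bernoulli)
print("SRSWOR:   ", sorted(srswor))
print("Systematic:", systematic)
print("PPS (WR): ", sorted(pps))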

Further Readings: Bethel J., Sample allocation in multivariate surveys, Survey Methodology , 15, 1989, 47-57.

Valliant R., and J. Gentle, An application of mathematical programming to a sample allocationproblem, Computational Statistics and Data Analysis , 25, 1997, 337-360. 

Cronbach's Alpha (Coefficient Alpha)

Perhaps the best way to conceptualize Cronbach's Alpha is to think of it

as the average of all possible split half reliabilities for a set of items. A

split half reliability is simply the reliability between two parts of a test or

instrument where those two parts are halves of the total instrument. In

general, the reliabilities of these two halves should then be stepped up (Spearman-Brown prophecy formula) to estimate the reliability for the full-length test rather than the reliability between two half-length tests.

Assuming, for ease of interpretation, that a test has an even number of

items (e.g., 10), then items 1-5 versus 6-10 would be one split, evens versus odds would be another and, in fact, with 10 items chosen 5 at a time, there are 10 choose 5, or 252, possible split halves for this test. If we

compute each of these stepped-up split half reliabilities and average

them all, this average would be Cronbach's Alpha. Since some splits will

 be better than others in terms of creating two more closely parallel

halves, and the reliability between parallel halves is probably the most appropriate estimate of an instrument's reliability, Cronbach's alpha is

often considered a relatively conservative estimate of the internal

consistency of a test.
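Before turning to the SAS option described next, here is a minimal Python sketch of the usual computational formula, alpha = k/(k-1)·(1 - sum of item variances / variance of the total score), applied to a small made-up item matrix:

import numpy as np

def cronbach_alpha(items):
    """items: rows = respondents, columns = items (all coded in the same direction)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Made-up responses: 6 respondents, 4 Likert items scored 1-4
data = [[4, 3, 4, 4],
        [2, 2, 3, 2],
        [3, 3, 3, 4],
        [1, 2, 1, 2],
        [4, 4, 3, 4],
        [2, 1, 2, 1]]
print(f"Cronbach's alpha = {cronbach_alpha(data):.3f}")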

The following is a SAS program for computing coefficient alpha or

Cronbach's Alpha. Note that it is an option in the PROC CORR


 procedure. In SAS, for a WORK data set called ONE, suppose we want

the internal consistency or coefficient alpha or Cronbach's alpha for x1-

x10, the syntax is:

PROC CORR DATA=WORK.ONE ALPHA;
VAR X1-X10;
RUN;

There are at least three important caveats to consider when computing

coefficient alpha.

 Note 1: How to handle "missing" values. In achievement testing, a

missing value or a not reached value is traditionally coded as 0 or wrong.

The CORR procedure in SAS DOES NOT treat missing as wrong. It

is not difficult to write code to force this to happen, but we must write

the code. In the above example we could do so as follows:

DATA WORK.ONE;SET WORK.ONE;

ARRAY X {10} X1-X10; /* DEFINING AN

ARRAY FOR THE 10 ITEMS */

DO I=1 TO 10;

IF X(I) = . THEN X(I) = 0; /* FOR EACH ITEM

X1-X10 CHANGING MISSING VALUES (.) TO 0 */

END;

RUN;

 Note 2: The use of the NOMISS option in the CORR procedure. This is

related to Note 1 above. Another way of handling missing observations is to use the NOMISS option in the CORR procedure. The syntax is as follows:

PROC CORR DATA=WORK.ONE ALPHA NOMISS;

VAR X1-X10;

The effect of this is to remove all items X1-X10 from analysis for any

record where at least one of these items X1-X10 is missing.

Obviously, for achievement testing, especially for speeded tests, where

most examinees might not be expected to complete all items, this would be a problem. The use of the NOMISS option would restrict the analysis to the subset of examinees who did complete all items, and this quite often

would not be the population of interest when wishing to establish an

internal consistency reliability estimate.

One common approach to resolving this problem might be to define a

number of items that must be attempted for the record to be included.

Some health status measures, for example the SF-36, have scoring rules


that require that at least 50% of the items must be answered for the scale

to be defined. If less than half of the items are attempted, then the scale

is not interpreted. If the scale is considered valid, by THEIR definition,

then all missing values on that scale are replaced by the average of the non-missing items on that scale. The SAS code to implement this scoring algorithm is summarized below under the assumption that the scale has 10 items.

DATA WORK.ONE; SET WORK.ONE;
ARRAY X {10} X1-X10;
IF NMISS(OF X1-X10) > 5 THEN DO I=1 TO 10;   /* MORE THAN HALF MISSING: SET THE WHOLE SCALE TO MISSING */
X(I) = .;
END;
ELSE IF NMISS(OF X1-X10) <= 5 THEN DO I=1 TO 10;   /* OTHERWISE IMPUTE EACH MISSING ITEM WITH THE MEAN */
IF X(I) = . THEN X(I) = MEAN(OF X1-X10);
END;
RUN;

 Note that replacing all missing values with the average of the non-

missing values in the cases where the number of missing values is not

greater than half of the total number of items will result in an inflated

Cronbach's alpha. A better approach would be to remove from

consideration records where fewer than 50% of the items are

completed and to leave the remaining records intact, with the missing

values still in. In other words, to implement that first IF statement above,

 but to eliminate the ELSE IF clause and then to run the PROC CORR

without the NOMISS option. The bottom line: The NOMISS option in

PROC CORR in general, and with the ALPHA option in particular, must be considered carefully.

 Note 3: Making sure that all items in the set are coded in the same

direction. Although 0/1 (wrong/right) coding is rarely a problem with

this, for Likert or other scales with more than 2 points on the scale, it is

not uncommon for the scale to remain constant (e.g., Strongly Agree,

Agree, Disagree, Strongly Disagree), but for the wording of the

questions to reverse the appropriate interpretation of the scale. For

example,

Q1. Social Security System Must be reformed SA A D SD

Q2. Social Security System Remain the Same SA A D SD

Clearly, the two questions are on the same scale, but the meanings of the

end points are opposite.


In SAS, the way to adjust for this problem is to pick the direction that we

want the scale to be coded, that is, do we want SA to be a positive

statement about the Social Security System or a negative one, and then

reverse scale those items where SA reflects negatively (or positively)

about Social Security System. In the above example, SA for Q1 is a

negative position relative to the Social Security System and, therefore, should be reverse scaled if the decision is to scale so that SA implies positive attitudes.

If the coding of the 4-point Likert Scale was SA-0, A-1, D-2, SD-3, then

the item will be reverse scaled as follows:

Q1 = 3-Q1; in this way 0 becomes 3-0 = 3; 1 becomes 3-1 = 2; 2 becomes 3-2 = 1; and 3 becomes 3-3 = 0.

If the coding of the 4-point Likert Scale was SA-1, A-2, D-3, SD-4, then

the item will be reverse scaled as follows:Q1 = 5-Q1, in this way 1 becomes 5-1 = 4; 2 becomes 5-2 = 3; 3

 becomes 5-3 = 2; and 4 becomes 5-4 = 1.

From the earlier example, if items X1, X3, X5, X7, and X9 needed to be reverse scaled before computing an internal consistency estimate, then the following SAS code would do the job, assuming the 4-point Likert scale illustrated above with 1-4 scoring.

DATA WORK.ONE;SET WORK.ONE;

ARRAY X {10} X1-X10;/* DEFINING AN ARRAY FOR THE 10 ITEMS */

DO I=1,3,5,7,9; /* INDICATING WHICH ITEMS

IN THE ARRAY TO BE REVERSE SCALED */

X(I) = 5-X(I); /* REVERSE SCALING

FOR 1-4 CODING OF 4-POINT LIKERT SCALE */

END;

RUN;

It should be noted that some of the output from PROC CORR with the

ALPHA option, such as the correlation of the item with the total and the

internal consistency estimate for the scale with each individual item

NOT part of the scale, provides very useful diagnostics that should alert the researcher to either poorly functioning items or items that were missed when considering reverse scaling. An item that correlates negatively with the total usually needs to be reverse scaled or is poorly

formed.


Further Readings:
Feldt L., and R. Brennan, Reliability, in Educational Measurement, Linn R. (Ed.), 105-146, 1989, Macmillan Publishing Company.

The Inter-Rater Reliability 

The inter-rater reliability between survey interviewers is rarely computed

 because different interviewers do not usually go back to ask respondents

the same questions and groups of respondents interviewed by different

interviewers are not always comparable. Especially in personal interview

surveys, interviewers may be assigned to different areas of a city or

region that differ a great deal compositionally. Survey designers should,

however, consider what might give rise to random variation in

interviewers' performance before starting the study and standardize the training and field procedures to reduce these sources of variation as much as possible.

References and Further Readings: Aday L., Designing and Conducting Health Surveys: A Comprehensive Guide , Jossey-Bass Publishers, CA,1996. 

Instrumentality Theory 

Suppose two corresponding items, one from the dimension being rated and its mate, the relative importance of that topic, called the "valence",

are cross-multiplied, then added up across all such pairs, then divided by

the number of such pairs. This procedure provides a weighted score, the

sum of the items each weighted by its relative importance. The higher

the average weighted score, the greater the overall importance and rating

of the topic. The technique has been well-liked since two issues are

being considered here: how satisfied or prepared or . . . someone is, and

how important that topic is to them. The approach has been applied to

multivariate issues such as factors affecting leaving an organization, job

satisfaction, managerial behavior, etc.
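A tiny numerical sketch of this weighted-score procedure (the ratings and valences are invented):

# Paired items: (rating of the dimension, relative importance or "valence" of that dimension)
pairs = [(4, 5), (3, 2), (5, 4), (2, 3)]           # invented values on a 1-5 scale

weighted_score = sum(rating * valence for rating, valence in pairs) / len(pairs)
print(f"Weighted score = {weighted_score:.2f}")     # (20 + 6 + 20 + 6) / 4 = 13.0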

References and Further Readings:
Korn E., and B. Graubard, Analysis of Health Surveys, Wiley, 1999.


Value Measurements Survey Instruments:

Rokeach's Value Survey 

Anthropologists have traditionally observed the behavior of members of

a specific society and inferred from such behavior the dominant or

underlying values of the society. In recent years, however, there has been a gradual shift to measuring values directly by means of survey

questionnaire research. Researchers use data collection instruments

called value instruments to ask people how they feel about such basic

 personal and social concepts as freedom, comfort, national security, and

 peace.

Research into the relationship between people's values and their actions

as consumers is still in its infancy. However, it is an area that is destined

to receive increased attention, for it taps a broad dimension of human

behavior that could not be explored effectively before the availability of standardized value instruments.

A popular value instrument that has been employed in consumer

behavior studies is the Rokeach Value Survey (RVS). This self-

administered value inventory is divided into two parts, with each part

measuring different but complementary types of personal values. The

first part consists of eighteen terminal value items, which are designed to

measure the relative importance of end-states of existence (i.e., personal

goals). The second part consists of eighteen instrumental value items,

which measure basic approaches and individual might take to reach end-

state values. Thus, the firs half of the measurement instrument deals

with ends, while the second half considers means.

The items are not reworded to accommodate the Likert format; instead, respondents are asked to indicate the degree of personal importance each RVS value holds, from "very unimportant" to "very important," and then they're given the standard Likert scale next to each RVS value. Some applications use, for example, a 5-point scale and then feature a rank-ordering of the top three RVS values after each list has already been rated, to use in correcting for end-piling. It has been shown that in many cases there are slightly, but not significantly, lower test-retest reliabilities for the Likert versus the rank-ordered procedure.

Since the common reason for preferring to use the RVS in a Likert

format is to be able to perform normative statistical tests on the data, it is

worthwhile to point out that there are good arguments in favor of using


normative statistical tests on RVS data with the scale in its original,

rank-ordered format, under some conditions.

Further Readings:
Arsham H., Questionnaire Design and Surveys Sampling, SySurvey: The Online Survey Tool, 2002.
Braithwaite V., Beyond Rokeach's equality-freedom model: Two-dimensional values in a one-dimensional world, Journal of Social Issues, 50, 67-94, 1994.
Boomsma A., M. Van Duijn, and T. Snijders (eds.), Essays on Item Response Theory, Springer Verlag, 2000.
Gibbins K., and I. Walker, Multiple interpretations of the Rokeach value survey, Journal of Social Psychology, 133, 797-805, 1993.
Sijtsma K., and I. W. Molenaar, Introduction to Nonparametric Item Response Theory, Sage, 2002. Provides an alternative to parametric Item Response Theory: non-parametric (ordinal) Item Response Theory, such as the Mokken Scaling method.

Danger of Wrong Survey Design and the Interpretation of the

Results 

One of the first things that learners of survey design and sampling must

recognize is that statistical results can very easily be interpreted wrongly.

Sayings such as “You can prove anything with figures” have gained

widespread circulation because they embody the bitter experience of

 people who have found themselves misled by incorrect deductions

drawn from basically correct data.

Consequently many people tend to distrust statistics, and to regard

statisticians as naïve and incautious. In fact, statisticians are trained:

  to be extremely careful in selecting information on which to base

their calculations.

  to make only such deductions as are strictly logical.

Danger of Biased Sources: One of the chief dangers facing a statistician

is that the sources of his/her information may be biased. A statistician

must therefore always ask himself such questions as:

  Who says this?
  Why does he say it?
  What does he stand to gain from saying it?
  How does he know?
  Could he be lying?

Danger in Designing a "Bad" Questionnaire: In designing a questionnaire, the following points should be observed:


  questions should be simple

  questions should be unambiguous

  the best kinds of question are those which allow a pre-printed

answer to be ticked

  the questionnaire should be as short as possible

  questions should be neither irrelevant nor too personal
  Leading questions should not be asked. A "leading question" is

one that suggests the answer, e.g. the question “Don’t you agree

that all sensible people use XYZ soap?” suggests the answer "yes" 

  The questionnaire should be designed so that the questions fall

into a logical sequence. This will enable the respondent to

understand its purpose, and as a result the quality of his answers

may be improved.

The Copyright Statement: The fair use, according to the 1996 Fair Use

Guidelines for Educational Multimedia, of materials presented on this

Web site is permitted for non-commercial and classroom purposes only.

This site may be mirrored intact (including these notices), on any server

with public access. All files are available

at http://home.ubalt.edu/ntsbarsh/Business-stat for mirroring.

Kindly e-mail me your comments, suggestions, and concerns. Thank

you.

 Professor Hossein Arsham 

This site was launched on 2/18/1994, and its intellectual materials have

 been thoroughly revised on a yearly basis. The current version is the

9th Edition. All external links are checked once a month.