
CHAPTER 10 SAMPLING METHODOLOGY - Datavision · 2020. 9. 30. · Marketing Research for Managers | Page 107 CHAPTER 10 SAMPLING METHODOLOGY 10.1 OVERVIEW The sampling process is of



CHAPTER 10

SAMPLING METHODOLOGY

10.1 OVERVIEW

The sampling process is of extreme importance for the success of any survey, as any mistake in the sampling process can have far-reaching effects on the integrity of the data and cause the data to be unreliable. The figure below provides a schematic overview of the sampling process. In essence, eight sequential steps are to be followed:

Stage in the research process | Steps in the sampling process | Description
Problem formulation; determination of the research design; determination of data-collection method | 1. Identify the target market; 2. Define the target market; 3. Select data collection technique | Preparation of the sampling procedure.
Sample plan design | 4. Determine the sampling frame; 5. Select the sample technique; 6. Determine the sample size | Actual sampling procedure.
Implementation of fieldwork | 7. Execution of the sample | Specific sample instructions to field agency to ensure the proper procedures have been followed.
Data analysis and research report | 8. Validation of the sample | Sample validation once data has been processed.

Figure 10.1: Sampling design process

Note that although these eight steps are generally referred to as a sampling procedure, these steps take

place at different stages in the research process as indicated.

The eight consecutive stages of the sampling process will now be looked at in detail.

10.2 THE STEPS IN THE SAMPLING PROCESS

There are eight sequential steps in the sampling process:

10.2.1 STEP 1: IDENTIFY THE TARGET MARKET

The first step in sampling methodology is to identify the type of target market. Here one differentiates

between demand side research studies and supply side research studies. When it comes to demand side

studies one would typically execute a consumer or a household survey. However, a corporate survey is a


108 | Page Haydam and Mostert

typical example of a supply side study. This includes surveys with merchants, manufacturers,

companies or distributors. The following table provides an overview of these three survey types.

Table 10.1: Demand side and supply side survey types

Survey type | Description | Sample unit (*) | Examples
Consumer or visitor survey | Demand side survey. Whenever there is a flow or movement of people or objects to be surveyed. | Defined as a group or party. | A shopper group visiting a store or shopping mall; a visitor's group to an exhibition; a tourist group or travelling party to an event or destination.
Household survey | Demand side survey. In the case where objects or people under investigation are at formal or informal places of residence (permanent or temporary). | Defined as a dwelling. | A permanent (formal or informal) residence such as a house, a flat, or a shack; a non-permanent residence such as a camping or caravan site.
Corporate survey | Supply side survey. Used if one wishes to obtain information about companies and products. | Defined as an establishment. | A formal business entity, such as a dry cleaner or a bed & breakfast; an informal business entity such as a street vendor.

(*) See discussion on sample units in step 2.

Note that these studies do not overlap. In other words, a single study cannot be a household, a corporate and a consumer study at the same time. If you find yourself dealing with, say, both a corporate and a household survey, then you must treat them as two separate surveys, each one with its own methodology.

10.2.2 STEP 2: DEFINE THE TARGET POPULATION

The following 4 aspects have to be covered in this stage, namely:

10.2.2.1 Sample unit

A sample unit is the basic unit which contains the element of the population to be sampled (not

surveyed!). The sample unit may be the sample element itself or may be an available entity containing

the sample element (Malhotra, 2002).

How does this definition then apply? From step 1, if one conducts, say, a face-to-face intercept

consumer satisfaction survey with shoppers at Edgars, then the sample unit will be a shopper group.

However, if it is a face-to-face survey or, say, a telephone sample survey which measures the attitude

amongst households with regards to opening a bottle store in their neighbourhood, then the sample unit

will be a house or flat (referred to as a dwelling). Lastly, if one wants to establish the average price that

hair salons in Sandton charge their clients for a cut and blow or for a permanent colour tint, then the

sample unit will be the hair salon.

It is important to note that a sample unit is only used in the sampling process, and is not used in

obtaining the actual information. The reason is simple: have you ever tried talking to a hair salon

(building), or to a physical house in your neighbourhood? It is clear that these are all objects or concepts

that researchers use in the sampling process. Shopper groups in consumer surveys must be seen in the

same light.

In more advanced sample techniques such as the multi-stage or cluster sample, you will need to

differentiate between a primary, secondary, tertiary, or even a final sample unit. Assume that one has to

execute a mystery shopper survey for Foschini’s, then these definitions will apply as follows:


Firstly, each shop to be surveyed (nationally) will be referred to as a primary sample unit.

Each department, e.g. fine jewellery, within each shop to be surveyed will be labelled as the secondary

sample unit.

The sales counter where the goods are displayed and where the service of staff will be evaluated will then

be referred to as the tertiary sample unit.

Final sampling unit: If any more detailed divisions or classifications are required, that would then be

referred to as a final sample unit.

Note: A sample unit is always expressed in the singular (i.e. unit) and never in the plural (i.e. units). The reason for this is that sample units collectively make up a sample frame (see step 4), which is completely different from a sample unit.

10.2.2.2 Sample element

A sample element is the smallest single entity, i.e. an object or person who holds the information to be

collected. If the information is held by, say, an object, e.g. observing how long shoppers stand in a queue

at Woolworths before being served, or the number of hits on their webpage, then one will merely observe

the information required (i.e. record the time spent in a queue and establish the number of ‘hits’ on the

webpage). However, if the information is held by a person, then one will obtain the information from the

person concerned, using any one of the interviewing techniques discussed in Chapter 8.

To delineate a sample element, one has to define the exclusions and inclusions which are applicable to

the study. Standard exclusions for household and consumer surveys include:

• Respondents who have been surveyed in the past 6 months. These may include people who, for

some reason, are easy targets for market research. They are approachable, friendly and well

informed about what marketing research is about because they have extensive experience, but

they provide biased information.

• Respondents or immediate family who work for a marketing research agency.

• Respondents or immediate family members who are directly involved in the field of study. For

example, if one does an advertising test, then everybody who works in the advertising or related

industries must be excluded.

Usually some researchers will exclude full-time students (who usually think they know a lot, do little, and

are usually very opinionated, causing results to be meaningless in the end). Other exclusions in household

surveys would be full-time domestics who actually cannot talk on behalf of the owner (if you think about

it).

Examples of standard inclusions are:

• A person must have used or purchased the product under investigation in the past 6 months.

• A respondent must be female in the case of a survey involving make-up and fashion.

• The person must be the head of the household, or be responsible for grocery shopping in the

household.

• A person must be over the age of 18 years (in the case of a grown-up survey).


For corporate studies one would ask for the person responsible for using or purchasing the product or service and whether he/she has used or purchased the product in the past, say, three months.

The above-mentioned inclusions and exclusions will become the questionnaire screening questions (see

questionnaire design) to be asked at the start of any survey. For example, from the above-mentioned,

screening questions (see Chapter 11) might look as follows:

SQ.1 Do you own dogs?
    Yes → Proceed to SQ.2
    No → Terminate and replace interview.

SQ.2 In the past 6 months, have you purchased any Hill’s dog food?
    Yes → Proceed with survey
    No → Terminate and replace interview.

It is therefore not uncommon to have a whole range of so-called screening questions which the respondent

has to “pass” in order to qualify for the interview.
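The screening flow above can be sketched in a few lines of Python. This is only an illustration: the `ScreeningQuestion` class, the `SCREENER` list and the `screen` function are hypothetical names introduced here, and the sketch assumes that every screening question is answered with a simple yes or no.

```python
from dataclasses import dataclass


@dataclass
class ScreeningQuestion:
    code: str                        # question label, e.g. "SQ.1"
    text: str                        # wording read to the respondent
    qualifying_answer: str = "yes"   # answer required to continue


# The two example screening questions from the text.
SCREENER = [
    ScreeningQuestion("SQ.1", "Do you own dogs?"),
    ScreeningQuestion(
        "SQ.2", "In the past 6 months, have you purchased any Hill's dog food?"
    ),
]


def screen(answers):
    """Return (qualifies, instruction); stops at the first failed question."""
    for question in SCREENER:
        answer = answers.get(question.code, "").strip().lower()
        if answer != question.qualifying_answer:
            return False, f"Terminate and replace interview at {question.code}"
    return True, "Proceed with survey"


print(screen({"SQ.1": "Yes", "SQ.2": "Yes"}))
print(screen({"SQ.1": "Yes", "SQ.2": "No"}))
```

A respondent only reaches the main questionnaire when every question in the list is passed, which mirrors the "terminate and replace" instruction attached to each question.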

See also step 7: selecting the sample element (section 10.2.7.4).

Note that sometimes a sample element and a sample unit can be one and the same thing. For

instance, if you want to track the profile of visitors to the Design for Living exhibition, then your sample

unit will be a visitor’s group. This definition applies whether the group has 1, 2 or 5 or even 10 visitors in

the group. However, in the case where the visitor’s group is only one person, then one has a situation

where the sample unit is the same as the sample element. Hence the person chosen = person to be

interviewed!

Note: For compatibility with other studies, it is recommended that one uses the same definitions of sample units and elements as used in the past.

Note: In the case of a corporate survey, the title (e.g. purchasing or perishable manager) or position (e.g. senior management) of the person to be interviewed may be listed as inclusions. Personal demographics, such as the respondent’s age, may not count in corporate surveys.

10.2.2.3 Extent

The extent of a research study is defined as the area or coverage of the study field.

For both corporate and household studies, the extent can be defined in terms of an area or

geographical location. Note that the coverage area must be fully defined, for example Bloubergstrand,

Milnerton and Tableview suburban areas. However, in the case of a consumer survey, one refers to the

drawing power or catchment area of the shop, mall or event that is the focus of the study. In the case

of Tyger Valley shopping centre, the extent will be the drawing power of the centre, also called the

centre’s “shopper footprint”.

Note: In the case of corporate or household surveys, when defining the extent of a study, it is recommended to make the area compatible with national and municipal boundaries, e.g. so-called magisterial districts.

10.2.2.4 Time frame

The time frame is defined as the period during which the fieldwork is expected to run. This should not be confused with when the research study starts or finishes, or with the duration of the research. The reason for this is simple: the time of fieldwork defines the context in which the results will be viewed. For example,


assume that you were busy doing research in New York amongst tourists asking them about how safe

they feel in the city. It is obvious that one would have obtained different results if the fieldwork was

conducted the week before or after 9/11 2001! Hence the importance of stating the time the research will

be conducted.

Timing also plays an important role in self-completion and in mail surveys, where the cut-off date, i.e.

when the last questionnaires will be accepted, is determined. In cases where the time frame is not

mentioned, then the time of fieldwork should be estimated.

10.2.3 STEP 3: SELECTION OF A DATA COLLECTION METHOD

This step deals with the chosen data collection method (or given as per client request) to be used by the

researcher. The data collection method needs to be motivated to justify the suitability of the method for

the survey as per Chapter 8. Table 10.2 provides an overview:

Table 10.2: Strengths of descriptive data collection methods

Data collection method | Example of motivation of usage
Telephone interview | Fast data collection method. Good control over fieldworkers.
Face-to-face interview | Allows the showing of visuals and explaining of difficult concepts.
Mail survey | Very cost effective if used with regular mail shots and covers a wide area.
Self-administered survey | No interviewer influence in results and very cost effective.

The same will be applicable for observational and qualitative research studies.

The choice of the data collection method influences the sample frame to be used, the sample unit and

sample element (and the selection thereof – see execution of sample). It is also a vital link to

questionnaire design.

10.2.4 STEP 4: CHOOSE THE APPROPRIATE SAMPLE FRAME

Five consecutive steps are followed here, namely (i) defining the sample frame; (ii) making the sample

frame all-inclusive; (iii) updating the sample frame; (iv) preparing the sample frame; and (v) providing a

sample frame summary.

10.2.4.1 Defining the sample frame

Firstly, the sample frame must suit the data collection method as identified in step 3. For instance, if a

face-to-face interviewing technique has been recommended, then the sample frame must have the

appropriate fields, e.g. names and addresses of businesses (as in the case of a corporate survey) or

dwellings (in the case of a household survey). Otherwise, the sample cannot be selected.

Note: A sample frame is a list of all sample units in the target population. Businesses commonly know it as a database or a list of clients’ (i.e. businesses’ or households’) contact details and addresses. In the case of telephone sample surveys, it would be the contact number and name of the respondent to be surveyed.

Note: Some texts use the term ‘sample frame’ while others prefer ‘sampling frame’. Both are accepted here.


Secondly, one has to check whether the noted sample frame matches the defined target population in

question. For example, a poorly defined target population would be “to survey all housewives in

Durbanville (in order to establish their spending patterns)” or “to conduct a survey amongst all medium

and large companies (in the Greater Cape Town area in order to determine their procurement policies)”.

The question here is, what will act as suitable sample frames for these studies?

There are two ways to address poorly defined target populations:

(i) One could change the (probability) sample technique to a non-probability sample technique by

using, for example, a quota or convenience sample, and/or

(ii) One could redefine the target population to match the sample frame. For example, instead of

defining your target population as being medium sized and large businesses in Cape Town, one

might define it as ‘businesses belonging to the Cape Chamber of Commerce and Industry’. It

might not be ideal, but definitely more workable.

10.2.4.2 Making the sample frame all inclusive

This is done by merging various databases or sources to compile one master database (sample frame) to

be used in the sampling process. Take, for instance, a (corporate) study to be executed amongst bed &

breakfast owners in South Africa. In order to make the sample frame all inclusive, one would obtain the

names and addresses of all bed & breakfasts from the B & B Association of SA, the Yellow Pages, various

brochures and pamphlets, or magazines and newspapers to arrive at a master copy to be used.

In order to compile a sample frame from scratch, it is important that you ensure that you have dedicated data

set fields. In other words, make sure that all the company names appear in one dedicated column, telephone

numbers in another, postal codes separate, address fields specifically labelled, e.g. Address1 (street name),

address2 (building name), address3 (suburb), address4 (town/place), etc.

Beware of using the internet as a source, as many business websites are outdated.
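As a rough illustration of a master sample frame with dedicated fields, the sketch below merges records from several sources into one list in which every record carries the same named fields. The field names follow the labels suggested above; the two source lists and their contents are invented placeholders, not real data.

```python
# Dedicated fields, one per column of the master sample frame.
FIELDS = [
    "name", "telephone", "postal_code",
    "address1",  # street name
    "address2",  # building name
    "address3",  # suburb
    "address4",  # town/place
]


def normalise(record):
    """Map one source record onto the dedicated fields (blank if missing)."""
    return {field: str(record.get(field, "")).strip() for field in FIELDS}


def build_master_frame(*sources):
    """Merge any number of source lists into one master sample frame."""
    master = []
    for source in sources:
        master.extend(normalise(record) for record in source)
    return master


# Hypothetical source lists, e.g. an association list and a directory.
association_list = [
    {"name": "Seaview B&B", "telephone": "021 000 0001", "address4": "Cape Town"},
]
directory_list = [
    {"name": "Vineyard Rest ", "address1": "1 Main Road", "address4": "Paarl"},
    {"name": "Seaview B&B", "telephone": "021 000 0001", "address4": "Cape Town"},
]

master_frame = build_master_frame(association_list, directory_list)
print(len(master_frame), "records with fields", list(master_frame[0]))
```

Merging will usually introduce duplicates (the same establishment appears in both hypothetical lists here), which is why the merged frame still has to be cleaned afterwards.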

10.2.4.3 Updating the sampling frame

Having defined and compiled the sample frame, the next step is to update or “clean” the sample frame.

Three consecutive steps are to be followed in this regard:

(i) Firstly, all duplicate records need to be removed from the sample frame. Two types of

duplications are in question here, namely page and sample unit duplications. The former is quite

simple to solve, as one would look at the sequential numbers on a sample frame of all the sample

units listed, and merely delete the duplications! The latter has to be handled with caution, as you

first need to define exactly what constitutes duplication. A company could have the same name

but different addresses, e.g. Holiday Inn in Cape Town and Holiday Inn in Sea Point. This, of

course, does not qualify as a duplication! Duplications must have the same name and address

and telephone numbers. Therefore, to eliminate duplications, one would begin by sorting the

sample by name - and eliminate the relevant duplications. Another follow-on sort by telephone

number could then be applied and all the relevant duplications would be eliminated, and so forth.

(ii) Eliminate all the foreign elements from the sampling frame. As we are dealing with the sample frame at this step, this means eliminating sample units (and not sample elements) that do not belong to the defined population. There are three survey components that are affected by foreign elements, namely (i) the sample unit, (ii) the extent and (iii) the stipulated time frame. The aim here is to ensure that all the sample units under investigation are treated in the same way. If one surveys hotels, then a guest house or any other non-hotel establishment for that matter would be regarded as a foreign element (referred to as sample unit exclusion).


Likewise, if the extent is defined as Cape Town, then Worcester as a place outside Cape Town

has to be excluded (i.e. coverage exclusion). In the case of self-completion surveys, e.g. mail and

email surveys, one would have to ensure that all the questionnaires were received within the

given stipulated deadline (time exclusion).

(iii) Identify and correct all the missing fields of sample units in the sample frame. This includes

adding missing telephone numbers and address fields, or deleting records with telephone

numbers or addresses which cannot be found, or where a sample unit is found to be dormant i.e.

ex-customer, or a company that has closed down, etc.
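The three cleaning steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the record keys (`name`, `address`, `town`, `telephone`, `type`, `dormant`) and the example records are hypothetical, a duplicate is taken to mean the same name, address and telephone number, and the time exclusion for returned questionnaires is left out because it applies to responses rather than to the frame itself.

```python
def clean_frame(frame, unit_type, extent):
    """Clean a sample frame: deduplicate, drop foreign units, drop unusable records."""
    # (i) Remove duplicates: same name AND address AND telephone number.
    seen = set()
    unique = []
    for rec in frame:
        key = (rec["name"].lower(), rec["address"].lower(), rec["telephone"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)

    # (ii) Remove foreign units: wrong sample unit type or outside the extent.
    in_scope = [r for r in unique if r["type"] == unit_type and r["town"] in extent]

    # (iii) Drop records with missing contact fields or dormant units.
    return [
        r for r in in_scope
        if r["telephone"] and r["address"] and not r.get("dormant", False)
    ]


frame = [
    {"name": "Holiday Inn", "address": "1 Strand St", "town": "Cape Town",
     "telephone": "021 000 0001", "type": "hotel"},
    {"name": "Holiday Inn", "address": "9 Beach Rd, Sea Point", "town": "Cape Town",
     "telephone": "021 000 0002", "type": "hotel"},      # same name, not a duplicate
    {"name": "Holiday Inn", "address": "1 Strand St", "town": "Cape Town",
     "telephone": "021 000 0001", "type": "hotel"},      # true duplicate
    {"name": "Rose Guest House", "address": "3 Hill Rd", "town": "Cape Town",
     "telephone": "021 000 0003", "type": "guest house"},  # sample unit exclusion
    {"name": "Valley Hotel", "address": "5 High St", "town": "Worcester",
     "telephone": "023 000 0004", "type": "hotel"},      # coverage exclusion
    {"name": "Harbour Hotel", "address": "7 Dock Rd", "town": "Cape Town",
     "telephone": "", "type": "hotel"},                  # telephone cannot be found
]

cleaned = clean_frame(frame, unit_type="hotel", extent={"Cape Town"})
print([r["address"] for r in cleaned])
```

In practice a record with a missing field would first be looked up and completed; the sketch only shows the fallback of deleting records that cannot be completed.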

Note: This step (step 10.2.4.3) is not applicable for consumer surveys.

Note: The difference between a sampling frame and a target population is referred to as a sample frame gap.

10.2.4.4 Preparing the sample frame

In order for a sample frame to be effective, it must be prepared beforehand. This entails two steps,

namely:

(i) All entities on the sampling frame must be exactly specified and sorted in the way in which the

sample would be drawn. For instance, if the sample procedure is a stratified sample by type of

business and area, then the sample frame must be sorted (stratified) accordingly - to the

appropriate strata to be used i.e. by type of business and area.

(ii) Secondly, all sample units must be labelled numerically from 1 to the end of the sample frame, as one requires these numbers when drawing a sample (using a table of random numbers).
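The two preparation steps can be illustrated with a short sketch. It assumes a frame held as a list of records with hypothetical `type` and `area` keys, and it swaps the printed random number tables for Python's pseudo-random generator, which serves the same purpose of picking unit numbers at random.

```python
import random


def prepare_frame(frame, strata_keys):
    """Sort the frame by the strata and number the units from 1 to N."""
    ordered = sorted(frame, key=lambda rec: tuple(rec[k] for k in strata_keys))
    return [{**rec, "unit_no": i} for i, rec in enumerate(ordered, start=1)]


def draw_sample(prepared, n, seed=None):
    """Draw n distinct unit numbers at random and return the matching units."""
    rng = random.Random(seed)  # stand-in for a table of random numbers
    chosen = set(rng.sample(range(1, len(prepared) + 1), n))
    return [rec for rec in prepared if rec["unit_no"] in chosen]


# Hypothetical frame of six businesses.
frame = [
    {"name": "B1", "type": "retail", "area": "Bellville"},
    {"name": "B2", "type": "hotel", "area": "Cape Town"},
    {"name": "B3", "type": "retail", "area": "Cape Town"},
    {"name": "B4", "type": "hotel", "area": "Bellville"},
    {"name": "B5", "type": "retail", "area": "Bellville"},
    {"name": "B6", "type": "hotel", "area": "Cape Town"},
]

prepared = prepare_frame(frame, strata_keys=("type", "area"))
sample = draw_sample(prepared, n=3, seed=1)
print([(rec["unit_no"], rec["name"]) for rec in sample])
```

Fixing the seed makes the draw repeatable, which can help when the sample selection has to be documented for a client.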

Note: This step (step 10.2.4.4) is not applicable for consumer surveys.

10.2.4.5 Providing a summary of a sampling frame

For both household and corporate surveys, it is important to list the absolute numbers i.e. (N) and

relative number, i.e. percentages (%) of the sample frame in a tabular format. This summary is a very

useful aid in determining the accuracy of the selected sample when comparing it with the results.

A sample frame summary might look something like this:

Area | Number of bed & breakfasts (N) | Percentage (%)
Cape Town | 1,037 | 50.6%
Bellville | 316 | 15.4%
Boland Wine Route | 697 | 34.0%
Total | 2,050 | 100.0%

As a rule, all percentages are expressed to one decimal place!
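The summary table above can be reproduced with a short calculation. The sketch below uses the counts from the table; the function name `frame_summary` is a hypothetical helper, and percentages are rounded to one decimal place as the rule requires.

```python
def frame_summary(counts):
    """Return (area, N, %) rows plus a total row, percentages to one decimal."""
    total = sum(counts.values())
    rows = [(area, n, round(100 * n / total, 1)) for area, n in counts.items()]
    rows.append(("Total", total, 100.0))
    return rows


counts = {"Cape Town": 1037, "Bellville": 316, "Boland Wine Route": 697}

for area, n, pct in frame_summary(counts):
    print(f"{area:<20}{n:>8,}{pct:>8.1f}%")
```

Note that rounded percentages do not always add up to exactly 100.0; the total row is therefore stated as 100.0 rather than summed from the rounded values.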


Note: One must ensure that all the strata or segments (areas in the case of the above-mentioned table) are large enough to be sampled, as small segments and strata are meaningless and only fuel confusion. A rule of thumb is that no more than 10 main classifications should be considered for the sample frame summary.

As consumer surveys generally do not have any sample frames, they provide a different challenge when it comes to sampling. Remember from step 1 that consumer surveys are executed where there is a movement or flow of objects or people (as in the case of shopping malls, where thousands of shoppers visit the mall every single day). The appropriate sampling technique for this type of study is a time-based sampling technique (see step 5). This entails a sampling frame built from so-called ‘flow indicators’. Table 10.3 provides examples of such flow indicators:

Table 10.3: Flow indicators of consumer surveys

Consumer study | Example | Flow indicators
Shopper surveys | Malls or shopping centres | Past total daily numbers of shoppers who entered the shopping centre/mall*, and/or past total daily till counts of the key stores in the shopping centre/mall.
Shopper surveys | Retail outlets such as Edgars | Past total daily numbers of shoppers who entered the store*, and/or past total daily till counts of the store.
Events & exhibitions | Design for Living | Past total numbers of attendees per day.
Events & exhibitions | Grahamstown Arts Festival or Klein Karoo Nasionale Kunstefees (KKNK) | Past total numbers of visitors per day or past total ticket sales for organised events.

* This can be obtained by installing head-counters at the entrances of the shopping centre or shop respectively.

Note: Although non-probability samples do not require any sample frames, a target population summary is required in the case of a quota sample.

Steps 5 and 6 both deal with sampling issues. Step 5 deals with the sampling technique and step 6 with

issues surrounding the sample size.

10.2.5 STEP 5: SELECTING THE SAMPLING METHOD

The first question to be answered in this regard is whether a sample or a census will be used for the

survey.

10.2.5.1 Sample or census

If a census is recommended, then the appropriate main reason(s) should be listed. The choice of using a

census or a sample is depicted in Table 10.4.

Note: If a census is chosen, then you are to proceed to step 7 from here.


Table 10.4: Choosing between a census and a sample

Factor | Description | Sample | Census
Budget | How much money does the company have available for research? | Small | Large
Time available | When is the deadline? | Short | Long
Population size | How many sample units are there in the target population? | Large | Small
Variance in the characteristics | If the elements of a population are quite similar, one talks about homogeneous populations. Homogeneous populations have small population variances. | Small | Large
Cost of the sampling error | For example, when a very large company, e.g. SANLAM, is not included (by chance) in the sample. | Low | High
Cost of the non-sampling errors | Includes the total interviewer, respondent and researcher’s errors. | High | Low
Nature of measurement | Products that are to be consumed or test marketed, e.g. tasting of a chocolate, are deemed to be destructive. | Destructive | Non-destructive

Source: Malhotra (2002)

10.2.5.2 Sampling technique proposed

In this section one must list the appropriate sampling technique to be used (even if it is given to you by

the client). For example “a systematic stratified sample by type of business and area is to be used for the

study.”

Note: The execution of the appropriate sample technique is discussed in step 7 and not in this section.

10.2.6 STEP 6: DETERMINING THE SAMPLE SIZE

Sample size calculation has four interrelated issues, namely (i) the initial sample size, (ii) the confidence

interval of the agreed sample size, (iii) the sample size corrector (for large samples only), and (iv) the

sample size summary (which must be aligned to the sample frame summary).

10.2.6.1 The initial agreed sample size

In some instances the appropriate or desired sample size will be given to you. However, as a researcher,

it would be expected of you to determine your own sample size (see the notes on sample size

determination) or at least to advise your client on the appropriate sample size of any given study.

10.2.6.2 Determining the confidence interval

Once the sample size has been given or determined, the typical question of a client in this regard would

be “So what?” or “Why not a larger (or smaller) sample?” Here the confidence interval comes in very

handy. There are two approaches to calculating the confidence interval, namely proportions and averages.


Calculating the confidence interval using proportions

The following formula is used to calculate the confidence interval for estimating population proportions

(i.e. percentages):

$$P \pm Z\sqrt{\frac{\pi(1-\pi)}{n}}$$

Where

P = expected sample proportion

Z = Z value associated with the level of confidence.

n = proposed sample size.

π (i.e. the symbol Pi and not the value) = estimated population proportion or “incidence”.

The population proportion may be estimated from secondary data (e.g. past studies), or obtained from a

pilot study or be based on the judgment of the researcher. In the case of total uncertainty, the most

conservative estimate of π should be used, namely 0.50.

For example, given the sample frame as per section 10.2.4.5, assume that the proposed sample size for

our survey is 224 interviews, then the confidence interval for the study is calculated as follows, assuming

the most conservative approach:

$$P \pm 1.96\sqrt{\frac{0.5(1-0.5)}{224}}$$

where 1.96 is the Z value associated with a 95% confidence level. A 95% confidence level should always be assumed (unless specified otherwise).

Secondly, as the population proportion (π) is unknown, we use 0.50 as the most conservative estimate thereof. Putting it all together, and taking P = 0.50, we then have

$$P \pm 1.96\sqrt{0.25/224} = P \pm 1.96 \times 0.0334 = P \pm 0.065$$

= [0.435 ≤ p ≤ 0.565]

or, expressed as a percentage: [43.5% ≤ p ≤ 56.5%].
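The calculation can be checked with a few lines of Python. This is a sketch of the proportion formula only; the function name and its parameter names are illustrative choices, with `pi` standing for the estimated population proportion and `p` for the expected sample proportion.

```python
import math


def proportion_confidence_interval(p, n, z=1.96, pi=0.5):
    """Return (lower, upper) for p +/- z * sqrt(pi * (1 - pi) / n)."""
    margin = z * math.sqrt(pi * (1 - pi) / n)
    return p - margin, p + margin


# Worked example from the text: n = 224, 95% confidence, pi = 0.50.
low, high = proportion_confidence_interval(p=0.5, n=224)
print(f"{low:.3f} <= p <= {high:.3f}")  # prints "0.435 <= p <= 0.565"
```

Re-running the function with a larger or smaller n shows a client directly how the margin narrows or widens, which helps answer the "why not a larger (or smaller) sample?" question.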

So what does it all mean? Putting it in simple terms, one can say:

If one were to repeat the sample survey of 224 interviews (n) 100 times, then about 95 of the 100 sample survey results (i.e. the confidence level) would fall within the margin of 43.5% ≤ p ≤ 56.5% (the confidence interval). Hence, if the percentage obtained from the survey sample falls outside this stated interval, say, at 64.3% or 42.9%, then the results are deemed to be significant. In other words, the market has shifted or the sample error of the study is believed to be unacceptably high (see also Chapters 13 and 15 in this regard).

How does one arrive at a Z value of 1.96? Refer to Table 10.5 below. The total area under the Normal distribution curve is equated to unity, or numerically 1.0. Since we usually use a two-tailed test (i.e. we accept outliers at either end of the spectrum), the table covers only one half of the curve, an area of 0.5. As we specified a 95% confidence level, the area to look up is calculated as [95% x 0.5], which gives 0.475 (i.e. the area under investigation). Looking up 0.4750 in the body of the table, we find it in row 1.9 under column 0.06, and so arrive at a Z value of 1.96! Similarly, if a 90% confidence level is specified (an area of 0.45), then the accompanying Z value would be 1.65 (more correctly 1.645).
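The table lookup can also be reproduced in code. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_for_confidence` is hypothetical and only illustrates the two-tailed reasoning described above.

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Z value for a two-tailed confidence level, e.g. 0.95 -> about 1.96."""
    half_area = confidence / 2            # area between the mean and Z (0.475 for 95%)
    # inv_cdf expects the cumulative area from the far left, i.e. 0.5 + half_area
    return NormalDist().inv_cdf(0.5 + half_area)

z95 = z_for_confidence(0.95)
z90 = z_for_confidence(0.90)
print(f"95% confidence: Z = {z95:.2f}")
print(f"90% confidence: Z = {z90:.3f}")

# Reverse check against the table: area between 0 and Z = 1.96
area = NormalDist().cdf(1.96) - 0.5
print(f"area for Z = 1.96: {area:.4f}")
```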


Table 10.5: Area under the Normal Distribution Curve N (0,1)

Z    0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0  0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1  0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2  0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3  0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4  0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5  0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6  0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7  0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8  0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9  0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0  0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1  0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2  0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3  0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4  0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5  0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6  0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7  0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8  0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9  0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0  0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1  0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2  0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3  0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4  0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5  0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6  0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7  0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8  0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9  0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0  0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990
3.1  0.4990  0.4991  0.4991  0.4991  0.4992  0.4992  0.4992  0.4992  0.4993  0.4993
3.2  0.4993  0.4993  0.4994  0.4994  0.4994  0.4994  0.4994  0.4995  0.4995  0.4995
3.3  0.4995  0.4995  0.4995  0.4996  0.4996  0.4996  0.4996  0.4996  0.4996  0.4997
3.4  0.4997  0.4997  0.4997  0.4997  0.4997  0.4997  0.4997  0.4997  0.4997  0.4998

You will notice that there is an inverse relationship between the sample size and the margin of error. The bigger (smaller) the sample size, the smaller (bigger) the margin of error. For example, at a 95% confidence level and the most conservative proportion of 0.50, a sample size of 100 units has a margin of error of ±9.8%, and in the case of 400 interviews the margin of error shrinks (making the estimate more precise) to ±4.9%.
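To see this inverse relationship for other sample sizes, the following sketch tabulates the margin of error at a 95% confidence level. It assumes the conservative proportion of 0.50 throughout; the function name and the list of sample sizes are illustrative choices, not taken from the text.

```python
import math

def proportion_margin(n, p=0.5, z=1.96):
    """Margin of error for a proportion at the given Z value."""
    return z * math.sqrt(p * (1 - p) / n)

# Because n sits under a square root, quadrupling the sample
# size only halves the margin of error.
for n in (100, 224, 400):
    print(f"n = {n:>3}: margin of error = ±{proportion_margin(n) * 100:.1f}%")
```

The square root is the practical point here: halving the margin of error costs four times as many interviews.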


Calculating the confidence interval using means

However, marketers are not only interested in proportions or percentages; sometimes they would like to

estimate an average of the population, e.g. average age and size of household, average income and

expenditure, etc. The following formula is to be used then:

$$\bar{x} \pm Z \cdot \frac{s}{\sqrt{n}}$$

Where

$\bar{x}$ = expected sample average (mean).

Z = Z value associated with the level of confidence.

n = proposed sample size.

s = sample standard deviation, which estimates the population standard deviation (σ).

The sample standard deviation may be estimated from secondary data (e.g. past studies), obtained from a pilot study, or be based on the judgment of the researcher. A crude but workable way is to estimate the data range (i.e. the highest less the lowest estimated value) and divide it by 6 (as roughly six standard deviations (σ) span a normal N(0,1) distribution curve). For example, to estimate the sample standard deviation (s) for a convenience store, one merely establishes the highest amount spent per customer in the store over the past month, say, R326.00, subtracts from it the lowest amount spent, say, R13.00, and divides the result by 6: (R326 - R13)/6, which equates to an s value of R52.17.

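Hence, for 224 interviews one therefore has a confidence interval of:

$$\bar{x} \pm Z \cdot \frac{\text{R}52.17}{\sqrt{224}}$$

$$\bar{x} \pm 1.96 \cdot \frac{\text{R}52.17}{14.966}$$

$$\bar{x} \pm 1.96 \times 3.49 = \bar{x} \pm \text{R}6.83$$

$$= [\bar{x} - \text{R}6.83 \le \mu \le \bar{x} + \text{R}6.83]$$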
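The same calculation can be sketched in Python, including the range-divided-by-six shortcut for the standard deviation. The function names below are hypothetical, and the figures (R326, R13, n = 224) are those of the convenience store example.

```python
import math

def range_based_sd(highest, lowest):
    """Crude estimate of the standard deviation: data range divided by 6."""
    return (highest - lowest) / 6

def mean_margin(s, n, z=1.96):
    """Half-width of the confidence interval for a mean: z * s / sqrt(n)."""
    return z * s / math.sqrt(n)

s = range_based_sd(326.00, 13.00)   # highest and lowest spend in rand
margin = mean_margin(s, 224)

print(f"s = R{s:.2f}")
print(f"margin of error = R{margin:.2f}")
```

Whatever sample mean the survey eventually produces, the interval is that mean plus or minus the printed margin.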

10.2.6.3 Applying the sample size corrector

If the size of the population is known beforehand, and the sample size is large enough, then the sample size corrector can be applied. The following formula has to be applied if the sample size is larger than or equal to 10% of the finite population:

$$n_c = \frac{n \cdot N}{N + n - 1}$$

Where

nc = the corrected sample size,

N = the population size (as per section 10.2.4.5),

n = proposed sample size (as per section 10.2.6.1), and

n/N ≥ 0.10 (or n/N x 100/1 ≥ 10%).

As the client has 2,050 clients on its database (see section 10.2.4.5) then the sample size corrector would be:

n/N = 224/2,050 = 0.109 > 0.10 (or 10% of the population)

nc = (224 x 2,050)/(2,050 + 224 - 1)

nc = 459,200/2,273

nc = 202.02

nc = 203 (rounding upwards)

This means that surveying 203 clients will provide the same margin of error as calculated in section 10.2.6.2 of [0.435 ≤ p ≤ 0.565].
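A minimal sketch of the sample size corrector follows. The function name and the `threshold` parameter are illustrative; the logic applies the correction only when n/N is at least 10% and always rounds upwards, as the text prescribes.

```python
import math

def corrected_sample_size(n, N, threshold=0.10):
    """Apply the finite population corrector n*N / (N + n - 1), rounded upwards.

    The correction is only applied when the sample is at least
    `threshold` (10%) of the population; otherwise n is returned unchanged.
    """
    if n / N < threshold:
        return n
    return math.ceil(n * N / (N + n - 1))

nc = corrected_sample_size(224, 2050)
print(nc)
```

Using `math.ceil` rather than ordinary rounding matches the rule of always rounding the corrected sample size upwards.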

Note that one always rounds the corrected sample size upwards, regardless of the decimal. Rounding it downwards implies that the stated confidence interval no longer holds, as the margin of error becomes marginally (by a fraction of a fraction) wider.

In the case of infinite populations or when the population size is unknown (as in the case of consumer surveys), this step (step 6.3) must be ignored.

10.2.6.4 Sample size summary

The final step is to provide a sample size summary. This is done by firstly copying the percentages⁴

obtained from the sampling frame summary as in step 4.5 (section 10.2.4.5). Once the percentages are

known, then the number of interviews to be conducted per category (as per sample frame summary) can

be calculated. The following table illustrates this.

Table 10.6: Example of sample size summary

Area          Percentage (%)*    Number of bed & breakfasts (n)
Cape Town     50.6%              103**
Bellville     15.4%              31
Wine route    34.0%              69
Total         100.0%             203

* These percentages are obtained from the sample frame summary.

** Calculated as 50.6% (as per sample frame) x 203 (given the new adjusted sample size).
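The allocation in Table 10.6 can be reproduced as follows. This is a sketch under the assumption that each area's share is simply rounded to the nearest whole interview; the variable names are illustrative. Simple rounding does not always add up to the total, so the sketch checks the sum explicitly.

```python
# Percentages from the sample frame summary and the corrected sample size
shares = {"Cape Town": 50.6, "Bellville": 15.4, "Wine route": 34.0}
total_sample = 203

# Round each area's share to a whole number of interviews
allocation = {area: round(total_sample * pct / 100) for area, pct in shares.items()}

for area, n in allocation.items():
    print(f"{area:<11} {n:>4}")

# Rounding can leave the total one interview short or over; verify it here
assert sum(allocation.values()) == total_sample
```

If the final check fails for other figures, the difference has to be assigned to one of the categories by hand or by a documented rule.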

The above-mentioned information is important as it not only will assist the field agency in terms of the

number of interviews which must be conducted per sub-region, but also acts as a control mechanism by

comparing the initial number of interviews with the actual processed interviews.

Never express a sample size with decimals (e.g. 31.4). You either interview a person or you do not. A fraction of a person (0.4) does not exist!

10.2.7 STEP 7: EXECUTING THE SAMPLE

Step 7 has to do with sampling procedures, call-backs and the collection of information from objects and

respondents. Five consecutive steps have to be followed, namely: (i) the minimisation of non-response, (ii) the handling of non-response, (iii) selection of the sample unit, (iv) selection of the sample element and (v) control.

4 Not the absolute values.


10.2.7.1 Minimising the non-response

The first step is the minimisation of the non-response of any given survey. With regard to this, the non-

response matrix (see Table 10.7 below) comes in handy to explain what needs to be done.

Table 10.7: Non-response matrix

               Willing                                           Unwilling
Available      Response (see response errors in Chapter 16)      X = reducing refusals
Unavailable    Y = reducing not-at-homes                         XY

The table above is divided into two main groupings, namely, by availability and by willingness.

Availability refers to whether the respondent (or object) is present once having been selected to be

interviewed. Willingness, that is, willing/unwilling, refers to the respondent’s attitude - whether he/she

wants to be interviewed at all.

The combination of these two main groupings results in four different scenarios. The first is where the

respondent is available and willing to be interviewed – hence we cannot talk about a non-response. At the

other end (worst case scenario) the respondent is unwilling to participate in the interview and is also

unavailable for the interview (labelled XY). Regarding the remaining two scenarios, two distinct strategies

can be followed to limit this non-response type, namely strategy X (for those who are available but

unwilling to participate) and strategy Y (for those who are unavailable but willing to participate). Once

these two strategies are fully in place, the worst case scenario XY will also be addressed. Both these

strategies depend upon (i) the survey type i.e. household, consumer or corporate, and (ii) the

interviewing technique being used i.e. telephone, face-to-face, mail, self-completion, observation or

internet surveys.

Table 10.8 lists the appropriate strategies to counteract refusals (strategy X) in the case of household

surveys using a face-to-face interviewing technique.

For consumer surveys the same categories apply, but keep in mind the shopper is in a hurry. Think about

having a small table with refreshments.

Note the change in strategy X, if we are to execute a household telephone sample survey:

The following do not matter:

• Presentation of the interviewer. Who cares what he/she looks like?

• Cultural matching. It’s all about the voice – not the physical presence. No funny accents, please.

• Backup documentation. How do you expect to send these?

The following must be changed or adapted (can you determine how):

• Communication. The interviewer’s voice and telephone etiquette is everything.

• Identification can only be done verbally.

• The interview can be conducted much later (up to 21h00?), even on weekends.

• Incentives must be appropriate to the telephone survey: something deliverable and non-physical.

• The interview should not be longer than eight minutes. Also, change all rating scales (as

discussed in Chapter 11) to 10 point rating scales. It makes it easier to answer.

For corporate surveys a more formal approach is needed. You will, for instance, make an appointment

with the secretary, dress smart-formal, and give an incentive to the company or department (not person).


Table 10.8: Strategy X: Reducing the refusal rate (*): Face-to-face household surveys

Strategy X concept: Execution

Presentation: In a survey scenario, the interviewer can sometimes be over- or under dressed. For example, wearing a 3-piece suit will give the impression that you are selling life insurance, or, on the other hand, arriving in faded and dirty jeans may give the impression that the interviewer might be a suspected tramp. Wearing the appropriate attire may decrease the non-response rate.

Identification: Preferably a durable nametag (in brass with plastic coating) showing the interviewer’s and the research agency’s name. This will engender more confidence on the side of the respondent.

Communication: The interviewer must be a people’s person preferring to mix with people on a face-to-face basis. Therefore, interviewers should ideally not have speech impediments, mumble, or be difficult to understand, etc.

Cultural matching: Although we have had a true democracy since 1994, the fact is that people are still sceptical about one another when it comes to communicating across the colour line. Although this is not always the case, one would generally use White fieldworkers in areas where there is a high incidence of Caucasians staying in an area, and Black/Coloured interviewers in the case of informal settlements. In mixed areas such as Wynberg in Cape Town, this, of course, would not apply. Match the interviewer with the major population group of the particular area.

Backup documentation: It is advisable to have a letter from the research agency, which can be presented on demand. Respondents who are very sceptical may ask for more detail about the research agency that the interviewer represents, upon which the letter will serve its function.

Timing: Fieldworkers have to try to be punctual and arrive at an appropriate time, e.g. between 17h30 and 20h00. Interviewing even on Saturday mornings would be considered to be appropriate.

Prior notification: If a telephone number or the physical address is available for the household, then the respondent can be warned about a survey taking place in the near future, either by phone or mail.

Incentives: Not all surveys provide incentives (monetary or non-monetary) to respondents. Incentives are usually offered whenever the questionnaire is complex and long. These incentives can be prepaid or promised to the respondent.

Motivating respondents: It is essential that the interviewer provide reasons as to why it would be important for the respondent to complete the survey (although this might not always be possible!). One could also use certain foot-in-the-door strategies. For instance, the interviewer recruits the respondent and returns at a later time and date to complete the interview. This strategy is followed especially if the questionnaire is particularly long. By then the respondent would be familiar with the interviewer and would be more willing to participate.

Questionnaire design & administration: The length of the questionnaire should be short, to the point and easy to complete. Boring questionnaires can be livened up with good layout and pictures (if applicable). The overall quality of the questionnaire should be well above par.

Convenience: Establish the most convenient time and place (on premises or off premises) to survey.

(*) The respondent answers the door, but does not want to be interviewed.

For strategy Y, the likelihood that potential respondents are not at home differs for various reasons. For

example, people with small children are more likely to be at home than bachelors, who are more likely to

be away. Also, people are more likely to be home after hours and over the weekend. Also keep in mind

that people who are actually at home, but refuse to open the door (suspecting that the researcher is a

sales person) also fall within this category. The following strategies can be applied in this instance to

reduce the ‘not at homes’ in a survey:


Table 10.9: Strategy Y: Reducing the not-at-homes rate(*): face-to-face household surveys

Strategy Y: Example

Call backs: If a person is not available, then the interviewer can call at a different time and on different days. This should be done at least 3 times before a non-response can be officially recorded.

Identification: It is also appropriate to leave a business card or note in the post box at the respondent’s house (sample unit).

Contact: In some instances key information about the whereabouts of the respondent (if available) can be obtained from neighbours, or via a domestic worker or a child who answers the door.

Incentive: For more difficult cases, an increased incentive could be offered to the respondent to make contact with the interviewer.

(*) The respondent would participate in the study if asked, but is not available at the time of selection (sample unit).

In the case of a face-to-face interview with households or businesses, it is not necessary to conduct the

interview at the premises of the respondent. The key is that the selection of the interviewee should be as per

address (sample unit), but the interview itself (with the sample element) can be anywhere, provided that it

remains a face-to-face interview.

Also keep in mind that response rates themselves do not indicate whether the respondents are representative

of the original sample or not.

Not all survey methods involve personal face-to-face interviews at home (as discussed above). The

following table provides an overview of all the possible surveys which can be executed. Although many

strategies (X and Y) can overlap, it is important to note the differences between all these survey types. In

order to prepare oneself for any eventuality, the following table (which will highlight only the key

differences) should be completed by the student:

Table 10.10: Non-response strategies summary

Survey type:                                 Household                   Consumer      Corporate
Interviewing technique                       Strategy X    Strategy Y    Strategy X    Strategy X    Strategy Y
Face-to-face interviewing
Telephone interviewing
Mail
Self-completion
Observation
Electronic interviewing: email / Internet

See the discussions on interviewing techniques in Chapter 8.

Note that because of the movement of people, strategy Y cannot be applied to consumer surveys.

10.2.7.2 Handling of non-response

Non-response occurrences can be managed in 3 different ways, namely (i) adjustment of the sample size,

(ii) substitution and (iii) data manipulation.


Adjusting the sample size

Allowances for an increased sample size have to be accounted for once the ideal sample size is

determined (as in step 6). This is especially important in the case of mail or self-administrated surveys.

The formula to be used here is:

Required sample size = n / r

where r is the combined rate (%), calculated as r = r1% x r2% x r3%, where

r1% = the reachable rate (%); r2% = the incidence rate (%); r3% = the completion rate (%)

and

n = the ideal sample size (as determined in step 6).

The reachable rate (r1%) has to do with the number of people or objects that can be reached at any

point in time. Generating random telephone numbers to contact people in a telephone survey (in the absence of a sample frame) is a typical example of a reachable rate⁵. Other examples include the

percentage of clients (e.g. policy holders with Sanlam) who live in a certain metropolitan area, or the

percentage of clients on a database who can be contacted by phone or whose addresses are still valid.

The incidence rate (r2%) is the percentage of contacts that will qualify for the interview or the

percentage of people that will pass the initial screening questions as listed by the sample element (see

section 10.2.2). For example, it might include the percentage of people who own dogs (in a dog food

survey), or the percentage of readers who read the “Motoring supplement” in the Cape Times

newspaper.

The completion rate (r3%) would be the percentage of people who, once they qualify for the research study, will agree to cooperate by completing the interview. Estimated completion rates for various interviewing techniques are as follows:

Table 10.11: Expected completion rates of sample surveys

Interviewing technique    Survey type            Expected completion rate
Telephone                 household              ~75%
                          corporate              ~70%
Face-to-face              household/corporate    ~80%
Mail                      household              ~10%
                          corporate              ~5%
Internet                  corporate              ~2%
                          household              ~10%
Mobile phones             households/consumer    ~60%

Note that the above-mentioned completion rates are very crude estimates. Completion rates depend upon

various factors (Dillon, 1987: 228), including:

• the length of the interview

• the sensitivity of the topic

• the time of year

• the number of attempts to be made to reach a respondent

5 The random number generation process will yield a very low reachable rate for a household survey in a certain area for the following reasons: in the case where the area has a high presence of businesses, and when a high number of phone numbers have been allocated by Telkom, but not yet been taken up by households (which is typical of many new developments).


• the length of the time the study will be in the field.

Furthermore, in the case of corporate interviews, the position the target respondent holds within the

establishment, e.g., general manager or sales clerk, and how busy the person is, becomes crucial in

estimating completion rates.

The above-mentioned can be explained using a practical example:

Applying reachable, incidence and completion rates

Assume that FutureTrust, a wholesale unit trust company, has 160,000 clients on their books. Because of increased

competition, the company has decided to conduct marketing research to measure the current customer satisfaction

levels of their clients. The following statistics are available for their client base:

> 123 000 clients are situated in Gauteng. Only 13% of clients live in the Cape Peninsula.

> 62.4% of FutureTrust clients have been with the company for longer than 5 years.

> 800 clients have at least a R250 000 net investment with the company and 92.8% of clients contribute more than

R1,500 per month to a selected unit trust.

It was decided to target only those clients who:

> Live in Gauteng and the Cape Peninsula (reachable rate).

>> Contribute at least R1,500 per month and those customers who have at least R250,000 net investment;

(incidence rate 1)

>> Have been with the company for at least 5 years (incidence rate 2).

>>> A mail survey was decided upon (completion rate).

Note that for identification purposes for each case, the type of incidence has been added (in italics). In real life, you

have to identify them yourself. Note that there are two types of incidence rates which must be adjusted. However, it

is important to keep these incidence categories in the same groups (as the calculations below will show).

Being a trained market researcher, you may have decided to ensure that at least 200 questionnaires are returned to

enable any valid analysis. How many questionnaires should be mailed to customers?

Firstly, one has to convert all incidences to a percentage:

Reachable rate (area): [123,000/160,000 = 76.9%] plus [13% (given)] = 89.9%

Incidence rate 1 (money contributions): [92.8% (given)] plus [(800/160,000) = 0.5%] = 93.3%

Incidence rate 2 (time): 62.4% (given), and

Completion rate: estimated (not given) at 10% (see Table 10.11).

Hence the required number of questionnaires is:

n/r = 200/[89.9% x 93.3% x 62.4% x 10.0%] or 200/[0.899 x 0.933 x 0.624 x 0.10]

= 200/0.05234

= 3,822 (rounded upwards) letters have to be sent out.
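The FutureTrust calculation can be sketched in code. The function name `required_mailout` is hypothetical. The rates are entered as the one-decimal percentages used in the text; carrying the unrounded reachable rate (89.875%) through instead would give a slightly different product and could move the answer by one letter.

```python
import math

def required_mailout(completed_needed, rates_pct):
    """Divide the required number of completed interviews by the combined rate.

    rates_pct: reachable, incidence and completion rates, each as a percentage.
    """
    combined = 1.0
    for pct in rates_pct:
        combined *= pct / 100
    return math.ceil(completed_needed / combined), combined

rates = [
    89.9,   # reachable rate: Gauteng (76.9%) plus Cape Peninsula (13%)
    93.3,   # incidence rate 1: monthly contributors (92.8%) plus large investors (0.5%)
    62.4,   # incidence rate 2: clients of more than 5 years
    10.0,   # completion rate estimated for a household mail survey
]
letters, combined_rate = required_mailout(200, rates)

print(f"combined rate r = {combined_rate:.5f}")
print(f"letters to send = {letters:,}")
```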

Substitution

Most commercial surveys use substitution as a way for handling non-responses. Substitution entails

replacing the sampling unit with the same sampling unit type. For each survey type⁶ a different non-response method has to be followed.

6 Being corporate, household or consumer surveys.


i. In the case of a corporate survey, the sample unit will be replaced by the same type of

business entity in the same area or suburb. For example, if a bed and breakfast in Sea Point has

been selected to be surveyed, then this establishment will be replaced by a similar bed and

breakfast in Sea Point if a non-response occurs.

ii. In the case of a face-to-face household survey, then the dwelling, i.e. house, will be replaced

by the same socioeconomic dwelling, i.e. house (size / type), in the same cluster. Hence, if a

townhouse was randomly chosen, then another townhouse next to the selected townhouse

should be its replacement. The following diagram, provides an example of a non-response

procedure for a household survey:

[Figure: street diagram of Kingfisher Avenue, with houses no. 7, 9, 11 and 13 on one side of the street and houses no. 8, 10, 12 and 14 on the other.]

Figure 10.2:- Sample unit substitution of a household survey

The figure above depicts a specific house and street in a suburb, i.e. 9 Kingfisher Avenue, which was randomly chosen for a research study. If a non-response occurs (either because nobody is at home or because of a blunt refusal), then the fieldworker will move to the house on the left of no. 9 Kingfisher Avenue. In the case of another non-response, the interviewer will move across to no. 8, and so on. It is important to note here that if no. 7 Kingfisher Avenue is a larger, double storey house instead of a single level dwelling on a smallish plot, then no. 7 Kingfisher Avenue cannot act as a replacement and the fieldworker has to move on to, say, no. 8. In the above-mentioned diagram, all six houses are referred to as a selection cluster.

iii. In the case of a consumer mall intercept study, the replacement would need to be the same

gender and race. Hence, if a white male refuses to be surveyed, then another white male should

act as his replacement. One might argue that in order to reduce the non-response effect even

further, one should include the estimated age of the respondent. This is not practical as it does

not only lead to a very complex replacement procedure, but also people (especially women) have

a knack for hiding their actual age. Consequently, estimated age becomes a subjective judgment

by the fieldworker – thus increasing the selection bias.

It is very important to note that one uses either sample size adjustment or substitution, but never both techniques. The reason is obvious: by using both techniques, one firstly increases the sample size and then, using substitution, replaces the non-responses of the larger sample. The final sample will then be very large indeed. Combining both techniques renders all sample size calculations null and void!

Other similar ways of substituting non-respondents are the (i) sub-sampling procedures and (ii)

replacement of non-respondents (Malhotra, 1999). We will not go further into these methods as these are

not commonly used.


Data manipulation

This step does not take place at the sampling stage, nor during fieldwork (or its de-briefing), but at the actual data processing stage; hence the term data manipulation. This process involves adjusting the sample data to fit the population profile. Four techniques are mainly used in this regard, namely subjective estimates, trend analysis, data weighting, and imputation. These techniques are discussed under reporting and data processing (section 13.3.3).

10.2.7.3 Selecting the sample unit

Now we will discuss the method of choosing the sample unit in more detail for both fixed premises

(household and corporate) and consumer intercept sample surveys.

Fixed premises sample surveys

Fixed premises household and corporate sample surveys can be conducted using area (or municipal)

maps or through a company’s name and address list of clients. The following is an example of an

operational plan for selecting households in a suburb using a suburban map:

[Figure: suburban street map showing Mussel Cracker Avenue, Kingfisher Avenue and Blue Crane Lane, with numbered dwellings on both sides of each street. The walking route begins at a dwelling marked "Start", the dwellings along the route are counted off 1 to 7, and every 8th dwelling is marked for selection.]

Figure 10.3:- Selection of a sample in a household survey

The instructions for the above sample unit selection plan include (as in figure 10.3):

• Start at the (random) selected household as indicated on the map and go around the block

following the interval of every 8th dwelling (calculated beforehand). See section 10.5.1.3 (i.e.

systematic random sample) for discussion thereof.

• Once at the end of the left (right) hand side of the street, continue by crossing over to the right

(left) hand side of the street.

• You may limit the number of dwellings per block or per street (this is decided beforehand) to be

surveyed: Do not select more than 4 dwellings per street.

• Apply substitution (by race and gender) whenever a non-response occurs (see section 10.2.7.2).
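The selection instructions above can be sketched as a small routine. This is only an illustration under stated assumptions: the walking route is a hypothetical list of dwellings in the order the fieldworker passes them, the starting dwelling counts as the first selection, and a dwelling that falls on a street whose quota is already full is skipped while the count continues. Substitution after a non-response is not modelled.

```python
def select_dwellings(route, interval, max_per_street, start_index=0):
    """Pick every `interval`-th dwelling along a walking route.

    route: ordered list of (street, house_number) tuples in walking order.
    start_index: position of the randomly chosen starting dwelling.
    """
    selected = []
    per_street = {}
    for i in range(start_index, len(route), interval):
        street, number = route[i]
        if per_street.get(street, 0) >= max_per_street:
            continue  # street quota reached: skip and keep counting
        selected.append((street, number))
        per_street[street] = per_street.get(street, 0) + 1
    return selected

# Hypothetical route: up the odd side of one street, back down the even
# side, then along one side of the next street.
route = [("Kingfisher Avenue", n) for n in range(1, 23, 2)]
route += [("Kingfisher Avenue", n) for n in range(22, 0, -2)]
route += [("Blue Crane Lane", n) for n in range(31, 53, 2)]

sample_units = select_dwellings(route, interval=8, max_per_street=4)
for street, number in sample_units:
    print(f"no. {number} {street}")
```

In practice `start_index` would itself be drawn at random beforehand, and the interval of 8 would come from the systematic sampling calculation referred to in the instructions.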


See Chapter 10A for discussion of corporate and household sample (and census) surveys using name and address lists as sample frames.

Consumer intercept sample surveys

As we have already seen, consumer intercept surveys have no sample frames. Hence we need a different

approach here. Therefore, instead of using a sample frame as per chapter 10A, we use a sample frame

summary. The following fictitious example explains.

Assume you have been tasked to survey the visitors’ needs and expectations of WINEX – a yearly wine experience event at the CTICC (Cape Town International Convention Centre). WINEX has provided you with the following attendance figures of this four day event over the past four years:

Year                                    Thursday    Friday    Saturday    Sunday     Total visitors per year
20x1                                    9,522       22,851    34,587      26,574     93,534
20x2                                    8,633       23,955    35,601      27,254     95,443
20x3                                    9,866       24,051    35,992      27,994     97,903
20x4                                    9,588       24,555    37,501      28,504     100,148
Total visitors per day over four
years (Ʃ 20x1 to 20x4)                  37,609      95,412    143,681     110,326    387,028
Percentage visitors per day             9.7% (*)    24.7%     37.1%       28.5%      100.0%

(*) (37,609/387,028) x 100/1

Assume you want to draw a sample of 200 respondents in 20x5, proportionally spread over the four-day event. You also know that the event opens daily from 18h00 to 23h00, except on Saturdays, when it is open from 16h00 to 23h00. Knowing this, you have decided to intercept respondents from 19h30 to 22h30 daily (i.e. a three-hour session), except on Saturday, when you will have a five-hour session (17h30 to 22h30). You also know that a systematic time-based sample is the most appropriate sample technique to apply here (see section 10.5.1.3). Therefore, the proportional sample for the event is established as follows:

| | Thursday | Friday | Saturday | Sunday | Total |
|---|---|---|---|---|---|
| Percentage of visitors per day (as above) | 9.7% | 24.7% | 37.1% | 28.5% | 100.0% |
| Number of minutes per interviewing session | 180 minutes (3 hours x 60 minutes) | 180 | 300 (5 hours!) | 180 | - |
| Sample allocation per day | 20 (9.7% of 200 interviews) | 49 | 74 | 57 | 200 |
| Sample interval (minutes / daily sample), rounded off to nearest minute | Every 9 minutes | Every 4 minutes | Every 4 minutes | Every 3 minutes | - |

Hence, on the opening day, you are required to survey a visitor group every 9 minutes, which is manageable, whereas on Friday you will survey a visitor group every 4 minutes, and so on. Also, if the interview takes, say, 5 minutes to complete, you will have to use more than one interviewer for the Friday to Sunday sessions. Your field agency will be in charge of this.
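The allocation and interval arithmetic of the WINEX example can be reproduced with a short Python sketch. The attendance totals and session lengths come from the tables above. The largest-remainder rounding rule (used so that the daily allocations add up to exactly 200) and all function and variable names are assumptions made for illustration; the text does not say how the Thursday figure of 19.4 was rounded to 20.

```python
# Attendance totals per day over 20x1 to 20x4 (from the attendance table).
visitors = {"Thursday": 37609, "Friday": 95412, "Saturday": 143681, "Sunday": 110326}
# Length of the interviewing session per day, in minutes.
session_minutes = {"Thursday": 180, "Friday": 180, "Saturday": 300, "Sunday": 180}
SAMPLE_SIZE = 200


def allocate(visitors, sample_size):
    """Proportional allocation; leftover interviews go to the largest remainders.

    The largest-remainder rule is an assumption, not the book's stated method.
    """
    total = sum(visitors.values())
    exact = {day: count * sample_size / total for day, count in visitors.items()}
    allocation = {day: int(value) for day, value in exact.items()}
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda d: exact[d] - allocation[d], reverse=True)
    for day in by_remainder[:leftover]:
        allocation[day] += 1
    return allocation


allocation = allocate(visitors, SAMPLE_SIZE)
# Sample interval: session minutes divided by the daily sample, to the nearest minute.
intervals = {day: round(session_minutes[day] / allocation[day]) for day in allocation}

for day in visitors:
    print(f"{day}: {allocation[day]} interviews, one every {intervals[day]} minutes")
```

Under these assumptions the sketch arrives at the same allocation (20, 49, 74, 57) and the same intervals (9, 4, 4, 3 minutes) as the table.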


In order to get a true picture of the shopper profile of any store or shopping mall, it is advised to spread the

number of interviews over one full calendar month. The premise for this is based upon the fact that the

profile of the midweek shopper is different to the weekend shopper who is different to the midmonth shopper

who is different to the end of the month shopper. Also, there is no need to start on the 1st of each month. If

you start say on the 21st of May 20x1, you end the sample shopper intercept survey on the 20th June 20x1.

On average, interviewer sessions should never be longer than 5 hours; longer sessions are not humanly sustainable. In fact, 4½-hour sessions are the norm in the industry. Remember, field agencies charge you per individual session.

If no visitor stats are available, one could use the unitary interval pattern when selecting a sample unit, for instance by choosing every nth visitor group. This sampling technique is referred to as a systematic random unitary sample.

In selecting the sample unit, you will draw an imaginary line for interviewers at the exit doors of the

event and apply the interval patterns (as well as non-response as per section 10.2.7.2) as above. It is

advisable to use a field supervisor who controls these interval skip patterns for interviewers each day.

Remember, interviewers take short cuts if left unattended.

10.2.7.4 Selecting the sample element

Once the sample unit has been selected, then the sample element i.e. the person or object which

contains the information has to be chosen.

Firstly, the sample element must qualify as per step 2, which defines the sample element (including all its

exclusion and inclusion criteria). The listed exclusions have to be converted into screening questions (SQ).

For example, from step 2 the standard screening questions will be:

SQ.1 Have you been interviewed in the past 6 months?
     Yes → TERMINATE AND REPLACE
     No → GO TO SQ.2

SQ.2 Do you or your immediate family work for a marketing research agency?
     Yes → TERMINATE AND REPLACE
     No → GO TO SQ.3

SQ.3 Do you or your immediate family work for an advertising agency?
     Yes → TERMINATE AND REPLACE
     No → GO TO Q.1 OF THE SURVEY
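The skip logic of the three screening questions can be sketched as a small Python function, for instance when the questionnaire is scripted for computer-assisted interviewing. The function name and the list-of-booleans input are hypothetical choices made for this illustration.

```python
SCREENING_QUESTIONS = [
    "Have you been interviewed in the past 6 months?",
    "Do you or your immediate family work for a marketing research agency?",
    "Do you or your immediate family work for an advertising agency?",
]


def screen(answers):
    """Apply SQ.1 to SQ.3 in order.

    answers: list of booleans (True = Yes), one per screening question.
    Returns the instruction the interviewer should follow.
    """
    if len(answers) != len(SCREENING_QUESTIONS):
        raise ValueError("one answer per screening question is required")
    for number, said_yes in enumerate(answers, start=1):
        if said_yes:
            # Any 'Yes' disqualifies the respondent at that question.
            return f"SQ.{number}: TERMINATE AND REPLACE"
    return "GO TO Q.1 OF THE SURVEY"


print(screen([False, False, False]))
print(screen([False, True, False]))
```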

Secondly, in the case of probability sampling, a process of randomisation has to be implemented, if

more than one respondent or object qualifies to be interviewed. This entails giving the interviewers

detailed sampling instructions and not leaving the decision to the interviewers’ own discretion. If this is

not done, then one would encounter sample selection bias (see insert: Focus).

One method of ensuring randomisation in a household or consumer survey, is the so-called application of

the ‘birthday rule’. This entails the interviewer asking all qualifying respondents in the household or

consumer group “Whose birthday comes up next?” to select who qualifies for the interview on an equal

probability basis (even if the next birthday is only next year).
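As an illustration only, the birthday rule can be expressed in a few lines of Python. The household members, the fieldwork date and the function names are hypothetical, and treating a 29 February birthday as 1 March in non-leap years is an assumption of this sketch.

```python
from datetime import date


def days_until_next_birthday(month, day, today):
    """Days from `today` to the next occurrence of a birthday (0 if it is today)."""
    for year in (today.year, today.year + 1):
        try:
            candidate = date(year, month, day)
        except ValueError:
            # 29 February in a non-leap year: treat 1 March as the birthday.
            candidate = date(year, 3, 1)
        if candidate >= today:
            return (candidate - today).days


def select_by_birthday_rule(members, today):
    """Return the name of the qualifying member whose birthday comes up next."""
    return min(members, key=lambda m: days_until_next_birthday(m[1], m[2], today))[0]


# Hypothetical household: (name, birth month, birth day).
household = [("Ann", 3, 14), ("Ben", 6, 2), ("Cara", 12, 25)]
chosen = select_by_birthday_rule(household, today=date(2021, 5, 21))
print(chosen)
```

If every qualifying member's birthday has already passed this year, the comparison simply rolls over to next year, in line with the remark that the next birthday may fall only next year.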


In the case of a corporate survey, the birthday rule does not apply. One merely selects the first respondent (identified by title and/or position) who qualifies for the interview. The reason for this is that cultural differences play a major role with consumers and households, but not in a corporate environment. For instance, in any household, one would have an array of differing opinions on various subjects and topics. Hence the importance of randomisation.

However, corporate surveys are different. Business runs on money, procedures and processes. Consequently, differing personal opinions do not count in a corporate survey. For instance, the answer to a question put to the purchasing manager as to which tyres are being used on their trucks would always be the same regardless of whether the respondent is male or female, Black, Asian or White, young or old.

Other randomisation procedures can also be used and applied. One can, for instance, use random

numbers in the case where the interviewer has to select a sample from a numbered database, a bit like a

lottery, where each record is chosen arbitrarily.

FOCUS: Sample selection bias

Sample selection bias occurs when fieldworkers are left to their own devices when sampling, say, 100 shoppers in a shopping mall survey. Naturally the interviewers will gravitate towards the people whom they feel the most comfortable interacting with. If the fieldworker is, say, a 22-year-old white male, then his profile upon completion of the questionnaires would most likely be young white females under the age of 22 years! Also, the sample selection bias increases when the fieldworker, on top of his individual preference(s), selects only the 'really friendly ladies'! Then one has a profile of young, really friendly ladies shopping at the mall at the time of the survey, which is hardly the correct profile of any shopping centre.

In the case of a household survey, who do you suspect would most likely agree to conduct the interview on behalf of the household at 20h15? Obviously, it will be the 'chatter box' of the household! However, who is most likely to answer the door at the same time and day in the case of the personal interview? The strongest person, i.e. the 'protector' of the household! What we have is two different people from the same household participating in the same survey.

To counteract selection bias, a process of randomisation should be utilised.

10.2.7.5 Control

The last step in the execution is to ensure that the interviews did in fact take place and that the data

obtained is indeed authentic and correct. SAMRA prescribes that at least 20% of all conducted interviews

should be back-checked for quality purposes. This 20% could include so-called red herring

questionnaires, i.e. completed questionnaires where the researcher suspects irregularities with the

interview, and/or questionnaires where respondents are phoned to complete missing data or details by a

field supervisor or data controller, and/or where respondents are phoned by the marketing research

agency to compliment (or complain about) the interviewer. Back-checks do not mean that the whole questionnaire has to be redone, but one would ask between 3 and 5 questions (spread evenly throughout the questionnaire) to verify the accuracy of the data.

Where an interviewer has been found to have cheated on one or more questionnaires, the whole batch of that interviewer's questionnaires should be destroyed. Note that no black-listing of interviewers who cheat is allowed in South Africa, but having said that, word spreads very fast in the market with regard to bad performers …

10.2.8 STEP 8: Validating the sample

The final step in the sampling process is to validate the correctness of the sample. This step occurs at the

data processing and report writing stage.


10.2.8.1 Revising the confidence interval

Firstly, the intended sample could have changed for various reasons. This is especially true in the case of

mail or self-completion surveys. As a result, the confidence interval has to be recalculated for the survey

using the following formulae:

Confidence interval for the mean - large samples (n ≥ 30)

In the case where the final sample size is 30 or more respondents, the following formula is applicable:

$$\bar{x} \pm Z \cdot \frac{s}{\sqrt{n}}$$

For example, if the total number of completed interviews is 321 respondents, and the average expenditure ($\bar{x}$) is calculated as R766,00 per household, with s = R46,00, then the revised confidence interval (at the 95% level, Z = 1.96) would be:

$$\text{R}766 \pm 1.96 \cdot \frac{\text{R}46}{\sqrt{321}} = \text{R}766 \pm \text{R}5{,}03$$

$$= [\text{R}760{,}97 \le \mu \le \text{R}771{,}03]$$

Therefore, one would be 95% sure that the true population mean ($\mu$) is somewhere between R760,97 and R771,03; or one can state that one is 95% sure of being within R5,03 of the true population mean.
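The worked example can be checked with a minimal Python sketch. The function name and the default of Z = 1.96 (95% confidence) are choices made for this illustration.

```python
import math


def mean_confidence_interval(mean, s, n, z=1.96):
    """Return (lower, upper, margin) for a large-sample confidence interval of the mean."""
    margin = z * s / math.sqrt(n)
    return mean - margin, mean + margin, margin


# Figures from the example: mean R766,00, s = R46,00, n = 321 completed interviews.
lower, upper, margin = mean_confidence_interval(766.00, 46.00, 321)
print(f"R766 ± R{margin:.2f}  ->  [R{lower:.2f}; R{upper:.2f}]")
```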

Confidence interval for the mean - small samples (n < 30)

The following formula is used if the final sample size is fewer than 30 respondents:

$$\bar{x} \pm t \cdot \frac{s}{\sqrt{n}}$$

Hence, instead of using the Z-tables, the appropriate t-tables (with n − 1 degrees of freedom) are used.

Revised confidence interval (proportions)

If the key problem researched had to do with proportions (percentages), e.g. estimating market share of a specific product in the market, then the following formula would be applicable:

$$P \pm Z \cdot \sqrt{\frac{P(1-P)}{n}}$$

The value of P = the sample proportion (e.g. 32% market share) obtained.
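A short Python sketch of the proportion formula follows. The 32% market share comes from the text; the sample size of 321 is borrowed from the earlier mean example purely as an assumption, since no sample size is given for the proportion case. The function name is hypothetical.

```python
import math


def proportion_confidence_interval(p, n, z=1.96):
    """Return (lower, upper, margin) for a confidence interval of a proportion."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin, margin


# P = 0.32 (32% market share); n = 321 is an assumed sample size.
lower, upper, margin = proportion_confidence_interval(0.32, 321)
print(f"32% ± {margin * 100:.1f} percentage points")
```

With these assumed figures, the margin works out to roughly five percentage points on either side of the 32% estimate.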

10.2.8.2 Validating sample proportions

In this step one has to validate that there are no significant differences between the initial population and

sample proportions (as calculated in steps 4 and 6) and the final sample proportion of the completed

results.

In this section, data weighting will take place, if a significant difference between population proportion

and final sample proportion has been found.


It is important to note that once data has been weighted, no statistical tests and/or analysis can be done on the weighted data. However, statistical tests can still be executed on subsets of the original non-weighted data.
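The text does not prescribe a particular weighting method. One common approach, shown here only as an assumption, is to weight each group by the ratio of its population proportion to its achieved sample proportion. The areas, counts and function name in this Python sketch are hypothetical.

```python
def cell_weights(population_counts, sample_counts):
    """Weight per group = population proportion / achieved sample proportion."""
    pop_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    return {
        group: (population_counts[group] / pop_total)
        / (sample_counts[group] / sample_total)
        for group in population_counts
    }


# Hypothetical figures: area B is under-represented in the achieved sample.
population = {"A": 100, "B": 200}
achieved_sample = {"A": 5, "B": 5}

weights = cell_weights(population, achieved_sample)
for area, weight in weights.items():
    print(f"Area {area}: weight {weight:.3f}")
```

After weighting, the weighted sample counts still add up to the achieved sample size, but the areas are represented in their population proportions.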

10.2.8.3 Checking how the sample data differs from existing data

The final question for any sample and its final results is whether a difference from existing data reflects a sampling error or a shift in the population and/or its composition. It is advisable to compare the results with results from similar surveys in order to validate the research results and to check whether differences have actually taken place or not. Here one would execute H0 (null hypothesis) tests using the appropriate statistical measures to validate the results. This is discussed in Chapter 15.

A hypothesis test is normally executed only for the primary objectives of any research study. It is seldom that one would run hypothesis tests with each and every table of the questionnaire.

10.3 CONCLUSION

This chapter discussed the sampling methodology in a lot of detail. The following table (Table 10.12)

provides a summary of what must be completed when discussing any sampling methodology.

Table 10.12: Sampling methodology template

1. TARGET MARKET

Type of survey Specify the target market (corporate, consumer or household).

2. TARGET POPULATION

Sample unit What acts as the sampling unit?

Sample element What is the sample element? Define exclusions and /or inclusions.

Extent Define the target area.

Time frame When is fieldwork to take place?

3. DATA COLLECTION METHOD

Data collection technique Name and motivate the data collection technique.

4. SAMPLING FRAME

Sample frame What acts as the sample frame?

Inclusiveness Name strategies to make the sample frame all inclusive.

Update • Are there any duplications (page and item)?

• Any foreign elements (sample units)?

• Update missing fields (name strategies to do so).

Preparation Stratify (sort/group) the sampling frame. Number all items from one to the end.

Sample frame summary Provide the sample frame summary, for example:

Area (N) %

A 100 33.3%

B 200 66.7%

Total 300 100.0%

5. SAMPLING METHOD

Sample or census Motivate why sample or census has been chosen.

Sample technique State the sample technique fully.


6. SAMPLE SIZE

Sample size State the sample size. In advanced studies you have to calculate the sample size.

Confidence interval (% or $\bar{x}$)

Calculate the confidence interval using the following formula:

$P \pm Z \cdot \sqrt{P(1-P)/n}$ or $\bar{x} \pm Z \cdot s/\sqrt{n}$

Sample size corrector Determine the adjusted sample size (if > 10% of the population):

nc = N.n/(N+n-1) where nc = corrected sample size.

Sample size summary

Provide the corrected sample (nc) size summary from the sample frame summary:

Area % (as per sample frame) (nc)

A 33.3% 3

B 66.7% 7

Total 100.0% 10

7. EXECUTION OF THE SAMPLE

Minimising non-response

Discuss the following strategies:

• Unwilling but available (strategy X)

• Willing but unavailable (strategy Y)

Handling of non-response How is non-response being handled?

1. Substitution, sample size adjustment or data manipulation

Selecting the sample unit How is the sample unit chosen? Use random numbers to choose the appropriate sample. See also Annexure in Chapter 10.

Selecting the sample element

How is the sample element chosen (e.g. birthday rule for household surveys).

State the screening questions as per sample element inclusions/exclusions.

Control How many interviews must be checked for correctness (20% of total)?

8. VALIDATION OF THE SAMPLE

Revise the confidence interval

Mention how this is done.

Validate sample proportions Recalculate if the sample size has changed.

Verifying sample data Check data against other available secondary data.
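The sample size corrector in item 6 of the template can be sketched in Python as follows. The formula nc = N.n/(N+n-1) and the 10% threshold are taken from the template; the example figures and the function name are hypothetical, and the sketch leaves the result unrounded because the template does not state a rounding rule.

```python
def corrected_sample_size(N, n):
    """Apply nc = N*n / (N + n - 1) when the sample exceeds 10% of the population."""
    if n / N <= 0.10:
        # Sample is at most 10% of the population: no correction needed.
        return float(n)
    return N * n / (N + n - 1)


# Hypothetical example: population of 300, initial sample of 50 (16.7% of N).
print(round(corrected_sample_size(300, 50), 2))
# Hypothetical example: initial sample of 20 (6.7% of N) is left unchanged.
print(corrected_sample_size(300, 20))
```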

oOo


CHAPTER 10A (ANNEXURE)

THE APPLICATION OF SAMPLING

This annexure is about the application of sampling. You need a sound knowledge of Chapter 10 before

attempting this section.

Assume Peter has an IT support business called Peter’s Computer Support Services. In total he has 30

clients spread all over South Africa. All his clients are listed in Table 10.13. As you know by now, the

client list (as in Table 10.13) is referred to as the sampling frame. You will note that the sample frame is

organised very carefully. Firstly, it is sorted by region (Gauteng (G) > KwaZulu-Natal (KZN) > Western Cape (WCape)), then within each region (e.g. Gauteng) by type (Black Steers or KFC), and within each type by the Rand value these companies spend in total on IT per year (sorted from high to low). Note that the sample frame could also have been sorted by other criteria, such as type > area and then by Rand spend, or, say, Rand spend, type and then region. Furthermore, each sample unit (or client) is numbered from 1 to 30.

10.4 UNIVERSE, TARGET POPULATION, SAMPLE FRAME AND CENSUS

Let us assume that Peter does not know the profile of his clients. He has only the clients’ names,

addresses and contact details available to him. In order to establish the profile of his clients (in terms of

region, type of outlet and average spend on IT), Peter has two options: he can either conduct a census

survey or draw a sample.

A target population or universe is any complete group that shares some common set of characteristics, such as a list of clients, sales territories, stores, people or customers. More specifically, if the group under investigation is finite (i.e. the number of sample units is known), as in the case of household or corporate surveys, then one refers to it as the target population. On the other hand, if the population is infinite, as in the case of a consumer survey, then it is referred to as a universe. A census survey is a situation where the data is obtained from every member of the target population.

Let us start our discussion with a census survey and end it with a sample survey. Firstly, given that

Peter’s Computer Support Services has only 30 items, one refers to his client list as the target population.

See Table 10.13.

How does one differentiate between a household or corporate (sample /census) survey if one is using a company’s name and address list as a sample frame? The answer lies in the database (i.e. name and address list) itself. For instance, if the client to be surveyed has a house address e.g. Mr Ronny Botha, 25 2nd Street, Harfield Village, then it is a household survey. However if the address is: McDonald’s, 33 Main Rd, Claremont, it would be a corporate survey.

Take note that the same (random) sampling procedure and principles are applied regardless of whether it is a household or a corporate survey.


Table 10.13: Sample frame for Peter’s Computer Support Services

| No | Address | Contact Person | Telephone No | Region | Type | Rands (‘000) |
|---|---|---|---|---|---|---|
| 01 | Wilro Krans Cnt, Graphite Rd | Jaco Wolmarans | (011) 768 4977 | G | B- Steer | 150 |
| 02 | Main Rd. Alberton | Martin, Clinton | (011) 869-8020 | G | B- Steer | 135 |
| 03 | Jeppe Street, Johannesburg | Rasheed Ahmed | (011) 333-4823 | G | B- Steer | 87 |
| 04 | Lochner Str, Akasia | Lizzie, Johannes | (012) 549-3561 | G | B- Steer | 46 |
| 05 | 101 BUREN RD, BEDFORDVIEW | Basil Bibis | (011) 450 2473 | G | KFC | 66 |
| 06 | 102 MOOI RIVER DR., NORKEM | Sarah | (011) 972 7111 | G | KFC | 46 |
| 07 | CORLETT DRIVE, BIRNAM | Jean | (011) 888 2250 | G | KFC | 42 |
| 08 | DF MALAN DRIVE, BLACKHEATH | Sipho Cele | (011) 888 1633 | G | KFC | 41 |
| 09 | HENDRIK POTGIETER STR, DALE | Billy | (011) 744 2383 | G | KFC | 29 |
| 10 | 83 Voortrekker Street | Douglas | (011) 869-7339 | G | KFC | 28 |
| 11 | Dobsonville 1 | Busi Nayaba | (011) 988-3982 | G | KFC | 28 |
| 12 | Soshanguve 1, 276 Bock H | Lawrence | (012) 797-2571 | G | KFC | 27 |
| 13 | Twist Street, Johannesburg | Elizabeth | (011) 725-1638 | G | KFC | 26 |
| 14 | Wynberg, 2nd Avenue, Main Road | Clement Roy | (011) 887-3765 | G | KFC | 21 |
| 15 | Benoni Cbd | Themba/Busi, | (011) 421-8220 | G | KFC | 20 |
| 16 | Berea City | Philemon, | (012) 320-0391 | G | KFC | 18 |
| 17 | Bertrams (Bezuidenhouts Valley) | Sunny, Thoko | (011) 614-9332 | G | KFC | 5 |
| 18 | 584 TARA RD, WENTWORTH | Bernd | (031) 461 2780 | KZN | B- Steer | 98 |
| 19 | Flanders Drive, Mount Edge | Russel | (031) 502 3352 | KZN | B- Steer | 77 |
| 20 | Main Road, Port Shepstone | Caren | (039) 682 6232 | KZN | B- Steer | 70 |
| 21 | Marlin Drive, Hibberdene | Mahadawu | (039) 699 2465 | KZN | B- Steer | 69 |
| 22 | Mayfair, Marine Pare Margate | Anastatia | (039) 312 1089 | KZN | B- Steer | 55 |
| 23 | Mbance Centre, Durban | Edmeades, | (031) 555 3352 | KZN | B- Steer | 12 |
| 24 | Kraal Kraft, Rockysdrift, Mount Edge | Leon Cremer | (031) 758-1103 | KZN | KFC | 9 |
| 25 | Kranskop Engen 1-Stop | Eben Espach | (031) 717-3001 | KZN | KFC | 5 |
| 26 | 19A SCHMIDSDRIFT ROAD, Parow | Henni | (021) 950-2558 | WCape | KFC | 45 |
| 27 | The Club, Constantia | Schalk van Dyk | (021) 412 5569 | WCape | KFC | 22 |
| 28 | N1 Snelweg, Bellville | Willie Fryer | (021) 655-1003 | WCape | KFC | 13 |
| 29 | Barclay Road, Beaufort West | Stan Koortz | (023) 414 3569 | WCape | KFC | 11 |
| 30 | Sanlam Centre, Parow | Hendrik van Zyl | (021) 930-5815 | WCape | KFC | 10 |

G = Gauteng, KZN = KwaZulu Natal, WCape = Western Cape; B- Steer = Black Steers.

By surveying everybody on the client list, i.e. the sample frame, the profile of Peter’s Computer Support Services is as follows (you can calculate these parameters yourself):

Table 10.14: Sample frame summary of Peter’s Computer Support Services (N = 30)

| Region | Outlet Type | Average Rand spent on IT (‘000) |
|---|---|---|
| G (Gauteng) = 17 (56.7%) | Black Steers = 10 (33.3%) | R43,7 (1,311/30) |
| KZN (KwaZulu-Natal) = 8 (26.7%) | KFC = 20 (66.7%) | |
| WCape (Western Cape) = 5 (16.6%) | | |
| Total: 30 (100%) | Total: 30 (100%) | Total: 30 |

The average spend on IT is calculated by adding up the IT expenditure of all companies (i.e. 150+135+87+46+66+46+42+41+29+28+28+27+26+21+20+18+5+98+77+70+69+55+12+9+5+45+22+13+11+10 = 1,311) and dividing it by 30 (N = the number of clients) = R43,7 or R43,700 (note that the numbers in the table are in thousands (‘000)).

However, it is highly unlikely that everybody will participate in a census study. So, assume that the

following three clients, i.e. 10% non-response (3/30 x 100/1) refused to be interviewed:


| No | Address | Contact Person | Telephone No | Region | Type | Rand (‘000) |
|---|---|---|---|---|---|---|
| 12 | Soshanguve 1, 276 Bock H | Lawrence | (012) 797-2571 | G | KFC | 27 |
| 22 | MAYFAIR, MARINE PARE MARGATE | Anastatia | (039) 312 1089 | KZN | B- Steer | 55 |
| 29 | Barclay Road, Beaufort West | Stan Koortz | (023) 414 3569 | WCape | KFC | 11 |

Now, having only interviewed 27 clients instead of 30, the profile of Peter’s Computer Support Services

‘census’ changes to:

Table 10.15: Sample frame summary of Peter’s Computer Support Services (N = 27)

| Region | Outlet Type | Average Rand spent on IT (‘000) |
|---|---|---|
| G (Gauteng) = 16 (59.3%) | Black Steers = 9 (33.3%) | R45,1 (1,218/27) |
| KZN (KwaZulu-Natal) = 7 (25.9%) | KFC = 18 (66.7%) | |
| WCape (Western Cape) = 4 (14.8%) | | |
| Total: 27 (100%) | Total: 27 (100%) | Total: 27 |

Note the discrepancies between Tables 10.14 (census with 30 sample units) and 10.15 (census with 27 sample units).

Remember that at this point in time, Peter does not know that his profile is supposed to look as in Table 10.14 (had he conducted a census survey with all 30 sample units). For him, the sample frame profile in Table 10.15 is his reality (27 sample units in total). Hence, any non-response when conducting census surveys causes the census results to deviate. So beware and note the importance of a low non-response rate!
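The profiles in Tables 10.14 and 10.15 can be recomputed from the sample frame with a short Python sketch. The region, type and spend values are transcribed from Table 10.13; the data layout and function name are assumptions made for this illustration.

```python
from collections import Counter

# (client number, region, outlet type, IT spend in R'000), from Table 10.13.
frame = [
    (1, "G", "B-Steer", 150), (2, "G", "B-Steer", 135), (3, "G", "B-Steer", 87),
    (4, "G", "B-Steer", 46), (5, "G", "KFC", 66), (6, "G", "KFC", 46),
    (7, "G", "KFC", 42), (8, "G", "KFC", 41), (9, "G", "KFC", 29),
    (10, "G", "KFC", 28), (11, "G", "KFC", 28), (12, "G", "KFC", 27),
    (13, "G", "KFC", 26), (14, "G", "KFC", 21), (15, "G", "KFC", 20),
    (16, "G", "KFC", 18), (17, "G", "KFC", 5),
    (18, "KZN", "B-Steer", 98), (19, "KZN", "B-Steer", 77), (20, "KZN", "B-Steer", 70),
    (21, "KZN", "B-Steer", 69), (22, "KZN", "B-Steer", 55), (23, "KZN", "B-Steer", 12),
    (24, "KZN", "KFC", 9), (25, "KZN", "KFC", 5),
    (26, "WCape", "KFC", 45), (27, "WCape", "KFC", 22), (28, "WCape", "KFC", 13),
    (29, "WCape", "KFC", 11), (30, "WCape", "KFC", 10),
]


def profile(clients):
    """Summarise clients by region and outlet type, plus the average IT spend."""
    return {
        "regions": Counter(region for _, region, _, _ in clients),
        "types": Counter(outlet for _, _, outlet, _ in clients),
        "mean_spend": sum(spend for *_, spend in clients) / len(clients),
    }


census = profile(frame)

# Clients 12, 22 and 29 refused to be interviewed (10% non-response).
refusals = {12, 22, 29}
respondents = profile([c for c in frame if c[0] not in refusals])

for label, result in (("N = 30", census), ("N = 27", respondents)):
    print(label, dict(result["regions"]), dict(result["types"]),
          f"R{result['mean_spend']:.1f} thousand")
```

The average spend moves from R43,7 thousand to R45,1 thousand simply because three clients dropped out, which is the deviation the text warns about.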

10.5 SAMPLING TECHNIQUES

However, if Peter opts for sampling, he has the following sampling techniques at his disposal:

Table 10.16: Overview of sampling methods

| Probability sampling methods | Non-probability sampling methods |
|---|---|
| Simple random sampling | Convenience sampling |
| Systematic random sampling | Judgmental sampling |
| Stratified random sampling | Quota sampling |
| Cluster random sampling | Snowball sampling |
| Multistage area sampling | |

Non-probability sampling relies on the personal judgment of the researcher to select the sample, whereas

probability sampling occurs when the sampling units are all selected by chance (Cant et al, 2003). In the

case of probability sampling techniques, the results can be projected onto the universe or target

population under discussion. For instance, if the (random) sample indicates that 86.5% like the

advertisement, then one can expect similar proportions (%) for the target population (plus /minus the

confidence interval, of course). However, the results of non-probability samples, on the other hand, are

only applicable to the sample and cannot be linked to the population under investigation.

Also, non-probability samples are favoured when informal and exploratory research is to be done. Typically one would use them when drawing small samples (n < 30) and when a sampling frame of the target population is not available. In this case researchers rely heavily on personal judgment.

Probability samples, on the other hand, are used when the target population is definitive and most of the

sample units are known.


The above-mentioned sampling methodologies will be discussed using the example of Peter’s Computer

Support Services.

10.5.1 PROBABILITY SAMPLES

Random sampling techniques rely on a pre-determined random selection process. There are various ways of ensuring randomisation, for example rolling a die, drawing cards, flipping coins, etc. The most common and scientific one is the application of random numbers, which will now be looked at.

10.5.1.1 General principles of drawing random samples using the random numbers table

Firstly, the process of selecting sample items randomly involves using a set of random numbers and

choosing the appropriate sample units from a given sample frame. Given that random numbers are used,

it is important to number all the sample units of the population from one to whatever its maximum size.

In the case of Peter’s Computer Support Services, the sample units (clients) are numbered from 1 to 30

(maximum).

What is the position of the number ‘0’, i.e. zero (or 000 if three digits are used), when it comes to sample selection? The number zero is automatically ignored if the database is numbered from 1 to, say, 745. However, if the database is numbered from 0 to 744, then the number 0 comes into play and is used as a random number.

Secondly, one has to establish the size (maximum) of the sample frame (database). For instance, if there are up to 9 items in the sampling frame, then one would use only a single digit number for the sampling process. If the sample frame has up to 99 items, one would use two digits (tens), up to 999 three digits (hundreds), up to 9,999 four digits (thousands), and so on. See Table 10.17 in this regard. For Peter’s Computer Support Services (N = 30) we are looking to use a double digit number. Note that if one uses a single digit number only instead of a double digit, then all the clients of Peter’s Computer Support Services numbered from 10 to 30 are automatically excluded from the sample, which leads to sample bias (i.e. sample units do not have an equal chance of being selected).

Thirdly, focussing now on the random numbers table (Table 10.18), assume that you have decided to start in column A and row 1 with the number 011648. Given this number, the choice of random numbers for the database is as follows (as in Table 10.17):

Table 10.17: Determining the number of random digits

Random number 011648 (column A, row 1)

| Size (N) of database / sample frame (minimum / maximum) | Digit positions used | Right-hand digits read |
|---|---|---|
| 1 to 999,999 | Hundred thousands | 0 1 1 6 4 8 |
| 1 to 99,999 | Ten thousands | 1 1 6 4 8 |
| 1 to 9,999 | Thousands | 1 6 4 8 |
| 1 to 999 | Hundreds | 6 4 8 |
| 1 to 99 | Tens | 4 8 |
| 1 to 9 | Singular numbers | 8 |

If there are up to 9 items in the sampling frame, then the first chosen random number would be eight. In the case where there are up to 99 items in the database, the initial number is 48, and for between 1 and 999 sample units/items the random number is 648, etc.
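The digit rule summarised in Table 10.17 can be written as a small Python function: the number of right-hand digits to read equals the number of digits in the frame size. The function name is hypothetical, and the random number is passed as a string so that leading zeros are preserved.

```python
def right_hand_digits(random_number: str, frame_size: int) -> int:
    """Read as many right-hand digits as the frame size has digits."""
    width = len(str(frame_size))
    return int(random_number[-width:])


# Column A, row 1 of the random numbers table.
first_number = "011648"

for size in (9, 99, 999, 9999, 30):
    print(f"N up to {size}: {right_hand_digits(first_number, size)}")
```

For Peter's frame of 30 clients this yields 48, which then still has to be checked against the frame size before it can be used.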


Table 10.18: Random numbers table

| # | A | B | C | D | E | F | G | H | I | J | K |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 011648 | 363180 | 754061 | 376674 | 236320 | 751400 | 104331 | 202018 | 193228 | 917492 | 221215 |
| 2 | 586284 | 870549 | 316587 | 932055 | 435685 | 197352 | 084683 | 104538 | 444682 | 665558 | 376449 |
| 3 | 124320 | 418106 | 018076 | 029778 | 367792 | 262366 | 332366 | 665843 | 608871 | 073956 | 204651 |
| 4 | 505044 | 739447 | 047738 | 120321 | 514144 | 823841 | 383730 | 002494 | 807099 | 726052 | 674978 |
| 5 | 140434 | 031048 | 784839 | 727172 | 687142 | 080482 | 250053 | 041113 | 642081 | 482373 | 417018 |
| 6 | 423149 | 830491 | 219333 | 928137 | 034763 | 501486 | 728775 | 388605 | 293002 | 804749 | 803151 |
| 7 | 791475 | 088682 | 097565 | 263606 | 644516 | 170971 | 484378 | 096210 | 046386 | 171341 | 092427 |
| 8 | 552178 | 003353 | 729970 | 004118 | 451017 | 338027 | 928373 | 029453 | 854745 | 652885 | 971958 |
| 9 | 946018 | 158384 | 168058 | 610043 | 435176 | 170200 | 172564 | 573257 | 382244 | 293010 | 313811 |
| 10 | 656921 | 985665 | 295108 | 956392 | 997548 | 311090 | 925658 | 683685 | 049859 | 510928 | 377800 |
| 11 | 615556 | 764044 | 862130 | 118088 | 142841 | 453747 | 574368 | 600822 | 126450 | 620101 | 781373 |
| 12 | 871307 | 792254 | 081533 | 849177 | 647539 | 794393 | 749157 | 624970 | 992158 | 849873 | 287596 |
| 13 | 245504 | 280676 | 688943 | 384906 | 242616 | 634734 | 212843 | 070446 | 927296 | 372846 | 132112 |
| 14 | 364273 | 169751 | 954283 | 132266 | 559053 | 316053 | 438137 | 222505 | 039005 | 469095 | 985019 |
| 15 | 711682 | 576090 | 915903 | 779046 | 742444 | 509403 | 315523 | 625027 | 294784 | 596529 | 504148 |
| 16 | 871141 | 129449 | 498632 | 965663 | 488255 | 196155 | 950709 | 274293 | 742918 | 085457 | 978134 |
| 17 | 587345 | 053260 | 296212 | 265832 | 629664 | 112468 | 202405 | 140152 | 046014 | 357613 | 053980 |
| 18 | 752914 | 010208 | 172652 | 415983 | 640746 | 641629 | 632934 | 533004 | 487866 | 145434 | 373134 |
| 19 | 132283 | 268310 | 193064 | 154572 | 179191 | 183106 | 834073 | 888276 | 098394 | 113332 | 684131 |
| 20 | 047119 | 345932 | 225615 | 676422 | 052942 | 306517 | 448086 | 969897 | 684034 | 856211 | 455506 |
| 21 | 640412 | 990121 | 146505 | 402739 | 094822 | 022864 | 005730 | 826274 | 014446 | 324767 | 170480 |
| 22 | 599005 | 169364 | 193847 | 575518 | 006203 | 632932 | 030918 | 930739 | 894156 | 527954 | 106311 |
| 23 | 209638 | 024775 | 554041 | 395630 | 822444 | 343292 | 966073 | 172230 | 519844 | 107530 | 762724 |
| 24 | 343209 | 969901 | 552442 | 706937 | 252552 | 400229 | 232892 | 488192 | 071594 | 601723 | 816973 |
| 25 | 973830 | 887518 | 213800 | 031434 | 982513 | 786352 | 275565 | 207124 | 576646 | 412042 | 475892 |
| 26 | 943331 | 707138 | 533128 | 798035 | 972069 | 346692 | 215947 | 267293 | 582272 | 817543 | 146484 |
| 27 | 537123 | 877710 | 084133 | 191724 | 081320 | 230839 | 137215 | 105972 | 172324 | 393656 | 748166 |
| 28 | 750045 | 860540 | 411904 | 100613 | 196260 | 003001 | 684142 | 578123 | 579229 | 924207 | 654317 |
| 29 | 106837 | 881829 | 301765 | 847502 | 101155 | 692320 | 355426 | 558656 | 073924 | 470001 | 432338 |
| 30 | 829769 | 079818 | 465886 | 265952 | 262470 | 185532 | 294918 | 337122 | 322851 | 648442 | 693495 |
| 31 | 721753 | 349851 | 580346 | 091378 | 475482 | 062043 | 241638 | 245272 | 162196 | 045393 | 074238 |
| 32 | 889362 | 513432 | 009585 | 967688 | 743517 | 271136 | 296070 | 353879 | 279222 | 289506 | 550623 |
| 33 | 001974 | 360744 | 653156 | 125277 | 109852 | 228037 | 109210 | 262199 | 235923 | 646219 | 578015 |
| 34 | 153446 | 901073 | 333413 | 778066 | 124476 | 154564 | 492434 | 472757 | 113902 | 158842 | 281316 |
| 35 | 235107 | 687742 | 489732 | 204817 | 598905 | 672498 | 170736 | 789103 | 407791 | 863123 | 484546 |
| 36 | 459891 | 453897 | 548472 | 776919 | 211915 | 435216 | 412608 | 181267 | 146311 | 940668 | 824458 |
| 37 | 860198 | 479288 | 961677 | 143775 | 741082 | 936043 | 509204 | 988525 | 590515 | 564926 | 119383 |
| 38 | 626937 | 356846 | 726078 | 230286 | 370044 | 326989 | 224843 | 011282 | 746586 | 857812 | 618753 |
| 39 | 290906 | 402645 | 803299 | 472548 | 401355 | 698916 | 140603 | 161572 | 832356 | 373961 | 987031 |
| 40 | 809545 | 768654 | 327133 | 409412 | 535856 | 690958 | 607916 | 710418 | 907561 | 845075 | 539008 |
| 41 | 917918 | 768311 | 374852 | 102415 | 097238 | 682102 | 570722 | 542161 | 402761 | 144479 | 038236 |
| 42 | 088827 | 908003 | 191387 | 395342 | 509845 | 975913 | 413787 | 877195 | 987638 | 046989 | 804590 |
| 43 | 367426 | 028525 | 319660 | 879142 | 091159 | 748023 | 588963 | 960823 | 191072 | 147343 | 763535 |
| 44 | 495634 | 128727 | 484078 | 260217 | 283684 | 382662 | 269237 | 481714 | 235770 | 757541 | 343865 |
| 45 | 731570 | 332424 | 030246 | 126398 | 772010 | 129234 | 104347 | 439652 | 268247 | 948272 | 411407 |
| 46 | 338253 | 526932 | 537142 | 026401 | 033630 | 100115 | 630021 | 124990 | 676312 | 440006 | 353546 |
| 47 | 106064 | 713254 | 317063 | 266852 | 106530 | 055472 | 652693 | 912739 | 029642 | 291192 | 288537 |
| 48 | 121386 | 530105 | 354341 | 095352 | 204838 | 392793 | 151196 | 768596 | 022466 | 390103 | 847395 |
| 49 | 381098 | 349760 | 945232 | 974444 | 640735 | 852140 | 649583 | 709754 | 877123 | 268424 | 266654 |
| 50 | 007649 | 614279 | 523291 | 261503 | 799091 | 643298 | 912732 | 410290 | 085314 | 615787 | 378551 |


Note in addition that one could also have started from the digits on the left of the first number, i.e. 0-1-1-6-4-8, and then moved sideways (instead of downwards) to select the appropriate number on the database. Both approaches (reading from the left or from the right, and moving downwards or sideways) are correct. However, for consistency we have chosen to use the right-hand digits and, once a number has been selected, to move down the random number table. Also, once the last number of column A has been reached, one moves to the top of column B and downwards again (using the right-hand digits). And so the process continues.

Finally, as Peter’s Computer Support Services’ sampling frame contains only 30 clients (N = 30), a two-digit number is required, which makes the initial number 48. However, the random number 48 is larger than the sampling frame (N = 30), so another number has to be chosen. Moving downward one has the number 84 (586284), which is again too large to be chosen. However, the next random number, 20 (124320), is a hit. On the database it is a Black Steer franchise, and you have to interview Caren from Port Shepstone at (039) 682 6232.

20 Main Road, Port Shepstone Caren (039) 682 6232 KZN B- Steer 70

How does one handle duplicate random numbers? Remember that each sample unit must have an equal chance of being selected. If a random number comes up more than once, the repeat has to be ignored. What is the point of interviewing the same company twice in a customer tracking survey?

Each sample technique will now briefly be discussed, assuming a sample of six (n = 6) clients or sampling units.

10.5.1.2 Simple random sampling

Applying the above-mentioned principles, the focus now shifts to applying the simple random sample

using the random numbers table in Table 10.18.

Simple random sampling is a technique in which each sample unit of the population has a known and equal

chance of being selected for the sample (Shao, 2002). Every sample unit is selected independently – one at a

time. This sampling technique is seldom used as a stand-alone.

Knowing that Peter’s Computer Support Services has only 30 clients, one looks at using a double digit

number (on the right of the column) and moving downwards one has:

011648 586284 124320 505044 140434 423149 791475 552178 946018 656921 615556 871307 245504 364273 711682 871141 587345 752914


Using the corresponding random numbers on your sampling frame, the following clients have been

chosen to be interviewed:

20 Main Road, Port Shepstone Caren (039) 682 6232 KZN B- Steer 70

18 584 Tara Rd, Wentworth Bernd (031) 461 2780 KZN B- Steer 98

21 Marlin Drive, Hibberdene Mahadawu (039) 699 2465 KZN B- Steer 69

7 Corlett Drive, Birnam Jean (011) 888 2250 G KFC 42

4 Lochner Str, Akasia Lizzie, Johannes (012) 549-3561 G B- Steer 46

14 Wynberg, 2nd Avenue, Main Road Clement Roy (011) 887-3765 G KFC 21

The chosen clients provide the following results:

Table 10.19: Random sample results for Peter’s Computer Support Services (n = 6)

Region                            Outlet type                Average Rand spent on IT
G (Gauteng) = 3 (50.0%)           Black Steers = 4 (66.7%)   R57,7 (346/6)
KZN (KwaZulu-Natal) = 3 (50.0%)   KFC = 2 (33.3%)

Compare these results to the census results in Table 10.14. Note the difference. Why is there such a

significant difference in these two sets of results? How would you as the researcher rectify this?
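
The table-reading procedure used above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `draw_from_table` is hypothetical, `COLUMN_A` holds the random numbers quoted above, and `SPEND` assumes that clients 1 to 30 of the sampling frame are numbered in the row order of Table 10.20 (Gauteng, then KZN, then Western Cape).

```python
# Random numbers from column A of the random numbers table, read downwards.
COLUMN_A = [
    "011648", "586284", "124320", "505044", "140434", "423149",
    "791475", "552178", "946018", "656921", "615556", "871307",
    "245504", "364273", "711682", "871141", "587345", "752914",
]

# Rand spent on IT by clients 1..30. ASSUMPTION: the sampling frame is numbered
# in the row order of Table 10.20 (Gauteng 1-17, KZN 18-25, Western Cape 26-30).
SPEND = [150, 135, 87, 46, 66, 46, 42, 41, 29, 28, 28, 27, 26, 21, 20, 18, 5,
         98, 77, 70, 69, 55, 12, 9, 5,
         45, 22, 13, 11, 10]


def draw_from_table(numbers, frame_size, sample_size):
    """Pick sample units using the right-hand digits of each random number."""
    width = len(str(frame_size))          # N = 30 -> two digits
    chosen = []
    for number in numbers:
        candidate = int(number[-width:])  # e.g. "124320" -> 20
        if not 1 <= candidate <= frame_size:
            continue                      # outside the sampling frame: skip
        if candidate in chosen:
            continue                      # duplicate: ignore the repeat
        chosen.append(candidate)
        if len(chosen) == sample_size:
            break
    return chosen


sample = draw_from_table(COLUMN_A, frame_size=30, sample_size=6)
average = sum(SPEND[unit - 1] for unit in sample) / len(sample)
print(sample)             # [20, 18, 21, 7, 4, 14]
print(round(average, 1))  # 57.7
```

The sketch skips any number outside the sampling frame and any repeat, which mirrors the two rules stated earlier for numbers that are too large and for duplicates.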

10.5.1.3 Systematic sampling

A systematic sample is usually used in conjunction with another sample technique such as a stratified or

cluster sample.

A systematic (random) sampling technique is where the sample is drawn by choosing an arbitrary beginning point in a sample frame or list and then sequentially selecting every nth sample unit from the list (Shao, 2002). This technique can also successfully be applied using a time interval, in which case it is referred to as a systematic time-based sample. This sampling technique is seldom used as a stand-alone.

A systematic sample requires a random starting point where the random numbers table is used. Starting

at say column H row 1 (and moving downwards), one finds that the first appropriate number is 18.

Next, one has to establish the interval by selecting every nth number from the sampling frame (not the

random numbers table!). As the population has 30 sample units (N) and the required sample (n) is 6,

then the interval is calculated as:

Interval = N/n = 30/6 = 5, i.e. every 5th sample unit

Hence, starting at 18, every 5th client has to be chosen from the client list (NOT the random number table):

18 (+5) 23 (+5) 28 (+5) 3 (+5) 8 (+5) 13

Given that there are only 30 clients, when arriving at the end of the sampling frame (as per number 28),

one has to start at the beginning again. In other words, adding 5 to 28 one has: 29, 30 (end of sample

frame) 1, 2 and 3 (being the 5th) will be chosen. Note that the random interval process must end just

before the starting point (number 17 in this case). You are not allowed to move beyond the starting

point, because you will end up in a situation where sampling frame items have a double chance of being

included, resulting in a sampling bias.
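
The interval calculation and the wrap-around at the end of the sampling frame can be sketched as follows. The function name `systematic_sample` is hypothetical, and the sketch deliberately covers only the case where N/n is a whole number, as in this example.

```python
def systematic_sample(frame_size, sample_size, start):
    """Select every k-th unit from `start`, wrapping around the end of the frame."""
    if frame_size % sample_size != 0:
        raise ValueError("this sketch only handles whole-number intervals")
    interval = frame_size // sample_size  # N / n = 30 / 6 = 5
    # Work on 0-based positions so the modulo wraps 30 back to 1.
    return [(start - 1 + step * interval) % frame_size + 1
            for step in range(sample_size)]


print(systematic_sample(frame_size=30, sample_size=6, start=18))
# [18, 23, 28, 3, 8, 13]
```

Because exactly n steps of size N/n are taken, the selection stops before it passes the starting point again, so no client can be drawn twice.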


In the case of fractional interval numbers, e.g. where every 7.3rd or 7.8th number is to be chosen, one must always round the interval to the nearest integer, i.e. 7.3 and 7.8 become every 7th and every 8th sample unit respectively. This might result in drawing more (or fewer) than the required number of interviews.

Applying this method, the companies to be surveyed are:

18 584 Tara Rd, Wentworth Bernd (031) 461 2780 KZN B- Steer 98

23 Mbance Centre, Durban Edmeades, (031) 555 3352 KZN B- Steer 12

28 N1 Snelweg, Bellville Willie Fryer (021) 655-1003 WCape KFC 13

3 Jeppe Street, Johannesburg Rasheed Ahmed (011) 333-4823 G B- Steer 87

8 DF Malan Drive, Blackheath Sipho Cele (011) 888 1633 G KFC 41

13 Twist Street, Johannesburg Elizabeth (011) 725-1638 G KFC 26

The results are now as follows:

Table 10.19: Random sample results for Peter’s Computer Support Services (n = 6)

Region                            Outlet type                Average Rand spent on IT
G (Gauteng) = 3 (50.0%)           Black Steers = 3 (50.0%)   R46,2 (277/6)
KZN (KwaZulu-Natal) = 2 (33.3%)   KFC = 3 (50.0%)
Western Cape = 1 (16.7%)

You will note immediately how much closer these results are to the census results in Table 10.14. This is not because the systematic random sampling technique is inherently more accurate, but because the sample technique actually applied was a systematic random stratified sample. The following section explains.

10.5.1.4 Stratified sampling

This is probably the sampling technique used most often by researchers who have two combinations to

decide upon:

• Random stratified or systematic stratified sample, and

• Proportional or disproportional (random or systematic) stratified sample.

Also known as a stratified random sampling technique, this technique firstly divides the target population into natural homogeneous sub-groups, strata or segments, and then selects the sample units at random (referred to as a stratified random sample) or by a systematic method (referred to as a systematic stratified sample) from each sub-group. Stratified sampling is usually used when a large variation exists within the population (Shao, 2002).

The systematic stratified sample

As the name suggests, one would stratify the sampling frame and apply the systematic sampling

technique as discussed in the previous section (section 10.5.1.3).

Have a look again at Table 10.13, the sampling frame of Peter’s Computer Support Services. As mentioned earlier, the sampling frame was carefully organised: the data fields were sorted by region, then by type of outlet and then by total spending on IT. Without going into the technical detail of stratified sampling, one can simply replace the word ‘sorted’ with ‘stratified’. Differently expressed: a systematic sample was drawn from a sampling frame which was stratified by region, type of outlet and Rands spent on IT. This is referred to as a systematic stratified sample.


Hence, given that only six clients were selected, they provided a reasonable estimate of the target

population. What is important to know, is that in the case of a systematic stratified sample, only one data

file is used.

The random stratified sample

Instead of using one data file, a random stratified sample breaks the sample frame up into smaller data files, according to their strata. Using the same client list as per Table 10.13, the revised list in Table 10.20 divides the client list into three separate data files: Gauteng (1 to 17), KZN (1 to 8) and Western Cape (1 to 5). Note also that each data file is numbered from one onwards, and not continuously as in the case of the systematic stratified sample.

Hence, to draw a sample of six, one has to apply the sample frame (target population) proportions to the sample itself. This means that 56.7% of the sample must be from Gauteng (17 of 30), 26.7% from KZN (8 of 30) and 16.7% from the Western Cape (5 of 30). Rounding off to the nearest integer, one would sample 3 clients from Gauteng (56.7% x 6 = 3.4 ~ 3 clients), 2 clients from KZN (26.7% x 6 = 1.6 ~ 2 clients) and 1 client from the Western Cape (16.7% x 6 = 1.0 ~ 1 client), using the simple random sampling procedure as discussed in section 10.5.1.2.
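
A minimal Python sketch of this proportional allocation is given below. The function names are hypothetical, and the random draw within each stratum uses Python's built-in pseudo-random generator in place of the random numbers table. Plain rounding happens to add up to six here; in general the rounded allocations may not sum to the required sample size and would need adjusting by hand.

```python
import random

# Stratum sizes taken from the text: Gauteng 17, KZN 8, Western Cape 5.
STRATA = {"Gauteng": 17, "KZN": 8, "Western Cape": 5}


def proportional_allocation(strata_sizes, sample_size):
    """Share the sample across strata in proportion to stratum size."""
    total = sum(strata_sizes.values())
    return {name: round(size / total * sample_size)
            for name, size in strata_sizes.items()}


def stratified_random_sample(strata_sizes, sample_size, seed=None):
    """Draw a simple random sample inside each stratum (numbered from 1)."""
    rng = random.Random(seed)
    allocation = proportional_allocation(strata_sizes, sample_size)
    return {name: sorted(rng.sample(range(1, strata_sizes[name] + 1), count))
            for name, count in allocation.items()}


print(proportional_allocation(STRATA, 6))
# {'Gauteng': 3, 'KZN': 2, 'Western Cape': 1}
print(stratified_random_sample(STRATA, 6, seed=1))
```

The second call prints the client numbers drawn within each separate data file; the exact numbers depend on the seed.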

Table 10.20: Random stratified sample

1 Wilro Krans Cnt, Graphite Rd Jaco Wolmarans (011) 768 4977 G B- Steer 150
2 Main Rd, Alberton Martin, Clinton (011) 869-8020 G B- Steer 135
3 Jeppe Street, Johannesburg Rasheed Ahmed (011) 333-4823 G B- Steer 87
4 Lochner Str, Akasia Lizzie, Johannes (012) 549-3561 G B- Steer 46
5 101 Buren Rd, Bedfordview Basil Bibis (011) 450 2473 G KFC 66
6 102 Mooi River Drive, Norkem Sarah (011) 972 7111 G KFC 46
7 Corlett Drive, Birnam Jean (011) 888 2250 G KFC 42
8 DF Malan Drive, Blackheath Sipho Cele (011) 888 1633 G KFC 41
9 Hendrik Potgieter Str, Dale Billy (011) 744 2383 G KFC 29
10 83 Voortrekker Street Douglas (011) 869-7339 G KFC 28
11 Dobsonville 1, Busi Nayaba (011) 988-3982 G KFC 28
12 Soshanguve 1, 276 Bock H Lawrence (012) 797-2571 G KFC 27
13 Twist Street, Johannesburg Elizabeth (011) 725-1638 G KFC 26
14 Wynberg, 2nd Avenue, Main Road Clement Roy (011) 887-3765 G KFC 21
15 Benoni CBD Themba/Busi, (011) 421-8220 G KFC 20
16 Berea City Philemon, (012) 320-0391 G KFC 18
17 Bertrams (Bezuidenhouts Valley) Sunny, Thoko (011) 614-9332 G KFC 5

1 584 Tara Rd, Wentworth Bernd (031) 461 2780 KZN B- Steer 98
2 Flanders Drive, Mount Edge Russel (031) 502 3352 KZN B- Steer 77
3 Main Road, Port Shepstone Caren (039) 682 6232 KZN B- Steer 70
4 Marlin Drive, Hibberdene Mahadawu (039) 699 2465 KZN B- Steer 69
5 Mayfair, Marine Pare Margate Anastatia (039) 312 1089 KZN B- Steer 55
6 Mbance Centre, Durban Edmeades, (031) 555 3352 KZN B- Steer 12
7 Kraal Kraft, Rockysdrift, Mount Edge Leon Cremer (031) 758-1103 KZN KFC 9
8 Kranskop Engen 1-Stop Eben Espach (031) 717-3001 KZN KFC 5

1 19A Schmidsdrift Road, Parow Henni (021) 950-2558 WCape KFC 45
2 The Club, Constantia Schalk van Dyk (021) 412 5569 WCape KFC 22
3 N1 Snelweg, Bellville Willie Fryer (021) 655-1003 WCape KFC 13
4 Barclay Road, Beaufort West Stan Koortz (023) 414 3569 WCape KFC 11
5 Sanlam Centre, Parow Hendrik van Zyl (021) 930-5815 WCape KFC 10

Proportional and disproportional stratified sample

The above-mentioned stratified sampling techniques (random and systematic) are known to be

proportionally drawn from the target population. However, the researcher may sometimes want to

analyse the strata independently, and then draw a disproportionate stratified sample from the population


under investigation. Assume that you want to investigate the feasibility of selling speciality bread to bed and breakfasts and hotels in Cape Town. If there are 100 hotels and 1,900 bed and breakfasts, then a proportional sample of 100 would include only 5 hotels (100/2,000 x 100) and 95 bed and breakfasts (1,900/2,000 x 100). The hotel representation is far too small and has to be increased. In terms of a disproportionate stratified sample, one could decide to interview 50 hotels (50% of the sample) and 50 bed and breakfasts (50% of the sample) instead of 5 (5%) and 95 (95%) respectively. The key advantage of a disproportionate sampling technique is that it allows for a detailed analysis of each stratum (if big enough), while the data can still be weighted back to the original population proportions to provide an overview of the actual totals.
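
How the data can be weighted back to the population proportions is not spelled out here. One common approach, shown purely as an assumption for illustration, is to give each stratum a weight equal to its population share divided by its sample share. The function name `design_weights` is hypothetical.

```python
# Figures from the example above: 100 hotels and 1,900 bed and breakfasts,
# with a disproportionate sample of 50 interviews in each stratum.
POPULATION = {"hotels": 100, "bed and breakfasts": 1900}
SAMPLE = {"hotels": 50, "bed and breakfasts": 50}


def design_weights(population, sample):
    """Weight per stratum = population share / sample share (an assumed scheme)."""
    population_total = sum(population.values())
    sample_total = sum(sample.values())
    return {stratum: (population[stratum] / population_total)
                     / (sample[stratum] / sample_total)
            for stratum in population}


weights = design_weights(POPULATION, SAMPLE)
for stratum, weight in weights.items():
    # Weighted interviews fall back to the proportional 5 and 95 split.
    print(stratum, round(weight, 2), round(SAMPLE[stratum] * weight))
```

Under this scheme each hotel interview counts for about 0.1 and each bed and breakfast interview for about 1.9, so the 50/50 sample is scaled back to the proportional 5 and 95 when overall totals are reported.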

In the case of Peter’s Computer Support Services, one can draw a disproportionate random stratified

sample (by region) by selecting, say, 2 each: 2 clients from Gauteng (33.3%), 2 clients from KZN

(33.3%), and 2 clients from the Western Cape (33.3%), instead of 3 clients from Gauteng, 2 clients from

KZN, and 1 client from the Western Cape.

10.5.1.5 Cluster sampling

Unlike a stratified sample, where homogeneous groups are put together (i.e. segments in the population which have one or more common characteristics), a cluster sample aims to achieve the opposite. It tries to attain a heterogeneous composition, where each grouping is internally diverse: a mirror image of the population (Zikmund, 2003).

More accurately known as a random cluster sampling technique, this technique is performed by choosing

a random sample of sub-groups, and all the members of the sub-groups become part of the sample. If the

researcher samples all the members of the selected sub-group, then it is referred to as a one-stage cluster

sample. However, if the members of a sub-group are randomly selected, one refers to it as a two-stage cluster

sample. A very popular two-stage cluster sample method is dividing a geographic area into clusters and

selecting sample units randomly from each area (Shao, 2002).

The main aim of a cluster sample is to reduce the time spent interviewing people in a wide geographical

area, consequently making the survey cheaper. A typical application of a cluster sample is the area cluster

sample (see Figure 10.4).

Figure 10.4: Area cluster sample (map of the suburb divided into the areas A1, A2, B1, B2, C1 and C2)


Assume that you want to conduct a face-to-face household survey applying an area cluster sample in the suburb of Durbanville (Cape Town) depicted in Figure 10.4 above. Firstly, you would divide the area into, say, three socioeconomic strata, namely A = a high, B = a medium and C = a lower socioeconomic grouping. This is usually done by dividing the suburb along its main roads (i.e. the typical above and below the main road or railway line scenario). To reduce the size of the area for sampling purposes, you would then further divide each socioeconomic grouping into, say, two clusters (cluster 1 and 2). Note that you could have divided the suburb into three, four or even ten socioeconomic clusters. It does not matter, as long as you have a feasible number of households to survey within each grouping. Also, in dividing the area into clusters, note that a cluster need not be adjacent to, or the same size as, its counterpart. For instance, A1 is far from A2 and much larger as well.

It is important to realise that an area cluster sample assumes that the people living in socioeconomic

grouping A (A1 and A2), for instance, have the same profile, perceptions and attitudes. They are not

significantly different in any way! The same goes for socioeconomic grouping B (B1 and B2) and C (C1 and

C2).

Having now established two independent heterogeneous clusters, i.e. cluster 1 (A1 – B1 – C1) and cluster 2 (A2 – B2 – C2), you would choose one of these two clusters randomly, say, by flipping a coin. Assume that cluster 1 (i.e. A1 – B1 – C1) has been chosen. You will now draw a random sample from cluster 1, which would be representative of the total population of Durbanville (not only of the cluster itself). Note that the socioeconomic areas A1, B1 and C1 are not adjacent to one another (although they can be). As all the households in cluster 2 are ignored, the researcher saves a lot of time and money.

Applying the area cluster sample to Peter’s Computer Support Services, you would divide each province

into, say, two clusters (A and B). See Table 10.21. In doing so, sample units in each province per cluster

will be as close to one another as possible. For example, you would have Johannesburg central (cluster A)

vs. Johannesburg outer region (cluster B). Note that the areas within each of the two clusters are not the

same in size.


Table 10.21: Area cluster sample for Peter’s Computer Support Services

CLUSTER A:

1 Wilro Krans Cnt, Graphite Rd Jaco Wolmarans (011) 768 4977 G B- Steer 150
2 Jeppe Street, Johannesburg Rasheed Ahmed (011) 333-4823 G B- Steer 87
3 Twist Street, Johannesburg Elizabeth (011) 725-1638 G KFC 26
4 Berea City Philemon, (012) 320-0391 G KFC 18
5 83 Voortrekker Street Douglas (011) 869-7339 G KFC 28
6 Wynberg, 2nd Avenue, Main Road Clement Roy (011) 887-3765 G KFC 21
7 Lochner Str, Akasia Lizzie, Johannes (012) 549-3561 G B- Steer 46
8 Dobsonville 1, Busi Nayaba (011) 988-3982 G KFC 28

1 Main Road, Port Shepstone Caren (039) 682 6232 KZN B- Steer 70
2 Marlin Drive, Hibberdene Mahadawu (039) 699 2465 KZN B- Steer 69
3 Mayfair, Marine Pare Margate Anastatia (039) 312 1089 KZN B- Steer 55
4 Kraal Kraft, Rockysdrift, Mount Edge Leon Cremer (031) 758-1103 KZN KFC 9

1 N1 Snelweg, Bellville Willie Fryer (021) 655-1003 WCape KFC 13
2 Barclay Road, Beaufort West Stan Koortz (023) 414 3569 WCape KFC 11
3 Sanlam Centre, Parow Hendrik van Zyl (021) 930-5815 WCape KFC 10

CLUSTER B:

1 101 Buren Rd, Bedfordview Basil Bibis (011) 450 2473 G KFC 66
2 Main Rd, Alberton Martin, Clinton (011) 869-8020 G B- Steer 135
3 102 Mooi River Drive, Norkem Sarah (011) 972 7111 G KFC 46
4 Corlett Drive, Birnam Jean (011) 888 2250 G KFC 42
5 DF Malan Drive, Blackheath Sipho Cele (011) 888 1633 G KFC 41
6 Hendrik Potgieter Str, Dale Billy (011) 744 2383 G KFC 29
7 Soshanguve 1, 276 Bock H Lawrence (012) 797-2571 G KFC 27
8 Benoni CBD Themba/Busi, (011) 421-8220 G KFC 20
9 Bertrams (Bezuidenhouts Valley) Sunny, Thoko (011) 614-9332 G KFC 5

1 584 Tara Rd, Wentworth Bernd (031) 461 2780 KZN B- Steer 98
2 Flanders Drive, Mount Edge Russel (031) 502 3352 KZN B- Steer 77
3 Mbance Centre, Durban Edmeades, (031) 555 3352 KZN B- Steer 12
4 Kranskop Engen 1-Stop Eben Espach (031) 717-3001 KZN KFC 5

1 19a Schmidsdrift Road, Parow Henni (021) 950-2558 WCape KFC 45
2 The Club, Constantia Schalk van Dyk (021) 412 5569 WCape KFC 22

The two clusters are summarised as follows:

Table 10.22: Difference between a stratified and a cluster sample

Strata (columns) = homogeneous grouping; clusters (rows) = heterogeneous grouping

            Gauteng (N = 17)   KZN (N = 8)   Western Cape (N = 5)
Cluster A   Gauteng (N = 8)    KZN (N = 4)   Western Cape (N = 3)
Cluster B   Gauteng (N = 9)    KZN (N = 4)   Western Cape (N = 2)

Assume that cluster B has been chosen randomly. You will draw a representative sample (n = 6) of the

population, (i.e. 3 from Gauteng, 2 from KZN and 1 from the Western Cape) from cluster B using the

random or systematic procedure approach (discussed earlier). Given all the above considerations, the

cluster sample is not a popular choice among researchers mainly because of a lack of knowledge about

how to apply it correctly. See Table 10.22.
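
The two-step logic (pick one cluster at random, then draw the sample inside it) can be sketched as below. The names `CLUSTERS`, `PER_PROVINCE` and `area_cluster_sample` are hypothetical; the counts come from Table 10.22, clients are identified by their number within the chosen cluster as in Table 10.21, and Python's pseudo-random generator stands in for the coin flip and the random numbers table.

```python
import random

# Number of clients per province in each cluster (Table 10.22).
CLUSTERS = {
    "A": {"Gauteng": 8, "KZN": 4, "Western Cape": 3},
    "B": {"Gauteng": 9, "KZN": 4, "Western Cape": 2},
}
# Interviews required per province so that n = 6 mirrors the population.
PER_PROVINCE = {"Gauteng": 3, "KZN": 2, "Western Cape": 1}


def area_cluster_sample(clusters, per_province, seed=None):
    """Choose one cluster at random, then sample within it by province."""
    rng = random.Random(seed)
    chosen = rng.choice(sorted(clusters))  # the 'coin flip'
    picks = {
        province: sorted(rng.sample(range(1, clusters[chosen][province] + 1), count))
        for province, count in per_province.items()
    }
    return chosen, picks


cluster, picks = area_cluster_sample(CLUSTERS, PER_PROVINCE, seed=7)
print(cluster, picks)
```

Only the chosen cluster is ever visited, which is where the saving in fieldwork time and cost comes from.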


10.5.1.6 Multistage area sampling

The multistage sampling technique involves using a combination of two or more probability sampling

techniques. Figure 10.5 explains.

Figure 10.5: The multi-stage sampling technique

SOUTH AFRICA > WESTERN CAPE > CAPE TOWN > V & A WATERFRONT > APARTMENT BLOCK > APARTMENT (sample unit) > INTERVIEW (sample element)

This sample technique is commonly used when the research covers a wide geographical area, e.g. cross-border or national surveys. In using multiple steps (where each step could imply a different sampling technique), one would move from the larger target area to the smaller, until the sample element has been identified. Figure 10.5 provides an example of such an application. For instance, one would select the number of interviews to be done by province using a quota sample (Western Cape in proportion to South Africa), then moving to Cape Town, one could use a stratification process by suburb. In selecting the V & A Waterfront, one would use a systematic sample (every nth erf or plot is chosen) and given the apartment block, a simple random number procedure can be followed to identify which apartment is to be selected. Ultimately the sample element is chosen using the ‘birthday rule’ (Zikmund, 2003).

Also, having done the Western Cape, a similar procedure is followed for the other provinces to arrive at a

national sample.

10.5.2 NON-PROBABILITY SAMPLES

The execution of non-probability samples is much simpler. Starting with a convenience sample:

10.5.2.1 The Convenience Sample

If a researcher is pressed for time, he/she may turn to a convenience sample to obtain key information quickly. In the case of Peter’s Computer Support Services, the researcher might find him/herself in Johannesburg and consequently decide to quickly interview two businesses which are in close proximity to where he/she works in the Johannesburg CBD.

3 Jeppe Street, Johannesburg Rasheed Ahmed (011) 333-4823 G B- Steer 87

13 Twist Street, Johannesburg Elizabeth (011) 725-1638 G KFC 26

When time and money are an issue (because one wants to obtain results quickly and cheaply), a

convenience sample is usually drawn. In this case researchers will draw sample units that are either

conveniently located or close at hand or easily accessible (Shao, 2002). This technique is also referred to as

accidental or haphazard sampling (Zikmund, 2003).

10.5.2.2 Judgmental sample

For a judgmental sample, each sample unit is selected by using the researcher’s personal judgment and experience. Sample units are therefore selected to satisfy the researcher’s purposes, even though they are not fully representative (Zikmund, 2003).

Here the researcher of Peter’s Computer Support Services makes a judgment (rightly or wrongly) to interview all KFC clients, as they are perceived (not necessarily correctly) to be the most profitable and to bring in the most business.

10.5.2.3 Quota sample

Quota samples try to obtain a representative sample at a very low cost. They are very convenient and

easy to draw. A quota sample involves a two-step process. It starts with the development of categories

of the population elements. Usually these are obtained from the Government population statistics

(censuses) or past company profile studies where categories of the target population are known by

province, race, income, gender, education. This information is usually freely available.

A representative sample is only drawn once the categories are known.


If a researcher wants to ensure that various sub-groups of a population will be represented to the exact extent

that the investigation desires, it calls for a quota sample. The researcher will therefore include a sufficient

number of sample units with particular characteristics such as age, race, income, area, etc., to be used for the

quota sample. In this case the researcher will determine the percentage or proportions of the target population

to reflect the proportion in the population under investigation (Shao, 2002 and Zikmund, 2003). This information can be obtained from Government census statistics, among other sources.

When the target population proportions are known, as in the case of Peter’s Computer Support Services, a quota sample can be set up as follows:

Province          Gauteng (N = 17)            KZN (N = 8)                 Western Cape (N = 5)
Share of frame    56.7%                       26.7%                       16.7%
Sample (n = 6)    56.7% x 6 clients (n = 3)   26.7% x 6 clients (n = 2)   16.7% x 6 clients (n = 1)

At first glance one would think that it is exactly the same as the stratified sample (as in section 10.5.1.4)

but there is one key difference:

One is asked to interview any three businesses from Gauteng, ANY two from KZN and ANY one from the

Western Cape. In other words, everything is left to the discretion of the researcher (or fieldworker).

In this regard, the researcher might have drawn the following (correctly defined) quota sample:

Gauteng (n = 3):

1 Wilro Krans Cnt, Graphite Rd Jaco Wolmarans (011) 768 4977 G B- Steer 150

2 Main Rd. Alberton Martin, Clinton (011) 869-8020 G B- Steer 135

3 Jeppe Street, Johannesburg Rasheed Ahmed (011) 333-4823 G B- Steer 87

KZN (n = 2)

18 584 Tara Rd, Wentworth Bernd (031) 461 2780 KZN B- Steer 98

19 Flanders Drive, Mount Edge Russel (031) 502 3352 KZN B- Steer 77

Western Cape (n = 1)

26 19a Schmidsdrift Road, Parow Henni (021) 950-2558 WCape KFC 45

Just calculate the average spent on IT and see the obvious mistake… And therein lies the danger!
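
The calculation suggested above can be checked with a short Python sketch. The spend figures are those listed in Table 10.20; treating their average as the benchmark, and numbering the clients 1 to 30 in that table's row order, are assumptions made for this illustration.

```python
# Rand spent on IT by clients 1..30. ASSUMPTION: the sampling frame is numbered
# in the row order of Table 10.20 (Gauteng 1-17, KZN 18-25, Western Cape 26-30).
SPEND = [150, 135, 87, 46, 66, 46, 42, 41, 29, 28, 28, 27, 26, 21, 20, 18, 5,
         98, 77, 70, 69, 55, 12, 9, 5,
         45, 22, 13, 11, 10]

# Client numbers of the quota sample listed above.
QUOTA_SAMPLE = [1, 2, 3, 18, 19, 26]

quota_mean = sum(SPEND[unit - 1] for unit in QUOTA_SAMPLE) / len(QUOTA_SAMPLE)
frame_mean = sum(SPEND) / len(SPEND)

print(round(quota_mean, 1))  # 98.7
print(round(frame_mean, 1))  # 43.7
```

The quota is filled correctly by province, yet the fieldworker has taken the highest-spending clients in each province, so the sample average is more than double the average across all 30 clients.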

10.5.2.4 Snowball sample

A snowball sample technique involves interviewing initial respondents and asking them to provide names of similar respondents to be included in the sample. This is a very popular method to use for minority populations such as millionaires, the disabled, hobbyists, etc. (Shao, 2002).

In this instance, the researcher meets up with a customer who requires C++ support. After the contract

is finished, the researcher now asks the (happy) customer whether he/she knows about other clients

who also run C++ and require programming support.

oOo