03_SamplingDataCollection


Variables

A variable is anything that can take on differing or varying values. The values can vary at different times for the same object or person, or at the same time for different objects or persons. Variables are used to capture a concept in research.

Examples are production unit, absenteeism, motivation etc.

There are 4 types of variables - dependent, independent, moderating & intervening.

Dependent Variable

The dependent variable is the variable of primary interest to the researcher. Most research aims to understand & describe a dependent variable, explain its variability or predict it.

For example, a manager is concerned with the sales of a product after a recent advertising campaign. The dependent variable here is sales. Since the sales of the product can vary - low, medium or high - it is a variable. Since sales are the main focus of interest to the manager, it is the dependent variable.

It is possible to have more than one dependent variable in a study. For example, volume of output and quality often come up together as dependent variables.

Independent Variable

An independent variable is one that influences the dependent variable in either a positive or negative way. For example, a successful product introduction like the iPhone can influence the market price of the firm's share. That is, the more successful the new product turns out to be, the higher the stock market price of that firm will be. Here, the success of the new product is the independent variable whereas the stock market price is the dependent variable.

Moderating or Extraneous Variable

This is the variable that has a strong influence on the relationship between the independent & dependent variables. That means a moderating variable can modify the underlying relationship between the independent & dependent variables.

➔ Workforce Diversity (independent) >> Org Effectiveness (dependent)
➔ Managerial effectiveness (moderating)

Intervening Variable

An intervening variable is one that surfaces between the time the independent variables start operating to influence the dependent variable and the time their impact is felt on it.

➔ Workforce Diversity (independent) >> Org Effectiveness (dependent)
➔ Creative Synergy (intervening)

In controlled experiments, independent variables may be introduced or manipulated either by the researcher or by someone else who is providing the service. In these situations, there are 2 sets of variables.

➔ Active variables - those variables that can be manipulated, changed or controlled.
➔ Attribute variables - those variables that can’t be manipulated, changed or controlled.

For example, in a study of the relative effectiveness of 3 teaching models, the researcher does not have any control over characteristics of the student population such as their age, gender or motivation to study. These characteristics are attribute variables, whereas the teaching model is an active variable.

Qualitative or Categorical - Nominal or ordinal scale.
Quantitative or Continuous - Interval or ratio scale.

Categorical variables are
1. Constant variable - has only one category or value, for example taxi, tree etc.
2. Dichotomous variable - has only 2 categories, for example male/female, good/bad etc.
3. Polytomous variable - can be divided into more than 2 categories, for example hindu/muslim/christian or excellent / good / bad etc.

Operationalization

A researcher has to know what to measure before knowing how to actually measure it. The problem definition phase of the research process should suggest the concepts that must be measured. A concept can be thought of as a generalized idea that represents something of meaning.

Concepts such as age, sex, education & number of children are relatively concrete properties and easy to define and measure.

Other concepts like loyalty, personality, trust, corporate culture, customer satisfaction, motivation, learning etc are more abstract in nature and quite difficult to define & measure.

Operationalization is the process through which researchers measure concepts, especially abstract or subjective concepts like motivation & learning. Through operationalization, we can break an abstract concept into

1. Dimensions, which are behavioral characteristics or facets.
2. Elements within each dimension, which can be quantitatively measured.


For example, let's take the abstract concept of 'thirst'. A thirsty person exhibits the behavior of drinking lots of fluids. That means we can measure the abstract concept of thirst by counting the number of times he/she drinks fluids.

Sometimes, a single variable cannot capture a concept alone. A construct is a term used for concepts that are measured with multiple variables. For instance, when a business researcher wishes to measure the customer orientation of a salesperson, several variables like these may be used, each captured on a 1–5 scale:

1. I offer the product that is best suited to a customer’s problem.
2. A good employee has to have the customer’s best interests in mind.
3. I try to find out what kind of products will be most helpful to a customer.

Constructs can be very helpful in operationalizing a concept.

Scales

A scale is a device providing a range of values that correspond to different values in a concept being measured. Business researchers use many scales or number systems. Not all scales capture the same richness in a measure. Not all concepts require a rich measure. Traditionally, the level of scale measurement is seen as important because it determines the mathematical & statistical operations that are allowable. The four levels or types of scale measurement are nominal, ordinal, interval, and ratio level scales. Each type offers the researcher progressively more power in analyzing and testing the validity of a scale.

Nominal Scale - splits data into groups, such as Men / Women.
➔ Most elementary level of measurement.
➔ Qualitative in nature.
➔ Student id, Emp # are examples.
➔ Numerical Operation - count.
➔ Statistics - Frequency, Mode

The nominal scale is commonly used for obtaining personal data such as gender, department or country.
Your gender : Male / Female
Your department : Marketing / Finance / Production / Others


Ordinal Scale - ranks data in some order.
However, the ordinal scale does not give any indication of the magnitude of the difference among the ranks.

➔ Ranking scale
➔ Ranking based on preference.
➔ Numerical Operation - count, order
➔ Statistics - Frequency, Mode, Median, Range

For example, income can be measured either quantitatively (in rupees) or qualitatively using categories such as ‘above average’ ‘average’ or ‘below average’. The category ‘above average’ indicates that people so grouped have more income than people in the ‘average’ category.

Interval Scale - sets the data on a continuum.
An interval scale allows us to perform certain arithmetical operations on the data collected from the respondents. In addition to categorizing & ranking, the interval scale lets us measure the distance between any two points on the scale. This scale has a starting & terminating point and is divided into equally spaced units/intervals.

➔ Has both nominal & ordinal properties.
➔ Supports addition & subtraction; ratios are not meaningful because the origin is arbitrary.
➔ Statistics - frequency, mode, median, range, mean, variance, SD.

The interval scale is used when responses to various items that measure a variable can be tapped on a 5 point or 7 point scale, which can thereafter be summated across the items (Learning example).

Using the scale below, please indicate your response to each of the items that follow, by circling the number that best describes your feeling.
>> Strongly disagree - 1
>> Disagree - 2
>> Neither - 3
>> Agree - 4
>> Strongly agree - 5

➔ My job offers a chance to advance - 1 / 2 / 3 / 4 / 5
➔ This job means a lot to me - 1 / 2 / 3 / 4 / 5
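Summating responses across items, as described for the interval scale, can be sketched in a few lines of Python. This is only an illustration: the respondent names and circled values are made up, and `summated_score` is a hypothetical helper, not part of any survey package.

```python
# Hypothetical data: each respondent's circled number (1-5) for the two items above.
responses = {
    "respondent_a": [4, 5],
    "respondent_b": [2, 3],
}


def summated_score(item_scores, low=1, high=5):
    """Add up the item scores after checking that each lies on the scale."""
    for score in item_scores:
        if not low <= score <= high:
            raise ValueError(f"score {score} is outside the {low}-{high} scale")
    return sum(item_scores)


# One summated score per respondent.
scores = {name: summated_score(items) for name, items in responses.items()}
print(scores)
```

A higher summated score indicates stronger overall agreement across the items that measure the variable.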

Ratio Scale

The ratio scale overcomes the disadvantage of the arbitrary origin point of the interval scale. That is, the ratio scale not only measures the magnitude of the differences between points on the scale, but also taps the proportions in the differences. It is the most powerful of the four scales, because it has a unique zero origin.

A weighing balance is a good example of a ratio scale. It has an absolute zero origin calibrated on it, which allows us to calculate the ratio of the weights of two individuals. For instance, a person weighing 100 kg is twice as heavy as one who weighs 50 kg.

Ratio scales are usually used when exact numbers are called for. The responses to these questions can vary from zero to any reasonable figure.

➔ How many retail outlets do you operate?
➔ How many years of experience do you have?
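The statistics listed for each scale level can be tried out with Python's standard `statistics` module. The data below are invented purely for illustration, and the numeric coding of the ordinal categories (1 = below average, 2 = average, 3 = above average) is an assumption.

```python
import statistics

# Invented sample data, one list per scale level.
department = ["Marketing", "Finance", "Marketing", "Production"]  # nominal
income_rank = [1, 2, 2, 3, 3]    # ordinal: 1 = below average ... 3 = above average
agreement = [1, 2, 4, 4, 5]      # interval: 5 point agreement scale
outlets = [0, 2, 4, 10]          # ratio: number of retail outlets

# Nominal: only counting is meaningful, so the mode is the usual summary.
nominal_mode = statistics.mode(department)

# Ordinal: order is meaningful, so the median can be used as well.
ordinal_median = statistics.median(income_rank)

# Interval: distances are meaningful, so mean and SD become available.
interval_mean = statistics.mean(agreement)
interval_sd = statistics.stdev(agreement)

# Ratio: a true zero makes proportions meaningful.
ratio_of_outlets = outlets[3] / outlets[1]

print(nominal_mode, ordinal_median, interval_mean, interval_sd, ratio_of_outlets)
```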


Data Sources

Primary data refers to data collected first hand by researchers.

Individuals

Focus Group
➔ 8 to 10 members, with a moderator, max 2 hours.
➔ Obtain impressions, interpretations & opinions.
➔ Relatively inexpensive & dependable data in a short time frame.
➔ Can’t be considered truly representative.
➔ Video conf, online focus groups etc.

Panel
➔ Meet more than once. Static or Dynamic.
➔ Good when several aspects of a concept are to be studied from time to time.

Unobtrusive Methods
➔ Also known as Trace methods.
➔ Primary source that doesn’t involve people.
➔ # of different brands of soft drink cans found in a trash bag.
➔ Wear & tear of journals in a library offers a good indication of their popularity, frequency of use or both.

Secondary data are those gathered and recorded by someone else prior to the current project, possibly for some other purpose.

Main advantages of secondary data are
➔ Less expensive
➔ Readily available.
➔ Faster
➔ Provide info not available otherwise. (census, industrial output, economic survey etc)

The disadvantages associated with secondary data are
➔ Validity & Reliability
➔ Personal bias (Cross checking with other sources)
➔ Not intended specifically to meet researcher’s needs.
➔ Outdated information.
➔ Different units of measurement. (Data conversion required)
➔ Format (example date)

Major objectives of secondary data are
1. FACT FINDING - identifying consumption patterns or tracking trends.
2. MODEL BUILDING - estimating market potential, forecasting sales, selecting trade areas/sites.
3. DATABASE MARKETING - Develop prospect list

Applicability to the current project


➢ Do the data help to answer questions set out in the problem definition?
➢ Do the data apply to the time period of interest?
➢ Do the data apply to the population of interest?
➢ Do the terms & variable classification presented apply to the current project?
➢ Are the units of measurements comparable?

Is it possible to go to the original source of data?
Is the cost of data acquisition worth it?

Accuracy of data
➢ Is there a possibility of bias?
➢ Can the accuracy of the data be verified?
➢ Is using the data worth the risk?

Data Collection Methods

There are many methods used to collect or obtain data for research. The choice of method depends on the purpose of the study, the resources available & the skills of the researcher. Socio-economic & demographic characteristics of the study population can also influence the choice of data collection method.

Three of the most popular methods are
1. Surveys
2. Observation
3. Experiments

Survey Research

Survey research is a method of data collection in which information is obtained directly from individual persons who are selected.

A survey provides 5 types of information about respondents.
1. Facts - background characteristics (age, occupation) & personal history (place of birth, political affiliation etc.)
2. Perceptions - statements of what individuals know about the world.
3. Opinions - statements of preference or judgments.
4. Attitudes - relatively stable evaluations of and orientations toward events, objects & ideas.
5. Behavioral Reports - statements of how people act.

The survey instruments can be
➔ Interview schedule - to be filled in by the interviewer.
➔ Questionnaire - to be filled in by the respondent.

Interview

Any person to person interaction, either face to face or otherwise, between 2 or more individuals with a specific purpose in mind is called an interview.

When interviewing a respondent, you as a researcher, have the freedom to decide the format and contents of the questions, select the wording of the questions, decide the way you want to ask them and choose the order in which they are to be asked.


Unstructured Interviews
➔ Flexible interview structure
➔ Flexible interview contents
➔ Flexible questions.

Structured Interviews
➔ Rigid interview structure
➔ Rigid interview contents
➔ Rigidity in interview questions & their wording.

An interview schedule is a written list of questions (open ended or closed) prepared for use by an interviewer in a person to person interaction. Note that an interview schedule is a research tool/instrument for collecting data, whereas interviewing is a method of data collection.

One of the main advantages of the structured interview is that it provides uniform information, which assures the comparability of data. Structured interviewing requires fewer interviewing skills than does unstructured interviewing.

Advantages of interviews
➔ More appropriate for complex situations.
➔ Useful for collecting in-depth information.
➔ Higher response rate.
➔ Information can be supplemented through observations.
➔ Questions can be explained.
➔ Can be used with any type of population.

Disadvantages of interviews
➔ Time consuming & expensive.
➔ Depends on the quality of the interaction.
➔ Depends on the quality of the interviewer.
➔ Interviewer bias.
➔ Lack of anonymity may result in biased responses.

Various types of interviews are
➔ Face to face
➔ Telephonic
➔ Video assisted
➔ Web assisted
➔ Door to door
➔ Mall intercept etc.

Questionnaire

A questionnaire is a written list of questions, the answers to which are recorded by respondents. In a questionnaire, respondents read the questions, interpret what is expected and then write down the answers. The only difference between an interview schedule and a questionnaire is that in the former, it is the interviewer who asks the questions and records the respondents' replies on an interview schedule.

In the case of questionnaire, as there is no one to explain the meaning of the questions to respondents, it is important that the questions are clear and easy to understand. Also, the layout of a questionnaire should be such that it is easy to read and pleasant to the eye and sequence of questions should be easy to follow. It is better to include some interactive statement in questionnaire to set the context.

A questionnaire can be administered in different ways.


Mail - The most common approach to collecting information is to send the questionnaire to prospective respondents by mail. Obviously this approach presupposes that you have access to their addresses. Usually it is a good idea to send a prepaid, self-addressed envelope with the questionnaire as this might increase the response rate. A mailed questionnaire must be accompanied by a covering letter (see below for details). One of the major problems with this method is the low response rate. In the case of an extremely low response rate, the findings have very limited applicability to the population studied.

Collective Administration - One of the best ways of administering a questionnaire is to obtain a captive audience such as students in a classroom, people attending a function, participants in a programme or people assembled in one place. This ensures a very high response rate as you will find few people refuse to participate in your study. Also, as you have personal contact with the study population, you can explain the purpose, relevance and importance of the study and can clarify any questions that respondents may have.

Administration in a public place - Sometimes you can administer a questionnaire in a public place such as a shopping centre, health centre, hospital, school or pub.

Advantages of questionnaires
➔ Less expensive.
➔ Offers greater anonymity.

Disadvantages of questionnaires
➔ Can’t be used with all populations - children, illiterate.
➔ Lower response rate.
➔ Self selection bias.
➔ Opportunity to clarify issues is lacking.
➔ Response to a question may be influenced by the response to another question.
➔ Possibility of consulting others.
➔ Response can’t be supplemented with other information.

Interview Vs Questionnaire
➔ Nature of investigation. Questionnaire is best for sensitive questions on crime, sexuality, drug related etc.
➔ Geographical distribution of population.
➔ Type of population - Interview is best for children, old people, illiterate etc.

Questionnaire Design

Content ➔
Form ➔ Open vs closed ended questions
➔ Positively / Negatively worded.
Format ➔
Wording ➔ Length, ambiguity, bias, double barrelled questions, leading questions.
Structure ➔ Explanation >> warm up questions >> substantive questions >> demographic questions

** Others ➔ Cover letter, Money (pens, lottery), followups, advance notification, survey sponsorships etc.

➔ Always use simple & everyday language - avoid questions like “Is anyone in your family a dipsomaniac?”
➔ Do not use ambiguous questions - “Are you satisfied with your canteen?” << which aspect of the canteen >>
➔ Do not ask double barreled questions - “How often and how much time do you spend on each visit?”
➔ Do not ask leading questions - “Smoking is bad, isn’t it?”
➔ Do not ask questions that are based on presumptions - “How many cigarettes do you smoke in a day?” - assuming that you are a smoker.
➔ Sensitive questions - 2 thoughts -> direct manner or indirect manner.
➔ Order of questions - 2 thoughts -> Random vs Logical progress
➔ Pre test the research instrument.

Double Barreled Question - It is a question within a question. The main problem with this type of question is that one does not know which particular question the respondent has answered. “Does your department have a special recruitment policy for racial minorities & women?”

Leading Question - A question that, by its contents, structure or wording, leads a respondent to answer in a certain direction.

Indirect Manner
➔ By showing drawings or cartoons.
➔ By asking respondents to complete a sentence.
➔ By using random devices.

Observation Methods

Observation is the systematic process of recording behavioral patterns of people, objects and occurrences as they happen. No questioning or communication is needed.

Observational studies can gather 7 kinds of observable phenomena.
1. Physical action - a worker’s movement during an assembly process.
2. Physical objects - % of recycled material compared to trash.
3. Verbal behavior - statements made by airline passengers while waiting in line.
4. Expressive behavior - facial expressions, body language, eye contact, personal space, gestures, manners etc.
5. Spatial relations & locations - proximity of middle manager’s office to CEO office.
6. Temporal patterns - length of time required to execute an order.
7. Verbal & pictorial records - number of illustrations appearing in a training booklet.


There are 2 types of observations.

Participant Observation - You as a researcher participate in the activities of the group being observed in the same manner as its members, with or without their knowing that they are being observed. For example, if you want to study the life of prisoners, you might pretend to be a prisoner in order to do this.

Non Participant Observation - You as a researcher do not get involved in the activities of the group, but remain a passive observer, watching & listening to its activities. Examples are occupational studies, such as a study of the functions carried out by nurses in a hospital.

Nature of Observation
➔ Human observation - for situations or behavior that are not easily predictable.
➔ Mechanical observation - for situations or behavior that are routine or repetitive in nature. Examples are supermarket scanners & traffic counters.
➔ Visible observation - observation in which the observer’s presence is known to the subject.
➔ Hidden observation - observation in which the subject is unaware that observation is taking place.
➔ Direct observation - a straightforward attempt to observe and record what naturally occurs; the investigator does not create an artificial situation. Examples are competitive prices, promotion details.
➔ Indirect or Contrived observation - observation in which the investigator creates an artificial environment in order to test a hypothesis. (Mystery shoppers)

The major advantage of observational studies compared to surveys, which obtain self-reported data from respondents, is that the data are free from distortions, inaccuracies, or other response biases due to memory error, social desirability bias, and so on. The data are recorded when the actual behavior takes place.

Observation of Human Behavior

Observation of Physical Objects
➔ Trace methods
➔ Wear & tear of books in a library or replacement rate of tiles in a museum.
➔ Count & record physical inventories through retail or wholesale audit.

Content Analysis

Obtains data by observing and analyzing the contents or messages of advertisements, newspapers, television programs, letters etc.

For example, content analysis of advertisements might evaluate their use of words, themes, characters, or space and time relationships. Another topic of content analysis is the frequency with which women, African-Americans, or ethnic minorities appear in mass media.

Major challenges are
➔ When individuals or groups become aware that they are being observed, they may change their behavior.
➔ There is always a possibility of observer bias.
➔ The interpretations drawn from observations may vary from observer to observer.


Methods used for recording are
1. Narrative recording - qualitative research.
2. Using scale
3. Categorical recording - always / sometimes / never
4. Electronic devices

Experimental Research

Projective Methods

These are psychological techniques to get answers without asking a direct question. As a result, respondents usually project their unconscious beliefs, attitudes & feelings.

➔ Associations -
➔ Construction - participants must construct a story or picture from a concept.
➔ Completion - sentence or story.
➔ Expressive - for situations when respondents cannot describe their actions but can demonstrate them.
➔ Choice Ordering - respondents list benefits from most to least important.

Thematic Apperception Test (TAT) - make a story around a picture that is shown.
Zaltman Metaphor Elicitation Technique (ZMET) - metaphor elicitation, collage building & brand stories.
Video Elicitation - Point & counterpoint

These methods are mainly used in market research, when we want to see how individuals associate different products, brands, ads etc. in their minds.

Sampling Plan

A sample is a subset or some part, of a larger population. The purpose of sampling is to estimate an unknown characteristic of a population.

A population is any complete group - people, sales territories, students - that shares some common set of characteristics. The term population element refers to an individual member of the population.


A census is an investigation of all individual elements that make up the population.

Sampling can
>> Cut costs
>> Reduce labor requirements
>> Gather vital information quickly.
>> Avoid destruction of test units.
>> Most properly selected samples can give results that are reasonably accurate.

"Precision has suffered, but accuracy has not".

A sample may on occasion be more accurate than a census. Interviewer mistakes, tabulation errors, and other nonsampling errors may increase during a census because of the increased volume of work.

Define the target population
➔ Not straightforward sometimes.
➔ Purchasing agents vs industrial engineers.
➔ By geography, demographics, use, awareness etc.

Select a sampling frame
➔ List of elements from which a sample may be drawn. Also called the working population, since these units will eventually provide the units involved in analysis.
➔ A university email directory can be considered a sampling frame for a given university's student population.
➔ In practice, almost every list excludes some members of the population.
➔ Sampling frame error occurs when certain sample elements are excluded or when the entire population is not accurately represented in the sampling frame.

Determine sampling type
➔ Probability - a sampling technique in which every member of the population has a known, nonzero probability of being selected.
➔ Non probability - units of the sample are chosen on the basis of personal judgment or convenience. There are NO statistical techniques for measuring random sampling error in a non-probability sample. Therefore, generalizability is never statistically appropriate.

Simple Random
➔ Purest form of probability sampling.
➔ Assures each element in the population has an equal chance of being included in the sample.
➔ Advantages - minimal knowledge of population needed, easy to analyze the data.
➔ Disadvantages - high cost, low frequency of use, requires a sampling frame, does not use researcher’s expertise, population should be homogeneous.
➔ Methods - Fishbowl draw, Computer program or Random table.
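The "computer program" method can be sketched with Python's standard `random` module. The sampling frame of 100 student ids below is invented for illustration, and `simple_random_sample` is a hypothetical helper name.

```python
import random


def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n distinct elements; every element has an equal chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(sampling_frame, n)


# Invented sampling frame: a list of 100 student ids.
frame = [f"student_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Because `sample` draws without replacement, no element can appear twice, which mirrors a fishbowl draw in which slips are not returned to the bowl.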

Systematic
➔ An initial starting point is selected by a random process, and then every nth element on the list is selected.
➔ Moderate cost / moderate usage.
➔ Simple to draw sample, easy to verify.
➔ Periodic ordering, Requires sampling frame.
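A minimal sketch of systematic selection follows, assuming the skip interval is taken as the frame size divided by the desired sample size (rounded down). The frame and the helper name `systematic_sample` are illustrative only.

```python
import random


def systematic_sample(sampling_frame, n, seed=None):
    """Pick a random start within the first interval, then take every k-th element."""
    interval = len(sampling_frame) // n  # the "n" in "every nth element"
    if interval < 1:
        raise ValueError("sample size exceeds the size of the sampling frame")
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random starting point: 0 .. interval - 1
    return [sampling_frame[start + i * interval] for i in range(n)]


# Invented sampling frame: list positions 1 to 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```

If the list has a periodic ordering that matches the interval, every selected element shares the same position in the cycle, which is the "periodic ordering" risk noted above.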

Stratified
➔ Sub-samples are randomly drawn from samples within different strata that are more or less equal on some characteristic.
➔ Proportional - the number of sampling units drawn from each stratum is in proportion to the relative population size of that stratum.
➔ Disproportional - the number of sampling units drawn from each stratum is allocated according to analytical considerations.
➔ Assures representation of all groups in the sample.
➔ Characteristics of each stratum can be estimated and comparisons made.
➔ Reduces variability from systematic
➔ Requires accurate information on proportions of each stratum
➔ Stratified lists costly to prepare
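Proportional allocation can be illustrated with a short Python sketch. The strata (three departments of 50, 30 and 20 employees) are invented, and `proportional_stratified_sample` is a hypothetical helper. Rounding each stratum's share is a simplification, so the total may differ slightly from the requested size for other inputs.

```python
import random


def proportional_stratified_sample(strata, n, seed=None):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)  # this stratum's share of the sample
        sample[name] = rng.sample(members, k)  # simple random draw within the stratum
    return sample


# Invented strata: employees grouped by department.
strata = {
    "Marketing": [f"M{i}" for i in range(50)],
    "Finance": [f"F{i}" for i in range(30)],
    "Production": [f"P{i}" for i in range(20)],
}
sample = proportional_stratified_sample(strata, 20, seed=1)
print({name: len(units) for name, units in sample.items()})
```

A disproportional design would replace the computed `k` with sizes chosen on analytical grounds, for example a larger draw from a small but highly variable stratum.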

Cluster
➔ The primary sampling unit is not the individual element, but a large cluster of elements. Either the cluster is randomly selected or the elements within are randomly selected.
➔ Area sample - primary sampling unit is a geographical area.
➔ Multi Stage area sample - involves a combination of two or more types of probability sampling techniques. Typically, progressively smaller geographical areas are randomly selected in a series of steps.

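A two-stage version of cluster sampling can be sketched as follows: clusters are drawn at random first, then elements are drawn at random within each chosen cluster. The areas and households are invented, and `cluster_sample` is a hypothetical helper, not a standard library function.

```python
import random


def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Stage 1: draw clusters at random. Stage 2 (optional): draw elements within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # primary sampling units
    sample = {}
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None:
            sample[name] = list(members)  # one-stage: keep every element of the cluster
        else:
            k = min(n_per_cluster, len(members))
            sample[name] = rng.sample(members, k)  # secondary sampling units
    return sample


# Invented area sample: 5 geographical areas with 8 households each.
areas = {f"area_{a}": [f"household_{a}_{h}" for h in range(8)] for a in "ABCDE"}
two_stage = cluster_sample(areas, n_clusters=2, n_per_cluster=3, seed=7)
print(two_stage)
```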

Select Sampling Units
➔ A single element or group of elements subject to selection in the sample.
➔ Primary Sampling Unit - a term used to designate a unit selected in the first stage of sampling.
➔ Secondary Sampling Unit - a term used to designate a unit selected in the second stage of sampling.

Determine the Sample Size

Sample size determination is closely related to statistical estimation. The statistical determination of sample size requires knowledge of

➔ SD of the population.
➔ Magnitude of acceptable error.
➔ Confidence level.

If we assume that population SD is known or has been estimated from a pilot study, the formula for sample size is derived from maximum error of estimate formula.


SD of population (S)
➔ Only a small sample is required if the population is homogeneous.
➔ Predicting the average age of college students requires a smaller sample than predicting the average age of people who visit the zoo on a given day.
➔ Pilot study or rule of thumb.

Magnitude of Error or Precision (E)
➔ Indicates how precise the estimate must be.
➔ Precision depends on managerial judgment or calculations.

Confidence Level (Z)
➔ Managerial judgment.

➜ Z - standardized value that corresponds to the confidence level.
➜ S - sample standard deviation or estimate of the population standard deviation.
➜ E - acceptable magnitude of error.
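The formula that these symbols belong to is not reproduced in these notes. The sketch below assumes the standard form obtained from the maximum error of estimate, n = (Z x S / E)^2, rounded up to the next whole unit. The input values (95% confidence so Z = 1.96, S = 29, E = 2) are made up for illustration.

```python
import math


def sample_size_for_mean(z, s, e):
    """Assumed standard formula n = (Z * S / E)^2, rounded up to a whole unit."""
    return math.ceil((z * s / e) ** 2)


# Illustrative inputs: Z = 1.96 (95% confidence), S = 29, E = 2.
n = sample_size_for_mean(z=1.96, s=29.0, e=2.0)
print(n)  # 808
```

Halving the acceptable error E roughly quadruples the required sample, which is why the precision requirement is usually the most expensive input to tighten.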

For most cases, the size of the population does not have an effect on the sample size. However, a finite correction factor may be needed to adjust a sample size that is more than 5% of a finite population.
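The notes do not state which correction factor is meant. One commonly used form, assumed here, is n_adj = n / (1 + (n - 1) / N), applied only when the initial sample exceeds 5% of the population N. The function name and the example numbers are illustrative.

```python
import math


def finite_population_adjustment(n, population_size, threshold=0.05):
    """Shrink n with an assumed common correction when n exceeds 5% of the population."""
    if n / population_size <= threshold:
        return n  # sample is a small share of the population: no adjustment
    return math.ceil(n / (1 + (n - 1) / population_size))


# Illustrative: an initial sample of 808 from a population of 5000 (about 16%).
adjusted = finite_population_adjustment(808, 5000)
print(adjusted)  # 696
```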


For proportions

Furthermore, a number of easy-to-use tables have been compiled to help researchers calculate sample size. The main reason a large sample size is desirable is that sample size is related to random sampling error. A smaller sample makes a larger error in estimates more likely. Calculation of sample size for a sample proportion is not difficult. However, most researchers use tables that indicate predetermined sample sizes.
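The formula for proportions is also missing from these notes. The sketch below assumes the standard form n = Z^2 x p x q / E^2, where p is the estimated proportion, q = 1 - p, and E is the acceptable error expressed as a proportion. The inputs are illustrative.

```python
import math


def sample_size_for_proportion(z, p, e):
    """Assumed standard formula n = Z^2 * p * q / E^2 with q = 1 - p, rounded up."""
    q = 1.0 - p
    return math.ceil(z ** 2 * p * q / e ** 2)


# Illustrative inputs: 95% confidence (Z = 1.96), p = 0.5, error of 5 percentage points.
n = sample_size_for_proportion(z=1.96, p=0.5, e=0.05)
print(n)  # 385
```

Setting p = 0.5 gives the largest value of p x q, so it is a conservative choice when no prior estimate of the proportion is available.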

Attitudinal Scale Step by step - 176
