Fundamentals of Data Analysis
Preparing the Data for Analysis
• Data editing – the process of identifying omissions, ambiguities and errors in the responses
• Coding – process of assigning numerical values to responses according to a pre-defined system
• Statistically adjusting the data – the process of modifying the data to enhance its quality for analysis
– Weighting, transformations, variable re-specification
Problems Identified With Data Editing
• Omissions
• Ambiguity
• Inconsistencies
• Lack of Cooperation
• Ineligible Respondent
• Solutions to such problems
Coding
• closed-ended questions
– Relatively simple and straightforward
• open-ended questions
– Define all possible responses and categorize each response and then assign a numerical code
– If judgment calls are needed then have several coders do the same task and check inter-coder reliability
– Inter-coder reliability
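The slides do not say how inter-coder reliability should be measured. One common choice for two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below assumes two coders have each assigned one code to the same open-ended responses; the function name, the code meanings and the data are hypothetical.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per response."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of responses")
    n = len(coder_a)
    # Share of responses on which the two coders chose the same code
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's own code frequencies
    count_a, count_b = Counter(coder_a), Counter(coder_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes for ten open-ended answers: 1 = price, 2 = quality, 3 = service
coder_a = [1, 2, 2, 3, 1, 2, 3, 3, 1, 2]
coder_b = [1, 2, 3, 3, 1, 2, 3, 2, 1, 2]
kappa = cohens_kappa(coder_a, coder_b)
```

Here the coders agree on 8 of 10 responses, but kappa is lower than 0.8 because some of that agreement would occur by chance.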
Statistical adjustment of data
• Weighting
– Process of enhancing / reducing the importance of certain data by assigning a number
– Usually done to increase the representativeness of the sample or achieve study objectives
• Scale transformations
– Manipulation of scales to make them comparable with other scales, e.g. converting lbs to kg
– Z-scores (standardized scales)
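As a rough illustration of weighting and scale transformation, the sketch below weights respondents so that each gender counts in proportion to an assumed population share, then converts pounds to kilograms and computes z-scores. The respondent data, the 50/50 population split and all variable names are assumptions made for the example.

```python
from collections import Counter
import statistics

# Hypothetical responses: (gender, attitude score); males are over-represented
responses = [("male", 4.0), ("male", 5.0), ("male", 3.0), ("female", 2.0)]
population_share = {"male": 0.5, "female": 0.5}  # assumed to be known

# Weight = population share / sample share for the respondent's group
n = len(responses)
sample_share = {g: c / n for g, c in Counter(g for g, _ in responses).items()}
weight = {g: population_share[g] / sample_share[g] for g in sample_share}

unweighted_mean = statistics.mean(score for _, score in responses)
weighted_mean = (
    sum(weight[g] * score for g, score in responses)
    / sum(weight[g] for g, _ in responses)
)

# Scale transformation: pounds to kilograms (1 lb = 0.45359237 kg)
LB_TO_KG = 0.45359237
body_weight_lb = [150.0, 180.0, 210.0]
body_weight_kg = [w * LB_TO_KG for w in body_weight_lb]

def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

z_lb = z_scores(body_weight_lb)
z_kg = z_scores(body_weight_kg)
```

The z-scores are the same whether computed from pounds or kilograms, which is what makes standardized scales comparable across units.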
• Variable re-specification
– Existing data modified to create new variables
– Large number of variables collapsed into fewer variables
– Creates variables that are consistent with research questions
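A minimal sketch of variable re-specification, assuming three rating items that can be averaged into a single attitude score and an age in years that is collapsed into groups. The field names (`q1`, `q2`, `q3`, `age`) and the age cut-offs are hypothetical.

```python
def respecify(respondent):
    """Derive new variables from a respondent's existing answers."""
    # Collapse three rating items into one composite attitude score
    items = [respondent["q1"], respondent["q2"], respondent["q3"]]
    attitude = sum(items) / len(items)

    # Collapse age in years into a small number of categories
    age = respondent["age"]
    if age < 30:
        age_group = "under 30"
    elif age < 50:
        age_group = "30-49"
    else:
        age_group = "50+"

    return {"attitude": attitude, "age_group": age_group}

new_variables = respecify({"q1": 4, "q2": 5, "q3": 3, "age": 42})
```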
• Determine if the variable is categorical, rank-order, interval level or ratio level.
Categorical Data Analysis - Objectives
• Describing the sample distribution for the variable (e.g. gender)
• Frequencies, percentages, quartiles, percentiles, graphs (bar, line, histogram, pie)
• What are the typical characteristics of the sample?
• Mode
• Does the categorical variable bear any relationship with a distribution of another categorical variable (e.g. gender w.r.t. buy the product or not)
• Cross tabs and chi-square as a measure of association
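The objectives above can be sketched with the standard library alone: a frequency table with percentages, the mode, and a cross tab with Pearson's chi-square statistic for gender against purchase. The records are invented for illustration and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical survey records: (gender, bought the product?)
records = (
    [("male", "yes")] * 30 + [("male", "no")] * 20
    + [("female", "yes")] * 10 + [("female", "no")] * 40
)

def frequency_table(values):
    """Return {category: (count, percentage)} for one categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}

def mode(values):
    """Most frequent category (the first one encountered wins a tie)."""
    return Counter(values).most_common(1)[0][0]

def chi_square(pairs):
    """Pearson chi-square statistic and degrees of freedom for a two-way cross tab."""
    n = len(pairs)
    cell = Counter(pairs)
    row = Counter(r for r, _ in pairs)
    col = Counter(c for _, c in pairs)
    statistic = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n  # count expected if the variables were unrelated
            statistic += (cell[(r, c)] - expected) ** 2 / expected
    dof = (len(row) - 1) * (len(col) - 1)
    return statistic, dof

bought = [b for _, b in records]
purchase_table = frequency_table(bought)
typical_answer = mode(bought)
chi2, dof = chi_square(records)
```

With one degree of freedom, the 5% critical value of the chi-square distribution is 3.841, so a statistic well above that suggests gender and purchase are associated in this made-up sample.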
Rank order data analysis - Objectives
• What are respondent preferences amongst several competing alternatives? (e.g. rank your preferences amongst ten different brands of cars)
– Frequencies, percentages, graphs
• What is the typical preference pattern in the sample (e.g. which car does the sample prefer the most and which one the least?)
– Mode
Rank order data analysis - Objectives
• Are two sets of respondent preferences correlated? (e.g. wrist watches brand preferences with car brand preferences)
– Spearman’s rank correlation coefficient
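For two complete rankings of the same n items with no ties, Spearman's coefficient reduces to

$$\rho = 1 - \frac{6 \sum_i d_i^2}{n(n^2 - 1)}$$

where $d_i$ is the difference between the two ranks of item $i$. A small sketch under that no-ties assumption, with hypothetical rankings:

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two complete rankings 1..n without ties."""
    n = len(ranks_a)
    expected = list(range(1, n + 1))
    if n < 2 or sorted(ranks_a) != expected or sorted(ranks_b) != expected:
        raise ValueError("each input must be a complete ranking 1..n with no ties")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical: the same five items ranked on two different preference questions
first_ranking = [1, 2, 3, 4, 5]
second_ranking = [2, 1, 4, 3, 5]
rho = spearman_rho(first_ranking, second_ranking)
```

When ties are present this shortcut no longer holds, and the usual approach is to compute a Pearson correlation on the (averaged) ranks instead.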
Interval level / Ratio level data analysis - Objectives
• What is the average response in the sample (e.g. what is the mean attitude to the brand?)
– Mean / Median
• What is the average variability of the response in the sample (e.g. on average, how dispersed are the sample’s attitudes to the brand from the mean?)
– Standard deviation
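These summaries are available in Python's `statistics` module. The attitude scores below are made up; `stdev` divides by n - 1 (sample standard deviation) while `pstdev` divides by n (population standard deviation).

```python
import statistics

# Hypothetical attitude scores from eight respondents
attitude = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean_attitude = statistics.mean(attitude)
median_attitude = statistics.median(attitude)
sample_sd = statistics.stdev(attitude)       # divides by n - 1
population_sd = statistics.pstdev(attitude)  # divides by n
```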
Interval level / Ratio level data analysis - Objectives
• Do two or more subgroups in the sample differ from each other on the response / differ from a previously known / hypothesized value?
– E.g. do males like the brand significantly more than females? T tests, z tests
– E.g. does attitude to WU vary by student status (freshman, sophomore, junior, senior)? ANOVA
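The slides name the tests but not a specific variant. As a sketch, the code below computes the test statistics only: a two-sample t statistic using Welch's unequal-variance form (an assumption; the pooled-variance form is the other common choice) and the one-way ANOVA F ratio. Turning either statistic into a p-value requires the t or F distribution, typically from a statistics package. All data are invented.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Two-sample t statistic without assuming equal variances (Welch's form)."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error

def anova_f(groups):
    """One-way ANOVA F ratio: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical brand-liking scores by gender
males = [5.0, 6.0, 7.0, 6.0, 6.0]
females = [4.0, 5.0, 4.0, 5.0, 4.0]
t_stat = welch_t(males, females)

# Hypothetical attitude scores for three student-status groups
f_stat = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```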
Interval level / Ratio level data analysis - Objectives
• Are sample responses on two variables correlated? (e.g. are sales related to the advertising expenditure?)
– Pearson correlation
• Can we determine the value of the sample’s response on a variable, if we know the value on another variable? (e.g. If we need to achieve 1 million dollars in sales next year, how much should we spend on advertising?)
– Regression analysis
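A minimal sketch of both ideas for one predictor: the Pearson correlation between advertising and sales, and a least-squares line `sales = intercept + slope * advertising` that can be solved backwards for the spend implied by a sales target. The figures and function names are hypothetical, and the units are arbitrary.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical yearly figures in arbitrary units
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(advertising, sales)
intercept, slope = fit_line(advertising, sales)

# Advertising spend implied by a sales target, read off the fitted line
target_sales = 7.0
required_advertising = (target_sales - intercept) / slope
```

In this example the implied spend lies outside the observed range of advertising values, so it is an extrapolation and should be treated with caution.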