Fundamentals of Data Analysis

Fundamentals of Data Analysis. Preparing the Data for Analysis Data editing – the process of identifying omissions, ambiguities and errors in the responses

Preparing the Data for Analysis

• Data editing – the process of identifying omissions, ambiguities and errors in the responses

• Coding – process of assigning numerical values to responses according to a pre-defined system

• Statistically adjusting the data – the process of modifying the data to enhance its quality for analysis

– Weighting, transformations, variable re-specification

Problems Identified With Data Editing

• Omissions

• Ambiguity

• Inconsistencies

• Lack of Cooperation

• Ineligible Respondent

• Solutions to such problems

Coding

• Closed-ended questions

– Relatively simple and straightforward

• Open-ended questions

– Define all possible responses, categorize each response, and then assign a numerical code

– If judgment calls are needed, have several coders do the same task and check inter-coder reliability

– Inter-coder reliability
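The slides do not say how inter-coder reliability is measured. As a minimal sketch, assuming two coders have each assigned one numerical code to the same open-ended responses, the following computes simple percent agreement and Cohen's kappa (a common chance-corrected measure, chosen here for illustration). The function names and the sample codes are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of responses that both coders placed in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of responses")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Observed agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: probability that both coders pick the same category
    # if each assigned codes independently at their own observed rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codebook: 1 = price, 2 = quality, 3 = other
coder_a = [1, 2, 2, 3, 1, 2, 3, 1]
coder_b = [1, 2, 3, 3, 1, 2, 2, 1]

print(percent_agreement(coder_a, coder_b))
print(cohens_kappa(coder_a, coder_b))
```

Percent agreement is easy to read but is inflated when one category dominates; kappa adjusts for that, which is why both are shown.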

Statistical adjustment of data

• Weighting

– Process of enhancing or reducing the importance of certain data by assigning a number

– Usually done to increase the representativeness of the sample or achieve study objectives

• Scale transformations

– Manipulation of scales to make them comparable with other scales, e.g. converting lbs to kg

– Z-scores (standardized scales)
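The adjustments above can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed procedure: the function names are hypothetical, the weights are taken as given (for example, population share divided by sample share), and the z-scores use the population standard deviation, although the sample standard deviation is an equally common choice.

```python
from statistics import mean, pstdev

KG_PER_LB = 0.45359237  # exact definition of the avoirdupois pound in kilograms


def weighted_mean(values, weights):
    """Mean in which each response counts in proportion to its weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def lbs_to_kg(pounds):
    """Scale transformation: express a weight recorded in lbs in kg."""
    return pounds * KG_PER_LB


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - m) / s for v in values]


# Hypothetical example: the second respondent group is under-represented,
# so its response is given three times the weight of the first.
print(weighted_mean([4, 2], [1, 3]))
print(lbs_to_kg(150))
print(z_scores([1, 2, 3]))
```

Standardizing puts variables measured on different scales (say, a 5-point and a 7-point attitude scale) on a common footing before they are compared.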

• Variable re-specification

– Existing data modified to create new variables

– Large number of variables collapsed into fewer variables

– Creates variables that are consistent with research questions

• Determine if the variable is categorical, rank-order, interval level or ratio level.
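As an illustration of re-specification, the sketch below collapses a ratio-level variable (age in years) into a three-category variable and combines several rating items into a single index. The cut-offs, category labels, and function names are hypothetical choices made for the example; in practice they would follow from the research questions.

```python
def collapse_age(age_in_years):
    """Collapse a ratio-level age into a categorical age group."""
    if age_in_years < 30:
        return "under 30"
    if age_in_years < 50:
        return "30-49"
    return "50 and over"


def attitude_index(item_scores):
    """Combine several rating items into one variable by averaging them."""
    if not item_scores:
        raise ValueError("at least one item score is required")
    return sum(item_scores) / len(item_scores)


# Hypothetical respondents: age plus three 5-point attitude items
respondents = [
    {"age": 24, "items": [4, 5, 3]},
    {"age": 41, "items": [2, 2, 3]},
    {"age": 67, "items": [5, 4, 5]},
]

for r in respondents:
    r["age_group"] = collapse_age(r["age"])
    r["attitude"] = attitude_index(r["items"])
    print(r["age_group"], r["attitude"])
```

Note that the new age-group variable is categorical while the averaged index is usually treated as interval level, which changes the analyses that apply to each.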

Categorical Data Analysis - Objectives

• Describing the sample distribution for the variable (e.g. gender)

– Frequencies, percentages, quartiles, percentiles, graphs (bar, line, histogram, pie)

• What are the typical characteristics of the sample?

– Mode

• Does the categorical variable bear any relationship to the distribution of another categorical variable (e.g. gender vs. whether the respondent buys the product)?

– Cross tabs and chi-square as a measure of association
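The tools listed for categorical data can be sketched with the standard library alone. The code below builds a frequency table, finds the mode, cross-tabulates two variables, and computes the Pearson chi-square statistic with its degrees of freedom. All names and the sample data are hypothetical; the statistic would still need to be compared with a critical value or converted to a p-value, which is left to a statistics package.

```python
from collections import Counter


def frequency_table(values):
    """Map each category to (count, percentage of the sample)."""
    counts = Counter(values)
    n = len(values)
    return {cat: (count, 100 * count / n) for cat, count in counts.items()}


def mode(values):
    """Most frequent category (ties resolved by first occurrence)."""
    return Counter(values).most_common(1)[0][0]


def crosstab(rows, cols):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(zip(rows, cols))


def chi_square(rows, cols):
    """Pearson chi-square statistic and degrees of freedom for a cross tab."""
    observed = crosstab(rows, cols)
    row_totals = Counter(rows)
    col_totals = Counter(cols)
    n = len(rows)
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            # Count expected in this cell if the two variables were unrelated
            expected = row_totals[r] * col_totals[c] / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical sample: gender and whether the respondent buys the product
gender = ["M", "M", "M", "M", "F", "F", "F", "F"]
buys = ["yes", "yes", "yes", "no", "yes", "no", "no", "no"]

print(frequency_table(gender))
print(crosstab(gender, buys))
print(chi_square(gender, buys))
```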

Rank order data analysis - Objectives

• What are respondent preferences amongst several competing alternatives? (e.g. rank your preferences amongst ten different brands of cars)

– Frequencies, Percentages, Graphs

• What is the typical preference pattern in the sample (e.g. which car does the sample prefer the most and which one the least?)

– Mode

• Are two sets of respondent preferences correlated? (e.g. wrist watch brand preferences with car brand preferences)

– Spearman’s rank correlation coefficient
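A minimal sketch of Spearman's rank correlation, assuming the data are already ranks and contain no ties, so the shortcut formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) applies. With tied ranks, the usual approach is to compute a Pearson correlation on average ranks instead. The function name and the sample rankings are hypothetical.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman's rank correlation for two tie-free rankings of the same items."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rankings of the same length (at least 2 items)")
    # d is the difference between the two ranks given to the same item
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Hypothetical rankings of five price tiers for watches and for cars
watch_ranks = [1, 2, 3, 4, 5]
car_ranks = [2, 1, 4, 3, 5]

print(spearman_rho(watch_ranks, car_ranks))
```

A value near +1 means the two preference orders largely agree, near -1 that they are reversed, and near 0 that they are unrelated.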

Interval level / Ratio level data analysis - Objectives

• What is the average response in the sample (e.g. what is the mean attitude to the brand?)

– Mean / Median

• What is the average variability of the response in the sample (e.g. on average, how dispersed are the sample’s attitudes to the brand from the mean?)

– Standard deviation
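These summary measures are available in Python's standard `statistics` module. The sketch below wraps them in a hypothetical `describe` helper and uses the sample standard deviation (divisor n - 1), which is the usual choice when the data are a sample from a larger population; the attitude scores are made up for the example.

```python
from statistics import mean, median, stdev


def describe(scores):
    """Central tendency and dispersion for interval or ratio level data."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "std_dev": stdev(scores),  # sample standard deviation (n - 1 divisor)
    }


# Hypothetical attitude-to-brand scores on a 1-10 scale
attitude_scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(describe(attitude_scores))
```

When the distribution is skewed or has outliers, the median is often a better description of the typical response than the mean.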

• Do two or more subgroups in the sample differ from each other on the response, or differ from a previously known or hypothesized value?

– E.g. do males like the brand significantly more than females? T tests, z tests

– E.g. does attitude to WU vary by student status (freshman, sophomore, junior, senior)? ANOVA
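To make the subgroup comparisons concrete, the sketch below computes the two test statistics by hand: a two-sample t statistic (the Welch form, which does not assume equal variances; the slides do not specify a variant) and the one-way ANOVA F statistic. Function names and data are hypothetical. Turning either statistic into a p-value needs the t or F distribution, which is left to statistical tables or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """t statistic for the difference between two group means (Welch form)."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def anova_f(groups):
    """One-way ANOVA F: between-group variance over within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical brand-liking scores by gender
males = [2, 4, 6]
females = [1, 3, 5]
print(welch_t(males, females))

# Hypothetical attitude scores for three student-status groups
print(anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

A t test compares two groups; ANOVA generalizes the comparison to three or more groups in a single test.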

• Are sample responses on two variables correlated? (e.g. are sales related to the advertising expenditure?)

– Pearson correlation

• Can we determine the value of the sample’s response on a variable, if we know the value on another variable? (e.g. if we need to achieve 1 million dollars in sales next year, how much should we spend on advertising?)

– Regression analysis
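A minimal sketch of both techniques, assuming a simple linear relationship between one predictor and one outcome. It computes the Pearson correlation, fits sales = intercept + slope * advertising by least squares, and then inverts the fitted line to estimate the advertising needed for a sales target, as in the slide's example. The function names and figures are hypothetical, and the inversion is only as trustworthy as the linear fit itself.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two variables."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return slope, intercept


def advertising_needed(target_sales, slope, intercept):
    """Invert the fitted line: advertising spend implied by a sales target."""
    if slope == 0:
        raise ValueError("sales do not vary with advertising in this fit")
    return (target_sales - intercept) / slope


# Hypothetical yearly figures, both in units of $100,000
advertising = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

slope, intercept = fit_line(advertising, sales)
print(pearson_r(advertising, sales))
print(slope, intercept)
print(advertising_needed(11, slope, intercept))
```

Correlation only says how strongly the two variables move together; regression adds the fitted equation needed to make a prediction.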