INTRODUCTION TO SPSS FOR STATISTICAL
ANALYSIS
ARTHUR MARQUES
MATTY JULLAMON
• Workshop materials
https://guides.library.ubc.ca/library_research_commons/rworkshop
• Save the example data in your “Desktop” folder so that it is easy to find
Set up
• Become familiar with the SPSS environment
• Learn how to prepare and manage data in SPSS
• Learn how to perform descriptive statistics and inferential statistics
using SPSS
Learning Objectives
Overview of quantitative research
Research purpose → Research design → Data collection → Statistical analysis → Conclusion
Statistical analysis:
- Manipulating data
- Analyzing data
Software programs: SPSS, R, STATA, SAS, Matlab, Mplus, HLM7, …
Software for statistical analysis
“User-friendly”
“Common & basic functions”
Might not be flexible for complex data & analysis
“Active online community”
SPSS environment
The SPSS environment is composed of 3 main windows:
• Data Editor window
(Data View + Variable View)
• Output window
• Syntax window
SPSS environment
• Data Editor - Data View: presents the whole data set
SPSS environment
• Data Editor - Variable View: presents information on all variables
SPSS environment
• Data Editor - Variable View: “Key” information of variables
SPSS environment
Name: variable name
Decimals: decimal points of data
E.g. 0 decimal points → 1
1 decimal point → 1.0
Label: description of variable
• Data Editor - Variable View: “Key” information of variables
SPSS environment
Values: meaning of values in variable
E.g. 0 = male, 1 = female
Missing: numbers indicating missing data
• Data Editor - Variable View: “Key” information of variables
SPSS environment
Measure: type of data in variable
- Scale = continuous data
- Nominal = categorical data
- Ordinal = ordinal data
For more information about the scale, nominal, and ordinal options, see:
https://stats.idre.ucla.edu/other/mult-pkg/whatstat/what-is-the-difference-between-categorical-ordinal-and-interval-variables/
• Drop-down menu in Data Editor:
SPSS environment
• Output: presents the history of your analysis and all outputs
SPSS environment
• Syntax: where you write “syntax” (commands)
SPSS environment
Data preparation in SPSS
• Cross-sectional design:
A set of variables measured from each person at one time point
• A set of variables:
- Gender (Male = 0, Female = 1)
- Age (range 10 – 80)
- Marital status (Married, common law = 1, Widow, divorce, separate = 2,
Single, never married = 3)
- Employment (no job = 1, part time = 2, full time = 3)
- Quality of life_total (range 0 – 20)
- Distress_total (range 0 – 20)
- Self-esteem items (range 0 – 3)
• Missing data are coded as 999
Data preparation with SPSS using example data
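Outside SPSS, the missing-data convention above can be sketched in a few lines of Python. This is only an illustration: the variable `age_raw`, its values, and the helper name `to_missing` are hypothetical; only the code 999 comes from the example data.

```python
MISSING_CODE = 999  # missing-data code used in the example data


def to_missing(values, missing_code=MISSING_CODE):
    """Replace the missing-data code with None so it can be skipped later."""
    return [None if v == missing_code else v for v in values]


# Hypothetical Age values, for illustration only
age_raw = [25, 999, 43, 67, 999, 18]
age = to_missing(age_raw)

# Keep only the valid (non-missing) cases
valid_age = [v for v in age if v is not None]
print(age)
print(len(valid_age), "valid cases")
```

In SPSS itself the same idea is handled by declaring 999 in the Missing column of Variable View rather than by editing the data.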
• Situation 1 - Open data in “SPSS format” (.sav):
• Situation 2 - Import data in “different formats” (Excel, text…):
File > Open > Data…
Data import/entry
• Enter your data in SPSS Data Editor – Data View:
Data import/entry
Checking information of variables in Variable View
E.g., Meaning of values
Click it
• Employment
o No job = 1
o Part time = 2
o Full time = 3
Variable View > Values > Add
Editing value label
Data management in SPSS
• Make modifications to your raw data
• Common data management tasks:
1. Merging the categories
2. Changing string to numeric data
3. Computing a new summary variable
Data management
…
• Recode function in SPSS
• Used for “Merging the categories” & “Changing string to numeric”
• Example
“Employment” with 3 categories is recoded into “Employment_new” with 2 categories:
category “no job” → “no job”
category “part time” → “having job”
category “full time” → “having job”
Data management
• Transform > Recode into different variables
Data management
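The logic of the recode can be sketched in Python as a simple lookup table. The old codes (1, 2, 3) come from the example data; the new numeric codes (0 = no job, 1 = having job), the sample values, and the function name are assumptions made for this illustration.

```python
# Old codes from the example data: no job = 1, part time = 2, full time = 3.
# New codes are an assumption for this sketch: 0 = no job, 1 = having job.
RECODE_MAP = {1: 0, 2: 1, 3: 1}
NEW_LABELS = {0: "no job", 1: "having job"}


def recode_employment(values):
    """Map each old code to its new code; unknown codes (e.g., 999) become None."""
    return [RECODE_MAP.get(v) for v in values]


# Hypothetical Employment values, for illustration only
employment = [1, 2, 3, 3, 999, 1]
employment_new = recode_employment(employment)
print(employment_new)
print([NEW_LABELS.get(v, "missing") for v in employment_new])
```

Writing the result to a new list mirrors "Recode into different variables": the original variable is left untouched.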
• Compute function in SPSS
• Used for “computing a new summary variable”
• Example
“Esteem_Q1” ~ “Esteem_Q10” are summed up into “Esteem_total”:
Esteem_Q1 + … + Esteem_Q10 → Total score of Esteem
Data management
• Transform > Compute Variable
Data management
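The sum score can be sketched in Python as follows. The item responses are hypothetical, and returning None when any item is missing is a design choice of this sketch, not a statement about what SPSS does in every case.

```python
def esteem_total(item_scores):
    """Sum the ten self-esteem items (each 0 to 3) into one total score.

    Returns None if any item is missing, so incomplete cases are not
    silently given a lower total.
    """
    if len(item_scores) != 10:
        raise ValueError("expected exactly 10 item scores")
    if any(score is None for score in item_scores):
        return None
    return sum(item_scores)


# Hypothetical responses for two people (Esteem_Q1 ... Esteem_Q10)
person_a = [3, 2, 1, 0, 3, 2, 1, 0, 3, 2]
person_b = [3, None, 1, 0, 3, 2, 1, 0, 3, 2]
print(esteem_total(person_a))
print(esteem_total(person_b))
```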
Descriptive statistics in SPSS
• Descriptive statistics provide a summary of your data
• Purpose of looking at descriptive statistics:
(1) Check whether valid data are loaded properly
E.g., unexpected values (e.g., 999, -2) in “Age” variable (range 10-80)
(2) Explore data
E.g., potential group differences, associations between variables
(3) Sample description
E.g., % of gender, mean and standard deviation of quality of life score
Descriptive statistics
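Purpose (1) amounts to a range check. A minimal Python sketch is shown here; the Age values are hypothetical (999 and -2 are the unexpected values mentioned above) and `out_of_range` is a helper name invented for this illustration.

```python
def out_of_range(values, low, high):
    """Return (position, value) pairs for values outside [low, high]."""
    return [
        (i, v)
        for i, v in enumerate(values)
        if v is not None and not (low <= v <= high)
    ]


# Hypothetical Age values; the valid range 10-80 comes from the example data
age = [34, 999, 52, -2, 80, 10]
problems = out_of_range(age, 10, 80)
print(problems)
```

In SPSS the same check is usually done by looking at the minimum and maximum in the Frequencies or Descriptives output.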
Descriptive statistics in SPSS:
Descriptive statistics
• Frequencies for “categorical data”
Descriptive statistics - Frequencies
• Frequencies for “categorical data” – Descriptive statistics
Descriptive statistics - Frequencies
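What a frequency table for categorical data contains (counts and valid percentages) can be sketched in Python. The Gender codes come from the example data; the sample values and the function name `frequency_table` are hypothetical.

```python
from collections import Counter

# Value labels from the example data
GENDER_LABELS = {0: "Male", 1: "Female"}


def frequency_table(values, labels):
    """Return {label: (count, valid percent)} using only values that have a label."""
    valid = [v for v in values if v in labels]
    n = len(valid)
    if n == 0:
        return {}
    counts = Counter(valid)
    return {
        labels[code]: (counts[code], round(100 * counts[code] / n, 1))
        for code in labels
    }


# Hypothetical Gender values; 999 is the missing-data code
gender = [0, 1, 1, 0, 1, 999, 1, 0]
table = frequency_table(gender, GENDER_LABELS)
for label, (count, percent) in table.items():
    print(label, count, percent)
```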
• Frequencies for “categorical data” - Bar plots
Descriptive statistics - Frequencies
[Figure: bar plots of frequency for Employment (No job, Part-time, Full-time), Marital status (Married, Widow, Single), and Gender (Male, Female)]
• Frequencies for “continuous data”
Descriptive statistics - Frequencies
• Frequencies for “continuous data” – Descriptive statistics
Descriptive statistics - Frequencies
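For continuous data, the usual summary is the number of valid cases, mean, standard deviation, minimum, and maximum. A minimal Python sketch using the standard library is given here; the quality of life scores and the function name `describe` are hypothetical.

```python
import statistics


def describe(values):
    """Summarise a continuous variable, skipping missing (None) entries."""
    valid = [v for v in values if v is not None]
    return {
        "n": len(valid),
        "mean": statistics.mean(valid),
        "sd": statistics.stdev(valid),  # sample SD (n - 1 in the denominator)
        "min": min(valid),
        "max": max(valid),
    }


# Hypothetical quality of life scores (range 0-20), one missing value
quality_of_life = [12, 15, None, 9, 18, 11, 14]
summary = describe(quality_of_life)
print(summary)
```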
• Frequencies for “continuous data” – Histograms
Descriptive statistics - Frequencies
[Figure: histograms of frequency by value for Age, Quality of life, and Distress]
• Scatter/Dot plots: Graphs > Legacy Dialogs > Scatter/Dot…
→ Useful to explore associations between variables
Descriptive statistics - Graphs
• Scatter plots: output
Descriptive statistics - Graphs
Inferential statistics in SPSS
Inferential statistics in SPSS
• Inferential statistics we are covering today…
For group comparisons:
• Independent sample T-test
• One-way ANOVA
For association:
• Pearson correlation
Independent sample T-test
• Independent T-test compares means between two groups
• It is often used to see whether a continuous outcome differs
between two groups (e.g., gender, treatment vs. control)
• Example
• Model assumptions
(1) Independence, (2) Normality, (3) Equal variance
[Figure: example scores 8 7 5 4 11 3 9 8 7 13 11 10 13 11 15 10 17 12, grouped into Males and Females]
Analyze > Compare Means > Independent-Samples T Test …
Independent sample T-test
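To show what the test computes, here is a minimal Python sketch of the equal-variance (pooled) t statistic. It uses the 18 example scores from the T-test example and assumes, without confirmation from the slide, that the first nine belong to males and the last nine to females. It does not compute a p-value, which SPSS reports in its output, and it is not the workshop data set.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t_stat, df


# Assumed split of the example scores (first nine = males, last nine = females)
males = [8, 7, 5, 4, 11, 3, 9, 8, 7]
females = [13, 11, 10, 13, 11, 15, 10, 17, 12]

t_stat, df = independent_t(males, females)
print(f"t({df}) = {t_stat:.3f}")
```

The sign of t depends only on which group is entered first; a negative value here means the first group has the lower mean.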
• Output: Test for equal variance assumption
• Conclusion:
The variances of the male and female groups are not significantly
different
Note. Given alpha level = 0.05
Independent sample T-test
• Output: Results of independent T-test
• Conclusion:
There was no statistically significant difference in level of quality of
life between males and females, t(198) = -1.738, p = 0.084.
Note. Given alpha level = 0.05
Independent sample T-test
Independent sample one-way ANOVA
• Independent sample one-way ANOVA compares means between
more than two groups
• It is often used to see whether a continuous outcome differs
between more than two groups
• Example
• Model assumptions
(1) Independence, (2) Normality, (3) Equal variance
[Figure: example scores 8 7 5 4 11 3 9 8 7 13 11 10 13 11 15 10 17 12, grouped into Married, Widow/Sep, and Single]
• Analyze > Compare Means > One-Way ANOVA …
Independent sample one-way ANOVA
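The omnibus F statistic can be sketched in Python from its definition (between-group mean square divided by within-group mean square). The sketch uses the 18 example scores from the ANOVA example and assumes, without confirmation from the slide, three groups of six in the order shown. No p-value or post hoc test is computed, and this is not the workshop data set.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [v for group in groups for v in group]
    grand_mean = statistics.mean(all_values)

    # Between-group sum of squares: group means around the grand mean
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-group sum of squares: scores around their own group mean
    ss_within = sum(
        (v - statistics.mean(group)) ** 2 for group in groups for v in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Assumed split of the example scores into three groups of six
married = [8, 7, 5, 4, 11, 3]
widow_sep = [9, 8, 7, 13, 11, 10]
single = [13, 11, 15, 10, 17, 12]

f_stat, df_b, df_w = one_way_anova([married, widow_sep, single])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```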
• Output: Test for equal variance assumption
• Conclusion:
The variances of the married, widow, and single groups are not significantly
different
Note. Given alpha level = 0.05
Independent sample one-way ANOVA
• Output: Overall group difference (omnibus test results)
• Conclusion:
There were statistically significant group differences in the level of quality
of life between marital status groups, F(2, 197) = 19.827, p < 0.001.
Note. Given alpha level = 0.05
Independent sample one-way ANOVA
• Output: Which groups differ? (post hoc test results)
• Conclusion:
The level of quality of life for the married group was significantly higher
than for the widow group (p < 0.001).
The single group showed a significantly higher level of quality of life than
the widow group (p < 0.001).
Independent sample one-way ANOVA
Pearson’s correlation
• Pearson’s correlation is used to examine associations between
variables (represented by continuous data) by looking at the
direction and strength of the associations
• Example
• Checking for outliers
→ “Graphs > Legacy Dialogs > Scatter/Dot…”
[Figure: pairwise associations among Distress, Quality of life, and Self-esteem]
• Analyze > Correlate > Bivariate
Pearson’s correlation
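The coefficient itself can be sketched in Python from its definition (the sum of cross-products divided by the square root of the product of the sums of squares). The distress and quality of life scores are hypothetical values chosen for illustration and `pearson_r` is a helper written for this sketch; no p-value is computed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical scores for six people, for illustration only
distress = [2, 5, 8, 11, 14, 17]
quality_of_life = [18, 15, 13, 9, 8, 3]

r = pearson_r(distress, quality_of_life)
print(f"r = {r:.3f}")
```

A value near -1 or +1 indicates a strong association and the sign gives the direction, which is how the coefficients in the SPSS correlation table are read.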
• Output:
• Conclusion:
There were statistically significant negative correlations between quality
of life and distress (r = -0.708, p < 0.001) and between self-esteem and
distress (r = -0.685, p < 0.001).
There was a statistically significant positive correlation between quality of
life and self-esteem (r = 0.660, p < 0.001).
Pearson’s correlation
EXERCISE
Does the level of distress significantly differ by employment group (no
job, part-time, full-time)?
• What statistical analysis should we use?
• What’s the DV?
• What’s the IV?
Ordinary least squares linear regression (for your reference)
• Ordinary least squares (OLS) linear regression is used to
explain/predict a phenomenon of interest (continuous data)
• Example
• Model assumptions
(1) Independence, (2) Normality, (3) Equal variance, (4) Linearity
Simple OLS regression:
IV 1 (Distress level) → Explain/Predict? → Dependent V (Quality of life)
Multiple OLS regression:
IV 1 (Distress level), IV 2 (Self-esteem), IV 3 (Gender) → Explain/Predict? → Dependent V (Quality of life)
• Analyze > Regression > Linear …
Ordinary least squares linear regression (for your reference)
Distress, Esteem_total, Gender, Age
• Output
• Conclusion:
Approximately 55% of the variability in quality of life was explained
by the variables in the regression model.
The overall regression model significantly explained the quality of life.
Ordinary least squares linear regression (for your reference)
• Output:
• Conclusion:
Distress and self-esteem significantly predicted the level of quality of
life.
We would expect a 0.416-point decrease in quality of life for every
one-point increase in distress, assuming all the other variables are held
constant.
Ordinary least squares linear regression (for your reference)
Y = 12.041 + (-0.416)(Distress) + 0.221(Esteem) + 0.564(Gender)
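The equation can be turned into a small Python function to see how the coefficients are used. The coefficients are copied from the equation above; the input values and the function name `predict_qol` are hypothetical, and Gender is assumed to be coded Male = 0, Female = 1 as in the example data.

```python
# Coefficients copied from the fitted equation above
COEF = {"intercept": 12.041, "distress": -0.416, "esteem": 0.221, "gender": 0.564}


def predict_qol(distress, esteem, gender):
    """Predicted quality of life for given predictor values."""
    return (
        COEF["intercept"]
        + COEF["distress"] * distress
        + COEF["esteem"] * esteem
        + COEF["gender"] * gender
    )


# Hypothetical person: distress = 10, self-esteem total = 20, female (1)
baseline = predict_qol(10, 20, 1)
one_more_distress = predict_qol(11, 20, 1)

print(round(baseline, 3))
# Difference equals the distress coefficient, other predictors held constant
print(round(one_more_distress - baseline, 3))
```

Raising distress by one point while holding the other predictors fixed changes the prediction by exactly the distress coefficient, which is the interpretation given in the conclusion above.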
• SPSS environment
• Data preparation in SPSS
• Data management in SPSS
• Descriptive statistics in SPSS
• Inferential statistics in SPSS
→ Try your own quantitative analysis in SPSS!
Summary
RESEARCH COMMONS: AN INTERDISCIPLINARY RESEARCH-DRIVEN LEARNING ENVIRONMENT
• Literature review
• Systematic Review Search Methodology
• Citation Management
• Thesis Formatting
• NVivo Software Support
• SPSS Software Support
• R Group
• Multi-Disciplinary Graduate Student Writing Group
SPSS SERVICES BY RESEARCH COMMONS
• Workshops
• One-on-one Consultation
Request form to book SPSS consultation:
http://bit.ly/UBCRCconsult
THANK YOU!
QUESTIONS, COMMENTS, IDEAS