INTRODUCTION TO SPSS FOR STATISTICAL ANALYSIS

ARTHUR MARQUES

MATTY JULLAMON

Set up

• Workshop materials:

https://guides.library.ubc.ca/library_research_commons/rworkshop

• Save the example data in the “Desktop” folder (so the data are easy to find)

Learning Objectives

• Become familiar with the SPSS environment

• Learn how to prepare and manage data in SPSS

• Learn how to perform descriptive statistics and inferential statistics using SPSS

Overview of quantitative research

Research purpose → Research design → Data collection → Statistical analysis → Conclusion

Statistical analysis:

- Manipulating data

- Analyzing data

Software programs: SPSS, R, STATA, SAS, Matlab, Mplus, HLM7, …


Software for statistical analysis

“User-friendly”

“Common & basic functions”

Might not be flexible for complex data & analysis

“Active online community”


SPSS environment


SPSS environment is composed of 3 main windows:

• Data Editor window

(Data View + Variable View)

• Output window

• Syntax window


• Data Editor - Data View: presents the whole data set

• Data Editor - Variable View: presents information about all variables

• Data Editor - Variable View: key information about variables

- Name: variable name

- Decimals: decimal places of the data (e.g., 0 decimal places → 1; 1 decimal place → 1.0)

- Label: description of the variable

- Values: meaning of the values in the variable (e.g., 0 = male, 1 = female)

- Missing: numbers indicating missing data

- Measure: type of data in the variable (Scale = continuous data, Nominal = categorical data, Ordinal = ordinal data)

For more information about the scale, nominal, and ordinal options, see

https://stats.idre.ucla.edu/other/mult-pkg/whatstat/what-is-the-difference-between-categorical-ordinal-and-interval-variables/

• Drop-down menu in Data Editor:

• Output window: presents the history of your analysis and all outputs

• Syntax window: where you write SPSS syntax

Data preparation in SPSS

Data preparation with SPSS using example data

• Cross-sectional design: a set of variables measured from each person at one time point

• A set of variables:

- Gender (Male = 0, Female = 1)

- Age (range 10 – 80)

- Marital status (Married, common law = 1; Widow, divorce, separate = 2; Single, never married = 3)

- Employment (No job = 1, Part time = 2, Full time = 3)

- Quality of life_total (range 0 – 20)

- Distress_total (range 0 – 20)

- Self-esteem items (range 0 – 3)

• Missing data are coded as 999
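Because 999 is only a code for missing data, it must never enter a calculation as if it were a real score. The idea can be sketched outside SPSS in Python. This is a minimal illustration: the function name `to_missing` and the ages below are hypothetical and are not the workshop data.

```python
MISSING_CODE = 999  # missing-data code stated in the variable list


def to_missing(values, missing_code=MISSING_CODE):
    """Replace the missing-data code with None so it is not treated as a score."""
    return [None if v == missing_code else v for v in values]


# Hypothetical ages for illustration only (not the workshop data)
ages = [34, 999, 52, 18, 999, 71]

clean_ages = to_missing(ages)
valid_ages = [a for a in clean_ages if a is not None]

# The mean uses the four valid ages only
mean_age = sum(valid_ages) / len(valid_ages)

print(clean_ages)  # [34, None, 52, 18, None, 71]
print(mean_age)    # 43.75
```

Leaving the two 999 values in place would pull the mean far above any plausible age, which is why the missing code has to be declared before analysis.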

Data import/entry

• Situation 1 - Open data in SPSS format (.sav):

• Situation 2 - Import data in a different format (Excel, text, …):

File > Open > Data…


• Enter your data in SPSS Data Editor – Data View:



Checking information of variables in Variable View

E.g., meaning of values (click the cell to view them)

Editing value label

• Employment

o No job = 1

o Part time = 2

o Full time = 3

Variable View > Values > Add

Data management in SPSS

• Make modifications to your raw data

• Common data management tasks:

1. Merging the categories

2. Changing string to numeric data

3. Computing a new summary variable

…

• Recode function in SPSS

• Used for “merging the categories” and “changing string to numeric data”

• Example: “Employment” with 3 categories is recoded into “Employment_new” with 2 categories

- category “no job” → “no job”

- category “part time” → “having job”

- category “full time” → “having job”

• Transform > Recode into Different Variables
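The recode example is a mapping from old codes to new codes. A minimal Python sketch of that mapping follows. The old codes (1, 2, 3) come from the variable list; the new codes (0 = no job, 1 = having job) are an assumption, since the slides do not state them. Leaving unmapped values such as 999 unchanged is also a choice made for this sketch.

```python
# Old codes from the variable list: 1 = no job, 2 = part time, 3 = full time.
# New codes are assumed for illustration: 0 = no job, 1 = having job.
RECODE_MAP = {1: 0, 2: 1, 3: 1}


def recode_employment(values, mapping=RECODE_MAP):
    """Return a new list of codes and leave the original list untouched."""
    # Values without a rule (e.g., the missing code 999) are passed through as-is.
    return [mapping.get(v, v) for v in values]


# Hypothetical values for illustration only (not the workshop data)
employment = [1, 2, 3, 3, 999, 1]
employment_new = recode_employment(employment)

print(employment_new)  # [0, 1, 1, 1, 999, 0]
```

Writing the result into a new variable, as “Recode into Different Variables” does, keeps the original three-category variable available for later checks.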

• Compute function in SPSS

• Used for “computing a new summary variable”

• Example: “Esteem_Q1” ~ “Esteem_Q10” are summed up into “Esteem_total”

Esteem_Q1 + … + Esteem_Q10 → total score of Esteem

• Transform > Compute Variable
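The summing step can be sketched in Python as follows. The item names Esteem_Q1 to Esteem_Q10 and the missing code 999 come from the slides; the function name, the example answers, and the decision to return no total when any item is missing are assumptions made for this sketch.

```python
MISSING_CODE = 999
ITEM_NAMES = [f"Esteem_Q{i}" for i in range(1, 11)]  # Esteem_Q1 ... Esteem_Q10


def esteem_total(row):
    """Sum the ten self-esteem items; return None if any item is missing."""
    scores = [row[name] for name in ITEM_NAMES]
    if any(score == MISSING_CODE for score in scores):
        return None  # assumption of this sketch: no total for incomplete answers
    return sum(scores)


# One hypothetical respondent (each item ranges from 0 to 3)
answers = [3, 2, 1, 0, 3, 2, 2, 1, 3, 2]
respondent = dict(zip(ITEM_NAMES, answers))

total = esteem_total(respondent)
print(total)  # 19

# The same respondent with one unanswered item
incomplete = dict(respondent, Esteem_Q4=MISSING_CODE)
print(esteem_total(incomplete))  # None
```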

Descriptive statistics in SPSS

• Descriptive statistics provide a summary of your data

• Purpose of looking at descriptive statistics:

(1) Check whether valid data are loaded properly

E.g., unexpected values (e.g., 999, -2) in the “Age” variable (range 10-80)

(2) Explore data

E.g., potential group differences, associations between variables

(3) Describe the sample

E.g., % of each gender, mean and standard deviation of quality of life score
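Purposes (1) and (3) can be sketched with Python's standard library. The valid range for Age (10 to 80) and the unexpected values 999 and -2 come from the slide; the remaining ages and the function name `out_of_range` are hypothetical.

```python
import statistics

AGE_MIN, AGE_MAX = 10, 80  # valid range for Age given in the slides


def out_of_range(values, low, high):
    """Return the values that fall outside the expected range."""
    return [v for v in values if not (low <= v <= high)]


# Hypothetical ages for illustration only (not the workshop data)
ages = [25, 999, 40, -2, 63, 18]

# (1) Check: which values cannot be real ages?
unexpected = out_of_range(ages, AGE_MIN, AGE_MAX)
print(unexpected)  # [999, -2]

# (3) Describe the sample using the valid values only
valid = [v for v in ages if AGE_MIN <= v <= AGE_MAX]
mean_age = statistics.mean(valid)
sd_age = statistics.stdev(valid)  # sample standard deviation (n - 1 denominator)

print(mean_age)  # 36.5
print(round(sd_age, 2))
```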

Descriptive statistics in SPSS:


Descriptive statistics - Frequencies

• Frequencies for “categorical data”

• Frequencies for “categorical data” – Descriptive statistics



• Frequencies for “categorical data” - Bar plots

[Bar plots of frequency for Employment (No job, Part-time, Full-time), Marital status (Married, Widow, Single), and Gender (Male, Female)]
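A frequency table for categorical data is a count and a percentage per category. A minimal Python sketch follows, using the gender coding from the variable list (0 = Male, 1 = Female); the data and the function name `frequency_table` are hypothetical.

```python
from collections import Counter

GENDER_LABELS = {0: "Male", 1: "Female"}  # coding from the variable list


def frequency_table(values, labels):
    """Map each category label to a (count, percent) pair."""
    counts = Counter(values)
    n = len(values)
    return {
        labels[code]: (counts[code], 100 * counts[code] / n)
        for code in sorted(counts)
    }


# Hypothetical gender codes for illustration only (not the workshop data)
gender = [0, 1, 1, 0, 1, 1, 0, 1]

table = frequency_table(gender, GENDER_LABELS)
for label, (count, percent) in table.items():
    print(f"{label}: n = {count} ({percent:.1f}%)")
# Male: n = 3 (37.5%)
# Female: n = 5 (62.5%)
```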

• Frequencies for “continuous data”



• Frequencies for “continuous data” – Descriptive statistics



• Frequencies for “continuous data” – Histograms

[Histograms of frequency against the values of “Age”, “Quality of life”, and “Distress”]

Descriptive statistics - Graphs

• Scatter/Dot plots: Graphs > Legacy Dialogs > Scatter/Dot…

→ Useful to explore associations between variables

• Scatter plots: output


Inferential statistics in SPSS


• Inferential statistics we are covering today…

For group comparisons:

• Independent sample T test

• One-way ANOVA

For association:

• Pearson correlation


Independent sample T-test

• The independent T-test compares means between two groups

• It is often used to see whether there is a difference in continuous data between two groups (e.g., gender, treatment vs. control)

• Example

[Figure: example scores for two groups, Males and Females: 8 7 5 4 11 3 9 8 7 13 11 10 13 11 15 10 17 12]

• Model assumptions

(1) Independence, (2) Normality, (3) Equal variance
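To show what the test computes, here is a minimal Python sketch of the t statistic under the equal variance assumption (pooled variance). The two small groups are hypothetical, and the sketch stops at t and its degrees of freedom: the p value needs the t distribution, which the standard library does not provide and which SPSS reports in its output.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical scores for illustration only (not the workshop data)
group_a = [8, 7, 5, 4, 11]
group_b = [13, 11, 10, 13, 15]

t_value, df = independent_t(group_a, group_b)
print(round(t_value, 3), df)  # -3.592 8
```

The degrees of freedom are the total sample size minus 2, so a result reported as t(198) implies 200 participants across the two groups.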

• Analyze > Compare Means > Independent-Samples T Test…

• Output: Test for equal variance assumption

• Conclusion:

The variances of the male group and the female group are not significantly different.

Note. Given alpha level = 0.05

• Output: Results of independent T-test

• Conclusion:

There was no statistically significant difference in the level of quality of life between males and females, t(198) = -1.738, p = 0.084.

Independent sample one-way ANOVA

• Independent sample one-way ANOVA compares means between more than two groups

• It is often used to see whether there are group differences in continuous data between more than two groups

• Example

[Figure: example scores for three groups, Married, Widow/Sep, and Single: 8 7 5 4 11 3 9 8 7 13 11 10 13 11 15 10 17 12]

• Model assumptions

(1) Independence, (2) Normality, (3) Equal variance
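The omnibus F statistic compares the variability between group means with the variability within groups. A minimal Python sketch follows; the three small groups are hypothetical, and the p value is left to SPSS because it requires the F distribution.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for group in groups for v in group]
    grand_mean = statistics.mean(all_values)

    # Between-groups sum of squares: group means around the grand mean
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-groups sum of squares: scores around their own group mean
    ss_within = sum(
        (v - statistics.mean(group)) ** 2 for group in groups for v in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical scores for three groups (not the workshop data)
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]

f_value, df_between, df_within = one_way_anova(groups)
print(f_value, df_between, df_within)  # 27.0 2 6
```

The degrees of freedom are the number of groups minus 1 and the total sample size minus the number of groups, so a result reported as F(2, 197) implies 3 groups and 200 participants.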

• Analyze > Compare Means > One-Way ANOVA …



• Output: Test for equal variance assumption

• Conclusion:

The variances of the married, widow, and single groups are not significantly different.

• Output: Overall group difference (omnibus test results)

• Conclusion:

There were statistically significant group differences in the level of quality of life between marital status groups, F(2, 197) = 19.827, p < 0.001.

• Output: Which groups differ? (post hoc test results)

• Conclusion:

The level of quality of life for the married group was significantly higher than for the widow group (p < 0.001).

The single group showed a significantly higher level of quality of life than the widow group (p < 0.001).

Pearson’s correlation

• Pearson’s correlation is used to examine associations between variables (represented by continuous data) by looking at the direction and strength of the associations

• Example

[Diagram: pairwise associations among Distress, Quality of life, and Self-esteem]

• Checking outliers

→ “Graphs > Legacy Dialogs > Scatter/Dot…”
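The sign of Pearson's r gives the direction of the association and its absolute value gives the strength. A minimal Python sketch of the calculation follows; the five pairs of scores are hypothetical and are not the workshop data.

```python
import math
import statistics


def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    # Sum of cross-products of deviations from the means
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores for illustration only (not the workshop data)
distress = [2, 4, 6, 8, 10]
quality_of_life = [18, 15, 14, 9, 7]

r = pearson_r(distress, quality_of_life)
print(round(r, 3))  # -0.983
```

Here r is negative and close to -1, so higher distress goes with lower quality of life in this toy data. A single outlier can change r considerably, which is why the scatter plot is checked first.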

• Analyze > Correlate > Bivariate



• Output:

• Conclusion:

There were statistically significant negative correlations between quality of life and distress (r = -0.708, p < 0.001) and between self-esteem and distress (r = -0.685, p < 0.001).

There was a statistically significant positive correlation between quality of life and self-esteem (r = 0.660, p < 0.001).

EXERCISE

Does the level of distress significantly differ by employment group (no job, part-time, full-time)?

• What statistical analysis should we use?

• What’s the DV?

• What’s the IV?

Ordinary least squares linear regression (for your reference)

• Ordinary least squares (OLS) linear regression is used to explain/predict the phenomenon of interest (continuous data)

• Example

[Diagram: Simple OLS regression: IV 1 (Distress level) explains/predicts the dependent variable (Quality of life). Multiple OLS regression: IV 1 (Distress level), IV 2 (Self-esteem), and IV 3 (Gender) explain/predict the dependent variable (Quality of life).]

• Model assumptions

(1) Independence, (2) Normality, (3) Equal variance, (4) Linearity

• Analyze > Regression > Linear …


[Variables shown in the dialog: Distress, Esteem_total, Gender, Age]

• Output

• Conclusion:

Approximately 55% of the variability in quality of life was explained by the variables in the regression model.

The overall regression model significantly explained quality of life.

• Output:

• Conclusion:

Distress and self-esteem significantly predicted the level of quality of life.

We would expect a 0.416-point decrease in quality of life for every one-point increase in distress, assuming all the other variables are held constant.

Y = 12.041 + (-0.416)(Distress) + 0.221(Esteem) + 0.564(Gender)
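The fitted equation can be used to calculate a predicted quality of life score. The Python sketch below plugs values into the coefficients reported above; the function name `predict_quality_of_life` and the example respondent are hypothetical, and Gender uses the coding from the variable list (Male = 0, Female = 1).

```python
# Coefficients from the regression output reported in the slides
INTERCEPT = 12.041
B_DISTRESS = -0.416
B_ESTEEM = 0.221
B_GENDER = 0.564


def predict_quality_of_life(distress, esteem, gender):
    """Predicted quality of life from the fitted equation (gender: 0 = male, 1 = female)."""
    return INTERCEPT + B_DISTRESS * distress + B_ESTEEM * esteem + B_GENDER * gender


# Hypothetical respondent: distress = 10, self-esteem total = 15, female
predicted = predict_quality_of_life(distress=10, esteem=15, gender=1)
print(round(predicted, 3))  # 11.76

# One more point of distress, everything else held constant
change = predict_quality_of_life(11, 15, 1) - predicted
print(round(change, 3))  # -0.416
```

The second calculation reproduces the interpretation of the distress coefficient: the prediction drops by 0.416 points when distress rises by one point and the other variables stay the same.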

Summary

• SPSS environment

• Data preparation in SPSS

• Data management in SPSS

• Descriptive statistics in SPSS

• Inferential statistics in SPSS

→ Try your own quantitative analysis in SPSS!

RESEARCH COMMONS: AN INTERDISCIPLINARY RESEARCH-DRIVEN LEARNING ENVIRONMENT

• Literature review

• Systematic Review Search Methodology

• Citation Management

• Thesis Formatting

• Nvivo Software Support

• SPSS Software Support

• R Group

• Multi-Disciplinary Graduate Student Writing Group


SPSS SERVICES BY RESEARCH COMMONS

• Workshops

• One-on-one Consultation

Request form to book SPSS consultation:

http://bit.ly/UBCRCconsult


THANK YOU!

QUESTIONS, COMMENTS, IDEAS

FEEDBACK

Feedback form:

https://ubc.ca1.qualtrics.com/jfe/form/SV_dj4tgxrxj9JEO9f
