6
Linear Regression in SPSS This example shows you how to create a basic regression model in SPSS, with one predictor variable and one criterion variable. Both variables have to be measured on an interval- or ratio-level scale. In this example, we will use demographic data on various countries to predict infant mortality (“babymort”) from the average number of children born per family (“fertilty”). Here are the data:

How to perform linear regression

Embed Size (px)

Citation preview

Page 1: How to perform linear regression

Linear Regression in SPSS This example shows you how to create a basic regression model in SPSS, with one

predictor variable and one criterion variable. Both variables have to be measured on an

interval- or ratio-level scale. In this example, we will use demographic data on various

countries to predict infant mortality (“babymort”) from the average number of children

born per family (“fertilty”). Here are the data:

Page 2: How to perform linear regression

The regression procedure is found in the “Analyze” menu, under “Regression,” then by

selecting “Linear” in the Regression sub-menu.

The following dialog box will appear:

From the left-hand list, move the variable that you are trying to predict into the top box

(labeled “dependent”). Then move the variable that you are using as a predictor into the

next box (labeled “independent”). Then hit the “OK” button to continue.

Page 3: How to perform linear regression

The output has several sections:

Model Summary

Model R R Square Adjusted R Square

Std. Error of the Estimate

1 .833(a) .694 .691 21.2040

a Predictors: (Constant), Fertility: average number of kids

This section shows you the correlation between the two variables (R).

ANOVAb

107090.4 1 107090.351 238.185 .000a

47209.142 105 449.611

154299.5 106

Regression

Residual

Total

Model

1

Sum of

Squares df Mean Square F Sig.

Predictors: (Constant), Fertility: average number of kidsa.

Dependent Variable: Infant mortality (deaths per 1000 live births)b.

This section shows you the p-value (“sig” for “significance”) of the predictor’s effect on

the criterion variable. P-values less than .05 are generally considered “statistically

significant.”

Coefficientsa

-16.592 4.368 -3.798 .000

16.707 1.083 .833 15.433 .000

(Constant)

Fertility: average

number of kids

Model

1

B Std. Error

Unstandardized

Coefficients

Beta

Standardized

Coefficients

t Sig.

Dependent Variable: Infant mortality (deaths per 1000 live births)a.

This section shows you the beta coefficients for the actual regression equation. Usually,

you want the “unstandardized coefficients,” because this section includes a y-intercept

term (beta zero) as well as a slope term (beta one). The “standardized coefficients” are

based on a re-scaling of the variables so that the y-intercept is equal to zero.

Page 4: How to perform linear regression

One other thing that you may wish to do in a regression problem is to see the graph of the

regression line. The best way to do this is by using the “interactive scatterplot” in the

“Graphs” menu:

The following dialog box will appear. Go through the tabs one by one to define your

graph:

Page 5: How to perform linear regression

On the first tab (“assign variables”), drag and drop your predictor (x-axis) and criterion

(y-axis) variables into the appropriate places on the right-hand side. These boxes aren’t

labeled, but they are on the lines that represent the graph:

On the second tab (“fit”), choose “regression” from the drop-down menu. The default

choices here are fine, or you can also click the check-box for “prediction lines – mean.”

This will give you an error range around the regression line on your graph. If you don’t

want these extra lines, don’t click this checkbox.

The third tab (“spikes”) shouldn’t apply to a smooth regression line. So skip to the fourth

tab (“titles”), where you can enter a title for your graph:

y-axis: the thing you want

to predict goes here

x-axis: the predictor

variable goes here

Page 6: How to perform linear regression

On the final tab (“options”), you can select a visual look for your graph. I chose the

“marina” look – you can play with the other options to see which ones you like. For a

basic, scientific-looking graph, just use the <Default> setting. For presentations or

PowerPoint shows, you can dress up your graphs a little bit like this one.

Here’s the SPSS output:

Linear Regression with95.00% Mean Prediction Interval

ê ê ê êê ê ê ê ê ê ê ê ê ê ê ê ê ê ê

2.0 4.0 6.0 8.0

Fertility: average number of kids

ê

ê

ê

ê

ê

ê

ê

ê

ê

ê

ê

ê

ê

ê

ê

ê

ê

ê

0.0

50.0

100.0

150.0

Infant mortality (deaths per 1000 live births)

W

W W

WW

W

W

W

WW

W

W

W

W

W

W

WW

W

W

W

W

W

W

WWWWW

W

W

W

W

W

W

WW

W

W

W

WW

W

W

W

WW

W

W

WW

W

W WWW

W

W

WW

W

W

W

WW

W

W

W

W W

W

W

W

W

W

WW

WW

WW

WW

W

W

W

W

W

W

W

W WW

W

W

W

W

W

WW

W

WW

W

W

W

W

Infant mortality (deaths per 1000 live births) = -16.59 + 16.71 * fertilty

R-Square = 0.69

Regression Line

Double-click on the graph in the SPSS output to make changes (like moving the equation

for the regression line off to one side, so that it isn’t sitting right on top of the graph).