
GISPopSci

Day 1

Spatial Regression Modeling

Paul Voss & Katherine Curtis

The Center for Spatially Integrated Social Science
Santa Barbara, CA
July 12-17, 2009

Objective

Provide a solid introduction to and overview of the concepts and techniques of spatial regression analysis, with plenty of hands-on experience in the afternoon lab sessions

What’s the point?

Data that are referenced to location bring valuable additional information to a data analysis

But they also bring some (possibly unfamiliar) pitfalls that require a new awareness as that analysis proceeds

Plan for the week (1)

• Today: Understanding Spatial Data
  – Broad overview of spatial data, spatial data analysis and core spatial concepts. Why “spatial is special”
  – Classical linear regression model
    • assumptions underlying OLS
    • consequences of violations of assumptions
    • why spatial processes violate OLS assumptions
  – Introduction to EDA & ESDA
  – Lab: Shapefiles & introduction to GeoDa & EDA using R
• Tuesday: Diagnosing spatial autocorrelation
  – Understanding & measuring global spatial autocorrelation
  – Weights matrices
  – Understanding & measuring local spatial autocorrelation
    • Moran scatterplot
    • LISA statistics
  – Lab: Global & local measures of spatial autocorrelation


Plan for the week (2)

• Wednesday: Common modeling strategies for spatial processes
  – Understanding spatial processes
    • spatial heterogeneity
    • spatial dependence
  – Spatial regression models
    • OLS in GeoDa
    • understanding GeoDa regression diagnostics
    • spatial lag model; spatial error model
  – No Lab. Free time in Santa Barbara!
• Thursday: Focus on spatial heterogeneity
  – Spatial heterogeneity in relationships
  – Introduction to GWR
    • GWR theory and approach
    • GWR software
  – Lab: GWR hands-on using R


Plan for the week (3)

• Friday: Alternative spatial processes
  – Reminder that there are some spatial data analysis approaches that have not been covered in this class
    • point pattern analysis
    • geostatistical methods; kriging
  – Exciting things on the horizon
  – Resources for pursuing these topics intensively
  – Lab:
    • Research presentations
    • Open forum

Plan for today

• A good motivational example
• Why “spatial is special”
  – characteristics of spatial data
  – problems caused by spatial data
• Spatial analysis vs. spatial data analysis
• Broad, gentle overview of spatial data and spatial data analysis
• Classes of problems in spatial data analysis
• Review OLS assumptions & violations
• Exploratory Data Analysis
• Introduction to afternoon lab

Any questions as we get started?


Some beginning facts

• Regression is the workhorse of quantitative social science
• Much social science data is spatially referenced
• Spatially referenced data bring special problems to an analysis
  – heterogeneity of observational units → heteroskedasticity
  – spatial autocorrelation → residual dependence
• A consequence of these “special problems” is that the assumption of iid errors in a standard OLS regression specification is violated, and statistical inference from such a model is not valid

Motivation

• Omer R. Galle, Walter R. Gove, & J. Miller McPherson. 1972. “Population Density and Pathology: What Are the Relations for Man?” Science 176(4030):23-30
  – data: 75 community areas in Chicago for 1960
  – 5 measures of “social pathology” as a function of crowding, controlling for social class & ethnicity
  – “…the greater the density, the greater the fertility” (p. 176)
• Colin Loftin & Sally K. Ward. 1983. “A Spatial Autocorrelation Model of the Effects of Population Density on Fertility” American Sociological Review 48(1):121-128
  – “…the GGM findings with regard to fertility are an artifact of the failure to recognize the presence of disturbance variables which are spatially autocorrelated” (p. 127)
• Moral: When analyzing spatially referenced data, it’s highly useful to know something about the rudiments of spatial data analysis (i.e., some understanding of why “spatial is special”)

Okay… So why is spatial special?

• Scale dependency
  – Robinson (ASR, 1950) → “Ecological Fallacy”
  – “The relationship between ecological and individual correlations which is discussed in this paper provides a definite answer as to whether ecological correlations can validly be used as substitutes for individual correlations. They cannot.” (p. 357)
  – MAUP (Modifiable Areal Unit Problem)
  – “Habitual users of ecological correlations know that the size of the coefficient depends to a marked degree upon the number of sub-areas. …[T]he size of the ecological correlation [will increase numerically as consolidation of smaller areas into larger areas takes place].” (Robinson, pp. 357-8)
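Robinson's point can be illustrated with a small simulation. The sketch below is a hypothetical illustration on synthetic data, not Robinson's data: each area has a shared effect, and individuals within an area add large independent noise to x and to y. All names and parameter values (50 areas, 100 individuals per area, the noise scales, the seed) are assumptions chosen for the example.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate(n_areas=50, per_area=100, seed=1):
    """Return (individual-level r, area-level r) for synthetic data.

    Hypothetical data-generating process: area j has a shared effect a_j;
    every individual adds large independent noise to x and to y.
    """
    rng = random.Random(seed)
    ind_x, ind_y, area_x, area_y = [], [], [], []
    for _ in range(n_areas):
        a = rng.gauss(0, 1)                      # effect shared within the area
        xs = [a + rng.gauss(0, 3) for _ in range(per_area)]
        ys = [a + rng.gauss(0, 3) for _ in range(per_area)]
        ind_x += xs
        ind_y += ys
        area_x.append(sum(xs) / per_area)        # aggregate to area means
        area_y.append(sum(ys) / per_area)
    return pearson(ind_x, ind_y), pearson(area_x, area_y)

r_individual, r_ecological = simulate()
print(f"individual-level r        = {r_individual:.2f}")
print(f"area-level (ecological) r = {r_ecological:.2f}")
```

Averaging within areas cancels most of the individual noise but leaves the shared area effect, so under these assumptions the correlation of area means is much stronger than the correlation among individuals.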


Why is spatial special? (2)

• Observational areas are generally of different size
  – Spatial heterogeneity → error heteroskedasticity

Counties in U.S. South: 2000 Census

n = 1,387

Brewster Co. TX: Area: 6,193 mi²

Fairfax City, VA: Area: 6.3 mi²

Counties in U.S. South: 2000 Census

n = 1,387

Harris Co. TX: Pop: 3,400,578

Loving Co. TX: Pop: 67


• Observational areas are generally of different size (geographic size; population size)
  – Heteroskedasticity in regression errors
• Neighboring areas are similar
  – Tobler’s 1st law of Geography: “Everything is related to everything else, but near things are more related than distant things.” (1970:236)
  – (positive) spatial autocorrelation
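One way to see why unequal population sizes lead to heteroskedasticity: when the attribute is a rate computed from individuals, its sampling variability shrinks as population grows. The sketch below assumes a simple binomial model and a hypothetical common rate of 0.20. Both are simplifying assumptions for illustration. The two populations are the 2000 Census figures for Loving and Harris counties, TX, quoted earlier.

```python
def proportion_se(p, n):
    """Binomial standard error of an observed proportion: sqrt(p(1-p)/n)."""
    return (p * (1 - p) / n) ** 0.5

# County populations from the 2000 Census slide; p is a hypothetical rate.
populations = {"Loving Co. TX": 67, "Harris Co. TX": 3_400_578}
p = 0.20

for name, n in populations.items():
    print(f"{name}: n = {n:>9,}  SE of observed rate = {proportion_se(p, n):.5f}")
```

Under these assumptions the observed rate for the smallest county is far noisier than for the largest, so a regression that treats every county as one equally reliable observation has errors with unequal variances.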


Proportion of Children (under age 18) in Poverty: 2000 Census

Source: SF3, Table P87

Loudoun Co. VA: 0.028

Starr Co. TX: 0.595



• Observational areas are generally of different size
  – Heteroskedasticity in regression errors
• Neighboring areas are similar
  – Tobler’s 1st law of Geography: “Everything is related to everything else, but near things are more related than distant things.” (1970:236)
  – (positive) spatial autocorrelation
• Probable stumbling blocks when modeling the data
  – Again… the assumption of iid errors in a standard OLS regression specification is violated and statistical inference is not valid
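Tobler's law can be made concrete with a measure of spatial autocorrelation. The following minimal sketch computes global Moran's I on a small regular grid, assuming rook contiguity (cells sharing an edge are neighbors) and binary weights. The grid, the two test patterns and the function names are assumptions made for illustration; weights matrices and global measures are Tuesday's topic.

```python
def rook_neighbors(rows, cols):
    """Map each cell index to the indices of cells sharing an edge with it."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            nbrs[r * cols + c] = [
                rr * cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
    return nbrs

def morans_i(values, nbrs):
    """Global Moran's I with binary (0/1) contiguity weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                   # deviations from the mean
    s0 = sum(len(js) for js in nbrs.values())        # sum of all weights
    cross = sum(z[i] * z[j] for i, js in nbrs.items() for j in js)
    return (n / s0) * cross / sum(zi * zi for zi in z)

rows = cols = 4
nbrs = rook_neighbors(rows, cols)
gradient = [r + c for r in range(rows) for c in range(cols)]            # smooth trend
checkerboard = [(r + c) % 2 for r in range(rows) for c in range(cols)]  # alternating

print("gradient:    ", round(morans_i(gradient, nbrs), 3))
print("checkerboard:", round(morans_i(checkerboard, nbrs), 3))
```

The smooth gradient, where neighbors are alike, yields a positive value; the checkerboard, where every neighbor differs, yields a strongly negative one.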


Spatial versus Non-Spatial Data Analysis

Take these two maps, for example

Any traditional data analysis that does not utilize the location & spatial arrangement (topological information) of the data will lead to identical results for the two maps

And why is this important?

“What makes the methods of modern [spatial data analysis] different from many of their predecessors is that they have been developed with the recognition that spatial data have unique properties and that these properties make the use of methods borrowed from aspatial disciplines highly questionable”

Fotheringham, Brunsdon & Charlton, Quantitative Geography: Perspectives on Spatial Data Analysis, Sage, 2000, p. xii

Because of these unique properties, if we blithely carry out an OLS regression using aggregated geographic data, some large subset of the following undesirable horrors almost certainly awaits us (the curse of Tobler’s 1st Law):

• Our estimated regression coefficients are biased and inconsistent, or…
• Our estimated regression coefficients are inefficient
• Our R² statistic is exaggerated
• We’ve made incorrect inferences
• We’ll never get it published
  – or shouldn’t!

Given these problems, why would anyone bother to analyze spatial data?

• There’s lots of it!
• Occasional need for non-disclosure of individual-level data
• Space is important
  – Space as a means of organizing human activities
  – Location as a means of integrating interesting data
• Some interesting questions can (only) be examined with spatial data

Some interesting questions that might be addressed using modern spatial analysis:

• Local tax rates (“spillover” in y?)
• Expenditures for police (“spillover” in x?)
• Demographic analysis: Are Quitman & Tallahatchie counties (two contiguous counties in the Mississippi Delta) really two separate observations? (“spillover” in ε?)

Now, Let’s Define Some Terms


What exactly are “spatial data”?

…data where, in addition to attribute values relating to the primary phenomena of interest, the relative spatial locations of observations are also recorded

And what is spatial regression analysis?

Regression using spatial data where explicit attention is given to location and arrangement of geographic units

Even if we don’t really care about spatial processes

Spatial Analysis versus Spatial Data Analysis

GIS

Spatial Analysis

• P-median problems
• Maximal covering problem
• Location set covering problem
• Traveling salesman problem (TSP)

[Diagram relating GIS, Spatial Analysis, Spatial Data Analysis and “Spatial Statistics”]

[Diagram, continued: GIS, Spatial Analysis and “Spatial Statistics”, with Event Data, Geostatistical Data and Lattice Data; Spatial Econometrics; Spatial Regression Analysis]

Types of Spatial Data

• Event data (point data)
• Spatially continuous data (geostatistical data)
• Lattice data (regionalized data)
• Spatial interaction data (flow data)

Our focus of attention in this workshop will primarily be on Lattice Data

On Friday we’ll briefly touch on the other kinds of spatial data and other spatial data analytic approaches

Lattice Data

• Have one or more variables whose values are measured over a set of areas
• Interest focuses on the attribute values, not on the locations, which are known and unchanging
• Objectives:
  – Detecting patterns in the spatial arrangement of attribute values
  – Examining the relationships among the set of variables, taking into account any spatial effects present
• Approach:
  – Exploratory spatial data analysis
  – Confirmatory spatial data analysis (spatial regression)

Some early cautions:

• Our goal is to correctly model and draw proper inferences about an unobserved, random data-generating process (DGP; a random field)
  – spatial process?
    • spatial heterogeneity?
    • spatial dependence?
    • time?
  – sampling perspective?
• Spatial autocorrelation
• Scale issues; scale dependency; aggregation bias; boundary issues
• The tools are pretty good, but along the way many subjective decisions are made
  – defining “neighborhood”
  – choosing a weights matrix

Before taking a closer look at the meanings of Spatial Autocorrelation, Spatial Heterogeneity and Spatial Dependence, let’s take a closer look at the traditional OLS regression model

Standard OLS Regression

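$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + \varepsilon_i$$

In matrix notation:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$$

with dimensions y: (n x 1), X: (n x k+1), β: (k+1 x 1), ε: (n x 1)

and where:

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$

$$\hat{\sigma}^2 = \left[\frac{1}{n-k-1}\right]\mathbf{e}^T\mathbf{e}$$

For those of you who glaze over at the sight of that, the next 3 slides are for you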

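How do we get from this

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + \varepsilon_i$$

to this

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$$

???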

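Assume n = 6 and we have two independent variables x1 and x2

$$
\begin{aligned}
y_1 &= \beta_0 + \beta_1 x_{11} + \beta_2 x_{12} + \varepsilon_1 \\
y_2 &= \beta_0 + \beta_1 x_{21} + \beta_2 x_{22} + \varepsilon_2 \\
y_3 &= \beta_0 + \beta_1 x_{31} + \beta_2 x_{32} + \varepsilon_3 \\
y_4 &= \beta_0 + \beta_1 x_{41} + \beta_2 x_{42} + \varepsilon_4 \\
y_5 &= \beta_0 + \beta_1 x_{51} + \beta_2 x_{52} + \varepsilon_5 \\
y_6 &= \beta_0 + \beta_1 x_{61} + \beta_2 x_{62} + \varepsilon_6
\end{aligned}
$$

$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ is shorthand for the above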

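We get to the shorthand matrix algebra expression by realizing the following:

$$
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \end{bmatrix},\quad
\mathbf{X} = \begin{bmatrix}
1 & x_{11} & x_{12} \\
1 & x_{21} & x_{22} \\
1 & x_{31} & x_{32} \\
1 & x_{41} & x_{42} \\
1 & x_{51} & x_{52} \\
1 & x_{61} & x_{62}
\end{bmatrix},\quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix},\quad
\boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \end{bmatrix}
$$

and

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$$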
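For readers who prefer code to matrix notation, here is a minimal sketch of the n = 6, two-variable case in plain Python. The data values are hypothetical, and the helper functions (`transpose`, `matmul`, `solve`) are written here for illustration only. Rather than forming the inverse of X'X explicitly, the sketch solves the equivalent normal equations (X'X)b = X'y.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

# Hypothetical data: n = 6 observations, k = 2 independent variables.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

X = [[1.0, a, b] for a, b in zip(x1, x2)]   # n x (k+1); first column is the constant
Xt = transpose(X)
XtX = matmul(Xt, X)                          # (k+1) x (k+1)
Xty = [sum(c * v for c, v in zip(col, y)) for col in Xt]
beta_hat = solve(XtX, Xty)                   # solves (X'X) b = X'y

# Residuals e = y - X b, and the variance estimate e'e / (n - k - 1).
residuals = [yi - sum(b * x for b, x in zip(beta_hat, row)) for yi, row in zip(y, X)]
n, k = len(y), 2
sigma2_hat = sum(e * e for e in residuals) / (n - k - 1)

print("beta_hat   =", [round(b, 4) for b in beta_hat])
print("sigma2_hat =", round(sigma2_hat, 4))
```

A useful check on any such computation is that the residuals are orthogonal to every column of X, which is exactly what the normal equations require.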


Okay… but what about the assumptions underlying the OLS regression model?

We must place some conditions both on the population and on the data to establish unbiasedness, consistency and efficiency. These conditions are embodied in the Gauss-Markov Theorem

The Gauss-Markov Theorem asserts that $\hat{\boldsymbol{\beta}}$ is a “Best Linear Unbiased Estimator” (BLUE) of $\boldsymbol{\beta}$, provided the following assumptions are met:

• Linearity
• Mean independence: E[εi | xi] = 0 (implies E[ε] = 0)
• Homoskedasticity and uncorrelated disturbances: Cov[ε] = E[ε ε’] = σ² I
• X is of rank k+1 (k = no. of “independent” vars.)
• X is non-stochastic (or stochastic with finite second moments, and E[X’ε] = 0 for unbiasedness)
• Normal disturbances (not needed for the BLUE result itself; needed for exact small-sample inference)

To discuss these assumptions we need a few words to describe the performance of estimates

• Bias. An estimation method is unbiased if it produces estimates that have a statistical expectation equal to the true (population) value
• Consistency. Estimates converge toward the quantity being estimated as the sample size increases
• Efficiency. Efficient estimates are those that have smaller standard errors than estimates produced by some competing estimator
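Unbiasedness and consistency can be seen in a small Monte Carlo experiment. The sketch below is an illustration under assumed conditions: a simple regression with hypothetical true values β0 = 1 and β1 = 2, fixed x values, and iid N(0, 1) errors, so the OLS assumptions hold by construction. The function names, seed and number of replications are arbitrary choices.

```python
import random
import statistics

def ols_slope(xs, ys):
    """Slope of a simple OLS regression of ys on xs (with intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def slope_draws(n, reps=500, beta0=1.0, beta1=2.0, seed=42):
    """Re-estimate the slope on `reps` samples of size n with iid N(0, 1) errors."""
    rng = random.Random(seed)
    xs = [i / (n - 1) for i in range(n)]          # fixed (non-stochastic) X
    draws = []
    for _ in range(reps):
        ys = [beta0 + beta1 * x + rng.gauss(0, 1) for x in xs]
        draws.append(ols_slope(xs, ys))
    return draws

for n in (20, 200):
    d = slope_draws(n)
    print(f"n = {n:3d}: mean of estimates = {statistics.fmean(d):.3f}, "
          f"sd of estimates = {statistics.stdev(d):.3f}")
```

The average of the estimates sitting near the true slope at both sample sizes reflects unbiasedness; the spread shrinking as n grows reflects consistency.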


Regarding the OLS assumptions

• Linearity and mean independence support unbiased estimates

• Homoskedasticity and uncorrelated disturbances support efficiency

• Normality of disturbances means we can do statistical inference using t tables

• Normality also means we can estimate the model by MLE

It is partly with these OLS assumptions in mind that we stress the need to know our data

EDA / ESDA

We engage in EDA (ESDA) to determine how our data might lead to violations of one or more of the assumptions underlying our model regarding the unknown DGP

For example…

• Are we starting out by maximizing the probability of obtaining a normal error structure?
  – Why do we care about this?
  – How can we check for this?
  – What can we do about it?
• Do we have good linear relationships between our dependent variable and independent variables?
  – Why do we care about this?
  – How can we check for this?
  – What can we do about it?
• Should any of our variables be transformed?
  – Do you know how to proceed?
• Do we have any outliers?
  – What kind of outliers?
  – What options are available to us?
• Fortunately almost everything we’ll want to do can be done within GeoDa™. And what we do in GeoDa, we’ll try to replicate with R
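As one concrete instance of "should any of our variables be transformed?", the sketch below compares the skewness of a right-skewed variable before and after a log transformation. It is a hypothetical illustration: the variable is simulated from a lognormal distribution, and the sample size and seed are arbitrary. It is written in plain Python rather than GeoDa or R.

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(0)
# Hypothetical right-skewed, strictly positive variable.
raw = [rng.lognormvariate(0, 1) for _ in range(1000)]
logged = [math.log(v) for v in raw]

print(f"skewness before log: {skewness(raw):.2f}")
print(f"skewness after log:  {skewness(logged):.2f}")
```

A skewness near zero after transformation suggests a more symmetric variable. Whether a log is appropriate for a real variable depends on its range (it must be strictly positive) and on how the transformed coefficient will be interpreted.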


http://geodacenter.asu.edu/

GeoDa™ is a trademark of Luc Anselin & the University of Illinois

Exploring Spatial Data

• Goal is to seek good understanding and description of the data, thus suggesting hypotheses to explore

• Not much emphasis here on p-values, which are so ubiquitous in most of our training & our statistical instincts

• Look especially for clues to “spatial heterogeneity” or “spatial dependence”

• Few a priori assumptions about the data

• Robust (“resistant”) methods

• Analysis may sometimes end here

• EDA/ESDA: Despite good tools, at heart EDA is a philosophy; an attitude; “best practice” way of thinking about your data; a way of staying out of trouble


EDA / ESDA Tools

• Maps
• Descriptive statistics
• Plots and graphics
• Classification and clustering methods
• Software with dynamically linked objects
• Global and local spatial autocorrelation
• Look for evidence of “Tobler’s First Law”

Maps? Sure, but what kind?

[Figure: four maps of PPOV: Quantile Map (9), Percentile Map, Box Map (1.5), Std. Dev. Map]

So… some answers are:
• What are you trying to discover with the map?
• What are you wishing to show with the map?
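The "(1.5)" in the box map refers to the hinge: values more than 1.5 interquartile ranges beyond the quartiles are flagged as outliers. Below is a minimal sketch of that classification, assuming the usual six box-map classes. The quartile convention used here (Python's "inclusive" method) is an assumption and may differ slightly from the one GeoDa uses, and the ten attribute values are hypothetical.

```python
import statistics

def box_map_classes(values, hinge=1.5):
    """Assign each value to one of six box-map classes."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - hinge * iqr, q3 + hinge * iqr

    def classify(v):
        if v < lower_fence:
            return "lower outlier"
        if v < q1:
            return "< 25%"
        if v < q2:
            return "25% - 50%"
        if v < q3:
            return "50% - 75%"
        if v <= upper_fence:
            return "> 75%"
        return "upper outlier"

    return [classify(v) for v in values]

# Hypothetical attribute values for ten areas; one lies far above the rest.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
for v, label in zip(values, box_map_classes(values)):
    print(f"{v:>4}  {label}")
```

Unlike a quantile map, which always puts the same number of areas in each class, this classification isolates extreme values in their own classes, which is why it is useful for spotting outliers.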


Descriptive Statistics


Plots and Graphs


Plots and Graphs… (cont.)

Checks for normality and symmetry

Plots and Graphs… (cont.)

Checks for linearity & outliers

Tomorrow we’ll take a close look at the concept of spatial autocorrelation

But, in the context of assumptions underlying the OLS model, this raises some serious problems

Correlated Disturbances


Cov[ε] = E[ε ε’] = σ² Σ ≠ σ² I

• This can arise from a number of violations of the OLS assumptions
• Often occurs when there is an important independent variable missing from the specification
• It causes OLS parameter estimates or their associated standard errors to be unreliable
• It inflates the value of the R² statistic
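Why the standard errors become unreliable can be made explicit with a short derivation. This is a sketch under added assumptions not stated on the slide: X is treated as fixed and E[ε] = 0, so that b remains unbiased and only its variance is affected. Writing b − β = (X’X)⁻¹X’ε:

$$
\operatorname{Var}[\mathbf{b}]
= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\operatorname{Cov}[\boldsymbol{\varepsilon}]\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}
= \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\Sigma}\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}
$$

The usual OLS formula reports $\hat{\sigma}^2(\mathbf{X}'\mathbf{X})^{-1}$, which matches the expression above only when $\boldsymbol{\Sigma} = \mathbf{I}$. With correlated disturbances the reported standard errors therefore describe the wrong quantity, even if the coefficient estimates themselves are centered on the right values.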

If non-zero covariance between X and ε

E[X’ε] ≠ 0

OLS parameter estimates are biased

E[b] = E[(X’X)⁻¹X’y] = E[(X’X)⁻¹X’(Xβ + ε)] = β + (X’X)⁻¹E[X’ε] ≠ β

OLS parameter estimates are inconsistent

plim[b] = β + plim[(X’X/n)⁻¹] × plim[X’ε/n]
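The plim expression can be checked numerically in the scalar case. The sketch below uses a hypothetical design in which x is deliberately built to be correlated with the error: x = z + γε with z and ε independent N(0, 1). Under that assumption Cov(x, ε) = γ and Var(x) = 1 + γ², so the OLS slope should converge to β + γ/(1 + γ²) rather than to β. The values β = 2 and γ = 1, the seed and the sample sizes are arbitrary choices.

```python
import random

def ols_slope(xs, ys):
    """Slope of a simple OLS regression of ys on xs (with intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def endogenous_slope(n, beta=2.0, gamma=1.0, seed=7):
    """Estimate the slope when x is correlated with the error by construction."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z, eps = rng.gauss(0, 1), rng.gauss(0, 1)
        x = z + gamma * eps            # x shares the error term, so E[x * eps] != 0
        xs.append(x)
        ys.append(beta * x + eps)
    return ols_slope(xs, ys)

beta, gamma = 2.0, 1.0
limit = beta + gamma / (1 + gamma ** 2)   # plim of the OLS slope under this design
for n in (100, 10_000, 100_000):
    print(f"n = {n:>7,}: slope = {endogenous_slope(n):.3f} "
          f"(true beta = {beta}, plim = {limit})")
```

More data does not help here: the estimate settles down, but on the wrong value, which is what inconsistency means.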


Readings for today

• Anselin, Luc. 1989. “What is Special About Spatial Data? Alternative Perspectives on Spatial Data Analysis.” NCGIA Technical Paper 89-4.
• Ward, Michael D., and Kristian Skrede Gleditsch. 2008. Spatial Regression Models. Quantitative Applications in the Social Sciences, No. 155. Thousand Oaks, CA: Sage. Chapter 1. [Book is available online at: http://faculty.washington.edu/mdw/pdfs/SRMbook.pdf ]
• Goodchild, Michael F., Luc Anselin, Richard P. Appelbaum, and Barbara Herr Harthorn. 2000. “Toward Spatially Integrated Social Science.” International Regional Science Review 23:139-159.
• Loftin, Colin, and Sally K. Ward. 1983. “A Spatial Autocorrelation Model of the Effects of Population Density on Fertility.” American Sociological Review 48(1):121-128.
• Galle, Omer R., Walter R. Gove, and J. Miller McPherson. 1972. “Population Density and Pathology: What Are the Relations for Man?” Science (new series) 176:23-30.

Afternoon Lab

Shapefiles and introduction to GeoDa™ and EDA using GeoDa & R

Questions?
