Interaction Analysis of Spatial Point Patterns Geog 210C

Introduction to Spatial Data Analysis

Phaedon C. Kyriakidis www.geog.ucsb.edu/∼phaedon

Department of Geography

University of California Santa Barbara

Santa Barbara, CA 93106-4060

phaedon@geog.ucsb.edu

Spring Quarter 2009

Spatial Point Patterns

Definition Set of point locations with recorded “events” within study region, e.g., locations of trees, disease or crime incidents

[Figure, two panels, each with both axes running from 0 to 100: N=100 clustered events in a study region; N=100 random events in a study region]

• point locations could correspond to all possible events or to subsets of them (mapped versus sampled point pattern)

• attribute values could have also been measured at event locations, e.g., tree diameter (marked point pattern) – not considered in this handout
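The two kinds of pattern described above can be imitated with a short simulation. This is only a sketch: the handout does not say how its example patterns were generated, so the parent-and-offspring recipe for the clustered case, the function names, and all parameter values (`n_parents`, `spread`, the 100 x 100 region) are assumptions made for illustration.

```python
import numpy as np

def simulate_random(n, size=100.0, rng=None):
    """n events placed independently and uniformly in a size x size square."""
    rng = np.random.default_rng(rng)
    return rng.uniform(0.0, size, size=(n, 2))

def simulate_clustered(n, n_parents=5, spread=5.0, size=100.0, rng=None):
    """n events scattered (Gaussian offsets) around randomly placed parents."""
    rng = np.random.default_rng(rng)
    parents = rng.uniform(0.0, size, size=(n_parents, 2))
    # Assign each event to a randomly chosen parent, then jitter around it.
    which = rng.integers(0, n_parents, size=n)
    pts = parents[which] + rng.normal(0.0, spread, size=(n, 2))
    # Clip so that every event stays inside the study region (assumed choice).
    return np.clip(pts, 0.0, size)

random_pts = simulate_random(100, rng=0)        # hypothetical "random" pattern
clustered_pts = simulate_clustered(100, rng=0)  # hypothetical "clustered" pattern
print(random_pts.shape, clustered_pts.shape)
```

Clipping is a crude way of keeping events inside the region and can pile events up on the boundary; rejecting and redrawing out-of-region events would be a cleaner alternative.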

Objective of this handout

• Introduce statistical tools for quantifying spatial interaction of events, e.g., clustering versus randomness or regularity

Ph. Kyriakidis (UCSB) Geog 210C Spring 2009 2 / 27

Outline

Concepts & Notation

Distance & Distance Matrices

Distances Involved in Spatial Point Patterns

Quantifying Spatial Interaction: G Function

Quantifying Spatial Interaction: F Function

Quantifying Spatial Interaction: K Function

Points To Remember

Concepts & Notation

Some Notation

Point events Set of N locations of events occurring in a study area:

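$$\{\mathbf{u}_i,\ i = 1, \ldots, N\}, \quad \mathbf{u}_i \in D \subset \mathbb{R}^K$$

$\mathbf{u}_i$ = coordinate vector of $i$-th event location, e.g., in 2D $\mathbf{u}_i = (x_i, y_i)$; $\in$ = belongs to; $D$ = study domain, a subset ($\subset$) of a $K$-dimensional space $\mathbb{R}^K$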

Variable of interest $y(s)$ = number of events (a count) within arbitrary domain or support $s$ with measure (length, area, volume) $|s|$; support $s$ is centered at an arbitrary location $\mathbf{u}$ and can also be denoted as $s(\mathbf{u})$; in statistics, $y(s)$ is treated as a realization of a random variable (RV) $Y(s)$

Objective Quantify interaction, e.g., covariation, between outcomes of any two RVs $Y(s)$ and $Y(s')$. To do so, all RVs must lie in the same “environment”; in other words, the long-term average (expectation) of RV $Y(s)$ should be similar to that of $Y(s')$

Intensity of Events

Local intensity $\lambda(\mathbf{u})$ Mean number of events per unit area at an arbitrary location or point $\mathbf{u}$, formally defined as:

$$\lambda(\mathbf{u}) = \lim_{|s| \to 0} \left\{ \frac{E\{Y(s)\}}{|s|} \right\}, \quad \mathbf{u} \in D$$

where $E\{Y(s)\}$ denotes the expectation (mean) of RV $Y(s)$ within region $s(\mathbf{u})$ centered at $\mathbf{u}$ and $|s|$ is the area of that region

Overall intensity $\lambda$

Estimated as: $\hat{\lambda} = N / |D|$, where $N$ = number of observed events and $|D|$ = measure (area) of study region $D$

First-order stationarity Any RV $Y(s)$ should have the same long-term average, for a fixed areal unit $s$. This implies a constant intensity: $\lambda(\mathbf{u}) = \lambda,\ \forall \mathbf{u} \in D$, and the expected number of events within a region $s$ is just a function of $|s|$: $E\{Y(s)\} = \lambda |s|,\ s \subset D$
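As a rough numerical illustration of the overall intensity estimate and of the expected count under first-order stationarity, the sketch below estimates the intensity for a hypothetical pattern and compares the expected and observed counts in one rectangular support. The function names, the simulated data, and the 20 x 20 support are assumptions for illustration only.

```python
import numpy as np

def overall_intensity(points, region_area):
    """Estimated overall intensity: number of events divided by |D|."""
    return len(points) / region_area

def count_in_rectangle(points, xmin, xmax, ymin, ymax):
    """Count y(s): number of events falling inside a rectangular support s."""
    x, y = points[:, 0], points[:, 1]
    inside = (x >= xmin) & (x < xmax) & (y >= ymin) & (y < ymax)
    return int(inside.sum())

rng = np.random.default_rng(42)
points = rng.uniform(0.0, 100.0, size=(100, 2))  # hypothetical pattern in a 100 x 100 region

lam_hat = overall_intensity(points, region_area=100.0 * 100.0)
s_area = 20.0 * 20.0                             # |s| for a 20 x 20 support
expected = lam_hat * s_area                      # E{Y(s)} = lambda * |s|
observed = count_in_rectangle(points, 0.0, 20.0, 0.0, 20.0)
print(lam_hat, expected, observed)
```

A single observed count will generally differ from the expected value; the relation concerns the long-term average over many realizations.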

Interaction Between Count RVs

Second-order intensity Long-term average (expectation) of products of counts per unit area at any two arbitrary points $\mathbf{u}$ and $\mathbf{u}'$, formally defined as:

$$\sigma(\mathbf{u}, \mathbf{u}') = \lim_{|s|,|s'| \to 0} \left\{ \frac{E\{Y(s)\,Y(s')\}}{|s|\,|s'|} \right\}, \quad \mathbf{u}, \mathbf{u}' \in D$$

Some terminology

• second-order stationarity: expectation of all RVs is constant (first-order stationarity), and second-order intensity is a function of separation vector between any two locations $\mathbf{u}$ and $\mathbf{u}'$

• isotropy: only distance (not orientation) of separation vector matters

Outlook Quantifying interaction in spatial point patterns within the above assumptions or working hypotheses amounts to studying distances between events

Note: $E\{Y(s)\,Y(s')\}$ is not the same as $E\{Y(s)\}\,E\{Y(s')\}$, unless the two RVs are uncorrelated (as is the case when they are independent)
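The remark about the expectation of a product can be made explicit with a short derivation. The covariance symbol $C\{\cdot,\cdot\}$ is introduced here for illustration and is not part of the handout's notation; the second line assumes first-order stationarity, so that $E\{Y(s)\} = \lambda |s|$.

```latex
\begin{align*}
E\{Y(s)\,Y(s')\} &= C\{Y(s), Y(s')\} + E\{Y(s)\}\,E\{Y(s')\} \\
                 &= C\{Y(s), Y(s')\} + \lambda^2\,|s|\,|s'|
\end{align*}
```

Dividing by $|s|\,|s'|$ shows that the ratio inside the limit equals $\lambda^2$ plus a covariance term per unit area squared; that extra term vanishes when the two counts are uncorrelated, so any departure from $\lambda^2$ reflects interaction between counts.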

Distance & Distance Matrices

Distance

A measure of proximity (typically along a crow’s flight path) between any two locations or spatial entities

Euclidean distance Consider two points in a 2D (geographical or other) space with coordinates $\mathbf{u}_i = (x_i, y_i)$ and $\mathbf{u}_j = (x_j, y_j)$. The Euclidean distance $d_{ij}$ between points $\mathbf{u}_i$ and $\mathbf{u}_j$ is computed via Pythagoras’s theorem as:

$$d_{ij} = d(\mathbf{u}_i, \mathbf{u}_j) = \|\mathbf{u}_i - \mathbf{u}_j\| = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$$

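$\|\mathbf{u}_i - \mathbf{u}_j\|$ is called the 2-norm of vector $\mathbf{h}_{ij} = \mathbf{u}_i - \mathbf{u}_j$; since $\mathbf{u}_j + \mathbf{h}_{ij} = \mathbf{u}_i$, locations $\mathbf{u}_j$ and $\mathbf{u}_i$ are, respectively, the tail and head of vector $\mathbf{h}_{ij}$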
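The formula translates directly into code. A minimal sketch for the 2D case, with a hypothetical function name and made-up coordinates:

```python
import math

def euclidean_distance(ui, uj):
    """Euclidean distance between two 2D points given as (x, y) pairs."""
    dx = ui[0] - uj[0]
    dy = ui[1] - uj[1]
    return math.sqrt(dx * dx + dy * dy)

# A 3-4-5 right triangle: the distance is the hypotenuse.
print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))  # 5.0
```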

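[Figure: points $\mathbf{u}_i = (x_i, y_i)$ and $\mathbf{u}_j = (x_j, y_j)$ plotted against $x$ and $y$ axes and joined by a segment of length $d_{ij}$, with horizontal and vertical offsets $x_i - x_j$ and $y_i - y_j$]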

Distance Metric

Formal characteristics of a distance metric A measure $d_{ij}$ of proximity between locations $\mathbf{u}_i$ and $\mathbf{u}_j$ is a valid distance metric if it satisfies the following requirements:

• distance between a point and itself is always zero: $d_{ii} = 0$

• distance between a point and another one is always positive: $d_{ij} > 0$

• distance between two points is the same no matter which point you consider first: $d_{ij} = d_{ji}$

• the triangular inequality holds: sum of length of two sides of a triangle cannot be smaller than length of third side: $d_{ij} \leq d_{il} + d_{lj}$

A proximity measure $d_{ij}$ need not always be Euclidean, and hence should be checked to ensure that it is a valid distance metric
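The four requirements can be checked numerically for a given square matrix of pairwise proximities. The sketch below is one possible check; the function name, the tolerance, and the example matrices are assumptions for illustration.

```python
import numpy as np

def is_valid_metric(D, tol=1e-12):
    """Check the four distance-metric requirements on a square matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    zero_diagonal = np.all(np.abs(np.diag(D)) <= tol)   # d_ii = 0
    positive = np.all(D[off_diag] > 0)                   # d_ij > 0 for i != j
    symmetric = np.allclose(D, D.T, atol=tol)            # d_ij = d_ji
    # Triangle inequality: via[i, l, j] = d_il + d_lj must be >= d_ij.
    via = D[:, :, None] + D[None, :, :]
    triangle = np.all(D[:, None, :] <= via + tol)
    return bool(zero_diagonal and positive and symmetric and triangle)

# Distances between three points on a line at positions 0, 1 and 3.
line = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
# Hypothetical travel times that differ by direction (asymmetric).
travel = [[0, 5], [8, 0]]
# Symmetric, but 5 > 1 + 1 violates the triangle inequality.
broken = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]

print(is_valid_metric(line), is_valid_metric(travel), is_valid_metric(broken))
# True False False
```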

Non-Euclidean Distances

Alternative “distance” measures (i) over a road, or railway, (ii) along a river, (iii) over a network

[Figure: five locations $\mathbf{u}_1, \ldots, \mathbf{u}_5$, contrasting the Euclidean distance between locations with the network distance between locations]

Even more exotic “distance” measures (i) travel time over a network, (ii) perceived travel time between urban landmarks, (iii) volume of exports/imports

Euclidean distances between network nodes $\neq$ actual or perceived distances on the network

the latter might not even be formal distance metrics, e.g., when $d_{ij} \neq d_{ji}$
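The gap between Euclidean and network distance can be seen on a toy example. The sketch below uses an entirely hypothetical network (four nodes at the corners of a unit square, linked along three sides only) and computes shortest-path lengths with the Floyd-Warshall algorithm, which is one standard choice and not something the handout prescribes.

```python
import math

def straight_line(p, q):
    """Euclidean distance between two (x, y) pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def network_distances(coords, links):
    """All-pairs shortest-path lengths (Floyd-Warshall), undirected links."""
    ids = list(coords)
    dist = {i: {j: (0.0 if i == j else math.inf) for j in ids} for i in ids}
    for a, b in links:
        # Each link is as long as the straight line between its end nodes.
        dist[a][b] = dist[b][a] = straight_line(coords[a], coords[b])
    for k in ids:
        for i in ids:
            for j in ids:
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist

# Hypothetical nodes at the corners of a unit square.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
# Hypothetical links along three sides; nodes 0 and 3 are not directly linked.
links = [(0, 1), (1, 2), (2, 3)]

net = network_distances(coords, links)
print(straight_line(coords[0], coords[3]), net[0][3])  # 1.0 3.0
```

Because the links here are undirected, the resulting network distance is symmetric; direction-dependent quantities such as travel times on one-way streets would need directed links and could then violate $d_{ij} = d_{ji}$.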

Minkowski’s Generalized Distance

Definition Consider two points in a $K$-dimensional (geographical or other) space $\mathbb{R}^K$ with coordinate vectors $\mathbf{u}_i = [u_{i1}, \ldots, u_{ik}, \ldots, u_{iK}]$ and $\mathbf{u}_j = [u_{j1}, \ldots, u_{jk}, \ldots, u_{jK}]$. The Minkowski distance of order $p$ (with $p \geq 1$), denoted as $d^{(p)}_{ij}$, between points $\mathbf{u}_i$ and $\mathbf{u}_j$ is computed as:

$$d^{(p)}_{ij} = \left( \sum_{k=1}^{K} |u_{ik} - u_{jk}|^p \right)^{1/p}$$

Particular cases

• Manhattan or city-block distance: $d^{(1)}_{ij} = \sum_{k=1}^{K} |u_{ik} - u_{jk}|$

• Euclidean distance: $d^{(2)}_{ij} = \sqrt{\sum_{k=1}^{K} |u_{ik} - u_{jk}|^2}$

• infinity norm or Chebyshev distance, as $p \to \infty$: $\max(|u_{i1} - u_{j1}|, \ldots, |u_{ik} - u_{jk}|, \ldots, |u_{iK} - u_{jK}|)$

Distances computed from points in multidimensional spaces are routinely used in statistical pattern recognition; points represent objects or cases, each described by $K$ attribute values
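The general formula and its particular cases can be sketched in a few lines. The function name and the example points are hypothetical; orders below 1 are rejected because the triangle inequality can then fail.

```python
import numpy as np

def minkowski_distance(ui, uj, p=2.0):
    """Minkowski distance of order p between two K-dimensional points."""
    diff = np.abs(np.asarray(ui, dtype=float) - np.asarray(uj, dtype=float))
    if np.isinf(p):
        # Limiting case p -> infinity: largest absolute coordinate difference.
        return float(diff.max())
    if p < 1:
        raise ValueError("p must be >= 1 for a valid distance metric")
    return float(np.sum(diff ** p) ** (1.0 / p))

ui, uj = [0.0, 0.0], [3.0, 4.0]
print(minkowski_distance(ui, uj, p=1))       # Manhattan: 7.0
print(minkowski_distance(ui, uj, p=2))       # Euclidean: 5.0
print(minkowski_distance(ui, uj, p=np.inf))  # Chebyshev: 4.0
```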

Euclidean Distance Matrix: Single Set of Points

Definition Consider a set of $N$ points $\{\mathbf{u}_1, \ldots, \mathbf{u}_i, \ldots, \mathbf{u}_N\}$ in a $K$-dimensional (geographical or other) space. The distance matrix $\mathbf{D}$ is a square ($N \times N$) matrix containing the distances $\{d(\mathbf{u}_i, \mathbf{u}_j),\ i = 1, \ldots, N,\ j = 1, \ldots, N\}$ between all $N \times N$ possible pairs of points in the set

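$$\begin{array}{c|ccccc}
\mathbf{u}_i & \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 & \mathbf{u}_4 & \mathbf{u}_5 \\ \hline
x_i & x_1 & x_2 & x_3 & x_4 & x_5 \\
y_i & y_1 & y_2 & y_3 & y_4 & y_5
\end{array}$$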

by convention, u1 is the coordinate vector of the 1st point in the set (1st entry in data file)

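$$\mathbf{D} =
\begin{bmatrix}
d_{11} & d_{12} & d_{13} & d_{14} & d_{15} \\
d_{21} & d_{22} & d_{23} & d_{24} & d_{25} \\
d_{31} & d_{32} & d_{33} & d_{34} & d_{35} \\
d_{41} & d_{42} & d_{43} & d_{44} & d_{45} \\
d_{51} & d_{52} & d_{53} & d_{54} & d_{55}
\end{bmatrix}
=
\begin{bmatrix}
0 & d_{12} & d_{13} & d_{14} & d_{15} \\
d_{12} & 0 & d_{23} & d_{24} & d_{25} \\
d_{13} & d_{23} & 0 & d_{34} & d_{35} \\
d_{14} & d_{24} & d_{34} & 0 & d_{45} \\
d_{15} & d_{25} & d_{35} & d_{45} & 0
\end{bmatrix}
= [d_{ij}]$$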

$i$-th row (or column) contains distances between $i$-th point $\mathbf{u}_i$ and all others (including itself)

$\mathbf{D}$ is symmetric with zeros along its diagonal
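A distance matrix for a single set of points can be built without explicit loops. The sketch below relies on NumPy broadcasting; the function name and the three example points are hypothetical.

```python
import numpy as np

def distance_matrix(points):
    """(N x N) Euclidean distance matrix for N points in K dimensions."""
    pts = np.asarray(points, dtype=float)
    # diff[i, j, :] = u_i - u_j, via broadcasting; shape (N, N, K).
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

pts = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]  # hypothetical event locations
D = distance_matrix(pts)
print(D)
```

For these three collinear points the off-diagonal entries are 5 and 10, and the printed matrix is symmetric with zeros on its diagonal, as stated above. The intermediate array holds N x N x K values, so for large N a chunked or library-based computation may be preferable.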

Euclidean Distance Matrix: Two Sets of Points

Definition Consider 2 sets of points $\{\mathbf{u}_1, \ldots, \mathbf{u}_i, \ldots, \mathbf{u}_N\}$ and $\{\mathbf{t}_1, \ldots, \mathbf{t}_j, \ldots, \mathbf{t}_M\}$ in a $K$-dimensional (geographical or other) space. The distance matrix $\mathbf{D}$ is an ($N \times M$) matrix containing the Euclidean distances $\{d(\mathbf{u}_i, \mathbf{t}_j),\ i = 1, \ldots, N,\ j = 1, \ldots, M\}$ between all $N \times M$ possible pairs formed by these two sets of points

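$$\begin{array}{c|ccccc}
\mathbf{u}_i & \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 & \mathbf{u}_4 & \mathbf{u}_5 \\ \hline
x_i & x_1 & x_2 & x_3 & x_4 & x_5 \\
y_i & y_1 & y_2 & y_3 & y_4 & y_5
\end{array}$$

$$\begin{array}{c|ccccccc}
\mathbf{t}_j & \mathbf{t}_1 & \mathbf{t}_2 & \mathbf{t}_3 & \mathbf{t}_4 & \mathbf{t}_5 & \mathbf{t}_6 & \mathbf{t}_7 \\ \hline
x_j & x_1 & x_2 & x_3 & x_4 & x_5 & x_6 & x_7 \\
y_j & y_1 & y_2 & y_3 & y_4 & y_5 & y_6 & y_7
\end{array}$$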

by convention, $\mathbf{u}_1$ is the coordinate vector of the 1st datum in data set #1, and similarly for $\mathbf{t}_1$
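The two-set case differs from the single-set case only in the shape of the result. A minimal sketch, assuming hypothetical random coordinates for 5 points in the first set and 7 in the second (matching the sizes used in the tables above):

```python
import numpy as np

def cross_distance_matrix(u, t):
    """(N x M) Euclidean distances between N points u and M points t."""
    u = np.asarray(u, dtype=float)
    t = np.asarray(t, dtype=float)
    # diff[i, j, :] = u_i - t_j, via broadcasting; shape (N, M, K).
    diff = u[:, None, :] - t[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 100.0, size=(5, 2))  # hypothetical data set #1
t = rng.uniform(0.0, 100.0, size=(7, 2))  # hypothetical data set #2
D = cross_distance_matrix(u, t)
print(D.shape)  # (5, 7)
```

Unlike the single-set matrix, this one is generally neither square nor symmetric, and its diagonal carries no special meaning.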

D =

d11 d12 d13 d14 d15 d16 d17 d21 d22 d23 d24 d25 d26 d27 d31 d32 d33 d34 d35 d36 d37 d41 d42 d