Interaction Analysis of Spatial Point Patterns - chris/Medrano_GEO 210C/GEO 210C... 2011-03-29


  • Interaction Analysis of Spatial Point Patterns Geog 210C

    Introduction to Spatial Data Analysis

    Phaedon C. Kyriakidis www.geog.ucsb.edu/~phaedon

    Department of Geography

    University of California Santa Barbara

    Santa Barbara, CA 93106-4060

    phaedon@geog.ucsb.edu

    Spring Quarter 2009

    Spatial Point Patterns

    Definition: Set of point locations with recorded “events” within a study region, e.g., locations of trees, disease or crime incidents

    [Figure: two scatter plots, each with both axes running from 0 to 100, titled "N=100 clustered events in a study region" and "N=100 random events in a study region"]

    ! point locations could correspond to all possible events or to subsets of them (mapped versus sampled point pattern)

    ! attribute values could have also been measured at event locations, e.g., tree diameter (marked point pattern) – not considered in this handout

    Objective of this handout

    ! Introduce statistical tools for quantifying spatial interaction of events, e.g., clustering versus randomness or regularity

    Ph. Kyriakidis (UCSB) Geog 210C Spring 2009 2 / 27

  • Outline

    Concepts & Notation

    Distance & Distance Matrices

    Distances Involved in Spatial Point Patterns

    Quantifying Spatial Interaction: G Function

    Quantifying Spatial Interaction: F Function

    Quantifying Spatial Interaction: K Function

    Points To Remember


    Concepts & Notation

    Some Notation

    Point events: Set of $N$ locations of events occurring in a study area:

    $$\{\mathbf{u}_i,\ i = 1, \ldots, N\}, \quad \mathbf{u}_i \in D \subset \mathbb{R}^K$$

    $\mathbf{u}_i$ = coordinate vector of the $i$-th event location, e.g., in 2D $\mathbf{u}_i = (x_i, y_i)$; $\in$ = belongs to; $D$ = study domain, a subset ($\subset$) of a $K$-dimensional space $\mathbb{R}^K$

    Variable of interest: $y(s)$ = number of events (a count) within an arbitrary domain or support $s$ with measure (length, area, volume) $|s|$; support $s$ is centered at an arbitrary location $\mathbf{u}$ and can also be denoted as $s(\mathbf{u})$; in statistics, $y(s)$ is treated as a realization of a random variable (RV) $Y(s)$

    Objective: Quantify interaction, e.g., covariation, between outcomes of any two RVs $Y(s)$ and $Y(s')$. To do so, all RVs must lie in the same “environment”; in other words, the long-term average (expectation) of RV $Y(s)$ should be similar to that of $Y(s')$


  • Intensity of Events

    Local intensity $\lambda(\mathbf{u})$: Mean number of events per unit area at an arbitrary location or point $\mathbf{u}$, formally defined as:

    $$\lambda(\mathbf{u}) = \lim_{|s| \to 0} \left\{ \frac{E\{Y(s)\}}{|s|} \right\}, \quad \mathbf{u} \in D$$

    where $E\{Y(s)\}$ denotes the expectation (mean) of RV $Y(s)$ within region $s(\mathbf{u})$ centered at $\mathbf{u}$ and $|s|$ is the area of that region

    Overall intensity $\lambda$: Estimated as $\hat{\lambda} = N / |D|$, where $N$ = number of observed events and $|D|$ = measure (area) of study region $D$

    First-order stationarity: Any RV $Y(s)$ should have the same long-term average, for a fixed areal unit $s$. This implies a constant intensity, $\lambda(\mathbf{u}) = \lambda, \ \forall \mathbf{u} \in D$, and the expected number of events within a region $s$ is just a function of $|s|$: $E\{Y(s)\} = \lambda |s|, \ s \subset D$
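The intensity estimate and the expected count under first-order stationarity can be illustrated with a minimal sketch. It assumes a hypothetical regular grid of 100 events in a 100 x 100 region (not data from the handout) and a disc-shaped support; the function names `estimate_intensity` and `count_in_disc` are illustrative choices, not part of the course material.

```python
import math

def estimate_intensity(events, domain_area):
    """Overall intensity estimate: number of events divided by |D|."""
    return len(events) / domain_area

def count_in_disc(events, center, radius):
    """y(s): number of events inside a disc support s centered at `center`."""
    cx, cy = center
    return sum(1 for (x, y) in events if math.hypot(x - cx, y - cy) <= radius)

# Hypothetical pattern (assumption): a regular 10 x 10 grid of events
# inside a 100 x 100 study region.
events = [(5 + 10 * i, 5 + 10 * j) for i in range(10) for j in range(10)]
domain_area = 100 * 100

lam_hat = estimate_intensity(events, domain_area)      # 100 / 10000 = 0.01

radius = 20.0
support_area = math.pi * radius ** 2                    # |s| for a disc
expected_count = lam_hat * support_area                 # lambda * |s|, about 12.57
observed_count = count_in_disc(events, (50, 50), radius)  # 12 events fall in the disc

print(lam_hat, expected_count, observed_count)
```

For this grid the observed count (12) is close to the value $\hat{\lambda}|s| \approx 12.57$ suggested by a constant intensity; for a clustered pattern the count would vary much more with the position of the support.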


    Interaction Between Count RVs

    Second-order intensity: Long-term average (expectation) of products of counts per unit area at any two arbitrary points $\mathbf{u}$ and $\mathbf{u}'$, formally defined as:

    $$\sigma(\mathbf{u}, \mathbf{u}') = \lim_{|s|, |s'| \to 0} \left\{ \frac{E\{Y(s)\,Y(s')\}}{|s|\,|s'|} \right\}, \quad \mathbf{u}, \mathbf{u}' \in D$$

    Some terminology

    ! second-order stationarity: expectation of all RVs is constant (first-order stationarity), and second-order intensity is a function only of the separation vector between any two locations $\mathbf{u}$ and $\mathbf{u}'$

    ! isotropy: only distance (not orientation) of separation vector matters

    Outlook: Quantifying interaction in spatial point patterns under the above assumptions (working hypotheses) amounts to studying distances between events


    Note: $E\{Y(s)\,Y(s')\}$ is in general not the same as $E\{Y(s)\}\,E\{Y(s')\}$; the two coincide when the RVs are independent (or, more generally, uncorrelated)

  • Distance & Distance Matrices

    Distance

    A measure of proximity (typically along a crow’s flight path) between any two locations or spatial entities

    Euclidean distance: Consider two points in a 2D (geographical or other) space with coordinates $\mathbf{u}_i = (x_i, y_i)$ and $\mathbf{u}_j = (x_j, y_j)$. The Euclidean distance $d_{ij}$ between points $\mathbf{u}_i$ and $\mathbf{u}_j$ is computed via Pythagoras’s theorem as:

    $$d_{ij} = d(\mathbf{u}_i, \mathbf{u}_j) = \|\mathbf{u}_i - \mathbf{u}_j\| = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$$

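    $\|\mathbf{u}_i - \mathbf{u}_j\|$ is called the 2-norm of vector $\mathbf{h}_{ij} = \mathbf{u}_i - \mathbf{u}_j$; since $\mathbf{u}_j + \mathbf{h}_{ij} = \mathbf{u}_i$, locations $\mathbf{u}_j$ and $\mathbf{u}_i$ are, respectively, the tail and head of vector $\mathbf{h}_{ij}$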

    [Figure: points $\mathbf{u}_i = (x_i, y_i)$ and $\mathbf{u}_j = (x_j, y_j)$ plotted on $x$ and $y$ axes, separated by distance $d_{ij}$]
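As a small worked example of the Euclidean distance formula, the sketch below computes $d_{ij}$ for two hypothetical 2D points chosen to form a 3-4-5 right triangle. The function name `euclidean_distance` and the coordinates are assumptions made for illustration.

```python
import math

def euclidean_distance(u_i, u_j):
    """Euclidean (2-norm) distance between two 2D points given as (x, y)."""
    dx = u_i[0] - u_j[0]
    dy = u_i[1] - u_j[1]
    return math.sqrt(dx * dx + dy * dy)

# Hypothetical coordinates (assumption): the legs of the triangle are 3 and 4.
u_i = (1.0, 2.0)
u_j = (4.0, 6.0)

d_ij = euclidean_distance(u_i, u_j)   # sqrt(3^2 + 4^2) = 5.0
d_ji = euclidean_distance(u_j, u_i)   # same value: the distance is symmetric

print(d_ij, d_ji)
```

The standard library function `math.dist` (Python 3.8 and later) computes the same quantity for points of any dimension.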


    Distance Metric

    Formal characteristics of a distance metric: A measure $d_{ij}$ of proximity between locations $\mathbf{u}_i$ and $\mathbf{u}_j$ is a valid distance metric if it satisfies the following requirements:

    ! distance between a point and itself is always zero: $d_{ii} = 0$

    ! distance between a point and another one is always positive: $d_{ij} > 0$ for $i \neq j$

    ! distance between two points is the same no matter which point you consider first: $d_{ij} = d_{ji}$

    ! the triangle inequality holds: sum of lengths of two sides of a triangle cannot be smaller than length of third side: $d_{ij} \leq d_{il} + d_{lj}$

    A measure $d_{ij}$ need not always be Euclidean, hence it should be checked to ensure that it is a valid distance metric
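The four requirements above can be checked numerically for any square table of pairwise "distances". The following is a minimal sketch, assuming the measure is supplied as a nested list `d[i][j]`; the function name `is_valid_metric`, the tolerance, and the three example matrices are hypothetical.

```python
def is_valid_metric(d, tol=1e-12):
    """Check the four metric requirements on a square matrix d[i][j]."""
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:                      # d_ii = 0
            return False
        for j in range(n):
            if i != j and d[i][j] <= 0:             # d_ij > 0 for distinct points
                return False
            if abs(d[i][j] - d[j][i]) > tol:        # symmetry: d_ij = d_ji
                return False
            for l in range(n):
                if d[i][j] > d[i][l] + d[l][j] + tol:   # triangle inequality
                    return False
    return True

# Euclidean distances among (0,0), (3,0), (0,4): a valid metric.
euclidean_d = [[0, 3, 4],
               [3, 0, 5],
               [4, 5, 0]]

# Hypothetical travel times that differ by direction (assumption): not symmetric.
travel_time = [[0, 10, 20],
               [15, 0, 12],
               [20, 12, 0]]

# Symmetric, but d_02 = 5 exceeds d_01 + d_12 = 2: triangle inequality fails.
shortcut = [[0, 1, 5],
            [1, 0, 1],
            [5, 1, 0]]

print(is_valid_metric(euclidean_d), is_valid_metric(travel_time), is_valid_metric(shortcut))
```

The triple loop is cubic in the number of points, which is acceptable for small examples; it is meant as a check of the definition rather than an efficient routine.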


  • Non-Euclidean Distances

    Alternative “distance” measures: (i) over a road or railway, (ii) along a river, (iii) over a network

    [Figure: five locations $\mathbf{u}_1, \ldots, \mathbf{u}_5$; legend: Euclidean distance between locations, network distance between locations]

    Even more exotic “distance” measures: (i) travel time over a network, (ii) perceived travel time between urban landmarks, (iii) volume of exports/imports

    Euclidean distances between network nodes $\neq$ actual or perceived distances on the network; the latter might not even be formal distance metrics, i.e., it may be that $d_{ij} \neq d_{ji}$
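To make the contrast between Euclidean and network distance concrete, here is a minimal sketch on a hypothetical five-node network. The node coordinates and links are invented for illustration and are not those of the slide's figure. Shortest-path lengths are obtained with the Floyd-Warshall algorithm, which is one convenient choice for a small network.

```python
import math

# Hypothetical node coordinates and undirected links (assumptions).
nodes = {1: (0.0, 0.0), 2: (4.0, 0.0), 3: (4.0, 3.0), 4: (0.0, 3.0), 5: (8.0, 3.0)}
edges = [(1, 2), (2, 3), (3, 5), (1, 4)]

def euclid(a, b):
    """Straight-line distance between nodes a and b."""
    return math.dist(nodes[a], nodes[b])

def network_distances(nodes, edges):
    """All-pairs shortest-path lengths (Floyd-Warshall); each link's
    length is taken to be its Euclidean length."""
    ids = sorted(nodes)
    dist = {(a, b): (0.0 if a == b else math.inf) for a in ids for b in ids}
    for a, b in edges:
        dist[(a, b)] = dist[(b, a)] = euclid(a, b)
    for k in ids:                      # allow paths through intermediate node k
        for a in ids:
            for b in ids:
                via_k = dist[(a, k)] + dist[(k, b)]
                if via_k < dist[(a, b)]:
                    dist[(a, b)] = via_k
    return dist

net = network_distances(nodes, edges)

# Nodes 4 and 3 are 4 units apart in a straight line, but the only route
# on the network is 4 -> 1 -> 2 -> 3, of length 3 + 4 + 3 = 10.
print(euclid(4, 3), net[(4, 3)])
```

Because every link here is undirected, the resulting network distance is still symmetric; direction-dependent travel times would break that property.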


    Minkowski’s Generalized Distance

    Definition: Consider two points in a $K$-dimensional (geographical or other) space $\mathbb{R}^K$ with coordinate vectors $\mathbf{u}_i = [u_{i1}, \ldots, u_{ik}, \ldots, u_{iK}]$ and $\mathbf{u}_j = [u_{j1}, \ldots, u_{jk}, \ldots, u_{jK}]$. The Minkowski distance of order $p$ (with $p \geq 1$), denoted as $d^{(p)}_{ij}$, between points $\mathbf{u}_i$ and $\mathbf{u}_j$ is computed as:

    $$d^{(p)}_{ij} = \left( \sum_{k=1}^{K} |u_{ik} - u_{jk}|^p \right)^{1/p}$$

    Particular cases:

    ! Manhattan or city-block distance: $d^{(1)}_{ij} = \sum_{k=1}^{K} |u_{ik} - u_{jk}|$

    ! Euclidean distance: $d^{(2)}_{ij} = \sqrt{\sum_{k=1}^{K} |u_{ik} - u_{jk}|^2}$

    ! infinity norm or Chebyshev distance, as $p \to \infty$: $\max(|u_{i1} - u_{j1}|, \ldots, |u_{ik} - u_{jk}|, \ldots, |u_{iK} - u_{jK}|)$

    Distances computed from points in multidimensional spaces are routinely used in statistical pattern recognition; points represent objects or cases, each described by $K$ attribute values
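The Minkowski formula and its particular cases can be sketched as follows. The function name `minkowski_distance` and the two sample points are assumptions made for illustration; the Chebyshev case is handled separately as the limit $p \to \infty$.

```python
import math

def minkowski_distance(u_i, u_j, p):
    """Minkowski distance of order p >= 1 between two K-dimensional points;
    p = math.inf gives the Chebyshev (infinity-norm) distance."""
    if len(u_i) != len(u_j):
        raise ValueError("points must have the same number of coordinates")
    if p < 1:
        raise ValueError("order p must be at least 1")
    diffs = [abs(a - b) for a, b in zip(u_i, u_j)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

# Hypothetical points in K = 3 dimensions (assumption).
u_i = (0.0, 0.0, 0.0)
u_j = (3.0, 4.0, 0.0)

manhattan = minkowski_distance(u_i, u_j, 1)          # 3 + 4 + 0 = 7.0
euclidean = minkowski_distance(u_i, u_j, 2)          # sqrt(9 + 16) = 5.0
chebyshev = minkowski_distance(u_i, u_j, math.inf)   # max(3, 4, 0) = 4.0
large_p = minkowski_distance(u_i, u_j, 50)           # slightly above 4, near the Chebyshev value

print(manhattan, euclidean, chebyshev, large_p)
```

As $p$ grows, the largest coordinate difference dominates the sum, which is why the order-50 value is already very close to the Chebyshev distance.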


  • Euclidean Distance Matrix: Single Set of Points

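    Definition: Consider a set of $N$ points $\{\mathbf{u}_1, \ldots, \mathbf{u}_i, \ldots, \mathbf{u}_N\}$ in a $K$-dimensional (geographical or other) space. The distance matrix $\mathbf{D}$ is a square $(N \times N)$ matrix containing the distances $\{d(\mathbf{u}_i, \mathbf{u}_j),\ i = 1, \ldots, N,\ j = 1, \ldots, N\}$ between all $N \times N$ possible pairs of points in the set

    $$\begin{array}{c|ccccc}
    \mathbf{u}_i & \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 & \mathbf{u}_4 & \mathbf{u}_5 \\ \hline
    x_i & x_1 & x_2 & x_3 & x_4 & x_5 \\
    y_i & y_1 & y_2 & y_3 & y_4 & y_5
    \end{array}$$

    by convention, $\mathbf{u}_1$ is the coordinate vector of the 1st point in the set (1st entry in data file)

    $$\mathbf{D} =
    \begin{bmatrix}
    d_{11} & d_{12} & d_{13} & d_{14} & d_{15} \\
    d_{21} & d_{22} & d_{23} & d_{24} & d_{25} \\
    d_{31} & d_{32} & d_{33} & d_{34} & d_{35} \\
    d_{41} & d_{42} & d_{43} & d_{44} & d_{45} \\
    d_{51} & d_{52} & d_{53} & d_{54} & d_{55}
    \end{bmatrix}
    =
    \begin{bmatrix}
    0 & d_{12} & d_{13} & d_{14} & d_{15} \\
    d_{12} & 0 & d_{23} & d_{24} & d_{25} \\
    d_{13} & d_{23} & 0 & d_{34} & d_{35} \\
    d_{14} & d_{24} & d_{34} & 0 & d_{45} \\
    d_{15} & d_{25} & d_{35} & d_{45} & 0
    \end{bmatrix}
    = [d_{ij}]$$

    The $i$-th row (or column) contains the distances between the $i$-th point $\mathbf{u}_i$ and all others (including itself). $\mathbf{D}$ is symmetric with zeros along its diagonal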
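A minimal sketch of building the $(N \times N)$ Euclidean distance matrix for a single set of points follows. The five 2D points and the function name `distance_matrix` are assumptions made for illustration. Only the upper triangle is computed, and symmetry fills in the rest.

```python
import math

def distance_matrix(points):
    """Square matrix of Euclidean distances between all pairs of points."""
    n = len(points)
    D = [[0.0] * n for _ in range(n)]          # zeros along the diagonal
    for i in range(n):
        for j in range(i + 1, n):              # upper triangle only
            d = math.dist(points[i], points[j])
            D[i][j] = d
            D[j][i] = d                        # symmetry: d_ij = d_ji
    return D

# Hypothetical point set with N = 5 (assumption).
points = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0), (6.0, 8.0)]
D = distance_matrix(points)

for row in D:
    print(["%.2f" % d for d in row])
```

Row `D[0]` then lists the distances from the first point to every point in the set, starting with the zero distance to itself.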


    Euclidean Distance Matrix: Two Sets of Points

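    Definition: Consider 2 sets of points $\{\mathbf{u}_1, \ldots, \mathbf{u}_i, \ldots, \mathbf{u}_N\}$ and $\{\mathbf{t}_1, \ldots, \mathbf{t}_j, \ldots, \mathbf{t}_M\}$ in a $K$-dimensional (geographical or other) space. The distance matrix $\mathbf{D}$ is an $(N \times M)$ matrix containing the Euclidean distances $\{d(\mathbf{u}_i, \mathbf{t}_j),\ i = 1, \ldots, N,\ j = 1, \ldots, M\}$ between all $N \times M$ possible pairs formed by these two sets of points

    $$\begin{array}{c|ccccc}
    \mathbf{u}_i & \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 & \mathbf{u}_4 & \mathbf{u}_5 \\ \hline
    x_i & x_1 & x_2 & x_3 & x_4 & x_5 \\
    y_i & y_1 & y_2 & y_3 & y_4 & y_5
    \end{array}$$

    $$\begin{array}{c|ccccccc}
    \mathbf{t}_j & \mathbf{t}_1 & \mathbf{t}_2 & \mathbf{t}_3 & \mathbf{t}_4 & \mathbf{t}_5 & \mathbf{t}_6 & \mathbf{t}_7 \\ \hline
    x_j & x_1 & x_2 & x_3 & x_4 & x_5 & x_6 & x_7 \\
    y_j & y_1 & y_2 & y_3 & y_4 & y_5 & y_6 & y_7
    \end{array}$$

    by convention, $\mathbf{u}_1$ is the coordinate vector of the 1st datum in data set #1, and similarly for $\mathbf{t}_1$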

    D =

    

    d11 d12 d13 d14 d15 d16 d17 d21 d22 d23 d24 d25 d26 d27 d31 d32 d33 d34 d35 d36 d37 d41 d42 d