Download pdf - Spatial and Spatiotemporal Data Miningzhejiang.cs.ua.edu/papers/ComprehensiveGIS.pdf · Spatial and Spatiotemporal Data Mining Abstract The significant growth of spatial and spatiotemporal

Spatial and Spatiotemporal Data Mining Abstract

The significant growth of spatial and spatiotemporal data collection as well as the emergence of

new technologies have heightened the need for automated discovery of spatiotemporal

knowledge. Spatial and spatiotemporal data mining techniques are crucial to organizations which

make decisions based on large spatial and spatiotemporal datasets. The interdisciplinary nature of

spatial and spatiotemporal data mining and the complexity of spatial and spatiotemporal data and

relationships pose statistical and computational challenges. This article provides an overview of

recent advances in spatial and spatiotemporal data mining, and reviews common spatial and

spatiotemporal data mining techniques organized by major pattern families, based on an

introduction to spatial and spatiotemporal data types and relationships as well as the statistical

background. New trends and research needs are also summarized, including spatial and

spatiotemporal data mining in the network space and spatial and spatiotemporal big data platform

development.

Keyword

Spatial, Spatiotemporal, Data Mining, Statistics, Computing, Anomaly, Association, Prediction

Model, Partition, Summarization, Hotspots, Change Footprint.

1 Introduction The significant growth of spatial and spatiotemporal data collection as well as the emergence of

new technologies have heightened the need for automated discovery of spatiotemporal

knowledge. Spatial and spatiotemporal data mining studies the process of discovering interesting

and previously unknown, but potentially useful patterns from large spatial and spatiotemporal

databases. The complexity of spatial and spatiotemporal data and implicit relationships limits the

usefulness of conventional data mining techniques for extracting spatial and spatiotemporal

patterns.

As shown in Figure 1, the process of spatial and spatiotemporal data mining starts at

preprocessing of the input data. Typically, this step is to correct noise, errors, and missing data.

Exploratory space-time analysis is also conducted in this step to understand the underlying

spatiotemporal distribution. Then, an appropriate algorithm is selected to run on the preprocessed

data, and produce output patterns. Common output pattern families include spatial and

spatiotemporal anomalies, associations and tele-couplings, predictive models, partitions and

summarization, hotspots, as well as change patterns. Spatial and spatiotemporal data mining

algorithms often have statistical foundations and integrate scalable computational techniques.

Output patterns are post-processed and then interpreted by domain scientists to find novel insights

and refine data mining algorithms when needed.

Figure 1 The process of spatial & spatiotemporal data mining

1.1 Societal Importance

Spatial and spatiotemporal data mining techniques are crucial to organizations which make

decisions based on large spatial and spatiotemporal datasets. Many government and private

agencies are likely beneficiaries of spatial and spatiotemporal data mining including NASA, the

National Geospatial-Intelligence Agency (Krugman (1997)), the National Cancer Institute (Albert

and McShane (1995)), the US Department of Transportation (Shekhar et al. (1993)), and the

National Institute of Justice (Eck et al. (2005)). These organizations are spread across many

application domains. These application domains include public health, mapping and analysis for

public safety, transportation, environmental science and management, economics, climatology,

public policy, earth science, market research and analytics, public utilities and distribution, etc. In

ecology and environmental management (Haining (1993); Isaaks and others (1989); Roddick and

Spiliopoulou (1999); Scally (2006)), researchers need tools to classify remote sensing images to

map forest coverage. In public safety (Leipnik and Albert (2003)), crime analysts are interested in

discovering hotspot patterns from crime event maps so as to effectively allocate police resources.

In transportation (Lang and (Redlands (1999)), researchers analyze historical taxi GPS

trajectories to recommend fast routes from places to places. Epidemiologists (Elliot et al. (2000))

use spatiotemporal data mining techniques to detect disease outbreak.

The interdisciplinary nature of spatial and spatiotemporal data mining means that its techniques

must developed with awareness of the underlying physics or theories in the application domains.

For example, climate science studies find that observable predictors for climate phenomena

discovered by data science techniques can be misleading if they do not take into account climate

models, locations, and seasons. In this case, statistical significance testing is critically important

in order to further validate or discard relationships mined from data.

1.2 Challenges

preprocessing,

exploratory space-time

analysis

spatial & spatiotemporal

data mining algorithms

input:


data

Output:


patterns

post-processing


statistical foundation

computational

techniques

In addition to interdisciplinary challenges, spatial and spatiotemporal data mining also poses

statistical and computational challenges. Extracting interesting and useful patterns from spatial

and spatiotemporal datasets is more difficult than extracting corresponding patterns from

traditional numeric and categorical data due to the complexity of spatiotemporal data types and

relationships, including: (1) the spatial relationships among the variables, i.e., observations that

are not independent and identically distributed (i.i.d.); (2) the spatial structure of errors; and (3)

nonlinear interactions in feature space. According to Tobler’s first law of geography, “Everything

is related to everything else, but near things are more related than distant things.” For example,

people with similar characteristics, occupation and background tend to cluster together in the

same neighborhoods. In spatial statistics such spatial dependence is called the spatial

autocorrelation effect. Ignoring autocorrelation and assuming an identical and independent

distribution (i.i.d.) when analyzing data with spatial and spatio-temporal characteristics may

produce hypotheses or models that are inaccurate or inconsistent with the data set (Shekhar,

Zhang, et al. (2003)). In addition to spatial dependence at nearby locations, phenomena of

spatiotemporal tele-coupling also indicate long range spatial dependence such as El Niño and La

Niña effects in the climate system. Another challenge comes from the fact that spatiotemporal

datasets are embedded in continuous space and time, and thus many classical data mining

techniques assuming discrete data (e.g., transactions in association rule mining) may not be

effective. A third challenge is the spatial heterogeneity and temporal non-stationarity, i.e.,

spatiotemporal temporal data samples do not follow an identical distribution across the entire

space and over all time. Instead, different geographical regions and temporal period may have

distinct distributions. Modifiable area unit problem (MAUP) or multi-scale effect is another

challenge since results of spatial analysis depends on a choice of appropriate spatial and temporal

scales. Finally, flow and movement and Lagrangian framework of reference in spatiotemporal

networks pose challenges (e.g., directionality, anisotropy, etc.).

One way to deal with implicit spatiotemporal relationships is to materialize the relationships into

traditional data input columns and then apply classical data mining techniques (Agrawal et al.

(1993), (1994); Bamnett and Lewis (1994); Jain and Dubes (1988); Quinlan (1993)). However,

the materialization can result in loss of information (Shekhar, Zhang, et al. (2003)). The spatial

and temporal vagueness which naturally exists in data and relationships usually creates further

modeling and processing difficulty in spatial and spatiotemporal data mining. A more preferable

way to capture implicit spatial and spatiotemporal relationships is to develop statistics and

techniques to incorporate spatial and temporal information into the data mining process.

1.3 Related Work

Surveys in spatial and spatiotemporal data mining can be categorized into two groups: ones

without statistical foundations, and ones with a focus on statistical foundation. Among the

surveys without focuses on statistical foundation, Koperski et al. (1996) and Ester et al. (1997)

reviewed spatial data mining from a spatial database approach; Roddick and Spiliopoulou (1999)

provided a bibliography for spatial, temporal and spatiotemporal data mining; Miller and Han

(2009) cover a list of recent spatial and spatiotemporal data mining topics but without a

systematic view of statistical foundation. Among surveys covering statistical foundations,

(Shekhar, Zhang, et al. (2003)) reviewed several spatial pattern families focusing on spatial data’s

unique characteristics; Kisilevich et al. (2009) reviewed spatiotemporal clustering research;

Aggarwal (2013) has a chapter summarizing spatial and spatiotemporal outlier detection

techniques; Zhou et al. (2014) reviews spatial and spatiotemporal change detection research from

an interdisciplinary view; Cheng et al. (2014) reviews state of the art spatiotemporal data mining

research including spatiotemporal autocorrelation, space-time forecasting and prediction, space-

time clustering, as well as space-time visualization; Shekhar et al. (2011) give a review of spatial

data mining research and Shekhar et al. (2015) reviews spatiotemporal data mining research with

a discussion on statistical foundation.

1.4 Scope

We hope this article contributes to spatial and spatiotemporal data mining research in filling these

two gaps. More specifically: (1) we provide a taxonomy of spatial and spatiotemporal data types;

(2) we provide a taxonomy of spatial and spatiotemporal statistics organized by different data

types; (3) we survey common computational techniques for all major spatial and spatiotemporal

pattern families, including spatial and spatiotemporal anomalies, coupling and tele-coupling,

prediction models, partitioning and summarization, hotspots and change patterns.

1.5 Organization of the Paper

This article starts with the characteristics of the data inputs of spatial and spatiotemporal data

mining (Section 2) and an overview of its statistical foundation (Section 3). It then describes in

detail six main output patterns of spatial and spatiotemporal data mining (Section 4). The paper

concludes with an examination of research needs and future directions in Section 5 and Section 6.

2 Input: Spatial and Spatiotemporal Data The input data is one of the important factors shaping spatial and spatiotemporal data mining. In

this section, a taxonomy of different spatial and spatiotemporal data types is introduced, followed

by a summary of the unique attributes and relationships of different data. The goal is to provide a

systematic overview of data in spatial and spatiotemporal data mining tasks.

2.1 Types of Spatial and Spatiotemporal Data

The complexity of its input data makes spatial and spatiotemporal data mining unique when

compared with classical data mining. The cause of this complexity is the conflict between the

discrete representations and the continuous space and time.

There are three models in spatial data, namely, the object model (Figure 2), the field model

(Figure 3), and the spatial network model (Figure 4) (Shekhar and Chawla (2003)). The object

model uses point, line, polygon, and collections of them to describe the world. Objects in this

model are distinctly identified according to the application context, each of which is associated

with some non-spatial attributes. The field model represents spatial data as a function from a

spatial framework, which is a partitioning of space, to non-spatial attributes. For example, a grid

consisting of areas covered by pixels of a remote sensing image is mapped to pixel values in the

image. Contrary to the object model, the field model is more suitable to depict continuous

phenomena. According to the scope of operations working on field model data, the operations are

categorized into four groups: local, focal, zonal, and global operations. Local operations get an

output at a given location based only on the input at that location. Focal operations’ outputs at a

location depend on the input in a small neighborhood of the location as well. Zonal and global

operations work on inputs from a pre-defined zone or global inputs. The spatial network model

extends graph to represent the relationship between spatial elements using vertices and edges. In

addition to binary relationships between vertices represented by edges in a graph, a spatial

network includes more complex relationships, e.g., different kinds of turns at a road intersection.

Based on how temporal information is additionally modeled in addition to spatial data,

spatiotemporal data can be categorized into three types, i.e., temporal snapshot model, temporal

change model, and event or process model (Li et al. (2008); Yuan (1996), (1999)). In the

temporal snapshot model, spatial layers of the same theme are time-stamped. Based on the data

model in each snapshot layer, snapshots can also represent trajectories of raster data, points, lines,

and polygons, and networks such as time expanded graphs (TEGs) and time aggregated graphs

(TAGs) (George and Shekhar (2008)). Remote sensing imagery time series is a typical example

of the temporal snapshot model of raster data (Figure 5). Unlike the temporal snapshot model, the

temporal change model represents spatiotemporal data with a spatial layer at a given start time

together with incremental changes occurring afterward (Figure 6 represents the motion directions

and speed of two objects). For example, it can represent relative motion attributes such as speed,

acceleration, and direction of spatial points (Gelfand et al. (2010); Laube and Imfeld (2002)), as

well as rotation and deformation on lines and polygons. The event or process model describes

things that happen or go on in time (as shown in Figure 7 Event and Process Model). Processes

represent on-going homogeneous situation involving one or more types of things. The

homogeneity of processes indicates that if a process is going on over an interval of time then it is

also going over all the subintervals of the interval, so the properties of a process is subject to

change over time. Events refer to the culmination of processes associated with precise temporal

boundaries. They are used to represent things that have happened typically. (Allen (1984);

Campelo and Bennett (2013); Shekhar and Xiong (2007); Worboys (2005)).

Figure 2 Object Model

Figure 3 Raster Model

Figure 4 Spatial Network Model

Figure 5 Temporal Snapshot Model

Figure 6 Temporal Change Model

Figure 7 Event and Process Model

2.2 Data Attributes and Relationships

Data attributes for spatial and spatiotemporal data can be categorized into three distinct types, i.e.,

non-spatiotemporal attributes, spatial attributes, and temporal attributes. Non-spatiotemporal

attributes are the same as the attributes of the input data of classical data, which characterize non-

contextual features of objects, such as name, population, and unemployment rate of a city. Spatial

attributes define the spatial location (e.g., longitude, latitude, and elevation) of spatial points,

spatial extent (e.g., area, perimeter) and shape of spatial polylines and polygons in a spatial

reference frame. Temporal attributes include the timestamp of a spatial layer (e.g., a raster layer,

points), as well as the duration of a process.

Relationships on non-spatiotemporal attributes are made explicit through arithmetic, ordering,

and set relations. In contrast, relationships among spatial objects are often implicit, such as

intersection, within, and below. The relationships on non-spatiotemporal and spatial attributes are

summarized in Table 1. Spatial relationships are categorized into five groups, namely, set based,

topological, metric, directional, and other relationships. Treating spatial locations as set elements

set relationships can be extended to be applied on spatial attributes. Topological relationships are

concerned with the relationships between spatial objects that are preserved under continuous

deformations. Metric relationships are defined between spatial elements in a matric space where

the notion of distance is well defined. Directional relationships are further divided into two types,

namely, absolute and object-relative. For example, north, south and so on so forth are absolute

direction relationships which are defined in the context to a global reference system. Object-

relative directions are defined using the orientation of a given object, such as left and right. There

are also other relationships like visibility which identifies whether a spatial element is visible

from an observer point.

Table 1 Non-spatiotemporal and Spatial Relationships

Attributes Relationships

Non-spatiotemporal Arithmetic: sum, subtraction, larger than, etc.

Ordering: before, after, etc.

Set: union, subclass of, instance of, etc.

Spatial Set: union, intersection, membership, etc.

Topological: meet, within, overlap, etc.

Metric: distance, area, perimeter, etc.

Directional: above, northeastern, etc.

Others: visibility, etc.

Different spatiotemporal relationships are put forward target at different spatiotemporal data

types. For temporal snapshots of the object model data, spatiotemporal predicates (Erwig and

Schneider (2002)) are put forward for topological space, while trajectory distance (e.g., Hausdorff

distance (Chen et al. (2011))) is for metric space. Change across snapshots of field model at local,

focal and zonal levels (Zhou et al. (2014)) and cubic map algebra (Mennis et al. (2005)) are

studied on raster snapshots. For temporal snapshots of network models, studies are conducted

such as predecessor(s) and successor(s) on a Lagrangian path (Yang et al. (2014)), temporal

centrality (Habiba et al. (2007)), and network flow. Spatiotemporal coupling (i.e., within

geographic and temporal proximity) (Celik et al. (2006); Huang et al. (2008); Mohan et al.

(2012)) and spatiotemporal covariance structure (Gelfand et al. (2010)) are good examples of

relationships on events and processes model.

3 Statistical Foundations This section provides a taxonomy of common statistical concepts for different spatial and

spatiotemporal data types. Spatial statistics is a branch of statistics specialized to model events

and phenomenon that evolves within a spatial framework. Spatial and spatiotemporal statistics are

distinct from classical statistics due to the unique characteristics of space and time. One important

property of spatial data is spatial dependency, a property so fundamental that geographers have

elevated it to the status of the first law of geography: “Everything is related to everything else, but

nearby things are more related than distant things” (Tobler (1979)). Spatial dependency is also

measured using spatial autocorrelation. Dependency along the temporal dimension is also a topic

studied in spatial and spatiotemporal statistics. Other important properties include spatial

heterogeneity and temporal non-stationarity, as well as the multiple scale effect. With spatial

heterogeneity resides in the data, the distribution of spatial events and interactions between them

may vary drastically by locations. Temporal non-stationarity extends spatial heterogeneity into

the temporal dimension.

3.1 Spatial Statistics for Different Types of Spatial Data

There exists a variety of data models to represent spatial events and processes. Accordingly,

spatial statistics can be categorized according to their underlying spatial data type: Geostatistics

for point referenced data, lattice statistics for areal data, spatial point process for spatial point

patterns, and spatial network statistics.

Geostatistics: Geostatistics is concerned with point referenced data. Point-reference data contains

a set of points with fixed locations and a set of attribute values. The goal of geostatistics is to

model the distribution of the attribute values and make predictions on uncovered locations.

Spatial point referenced data have three characteristics studied by geostatistics (Cressie (2015)),

including spatial continuity (i.e., dependence across locations), weak stationarity (i.e., some

statistical properties not changing with locations) and isotropy (i.e., uniformity in all directions).

Under the assumption of weak stationarity or intrinsic stationarity, spatial dependence at various

distances can be captured by a covariance function or a semivariogram (Banerjee et al. (2014)).

Geostatistics provides a set of statistical tools, such as Kriging (Banerjee et al. (2014)) which can

be used to interpolate attributes at unsampled locations. However, real world spatial data often

shows inherent variation in measurements of a relationship over space, due to influences of

spatial context on the nature of spatial relationships. For example, human behavior can vary

intrinsically over space (e.g., differing cultures). Different jurisdictions tend to produce different

laws (e.g., speed limit differences between Minnesota and Wisconsin). This effect is called spatial

heterogeneity or non-stationarity. Special models (e.g., local space-time Kriging (Gething et al.

(2007))) can be further used to reflect the varying functions at different locations by assigning

higher weights to neighboring points to reduce effect of heterogeneity.

Lattice statistics: Lattice statistics studies statistics for spatial data in the field (or areal) model.

Here a lattice refers to a countable collection of regular or irregular cells in a spatial framework.

The range of spatial dependency among cells is reflected by a neighborhood relationship, which

can be represented by a contiguity matrix called a W-matrix. A spatial neighborhood relationship

can be defined based on spatial adjacency (e.g., rook or queen neighborhoods) or Euclidean

distance, or in more general models, cliques and hypergraphs (Warrender and Augusteijn (1999)).

Based on a W-matrix, spatial autocorrelation statistics can be defined to measure the correlation

of a non-spatial attribute across neighboring locations. Common spatial autocorrelation statistics

include Moran’s I, Getis-Ord Gi*, Geary’s C, Gamma index Γ (Cressie (2015)), etc., as well as

their local versions called local indicators of spatial association (LISA) (Anselin (1995)). Several

spatial statistical models, including the spatial autoregressive model (SAR), conditional

autoregressive model (CAR), Markov random fields (MRF), as well as other Bayesian

hierarchical models (Banerjee et al. (2014)), can be used to model lattice data. Another important

issue is the modifiable areal unit problem (MAUP) (also called the multi-scale effect) (Openshaw

and Openshaw (1984)), an effect in spatial analysis that results for the same analysis method will

change on different aggregation scales. For example, analysis using data aggregated by states will

differ from analysis using data at individual family level.

Spatial point process: Different from geostatistics, spatial point process does not concern attribute

values but the locations of points. The distribution of points is the main focus of spatial point

process. Locations of a set of points can be generated based on different statistical assumptions

(e.g., random, clustered). The most common model assumed for spatial point process is

homogeneous Poisson distribution, which is also known as complete spatial randomness (CSR).

In CSR, the total number of points follows a Poisson distribution and each point is identically and

independently distributed in a pre-defined spatial domain. A variant of CSR is binomial point

process, in which the only difference is a fixed total number of points. In many application

domains, CSR or binomial point process is not an appropriate assumption since points may have

spatial autocorrelation or inhibition characteristics. In such cases, other specialized models should

be applied to better approximate the exact distribution. For spatial inhibition, Poisson hardcore

process is widely used to generate a distribution that enforces mutual repulsion among points. For

spatial autocorrelation, Matern cluster process can be chosen to reflect the clustering

characteristics. Similar cluster processes include Poisson cluster process, Cox cluster process,

Neyman-Scott process, etc. One of the most well-known applications of spatial point process is

spatial scan statistics (Kulldorff (1997)) in hotspot detection. In spatial scan statistics, chancy

hotspots occurred by randomness are removed through statistical significance test under a null

hypothesis based on CSR. Similarly, Ripley’s K function, which estimates the overall clustering

degree of a point distribution, also applies CSR as a base model in its null hypothesis (Ripley

(1976)). Although CSR is still the most popularly used spatial point process for random point

distribution, other models (e.g., Possion hardcore process, Matern cluster process) start being

adapted in data mining problems (e.g., colocation and segregation) for more robust and plausible

assumptions.

Spatial point processes: A spatial point process is a model for the spatial distribution of the points

in a point pattern. It differs from point reference data in that the random variables are locations.

Examples include positions of trees in a forest and locations of bird habitats in a wetland. One

basic type of point process is a homogeneous spatial Poisson point process (also called complete

spatial randomness, or CSR) (Gelfand et al. (2010)), where point locations are mutually

independent with the same intensity over space. However, real world spatial point processes often

show either spatial aggregation (clustering) or spatial inhibition instead of complete spatial

independence as in CSR. Spatial statistics such as Ripley’s K function (Marcon and Puech

(2009); Ripley (1976)), i.e., the average number of points within a certain distance of a given

point over the total average intensity, can be used to test a point pattern against CSR. Moreover,

real world spatial point processes such as crime events often contain hotspot areas instead of

following homogeneous intensity across space. A spatial scan statistic (Kulldorff (1997)) can be

used to detect these hotspot patterns. It tests if the intensity of points inside a scanning window is

significantly higher (or lower) than outside. Though both the K-function and spatial scan statistics

have the same null hypothesis of CSR, their alternative hypotheses are quite different: the K-

function tests if points exhibit spatial aggregation or inhibition instead of independence, while

spatial scan statistics assume that points are independent and test if a hotspot with much higher

intensity exists. Finally, there are other spatial point processes such as the Cox process, in which

the intensity function itself is a random function over space, as well as a cluster process, which

extends a basic point process with a small cluster centered on each original point (Gelfand et al.

(2010)). For extended spatial objects such as lines and polygons, spatial point processes can be

generalized to line processes and flat processes in stochastic geometry (Chiu et al. (2013)).

Spatial network statistics: Most spatial statistics research focuses on the Euclidean space. Spatial

statistics on the network space is much less studied. Spatial network space, e.g., river networks

and street networks, is important in applications of environmental science and public safety

analysis. However, it poses unique challenges including directionality and anisotropy of spatial

dependency, connectivity, as well as high computational cost. Statistical properties of random

fields on a network are summarized in (Guyon (1995)). Recently, several spatial statistics, such as

spatial autocorrelation, K-function, and Kriging, have been generalized to spatial networks

(Okabe et al. (1995), (2006); Okabe and Sugihara (2012)). Little research has been done on

spatiotemporal statistics on the network space.

3.2 Spatiotemporal Statistics

Spatiotemporal statistics is a merge of spatial statistics and temporal statistics (Cressie and Wikle

(2015); Gelfand et al. (2010)). Similar to spatial data, temporal data also preserves intrinsic

properties such as autocorrelation and heterogeneity. By incorporating the additional dimension

of time, majority of the models in spatial statistics can be smoothly transformed into

spatiotemporal statistics. This section summarizes common statistics for different spatiotemporal

data types, including spatial time series, spatiotemporal point process, and time series of lattice

(areal) data.

Spatial time series: Spatial statistics for point reference data have been generalized for

spatiotemporal data (Kyriakidis and Journel (1999)). Examples include spatiotemporal

stationarity, spatiotemporal covariance, spatiotemporal variograms, and spatiotemporal Kriging

(Cressie and Wikle (2015); Gelfand et al. (2010)). There is also temporal autocorrelation and tele-

coupling (high correlation across spatial time series at a long distance). Methods to model

spatiotemporal process include physics inspired models (e.g., stochastically differential

equations) (Cressie and Wikle (2015); Gelfand et al. (2010)) and hierarchical dynamic

spatiotemporal models (e.g., Kalman filtering) for data assimilation (Cressie and Wikle (2015);

Gelfand et al. (2010)).

Spatiotemporal point process: A spatiotemporal point process generalizes the spatial point

process by incorporating the factor of time. As with spatial point processes, there is a

spatiotemporal Poisson process, Cox process, and cluster process. There are also corresponding

statistical tests including a spatiotemporal K function and spatiotemporal scan statistics (Cressie

and Wikle (2015); Gelfand et al. (2010)).

Time series of lattice (areal) data: Similar to lattice statistics, there is spatial and temporal

autocorrelation, a SpatioTemporal Autoregressive Regression (STAR) model (Cressie and Wikle

(2015); Gelfand et al. (2010)), and Bayesian hierarchical models (Banerjee et al. (2014)). Other

spatiotemporal statistics include empirical orthogonal functions (EOF) analysis (principle

component analysis in geophysics), canonical-correlation analysis (CCA), and dynamic

spatiotemporal models (Kalman filter) for data assimilation (Cressie and Wikle (2015); Gelfand

et al. (2010)).

4 Output Pattern Families 4.1 Spatial and Spatiotemporal Anomaly Detection

Detecting spatial and spatiotemporal anomalies is useful in many applications including

transportation, ecology, homeland security, public health, climatology, and location-based

services. For example, spatiotemporal anomaly detection can be used to detect anomalous traffic

patterns from sensor observations on a highway road network. In this section we will review

techniques for spatial and spatiotemporal anomaly detection.

4.1.1 Problem definition

In data mining, anomaly (outlier) have been informally defined as the identification of items,

events or observations which do not conform to an expected pattern or be inconsistent with the

remainder of the dataset (Bamnett and Lewis (1994); Chandola et al. (2009)). Contrarily, a spatial

anomaly (Shekhar and Chawla (2003)) is a spatially referenced object whose non-spatial attribute

values are significantly different from the values of its neighborhood. The detection of anomalies

can lead to the discovery of useful knowledge and has a number of practical applications. For

instance, a road intersection where vehicle speed is much higher than the intersections around it

in rush hour is a spatial anomaly based on the non-spatial attribute vehicle speed. Similarly, a

spatiotemporal anomaly generalizes spatial anomalies by substituting a spatial neighborhood with

a spatiotemporal neighborhood. Figure 4 compares global anomalies versus spatial anomalies.

Point G in Figure 4(a) is considered as a global anomaly, but point S in Figure 4(a) is not,

according to the histogram in Figure 4(b). However, when compared with its spatial neighbors P

and Q, point S is significantly different and thus is a spatial anomaly.

(a) (b)

Figure 8 Global anomalies (e.g., point G) versus spatial anomalies (e.g., point S).

4.1.2 Statistical foundation

The spatial statistics for spatial anomaly detection is of two kinds of bi-partite multidimensional

tests: graphical tests, including variogram clouds (Haslett et al. (1991)) and Moran scatterplots

(Anselin (1995)), and quantitative tests, including scatterplot (Anselin (1994)) and neighborhood

spatial statistics (Shekhar and Chawla (2003)). Variogram is defined as the variance of the

difference between field values at two locations. Figure 9(a) shows a variogram cloud with

neighboring pairs. As can be seen, two pairs (P,S) and (Q,S) on the left hand side lie above the

main group of pairs and are possibly related to spatial anomalies. Moran’s scatterplot is inspired

by Moran’s I index. Its x axis is Z score and y axis is weighted average of neighbors’ Z score, so

the slope of the regression line is Moran’s I index. As shown in Figure 9(b) the upper left and

lower right quadrants indicate spatial neighbors of dissimilar values: low values surrounded by

high value neighbors (e.g., points P and Q), and high values surrounded by low values (e.g,. point

S). Scatterplot (Figure 9(c)) fits a regression line between attribute values and a neighborhood

average of attribute values, and calculates residues of true points from the regression line. Spatial

anomalies, e.g., point S are detected when they have high absolute residual values. Finally, as

shown in Figure 9(d), spatial statistic S(x) measures the extent of discontinuity between the non-

spatial attribute value at location x and the average attribute value of x’s neighbors. A Z-test of

S(x) can be used to detection spatial anomalies (e.g., point S). Median-based Z-test and iterative

Z-test are also studied as alternatives of Z-test. As long as spatiotemporal neighborhoods are

well-defined the spatial statistics for spatial anomaly detection is also applicable to

spatiotemporal anomalies.

0 2 4 6 8 10 12 14 16 18 20 0

1

2

3

4

5

6

7

8

← S

P →

Q →

D ↑

Data point Fitting curve G

L

0 2 4 6 8 10 0

1

2

3

4

5

6

7

8

9

← → + 2

(a) Variogram cloud (b) Moran’s scatterplot

(c)Scatterplot (d)Spatial statistics Z(s) test

Figure 9 Spatial statistics for spatial anomaly detection (input data are the same as in Figure 8(a)).

4.1.3 Spatial anomaly detection approaches

Visualization approaches utilize plots to visualize outlying objects. The common methods include

variogram clouds and Moran scatterplot as introduced earlier.

Quantitative approaches can be classified further into distribution-based and distance-based

methods. In distribution-based methods, data are modelled based on a distribution function (e.g.,

Normal and Poisson distribution), and the final objects are characterized as outliers depending on

the initial statistical hypothesis that controls the model. Distance-based approaches are commonly

used when dealing with points, lines, polygons, locations and the spatial neighborhood. A spatial

statistic is computed as the difference between the non-spatial attribute of the current location and

that of the neighborhood aggregate (Shekhar, Lu, et al. (2003)). Spatial neighborhoods can be

identified by distances on spatial attributes (e.g., K nearest neighbors), or by graph connectivity

2

2.5

3

3.5

4

4.5

5

5.5

6

6.5

7

← S

P →

Q →

0

1

2

3

4

S →

P → ← Q

(e.g., locations on road networks). This research has been extended in a number of ways to allow

for multiple non-spatial attributes (Chen et al. (2008)), average and median attribute value (Lu et

al. (2003a)), weighted spatial outliers (Kou et al. (2006)), categorical spatial outlier (Liu et al.

(2014)), local spatial outliers (Schubert et al. (2014)), and fast detection (Wu et al. (2010)).

4.1.4 Spatiotemporal anomaly detection approaches

The intuition behind spatiotemporal outlier detection is that they reflect “discontinuity” on non-

spatiotemporal attributes within a spatiotemporal neighborhood. Approaches can be summarized

according to the input data types.

Outliers in spatial time series: For spatial time series (on point reference data, raster data, as well

as graph data), basic spatial outlier detection methods, such as visualization based approaches and

neighborhood based approaches, can be generalized with a definition of spatiotemporal

neighborhoods. After a spatiotemporal neighborhood is defined, and a spatiotemporal statistic

measure is computed as the difference between the non-spatial attribute of the current location

and that of the neighborhood aggregate (Chen et al. (2008); Lu et al. (2003b); McGuire et al.

(2014); Shekhar, Lu, et al. (2003)).

Flow Anomalies: Given a set of observations across multiple spatial locations on a spatial

network flow, flow anomaly discovery aims to identify dominant time intervals where the

fraction of time instants of significantly mismatched sensor readings exceeds the given

percentage-threshold. Flow anomaly discovery can be considered as detecting discontinuities or

inconsistencies of a non-spatiotemporal attribute within a neighborhood defined by the flow

between nodes, and such discontinuities are persistent over a period of time. A time-scalable

technique called SWEET (Smart Window Enumeration and Evaluation of persistent-Thresholds)

was proposed (Elfeky et al. (2006); Franke and Gertz (2008); Kang et al. (2008)) that utilizes

several algebraic properties in the flow anomaly problem to discover these patterns efficiently. To

account for flow anomalies across multiple locations, recent work (Kang et al. (2009)) defines a

teleconnected flow anomaly pattern and proposes a RAD (Relationship Analysis of Dynamic-

neighborhoods) technique to efficiently identify this pattern.

Anomalous moving object trajectories: Detecting spatiotemporal outliers from moving object

trajectories is challenging due to the high dimensionality of trajectories and the dynamic nature.

A context-aware stochastic model has been proposed to detect anomalous moving pattern in

indoor device trajectories (Liu et al. (2012)). Another spatial deviations (distance) based method

has been proposed for anomaly monitoring over moving object trajectory stream (Bu et al.

(2009)). In this case, anomalies are defined as rare patterns with big spatial deviations from

normal trajectories in a certain temporal interval. A supervised approach called Motion-Alert has

also been proposed to detect anomaly in massive moving objects (Li et al. (2006)). This approach

first extracts motif features from moving object trajectories, then clusters the features, and learns

a supervised model to classify whether a trajectory is an anomaly. Other techniques have been

proposed to detect anomalous driving patterns from taxi GPS trajectories (Chen et al. (2011); Ge

et al. (2011); Zhang et al. (2011)).

4.2 Spatial and Spatiotemporal Associations, Tele-connections

Spatial and spatiotemporal coupling patterns represent spatiotemporal object types whose

instances often occur in close geographic and temporal proximity. Discovering various patterns of

spatial and spatiotemporal coupling and tele-coupling is important in applications related to

ecology, environmental science, public safety, and climate science. For example, identifying

spatiotemporal cascade patterns from crime event datasets can help police department to

understand crime generators in a city, and thus take effective measures to reduce crime events. In

this section, we will review techniques for identifying spatial and spatiotemporal association as

well as tele-connections. The section starts with the basic spatial association (or co-location)

pattern, and moves on to spatiotemporal association (i.e., spatiotemporal co-occurrence, cascade,

and sequential patterns) as well as spatiotemporal teleconnection.


Spatial association, also known as spatial co-location patterns (Huang et al. (2004)), represent

subsets of spatial event types whose instances are often located in close geographic proximity.

Real-world examples include symbiotic species, e.g., the Nile Crocodile and Egyptian Plover in

ecology. Similarly, spatiotemporal association patterns represent spatiotemporal object types

whose instances often occur in close geographic and temporal proximity. Figure 10 shows an

example of spatial co-location patterns. In this dataset, there exist instances of several Boolean

spatial features, each represented by a distinct shape. Spatial features in sets {′+′,′×′} and

{′𝑜′,′∗ ′} tend to be located together. A careful review reveals two co-location patterns, that is

{′+′,′×′} and {′𝑜′,′∗ ′}.

Figure 10 Illustration of Point Spatial Co-location Patterns. Shapes represent different spatial

feature types. Spatial features in sets {‘+’, ‘×’} and {‘o’, ‘*’} tend to be located together

Mining patterns of spatial and spatiotemporal association is challenging due to the following

reasons: first, there is no explicit transaction in continuous space and time; second, there is

potential for over-counting; third, the number of candidate patterns is exponential, and a trade-off

between statistical rigor of output patterns and computational efficiency has to be made.


The underlying statistic for spatial and spatiotemporal coupling patterns is the cross-K function,

which generalizes the basic Ripley’s K function (introduced in Section 3) for multiple event

types.

4.2.3 Spatial association detection approaches

Mining spatial co-location patterns can be done via two categories of approaches: approaches that

use spatial statistics and algorithms that use association rule mining kind of primitives. Spatial

statistics based approaches utilize statistical measures such as cross-K function with Monte Carlo

simulation (Cressie (2015)), mean nearest neighbor distance, and spatial regression model (Chou

(1997)). However, these approaches are computationally expensive due to the exponential

number of candidate patterns. Contrarily, association rule-ruled approaches focus on the creation

of transactions over space so that a priori like algorithm can be used. Within this category, there

are transaction based approaches and distance based approaches. A transaction based approach

defines transactions over space (e.g., around instances of a reference feature) and then uses an

Apriori-like algorithm (Koperski and Han (1995)). However, in the spatial co-location pattern

mining problem, transactons are often not explicit. Force fitting the notion of transaction in a

continuous spatial framework will lead to loss of implicit spatial relationships across the

boundary of these transactions, as illustrated in Figure 11. As shown in Figure 11(a), there are

three feature types in this dataset: A, B, and C, each of which has two instances. The neighbor

relationships between instances are shown as edges. Co-locations (A, B) and (B, C) may be

considered as frequent in this example. Figure 11(b) shows transactions created by choosing C as

the reference feature. As Co-location (A, B) does not involve the reference feature, it will not be

found. Figure 11(c) shows two possible partitions for the data set of Figure 11(a), along with the

supports for co-location (A, B); in this case, the support measure is order sensitive and may also

miss the Co-location (A, B). A distance based approach defines a distance-based patterns called k-

neighboring class sets (Morimoto (2001)) or using an event centric model (Huang et al. (2004))

based on a definition of participation index, which is an upper bound of cross-K function statistics

and has an anti-monotone property. Recently, approaches have been proposed to identify

colocations for extended spatial objects (Xiong et al. (2004)) or rare events (Huang, Pei, et al.

(2006)), regional colocation patterns (Ding et al. (2011); Wang et al. (2013)) (i.e., pattern is

significant only in a subregion), statistically significant colocation (Barua and Sander (2014)), as

well as fast algorithms (Yoo and Shekhar (2006)).

Figure 11. Example to illustrate different approaches to discovering co-location patterns: (a)

Example data set. (b) Reference feature-centric model. (c) Data partition approach. Support

measure is ill-defined and order sensitive. (d) Event-centric model.

C2 C2 C2 C2

Example dataset, neighboring instances of different features are connected

B2

A1

A2

C1 B1

Transactions{{B1},{B2}} support(A,B) = φ

B2

Reference feature = C

( b )

A1

A2

C1 B1 B1

Support for (a,b) is order sensitive ( c )

2 B 2 B

( a )

B1 A1 C1

A2

A1

A2

C1

Support(A,B) = 1 Support(A,B) = 2

4.2.4 Spatiotemporal association, tele-connection detection

approaches

Spatiotemporal coupling patterns can be categorized according to whether there exists temporal

ordering of object types: spatiotemporal (mixed drove) co-occurrences (Celik et al. (2008)) are

used for unordered patterns, spatiotemporal cascades (Mohan et al. (2012)) for partially ordered

patterns, and spatiotemporal sequential patterns (Huang et al. (2008)) for totally ordered patterns.

Spatiotemporal tele-connection (Zhang et al. (2003a)) is the pattern of significantly positive or

negative temporal correlation between spatial time series data at a great distance. The following

subsections categorizes common computational approaches for discovering spatiotemporal

couplings by different input data types.

Mixed Drove Spatiotemporal Co-Occurrence Patterns represent subsets of two or more different

object-types whose instances are often located in spatial and temporal proximity. Discovering

MDCOPs is potentially useful in identifying tactics in battlefields and games, understanding

predator-prey interactions, and in transportation (road and network) planning (Güting and

Schneider (2005); Koubarakis et al. (2003)). However, mining MDCOPs is computationally very

expensive because the interest measures are computationally complex, datasets are larger due to

the archival history, and the set of candidate patterns is exponential in the number of object-types.

Recent work has produced a monotonic composite interest measure for discovering MDCOPs and

novel MDCOP mining algorithms are presented in (Celik et al. (2006), (2008)). A filter-and-

refine approach has also been proposed to identify spatiotemporal co-occurrence on extended

spatial objects (Pillai et al. (2013)).

A spatiotemporal sequential pattern is a sequence of spatiotemporal event types in the form of f1

→ f2 → ... → fk. It represents a “chain reaction” from event type f1 to event type f2 and then to

event type f3 until it reaches event type fk. A spatiotemporal sequential pattern differs from a

colocation pattern in that it has a total order of event types. Such patterns are important in

applications such as epidemiology where some disease transmission may follow paths between

several species through spatial contacts. Mining spatiotemporal sequential patterns is challenging

due to the lack of statistically meaningful measures as well as high computation cost. A measure

of sequence index, which can be interpreted by K-function statistics, was proposed in (Huang et

al. (2008); Huang, Zhang, et al. (2006)), together with computationally efficient algorithms. Other

works have investigated spatiotemporal sequential patterns from data other than spatiotemporal

events, such as moving object trajectories (Cao et al. (2005); Li et al. (2013); Verhein (2009)).

Cascading spatio-temporal patterns: Partially ordered subsets of event-types whose instances are

located together and occur in stages are called cascading spatio-temporal patterns (CSTP). In the

domain of public safety, events such as bar closings and football games are considered generators

of crime. Preliminary analysis revealed that football games and bar closing events do indeed

generate CSTPs. CSTP discovery can play an important role in disaster planning, climate change

science (Frelich and Reich (2010); Mahoney et al. (2003)) (e.g., understanding the effects of

climate change and global warming) and public health (e.g., tracking the emergence, spread and

re-emergence of multiple infectious diseases (Morens et al. (2004))). A statistically meaningful

metric was proposed to quantify interestingness and computational pruning strategies were

proposed to make the pattern discovery process more computationally efficient (Huang et al.

(2008); Mohan et al. (2012)).

Spatial time series and tele-connection: Given a collection of spatial time series at different

locations, teleconnection discovery aims to identify pairs of spatial time series whose correlation

is above a given threshold. Tele-connection patterns are important in understanding oscillations in

climate science. Computational challenges arise from the length of the time series and the large

number of candidate pairs and the length of time series. An efficient index structure, called a

cone-tree, as well as a filter and refine approach (Zhang et al. (2003a), (2003b)) have been

proposed which utilize spatial autocorrelation of nearby spatial time series to filter out redundant

pair-wise correlation computation. Another challenge is spurious “high correlation” pairs of

locations that happen by chance. Recently, statistical significant tests have been proposed to

identify statistically significant tele-connection patterns called dipoles from climate data (Kawale

et al. (2012)). The approach uses a “wild bootstrap” to capture the spatio-temporal dependencies,

and takes account of the spatial autocorrelation, the seasonality and the trend in the time series

over a period of time.

4.3 Spatial and Spatiotemporal Prediction

Spatial and spatiotemporal prediction is widely used to classify remote sensing images into

dynamic land cover maps and detect changes. Climate scientists use collections of climate

variables to project future trends in global or regional temperature. In this section, we will review

the definition of spatial and spatiotemporal prediction, followed by computational approaches

organized according to their input data types.


Given spatial and spatiotemporal data items, with a set of explanatory variables (also called

explanatory attributes or features) and a dependent variable (also called target variables), the

spatial and spatiotemporal prediction problem aims to learn a model that can predict the

dependent variable from the explanatory variables. When the dependent variable is discrete, the

problem is called spatial and spatiotemporal classification. When the dependent variable is

continuous, the problem is spatial and spatiotemporal regression. One example of spatial and

spatiotemporal classification problem is remote sensing image classification over temporal

snapshots (Almeida et al. (2007)), where the explanatory variables consists of various spectral

bands or channels (e.g., blue, green, red, infra-red, thermal, etc.) and the dependent variable is a

thematic class such as forest, urban, water, and agriculture. Examples of spatial and

spatiotemporal regression include yearly crop yield prediction (Little et al. (2008)), and daily

temperature prediction at different locations. Regression can also be used in inverse estimation,

that is, given that we have an observed value of y, we want to determine the corresponding x

value.

The unique challenges of spatial and spatiotemporal prediction come from the special

characteristics of spatial and spatiotemporal data, which include spatial and temporal

autocorrelation, spatial heterogeneity and temporal non-stationarity, as well as the multi-scale

effect. These unique characteristics violate the common assumption in many traditional prediction

techniques that samples follow an identical and independent distribution (i.i.d.). Simply applying

traditional prediction techniques without incorporating these unique characteristics may produce

hypotheses or models that are inaccurate or inconsistent with the data set. Additionally, a subtle

but equally important reason is related to the choice of the objective function to measure

classification accuracy. For a two-class problem, the standard way to measure classification

accuracy is to calculate the percentage of correctly classified objects. However, this measure may

not be the most suitable in a spatial context. This is because the measure of Spatial accuracy—

how far the predictions are from the actuals—is important in some applications such as ecology

due to the effects of the discretization of a continuous wetland into discrete pixels, as shown in

Figure 12. Figure 12(a) shows the actual locations of nests and (b) shows the pixels with actual

nests. Note the loss of information during the discretization of continuous space into pixels. Many

nest locations barely fall within the pixels labeled ‘A’ and are quite close to other blank pixels,

which represent ‘no-nest’. Now consider two predictions shown in Figure 12(c) and (d). Domain

scientists prefer prediction Figure 12(d) over (c), as the predicted nest locations are closer on

average to some actual nest locations. The classification accuracy measure cannot distinguish

between Figure 12(c) and (d), and a measure of spatial accuracy is needed to capture this

preference.

Figure 12. The choice of objective function and spatial accuracy.

4.3.2 Statistical foundations

Spatial and spatiotemporal prediction techniques are developed based on spatial and

spatiotemporal statistics, including spatial and temporal autocorrelation, spatial heterogeneity,

temporal non-stationarity, and multiple areal unit problems (MAUP) (see Section 3).

4.3.3 Spatial prediction approaches

Several previous studies (Jhung and Swain (1996); Solberg et al. (1996)) have shown that the

modeling of spatial dependency (often called context) during the classification or regression

process improves overall accuracy. Spatial context can be defined by the relationships between

spatially adjacent spatial units in a small neighborhood. Three supervised learning techniques for

classification and regression that model spatial dependency are: (1) Markov random field (MRF)

based classifiers; (2) spatial autoregressive regression (SAR) model; and (3) spatial decision tree.

Markov random field-based Bayesian classifiers: Maximum likelihood classification (MLC) is

one of the most widely used parametric and supervised classification technique in the field of

remote sensing (Hixson et al. (1980); Strahler (1980)). However, MLC is a per-pixel based

classifier and assumes that samples are i.i.d. Ignoring spatial autocorrelation results in salt and

pepper kind of noise in the classified images. One solution is to use MRF-based Bayesian

classifiers (Li (2009)) to model spatial context via the a priori term in Bayes’ rule. This uses a

set of random variables whose interdependency relationship is represented by an undirected

graph (i.e., a symmetric neighborhood matrix).

Spatial autoregressive regression model (SAR): SAR is one of the commonly used

autoregressive models for spatial data regression (Shekhar and Xiong (2007)). In the spatial

autoregressive regression model, the spatial dependencies of the error term, or, the dependent

P

A P P A P A

P P

A A A A A A

Legend

= Nest location A = Actual nest in pixel P = Predicted nest in pixel

(a) (b) (c) (d)

(a) The actual locations of nests. (b) Pixels with actual nests. (c) Location predicted by a model. (d) Location predicted by another

variable, are directly modeled in the regression equation using contiguity matrix (Anselin

(2013)). An example of contiguity matrix is shown in Figure 13. Based on this model, the

spatial autoregressive regression can be written as:

𝑌 = 𝜌𝑊𝑌 + 𝑋𝛽 + 𝜖,

where

𝑌 Observation or dependent variable,

𝑋 Independent variables,

𝜌 Spatial autoregressive parameter that reflects the strength of the spatial dependencies

between the elements of the dependent variable via the logistic function for binary dependent

variables,

𝑊 Contiguity matrix,

𝛽 Regressive coefficient,

𝜖 Unobservable error term.

When 𝜌 = 0, this model is reduced to the ordinary linear regression equation.

(a) Spatial framework

A B C D

A 0 1 1 0

B 1 0 0 1

C 1 0 0 1

D 0 1 1 0

(b) Neighbor relationship

A B C D

A 0 0.5 0.5 0

B 0.5 0 0 0.5

C 0.5 0 0 0.5

D 0 0.5 0.5 0

(c) Contiguity matrix

Figure 13. A spatial framework and its four-neighborhood contiguity matrix.

Spatial decision trees: Decision tree classifiers have been widely used for image classification

(e.g., remote sensing image classification) but they assume an i.i.d. distribution and produce

significant salt-and-pepper noise. Several works propose spatial feature selection heuristics such

as spatial entropy or spatial information gain to select a tree node test that favors spatial

autocorrelation structure while splitting training samples (Li and Claramunt (2006); Stojanova et

al. (2013)). Recently, a focal-test-based spatial decision tree is proposed (Jiang et al. (2013),

(2015)), in which the tree traversal direction of a sample is based on both local and focal

(neighborhood) information. Focal information is measured by local spatial autocorrelation

statistics in a tree node test.

One limitation of the spatial autoregressive model (SAR) is that, it does not account for the

underlying spatial heterogeneity that is natural in geographic spaces. Thus, in a SAR model,

coefficients β of covariates and the model errors are assumed to be uniform throughout the entire

geographic space. One proposed method to account for spatial variation in model parameters and

errors is Geographically Weighted Regression (Fotheringham et al. (2003)).

The regression equation of GWR is 𝑦 = 𝑥𝛽(𝑠) + 𝜖(𝑠), where 𝛽(𝑠) and 𝜖(𝑠) represent the

spatially parameters and the errors, respectively. GWR has the same structure as standard linear

regression, with the exception that the parameters are spatially varying. It also assumes that

A B

C D

samples at nearby locations have higher influence on the parameter estimation of a current

location.

4.3.4 Spatiotemporal prediction approaches

Spatiotemporal Autoregressive Regression (STAR): SpatioTemporal Autoregressive Regression

(STAR) extends SAR by further explicitly modeling the temporal and spatiotemporal dependency

across variables at different locations. More details can be found in (Cressie (2015)).

Spatiotemporal Kriging: Kriging (Cressie (2015)) is a Geostatistic technique to make predictions

at locations where observations are unknown, based on locations where observations are known.

In other words, Kriging is a spatial “interpolation” model. Spatial dependency is captured by the

spatial covariance matrix, which can be estimated through spatial variograms. Spatiotemporal

Kriging (Cressie and Wikle (2015)) generalizes spatial kriging with a spatiotemporal covariance

matrix and variograms. It can be used to make predictions from incomplete and noise

spatiotemporal data.

Hierarchical Dynamic Spatiotemporal Models: Hierarchical dynamic spatiotemporal models

(DSMs) (Cressie and Wikle (2015)), as the name suggests, aim to model spatiotemporal

processes dynamically with a Bayesian hierarchical framework. On the top is a data model, which

represents the conditional dependency of (actual or potential) observations on the underlying

hidden process with latent variables. In the middle is a process model, which captures the

spatiotemporal dependency with the process model. On the bottom is a parameter model, which

captures the prior distributions of model parameters. DSMs have been widely used in climate

science and environment science, e.g., for simulating population growth or atmospheric and

oceanic processes. For model inference, Kalman filter can be used under the assumption of linear

and Gaussian models.

4.4 Spatial and Spatiotemporal Partitioning (Clustering) and

Summarization

Spatial and spatiotemporal partitioning and summarization are important in many societal

applications such as public safety, public health, and environmental science. For example,

partitioning and summarizing crime data, which is spatial and temporal in nature, helps law

enforcement agencies find trends of crimes and effectively deploy their police resources (Levine

(2013)).


Spatial and spatiotemporal partitioning, or Spatial and spatiotemporal clustering is the process of

grouping similar spatial or spatiotemporal data items, and thus partitioning the underlying space

and time (Kisilevich et al. (2009)). It is important to note that spatial and spatiotemporal

partitioning or clustering is closely related to, but not the same as hotspot detection. Hotspots can

be considered as special clusters such that events or activities inside a cluster have much higher

intensity than outside.

Spatial and spatiotemporal summarization aims to provide a compact representation of

spatiotemporal data. For example, traffic accident events on a road network can be summarized

into several main routes that cover most of the accidents. Spatial and spatiotemporal

summarization is often done after or together with spatiotemporal partitioning so that objects in

each partition can be summarized by aggregated statistics or representative objects.

4.4.2 Spatial partitioning and summarization approaches

Data mining and Machine learning literature have explored a large number of partitioning

algorithms which can be classified into three groups.

Hierarchical clustering methods start with all patterns as a single cluster and successively perform

splitting or merging until a stopping criterion is met. This results in a tree of clusters, called

dendograms. The dendogram can be cut at different levels to yield desired clusters. Well-known

hierarchical clustering algorithms include balanced iterative reducing and clustering using

hierarchies (BIRCH), clustering using interconnectivity (Chameleon), clustering using

representatives (CURE), and robust clustering using links (ROCK) (Cervone et al. (2008);

Karypis et al. (1999); Zhang et al. (1996)).

Global partitioning based clustering algorithms start with each pattern as a single cluster and

iteratively reallocate data points to each cluster until a stopping criterion is met. These methods

tend to find clusters of spherical shape. K-Means and K-Medoids are commonly used partitional

algorithms. Squared error is the most frequently used criterion function in partitional clustering.

The recent algorithms in this category include partitioning around medoids (PAM), clustering

large applications (CLARA), clustering large applications based on randomized search

(CLARANS), and expectation-maximization (EM) (Hu et al. (2008); Ng and Han (2002)).

Density-based clustering algorithms try to find clusters based on the density of data points in a

region. These algorithms treat clusters as dense regions of objects in the data space. The density-

based clustering algorithms include density-based spatial clustering of applications with noise

(DBSCAN), ordering points to identify clustering structure (OPTICS), and density-based

clustering (DECODE) (Lai and Nguyen (2004); Ma and Zhang (2004); Neill and Moore (2004);

Pei et al. (2009)).

Spatial summarization often follows spatial partitioning. It is more difficult than non-spatial

because of its non-numeric nature. According to the clustering methods used, the objectives of

spatial summarization vary. For example, centroids or medoids may be summarization results

from K-means or K-Medoids. K main routes are identified as a summary of activities in spatial

network from a K-Main-Routes algorithm proposed by (Oliver, Shekhar, Zhou, et al. (2014)).

4.4.3 Spatiotemporal partitioning and summarization approaches

Based on the input data type, common spatiotemporal partitioning summarization approaches are

as follows.

Spatiotemporal Event Partitioning: Some previously mentioned clustering algorithms for two-

dimensional space can be easily generalized to spatiotemporal scenarios. For example, ST-

DBSCAN (Birant and Kut (2007)) is a spatiotemporal extension of its spatial version called

DBSCAN (Ankerst et al. (1999); Ester et al. (1996)), and ST-GRID (Wang et al. (2006)) splits

the space and time into 3D cells and merges dense cells together into clusters.

Spatial Time-Series Partitioning: Spatial time series partitioning aims to divide the space into

regions such that the correlation or similarity between time series within the same regions is

maximized. Global partitioning based clustering algorithms such as K Means, K Medoids, and

EM can be applied. So can the hierarchical approaches. However, due to the high dimensionality

of spatial time series, density-based approaches are often not effective. When computing

similarity between spatial time series, a filter-and-refine approach (Zhang et al. (2003a)) can be

used to avoid redundant computation.

Trajectory Data Partitioning: Trajectory data partitioning aims to partition trajectories into groups

according to their similarity. Algorithms are of two types, i.e., density-based and frequency-

based. The density-based approaches (Lee et al. (2007)) first break trajectories into small

segments and apply density-based clustering algorithms similar to DBSCAN to connect dense

areas of segments. The frequency based approach (Lee et al. (2009)) uses association rule mining

(Agrawal et al. (1994)) algorithms to identify subsections of trajectories which have high

frequencies (also called high “support”).

Data summarization aims to find compact representation of a data set (Chandola and Kumar

(2007)), which can be conducted on classical data, spatial data, as well as spatiotemporal data. It

is important for data compression as well as for making pattern analysis more convenient. For

spatial time series data, summarization can be done by removing spatial and temporal redundancy

due to the effect of autocorrelation. A family of such algorithms has been used to summarize

traffic data streams (Pan et al. (2010)). Similarly, the centroids from K Means can also be used to

summarize spatial time series. For trajectory data, especially spatial network trajectories,

summarization is more challenging due to the huge cost of similarity computation. A recent

approach summarizes network trajectories into k primary corridors (Evans et al. (2012), (2013)).

The work proposes efficient algorithms to reduce the huge cost for network trajectory distance

computation.

4.5 Spatial and Spatiotemporal Hotspot Detection

The problem of detecting spatial and spatiotemporal hotspot has received a lot of attention and

being widely studied and applied. For example, in epidemiology finding disease hotspots allows

officials to detect an epidemic and allocate resources to limit its spread (Kulldorff (2015)).


Given a set of spatial and spatiotemporal objects (e.g., activity locations and time) in a study area,

spatial and spatiotemporal hotspots are regions together certain time intervals where the number

of objects is anomalously or unexpectedly high within the time intervals. Spatial and

spatiotemporal hotspots are a special kind of clustered pattern whose inside has significantly

higher intensity than outside.

The challenge of spatial and spatiotemporal hotspot detection is brought about by the uncertainty

of the number, location, size, and shape of hotspots in a dataset. Additionally, ‘false’ hotspots that

aggregate events only by chance should often be avoided, since these false hotspots impede

proper response by authorities and waste police resources. Thus, it is often important to test the

statistical significance of candidate spatial or spatiotemporal hotspots.


Spatial scan statistics (Kulldorff (1997)) are used to detect statistically significant hotspots from

spatial datasets. It uses a circle to scan the space-time for candidate hotspots and perform

hypothesis testing. The null hypothesis states that the activity points are distributed randomly

obeying a homogeneous (i.e., same intensity) Poisson process over the geographical space. The

alternative hypothesis states that the inside of the circle has higher intensity of activities than

outside. A test statistic called the log likelihood ratio is computed for each candidate hotspot (or

circle) and the candidate with the highest likelihood ratio can be evaluated using a significance

value (i.e., p-value). Spatiotemporal scan statistics extends spatial scan statistics by considering

an additional temporal dimension, and changes the scanning window to a cylinder.

Besides, local indicators of spatial association, including Moran’s I, Geary’s C or Ord Gi and Gi*

functions (Anselin (1995)), are also used to detect hotspots. Contrary to spatial autocorrelation,

these functions are computed within the neighborhood of a location. For example, the local

Moran’s I statistic is given as:

𝐼𝑖 =𝑥𝑖 − �̅�

𝑆𝑖2 ∑ 𝑤𝑖,𝑗(𝑥𝑗 − �̅�)

𝑛

𝑗=1,𝑗≠𝑖

where 𝑥𝑖 is an attribute for object 𝑖, �̅� is the mean of the corresponding attribute, 𝑤𝑖,𝑗 is the

spatial weight between object 𝑖 and 𝑗, and 𝑆𝑖2 =

∑ (𝑥𝑗−�̅�)2𝑛

𝑗=1,𝑗≠𝑖

𝑛−1 with 𝑛 equating to the total number

of objects. A positive value for local Moran’s I indicates that an object has neighboring objects

with similarly high or low attribute values, so it may be a part of a cluster, while a negative value

indicates that an object has neighboring objects with dissimilar values, so it may be an outlier. In

either instance, the p-value for the object must be small enough for the cluster or outlier to be

considered statistically significant.

4.5.3 Spatial hotspot detection approaches

Generally, there are two categories of algorithms for spatial hotspot detection, namely, clustering

based approaches and spatial scan statistics based approaches.

4.5.3.1 Clustering based approaches

Clustering methods can be used to identify candidate areas for a further evaluation of

spatiotemporal hotspots. These methods include global partitioning based, density-based

clustering and hierarchical clustering (see Section 4.4). These methods can be used as a

preprocessing step to generate candidate hotspot areas, and statistical tools may be used to test

statistical significance. CrimeStat, a software package for spatial analysis of crime locations,

incorporates several clustering methods to determine the crime hotspots in a study area.

CrimeStat package has k-means tool, nearest neighbor hierarchical (NNH) clustering (Jain et al.

(1999)), Risk Adjusted NNH (RANNH) tool, STAC Hot Spot Area tool (Levine (2013)), and a

Local Indicator of Spatial Association (LISA) tool (Anselin (1995)) that are used to evaluate

potential hotspot areas. In addition to point process data, hotspot detection from trajectories is

also studied to detect network paths with high density (Lee et al. (2007)) or frequency of

trajectories (Lee et al. (2009)).

4.5.3.2 Spatial (spatiotemporal) Scan Statistics based approaches

The spatial scan statistic extends the original scan statistic which focuses on one-dimensional

point process to allow for a multi-dimensional point process. The spatial scan statistic also

extends the scan statistic by allowing the scanning window to vary and the baseline process to be

any inhomogeneous Poisson process or Bernoulli process (Kulldorff (1997)). When the spatial

scan statistic was proposed, it focused on finding circular hotspots in two-dimensional Euclidean

space with statistical significance test. It exhaustively enumerates all circles defined by an activity

point as the center and another activity point on the circumference. For each circle, an interest

measure named as log likelihood ratio is computed to indicate how likely the circle is statistically

significant. The final step is to evaluate the p-value of each interest measure using Monte Carlo

simulation because the distribution of log likelihood ratio is unknown beforehand. The p-value

indicate the confidence of the circle being true hotspot area so as to eliminate hotspots generated

only by chance. The null hypothesis of the MC simulation is that the data points following

complete spatial randomness (CSR), in other words, the baseline process is homogeneous Poisson

process.

Figure 14 Ring-shape hotspot detection

Ever since the introduction of the spatial scan statistic, lots of research are conducted targeting at

different shapes of hotspots. Because of the various shapes of the hotspots, each algorithm

focuses on reducing the computational complexity in order to make the algorithms scalable for

large datasets. Neill and Moore (2004) proposes an algorithm for rectangular hotspots detection.

Based on the property of log likelihood ratio, which is the interest measure, and rectangle, it

applies a divide-and-conquer method that speeds up the computation significantly. In addition, if

some errors are tolerated, a heuristic can be used to accelerate the detection even further.

According to the fact that hotspots may not always be with solid centers, Ring-shaped hotspot

detection is put forward to find such hotspots that are formed by two concentric circles

(Eftelioglu et al. (2014)). The hotspots in this shape can be found in environmental criminology.

Typically, criminals do not commit crimes close to their homes but also do not commit crimes too

far because of the transportation cost (e.g., time and money). For example as shown in Figure 14,

the detected ring-shape hotspot’s log likelihood ratio is 284.51, and its p-value is 0.001. However,

the most significant result returned from SatScan is a circle covering the whole ring area with log

likelihood ratio as 211.15, and its p-value is 0.001. Eftelioglu et al. (2014) claims that by

delineating the inner circles of ring-shape hotspots the range to search for the possible source of

events (e.g., home of the suspect of a serial crimes) can be further narrowed down. By

specifically defining the scanning window in a spatial network, some network space hotspot

detection algorithms are put forward. For example, Eftelioglu et al. (2016) introduces a algorithm

for ring-shaped hotspot detection in a spatial network by defining that the center of a ring-shape

hotspot is a shortest path between two points in the network.

4.5.4 spatiotemporal hotspot detection approaches

Spatiotemporal hotspot detection can be seen as a special case of pure spatial hotspot detection by

the addition of time as a third dimension. Two types of spatiotemporal hotspots that are of

particular importance: “persistent” spatiotemporal hotspot and “emerging” spatiotemporal

hotspot. A “persistent” spatiotemporal hotspot is defined as a region where the rate of increase in

observations is constantly high over time. Thus, a persistent hotspot detection assumes that the

risk of a hotspot (i.e., outbreak) is constant over time and it searches over space and time for a

hotspot by simply totaling the number of observations in each time interval. An “emerging”

spatiotemporal hotspot is a region where the rate of observations is monotonically increasing over

time (Chang et al. (2005); Tango et al. (2011)). This kind of spatiotemporal hotspot occurs when

an outbreak emerges causing a sudden increase in the number observations. Such phenomena can

be observed in epidemiology where at the start of an outbreak the number of disease cases

suddenly increases. Tools for the detection of emerging spatiotemporal hotspots use spatial scan

statistics with the change in expectation over time (Neill et al. (2005)). Many of the clustering

based and spatial scan statistics based methods mentioned above, which are originally designed

for two-dimensional Euclidean space and mostly used for pure spatial data, can be used to

identify spatiotemporal candidate hotspots by considering the temporal part of the data as a third

dimension. For example, DBSCAN will cluster both spatial and spatiotemporal data using the

density of the data as its measure, and SatScan, which uses a cylindrical window in three

dimensions (time is the third dimension) instead of the circular window in two dimensions for

spatial hotspots (Kulldorff (2015)), can be used as a tool to detect persistent spatiotemporal

hotspots.

4.6 Spatial and Spatiotemporal Change


Although the single term “change” is used to name the spatial and spatiotemporal change patterns

in different applications, the underlying phenomena may differ significantly. This section briefly

summarizes the main ways a change may be defined in spatiotemporal data (Zhou et al. (2014)):

Change in Statistical Parameter: In this case, the data is assumed to follow a certain distribution

and the change is defined as a shift in this statistical distribution. For example, in statistical

quality control, a change in the mean or variance of the sensor readings is used to detect a fault.

Change in Actual Value: Here, change is modeled as the difference between a data value and its

spatial or temporal neighborhood. For example, in a one-dimensional continuous function, the

magnitude of change can be characterized by the derivative function, while on a two-dimensional

surface, it can be characterized by the gradient magnitude.

Change in Models Fitted to Data: This type of change is identified when a number of function

models are fitted to the data and one or more of the models exhibits a change (e.g., a discontinuity

between consecutive linear functions) (Chandola et al. (2010)).

4.6.2 Spatial and spatiotemporal change detection approaches

Spatial footprints can be classified as raster footprints or vector footprints. Vector footprints are

further classified into four categories: point(s), line(s), polygon(s), and network footprint patterns.

Raster footprints are classified based on the scale of the pattern, namely, local, focal, or zonal

patterns. This classification describes the scale of the change operation of a given phenomenon in

the spatial raster field (Worboys and Duckham (2004)). Local patterns are patterns in which

change at a given location depends only on attributes at this location. Focal patterns are patterns

in which change in a location depends on attributes in that location and its assumed

neighborhood. Zonal patterns define change using an aggregation of location values in a region.

Spatiotemporal Change Patterns with Raster-Based Spatial Footprint: This includes patterns of

spatial changes between snapshots. In remote sensing, detecting changes between satellite images

can help identify land cover change due to human activity, natural disasters, or climate change

(Bujor et al. (2004); Kosugi et al. (2004); Martino et al. (2007)). Given two geographically

aligned raster images, this problem aims to find a collection of pixels that have significant

changes between the two images (Radke et al. (2005)). This pattern is classified as a local change

between snapshots since the change at a given pixel is assumed to be independent of changes at

other pixels. Alternative definitions have assumed that a change at a pixel also depends on its

neighborhoods (Thoma and Bierling (1989)). For example, the pixel values in each block may be

assumed to follow a Gaussian distribution (Aach and Kaup (1995)). We refer to this type of

change footprint pattern as a focal spatial change between snapshots. Researchers in remote

sensing and image processing have also tried to apply image change detection to objects instead

of pixels (Chen et al. (2012); Desclée et al. (2006); Im et al. (2008)), yielding zonal spatial

change patterns between snapshots.

A well-known technique for detecting a local change footprint is simple differencing. The

technique starts by calculating the differences between the corresponding pixels intensities in the

two images. A change at a pixel is flagged if the difference at the pixel exceeds a certain

threshold. Alternative approaches have also been proposed to discover focal change footprints

between images. For example, the block-based density ratio test detects change based on a group

of pixels, known as a block (Aach et al. (1993); Rignot and Zyl (1993)). Object-based approaches

in remote sensing (Im et al. (2008); Im and Jensen (2005)) employ image segmentation

techniques to partition temporal snapshots of images into homogeneous objects (Douglas and

Peucker (1973)) and then classify object pairs in the two temporal snapshots of images into no

change or change classes.

Spatiotemporal Change Patterns with Vector-Based Spatial Footprint: This includes the

Spatiotemporal Volume Change Footprint pattern. This pattern represents a change process

occurring in a spatial region (a polygon) during a time interval. For example, an outbreak event of

a disease can be defined as an increase in disease reports in a certain region during a certain time

window up to the current time. Change patterns known to have an spatiotemporal volume

footprint include the spatiotemporal scan statistics (Kulldorff (2001); Kulldorff et al. (1998)), a

generalization of the spatial scan statistic, and emerging spatiotemporal clusters defined by (Neill

et al. (2005)).

5 Research Trend and Future Research Needs Most current research in spatial and spatiotemporal data mining uses Euclidean space, which

often assumes isotropic property and symmetric neighborhoods. However, in many real-world

applications, the underlying space is network space, such as river networks and road networks

(Isaak et al. (2014); Oliver et al. (2010); Oliver, Shekhar, Kang, et al. (2014)). One of the main

challenges in spatial and spatiotemporal network data mining is to account for the network

structure in the dataset. For example, in anomaly detection, spatial techniques do not consider the

spatial network structure of the dataset, that is, they may not be able to model graph properties

such as one-ways, connectivity, left-turns, etc. The network structure often violates the isotropic

property and symmetry of neighborhoods, and instead, requires asymmetric neighborhood and

directionality of neighborhood relationship (e.g., network flow direction).

Recently, some cutting-edge research has been conducted in the spatial network statistics and data

mining (Okabe and Sugihara (2012)). For example, several spatial network statistical methods

have been developed, e.g., network K function and network spatial autocorrelation. Several

spatial analysis methods have also been generalized to the network space, such as network point

cluster analysis and clumping method, network point density estimation, network spatial

interpolation (Kriging), as well as network Huff model. Due to the nature of spatial network

space as distinct from Euclidean space, these statistics and analysis often rely on advanced spatial

network computational techniques (Okabe and Sugihara (2012)).

We believe more spatiotemporal data mining research is still needed in the network space. First,

though several spatial statistics and data mining techniques have been generalized to the network

space, few spatiotemporal network statistics and data mining have been developed, and the vast

majority of research is still in the Euclidean space. Future research is needed to develop more

spatial network statistics, such as spatial network scan statistics, spatial network random field

model, as well as spatiotemporal autoregressive models for networks. Furthermore, phenomena

observed on spatiotemporal networks need to be interpreted in an appropriate frame of reference

to prevent a mismatch between the nature of the observed phenomena and the mining algorithm.

For instance, moving objects on a spatiotemporal network need to be studied from a traveler’s

perspective, i.e., the Lagrangian frame of reference (Gunturi et al. (2015); Gunturi and Shekhar

(2014)) instead of a snapshot view. This is because a traveler moving along a chosen path in a

spatiotemporal network would experience a road-segment (and its properties such as fuel

efficiency, travel-time etc.) for the time at which he/she arrives at that segment, which may be

distinct from the original departure-time at the start of the journey. These unique requirements

(non-isotropy and Lagrangian reference frame) call for novel spatiotemporal statistical

foundations (Isaak et al. (2014)) as well as new computational approaches for spatiotemporal

network data mining.

Another future research need is to develop spatiotemporal graph big data platforms. Current

relevant big data platforms for spatial and spatiotemporal data mining include ESRI GIS Tools

for Hadoop (ESRI (2013)), Hadoop GIS (Aji et al. (2013)), etc. These provide distributed systems

for geometric-data (e.g., lines, points and polygons) including geometric indexing and

partitioning methods such as R-tree, R+-tree, or Quad tree. Recently, SpatialHadoop has been

developed (Eldawy and Mokbel (2015)). SpatialHadoop embeds geometric notions in language,

visualization, storage, MapReduce, and operations layers. However, spatio-temporal graphs

(STGs) violate the core assumptions of current spatial big data platforms that the geometric

concepts are adequate for conveniently representing STG analytics operations and for partition

data for load-balancing. STGs also violate core assumptions underlying graph analytics software

(e.g., Giraph (Avery (2011)), GraphLab (Low et al. (2014)) and Pregel (Malewicz et al. (2010)))

that traditional location-unaware graphs are adequate for conveniently representing STG analytics

operations and for partition data for load-balancing. Therefore, novel spatiotemporal graph big

data platforms are needed. Several challenges should be addressed, e.g., spatiotemporal graph big

data requires novel distributed file systems (DFS) to partition the graph, and a novel

programming model is still needed to support abstract data types and fundamental STG

operations, etc.

6 Conclusions spatial and spatiotemporal (SST) data mining has broad application domains including ecology

and environmental management, public safety, transportation, earth science, epidemiology, and

climatology. This article provides an overview of recent advances in SST data mining. It reviews

common SST data mining techniques organized by major SST pattern families: SST anomaly,

SST association and tele-coupling, SST prediction, SST partitioning and summarization, SST

hotspots, and SST change detection. New trends and research needs are also summarized,

including SST data minim in the network space and SST big data platform development.

Reference Aach, T. and Kaup, A. (1995), “Bayesian algorithms for adaptive change detection in image

sequences using Markov random fields”, Signal Processing: Image Communication, Vol.

7 No. 2, pp. 147–160.

Aach, T., Kaup, A. and Mester, R. (1993), “Statistical model-based change detection in moving

video”, Signal Processing, Vol. 31 No. 2, pp. 165–180.

Aggarwal, C.C. (2013), Outlier Analysis, Springer, Berlin, Germany.

Agrawal, R., Imieliński, T. and Swami, A. (1993), “Mining Association Rules Between Sets of

Items in Large Databases”, Proceedings of the 1993 ACM SIGMOD International

Conference on Management of Data, ACM, New York, NY, USA, pp. 207–216.

Agrawal, R., Srikant, R. and others. (1994), “Fast algorithms for mining association rules”, Proc.

20th Int. Conf. Very Large Data Bases, VLDB, Vol. 1215, pp. 487–499.

Aji, A., Wang, F., Vo, H., Lee, R., Liu, Q., Zhang, X. and Saltz, J. (2013), “Hadoop GIS: A High

Performance Spatial Data Warehousing System over Mapreduce”, Proc. VLDB Endow.,

Vol. 6 No. 11, pp. 1009–1020.

Albert, P.S. and McShane, L.M. (1995), “A generalized estimating equations approach for

spatially correlated binary data: applications to the analysis of neuroimaging data”,

Biometrics, pp. 627–638.

Allen, J.F. (1984), “Towards a general theory of action and time”, Artificial Intelligence, Vol. 23

No. 2, pp. 123–154.

Almeida, C.M., Souza, I.M., Alves, C.D., Pinho, C.M.D., Pereira, M.N. and Feitosa, R.Q. (2007),

“Multilevel Object-oriented Classification of Quickbird Images for Urban Population

Estimates”, Proceedings of the 15th Annual ACM International Symposium on Advances

in Geographic Information Systems, ACM, New York, NY, USA, p. 12:1–12:8.

Ankerst, M., Breunig, M.M., Kriegel, H.-P. and Sander, J. (1999), “OPTICS: Ordering Points to

Identify the Clustering Structure”, Proceedings of the 1999 ACM SIGMOD International

Conference on Management of Data, ACM, New York, NY, USA, pp. 49–60.

Anselin, L. (1994), “Exploratory spatial data analysis and geographic information systems”, New

Tools for Spatial Analysis, Vol. 17, pp. 45–54.

Anselin, L. (1995), “Local Indicators of Spatial Association—LISA”, Geographical Analysis,

Vol. 27 No. 2, pp. 93–115.

Anselin, L. (2013), Spatial Econometrics: Methods and Models, Springer Science & Business

Media.

Avery, C. (2011), “Giraph: Large-scale graph processing infrastructure on hadoop”, Proceedings

of the Hadoop Summit. Santa Clara, Vol. 11.

Bamnett, V. and Lewis, T. (1994), “Outliers in statistical data”.

Banerjee, S., Carlin, B.P. and Gelfand, A.E. (2014), Hierarchical Modeling and Analysis for

Spatial Data, Crc Press.

Barua, S. and Sander, J. (2014), “Mining Statistically Significant Co-location and Segregation

Patterns”, IEEE Transactions on Knowledge and Data Engineering, Vol. 26 No. 5, pp.

1185–1199.

Birant, D. and Kut, A. (2007), “ST-DBSCAN: An algorithm for clustering spatial–temporal

data”, Data & Knowledge Engineering, Vol. 60 No. 1, pp. 208–221.

Bu, Y., Chen, L., Fu, A.W.-C. and Liu, D. (2009), “Efficient Anomaly Monitoring over Moving

Object Trajectory Streams”, Proceedings of the 15th ACM SIGKDD International

Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, pp.

159–168.

Bujor, F., Trouve, E., Valet, L., Nicolas, J.M. and Rudant, J.P. (2004), “Application of log-

cumulants to the detection of spatiotemporal discontinuities in multitemporal SAR

images”, IEEE Transactions on Geoscience and Remote Sensing, Vol. 42 No. 10, pp.

2073–2084.

Campelo, C.E.C. and Bennett, B. (2013), “Representing and Reasoning about Changing Spatial

Extensions of Geographic Features”, in Tenbrink, T., Stell, J., Galton, A. and Wood, Z.

(Eds.), Spatial Information Theory, Springer International Publishing, pp. 33–52.

Cao, H., Mamoulis, N. and Cheung, D.W. (2005), “Mining frequent spatio-temporal sequential

patterns”, Fifth IEEE International Conference on Data Mining (ICDM’05), presented at

the Fifth IEEE International Conference on Data Mining (ICDM’05), p. 8 pp.-pp.

Celik, M., Shekhar, S., Rogers, J.P. and Shine, J.A. (2008), “Mixed-Drove Spatiotemporal Co-

Occurrence Pattern Mining”, IEEE Transactions on Knowledge and Data Engineering,

Vol. 20 No. 10, pp. 1322–1335.

Celik, M., Shekhar, S., Rogers, J.P., Shine, J.A. and Yoo, J.S. (2006), “Mixed-Drove Spatio-

Temporal Co-occurence Pattern Mining: A Summary of Results”, Sixth International

Conference on Data Mining (ICDM’06), presented at the Sixth International Conference

on Data Mining (ICDM’06), pp. 119–128.

Cervone, G., Franzese, P., Ezber, Y. and Boybeyi, Z. (2008), “Risk Assessment of Atmospheric

Hazard Releases Using K-Means Clustering”, 2008 IEEE International Conference on

Data Mining Workshops, presented at the 2008 IEEE International Conference on Data

Mining Workshops, pp. 342–348.

Chandola, V., Banerjee, A. and Kumar, V. (2009), “Anomaly Detection: A Survey”, ACM

Comput. Surv., Vol. 41 No. 3, p. 15:1–15:58.

Chandola, V., Hui, D., Gu, L., Bhaduri, B. and Vatsavai, R.R. (2010), “Using Time Series

Segmentation for Deriving Vegetation Phenology Indices from MODIS NDVI Data”,

2010 IEEE International Conference on Data Mining Workshops, presented at the 2010

IEEE International Conference on Data Mining Workshops, pp. 202–208.

Chandola, V. and Kumar, V. (2007), “Summarization – compressing data into an informative

representation”, Knowledge and Information Systems, Vol. 12 No. 3, pp. 355–378.

Chang, W., Zeng, D. and Chen, H. (2005), “Prospective spatio-temporal data analysis for security

informatics”, Proceedings. 2005 IEEE Intelligent Transportation Systems, 2005.,

presented at the Proceedings. 2005 IEEE Intelligent Transportation Systems, 2005., pp.

1120–1124.

Chen, C., Zhang, D., Castro, P.S., Li, N., Sun, L. and Li, S. (2011), “Real-Time Detection of

Anomalous Taxi Trajectories from GPS Traces”, in Puiatti, A. and Gu, T. (Eds.), Mobile

and Ubiquitous Systems: Computing, Networking, and Services, Springer Berlin

Heidelberg, pp. 63–74.

Chen, D., Lu, C.-T., Kou, Y. and Chen, F. (2008), “On detecting spatial outliers”,

Geoinformatica, Vol. 12 No. 4, pp. 455–475.

Chen, G., Hay, G.J., Carvalho, L.M.T. and Wulder, M.A. (2012), “Object-based change

detection”, International Journal of Remote Sensing, Vol. 33 No. 14, pp. 4434–4457.

Cheng, D.T., Haworth, J., Anbaroglu, B., Tanaksaranond, G. and Wang, J. (2014),

“Spatiotemporal Data Mining”, in Fischer, M.M. and Nijkamp, P. (Eds.), Handbook of

Regional Science, Springer Berlin Heidelberg, pp. 1173–1193.

Chiu, S.N., Stoyan, D., Kendall, W.S. and Mecke, J. (2013), Stochastic Geometry and Its

Applications, John Wiley & Sons.

Chou, Y.-H. (1997), “Exploring Spatial Analysis in Geographic Information Systems”.

Cressie, N. (2015), Statistics for Spatial Data, John Wiley & Sons.

Cressie, N. and Wikle, C.K. (2015), Statistics for Spatio-Temporal Data, John Wiley & Sons.

Desclée, B., Bogaert, P. and Defourny, P. (2006), “Forest change detection by statistical object-

based method”, Remote Sensing of Environment, Vol. 102 No. 1–2, pp. 1–11.

Ding, W., Eick, C.F., Yuan, X., Wang, J. and Nicot, J.-P. (2011), “A framework for regional

association rule mining and scoping in spatial datasets”, GeoInformatica, Vol. 15 No. 1,

pp. 1–28.

Douglas, D.H. and Peucker, T.K. (1973), “Algorithms for the reduction of the number of points

required to represent a digitized line or its caricature”, Cartographica: The International

Journal for Geographic Information and Geovisualization, Vol. 10 No. 2, pp. 112–122.

Eck, J., Chainey, S., Cameron, J. and Wilson, R. (2005), “Mapping crime: Understanding

hotspots”.

Eftelioglu, E., Li, Y., Tang, X., Shekhar, S., Kang, J.M. and Farah, C. (2016), “Mining Network

Hotspots with Holes: A Summary of Results”, in Miller, J.A., O’Sullivan, D. and

Wiegand, N. (Eds.), Geographic Information Science, Springer International Publishing,

pp. 51–67.

Eftelioglu, E., Shekhar, S., Oliver, D., Zhou, X., Evans, M.R., Xie, Y., Kang, J.M., et al. (2014),

“Ring-Shaped Hotspot Detection: A Summary of Results”, 2014 IEEE International

Conference on Data Mining, presented at the 2014 IEEE International Conference on

Data Mining, pp. 815–820.

Eldawy, A. and Mokbel, M.F. (2015), “SpatialHadoop: A MapReduce framework for spatial

data”, 2015 IEEE 31st International Conference on Data Engineering, presented at the

2015 IEEE 31st International Conference on Data Engineering, pp. 1352–1363.

Elfeky, M.G., Aref, W.G. and Elmagarmid, A.K. (2006), “Stagger: Periodicity mining of data

streams using expanding sliding windows”, Sixth International Conference on Data

Mining (ICDM’06), IEEE, pp. 188–199.

Elliot, P., Wakefield, J.C., Best, N.G., Briggs, D.J. and others. (2000), Spatial Epidemiology:

Methods and Applications., Oxford University Press.

Erwig, M. and Schneider, M. (2002), “Spatio-temporal predicates”, IEEE Transactions on

Knowledge and Data Engineering, Vol. 14 No. 4, pp. 881–901.

ESRI. (2013), “Breathe Life into Big Data”.

Ester, M., Kriegel, H.-P. and Sander, J. (1997), “Spatial data mining: A database approach”,

Advances in Spatial Databases, Springer Berlin Heidelberg, pp. 47–66.

Ester, M., Kriegel, H.-P., Sander, J., Xu, X. and others. (1996), “A density-based algorithm for

discovering clusters in large spatial databases with noise.”, Kdd, Vol. 96, pp. 226–231.

Evans, M.R., Oliver, D., Shekhar, S. and Harvey, F. (2012), “Summarizing Trajectories into K-

primary Corridors: A Summary of Results”, Proceedings of the 20th International

Conference on Advances in Geographic Information Systems, ACM, New York, NY,

USA, pp. 454–457.

Evans, M.R., Oliver, D., Shekhar, S. and Harvey, F. (2013), “Fast and Exact Network Trajectory

Similarity Computation: A Case-study on Bicycle Corridor Planning”, Proceedings of the

2Nd ACM SIGKDD International Workshop on Urban Computing, ACM, New York,

NY, USA, p. 9:1–9:8.

Fotheringham, A.S., Brunsdon, C. and Charlton, M. (2003), Geographically Weighted

Regression: The Analysis of Spatially Varying Relationships, John Wiley & Sons.

Franke, C. and Gertz, M. (2008), “Detection and exploration of outlier regions in sensor data

streams”, 2008 IEEE International Conference on Data Mining Workshops, IEEE, pp.

375–384.

Frelich, L.E. and Reich, P.B. (2010), “Will environmental changes reinforce the impact of global

warming on the prairie–forest border of central North America?”, Frontiers in Ecology

and the Environment, Vol. 8 No. 7, pp. 371–378.

Ge, Y., Xiong, H., Liu, C. and Zhou, Z.H. (2011), “A Taxi Driving Fraud Detection System”,

2011 IEEE 11th International Conference on Data Mining, presented at the 2011 IEEE

11th International Conference on Data Mining, pp. 181–190.

Gelfand, A.E., Diggle, P., Guttorp, P. and Fuentes, M. (2010), Handbook of Spatial Statistics,

CRC Press.

George, B. and Shekhar, S. (2008), “Time-Aggregated Graphs for Modeling Spatio-temporal

Networks”, in Spaccapietra, S., Pan, J.Z., Thiran, P., Halpin, T., Staab, S., Svatek, V.,

Shvaiko, P., et al. (Eds.), Journal on Data Semantics XI, Springer Berlin Heidelberg, pp.

191–212.

Gething, P.W., Atkinson, P.M., Noor, A.M., Gikandi, P.W., Hay, S.I. and Nixon, M.S. (2007), “A

local space–time kriging approach applied to a national outpatient malaria data set”,

Computers & Geosciences, Vol. 33 No. 10, pp. 1337–1350.

Gunturi, V.M.V. and Shekhar, S. (2014), “Lagrangian Xgraphs: A Logical Data-Model for

Spatio-Temporal Network Data: A Summary”, Advances in Conceptual Modeling,

Springer International Publishing, pp. 201–211.

Gunturi, V.M.V., Shekhar, S. and Yang, K. (2015), “A Critical-Time-Point Approach to All-

Departure-Time Lagrangian Shortest Paths”, IEEE Transactions on Knowledge and Data

Engineering, Vol. 27 No. 10, pp. 2591–2603.

Güting, R.H. and Schneider, M. (2005), Moving Objects Databases, Elsevier.

Guyon, X. (1995), Random Fields on a Network: Modeling, Statistics, and Applications, Springer

Science & Business Media.

Habiba, H., Tantipathananandh, C. and Berger-Wolf, T. (2007), “Betweenness centrality measure

in dynamic networks”, Department of Computer Science, University of Illinois at

Chicago, Chicago.

Haining, R. (1993), Spatial Data Analysis in the Social and Environmental Sciences, Cambridge

University Press.

Haslett, J., Bradley, R., Craig, P., Unwin, A. and Wills, G. (1991), “Dynamic graphics for

exploring spatial data with application to locating global and local anomalies”, The

American Statistician, Vol. 45 No. 3, pp. 234–242.

Hixson, M., Scholz, D., Fuhs, N. and Akiyama, T. (1980), “Evaluation of several schemes for

classification of remotely sensed data”, Photogrammetric Engineering and Remote

Sensing.

Hu, T., Xiong, H., Gong, X. and Sung, S.Y. (2008), “ANEMI: An Adaptive Neighborhood

Expectation-Maximization Algorithm with Spatial Augmented Initialization”, Advances

in Knowledge Discovery and Data Mining, Springer Berlin Heidelberg, pp. 160–171.

Huang, Y., Pei, J. and Xiong, H. (2006), “Mining Co-Location Patterns with Rare Events from

Spatial Data Sets”, GeoInformatica, Vol. 10 No. 3, pp. 239–260.

Huang, Y., Shekhar, S. and Xiong, H. (2004), “Discovering colocation patterns from spatial data

sets: a general approach”, IEEE Transactions on Knowledge and Data Engineering, Vol.

16 No. 12, pp. 1472–1485.

Huang, Y., Zhang, L. and Zhang, P. (2006), “Finding Sequential Patterns from Massive Number

of Spatio-Temporal Events”, Proceedings of the 2006 SIAM International Conference on

Data Mining, Vols. 1-0, Society for Industrial and Applied Mathematics, pp. 634–638.

Huang, Y., Zhang, L. and Zhang, P. (2008), “A Framework for Mining Sequential Patterns from

Spatio-Temporal Event Data Sets”, IEEE Transactions on Knowledge and Data

Engineering, Vol. 20 No. 4, pp. 433–448.

Im, J. and Jensen, J.R. (2005), “A change detection model based on neighborhood correlation

image analysis and decision tree classification”, Remote Sensing of Environment, Vol. 99

No. 3, pp. 326–340.

Im, J., Jensen, J.R. and Tullis, J.A. (2008), “Object‐based change detection using correlation

image analysis and image segmentation”, International Journal of Remote Sensing, Vol.

29 No. 2, pp. 399–423.

Isaak, D.J., Peterson, E.E., Ver Hoef, J.M., Wenger, S.J., Falke, J.A., Torgersen, C.E., Sowder,

C., et al. (2014), “Applications of spatial statistical network models to stream data”,

Wiley Interdisciplinary Reviews: Water, Vol. 1 No. 3, pp. 277–294.

Isaaks, E.H. and others. (1989), Applied Geostatistics, Oxford University Press.

Jain, A.K. and Dubes, R.C. (1988), Algorithms for Clustering Data, Prentice-Hall, Inc.

Jain, A.K., Murty, M.N. and Flynn, P.J. (1999), “Data Clustering: A Review”, ACM Comput.

Surv., Vol. 31 No. 3, pp. 264–323.

Jhung, Y. and Swain, P.H. (1996), “Bayesian contextual classification based on modified M-

estimates and Markov random fields”, IEEE Transactions on Geoscience and Remote

Sensing, Vol. 34 No. 1, pp. 67–75.

Jiang, Z., Shekhar, S., Zhou, X., Knight, J. and Corcoran, J. (2013), “Focal-Test-Based Spatial

Decision Tree Learning: A Summary of Results”, 2013 IEEE 13th International

Conference on Data Mining, presented at the 2013 IEEE 13th International Conference

on Data Mining, pp. 320–329.

Jiang, Z., Shekhar, S., Zhou, X., Knight, J. and Corcoran, J. (2015), “Focal-Test-Based Spatial

Decision Tree Learning”, IEEE Transactions on Knowledge and Data Engineering, Vol.

27 No. 6, pp. 1547–1559.

Kang, J.M., Shekhar, S., Henjum, M., Novak, P.J. and Arnold, W.A. (2009), “Discovering

Teleconnected Flow Anomalies: A Relationship Analysis of Dynamic Neighborhoods

(RAD) Approach”, in Mamoulis, N., Seidl, T., Pedersen, T.B., Torp, K. and Assent, I.

(Eds.), Advances in Spatial and Temporal Databases, Springer Berlin Heidelberg, pp.

44–61.

Kang, J.M., Shekhar, S., Wennen, C. and Novak, P. (2008), “Discovering Flow Anomalies: A

SWEET Approach”, 2008 Eighth IEEE International Conference on Data Mining,

presented at the 2008 Eighth IEEE International Conference on Data Mining, pp. 851–

856.

Karypis, G., Han, E.-H. and Kumar, V. (1999), “Chameleon: hierarchical clustering using

dynamic modeling”, Computer, Vol. 32 No. 8, pp. 68–75.

Kawale, J., Chatterjee, S., Ormsby, D., Steinhaeuser, K., Liess, S. and Kumar, V. (2012),

“Testing the Significance of Spatio-temporal Teleconnection Patterns”, Proceedings of

the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data

Mining, ACM, New York, NY, USA, pp. 642–650.

Kisilevich, S., Mansmann, F., Nanni, M. and Rinzivillo, S. (2009), Spatio-Temporal Clustering,

Springer.

Koperski, K., Adhikary, J. and Han, J. (1996), “Spatial data mining: progress and challenges

survey paper”, Proc. ACM SIGMOD Workshop on Research Issues on Data Mining and

Knowledge Discovery, Montreal, Canada, Citeseer, pp. 1–10.

Koperski, K. and Han, J. (1995), “Discovery of spatial association rules in geographic

information databases”, International Symposium on Spatial Databases, Springer, pp.

47–66.

Kosugi, Y., Sakamoto, M., Fukunishi, M., Lu, W., Doihara, T. and Kakumoto, S. (2004), “Urban

change detection related to earthquakes using an adaptive nonlinear mapping of high-

resolution images”, IEEE Geoscience and Remote Sensing Letters, Vol. 1 No. 3, pp. 152–

156.

Kou, Y., Lu, C.-T. and Chen, D. (2006), “Spatial Weighted Outlier Detection.”, SDM, SIAM, pp.

614–618.

Koubarakis, M., Sellis, T., Frank, A.U., Grumbach, S., Güting, R.H., Jensen, C.S., Lorentzos, N.,

et al. (2003), Spatio-Temporal Databases: The CHOROCHRONOS Approach, Springer.

Krugman, P.R. (1997), Development, Geography, and Economic Theory, Vol. 6, MIT press.

Kulldorff, M. (1997), “A spatial scan statistic”, Communications in Statistics - Theory and

Methods, Vol. 26 No. 6, pp. 1481–1496.

Kulldorff, M. (2001), “Prospective time periodic geographical disease surveillance using a scan

statistic”, Journal of the Royal Statistical Society: Series A (Statistics in Society), Vol.

164 No. 1, pp. 61–72.

Kulldorff, M. (2015), SaTScanTM User Guide.

Kulldorff, M., Athas, W.F., Feurer, E.J., Miller, B.A. and Key, C.R. (1998), “Evaluating cluster

alarms: a space-time scan statistic and brain cancer in Los Alamos, New Mexico.”,

American Journal of Public Health, Vol. 88 No. 9, pp. 1377–1380.

Kyriakidis, P.C. and Journel, A.G. (1999), “Geostatistical space–time models: a review”,

Mathematical Geology, Vol. 31 No. 6, pp. 651–684.

Lai, C. and Nguyen, N.T. (2004), “Predicting density-based spatial clusters over time”, Fourth

IEEE International Conference on Data Mining, 2004. ICDM ’04, presented at the

Fourth IEEE International Conference on Data Mining, 2004. ICDM ’04, pp. 443–446.

Lang, L. and (Redlands, C.) E.S.R.I. (1999), Transportation GIS, Vol. 118, ESri Press Redlands.

Laube, P. and Imfeld, S. (2002), “Analyzing Relative Motion within Groups ofTrackable Moving

Point Objects”, in Egenhofer, M.J. and Mark, D.M. (Eds.), Geographic Information

Science, Springer Berlin Heidelberg, pp. 132–144.

Lee, A.J.T., Chen, Y.-A. and Ip, W.-C. (2009), “Mining frequent trajectory patterns in spatial–

temporal databases”, Information Sciences, Vol. 179 No. 13, pp. 2218–2231.

Lee, J.-G., Han, J. and Whang, K.-Y. (2007), “Trajectory Clustering: A Partition-and-group

Framework”, Proceedings of the 2007 ACM SIGMOD International Conference on

Management of Data, ACM, New York, NY, USA, pp. 593–604.

Leipnik, M.R. and Albert, D.P. (2003), GIS in Law Enforcement: Implementation Issues and

Case Studies, CRC Press.

Levine, N. (2013), “CrimeStat IV: A Spatial Statistics Program for the Analysis of Crime

Incident”, Forensic Magazine, 15 August.

Li, S.Z. (2009), Markov Random Field Modeling in Image Analysis, Springer Science & Business

Media.

Li, X. and Claramunt, C. (2006), “A Spatial Entropy-Based Decision Tree for Classification of

Geographical Information”, Transactions in GIS, Vol. 10 No. 3, pp. 451–467.

Li, X., Han, J. and Kim, S. (2006), “Motion-Alert: Automatic Anomaly Detection in Massive

Moving Objects”, in Mehrotra, S., Zeng, D.D., Chen, H., Thuraisingham, B. and Wang,

F.-Y. (Eds.), Intelligence and Security Informatics, Springer Berlin Heidelberg, pp. 166–

177.

Li, Y., Bailey, J., Kulik, L. and Pei, J. (2013), “Mining Probabilistic Frequent Spatio-Temporal

Sequential Patterns with Gap Constraints from Uncertain Databases”, 2013 IEEE 13th

International Conference on Data Mining, presented at the 2013 IEEE 13th International

Conference on Data Mining, pp. 448–457.

Li, Z., Chen, J. and Baltsavias, E. (2008), Advances in Photogrammetry, Remote Sensing and

Spatial Information Sciences: 2008 ISPRS Congress Book, CRC Press.

Little, B., Schucking, M., Gartrell, B., Chen, B., Ross, K. and McKellip, R. (2008), “High

Granularity Remote Sensing and Crop Production over Space and Time: NDVI over the

Growing Season and Prediction of Cotton Yields at the Farm Field Level in Texas”, 2008

IEEE International Conference on Data Mining Workshops, presented at the 2008 IEEE

International Conference on Data Mining Workshops, pp. 426–435.

Liu, C., Xiong, H., Ge, Y., Geng, W. and Perkins, M. (2012), “A Stochastic Model for Context-

Aware Anomaly Detection in Indoor Location Traces”, 2012 IEEE 12th International

Conference on Data Mining, presented at the 2012 IEEE 12th International Conference

on Data Mining, pp. 449–458.

Liu, X., Chen, F. and Lu, C.-T. (2014), “On detecting spatial categorical outliers”,

GeoInformatica, Vol. 18 No. 3, pp. 501–536.

Low, Y., Gonzalez, J.E., Kyrola, A., Bickson, D., Guestrin, C.E. and Hellerstein, J. (2014),

“GraphLab: A New Framework For Parallel Machine Learning”, arXiv:1408.2041 [Cs].

Lu, C.-T., Chen, D. and Kou, Y. (2003a), “Detecting spatial outliers with multiple attributes”,

15th IEEE International Conference on Tools with Artificial Intelligence, 2003.

Proceedings, presented at the 15th IEEE International Conference on Tools with

Artificial Intelligence, 2003. Proceedings, pp. 122–128.

Lu, C.-T., Chen, D. and Kou, Y. (2003b), “Algorithms for spatial outlier detection”, Data Mining,

2003. ICDM 2003. Third IEEE International Conference on, IEEE, pp. 597–600.

Ma, D. and Zhang, A. (2004), “An adaptive density-based clustering algorithm for spatial

database with noise”, Fourth IEEE International Conference on Data Mining, 2004.

ICDM ’04, presented at the Fourth IEEE International Conference on Data Mining, 2004.

ICDM ’04, pp. 467–470.

Mahoney, J.R., Asrar, G., Leinen, M.S., Andrews, J., Glackin, M., Groat, C., Hobenstein, W., et

al. (2003), “Strategic plan for the US climate change science program”, Climate Change

Science Program Office, Washington DC.

Malewicz, G., Austern, M.H., Bik, A.J.., Dehnert, J.C., Horn, I., Leiser, N. and Czajkowski, G.

(2010), “Pregel: A System for Large-scale Graph Processing”, Proceedings of the 2010

ACM SIGMOD International Conference on Management of Data, ACM, New York,

NY, USA, pp. 135–146.

Marcon, E. and Puech, F. (2009), “Generalizing Ripley’s K function to inhomogeneous

populations”, 1 April.

Martino, G.D., Iodice, A., Riccio, D. and Ruello, G. (2007), “A Novel Approach for Disaster

Monitoring: Fractal Models and Tools”, IEEE Transactions on Geoscience and Remote

Sensing, Vol. 45 No. 6, pp. 1559–1570.

McGuire, M.P., Janeja, V.P. and Gangopadhyay, A. (2014), “Mining trajectories of moving

dynamic spatio-temporal regions in sensor datasets”, Data Mining and Knowledge

Discovery, Vol. 28 No. 4, pp. 961–1003.

Mennis, J., Viger, R. and Tomlin, C.D. (2005), “Cubic Map Algebra Functions for Spatio-

Temporal Analysis”, Cartography and Geographic Information Science, Vol. 32 No. 1,

pp. 17–32.

Miller, H.J. and Han, J. (2009), “Geographic Data Mining and Knowledge

Discovery, .${$CRC$}$ Press”, New York.

Mohan, P., Shekhar, S., Shine, J.A. and Rogers, J.P. (2012), “Cascading Spatio-Temporal Pattern

Discovery”, IEEE Transactions on Knowledge and Data Engineering, Vol. 24 No. 11,

pp. 1977–1992.

Morens, D.M., Folkers, G.K. and Fauci, A.S. (2004), “The challenge of emerging and re-

emerging infectious diseases”, Nature, Vol. 430 No. 6996, pp. 242–249.

Morimoto, Y. (2001), “Mining frequent neighboring class sets in spatial databases”, Proceedings

of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and

Data Mining, ACM, pp. 353–358.

Neill, D.B. and Moore, A.W. (2004), “Rapid Detection of Significant Spatial Clusters”,

Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge

Discovery and Data Mining, ACM, New York, NY, USA, pp. 256–265.

Neill, D.B., Moore, A.W., Sabhnani, M. and Daniel, K. (2005), “Detection of Emerging Space-

time Clusters”, Proceedings of the Eleventh ACM SIGKDD International Conference on

Knowledge Discovery in Data Mining, ACM, New York, NY, USA, pp. 218–227.

Ng, R.T. and Han, J. (2002), “CLARANS: a method for clustering objects for spatial data

mining”, IEEE Transactions on Knowledge and Data Engineering, Vol. 14 No. 5, pp.

1003–1016.

Okabe, A., Okunuki, K.-I. and Shiode, S. (2006), “The SANET Toolbox: New Methods for

Network Spatial Analysis”, Transactions in GIS, Vol. 10 No. 4, pp. 535–550.

Okabe, A. and Sugihara, K. (2012), Spatial Analysis along Networks: Statistical and

Computational Methods, John Wiley & Sons.

Okabe, A., Yomono, H. and Kitamura, M. (1995), “Statistical analysis of the distribution of

points on a network”, Geographical Analysis, Vol. 27 No. 2, pp. 152–175.

Oliver, D., Bannur, A., Kang, J.M., Shekhar, S. and Bousselaire, R. (2010), “A K-Main Routes

Approach to Spatial Network Activity Summarization: A Summary of Results”, 2010

IEEE International Conference on Data Mining Workshops, presented at the 2010 IEEE

International Conference on Data Mining Workshops, pp. 265–272.

Oliver, D., Shekhar, S., Kang, J.M., Laubscher, R., Carlan, V. and Bannur, A. (2014), “A K-Main

Routes Approach to Spatial Network Activity Summarization”, IEEE Transactions on


Oliver, D., Shekhar, S., Zhou, X., Eftelioglu, E., Evans, M.R., Zhuang, Q., Kang, J.M., et al.

(2014), “Significant route discovery: A summary of results”, International Conference on

Geographic Information Science, Springer International Publishing, pp. 284–300.

Openshaw, S. and Openshaw, S. (1984), “The modifiable areal unit problem”, Geo Abstracts

University of East Anglia.

Pan, B., Demiryurek, U., Banaei-Kashani, F. and Shahabi, C. (2010), “Spatiotemporal

Summarization of Traffic Data Streams”, Proceedings of the ACM SIGSPATIAL

International Workshop on GeoStreaming, ACM, New York, NY, USA, pp. 4–10.

Pei, T., Jasra, A., Hand, D.J., Zhu, A.-X. and Zhou, C. (2009), “DECODE: a new method for

discovering clusters of different densities in spatial data”, Data Mining and Knowledge

Discovery, Vol. 18 No. 3, p. 337.

Pillai, K.G., Angryk, R.A. and Aydin, B. (2013), “A Filter-and-refine Approach to Mine

Spatiotemporal Co-occurrences”, Proceedings of the 21st ACM SIGSPATIAL

International Conference on Advances in Geographic Information Systems, ACM, New

York, NY, USA, pp. 104–113.

Quinlan, J.R. (1993), “C4. 5: Programming for machine learning”, Morgan Kauffmann, p. 38.

Radke, R.J., Andra, S., Al-Kofahi, O. and Roysam, B. (2005), “Image change detection

algorithms: a systematic survey”, IEEE Transactions on Image Processing, Vol. 14 No.

3, pp. 294–307.

Rignot, E.J.M. and Zyl, J.J. van. (1993), “Change detection techniques for ERS-1 SAR data”,

IEEE Transactions on Geoscience and Remote Sensing, Vol. 31 No. 4, pp. 896–906.

Ripley, B.D. (1976), “The second-order analysis of stationary point processes”, Journal of

Applied Probability, pp. 255–266.

Roddick, J.F. and Spiliopoulou, M. (1999), “A bibliography of temporal, spatial and spatio-

temporal data mining research”, ACM SIGKDD Explorations Newsletter, Vol. 1 No. 1,

pp. 34–38.

Scally, R. (2006), GIS for Environmental Management, ESRI Press.

Schubert, E., Zimek, A. and Kriegel, H.-P. (2014), “Local outlier detection reconsidered: a

generalized view on locality with applications to spatial, video, and network outlier

detection”, Data Mining and Knowledge Discovery, Vol. 28 No. 1, pp. 190–237.

Shekhar, S. and Chawla, S. (2003), Spatial Databases: A Tour, 1 edition., Prentice Hall, Upper

Saddle River, N.J.

Shekhar, S., Evans, M.R., Kang, J.M. and Mohan, P. (2011), “Identifying patterns in spatial

information: A survey of methods”, Wiley Interdisciplinary Reviews: Data Mining and

Knowledge Discovery, Vol. 1 No. 3, pp. 193–214.

Shekhar, S., Jiang, Z., Ali, R.Y., Eftelioglu, E., Tang, X., Gunturi, V.M.V. and Zhou, X. (2015),

“Spatiotemporal Data Mining: A Computational Perspective”, ISPRS International

Journal of Geo-Information, Vol. 4 No. 4, pp. 2306–2338.

Shekhar, S., Lu, C.-T. and Zhang, P. (2003), “A unified approach to detecting spatial outliers”,

GeoInformatica, Vol. 7 No. 2, pp. 139–166.

Shekhar, S. and Xiong, H. (2007), Encyclopedia of GIS, Springer Science & Business Media.

Shekhar, S., Yang, T.A. and Hancock, P.A. (1993), “An intelligent vehicle highway information

management system”, Computer-Aided Civil and Infrastructure Engineering, Vol. 8 No.

3, pp. 175–198.

Shekhar, S., Zhang, P., Huang, Y. and Vatsavai, R.R. (2003), “Trends in spatial data mining”,

Data Mining: Next Generation Challenges and Future Directions, pp. 357–380.

Solberg, A.H.S., Taxt, T. and Jain, A.K. (1996), “A Markov random field model for classification

of multisource satellite imagery”, IEEE Transactions on Geoscience and Remote Sensing,

Vol. 34 No. 1, pp. 100–113.

Stojanova, D., Ceci, M., Appice, A., Malerba, D. and Džeroski, S. (2013), “Dealing with spatial

autocorrelation when learning predictive clustering trees”, Ecological Informatics, Vol.

13, pp. 22–39.

Strahler, A.H. (1980), “The use of prior probabilities in maximum likelihood classification of

remotely sensed data”, Remote Sensing of Environment, Vol. 10 No. 2, pp. 135–163.

Tango, T., Takahashi, K. and Kohriyama, K. (2011), “A Space–Time Scan Statistic for Detecting

Emerging Outbreaks”, Biometrics, Vol. 67 No. 1, pp. 106–115.

Thoma, R. and Bierling, M. (1989), “Motion compensating interpolation considering covered and

uncovered background”, Signal Processing: Image Communication, Vol. 1 No. 2, pp.

191–212.

Tobler, W.R. (1979), “Cellular Geography”, in Gale, S. and Olsson, G. (Eds.), Philosophy in

Geography, Springer Netherlands, pp. 379–386.

Verhein, F. (2009), “Mining Complex Spatio-Temporal Sequence Patterns”, Proceedings of the

2009 SIAM International Conference on Data Mining, Vols. 1-0, Society for Industrial

and Applied Mathematics, pp. 605–616.

Wang, M., Wang, A. and Li, A. (2006), “Mining Spatial-temporal Clusters from Geo-databases”,

Advanced Data Mining and Applications, Springer Berlin Heidelberg, pp. 263–270.

Wang, S., Huang, Y. and Wang, X.S. (2013), “Regional Co-locations of Arbitrary Shapes”,

Advances in Spatial and Temporal Databases, Springer Berlin Heidelberg, pp. 19–37.

Warrender, C.E. and Augusteijn, M.F. (1999), “Fusion of image classifications using Bayesian

techniques with Markov random fields”, International Journal of Remote Sensing, Vol.

20 No. 10, pp. 1987–2002.

Worboys, M. (2005), “Event‐oriented approaches to geographic phenomena”, International

Journal of Geographical Information Science, Vol. 19 No. 1, pp. 1–28.

Worboys, M.F. and Duckham, M. (2004), GIS: A Computing Perspective, Second Edition, CRC

Press.

Wu, M., Jermaine, C., Ranka, S., Song, X. and Gums, J. (2010), “A Model-Agnostic Framework

for Fast Spatial Anomaly Detection”, ACM Trans. Knowl. Discov. Data, Vol. 4 No. 4, p.

20:1–20:30.

Xiong, H., Shekhar, S., Huang, Y., Kumar, V., Ma, X. and Yoc, J. (2004), “A Framework for

Discovering Co-location Patterns in Data Sets with Extended Spatial Objects”,

Proceedings of the 2004 SIAM International Conference on Data Mining, Vols. 1-0,

Society for Industrial and Applied Mathematics, pp. 78–89.

Yang, K., Evans, M.R., Gunturi, V.M.V., Kang, J.M. and Shekhar, S. (2014), “Lagrangian

Approaches to Storage of Spatio-Temporal Network Datasets”, IEEE Transactions on


Yoo, J.S. and Shekhar, S. (2006), “A Joinless Approach for Mining Spatial Colocation Patterns”,

IEEE Transactions on Knowledge and Data Engineering, Vol. 18 No. 10, pp. 1323–

1337.

Yuan, M. (1996), “Temporal GIS and spatio-temporal modeling”, Proceedings of Third

International Conference Workshop on Integrating GIS and Environment Modeling,

Santa Fe, NM.

Yuan, M. (1999), “Use of a Three-Domain Repesentation to Enhance GIS Support for Complex

Spatiotemporal Queries”, Transactions in GIS, Vol. 3 No. 2, pp. 137–159.

Zhang, D., Li, N., Zhou, Z.-H., Chen, C., Sun, L. and Li, S. (2011), “iBAT: Detecting Anomalous

Taxi Trajectories from GPS Traces”, Proceedings of the 13th International Conference

on Ubiquitous Computing, ACM, New York, NY, USA, pp. 99–108.

Zhang, P., Huang, Y., Shekhar, S. and Kumar, V. (2003a), “Correlation Analysis of Spatial Time

Series Datasets: A Filter-and-Refine Approach”, in Whang, K.-Y., Jeon, J., Shim, K. and

Srivastava, J. (Eds.), Advances in Knowledge Discovery and Data Mining, Springer

Berlin Heidelberg, pp. 532–544.

Zhang, P., Huang, Y., Shekhar, S. and Kumar, V. (2003b), “Exploiting Spatial Autocorrelation to

Efficiently Process Correlation-Based Similarity Queries”, Advances in Spatial and

Temporal Databases, Springer Berlin Heidelberg, pp. 449–468.

Zhang, T., Ramakrishnan, R. and Livny, M. (1996), “BIRCH: An Efficient Data Clustering

Method for Very Large Databases”, in Jagadish, H.V. and Mumick, I.S. (Eds.),

Proceedings of the 1996 ACM SIGMOD International Conference on Management of

Data, Montreal, Quebec, Canada, June 4-6, 1996, ACM Press, pp. 103–114.

Zhou, X., Shekhar, S. and Ali, R.Y. (2014), “Spatiotemporal change footprint pattern discovery:

an inter-disciplinary survey”, Wiley Interdisciplinary Reviews: Data Mining and

Knowledge Discovery, Vol. 4 No. 1, pp. 1–23.