Spatial and Spatiotemporal Data Mining
Abstract
The significant growth of spatial and spatiotemporal data collection as well as the emergence of
new technologies have heightened the need for automated discovery of spatiotemporal
knowledge. Spatial and spatiotemporal data mining techniques are crucial to organizations which
make decisions based on large spatial and spatiotemporal datasets. The interdisciplinary nature of
spatial and spatiotemporal data mining and the complexity of spatial and spatiotemporal data and
relationships pose statistical and computational challenges. This article provides an overview of
recent advances in spatial and spatiotemporal data mining, and reviews common spatial and
spatiotemporal data mining techniques organized by major pattern families, based on an
introduction to spatial and spatiotemporal data types and relationships as well as the statistical
background. New trends and research needs are also summarized, including spatial and
spatiotemporal data mining in the network space and spatial and spatiotemporal big data platform
development.
Keywords
Spatial, Spatiotemporal, Data Mining, Statistics, Computing, Anomaly, Association, Prediction
Model, Partition, Summarization, Hotspots, Change Footprint.
1 Introduction
The significant growth of spatial and spatiotemporal data collection as well as the emergence of
new technologies have heightened the need for automated discovery of spatiotemporal
knowledge. Spatial and spatiotemporal data mining studies the process of discovering interesting
and previously unknown, but potentially useful patterns from large spatial and spatiotemporal
databases. The complexity of spatial and spatiotemporal data and implicit relationships limits the
usefulness of conventional data mining techniques for extracting spatial and spatiotemporal
patterns.
As shown in Figure 1, the process of spatial and spatiotemporal data mining starts with
preprocessing of the input data. Typically, this step corrects noise, errors, and missing data.
Exploratory space-time analysis is also conducted in this step to understand the underlying
spatiotemporal distribution. Then, an appropriate algorithm is selected to run on the preprocessed
data, and produce output patterns. Common output pattern families include spatial and
spatiotemporal anomalies, associations and tele-couplings, predictive models, partitions and
summarization, hotspots, as well as change patterns. Spatial and spatiotemporal data mining
algorithms often have statistical foundations and integrate scalable computational techniques.
Output patterns are post-processed and then interpreted by domain scientists to find novel insights
and refine data mining algorithms when needed.
Figure 1 The process of spatial & spatiotemporal data mining
1.1 Societal Importance
Spatial and spatiotemporal data mining techniques are crucial to organizations which make
decisions based on large spatial and spatiotemporal datasets. Many government and private
agencies are likely beneficiaries of spatial and spatiotemporal data mining including NASA, the
National Geospatial-Intelligence Agency (Krugman (1997)), the National Cancer Institute (Albert
and McShane (1995)), the US Department of Transportation (Shekhar et al. (1993)), and the
National Institute of Justice (Eck et al. (2005)). These organizations are spread across many
application domains. These application domains include public health, mapping and analysis for
public safety, transportation, environmental science and management, economics, climatology,
public policy, earth science, market research and analytics, public utilities and distribution, etc. In
ecology and environmental management (Haining (1993); Isaaks and others (1989); Roddick and
Spiliopoulou (1999); Scally (2006)), researchers need tools to classify remote sensing images to
map forest coverage. In public safety (Leipnik and Albert (2003)), crime analysts are interested in
discovering hotspot patterns from crime event maps so as to effectively allocate police resources.
In transportation (Lang and Redlands (1999)), researchers analyze historical taxi GPS
trajectories to recommend fast routes from place to place. Epidemiologists (Elliot et al. (2000))
use spatiotemporal data mining techniques to detect disease outbreaks.
The interdisciplinary nature of spatial and spatiotemporal data mining means that its techniques
must be developed with awareness of the underlying physics or theories in the application domains.
For example, climate science studies find that observable predictors for climate phenomena
discovered by data science techniques can be misleading if they do not take into account climate
models, locations, and seasons. In this case, statistical significance testing is critically important
in order to further validate or discard relationships mined from data.
1.2 Challenges
[Figure 1 diagram labels: input (spatial & spatiotemporal data); preprocessing, exploratory space-time analysis; spatial & spatiotemporal data mining algorithms, with spatial & spatiotemporal statistical foundation and computational techniques; output (spatial & spatiotemporal patterns); post-processing]
In addition to interdisciplinary challenges, spatial and spatiotemporal data mining also poses
statistical and computational challenges. Extracting interesting and useful patterns from spatial
and spatiotemporal datasets is more difficult than extracting corresponding patterns from
traditional numeric and categorical data due to the complexity of spatiotemporal data types and
relationships, including: (1) the spatial relationships among the variables, i.e., observations that
are not independent and identically distributed (i.i.d.); (2) the spatial structure of errors; and (3)
nonlinear interactions in feature space. According to Tobler’s first law of geography, “Everything
is related to everything else, but near things are more related than distant things.” For example,
people with similar characteristics, occupation and background tend to cluster together in the
same neighborhoods. In spatial statistics such spatial dependence is called the spatial
autocorrelation effect. Ignoring autocorrelation and assuming an identical and independent
distribution (i.i.d.) when analyzing data with spatial and spatio-temporal characteristics may
produce hypotheses or models that are inaccurate or inconsistent with the data set (Shekhar,
Zhang, et al. (2003)). In addition to spatial dependence at nearby locations, phenomena of
spatiotemporal tele-coupling also indicate long range spatial dependence such as El Niño and La
Niña effects in the climate system. Another challenge comes from the fact that spatiotemporal
datasets are embedded in continuous space and time, and thus many classical data mining
techniques assuming discrete data (e.g., transactions in association rule mining) may not be
effective. A third challenge is spatial heterogeneity and temporal non-stationarity, i.e.,
spatiotemporal data samples do not follow an identical distribution across the entire
space and over all time. Instead, different geographical regions and temporal periods may have
distinct distributions. The modifiable areal unit problem (MAUP), or multi-scale effect, is another
challenge, since the results of spatial analysis depend on the choice of spatial and temporal
scales. Finally, flow, movement, and the Lagrangian frame of reference in spatiotemporal
networks pose further challenges (e.g., directionality and anisotropy).
One way to deal with implicit spatiotemporal relationships is to materialize the relationships into
traditional data input columns and then apply classical data mining techniques (Agrawal et al.
(1993), (1994); Barnett and Lewis (1994); Jain and Dubes (1988); Quinlan (1993)). However,
the materialization can result in loss of information (Shekhar, Zhang, et al. (2003)). The spatial
and temporal vagueness which naturally exists in data and relationships usually creates further
modeling and processing difficulties in spatial and spatiotemporal data mining. A preferable
way to capture implicit spatial and spatiotemporal relationships is to develop statistics and
techniques that incorporate spatial and temporal information into the data mining process.
1.3 Related Work
Surveys of spatial and spatiotemporal data mining fall into two groups: those without a focus on
statistical foundations, and those with one. Among the surveys without a focus on statistical
foundations, Koperski et al. (1996) and Ester et al. (1997)
reviewed spatial data mining from a spatial database approach; Roddick and Spiliopoulou (1999)
provided a bibliography for spatial, temporal and spatiotemporal data mining; and Miller and Han
(2009) covered a list of recent spatial and spatiotemporal data mining topics but without a
systematic view of statistical foundations. Among surveys covering statistical foundations,
Shekhar, Zhang, et al. (2003) reviewed several spatial pattern families focusing on spatial data’s
unique characteristics; Kisilevich et al. (2009) reviewed spatiotemporal clustering research;
Aggarwal (2013) has a chapter summarizing spatial and spatiotemporal outlier detection
techniques; Zhou et al. (2014) reviewed spatial and spatiotemporal change detection research from
an interdisciplinary view; Cheng et al. (2014) reviewed state-of-the-art spatiotemporal data mining
research including spatiotemporal autocorrelation, space-time forecasting and prediction,
space-time clustering, as well as space-time visualization; Shekhar et al. (2011) reviewed spatial
data mining research; and Shekhar et al. (2015) reviewed spatiotemporal data mining research with
a discussion of statistical foundations.
1.4 Scope
We hope this article contributes to spatial and spatiotemporal data mining research in filling these
two gaps. More specifically: (1) we provide a taxonomy of spatial and spatiotemporal data types;
(2) we provide a taxonomy of spatial and spatiotemporal statistics organized by different data
types; (3) we survey common computational techniques for all major spatial and spatiotemporal
pattern families, including spatial and spatiotemporal anomalies, coupling and tele-coupling,
prediction models, partitioning and summarization, hotspots and change patterns.
1.5 Organization of the Paper
This article starts with the characteristics of the data inputs of spatial and spatiotemporal data
mining (Section 2) and an overview of its statistical foundation (Section 3). It then describes in
detail six main output patterns of spatial and spatiotemporal data mining (Section 4). The paper
concludes with an examination of research needs and future directions in Section 5 and Section 6.
2 Input: Spatial and Spatiotemporal Data
The input data is one of the important factors shaping spatial and spatiotemporal data mining. In
this section, a taxonomy of different spatial and spatiotemporal data types is introduced, followed
by a summary of the unique attributes and relationships of different data. The goal is to provide a
systematic overview of data in spatial and spatiotemporal data mining tasks.
2.1 Types of Spatial and Spatiotemporal Data
The complexity of its input data makes spatial and spatiotemporal data mining unique when
compared with classical data mining. The cause of this complexity is the conflict between the
discrete representations and the continuous space and time.
There are three models of spatial data, namely, the object model (Figure 2), the field model
(Figure 3), and the spatial network model (Figure 4) (Shekhar and Chawla (2003)). The object
model uses points, lines, polygons, and collections of them to describe the world. Objects in this
model are distinctly identified according to the application context, and each is associated
with some non-spatial attributes. The field model represents spatial data as a function from a
spatial framework, which is a partitioning of space, to non-spatial attributes. For example, a grid
consisting of areas covered by pixels of a remote sensing image is mapped to pixel values in the
image. In contrast to the object model, the field model is more suitable for depicting continuous
phenomena. Operations on field model data are categorized by their scope into four groups:
local, focal, zonal, and global operations. A local operation produces an
output at a given location based only on the input at that location. A focal operation’s output at a
location also depends on the input in a small neighborhood of the location. Zonal and global
operations work on inputs from a pre-defined zone or on all inputs, respectively. The spatial network model
extends a graph to represent the relationships between spatial elements using vertices and edges. In
addition to binary relationships between vertices represented by edges in a graph, a spatial
network includes more complex relationships, e.g., different kinds of turns at a road intersection.
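The four scopes of field operations can be illustrated on a toy raster. The following is a minimal sketch, assuming a 3x3 grid, a 3x3 focal window clipped at the edges, and a hypothetical zone labeling; the function names (local_op, focal_mean, zonal_sum, global_mean) are illustrative and do not come from any particular GIS library.

```python
# A tiny 3x3 raster (field model): each cell of the spatial framework maps to a value.
raster = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Hypothetical zone labels, one per cell, used only for the zonal example.
zones = [
    ["a", "a", "b"],
    ["a", "b", "b"],
    ["c", "c", "c"],
]

def local_op(grid, f):
    """Local: the output at a cell depends only on the input at that cell."""
    return [[f(v) for v in row] for row in grid]

def focal_mean(grid):
    """Focal: the output at a cell depends on a small neighborhood
    (here a 3x3 window, clipped at the raster boundary)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                grid[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out

def zonal_sum(grid, zone_grid):
    """Zonal: aggregate the inputs over each pre-defined zone."""
    totals = {}
    for row, zone_row in zip(grid, zone_grid):
        for v, z in zip(row, zone_row):
            totals[z] = totals.get(z, 0) + v
    return totals

def global_mean(grid):
    """Global: the output depends on all cells of the raster."""
    cells = [v for row in grid for v in row]
    return sum(cells) / len(cells)

doubled = local_op(raster, lambda v: 2 * v)
smoothed = focal_mean(raster)
per_zone = zonal_sum(raster, zones)   # {'a': 7, 'b': 14, 'c': 24}
overall = global_mean(raster)         # 5.0
```

The only difference between the four functions is which cells are allowed to influence an output value, which is the distinction the taxonomy draws.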
Based on how temporal information is modeled in addition to spatial data,
spatiotemporal data can be categorized into three types, i.e., the temporal snapshot model, the temporal
change model, and the event or process model (Li et al. (2008); Yuan (1996), (1999)). In the
temporal snapshot model, spatial layers of the same theme are time-stamped. Depending on the data
model in each snapshot layer, snapshots can represent trajectories of raster data, points, lines,
and polygons, as well as networks such as time expanded graphs (TEGs) and time aggregated graphs
(TAGs) (George and Shekhar (2008)). A remote sensing imagery time series is a typical example
of the temporal snapshot model of raster data (Figure 5). Unlike the temporal snapshot model, the
temporal change model represents spatiotemporal data with a spatial layer at a given start time
together with incremental changes occurring afterward (Figure 6 shows the motion directions
and speeds of two objects). For example, it can represent relative motion attributes such as speed,
acceleration, and direction of spatial points (Gelfand et al. (2010); Laube and Imfeld (2002)), as
well as rotation and deformation of lines and polygons. The event or process model describes
things that happen or go on in time (Figure 7). Processes
represent ongoing homogeneous situations involving one or more types of things. The
homogeneity of processes indicates that if a process is going on over an interval of time then it is
also going on over all the subintervals of that interval, so the properties of a process are subject to
change over time. Events refer to the culmination of processes associated with precise temporal
boundaries. They are typically used to represent things that have happened (Allen (1984);
Campelo and Bennett (2013); Shekhar and Xiong (2007); Worboys (2005)).
Figure 2 Object Model
Figure 3 Raster Model
Figure 4 Spatial Network Model
Figure 5 Temporal Snapshot Model
Figure 6 Temporal Change Model
Figure 7 Event and Process Model
2.2 Data Attributes and Relationships
Data attributes for spatial and spatiotemporal data can be categorized into three distinct types, i.e.,
non-spatiotemporal attributes, spatial attributes, and temporal attributes. Non-spatiotemporal
attributes are the same as the attributes used in classical data mining; they characterize
non-contextual features of objects, such as the name, population, and unemployment rate of a city. Spatial
attributes define the spatial location (e.g., longitude, latitude, and elevation) of spatial points,
spatial extent (e.g., area, perimeter) and shape of spatial polylines and polygons in a spatial
reference frame. Temporal attributes include the timestamp of a spatial layer (e.g., a raster layer,
points), as well as the duration of a process.
Relationships on non-spatiotemporal attributes are made explicit through arithmetic, ordering,
and set relations. In contrast, relationships among spatial objects are often implicit, such as
intersection, within, and below. The relationships on non-spatiotemporal and spatial attributes are
summarized in Table 1. Spatial relationships are categorized into five groups, namely, set based,
topological, metric, directional, and other relationships. Treating spatial locations as set elements,
set relationships can be extended to spatial attributes. Topological relationships are
concerned with the relationships between spatial objects that are preserved under continuous
deformations. Metric relationships are defined between spatial elements in a metric space where
the notion of distance is well defined. Directional relationships are further divided into two types,
namely, absolute and object-relative. For example, north, south, and so on are absolute
direction relationships, which are defined in the context of a global reference system.
Object-relative directions are defined using the orientation of a given object, such as left and right. There
are also other relationships such as visibility, which identifies whether a spatial element is visible
from an observer point.
Table 1 Non-spatiotemporal and Spatial Relationships
Attributes            Relationships
Non-spatiotemporal    Arithmetic: sum, subtraction, larger than, etc.
                      Ordering: before, after, etc.
                      Set: union, subclass of, instance of, etc.
Spatial               Set: union, intersection, membership, etc.
                      Topological: meet, within, overlap, etc.
                      Metric: distance, area, perimeter, etc.
                      Directional: above, northeastern, etc.
                      Others: visibility, etc.
Different spatiotemporal relationships have been proposed for different spatiotemporal data
types. For temporal snapshots of object model data, spatiotemporal predicates (Erwig and
Schneider (2002)) have been proposed for topological space, while trajectory distance (e.g., Hausdorff
distance (Chen et al. (2011))) is used for metric space. Changes across snapshots of the field model at local,
focal and zonal levels (Zhou et al. (2014)) and cubic map algebra (Mennis et al. (2005)) have been
studied on raster snapshots. For temporal snapshots of network models, relationships studied
include predecessor(s) and successor(s) on a Lagrangian path (Yang et al. (2014)), temporal
centrality (Habiba et al. (2007)), and network flow. Spatiotemporal coupling (i.e., within
geographic and temporal proximity) (Celik et al. (2006); Huang et al. (2008); Mohan et al.
(2012)) and spatiotemporal covariance structure (Gelfand et al. (2010)) are good examples of
relationships in the event and process model.
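As an illustration of a metric-space relationship between trajectories, the Hausdorff distance mentioned above can be sketched as follows. This is a minimal version that treats each trajectory as an unordered set of sampled 2-D points and ignores timestamps; the sample trajectories and function names are assumptions made for the example, not part of the cited work.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two hypothetical trajectories sampled as (x, y) points.
traj_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
traj_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]

d_ab = directed_hausdorff(traj_a, traj_b)  # every point of a is 1 unit from b
d_ba = directed_hausdorff(traj_b, traj_a)  # (3, 1) is sqrt(2) from (2, 0)
d = hausdorff(traj_a, traj_b)
```

The two directed distances differ because traj_b extends past the end of traj_a, which is why the symmetric version takes the maximum of both directions.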
3 Statistical Foundations
This section provides a taxonomy of common statistical concepts for different spatial and
spatiotemporal data types. Spatial statistics is a branch of statistics specialized to model events
and phenomena that evolve within a spatial framework. Spatial and spatiotemporal statistics are
distinct from classical statistics due to the unique characteristics of space and time. One important
property of spatial data is spatial dependency, a property so fundamental that geographers have
elevated it to the status of the first law of geography: “Everything is related to everything else, but
nearby things are more related than distant things” (Tobler (1979)). Spatial dependency is
measured using spatial autocorrelation. Dependency along the temporal dimension is also a topic
studied in spatial and spatiotemporal statistics. Other important properties include spatial
heterogeneity and temporal non-stationarity, as well as the multiple scale effect. When spatial
heterogeneity is present in the data, the distribution of spatial events and the interactions between them
may vary drastically by location. Temporal non-stationarity extends spatial heterogeneity into
the temporal dimension.
3.1 Spatial Statistics for Different Types of Spatial Data
There exists a variety of data models to represent spatial events and processes. Accordingly,
spatial statistics can be categorized according to their underlying spatial data type: Geostatistics
for point referenced data, lattice statistics for areal data, spatial point process for spatial point
patterns, and spatial network statistics.
Geostatistics: Geostatistics is concerned with point-referenced data. Point-referenced data contain
a set of points with fixed locations and a set of attribute values. The goal of geostatistics is to
model the distribution of the attribute values and make predictions at unsampled locations.
Spatial point-referenced data have three characteristics studied by geostatistics (Cressie (2015)),
including spatial continuity (i.e., dependence across locations), weak stationarity (i.e., some
statistical properties not changing with locations) and isotropy (i.e., uniformity in all directions).
Under the assumption of weak stationarity or intrinsic stationarity, spatial dependence at various
distances can be captured by a covariance function or a semivariogram (Banerjee et al. (2014)).
Geostatistics provides a set of statistical tools, such as Kriging (Banerjee et al. (2014)), which can
be used to interpolate attributes at unsampled locations. However, real world spatial data often
show inherent variation in measurements of a relationship over space, due to influences of
spatial context on the nature of spatial relationships. For example, human behavior can vary
intrinsically over space (e.g., differing cultures). Different jurisdictions tend to produce different
laws (e.g., speed limit differences between Minnesota and Wisconsin). This effect is called spatial
heterogeneity or non-stationarity. Special models (e.g., local space-time Kriging (Gething et al.
(2007))) can be further used to reflect the varying functions at different locations by assigning
higher weights to neighboring points to reduce the effect of heterogeneity.
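To make the semivariogram concrete, the sketch below computes the classical empirical estimator, which for each distance bin averages half the squared difference of attribute values over all point pairs whose separation falls in that bin. The binning scheme (equal-width bins [k*w, (k+1)*w)), the sample data, and the function name are assumptions for illustration; fitting a variogram model and Kriging itself are not shown.

```python
import math
from itertools import combinations

def empirical_semivariogram(points, values, bin_width, n_bins):
    """Classical estimator: gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)),
    where N(h) is the number of pairs whose distance falls in bin h.
    Returns one value per bin, or None for bins with no pairs."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        k = int(d // bin_width)
        if k < n_bins:
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]

# Hypothetical point-referenced data: four samples along a line with a linear trend.
sample_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
sample_values = [0.0, 1.0, 2.0, 3.0]

gamma = empirical_semivariogram(sample_points, sample_values, bin_width=1.0, n_bins=4)
# gamma == [None, 0.5, 2.0, 4.5]: dissimilarity grows with separation distance
```

A semivariogram that increases with distance, as here, is the signature of spatial dependence: nearby observations are more alike than distant ones.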
Lattice statistics: Lattice statistics studies statistics for spatial data in the field (or areal) model.
Here a lattice refers to a countable collection of regular or irregular cells in a spatial framework.
The range of spatial dependency among cells is reflected by a neighborhood relationship, which
can be represented by a contiguity matrix called a W-matrix. A spatial neighborhood relationship
can be defined based on spatial adjacency (e.g., rook or queen neighborhoods) or Euclidean
distance, or in more general models, cliques and hypergraphs (Warrender and Augusteijn (1999)).
Based on a W-matrix, spatial autocorrelation statistics can be defined to measure the correlation
of a non-spatial attribute across neighboring locations. Common spatial autocorrelation statistics
include Moran’s I, Getis-Ord Gi*, Geary’s C, Gamma index Γ (Cressie (2015)), etc., as well as
their local versions called local indicators of spatial association (LISA) (Anselin (1995)). Several
spatial statistical models, including the spatial autoregressive model (SAR), conditional
autoregressive model (CAR), Markov random fields (MRF), as well as other Bayesian
hierarchical models (Banerjee et al. (2014)), can be used to model lattice data. Another important
issue is the modifiable areal unit problem (MAUP) (also called the multi-scale effect) (Openshaw
and Openshaw (1984)), an effect in spatial analysis whereby the results of the same analysis method
change with the aggregation scale. For example, analysis using data aggregated by state will
differ from analysis using data at the individual family level.
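As a small worked example of a W-matrix and a global autocorrelation statistic, the sketch below builds a binary rook-contiguity neighborhood on a regular grid and computes Moran's I, defined as (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, with S0 the sum of all weights. The grid layout, the unstandardized binary weights, and the function names are assumptions for illustration.

```python
def rook_neighbors(rows, cols):
    """Rook contiguity on a regular grid: cells that share an edge.
    Returns {cell_index: [neighbor indices]} in row-major order,
    i.e. a sparse form of a binary W-matrix."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            nbrs[r * cols + c] = [
                rr * cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
    return nbrs

def morans_i(values, nbrs):
    """Global Moran's I with binary weights w_ij = 1 for neighbors, else 0."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(js) for js in nbrs.values())  # sum of all weights
    num = sum(dev[i] * dev[j] for i, js in nbrs.items() for j in js)
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

w = rook_neighbors(4, 4)

# Left half low, right half high: similar values are adjacent.
clustered = [0.0, 0.0, 1.0, 1.0] * 4
# Checkerboard: every rook neighbor has the opposite value.
checkerboard = [float((r + c) % 2) for r in range(4) for c in range(4)]

i_clustered = morans_i(clustered, w)        # positive autocorrelation
i_checker = morans_i(checkerboard, w)       # strong negative autocorrelation
```

Swapping rook_neighbors for a queen or distance-based neighborhood changes only the W-matrix, not the statistic, which is why the choice of neighborhood is treated as a modeling decision in lattice statistics.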
Spatial point process: Unlike geostatistics, a spatial point process is concerned not with attribute
values but with the locations of points. The distribution of points is the main focus of spatial point
processes. Locations of a set of points can be generated based on different statistical assumptions
(e.g., random, clustered). The most common model assumed for a spatial point process is the
homogeneous Poisson process, which is also known as complete spatial randomness (CSR).
In CSR, the total number of points follows a Poisson distribution and each point is identically and
independently distributed in a pre-defined spatial domain. A variant of CSR is the binomial point
process, in which the only difference is a fixed total number of points. In many application
domains, CSR or the binomial point process is not an appropriate assumption since points may have
spatial autocorrelation or inhibition characteristics. In such cases, other specialized models should
be applied to better approximate the actual distribution. For spatial inhibition, the Poisson hardcore
process is widely used to generate a distribution that enforces mutual repulsion among points. For
spatial autocorrelation, the Matern cluster process can be chosen to reflect the clustering
characteristics. Similar cluster processes include the Poisson cluster process, Cox cluster process,
Neyman-Scott process, etc. One of the most well-known applications of spatial point processes is
spatial scan statistics (Kulldorff (1997)) in hotspot detection. In spatial scan statistics,
hotspots that arise by chance are removed through a statistical significance test under a null
hypothesis based on CSR. Similarly, Ripley’s K function, which estimates the overall degree of
clustering of a point distribution, also applies CSR as the base model in its null hypothesis (Ripley
(1976)). Although CSR is still the most widely used spatial point process for random point
distributions, other models (e.g., the Poisson hardcore process and Matern cluster process) are starting to be
adopted in data mining problems (e.g., colocation and segregation) for more robust and plausible
assumptions.
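The difference between CSR and the binomial point process can be sketched by simulation on a rectangular domain. In the sketch below, the Poisson count is drawn by counting exponential arrivals in a unit interval, and the inhibition example uses simple sequential rejection, which is a stand-in chosen for brevity and is not the Poisson hardcore process itself. All function names, the domain size, and the parameters are assumptions for illustration.

```python
import math
import random

def poisson_count(lam, rng):
    """Draw a Poisson(lam) count by counting rate-lam exponential arrivals in [0, 1]."""
    count, t = 0, rng.expovariate(lam)
    while t <= 1.0:
        count += 1
        t += rng.expovariate(lam)
    return count

def uniform_points(n, width, height, rng):
    """n independent, uniformly distributed points in [0, width] x [0, height]."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def simulate_csr(intensity, width, height, rng):
    """CSR: Poisson-distributed total count, then i.i.d. uniform locations."""
    n = poisson_count(intensity * width * height, rng)
    return uniform_points(n, width, height, rng)

def simulate_binomial(n, width, height, rng):
    """Binomial point process: same as CSR but with a fixed total count."""
    return uniform_points(n, width, height, rng)

def simulate_inhibition(n, width, height, min_dist, rng, max_tries=10000):
    """Sequential inhibition (illustrative only): reject any candidate closer
    than min_dist to an accepted point. May return fewer than n points."""
    points, tries = [], 0
    while len(points) < n and tries < max_tries:
        tries += 1
        candidate = (rng.uniform(0, width), rng.uniform(0, height))
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points

rng = random.Random(42)  # fixed seed so the sketch is repeatable
csr_points = simulate_csr(0.5, 10.0, 10.0, rng)        # count varies around 50
binomial_points = simulate_binomial(50, 10.0, 10.0, rng)  # count is exactly 50
inhibited_points = simulate_inhibition(50, 10.0, 10.0, 0.8, rng)
```

Simulated patterns of this kind are what significance tests against a CSR null hypothesis compare an observed pattern with.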
Spatial point processes: A spatial point process is a model for the spatial distribution of the points
in a point pattern. It differs from point reference data in that the random variables are locations.
Examples include positions of trees in a forest and locations of bird habitats in a wetland. One
basic type of point process is a homogeneous spatial Poisson point process (also called complete
spatial randomness, or CSR) (Gelfand et al. (2010)), where point locations are mutually
independent with the same intensity over space. However, real world spatial point processes often
show either spatial aggregation (clustering) or spatial inhibition instead of complete spatial
independence as in CSR. Spatial statistics such as Ripley’s K function (Marcon and Puech
(2009); Ripley (1976)), i.e., the average number of points within a certain distance of a given
point, normalized by the overall average intensity, can be used to test a point pattern against CSR. Moreover,
real world spatial point processes such as crime events often contain hotspot areas instead of
following homogeneous intensity across space. A spatial scan statistic (Kulldorff (1997)) can be
used to detect these hotspot patterns. It tests if the intensity of points inside a scanning window is
significantly higher (or lower) than outside. Though both the K-function and spatial scan statistics
have the same null hypothesis of CSR, their alternative hypotheses are quite different: the K-
function tests if points exhibit spatial aggregation or inhibition instead of independence, while
spatial scan statistics assume that points are independent and test if a hotspot with much higher
intensity exists. Finally, there are other spatial point processes such as the Cox process, in which
the intensity function itself is a random function over space, as well as a cluster process, which
extends a basic point process with a small cluster centered on each original point (Gelfand et al.
(2010)). For extended spatial objects such as lines and polygons, spatial point processes can be
generalized to line processes and flat processes in stochastic geometry (Chiu et al. (2013)).
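The K-function can be illustrated with a naive estimator, K(r) = A / (n (n - 1)) * (number of ordered pairs within distance r), where A is the area of the study region. Under CSR the theoretical value is pi * r^2, so estimates well above it suggest aggregation and estimates well below it suggest inhibition. The sketch below omits edge correction and significance testing, and the two sample patterns are hypothetical.

```python
import math

def ripley_k(points, r, area):
    """Naive Ripley's K estimate without edge correction:
    area * (ordered pairs within distance r) / (n * (n - 1))."""
    n = len(points)
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= r
    )
    return area * pairs / (n * (n - 1))

area = 100.0   # a 10 x 10 study region
r = 0.5
csr_expectation = math.pi * r ** 2  # theoretical K(r) under CSR

# Inhibited pattern: a regular lattice with unit spacing, so no pair is within 0.5.
regular = [(x + 0.5, y + 0.5) for x in range(10) for y in range(10)]

# Aggregated pattern: two tight clusters of ten points each.
clustered = [(1.0 + 0.01 * i, 1.0) for i in range(10)] + \
            [(8.0 + 0.01 * i, 8.0) for i in range(10)]

k_regular = ripley_k(regular, r, area)      # 0.0, below the CSR expectation
k_clustered = ripley_k(clustered, r, area)  # far above the CSR expectation
```

In practice the comparison with pi * r^2 is made over a range of r values and against simulated CSR patterns, since a single radius and an uncorrected estimator can be misleading near the region boundary.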
Spatial network statistics: Most spatial statistics research focuses on the Euclidean space. Spatial
statistics on the network space is much less studied. Spatial network space, e.g., river networks
and street networks, is important in applications of environmental science and public safety
analysis. However, it poses unique challenges including directionality and anisotropy of spatial
dependency, connectivity, as well as high computational cost. Statistical properties of random
fields on a network are summarized in (Guyon (1995)). Recently, several spatial statistics, such as
spatial autocorrelation, K-function, and Kriging, have been generalized to spatial networks
(Okabe et al. (1995), (2006); Okabe and Sugihara (2012)). Little research has been done on
spatiotemporal statistics on the network space.
3.2 Spatiotemporal Statistics
Spatiotemporal statistics merges spatial statistics and temporal statistics (Cressie and Wikle
(2015); Gelfand et al. (2010)). Similar to spatial data, temporal data also exhibit intrinsic
properties such as autocorrelation and heterogeneity. By incorporating the additional dimension
of time, the majority of the models in spatial statistics can be extended to
spatiotemporal statistics. This section summarizes common statistics for different spatiotemporal
data types, including spatial time series, spatiotemporal point process, and time series of lattice
(areal) data.
Spatial time series: Spatial statistics for point reference data have been generalized for
spatiotemporal data (Kyriakidis and Journel (1999)). Examples include spatiotemporal
stationarity, spatiotemporal covariance, spatiotemporal variograms, and spatiotemporal Kriging
(Cressie and Wikle (2015); Gelfand et al. (2010)). There is also temporal autocorrelation and
tele-coupling (high correlation across spatial time series at a long distance). Methods to model
spatiotemporal processes include physics-inspired models (e.g., stochastic differential
equations) (Cressie and Wikle (2015); Gelfand et al. (2010)) and hierarchical dynamic
spatiotemporal models (e.g., Kalman filtering) for data assimilation (Cressie and Wikle (2015);
Gelfand et al. (2010)).
Spatiotemporal point process: A spatiotemporal point process generalizes the spatial point
process by incorporating the factor of time. As with spatial point processes, there is a
spatiotemporal Poisson process, Cox process, and cluster process. There are also corresponding
statistical tests including a spatiotemporal K function and spatiotemporal scan statistics (Cressie
and Wikle (2015); Gelfand et al. (2010)).
Time series of lattice (areal) data: Similar to lattice statistics, there is spatial and temporal
autocorrelation, a SpatioTemporal Autoregressive Regression (STAR) model (Cressie and Wikle
(2015); Gelfand et al. (2010)), and Bayesian hierarchical models (Banerjee et al. (2014)). Other
spatiotemporal statistics include empirical orthogonal functions (EOF) analysis (principal
component analysis in geophysics), canonical-correlation analysis (CCA), and dynamic
spatiotemporal models (Kalman filter) for data assimilation (Cressie and Wikle (2015); Gelfand
et al. (2010)).
4 Output Pattern Families
4.1 Spatial and Spatiotemporal Anomaly Detection
Detecting spatial and spatiotemporal anomalies is useful in many applications including
transportation, ecology, homeland security, public health, climatology, and location-based
services. For example, spatiotemporal anomaly detection can be used to detect anomalous traffic
patterns from sensor observations on a highway road network. In this section we will review
techniques for spatial and spatiotemporal anomaly detection.
4.1.1 Problem definition
In data mining, anomaly (outlier) have been informally defined as the identification of items,
events or observations which do not conform to an expected pattern or be inconsistent with the
remainder of the dataset (Bamnett and Lewis (1994); Chandola et al. (2009)). Contrarily, a spatial
anomaly (Shekhar and Chawla (2003)) is a spatially referenced object whose non-spatial attribute
values are significantly different from the values of its neighborhood. The detection of anomalies
can lead to the discovery of useful knowledge and has a number of practical applications. For
instance, a road intersection where vehicle speed is much higher than the intersections around it
in rush hour is a spatial anomaly based on the non-spatial attribute vehicle speed. Similarly, a
spatiotemporal anomaly generalizes spatial anomalies by substituting a spatial neighborhood with
a spatiotemporal neighborhood. Figure 8 compares global anomalies with spatial anomalies.
Point G in Figure 8(a) is considered a global anomaly, but point S in Figure 8(a) is not,
according to the histogram in Figure 8(b). However, when compared with its spatial neighbors P
and Q, point S is significantly different and thus is a spatial anomaly.
Figure 8 Global anomalies (e.g., point G) versus spatial anomalies (e.g., point S).
4.1.2 Statistical foundation
Spatial statistics for spatial anomaly detection comprise two kinds of bi-partite multidimensional
tests: graphical tests, including variogram clouds (Haslett et al. (1991)) and Moran scatterplots
(Anselin (1995)), and quantitative tests, including scatterplots (Anselin (1994)) and neighborhood
spatial statistics (Shekhar and Chawla (2003)). A variogram is defined as the variance of the
difference between field values at two locations. Figure 9(a) shows a variogram cloud with
neighboring pairs. As can be seen, two pairs (P,S) and (Q,S) on the left-hand side lie above the
main group of pairs and are possibly related to spatial anomalies. The Moran scatterplot is inspired
by Moran’s I index. Its x axis is the Z score and its y axis is the weighted average of the
neighbors’ Z scores, so the slope of the regression line is Moran’s I index. As shown in Figure 9(b),
the upper left and lower right quadrants indicate spatial neighbors of dissimilar values: low values
surrounded by high-value neighbors (e.g., points P and Q), and high values surrounded by low
values (e.g., point S). The scatterplot (Figure 9(c)) fits a regression line between attribute values
and a neighborhood average of attribute values, and calculates the residuals of the data points from
the regression line. Spatial anomalies, e.g., point S, are detected when they have high absolute
residual values. Finally, as shown in Figure 9(d), the spatial statistic S(x) measures the extent of
discontinuity between the non-spatial attribute value at location x and the average attribute value
of x’s neighbors. A Z-test of S(x) can be used to detect spatial anomalies (e.g., point S).
Median-based Z-tests and iterative Z-tests have also been studied as alternatives to the Z-test. As
long as spatiotemporal neighborhoods are well defined, the spatial statistics for spatial anomaly
detection are also applicable to spatiotemporal anomalies.
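The neighborhood statistic S(x) and its Z-test can be illustrated with a minimal sketch. The function name, the threshold value, and the chain-shaped neighborhood used in the example are assumptions made for illustration only; they are not taken from the cited works.

```python
import numpy as np

def spatial_outliers(values, neighbors, theta=2.0):
    """Flag locations whose standardized S(x) exceeds theta in absolute value.

    values    : 1-D sequence with the non-spatial attribute f(x) of each location
    neighbors : neighbors[i] is the list of indices of the neighbors of location i
    theta     : Z-score threshold (illustrative choice, not prescribed by the text)
    """
    values = np.asarray(values, dtype=float)
    # S(x) = f(x) - average of f over the neighbors of x
    s = np.array([values[i] - values[nbrs].mean()
                  for i, nbrs in enumerate(neighbors)])
    # Standardize S(x) over all locations
    z = (s - s.mean()) / s.std()
    return np.flatnonzero(np.abs(z) > theta), z

# Twelve locations along a line; each location neighbors its adjacent locations.
vals = [2, 2, 2, 2, 2, 8, 2, 2, 2, 2, 2, 2]
nbrs = [[j for j in (i - 1, i + 1) if 0 <= j < len(vals)]
        for i in range(len(vals))]
flagged, z = spatial_outliers(vals, nbrs)
print(flagged)  # only location 5 is flagged
```

Replacing the mean with a median, or removing a detected outlier and recomputing, would give the median-based and iterative variants mentioned above.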
Figure 9 Spatial statistics for spatial anomaly detection (input data are the same as in Figure 8(a)):
(a) Variogram cloud. (b) Moran’s scatterplot. (c) Scatterplot. (d) Spatial statistic Z(s) test.
4.1.3 Spatial anomaly detection approaches
Visualization approaches use plots to reveal outlying objects. Common methods include
variogram clouds and Moran scatterplots, as introduced earlier.
Quantitative approaches can be classified further into distribution-based and distance-based
methods. In distribution-based methods, data are modelled with a distribution function (e.g.,
a Normal or Poisson distribution), and objects are characterized as outliers depending on
the statistical hypothesis that governs the model. Distance-based approaches are commonly
used when dealing with points, lines, polygons, locations, and spatial neighborhoods. A spatial
statistic is computed as the difference between the non-spatial attribute of the current location and
that of the neighborhood aggregate (Shekhar, Lu, et al. (2003)). Spatial neighborhoods can be
identified by distances on spatial attributes (e.g., K nearest neighbors), or by graph connectivity
(e.g., locations on road networks). This research has been extended in a number of ways to allow
for multiple non-spatial attributes (Chen et al. (2008)), average and median attribute value (Lu et
al. (2003a)), weighted spatial outliers (Kou et al. (2006)), categorical spatial outlier (Liu et al.
(2014)), local spatial outliers (Schubert et al. (2014)), and fast detection (Wu et al. (2010)).
4.1.4 Spatiotemporal anomaly detection approaches
The intuition behind spatiotemporal outlier detection is that outliers reflect “discontinuity” in
non-spatiotemporal attributes within a spatiotemporal neighborhood. Approaches can be summarized
according to the input data types.
Outliers in spatial time series: For spatial time series (on point reference data, raster data, as well
as graph data), basic spatial outlier detection methods, such as visualization-based approaches and
neighborhood-based approaches, can be generalized with a definition of spatiotemporal
neighborhoods. After a spatiotemporal neighborhood is defined, a spatiotemporal statistic
is computed as the difference between the non-spatial attribute of the current location
and that of the neighborhood aggregate (Chen et al. (2008); Lu et al. (2003b); McGuire et al.
(2014); Shekhar, Lu, et al. (2003)).
Flow Anomalies: Given a set of observations across multiple spatial locations on a spatial
network flow, flow anomaly discovery aims to identify dominant time intervals where the
fraction of time instants of significantly mismatched sensor readings exceeds the given
percentage-threshold. Flow anomaly discovery can be considered as detecting discontinuities or
inconsistencies of a non-spatiotemporal attribute within a neighborhood defined by the flow
between nodes, and such discontinuities are persistent over a period of time. A time-scalable
technique called SWEET (Smart Window Enumeration and Evaluation of persistent-Thresholds)
was proposed (Elfeky et al. (2006); Franke and Gertz (2008); Kang et al. (2008)) that utilizes
several algebraic properties in the flow anomaly problem to discover these patterns efficiently. To
account for flow anomalies across multiple locations, recent work (Kang et al. (2009)) defines a
teleconnected flow anomaly pattern and proposes a RAD (Relationship Analysis of Dynamic-
neighborhoods) technique to efficiently identify this pattern.
Anomalous moving object trajectories: Detecting spatiotemporal outliers from moving object
trajectories is challenging due to the high dimensionality and dynamic nature of trajectories.
A context-aware stochastic model has been proposed to detect anomalous movement patterns in
indoor device trajectories (Liu et al. (2012)). A method based on spatial deviation (distance)
has been proposed for anomaly monitoring over moving object trajectory streams (Bu et al.
(2009)). In this case, anomalies are defined as rare patterns with large spatial deviations from
normal trajectories in a certain temporal interval. A supervised approach called Motion-Alert has
also been proposed to detect anomalies among massive numbers of moving objects (Li et al.
(2006)). This approach first extracts motif features from moving object trajectories, then clusters
the features, and learns a supervised model to classify whether a trajectory is an anomaly. Other
techniques have been proposed to detect anomalous driving patterns from taxi GPS trajectories
(Chen et al. (2011); Ge et al. (2011); Zhang et al. (2011)).
4.2 Spatial and Spatiotemporal Associations, Tele-connections
Spatial and spatiotemporal coupling patterns represent spatiotemporal object types whose
instances often occur in close geographic and temporal proximity. Discovering various patterns of
spatial and spatiotemporal coupling and tele-coupling is important in applications related to
ecology, environmental science, public safety, and climate science. For example, identifying
spatiotemporal cascade patterns from crime event datasets can help police departments
understand crime generators in a city, and thus take effective measures to reduce crime events. In
this section, we will review techniques for identifying spatial and spatiotemporal association as
well as tele-connections. The section starts with the basic spatial association (or co-location)
pattern, and moves on to spatiotemporal association (i.e., spatiotemporal co-occurrence, cascade,
and sequential patterns) as well as spatiotemporal teleconnection.
4.2.1 Problem definition
Spatial association patterns, also known as spatial co-location patterns (Huang et al. (2004)),
represent subsets of spatial event types whose instances are often located in close geographic
proximity. Real-world examples include symbiotic species, e.g., the Nile Crocodile and Egyptian
Plover in ecology. Similarly, spatiotemporal association patterns represent spatiotemporal object
types whose instances often occur in close geographic and temporal proximity. Figure 10 shows an
example of spatial co-location patterns. In this dataset, there exist instances of several Boolean
spatial features, each represented by a distinct shape. A careful review reveals two co-location
patterns, {‘+’, ‘×’} and {‘o’, ‘*’}, whose features tend to be located together.
Figure 10 Illustration of Point Spatial Co-location Patterns. Shapes represent different spatial
feature types. Spatial features in sets {‘+’, ‘×’} and {‘o’, ‘*’} tend to be located together
Mining patterns of spatial and spatiotemporal association is challenging due to the following
reasons: first, there is no explicit transaction in continuous space and time; second, there is
potential for over-counting; third, the number of candidate patterns is exponential, and a trade-off
between statistical rigor of output patterns and computational efficiency has to be made.
4.2.2 Statistical foundation
The underlying statistic for spatial and spatiotemporal coupling patterns is the cross-K function,
which generalizes the basic Ripley’s K function (introduced in Section 3) for multiple event
types.
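As a rough illustration of the cross-K function, the sketch below uses the naive estimator K_ij(h) = |A| / (n_i n_j) × (number of type-i/type-j pairs within distance h), where |A| is the area of the study region. Edge correction is omitted, and the function and variable names are hypothetical. Under independence of the two event types, the value would be expected to be close to πh².

```python
import numpy as np

def cross_k(points_i, points_j, h, area):
    """Naive cross-K estimate for two event types (no edge correction).

    points_i, points_j : arrays of shape (n_i, 2) and (n_j, 2)
    h                  : distance at which the function is evaluated
    area               : area of the study region
    """
    points_i = np.asarray(points_i, dtype=float)
    points_j = np.asarray(points_j, dtype=float)
    # Pairwise distances between every type-i event and every type-j event
    d = np.linalg.norm(points_i[:, None, :] - points_j[None, :, :], axis=2)
    return area * np.count_nonzero(d <= h) / (len(points_i) * len(points_j))

# Two type-i events, each with one type-j event nearby, in a 10 x 10 region.
k = cross_k([[0, 0], [10, 10]], [[0, 1], [10, 9], [5, 5]], h=1.5, area=100.0)
print(k, np.pi * 1.5 ** 2)  # estimate versus the value expected under independence
```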
4.2.3 Spatial association detection approaches
Mining spatial co-location patterns can be done via two categories of approaches: approaches that
use spatial statistics and approaches that use association rule mining primitives. Spatial
statistics-based approaches utilize statistical measures such as the cross-K function with Monte
Carlo simulation (Cressie (2015)), mean nearest neighbor distance, and spatial regression models
(Chou (1997)). However, these approaches are computationally expensive due to the exponential
number of candidate patterns. In contrast, association rule-based approaches focus on the creation
of transactions over space so that an Apriori-like algorithm can be used. Within this category, there
are transaction-based approaches and distance-based approaches. A transaction-based approach
defines transactions over space (e.g., around instances of a reference feature) and then uses an
Apriori-like algorithm (Koperski and Han (1995)). However, in the spatial co-location pattern
mining problem, transactions are often not explicit. Force-fitting the notion of a transaction into a
continuous spatial framework leads to the loss of implicit spatial relationships across the
boundaries of these transactions, as illustrated in Figure 11. As shown in Figure 11(a), there are
three feature types in this dataset: A, B, and C, each of which has two instances. The neighbor
relationships between instances are shown as edges. Co-locations (A, B) and (B, C) may be
considered frequent in this example. Figure 11(b) shows transactions created by choosing C as
the reference feature. As co-location (A, B) does not involve the reference feature, it will not be
found. Figure 11(c) shows two possible partitions for the data set of Figure 11(a), along with the
supports for co-location (A, B); in this case, the support measure is order sensitive and may also
miss co-location (A, B). A distance-based approach either defines distance-based patterns called
k-neighboring class sets (Morimoto (2001)) or uses an event-centric model (Huang et al. (2004))
based on a participation index, which is an upper bound of the cross-K function statistic
and has an anti-monotone property. Recently, approaches have been proposed to identify
co-locations for extended spatial objects (Xiong et al. (2004)) or rare events (Huang, Pei, et al.
(2006)), regional co-location patterns (Ding et al. (2011); Wang et al. (2013)) (i.e., patterns that
are significant only in a subregion), statistically significant co-locations (Barua and Sander
(2014)), as well as fast algorithms (Yoo and Shekhar (2006)).
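The participation index used by the event-centric model can be sketched as follows. This is a brute-force illustration under stated assumptions: a pattern instance is taken to be one instance of each feature with all members within a distance threshold of one another, and the function name, data layout, and threshold are hypothetical. Published algorithms avoid this exhaustive enumeration by exploiting the anti-monotone property.

```python
from itertools import product
from math import dist

def participation_index(instances, pattern, radius):
    """Brute-force participation index of a candidate co-location pattern.

    instances : dict mapping feature name -> list of (x, y) instance locations
    pattern   : tuple of feature names, e.g. ("A", "B")
    radius    : neighbor distance threshold
    """
    participating = {f: set() for f in pattern}
    # Enumerate one instance per feature and keep combinations that form a clique
    for combo in product(*(range(len(instances[f])) for f in pattern)):
        pts = [instances[f][i] for f, i in zip(pattern, combo)]
        if all(dist(p, q) <= radius
               for k, p in enumerate(pts) for q in pts[k + 1:]):
            for f, i in zip(pattern, combo):
                participating[f].add(i)
    # Participation ratio: fraction of a feature's instances that take part
    ratios = {f: len(participating[f]) / len(instances[f]) for f in pattern}
    # Participation index: the smallest participation ratio in the pattern
    return min(ratios.values()), ratios

data = {"A": [(0, 0), (5, 5)], "B": [(0, 1), (9, 9)]}
pi, ratios = participation_index(data, ("A", "B"), radius=1.5)
print(pi, ratios)  # 0.5 {'A': 0.5, 'B': 0.5}
```

Because adding a feature to a pattern can only shrink the set of participating instances, the index never increases as a pattern grows, which is what allows Apriori-style pruning.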
Figure 11. Example to illustrate different approaches to discovering co-location patterns: (a)
Example data set; neighboring instances of different features are connected. (b) Reference
feature-centric model (reference feature = C): transactions {{B1}, {B2}}, support(A, B) = φ.
(c) Data partition approach: the support measure is ill-defined and order sensitive
(support(A, B) = 1 or support(A, B) = 2, depending on the partition). (d) Event-centric model.
4.2.4 Spatiotemporal association, tele-connection detection approaches
Spatiotemporal coupling patterns can be categorized according to whether there exists temporal
ordering of object types: spatiotemporal (mixed drove) co-occurrences (Celik et al. (2008)) are
used for unordered patterns, spatiotemporal cascades (Mohan et al. (2012)) for partially ordered
patterns, and spatiotemporal sequential patterns (Huang et al. (2008)) for totally ordered patterns.
Spatiotemporal tele-connection (Zhang et al. (2003a)) is the pattern of significantly positive or
negative temporal correlation between spatial time series data at a great distance. The following
paragraphs categorize common computational approaches for discovering spatiotemporal
couplings by input data type.
Mixed Drove Spatiotemporal Co-Occurrence Patterns (MDCOPs) represent subsets of two or more
different object-types whose instances are often located in spatial and temporal proximity.
Discovering MDCOPs is potentially useful in identifying tactics in battlefields and games,
understanding predator-prey interactions, and in transportation (road and network) planning
(Güting and Schneider (2005); Koubarakis et al. (2003)). However, mining MDCOPs is
computationally very expensive because the interest measures are computationally complex,
datasets are larger due to the archival history, and the set of candidate patterns is exponential in
the number of object-types. Recent work has produced a monotonic composite interest measure
for discovering MDCOPs, together with novel MDCOP mining algorithms (Celik et al. (2006),
(2008)). A filter-and-refine approach has also been proposed to identify spatiotemporal
co-occurrence on extended spatial objects (Pillai et al. (2013)).
A spatiotemporal sequential pattern is a sequence of spatiotemporal event types in the form of f1
→ f2 → ... → fk. It represents a “chain reaction” from event type f1 to event type f2 and then to
event type f3 until it reaches event type fk. A spatiotemporal sequential pattern differs from a
colocation pattern in that it has a total order of event types. Such patterns are important in
applications such as epidemiology where some disease transmission may follow paths between
several species through spatial contacts. Mining spatiotemporal sequential patterns is challenging
due to the lack of statistically meaningful measures as well as high computation cost. A measure
of sequence index, which can be interpreted by K-function statistics, was proposed in (Huang et
al. (2008); Huang, Zhang, et al. (2006)), together with computationally efficient algorithms. Other
works have investigated spatiotemporal sequential patterns from data other than spatiotemporal
events, such as moving object trajectories (Cao et al. (2005); Li et al. (2013); Verhein (2009)).
Cascading spatio-temporal patterns: Partially ordered subsets of event-types whose instances are
located together and occur in stages are called cascading spatio-temporal patterns (CSTP). In the
domain of public safety, events such as bar closings and football games are considered generators
of crime. Preliminary analysis revealed that football games and bar closing events do indeed
generate CSTPs. CSTP discovery can play an important role in disaster planning, climate change
science (Frelich and Reich (2010); Mahoney et al. (2003)) (e.g., understanding the effects of
climate change and global warming) and public health (e.g., tracking the emergence, spread and
re-emergence of multiple infectious diseases (Morens et al. (2004))). A statistically meaningful
metric was proposed to quantify interestingness and computational pruning strategies were
proposed to make the pattern discovery process more computationally efficient (Huang et al.
(2008); Mohan et al. (2012)).
Spatial time series and tele-connection: Given a collection of spatial time series at different
locations, teleconnection discovery aims to identify pairs of spatial time series whose correlation
is above a given threshold. Tele-connection patterns are important in understanding oscillations in
climate science. Computational challenges arise from the length of the time series and the large
number of candidate pairs. An efficient index structure, called a
cone-tree, as well as a filter-and-refine approach (Zhang et al. (2003a), (2003b)) have been
proposed which utilize spatial autocorrelation of nearby spatial time series to filter out redundant
pair-wise correlation computation. Another challenge is spurious “high correlation” pairs of
locations that happen by chance. Recently, statistical significance tests have been proposed to
identify statistically significant tele-connection patterns called dipoles from climate data (Kawale
et al. (2012)). The approach uses a “wild bootstrap” to capture the spatio-temporal dependencies,
and takes account of the spatial autocorrelation, the seasonality and the trend in the time series
over a period of time.
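The basic tele-connection problem statement can be written down as a brute-force baseline. The sketch below computes all pairwise Pearson correlations and keeps pairs whose absolute correlation reaches a threshold. The use of the absolute value (to capture both positive and negative correlation), the function name, and the synthetic series are assumptions for illustration. It includes none of the cone-tree filtering, minimum-distance constraints, or significance testing discussed above.

```python
import numpy as np

def teleconnection_pairs(series, threshold):
    """Return (i, j, correlation) for all location pairs with |correlation| >= threshold.

    series    : array of shape (n_locations, n_timesteps)
    threshold : minimum absolute Pearson correlation
    """
    corr = np.corrcoef(series)                 # n_locations x n_locations
    i, j = np.triu_indices(corr.shape[0], k=1)  # each unordered pair once
    keep = np.abs(corr[i, j]) >= threshold
    return [(int(a), int(b), float(corr[a, b])) for a, b in zip(i[keep], j[keep])]

t = np.arange(50, dtype=float)
series = np.vstack([t,              # location 0: increasing trend
                    -2.0 * t + 3.0, # location 1: perfectly anti-correlated with 0
                    t % 2])         # location 2: alternating signal
pairs = teleconnection_pairs(series, threshold=0.9)
print(pairs)  # only the pair (0, 1) passes, with correlation close to -1
```

The cost of this baseline grows quadratically with the number of locations, which is the motivation for the filter-and-refine strategies cited above.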
4.3 Spatial and Spatiotemporal Prediction
Spatial and spatiotemporal prediction is widely used to classify remote sensing images into
dynamic land cover maps and detect changes. Climate scientists use collections of climate
variables to project future trends in global or regional temperature. In this section, we will review
the definition of spatial and spatiotemporal prediction, followed by computational approaches
organized according to their input data types.
4.3.1 Problem definition
Given spatial and spatiotemporal data items, with a set of explanatory variables (also called
explanatory attributes or features) and a dependent variable (also called the target variable), the
spatial and spatiotemporal prediction problem aims to learn a model that can predict the
dependent variable from the explanatory variables. When the dependent variable is discrete, the
problem is called spatial and spatiotemporal classification. When the dependent variable is
continuous, the problem is spatial and spatiotemporal regression. One example of a spatial and
spatiotemporal classification problem is remote sensing image classification over temporal
snapshots (Almeida et al. (2007)), where the explanatory variables consist of various spectral
bands or channels (e.g., blue, green, red, infra-red, thermal, etc.) and the dependent variable is a
thematic class such as forest, urban, water, and agriculture. Examples of spatial and
spatiotemporal regression include yearly crop yield prediction (Little et al. (2008)), and daily
temperature prediction at different locations. Regression can also be used in inverse estimation,
that is, given that we have an observed value of y, we want to determine the corresponding x
value.
The unique challenges of spatial and spatiotemporal prediction come from the special
characteristics of spatial and spatiotemporal data, which include spatial and temporal
autocorrelation, spatial heterogeneity and temporal non-stationarity, as well as the multi-scale
effect. These unique characteristics violate the common assumption in many traditional prediction
techniques that samples follow an identical and independent distribution (i.i.d.). Simply applying
traditional prediction techniques without incorporating these unique characteristics may produce
hypotheses or models that are inaccurate or inconsistent with the data set. Additionally, a subtle
but equally important reason is related to the choice of the objective function to measure
classification accuracy. For a two-class problem, the standard way to measure classification
accuracy is to calculate the percentage of correctly classified objects. However, this measure may
not be the most suitable in a spatial context. Spatial accuracy, that is, how far the predictions
are from the actual locations, is important in some applications such as ecology
due to the effects of discretizing a continuous wetland into discrete pixels, as shown in
Figure 12. Figure 12(a) shows the actual locations of nests and (b) shows the pixels with actual
nests. Note the loss of information during the discretization of continuous space into pixels. Many
nest locations barely fall within the pixels labeled ‘A’ and are quite close to other blank pixels,
which represent ‘no-nest’. Now consider two predictions shown in Figure 12(c) and (d). Domain
scientists prefer prediction Figure 12(d) over (c), as the predicted nest locations are closer on
average to some actual nest locations. The classification accuracy measure cannot distinguish
between Figure 12(c) and (d), and a measure of spatial accuracy is needed to capture this
preference.
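The difference between the two measures can be made concrete with a small sketch. Here spatial accuracy is approximated by the average distance from each actual nest pixel to the nearest predicted nest pixel. This particular definition, the function names, and the toy grids are assumptions for illustration and do not reproduce Figure 12.

```python
import numpy as np

def classification_accuracy(actual, predicted):
    """Fraction of pixels whose predicted label equals the actual label."""
    return float((actual == predicted).mean())

def avg_distance_to_nearest_prediction(actual, predicted):
    """Mean distance from each actual nest pixel to its nearest predicted nest pixel."""
    a = np.argwhere(actual)      # (row, col) of pixels with actual nests
    p = np.argwhere(predicted)   # (row, col) of pixels with predicted nests
    d = np.linalg.norm(a[:, None, :] - p[None, :, :], axis=2)
    return float(d.min(axis=1).mean())

actual = np.zeros((4, 4), dtype=int)
actual[0, 0] = actual[0, 1] = 1

far = np.zeros((4, 4), dtype=int)   # predictions far from the actual nests
far[3, 2] = far[3, 3] = 1
near = np.zeros((4, 4), dtype=int)  # predictions adjacent to the actual nests
near[1, 0] = near[1, 1] = 1

for name, pred in (("far", far), ("near", near)):
    print(name,
          classification_accuracy(actual, pred),
          avg_distance_to_nearest_prediction(actual, pred))
```

Both predictions misclassify the same number of pixels, so their classification accuracy is identical (0.75), while the distance-based measure is smaller for the prediction that places nests next to the actual ones.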
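Figure 12. The choice of objective function and spatial accuracy. (a) The actual locations of nests.
(b) Pixels with actual nests. (c), (d) Locations predicted by two different models. Legend:
A = actual nest in pixel; P = predicted nest in pixel.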
4.3.2 Statistical foundations
Spatial and spatiotemporal prediction techniques are developed based on spatial and
spatiotemporal statistics, including spatial and temporal autocorrelation, spatial heterogeneity,
temporal non-stationarity, and multiple areal unit problems (MAUP) (see Section 3).
4.3.3 Spatial prediction approaches
Several previous studies (Jhung and Swain (1996); Solberg et al. (1996)) have shown that the
modeling of spatial dependency (often called context) during the classification or regression
process improves overall accuracy. Spatial context can be defined by the relationships between
spatially adjacent spatial units in a small neighborhood. Three supervised learning techniques for
classification and regression that model spatial dependency are: (1) Markov random field (MRF)
based classifiers; (2) spatial autoregressive regression (SAR) model; and (3) spatial decision tree.
Markov random field-based Bayesian classifiers: Maximum likelihood classification (MLC) is
one of the most widely used parametric and supervised classification techniques in the field of
remote sensing (Hixson et al. (1980); Strahler (1980)). However, MLC is a per-pixel
classifier and assumes that samples are i.i.d. Ignoring spatial autocorrelation results in
salt-and-pepper noise in the classified images. One solution is to use MRF-based Bayesian
classifiers (Li (2009)) to model spatial context via the a priori term in Bayes’ rule. This uses a
set of random variables whose interdependency relationship is represented by an undirected
graph (i.e., a symmetric neighborhood matrix).
Spatial autoregressive regression model (SAR): SAR is one of the commonly used
autoregressive models for spatial data regression (Shekhar and Xiong (2007)). In the spatial
autoregressive regression model, the spatial dependencies of the error term, or the dependent
variable, are directly modeled in the regression equation using a contiguity matrix (Anselin
(2013)). An example of a contiguity matrix is shown in Figure 13. Based on this model, the
spatial autoregressive regression can be written as:
𝑌 = 𝜌𝑊𝑌 + 𝑋𝛽 + 𝜖,
where
𝑌 Observation or dependent variable,
𝑋 Independent variables,
𝜌 Spatial autoregressive parameter that reflects the strength of the spatial dependencies
between the elements of the dependent variable via the logistic function for binary dependent
variables,
𝑊 Contiguity matrix,
𝛽 Regressive coefficient,
𝜖 Unobservable error term.
When 𝜌 = 0, this model is reduced to the ordinary linear regression equation.
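Rearranging the equation gives Y = (I − ρW)⁻¹(Xβ + ε), which the following sketch evaluates for the row-normalized contiguity matrix of the four-cell framework in Figure 13(c). The covariate values, coefficients, and function name are made up for illustration. Parameter estimation (e.g., by maximum likelihood) is not shown.

```python
import numpy as np

# Row-normalized contiguity matrix for cells A, B, C, D (Figure 13(c))
W = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.5, 0.0, 0.0, 0.5],
              [0.5, 0.0, 0.0, 0.5],
              [0.0, 0.5, 0.5, 0.0]])

def sar_solve(X, beta, rho, W, eps=None):
    """Solve Y = rho * W @ Y + X @ beta + eps for Y."""
    rhs = X @ beta if eps is None else X @ beta + eps
    # (I - rho * W) Y = X beta + eps
    return np.linalg.solve(np.eye(W.shape[0]) - rho * W, rhs)

# Hypothetical design matrix: intercept plus one covariate per cell
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
beta = np.array([1.0, 2.0])

print(sar_solve(X, beta, rho=0.0, W=W))  # equals X @ beta: ordinary linear regression
print(sar_solve(X, beta, rho=0.5, W=W))  # each value is pulled toward its neighbors' values
```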
(a) Spatial framework
      A  B
      C  D
(b) Neighbor relationship
         A    B    C    D
    A    0    1    1    0
    B    1    0    0    1
    C    1    0    0    1
    D    0    1    1    0
(c) Contiguity matrix
         A    B    C    D
    A    0    0.5  0.5  0
    B    0.5  0    0    0.5
    C    0.5  0    0    0.5
    D    0    0.5  0.5  0
Figure 13. A spatial framework and its four-neighborhood contiguity matrix.
Spatial decision trees: Decision tree classifiers have been widely used for image classification
(e.g., remote sensing image classification) but they assume an i.i.d. distribution and produce
significant salt-and-pepper noise. Several works propose spatial feature selection heuristics such
as spatial entropy or spatial information gain to select a tree node test that favors spatial
autocorrelation structure while splitting training samples (Li and Claramunt (2006); Stojanova et
al. (2013)). Recently, a focal-test-based spatial decision tree has been proposed (Jiang et al. (2013),
(2015)), in which the tree traversal direction of a sample is based on both local and focal
(neighborhood) information. Focal information is measured by local spatial autocorrelation
statistics in a tree node test.
One limitation of the spatial autoregressive model (SAR) is that it does not account for the
underlying spatial heterogeneity that is natural in geographic spaces. Thus, in a SAR model,
coefficients β of covariates and the model errors are assumed to be uniform throughout the entire
geographic space. One proposed method to account for spatial variation in model parameters and
errors is Geographically Weighted Regression (GWR) (Fotheringham et al. (2003)).
The regression equation of GWR is 𝑦 = 𝑥𝛽(𝑠) + 𝜖(𝑠), where 𝛽(𝑠) and 𝜖(𝑠) represent the
spatially varying parameters and errors, respectively. GWR has the same structure as standard
linear regression, with the exception that the parameters are spatially varying. It also assumes that
samples at nearby locations have higher influence on the parameter estimation of a current
location.
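One common way to realize this idea is locally weighted least squares, sketched below for a single target location: β(s) = (XᵀW(s)X)⁻¹XᵀW(s)y, where W(s) is a diagonal matrix of distance-decay weights. The Gaussian kernel, the fixed bandwidth, and all names in the sketch are illustrative assumptions. Bandwidth selection is not shown.

```python
import numpy as np

def gwr_coefficients(coords, X, y, target, bandwidth):
    """Estimate local regression coefficients beta(s) at one target location.

    coords    : (n, 2) sample locations
    X         : (n, p) design matrix (include a column of ones for an intercept)
    y         : (n,) dependent variable
    target    : (2,) location s at which coefficients are estimated
    bandwidth : scale of the Gaussian distance-decay kernel (assumed, not tuned)
    """
    d = np.linalg.norm(coords - target, axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)   # nearby samples get larger weights
    Xw = X * w[:, None]                        # rows of X scaled by their weights
    # Weighted normal equations: (X^T W X) beta = X^T W y
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

# Samples along a transect; the slope is 1 in the west and 3 in the east.
rng = np.random.default_rng(0)
coords = np.column_stack([np.linspace(0.0, 10.0, 200), np.zeros(200)])
x = rng.normal(size=200)
X = np.column_stack([np.ones(200), x])
y = np.where(coords[:, 0] < 5.0, 1.0, 3.0) * x

print(gwr_coefficients(coords, X, y, np.array([1.0, 0.0]), bandwidth=1.0))
print(gwr_coefficients(coords, X, y, np.array([9.0, 0.0]), bandwidth=1.0))
```

The two calls return different slope estimates for the western and eastern targets, whereas a single global regression would report one compromise value.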
4.3.4 Spatiotemporal prediction approaches
Spatiotemporal Autoregressive Regression (STAR): STAR extends SAR by explicitly modeling the
temporal and spatiotemporal dependency across variables at different locations. More details can
be found in (Cressie (2015)).
Spatiotemporal Kriging: Kriging (Cressie (2015)) is a geostatistical technique to make predictions
at locations where observations are unknown, based on locations where observations are known.
In other words, Kriging is a spatial “interpolation” model. Spatial dependency is captured by the
spatial covariance matrix, which can be estimated through spatial variograms. Spatiotemporal
Kriging (Cressie and Wikle (2015)) generalizes spatial kriging with a spatiotemporal covariance
matrix and variograms. It can be used to make predictions from incomplete and noisy
spatiotemporal data.
Hierarchical Dynamic Spatiotemporal Models: Hierarchical dynamic spatiotemporal models
(DSMs) (Cressie and Wikle (2015)), as the name suggests, aim to model spatiotemporal
processes dynamically within a Bayesian hierarchical framework. At the top is a data model, which
represents the conditional dependency of (actual or potential) observations on the underlying
hidden process with latent variables. In the middle is a process model, which captures the
spatiotemporal dependency within the hidden process. At the bottom is a parameter model, which
captures the prior distributions of model parameters. DSMs have been widely used in climate
science and environmental science, e.g., for simulating population growth or atmospheric and
oceanic processes. For model inference, the Kalman filter can be used under the assumption of
linear and Gaussian models.
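To show the predict and update steps of the Kalman filter, the sketch below uses a scalar linear-Gaussian model at a single location. This is a deliberate simplification: in a spatiotemporal setting the state would be a vector over locations and the scalars would become matrices. The parameter values and names are assumptions chosen for illustration.

```python
import numpy as np

def kalman_filter_1d(observations, a=1.0, q=0.01, r=1.0, x0=0.0, p0=1.0):
    """Filter a scalar linear-Gaussian state-space model.

    Process model: x_t = a * x_{t-1} + w_t,  w_t ~ N(0, q)
    Data model:    z_t = x_t + v_t,          v_t ~ N(0, r)
    Returns the filtered state estimate after each observation.
    """
    x, p = x0, p0
    estimates = []
    for z in observations:
        # Predict: propagate the state estimate and its variance through the process model
        x, p = a * x, a * a * p + q
        # Update: blend the prediction with the new observation using the Kalman gain
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(1)
truth = 5.0
noisy = truth + rng.normal(scale=1.0, size=300)
filtered = kalman_filter_1d(noisy)
print(noisy[-5:], filtered[-5:])  # filtered values fluctuate less than the raw observations
```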
4.4 Spatial and Spatiotemporal Partitioning (Clustering) and Summarization
Spatial and spatiotemporal partitioning and summarization are important in many societal
applications such as public safety, public health, and environmental science. For example,
partitioning and summarizing crime data, which is spatial and temporal in nature, helps law
enforcement agencies find trends of crimes and effectively deploy their police resources (Levine
(2013)).
4.4.1 Problem definition
Spatial and spatiotemporal partitioning, or spatial and spatiotemporal clustering, is the process of
grouping similar spatial or spatiotemporal data items, and thus partitioning the underlying space
and time (Kisilevich et al. (2009)). It is important to note that spatial and spatiotemporal
partitioning or clustering is closely related to, but not the same as hotspot detection. Hotspots can
be considered as special clusters such that events or activities inside a cluster have much higher
intensity than outside.
Spatial and spatiotemporal summarization aims to provide a compact representation of
spatiotemporal data. For example, traffic accident events on a road network can be summarized
into several main routes that cover most of the accidents. Spatial and spatiotemporal
summarization is often done after or together with spatiotemporal partitioning so that objects in
each partition can be summarized by aggregated statistics or representative objects.
4.4.2 Spatial partitioning and summarization approaches
The data mining and machine learning literature has explored a large number of partitioning
algorithms, which can be classified into three groups.
Hierarchical clustering methods start either with all patterns in a single cluster or with each
pattern as its own cluster, and successively perform splitting or merging until a stopping criterion
is met. This results in a tree of clusters, called a dendrogram. The dendrogram can be cut at
different levels to yield desired clusters. Well-known
hierarchical clustering algorithms include balanced iterative reducing and clustering using
hierarchies (BIRCH), clustering using interconnectivity (Chameleon), clustering using
representatives (CURE), and robust clustering using links (ROCK) (Cervone et al. (2008);
Karypis et al. (1999); Zhang et al. (1996)).
Global partitioning based clustering algorithms start with an initial partition of the data and
iteratively reallocate data points among clusters until a stopping criterion is met. These methods
tend to find clusters of spherical shape. K-Means and K-Medoids are commonly used partitional
algorithms. Squared error is the most frequently used criterion function in partitional clustering.
Other algorithms in this category include partitioning around medoids (PAM), clustering
large applications (CLARA), clustering large applications based on randomized search
(CLARANS), and expectation-maximization (EM) (Hu et al. (2008); Ng and Han (2002)).
Density-based clustering algorithms try to find clusters based on the density of data points in a
region. These algorithms treat clusters as dense regions of objects in the data space. The density-
based clustering algorithms include density-based spatial clustering of applications with noise
(DBSCAN), ordering points to identify clustering structure (OPTICS), and density-based
clustering (DECODE) (Lai and Nguyen (2004); Ma and Zhang (2004); Neill and Moore (2004);
Pei et al. (2009)).
Spatial summarization often follows spatial partitioning. It is more difficult than non-spatial
summarization because of the non-numeric nature of spatial data. The objectives of
spatial summarization vary according to the clustering methods used. For example, centroids or
medoids may be summarization results from K-Means or K-Medoids. K main routes are identified
as a summary of activities in a spatial network by the K-Main-Routes algorithm (Oliver, Shekhar,
Zhou, et al. (2014)).
4.4.3 Spatiotemporal partitioning and summarization approaches
Based on the input data type, common spatiotemporal partitioning and summarization approaches are
as follows.
Spatiotemporal Event Partitioning: Some previously mentioned clustering algorithms for two-
dimensional space can be easily generalized to spatiotemporal scenarios. For example, ST-
DBSCAN (Birant and Kut (2007)) is a spatiotemporal extension of its spatial version called
DBSCAN (Ankerst et al. (1999); Ester et al. (1996)), and ST-GRID (Wang et al. (2006)) splits
the space and time into 3D cells and merges dense cells together into clusters.
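The idea of splitting space and time into 3D cells and merging dense cells can be sketched as follows. This is a simplified illustration, not the published ST-GRID algorithm: the cell sizes `cell_xy` and `cell_t`, the density threshold `min_count`, and the use of the 26-cell neighborhood for merging are all assumptions made here.

```python
from collections import Counter, deque
from itertools import product


def st_grid_clusters(events, cell_xy, cell_t, min_count):
    """Cluster (x, y, t) events by merging adjacent dense space-time cells.

    Returns a list of clusters, each a set of (ix, iy, it) cell indices.
    """
    # Count events per 3D cell (two spatial axes plus time).
    counts = Counter(
        (int(x // cell_xy), int(y // cell_xy), int(t // cell_t))
        for x, y, t in events
    )
    dense = {cell for cell, n in counts.items() if n >= min_count}
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        component, queue = set(), deque([start])
        while queue:  # breadth-first merge of neighboring dense cells
            cx, cy, ct = queue.popleft()
            component.add((cx, cy, ct))
            for dx, dy, dt in offsets:
                nb = (cx + dx, cy + dy, ct + dt)
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(component)
    return clusters
```

The choice of cell size in time relative to space controls how strongly temporal proximity counts toward a cluster, which is the main modeling decision in any grid-based spatiotemporal approach.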
Spatial Time-Series Partitioning: Spatial time series partitioning aims to divide the space into
regions such that the correlation or similarity between time series within the same regions is
maximized. Global partitioning based clustering algorithms such as K-Means, K-Medoids, and
EM can be applied. So can the hierarchical approaches. However, due to the high dimensionality
of spatial time series, density-based approaches are often not effective. When computing
similarity between spatial time series, a filter-and-refine approach (Zhang et al. (2003a)) can be
used to avoid redundant computation.
Trajectory Data Partitioning: Trajectory data partitioning aims to partition trajectories into groups
according to their similarity. Algorithms are of two types, i.e., density-based and frequency-based.
The density-based approaches (Lee et al. (2007)) first break trajectories into small
segments and then apply density-based clustering algorithms similar to DBSCAN to connect dense
areas of segments. The frequency-based approach (Lee et al. (2009)) uses association rule mining
(Agrawal et al. (1994)) algorithms to identify subsections of trajectories which have high
frequencies (also called high “support”).
Data summarization aims to find a compact representation of a data set (Chandola and Kumar
(2007)), which can be conducted on classical data, spatial data, as well as spatiotemporal data. It
is important for data compression as well as for making pattern analysis more convenient. For
spatial time series data, summarization can be done by removing spatial and temporal redundancy
due to the effect of autocorrelation. A family of such algorithms has been used to summarize
traffic data streams (Pan et al. (2010)). Similarly, the centroids from K-Means can also be used to
summarize spatial time series. For trajectory data, especially spatial network trajectories,
summarization is more challenging due to the huge cost of similarity computation. A recent
approach summarizes network trajectories into k primary corridors (Evans et al. (2012), (2013)).
The work proposes efficient algorithms to reduce the huge cost for network trajectory distance
computation.
4.5 Spatial and Spatiotemporal Hotspot Detection
The problem of detecting spatial and spatiotemporal hotspots has received a lot of attention and
has been widely studied and applied. For example, in epidemiology, finding disease hotspots allows
officials to detect an epidemic and allocate resources to limit its spread (Kulldorff (2015)).
4.5.1 Problem definition
Given a set of spatial and spatiotemporal objects (e.g., activity locations and times) in a study area,
spatial and spatiotemporal hotspots are regions, together with certain time intervals, where the number
of objects is anomalously or unexpectedly high within the time intervals. Spatial and
spatiotemporal hotspots are a special kind of clustered pattern whose inside has significantly
higher intensity than outside.
The challenge of spatial and spatiotemporal hotspot detection is brought about by the uncertainty
of the number, location, size, and shape of hotspots in a dataset. Additionally, ‘false’ hotspots that
aggregate events only by chance should be avoided, since these false hotspots impede
proper response by authorities and waste resources (e.g., police resources). Thus, it is often important to test the
statistical significance of candidate spatial or spatiotemporal hotspots.
4.5.2 Statistical foundation
Spatial scan statistics (Kulldorff (1997)) are used to detect statistically significant hotspots in
spatial datasets. The method uses a circle to scan the space for candidate hotspots and performs
hypothesis testing. The null hypothesis states that the activity points are distributed randomly
according to a homogeneous (i.e., same intensity) Poisson process over the geographical space. The
alternative hypothesis states that the inside of the circle has a higher intensity of activities than
the outside. A test statistic called the log likelihood ratio is computed for each candidate hotspot (or
circle), and the candidate with the highest likelihood ratio is evaluated using a significance
value (i.e., p-value). Spatiotemporal scan statistics extend spatial scan statistics by considering
an additional temporal dimension, and change the scanning window to a cylinder.
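As a worked form of the test statistic, one standard expression of the log likelihood ratio under the Poisson model is sketched below. The symbols are introduced here for illustration and are not notation from the text above: $c$ is the observed count inside a candidate window $Z$, $C$ is the total count in the study area $A$, and $E$ is the expected count inside $Z$ under the null hypothesis (for a homogeneous process, $E = C \cdot |Z| / |A|$).

```latex
\mathrm{LLR}(Z) =
\begin{cases}
  c \,\log\dfrac{c}{E} + (C - c)\,\log\dfrac{C - c}{C - E}, & \text{if } c > E, \\[1.5ex]
  0, & \text{otherwise.}
\end{cases}
```

The statistic is zero when the window holds no more points than expected, so only windows with an excess of activity compete for the highest ratio.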
In addition, local indicators of spatial association, including local Moran’s I, local Geary’s C, and
the Getis-Ord Gi and Gi* functions (Anselin (1995)), are also used to detect hotspots. In contrast to
global spatial autocorrelation measures, these functions are computed within the neighborhood of a
location. For example, the local Moran’s I statistic is given as:
$$I_i = \frac{x_i - \bar{x}}{S_i^2} \sum_{j=1,\, j \neq i}^{n} w_{i,j} (x_j - \bar{x})$$
where $x_i$ is an attribute for object $i$, $\bar{x}$ is the mean of the corresponding attribute, $w_{i,j}$ is the
spatial weight between objects $i$ and $j$, and
$$S_i^2 = \frac{\sum_{j=1,\, j \neq i}^{n} (x_j - \bar{x})^2}{n - 1}$$
with $n$ equal to the total number
of objects. A positive value for local Moran’s I indicates that an object has neighboring objects
with similarly high or low attribute values, so it may be a part of a cluster, while a negative value
indicates that an object has neighboring objects with dissimilar values, so it may be an outlier. In
either instance, the p-value for the object must be small enough for the cluster or outlier to be
considered statistically significant.
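The local Moran's I statistic given above can be computed directly from its definition. The sketch below assumes the attribute values are supplied as a list `x` and the spatial weights as a full n-by-n nested list `w`; both names are illustrative. It does not compute the p-value that is needed to judge statistical significance.

```python
def local_morans_i(x, w):
    """Local Moran's I for every object.

    x: list of attribute values, one per object.
    w: n-by-n nested list of spatial weights, w[i][j] between objects i and j.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]  # deviations from the global mean
    result = []
    for i in range(n):
        # S_i^2: variance term that leaves out object i itself.
        s2 = sum(dev[j] ** 2 for j in range(n) if j != i) / (n - 1)
        # Weighted sum of the neighbors' deviations.
        lag = sum(w[i][j] * dev[j] for j in range(n) if j != i)
        result.append(dev[i] / s2 * lag)
    return result
```

With binary adjacency weights, an object surrounded by similar values yields a positive score, while a high value surrounded by low values yields a negative score.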
4.5.3 Spatial hotspot detection approaches
Generally, there are two categories of algorithms for spatial hotspot detection, namely, clustering
based approaches and spatial scan statistics based approaches.
4.5.3.1 Clustering based approaches
Clustering methods can be used to identify candidate areas for further evaluation as
hotspots. These methods include global partitioning based clustering, density-based
clustering, and hierarchical clustering (see Section 4.4). These methods can be used as a
preprocessing step to generate candidate hotspot areas, and statistical tools may then be used to test
their statistical significance. CrimeStat, a software package for spatial analysis of crime locations,
incorporates several clustering methods to determine the crime hotspots in a study area.
The CrimeStat package has a k-means tool, a nearest neighbor hierarchical (NNH) clustering tool (Jain et al.
(1999)), a Risk Adjusted NNH (RANNH) tool, a STAC Hot Spot Area tool (Levine (2013)), and a
Local Indicator of Spatial Association (LISA) tool (Anselin (1995)) that are used to evaluate
potential hotspot areas. In addition to point process data, hotspot detection from trajectories has
also been studied to detect network paths with a high density (Lee et al. (2007)) or frequency of
trajectories (Lee et al. (2009)).
4.5.3.2 Spatial (spatiotemporal) Scan Statistics based approaches
The spatial scan statistic extends the original scan statistic, which focuses on a one-dimensional
point process, to allow for a multi-dimensional point process. The spatial scan statistic also
extends the scan statistic by allowing the scanning window to vary and the baseline process to be
any inhomogeneous Poisson process or Bernoulli process (Kulldorff (1997)). When the spatial
scan statistic was proposed, it focused on finding circular hotspots in two-dimensional Euclidean
space with a statistical significance test. It exhaustively enumerates all circles defined by an activity
point as the center and another activity point on the circumference. For each circle, an interest
measure called the log likelihood ratio is computed to indicate how strongly the data support the
circle as a hotspot. The final step is to evaluate the p-value of each interest measure using Monte Carlo
simulation, because the distribution of the log likelihood ratio is unknown beforehand. The p-value
indicates how confident one can be that the circle is a true hotspot area, so that hotspots generated
only by chance can be eliminated. The null hypothesis of the Monte Carlo simulation is that the data
points follow complete spatial randomness (CSR); in other words, the baseline process is a homogeneous
Poisson process.
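The enumeration and Monte Carlo steps described above can be sketched as follows. This is a simplified illustration, not the SaTScan implementation: it assumes a rectangular study area of size `width` by `height`, takes the expected count of a circle as proportional to its full area (ignoring clipping at the boundary), and uses the common rank-based p-value (R + 1) / (S + 1), where R is the number of simulated datasets whose best score reaches the observed one and S is the number of simulations. All function names are hypothetical.

```python
import math
import random


def llr(c, e, total):
    """Poisson log likelihood ratio; zero unless the count exceeds expectation."""
    if e <= 0 or e >= total or c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if c < total else 0.0
    return inside + outside


def best_circle(points, area):
    """Score every circle centered on one point with another on its boundary."""
    total = len(points)
    best = (0.0, None, 0.0)  # (score, center, radius)
    for center in points:
        dists = sorted(math.dist(center, q) for q in points)
        # dists[0] is the center itself, so the k-th radius encloses k + 1 points.
        for count, r in enumerate(dists[1:], start=2):
            expected = total * math.pi * r * r / area
            score = llr(count, expected, total)
            if score > best[0]:
                best = (score, center, r)
    return best


def monte_carlo_p_value(points, width, height, simulations=99, seed=0):
    """Compare the observed best score with best scores under CSR."""
    rng = random.Random(seed)
    area = width * height
    observed = best_circle(points, area)
    exceed = 0
    for _ in range(simulations):
        # Null hypothesis: the same number of points placed uniformly at random.
        sim = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in points]
        if best_circle(sim, area)[0] >= observed[0]:
            exceed += 1
    return observed, (exceed + 1) / (simulations + 1)
```

The exhaustive enumeration is cubic in the number of points once sorting is included, which is why the shape-specific algorithms discussed in this section concentrate on reducing the computational cost.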
Figure 14 Ring-shape hotspot detection
Since the introduction of the spatial scan statistic, much research has targeted
different shapes of hotspots. For each shape, the corresponding algorithms
focus on reducing the computational complexity so that they scale to
large datasets. Neill and Moore (2004) propose an algorithm for rectangular hotspot detection.
Based on properties of the log likelihood ratio, which is the interest measure, and of rectangles, it
applies a divide-and-conquer method that speeds up the computation significantly. In addition, if
some error is tolerated, a heuristic can be used to accelerate the detection even further.
Because hotspots do not always have solid centers, ring-shaped hotspot
detection was proposed to find hotspots that are formed by two concentric circles
(Eftelioglu et al. (2014)). Hotspots of this shape can be found in environmental criminology.
Typically, criminals do not commit crimes close to their homes, but also do not commit crimes too
far away because of the transportation cost (e.g., time and money). For example, as shown in Figure 14,
the detected ring-shaped hotspot’s log likelihood ratio is 284.51, and its p-value is 0.001. However,
the most significant result returned from SaTScan is a circle covering the whole ring area, with a log
likelihood ratio of 211.15 and a p-value of 0.001. Eftelioglu et al. (2014) claim that by
delineating the inner circles of ring-shaped hotspots, the range to search for the possible source of
events (e.g., the home of the suspect in a series of crimes) can be further narrowed down. By
specifically defining the scanning window in a spatial network, some network space hotspot
detection algorithms have been put forward. For example, Eftelioglu et al. (2016) introduce an algorithm
for ring-shaped hotspot detection in a spatial network by defining the center of a ring-shaped
hotspot as a shortest path between two points in the network.
4.5.4 Spatiotemporal hotspot detection approaches
Spatiotemporal hotspot detection can be seen as an extension of pure spatial hotspot detection through
the addition of time as a third dimension. Two types of spatiotemporal hotspots are of
particular importance: “persistent” spatiotemporal hotspots and “emerging” spatiotemporal
hotspots. A “persistent” spatiotemporal hotspot is defined as a region where the rate of
observations is consistently high over time. Thus, persistent hotspot detection assumes that the
risk of a hotspot (i.e., outbreak) is constant over time, and it searches over space and time for a
hotspot by simply totaling the number of observations in each time interval. An “emerging”
spatiotemporal hotspot is a region where the rate of observations is monotonically increasing over
time (Chang et al. (2005); Tango et al. (2011)). This kind of spatiotemporal hotspot occurs when
an outbreak emerges, causing a sudden increase in the number of observations. Such phenomena can
be observed in epidemiology, where at the start of an outbreak the number of disease cases
suddenly increases. Tools for the detection of emerging spatiotemporal hotspots use spatial scan
statistics with the change in expectation over time (Neill et al. (2005)). Many of the clustering
based and spatial scan statistics based methods mentioned above, which were originally designed
for two-dimensional Euclidean space and are mostly used for pure spatial data, can be used to
identify spatiotemporal candidate hotspots by considering the temporal part of the data as a third
dimension. For example, DBSCAN can cluster both spatial and spatiotemporal data using the
density of the data as its measure, and SaTScan, which uses a cylindrical window in three
dimensions (time is the third dimension) instead of the circular window in two dimensions for
spatial hotspots (Kulldorff (2015)), can be used as a tool to detect persistent spatiotemporal
hotspots.
4.6 Spatial and Spatiotemporal Change
4.6.1 Problem definition
Although the single term “change” is used to name the spatial and spatiotemporal change patterns
in different applications, the underlying phenomena may differ significantly. This section briefly
summarizes the main ways a change may be defined in spatiotemporal data (Zhou et al. (2014)):
Change in Statistical Parameter: In this case, the data is assumed to follow a certain distribution
and the change is defined as a shift in this statistical distribution. For example, in statistical
quality control, a change in the mean or variance of the sensor readings is used to detect a fault.
Change in Actual Value: Here, change is modeled as the difference between a data value and its
spatial or temporal neighborhood. For example, in a one-dimensional continuous function, the
magnitude of change can be characterized by the derivative function, while on a two-dimensional
surface, it can be characterized by the gradient magnitude.
Change in Models Fitted to Data: This type of change is identified when a number of function
models are fitted to the data and one or more of the models exhibits a change (e.g., a discontinuity
between consecutive linear functions) (Chandola et al. (2010)).
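For the "change in actual value" definition above, the gradient magnitude on a two-dimensional surface can be approximated with finite differences. The sketch below assumes a regular grid with unit spacing stored as a nested list, and uses central differences at interior cells only; border cells are left at zero as a simplification.

```python
import math


def gradient_magnitude(grid):
    """Central-difference gradient magnitude at interior cells of a 2D grid."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Rate of change along each axis, assuming unit cell spacing.
            dy = (grid[r + 1][c] - grid[r - 1][c]) / 2.0
            dx = (grid[r][c + 1] - grid[r][c - 1]) / 2.0
            out[r][c] = math.hypot(dx, dy)
    return out
```

Cells with a large magnitude mark locations where the value differs sharply from its spatial neighborhood, which is the quantity this definition of change is based on.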
4.6.2 Spatial and spatiotemporal change detection approaches
Spatial footprints can be classified as raster footprints or vector footprints. Vector footprints are
further classified into four categories: point(s), line(s), polygon(s), and network footprint patterns.
Raster footprints are classified based on the scale of the pattern, namely, local, focal, or zonal
patterns. This classification describes the scale of the change operation of a given phenomenon in
the spatial raster field (Worboys and Duckham (2004)). Local patterns are patterns in which
change at a given location depends only on attributes at this location. Focal patterns are patterns
in which change in a location depends on attributes in that location and its assumed
neighborhood. Zonal patterns define change using an aggregation of location values in a region.
Spatiotemporal Change Patterns with Raster-Based Spatial Footprint: This includes patterns of
spatial changes between snapshots. In remote sensing, detecting changes between satellite images
can help identify land cover change due to human activity, natural disasters, or climate change
(Bujor et al. (2004); Kosugi et al. (2004); Martino et al. (2007)). Given two geographically
aligned raster images, this problem aims to find a collection of pixels that have significant
changes between the two images (Radke et al. (2005)). This pattern is classified as a local change
between snapshots since the change at a given pixel is assumed to be independent of changes at
other pixels. Alternative definitions have assumed that a change at a pixel also depends on its
neighborhoods (Thoma and Bierling (1989)). For example, the pixel values in each block may be
assumed to follow a Gaussian distribution (Aach and Kaup (1995)). We refer to this type of
change footprint pattern as a focal spatial change between snapshots. Researchers in remote
sensing and image processing have also tried to apply image change detection to objects instead
of pixels (Chen et al. (2012); Desclée et al. (2006); Im et al. (2008)), yielding zonal spatial
change patterns between snapshots.
A well-known technique for detecting a local change footprint is simple differencing. The
technique starts by calculating the differences between the corresponding pixel intensities in the
two images. A change at a pixel is flagged if the difference at the pixel exceeds a certain
threshold. Alternative approaches have also been proposed to discover focal change footprints
between images. For example, the block-based density ratio test detects change based on a group
of pixels, known as a block (Aach et al. (1993); Rignot and Zyl (1993)). Object-based approaches
in remote sensing (Im et al. (2008); Im and Jensen (2005)) employ image segmentation
techniques to partition temporal snapshots of images into homogeneous objects (Douglas and
Peucker (1973)) and then classify object pairs in the two temporal snapshots of images into no
change or change classes.
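Simple differencing, as described above, reduces to a per-pixel comparison against a threshold. The sketch below assumes two aligned single-band images of equal size stored as nested lists of intensities; the helper `changed_pixels` is added here for illustration. The threshold is application dependent and is not specified in the text.

```python
def simple_differencing(before, after, threshold):
    """Flag each pixel whose absolute intensity difference exceeds the threshold."""
    return [
        [abs(a - b) > threshold for b, a in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]


def changed_pixels(mask):
    """List the (row, column) coordinates of the flagged pixels."""
    return [
        (r, c)
        for r, row in enumerate(mask)
        for c, flagged in enumerate(row)
        if flagged
    ]
```

Because each pixel is tested independently of the others, this produces a local change footprint; the focal and zonal approaches differ in replacing the per-pixel test with one over a block or an object.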
Spatiotemporal Change Patterns with Vector-Based Spatial Footprint: This includes the
Spatiotemporal Volume Change Footprint pattern. This pattern represents a change process
occurring in a spatial region (a polygon) during a time interval. For example, an outbreak event of
a disease can be defined as an increase in disease reports in a certain region during a certain time
window up to the current time. Change patterns known to have a spatiotemporal volume
footprint include the spatiotemporal scan statistics (Kulldorff (2001); Kulldorff et al. (1998)), a
generalization of the spatial scan statistic, and emerging spatiotemporal clusters defined by Neill
et al. (2005).
5 Research Trend and Future Research Needs
Most current research in spatial and spatiotemporal data mining uses Euclidean space, which
often assumes isotropic property and symmetric neighborhoods. However, in many real-world
applications, the underlying space is network space, such as river networks and road networks
(Isaak et al. (2014); Oliver et al. (2010); Oliver, Shekhar, Kang, et al. (2014)). One of the main
challenges in spatial and spatiotemporal network data mining is to account for the network
structure in the dataset. For example, in anomaly detection, spatial techniques do not consider the
spatial network structure of the dataset, that is, they may not be able to model graph properties
such as one-ways, connectivity, left-turns, etc. The network structure often violates the isotropic
property and symmetry of neighborhoods, and instead, requires asymmetric neighborhood and
directionality of neighborhood relationship (e.g., network flow direction).
Recently, some cutting-edge research has been conducted in spatial network statistics and data
mining (Okabe and Sugihara (2012)). For example, several spatial network statistical methods
have been developed, e.g., network K function and network spatial autocorrelation. Several
spatial analysis methods have also been generalized to the network space, such as network point
cluster analysis and clumping method, network point density estimation, network spatial
interpolation (Kriging), as well as network Huff model. Due to the nature of spatial network
space as distinct from Euclidean space, these statistics and analyses often rely on advanced spatial
network computational techniques (Okabe and Sugihara (2012)).
We believe more spatiotemporal data mining research is still needed in the network space. First,
though several spatial statistics and data mining techniques have been generalized to the network
space, few spatiotemporal network statistics and data mining techniques have been developed, and the vast
majority of research is still in the Euclidean space. Future research is needed to develop more
spatial network statistics, such as spatial network scan statistics, spatial network random field
model, as well as spatiotemporal autoregressive models for networks. Furthermore, phenomena
observed on spatiotemporal networks need to be interpreted in an appropriate frame of reference
to prevent a mismatch between the nature of the observed phenomena and the mining algorithm.
For instance, moving objects on a spatiotemporal network need to be studied from a traveler’s
perspective, i.e., the Lagrangian frame of reference (Gunturi et al. (2015); Gunturi and Shekhar
(2014)) instead of a snapshot view. This is because a traveler moving along a chosen path in a
spatiotemporal network would experience a road-segment (and its properties such as fuel
efficiency, travel-time etc.) for the time at which he/she arrives at that segment, which may be
distinct from the original departure-time at the start of the journey. These unique requirements
(non-isotropy and Lagrangian reference frame) call for novel spatiotemporal statistical
foundations (Isaak et al. (2014)) as well as new computational approaches for spatiotemporal
network data mining.
Another future research need is to develop spatiotemporal graph big data platforms. Current
relevant big data platforms for spatial and spatiotemporal data mining include ESRI GIS Tools
for Hadoop (ESRI (2013)), Hadoop GIS (Aji et al. (2013)), etc. These provide distributed systems
for geometric-data (e.g., lines, points and polygons) including geometric indexing and
partitioning methods such as R-tree, R+-tree, or Quad tree. Recently, SpatialHadoop has been
developed (Eldawy and Mokbel (2015)). SpatialHadoop embeds geometric notions in language,
visualization, storage, MapReduce, and operations layers. However, spatio-temporal graphs
(STGs) violate the core assumptions of current spatial big data platforms that geometric
concepts are adequate for conveniently representing STG analytics operations and for partitioning
data for load-balancing. STGs also violate core assumptions underlying graph analytics software
(e.g., Giraph (Avery (2011)), GraphLab (Low et al. (2014)) and Pregel (Malewicz et al. (2010)))
that traditional location-unaware graphs are adequate for conveniently representing STG analytics
operations and for partitioning data for load-balancing. Therefore, novel spatiotemporal graph big
data platforms are needed. Several challenges should be addressed. For example, spatiotemporal graph big
data requires novel distributed file systems (DFS) to partition the graph, and a novel
programming model is still needed to support abstract data types and fundamental STG
operations.
6 Conclusions
Spatial and spatiotemporal (SST) data mining has broad application domains including ecology
and environmental management, public safety, transportation, earth science, epidemiology, and
climatology. This article provides an overview of recent advances in SST data mining. It reviews
common SST data mining techniques organized by major SST pattern families: SST anomaly,
SST association and tele-coupling, SST prediction, SST partitioning and summarization, SST
hotspots, and SST change detection. New trends and research needs are also summarized,
including SST data mining in the network space and SST big data platform development.
Reference
Aach, T. and Kaup, A. (1995), “Bayesian algorithms for adaptive change detection in image
sequences using Markov random fields”, Signal Processing: Image Communication, Vol.
7 No. 2, pp. 147–160.
Aach, T., Kaup, A. and Mester, R. (1993), “Statistical model-based change detection in moving
video”, Signal Processing, Vol. 31 No. 2, pp. 165–180.
Aggarwal, C.C. (2013), Outlier Analysis, Springer, Berlin, Germany.
Agrawal, R., Imieliński, T. and Swami, A. (1993), “Mining Association Rules Between Sets of
Items in Large Databases”, Proceedings of the 1993 ACM SIGMOD International
Conference on Management of Data, ACM, New York, NY, USA, pp. 207–216.
Agrawal, R., Srikant, R. and others. (1994), “Fast algorithms for mining association rules”, Proc.
20th Int. Conf. Very Large Data Bases, VLDB, Vol. 1215, pp. 487–499.
Aji, A., Wang, F., Vo, H., Lee, R., Liu, Q., Zhang, X. and Saltz, J. (2013), “Hadoop GIS: A High
Performance Spatial Data Warehousing System over Mapreduce”, Proc. VLDB Endow.,
Vol. 6 No. 11, pp. 1009–1020.
Albert, P.S. and McShane, L.M. (1995), “A generalized estimating equations approach for
spatially correlated binary data: applications to the analysis of neuroimaging data”,
Biometrics, pp. 627–638.
Allen, J.F. (1984), “Towards a general theory of action and time”, Artificial Intelligence, Vol. 23
No. 2, pp. 123–154.
Almeida, C.M., Souza, I.M., Alves, C.D., Pinho, C.M.D., Pereira, M.N. and Feitosa, R.Q. (2007),
“Multilevel Object-oriented Classification of Quickbird Images for Urban Population
Estimates”, Proceedings of the 15th Annual ACM International Symposium on Advances
in Geographic Information Systems, ACM, New York, NY, USA, p. 12:1–12:8.
Ankerst, M., Breunig, M.M., Kriegel, H.-P. and Sander, J. (1999), “OPTICS: Ordering Points to
Identify the Clustering Structure”, Proceedings of the 1999 ACM SIGMOD International
Conference on Management of Data, ACM, New York, NY, USA, pp. 49–60.
Anselin, L. (1994), “Exploratory spatial data analysis and geographic information systems”, New
Tools for Spatial Analysis, Vol. 17, pp. 45–54.
Anselin, L. (1995), “Local Indicators of Spatial Association—LISA”, Geographical Analysis,
Vol. 27 No. 2, pp. 93–115.
Anselin, L. (2013), Spatial Econometrics: Methods and Models, Springer Science & Business
Media.
Avery, C. (2011), “Giraph: Large-scale graph processing infrastructure on hadoop”, Proceedings
of the Hadoop Summit. Santa Clara, Vol. 11.
Barnett, V. and Lewis, T. (1994), “Outliers in statistical data”.
Banerjee, S., Carlin, B.P. and Gelfand, A.E. (2014), Hierarchical Modeling and Analysis for
Spatial Data, CRC Press.
Barua, S. and Sander, J. (2014), “Mining Statistically Significant Co-location and Segregation
Patterns”, IEEE Transactions on Knowledge and Data Engineering, Vol. 26 No. 5, pp.
1185–1199.
Birant, D. and Kut, A. (2007), “ST-DBSCAN: An algorithm for clustering spatial–temporal
data”, Data & Knowledge Engineering, Vol. 60 No. 1, pp. 208–221.
Bu, Y., Chen, L., Fu, A.W.-C. and Liu, D. (2009), “Efficient Anomaly Monitoring over Moving
Object Trajectory Streams”, Proceedings of the 15th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, pp.
159–168.
Bujor, F., Trouve, E., Valet, L., Nicolas, J.M. and Rudant, J.P. (2004), “Application of log-
cumulants to the detection of spatiotemporal discontinuities in multitemporal SAR
images”, IEEE Transactions on Geoscience and Remote Sensing, Vol. 42 No. 10, pp.
2073–2084.
Campelo, C.E.C. and Bennett, B. (2013), “Representing and Reasoning about Changing Spatial
Extensions of Geographic Features”, in Tenbrink, T., Stell, J., Galton, A. and Wood, Z.
(Eds.), Spatial Information Theory, Springer International Publishing, pp. 33–52.
Cao, H., Mamoulis, N. and Cheung, D.W. (2005), “Mining frequent spatio-temporal sequential
patterns”, Fifth IEEE International Conference on Data Mining (ICDM’05), presented at
the Fifth IEEE International Conference on Data Mining (ICDM’05), p. 8 pp.-pp.
Celik, M., Shekhar, S., Rogers, J.P. and Shine, J.A. (2008), “Mixed-Drove Spatiotemporal Co-
Occurrence Pattern Mining”, IEEE Transactions on Knowledge and Data Engineering,
Vol. 20 No. 10, pp. 1322–1335.
Celik, M., Shekhar, S., Rogers, J.P., Shine, J.A. and Yoo, J.S. (2006), “Mixed-Drove Spatio-
Temporal Co-occurence Pattern Mining: A Summary of Results”, Sixth International
Conference on Data Mining (ICDM’06), presented at the Sixth International Conference
on Data Mining (ICDM’06), pp. 119–128.
Cervone, G., Franzese, P., Ezber, Y. and Boybeyi, Z. (2008), “Risk Assessment of Atmospheric
Hazard Releases Using K-Means Clustering”, 2008 IEEE International Conference on
Data Mining Workshops, presented at the 2008 IEEE International Conference on Data
Mining Workshops, pp. 342–348.
Chandola, V., Banerjee, A. and Kumar, V. (2009), “Anomaly Detection: A Survey”, ACM
Comput. Surv., Vol. 41 No. 3, p. 15:1–15:58.
Chandola, V., Hui, D., Gu, L., Bhaduri, B. and Vatsavai, R.R. (2010), “Using Time Series
Segmentation for Deriving Vegetation Phenology Indices from MODIS NDVI Data”,
2010 IEEE International Conference on Data Mining Workshops, presented at the 2010
IEEE International Conference on Data Mining Workshops, pp. 202–208.
Chandola, V. and Kumar, V. (2007), “Summarization – compressing data into an informative
representation”, Knowledge and Information Systems, Vol. 12 No. 3, pp. 355–378.
Chang, W., Zeng, D. and Chen, H. (2005), “Prospective spatio-temporal data analysis for security
informatics”, Proceedings. 2005 IEEE Intelligent Transportation Systems, 2005.,
presented at the Proceedings. 2005 IEEE Intelligent Transportation Systems, 2005., pp.
1120–1124.
Chen, C., Zhang, D., Castro, P.S., Li, N., Sun, L. and Li, S. (2011), “Real-Time Detection of
Anomalous Taxi Trajectories from GPS Traces”, in Puiatti, A. and Gu, T. (Eds.), Mobile
and Ubiquitous Systems: Computing, Networking, and Services, Springer Berlin
Heidelberg, pp. 63–74.
Chen, D., Lu, C.-T., Kou, Y. and Chen, F. (2008), “On detecting spatial outliers”,
Geoinformatica, Vol. 12 No. 4, pp. 455–475.
Chen, G., Hay, G.J., Carvalho, L.M.T. and Wulder, M.A. (2012), “Object-based change
detection”, International Journal of Remote Sensing, Vol. 33 No. 14, pp. 4434–4457.
Cheng, D.T., Haworth, J., Anbaroglu, B., Tanaksaranond, G. and Wang, J. (2014),
“Spatiotemporal Data Mining”, in Fischer, M.M. and Nijkamp, P. (Eds.), Handbook of
Regional Science, Springer Berlin Heidelberg, pp. 1173–1193.
Chiu, S.N., Stoyan, D., Kendall, W.S. and Mecke, J. (2013), Stochastic Geometry and Its
Applications, John Wiley & Sons.
Chou, Y.-H. (1997), “Exploring Spatial Analysis in Geographic Information Systems”.
Cressie, N. (2015), Statistics for Spatial Data, John Wiley & Sons.
Cressie, N. and Wikle, C.K. (2015), Statistics for Spatio-Temporal Data, John Wiley & Sons.
Desclée, B., Bogaert, P. and Defourny, P. (2006), “Forest change detection by statistical object-
based method”, Remote Sensing of Environment, Vol. 102 No. 1–2, pp. 1–11.
Ding, W., Eick, C.F., Yuan, X., Wang, J. and Nicot, J.-P. (2011), “A framework for regional
association rule mining and scoping in spatial datasets”, GeoInformatica, Vol. 15 No. 1,
pp. 1–28.
Douglas, D.H. and Peucker, T.K. (1973), “Algorithms for the reduction of the number of points
required to represent a digitized line or its caricature”, Cartographica: The International
Journal for Geographic Information and Geovisualization, Vol. 10 No. 2, pp. 112–122.
Eck, J., Chainey, S., Cameron, J. and Wilson, R. (2005), “Mapping crime: Understanding
hotspots”.
Eftelioglu, E., Li, Y., Tang, X., Shekhar, S., Kang, J.M. and Farah, C. (2016), “Mining Network
Hotspots with Holes: A Summary of Results”, in Miller, J.A., O’Sullivan, D. and
Wiegand, N. (Eds.), Geographic Information Science, Springer International Publishing,
pp. 51–67.
Eftelioglu, E., Shekhar, S., Oliver, D., Zhou, X., Evans, M.R., Xie, Y., Kang, J.M., et al. (2014),
“Ring-Shaped Hotspot Detection: A Summary of Results”, 2014 IEEE International
Conference on Data Mining, presented at the 2014 IEEE International Conference on
Data Mining, pp. 815–820.
Eldawy, A. and Mokbel, M.F. (2015), “SpatialHadoop: A MapReduce framework for spatial
data”, 2015 IEEE 31st International Conference on Data Engineering, presented at the
2015 IEEE 31st International Conference on Data Engineering, pp. 1352–1363.
Elfeky, M.G., Aref, W.G. and Elmagarmid, A.K. (2006), “Stagger: Periodicity mining of data
streams using expanding sliding windows”, Sixth International Conference on Data
Mining (ICDM’06), IEEE, pp. 188–199.
Elliot, P., Wakefield, J.C., Best, N.G., Briggs, D.J. and others. (2000), Spatial Epidemiology:
Methods and Applications., Oxford University Press.
Erwig, M. and Schneider, M. (2002), “Spatio-temporal predicates”, IEEE Transactions on
Knowledge and Data Engineering, Vol. 14 No. 4, pp. 881–901.
ESRI. (2013), “Breathe Life into Big Data”.
Ester, M., Kriegel, H.-P. and Sander, J. (1997), “Spatial data mining: A database approach”,
Advances in Spatial Databases, Springer Berlin Heidelberg, pp. 47–66.
Ester, M., Kriegel, H.-P., Sander, J. and Xu, X. (1996), “A density-based algorithm for
discovering clusters in large spatial databases with noise”, KDD, Vol. 96, pp. 226–231.
Evans, M.R., Oliver, D., Shekhar, S. and Harvey, F. (2012), “Summarizing Trajectories into K-
primary Corridors: A Summary of Results”, Proceedings of the 20th International
Conference on Advances in Geographic Information Systems, ACM, New York, NY,
USA, pp. 454–457.
Evans, M.R., Oliver, D., Shekhar, S. and Harvey, F. (2013), “Fast and Exact Network Trajectory
Similarity Computation: A Case-study on Bicycle Corridor Planning”, Proceedings of the
2nd ACM SIGKDD International Workshop on Urban Computing, ACM, New York,
NY, USA, pp. 9:1–9:8.
Fotheringham, A.S., Brunsdon, C. and Charlton, M. (2003), Geographically Weighted
Regression: The Analysis of Spatially Varying Relationships, John Wiley & Sons.
Franke, C. and Gertz, M. (2008), “Detection and exploration of outlier regions in sensor data
streams”, 2008 IEEE International Conference on Data Mining Workshops, IEEE, pp.
375–384.
Frelich, L.E. and Reich, P.B. (2010), “Will environmental changes reinforce the impact of global
warming on the prairie–forest border of central North America?”, Frontiers in Ecology
and the Environment, Vol. 8 No. 7, pp. 371–378.
Ge, Y., Xiong, H., Liu, C. and Zhou, Z.H. (2011), “A Taxi Driving Fraud Detection System”,
2011 IEEE 11th International Conference on Data Mining, pp. 181–190.
Gelfand, A.E., Diggle, P., Guttorp, P. and Fuentes, M. (2010), Handbook of Spatial Statistics,
CRC Press.
George, B. and Shekhar, S. (2008), “Time-Aggregated Graphs for Modeling Spatio-temporal
Networks”, in Spaccapietra, S., Pan, J.Z., Thiran, P., Halpin, T., Staab, S., Svatek, V.,
Shvaiko, P., et al. (Eds.), Journal on Data Semantics XI, Springer Berlin Heidelberg, pp.
191–212.
Gething, P.W., Atkinson, P.M., Noor, A.M., Gikandi, P.W., Hay, S.I. and Nixon, M.S. (2007), “A
local space–time kriging approach applied to a national outpatient malaria data set”,
Computers & Geosciences, Vol. 33 No. 10, pp. 1337–1350.
Gunturi, V.M.V. and Shekhar, S. (2014), “Lagrangian Xgraphs: A Logical Data-Model for
Spatio-Temporal Network Data: A Summary”, Advances in Conceptual Modeling,
Springer International Publishing, pp. 201–211.
Gunturi, V.M.V., Shekhar, S. and Yang, K. (2015), “A Critical-Time-Point Approach to All-
Departure-Time Lagrangian Shortest Paths”, IEEE Transactions on Knowledge and Data
Engineering, Vol. 27 No. 10, pp. 2591–2603.
Güting, R.H. and Schneider, M. (2005), Moving Objects Databases, Elsevier.
Guyon, X. (1995), Random Fields on a Network: Modeling, Statistics, and Applications, Springer
Science & Business Media.
Habiba, H., Tantipathananandh, C. and Berger-Wolf, T. (2007), “Betweenness centrality measure
in dynamic networks”, Department of Computer Science, University of Illinois at
Chicago, Chicago.
Haining, R. (1993), Spatial Data Analysis in the Social and Environmental Sciences, Cambridge
University Press.
Haslett, J., Bradley, R., Craig, P., Unwin, A. and Wills, G. (1991), “Dynamic graphics for
exploring spatial data with application to locating global and local anomalies”, The
American Statistician, Vol. 45 No. 3, pp. 234–242.
Hixson, M., Scholz, D., Fuhs, N. and Akiyama, T. (1980), “Evaluation of several schemes for
classification of remotely sensed data”, Photogrammetric Engineering and Remote
Sensing.
Hu, T., Xiong, H., Gong, X. and Sung, S.Y. (2008), “ANEMI: An Adaptive Neighborhood
Expectation-Maximization Algorithm with Spatial Augmented Initialization”, Advances
in Knowledge Discovery and Data Mining, Springer Berlin Heidelberg, pp. 160–171.
Huang, Y., Pei, J. and Xiong, H. (2006), “Mining Co-Location Patterns with Rare Events from
Spatial Data Sets”, GeoInformatica, Vol. 10 No. 3, pp. 239–260.
Huang, Y., Shekhar, S. and Xiong, H. (2004), “Discovering colocation patterns from spatial data
sets: a general approach”, IEEE Transactions on Knowledge and Data Engineering, Vol.
16 No. 12, pp. 1472–1485.
Huang, Y., Zhang, L. and Zhang, P. (2006), “Finding Sequential Patterns from Massive Number
of Spatio-Temporal Events”, Proceedings of the 2006 SIAM International Conference on
Data Mining, Society for Industrial and Applied Mathematics, pp. 634–638.
Huang, Y., Zhang, L. and Zhang, P. (2008), “A Framework for Mining Sequential Patterns from
Spatio-Temporal Event Data Sets”, IEEE Transactions on Knowledge and Data
Engineering, Vol. 20 No. 4, pp. 433–448.
Im, J. and Jensen, J.R. (2005), “A change detection model based on neighborhood correlation
image analysis and decision tree classification”, Remote Sensing of Environment, Vol. 99
No. 3, pp. 326–340.
Im, J., Jensen, J.R. and Tullis, J.A. (2008), “Object‐based change detection using correlation
image analysis and image segmentation”, International Journal of Remote Sensing, Vol.
29 No. 2, pp. 399–423.
Isaak, D.J., Peterson, E.E., Ver Hoef, J.M., Wenger, S.J., Falke, J.A., Torgersen, C.E., Sowder,
C., et al. (2014), “Applications of spatial statistical network models to stream data”,
Wiley Interdisciplinary Reviews: Water, Vol. 1 No. 3, pp. 277–294.
Isaaks, E.H. and Srivastava, R.M. (1989), Applied Geostatistics, Oxford University Press.
Jain, A.K. and Dubes, R.C. (1988), Algorithms for Clustering Data, Prentice-Hall, Inc.
Jain, A.K., Murty, M.N. and Flynn, P.J. (1999), “Data Clustering: A Review”, ACM Comput.
Surv., Vol. 31 No. 3, pp. 264–323.
Jhung, Y. and Swain, P.H. (1996), “Bayesian contextual classification based on modified M-
estimates and Markov random fields”, IEEE Transactions on Geoscience and Remote
Sensing, Vol. 34 No. 1, pp. 67–75.
Jiang, Z., Shekhar, S., Zhou, X., Knight, J. and Corcoran, J. (2013), “Focal-Test-Based Spatial
Decision Tree Learning: A Summary of Results”, 2013 IEEE 13th International
Conference on Data Mining, pp. 320–329.
Jiang, Z., Shekhar, S., Zhou, X., Knight, J. and Corcoran, J. (2015), “Focal-Test-Based Spatial
Decision Tree Learning”, IEEE Transactions on Knowledge and Data Engineering, Vol.
27 No. 6, pp. 1547–1559.
Kang, J.M., Shekhar, S., Henjum, M., Novak, P.J. and Arnold, W.A. (2009), “Discovering
Teleconnected Flow Anomalies: A Relationship Analysis of Dynamic Neighborhoods
(RAD) Approach”, in Mamoulis, N., Seidl, T., Pedersen, T.B., Torp, K. and Assent, I.
(Eds.), Advances in Spatial and Temporal Databases, Springer Berlin Heidelberg, pp.
44–61.
Kang, J.M., Shekhar, S., Wennen, C. and Novak, P. (2008), “Discovering Flow Anomalies: A
SWEET Approach”, 2008 Eighth IEEE International Conference on Data Mining,
pp. 851–856.
Karypis, G., Han, E.-H. and Kumar, V. (1999), “Chameleon: hierarchical clustering using
dynamic modeling”, Computer, Vol. 32 No. 8, pp. 68–75.
Kawale, J., Chatterjee, S., Ormsby, D., Steinhaeuser, K., Liess, S. and Kumar, V. (2012),
“Testing the Significance of Spatio-temporal Teleconnection Patterns”, Proceedings of
the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining, ACM, New York, NY, USA, pp. 642–650.
Kisilevich, S., Mansmann, F., Nanni, M. and Rinzivillo, S. (2009), Spatio-Temporal Clustering,
Springer.
Koperski, K., Adhikary, J. and Han, J. (1996), “Spatial data mining: progress and challenges
survey paper”, Proc. ACM SIGMOD Workshop on Research Issues on Data Mining and
Knowledge Discovery, Montreal, Canada, Citeseer, pp. 1–10.
Koperski, K. and Han, J. (1995), “Discovery of spatial association rules in geographic
information databases”, International Symposium on Spatial Databases, Springer, pp.
47–66.
Kosugi, Y., Sakamoto, M., Fukunishi, M., Lu, W., Doihara, T. and Kakumoto, S. (2004), “Urban
change detection related to earthquakes using an adaptive nonlinear mapping of high-
resolution images”, IEEE Geoscience and Remote Sensing Letters, Vol. 1 No. 3, pp. 152–
156.
Kou, Y., Lu, C.-T. and Chen, D. (2006), “Spatial Weighted Outlier Detection.”, SDM, SIAM, pp.
614–618.
Koubarakis, M., Sellis, T., Frank, A.U., Grumbach, S., Güting, R.H., Jensen, C.S., Lorentzos, N.,
et al. (2003), Spatio-Temporal Databases: The CHOROCHRONOS Approach, Springer.
Krugman, P.R. (1997), Development, Geography, and Economic Theory, Vol. 6, MIT Press.
Kulldorff, M. (1997), “A spatial scan statistic”, Communications in Statistics - Theory and
Methods, Vol. 26 No. 6, pp. 1481–1496.
Kulldorff, M. (2001), “Prospective time periodic geographical disease surveillance using a scan
statistic”, Journal of the Royal Statistical Society: Series A (Statistics in Society), Vol.
164 No. 1, pp. 61–72.
Kulldorff, M. (2015), SaTScan™ User Guide.
Kulldorff, M., Athas, W.F., Feurer, E.J., Miller, B.A. and Key, C.R. (1998), “Evaluating cluster
alarms: a space-time scan statistic and brain cancer in Los Alamos, New Mexico.”,
American Journal of Public Health, Vol. 88 No. 9, pp. 1377–1380.
Kyriakidis, P.C. and Journel, A.G. (1999), “Geostatistical space–time models: a review”,
Mathematical Geology, Vol. 31 No. 6, pp. 651–684.
Lai, C. and Nguyen, N.T. (2004), “Predicting density-based spatial clusters over time”, Fourth
IEEE International Conference on Data Mining, 2004. ICDM ’04, pp. 443–446.
Lang, L. (1999), Transportation GIS, Vol. 118, ESRI Press, Redlands, CA.
Laube, P. and Imfeld, S. (2002), “Analyzing Relative Motion within Groups of Trackable Moving
Point Objects”, in Egenhofer, M.J. and Mark, D.M. (Eds.), Geographic Information
Science, Springer Berlin Heidelberg, pp. 132–144.
Lee, A.J.T., Chen, Y.-A. and Ip, W.-C. (2009), “Mining frequent trajectory patterns in spatial–
temporal databases”, Information Sciences, Vol. 179 No. 13, pp. 2218–2231.
Lee, J.-G., Han, J. and Whang, K.-Y. (2007), “Trajectory Clustering: A Partition-and-group
Framework”, Proceedings of the 2007 ACM SIGMOD International Conference on
Management of Data, ACM, New York, NY, USA, pp. 593–604.
Leipnik, M.R. and Albert, D.P. (2003), GIS in Law Enforcement: Implementation Issues and
Case Studies, CRC Press.
Levine, N. (2013), “CrimeStat IV: A Spatial Statistics Program for the Analysis of Crime
Incident”, Forensic Magazine, 15 August.
Li, S.Z. (2009), Markov Random Field Modeling in Image Analysis, Springer Science & Business
Media.
Li, X. and Claramunt, C. (2006), “A Spatial Entropy-Based Decision Tree for Classification of
Geographical Information”, Transactions in GIS, Vol. 10 No. 3, pp. 451–467.
Li, X., Han, J. and Kim, S. (2006), “Motion-Alert: Automatic Anomaly Detection in Massive
Moving Objects”, in Mehrotra, S., Zeng, D.D., Chen, H., Thuraisingham, B. and Wang,
F.-Y. (Eds.), Intelligence and Security Informatics, Springer Berlin Heidelberg, pp. 166–
177.
Li, Y., Bailey, J., Kulik, L. and Pei, J. (2013), “Mining Probabilistic Frequent Spatio-Temporal
Sequential Patterns with Gap Constraints from Uncertain Databases”, 2013 IEEE 13th
International Conference on Data Mining, pp. 448–457.
Li, Z., Chen, J. and Baltsavias, E. (2008), Advances in Photogrammetry, Remote Sensing and
Spatial Information Sciences: 2008 ISPRS Congress Book, CRC Press.
Little, B., Schucking, M., Gartrell, B., Chen, B., Ross, K. and McKellip, R. (2008), “High
Granularity Remote Sensing and Crop Production over Space and Time: NDVI over the
Growing Season and Prediction of Cotton Yields at the Farm Field Level in Texas”, 2008
IEEE International Conference on Data Mining Workshops, pp. 426–435.
Liu, C., Xiong, H., Ge, Y., Geng, W. and Perkins, M. (2012), “A Stochastic Model for Context-
Aware Anomaly Detection in Indoor Location Traces”, 2012 IEEE 12th International
Conference on Data Mining, pp. 449–458.
Liu, X., Chen, F. and Lu, C.-T. (2014), “On detecting spatial categorical outliers”,
GeoInformatica, Vol. 18 No. 3, pp. 501–536.
Low, Y., Gonzalez, J.E., Kyrola, A., Bickson, D., Guestrin, C.E. and Hellerstein, J. (2014),
“GraphLab: A New Framework For Parallel Machine Learning”, arXiv:1408.2041 [cs].
Lu, C.-T., Chen, D. and Kou, Y. (2003a), “Detecting spatial outliers with multiple attributes”,
15th IEEE International Conference on Tools with Artificial Intelligence, 2003.
Proceedings, pp. 122–128.
Lu, C.-T., Chen, D. and Kou, Y. (2003b), “Algorithms for spatial outlier detection”, Data Mining,
2003. ICDM 2003. Third IEEE International Conference on, IEEE, pp. 597–600.
Ma, D. and Zhang, A. (2004), “An adaptive density-based clustering algorithm for spatial
database with noise”, Fourth IEEE International Conference on Data Mining, 2004.
ICDM ’04, pp. 467–470.
Mahoney, J.R., Asrar, G., Leinen, M.S., Andrews, J., Glackin, M., Groat, C., Hobenstein, W., et
al. (2003), “Strategic plan for the US climate change science program”, Climate Change
Science Program Office, Washington DC.
Malewicz, G., Austern, M.H., Bik, A.J.C., Dehnert, J.C., Horn, I., Leiser, N. and Czajkowski, G.
(2010), “Pregel: A System for Large-scale Graph Processing”, Proceedings of the 2010
ACM SIGMOD International Conference on Management of Data, ACM, New York,
NY, USA, pp. 135–146.
Marcon, E. and Puech, F. (2009), “Generalizing Ripley’s K function to inhomogeneous
populations”, 1 April.
Martino, G.D., Iodice, A., Riccio, D. and Ruello, G. (2007), “A Novel Approach for Disaster
Monitoring: Fractal Models and Tools”, IEEE Transactions on Geoscience and Remote
Sensing, Vol. 45 No. 6, pp. 1559–1570.
McGuire, M.P., Janeja, V.P. and Gangopadhyay, A. (2014), “Mining trajectories of moving
dynamic spatio-temporal regions in sensor datasets”, Data Mining and Knowledge
Discovery, Vol. 28 No. 4, pp. 961–1003.
Mennis, J., Viger, R. and Tomlin, C.D. (2005), “Cubic Map Algebra Functions for Spatio-
Temporal Analysis”, Cartography and Geographic Information Science, Vol. 32 No. 1,
pp. 17–32.
Miller, H.J. and Han, J. (2009), Geographic Data Mining and Knowledge Discovery, CRC Press,
New York.
Mohan, P., Shekhar, S., Shine, J.A. and Rogers, J.P. (2012), “Cascading Spatio-Temporal Pattern
Discovery”, IEEE Transactions on Knowledge and Data Engineering, Vol. 24 No. 11,
pp. 1977–1992.
Morens, D.M., Folkers, G.K. and Fauci, A.S. (2004), “The challenge of emerging and re-
emerging infectious diseases”, Nature, Vol. 430 No. 6996, pp. 242–249.
Morimoto, Y. (2001), “Mining frequent neighboring class sets in spatial databases”, Proceedings
of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, ACM, pp. 353–358.
Neill, D.B. and Moore, A.W. (2004), “Rapid Detection of Significant Spatial Clusters”,
Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, ACM, New York, NY, USA, pp. 256–265.
Neill, D.B., Moore, A.W., Sabhnani, M. and Daniel, K. (2005), “Detection of Emerging Space-
time Clusters”, Proceedings of the Eleventh ACM SIGKDD International Conference on
Knowledge Discovery in Data Mining, ACM, New York, NY, USA, pp. 218–227.
Ng, R.T. and Han, J. (2002), “CLARANS: a method for clustering objects for spatial data
mining”, IEEE Transactions on Knowledge and Data Engineering, Vol. 14 No. 5, pp.
1003–1016.
Okabe, A., Okunuki, K.-I. and Shiode, S. (2006), “The SANET Toolbox: New Methods for
Network Spatial Analysis”, Transactions in GIS, Vol. 10 No. 4, pp. 535–550.
Okabe, A. and Sugihara, K. (2012), Spatial Analysis along Networks: Statistical and
Computational Methods, John Wiley & Sons.
Okabe, A., Yomono, H. and Kitamura, M. (1995), “Statistical analysis of the distribution of
points on a network”, Geographical Analysis, Vol. 27 No. 2, pp. 152–175.
Oliver, D., Bannur, A., Kang, J.M., Shekhar, S. and Bousselaire, R. (2010), “A K-Main Routes
Approach to Spatial Network Activity Summarization: A Summary of Results”, 2010
IEEE International Conference on Data Mining Workshops, pp. 265–272.
Oliver, D., Shekhar, S., Kang, J.M., Laubscher, R., Carlan, V. and Bannur, A. (2014), “A K-Main
Routes Approach to Spatial Network Activity Summarization”, IEEE Transactions on
Knowledge and Data Engineering, Vol. 26 No. 6, pp. 1464–1478.
Oliver, D., Shekhar, S., Zhou, X., Eftelioglu, E., Evans, M.R., Zhuang, Q., Kang, J.M., et al.
(2014), “Significant route discovery: A summary of results”, International Conference on
Geographic Information Science, Springer International Publishing, pp. 284–300.
Openshaw, S. (1984), “The modifiable areal unit problem”, Geo Abstracts,
University of East Anglia.
Pan, B., Demiryurek, U., Banaei-Kashani, F. and Shahabi, C. (2010), “Spatiotemporal
Summarization of Traffic Data Streams”, Proceedings of the ACM SIGSPATIAL
International Workshop on GeoStreaming, ACM, New York, NY, USA, pp. 4–10.
Pei, T., Jasra, A., Hand, D.J., Zhu, A.-X. and Zhou, C. (2009), “DECODE: a new method for
discovering clusters of different densities in spatial data”, Data Mining and Knowledge
Discovery, Vol. 18 No. 3, p. 337.
Pillai, K.G., Angryk, R.A. and Aydin, B. (2013), “A Filter-and-refine Approach to Mine
Spatiotemporal Co-occurrences”, Proceedings of the 21st ACM SIGSPATIAL
International Conference on Advances in Geographic Information Systems, ACM, New
York, NY, USA, pp. 104–113.
Quinlan, J.R. (1993), “C4.5: Programs for Machine Learning”, Morgan Kaufmann, p. 38.
Radke, R.J., Andra, S., Al-Kofahi, O. and Roysam, B. (2005), “Image change detection
algorithms: a systematic survey”, IEEE Transactions on Image Processing, Vol. 14 No.
3, pp. 294–307.
Rignot, E.J.M. and van Zyl, J.J. (1993), “Change detection techniques for ERS-1 SAR data”,
IEEE Transactions on Geoscience and Remote Sensing, Vol. 31 No. 4, pp. 896–906.
Ripley, B.D. (1976), “The second-order analysis of stationary point processes”, Journal of
Applied Probability, pp. 255–266.
Roddick, J.F. and Spiliopoulou, M. (1999), “A bibliography of temporal, spatial and spatio-
temporal data mining research”, ACM SIGKDD Explorations Newsletter, Vol. 1 No. 1,
pp. 34–38.
Scally, R. (2006), GIS for Environmental Management, ESRI Press.
Schubert, E., Zimek, A. and Kriegel, H.-P. (2014), “Local outlier detection reconsidered: a
generalized view on locality with applications to spatial, video, and network outlier
detection”, Data Mining and Knowledge Discovery, Vol. 28 No. 1, pp. 190–237.
Shekhar, S. and Chawla, S. (2003), Spatial Databases: A Tour, 1st edition, Prentice Hall, Upper
Saddle River, N.J.
Shekhar, S., Evans, M.R., Kang, J.M. and Mohan, P. (2011), “Identifying patterns in spatial
information: A survey of methods”, Wiley Interdisciplinary Reviews: Data Mining and
Knowledge Discovery, Vol. 1 No. 3, pp. 193–214.
Shekhar, S., Jiang, Z., Ali, R.Y., Eftelioglu, E., Tang, X., Gunturi, V.M.V. and Zhou, X. (2015),
“Spatiotemporal Data Mining: A Computational Perspective”, ISPRS International
Journal of Geo-Information, Vol. 4 No. 4, pp. 2306–2338.
Shekhar, S., Lu, C.-T. and Zhang, P. (2003), “A unified approach to detecting spatial outliers”,
GeoInformatica, Vol. 7 No. 2, pp. 139–166.
Shekhar, S. and Xiong, H. (2007), Encyclopedia of GIS, Springer Science & Business Media.
Shekhar, S., Yang, T.A. and Hancock, P.A. (1993), “An intelligent vehicle highway information
management system”, Computer-Aided Civil and Infrastructure Engineering, Vol. 8 No.
3, pp. 175–198.
Shekhar, S., Zhang, P., Huang, Y. and Vatsavai, R.R. (2003), “Trends in spatial data mining”,
Data Mining: Next Generation Challenges and Future Directions, pp. 357–380.
Solberg, A.H.S., Taxt, T. and Jain, A.K. (1996), “A Markov random field model for classification
of multisource satellite imagery”, IEEE Transactions on Geoscience and Remote Sensing,
Vol. 34 No. 1, pp. 100–113.
Stojanova, D., Ceci, M., Appice, A., Malerba, D. and Džeroski, S. (2013), “Dealing with spatial
autocorrelation when learning predictive clustering trees”, Ecological Informatics, Vol.
13, pp. 22–39.
Strahler, A.H. (1980), “The use of prior probabilities in maximum likelihood classification of
remotely sensed data”, Remote Sensing of Environment, Vol. 10 No. 2, pp. 135–163.
Tango, T., Takahashi, K. and Kohriyama, K. (2011), “A Space–Time Scan Statistic for Detecting
Emerging Outbreaks”, Biometrics, Vol. 67 No. 1, pp. 106–115.
Thoma, R. and Bierling, M. (1989), “Motion compensating interpolation considering covered and
uncovered background”, Signal Processing: Image Communication, Vol. 1 No. 2, pp.
191–212.
Tobler, W.R. (1979), “Cellular Geography”, in Gale, S. and Olsson, G. (Eds.), Philosophy in
Geography, Springer Netherlands, pp. 379–386.
Verhein, F. (2009), “Mining Complex Spatio-Temporal Sequence Patterns”, Proceedings of the
2009 SIAM International Conference on Data Mining, Society for Industrial
and Applied Mathematics, pp. 605–616.
Wang, M., Wang, A. and Li, A. (2006), “Mining Spatial-temporal Clusters from Geo-databases”,
Advanced Data Mining and Applications, Springer Berlin Heidelberg, pp. 263–270.
Wang, S., Huang, Y. and Wang, X.S. (2013), “Regional Co-locations of Arbitrary Shapes”,
Advances in Spatial and Temporal Databases, Springer Berlin Heidelberg, pp. 19–37.
Warrender, C.E. and Augusteijn, M.F. (1999), “Fusion of image classifications using Bayesian
techniques with Markov random fields”, International Journal of Remote Sensing, Vol.
20 No. 10, pp. 1987–2002.
Worboys, M. (2005), “Event‐oriented approaches to geographic phenomena”, International
Journal of Geographical Information Science, Vol. 19 No. 1, pp. 1–28.
Worboys, M.F. and Duckham, M. (2004), GIS: A Computing Perspective, Second Edition, CRC
Press.
Wu, M., Jermaine, C., Ranka, S., Song, X. and Gums, J. (2010), “A Model-Agnostic Framework
for Fast Spatial Anomaly Detection”, ACM Trans. Knowl. Discov. Data, Vol. 4 No. 4, pp.
20:1–20:30.
Xiong, H., Shekhar, S., Huang, Y., Kumar, V., Ma, X. and Yoo, J.S. (2004), “A Framework for
Discovering Co-location Patterns in Data Sets with Extended Spatial Objects”,
Proceedings of the 2004 SIAM International Conference on Data Mining,
Society for Industrial and Applied Mathematics, pp. 78–89.
Yang, K., Evans, M.R., Gunturi, V.M.V., Kang, J.M. and Shekhar, S. (2014), “Lagrangian
Approaches to Storage of Spatio-Temporal Network Datasets”, IEEE Transactions on
Knowledge and Data Engineering, Vol. 26 No. 9, pp. 2222–2236.
Yoo, J.S. and Shekhar, S. (2006), “A Joinless Approach for Mining Spatial Colocation Patterns”,
IEEE Transactions on Knowledge and Data Engineering, Vol. 18 No. 10, pp. 1323–
1337.
Yuan, M. (1996), “Temporal GIS and spatio-temporal modeling”, Proceedings of Third
International Conference Workshop on Integrating GIS and Environment Modeling,
Santa Fe, NM.
Yuan, M. (1999), “Use of a Three-Domain Representation to Enhance GIS Support for Complex
Spatiotemporal Queries”, Transactions in GIS, Vol. 3 No. 2, pp. 137–159.
Zhang, D., Li, N., Zhou, Z.-H., Chen, C., Sun, L. and Li, S. (2011), “iBAT: Detecting Anomalous
Taxi Trajectories from GPS Traces”, Proceedings of the 13th International Conference
on Ubiquitous Computing, ACM, New York, NY, USA, pp. 99–108.
Zhang, P., Huang, Y., Shekhar, S. and Kumar, V. (2003a), “Correlation Analysis of Spatial Time
Series Datasets: A Filter-and-Refine Approach”, in Whang, K.-Y., Jeon, J., Shim, K. and
Srivastava, J. (Eds.), Advances in Knowledge Discovery and Data Mining, Springer
Berlin Heidelberg, pp. 532–544.
Zhang, P., Huang, Y., Shekhar, S. and Kumar, V. (2003b), “Exploiting Spatial Autocorrelation to
Efficiently Process Correlation-Based Similarity Queries”, Advances in Spatial and
Temporal Databases, Springer Berlin Heidelberg, pp. 449–468.
Zhang, T., Ramakrishnan, R. and Livny, M. (1996), “BIRCH: An Efficient Data Clustering
Method for Very Large Databases”, in Jagadish, H.V. and Mumick, I.S. (Eds.),
Proceedings of the 1996 ACM SIGMOD International Conference on Management of
Data, Montreal, Quebec, Canada, June 4-6, 1996, ACM Press, pp. 103–114.
Zhou, X., Shekhar, S. and Ali, R.Y. (2014), “Spatiotemporal change footprint pattern discovery:
an inter-disciplinary survey”, Wiley Interdisciplinary Reviews: Data Mining and
Knowledge Discovery, Vol. 4 No. 1, pp. 1–23.