Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
1
Automatic Extraction of Road Networks from GPS Traces
A road map inference method that generates road network from GPS traces through point
segmentation and grouping.
Jia Qiu, Department of Geomatics Engineering, University of Calgary, 2500 University Drive
NW, Calgary, Alberta, Canada T2N 1N4 [email protected]
Ruisheng Wang, Department of Geomatics Engineering, University of Calgary, 2500 University
Drive NW, Calgary, Alberta, Canada T2N 1N4 [email protected]
2
Abstract: We propose a point segmentation and grouping method to generate road maps
from GPS traces. First, we present a progressive point cloud segmentation algorithm based on
Total Least Squares (TLS) line fitting. Second, we group topologically connected point clusters
by the point’s orientation and cluster’s spatial proximity, where the topological relationship is
generated using Hidden Markov Model (HMM) map matching. Finally, we refine the
intersections of roads so that their geometrical and topological relationships are consistent with
each other. Experimental results show that our algorithm is robust to noises and the generated
road network has a high accuracy in terms of geometry and topology. Compare to the
representative algorithms; the results of our new algorithm have a higher F-measure score for
different matching thresholds.
Keywords: Map inference; point cloud segmentation; point cluster grouping; HMM map
matching.
3
1 Introduction
A road network is one of the most fundamental data of geospatial information. It is,
however, challenging to promptly and consistently update the existing road maps (Wang et al.,
2013). Conventional map updating techniques mainly consist of ground-based and aerial
surveys (Guo et al., 2007). Ground-based surveys are expensive and labor intensive. They are
limited to a long cycle of information acquisition. The aerial surveys have the capability to
obtain large-scale data about the Earth’s surface in a short time. Though several algorithms have
been proposed to extract road network from high-resolution satellite imagery (Doucette et al.,
2004, Youn et al., 2008, Kasemsuppakorn et al., 2013 ), it is still difficult to generate large-scale
road networks with these algorithms, due to the complexity of road networks and the occlusions
in satellite images caused by trees and buildings.
To overcome the shortcomings of traditional road map mapping, nowadays, user generated
content (UGC) (Krumm et al., 2008) is being increasingly used in road map generation. The
UGC is achieved in two ways: (1) collaborative mapping programs, such as OpenStreetMap,
which are highly dependent on manual work; and (2) automatic road generation from Global
Positioning System (GPS) trajectories (Li et al., 2014), which is referred to as map inference.
Being an efficient method for road network updates, map inference focus on the extraction of
road’s geometric positions, topological connections as well as some attribute information such
as lane counts (Schrödl et al., 2004) and road names (Li et al., 2014). The GPS traces are
collected by GPS-equipped vehicles, bicycles or pedestrians and therefore, better captures the
4
geometry and topology of the latest road networks. Most of the existing map inference methods
are designed to deal with low-noise, densely sampled and uniformly distributed GPS traces
(Guo et al., 2007, Wang et al., 2013). In this research, we propose a novel algorithm to infer
high-quality road maps from low-cost and high-noise GPS traces. The primary contributions of
our method are summarized in the following:
A robust progressive 2D point cloud segmentation algorithm using multi-density
thresholds to separate GPS points that visit different roads.
A topological relationship construction algorithm based on Hidden Markov Model
(HMM) based map matching.
An –quantile distance to measure spatial proximity between two point clusters for
grouping.
The remainder of the paper is organized as follows: Section 2 presents a review of the
related work. Section 3 outlines the procedure of our map inference algorithm and Section 4
describes the experiments of two GPS trace datasets. Finally, Section 5 concludes the paper,
outlining possible future work on this topic.
2 Related Work
Several approaches have been proposed to automatically infer road maps from GPS traces.
These approaches mainly include: clustering based methods, trace merging based methods,
Kernel Density Estimation (KDE) based methods, and intersection linking based methods
(Biagioni and Eriksson, 2012a, Ahmed et al. 2014).
5
The clustering based approaches begin with the determination of a series of cluster seeds
by location, bearing and heading of the traces. Seeds are then linked to form the road segments
according to the raw traces. Edelkamp and Schrödl (2003) introduced an approach to infer road
maps with high geometric accuracy, containing detailed lane information from fine-grained GPS
traces. Based on this work, Schrödl et al. (2004) described a process to refine the intersections
according to transition relationships between traces and segments. Agamennoni et al. (2010,
2011) regarded road map inference as finding the principal curve of the road segment by
determining and linking the cluster seeds. Their method, however, is not robust to noises
(Biagioni and Eriksson, 2012b) and requires densely sampled GPS traces.
Trace merging based approaches involve the consolidation of trace segments based on their
geometric relations and shape similarity. The centerlines of roads are then generated by the
merged traces. Cao and Krumm (2009) proposed an aggregation technique to pull together
traces that visit the same road. Even though their method performs well on densely sampled
GPS traces, it is found to be non-robust towards noisy data. Based on the research of Cao and
Krumm (2009), Wang et al. (2014) presented a novel approach for generating routable road
maps from GPS traces. They introduced circular boundaries to isolate points near intersections
and used a k-means clustering method to refine the intersections. This method also aimed at
dealing with densely sampled GPS traces. Niehoefer et al. (2009) described a map generation
method by Intra Journey Merge Routine (IJMR). The core of IJMR method is the merging of
trace segments by the distances between them. Chen et al. (2010) developed an algorithm for
6
road network reconstruction by extracting and linking shared structure in planar paths (GPS
traces). Their method theoretically guarantees geometric accuracy based on the assumptions that
the traces are densely sampled and located on the road surface. However, due to noise, the traces
may not meet the second assumption. Wang et al. (2013) applied a map inference method to
update road maps. They matched the GPS traces with the existing road map and aggregated the
unmatched traces. A polygonal principal curve algorithm (Kégl et al., 2000) is employed
thereafter to infer the road centerlines and subsequently, updating the road map. The
performance of trace merging approach, a type of line-based method (Liu et al., 2012), is highly
dependent on the sampling rates. If two consecutive points of traces are far from each other,
especially when the vehicle turns at an intersection, the segments of GPS traces cannot align
with the roads. In this case, spurious road segments are produced.
The KDE based methods treat the map inference as a digital image processing procedure.
Raw GPS traces are rasterized to a 2D gray image and a global threshold is then used to filter
the image. Finally, skeletonization approaches, such as the Voronoi diagram and morphological
operations, are applied to extract centerlines of the road segments. Examples include the method
proposed by Davies et al. (2006), Chen and Cheng (2008), and Shi et al. (2009). This type of
method is sensitive to the density distribution and it is difficult to select an appropriate global
threshold. To overcome the limitations of the KDE, Biagioni and Eriksson (2012b) proposed a
progressive KDE based method. They developed a hybrid map inference pipeline, consisting of
initial map generation by gray-scale skeletonization, map pruning with a Viterbi map matching
7
algorithm and road map topology and geometry refinement. Their method is robust to noise and
disparity; however, since it is a line-based approach, sparsely sampled traces cannot be handled.
Intersection linking approaches first detect the intersections of the road map and then link
the intersections according to the raw traces. Fathi and Krumm (2010) developed a classifier to
find the intersections from GPS traces. After finding the intersections, they used the raw traces
to prune and connect the intersections. Their method works well on road networks with simple
intersections and densely sampled traces. Karagiorgou and Pfoser (2012) proposed a similar
algorithm consisting of the identification and connection of intersections and the network graph
reduction. They use a speed threshold and changes in direction as indicators to identify the
intersections. The traces around the intersections are then clustered in terms of the distance to
determine the turns of the intersections. Finally, the links (road segments) are generated and
compacted along the paths of the traces. These methods are also sensitive to the sampling rates
of the GPS traces.
Due to the accumulation of noisy GPS points in an area, spurious road segments will be
produced by clustering based methods and KDE based methods. The performance of trace
merging based methods and intersection linking based methods highly depends on the accuracy
and sampling rate of GPS traces. In most of the cases, however, the accuracy of GPS receiver
cannot be guaranteed due to the urban canyon. Recently, Qiu et al. (2014) proposed a method to
segment sparsely sampled GPS traces into point clusters for generation of road centerlines.
Based on this work, we propose a new map inference algorithm for generating road network
8
with high geometric accuracy from noisy GPS traces.
3 Method
The general procedure of our method is shown in Figure 1 . It consists of four main steps:
GPS traces preprocessing, point cloud progressive segmentation, point cluster grouping and
road network geometry refinement.
Figure 1 The general procedure of our map inference method.
3.1 GPS Trace Pre-Processing
The dataset of GPS traces Tr are collected by GPS-equipped vehicles. A GPS trace tr is
collected by a vehicle in a time span. It consists of a sequence of GPS points, denoted by tr={pi
|i=1, 2,…, m} where each point pi=(loni, lati, ti) contains the information of the vehicle’s positon
at a specific time. The coordinates loni[-180, 180] and lati[-90, 90] are the longitude and
9
latitude of the point and tiR+ is the time at which the point is collected. The number of the
points in tr is m. Having a large amount of GPS traces collected by vehicles over space and time,
we can generate road network by the accumulated traces. In the experiment, the longitude and
latitude of all the points are transformed to a projected coordinate system, denoted by (x, y). We
associate an orientation for each point in the raw GPS traces. The orientation is computed by the
vector that is pointed from the point to its next point. If the point is the last point of a trace, set
its orientation as the value of its previous one. The range of the orientation is in [0°, 360°). It
starts from x-axis and advances in the counterclockwise direction.
Before segmenting GPS points, all traces are preprocessed as follows. For a trace tr={pi |
i=1,…, m},
1) if the distance between two consecutive point d(pg, pg+1)<dmin or d(pg, pg+1)>dmax, we
separate the trace tr into two traces as tr={pi |i=1,…, g}, and tr’={pi| i=g+1,…, m}, where
1≤g<m.
2) if the difference of two points’ (pg and pg+1) orientations is larger than a threshold
dDoOmax, we separate the trace tr into two traces as in 1).
The value of dmin is set to 3 meters, which approximates to the accuracy of standard GPS
devices. After computing all the distances between two consecutive points for all GPS traces as
D, we estimate dmax as dmax= d +3×, where d is the mean of the values in D and is estimated
by Median Absolute Deviation (MAD) as Equation (1).
1.4826 median | median ( ) |i i j jX X (1)
10
where X is a set representing D. The reasons for dividing traces by distance are: 1) if the
distance between two consecutive points is smaller than the GPS error, the two points might be
collected in the same position; 2) if two points are far away from each other, the line segment
between them cannot properly align with the real road. We set the value of dDoOmax to 90°. A
sharp change in the orientation of two points indicates a sudden change in the direction of the
road, however, this conclusion violates the smoothness assumption of the roads.
3.2 Progressive Segmentation of Points
Given a set of GPS traces, our algorithm generates a set of clusters C={ci | i=1, 2,…, N},
where each cluster represents a smooth road. A cluster ci={pi,r | r=1, 2,…, ni} consists of a set of
GPS points, where ni is the point number of cluster ci. The distribution of GPS traces and points
varies over space. Some areas have more points than others. Thus, we employ a progressive
strategy to segment the points. The process is controlled by a density parameter , which
represents the number of points in a given area. The general procedure is described in Algorithm
1. The inputs are the set of GPS points P, a radius and the difference of point orientations
for neighbours searching, and a list of densities list sorted in descending order. The output is a
set of point clusters C. First, we segment the points using the first in list. Second, we apply
robust Locally Weighted Scatter plot Smooth (Lowess) (Cleveland, 1979) algorithm to generate
centerlines L for the point clusters C. Third, we match all raw GPS traces with resampled
centerlines L’ to filter GPS points. If an unclassified raw point is matched with a point of a
resampled centerline, we filter out the point as outlier. Finally, we repeat the above three steps
11
by the next until all in list are visited. In this way, the points on the roads frequently visited
by vehicles are extracted first.
Algorithm 1: Progressive Points Segmentation
1
2
3
4
5
6
7
8
9
10
Input: P, , , list
Output: C
initialize nOutlier = 100000000, C = ∅, ∀𝑝 ∈ 𝑃 set p.cId = -1;
for 𝛿 ∈ 𝛿𝑙𝑖𝑠𝑡 do
compute C = C ∪ PointsSegmentation{P, , , list};
generate centerlines L = {li | i = 1, 2, …, N} by C using robust Lowess;
matching raw GPS traces Tr with resampled centerlines L’={ l’i | i = 1, 2, …, N };
for tr ∈ Tr do
obtain tr = {pi | i = 1, 2, …, m};
for pi ∈ tr do
if pi is matched ⋀ pi.cId == -1 then
set pi.cId = nOutlier;
3.2.1 Points Segmentation
We develop an efficient segmentation algorithm to partition GPS points into clusters. First,
we initialize a cluster ck by a point and its neighbours. Then, we fit the cluster by a line lk, which
is computed using Total Least Squares (TLS) fitting. Finally, we grow the cluster ck by adding
the neighbours of points in ck under the line lk’s constraint. Algorithm 2 outlines the general
12
procedure of our GPS points segmentation. We associate an ,-neighborhood (Definition 1) for
each point and an -neighborhood (Definition 2) for each line generated by TLS.
Definition 1: The ,-neighbourhood of a point p, denoted by point set N(p,,), is defined
by N(p,,)={qP | d(p,q)≤ DoO(p,q)≤}, where P is the set of points, d(p,q) is the
Euclidean distance of p and q, and DoO(p,q) is the difference of points’ orientation between p
and q.
Definition 2: The -neighbourhood of a line l, denoted by point set N(l,), is defined by
N(l,)={pP | d(p,l)≤}, where P is the set of points, d(p,l) is the perpendicular distance from
point p to line l.
Algorithm 2: Points Segmentation
1
2
3
4
5
6
7
8
Input: P, , ,
Output: C
initialize cluster id k = 1;
for pi ∈ P do
if pi.cId == -1 ⋀ pi is a segment point then
Initialize ck = pi ∪ N(pi, , ,), nSize = sizeof(ck), 𝛿𝑘=1000;
if nSize >= 2 then
while 𝛿𝑘< do
ck = UpdateCluster {P, ck, , , };
𝛿𝑘 = sizeof(ck) – nSize, nSize = sizeof(ck);
13
9
10
11
12
13
14
15
16
17
if sizeof(ck) < then
empty ck and continue;
construct a new coordinate system CS by ck using PCA;
update 𝛿𝑘 = 1000, nSize = sizeof(ck);
while 𝛿𝑘 > do
ck = ExpandCluster {P, ck, , , , CS};
𝛿𝑘 = sizeof(ck) – nSize, nSize = sizeof(ck);
∀pj ∈ ck, set pj.cId = k, push ck into C.
k = k + 1;
In algorithm 2, after initializing the point cluster ck by a point pi and N(pi,,), we update ck
by Algorithm 3 ‘UpdateCluster.’ The output of algorithm 3 is an updated point cluster c’k. First,
a base orientation dOri and a line lk are computed, where dOri is the median value of
orientations of points in ck and lk is generated by ck using TLS fitting. If a point pt meets three
conditions: 1) pt is in N(lk,), 2) the DoO of pt’s orientation and dOri is less than and 3) the
previous point pt(pre) or next point pt(next) of pt is also in N(lk,), then we add pt to the new
cluster c’k. These conditions aim at finding points to form a line segment. The third condition is
to filter GPS outliers. If a point is located in the buffer region of a centerline, and its previous
and next points do not present in the buffer region, then the point has a higher probability to be
an outlier. Second, we apply these conditions to all points pt in ck to initialize a queue Q. If pt
meets these conditions, then push pt into Q and c’k. Third, we pop point from Q as pt and find
14
points N(pt,,) joined with pt’s previous and next point as Ntotal. If the point in Ntotal meets the
conditions, we push the point into Q and c’k. Finally, the procedure is terminated when Q is
empty. A flag ‘visited’ is used for points in the process to avoid pushing points into Q or c’k
twice. The previous and the next points are obtained by the raw GPS traces. Due to the
disparities in GPS traces distribution, the ,-neighbourhood of some points is empty. Thus, we
consider the previous and next point to expand the point cluster. We repeat the ‘UpdateCluster’
algorithm until the number of cluster points is stable, which means that less than points are
added to cluster ck after the cluster expansion.
Algorithm 3: Update Cluster
1
2
3
4
5
6
7
8
9
Input: P, ck, ,
Output: c’k
initialize dOri as median orientation of points in ck;
fit cluster ck by line lk using Total Least Square (TLS) algorithm;
initialize an empty queue Q;
for pt ∈ ck do
get orientation dOrit of pt;
if pt ∈ N(lk,) ∧ DoO(dOrit, dOri) < then
if pt(pre) ∈ N(lk,) ∨ pt(next) ∈ N(lk,) then
push pt into Q, add pt into c’k and set pt as ‘visited’;
while Q ≠ ∅ do
15
10
11
12
13
14
15
16
pop point from Q as pt;
find previous and next point of pt joined with N(pt, , ) as Ntotal;
for pt ∈ Ntotal do
get orientation dOrit of pt;
if pt is not ‘visited’ ∧ pt ∈ N(lk,) ∧ DoO(dOrit, dOri) < then
if pt(pre) ∈ N(lk,) ∨ pt(next) ∈ N(lk,) then
push pt into Q, add pt into c’k and set pt as ‘visited’;
In Algorithm 2, after generating the cluster ck (Algorithm 3), we expand it (Algorithm 4)
using the points present in the head and tail regions of ck. The ‘ExpandCluster’ algorithm is
designed to generate curves or winding roads rather than straight roads. The output is an
expanded cluster c’k. Consider an illustration in Figure 2, (a) shows the transformed coordinate
system CS and the point cluster ci generated by ‘UpdateCluster.’ The CS is computed by
Principle Component Analysis (PCA), where the abscissa axis of CS is the eigenvector of the
first component and the vertical axis is the eigenvector of the second component. After sorting
the points in CS, we compute x0=min{pi,r.x | r=1, 2,…,ni}, x3=max{pi,r.x | r=1, 2,…,ni}, and
x1=x0+6×, x2=x3-6×. We then set the head region and the tail region as R(H)=[x0, x1], and
R(T)=[x2, x3], respectively. The points in head region P(H) and points in tail region P(T) are
obtained by R(H) and R(T). This setting guarantees that P(H) and P(T) will form road segments
that their orientations are consistent with the abscissa of CS. Subsequently, we apply
‘UpdateCluster’ algorithm to points P(H) and P(T) separately to generate expanded clusters
16
P’(H) and P’(T), as shown in Figure 2 (c). The head expands to the region with x value less than
x1 and tail will expand to the region with x value greater than x2. Points of P’(H) and P’(T) with
x values in R(H) and R(T) also require the distances to the nearest point of P(H) or P(T) less
than . The ‘ExpandCluster’ algorithm will terminate when less than points are added after
expanding the cluster.
Figure 2 An illustration of expanding cluster by the head and tail regions, (a) the transformed
coordinate system CS; (b) the head and tail region of the point cluster; and (c) the new cluster
generated by expanding the tail region.
Algorithm 4: Expand Cluster
1
2
3
4
Input: P, ck, , , , CS
Output: c’k
sort points of ck in ascending order by their x values in CS;
define head and tail ranges of ck as R(H) = [x0, x1] and R(T) = [x2, x3];
find points P(H) and P(T) in the head and tail regions of ck by their x values in CS;
initialize 𝛿𝑘 = 1000, P(H)temp=P(H), P(T)temp=P(T), nSize = sizeof(P(H)) + sizeof(P(T));
17
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
while 𝛿𝑘 > 𝛿 do
𝑃′(𝐻) = UpdateCluster {P, P(H)temp, , , };
𝑃′(𝑇) = UpdateCluster {P, P(T)temp, , , };
generate 𝑃′(𝑈) = 𝑃′(𝐻) ∪ 𝑃′(𝑇);
transform all points in 𝑃′(𝑈) to the coordinate system CS;
empty P(H)temp, P(T)temp;
for 𝑝𝑢′ ∈ 𝑃′(𝑈) do
if 𝑝𝑢′ . 𝑥 < 𝑥0 then
add 𝑝𝑢′ to P(H)temp;
if 𝑥0 ≤ 𝑝𝑢′ . 𝑥 ∧ 𝑝𝑢
′ . 𝑥 < 𝑥1 then
dmin = min{𝑑(𝑝𝑢′ , 𝑝ℎ)}, ∀𝑝ℎ ∈ 𝑃(𝐻);
if dmin < then
add 𝑝𝑢′ to P(H)temp;
if 𝑥2 ≤ 𝑝𝑢′ . 𝑥 ∧ 𝑝𝑢
′ . 𝑥 < 𝑥3 then
dmin = min{𝑑(𝑝𝑢′ , 𝑝𝑡)}, ∀𝑝𝑡 ∈ 𝑃(𝑇);
if dmin < then
add 𝑝𝑢′ to P(T)temp;
if 𝑝𝑢′ . 𝑥 > 𝑥3 then
add 𝑝𝑢′ to P(T)temp;
𝛿𝑘 = sizeof(P(H)temp) + sizeof(P(T)temp) – nSize,
18
25
26
nSize = sizeof(P(H)temp) + sizeof(P(T)temp);
transform all points in 𝑃′(𝑈) to the original coordinate system;
update 𝑐′𝑘 = P(H)temp ∪ P(T)temp
All the clusters are initialized to a point and points in its ,-neighbourhood. Different
initial points generate different clusters. We classify all GPS points into two categories, the
‘intersection points’ and ‘segment points.’ The intersection points are points located around two
ends of the road segments. The segment points are points located in the middle area of the road
segments. In our experiment, we notice that using segment point to initialize the point clusters
will generate better results in terms of geometric accuracy of the road map. The reason is that
the orientation of a segment point is more accurate than that of an intersection point. Based on
the Difference of Normals (DoN), which is proposed by Ioannou et al. (2012), we use a
multi-scale operator called Difference of Orientations (DoO) to classify GPS points. We
estimate the orientation dOri(pi,) of a point pi by applying PCA to the union of pi and N(pi,,).
The orientation is the direction of the first component’s eigenvector. We compute two
orientations dOri(pi,r1), dOri(pi,r2) using the support radii, r1 and r2. Then the difference of
orientation dDoOi= |dOri(pi,r1) - dOri(pi,r2)|, module=180°, is computed, where r1=, r2=1.6×r1.
Plate 1 shows the distribution of the DoO of our GPS data. DoO ranges from 0° to 90°. After
obtaining difference of orientations, DoO = {dDoOi | i=1, 2, …, n} for all GPS points, we use
MAD formula given in Equation (1) to estimate a DoO, which in turn describes the distribution
of DoO. Then, we classify point pi as segment point if dDoOi≤3×DoO and the others as
19
intersection points.
Plate 1 The DoO distribution of road network
Our GPS point segmentation method needs three parameters: , for neighbourhood
generation and for initializing clusters and terminating clusters growth. We fix the search
radius to 15 meters. This value is estimated based on the lane width and the lane number of the
real urban roads. The lane width in cities is about 3.75~4.0 meters. We assume that the roads
visited by vehicles contain at least 4 lanes. We set the difference of orientations to to 30°. The
value has a high tolerance to the noise of GPS points and to the inaccurate orientations of points.
The parameter represents the density of points and we will use this parameter to implement
the progressive segmentation algorithm.
3.2.2 Centerline Generation
After obtaining the segmentation result C, we generate centerlines to represent the real
roads. First, we construct a new coordinate system CS by all the points in ci using PCA
transformation. Second, we connect all the points in ci by their x values in CS in ascending order
to form the centerline li. Finally, we apply the robust Lowess method to generate centerlines
from the point clusters. The method is a curve smoothing algorithm and removes noises and
20
outliers by residual analysis. It includes five steps. First, a locally weighted linear least-squares
regression is applied for each point to smooth the curve. Second, the residual for each point is
calculated according to the smoothed curve. Third, the value computed by MAD is applied to
remove outliers and update local weights. Fourth, the curve is smoothed using step one again by
the updated weights. Finally, the previous three steps are repeated for a total of five iterations to
obtain a smooth curve.
In the algorithm, the local linear regression for each point is performed on its neighbours,
where the neighbours are computed by their x values in CS. It is important to determine how
many neighbours are used for the local linear regression. We use the average number of the
points contained in a specified span, dSpan to estimate the number of neighbours k for each
cluster. Considering Figure 3 as an illustration, after transforming all the points into the new
coordinate system CS, we use the formula k=dSpan/l×n to estimate the number of neighbours,
where l=xmax-xmin, n is the number of points in the cluster. We fix the dSpan to 10 meters
according to the experimental tests to calculate k for each cluster.
Figure 3 Illustration of robust Lowess algorithm
For each point cluster ci and the generated centerline li, we assume that the distances from
the points of the cluster to li complies to Gaussian distribution G~(, ), where =0 represents
21
the centerline and is estimated by MAD (Equation (1)). In the next section, we use max ii
as max to search the candidate list for initializing emission probabilities.
3.2.3 HMM based Map Matching
According to Newson and Krumm (2009), matching GPS traces with the actual road map
(i.e. map matching) is to find the most probable route that the vehicle visited based on the
location of vehicle measured by GPS and the connectivity of the road map. They modeled the
map matching problem as a HMM. In our method, by matching GPS points with sample points
of road centerlines, we construct the topological relationship between road centerlines. We also
use the matching result to filter out points generated by GPS noise. For a point cluster ci, we use
the robust Lowess to generate the centerline li. Then we sample the road centerline as follows. If
the road length of three consecutive points is smaller than a threshold (i.e. 3 meters), remove the
middle point. If the road length between two points is larger than the same threshold, add points
every 3 meters between them. A new road centerline will be obtained as li = {vi,r | r=1, 2,…, n’i},
where vi,r is the sample point of li, and n’i is the new number of points.
The transition probabilities of HMM give the probability of a vehicle moving from one
sample point to another. The emission probability of a particular point pt at time t associated
with a nearby sample point vf, denoted by Pr(vf |pt), gives the likelihood that pt was observed
from nearby sample point vf. For a point pt of a GPS trace, we search up to 10 nearest sample
points in N(pt,,) to generate the candidate list for matching with pt. Figure 4 presents the
procedure of matching GPS points with the sample points. We can calculate probabilities of all
22
the possible routes that allow a vehicle travel from point pt-1 to point pt according to the
transition probabilities and emission probabilities. We then choose the route that maximizes the
probability as the matching result.
Figure 4 All possible routes for transferring from point pt-1 to point pt
We compute a weighted connection matrix W = {wi,j | i, j=1, 2,…,N} between point clusters
in C according to the raw GPS traces to initialize the transition matrix. If point pi,r in cluster ci
and its next point pj,s in cluster cj belong to a same GPS trace, we conclude that vehicles have
probability to travel from sampled road centerline li to lj. Thus, we can infer the weighted
connection wi,j from cluster ci to cj. Consider Figure 5 as an illustration, 9 points are contained in
cluster ci, where 4 points transit from ci to cj, 3 points transit from ci to ck and 2 points transit
from ci to itself. We then compute wi,j=4/9, wi,k=3/9 and wi,i= 2/9. The sum of the connection
weight originated from ci is equal to 1. The number of non-zero weight originated from ci is
defined as the outdegree, deg(ci) of the cluster ci. In practice, for most of the cluster ci, the
number of points that transit from ci to itself is larger than the number of points that transit from
23
ci to others. This fact results in that wi,i is much greater than wi,j, where i j. The other fact is
that wi,j is much greater than wr,s, if deg(ci) is much greater than deg(cr). It indicates that
connection weights originated from lower outdegree clusters are larger than the connection
weights originated from higher outdegree clusters. In order to make the connection weight
independent of the outdegree, we compute the weighted connection matrix W as follows:
1) ciC, calculate the number m of the points in ci such that their next points are not in
ci.
2) wi,jW, (i j), calculate the number k of the points in ci such that their next points in
cj, then, set wi,j = k/m.
3) wi,iW, set wi,i =1 if m ni.
4) ciC, compute the outdegree, degi = ||wi,j 0||, (j=1, 2,…, N).
5) compute degmin = min{degi | i=1, 2,…, N}.
6) wi,jW,(i j), update wi,j = wi,j×(degi/degmin).
Figure 5 Points transition between clusters
Given two sampled centerlines li = {vi,r | r=1, 2,…, n’i} and lj = {vj,s | s=1, 2,…, n’j}, which
are derived from the point cluster ci = {pi,r | r=1, 2,…, ni}and cj = {pj,s | s=1, 2,…, nj}, the
transition probabilities are computed based on W as follows:
24
1) For any two sample points vi,r and vi,t that belong to the same centerline li, if the length
of the route {vi,r, vi,r+1,…, vi,t}, where r<t, is less than 6×max, we set the transition probability as
Pr(vi,r, vi,t) = 1.
2) For any two sample points vi,r and vj,s that belong to two different sampled centerline li
and lj, the transition probability is calculated by Equation (2)
2
, , , , ,Pr( , ) / ( , )i r j s i j i r j sv v w d v v (2)
where wi,j is the connection weight from cluster ci to cj and d(vi,r, vj,s) is the Euclidean distance
between the sample points. The intuitions of setting the transition probabilities are: 1) vehicles
prefer to transit from a point to the other on the same centerline; and 2) if staying on the same
centerline is impossible, vehicles are most likely to transit between the two points with smaller
distance and larger connection weight.
For a point pt, its candidate list for matching is computed by N(pt,,). The search radius
is set to 6×max. We then initialize the emission probability Pr(vf |pt), (f=1, 2,…,10) for pt by the
size of points in clusters as follows.
1) For each vf, get the point cluster ci that the derived sampled centerline li contains vf.
2) For each Pr(vf |pt), set its value as the size of points in cluster ci.
3) Normalize Pr(vf |pt) , (f =1, 2,…,10) for pt by scaling the sum of them to 1.
After initializing the transition probabilities and emission probabilities, we use Viterbi map
matching algorithm (Newson and Krumm, 2009, Thiagarajan et al. 2009) to match raw GPS
traces with sampled centerlines to find the most probabilistic route that vehicles visited. The
25
route with highest probability indicates the connection relationship between centerlines.
Centerlines that do not match with any raw trace are treated as noisy points. If two consecutive
sample points belong to two centerlines, we infer that the two centerlines are topologically
connected. We then use all the points that are not segmented and matched as new input for the
next loop of our point segmentation algorithm (Algorithm 1).
3.3 Point Cluster Grouping
Due to the accumulation of GPS noises, points outside the road can be segmented to form
another road centerline. The result also contains some short centerlines with similar orientations.
Thus, we use point clusters grouping to form long curves and join together point clusters near
each other to reduce redundancy. A curve drawn in one smooth movement is known as ‘Stroke’
(Thomson and Brooks, 2002). It is derived from the principle of “Good Continuation,” which
indicates that elements following similar directions tend to be grouped together. We use this idea
to guide our point cluster grouping algorithm.
Considering Figure 6 as an illustration, given two topologically connected road centerline li
and lj, we decide whether to group the corresponding point cluster ci and cj according to the
spatial proximity of li and lj.
First, we temporally group ci and cj to obtain a new point cluster cg=ci cj and generate a
new centerline lg from cg, where lg={vg,t | t=1, 2,…,n’g}, n’g≤n’i+n’j. Vertices of lg and points in
cg are in a one-to-one correspondence with each other except outliers. Vertices have associated
cluster IDs are transformed into a coordinate system CS generated by the PCA transformation.
26
Second, we find the common range R(C) of the two clusters using centerline lg and
generate two centerlines ,{ | 1,2,..., }c c
i r i
c
il v r n and ,{ | 1,2,..., }cc c
j j s jl v s n by points c
ic
and c
jc . Here, c
ic and c
jc , constitute the points of R(C) originally from ci and cj. The common
range R(C) is defined as a range of CS where x values lie in [xmin, xmax]. We find the minimum
and maximum x values among all vertex pairs {vg,t, vg,t+1} of lg as xmin and xmax, where vg,t and
vg,t+1 corresponding to two points in clusters ci and cj, separately.
Third, we propose an -quantile distance between c
il and c
jl to measure the spatial
proximity of the two centerlines li and lj.
1) ,
c c
i r iv l , find , , , ,( , ) min{ ( , ) | }c c c c c c
i r j i r j s j s jd v l d v v v l as distance set ( , )c c
i jD l l ;
2),
c c
j s jv l , find , , , ,( , ) min{ ( , ) | }c c c c c c
j s i j s i r i r id v l d v v v l as distance set ( , )c c
j iD l l ;
3) sort distances in ( , )c c
i jD l l and ( , )c c
j iD l l as ( , )c c
s i jD l l and ( , )c c
s j iD l l in descending
order;
4) obtain the -quantile distance according to ( , )c c
s i jD l l and ( , )c c
s j iD l l as ( , )c c
i jd l l
and ( , )c c
j id l l .
Finally, if the distance ( ( , ) ( , )) / 2c c c c
i j j id l l d l l is less than a pre-defined threshold, we
group ci and cj permanently, otherwise, retain ci and cj. Based on this idea, we propose a point
cluster grouping algorithm described in Algorithm 5.
27
Figure 6 Illustration of generating the common region of two point clusters
Algorithm 5: Point Cluster Grouping
1
2
3
4
5
6
7
8
9
10
11
Input: ci and cj, , , vdSigma
Output: CN
calculate the difference of mean orientations dDoOmean by points in ci and cj;
calculate the distance d = min{min{d(pi,r, pj,s) | pj,s ∈ cj} | pi,r ∈ ci};
if 𝑑𝐷𝑜𝑂𝑚𝑒𝑎𝑛 > 𝛽 ∧ 𝑑 > 6 × max{𝑣𝑑𝑆𝑖𝑔𝑚𝑎} then
return CN = {ci, cj};
generate centerlines li and lj for point clusters ci and cj;
generate centerline lg by 𝑐𝑔 = 𝑐𝑖 ∪ 𝑐𝑗, and construct a coordinate system CS;
∀{𝑣𝑔,𝑡 , 𝑣𝑔,𝑡+1} that belong to two clusters, find R(C) by x values in CS as [xmin, xmax];
∀𝑝𝑖,𝑟 ∈ 𝑐𝑖, find points in R(C) as 𝑐𝑖𝑐 = {𝑝𝑖,𝑟𝑐
𝑐 |𝑝𝑖,𝑟𝑐𝑐 . 𝑥 ∈ [𝑥𝑚𝑖𝑛, 𝑥𝑚𝑎𝑥]};
∀𝑝𝑗,𝑠 ∈ 𝑐𝑗, find points in R(C) as 𝑐𝑗𝑐 = {𝑝𝑗,𝑠𝑐
𝑐 |𝑝𝑗,𝑠𝑐𝑐 . 𝑥 ∈ [𝑥𝑚𝑖𝑛, 𝑥𝑚𝑎𝑥]};
compute d according to –quantile distance as 𝑑 = (𝑑𝛼(𝑙𝑖𝑐 , 𝑙𝑗
𝑐) + 𝑑𝛼(𝑙𝑗𝑐 , 𝑙𝑖
𝑐))/2.0;
if 𝑑 > 6 × (𝑣𝑑𝑆𝑖𝑔𝑚𝑎[𝑖] + 𝑣𝑑𝑆𝑖𝑔𝑚𝑎[𝑗]) ∧ 𝑑 > 2 × 휀 then
28
12
13
14
return CN = { ci, cj};
else
return CN={cg};
The input parameter ‘vdSigma’ is a vector of i for each cluster ci. The output is {ci, cj} or
the grouped cluster cg. At the beginning, we use the difference of mean orientations of the two
point clusters ci and cj to determine whether the two centerlines li and lj smoothly connect to
each other. According to Jiang et al. (2008), the deflection angle ranging in [30°, 75°] produces
stable ‘Stroke’ from road segments. The deflection angle is the angle between two connected
road segments. In our method, we intend to generate a longer smooth curve lg from point cluster
cg. Similar to the idea of ‘Stroke’ generation, if the difference of mean orientations of ci and cj is
larger than a pre-defined threshold , we assume that ci and cj cannot be grouped to generate a
longer smooth curve. We set to 60°, which is in the range [30°, 75°]. However, in order to
group point clusters with smaller difference of orientations earlier than point clusters with larger
difference of orientations, we apply the grouping algorithm in a progressive manner. More
specifically, we begin Algorithm 5 on a set of point cluster C with a small (i.e. 10°) and
iteratively run Algorithm 5 by increasing value of with an interval size (i.e. 25°) until
reaches 60°.
Due to the disparities in the sampling rate of the raw GPS traces, the two clusters along the
most probabilistic route that vehicles visited may be far from each other. Thus, we use the
minimum function described in Equation (3) to measure the distance between two point clusters.
29
If the distance between two clusters d (ci, cj) is greater than 6×max, they are not groupable.
, ,
, ,, min mi( n ( , )) ( )i r i j s j
i j i r j sp c p c
d c c d p p
(3)
If the difference of mean orientations of ci and cj is smaller than , we use the geometric
position of the vertices in li and lj to determine if the two point clusters can be grouped or not.
We compute two -quantile distances to measure the spatial proximity between two point
clusters. One is computed from li to lj and the other is computed from lj to li. If the mean of two
distances is smaller than 6×(i+j) or 2×, we group the two clusters.
3.4 Geometry Refinement
After grouping the point clusters, we generate centerlines using the robust Lowess
algorithm. However, intersections of centerlines are not represented explicitly. Figure 7 (a)
shows the typical centerlines of an inferred road map. We observe that centerlines are not
intersected with each other, but they are topologically connected. Therefore, we use the map
matching result as a constraint to refine the geometry of the centerlines so that they are
consistent with their topological relationship.
Figure 7 Intersections refinement of inferred road maps
First, we simplify all centerlines by Douglas-Peuker (DP) algorithm (1973). The threshold
is set to 3 meters, which is much smaller than the assumed road width. It ensures that the
simplified road centerline remains on the road. Figure 7 (b) shows the simplified centerlines of
30
Figure 7 (a), most points of the centerlines are removed by DP algorithm.
Then, for two topologically connected centerlines, we compute the intersection as follows.
1) If the two centerlines are intersected with each other, illustrated by intersection 1 of Figure 7
(b), we compute the intersection. 2) If one of the centerline can be extended to intersect with the
other, e.g., intersections 2 and 4 of Figure 7 (b), we extend the centerline and compute the
intersection. 3) If the two centerlines can be extended to intersect with each other, such as the
intersection 3 of Figure 7 (b), we compute the intersection by extending both of the two
centerlines.
Figure 7 (c) shows the intersections computed by the described procedure. Finally, all the
dead end line segments (either of its two end points are intersections) having a length less than
2× are removed.Figure 7 (d) shows the refined road centerlines.
4 Experiments
We use two datasets to test our algorithm. The first one is the GPS traces collected in
Chicago and the second is the GPS traces covering parts of San Francisco city. The Chicago
dataset is GPS traces from buses serving the University of Illinois at Chicago campus (Biagioni
and Eriksson, 2012b). The dataset covers an area of 3.8 km 2.4 km. There are 889 traces and
about 118,000 points. The total length of the traces is 2,671 km. We find that the distances from
some of the GPS points to the road centerline are more than 100 meters. The San Francisco
dataset is GPS traces downloaded from OpenStreetMap (2013). The traces are uploaded by
people around the world. The dataset covers an area of 3.3 km 3.5 km. There are 161 traces
31
and about 261,000 points. The total length of the traces is 16,507 km. The accuracy of the traces
obtained from OpenStreetMap cannot be guaranteed and there is no meta data about the
accuracy. We then preprocess the GPS traces by distances and difference of orientations
between points before our experiments.
Plate 2 (a) shows the raw GPS traces of the Chicago dataset displayed as points. We apply
our progressive segmentation algorithm to the dataset. The list is set to {20, 10, 5} by trial and
error method. Plate 2 (b) shows the clusters generated with different . Black points are the
union of clusters generated with =20, blue points are clusters generated with =10 and red
points are clusters generated with =5. We can observe that most of the points are segmented
with =20 due to the dense distribution of the GPS points. Plate 2 (c) shows the obtained 94
clusters. Different colors represent different clusters. We then group clusters that are
topologically connected according to their geometric position. Plate 2 (d) shows the point
cluster grouping results, and forty-three clusters are obtained.
32
Plate 2 (a) Raw GPS traces of the University of Illinois displayed as points; (b) point clusters
generated by the progressive segmentation algorithm, different colors represent clusters
generated with different ; (c) point clusters generated by the progressive segmentation
algorithm, different colors represent different clusters; and (d) point cluster after grouping.
As a final step, we use the robust Lowess algorithm to generate centerlines from the
grouped point clusters. The inferred road map (black lines) is overlapped with the reference map
(gray lines) as shown in Figure 8. We then apply our geometry refinement method to process the
intersection of the interred road map and obtain the final result shown in Figure 9. The close-up
shows that the intersections of topologically connected centerlines are computed properly.
Figure 8 The generated road map (black) overlapped with the reference map (gray) of Chicago
33
Figure 9 The refined road map (black) overlapped with the reference map (gray) of Chicago
We then repeat the experiment with the same parameters on dataset 2. Figure 10 shows the
raw GPS traces of San Francisco displayed as points. Figure 11 shows the final result of the
inferred road map. According to Figure 11, most of the straight line segments and the centerlines
in high density area (this indicates that the roads are frequently visited by vehicles) are
generated by our algorithm.
34
Figure 10 Raw GPS traces of the San Francisco displayed as points.
Figure 11 The inferred road map of San Francisco.
We evaluate the representative algorithms such as Edelkamp and Schrödl (2003), Davies et
al. (2006), Cao and Krumm (2009) using the San Francisco dataset. The results are shown in
35
Figure 12. Figure 12 (a) shows the roads generated by Edelkamp and Schrödl’s method. Due to
the noises of GPS traces, the intersections of the road map cannot be generated properly. Figure
12 (b) shows the road map generated by Davies’s method. Some roads are missing in the
inferred road map due to the uneven distribution of GPS traces. Their method also cannot deal
with the complex intersections. Figure 12 (c) shows the road map inferred by Cao and Krumm’s
method. The result shows that their method cannot handle the disparity problem of GPS traces.
Comparing with the result shown in Figure 11, the proposed method achieves the best
performance among these four methods.
Figure 12 Road networks generated by: (a) Edelkamp and Schrödl’s method; (b) Davies’
method and (c) Cao and Krumm’s method from San Francisco dataset.
36
Thus far, we qualitatively evaluated the proposed map inference algorithm using the
reference map and other related algorithms. Next, we present a quantitative evaluation for the
geometric accuracy and the topological accuracy of our map inference algorithm. We use the
method proposed by Liu et al. (2012) to evaluate the geometric accuracy and the method
proposed by Biagioni and Eriksson (2012a) to measure the topological accuracy. In general,
they match the inferred road map with the reference map by different matching distance
thresholds, and use F-measure to quantitatively evaluate the inferred road map. Assume that tp
is the matched proportion of inferred map and reference map, ||Truth|| is the road length on
reference map, ||M|| is the road length on inferred road map. F-measure = (2 precision
recall)/(precision+recall), where recall=tp/||Truth||, precision=tp/||M||.
The details of the geometric evaluation method are described as follows:
1. Sample points on the roads of the reference map and inferred maps using equal
intervals (i.e., 1 m).
2. Compute the orientations of all the points by their connections (modulo 180°).
3. Match the sample points on the inferred map with the reference map by nearest
distance matching under the constraint of the orientation. The difference of orientations
of two points matched with each other is no more than 60°.
4. Use different distance thresholds to determine the best match proportion.
The details of the topological evaluation method are described as follows:
1. Sample points on the roads of the reference and inferred maps using equal intervals
37
(i.e., 1 m).
2. For each intersection in the reference map, find its topologically connected points
within a maximum radius (i.e., 50 m).
3. Find the nearest point of the intersection in the inferred road map as a starting point and
repeat step 2 to generate a topologically connected point set.
4. Match the two sets of points generated in step 2 and step 3 to count the matched and
unmatched holes and marbles.
5. Repeat from step 2 to step 4 until all intersections in reference map are visited.
The reference map (gray lines) and inferred map (black lines) of the Chicago dataset are
overlapped, as shown in Figure 13. Figure 13 (a) presents the road map generated by Biagioni
and Eriksson’s algorithm (2012b). We manually select the roads of the reference map that are
actually visited by vehicles according to the raw GPS traces. Each road is represented by one
centerline, since their algorithm does not separate the centerlines with opposite directions for a
bidirectional road. Figure 13 (b) is the road map generated by our method. We generate two
centerlines for the opposite directions of a bidirectional road. The corresponding roads of the
reference map also have two centerlines.
38
Figure 13 (a) Road map (black lines) constructed using Biagioni and Eriksson’s method and the
roads that actually visited by vehicles (gray lines); (b) road map (black lines) generated by our
proposed method and the roads that actually visited by vehicles (gray lines).
First, we evaluate the geometric accuracy. The total length of the roads in the reference
map in Figure 13 (a) is 32.89 km, and the total length of the roads inferred by Biagioni and
Eriksson’s method is 24.50 km. The matching results with different matching thresholds of
distance are shown in Table 1. We apply the same evaluation procedure to our inferred road map.
The results are shown in Table 2. The total length of the roads in the reference map in Figure 13
(b) is 42.15 km, and the total length of the inferred roads of our method is 35.53 km.
Bidirectional roads in Figure 13 (b) are represented by two centerlines. In Table 1 and Table 2,
the ‘length of well-matched roads’ is the total length of roads in the reference map (or the
inferred map) that are matched with roads in the inferred map (or the reference map), given the
matching threshold of distance.
39
Table 1.Matching results of Biagioni and Eriksson’s method on the Chicago dataset
Matching Threshold
of Distance (m)
Length of
Well Matched Roads (km)
Recall Precision F-measure
5 15.79 0.48 0.645 0.55
10 20.22 0.615 0.826 0.705
15 22.71 0.690 0.927 0.791
20 23.57 0.717 0.962 0.821
50 24.01 0.730 0.980 0.837
Table 2.Matching results of our method on the Chicago dataset
Matching Threshold
of Distance (m)
Length of
Well Matched Roads (km)
Recall Precision F-measure
5 30.93 0.734 0.870 0.796
10 33.89 0.804 0.954 0.873
15 34.21 0.812 0.963 0.881
20 34.28 0.813 0.965 0.883
50 34.3 0.814 0.965 0.883
Figure 14 presents the plots of the recall, precision and F-measure of the two methods.
Figure 14 (a) shows Biagioni and Eriksson’s method (2012b). Their values increase as the
matching threshold increases from 5 m to 20 m and become constant after the threshold reaches
20 m. Figure 14 (b) demonstrates that the values of these three criteria do not change much with
40
our method after reaching the matching threshold of 10 m. The values of the three criteria of our
method are larger than the values of Biagioni and Eriksson’s method when the matching
threshold is smaller than 20 m. These results indicate that the road map inferred by our method
has better accuracy in terms of the geometry. In addition, using Chicago dataset, Biagioni and
Eriksson (2012b) demonstrated that, in terms of geometry, their method outperformed the
typical existing map inference methods, i.e., clustering based methods (Edelkamp and Schrödl,
2003), trace merging based methods (Cao and Krumm, 2009) and KDE based methods (Davies
et al., 2006).
Figure 14 Recall, precision and F-measure for geometric evaluation of (a) Biagioni and
Eriksson’s method, and (b) our method for Chicago dataset.
Then, we evaluate the topological accuracy. Figure 15 presents the plots of the recall,
precision and F-measure of Biagioni and Eriksson’s method (Figure 15 (a)) and our method
(Figure 15 (b)). Compared to Figure 15 (b), the values of the three criteria in Figure 15 (a)
increase faster as the matching threshold increase. The fact indicates that the road map inferred
by our method has a better topological accuracy under a small matching distance threshold. The
precision of the inferred road map generated by our method is better than the map generated by
0.40.50.60.70.80.91
0 5 10 15 20 25 30 35 40 45 50 55
Matching distance threshold (m)
Recall Precision F-measure(a)
0.40.50.60.70.80.91
0 5 10 15 20 25 30 35 40 45 50 55
Matching distance threshold (m)
Recall Precision F-measure(b)
41
Biagioni and Eriksson’s method; however, the recall is worse after matching distance threshold
is greater than 10 meters. The reason is that we generated the intersections according to the
HMM based map matching result. The intersection of two geometrically intersected roads
cannot be inferred if no vehicles transit from one road to the other via the intersection. The
number of intersections in the reference map of Figure 13 (b) is 117. We only generate 76 of
them according to the HMM based map matching result.
Figure 15 Recall, precision and F-measure for topological evaluation of (a) Biagioni and
Eriksson’s method, and (b) our method for Chicago dataset.
5 Conclusion
In this research, we propose a point segmentation and grouping algorithm for road map
inference from GPS traces. We assume that all road segments can be fitted by a set of line
segments. We then partition all points into clusters to represent the centerlines of roads in a
progressive manner. We use the robust Lowess algorithm to generate centerlines from
unorganized point clouds. We also employ HMM based map matching algorithm to construct
the topological relationship of road centerlines and filter road segments that generated by GPS
noises. Then we group road centerlines according to the topological relationship and spatial
0.5
0.60.7
0.8
0.9
1
0 5 10 15 20 25 30
Matching distance threshold (m)
Recall Precesion F-measure(a)
0.5
0.60.7
0.8
0.9
1
0 5 10 15 20 25 30
Matching distance threshold (m)
Recall Precesion F-measure(b)
42
proximity to remove the redundancy. Finally, we perform a geometry refinement on the road
network to ensure that the geometric relationship of road centerlines is consistent with its
topological relationship. Our method performs better for the road network that consists of
straight line segments than the road network that includes winding roads. According to the
experimental and evaluation results, we conclude that our method has high performance in
terms of the geometric and topological accuracy.
References
Agamennoni, G., J. I. Nieto and E. M. Nebot. 2010. "Technical Report: Inference of Principal
Road Paths Using GPS Data". The University of Sydney, Australian Centre for Field
Robotics.
Agamennoni, G., J. I. Nieto and E. M. Nebot. 2011. "Robust Inference of Principal Road Paths
for Intelligent Transportation Systems". IEEE Transactions onIntelligent Transportation
Systems, 12 (1): 298-308.
Ahmed, M., B. T. Fasy, K. S. Hickmann, et al. 2013. "Path-Based Distance for Street Map
Comparison".arXiv preprintarXiv:1309.6131.
Ahmed, M., S. Karagiorgou, D. Pfose, et al. 2014. "A Comparison and Evaluation of Map
Construction Algorithms".arXiv preprint arXiv:1402.5138.
Biagioni, J. and J. Eriksson. 2012a. "Inferring Road Maps from Global Positioning System
Traces: Survey and Comparative Evaluation". Transportation Research Record: Journal of
the Transportation Research Board, 2291 (1): 61-71.
43
Biagioni, J. and J. Eriksson. 2012b. "Map Inference in the Face of Noise and Disparity". In
Proceedings of the20th International Conference on Advances in Geographic Information
Systems, (New York, USA: ACM), pp. 79-88.
Cao, L. and J. Krumm. 2009. "From GPS Traces to a Routable Road Map". In Proceedings of
the17th International Conference on Advances in Geographic Information Systems, (New
York, USA: ACM), pp. 3-12.
Chen, C. and Y. Cheng. 2008. "Roads Digital Map Generation with Multi-track GPS Data". In
2008 International Workshop on Education Technology and Training & 2008 International
Workshop on Geoscience and Remote Sensing (IEEE Computer Society), pp. 508-511.
Chen, D., L. J. Guibas, J. Hershberger, et al. 2010. "Road Network Reconstruction for
Organizing Paths". In Proceedings of the twenty-first annual ACM-SIAM symposium on
Discrete Algorithms. Societyfor Industrial and Applied Mathematics, (Philadelphia, USA),
pp. 1309-1320.
Cleveland, W. S. 1979. "Robust Locally Weighted Regression and Smoothing Scatterplots".
Journal of the American Statistical Association, 74 (368): 829-836.
Davies, J. J., A. R. Beresford and A. Hopper. 2006. "Scalable, Distributed, Real-Time Map
Generation". IEEE Pervasive Computing, 5 (4): 47-54.
Doucette, P., P. Agouris and A. Stefanidis. 2004. "Automated Road Extraction from High
Resolution Multispectral Imagery". Photogrammetric Engineering & Remote Sensing, 70
(12): 1405-1416.
44
Edelkamp, S. and S. Schrödl, 2003. "Route Planning and Map Inference with Global
Positioning Traces". In Computer Science in Perspective, (Springer, Berlin Heidelberg), pp.
128-151.
Ester, M., H. Kriegel, J. Sande, et al. 1996. "A Density-Based Algorithm for Discovering
Clusters in Large Spatial Databases with Noise". In KDD, pp. 226-231.
Fathi, A. and J. Krumm. 2010. "Detecting Road Intersections from GPS Traces". In Geographic
Information Science, (Springer, Berlin Heidelberg), pp. 56-69.
Guo, T., K. Iwamura and M. Koga. 2007. "Towards High Accuracy Road Maps Generation
from Massive GPS Traces Data".2007 IEEE Geoscience and Remote Sensing Symposium,
pp. 667-670.
Ioannou, Y., B. Taati, R. Harrap and M. Greenspan. 2012. "Difference of Normals as a
Multi-Scale Operator in Unorganized Point Clouds". In 2012 Second International
Conference on 3D Imaging, Modeling, Processing, Visuallization and Transmission
(3DIMPVT) (IEEE).
Jiang, B., S. Zhao and J. Yin. 2008. "Self-organized Natural Roads for Predicting Traffic Flow:
ASensitivity Study". Journal of Statistical Mechanics: Theory and Experiment, 2008 (07):
1-27.
Karagiorgou, S. and D. Pfoser, 2012, "On Vehicle Tracking Data-Based Road Network
Generation". InProceedings of the20th International Conference on Advances in
Geographic Information Systems, (New York, USA: ACM), pp. 89-98.
45
Kégl, B., A. Krzyzak, T. Linder, et al. 2000. "Learning and Design of Principal Curves". IEEE
Transactions on Pattern Analysis and Machine Intelligence, 22 (3): 281-297.
Krumm, J., J. Letchner and E. Horvitz. 2007. "Map Matching with Travel Time Constraints". In
Society of Automotive Engineers (SAE) World Congress.
Krumm, J., N. Davies and C. Narayanaswami. 2008. "User-Generated Content". IEEE
Pervasive Computing, 7 (4): 10-11.
Kasemsuppakorn P. and H. A. Karimi 2013. "Pedestrian Network Extraction from Fused Aerial
Imagery (Orthoimages) and Laser Imagery (Lidar)". Photogrammetric Engineering &
Remote Sensing, 79 (4): 369-379.
Li, J., Q. Qin, J. Han, et al. 2015. "Mining Trajectory Data and Geotagged Data in Social Media
for Road Map Inference". Transactions in GIS, 19 (1): 1-18.
Liu, X., J. Biagioni, J. Eriksson, et al. 2012. "Mining Large-Scale, Sparse GPS Traces for Map
Inference: Comparision of Approaches". InProceedings of the18th ACM SIGKDD
international conference on Knowledge discovery and data mining, (New York, USA:
ACM), pp. 669-677.
Lou, Y., C. Zhang, Y. Zheng, et al. 2009. "Map-matching for Low-sampling-rate GPS
Trajectories". InProceedings of the17th ACMSIGSPATIAL International Conference on
Advancesin Geographic Information Systems, (New York, USA: ACM), pp. 352-361.
46
Newson, P. and J. Krumm. 2009. "Hidden Markov Map Matching Through Noise and
Saprseness". In Proceedings of the 17th ACM SIGSPATIAL International Conference on
Advances in Geographic Information Systems, pp. 336-343.
Niehoefer, B., R. Burda, C. Wietfeld, et al. 2009. "GPS Community Map Generation for
Enhanced Routing Methods Based on Trace-Collection by Mobile Phones".The First
International Conference on Advances in Satellite and Space Communications2009.
SPACOMM 2009, pp. 156-161.
OpenStreetMap Foundation: Bulk GPS track data, 2013.
https://blog.openstreetmap.org/2013/04/12/bulk-gpx-track-data/
Qiu, J., R. Wang and X. Wang. 2014. "Inferring Road Maps from Sparsely-Sampled GPS
Traces".InAdvances in Artificial Intelligence, Springer, pp. 339-344.
Schrödl, S., K. Wagstaff, S. Rogers, et al. 2004. "Mining GPS Traces for Map Refinement".
Data Mining And Knowledge Discovery, 9: 59-87.
Shi, W., S. Shen and Y. Liu. 2009. Automatic Generation of Road Network Map from Massive
GPS Vehicle Trajectories. In Proceedings of the 12th International IEEE Conference on
Intelligent Transportation Systems (St. Louis, MO, USA), pp. 28-53.
Thomson, R. C. and R. Brooks. 2002. Exploiting Perceptual Grouping for Map Analysis,
Understanding and Generalization: The Case of Road and River Networks. In Graphics
Recognition Algorithms and Applications, (Springer, Berlin Heidelberg), pp. 148-157.
47
Thiagarajan, A., L. Ravindranath, K. LaCurts, et al. 2009. "VTrack: Accurate, Energy-aware
Road Traffic Delay Estimation Using Mobile Phones". In Proceedings of the 7th ACM
Conference onEmbedded Networked Sensor Systems, (New York, USA: ACM), pp. 85-98.
Wang, J., X. Rui, X. Song, et al. 2015. "A Novel Approach for Generating Routable Road Maps
from Vehicle GPS Traces". International Journal of Geographical Information Science, 29:
69-91.
Wang, J., Z. Yu, W. Zhang, et al. 2014. "Robust Reconstruction of 2D Curves from Scattered
Noisy Point Data". Computer-Aided Design, 50: 27-40.
Wang, Y., X. Liu, H. Wei, et al. 2013. "CrowdAtlas: Self-Updating Maps for Cloud and
Personal Use". In Proceeding of the 11th annual international conference on Mobile
systems, applications, and services, ACM, pp. 27-40.
Youn J., J. S. Bethel, et al. 2008, "Extracting Urban Road Networks from High-resolution True
Orthoimage and Lidar". Photogrammetric Engineering & Remote Sensing, 74 (2): 227-237.