

Scalable Transfer Patterns

Hannah Bast∗ Matthias Hertel∗ Sabine Storandt∗

Abstract

We consider the problem of Pareto-optimal route planning in public-transit networks of a whole country, a whole continent, or even the whole world. On such large networks, existing approaches suffer from either a very large space consumption, a very long preprocessing time, or slow query processing. Transfer Patterns, a state-of-the-art technique for route planning in transit networks, achieves excellent query times, but the space consumption is large and the preprocessing time is huge. In this paper, we introduce a new scheme for the Transfer Pattern precomputation and query graph construction that reduces both the necessary preprocessing time and space consumption by an order of magnitude and more. Average query times are below 1 ms for local queries, independent of the size of the network, around 30 ms for non-local queries on the complete transit network of Germany, and an estimated 200 ms for a fictitious transit network covering the currently available data of the whole world.

1 Introduction

Route planning on real-world public-transit networks is a complex problem. When evaluating algorithms for this problem, a variety of criteria are important in practice: preprocessing time, space consumption of the precomputed auxiliary data, query processing time and space, the ability for multi-criteria optimization (travel times, number of transfers, prices, etc.), the ability for realistic modeling (transfer buffers, foot paths), and the ability to deal with delays and other real-time information. There exist a variety of approaches, which hit different Pareto points among these criteria; see

∗Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany, [email protected], [email protected], [email protected].

[2] for an overview. One such approach is Transfer Patterns [1, 3], which is excellent for all of the criteria above except two: the space consumption of the precomputed auxiliary data is large and the preprocessing time is huge.

Another important aspect mostly neglected so far is scalability. Most experimental evaluations of existing approaches are done on metropolitan areas (like London [9, 18] or Madrid [4]), and some on whole countries (like Germany [16, 18] or Sweden [6, 5]). The largest of these networks is Germany, with an average of 15 million connections per day. In comparison, the number of connections on an average day for the (currently available) public-transit data of the whole world is estimated to be 320 million [15]. A few papers also consider networks of whole continents. For example, [12] consider Europe, however only long-distance transport, which accounts for only a tiny fraction of all transport (1.7 million connections per day). In the original Transfer Patterns publication [1], a North America network was investigated (57 million connections). However, the reported preprocessing time is around 3,000 core hours (4 months on a single core), with suboptimal results for an (albeit very small) fraction of the queries. For the other route planning approaches, experimental results for networks of this size are not available.

We first provide an overview of existing route planning approaches and discuss their potential applicability to continental-size networks. We will see that for most approaches, the extrapolated query times or space consumption are too high to be of practical use. For Transfer Patterns, query times were shown to scale well (about 6 ms on New York, about 10 ms for North America [1]), so there is great potential for interactive query times even when considering the network of the whole world.

In this paper, we introduce a new scheme for Transfer Patterns precomputation and query graph construction that allows interactive query times for huge transit networks with manageable preprocessing times and space consumption.

1.1 Scalability of other routing approaches We are interested in the setting where all Pareto-optimal routes regarding travel time and number of transfers are computed in a query – which is the most common objective function when it comes to route planning in public-transit networks. A route is Pareto-optimal if there exists no other route for the same departure time with smaller travel time and fewer transfers. We discuss the following approaches with regard to the ability of Pareto-optimal route computation and scalability: Dijkstra's algorithm on a time-expanded graph (TED), Dijkstra's algorithm on a time-dependent graph (TDD), round-based public-transit routing (RAPTOR) [9], the Connection Scan algorithm (CSA) [10] and its accelerated version (ACSA) [16], trip-based routing (TB) [18], and approaches based on labeling, namely public-transit labeling (PTL) [6] and time-table labeling (TTL) [17].

Time-expanded and time-dependent graphs are basic models for converting the timetable information of a public-transit network into a graph that enables routing via Dijkstra's algorithm. The TED approach was used in the original Transfer Pattern paper for preprocessing. The disadvantage of the time-expanded model is the relatively large graph size. For example, for the North America instance with about 338,000 stations, the time-expanded graph contains over 100 million nodes and about 450 million edges. The query times for both TED and TDD exceed 100 milliseconds for Pareto-optimal routes (regarding travel time and number of transfers), already on metropolitan-sized networks. For whole countries or even continents, query times are on the order of several seconds and more.

RAPTOR takes advantage of the fact that optimal connections typically require only a few transfers. RAPTOR computes Pareto-optimal connections in order of increasing number of transfers by operating in rounds. The number of rounds is bounded by the maximum number of transfers occurring in an optimal connection from start to destination. For cross-country queries this number can easily be five or more. Since RAPTOR considers all stations reachable within this maximum number of transfers, it is likely to scan large portions of the complete network for such queries. RAPTOR achieves good query times for metropolitan areas, but query times are over one second already on Germany.

CSA stores all elementary connections in a single array, sorted by departure time. For a given query, the connections are scanned starting from the given source station and departure time until the algorithm can be sure that all optimal connections to the given target station are found. The number of scanned connections is usually large already for small networks, yet this algorithm is fast due to its ideal data locality. On London, CSA outperforms RAPTOR. Nevertheless, CSA is expected to scale even worse than RAPTOR. When considering, e.g., the whole of Europe, CSA would scan connections in Italy, Greece and Norway even when the query is between stations in Lisbon. ACSA was developed to reduce the number of such unnecessary scans by partitioning the transit network (using METIS [13]) and restricting scan operations to connections in/between relevant partitions in a query. Unfortunately, ACSA was only evaluated for earliest arrival time queries (using the number of transfers only as a tie breaker, not as an additional optimization criterion). Minimizing only travel time is a significantly easier problem, and only one solution per departure time is produced (while for Pareto-optimal queries typically several routes are returned).

TB is based on a similar idea as RAPTOR and also enumerates optimal routes in increasing order of transfers. The main difference is that possible transfers between trips are pruned in a preprocessing phase in an optimality-preserving way. This allows scanning fewer trips in a query, which results in query times of about 40 ms on Germany. Answering queries on Germany with Transfer Patterns takes 0.3 ms (without hubs) on average, which is two orders of magnitude faster. The gap between the query times is expected to become larger with growing network sizes, because even with transfer pruning, TB has to consider a significant portion of the network in a cross-country or even cross-continental query.

PTL uses hub labeling on a time-expanded graph. This leads to query times on the order of a few microseconds on London and also whole countries such as Sweden. The price to pay is a huge space consumption of the auxiliary data, which is about 26 GB for London alone. Even when neglecting amplification effects, the space consumption for considering, e.g., the whole of Europe would be gigantic. TTL is another indexing technique for transit networks which assigns labels to stations, leading again to query times in the microsecond range. Due to a custom-tailored compression technique, the space consumption of the auxiliary data is only about 1 GB for metropolitan areas. But TTL only computes optimal routes regarding travel time / earliest arrival time. Pareto-optimal solutions cannot be produced. Furthermore, all labeling techniques suffer from not being easily adaptable in case delays have to be considered.

1.2 Other related work While none of the existing approaches for public-transit routing have been evaluated on a whole-world network, web services like Rome2rio¹ offer the computation of routes between arbitrary locations on the globe. However, the goal there is not to come up with the full set of Pareto-optimal solutions, but rather a concise set of possible route options.

Improving the preprocessing time for Transfer Patterns was considered before in the frequency-labeling approach [5]. Frequency-labeling is based on the idea of handling vehicles that depart periodically not individually for every departure time but in a single operation whenever possible. This accelerates the Transfer Pattern precomputation by a factor of 60 on Germany (compared to the original TP paper [1], taking the difference of the used hardware into account) while also requiring less space during preprocessing. However, the large space consumption for the auxiliary data remains unchanged with this approach. And even with the improved preprocessing time, handling networks of continental size seems out of reach without massive parallelization.

¹http://www.rome2rio.com

Our new Transfer Pattern preprocessing scheme is inspired by the Customizable Route Planning framework (CRP) [7], which allows fast shortest-path computation and real-time updates on street networks. CRP starts by partitioning the street network into cells. Then, between all pairs of border nodes of a single cell (with border nodes being adjacent to an edge connecting to another cell), overlay edges are inserted with costs equal to the shortest driving time between the nodes. Queries are answered by a bi-directional Dijkstra computation in this overlay graph. As the overlay edges allow skipping over whole cells, query times are significantly better for CRP than for plain Dijkstra (especially if the partitioning is repeated recursively, resulting in a multi-layer graph). But CRP relies on several characteristics of street networks that public-transit networks are not compliant with. For example, every subpath of a shortest/quickest path is also an optimal path in the street network. This is not true for routes in public-transit networks when considering Pareto-optimal solutions, which increases the problem complexity significantly.

1.3 Contribution We present a new preprocessing scheme for Transfer Patterns that improves preprocessing time and space consumption for the precomputed auxiliary data structures by an order of magnitude and more. Local queries take below 1 ms, and non-local queries can be answered in 30 ms on Germany and an estimated 200 ms on the network of the whole world.

We first cluster the network and then show how these clusters and the natural hierarchy of local and long-distance transport can be exploited to limit the necessary effort for constructing transfer patterns in an optimality-preserving way. We describe several clustering methods and discuss their impact on the complete preprocessing scheme. We evaluate our new scheme on the large (but not huge) transit network of Germany and compare the outcome to the conventional Transfer Pattern scheme. We also extrapolate our results to a (so far fictitious) public-transit network of the whole world.

The deeper reason why our approach works is that very large (country-scale, continental-scale, or even world-scale) transit networks can be decomposed into fairly well-separated sub-networks of bounded size. Each such sub-network can be as large as a whole metropolitan area (e.g., New York and surrounding cities). This works because the state of the art has advanced to such a point that public-transit routing on whole metropolitan areas can be done with very fast query processing time and reasonable preprocessing effort. We can therefore consider whole metropolitan areas as "local" in our approach.

2 Preliminaries

We first introduce some basic notation and describe in detail the conventional Transfer Pattern construction and query processing scheme.

2.1 Transfer Patterns Transfer Patterns (TP) [1] is the approach behind public transportation route planning on Google Maps. The idea of TP is to precompute and compactly store all optimal routes in a time-independent abstraction. This abstraction of an optimal route, the so-called transfer pattern, is the sequence of stations on the route at which a change of vehicle or transportation mode occurs (including the source and the target station).

To construct TP, a profile search is started for each station in the network, computing all Pareto-optimal paths to all other stations in a given time interval. Then all Pareto-optimal paths are backtracked and the transfer stations are identified. For each station, all these patterns are stored in reverse order in a directed acyclic graph (DAG). This reduces the space consumption, as common prefixes of patterns only have to be stored once. Still, if every station can be reached from every other station, the space consumption for all DAGs is (at least) quadratic in the number of stations in the network. For small networks, this is negligible compared to the space consumption for the timetables. But for the whole world, with millions of stations, this would be infeasible. Moreover, profile searches are expensive for large networks, even with algorithms designed for this purpose, like rRaptor [9], frequency-labeling [5] or profile TB [18]. With the latter, profile searches for Germany for a 24 hour range on a single core require about a day. For larger networks, we need more profile searches and each individual profile search is more expensive.
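The two steps above – extracting a transfer pattern from a backtracked path and storing patterns with shared prefixes – can be sketched as follows (a simplified illustration with hypothetical data types; the paper does not prescribe concrete structures, and a trie is used here in place of the more general DAG, which additionally shares suffixes):

```python
def transfer_pattern(legs):
    """legs: a backtracked optimal route as a list of
    (from_station, to_station, trip_id) tuples.
    Returns the source, every station where the trip id changes,
    and the target."""
    pattern = [legs[0][0]]
    for prev, cur in zip(legs, legs[1:]):
        if prev[2] != cur[2]:        # change of vehicle at this station
            pattern.append(cur[0])
    pattern.append(legs[-1][1])
    return pattern

def build_pattern_store(patterns):
    """Store patterns in reverse order in a prefix-sharing trie
    (nested dicts keyed by station), so patterns ending at the same
    target share storage."""
    root = {}
    for p in patterns:
        node = root
        for station in reversed(p):
            node = node.setdefault(station, {})
    return root
```

For example, a route A→B→C on trip 1 followed by C→D on trip 2 yields the pattern [A, C, D]; patterns [A, C, D] and [A, B, D] then share the common reversed prefix D in the store.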

Query processing works as follows. For given start and target stations s and t, a query graph is constructed by extracting all transfer patterns from the DAG for s which lead to t (which is done in reverse, starting from nodes in the DAG referring to t). The query graph is typically very small (on average less than a hundred nodes for queries on Germany). For a given departure time, Pareto-optimal routes can be found with the help of the query graph by running a time-dependent Dijkstra on it. Evaluating edge costs in the query graph amounts to a look-up in a direct connection data structure, which stores all routes that do not require transfers between each pair of stations, sorted by departure time. This enables query times of less than a millisecond on Germany.
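A minimal sketch of this evaluation step, reduced to the single criterion of earliest arrival for brevity (the actual scheme is Pareto over travel time and transfers; the representation of the query graph and direct-connection table is hypothetical):

```python
import bisect
import heapq

def earliest_arrival(query_graph, direct, s, t, dep_time):
    """query_graph: dict station -> successor stations (from the patterns).
    direct: dict (u, v) -> list of (departure, arrival) pairs sorted by
    departure time. Runs a time-dependent Dijkstra on the query graph,
    evaluating each edge by a binary-search look-up in the table."""
    heap = [(dep_time, s)]
    best = {s: dep_time}
    while heap:
        time, u = heapq.heappop(heap)
        if u == t:
            return time
        if time > best.get(u, float("inf")):
            continue                      # stale entry
        for v in query_graph.get(u, ()):
            conns = direct.get((u, v), [])
            i = bisect.bisect_left(conns, (time,))   # first departure >= time
            if i < len(conns):
                arr = conns[i][1]
                if arr < best.get(v, float("inf")):
                    best[v] = arr
                    heapq.heappush(heap, (arr, v))
    return None
```

The look-up per edge is logarithmic in the number of direct connections, which is what keeps evaluation of the small query graph in the sub-millisecond range.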

2.2 Hubs Hubs were introduced to reduce the necessary preprocessing effort and especially the space consumption for TP. Hubs are a subset of "important" stations in the sense that they appear in many Pareto-optimal routes. Profile searches from hubs are done as described above. Profile searches from non-hubs can be stopped when all active labels encode a route over a hub. Every first hub on a Pareto-optimal path belongs to the access station set of the source station. At query time, the DAG for s is used to identify all optimal patterns to t and to the access stations of s. The DAGs of the access stations are then used to get the optimal patterns to the target t.
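The notion of access stations can be sketched directly from this definition (an illustrative helper, assuming the transfer patterns from s are already available as station sequences):

```python
def access_stations(patterns_from_s, hubs):
    """The access stations of s are the first hubs encountered on
    the Pareto-optimal patterns starting at s."""
    access = set()
    for pattern in patterns_from_s:
        for station in pattern[1:]:   # skip the source itself
            if station in hubs:
                access.add(station)
                break                 # only the *first* hub counts
    return access
```

At query time, only the (small) DAGs of these access stations need to be consulted in addition to the DAG of s itself.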

For example, for Switzerland, the space consumption for the DAGs was reduced from over 18 GB to about 830 MB when using hubs [1]. But preprocessing times were only reduced from 635 h to 586 h. To handle even larger networks, some heuristics were introduced in [1], e.g., only considering routes with at most two transfers in the searches from non-hubs. It was shown that the number of suboptimal queries when using the heuristics is very small. Heuristic Transfer Pattern construction for Switzerland only required 61 h. But for North America, even heuristic preprocessing requires over 3,000 h.

3 Making Transfer Patterns Scalable

In the following, we present a multi-phase TP preprocessing scheme which takes several characteristics of transit networks into account to reduce the time and space necessary to compute and store all optimal Transfer Patterns.

3.1 Clustering transit networks The first and basic step in our preprocessing pipeline consists of dividing the stations of the transit network into suitable partitions/clusters. Ideally, we would like every cluster to be convex.

Definition 1. (Convex Transit Cluster) A subset C ⊆ S of stations in a transit network is called a convex transit cluster if for all pairs s, t ∈ C all optimal routes from s to t only traverse stations in C.
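The condition of Definition 1 can be stated as a simple predicate (an illustrative check, assuming the optimal routes are given explicitly; in practice they are the product of the preprocessing, not an input):

```python
def is_convex(cluster, optimal_routes):
    """cluster: set of stations. optimal_routes: dict mapping a pair
    (s, t) to the list of optimal routes, each a station sequence.
    The cluster is convex iff every optimal route between two member
    stations stays entirely inside the cluster."""
    return all(set(route) <= cluster
               for (s, t), routes in optimal_routes.items()
               if s in cluster and t in cluster
               for route in routes)
```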

The complete network is naturally convex. But our goal is to break the network down into smaller clusters. In general, we expect from a good clustering that (1) clusters have moderate size, (2) there are only few connections between different clusters, and (3) each cluster contains at least one station where long-distance vehicles depart.

Many large transit networks are implicitly clustered because of different transport agencies running the vehicles in certain areas. But this data is not always available, and the induced clustering might not meet our requirements (in terms of size, number of connections between clusters, and convexity). Therefore, we will propose methods that work independently of this kind of information. Our clustering is computed based on only the local-transport connections and foot paths, and not on the long-distance connections (like ICE, long-distance buses, etc.). If the given transit data does not provide a distinction between local and long-distance transport, a simple classifier based on the number of stations on the trip, the distance between consecutive stations on the trip and the speed of the vehicle can be used to come up with a heuristic distinction.

Based on all trips specified in the transit data, we construct a graph by creating a vertex for every station and connecting consecutive stations in a trip with edges. So for a trip that traverses k stations, we insert k − 1 edges. Edge weights reflect the frequency of the connection over the period of a year. If several trips would lead to the insertion of the same edge, the edge weights are accumulated. Moreover, we add edges between stations when they are connected via a foot path (≤ 400 m). These receive an artificially high edge weight because foot paths can be used at any time and stations in close proximity of each other should preferably end up in the same cluster. In our experiments, we use the weight 200,000, which corresponds to ≈ 365 · 24 · 20, i.e., the foot path is used every three minutes. This leads to foot paths exhibiting a higher weight than 99.89% of the transit edges.
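The graph construction above can be sketched as follows (a simplified illustration: trips are station sequences, each trip occurrence contributes weight 1, and the foot-path weight is the fixed 200,000 from the text):

```python
from collections import defaultdict

FOOT_PATH_WEIGHT = 200_000   # ≈ 365 · 24 · 20, see text

def build_station_graph(trips, foot_paths):
    """trips: list of station sequences (one entry per trip occurrence).
    foot_paths: list of station pairs within foot-path distance.
    Returns a dict mapping an undirected edge (frozenset of two
    stations) to its accumulated weight."""
    weight = defaultdict(int)
    for trip in trips:
        for u, v in zip(trip, trip[1:]):   # k stations -> k-1 edges
            weight[frozenset((u, v))] += 1
    for u, v in foot_paths:
        weight[frozenset((u, v))] = FOOT_PATH_WEIGHT
    return dict(weight)
```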

In the following, we outline four methods for computing clusters in public-transit networks, namely k-Means, merging, METIS and PUNCH.

3.1.1 k-Means clustering [14] This approach is oblivious of the graph structure and only considers the geo-coordinates of the stations. Here, k is a parameter to be set by the user. For a given k, an initial set of k seed stations is selected (e.g., randomly) and all other stations are assigned to the cluster of their closest seed. k-Means then minimizes the overall distance from the data points to the midpoints of the clusters to which they are assigned. It operates in rounds, where each round starts by selecting the means of the clusters as new seeds, followed by a reassignment of the stations to these seeds. The algorithm stops when the clustering does not change anymore from one round to the next. Note that k-Means does not necessarily compute connected partitions, since it ignores the structure of the network and uses only spatial data to form the clusters.
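A minimal sketch of this round-based procedure (Lloyd's algorithm) on station coordinates; for determinism, the first k stations serve as seeds here, whereas a real run would seed randomly as described above:

```python
def k_means(coords, k, max_rounds=100):
    """coords: list of (x, y) station coordinates. Returns k clusters
    (lists of points). Ignores the network structure entirely."""
    seeds = coords[:k]
    clusters = [[] for _ in range(k)]
    for _ in range(max_rounds):
        clusters = [[] for _ in range(k)]
        for p in coords:   # assign each station to its closest seed
            i = min(range(k),
                    key=lambda i: (p[0] - seeds[i][0]) ** 2
                                + (p[1] - seeds[i][1]) ** 2)
            clusters[i].append(p)
        # new seeds: cluster means (keep old seed for an empty cluster)
        new = [tuple(sum(c) / len(c) for c in zip(*pts)) if pts else seeds[i]
               for i, pts in enumerate(clusters)]
        if new == seeds:   # clustering unchanged -> converged
            break
        seeds = new
    return clusters
```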

3.1.2 Merge-based clustering [11] This hierarchical approach was originally designed to find a partitioning of a road network. In the beginning, every node of a graph forms its own partition of size 1. Neighboring partitions are merged until their size reaches a given upper bound U. The order of the pairs of partitions that are merged is determined by a utility function: in each step, the algorithm selects the pair of neighboring partitions of combined size ≤ U with the maximum value of the utility function. Since the aim of the algorithm is to find a partitioning with moderate partition sizes and small edge weights between the partitions, the utility function punishes big partitions and rewards pairs of partitions with high edge weights between them. We use the following utility function in our experiments:

f(u, v) = 1/s(u) · 1/s(v) · (w(u, v)/√s(u) + w(v, u)/√s(v))

with s(u), s(v) denoting the sizes of the clusters u, v, and w(u, v) the sum of the weights of all edges with one endpoint in u and one endpoint in v. The algorithm stops when no pair of neighboring partitions with combined size ≤ U is left. Instead of an upper bound size U, one can also specify the number k of partitions. Then the algorithm merges partitions until only k partitions are left, without considering the size of the partitions. Just like for k-Means, the right choice of k is crucial for the result.
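The greedy merge loop with the utility function above can be sketched as follows (an unoptimized illustration with the size bound U; the graph is undirected, so w(u, v) = w(v, u)):

```python
def utility(s_u, s_v, w_uv):
    """f(u, v) = 1/s(u) · 1/s(v) · (w(u,v)/√s(u) + w(v,u)/√s(v))."""
    return (1 / (s_u * s_v)) * (w_uv / s_u ** 0.5 + w_uv / s_v ** 0.5)

def merge_clustering(nodes, edges, U):
    """edges: dict mapping frozenset({u, v}) to accumulated weight.
    Repeatedly merges the pair of neighboring partitions with maximum
    utility, as long as their combined size stays <= U."""
    parts = [frozenset([u]) for u in nodes]

    def weight_between(a, b):   # sum of weights of edges crossing a-b
        return sum(w for e, w in edges.items()
                   if len(e & a) == 1 and len(e & b) == 1)

    while True:
        best = None
        for i in range(len(parts)):
            for j in range(i + 1, len(parts)):
                a, b = parts[i], parts[j]
                if len(a) + len(b) > U:
                    continue
                w = weight_between(a, b)
                if w == 0:      # not neighboring
                    continue
                f = utility(len(a), len(b), w)
                if best is None or f > best[0]:
                    best = (f, i, j)
        if best is None:        # no mergeable pair left
            return parts
        _, i, j = best
        merged = parts[i] | parts[j]
        parts = [p for k, p in enumerate(parts) if k not in (i, j)] + [merged]
```

A production version would maintain a priority queue of candidate pairs instead of rescanning all pairs in every step.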

3.1.3 METIS [13] and PUNCH [8] METIS is a general-purpose graph-clustering algorithm that runs in three phases. In the first phase (called coarsening), the size of the graph is reduced. In the second phase, a partitioning of the coarse graph is computed. In the third phase, the partitioning is projected back to the original graph and refined. We use METIS with the -contig parameter, which enforces contiguous partitions. METIS is used in the preprocessing of ACSA [16].

PUNCH (partitioning using natural cut heuristics) was originally developed for road networks and runs in two phases. The first phase (called filtering) shrinks the input graph without changing its natural properties (such as exhibiting small cuts along rivers and mountains). The second phase (called assembly) computes an initial partitioning of the shrunk graph and then revisits parts of the shrunk graph in order to improve the partitioning.

3.1.4 Types of stations in a cluster Let C be the set of clusters from a clustering. For each cluster C ∈ C, we define the set of border stations b(C) as those stations from C that are served by a trip to or from another cluster. That is, s ∈ b(C) iff s ∈ C and there is a trip containing s and s′ ∈ C′ with C′ ≠ C (see Figure 1 for an example). We also define long(C) to be the set of those stations from C where long-distance transport departs or arrives.

Figure 1: Illustration of a cluster (gray area). Dots indicate stations, trips are represented by straight lines. The large red dots indicate the border stations of the cluster. These stations are served by trips that enter/leave the cluster.
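Following this definition literally, the border stations can be computed in one pass over the trips (an illustrative sketch; clusters are sets of stations, trips are station sequences):

```python
def border_stations(cluster, trips):
    """s is a border station of `cluster` iff s is in the cluster and
    some trip contains both s and a station of another cluster."""
    border = set()
    for trip in trips:
        inside = [s for s in trip if s in cluster]
        if inside and any(s not in cluster for s in trip):
            border.update(inside)
    return border
```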

In the following, we will use Cx to refer to thecluster of a station x.

3.1.5 Border station reduction We formulated as goals for our clustering that clusters should have moderate size and that there should only be few connections between different clusters. These two requirements are important as they will determine the preprocessing time and the query time later on. To be more specific, our preprocessing will require computing all Pareto-optimal routes within each cluster. Obviously, the smaller the clusters, the less time is needed per cluster. But we will also need to compute Pareto-optimal routes between clusters. It will turn out that this requires an expensive profile search for each border station of each cluster. Also, for an s-t-query, the number of border stations of Cs and Ct will determine the query graph size. Hence small border station sets are desirable. But this leads to a conflict with our goal of having small clusters. For example, if we aim for clusters with around 1,000 stations each, large metropolitan areas, such as Berlin or London, with more than 10,000 stations are split into several parts. As the public transit in such areas is densely connected, splitting them leads to a huge number of border stations. But there are typically only a few such densely connected areas per country. Hence allowing larger clusters only for such areas will not increase the total time for precomputing intra-cluster solutions too severely, but will decrease the maximum number of border stations per cluster.

Figure 2: Border station reduction by merging two clusters – only 5 out of 27 remain.

We realize this by a post-processing phase in which we merge clusters if the number of border stations decreases dramatically; see Figure 2 for an example.

3.2 Local transfer patterns The next step after the clustering is to compute local transfer patterns, for which we need the optimal connections between all stations inside a cluster. Since we assume that each cluster is of moderate size (at most the size of one metropolitan area), we can use the basic Transfer Patterns approach without hubs here (using, e.g., rRaptor for the profile searches). So for each cluster C, we start a profile query for each station s ∈ C until all Pareto-optimal routes to all other stations in C are known. We do not consider connections with transfer stations that are not in C at this stage.

If the cluster is convex, this already guarantees optimal local queries, that is, between two stations from the same cluster. Determining which clusters are convex will be a by-product of the last step of our framework.

3.3 Long-distance transfer patterns between clusters In the next step, we concentrate on long-distance transport. Typically, the number of stations serving long-distance transport (called long-distance stations in the following) is very small compared to the total number of stations. For example, for the whole network of Germany, less than 3% of all stations are long-distance in this sense. We now want to compute the transfer patterns between all pairs of long-distance stations in different clusters. To this end, we run a profile search from each s ∈ long(C) until all optimal connections to stations in ⋃_{C′ ∈ C∖{C}} long(C′) are known.

During the search, whenever a new cluster C′ is entered, we use the local transfer patterns from the previous step to compute labels for all stations in b(C′) and long(C′). This saves us the consideration of the complete local transport in every cluster, thus speeding up the profile searches significantly.

If considering not only countries but whole continents or the public transport world-wide (including flights), the computation of these long-distance patterns might still be too expensive. Therefore, we use hubs in this phase to limit the search radius, for example large airports and train stations (e.g., Frankfurt Airport).

Having completed this step, we can compute approximately optimal routes between two stations s and t (in different clusters) efficiently as follows. We first look up the local transfer patterns from s to all stations in long(Cs) and from all stations in long(Ct) to t. Then we look up the long-distance transfer patterns from the long-distance stations in long(Cs) to the long-distance stations in long(Ct). Note that in a suitable clustering, neither long(Cs) nor long(Ct) should be empty, i.e., every cluster contains at least one long-distance station. The resulting query graph can be evaluated as usual.
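The lookup just described can be sketched as follows. The pattern stores are plain dicts from station pairs to lists of transfer patterns (station sequences); these data structures and all names are illustrative stand-ins, not the paper's implementation.

```python
def pattern_edges(patterns):
    """Union of consecutive station pairs over a list of transfer patterns."""
    edges = set()
    for pat in patterns:
        edges |= set(zip(pat, pat[1:]))
    return edges

def approximate_query_graph(s, t, long_s, long_t, local_pat, long_pat):
    """Stitch together: local patterns from s to long(Cs), long-distance
    patterns between long(Cs) and long(Ct), and local patterns from
    long(Ct) to t."""
    pats = []
    for p in long_s:
        pats += local_pat.get((s, p), [])
        for q in long_t:
            pats += long_pat.get((p, q), [])
    for q in long_t:
        pats += local_pat.get((q, t), [])
    return pattern_edges(pats)
```

The returned edge set is the query graph, which is then evaluated with a time-dependent Dijkstra as usual.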

3.4 Local connections between clusters  It remains to take care of optimal connections that demand local transport between different clusters. A naive approach would be to run a complete profile search from every station of each cluster, considering local and long-distance transport. But this would take effort comparable to that of the conventional Transfer Patterns approach. We use two modifications in order to reduce the necessary effort, based on the output of the previous steps above:

1. We only consider border stations of clusters instead of all stations as starting points for profile searches.

2. We accelerate each search by considering the (approximate) solutions that can be computed via the precomputed local and long-distance Transfer Patterns.

We now describe in detail how these two modifications help to decrease the preprocessing time.

In the conventional Transfer Pattern computation, the profile searches from each station do a lot of redundant work. For example, consider a station s which only has elementary connections to another station s′. Then every optimal connection from s could be deduced by using the query graph for s′ and augmenting it with the edge (s, s′). A separate profile search starting at s would be unnecessary. Hubs were introduced in the original approach as one method to reduce redundancy: the parts of the optimal routes beyond hub stations are only computed once. We will use the border stations of each cluster like hubs. Obviously, every connection between different clusters Cs and Ct involves visiting at least one border station in Cs and one border station in Ct. Since we already computed the optimal connections between all pairs of stations inside a cluster in Step 2, it remains to compute the optimal connections between the border stations.

A profile search from a border station b ∈ Cb works as follows. We use a time-dependent Dijkstra computation and consider only local transport connections to other border stations. Every time we improve a label at a border station b′ of a cluster Cb′, we try to improve the labels at all other border stations of Cb′ by using the local transfer patterns from Step 2. Each improved label is added to the priority queue (PQ) of our Dijkstra computation. Moreover, we consider the long-distance connections leaving the cluster as follows. We first evaluate the local transfer patterns from b′ to all stations in long(Cb′). Then we use the transfer patterns for each station from long(Cb′) (precomputed in Step 3) to improve the labels of all other (far away) stations q ∈ long(Cq). Since we are only interested in border-station labels, we use the local transfer patterns in Cq to check the optimal routes from q to all border nodes of Cq. Only if a border-station label is improved do we add it to the PQ.
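The core of the accelerated border search can be sketched as follows. This is our own single-criterion simplification: it models both elementary connections and precomputed local transfer patterns as shortcut edges between border stations, and it omits Pareto labels and the long-distance handling; `elem_edges` and `tp_shortcuts` are hypothetical inputs.

```python
import heapq

def border_profile_search(b, elem_edges, tp_shortcuts, borders):
    """Dijkstra that only ever pushes border stations.  `elem_edges[u]`
    holds elementary local connections (v, minutes); `tp_shortcuts[u]`
    holds (v, minutes) pairs distilled from the precomputed local
    transfer patterns of u's cluster."""
    dist = {b: 0}
    heap = [(0, b)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        # relax elementary connections and intra-cluster shortcuts alike;
        # both lead only to border stations, which keeps the PQ small
        for v, t in elem_edges.get(u, []) + tp_shortcuts.get(u, []):
            if v in borders and d + t < dist.get(v, float("inf")):
                dist[v] = d + t
                heapq.heappush(heap, (d + t, v))
    return dist
```

Because non-border stations never enter the queue, the search space per border station stays small even on a large network.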

After the profile run is completed, all optimal connections are backtracked in order to construct the DAG. This includes the integration of local and long-distance transfer patterns (from Steps 2 and 3) which were used in the search. In order to later construct query graphs between border stations more efficiently, we do not construct a DAG for each individual border station of C but rather one DAG per cluster, containing all patterns to a border station of the cluster. Note that a simple merging of the DAGs does not necessarily lead to a DAG again. We take care of potential cycles by keeping multiple DAGs per cluster if necessary.

The described approach limits the search effort that is necessary per border station. Intuitively, a route to a far-away destination almost always involves long-distance transport. This is because using only local transport, the target is either unreachable (think of going from the U.S. East Coast to the West Coast) or the travel time and/or the number of transfers are unbearably high. Using our (approximate) solutions via local and long-distance transfer patterns, we are able to prune non-optimal local connections early. Moreover, we only consider non-border stations in a cluster if they are part of a transfer pattern between two border or long-distance stations. This saves many edge relaxation operations and, since only border stations are pushed into the PQ, also reduces the space consumption of the profile search.

Over the course of this last preprocessing step, we can easily decide whether the clusters produced in the first step are convex. If no new optimal patterns are identified between border stations of a cluster, the local transfer patterns already captured all optimal connections between all stations inside the cluster. In that case, the cluster is convex. Otherwise, it is not.

3.5 Query processing  For a given query, we proceed as follows. We first check whether the source station s and the target station t are in the same cluster. If this is the case and the cluster is marked convex, we only need to evaluate the query graph built from the local transfer patterns between s and t, which can easily be extracted.

If s and t are in different clusters or in the same non-convex cluster, we additionally extract the local query graph from s to all border stations of Cs. We do the same for the border stations of Ct and t. Then we extract the transfer patterns between the border stations of the two clusters, which provides us with the final query graph. If hubs are used, they are incorporated as described in Section 2.2.

Running a time-dependent Dijkstra on the query graph (using the direct-connection data structure to evaluate edge costs) completes the query answering.
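The case distinction of the query procedure can be sketched as follows; the pattern stores are again plain dicts from station pairs to lists of patterns, and all data structures are illustrative stand-ins (hub handling is omitted).

```python
def build_query_patterns(s, t, cluster_of, convex, borders,
                         local_pat, border_pat):
    """Collect the transfer patterns needed to answer an (s, t) query."""
    cs, ct = cluster_of[s], cluster_of[t]
    pats = list(local_pat.get((s, t), []))
    if cs == ct and convex[cs]:
        return pats              # convex cluster: local patterns suffice
    # otherwise route additionally via the border stations of both clusters
    for b in borders[cs]:
        pats += local_pat.get((s, b), [])
        for b2 in borders[ct]:
            pats += border_pat.get((b, b2), [])
    for b2 in borders[ct]:
        pats += local_pat.get((b2, t), [])
    return pats
```

The collected patterns form the query graph on which the time-dependent Dijkstra then runs.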

3.6 Correctness proof  We prove that our approach returns the full set of Pareto-optimal solutions. For that purpose, we first show an infix-optimality property of Pareto-optimal transfer patterns.

Lemma 3.1. If a Pareto-optimal route r from s to t is encoded with the transfer pattern s, q1, …, qr, t, then either every subpattern qi, …, qj also encodes a Pareto-optimal route or it can be replaced by a pattern that encodes a Pareto-optimal solution from qi to qj without changing the costs of the route from s to t.

Proof. Let qi, …, qj be an arbitrary subpattern. Let dep be the departure time at qi in r and arr the arrival time at qj. If qi, …, qj is not Pareto-optimal, there has to exist another route from qi to qj which (1) departs no earlier than dep, (2) has no more than j − i − 1 transfers, and (3) arrives no later at qj than arr. Because of (1) and (3), we can create a new valid route r′ from s to t that uses the alternative route from qi to qj. Because of (1)-(3), r′ can only lead to the same or a better arrival time at t and to fewer or the same number of transfers as r. Therefore, if r′ were better than r in one aspect, r would not have been Pareto-optimal in the first place. Hence we conclude that r and r′ lead to the same costs.

Theorem 3.1. The query answering procedure returns all Pareto-optimal solutions from s to t.

Proof. If s and t are in the same convex cluster (Cs = Ct), the correctness argument from the conventional Transfer Patterns algorithm applies.

Otherwise, let s, q1, q2, …, qr, t be a pattern of a Pareto-optimal solution. Let qi be the first station in the pattern that is not contained in Cs. Then the pattern from s to qi−1 has to be contained in the DAG for s, because it is an optimal local transfer pattern (possibly qi−1 = s). Analogously, let qj be the last station in the pattern that is not contained in Ct. Then the pattern from qj+1 to t has to be contained in the DAG for qj+1 from the local transfer patterns computation (possibly qj+1 = t).

Now qi−1 and qi are in different clusters, and so are qj and qj+1. Hence qi−1 is a border station of Cs and qj+1 is a border station of Ct. Therefore all optimal patterns between qi−1 and qj+1 are encoded in the DAG for the border stations of Cs. According to Lemma 3.1, if s, …, qi−1, …, qj+1, …, t is a pattern encoding a Pareto-optimal solution, either the subpattern from qi−1 to qj+1 has to be optimal as well for some departure time or there exists an alternative pattern that leads to the same Pareto-optimal solution from s to t. Therefore, the extracted query graph (encoding the precomputed Pareto-optimal solutions from s to the border nodes of Cs, from all border nodes of Cs to all border nodes of Ct, and from the border nodes of Ct to t) allows us to recreate the Pareto-optimal solution in question. As this holds for every pattern leading to a Pareto-optimal solution, the time-dependent Dijkstra run on the query graph will return the full set of Pareto-optimal solutions.

We want to emphasize that our definition of border stations (all stations on inter-cluster trips) is crucial for correctness. Every station with a direct connection to another cluster is a border station according to our definition. Figure 3 shows that defining the border stations as those with an elementary connection to another cluster would not suffice, as Pareto-optimal solutions would be missed.

Figure 3: Illustrating the necessity of all stations on inter-cluster trips being border stations. [The figure shows two clusters A and B with connections of 55, 40, and 5 minutes.] If only the nodes marked red were border stations, the only Pareto-optimal route from the border of A to the border of B would be via the green line, taking 40 minutes and no transfers. Combining local patterns with border patterns, the route from the circled node to B would cost 45 minutes and one transfer. But the Pareto-optimal solution of using the blue line directly, taking 1 hour and zero transfers, would be missed. If the circled node is a border station as well, this problem cannot occur.

4 Experimental Results

We implemented the new Transfer Pattern construction and query processing scheme in C++, along with the conventional Transfer Pattern approach with and without hubs. Run times are measured on a single core of an Intel i5-3360M CPU with 2.80 GHz and 16 GB RAM.

4.1 Data sets  The main reason for missing experimental results on public-transit networks of continental size is the unavailability of data. While more and more GTFS feeds become openly available, most of them contain only the data for one metropolitan area. Even when aggregating all available data for a country, coverage is often poor. In particular, data in rural areas and long-distance transport connecting the metropolitan areas is often missing. Exceptions are, for example, the UK, Sweden and the Netherlands, for which feeds for the whole country were published. Moreover, the complete public transit of Germany is described in the HAFAS data provided by Deutsche Bahn. But creating a meaningful feed for the whole of Europe or other continents is still out of reach. Therefore, we will first focus on Germany to show the ability of our new scheme to improve over the conventional Transfer Pattern approach. Then, we extrapolate our results to a (fictitious) transit network of the whole world. Table 1 shows the characteristics of our data sets; the numbers for the whole world are based on [15]. In Table 1 and all the result tables that follow, we mark estimated / extrapolated figures with a ∗.

                     Germany        World
stations             0.25 million   5 million ∗
connections/day      15 million     320 million ∗

Table 1: Characteristics of the public-transit network of Germany (HAFAS data 2015/16) and estimated values for the whole world for a single day.

4.2 Transfer Pattern baseline  We first present results for the conventional Transfer Pattern construction. For Germany, without hubs and using rRaptor for the profile searches, preprocessing takes about 372 hours [5] and produces auxiliary data of a total size of about 140 GB. This corresponds to profile search times of about 5 seconds per station. Using hubs (and accepting a small fraction of suboptimal results), preprocessing times can be reduced to 350 hours and the auxiliary data size to 10 GB. In view of the profile search times reported for Germany when using TB [18], preprocessing times could be reduced to 24 hours.

Considering the whole world, with an estimated number of stations on the order of 5 million, we expect a single run of rRaptor to take at least 100 seconds per station on average. Without hubs, this would result in a preprocessing time of about 140,000 core hours (15 years on a single core) and a space consumption of around 50 TB (if all stations are pairwise reachable). With hubs, assuming that searches from non-hubs visit about 5% of all stations, the space consumption would decrease to about 3 TB, but the preprocessing time would still be around 20,000 core hours (over 2 years on a single core).
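The extrapolation arithmetic for the no-hubs case can be checked back-of-the-envelope; the inputs are exactly the estimates stated above:

```python
stations = 5_000_000        # estimated stations world-wide
secs_per_search = 100       # assumed rRaptor profile time per station

core_hours = stations * secs_per_search / 3600
years = core_hours / (24 * 365)
print(f"{core_hours:,.0f} core hours, about {years:.1f} years on one core")
# → 138,889 core hours, about 15.9 years on one core
```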

4.3 Clustering  We first evaluate how our clustering approaches behave on the public transit network of Germany.

4.3.1 Baseline  The HAFAS data groups the stations of Germany into clusters of stations from 187 distinct sub-agencies. Cluster sizes according to this grouping range from 1 to 33,385 with an average of 1,337 stations and a standard deviation of 3,456. Most clusters exhibit slightly less than a thousand stations. The total cut size (i.e., the sum of all weights of edges with stations in different clusters) is around 96 million.

4.3.2 Comparison of clustering approaches  We computed our own clusterings for the transit network of Germany, using k-Means, merge-based clustering, METIS and PUNCH as described in Section 3.1.

Germany contains about 250,000 stations. Thus, matching the average cluster size of around 850 in the HAFAS data leads to about 300 clusters. Table 2 shows a comparison of all approaches when fixing the number of clusters to 300. k-Means produces good results with respect to the total cut size and the number of border nodes. This is surprising, given the limited amount of information this approach takes as input. Merge-based clustering performs best in terms of cut size and number of border nodes. METIS produces even slightly fewer cut edges than merging, but the cut size is much worse. Using PUNCH also did not result in satisfactory cut sizes. PUNCH was developed to work on unweighted graphs. We tried to incorporate edge weights and used a utility function similar to merging to overcome this. But PUNCH was still outperformed by the other approaches. As PUNCH consists of many sub-algorithms which all could be further tuned, there might be a way to produce better results with PUNCH, though.
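The idea behind merge-based clustering can be sketched as follows. This is a deliberately naive version (our own): it starts from singleton clusters, repeatedly merges the pair of adjacent clusters connected by the heaviest cut as long as the size bound U is respected, recomputes cuts quadratically, and uses the raw cut weight as utility instead of an engineered utility function.

```python
from collections import defaultdict

def merge_clustering(weights, U):
    """Greedy merge-based clustering sketch.  `weights` maps station pairs
    to edge weights (e.g. the number of trips between them)."""
    cluster = {s: s for pair in weights for s in pair}   # station -> cluster id
    members = {s: {s} for s in cluster}
    while True:
        cut = defaultdict(int)                 # inter-cluster cut weights
        for (u, v), w in weights.items():
            a, b = cluster[u], cluster[v]
            if a != b:
                cut[frozenset((a, b))] += w
        candidates = [(w, tuple(sorted(p))) for p, w in cut.items()
                      if len(members[min(p)]) + len(members[max(p)]) <= U]
        if not candidates:
            return list(members.values())
        _, (a, b) = max(candidates)            # heaviest feasible cut
        for s in members[b]:
            cluster[s] = a
        members[a] |= members.pop(b)
```

With U = 2 on a path a-b-c-d whose heavy edges are (a,b) and (c,d), the sketch produces the two clusters {a,b} and {c,d}, cutting only the light edge (b,c).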

We conclude that in our implementation, merge-based clustering leads to the clustering most suitable for our application, since the number of border nodes and the number of inter-cluster trips indicated by the cut size determine the running time of the profile searches in the border transfer pattern computation phase.

Figure 4: Experimental results for computing clusters on Germany with merging. [Plot of the number of clusters, the cut size/100,000, and the avg. #border stations against the maximum cluster size U, for U up to 8,000.]

4.3.3 Varying the cluster size  We also tested different cluster size bounds in the merging approach to evaluate the impact on the cut size and the number of border nodes.

In general, very small clusters/many clusters lead to efficient computability of local transfer patterns. But then the cut size and the total number of border stations are huge; hence the border transfer pattern computation gets expensive. On the other hand, very large clusters/few clusters lead to expensive profile searches in the local transfer pattern construction process. Moreover, large clusters lead to many border stations per cluster, which renders non-local queries (with source and target in different clusters) inefficient. Both extremes, namely every station forming its own cluster and all stations being in the same cluster, yield a scheme that is equivalent to the conventional Transfer Pattern scheme with excellent query times but huge preprocessing effort in terms of space and time. We aim for a clustering which provides us with a good trade-off between preprocessing time, space consumption and query processing.

In Figure 4, merge-based clusterings are analyzed for different values of U (determining the maximal allowed number of stations in a cluster).


                         k-Means          Merging          METIS            PUNCH
avg. cluster size        833              833              833              886
std. dev. cluster size   336.4            632.8            360.5            562.8
cut size                 20.1 · 10^7      3.1 · 10^7       6.5 · 10^7       71.0 · 10^7
cut edges                16,124 (2.9 %)   10,887 (2.0 %)   10,828 (2.0 %)   15,162 (6.1 %)
border nodes             20,212 (8.1 %)   14,669 (5.9 %)   21,089 (8.4 %)   21,234 (8.5 %)

Table 2: Comparison of different clustering approaches on the Germany data set. The number of clusters was set to 300, therefore the average cluster contains 833 stations. For PUNCH, the number of clusters is not an input parameter. Reported numbers refer to a decomposition with 282 clusters.

We observe that doubling U halves the number of clusters and the cut size but also increases the average number of border nodes per cluster. A value of U between 1,000 and 4,000 leads to the best trade-off. We will use the merge-based clustering with U = 1,500 as the basis for the following experiments. There, most clusters contain about 1,100 stations and have about 70-110 border nodes. We will use those numbers as the basis for our later extrapolations.

The maximum number of border stations per cluster is 1,242, though. Hence there are clusters with nearly all stations being border stations. Using our post-processing scheme, we could reduce this number to 986 while creating a few larger clusters with up to 6,109 stations. Clustering methods that take metropolitan areas more explicitly into consideration might help to reduce the maximum occurring number of border stations further. Figure 5 depicts our final clustering.

4.4 Local transfer pattern computation  The next step to evaluate is the local transfer pattern (LTP) computation in each cluster. Using rRaptor, we observed an average profile search time per station in the cluster of about 150 ms (including backtracking of optimal routes and DAG construction). Processing a single cluster then takes about 2.5 minutes on average. For Germany, this results in a total LTP computation time of 12.5 hours. For TB, profile times on London (with 20,800 stations and about 5 million connections) were reported to be 70 ms on average [18]. The average size of our clusters is about 15 times smaller than that of London. But it is unclear if this translates to the same factor of time reduction. Assuming 10 ms per profile query in a cluster, LTP computation would take 10 seconds per cluster and less than an hour in total. Experiments on the Netherlands with an average cluster size of 1,000 stations led to similar results. For Sweden and the UK, the timings were even better, since clusters contained fewer interior trips on average. Considering the whole world, we expect to get about 5,000 clusters with the same characteristics as the clusters in Germany. Then the LTP computation would take about 200 hours with rRaptor, and about 15 hours with TB. Table 3 summarizes these results.

Figure 5: Outcome of our transit network clustering routine.

           avg. per station   avg. per cluster   total Germany   total World
rRaptor    150 ms             2.5 min            12.5 h          200 h ∗
TB         10 ms ∗            10 secs ∗          1 h ∗           15 h ∗

Table 3: Timings for local transfer pattern computation with different profile search algorithms.

Regarding space consumption, the DAGs for a single cluster require 4 MB on average. For Germany, the total space consumption is 760 MB. For the whole world, this extrapolates to around 20 GB.

4.5 Long-distance transfer pattern computation  In Germany, 7,530 stations feature long-distance transport (served, e.g., by ICE/IC or Eurostar trains). Almost all of these stations are also border stations of their respective clusters. Since we only push border stations and long-distance stations into the PQ of the Dijkstra during long-distance pattern construction, the PQ contains 16,000 elements at most.

For Germany, a profile search using TDD (time-dependent Dijkstra) requires 15.2 seconds on average. Our accelerated search, using precomputed LTPs from the previous step and only pushing long-distance and border stations into the PQ, takes only 1.3 seconds on average. Note that we keep the query graphs between border nodes explicitly in memory and do not recompute them from scratch every time a cluster is visited. The total time for the precomputation of the long-distance patterns on Germany is 2.7 hours.

Estimating the number of long-distance stations for the whole world is difficult. Assuming that, as in Germany, about 3% of the stations world-wide feature long-distance transport (presumably a huge overestimation), we get 150,000 such stations. On Germany, evaluating a single cluster takes 3-6 ms per profile search on average. Experiments on the Netherlands give similar results. For the whole world, we expect a single profile search using precomputed LTPs to take around 30 seconds. This results in a total time consumption of about 1,250 hours for this phase. Table 4 provides an overview of the timings for long-distance pattern computation.

           # long-distance stations   time per station   total
Germany    7,530                      1.3 secs           2.7 h
World      150,000 ∗                  30 secs ∗          1,250 h ∗

Table 4: Number of source stations and timings for long-distance transfer pattern computation.

For Germany, the space consumption to store the long-distance patterns is 180 MB. For the whole world, storing all such patterns in the naive way requires around 60 GB. Using hubs, the space consumption is reduced to around 4 GB.

4.6 Border transfer pattern computation  With the merge-based clustering on Germany, about 16,000 stations are border stations. From each of these, we ran profile searches using both local and long-distance patterns. The average profile search time was about 0.6 seconds. For long-distance stations, no new searches are necessary, because we do full searches for them in the previous phase. The total time consumption for this phase is therefore 1.25 hours. For the whole world, assuming the same percentage of border nodes, about 300,000 profile searches are necessary. With an extrapolated running time of 15 seconds per search, and assuming that half of the border stations are long-distance stations, the precomputation time for this phase would be about 600 hours.

Long-distance patterns can be deleted after this phase. The space consumption for the border patterns is about 400 MB for Germany, with some compression coming from the combined DAGs for the border stations of a cluster. For the whole world, border DAGs require about 180 GB without hubs, and about 10 GB with hubs. Table 5 summarizes the results for the complete Transfer Pattern construction pipeline.

                                      Germany                World
                                      time      space        time          space
scalable TP, local patterns           12.5 h    760 MB       200 h ∗       20 GB ∗
scalable TP, long-distance patterns   2.7 h     (180 MB)     1,250 h ∗     (4 GB) ∗
scalable TP, border patterns          1.3 h     400 MB       600 h ∗       10 GB ∗
scalable TP, total                    16.5 h    1,160 MB     2,050 h ∗     30 GB ∗
conventional TP, without hubs         372 h     140 GB       140,000 h ∗   50 TB ∗
conventional TP, with hubs            350 h     2 GB         20,000 h ∗    3 TB ∗

Table 5: Overview of time and space consumption of the different steps of our new scalable Transfer Patterns precomputation. The space consumption for the long-distance patterns is enclosed in parentheses, since they are only required during preprocessing but not for query processing. The last two lines show numbers for the conventional Transfer Patterns approach; note that with hubs, a small fraction of the results may be sub-optimal.

4.7 Query processing  We differentiate between queries with the source and the target station in the same cluster (local queries) and queries with the source and the target station in different clusters (non-local queries). For a local query, it is also relevant whether the cluster is convex or not. Table 6 shows all results.

         local                  non-local
         convex    non-convex   Germany   World
edges    30        380          4,500     15,000 ∗
time     0.1 ms    5 ms         32 ms     200 ms ∗

Table 6: Number of edges in the query graph and runtime for different kinds of queries. Values are averaged over 1,000 runs with randomly selected source and target stations.

For Germany, local queries in convex clusters take 0.1 ms on average, with a query graph containing less than 30 edges on average. For non-convex clusters (92 out of 300), query times are 5.0 ms on average, and query graphs contain around 380 edges on average. Actually, in a non-convex cluster, on average over 90% of all pairs of source/target stations do not demand the addition of connections between border nodes. One could check this property for all station pairs in the preprocessing and then flag the non-convex pairs. The query times for such "convex station pairs" would then be as fast as for station pairs in convex clusters.

For non-local queries, the combined query graphs contain about 3,500 edges on average. Constructing those graphs from scratch for every query takes about 20 ms on average. If the query graphs between clusters are cached, they can be merged with the local patterns from the source to the border stations and from the border stations to the target in 2 ms on average. The time-dependent Dijkstra computation on the final query graph takes 30 ms on average. This results in a total average query time of 50 ms or 32 ms, respectively.

For the whole world, query times are expected to be similar to the reported results for Germany. In particular, note that the time to answer local queries does not depend on how large the network is in total. For non-local queries, the patterns between border stations might contain more transfer nodes. Also, using hubs might introduce an additional chunk of nodes and edges. But even when we pessimistically expect that for each pair of border nodes 3 additional nodes are created in the query graph (on Germany, it is 0.3 nodes per border station pair), the query graph would contain around 15,000 edges on average. The average total query time is then around 200 milliseconds.

5 Conclusions and Future Work

We introduced a public-transit route planning scheme that is based on clustering the network but still produces the full set of Pareto-optimal solutions regarding earliest arrival time and number of transfers. On the transit network of Germany, our new approach reduces the preprocessing time by a factor of more than 20 and the space consumption by a factor of more than 100, with exact query results in all cases. Using other profile search methods than rRaptor, the preprocessing times could be further improved. Our extrapolated values for the whole world predict even more dramatic reductions: down to a very reasonable 2,000 hour precomputation and 30 GB space consumption. We are aware that the structure of transit networks in other parts of the world differs significantly from the structure of the German network. However, we consider Germany a worst-case basis for extrapolation due to the generally high network density, which makes the partitioning into well-separated sub-networks harder. Of course, as soon as larger-scale public-transit data is openly available, experiments should be conducted to check the validity of our extrapolations.

In future work, a clustering approach which is designed to produce (possibly overlapping) convex clusters could accelerate the preprocessing and lead to even better query times for local (intra-cluster) queries. An important open problem is to deal with those (few) clusters with a relatively large number of border nodes. These do not significantly affect average query times, but are critical for the maximum query time. One promising approach could be a more metropolitan-aware clustering that allows some clusters to be significantly larger than others if this helps to reduce the maximal number of border nodes. We also consider it worth investigating which search techniques besides time-dependent Dijkstra searches allow for the incorporation of precomputed transfer patterns into the profile search. This could further reduce the profile-search times in the precomputation of the long-distance and border patterns.

References

[1] H. Bast, E. Carlsson, A. Eigenwillig, R. Geisberger, C. Harrelson, V. Raychev, and F. Viger. Fast routing in very large public transportation networks using transfer patterns. In ESA, pages 290–301. Springer, 2010.

[2] H. Bast, D. Delling, A. Goldberg, M. Müller-Hannemann, T. Pajor, P. Sanders, D. Wagner, and R. F. Werneck. Route planning in transportation networks. arXiv:1504.05140, 2015.

[3] H. Bast, J. Sternisko, and S. Storandt. Delay-robustness of transfer patterns in public transportation route planning. In ATMOS, pages 42–54, 2013.

[4] H. Bast and S. Storandt. Flow-based guidebook routing. In ALENEX, pages 155–165, 2014.

[5] H. Bast and S. Storandt. Frequency-based search for public transit. In SIGSPATIAL, pages 13–22, 2014.

[6] D. Delling, J. Dibbelt, T. Pajor, and R. Werneck. Public transit labeling. In SEA, pages 273–285, 2015.

[7] D. Delling, A. Goldberg, T. Pajor, and R. Werneck. Customizable route planning. In SEA, pages 376–387. Springer, 2011.

[8] D. Delling, A. Goldberg, I. Razenshteyn, and R. Werneck. Graph partitioning with natural cuts. In IPDPS, pages 1135–1146, 2011.

[9] D. Delling, T. Pajor, and R. Werneck. Round-based public transit routing. Transportation Science, 2014.

[10] J. Dibbelt, T. Pajor, B. Strasser, and D. Wagner. Intriguingly simple and fast transit routing. In SEA, pages 43–54, 2013.

[11] I. Flinsenberg, M. Van Der Horst, J. Lukkien, and J. Verriet. Creating graph partitions for fast optimum route planning. WSEAS, 3(3):569–574, 2004.

[12] R. Geisberger. Contraction of timetable networks with realistic transfers. In Experimental Algorithms, pages 71–82. Springer, 2010.

[13] G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. on Scientific Computing, 20(1):359–392, 1998.

[14] J. MacQueen. Some methods for classification and analysis of multivariate observations. In Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297, 1967.

[15] Private communication with a large commercial transit-routing provider, 2015.

[16] B. Strasser and D. Wagner. Connection scan accelerated. In ALENEX, pages 125–137, 2014.

[17] S. Wang, W. Lin, Y. Yang, X. Xiao, and S. Zhou. Efficient route planning on public transportation networks: A labelling approach. In SIGMOD, pages 967–982, 2015.

[18] S. Witt. Trip-based public transit routing. In ESA, pages 1025–1036, 2015.