Estimating destination flows

Embed Size (px)

Citation preview

  • 8/3/2019 Estimating destination flows

    1/9

    36 PERVASIVEcomputing Published by the IEEE CS n 1536-1268/11/$26.00 2011 IEEE

    L a r g e - S c a L e O p p O r t u n i S t i c S e n S i n g

    esimig Oigi-

    Dsiio Flowsusig Mobil phoLoio D

    Origin-destination (OD) matri-

    ces represent one o the most

    important sources o inor-

    mation used in the strategic

    planning and management o

    transportation networks. A precise calculationo OD matrices is an essential component in

    enabling administrative authorities to optimize

    the use o their transportation networks, not

    only to benet users on their

    daily journeys, but also to

    plan investments required to

    adapt these inrastructures

    to envisaged uture needs. Tra-

    ditionally, urban planning and

    transportation engineering

    use household questionnaires

    or census and road surveys todevelop methodologies or OD matrix estima-

    tion. This approach has two main drawbacks:

    calculating an OD matrix, rom the initial

    data gathering to the exploitation o the rst

    results, can take years and produce only a

    snapshot o the travel demand; and

    the collected data has shortcomings in terms

    o both spatial and temporal scale.

    Sensor-based OD estimation methods use

    street sensors such as loop detectors and video

    cameras together with trac-assignment mod-

    els. Analogous methods have been developed

    using probe vehicles, in which vehicle traces

    serve as data sources.1,2 Those methods are,

    however, limited by the act that models are

    oten underdetermined because the number oparameters to be estimated is typically larger

    than the number o monitored network links.3

    On the other hand, the wide deployment o

    pervasive computing devices (mobile phone,

    smart cards, GPS devices, digital cameras, and

    so on) provides unprecedented digital ootprints,

    telling where people are and when theyre there.

    Earlier projects have used dierent methodolo-

    gies or detecting the presence and movement o

    crowds through their digital ootprints (Flickr

    photo, mobile phone logs, smart card records,

    and taxi/bus GPS traces).4 6 This ne-grainedanalysis could dramatically increase our un-

    derstanding o the use o space and daily com-

    muting fows or urban mobility planning and

    management. Thus, its no surprise that the idea

    o using mobile phones to monitor trac con-

    ditions isnt new, as we discuss in the Related

    Work in Analyzing Trac Flow sidebar.

    Although the results rom these other studies

    show great potential or using cellular probe tra-

    jectory inormation to estimate travel demand,

    all methods must overcome several shortcomings

    beore they can be put into practice. Indeed, as

    Using an algorithm to analyze opportunistically collected mobile

    phone location data, the authors estimate weekday and weekend travel

    patterns of a large metropolitan area with high accuracy.

    Francesco Calabrese

    and Giusy Di Lorenzo

    IBM Dublin Research Laboratory

    Liang Liu and Carlo Ratti

    Massachusetts Instituteof Technology

  • 8/3/2019 Estimating destination flows

    2/9

    OCTOBERDECEMBER 2011 PERVASIVEcomputing 37

    Yi Zhang and his colleagues note,7 eld

    tests are needed or the ollowing reasons:

    real coverage areas o cell phone tow-

    ers are quite dierent rom simulated

    ones, and vary rom urban to rural

    areas;

    validation o methods to determine

    a trips origin and destination shouldbe perormed using real individual

    mobility data;

    real mobility and calling patterns

    should be included in the analysis

    because they crucially infuence the

    methods perormance;

    existing OD matrices should be used

    as ground truth to veriy the correct-

    ness o the estimated results.

    Our methodology uses opportunis-

    tically collected mobile phone location

    data to estimate dynamic OD matrices.

    We address concerns using a real mo-

    bility and calling dataset rom 1 million

    mobile phone users. We use the Boston

    metropolitan area as a case study and

    validate our methodology using census

    survey data or both county and census-

    tract levels.8 To our knowledge, both

    the methodology developed and thedata precision and amount are unique.

    Mobil pho DsThe considered dataset consists o

    anonymous location measurements

    generated each time a device connects

    to the cellular network, including:

    when a call is placed or received (both

    at the beginning and end o a call);

    when a short message is sent or

    received;

    when the user connects to the Inter-

    net (or example, to browse the Web,

    or through email programs that peri-

    odically check the mail server).

    In this article, we reer to these events

    as network connections. The events

    represent a superset o those in the call

    detail records used elsewhere.9,10

    We analyzed 829 million mobile

    location data or 1 million devices

    collected by AirSage (www.airsage.

    com). This data included not only the

    ID o the cell tower the mobile phone

    was connected to, but also an estima-

    tion o its position within the cell,

    which is generated through triangu-

    lation by AirSages Wireless Signal

    Extraction technology. Each loca-

    tion measurement miM is charac-

    terized by a position pmi expressed

    S

    everal related studies on analyzing trac fow have been

    published in recent years.Raaele Bolla and Franco Davoli presented a model or

    estimating trac using an algorithm that calculates trac para-

    meters on the basis o mobile phone location data.1 Researchers

    in Rome developed a case study or real-time urban monitor-

    ing using aggregated mobile phone data to monitor trac and

    movement o vehicles and pedestrians.2 Randal Cayord and

    Tigran Johnson analyzed the main parameters to be consid-

    ered, namely precision, metering requency, and the number

    o localizations necessary to achieve accurate trac descrip-

    tions.3 Several companies worldwide, including ITIS Holdings

    (Britain), Delcan (Canada), CellInt (Israel), and AirSage and

    IntelliOne (USA), have begun developing commercial applica-tions o mobile phone-based trac monitoring.

    With the speciic goal o measuring OD lows, dierent

    mobile phone signaling datasets have been considered

    and simulated to evaluate the easibility o estimating trips.

    Initial work used billing data, consisting o cell phone

    tower inormation every time a phone received or made a

    call.4 Other research has used mobile phone positions every

    two hours to iner trips,5 location updates to iner mobile

    phone movement,6 and cell phone tower handover inorma-

    tion.7 A recent eort estimated the daily OD demand using

    simulated cellular probe trajectory inormation (extracted

    rom location updates, handover, and transition o timing

    advance values) and tested the methodology via the VisSim

    simulation.8

    REfEREnCES

    1. R. Bolla and F. Davoli, Road Trac Estimation rom Location Track-

    ing Data in the Mobile Cellular Network, Proc. IEEE Wireless Comm.

    and Networking Conf., vol. 3, IEEE Press, 2000, pp. 11071112.

    2. F. Calabrese et al., Real-Time Urban Monitoring Using Cell Phones:

    A Case Study in Rome, IEEE Trans. Intelligent Transportation Systems,

    vol. 12, no. 1, 2011, pp. 141151.

    3. R. Cayord and T. Johnson, Operational Parameters Aecting Use o

    Anonymous Cellphone Tracking or Generating Trac Inormation,

    Proc. Transportation Research Board Ann. Meeting, 2003.

    4. J. White and I. Wells, Extracting Origin Destination Inormationrom Mobile Phone Data, International Conerence on Road Trans-

    portation and Control, IEE, 2002, pp. 3034.

    5. C. Pan et al., Cellular-Based Data-Extracting Method or Trip Distri-

    bution,J. Transportation Research Board, vol. 1945, 2006, pp. 3339.

    6. N. Caceres, J. Wideberg, and F. Benitez, Deriving Origin Destination

    Data rom a Mobile Phone Network, Intelligent Transport Systems,

    vol. 1, no. 1, 2007, pp. 15 26.

    7. K. Sohn and D. Kim, Dynamic Origin-Dest ination Flow Estimation

    Using Cellular Communication System, IEEE Trans. Vehicular Technol-

    ogy, vol. 57, no. 5, 2008, pp. 2703 2713.

    8. Y. Zhang et al., Daily O-D Matrix Estimation Using Cellular Probe

    Data, Proc. Transportation Research Board Ann. Meeting, 2010.

    rld Wok i alyzig tffi Flow

  • 8/3/2019 Estimating destination flows

    3/9

    38 PERVASIVEcomputing www.computer.org/pervasive

    Large-ScaLe OppOrtuniStic SenSing

    in latitude and longitude and atimestamp t

    mi.

    To iner trips rom these measure-

    ments, we rst characterized the indi-

    vidual calling activity and veried that

    its requent enough to allow monitor-

    ing the users movement over time with

    a ne enough resolution. For each user,

    we measured the interevent timethat

    is, the time interval between two con-

    secutive network connections (similar

    to what Marta Gonzlez and her col-

    leagues measured10). The average inter-event time measured or the entire pop-

    ulation was 260 minutes, much lower

    than Gonzlez and her colleagues mea-

    surement (500 minutes) because were

    also considering mobile Internet con-

    nections. Because the distribution o

    interevent times or a user spans several

    temporal scales, we urther character-

    ized each calling activity distribution

    by its rst and third quantile and the

    median. Figure 1 shows the distribu-

    tion o the rst and third quantile and

    the median or all available users intothe dataset. The arithmetic average

    o the medians is 84 minutes (the geo-

    metric average o the medians is 10.3 min-

    utes) with results small enough to de-

    tect changes o location where the user

    stops or as little as 1.5 hours.

    Mobile-phone-derived location data

    has lower resolution than GPS data.

    Internal and independent testing sug-

    gest an average uncertainty radius o

    320 meters, and a median o 220 me-

    ters. Moreover, at some peak usage pe-riods, additional location errors can be

    introduced when users are automati-

    cally transerred by the network rom

    the closest cellular tower to one thats

    urther away but less heavily loaded.

    Oigi-Dsiioesimio MhodThe procedure or estimating dynamic

    OD matrices consists o two steps: trip

    determination and origin-destination

    estimation.

    To alleviate the eects o localization

    errors and event-driven location measure-

    ments on individual trip determination,

    we apply a low-pass lter with a 10-minute

    resampling rate to the raw data. This ol-lows an approach tested with data rom

    Rome, Italy.11 In addition, because ewer

    localization errors might still generate

    ctitious trips, we adapt a preprocess-

    ing step used to analyze GPS traces. This

    step uses clustering to identiy minor

    oscillations around a common location.

    The approach used to handle loca-

    tion errors and identiy meaningul

    locations in a users travel history can

    be understood as ollows:

    We begin with a measurement series

    Ms={mq,mq+1,,mz}Mz-q-1,q>z,

    derived rom a series o network con-

    nections over a certain time interval

    T t tm mz q= > 0.

    We dene an area with radius S

    (in this case, 1 km to account or the

    localization errors estimated by Air-

    Sage), such that

    max ,

    , .

    distance ( ) < Sp p

    i j z

    m mi j

    qAll consecutive points pjMs or

    which this condition holds can be

    used together such that the centroid

    becomes a virtual location

    p z q ps mi q

    i z

    i=

    =

    =

    ( ) 1 (the centroido the points),

    that is a trips origin or destination.

    Once the virtual locations are

    detected, we can evaluate the stops

    (virtual locations) and trips as pathsbetween users positions at conse-

    cutive virtual locations. Each trip

    trip(u, o, d, t) is characterized by

    user ID u, origin location o, desti-

    nation location d, and starting time t.

    Once trips are extracted, we use the

    ollowing procedure to derive OD fows:

    The geographical area under analy-

    sis is divided into regions: regioni,

    i= 1,, n.

    Figure 1. Caracterizatio o idividua caig activity or te etire popuatio,

    i terms o time betee to etor coectio evets. Graps so te

    distributios o te media (soid ie), frst quatie (das-dotted ie), ad tird

    quatie (dased ie) o idividua iterevet time.

    102 101 100 101 102 103 104 1050

    0.01

    0.02

    0.03

    0.04

    0.05

    0.06

    0.07

    0.08

    0.09

    0.10

    Interevent time (minutes)

    Distribution

    First quantile Median Third quantile

  • 8/3/2019 Estimating destination flows

    4/9

    OCTOBERDECEMBER 2011 PERVASIVEcomputing 39

    Origin and destination regions, to-

    gether with starting time, are ex-

    tracted or each trip o each user

    trip(u, o, d, t).

    Trips with the same origin and desti-

    nation regions are grouped together

    at dierent temporal windows tw,or example, weekly, daily, and

    hourly:

    m i j tw

    trip u o d to region d regioni j

    ( ), ,

    ( , , , )

    , ,

    =

    t tw

    .

    The result is a 3D matrix M3 whose

    element m(i,j, tw) represents the num-

    ber o trips rom origin region i to desti-

    nation regionj starting within the time

    window tw.

    cs Sdy d comisoBased on the area covered by the mo-

    bile phone locations dataset, we ana-

    lyzed movement among areas in eight

    counties in eastern Massachusetts

    (Middlesex, Suolk, Essex, Worcester,

    Norolk, Bristol, Plymouth, and Barns-

    table) with an approximate population

    o 5.5 million people. To simpliy the

    analysis, we randomly selected 25 per-

    cent o the available users and extracted

    traces or them.

    ti chizio

    As a rst analysis, we studied the trip-

    length distribution (see Figure 2a), show-

    ing that trips range rom 1 to 300 km.

    We determined the trip length x by cal-

    culating the Euclidean distances be-

    tween the trips origins and destinations.The distribution is well approximated

    by P(x) = (x + 14.6) -0.78exp(-x/60)

    with R2= 0.98, which conrms previ-

    ous ndings.10 The slightly dierent

    coecients ound in this case could be

    because o the dierent build environ-

    ments in Europe and the US.12 To check

    the plausibility o our segmentation o

    the trajectory in trips, we compute the

    same statistics computed on the number

    o individual trips per day. Figure 2b

    shows the distribution over the entirepopulation, separating weekday and

    weekend trips. We obtain an average

    o 5 trips per day during the week,

    and 4.5 on weekends. This number is

    reasonable when compared to the US

    National Household Travel Survey (see

    http://nhts.ornl.gov), which evaluated

    this number to be between 4.18 during

    the week and 3.86 on weekends. (Rea-

    sons or these dierences include seve-

    ral years o dierence between when

    the two datasets were collected, and the

    act that NHTS is based on a sample

    over the entire US population, so not

    ocused on the behavior o people in the

    Boston metropolitan area.)

    To evaluate whether we have sam-

    pling biases in our data, we computed

    the home location distribution esti-mated rom the mobile phone data, and

    compared it with data rom the US 2000

    Census. To detect a home location, we

    rst group together geographic regions

    that are close in space, creating a grid in

    space where the side o every cell is 500

    meters long. For each cell, we evaluate

    the number o nights the user connects

    to the network in the nighttime interval

    while in that cell, and select as a home

    location the cell with the greatest value.

    (We used 6 p.m. to 8 a.m. as the night-time interval based on statistics rom

    the American Time Use Survey; www.

    bls.gov/tus/charts/work.htm.)

    To validate the home location distri-

    bution, we compared it with population

    data rom the US 2000 Census, at the

    census-tract level.13 Within the eight

    selected counties are 1,171 distinct

    census tracts, with populations rang-

    ing rom 70 to 12,000 people (on aver-

    age 4,705), and an area ranging rom

    0.08 to 203 km2 (on average 10.8 km2).

    0 0.5 1.0 1.5 2.0 2.5

    104

    103

    102

    101

    100

    Trip length (log10 km)

    Distribution

    1 0.5 0 0.5 1.0 1.5 2.0

    106

    105

    104

    103

    102

    101

    100

    Number of trips (log10)

    Distribution

    (a) (b)

    Weekday

    Weekend

    Figure 2. Statistics o te trips detected rom te mobie poe ocatio data: (a) trip-egt distributio (curve iterpoated

    it P(x) = (x+ 14.6)-0.78exp(-x/60) it R2 = 0.98), ad (b) umber o idividua trips per day distributio.

  • 8/3/2019 Estimating destination flows

    5/9

    40 PERVASIVEcomputing www.computer.org/pervasive

    Large-ScaLe OppOrtuniStic SenSing

    The census tract population estimated

    using cell phone users home locations

    scales linearly with the census popula-

    tion, as Figure 3 shows, corresponding

    to an average 4.3 percent o the popula-

    tion being monitored.

    OD Flows chizio

    To validate the accuracy o the OD

    matrices produced using the mobile

    phone traces, we used the most re-

    cent tracttract worker fows dataset

    rom the Census Transportation Plan-

    ning Package.8 CTPP is a tabulation

    o responses rom households com-

    pleting the census long orm. It is the

    only census product that summarizes

    data by place o work and tabulates

    the fow o workers between home and

    work.

    The tracttract worker fows data

    shows the number o workers in each

    tract o work by tract o residence.

    Workers are dened as people 16 years

    old and older who were employed andat work, ull or part time, during the

    census reerence week (generally the

    last week o March). The data contains

    the number o workers in the fow who

    were allocated to tract, place, and

    county o work.

    Given the two levels o granular-

    ity (tract and county) available in the

    CTPP dataset, we computed our OD

    estimates at two levels o aggregation.

    Because commuting fow generally ac-

    counts or two trips (home to workand work to home), we considered un-

    directed fows between two locations to

    compare our OD estimations. For each

    granularity, we computed the average

    daily number o trips:

    m i j

    K

    daysm i j tw m j i tw

    All

    All

    tw day

    ( , )

    #( ( , , ) ( , , ))= +

    =

    ,,

    ,..., ,...,i n j i= = 1 1 1

    where KAll is a scaling actor we use tocompare our estimation with the census

    estimations.

    Moreover, because according to the

    denition, the census dataset includes

    only commuting trips, we evaluated the

    average daily number o trips made only

    on weekday mornings (6 to 10 a.m.)

    rom the estimated home to estimated

    work location (estimated as the most

    requent stop area on weekday morn-

    ings between 8 and 10 a.m.):

    m i j

    K

    weekdaysm i j tw m j i tw

    WM

    WM

    tw wm

    ( , )

    #( ( , , ) ( , , ))= +

    =

    = =

    ,

    ,..., ,...,i n j i1 1 1

    where KWM is a scaling actor.

    Finally, we also compared our pre-

    dictions with the well-known and

    widely used gravity model:14

    m i j

    K

    P P

    d i n j i

    Gravity

    G

    i j

    i j

    ( , )

    , ,..., ,...,,

    =

    = = 2 1 1 1,,

    Figure 3. US 2000 cesus tract popuatio desity ad ce poe users estimated

    ome ocatios desity. (a) Cesus tract popuatio desity derived rom ce poe

    users estimated ome ocatios. (b) To compare tese to desities, e mutipied

    ce poe-based popuatio desity by 100/4.3 to accout or te percetage o te

    popuatio beig moitored. Error bars represet te stadard error.

    Cell phone user density (person/sq mi)

    0.000000000122.6

    122.7287.3

    287.4521.8

    521.9847.9

    848.01222

    12231653

    16542158

    21592888

    28894353

    43547267

    (a)

    5 4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.07

    6

    5

    4

    3

    2

    1

    0

    Estimatedtractpopulationdensity(log10)

    Census tract population density (log10)(b)

  • 8/3/2019 Estimating destination flows

    6/9

    OCTOBERDECEMBER 2011 PERVASIVEcomputing 41

    where KG is a scaling actor, and di,j is

    the Euclidean distance (in kilometers)

    between the centroids o the regions.

    Figure 4a shows the county-level re-

    sults. The plots correspond to models

    that minimize the least square errors,

    using KAll= 16.9 or the predictionmade with the average number o trips

    in a day mAll, KWM= 71.4 or the pre-

    diction made with the average number

    o trips on weekday mornings mWM ,

    and KG= 58.4 or the gravity model

    mGravity.

    Correlations show encouraging re-

    sults, with R2= 0.59 or the gravity

    model, R2= 0.73 or the prediction

    made with all trips, and the best result

    R2= 0.76 or predictions made with

    weekday morning trips only. The re-sulting high correlation shows that the

    estimated OD matrices resemble OD

    matrices generated using completely

    dierent inormation.

    Using the best model mWM, we com-

    pared our results with the tract-level

    census data. At this level, noise is more

    evident (see Figure 4b), but we can still

    see on average a good linear relation-

    ship between census estimation and

    our estimationR2= 0.36 in this case,

    which is high compared to the R2= 0.10

    o the gravity model. We also evaluated

    more sophisticated gravity-like models

    by optimizing the dexponent and sub-

    stituting the populations with the total

    estimated number o trips outgoing or

    incoming or an area, but still obtained

    R2< 0.3.The relatively low value oR2 com-

    pared to the county-level analysis is

    partially because the relationship seems

    less linear when the census estimates

    ewer than 10 trips rom tract to tract.

    This might be because census fows are

    estimated rom a subsample, which

    might result in small numbers or par-

    ticular pairs o census tracts. Moreover,

    census estimates werent available or

    the same year as the mobile phone data,

    and trip origins and destinations mighthave slightly changed (at this high level

    o spatial detail) between the two moni-

    tored periods.

    We note that the scaling actor KWM

    used or the last model mWM corre-

    sponds to a share o monitored trips

    thats about 1.4 percent compared

    to the census estimations. This ac-

    tor can be explained by the percent-

    age o mobile phones selected (about

    4.3 percent) and by the calling activity,

    which isnt very high in the morning.

    Other elements, such as the act that were

    monitoring not only commuting fows,

    might explain the remaining dierence.

    Estimating KWM lets us extrapolate the

    ODs computed using the mobile phone

    data to the whole population.

    nw poilWe see a great deal o potential or this

    approach. We should note, though,

    that OD fows data estimated through

    census surveys have the ollowing

    limitations:8

    The decennial census monitors

    usual days to avoid local or re-

    gional anomalies, such as transit

    strikes or severe weather, on a single

    sampling day. This tends to hide theless common uses, such as individu-

    als telecommuting once every two

    weeks or carpooling once a week be-

    cause o ever-changing lie and work

    patterns.

    According to the denition, the cen-

    sus dataset doesnt include nonwork

    trips, and modelers must develop

    relationships between work and non-

    work trips.

    The census data is based on a

    xed-point snapshot approach, so

    Figure 4. Compariso betee mobie poe ad cesus origi-destiatio (OD) estimates. Error bars so oe stadard

    deviatio rom te average: (a) couty eve soig a trips, eeday morig trips, ad te gravity mode; ad (b) tract eve

    soig eeday morig trips.

    0 1 2 3 4 5 60

    1

    2

    3

    4

    5

    6

    Census number of trips (log10)

    Estimatednumberoftrips(lo

    g10)

    All trips

    Weekday mornings trips

    Gravity model

    (a) (b)

    0.5 0 0.5 1.0 1.5 2.0 2.5 3.0 3.50

    0.5

    1.0

    1.5

    2.0

    2.5

    3.0

    3.5

    4.0

    Census number of trips (log10)

    Estimatednumberoftrips(lo

    g10)

  • 8/3/2019 Estimating destination flows

    7/9

    42 PERVASIVEcomputing www.computer.org/pervasive

    Large-ScaLe OppOrtuniStic SenSing

    transportation planners can only

    interpret data over geographic space,

    rather than over time.

    Compared with traditional census

    data, however, our methodology to de-

    tect OD matrices rom mobile phonetraces has several advantages:

    It can capture the weekday and

    weekend patterns as well as seasonal

    variations.

    It can capture work and nonwork

    trips, which is essential or trip chain-

    ing and activity-based modeling.

    Thus, these matrices could then com-

    plement traditionally generated OD

    matrices, providing a ine-grainedspatiotemporal pattern o mobility.

    tmol alysis

    Whereas the census gives only static

    inormation about OD fows, the OD

    matrices derived rom mobile phone

    data let us appreciate the dierences

    in travel demand over time. Figure 5a

    shows the total daily travel demand or

    three weeks in October 2009. A weekly

    pattern clearly appears in the travel de-

    mand, with the minimum on weekends

    (especially Sundays) and the maximum

    on Fridays. Moreover, Figure 5a shows

    a change in travel demand on the sec-

    ond Monday (day 9 in the gure), corre-

    sponding to Columbus Day. For a better

    look at this pattern, we plot the hourly

    travel demand or Columbus Day com-pared to other Mondays (see Figure 5b).

    We clearly see a higher travel demand

    in the rst two hours o the day, ol-

    lowed by lower demand rom hour 4 to

    hour 9, and rom hour 12 to hour 20,

    because o the holiday.

    Siomol alysis

    Our methodology can capture ine-

    grained OD matrices in both spatial

    and temporal scale, essential data or

    understanding transport demand andtransport modeling, especially during

    special events. For example, Figure 6

    compares the incoming fows toward

    Fenway Park, Bostons baseball sta-

    dium. We compare two days: Sunday,

    11 October, when the Boston Red Sox

    played the Los Angeles Angels in a

    postseason game, and a Sunday with

    no special events. As the gure shows,

    we can capture the increasing incom-

    ing low caused by the event, both

    in terms o new trip origins, and in

    fow volume. Further studies with the

    same dataset have also shown regular

    spatial patterns o attendee origins based

    on event type, inormation that would

    be valuable or event management.5

    As this study shows, perva-sive datasets such as mo-

    bile phone traces provide

    rich inormation to support

    transportation planning and operation.

    Meanwhile, some related limitations

    should also be addressed when apply-

    ing these datasets in mobility analysis.

    The localization error, or example,

    limits the minimum size o the regions

    that can be considered.

    Other elements that can aect the

    statistical results include:

    the market share o the mobile phone

    operator rom which the dataset is

    obtained,

    the potential nonrandomness o the

    mobile phone users (or example,

    teenagers),

    calling plans, which can limit the

    number o samples acquired at each

    hour or day, and

    the number o devices that each per-

    son carries.

    Figure 5. Tempora variatio i te umber o trips detected rom te mobie poe ocatio data: (a) umber o trips per day,

    over a tree-ee iterva; ad (b) umber o trips per our, over tree dieret Modays.

    1.5

    1.6

    1.7

    1.8

    1.9

    2.0

    2.110

    6

    Numberoftrips

    S

    un

    M

    on

    T

    ue

    W

    ed

    T

    hu

    Fri

    Sat

    S

    un

    M

    on

    T

    ue

    W

    ed

    T

    hu

    Fri

    Sat

    S

    un

    M

    on

    T

    ue

    W

    ed

    T

    hu

    Fri

    Sat

    ColumbusDay

    (a) (b)

    0 5 10 15 20

    0

    5

    10

    15104

    Hour

    Numberoftrips

    Columbus Day

    Other Mondays

  • 8/3/2019 Estimating destination flows

    8/9

    OCTOBERDECEMBER 2011 PERVASIVEcomputing 43

    Moreover, because the considered data-

    set is event driven (location measure-

    ments available only when the device

    makes network connections), users

    connection patterns can aect the

    possibility o capturing more or ewer

    trips. This last limitation could be

    solved by continuous location readings

    rom GPS devices, which would require

    the users consent. A hybrid approach

    could integrate both event-driven and

    continuous location measurements,

    as the current method can be easily

    generalized to dierent datasets with

    dierent spatiotemporal resolutions.

    Nonetheless, the analysis perormed

    on the interevent time, the spatial dis-

    tribution o mobile phone users, and

    comparisons with census estimations

    conrm that the mobile phone data

    represent a reasonable proxy or human

    mobility.

    Future work will involve reproducing

    the analysis or other cities to understand

    which parameters infuence the scaling

    actors to be used to extrapolate the

    ODs computed using the mobile phone

    data to the whole population. The re-

    search output will give transportation

    planners an automatic and systematic

    way to understand the dynamics o

    daily mobility in a real complex metro-

    politan area.

    francesco Calabrese is an advisory research

    sta member at the IBM Dublin Research Labo-

    ratory and a research aliate at the SENSEable

    City Laboratory o the Massachusetts Institute

    o Technology. His research interests include

    ubiquitous computing, intelligent transportation

    systems, urban network analysis, and distributed

    control systems design. Calabrese has a PhD in

    computer and systems engineering rom the Uni-

    versity o Naples Federico II, Italy. Hes a member

    o IEEE and the IEEE Control Systems Society.Contact him at [email protected].

    Giusy Di Lorenzo is a research scientist at the

    IBM Dublin Research Laboratory and a research

    aliate at the SENSEable City Laboratory o the

    Massachusetts Institute o Technology. Her re-

    search explores the analysis o urban dynamics

    and human mobility using data gathered rom

    sensor networks. In particular, shes interested

    in developing context-aware applications to

    improve urban living experiences o citizens.

    Di Lorenzo has a PhD in computer and systems

    engineering rom the University o Naples

    Federico II. Contact her at [email protected].

    Liang Liu is the deputy director o the Creative In-

    dustries Park in Shanghai and Changzhou. He was a

    postdoctoral associate at the SENSEable City Labo-

    ratory o the Massachusetts Institute o Technol-

    ogy. Liu has a PhD in urban planning and manage-

    ment rom Tongji University, China. Contact him at

    [email protected].

    Carlo Ratti is an architect, engineer, and associate

    proessor o practice at the Massachusetts Institute

    o Technology, and director o MITs SENSEable City

    Laboratory. Ratti has an MSc in civil structural engi-

    neering rom both the Polytechnic University o Turin

    and the School o International Management (ENPC)

    in Paris, and a PhD in architecture rom the Univer-

    sity o Cambridge. He is a member o the Order o

    Engineers in Turin and the Architects Registration

    Board (UK). Contact him at [email protected].

    the AUThORS

    (b)(a)

    Figure 6. Icomig trips i te Feay Par area. Eac ie coects a detected trip origi area to te Feay Par: (a) orma

    Suday, ad (b) day o a Red Sox game. Fo voume is represeted by te ies ticess.

  • 8/3/2019 Estimating destination flows

    9/9

    44 PERVASIVEcomputing www.computer.org/pervasive

    Large-ScaLe OppOrtuniStic SenSing

    ACknOwlEDGMEnTS

    We thank Moshe E. Ben-Akiva or his eedback

    and AirSage or providing the mobile phone loca-

    tion dataset.

    REFEREnCES

    1. X. Zhou and H.S . Ma hmassan i,Dynamic Origin-Destination DemandEstimation Using Automatic Vehicle Iden-tication Data, IEEE Trans. IntelligentTransportation Systems, vol. 7, no. 1,2006, pp. 105114.

    2. S. Baek et al., Method or EstimatingPopulation OD Matrix Based on ProbeVehicles, Korean Soc. Civil Engineers(KSCE) J. Civil Eng., vol. 14, no. 2, 2010,pp. 231235.

    3. M.L. Hazelton, Some Comments onOrigin-Destination Matrix Estimation,Transportation Research Part A: Policy andPractice, vol. 37, no. 10, 2003, pp. 811822.

    4. F. Girardin et al., Digital Footprinting:Uncovering Tourists with User-Generated

    Content, IEEE Pervasive Computing,vol. 7, no. 4, 2008, pp. 36 43.

    5. F. Calabrese et al., The Geography oTaste: Analyzing Cell-Phone Mobil-ity and Social Events, Proc. Intl Conf.Pervasive Computing, Springer, 2010,pp. 2237.

    6. D. Quercia et al., Mobile Phones and Out-door Advertising: Measurable Advertis-ing, IEEE Pervasive Computing, vol. 10,no. 2, 2011, pp. 2836.

    7. Y. Zhang et al., Daily O-D MatrixEstimation Using Cellular Probe Data,Transportation Research Board Ann.Meeting, 2010.

    8. CTPP Census Transportation Plan-ning Package, US Department o

    Transportation, Census Transporta-tion Planning Products, 2010; www.hwa.dot.gov/planning/census_issues/ctpp/.

    9. J. White and I. Wells, Extracting OriginDestination Inormation rom MobilePhone Data, International Conerence

    on Road Transportation and Control,IEE, 2002, pp. 3034.

    10. M. Gonzalez, C. Hidalgo, and A.-L. Bara-basi, Understanding Individual HumanMobility Patterns, Nature, vol. 453,no. 7196, 2008, pp. 779782.

    11. F. Calabrese et al., Real-Time UrbanMonitoring Using Cell Phones: A CaseStudy in Rome, IEEE Trans. IntelligentTransportation Systems, vol. 12, no. 1,2011, pp. 141151.

    12. L. Liu et al., The Law o InhabitantTravel Distance Distribution, Proc.European Conf. Complex Systems, 2009.

    13. MassGIS, Census 2000 Tracts Data-layer Description, 2010; www.mass.gov/mgis/cen2000_tracts.htm.

    14. J. Anderson, A Theoretical Foundationor the Gravity Equation, Am. EconomicRev., vol. 69, no. 1, 1979, pp. 106116.

    Selected CS articles and columns

    are also available or ree at

    http://ComputingNow.computer.org.

    Advertising Personnel

    Marian Anderson: Sr. Advertising CoordinatorEmail: [email protected]: +1 714 816 2139 | Fax: +1 714 821 4010

    Sandy Brown: Sr. Business Development Mgr.Email [email protected]: +1 714 816 2144 | Fax: +1 714 821 4010

    Advertising Sales Representatives (display)

    Western US/Pacifc/Far East:

    Eric KincaidEmail: [email protected]: +1 214 673 3742Fax: +1 888 886 8599

    Eastern US/Europe/Middle East:Ann & David SchisslerEmail: [email protected],[email protected]: +1 508 394 4026Fax: +1 508 394 4926

    Advertising Sales Representatives (Classifed Line)

    Greg Barbash

    Email: [email protected]: +1 914 944 0940Fax: +1 508 394 4926

    Advertising Sales Representatives (Jobs Board)

    Greg BarbashEmail: [email protected]: +1 914 944 0940Fax: +1 508 394 4926

    AdvertiSer informAtion october/december 2011