Chris Andrews
Georgia Institute of Technology
B.S. Computer Science, 5th Year Undergraduate
Trajectory Pattern Mining
Fosca Giannotti
Mirco Nanni
Dino Pedreschi
Fabio Pinelli
Concepts
Analyze trajectories of moving objects
A -> (3 mins) -> B -> (5 mins) -> C -> (10 mins) -> D
Trajectory Patterns: descriptions of frequent behavior relating to space and time
Frequent Sequence Pattern (FSP)
Determine whether a trajectory sequence matches any trajectory pattern in a given set
Study different methods of preparing a Temporally Annotated Sequence (TAS) for data mining
Trajectory Patterns (T-Patterns)
Trajectory: sequence of time-stamped locations
S = { (x0, y0, t0), …, (xn, yn, tn) }
Temporal Annotation: set of transition times between consecutive locations of a trajectory
A = { a1, a2, …, an }
Temporally Annotated Sequence
(S, A) = (x0, y0) a1 (x1, y1) a2 … an (xn, yn)
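The definitions above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `to_tas` is hypothetical, and it assumes each annotation ai is the elapsed time between two consecutive time-stamped locations.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, t): one time-stamped location


def to_tas(trajectory: List[Point]) -> Tuple[List[Tuple[float, float]], List[float]]:
    """Split a trajectory into its locations S and its annotations A.

    Assumption (not stated in the slides): a_i = t_i - t_(i-1),
    i.e. the transition time between consecutive locations.
    """
    locations = [(x, y) for x, y, _ in trajectory]
    annotations = [
        t1 - t0
        for (_, _, t0), (_, _, t1) in zip(trajectory, trajectory[1:])
    ]
    return locations, annotations


# Four locations with transition times of 3, 5 and 10 minutes,
# mirroring the A, B, C, D example from the Concepts slide.
S, A = to_tas([(0, 0, 0), (1, 0, 3), (2, 1, 8), (5, 5, 18)])
```

A trajectory with n + 1 locations yields n annotations, which matches the indexing a1 … an used in the TAS notation.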
Neighborhood Function
N : R^2 -> P(R^2), mapping each point to a set of points (its neighborhood)
Calculates spatial containment of regions
Takes a point as input and finds the enclosing Region of Interest
Defines the proximity necessary to fall into a region
Parameters:
e: radius, i.e. the necessary proximity of points
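One simple way to picture a neighborhood function with radius e is a disc around a point. The sketch below assumes Euclidean distance and a closure-based membership test; both are illustrative choices, and the name `neighborhood` is hypothetical.

```python
import math
from typing import Callable, Tuple

Location = Tuple[float, float]


def neighborhood(center: Location, e: float) -> Callable[[Location], bool]:
    """Return a membership test for the disc of radius e around center.

    Assumption: proximity is measured with Euclidean distance.
    """
    cx, cy = center

    def contains(point: Location) -> bool:
        x, y = point
        return math.hypot(x - cx, y - cy) <= e

    return contains


near_origin = neighborhood((0.0, 0.0), e=1.0)
inside = near_origin((0.5, 0.5))   # within radius 1 of the origin
outside = near_origin((1.0, 1.0))  # distance is about 1.41, so outside
```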
Regions of Interest (RoI)
Performing these comparisons on individual points is costly
A simple preprocessing step can alleviate this
Utilize the Neighborhood Function NR()
Translate each set of points into regions
The timestamp is taken from when the trajectory first entered the region
Now compare sequences of regions and timestamps using the TAS mining algorithm presented in [2].
Static RoI
Neighborhood Function NR()
Initially receives a set R of disjoint spatial regions
The regions in R are predefined based on prior knowledge
Each represents a relevant place for processing
Static NR() simplifies the problem of mining patterns
Sequences of points become grouped
Result: a sequence of regions
(x, y) a1 (x’, y’) becomes X a1 Y
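The translation from points to regions might look like the following sketch. It assumes regions are named, axis-aligned rectangles and that a region's timestamp is the time of the first point that falls inside it; the names `region_of` and `translate` are hypothetical, and points outside every region are simply skipped.

```python
from typing import Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def region_of(x: float, y: float, regions: Dict[str, Rect]) -> Optional[str]:
    """Return the name of the region containing (x, y), or None."""
    for name, (xmin, ymin, xmax, ymax) in regions.items():
        if xmin <= x <= xmax and ymin <= y <= ymax:
            return name
    return None


def translate(trajectory: List[Tuple[float, float, float]],
              regions: Dict[str, Rect]) -> List[Tuple[str, float]]:
    """Turn time-stamped points into (region, entry_time) pairs."""
    result: List[Tuple[str, float]] = []
    current: Optional[str] = None
    for x, y, t in trajectory:
        name = region_of(x, y, regions)
        if name is None:
            current = None  # outside all regions; a later re-entry is recorded again
            continue
        if name != current:
            result.append((name, t))  # timestamp of first entry into the region
            current = name
    return result


regions = {"X": (0, 0, 1, 1), "Y": (5, 5, 6, 6)}
points = [(0.2, 0.3, 0), (0.8, 0.9, 2), (3, 3, 4), (5.5, 5.5, 7), (5.9, 5.1, 9)]
visits = translate(points, regions)
```

Consecutive points inside the same region collapse into a single entry, which is what turns (x, y) a1 (x’, y’) into X a1 Y.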
Dynamic RoI
Data sets often do not possess predetermined regions
Instead, regions must be formulated based on the density of the trajectories
Preprocessing must now determine the set R of popular regions from the data set
R is now the set of Regions of Interest used by the Neighborhood Function NR() to translate points into Regions of Interest
Popular Regions
Grid G of n x m cells
Density threshold d
Each cell (i, j) has density G(i,j)
Set R of popular regions
Each region in R forms a rectangular region
Regions in R are pairwise disjoint
Dense cells are always contained in some region in R
All regions in R have average density above d
No region in R can expand without its average density decreasing below d
Grid Density Preparation
Split space into an n x m grid with small cells
Increment the cells that a trajectory passes through
Neighborhood Function NR() determines which surrounding cells are also incremented
Regression: increment continuously along the trajectory, including between recorded points
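A rough sketch of the density step follows. It makes several assumptions not stated in the slides: cells are squares of side `cell`, each trajectory counts at most once per cell, and the continuous increment is approximated by sampling points along the straight line between consecutive locations. The spread to surrounding cells via NR() is omitted, and the name `grid_density` is hypothetical.

```python
from typing import List, Set, Tuple


def grid_density(trajectories: List[List[Tuple[float, float]]],
                 n: int, m: int, cell: float) -> List[List[int]]:
    """Count, for each cell of an n x m grid, how many trajectories cross it."""
    grid = [[0] * m for _ in range(n)]
    for traj in trajectories:
        touched: Set[Tuple[int, int]] = set()

        def mark(x: float, y: float) -> None:
            i, j = int(x // cell), int(y // cell)
            if 0 <= i < n and 0 <= j < m:
                touched.add((i, j))

        for x, y in traj:
            mark(x, y)
        # Sample along each segment (about two samples per cell width)
        # so cells between recorded points are also counted.
        for (x0, y0), (x1, y1) in zip(traj, traj[1:]):
            steps = max(1, int(2 * max(abs(x1 - x0), abs(y1 - y0)) / cell))
            for k in range(steps + 1):
                mark(x0 + (x1 - x0) * k / steps, y0 + (y1 - y0) * k / steps)
        for i, j in touched:
            grid[i][j] += 1  # at most one increment per trajectory per cell
    return grid


density = grid_density(
    [[(0.5, 0.5), (2.5, 0.5)], [(0.5, 0.5), (0.5, 1.5)]],
    n=3, m=3, cell=1.0,
)
```

Here `grid[i][j]` is indexed by the x cell first and the y cell second, so the shared starting cell (0, 0) ends up with density 2.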
Popular Regions Algorithm
Algorithm: PopularRegions(G, d)
Complexity: O(|G| log |G|)
Iteratively consider each dense cell
For each:
Expand in all four directions
Select the expansion that maximizes density
Repeat until expansion would decrease the average density below the density threshold
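The greedy expansion described above can be sketched as follows. This is an illustrative reading of the slide, not the paper's implementation: it assumes a cell is dense when its density is at least d, visits dense cells in decreasing order of density, grows rectangles by one row or column at a time, and never lets regions overlap. It recomputes averages naively, so it does not achieve the stated O(|G| log |G|) bound.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (i0, j0, i1, j1), inclusive cell indices


def popular_regions(grid: List[List[float]], d: float) -> List[Rect]:
    n, m = len(grid), len(grid[0])
    used = [[False] * m for _ in range(n)]
    regions: List[Rect] = []

    def avg(r: Rect) -> float:
        i0, j0, i1, j1 = r
        total = sum(grid[i][j] for i in range(i0, i1 + 1) for j in range(j0, j1 + 1))
        return total / ((i1 - i0 + 1) * (j1 - j0 + 1))

    def valid(r: Rect) -> bool:
        i0, j0, i1, j1 = r
        if i0 < 0 or j0 < 0 or i1 >= n or j1 >= m:
            return False
        # Reject rectangles that would overlap an already accepted region.
        return not any(used[i][j] for i in range(i0, i1 + 1) for j in range(j0, j1 + 1))

    # Assumption: "dense" means density >= d; densest cells are seeded first.
    dense = sorted(
        ((grid[i][j], i, j) for i in range(n) for j in range(m) if grid[i][j] >= d),
        reverse=True,
    )
    for _, i, j in dense:
        if used[i][j]:
            continue
        r: Rect = (i, j, i, j)
        while True:
            i0, j0, i1, j1 = r
            candidates = [(i0 - 1, j0, i1, j1), (i0, j0 - 1, i1, j1),
                          (i0, j0, i1 + 1, j1), (i0, j0, i1, j1 + 1)]
            candidates = [c for c in candidates if valid(c) and avg(c) >= d]
            if not candidates:
                break  # any further expansion would drop below the threshold
            r = max(candidates, key=avg)
        for a in range(r[0], r[2] + 1):
            for b in range(r[1], r[3] + 1):
                used[a][b] = True
        regions.append(r)
    return regions


rois = popular_regions([[5, 5, 0], [5, 5, 0], [0, 0, 0]], d=4)
```

For the example grid, the four dense cells merge into a single 2 x 2 rectangle, since adding any further row or column would pull the average below d.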
Results
Evaluating the T-Patterns
Compute the density of each cell of the grid
Compute the set of RoIs by determining Popular Regions
Translate the input trajectories into sequences of RoIs and timestamps for the transitions
Input the trajectories and times into the TAS mining algorithm [2]
Experiments
GPS Data
Fleet of 273 trucks in Athens, Greece
112,203 total points recorded
Running both static & dynamic pattern algorithms
Various parameter settings
Performance Analysis
Synthetic data generated by the CENTRE synthesizer
50% random & 50% predetermined
Pattern Mining Results
Static found: A t1 B t2 B
Dynamic found: A t1 B’ t2 B’’
Execution Time Results
• Increase linearly with increasing number of input trajectories (both algorithms)
• Grow when the density threshold decreases
• Static performs better with extreme thresholds
• Static does not perform well with mid-range thresholds
Additional Results
Increasing the radius of the spatial neighborhood yields irregular performance, and large values lead to poor execution times
Changing the time tolerance (t) yields results similar to those of the TAS mining algorithm
Increasing the number of points in each trajectory causes linear growth of execution times
Works Cited
[1] F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi. Trajectory Pattern Mining. In Proc. 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). ACM, 2007.
[2] F. Giannotti, M. Nanni, and D. Pedreschi. Efficient Mining of Sequences with Temporal Annotations. In Proc. SIAM Conference on Data Mining, pages 346–357. SIAM, 2006.