1. Travel route recommendation using geotagged photos
Takeshi Kurashima, Tomoharu Iwata, Go Irie, Ko Fujimura

2. The system's user interface

3. Block diagram of the system
[Block diagram. Inputs to the recommendation system: user location, user's mode of transport, user's free time and allowed margin, user-requested number of sequences, Flickr photos, user's photos. Output: the requested number of recommended landmark sequences, each fitting the given travel time.]

4. Necessary components
- Identifying landmarks in the area (no list is given) and naming them
- Estimating the travel time between landmarks for a given transportation method
- Estimating the time spent visiting each landmark
- Recommending landmark sequences to the user

5. Previous Work and Innovation
Previous work:
- Crandall et al.: extracting landmarks at various granularity levels from Flickr photos using mean-shift, and naming them
- Popescu et al.: popular trips within a city from photo data sets
- Choudhury et al.: constructing representative travel routes linking popular landmarks within a city, using the popularity of landmarks, stay times and transit times
- Popescu et al.: deducing the typical visit duration of a landmark
Main innovation here:
- Personalized recommendations based on the user's location history and implicit interests
- Estimation of travel times between landmarks for different modes of transport
- Building a complete recommender system implementing the ideas above

6. Flickr
- Many digital cameras and phones add a geo-location tag to images automatically.
- Flickr houses at least 221,883,830 geo-tagged, time-stamped photos from over 51 million users. http://www.flickr.com/map
- Time stamps are used for travel time estimation.
- Textual tags are used to name the extracted landmarks.
- Geo-tags are used for landmark extraction and recommendation.
- The actual photos are NOT used at all, only the metadata (fast).
- The Flickr API allows searching for public images taken in a given geobox or geo-circle, for non-commercial use.
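
A geobox query of this kind could be assembled roughly as follows. This is a minimal sketch, not the authors' crawler: the helper name `build_search_url`, its arguments, the default date and the example coordinates are illustrative assumptions. It only builds the request URL for the `flickr.photos.search` method and sends nothing; the exact parameter rules should be checked against the method documentation at the URL that follows.

```python
from urllib.parse import urlencode

# REST endpoint of the Flickr API.
FLICKR_REST = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, min_lon, min_lat, max_lon, max_lat,
                     min_taken_date="2000-01-01", page=1):
    """Build (but do not send) a geobox photo search request.

    Hypothetical helper for illustration. Only metadata is requested:
    geo-tags, the taken date and the textual tags, matching the slide's
    point that the image content itself is not needed.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        # bbox order: min longitude, min latitude, max longitude, max latitude
        "bbox": f"{min_lon},{min_lat},{max_lon},{max_lat}",
        "has_geo": 1,
        "min_taken_date": min_taken_date,  # assumed limiting filter for the geo query
        "extras": "geo,date_taken,tags",
        "format": "json",
        "nojsoncallback": 1,
        "page": page,
    }
    return FLICKR_REST + "?" + urlencode(params)


# Example: a box roughly around New York City (coordinates are illustrative).
example_url = build_search_url("YOUR_API_KEY", -74.1, 40.6, -73.8, 40.9)
```

Paging through the results would repeat the call with increasing `page` values; the returned geo-tags, time stamps and tags are the only fields the later steps rely on.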
http://www.flickr.com/services/api/flickr.photos.search.html

7. Assumptions
- Taking a picture of a place and uploading it to Flickr constitutes a recommendation. (There are not many "This museum was boring" photos.)
- The geo-locations of the camera and of the photographed object are treated as equivalent. (The lookout point is recommended, not the view.)
- Absolute time stamps of photos are NOT assumed to be correct, since many camera clocks aren't set. Time deltas between images are used instead.

8. From photos to landmarks
Clustering points in a two-dimensional space

9. Landmark Extraction Assumptions
- The probability of taking a photograph of a landmark is normally distributed as a function of the distance (> 0) from the landmark.
- Each photo is of one landmark. (A photo of a close object against the background of a distant one is a photo of the closer object.)

10. The Mean-Shift procedure
- Estimates the local maximum of the probability distribution of each cluster of photos, i.e. the location of a landmark.
- Its only parameter is the bandwidth.
- Iteratively compute for each photo, until it converges: [update formula not preserved]

11. From photos to landmarks
- Substitute the geo-location of each photo with the landmark it captures.
- Group successive photos by the same user of the same landmark into one photo. (Taking many pictures of a place isn't considered a stronger recommendation.)
- The time stamp of a grouped photo is the average of the time stamps of the first and last successive photos of the landmark.
- The textual name of each discovered landmark is the most common tag among all photos of that landmark.

12. Photographer behavior model
We want to estimate P(l_t | current location sequence, h_u): the probability that user u, with location history h_u (at landmark l_t1 at time t1, l_t2 at time t2, etc.), visits landmark l_t at time t.
We assume the photographer's decision on the next landmark to visit is a function of:
- The photographer's current location (sequence)
- The photographer's topics of interest

13.
Location-based Model
- Using a Markov model: the next landmark depends only on the n most recently visited landmarks, P(l_t | l_{t-1}, ..., l_{t-n}).
- For simplicity, a first-order Markov model is used: P(l_t | l_{t-1}).
- Maximum likelihood estimation: P(l_t = j | l_{t-1} = i) = N(i, j) / N(i), where N(i, j) counts the observed transitions from landmark i to landmark j and N(i) counts all transitions out of i.

14. Topic Model (PLSA)
- Each user is a distribution over topics z.
- Each topic is a distribution over the landmarks.
[Diagram: user, topic distribution P(z), landmark distributions]
- Using the law of total probability: P(l_t) = sum over z of P(l_t | z) P(z)
- Assuming h_u and l_t are conditionally independent given z, we get P(l_t | z, h_u) P(z | h_u) = P(l_t | z) P(z | h_u); summing over z gives P(l_t | h_u).

15. Expectation Maximization (EM)
Computes P(l_t | z) and P(z | h_u) for the topic formula iteratively, until convergence, using:
- E step: [formula not preserved]
- M step: [formula not preserved]

16. Markov-Topic Model
Assuming h_u and l_{t-1} are conditionally independent given l_t, we get, after derivation, a formula [not preserved] that combines a topic term, a Markov term and the normalizing factor P(l_{t-1} | h_u).

17. Generating travel routes
- Naive method: compute the probability of all possible routes of the given duration, based on the user's location and history, and choose the most probable ones.
- A best-first search is used on the probability tree:
[Tree diagram rooted at l0, with edges labeled by conditional probabilities such as P(l1 | l0, h_u), P(l2 | l0, h_u) and P(l3 | l0, h_u)]

18. Travel time estimation
The time delta between consecutive landmarks in a sequence represents the travel time between them using a specific mode of transport (and sometimes includes part of the visit times at both locations).

19. K-means
- K-means is run on the time deltas observed for each pair of landmarks.
- It identifies K typical travel times between them, corresponding to different transportation methods.
- K = 3 was chosen. Three peaks are visible here: [histogram not preserved]
- Google Maps gives estimates for walking, for public transportation and for driving.
- Walking is assumed to be the slowest, followed by public transport, then private car.

20. Experiments
- 696,394 photographs
- 71,718 users
- Photos taken within 20 km of the center of:
  - Washington D.C., New York City, Philadelphia and Boston on the East Coast
  - Los Angeles, San Francisco and Las Vegas on the West Coast

21. Choosing the number of topics
Rated by the precision of predicting the last landmark of each sequence, using 5-fold cross-validation

22. Results
Last-step prediction accuracy

23.
Sequence prediction accuracy under a time constraint

24. Comparison of travel time estimates against Google Maps
Does this reflect on the system or on Google Maps?

25. Routes per time period

26. Routes per transportation mode

27. Routes chosen by topic
- Routes suggested by the Markov model alone
- Routes suggested by the Markov-Topic model

28. Future Work
- Using the photographer's social network profile and friends list
- Considering opening hours, congestion and fees
- Evaluation in the field

29. Take-away points
- Look for data creatively.
- Building a complete system is a learning experience.
- To build a system, it's frequently necessary to use a variety of (AI) methods; it's good to have a diverse mental toolbox, or a diverse team.
- Testing is important: quantitative experiments on a large-scale dataset, with results statistically significantly better than the competitors.

30. Thanks! Questions?
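
As a supplement to slides 13 and 17, the first-order Markov estimate and the best-first search over the probability tree can be sketched as below. This is an illustration under stated assumptions, not the presented system: it uses only the Markov term (no topic model or user history), treats a route as finished when no further landmark fits into the time budget, forbids revisiting a landmark, and ignores visit durations. The function names, the `travel_time` table and the toy data are hypothetical.

```python
import heapq
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Maximum-likelihood first-order transitions:
    P(b | a) = count(a -> b) / count(a -> anything)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(c.values()) for b, n in c.items()}
            for a, c in counts.items()}


def best_routes(start, trans, travel_time, budget, k=1):
    """Best-first search: always expand the most probable partial route.

    Assumption: a route is complete once it cannot be extended within
    the budget. Because extending a route never raises its probability,
    complete routes come out in order of decreasing probability.
    """
    heap = [(-1.0, 0.0, [start])]  # (negated probability, elapsed time, route)
    done = []
    while heap and len(done) < k:
        neg_p, elapsed, route = heapq.heappop(heap)
        extended = False
        for nxt, p in trans.get(route[-1], {}).items():
            cost = travel_time.get((route[-1], nxt))
            if nxt in route or cost is None or elapsed + cost > budget:
                continue
            heapq.heappush(heap, (neg_p * p, elapsed + cost, route + [nxt]))
            extended = True
        if not extended and len(route) > 1:
            done.append((-neg_p, elapsed, route))
    return done


# Toy data: landmark sequences from four hypothetical photographers.
sequences = [["A", "B", "C"], ["A", "B", "D"], ["A", "C"], ["B", "C"]]
minutes = {("A", "B"): 20, ("A", "C"): 30, ("B", "C"): 15, ("B", "D"): 50}
trans = fit_transitions(sequences)
routes = best_routes("A", trans, minutes, budget=60, k=2)
```

With this toy data the search returns the route A, B, C first and A, C second. Plugging the Markov-Topic probability in place of `trans` would personalize the ranking without changing the search itself.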