TwitterSense: GeoVisual Analytics for Gaining Insights from Spatially-referenced Social Media
Dr. KAM Tin Seong, PhD.
Associate Professor of Information Systems (Practice)
School of Information Systems, SMU
Tel: 6828-0269
Welcome to TwitterSense
• http://3.ideatransitivity.appspot.com/viz
• http://3.ideatransitivity.appspot.com/playback
Content
• Research motivation
• Approaches
• Analysis & Results
• Future Work
Research Motivation
• Rapid growth of social media
– Facebook: 500 million users [1]
– Twitter: 200 million users [2]
• More updates posted than ever before
– Facebook: 7.5 billion items shared per week [1]
– Twitter: 1 billion tweets per week [2]
[1] Facebook Press Room. Accessed 13 Apr 2011 (http://www.facebook.com/press/info.php?statistics)
[2] Twitter Blog: #numbers, 14 March 2011 (http://blog.twitter.com/2011/03/numbers.html)
Why Twitter?
• Increase in number of mobile users
– Twitter: 182% increase from 2010 to 2011 [2]
• Apps available for all mobile platforms
Why Twitter?
• East-West line MRT breakdown (7th April 2011)
• Political Forum (2nd April 2011)
Location-based social networking
• Foursquare (March 2009)
• Google Latitude (February 2009)
• Facebook Places (August 2010)
Trendsmap
URL: http://trendsmap.com/, Stateless Systems
HyperCities: Egypt
T. Presner, Y. Kawano, D. Shepard, UCLA Digital Humanities Collaborative (http://egypt.hypercities.com/)
Our Research Focus
1. Discover methods to collect tweet data based on topic & location
2. Perform analysis on collected tweets
– Contents
– Social connections (replies, retweets)
– Geocoding
3. Develop a geovisual analytics tool for detecting and analysing geospatial patterns from social media data
Architecture
[Diagram components: Crawler, Tweet map, Data handler, Analysis, Web Services, Data Store, Google App Engine, Browser]
Data Collection
• 3 APIs available
API            Live data?  Historical data?  User profile data?  Formats    Limits
REST API       Delayed                                           XML, JSON  1 instance
Search API     Delayed                                           XML, JSON  Rate limited
Streaming API                                                    JSON       1 instance
Data Collection
• Parameters: Search location
– Radial bounds (Search API)
– Bounding box (Streaming API)
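The two ways of describing a search location can be sketched in Python as below. This is only an illustration, not the TwitterSense crawler itself: the helper names are hypothetical, and the Singapore coordinates are rough values chosen for the example. It assumes the Search API takes radial bounds as a `geocode` value of the form `lat,lon,radius` and the Streaming API takes a bounding box as a `locations` value with the south-west corner first and longitude before latitude.

```python
import math


def search_geocode(lat, lon, radius_km):
    """Radial bounds in the 'lat,lon,radius' form used for the Search API geocode value."""
    return f"{lat},{lon},{radius_km}km"


def streaming_locations(sw_lat, sw_lon, ne_lat, ne_lon):
    """Bounding box in the 'sw_lon,sw_lat,ne_lon,ne_lat' form (longitude first)."""
    return f"{sw_lon},{sw_lat},{ne_lon},{ne_lat}"


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def in_radius(lat, lon, centre_lat, centre_lon, radius_km):
    """True if the point lies within the radial bounds."""
    return distance_km(lat, lon, centre_lat, centre_lon) <= radius_km


def in_bounding_box(lat, lon, sw_lat, sw_lon, ne_lat, ne_lon):
    """True if the point lies inside the box (assumes the box does not cross the antimeridian)."""
    return sw_lat <= lat <= ne_lat and sw_lon <= lon <= ne_lon


# Rough, illustrative values for Singapore (assumptions, not taken from the slides).
SG_CENTRE = (1.3521, 103.8198)
SG_BOX = (1.15, 103.6, 1.47, 104.1)  # sw_lat, sw_lon, ne_lat, ne_lon

geocode_value = search_geocode(*SG_CENTRE, 5)
locations_value = streaming_locations(*SG_BOX)
```

The two membership checks are useful on the client side as well, for example to discard tweets whose parsed coordinates fall outside the area that was requested.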
Data Collection – Search API
[Diagram components: Search API, Data handler, Analysis, Web Services, Data Store]
Data Collection – Streaming API
[Diagram components: Streaming API, Data handler, Analysis, Web Services, Data Store]
Back-end modules: Web Services
• Expose data for external applications
• Raw tweet data & analysis results
• Formats: XML, JSON
• Invoked using URL & query string
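As a rough illustration of invoking such a service with a URL and query string, the Python sketch below builds a request URL. The endpoint path and the `dataset` and `format` parameter names are hypothetical; the slides do not show the real ones.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "http://example.invalid/services/tweets"


def build_request_url(dataset, fmt="json", base_url=BASE_URL):
    """Build a web-service URL that selects a dataset and a response format."""
    if fmt not in ("json", "xml"):
        raise ValueError("format must be 'json' or 'xml'")
    # urlencode escapes the values and joins them as key=value pairs.
    query = urlencode({"dataset": dataset, "format": fmt})
    return f"{base_url}?{query}"


url = build_request_url("SG_3April2011", fmt="xml")
```

Keeping the response format in the query string lets the same endpoint serve both XML and JSON clients.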
Data Collected
• Live View
– Snapshot of tweets
– Limited historical data
– Average size of dataset: 100-200 tweets
• Streaming API crawler
– Live tweet data, no historical data
– Location defined by bounding boxes
– Average size of dataset: 11,000 tweets
Datasets from Live View
• Crawled from various locations in Singapore
• 15 datasets containing 100-200 tweets each
• Tweets spanning 2-4 hours
Datasets from Streaming API crawler
• Crawled from the whole of Singapore
• Spans a 24-hour period starting at 7:00 am
• 2 datasets, each with > 11,000 tweets
– Sunday, 3rd April 2011
– Tuesday, 12th April 2011
Cleaning & preparation of data
• Location values
– Remove device/app prefixes
– Parse latitude-longitude values
• Geocode location text
– Google Geocoder API
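The location clean-up step might look something like the Python sketch below. It assumes that location values arrive as free text such as `iPhone: 1.3521,103.8198`, where a device or app name precedes a colon; the exact prefixes and the function names are assumptions for illustration. Values that do not parse as coordinates are returned as `None`, marking them as candidates for geocoding.

```python
import re

# A prefix is taken to be text that starts with a letter and ends at the first colon.
PREFIX_RE = re.compile(r"^\s*[^\W\d][^:]*:\s*")
# A coordinate pair is two signed decimal numbers separated by a comma.
COORD_RE = re.compile(r"(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)")


def strip_prefix(location):
    """Remove a leading device/app prefix such as 'iPhone: ' if one is present."""
    return PREFIX_RE.sub("", location).strip()


def parse_location(location):
    """Return (latitude, longitude) or None if the text is not a coordinate pair."""
    match = COORD_RE.fullmatch(strip_prefix(location))
    if match is None:
        return None  # free-text place name: needs geocoding instead
    lat, lon = float(match.group(1)), float(match.group(2))
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon


samples = ["iPhone: 1.3521,103.8198", "ÜT: 1.30, 103.85", "Jurong West, Singapore"]
parsed = [parse_location(s) for s in samples]
```

Only the values that come back as `None` would then be sent to a geocoder, which keeps the number of geocoding requests down.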
Data format
Tweet
Tweet ID
Text
Author
Time of tweet
Location (Latitude, Longitude)
Entities
Dataset name
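The record listed above could be represented, for instance, as a Python dataclass. The field names and types are assumptions chosen to mirror the list; the slides do not show how the data store actually models a tweet.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass
class Tweet:
    tweet_id: int
    text: str
    author: str
    created_at: datetime                      # time of tweet
    location: Optional[Tuple[float, float]]   # (latitude, longitude), None if unknown
    entities: Dict[str, List[str]]            # e.g. hashtags, user references, links
    dataset_name: str                         # groups tweets from one crawl


# A made-up record, for illustration only.
example = Tweet(
    tweet_id=1,
    text="Train delayed again #example",
    author="example_user",
    created_at=datetime(2011, 4, 12, 7, 0),
    location=(1.3521, 103.8198),
    entities={"hashtags": ["example"], "users": [], "links": []},
    dataset_name="example_dataset",
)
```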
Analysis Results
• Identified by dataset name
• Information
– Topics
– Hashtags
– User References
– Links
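A minimal Python sketch of how hashtags, user references, and links might be counted across a dataset is shown below. The regular expressions are simple approximations and the sample tweets are made up; topic extraction is not covered here, since the slides do not say how topics are derived.

```python
import re
from collections import Counter

LINK_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")


def summarise(texts, top_n=5):
    """Count hashtags, user references, and links over a list of tweet texts."""
    hashtags, users, links = Counter(), Counter(), Counter()
    for text in texts:
        links.update(LINK_RE.findall(text))
        # Drop links first so that '#' or '@' inside a URL is not miscounted.
        remainder = LINK_RE.sub("", text)
        hashtags.update(tag.lower() for tag in HASHTAG_RE.findall(remainder))
        users.update(name.lower() for name in MENTION_RE.findall(remainder))
    return {
        "hashtags": hashtags.most_common(top_n),
        "users": users.most_common(top_n),
        "links": links.most_common(top_n),
    }


sample_texts = [
    "Train down again #delay @example_user http://example.invalid/a#top",
    "Still stuck at the station #Delay #fail",
]
summary = summarise(sample_texts)
```

Lower-casing hashtags and user names merges variants such as `#Delay` and `#delay` into a single count.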
• Obtained through web services
• Further processing using client side scripts
– Manage map markers
– Render visualizations
– Control animation of visualization
Live View – Tweet Map
Live View – Analysis Results
Playback View
Playback View – Filtering
Findings
• Patterns of movement (data file: SG_3April2011)
Future Work
• Social network analysis and visualisation
Q & A