Overview / introduction to current Twitter research. This presentation was initially prepared for my colleagues at GESIS.
TWITTER RESEARCH: STATE OF THE ART AND OPEN CHALLENGES
Katrin Weller
@kwelle
Presentation at the DAS Colloquium, Cologne, 21 March 2013
Part 1: Introduction to Twitter
Twitter: A Short Introduction
Jack Dorsey (2000): “twttr sketch”. http://www.flickr.com/photos/jackdorsey/182613360/
Tweet = 140 characters
Followers and Followees
Twitter Terminology
Retweet (RT, via)
Kooti, F., Yang, H., Cha, M., Gummadi, K.P. & Mason, W.A. (2012). The Emergence of Conventions in Online Social Networks.
Proceedings of the International Conference on Weblogs and Social Media (ICWSM 2012), Dublin.
@message (≠ direct message)
Hashtags (#)
Hashtags
Memes
Conventions
Trends
Influence
#becareful!!! March 19, 2012
Twitter: Numbers
Founded in 2006 (independent platform since 2007)
March 2012: 140 million active users and 340 million Tweets a day
December 2012: more than 200 million users
USA, 2012: 15% of online adults use Twitter
Germany, 2012: 4% of population
PEW Internet: http://www.pewinternet.org/Reports/2012/Twitter-Use-2012.aspx
Twitter Blog: http://blog.twitter.com/2012/03/twitter-turns-six.html
@twitter: https://twitter.com/twitter/status/281051652235087872
ARD/ZDF Onlinestudie: http://www.ard-zdf-onlinestudie.de/fileadmin/Online12/0708-2012_Busemann_Gscheidle.pdf
Twitter vs. Facebook
In Germany
Facebook: 72.1% (of Internet users)
Twitter: 10.5% (of Internet users)
German Social Media Consumer Report: http://www.socialmediathinklab.com/wp-content/uploads/2013/02/WWU_Social-Media-Consumer-Report_0213_Ansicht.pdf
Twitter: Trivia
Twitter users with most followers?
Trends 2012: Olympics, US election
Most retweeted:
https://2012.twitter.com/de/golden-tweets.html
Twitter: Popular Users
March 2013, http://twittercounter.com/pages/100
Twitter Tools
Part 2: Twitter Research
Development of Twitter Research
[Figure: chart by year, 2007 to 2012; y-axis 0 to 1200. Legend: Scopus WoS SSCI Scopus: Social Science only]
Twitter vs. Facebook
[Figure: chart by year, 2007 to 2012; y-axis 0 to 1400. Legend: Scopus: Twitter, Scopus: Facebook]
Scopus: Publications from 70 countries
United States; 958
United Kingdom; 174
Japan; 166
China; 140
Germany; 115
Australia; 95
Canada; 77
Spain; 77
South Korea; 73
India; 58
Scopus: Publications by Discipline
Computer Science; 1621
Social Sciences; 519
Engineering; 375
Mathematics; 291
Business, Management and Accounting; 284
Medicine; 178
Decision Sciences; 115
Arts and Humanities; 98
Materials Science; 48
Psychology; 41
Top ten subject areas for Twitter research based on Scopus (TITLE-ABS-KEY(Twitter) AND PUBYEAR > 2006)
Why Study Twitter?
pointless babble?
https://twitter.com/lalasmommy2012/status/308574570720415745
Early Twitter Research
Java and colleagues (2007) characterised most tweets as “daily chatter”.
Pear Analytics study: 40% of tweets are pointless babble (Kelly, 2009).
Java, A., Song, X., Finin, T., & Tseng, B. (2007). Why we twitter: understanding microblogging usage and communities. In Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis (WebKDD/SNA-KDD ’07). ACM, New York, NY, USA, 56-65. DOI=10.1145/1348549.1348556 http://doi.acm.org/10.1145/1348549.1348556
Kelly, R. (2009). Twitter Study. Pear Analytics, retrieved from http://www.pearanalytics.com/wp-content/uploads/2012/12/Twitter-Study-August-2009.pdf
Twitter Evolution
Conversations evolved on Twitter
@ symbol for replies (Honeycutt & Herring, 2009)
RTs
New studies focusing on communication structures and networks
Honeycutt, C., & Herring, S. C. (2009). Beyond microblogging: Conversation and collaboration via Twitter. Proceedings of the Forty-Second Hawaii International Conference on System Sciences. Los Alamitos, CA: IEEE Press.
Kwak, H., Lee, C., Park, H., & Moon, S. (2010). What is Twitter, a Social Network or a News Media? In Proceedings of the 19th International World Wide Web (WWW) Conference, April 26-30, 2010, Raleigh NC, USA.
Follow me?
Selected Research Areas
de-banalising Twitter!
Brand Communication & Marketing
Crises & Natural Disasters
Elections & Politics
Popular Culture
Education & Scholarly Communication
Health Care & Diseases
Technical Research
Named Entity Extraction
Information Retrieval & Ranking
Sentiment Analysis
Network Analyses
Trend Detection
How to Study Twitter?
Twitter Data?
How to Collect Twitter Data?
Challenges:
Real-time archiving obligatory
Limited to portions of traffic
Twitter APIs
API = Application Programming Interface
Interface to collect data from web applications
Streaming API, REST APIs, Search API
STREAMING API
- push-based, live stream of data
- Researchers need tools that maintain a connection to this stream.
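What "maintaining a connection" involves can be illustrated with a small reconnect loop. This is a minimal sketch and not Twitter's own client: `open_stream` is a hypothetical callable standing in for whatever opens the connection and yields one JSON-encoded tweet per line, and the retry count and delays are arbitrary assumptions.

```python
import json
import time


def consume_stream(open_stream, handle_tweet, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Read line-delimited JSON from a stream and reconnect when it drops.

    open_stream  -- hypothetical callable returning an iterable of text lines
    handle_tweet -- callback invoked with each decoded tweet (a dict)
    """
    retries = 0
    while retries <= max_retries:
        try:
            for line in open_stream():
                line = line.strip()
                if not line:
                    continue  # skip blank keep-alive lines
                handle_tweet(json.loads(line))
                retries = 0  # data arrived, so reset the backoff counter
            return  # stream ended cleanly
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            sleep(base_delay * 2 ** (retries - 1))  # exponential backoff
```

Tweets that pass by while the collector is disconnected are not replayed by this loop, which is one reason why real-time archiving is listed as a challenge.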
Sample (1% or 10% of all tweets, probably random)
Track (tweets including specific keywords)
Follow (tweets from selected users)
Locations (for geotagged tweets)
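The four selection modes above can be mimicked locally on tweets that have already been collected. The sketch below is only an illustration under assumptions: the field names `text`, `user` and `coordinates` (as a longitude/latitude pair) are hypothetical, and the matching rules are simplified and not Twitter's actual ones.

```python
import random


def matches_track(tweet, keywords):
    """True if the tweet text contains any keyword (case-insensitive)."""
    text = tweet.get("text", "").lower()
    return any(k.lower() in text for k in keywords)


def matches_follow(tweet, users):
    """True if the tweet was written by one of the selected users."""
    return tweet.get("user") in users


def matches_location(tweet, box):
    """True if a geotagged tweet lies inside box = (west, south, east, north)."""
    coords = tweet.get("coordinates")
    if coords is None:
        return False  # not geotagged
    lon, lat = coords
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


def sample(tweets, rate, seed=None):
    """Keep each tweet independently with probability `rate` (e.g. 0.01)."""
    rng = random.Random(seed)
    return [t for t in tweets if rng.random() < rate]
```

For example, `[t for t in tweets if matches_track(t, ["#www2010"])]` would keep only tweets mentioning that hashtag.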
REST API
Limited requests per hour
Social graph data (who is following whom)
Access trending topics
And many more
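Because requests are limited per hour, a collector has to keep track of how many calls it has already made. The following is a minimal sketch of such bookkeeping. The class name `HourlyBudget` is hypothetical, and the actual limit is not stated on the slide, so it is left as a parameter.

```python
import time
from collections import deque


class HourlyBudget:
    """Allow at most `limit` calls within any rolling `window` seconds."""

    def __init__(self, limit, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()  # forget calls that left the window
        if len(self.calls) < self.limit:
            return 0.0
        return self.window - (now - self.calls[0])

    def record(self):
        """Note that a request was just made."""
        self.calls.append(self.clock())
```

A collector would call `wait_time()` before each request, sleep for that long if the result is positive, and call `record()` after sending the request.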
Your Twapperkeeper
Via RSS feed (e.g. Google Reader, Thunderbird)
Third parties ($)
Your own programmes
Gnip / DataSift
Your Twapperkeeper
Tweet Archivist
http://www.tweetarchivist.com/
GNIP
Examples
Twitter Visualisations
http://www.cci.edu.au/node/1362
The Australian Twitter-Sphere
Rhythm of a City
http://engineering.twitter.com/2012/06/studying-rapidly-evolving-user.html
Sentiments
http://www.ccs.neu.edu/home/amislove/twittermood/
Twitter in African Capitals
http://www.jeuneafrique.com/Article/ARTJAWEB20130215165826/internet-libreville-accra-addis-abebareseaux-sociaux-les-capitales-africaines-de-twitter-quartier-par-quartier.html#Tunis
Twitterleague: German Bundesliga Clubs during Season 2011/12
[Figure: number of followers per month, Jun 11 to Jun 12; y-axis 0 to 80000. Accounts shown:]
1. FC Augsburg (@FCAugsburg)
1. FC Kaiserslautern (@Rote_Teufel)*
1. FC Köln (@fckoeln)
1. FC Nürnberg (@1_fc_nuernberg)
1. FSV Mainz 05 (1FSVMainz05)
1899 Hoffenheim (achtzehn99)
Bayer 04 Leverkusen (@bayer04fussball)
Borussia Mönchengladbach (@VfLBorussia)
BVB Dortmund 09 I (@BVBDortmund09)
BVB Dortmund 09 II (@BVB)
FC Bayern München (@BayMuenchen)
FC Schalke 04 II (@s04, official)
FC Schalke 04 I (@FCSchalke04, unofficial)
Hamburger SV (@HSV)
Hannover 96 I (@ichbin96)
Hannover 96 II (@hannover96)
Hertha BSC Berlin (@HerthaBSC)*
SC Freiburg (@sc_freiburg)
SV Werder Bremen I (@Werder_Bremen)
SV Werder Bremen II (@werderbremen)
VfB Stuttgart (@VfB)
VfL Wolfsburg (@VfL_Wolfsburg)
See also: Weller, K., & Bruns, A. (2013). Das Spiel dauert 140 Zeichen. Wie deutsche Fußballvereine Twitter für Marketing und Fan-Kommunikation entdecken. In: HiER 2013. Proceedings des 8. Hildesheimer Evaluierungs- und Retrievalworkshop, April 2013: https://www.uni-hildesheim.de/media/fb3/informationswissenschaft/HIER/hier2013_proceedings_vorab.pdf
London Riots
http://www.guardian.co.uk/uk/2011/dec/07/twitter-riots-how-news-spread
Bruns, A., & Burgess, J. (2012). Notes towards the scientific study of Twitter. In Tokar, A., Beurskens, M., Keuneke, S., Mahrt, M., Peters, I., Puschmann, C., van Treeck, T., & Weller, K. (Eds.), Science and the Internet (pp. 159-169). Düsseldorf: Düsseldorf University Press. http://nfgwin.uni-duesseldorf.de/sites/default/files/Bruns.pdf
Part 3: Challenges
It's not all about the numbers…
Big Data vs. meaningful research questions
Representativeness
Influenced by, e.g.:
User statistics
Time of data collection
Reliability
Verified accounts
Verified stories?
What should we measure?
What is a link, a follower, a friend, a tweet worth?
How to interpret users' activities?
Typically:
- Tweets from a certain user / a group of users
- Tweets mentioning a certain user
- Tweets that contain a specific word or hashtag
- Random tweets
Also:
- Follower numbers
- Tweets containing (specific) URLs
Standard Metrics?
Number of tweets (per period, per user)
Number of users with at least one tweet
Structural Analysis of Tweets:
Original tweets, RTs, (modified RTs), @messages
Tweets containing URLs
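The metrics listed above are simple counts and could be computed along the following lines. This is a rough sketch under assumptions: tweets are dicts with hypothetical `user` and `text` fields, and the rules that separate retweets, modified retweets and @messages are heuristics chosen for illustration, not an agreed standard.

```python
import re
from collections import Counter

URL_PATTERN = re.compile(r"https?://\S+")


def classify(text):
    """Assign a tweet text to one structural type (heuristic, assumed rules)."""
    if re.match(r"RT @\w+", text):
        return "retweet"
    if re.search(r"\b(RT|via) @\w+", text):
        return "modified retweet"
    if text.startswith("@"):
        return "@message"
    return "original"


def standard_metrics(tweets):
    """Compute basic counts for a list of {'user': ..., 'text': ...} dicts."""
    per_user = Counter(t["user"] for t in tweets)
    return {
        "tweets": len(tweets),
        "active_users": len(per_user),  # users with at least one tweet
        "tweets_per_user": dict(per_user),
        "types": dict(Counter(classify(t["text"]) for t in tweets)),
        "with_urls": sum(1 for t in tweets if URL_PATTERN.search(t["text"])),
    }
```

Different classification rules yield different counts for the same dataset, which is part of why the slide puts a question mark behind "Standard Metrics".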
Standards? Time Series
[Figure: number of tweets per hour; x-axis: hours. By Cornelius Puschmann (@coffee001)]
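A tweets-per-hour series like the one in this figure can be derived from tweet timestamps. The sketch below assumes timestamps are available as ISO 8601 strings, which is an assumption about the stored format rather than something the slide specifies.

```python
from collections import Counter
from datetime import datetime


def tweets_per_hour(timestamps):
    """Count tweets per clock hour from ISO 8601 timestamp strings."""
    counts = Counter()
    for ts in timestamps:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        counts[hour] += 1
    return sorted(counts.items())
```

Hours without any tweets do not appear in the result, so they would need to be filled in with zeros before plotting a continuous line.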
Standards? Networks (@, RT, follower)
By Cornelius Puschmann (@coffee001)
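An @-based network of the kind named on this slide can be built from tweet texts as a weighted edge list. This is a minimal sketch assuming hypothetical `user` and `text` fields. It treats every @username in a tweet as a link from the author, so retweets and replies are not distinguished.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Return weighted directed edges (sender, mentioned_user) -> count."""
    edges = Counter()
    for t in tweets:
        sender = t["user"].lower()
        for target in MENTION.findall(t["text"]):
            target = target.lower()
            if target != sender:  # ignore self-mentions
                edges[(sender, target)] += 1
    return edges
```

The resulting edge list can be passed to a graph library or visualisation tool of choice. A follower network would instead need social graph data from the REST API.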
Standards? Active Users
[Figure: Users and their tweets during #www2010; y-axis: number of tweets, 0 to 200]
[Figure: @messages sent and @messages received; y-axis: number of @-messages, 0 to 40]
Legal vs. Ethical?
Legal Questions
Grey area in law
Basics:
Twitter Terms of Service
Twitter Rules of the Road
Twitter Privacy Policy
Ethical Questions
Not all that's legal is ethical
How to deal with user data?
How to anonymise the data?
Legal and Ethical Questions
Currently: common sense
Share as little individual-related data as possible
Don't make tweet collections publicly available
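One possible way to share less individual-related data is to replace user names with pseudonyms before analysis. The sketch below is an assumption about how this might be done, not a procedure from the talk: it uses a keyed hash (HMAC-SHA256 from the Python standard library) with a secret key the researcher keeps private, and the `user` and `text` fields are hypothetical.

```python
import hashlib
import hmac
import re


def pseudonymise(tweets, secret_key):
    """Replace user names (authors and @mentions) with keyed-hash pseudonyms."""

    def alias(name):
        digest = hmac.new(secret_key, name.lower().encode("utf-8"), hashlib.sha256)
        return "user_" + digest.hexdigest()[:12]

    cleaned = []
    for t in tweets:
        # Rewrite every @mention in the text with the same alias scheme.
        text = re.sub(r"@(\w+)", lambda m: "@" + alias(m.group(1)), t["text"])
        cleaned.append({"user": alias(t["user"]), "text": text})
    return cleaned
```

Pseudonyms alone do not make a dataset anonymous, because the remaining tweet text can often be traced back to its author with a simple search.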
Data Access & Long Term Accessibility
today vs. tomorrow
Reproducibility
Collecting the same data twice?
Working with existing datasets?
Cleaning the data?
Preservation
Library of Congress
ARCOMEM project
Thank you for your attention!
Upcoming
Edited Collection: Twitter and Society
Editors: Katrin Weller, Axel Bruns, Jean Burgess, Merja Mahrt & Cornelius Puschmann. To be published 2013, with Peter Lang.