Nowadays, social networks and media, such as Facebook, Twitter & Co, affect our communication and our exchange of knowledge more than ever. But what additional benefits can social media offer apart from easy interaction with friends, and how can they be used to create additional value for companies and institutions? These are the questions that the Social Computing area at the Know-Center addresses in detail. In this talk I give a brief overview of industry and non-industry research projects in the context of Big Data and social media that my group, Social Computing at the Know-Center, has recently been involved in. In particular, the talk highlights specific research project outcomes and work in progress that use social media data to help people explore the rapidly growing, overloaded information space more efficiently.
Social Computing @ Know-Center
1
. Christoph Trattner 29.8.2014 – PUC, Chile
Social Computing in the area of Big Data at the Know-Center
Christoph Trattner
Know-Center [email protected]
@Graz University of Technology, Austria
Before I start, I will talk a bit about myself and how I became Head of the Social Computing Research Area at the Know-Center, Austria's leading competence center for data-driven business and Big Data analytics.
Where do I come from (Austria)?
Graz
Academic background
• Studied Computer Science at Graz University of Technology & University of Pittsburgh
• Worked since 2009 as a scientific researcher at the KMI & IICM (BSc 2008, MSc 2009)
• My PhD thesis was on Search & Navigation in Social Tagging Systems (defended 2012)
• Since Feb. 2013 @ Know-Center
  • Leading the SC Area
  • At TUG: WebScience, Semantic Technologies
My team
2 Post-Docs, 5 Pre-Docs (4 more to join soon :))
2 MSc students, 2 BSc students
DI. Dieter Theiler
DI. Dominik Kowald
Dr. Peter Kraker
Mag. Sebastian Dennerlein
Dr. Elisabeth Lex
Mag. Matthias Rella
DI. Emanuel Lacic
Thanks to my Collaborators
What is my group doing?
... we research novel methods and tools that exploit social data to generate greater value for individuals, communities, companies, and society as a whole.
Our competences:
• Network & Web Science
• Science 2.0
• Predictive Modeling
• Social Network Analysis
• Information Quality Assessment
• User Modeling
• Machine Learning and Data Mining
• Collaborative Systems
Our services:
• Social Analytics: hub, expert, community, influencer, information flow, and trend (event) detection, etc.
• Information Quality Assessment
• Social & Location-based Recommender Systems
• Customer Segmentation
• Social Systems Design
What type of projects are we running?
COMET NON-K
EU FWF
Industry Projects
FFG ...
Non-Industrial Projects
Some industry partners...
The Projects
Project 1: Mendeley – UK Startup (recently acquired by Elsevier): Interested in the problem of hierarchical concept-based navigation.
Project 2: Blanc Noir – Austrian Startup: Interested in the problem of recommending items to users through social data.
Project 3: University of Pittsburgh & several Austrian companies: Interested in the usefulness of Twitter in academic conferences.
OK, let's start...
Project 1
Mendeley – UK Startup (recently acquired by Elsevier):
Interested in the problem of hierarchical concept-based navigation.
Research Question 1: What kind of meta-data is more useful for the task of navigation in information systems: tags or keywords?
Externals involved:
• Mendeley, London, UK
Helic, D., Körner, C., Granitzer, M., Strohmaier, M. and Trattner, C. 2012. Navigational Efficiency of Broad vs. Narrow Folksonomies. In Proceedings of the 23rd ACM Conference on Hypertext and Social Media (HT 2012), ACM, New York, NY, USA, pp. 63-72.
Mendeley
Mendeley Desktop (screenshot): Keywords and Tags
Task
• What is the best way to extract hierarchies from entities such as social tags or keywords?
• What is more useful for navigation: keyword or tag hierarchies?
Different types of hierarchy induction algorithms
Helic, D., Strohmaier, M., Trattner, C., Muhr, M. and Lerman, K.: Pragmatic Evaluation of Folksonomies, In Proceedings of the 20th International Conference on World Wide Web (WWW 2011), ACM, New York, NY, USA, 417-426, 2011.
Issue (!!!)
...no literature on what type of hierarchy is best suited for the task of navigation...
D. J. Watts, P. S. Dodds, and M. E. J. Newman. Identity and search in social networks. Science, 296:1302–1305, 2002.
J. M. Kleinberg. Navigation in a small world. Nature, 406(6798):845, August 2000.
Stanley Milgram (1933-1984)
• A social psychologist
• Yale and Harvard University
• Study on the Small World Problem, beyond well-defined communities and relations (such as actors, scientists, …)
• "An Experimental Study of the Small World Problem"
Set Up
• Target person:
  • A Boston stockbroker
• Three starting populations:
  • 100 "Nebraska stockholders"
  • 96 "Nebraska random"
  • 100 "Boston random"
Results
• How many of the starters would be able to establish contact with the target?
  • 64 out of 296 reached the target
• How many intermediaries would be required to link starters with the target?
  • Well, that depends: the overall mean was 5.2 links
  • Through hometown: 6.1 links
  • Through business: 4.6 links
  • Boston group faster than Nebraska groups
  • Nebraska stockholders not faster than Nebraska random
• What form would the distribution of chain lengths take?
Hierarchical decentralized searcher
Information Network
Hierarchy
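To make the idea of a hierarchical decentralized searcher concrete, here is a minimal sketch. It assumes the searcher only sees the neighbours of the current node in the information network and uses the hierarchy as background knowledge, always moving to the unvisited neighbour that is closest to the target in the hierarchy. The function names, the toy network, and the no-backtracking rule are illustrative assumptions, not the simulator used in the cited work.

```python
from collections import deque


def tree_distances(parent, target):
    """Hop distance from every hierarchy node to `target` (BFS over tree edges)."""
    children = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(node)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        neighbours = list(children.get(node, []))
        if parent.get(node) is not None:
            neighbours.append(parent[node])
        for nxt in neighbours:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def greedy_search(network, parent, start, target, max_hops=20):
    """Greedy navigation: hop to the unvisited neighbour nearest to the target in the hierarchy."""
    dist = tree_distances(parent, target)
    path, visited = [start], {start}
    while path[-1] != target and len(path) - 1 < max_hops:
        candidates = [n for n in network.get(path[-1], []) if n not in visited]
        if not candidates:
            return None  # dead end: this simple variant does not backtrack
        nxt = min(candidates, key=lambda n: (dist.get(n, float("inf")), n))
        visited.add(nxt)
        path.append(nxt)
    return path if path[-1] == target else None


# Hypothetical toy hierarchy (child -> parent) and link network.
parent = {"root": None, "a": "root", "b": "root", "a1": "a", "a2": "a", "b1": "b"}
network = {
    "a1": ["a2", "root"],
    "a2": ["a1", "root"],
    "root": ["a2", "b"],
    "b": ["root", "b1"],
    "b1": ["b"],
}
print(greedy_search(network, parent, "a1", "b1"))  # ['a1', 'root', 'b', 'b1']
```

The number of hops in the returned path is the quantity one would compare across different hierarchies, or against human click trails.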
Validation
• We compared simulations with human click trails from the online game The Wiki Game (http://thewikigame.com/)
• The dataset contains 1,500,000 click trails of more than 500,000 users with (start; target) information.
Hierarchy Creation (1)
Two types of hierarchies were evaluated.
1.) The first type is based on our previous work
• Categorial Concepts:
  • Tags from Delicious
  • Category labels from Wikipedia
Similarity Graph Latent Hierarchical Taxonomy
Wikipedia Category Label Dataset: 2,300,000 category labels, 4,500,000 articles, 30,000,000 category label assignments
Delicious Tag Dataset: 440,000 tags, 580,000 articles and 3,400,000 tag assignments
Hierarchy Creation (2)
2.) The second type is based on the work of [Muchnik et al. 2007]
Muchnik, L., Itzhack, R., Solomon, S. and Louzoun, Y.: Self-emergence of knowledge trees: Extraction of the Wikipedia hierarchies, Physical Review E 76, 016106 (2007)
Simple idea: The algorithm iterates through all links in the network and decides whether each link is of a hierarchical type. If it is, the link remains in the network; otherwise it is removed.
Dataset: Directed link network of the English Wikipedia from February 2012. All in all, the dataset includes around 10,000,000 articles and around 250,000,000 links.
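The link-filtering idea can be sketched in a few lines. The slide does not say how a link is judged to be "hierarchical", so the rule below is purely an assumption for illustration: a link is kept when it points from a less-linked article to a more-linked one (higher in-degree). This is not claimed to be the criterion of Muchnik et al.; the function name and example links are hypothetical.

```python
from collections import Counter


def filter_hierarchical_links(links):
    """Keep only links judged 'hierarchical' under an assumed in-degree rule."""
    in_degree = Counter(dst for _, dst in links)
    # Assumed rule: src -> dst is hierarchical if dst is linked to more often than src.
    return [(src, dst) for src, dst in links if in_degree[dst] > in_degree[src]]


# Hypothetical directed links between articles.
links = [
    ("Graz", "Austria"),
    ("Vienna", "Austria"),
    ("Austria", "Graz"),
    ("Salzburg", "Austria"),
]
print(filter_hierarchical_links(links))
# [('Graz', 'Austria'), ('Vienna', 'Austria'), ('Salzburg', 'Austria')]
```

Whatever rule is plugged in, the structure stays the same: one pass to collect node statistics, one pass over all links to keep or drop each of them.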
Validation Human Navigators
...OK, let's come back to the Mendeley "problem"...
Are keyword hierarchies more navigable than social tag hierarchies?
(Figure panels: Tags, Keywords)
Results: Our greedy navigator (= simulator) needs on average one more click to reach the target node with keywords than with tags.
Results: In simulations, we find that tag-based hierarchies are more efficient for navigation than keyword-based hierarchies.
...OK, let's move on to some (social) networking stuff :)
Project 2
Blanc Noir – Austrian Startup: Interested in the problem
of recommending items to users through social & location-based (social) data.
Research Question 2: To what extent are social network and location-based data useful for predicting trades or products in online and offline marketplaces?
Externals involved:
• Blanc Noir
• PUC, Chile
Trattner, C., Parra, D., Eberhard, L. and Wen, X.: Who will Trade with Whom? Predicting Buyer-Seller Interactions in Online Trading Platforms through Social Networks, In Proceedings of the ACM World Wide Web Conference (WWW 2014), ACM, New York, NY, 2014.
How did we answer that question?
• Major issue: There are no freely available data sets
• Idea: Crawl data from the virtual world of Second Life
• It comprises both:
  • Online social network & location-based (social) data
  • An Amazon/eBay-like marketplace
• https://my.secondlife.com/
• https://marketplace.secondlife.com/
Features
• In our analysis we focused on content (e.g., common interests) and network features (e.g., common interaction partners)
Example of network features we used in our analysis
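As an illustration of what a network feature for a seller/buyer pair might look like, the sketch below counts common interaction partners and, as an additional assumed feature, their Jaccard overlap. The function names and the toy interaction list are hypothetical; the actual feature set used in the study is not reproduced here.

```python
def build_neighbours(interactions):
    """Map each user to the set of users they interacted with (undirected)."""
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return neighbours


def pair_features(neighbours, seller, buyer):
    """Network features for one candidate seller/buyer pair."""
    ns = neighbours.get(seller, set())
    nb = neighbours.get(buyer, set())
    common = ns & nb
    union = ns | nb
    return {
        "common_partners": len(common),
        # Jaccard overlap is an assumed extra feature, shown for illustration only.
        "jaccard": len(common) / len(union) if union else 0.0,
    }


# Hypothetical social interactions between users.
interactions = [("s", "x"), ("s", "y"), ("b", "y"), ("b", "z")]
features = pair_features(build_neighbours(interactions), "s", "b")
print(features["common_partners"])  # 1
```

Each candidate pair becomes one feature vector, which is what a binary classifier is then trained on.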
Evaluation
• We split the dataset into two sets (one for training and one for testing)
• Trained a binary classifier
• Evaluation metric: Area Under the Curve (AUC)
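For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive example receives a higher classifier score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition; the labels and scores are made up, and a real evaluation would more likely rely on an existing library implementation.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical test-set labels (1 = trade happened) and classifier scores.
print(auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))  # 0.75
```

A classifier that scores pairs at random lands at 0.5 on this scale.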
Results: seller/buyer prediction
Baseline: 0.5 (random guessing)
Dataset:
• 131,087 seller profiles with 268,852 trading interactions.
• 169,035 social profiles with overall 3,175,304 social interactions.
Results: Although the combination of features from both social and trading networks did not show a significant improvement over trading network data alone, our experiments indicate that the online social network data improve the predictive accuracy of trading interactions over random guessing by 28% in a cold-start setting.
Follow-up (1): Experiment with location-based social network data
Task: Recommend items to users
User-based collaborative filtering
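A minimal sketch of user-based collaborative filtering, assuming each user is described by a profile of weighted items (for example, places visited): find the most similar users by cosine similarity and recommend the items they have that the target user lacks. The profile data, the neighbourhood size `k`, and the scoring rule are illustrative assumptions, not the configuration used in the experiment.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse profiles with positive weights."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(profiles, user, k=2, top_n=3):
    """Score unseen items by the similarity-weighted profiles of the k nearest users."""
    target = profiles[user]
    nearest = sorted(
        ((cosine(target, p), other) for other, p in profiles.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, other in nearest:
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, weight in profiles[other].items():
            if item not in target:
                scores[item] = scores.get(item, 0.0) + sim * weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical user profiles: item -> weight.
profiles = {
    "alice": {"cafe": 1, "museum": 1},
    "bob": {"cafe": 1, "museum": 1, "park": 1},
    "carol": {"gym": 1, "park": 1},
    "dave": {"cafe": 1, "cinema": 1},
}
print(recommend(profiles, "alice"))  # ['park', 'cinema']
```

Swapping the similarity function, for instance for one based on shared locations or social ties, changes which neighbours are chosen without touching the rest of the procedure.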
Follow-up (2)
Recsium Framework
• Near Real-Time Updates
• Real-Time Recommendations
• Deals with various sources of data
• RESTful API
Demo - Recsium
http://recsium.know-center.tugraz.at/recsium/
...currently working on
• Location-based services: shopping malls, train stations
• Technology: iBeacons
• Tasks: indoor navigation, indoor marketing, etc.
Project 3
University of Pittsburgh: Interested in the usefulness of Twitter in academic conferences.
Research Question 3: To what extent is Twitter useful for engaging newcomers (junior researchers) in academic conferences?
Externals involved:
• University of Pittsburgh, Pittsburgh, USA
• PUC, Chile
Wen, X., Parra, D. and Trattner, C.: How groups of people interact with each other on Twitter during academic conferences, In Proceedings of the 2014 ACM Conference on Computer Supported Cooperative Work (CSCW 2014), ACM, Baltimore, Maryland, USA.
Dataset
• Data: We collected tweets by searching for the hashtags of four conferences: Hypertext 2012 (#ht2012), UMAP 2012 (#umap2012), RecSys 2012 (#recsys2012), and ECTEL 2012 (#ectel2012).
• Tweet types: a) mentions, b) replies, c) retweets, and d) isolated tweets (none of a), b), c))
• Twitter user groups: a) Junior researcher (JR), b) Senior researcher (SR), c) Faculty (F), d) Industry (I), and e) Organizations (O).
Conference  Dates captured  # Users  # Total tweets  a) Mentions  b) Replies  c) RT  Not a), b), c)  % Users RT/mentioned/replied-to  # F  # I  # JR  # O  # SR
HT 12       June 24-28           61             254           24          19    105             106                           34.40%   19   16     6    4    15
UMAP 12     July 16-20           51             234           32          16    104              82                           37.30%   23    7     3    8    18
RECSYS 12   Sept. 10-13         266            2022          265          60   1087             610                           34.60%   61  120     6   19    53
ECTEL 12    Sept. 18-21          91             434           17         138     38             241                           46.20%   51   17     3   11    15
Who is receiving the attention?
Figures: three bar charts comparing the user groups (Faculty, Senior Researcher, Junior Researcher, Organization, Industry) across HT 12, UMAP 12, RECSYS 12, and ECTEL 12:
• Average Group Attention Per User
• Average Group Contribution Per User
• Conversion Ratio
Conversion Ratio (CR) = Attention / Contribution = (|mentioned| + |replied| + |RT|) / |tweets|
Results: Junior researchers show the lowest group attention and conversion ratio among all groups.
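The conversion ratio defined on this slide is simple to compute once the four counts are available. The sketch below applies the formula to made-up counts; the function name, the zero-tweet convention, and the example numbers are assumptions for illustration and are not taken from the dataset.

```python
def conversion_ratio(mentioned, replied, retweeted, tweets):
    """CR = attention received / contribution made."""
    if tweets == 0:
        return 0.0  # assumed convention: no contribution means no ratio to report
    attention = mentioned + replied + retweeted
    return attention / tweets


# Hypothetical counts: 3 mentions, 1 reply and 4 retweets received for 10 tweets posted.
print(conversion_ratio(mentioned=3, replied=1, retweeted=4, tweets=10))  # 0.8
```

A ratio below 1 means that, on average, a tweet attracts less than one mention, reply, or retweet in return.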
Who interacts with whom?
HT12
From\To                 F     SR    JR    O     I
Faculty (F)             0.43  0.16  0.20  0.16  0.05
Senior Researcher (SR)  0.46  0.19  0.15  0.12  0.08
Junior Researcher (JR)  0.52  0.00  0.12  0.20  0.16
Organization (O)        0.26  0.30  0.15  0.26  0.04
Industry (I)            0.27  0.31  0.19  0.19  0.04

UMAP12
From\To                 F     SR    JR    O     I
Faculty (F)             0.53  0.42  0.00  0.02  0.04
Senior Researcher (SR)  0.32  0.60  0.00  0.01  0.06
Junior Researcher (JR)  0.40  0.60  0.00  0.00  0.00
Organization (O)        0.50  0.40  0.00  0.10  0.00
Industry (I)            0.42  0.50  0.00  0.08  0.00

RECSYS12
From\To                 F     SR    JR    O     I
Faculty (F)             0.36  0.30  0.01  0.00  0.34
Senior Researcher (SR)  0.22  0.33  0.01  0.02  0.42
Junior Researcher (JR)  0.21  0.38  0.08  0.00  0.33
Organization (O)        0.15  0.26  0.02  0.08  0.49
Industry (I)            0.26  0.25  0.00  0.02  0.47

ECTEL12
From\To                 F     SR    JR    O     I
Faculty (F)             0.73  0.14  0.00  0.02  0.11
Senior Researcher (SR)  0.42  0.13  0.00  0.16  0.29
Junior Researcher (JR)  1.00  0.00  0.00  0.00  0.00
Organization (O)        0.20  0.20  0.00  0.27  0.33
Industry (I)            0.58  0.20  0.00  0.13  0.10

Results: Junior researchers are less involved in the conversation on Twitter than any other group of users.
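The rows of the From\To table sum to roughly 1, which suggests each row gives the share of a group's outgoing interactions that went to each target group. Under that reading, which is an assumption, such a matrix could be derived from a list of (sender group, receiver group) pairs as sketched below; the function name and the example interactions are hypothetical.

```python
GROUPS = ["F", "SR", "JR", "O", "I"]


def interaction_matrix(interactions, groups=GROUPS):
    """Row-normalised matrix: share of each group's interactions going to each group."""
    counts = {g: {h: 0 for h in groups} for g in groups}
    for src, dst in interactions:
        counts[src][dst] += 1
    matrix = {}
    for g in groups:
        total = sum(counts[g].values())
        matrix[g] = {h: (counts[g][h] / total if total else 0.0) for h in groups}
    return matrix


# Hypothetical interactions, one (sender group, receiver group) pair per mention, reply, or retweet.
interactions = [("JR", "F"), ("JR", "F"), ("JR", "SR"), ("F", "F")]
matrix = interaction_matrix(interactions)
print(round(matrix["JR"]["F"], 2))  # 0.67
```

Reading down the JR column of such a matrix shows how rarely other groups address junior researchers.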
Has usage changed over time?
Results: Retweets and Mentions increase over time. Replies and Mentions stay steady over time.
Has interaction changed over time?
Results: Our analysis reveals a steady growth in communication over Twitter over time. Interestingly, these conversations become less connected over time.
What keeps users returning over time?
Results: Eigenvector centrality is the most important feature to predict future conference participation followed by degree centrality.
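For reference, both centrality measures can be computed from an undirected interaction graph as sketched below. Degree centrality is the fraction of other users a user is connected to; eigenvector centrality is approximated by power iteration, here on A + I (each node keeps its own score in every step) so that the iteration also settles on bipartite graphs. The toy graph, function names, and iteration settings are illustrative assumptions, and the prediction model itself is not shown.

```python
import math


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {v: (len(nb) / (n - 1) if n > 1 else 0.0) for v, nb in adj.items()}


def eigenvector_centrality(adj, iterations=100, tol=1e-9):
    """Power iteration on A + I; returns a unit-length score vector."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbours' scores.
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(x * x for x in new.values()))
        new = {v: x / norm for v, x in new.items()}
        if max(abs(new[v] - score[v]) for v in adj) < tol:
            return new
        score = new
    return score


# Hypothetical undirected interaction graph: a triangle a-b-c plus a pendant user d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
centrality = eigenvector_centrality(adj)
print(max(centrality, key=centrality.get))  # c
```

Unlike degree, eigenvector centrality rewards being connected to well-connected users, which is why the two measures can rank the same user differently.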
...OK, that's basically it :)
...of course there are other projects
Thank you!
Christoph Trattner Email: [email protected] Web: christophtrattner.info Twitter: @ctrattner
Sponsors:
Any questions?