DESCRIPTION
Social bookmarking systems attract researchers in information systems and the social sciences because they offer an enormous quantity of user-generated annotations that reveal the interests of millions of people. In this paper, we explore a different viewpoint to gain an understanding of social bookmarking systems. Using data crawled from a large social tagging system, we argue that the prominence of a website, as measured by its status or public recognition, also determines its centrality. To test this hypothesis we predict the indexes of authority and other measures of centrality via Social Network Analysis. We also use Gephi to visualize the networks and analyze their structure. The results discussed in the paper come from a sample of 61,043 taggings involving 3,668 users and 4,913 bookmarked websites from a specific social bookmarking site, Delicious, on the subject of globalization of agriculture. We find that mass media companies have a competitive advantage in attracting links and user attention.
Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?
1st European Conference on Social Networks (EUSN)
1-4 July 2014, Barcelona, Spain
http://www.eusn.org
University of Huelva, Spain
Juan D. Borrero
Estrella Gualda
José Carpio
Why this paper?
Network centrality
High centrality is a property of large networks such as the Internet (Barabási and Albert, 1999; Barabási et al., 2000).
The Zipf/power law is a defining characteristic of large-scale networks such as the Web (e.g. Barabási and Albert, 1999), which implies a high degree of network centralization.
It is unclear whether this is also a feature of sub-networks.
Why this paper?
Node centrality and preferences
Web links are conferrers of “authority” (Kleinberg, 1999) or “endorsement” (Davenport and Cronin, 2000) and can indicate preferences (see Baldassarri and Diani, 2007).
We argue that the prominence of a website, as measured by its status or public recognition, also determines its centrality.
Little of the literature takes into account implicit relations between nodes within online networks (e.g., Ackland and O’Neil, 2011).
Ties among bookmarked websites can indicate preferences.
Why this paper?
Social Bookmarking
Provides a huge amount of user-generated annotations for web content.
Reflects the interests of millions of users.
Collective knowledge (Wisdom-of-crowds).
Research areas: (Web-) Search & Content classification, Ontology building, Trend detection, Recommendation, Sociology …
Source: http://blog.hubspot.com/blog/tabid/6307/bid/7372/9-Reasons-Why-Your-Social-Media-Strategy-Isn-t-Working.aspx/
Outline
Context and Topic of Study: Delicious; Delicious as a hyperlink network; Globalization of Agriculture
Objectives and Hypotheses
Methodology: Data collection; Analysis
Results: Network centralization; Top authoritative nodes; Network visualization
Discussion and Conclusions
Further Research
Possible Applications
Context and Topic of Study
Delicious
A free social bookmarking web service for storing, sharing and discovering web bookmarks.
• Content is created, annotated and viewed by its users.
• Users can tag each of their bookmarks on the Delicious website, which provides knowledge about the bookmarked URL.
• Users can view bookmarks added or annotated by other users.
Collective nature
Context and Topic of Study
Delicious as a hyperlink network
The structure of a social tagging website such as Delicious can be viewed as a network of three different, interconnected node types (a tripartite network): the users who make the annotations, the links to the resources (URLs), and one or more tags.
Within it we see the hyperlink network (user-URL).
We can also see indirect links (e.g. between URLs, drawn as straight lines), which represent a unipartite network.
[Figure: tripartite network of users (u, u’, u’’), URLs (url, url’, url’’) and tags (T, T’, T’’, T’’’)]
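A minimal sketch of how tripartite tagging data could be collapsed into the two networks described here. The sample taggings and the function names are hypothetical, and linking two URLs whenever one user bookmarked both is an assumption about how the URL-URL ties were formed; only the (user, URL, tag) structure comes from the slide.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical taggings: (user, url, tag) triples, as in the tripartite model.
taggings = [
    ("u1", "url_a", "globalization"),
    ("u1", "url_b", "agriculture"),
    ("u2", "url_a", "trade"),
    ("u2", "url_b", "food"),
    ("u2", "url_c", "food"),
    ("u3", "url_c", "wto"),
]


def user_url_edges(taggings):
    """Drop the tag layer: one user -> URL edge per distinct bookmark."""
    return {(user, url) for user, url, _tag in taggings}


def url_url_projection(edges):
    """Link two URLs when the same user bookmarked both (assumed rule).

    The weight of a URL-URL tie is the number of users who bookmarked both.
    """
    urls_by_user = defaultdict(set)
    for user, url in edges:
        urls_by_user[user].add(url)
    weights = defaultdict(int)
    for urls in urls_by_user.values():
        for a, b in combinations(sorted(urls), 2):
            weights[(a, b)] += 1
    return dict(weights)


edges = user_url_edges(taggings)
projection = url_url_projection(edges)
```

Here `edges` is the bipartite user-URL network and `projection` is the weighted unipartite URL-URL network derived from it.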
Context and Topic of Study
Topic
Globalization implies a large market as a result of the reduction of transaction costs in international trade.
Globalization of agriculture:
- trade (foods, goods)
- prices (food, goods)
- food consumption (bulk products versus processed products)
- R&D
- rules and laws (subsidies, WTO related to poverty)
Implications and effects: asymmetries.
Discussion/diffusion: more easily with Web 2.0.
Objectives and Hypotheses
1. To analyze the centrality of the Delicious sub-network.
H1: In the globalization of agriculture network on Delicious (sub-network), Zipf's law holds.
2. To link node centrality with public recognition.
H2: Websites with higher public recognition will be more likely to receive a large number of links.
3. To discover implicit relations between URLs.
H3: The ties among URLs can indicate preferences.
Methodology
Data collection
(A) Starting point: identify the search attributes, using an authoritative source as a baseline to find keywords connected to the idea of ‘globalization of agriculture’.
(B) A Perl web-crawling program gathered the sample of users, URLs and tags from 22 April 2011 to 21 May 2011.
(C) Result: 3,668 users and 4,913 URLs.
(D) A Haskell program reduced the amount of data by truncating the URLs.
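The slides do not say how the URLs were cut. The results tables list hostnames only, so the sketch below assumes that each bookmark was truncated to its host. It is written in Python for illustration and is not the authors' Haskell program; the function name and the sample bookmarks are hypothetical.

```python
from urllib.parse import urlsplit


def truncate_to_host(url):
    """Return the lower-cased hostname of a URL, or None if it has none."""
    # urlsplit only recognises the host when a scheme or a leading '//' is
    # present, so bare strings such as 'www.example.org/path' get a '//' prefix.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    return urlsplit(url).hostname


# Hypothetical bookmarks: two pages on one site collapse into a single node.
bookmarks = [
    "http://www.nytimes.com/2011/05/01/example-story.html",
    "http://www.nytimes.com/2011/05/02/another-story.html",
    "www.economist.com/node/123",
]
hosts = sorted({truncate_to_host(u) for u in bookmarks})
```

Collapsing pages into hosts in this way reduces the number of nodes, which is consistent with the drop from bookmarked pages to a smaller set of websites.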
Methodology
Analysis
Social Network Analysis (SNA) with Pajek and Gephi software:
1. studying the properties of centrality
2. visualizing through graphs
Results
H1: Network centralization
Hyperlink network (user-URL). The degree of variability in URL centrality scores according to indegree.
2,148 URLs arranged in rank order by number of inbound links (a URL's indegree is the sum of its inbound links).
Zipf's law: the network is highly centralized around a few nodes.
Only 10 of the 2,148 URLs (0.47%) account for 17.97% of links.
About 1% of URLs (22 of 2,148) account for 26.50% of links.
Why do a few websites receive more links?
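A minimal sketch of how this concentration could be measured from user-URL edges: rank URLs by indegree, compute the share of links held by the top k, and fit a log-log slope as an informal check for a Zipf-like shape. The edge list is hypothetical toy data and the function names are illustrative; only the last two lines use counts reported on the slide.

```python
import math
from collections import Counter


def indegree_ranking(edges):
    """Count inbound links per URL and sort from most to least linked."""
    counts = Counter(url for _user, url in edges)
    return sorted(counts.values(), reverse=True)


def top_share(ranked, k):
    """Fraction of all links received by the k most linked URLs."""
    return sum(ranked[:k]) / sum(ranked)


def loglog_slope(ranked):
    """Least-squares slope of log(indegree) against log(rank).

    A roughly straight line with negative slope on log-log axes is the
    usual informal signature of a Zipf-like distribution.
    Needs at least two ranked URLs.
    """
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(value) for value in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical user -> URL edges, not the Delicious sample.
edges = [("u%d" % i, "url_a") for i in range(8)]
edges += [("u%d" % i, "url_b") for i in range(4)]
edges += [("u%d" % i, "url_c") for i in range(2)]
edges += [("u0", "url_d")]

ranked = indegree_ranking(edges)   # indegrees in rank order
share_top1 = top_share(ranked, 1)  # share of links held by the top URL
slope = loglog_slope(ranked)

# Percentages of URLs quoted on the slide, recomputed from its counts.
pct_top10 = round(100 * 10 / 2148, 2)
pct_top22 = round(100 * 22 / 2148, 2)
```

The last two values reproduce the slide's figures: 10 of 2,148 URLs is about 0.47%, and 22 of 2,148 is about 1.02%, which the slide rounds to 1%.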
Results
H2: Top authoritative nodes in the Delicious “Globalization of agriculture” hyperlink network (user-URL)
Ten most central websites
Six of them are well-known media-based sites (online newspapers), alongside activist sites.

Rank  Indegree  URL                      Description
1     259       www.nytimes.com          Online newspaper
2     170       www.independent.co.uk    Online newspaper
3     155       www.naomiklein.org       Activist media site
4     144       www.news.bbc.co.uk/      Online newspaper
5     124       www.globalresearch.ca    Activist media site
6     95        www.spiegel.de/          Online newspaper
7     94        www.guardian.co.uk/      Online newspaper
8     94        www.economist.com/       Online newspaper
9     87        www.corpwatch.org        Activist media site
10    172       www.theatlantic.com      Online magazine

Alexa.com: 130; 568; 1,010,476; 63; 10,795; 170; 14,493; 1,747; 335,338; 1,063
www.uab.cat: 29,555
The popularity of the bookmarked website determines its centrality.
Results
H2: Top authoritative nodes in the unipartite network (URL-URL)
Ten most central websites
Again the majority are well-known media-based sites (online newspapers) and Web 2.0 sites (YouTube and Wikipedia).

Rank  Degree  URL                      Closeness  URL                      Betweenness  URL
1     537     www.nytimes.com          0.4421     www.nytimes.com          0.0930       www.nytimes.com
2     386     www.news.bbc.co.uk       0.4180     www.news.bbc.co.uk       0.0593       www.news.bbc.co.uk
3     337     www.economist.com        0.4068     www.guardian.co.uk       0.0366       www.globalresearch.ca
4     324     www.guardian.co.uk       0.3992     www.economist.com        0.0341       www.guardian.co.uk
5     286     www.ft.com               0.3886     www.rodrik.typepad.com   0.0293       www.naomiklein.org
6     257     www.rodrik.typepad.com   0.3868     www.ft.com               0.0290       www.economist.com
7     243     www.en.wikipedia.org     0.3854     www.en.wikipedia.org     0.0262       www.wikipedia.org
8     222     www.youtube.com          0.3820     www.spiegel.de           0.0207       www.youtube.com
9     218     www.spiegel.de           0.3814     www.washingtonpost.com   0.0191       www.spiegel.de
10    217     www.globalresearch.ca    0.3800     www.globalresearch.ca    0.0184       www.ft.com

Alexa: 3; 6
Ties based on preferences when bookmarking URLs.
The popularity of the bookmarked website determines its centrality.
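For readers who want to see what the three measures in the table compute, here is a minimal sketch on a toy undirected graph. The study used Pajek and Gephi; this is not their implementation, and the exact normalization they applied is not stated in the slides. The graph, node names, and function names are hypothetical. Betweenness uses Brandes' algorithm for unweighted graphs.

```python
from collections import deque


def degree(graph):
    """Number of neighbours of each node."""
    return {v: len(neigh) for v, neigh in graph.items()}


def closeness(graph):
    """(reachable nodes) / (sum of shortest-path distances), via BFS.

    On a connected graph this is the usual (n - 1) / sum of distances.
    """
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Brandes' algorithm (unweighted, undirected), normalized to [0, 1]."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        order = []                        # nodes in order of discovery
        preds = {v: [] for v in graph}    # predecessors on shortest paths
        sigma = dict.fromkeys(graph, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):         # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(graph)
    # Every pair is counted from both endpoints, so dividing the raw sum by
    # (n - 1)(n - 2) equals dividing the pair count by (n - 1)(n - 2) / 2.
    scale = (n - 1) * (n - 2)
    return {v: (score[v] / scale if scale else 0.0) for v in graph}


# Hypothetical star graph: one hub site co-bookmarked with three others.
graph = {
    "hub": {"x", "y", "z"},
    "x": {"hub"},
    "y": {"hub"},
    "z": {"hub"},
}
deg = degree(graph)
clo = closeness(graph)
btw = betweenness(graph)
```

In the star graph the hub has the highest value on all three measures, which mirrors how a single site such as www.nytimes.com can lead every column of the table.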
Results
H3: Visualization of the URL-URL unipartite network
Colour: communities. Layout: ForceAtlas2 from Gephi.
We found two different communities:
1. the mass media websites belong to the blue one, and
2. the main activist websites are included in the green cluster.
Ties based on user preferences.
[Figure panels: Complete network; Main communities. Node size: Degree; Betweenness.]
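The slide does not say how the communities were detected. Gephi's built-in community statistic is modularity-based, so the sketch below assumes a modularity view and only scores a given partition; it does not search for one. The toy graph, the community labels, and the function name are hypothetical.

```python
def modularity(graph, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2,
    where m is the total number of edges. Assumes at least one edge.
    """
    m = sum(len(neigh) for neigh in graph.values()) / 2
    internal = {}     # edges with both ends inside the community
    degree_sum = {}   # total degree of the community's nodes
    for v, neigh in graph.items():
        c = community_of[v]
        degree_sum[c] = degree_sum.get(c, 0) + len(neigh)
        for w in neigh:
            if community_of[w] == c:
                # Each internal edge is seen once from each end.
                internal[c] = internal.get(c, 0) + 0.5
    return sum(
        internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical graph: two triangles joined by a single tie (c - d).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
two_groups = {"a": "media", "b": "media", "c": "media",
              "d": "activist", "e": "activist", "f": "activist"}
one_group = dict.fromkeys(graph, "all")

q_two = modularity(graph, two_groups)
q_one = modularity(graph, one_group)
```

Splitting the toy graph into its two triangles scores higher than keeping all nodes in one group, which is the kind of contrast a modularity-based colouring relies on.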
Discussion and conclusions
• Very unequal distribution of power among the URLs bookmarked on the topic of globalization of agriculture (O1)
– The most bookmarked URLs seem to reflect the preferences of users when bookmarking websites.
– The user-URL network reflects a strong centralization of preferences and the existence of a long tail.
• Maybe not by chance, the main nodes represent authoritative websites in the world (O2)
– Well-known mass media and popular activists far surpass the other bookmarked resources.
• The collaborative practice of bookmarking websites in Delicious can allow us to discover virtual communities (O3)
– The URL-URL unipartite network produces clusters of URLs.
– Identifying key collective actors (represented here through URLs) allows a better understanding of leadership, influence processes, and power-related structures.
Further research
• Why is ‘that’ URL so important in the network of globalization of agriculture?
– Key URLs in this type of network could configure and reconfigure the evolution of the network (TIME), and structure and even manipulate the type of interchange of resources in Delicious or in similar bookmarking sites.
• Go in depth about users.
– To identify key actors that share URLs.
• Distinction between scientists and other professionals.
– To distinguish users by the way they use some tags.
• Distinction between scientists / other professionals or users?
• Identify users with the same tagging patterns, or URLs that were similarly labelled: study structural equivalences.
• Is it by chance? Do the most prominent actors correspond to a profile of very active and participative people? Do they usually work (or have a hobby) in this area, and is this why they tag so many URLs in Delicious?
Possible Applications
• Producing and “manipulating” public opinion (when recommending and describing websites) and markets
– If we know the interests of users belonging to a network, we could also make recommendations.
• For social practitioners, it is a good way to identify key informants in a community through whom to disseminate useful and important information.
• Important for researchers interested in formulating strategies for intervention and mobilization.
• Applications in advertising, e-commerce, social movements, security…
• …