Hyperlink Formation in Social Bookmarking Systems: Who is Who Online? 1st European Conference on Social Networks (EUSN), 1-4 July 2014, Barcelona, Spain. http://www.eusn.org. University of Huelva, Spain. Juan D. Borrero, [email protected]; Estrella Gualda, [email protected]; José Carpio, [email protected]



DESCRIPTION

Social bookmarking systems attract researchers in information systems and the social sciences because they offer an enormous quantity of user-generated annotations that reveal the interests of millions of people. In this paper, we take a different viewpoint to gain an understanding of social bookmarking systems. Using data crawled from a large social tagging system, we argue that the prominence of a website, as measured by its status or public recognition, also determines its centrality. To test this hypothesis, we predict authority indexes and other measures of centrality via Social Network Analysis. We also use Gephi to visualize the networks and analyze their structure. The results discussed in the paper come from a sample of 61,043 taggings involving 3,668 users and 4,913 bookmarked websites from a specific social network site, Delicious, on the subject of the globalization of agriculture. We find that mass media companies have a competitive advantage in attracting links and user attention.

Citation preview

Page 1: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

1st European Conference on Social Networks (EUSN)

1-4 July 2014, Barcelona, Spain

http://www.eusn.org

University of Huelva, Spain

Juan D. Borrero, [email protected]

Estrella Gualda, [email protected]

José Carpio, [email protected]

Page 2: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Why this paper? Network centrality

High centrality is a property of large networks such as the Internet (Barabási and Albert, 1999; Barabási et al., 2000).

The Zipf/power law is a defining characteristic of large-scale networks such as the Web (e.g. Barabási and Albert, 1999), which implies a high degree of network centralization.

It is unclear whether this also holds for sub-networks.
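A rough way to see whether a set of link counts looks Zipf-like is to regress log(count) on log(rank) and inspect the slope. The sketch below does that with a plain least-squares fit. The function name `zipf_slope` and the toy counts are hypothetical, and a log-log fit is only a first check, not a rigorous test of a power law.

```python
import math

def zipf_slope(counts):
    """Least-squares slope of log(count) against log(rank)."""
    # Rank 1 is the largest count; zero counts cannot be log-transformed.
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive counts")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical counts that follow count = 1000 / rank exactly.
toy_counts = [1000 / r for r in range(1, 51)]
slope = zipf_slope(toy_counts)
```

A slope near -1 on real indegree data would be in line with a classic Zipf distribution; slopes far from that suggest a different kind of skew.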

Page 3: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Why this paper? Node centrality and preferences

Web links are conferrers of “authority” (Kleinberg, 1999) or “endorsement” (Davenport and Cronin, 2000) and can indicate preferences (see Baldassarri and Diani, 2007).

We argue that the prominence of a website, as measured by its status or public recognition, also determines its centrality.

Little of the literature takes into account implicit relations between nodes within online networks (e.g., Ackland and O’Neil, 2011).

Ties among bookmarked websites can indicate preferences.

Page 4: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Why this paper? Social Bookmarking

Provides a huge amount of user-generated annotations for web content.

Reflects the interests of millions of users.

Collective knowledge (Wisdom-of-crowds).

Research areas: (Web-) Search & Content classification, Ontology building, Trend detection, Recommendation, Sociology …

Source: http://blog.hubspot.com/blog/tabid/6307/bid/7372/9-Reasons-Why-Your-Social-Media-Strategy-Isn-t-Working.aspx/

Page 5: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Outline

• Context and Topic of Study

– Delicious

– Delicious as a hyperlink network

– Globalization of Agriculture

• Objectives and Hypotheses

• Methodology

– Data collection

– Analysis

• Results

– Network centralization

– Top authoritative nodes

– Visualization network

• Discussion and Conclusions

• Further Research

• Possible Applications

Page 6: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Context and Topic of Study: Delicious

A free social bookmarking web service for storing, sharing and discovering web bookmarks.

Page 7: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Context and Topic of Study: Delicious

• Content is created, annotated and viewed by its users.

• Users can tag each of their bookmarks on the Delicious website, which provides knowledge about the bookmarked URL.

• Users can view bookmarks added or annotated by other users.

Collective nature

Page 8: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Context and Topic of Study: Delicious as a hyperlink network

The structure of a social tagging website such as Delicious can be viewed as a network of three different, interconnected node types (a tripartite network): the users who make the annotations, the links to the resources (URLs), and one or more tags.

Within it we see the hyperlink network (u-url).

We can also see indirect links (e.g. between URLs, drawn as straight lines), which represent a unipartite network.

[Figure: tripartite network diagram with user nodes (u, u’, u’’), URL nodes (url, url’, url’’) and tag nodes (T, T’, T’’, T’’’).]
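The step from the user-URL network to a URL-URL unipartite network can be sketched as a projection. The slides do not state the exact rule used, so the sketch assumes that two URLs are tied whenever the same user bookmarked both, with the number of such users as the tie weight. The function name `project_urls` and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def project_urls(bookmarks):
    """Build weighted URL-URL ties from (user, url) bookmark pairs.

    Assumption: two URLs are tied when at least one user bookmarked both;
    the weight is the number of users who did so.
    """
    urls_by_user = defaultdict(set)
    for user, url in bookmarks:
        urls_by_user[user].add(url)

    weights = defaultdict(int)
    for urls in urls_by_user.values():
        # Sorting gives each undirected tie a single canonical key.
        for a, b in combinations(sorted(urls), 2):
            weights[(a, b)] += 1
    return dict(weights)

# Hypothetical bookmarks: three users, three URLs.
toy_bookmarks = [
    ("u1", "a.example"), ("u1", "b.example"),
    ("u2", "a.example"), ("u2", "b.example"), ("u2", "c.example"),
    ("u3", "c.example"),
]
ties = project_urls(toy_bookmarks)
```

A projection through shared tags instead of shared users would follow the same pattern with (tag, url) pairs as input.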

Page 9: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Context and Topic of Study: Topic

Globalization implies a large market as a result of the reduction in transaction costs of international trade.

Globalization of agriculture:

- trade (foods, goods)

- prices (food, goods)

- food consumption (bulk products versus processed products)

- R&D

- rules and laws (subsidies, WTO related to poverty)

[Diagram labels: implications; asymmetries; effects; discussion/diffusion Web 2.0; more easily.]

Page 10: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Objectives and Hypotheses

1. To analyze the centrality of the Delicious sub-network

2. To link node centrality with public recognition

3. To discover implicit relations between URLs

Page 11: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Objectives and Hypotheses

1. To analyze the centrality of the Delicious sub-network

H1: In the globalization of agriculture network on Delicious (a sub-network), Zipf's law holds.

2. To link node centrality with public recognition

H2: Websites with higher public recognition will be more likely to receive a large number of links.

3. To discover implicit relations between URLs

H3: The ties among URLs can indicate preferences.

Page 12: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Methodology: Data collection

(A) Starting point: identify the search attributes. An authoritative source was used as a baseline to find keywords connected to the idea of ‘globalization of agriculture’.

(B) A Perl web-crawling program gathered the sample of users, URLs and tags from 22 April 2011 to 21 May 2011.

(C) Result: 3,668 users on 4,913 URLs.

(D) A Haskell program reduced the amount of data by cutting the URLs.
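Step (D) is not described in detail. One plausible reading, consistent with the host-level addresses that appear in the results, is that each bookmarked URL was truncated to its host name so that all pages of one site collapse into a single node. The sketch below illustrates that reading in Python rather than the Haskell the authors used; `cut_url` and the sample addresses are hypothetical.

```python
from urllib.parse import urlsplit

def cut_url(url):
    """Reduce a bookmarked URL to its lower-cased host name.

    Assumption: "cutting" means keeping only the host part of the URL.
    """
    if "://" not in url:
        url = "http://" + url  # urlsplit only finds the host after a scheme
    host = urlsplit(url).hostname
    return host or ""

# Hypothetical page-level bookmarks.
toy_urls = [
    "http://www.nytimes.com/2011/04/22/business/global/story.html",
    "https://WWW.NYTimes.com/section/world?page=2",
    "www.economist.com/node/123",
]
hosts = sorted({cut_url(u) for u in toy_urls})
```

Collapsing pages into hosts in this way shrinks the node set and concentrates links on whole sites, which matters when interpreting indegree.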

Page 13: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Methodology: Analysis

Social Network Analysis (SNA) with Pajek and Gephi software:

1. Studying the properties of centrality

2. Visualizing through graphs
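For readers who want to see what the centrality properties measure, here is a minimal pure-Python sketch of degree, closeness and betweenness on an undirected, unweighted graph stored as an adjacency dictionary. It is not the Pajek or Gephi implementation, and the normalizations those tools apply may differ from the ones assumed here. All function names and the toy graph are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def degree_centrality(adj):
    """Number of neighbours of each node."""
    return {node: len(nbs) for node, nbs in adj.items()}

def closeness_centrality(adj):
    """(reachable nodes) / (sum of distances to them); 0 for isolated nodes."""
    scores = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        total = sum(dist.values())
        scores[node] = (len(dist) - 1) / total if total else 0.0
    return scores

def betweenness_centrality(adj):
    """Brandes' algorithm, unnormalized, for an undirected unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies back towards s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2 for v, b in score.items()}  # each pair was counted twice

# Hypothetical star graph: one hub tied to three leaves.
toy_adj = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
degree = degree_centrality(toy_adj)
closeness = closeness_centrality(toy_adj)
betweenness = betweenness_centrality(toy_adj)
```

In the star graph the hub scores highest on all three measures, which is the pattern a highly centralized network shows at larger scale.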

Page 14: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Results: H1: Network centralization

Hyperlink network (user-URL). The degree of variability in URL centrality scores according to indegree.

[Figure: 2,148 URLs arranged in rank order by number of inbound links (URL indegree: sum of total inbound links).]

Zipf law: the network is highly centralized within a few nodes.

How come a few websites receive more links?

Only 10 URLs out of 2,148 (0.47%) account for 17.97% of links. 1% of URLs (22 out of 2,148) account for 26.50% of links.
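The concentration figures above are shares of total indegree held by the top-ranked URLs. A minimal sketch of that calculation follows; `top_share` and the toy indegree list are hypothetical and are not the study's data.

```python
def top_share(indegrees, k):
    """Fraction of all inbound links received by the k most linked URLs."""
    ranked = sorted(indegrees, reverse=True)
    total = sum(ranked)
    if total == 0:
        raise ValueError("no links in the network")
    return sum(ranked[:k]) / total

# Hypothetical indegree values that sum to 100 links.
toy_indegrees = [50, 20, 10, 5, 5, 4, 3, 1, 1, 1]
share_top1 = top_share(toy_indegrees, 1)  # 50 of 100 links
share_top2 = top_share(toy_indegrees, 2)  # 70 of 100 links
```

Applied to the real indegree sequence with k = 10 and k = 22, the same function would reproduce the two percentages reported on the slide.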

Page 15: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Results: H2: Top authoritative nodes in the Delicious “Globalization of agriculture” hyperlink network (user-URL)

Ten most centralized websites

Six of them were well-known media sites (online newspapers), alongside activist sites.

Rank  Indegree  URL                     Description          Alexa.com rank
1     259       www.nytimes.com         Online newspaper     130
2     170       www.independent.co.uk   Online newspaper     568
3     155       www.naomiklein.org      Activist media site  1,010,476
4     144       www.news.bbc.co.uk/     Online newspaper     63
5     124       www.globalresearch.ca   Activist media site  10,795
6     95        www.spiegel.de/         Online newspaper     170
7     94        www.guardian.co.uk/     Online newspaper     14,493
8     94        www.economist.com/      Online newspaper     1,747
9     87        www.corpwatch.org       Activist media site  335,338
10    172       www.theatlantic.com     Online magazine      1,063

www.uab.cat: 29,555

The popularity of the bookmarked website determines its centrality.
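One way the relation between indegree and public recognition could be examined is a Spearman rank correlation between indegree and Alexa.com rank. The slides do not say that the authors computed one, so this is only an illustrative sketch. It pairs the indegree and Alexa.com values in the order they are listed on the slide, which is an assumption about how the two columns line up; the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Values as listed on the slide, paired in order (assumed alignment).
indegree = [259, 170, 155, 144, 124, 95, 94, 94, 87, 172]
alexa_rank = [130, 568, 1010476, 63, 10795, 170, 14493, 1747, 335338, 1063]
rho = spearman(indegree, alexa_rank)
```

Because a smaller Alexa.com rank means a more visited site, a negative `rho` would point in the direction of H2. With only the ten top-ranked sites, the value says little about the network as a whole.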

Page 16: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Results: H2: Top authoritative nodes in the unipartite network (URL-URL)

Ten most centralized websites

Again, the majority were well-known media sites (online newspapers) and Web 2.0 sites (YouTube and Wikipedia).

Rank  Degree                        Closeness                        Betweenness
1     537  www.nytimes.com          0.4421  www.nytimes.com          0.0930  www.nytimes.com
2     386  www.news.bbc.co.uk       0.4180  www.news.bbc.co.uk       0.0593  www.news.bbc.co.uk
3     337  www.economist.com        0.4068  www.guardian.co.uk       0.0366  www.globalresearch.ca
4     324  www.guardian.co.uk       0.3992  www.economist.com        0.0341  www.guardian.co.uk
5     286  www.ft.com               0.3886  www.rodrik.typepad.com   0.0293  www.naomiklein.org
6     257  www.rodrik.typepad.com   0.3868  www.ft.com               0.0290  www.economist.com
7     243  www.en.wikipedia.org     0.3854  www.en.wikipedia.org     0.0262  www.wikipedia.org
8     222  www.youtube.com          0.3820  www.spiegel.de           0.0207  www.youtube.com
9     218  www.spiegel.de           0.3814  www.washingtonpost.com   0.0191  www.spiegel.de
10    217  www.globalresearch.ca    0.3800  www.globalresearch.ca    0.0184  www.ft.com

Ties are based on preferences when bookmarking URLs.

The popularity of the bookmarked website determines its centrality.

Page 17: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Results: H3: Visualization of the URL-URL unipartite network. Colour: communities. Layout: ForceAtlas2 from Gephi.

We found two different communities:

1. the mass media websites belong to the blue one, and

2. the main activist websites are included in the green cluster.

Ties based on user preferences.

[Figure panels: complete network; main communities; node size by degree; node size by betweenness.]
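The slides do not say how the colour classes were obtained. Assuming they come from a modularity-based partition, the kind of grouping Gephi's modularity statistic produces, the quality of a given split can be scored with Newman's modularity Q. The sketch below computes Q for a fixed assignment of nodes to communities; it does not search for the best partition. `modularity`, the toy edges and the toy labels are hypothetical.

```python
def modularity(edges, community_of):
    """Newman's modularity Q for an undirected, unweighted edge list."""
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = {}    # edges with both ends in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    # Q = sum over communities of (internal edge share - expected share).
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Hypothetical network: two triangles joined by a single bridge edge.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
toy_labels = {"a": "blue", "b": "blue", "c": "blue",
              "d": "green", "e": "green", "f": "green"}
q_two_groups = modularity(toy_edges, toy_labels)
q_one_group = modularity(toy_edges, {n: "all" for n in toy_labels})
```

The two-group split scores higher than lumping every node together, which is the sense in which a partition such as "mass media" versus "activist" sites can be said to fit the tie structure.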

Page 18: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Discussion and conclusions

• Very unequal distribution of power among the URLs bookmarked on the topic of globalization of agriculture (O1)

– The most bookmarked URLs seem to reflect users' preferences when bookmarking websites.

– The user-URL network reflects a strong centralization of preferences and the existence of a long tail.

• Perhaps not by chance, the main nodes represent websites that are authoritative worldwide (O2)

– Well-known mass media and popular activists far surpassed the other bookmarked resources.

• The collaborative practice of bookmarking websites in Delicious can allow us to discover virtual communities (O3)

– The URL-URL unipartite network produces clusters of URLs.

– Identifying key collective actors (represented here through URLs) allows a better comprehension of leadership, influence processes, and power-related structures.

Page 19: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Further research

• Why is ‘that’ URL so important in the network of globalization of agriculture?

– Key URLs in this type of network could configure and reconfigure the evolution of the network (over time), and structure and even manipulate the type of interchange of resources in Delicious or in similar bookmarking sites.

• Go in-depth about users.

– To identify key actors that share URLs.

• Distinction between scientists and other professionals.

– To distinguish users by the way they use some tags.

• Distinction between scientists / other professionals or users?

• Identify users with the same tagging patterns, or URLs that were similarly labelled: study structural equivalences.

• Is it by chance? Do the most prominent actors correspond to a profile of very active and participative people? Do they usually work in this area (or have it as a hobby), and is this why they tag so many URLs in Delicious?

Page 20: Hyperlink Formation in Social Bookmarking Systems: Who is Who Online?

Possible Applications

• Producing and “manipulating” public opinion (when recommending and describing websites) and markets.

– If we know the interests of users belonging to a network, we could also make recommendations.

• For social practitioners, it is a good way to identify key informants in a community through whom to disseminate useful and important information.

• Important for researchers interested in formulating strategies for intervention and mobilization.

• Applications in advertising, e-commerce, social movements, security…

• …