
Reading Preference and Behavior on Wikipedia


Page 1: Reading Preference and Behavior on Wikipedia

Reading Preference and Behavior on Wikipedia

Janette Lehmann, Claudia Müller-Birn, David Laniado,

Mounia Lalmas, Andreas Kaltenbrunner

photo credit: marissa, CC BY 2.0

Page 2: Reading Preference and Behavior on Wikipedia
Page 3: Reading Preference and Behavior on Wikipedia
Page 4: Reading Preference and Behavior on Wikipedia

• Second-class members of an online community (Preece et al., 2004)
• “Lurkers” or “free-riders” (e.g., Nonnecke and Preece, 2000; Nonnecke et al., 2004)
• More resource-taking than value-adding (Kollock, 1990)
• Only valuable when they become active contributors (Preece et al., 2004)

Page 5: Reading Preference and Behavior on Wikipedia

Why is it useful to study readers?

• Improving the article quality evaluation
– Defining new metrics to measure article quality (e.g., reading time)
– Interweaving explicit feedback (Article Feedback Tool, AFT) and implicit feedback
• Improving the interface design
• Giving authors positive feedback
– Authors feel that their work is more valuable when many users read the article
• Improving the reading experience
– Users … having a good reading experience … returning more often … becoming contributors

Page 6: Reading Preference and Behavior on Wikipedia

(1) We studied users’ reading preferences

- what they read -

(2) We analyzed users’ reading behaviors

- how they read -

Page 7: Reading Preference and Behavior on Wikipedia
Page 8: Reading Preference and Behavior on Wikipedia

Preference matrix of biography articles

• Editing preference of an article: article length at the end of our data period
• Reading preference of an article: median monthly article popularity, measured by the number of page views
• 74.1% of the articles have an average article length or popularity.
• We focus on the remaining 25.9%, the extreme cases.

Data set

Page view data from Wikipedia

1M biography articles

460M page views

Sep 2011 – Sep 2012

Page 9: Reading Preference and Behavior on Wikipedia

Preference matrix of biography articles

For 9.8% (group I) and 7.9% (group III) of the articles, editing and reading activity is high.

Page 10: Reading Preference and Behavior on Wikipedia

Preference matrix of biography articles

For 4.0% (group II) of the articles, editing activity is high, but reading activity is low.

Page 11: Reading Preference and Behavior on Wikipedia

Preference matrix of biography articles

For 4.2% (group IV) of the articles, editing activity is low, but reading activity is high.
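The preference matrix on these slides can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the slides do not give the cutoffs that separate "low", "average" and "high", so the cutoff values, the function names `level` and `classify`, and the descriptive labels below are hypothetical.

```python
def level(value, low_cut, high_cut):
    """Map a value to 'low', 'average' or 'high' using two assumed cutoffs."""
    if value < low_cut:
        return "low"
    if value > high_cut:
        return "high"
    return "average"


def classify(length, popularity, length_cuts, popularity_cuts):
    """Place one article in the preference matrix.

    length          -- article length (editing preference)
    popularity      -- median monthly page views (reading preference)
    length_cuts     -- (low_cut, high_cut) for length; hypothetical values
    popularity_cuts -- (low_cut, high_cut) for popularity; hypothetical values
    """
    editing = level(length, *length_cuts)
    reading = level(popularity, *popularity_cuts)
    # An article that is average in either dimension is not an extreme case.
    if "average" in (editing, reading):
        return "average"
    return f"editing {editing}, reading {reading}"


# Hypothetical cutoffs, chosen only to make the example concrete.
LENGTH_CUTS = (2_000, 30_000)
POPULARITY_CUTS = (10, 1_000)

print(classify(38_000, 5, LENGTH_CUTS, POPULARITY_CUTS))
print(classify(500, 5_000, LENGTH_CUTS, POPULARITY_CUTS))
```

With these assumed cutoffs, the first call falls into the "editing high, reading low" cell (described above as group II) and the second into the "editing low, reading high" cell (group IV).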

Page 12: Reading Preference and Behavior on Wikipedia

Reading preferences

• Dominance of entertainment-related topics on Wikipedia
• There are articles where editing and reading preferences do not align
– Being aware of these divergences can help editors make informed decisions about which articles to focus on next.
– Temporal changes in popularity should also be taken into account.

Page 13: Reading Preference and Behavior on Wikipedia

(1) We studied users’ reading preferences

- what they read -

(2) We analyzed users’ reading behaviors

- how they read -

Page 14: Reading Preference and Behavior on Wikipedia

Reading session

Session metrics
• article views: 3
• reading time: 4.3 min
• session articles: 5

[Figure: timeline of a reading session, from session start to session end, with three dwell times of 0.5 min, 1.8 min and 2 min]
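The three session metrics can be computed from a page-view log roughly as follows. This is a minimal sketch, not the authors' pipeline: the `PageView` record, the `session_metrics` function and the example session are hypothetical, and it assumes that "session articles" counts the distinct articles in the session, including the focal one.

```python
from collections import namedtuple

# One page view inside a session: which article, and how long it was open.
PageView = namedtuple("PageView", ["article", "minutes"])


def session_metrics(session, focal):
    """Compute the three session metrics for one focal article.

    ArticleViews    -- how often the focal article was viewed in the session
    ReadingTime     -- total minutes spent on the focal article
    SessionArticles -- distinct articles in the session (assumed definition)
    """
    focal_views = [view for view in session if view.article == focal]
    return {
        "ArticleViews": len(focal_views),
        "ReadingTime": sum(view.minutes for view in focal_views),
        "SessionArticles": len({view.article for view in session}),
    }


# Hypothetical session; the dwell times on article "A" mirror the slide.
session = [
    PageView("A", 0.5), PageView("B", 1.0), PageView("C", 0.7),
    PageView("A", 1.8), PageView("D", 0.4), PageView("E", 1.2),
    PageView("A", 2.0),
]
metrics = session_metrics(session, "A")
print(metrics)
```

For this session the focal article is viewed three times for about 4.3 minutes in total, and the session covers five articles, matching the example values on the slide.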

Data set

Browsing data from the Yahoo toolbar

288K biography articles

387K users

4.5M page views

Sep 2011 – Sep 2012

Page 15: Reading Preference and Behavior on Wikipedia

[Figure: behavior vectors of an article (behavior vectors 1, 2 and 3)]

Behavior vector

• Average reading behavior on an article described by the three session metrics and the popularity metric
• 9.7K articles; 50K behavior vectors

Reading pattern

• Clustering of the behavior vectors using k-means
• 4 main reading patterns (clusters) were identified
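The clustering step can be sketched with a small pure-Python k-means. The slides only state that k-means was used; the z-score standardization, the farthest-first seeding, the function names and the four sample vectors are all assumptions made for illustration.

```python
from math import dist, sqrt


def standardize(vectors):
    """Z-score each column so that no single metric dominates the distance."""
    columns = list(zip(*vectors))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0  # avoid /0
        for col, m in zip(columns, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(v, means, stds)) for v in vectors]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Farthest-first seeding: deterministic, and an assumption of this sketch.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical behavior vectors:
# (ArticleViews, ReadingTime, SessionArticles, Popularity)
vectors = [
    (1.0, 5.0, 2.0, 100.0),
    (1.1, 4.8, 2.2, 110.0),
    (3.0, 1.0, 9.0, 300.0),
    (3.2, 1.2, 8.5, 320.0),
]
labels, centroids = kmeans(standardize(vectors), k=2)
print(labels)
```

The study reports four clusters; `k=2` is used here only because the toy data contains two obvious groups.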

Page 16: Reading Preference and Behavior on Wikipedia

Reading pattern

Focus

• Expected encyclopedic reading behavior

• Users spend a lot of time reading the article (high ReadingTime), but access very few other articles (low value of SessionArticles) within the session

- / + little below/above average

-- / ++ far below/above average

Page 17: Reading Preference and Behavior on Wikipedia

Reading pattern

Trending

• Articles related to trending topics (high Popularity)

• Users quickly look up information about something that is currently trending or has recently happened (average ReadingTime)

• Highest editing activity: Articles are long (38K), and edited frequently (20 edits)


Page 18: Reading Preference and Behavior on Wikipedia

Reading pattern

Exploration

• Users explore many articles around a topic (high value of SessionArticles)

• In doing so, they return regularly to the focal article, using it as a kind of ‘navigation page’ (high value of ArticleViews)


Page 19: Reading Preference and Behavior on Wikipedia

Reading pattern


Passing

• Users read many articles related to a topic (high value of SessionArticles)

• In doing so, users only pass through the focal article (low ReadingTime), and do not return to it (low ArticleViews)

• Lowest editing activity: Articles are short (16K), and not edited frequently (8 edits)

Page 20: Reading Preference and Behavior on Wikipedia

Reading pattern over time

Stability

• 30% of the articles are popular in a single month
• 10% are popular over the whole 13-month period
• Almost all articles have one reading pattern for half of their lifetime

Transitions

• Transitions are temporary – articles belong to one cluster, and move temporarily to another cluster
• High reciprocity – similar number of transitions in both directions
• “Focus” cluster is isolated – articles in that cluster are the most stable ones
• Strong connection between the “Passing”, “Exploration”, and “Trending” clusters – many articles adopt all three reading patterns
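The stability and transition observations above can be made concrete with a short sketch. It assumes each article has one cluster label per consecutive month; the function names, the reciprocity ratio (smaller count divided by larger count) and the sample sequences are hypothetical and not taken from the study.

```python
from collections import Counter


def transition_counts(sequences):
    """Count month-to-month moves between different reading patterns."""
    counts = Counter()
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            if current != following:
                counts[(current, following)] += 1
    return counts


def reciprocity(counts, a, b):
    """Ratio of the smaller to the larger transition count between a and b.

    1.0 means equally many moves in both directions; None means no moves.
    This particular ratio is an assumed definition.
    """
    forward, backward = counts[(a, b)], counts[(b, a)]
    if max(forward, backward) == 0:
        return None
    return min(forward, backward) / max(forward, backward)


def dominant_share(seq):
    """Most frequent pattern of an article and the share of months it covers."""
    pattern, months = Counter(seq).most_common(1)[0]
    return pattern, months / len(seq)


# Hypothetical monthly labels for two articles.
sequences = [
    ["Passing", "Trending", "Passing"],  # temporary move, then back
    ["Focus", "Focus", "Focus"],         # stable article
]
counts = transition_counts(sequences)
print(counts)
print(reciprocity(counts, "Passing", "Trending"))
print(dominant_share(sequences[0]))
```

In this toy data the first article leaves "Passing" for one month and returns, so the two directions are balanced, while the "Focus" article never changes cluster.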

Page 21: Reading Preference and Behavior on Wikipedia

Conclusions

Data on readers are available, but their potential has not been fully exploited.

They can support editors in making long-lasting decisions about their editorial work, and might engage readers more with Wikipedia.

The temporal nature of reading behavior should be taken into account.


Page 22: Reading Preference and Behavior on Wikipedia

Future work

Extension of the study about reading behavior

Development/Extension of tools that support editors (e.g., SuggestBot)


Page 23: Reading Preference and Behavior on Wikipedia

Thank you.

For more information:

http://janette-lehmann.de/docs/pub2014_ht.pdf

Check out the review by Piotr in the Wikimedia Research Newsletter (vol. 4, issue 7, July 2014)

Page 24: Reading Preference and Behavior on Wikipedia

References

• C. Okoli, M. Mehdi, M. Mesgari, F. A. Nielsen, and A. Lanamäki. The People’s Encyclopedia Under the Gaze of the Sages: A Systematic Review of Scholarly Research on Wikipedia. http://ssrn.com/abstract=2021326, 2012.
• J. Preece, B. Nonnecke, and D. Andrews. The top five reasons for lurking: improving community experiences for everyone. Computers in Human Behavior, 20(2), 2004.
• B. Nonnecke and J. Preece. Lurker demographics: counting the silent. In Proc. CHI, 2000.
• B. Nonnecke, J. Preece, and D. Andrews. What lurkers and posters think of each other. In Proc. HICSS, 2004.
• P. Kollock. The economies of online cooperation: Gifts and public goods in cyberspace. In Communities in Cyberspace, pages 220–239. Routledge, 1990.