Talk before you type: coordination in Wikipedia


DESCRIPTION

Brief internal presentation of the paper by Viégas et al. (2007) titled "Talk Before You Type: Coordination in Wikipedia"


  • 1. Talk Before You Type: Coordination in Wikipedia
Authors: Fernanda B. Viégas, Martin Wattenberg, Jesse Kriss, Frank van Ham
Visual Communication Lab, IBM Research
HICSS-40, 2007
Citations as of April 19, 2010 (Google Scholar): 116
Presenter: Marco Frassoni, SoNet group, http://sonet.fbk.eu

2. SoNet Research Meetings
These slides were used for an internal presentation of the SoNet group. Every week, one member of the SoNet group presents a research paper to the other members. The papers mentioned are therefore written by other researchers. Being internal presentations, these slides might be a bit rough and unpolished. You can find more information about the SoNet group (including this presentation) at http://sonet.fbk.eu

3. Research topic & questions
This paper describes the results of an empirical analysis of Wikipedia and discusses ways in which the Wikipedia community has evolved as it has grown. It contrasts its findings with an earlier study [1] and presents three main results:

  • the community maintains a strong resilience to malicious editing, despite tremendous growth and high traffic.

  • the fastest growing areas of Wikipedia are devoted to coordination and organization.
  • it focuses on a particular set of pages used to coordinate work, the Talk pages, finding that these pages serve many purposes, notably supporting strategic planning of editing.

6. Compare & contrast

  • Since the authors' first study [1], Wikipedia has grown enormously. This dramatic change in scale is clearly visible in history flow diagrams (see Figure 1), a valuable tool for giving a clear overview of editing activity in Wikipedia (for edit wars, see Figure 3). These diagrams were used to analyze a random sample (~5%) of article pages taken from a complete dump of the English Wikipedia (i.e. a file that included all pages, except deleted ones, along with their full revision histories).
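The slides do not say how the ~5% sample was drawn. As a minimal sketch, assuming independent per-page selection while streaming through the dump, one could keep each page with probability 0.05. The function name `sample_pages`, the fixed seed, and the placeholder titles are illustrative assumptions, not details from the paper.

```python
import random
from typing import Iterable, Iterator


def sample_pages(page_titles: Iterable[str], rate: float = 0.05,
                 seed: int = 0) -> Iterator[str]:
    """Yield each title independently with probability `rate`.

    Hypothetical helper: the paper's actual sampling procedure is not
    described in the slides. A fixed seed makes the sample reproducible.
    """
    rng = random.Random(seed)
    for title in page_titles:
        if rng.random() < rate:
            yield title


# Placeholder titles standing in for the article pages of a dump.
titles = [f"Article {i}" for i in range(100_000)]
sample = list(sample_pages(titles))
```

Because pages are selected independently, the sample size is only approximately 5% of the input; a fixed-size sample would need a different technique such as reservoir sampling.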

7. Compare & contrast
One of the main results of [1] was that Wikipedia showed remarkable resilience in the face of malicious edits. A statistical scan of the sample collected for this research (SAMPLE05) shows that the basic fast-repair characteristics of Wikipedia remain strong. Results are presented in the following tables: These findings suggest that the community maintains a strong resilience to vandalism and malicious editing, despite its tremendous growth, high traffic and having become a high-value target.

8. How has Wikipedia grown?

  • Investigating growth patterns helps in understanding how the Wikipedia community is evolving. Wikipedia is divided into several sections (namespaces), each serving a special purpose. To investigate growth patterns, the researchers focused on eight namespaces and their evolution over time (see Figure 5).

9. How has Wikipedia grown?
User Talk pages registered the fastest growth in the time span considered (see Table 4). Moreover, even when focusing only on the two namespaces directly related to the encyclopedic content of Wikipedia (i.e. the Main and Talk namespaces), the number of article discussion pages has grown faster than the number of article pages.

  • This pattern echoes the tendency of active Wikipedians to move from having a local focus (editing individual articles) to a more high-level concern for the quality of content and the health of the community, as described by Bryant et al. [3].
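The comparison between namespaces comes down to a growth factor, i.e. the page count at the end of the period divided by the count at the start. The sketch below ranks namespaces by that ratio. The counts are made-up placeholders, NOT figures from the paper or from Table 4; they only illustrate the calculation.

```python
from typing import Dict, List, Tuple


def growth_factor(start_count: int, end_count: int) -> float:
    """Return how many times larger a namespace became over the period."""
    if start_count <= 0:
        raise ValueError("start_count must be positive")
    return end_count / start_count


def rank_by_growth(counts: Dict[str, Tuple[int, int]]) -> List[str]:
    """Sort namespace names from fastest to slowest growing."""
    return sorted(counts, key=lambda ns: growth_factor(*counts[ns]),
                  reverse=True)


# Hypothetical (start, end) page counts, for illustration only.
placeholder_counts = {
    "Main": (1_000, 4_000),
    "Talk": (100, 900),
    "User Talk": (50, 1_500),
}
ranking = rank_by_growth(placeholder_counts)
```

A ratio is used rather than an absolute difference because the namespaces start from very different sizes.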

10. Talk pages
To better understand the increase in coordination-related pages, the researchers decided to examine one of these categories, the articles' Talk pages. In Viégas et al., Talk pages were characterized as places where conflict was resolved [1]. While it is true that they serve this function, a closer reading of the Talk pages indicates that they also play an important role in planning and other types of coordination.
Some basic statistics behind Talk pages: Non-empty Talk pages exist for 14.5% of the article pages in the dump. Heavily edited articles and Talk pages go hand in hand. While the average number of edits per page in Wikipedia is roughly 15 (median = 2), around 94% of the pages with more than 100 edits have related Talk pages. Conversely, articles with associated Talk pages have, on average, 5.8 times more edits and 4.8 times more users than articles without.

11. Talk pages analysis
Methodology: manual coding of 25 Talk pages belonging to Wikipedia articles, chosen by the researchers from among all the others. Each Talk page was coded by two researchers, and their results were compared to ensure reliability. A Talk page is divided into posts, marked by signatures and indentation and grouped under discussion topic titles. An algorithm was developed to detect posts in a Talk page automatically and count them: for the algorithm, a post is everything on the Talk page that is followed by a signed user name, a horizontal rule or a new indentation level. Wikipedia allows old posts on a Talk page to be moved to talk archives. For the purpose of this research, only non-archived Talk pages were analyzed.

12. Posting dimensions
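The posts being coded are delimited by the rule given in the methodology: a post ends at a signed user name, a horizontal rule or a new indentation level. The authors' actual algorithm is not shown in the slides, so the following is only a minimal sketch of that rule. It assumes that a signature appears as a `[[User:...` link, that a horizontal rule is a line of four or more hyphens, and that indentation is the number of leading `:` characters. Topic headings are simply kept with the post that follows them.

```python
import re
from typing import List

# Assumed markers; real Talk pages have more signature variants.
SIGNATURE = re.compile(r"\[\[User:[^\]|]+")
HORIZONTAL_RULE = re.compile(r"^-{4,}\s*$")


def indent_level(line: str) -> int:
    """Count leading ':' characters, the wikitext indentation marker."""
    return len(line) - len(line.lstrip(":"))


def split_posts(wikitext: str) -> List[str]:
    """Split Talk page text into posts using the three boundary cues."""
    posts: List[str] = []
    current: List[str] = []
    prev_level = None

    def flush() -> None:
        text = "\n".join(current).strip()
        if text:
            posts.append(text)
        current.clear()

    for line in wikitext.splitlines():
        if HORIZONTAL_RULE.match(line):      # cue: horizontal rule
            flush()
            prev_level = None
            continue
        if not line.strip():                 # blank lines carry no cue
            continue
        level = indent_level(line)
        if current and prev_level is not None and level != prev_level:
            flush()                          # cue: new indentation level
        current.append(line)
        prev_level = level
        if SIGNATURE.search(line):           # cue: signed user name
            flush()
            prev_level = None
    flush()
    return posts
```

Calling `len(split_posts(text))` then gives the number of posts on a page. The sketch treats every change in indentation as a boundary, which may over-split multi-paragraph posts; that is a simplification, not a claim about the original algorithm.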

The researchers identified 11 coding dimensions. They are not mutually exclusive, so a single post could be coded under more than one dimension. The posting dimensions are:

  • Requests/suggestions for editing coordination
  • Requests for information
  • References to vandalism
  • References to Wikipedia guidelines and policies
  • References to internal Wikipedia resources
  • Off-topic remarks
  • Polls
  • Requests for peer review
  • Information boxes
  • Images
  • Other

23. Coding findings
Requests for coordination were, by far, the most common kind of posting, accounting for over half of the contributions on Talk pages (see Table 7). This establishes the crucial strategic role that Talk pages serve in Wikipedia. Contributors use Talk pages to discuss their editing activities in advance, to ask for help, and to explain why they think specific changes should be made. Next in frequency were requests for information, which occurred once in every ten posts. Such requests are usually made by visitors who have no intention of editing the associated article, and they suggest that Talk pages are used as a place to tap into expert knowledge of specific topics.

24. Bibliography

  • Viégas, F., Wattenberg, M., & Dave, K. Studying Cooperation and Conflict between Authors with history

  • Benkler, Y. Coase's Penguin, or, Linux and The Nature of the Firm. The Yale Law Journal, Vol. 112, No. 3, December 2002.
  • Bryant, S., Forte, A., Bruckman, A. Becoming Wikipedian: Transformation of Participation in a Collaborative Online Encyclopedia. In Proceedings of GROUP 2005.
  • Emigh, W., and Herring, S. Collaborative authoring on the Web: A genre analysis of online encyclopedias. In Proceedings of HICSS-38, 2005.
  • Forte, A., Bruckman, A. Why do People Write for Wikipedia? Incentives to Contribute to Open-Content Publishing. GROUP 05 workshop position paper.
  • Lih, A. Wikipedia as Participatory Journalism: Reliable Sources? Metrics for Evaluating Collaborative Media as a News Source. Proceedings of the Fifth International Symposium on Online Journalism, 2004.

30. These slides are released under Creative Commons Attribution-ShareAlike 2.5

You are free:
  * to copy, distribute, display, and perform the work
  * to make derivative works
  * to make commercial use of the work

Under the following conditions:
  Attribution. You must attribute the work in the manner specified by the author or licensor.
  Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
  * For any reuse or distribution, you must make clear to others the license terms of this work.
  * Any of these conditions can be waived if you get permission from the copyright holder.

Your fair use and other rights are in no way affected by the above.
More info at http://creativecommons.org/licenses/by-sa/2.5/