Wikipedia for Researchers Andrew Gray – Wikipedian in Residence [email protected] / @generalising

Wikipedia for Researchers talk, as given at the British Library. The first part covers Wikipedia as a resource for researchers, looking at how it works, how to judge the reliability of content, and how to use Wikipedia as a starting point to access other resources. The second part looks at how Wikipedia is used by researchers as a subject or a corpus, and gives an overview of the kinds of research being done on Wikipedia.

Page 1: Wikipedia for Researchers

Wikipedia for Researchers

Andrew Gray – Wikipedian in Residence

[email protected] / @generalising

Page 2: Wikipedia for Researchers

About Wikipedia & Wikimedia

- Wikimedia
  - Movement and charitable body
  - 80,000 contributors in 280 languages and eleven core projects
  - Image repository, dictionary, news site…
  - …read by 7% of the world!
- Wikipedia
  - 19,000,000 articles, 4,000,000 in English
  - 6,500 articles and 235,000 edits per day

(…and ten years ago, this was all fields…)

Page 3: Wikipedia for Researchers

…so what is Wikipedia?

…an encyclopedia

…written neutrally and verifiably

…using previously published information

…free to use, distribute, or reuse

…a collaborative community

…with no firm rules

Page 4: Wikipedia for Researchers

Internal processes

- All edits are visible through watchlists and page histories
  - About 7% are vandalism or malicious; processes to detect these
  - Median time to correction < 2 minutes… but some stay much longer
- Individual discussion pages for all articles – “talk”
- Quality review and assessment process
- Specialised “wikiproject” working groups and central noticeboards
  - e.g. content topics; style; dispute resolution; copyright; etc.
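Page histories can also be read programmatically. The sketch below builds a request URL for the MediaWiki Action API that lists the most recent revisions of an article, and derives the matching talk page title. It only constructs the URL and makes no network call; the English Wikipedia endpoint, the helper names and the example title are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint: English Wikipedia. Other language editions use their own subdomain.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def history_url(title, limit=10):
    """Build an Action API URL listing the latest revisions of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment",  # who edited, when, and the edit summary
        "rvlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def talk_title(title):
    """Return the discussion ("talk") page title for a main-namespace article."""
    return "Talk:" + title


print(history_url("British Library", limit=5))
print(talk_title("British Library"))
```

Opening the printed URL in a browser returns the revision list as JSON; the same parameters can be passed to any HTTP client.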

Page 5: Wikipedia for Researchers

Quality of Wikipedia

- On average… it’s not bad
  - In 2005, four errors per article, versus three in Britannica
  - In 2011, in English, Spanish & Arabic: “…the Wikipedia articles in this sample scored higher overall than the comparison articles with respect to accuracy, references, style/readability and overall judgment…”
- Millions of articles – so many are, individually, problematic
  - Various ways of identifying “signs” of quality
  - Markers for quality are both obvious and subtle
- Very effective “springboard” tool

Page 6: Wikipedia for Researchers

Looking for quality

- Corner icons
  - article locked down in some way
  - featured or “good” quality
- Problem tags
- Article talk pages and histories
- Style
  - Badly written or formatted articles = often neglected

Page 7: Wikipedia for Researchers

Accessing other content

Structured categories and navigational templates

“What links here”
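The “What links here” list is also exposed through the MediaWiki Action API as the `backlinks` list. The sketch below builds such a query and pulls the page titles out of a response. The sample response is made-up placeholder data in the shape the API returns, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def backlinks_url(title, limit=50):
    """Build an Action API URL listing articles that link to `title`."""
    params = {
        "action": "query",
        "list": "backlinks",
        "bltitle": title,
        "blnamespace": 0,  # restrict results to articles (main namespace)
        "bllimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def backlink_titles(response):
    """Extract page titles from a decoded `list=backlinks` JSON response."""
    return [entry["title"] for entry in response.get("query", {}).get("backlinks", [])]


# Placeholder data for illustration only, not real API output.
sample_response = {
    "query": {
        "backlinks": [
            {"pageid": 1, "ns": 0, "title": "Example article A"},
            {"pageid": 2, "ns": 0, "title": "Example article B"},
        ]
    }
}

print(backlinks_url("British Library"))
print(backlink_titles(sample_response))
```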

Page 8: Wikipedia for Researchers

Moving on to other content

- Other languages – not translations, and may have more content
- Mousing over footnote markers
- Within the references:
  - Links through DOIs and other identifiers
  - ISBNs go to a special landing page… …and then out to libraries, booksellers, etc.
  - ISSNs go to WorldCat
  - If an author, look for authority control links
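The identifier links above follow predictable URL patterns, so they can be generated from a reference list. This is a minimal sketch: the `doi.org` resolver, the `Special:BookSources` landing page on English Wikipedia and the WorldCat ISSN path are assumed patterns that may differ from what the site uses at any given time, and the identifiers shown are illustrative only.

```python
from urllib.parse import quote


def doi_url(doi):
    """Link a DOI through the public resolver (assumed: https://doi.org/)."""
    return "https://doi.org/" + quote(doi, safe="/")


def isbn_url(isbn):
    """Link an ISBN to Wikipedia's book sources landing page (assumed pattern)."""
    # Keep only digits and a possible trailing X check character.
    digits = "".join(ch for ch in isbn if ch.isdigit() or ch in "Xx")
    return "https://en.wikipedia.org/wiki/Special:BookSources/" + digits


def issn_url(issn):
    """Link an ISSN to WorldCat (assumed pattern)."""
    return "https://www.worldcat.org/issn/" + issn


print(doi_url("10.1000/182"))
print(isbn_url("978-0-306-40615-7"))
print(issn_url("0028-0836"))
```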

Page 9: Wikipedia for Researchers

Preferences

- Available to logged-in users
- Two particularly useful options:
  - New window for external links (Gadgets > Browsing)
  - Quality assessment in headers (Gadgets > Appearance)
- Many others - mostly editor-oriented tools

Page 10: Wikipedia for Researchers

Looking for sets of material

- Some tools available – http://www.toolserver.org
  - Complex to use, but rewarding
- CatScan: look for intersection of categories
  - “all physicists born in 1912” – 51 in English, 34 in German
- Full dumps of all data available – http://dumps.wikipedia.org
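Both approaches on this slide can be sketched in a few lines. The first helper intersects two category listings, in the spirit of CatScan (this is not CatScan's own implementation, and unlike CatScan it does not descend into subcategories). The second streams page titles out of an XML dump without loading the whole file. All sample data is placeholder content, and the function names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def member_titles(response):
    """Titles from a decoded Action API `list=categorymembers` response."""
    members = response.get("query", {}).get("categorymembers", [])
    return {m["title"] for m in members}


def pages_in_both(response_a, response_b):
    """Pages listed directly in both categories, sorted by title."""
    return sorted(member_titles(response_a) & member_titles(response_b))


def iter_page_titles(stream):
    """Yield page titles from a MediaWiki XML export, one page at a time."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix
        if tag == "title":
            yield elem.text
        elif tag == "page":
            elem.clear()  # free the finished page so memory use stays flat


# Placeholder category listings, not real API output.
resp_a = {"query": {"categorymembers": [{"title": "Example person A"},
                                        {"title": "Example person B"}]}}
resp_b = {"query": {"categorymembers": [{"title": "Example person B"},
                                        {"title": "Example person C"}]}}

# Tiny stand-in for a dump file; real dumps are far larger and compressed.
SAMPLE_DUMP = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Example A</title><revision><text>...</text></revision></page>
  <page><title>Example B</title><revision><text>...</text></revision></page>
</mediawiki>"""

print(pages_in_both(resp_a, resp_b))
print(list(iter_page_titles(io.BytesIO(SAMPLE_DUMP))))
```

For a real dump, the same generator can be fed a file opened with `bz2.open(path, "rb")`, since the published files are bzip2-compressed.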

Page 11: Wikipedia for Researchers

Research about Wikipedia

- Thriving research around Wikipedia community & content
  - by mid-2011, 2100 peer-reviewed articles and 38 PhD theses
  - Active research committee and WMF support
- Regular report - http://meta.wikimedia.org/wiki/Research:Newsletter
  - also @wikiresearch
- Major themes include:
  - Community and content creation
  - Reading and researching by users
  - Quality of content
  - Technical research

Page 12: Wikipedia for Researchers

Research on communities

Research on the Wikipedia communities:

- Dynamics of community conflict, discussions, collaboration, voting, contribution, mentoring…
- Demographics, motivation and specialisms of contributors
- Patterns of growth and content creation/deletion
- Effect of central programs on volunteer activity
- Cross-cultural interaction

Page 13: Wikipedia for Researchers

Research on users

Research on usage of Wikipedia:

- Specific searching behaviour
- Patterns of usage (yearly, daily)
- Tracking external events (e.g. swine flu) through Wikipedia
- Search engine rankings
- Change in usage by students
- Effect of Wikipedia publication on wider literature

Page 14: Wikipedia for Researchers

Research on content

Research on the content of Wikipedia:

- Evolution of content
- Accuracy, coverage and quality
- Biases – geographic, cultural, gender
- Linguistic analysis
- Visualisations of content
- Effect of external publications on Wikipedia

Page 15: Wikipedia for Researchers

Research on technical aspects

Research on the technical side of Wikipedia:

- Extensive work on scaling open-content services
- Tools for detecting and handling vandalism
- Algorithmic detection and identification of bias, spam
- Practical research on uses of wikis

Page 16: Wikipedia for Researchers

Research example – visualising art history

http://commons.wikimedia.org/wiki/File:Wikiarthistory.png

Page 17: Wikipedia for Researchers

Research example – visualising editing patterns

http://commons.wikimedia.org/wiki/File:WikiTrip_egyptian_revolution_screenshot.png

Page 18: Wikipedia for Researchers

Research example – editor activity

http://commons.wikimedia.org/wiki/File:Effect_of_barnstars_on_productivity.png