The Gene Wiki: Community Intelligence Applied to Gene Annotation
FaceBase Kick-off Meeting
November 16, 2009
Andrew Su, Ph.D.
The biomedical literature is massive
Centralized curation efforts do not scale with the rapid growth of the biomedical literature
811,214 articles in PubMed in 2008
Sooner or later, the research community will need to be involved in the annotation effort to scale up to the rate of data generation.
Wikipedia as a model
• Wikipedia: “the free encyclopedia that anyone can edit.”
• Contains a huge breadth of topics and volume of information
  • > 2 million articles, > 1 billion words
• More accurate than one might think
  • comparable to Britannica Online
• Epitomizes collaborative editing
  • 300K+ active editors
• Displays both structured and unstructured data
  • figures, images, photos
http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons, July 2008
                    Articles     Words (millions)   Average words / article
Wikipedia           >2,000,000   >1,000             435
Britannica Online   120,000      55                 370
Gene “stubs”
• Active MCB community at WP had already developed ~650 gene articles
• Can we accelerate this process through stub creation?
• In total, created 8000 new articles and edited 650 previously existing articles.
Positive feedback loops
[Figure: loop linking Gene Wiki page utility, number of users, and number of contributors]
25k gene-specific review articles?
Goal: Create a continually updated, collaboratively written, and community-reviewed review article for every gene in the human genome.
Figures and diagrams
Inline PubMed citations
Hyperlinks to related concepts
Table of contents
Gene Wiki usage
[Feedback-loop diagram: Utility, Users, Contributors]
85% of Gene Wiki pages are found on the first page of Google search results…
Median: 303 views / page / month
Total: 2.9 million views / month
Gene Wiki editing activity
During Jan – Jun 2009…
… 6848 edits were made by 1923 unique users or IP addresses
… average of 1100 edits per month (SD=171)
… additional 11,912 edits made by automated “bots”
… total increase in text content by 2.28 megabytes, approximately equal to 19 research articles in PLoS Biology
Positive feedback loop initiated?
There is substantial evidence that Wikipedia and the Gene Wiki are used by both scientists and the general public, so we as a community had better make it good.
Monthly statistics
Dual FaceBase Wiki efforts
• Direct participation in the Gene Wiki
  – Goal: Scientific outreach and education
  – Pro:
    • Existing critical mass of editors and content
  – Con:
    • Bureaucracy
    • Limited scope as a general encyclopedia
• Standalone FaceBase Wiki
  – Goal: Creation of a useful research tool
  – Pro:
    • Greater editorial control
    • Inclusion of unpublished data and findings
    • Emphasis on content for craniofacial community
  – Con:
    • Difficult to create and maintain critical mass
  – Q: How should we seed content?
Acknowledgements
Funding and Support
NIGMS, NIH
Novartis Research Foundation

Collaborators, current group members, and past members

John Hogenesch, UPenn
Angel Pizzaro, UPenn
Faramarz Valafar, SDSU
Donabel Roberts, SDSU
Pierre Lindenbaum, Fondation Jean Dausset
Michael Martone, Rush
Konrad Koehler, Karo Bio

David Delano
Jennifer Floyd
James Goodale
Phil McClurg
Steve Su
Richard Trager
Julia Turner

Serge Batalov
Ghislain Bonamy
Jason Boyer
Jon Huss
Yue Hu
Jeff Janes
Marc Leglise
Camilo Orozco
Chunlei Wu

To access the Gene Wiki, Google “gene wiki portal” (or your favorite gene’s symbol) for more info…