The Gene Wiki: Community Intelligence Applied to Gene Annotation FaceBase Kick-off Meeting November 16, 2009 Andrew Su, Ph.D.


The Gene Wiki: Community Intelligence Applied to Gene Annotation

FaceBase Kick-off Meeting

November 16, 2009

Andrew Su, Ph.D.


The biomedical literature is massive

Centralized curation efforts do not scale with the rapid growth of the biomedical literature

811,214 articles in PubMed in 2008

Sooner or later, the research community will need to be involved in the annotation effort to scale up to the rate of data generation.


Wikipedia as a model

• Wikipedia: “the free encyclopedia that anyone can edit.”
• Contains a huge breadth of topics and volume of information
    • > 2 million articles, > 1 billion words
• More accurate than one might think
    • comparable to Britannica Online
• Epitomizes collaborative editing
    • 300K+ active editors
• Displays both structured and unstructured data
    • figures, images, photos

http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons, July 2008

                     Articles      Words (millions)   Average words / article
Wikipedia            >2,000,000    >1,000             435
Britannica Online    120,000       55                 370


Gene “stubs”

• Active MCB community at Wikipedia had already developed ~650 gene articles

• Can we accelerate this process through stub creation?

• In total, created 8000 new articles and edited 650 previously existing articles.


Positive feedback loops

Gene wiki page utility

Number of users

Number of contributors

1001

2002


25k gene-specific review articles?

Goal: Create a continually updated, collaboratively written, and community-reviewed review article for every gene in the human genome.

Figures and diagrams

Inline PubMed citations

Hyperlinks to related concepts

Table of contents

Gene Wiki usage

Utility

Users

Contributors

85% of Gene Wiki pages are found on the first page of Google search results…

Median: 303 views / page / month
Total: 2.9 million views / month


Gene Wiki editing activity

During Jan – Jun 2009…

… 6848 edits were made by 1923 unique users or IP addresses

… average of 1100 edits per month (SD=171)

… additional 11,912 edits made by automated “bots”

… total increase in text content by 2.28 megabytes, approximately equal to 19 research articles in PLoS Biology

Utility

Users

Contributors

Positive feedback loop initiated?

Utility

Users

Contributors

There is substantial evidence that Wikipedia and the Gene Wiki are used by both scientists and the general public, so we as a community better make it good.

Monthly statistics

Dual FaceBase Wiki efforts

• Direct participation in the Gene Wiki
    – Goal: Scientific outreach and education
    – Pro:
        • Existing critical mass of editors and content
    – Con:
        • Bureaucracy
        • Limited scope as a general encyclopedia
• Standalone FaceBase Wiki
    – Goal: Creation of a useful research tool
    – Pro:
        • Greater editorial control
        • Inclusion of unpublished data and findings
        • Emphasis on content for craniofacial community
    – Con:
        • Difficult to create and maintain critical mass
    – Q: How should we seed content?


Acknowledgements

Funding and Support

NIGMS, NIH
Novartis Research Foundation

John Hogenesch, UPenn
Angel Pizzaro, UPenn
Faramarz Valafar, SDSU
Donabel Roberts, SDSU
Pierre Lindenbaum, Fondation Jean Dausset
Michael Martone, Rush
Konrad Koehler, Karo Bio

David Delano
Jennifer Floyd
James Goodale
Phil McClurg
Steve Su
Richard Trager
Julia Turner

Serge Batalov
Ghislain Bonamy
Jason Boyer
Jon Huss
Yue Hu
Jeff Janes
Marc Leglise
Camilo Orozco
Chunlei Wu

To access the Gene Wiki, Google “gene wiki portal” (or your favorite gene’s symbol) for more info…

Collaborators Current group members Past members