DESCRIPTION
Structured gene annotations are a foundation on which many bioinformatics and statistical analyses are built, but their coverage is sparse compared with the total knowledge that could be captured. As centralized biocuration efforts struggle to keep up with the rate of biomedical data generation, new models for gene annotation need to be explored. Recently, online games have emerged as an effective way to recruit, engage and organize contributors to help address difficult challenges such as online image tagging (ESP Game), protein folding (Foldit), and multiple sequence alignment (Phylo). We present two online games, Dizeez and GenESP, aimed at identifying novel gene-disease annotations, i.e. gene-disease links that are well established in the literature but not yet reflected as structured annotations. Preliminary results are provided from game play online and at scientific conferences. These data suggest that, even after limited game play, novel gene-disease annotations can be mined from game-playing logs. Both games are available at http://genegames.org.
Salvatore Loguercio*, Benjamin Good, Andrew Su
Department of Molecular and Experimental Medicine, The Scripps Research Institute
ISMB Bio-Ontologies SIG
July 13, 2012
Games for Human Gene Annotation
Growth of potential annotations
[Figure: number of articles added to PubMed per year; y-axis from 500,000 to 1,000,000]
PubMed in 2012: > 21 million articles.
Approaching 1 million new articles per year (>1/minute)
[Figure: average capacity of a human scientist (number of articles read by a typical scientist), 1979 to 2009]
1.5% of PubMed* cited by GO annotations
*311,696 articles (2011)
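As a rough consistency check on the figures quoted above, the sketch below recomputes the two rates. It assumes round values of 21 million total articles and 1 million new articles per year (the slides say "> 21 million" and "approaching 1 million"), so the results are approximations only.

```python
# Rough sanity check of the PubMed figures quoted in the slides.
# TOTAL_ARTICLES and NEW_PER_YEAR are rounded assumptions, not exact counts.
TOTAL_ARTICLES = 21_000_000   # "PubMed in 2012: > 21 million articles"
NEW_PER_YEAR = 1_000_000      # "approaching 1 million new articles per year"
GO_CITED = 311_696            # articles cited by GO annotations (2011)

MINUTES_PER_YEAR = 365 * 24 * 60

articles_per_minute = NEW_PER_YEAR / MINUTES_PER_YEAR
go_fraction = GO_CITED / TOTAL_ARTICLES

print(f"{articles_per_minute:.2f} new articles per minute")    # 1.90
print(f"{go_fraction:.1%} of PubMed cited by GO annotations")  # 1.5%
```

Under these assumptions the rate is a little under two new articles per minute, consistent with the ">1/minute" claim, and the GO-cited share rounds to the stated 1.5%.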
Sooner or later, the research community will
need to be involved in the annotation effort to scale
up to the rate of data generation.
How to involve the community in gene annotation?
Crowdsourcing Biology
Gene Wiki: Comprehensively organize knowledge of all human genes
BioGPS: Gene annotation portal for aggregating gene-centric online content
http://biogps.org
Biological games: Build scientific knowledge through game play
Why games?
It is estimated that 9 billion hours are spent playing Solitaire every year
http://www.flickr.com/photos/archana3k1/4124330493/
Seven million human hours
Twenty million human hours
http://www.flickr.com/photos/ableman/2171326385/
150 billion human hours per year
http://www.flickr.com/photos/rvpEcw/6243289302/
Can we harness some of this time and energy?
Games with a purpose
Devise protein folding algorithms
Fix multiple sequence alignments
Design RNA molecules
Label all images on the Web
Annotate all human genes
Record the relevant properties of each gene in a manner that facilitates computation
• biological process
• molecular function
• cellular localization
• interaction partners
• disease relevance
• genomic location
• genetic variations
• post-translational modifications
• related drugs
• related publications
• ...
Dizeez: gene-disease association quiz
If it’s ‘right’, you get points
then on to the next question
Click the related disease
hurry!
http://genegames.org
• Advertised with a blog post, a few tweets and a conference poster
• Results since Dec. 2011:
  – 180 people have played it
  – 713 one-minute game rounds have been completed
  – 5,282 distinct gene-disease associations collected
Gameplay
Quality through replication
• 5,282 distinct gene-disease pairs collected (example: ABCB5, acute myeloid leukemia)
• 482 collected more than once
• 223 potential new annotations (do not appear in OMIM, PharmGKB)
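The replication filter above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the log format (a list of gene/disease tuples), the case normalization, and the toy data (including the placeholder `GENE_A`) are all assumptions made for the example.

```python
from collections import Counter

def replicated_novel_pairs(game_log, known_pairs, min_count=2):
    """Return {(gene, disease): count} for pairs that were collected at
    least `min_count` times and are absent from `known_pairs`.

    Hypothetical helper: the log format and normalization are assumptions.
    """
    # Normalize case so "Acute myeloid leukemia" and "acute myeloid leukemia" match.
    counts = Counter((gene.upper(), disease.lower()) for gene, disease in game_log)
    known = {(gene.upper(), disease.lower()) for gene, disease in known_pairs}
    return {
        pair: n
        for pair, n in counts.items()
        if n >= min_count and pair not in known
    }

# Toy data for illustration only; GENE_A / "disease x" are placeholders.
game_log = [
    ("ABCB5", "acute myeloid leukemia"),
    ("ABCB5", "Acute myeloid leukemia"),
    ("WRN", "Werner syndrome"),   # seen once: dropped by the replication filter
    ("GENE_A", "disease x"),
    ("GENE_A", "disease x"),      # replicated, but already known: dropped
]
known_pairs = [("GENE_A", "disease x")]  # stand-in for an OMIM / PharmGKB lookup

candidates = replicated_novel_pairs(game_log, known_pairs)
print(candidates)
```

Raising `min_count` trades recall for confidence: fewer candidate pairs survive, but each has been asserted independently more often.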
Novel annotations - I
# Occurrences   Gene     Disease
7               GAST     gastrinoma
7               RBP3     retinoblastoma
7               SSX1     synovial sarcoma
6               TG       Graves' disease
6               CRYGC    Cataract
6               SOX8     mental retardation
6               WRN      Werner syndrome
6               ABL1     leukemia
6               MLL3     leukemia
6               SNAI2    breast carcinoma
PubMed, OMIM, PharmGKB, Gene Wiki
2010 or later
Novel annotations - II
# Occurrences   Gene     Disease
2               ABCB5    acute myeloid leukemia
2               HOXB7    leukemia
2               SULF1    carcinoma
2               ALPP     retinoblastoma
2               FOXM1    Melanoma
PubMed, OMIM, PharmGKB, Gene Wiki
2009 or later
Current limitations
• Dizeez actually punishes desired behavior (adding new, unknown associations) by not awarding points
• Does not allow player to enter associations other than those in the provided list
• GenESP fixes both problems
GenESP: gene-concept association with a partner
(modeled after the ESP Game). See: von Ahn and Dabbish (2004) Labeling images with a computer game, SIGCHI
http://genegames.org
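The slides do not spell out how GenESP decides that two partners agree, so the following is only a sketch of the general output-agreement idea from the ESP Game: both players type free-text terms for the same gene, and the round ends on the first term both have entered. The function names, the normalization step, and the turn-by-turn interleaving of guesses are assumptions made for illustration.

```python
def normalize(term):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(term.lower().split())

def first_agreement(guesses_a, guesses_b):
    """Return the first term that both partners have entered, or None.

    Assumes guesses are listed in the order typed and that the partners
    alternate (A, B, A, B, ...); a real game would use timestamps instead.
    """
    seen_a, seen_b = set(), set()
    for i in range(max(len(guesses_a), len(guesses_b))):
        if i < len(guesses_a):
            term = normalize(guesses_a[i])
            seen_a.add(term)
            if term in seen_b:
                return term
        if i < len(guesses_b):
            term = normalize(guesses_b[i])
            seen_b.add(term)
            if term in seen_a:
                return term
    return None

# Hypothetical round for one gene: the partners converge on "leukemia".
match = first_agreement(["melanoma", "Leukemia"], ["carcinoma", "leukemia "])
print(match)  # leukemia
```

Because agreement between partners, rather than a lookup in an existing list, is what earns the match, a scheme like this can accept free-text associations that are not yet in any database, which is the behavior the Dizeez limitations call for.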
Gene – concept association with a partner
A re-usable pattern
Gene – Disease
Gene – Function
Gene – Gene
Gene – Gene relationship
The Gene Wiki Hairball!
Getting players
Social gaming
Educating players
Building a community
Arena mode: labs vs. labs
Multiplayer Online - “Farmville for gene annotation”
SOX2! TP53!
Dizeez
Epilogue: Crowdsourcing for knowledge acquisition
[Diagram: traditional model vs. crowdsourced model. Labels: Data and contributors, Small expert group, Data, Computation, Knowledge]
Annotate all human genes
Erik Clarke, Max Nanis
Ian Macleod, Chunlei Wu
Su Lab @ TSRI
Funding and Support
(BioGPS: GM83924, Gene Wiki: GM089820)
Interwebs
http://sulab.org
[email protected]
@sal999
+Salvatore Loguercio
Crowdsourcing Biology @ GSoC 2012!
Special thanks to:
Ben Good, Andrew Su
Students: Clarence Leung, Carolina Lidstrom