2013 Summit on Translational Bioinformatics
NATIONAL HUMAN GENOME RESEARCH INSTITUTE
Division of Intramural Research
Phenotype-Genotype Integrator (PheGenI) updates: synthesizing genome-wide association study (GWAS) data with existing genomic resources
Lucia A. Hindorff, PhD, MPH1, Douglas J. Hoffman, MS2, Heather A. Junkins, MS1, Masato Kimura, PhD2, Donna Maglott, PhD2, Lon Phan, PhD2, Stephen Sherry, PhD2, Michael Feolo, MS2, Erin M. Ramos, PhD, MPH1
1 NHGRI, NIH, Bethesda, MD; 2 NCBI, NIH, Bethesda, MD
Abstract
Rapidly accumulating data from genome-wide association studies (GWAS) are most useful when synthesized with existing databases. We developed and updated the Phenotype-Genotype Integrator (PheGenI), integrating NCBI genomic databases with association data from the NHGRI GWAS Catalog. Integrating over 66,000* association records with extensive SNP, gene, and eQTL data, PheGenI enables deeper investigation of SNPs associated with a wide range of traits, facilitating the examination of the relationships between SNPs and human disease.
* Updated as of 3/7/2013
URL: http://www.ncbi.nlm.nih.gov/gap/PheGenI
Background / Rationale
• GWAS have identified >8,700 genetic variants associated with a range of human traits and diseases at p < 10⁻⁵ (3,716 at p < 5 × 10⁻⁸)
• Replication, fine mapping, and follow-up studies are crucial next steps to understanding the functional consequences of these variants
• Integration of GWAS data with existing complementary databases can inform prioritization of variants to follow up, study design considerations, and generation of biological hypotheses
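The two p-value thresholds quoted in this section can be read as tiers for prioritizing variants. The sketch below is illustrative only: the thresholds come from the poster, while the tier labels, the function name, and the example p-values are assumptions made for this example.

```python
# Thresholds quoted in the poster text.
CATALOG_THRESHOLD = 1e-5       # p < 10^-5
GENOME_WIDE_THRESHOLD = 5e-8   # p < 5 x 10^-8


def significance_tier(p_value: float) -> str:
    """Assign a p-value to an illustrative tier (labels are assumptions)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError(f"p-value must lie in [0, 1], got {p_value}")
    if p_value < GENOME_WIDE_THRESHOLD:
        return "genome-wide significant"
    if p_value < CATALOG_THRESHOLD:
        return "catalog-level"
    return "below catalog threshold"


# Hypothetical p-values, not taken from the GWAS Catalog.
example_p_values = [3e-12, 4e-8, 2e-6, 1e-3]
tiers = [significance_tier(p) for p in example_p_values]
```

A variant in the stricter tier also satisfies the looser threshold, which is why the 3,716 variants are reported as a subset of the >8,700.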
Data Sources
• PheGenI merges data from the NHGRI GWAS catalog with several resources housed at NCBI (Figure 1).
http://www.genome.gov/gwastudies/
http://www.ncbi.nlm.nih.gov/projects/gapplusprev/sgap_plus.htm
http://www.ncbi.nlm.nih.gov/gene/
http://www.ncbi.nlm.nih.gov/projects/SNP/
http://www.ncbi.nlm.nih.gov/gap
http://www.ncbi.nlm.nih.gov/gtex/GTEX2/gtex.cgi
• The association records span a broad range of human phenotypes and diseases (Figure 2).
• Counts are current as of 3/7/2013.
Conclusions and Future Improvements
• By providing a user-friendly web interface that integrates various NCBI genomic databases with association data from the NHGRI GWAS Catalog, PheGenI enables deeper investigation and interrogation of SNPs associated with a wide range of traits, facilitating the examination of the relationships between SNPs and human disease.
• Future improvements to PheGenI include:
  • Annotation of supporting results
  • Enhanced mapping of phenotype terms
  • Incorporation of additional data, such as functional and regulatory elements
  • Ability to programmatically retrieve data using the NCBI Entrez Programming Utilities
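For orientation, a request to the NCBI Entrez Programming Utilities is an HTTP GET against an endpoint such as `esearch.fcgi`. Programmatic access to PheGenI is listed here only as a future improvement, so the sketch below does not assume any PheGenI-specific database name; it builds a search URL for the existing `gene` database, and the helper name and default `retmax` are assumptions for illustration. No network call is made.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL (hypothetical helper; does not send a request)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example: search the Gene database for the poster's example trait.
url = build_esearch_url("gene", "celiac disease")
```

The resulting URL could then be fetched with any HTTP client, subject to NCBI's usage guidelines.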
Figure 6. Gene Table
Figure 7. SNP Table
User customizability and support
• Downloadable results tables and high-resolution ideogram
• Ability to dynamically sort tables
• Most data points hyperlinked to underlying databases
• Interactive genome browser with customizable tracks
• Links to related GWAS datasets available for request
• YouTube tutorial (http://bit.ly/WeqNKd)
• Information buttons for documentation
Figure 3. PheGenI Search Interface & Results Summary
Users may search by phenotype (broad categories or one or more traits), gene, SNP, or chromosomal range. Additional filters for p-value or SNP functional class are available (Fig. 3).
A PheGenI search of the example trait ‘celiac disease’ returned several records (summarized in Fig. 3 and shown in detail in Fig. 4-8).
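The search options described above (phenotype, gene, SNP, chromosomal range, plus p-value and functional-class filters) can be pictured as successive filters over a table of association records. The following is a minimal sketch under stated assumptions: the record fields, the `search` function, and all record values are hypothetical placeholders, not PheGenI's schema or real results.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AssociationRecord:
    """Hypothetical record layout; not PheGenI's actual schema."""
    trait: str
    snp_id: str
    gene: str
    chromosome: str
    position: int
    p_value: float
    functional_class: str


def search(
    records: List[AssociationRecord],
    *,
    trait: Optional[str] = None,
    gene: Optional[str] = None,
    snp_id: Optional[str] = None,
    region: Optional[Tuple[str, int, int]] = None,  # (chromosome, start, end)
    max_p_value: Optional[float] = None,
    functional_class: Optional[str] = None,
) -> List[AssociationRecord]:
    """Return the records that satisfy every criterion that was supplied."""
    matches = []
    for rec in records:
        if trait is not None and rec.trait.lower() != trait.lower():
            continue
        if gene is not None and rec.gene != gene:
            continue
        if snp_id is not None and rec.snp_id != snp_id:
            continue
        if region is not None:
            chrom, start, end = region
            if rec.chromosome != chrom or not (start <= rec.position <= end):
                continue
        if max_p_value is not None and rec.p_value > max_p_value:
            continue
        if functional_class is not None and rec.functional_class != functional_class:
            continue
        matches.append(rec)
    return matches


# Placeholder data: identifiers, positions, and p-values are invented.
RECORDS = [
    AssociationRecord("celiac disease", "rs0000001", "GENE_A", "6", 1000, 1e-9, "intron"),
    AssociationRecord("celiac disease", "rs0000002", "GENE_B", "2", 5000, 3e-6, "missense"),
    AssociationRecord("height", "rs0000003", "GENE_A", "6", 1500, 2e-10, "intergenic"),
]

trait_hits = search(RECORDS, trait="celiac disease", max_p_value=5e-8)
region_hits = search(RECORDS, region=("6", 900, 2000))
```

Here `trait_hits` keeps only the first placeholder record, and `region_hits` keeps the two records on chromosome 6 within the requested range, regardless of trait.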
Figure 4. Association Results Table
Results are shown in the context of the GWAS association (Fig. 4) and linked to additional information based on gene (Fig. 6) and SNP (Fig. 7).
Figure 8. eQTL Data and GWAS datasets
Where available, links are shown to eQTL data & dbGaP datasets for which GWAS data are available for request (Fig. 8).
Figure 1. Databases and Record Counts
• Genetic Association Results: NHGRI GWAS Catalog (N = 11,781); dbGaP (N = 54,282)
• Genomic Variation: dbSNP (N = 63,222,716)
• Genes: Gene (N = 23,550)
• mRNA Expression: eQTL (N = 60,657)
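As a quick consistency check, the two association-record counts in Figure 1 can be summed and compared with the "over 66,000" figure quoted in the abstract. The pairing of each count with its database is read from the figure layout and is an assumption of this sketch.

```python
# Association-record counts as listed in Figure 1 (current as of 3/7/2013).
ASSOCIATION_COUNTS = {
    "NHGRI GWAS Catalog": 11_781,
    "dbGaP": 54_282,
}

total_associations = sum(ASSOCIATION_COUNTS.values())  # 66,063

# Fraction of association records contributed by each source.
shares = {
    source: count / total_associations
    for source, count in ASSOCIATION_COUNTS.items()
}
```

The total of 66,063 agrees with the abstract's "over 66,000" association records.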
Figure 2. Phenotype Distribution of PheGenI SNPs
The Genome View maps SNPs and genes to a genomic region and allows for interactive browsing and pinning of individual results (Fig. 5).
Figure 5. Genome View and Browser