Visualizing Genomic Variants and Annotations is Vital for Accurate Interpretation
April 23, 2015
Gabe Rudy
@gabeinformatics
VP Product Management and Engineering
Golden Helix
My Background
Golden Helix
- Founded in 1998
- Genetic association software
- Analytic services
- Thousands of users worldwide
- Over 800 customer citations in journals
Products I Build with My Team
- SNP & Variation Suite (SVS)
  - SNP, CNV, NGS tertiary analysis
  - Import and deal with all flavors of upstream data
- VarSeq
  - Annotate and filter variants in gene panels, exomes and genomes for clinical labs and researchers.
- GenomeBrowse (Free!)
  - Visualization of everything with genomic coordinates. All standardized file formats.
Visualization of Variants to Aid Interpretation
Variants + Genomic Context
- Where it is in gene
- Annotations that match, don’t match
- Other variants in cohort
- Nearby variants in cohort/population
Alignment Evidence
- BAM files provide more than is in VCF
Variant Representation
- Multi-Allelic Sites
- Allelic Primitives
- Left-Alignment
- Combination!
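To make the multi-allelic representation issue concrete, here is a minimal Python sketch that splits one multi-allelic record into biallelic records and trims the bases each allele pair shares. The function name `split_multiallelic` and the tuple layout are assumptions for illustration, not the behavior of any particular tool, and genotype fields are ignored.

```python
def split_multiallelic(chrom, pos, ref, alts):
    """Split a multi-allelic site into biallelic (chrom, pos, ref, alt) tuples.

    Hypothetical helper: pos is 1-based, alts is a list of ALT allele strings.
    """
    records = []
    for alt in alts:
        r, a, p = ref, alt, pos
        # Trim shared trailing bases, keeping at least one base per allele.
        while len(r) > 1 and len(a) > 1 and r[-1] == a[-1]:
            r, a = r[:-1], a[:-1]
        # Trim shared leading bases, advancing the position as we go.
        while len(r) > 1 and len(a) > 1 and r[0] == a[0]:
            r, a, p = r[1:], a[1:], p + 1
        records.append((chrom, p, r, a))
    return records


# One site carrying both a deletion and a substitution.
biallelic = split_multiallelic("1", 100, "CAG", ["CG", "CAT"])
```

After splitting, the deletion and the substitution sit at different positions with different REF alleles, which is one reason a lookup keyed on the original multi-allelic record can miss a matching annotation.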
X:38226614 - G/A
• Recent Addition to ClinVar:
• 2013-05-09 G/A - Untested with Disease Unspecified
• 2014-03-03 G/A - Pathogenic with not_provided
citing:
X:38226614 - G/A
• The cited PubMed article was on ResearchGate; Hiroki Morizono was contacted
• He provided the full text and lots of interesting backstory on OTC
• “If you are able to eat all the steak you want, you may have the mutation; it would appear to be a hypomorphic allele (and a very mild one at that)”
• “Is possible that the late onset case that [was] identified may have been someone who was having a very bad day, and several things went poorly for them.”
• “The R40H mutation, there was a grandfather or granduncle who was affected who ate whatever he wanted, and seemed unaffected while the proband had several episodes.”
X:38226614 - G/A
• Most likely partial penetrance, with potential risk of triggering with shock event
• The Glycine is conserved down to Opossum (Platypus and Zebrafish have an Alanine)
Reference Sequence Versus Gene Sequence
EMG1 on GRCh37
“Gap” of the mRNA coding sequence versus reference seq:
Handled differently by 3 different “gene alignments”
Reference Sequence Versus Gene Sequence
EMG1 on GRCh38
Reference sequence patched, no gap
Alignments agree
Left-Align Annotations
Using a Smith-Waterman algorithm to left-align variants from public databases shows non-obvious differences
NGS alignment and variant calling are always left-aligned
Left-align your database so its variants can be matched during annotation
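As an illustration of what left-alignment does to a single indel, here is a minimal Python sketch. It uses the simple trim-and-shift normalization rather than the Smith-Waterman approach named on the slide, and the function name `left_align` and its arguments are assumptions made for this example.

```python
def left_align(pos, ref, alt, seq):
    """Left-align and trim one variant against a reference sequence.

    Hypothetical helper: pos is 1-based, and seq is the reference
    sequence of the contig starting at position 1.
    """
    changed = True
    while changed:
        changed = False
        # Drop the last base while both alleles end with the same base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if (not ref or not alt) and pos > 1:
            pos -= 1
            base = seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Trim shared leading bases, keeping one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt, pos = ref[1:], alt[1:], pos + 1
    return pos, ref, alt


reference = "GGCACACATT"
# A "CA" deletion reported at the right end of the CA repeat...
shifted = left_align(6, "ACA", "A", reference)
# ...describes the same haplotype as the left-most placement.
leftmost = left_align(2, "GCA", "G", reference)
```

Both calls return the same record, so a database entry and a called variant that describe the same deletion compare equal only after both have been normalized this way.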
Exome Sequencing in Consumer Genomics
Exomes done as part of Pilot Program
80x coverage
Raw data with no interpretation
Erin (JIA)
Gabe (me)
Ethan
NM_002626.4:c.1877G>C in PFKL
NP_002617.3:p.Arg626Pro missense mutation
Predicted damaging by 4/5 functional predictions
VEST3: 0.948, GERP++: 4.59
ExAC and 1kG have a G>A, but G>C is novel
Variants in region are extremely rare (G>C ExAC 4 of 122,364 alleles) – 0.003%
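The percentage above follows directly from the allele counts. A minimal Python check, using only the two numbers on the slide (the helper name `allele_frequency` is an assumption for illustration):

```python
def allele_frequency(alt_allele_count, total_alleles):
    """Return the alternate allele frequency as a fraction."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return alt_allele_count / total_alleles


# Counts quoted on the slide: 4 of 122,364 alleles.
af = allele_frequency(4, 122_364)
af_percent = af * 100  # about 0.003%, i.e. roughly 1 in 30,591 alleles
```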
No ClinVar variants for gene
OMIM entry has no known disease association
PubMed search shows few recent articles. Most recent 1998 paper showed:
- Phosphofructokinase (PFKL) overexpressed in Down syndrome (DS)
- Transgenic PFKL mice had an abnormal glucose metabolism with reduced clearance rate from blood and enhanced metabolic rate in brain.
Thank you
Heidi Rehm – Chief Laboratory Director at Laboratory for Molecular Medicine, PCPGM
Hiroki Morizono – Children’s National
Reece Hart – Computational Biologist, Invitae (now 23andMe)
Greta Linse Peterson – Director of Product Management and Quality, Golden Helix