140128 use cases of giab RMs

Use cases of GIAB Reference Materials

NIST, HSPH, Claritas, NCI, MSSM, Personalis, Qiagen

Genome in a Bottle Consortium

Preliminary uses of high-confidence NIST-GIAB genotypes for NA12878

•  NIST has released several versions of high-confidence genotypes for its pilot RM

•  We’ve collected some examples of how people are using these genotypes

NIST work with FDA helps answer the question…

•  So you’ve sequenced my genome. How well did you do? –  FDA approval…

•  NIST work on developing the most accurate interpretation of a human genome, coupled with a NIST Reference Material currently being developed, enabled FDA to assess the performance of the sequencer submitted for marketing approval.

PERSPECTIVE

n engl j med nejm.org 2

at other cancers that share common driver mutations. The new technology allows us to go from our current approach of targeted searches for specific mutations in individual cancers to widespread use of approaches that survey the entire genome.

A major area of opportunity that has yet to be fully exploited is pharmacogenomics: the use of genomic information to identify the right drug at the right dose for each patient. More than 120 FDA-approved drugs have pharmacogenomics information in their labeling, providing important details about differences in response to the drug and, in some cases, recommending genetic testing before prescribing.2

But the full potential of pharmacogenomics is largely unrealized, because of the logistic challenges in obtaining suitable genomic information in a timely enough fashion to guide prescribing. Placing genomic information in the electronic medical record would facilitate this kind of personalized medicine. If the patient’s entire genome were part of his or her medical record, then the complexities of acquiring a DNA sample, shipping it, and performing laboratory work would be replaced by a quick electronic query.

Although this scenario holds great promise, the utility of genomic information for drug prescribing must be documented with rigorous evidence. For example, three recently published clinical trials raise questions about the clinical utility of using pharmacogenetic information in the initial dosing of vitamin K antagonists.3

The FDA based its decision to grant marketing authorization for the Illumina instrument platform and reagents on their demonstrated accuracy across numerous genomic segments, spanning 19 human chromosomes. Precision and reproducibility across instruments, users, days, and reagent lots were also demonstrated.

The marketing authorization of a sequencing platform for clinical use will probably expand the incorporation of genetic information into health care. But even the most promising technologies cannot fully realize their potential if the relevant policy, legal, and regulatory issues are not adequately addressed. Already, key policy advances have helped smooth the way and address many of the public’s concerns about the potential misuse of genetic information.4 For example, the Health Insurance Portability and Accountability Act of 1996 (HIPAA) and the Genetic Information Nondiscrimination Act (GINA) prohibit health insurers from considering genetic information as a preexisting condition, as material to underwriting, or as the basis for denying coverage. GINA also protects against use of genetic information by employers. These protections do not, however, extend to the disease manifestations of genetic risks. Although genomic information showing a predisposition to cancer would be protected under GINA, other clinical signs or symptoms indicative of cancer are not protected. Provisions of the Affordable Care Act set to go into effect in 2014 go a step further and will preclude consideration of all preexisting conditions, whether genomic or not, in establishing insurance premiums. Current federal laws, however, do not restrict the use of genomic information in life insurance, long-term care insurance, or disability insurance.

The legal landscape for the use of genomics in personalized medicine grew brighter in June of this year when the Supreme Court ruled (in Association for Molecular Pathology v. Myriad Genetics) that isolated naturally occurring DNA cannot be patented. This decision was a breakthrough for access to individual genetic tests but also, even more important, for the integration of genome sequencing into clinical care. Before the Myriad decision, there

FDA Authorization for Next-Generation Sequencer

[Figure: cost per genome in U.S. dollars (logarithmic axis, 1,000 to 100,000,000) by year, 2001 to 2013]

Cost per Genome.

Adapted from the National Human Genome Research Institute.

The New England Journal of Medicine Downloaded from nejm.org at FDA Biosciences Library on November 20, 2013. For personal use only. No other uses without permission.

Copyright © 2013 Massachusetts Medical Society. All rights reserved.

Perspective

The NEW ENGLAND JOURNAL of MEDICINE

n engl j med nejm.org 1

This year marks 60 years since James Watson and Francis Crick described the structure of DNA and 10 years since the complete sequencing of the human genome. Fittingly, today the Food and Drug Administration (FDA) has granted marketing authorization for the first high-throughput (next-generation) genomic sequencer, Illumina’s MiSeqDx, which will allow the development and use of innumerable new genome-based tests. When a global team of researchers sequenced that first human genome, it took more than a decade and cost hundreds of millions of dollars. Today, because of federal and private investment, sequencing technologies have advanced dramatically, and a human genome can be sequenced in about 24 hours for what is now less than $5,000 (see graph). This is a rare example of technology development in which faster, cheaper, and better have coincided: as costs have plummeted and capacity has increased, the accuracy of sequencing has substantially improved. With the FDA’s announcement, a platform that took nearly a decade to develop from an initial research project funded by the National Institutes of Health will be brought into use for clinical care. Clinicians can selectively look for an almost unlimited number of genetic changes that may be of medical significance. Access to these data opens the door for the transformation of research, clinical care, and patient engagement.

To see how this technology could be used, consider cancer. Comprehensive analysis of the genome sequence of individual cancers has helped uncover the specific mutations that contribute to the malignant phenotype, identify new targets for therapy, and increase the opportunities for choosing the optimal treatment for each patient. For instance, lung adenocarcinoma can now be divided into subtypes with unique genomic fingerprints associated with different outcomes and different responses to particular therapies. More broadly, recent work from the Cancer Genome Atlas demonstrates that the tissue of origin of a particular cancer may be much less relevant to prognosis and response to therapy than the array of causative mutations.1 As a result, patients diagnosed with a cancer for which there are few therapeutic options may increasingly benefit from drug therapies originally aimed

First FDA Authorization for Next-Generation Sequencer

Francis S. Collins, M.D., Ph.D., and Margaret A. Hamburg, M.D.






Paper Describing NIST-GIAB Characterization of NA12878: Nature Biotechnology, accepted DOI: arXiv:1307.4661 [q-bio.GN]

HSPH – Brad Chapman: Comparing variant callers

http://bcbio.wordpress.com/2013/10/21/updated-comparison-of-variant-detection-methods-ensemble-freebayes-and-minimal-bam-preparation-pipelines/

Freebayes SNP calls changed very little in 2013

http://www.bioplanet.com/gcat/reports/1933-westleouzm/variant-calls/illumina-100bp-pe-exome-150x/bwamem-freebayes-0-9-10-131226/compare-1934-akckizzzfr-1931-laqgzjytqw-1935-xwckffckoa/snp/group-quality

Freebayes indel calls improved in 2013

http://www.bioplanet.com/gcat/reports/1933-westleouzm/variant-calls/illumina-100bp-pe-exome-150x/bwamem-freebayes-0-9-10-131226/compare-1934-akckizzzfr-1931-laqgzjytqw-1935-xwckffckoa/indel/group-quality

Feedback from MoCha lab in NCI

•  We built a targeted-amplicon NGS assay for detecting mutations in clinical tumor specimens

•  To assess the assay’s specificity, we compared 84 runs of CEPH NA12878 data from our assay with NIST’s consensus variant list (VCF v2.15)

•  We observed high overall concordance, with a few FP variants in homopolymeric regions unique to our platform

•  We concluded that NIST GIAB is a useful reference standard to evaluate assay specificity
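The comparison described in these bullets can be sketched roughly as follows. This is a minimal illustration, not the MoCha lab's pipeline: it assumes variants have already been parsed into simple records and that the high-confidence regions are available as BED-style intervals. The names `Variant`, `in_regions`, and `compare_calls`, and all of the toy data, are hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int   # 1-based position, as in a VCF record
    ref: str
    alt: str


def in_regions(variant, regions):
    """True if the variant position falls in any BED-style interval.

    Each region is (chrom, start, end) with a 0-based, half-open range,
    so it covers 1-based positions start+1 through end.
    """
    return any(
        chrom == variant.chrom and start < variant.pos <= end
        for chrom, start, end in regions
    )


def compare_calls(assay_calls, truth_calls, regions):
    """Split calls inside the high-confidence regions into three sets."""
    assay = {v for v in assay_calls if in_regions(v, regions)}
    truth = {v for v in truth_calls if in_regions(v, regions)}
    return {
        "concordant": assay & truth,
        "false_positive": assay - truth,
        "false_negative": truth - assay,
    }


# Hypothetical toy data, not taken from any real call set.
regions = [("chr1", 100, 200)]
truth_calls = [Variant("chr1", 150, "A", "G"), Variant("chr1", 180, "C", "T")]
assay_calls = [
    Variant("chr1", 150, "A", "G"),   # also in the truth set
    Variant("chr1", 160, "T", "TA"),  # in a region but not in the truth set
    Variant("chr1", 500, "G", "A"),   # outside the regions, so not scored
]
result = compare_calls(assay_calls, truth_calls, regions)
```

Exact matching on (chrom, pos, ref, alt) is the simplest choice and can miscount indels that are written differently in the two call sets, so a real comparison would likely normalize variant representation first.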

Personalis – Categorizing exome regions

http://www.personalis.com/assets/files/posters/ashg2013/Towards_a_medical-grade_exome.pdf

Genome in a Bottle @ Mount Sinai

Michael Linderman

Icahn Institute for Genomics and Multiscale Biology

Dept. of Genetics and Genomic Sciences

Ongoing clinical pipeline validation

Technical replicates: NA12878 NA12891 NA12892 NA18507 NA10080

SNP Array

Sanger

Genome in a Bottle

Platinum Genomes

Concordance Analysis

Reference Materials

Evaluating and tuning variant calling & filtering

We evaluate a set of NA12878 technical replicates against GIAB for each new pipeline version

Measure overall analytical performance

Tune VQSR threshold setting to inflection point
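The slide does not say how the inflection point is located. One simple heuristic, shown here purely as an assumption, is to take the point on the sensitivity curve that lies farthest from the straight line joining its two ends, after scaling both axes to the range 0 to 1. The function names and the numbers are hypothetical.

```python
def normalize(values):
    """Scale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    return [(v - lo) / (hi - lo) for v in values]


def knee_index(xs, ys):
    """Index of the point farthest from the chord joining the first and last points."""
    nx, ny = normalize(xs), normalize(ys)
    x0, y0, x1, y1 = nx[0], ny[0], nx[-1], ny[-1]
    dx, dy = x1 - x0, y1 - y0
    length = (dx * dx + dy * dy) ** 0.5
    if length == 0:
        raise ValueError("first and last points coincide")
    # Perpendicular distance of each point to the chord.
    distances = [
        abs(dy * (x - x0) - dx * (y - y0)) / length for x, y in zip(nx, ny)
    ]
    return max(range(len(distances)), key=distances.__getitem__)


# Hypothetical curve: candidate filter settings ordered from strict to lenient
# (x) and the sensitivity against the reference calls at each setting (y).
settings = [0, 1, 2, 3, 4]
sensitivity = [0.0, 0.80, 0.95, 0.97, 0.98]
best = knee_index(settings, sensitivity)
```

With this toy curve the heuristic picks index 1, the last setting before the gain in sensitivity flattens out. Other knee-finding rules would be equally defensible, and the choice of axes changes the answer.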

GIAB Use at Qiagen (Frederick, MD)

•  Use GIAB false positive sites to quickly identify PCR artifacts in reads from PCR-enriched samples.

•  Compare accuracy of PCR-enrichment amplicon sequencing to accuracy of hybridization-capture whole-exome sequencing.

•  Tune variant calling pipeline for good balance between sensitivity and specificity.

•  Compare variant calling methods, of course!

identify PCR artifacts quickly

•  example: sequenced fragment was primed by a PCR primer-dimer strand formed in earlier PCR cycles

5-GGACCTGTGGGTGGGTAAC-3      oligo intended for chr1 locus
      |||||||||||xxxx
    3-GACACCCACCCGAGGTT-5    oligo intended for chr5 locus

5-GGGTCTGTGGGTGGGCTCCAA-3    DNA sample (chr5 locus)
  ||  |||||||||||||||||
3-CCTGGACACCCACCCGAGGTT-5    dimer product primes 4 bp upstream
    AC                       false positive variant called
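The primer-dimer mechanism in this example can be reproduced with a few lines of string handling. The sketch below uses the oligo and sample sequences from the slide; the function names, the `min_len` cutoff, and the simple exact-match annealing rule are illustrative assumptions, not Qiagen's method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_anneal(primer, template, min_len=8):
    """Longest 3'-terminal stretch of primer that pairs exactly with template.

    Both sequences are written 5' to 3'. Returns (length, template_index)
    or None if no stretch of at least min_len bases pairs.
    """
    for n in range(len(primer), min_len - 1, -1):
        index = template.find(revcomp(primer[-n:]))
        if index != -1:
            return n, index
    return None


def extend(primer, template, min_len=8):
    """Primer after polymerase copies the template bases 5' of the annealed site."""
    hit = three_prime_anneal(primer, template, min_len)
    if hit is None:
        return None
    _, index = hit
    return primer + revcomp(template[:index])


chr1_oligo = "GGACCTGTGGGTGGGTAAC"
chr5_oligo = "TTGGAGCCCACCCACAG"      # the slide shows this strand 3' to 5'
sample = "GGGTCTGTGGGTGGGCTCCAA"      # chr5 locus in the DNA sample

hit = three_prime_anneal(chr5_oligo, chr1_oligo)
dimer = extend(chr5_oligo, chr1_oligo)

# Bases a read primed by the dimer product would report where the sample differs.
mismatches = [
    (i, s, d) for i, (s, d) in enumerate(zip(sample, revcomp(dimer))) if s != d
]
```

Here the last 11 bases of the chr5 oligo pair with the chr1 oligo, extension adds four bases copied from the chr1 oligo, and the resulting product disagrees with the sample at two adjacent positions (AC in place of GT), matching the false positive shown in the diagram.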

compare amplicon enrichment to hybridization capture enrichment

variant  read set             FPR x 1E6   TPR    PPV
SNP      GeneRead, 662 kb     46.9        0.94   0.92
SNP      Mount Sinai, 37 Mb   7.6         0.89   0.99
indel    GeneRead, 662 kb     9.1         0.60   0.79
indel    Mount Sinai, 37 Mb   0.7         0.75   0.98

•  characterize panels for multiplex-PCR enrichment

•  compare to exome capture read set from Mount Sinai Medical School (105x coverage, 36.6 Mb)
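For readers unfamiliar with the column headings, the three metrics can be computed from counts of true and false calls as sketched below. The slide does not define "FPR x 1E6"; this sketch assumes it means false positives per million true-negative positions, that is FP / (FP + TN) multiplied by 10^6. The function name and the counts are hypothetical and are not the data behind the table.

```python
def call_metrics(tp, fp, fn, tn):
    """Sensitivity, precision, and false positive rate per million positions.

    tp, fp, fn are variant counts; tn is the number of positions inside the
    high-confidence regions that are correctly called as non-variant.
    """
    return {
        "TPR": tp / (tp + fn),                      # sensitivity
        "PPV": tp / (tp + fp),                      # precision
        "FPR_per_million": 1e6 * fp / (fp + tn),    # assumed meaning of "FPR x 1E6"
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
metrics = call_metrics(tp=94, fp=8, fn=6, tn=999_992)
```

Because the number of true-negative positions is so much larger than the number of variants, a false positive rate that looks tiny can still correspond to a noticeably reduced PPV, which is why both are worth reporting.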

tune variant calling pipelines

•  generate ROC curves to compare Strelka and MuTect matched tumor/normal

•  “tumor” here is 8, 16, 36, and 100% spike-in of NA12878 DNA
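A ROC-style curve for a caller can be built by sorting its calls by score and accumulating true and false positives against the reference calls. The sketch below assumes each call has already been labeled as present or absent in the truth set; it plots sensitivity against the false positive count rather than a rate, a common simplification when true negatives vastly outnumber variants. The function name and data are hypothetical, and nothing here reflects how Strelka or MuTect report scores.

```python
def roc_points(scored_calls, n_true):
    """Cumulative (score, false_positives, sensitivity) from strictest to most lenient.

    scored_calls: iterable of (score, is_true) pairs, where is_true says
    whether the call is in the truth set.
    n_true: total number of truth-set variants, including any never called.
    """
    points = []
    tp = fp = 0
    for score, is_true in sorted(scored_calls, key=lambda c: c[0], reverse=True):
        if is_true:
            tp += 1
        else:
            fp += 1
        points.append((score, fp, tp / n_true))
    return points


# Hypothetical calls: four truth variants exist, one of which is never called.
calls = [(0.9, True), (0.2, False), (0.7, False), (0.8, True), (0.6, True)]
curve = roc_points(calls, n_true=4)
```

Running the same function on each caller's output and each spike-in level gives curves that can be overlaid for comparison. Calls with tied scores are emitted as separate points here, which a more careful version would merge.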

CLARITAS GENOMICS

CONFIDENTIAL

NIST GIAB Confident Calls as the Gold Standard

●  Community resource to which all can contribute and access
   ○  Group-specific or internally-generated gold standards have unknown methods, origin, and curation.
   ○  NIST GIAB confident calls allow public access and contributions.
●  Claritas Genomics uses this callset for:
●  Technology feasibility, research and development
   ○  test new technologies, protocols, reagents, methods, etc.
●  Validation and verification
   ○  assay validation and verification for clinical use
●  Critical attributes:
   ○  a large number of confident true negative positions!
   ○  confident region allows for scale or genome-scale sensitivity and specificity analysis.
   ○  part of a large well-characterized pedigree
   ○  a trusted source of truth

Other use cases?

•  What other types of uses might people develop?