
Page 1: 140128 use cases of giab RMs

Use cases of GIAB Reference Materials

NIST, HSPH, Claritas, NCI, MSSM, Personalis, Qiagen

Genome in a Bottle Consortium

Page 2: 140128 use cases of giab RMs

Preliminary uses of high-confidence NIST-GIAB genotypes for NA12878

• NIST has released several versions of high-confidence genotypes for its pilot RM

• We’ve collected some examples of how people are using these genotypes

Page 3: 140128 use cases of giab RMs

NIST work with FDA helps answer the question…

• So you’ve sequenced my genome. How well did you do? – FDA approval…

• NIST work on developing the most accurate interpretation of a human genome, coupled with a NIST Reference Material currently being developed, enabled FDA to assess the performance of the sequencer submitted for marketing approval.

Perspective

The NEW ENGLAND JOURNAL of MEDICINE

First FDA Authorization for Next-Generation Sequencer

Francis S. Collins, M.D., Ph.D., and Margaret A. Hamburg, M.D.

This year marks 60 years since James Watson and Francis Crick described the structure of DNA and 10 years since the complete sequencing of the human genome. Fittingly, today the Food and Drug Administration (FDA) has granted marketing authorization for the first high-throughput (next-generation) genomic sequencer, Illumina’s MiSeqDx, which will allow the development and use of innumerable new genome-based tests. When a global team of researchers sequenced that first human genome, it took more than a decade and cost hundreds of millions of dollars. Today, because of federal and private investment, sequencing technologies have advanced dramatically, and a human genome can be sequenced in about 24 hours for what is now less than $5,000 (see graph). This is a rare example of technology development in which faster, cheaper, and better have coincided: as costs have plummeted and capacity has increased, the accuracy of sequencing has substantially improved. With the FDA’s announcement, a platform that took nearly a decade to develop from an initial research project funded by the National Institutes of Health will be brought into use for clinical care. Clinicians can selectively look for an almost unlimited number of genetic changes that may be of medical significance. Access to these data opens the door for the transformation of research, clinical care, and patient engagement.

To see how this technology could be used, consider cancer. Comprehensive analysis of the genome sequence of individual cancers has helped uncover the specific mutations that contribute to the malignant phenotype, identify new targets for therapy, and increase the opportunities for choosing the optimal treatment for each patient. For instance, lung adenocarcinoma can now be divided into subtypes with unique genomic fingerprints associated with different outcomes and different responses to particular therapies. More broadly, recent work from the Cancer Genome Atlas demonstrates that the tissue of origin of a particular cancer may be much less relevant to prognosis and response to therapy than the array of causative mutations.1 As a result, patients diagnosed with a cancer for which there are few therapeutic options may increasingly benefit from drug therapies originally aimed at other cancers that share common driver mutations. The new technology allows us to go from our current approach of targeted searches for specific mutations in individual cancers to widespread use of approaches that survey the entire genome.

A major area of opportunity that has yet to be fully exploited is pharmacogenomics: the use of genomic information to identify the right drug at the right dose for each patient. More than 120 FDA-approved drugs have pharmacogenomics information in their labeling, providing important details about differences in response to the drug and, in some cases, recommending genetic testing before prescribing.2

But the full potential of pharmacogenomics is largely unrealized, because of the logistic challenges in obtaining suitable genomic information in a timely enough fashion to guide prescribing. Placing genomic information in the electronic medical record would facilitate this kind of personalized medicine. If the patient’s entire genome were part of his or her medical record, then the complexities of acquiring a DNA sample, shipping it, and performing laboratory work would be replaced by a quick electronic query.

Although this scenario holds great promise, the utility of genomic information for drug prescribing must be documented with rigorous evidence. For example, three recently published clinical trials raise questions about the clinical utility of using pharmacogenetic information in the initial dosing of vitamin K antagonists.3

The FDA based its decision to grant marketing authorization for the Illumina instrument platform and reagents on their demonstrated accuracy across numerous genomic segments, spanning 19 human chromosomes. Precision and reproducibility across instruments, users, days, and reagent lots were also demonstrated.

The marketing authorization of a sequencing platform for clinical use will probably expand the incorporation of genetic information into health care. But even the most promising technologies cannot fully realize their potential if the relevant policy, legal, and regulatory issues are not adequately addressed. Already, key policy advances have helped smooth the way and address many of the public’s concerns about the potential misuse of genetic information.4 For example, the Health Insurance Portability and Accountability Act of 1996 (HIPAA) and the Genetic Information Nondiscrimination Act (GINA) prohibit health insurers from considering genetic information as a preexisting condition, as material to underwriting, or as the basis for denying coverage. GINA also protects against use of genetic information by employers. These protections do not, however, extend to the disease manifestations of genetic risks. Although genomic information showing a predisposition to cancer would be protected under GINA, other clinical signs or symptoms indicative of cancer are not protected. Provisions of the Affordable Care Act set to go into effect in 2014 go a step further and will preclude consideration of all preexisting conditions, whether genomic or not, in establishing insurance premiums. Current federal laws, however, do not restrict the use of genomic information in life insurance, long-term care insurance, or disability insurance.

The legal landscape for the use of genomics in personalized medicine grew brighter in June of this year when the Supreme Court ruled (in Association for Molecular Pathology v. Myriad Genetics) that isolated naturally occurring DNA cannot be patented. This decision was a breakthrough for access to individual genetic tests but also, even more important, for the integration of genome sequencing into clinical care. Before the Myriad decision, there

Figure: Cost per Genome (U.S. $), by year from 2001 to 2013, on a logarithmic axis from 1,000 to 100,000,000. Adapted from the National Human Genome Research Institute.

The New England Journal of Medicine Downloaded from nejm.org at FDA Biosciences Library on November 20, 2013. For personal use only. No other uses without permission.

Copyright © 2013 Massachusetts Medical Society. All rights reserved.


Paper Describing NIST-GIAB Characterization of NA12878: Nature Biotechnology, accepted DOI: arXiv:1307.4661 [q-bio.GN]

Page 4: 140128 use cases of giab RMs

HSPH – Brad Chapman: Comparing variant callers

http://bcbio.wordpress.com/2013/10/21/updated-comparison-of-variant-detection-methods-ensemble-freebayes-and-minimal-bam-preparation-pipelines/

Page 5: 140128 use cases of giab RMs

Freebayes SNP calls changed very little in 2013

http://www.bioplanet.com/gcat/reports/1933-westleouzm/variant-calls/illumina-100bp-pe-exome-150x/bwamem-freebayes-0-9-10-131226/compare-1934-akckizzzfr-1931-laqgzjytqw-1935-xwckffckoa/snp/group-quality

Page 6: 140128 use cases of giab RMs

Freebayes indel calls improved in 2013

http://www.bioplanet.com/gcat/reports/1933-westleouzm/variant-calls/illumina-100bp-pe-exome-150x/bwamem-freebayes-0-9-10-131226/compare-1934-akckizzzfr-1931-laqgzjytqw-1935-xwckffckoa/indel/group-quality

Page 7: 140128 use cases of giab RMs

Feedback from MoCha lab in NCI

• We built a targeted-amplicon NGS assay for detecting mutations in clinical tumor specimens

• To assess the assay’s specificity, we compared 84 runs of CEPH NA12878 data from our assay with NIST’s consensus variant list (VCF v2.15)

• We observed high overall concordance, with a few FP variants in homopolymeric regions that are unique to our platform

• We concluded that NIST GIAB is a useful reference standard to evaluate assay specificity
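A comparison of this kind can be sketched as set operations restricted to the high-confidence regions. This is a minimal illustration, not the MoCha lab's method: the tuple representation, the function names, and the toy coordinates are all assumptions, and real comparisons also need variant normalization, which is omitted here.

```python
from bisect import bisect_right

def in_regions(chrom, pos, regions):
    """True if a 1-based position lies inside sorted, non-overlapping
    BED-style intervals (0-based start, exclusive end)."""
    intervals = regions.get(chrom, [])
    i = bisect_right(intervals, (pos - 1, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos - 1 < intervals[i][1]

def compare_calls(calls, truth, regions):
    """Split calls into TP / FP / FN, counting only high-confidence positions."""
    calls = {v for v in calls if in_regions(v[0], v[1], regions)}
    truth = {v for v in truth if in_regions(v[0], v[1], regions)}
    return calls & truth, calls - truth, truth - calls

# Hypothetical toy data: (chrom, pos, ref, alt)
regions = {"chr1": [(100, 200), (500, 600)]}
truth = {("chr1", 150, "A", "G"), ("chr1", 550, "C", "T")}
calls = {
    ("chr1", 150, "A", "G"),   # matches the reference call set
    ("chr1", 180, "T", "TA"),  # extra call inside a confident region
    ("chr1", 300, "G", "C"),   # outside confident regions, not assessed
}

tp, fp, fn = compare_calls(calls, truth, regions)
print(len(tp), sorted(fp), sorted(fn))
```

Calls outside the confident regions are neither rewarded nor penalized, which is why the call at position 300 does not appear in any of the three sets.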

Page 8: 140128 use cases of giab RMs

Personalis – Categorizing exome regions

http://www.personalis.com/assets/files/posters/ashg2013/Towards_a_medical-grade_exome.pdf

Page 9: 140128 use cases of giab RMs

Genome in a Bottle @ Mount Sinai

Michael Linderman

Icahn Institute for Genomics and Multiscale Biology

Dept. of Genetics and Genomic Sciences

Page 10: 140128 use cases of giab RMs

Ongoing clinical pipeline validation

Technical replicates: NA12878 NA12891 NA12892 NA18507 NA10080

SNP Array

Sanger

Genome in a Bottle

Platinum Genomes

Concordance Analysis

Reference Materials

Page 11: 140128 use cases of giab RMs

Evaluating and tuning variant calling & filtering

We evaluate a set of NA12878 technical replicates against GIAB for each new pipeline version

Measure overall analytical performance

Tune VQSR threshold setting to inflection point
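The slide does not say how the inflection point is located. One common heuristic, shown here purely as an assumption, is to pick the point on the sensitivity versus false-positive curve that lies farthest from the straight line joining the curve's endpoints. The tranche values and performance numbers below are invented for illustration.

```python
def knee_index(xs, ys):
    """Index of the point farthest from the chord joining the first and
    last points, after scaling both axes to the range 0..1."""
    x0, x1 = xs[0], xs[-1]
    y0, y1 = ys[0], ys[-1]
    best_i, best_gap = 0, 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        u = (x - x0) / (x1 - x0)
        v = (y - y0) / (y1 - y0)
        gap = abs(v - u)  # proportional to the distance from the chord
        if gap > best_gap:
            best_i, best_gap = i, gap
    return best_i

# Hypothetical results of evaluating NA12878 replicates against GIAB
# at several candidate VQSR threshold settings.
tranche = [90.0, 95.0, 99.0, 99.5, 99.9, 100.0]
fp_per_mb = [10, 20, 40, 120, 400, 1000]
sensitivity = [0.80, 0.90, 0.97, 0.98, 0.985, 0.99]

chosen = tranche[knee_index(fp_per_mb, sensitivity)]
print(chosen)
```

Past the chosen setting, each further gain in sensitivity costs disproportionately more false positives, which is the trade-off the slide describes.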

Page 12: 140128 use cases of giab RMs

GIAB Use at Qiagen (Frederick, MD)

• Use GIAB false positive sites to quickly identify PCR artifacts in reads from PCR-enriched samples.

• Compare accuracy of PCR-enrichment amplicon sequencing to accuracy of hybridization-capture whole-exome sequencing.

• Tune variant calling pipeline for good balance between sensitivity and specificity.

• Compare variant calling methods, of course!

 

Page 13: 140128 use cases of giab RMs

identify PCR artifacts quickly

• example: sequenced fragment was primed by a PCR primer-dimer strand formed in earlier PCR cycles

5-GGACCTGTGGGTGGGTAAC-3      oligo intended for chr1 locus
      |||||||||||xxxx
    3-GACACCCACCCGAGGTT-5    oligo intended for chr5 locus

5-GGGTCTGTGGGTGGGCTCCAA-3    DNA sample (chr5 locus)
  ||  |||||||||||||||||
3-CCTGGACACCCACCCGAGGTT-5    dimer product primes 4 bp upstream

AC false positive variant called
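The mechanism in the diagram can be replayed in a few lines. This is a sketch under simplifying assumptions: annealing is modeled as an exact match of the primer's 3' end, the `min_overlap` value of 8 is arbitrary, and the function names are hypothetical. The three sequences are the ones shown on the slide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def extend_on_template(primer, template, min_overlap=8):
    """Anneal the 3' end of `primer` to `template` and extend it to the
    template's 5' end. Returns the extended strand (5'->3'), or None if
    no exact 3' overlap of at least `min_overlap` bases exists."""
    for k in range(len(primer), min_overlap - 1, -1):
        site = revcomp(primer[-k:])  # what the primer's 3' end pairs with
        pos = template.find(site)
        if pos != -1:
            return primer + revcomp(template[:pos])
    return None

def mismatches(a, b):
    """0-based offsets at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

oligo_chr1 = "GGACCTGTGGGTGGGTAAC"
oligo_chr5 = "TTGGAGCCCACCCACAG"       # drawn 3'->5' on the slide
sample_chr5 = "GGGTCTGTGGGTGGGCTCCAA"

dimer = extend_on_template(oligo_chr5, oligo_chr1)
primed_read = revcomp(dimer)           # strand as it aligns to the sample
artifact_offsets = mismatches(primed_read, sample_chr5)
print(dimer, primed_read, artifact_offsets)
```

The extended dimer carries four bases copied from the chr1 oligo. When it primes at the chr5 locus, two of those bases disagree with the sample, and those are the bases reported as the AC false positive.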

Page 14: 140128 use cases of giab RMs

compare amplicon enrichment to hybridization capture enrichment

variant type   read set             FPR x 1E6   TPR    PPV
snp            GeneRead 662 Kb      46.9        0.94   0.92
snp            Mount Sinai 37 Mb    7.6         0.89   0.99
indel          GeneRead 662 Kb      9.1         0.60   0.79
indel          Mount Sinai 37 Mb    0.7         0.75   0.98

• characterize panels for multiplex-PCR enrichment

• compare to exome capture read set from Mount Sinai Medical School (105x coverage, 36.6 Mb)
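The three columns in the table follow from four counts. The slide does not spell out the denominators, so this sketch assumes the usual definitions, with true negatives taken as confident reference positions inside the targeted region. The counts are made up and do not reproduce any row of the table.

```python
def performance(tp, fp, fn, tn):
    """Summary statistics for a call set compared with a reference call set."""
    return {
        "FPR_x1e6": 1e6 * fp / (fp + tn),  # false positives per million negatives
        "TPR": tp / (tp + fn),             # sensitivity
        "PPV": tp / (tp + fp),             # precision
    }

# Hypothetical counts for a small targeted panel
stats = performance(tp=90, fp=10, fn=10, tn=999_990)
print(stats)
```

Because true negatives vastly outnumber variants, the false positive rate is only readable when scaled per million positions, which is presumably why the table reports it that way.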

Page 15: 140128 use cases of giab RMs

tune variant calling pipelines

• generate ROC curves to compare Strelka and MuTect matched tumor/normal

• “tumor” here is 8, 16, 36, and 100% spike-in of NA12878 DNA
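An ROC curve of this kind can be built by sorting a caller's variants by score and sweeping the cutoff from strict to lenient. The sketch below assumes each call carries one numeric score and ignores tied scores. The variant labels, scores, and negative-site count are invented and are not output from Strelka or MuTect.

```python
def roc_points(scored_calls, truth, n_negative):
    """Return (FPR, TPR) after admitting each call in order of decreasing score.
    `truth` is the set of known variants and `n_negative` is the number of
    confident non-variant sites."""
    points = []
    tp = fp = 0
    for variant, _score in sorted(scored_calls, key=lambda c: c[1], reverse=True):
        if variant in truth:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_negative, tp / len(truth)))
    return points

# Hypothetical data: v* are known spike-in variants, x* are artifacts
truth = {"v1", "v2", "v3", "v4"}
caller_a = [("v1", 0.99), ("v2", 0.95), ("x1", 0.90), ("v3", 0.80), ("x2", 0.40)]

roc_a = roc_points(caller_a, truth, n_negative=1000)
for fpr, tpr in roc_a:
    print(f"{fpr:.3f}\t{tpr:.2f}")
```

Running the same function on a second caller's scored calls, against the same truth set, gives two curves that can be compared at a matched false positive rate.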

Page 16: 140128 use cases of giab RMs

CLARITAS GENOMICS

CONFIDENTIAL

NIST GIAB Confident Calls as the Gold Standard

●  Community resource to which all can contribute and access
    ○  Group-specific or internally-generated gold standards have unknown methods, origin, and curation.
    ○  NIST GIAB confident calls allow public access and contributions.

●  Claritas Genomics uses this callset for:
    ●  Technology feasibility, research and development
        ○  test new technologies, protocols, reagents, methods, etc.
    ●  Validation and verification
        ○  assay validation and verification for clinical use

●  Critical attributes:
    ○  a large number of confident true negative positions!
    ○  confident region allows for scale or genome-scale sensitivity and specificity analysis.
    ○  part of a large well-characterized pedigree
    ○  a trusted source of truth

Page 17: 140128 use cases of giab RMs

Other use cases?

• What other types of uses might people develop?