Stanford Comprehensive Cancer Center
Stanford Genome Technology Center

An Introduction to Next Generation Sequencing

Hanlee Ji, M.D., Stanford University

Overview

• Principles of next generation DNA sequencing

• Analysis of genetic variation and research applications

Advances in DNA sequencing technology

M. Stratton et al. Nature 458 (2009)

Applications

• Identifying genetic variants

– Whole genome

– Exome

– Subsets

• Transcriptomes (e.g. RNASeq)

• ChIP-seq

• Epigenomes (methylation)

• Many others!

Sequencing-by-synthesis

• Individual DNA molecules from a “sequencing library”.

• Sequencing via multiple cycles of nucleotide incorporation.

• Solid phase support

• High density reads using a photodetector (e.g. CCDs) or solid state system

• Images from each cycle provide the sequence data.

J. Shendure and H. Ji. Nat Biotech (2008)

Sequencing-by-ligation

Complete Genomics

• DNA nanoballs from circles

• Combinatorial probe anchor ligation

• 10 base reads adjacent to 8 anchor sites

• 31- to 35-base mate-paired reads

Drmanac et al. Science (2010)

Solid state detection of DNA synthesis

Rothberg et al. Nature (2011)

“Nanowell” solid-state detection

Single molecule sequencing

New technologies

• Single molecule detection

• Pacific Biosciences

– Sequencing by synthesis

– Single base incorporation

“nanowell” sequencing-by-synthesis

Nanopore sequencing

• DNA inserted in a nanopore in a lipid membrane

• Speed control provided by a phi29 DNA polymerase

• Translocation via an electrical field and the polymerase

• DNA sequence read via changes in the ionic current

Issues with next generation DNA sequencing

• Higher sequencing error rates

– <0.1 to 10% or greater depending on sequencing chemistry and configuration

• Systematic bias based on approach

• Short sequence reads (<250 bases)

• Massive data output

– Data storage management

– Variant calling analysis

Aspects of next generation sequencing

• DNA sequencing library preparation.

• Processing of sequence reads

• Types of reads (e.g. “mate pairs”)

• Alignment

– Fold coverage

• Assembly

• Variant calling
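As a rough illustration of "fold coverage", the average depth is often estimated as total sequenced bases divided by genome size. The sketch below assumes uniform read length and that every read aligns; the function name and the example numbers are illustrative, not taken from the slides.

```python
def fold_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average fold coverage = (reads x read length) / genome size.

    Simplifying assumptions: all reads have the same length and all align.
    """
    return num_reads * read_length / genome_size


# Illustrative numbers only: 900 million reads of 100 bases over a
# 3-billion-base reference give roughly 30-fold average coverage.
print(fold_coverage(900_000_000, 100, 3_000_000_000))
```

In practice coverage is computed from the aligned reads, so duplicates and unmapped reads lower the effective value.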

Overview of the process in whole genome sequencing

D Koboldt et al. Briefings in Bioinformatics (2010)

Sequencing library preparation – 454 system

Sequencing process

Sequencing data generation and analysis

D Koboldt et al. Briefings in Bioinformatics (2010)

Quality metrics to improve variant calls

• Sequencing fold coverage based on alignment.

– Higher fold coverage required in cancer genomes

• Elimination of duplicate reads.

– Bottlenecks which propagate errors from DNA amplification.

• Using high quality base calls

– Quality scores 30 or higher

• Repeat sequences in genomes.

• Significance or confidence values for variants
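The "quality scores 30 or higher" threshold refers to Phred-scaled base qualities, where Q = -10 log10(P_error). A minimal sketch of the conversion follows; the helper names are hypothetical, and the ASCII decoding assumes the common Phred+33 encoding.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call is wrong, given its Phred score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score for a given base-call error probability."""
    return -10 * math.log10(p)


def decode_phred33(qual: str) -> list[int]:
    """Decode a quality string, assuming Phred+33 ASCII encoding."""
    return [ord(ch) - 33 for ch in qual]


# Q30 corresponds to an expected 1 error in 1,000 base calls.
print(phred_to_error_prob(30))
# Keep only base calls at Q30 or higher ('?' decodes to 30, 'I' to 40).
print([q for q in decode_phred33("I?5") if q >= 30])
```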

DNA sequence data format and visualization

• Sequence alignment map (SAM)

• Viewing “pileups”
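To make the SAM format concrete, here is a small sketch that splits one alignment line into the eleven mandatory tab-separated fields defined by the SAM specification. The record type, helper names, and the example read are made up for illustration; optional tags after the eleventh column are ignored.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    qname: str   # read name
    flag: int    # bitwise flags
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description
    rnext: str   # reference name of the mate
    pnext: int   # position of the mate
    tlen: int    # observed template (insert) length
    seq: str     # read sequence
    qual: str    # base qualities (Phred+33 ASCII)


def parse_sam_line(line: str) -> SamRecord:
    """Parse the 11 mandatory SAM columns of one alignment line."""
    f = line.rstrip("\n").split("\t")
    return SamRecord(f[0], int(f[1]), f[2], int(f[3]), int(f[4]),
                     f[5], f[6], int(f[7]), int(f[8]), f[9], f[10])


def is_duplicate(rec: SamRecord) -> bool:
    """Flag bit 0x400 marks a PCR or optical duplicate."""
    return bool(rec.flag & 0x400)


# Hypothetical record for illustration only.
example = "read1\t99\tchr18\t1000\t60\t5M\t=\t1300\t305\tGATCA\tIIIII"
rec = parse_sam_line(example)
print(rec.rname, rec.pos, rec.cigar, is_duplicate(rec))
```

For real data, an established library is a safer choice than hand parsing; this only shows what the columns mean.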

Genetic variation

• Point mutations

– Nonsynonymous versus synonymous

• Insertion / deletions (indels)

• Copy number variations (CNVs)

• Structural variants (SV)

– Intrachromosomal

• Large indels

• Duplications

• Inversions

– Interchromosomal

• Balanced translocations

• Imbalanced translocations

Single nucleotide variants from cancer genomes

S. P. Shah et al. Nature, 461 (2009)

Variant callers

• Genome Analysis Toolkit

• VarScan

• SAMtools

• SNVMix

• Others…

Single nucleotide mutations

• Silent = synonymous

• Substitution = nonsynonymous

• Nonsense = premature stop
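The three categories above can be reproduced by translating the reference and mutated codons with the standard genetic code. This is a minimal sketch: the function name is hypothetical, the codons are assumed to be in-frame DNA triplets on the coding strand, and the "stop-loss" label is an extra case not listed on the slide.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-codon change using the slide's categories."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"        # synonymous
    if alt_aa == "*":
        return "nonsense"      # premature stop
    if ref_aa == "*":
        return "stop-loss"     # extra category, not on the slide
    return "substitution"      # nonsynonymous


print(classify_codon_change("GAA", "GAG"))  # Glu -> Glu
print(classify_codon_change("GAA", "GTA"))  # Glu -> Val
print(classify_codon_change("CGA", "TGA"))  # Arg -> stop
```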

http://commons.wikimedia.org

Transitions versus transversion mutations

Ding et al. Nature (2010)

[Figure: base substitution diagram labeling transitions and transversions]

• Transitions

– A <-> G

– C <-> T

• Transversions

– A <-> T

– A <-> C

– G <-> T

– G <-> C
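The rule behind the two lists is that a transition stays within a chemical class (purine to purine, or pyrimidine to pyrimidine) while a transversion crosses classes. A short sketch, with hypothetical helper names, that classifies substitutions and computes a transition/transversion ratio:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def titv_ratio(substitutions) -> float:
    """Ratio of transitions to transversions in (ref, alt) pairs."""
    kinds = [substitution_type(r, a) for r, a in substitutions]
    ti = kinds.count("transition")
    tv = kinds.count("transversion")
    return ti / tv if tv else float("inf")


print(substitution_type("A", "G"))
print(substitution_type("G", "C"))
print(titv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
```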

Small insertions and deletions

Targeting strategies for resequencing genomic subsets

• Array-based hybridization capture

• In-solution capture (e.g. molecular inversion probes)

• In-solution hybridization capture

Rapid targeted mutation analysis from cancer genomes

[Figure, panel a: preparation and processing. Target-specific oligonucleotide, single-adaptor library, flow cell, immobilized DNA; hybridization, extension and denaturation.]

[Figure, panel b: Step 1, primer-probe preparation; Step 2, target capture; Step 3, cluster preparation; then sequence data. Labels: primer-probe, immobilized primers ‘C’ and ‘D’, sequencing primers 1 and 2.]

“Onconomic” diagnostic mutations analysis

• Rapid mutation analysis for point-of-care use

• Analysis of identified cancer drivers

• Determination of pathogenic mutations

• Example: nonsense mutation in SMAD4

Normal

Tumor

Visualizing sequence

SNP genotyping

1.5 Mb region on Chromosome 18

Whole genome sequencing

M. Stratton et al. Nature, 458 (2009)

A needle in a human genome haystack?

• A human genome has 23 pairs of chromosomes.

• 6 billion individual DNA base pairs per diploid genome.

• A single base pair error can be a disease mutation.

..GATC..ERROR..TTCCAA..

X

Exome sequencing

M. Clark et al. Nature Biotechnology (2012)

A cancer family pedigree

[Pedigree figure. Legend: colorectal cancer, colon polyps, no cancer; male, female. Ages shown: 42 y/o, 43 y/o. Labeled individual: AP.]

Assessment of a cancer family – unaffected versus affected

[Figure labels: AP, Mother, Father]

Exome sequencing analysis for identifying inherited disease

[Figure: tracks numbered 1, 2, 3 ... 11, etc. shown for AP, Mother and Father, highlighting AP’s unique family variants]

• Identify the variants unique to the affected members.
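One simple way to express this filtering step is with set operations: keep the variants shared by every affected member and subtract anything seen in an unaffected member. This is only a sketch under assumptions; the (chromosome, position, ref, alt) tuple representation, the function name, and the example variants are invented for illustration and do not come from the family shown.

```python
def unique_to_affected(affected, unaffected):
    """Variants present in all affected members and in no unaffected member.

    Each member is a set of (chrom, pos, ref, alt) tuples.
    Requires at least one affected member.
    """
    shared = set.intersection(*(set(v) for v in affected))
    seen_in_unaffected = set().union(*(set(v) for v in unaffected))
    return shared - seen_in_unaffected


# Made-up variant calls for illustration.
ap = {("chr18", 100, "C", "T"), ("chr1", 5, "A", "G")}
mother = {("chr1", 5, "A", "G")}
father = {("chr2", 42, "G", "A")}

print(unique_to_affected([ap], [mother, father]))
```

A real analysis would also account for coverage gaps and genotype quality, since a variant can look "unique" simply because it was not covered in a relative.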

Interpretation of genetic variants

• Effects of amino acid substitutions are predicted bioinformatically

• SIFT – probability that a substitution is tolerated

– < 0.05 is deleterious.

• PolyPhen – categorical definitions

– “benign”, “possibly damaging” and “probably damaging”

• Protein structural mapping
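The SIFT cutoff quoted above can be written as a one-line rule. The sketch below only applies the 0.05 threshold from the slide to a score that has already been computed elsewhere; the function name is hypothetical and it does not run SIFT itself.

```python
def sift_call(score: float, threshold: float = 0.05) -> str:
    """Label a SIFT score: below the threshold is predicted deleterious."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores are probabilities between 0 and 1")
    return "deleterious" if score < threshold else "tolerated"


print(sift_call(0.01))
print(sift_call(0.30))
```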

IDH1 mapping of Arg132 cancer mutation

Parsons et al., Science (2008)

Sequence assembly

• Assembling fragments of random sequence to form a set of larger contiguous sequences (contigs).

• Used to assemble de novo genomes of new organisms.

• Useful for reconstructing regions of high complexity such as SVs.
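To illustrate the idea of merging fragments into contigs, here is a toy greedy overlap assembler. It is a deliberately simplified sketch and not the method used by production assemblers (the cited work uses de Bruijn graphs): it assumes error-free reads on one strand, ignores reads contained inside other reads, and the function names and `min_len` parameter are illustrative.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


print(greedy_assemble(["GATTAC", "TTACAG", "ACAGGT"]))
```

Greedy merging can mis-assemble repeats, which is one reason repeat sequences appear among the quality concerns earlier in the deck.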

Zerbino DR, Birney E, Genome Research, 18 (2008)

Metagenomic characterization of bacterial flora

Copy number from genome sequencing

• Genome shotgun sequencing comparison.

• Copy number variation derived directly from sequence reads.

• 15 Kb windows with sequence tag counting
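Window-based tag counting can be sketched as follows: bin read start positions into fixed 15 Kb windows, normalize by library size, and compare tumor to normal as a log2 ratio. This is an illustrative simplification with hypothetical function names; it handles a single chromosome, ignores GC and mappability bias, and the pseudocount is an assumption added to avoid division by zero.

```python
import math
from collections import Counter


def window_counts(positions, window: int = 15_000) -> Counter:
    """Count read start positions per fixed-size window (one chromosome)."""
    return Counter(pos // window for pos in positions)


def log2_ratios(tumor_pos, normal_pos, window: int = 15_000, pseudo: float = 0.5):
    """Per-window log2(tumor/normal) after scaling by total read count."""
    t, n = window_counts(tumor_pos, window), window_counts(normal_pos, window)
    t_total, n_total = sum(t.values()), sum(n.values())
    return {
        w: math.log2(((t[w] + pseudo) / t_total) / ((n[w] + pseudo) / n_total))
        for w in sorted(set(t) | set(n))
    }


# Toy data: window 0 has fewer tumor reads (loss), window 1 has more (gain).
tumor = [100] * 10 + [20_000] * 30
normal = [100] * 20 + [20_000] * 20
print(log2_ratios(tumor, normal))
```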

Campbell et al., Nature Genetics, (2008)

Copy number variations (CNVs) from genomic sequencing

Genomic sequence analysis

Array CGH CNV analysis

Breast cancer – Chromosome 1

Structural variations in human genomes

Deletion Duplication Inversion

Translocation

http://commons.wikimedia.org

Intrachromosomal

Interchromosomal

Insertion

Structural variation

• Mate pair mapping distances depend on the insert size distribution of the genomic DNA library.

[Figure: normal versus tumor; mate pairs at Exon n and Exon n+1; 300 nt labels; deleted region and intact region]

Genomic deletion analysis

• Breast cancer genome sequencing.

• Mate pair sequences used in indel analysis.

• Changes in the location of mapped reads that are not concordant with the sequencing library insert size.
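The discordance test described in the last bullet can be sketched as a threshold on the mapped distance between mates: pairs that map much farther apart than the library insert size suggest a deletion between them. The function name, the three-standard-deviation cutoff, and the example coordinates are assumptions for illustration, and both mates are assumed to map to the same chromosome.

```python
def deletion_candidates(pairs, mean_insert: float, sd_insert: float, n_sd: float = 3.0):
    """Return mate pairs whose mapped span exceeds the expected insert size.

    pairs: iterable of (left_pos, right_pos) on one chromosome.
    """
    cutoff = mean_insert + n_sd * sd_insert
    return [(left, right) for left, right in pairs if abs(right - left) > cutoff]


# Toy library with a 300-base insert (sd 20): the third pair spans 3,000 bases.
pairs = [(1000, 1300), (5000, 5310), (9000, 12000)]
print(deletion_candidates(pairs, mean_insert=300, sd_insert=20))
```

Real callers require several independent discordant pairs supporting the same breakpoint before reporting an event.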

Normal

Primary

Metastasis

Xenograft

Ding et al, Nature, (2010)

Structural variants from small cell lung cancer genome

Campbell et al., Nature Genetics, (2008)

Duplication Inversion

Translocations in colorectal cancer genomes

• Balanced translocations between chr 8 and 20 p arms

• Structural changes can only be delineated based on sequencing

Bass et al., Nature Genetics, (2011)

Cancer transcriptome sequencing (RNASeq)

• Mate pair analysis from prostate cancer mRNA

• Identification of reads indicating gene fusions.

N Palanisamy et al, Nat Med 16 (2010)

Sequenced cancer genomes – nonsmall cell lung

Lee et al. Nature 465, (2010)

Tumor coverage 60 X

Normal coverage 46 X

Mutation rate per Mb 17.7

Total identified tumor mutations 83,000

Coding mutations 540

Validated mutations 302

Total identified indels 54,921

Coding indels 253

Total identified structural variants 79

Validated structural variants 43

Whole genome analysis of colorectal cancer

• Cancer Genome Atlas analysis of colon adenocarcinoma

• “Circos” plots of whole genome data

Gene expression and RNASeq

ChIP-seq

Ultrasensitive mutation detection

• Robust detection of 1 mutant allele from 1,000 wildtype alleles in heterogeneous mixtures

• Application to viral infections

• Analysis of cancer point mutations
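A back-of-the-envelope way to see why detecting 1 mutant in 1,000 needs deep sequencing is a binomial sampling model: at depth n and mutant fraction f, the chance of seeing at least k mutant reads is 1 minus the probability of seeing fewer than k. This sketch is not the statistical model of the cited paper; the function name is hypothetical and sequencing error is ignored, although at this allele fraction it matters a great deal in practice.

```python
from math import comb


def prob_at_least(k: int, depth: int, fraction: float) -> float:
    """P(X >= k) for X ~ Binomial(depth, fraction): chance of sampling
    at least k mutant reads at the given depth and mutant allele fraction."""
    below = sum(
        comb(depth, i) * fraction**i * (1 - fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - below


# At 1,000x depth a 1-in-1,000 allele is seen at least once only ~63% of the time.
print(round(prob_at_least(1, 1_000, 0.001), 3))
# At 10,000x depth, at least 5 supporting reads are expected ~97% of the time.
print(round(prob_at_least(5, 10_000, 0.001), 3))
```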

Flaherty et al., Nucleic Acids Research, 2012

Deep resequencing for rare variants

Smith et al., Nature 2009

• Derived from reassortment of swine and human flu in swine

• More than 214 countries in 2009

• More than 622,482 infections confirmed

• 18,449 deaths confirmed by WHO

Oseltamivir resistance mutation in influenza

• Neuraminidase bound to oseltamivir (Tamiflu)

• Mutations cluster around sialic acid/oseltamivir binding pocket

Collins et al. Nature 2008.

Detection of the oseltamivir resistance mutation

Phylogenetic tree of H1 influenza genomes

Conclusion

• Multiple approaches available for analysis of genomes

• Scale of sequence data requires extensive computational, bioinformatic and statistical data analysis

• Methods, technologies and analysis continue to improve and become simpler.

Genetics via large scale sequencing

• Mutations and other genomic DNA aberrations contribute to neoplastic development

• Specific genetic variants and other aberrations indicate clinical phenotype

• Utility as diagnostics

Cancer exome survey (Sanger sequencing)

• Each row represents a chromosome.

• Peaks represent driver mutations

TP53

Wood et al, Science, 318 (2007)

Tiers of cancer genome sequencing

[Figure: sequencing tiers (complete human genomes; exomes and transcriptomes; genomic subsets) shown against uses (discovery; translational studies; cancer diagnostic; ?)]

Whole cancer genome sequencing

• Pros

– Most comprehensive coverage of the genome.

– Least bias – most objective analysis

– Highest resolution at base pair level

– Identification of complex structural variants

– Experimentally straightforward…

• Cons

– Cost (rapidly dropping!)

– Rapid evolution of technologies

– Challenging data management and analysis

Sample and genetic complexity of cancer

• Sample variability

– Normal stroma contamination

– Mixtures of variable lineages

– Degradation of DNA

• Intratumoral genetic heterogeneity

– Clonal subpopulations carrying different mutations

• Background random mutations (e.g. passengers)

• Complex genomic structure

Sequencing cancer genomes – clinical samples

• Types of samples

– Cancer cell lines

– Xenografts

– Primary tumors

– Purified cancer cells

• Requirement for matched samples

– Normal diploid genome

False positive mutation rates in genome sequencing

• Reliable mutation calling requires a very low false positive rate.

– 1 base / 10,000 error = 300,000 false mutations

– 1 base / 100,000 error = 30,000 false mutations

• Ideal false positive rate

– 1 base / 1,000,000 error

– ~50% of candidate mutations are correct!
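The arithmetic behind these figures appears to assume a genome of about 3 billion bases: the expected number of false calls is the genome size multiplied by the per-base error rate. The sketch below reproduces the slide's numbers under that assumption. The 3,000 true mutations used in the precision example is a hypothetical value chosen so that the "~50% correct" statement works out; it is not stated in the source.

```python
GENOME_SIZE = 3_000_000_000  # assumed haploid human genome size, in bases


def expected_false_calls(bases_per_error: int, genome_size: int = GENOME_SIZE) -> int:
    """Expected false mutation calls at an error rate of 1 per `bases_per_error`."""
    return genome_size // bases_per_error


def fraction_correct(true_mutations: int, false_calls: int) -> float:
    """Share of candidate mutations that are real."""
    return true_mutations / (true_mutations + false_calls)


for bases_per_error in (10_000, 100_000, 1_000_000):
    print(bases_per_error, expected_false_calls(bases_per_error))

# With a hypothetical 3,000 true mutations and 1 error per 1,000,000 bases,
# about half of the candidates would be correct.
print(fraction_correct(3_000, expected_false_calls(1_000_000)))
```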