53
Next Next-Generation Sequencing: an overview of Generation Sequencing: an overview of technologies and applications technologies and applications Matthew Tinning Australian Genome Research Facility July 2012

NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

  • Upload
    others

  • View
    25

  • Download
    0

Embed Size (px)

Citation preview

Page 1: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

NextNext--Generation Sequencing: an overview of Generation Sequencing: an overview of technologies and applicationstechnologies and applications

Matthew Tinning

Australian Genome Research Facility

July 2012

Page 2: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

History of SequencingHistory of SequencingWhere have we been?Where have we been?

1869 – Discovery of DNA

1909 – Chemical characterisation

1953 – Structure of DNA solved

1977 – Sanger sequencing invented

– First genome sequenced – ФX174 (5 kb)– First genome sequenced – ФX174 (5 kb)

1986 – First automated sequencing machine

1990 – Human Genome Project started

1992 – First “sequencing factory” at TIGR

Page 3: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

History of SequencingHistory of SequencingWhere have we been?Where have we been?

1995 – First bacterial genome – H. influenzae (1.8 Mb)

1998 – First animal genome – C. elegans (97 Mb)

2003 – Completion of Human Genome Project (3 Gb)

– 13 years, $2.7 bn

2005 – First “next-generation” sequencing instrument2005 – First “next-generation” sequencing instrument

2008 – 2356 genome sequences in NCBI database

Page 4: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Origin of DNA SequencingOrigin of DNA Sequencing

• 1977

–First genome (ФX174)

–Sequencing by synthesis (Sanger)

–Sequencing by degradation (Maxam-–Sequencing by degradation (Maxam-Gilbert)

Page 5: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Sanger sequencingSanger sequencing

• Uses DNA polymerase

• All four nucleotides, plus one dideoxynucleotide

• Random termination at specific bases• Random termination at specific bases

• Separate by gel electrophoresis

Page 6: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Chain extension and Chain extension and terminationtermination

TCTGAT

G

TC

AG

AT*

TCTGAT

AGACTACGTACTTGACGAGTAC...

Page 7: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Chain extension and Chain extension and terminationtermination

TCTGATGCAT*

AGACTACGTACTTGACGAGTAC

TG

A

AGACTACGTACTTGACGAGTAC

deoxynucleotide dideoxynucleotide

Page 8: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Fragment identificationFragment identification

Page 9: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Gel electrophoresisGel electrophoresis

Page 10: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

1986: 4 Reactions to 1 Lane1986: 4 Reactions to 1 Lane

Sequencing Reaction Products Progression of Sequencing Reaction

Page 11: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

ABI377ABI377

Page 12: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

ABI3730xlABI3730xl

Page 13: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

ElectropherogramsElectropherograms

Page 14: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Sanger SequencingSanger Sequencing

•Maximum read length ~900 base•Maximum yield/day < 2.1 million bases (rapid mode, 500 bp reads)

< 0.1% of the human genome > 1000 days of sequencing for a 1 fold coverage ...

Page 15: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Human Genome ProjectHuman Genome Project

Page 16: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Human Genome ProjectHuman Genome Project

• Launched in 1989 –expected to take 15 years

– Competing Celera project launched in 1998

• Genome estimated to be 92% complete

– 1st Draft released in 2000

– “Complete” genome released in 2003– “Complete” genome released in 2003

– Sequence of last chromosome published in 2006

• Cost: ~$3 billion

– Celera ~$300 million

Page 17: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Shotgun Sequencing ApproachShotgun Sequencing Approach

Page 18: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

NextNext--Gen Sequencing Gen Sequencing TechnologiesTechnologies

Page 19: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

NextNext--Gen Sequencing Gen Sequencing TechnologiesTechnologies

• Four platforms, four technologies

• All massively parallel sequencing

–Sequencing by synthesis

–Sequencing by Ligation–Sequencing by Ligation

• Read lengths vary from ~36bp to >400bp

• Read numbers vary from ~ 1 million to ~ 1 billion per run

Page 20: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

NextNext--Gen Sequencing Gen Sequencing TechnologiesTechnologies

Life Technologies SOLiDRoche GS-FLX

Illumina HiSeq Life Technologies Ion Torrent

Page 21: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Next Gen Sequencing Library Next Gen Sequencing Library PreparationPreparation

Page 22: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Roche GSRoche GS--FLXFLX

Page 23: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

WorkflowWorkflow

Library Preparation

emPCR Setup

Sample Fragmentation

emPCR Amplification

Pyrosequencing

Data Analysis

Page 24: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

PyrosequencingPyrosequencing

Page 25: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

emPCRemPCREmulsion PCR is a method of clonal amplification which allows

for millions of unique PCRs to be performed at once through the generation of micro-reactors.

Page 26: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

emPCR

The Water-in-Oil-Emulsion

Page 27: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Massively Parallel SequencingMassively Parallel Sequencing

Page 28: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Data AnalysisData Analysis

T Base Flow

A Base Flow

C Base Flow

G Base Flow

Raw Image Files

Image Processing

Base-calling

Quality Filtering

SFF File

Page 29: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

454 Platform Updates454 Platform Updates

•100bp reads, ~20Mbp / runGS20

•250bp reads ~100 Mbp / run (7.5 hrs)GS-FLX

•400bp reads ~400 Mbp / run (10 hrs) GS-FLX Titanium •400bp reads ~400 Mbp / run (10 hrs) GS-FLX Titanium

•700 bp reads ~700 Mbp/run (18 hrs)GS-FLX Titanium Plus

•400 bp reads ~ 35Mbp/run (10 hrs)GS Junior

Page 30: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Illumina Illumina HiSeqHiSeq

Page 31: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

DNA(0.1-1.0 ug)

5’3’

G

T

C

A

TA

G

C

A

C

A

G

TC

A

T

C

A

C

GT

IlluminaIllumina Sequencing TechnologySequencing TechnologyRobust Reversible Terminator Chemistry Foundation

Sample preparation Cluster growth

5’

G

T

C

T

CC

TAG

CG

TA

1 2 3 7 8 94 5 6

Image acquisition Base calling

T G C T A C G A T …

Sequencing

Page 32: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Platform UpdatesPlatform Updates

• 18bp reads, ~1Gbp / runSolexa 1G

• 36bp reads ~3Gbp / runIllumina GA

• 75bp paired reads ~10Gbp / run (8 days) Illumina GAII

• 75bp paired reads ~40Gbp / run (8 days)Illumina GAIIx

Illumina HiSeq 2000 • 100 bp paired reads ~200 Gbp/ run (10 days)Illumina HiSeq 2000

• 100bp paired reads ~600Gbp / run (12 days)Illumina HiSeq, v3 SBS

• 150 paired reads ~1.5 Gb/run (27 hrs)MiSeq

Maximum yield / day 50,Gbp ~16x the human genome

Page 33: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Applied Applied BiosystemsBiosystems SOLiDSOLiD

Page 34: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Sequencing by LigationSequencing by Ligation

Page 35: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Base InterrogationsBase Interrogations

Page 36: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

2 Base encoding2 Base encoding

AT

Page 37: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

emPCR and EnrichmentemPCR and Enrichment

3’ Modification allows covalent bonding to the slide surface

Page 38: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Platform UpdatesPlatform Updates

• 50bp Paired reads ~50Gbp / run (12 days)SOLiD 3

• 50bp Paired reads ~100Gbp / run (12 days)SOLiD 4

• 75bp Paired reads ~300Gbp / run (14 days)5500xl

Maximum yield / day 21,000,000,000bp 7x the human genome 3.5 hours of sequencing for a 1 fold coverage.....

Page 39: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Applied Applied BiosystemsBiosystems::Ion Torrent PGMIon Torrent PGM

Page 40: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Ion TorrentIon Torrent

• Ion Semiconductor Sequencing• Detection of hydrogen ions during the polymerization DNA• Sequencing occurs in microwells • Sequencing occurs in microwells with ion sensors• No modified nucleotides• No optics

Page 41: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Ion TorrentIon Torrent

• DNA � Ions � Sequence

– Nucleotides flow sequentially over Ion semiconductor chip

– One sensor per well per sequencing reaction

– Direct detection of natural DNA extension

– Millions of sequencing reactions per chip

– Fast cycle time, real time detection

dNTP

∆ pH

∆ Q

H+

– Fast cycle time, real time detection

Sensor Plate

Silicon SubstrateDrain SourceBulk To column

receiver

∆ Q

∆ V

Sensing Layer

Page 42: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Ion Torrent: System UpdatesIon Torrent: System Updates

• 100bp reads ~10 Mb/run (1.5 hrs)314 Chip

• 100 bp reads ~100 Mbp / run (2 hrs)

• 200 bp reads ~200 Mbp/run (3 hrs)316 Chip

• 200 bp reads ~1 Gbp / run (4.5 hrs)318 Chip

Page 43: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Summary of NGS PlatformsSummary of NGS Platforms

• Clonal amplification of sequencing template– emPCR (454, SOLiD and Ion Torrent)

– Bridge amplification (Illumina)

• Sequencing by Synthesis– 454 Pyrosequencing

– Illumina Reversible Terminator Chemistry– Illumina Reversible Terminator Chemistry

– Ion Torrent Ion Semiconductor Sequencing

• Sequencing by ligation– SOLiD – 2 base encoding

• Dramatic reduction in cost of sequencing– GS-FLX provides > 100x decrease in costs compared to

Sanger Sequencing

– HiSeq and SOLiD > 100x decrease in costs over GS-FLX

Page 44: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Rapid Innovation Driving Cost Rapid Innovation Driving Cost DownDown

Evolution of NGS system output Cost per Human Genome

Throughput(GB)

120

300300GB

3GB 6GB

20GB

0

20

40

60

80

100

2007 2008 2009 2010

Page 45: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

NGS Library PreparationNGS Library Preparation

• Library preparation

• Targeted enrichment

Page 46: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Library PreparationLibrary Preparation

Page 47: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Sample preparationSample preparation

DNA

Fragmentation

Fragmentation

mRNA

cDNA Synthesis

mechanical

chemical

Fragmentation

Ligation of Amplification/Sequencing Adaptors

Library Fragment Size Selection

cDNA Synthesis

Page 48: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

ApplicationsApplications

• DNA• Whole Genome

• hybridization Capture

• amplicon• amplicon

• ChIP-seq

• RNA• mRNA

• small RNA

Page 49: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Whole Genome SequencingWhole Genome Sequencing

• de novo assembly

• Reference Mapping – SNVs, rearrangements

• Comparative genomics

E. coli assembly from MiSeq DataIllumina application notes

Page 50: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Targeted ReTargeted Re--sequencing: sequencing: sequence capturesequence capture

• Enrichment for specific targets via capture with oligonculeotide baits

– Exome Capture• Capure 40-70 Mb of annotated exonsannotated exons

– Custom Capture• up to 50 Mb of target

sequences

Page 51: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

Targeted Targeted ResequencingResequencing: : AmpliconsAmplicons

• Preparation of amplicons tagged with sequencing adapters

– Well suited for 454 and bench top sequencers

– Deep sequencing for detection of somatic mutationssomatic mutations

– 16S Sequencing for microbial diversity

Page 52: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion

RNA applicationsRNA applications

• mRNA– Gene expression analysis

– Transcriptome analysis

• Small RNA– Small RNA discovery– Small RNA discovery

– Gene regulation

Page 53: NextNext--Generation Sequencing: an overview of Generation …bioinformatics.org.au/resources/ws12/presentations/Presentation... · the generation of micro-reactors. emPCR The Water-in-Oil-Emulsion