The world leader in serving science *Mohit Gupta Democratizing Sequencing with Ion S5 TM Sequencers Powered by GPUs *GPU work is combined effort with Jakob Siegel For Research Use Only. Not for use in diagnostic procedures.


Page 1:

The world leader in serving science

*Mohit Gupta

Democratizing Sequencing with Ion S5™ Sequencers Powered by GPUs

*GPU work is a combined effort with Jakob Siegel

For Research Use Only. Not for use in diagnostic procedures.

Page 2:

Why sequence DNA?

Page 3:

Why Targeted Sequencing

More cost effective, more time efficient and simpler to analyze

| | Targeted Sequencing | Whole Exome | Whole Genome |
|---|---|---|---|
| Variants generated per run | 10s to 100s | ~50,000 | ~3,000,000 |
| Likely number of variants for follow-up | 1-10s | 1-10s | 1-10s |
| Time to analyze | Hours to Days | Days to Weeks | Weeks to Months |
| Total cost including analysis | $ | $$ | $$$ |

Page 4:

Low Cost, Simple, Scalable, Real Time Sequencing

[Figure: panels labeled Semiconductor Design; Wafer (Semiconductor Manufacturing); Chip (Semiconductor Packaging); Millions of Sensors. Single Sensor cross-section with labels H+, Sensing Layer, Sensor Plate, Silicon Substrate (Drain, Source, Bulk), ∆V. Chemical to Digital Sequence: TCGTACC…]

Page 5:

Transistor as a pH meter

[Figure: sensor cross-section with labels dNTP, H+, ∆pH, ∆Q, ∆V, Sensing Layer, Sensor Plate, Silicon Substrate (Drain, Source, Bulk), To column receiver.]

Compute-intensive signal processing

Rothberg J.M. et al., Nature, doi:10.1038/nature10242

Page 6:

Ion NGS Instruments Evolution

[Figure: timeline with axis marks 2010, 2012, 2014, 2016, running from "Hard to Use" to "Easy to Use". Instruments shown: Ion Chef, Ion Proton, Ion PGM, Ion S5™, Ion S5™ XL System.]

Page 7:

Applications: Ion S5™ and Ion S5™ XL Systems

•  Simplest NGS workflow for targeted sequencing - <15 min to set up a sequencing run and <45 min total hands-on time from DNA to data with Ion Chef™ System

•  Fastest run time – as little as 3.5 hours from sequence to BAM files.

•  Lowest capital investment - Single platform for all targeted applications with flexibility to scale from 5M - 80M reads

•  Lowest DNA/RNA input requirements – as little as 1ng using Ion AmpliSeq™ technology

•  Easy setup and training – single day installation and plug and play cartridge-based reagents

Simplest and fastest targeted sequencing system with the lowest capital investment

Page 8:

Ion S5™ System – Low Cost

•  Gene panels to exomes and transcriptomes on a single low cost platform
•  Built-in Informatics - no external server required
•  Upgradable from S5™ to S5™ XL configuration
•  NVIDIA GTX 970

Ion 520™ Chip: 5 M Reads
Ion 530™ Chip: 15-20 M Reads
Ion 540™ Chip: 60-80 M Reads

Page 9:

Ion S5™ XL System – Speed

•  Rapid sequencing and analysis for labs requiring more output or multiple runs per day
•  The computing power to enable rapid turnaround times from any size of experiment
•  1 hr analysis time for gene panels, 5 hr analysis for transcriptomes
•  NVIDIA Tesla K40

Ion 520™ Chip: 5 M Reads
Ion 530™ Chip: 15-20 M Reads
Ion 540™ Chip: 60-80 M Reads

Page 10:

Ion S5™ and Ion S5™ XL Systems: Scalable performance on a single platform

5M to 80M reads

| Chip | 200bp Sequencing Time (S5/S5 XL) | 200bp Analysis Time (Ion S5™ XL System) | 200bp Analysis Time (Ion S5™ System) | Output | No. of Reads | Maximum Read Length |
|---|---|---|---|---|---|---|
| Ion 520™ Chip | 2.5 hr | ~1 hr | ~5 hr | 1-2 Gb | 3-5 M | 400bp |
| Ion 530™ Chip | 2.5 hr | ~2.5 hr | ~8 hr | 3-5 Gb | 15-20 M | 400bp |
| Ion 540™ Chip | 2.5 hr | ~5 hr | ~16.5 hr | 10-15 Gb | 60-80 M | 200bp |

[Figure: Ion S5™ XL System and Ion S5™ System]
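The analysis times listed above imply a rough per-chip speedup of the Ion S5™ XL System over the Ion S5™ System. The following is a minimal sketch of that arithmetic, using only the approximate times from this slide; the dictionary and function names are illustrative and not part of any Ion software.

```python
# Approximate 200bp analysis times in hours, copied from the slide.
# All names in this sketch are illustrative assumptions.
ANALYSIS_HOURS = {
    "Ion 520": {"s5_xl": 1.0, "s5": 5.0},
    "Ion 530": {"s5_xl": 2.5, "s5": 8.0},
    "Ion 540": {"s5_xl": 5.0, "s5": 16.5},
}


def xl_speedup(chip: str) -> float:
    """Ratio of S5 analysis time to S5 XL analysis time for one chip."""
    times = ANALYSIS_HOURS[chip]
    return times["s5"] / times["s5_xl"]


for chip in ANALYSIS_HOURS:
    print(f"{chip}: about {xl_speedup(chip):.1f}x faster analysis on the S5 XL")
```

Because the slide gives the times only approximately ("~"), the ratios should be read as rough estimates.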

Page 11:

Data Processing Pipeline

Data acquisition and compression in FPGA → Signal Processing → BaseCalling and Alignment to reference genome

[Figure labels, data volumes as extracted: 2 TB; 180 GB; 20 TB (540™ chip)]

Page 12:

GPU to the rescue

•  Removed main hotspot in signal processing pipeline
•  Speedups of more than 250x over a CPU core!

[Figure: bar chart of time in s (axis 0 to 160) for CPU and GPU. Legend and labels as extracted: bead find; CPU processing first 20 flows; per block; CPU processing after flow 20; time spent in fitting.]

Page 13:

GPU’s Impact

•  Multiple sequencing runs a day possible
•  Swift pace of Research and Development
•  Accelerated product innovation

[Figure: On Instrument Analysis Time with and without GPU. Series: with GPU; CPU only.]

Page 14:

Signal Processing

Page 15:

Signal Processing Flow

Reading flow data → Raw Data Processing → Regional Parameter Estimation (common to all wells) → Parameter Estimation unique to each well (LM fitting) → Post Fit Processing → Writing signal values
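The per-well step is labeled LM fitting, that is, Levenberg-Marquardt least squares. The slides do not give the fitted model, so the sketch below assumes a toy two-parameter exponential decay purely for illustration; `model`, `lm_fit`, and all parameter names are hypothetical and do not come from the Ion pipeline.

```python
import math


def model(t, amp, tau):
    """Toy well signal (an assumption): amplitude decaying exponentially."""
    return amp * math.exp(-t / tau)


def lm_fit(ts, ys, amp=1.0, tau=1.0, iters=100, lam=1e-3):
    """Levenberg-Marquardt fit of (amp, tau) for one well."""
    def sse(a, k):
        return sum((y - model(t, a, k)) ** 2 for t, y in zip(ts, ys))

    err = sse(amp, tau)
    for _ in range(iters):
        # Accumulate J^T J (2x2) and J^T r (2x1) over all samples.
        jaa = jat = jtt = ga = gt = 0.0
        for t, y in zip(ts, ys):
            e = math.exp(-t / tau)
            r = y - amp * e
            da = e                          # d(model)/d(amp)
            dt = amp * e * t / (tau * tau)  # d(model)/d(tau)
            jaa += da * da
            jat += da * dt
            jtt += dt * dt
            ga += da * r
            gt += dt * r
        # Damped normal equations: (J^T J + lam * diag(J^T J)) step = J^T r
        a11 = jaa * (1 + lam)
        a22 = jtt * (1 + lam)
        det = a11 * a22 - jat * jat
        if det == 0:
            break
        d_amp = (ga * a22 - gt * jat) / det
        d_tau = (gt * a11 - ga * jat) / det
        new_amp, new_tau = amp + d_amp, tau + d_tau
        new_err = sse(new_amp, new_tau) if new_tau > 0 else float("inf")
        if new_err < err:
            amp, tau, err = new_amp, new_tau, new_err
            lam = max(lam / 10, 1e-12)  # accepted: move toward Gauss-Newton
        else:
            lam *= 10                   # rejected: damp the step more strongly
    return amp, tau


# Synthetic, noise-free trace for a single well.
ts = [0.25 * i for i in range(20)]
ys = [model(t, 3.0, 2.0) for t in ts]
amp_fit, tau_fit = lm_fit(ts, ys, amp=2.0, tau=1.0)
print(amp_fit, tau_fit)
```

Each well is fitted independently of every other well, which is what makes this step a natural candidate for one-thread-per-well execution on a GPU.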

Page 16:

Mathematical model

•  Sophisticated model
    •  Background correction
    •  Incorporation
    •  Buffering
•  Regional Parameters
    •  Enzyme kinetics, nucleotide rise, diffusion etc.
•  Well Parameters
    •  Hydrogen ions generated, buffering, DNA copies etc.

[Figure labels: Decay in H+; Incorporation]

Page 17:

GPU Acceleration

Page 18:

Current Execution Model

•  Based on Original CPU implementation: Process Level
    •  96 blocks
    •  depending on hardware, 4 to 6 processes in parallel
    •  work on available data during experiment

*Heat-map and timing from an S5™ XL 540™ with Nvidia Tesla K40 GPU
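The process-level model amounts to a bounded pool working through 96 blocks, with 4 to 6 running at once. The sketch below illustrates that scheduling pattern only: `process_block` is a hypothetical stand-in for the per-block analysis, and threads are used in place of the separate processes described on the slide to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_BLOCKS = 96      # from the slide
MAX_PARALLEL = 4     # the slide gives 4 to 6, depending on hardware


def process_block(block_id: int) -> int:
    """Hypothetical stand-in for the per-block signal processing."""
    return block_id * block_id


def run_all_blocks(workers: int = MAX_PARALLEL) -> list[int]:
    # Threads stand in for the separate processes described on the slide;
    # at most `workers` blocks are in flight at any time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_block, range(NUM_BLOCKS)))


results = run_all_blocks()
print(len(results), "blocks processed")
```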

Page 19:

Current Execution Model

•  Based on Original CPU implementation: Thread Level

[Figure: thread-level pipeline. Labels: Raw Data Loader Thread (ImgLoader, three shown); CPU Queue; BkgModel Worker Thread (six shown); 1.wells writer (ImgLoader, three shown). Pipeline steps: Gen Traces, RegionFit, Single Flow Fit, PostFit, Xtalk/Clonal. Other labels: accumulate traces for 20 flows; beads, frames, flows (20); copy; sync; counts 1, 36, 6.]
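The thread-level diagram follows a producer/consumer pattern: loader threads fill a queue that worker threads drain, with traces grouped over 20 flows. The sketch below shows that pattern under simplifying assumptions: the loader runs in the calling thread, flow numbers stand in for trace data, and every name (`loader`, `worker`, `run`) is hypothetical.

```python
import queue
import threading

FLOWS_PER_BATCH = 20   # from the slide: traces are accumulated for 20 flows
NUM_WORKERS = 6        # the diagram repeats the BkgModel Worker label six times
STOP = object()        # sentinel telling a worker to exit


def loader(num_flows, work_q):
    """Hypothetical raw data loader: emits one job per batch of 20 flows."""
    batch = []
    for flow in range(num_flows):
        batch.append(flow)                 # stand-in for one flow's traces
        if len(batch) == FLOWS_PER_BATCH:
            work_q.put(batch)
            batch = []
    if batch:                              # leftover flows, if any
        work_q.put(batch)


def worker(work_q, done_q):
    """Hypothetical worker: 'fits' a batch and reports how many flows it saw."""
    while True:
        batch = work_q.get()
        if batch is STOP:
            break
        done_q.put(len(batch))


def run(num_flows):
    work_q, done_q = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(work_q, done_q))
               for _ in range(NUM_WORKERS)]
    for w in workers:
        w.start()
    loader(num_flows, work_q)              # runs in the calling thread here
    for _ in workers:
        work_q.put(STOP)                   # one sentinel per worker
    for w in workers:
        w.join()
    return sorted(done_q.get() for _ in range(done_q.qsize()))


print(run(500))
```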

Page 20:

Current Implementation

•  Stream-based, to hide PCIe transfers

•  Resources needed for stream execution are pre-allocated and obtained from a resource pool.

•  If resources to create a Stream Execution Unit (SEU) are available, the Stream Manager will try to poll a new job from a job queue.

•  The Stream Manager can drive multiple SEUs, which can be of different types.

•  Theoretically, up to 16 SEUs can be spawned in one Stream Manager if enough resources are available.
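The interplay of resource pool, job queue, and SEUs can be sketched as follows. This is a simplified, single-threaded illustration of the bookkeeping only, not the actual implementation: the class, its methods, and the job names are hypothetical, and real SEUs would wrap GPU streams and pre-allocated device buffers.

```python
from collections import deque

MAX_SEUS = 16   # from the slide: theoretical upper bound per Stream Manager


class StreamManager:
    """Hypothetical sketch: starts SEUs only while pooled resources last."""

    def __init__(self, pool_size):
        # Pre-allocated resources, handed out and returned, never re-created.
        self.free_resources = deque(range(min(pool_size, MAX_SEUS)))
        self.active = {}    # resource id -> job currently running on that SEU

    def poll(self, job_queue):
        """Start as many queued jobs as there are free resources."""
        started = []
        while self.free_resources and job_queue:
            res = self.free_resources.popleft()
            job = job_queue.popleft()
            self.active[res] = job
            started.append(job)
        return started

    def finish(self, job):
        """Return the resources of a finished job to the pool."""
        for res, running in list(self.active.items()):
            if running == job:
                del self.active[res]
                self.free_resources.append(res)
                return
        raise KeyError(job)


jobs = deque(["fit-0", "fit-1", "fit-2"])
manager = StreamManager(pool_size=2)
first = manager.poll(jobs)      # only two resources, so only two jobs start
manager.finish(first[0])
second = manager.poll(jobs)     # the freed resource picks up the third job
print(first, second)
```

Pre-allocating the pool keeps allocation off the critical path: a job starts only when a complete set of resources is already available.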

Page 21:

Current Pipeline after 20 flows

•  ~400 MB GPU / 150 MB host memory per stream
•  no room for persistent data (36 regions)
•  huge allocation and copy overhead
•  data transpose overhead
•  varying bead count and frames per region, reallocation and slowdown in absolute worst case
•  synchronization steps

[Figure: current pipeline. Labels: Raw Data Loader Thread (ImgLoader, three shown); CPU Queue; BkgModel Worker Thread (six shown); GPU Queue; GPU Worker; StreamManager; SEU / StreamEU; 1.wells writer (ImgLoader, three shown). Pipeline steps: Gen Traces, RegionFit, PostFit, Xtalk/Clonal; on the GPU: transp. Input, SingleFlowFit, transp. Output. Other labels: accumulate traces for 20 flows; host memory, page locked memory, and GPU memory holding beads, frames, flows (20); copy; sync; counts 1, 36, 6.]
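The transposes around SingleFlowFit (transp. Input and transp. Output) are listed as an overhead. The slides do not state the exact layouts, so the sketch below assumes one common case: host data stored bead after bead is reordered frame after frame, so that neighboring beads sit next to each other in memory. The function name and the layouts are illustrative assumptions.

```python
def transpose_bead_major(flat, num_beads, num_frames):
    """Convert [bead][frame] order into [frame][bead] order (both flat lists)."""
    out = [0.0] * (num_beads * num_frames)
    for bead in range(num_beads):
        for frame in range(num_frames):
            out[frame * num_beads + bead] = flat[bead * num_frames + frame]
    return out


# Two beads with three frames each, stored bead after bead.
host = [10, 11, 12,   20, 21, 22]
device = transpose_bead_major(host, num_beads=2, num_frames=3)
print(device)
```

Every element is read and written once per transpose, so the cost scales with beads × frames × flows, which suggests why removing the step matters for a memory-bound pipeline.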

Page 22:

Why are further optimizations needed?

•  Current pipeline utilizes GPU (more or less) efficiently during bkgmodel fitting
•  Generating empty and bead traces is a bottleneck
    •  Big chunk of CPU time spent in these computations
    •  Mostly memory bound and a natural step to be performed on GPU as a precursor to fitting
•  Raw data processing is another big compute hog
    •  This pipeline will enable it to be easily streamlined in the new flow
•  Many unnecessary data transformations and memcopies
•  Complex execution model

Page 23:

Pipeline performance

[Figure: Timing S5™ XL 540 (500 flows). Wall clock time in minutes (axis 0 to 250) for the current pipeline and the pipeline optimized with MPS, on K20 and K40 with boost. Data labels as extracted: 212, 192, 207, 161.]

[Figure: GPU Utilization (axis 0% to 120%) for the current pipeline and the pipeline optimized with MPS, on K20 and K40 with boost. Data labels as extracted: 99%, 99%, 76%, 98%.]

Page 24:

Signal Processing Flow

Reading flow data → Raw Data Processing → Regional Parameter Estimation (common to all wells) → Parameter Estimation unique to each well (LM fitting) → Post Fit Processing → Writing signal values

Page 25:

Current Optimization Work

•  Expand scope of GPU implementation
•  Modifications in intermediate data layout
    •  Removed need for additional copies and transposes
•  Changes in spatial and temporal data subdivision
•  Use of Nvidia MPS to hide PCIe transfers
•  Algorithm Changes
    •  This is a huge one.
    •  Daunting gold standard of current pipeline accuracy to overcome
    •  Main focus now

Page 26:

Optimized after 20 flows

•  Per block fixed amount ~270 MB GPU memory
•  Almost no additional host memory
•  Persistent data only copied/generated once on device, no additional transposes.
•  No host side copy overhead (use of MPS to hide PCIe)
•  Fixed max block size, no need for re-allocation
•  No synchronization steps

[Figure: optimized pipeline. Labels: Raw Data Loader Thread (ImgLoader, three shown); GPU Worker; on the GPU: first GPU flow?, Init Persistent Data, Gen Traces, RegionFit, SingleFlowFit, Xtalk, ClonalFilter, ClonalFilter Flow; 1.wells writer (ImgLoader, three shown); GPU memory; host memory; copy.]
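Two ideas from this slide, persistent data that is generated once (the "first GPU flow?" check leading to "Init Persistent Data") and a fixed maximum block size that avoids re-allocation, can be sketched as below. The class and its fields are hypothetical; plain Python lists stand in for device buffers.

```python
class BlockState:
    """Hypothetical per-block state that lives for the whole run."""

    def __init__(self, max_beads):
        # Fixed maximum size: the buffer is allocated once and never resized.
        self.buffer = [0.0] * max_beads
        self.persistent = None
        self.init_count = 0

    def process_flow(self, flow, signals):
        if self.persistent is None:            # "first GPU flow?" in the diagram
            self.persistent = {"first_flow": flow}
            self.init_count += 1
        if len(signals) > len(self.buffer):
            raise ValueError("block exceeds the fixed maximum size")
        self.buffer[:len(signals)] = signals   # reuse the same storage
        return sum(self.buffer[:len(signals)])


state = BlockState(max_beads=4)
totals = [state.process_flow(flow, [1.0, 2.0, 3.0]) for flow in range(20, 23)]
print(totals, state.init_count)
```

Initialization runs exactly once however many flows follow, and the buffer keeps its original length throughout.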

Page 27:

Optimization Summary

•  520™/530™/540™ block level signal processing
    •  Concept of bkgmodel regions internal to the GPU
    •  Easy to experiment with different region sizes
    •  Regions can talk to each other.
•  Streamlined flow from raw data processing to signal processing
    •  Sequential execution of the pipeline steps
    •  Final output to be written to 1.wells
    •  Fewer data copies and reduced memory footprint
    •  Freed up CPU resources
•  Reduced context switches on the GPU
•  Better utilization of PCIe bandwidth

Page 28:

Thank You

NVIDIA, specifically the DevTech Team

Nikolai Sakhkarnykh Jonathan Bentz

Andrew Vandergrift Bob Keating Mark Berger

Kimberley Powell

Our supervisor Eugene Ingerman and

The entire Ion Torrent R&D team

Page 29:

For Research Use Only. Not for use in diagnostic procedures. © 2016 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific and its subsidiaries unless otherwise specified.

NVIDIA, GTX and Tesla are trademarks of Nvidia Corporation. Intel and Xeon are trademarks of Intel Corporation. Altera and Stratix are trademarks of Altera Corporation.