Next Generation Sequencing Workflow and Applications for Mycobacterium tuberculosis
Jamie Posey, PhD, Applied Research Team Lead
June 8, 2015
National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, Division of Tuberculosis Elimination
Next Generation Sequencing (NGS) Workflow
Sample Preparation
Library Preparation
Sequence
Analyze
Sample Preparation for Whole Genome Sequencing (WGS)
Isolation of DNA
• Chemical lysis (CTAB)
• Mechanical lysis (FastPrep-24)
• Purify DNA
Shear genomic DNA
• Physical
• Enzymatic
Library Preparation for Illumina Platforms
Sequence Libraries
https://www.youtube.com/embed/HMyCqWhwB8E?iframe&rel=0&autoplay=1
https://www.youtube.com/watch?v=v8p4ph2MAvI
https://www.youtube.com/watch?v=WYBzbxIfuKs
https://www.youtube.com/watch?t=10&v=L_jAtDSB8kA
Ion Torrent PGM by Life Technologies
PacBio SMRT
Illumina
Genome Assembly
http://www.jigsawplanet.com/?rc=createpuzzle
Bioinformatic Tools
gatc-biotech.com
Simple Variant Call Pipeline
https://www.ebi.ac.uk/training/sites/ebi.ac.uk.training/files/materials/2014/140217_AgriOmics/dan_bolser_snp_calling.pdf
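The core idea of a simple variant call pipeline (align reads to a reference, pile up the bases observed at each position, report positions where the consensus differs from the reference) can be sketched in a few lines of Python. This is a toy illustration only, not the pipeline used in the presentation: the reads are assumed to be already aligned and gap-free, the `(start, sequence)` input format is hypothetical, and `min_depth` and `min_frac` are arbitrary thresholds chosen for the example.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=3, min_frac=0.8):
    """Return SNPs as (1-based position, reference base, alternate base).

    aligned_reads is a list of (start, sequence) pairs, where start is the
    0-based reference position of the first base of the read (an assumed,
    simplified stand-in for a real alignment file).
    """
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little evidence at this position
        base, n = counts.most_common(1)[0]
        if base != reference[pos] and n / depth >= min_frac:
            snps.append((pos + 1, reference[pos], base))
    return snps


reference = "ACGTACGTAC"
reads = [(0, "ACGTA"), (2, "GTATGT"), (3, "TATGTA"), (4, "ATGTAC")]
print(call_snps(reference, reads))
```

Production pipelines add base and mapping quality filters, indel handling, and strand checks that this sketch leaves out.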
Examples of Commercial Products
The companies and products depicted here are not endorsed by CDC
Reference-Guided (Mapped) Assembly
Reference Sequence/Genome
Low sequence coverage
UNMAPPED READS
1. Sequences not present in the reference
2. Plasmids or other extrachromosomal DNA
3. Structural variation/rearrangement
ADVANTAGES: Relatively fast; well-suited to highly conserved genomes
DISADVANTAGES: Issues with high diversity and mobile elements
[Figure: coverage plot (axis: Coverage, 18X to 1X) showing Contig 1 and Contig 2]
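To make the coverage idea concrete, the following Python sketch computes per-position depth from read alignments and reports stretches that fall below a chosen depth. It is an illustration under stated assumptions: alignments are given as hypothetical half-open `(start, end)` intervals on the reference, and the `min_depth` threshold is arbitrary.

```python
def coverage_profile(genome_length, alignments):
    """Per-position read depth; alignments are 0-based half-open (start, end)."""
    depth = [0] * genome_length
    for start, end in alignments:
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    return depth


def low_coverage_regions(depth, min_depth):
    """Return half-open (start, end) intervals where depth < min_depth."""
    regions, start = [], None
    for pos, d in enumerate(depth):
        if d < min_depth and start is None:
            start = pos
        elif d >= min_depth and start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:
        regions.append((start, len(depth)))
    return regions


alignments = [(0, 8), (2, 10), (4, 12), (15, 20)]
depth = coverage_profile(20, alignments)
print(low_coverage_regions(depth, min_depth=2))
```

Positions with zero depth are where a mapped assembly breaks into separate contigs.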
De-Novo Assembly
[Figure, not to scale: reads assembled into Contigs 1 through 7]
ADVANTAGES: Reference-agnostic; assembles all the reads it can into contigs
DISADVANTAGES: Doesn't always get things right (repeat sequences, etc.)
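A greedy overlap merge is one simple way to picture de novo assembly. The Python sketch below is a teaching toy, not the algorithm of any assembler named in the presentation: it repeatedly joins the two sequences with the longest suffix-prefix overlap, and the `min_len` cutoff is an arbitrary assumption.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by best overlap; return the remaining contigs."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)] + [merged]


reads = ["ATGGCGT", "GCGTACC", "TACCTGA", "GGGTTTAA"]
print(greedy_assemble(reads))
```

The read with no sufficient overlap stays as its own contig. A repeat longer than the reads would produce ambiguous overlaps, which is the kind of error the disadvantages line refers to.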
Whole Genome SNP Typing
Reference Sequence/Core Genome
Isolate   Concatenated SNPs
1         ACTAGA
2         ACTAGT
3         TCTACT
NJ tree visualization
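A neighbor-joining tree is built from pairwise distances, and for concatenated SNP profiles the usual distance is the number of differing positions. The Python sketch below computes that distance for the three example profiles on the slide; pairing the labels 1, 2, and 3 with the three strings is an assumption based on the order in which they appear, and the tree-building step itself is not shown.

```python
from itertools import combinations


def snp_distance(a, b):
    """Number of positions at which two equal-length SNP profiles differ."""
    if len(a) != len(b):
        raise ValueError("SNP profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Isolate labels are assumed from the slide layout.
profiles = {"1": "ACTAGA", "2": "ACTAGT", "3": "TCTACT"}

distances = {
    (p, q): snp_distance(profiles[p], profiles[q])
    for p, q in combinations(profiles, 2)
}
for pair, d in distances.items():
    print(pair, d)
```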
Turnaround Time
DNA Prep: 0-21 days
Sequencing: 2-3 days
Analysis: Few hours to days
MOLECULAR EPIDEMIOLOGY
Cluster 1
~ 100 Patients
Cluster 2
Cluster 3
[Figure: isolates labeled 29, 44, 26, 19, 36, 63, 46, 44, each marked with MIRU-VNTR pattern A, B, or C]
Spoligotype: 777776777760601
MIRU-VNTR:
A: 224325143323244234423337
B: 224325153323444234423337
C: 22432515332343-234422333
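The three MIRU-VNTR patterns differ at only a few loci, which is easy to check programmatically. The Python sketch below assumes each character encodes one locus and that `-` marks a locus with no result; both are assumptions about the notation, and the helper name is hypothetical.

```python
def differing_loci(a, b, missing="-"):
    """Count loci whose codes differ, skipping loci missing in either profile."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same number of loci")
    return sum(1 for x, y in zip(a, b) if missing not in (x, y) and x != y)


miru = {
    "A": "224325143323244234423337",
    "B": "224325153323444234423337",
    "C": "22432515332343-234422333",
}

for first, second in (("A", "B"), ("B", "C"), ("A", "C")):
    print(first, second, differing_loci(miru[first], miru[second]))
```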
DRUG RESISTANCE
Applications of WGS for Drug Resistance
Surveillance
Clinical management
Identify new mechanisms
[Figure: isolates 29, 44, 26, 19, 36, 63, 46, 44 with spoligotype 777776777760601 and MIRU-VNTR patterns A, B, C, annotated with the following SNPs]
SNP          Locus
C31998G      58 bp upstream of Rv0029
C51403T      Rv0047c
C118832T     Rv0102
C247984T     Rv0207c
T362962C     PE_PGRS5
C477188G     Rv0398c
C480678A     mmpL1
A649974G     ubiE
C761147T     rpoB
G765719C     rpoC
C799139G     Rv0698
G905686A     Rv0811c
C926861A     PE_PGRS13
C1023436G    betP
T1093459G    PE_PGRS17
G1114491A    Rv0997
C1208858T    Rv1084
C1231660T    Rv1104
A1246730C    bpoB
C1266797T    Rv1139c
C1309314T    fdxC
A1320356C    papA3
G1353888A    tagA
G1421085A    Rv1272c
C1663856T    acn
G1674048A    fabG1
T1877958C    pks7
C1888075G    pks9
T2087076C    171 bp upstream of Rv1838c
C2372126T    Rv2112c
G2402463C    Rv2142c
T2614547C    46 bp upstream of Rv2339
G2751471A    Rv2449c
G2958534A    Rv2631
G3126489A    Rv2819c
G3137406A    echA16
C3213150T    lepB
A3377940C    PPE46
A3380380C    PPE47
A3416480G    Rv3055
C3455434T    Rv3088
G3608047C    Rv3230c
C3764285T    PPE56
G3765280A    PPE56
C3777772A    spoU
A4026439G    5 bp upstream of Rv3585
C4037284A    PE_PGRS59
T4072484A    Rv3633
G4084482C    topA
A4314271G    bfrB
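SNP labels of the form reference base, genome position, alternate base (for example `C761147T`) are straightforward to parse. The Python sketch below splits a label into its parts and classifies the change as a transition or transversion; the function names are hypothetical and the classification is added here for illustration, not taken from the slides.

```python
import re

SNP_PATTERN = re.compile(r"^([ACGT])(\d+)([ACGT])$")
PURINES = {"A", "G"}


def parse_snp(label):
    """Split a label such as 'C761147T' into (ref, position, alt)."""
    match = SNP_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a SNP label: {label!r}")
    ref, pos, alt = match.groups()
    if ref == alt:
        raise ValueError(f"reference and alternate base are identical: {label!r}")
    return ref, int(pos), alt


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (ref in PURINES) == (alt in PURINES)


for label in ("C761147T", "G765719C", "G1674048A"):
    ref, pos, alt = parse_snp(label)
    kind = "transition" if is_transition(ref, alt) else "transversion"
    print(label, pos, kind)
```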
Identify New Mechanisms of Resistance
fabG1 (mabA), inhA: L203L
Ando H, Miyoshi-Akiyama T, Watanabe S, Kirikae T. A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis.
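A silent (synonymous) mutation changes a codon without changing the encoded amino acid, which is why L203L is written with leucine on both sides. The Python sketch below checks this with the standard genetic code. The codons CTG and CTA are illustrative leucine codons chosen for the example; they are not taken from the cited paper.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order ("*" marks stop codons).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon_a, codon_b):
    """True if both codons encode the same amino acid."""
    return CODON_TABLE[codon_a.upper()] == CODON_TABLE[codon_b.upper()]


# Hypothetical example codons: both encode leucine (L).
print(CODON_TABLE["CTG"], CODON_TABLE["CTA"], is_synonymous("CTG", "CTA"))
```

A synonymous change can still have an effect, for example by altering regulatory sequence embedded in the coding region, so it should not be discarded automatically when screening for resistance mechanisms.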
METAGENOMICS
Starting Material
Sputum Subculture
Dx Culture
AMD Metagenomics Project Overview
Clinical specimen
Sample extraction/processing
RNA/DNA content
High-throughput sequencing
Sequence
Subtractive host sequence database (Clutter Mitigation)
Comparative pathogen sequence database
Identification & characterization
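Clutter mitigation means removing host-derived reads before looking for pathogen sequence. The Python sketch below shows the idea with a simple shared k-mer filter. This is a swapped-in toy technique, not the project's method (the slides later mention mapping reads with BWA); the function names, `k`, and `max_shared` are hypothetical.

```python
def kmers(seq, k):
    """Set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def subtract_host(reads, host_sequences, k=8, max_shared=0.5):
    """Keep reads whose fraction of k-mers found in the host is below max_shared."""
    host_kmers = set().union(*(kmers(s, k) for s in host_sequences))
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k: cannot be classified
        shared = len(read_kmers & host_kmers) / len(read_kmers)
        if shared < max_shared:
            kept.append(read)
    return kept


host = "ACGTACGTTAGCTAGCTAGGATCCGATCGATTACG"  # made-up host sequence
host_read = host[5:25]
other_read = "TTTTGGGGCCCCAAAATTTTGG"          # made-up non-host read
print(subtract_host([host_read, other_read], [host]))
```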
Clinical Specimen Sets
Human whole blood (EDTA), 2 liters
• Acquired through normal channels
Human sputum, 2 liters
• Multiple donors
• Took over 6 months to accumulate
Human stool, 2 liters
• Obtained from single donor
• “Genome diet”
Test Sputum for Mtb
Metagenomics
Sputum Background Sample: Raw Read Processing
Quality filter failed (Q30 trim-filter): 35,819,020 reads (10%)
Human mapped (BWA, defaults): 336,491,752 reads (89%)
TB mapped (BWA, defaults): 35,886 reads (0%)
Other: 4,844,864 reads (1%)
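The share of reads in each category follows directly from the counts. The Python sketch below recomputes the fractions; pairing each count with a category is an assumption based on the order in which the numbers and labels appear on the slide.

```python
# Category-to-count pairing is assumed from the slide's ordering.
read_counts = {
    "Quality filter failed": 35_819_020,
    "Human mapped": 336_491_752,
    "TB mapped": 35_886,
    "Other": 4_844_864,
}


def read_percentages(counts):
    """Percentage of the total read count that falls in each category."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


percentages = read_percentages(read_counts)
for name, pct in percentages.items():
    print(f"{name}: {pct:.3f}%")
```

Under that pairing, reads mapping to M. tuberculosis are a tiny fraction of the total, which is the motivation for the enrichment step described next.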
MTb gDNA + hu gDNA at different ratios
MTb/hu DNA Ratio (%): 20, 1, 0.1, 0.01, 0.001, 0.0001
MTb gDNA + Sputum gDNA at different ratios
MTb/Sputum Ratio (%): 20, 10, 1, 0.1, 0.01, 0.001
Transposon-Based Library Prep
Sequence Capture
MiSeq Sequencing and Analysis
All mixtures 25 ng uL-1
Project Workflow
Agilent SureSelect
Mtb SureSelect Enrichment
01 1 20
20 1 10
2654689 117325
10133910 11929612 67210
001
11052
(825 45x) (994 77x)
(989 251x) (989 270x) (72 3x)
Recent Publication
Conclusions
Some infrastructure is required for NGS
• Laboratory work
• Bioinformatics
• IT and data management
Developed SOPs and optimized analysis
• Surveillance
• Drug resistance
Work in progress
• Universal genotyping using WGS
• Metagenomics
• Fund at least two state laboratories for WGS
Acknowledgements
Laboratory Branch, Applied Research Team: Lauren Cowan, Melisa Willby, Paige Gupton, Kartee Johnson
Core Facility: Mike Frace, Mili Sheth, Jamie Davis
AMD Metagenomics Team: Chris Hopkins, Eishita Tyagi, Scott Burns
Next Generation Sequencing (NGS) Workflow
Sample Preparation
Library Preparation
Sequence Analyze
Sample Preparation for Whole Genome Sequencing (WGS)
Isolation of DNA Chemical lysis (CTAB) Mechanical lysis (FastPrep-24) Purify DNA
Shear genomic DNA
Physical Enzymatic
Library Preparation for Illumina Platforms
Sequence Libraries
httpswwwyoutubecomembedHMyCqWhwB8Eiframeamprel=0ampautoplay=1
httpswwwyoutubecomwatchv=v8p4ph2MAvI
httpswwwyoutubecomwatchv=WYBzbxIfuKs
httpswwwyoutubecomwatcht=10ampv=L_jAtDSB8kA
Ion Torrent PGM by Life Technologies
PacBio SMRT
Illumina
Genome Assembly
httpwwwjigsawplanetcomrc=createpuzzle
Bioinformatic Tools
gatcbiotechcom
Simple Variant Call Pipeline
httpswwwebiacuktrainingsitesebiacuktrainingfilesmaterials2014140217_AgriOmicsdan_bolser_snp_callingpdf
Examples of Commercial Products
The companies and products depicted here are not endorsed by CDC
Reference-Guided (Mapped) Assembly
Reference SequenceGenome
Low sequence coverage
UNMAPPED READS 1 Sequences not present in the reference 2 Plasmids or other extrachromosomal 3 DNA Structural VariationRearrangement
ADVANTAGES Relatively fast well-suited to highly-conserved genomes DISADVANTAGES Issues with high diversity mobile elements
Cov
erag
e
18X
1X
Contig 1 Contig 2
De-Novo Assembly
NOT TO SCALE
Contig 1 Contig 2 Contig 3
Contig 4 Contig 5 Contig 6 Contig 7
ADVANTAGES Reference agnostic assembles all the reads it can into contigs DISADVANTAGES Doesnrsquot always get things right Repeat sequences etc
Whole Genome SNP Typing
A
A
T
C
C
C
T
T
T
A
A
A
G
G
C
A
T
T
Reference SequenceCore Genome
1
2
3
ACTAGA
ACTAGT TCTACT
NJ tree visualization
Turnaround Time
DNA Prep Sequencing Analysis
0 ndash 21 days 2 ndash 3 days Few hours to days
MOLECULAR EPIDEMIOLOGY
Cluster 1
~ 100 Patients
Cluster 2
Cluster 3
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
DRUG RESISTANCE
Applications of WGS for Drug Resistance
Surveillance
Clinical management
Identify new mechanisms
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
C G C31998G58 bp upstream Rv0029
C TC1663856T
acn
C T C51403T Rv0047c G AG1674048A
fabG1
C T C118832T Rv0102 T CT1877958C
pks7
C T C247984T Rv0207c C GC1888075G
pks9
T C T362962CPE_PGRS5
T CT2087076C
171 bp upstream of Rv1838c
C G C477188G Rv0398c C TC2372126T
Rv2112c
C A C480678A mmpL1 G CG2402463C
Rv2142c
A GA649974G
ubiE T CT2614547C
46 bp upstream Rv2339
C T C761147T rpoB G AG2751471A
Rv2449c
G C G765719C rpoC G AG2958534A
Rv2631
C G C799139G Rv0698 G AG3126489A
Rv2819c
G AG905686A
Rv0811c G AG3137406A
echA16
C A C926861APE_PGRS13
C TC3213150T
lepB
C GC1023436G
betP A CA3377940C
PPE46
T GT1093459G
PE_PGRS17
A CA3380380C
PPE47
G AG1114491A
Rv0997 A GA3416480G
Rv3055
C TC1208858T
Rv1084 C TC3455434T
Rv3088
C TC1231660T
Rv1104 G CG3608047C
Rv3230c
A CA1246730C
bpoB C TC3764285T
PPE56
C TC1266797T
Rv1139c G AG3765280A
PPE56
C TC1309314T
fdxC C AC3777772A
spoU
A CA1320356C
papA3 A GA4026439G
5 bp upstream Rv3585
G AG1353888A
tagA C AC4037284A
PE_PGRS59
G AG1421085A
Rv1272c T AT4072484A
Rv3633
G CG4084482C
topA
A GA4314271G
bfrB
Identify New Mechanisms of Resistance
fabG1 mabA inhA
L203L A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis Ando H Miyoshi-Akiyama T Watanabe S Kirikae T
METAGENOMICS
Starting Material
Sputum Subculture
Dx Culture
AMD Metagenomics Project Overview
Clinical specimen
Comparative pathogen sequence database RNADNA content sequence
High throughput sequencing Sample
extraction processing
Subtractive host
sequence database
Identification amp characterization
Clutter Mitigation
Clinical Specimen Sets
Human whole blood (EDTA) 2 liters bull Acquired through normal channels
Human sputum 2 liters bull Multiple donors bull Took over 6 months to accumulate
Human stool 2 liters bull Obtained from single donor bull ldquoGenome dietrdquo
Test Sputum for Mtb
Metagenomics
35819020 10
336491752 89
35886 0
4844864 1
Sputum Background Sample Raw Read processing
Quality filter failed (Q30 trim-filter)
Human mapped (BWA -defaults)
TB Mapped (BWA -defaults)
Other
MTb gDNA + hu gDNA at different ratios MTb gDNA + Sputum gDNA at different ratios
MTbhuDNA Ratio () MTbSputum Ratio ()
20 1 01 001 0001 00001 20 10 1 01 001 0001
Transposon-Based Library Prep
Sequence Capture
MiSeq Sequencing and Analysis
All mixtures 25 ng uL-1
Project Workflow
Agilent SureSelect
- (
Mtb SureSelect Enrichment
01 1 20
20 1 10
2654689 117325
10133910 11929612 67210
001
11052
(825 45x) (994 77x)
(989 251x) (989 270x) (72 3x)
Recent Publication
Conclusions
Some infrastructure is required for NGS Laboratory work Bioinformatics IT and data management
Developed SOPs and optimized analysis
Surveillance Drug resistance
Work in progress
Universal genotyping using WGS Metagenomics Fund at least two state laboratories for WGS
Acknowledgements
Laboratory Branch Applied Research Team Lauren Cowan Melisa Willby Paige Gupton Kartee Johnson
Core Facility
Mike Frace Mili Sheth Jamie Davis
AMD Metagenomics Team
Chris Hopkins Eishita Tyagi Scott Burns
Sample Preparation for Whole Genome Sequencing (WGS)
Isolation of DNA Chemical lysis (CTAB) Mechanical lysis (FastPrep-24) Purify DNA
Shear genomic DNA
Physical Enzymatic
Library Preparation for Illumina Platforms
Sequence Libraries
httpswwwyoutubecomembedHMyCqWhwB8Eiframeamprel=0ampautoplay=1
httpswwwyoutubecomwatchv=v8p4ph2MAvI
httpswwwyoutubecomwatchv=WYBzbxIfuKs
httpswwwyoutubecomwatcht=10ampv=L_jAtDSB8kA
Ion Torrent PGM by Life Technologies
PacBio SMRT
Illumina
Genome Assembly
httpwwwjigsawplanetcomrc=createpuzzle
Bioinformatic Tools
gatcbiotechcom
Simple Variant Call Pipeline
httpswwwebiacuktrainingsitesebiacuktrainingfilesmaterials2014140217_AgriOmicsdan_bolser_snp_callingpdf
Examples of Commercial Products
The companies and products depicted here are not endorsed by CDC
Reference-Guided (Mapped) Assembly
Reference SequenceGenome
Low sequence coverage
UNMAPPED READS 1 Sequences not present in the reference 2 Plasmids or other extrachromosomal 3 DNA Structural VariationRearrangement
ADVANTAGES Relatively fast well-suited to highly-conserved genomes DISADVANTAGES Issues with high diversity mobile elements
Cov
erag
e
18X
1X
Contig 1 Contig 2
De-Novo Assembly
NOT TO SCALE
Contig 1 Contig 2 Contig 3
Contig 4 Contig 5 Contig 6 Contig 7
ADVANTAGES Reference agnostic assembles all the reads it can into contigs DISADVANTAGES Doesnrsquot always get things right Repeat sequences etc
Whole Genome SNP Typing
A
A
T
C
C
C
T
T
T
A
A
A
G
G
C
A
T
T
Reference SequenceCore Genome
1
2
3
ACTAGA
ACTAGT TCTACT
NJ tree visualization
Turnaround Time
DNA Prep Sequencing Analysis
0 ndash 21 days 2 ndash 3 days Few hours to days
MOLECULAR EPIDEMIOLOGY
Cluster 1
~ 100 Patients
Cluster 2
Cluster 3
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
DRUG RESISTANCE
Applications of WGS for Drug Resistance
Surveillance
Clinical management
Identify new mechanisms
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
C G C31998G58 bp upstream Rv0029
C TC1663856T
acn
C T C51403T Rv0047c G AG1674048A
fabG1
C T C118832T Rv0102 T CT1877958C
pks7
C T C247984T Rv0207c C GC1888075G
pks9
T C T362962CPE_PGRS5
T CT2087076C
171 bp upstream of Rv1838c
C G C477188G Rv0398c C TC2372126T
Rv2112c
C A C480678A mmpL1 G CG2402463C
Rv2142c
A GA649974G
ubiE T CT2614547C
46 bp upstream Rv2339
C T C761147T rpoB G AG2751471A
Rv2449c
G C G765719C rpoC G AG2958534A
Rv2631
C G C799139G Rv0698 G AG3126489A
Rv2819c
G AG905686A
Rv0811c G AG3137406A
echA16
C A C926861APE_PGRS13
C TC3213150T
lepB
C GC1023436G
betP A CA3377940C
PPE46
T GT1093459G
PE_PGRS17
A CA3380380C
PPE47
G AG1114491A
Rv0997 A GA3416480G
Rv3055
C TC1208858T
Rv1084 C TC3455434T
Rv3088
C TC1231660T
Rv1104 G CG3608047C
Rv3230c
A CA1246730C
bpoB C TC3764285T
PPE56
C TC1266797T
Rv1139c G AG3765280A
PPE56
C TC1309314T
fdxC C AC3777772A
spoU
A CA1320356C
papA3 A GA4026439G
5 bp upstream Rv3585
G AG1353888A
tagA C AC4037284A
PE_PGRS59
G AG1421085A
Rv1272c T AT4072484A
Rv3633
G CG4084482C
topA
A GA4314271G
bfrB
Identify New Mechanisms of Resistance
fabG1 mabA inhA
L203L A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis Ando H Miyoshi-Akiyama T Watanabe S Kirikae T
METAGENOMICS
Starting Material
Sputum Subculture
Dx Culture
AMD Metagenomics Project Overview
Clinical specimen
Comparative pathogen sequence database RNADNA content sequence
High throughput sequencing Sample
extraction processing
Subtractive host
sequence database
Identification amp characterization
Clutter Mitigation
Clinical Specimen Sets
Human whole blood (EDTA) 2 liters bull Acquired through normal channels
Human sputum 2 liters bull Multiple donors bull Took over 6 months to accumulate
Human stool 2 liters bull Obtained from single donor bull ldquoGenome dietrdquo
Test Sputum for Mtb
Metagenomics
35819020 10
336491752 89
35886 0
4844864 1
Sputum Background Sample Raw Read processing
Quality filter failed (Q30 trim-filter)
Human mapped (BWA -defaults)
TB Mapped (BWA -defaults)
Other
MTb gDNA + hu gDNA at different ratios MTb gDNA + Sputum gDNA at different ratios
MTbhuDNA Ratio () MTbSputum Ratio ()
20 1 01 001 0001 00001 20 10 1 01 001 0001
Transposon-Based Library Prep
Sequence Capture
MiSeq Sequencing and Analysis
All mixtures 25 ng uL-1
Project Workflow
Agilent SureSelect
- (
Mtb SureSelect Enrichment
01 1 20
20 1 10
2654689 117325
10133910 11929612 67210
001
11052
(825 45x) (994 77x)
(989 251x) (989 270x) (72 3x)
Recent Publication
Conclusions
Some infrastructure is required for NGS Laboratory work Bioinformatics IT and data management
Developed SOPs and optimized analysis
Surveillance Drug resistance
Work in progress
Universal genotyping using WGS Metagenomics Fund at least two state laboratories for WGS
Acknowledgements
Laboratory Branch Applied Research Team Lauren Cowan Melisa Willby Paige Gupton Kartee Johnson
Core Facility
Mike Frace Mili Sheth Jamie Davis
AMD Metagenomics Team
Chris Hopkins Eishita Tyagi Scott Burns
Library Preparation for Illumina Platforms
Sequence Libraries
httpswwwyoutubecomembedHMyCqWhwB8Eiframeamprel=0ampautoplay=1
httpswwwyoutubecomwatchv=v8p4ph2MAvI
httpswwwyoutubecomwatchv=WYBzbxIfuKs
httpswwwyoutubecomwatcht=10ampv=L_jAtDSB8kA
Ion Torrent PGM by Life Technologies
PacBio SMRT
Illumina
Genome Assembly
httpwwwjigsawplanetcomrc=createpuzzle
Bioinformatic Tools
gatcbiotechcom
Simple Variant Call Pipeline
httpswwwebiacuktrainingsitesebiacuktrainingfilesmaterials2014140217_AgriOmicsdan_bolser_snp_callingpdf
Examples of Commercial Products
The companies and products depicted here are not endorsed by CDC
Reference-Guided (Mapped) Assembly
Reference SequenceGenome
Low sequence coverage
UNMAPPED READS 1 Sequences not present in the reference 2 Plasmids or other extrachromosomal 3 DNA Structural VariationRearrangement
ADVANTAGES Relatively fast well-suited to highly-conserved genomes DISADVANTAGES Issues with high diversity mobile elements
Cov
erag
e
18X
1X
Contig 1 Contig 2
De-Novo Assembly
NOT TO SCALE
Contig 1 Contig 2 Contig 3
Contig 4 Contig 5 Contig 6 Contig 7
ADVANTAGES Reference agnostic assembles all the reads it can into contigs DISADVANTAGES Doesnrsquot always get things right Repeat sequences etc
Whole Genome SNP Typing
A
A
T
C
C
C
T
T
T
A
A
A
G
G
C
A
T
T
Reference SequenceCore Genome
1
2
3
ACTAGA
ACTAGT TCTACT
NJ tree visualization
Turnaround Time
DNA Prep Sequencing Analysis
0 ndash 21 days 2 ndash 3 days Few hours to days
MOLECULAR EPIDEMIOLOGY
Cluster 1
~ 100 Patients
Cluster 2
Cluster 3
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
DRUG RESISTANCE
Applications of WGS for Drug Resistance
Surveillance
Clinical management
Identify new mechanisms
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
C G C31998G58 bp upstream Rv0029
C TC1663856T
acn
C T C51403T Rv0047c G AG1674048A
fabG1
C T C118832T Rv0102 T CT1877958C
pks7
C T C247984T Rv0207c C GC1888075G
pks9
T C T362962CPE_PGRS5
T CT2087076C
171 bp upstream of Rv1838c
C G C477188G Rv0398c C TC2372126T
Rv2112c
C A C480678A mmpL1 G CG2402463C
Rv2142c
A GA649974G
ubiE T CT2614547C
46 bp upstream Rv2339
C T C761147T rpoB G AG2751471A
Rv2449c
G C G765719C rpoC G AG2958534A
Rv2631
C G C799139G Rv0698 G AG3126489A
Rv2819c
G AG905686A
Rv0811c G AG3137406A
echA16
C A C926861APE_PGRS13
C TC3213150T
lepB
C GC1023436G
betP A CA3377940C
PPE46
T GT1093459G
PE_PGRS17
A CA3380380C
PPE47
G AG1114491A
Rv0997 A GA3416480G
Rv3055
C TC1208858T
Rv1084 C TC3455434T
Rv3088
C TC1231660T
Rv1104 G CG3608047C
Rv3230c
A CA1246730C
bpoB C TC3764285T
PPE56
C TC1266797T
Rv1139c G AG3765280A
PPE56
C TC1309314T
fdxC C AC3777772A
spoU
A CA1320356C
papA3 A GA4026439G
5 bp upstream Rv3585
G AG1353888A
tagA C AC4037284A
PE_PGRS59
G AG1421085A
Rv1272c T AT4072484A
Rv3633
G CG4084482C
topA
A GA4314271G
bfrB
Identify New Mechanisms of Resistance
fabG1 mabA inhA
L203L A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis Ando H Miyoshi-Akiyama T Watanabe S Kirikae T
METAGENOMICS
Starting Material
Sputum Subculture
Dx Culture
AMD Metagenomics Project Overview
Clinical specimen
Comparative pathogen sequence database RNADNA content sequence
High throughput sequencing Sample
extraction processing
Subtractive host
sequence database
Identification amp characterization
Clutter Mitigation
Clinical Specimen Sets
Human whole blood (EDTA) 2 liters bull Acquired through normal channels
Human sputum 2 liters bull Multiple donors bull Took over 6 months to accumulate
Human stool 2 liters bull Obtained from single donor bull ldquoGenome dietrdquo
Test Sputum for Mtb
Metagenomics
35819020 10
336491752 89
35886 0
4844864 1
Sputum Background Sample Raw Read processing
Quality filter failed (Q30 trim-filter)
Human mapped (BWA -defaults)
TB Mapped (BWA -defaults)
Other
MTb gDNA + hu gDNA at different ratios MTb gDNA + Sputum gDNA at different ratios
MTbhuDNA Ratio () MTbSputum Ratio ()
20 1 01 001 0001 00001 20 10 1 01 001 0001
Transposon-Based Library Prep
Sequence Capture
MiSeq Sequencing and Analysis
All mixtures 25 ng uL-1
Project Workflow
Agilent SureSelect
- (
Mtb SureSelect Enrichment
01 1 20
20 1 10
2654689 117325
10133910 11929612 67210
001
11052
(825 45x) (994 77x)
(989 251x) (989 270x) (72 3x)
Recent Publication
Conclusions
Some infrastructure is required for NGS Laboratory work Bioinformatics IT and data management
Developed SOPs and optimized analysis
Surveillance Drug resistance
Work in progress
Universal genotyping using WGS Metagenomics Fund at least two state laboratories for WGS
Acknowledgements
Laboratory Branch Applied Research Team Lauren Cowan Melisa Willby Paige Gupton Kartee Johnson
Core Facility
Mike Frace Mili Sheth Jamie Davis
AMD Metagenomics Team
Chris Hopkins Eishita Tyagi Scott Burns
Sequence Libraries
httpswwwyoutubecomembedHMyCqWhwB8Eiframeamprel=0ampautoplay=1
httpswwwyoutubecomwatchv=v8p4ph2MAvI
httpswwwyoutubecomwatchv=WYBzbxIfuKs
httpswwwyoutubecomwatcht=10ampv=L_jAtDSB8kA
Ion Torrent PGM by Life Technologies
PacBio SMRT
Illumina
Genome Assembly
httpwwwjigsawplanetcomrc=createpuzzle
Bioinformatic Tools
gatcbiotechcom
Simple Variant Call Pipeline
httpswwwebiacuktrainingsitesebiacuktrainingfilesmaterials2014140217_AgriOmicsdan_bolser_snp_callingpdf
Examples of Commercial Products
The companies and products depicted here are not endorsed by CDC
Reference-Guided (Mapped) Assembly
Reference SequenceGenome
Low sequence coverage
UNMAPPED READS 1 Sequences not present in the reference 2 Plasmids or other extrachromosomal 3 DNA Structural VariationRearrangement
ADVANTAGES Relatively fast well-suited to highly-conserved genomes DISADVANTAGES Issues with high diversity mobile elements
Cov
erag
e
18X
1X
Contig 1 Contig 2
De-Novo Assembly
NOT TO SCALE
Contig 1 Contig 2 Contig 3
Contig 4 Contig 5 Contig 6 Contig 7
ADVANTAGES Reference agnostic assembles all the reads it can into contigs DISADVANTAGES Doesnrsquot always get things right Repeat sequences etc
Whole Genome SNP Typing
A
A
T
C
C
C
T
T
T
A
A
A
G
G
C
A
T
T
Reference SequenceCore Genome
1
2
3
ACTAGA
ACTAGT TCTACT
NJ tree visualization
Turnaround Time
DNA Prep Sequencing Analysis
0 ndash 21 days 2 ndash 3 days Few hours to days
MOLECULAR EPIDEMIOLOGY
Cluster 1
~ 100 Patients
Cluster 2
Cluster 3
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
DRUG RESISTANCE
Applications of WGS for Drug Resistance
Surveillance
Clinical management
Identify new mechanisms
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
C G C31998G58 bp upstream Rv0029
C TC1663856T
acn
C T C51403T Rv0047c G AG1674048A
fabG1
C T C118832T Rv0102 T CT1877958C
pks7
C T C247984T Rv0207c C GC1888075G
pks9
T C T362962CPE_PGRS5
T CT2087076C
171 bp upstream of Rv1838c
C G C477188G Rv0398c C TC2372126T
Rv2112c
C A C480678A mmpL1 G CG2402463C
Rv2142c
A GA649974G
ubiE T CT2614547C
46 bp upstream Rv2339
C T C761147T rpoB G AG2751471A
Rv2449c
G C G765719C rpoC G AG2958534A
Rv2631
C G C799139G Rv0698 G AG3126489A
Rv2819c
G AG905686A
Rv0811c G AG3137406A
echA16
C A C926861APE_PGRS13
C TC3213150T
lepB
C GC1023436G
betP A CA3377940C
PPE46
T GT1093459G
PE_PGRS17
A CA3380380C
PPE47
G AG1114491A
Rv0997 A GA3416480G
Rv3055
C TC1208858T
Rv1084 C TC3455434T
Rv3088
C TC1231660T
Rv1104 G CG3608047C
Rv3230c
A CA1246730C
bpoB C TC3764285T
PPE56
C TC1266797T
Rv1139c G AG3765280A
PPE56
C TC1309314T
fdxC C AC3777772A
spoU
A CA1320356C
papA3 A GA4026439G
5 bp upstream Rv3585
G AG1353888A
tagA C AC4037284A
PE_PGRS59
G AG1421085A
Rv1272c T AT4072484A
Rv3633
G CG4084482C
topA
A GA4314271G
bfrB
Identify New Mechanisms of Resistance
fabG1 mabA inhA
L203L A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis Ando H Miyoshi-Akiyama T Watanabe S Kirikae T
METAGENOMICS
Starting Material
Sputum Subculture
Dx Culture
AMD Metagenomics Project Overview
Clinical specimen
Comparative pathogen sequence database RNADNA content sequence
High throughput sequencing Sample
extraction processing
Subtractive host
sequence database
Identification amp characterization
Clutter Mitigation
Clinical Specimen Sets
Human whole blood (EDTA) 2 liters bull Acquired through normal channels
Human sputum 2 liters bull Multiple donors bull Took over 6 months to accumulate
Human stool 2 liters bull Obtained from single donor bull ldquoGenome dietrdquo
Test Sputum for Mtb
Metagenomics
35819020 10
336491752 89
35886 0
4844864 1
Sputum Background Sample Raw Read processing
Quality filter failed (Q30 trim-filter)
Human mapped (BWA -defaults)
TB Mapped (BWA -defaults)
Other
MTb gDNA + hu gDNA at different ratios MTb gDNA + Sputum gDNA at different ratios
MTbhuDNA Ratio () MTbSputum Ratio ()
20 1 01 001 0001 00001 20 10 1 01 001 0001
Transposon-Based Library Prep
Sequence Capture
MiSeq Sequencing and Analysis
All mixtures 25 ng uL-1
Project Workflow
Agilent SureSelect
- (
Mtb SureSelect Enrichment
01 1 20
20 1 10
2654689 117325
10133910 11929612 67210
001
11052
(825 45x) (994 77x)
(989 251x) (989 270x) (72 3x)
Recent Publication
Conclusions
Some infrastructure is required for NGS Laboratory work Bioinformatics IT and data management
Developed SOPs and optimized analysis
Surveillance Drug resistance
Work in progress
Universal genotyping using WGS Metagenomics Fund at least two state laboratories for WGS
Acknowledgements
Laboratory Branch Applied Research Team Lauren Cowan Melisa Willby Paige Gupton Kartee Johnson
Core Facility
Mike Frace Mili Sheth Jamie Davis
AMD Metagenomics Team
Chris Hopkins Eishita Tyagi Scott Burns
httpswwwyoutubecomembedHMyCqWhwB8Eiframeamprel=0ampautoplay=1
httpswwwyoutubecomwatchv=v8p4ph2MAvI
httpswwwyoutubecomwatchv=WYBzbxIfuKs
httpswwwyoutubecomwatcht=10ampv=L_jAtDSB8kA
Ion Torrent PGM by Life Technologies
PacBio SMRT
Illumina
Genome Assembly
httpwwwjigsawplanetcomrc=createpuzzle
Bioinformatic Tools
gatcbiotechcom
Simple Variant Call Pipeline
httpswwwebiacuktrainingsitesebiacuktrainingfilesmaterials2014140217_AgriOmicsdan_bolser_snp_callingpdf
Examples of Commercial Products
The companies and products depicted here are not endorsed by CDC
Reference-Guided (Mapped) Assembly
Reference SequenceGenome
Low sequence coverage
UNMAPPED READS 1 Sequences not present in the reference 2 Plasmids or other extrachromosomal 3 DNA Structural VariationRearrangement
ADVANTAGES Relatively fast well-suited to highly-conserved genomes DISADVANTAGES Issues with high diversity mobile elements
Cov
erag
e
18X
1X
Contig 1 Contig 2
De-Novo Assembly
NOT TO SCALE
Contig 1 Contig 2 Contig 3
Contig 4 Contig 5 Contig 6 Contig 7
ADVANTAGES Reference agnostic assembles all the reads it can into contigs DISADVANTAGES Doesnrsquot always get things right Repeat sequences etc
Whole Genome SNP Typing
A
A
T
C
C
C
T
T
T
A
A
A
G
G
C
A
T
T
Reference SequenceCore Genome
1
2
3
ACTAGA
ACTAGT TCTACT
NJ tree visualization
Turnaround Time
DNA Prep Sequencing Analysis
0 ndash 21 days 2 ndash 3 days Few hours to days
MOLECULAR EPIDEMIOLOGY
Cluster 1
~ 100 Patients
Cluster 2
Cluster 3
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
DRUG RESISTANCE
Applications of WGS for Drug Resistance
Surveillance
Clinical management
Identify new mechanisms
29 44
26
19
36 63
46 44
Spoligotype 777776777760601 MIRU-VNTR A224325143323244234423337 B224325153323444234423337 C22432515332343-234422333
B B
B B
B
B
A
C
C G C31998G58 bp upstream Rv0029
C TC1663856T
acn
C T C51403T Rv0047c G AG1674048A
fabG1
C T C118832T Rv0102 T CT1877958C
pks7
C T C247984T Rv0207c C GC1888075G
pks9
T C T362962CPE_PGRS5
T CT2087076C
171 bp upstream of Rv1838c
C G C477188G Rv0398c C TC2372126T
Rv2112c
C A C480678A mmpL1 G CG2402463C
Rv2142c
A GA649974G
ubiE T CT2614547C
46 bp upstream Rv2339
C T C761147T rpoB G AG2751471A
Rv2449c
G C G765719C rpoC G AG2958534A
Rv2631
C G C799139G Rv0698 G AG3126489A
Rv2819c
G AG905686A
Rv0811c G AG3137406A
echA16
C A C926861APE_PGRS13
C TC3213150T
lepB
C GC1023436G
betP A CA3377940C
PPE46
T GT1093459G
PE_PGRS17
A CA3380380C
PPE47
G AG1114491A
Rv0997 A GA3416480G
Rv3055
C TC1208858T
Rv1084 C TC3455434T
Rv3088
C TC1231660T
Rv1104 G CG3608047C
Rv3230c
A CA1246730C
bpoB C TC3764285T
PPE56
C TC1266797T
Rv1139c G AG3765280A
PPE56
C TC1309314T
fdxC C AC3777772A
spoU
A CA1320356C
papA3 A GA4026439G
5 bp upstream Rv3585
G AG1353888A
tagA C AC4037284A
PE_PGRS59
G AG1421085A
Rv1272c T AT4072484A
Rv3633
G CG4084482C
topA
A GA4314271G
bfrB
Identify New Mechanisms of Resistance
fabG1 mabA inhA
L203L A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis Ando H Miyoshi-Akiyama T Watanabe S Kirikae T
METAGENOMICS
Starting Material
Sputum Subculture
Dx Culture
AMD Metagenomics Project Overview
Clinical specimen
Comparative pathogen sequence database RNADNA content sequence
High throughput sequencing Sample
extraction processing
Subtractive host
sequence database
Identification amp characterization
Clutter Mitigation
Clinical Specimen Sets
Human whole blood (EDTA) 2 liters bull Acquired through normal channels
Human sputum 2 liters bull Multiple donors bull Took over 6 months to accumulate
Human stool 2 liters bull Obtained from single donor bull ldquoGenome dietrdquo
Test Sputum for Mtb
Metagenomics
35819020 10
336491752 89
35886 0
4844864 1
Sputum Background Sample Raw Read processing
Quality filter failed (Q30 trim-filter)
Human mapped (BWA -defaults)
TB Mapped (BWA -defaults)
Other
MTb gDNA + hu gDNA at different ratios MTb gDNA + Sputum gDNA at different ratios
MTbhuDNA Ratio () MTbSputum Ratio ()
20 1 01 001 0001 00001 20 10 1 01 001 0001
Transposon-Based Library Prep
Sequence Capture
MiSeq Sequencing and Analysis
All mixtures 25 ng uL-1
Project Workflow
Agilent SureSelect
- (
Mtb SureSelect Enrichment
01 1 20
20 1 10
2654689 117325
10133910 11929612 67210
001
11052
(825 45x) (994 77x)
(989 251x) (989 270x) (72 3x)
Recent Publication
Conclusions
Some infrastructure is required for NGS Laboratory work Bioinformatics IT and data management
Developed SOPs and optimized analysis
Surveillance Drug resistance
Work in progress
Universal genotyping using WGS Metagenomics Fund at least two state laboratories for WGS
Acknowledgements
Laboratory Branch Applied Research Team Lauren Cowan Melisa Willby Paige Gupton Kartee Johnson
Core Facility
Mike Frace Mili Sheth Jamie Davis
AMD Metagenomics Team
Chris Hopkins Eishita Tyagi Scott Burns
Genome Assembly
httpwwwjigsawplanetcomrc=createpuzzle
Bioinformatic Tools
gatcbiotechcom
Simple Variant Call Pipeline
httpswwwebiacuktrainingsitesebiacuktrainingfilesmaterials2014140217_AgriOmicsdan_bolser_snp_callingpdf
Examples of Commercial Products
The companies and products depicted here are not endorsed by CDC
Reference-Guided (Mapped) Assembly
Reference SequenceGenome
Low sequence coverage
UNMAPPED READS 1 Sequences not present in the reference 2 Plasmids or other extrachromosomal 3 DNA Structural VariationRearrangement
ADVANTAGES Relatively fast well-suited to highly-conserved genomes DISADVANTAGES Issues with high diversity mobile elements
Cov
erag
e
18X
1X
Contig 1 Contig 2
De-Novo Assembly
NOT TO SCALE
Contig 1 Contig 2 Contig 3
Contig 4 Contig 5 Contig 6 Contig 7
ADVANTAGES Reference agnostic assembles all the reads it can into contigs DISADVANTAGES Doesnrsquot always get things right Repeat sequences etc
Whole Genome SNP Typing
A
A
T
C
C
C
T
T
T
A
A
A
G
G
C
A
T
T
Reference SequenceCore Genome
1
2
3
ACTAGA
ACTAGT TCTACT
NJ tree visualization
Turnaround Time
DNA Prep Sequencing Analysis
0 ndash 21 days 2 ndash 3 days Few hours to days
MOLECULAR EPIDEMIOLOGY
Cluster 1
~ 100 Patients
Cluster 2
Cluster 3
[Figure: isolates labeled by MIRU-VNTR pattern A, B, or C; values shown in the figure: 29, 44, 26, 19, 36, 63, 46, 44]
Spoligotype: 777776777760601
MIRU-VNTR:
A  224325143323244234423337
B  224325153323444234423337
C  22432515332343-234422333
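The three MIRU-VNTR patterns on this slide differ at only a few of their 24 loci, which is easy to check with a short sketch. The function name `miru_distance` and the choice to skip loci marked `-` are assumptions made for illustration; they are not taken from the presentation.

```python
# Patterns copied from the slide; each character is the repeat count at one locus.
MIRU = {
    "A": "224325143323244234423337",
    "B": "224325153323444234423337",
    "C": "22432515332343-234422333",
}


def miru_distance(p, q, missing="-"):
    """Count loci whose repeat numbers differ, skipping loci missing in either pattern."""
    if len(p) != len(q):
        raise ValueError("patterns must cover the same number of loci")
    return sum(1 for a, b in zip(p, q) if missing not in (a, b) and a != b)


if __name__ == "__main__":
    # Pairwise comparison of the three patterns.
    for x, y in (("A", "B"), ("A", "C"), ("B", "C")):
        print(x, y, miru_distance(MIRU[x], MIRU[y]))
```

Patterns that differ at so few loci are one reason conventional genotyping may not separate isolates that WGS can.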
DRUG RESISTANCE
Applications of WGS for Drug Resistance
Surveillance
Clinical management
Identify new mechanisms
Ref  Alt  SNP        Gene or locus
C    G    C31998G    58 bp upstream Rv0029
C    T    C51403T    Rv0047c
C    T    C118832T   Rv0102
C    T    C247984T   Rv0207c
T    C    T362962C   PE_PGRS5
C    G    C477188G   Rv0398c
C    A    C480678A   mmpL1
A    G    A649974G   ubiE
C    T    C761147T   rpoB
G    C    G765719C   rpoC
C    G    C799139G   Rv0698
G    A    G905686A   Rv0811c
C    A    C926861A   PE_PGRS13
C    G    C1023436G  betP
T    G    T1093459G  PE_PGRS17
G    A    G1114491A  Rv0997
C    T    C1208858T  Rv1084
C    T    C1231660T  Rv1104
A    C    A1246730C  bpoB
C    T    C1266797T  Rv1139c
C    T    C1309314T  fdxC
A    C    A1320356C  papA3
G    A    G1353888A  tagA
G    A    G1421085A  Rv1272c
C    T    C1663856T  acn
G    A    G1674048A  fabG1
T    C    T1877958C  pks7
C    G    C1888075G  pks9
T    C    T2087076C  171 bp upstream of Rv1838c
C    T    C2372126T  Rv2112c
G    C    G2402463C  Rv2142c
T    C    T2614547C  46 bp upstream Rv2339
G    A    G2751471A  Rv2449c
G    A    G2958534A  Rv2631
G    A    G3126489A  Rv2819c
G    A    G3137406A  echA16
C    T    C3213150T  lepB
A    C    A3377940C  PPE46
A    C    A3380380C  PPE47
A    G    A3416480G  Rv3055
C    T    C3455434T  Rv3088
G    C    G3608047C  Rv3230c
C    T    C3764285T  PPE56
G    A    G3765280A  PPE56
C    A    C3777772A  spoU
A    G    A4026439G  5 bp upstream Rv3585
C    A    C4037284A  PE_PGRS59
T    A    T4072484A  Rv3633
G    C    G4084482C  topA
A    G    A4314271G  bfrB
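The SNP labels in this list pack three fields into one token: reference base, genome position, and variant base (for example, C761147T in rpoB). A minimal sketch of how such labels could be split for downstream filtering follows. The names `parse_snp` and `is_transition` are hypothetical, and treating the number as a 1-based reference coordinate is an assumption.

```python
import re

# Reference base, genome position (assumed 1-based), variant base, e.g. "C761147T".
SNP_PATTERN = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_snp(label):
    """Split a label such as 'C761147T' into (ref, position, alt)."""
    match = SNP_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a SNP label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def is_transition(ref, alt):
    """True for purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes."""
    return {ref, alt} in ({"A", "G"}, {"C", "T"})


if __name__ == "__main__":
    for label in ("C761147T", "G765719C", "G1674048A"):
        ref, pos, alt = parse_snp(label)
        kind = "transition" if is_transition(ref, alt) else "transversion"
        print(label, ref, pos, alt, kind)
```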
Identify New Mechanisms of Resistance
fabG1 mabA inhA
L203L: "A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis." Ando H, Miyoshi-Akiyama T, Watanabe S, Kirikae T.
METAGENOMICS
Starting Material
Sputum Subculture
Dx Culture
AMD Metagenomics Project Overview
Clinical specimen
Sample extraction / processing
High throughput sequencing
RNA/DNA content sequence
Subtractive host sequence database
Comparative pathogen sequence database
Identification & characterization
Clutter Mitigation
Clinical Specimen Sets
Human whole blood (EDTA), 2 liters
• Acquired through normal channels
Human sputum, 2 liters
• Multiple donors
• Took over 6 months to accumulate
Human stool, 2 liters
• Obtained from single donor
• "Genome diet"
Test Sputum for Mtb
Metagenomics
[Figure: Sputum Background Sample, raw read processing]
Legend: Quality filter failed (Q30 trim-filter); Human mapped (BWA defaults); TB mapped (BWA defaults); Other
Values (read count, percent): 35,819,020 (10%); 336,491,752 (89%); 35,886 (0%); 4,844,864 (1%)
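The read counts on this slide can be turned back into percentage shares with a few lines of code. This is a sketch only: pairing each count with a legend entry in the order listed is an assumption, and the dictionary keys are made-up names.

```python
# Read counts from the slide. Matching them to legend entries in listed order
# is an assumption, not something the slide text confirms.
READ_COUNTS = {
    "quality_filter_failed": 35_819_020,
    "human_mapped": 336_491_752,
    "tb_mapped": 35_886,
    "other": 4_844_864,
}


def read_shares(counts):
    """Return each category's share of the total read count, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in read_shares(READ_COUNTS).items():
        print(f"{name:>22}: {pct:7.3f}%")
```

Under that pairing, reads mapping to Mtb are a very small fraction of the total, which is the motivation for the enrichment step that follows.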
Project Workflow
MTb gDNA + hu gDNA at different ratios
MTb:huDNA ratio (%): 20, 1, 0.1, 0.01, 0.001, 0.0001
MTb gDNA + Sputum gDNA at different ratios
MTb:Sputum ratio (%): 20, 10, 1, 0.1, 0.01, 0.001
All mixtures 25 ng uL-1
Transposon-Based Library Prep
Sequence Capture (Agilent SureSelect)
MiSeq Sequencing and Analysis
Mtb SureSelect Enrichment
[Figure: Mtb SureSelect enrichment across the MTb ratio series. Values as extracted: 2654689, 117325, 10133910, 11929612, 67210, 11052; (825 45x) (994 77x) (989 251x) (989 270x) (72 3x)]
Recent Publication
Conclusions
Some infrastructure is required for NGS
• Laboratory work
• Bioinformatics
• IT and data management
Developed SOPs and optimized analysis
• Surveillance
• Drug resistance
Work in progress
• Universal genotyping using WGS
• Metagenomics
• Fund at least two state laboratories for WGS
Acknowledgements
Laboratory Branch, Applied Research Team: Lauren Cowan, Melisa Willby, Paige Gupton, Kartee Johnson
Core Facility: Mike Frace, Mili Sheth, Jamie Davis
AMD Metagenomics Team: Chris Hopkins, Eishita Tyagi, Scott Burns
Turnaround Time
DNA prep: 0–21 days
Sequencing: 2–3 days
Analysis: few hours to days
MOLECULAR EPIDEMIOLOGY
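Cluster 1
~100 patients
Cluster 2
Cluster 3
[Figure: cluster diagram; numeric labels 29, 44, 26, 19, 36, 63, 46, 44]
Spoligotype: 777776777760601
MIRU-VNTR patterns:
A: 224325143323244234423337
B: 224325153323444234423337
C: 22432515332343-234422333
[Figure: isolates labeled by MIRU-VNTR pattern: B, B, B, B, B, B, A, C]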
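The three MIRU-VNTR patterns on this slide differ at only a few loci, and a short script makes those differences explicit. The following is a minimal sketch, not the presenter's method: it assumes each character encodes the repeat count at one locus (24 loci per pattern) and that "-" marks a locus with no result. The function name `differing_loci` is hypothetical.

```python
# Patterns copied from the slide. Assumption: one character per locus,
# with "-" meaning no result at that locus.
PATTERNS = {
    "A": "224325143323244234423337",
    "B": "224325153323444234423337",
    "C": "22432515332343-234422333",
}


def differing_loci(p, q, missing="-"):
    """Return 0-based indices of loci where two patterns differ.

    Loci that are missing in either pattern are skipped rather than
    counted as a difference.
    """
    if len(p) != len(q):
        raise ValueError("patterns must have the same number of loci")
    return [
        i
        for i, (a, b) in enumerate(zip(p, q))
        if a != b and missing not in (a, b)
    ]


if __name__ == "__main__":
    for x, y in [("A", "B"), ("B", "C"), ("A", "C")]:
        loci = differing_loci(PATTERNS[x], PATTERNS[y])
        print(f"{x} vs {y}: {len(loci)} differing loci at indices {loci}")
```

Under these assumptions, patterns A and B differ at two loci. Whether a missing locus should count as a difference is a policy choice; this sketch skips it.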
DRUG RESISTANCE
Applications of WGS for Drug Resistance
Surveillance
Clinical management
Identify new mechanisms
SNP         Locus
C31998G     58 bp upstream Rv0029
C51403T     Rv0047c
C118832T    Rv0102
C247984T    Rv0207c
T362962C    PE_PGRS5
C477188G    Rv0398c
C480678A    mmpL1
A649974G    ubiE
C761147T    rpoB
G765719C    rpoC
C799139G    Rv0698
G905686A    Rv0811c
C926861A    PE_PGRS13
C1023436G   betP
T1093459G   PE_PGRS17
G1114491A   Rv0997
C1208858T   Rv1084
C1231660T   Rv1104
A1246730C   bpoB
C1266797T   Rv1139c
C1309314T   fdxC
A1320356C   papA3
G1353888A   tagA
G1421085A   Rv1272c
C1663856T   acn
G1674048A   fabG1
T1877958C   pks7
C1888075G   pks9
T2087076C   171 bp upstream of Rv1838c
C2372126T   Rv2112c
G2402463C   Rv2142c
T2614547C   46 bp upstream Rv2339
G2751471A   Rv2449c
G2958534A   Rv2631
G3126489A   Rv2819c
G3137406A   echA16
C3213150T   lepB
A3377940C   PPE46
A3380380C   PPE47
A3416480G   Rv3055
C3455434T   Rv3088
G3608047C   Rv3230c
C3764285T   PPE56
G3765280A   PPE56
C3777772A   spoU
A4026439G   5 bp upstream Rv3585
C4037284A   PE_PGRS59
T4072484A   Rv3633
G4084482C   topA
A4314271G   bfrB
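The SNP labels on this slide follow a compact pattern such as C761147T. A minimal sketch of parsing that notation follows, assuming the label reads as reference base, genome position, then variant base. The slide does not state the coordinate system, and the names `Snp`, `parse_snp` and `is_transition` are hypothetical.

```python
import re
from typing import NamedTuple


class Snp(NamedTuple):
    ref: str  # base in the reference sequence (assumed meaning)
    pos: int  # genome position as printed on the slide
    alt: str  # base observed in the isolate (assumed meaning)


# One base, a run of digits, one base: e.g. "C761147T".
_SNP_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_snp(label: str) -> Snp:
    """Split a label like 'C761147T' into reference base, position, variant base."""
    match = _SNP_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not a SNP label: {label!r}")
    return Snp(match.group(1), int(match.group(2)), match.group(3))


def is_transition(snp: Snp) -> bool:
    """True for purine-purine (A/G) or pyrimidine-pyrimidine (C/T) changes."""
    return {snp.ref, snp.alt} in ({"A", "G"}, {"C", "T"})


if __name__ == "__main__":
    for label in ["C761147T", "G765719C", "G1674048A"]:
        snp = parse_snp(label)
        kind = "transition" if is_transition(snp) else "transversion"
        print(label, snp.pos, kind)
```

Parsed positions can then be sorted or matched against gene coordinates from whichever annotation the analysis uses.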
Identify New Mechanisms of Resistance
fabG1 mabA inhA
L203L: Ando H, Miyoshi-Akiyama T, Watanabe S, Kirikae T. A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis.
METAGENOMICS
Starting Material
Sputum Subculture
Dx Culture
AMD Metagenomics Project Overview
[Workflow diagram labels]
Clinical specimen
Sample extraction/processing
High-throughput sequencing
RNA/DNA content sequence
Subtractive host sequence database
Comparative pathogen sequence database
Identification & characterization
Clutter Mitigation
Clinical Specimen Sets
Human whole blood (EDTA), 2 liters
• Acquired through normal channels
Human sputum, 2 liters
• Multiple donors
• Took over 6 months to accumulate
Human stool, 2 liters
• Obtained from single donor
• “Genome diet”
Test Sputum for Mtb
Metagenomics
Sputum Background Sample: Raw Read Processing
Category                                  Reads         Percent
Quality filter failed (Q30 trim-filter)   35,819,020    10
Human mapped (BWA -defaults)              336,491,752   89
TB mapped (BWA -defaults)                 35,886        0
Other                                     4,844,864     1
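The percentages on the sputum background slide can be checked against the raw read counts. This is a minimal sketch under one stated assumption: the four counts pair with the four legend categories in the order they appear on the slide. The function name `read_percentages` is hypothetical.

```python
# Read counts from the slide. Assumption: counts and legend entries
# appear in the same order, so they are paired one to one here.
COUNTS = {
    "Quality filter failed": 35_819_020,
    "Human mapped": 336_491_752,
    "TB mapped": 35_886,
    "Other": 4_844_864,
}


def read_percentages(counts):
    """Return each category's share of the total reads, in percent."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to summarize")
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in read_percentages(COUNTS).items():
        print(f"{name:<24}{pct:8.3f} %")
```

With this pairing, reads mapped to TB make up less than 0.01% of the total, which is the low-abundance situation that target enrichment is meant to address. The computed shares may differ slightly from the rounded labels on the slide; for example, the quality-filter category works out to about 9.5%.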
MTb gDNA + hu gDNA at different ratios
MTb/huDNA ratio (%): 20, 1, 0.1, 0.01, 0.001, 0.0001
MTb gDNA + sputum gDNA at different ratios
MTb/sputum ratio (%): 20, 10, 1, 0.1, 0.01, 0.001
Project Workflow
All mixtures 25 ng uL-1
1. Transposon-Based Library Prep
2. Sequence Capture (Agilent SureSelect)
3. MiSeq Sequencing and Analysis
Mtb SureSelect Enrichment
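[Figure residue: the layout linking ratio labels to values did not survive extraction]
Ratio labels: 01 1 20; 20 1 10; 001
Values: 2654689, 117325, 10133910, 11929612, 67210, 11052
Paired values: (825 45x) (994 77x) (989 251x) (989 270x) (72 3x)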
Recent Publication
Conclusions
Some infrastructure is required for NGS
• Laboratory work
• Bioinformatics
• IT and data management
Developed SOPs and optimized analysis
• Surveillance
• Drug resistance
Work in progress
• Universal genotyping using WGS
• Metagenomics
• Fund at least two state laboratories for WGS
Acknowledgements
Laboratory Branch, Applied Research Team: Lauren Cowan, Melisa Willby, Paige Gupton, Kartee Johnson
Core Facility: Mike Frace, Mili Sheth, Jamie Davis
AMD Metagenomics Team: Chris Hopkins, Eishita Tyagi, Scott Burns
Ref  Alt  SNP         Gene / region
C    G    C31998G     58 bp upstream Rv0029
C    T    C51403T     Rv0047c
C    T    C118832T    Rv0102
C    T    C247984T    Rv0207c
T    C    T362962C    PE_PGRS5
C    G    C477188G    Rv0398c
C    A    C480678A    mmpL1
A    G    A649974G    ubiE
C    T    C761147T    rpoB
G    C    G765719C    rpoC
C    G    C799139G    Rv0698
G    A    G905686A    Rv0811c
C    A    C926861A    PE_PGRS13
C    G    C1023436G   betP
T    G    T1093459G   PE_PGRS17
G    A    G1114491A   Rv0997
C    T    C1208858T   Rv1084
C    T    C1231660T   Rv1104
A    C    A1246730C   bpoB
C    T    C1266797T   Rv1139c
C    T    C1309314T   fdxC
A    C    A1320356C   papA3
G    A    G1353888A   tagA
G    A    G1421085A   Rv1272c
C    T    C1663856T   acn
G    A    G1674048A   fabG1
T    C    T1877958C   pks7
C    G    C1888075G   pks9
T    C    T2087076C   171 bp upstream of Rv1838c
C    T    C2372126T   Rv2112c
G    C    G2402463C   Rv2142c
T    C    T2614547C   46 bp upstream Rv2339
G    A    G2751471A   Rv2449c
G    A    G2958534A   Rv2631
G    A    G3126489A   Rv2819c
G    A    G3137406A   echA16
C    T    C3213150T   lepB
A    C    A3377940C   PPE46
A    C    A3380380C   PPE47
A    G    A3416480G   Rv3055
C    T    C3455434T   Rv3088
G    C    G3608047C   Rv3230c
C    T    C3764285T   PPE56
G    A    G3765280A   PPE56
C    A    C3777772A   spoU
A    G    A4026439G   5 bp upstream Rv3585
C    A    C4037284A   PE_PGRS59
T    A    T4072484A   Rv3633
G    C    G4084482C   topA
A    G    A4314271G   bfrB
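Each SNP entry on this slide packs the reference base, the genome position, and the alternate base into a single token such as C761147T. As a minimal sketch (the function name `parse_snp` and the regular expression are illustrative assumptions, not part of the presented pipeline), such a token can be split into its three parts like this:

```python
import re

# Hypothetical helper: splits a token such as "C761147T" into
# (reference base, 1-based genome position, alternate base).
SNP_PATTERN = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_snp(token):
    """Return (ref, position, alt) for a SNP token like 'C761147T'."""
    match = SNP_PATTERN.match(token)
    if match is None:
        raise ValueError(f"not a SNP token: {token!r}")
    ref, position, alt = match.groups()
    return ref, int(position), alt


if __name__ == "__main__":
    # Two tokens taken from the slide (rpoB and bfrB rows).
    for token in ("C761147T", "A4314271G"):
        print(token, "->", parse_snp(token))
```

Sorting the parsed tuples by position gives the entries in genome order, which is a convenient check that each SNP is listed next to a gene from the right part of the chromosome.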
Identify New Mechanisms of Resistance
fabG1 (mabA) inhA
L203L
"A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis." Ando H, Miyoshi-Akiyama T, Watanabe S, Kirikae T.
METAGENOMICS
Starting Material
Sputum Subculture
Dx Culture
AMD Metagenomics Project Overview
Clinical specimen
Sample extraction / processing
High throughput sequencing
RNA/DNA content sequence
Subtractive host sequence database
Comparative pathogen sequence database
Identification & characterization
Clutter Mitigation
Clinical Specimen Sets
Human whole blood (EDTA), 2 liters
• Acquired through normal channels
Human sputum, 2 liters
• Multiple donors
• Took over 6 months to accumulate
Human stool, 2 liters
• Obtained from single donor
• "Genome diet"
Test Sputum for Mtb
Metagenomics
Sputum Background Sample: Raw Read Processing
Slice labels (reads, share of total): 35,819,020 (10%); 336,491,752 (89%); 35,886 (0%); 4,844,864 (1%)
Legend: Quality filter failed (Q30 trim-filter); Human mapped (BWA -defaults); TB Mapped (BWA -defaults); Other
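The four read counts shown for the sputum background sample can be turned into shares of the total with a few lines of arithmetic. The sketch below is illustrative only: the helper name `percent_shares` is an assumption, and the counts are deliberately left unlabeled because the extracted slide does not state explicitly which count belongs to which legend entry.

```python
# Read counts as they appear on the slide, in slide order.
read_counts = [35_819_020, 336_491_752, 35_886, 4_844_864]


def percent_shares(counts, ndigits=1):
    """Return each count as a percentage of the total, rounded to ndigits."""
    total = sum(counts)
    return [round(100 * count / total, ndigits) for count in counts]


if __name__ == "__main__":
    total_reads = sum(read_counts)
    print(f"total reads: {total_reads:,}")
    for count, share in zip(read_counts, percent_shares(read_counts)):
        print(f"{count:>12,}  {share:5.1f}%")
```

Reporting one decimal place keeps the smallest slice distinguishable from the others while staying close to the whole-number labels on the chart.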
Project Workflow
MTb gDNA + hu gDNA at different ratios
MTb:hu DNA ratio (%): 20, 1, 0.1, 0.01, 0.001, 0.0001
MTb gDNA + sputum gDNA at different ratios
MTb:sputum ratio (%): 20, 10, 1, 0.1, 0.01, 0.001
All mixtures 25 ng µL⁻¹
Transposon-Based Library Prep
Sequence Capture
MiSeq Sequencing and Analysis
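Agilent SureSelect
Mtb SureSelect Enrichment
[Figure: Mtb SureSelect enrichment results. Ratio labels: 0.1, 1, 20; 20, 1, 10; 0.01. Count labels: 2,654,689; 117,325; 10,133,910; 11,929,612; 67,210; 11,052. Paired labels as extracted: (825 45x) (994 77x) (989 251x) (989 270x) (72 3x).]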
Recent Publication
Conclusions
Some infrastructure is required for NGS
• Laboratory work
• Bioinformatics
• IT and data management
Developed SOPs and optimized analysis
• Surveillance
• Drug resistance
Work in progress
• Universal genotyping using WGS
• Metagenomics
• Fund at least two state laboratories for WGS
Acknowledgements
Laboratory Branch, Applied Research Team: Lauren Cowan, Melisa Willby, Paige Gupton, Kartee Johnson
Core Facility: Mike Frace, Mili Sheth, Jamie Davis
AMD Metagenomics Team: Chris Hopkins, Eishita Tyagi, Scott Burns