

    An Automated Workflow that Enables Cell-Free DNA Extraction & Whole-Genome NGS Library Preparation from 24 Plasma Samples in a Single Workday

    M. Carter*, M. Goldrick, Davina Kruczek, Mareike Honickel & Arvind Kothandaraman
    *[email protected]

    Bioo Scientific Corporation, 7050 Burleson Rd., Austin, TX 78744

    OVERVIEW

    Studies reported here demonstrate an integrated approach for automated cfDNA extraction from 24 plasma samples in less than one day or 96 samples in approximately two days. We also highlight the results of our automation workflow when analyzing the size distribution of fetal-derived cfDNA in maternal circulation. The research tools and approaches we have developed for high-throughput cfDNA extraction and analysis will facilitate the use of plasma and other biofluids for an even wider range of non-invasive and cost-effective clinical research applications.

    METHODS

    Plasma Samples. Blood was collected by a commercial vendor from healthy donors into bags containing K2-EDTA anticoagulant. Plasma was separated from RBCs and WBCs by centrifugation at 1,600 × g for 10 minutes, carefully removed, and shipped to Bioo Scientific® on wet ice. The plasma was then spun again at 4,000 × g for 20 minutes. The double-spun plasma was removed and pooled where necessary. Donors 1 and 4 are pools of multiple donors, while Donors 2 and 3 are individual donors. Plasma from the second-trimester pregnant donor, derived from K2-EDTA whole blood, was obtained from a commercial source (Zen Bio®). Plasma from the first-trimester pregnant donor was obtained from blood collected in Streck BCT tubes and fractionated using a double-spin protocol. Both pregnant donors carried male fetuses.

    Plasma DNA extraction and quantitation. cfDNA was extracted from 24 × 5 mL plasma replicates for each donor (or donor pool) using the NextPrep-Mag™ cfDNA Automated Isolation Kit and script (Bioo Scientific) on the chemagic™ 360 automation platform (chemagen™). The cfDNA was eluted in 80 µL. The concentration of eluted cfDNA was determined with the Qubit® dsDNA HS assay (Thermo Fisher Scientific®) using a 1:20 dilution. Concentration and size distribution of the cfDNA samples were characterized using the High Sensitivity DNA kit on the 2100 Bioanalyzer® platform (Agilent®).
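The yields reported per mL of plasma follow from the elution and input volumes given above. The following is a minimal Python sketch of that arithmetic, not the authors' script. The function name and the `dilution_factor` parameter are hypothetical. The poster does not say whether the Qubit® reading was already corrected for the 1:20 dilution, so the default assumes it was.

```python
def cfdna_yield_per_ml(measured_ng_per_ul, dilution_factor=1.0,
                       elution_volume_ul=80.0, plasma_volume_ml=5.0):
    """Return ng of cfDNA recovered per mL of plasma.

    measured_ng_per_ul: concentration reading in ng/uL.
    dilution_factor: multiply the reading back to the eluate concentration
        (assumption: 1.0 means the reading is already corrected).
    elution_volume_ul: 80 uL eluate, as stated in the Methods.
    plasma_volume_ml: 5 mL plasma input, as stated in the Methods.
    """
    if plasma_volume_ml <= 0:
        raise ValueError("plasma_volume_ml must be positive")
    eluate_ng_per_ul = measured_ng_per_ul * dilution_factor
    total_ng = eluate_ng_per_ul * elution_volume_ul
    return total_ng / plasma_volume_ml


if __name__ == "__main__":
    # Illustrative reading only, not a value from the poster.
    print(cfdna_yield_per_ml(0.4))
```

An eluate at a hypothetical 0.4 ng/µL would correspond to 32 ng total, or about 6.4 ng per mL of plasma.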

    Library preparation for whole genome sequencing. An equal volume (32 µL) of each extracted cfDNA sample was transferred to a 96-well hard-shell PCR plate and used as input for whole genome library preparation on the Sciclone® G3 NGS workstation. Libraries were made using the NEXTflex® Cell Free DNA-Seq kit for Illumina® platforms, with barcoded adapters (Bioo Scientific®) diluted 1:8. Libraries were analyzed for yield and size distribution on the LabChip® GX Touch™ HT platform (PerkinElmer®). Libraries from the pregnant donors were sequenced on the Illumina® MiSeq® platform as a 2x150 sequencing run and analyzed. Note that cfDNA library preparation does not include a DNA fragmentation step, since the DNA is naturally fragmented to a size of ~170 bp.

    Bioinformatics. 3′ adapter sequences were trimmed from paired sequencing reads using Cutadapt version 1.9.1, and the reads were aligned to human genome GRCh38 using Bowtie2 version 2.2.6. Reads mapped to the Y chromosome were copied to additional alignment files using the view command of Samtools version 1.3.1. Genome and Y chromosome alignment files were converted back to sequencing files using Picard-Tools 1.95 in preparation for insert size binning. Insert sizes of aligned sequencing reads were binned using custom Python2 scripts that used Biopython 1.66 (Python™). Binned insert sizes were stored in comma-separated values (.csv) files and analyzed graphically in Excel 2016. The percentage of reads mapping to the human Y chromosome was used to determine the fetal fraction of cfDNA.
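The binning and Y chromosome steps can be illustrated with a short sketch. This is not the authors' custom Python2/Biopython script, which the poster does not show. It is a Python 3 illustration that assumes insert sizes have already been extracted as integers. The function names, the 10 bp bin width, and the 606 bp upper limit are assumptions chosen for illustration. The percentage function returns only the raw share of Y-mapped reads, because the poster does not describe any further conversion to fetal fraction.

```python
import csv
import io
from collections import Counter


def bin_insert_sizes(insert_sizes, bin_width=10, max_size=606):
    """Count inserts per bin, keyed by the bin's lower edge in bp."""
    counts = Counter()
    for size in insert_sizes:
        size = abs(size)  # SAM template lengths are signed; use the magnitude
        if 1 <= size <= max_size:
            bin_start = ((size - 1) // bin_width) * bin_width + 1
            counts[bin_start] += 1
    return dict(sorted(counts.items()))


def bins_to_csv(binned):
    """Serialize binned counts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["bin_start_bp", "count"])
    for bin_start, count in binned.items():
        writer.writerow([bin_start, count])
    return buffer.getvalue()


def percent_chr_y(n_chr_y_reads, n_mapped_reads):
    """Percentage of mapped reads that map to the Y chromosome."""
    if n_mapped_reads <= 0:
        raise ValueError("n_mapped_reads must be positive")
    return 100.0 * n_chr_y_reads / n_mapped_reads


if __name__ == "__main__":
    # Illustrative insert sizes only, not data from the poster.
    binned = bin_insert_sizes([1, 10, 11, -167, 170, 700])
    print(bins_to_csv(binned))
```

The CSV text can be written to a file and opened in a spreadsheet, matching the .csv-then-Excel flow described above.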

    RESULTS

    Figure 1. Automation workflows for 24 and 96 samples, from 5 mL plasma to sequencing-ready libraries. (A) Materials used, from left to right: (1) chemagic™ 360 instrument for automated cfDNA extraction from 5 mL plasma; (2) NextPrep-Mag™ cfDNA extraction reagents; (3) Sciclone® G3 NGS workstation for automated library prep; (4) NEXTflex® Cell Free DNA-Seq library prep reagents. (B) Single-day workflow for 24 samples. (C) Two-day automated workflow for 96 samples.
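The extraction part of these workflows can be estimated from figures quoted in the Figure 2 caption: 24 samples per run, 75 minutes per run, and 15 minutes of hands-on time per run. The sketch below is a rough Python calculation under the assumption that runs are performed one after another on a single instrument. The function name is hypothetical, and library preparation time is not included because the poster does not state it.

```python
import math


def extraction_schedule(n_samples, samples_per_run=24,
                        minutes_per_run=75, hands_on_minutes_per_run=15):
    """Estimate the number of extraction runs and total time for n_samples.

    Assumes sequential runs on one instrument (an assumption, not a
    statement from the poster).
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    runs = math.ceil(n_samples / samples_per_run)
    return {
        "runs": runs,
        "instrument_minutes": runs * minutes_per_run,
        "hands_on_minutes": runs * hands_on_minutes_per_run,
    }


if __name__ == "__main__":
    print(extraction_schedule(24))
    print(extraction_schedule(96))
```

Under these assumptions, 96 samples need four extraction runs, or 300 minutes of instrument time with 60 minutes hands-on.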

    INTRODUCTION

    Cell-free DNA (cfDNA) is a biofluid-derived sample type with broad applications: analysis of cancer-associated sequence variants to detect actionable mutations in patients being treated for malignant disease, which can enable earlier treatment at lower cost; monitoring of cancer-linked cfDNA variants in healthy individuals for early detection of malignant disease; non-invasive prenatal diagnostics for early detection of genetic abnormalities without the risk of fetal injury; monitoring of organ transplant recipients for early signs of organ rejection; and tracking of pathologies that do not involve differences in primary genetic sequence. To realize the full potential of cfDNA-based diagnostics, clinical research is needed to improve workflows that efficiently extract circulating cfDNA at high yield and purity, convert the cfDNA into libraries for massively parallel sequencing of regions of interest, and carry out the bioinformatic analysis. This poster describes methodology developed at Bioo Scientific, a PerkinElmer® Company, for rapid, high-throughput automated extraction of cfDNA from high-volume (5 mL) plasma samples and for fully automated preparation of whole-genome cfDNA libraries from the extracted cfDNA.

    Figure 3. Automated library prep of extracted cfDNA. 32 µL of the extracted cfDNA shown in Figure 2 was used as template for library preparation with the NEXTflex® Cell Free DNA-Seq kit on the PerkinElmer® Sciclone® G3 NGS Workstation. 96 samples were processed in a single run. (A) LabChip® GXII Touch™ gel image of 96 libraries. DNA libraries were diluted 1:8 and run on an HSDNA chip. (B) Nanomolar concentrations of the ~300 bp region of the library (corresponding to mononucleosomal cfDNA) were quantified using the LabChip® GX Reviewer software. Box and whisker plots were generated for each donor. %CVs were 11, 9, 28, and 14% for Donors 1-4, respectively.
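The nanomolar values in panel (B) were reported by the instrument software. For readers converting a mass concentration by hand, the sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA. It is an illustration of that standard conversion, not a description of how the LabChip® software computes molarity. The function name and the `dilution_factor` parameter are hypothetical.

```python
def library_nanomolar(ng_per_ul, mean_fragment_bp, dilution_factor=1.0,
                      g_per_mol_per_bp=660.0):
    """Convert a dsDNA concentration in ng/uL to nM.

    nM = (ng/uL * dilution_factor * 1e6) / (660 * fragment length in bp)

    dilution_factor: multiply a reading taken on a diluted library back to
        the undiluted library (e.g. 8 for a 1:8 dilution); 1.0 if none.
    """
    if mean_fragment_bp <= 0:
        raise ValueError("mean_fragment_bp must be positive")
    undiluted_ng_per_ul = ng_per_ul * dilution_factor
    return undiluted_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)


if __name__ == "__main__":
    # Illustrative input only: 10 ng/uL of a ~300 bp library.
    print(round(library_nanomolar(10.0, 300), 1))
```

With these illustrative inputs, 10 ng/µL of a ~300 bp library is roughly 50.5 nM.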

    Figure 4. (A) Blood was collected from 2 healthy pregnant donors, each carrying a known male fetus (20 week - EDTA; 12 week - Streck BCT®), and doubly spun to remove contaminating cells. Automated cfDNA extraction from 5 mL of the resulting cell-free plasma was carried out using Bioo Scientific’s NextPrep-Mag™ cfDNA Automated Isolation reagents on the chemagic™ 360 instrument (red traces). Library prep was automated on the PerkinElmer® Sciclone® G3 NGS workstation using Bioo Scientific’s NEXTflex® Cell Free DNA-Seq reagents and 10 cycles of PCR, with and without an upfront SPRI cleanup to remove non-mononucleosomal DNA (>200 bp) (blue, green traces). Yields of library size ranges were quantified using the Bioanalyzer® software and showed that ~95% of non-mononucleosomal peaks were excluded from libraries when size selection was used. (B) Library sequencing reads were aligned to human genome GRCh38 and binned by insert size to examine the effect of size selection on mononucleosome and non-mononucleosome quantity. Y chromosome inserts represent reads derived from fetal DNA. Binned inserts from total genome mapped reads and Y chromosome mapped reads show that size selection excludes non-mononucleosomal reads from the sequencing data.
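The "~95% excluded" figure is a relative reduction, and the effect of size selection on sequencing data can be summarized in a similar way. The sketch below shows both calculations in Python. The function names are hypothetical, the 200 bp threshold is taken from the caption, and the numbers in the usage lines are made up for illustration rather than taken from the poster. Whether the authors normalized yields before comparing them is not stated, so none is applied here.

```python
def percent_excluded(yield_without_selection, yield_with_selection):
    """Percent reduction in a size range's yield when size selection is used."""
    if yield_without_selection <= 0:
        raise ValueError("yield_without_selection must be positive")
    removed = yield_without_selection - yield_with_selection
    return 100.0 * removed / yield_without_selection


def percent_above(insert_sizes, threshold_bp=200):
    """Percent of inserts longer than threshold_bp."""
    if not insert_sizes:
        raise ValueError("insert_sizes must not be empty")
    n_above = sum(1 for size in insert_sizes if abs(size) > threshold_bp)
    return 100.0 * n_above / len(insert_sizes)


if __name__ == "__main__":
    # Hypothetical yields in arbitrary units, for illustration only.
    print(percent_excluded(20.0, 1.0))
    # Hypothetical insert sizes in bp, for illustration only.
    print(percent_above([150, 160, 170, 350]))
```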

    CONCLUSIONS

    High concordance in cfDNA yield and size distribution was seen among replicate 5 mL plasma samples processed for automated cfDNA extraction using the NextPrep-Mag™ cfDNA automated isolation kit on the chemagic™ 360 platform. As expected, donor-to-donor variation was much higher than variation between replicate samples from the same donor or donor pool (Fig 2). Excellent reproducibility was also observed for yield and size distribution of the 96 whole-genome libraries produced on the Sciclone® G3 NGS workstation (Fig 3).

    Regarding the analysis of cfDNA obtained from the two pregnant donors, our results clearly show that fetal reads (reads mapping to the Y chromosome) are present across the whole size distribution of cfDNA (Fig 4b). This result is surprising in light of reports that fetal cfDNA fragments are typically shorter than non-fetal fragments (Chan et al., PNAS, doi 1615800113). Our studies also demonstrate the feasibility of automating the entire workflow for NIPT research, and the ability to detect informative levels of fetal cfDNA in first-trimester plasma samples, even without size selection of the cfDNA or the resulting whole-genome libraries.

    The rapid automated extraction of cfDNA from high-volume 5 mL plasma samples, coupled with the capability of converting the extracted cfDNA into whole-genome libraries on a second automation platform, provides an integrated workflow that will enable efficient liquid biopsy-based investigations relevant for cancer and non-invasive prenatal assessment. Future studies will focus on methodological advances to maximize detection of rare variants in plasma and other biofluids, using automated workflows.

    For research use only. Not for use in diagnostic purposes.

    [Figure 2 plots: ng cfDNA recovered/mL plasma, measured by Qubit and BioA; one panel each for Donor 1, Donor 2, Donor 3, and Donor 4 (n=24 per donor).]

    [Figure 3 panels: (A) gel image with lanes grouped D1, D2, D3, D4; (B) Library Yields (n=24): library concentration (nM) by Donor ID (D1-D4).]

    [Figure 4 panels: (A) traces for 20 week EDTA and 12 week Streck BCT: cfDNA, Library + Size Selection, Library - Size Selection; (B) Total Insert Sizes and Y Chromosome Insert Sizes, each plotted for 12 week and 20 week samples with and without size selection.]

    Figure 2. Highly reproducible cfDNA extractions on the chemagic™ 360 instrument. cfDNA was extracted from 5 mL of doubly-spun EDTA plasma from four healthy donors in four automated runs (24 replicates of a single donor per run). Reproducible yields are shown (1) qualitatively by gel and electropherogram and (2) quantitatively for both total DNA (Qubit® HSDNA assay) and the mononucleosome peak (Bioanalyzer® 100-275 bp region). For Donors 1-4, %CVs were 10, 10, 12, and 14% on the Bioanalyzer® and 12, 7, 9, and 18% on the Qubit®, respectively. Donor 4 showed significant contamination with gDNA, resulting in much higher Qubit® yields for that donor. Each automation run takes 75 minutes, with 15 minutes of hands-on time. In the graphs, Qubit = Qubit® HSDNA assay and BioA = High Sensitivity DNA kit on the 2100 Bioanalyzer® platform.
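The %CV values quoted here are coefficients of variation across the 24 replicates of each donor. A minimal Python sketch of that statistic follows. It uses the sample standard deviation, which is an assumption, since the poster does not say whether the sample or population form was used. The function name is hypothetical.

```python
import statistics


def percent_cv(values):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    if len(values) < 2:
        raise ValueError("need at least two replicate values")
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean of values must be non-zero")
    return 100.0 * statistics.stdev(values) / mean


if __name__ == "__main__":
    # Illustrative replicate yields only, not data from the poster.
    print(percent_cv([9, 10, 11]))
```

For the illustrative replicates 9, 10, and 11, the mean is 10 and the sample standard deviation is 1, giving a %CV of 10.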

    a PerkinElmer company