Introduction

UK Biobank is a health research charity which has collected and/or generated a wide range of biological samples and data on 500,000 participants aged 40-69 years at baseline. Data and samples are available for use by bona fide researchers to investigate the aetiology of disease. A project was initiated in 2013 to perform SNP genotyping on all UK Biobank participants. The aim of the project (which was to be completed within an 18-24 month timeframe) was to generate high quality genotype data and to make these data available to researchers via the UK Biobank Showcase system. DNA extraction and quantification were performed at UK Biobank; genotyping (on the custom UK Biobank Axiom® Array) was undertaken by Affymetrix; and the Wellcome Trust Centre for Human Genetics (WTCHG) performed data quality control, phasing and imputation on the derived data. This poster describes the sample processing workflow, presents a comparison between the two quantification methods used in the project (Trinean DropSense® 96 and PicoGreen) and provides a summary of the genotyping metrics for approximately the first 180,000 samples analysed.

1. Sample Processing Workflow

Sample retrieval, DNA extraction and quantification were performed at the UK Biobank co-ordinating centre in Stockport, UK. DNA was shipped to the Affymetrix Research Services Laboratory in Santa Clara for genotyping on the custom Axiom® Array. Data review and quality control were performed by WTCHG, Oxford. Imputation is to be performed on the dataset by WTCHG. All genotype data will be available in the UK Biobank Showcase (http://biobank.ctsu.ox.ac.uk/crystal/).

2. DNA Extraction and Quantification

Buffy coat was picked from an automated -80°C sample repository for DNA extraction. DNA extraction was performed on a custom-built TECAN platform (Figure 2). The extraction chemistry and protocol used (Promega Maxwell® 16 Blood DNA Purification Kit (AS1010)) were modified to ensure a purified sample from the large volume of source material. During the DNA extraction process, DNA was quantified on the Trinean DropSense® 96 and the volume was assessed on the Brooks Tube Auditor™. Plates were authorised for shipment if minimum criteria were met for the DNA concentration and the 260/280 ratio (>80% of samples had a concentration >20 ng/µl).
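The plate authorisation step can be illustrated with a minimal sketch. It assumes that the ">80% of samples >20 ng/µl" figure is a plate-level pass criterion, which is one reading of the text; the function name and the example plate are hypothetical, and the 260/280 criterion is left out because the poster does not state its limits.

```python
from typing import Sequence

MIN_CONC_NG_PER_UL = 20.0    # concentration cut-off quoted in the text
MIN_FRACTION_PASSING = 0.80  # ">80% of samples", read here as a plate-level rule (assumption)


def plate_meets_concentration_criterion(
    concentrations: Sequence[float],
    min_conc: float = MIN_CONC_NG_PER_UL,
    min_fraction: float = MIN_FRACTION_PASSING,
) -> bool:
    """Return True if more than `min_fraction` of wells exceed `min_conc`."""
    if not concentrations:
        return False  # an empty plate cannot be authorised
    passing = sum(1 for c in concentrations if c > min_conc)
    return passing / len(concentrations) > min_fraction


# Hypothetical 10-well example: 9 of 10 wells exceed 20 ng/µl.
example_plate = [42.1, 38.5, 19.0, 55.3, 40.2, 36.8, 47.9, 33.3, 29.4, 61.0]
print(plate_meets_concentration_criterion(example_plate))
```

Both comparisons are strict, matching the ">" signs in the text, so a plate with exactly 80% of wells above the cut-off would not pass under this reading.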

DNA was stored at -80°C prior to shipment on dry ice to Affymetrix for genotyping. A second quantification was performed at Affymetrix using PicoGreen, which was used to normalise DNA for input into the genotyping process. UK Biobank intends to make both quantification values available to approved researchers.

5. Conclusion

The UK Biobank DNA sample processing workflow has yielded high quality DNA suitable for genotyping on the Affymetrix Axiom® assay. The quantification methods employed (Trinean DropSense® and PicoGreen) provide similar values (r=0.86). Whilst PicoGreen was ultimately used in the normalisation of DNA, the initial quantification at UK Biobank was a good indicator of DNA concentration. Failed samples could not be attributed to low DNA yield or quality. Data derived from the UK Biobank Genotyping project will be available for researchers to use. In addition, the DNA derived from this project is available for researchers to perform other analyses, subject to the UK Biobank Access procedures.

Figure 1: Overview of sample processing workflow from picking samples for DNA extraction to genotype data available to UK Biobank approved researchers

Figure 2: System for DNA extraction, quantification and volume measurement

3. DNA Quantification Results

Data are presented for 182,705 samples (i). The average DNA concentration measured using the Trinean DropSense® 96 was 40.35 ng/µl and using PicoGreen was 39.71 ng/µl (Figure 3). The average 260/280 ratio was 1.89. The DropSense® 96 was selected because it was readily automated and required minimal consumables. A correlation of r=0.86 between the DNA concentration measured on the Trinean DropSense® 96 and by the PicoGreen quantification method was obtained for the 182,705 DNA samples (Figure 4). The quantification results from both methods indicate that the DNA extraction process described provides good quality DNA.

(i) 93 samples with a large difference between PicoGreen and Trinean quantification (>150 ng/µl) were excluded, as it was expected that these samples were not fully homogeneous when quantified at Affymetrix after a freeze/thaw cycle.
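As an illustration of the comparison described above, the following minimal sketch applies the >150 ng/µl exclusion to paired readings and then computes a Pearson correlation coefficient. The poster does not say which correlation statistic was used, so Pearson's r is an assumption; the function names and the example readings are hypothetical and are not UK Biobank data.

```python
import math
from typing import List, Sequence, Tuple

MAX_ABS_DIFF_NG_PER_UL = 150.0  # exclusion cut-off quoted in footnote (i)

Pair = Tuple[float, float]  # (Trinean reading, PicoGreen reading) in ng/µl


def exclude_discordant(pairs: Sequence[Pair],
                       max_diff: float = MAX_ABS_DIFF_NG_PER_UL) -> List[Pair]:
    """Drop pairs whose two readings differ by more than `max_diff`."""
    return [(t, p) for t, p in pairs if abs(t - p) <= max_diff]


def pearson_r(pairs: Sequence[Pair]) -> float:
    """Pearson correlation coefficient of the paired readings."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant readings")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical readings; the last pair differs by 200 ng/µl and is excluded.
readings = [(40.0, 39.0), (25.0, 27.5), (60.0, 55.0), (33.0, 35.0), (230.0, 30.0)]
kept = exclude_discordant(readings)
print(len(kept), round(pearson_r(kept), 3))
```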

4. Genotyping Results

Initial results from genotyping 182,705 DNA samples are displayed in Table 1. The values presented can be viewed as quality metrics; the values for the UK Biobank samples are shown alongside a set of "expected threshold" values (minimum or expected values as published by Affymetrix). The UK Biobank samples comfortably exceed these expected thresholds, indicating that the DNA is of good quality. These results were obtained after the data review at Affymetrix and before quality control at WTCHG. A small number of samples failed genotyping (0.43%); the average concentration of these samples was 34.95 ng/µl (PicoGreen) and 38.04 ng/µl (Trinean), and the 260/280 ratio was 1.91. DNA concentration or quality was not the reason for the majority of failures.

| Metric | Result | Expected threshold |
|---|---|---|
| Pass Rate | 99.57% | 95% |
| Average Call Rate | 99.89% | >95% |
| dQC | 0.98 | >0.82 |

Table 1: Summary metrics from genotyping of 182,705 samples
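The comparison in Table 1 can be expressed as a short check. This is a sketch only: the values are copied from the table, each threshold is treated as a lower bound that the result must exceed, and the dictionary layout and variable names are illustrative assumptions.

```python
# (result, expected threshold) pairs copied from Table 1.
TABLE_1 = {
    "Pass Rate (%)": (99.57, 95.0),
    "Average Call Rate (%)": (99.89, 95.0),
    "dQC": (0.98, 0.82),
}

# A metric passes when its result exceeds the expected threshold.
status = {name: result > threshold for name, (result, threshold) in TABLE_1.items()}
all_metrics_pass = all(status.values())

# The quoted failure rate (0.43%) is the complement of the pass rate.
failure_rate_pct = round(100.0 - TABLE_1["Pass Rate (%)"][0], 2)

for name, passed in status.items():
    print(f"{name}: {'pass' if passed else 'below threshold'}")
print("failure rate (%):", failure_rate_pct)
```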

Sample Processing for the UK Biobank Genotyping Project

Samantha Murphy¹ and Tony Montoye²

¹ UK Biobank (1-4 Spectrum Way, Adswood, Stockport, SK3 0SA, United Kingdom); ² Trinean (Dulle Grietlaan 17/3, B-9050 Gentbrugge, Belgium)

Figure 3: Histogram of DNA concentration measured by Trinean and PicoGreen

Figure 4: Correlation between DNA concentration measured by Trinean and by PicoGreen (r=0.86)