16
MammaPrint and BluePrint Molecular Diagnostics Using Targeted RNA Next-Generation Sequencing Technology Q27 Lorenza Mittempergher,* Leonie J.M.J. Delahaye,* Anke Witteveen,* Jacob B. Spangler, y Fariet Hassenmahomed,* Sammy Mee, y Souane Mahmoudi,* Jiang Chen, y Simon Bao, y Mireille H.J. Snel,* Sandra Leidelmeijer,* Naomi Besseling,* Anne B. Lucas, z Carlos Pabón-Peña, z Sabine C. Linn, x Christa Dreezen,* Diederik Wehkamp,* Bob Y. Chan, y René Bernards,* { Laura J. van t Veer,* k and Annuska M. Glas* Q1 Q2 Q3 Q4 From the Agendia NV,* Amsterdam, the Netherlands; the Agendia Inc., y Irvine, California; the Diagnostics and Genomics Group, z Agilent Technologies, Santa Clara, California; the Divisions of Molecular Pathology and Medical Oncology x and Molecular Carcinogenesis, { Netherlands Cancer Institute, Amsterdam, the Netherlands; and the Laboratory Medicine, k University of California, San Francisco, San Francisco California Accepted for publication April 16, 2019. Address correspondence to Annuska M. Glas, Ph.D., Research and Development and Product Support, Agendia NV, Science Park 406, 1098 XH Amsterdam, the Nether- lands Q6 . E-mail: annuska.glas@ agendia.com. Next-generation DNA sequencing is rapidly becoming an indispensable tool for genome-directed cancer diagnostics, but next-generation RNA sequencing (RNA-seq) is currently not standardly used in clinical diagnostics for expression assessment. However, multigene RNA diagnostic assays are used increasingly in the routine diagnosis of early-stage breast cancer. Two of the most widely used tests are currently available only as a central laboratory service, which limits their clinical use. We evaluated the use of RNA-seq as a decentralized method to perform such tests. The MammaPrint and BluePrint RNA-seq tests were found to be equivalent to the clinically validated microarray tests. The RNA-seq tests were highly reproducible when performed in different locations and were stable over time. The MammaPrint RNA-seq test was clinically validated. Our data demonstrate that RNA-seq can be used as a decentralized plat- form, yielding results substantially equivalent to results derived from the predicate diagnostic device. (J Mol Diagn 2019, -:1e16; https://doi.org/10.1016/j.jmoldx.2019.04.007) In Q7 recent years, RNA-sequencing (RNA-seq) technology has become the standard for transcriptome analysis because of its exibility, sensitivity, and accuracy in measuring gene expression. RNA-seq is a technology in which cDNA is derived from RNA, followed by library construction and massively parallel deep sequencing. 1 RNA-seq is the next- generation sequencing (NGS) counterpart of microarrays. However, unlike microarray, RNA-seq provides not only an in-depth view of transcript abundance, but also confers the ability to detect novel RNA transcript variants. 2 From a technical perspective, RNA-seq has proved to be highly reproducible and has low background. Considering these characteristics, RNA-seq has the potential to improve clin- ical diagnostics for a wide range of diseases, including cancer. To introduce the use of this technology in clinical practice, efforts are ongoing to establish reference standards and best practices for RNA-seq as a clinical test. 2 MammaPrint was one of the rst gene-expressionebased diagnostic tests introduced into the clinical management of breast cancer 3e7 and the rst in vitro diagnostic multivariate index assay using gene expression that obtained US Food and Supported by the European Union Seventh Framework Programme FP7/ 2007e2013, under the Rational Therapy for Breast Cancer project grant 258967, work package 8 (A.M.G., principal investigator Q5 ) Q26 . Disclosures: Authors (except A.B.L., C.P.P., and S.C.L.) are employed by Agendia (R.B. and L.V.V. part-time). C.D. is a consultant for Agendia, the commercial entity that markets the 70-gene signature as MammaPrint and the 80-gene signature as BluePrint. L.V.V. and R.B. are shareholders of Agendia and are named inventors on the patent for the 70-gene signature used in this study. A.M.G. is named inventor on the patent for the 80-gene signature used in this study. A.B.L. and C.P.P. are employed by Agilent Technologies, the commercial entity that manufactures the microarrays and SureSelect next-generation sequencing probes and reagents for the Mam- maPrint and BluePrint assays. No writing assistance was used in the pro- duction of this article. Copyright ª 2019 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0). https://doi.org/10.1016/j.jmoldx.2019.04.007 jmd.amjpathol.org The Journal of Molecular Diagnostics, Vol. -, No. -, - 2019 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 FLA 5.6.0 DTD ĸ JMDI802_proof ĸ 22 July 2019 ĸ 8:07 pm ĸ EO: 2018_374

MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

  • Upload
    others

  • View
    12

  • Download
    0

Embed Size (px)

Citation preview

Page 1: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

Q27

Q1Q2

Q3Q4

Q6

Q7

The Journal of Molecular Diagnostics, Vol. -, No. -, - 2019

1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162

jmd.amjpathol.org

636465666768697071727374

MammaPrint and BluePrint Molecular 7576 Diagnostics Using Targeted RNA Next-Generation 7778798081828384

Sequencing TechnologyLorenza Mittempergher,* Leonie J.M.J. Delahaye,* Anke Witteveen,* Jacob B. Spangler,y Fariet Hassenmahomed,*Sammy Mee,y Soufiane Mahmoudi,* Jiang Chen,y Simon Bao,y Mireille H.J. Snel,* Sandra Leidelmeijer,* Naomi Besseling,*Anne B. Lucas,z Carlos Pabón-Peña,z Sabine C. Linn,x Christa Dreezen,* Diederik Wehkamp,* Bob Y. Chan,y René Bernards,*{

Laura J. van ’t Veer,*k and Annuska M. Glas*

85868788

From the Agendia NV,* Amsterdam, the Netherlands; the Agendia Inc.,y Irvine, California; the Diagnostics and Genomics Group,z Agilent Technologies,Santa Clara, California; the Divisions of Molecular Pathology and Medical Oncologyx and Molecular Carcinogenesis,{ Netherlands Cancer Institute,Amsterdam, the Netherlands; and the Laboratory Medicine,k University of California, San Francisco, San Francisco California

899091

Accepted for publication

C

T

h

9293949596979899

April 16, 2019.

Address correspondence toAnnuska M. Glas, Ph.D.,Research and Development andProduct Support, Agendia NV,Science Park 406, 1098 XHAmsterdam, the Nether-lands. E-mail: [email protected].

opyright ª 2019 American Society for Inve

his is an open access article under the CC B

ttps://doi.org/10.1016/j.jmoldx.2019.04.007

100101102

Next-generation DNA sequencing is rapidly becoming an indispensable tool for genome-directed cancerdiagnostics, but next-generation RNA sequencing (RNA-seq) is currently not standardly used in clinicaldiagnostics for expression assessment. However, multigene RNA diagnostic assays are used increasinglyin the routine diagnosis of early-stage breast cancer. Two of the most widely used tests are currentlyavailable only as a central laboratory service, which limits their clinical use. We evaluated the use ofRNA-seq as a decentralized method to perform such tests. The MammaPrint and BluePrint RNA-seq testswere found to be equivalent to the clinically validated microarray tests. The RNA-seq tests were highlyreproducible when performed in different locations and were stable over time. The MammaPrint RNA-seqtest was clinically validated. Our data demonstrate that RNA-seq can be used as a decentralized plat-form, yielding results substantially equivalent to results derived from the predicate diagnostic device.(J Mol Diagn 2019, -: 1e16; https://doi.org/10.1016/j.jmoldx.2019.04.007)

Supported by the European Union Seventh Framework Programme FP7/2007e2013, under the Rational Therapy for Breast Cancer project grant258967, work package 8 (A.M.G., principal investigator Q5) Q26.

Disclosures: Authors (except A.B.L., C.P.P., and S.C.L.) are employedby Agendia (R.B. and L.V.V. part-time). C.D. is a consultant for Agendia,the commercial entity that markets the 70-gene signature as MammaPrintand the 80-gene signature as BluePrint. L.V.V. and R.B. are shareholders ofAgendia and are named inventors on the patent for the 70-gene signatureused in this study. A.M.G. is named inventor on the patent for the 80-genesignature used in this study. A.B.L. and C.P.P. are employed by AgilentTechnologies, the commercial entity that manufactures the microarrays andSureSelect next-generation sequencing probes and reagents for the Mam-maPrint and BluePrint assays. No writing assistance was used in the pro-duction of this article.

103104105106107108109110111112113114115116117118119120121122123

In recent years, RNA-sequencing (RNA-seq) technologyhas become the standard for transcriptome analysis becauseof its flexibility, sensitivity, and accuracy in measuring geneexpression. RNA-seq is a technology in which cDNA isderived from RNA, followed by library construction andmassively parallel deep sequencing.1 RNA-seq is the next-generation sequencing (NGS) counterpart of microarrays.However, unlike microarray, RNA-seq provides not only anin-depth view of transcript abundance, but also confers theability to detect novel RNA transcript variants.2 From atechnical perspective, RNA-seq has proved to be highlyreproducible and has low background. Considering thesecharacteristics, RNA-seq has the potential to improve clin-ical diagnostics for a wide range of diseases, includingcancer. To introduce the use of this technology in clinicalpractice, efforts are ongoing to establish reference standardsand best practices for RNA-seq as a clinical test.2

stigative Pathology and the Association for M

Y-NC-ND license (http://creativecommons.org

FLA 5.6.0 DTD � JMDI802_proo

MammaPrint was one of the first gene-expressionebaseddiagnostic tests introduced into the clinical management ofbreast cancer3e7 and the first in vitro diagnostic multivariateindex assay using gene expression that obtained US Food and

olecular Pathology. Published by Elsevier Inc.

/licenses/by-nc-nd/4.0).

124

f � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 2: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

Q8

Q9

10

Mittempergher et al

125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186

187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247

Drug Administration (FDA) 510(k) clearances for both fresh-frozen (FF) and formalin-fixed, paraffin-embedded (FFPE)tissue.8 Extensive validation studies and, above all, the recentMINDACT clinical trial9 have demonstrated the clinicalutility ofMammaPrint at the highest level of clinical evidence(level 1A),making it a unique example of amicroarray-based,clinical diagnostic test that helps guide physicians in adjuvanttreatment decisions for early-stage breast cancer patients.Specifically, MammaPrint determines the expression of 70signature genes and stratifies early-stage breast cancer pa-tients at low and high risk for developing distant metastaseswithin 5 years after diagnosis. A second microarray test forbreast cancer is BluePrint, which measures the expression of80 signature genes to reveal breast cancer subtypes (ie,luminal, HER2, and basal types).10,11

Currently, MammaPrint and BluePrint tests are performedat central laboratories in the Netherlands and United Statesusing microarray technology. With a centralized setting, it isdifficult to serve countries with legal restrictions on sendingpatient material outside the country; furthermore, reim-bursement often requires local processing and forms apractical obstacle to clinical implementation of these tests.Therefore, decentralized diagnostic tests would offer tophysicians an in-house solution, ideally without compro-mising the level of clinical validity. RNA-seq NGS tech-nology is highly reproducible and currently easier toimplement than microarray in preexisting infrastructure andexperimental workflows because of the increased use ofRNA-seq technology in many laboratories worldwide. Inaddition, RNA NGS shows good performance with RNAderived from FFPE tissues, even from small amounts ofstarting material.12,13 Herein, we evaluate the use of RNANGSebased technology to perform clinical diagnostictesting, using the MammaPrint/BluePrint FFPE microarray-based assay as an example.

Several NGS technologies are available to date, includingion semiconductor sequencing,14 single-molecule real-timesequencing,15 nanopore sequencing,16 and sequencing bysynthesis.17 The sequencing-by-synthesis technologyaccounts for the largest share of the market, mostly becauseof the development of new and advanced NGS platformsand the increasing demand for the systems (including theHiSeq series and MiSeq) of Illumina (Hayward, CA). Atargeted RNA-seq approach on the Illumina NGS technol-ogy that combines the RNA (cDNA) library preparationwith the enrichment of our MammaPrint/BluePrint tran-scripts of interest using complementary capture probes waschosen. More important, such a targeted RNA NGSapproach does not rely on poly-A tail enrichment or useribosomal RNA depletion (eg, Ribo-Zero), and it can bereliably applied on degraded and cross-linked RNAs, suchas those from FFPE tissues.13

The translation of a diagnostic test from one technologyplatform to another one requires a series of experimentsthat assess the analytical and clinical performance of thenewly developed test with respect to the standard

2FLA 5.6.0 DTD � JMDI802_pro

technology. Therefore, different validation experimentswere performed to assess the equivalence of the Mam-maPrint and BluePrint NGS test to the diagnostic micro-array test and its reproducibility between differentsamplings of the same tumor block (surrogate for intra-tumor heterogeneity), over time (precision) and betweendifferent laboratories (interlaboratory agreement). More-over, the clinical performance of the MammaPrint andBluePrint NGS test was assessed in a set of samples withknown clinical outcome, which were previously used tovalidate the clinical utility of the MammaPrint FDA-cleared (Dx) diagnostic test Q.We describe herein the development of targeted RNANGS

MammaPrint/BluePrint tests and their technical and clinicalperformance compared with the standard microarray-basedMammaPrint and BluePrint diagnostic tests.

Materials and Methods

Sample Selection

FFPE samples selected for this study fulfilled the followingcriteria: i) They complied with the intended use of Mam-maPrint and BluePrint in vitro diagnostic tests: stage I, stageII, and operable stage III invasive breast cancer, tumor size�5.0 cm, and lymph nodeenegative and one to three lymphnodeepositive patients. ii) They had microarray Mam-maPrint results covering the entire MammaPrint indexrange. iii) They had at least five samples for each of thethree BluePrint subtypes based on microarray analysis. Inaddition to these including criteria, FFPE samples selectedfor the clinical comparison had a minimum follow-up of 5years from the date of diagnosis.

Targeted RNA-Seq Workflow

OverviewTo develop the targeted RNA NGS MammaPrint/BluePrinttest, a modified version of the Agilent SureSelectXT pro-cedure, which combines a strand-specific RNA library withan RNA Enrichment System starting from FFPE tissue, wasused. For the general laboratory procedure, please refer tothe SureSelectXT RNA Direct Protocol (as per manufac-turer’s instructions; Agilent Technologies, Santa Clara,CA); and for the Agendia MammaPrint/BluePrint NGS testprocedure, please visit the Agendia website (Agendia NV,Amsterdam, the Netherlands). The SureSelect TargetEnrichment workflow is a solution-based system usingultralong 120-mer biotinylated cRNA baits (so-called cap-ture probes) to capture regions of interest, enriching themout of a NGS genomic fragment library. The AgilentSureSelectXT technology performs a post-capture indexingwhere samples are indexed (through a PCR enrichment step)after the capture is performed on each sample individually.After the target enrichment preparation, the cDNA librarieswere sequenced using the MiSeq [research use only (RUO)

jmd.amjpathol.org - The Journal of Molecular Diagnostics

248

of � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 3: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

Q11

Q12

MammaPrint and BluePrint Using RNA-Seq

249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310

311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371

and Dx] Illumina sequencers. Subsequently, the expressionof the MammaPrint and BluePrint signature genes wasassessed using the sequence read count data and translatedinto the outcome test prediction. Supplemental Figure S1represents the overall laboratory workflow performed inthis study.

NGS Probe DesignFirst, the Agendia MammaPrint and BluePrint signaturegenes, previously reported,3,10,18 were translated into cap-ture probes to be used in the Agilent SureSelectXT work-flow. The target enrichment panel (ie, capture probes) wasdesigned by first mapping the microarray oligonucleotidesequences to hg19 (ie, human assembly GRCh37), followedby target gene annotation. Genomic locations of probesmapping uniquely to intergenic or noncoding regions wereconfirmed against hg38 (ie, human assembly GRCh38)using BLAT19; 120-bp capture probes were then tiled acrossfull transcripts based on RefSeq,20 CCDS,21 Ensembl,22

GENCODE,23 and VEGA24 reference databases. Thetarget regions used to design the SureSelect probes are re-ported in Supplemental Table S1. Each signature gene wastargeted by multiple probes, from a minimum of 2 up to amaximum of 184 target probes.

The probe selection process resulted in an Agendia baitcapture library covering 0.92 megabases of genomic re-gions. The Agendia bait capture library (alias MammaPrintBluePrint NGS Panel) targets a total of 178 unique genes.

FFPE RNA Quality Assessment and PreparationRNA isolation was performed on four FFPE tissue sections(5 mm thick) with either the RNeasy FFPE kit or AllPrep kit(Qiagen, Valencia, CA) using xylene for deparaffinization,in accordance with the manufacturer’s instructions. IsolatedFFPE Total RNA had to have 260/280 and 260/230absorbance ratio values close to 2.0 for both ratios. Toassess the quality of the total RNA, the Agilent 2100 Bio-analyzer instrument was used to calculate the DV200 metric,which represents the percentage of RNA fragments >200nucleotides. Samples with DV200 <20% were excludedfrom the analysis.

cDNA Library PreparationFFPE Total RNA was chemically fragmented to an appro-priate size (approximately 200 nucleotides), and the RNAfragments were bound to random primers. The recom-mended fragmentation conditions vary, depending on theinitial quality of the RNA sample, with lower-qualitysamples subjected to fragmentation at a lower temperatureand for a shorter duration. cDNA was synthesized, and endswere repaired. cDNA was adenylated at the 30 end, andIllumina P5/P7 (flow cell binding sites) adapters wereligated. The cDNA adaptereligated library was enriched byPCR amplification. Quality and quantity of the cDNAadaptereligated library were assessed using the Agilent2100 Bioanalyzer instrument.

The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.6.0 DTD � JMDI802_proo

RNA Target EnrichmentPrepared cDNA adaptereligated librarywas hybridized to theAgendia bait capture library overnight. The hybridizationreaction combined the prepared cDNA library with theSureSelect hybridization reagents and specific blockingagents. After hybridization was completed (17 to 24 hours),cDNA library/capture library hybrid was captured usingstreptavidin-coated beads that bind to the biotinylated cRNAbaits. Next, the cDNA adaptereligated and captured librarywas enriched by PCR amplification. The PCR amplificationstep adds index tags while it amplifies the captured library.Samples (cDNA adaptereligated, captured, and indexedlibraries) can then be pooled for multiplexed sequencing.Quality and quantity of the cDNA adaptereligated andcaptured library were assessed using the Agilent 2100 Bio-analyzer instrument. To proceed with sequencing, at least 1nmol/L of cDNA adaptereligated and capture library (be-tween 150 and 700 bp) was targeted.

SequencingcDNA adaptereligated and captured libraries were pooledat equimolar concentration before sequencing. Sample poolsof 1, 2, or 4 nmol/L starting concentration were denaturedusing sodium hydroxide and diluted with hybridization HT1buffer, as per manufacturer’s instructions (Illumina). Sam-ples were single-end sequenced on the Illumina MiSeq orMiSeq DX (RUO mode) instrument at the length of 150 bpusing the MiSeq Reagent Kit V3 (150 cycles). A single-endsequencing protocol was chosen because it provided suffi-cient read count and resolution to assess the expression ofthe genes in our target panel.

Sequencing Data AnalysisSequence reads in the FASTQ format were generated andanalyzed following a multistep procedure (SupplementalFigure S2). First, the quality of each FASTQ sequence wasassessed using the FASTQC version 0.11.5 software (https://www.bioinformatics.babraham.ac.uk/projects/fastqc; lastaccessed November 13, 2018). Next, the reads weretrimmed with the Trimmomatic version 0.36 software,25

and trimmed sequence reads were mapped to the EnsemblGRCh37 human genome using the HISAT2 version 2.0.4software.26 HISAT2 output files in a SAM (sequence align-ment/map format) were compressed to binary sequencealignment/map files using SAMtools version 1.3.127 priorsequence read quantification. Afterward, mapped reads werecounted using the feature Counts version 1.5.1 software.28

FASTQ files of the samples analyzed in this study havebeen submitted to the European Nucleotide Archive database(https://www.ebi.ac.uk/ena) under the study accessionnumber PRJEB31802.

MammaPrint and BluePrint Outcome CalculationFor each gene in every sample, raw sequence read countwas normalized using a count per million normalizationmethod. By correcting the total counts in each sample by the

3

372

f � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 4: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

Q13

Q14

Q15

Mittempergher et al

373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434

435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495

total number of reads, samples sequenced in different poolscan be compared with each other. Normalization was per-formed in R version 3.3.2.29 After the data were normalized,MammaPrint and BluePrint indexes were calculated usingthe same algorithm for current MammaPrint and BluePrinttests as previously described.3,6,8,10,30 In brief, the Mam-maPrint indexes are calculated by taking the expression ofthe 70 MammaPrint genes for each sample and correlatingthem to the MammaPrint Risk templates. If the MammaP-rint index is >0, the sample is classified as low risk; whenthe index is �0, then the sample is classified as high risk.Similarly, the BluePrint indexes are calculated by taking theexpression of the 80 BluePrint genes and comparing themwith the three different templates, one for each of the sub-types (luminal, HER2, and basal types). The subtype withthe highest (scaled) correlation is the reported subtype.

Data Analysis and Statistical Analysis

Data were analyzed using the MATLAB software versionR2012a (MathWorks, Natick, MA) and RStudio softwareversion 1.1.442, based on R software version 3.4.2. Datawere visualized using MATLAB software, MultipleExperiment Viewer version 4.8.1, Microsoft PowerPoint(Microsoft Corp., Redmond, WA), and Excel 2016 MSO16.0.9330.2124 (Microsoft Corp.). Unsupervised clusteringanalyses were performed with Multiple Experiment Viewer.

The following was done to compare gene expressionlevels obtained with NGS and microarray technologies. Foreach of the 70 MammaPrint and 80 BluePrint genes,expression levels were compared between 204 NGS sam-ples and 204 matched microarray samples using Pearsoncorrelation (NGS versus microarray paired). For eachsignature gene, a not paired comparison was also performed,where its NGS expression was correlated to the microarrayor the NGS expression of any of the other signature genesand vice versa (NGS versus not paired microarray, NGSversus not paired NGS, microarray versus not paired NGS,and microarray versus not paired microarray). Pearson co-efficients generated per gene from the not paired compari-sons were averaged to obtain one not paired Pearsoncoefficient value per gene.

To compare the 70 MammaPrint gene expression levels indifferent MammaPrint risk groups, for each of the 70 Mam-maPrint genes, the average NGS and microarray mediancentered expression of the high- and the low-risk patientsampleswas calculated separately. NGS andmicroarray high-and low-risk aggregate expressions were compared usingPearson correlation. Similarly, for each of the 80 BluePrintgenes, the average NGS and microarray median centeredexpression of the luminal-, HER2-, and basal-type patientsamples was calculated separately. NGS and microarrayluminal-, HER2-, and basal-type aggregate expression levelswere compared using Pearson correlation.

The equivalence of MammaPrint and BluePrint indexeswas determined by a Pearson correlation for assessment of

4FLA 5.6.0 DTD � JMDI802_pro

the degree of linear correlation and a Passing-Bablokregression analysis to obtain the regression equation. Scat-terplots were used to visually examine the existence of anyconstant bias in the difference of measurements betweensamples analyzed with targeted RNA-seq and microarraytechnologies. Acceptable limits were defined a priori, on thebasis of microarray data generated at Agendia between 2012and 2016 using the Agendia’s Test Suite. MammaPrint andBluePrint test outcomes (MammaPrint: high/low risk;BluePrint: luminal, basal, and HER2 types), assessed usingtargeted RNA-sequencing technology, were compared withthe standard microarray diagnostic outcomes using a con-tingency table. On the basis of the contingency table,concordance, negative percentage agreement (NPA), andpositive percentage agreement (PPA) were determined. TheNPA assesses whether the percentage of MammaPrint low-risk results obtained with microarray technology isconcordant with the results obtained with the targeted RNA-seq technology. Similarly, PPA is the percentage of Mam-maPrint high-risk results obtained by one techniquecompared with that obtained by the other technique.Concordance, NPA, and PPA should be �90%. The relativestability (reproducibility) is calculated by 100 minus therelative SD, which is the SD measured as percentile of thetotal MammaPrint or BluePrint NGS dynamic rangeassessed on a set of 326 samples sequenced in the Agendialaboratories. Clinicopathologic data and clinical outcomedata were analyzed using the statistical package SPSS 22.0for Windows (SPSS Inc., Chicago, IL). All validation ex-periments to assess technical equivalence, reproducibility,tumor heterogeneity, and clinical performance of the tar-geted RNA NGS MammaPrint/BluePrint test were designedaccording to guidelines of the FDA.31 Interlaboratoryreproducibility was tested at the Agendia central laboratory(Amsterdam, the Netherlands; site 1), at University HospitalLeuven (Leuven, Belgium; site 2), and at Curie Institute(Paris, France; site 3).

Study Approval

Only data and not samples were collected for this study. Alldata and analyses used or performed in this study complywith the current ethical laws of the Netherlands. All patientsample data were anonymized in accordance with nationalethical guidelines (Code for Proper Secondary Use ofHuman Tissues, Dutch Federation of Medical ScientificSocieties), and study samples had institutional review boardapprovals.

Results

Targeted RNA Next-Generation SequencingPerformance

To represent the MammaPrint and BluePrint diagnosticsignatures, a target gene panel consisting of a total of 178

jmd.amjpathol.org - The Journal of Molecular Diagnostics

496

of � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 5: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

½F1�½F1�

½F2�½F2�

print&

web4C=FPO

Q23

MammaPrint and BluePrint Using RNA-Seq

497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558

559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592

genes (MammaPrint, BluePrint, and control genes), eachrepresented by different 120-mer capture probes targetingtheir exome sequence (see Materials and Methods fordetails), was selected. To assess the specificity of thesequence capture strategy, a relevant parameter is thenumber of reads that map to the targeted sequence that, inthis study, corresponds to 0.9 megabases in size.

To assess the analytical performance of the MammaPrint/BluePrint targeted RNA NGS test (referred to as NGS forbrevity), a total of 327 FFPE breast cancer samples(Supplemental Table S2) were successfully sequenced in thisstudy. On average, 21 samples were pooled per MiSeq run(minimum number of samples per pool Z 9, maximumnumber of samples per pool Z 32). Overall, an average of9.45 � 105 reads per sample (minimum Z 1.95 � 105,maximum Z 4.03 � 106), of which 96.3% mapped to thehuman genome (hg19), were obtained. Of all reads, onaverage, 79.8% were on target (eg, uniquely assigned to theMammaPrint, BluePrint, and control genes). This indicatesthat the capture strategy can efficiently enrich theMammaPrintand BluePrint signature genes. Figure 1 depicts the relativemapping efficiency for the 327 samples by showing the relativepercentage of reads mapped to the human genome (mappedread count) and the relative percentage of reads on target (on-target read count) versus the total read count set to 100%.

When looking at the normalized read count of MammaP-rint and BluePrint genes in the 327 samples, an average of2.45� 103 reads perMammaPrint gene (maximumZ 3.23�104, minimumZ 11) and an average of 4.93� 103 reads perBluePrint gene (maximum Z 2.87 � 104, minimum Z 35)were observed (Supplemental Figure S3). These read count

½F3�½F3�

Figure 1 Targeted RNA-sequencing performance of MammaPrint andBluePrint next-generation sequencing test. Relative mapping efficiency forthe 327 samples is shown as the relative percentage of reads mapped to thehuman genome (mapped read count; orange) and the relative percentage ofreads on target (on-target read count; green) versus the total read countset to 100% (total read count; purple).

The Journal of Molecular Diagnostics - jmd.amjpathol.org

593594595596597598599600601602603604605606607608609610611612613614615616617618619

FLA 5.6.0 DTD � JMDI802_proo

estimates indicated that all target genes were expressed in thesample set, and their expression was captured during thetarget enrichment step. High variation in the gene read countsis expected because of the biological variation in expressionof the signature genes, depending on the MammaPrint riskgroup and BluePrint subtype of the sample.

Gene Expression Comparative Analysis between NGSand Microarray

To assess the correlation between gene expression measuredwith NGS and microarray, the MammaPrint 70-gene andBluePrint 80-gene expression levels were analyzed in asubset of 204 unique FFPE patient samples for whichmatched NGS and microarray data were available from thesame RNA sample. NGS and microarray normalizedexpression was median centered per gene separately foreach technique (ie, NGS and microarray). A similar strategyhas been used previously in a different context, to adjust forplatform bias.32e34

First, the Pearson correlation was calculated for eachMammaPrint and BluePrint gene between its paired and notpaired median centered expression across the 204 matchedNGS and microarray samples. The distribution of Pearsoncorrelation coefficients generated from the comparison be-tween NGS and microarray paired gene expression profilesdiffered significantly from the distribution of the not pairedprofiles for both MammaPrint and BluePrint genes, indi-cating a good concordance between gene expressionmeasured with the two techniques (Figure 2). Specifically,for NGS and microarray paired profiles, the average Pearsoncorrelation was 0.582 � 0.223 for the MammaPrint genesand 0.748 � 0.188 for the BluePrint genes. For not pairedprofiles, the Pearson correlation coefficient significantlydecreased (Wilcoxon signed-rank test P < 0.0001) for bothMammaPrint and BluePrint genes (MammaPrint genes:0.064 � 0.089 for NGS versus not paired microarray,0.064 � 0.171 for microarray versus not paired NGS,0.213 � 0.194 for NGS versus not paired NGS, and 0.195 �0.120 for microarray versus not paired microarray; Blue-Print genes: 0.104 � 0.198 for NGS versus not pairedmicroarray, 0.104 � 0.125 for microarray versus not pairedNGS, 0.102 � 0.144 for NGS versus not paired NGS, and0.246 � 0.180 for microarray versus not paired microarray).As expected, the average correlation between not pairedsamples of the same technique type is higher than the onemeasured between not paired samples of a different tech-nique type (MammaPrint genes: 0.204 versus 0.064; Blue-Print genes: 0.174 versus 0.104).

Next, the expression of the 70 MammaPrint genes and the80 BluePrint genes measured with NGS, with the expressionmeasured with microarray using heat maps, was visuallycompared. NGS expression was highly comparable with themicroarray expression for both MammaPrint and BluePrintgenes (Figure 3). For both MammaPrint and BluePrintgenes, NGS expression intensity was greater than the

5

620

f � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 6: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

½F4�½F4�

½F5�½F5�

Figure 2 Distribution of Pearson correlation co-efficients generated from the comparison betweennext-generation sequencing (NGS) and microarray(MA) expression of the MammaPrint (A) and theBluePrint (B) genes in paired (NGS versus MA) and notpaired (NGS versus not paired MA, NGS versus notpaired NGS, MA versus not paired NGS, and MA versusnot paired MA) gene expression profiles across 204NGS and MA matched patient samples. The x axisreports the comparison type; the y axis reports theaverage Pearson correlation of each comparison typefor each of the MammaPrint or BluePrint genes. Eachdot represents a gene for which the average of thePearson correlations for each comparison type iscalculated. Normalized NGS and microarray expressiondata are median centered before correlation calcula-tion. Color code: NGS versus MA paired, black; NGSversus not paired MA, white; NGS versus not pairedNGS, gray; MA versus not paired NGS, light gray; MAversus not paired MA, light yellow. N Z 70 (A);N Z 80 (B).

print&web4C=FPO

Mittempergher et al

621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682

683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743

microarray expression intensity, indicating a larger geneexpression dynamic range for NGS than for microarray forMammaPrint and BluePrint genes. Of importance, the 70-gene heat map for both NGS and microarray resemblesthe heat map originally published using fresh-frozen RNA.3

Pearson correlation between the average NGS and micro-array expression of the 70 genes in MammaPrint high-risk(N Z 122) and low-risk (N Z 82) patients was high(high-risk Pearson r Z 0.870, low-risk Pearson r Z 0.876).Similar to the 70 genes, strong correlations were observedbetween the average NGS and microarray expression of the80 genes in BluePrint luminal-type (N Z 133), HER2-type(N Z 42), and basal-type (N Z 29) patient samples(luminal-type Pearson r Z 0.902, HER2-type Pearsonr Z 0.887, basal-type Pearson r Z 0.869).

Technical Equivalence of MammaPrint and BluePrintResults for NGS and Microarray

Equivalence of NGS and microarray technologies wasstudied by performing both diagnostic tests on a set of 85FFPE breast cancer samples for MammaPrint and on a set of98 breast cancer samples for BluePrint (Supplemental TableS2). All samples fulfilled the selection criteria of the study(see Materials and Methods).

6FLA 5.6.0 DTD � JMDI802_pro

MammaPrint NGS indexes showed high correlation withthe matching microarray indexes (Pearson r Z 0.978), asconfirmed by the Passing-Bablok regression analysis35

(Figure 4A and Supplemental Table S3). The Bland-Altmananalysis showed no bias toward one platform, and the meandifference between the NGS and the microarray indexes was0.1% of the observed dynamic range for the NGS test; thisdifference is similar to the technical variance observed usingFF tissue on the microarray-to-microarray comparison.30

Concordance of categorical MammaPrint low- and high-risk classification was 97.6% with an NPA of 94.0% (95%CI, 81.9%e98.5%) and a PPA of 100%. A k score of 0.82(95% CI, 0.74e0.89) indicated almost perfect agreement.31

The samples that were discordant in MammaPrint classifi-cation results (N Z 2) had indexes close to the classificationthreshold, with a difference between the NGS and themicroarray indexes of 0.018 and to 0.137, respectively. Thesedifferences fall within 1 SD from the average index difference(0.083 � 0.077) of the full sample set (N Z 85).Similar to MammaPrint, BluePrint NGS indexes for each

subtype showed high correlation with thematchingmicroarrayindexes (luminal-type Pearson rZ 0.979, HER2-type PearsonrZ 0.788, basal-type Pearson rZ 0.987), as confirmed by thePassing-Bablok regression analysis (Figure 5A andSupplemental Table S3). The correlation between NGS and

jmd.amjpathol.org - The Journal of Molecular Diagnostics

744

of � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 7: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

print&web4C=FPO

Figure 3 A: Expression heat maps of the MammaPrint 70 genes measured with next-generation sequencing (NGS; left panel) and microarray (right panel)in 204 matched NGS and microarray patient samples. Samples are ordered on the basis of their microarray MammaPrint index, indicated as scatterplot graphnext to the heat map. Genes are clustered on the basis of their NGS expression. B: Expression of the BluePrint 80 genes measured with NGS (left panel) andmicroarray (right panel) in 204 matched NGS and microarray patient samples. Samples are ordered on the basis of their microarray BluePrint subtype,indicated as color bar next to the heat map (luminal type, blue; HER2 type, yellow; basal type, red). Genes are clustered on the basis of their NGS expression.Samples are shown in rows, and genes are shown in columns. Genes are clustered using hierarchical clustering, Pearson correlation distance, and averagelinkage. Data are normalized, Log2 Q24transformed, and median centered before visualization. The color bar underneath the heat map reports the expression range(from �3 to 3: 70 genes; from �5 to 5: 80 genes).

MammaPrint and BluePrint Using RNA-Seq

745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806

807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858859860861862863864865866867

microarray HER2-type indexes was lower than for the luminaland basal-type indexes. This is because of the fact that mostsamples were not HER2 type and, therefore, the HER2 NGSand microarray indexes of noneHER2-type samples maydiffer more from each other. In addition, higher variability forthe HER2 index is expected because the HER2 subtype isdefined by a signature of four genes and expression changesoccurring in each of the four HER2 signature genes can have ahigher impact on the HER2 signature readout than expressionchanges occurring in each of the luminal and basal signaturegenes that are part of a more numerous set (58 luminal signa-ture, 28 basal signature).10 Of importance, most genes thatdefine the HER2 signature are underexpressed in noneHER2-type samples and, therefore, the measurement precision basedon low expressed genes could be poorer. For all three subtypes,the Bland-Altman analysis showed no bias toward one plat-form, with a mean difference between the NGS and themicroarray indexes smaller than the control technical variance

The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.6.0 DTD � JMDI802_proo

observed for FFPE microarray. The mean difference betweenthe NGS and the microarray indexes was 0.8%, 2.8%, and2.5% of the observed NGS dynamic range for the luminal,HER2, and basal types, respectively (Supplemental Table S4).Overall concordance of categorical luminal-, HER2-, andbasal-type classification was 100%.

Reproducibility of MammaPrint and BluePrint NGSResults Using Two Independent RNA Isolations fromthe Same Tumor Tissue Block

Cellular heterogeneity can occur within the same primarytumor (alias intratumor heterogeneity).36e38 Gene expres-sion analysis results from different parts of the same tumorcan, therefore, be variable. To evaluate the possible effect ofintratumor heterogeneity, the reproducibility of the Mam-maPrint/BluePrint NGS test was also assessed between twodifferent RNA isolations from the same FFPE tumor block.

7

868

f � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 8: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

Figure 4 Scatterplot comparison analysis of MammaPrint indexes in different validation studies. Each dot represents a single breast cancer sample forwhich total RNA underwent next-generation sequencing (NGS) or microarray laboratory processing and analysis. A: Scatterplot showing equivalence ofMammaPrint microarray [predicate device, US Food and Drug Administration (FDA) 510(K141142)] indexes (x axis) versus MammaPrint NGS (targeted RNA NGStest) indexes (y axis). B: Scatterplot showing equivalence of MammaPrint NGS indexes assessed from RNAs isolated from two independent isolations, isolation1 (x axis) versus isolation 2 (y axis). C: Scatterplot showing equivalence of MammaPrint NGS indexes assessed using MiSeq instrument (x axis) versusMammaPrint NGS indexes assessed using MiSeq FDA-cleared (Dx) instrument (y axis). D: Scatterplot showing equivalence of MammaPrint NGS indexes derivedfrom Qiagen RNeasy isolation method (x axis) versus MammaPrint NGS indexes derived from Qiagen AllPrep isolation method (y axis).

Mittempergher et al

869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912913914915916917918919920921922923924925926927928929930

931932933934935936937938939940941942943944945946947948949950951952953954955956957958959960961962963964965966967968969970971972973974975976977978979980981982983984985986987988989990991

Each sample block was sectioned twice: the first set of fourslides for the first isolation and the second set of four slidesfor the second isolation. This analysis is an approximationof tumor heterogeneity; a true tumor heterogeneity studywould analyze two different parts of the tumor. Moreover,this experiment aims to assess a more so-called biologicalvariable (ie, cellular heterogeneity of the same primarytumor) than an analytical variable. Forty-three matchedsamples were collected for this analysis. A high correlationwas observed in MammaPrint NGS indexes, as well as forBluePrint NGS indexes per subtype derived from these twoindependent isolations (MammaPrint Pearson r Z 0.993,luminal-type Pearson r Z 0.995, basal-type PearsonrZ 0.993, HER2-type Pearson rZ 0.990), as confirmed bythe Passing-Bablok regression analysis. In this analysis, theintercepts for MammaPrint and BluePrint are close to zero

8FLA 5.6.0 DTD � JMDI802_pro

and the slopes are close to one, indicating that there is nobias in results between the two isolations (Figure 4B,Figure 5B, and Supplemental Table S3). For both Mam-maPrint and BluePrint, the Bland-Altman analysis showedno bias toward one of the two isolations, with a mean dif-ference between the matched NGS indexes smaller than themicroarray control technical variance. The mean differencebetween the matched NGS indexes was 0.2%, 0.1%, 2.1%,and 0.4% of the observed NGS dynamic range for Mam-maPrint, BluePrint luminal type, BluePrint HER2 type, andBluePrint basal type, respectively (Supplemental Table S4).Concordance of categorical MammaPrint low- and high-riskclassification between the two isolations was 97.7%, with anNPA of 100% and a PPA of 96.0% (95% CI, 80.5%e99.3%). Overall concordance of categorical BluePrintluminal-, HER2-, and basal-type classification was 100%.

jmd.amjpathol.org - The Journal of Molecular Diagnostics

992

of � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 9: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

Figure 5 Scatterplot comparison analysis of BluePrint (BP) luminal, HER2, and basal indexes in different validation studies. Each dot represents a singlebreast cancer sample for which total RNA underwent next-generation sequencing (NGS) or microarray laboratory processing and analysis. A: Scatterplotsshowing equivalence of BluePrint microarray indexes (x axis) versus BluePrint NGS (targeted RNA NGS test) indexes (y axis) in the luminal, HER2, and basalsubtypes. B: Scatterplots showing equivalence of BluePrint NGS indexes assessed from RNAs isolated from two independent isolations, isolation 1 (x axis)versus isolation 2 (y axis), in the luminal, HER2, and basal subtypes. C: Scatterplots showing equivalence of BluePrint NGS indexes assessed using MiSeqinstrument (x axis) versus BluePrint NGS indexes assessed using MiSeq US Food and Drug Administrationecleared (Dx) instrument (y axis) in the luminal,HER2, and basal subtypes. D: Scatterplots showing equivalence of BluePrint NGS indexes derived from Qiagen RNeasy isolation method (x axis) versus BluePrintNGS indexes derived from Qiagen AllPrep isolation method (y axis) in the luminal, HER2, and basal subtypes.

MammaPrint and BluePrint Using RNA-Seq

The Journal of Molecular Diagnostics - jmd.amjpathol.org 9

9939949959969979989991000100110021003100410051006100710081009101010111012101310141015101610171018101910201021102210231024102510261027102810291030103110321033103410351036103710381039104010411042104310441045104610471048104910501051105210531054

10551056105710581059106010611062106310641065106610671068106910701071107210731074107510761077107810791080108110821083108410851086108710881089109010911092109310941095109610971098109911001101110211031104110511061107110811091110111111121113111411151116

FLA 5.6.0 DTD � JMDI802_proof � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 10: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

½F6�½F6�

Mittempergher et al

11171118111911201121112211231124112511261127112811291130113111321133113411351136113711381139114011411142114311441145114611471148114911501151115211531154115511561157115811591160116111621163116411651166116711681169117011711172117311741175117611771178

1179118011811182118311841185118611871188118911901191119211931194119511961197119811991200120112021203120412051206120712081209121012111212121312141215121612171218121912201221122212231224122512261227122812291230123112321233123412351236123712381239

Reproducibility of MammaPrint and BluePrint NGSResults Using MiSeq and MiSeq Dx Instruments

Currently, Illumina offers an RUO and an Dx version of theMiSeq instrument. The MiSeq Dx can be also used in aRUO mode to allow users to run customized sequencingprotocols. Many diagnostic laboratories around the worldare starting to use a MiSeq Dx instrument; therefore, resultsgenerated from both instruments (MiSeq and MiSeq Dx inRUO mode) were compared using the same setting, toassess their degree of reproducibility. The experiments wereperformed in two centralized Agendia laboratories, one inIrvine, CA, using the MiSeq Dx instrument and one inAmsterdam, the Netherlands, using the MiSeq instrument.This analysis also included possible effects of site-to-sitevariation in the performance of the NGS test. Fifty FFPEsamples were selected and processed in both locations,starting from previously isolated RNA. Two samples failedthe internal quality control (QC) assessment and wereexcluded from the analysis; therefore, 48 matched sampleswere left for further analysis. Figures 4C and 5C showscatterplots of MammaPrint and BluePrint indexes gener-ated at the two sites with MiSeq and MiSeq Dx. A highcorrelation was observed in MammaPrint NGS indexes, aswell as for the BluePrint NGS indexes for each subtypederived from the two-site NGS processing and sequencing(MammaPrint Pearson r Z 0.991, luminal-type Pearsonr Z 0.998, HER2-type Pearson r Z 0.992, basal-typePearson r Z 0.995), as confirmed by the Passing-Bablokregression analysis (Supplemental Table S3). For bothMammaPrint and BluePrint, the Bland-Altman analysisshowed no bias toward one of the two isolations, with amean difference between the matched NGS indexes smallerthan the control technical variance. The mean differencebetween the matched NGS indexes was 1.0%, 1.1%, 1.6%,and 1.7% of the observed NGS dynamic range for Mam-maPrint, BluePrint luminal type, BluePrint HER2 type, andBluePrint basal type, respectively (Supplemental Table S4).

Concordance of categorical MammaPrint low- and high-risk classification between the two sites was 95.8% with anNPA of 89.5% (95%CI, 68.6%e97.1%) and a PPA of 100%.Two discordant cases were observed; however, the absolutedifferences between the NGS indexes obtained with the twoinstruments were smaller than the control technical variance.Overall concordance of categorical BluePrint luminal-,HER2-, and basal-type classification was 97.9%. The use oftwo different sequencing instruments in two different loca-tions had no significant influence on the outcome of the test.

Reproducibility of MammaPrint and BluePrint NGSResults Over Time

Reproducibility over time is a key indicator of precision fora diagnostic test. Therefore, three control patient samples(control 1, control 2, control 3) were repeatedly processedand sequenced over a 6-month period, starting from total

10FLA 5.6.0 DTD � JMDI802_pro

RNA onwards. Controls 1 and 2 were MammaPrint lowrisk, and control 3 was MammaPrint high risk. For control1, there were 26 NGS measurements collected, of which 25passed the QC assessment; for control 2, there were 23 NGSmeasurements collected, of which 17 passed the QCassessment; for control 3, there were 16 NGS measurementscollected, of which 14 passed the QC assessment.In this long-term validation experiment, nearly all po-

tential sources of variation are included by default (ie,different operators, reagent batches, and environmentalfactors).For both MammaPrint and BluePrint categorical classi-

fications, the concordance was 100% between all the mea-surements of each control sample. The reproducibility wasmeasured by the relative SD of the samples. MammaPrintshowed stable results over time (Figure 6) for all threecontrol patient samples, with relative SDs of 1.5% (control1, 25 measurements), 2.0% (control 2, 17 measurements),and 1.8% (control 3, 14 measurements). The MammaPrintNGS test showed a reproducibility (ie, relative stability) of98.5%, 98.0%, and 98.2% in the three control samples,respectively (Supplemental Table S4). This reproducibilityis slightly higher than using the microarray-based test.39

Similar to MammaPrint, BluePrint showed stable resultsover time, with an average relative SD over the three controlsamples of 1.5% (luminal type), 2.6% (HER2 type), and1.5% (basal type) (Supplemental Table S4). On the basis ofthese three control samples, the reproducibility of theBluePrint NGS test was 98.5% for the luminal type, 98.4%for the basal type, and 97.4% for the HER2 type(Supplemental Table S4). None of the three samplesanalyzed over time was classified as HER2 subtype and,therefore, the HER2 subtype index reproducibility was notmeasured. This had no effect on the BluePrint outcomes.Because therewas a higher failure rate for control sample 2,

which was isolated with a different RNA isolation methodthan the other two control samples (ie, Qiagen All Prep kitversus Qiagen RNeasy kit), it was further studied. Twenty-eight matched FFPE samples were selected, and each blockwas sectioned twice; and the first set of four slides was iso-lated with the RNeasy method, whereas the second set of fourslides was isolated with the All Prep isolationmethods. Of the56 samples analyzed (28 pairs), four samples failed for bothisolation methods, two for the RNeasymethod, and six for theAll Prep method. This result would support the idea that AllPrep gives a higher failure rate than the RNeasy method.However, larger sample numbers are needed to confirm andelucidate this finding. Additional validation of this hypothesisis still ongoing. Of importance, when looking at the resultsobtained for 16 RNeasy-AllPrep pairs successfullysequenced, high correlation was observed between the twoisolation methods for both MammaPrint NGS indexes andBluePrint NGS indexes for each subtype (MammaPrintPearson rZ 0.992, luminal-type Pearson rZ 0.997, HER2-type Pearson r Z 0.947, basal-type Pearson r Z 996)(Figure 4D, Figure 5D, and Supplemental Table S3). The

jmd.amjpathol.org - The Journal of Molecular Diagnostics

1240

of � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 11: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

½F7�½F7�

print&web4C=FPO

Figure 6 Shewhart chart showing MammaPrint and BluePrint next-generation sequencing (NGS) indexes (y axis) of three control samples over time in 25measurements (control sample 1; pink), 17 measurements (control sample 2; black), and 14 measurements (control sample 3; blue) (x axis). MammaPrint (A),luminal BluePrint (B), HER2 BluePrint (C), and basal BluePrint (D) NGS indexes. Each dot represents a single breast cancer sample for which total RNAunderwent NGS laboratory processing and analysis.

MammaPrint and BluePrint Using RNA-Seq

12411242124312441245124612471248124912501251125212531254125512561257125812591260126112621263126412651266126712681269127012711272127312741275127612771278127912801281128212831284128512861287128812891290129112921293129412951296129712981299130013011302

1303130413051306130713081309131013111312131313141315131613171318131913201321132213231324132513261327132813291330133113321333133413351336133713381339134013411342134313441345134613471348134913501351135213531354135513561357135813591360136113621363

mean difference between the matched NGS indexes was1.5%, 0.7%, 1.8%, and 0.04% of the observed NGS dynamicrange for MammaPrint, BluePrint luminal type, BluePrintHER2 type, and BluePrint basal type, respectively(Supplemental Table S4). Concordance of categoricalMammaPrint low- and high-risk classification and BluePrintsubtype classification between the two isolation methods was100%.

Reproducibility of MammaPrint and BluePrint NGSResults between Different Laboratories

The main objective of developing a MammaPrint/BluePrintNGS test is the implementation of a decentralized diagnosticsetting. Therefore, reproducibility was assessed betweendifferent laboratories by processing the same set of 15 FFPERNA samples in three independent laboratories: Agendiacentral laboratory (site 1) and two decentralized academiclaboratories (sites 2 and 3) (seeMaterials and Methods). The

The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.6.0 DTD � JMDI802_proo

three sites used aMiSeq instrumentwith the same setting.Onesample analyzed at site 2 did not pass the QC assessment and,therefore, 14 samples were used for the analysis.

For both decentralized sites, high correlation was observedbetween their MammaPrint and BluePrint NGS indexes andthe Agendia MammaPrint and BluePrint NGS indexes (site 1versus site 2: MammaPrint Pearson r Z 0.998, luminal-typePearson r Z 0.995, HER2-type Pearson r Z 0.990, basal-type Pearson r Z 0.994; Agendia site 1 versus site 3: Mam-maPrint Pearson rZ 0.997, luminal-type Pearson rZ 0.996,HER2-type Pearson rZ 0.986, basal-typePearson rZ 0.991),as confirmed by the Passing-Bablok regression analysis(Figure 7 and Supplemental Table S3). For both MammaPrint(Figure 7A) and BluePrint (Figure 7, B and C), the scatterplotand the Bland-Altman analysis showed no bias toward one ofthe sites (site 1 versus site 2, site 1 versus site 3), with a meandifference between the matched NGS indexes smaller than thecontrol technical variance (site 1 versus site 2: mean differencein NGS-microarray indexes for MammaPrint was 0.6%, for

11

1364

f � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 12: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

Mittempergher et al

13651366136713681369137013711372137313741375137613771378137913801381138213831384138513861387138813891390139113921393139413951396139713981399140014011402140314041405140614071408140914101411141214131414141514161417141814191420142114221423142414251426

1427142814291430143114321433143414351436

BluePrint luminal type was 0.8%, for HER2 type was 4.8%,and for basal type was 0.5% of the observed NGS dynamicrange; site 1 versus site 3: mean difference in NGS-microarrayindexes forMammaPrint was 2.3%, for BluePrint luminal typewas 0.7%, for HER2 type was 1.0%, and for basal type was1.1% of the observed NGS dynamic range) (SupplementalTable S4). Concordance of 100% for categorical MammaP-rint low- and high-risk classification and BluePrint subtypeclassification was observed between the three laboratories.

Figure 7 Scatterplot comparison analysis of MammaPrint and BluePrint (BP) nassessed at the Agendia (Amsterdam, the Netherlands) central laboratory and at decancer sample for which total RNA underwent NGS or microarray laboratory proceindexes assessed at Agendia (x axis) versus MammaPrint NGS indexes assessed atequivalence of BluePrint NGS indexes assessed at Agendia (x axis) versus Mamequivalence of BluePrint NGS indexes assessed at Agendia (x axis) versus Mamma

12FLA 5.6.0 DTD � JMDI802_pro

Clinical Performance Comparison

As described above, the NGS-based test proved to be tech-nically equivalent to the microarray diagnostic test, showinghigh correlation between the NGS- and the microarray-basedindexes. Nevertheless, highly correlated continuous valuesmay result in a lower concordance when these continuousvariables are presented as binary categories, as is the case forMammaPrint low- and high-risk classification groups. In

ext-generation sequencing (NGS) indexes (luminal, HER2, and basal types)centralized laboratories at sites 2 and 3. Each dot represents a single breastssing and analysis. A: Scatterplot showing equivalence of MammaPrint NGSsite 2 (y axis; left panel) and site 3 (right panel). B: Scatterplot showingmaPrint NGS indexes assessed at site 2 (y axis). C: Scatterplot showingPrint NGS indexes assessed at site 3 (y axis).

jmd.amjpathol.org - The Journal of Molecular Diagnostics

1437143814391440144114421443144414451446144714481449145014511452145314541455145614571458145914601461146214631464146514661467146814691470147114721473147414751476147714781479148014811482148314841485148614871488

of � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 13: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

Q16

½F8�½F8�

T1�T1�

print&

web4C=FPO

Figure 8 Kaplan-Meier curve of 316 microarRAy-prognoSTics-in-breast-cancER (RASTER) study patients for 5-year distant recurrence-free interval (DRFI)assessed using MammaPrint FF predicate microarray [US Food and Drug Administration 510 (K062694/K070675); A] and MammaPrint next-generationsequencing (NGS; B). Kaplan-Meier curves plotted for DRFI show comparable clinical performance of MammaPrint low-risk (gray lines) and high-risk(red lines) patients in matched formalin-fixed, paraffin-embedded microarray and NGS in a series of 316 early-stage breast cancer patients. P values ofsignificance are assessed using log-rank test and showed on the graphs.

MammaPrint and BluePrint Using RNA-Seq

14891490149114921493149414951496149714981499150015011502150315041505150615071508150915101511151215131514151515161517151815191520152115221523152415251526152715281529153015311532153315341535153615371538153915401541154215431544154515461547154815491550

15511552155315541555155615571558155915601561156215631564156515661567156815691570157115721573157415751576157715781579158015811582158315841585158615871588158915901591159215931594159515961597159815991600

particular, samples with a result close to a classificationthreshold with a small difference between two measurementsmay still result in a switch in binary outcome.39 Therefore, theclinical performance of the NGS-based test was assessed in adata set with known clinical outcome. A large series of breasttumors were used from the microarRAy-prognoSTics-in-breast-cancER (RASTER) study, for which 5-year outcomedata are available and were used previously to comparedifferent versions of the MammaPrint microarray diagnostictest (ie, MammaPrint FF versus FFPE).5,39 All patients wereaged 18 to 61 years and had a histologically confirmedinvasive adenocarcinoma of the breast (cT1-3N0M0). For341 samples of the original series of 427, total RNA wasavailable. Because of the age of the samples (on average, 12years after surgery), 316were successfully processed with theNGS test (Supplemental Table S2). Clinicopathologic char-acteristics of the 316 patients analyzedwithNGS and groupedby MammaPrint risk outcome are reported in SupplementalTable S5. The 316 patient subset and the original 427 pa-tient set show similar clinicopathologic characteristics, asconfirmed by the P value after testing for difference (Pearsonc2 statistic for categorical variables andU-test for continuousvariables, P > 0.05) (Supplemental Table S6). Survivalanalysis of the 316 patient samples for 5-year distantrecurrence-free interval was performed for MammaPrint FFmicroarray [predicate device, FDA 510(K062694/

Table 1 Survival Percentages for 5-Year DRFI in 316 Patients with Ma

Risk level

MammaPrint microarray

5-year DRFI survival 95% CI

Low 96.5 93.4e99High 90.6 85.9e95

Survival percentages, at 5 years after surgery, for DRFI for MammaPrint fresh-frSurvival percentages and their 95% CI are given for the low- and high-risk groupDRFI, distant recurrence-free interval; NGS, next-generation sequencing.

The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.6.0 DTD � JMDI802_proo

K070675)]5 as well as for MammaPrint NGS. Figure 8 showsthat the Kaplan-Meier curves are similar between the micro-array and NGS results and that MammaPrint NGS high- andlow-risk groups significantly differ in 5-year survival, with alog-rank P value smaller than the one observed for Mam-maPrint microarray (NGS log-rank P Z 0.002 versusmicroarray log-rank P Z 0.038). Moreover, 5-year survivalpercentages of distant recurrence-free interval for both thelow- and high-risk patient groups were comparable betweenmicroarray and NGS, as reported in Table 1 ½½.

Concordance of categorical MammaPrint low- andhigh-risk classification was 94.0% with an NPA of96.3% (95% CI, 92.2%e98.3%) and a PPA of 91.4%(95% CI, 85.9%e94.9%). In summary, we show hereinthat diagnostic testing results using targeted NGS andmicroarray are comparable and lead to similar clinicalperformance of both tests in a cohort of patients withknown outcome.

Discussion

Gene expressionebased diagnostic tools are increasinglyused to assess risk of recurrence in breast cancer.6,9,40,41 Thetwo most commonly used tests, MammaPrint6,8,9 andOncotype DX,42,43 are currently only available to breast

tched Microarray and NGS Results

MammaPrint NGS

5-year DRFI survival 95% CI

.6 97.0 94.1e99.9

.3 89.3 84.2e94.4

ozen predicate microarray (US Food and Drug Administration 510k) results.Q25s, separately.

13

160116021603160416051606160716081609161016111612

f � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 14: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

17

Mittempergher et al

16131614161516161617161816191620162116221623162416251626162716281629163016311632163316341635163616371638163916401641164216431644164516461647164816491650165116521653165416551656165716581659166016611662166316641665166616671668166916701671167216731674

1675167616771678167916801681168216831684168516861687168816891690169116921693169416951696169716981699170017011702170317041705170617071708170917101711171217131714171517161717171817191720172117221723172417251726172717281729173017311732173317341735

cancer patients through central laboratory services. Thislimits the availability of these tests, as some countries havelegal restrictions on the export of human tissue. To over-come this barrier to clinical use, an RNA-seq kit version ofMammaPrint was developed and its reproducibility, stabil-ity, and clinical validity were tested.

In recent years, RNA NGS technology evolved rapidly,and it is increasingly chosen by genomics laboratoriesbecause of its low background, high sensitivity, and repro-ducibility.44 Efforts toward the establishment of the bestpractices for RNA NGS as a clinical test (reproducibility,accuracy, and precision) are ongoing.2,45,46

In light of the potential of the RNA NGS technology forclinical applications, this study was initiated with theobjective of translating the current MammaPrint and Blue-print FFPE microarray test to an RNA NGSebased test. Atargeted RNA-seq approach was chosen because of thelimited size of the gene target panel (approximately 200genes, which cover 0.9 megabases) and because a targetsequencing approach delivers high sequencing depth andhigh coverage on target. On average, a percentage of readson target >79% was observed, which indicates a high-capture performance, especially for targeted sequencing ofFFPE-derived RNA.13,47,48

Overall, a high concordance was observed betweenresults obtained using NGS and microarray technologies(all >97%). When NGS and microarray results generatedfrom the same FFPE RNA were compared, a discordancerate of 2.4% was observed, which can only be ascribed tothe assay-related variation. After combining assay-relatedvariation and intratumor heterogeneity from our study(2.3%), a discordance rate of approximately 5% wasobtained, which matches the discordance rate observedfor the MammaPrint-fresh microarray test between twobiopsy isolations of the same tumor.30 Interestingly, if itis considered that the concordance between the Mam-maPrint FFPE and FF test is 91.5% (95% CI, 87%e95%),8 it can be concluded indirectly that assay-relatedvariation (NGS versus microarray techniques) is lowerthan the variation observed between the MammaPrint FFand MammaPrint FFPE microarray tests, when excludingthe contribution of intratumor heterogeneity (2.4% FFPENGS versus FFPE microarray; 3.5% FFPE microarrayversus FF predicate microarray). This study suggests thatsample preservation method (FF or FFPE) is a biggercontributor to variation than the technological platform(microarray or NGS). This result is in line with the studyof Zhang et al,49 in which they showed that RNA-seq andmicroarray-based models perform similarly in clinical endpoint prediction for neuroblastoma patients. In theirstudy, prediction accuracies were most strongly influ-enced by the nature of the clinical end point, whereas thetechnological platforms (RNA-seq versus microarray) didnot significantly affect performances of the model.49

Next, the reproducibility of the NGS test was assessedbetween different central laboratories using different MiSeq

14FLA 5.6.0 DTD � JMDI802_pro

instruments. The objective was to measure the reproduc-ibility of the NGS test, excluding the RNA isolation pro-cedure; and therefore, for this experiment, the same totalRNA was processed at the different Agendia sites. Again,high correlation was observed in a set of 48 matched RNAsamples, between the NGS and microarray MammaPrintand BluePrint indexes (Pearson r coefficients all >0.99)generated at the two central laboratories, as well as highconcordance between the categorical MammaPrint andBluePrint classifications (>95%).In the technical validation studies described so far, a few

samples that had a discordant MammaPrint and/or BluePrintNGS classification with respect to the microarray classifi-cation were observed. However, all these discordant casesshowed an absolute index difference smaller than themicroarray technical variance and, therefore, these discor-dant events could be accepted as part of the expectedtechnical variation of the NGS test.To assess more extensively the precision of the NGS test,

three patient control samples (two low-risk and one high-riskMammaPrint) were repeatedly measured over a 6-monthperiod. Reproducibility of the MammaPrint and BluePrintNGS indexes was >98% (minimum Z 97.0%,maximumZ 99.1%). The data showed that the MammaPrintand BluePrint NGS test is stable over time and that the sameresult is generated regardless of the operator, reagent lot, day,and other factors switching in the laboratory process.To assess the reproducibility of the NGS test between

decentralized laboratories (ie, not central laboratories), twoindependent academic European cancer centers (CurieInstitute and University Hospital Leuven Q) processed thesame set of RNA samples (n Z 15) using the MammaPrintand BluePrint NGS test. Considering that the main reason ofdeveloping the MammaPrint/BluePrint test is to enable adecentralized use of diagnostic testing, assessing thereproducibility of the NGS test between independent labo-ratories is important to assess the feasibility of such adecentralized setting in an academic setting. A 100%concordance was observed between the MammaPrint andBluePrint outcome results obtained at the three sites(Agendia, University Hospital Leuven, Curie Institute), withhigh correlation (Pearson r > 0.98) between the Mam-maPrint and BluePrint NGS indexes. This result highlightsonce again the robustness of the NGS test, and it paves theway for a larger validation study in which samples areprocessed locally with the NGS kit.In this study, its clinical performance was also assessed

using a set of 316 breast cancer samples with knownclinical outcome. Samples that switched between thedifferent platforms may affect the clinical performance ofthe test. Therefore, 316 samples with known clinicaloutcome were analyzed and compared with MammaPrintresults generated using the microarray platform [FF pred-icate device, FDA 510(K062694/K070675), same as usedfor the MINDACT series].9 The survival curves of Mam-maPrint high- and low-risk patients, based on the NGS

jmd.amjpathol.org - The Journal of Molecular Diagnostics

1736

of � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 15: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

MammaPrint and BluePrint Using RNA-Seq

17371738173917401741174217431744174517461747174817491750175117521753175417551756175717581759176017611762176317641765176617671768176917701771177217731774177517761777177817791780178117821783178417851786178717881789179017911792179317941795179617971798

1799180018011802180318041805180618071808180918101811181218131814181518161817181818191820182118221823182418251826182718281829183018311832183318341835183618371838183918401841184218431844184518461847184818491850185118521853185418551856185718581859

result, were equivalent to those based on the microarrayresult. Both NGS and microarray have a significant dif-ference in 5-year distant recurrence-free interval (log-rankP Z 0.002). The NGS kit showed a survival of 97% in theMammaPrint low-risk group in mostly untreated patients.When the MammaPrint risk prediction results obtainedwith the NGS test were compared with the one based onthe reference microarray test, overall high concordancewas obtained, with PPA and NPA values both >90%. Thisresult indicates that in nearly all of the cases, patienttreatment recommendations based on the NGS result arethe same as those based on the microarray result; and thesame patient group will be identified for safely forgoingchemotherapy (low-risk patients) or for undergoingnecessary chemotherapy treatment (high-risk patients).Discordant results were also observed, as could be ex-pected from the technical equivalence study; however, thisis part of the expected technical variation of the NGS testcompared with the microarray test.

In summary, results presented herein show that Mam-maPrint and BluePrint indexes generated with the targetedRNA NGS test are not only technically equivalent to thosegeneratedwith themicroarray FFPEpredicate device, but alsohighly reproducible between centralized laboratories, stableover time, and clinically meaningful. A larger study isrequired to confirm reproducibility between decentralizedlaboratories, but preliminary data generated in collaborationwith two European academic cancer centers confirm highreproducibility between results from Agendia and externallaboratories. The successful translation of the MammaPrintmicroarray test to anNGS-based test highlights the robustnessof the biology behind theMammaPrint test, which underwentseveral translations over time6,8,30,39 and still proves to be areliable tool for personalized medical care.

Acknowledgments

We thank Petra Sintubin, Liesbet Vliegen, and LaurenceSlembrouck (University Hospital Leuven) and VirginieRaynal, Bohec Mylene, Cécile Reyes, and Audrey Rapinat(Curie Institute) for performing the next-generationsequencing (NGS) laboratory experiments for the “Repro-ducibility of MammaPrint and BluePrint NGS Results be-tween Different Laboratories” study.

Supplemental Data

Supplemental material for this article can be found athttp://doi.org/10.1016/j.jmoldx.2019.04.007.

References

1. Han Y, Gao S, Muegge K, Zhang W, Zhou B: Advanced applicationsof RNA sequencing and challenges. Bioinform Biol Insights 2015, 9:29e46

The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.6.0 DTD � JMDI802_proo

2. Byron SA, Van Keuren-Jensen KR, Engelthaler DM, Carpten JD,Craig DW: Translating RNA sequencing into clinical diagnostics:opportunities and challenges. Nat Rev Genet 2016, 17:257e271

3. van’t Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AAM, Mao M,Peterse HL, van der Kooy K, Marton MJ, Witteveen AT,Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R,Friend SH: Gene expression profiling predicts clinical outcome ofbreast cancer. Nature 2002, 415:530e536

4. van de Vijver MJ, He YD, van’t Veer LJ, Dai H, Hart AA,Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ,Parrish M, Atsma D, Witteveen A, Glas A, Delahaye L, van derVelde T, Bartelink H, Rodenhuis S, Rutgers ET, Friend SH,Bernards R: A gene-expression signature as a predictor of survival inbreast cancer. N Engl J Med 2002, 347:1999e2009

5. Drukker CA, Bueno-De-Mesquita JM, Retèl VP, Van Harten WH,Van Tinteren H, Wesseling J, Roumen RMH, Knauer M, Van ’TVeer LJ, Sonke GS, Rutgers EJT, Van De Vijver MJ, Linn SC: Aprospective evaluation of a breast cancer prognosis signature in theobservational RASTER study. Int J Cancer 2013, 133:929e936

6. Glas AM, Floore A, Delahaye LJMJ, Witteveen AT, Pover RCF,Bakx N, Lahti-Domenici JST, Bruinsma TJ, Warmoes MO,Bernards R, Wessels LFA, Van’t Veer LJ: Converting a breast cancermicroarray signature into a high-throughput diagnostic test. BMCGenomics 2006, 7:278

7. Buyse M, Loi S, van’t Veer L, Viale G, Delorenzi M, Glas AM,d’Assignies MS, Bergh J, Lidereau R, Ellis P, Harris A, Bogaerts J,Therasse P, Floore A, Amakrane M, Piette F, Rutgers E, Sotiriou C,Cardoso F, Piccart MJ, Decker N, Straehle C: Validation and clinicalutility of a 70-gene prognostic signature for women with node-negative breast cancer. J Natl Cancer Inst 2006, 98:1183e1192

8. Sapino A, Roepman P, Linn SC, Snel MHJ, Delahaye LJMJ, van denAkker J, Glas AM, Simon IM, Barth N, de Snoo FA, van ’t Veer LJ,Molinaro L, Berns EMJJ, Wesseling J, Riley LB, Anderson D,Nguyen B, Cox CE: MammaPrint molecular diagnostics on formalin-fixed, paraffin-embedded tissue. J Mol Diagn 2014, 16:190e197

9. Cardoso F, van’t Veer LJ, Bogaerts J, Slaets L, Viale G, Delaloge S,Pierga J-Y, Brain E, Causeret S, DeLorenzi M, Glas AM,Golfinopoulos V, Goulioti T, Knox S, Matos E, Meulemans B,Neijenhuis PA, Nitz U, Passalacqua R, Ravdin P, Rubio IT,Saghatchian M, Smilde TJ, Sotiriou C, Stork L, Straehle C,Thomas G, Thompson AM, van der Hoeven JM, Vuylsteke P,Bernards R, Tryfonidis K, Rutgers E, Piccart M: 70-Gene signature asan aid to treatment decisions in early-stage breast cancer. N Engl JMed 2016, 375:717e729

10. Krijgsman O, Roepman P, Zwart W, Carroll JS, Tian S, De Snoo FA,Bender RA, Bernards R, Glas AM: A diagnostic gene profile formolecular subtyping of breast cancer associated with treatmentresponse. Breast Cancer Res Treat 2012, 133:37e47

11. Beitsch P, Whitworth P, Baron P, Pellicane J, Treece T, Yoder E,Gittleman M; NBRST Investigators Group: Genomic impact ofneoadjuvant therapy on breast cancer: incomplete response is asso-ciated with altered diagnostic gene signatures. Ann Surg Oncol 2016,23:3317e3323

12. Ozsolak F, Milos PM: RNA sequencing: advances, challenges andopportunities. Nat Rev Genet 2011, 12:87e98

13. Cieslik M, Chugh R, Wu YM, Wu M, Brennan C, Lonigro R, Su F,Wang R, Siddiqui J, Mehra R, Cao X, Lucas D, Chinnaiyan AM,Robinson D: The use of exome capture RNA-seq for highly degradedRNA with application to clinical cancer sequencing. Genome Res2015, 25:1372e1381

14. Rothberg JM, Hinz W, Rearick TM, Schultz J, Mileski W, Davey M,et al: An integrated semiconductor device enabling non-opticalgenome sequencing. Nature 2011, 475:348e352

15. Ardui S, Ameur A, Vermeesch JR, Hestand MS: Single moleculereal-time (SMRT) sequencing comes of age: applications andutilities for medical diagnostics. Nucleic Acids Res 2018, 46:2159e2168

15

1860

f � 22 July 2019 � 8:07 pm � EO: 2018_374

Page 16: MammaPrint and BluePrint Molecular Diagnostics Using ......ion semiconductor sequencing,14 single-molecule real-time sequencing,15 nanopore sequencing,16 and sequencing by synthesis.17

Q18

Q19

Q20

21

22

Mittempergher et al

18611862186318641865186618671868186918701871187218731874187518761877187818791880188118821883188418851886188718881889189018911892189318941895189618971898189919001901190219031904190519061907190819091910191119121913191419151916191719181919192019211922

1923192419251926192719281929193019311932193319341935193619371938193919401941194219431944194519461947194819491950195119521953195419551956195719581959196019611962196319641965196619671968196919701971197219731974197519761977197819791980198119821983

16. Branton D, Deamer DW, Marziali A, Bayley H, Benner SA, Butler T,Di Ventra M, Garaj S, Hibbs A, Huang X, Jovanovich SB, Krstic PS,Lindsay S, Ling XS, Mastrangelo CH, Meller A, Oliver JS,Pershin YV, Ramsey JM, Riehn R, Soni GV, Tabard-Cossa V,Wanunu M, Wiggin M, Schloss JA: The potential and challenges ofnanopore sequencing. Nat Biotechnol 2008, 26:1146e1153

17. Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J,Brown CG, et al: Accurate whole human genome sequencing usingreversible terminator chemistry. Nature 2008, 456:53e59

18. Tian S, Roepman P, van’t Veer LJ, Bernards R, de Snoo F, Glas AM:Biological functions of the genes in the Mammaprint breast cancerprofile reflect the hallmarks of cancer. Biomark Insights 2010, 5:129e138

19. Kent WJ: BLAT: the BLAST-like alignment tool. Genome Res 2002,12:656e664

20. Pruitt K, Brown G, Tatusova T, Maglott D: Chapter 18: the ReferenceSequence (RefSeq) Database Database. Edited by The NCBI Hand-book; 2002

21. Pruitt KD, Harrow J, Harte RA, Wallin C, Diekhans M, Maglott DR,et al: The consensus coding sequence (CCDS) project: identifying acommon protein-coding gene set for the human and mouse genomes.Genome Res 2009, 19:1316e1323

22. Yates A, Akanni W, Amode MR, Barrell D, Billis K, Carvalho-Silva D, et al: Ensembl 2016. Nucleic Acids Res 2016, 44:D710eD716

23. Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M,Kokocinski F, et al: GENCODE: the reference human genomeannotation for the ENCODE project. Genome Res 2012, 22:1760e1774

24. Harrow JL, Steward CA, Frankish A, Gilbert JG, Gonzalez JM,Loveland JE, Mudge J, Sheppard D, Thomas M, Trevanion S,Wilming LG: The Vertebrate Genome Annotation browser 10 yearson. Nucleic Acids Res 2014, 42:D771eD779

25. Bolger AM, Lohse M, Usadel B: Trimmomatic: a flexible trimmer forIllumina sequence data. Bioinformatics 2014, 30:2114e2120

26. Sirén J, Välimäki N, Mäkinen V: HISAT2: fast and sensitive align-ment against general human population. IEEE/ACM Trans ComputBiol Bioinform 2014, 11:375e388

27. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N,Marth G, Abecasis G, Durbin R: The Sequence Alignment/Mapformat and SAMtools. Bioinformatics 2009, 25:2078e2079

28. Liao Y, Smyth GK, Shi W: FeatureCounts: an efficient general pur-pose program for assigning sequence reads to genomic features.Bioinformatics 2014, 30:923e930

29. R Core Team: RA Language and Environment for StatisticalComputing, 1. R Foundation for Statistical Computing, 2016. pp. 409

30. Delahaye LJ, Wehkamp D, Floore AN, Bernards R, van ’t Veer LJ,Glas AM: Performance characteristics of the MammaPrint breastcancer diagnostic gene signature. Per Med 2013, 10:801e811

31. Food andDrugAdministration: Statistical guidance on reporting resultsfrom studies evaluating diagnostic tests Available at: https://www.fda.gov/RegulatoryInformation/Guidances/ucm071148.htm; 2007

32. Mittempergher L, de Ronde JJ, Nieuwland M, Kerkhoven RM,Simon I, Rutgers EJT, Wessels LFA, Van’t Veer LJ: Gene expressionprofiles from formalin fixed paraffin embedded breast cancer tissueare largely comparable to fresh frozen matched tissue. PLoS One2011, 6:e17163

33. Kok M, Linn SC, Van Laar RK, Jansen MPHM, Van Den Berg TM,Delahaye LJMJ, Glas AM, Peterse JL, Hauptmann M, Foekens JA,Klijn JGM, Wessels LFA, Van’t Veer LJ, Berns EMJJ: Comparisonof gene expression profiles predicting progression in breast cancerpatients treated with tamoxifen. Breast Cancer Res Treat 2009, 113:275e283

34. Reyal F, van Vliet MH, Armstrong NJ, Horlings HM, de Visser KE,Kok M, Teschendorff AE, Mook S, van ’t Veer L, Caldas C,Salmon RJ, Vijver MJVD, Wessels LFA: A comprehensive analysisof prognostic signatures reveals the high predictive capacity of the

16FLA 5.6.0 DTD � JMDI802_pro

proliferation, immune response and RNA splicing modules in breastcancer. Breast Cancer Res 2008, 10:R93

35. Atiqi R, van Iersel C, Cleophas TJ: Accuracy assessments of quan-titative diagnostic tests for clinical research. Int J Clin PharmacolTher 2009, 47:153e158

36. Martelotto LG, Ng CK, Piscuoglio S, Weigelt B, Reis-Filho JS: Breastcancer intra-tumor heterogeneity. Breast Cancer Res 2014, 16:210

37. Tabchy A, Ma CX, Bose R, Ellis MJ: Incorporating genomics intobreast cancer clinical trials and care. Clin Cancer Res 2013, 19:6371e6379

38. Asioli S, Maletta F, Verdun Di Cantogno L, Satolli MA,Schena M, Pecchioni C, Botta C, Chiusa L, Molinaro L, Conti L,Viale G, Ingravallo G, Maiorano E, Sapino A: Approachingheterogeneity of human epidermal growth factor receptor 2 insurgical specimens of gastric cancer. Hum Pathol 2012, 43:2070e2079

39. Beumer I, Witteveen A, Delahaye L,Wehkamp D, Snel M, Dreezen C,Zheng J, FlooreA,BrinkG, ChanB, Linn S, BernardsR, van ’TVeer L,Glas A: Equivalence of MammaPrint array types in clinical trials anddiagnostics. Breast Cancer Res Treat 2016, 156:279e287

40. Sparano JA, Gray RJ, Makower DF, Pritchard KI, Albain KS,Hayes DF, Geyer CE Jr, Dees EC, Goetz MP, Olson JA Jr, Lively T,Badve SS, Saphner TJ, Wagner LI, Whelan TJ, Ellis MJ, Paik S,Wood WC, Ravdin PM, Keane MM, Gomez Moreno HL, Reddy PS,Goggins TF, Mayer IA, Brufsky AM, Toppmeyer DL,Kaklamani VG, Berenberg JL, Abrams J, Sledge GW Jr: Adjuvantchemotherapy guided by a 21-gene expression assay in breast cancer.N Engl J Med 2018, 379:111e121 Q

41. Filipits M, Rudas M, Jakesz R, Dubsky P, Fitzal F, Singer CF,Dietze O, Greil R, Jelen A, Sevelda P, Freibauer C, Müller V,Jänicke F, Schmidt M, Kölbl H, Rody A, Kaufmann M, Schroth W,Brauch H, Schwab M, Fritz P, Weber KE, Feder IS, Hennig G,Kronenwett R, Gehrmann M, Gnant M: A new molecular predictor ofdistant recurrence in ER-positive, HER2-negative breast cancer addsindependent information to conventional clinical risk factors. ClinCancer Res 2011, 17:6012e6020

42. Sparano JA, Paik S: Development of the 21-gene assay and itsapplication in clinical practice and clinical trials. J Clin Oncol 2008,26:721e728

43. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, Baehner FL,Walker MG, Watson D, Park T, Hiller W, Fisher ER, Wickerham DL,Bryant J, Wolmark N: A multigene assay to predict recurrence oftamoxifen-treated, node-negative breast cancer. N Engl J Med 2004,351:2817e2826 Q

44. Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y: RNA-seq:an assessment of technical reproducibility and comparison with geneexpression arrays. Genome Res 2008, 18:1509e1517

45. Matthijs G, Souche E, Alders M, Corveleyn A, Eck S, Feenstra I,Race V, Sistermans E, Sturm M, Weiss M, Yntema H, Bakker E,Scheffer H, Bauer P: Guidelines for diagnostic next-generationsequencing. Eur J Hum Genet 2016, 24:2e5

46. Kamps R, Brandão R, Bosch B, Paulussen A, Xanthoulea S, Blok M,Romano A: Next-generation sequencing in oncology: genetic diag-nosis, risk prediction and cancer classification. Int J Mol Sci 2017, 18:308

47. Jovanovi�c B, Sheng Q, Seitz RS, Lawrence KD, Morris SW,Thomas LR, Hout DR, Schweitzer BL, Guo Y, Pietenpol JA,Lehmann BD: Comparison of triple-negative breast cancer molecularsubtyping using RNA from matched fresh-frozen versus formalin-fixed paraffin-embedded tissue. BMC Cancer 2017, 17:241

48. García-García G, Baux D, Faugère V, Moclyn M, Koenig M,Claustres M, Roux AF: Assessment of the latest NGS enrich-ment capture methods in clinical context. Sci Rep 2016, 6:20948

49. Zhang W, Yu Y, Hertwig F, Thierry-Mieg J, Zhang W, Thierry-Mieg D, et al: Comparison of RNA-seq and microarray-based modelsfor clinical endpoint prediction. Genome Biol 2015, 16:133

jmd.amjpathol.org - The Journal of Molecular Diagnostics

1984

of � 22 July 2019 � 8:08 pm � EO: 2018_374