microRNA Profiling: Platform Comparison
ABRF Microarray Research Group
Don Baldwin – Penn Microarray Facility, University of Pennsylvania
Project Goals
• Examine multiple microarray and next-gen sequencer platforms for performance in miRNA profiling
• Provide information on sensitivity, reproducibility, and concordance among platforms
• Make data available for reference in selecting & running miRNA profiling assays
Project History
• Study begun as 2009 MARG project in collaboration with DSRG
• Preliminary report on microarray component of study and single miRNA seq result presented at ABRF 2009
• Over the last year, additional sequencing data have been generated on Illumina and ABI SOLiD platforms in MARG member labs
Study Design & Implementation
Design
• Use commercial total RNA from 2 different human tissues
• Perform miRNA profiling on 4 microarray platforms (Agilent, Affymetrix, Exiqon, and Illumina) and Taqman (ABI) low density arrays in triplicate
• Perform sequencing on 2 deep-sequencing platforms (ABI SOLiD and Illumina GA)
• Analyze data from each platform for detection sensitivity, data reproducibility and profile concordance across platforms
Disclaimer: mention of products or trade names does not constitute an endorsement
Implementation
• Commercial RNA purchased from Ambion: First Choice human total RNA – liver and brain.
• Each platform run in laboratory of different MARG or DSRG member.
• Assays were run according to manufacturer recommendations as implemented in member lab.
Platform      Array
ABI TLDA      Taqman human miRNA A&B v2.0
Affymetrix    GeneChip miRNA
Agilent       Human miRNA
Exiqon        miRCURY LNA miRNA
Illumina      Human miRNA v2 Panel

HT Seq Platform
Illumina GA II
ABI SOLiD
General methods
• RNA aliquots distributed from central site to all member labs performing microarray or sequencing assay.
• No LMW RNA enrichment for microarray platforms; small RNA enrichment for sequencing platforms.
• Triplicate assays were performed simultaneously for each microarray assay method.
• No replicates for sequencing (because of cost), except that Illumina sequencing was performed twice in separate labs.
Data analysis
• Primary data collected and analyzed centrally
• Microarray and Taqman data analyzed with Partek Genomics Suite software at University of Pennsylvania
• Sequencing data aligned and analyzed with GeneSifter (GeoSpiza) software at Oregon Health & Science University
Microarray Results
Partek Genomics Suite data analyses
Platform   Raw data                 Reference                Normalization            Filter
TaqMan     avg delta Ct             4-6 endogenous targets   median shift
GeneChip   .CEL                     all, one channel         RMA                      "hsa-"
DASL       AVG Signal               all, one channel         log quantile             "hsa-"
Agilent    TotalGeneSignal          all, one channel         floor, log quantile
miRCURY    .GPR bkgd corr median    all, dye swap pairs      floor, loess log ratios  "hsa-", flagged

rank order: top 400 expression levels
common targets: miRBase accession number
SAM: Significance Analysis of Microarrays, FDR 5%
Mean and Standard Deviation of three technical replicates
[Figure: five panels, one per platform, each showing Brain and Liver; axes: SD and expression level (log2). Panels: Applied Biosystems TaqMan Low Density Arrays; Affymetrix microRNA GeneChip; Illumina microRNA DASL assay; Agilent SurePrint glass slides; Exiqon miRCURY LNA glass slides]
Log2 dynamic range
ABI         20
Affymetrix  15
Illumina    10
Agilent     15
Exiqon      13
Table 1. Brain technical replicate correlation (Average Pearson Correlation Coefficients, R)
Platform                Total number of   Normalization   All probes   Top 400 probes   Common human
                        human probes      method                       (Brain)          probes (639)
ABI                     664               Median shift    0.9449       0.9250           0.9449
Affymetrix              847               RMA             0.9883       0.9877           0.9899
Illumina                858               Quantile        0.9855       0.9864           0.9870
Agilent                 723               Quantile        0.9968       0.9990           0.9967
Exiqon (BG subtracted)  739               Loess           0.9924       0.9922           0.9926
Table 2. Liver technical replicate correlation (Average Pearson Correlation Coefficients, R)
Platform                Total number of   Normalization   All probes   Top 400 probes   Common human
                        human probes      method                       (Liver)          probes (639)
ABI                     664               Median shift    0.9384       0.9277           0.9376
Affymetrix              847               RMA             0.9771       0.9781           0.9785
Illumina                858               Quantile        0.9527       0.9373           0.9562
Agilent                 723               Quantile        0.9952       0.9955           0.9952
Exiqon (BG subtracted)  739               Loess           0.9713       0.9697           0.9701
Correlation coefficients: technical replicates
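The replicate correlations in Tables 1 and 2 are reported as average Pearson coefficients over three technical replicates. A minimal sketch of one way such an average could be computed is shown here, assuming the value is the mean of the three pairwise R values; the slides do not state the exact procedure, and the function names are illustrative only.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_replicate_correlation(replicates):
    """Mean of the pairwise Pearson R values across replicates.

    Assumption: each replicate is a list of log2 expression values,
    one per probe, in the same probe order.
    """
    pairs = list(combinations(replicates, 2))
    return sum(pearson_r(a, b) for a, b in pairs) / len(pairs)


# Made-up log2 expression values for three technical replicates.
rep1 = [5.1, 7.9, 10.2, 12.8, 6.4]
rep2 = [5.3, 8.1, 10.0, 12.5, 6.6]
rep3 = [4.9, 7.7, 10.4, 13.0, 6.1]
print(average_replicate_correlation([rep1, rep2, rep3]))
```

Restricting the input lists to the top 400 probes or to the 639 common probes before calling the function would give the corresponding table columns under the same assumption.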
Table 3. Differentially expressed miRNAs detected from common set of 639 human probes
Platform                Significantly      miRNAs with two-fold   miRNAs with          Maximum negative   Maximum positive
                        different miRNAs   or greater             significant          fold-change        fold-change
                        (FDR <= 5%)        difference             difference of >2x    detected           detected
ABI                     206                338                    206                  -57586             2607
Affymetrix              305                243                    238                  -620               5153
Illumina                347                248                    247                  -109               339
Agilent                 347                357                    340                  -11345             3955
Exiqon (BG subtracted)  362                184                    184                  -2929              734
Detection of differential expression: Brain vs. Liver
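The columns of Table 3 combine a significance cutoff (FDR 5%) with a two-fold change cutoff and report signed fold changes. The sketch below illustrates how such a summary could be derived from per-miRNA log2 ratios and FDR values. The sign convention (negative values mean a decrease, with no values between -1 and 1), the input layout, and the function names are assumptions, not details confirmed by the slides.

```python
def signed_fold_change(log2_ratio):
    """Convert a log2 ratio to a signed fold change.

    Assumed convention: +8 means 8x higher, -8 means 8x lower.
    """
    if log2_ratio >= 0:
        return 2.0 ** log2_ratio
    return -(2.0 ** -log2_ratio)


def summarize_differential(results, fdr_cutoff=0.05, fold_cutoff=2.0):
    """Count significant and two-fold miRNAs and find the fold-change extremes.

    `results` maps a miRNA name to (log2_ratio, fdr); this layout is hypothetical.
    """
    significant = {name for name, (_, fdr) in results.items() if fdr <= fdr_cutoff}
    two_fold = {
        name
        for name, (ratio, _) in results.items()
        if abs(signed_fold_change(ratio)) >= fold_cutoff
    }
    folds = [signed_fold_change(ratio) for ratio, _ in results.values()]
    return {
        "significant": len(significant),
        "two_fold": len(two_fold),
        "both": len(significant & two_fold),
        "max_negative": min(folds),
        "max_positive": max(folds),
    }


# Made-up values for three miRNAs: (log2 ratio, FDR).
example = {"miR-a": (3.0, 0.01), "miR-b": (-2.0, 0.20), "miR-c": (0.5, 0.03)}
print(summarize_differential(example))
```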
Table 4. Pearson correlation coefficients for differentially expressed miRNAs
Platform     ABI      Affymetrix   Illumina   Agilent   Exiqon   Mean          Median
                                                                 concordance   concordance
ABI          1        0.7570       0.7426     0.6999    0.6699   0.7739        0.7426
Affymetrix   0.7570   1            0.7411     0.7580    0.7163   0.7945        0.7570
Illumina     0.7426   0.7411       1          0.7381    0.6701   0.7784        0.7411
Agilent      0.6999   0.7580       0.7381     1         0.8304   0.8053        0.7580
Exiqon       0.6699   0.7163       0.6701     0.8304    1        0.7773        0.7163
Concordance for detection of differential expression: common target set
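As a check on how the mean and median concordance columns of Table 4 relate to the correlation matrix, the short sketch below recomputes them from the tabulated values. Working through the arithmetic suggests that each row's mean and median include the self-correlation of 1; that reading is an inference from the numbers, not something the slides state.

```python
from statistics import mean, median

platforms = ["ABI", "Affymetrix", "Illumina", "Agilent", "Exiqon"]

# Pairwise Pearson coefficients copied from Table 4.
matrix = [
    [1.0, 0.7570, 0.7426, 0.6999, 0.6699],
    [0.7570, 1.0, 0.7411, 0.7580, 0.7163],
    [0.7426, 0.7411, 1.0, 0.7381, 0.6701],
    [0.6999, 0.7580, 0.7381, 1.0, 0.8304],
    [0.6699, 0.7163, 0.6701, 0.8304, 1.0],
]


def row_summary(rows):
    """Mean and median of each full row, diagonal included, rounded to 4 places."""
    return [(round(mean(row), 4), round(median(row), 4)) for row in rows]


def off_diagonal_range(rows):
    """Smallest and largest correlation between two different platforms."""
    values = [rows[i][j] for i in range(len(rows)) for j in range(len(rows)) if i != j]
    return min(values), max(values)


for name, (avg, med) in zip(platforms, row_summary(matrix)):
    print(f"{name:<11} mean={avg:.4f} median={med:.4f}")
print(off_diagonal_range(matrix))
```

Excluding the diagonal would lower every mean, so the inclusion matters when comparing these figures with concordance values reported elsewhere.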
Hierarchical clustering: all human miRNAs log2 ratio
[Figure: clustered heat map with one column per platform (Illumina, Exiqon, Affymetrix, Agilent, ABI RT-PCR); legend: liver > brain, liver = brain, liver < brain]
Sequencing Results
                       Illumina   SOLiD
Input File:            FASTQ      csfasta
Alignment algorithm:   bowtie     mapreads

Alignment Flow:
1. Input file (raw reads) aligned to genome of reference species.
2. Reads mapping to rRNA, tRNA, mtRNA, etc. define “filtered” reads.
3. Reads mapping to miRBase genome coordinates define known miRNAs.
4. Reads mapping outside any of the defined genome references are intergenic.
5. Reads mapping to multiple genomic coordinates set aside as “non-uniquely mapped”.
6. Reads not mapping at all set aside as “Not mapped”.
Normalized counts: reads per million mapped miRNAs
GeneSifter (GeoSpiza) analysis pipeline
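The normalization named on this slide, reads per million mapped miRNAs, can be illustrated with a short sketch. It assumes the denominator is the total number of reads assigned to known miRNAs in the sample; the function name and the example counts are hypothetical, and this is not the GeneSifter implementation.

```python
def reads_per_million(mirna_counts):
    """Scale raw miRNA read counts to reads per million mapped miRNA reads.

    Assumption: `mirna_counts` holds only reads mapped to known miRNAs,
    so its sum is the normalization denominator.
    """
    total = sum(mirna_counts.values())
    if total == 0:
        raise ValueError("no miRNA-mapped reads to normalize against")
    return {name: count * 1_000_000 / total for name, count in mirna_counts.items()}


# Made-up raw counts for one sample.
raw = {"hsa-miR-122": 600_000, "hsa-miR-21": 150_000, "hsa-let-7a": 50_000}
for name, value in reads_per_million(raw).items():
    print(f"{name}: {value:.1f}")
```

Because each sample is divided by its own miRNA-mapped total, samples sequenced to different depths become comparable on the same scale.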
Distribution of Reads (Brain)
[Figure: % of reads by category. * Total # of reads. ** Adapter or not mapped.]
Distribution of Reads (Liver)
[Figure: % of reads by category. * Total # of reads. ** Adapter or not mapped.]
Expression range: miRNA mapped reads
microRNAs Detected*
[Figure: % of all miRNAs** detected, Brain and Liver]
* >5 reads after normalization
** All = 705 different miRNAs
Dynamic Range of miRNA Reads*
[Figure: orders of magnitude, Brain and Liver; miRNAs marked ** in the figure: let-7a-1, let-7a-2, mir-29c, mir-21, mir-122, mir-192]
* Magnitude of difference between highest and lowest expressed miRNA
** miRNA measured as highest expressed
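The footnotes on the sequencing slides define detection as more than 5 reads after normalization out of 705 different miRNAs, and dynamic range as the magnitude of difference between the highest and lowest expressed miRNA. A minimal sketch of both calculations follows. It assumes the dynamic range is taken over detected miRNAs only and expressed in log10 units; neither detail is confirmed by the slides, and the function names are illustrative.

```python
from math import log10


def percent_detected(normalized_counts, threshold=5.0, total_mirnas=705):
    """Percentage of the reference miRNA set with more than `threshold` normalized reads."""
    detected = sum(1 for value in normalized_counts.values() if value > threshold)
    return 100.0 * detected / total_mirnas


def dynamic_range_log10(normalized_counts, threshold=5.0):
    """Orders of magnitude between the highest and lowest detected miRNA.

    Assumption: miRNAs at or below the detection threshold are excluded.
    """
    detected = [value for value in normalized_counts.values() if value > threshold]
    return log10(max(detected) / min(detected))


# Made-up normalized counts (reads per million mapped miRNAs).
sample = {"miR-x": 10.0, "miR-y": 100_000.0, "miR-z": 2.0}
print(percent_detected(sample))
print(dynamic_range_log10(sample))
```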
Microarray & Next Gen Seq miRNA profiling summary
• All platforms tested are effective in detecting miRNA transcripts
• Intraplatform reproducibility generally high
• Differential expression detection among MA platforms was similar (r=0.65 to 0.84)
• Concordance between MA and next-gen seq (analysis in progress)
• Dynamic detection range much greater for next-gen sequencing (~2-3 log10 more than MA)
Summary: Assay requirements as implemented in MARG study
Platform            Major equipment                       Total RNA     Time*      Ease-of-use   Cost per sample**
ABI Taqman RT PCR   7900HT PCR system                     500 ng        6 hrs      xxx           $400
Affymetrix arrays   GeneChip scanner & fluidic stations   1 ug (a)      1.5 days   xx(x)         $250
Agilent arrays      glass slide scanner                   200 ng (a)    2 days     xx            $250
Exiqon LNA arrays   glass slide scanner                   200 (a)       2 days     xx            $340 / 2 channel slide
Illumina arrays     BeadArray station                     200 ng        2 days     xxx           $200
ABI SOLiD seq       Sequencer & sample prep instruments   500 ng        2 weeks    x             ~$1300
Illumina GA seq     Sequencer & cluster station           1 ug/5 ug     1 week     x             ~$1000

* from total RNA to primary data
** reagents & supplies (including array); labor not included
(a) current recommended minimum inputs ~ 100 ng
MARG project labs 2009 & 2010
• Don Baldwin – University of Pennsylvania
• Chris Harrington – Oregon Health & Science University
• Susan Hester – Environmental Protection Agency, NC
• Herbert Auer – Institute for Research in Biomedicine, Spain
• Wei Wang – Cornell University
• Nadereh Jafari – Northwestern University
• Steve Potter – Cincinnati Children’s Hospital
• Nalini Raghavachari – NHLBI Genomics Core Facility, NIH
• Natalia Reyero – Jackson State University
Other MARG members 2010
• Peter Schweitzer – Cornell University
DSRG project labs 2009
Acknowledgements: We thank GeoSpiza and Partek for their assistance
ABRF MicroArray Research Group
microRNA Synthetic Reference Project
Outline
Latin Square design
Standard   Set 1    Set 2    Set 3    Set 4    Set 5 (majority)
A          0.5x     0.1x     0.01x    0.001x   1x
B          0.001x   0.5x     0.1x     0.01x    1x
C          0.01x    0.001x   0.5x     0.1x     1x
D          0.1x     0.01x    0.001x   0.5x     1x
E          1x       1x       1x       1x       1x
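The Latin square layout above can be reproduced by rotating one list of dilution levels by one position per standard. The sketch below is an illustration of that pattern only. It assumes rows are standards, columns are miRNA sets, Set 5 is always at 1x, and Standard E holds every set at 1x; the function name is hypothetical.

```python
def latin_square_design(levels, standards, reference="1x"):
    """Build the standard-by-set concentration table by cyclic rotation.

    The last standard is treated as the all-reference mix, and one extra
    set (Set 5 in the slides) is held at the reference level throughout.
    """
    n = len(levels)
    design = {}
    for i, standard in enumerate(standards[:-1]):
        # Shift the levels one position to the right for each successive standard.
        rotated = [levels[(j - i) % n] for j in range(n)]
        design[standard] = rotated + [reference]
    design[standards[-1]] = [reference] * (n + 1)
    return design


design = latin_square_design(["0.5x", "0.1x", "0.01x", "0.001x"], ["A", "B", "C", "D", "E"])
for standard, row in design.items():
    print(standard, *row)
```

With this layout every dilution level appears exactly once in each of Sets 1 to 4 across Standards A to D, which is the property that makes the design a Latin square.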
Randomize
Key Name CLUSTALW
1 hsa-miR-664* .........ACUGGCUAGGGAAAAUGAUUGGAU.........
2 hsa-miR-577 ...............UAGAUAAAAUAUUGGUACCUG......
3 hsa-miR-576-3p ...........AAGAUGUGGAAAA.AUUGGAAUC........
4 hsa-miR-208a .........AUAAGACGAGCAAAAAGCUUGU...........
5 hsa-miR-208b .........AUAAGACGAACAAAAGGUUUGU...........
6 hsa-miR-559 ..........UAAAG.UAAAUAUGCACCAAAA..........
7 hsa-miR-9* .........AUAAAGCUAGAUA...ACCGAAAGU........
8 hsa-miR-142-5p ........CAUAAAG.UAGAAA.GCACUACU...........
9 hsa-miR-335 ..........UCAAGAGCAAUAACGAAAAAUGU.........
10 hsa-miR-620 .........AUGGAGAUAGAUAUAGAAAU.............
11 hsa-miR-516a-5p ........UUCUCGAGGAAAGAAGCACUUUC...........
12 hsa-miR-516b ........AUCUGGAGGUAAGAAGCACUUU............
13 hsa-miR-1255a ........AGGAUGAGCAAAGAAAGUAGAUU...........
14 hsa-miR-1255b ........CGGAUGAGCAAAGAAAGUGGUU............
15 hsa-miR-765 ........UGGAGGAG.AAGGAAGGUGAUG............
16 hsa-miR-483-5p ......AAGACGGGAGGAAAGAAGGGAG..............
17 hsa-miR-1 .........UGGAAUGUAAAGAAGUAUGUAU...........
...
Index by sequence composition
Set 1 Set 2 Set 3 Set 4
3 37 5 26
175 132 46 104
193 135 214 246
200 240 237 296
259 245 310 473
531 292 409 485
652 311 414 790
809 594 496 797
830 680 629 846
877 843 665 859
Decimal key Name Sequence Set
121.1 hsa-miR-130b .......CAGUGCAAUGAUGAAAGGGCAU............. 1
121.2 hsa-miR-130a .......CAGUGCAAUGUUAAAAGGGCAU............. 2
121.3 hsa-miR-301b .......CAGUGCAAUGAUAUUGUCAAAGC............ 3
121.4 hsa-miR-301a .......CAGUGCAAUAGUAUUGUCAAAGC............ 4
133.1 hsa-miR-376a ..........AUCAUAGAGGAAAAUCCACGU........... 1
133.2 hsa-miR-376c ..........AACAUAGAGGAAAUUCCACGU........... 2
304.1 hsa-miR-181d .......AACAUUCAUUGUUGUCGGUGGGU............ 3
304.2 hsa-miR-181b .......AACAUUCAUUGCUGUCGGUGGGU............ 4
309.1 hsa-let-7f ..........UGAGGUAGUAGAUUGUAUAGUU.......... 1
309.2 hsa-let-7a ..........UGAGGUAGUAGGUUGUAUAGUU.......... 2
309.3 hsa-let-7e ..........UGAGGUAGGAGGUUGUAUAGUU.......... 3
309.4 hsa-let-7b ..........UGAGGUAGUAGGUUGUGUGGUU.......... 4
309.5 hsa-let-7c ..........UGAGGUAGUAGGUUGUAUGGUU.......... 1
309.6 hsa-miR-98 ..........UGAGGUAGUAAGUUGUAUUGUU.......... 2
309.7 hsa-let-7d ..........AGAGGUAGUAGGUUGCAUAGUU.......... 3
309.8 hsa-let-7g ..........UGAGGUAGUAGUUUGUACAGUU.......... 4
309.9 hsa-let-7i ..........UGAGGUAGUAGUUUGUGCUGUU.......... 1
348.1 hsa-miR-520b ........AAAGUGCUUCC..UUUUAGAGGG........... 2
348.2 hsa-miR-520c-3p ........AAAGUGCUUCC..UUUUAGAGGGU.......... 3
348.3 hsa-miR-520f .........AAGUGCUUCC..UUUUAGAGGGUU......... 4
860.1 hsa-miR-15b ..........UAGCAGCACAUCAUGGUUUACA.......... 1
860.2 hsa-miR-15a ..........UAGCAGCACAUAAUGGUUUGUG.......... 2
871.1 hsa-miR-23a .........AUCACAUUGCCAGGGAUUUCC............ 3
871.2 hsa-miR-23b .........AUCACAUUGCCAGGGAUUACC............ 4
Research Randomizer: 4 sets of 10 unique numbers, range 1-878
Supplemented with related miRNAs
Choose set members
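The random selection step, 4 sets of 10 unique numbers in the range 1-878, was done with Research Randomizer according to the slide. An equivalent draw can be sketched with Python's standard library. The sketch assumes that no key appears in more than one set, which matches the index table shown earlier but is not stated outright, and the function name and seed are hypothetical.

```python
import random


def draw_sets(n_sets=4, set_size=10, low=1, high=878, seed=None):
    """Draw `n_sets` disjoint sets of unique keys from low..high inclusive."""
    rng = random.Random(seed)
    # Sampling all keys at once guarantees uniqueness within and across sets.
    picks = rng.sample(range(low, high + 1), n_sets * set_size)
    return [sorted(picks[i * set_size:(i + 1) * set_size]) for i in range(n_sets)]


sets = draw_sets(seed=2010)
for number, keys in enumerate(sets, start=1):
    print(f"Set {number}: {keys}")
```

Each drawn key would then be looked up in the sequence-ordered index to find the miRNA it names, and related family members could be added afterwards as the slide describes.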
Synthetic reference applications
Standard A,B,C,D,E
MARG members profile by microarrays and sequencing
Deposit data in GEO
Distribute to ABRF members
Public or commercial distribution
Mix with yeast
FFPE
MARG members extract RNA and profile