Microarray analysis: The CCBR’s perspective Manjula Kasoji CCBR 09/29/2014 1

Microarray analysis: The CCBR’s perspective


Page 1: Microarray analysis: The CCBR’s perspective

1

Microarray analysis: The CCBR’s perspective

Manjula Kasoji, CCBR

09/29/2014

Page 2: Microarray analysis: The CCBR’s perspective

2

Common pitfalls

• Number of replicates

• Source, quantity and quality of RNA

• Batch effects

• Adequate expression signal

• Time series experiments

• Non-target tissue contamination

Page 3: Microarray analysis: The CCBR’s perspective

3

No Replicates, No Statistics

A project with no replicates may give you some information, but it is not possible to do statistics without replicates.
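To see why a single sample per group rules out statistics, consider a two-group comparison. The sketch below uses Welch's t statistic purely as an illustration (the slides do not name a specific test). The expression values are made up. With one replicate per group the within-group variance is undefined, so no test statistic can be computed.

```python
from math import sqrt
from statistics import StatisticsError, mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic, or None if a group has fewer than two replicates."""
    try:
        var_a, var_b = variance(group_a), variance(group_b)
    except StatisticsError:
        # The sample variance needs at least two data points per group.
        return None
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical log2 expression values for one gene.
no_replicates = welch_t([8.1], [9.4])
three_replicates = welch_t([8.1, 8.3, 7.9], [9.4, 9.6, 9.2])

print("one sample per group:", no_replicates)
print("three samples per group:", round(three_replicates, 2))
```

The difference in means is the same in both calls; only the replicated design lets us judge it against the variability within each group.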

Page 4: Microarray analysis: The CCBR’s perspective

4

How many biological replicates are needed in a quantitative study?

>= 7 replicates

>= 3 replicates

More replicates if:
• High biological variability
• Contamination by non-target tissues
• Subtle treatment effect
• Multiple treatments
• Mechanism of action
• Network analysis
• And many more…

>= 4-5 replicates

Page 5: Microarray analysis: The CCBR’s perspective

5

Depending on the source of RNA, sometimes even with the required number of replicates, samples do not cluster well

• Embryonic tissue

• Knock-out efficiency may also play a role in good sample clustering.

Quality, quantity and source of RNA influence sample clustering

Restoration of Knockout 1

Knockout 1

Knockout 2

Control

Page 6: Microarray analysis: The CCBR’s perspective

6

Randomization and consistent processing will help avoid batch effects

Treatment

Biological Replicate

Technical Replicate

Array Batch

A A B B
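One way to put randomization into practice is to assign samples to array batches at random while keeping each treatment spread evenly across batches, so that treatment and batch are never confounded. The following is a minimal sketch under assumed names: `assign_batches`, the sample IDs, and the two-batch layout are illustrative and not taken from the slides.

```python
import random
from collections import Counter, defaultdict


def assign_batches(samples, n_batches, seed=0):
    """Randomly assign samples to batches, balancing each treatment across batches.

    samples: list of (sample_id, treatment) tuples.
    Returns a dict mapping sample_id to a 0-based batch index.
    """
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced
    by_treatment = defaultdict(list)
    for sample_id, treatment in samples:
        by_treatment[treatment].append(sample_id)

    assignment = {}
    for treatment in sorted(by_treatment):
        ids = by_treatment[treatment]
        rng.shuffle(ids)  # random order within the treatment group
        # Random starting batch, so leftover samples do not always land in batch 0.
        offset = rng.randrange(n_batches)
        for i, sample_id in enumerate(ids):
            assignment[sample_id] = (offset + i) % n_batches
    return assignment


# Hypothetical design: 2 treatments x 4 biological replicates, 2 array batches.
samples = [(f"{trt}_{rep}", trt) for trt in ("control", "treated") for rep in range(1, 5)]
batches = assign_batches(samples, n_batches=2, seed=42)
per_batch = Counter((batches[sid], trt) for sid, trt in samples)
for (batch, trt), n in sorted(per_batch.items()):
    print(f"batch {batch}: {n} x {trt}")
```

Each batch ends up with two control and two treated samples, so a batch shift cannot be mistaken for a treatment effect.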

Page 7: Microarray analysis: The CCBR’s perspective

7

Example of batch effect

Cell line A

Cell line B

Cell line C

Cell line D

Cell line E

Treated 1

Control 1

Treated 2

Control 2

Batch 1 – Scan Date 02/22/2011

Batch 2 – Scan Date 08/12/2011

Page 8: Microarray analysis: The CCBR’s perspective

8

Batch effects can be visualized via clustering as well

Summary:
• Batch effects can be avoided by good experimental design and randomization.
• Batch effects can be visualized on a PCA plot and by clustering.

Batch A Batch B
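A rough numerical counterpart to inspecting a PCA plot or dendrogram is to ask, for each sample, which other sample it correlates with best. If the closest neighbour usually shares the batch rather than the treatment, a batch effect is likely. The sketch below is a simplified stand-in for PCA or hierarchical clustering, not the method used in the slides, and the expression values are toy numbers constructed to contain a batch shift.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def nearest_neighbours(expr):
    """Map each sample to the other sample with the highest correlation."""
    return {
        s: max((o for o in expr if o != s), key=lambda o: pearson(expr[s], expr[o]))
        for s in expr
    }


# Toy log2 expression for 6 genes; batch 2 carries a gene-specific shift.
expr = {
    "treated_b1": [9, 9, 7, 10, 6, 8],
    "control_b1": [8, 9, 7, 10, 6, 8],
    "treated_b2": [11, 7, 9, 8, 8, 6],
    "control_b2": [10, 7, 9, 8, 8, 6],
}
# Sample annotation: (treatment, batch).
meta = {s: tuple(s.split("_")) for s in expr}

neighbours = nearest_neighbours(expr)
same_batch = sum(meta[s][1] == meta[n][1] for s, n in neighbours.items())
same_treatment = sum(meta[s][0] == meta[n][0] for s, n in neighbours.items())
print(f"nearest neighbour shares batch: {same_batch}/{len(expr)}")
print(f"nearest neighbour shares treatment: {same_treatment}/{len(expr)}")
```

In this toy data every sample pairs with its batch mate rather than its treatment mate, which is the pattern a batch effect produces on a clustering plot.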

Page 9: Microarray analysis: The CCBR’s perspective

9

Weak signal expression across samples confounds analysis results

Group1

Group2

Group3

Group4

• Poor clustering of samples

• Genes regulated by gene A induced upon DNA damage

• 4 different conditions

Page 10: Microarray analysis: The CCBR’s perspective

10

Weak signal leads to few or no significant differentially expressed genes

Contrast               DEGs (FDR 0.05, FC 2)
Group 1 vs. Group 2    0
Group 3 vs. Group 4    0
Group 1 vs. Group 3    0
Group 2 vs. Group 4    0

What can we do in this situation?
• Relax the statistical thresholds (a higher p-value cutoff and a lower FC threshold).
• The caveat is that this will increase the number of false positives and will negatively influence downstream analysis.

Summary:
• Sufficient number of replicates
• Randomization
• Validation
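The effect of the DEG thresholds can be sketched in a few lines. The example below assumes Benjamini-Hochberg adjustment for the FDR (the slides do not say which FDR procedure was used) and works on invented per-gene results. With the strict cutoffs (FDR 0.05, FC 2) no gene passes; loosening both cutoffs lets weak-signal genes through, at the cost of more false positives.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(genes, fdr_cutoff=0.05, fc_cutoff=2.0):
    """genes: list of (name, log2_fold_change, p_value). Returns names passing both cutoffs."""
    adjusted = benjamini_hochberg([p for _, _, p in genes])
    min_log2_fc = math.log2(fc_cutoff)
    return [
        name
        for (name, log2_fc, _), q in zip(genes, adjusted)
        if q <= fdr_cutoff and abs(log2_fc) >= min_log2_fc
    ]


# Hypothetical weak-signal results: small fold changes, modest p-values.
genes = [
    ("geneA", 0.9, 0.01),
    ("geneB", 0.7, 0.03),
    ("geneC", -0.6, 0.04),
    ("geneD", 0.2, 0.50),
    ("geneE", -0.1, 0.90),
]

strict = call_degs(genes, fdr_cutoff=0.05, fc_cutoff=2.0)
relaxed = call_degs(genes, fdr_cutoff=0.10, fc_cutoff=1.5)
print("strict (FDR 0.05, FC 2):", strict)
print("relaxed (FDR 0.10, FC 1.5):", relaxed)
```

Any genes recovered only under the relaxed cutoffs should be treated as candidates for validation rather than as confirmed results.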

Page 11: Microarray analysis: The CCBR’s perspective

11

Adding time points to an experiment can be useful for finding biological relevance

• Comparing immune system response in knockout mouse model to human model after treatment with endotoxin.

• Only one time point in mouse: 24 hours

• 6 time points in human data (0, 2, 4, 6, 9, 24 hours)

WT-Mouse

KO-Mouse

Human 0 hr

Human 2 hr

Human 4 hr

Human 6 hr

Human 9 hr

Human 24 hr

Page 12: Microarray analysis: The CCBR’s perspective

12

A successful project: Sufficient number of replicates and samples of a group cluster well

Treated, Time 1

Treated, Time 2

Not treated, Time 1

Not treated, Time 2

Principal Component Analysis

• Effect of cell density and drug treatment on cell survival and growth.
• Two conditions and 4 samples per group.

Page 13: Microarray analysis: The CCBR’s perspective

13

Diagnosing outliers

PCA plots are a good way to flag outliers

Page 14: Microarray analysis: The CCBR’s perspective

14

Diagnosing outliers: Quality control

• arrayQualityMetrics() from R/Bioconductor
• Metrics measured:
  1. Between array comparison (distance between arrays, PCA)
  2. Array intensity distribution (box plots, density plots)
  3. Affymetrix specific plots on raw data (RLE – Relative Log Expression)
  4. Affymetrix specific plots on raw data (NUSE – Normalized Unscaled Standard Error)
  5. Individual array quality (MA plots)
  6. Spatial distribution of intensities
• If a sample outlier fails more than one QC metric:
  1. That sample should be re-run if possible.
  2. Otherwise, it should be removed from the analysis.
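The decision rule above can be illustrated with one of the listed metrics. The sketch below computes a per-array median RLE (log2 expression minus the per-gene median across arrays) and then applies the "fails more than one QC metric" rule. It is a plain-Python illustration and not the arrayQualityMetrics implementation. The 0.5 cutoff, the sample names, and the pre-recorded failures from other checks are all assumptions made for the example.

```python
from statistics import median


def rle_medians(expr):
    """expr: dict sample -> list of log2 values (same gene order). Returns sample -> median RLE."""
    samples = list(expr)
    n_genes = len(expr[samples[0]])
    # Reference profile: per-gene median across all arrays.
    gene_medians = [median(expr[s][g] for s in samples) for g in range(n_genes)]
    return {s: median(v - m for v, m in zip(expr[s], gene_medians)) for s in samples}


def samples_to_review(failed_metrics, min_failures=2):
    """Return samples that fail at least min_failures QC metrics (i.e. more than one)."""
    return sorted(s for s, failed in failed_metrics.items() if len(failed) >= min_failures)


# Toy log2 intensities for 5 genes; s4 is shifted upwards on every gene.
expr = {
    "s1": [8.0, 9.0, 7.0, 10.0, 6.0],
    "s2": [8.1, 9.1, 6.9, 10.0, 6.1],
    "s3": [7.9, 8.9, 7.1, 9.9, 5.9],
    "s4": [9.5, 10.5, 8.5, 11.5, 7.5],
}

# Hypothetical failures already recorded from other checks (box plot, MA plot, ...).
failed = {"s1": set(), "s2": {"ma_plot"}, "s3": set(), "s4": {"boxplot"}}

RLE_CUTOFF = 0.5  # assumed threshold, chosen only for this example
rle = rle_medians(expr)
for sample, value in rle.items():
    if abs(value) > RLE_CUTOFF:
        failed[sample].add("rle")

flagged = samples_to_review(failed)
print("median RLE per array:", {s: round(v, 2) for s, v in rle.items()})
print("re-run or remove:", flagged)
```

Sample s2 fails only one check and is kept; s4 fails two and is flagged for a re-run or removal.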

Page 15: Microarray analysis: The CCBR’s perspective

15

Diagnosing outliers: Quality control

Density Plot Box Plot Heat map

Page 16: Microarray analysis: The CCBR’s perspective

16

Sufficient number of replicates and good quality lead to sufficient number of DEGs

Contrast Groups                            DEGs
Treated-Day 1 vs. Not Treated Day 2        288
Treated-Day 2 vs. Not Treated Day 2        434
Treated-Day 2 vs. Treated-Day 1            105
Not Treated Day 2 vs. Not Treated Day 1    264

Summary:
• Sufficient replicates and good quality samples yield a successful project.
• Outliers can be diagnosed by visualization on a PCA plot and checking technical QC metrics to ensure that the outlier is not due to biological variability.

Significant, differentially expressed genes (DEGs), p-value 0.05, FC 2

Page 17: Microarray analysis: The CCBR’s perspective

17

Downstream analysis: Functional enrichment using IPA

• Question: Which genes are associated with the growth-suppressive effect of low cell density on cell proliferation and survival?

• Time 1 = low cell density, Time 2 = high density

[Venn diagram: DEGs from "Time 2, Treated vs. Non-treated" and "Time 1, Treated vs. Non-treated"; region counts 224, 210, 78]

Subset of the 10 genes specifically involved in the Cellular Growth and Proliferation function that are also predicted to be growth suppressive.

Top 5 Bio-functions
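Before a functional enrichment step, the two DEG lists are typically compared to find genes specific to one contrast and genes shared by both. A minimal sketch of that comparison with Python sets follows. The gene identifiers are placeholders, and the counts it prints are unrelated to the numbers on the slide.

```python
def venn_counts(degs_a, degs_b):
    """Count genes unique to each DEG list and genes shared by both."""
    a, b = set(degs_a), set(degs_b)
    return {"only_a": len(a - b), "shared": len(a & b), "only_b": len(b - a)}


# Placeholder gene identifiers for the two contrasts.
time2_degs = ["g1", "g2", "g3", "g4", "g5"]
time1_degs = ["g4", "g5", "g6"]

counts = venn_counts(time2_degs, time1_degs)
low_density_specific = sorted(set(time1_degs) - set(time2_degs))
print(counts)
print("specific to Time 1 (low density):", low_density_specific)
```

The contrast-specific list is the one that would be passed on to a tool such as IPA for the question posed above.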

Page 18: Microarray analysis: The CCBR’s perspective

19

Visualization of networks in IPA

Interaction network Interaction network expanded to include connections to upstream molecules

Page 19: Microarray analysis: The CCBR’s perspective

20

After the analysis

• Submit data to public repository and provide required metadata

Page 20: Microarray analysis: The CCBR’s perspective

21

What you need to provide to CCBR

[Workflow diagram: numbered steps 1 to 4 linking Investigator, CCBR, and Microarray Facility]

• Give us a visit before you begin your experiment
• Raw data (e.g. .CEL files)
• Metadata (type of array, platform, species, experimental design information, processing dates)
  – http://ccrifx.cancer.gov/apps/site/example_microarray
• Your goals and participation
• Submit your project request
  – https://ccrifx.cancer.gov/apps/project_request/request_project
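A quick completeness check on the metadata sheet before submission can save a round trip. The sketch below validates a CSV sample sheet against a list of required columns. The column names are assumptions derived from the metadata items listed above, not an official CCBR template; the example page linked above is the place to check for the actual format.

```python
import csv
import io

# Assumed column names; adapt them to the template you are given.
REQUIRED_COLUMNS = (
    "sample_id", "cel_file", "array_type", "platform",
    "species", "group", "processing_date",
)


def check_sample_sheet(text):
    """Return a list of problems found in a CSV sample sheet (empty list if none)."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    problems = []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    # Data rows start on line 2 of the file (line 1 is the header).
    for line_no, row in enumerate(reader, start=2):
        for col in REQUIRED_COLUMNS:
            if col in header and not (row[col] or "").strip():
                problems.append(f"line {line_no}: empty {col}")
    return problems


# Hypothetical sheet: the second sample has no processing date.
sheet = """sample_id,cel_file,array_type,platform,species,group,processing_date
S1,S1.CEL,expression,Affymetrix,mouse,control,2014-09-01
S2,S2.CEL,expression,Affymetrix,mouse,treated,
"""

problems = check_sample_sheet(sheet)
print(problems)
```

Recording the processing date for every array matters in particular, because it is what allows batch effects to be detected later.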

Page 21: Microarray analysis: The CCBR’s perspective

22

If you want to perform the analysis on your own, you need to…

• Learn appropriate QC methods, different statistical tests, and experimental designs
• Know what is in your tool box
  – Command line
    • Affymetrix Power Tools (APT): for Macs, command line only; free
    • R/Bioconductor packages
  – GUI tools
    • Affymetrix Expression Console (PC only): free
    • Partek
    • Gene Set Enrichment Analysis (GSEA)
    • Ingenuity Pathway Analysis (IPA)
• To take this further
  – Know how to run command line programs
  – Learn how to script (R/Bioconductor)
  – Learn different R packages

Page 22: Microarray analysis: The CCBR’s perspective

23

Recap

• Appropriate experimental design

• Sufficient replicates to have statistical power

• Consistent processing to avoid batch effects

• Raw data and meta data

• Visualization

• Validation

* Continuous interaction with CCBR

Page 23: Microarray analysis: The CCBR’s perspective

Acknowledgements

CCRIFX
• Fathi Elloumi, PhD
• Parthav Jailwala, MS
• Li Jia, MS
• Manjula Kasoji, MS
• Anjan Purkayastha, PhD
• Anand S Merchant, MD, PhD
• Eric Stahlberg, PhD

CCR experts
• Maggie Cam, PhD
• Sean Davis, MD, PhD
• Max Lee, PhD
• Peter FitzGerald, PhD
• David Goldstein, PhD

Sequencing Facility
• Yongmei Zhao, MS
• Bao Tran, MS

ABCC
• Brian Luke, PhD
• Uma Mudunuri, MS
• Bob Stephens, PhD
• Ming Yi, PhD
• Jack Collins, PhD

Page 24: Microarray analysis: The CCBR’s perspective

25

Questions??

Contact

CCBR home page: http://ccrifx.cancer.gov/apps/site/default

CCBR email: [email protected]

Building 37, room 1123
Building 41, room B620

Office hours: Fridays 9:30am - 11:30am