Probe Selection for Microarrays: Considerations and Pitfalls. Kay Hofmann, MEMOREC Stoffel GmbH, Cologne/Germany

Page 1: Probe Selection for Microarrays

Probe Selection for Microarrays

Considerations and Pitfalls

Kay Hofmann, MEMOREC Stoffel GmbH, Cologne/Germany

Page 2: Probe Selection for Microarrays

Probe selection wish list

Probe selection strategy should ensure

Biologically meaningful results (The truth...)

Coverage, Sensitivity (... The whole truth...)

Specificity (... And nothing but the truth)

Annotation

Reproducibility

Page 3: Probe Selection for Microarrays

Technology

Probe immobilization

Oligonucleotide coupling: synthesis with linker, covalent coupling to surface

Oligonucleotide photolithography

ds-cDNA coupling: cDNA generated by PCR, nonspecific binding to surface

ss-cDNA coupling: PCR with one modified primer, covalent coupling, 2nd strand removal

Spotting

With contact (pin-based systems)

Without contact (ink jet technology)

Page 4: Probe Selection for Microarrays

Technology-specific requirements

General

Not too short (sensitivity, selectivity)

Not too long (viscosity, surface properties)

Not too heterogeneous (robustness)

Degree of importance depends on method
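
The general requirements above can be turned into a simple sanity check on a candidate probe set. The following is a minimal sketch only: the function names and all thresholds (`min_len`, `max_len`, `max_gc_sd`) are illustrative assumptions, not values from the slides, and GC content is used here merely as one convenient proxy for heterogeneity.

```python
from statistics import pstdev


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_probe_set(probes, min_len=200, max_len=1000, max_gc_sd=0.08):
    """Return (name, problem) tuples for a dict of probe name -> sequence.

    All thresholds are hypothetical defaults; suitable values depend on
    the array technology in use.
    """
    problems = []
    for name, seq in probes.items():
        if len(seq) < min_len:
            problems.append((name, "too short"))
        elif len(seq) > max_len:
            problems.append((name, "too long"))
    # Heterogeneity across the whole set, measured as the spread of GC content.
    gc_values = [gc_fraction(seq) for seq in probes.values()]
    if len(gc_values) > 1 and pstdev(gc_values) > max_gc_sd:
        problems.append(("<set>", "heterogeneous GC content"))
    return problems


if __name__ == "__main__":
    example = {"probe_a": "ACGT" * 60, "probe_b": "AT" * 50}
    print(check_probe_set(example))
```

How strictly such limits should be enforced depends on the method, as noted above.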

Single strand methods (Oligos, ss-cDNA)

Orientation must be known

ss-cDNA methods are not perfect

ds-cDNA methods don’t care

Page 5: Probe Selection for Microarrays

Probe selection approaches

[Figure: probe selection approaches compared by accuracy and throughput. Approaches shown: selected gene regions, selected genes, anonymous clones, ESTs, cluster representatives.]

Page 6: Probe Selection for Microarrays

Non-Selective Approaches

EST spotting

Using clones from a library after sequencing

Little justification, since sequence availability allows selection

Anonymous (blind) spotting

Using clones from a library without prior sequencing

Only clones with interesting expression pattern are sequenced

Normalization of library highly recommended

Typical uses:

HT-arrays of ‘exotic’ organisms or tissues

Large-scale verification of DD clones

Page 7: Probe Selection for Microarrays

Spotting of cluster representatives

Sequence Clustering

For human / mouse / rat EST clones: public cluster libraries

Unigene (NCBI)

THC (TIGR)

For custom sequence: clustering tools

STACK_PACK (SANBI)

JESAM (HGMP)

PCP (Paracel, commercial)

Page 8: Probe Selection for Microarrays

A benign clustering situation

Page 9: Probe Selection for Microarrays

In the absence of 5'-3' links

Two clusters corresponding to one gene

Page 10: Probe Selection for Microarrays

Overlap too short

Three clusters corresponding to one gene

Page 11: Probe Selection for Microarrays

Chimeric ESTs

One cluster corresponding to two genes
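
The clustering situations on the preceding slides can be reproduced with a toy single-linkage clustering. This is a sketch under simplifying assumptions: ESTs are plain identifiers, "overlaps" are given as ready-made pairs (a real tool would derive them from sequence comparison), and the EST names are hypothetical. It is not meant to describe how Unigene, THC or any of the listed tools work internally.

```python
def cluster(est_ids, overlaps):
    """Single-linkage clustering of ESTs using a union-find structure.

    est_ids  -- iterable of EST identifiers
    overlaps -- iterable of (est_a, est_b) pairs considered linked
    Returns a list of sets, one per cluster.
    """
    est_ids = list(est_ids)
    parent = {e: e for e in est_ids}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in overlaps:
        parent[find(a)] = find(b)

    groups = {}
    for e in est_ids:
        groups.setdefault(find(e), set()).add(e)
    return sorted(groups.values(), key=sorted)


if __name__ == "__main__":
    gene_a = ["a5_1", "a5_2", "a3_1", "a3_2"]  # 5' and 3' reads of gene A
    links = [("a5_1", "a5_2"), ("a3_1", "a3_2")]

    # No 5'-3' link: two clusters for one gene.
    print(cluster(gene_a, links))

    # A clone sequenced from both ends supplies the 5'-3' link: one cluster.
    print(cluster(gene_a, links + [("a5_1", "a3_1")]))

    # A chimeric EST overlapping gene A and gene B merges two genes.
    ests = gene_a + ["b_1", "b_2", "chimera"]
    chim_links = links + [("a5_1", "a3_1"), ("b_1", "b_2"),
                          ("chimera", "a3_1"), ("chimera", "b_1")]
    print(cluster(ests, chim_links))
```

Because single linkage joins clusters through any one shared member, a single chimeric EST is enough to fuse two otherwise unrelated genes.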

Page 12: Probe Selection for Microarrays

Chimeric ESTs (continued)

Chimeric ESTs are quite common

Chimeric ESTs are a major nuisance for array probe selection

One of the fusion partners is usually a highly expressed mRNA

Double-picking of chimeric ESTs can fool even cautious clustering programs.

Unigene contains several chimeric clusters

The annotation of chimeric clusters is erratic

Chimeric ESTs can be detected by genome comparison

There is one particularly bad class of chimeric sequences that will be the subject of the exercises.
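
Detection of chimeric ESTs by genome comparison can be sketched as follows. The example assumes that an EST has already been aligned to the genome by some external tool and that the result is available as a list of alignment blocks; the `Hit` record, its field names and both thresholds are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One alignment block of an EST against the genome (hypothetical format)."""
    q_start: int   # start position in the EST
    q_end: int     # end position in the EST
    chrom: str     # chromosome or contig name
    g_start: int   # start position in the genome
    g_end: int     # end position in the genome


def looks_chimeric(hits, max_gap=500_000, min_block=50):
    """Heuristic: flag an EST whose consecutive blocks map to different
    chromosomes, or to the same chromosome but implausibly far apart.

    max_gap and min_block are assumed values, not established cut-offs.
    """
    blocks = sorted(
        (h for h in hits if h.q_end - h.q_start >= min_block),
        key=lambda h: h.q_start,
    )
    for left, right in zip(blocks, blocks[1:]):
        if left.chrom != right.chrom:
            return True
        if abs(right.g_start - left.g_end) > max_gap:
            return True
    return False


if __name__ == "__main__":
    spliced = [Hit(0, 200, "chr1", 10_000, 10_200),
               Hit(200, 450, "chr1", 14_000, 14_250)]
    fused = [Hit(0, 200, "chr1", 10_000, 10_200),
             Hit(200, 450, "chr7", 5_000, 5_250)]
    print(looks_chimeric(spliced), looks_chimeric(fused))
```

A flagged EST is only a candidate: large introns or assembly errors can produce the same pattern, so manual inspection remains advisable.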

Page 13: Probe Selection for Microarrays

How to select a cluster representative

If possible, pick a clone with completely known sequence

Avoid problematic regions: Alu repeats, B1, B2 and other SINEs; LINEs; endogenous retroviruses; microsatellite repeats

Avoid regions with high similarity to non-identical sequences

In many clusters, orientation and position relative to ORF are unknown and cannot be selected for.

Test selected clone for sequence correctness

Test selected clone for chimerism

Some commercial providers offer sequence-verified Unigene cluster representatives

Page 14: Probe Selection for Microarrays

Selection of genes

If possible, use all of them

Biased selection: by tissue, by topic, by visibility, by known expression properties, from unbiased pre-screen

Use sources of expression information: EST frequency, published array studies, SAGE data

Page 15: Probe Selection for Microarrays

Selection of gene regions

3' UTR

ORF

5' UTR

Page 16: Probe Selection for Microarrays

Alternative polyadenylation

Page 17: Probe Selection for Microarrays

Alternative polyadenylation

Constitutive polyA heterogeneity: 3' fragments have reduced sensitivity, no impact on expression ratio

Regulated polyA heterogeneity: fragment choice influences expression ratio, multiple fragments necessary

Detection of cryptic polyA signals: prediction (AATAAA), polyadenylated ESTs, SAGE tags
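
Prediction of polyA signals from the AATAAA hexamer amounts to a motif scan of the 3' UTR. The sketch below assumes the UTR is available as a plain sense-strand string; the function name is hypothetical, and only the canonical hexamer named on the slide is searched.

```python
import re


def find_polya_signals(utr: str, motif: str = "AATAAA"):
    """Return 0-based start positions of every occurrence of the motif.

    A lookahead is used so that overlapping occurrences are also reported.
    RNA-style input (U instead of T) is accepted.
    """
    utr = utr.upper().replace("U", "T")
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", utr)]


if __name__ == "__main__":
    utr = "GCTTGCAATAAAGTCTGGCCTTAGGCATCCAATAAACTTG"
    print(find_polya_signals(utr))
```

Any hit upstream of the known 3' end is a candidate cryptic signal. A hexamer match alone is weak evidence, which is why the slide also lists polyadenylated ESTs and SAGE tags as supporting data.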

Page 18: Probe Selection for Microarrays

Alternative splicing

Page 19: Probe Selection for Microarrays

Alternative splicing

Constitutive splice form heterogeneity: fragment in alternative exon has reduced sensitivity, no impact on expression ratio

Regulated splice form heterogeneity: fragment choice influences expression ratio, multiple fragments necessary

Detection of alternative splicing events: hard/impossible to predict; EST analysis (beware of pre-mRNA); literature
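
Once splice forms are known (from EST analysis or literature), separating constitutive from alternative exons is a set operation. This is a minimal sketch assuming each isoform is described as a list of exon coordinate pairs on a shared reference; the isoform names and coordinates are invented for illustration.

```python
def constitutive_exons(isoforms):
    """Exons present in every isoform.

    isoforms -- dict mapping isoform name to a list of (start, end) exons.
    """
    exon_sets = [set(exons) for exons in isoforms.values()]
    return sorted(set.intersection(*exon_sets))


def alternative_exons(isoforms):
    """Exons present in at least one isoform but not in all of them."""
    exon_sets = [set(exons) for exons in isoforms.values()]
    return sorted(set.union(*exon_sets) - set.intersection(*exon_sets))


if __name__ == "__main__":
    # Hypothetical gene with one cassette exon.
    isoforms = {
        "long":  [(100, 250), (400, 520), (700, 900)],
        "short": [(100, 250), (700, 900)],
    }
    print(constitutive_exons(isoforms))
    print(alternative_exons(isoforms))
```

A fragment placed in constitutive exons reports the overall mRNA level, while additional fragments in alternative exons are needed when splice form regulation itself is of interest.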

Page 20: Probe Selection for Microarrays

Alternative promoter usage

Page 21: Probe Selection for Microarrays

Alternative promoter usage

What is the desired readout?

If promoter activity matters most: multiple fragments

If overall mRNA level matters most: downstream fragment

Detection of alternative promoter usage: prediction difficult (possible?), EST analysis, literature

Page 22: Probe Selection for Microarrays

UDP-Glucuronosyltransferases

UGT1A8

UGT1A7

Page 23: Probe Selection for Microarrays

Selection of gene regions

Coding region (ORF): annotation relatively safe; no problems with alternative polyA sites; no repetitive elements or other funny sequences; danger of close isoforms; danger of alternative splicing; might be missing in short RT products

3' untranslated region: annotation less safe; danger of alternative polyA sites; danger of repetitive elements; less likely to cross-hybridize with isoforms; little danger of alternative splicing

5' untranslated region: close linkage to promoter; frequently not available

Page 24: Probe Selection for Microarrays

A checklist

Pick a gene

Try to get a complete cDNA sequence

Verify sequence architecture (e.g. cross-species comparison)

Mask repetitive elements (and vector!)

If possible, discard 3’-UTR beyond first polyA signal

Look for alternative splice events

Use remaining region of interest for similarity searches

Mask regions that could cross-hybridize

Use the remaining region for probe amplification or EST selection

When working with ESTs, use sequence-verified clones
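
The masking steps of the checklist end with the question of which region remains usable. The sketch below assumes that repeats, vector, 3' UTR beyond the first polyA signal and cross-hybridizing regions have already been identified and are supplied as half-open `(start, end)` intervals; the function name and the 200 bp default are assumptions for illustration.

```python
def longest_unmasked(length, masked, min_len=200):
    """Return the longest unmasked (start, end) region of a sequence.

    length  -- length of the cDNA sequence
    masked  -- iterable of half-open (start, end) intervals to avoid;
               intervals may overlap or touch
    min_len -- minimal acceptable fragment size (assumed default)
    Returns None if no unmasked region reaches min_len.
    """
    best = (0, 0)
    pos = 0  # end of the masked stretch seen so far
    # A zero-length sentinel at the sequence end closes the last gap.
    for start, end in sorted(masked) + [(length, length)]:
        start = min(start, length)
        if start - pos > best[1] - best[0]:
            best = (pos, start)
        pos = max(pos, min(end, length))
    return best if best[1] - best[0] >= min_len else None


if __name__ == "__main__":
    # Hypothetical 1000 bp cDNA: vector at the 5' end, an Alu element,
    # and 3' UTR beyond the first polyA signal.
    masks = [(0, 50), (300, 350), (900, 1000)]
    print(longest_unmasked(1000, masks))
```

The returned coordinates can then serve as the target for primer design or as the region that a selected EST should cover.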

Page 25: Probe Selection for Microarrays
Page 26: Probe Selection for Microarrays

Exercises

1) Assume that you are interested in the p53 homolog p63, also known as Ket (TrEMBL: Q9UE10). What kind of fragment(s) would you use for expression analysis? Why?

2) The cytochrome P450 family is very important for toxicological microarray analysis, since most isoforms respond to different toxic compounds. Is it possible to design a cDNA fragment (minimal size 200 bp) that would be able to separate CYP2A6 and CYP2A7? What is the situation with CYP1A1 and CYP1A2? What region should be used?

3) Check whether probes for p53 (Swissprot: P53_HUMAN), p63 and p73 (P73_HUMAN) are available on the Affymetrix human 35K chip or the mouse 12K chip. Check whether there are sequence-verified clones available from Research Genetics.

4) Two (hypothetical) papers using different types of microarrays report very different results for the regulation of the thyroid receptor alpha-2 (Swissprot: THA2_HUMAN). Can you think of a possible explanation? What could you do to resolve this issue?
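
For exercise 2, one way to look for a discriminating fragment is to slide a window over a pairwise alignment of the two isoforms and report the least similar stretch. This sketch assumes the two sequences have already been aligned (for example with the two-way alignment tool listed for the exercises) and are passed in as equal-length strings with `-` for gaps; the function names are hypothetical, and no result for the actual CYP sequences is implied.

```python
def window_identity(a: str, b: str, window: int = 200):
    """Identity of every window over two pre-aligned sequences.

    Returns a list of (start, identity) tuples. Gap columns count as
    mismatches, and the window is measured in alignment columns.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = [int(x == y and x != "-") for x, y in zip(a.upper(), b.upper())]
    result = []
    current = sum(matches[:window])
    for start in range(len(a) - window + 1):
        if start > 0:
            # Slide by one column: add the new column, drop the old one.
            current += matches[start + window - 1] - matches[start - 1]
        result.append((start, current / window))
    return result


def most_divergent_window(a: str, b: str, window: int = 200):
    """First window with the lowest identity.

    Raises ValueError if the alignment is shorter than the window.
    """
    return min(window_identity(a, b, window), key=lambda t: t[1])


if __name__ == "__main__":
    # Toy alignment: identical first half, fully divergent second half.
    seq1 = "A" * 10 + "C" * 10
    seq2 = "A" * 10 + "G" * 10
    print(most_divergent_window(seq1, seq2, window=5))
```

If even the most divergent window is highly identical, a cDNA fragment of that size is unlikely to separate the two isoforms, and a different region or probe type has to be considered.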

Page 27: Probe Selection for Microarrays

Tools for Exercises

1) Literature search with PubMed: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed

2) Sequence search & retrieval (SwissProt, Entrez): http://www.expasy.ch/sprot/ and http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=Nucleotide

3) BLAST searches at SIB: http://www.ch.embnet.org/software/aBLAST.html (use a specific subdatabase and mind the 'repsim' filter)

4) Two-way sequence alignment: http://www.ch.embnet.org/software/LALIGN_form.html