RNA-RNA interaction: A biological crash course and introduction to prediction methods

Page 1: RNA-RNA interaction A biological crash course and introduction to prediction methods

RNA-RNA interaction

A biological crash course and introduction to prediction methods

Page 2: RNA-RNA interaction A biological crash course and introduction to prediction methods

Part I – Biological crash course

Bacteria
  Plasmid copy control
  Post-segregational killing systems
  trans-encoded chromosomal RNAs
RNA interference (gene silencing)
Translation regulation
  C. elegans developmental regulation
  miRNA-miRNA interactions
Human telomerase

Page 3: RNA-RNA interaction A biological crash course and introduction to prediction methods

DNA vs. RNA

       Bases      # Strands   Structure
DNA    A,C,G,T    2           Double helix
RNA    A,C,G,U    1 or 2      Stem-loop, pseudoknots, etc.

Page 4: RNA-RNA interaction A biological crash course and introduction to prediction methods

Gene expression

Central dogma of molecular biology

Page 5: RNA-RNA interaction A biological crash course and introduction to prediction methods

Translation

mRNA -> protein via triplet code
What happens if mRNA is destroyed or otherwise can’t be translated?

Page 6: RNA-RNA interaction A biological crash course and introduction to prediction methods

Bacteria backgrounder

Single-celled organisms
Prokaryotes = no nucleus
Multi-cistronic transcripts -> multiple genes transcribed at one time, often with overlapping reading frames

Page 7: RNA-RNA interaction A biological crash course and introduction to prediction methods

Bacterial genetic information

Bacterial chromosome (1)
  Genome of organism
  Required for life
Plasmids (2)
  Circular DNA molecules
  Double-stranded
  Independently self-replicating
  Not required for life, often confer selective advantage such as antibiotic resistance

Page 8: RNA-RNA interaction A biological crash course and introduction to prediction methods

Plasmid replication

(1), (2) – Genes encoded on plasmid
(3) – Origin of replication (ORI)

Page 9: RNA-RNA interaction A biological crash course and introduction to prediction methods

Plasmid copy control

Recall independent self-replication
Copy number fluctuations are unavoidable
  Too many -> “runaway”, host dies
  Too few -> increased risk of plasmid loss

Problem: How to control copy count?

Solution: negative feedback loop mediated by RNA-RNA interaction

Page 10: RNA-RNA interaction A biological crash course and introduction to prediction methods

R1 copy control

Genes:
  oriR1 – origin of replication
  repA – lots of this protein product is required for replication initiation
  tap – translation of protein product is required for translation of repA protein
  copA – product is antisense RNA
  copB – product is a repressor protein (not covered here)

Page 11: RNA-RNA interaction A biological crash course and introduction to prediction methods

R1 copy control (2)

copA – RNA with stem-loop structure
copT – target segment of repA/tap mRNA, also forms a stem-loop structure
Single loop-loop interaction

Page 12: RNA-RNA interaction A biological crash course and introduction to prediction methods

R1 copy control (3)

Page 13: RNA-RNA interaction A biological crash course and introduction to prediction methods

R1 copy control (4)

copA RNA is unstable; it degrades
If not enough plasmids are producing copA antisense RNA (copy number is too low), more repA protein can be produced
Therefore the plasmid can replicate
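The feedback loop described on the R1 copy control slides can be illustrated with a toy model. This is only a sketch under invented assumptions: the rate law, the parameter names (`k_rep`, `decay`, `K`) and their values are hypothetical and are not taken from the slides or from measured R1 kinetics. The model assumes that copA tracks copy number (because it is unstable) and that it reduces the per-plasmid replication rate.

```python
def simulate_copy_number(n0, k_rep=2.0, decay=1.0, K=10.0, dt=0.01, steps=5000):
    """Euler-integrate the toy model dn/dt = n * (k_rep / (1 + n / K) - decay).

    n      : plasmid copy number (treated as a continuous quantity)
    k_rep  : hypothetical maximum replication rate per plasmid
    decay  : hypothetical loss rate (e.g. dilution by cell division)
    K      : hypothetical copy number at which copA halves the replication rate
    """
    n = n0
    for _ in range(steps):
        # More plasmids -> more copA antisense RNA -> stronger inhibition of repA.
        inhibition = 1.0 + n / K
        n += dt * n * (k_rep / inhibition - decay)
    return n


# Start below and above the set point; both runs approach the same value,
# which for this model is K * (k_rep / decay - 1).
for start in (1.0, 50.0):
    print(start, round(simulate_copy_number(start), 2))
```

In this toy model a low copy number means little inhibition and net replication, while a high copy number means strong inhibition and net loss, so the copy number settles at a stable set point.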

Page 14: RNA-RNA interaction A biological crash course and introduction to prediction methods

Post-segregational killing systems

Plasmid self-preservation mechanism
Bacterial host losing plasmid results in host death
R1 plasmid hok/sok system is the prototype
All such systems work similarly

Page 15: RNA-RNA interaction A biological crash course and introduction to prediction methods

R1 hok/sok system

hok/sok locus encodes:
  hok protein – “host killing”
  Overlapping reading frame – mok – “modulator of killing”
  sok RNA – “suppressor of killer”

mok must be translated for hok to be expressed

mok cannot be translated if sok is present

Page 16: RNA-RNA interaction A biological crash course and introduction to prediction methods

R1 hok/sok system (2)

hok mRNA is extremely compact
  Many stem-loop structures
  Flush 5’ – 3’ pairing
  Highly stable -> long half-life
  Translationally inert
mok segment is both:
  Translationally active
  Able to bind sok inhibitor RNA

Page 17: RNA-RNA interaction A biological crash course and introduction to prediction methods

R1 hok/sok system (3)

sok RNA is highly unstable
Bacteria with R1 have lots of sok produced
  sok binds mok, hok is not translated
Bacteria which lose R1 have:
  Lots of stable hok mRNA
  Quickly degrading sok RNA (low stability)
  No new sok RNA being produced
  hok is translated -> bacterium dies

Page 18: RNA-RNA interaction A biological crash course and introduction to prediction methods

Bacterial chromosomes

Plasmid antisense RNAs are generally cis-encoded
  Implies complete Watson-Crick complementarity
Bacterial chromosomes contain trans-encoded antisense RNAs
  Not necessarily complete complementarity

Often stress-related control systems

Page 19: RNA-RNA interaction A biological crash course and introduction to prediction methods

oxyS/fhlA in E. coli

oxyS – RNA transcript induced by stress

fhlA – mRNA encoding a transcriptional activator; the target of oxyS

oxyS/fhlA complex binds via two loop-loop interactions

Page 20: RNA-RNA interaction A biological crash course and introduction to prediction methods

RNA interference (RNAi)

a.k.a. post-transcriptional gene silencing
Double-stranded RNAs are introduced into the cell
  Complementary to mRNA for a gene
  Directly introduced in a wet lab, or
  Produced by the cell itself

Page 21: RNA-RNA interaction A biological crash course and introduction to prediction methods

RNA interference (2)

dsRNAs are cleaved into 21-23 nt segments (“small interfering RNAs”, or siRNAs) by an enzyme called Dicer

Page 22: RNA-RNA interaction A biological crash course and introduction to prediction methods

RNA interference (3)

siRNAs are incorporated into RNA-induced silencing complex (RISC)

Page 23: RNA-RNA interaction A biological crash course and introduction to prediction methods

RNA interference (4)

Guided by base complementarity of the siRNA, the RISC targets mRNA for degradation
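Targeting by base complementarity can be sketched as a plain string search: the site on the mRNA is the reverse complement of the siRNA guide strand. The sketch below assumes perfect complementarity over the full guide length, and the 21-nt guide sequence and the toy mRNA are made up for illustration.

```python
from typing import List

# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(guide: str, mrna: str) -> List[int]:
    """Return 0-based start positions in `mrna` that pair perfectly with `guide`."""
    site = reverse_complement(guide)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


# Hypothetical 21-nt guide strand and a toy mRNA built to contain its target site.
guide = "UUCGAAGUACUCAGCGUAAGU"
mrna = "AUGGC" + reverse_complement(guide) + "GGAUAA"
print(find_target_sites(guide, mrna))
```

A real RISC tolerates some mismatches, so an exact search like this is only the simplest possible reading of "guided by base complementarity".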

Page 24: RNA-RNA interaction A biological crash course and introduction to prediction methods

RNA interference – why?

Studying gene function
  Knock out or inhibit a gene’s normal function
  Can the organism survive?
  What phenotypic changes are observed?
Therapeutic suppression
  E.g. cancer treatment

Page 25: RNA-RNA interaction A biological crash course and introduction to prediction methods

micro RNA (miRNA)

Gene expression regulation
Created by similar process to siRNA
Generally prevents binding of ribosome

Page 26: RNA-RNA interaction A biological crash course and introduction to prediction methods

Ex: C. elegans development

lin-4 and let-7 antisense RNAs
Regulate larval development in C. elegans
One of the two binding sites for lin-41 and let-7 interaction:

Page 27: RNA-RNA interaction A biological crash course and introduction to prediction methods

Human telomerase

Telomerase = ribonucleoprotein complex
  Ribonucleo = ribonucleic acid (RNA) component
  Protein = protein component
Responsible for maintaining telomere length in eukaryotic chromosomes
Main components:
  Telomerase reverse transcriptase
  Human telomerase RNA (hTR)

Page 28: RNA-RNA interaction A biological crash course and introduction to prediction methods

Human telomerase (2)

Reverse transcriptase
  Transcribes RNA to DNA (rather than the usual DNA to RNA)
Telomeres – repeated regions at the end of eukaryotic chromosomes
hTR is the template for the repeated region

Page 29: RNA-RNA interaction A biological crash course and introduction to prediction methods

Human telomerase (3)

hTR 11-nt templating region consists of:
  Repeat template: CUAACCC
  Alignment domain: UAAC
Positions telomerase on the DNA strand
Provides template for repeat region
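As a worked example, reverse-transcribing the 11-nt templating region gives DNA that contains the human telomeric repeat TTAGGG. The sketch assumes that the repeat template and the alignment domain are joined 5' to 3' in the order listed on the slide; the function name is hypothetical.

```python
# Map each RNA template base to the DNA base that pairs with it.
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")


def templated_dna(rna_template: str) -> str:
    """Return the DNA (5'->3') synthesized from an RNA template (5'->3')."""
    # The new strand is antiparallel to the template, hence the reversal.
    return rna_template.translate(RNA_TO_DNA)[::-1]


template = "CUAACCC" + "UAAC"  # repeat template + alignment domain (assumed order)
dna = templated_dna(template)
print(dna)              # GTTAGGGTTAG
print("TTAGGG" in dna)  # True
```

The alignment domain pairs with the end of the existing telomere, which is why part of the repeat appears twice in the 11-nt product.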

Page 30: RNA-RNA interaction A biological crash course and introduction to prediction methods

Human telomerase (4)

Page 31: RNA-RNA interaction A biological crash course and introduction to prediction methods

Loop-loop interaction

Sometimes referred to as “kissing loops”
Recall that all of the RNA-RNA interactions discussed so far (except RNAi) involve loop-loop interaction

Predicting miRNA transcripts and targets involves loop structure prediction

Page 32: RNA-RNA interaction A biological crash course and introduction to prediction methods

References

Couzin, J. (2002) “Breakthrough of the year – Small RNAs make big splash.” Science 298(5602):2296-2297.

Lai, E.C., Wiel, C., and Rubin, G.M. (2004) “Complementary miRNA pairs suggest a regulatory role for miRNA:miRNA duplexes.” RNA 10(2):171-175.

Moss, E.G. (2001) “RNA interference – It’s a small RNA world.” Current Biology 11(19):R772-R775.

Sharp, P.A. (2001) “RNA interference – 2001.” Genes and Development 15(5):485-490.

Shi, Y. (2003) “Mammalian RNAi for the masses.” TRENDS in Genetics 19(1):9-12.

Page 33: RNA-RNA interaction A biological crash course and introduction to prediction methods

References (2)

Ueda, C.T., and Roberts, R.W. (2004) “Analysis of a long-range interaction between conserved domains of human telomerase RNA.” RNA 10(1):139-147.

Wagner, E.G.H. and Flärdh, K. (2002) “Antisense RNAs everywhere?” TRENDS in Genetics 18(5):223-226.

Wagner, E.G.H., Altuvia, S., and Romby, P. (2002) “Antisense RNAs in bacteria and their genetic elements.” Advances in Genetics 45:361-398.

Page 34: RNA-RNA interaction A biological crash course and introduction to prediction methods

Part II – Prediction

Identifying effective siRNAs
  Neural network approach
Identifying targets
  Mammalian miRNA target prediction

Page 35: RNA-RNA interaction A biological crash course and introduction to prediction methods

Prediction of siRNAs

Sequence properties that make a good antisense RNA an effective gene inhibitor are not well understood

Most computational models consider only:
  RNA structure prediction
  Motif searches

Page 36: RNA-RNA interaction A biological crash course and introduction to prediction methods

Neural net approach

Training set: 490 known siRNA molecules
Input parameters:
  Base composition
  mRNA:siRNA binding energy properties
  3’ and 5’ binding energy
  Structure of siRNA (hairpin energy and quality)
Target function: efficacy

Page 37: RNA-RNA interaction A biological crash course and introduction to prediction methods

Neural net approach (2)

Page 38: RNA-RNA interaction A biological crash course and introduction to prediction methods

Neural net results

14 inputs, 11 hidden units, 1 output
Success rate of 92%
Average prediction of 12 effective siRNAs per 1000 base pairs
Stringent (high specificity)
Good for designing siRNAs for RNAi
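To make the 14-11-1 layout concrete, here is a minimal forward pass through a network of that shape. Only the layer sizes come from the slide. The sigmoid activations, the random untrained weights and all function names are assumptions for illustration, so the output is not a meaningful efficacy prediction.

```python
import math
import random

N_INPUTS, N_HIDDEN = 14, 11  # layer sizes quoted on the slide


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(seed=0):
    """Build random, untrained weights; the last weight in each row is the bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS + 1)]
              for _ in range(N_HIDDEN)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN + 1)]
    return hidden, output


def predict_efficacy(network, features):
    """Map 14 input features to a single score in (0, 1)."""
    hidden, output = network
    if len(features) != N_INPUTS:
        raise ValueError("expected %d features" % N_INPUTS)
    activations = [
        sigmoid(sum(w * x for w, x in zip(row[:-1], features)) + row[-1])
        for row in hidden
    ]
    return sigmoid(sum(w * a for w, a in zip(output[:-1], activations)) + output[-1])


def count_parameters(network):
    hidden, output = network
    return sum(len(row) for row in hidden) + len(output)


network = init_network()
print(count_parameters(network))  # 11 * (14 + 1) + (11 + 1) = 177
print(predict_efficacy(network, [0.0] * N_INPUTS))
```

With bias terms included, a network of this shape has 177 trainable weights, which is small relative to a training set of 490 molecules.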

Page 39: RNA-RNA interaction A biological crash course and introduction to prediction methods

Prediction of miRNA targets

Mammals/vertebrates
  Lots of known miRNAs
  Mostly unknown target genes
Initial method outline
  Look at conserved miRNAs
  Look for conserved target sites

Page 40: RNA-RNA interaction A biological crash course and introduction to prediction methods

micro RNAs in animals

0.5-1.0% of predicted genes encode miRNA
  One of the more abundant regulatory classes

Tissue-specific or developmental stage-specific expression

High evolutionary conservation

Page 41: RNA-RNA interaction A biological crash course and introduction to prediction methods

micro RNAs in plants

Finding targets in plants is relatively easy
  Look for mRNA transcripts with near-perfect complementarity to known miRNAs
  Signal-to-noise ratio exceeds 10:1 for Arabidopsis (model plant organism)
Naïve approach in C. elegans and D. melanogaster?
  No more hits than expected by random chance!

Page 42: RNA-RNA interaction A biological crash course and introduction to prediction methods

So what can we use?

Pairing to nucleotides 2-8 at the 5’ end of the miRNA
  Target recognition

Target regions enriched for genes involved in transcriptional regulation

Page 43: RNA-RNA interaction A biological crash course and introduction to prediction methods

Goals for algorithm

Predict 100s of miRNA targets
Estimate false-positive rates
Provide computational and experimental evidence of authenticity
Identify common functionality classes other than transcriptional regulator genes

Page 44: RNA-RNA interaction A biological crash course and introduction to prediction methods

TargetScan

Algorithm developed by Lewis et al. (2003)
Input:
  miRNA that is known to be conserved across multiple organisms
  Orthologous 3’ UTR sequences
  Cut-off values for two parameters
  Value for one free parameter
Output:
  Ranked list of candidate target genes

Page 45: RNA-RNA interaction A biological crash course and introduction to prediction methods

TargetScan (1)

Search UTRs in one organism
  Bases 2-8 from miRNA = “miRNA seed”
  Perfect Watson-Crick complementarity
  No wobble pairs (G-U)
  7nt matches = “seed matches”
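The seed search can be sketched as an exact string match: take bases 2-8 of the miRNA, reverse-complement them, and look for that 7-mer in the UTR. The function names and the toy UTR are hypothetical. The miRNA is written as the commonly cited let-7a sequence and should be treated as illustrative rather than authoritative.

```python
# Watson-Crick complement only, so G-U wobble pairs are not allowed here.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna):
    """Return the UTR 7-mer (5'->3') that pairs perfectly with miRNA bases 2-8."""
    seed = mirna[1:8]  # bases 2-8 in 1-based numbering
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna, utr):
    """Return 0-based UTR positions where a perfect 7-nt seed match starts."""
    site = seed_site(mirna)
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


mirna = "UGAGGUAGUAGGUUGUAUAGUU"        # illustrative miRNA sequence
utr = "AAGCUACCUCAUUUGGCUACCUCA"        # toy UTR containing two seed matches
print(seed_site(mirna))                 # CUACCUC
print(find_seed_matches(mirna, utr))    # [3, 16]
```

A UTR may contain several seed matches, which is why the function returns a list rather than a single position.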

Page 46: RNA-RNA interaction A biological crash course and introduction to prediction methods

TargetScan (2)

Extend seed matches
  Allow G-U (wobble) pairs
  Both directions
  Stop at mismatches
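The extension step can be sketched as follows, assuming contiguous pairing with no bulges or loops. miRNA and UTR pair antiparallel, so if the 7-nt seed match starts at UTR position `s` (0-based), miRNA index `j` faces UTR index `s + 7 - j`. The function name, the example miRNA sequence and the toy UTR are assumptions for illustration.

```python
# Allowed pairs during extension: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def extend_seed_match(mirna, utr, s):
    """Extend a seed match starting at UTR index `s` in both directions.

    Returns the inclusive 0-based range (lo, hi) of contiguously paired
    miRNA bases; extension stops at the first mismatch on each side.
    """
    def paired(j):
        p = s + 7 - j  # UTR index facing miRNA index j
        return (0 <= j < len(mirna) and 0 <= p < len(utr)
                and (mirna[j], utr[p]) in PAIRS)

    lo, hi = 1, 7  # the seed itself (bases 2-8, 0-based indices 1-7)
    while paired(lo - 1):
        lo -= 1
    while paired(hi + 1):
        hi += 1
    return lo, hi


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative miRNA sequence
utr = "CCAUGCUACCUCAGG"           # toy UTR; seed match CUACCUC starts at index 5
print(extend_seed_match(mirna, utr, 5))  # (0, 9)
```

In this toy case the match grows by one base toward the miRNA 5' end and by two bases toward the 3' end (one of them a U-G wobble) before a G-A mismatch stops it.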

Page 47: RNA-RNA interaction A biological crash course and introduction to prediction methods

TargetScan (3)

Optimize base pairing between:
  Remaining 3’ region of miRNA
  35 bases of UTR 5’ to each seed match
Uses the RNAfold program (Hofacker et al. 1994)

Page 48: RNA-RNA interaction A biological crash course and introduction to prediction methods

TargetScan (4)

Folding free energy (G) assigned to each putative miRNA:target interaction

Ignores initiation free energy
RNAeval (Hofacker et al. 1994)

Page 49: RNA-RNA interaction A biological crash course and introduction to prediction methods

TargetScan (5)

Z score for each UTR (no match -> Z = 1.0):

$$Z = \sum_{k=1}^{n} e^{-G_k / T}$$

n = number of seed matches in UTR (may be more than one)

G_k = free energy of miRNA:target site interaction of kth seed match

T = parameter influencing relative weighting of UTRs with few high affinity target sites against UTRs with lots of low affinity target sites (experimentally determined)
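The Z score is read here as the sum over all n seed matches of exp(-G_k / T), with Z = 1.0 for a UTR that has no seed match. The negative sign in the exponent is an assumption (a lower free energy means stronger binding and should raise Z), and the energies and T values below are invented to show how T shifts the weighting.

```python
import math


def z_score(free_energies, T):
    """Return Z = sum_k exp(-G_k / T); a UTR with no seed match gets Z = 1.0."""
    if not free_energies:
        return 1.0
    return sum(math.exp(-g / T) for g in free_energies)


# Hypothetical free energies in kcal/mol: one strong site vs. three weak sites.
one_strong = [-30.0]
three_weak = [-12.0, -12.0, -12.0]

for T in (5.0, 20.0):
    print(T, z_score(one_strong, T) > z_score(three_weak, T))
```

With a small T the single high-affinity site dominates, while a large T lets the UTR with several low-affinity sites score higher, which is the trade-off the T parameter controls.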

Page 50: RNA-RNA interaction A biological crash course and introduction to prediction methods

TargetScan (6)

Order UTRs by Z score
Assign rank to each UTR
Repeat this process for each of the other organisms with UTR datasets

Page 51: RNA-RNA interaction A biological crash course and introduction to prediction methods

TargetScan (7)

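UTR i is a predicted target if, for all organisms:

$$Z_i \ge Z_C \quad \text{and} \quad R_i \le R_C$$

where Z_i and R_i are the Z score and rank of UTR i, and Z_C and R_C are the cut-off values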
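Ranking and the cut-off test can be sketched together. The sketch reads the criterion as "Z at or above the Z cut-off and rank at or below the rank cut-off, in every organism", and it assumes that orthologous UTRs share the same gene key across organisms. All names and numbers are hypothetical.

```python
def rank_utrs(z_by_utr):
    """Rank UTRs by descending Z score; rank 1 is the highest Z."""
    ordered = sorted(z_by_utr, key=z_by_utr.get, reverse=True)
    return {utr: rank for rank, utr in enumerate(ordered, start=1)}


def predicted_targets(z_by_organism, z_cutoff, r_cutoff):
    """Return genes whose UTR passes both cut-offs in every organism."""
    passing = None
    for z_by_utr in z_by_organism.values():
        ranks = rank_utrs(z_by_utr)
        ok = {gene for gene, z in z_by_utr.items()
              if z >= z_cutoff and ranks[gene] <= r_cutoff}
        # Keep only genes that also passed in all previously seen organisms.
        passing = ok if passing is None else passing & ok
    return sorted(passing or [])


# Hypothetical Z scores for three orthologous UTRs in two organisms.
z_scores = {
    "human": {"geneA": 12.0, "geneB": 5.0, "geneC": 1.0},
    "mouse": {"geneA": 9.0, "geneB": 2.0, "geneC": 1.0},
}
print(predicted_targets(z_scores, z_cutoff=4.5, r_cutoff=2))  # ['geneA']
```

Requiring the cut-offs to hold in every organism is what makes conservation act as a filter: geneB passes in human but fails the Z cut-off in mouse.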

Page 52: RNA-RNA interaction A biological crash course and introduction to prediction methods

Datasets

nrMamm (mammalian – 79 sequences)
  Homologs in human, mouse, and pufferfish
  Identical between human and mouse, not necessarily pufferfish (fugu)
nrVert (vertebrate – 55 sequences)
  Identical between human, mouse, and fugu
Non-redundant: if multiple miRNAs had the same seed, one representative was chosen

Page 53: RNA-RNA interaction A biological crash course and introduction to prediction methods

Sample program flow

Page 54: RNA-RNA interaction A biological crash course and introduction to prediction methods

Results for nrMamm

nrMamm searched against human, mouse, and rat orthologous 3’ UTRs

451 miRNA:target interactions predicted for 400 unique genes

Average 5.7 targets per miRNA
Signal:noise ratio of 3.2:1

Page 55: RNA-RNA interaction A biological crash course and introduction to prediction methods

Results for nrVert

Additional search against fugu UTRs
Signal:noise ratio improves to 4.6:1
Relaxed cut-off values
115 predicted miRNA:target interactions for 107 unique genes
2.1 putative targets per miRNA

Page 56: RNA-RNA interaction A biological crash course and introduction to prediction methods

Signal:noise ratio calculation

Signal = number of predicted targets from nrMamm dataset

Noise = number of predicted targets from randomly shuffled miRNAs

Shuffled control sequences screened to ensure preservation of relevant features – don’t underestimate the noise!
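The arithmetic behind the ratio can be sketched as below. The shuffled-control counts are invented for illustration, the screening of control sequences is not implemented, and treating noise/signal as an estimated false-positive fraction is one possible reading rather than something the slides state.

```python
from statistics import mean


def signal_to_noise(real_target_count, shuffled_target_counts):
    """Ratio of targets predicted for real miRNAs to the mean for shuffled controls."""
    noise = mean(shuffled_target_counts)
    if noise == 0:
        raise ValueError("shuffled controls predicted no targets")
    return real_target_count / noise


def implied_false_positive_fraction(ratio):
    """Fraction of predictions expected by chance if noise/signal is taken at face value."""
    return 1.0 / ratio


# 451 is the nrMamm prediction count from the slides; the control counts are hypothetical.
ratio = signal_to_noise(451, [138, 141, 144])
print(round(ratio, 1))
print(round(implied_false_positive_fraction(3.2), 2))
```

Under that reading, a signal:noise ratio of 3.2:1 corresponds to roughly 31% of predictions being expected by chance.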

Page 57: RNA-RNA interaction A biological crash course and introduction to prediction methods

Screening control sequences

Features to consider:
  Expected frequency of seed matches
  Expected frequency of matching to 3’ end of miRNA (after seed extension)
  Observed count of seed matches in UTR datasets
  Predicted free energies for seed:match interactions

Page 58: RNA-RNA interaction A biological crash course and introduction to prediction methods

Signal:noise results

Filled bars are for authentic miRNAs
Open bars show the mean and standard deviation for shuffled sequences
nrMamm set used for first two, nrVert used for set including fugu

Page 59: RNA-RNA interaction A biological crash course and introduction to prediction methods

Biological relevance

Hypothesis: 5’ conservation of miRNAs is important for mRNA target recognition
  Highest signal:noise ratio observed when seed positioned close to 5’ end
Hypothesis: highly conserved miRNAs are more involved in regulation
  High degree of conservation -> more predicted targets
  Membership in large miRNA family -> more predicted targets

Page 60: RNA-RNA interaction A biological crash course and introduction to prediction methods

Experimental verification

15 predicted target sites chosen
  All with known biological function
  Representative of the entire list of candidates
11 target sites confirmed
  Expression of upstream ORF influenced
  27% false positives – close correspondence to predicted 30% false positives

Page 61: RNA-RNA interaction A biological crash course and introduction to prediction methods

References

Chalk, A.M. and Sonnhammer, E.L.L. (2002) “Computational antisense oligo prediction with a neural network model.” Bioinformatics 18(12):1567-1575.

Hofacker, I.L., Fontana, W., Stadler, P.F., Bonhoeffer, S., Tacker, M., and Schuster, P. (1994) “Fast folding and comparison of RNA secondary structures.” Monatshefte für Chemie 125:167-188.

Lewis, B.P., Shih, I., Jones-Rhoades, M.W., Bartel, D.P., and Burge, C.B. (2003) “Prediction of mammalian microRNA targets.” Cell 115(7):787-798.