
MicroRNA Detection and Target Prediction: Integration of Computational and Experimental Approaches


KEYA CHAUDHURI and RAGHUNATH CHATTERJEE

ABSTRACT

In recent years, microRNAs (miRNAs), a class of 19–25 nucleotide noncoding RNAs, have been shown to play

a major role in gene regulation across a broad range of metazoans and are important for diverse biological

functions. These miRNAs regulate protein expression primarily by binding to one or

more target sites on an mRNA transcript and causing cleavage or repression of translation. Computer-based

approaches for miRNA gene identification and miRNA target prediction are considered indispensable

in miRNA research. Similarly, effective experimental techniques validating in silico predictions are crucial to

the testing and fine-tuning of computational algorithms. Iterative interactions between in silico and experi-

mental methods are now playing a central role in the biology of miRNAs. In this review, we summarize the

various computational methods for identification of miRNAs and their targets as well as the technologies that

have been developed to validate the predictions.

INTRODUCTION

MicroRNAs (miRNAs) are an abundant class of endoge-

nous, small, noncoding RNAs, typically 19 to 25 nucleo-

tides (nt) long, expressed in a wide variety of organisms including

plants, viruses, and animals (Bartel and Bartel, 2003; Lim et al.,

2003a; Pfeffer et al., 2004, 2005). Many miRNAs are highly

conserved across species (Bartel, 2004). The number of miRNA

entries has increased almost exponentially over the past 5 years.

Depending on the degree of complementarity between miRNA

and its target transcript, miRNAs are known to exercise post-

transcriptional control over most eukaryotic genomes, causing

degradation of target transcript or translational repression (Lagos-

Quintana et al., 2001; Lau et al., 2001; Lee and Ambros, 2001;

Moss and Poethig, 2002; Bartel, 2004).

The founding members of this class of noncoding RNAs are

the lin-4 and let-7 gene products of Caenorhabditis elegans (Lee

et al., 1993; Reinhart et al., 2000). Both lin-4 and let-7 RNAs act

as repressors of their respective target genes lin-14, lin-28, and

lin-41 (Lee et al., 1993; Moss et al., 1997; Slack et al., 2000).

In all these cases repression was mediated by the presence of

complementary miRNA sequences in the 3′ untranslated regions

(UTRs) of the target mRNAs (Slack et al., 2000; Lewis et al.,

2003).

Several miRNAs have been shown to participate in essential biological

processes; for example, animal

miRNAs have been implicated in developmental timing, cell

death, cell proliferation control, hematopoiesis, patterning

of the nervous system, pancreatic cell insulin secretion, and adipocyte

development (Ambros, 2004; Harfe, 2005). Recent findings in-

dicate that miRNA profiles are changed in a large number of hu-

man cancers (Xie et al., 2005; Calin and Croce, 2006) and that

the forced overexpression of miRNAs can lead to the devel-

opment of tumors (He et al., 2005).

The biogenesis of miRNAs has been studied by several lab-

oratories, and we are beginning to understand the process of

transcription, cytoplasmic transport, and maturation of miRNAs

(Bartel, 2004; Kim, 2005). The miRNA genes encoded in the

genomes of most eukaryotic organisms are transcribed by RNA

polymerase II into primary miRNAs. These structured RNAs

are then processed by the RNase III endonuclease Drosha to

form one or more precursor miRNAs (pre-miRNAs; ~60–100 nt) that can

form a stem–loop secondary structure. Specific

RNA cleavage by Drosha predetermines the mature miRNA

sequence and provides the substrate for subsequent processing

events. The pre-miRNAs are transported to the cytoplasm

by Exportin-5 in a Ran-GTP-dependent manner. Once in the

cytoplasm, a second RNase III endonuclease, Dicer, acts on

the pre-miRNA and generates a mature double-stranded

miRNA duplex (~19–25 nt). One strand of the miRNA duplex is sub-

sequently incorporated into an effector complex termed the RNA-

induced silencing complex, which mediates regulation of target gene expression.

Molecular & Human Genetics Division, Indian Institute of Chemical Biology, Kolkata, India.

DNA AND CELL BIOLOGY, Volume 26, Number 5, 2007. © Mary Ann Liebert, Inc. Pp. 321–337. DOI: 10.1089/dna.2006.0549

Direct cloning and sequencing of short RNA molecules has

enabled the identification of many miRNAs; however, highly

constrained tissue- and time-specific expression patterns and the pres-

ence of degradation products from mRNAs and other noncoding

RNAs have made the cloning of miRNAs difficult and incomplete

(Lai et al., 2003; Lim et al., 2003b). This led to the development

of increasingly sophisticated computational approaches to predict

miRNAs and their target mRNAs complemented by the biolog-

ical validation techniques. An overview of the current advances

in this area is presented here with a view to stimulating the reader to

explore the diverse and exciting field of miRNA research.

COMPUTATIONAL IDENTIFICATION OF miRNAs

The basic principle of the computational approaches is

simple: they rely on the known characteristics of miRNAs and

search for these features in other organisms. miRNA detection in animals

relies on (1) conservation of miRNAs in the genomes of related

species, (2) formation of stable stem–loop structure by pre-

miRNAs, and (3) the presence of mature miRNAs in the stem

and not in the loop of pre-miRNAs. However, the prediction

approaches are more challenging in the case of viruses and

plants, as viral miRNAs are rarely evolutionarily conserved

and the level of sequence conservation of miRNA precursor is

lower in plants. The length of hairpin structures is also more

variable in plants than in animals. The secondary

structures, generated using RNAfold (Hofacker, 2003), of

representative pre-miRNAs from Arabidopsis thaliana, Homo

sapiens, and Epstein-Barr virus are shown in Figure 1.
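
The third criterion listed above (the mature miRNA lies in the stem and not in the loop of the pre-miRNA) can be checked mechanically once a predicted structure is available. The following is a minimal sketch in Python, assuming a single hairpin given in dot-bracket notation of the kind RNAfold reports; the function names and the coordinates are illustrative and are not taken from any of the published tools.

```python
def terminal_loop(structure):
    """Return the half-open interval [start, end) of the terminal loop.

    Assumes a single hairpin: the terminal loop is the unpaired run
    enclosed by the innermost base pair, i.e. between the last '('
    and the first ')' that follows it.
    """
    last_open = structure.rfind("(")
    first_close = structure.find(")", last_open + 1)
    if last_open == -1 or first_close == -1:
        raise ValueError("structure contains no base pairs")
    return last_open + 1, first_close


def mature_in_stem(structure, mature_start, mature_len):
    """True if the candidate mature miRNA does not overlap the terminal loop."""
    loop_start, loop_end = terminal_loop(structure)
    mature_end = mature_start + mature_len
    return mature_end <= loop_start or mature_start >= loop_end
```

A candidate whose window runs into the terminal loop would be rejected under this criterion; bulges and internal loops inside the stem are tolerated because only the terminal loop is tested.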

miRNA PREDICTION ALGORITHMS

The computational procedure MiRscan (Lim et al., 2003a, 2003b)

was developed to identify miRNA genes conserved in more than

one genome. The program uses the RNA folding algorithm

RNAfold (Schuster et al., 1994) to locate potential hairpin

structures in sequences that are evolutionarily conserved between

C. elegans and C. briggsae. Briefly, it slides a 110-nt window

along both strands of the C. elegans genome and folds each

window with RNAfold to identify predicted stem–loop struc-

tures (>25 bp) with a folding free energy of at least −25 kcal/mol.

Each conserved hairpin, considered as a potential pre-miRNA,

was further evaluated for the location of miRNA within it by

passing a 21-nt window along each stem–loop, assigning a log-

likelihood score to each position for its similarity to known

miRNAs, the training set for the algorithm being 50 published

miRNAs from C. elegans and C. briggsae.
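
The two scanning steps described above (windows along both strands, then a shorter window scored against known miRNAs) can be sketched as follows. This is an illustration only: MiRscan's published score combines several structural and conservation features, whereas this sketch uses nothing more than per-position base frequencies learned from a training set, and the folding step is omitted. All function names, the pseudocount, and the uniform background are assumptions.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement, used to scan the opposite strand."""
    return dna.translate(COMPLEMENT)[::-1]


def windows(seq, size, step=1):
    """Yield (start, subsequence) for every full window of `size` nt."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def train_log_odds(known, background=0.25, pseudocount=1.0):
    """Per-position log-odds table from equal-length training sequences."""
    table = []
    for pos in range(len(known[0])):
        counts = {b: pseudocount for b in "ACGT"}
        for s in known:
            counts[s[pos]] += 1
        total = sum(counts.values())
        table.append({b: math.log2(counts[b] / total / background)
                      for b in "ACGT"})
    return table


def best_window(hairpin, table):
    """Slide a window of len(table) nt; return (best score, its start)."""
    return max(
        (sum(table[i][b] for i, b in enumerate(sub)), start)
        for start, sub in windows(hairpin, len(table))
    )
```

In a fuller pipeline, `windows(genome, 110)` and `windows(reverse_complement(genome), 110)` would feed a folding program, and `best_window` would be applied with a 21-position table to each hairpin that passes the structural thresholds.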

This approach successfully identified 35 novel miRNAs in

C. elegans (from ~35,000 hairpins conserved between C. ele-

gans and C. briggsae), of which 16 were experimentally veri-

fied, the accuracy level was calculated to be � 0.7. From

~15,000 hairpins conserved among humans, mice, and puffer

fish, the algorithm identified a high-scoring set of 188 human

loci. This set included 81 of the 109 members of a reference set

of known human miRNA loci, for a sensitivity of 0.74 (Lim et

al., 2003a).
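
The sensitivity quoted above follows directly from the counts given in the text: sensitivity is the fraction of reference miRNA loci that appear in the high-scoring set. A short check in Python (the function name is illustrative):

```python
def sensitivity(recovered_known, total_known):
    """Fraction of known (reference) loci recovered by the predictor."""
    return recovered_known / total_known


# 81 of the 109 reference human miRNA loci were in the high-scoring set.
print(round(sensitivity(81, 109), 2))  # 0.74
```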

Later, the MiRscan algorithm was used along with stem–

loop finder to identify miRNAs in the human cytomegalovirus ge-

nome (Grey et al., 2005). Of the 406 potential stem–loops

identified, 110 were conserved between chimpanzee and hu-

man cytomegaloviruses, and of these, 13 exhibited a significant

score from MiRscan. Five of these miRNAs were found to be

expressed during infection.

The accuracy of MiRscan was further improved by the same

group (Ohler et al., 2004). A highly significant sequence motif,

with consensus CTCCGCCC, was found about

200 bp upstream of almost all independently transcribed nem-

atode miRNA genes, and this feature was incorporated into the algorithm.

With this improvement, the total number of confidently iden-

tified nematode miRNAs now approaches 100. Using a poly-

merase chain reaction (PCR)-sequencing protocol, 9 new C.

elegans miRNA gene candidates were validated in their study.
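
A search for the upstream motif described above can be sketched in a few lines. This is a simplified illustration, not the scoring used by Ohler et al. (2004): it looks for exact matches to the consensus on the forward strand only, and the function name, the `span` parameter, and the coordinate convention (0-based position of the gene start) are assumptions.

```python
def motif_offsets_upstream(genome, gene_start, motif="CTCCGCCC", span=400):
    """Distances (bp) from each exact motif hit to the gene start.

    Only the `span` bp immediately upstream of `gene_start` on the
    forward strand are searched; overlapping hits are reported.
    """
    region_start = max(0, gene_start - span)
    region = genome[region_start:gene_start]
    offsets = []
    pos = region.find(motif)
    while pos != -1:
        offsets.append(gene_start - (region_start + pos))
        pos = region.find(motif, pos + 1)
    return offsets
```

A candidate hairpin with a hit near 200 bp upstream would gain support under this feature; a real implementation would also score degenerate matches and handle genes on the reverse strand.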

The computational miRNA detection program miRseeker (Lai

et al., 2003) analyzed the completed euchromatic sequences of

two Drosophila species, D. melanogaster and D. pseudoobscura

for conserved sequences that adopt an extended stem–loop

structure and display a pattern of nucleotide divergence char-

acteristic of known miRNAs using Mfold (Zuker, 2003). The

sensitivity of this computational procedure was demonstrated

by the presence of 75% (18 of 24) of previously identified

Drosophila miRNAs within the top 124 candidates. In total, 48

novel miRNA candidates identified by miRseeker were strongly

conserved in more distant insect, nematode, or vertebrate ge-

nomes. Of these, the expression of a total of 24 novel miRNA

genes was experimentally verified. MiRseeker estimated around

110 miRNA genes in Drosophila genomes.

Additional C. elegans small RNAs with properties similar

to miRNAs and siRNAs were identified by cDNA sequencing

and comparative genomics (Ambros et al., 2003). The novel

miRNAs were then identified by Mfold (Zuker, 2003). Their

method identified 21 new C. elegans miRNAs and estimated

that C. elegans might contain 100 miRNAs, 30% of which were

conserved in vertebrates. Another informatic approach, based

on sequence conservation and structural similarity to known

miRNAs, predicted 214 candidate miRNAs in C. elegans ge-

nome (Grad et al., 2003). Northern blotting and sensitive PCR-

based experimental approaches were used to confirm the expres-

sion of some new miRNAs. They estimated that C. elegans may

encode 140–300 miRNAs and potentially many more. Their strat-

egy was similar to MiRscan (Lim et al., 2003a, 2003b) but used

different selection criteria.

Rodriguez et al. (2004) annotated the positions of mammalian

miRNAs, obtained from microRNA registry (Griffiths-Jones,

2004), in the human and mouse genomes to derive a global

perspective on the transcription of miRNAs in mammals. Their

results showed that more than half of all known mammalian

miRNAs were within introns of either protein-coding or non-

coding transcription units, whereas ~10% were encoded by

exons of long nonprotein-coding transcripts (mRNA-like non-

coding RNAs).

Berezikov et al. (2005) used a phylogenetic shadowing ap-

proach (Boffelli et al., 2003) to predict novel candidate miRNAs

in humans. Phylogenetic shadowing overcomes the limitation

of classical phylogenetic footprinting and allows unambiguous

sequence alignments and accurate conservation determination

at single nucleotide resolution level. The authors sequenced

122 miRNAs in 10 primate species and found that nucleotides

in the stem of miRNA hairpin precursors are significantly more

conserved compared to loop sequences and sequences flanking

the hairpins. Using this distinctive property in conjunction with

other known properties of miRNAs, they predicted 976 can-

didate miRNAs by scanning whole-genome human/mouse and

human/rat alignments, most of the novel miRNA candidates

being conserved in other vertebrates (dog, cow, chicken, opos-

sum, and zebrafish). Northern blot analysis confirmed the ex-

pression of mature miRNAs for 16 out of 69 representative

candidates. Their results suggested that the numbers of miRNAs

in the human genome would be significantly higher than pre-

viously estimated, although the risk of false positives might be

higher.
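
The conservation pattern exploited in this study (stem more conserved than loop and flanks) can be illustrated with a simple per-column measure. This is a rough sketch and not phylogenetic shadowing itself, which uses explicit evolutionary models over closely related species; here conservation is just the frequency of the most common base in each alignment column, and the alignment, structure string, and function names are illustrative assumptions.

```python
def column_conservation(alignment):
    """Fraction of sequences sharing the most common symbol per column.

    `alignment` is a list of equal-length aligned sequences; gap
    characters are counted like any other symbol.
    """
    n = len(alignment)
    scores = []
    for column in zip(*alignment):
        most_common = max(column.count(b) for b in set(column))
        scores.append(most_common / n)
    return scores


def mean_conservation(scores, structure, symbols):
    """Average conservation over columns whose structure symbol is in `symbols`."""
    picked = [s for s, ch in zip(scores, structure) if ch in symbols]
    return sum(picked) / len(picked)
```

Comparing `mean_conservation(scores, structure, "()")` with `mean_conservation(scores, structure, ".")` gives a crude stem-versus-unpaired contrast; note that unpaired bulges inside the stem are counted with the loop in this simplification.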

For computational prediction of miRNAs, Sewer et al. (2005)

exploited the property that miRNAs are occasionally clustered.

The fraction of clustered miRNA genes in D. melanogaster has

been estimated to be ~50% (Bartel, 2004), while a total of 37%

of the known human miRNA genes analyzed in a study appeared

in clusters of two or more, with pairwise chromosomal distances

of at most 3000 nucleotides (Altuvia et al., 2005). Starting with

the known human, mouse, and rat miRNAs, the authors ana-

lyzed 20 kb of flanking genomic regions for the presence of

putative pre-miRNAs. Cross-species comparisons were then

used to make conservative estimates of the number of novel

miRNAs. In this way they predicted between 50 and 100 novel

pre-miRNAs for each of the conserved species. Around 30% of

their predicted miRNAs had experimental support in a large set

of cloned mammalian small RNAs.

FIG. 1. Hairpin secondary structures of miRNAs of Arabidopsis thaliana (ath-mir-156c), Homo sapiens (hsa-let-7d), and Epstein-Barr virus (EBV-mir-BHRF1-1). Pre-miRNA sequences were obtained from miRBase (Griffiths-Jones, 2004; Griffiths-Jones et al., 2006) and secondary structures were generated with the Vienna RNAfold web server (Hofacker, 2003).
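
The clustering criterion cited in the paragraph above (miRNA genes at most 3000 nucleotides apart) can be expressed as a simple grouping of sorted genomic coordinates. The sketch below assumes that all positions lie on one chromosome and one strand and that consecutive genes within `max_gap` are chained into the same cluster; the function names and the example coordinates in the usage note are hypothetical.

```python
def cluster_by_distance(positions, max_gap=3000):
    """Group positions so that consecutive members are at most max_gap apart."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def fraction_clustered(positions, max_gap=3000):
    """Fraction of genes that fall in a cluster of two or more."""
    clusters = cluster_by_distance(positions, max_gap)
    in_clusters = sum(len(c) for c in clusters if len(c) >= 2)
    return in_clusters / len(positions)
```

For example, positions 100, 2500, and 5000 would form one cluster, while a gene at 20000 would remain a singleton. Scanning the flanks of each known miRNA for further hairpins, as Sewer et al. (2005) did, builds on the same neighborhood idea.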

The algorithm ProMiR (Nam et al., 2005) introduced a

probabilistic colearning model based on the paired hidden

Markov model (HMM) for miRNA gene finding. The authors

combine both sequence and structure of pre-miRNA in a prob-

abilistic framework and simultaneously decide the presence of

pre-miRNA and mature miRNA by detecting the signals for the

site cleaved by Drosha. This approach is expected to identify

novel miRNAs in addition to those which are abundantly ex-

pressed or close homologs of previously identified miRNAs.

On screening human chromosomes 16, 17, 18, and 19, ProMiR

detected at least 23 novel miRNA gene candidates, which do

not bear sequence similarity to the known miRNA genes.

Of these, nine candidate genes were experimentally validated

by examining the accumulation of pre-miRNAs by

real-time quantitative RT-PCR in cells depleted of Drosha,

indicating that ProMiR may predict novel miRNA genes with

about 40% accuracy.

ProMiR was further improved by the same group (Nam et al.,

2006) to ProMiRII, which integrates additional filtering criteria such as

G/C ratio, conservation score, entropy, and free energy of can-

didate sequences. Low- and high-stringency prediction of con-

served and nonconserved miRNA genes is possible by

adjusting the filtering criteria. With an appropriate training data

set, this method can be applied to all species.
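
Two of the filtering criteria named above, G/C ratio and entropy, are easy to compute from a candidate sequence. The sketch below is illustrative only: the exact definitions and cutoffs used by ProMiRII are not given in this review, so entropy is taken here as the Shannon entropy of the base composition, and the thresholds in `passes_filters` are hypothetical values chosen to show how stringency could be adjusted.

```python
import math
from collections import Counter


def gc_ratio(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def shannon_entropy(seq):
    """Shannon entropy (bits per nucleotide) of the base composition."""
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in Counter(seq).values())


def passes_filters(seq, gc_range=(0.3, 0.7), min_entropy=1.5):
    """Apply both filters; the default thresholds are hypothetical."""
    low, high = gc_range
    return low <= gc_ratio(seq) <= high and shannon_entropy(seq) >= min_entropy
```

Narrowing `gc_range` or raising `min_entropy` corresponds to a higher-stringency screen; loosening them admits more candidates at the cost of more false positives.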

An integrative approach called PalGrade (Bentwich et al.,

2005) has been developed combining computational predic-

tions with microarray analysis and sequence-directed cloning

for miRNA detection, which does not rely on sequence con-

servation. In this approach, the noncoding regions of

the entire human genome were folded with RNAfold (Hof-

acker, 2003), yielding 11 million hairpins, from which PalGrade

selected a set of 5300 high-scoring candidates. These were then

tested in microarray experiments, yielding 359 expressed

candidates, which were sequenced, finally yielding 89 novel val-

idated human miRNAs. Thus, PalGrade could identify a large

number of miRNAs that are unique to primates and are unde-

tected by other prediction algorithms, and the authors proposed that the

total number of miRNAs in humans could be at least 800.

Legendre et al. (2005) used a profile-based strategy im-

plemented in the ERPIN program to estimate how

many miRNAs could be recovered. ERPIN represents RNA

alignments as weight matrices or profiles (Gautheret and Lam-

bert, 2001), and identifies matching sequences using a combined

dynamic programming/profile scan algorithm, thus capturing

both primary and secondary structure information, which is

particularly well adapted to pre-miRNA identification. Their ap-

proach produced 265 new miRNA candidates that were not

previously found in miRNA databases. The authors suggested

that profile-based RNA detection will be an important com-

plement to similarity search programs in the completion of

miRNA collections.

Comparative methods, based on the idea that miRNAs are

conserved across species, have been used by some groups (We-

ber, 2005; Xie et al., 2005; Pedersen et al., 2006) for identifi-

cation of miRNAs.

Xie et al. (2005) performed a comparative analysis of the

human, mouse, rat, and dog genomes to create a systematic

catalog of common regulatory motifs in promoters and 3′-UTRs,

which identified 106 new motifs (8-mers) in 3′-UTRs. The

neighborhood of each motif was then evaluated by RNAfold

(Hofacker, 2003) for secondary structure and the selected stable

stem–loops were further evaluated based on several observed

features of known miRNAs: higher conservation in the 22 bp

stem, lower conservation in the loop and surrounding regions,

and appropriate base-pairing and bulges in the stem region. No-

tably, roughly one-half of the discovered motifs in the 3′-UTRs

were related to miRNAs, leading to the identification of several

new miRNA genes. Their results suggested that previous esti-

mates of the number of human miRNA genes were low, and that

miRNAs regulate at least 20% of human genes.

Weber (2005) performed a systematic search for poten-

tial human orthologues of known mouse miRNAs and mouse

orthologues of known human miRNAs deposited in the miRNA

Registry (Griffiths-Jones, 2004). His algorithm consisted of

miRNA tracks written to visualize miRNAs in human and mouse

genomes on the UCSC Genome Browser. With this tool, the

author systematically determined the position and orientation of

miRNA genes relative to known transcriptional units, examined

the conservation of miRNA gene localization between the hu-

man and mouse genomes, and made a comprehensive list of

miRNA clusters. The hairpin structures of the sequences cor-

responding to potential new miRNA precursors were assessed

with the MFOLD (Zuker, 2003) program, finally leading to the

identification of 35 human and 45 mouse putative miRNA genes.

The comparative genomic method of Pedersen et al. (2006) is

based on phylogenetic analysis of multiple alignments. Their al-

gorithm EvoFold makes use of phylogenetic stochastic context-

free grammar and is a combined probabilistic model of RNA

secondary structure and sequence evolution. Screening the re-

gions of the human genome that are under strong selective

constraints, the algorithm yielded a set of 48,479 candidate RNA

structures containing various types of genetic regulatory ele-

ments, including 195 miRNAs. The false-positive rate is high,

estimated at around 62%. Among the highest-scoring can-

didates, the screen predicted 169 new miRNAs.

A machine learning approach based on the Naïve Bayes clas-

sifier has recently been proposed for the prediction of miRNA

genes (Yousef et al., 2006). This method automatically gener-

ates a model from the training data, which consists of sequence

and structure information of known miRNAs from a variety of

species, allowing prediction of nonconserved miRNAs. This

was followed by comparative analysis over multiple species to

reduce the number of false positives. The resulting algorithm

exhibits higher specificity and similar sensitivity compared to

currently used algorithms that rely on conserved genomic regions (Grad

et al., 2003; Lai et al., 2003; Lim et al., 2003a, 2003b). The

major novelty of the approach lies in the integration of data from

multiple species, which stabilizes the learning process and, more

importantly, yields a model that is more likely to be appli-

cable to a variety of genomes.
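
To make the idea of a Naïve Bayes classifier concrete, the following is a minimal sketch trained on labeled sequences. It is not the model of Yousef et al. (2006), whose features combine sequence and structure information; here dinucleotide counts stand in as features, with add-one smoothing, and the class name, labels, and training sequences in the usage note are hypothetical.

```python
import math
from collections import Counter


def kmers(seq, k=2):
    """All overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over k-mer counts with add-one smoothing."""

    def fit(self, sequences, labels, k=2):
        self.k = k
        self.classes = sorted(set(labels))
        self.prior = {c: math.log(labels.count(c) / len(labels))
                      for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for seq, label in zip(sequences, labels):
            self.counts[label].update(kmers(seq, k))
        self.vocab = set().union(*self.counts.values())
        return self

    def log_score(self, seq, c):
        """Log prior plus summed log likelihoods of the k-mers under class c."""
        total = sum(self.counts[c].values()) + len(self.vocab)
        return self.prior[c] + sum(
            math.log((self.counts[c][km] + 1) / total)
            for km in kmers(seq, self.k)
        )

    def predict(self, seq):
        return max(self.classes, key=lambda c: self.log_score(seq, c))
```

Training on known pre-miRNAs (label "pos") and background sequences (label "neg") from several species, and then calling `predict` on a candidate, mirrors the overall workflow; the "naive" assumption is that the features are independent given the class.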

Support vector machine (SVM)-based approaches have re-

cently been proposed by two groups (Helvik et al., 2006; Hertel

and Stadler, 2006). The program RNAmicro (Hertel and Stadler,

2006) used an SVM-based approach in conjunction with a

nonstringent filter for consensus secondary structures and could

detect miRNA precursors in multiple sequence alignments.

Helvik et al. (2006) presented an SVM classifier called Micro-

processor SVM, which predicts 5′ Drosha processing sites in

hairpins of candidate miRNAs. The prediction was correct for

50% of known human 5′ miRNAs. Using another classifier

trained on the output from the Microprocessor SVM, the authors

performed an analysis on 130 reported miRNAs and showed that

some miRNAs may have been mis-annotated. The authors sug-

gest that expressed hairpins should not be annotated as miRNAs

until they are verified to be Drosha and Dicer substrates.

Li et al. (2006) have recently proposed a scanning method

which examined ESTs and intronic sequences to identify novel

miRNAs using the srnaloop program (Grad et al., 2003). The

output was passed through sequence and structure filters like

GC content, core, and hairpin minimum-free energies and their

ratio. Using 130 newly updated pre-miRNAs and randomly se-

lected sequences, the sensitivity and specificity of the method

were 85% and 49%, respectively.

Chatterjee and Chaudhuri (2006) developed a computational

approach miRsearch based on the criteria that miRNAs are usu-

ally highly conserved in the genomes of related organisms, their

pre-miRNA transcript forms extended stem–loop structure, and

the mature miRNAs are present in the long helical arm of the

stem–loop structure. miRsearch relies on searching the homologs

of all known miRNAs of one organism in the genome of a related

organism, allowing a few mismatches depending on the evolution-

ary distance between them, and then assessing the capa-

bility of each hit to form a stem–loop structure using MFOLD (Zuker,

2003). The approach identified 91 probable candidate miRNAs

along with pre-miRNAs in Anopheles gambiae using known D.

melanogaster miRNAs and selecting the cutoff free energy of

MFOLD based on known D. melanogaster pre-miRNAs.
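
The homolog search step described above, finding near-matches to a known miRNA while tolerating a few mismatches, can be sketched with a brute-force scan of both strands. This is an illustration of the idea and not the miRsearch implementation: the function names are assumptions, indels are not considered, and the subsequent folding of the flanking region with MFOLD is left out.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_homologs(genome, query, max_mismatches=2):
    """Return (strand, start, mismatches) for every near-match.

    For the "-" strand, `start` is the 0-based offset in the
    reverse-complemented genome.
    """
    hits = []
    k = len(query)
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for start in range(len(seq) - k + 1):
            d = hamming(seq[start:start + k], query)
            if d <= max_mismatches:
                hits.append((strand, start, d))
    return hits
```

The `max_mismatches` parameter plays the role of the evolutionary-distance-dependent tolerance: a closer pair of species would justify a smaller value. Each hit would then be extended with flanking sequence and tested for stem–loop formation.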

COMPUTATIONAL IDENTIFICATION OF PLANT miRNAs

For plant miRNA detection various computational methods

have been developed, which are mostly focused on Arabidopsis

thaliana and Oryza sativa genomes. The algorithm MiRFinder

(Bonnet et al., 2004) was based on the conservation of short

sequences between the genomes of A. thaliana and O. sativa and

on properties of the secondary structure of the miRNA pre-

cursor. The method was fine-tuned to take into account the

variable length of the miRNA precursor sequences. Of the

91 potential miRNA genes identified, 58 had at least one nearly

perfect match with an Arabidopsis mRNA, constituting the

potential targets of those miRNAs.

Another comparative genomic approach (Jones-Rhoades and

Bartel, 2004) identified novel miRNAs in Arabidopsis of which

23 miRNA candidates, representing seven newly identified gene

families, were experimentally validated.

A single genome computational approach, findMiRNA (Adai

et al., 2005) was proposed, which relies on the rigid comple-

mentarity between plant miRNAs and their targets, and stem–

loop formation. From this data set, they selected 13 potential

new miRNAs for experimental verification and detected the

expression of 8 of the candidate miRNAs. This method thus

provides an important alternative as it uses a single genome, and

looks for conserved miRNA target sites in transcripts in addi-

tion to looking for conserved miRNA sequences.

Wang et al. (2004) presented a computational method for

genome-wide prediction of A. thaliana miRNAs using charac-

teristic features of known plant miRNAs as criteria to search for

miRNAs conserved between Arabidopsis and O. sativa. They

predicted 95 Arabidopsis miRNAs including 83 new sequences.

The expression of 19 new miRNAs was confirmed by Northern

blot hybridization. Their results suggested that at least some

miRNA precursors are polyadenylated at certain stages.

Strategies for miRNA identification based on expressed

sequence tag (EST) analysis were taken up by some groups

(Williams et al., 2005; Zhang et al., 2005). ESTs represent true

gene expression; the analysis based on ESTs could thus provide

more evidence and confidence in the discovery of new potential

miRNAs. Williams et al. (2005) used the nonannotated EST

database (Yamada et al., 2003) and developed a computational

screen based on the properties of miRNAs to exclusively iden-

tify the candidate miRNAs. Zhang et al. (2005) also used EST

analysis and DNA database analysis in detail and reported 338

new miRNAs in 60 plant species.

COMPUTATIONAL IDENTIFICATION OF VIRAL miRNAs

Pfeffer et al. (2004) first identified miRNAs in Epstein-Barr

virus (EBV), a large DNA virus of the Herpes family that pref-

erentially infects human B cells. The miRNAs were found to

be clustered in two different regions of the genome and were

detectable by Northern blotting. They concluded that EBV,

through miRNA, might exploit RNA silencing as a convenient

method for gene regulation of host and viral genes in a non-

immunogenic manner. In contrast to most eukaryotic miRNAs,

these viral miRNAs do not have close homologs in other viral

genomes or in the genome of the human host. So, in a later study

by the same authors miRNA genes in the herpesvirus family

were identified by combining a new miRNA gene prediction

method with small-RNA cloning from several virus-infected

cell types (Pfeffer et al., 2005).

An algorithm, VirMir, was developed for the detection of

likely pre-miRNAs in small genomes (<300 kb) (Sullivan

et al., 2005). VirMir could identify miRNAs encoded by SV40,

and this study also defined their functional significance for viral

infection. A refined version of VirMir, named VMir, was

presented by the same group (Grundhoff et al., 2006). VMir

features an updated scoring algorithm and the incorporation of

several quality filters designed to reduce the complexity of the

prediction. Candidate hairpins were then synthesized as oligo-

nucleotides on microarrays, hybridized with small RNAs from

infected cells and miRNAs scoring positive on the arrays were

then subjected to Northern blot analysis. Using this approach,

10 of the known and 1 novel Kaposi's sarcoma-associated her-

pesvirus (KSHV) pre-miRNAs were identified.

Recently, a computational method (Cui et al., 2006) has been

proposed to screen the complete genome of HSV-1 for sequences

that adopt an extended stem–loop structure and display a pattern

of nucleotide divergence characteristic of known miRNAs.

Using this method, 11 HSV-1 genomic loci were predicted to

encode 13 miRNA precursors and 24 miRNA candidates. Eight

of the HSV-1 miRNA candidates were predicted to be con-

served in HSV-2. The precursor and the mature form of one

HSV-1 miRNA candidate were detected during infection of

Vero cells by Northern blot hybridization suggesting a possible

role for this miRNA in regulation of viral and host gene ex-

pression.

EXPERIMENTAL VALIDATION OF CANDIDATE miRNAs

Computational predictions contribute substantially to miRNA gene

discovery, but the existence of a candidate miRNA needs ex-

perimental validation. Experimental detection of miRNAs is

technically challenging because of their small size, sequence

similarity among various members, low level, and tissue-specific

or developmental stage-specific expression. Fortunately, in the

past 2 years there has been significant progress in performance,

execution, and fine tuning of several validation approaches re-

sulting in high sensitivity, high throughput and high comparative

capabilities.

Northern blot

Northern blot analysis has been widely used to determine the

expression of both the mature and precursor miRNAs cloned

from size-fractionated cDNA libraries (Lagos-Quintana et al.,

2001; Lau et al., 2001; Lee and Ambros, 2001; Calin et al., 2002).

The disadvantages are its low throughput and limited sensitivity for
detecting rare miRNAs; consequently, a large amount of total RNA per
sample is required, which may not be feasible, especially for diseased
tissues. However, Northern blots are still regarded as the gold standard
for miRNA validation and quantification (Ambros et al., 2003) and are
used for confirmation of high-throughput data (Sempere et al., 2004).

The sensitivity of detection of miRNAs by Northern blot

has been increased by about 10-fold using locked nucleic acid

(LNA)-modified oligonucleotide probes (Valoczi et al., 2004).

LNA probes exhibit unprecedented thermal stability and show

improved hybridization properties against complementary RNA

targets. This strategy has been used in the detection of miRNAs

in the mouse, A. thaliana, and Nicotiana benthamiana (Valoczi

et al., 2004).

Other hybridization techniques include RNase protection as-

say (RPA) (Lee et al., 2002), primer extension (Seitz et al.

2004), and a signal-amplifying ribozyme method (Hartig et al.,

2004). In RPA, a labeled antisense RNA probe complementary

to the sequence of interest is synthesized through an in vitro

transcription and hybridized to total RNA followed by digestion

with a single-strand-specific RNase to degrade unhybridized

probe and target. The remaining protected probe:target hybrid

is separated on a denaturing polyacrylamide gel and detected by

methods specific to the label on the probe. The primer extension
approach detects the miRNA by hybridizing a labeled DNA primer to the
3′ portion of the RNA, followed by template-directed incorporation of
nucleotides by reverse transcriptase. The primer is a few nucleotides
shorter than the predicted miRNA target. Polyacrylamide gel
electrophoresis is then used to detect the extended products. miRNA
detection by the ribozyme method is based on hairpin ribozymes that
cleave a short RNA substrate, labeled with a fluorophore at the 3′-end
and a quencher at the 5′-end, as a function of the presence or absence
of a miRNA effector; the reaction is monitored in real time.

Cloning and sequencing approaches

Initially, random cloning and sequencing of size-fractionated RNA was
the main approach for identification of miRNAs (Lagos-Quintana et al.,
2002). Later, informatics predictions were carried out in parallel.

Bentwich et al. (2005) used a sequence-directed cloning

and sequencing approach. A biotin-labeled oligonucleotide

(~22–30 nt long, biotin at the 5′-end) was designed and used to

capture the homologous miRNA from a cDNA library enriched

for small RNAs. The captured cDNA library molecules were

then PCR amplified, cloned, and sequenced.

In the PCR-based cloning approach used earlier (Lim et al., 2003a,
2003b), a primer specific to the predicted 3′-terminus of the candidate
miRNA and a universal primer corresponding to the 5′-adapter were used
to amplify the specific cDNA clone from a cDNA library constructed from
18–26 nt RNAs. PCR products were then cloned and sequenced.

A more sensitive PCR-based cloning approach called mRAP

procedure (Takada et al., 2006) has been described recently.

Isolated small-RNA molecules are first ligated at their 3′-end to a
3′ adaptor and reverse-transcribed with a primer complementary to the
3′ adaptor. After the annealing of a 5′ adaptor to the

poly(C) overhang (added earlier) of the cDNAs, PCR is per-

formed to amplify the cDNAs. The isolated cDNAs are then

cloned and sequenced.

RT-PCR and real-time RT-PCR

The reverse-transcriptase polymerase chain reaction (RT-

PCR) is a widely used method to detect the expression of mRNA

and other RNA molecules. Real-time quantitative RT-PCR has

been successfully used to detect the expression of miRNA precursors
(Schmittgen et al., 2004). Fu et al. (2006) proposed a poly(A)-tailed
RT-PCR to detect the expression of mature miRNAs. Total RNA was
polyadenylated by poly(A) polymerase, and cDNA was synthesized with an
RT primer and reverse transcriptase using the poly(A)-tailed total RNA
as template. The expression of mature miRNAs was comparable to that
determined by Northern blotting. Recently, Tang et al. (2006) analyzed
miRNAs in single cells by using a real-time PCR-based 220-plex miRNA
expression profiling method. This method requires about 0.015 ng of
starting total RNA.

Macroarray

Krichevsky et al. (2003) determined the expression profile of 44 miRNAs
in mouse brain using an oligonucleotide array (~52–74 nucleotide
antisense probes) on nylon membranes. The array was probed with a
radioisotope-labeled low molecular weight fraction (<60 nt) of total
RNA and analyzed by phosphor imaging. This approach is less suited for
high-throughput applications and suffers from drawbacks such as unequal
hybridization efficiency of individual probes and targets and the use
of single data points for normalization, which impair accurate
quantification.

In situ hybridization

In situ hybridization methods for detection of miRNAs have

been developed recently (Wienholds et al., 2005; Kloosterman

et al., 2006a, 2006b; Nelson et al., 2006). Technical challenges

include the fixation of small RNAs, which diffuse fast and could be

lost during hybridization or washing. Use of LNA-modified

oligonucleotide probes successfully detected conserved verte-

brate miRNAs in zebrafish, mice, and frog embryos (Wien-

holds et al., 2005; Kloosterman et al., 2006a, 2006b). Nelson

et al. (2006) demonstrated coordinated miRNA expression on

archival formalin-fixed paraffin-embedded (FFPE) human brains

and oligodendroglial tumors by using RAKE (RNA-primed,

array-based, Klenow Enzyme) miRNA microarray platform in

conjunction with LNA-based in situ hybridization. Deo et al.

(2006) have recently improved the specificity by using RNA

oligonucleotide probes linked to a fluorescein hapten and highly

specific washing conditions with tetra-methyl ammonium chlo-

ride (TMAC). The method could directly detect mature miR-

NAs in tissue sections from developing mouse embryos, adult

brain, and the eye. In situ hybridization can be successfully used

to determine the spatio-temporal expression patterns of candi-

date miRNAs, thus having immense application to functional

studies.

Microarray technology

To validate hundreds of computationally predicted novel

miRNAs and also to determine their comparative expression

profile, miRNA profiling by microarray is emerging as a pow-

erful and popular tool because of its automation and high

throughput nature.

Liu et al. (2004) published the first report of genome-wide miRNA
expression profiling by a microarray in human and mouse

tissues. The miRNAs were reverse transcribed with biotinylated

random primers and hybridized to oligonucleotide spotted

arrays. miRNA levels were then detected using streptavidin-

bound fluorophores. Their results were confirmed with North-

ern blot, real-time RT-PCR, and literature search. The study

measured the expression of pre-miRNA (~70 nt) rather than

~22-nt mature miRNA. Moreover, labeling of highly structured

pre-miRNA with random primers may be susceptible to strong

biases in efficiency.

The miRNA microarray technique was improved by specifically targeting
the mature 22-nt miRNA sequence and by ligase-based labeling (Thomson
et al., 2004). The authors adapted a labeling method using T4 RNA
ligase to couple the 3′-end of RNAs to a fluorescently modified
dinucleotide. Their microarray data strongly correlated with Northern
blot analysis. However, the possibility of cross-hybridization between
closely related miRNAs and potential biases in the expression data
arising from differing ligation efficiencies cannot be excluded. The
authors reported the expression

profiles of 124 miRNAs in adult mouse tissues and embryonic

stages.

Miska et al. (2004) developed a microarray for mature

miRNA expression analysis, in which miRNAs were ligated to

3′ and 5′ adaptor oligonucleotides, followed by reverse transcription
and amplification by PCR with a Cy3-labeled primer to label

the sense strand of PCR products. Their data correlated well

with Northern blot analysis. Since RNA is amplified before

hybridization, a relatively low amount of starting material is

needed. A similar PCR amplification and adapter ligation strat-

egy has also been presented (Barad et al., 2004), where labeling

of cDNA was done after PCR amplification. This methodology

was utilized to profile the expression of 150 known human

miRNAs in HeLa cells and five human tissues.

In another approach, Tm-normalized DNA oligonucleotides antisense to
the given small RNA sequence were used as probes in the microarray
experiment, and target miRNAs were PCR amplified with a fluorescently
labeled primer (Axtell and Bartel,

2005; Baskerville and Bartel, 2005). A synthetic reference li-

brary was used for internal normalization.

Sun et al. (2004), however, first generated unlabeled cDNA with random
hexamers. After alkaline hydrolysis of the template RNA, the
single-stranded cDNA was 3′-labeled with biotin-ddUTP using terminal
deoxynucleotidyl transferase.

In contrast to the above methods, which label small RNAs prior to
hybridization, a procedure called the RNA-primed, array-based Klenow
enzyme (RAKE) assay has been described (Nelson et al., 2004). DNA
oligonucleotide probes with a 3′-end complementary to specific miRNAs
and a universal spacer sequence at the 5′-end are synthesized and
covalently crosslinked at their 5′ termini onto glass microarray
slides. The RNA samples containing miRNAs are then hybridized and
treated with exonuclease I to degrade single-stranded, unhybridized
probes. The Klenow fragment of DNA polymerase I and biotinylated dATP
are then added. The hybridized miRNAs act as primers and the
immobilized DNA probes as templates, generating biotin-labeled
double-stranded fragments, which are visualized with a
streptavidin-conjugated fluorophore. Another innovative advancement
made by this group was to isolate small RNAs from FFPE sections of
tissue samples for use on their microarrays.

RAKE does not involve the generation of the cDNA library

or amplification of the RNA sample, and avoids sample RNA

manipulation altogether. This method can distinguish nucleotide

mismatches at the 3′-end where miRNA homologs commonly

share the greatest sequence disparity.

Direct chemical methods of labeling have been tried by dif-

ferent groups (Babak et al., 2004; Liang et al., 2005). Babak

et al. (2004) used Ulysis Alexa Fluor (Molecular Probes, Eugene,

OR), which reacts with the N7 of guanine to form a stable coor-

dination complex. Total RNA was fluor-labeled and hybridized

directly to an array antisense to 154 mouse miRNAs and 206

other noncoding RNAs included as controls. miRNA expression was
analyzed across 17 mouse organs and tissues, and 78 miRNAs were
detected. Since miRNAs contain varying numbers of G residues, the
labeling efficiencies are expected to vary, and miRNAs lacking G
residues cannot be labeled.

In the method of Liang et al. (2005), miRNAs were oxidized with sodium
periodate to convert the adjacent 3′-terminal hydroxyl groups (2′ and
3′ positions of the ribose) into a dialdehyde, which was then reacted
with biotin-X-hydrazide through a condensation reaction, resulting in
biotinylated miRNA. The biotinylated miRNAs were captured on the
microarray by hybridization to oligonucleotide probes. The captured
miRNAs were then labeled with quantum dots (QDs) through the strong,
specific interaction of streptavidin and biotin. QDs have a high
extinction coefficient and a high quantum yield, so trace amounts of
miRNAs are easily detected with a laser confocal scanner. The detection
limit of this microarray for miRNA was higher than that of the previous
methods using a model microarray. The authors reported the profiling of
11 miRNAs from leaf and root of rice (Oryza sativa L. ssp. indica)
seedlings. The resulting miRNA profiles showed good reproducibility and
were consistent with the Northern blot results.

To avoid using high-cost detection equipment, the authors

have used a colorimetric gold–silver detection method, in which

the captured miRNAs were labeled with streptavidin-conjugated

gold followed by silver enhancement. During silver enhance-

ment, the gold nanoparticles bound to miRNAs catalyzed the

reduction of silver ions to metallic silver, which further auto-

catalyzed the reduction of silver ions to form metallic silver

precipitation on gold, resulting in a signal enhancement. This

process allowed straightforward detection of the miRNAs with

an ordinary charge coupled device (CCD) camera mounted on a

microscope.

Recent developments in microarray analysis involve more sensitive and
specific methods (Castoldi et al., 2006; Fang et al., 2006; Wang et
al., 2006). The microarray platform miChip (Castoldi et al., 2006) is
based on LNA-modified, Tm-

normalized capture probes spotted onto NHS-coated glass slides

and can discriminate between miRNAs with single nucleotide

differences. Fang et al. (2006) detected miRNAs on LNA

microarrays down to a concentration of 10 fM using a combi-

nation of surface polyadenylation chemistry and nanoparticle-

amplified surface plasmon resonance imaging (SPRI) detection.

This approach was used to determine miRNA concentrations in

a total RNA sample from mouse liver tissue. Wang et al. (2006), on the
other hand, used a multiplexed miRNA profiling assay employing simple,
high-efficiency direct labeling of submicrogram quantities of total
RNA, without amplification or size fractionation. The assay had a low
detection limit (<0.05 amol) and was also suitable for use with
formalin-fixed paraffin-embedded clinical samples.

Other approaches

Jonstrup et al. (2006) presented a simple miRNA detection

protocol based on padlock probes and rolling circle amplifica-

tion. Padlock probes are linear DNA probes where the terminal

sequences are designed to hybridize to two adjacent target sequences.
Under the right conditions, DNA ligase will ligate the

termini of the padlock probe on a perfectly matching RNA

template, accurately distinguishing matched and mismatched

substrates. The miRNA, used as a template, can subsequently be

used as primer for rolling circle amplification, thereby linearly

amplifying the target sequence in a quantitative manner. It can

be performed without specialized equipment, and has been

shown to measure specific miRNAs in a few nanograms of total

RNA.

Lu et al. (2005) presented a bead-based flow cytometric miRNA
expression profiling method and determined the expression profiles of
217 mammalian miRNAs in samples including multiple human cancers. In
this method, oligonucleotide capture probes complementary to miRNAs
were coupled to carboxylated 5-micron polystyrene beads impregnated
with variable mixtures of two fluorescent dyes (which can yield up to
100 colors), each color representing a single miRNA. Following adaptor
ligation at both the 5′-phosphate and the 3′-hydroxyl groups of miRNAs,
reverse-transcribed miRNAs were PCR amplified using a common
biotinylated primer, hybridized to the capture beads, and stained with
streptavidin-phycoerythrin. The beads were then analyzed using a flow
cytometer. The results were comparable to Northern blot analysis. In
contrast to traditional microarrays, bead-based hybridization more
closely approximates hybridization in solution, which raises the
specificity; it thus offers a less expensive, high-speed platform for
miRNA validation.

A list summarizing putative and experimentally verified miRNAs in
various animals, plants, and viruses is presented in Table 1.

COMPUTATIONAL PREDICTION OF miRNA TARGETS

Computational prediction of miRNA targets is far more challenging than
miRNA prediction because of the lack of strict base pairing between a
miRNA and its target mRNA sequences. The basic principles of these
predictions, largely derived from experimental studies, rely on: (1)
complementarity of the miRNA sequence to the 3′ UTR of the target mRNA,
complementarity being imperfect in animals with some exceptions but
exact in plants (illustrated in Fig. 2), (2) stronger binding of the
5′ end of the miRNA to the target compared with its 3′ end, (3)
thermodynamic stability of the miRNA-mRNA duplex, (4) conservation of
target 3′ UTR sites in related genomes, (5) multiplicity and
cooperativity of miRNA-target interactions, and (6) lack of strong
secondary structure of target mRNAs at the miRNA binding site.
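Principles (1) and (2) can be illustrated with a minimal Python sketch that scans a 3′ UTR for perfect Watson-Crick matches to the 5′ seed of a miRNA. This is only an illustration under stated assumptions, not the implementation of any published tool: the function names, the default seed window (nucleotides 2–8), and the toy UTR string are hypothetical, and the miRNA sequence is used purely as an example input.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str, start: int = 2, end: int = 8) -> list:
    """Return 0-based UTR positions of perfect seed matches.

    `start` and `end` are 1-based, inclusive positions counted from the
    5' end of the miRNA (hypothetical defaults: nucleotides 2-8).
    """
    seed = mirna[start - 1:end]
    site = reverse_complement(seed)  # sequence expected in the UTR
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # allow overlapping matches
    return hits


# Toy example: a made-up UTR containing two seed-match sites.
example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
example_utr = "AAACUACCUCAGGGCUACCUCUU"
print(seed_sites(example_mirna, example_utr))
```

A sketch like this covers only exact seed pairing; duplex stability, conservation, site multiplicity, and target accessibility (principles 3 to 6) would each need a separate scoring step.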

MicroRNA TARGET PREDICTION ALGORITHMS

Target prediction in animals

Stark et al. (2003) screened conserved 3′ UTR sequences from the
D. melanogaster genome for potential miRNA targets. The procedure
involves (a) a search for sequences complementary to the first eight
residues of the miRNA (allowing for G:U mismatches) with HMMer (Eddy,
1996), and (b) evaluation of the thermodynamic stability and structure
of the predicted miRNA–target mRNA heteroduplex with MFOLD (Zuker,
2003). This approach identified previously validated targets and some new

targets. Three predicted targets each for miR-7 and the miR-2

family were experimentally verified.

Later, the same group systematically evaluated the minimal

requirements for a functional miRNA–target duplex in vivo

(Brennecke et al., 2005). Their study revealed two categories of target
sites in Drosophila: (1) 5′ dominant sites, which base pair well with
the 5′ end of the miRNA and may be canonical (pairing at both 5′ and
3′ ends) or seed (pairing with the 5′ end only); and (2) 3′ compensatory
sites (weak 5′ base pairing but strong compensatory pairing to the
3′ end). Their study showed that both classes of sites are used in
biologically relevant genes.

Further studies by the same group (Stark et al., 2005) com-

bined improved miRNA target prediction with information on

gene function and expression in Drosophila. They reported that

a large set of genes involved in basic cellular processes avoid

miRNA regulation due to short 3′ UTRs lacking miRNA binding sites. For
individual miRNAs, coexpressed genes avoided

miRNA sites, whereas target genes and miRNAs were prefer-

entially expressed in neighboring tissues. This mutually ex-

clusive expression argues that miRNAs confer accuracy to

developmental gene expression programs, thus ensuring tissue

identity and supporting cell-lineage decisions.

TargetScan (Lewis et al., 2003), a computational method for

the identification of targets of vertebrate miRNA, combines

thermodynamics-based modeling of RNA:RNA duplex interactions with
comparative sequence analysis. Given a miRNA conserved in multiple
organisms and a set of orthologous 3′ UTR sequences from these
organisms, TargetScan (1) searches the UTRs for segments complementary
to a 7-nt seed (nucleotides 2–8 from the 5′ end of the miRNA), (2)
extends each seed match, allowing G:U pairs, (3) optimizes base pairing
using the RNAfold program (Hofacker et al., 1994), (4) assigns a
folding free energy to each miRNA:target site interaction with RNAeval
(Hofacker et al., 1994), (5) scores each UTR and assigns a rank, and
(6) predicts as miRNA targets those genes with score and rank above
prechosen cutoffs. TargetScan was applied to a nonredundant
pan-mammalian set of 79 miRNAs and a nonredundant pan-vertebrate set of
55 miRNAs and predicted more than 400 regulatory target genes, with an
estimated false-positive rate of ~22–31%. Experimental support for 11
out of 15 genes was obtained using a HeLa cell reporter system.

TargetScan was further improved to TargetScanS by the same

group (Lewis et al., 2005). TargetScanS shortened the required seed
match to 6 nt (nucleotides 2–7 from the 5′ end), and it did not
consider the thermodynamic stability of pairing, pairing outside the
immediate vicinity of the seed, or the presence of multiple
complementary sites per UTR. Chicken and dog genomes were included in
addition to the mouse, rat, and human genomes, thus reducing the noise.
Moreover, conserved positions immediately upstream/downstream of the
seed match were considered, increasing the specificity of prediction.
The false-positive rate was estimated at ~22% for targets conserved in
mammals. In a four-genome analysis of 3′ UTRs, over one-third of human genes

were estimated to be conserved miRNA targets.
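The idea of combining a short seed match with cross-species conservation can be sketched in Python as follows. This is a deliberately simplified illustration, not the TargetScanS implementation: it only asks whether the 6-nt seed match occurs somewhere in every orthologous UTR, whereas a real conservation filter would typically work on aligned positions and also consider the flanking nucleotides. The function names, gene labels, and toy sequences are all hypothetical.

```python
# Translation table for the RNA Watson-Crick complement.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def seed_match(mirna: str, start: int = 2, end: int = 7) -> str:
    """Sequence in the UTR that pairs with miRNA nucleotides start..end.

    Positions are 1-based and inclusive, counted from the miRNA 5' end.
    """
    seed = mirna[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def conserved_targets(mirna: str, utrs_by_gene: dict) -> list:
    """Genes whose seed match is present in the UTR of every species.

    `utrs_by_gene` maps gene -> {species: 3' UTR sequence}. Presence
    anywhere in each ortholog is required (a simplifying assumption;
    no alignment is used).
    """
    site = seed_match(mirna)
    return sorted(
        gene
        for gene, orthologs in utrs_by_gene.items()
        if orthologs and all(site in utr for utr in orthologs.values())
    )


# Toy data: geneA keeps the site in both species, geneB does not.
toy_utrs = {
    "geneA": {"human": "AAACUACCUCAGG", "mouse": "GGCUACCUCAAA"},
    "geneB": {"human": "AAACUACCUCAGG", "mouse": "GGGGAAAACCCC"},
}
print(conserved_targets("UGAGGUAGUAGGUUGUAUAGUU", toy_utrs))
```

Requiring the site in more genomes makes the filter stricter, which is the intuition behind adding further species to reduce noise.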

Enright et al. (2003) presented a computational method

miRanda for whole genome prediction of miRNA targets in

animals. Target identification in miRanda involves: (1) sequence

matching, using a position-weighted local alignment algorithm,

TABLE 1. PUTATIVE AND EXPERIMENTALLY VERIFIED miRNAs (a)

Organism                                 Total       miRNAs          miRNAs yet to
                                         predicted   experimentally  be verified
                                         miRNA       verified        experimentally

Animals
  Anopheles gambiae                       84           0              84
  Apis mellifera                          25           0              25
  Bombyx mori                             21           0              21
  Bos taurus                             109         103               6
  Caenorhabditis briggsae                 79           0              79
  Caenorhabditis elegans                 115         114               1
  Canis familiaris                         6           6               0
  Drosophila melanogaster                 85          82               3
  Drosophila pseudoobscura                74           0              74
  Danio rerio                            371         355              16
  Fugu rubripes                          133           0             133
  Gallus gallus                          165          95              71
  Homo sapiens                           541         444              97
  Monodelphis domestica                  111           0             111
  Mus musculus                           419         351              68
  Pan troglodytes                         39          35               4
  Rattus norvegicus                      261         174              87
  Schmidtea mediterranea                  85          85               0
  Tetraodon nigroviridis                 133           0             133
  Xenopus tropicalis                     196           0             196

Plants
  Arabidopsis thaliana                   133         127               6
  Oryza sativa                           242          98             144
  Zea mays                                96           0              96

Viruses
  Epstein Barr virus                      32          32               0
  Human cytomegalovirus                   14          14               0
  Kaposi sarcoma-associated herpesvirus   17          17               0
  Mareks disease virus                    10          10               0
  Mouse gammaherpesvirus 68                9           9               0
  Rhesus lymphocryptovirus                22          22               0
  Simian Virus 40                          2           2               0

(a) Numbers obtained from miRbase release 9.0 at ftp://ftp.sanger.ac.uk/pub/mirbase/sequences/9.0/database_files/mirna_mature.txt.gz

(2) free-energy calculation estimating the energetics of this

physical interaction; and (3) filtering through evolutionary con-

servation. For validation, experimentally verified target genes

were used in a randomized background model. Application to

D. melanogaster, D. pseudoobscura, and A. gambiae genomes

identified target genes enriched in transcription factors and

distinct networks like control of cell fate, morphogenesis, and

nervous system function.

John et al. (2004) further applied miRanda for the human

miRNA target prediction. They reported about 2000 human

genes with miRNA target sites conserved in mammals and about

250 human genes conserved as targets between mammals and

fish. Overrepresented targets included transcription factors, pro-

teins involved in translational regulation, and components of

the miRNA/ubiquitin machinery, representing novel feedback

loops in gene regulation.

Rehmsmeier et al. (2004) presented an improved RNA folding algorithm,
RNAhybrid, that predicts multiple potential miRNA binding sites in
target RNAs. This program finds the energetically most favorable
miRNA–mRNA hybridization sites and does not allow miRNA–miRNA or
mRNA–mRNA hybridizations. Statistical significance was assessed with
extreme value statistics of length-normalized minimum free energies, a
Poisson approximation of multiple binding sites, and the calculation of
effective numbers of orthologous targets in comparative genomics. In
Drosophila, the algorithm recovered some of the known targets and
suggested additional postulated targets.
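To give a feel for what a Poisson approximation of multiple binding sites involves, the following Python sketch computes the probability of seeing at least k sites when the number of chance sites is assumed to follow a Poisson distribution with mean lam. How RNAhybrid derives that mean is not described here, so `lam` is simply treated as a given input; the function name and the example values are hypothetical.

```python
import math


def poisson_tail(k: int, lam: float) -> float:
    """P(N >= k) for N ~ Poisson(lam).

    Computed as one minus the probability of observing fewer than k
    events.
    """
    below = sum(
        math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k)
    )
    return 1.0 - below


# Illustrative values only: with an expected 0.1 chance sites per UTR,
# three or more sites are far less likely than a single site.
print(poisson_tail(1, 0.1))
print(poisson_tail(3, 0.1))
```

Under this assumption, a target carrying several sites receives a much smaller p-value than one carrying a single site of the same quality.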

Recently, useful features such as the disallowance of G:U base pairs in
the seed region and a seed-match speedup that accelerates the program
have been added (Kruger and Rehmsmeier, 2006). This advanced RNAhybrid
predicted lin-39 as a strong target candidate for miR-241, a bona fide
target site that had been missed by the standard approaches (Enright et
al., 2003; Krek et al., 2005; Lewis et al., 2005).

Rajewsky and Socci (2004) presented a simple model for the

mechanism of miRNA target site recognition based on kinetic

and thermodynamic consideration. Application to a set of 74

D. melanogaster miRNAs and 3′ UTR sequences of a predefined

set of fly mRNAs revealed highly scoring, conserved putative

target sites in several key developmental body-patterning tran-

scription factors such as fushi-tarazu and hairy.

The DIANA-microT algorithm (Kiriakidou et al., 2004) used a combined
bioinformatics and experimental approach for the prediction of human
miRNA targets. This program scores high-affinity interactions between a
miRNA and its recognition element on the mRNA (MRE) with a dynamic
programming algorithm and then uses an MRE filter based on
experimentally derived rules: (a) the proximal (5′-end) region of the
miRNA forms seven to nine base pairs with the MRE; the nucleotide may
or may not participate; (b) a central loop or bulge must exist; and (c)
the distal (3′-end) region of the miRNA must form at least five
canonical/wobble base pairs with the MRE. DIANA-microT successfully
recovered all previously known prototypical C. elegans miRNA targets.

Smalheiser and Torvik (2004) carried out a population-wide

unbiased statistical analysis of how human miRNAs interact

complementarily with human mRNAs compared to scrambled

control sequences. The results demonstrate several novel fea-

tures of human miRNA–mRNA interactions that differ from

C. elegans and Drosophila, and identified a set of 71 mRNAs.

Unlike the case in C. elegans and Drosophila, many human

miRNAs exhibited long exact matches (≥10 bases), and in

several cases, a miRNA hit multiple mRNAs belonging to same

functional class.

Krek et al. (2005) presented a probabilistic identification of

combinations of target sites (PicTar) for identification of targets

for both single miRNAs and combinations of miRNAs. In the PicTar
probabilistic model, miRNAs compete with each other and with background
for binding. The model accounts for synergistic effects of multiple
binding sites of one miRNA or of several miRNAs acting together, along
with appropriate scoring of overlapping sites. The probabilities are
assigned according to experimental and computational results. The
authors reported that each vertebrate

miRNA could target roughly 200 transcripts on average.

Later, Grun et al. (2005) exploited cross species comparison

and PicTar. They predicted that each D. melanogaster miRNA,

on average, could target 54 transcripts. They also predicted co-

ordinate regulation of target genes by clustered miRNAs. The

authors suggested that gene regulation by miRNAs was com-

parable between flies and mammals, but certain miRNAs might

function in clade-specific modes of gene regulation. The algo-

rithm could recover published miRNA targets and is estimated

to have a ~30% false-positive rate.

By using a new version of PicTar and sequence alignments

of three nematodes, Lall et al. (2006) predicted that miRNAs

regulate at least 10% of C. elegans genes through conserved

interactions. They devised a more elaborate way of assigning

emission probabilities based on the likely increased evolution-

ary conservation of functional sites.

FIG. 2. Complementarity of plant or animal miRNA to the 3′ UTR of the respective target mRNA. The mature miRNA hybridizes to its complementary site in the 3′ UTR of the target mRNA. Plant miRNAs commonly pair with perfect complementarity to the target, whereas animal miRNAs show imperfect complementarity, although at least one animal miRNA shows perfect complementarity (Yekta et al., 2004). When the miRNA and the mRNA target site are perfectly complementary, the mRNA is degraded; imperfect complementarity leads to repression of protein translation.

The software program MovingTargets (Burgler and Macdonald, 2005)
predicts miRNA targets in Drosophila. The approach creates a database
of potential targets and screens those for adherence to constraints
suggested by analysis of known miRNA/target interactions. The program
uses a set of biological constraints: the number of target sites
(usually multiple), the strength of miRNA–mRNA hybridization, the
number of miRNA 5′ nucleotides involved in base pairing to the target,
and the number involved in G:U base pairs. The authors identified 83
high-likelihood miRNA targets and verified 3 of these in cultured
cells, including a target for the Drosophila let-7 homolog.
MovingTargets provides flexibility in adjusting the values of the
constraints, which may lead to the prediction of more refined sets of
miRNA targets in future.

Saetrom et al. (2005) presented a miRNA target prediction

algorithm TargetBoost, an adaptation of the boosted genetic

programming algorithm (Saetrom 2004), which uses machine

learning to capture the binding characteristics between miRNAs

and their targets on a set of validated miRNA targets in lower

organisms. Given a miRNA and a potential target site, this

classifier returns a score that represents the likelihood of the site

being targeted by the miRNA. The program was used to predict

target sites in a set of genes important for fly body patterning in

D. melanogaster.

Yoon and De Micheli (2005) proposed a computational

method to predict miRNA regulatory modules. Their method

was based upon the observation that more than one miRNA

typically regulates one message, and that one miRNA may have

several target genes. The method was tested with the human genome, and
a total of 431 regulatory modules were predicted.

Robins et al. (2005) developed an algorithm for predicting

targets that does not rely on evolutionary conservation. It consists
of: (a) matching the seven 5′ nucleotides, (b) scoring the match of the
entire miRNA, (c) incorporating the 3′ UTR structure of the target, and
(d) combining scores for multiple sites in the targets. By using
Drosophila miRNAs as a test case, they validated their
predictions in 10 of 15 genes tested. However, in contrast to

other studies, their computational and experimental data suggest

that miRNAs have fewer targets than previously reported.

Xie et al. (2005) performed a comparative analysis of the human, mouse,
rat, and dog genomes and created a systematic catalog of common
regulatory motifs in promoters and 3′ UTRs. Although it is not a miRNA
target prediction algorithm, the authors reported a large number of UTR
motifs, many of which are likely to be miRNA targets. Their conclusion
was based on the following observations: (a) the motifs had a strong
directional bias with respect to the DNA strand, (b) the length
distribution showed a strong peak at 8 bases, and (c) the motifs ended
with an adenosine complementary to the 5′-end of a binding miRNA. The
authors suggested that in humans, miRNAs regulate at least 20% of genes.

Chan and colleagues (2005) have applied a computational

comparative genomic approach for identifying the targets of

miRNAs in closely related flies and worms. The target set of a given
miRNA was defined as the union of all conserved sets corresponding to
the highly conserved k-mers complementary to its 5′ extremity; randomly
generated sequences were used to create pseudotargets that served as
controls. This approach did not require orthologous sequences to be
aligned and required only two genomes. Using this strategy, a large
number of target genes for most of the known miRNAs were determined in
these species.

Kim et al. (2006) presented miTarget, a SVM classifier to

predictmiRNA target genes. The algorithmmiTarget uses a radial

basis function kernel as a similarity measure for SVM features,

categorized by structural, thermodynamic, and position-based

features. The training data set was collected from the literature to

make a biologically relevant simulation. With this approach,

authors predicted significant functions for human miR-1, miR-

124a, and miR-373 using Gene Ontology (GO) analysis and re-

vealed the importance of pairing at positions 4, 5, and 6 in the 50

region of a miRNA from a feature selection experiment.
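The radial basis function kernel mentioned above has a simple closed form, K(x, y) = exp(-gamma * ||x - y||^2). The following Python sketch computes it for two feature vectors. The feature layout, the numeric values, and the `gamma` setting are hypothetical placeholders chosen for illustration; they are not the features or parameters of miTarget.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical, pre-scaled feature vectors for two candidate sites:
# [seed pairing score, duplex free energy (scaled), site position (scaled)]
site_a = [1.0, 0.8, 0.2]
site_b = [0.9, 0.7, 0.3]
site_c = [0.0, 0.1, 0.9]

print(rbf_kernel(site_a, site_b))  # similar sites give a value close to 1
print(rbf_kernel(site_a, site_c))  # dissimilar sites give a value closer to 0
```

Because the kernel depends only on the distance between feature vectors, features on very different scales would normally be rescaled before training.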

Wang (2006) implemented an algorithm, MirTarget, for animal

miRNA target prediction. The algorithm combines relevant

parameters for miRNA target recognition (seed sequence scan-

ning, cross-species conservation, miRNA-target site duplex sta-

bility, and limited seed extension) and heuristically assigns

weights according to their relative importance. A score calcu-

lation scheme is introduced to reflect the strength of each pa-

rameter. The authors also performed microarray time course

experiments to identify downregulated genes due to miRNA

overexpression. A significant downregulation of many cell cycle-related

genes was observed following miR-124 overexpression.
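A heuristic weighting scheme of this general kind can be illustrated with a weighted sum. The sketch below is only an assumption about the overall shape of such a score: the weights, the 0-to-1 scaling of each parameter, and the function name are invented for illustration and are not taken from MirTarget.

```python
# Hypothetical weights reflecting an assumed relative importance of each parameter.
WEIGHTS = {
    "seed_match": 0.4,
    "conservation": 0.3,
    "duplex_stability": 0.2,
    "seed_extension": 0.1,
}


def site_score(features: dict) -> float:
    """Weighted sum of per-parameter strengths, each assumed to lie in [0, 1]."""
    return sum(weight * features[name] for name, weight in WEIGHTS.items())


candidate = {
    "seed_match": 1.0,        # perfect seed pairing
    "conservation": 0.5,      # conserved in some but not all species compared
    "duplex_stability": 0.7,  # scaled duplex free energy
    "seed_extension": 0.0,    # no pairing beyond the seed
}
print(round(site_score(candidate), 2))
```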

Watanabe et al. (2006) used an algorithm which combines

hybridization tendency of an miRNA–mRNA target duplex with

the conventionally used prediction criteria. The numbers of

perfectly complementary di-nucleotide sequences were counted

between known pairs of miRNA–mRNA in C. elegans and the

free energy within complementary base pairs of each dinucleo-

tide was calculated by sliding a 2-nt window along all nucleo-

tides of the miRNA–mRNA duplex. The analysis confirmed

strong base pairing at the 5'-end of miRNAs (nts 1–8) in C.

elegans, the required central region mismatch (nt 9 or nt 10), and

found weak binding at the 3' region (nts 13–14) in addition. With

this approach, the group predicted 687 possible miRNA target

transcripts, many of which are thought to be involved in C.

elegans development.

Very recently, a pattern-based method rna22 (Miranda et al.,

2006) has been presented for identification of miRNA binding

sites and their corresponding miRNA–mRNA heteroduplexes.

Unlike the previous methods, rna22 does not use a cross-species

sequence conservation filter, allowing the discovery of miRNA

binding sites that may not be present in closely related species.

Rna22 first finds putative miRNA binding sites in the sequence

of interest, then identifies the targeting miRNA. Computation-

ally, rna22 could identify most of the currently known hetero-

duplexes. Experimentally, with luciferase assays, the authors

demonstrated average repressions of 30% or more for 168 of 226

tested targets. The results suggest that in a given genome the

true numbers of miRNA precursors, miRNA binding sites, and

affected gene transcripts may be substantially higher than currently

hypothesized and that, in addition to 3' UTRs, numerous

binding sites likely exist in 5' UTRs and coding sequences.

TARGET PREDICTION IN PLANTS AND VIRUSES

Rhoades et al. (2002) extracted Arabidopsis mRNA se-

quences from GenBank, and searched for complementary sites

for the 16 miRNAs using PatScan (Dsouza et al., 1997). They

predicted 49 regulatory targets of 14 miRNAs, 34 of them belonging

to transcription factor gene families involved in developmental

patterning or cell differentiation, and hence, the authors sug-

gested that many plant miRNAs function during cell differ-

entiation to clear key regulatory transcripts from daughter cell

lineage. A similar approach was used by Bonnet et al. (2004),

but they allowed mismatches in accordance with the length of

the potential miRNA. Their data was validated with the test

data set of Rhoades et al. (2002).
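A search for near-complementary sites with a bounded number of mismatches can be sketched with a plain sliding-window comparison. This is not PatScan and does not reproduce its pattern syntax; the function name, the `max_mismatches` parameter, and the toy sequences are assumptions made for illustration, and gaps and G:U pairs are ignored.

```python
def complement_sites(mirna: str, mrna: str, max_mismatches: int = 3) -> list:
    """Return (position, mismatches) for every window of `mrna` that is
    complementary to `mirna` with at most `max_mismatches` mismatches."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    # The perfectly complementary site, written 5'->3' on the mRNA.
    site = "".join(pairs[base] for base in reversed(mirna))
    hits = []
    for i in range(len(mrna) - len(site) + 1):
        window = mrna[i:i + len(site)]
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# Toy sequences (illustrative only): a short miRNA fragment and two mRNA fragments.
print(complement_sites("UGAGGUAG", "GGCUACCUCAGG", max_mismatches=0))
print(complement_sites("UGAGGUAG", "GGCUACGUCAGG", max_mismatches=1))
```

Making the mismatch allowance depend on miRNA length, as described for Bonnet et al. (2004), would amount to computing `max_mismatches` from `len(mirna)` before calling the function.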

Jones-Rhoades and Bartel (2004) developed a comparative

genomic approach to systematically identify both miRNAs and

their targets conserved in A. thaliana and O. sativa. The algo-

rithm allowed for gaps and more mismatches in the mRNA:

miRNA duplex compared to the earlier method (Rhoades et al.,

2002). The authors confirmed 19 newly identified target candi-

dates and suggested that plant miRNAs have a strong propen-

sity to target genes controlling development, transcription

factors, and F-box proteins in particular, in addition to those of

superoxide dismutases, laccases, and ATP sulfurylases.

Target prediction by Wang et al. (2004) used extensive se-

quence complementarity between miRNAs and their target

mRNAs. Putative targets functionally conserved between A.

thaliana and O. sativa were identified for most newly identified

miRNAs. Independent microarray data showed that the expres-

sion levels of some mRNA targets anticorrelated with the ac-

cumulation pattern of their corresponding regulatory miRNAs.

The cleavage of three target mRNAs by miRNA binding was

validated in 5' RACE experiments.

A Web-based integrated computing system, miRU, has been

developed by Zhang (2005) for plant miRNA target gene pre-

diction in any plant whose genome sequence or a large number

of expressed sequence tags (ESTs) are available.

Li and Zhang (2005) proposed a computational pipeline and

detected 96 candidate Arabidopsis miRNAs which were pre-

dicted to target 102 transcription factor genes classified as 28

transcription factor gene families. The method searched for

short, perfectly complementary sequences, and considered RNA

secondary structures and sequence conservation between Arabidopsis

and O. sativa.

Zilberstein et al. (2006) presented a program miRNAXpress

which associates miRNAs with the conditions in which they

act. The program consists of a target prediction module (output:

Targets Matrix) and an associating operation F, which operates on

a predefined Expression Matrix; the two work in tandem. The program

was applied to A. thaliana as a model, with 98 miRNAs

and 380 conditions. One interesting result stated that mir159C

activity could be a factor in the misresponse of nph4 mutants to

phototropic stimulations.

To identify targets of Epstein-Barr virus (EBV) miRNAs,

Pfeffer et al. (2004) used a computational method similar to that used

earlier for animals (Enright et al., 2003). The majority of predicted

host cell targets had more than one binding site for viral miRNAs,

and ~50% of these had additional targets from host miRNAs. The

predicted viral miRNA targets included regulators of cell prolif-

eration and apoptosis, B cell-specific chemokines and cytokines,

transcriptional regulators, and components of signal transduction

pathways. The authors suggested that EBV might exploit miRNA

silencing as a convenient method for gene regulation of host and

viral genes in a nonimmunogenic manner.

EXPERIMENTAL VALIDATION OF miRNA TARGETS

In sharp contrast to the number of experimentally

validated miRNAs, there is a dearth of experimental

evidence identifying their corresponding targets. This is because

validating predictions of miRNA targets is much more

challenging, and so far there is no simple, high-throughput

method for biologically validating miRNA targets. The most

commonly used method implements tissue culture assays using

luciferase reporter gene constructs fused to target sequences

(Lewis et al., 2003; Chang et al., 2004; Esau et al., 2004;

Mansfield et al., 2004; Burgler and Macdonald, 2005; Kir-

iakidou et al., 2005; Krek et al., 2005). These constructs are used

to transfect cells expressing the relevant miRNA, or sometimes

miRNA is experimentally overexpressed, along with vectors

carrying mutant versions of binding sites. If such a construct is

actively regulated by miRNAs already present in the transfected

cells, one might expect it to produce lower levels of the reporter

than the mutant construct.
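The comparison between wild-type and mutant reporters is usually summarized as a fractional repression. The sketch below assumes normalized luciferase readings from replicate transfections; the function name and the example values are hypothetical and are not data from any of the cited studies.

```python
from statistics import mean


def repression(wild_type_readings, mutant_readings):
    """Fractional repression of the wild-type reporter relative to the
    mutant-site reporter, computed from replicate normalized readings."""
    mutant_mean = mean(mutant_readings)
    if mutant_mean <= 0:
        raise ValueError("mutant reporter signal must be positive")
    return 1.0 - mean(wild_type_readings) / mutant_mean


# Invented replicate readings (arbitrary units, illustrative only).
wt = [0.62, 0.70, 0.66]
mut = [1.00, 0.95, 1.05]
print(f"repression: {repression(wt, mut):.0%}")
```

A value near zero indicates that the intact binding site makes no difference to reporter output, whereas larger positive values indicate miRNA-dependent repression.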

Another method is to examine cells in which an miRNA has

been overexpressed by transfection of homologous synthetic

short interfering RNAs or recombinant adenoviral infection for

stable target mRNA expression by microarray (Krutzfeldt et al.,

2005; Lim et al., 2005). This approach could be effective, as

many of the predicted genes can be tested in one experiment.

Loss-of-function studies have also been used in which an

miRNA is inhibited by 2'-O-methyl-modified oligonucleotides,

and the inhibition of activity is assayed either by luciferase

activity in reporter assays or by gene expression analysis (Poy

et al., 2004; Chen et al., 2006; Schratt et al., 2006). Lall et al.

(2006) have developed an in vivo validation system that has the

key feature of using upstream sequence from each specific target,

allowing reporter expression to be driven in a manner that

approximates expression of the endogenous transcript. With the

above techniques, a small number of targets have so far been validated.

Complications are due largely to multiplicity of miRNA targets

and to cooperative interactions of miRNAs with a target. Nev-

ertheless, the experimental results are encouraging, and have

confirmed that various target prediction engines are indeed cap-

able of identifying miRNA targets. Future development of

high-throughput target validation techniques will be necessary

to raise the specificity and sensitivity of microRNA target

prediction algorithms.

miRNA DATA RESOURCES

The increasing number of predicted miRNAs and their re-

spective targets has led to the development of several database

resources. miRBase is one such database focused on microRNA

data. It incorporates all published miRNA sequences with ge-

nomic location and annotation, predicted miRNA target genes,

and also it has a registry where data submissions can be done

prior to publication. Besides miRBase, there are other database

resources like miRNAMap, Tarbase, and Argonaute. A summary

of the available online resources like database resources, Web

sites for miRNA and target prediction algorithms, and Web

sites with precomputed predictions is presented in Table 2.

TABLE 2. AVAILABLE ONLINE RESOURCES FOR MIRNA INFORMATION

Name | URL | Remarks | References

Database resources
miRNA registry/miRBase | http://microrna.sanger.ac.uk | miRNA sequences, annotations, and predicted targets | (Griffiths-Jones, 2004; Griffiths-Jones et al., 2006)
miRNAMap | http://mirnamap.mbc.nctu.edu.tw | Genomic maps for miRNA genes and targets | (Hsu et al., 2006)
Tarbase | http://www.diana.pcbi.upenn.edu | List of experimentally supported miRNA targets | (Sethupathy et al., 2006)
Argonaute | http://www.ma.uni-heidelberg.de/apps/zmf/argonaute/interface | Database for gene regulation by mammalian miRNAs | (Shahi et al., 2006)

Identification of miRNAs
MiRscan | http://genes.mit.edu/mirscan | miRNA gene scan webserver | (Lim et al., 2003a, 2003b; Ohler et al., 2004)
MiRseeker | Dr. Pavel Tomancak; Email: tomancak@mpi-cbg.de | Program available upon request | (Lai et al., 2003)
srnaloop | http://arep.med.harvard.edu/miRNA/pgmlicense.html | Source code available | (Grad et al., 2003)
findMiRNA | http://sundarlab.ucdavis.edu/mirna/ | Downloadable program | (Adai et al., 2005)
ProMiR II | http://cbit.snu.ac.kr/~ProMiR2/ | Web server | (Nam et al., 2006)
Bayes-MiRNAfind | https://bioinfo.wistar.upenn.edu/miRNA/miRNA/login.php | Webserver | (Yousef et al., 2006)
Microprocessor SVM & miRNA SVM | https://demo1.interagon.com/miRNA/ | Webserver | (Helvik et al., 2006)
miR-abela | http://www.mirz.unibas.ch | RNA regulatory networks | -

Identification of miRNA targets
TargetScan/TargetScanS | http://genes.mit.edu/targetscan | Precomputed searchable targets for human, mouse, rat, dog | (Lewis et al., 2003, 2005)
PicTar | http://pictar.bio.nyu.edu | Precomputed searchable targets for vertebrates and flies | (Grun et al., 2005; Krek et al., 2005)
miRanda | http://www.microrna.org | Precomputed searchable targets for human, flies, and zebrafish | (Enright et al., 2003; John et al., 2004)
DIANA-microT | http://www.diana.pcbi.upenn.edu/cgi-bin/micro_t.cgi | Webserver for target prediction in human, mouse, rat, flies, worm, and Arabidopsis | (Kiriakidou et al., 2004)
RNAhybrid | http://bibiserv.techfak.uni-bielefeld.de/rnahybrid | Prediction of miRNA binding sites | (Rehmsmeier et al., 2004)
miRU | http://bioinfo3.noble.org/miRNA/miRU.htm | Webserver for plant miRNA target finder | (Zhang, 2005)
TargetBoost | https://demo1.interagon.com/demo | Webserver for target prediction | (Saetrom et al., 2005)
RNA22 | http://cbcsrv.watson.ibm.com/rna22.html | Webserver for target prediction | (Miranda et al., 2006)
miRNA target prediction at EMBL | http://www.russell.embl-heidelberg.de/miRNAs/ | Precomputed searchable targets for Drosophila | (Stark et al., 2003, 2005; Brennecke et al., 2005)

RNA folding programs
Mfold package | http://www.bioinfo.rpi.edu/applications/mfold/ | RNA folding and hybridization prediction | (Zuker, 2003)
Vienna package | http://www.tbi.univie.ac.at/~ivo/RNA/ | RNA secondary structure prediction and comparison | (Hofacker, 2003)

CONCLUDING REMARKS

Over the past few years, the complex and subtle roles of

miRNAs in gene regulation have been increasingly appreciated.

This review summarizes the recent research efforts in compu-

tational prediction and experimental validation techniques re-

lated to both miRNAs and their targets. Most miRNA prediction

algorithms combine information on sequence, structure, and

conservation and predict different numbers of candidate miRNA

genes, few of which have been experimentally validated. Possible

explanations could be that these represent false positives

or that the gene is simply not expressed in the RNA sample examined.

These algorithms have so far not been equipped with the

predictions on the orientation of the transcript (plus or minus

strand) with respect to genomic location, the position of the

processing sites within the hairpin structure, and the determi-

nation of which of the paired segments of the hairpin will

constitute the mature miRNA. Target prediction, again, has been

further complicated by the multiplicity of interactions and by the fact

that hundreds of RNAs and thousands of the targets appear to

compose remarkably complex regulatory networks, mediating

many facets of eukaryotic cell function. Despite such uncer-

tainties, in silico prediction methods for miRNAs and their tar-

gets have already become a valuable tool. Sensitive biological

validation techniques are key factors in fine tuning informat-

ics prediction algorithms. And yet, developing such biological

techniques often depends on effective prediction algorithms.

An integrated detection approach, which combines computa-

tional prediction together with high-throughput biological val-

idation, has been most effective in discovery of miRNAs. Now

we know that the regulation of gene expression by miRNAs is

a widespread natural phenomenon regulating complex genetic

pathways, and these miRNAs are modulated in many human

diseases. Understanding the miRNA-guided network has enormous

potential to provide a new window for diagnostics

and therapy of many human diseases. Many challenges remain

in understanding miRNAs and dissecting the affected pathways.

Integrative approaches with crosstalk between in silico and ex-

perimental methods will continue to push forward future de-

velopments in this exciting field.

REFERENCES

ADAI, A., JOHNSON, C., MLOTSHWA, S., ARCHER-EVANS, S., MANOCHA, V., VANCE, V., and SUNDARESAN, V. (2005). Computational prediction of miRNAs in Arabidopsis thaliana. Genome Res. 15, 78–91.

ALTUVIA, Y., LANDGRAF, P., LITHWICK, G., ELEFANT, N., PFEFFER, S., ARAVIN, A., BROWNSTEIN, M.J., TUSCHL, T., and MARGALIT, H. (2005). Clustering and conservation patterns of human microRNAs. Nucleic Acids Res. 33, 2697–2706.

AMBROS, V. (2004). The functions of animal microRNAs. Nature 431, 350–355.

AMBROS, V., LEE, R.C., LAVANWAY, A., WILLIAMS, P.T., and JEWELL, D. (2003). MicroRNAs and other tiny endogenous RNAs in C. elegans. Curr. Biol. 13, 807–818.

AXTELL, M.J., and BARTEL, D.P. (2005). Antiquity of microRNAs and their targets in land plants. Plant Cell 17, 1658–1673.

BABAK, T., ZHANG, W., MORRIS, Q., BLENCOWE, B.J., and HUGHES, T.R. (2004). Probing microRNAs with microarrays: tissue specificity and functional inference. RNA 10, 1813–1819.

BARAD, O., MEIRI, E., AVNIEL, A., AHARONOV, R., BARZILAI, A., BENTWICH, I., EINAV, U., GILAD, S., HURBAN, P., KAROV, Y., et al. (2004). MicroRNA expression detected by oligonucleotide microarrays: System establishment and expression profiling in human tissues. Genome Res. 14, 2486–2494.

BARTEL, B., and BARTEL, D.P. (2003). MicroRNAs: At the root of plant development? Plant Physiol. 132, 709–717.

BARTEL, D.P. (2004). MicroRNAs: Genomics, biogenesis, mechanism, and function. Cell 116, 281–297.

BASKERVILLE, S., and BARTEL, D.P. (2005). Microarray profiling of miRNAs reveals frequent coexpression with neighboring miRNAs and host genes. RNA 11, 241–247.

BENTWICH, I., AVNIEL, A., KAROV, Y., AHARONOV, R., GILAD, S., BARAD, O., BARZILAI, A., EINAT, P., EINAV, U., MEIRI, E., et al. (2005). Identification of hundreds of conserved and nonconserved human microRNAs. Nat. Genet. 37, 766–770.

BEREZIKOV, E., GURYEV, V., VAN DE BELT, J., WIENHOLDS, E., PLASTERK, R.H., and CUPPEN, E. (2005). Phylogenetic shadowing and computational identification of human microRNA genes. Cell 120, 21–24.

BOFFELLI, D., MCAULIFFE, J., OVCHARENKO, D., LEWIS, K.D., OVCHARENKO, I., PACHTER, L., and RUBIN, E.M. (2003). Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science 299, 1391–1394.

BONNET, E., WUYTS, J., ROUZE, P., and VAN DE PEER, Y. (2004). Detection of 91 potential conserved plant microRNAs in Arabidopsis thaliana and Oryza sativa identifies important target genes. Proc. Natl. Acad. Sci. USA 101, 11511–11516.

BRENNECKE, J., STARK, A., RUSSELL, R.B., and COHEN, S.M. (2005). Principles of microRNA-target recognition. PLoS Biol. 3, e85.

BURGLER, C., and MACDONALD, P.M. (2005). Prediction and verification of microRNA targets by MovingTargets, a highly adaptable prediction method. BMC Genomics 6, 88.

CALIN, G.A., and CROCE, C.M. (2006). MicroRNA-cancer connection: the beginning of a new tale. Cancer Res. 66, 7390–7394.

CALIN, G.A., DUMITRU, C.D., SHIMIZU, M., BICHI, R., ZUPO, S., NOCH, E., ALDLER, H., RATTAN, S., KEATING, M., RAI, K., et al. (2002). Frequent deletions and down-regulation of micro-RNA genes miR15 and miR16 at 13q14 in chronic lymphocytic leukemia. Proc. Natl. Acad. Sci. USA 99, 15524–15529.

CASTOLDI, M., SCHMIDT, S., BENES, V., NOERHOLM, M., KULOZIK, A.E., HENTZE, M.W., and MUCKENTHALER, M.U. (2006). A sensitive array for microRNA expression profiling (miChip) based on locked nucleic acids (LNA). RNA 12, 913–920.

CHAN, C.S., ELEMENTO, O., and TAVAZOIE, S. (2005). Revealing posttranscriptional regulatory elements through network-level conservation. PLoS Comput. Biol. 1, e69.

CHANG, S., JOHNSTON, R.J., JR., FROKJAER-JENSEN, C., LOCKERY, S., and HOBERT, O. (2004). MicroRNAs act sequentially and asymmetrically to control chemosensory laterality in the nematode. Nature 430, 785–789.

CHATTERJEE, R., and CHAUDHURI, K. (2006). An approach for the identification of microRNA with an application to Anopheles gambiae. Acta Biochim. Pol. 53, 303–309.

CHEN, J.F., MANDEL, E.M., THOMSON, J.M., WU, Q., CALLIS, T.E., HAMMOND, S.M., CONLON, F.L., and WANG, D.Z. (2006). The role of microRNA-1 and microRNA-133 in skeletal muscle proliferation and differentiation. Nat. Genet. 38, 228–233.

CUI, C., GRIFFITHS, A., LI, G., SILVA, L.M., KRAMER, M.F., GAASTERLAND, T., WANG, X.J., and COEN, D.M. (2006). Prediction and identification of herpes simplex virus 1-encoded microRNAs. J. Virol. 80, 5499–5508.

DEO, M., YU, J.Y., CHUNG, K.H., TIPPENS, M., and TURNER, D.L. (2006). Detection of mammalian microRNA expression by in situ hybridization with RNA oligonucleotides. Dev. Dyn. 235, 2538–2548.

DSOUZA, M., LARSEN, N., and OVERBEEK, R. (1997). Searching for patterns in genomic data. Trends Genet. 13, 497–498.

EDDY, S.R. (1996). Hidden Markov models. Curr. Opin. Struct. Biol. 6, 361–365.

ENRIGHT, A.J., JOHN, B., GAUL, U., TUSCHL, T., SANDER, C., and MARKS, D.S. (2003). MicroRNA targets in Drosophila. Genome Biol. 5, R1.

ESAU, C., KANG, X., PERALTA, E., HANSON, E., MARCUSSON, E.G., RAVICHANDRAN, L.V., SUN, Y., KOO, S., PERERA, R.J., JAIN, R., et al. (2004). MicroRNA-143 regulates adipocyte differentiation. J. Biol. Chem. 279, 52361–52365.

FANG, S., LEE, H.J., WARK, A.W., and CORN, R.M. (2006). Attomole microarray detection of microRNAs by nanoparticle-amplified SPR imaging measurements of surface polyadenylation reactions. J. Am. Chem. Soc. 128, 14044–14046.

FU, H.J., ZHU, J., YANG, M., ZHANG, Z.Y., TIE, Y., JIANG, H., SUN, Z.X., and ZHENG, X.F. (2006). A novel method to monitor the expression of microRNAs. Mol. Biotechnol. 32, 197–204.

GAUTHERET, D., and LAMBERT, A. (2001). Direct RNA motif definition and identification from multiple sequence alignments using secondary structure profiles. J. Mol. Biol. 313, 1003–1011.

GRAD, Y., AACH, J., HAYES, G.D., REINHART, B.J., CHURCH, G.M., RUVKUN, G., and KIM, J. (2003). Computational and experimental identification of C. elegans microRNAs. Mol. Cell 11, 1253–1263.

GREY, F., ANTONIEWICZ, A., ALLEN, E., SAUGSTAD, J., MCSHEA, A., CARRINGTON, J.C., and NELSON, J. (2005). Identification and characterization of human cytomegalovirus-encoded microRNAs. J. Virol. 79, 12095–12099.

GRIFFITHS-JONES, S. (2004). The microRNA Registry. Nucleic Acids Res. 32, D109–D111.

GRIFFITHS-JONES, S., GROCOCK, R.J., VAN DONGEN, S., BATEMAN, A., and ENRIGHT, A.J. (2006). miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 34, D140–D144.

GRUN, D., WANG, Y.L., LANGENBERGER, D., GUNSALUS, K.C., and RAJEWSKY, N. (2005). microRNA target predictions across seven Drosophila species and comparison to mammalian targets. PLoS Comput. Biol. 1, e13.

GRUNDHOFF, A., SULLIVAN, C.S., and GANEM, D. (2006). A combined computational and microarray-based approach identifies novel microRNAs encoded by human gamma-herpesviruses. RNA 12, 733–750.

HARFE, B.D. (2005). MicroRNAs in vertebrate development. Curr. Opin. Genet. Dev. 15, 410–415.

HARTIG, J.S., GRUNE, I., NAJAFI-SHOUSHTARI, S.H., and FAMULOK, M. (2004). Sequence-specific detection of MicroRNAs by signal-amplifying ribozymes. J. Am. Chem. Soc. 126, 722–723.

HE, L., THOMSON, J.M., HEMANN, M.T., HERNANDO-MONGE, E., MU, D., GOODSON, S., POWERS, S., CORDON-CARDO, C., LOWE, S.W., HANNON, G.J., et al. (2005). A microRNA polycistron as a potential human oncogene. Nature 435, 828–833.

HELVIK, S.A., SNOVE, O., JR., and SAETROM, P. (2006). Reliable prediction of Drosha processing sites improves microRNA gene prediction. Bioinformatics.

HERTEL, J., and STADLER, P.F. (2006). Hairpins in a Haystack: Recognizing microRNA precursors in comparative genomics data. Bioinformatics 22, e197–e202.

HOFACKER, I.L. (2003). Vienna RNA secondary structure server. Nucleic Acids Res. 31, 3429–3431.

HOFACKER, I.L., FONTANA, W., STADLER, P.F., BONHOEFFER, S., TACKER, M., and SCHUSTER, P. (1994). Fast folding and comparison of RNA secondary structures. Monatsh. Chem. 125, 167–188.

HSU, P.W., HUANG, H.D., HSU, S.D., LIN, L.Z., TSOU, A.P., TSENG, C.P., STADLER, P.F., WASHIETL, S., and HOFACKER, I.L. (2006). miRNAMap: Genomic maps of microRNA genes and their target genes in mammalian genomes. Nucleic Acids Res. 34, D135–D139.

JOHN, B., ENRIGHT, A.J., ARAVIN, A., TUSCHL, T., SANDER, C., and MARKS, D.S. (2004). Human MicroRNA targets. PLoS Biol. 2, e363.

JONES-RHOADES, M.W., and BARTEL, D.P. (2004). Computational identification of plant microRNAs and their targets, including a stress-induced miRNA. Mol. Cell 14, 787–799.

JONSTRUP, S.P., KOCH, J., and KJEMS, J. (2006). A microRNA detection system based on padlock probes and rolling circle amplification. RNA 12, 1747–1752.

KIM, S.K., NAM, J.W., RHEE, J.K., LEE, W.J., and ZHANG, B.T. (2006). miTarget: microRNA target gene prediction using a support vector machine. BMC Bioinformatics 7, 411.

KIM, V.N. (2005). MicroRNA biogenesis: Coordinated cropping and dicing. Nat. Rev. Mol. Cell Biol. 6, 376–385.

KIRIAKIDOU, M., NELSON, P.T., KOURANOV, A., FITZIEV, P., BOUYIOUKOS, C., MOURELATOS, Z., and HATZIGEORGIOU, A. (2004). A combined computational-experimental approach predicts human microRNA targets. Genes Dev. 18, 1165–1178.

KIRIAKIDOU, M., NELSON, P., LAMPRINAKI, S., SHARMA, A., and MOURELATOS, Z. (2005). Detection of microRNAs and assays to monitor microRNA activities in vivo and in vitro. Methods Mol. Biol. 309, 295–310.

KLOOSTERMAN, W.P., STEINER, F.A., BEREZIKOV, E., DE BRUIJN, E., VAN DE BELT, J., VERHEUL, M., CUPPEN, E., and PLASTERK, R.H. (2006a). Cloning and expression of new microRNAs from zebrafish. Nucleic Acids Res. 34, 2558–2569.

KLOOSTERMAN, W.P., WIENHOLDS, E., DE BRUIJN, E., KAUPPINEN, S., and PLASTERK, R.H. (2006b). In situ detection of miRNAs in animal embryos using LNA-modified oligonucleotide probes. Nat. Methods 3, 27–29.

KREK, A., GRUN, D., POY, M.N., WOLF, R., ROSENBERG, L., EPSTEIN, E.J., MACMENAMIN, P., DA PIEDADE, I., GUNSALUS, K.C., STOFFEL, M., et al. (2005). Combinatorial microRNA target predictions. Nat. Genet. 37, 495–500.

KRICHEVSKY, A.M., KING, K.S., DONAHUE, C.P., KHRAPKO, K., and KOSIK, K.S. (2003). A microRNA array reveals extensive regulation of microRNAs during brain development. RNA 9, 1274–1281.

KRUGER, J., and REHMSMEIER, M. (2006). RNAhybrid: microRNA target prediction easy, fast and flexible. Nucleic Acids Res. 34, W451–W454.

KRUTZFELDT, J., RAJEWSKY, N., BRAICH, R., RAJEEV, K.G., TUSCHL, T., MANOHARAN, M., and STOFFEL, M. (2005). Silencing of microRNAs in vivo with "antagomirs." Nature 438, 685–689.

LAGOS-QUINTANA, M., RAUHUT, R., LENDECKEL, W., and TUSCHL, T. (2001). Identification of novel genes coding for small expressed RNAs. Science 294, 853–858.

LAGOS-QUINTANA, M., RAUHUT, R., YALCIN, A., MEYER, J., LENDECKEL, W., and TUSCHL, T. (2002). Identification of tissue-specific microRNAs from mouse. Curr. Biol. 12, 735–739.

LAI, E.C., TOMANCAK, P., WILLIAMS, R.W., and RUBIN, G.M. (2003). Computational identification of Drosophila microRNA genes. Genome Biol. 4, R42.

LALL, S., GRUN, D., KREK, A., CHEN, K., WANG, Y.L., DEWEY, C.N., SOOD, P., COLOMBO, T., BRAY, N., MACMENAMIN, P., et al. (2006). A genome-wide map of conserved microRNA targets in C. elegans. Curr. Biol. 16, 460–471.

LAU, N.C., LIM, L.P., WEINSTEIN, E.G., and BARTEL, D.P. (2001). An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science 294, 858–862.

LEE, R.C., and AMBROS, V. (2001). An extensive class of small RNAs in Caenorhabditis elegans. Science 294, 862–864.

LEE, R.C., FEINBAUM, R.L., and AMBROS, V. (1993). The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843–854.

LEE, Y., JEON, K., LEE, J.T., KIM, S., and KIM, V.N. (2002). MicroRNA maturation: Stepwise processing and subcellular localization. EMBO J. 21, 4663–4670.

LEGENDRE, M., LAMBERT, A., and GAUTHERET, D. (2005). Profile-based detection of microRNA precursors in animal genomes. Bioinformatics 21, 841–845.

LEWIS, B.P., SHIH, I.H., JONES-RHOADES, M.W., BARTEL, D.P., and BURGE, C.B. (2003). Prediction of mammalian microRNA targets. Cell 115, 787–798.

LEWIS, B.P., BURGE, C.B., and BARTEL, D.P. (2005). Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15–20.

LI, S.C., PAN, C.Y., and LIN, W.C. (2006). Bioinformatic discovery of microRNA precursors from human ESTs and introns. BMC Genomics 7, 164.

LI, X., and ZHANG, Y.Z. (2005). Computational detection of microRNAs targeting transcription factor genes in Arabidopsis thaliana. Comput. Biol. Chem. 29, 360–367.

LIANG, R.Q., LI, W., LI, Y., TAN, C.Y., LI, J.X., JIN, Y.X., and RUAN, K.C. (2005). An oligonucleotide microarray for microRNA expression analysis based on labeling RNA with quantum dot and nanogold probe. Nucleic Acids Res. 33, e17.

LIM, L.P., GLASNER, M.E., YEKTA, S., BURGE, C.B., and BARTEL, D.P. (2003a). Vertebrate microRNA genes. Science 299, 1540.

LIM, L.P., LAU, N.C., WEINSTEIN, E.G., ABDELHAKIM, A., YEKTA, S., RHOADES, M.W., BURGE, C.B., and BARTEL, D.P. (2003b). The microRNAs of Caenorhabditis elegans. Genes Dev. 17, 991–1008.

LIM, L.P., LAU, N.C., GARRETT-ENGELE, P., GRIMSON, A., SCHELTER, J.M., CASTLE, J., BARTEL, D.P., LINSLEY, P.S., and JOHNSON, J.M. (2005). Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433, 769–773.

LIU, C.G., CALIN, G.A., MELOON, B., GAMLIEL, N., SEVIGNANI, C., FERRACIN, M., DUMITRU, C.D., SHIMIZU, M., ZUPO, S., DONO, M., et al. (2004). An oligonucleotide microchip for genome-wide microRNA profiling in human and mouse tissues. Proc. Natl. Acad. Sci. USA 101, 9740–9744.

LU, J., GETZ, G., MISKA, E.A., ALVAREZ-SAAVEDRA, E., LAMB, J., PECK, D., SWEET-CORDERO, A., EBERT, B.L., MAK, R.H., FERRANDO, A.A., et al. (2005). MicroRNA expression profiles classify human cancers. Nature 435, 834–838.

MANSFIELD, J.H., HARFE, B.D., NISSEN, R., OBENAUER, J., SRINEEL, J., CHAUDHURI, A., FARZAN-KASHANI, R., ZUKER, M., PASQUINELLI, A.E., RUVKUN, G., et al. (2004). MicroRNA-responsive "sensor" transgenes uncover Hox-like and other developmentally regulated patterns of vertebrate microRNA expression. Nat. Genet. 36, 1079–1083.

MIRANDA, K.C., HUYNH, T., TAY, Y., ANG, Y.S., TAM, W.L., THOMSON, A.M., LIM, B., and RIGOUTSOS, I. (2006). A pattern-based method for the identification of MicroRNA binding sites and their corresponding heteroduplexes. Cell 126, 1203–1217.

MISKA, E.A., ALVAREZ-SAAVEDRA, E., TOWNSEND, M., YOSHII, A., SESTAN, N., RAKIC, P., CONSTANTINE-PATON, M., and HORVITZ, H.R. (2004). Microarray analysis of microRNA expression in the developing mammalian brain. Genome Biol. 5, R68.

MOSS, E.G., and POETHIG, R.S. (2002). MicroRNAs: something new under the sun. Curr. Biol. 12, R688–R690.

MOSS, E.G., LEE, R.C., and AMBROS, V. (1997). The cold shock domain protein LIN-28 controls developmental timing in C. elegans and is regulated by the lin-4 RNA. Cell 88, 637–646.

NAM, J.W., SHIN, K.R., HAN, J., LEE, Y., KIM, V.N., and ZHANG, B.T. (2005). Human microRNA prediction through a probabilistic co-learning model of sequence and structure. Nucleic Acids Res. 33, 3570–3581.

NAM, J.W., KIM, J., KIM, S.K., and ZHANG, B.T. (2006). ProMiR II: A web server for the probabilistic prediction of clustered, nonclustered, conserved and nonconserved microRNAs. Nucleic Acids Res. 34, W455–W458.

NELSON, P.T., BALDWIN, D.A., SCEARCE, L.M., OBERHOLTZER, J.C., TOBIAS, J.W., and MOURELATOS, Z. (2004). Microarray-based, high-throughput gene expression profiling of microRNAs. Nat. Methods 1, 155–161.

NELSON, P.T., BALDWIN, D.A., KLOOSTERMAN, W.P., KAUPPINEN, S., PLASTERK, R.H., and MOURELATOS, Z. (2006). RAKE and LNA-ISH reveal microRNA expression and localization in archival human brain. RNA 12, 187–191.

OHLER, U., YEKTA, S., LIM, L.P., BARTEL, D.P., and BURGE, C.B. (2004). Patterns of flanking sequence conservation and a characteristic upstream motif for microRNA gene identification. RNA 10, 1309–1322.

PEDERSEN, J.S., BEJERANO, G., SIEPEL, A., ROSENBLOOM, K., LINDBLAD-TOH, K., LANDER, E.S., KENT, J., MILLER, W., and HAUSSLER, D. (2006). Identification and classification of conserved RNA secondary structures in the human genome. PLoS Comput. Biol. 2, e33.

PFEFFER, S., ZAVOLAN, M., GRASSER, F.A., CHIEN, M., RUSSO,J.J., JU, J., JOHN, B., ENRIGHT, A.J., MARKS, D., SANDER, C.,et al. (2004). Identification of virus-encoded microRNAs. Science304, 734–736.

PFEFFER, S., SEWER, A., LAGOS-QUINTANA,M., SHERIDAN, R.,SANDER, C., GRASSER, F.A., VAN DYK, L.F., HO, C.K., SHU-MAN, S., CHIEN, M., et al. (2005). Identification of microRNAs ofthe herpesvirus family. Nat. Methods 2, 269–276.

POY, M.N., ELIASSON, L., KRUTZFELDT, J., KUWAJIMA, S., MA,X., MACDONALD, P.E., PFEFFER, S., TUSCHL, T., RAJEWSKY,N., RORSMAN, P., et al. (2004). A pancreatic islet-specific micro-RNA regulates insulin secretion. Nature 432, 226–230.

RAJEWSKY, N., and SOCCI, N.D. (2004). Computational identifica-tion of microRNA targets. Dev. Biol. 267, 529–535.

REHMSMEIER, M., STEFFEN, P., HOCHSMANN, M., and GIE-GERICH, R. (2004). Fast and effective prediction of microRNA/target duplexes. RNA 10, 1507–1517.

REINHART, B.J., SLACK, F.J., BASSON, M., PASQUINELLI, A.E., BETTINGER, J.C., ROUGVIE, A.E., HORVITZ, H.R., and RUVKUN, G. (2000). The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403, 901–906.

RHOADES, M.W., REINHART, B.J., LIM, L.P., BURGE, C.B., BARTEL, B., and BARTEL, D.P. (2002). Prediction of plant microRNA targets. Cell 110, 513–520.

ROBINS, H., LI, Y., and PADGETT, R.W. (2005). Incorporating structure to predict microRNA targets. Proc. Natl. Acad. Sci. USA 102, 4006–4009.

RODRIGUEZ, A., GRIFFITHS-JONES, S., ASHURST, J.L., and BRADLEY, A. (2004). Identification of mammalian microRNA host genes and transcription units. Genome Res. 14, 1902–1910.

SAETROM, O., SNOVE, O., JR., and SAETROM, P. (2005). Weighted sequence motifs as an improved seeding step in microRNA target prediction algorithms. RNA 11, 995–1003.

SAETROM, P. (2004). Predicting the efficacy of short oligonucleotides in antisense and RNAi experiments with boosted genetic programming. Bioinformatics 20, 3055–3063.

SCHMITTGEN, T.D., JIANG, J., LIU, Q., and YANG, L. (2004). A high-throughput method to monitor the expression of microRNA precursors. Nucleic Acids Res. 32, e43.

SCHRATT, G.M., TUEBING, F., NIGH, E.A., KANE, C.G., SABATINI, M.E., KIEBLER, M., and GREENBERG, M.E. (2006). A brain-specific microRNA regulates dendritic spine development. Nature 439, 283–289.

SCHUSTER, P., FONTANA, W., STADLER, P.F., and HOFACKER, I.L. (1994). From sequences to shapes and back: A case study in RNA secondary structures. Proc. Biol. Sci. 255, 279–284.

SEITZ, H., ROYO, H., BORTOLIN, M.L., LIN, S.P., FERGUSON-SMITH, A.C., and CAVAILLE, J. (2004). A large imprinted microRNA gene cluster at the mouse Dlk1-Gtl2 domain. Genome Res. 14, 1741–1748.

SEMPERE, L.F., FREEMANTLE, S., PITHA-ROWE, I., MOSS, E., DMITROVSKY, E., and AMBROS, V. (2004). Expression profiling of mammalian microRNAs uncovers a subset of brain-expressed microRNAs with possible roles in murine and human neuronal differentiation. Genome Biol. 5, R13.

SETHUPATHY, P., CORDA, B., and HATZIGEORGIOU, A.G. (2006). TarBase: A comprehensive database of experimentally supported animal microRNA targets. RNA 12, 192–197.

SEWER, A., PAUL, N., LANDGRAF, P., ARAVIN, A., PFEFFER, S., BROWNSTEIN, M.J., TUSCHL, T., VAN NIMWEGEN, E., and ZAVOLAN, M. (2005). Identification of clustered microRNAs using an ab initio prediction method. BMC Bioinformatics 6, 267.

SHAHI, P., LOUKIANIOUK, S., BOHNE-LANG, A., KENZELMANN, M., KUFFER, S., MAERTENS, S., EILS, R., GRONE, H.J., GRETZ, N., and BRORS, B. (2006). Argonaute—A database for gene regulation by mammalian microRNAs. Nucleic Acids Res. 34, D115–D118.

SLACK, F.J., BASSON, M., LIU, Z., AMBROS, V., HORVITZ, H.R., and RUVKUN, G. (2000). The lin-41 RBCC gene acts in the C. elegans heterochronic pathway between the let-7 regulatory RNA and the LIN-29 transcription factor. Mol. Cell 5, 659–669.

SMALHEISER, N.R., and TORVIK, V.I. (2004). A population-based statistical approach identifies parameters characteristic of human microRNA–mRNA interactions. BMC Bioinformatics 5, 139.

STARK, A., BRENNECKE, J., RUSSELL, R.B., and COHEN, S.M. (2003). Identification of Drosophila MicroRNA targets. PLoS Biol. 1, E60.

STARK, A., BRENNECKE, J., BUSHATI, N., RUSSELL, R.B., and COHEN, S.M. (2005). Animal MicroRNAs confer robustness to gene expression and have a significant impact on 3′UTR evolution. Cell 123, 1133–1146.

SULLIVAN, C.S., GRUNDHOFF, A.T., TEVETHIA, S., PIPAS, J.M., and GANEM, D. (2005). SV40-encoded microRNAs regulate viral gene expression and reduce susceptibility to cytotoxic T cells. Nature 435, 682–686.

SUN, Y., KOO, S., WHITE, N., PERALTA, E., ESAU, C., DEAN, N.M., and PERERA, R.J. (2004). Development of a micro-array to detect human and mouse microRNAs and characterization of expression in human organs. Nucleic Acids Res. 32, e188.

TAKADA, S., BEREZIKOV, E., YAMASHITA, Y., LAGOS-QUINTANA, M., KLOOSTERMAN, W.P., ENOMOTO, M., HATANAKA, H., FUJIWARA, S., WATANABE, H., SODA, M., et al. (2006). Mouse microRNA profiles determined with a new and sensitive cloning method. Nucleic Acids Res. 34, e115.

TANG, F., HAJKOVA, P., BARTON, S.C., LAO, K., and SURANI, M.A. (2006). MicroRNA expression profiling of single whole embryonic stem cells. Nucleic Acids Res. 34, e9.

THOMSON, J.M., PARKER, J., PEROU, C.M., and HAMMOND, S.M. (2004). A custom microarray platform for analysis of microRNA gene expression. Nat. Methods 1, 47–53.

VALOCZI, A., HORNYIK, C., VARGA, N., BURGYAN, J., KAUPPINEN, S., and HAVELDA, Z. (2004). Sensitive and specific detection of microRNAs by Northern blot analysis using LNA-modified oligonucleotide probes. Nucleic Acids Res. 32, e175.

WANG, H., ACH, R.A., and CURRY, B. (2006). Direct and sensitive miRNA profiling from low-input total RNA. RNA.

WANG, X. (2006). Systematic identification of microRNA functions by combining target prediction and expression profiling. Nucleic Acids Res. 34, 1646–1652.

WANG, X.J., REYES, J.L., CHUA, N.H., and GAASTERLAND, T. (2004). Prediction and identification of Arabidopsis thaliana microRNAs and their mRNA targets. Genome Biol. 5, R65.

WATANABE, Y., YACHIE, N., NUMATA, K., SAITO, R., KANAI, A., and TOMITA, M. (2006). Computational analysis of microRNA targets in Caenorhabditis elegans. Gene 365, 2–10.

WEBER, M.J. (2005). New human and mouse microRNA genes found by homology search. FEBS J. 272, 59–73.

WIENHOLDS, E., KLOOSTERMAN, W.P., MISKA, E., ALVAREZ-SAAVEDRA, E., BEREZIKOV, E., DE BRUIJN, E., HORVITZ, H.R., KAUPPINEN, S., and PLASTERK, R.H. (2005). MicroRNA expression in zebrafish embryonic development. Science 309, 310–311.

WILLIAMS, L., CARLES, C.C., OSMONT, K.S., and FLETCHER, J.C. (2005). A database analysis method identifies an endogenous trans-acting short-interfering RNA that targets the Arabidopsis ARF2, ARF3, and ARF4 genes. Proc. Natl. Acad. Sci. USA 102, 9703–9708.

XIE, X., LU, J., KULBOKAS, E.J., GOLUB, T.R., MOOTHA, V., LINDBLAD-TOH, K., LANDER, E.S., and KELLIS, M. (2005). Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals. Nature 434, 338–345.

YAMADA, K., LIM, J., DALE, J.M., CHEN, H., SHINN, P., PALM, C.J., SOUTHWICK, A.M., WU, H.C., KIM, C., NGUYEN, M., et al. (2003). Empirical analysis of transcriptional activity in the Arabidopsis genome. Science 302, 842–846.

YEKTA, S., SHIH, I.H., and BARTEL, D.P. (2004). Micro-RNA directed cleavage of HOXB8 mRNA. Science 304, 594–596.

YOON, S., and DE MICHELI, G. (2005). Prediction of regulatory modules comprising microRNAs and target genes. Bioinformatics 21(Suppl 2), ii93–ii100.

YOUSEF, M., NEBOZHYN, M., SHATKAY, H., KANTERAKIS, S., SHOWE, L.C., and SHOWE, M.K. (2006). Combining multi-species genomic data for microRNA identification using a Naive Bayes classifier. Bioinformatics 22, 1325–1334.

ZHANG, B.H., PAN, X.P., WANG, Q.L., COBB, G.P., and ANDERSON, T.A. (2005). Identification and characterization of new plant microRNAs using EST analysis. Cell Res. 15, 336–360.

ZHANG, Y. (2005). miRU: An automated plant miRNA target prediction server. Nucleic Acids Res. 33, W701–W704.

ZILBERSTEIN, C.B., ZIV-UKELSON, M., PINTER, R.Y., and YAKHINI, Z. (2006). A high-throughput approach for associating MicroRNAs with their activity conditions. J. Comput. Biol. 13, 245–266.

ZUKER, M. (2003). Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31, 3406–3415.

Address reprint requests to:

Keya Chaudhuri, Ph.D.

Molecular & Human Genetics Division

Indian Institute of Chemical Biology

4, Raja S. C. Mullick Road

Kolkata, 700 032

India

E-mail:[email protected]

[email protected]

Received for publication November 29, 2006; received in revised form December 28, 2006; accepted January 12, 2007.
