Bioinformatics Topics Not Covered in this Course BMI 730 Kun Huang Department of Biomedical Informatics Ohio State University

Page 1: Bioinformatics Topics Not Covered in this Course  BMI 730

Bioinformatics Topics Not Covered in this Course

BMI 730 Kun Huang

Department of Biomedical Informatics, Ohio State University

Page 2: Bioinformatics Topics Not Covered in this Course  BMI 730

• Non-coding RNA
  - MicroRNA Related Bioinformatics Issues
    - MicroRNA prediction and recognition
    - Secondary structure prediction
    - Target prediction
• Microbial Related Bioinformatics
  - Metagenomics
• Other Omics
• Other Informatics

Page 3: Bioinformatics Topics Not Covered in this Course  BMI 730

Non-coding RNA
• Non-coding DNA
  • Junk DNA
  • Pseudogenes
  • Retrotransposons - Human Endogenous Retroviruses (HERVs)
  • C-value enigma (e.g., Amoeba dubia genome has more than 670 billion bases; pufferfish genome is 1/10 of human genome)
• Findings from ENCODE: nearly the entire genome is transcribed

Page 4: Bioinformatics Topics Not Covered in this Course  BMI 730

Non-coding RNA (ncRNA)
• Any RNA molecule that is not translated into a protein.
• Synonyms: sRNA, npcRNA, nmRNA, snmRNA, fRNA
• Includes tRNA, rRNA, snoRNA, microRNA (miRNA), siRNA, piRNA, long ncRNA (e.g., Xist), shRNA
• Note the difference between siRNA and miRNA

Page 5: Bioinformatics Topics Not Covered in this Course  BMI 730

Non-coding RNA (ncRNA)
• RNA-induced silencing complex (RISC)
• RNA-induced transcriptional silencing (RITS)

Page 6: Bioinformatics Topics Not Covered in this Course  BMI 730

MicroRNA (miRNA)
• Another level of regulation

Page 7: Bioinformatics Topics Not Covered in this Course  BMI 730

MicroRNA (miRNA)

[Figure, panels a-c. Recoverable labels: E2F1, E2F2, E2F3, Myc; mir-17-92 cluster members 17-5p, 17-3p, 18a, 19a, 20a, 19b, 92-1; Myc, E2F, mir-17-92]

Reviewed by: Coller et al. (2008), PLoS Genet 3(8): e146
Figures from Dr. Baltz Agula

Page 8: Bioinformatics Topics Not Covered in this Course  BMI 730

Non-coding RNA

MicroRNA Related Bioinformatics Issues
• Secondary structure prediction
• MicroRNA prediction and recognition
• Target prediction

Databases

Page 9: Bioinformatics Topics Not Covered in this Course  BMI 730

Secondary structure prediction
• Applications
  • RNA folding dynamics
  • ncRNA discovery
  • Microarray probe validation/comparison

Wang et al. Genome Biology 2004 5:R65

Page 10: Bioinformatics Topics Not Covered in this Course  BMI 730

Secondary structure prediction
- Physics-based models
  - Minimizing free energy / dynamic programming / other optimization schemes
  - Parameters come from empirical studies of RNA structural energetics (e.g., nearest-neighbor interactions in stacking base pairs, using synthesized oligonucleotides)
  - Limited by the experimental procedure
  - Scoring models are used
  - Most ignore sequence dependence of hairpin, bulge, internal, and multi-branch loop energies
  - Multi-branch loop energies rely on ad hoc scores
  - Still top performance
  - Mfold, ViennaRNA, PKnots, RDfold, etc.
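The dynamic programming idea behind these tools can be sketched with the Nussinov algorithm, which maximizes the number of base pairs instead of minimizing a measured free energy. This is a deliberately simplified stand-in, not what Mfold or ViennaRNA compute: there are no stacking or loop energies, and the function name `nussinov` and the `min_loop` parameter are illustrative choices.

```python
def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for an RNA sequence.

    Simplified model: every allowed pair scores 1 (no thermodynamic
    parameters). min_loop is the minimum number of unpaired bases
    required between two paired positions.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = maximum number of pairs in seq[i..j]; 0 when i >= j
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback: recover one optimal structure from the filled table.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in pairs:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


print(nussinov("GGGAAAUCC"))
```

Energy-based folding follows the same recursion pattern but replaces the "+1 per pair" score with nearest-neighbor stacking and loop energies.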

Page 11: Bioinformatics Topics Not Covered in this Course  BMI 730

Secondary structure prediction
- Probabilistic approach
  - Stochastic context-free grammars (SCFG), e.g., QRNA
    - Specify grammar rules that induce a joint probability distribution over possible RNA structures and sequences
    - Parameters are easily learned without experiments
    - Parameters may not have physical meanings
    - Performance inferior to physics-based methods
  - Extensions: conditional log-linear model (CLLM), e.g., CONTRAfold
    - Integrates the learning procedure with energy-based scoring systems

Page 12: Bioinformatics Topics Not Covered in this Course  BMI 730

Secondary structure prediction

[Figure panels: CONTRAfold, PKnotRG]

Page 13: Bioinformatics Topics Not Covered in this Course  BMI 730

Secondary structure prediction
- Comparative approach
  - Single-sequence prediction methods (physics-based, SCFG) have difficulty searching all configurations
  - Structures that have been conserved by evolution are far more likely to be the functional form
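One common way to detect evolutionarily conserved structure is covariation: two alignment columns that change together (e.g., G-C in one species, A-U in another) are likely base-paired. The sketch below scores a column pair by mutual information. It is an illustration under simple assumptions (an ungapped toy alignment, no correction for small sample size), and the function name is hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    alignment: list of equal-length aligned sequences (no gap handling).
    High values suggest the two positions co-vary, as paired bases do.
    """
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


# Toy alignment: columns 0 and 4 co-vary as complementary pairs,
# while columns 1-3 are fully conserved.
toy = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
print(column_mutual_information(toy, 0, 4))
print(column_mutual_information(toy, 1, 2))
```

Fully conserved columns carry no covariation signal, which is why comparative methods need alignments with some sequence divergence.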

Page 14: Bioinformatics Topics Not Covered in this Course  BMI 730

MicroRNA prediction and discovery
- Experimental approach - cloning
- MicroRNA array (OSU microarray facility)
- Massive sequencing
  - Select segments in the range of 20-25 nt
  - Using Solexa/SOLiD sequencer
  - Map to genome
  - Enrichment analysis / peak calling
  - Experimental validation
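The first two computational steps of the sequencing route (size selection and mapping) can be sketched as follows. This is a toy version under stated assumptions: reads are plain strings, mapping is exact forward-strand matching only, and the function names are hypothetical. Real pipelines use a short-read aligner and handle both strands and mismatches.

```python
from collections import Counter


def select_small_rna_reads(reads, min_len=20, max_len=25):
    """Keep reads in the 20-25 nt window and collapse identical reads.

    Returns a Counter mapping each distinct read to its copy number.
    """
    kept = [r.upper() for r in reads if min_len <= len(r) <= max_len]
    return Counter(kept)


def map_exact(read, genome):
    """Return all 0-based start positions of exact forward-strand matches."""
    hits = []
    start = genome.find(read)
    while start != -1:
        hits.append(start)
        start = genome.find(read, start + 1)
    return hits


reads = ["ACGTACGTACGTACGTACGTAC", "acgtacgtacgtacgtacgtac", "ACGT", "A" * 30]
counts = select_small_rna_reads(reads)
for read, copies in counts.items():
    print(read, copies)
```

The collapsed read counts per genomic position are the input to the enrichment / peak-calling step.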

Page 15: Bioinformatics Topics Not Covered in this Course  BMI 730

MicroRNA prediction and discovery
- Bioinformatics / machine learning approach

Wang et al. Genome Biology 2004 5:R65

Page 16: Bioinformatics Topics Not Covered in this Course  BMI 730

MicroRNA prediction and discovery
- Bioinformatics / machine learning approach
  - Using evolutionary information

Nam, J.-W. et al. Nucl. Acids Res. 2005 33:3570-3581; doi:10.1093/nar/gki668

Page 17: Bioinformatics Topics Not Covered in this Course  BMI 730

MicroRNA prediction and discovery
- Bioinformatics / machine learning approach
  - Support vector machine / need features
• Features:
  • Sequence features
    • Nucleotide frequency counts
    • Total G/C content
  • Folding features
    • Pairing propensity
    • Minimum free energy (MFE)
  • Topological features
    • Packing ratio
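A few of the features above can be computed directly from a candidate hairpin and its predicted structure. The sketch below is illustrative only: the function name is hypothetical, "pairing propensity" is approximated here as the fraction of paired bases in a dot-bracket string (an assumption, not necessarily the definition used in the cited work), and MFE is left out because it requires a folding program.

```python
def hairpin_features(seq, dot_bracket):
    """Build a small feature dictionary for one miRNA hairpin candidate.

    seq:         RNA sequence (A/C/G/U)
    dot_bracket: predicted secondary structure of the same length
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0 or n != len(dot_bracket):
        raise ValueError("sequence and structure must be non-empty and equal length")
    # Sequence features: nucleotide frequencies and total G/C content
    features = {f"freq_{base}": seq.count(base) / n for base in "ACGU"}
    features["gc_content"] = features["freq_G"] + features["freq_C"]
    # Folding feature (assumed definition): fraction of bases that are paired
    paired = sum(ch in "()" for ch in dot_bracket)
    features["paired_fraction"] = paired / n
    return features


print(hairpin_features("GGGAAAUCC", "(((...)))"))
```

Feature dictionaries like this one, computed for known miRNA hairpins and for negative examples, would form the training matrix for an SVM or another classifier.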

Page 18: Bioinformatics Topics Not Covered in this Course  BMI 730

MicroRNA target prediction
- Experimental / bioinformatics approach
  - BLAST can identify thousands of potential targets. How to pin down the real ones?

Page 19: Bioinformatics Topics Not Covered in this Course  BMI 730

MicroRNA target prediction
- Computational / bioinformatics approach
  - Mutually exclusive transcription pattern between miRNA and its targets
  - Microarray screening
  - Existence of complementary sequence
  - Context score - features
  - Machine learning approaches (e.g., SVM, regression, etc.)

David P. Bartel, "MicroRNAs: Target Recognition and Regulatory Functions", Cell, Volume 136, Issue 2, 215-233, 23 January 2009
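The "complementary sequence" criterion is often implemented as a seed-match scan: look in the target mRNA's 3' UTR for sites complementary to the miRNA seed. The sketch below assumes the seed is miRNA nucleotides 2-8 and requires a perfect Watson-Crick match. The function names, the default seed window, and the toy UTR are illustrative assumptions, and real predictors add context scores and conservation on top of this.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string (A/C/G/U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna, utr, seed_start=2, seed_end=8):
    """Find perfect seed-match sites for a miRNA in a 3' UTR.

    seed_start/seed_end are 1-based, inclusive miRNA positions.
    Returns (site_sequence, list of 0-based start positions in utr).
    """
    seed = mirna[seed_start - 1:seed_end]
    site = reverse_complement(seed)
    positions = []
    pos = utr.find(site)
    while pos != -1:
        positions.append(pos)
        pos = utr.find(site, pos + 1)
    return site, positions


mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # example miRNA sequence
utr = "AAAACUACCUCAAAA"            # toy UTR containing one seed match
print(seed_sites(mirna, utr))
```

A seed scan alone still returns many candidates per miRNA, which is why it is combined with expression data and machine-learned scores.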

Page 20: Bioinformatics Topics Not Covered in this Course  BMI 730

Non-coding RNA

MicroRNA Related Bioinformatics Issues
• Secondary structure prediction
• MicroRNA prediction and recognition
• Target prediction

Databases

Page 21: Bioinformatics Topics Not Covered in this Course  BMI 730

Databases
• MicroRNA.org: http://www.microrna.org/microrna/getMirnaForm.do
• MirBase: http://microrna.sanger.ac.uk
• …

Target prediction
• MIRDB
• TargetScan (http://targetscan.org)
• PicTar (http://pictar.bio.nyu.edu)
• miRanda (part of Sanger database)
• MirTarget
• …

Software
• List at http://en.wikipedia.org/wiki/List_of_RNA_structure_prediction_software

Page 22: Bioinformatics Topics Not Covered in this Course  BMI 730

• Non-coding RNA
  - MicroRNA Related Bioinformatics Issues
    - MicroRNA prediction and recognition
    - Secondary structure prediction
    - Target prediction
• Microbial Related Bioinformatics
  - Metagenomics
• Other Omics
• Other Informatics

Page 23: Bioinformatics Topics Not Covered in this Course  BMI 730

Metagenomics
• Study of genetic material recovered directly from environmental samples
• A community of species, e.g., microbes from the stomach of a cow
• Challenges:
  • Who is there? How many?
  • 16S rRNA: universal primers, highly conserved, used for profiling
    forward: AGA GTT TGA TCC TGG CTC AG
    reverse: ACG GCT ACC TTG TTA CGA CTT
  • Next generation sequencing: more genes (chicken-and-egg)
  • Community metabolism: identify metabolic pathways within the community
• New challenges: comparative study
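To show how the universal 16S primers listed above are used for profiling, here is a minimal in-silico PCR sketch: it looks for the forward primer and, downstream, the reverse complement of the reverse primer on the same strand. It assumes exact matches on a single strand; the function names and the synthetic template are illustrative, and real primer-search tools tolerate mismatches and degenerate bases.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as given on the slide, with spaces removed
FORWARD_16S = "AGAGTTTGATCCTGGCTCAG"
REVERSE_16S = "ACGGCTACCTTGTTACGACTT"


def reverse_complement_dna(dna):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def in_silico_pcr(template, forward=FORWARD_16S, reverse=REVERSE_16S):
    """Return the amplicon bounded by the two primers, or None.

    The reverse primer anneals to the opposite strand, so on the given
    strand we search for its reverse complement downstream of the
    forward primer.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement_dna(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Synthetic template: flanking bases + forward primer + insert + reverse site
template = ("TT" + FORWARD_16S + "ACGTACGT"
            + reverse_complement_dna(REVERSE_16S) + "GG")
print(in_silico_pcr(template))
```

Amplicons recovered this way from many organisms can then be compared against a 16S reference collection to answer "who is there".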

Page 24: Bioinformatics Topics Not Covered in this Course  BMI 730

• Non-coding RNA
  - MicroRNA Related Bioinformatics Issues
    - MicroRNA prediction and recognition
    - Secondary structure prediction
    - Target prediction
• Microbial Related Bioinformatics
  - Metagenomics
• Other Omics
• Other Informatics