Page 1: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 1

21st Century = Biotech Century

• Completion of human genome
• High-throughput microarray and similar devices
• Cloning
• Genetic engineering
• Computational power

Everyone is moving towards Biotech

Page 2: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 2

Explosive growth of biological data

Biology is becoming more computationally intensive: high-throughput bioinformatics, lots of data.

The Molecular Biology Database Collection: 2005 update. Small excerpt from the A's:
• AARSDB: Aminoacyl-tRNA synthetase sequences
• ABCdb: ABC transporters
• AceDB: C. elegans, S. pombe, and human sequences and genomic information
• ACTIVITY: Functional DNA/RNA site activity
• ALFRED: Allele frequencies and DNA polymorphisms

Page 3: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 3

Opportunities for CS

Possibilities for CS contributions:
• Data integration problem
• Data extraction from literature (natural language processing)
• Database issues (including automation)
• Visualization
• Mining large complex data sets

Page 4: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 4

Objective
• Introduction to basic molecular biology for computer science students, by a computer scientist.
• A survey of databases: NCBI, SwissProt, PDB, Transfac, …
• Introduction to computational techniques for analyzing genomics (and proteomics) data

Basic

Page 5: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 5

Communication is important

Page 6: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 6

Textbooks and course website

Required textbooks:
• Molecular Biology of the Cell (main text)
• Bioinformatics, Genomics and Proteomics (lab)
• Other material

References:
• Human Molecular Genetics (2nd edition available for free)
• Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations (The Morgan Kaufmann Series in Data Management Systems) by Ian H. Witten, Eibe Frank (paperback)
• Microarrays for an Integrative Genomics (Computational Molecular Biology) by Isaac S. Kohane, et al. (paperback)
• Molecular Biology Web Book

Course website: http://www.cs.utsa.edu/~kwek/cs6463f05.html

Page 7: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 7

Intended Audience
• CS graduate students who have an interest in bioinformatics or want to explore bioinformatics.
• High school biology.
• Not for students who want to find a filler class in between classes.

Every Tuesday noon to 1pm, Human Genome (HuGe) lab meets to discuss current bioinformatics issues. All are welcome even if you are new to bioinformatics (but are taking this course).

Page 8: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 8

Database Search

Page 9: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 9

Course Organization

Overview of Molecular Biology (and project discussion)

Bioinformatics/Computational Biology:
• Databases
• Data preprocessing
• Classification problem
• Clustering problem
• Microarray analysis
• Sequence alignment
• Hidden Markov Model
• Gene finding
• Motif finding

Molecular Biology:
• Introduction to Cell: 1. Cells and Genomes; 2. Cell Chemistry and Biosynthesis; 3. Proteins
• Basic Genetic Mechanisms: 4. DNA and Chromosomes; 6. From DNA to Protein; 7. Control of Gene Expression
• Diseases: 23. Cancer; 25. Pathogens
• Others: SNP, RNAi

Page 10: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 10

Project Grade distributions

Grade distribution:
• 1 quiz – 10%
• 2 tests – 30%
• Homework and lab – 10%
• Project – 50% (+ 10% bonus)

Project
• Serious in bioinformatics (all HuGe Lab members): mini (NIH-) proposal project. Besides preliminary results, it includes a proposal for future work (i.e. independent studies, theses). Possible collaborations with UTHSCSA and others.
    Specific Aim(s): What do you want to do? Why is it important?
    Background: What has been done previously? (What makes your approach interesting?) Where do you get your data?
    (Preliminary) Result: To elaborate later.
    Future Work: To elaborate later.
• A project: Same as above, except that it does not need to have future work.

Office hours (for projects): By appointment (send me an email 24 hours before). Tu, Th 10-3, 5-7, 8:30-10. W 10:30-noon.

Page 11: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 11

Some Important Dates
• September 13: Quiz 1 (there will be a second-chance quiz)
• September 20: Specific aim of project due. [1 meeting to discuss with me]
• October 27: Test 1
• October 18: Background of project due (you must have already started doing experiments). [2 meetings to discuss with me]
• November 24: Test 2
• December 10: Final report of project. [2 meetings to discuss with me]

IMPORTANT: If you do not meet with me the required number of times, I will not accept your report. Also, the meetings should be at least one week apart.

Page 12: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 12

Your Responsibility

• Read the assigned reading once the material is covered in lecture. Lecture is to make your reading easier.

• Try printing out the slides to take notes.

• Project: Observe the deadline!!!! Come and talk to me.

Page 13: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 13

A. An overview of molecular biology

Read Human Molecular Genetics, Ch. 1.

A.1. Background
A.2. Macromolecules
A.3. DNA structure
A.4. RNA transcription and Gene Expression
A.5. RNA processing
A.6. Translation, post-translation processing and protein structure
A.7. Project ideas

Page 14: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 14

A.1 Background: Prokaryotic and Eukaryotic Cells

Two types of cells:
1. Prokaryotic (bacteria, e.g. E. coli)
2. Eukaryotic (multicellular organisms, amoeba)

Page 15: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 15

A.1 Background: Prokaryotic and Eukaryotic Cells

http://www-class.unl.edu/bios201a/spring97/group6/

Page 16: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 16

A.2. Building Blocks: Chemical Composition of Eukaryotic Cell

• Water [E. coli: 70%, mammalian cell: 70%]
• Macromolecules:
    DNA: deoxyribonucleic acid [E. coli: 1%, mammal: 0.25%]
    RNA: ribonucleic acid [E. coli: 6%, mammal: 1.1%]
    Proteins [E. coli: 15%, mammal: 18%]
• Inorganic ions: Na+, K+, Mg2+, Ca2+, Cl- [E. coli: 1%, mammal: 1%]
• Lipids:
    Phospholipids [E. coli: 2%, mammal: 3%]
    Other lipids [E. coli: -, mammal: 0.2%]
• Polysaccharides [E. coli: 1%, mammal: 0.25%]

Volume: [E. coli: 2 x 10^-12 cm^3, mammal: 4 x 10^-9 cm^3]
Relative volume: [E. coli : mammal = 1 : 2000]

Page 17: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 17

A.2 Building Blocks: Structure of bases, nucleosides and nucleotides

DNA: 'polymer of A, G, T, C'
RNA: 'polymer of A, G, U (replaces T), C'

[Figure: structure of a nucleotide, with the sugar and the base labeled; the bases are grouped into purines and pyrimidines]

Page 18: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 18

A.2. Building Blocks: Common bases found in nucleic acids

Page 19: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 19

A.2 Building Blocks: 20 amino acids

Polypeptides: chains of amino acids

[Figure labels: amino group, carboxyl group]

Page 20: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 20

A.2. Building Blocks: Abbreviation of Amino Acids

Name           Abbr.  Code  Linear structure
Alanine        ala    A     CH3-CH(NH2)-COOH
Arginine       arg    R     HN=C(NH2)-NH-(CH2)3-CH(NH2)-COOH
Asparagine     asn    N     H2N-CO-CH2-CH(NH2)-COOH
Aspartic Acid  asp    D     HOOC-CH2-CH(NH2)-COOH
Cysteine       cys    C     HS-CH2-CH(NH2)-COOH
Glutamic Acid  glu    E     HOOC-(CH2)2-CH(NH2)-COOH
Glutamine      gln    Q     H2N-CO-(CH2)2-CH(NH2)-COOH
Glycine        gly    G     NH2-CH2-COOH
Histidine      his    H     NH-CH=N-CH=C-CH2-CH(NH2)-COOH
Isoleucine     ile    I     CH3-CH2-CH(CH3)-CH(NH2)-COOH
Leucine        leu    L     (CH3)2-CH-CH2-CH(NH2)-COOH
Lysine         lys    K     H2N-(CH2)4-CH(NH2)-COOH
Methionine     met    M     CH3-S-(CH2)2-CH(NH2)-COOH
Phenylalanine  phe    F     Ph-CH2-CH(NH2)-COOH
Proline        pro    P     NH-(CH2)3-CH-COOH
Serine         ser    S     HO-CH2-CH(NH2)-COOH
Threonine      thr    T     CH3-CH(OH)-CH(NH2)-COOH
Tryptophan     trp    W     Ph-NH-CH=C-CH2-CH(NH2)-COOH
Tyrosine       tyr    Y     HO-Ph-CH2-CH(NH2)-COOH
Valine         val    V     (CH3)2-CH-CH(NH2)-COOH
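For CS students, the abbreviation table is essentially a lookup table. The sketch below encodes the three-letter and one-letter codes listed above as a Python dictionary; the names THREE_TO_ONE and to_one_letter are illustrative choices, not part of the course material.

```python
# Three-letter to one-letter amino acid codes, copied from the table above.
THREE_TO_ONE = {
    "ala": "A", "arg": "R", "asn": "N", "asp": "D", "cys": "C",
    "glu": "E", "gln": "Q", "gly": "G", "his": "H", "ile": "I",
    "leu": "L", "lys": "K", "met": "M", "phe": "F", "pro": "P",
    "ser": "S", "thr": "T", "trp": "W", "tyr": "Y", "val": "V",
}


def to_one_letter(residues):
    """Convert a list of three-letter codes (any case) to a one-letter string."""
    return "".join(THREE_TO_ONE[r.lower()] for r in residues)


print(to_one_letter(["Met", "Ala", "Lys", "Trp"]))  # MAKW
```

Protein sequence databases generally store sequences in the one-letter form, which is why this conversion comes up often in practice.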

Page 21: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 21

A.2. Building blocks: Properties of Amino Acids I

http://www.russell.embl-heidelberg.de/aas/aas.html

Page 22: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 22

A.2. Building blocks: Some Terms for describing Properties of Amino Acids

Hydrophobic amino acids are those with side-chains that do not like to reside in an aqueous (i.e. water) environment.

Polar amino acids are those with side-chains that prefer to reside in an aqueous (i.e. water) environment.

Strictly speaking, aliphatic implies that the protein side chain contains only carbon or hydrogen atoms.

A side chain is aromatic when it contains an aromatic ring system.

Page 23: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 23

A.2 Building Blocks: Covalent and Non-covalent Bonds

Covalent bonds: stronger. Nucleic acid and protein polymers are formed by covalent bonds connecting nucleotides and amino acids (respectively) to form a linear backbone.

Non-covalent bonds: weaker and reversible. 4 types:

1. Hydrogen bonds: N–H···O [double-stranded DNA, protein folding, etc.]

2. Ionic bonds: ionic interaction between charged groups, say Na+ and Cl-

3. Van der Waals: optimum attraction between two atoms.

4. Hydrophobic forces: water is a polar molecule, so nonpolar (hydrophobic) groups tend to cluster together away from it.

Page 24: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 24

A. An overview of molecular biology

A.1. Background
A.2. Building Blocks of Macromolecules
A.3. DNA structure
A.4. RNA transcription and Gene Expression
A.5. RNA processing
A.6. Translation, post-translation processing and protein structure
A.7. Project ideas

Page 25: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 25

A.3 DNA Structure: The Phosphodiester Bond

Page 26: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 26

A.3 DNA Structure: base pairing (Watson-Crick Rule).
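The Watson-Crick rule pairs A with T and G with C. A minimal sketch of how this looks in code follows; the function names are illustrative. Because the two strands of DNA run in opposite directions, the partner strand read 5' to 3' is the reverse complement, not just the complement.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the base-by-base complement (same left-to-right order)."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand):
    """Return the partner strand written in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```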

Page 27: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 27

A.3 DNA Structure: DNA is a double-stranded anti-parallel helix

http://www.sumanasinc.com/webcontent/anisamples/molecularbiology/DNA_structure.html

[Figure labels: upstream, downstream, complementary DNA (cDNA)]

If %GC = 40%, what percentage is G? C? A? T?
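The question can be answered from base pairing alone: in double-stranded DNA every G pairs with a C and every A with a T, so G = C and A = T. The sketch below works this out and also computes %GC for a given sequence; the function names are illustrative.

```python
def base_fractions(gc_percent):
    """Split a %GC value into per-base percentages for double-stranded DNA.

    Assumes Watson-Crick pairing, so G == C and A == T.
    """
    g = c = gc_percent / 2
    a = t = (100 - gc_percent) / 2
    return {"G": g, "C": c, "A": a, "T": t}


def gc_content(sequence):
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


print(base_fractions(40))  # {'G': 20.0, 'C': 20.0, 'A': 30.0, 'T': 30.0}
print(gc_content("ATGC"))  # 50.0
```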

Page 28: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 28

A.3 DNA Structure: DNA is a double-stranded anti-parallel helix

Page 29: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 29

A.3 DNA Structure: RNA structure

palindrome

Page 30: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 30

A.3 DNA Structure: Viral Genomes

Highly variable:
• DNA or RNA
• Single-stranded or double-stranded
• Linear or circular
• Segmented and multipartite

Viruses normally replicate in the cytosol. Unusually, retroviruses duplicate themselves in the nucleus (using reverse transcriptase).

Page 31: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 31

A.3 DNA Structure: The Central Dogma

Old 1-directional model

Page 32: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 32

A. An overview of molecular biology

A.1. Background
A.2. Building Blocks of Macromolecules
A.3. DNA structure
A.4. RNA transcription and Gene Expression
A.5. RNA processing
A.6. Translation, post-translation processing and protein structure
A.7. Project ideas

Page 33: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 33

A.4 Transcription and Gene Expression: Transcription

[Figure: transcription of a gene into pre-mRNA.
 Gene (5' to 3'): TFBS, promoter, 5' UTR, start, exon, intron, exon, intron, exon, stop, 3' UTR
 Pre-mRNA (complementary nucleotides): 5' UTR, start, exon, intron, exon, intron, exon, stop, 3' UTR, poly A
 Other labels: cap, nuclear membrane, pore
 Annotations: (1st key); (2nd key, may not be there); TFBS (almost always there); (mostly for non-housekeeping genes)]

TFBS – Transcription factor binding site

Page 34: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 34

A.4 Transcription and Gene Expression: Gene Regulation

DNA: A G T C
RNA: U C A G

http://henge.bio.miami.edu/mallery/movies/transcription.mov

http://www-class.unl.edu/biochem/gp2/m_biology/animation/gene/gene_a2.html
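The A G T C / U C A G pairing on this slide can be written as a small mapping: each DNA template base is paired with its complementary RNA base, with U taking the place of T. This is a minimal sketch with illustrative names; it ignores promoters, strand direction bookkeeping and everything else the polymerase actually has to handle.

```python
# DNA template base -> complementary RNA base (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "G": "C", "T": "A", "C": "G"}


def transcribe_template(template):
    """Pair each base of the DNA template strand with its RNA complement.

    The result is written in the same left-to-right order as the input,
    so a template given 3'-to-5' yields RNA read 5'-to-3'.
    """
    return "".join(DNA_TO_RNA[base] for base in template.upper())


def coding_to_mrna(coding):
    """Equivalent shortcut from the coding (non-template) strand: T becomes U."""
    return coding.upper().replace("T", "U")


print(transcribe_template("AGTC"))  # UCAG
print(coding_to_mrna("ATGGCC"))     # AUGGCC
```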

Page 35: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 35

A.4 Transcription and Gene Expression: RNA Polymerase

There are three classes of RNA polymerases:
• Polymerase I: Localized in the nucleolus. Transcribes rRNA (ribosomal RNA): 28S, 18S and 5.8S rRNA.
• Polymerase II: All protein-coding genes and most snRNAs. Unique in capping and polyadenylation.
• Polymerase III: tRNA, other rRNAs, snRNAs. [The promoter can be downstream]

Pseudogenes (gene fragments): previously were genes.

Only 2% of the human genome encodes proteins.

Page 36: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 36

A.4 Transcription and Gene Expression: Trans- and cis-elements

Cis-element                  DNA sequence        Trans-acting factor
GC box                       GGGCGG              Sp1
TATA box                     TATAA               TFIID (TFIIA stabilizes it)
CAAT box                     CCAAT               Many
TRE                          GTGAGT(A/C)A        AP-1 family (many)
CRE (cAMP response element)  GTGACGT(A/C)A(A/G)  CREB/ATF family

Important: The presence of a pattern does not necessarily mean it is a cis-element.
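The consensus sequences in the table map naturally onto regular expressions, where an alternative such as (A/C) becomes the character class [AC]. The sketch below scans the forward strand only and reports candidate sites; the dictionary and function names are illustrative, and as the note above says, a match is only a candidate, not evidence of a functional cis-element.

```python
import re

# Consensus patterns from the table, with (A/C) written as [AC].
CIS_ELEMENTS = {
    "GC box": "GGGCGG",
    "TATA box": "TATAA",
    "CAAT box": "CCAAT",
    "TRE": "GTGAGT[AC]A",
    "CRE": "GTGACGT[AC]A[AG]",
}


def find_candidate_sites(sequence):
    """Return (element name, 0-based start, matched text) for every match.

    Only the given strand is scanned; a fuller search would also scan
    the reverse complement.
    """
    seq = sequence.upper()
    hits = []
    for name, pattern in CIS_ELEMENTS.items():
        # A lookahead lets overlapping occurrences all be reported.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


for hit in find_candidate_sites("ccGGGCGGttTATAAgg"):
    print(hit)
# ('GC box', 2, 'GGGCGG')
# ('TATA box', 10, 'TATAA')
```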

Page 37: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 37

A.4 Transcription and Gene Expression: Promoters

Numbering starts from +1, not 0.

Page 38: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 38

A.4 Transcription and Gene Expression: Enhancers and Silencers (Transcription Factors)

Many base pairs away

Page 39: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 39

A.4 Transcription and Gene Expression: Tissue Specific Genes

Housekeeping genes: genes encoding histone proteins, ribosomal proteins. Always on.

Tissue- or development-specific (non-housekeeping) genes:
• Transcriptionally inactive chromatin
• Methylation of cytosine, replacing a hydrogen (H) with methyl (CH3)
• Transcription factors' expression levels are low.

Microarrays measure the expression levels of genes

Page 40: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 40

A. An overview of molecular biology

A.1. Background
A.2. Building Blocks of Macromolecules
A.3. DNA structure
A.4. RNA transcription and Gene Expression
A.5. RNA processing
A.6. Translation, post-translation processing and protein structure
A.7. Project ideas

Page 41: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 41

A.4 Transcription and Gene Expression: Transcription

[Figure: transcription of a gene into pre-mRNA, followed by splicing into mRNA.
 Gene (5' to 3'): TFBS, promoter, 5' UTR, start, exon, intron, exon, intron, exon, stop, 3' UTR
 Pre-mRNA (complementary nucleotides): 5' UTR, start, exon, intron, exon, intron, exon, stop, 3' UTR, poly A
 Messenger RNA (mRNA): 5' UTR, start, exon, exon, exon, stop, 3' UTR, poly A
 Other labels: cap, nuclear membrane, pore
 Annotations: (1st key); (2nd key, may not be there); TFBS (almost always there); (mostly for non-housekeeping genes)]

Splicing the introns: http://www.sumanasinc.com/webcontent/anisamples/molecularbiology/mRNAsplicing.html

Page 42: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 42

A.5 RNA Processing: RNA Splicing

[Figure labels: donor, acceptor]

GT-AG spliceosome
AT-AC spliceosome (rare)
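The GT-AG rule (GU-AG in the RNA) says that most introns begin with GT at the donor site and end with AG at the acceptor site. The sketch below removes introns from a pre-mRNA given their coordinates and checks that rule; the function name, the coordinate convention (0-based, end-exclusive) and the toy sequence are assumptions made for illustration, and the rare AT-AC introns are deliberately rejected.

```python
def splice(pre_mrna, introns):
    """Join the exons of a pre-mRNA after removing the given introns.

    introns: list of (start, end) pairs, 0-based and end-exclusive.
    Raises ValueError if an intron does not follow the GU-AG rule.
    """
    seq = pre_mrna.upper().replace("T", "U")  # accept DNA or RNA letters
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = seq[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} is not a GU-AG intron")
        exons.append(seq[position:start])
        position = end
    exons.append(seq[position:])
    return "".join(exons)


# Toy example: exon (6 nt) + intron (12 nt) + exon (6 nt).
pre = "AUGGCC" + "GUAAGUCCUUAG" + "UUUUAA"
print(splice(pre, [(6, 18)]))  # AUGGCCUUUUAA
```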

Page 43: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 43

A.5 RNA Processing: Consensus Sequences at splice donor, acceptor and branch sites

Page 44: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 44

A.5 RNA Processing: Mechanism of RNA Splicing (GU-AG introns)

Spliceosome (5 snRNAs)

http://www.nature.com/nrn/journal/v2/n1/animation/nrn0101_043a_swf_MEDIA1.html

Page 45: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 45

A.5 RNA Processing: 5’ End Capping

Page 46: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 46

A.5 RNA Processing: 3’ end polyadenylated.

Page 47: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 47

A.5 RNA Processing: Functions of 5’ End Cap and Poly A tail

Functions of 5’ end cap

1. Protect mRNA molecules from degradation

2. Facilitate transport to cytoplasm

3. Facilitate RNA splicing

4. Facilitate translation

Function of 3’ end poly(A) tail

1. Facilitate transport to cytoplasm

2. Stabilize the mRNA in the cytoplasm

3. Facilitate translation

Page 48: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 48

A.5 RNA Processing: Example of the human -globin gene

Page 49: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 49

A.5 RNA Processing: Export out of the nucleus

Page 50: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 50

A. An overview of molecular biology

A.1. Background
A.2. Building Blocks of Macromolecules
A.3. DNA structure
A.4. RNA transcription and Gene Expression
A.5. RNA processing
A.6. Translation, post-translation processing and protein structure
A.7. Project ideas

Page 51: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 51

A.5 RNA Processing: The Codon-anticodon Recognition

http://henge.bio.miami.edu/mallery/movies/translation.mov

(almost always) tRNA

Page 52: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 52

A.6 Translation and Post-Translational Processing : Peptide Bond Formation

Page 53: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 53

A.6 Translation and Post-Translational Processing: The Genetic Codes

[Figure labels: N-terminal, C-terminal]

Page 54: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 54

A.6 Translation and Post-Translational Processing: The Genetic Codes

wobble- mitochondrial

64 possible codons: 1 start codon (AUG), 3 stop codons, 20 amino acids.

Signals in mRNAs can lead to alternative interpretation of stop codons: UGA as the 21st amino acid, selenocysteine; UAG as the 22nd amino acid, pyrrolysine.
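The counts on this slide can be checked by building the codon table in code. The sketch below constructs the standard (nuclear) genetic code from the conventional U, C, A, G ordering and translates from the first AUG to the first stop codon. The names are illustrative, and the mitochondrial variations and the selenocysteine and pyrrolysine readings mentioned above are not modeled.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(len(CODON_TABLE))                                        # 64
print(sorted(c for c, aa in CODON_TABLE.items() if aa == "*"))  # ['UAA', 'UAG', 'UGA']
print(translate("GGAUGGCCAAAUGGUAAGG"))                        # MAKW
```

Since 61 sense codons encode only 20 amino acids, most amino acids have several codons, which is the redundancy the codon table shows.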

Page 55: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 55

A.6 Translation and Post-Translational Processing: Multiple Post-Translational Cleavages of Polypeptide Precursors

Page 56: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 56

A.6 Translation and Post-Translational Processing: Protein Secondary Structure

Page 57: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 57

A.6 Translation and Post-Translational Processing: Quaternary

Amino acid sequence → secondary structure → tertiary structure

Page 58: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 58

A.6 Translation and Post-Translational Processing: Quaternary Structure

Page 59: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 59

A.6 Translation and Post-Translational Processing: Disulfide Bridges

Page 60: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 60

A.6 Translation and Post-Translational Processing: Post-translational Modification

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=hmg.table.103

Page 61: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 61

A.6 Translation and Post-Translational Processing: Protein Sorting (Localization)

Protein destination                            (Typical) location and form of signal
Endoplasmic reticulum and secretion from cell  N-terminal peptide of 20 or so very hydrophobic AAs.
Mitochondria                                   N-terminal peptide, α-helix. One side hydrophilic and one side hydrophobic.
Nucleus                                        Internal sequence of amino acids. Often a string of basic amino acids plus prolines; may be bipartite.
Lysosome                                       Addition of mannose 6-phosphate residues

1. Signal Peptide

2. Post-translational modification

Page 62: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 62

A.6 Translation and Post-Translational Processing: Cellular Function of Proteins

Diverse cellular functions:
• Enzymes – 'cut things into pieces'
• Receptors
• Transport
• Transcription factor
• Signaling
• Hormones
• Structural
• etc.

Page 63: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 63

A. An overview of molecular biology

A.1. Background
A.2. Building Blocks of Macromolecules
A.3. DNA structure
A.4. RNA transcription and Gene Expression
A.5. RNA processing
A.6. Translation, post-translation processing and protein structure
A.7. Project ideas

Page 64: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 64

A.7 Summary: Central Dogma Simplified

Enzymes, Receptors,... etc

Page 65: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 65

A.7 Summary: Don’t forget about mitochondria!

Page 66: BiologyOverview.ppt

CS 6463: An overview of Molecular Biology 66

A.7 Summary: Life is more complex