Page 1: Genome

Michael Schroeder, BioTechnological Center, TU Dresden (Biotec)

Genome

Lesk, Introduction to Bioinformatics, Chapter 2

Page 2: Genome

By Michael Schroeder, Biotec, 2004 2

Organisms and cells

• All organisms consist of small cells
• The human body has approx. 6 x 10^13 cells of about 320 different types
• Cell size can vary greatly
    • Human red blood cell: 5 microns (0.005 mm)
    • Neuron from spinal cord: 1 m long
• Two types of organisms
    • Prokaryotes: bacteria, for example
    • Eukaryotes: most other organisms
    • Archaea: few organisms living in hostile environments

Page 3: Genome

Genomes and Genes: Not all DNA codes for genes

Organism                   Number of bp   Genes    Note
ΦX-174                     5,386          10       Virus infecting E. coli
Human mitochondrion        16,569         37       Subcellular organelle
Mycoplasma pneumoniae      816,394        680      Pneumonia
Mycoplasma laboratorium    -              382      Minimal genome project
Haemophilus influenzae     1,830,138      1,738    Middle ear infection
E. coli                    4,639,221      4,406
Saccharomyces cerevisiae   12.1 x 10^6    5,885    Yeast
C. elegans                 95.5 x 10^6    19,099   Worm
Drosophila melanogaster    1.8 x 10^8     13,601   Fruit fly
H. sapiens                 3.2 x 10^9     22,333   Human
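
One way to read the table above is to compute the average number of base pairs per gene. The following is a minimal Python sketch that uses only the figures from the table. The names `GENOMES` and `bp_per_gene` are illustrative assumptions, and the genome sizes given in scientific notation are rounded values.

```python
# Genome size (bp) and gene count, copied from the table above.
# Mycoplasma laboratorium is omitted because no genome size is given.
GENOMES = {
    "PhiX-174": (5_386, 10),
    "Human mitochondrion": (16_569, 37),
    "Mycoplasma pneumoniae": (816_394, 680),
    "Haemophilus influenzae": (1_830_138, 1_738),
    "E. coli": (4_639_221, 4_406),
    "Saccharomyces cerevisiae": (12_100_000, 5_885),
    "C. elegans": (95_500_000, 19_099),
    "Drosophila melanogaster": (180_000_000, 13_601),
    "H. sapiens": (3_200_000_000, 22_333),
}


def bp_per_gene(size_bp: int, genes: int) -> float:
    """Average number of base pairs available per gene."""
    return size_bp / genes


if __name__ == "__main__":
    for name, (size_bp, genes) in GENOMES.items():
        print(f"{name:26s} {bp_per_gene(size_bp, genes):12,.0f} bp per gene")
```

E. coli comes out at roughly one gene per 1,000 bp, while the human genome has roughly one gene per 140,000 bp, which illustrates that not all DNA codes for genes.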

Page 4: Genome

Genetic information

• Genes as discovered by Mendel were entirely abstract entities
• Chromosomes are physical entities, and their banding patterns are their landmarks
    • Chromosomes are numbered by size (1 = largest)
    • Human chromosome: p (petite = short) arm and q (queue) arm, e.g. 15q11.1
• DNA sequences = hereditary information in physical form
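
A band label such as 15q11.1 can be split into its parts mechanically. The sketch below is a hedged illustration: reading the digits as region, band and sub-band follows the usual cytogenetic convention, which the slide does not spell out, and the names `Band` and `parse_band` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Band(NamedTuple):
    chromosome: str          # "1".."22", "X" or "Y"
    arm: str                 # "p" (short arm) or "q" (long arm)
    region: int              # first digit after the arm
    band: int                # second digit after the arm
    sub_band: Optional[str]  # digits after the dot, if any


# Pattern for labels such as "15q11.1". This simplified form is an assumption
# and does not cover ranges or other cytogenetic shorthand.
_BAND_RE = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_band(label: str) -> Band:
    """Split a band label into chromosome, arm, region, band and sub-band."""
    match = _BAND_RE.match(label)
    if match is None:
        raise ValueError(f"not a recognised band label: {label!r}")
    chromosome, arm, region, band, sub_band = match.groups()
    return Band(chromosome, arm, int(region), int(band), sub_band)


if __name__ == "__main__":
    print(parse_band("15q11.1"))
```

For 15q11.1 this yields chromosome 15, long arm (q), region 1, band 1, sub-band 1.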

Page 5: Genome

Locating genes

• Cystic fibrosis has been known since the Middle Ages; the relevant protein was not
• Folklore: "Children with excessive salt in their sweat, noticeable when kissing them on the forehead, were short-lived"
• Implication: chloride channel in epithelial tissues
• Search in family pedigrees identified various genetic markers (Variable Number Tandem Repeats), which narrowed the genomic region from 1-2 million bp to 300 kb
• Finally, the deletion of Phe508 in the CFTR gene was identified as the cause

Page 6: Genome

Chromosome

Page 7: Genome

Chromosome banding pattern map

Page 8: Genome

Chromosome banding pattern map

Page 9: Genome

2 Types of Maps: Physical Map

Genome sequencing projects supply the DNA sequence of each chromosome

The physical distance is the number of base pairs that separate two genes

…ACTGTATGACTGGCATGGCACTGGGGCAAATGTGCACTC…

[Figure: Gene A and Gene B on a chromosome drawn against a physical scale in Mbp]

C. Voigt, S. Ibrahim, S. Möller, P. Serrano Fernández. Non-linear map conversions. German Conference on Bioinformatics, 2003

Page 10: Genome

2 Types of Maps: Genetic Map

• Chromosomes are carriers of genetic information

• Genetic information is linked and linearly arranged inside the chromosome

• This linkage is sometimes broken: recombination (crossing-over)

Page 11: Genome

2 Types of Maps: Genetic Maps

Genes located far from each other are more likely to be uncoupled during a crossing-over

A Morgan is the genetic distance in which 1 crossing-over is expected to occur

[Figure: chromosome drawn against a genetic scale in cM]
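
The definition of the Morgan can be connected to observed recombination fractions through a map function. The sketch below uses Haldane's map function, which is not mentioned on the slides and assumes that crossovers occur independently of each other (no interference). With d the distance in Morgans, the recombination fraction is r = 0.5 (1 - exp(-2d)). The function names are illustrative.

```python
import math


def haldane_recombination_fraction(distance_cm: float) -> float:
    """Expected recombination fraction for a genetic distance in centiMorgans."""
    d = distance_cm / 100.0  # centiMorgans -> Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def haldane_distance_cm(recombination_fraction: float) -> float:
    """Genetic distance in centiMorgans for an observed recombination fraction."""
    if not 0.0 <= recombination_fraction < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # Inverse of the function above: d = -0.5 * ln(1 - 2r) Morgans.
    return -50.0 * math.log(1.0 - 2.0 * recombination_fraction)


if __name__ == "__main__":
    for cm in (1, 10, 50, 100):
        r = haldane_recombination_fraction(cm)
        print(f"{cm:4d} cM -> recombination fraction {r:.3f}")
```

Under this model, short distances give a recombination fraction close to the distance itself (1 cM is about 1%), while large distances approach 0.5, so genes far apart behave as if unlinked.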

Page 12: Genome

Why 2 Types of Maps?

• Historical background

• Different systems provide us with complementary information (not completely redundant)

• Genetic markers may be mapped in only one system (conversions needed)

• Genetic markers may be ambiguous

Page 13: Genome

Expected Map Conversion

Linear relationship

[Figure: expected conversion between physical position (bps) and genetic position (cM)]

Page 14: Genome

Observed Map Conversion

• Non-linear relationship (Yu A, et al. 2001. Nature, 409:951-3)
• Outliers
• Marker ambiguity
• Local marker density
• Inversions

[Figure: expected linear relationship between bps, cM and cR, compared with the observed relation of cM to bps on human chromosome 12]
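
Because the observed relationship is not linear, a single conversion factor is not enough. One simple alternative, not necessarily the method of Voigt et al., is piecewise-linear interpolation between markers that are mapped in both systems. In the sketch below the anchor positions are invented for illustration, and the default rate of 1 cM per Mbp in `bp_to_cm_linear` is only a common rule of thumb for the human genome.

```python
from bisect import bisect_right

# Hypothetical anchor markers mapped in both systems:
# (physical position in bp, genetic position in cM), sorted by bp.
ANCHORS = [
    (0, 0.0),
    (10_000_000, 25.0),
    (50_000_000, 40.0),
    (100_000_000, 110.0),
]


def bp_to_cm(position_bp: float, anchors=ANCHORS) -> float:
    """Interpolate a genetic position (cM) from a physical position (bp)."""
    xs = [bp for bp, _ in anchors]
    if not xs[0] <= position_bp <= xs[-1]:
        raise ValueError("position lies outside the anchored region")
    # Index of the right-hand anchor of the enclosing interval.
    i = min(bisect_right(xs, position_bp), len(xs) - 1)
    (bp0, cm0), (bp1, cm1) = anchors[i - 1], anchors[i]
    fraction = (position_bp - bp0) / (bp1 - bp0)
    return cm0 + fraction * (cm1 - cm0)


def bp_to_cm_linear(position_bp: float, cm_per_mbp: float = 1.0) -> float:
    """Expected conversion under a single, chromosome-wide linear rate."""
    return position_bp / 1_000_000 * cm_per_mbp


if __name__ == "__main__":
    for bp in (5_000_000, 30_000_000, 75_000_000):
        print(bp, bp_to_cm_linear(bp), bp_to_cm(bp))
```

The interpolation follows local recombination rates between anchors, whereas the linear version assumes the same rate everywhere. Outliers, ambiguous markers and inversions would have to be cleaned from the anchor list first, since the sketch assumes that both coordinates increase together.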

Page 15: Genome

General Properties

Gene density and recombination

Recombination is mostly higher in areas with a high gene density.

[Figure: cM against bps for human chromosome 12, with regions of high recombination and high gene density marked. Yao, et al. (2002) Proc Natl Acad Sci 99(9):6157]

Page 16: Genome

How to Detect Genes?

• Detection of regions similar to known coding regions from other organisms
    • Gene expressed (in another organism): mRNA → cDNA = EST (Expressed Sequence Tags); search for start of EST
• Ab initio: derive gene from sequence itself
    • Bacteria: easy, as genes are contiguous
    • Eukaryotes: problem of alternative splicing
    • Initial exon: search for TATA box ~30 bp upstream, no in-frame stop codon, ends before GT splice signal
    • Internal exon: starts after AG splice signal, no in-frame stop codons, ends before GT splice signal
    • Final exon: followed by polyadenylation
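
For the bacterial case, where genes are contiguous, the ab initio idea can be sketched as a scan for open reading frames. This is a deliberately simplified illustration and not how production gene finders work: it assumes ATG as the only start codon, the three standard stop codons, the forward strand only, and an arbitrary minimum length. The function name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(sequence: str, min_codons: int = 3):
    """Return (start, end) index pairs of ORFs on the forward strand.

    start is the index of the A in ATG; end is one past the stop codon.
    min_codons counts the start codon but not the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk through the sequence codon by codon in this reading frame.
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    dna = "CCATGAAATTTGGGTAACC"
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])
```

The eukaryotic case is harder because exons are interrupted by introns. A scan of this kind would additionally have to look for the AG and GT splice signals listed above and chain candidate exons together.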

Page 17: Genome

Brent, Nat Biotech, 2007

Page 18: Genome

How to detect genes: De novo prediction

• GenScan (late 90s)
    • Predicts 10% of ORFs in the human genome
    • Overprediction: 45,000 genes (current estimate ~22,000)
• TwinScan (early 2000s)
    • Uses alignment between the target and a related genome: detects one third of ORFs in the human genome
• N-Scan
    • Includes pseudogene detection
    • Predicts 20,138 genes

Page 19: Genome

Applications

Genetic diversity and anthropology

• Cheetahs are very closely related to each other, pointing to a population bottleneck 10,000 years ago
• Humans: mitochondrial DNA is passed on through the maternal line, the Y chromosome from father to son
• Variation in mitochondrial DNA in humans suggests a single maternal ancestor 140,000-200,000 years ago
• The population of Iceland (first inhabited 1100 years ago) is descended from Scandinavian males and from females from Scandinavia and the British Isles
• Basques are linguistically and genetically isolated

Page 20: Genome

Evolution of Genomes

Phylogenetic profiles

• What genes do different phyla share?
• What homologous proteins do different phyla share?
• What functions do different phyla share?
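
A phylogenetic profile can be represented as a presence/absence vector per gene family across genomes. The sketch below is a minimal illustration of that idea. The genome labels, family names and assignments are invented and carry no biological meaning, and the function names are hypothetical.

```python
# Hypothetical presence/absence data: which gene families occur in which genome.
# The family names and assignments are invented for illustration only.
GENOME_FAMILIES = {
    "bacterium": {"famA", "famB", "famC"},
    "archaeon": {"famA", "famC", "famD"},
    "eukaryote": {"famA", "famD", "famE"},
}


def phylogenetic_profiles(genome_families):
    """Map each family to a tuple of 0/1 flags, one per genome (sorted by name)."""
    genomes = sorted(genome_families)
    all_families = set().union(*genome_families.values())
    return {
        family: tuple(int(family in genome_families[g]) for g in genomes)
        for family in sorted(all_families)
    }


def shared_by_all(genome_families):
    """Families present in every genome."""
    return set.intersection(*genome_families.values())


if __name__ == "__main__":
    for family, profile in phylogenetic_profiles(GENOME_FAMILIES).items():
        print(family, profile)
    print("shared by all:", shared_by_all(GENOME_FAMILIES))
```

Families with the same profile occur in the same set of genomes, and the families shared by all genomes are candidates for universally required functions.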

Page 21: Genome

Shared functions of bacteria, archaea, and eucarya

Functions shared by Haemophilus influenzae (bacteria), Methanococcus jannaschii (archaea), Saccharomyces cerevisiae (eucarya)

• Energy
    • Biosynthesis of cofactors, amino acids
    • Central and intermediary metabolism
    • Energy metabolism
    • Fatty acids and phospholipids
    • Nucleotide biosynthesis
    • Transport
• Information
    • Replication
    • Transcription
    • Translation
• Communication and regulation
    • Regulatory functions
    • Cell envelope/cell wall
    • Cellular processes

Can we construct a minimal organism?

Page 22: Genome

Summary

• Relation of DNA, genes and chromosomes
• Relationship of distance in Morgan and base pairs
• How to find genes in DNA
    • By similarity
    • Ab initio, with introns, exons, alternative splicing

Read Lesk, chapter 2