Michael Schroeder, Biotechnological Center, TU Dresden (Biotec)
Genome
Lesk, Introduction to Bioinformatics,
Chapter 2
By Michael Schroeder, Biotec, 2004
Organisms and cells
• All organisms consist of small cells
• Human body has approx. 6 x 10^13 cells of about 320 different types
• Cell size can vary greatly
  - Human red blood cell: 5 microns (0.005 mm)
  - Neuron from spinal cord: 1 m long
• Two types of organisms
  - Prokaryotes: bacteria, for example
  - Eukaryotes: most other organisms
• Archaea: a third group of few organisms living in hostile environments
Genomes and Genes: Not all DNA codes for genes

Organism                   Number of bp   Genes    Notes
ΦX-174                     5,386          10       Virus infecting E. coli
Human mitochondrion        16,569         37       Subcellular organelle
Mycoplasma pneumoniae      816,394        680      Pneumonia
Mycoplasma laboratorium    -              382      Minimal genome project
Haemophilus influenzae     1,830,138      1,738    Middle ear infection
E. coli                    4,639,221      4,406
Saccharomyces cerevisiae   12.1 x 10^6    5,885    Yeast
C. elegans                 95.5 x 10^6    19,099   Worm
Drosophila melanogaster    1.8 x 10^8     13,601   Fruit fly
H. sapiens                 3.2 x 10^9     22,333   Human
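One way to see that not all DNA codes for genes is to divide genome size by gene count. The sketch below does that arithmetic for the rows of the table that list both numbers. The helper name `bp_per_gene` and the ASCII key `PhiX-174` are illustrative choices, not part of the slides.

```python
# Genome sizes (bp) and gene counts copied from the table on this slide.
# Mycoplasma laboratorium is left out because no genome size is listed.
GENOMES = {
    "PhiX-174": (5_386, 10),
    "Human mitochondrion": (16_569, 37),
    "Mycoplasma pneumoniae": (816_394, 680),
    "Haemophilus influenzae": (1_830_138, 1_738),
    "E. coli": (4_639_221, 4_406),
    "Saccharomyces cerevisiae": (12.1e6, 5_885),
    "C. elegans": (95.5e6, 19_099),
    "Drosophila melanogaster": (1.8e8, 13_601),
    "H. sapiens": (3.2e9, 22_333),
}


def bp_per_gene(size_bp, genes):
    """Average number of base pairs in the genome per annotated gene."""
    return size_bp / genes


# Print one line per organism, smallest genome first.
for name, (size_bp, genes) in sorted(GENOMES.items(), key=lambda kv: kv[1][0]):
    print(f"{name:26s} {bp_per_gene(size_bp, genes):12,.0f} bp per gene")
```

The bacterial genomes come out at roughly one gene per thousand base pairs, while the human genome has more than a hundred thousand base pairs per gene.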
Genetic information
• Genes, as discovered by Mendel, were entirely abstract entities
• Chromosomes are physical entities, and their banding patterns are their landmarks
  - Chromosomes are numbered by size (1 = largest)
  - Human chromosome arms: p (petite = short) and q (queue = long), e.g. 15q11.1
• DNA sequences = hereditary information in physical form
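A band location such as 15q11.1 can be split into its parts mechanically. The following is a minimal sketch that handles only simple locations of the form chromosome, arm, region, band, optional sub-band. The function name `parse_band` and the returned field names are hypothetical, and ranges such as 15q11-q13 are not covered.

```python
import re

# chromosome (1-22, X or Y), arm (p or q), one region digit, one band digit,
# and an optional sub-band after the dot.
BAND_RE = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_band(location):
    """Split a simple band location like '15q11.1' into named parts."""
    match = BAND_RE.match(location)
    if match is None:
        raise ValueError(f"not a simple band location: {location!r}")
    chrom, arm, region, band, sub_band = match.groups()
    if chrom.isdigit() and not 1 <= int(chrom) <= 22:
        raise ValueError(f"no such human chromosome: {chrom}")
    return {
        "chromosome": chrom,
        "arm": "short" if arm == "p" else "long",
        "region": int(region),
        "band": int(band),
        "sub_band": sub_band,  # kept as text; None when absent
    }


print(parse_band("15q11.1"))
```

The sub-band is kept as text because it can have more than one digit.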
Locating genes
• The disease cystic fibrosis has been known since the Middle Ages, but the relevant protein was not
• Folklore: "Children with excessive salt in their sweat, noticeable when kissing them on the forehead, were short-lived"
• Implication: chloride channel in epithelial tissues
• Searches in family pedigrees identified various genetic markers (Variable Number Tandem Repeats), which narrowed the genomic region first from 1-2 million bp to 300 kb
• Finally, the deletion of Phe508 in the CFTR gene was identified as the cause
Chromosome
Chromosome banding pattern map
2 Types of Maps: Physical Map
Genome sequencing projects supply the DNA sequence of each chromosome
The physical distance is the number of base pairs that separate two genes
…ACTGTATGACTGGCATGGCACTGGGGCAAATGTGCACTC…
[Figure: physical map of a chromosome with a scale in Mbp; Gene A and Gene B are marked]
C. Voigt, S. Ibrahim, S. Möller, P. Serrano Fernández. Non-linear map conversions. German Conference on Bioinformatics, 2003
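The physical distance between two genes is a subtraction of coordinates. The sketch below assumes 1-based, inclusive start and end positions on the same chromosome. The `Gene` class, the function name and the coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chromosome: str
    start: int  # 1-based position of the first base
    end: int    # 1-based position of the last base (inclusive)


def physical_distance_bp(a, b):
    """Number of base pairs lying between two genes on one chromosome.

    Returns 0 when the genes overlap or are directly adjacent.
    """
    if a.chromosome != b.chromosome:
        raise ValueError("genes are on different chromosomes")
    first, second = sorted((a, b), key=lambda g: g.start)
    return max(0, second.start - first.end - 1)


# Made-up coordinates, not taken from the figure.
gene_a = Gene("Gene A", "12", 50_000_001, 50_020_000)
gene_b = Gene("Gene B", "12", 100_000_001, 100_030_000)
print(physical_distance_bp(gene_a, gene_b))
```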
2 Types of Maps: Genetic Map
• Chromosomes are carriers of genetic information
• Genetic information is linked and linearly arranged inside the chromosome
• This linkage is sometimes broken: recombination (crossing-over)
2 Types of Maps: Genetic Maps
• Genes located far from each other are more likely to be uncoupled during a crossing-over
• A Morgan is the genetic distance over which 1 crossing-over is expected to occur; 1 centimorgan (cM) is 0.01 Morgan
[Figure: genetic map of a chromosome with marker positions in cM]
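The definition of the Morgan translates directly into arithmetic: a distance of d Morgan means d crossovers are expected between the two loci. To turn that into the chance that two genes are actually uncoupled, the sketch below uses Haldane's map function. That function is an assumption added here, not something the slides state, and it presumes crossovers occur independently of each other. All function names are illustrative.

```python
import math


def centimorgans_to_morgans(distance_cm):
    """Convert a genetic distance from centimorgans to Morgans."""
    return distance_cm / 100.0


def expected_crossovers(distance_cm):
    """Expected number of crossovers between two loci, by definition of the Morgan."""
    return centimorgans_to_morgans(distance_cm)


def haldane_recombination_fraction(distance_cm):
    """Probability that two loci end up uncoupled (Haldane's map function).

    Assumes independent crossovers; the value approaches 0.5 for distant loci.
    """
    d = centimorgans_to_morgans(distance_cm)
    return 0.5 * (1.0 - math.exp(-2.0 * d))


for cm in (1, 10, 50, 100):
    print(cm, expected_crossovers(cm), round(haldane_recombination_fraction(cm), 4))
```

For short distances the recombination fraction is close to the distance in Morgans; for long distances it levels off below 0.5, which is why widely separated genes behave almost as if they were unlinked.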
Why 2 Types of Maps?
• Historical background
• Different systems provide complementary (not completely redundant) information
• Genetic markers may be mapped in only one system (conversions needed)
• Genetic markers may be ambiguous
Expected Map Conversion
[Figure: expected linear relationship between physical position (bps) and genetic position (cM)]
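Under a linear relationship, converting between the two maps is a single multiplication. The sketch below assumes a constant rate of 1 cM per Mbp as its default. That figure is only a rough placeholder chosen for illustration, it is not taken from the slides, and the function names are hypothetical.

```python
def bp_to_cm_linear(position_bp, cm_per_mbp=1.0):
    """Convert a physical position (bp) to a genetic position (cM) at a constant rate."""
    return position_bp / 1_000_000 * cm_per_mbp


def cm_to_bp_linear(position_cm, cm_per_mbp=1.0):
    """Inverse conversion; the rate must be positive."""
    if cm_per_mbp <= 0:
        raise ValueError("cm_per_mbp must be positive")
    return position_cm / cm_per_mbp * 1_000_000


print(bp_to_cm_linear(50_000_000))  # position of a marker at 50 Mbp, in cM
print(cm_to_bp_linear(110.0))       # position of a marker at 110 cM, in bp
```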
Observed Map Conversion
• Non-linear relationship (Yu A, et al. 2001. Nature, 409:951-3)
• Outliers
• Marker ambiguity
• Local marker density
• Inversions
[Figure: observed relationship between physical position (bps) and genetic position (cM) on human chromosome 12, compared with a linear relationship; axes in bps / cM / cR]
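When the relationship is not linear, one simple option is to interpolate between markers that are mapped in both systems. The sketch below does piecewise-linear interpolation over such anchor markers and rejects anchor sets whose order differs between the maps, which is how an inversion or an outlier would show up. This is an illustrative approach, not the conversion method of the cited work, and the anchor positions are made up.

```python
from bisect import bisect_right


def make_bp_to_cm(anchors):
    """Build a bp -> cM converter from (position_bp, position_cm) anchor markers."""
    points = sorted(anchors)
    if len(points) < 2:
        raise ValueError("need at least two anchor markers")
    xs = [bp for bp, _ in points]
    ys = [cm for _, cm in points]
    if any(later <= earlier for earlier, later in zip(xs, xs[1:])):
        raise ValueError("two markers share the same physical position")
    if any(later < earlier for earlier, later in zip(ys, ys[1:])):
        raise ValueError("marker order differs between maps (inversion or outlier?)")

    def convert(position_bp):
        if not xs[0] <= position_bp <= xs[-1]:
            raise ValueError("position lies outside the anchored region")
        # Index of the anchor to the right of the position, clamped to the last one.
        i = min(bisect_right(xs, position_bp), len(xs) - 1)
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (position_bp - x0) / (x1 - x0)

    return convert


# Hypothetical anchors: recombination is high in the first segment, low in the second.
bp_to_cm = make_bp_to_cm(
    [(0, 0.0), (50_000_000, 70.0), (100_000_000, 78.0), (130_000_000, 110.0)]
)
print(bp_to_cm(25_000_000), bp_to_cm(75_000_000))
```

The slope of each segment is the local recombination rate in cM per bp, so regions of high recombination appear as steep segments.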
General Properties
Gene density and recombination
• Recombination is mostly higher in areas with a high gene density.
[Figure: human chromosome 12, physical position (bps) against genetic position (cM), with regions of high recombination and high gene density marked]
Yao, et al. (2002) Proc Natl Acad Sci 99(9):6157
How to Detect Genes?
• Detect regions similar to known coding regions from other organisms
  - Gene expressed (in another organism) -> mRNA -> cDNA = EST (Expressed Sequence Tag)
  - Search for the start of the EST
• Ab initio: derive the gene from the sequence itself
  - Bacteria: easy, as genes are contiguous
  - Eukaryotes: the problem is alternative splicing
  - Initial exon: search for TATA box ~30 bp upstream, no in-frame stop codon, ends before GT splice signal
  - Internal exon: starts after AG splice signal, no in-frame stop codons, ends before GT splice signal
  - Final exon: followed by polyadenylation
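For the bacterial case, where genes are contiguous, ab initio detection can start from open reading frames: a start codon followed in-frame by codons up to the first stop codon. The following is a minimal sketch under simplifying assumptions: it scans the forward strand only, uses ATG as the only start codon, and ignores splicing entirely, so it does not apply to the exon rules for eukaryotes. The function name and the `min_codons` parameter are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(dna, min_codons=3):
    """Return (start, end) pairs of open reading frames on the forward strand.

    Positions are 0-based with an exclusive end. Each ORF begins at the first
    ATG seen in its reading frame and ends with the first in-frame stop codon,
    which is included in the reported span.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next start codon in this frame
    return sorted(orfs)


sequence = "CCATGAAATTTTAGGG"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])
```

A real gene finder would also scan the reverse complement and score candidates statistically; this sketch only shows the in-frame stop codon test that the exon rules on the slide rely on.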
Brent, Nat Biotech, 2007
How to detect genes: De novo prediction
• GenScan (late 90s)
  - Predicts 10% of ORFs in the human genome
  - Overprediction: 45,000 genes (~22,000 is the current estimate)
• TwinScan (early 2000s)
  - Uses alignment between the target and a related genome
  - Detects one third of ORFs in the human genome
• N-Scan
  - Includes pseudogene detection
  - Predicts 20,138 genes
Applications
• Genetic diversity and anthropology
  - Cheetahs are very closely related to each other, pointing to a population bottleneck 10,000 years ago
  - Humans: mitochondrial DNA is passed on through the maternal line, the Y chromosome from father to son
  - Variation in mitochondrial DNA in humans suggests a single maternal ancestor 140,000-200,000 years ago
  - Population of Iceland (first inhabited 1100 years ago) descended from Scandinavian males and from females from Scandinavia and the British Isles
  - Basques are linguistically and genetically isolated
Evolution of Genomes
• Phylogenetic profiles
  - What genes do different phyla share?
  - What homologous proteins do different phyla share?
  - What functions do different phyla share?
Shared functions of bacteria, archaea, and eucarya
Functions shared by Haemophilus influenzae (bacteria), Methanococcus jannaschii (archaea), Saccharomyces cerevisiae (eucarya):
• Energy
  - Biosynthesis of cofactors, amino acids
  - Central and intermediary metabolism
  - Energy metabolism
  - Fatty acids and phospholipids
  - Nucleotide biosynthesis
  - Transport
• Information
  - Replication
  - Transcription
  - Translation
• Communication and regulation
  - Regulatory functions
  - Cell envelope/cell wall
  - Cellular processes
Can we construct a minimal organism?
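The question of which functions different phyla share can be phrased as a set intersection, and a phylogenetic profile as a presence/absence pattern across genomes. The sketch below shows both. The shared categories are taken from this slide, while the genome-specific entries are placeholders invented for illustration, as are the function names.

```python
# Categories listed on the slide as shared by all three organisms (a subset).
SHARED_ON_SLIDE = {
    "Energy metabolism",
    "Nucleotide biosynthesis",
    "Transport",
    "Replication",
    "Transcription",
    "Translation",
    "Regulatory functions",
}

# One set of functions per genome; the extra entries are hypothetical placeholders.
PROFILES = {
    "Haemophilus influenzae": SHARED_ON_SLIDE | {"placeholder: bacteria-only function"},
    "Methanococcus jannaschii": SHARED_ON_SLIDE | {"placeholder: archaea-only function"},
    "Saccharomyces cerevisiae": SHARED_ON_SLIDE | {"placeholder: eucarya-only function"},
}


def shared_functions(profiles):
    """Functions present in every genome of the collection."""
    sets = [set(functions) for functions in profiles.values()]
    return set.intersection(*sets) if sets else set()


def phylogenetic_profile(function, profiles):
    """Presence/absence of one function across the genomes, in dictionary order."""
    return tuple(function in functions for functions in profiles.values())


print(sorted(shared_functions(PROFILES)))
print(phylogenetic_profile("Translation", PROFILES))
print(phylogenetic_profile("placeholder: archaea-only function", PROFILES))
```

The intersection is one crude starting point for the minimal-organism question: functions found in all three domains are candidates for being indispensable.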
Summary
• Relation of DNA, genes and chromosomes
• Relationship of distance in Morgan and base pairs
• How to find genes in DNA
  - By similarity
  - Ab initio, with introns, exons, alternative splicing
• Read Lesk, chapter 2