
Genetica per Scienze Naturali (Genetics for Natural Sciences), academic year 2003-04, Prof. S. Presciuttini


1. What is a gene?

Definition: A gene is a discrete unit of DNA (or RNA in some viruses) that encodes a nucleic acid or protein product that contributes to or influences the phenotype of the cell or the organism.

Genes are the functional units of chromosomal DNA. Each gene not only encodes the structure of some cellular product, but also bears control elements (short sequences) that determine when, where, and how much of that product is synthesized. Most genes encode protein products; special classes of genes encode RNA molecules.

The way genes encode proteins is indirect and involves several steps. The first step is to copy (transcribe) the information encoded in the DNA of the gene as a related but single-stranded molecule called messenger RNA. Subsequently the information in the messenger RNA is translated (decoded) into a string of amino acids called a polypeptide. The polypeptides, on their own or by aggregating with other polypeptides and cell constituents, form the functional proteins of the cell.
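The two steps described above (transcription, then translation) can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a made-up 12-base coding strand; the function names `transcribe` and `translate` are illustrative and are not part of the course material.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon).replace("T", "U"): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA that carries the same information as the coding strand (T -> U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Decode the mRNA from the first AUG up to the first stop codon (one-letter amino acids)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the polypeptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical toy gene: ATG GCC TTT TAA
mrna = transcribe("ATGGCCTTTTAA")
polypeptide = translate(mrna)
```

Here `mrna` is the single-stranded copy and `polypeptide` is the decoded string of amino acids (Met-Ala-Phe in this toy case). Real genes also need the control elements mentioned above, which this sketch ignores.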


2. Introns and exons

Trying to pinpoint precisely what genes are is complicated by the fact that many eukaryotic genes contain mysterious segments of DNA, called introns, interspersed in the transcribed region of the gene. Introns do not contain information for a functional gene product such as a protein. They are transcribed together with the coding regions (called exons) but are then excised from the initial transcript.

Since the correct sequence in the introns (as well as in the regulatory region) is necessary in order to generate a properly sized transcript at the right time and place, introns (along with coding and regulatory regions) should be considered part of the overall functional unit, in other words, part of the gene.
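As a rough illustration of excision, the sketch below joins the exons of an initial transcript and discards everything in between. The transcript, the exon coordinates, and the function name `splice` are all hypothetical; real splicing depends on sequence signals inside the introns, which is exactly why those sequences must be correct.

```python
def splice(initial_transcript: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments, given as 0-based half-open (start, end) pairs."""
    return "".join(initial_transcript[start:end] for start, end in sorted(exons))


# Hypothetical transcript: exon 1 (6 bases), one intron (10 bases), exon 2 (6 bases).
exon1, intron, exon2 = "AUGGCC", "GUAAGUCCAG", "UUUUAA"
initial_transcript = exon1 + intron + exon2
exon_coordinates = [(0, 6), (16, 22)]

mature_mrna = splice(initial_transcript, exon_coordinates)
intron_length = len(initial_transcript) - len(mature_mrna)
```

If an exon boundary were shifted by even one base, the mature mRNA would change length and the downstream codons would be read in a different frame.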


4. Schematic gene structure

Generalized gene structure in prokaryotes and eukaryotes. The coding region (dark green) is the region that contains the information for the structure of the gene product (usually a protein). The adjacent regulatory regions (lime green) contain sequences that are recognized and bound by proteins that make the gene's RNA and by proteins that influence the amount of RNA made.


3. The average length of coding regions

Estimates of the average length of polypeptide chains coded by genes of various organisms; these values have to be multiplied by 3 in order to obtain the length of the corresponding coding DNA. Typical values are 1,000 to 1,500 bp.

Organism                               Average length of gene product (aa)
Vibrio cholerae (bacterium)            304
Saccharomyces cerevisiae (yeast)       477
Drosophila melanogaster (fruit fly)    492
Caenorhabditis elegans (nematode)      436
Arabidopsis thaliana (weed)            435
Homo sapiens                           497
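The multiplication by 3 can be carried out directly on the values in the table. This is a small sketch; the dictionary name and the decision to ignore the stop codon (3 extra bp per gene) are assumptions made here for simplicity.

```python
# Average polypeptide lengths (amino acids), copied from the table above.
AVERAGE_PRODUCT_LENGTH_AA = {
    "Vibrio cholerae": 304,
    "Saccharomyces cerevisiae": 477,
    "Drosophila melanogaster": 492,
    "Caenorhabditis elegans": 436,
    "Arabidopsis thaliana": 435,
    "Homo sapiens": 497,
}

BASES_PER_CODON = 3  # each amino acid is specified by one three-base codon

# Length of the corresponding coding DNA in base pairs (stop codon not counted).
coding_length_bp = {
    organism: BASES_PER_CODON * length_aa
    for organism, length_aa in AVERAGE_PRODUCT_LENGTH_AA.items()
}

for organism, length_bp in coding_length_bp.items():
    print(f"{organism:28s} {length_bp:5d} bp")
```

For example, 497 aa for Homo sapiens corresponds to 1,491 bp of coding DNA, and 304 aa for Vibrio cholerae to 912 bp.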


5. Number of introns-exons per gene

Many eukaryotic genes contain mysterious segments of DNA, called introns, interspersed in the transcribed region of the gene. Introns do not contain information for a functional gene product such as a protein.

Distribution of the number of exons among genes of three organisms


6. Genomes and genes

Genome                                Group              Size (kb)    Number of genes

Eukaryotic nucleus
  Saccharomyces cerevisiae            Yeast                 13,500              6,000
  Caenorhabditis elegans              Nematode             100,000             13,500
  Arabidopsis thaliana                Plant                120,000             25,000
  Homo sapiens                        Human              3,000,000            100,000

Prokaryote
  Escherichia coli                    Bacterium              4,700              4,000
  Haemophilus influenzae              Bacterium              1,830              1,703
  Methanococcus jannaschii            Archaeon               1,660              1,738

Viruses
  T4                                  Bacterial virus          172                300
  HCMV (herpes group)                 Human virus              229                200

Eukaryotic organelles
  S. cerevisiae mitochondria          Yeast                     78                 34
  H. sapiens mitochondria             Human                     17                 37
  Marchantia polymorpha chloroplast   Liverwort                121                136

The number of genes increases with genome size, but the trend is complicated by repetitive DNA and introns.

Counting genes is difficult, even in completely sequenced genomes.
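One way to see how loosely gene number tracks genome size is to divide each genome size by its gene count. The sketch below does this for a few rows of the table above; the helper name `kb_per_gene` is an assumption, and the gene counts are the estimates given in the table, not current values.

```python
# (genome size in kb, number of genes), copied from the table above.
GENOMES = {
    "Saccharomyces cerevisiae": (13_500, 6_000),
    "Caenorhabditis elegans": (100_000, 13_500),
    "Arabidopsis thaliana": (120_000, 25_000),
    "Homo sapiens": (3_000_000, 100_000),
    "Escherichia coli": (4_700, 4_000),
    "Haemophilus influenzae": (1_830, 1_703),
}


def kb_per_gene(size_kb: float, gene_count: int) -> float:
    """Average amount of genome (kb) per gene; larger values mean lower gene density."""
    return size_kb / gene_count


# List the genomes from the most gene-dense to the least gene-dense.
for name, (size_kb, gene_count) in sorted(
    GENOMES.items(), key=lambda item: kb_per_gene(*item[1])
):
    print(f"{name:26s} {kb_per_gene(size_kb, gene_count):6.2f} kb per gene")
```

With these figures the bacteria come out at roughly 1 kb per gene, yeast at 2.25 kb, and the human genome at 30 kb, which is consistent with the remark that repetitive DNA and introns inflate the larger genomes.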


7. Average gene length

Intron/exon statistics for various organisms


8. Plasmid genomes

Bacterial cells isolated from nature often contain small DNA elements that are not essential for the basic operation of the bacterial cell. These elements are called plasmids. Plasmids are symbiotic molecules that cannot survive at all outside of cells. Even though plasmids are not part of the basic operational system of their host cells, some are quite complex, carrying many genes, so it is quite appropriate to refer to their distinctive DNA as a "plasmid genome." Bacterial plasmids often contain genes that are extremely useful to the bacterial host, for example, by promoting bacterial cell fusion, conferring antibiotic resistance, or producing toxins.

Plasmids also are occasionally found in fungal and plant cells. Most are found inside mitochondria and chloroplasts, but some are found in nuclei or in the cytosol. Unlike the bacterial plasmids mentioned above, these eukaryotic plasmids seem to provide no benefits for their hosts; they seem to exist selfishly, only for the purpose of their own propagation.

For their replication and maintenance, plasmids depend on the general cellular machinery encoded by the host genome. Bacterial plasmids are most often circular, but there are linear types too. In fungi and plants, linear plasmids are most common, but circular types are known in fungi.


9. Organellar genomes

Mitochondrial and chloroplast chromosomes consist of double-stranded DNA molecules. Individual mitochondria and chloroplasts contain multiple identical copies of their chromosomes, and each eukaryotic cell contains several to many of these organelles.

The organelle chromosomes contain genes specific to the functions of the organelle concerned. Nevertheless, most of the biological functions that occur inside these organelles are specified by genes in the nuclear genome. There is no overlap with the nuclear genome in gene content.

Mitochondria and chloroplasts probably were originally prokaryotic cells that entered and took up a symbiotic relationship inside another cell. Throughout evolution most of the original prokaryotic genes were transferred to the nuclear genome or lost.

Mitochondrial genomes can be eliminated in some organisms such as yeasts, but most organisms cannot survive without them, so there is still mutual interdependence between nuclear and organelle subdivisions of the genome. Chloroplasts can be eliminated only in photosynthetic organisms that can survive by taking in preformed nutrients from the environment (that is, that can act as heterotrophs).


10. Most eukaryotic DNA does not include genes

Between genes there is DNA, mostly of unknown function. The size and nature of this DNA vary with the genome.

In bacteria and fungi there is little, but in mammals the intergenic regions can be huge.

Sequences of DNA that exist quite distant from a given gene can affect the regulation of that gene. They could thus be considered part of the functional gene unit, even though separated by long segments of DNA having nothing to do with the gene in question.

In many eukaryotes some of the DNA between genes is repetitive, consisting of several different types of units repeated throughout the genome. Some of the repetitive DNA is dispersed; some is found in contiguous "tandem" arrays. Repetitive DNA is also found in some introns. The extent of this DNA is different in different species, and indeed there is variation of repeat number within species.
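To make the idea of a tandem array and of repeat-number variation concrete, here is a small sketch that counts the longest run of back-to-back copies of a repeat unit. The two sequences are invented "alleles" that differ only in repeat number, and the function name `longest_tandem_run` is an assumption; real repeat detection has to cope with imperfect copies, which this does not.

```python
def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of contiguous (tandem) copies of `unit` in `sequence`."""
    unit_length = len(unit)
    best = 0
    for start in range(len(sequence) - unit_length + 1):
        copies = 0
        position = start
        # Count copies of the unit that follow one another without a gap.
        while sequence[position:position + unit_length] == unit:
            copies += 1
            position += unit_length
        best = max(best, copies)
    return best


# Hypothetical alleles from two individuals of the same species.
allele_a = "GG" + "CAG" * 5 + "TT"
allele_b = "GG" + "CAG" * 8 + "TT"

repeats_a = longest_tandem_run(allele_a, "CAG")
repeats_b = longest_tandem_run(allele_b, "CAG")
```

The two alleles carry 5 and 8 copies of the same unit at the same place, which is the kind of within-species variation in repeat number described above.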


11. Comparing gene densities

Schematic diagram of gene topography in four organisms.

Light green = introns; dark green = exons; white = intergenic regions


12. A small fraction of total eukaryotic DNA is coding

In mammals, only a few percent of the DNA is actually coding.


13. Coding sequences are needles in the haystack

It is apparent that the coding sequences are only a small part of the genome in most eukaryotes, particularly in humans. Finding these regions is like finding a needle in the haystack.

In addition, the genes are not uniformly distributed. There are regions in the genome where the genes are packed together, and regions where they are sparse, where finding genes is like finding water in a desert.


14. Categorizing the genes in eukaryotic genomes

Classification schemes based on gene function suggest that all eukaryotes possess the same basic set of genes, but that more complex species have a greater number of genes in each category. For example, humans have the greatest number of genes in all but one of the categories used in the figure, the exception being 'metabolism', where Arabidopsis comes out on top as a result of its photosynthetic capability, which requires a large set of genes not present in the other four genomes included in this comparison.

This functional classification reveals other interesting features, notably that C. elegans has a relatively high number of genes whose functions are involved in cell-cell signaling, which is surprising given that this organism has just 959 cells. Humans, who have 10^13 cells, have only 250 more genes for cell-cell signaling.


15. Overview of the human genome

- Genome size is approximately 3,200 Mb.
- Gene number is approximately 30,000.
- Average gene density is 1 per 100 kb (5% of DNA encodes proteins); some areas are gene rich, others are gene deserts (0 to 64 genes per 100 kb).
- Average gene size (including introns) is 27 kb; gene regions account for about 25% of the genome.
- Average polypeptide size is 1.3 kb (of coding sequence).
- Fraction of the genome with coding functions is about 1.5%.
- At least 50% of the genome is made of transposable elements (e.g. LINEs and Alus).
- Intron number ranges from 0 (in histones) to 234 (titin, a muscle protein).
- Hundreds of genes appear to have been transferred directly from bacteria to vertebrate genomes. Mechanism unknown.
- Functions have been assigned to 60% of genes.
- Largest human gene is dystrophin (mutated in muscular dystrophy): 2.5 Mb (larger than some bacterial genomes).
- 1,077 blocks of duplicated regions in the human genome (containing 10,000 genes), which suggests that genome rearrangements are common in evolution.
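Several of the figures above can be cross-checked against one another with simple arithmetic. The sketch below uses only the numbers from this overview; the variable names are assumptions, and the results are order-of-magnitude checks rather than independent estimates.

```python
# Figures taken from the overview above.
GENOME_SIZE_KB = 3_200 * 1_000   # 3,200 Mb expressed in kb
GENE_COUNT = 30_000
AVERAGE_GENE_SIZE_KB = 27        # including introns
AVERAGE_CODING_SIZE_KB = 1.3     # coding sequence per gene

# Average spacing of genes along the genome.
kb_per_gene = GENOME_SIZE_KB / GENE_COUNT

# Share of the genome occupied by gene regions (exons plus introns).
genic_fraction = GENE_COUNT * AVERAGE_GENE_SIZE_KB / GENOME_SIZE_KB

# Share of the genome that is coding sequence.
coding_fraction = GENE_COUNT * AVERAGE_CODING_SIZE_KB / GENOME_SIZE_KB

print(f"about one gene per {kb_per_gene:.0f} kb")
print(f"gene regions: {genic_fraction:.1%} of the genome")
print(f"coding sequence: {coding_fraction:.1%} of the genome")
```

The results (about one gene per 107 kb, about 25% of the genome in gene regions, and about 1.2% coding) agree with the stated density of 1 gene per 100 kb and the 25% figure, and are of the same order as the stated coding fraction of about 1.5%.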