Learning Objectives:

• Understand the basic differences between genomic and cDNA libraries
• Understand how genomic libraries are constructed
• Understand the purpose for having overlapping DNA fragments in genomic libraries and how they are generated
• Understand how cDNA libraries are constructed and the use of reverse transcriptase for their construction
• Understand the rationale for library screening
• Understand the method of plaque hybridization
• Understand the four methods for library screening and when they are put into use



Molecular cloning in bacterial cells….

This strategy can be applied to genomic DNA as well as cDNA

Library construction
• two types of libraries
• a genomic library contains fragments of genomic DNA (genes)
• a cDNA library contains DNA copies of cellular mRNAs
• both types are usually cloned in bacteriophage vectors

Construction of a genomic library

vector DNA (bacteriophage lambda)
• lambda has a linear double-stranded DNA genome
• the left and right arms are essential for the phage replication cycle
• the internal fragment is dispensable

[Figure: construction of a genomic library in bacteriophage lambda]

• lambda vector: the “left arm” and “right arm” flank an internal fragment (dispensable for phage growth) that is bounded by Bam HI sites
• cut with Bam HI (6-base cutter):
    NNG GATCCNN
    NNCCTAG GNN
• remove internal fragment, leaving the “left arm” and “right arm”
• human genomic DNA (isolated from many cells): cut with Sau 3A (4-base cutter), which has ends compatible with Bam HI:
    NNN GATCNNN
    NNNCTAG NNN
• isolate ~20 kb fragments
• combine and treat with DNA ligase
• package into bacteriophage and infect E. coli
• genomic library of human DNA fragments in which each phage contains a different human DNA sequence
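The compatibility of Sau 3A and Bam HI ends can be checked with a few lines of Python. This is only a sketch: the `ENZYMES` table and the function names are made up for illustration, and the cut positions are taken from the sequences shown on the slide (G^GATCC and ^GATC) plus the EcoRI site (G^AATTC) used later for cDNA cloning.

```python
# Recognition site and top-strand cut offset for each enzyme (illustrative table).
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "Sau3A": ("GATC", 0),    # ^GATC
    "EcoRI": ("GAATTC", 1),  # G^AATTC
}

def overhang(enzyme):
    """Return the single-stranded 5' overhang left by a symmetric palindromic cut."""
    site, cut = ENZYMES[enzyme]
    # The bottom strand is cut at the mirror position, so the overhang is
    # the part of the site between the two cuts.
    return site[cut:len(site) - cut]

def compatible(a, b):
    """Two enzymes give ligatable ends when their overhangs are identical."""
    return overhang(a) == overhang(b)

print(overhang("BamHI"), overhang("Sau3A"), compatible("BamHI", "Sau3A"))
print(overhang("EcoRI"), compatible("BamHI", "EcoRI"))
```

Both Bam HI and Sau 3A leave a GATC overhang, which is why Sau 3A fragments can be ligated to Bam HI-cut lambda arms.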

• isolation of ~20 kb fragments provides optimally sized DNAs for cloning in bacteriophage
• partial digestion with a frequent cutter (4-base cutter) allows production of overlapping fragments, since not every site is cut
• overlapping fragments ensure that all sequences in the genome are cloned
• overlapping fragments allow larger physical maps to be constructed as contiguous chromosomal regions (contigs) are put together from the sequence data
• number of clones needed to fully represent the human genome (3 x 10^9 bp), assuming ~20 kb fragments
– theoretical minimum = ~150,000
– 99% probability that every sequence is represented = ~800,000

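Partial restriction enzyme digestion allows cloning of overlapping fragments

[Figure: all possible restriction sites along a DNA region, and the results of a partial digestion with each site marked cut or uncut; the overlapping fragments assemble into a “contig”]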

Genomic Library making… The partial digest is one of the most important steps. Why?

Because it produces overlapping DNA fragments.
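Why a partial digest yields overlapping fragments can be illustrated with a toy sequence. The sketch below is not a simulation of a real digest: the sequence is invented, and the code simply lists every fragment that could arise when any subset of GATC sites is cut.

```python
from itertools import combinations

def cut_positions(seq, site="GATC"):
    """Indices where the enzyme cuts (just before each occurrence of the site)."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

def partial_digest_fragments(seq, site="GATC"):
    """All (start, end) fragments obtainable when any subset of sites is cut."""
    bounds = [0] + [p for p in cut_positions(seq, site) if p != 0] + [len(seq)]
    return list(combinations(bounds, 2))

def overlaps(a, b):
    """True when two (start, end) fragments share at least one base."""
    return a[0] < b[1] and b[0] < a[1]

seq = "AAAGATCTTTTGATCCCCCGATCGG"  # toy sequence with three GATC sites
fragments = partial_digest_fragments(seq)
print(cut_positions(seq))
print(fragments)
print(overlaps((0, 11), (3, 19)))
```

A complete digest of this toy sequence would give only four non-overlapping fragments. Leaving some sites uncut adds longer fragments that span neighbouring sites, and those are the ones that overlap.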

The production of a cDNA library

Construction of a cDNA library

• reverse transcriptase makes a DNA copy of an RNA

The life cycle of a retrovirus depends on reverse transcriptase

retrovirus

1. virus enters cell and loses envelope
2. the capsid is uncoated, releasing genomic RNA and reverse transcriptase
3. reverse transcriptase makes a DNA copy
4. then copies the DNA strand to make it double-stranded DNA, removing the RNA with RNase H
5. the DNA is then integrated into the host cell genome where it is transcribed by host RNA polymerase II
6. the viral RNA is translated into viral proteins, and new virus particles are assembled

new viruses

• cDNA library construction

[Figure: steps in cDNA library construction]

• mRNA (all mRNAs in cell): 5’ ...AAAAA 3’
• anneal oligo(dT) primers of 12-18 bases in length
• add reverse transcriptase and dNTPs to synthesize the cDNA strand, giving an RNA-DNA hybrid
• add RNaseH (specific for the RNA strand of an RNA-DNA hybrid) and carry out a partial digestion
• short RNA fragments serve as primers for second strand synthesis using DNA polymerase I
• DNA polymerase I removes the remaining RNA with its 5’ to 3’ exonuclease activity and continues synthesis
• DNA ligase seals the gaps, giving double-stranded cDNA
• EcoRI linkers are ligated to both ends using DNA ligase:
    5’ AATTCNNNNNNNN...AAAAANNNNNNNNG 3’
    3’     GNNNNNNNN...TTTTTNNNNNNNNCTTAA 5’
• double-stranded cDNA copies of mRNA with EcoRI cohesive ends are now ready to ligate into a bacteriophage lambda vector cut with EcoRI
• combine cDNAs with lambda arms (“left arm” and “right arm”, EcoRI sites) and treat with DNA ligase
• package into bacteriophage and infect E. coli
• cDNA library in which each phage contains a different human cDNA
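The sequence relationship between an mRNA and its cDNA strands can be sketched in Python. The short mRNA below is invented for illustration, and the functions model only base pairing, not the enzymology (priming, RNase H digestion, ligation).

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def first_strand_cdna(mrna):
    """First-strand cDNA (written 5'->3') copied from an mRNA written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))

def second_strand_cdna(first_strand):
    """Second strand (written 5'->3'): the reverse complement of the first strand."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(first_strand))

mrna = "AUGGAAUUUAAAAA"  # hypothetical mRNA ending in a short poly(A) tail
first = first_strand_cdna(mrna)
second = second_strand_cdna(first)
print(first)
print(second)
```

The first strand begins with a run of T, corresponding to the oligo(dT) primer annealed to the poly(A) tail. The second strand has the same sequence as the mRNA with T in place of U.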

DNA libraries
• Genomic DNA libraries contain introns, exons, promoters, etc.
– Usually made with 4-base cutters that cut frequently (every 256 bases on average for random sequence, i.e. 4^4).
– The production of overlapping sequences is due to partial digestion.
– Library complexity is important to make sure that the sequence you are looking for is found in the DNA that has been sampled.
– N = ln(1 - P) / ln(1 - f), where N = number of clones, P = probability that the DNA fragment is found in your library, and f = the frequency of the DNA in your library.

Genomic DNA complexity
• To screen for a clone in a library, you usually want a 99% probability that your clone is found there.
• Frequency (f) is the size of the DNA fragment in the library divided by the size of the haploid genome. For a lambda library, 17 kb (1.7 x 10^4 bp) is the average insert size. The size of the genome is 3 x 10^9 bp.

f = 1.7 x 10^4 / 3 x 10^9
N = ln(1 - 0.99) / ln(1 - [1.7 x 10^4 / 3 x 10^9])
N = ln(0.01) / ln(1 - 0.56 x 10^-5)
N = -4.6051702 / -0.0000056
N = 822,351 clones
(f is rounded to 0.56 x 10^-5 here; the unrounded value, 0.567 x 10^-5, gives N of about 812,700 clones.)
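The formula N = ln(1 - P) / ln(1 - f) is easy to evaluate in Python. The function name and argument names below are made up for illustration. The function keeps full precision for f, so its result differs slightly from a hand calculation that uses a rounded f.

```python
import math

def clones_needed(p, insert_bp, genome_bp):
    """Number of clones N for probability p that a given sequence is in the library."""
    f = insert_bp / genome_bp          # fraction of the genome in one clone
    # log1p(-f) computes ln(1 - f) accurately when f is very small
    return math.log(1 - p) / math.log1p(-f)

n_17kb = clones_needed(0.99, 1.7e4, 3e9)   # the lambda library example (17 kb inserts)
minimum_20kb = 3e9 / 2e4                   # theoretical minimum for ~20 kb fragments
print(round(n_17kb))
print(round(minimum_20kb))
```

Raising P has a large effect: the same call with p = 0.999 asks for about 1.5 times as many clones as p = 0.99, because ln(0.001) / ln(0.01) = 1.5.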

Genome equivalents

• How many genome equivalents are there in this library?

How do you calculate this?
822,351 x 1.7 x 10^4 bp = 1.40 x 10^10 bp
Divide by the genome size, 3.0 x 10^9 bp = 4.67 genome equivalents
How many positives will you get if you screen for a single copy gene?
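The genome-equivalent arithmetic can be written as a one-line function. This is a sketch with an invented function name. It also assumes that the average number of clones covering any given single-copy position roughly equals the number of genome equivalents, which is a simplification for illustration and not something stated on the slide.

```python
def genome_equivalents(n_clones, insert_bp, genome_bp):
    """Total cloned DNA divided by the genome size."""
    return n_clones * insert_bp / genome_bp

coverage = genome_equivalents(822_351, 1.7e4, 3e9)
print(round(coverage, 2))
```

Under that assumption, a probe for a single-copy gene would be expected to light up roughly four to five plaques on average when the whole library is screened.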

Insertional mutagenesis

• All of the vectors currently in use have a system that can either identify or select for vectors containing cloned inserts. This is the backbone of recombinant DNA technology.

• Early vectors involved cloning into an antibiotic resistance gene, so that a bacterium carrying a vector with a DNA insert became sensitive to the antibiotic. This is not the best situation. Why?

Insertional mutagenesis II

• The use of the beta-galactosidase gene as an insertional mutagenesis target allowed all clones to be screened for those that contain inserts by a simple blue-white color assay. The product of this gene cleaves X-gal (a chromogen) to give rise to a blue dye that colors the bacterial colony or phage plaque. Plasmids or phage particles that contain DNA disrupting the target gene can therefore be identified, because their colonies or plaques stay white.

Insertional mutagenesis III

• In addition, suppressor tRNA genes can be used to identify YACs that contain an insert. The suppressor tRNA can suppress the effects of an Ade2 ochre mutation, which gives a white yeast colony. When the tRNA gene is disrupted, the colonies are pink due to the accumulation of a precursor of adenine. Pink colonies are what is desired. See Figure 4.16

Clones are usually characterized first by restriction digestion.

This DNA fragment was digested with various enzymes, giving rise to fragments of specific sizes. These can be used to generate a restriction map.
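The link between a restriction map and the fragment sizes seen after digestion can be sketched as follows. The 10 kb fragment and the site positions are hypothetical values chosen for illustration; they are not taken from the figure.

```python
def fragment_sizes(length, sites):
    """Sizes of the pieces of a linear DNA of given length cut at the given positions."""
    bounds = [0] + sorted(sites) + [length]
    return [end - start for start, end in zip(bounds, bounds[1:])]

length = 10_000                      # hypothetical 10 kb linear clone
ecori_sites = [2_000, 7_000]         # hypothetical map positions
bamhi_sites = [4_000]

single_ecori = fragment_sizes(length, ecori_sites)
single_bamhi = fragment_sizes(length, bamhi_sites)
double = fragment_sizes(length, ecori_sites + bamhi_sites)
print(single_ecori, single_bamhi, double)
```

Mapping works in the opposite direction: the fragment sizes from single and double digests are measured on a gel, and the site order that reproduces all of them is the restriction map.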

Vectors for library construction

• Plasmid vectors
– Small circles of DNA that contain a selection marker like antibiotic resistance.
– Insertional mutagenesis target with a multi-cloning site.
– A variety of DNA replicons (bacterial, yeast).
• Maximum size of insert is about 10 kb.

Lambda and Cosmid vectors

• Bacteriophage lambda can be used as a cloning vector. It has a genome of about 50 kb of linear DNA. Its life cycle is conducive to use as a cloning vector: the lytic cycle can be supported by only a portion of the genes found in the lambda genome.

Lambda life cycle.

The lytic life cycle produces phage particles immediately

The lysogenic life cycle requires genes in the middle of the genome, which can be replaced

Lambda insertion and replacement vectors

• Only 37 to 52 kb DNA fragments can be packaged into the lambda head. This can be done in vitro. Because the middle portion of the lambda genome can be replaced if the lytic life cycle is used, up to 23 kb of DNA can be inserted into the lambda genome. Such replacement vectors are used for genomic DNA libraries.

• Insertion vectors can hold up to 7 kb of cDNA.

Lambda genome

In vitro Packaging of ligated lambda DNA.

Cosmid vectors

• A cosmid is a hybrid between a lambda vector and a plasmid. The COS sites are the only sequences necessary for lambda DNA packaging. Therefore, if COS sites are ligated about 50 kb apart, the ligation products can be packaged in vitro. Cosmid vectors can therefore contain 33 to 45 kb of insert DNA.

Cosmid vector ligation

Making a Genomic Library with Cosmids

• Partial digest
• Ligation into EcoRI site

[Figure: cosmid vector map showing the cos site, the TetR marker and the EcoRI site; 21.5 kb]

Final Steps of a Genomic Library

• Package into heads and plug with tails
• Transduce E. coli receptor cell
• Select white colonies with tetR
• Check for plasmid
• Screen in your mutant for phenotype restoration

Things You Should Remember

• Some plasmids are used as vectors or cloning vehicles, but they are limited in the amount of DNA that can be cloned.
• A cosmid is a plasmid that has at least one COS (cohesive end site).
• COS comes from a bacteriophage.
• A genomic library contains at least one copy of every gene in an organism.

Cloning large DNA fragments
• Because the human genome is large, many genes are very large, and some DNA fragments cannot be replicated in lambda, other vector systems needed to be developed.
• Bacterial artificial chromosome (BAC) vectors
– These vectors are based on the E. coli F factor. They are maintained at 1-2 copies per cell and can hold > 300 kb of insert DNA.
– A problem is low DNA yield from host cells, due to the low copy number compared with 300 copies per cell for a plasmid vector like pUC19.

Cloning large DNA fragments II

• Bacteriophage P1
– These vectors are like lambda and can hold up to 110 to 115 kb of DNA. This DNA can then be packaged by the P1 phage protein coat.
– The use of T4 in vitro packaging systems can enable the recovery of 122 kb inserts.
– See Figure 4.15

Bacteriophage P1 vector system.

Cloning large DNA fragments III
• Yeast Artificial Chromosomes

Many DNA fragments cannot be propagated in bacterial cells. Therefore yeast artificial chromosomes can be built with a few specific components.

1. Centromere
2. Telomere
3. Autonomously replicating sequence (ARS)

• Genomic DNA is ligated between two telomeres and the ligation products are transformed into yeast cells using the spheroplast method.

YAC cloning system

Cloning systems

Vector systems that can be used to clone DNA
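The insert-size figures quoted on the earlier slides can be collected into a small lookup. This is a sketch: the table name, the function and the choice to treat "> 300 kb" as an upper bound of 300 kb are simplifications made for illustration, and YACs are left out because the slides give no insert size for them.

```python
# (minimum, maximum) insert size in kb, as quoted on the slides.
# A minimum of 0 means the slides state no lower limit.
VECTOR_CAPACITY_KB = {
    "plasmid": (0, 10),
    "lambda insertion": (0, 7),
    "lambda replacement": (0, 23),
    "cosmid": (33, 45),
    "P1": (0, 115),
    "BAC": (0, 300),   # slides say "> 300 kb"; 300 is used as a conservative bound
}

def vectors_for(insert_kb):
    """Vector systems whose quoted size range includes the given insert size."""
    return [name for name, (low, high) in VECTOR_CAPACITY_KB.items()
            if low <= insert_kb <= high]

print(vectors_for(5))
print(vectors_for(20))
print(vectors_for(40))
```

In practice the choice also depends on yield, stability and ease of handling, which a size table alone does not capture.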

Plaque hybridization

• This is a general technique required for a number of specific approaches for isolating cDNA or genomic clones
• Generally, one starts by
1). Isolating a cDNA sequence from a cDNA library, then
2). Isolating the gene from a genomic library using the cDNA as a probe
• Information gained from cDNA and genomic clones
1). cDNA clones provide the amino acid sequence of the full-length protein, unencumbered by intron sequences
2). Genomic clones provide the control regions and are required for searching for mutations

Library screening: four experimental approaches

• Starting with a protein
1). Synthetic oligonucleotide - plaque hybridization
2). Antibody - variation of plaque hybridization
• Starting with mRNA
1). Differential cDNA library screening - yet another variation
2). Expression screening - does not utilize plaque hybridization

spot on film indicates a plaque containing DNA of interest

Library screening
• plaque hybridization
• plate phage library on lawn of E. coli (bacteria >>> phage)
• plaques form as a consequence of a spreading lytic infection starting with a single phage-infected bacterial cell
• each phage plaque is a clone of identical recombinant phage
• prepare replica of phage plaques and hybridize DNA with probe

E. coli lawn is grown on agar plate and then overlayered with the recombinant phage library.

Wherever a single bacteriophage particle infects a bacterial cell, a plaque will form. This is a clear area caused by the lysis of bacteria on the lawn of E. coli.

A replica of the agar plate is made on a nitrocellulose sheet - the DNA is denatured and adheres to the nitrocellulose.

The nitrocellulose is hybridized with a labeled DNA probe (such as an oligonucleotide) and the nitrocellulose is exposed to X-ray film.

[Figure: X-ray film; labeled probe in solution (e.g. an ‘oligo’ probe with 32P label); hybridization of probe to immobilized DNA; probe hybridized to immobilized DNA forming double-stranded region]
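The logic of probe hybridization can be mimicked with string matching. This is a toy model with invented plaque names and sequences: it treats hybridization as an exact match and ignores mismatches, salt and temperature. Because the DNA on the filter is denatured, the probe can pair with either strand, so the search covers the probe and its reverse complement.

```python
def revcomp(dna):
    """Reverse complement of a DNA sequence."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def positive_plaques(plaques, probe):
    """Names of plaques whose insert contains the probe on either strand."""
    targets = (probe, revcomp(probe))
    return [name for name, insert in plaques.items()
            if any(target in insert for target in targets)]

insert_1 = "TTGACCATGGAATTTTACGG"
plaques = {
    "plaque1": insert_1,            # contains the probe sequence
    "plaque2": revcomp(insert_1),   # same insert cloned in the other orientation
    "plaque3": "GGGGCCCCAAAATTTT",  # unrelated insert
}
print(positive_plaques(plaques, "ATGGAATTT"))
```

Each name returned corresponds to a spot on the X-ray film, which is then matched back to the plaque on the original plate.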

How does one isolate a gene for an inherited disorder?
• Start with a candidate protein

DNA ← protein
• If a protein candidate has been identified for a genetic disease it can be used to make a probe to screen for the gene

1. oligonucleotide probe
• purify the protein of interest
• partially sequence the protein
• find a region having amino acids with the fewest possible codons
• predict a DNA sequence that could represent a gene region encoding a portion of the protein
• synthesize a set of degenerate oligonucleotides for that region
• hybridize the labeled oligonucleotide to the phage library

amino acids:    MET  GLU  PHE  TYR  ILE  CYS  GLN  LYS
codons:         AUG  GAA  UUU  UAU  AUU  UGU  CAA  AAA
alternative            G    C    C    C    C    G    G
third bases:                        A
(all possible oligonucleotides)

1 x 2 x 2 x 2 x 3 x 2 x 2 x 2 = 192-fold degenerate
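The 192-fold degeneracy can be reproduced by multiplying the number of codons per amino acid and, if wanted, by listing every oligonucleotide. The codon table below covers only the eight amino acids in the example, and the function names are made up for illustration.

```python
from itertools import product
from math import prod

# Codons for the amino acids in the example peptide (standard genetic code).
CODONS = {
    "MET": ["AUG"],
    "GLU": ["GAA", "GAG"],
    "PHE": ["UUU", "UUC"],
    "TYR": ["UAU", "UAC"],
    "ILE": ["AUU", "AUC", "AUA"],
    "CYS": ["UGU", "UGC"],
    "GLN": ["CAA", "CAG"],
    "LYS": ["AAA", "AAG"],
}

def degeneracy(peptide):
    """Number of distinct coding sequences for the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)

def all_oligos(peptide):
    """Every coding sequence, one codon choice per amino acid."""
    return ["".join(codons) for codons in product(*(CODONS[aa] for aa in peptide))]

peptide = ["MET", "GLU", "PHE", "TYR", "ILE", "CYS", "GLN", "LYS"]
oligos = all_oligos(peptide)
print(degeneracy(peptide), len(oligos))
```

This also shows why regions rich in MET (one codon) are preferred and regions rich in amino acids with many codons are avoided: every extra codon choice multiplies the size of the probe mixture.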

2. antibody probe
• purify the protein of interest
• make an antibody to that protein
• construct cDNA library to express recombinant proteins in E. coli
• use the antibody to detect the protein being made from the cloned cDNA encoded by the recombinant phage in E. coli using plaque hybridization method modified for antibody probes

[Figure: recombinant phage with left arm, right arm, bacterial promoter and Shine-Dalgarno sequence, and human cDNA insert]

How does one isolate a gene for an inherited disorder?
• Start with a candidate mRNA

DNA ← mRNA
• mRNA candidates can be identified by comparing mRNA populations between normal and abnormal tissues, or by looking for a specific function encoded by the mRNA

1. differential cDNA library screening
• prepare duplicate plaque replica plates
• hybridize one with a labeled cDNA probe made to all the mRNAs in the normal cell and hybridize the other (duplicate) with the corresponding probe to the abnormal cell
• differences in cell function should be reflected by differences in the mRNA populations
• any plaques showing differential hybridization are candidates

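[Figure: duplicate plaque replicas, one after hybridization with cDNA probe from normal cells and one after hybridization with cDNA probe from abnormal cells; a differentially expressed clone shows no hybridization to its plaque on one of the two replicas]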
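Comparing the two replicas amounts to a set comparison of which plaques gave a signal with each probe. The sketch below uses invented plaque labels and treats each plaque as simply positive or negative, which ignores differences in signal strength.

```python
def differential_plaques(normal_hits, abnormal_hits):
    """Plaques that hybridize with only one of the two cDNA probes."""
    normal, abnormal = set(normal_hits), set(abnormal_hits)
    return {
        "normal_only": sorted(normal - abnormal),
        "abnormal_only": sorted(abnormal - normal),
    }

# Hypothetical plaque labels scored as positive on each replica
normal_hits = ["p1", "p2", "p3", "p5"]
abnormal_hits = ["p1", "p3", "p4", "p5"]
candidates = differential_plaques(normal_hits, abnormal_hits)
print(candidates)
```

Plaques in either list are the candidates for differentially expressed clones. Plaques positive on both replicas represent mRNAs present in both cell types.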

2. expression screening
• develop a cell-based functional assay for the abnormality (e.g., a transport assay)
• construct cDNA library in a way that will allow expression of protein in mammalian cells
• inject groups of cDNA clones into cells and assay function
• narrow down cDNA clones using smaller groups of clones until the function is observed with a single cDNA species

• inject groups of cDNA clones
• if the function being assayed is observed, divide the group of clones into smaller groups and retest
• continue process of testing smaller groups until the function being assayed is obtained with one clone
• for example, inject clones and test cells for transport activity

[Figure: vector with left arm, right arm, mammalian promoter and human cDNA insert; groups of clones are scored + or - for the assayed function as the groups are subdivided]
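The narrowing-down procedure can be sketched as repeated halving of a positive pool. The sketch assumes that the starting pool contains exactly one active clone and that the assay is reliable; the clone names and the stand-in `assay` function are invented, and a real screen may split pools into more than two groups per round.

```python
def narrow_down(clones, assay):
    """Halve a positive pool repeatedly until a single active clone remains.

    `assay(group)` must return True when the group shows the function.
    Assumes exactly one active clone in the starting pool.
    """
    pool = list(clones)
    rounds = 0
    while len(pool) > 1:
        half = len(pool) // 2
        first, second = pool[:half], pool[half:]
        # Only the first half is tested; if it is negative, the active
        # clone must be in the second half.
        pool = first if assay(first) else second
        rounds += 1
    return pool[0], rounds

clones = ["clone_%d" % i for i in range(100)]

def assay(group):
    """Stand-in for the functional assay, e.g. transport activity."""
    return "clone_37" in group

active, rounds = narrow_down(clones, assay)
print(active, rounds)
```

With halving, the number of rounds grows with the logarithm of the pool size, which is what makes screening pools practical compared with testing clones one at a time.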

Expression Cloning
• Certain vector systems can be used to produce specific products.
• The type of expression product
– RNA… Riboprobes
– Protein product…
• The type of environment
– In vitro … cell free
– In vivo … mammalian or prokaryotic cells
• Purpose of the expression system
– To produce large quantities of proteins for protein studies or antibody production.

cDNA expression libraries

• The gene for a specific protein can be cloned from an expression cDNA library if an antibody to the protein is available. A variety of vectors can be used to produce fusion proteins, which can be detected with the antibody (Ab) in question. See Figure 4.18

Expression for Ab detection

Expression in Eukaryotic cells

• Many proteins need specific modifications to work properly, so expression in bacterial cells is not sufficient.

• Plasmid-based eukaryotic expression systems that work after transient transfection into mammalian cell lines have been produced.

• Virus-based systems are also popular.