
8/7/2019 online dna

http://slidepdf.com/reader/full/online-dna 1/23

Reading the blueprint of life

DNA sequencing


Sequencing

The DNA from the genome is chopped into bits: whole chromosomes are too large to deal with, so the DNA is broken into manageably sized overlapping segments.

The DNA is amplified by cloning into bacteria (PCR, see later, doesn't produce enough and requires sequence information for the primers).

It is then denatured (i.e. melted), so that the two strands split apart.


Sequencing - continued

Denatured DNA is added to a reaction mix with:

- a primer (to start complementary pairing),

- DNA polymerase,

- nucleotides, including special ones called dideoxynucleotides. These special nucleotides do not allow further nucleotides to be added to the chain. So in a mix with dideoxy-A, every time a dideoxy-A is added (in place of a small proportion of the As), the reaction ends. This results in fragments of different lengths. The dideoxynucleotides are fluorescently tagged.

Fragments can be separated on a gel by electrophoresis and their lengths calculated. Working out the DNA sequence is then like solving a jigsaw puzzle.
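The chain-termination idea above can be imitated with a small simulation. This is only a sketch under simplifying assumptions: all four tagged dideoxynucleotides share one reaction mix, every position terminates with the same probability, and the names `sanger_fragments`, `read_gel` and `p_terminate` are invented for illustration.

```python
import random


def sanger_fragments(new_strand, n_molecules=5000, p_terminate=0.2, seed=0):
    """Simulate chain termination on many copies of one template.

    new_strand is the strand the polymerase would synthesise in full.
    Returns the set of (fragment_length, terminal_base) pairs observed.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for length, base in enumerate(new_strand, start=1):
            # With small probability a tagged dideoxynucleotide is
            # incorporated instead of a normal one, ending this chain.
            if rng.random() < p_terminate:
                fragments.add((length, base))
                break
    return fragments


def read_gel(fragments):
    """Read the terminal tags in order of increasing fragment length."""
    return "".join(base for _, base in sorted(fragments))


strand = "ATGCCCAAGCTG"
print(read_gel(sanger_fragments(strand)))
```

Because every fragment length ends in a known, coloured base, sorting the fragments by length (which is what the gel does) spells out the synthesised strand one base at a time.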


DNA sequencing - preparation

In order to sequence a piece of DNA, we first need to amplify it. This is sometimes done by a process called polymerase chain reaction (PCR).

PCR: The necessary ingredients for DNA replication are 1) the DNA itself, 2) DNA polymerase, 3) free nucleotides and 4) primers. Place all these in a test tube.


DNA amplification - PCR

Step 1 - heat to c. 95°C for 30 s. This denatures the DNA and unzips the two strands.

Step 2 - cool to c. 55°C for 20 s. This causes the primer to bind to the DNA.

Step 3 - heat to c. 72°C for a minute per kb (kilobase). This allows the polymerase to catalyse the addition of free nucleotides to the primer, replicating the DNA.

So in two minutes a c. 1 kb piece of DNA is replicated. Repeat for a few hours to obtain a million copies.
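The arithmetic behind the cycle times and the million copies can be sketched as follows. The sketch assumes ideal doubling in every cycle and ignores the time taken to change temperature, so it gives a lower bound on run time; real reactions are less efficient. The function names are illustrative only.

```python
import math


def cycle_seconds(length_kb, denature_s=30, anneal_s=20, extend_s_per_kb=60):
    """Duration of one PCR cycle using the step times quoted on the slide."""
    return denature_s + anneal_s + extend_s_per_kb * length_kb


def copies_after(cycles, start_copies=1):
    """Copies present after a number of cycles, assuming perfect doubling."""
    return start_copies * 2 ** cycles


def cycles_needed(target_copies, start_copies=1):
    """Smallest number of ideal doubling cycles that reaches the target."""
    return math.ceil(math.log2(target_copies / start_copies))


n = cycles_needed(1_000_000)
print(n, copies_after(n), n * cycle_seconds(1) / 60, "minutes")
```

The point of the sketch is that growth is exponential: each extra cycle doubles the yield, so the number of cycles needed grows only with the logarithm of the number of copies wanted.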


DNA amplification - cloning

An alternative to PCR is to insert the piece of DNA into the DNA of a bacterium. Replicating the bacterium thus replicates the DNA.

Cf. recombinant DNA technology


Sequencing using gel electrophoresis

Here is a gel with 28 DNA samples: green bands represent A, blue C, yellow G and red T.

Small molecules move faster.


Sequence assembly using mapping

Originally, sequencing was performed by cutting the chromosomes into large pieces, which were cloned into bacteria, creating a whole library of DNA segments. The segments were cut open to look for common sequence landmarks in overlapping fragments. These were used to fingerprint the fragments, so that it was known where in the chromosome each fragment was; this is called mapping. The fragments were cut into smaller pieces and the process repeated, and the small fragments were sequenced. Finally the whole sequence is known (in terms of short fragments and their locations on the chromosome).


Shotgun sequencing

Shotgun sequencing dispenses with the need for mapping and so is much faster. It involves chopping the DNA into fragments of size c. 2000 base pairs (bps) and 10000 bps, and sequencing the first and last 500 bps of each fragment. It then uses computer algorithms to assemble the entire sequence from the sequenced fragments.
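A toy version of the assembly step might look like the sketch below. It uses a simple greedy overlap-merge strategy, which is an assumption made for illustration and not the algorithm used by real assemblers; it also ignores sequencing errors, repeats, and the pairing information that comes from reading both ends of each fragment. The names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left that overlaps: return separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


reads = ["atgcccaagc", "caagctgaat", "tgaatagcgt", "agcgtagagg"]
print(greedy_assemble(reads))
```

Each read here overlaps its neighbour by five bases, so the greedy merges rebuild a single contig. With repeats or read errors the greedy choice can go wrong, which is one reason shotgun assembly needs redundant coverage.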


Acquisition of sequence data

Genomes must be sequenced several times over on average, both to ensure that complete coverage of the genome is achieved, and because sequencing data is somewhat error-prone.
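Why several-fold redundancy is needed can be illustrated with a standard back-of-envelope model that is not part of the slides: if reads land uniformly at random, the number of reads covering a given base is roughly Poisson distributed, so the fraction of the genome left uncovered is about e^(-c) at average coverage c. The function names below are illustrative.

```python
import math


def coverage(n_reads, read_len, genome_len):
    """Average number of times each base is sequenced."""
    return n_reads * read_len / genome_len


def fraction_uncovered(c):
    """Poisson estimate of the fraction of bases hit by no read."""
    return math.exp(-c)


for c in (1, 2, 4, 8):
    print(c, round(fraction_uncovered(c), 5))
```

Under this model, one-fold coverage still leaves roughly a third of the bases unread, while eight-fold coverage leaves well under 0.1%.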

Increases in the efficiency of sequencing have led to a year-on-year increase in the rate of new sequence data acquisition:

[Figure: chart of sequence data acquisition over time, axis range 0 to 3200. Source: http://www2.ebi.ac.uk/genomes/mot/index.html]


Statistics of genome sequences

Statistics can be global or local:

Base composition of genomes:

Bacteria (E. coli): c. 25% A, 25% C, 25% G, 25% T

Malaria parasite (P. falciparum): 82% A+T

H: 59% A+T

Translation initiation:

ATG is the near-universal motif (codon) indicating the start of translation in DNA coding sequence.
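A global statistic such as base composition is simple to compute from raw sequence. The sketch below is a minimal illustration; the function names are made up, and characters other than A, C, G and T (for example N for an unknown base) are simply ignored.

```python
from collections import Counter


def base_composition(seq):
    """Fraction of A, C, G and T in a DNA sequence."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: counts[b] / total for b in "ACGT"}


def at_content(seq):
    """Combined fraction of A and T."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


seq = "atgcccaagctgaatagcgtagagg"
print(base_composition(seq))
print(at_content(seq))
```

Run over a whole genome, `at_content` would give the kind of A+T percentage quoted above for each organism.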


Databases of sequence information

The Internet has become a vital resource in making sequence data generally available to the biological community at large. Examples:

GenBank (www.ncbi.nlm.nih.gov/Genbank),

EMBL (www.ebi.ac.uk/embl),

DDBJ (www.ddbj.nig.ac.jp).

Used for: gene prediction, protein structure/function prediction, homology searching.


Extracting important information

The most important parts of the genome are the genes.

Efforts have been made to identify genes from sequence data.

Expressed sequence tags (ESTs) are short pieces of sequence data that correspond to mRNAs found in cells of the organism.

ESTs are produced by purifying mRNA from cells and then using an enzyme called reverse transcriptase to convert these to copy DNA (cDNA). The DNA is then cloned in bacteria and sequenced.

The sequence obtained is usually only short (c. 700 base pairs) and may not be very accurate, but ESTs still provide very useful information.


Open reading frames

There are 6 reading frames: 3 forwards and 3 backwards (on the other strand). A frame is said to be open if it contains long stretches without a stop codon.

The 3 forward frames of an example sequence [lower lines are single-letter amino acid codes, * = stop]:

5' atgcccaagctgaatagcgtagaggggttttcatcatttgaggacgatgtataa 3'

1 atg ccc aag ctg aat agc gta gag ggg ttt tca tca ttt gag gac gat gta taa

  M   P   K   L   N   S   V   E   G   F   S   S   F   E   D   D   V   *

2 tgc cca agc tga ata gcg tag agg ggt ttt cat cat ttg agg acg atg tat

  C   P   S   *   I   A   *   R   G   F   H   H   L   R   T   M   Y

3 gcc caa gct gaa tag cgt aga ggg gtt ttc atc att tga gga cga tgt ata

  A   Q   A   E   *   R   R   G   V   F   I   I   *   G   R   C   I
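The frame-by-frame translation shown above can be reproduced with a short sketch. It assumes the standard genetic code, built here from the conventional TCAG ordering of codons; the function names are illustrative, and any codon containing an unexpected character is rendered as X.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate seq starting at offset frame (0, 1 or 2); '*' marks a stop."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def reverse_complement(seq):
    """Sequence of the other strand, read 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frames(seq):
    """Translations of the 3 forward and 3 backward reading frames."""
    rc = reverse_complement(seq)
    frames = {f"+{f + 1}": translate(seq, f) for f in range(3)}
    frames.update({f"-{f + 1}": translate(rc, f) for f in range(3)})
    return frames


seq = "atgcccaagctgaatagcgtagaggggttttcatcatttgaggacgatgtataa"
for name, protein in six_frames(seq).items():
    print(name, protein)
```

Frame +1 is the only forward frame that runs from the ATG to the final stop without interruption, which is what marks it as open; frames +2 and +3 are broken by internal stop codons.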


Gene prediction in eukaryotes

In bacteria, open reading frames (ORFs) are pretty much enough to indicate genes, but in eukaryotes finding genes is more complicated, because:

1. Eukaryotic DNA is roughly 97-98% noncoding. In such a large amount of sequence, ORFs may exist by chance.

2. Eukaryotic DNA contains introns, so finding the start and the end of a gene is not enough: we also have to find which bits (introns) to edit out of the sequence. Introns also break up open reading frames.
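How likely a chance ORF is can be estimated with a rough model. The sketch assumes bases are independent and equally frequent, so that 3 of the 64 codons are stops; real genomes have skewed composition, so the numbers are indicative only. The function name is hypothetical.

```python
def prob_no_stop(n_codons, p_stop=3 / 64):
    """Chance that n_codons consecutive random codons contain no stop codon."""
    return (1 - p_stop) ** n_codons


for n in (20, 50, 100, 300):
    print(n, prob_no_stop(n))
```

Under this model a run of 100 stop-free codons occurs by chance less than 1% of the time at any given start point, but a large genome offers an enormous number of start points in six frames, so some long chance ORFs are still to be expected.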


Introns - reminder

Mentioned in "Introduction to Molecular Biology".

These are pieces of DNA within genes, which are transcribed but then spliced out of the RNA before it is translated.

They make it much harder to find genes: finding open reading frames is not enough, since you also need to find where introns and exons start and end.


Conclusions

Sequencing DNA involves:

- Amplifying it by PCR or cloning

- Chopping it up into manageable bits

- Replicating it with fluorescently tagged dideoxynucleotides

- Running the different-length fragments on a gel and reading this

- Assembling the pieces (sequences of manageable bits).

Shotgun sequencing is faster than mapping-based assembly methods, but can have accuracy problems.


Conclusions

Sequence data is stored in online databases.

Extracting useful information and patterns from such data is part of bioinformatics and often employs intelligent systems techniques.


Next block of lectures

History of genomics

Introduction to bioinformatics

More on gene prediction