8/7/2019 online dna
http://slidepdf.com/reader/full/online-dna 1/23
Reading the blueprint of life
DNA sequencing
Sequencing
The DNA from the genome is chopped into bits: whole chromosomes are too large to deal with, so the DNA is broken into manageably sized overlapping segments.
The DNA is amplified by cloning into bacteria (PCR, see later, doesn't produce enough and requires sequence information for the primers).
It is then denatured (i.e. melted), so that the two strands split apart.
Sequencing - continued
Denatured DNA is added to a reaction mix with:
- a primer (to start complementary pairing),
- DNA polymerase,
- nucleotides, including special ones called dideoxynucleotides.
These special nucleotides do not allow further nucleotides to be added to the chain. So in a mix with dideoxy-A, every time a dideoxy-A is added (to a small proportion of the As), the reaction ends. This results in fragments of different lengths. The dideoxynucleotides are fluorescently tagged.
Fragments can be separated out on a gel by electrophoresis and their lengths calculated.
Working out the DNA sequence is like a jigsaw puzzle.
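The chain-termination idea above can be illustrated with a toy simulation. This is a minimal sketch, not a model of the real chemistry: it assumes exactly one terminated fragment per position, ignores the primer, and the function names are hypothetical. Sorting the fragments by length and reading each tagged last base recovers the newly synthesised strand.

```python
import random

def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def terminated_fragments(template, seed=0):
    """Return (length, tagged_last_base) pairs, one per termination point.

    Assumption: every position of the new strand is terminated exactly once.
    The list is shuffled to mimic an unordered reaction mix.
    """
    new_strand = reverse_complement(template)
    fragments = [(i + 1, base) for i, base in enumerate(new_strand)]
    random.Random(seed).shuffle(fragments)
    return fragments

def read_by_length(fragments):
    """Order fragments from shortest to longest and read the tagged bases."""
    return "".join(base for _, base in sorted(fragments))

template = "ATGCCCAAG"
new_strand = read_by_length(terminated_fragments(template))
print(new_strand)                      # strand complementary to the template
print(reverse_complement(new_strand))  # the template itself
```

Sorting by length plays the role of the gel: the tagged end base of each successively longer fragment gives the next letter of the sequence.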
DNA sequencing - preparation
In order to sequence a piece of DNA, we first need to amplify it. This is sometimes done by a process called polymerase chain reaction (PCR).
PCR: The necessary ingredients for DNA replication are 1) the DNA itself, 2) DNA polymerase, 3) free nucleotides and 4) primers. Place all these in a test tube.
DNA amplification - PCR
Step 1: heat to c. 95°C for 30 s. This denatures the DNA and unzips the two strands.
Step 2: cool to c. 55°C for 20 s. This causes the primers to bind to the DNA.
Step 3: heat to c. 72°C for a minute per kb (kilobase). This allows the polymerase to catalyse the addition of free nucleotides to the primer, replicating the DNA.
So in about two minutes a c. 1 kb piece of DNA is replicated. Repeating this for a few hours yields about a million copies.
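The arithmetic behind the PCR steps above can be sketched as follows. The step durations are taken from the slide; the assumption that every cycle exactly doubles the DNA is an idealisation (real reactions are less efficient, and heating and cooling between steps takes extra time), and the function names are hypothetical.

```python
import math

# Step durations from the slide (seconds)
DENATURE_S = 30          # c. 95°C
ANNEAL_S = 20            # c. 55°C
EXTEND_S_PER_KB = 60     # c. 72°C, one minute per kilobase

def cycle_seconds(length_kb):
    """Time for one denature/anneal/extend cycle, ignoring temperature ramps."""
    return DENATURE_S + ANNEAL_S + EXTEND_S_PER_KB * length_kb

def copies_after(cycles, start_copies=1):
    """Copies after a number of cycles, assuming perfect doubling each cycle."""
    return start_copies * 2 ** cycles

def cycles_needed(target_copies, start_copies=1):
    """Smallest number of ideal doubling cycles that reaches the target."""
    return math.ceil(math.log2(target_copies / start_copies))

n = cycles_needed(1_000_000)
print(cycle_seconds(1), "seconds per cycle for a 1 kb piece")
print(n, "ideal cycles give", copies_after(n), "copies")
```

Under these idealised assumptions a 1 kb piece needs a little under two minutes per cycle, and about twenty doublings are enough to pass a million copies.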
DNA amplification - cloning
An alternative to PCR is to insert the piece of DNA into the DNA of a bacterium. Replicating the bacterium thus replicates the DNA.
Cf. recombinant DNA technology
Sequencing using gel electrophoresis
Here is a gel with 28 DNA samples: green bands represent A, blue C, yellow G and red T.
Small molecules move faster.
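Reading one lane of such a gel can be sketched in a few lines. The colour code is the one given on the slide; the band representation (a migration distance paired with a colour) and the function name are assumptions made for illustration. Because small molecules move faster, the band that has travelled furthest is the shortest fragment and gives the first base.

```python
# Colour code from the slide
COLOUR_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}

def read_lane(bands):
    """Read a lane given (migration_distance, colour) pairs.

    Bands are ordered from furthest travelled (shortest fragment)
    to least travelled (longest fragment).
    """
    ordered = sorted(bands, key=lambda band: band[0], reverse=True)
    return "".join(COLOUR_TO_BASE[colour] for _, colour in ordered)

# Hypothetical lane with three bands (distances in arbitrary units)
lane = [(10.0, "red"), (30.0, "green"), (20.0, "yellow")]
print(read_lane(lane))
```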
Sequence assembly using mapping
Originally, sequencing was performed by cutting the chromosomes into large pieces, which were cloned into bacteria, creating a whole library of DNA segments. The segments were cut open to look for common sequence landmarks in overlapping fragments. These were used to fingerprint the fragments, so that it was known where in the chromosome each fragment was; this is called mapping. The fragments were then cut into smaller pieces and the process repeated, and the small fragments were sequenced. Finally, the whole sequence is known (in terms of short fragments and their locations on the chromosome).
Shotgun sequencing
Shotgun sequencing dispenses with the need for mapping and so is much faster. It involves chopping the DNA into fragments of c. 2000 base pairs (bps) and c. 10000 bps, and sequencing the first and last 500 bps of each fragment. Computer algorithms then assemble the entire sequence from the sequenced fragments.
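To give a feel for the assembly step, here is a toy greedy overlap assembler. It is a minimal sketch only: it assumes error-free reads from a single strand, ignores the paired-end information described above, and is not the algorithm used by any particular real assembler. The function names and the `min_len` parameter are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

reads = ["AATAGCGTAGAGG", "ATGCCCAAGCTG", "AAGCTGAATAGC"]
print(greedy_assemble(reads))
```

With repeats or sequencing errors a greedy merge like this can join the wrong reads, which is one reason shotgun assembly can have accuracy problems.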
Acquisition of sequence data
Genomes must be sequenced several times over on average, both to ensure that complete coverage of the genome is achieved and because sequencing data is somewhat error-prone.
Increases in the efficiency of sequencing have led to a year-on-year increase in the rate of new sequence data acquisition:
[Figure not reproduced; source: http://www2.ebi.ac.uk/genomes/mot/index.html]
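The point that genomes must be sequenced "several times over" can be made concrete with a coverage calculation. This is a sketch under a stated assumption that is not in the slides: reads are placed uniformly at random, so the number of reads covering a given base is approximately Poisson distributed and the chance that a base is missed is about e^(-coverage). The function names and example numbers are hypothetical.

```python
import math

def coverage(n_reads, read_len, genome_len):
    """Average number of times each base is sequenced."""
    return n_reads * read_len / genome_len

def expected_fraction_uncovered(c):
    """Approximate fraction of bases never sequenced at coverage c.

    Assumes uniformly random read placement (Poisson approximation).
    """
    return math.exp(-c)

# Hypothetical example: 60,000 reads of 500 bps over a 3,000,000 bp genome
c = coverage(60_000, 500, 3_000_000)
print(c, expected_fraction_uncovered(c))
```

At one-fold coverage more than a third of the genome would be expected to be missed under this approximation, which is why higher coverage is needed.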
Statistics of genome sequences
Statistics can be global or local:
Base composition of genomes:
- Bacteria (E. coli): roughly 25% each of A, C, G and T
- Malaria parasite (P. falciparum): c. 81% A+T
- Human: 59% A+T
Translation initiation:
- ATG is the near-universal motif (codon) indicating the start of translation in DNA coding sequence.
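Both kinds of statistic are simple to compute from a sequence string. The sketch below shows one global statistic (A+T content) and one local one (positions of the ATG motif); the function names are hypothetical, and characters other than A, C, G and T are simply ignored when computing composition.

```python
from collections import Counter

def at_content(dna):
    """Fraction of A and T among the A, C, G and T characters in the sequence."""
    counts = Counter(dna.upper())
    total = sum(counts[base] for base in "ACGT")
    return (counts["A"] + counts["T"]) / total

def atg_positions(dna):
    """0-based positions of every ATG in the sequence (all three frames)."""
    dna = dna.upper()
    return [i for i in range(len(dna) - 2) if dna[i:i + 3] == "ATG"]

print(at_content("AATTGC"))
print(atg_positions("ATGCCATGA"))
```

Note that an ATG found this way is only a candidate start: the same triplet also codes for methionine inside genes and occurs by chance in noncoding sequence.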
Databases of sequence information
The Internet has become a vital resource in making sequence data generally available to the biological community at large. Examples:
- GenBank (www.ncbi.nlm.nih.gov/Genbank),
- EMBL (www.ebi.ac.uk/embl),
- DDBJ (www.ddbj.nig.ac.jp).
Used for: gene prediction, protein structure/function prediction, homology searching.
Extracting important information
The most important parts of the genome are the genes.
Efforts have been made to identify genes from sequence data.
Expressed sequence tags (ESTs) are short pieces of sequence data that correspond to mRNAs found in cells of the organism.
ESTs are produced by purifying mRNA from cells and then using an enzyme called reverse transcriptase to convert it to copy DNA (cDNA). The cDNA is then cloned in bacteria and sequenced.
The sequence obtained is usually only short (c. 700 base pairs) and may not be very accurate, but ESTs still provide very useful information.
Open reading frames
There are 6 reading frames: 3 forwards (shown below) and 3 backwards (on the other strand).
A frame is said to be open if it contains long stretches without a stop codon.
[Lower lines are single-letter amino acid codes, * = stop.]
5' atgcccaagctgaatagcgtagaggggttttcatcatttgaggacgatgtataa 3'
1  atg ccc aag ctg aat agc gta gag ggg ttt tca tca ttt gag gac gat gta taa
    M   P   K   L   N   S   V   E   G   F   S   S   F   E   D   D   V   *
2  tgc cca agc tga ata gcg tag agg ggt ttt cat cat ttg agg acg atg tat
    C   P   S   *   I   A   *   R   G   F   H   H   L   R   T   M   Y
3  gcc caa gct gaa tag cgt aga ggg gtt ttc atc att tga gga cga tgt ata
    A   Q   A   E   *   R   R   G   V   F   I   I   *   G   R   C   I
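The three forward translations above, together with the three on the other strand, can be reproduced with a short script. This is a minimal sketch using the standard genetic code; the function names are hypothetical, and the sequence is the one from the slide.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna):
    """Return the other strand, read 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna, offset=0):
    """Translate complete codons starting at the given offset (0, 1 or 2)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(offset, len(dna) - 2, 3))

def six_frames(dna):
    """Translations of the 3 forward (+) and 3 backward (-) reading frames."""
    rc = reverse_complement(dna)
    frames = {}
    for k in range(3):
        frames[f"+{k + 1}"] = translate(dna, k)
        frames[f"-{k + 1}"] = translate(rc, k)
    return frames

seq = "atgcccaagctgaatagcgtagaggggttttcatcatttgaggacgatgtataa"
for name, protein in six_frames(seq).items():
    print(name, protein)
```

Frame +1 is the only forward frame here with no internal stop codon, so it is the open one.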
Gene prediction in eukaryotes
In bacteria, open reading frames (ORFs) are pretty much enough to indicate genes, but in eukaryotes finding genes is more complicated, because:
1. Eukaryotic DNA can be roughly 97-98% noncoding (e.g. in humans); in such a large amount of sequence, ORFs may exist by chance.
2. Eukaryotic DNA contains introns, so finding the start and the end of a gene is not enough: we also have to find which bits (introns) to edit out of the sequence. Introns also break up open reading frames.
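How likely a chance ORF is can be estimated with a back-of-envelope calculation. This sketch rests on an assumption that is not in the slides: bases are independent and equally frequent, so each codon has a 3 in 64 chance of being a stop. Real genomes have skewed base composition, so the numbers are only indicative. The function name is hypothetical.

```python
def prob_stop_free(n_codons, n_stop=3, n_codons_total=64):
    """Probability that a random run of n_codons contains no stop codon.

    Assumes independent, equally likely codons (uniform base composition).
    """
    return ((n_codons_total - n_stop) / n_codons_total) ** n_codons

for n in (10, 50, 100, 300):
    print(n, prob_stop_free(n))
```

The probability falls quickly with length, so long ORFs are unlikely at any single position; the difficulty is that a large genome offers a very large number of positions and six frames in which short chance ORFs can occur.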
Introns - reminder
Mentioned in "Introduction to Molecular Biology".
These are pieces of DNA within genes, which are transcribed but then spliced out of the RNA before it is translated.
They make it much harder to find genes: finding open reading frames is not enough, you also need to find where introns and exons start and end.
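Once intron boundaries are known, splicing itself is simple string surgery; the hard part of gene prediction is finding those boundaries. The sketch below assumes the intron coordinates are already given (0-based, end-exclusive, non-overlapping). The function name and the example transcript are hypothetical.

```python
def splice(pre_mrna, introns):
    """Remove the given intron intervals and join the remaining exons.

    introns: iterable of (start, end) pairs, 0-based, end-exclusive,
    assumed not to overlap.
    """
    exons = []
    pos = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[pos:start])  # exon before this intron
        pos = end                          # skip over the intron
    exons.append(pre_mrna[pos:])           # final exon
    return "".join(exons)

# Hypothetical transcript: exons in upper case, one intron in lower case
print(splice("AUGGCCuuuuuuAAGUGA", [(6, 12)]))
```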
Conclusions
Sequencing DNA involves:
- Amplifying it by PCR or cloning
- Chopping it up into manageable bits
- Replicating it with fluorescently tagged dideoxynucleotides
- Running the different-length fragments on a gel and reading this
- Assembling the pieces (sequences of manageable bits).
Shotgun sequencing is faster than mapping-based assembly methods, but can have accuracy problems.
Conclusions
Sequence data is stored in online databases.
Extracting useful information and patterns from such data is part of bioinformatics and often employs intelligent systems techniques.
Next block of lectures
History of genomics
Introduction to bioinformatics
More on gene prediction