
An Introduction to DNA Forensics: The Basics

Michael D. Kane, Ph.D.

Professor of Biomedical Informatics, Department of Computer & Information Technology

Lead Genomic Scientist, Bindley Bioscience Center

Purdue University

Contact: (office) 765-494-2564, [email protected]

DNA is Information Storage

Figure labels, in “computer lingo”: “Zipped Files”, Decompression, “Executable Files”.

DNA is Double Stranded – One strand is the “coding strand” and the other strand is there to stabilize the DNA sequence when not in use. Double-stranded DNA is very durable in our environment.

CAGGACCATGGAACTCAGCGTCCTCCTCTTCCTTGCACTCCTCACAGGACTCTTGCTACTCCTGGTTCAGCGCCACCCTAACACCCATGACCGCCTCCCACCAGGGCCCCGCCCTCTGCCCCTTTTGGGAAACCTTCTGCAGATGGATAGAAGAGGCCTACTCAAATCCTTTCTGAGGTTCCGAGAGAAATATGGGGACGTCTTCACGGTACACCTGGGACCGAGGCCCGTGGTCATGCTGTGTGGAGTAGAGGCCATACGGGAGGCCCTTGTGGACAAGGCTGAGGCCTTCTCTGGCCGGGGAAAAATCGCCATGGTCGACCCATTCTTCCGGGGATATGGTGTGATCTTTGCCAATGGAAACCGCTGGAAGGTGCTTCGGCGATTCTCTGTGACCACTATGAGGGACTTCGGGATGGGAAAGCGGAGTGTGGAGGAGCGGATTCAGGAGGAGGCTCAGTGTCTGATAGAGGAGCTTCGGAAATCCAAGGGGGCCCTCATGGACCCCACCTTCCTCTTCCAGTCCATTACCGCCAACATCATCTGCTCCATCGTCTTTGGAAAACGATTCCACTACCAAGATCAAGAGTTCCTGAAGATGCTGAACTTGTTCTACCAGACTTTTTCACTCATCAGCTCTGTATTCGGCCAGCTGTTTGAGCTCTTCTCTGGCTTCTTGAAATACTTTCCTGGGGCACACAGGCAAGTTTACAAAAACCTGCAGGAAATCAATGCTTACATTGGCCACAGTGTGGAGAAGCACCGTGAAACCCTGGACCCCAGCGCCCCCAAGGACCTCATCGACACCTACCTGCTCCACATGGAAAAAGAGAAATCCAACGCACACAGTGAATTCAGCCACCAGAACCTCAACCTCAACACGCTCTCGCTCTTCTTTGCTGGCACTGAGACCACCAGCACCACTCTCCGCTACGGCTTCCTGCTCATGCTCAAATACCCTCATGTTGCAGAGAGAGTCTACAGGGAGATTGAACAGGTGATTGGCCCACATCGCCCTCCAGAGCTTCATGACCGAGCCAAAATGCCATACACAGAGGCAGTCATCTATGAGATTCAGAGATTTTCCGACCTTCTCCCCATGGGTGTGCCCCACATTGTCACCCAACACACCAGCTTCCGAGGGTACATCATCCCCAAGGACACAGAAGTATTTCTCATCCTGAGCACTGCTCTCCATGACCCACACTA…

Genome size comparison: Yeast, Human, Lily

The yeast genome is 240 times smaller than the human genome.

The lily genome is 40 times bigger than the human genome.

Simple Summary of Human Genomics

1) There are 3 billion base-pairs (or “bytes”) of information in the human genome.

2) Only 2% of the human genome is made up of “genes”. The remaining 98% is somewhat unique to each individual and is important in deriving DNA-based evidence.

3) A “gene” encodes a protein. Proteins are the functional units of living systems (hair, skin, venoms, pollens, foods, etc.).

4) Only about 0.1% of our genome is “unique” to us individually (as opposed to race, gender or familial inheritance), or about 3 million base pairs of DNA.
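The percentages above can be turned into base-pair counts with a few lines of arithmetic. This is a minimal sketch using only the rounded figures quoted in the slides; the constant and function names are illustrative and not part of the original deck.

```python
# Rounded figures quoted in the slides (approximate values).
HUMAN_GENOME_BP = 3_000_000_000   # about 3 billion base pairs
GENE_FRACTION = 0.02              # about 2% of the genome is "genes"
INDIVIDUAL_FRACTION = 0.001       # about 0.1% is "unique" to an individual


def base_pairs(fraction, genome_bp=HUMAN_GENOME_BP):
    """Return the number of base pairs covered by a fraction of the genome."""
    return round(genome_bp * fraction)


gene_bp = base_pairs(GENE_FRACTION)            # base pairs in "genes"
unique_bp = base_pairs(INDIVIDUAL_FRACTION)    # individually "unique" base pairs
yeast_bp = HUMAN_GENOME_BP // 240              # yeast: 240 times smaller
lily_bp = HUMAN_GENOME_BP * 40                 # lily: 40 times bigger

print(f"genes: {gene_bp:,} bp, individually unique: {unique_bp:,} bp")
print(f"yeast: {yeast_bp:,} bp, lily: {lily_bp:,} bp")
```

The 0.1% figure works out to the 3 million base pairs stated in point 4.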

Genes

Simple Summary of Molecular Biology

1) DNA can be isolated from different sample types.

2) Sections of DNA can be “amplified” 1-billion fold in a few hours (PCR amplification), which enriches those sections for subsequent analysis.

3) DNA has many areas of repeated sequence (e.g. …catg-catg-catg-catg-catg…)

4) DNA can be “cut” at specific sequence points (e.g. ACTG).

This is the basis for DNA-based forensics evidence

STR (new method)

RFLP (old method)

mtDNA

STR “Fingerprinting” Method

STRs, or “short tandem repeat” sequences, exist in the non-coding regions of our DNA (i.e. not in “genes”) and vary between individuals.

These regions can be “amplified”, and the length of each of the amplified STR sequences can be determined.

In criminal investigations, there are 13 regions (aka “loci”) in the human genome that are amplified and analyzed.

SIMPLE EXAMPLE (using only 4 STR amplified regions)

Suspect 1
1) jump-jump-jump-jump-jump (20)
2) run-run-run-run-run (15)
3) skip-skip-skip-skip-skip-skip (24)
4) hop-hop-hop-hop (12)
…
13) Total in STR

Suspect 2
1) jump-jump-jump-jump (16)
2) run-run-run-run-run-run (18)
3) skip-skip-skip-skip-skip-skip-skip (28)
4) hop-hop-hop-hop (12)
…
13) Total in STR
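The numbers in parentheses in the simple example are the lengths of the repeated regions: the repeat unit length multiplied by the number of repeats. A minimal sketch of that calculation follows; the dictionaries and function names are illustrative, and the word "repeat units" stand in for real DNA repeats.

```python
def str_length(unit, repeats):
    """Length of a tandem repeat region: unit length times repeat count."""
    return len(unit) * repeats


def profile(repeat_counts):
    """Map each repeat unit to the total length of its repeated region."""
    return {unit: str_length(unit, n) for unit, n in repeat_counts.items()}


# Repeat counts read off the simple example (word units stand in for DNA).
suspect_1 = {"jump": 5, "run": 5, "skip": 6, "hop": 4}
suspect_2 = {"jump": 4, "run": 6, "skip": 7, "hop": 4}

print(profile(suspect_1))
print(profile(suspect_2))
```

The two suspects differ at three of the four regions and share the same length at the fourth ("hop"), which is why several regions are needed to tell individuals apart.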

Determining the “size” of amplified DNA

Figure: a power source with “+” and “-” electrodes, and DNA in a salt solution.

DNA has a “-” charge, and is attracted to the “+” electrode.

Determining the “size” of amplified DNA (Continued)

Figure: the amplified DNA sample is placed in a gel between the “-” and “+” electrodes; the figure labels “shorter DNA” and “longer DNA”.

The gel limits how fast the DNA can move toward the “+” electrode, and longer pieces are slowed more, so the shorter pieces of amplified DNA move more quickly through the gel.

Determining the “size” of amplified DNA (Continued)

EXAMPLES of some DNA gels

STR “Fingerprinting” Method

…back to our SIMPLE EXAMPLE (using only 4 STR amplified regions)

Suspect 1
1) jump-jump-jump-jump-jump (20)
2) run-run-run-run-run (15)
3) skip-skip-skip-skip-skip-skip (24)
4) hop-hop-hop-hop (12)

Suspect 2
1) jump-jump-jump-jump (16)
2) run-run-run-run-run-run (18)
3) skip-skip-skip-skip-skip-skip-skip (28)
4) hop-hop-hop-hop (12)

Figure: gel diagram with lanes labeled Suspect 1, Suspect 2 and Unknown Sample, and a size scale marked 28, 24, 20, 16, 12.

Note: The word strings above (e.g. jump-jump) are intended to present the example clearly. In reality, these would all be DNA (e.g. CATG-CATG-CATG)
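The gel comparison in the simple example can be sketched in code: each lane is the set of band sizes for one sample, and a suspect is consistent with the unknown sample only if every region has the same size. The suspect values come from the example above; the unknown sample's values are hypothetical, since the band positions in the original figure are not recoverable from the text.

```python
SUSPECT_1 = {"jump": 20, "run": 15, "skip": 24, "hop": 12}
SUSPECT_2 = {"jump": 16, "run": 18, "skip": 28, "hop": 12}

# Hypothetical values for illustration only (not taken from the slides).
UNKNOWN_SAMPLE = {"jump": 16, "run": 18, "skip": 28, "hop": 12}


def gel_lane(profile):
    """Band sizes ordered from the loading well downward.

    Longer pieces move more slowly, so they stay nearest the well;
    the shortest piece travels farthest and comes last in the list.
    """
    return sorted(profile.values(), reverse=True)


def matching_suspects(unknown, suspects):
    """Names of suspects whose profile equals the unknown at every region."""
    return [name for name, prof in suspects.items() if prof == unknown]


suspects = {"Suspect 1": SUSPECT_1, "Suspect 2": SUSPECT_2}
print(gel_lane(SUSPECT_1))
print(matching_suspects(UNKNOWN_SAMPLE, suspects))
```

A shared band at one region (here "hop", size 12) is not enough to include a suspect; the comparison requires agreement at every region tested.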

STR “Fingerprinting” Method

Statistical Basis and Information Management

Testing nine of these STR sites on different chromosomes in humans gives a signature that is unique to about one person in a billion.

Nine sites are used as the standard by the military for paternity matters.

Thirteen sites are commonly used for forensic tests and for the CODIS database, although this method is not sufficient for identifying differences in identical twins.

The Combined DNA Index System (CODIS) is a DNA database funded by the United States Federal Bureau of Investigation (FBI). It is a computer system that stores DNA profiles created by federal, state, and local crime laboratories in the United States, with the ability to search the database to assist in the identification of suspects in crimes.

Although the DNA Identification Act was passed in 1994, CODIS did not become fully operational until 1998.

DNA Amplification and “Specificity”

PCR amplification involves enriching a specific section of DNA for analysis.

It is considered “specific” since two different pieces of synthetic DNA (i.e. primers) are used to facilitate the synthesis of DNA in a test tube. These “primers” are specific to a known section of DNA, and if the reaction is done correctly the only DNA amplified is the intended section (e.g. an STR section is amplified while the rest of the human DNA, and any bacterial, viral, or plant DNA in the sample, is not).
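Primer specificity can be illustrated with a small "in silico" sketch: the amplified section is the stretch of template that starts at the first primer and ends where the second primer binds the opposite strand. The template and primer sequences below are made up for illustration, and real primer binding also depends on temperature and mismatches, which this sketch ignores by requiring exact matches.

```python
# Translation table for base pairing: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read in its own 5' to 3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def amplify(template, forward_primer, reverse_primer):
    """Return the section bounded by the two primers, or None.

    The forward primer must match the given strand exactly. The reverse
    primer binds the opposite strand, so its reverse complement is what
    appears on the given strand, downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    downstream_site = reverse_complement(reverse_primer)
    end = template.find(downstream_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(downstream_site)]


# Hypothetical template: flanking sequence around five CATG repeats.
template = "GGATCC" + "ACGTTGCA" + "CATG" * 5 + "TTGACCGA" + "AATT"
amplicon = amplify(template, "ACGTTGCA", "TCGGTCAA")
print(amplicon, len(amplicon))
```

If either primer has no matching site, as would be the case for unrelated bacterial or plant DNA, the function returns `None` and nothing is "amplified".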

In human samples, the presence of more than one contributing source of DNA (e.g. a sample that has been contaminated by other people at a crime scene or working in a molecular forensics lab) will be detected through the presence of more than 2 results at a locus (3, 4, or more), rather than 2 (remember… we each have 2 copies of our genome, one from mom and one from dad).

So in STR analysis, each “locus” actually has two results (two “alleles”), or one if mom and dad each provided your genome with the same size of a given STR.
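The rule described above (one or two results per locus for a single person, more than two suggesting several contributors) can be sketched as a simple check. The locus names are taken from the FBI list used in this deck, but the allele values are hypothetical and chosen only to exercise each case.

```python
def classify_locus(alleles):
    """Describe the results at one locus by counting distinct allele sizes."""
    distinct = len(set(alleles))
    if distinct == 1:
        return "one result (same size from both parents)"
    if distinct == 2:
        return "two results (different size from each parent)"
    return "more than two results (possible mixed sample)"


def mixed_loci(sample):
    """Names of loci showing more than two distinct allele sizes."""
    return sorted(locus for locus, alleles in sample.items()
                  if len(set(alleles)) > 2)


# Hypothetical allele values for illustration only.
sample = {
    "TH01": [6, 9],
    "TPOX": [8, 8],
    "D3S1358": [14, 15, 17, 18],
}

for locus, alleles in sample.items():
    print(locus, classify_locus(alleles))
print("Loci suggesting a mixture:", mixed_loci(sample))
```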

DNA Amplification and “Specificity”

The 13 STR loci used by the FBI (and other law enforcement) are:

CSF1PO, FGA, TH01, TPOX, vWA, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11

TAKEN FROM: Bruce Budowle, Genotype Profiles for Five Population Groups at the Short Tandem Repeat Loci D2S1338 and D19S433, Forensic Science Communications July 2001.

Example Results

STR “Fingerprinting” Method

Confounding Issues

1) DNA from crime scene evidence can be present in very small quantities, poorly preserved, or highly degraded, so only a partial DNA profile can be obtained. When fewer than 13 STR loci are examined, the overall genotype frequency is higher, which makes the probability of a random match higher as well.

As an example, if a suspect-sample match was made using only 4 of the STR loci (as in the earlier example), the probability of a match (true or false positive) is about 1 in 330.

2) If an individual happens to have STR alleles that are very common in his or her ethnic group, the genotype frequency can also be quite high, even when all of the core 13 STR loci are examined.

3) Crime scene samples sometimes contain DNA from several different sources, which can make identifying the source(s) of the DNA extremely difficult.
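Issue 1 above can be illustrated with the product rule: if the loci are inherited independently, the overall genotype frequency is the product of the per-locus genotype frequencies, so every locus that is dropped makes a random match more likely. This is a sketch under stated assumptions: independence between loci is assumed, and the per-locus frequency of 0.1 is a hypothetical round number. Real frequencies differ by locus and population, so the output will not reproduce the figures quoted in the slides.

```python
import math


def random_match_probability(genotype_frequencies):
    """Overall genotype frequency, assuming the loci are independent."""
    return math.prod(genotype_frequencies)


# Hypothetical per-locus genotype frequency, identical at every locus.
HYPOTHETICAL_FREQUENCY = 0.1

for n_loci in (4, 9, 13):
    p = random_match_probability([HYPOTHETICAL_FREQUENCY] * n_loci)
    print(f"{n_loci} loci: about 1 in {1 / p:,.0f}")
```

The same function also shows issue 2: replacing some of the frequencies with larger values (common alleles) raises the product even when all 13 loci are used.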

PCR Concept: Amplification of a “piece” of DNA for analysis.

Driving phenomena of PCR: Heating and Cooling

Heating: Double-stranded DNA “comes apart” when heated to near boiling. This is also called “denaturing” or “melting”.

Cooling: Complementary DNA “comes together” when cooled. This is also called “renaturing”, “annealing” or “hybridizing”.

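Figure: HEATING converts double-stranded DNA into single-stranded DNA; COOLING brings the single strands back together as double-stranded DNA.

Figure: after HEATING, PCR primers are shown on the single-stranded DNA (strand ends labeled 5’ and 3’). This section of the DNA template will be amplified.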

Most PCR applications use 30 cycles (2^30 ≈ 1.07 billion), representing an amplification of about 1 billion fold.
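The 1-billion-fold figure follows from doubling the target section once per heating and cooling cycle. A minimal sketch, assuming every cycle doubles the number of copies perfectly (real reactions are less efficient); the function name is illustrative.

```python
def copies_after(cycles, starting_copies=1):
    """Copies of the target section after a number of perfect doublings."""
    return starting_copies * 2 ** cycles


for cycles in (10, 20, 30):
    print(f"{cycles} cycles: {copies_after(cycles):,} copies")
```

Ten cycles give about a thousand-fold amplification, twenty about a million-fold, and thirty about a billion-fold.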

OTHER DNA Methods…

In Restriction Fragment Length Polymorphism (RFLP), a large area of highly variable DNA is amplified (PCR), then “cut” with a specific restriction enzyme. A restriction enzyme cuts DNA at a specific site (e.g. Nla3 cuts at CATG). Once the amplified DNA is cut (or digested), the resulting DNA fragments are separated on a gel (similar to what we discussed earlier). Since each person would have a unique subset of DNA fragments in this method, their gel pattern would be unique.
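The RFLP idea can be sketched as a string operation: cut the sequence wherever the recognition site occurs, then compare the fragment lengths. The two sequences below are hypothetical and differ by a single base that removes one CATG site. Cutting immediately after the site is an assumption made for this sketch.

```python
def digest(sequence, site="CATG"):
    """Cut the sequence immediately after each occurrence of the site."""
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + len(site)
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, cut)
    if start < len(sequence):
        fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


def gel_pattern(sequence, site="CATG"):
    """Fragment lengths, longest first, as they would order on a gel."""
    return sorted((len(f) for f in digest(sequence, site)), reverse=True)


# Hypothetical sequences: person B has lost the second CATG site.
person_a = "AACATGTTTTCATGGG"
person_b = "AACATGTTTTCTTGGG"

print(gel_pattern(person_a))
print(gel_pattern(person_b))
```

A single base difference changes the number and sizes of the fragments, which is what makes the gel patterns differ between people.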

STR has largely replaced RFLP since STR results can be described categorically much more easily, and can therefore be stored and searched in a database.
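One way to see why categorical STR results are easy to store and search: a profile reduces to a fixed set of locus names, each with one or two allele values, which can serve directly as a lookup key. The sketch below is a toy index and does not describe how CODIS is implemented; the class, function names, labels and allele values are all hypothetical.

```python
def profile_key(profile):
    """Canonical, hashable form of a profile.

    Loci are sorted by name and the alleles at each locus are sorted,
    so the same profile always produces the same key.
    """
    return tuple((locus, tuple(sorted(alleles)))
                 for locus, alleles in sorted(profile.items()))


class ProfileIndex:
    """Toy in-memory index mapping profiles to sample labels."""

    def __init__(self):
        self._labels_by_key = {}

    def add(self, label, profile):
        self._labels_by_key.setdefault(profile_key(profile), []).append(label)

    def search(self, profile):
        return self._labels_by_key.get(profile_key(profile), [])


index = ProfileIndex()
index.add("sample-001", {"TH01": (6, 9), "TPOX": (8, 8)})

# Same alleles reported in a different order still find the stored sample.
print(index.search({"TPOX": (8, 8), "TH01": (9, 6)}))
print(index.search({"TPOX": (8, 11), "TH01": (9, 6)}))
```

An RFLP gel pattern has no such natural discrete key, since band positions are continuous measurements that must be compared with a tolerance.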

mtDNA is mitochondrial DNA, which is maternally inherited (i.e. you only get this from your biological mother), and has two highly variable sections of DNA. DNA amplification and sequencing of these regions can be used to gain a positive match, but will not exclude people of the same maternal line (i.e. you, your siblings, your mother and your maternal grandmother all have the same sequence).

The advantage of mtDNA shows when STR (or other methods based on nuclear DNA) is limited, such as with samples of hair, bone, or teeth. Similarly, if the sample is highly degraded, mtDNA may be preferred since there are hundreds of mitochondria per cell, yet only one nucleus.

Bauer, Timothy M.

Herrington, Patrick C.

Jackson, Devin C.

Larew, Connor T.

Lozevski, Michael A.

McMillian, Ronald M.

Miller, Grant K.

Stampfli, Noah L.