
Source: web.mit.edu/7.03/documents/Lecture_29.pdf


7.03 Fall 2006

Lecture 29 - Polymorphisms in Human DNA Sequences

• SNPs
• SSRs

Eukaryotic Genes and Genomes

genome = DNA content of a complete haploid set of chromosomes
       = DNA content of a gamete (sperm or egg)

Species          Chromosomes  cM    DNA content/haploid (Mb)  Year sequence completed     Genes/haploid
E. coli          1            N/A   5                         1997                        4,200
S. cerevisiae    16           4000  12                        1997                        5,800
C. elegans       6            300   100                       1998                        19,000
D. melanogaster  4            280   180                       2000                        14,000
M. musculus      20           1700  3000                      2002 draft; 2005 finished?  30,000?
H. sapiens       23           3300  3000                      2001 draft; 2003 finished   30,000?

Mb = megabase = 1 million base-pairs of DNA
Kb = kilobase = 1 thousand base-pairs of DNA

Note: cM = centimorgan = 1% recombination


Species          cM    DNA content/haploid (Mb)  Generation time  Design crosses?  True breeding strains?
E. coli          N/A   5                         30 min           yes              yes
S. cerevisiae    4000  12                        90 min           yes              yes
C. elegans       300   100                       4 d              yes              yes
D. melanogaster  280   180                       2 wk             yes              yes
M. musculus      1700  3000                      3 mo             yes              yes
H. sapiens       3300  3000                      20 yr            no               no

• Human genetics is retrospective (vs. prospective). Human geneticists cannot test hypotheses prospectively. The mouse provides a prospective surrogate.

• Can't do selections

• Meager amounts of data. Human geneticists typically rely upon statistical arguments, as opposed to overwhelming amounts of data, in drawing connections between genotype and phenotype.

• Highly dependent on DNA-based maps and DNA-based analysis

The unique advantages of human genetics:

• A large population which is self-screening to a considerable degree
• Phenotypic subtlety is not lost on the observer
• The self-interest of our species


1) SNPs = single nucleotide polymorphisms = single nucleotide substitutions

A locus is said to be polymorphic if two or more alleles are each present at a frequency of at least 1% in a population of animals.

In human populations:

Hnuc = average heterozygosity per nucleotide site = 0.001
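
To get a feel for what Hnuc = 0.001 implies, the short sketch below multiplies it by the human haploid genome size given in the species table (3000 Mb). The function name is hypothetical, and the calculation assumes heterozygosity is uniform across the genome, which is a simplification.

```python
def expected_heterozygous_sites(h_nuc: float, genome_size_bp: int) -> float:
    """Expected number of nucleotide sites at which a person's two
    haploid genomes differ, assuming uniform heterozygosity per site."""
    return h_nuc * genome_size_bp


H_NUC = 0.001                      # average heterozygosity per nucleotide site
HUMAN_GENOME_BP = 3000 * 1_000_000  # 3000 Mb per haploid genome

sites = expected_heterozygous_sites(H_NUC, HUMAN_GENOME_BP)
print(f"roughly {sites:,.0f} heterozygous sites per person")
print(f"roughly one heterozygous site every {1 / H_NUC:,.0f} bp")
```

Under these assumptions a typical person is heterozygous at about three million sites, or about one site per kilobase.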


SYNONYMOUS CHANGES

TTT GCT GGC CAC  -->  TTT GCT GGA CAC
Phe Ala Gly His  -->  Phe Ala Gly His

NON-SYNONYMOUS CHANGES

TTT GCT GGC CAC  -->  TTT GCT TGC CAC
Phe Ala Gly His  -->  Phe Ala Cys His
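
The distinction between the two kinds of change can be checked mechanically by translating both sequences and comparing the proteins. The sketch below is one way to do that with the standard genetic code. The function names are hypothetical, and amino acids are written in one-letter code (F = Phe, A = Ala, G = Gly, H = His, C = Cys).

```python
def build_codon_table() -> dict:
    """Standard genetic code, built from the conventional TCAG ordering."""
    bases = "TCAG"
    amino_acids = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                   "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
    table = {}
    for i, b1 in enumerate(bases):
        for j, b2 in enumerate(bases):
            for k, b3 in enumerate(bases):
                table[b1 + b2 + b3] = amino_acids[16 * i + 4 * j + k]
    return table


CODON_TABLE = build_codon_table()


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon (trailing partial codon ignored)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def classify_substitution(reference: str, variant: str) -> str:
    """Label a substitution by whether it changes the encoded protein."""
    if len(reference) != len(variant):
        raise ValueError("substitutions must preserve sequence length")
    if reference == variant:
        return "identical"
    same_protein = translate(reference) == translate(variant)
    return "synonymous" if same_protein else "non-synonymous"


reference = "TTTGCTGGCCAC"  # Phe Ala Gly His
print(classify_substitution(reference, "TTTGCTGGACAC"))  # GGC -> GGA
print(classify_substitution(reference, "TTTGCTTGCCAC"))  # GGC -> TGC
```

The first call reports a synonymous change (Gly stays Gly) and the second a non-synonymous one (Gly becomes Cys), matching the two slide examples.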

The great majority (probably 99%) of SNPs are selectively "neutral" changes of little or no functional consequence:

• outside coding or gene regulatory regions (>97% of human genome)

• silent substitutions in coding sequences

• some amino acid substitutions do not affect protein stability or function

A small minority of SNPs are of functional consequence and are selectively advantageous or disadvantageous.

• disadvantageous SNPs selected against --> further underrepresentation


Affymetrix chip


[Figure: one-gene cross]

P:   C57black (AA, TUMORS)  X  (aa, NON-TUMORS)
F1:  Aa, All Tumorous
F2:  3 Tumors : 1 non-tumor


[Figure: two-gene cross]

P:   C57black (AAbb, TUMORS)  X  AKR (aaBB, NON-TUMORS)
F1:  AaBb, All Non-Tumors (normal)
F2:  13/16 non-tumors : 3/16 tumors

     NON-TUMORS: A-B-, aaB-, aabb
     TUMORS:     A-bb

AKR HAS A GENE (B) THAT SUPPRESSES TUMORS
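
The 13 : 3 ratio follows from enumerating the F2 of an AaBb x AaBb cross, where only A-bb animals develop tumors. The sketch below does that enumeration. It assumes the two genes assort independently (are unlinked), which the slide does not state explicitly, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def gametes(genotype: str) -> list:
    """All gametes from a two-gene genotype such as 'AaBb' (one allele per gene)."""
    gene_a, gene_b = genotype[:2], genotype[2:]
    return [a + b for a, b in product(gene_a, gene_b)]


def phenotype(alleles: str) -> str:
    """A causes tumors (dominant); B suppresses them (dominant)."""
    has_tumor_allele = "A" in alleles
    has_suppressor = "B" in alleles
    return "tumor" if has_tumor_allele and not has_suppressor else "non-tumor"


def f2_counts(f1_genotype: str = "AaBb") -> Counter:
    """Count F2 phenotypes over all 16 equally likely gamete pairings."""
    counts = Counter()
    for egg, sperm in product(gametes(f1_genotype), repeat=2):
        counts[phenotype(egg + sperm)] += 1
    return counts


counts = f2_counts()
print(counts["non-tumor"], ":", counts["tumor"])
```

Of the 16 gamete pairings, only those giving A-bb offspring (3 of 16) are scored as tumors; the remaining 13 carry either the suppressor B or no A allele at all.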


[Figure: structures of two disaccharides. Lactose: a galactose residue joined to a glucose residue by a β(1,4)-glycoside linkage. Cellobiose: a glucose residue joined to a glucose residue by a β(1,4)-glycoside linkage.]

CANDIDATE GENE

LACTOSE

The enzyme lactase that is located in the villus enterocytes of the small intestine is responsible for digestion of lactose in milk. Lactase activity is high and vital during infancy, but in most mammals, including most humans, lactase activity declines after the weaning phase. In other healthy humans, lactase activity persists at a high level throughout adult life, enabling them to digest lactose as adults. This dominantly inherited genetic trait is known as lactase persistence. The distribution of these different lactase phenotypes in human populations is highly variable and is controlled by a polymorphic element cis-acting to the lactase gene. A putative causal nucleotide change has been identified and occurs on the background of a very extended haplotype that is frequent in Northern Europeans, where lactase persistence is frequent. This single nucleotide polymorphism is located 14 kb upstream from the start of transcription of lactase in an intron of the adjacent gene MCM6. This change does not, however, explain all the variation in lactase expression.


[Figure: LACTOSE TOLERANCE. Labels: LACTASE GENE, SNP, Genotype]

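2) SSRs = simple sequence repeat polymorphisms = "microsatellites"

Most common type in mammalian genomes is the CA repeat:

(CA)n
(GT)n

[Figure: primer #1 and primer #2 flank the repeat; the region is amplified by PCR and the products are separated by gel electrophoresis]

alleles  n
A        11
B        12
C        13
D        14
E        15
F        16

[Figure: gel with lanes for genotypes AB, CD, EF, AD, CF; bands correspond to alleles A through F (n = 11 through 16)]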
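
The way SSR genotypes are read off a gel can be sketched as follows: each allele gives a PCR product whose length grows by 2 bp per CA repeat, so a heterozygote shows two bands and a homozygote one. The flanking length `FLANK_BP` is a made-up value for illustration, since the slides do not give primer positions, and the function names are hypothetical.

```python
REPEAT_UNIT = "CA"
FLANK_BP = 100  # hypothetical: primers plus flanking sequence, not from the slides

# Repeat counts for the six alleles shown on the slide
ALLELE_REPEATS = {"A": 11, "B": 12, "C": 13, "D": 14, "E": 15, "F": 16}


def product_length(allele: str) -> int:
    """Length in bp of the PCR product for one allele."""
    return FLANK_BP + len(REPEAT_UNIT) * ALLELE_REPEATS[allele]


def gel_bands(genotype: str) -> list:
    """Band sizes for a genotype such as 'AB', largest (slowest-migrating) first."""
    return sorted({product_length(allele) for allele in genotype}, reverse=True)


for genotype in ["AB", "CD", "EF", "AD", "CF"]:
    print(genotype, gel_bands(genotype))
```

Because both alleles are visible as separate bands, the heterozygote can be told apart from either homozygote, which is what makes the marker codominant.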


SSRs are extremely useful as genetic markers in human studies because:

• they are easily scored (by PCR)

• they are codominant

• many SSRs exhibit very high average heterozygosities: HSSR = 0.7 to 0.9. A randomly selected person is likely to be heterozygous.

• SSRs are abundant. SSRs occur, on average, about once every 30 kb in the human (or mouse) genomes. > 20,000 SSRs have been identified and mapped within the human genome.
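
Heterozygosity at a locus can be computed from allele frequencies as one minus the sum of squared frequencies. The sketch below applies that formula under the assumption of random mating (Hardy-Weinberg proportions). The allele frequencies used are hypothetical and chosen only to show why a multi-allelic SSR can reach values a two-allele SNP cannot.

```python
def expected_heterozygosity(allele_frequencies: list) -> float:
    """H = 1 - sum(p_i^2): chance that two randomly drawn alleles differ."""
    if abs(sum(allele_frequencies) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_frequencies)


# Hypothetical SSR with six equally common alleles (A through F)
ssr_h = expected_heterozygosity([1 / 6] * 6)

# Hypothetical SNP with two equally common alleles, the most informative case
snp_h = expected_heterozygosity([0.5, 0.5])

print(f"SSR, six equal alleles: H = {ssr_h:.3f}")
print(f"SNP, two equal alleles: H = {snp_h:.3f}")
```

With six equally frequent alleles H is about 0.83, inside the 0.7 to 0.9 range quoted for SSRs, whereas a two-allele marker can never exceed 0.5.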

Huntington's disease (HD)

Phenotype: loss of neurons, leading to personality change, memory loss, and motor problems

Inheritance: autosomal dominant, affecting 1/20,000 individuals


genetic linkage mapping

[Figure: SSR1, SSR2, SSR3, SSR4, SSR5 spaced at 20 cM intervals]

We genotype the six members of the family for SSRs scattered throughout the genome (which spans 3300 cM), perhaps 165 different SSRs distributed at 20 cM intervals so that one SSR must be within 10 cM of the Huntington's gene.

We obtain potentially exciting results with SSR37, on chromosome 4:

[Figure: pedigree with gel bands for SSR37 alleles A, B, C, D]

Genotypes:

HD:     HD/+  HD/+  HD/+  +/+  +/+  +/+
SSR37:  AB    AC    AD    BD   BC   CD

Paternal alleles:

SSR37 allele A with HD
SSR37 allele B with +
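
The reasoning on this slide has two parts: the marker-spacing arithmetic (3300 cM at 20 cM intervals) and the observation that one SSR37 allele tracks with HD in the family. The sketch below works through both. The pairing of SSR37 and HD genotypes is read from the slide's genotype row, so treat it as an assumed reconstruction, and the function names are hypothetical.

```python
GENOME_CM = 3300   # genetic length of the human genome, from the slide
SPACING_CM = 20    # interval between neighbouring SSR markers


def markers_needed(genome_cm: int, spacing_cm: int) -> int:
    """Number of evenly spaced markers needed to cover the genome (rounded up)."""
    return -(-genome_cm // spacing_cm)


def max_distance_to_marker(spacing_cm: float) -> float:
    """A gene between two markers is at most half an interval from one of them."""
    return spacing_cm / 2


# (SSR37 genotype, HD genotype) for the six family members, as read from the slide
FAMILY = [("AB", "HD/+"), ("AC", "HD/+"), ("AD", "HD/+"),
          ("BD", "+/+"), ("BC", "+/+"), ("CD", "+/+")]


def alleles_tracking_disease(family: list) -> set:
    """SSR alleles carried by every affected member and by no unaffected member."""
    affected = [set(ssr) for ssr, hd in family if hd == "HD/+"]
    unaffected = [set(ssr) for ssr, hd in family if hd == "+/+"]
    if not affected:
        return set()
    shared = set.intersection(*affected)
    seen_in_unaffected = set().union(*unaffected)
    return shared - seen_in_unaffected


print(markers_needed(GENOME_CM, SPACING_CM), "markers")
print(max_distance_to_marker(SPACING_CM), "cM at most to the nearest marker")
print(alleles_tracking_disease(FAMILY))
```

In this family only allele A is present in every HD/+ individual and absent from every +/+ individual, which is the pattern expected if SSR37 lies close to the HD gene. A family this small cannot by itself establish linkage.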