PUBH 8445: Lecture 1
Saonli Basu, Ph.D.
Division of Biostatistics, School of Public Health, University of Minnesota
Statistical Genetics
Statistical genetics can be broadly classified into three subcategories:
Mendelian Genetics: studies the transmission of alleles in pedigrees.
Population Genetics: studies the rules of how genes behave in populations.
Quantitative Genetics: studies the rules of transmission of complex quantitative traits, those with both a genetic and an environmental basis.
Saonli Basu PUBH 8445: Lecture 1
Genetic Terminologies
DNA, or deoxyribonucleic acid, is the hereditary material in humans and almost all other organisms. Nearly every cell in a person’s body has the same DNA. Most DNA is located in the cell nucleus (where it is called nuclear DNA), but a small amount of DNA can also be found in the mitochondria (where it is called mitochondrial DNA or mtDNA).
The information in DNA is stored as a code made up of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). Human DNA consists of about 3 billion bases, and more than 99 percent of those bases are the same in all people. The order, or sequence, of these bases determines the information available for building and maintaining an organism.
DNA bases pair up with each other, A with T and C with G, to form units called base pairs. Each base is also attached to a sugar molecule and a phosphate molecule. Together, a base, sugar, and phosphate are called a nucleotide. Nucleotides are arranged in two long strands that form a spiral called a double helix.
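The pairing rule above (A with T, C with G) means that one strand determines the other. A minimal Python sketch, assuming sequences are plain strings over A, C, G, T; the function names and the example sequence are illustrative and not part of the lecture:

```python
# Base-pairing rules stated above: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complementary strand read in the opposite direction.

    The two strands of the helix run in opposite directions, so the
    partner strand is conventionally written reversed.
    """
    return complement(strand)[::-1]


example = "GTAGTA"  # hypothetical sequence, for illustration only
print(complement(example))          # CATCAT
print(reverse_complement(example))  # TACTAC
```

Applying `complement` twice returns the original strand, which is one quick way to check the pairing table.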
Genetic Terminologies
[Figure: DNA's double helix, showing the cell nucleus and the base pairs adenine-thymine and guanine-cytosine.]
DNA's Double Helix. DNA molecules are found inside the cell's nucleus, tightly packed into chromosomes. Scientists use the term "double helix" to describe DNA's winding, two-stranded chemical structure. Alternating sugar and phosphate groups form the backbone of each of the helix's two strands, which run in opposite directions. Nitrogen bases on the two strands chemically pair together to form the interior of the helix. The base adenine (A) always pairs with thymine (T), while guanine (G) always pairs with cytosine (C).
Terminologies (contd)
Chromosome: The entire genome (complete set of nuclear DNA) is arranged in pairs of chromosomes.
Humans have 22 pairs of autosomes and one pair of sex chromosomes.
For every pair of chromosomes, one is inherited from the mother of an individual and one is inherited from the father.
Chromosomes that belong to the same pair and carry the same set of genes are called homologous.
Terminologies (contd)
Locus: Each position of the genome is called a “locus” (“loci” for multiple locations). A locus could represent a single base position or a collection of bases.
Allele: The variations observed in the human population at a locus are called the “alleles” for that locus. If the locus represents a single base, there are usually only two variations (up to four are possible, one per base). This type of locus is called a “Single Nucleotide Polymorphism” (SNP). There are also markers called microsatellites (Short Tandem Repeats: GTAGTAGTAGTAGTA...).
Gene: A gene is the basic physical and functional unit of heredity. Genes, which are made up of DNA, act as instructions to make molecules called proteins. In humans, genes vary in size from a few hundred DNA bases to more than 2 million bases. The Human Genome Project has estimated that humans have about 20,000 genes.
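For the microsatellite markers mentioned above, alleles differ in how many times the short motif is repeated. A rough Python sketch of counting the repeats, assuming the motif is known in advance; `longest_tandem_repeat` is a hypothetical helper name, not an established tool:

```python
import re


def longest_tandem_repeat(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    # One or more consecutive copies of the motif, matched as a single run.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# The repeat shown in the slide, treated here as a complete sequence.
print(longest_tandem_repeat("GTAGTAGTAGTAGTA", "GTA"))  # 5
```

Two chromosomes carrying, say, 5 and 7 copies of the motif would be recorded as two different alleles at that locus.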
Mendelian Inheritance
In crosses between true-breeding pea plants that produce exclusively yellow or exclusively green seeds, Mendel found that the first offspring generation (F1) always has yellow seeds. However, the following generation (F2) consistently has a 3:1 ratio of yellow to green.
This 3:1 ratio occurs in later generations as well.
Mendelian Inheritance
Law of Segregation: for any particular trait, the pair of alleles of each parent separate and only one allele passes from each parent on to an offspring. Which allele in a parent’s pair of alleles is inherited is a matter of chance. We now know that this segregation of alleles occurs during the process of meiosis.
Terminologies (contd)
Genotype: The specific combination of alleles for a given locus/gene. Going back to Mendel’s plants, we can now say that all of his true-breeding plants contained two of the same alleles at the gene location for seed color. Yellow plants in this P generation had two alleles for yellow color (YY), and green P generation plants had two alleles for green color (GG). When two alleles at a locus are identical, the individual is said to be homozygous at the location.
On the other hand, crossing the two color plants to produce F1 hybrids created a generation of plants with one Y allele and one G allele (YG). An organism with two opposing alleles at a location is said to be heterozygous.
Terminologies (contd)
Phenotype: The genetic makeup of a certain trait (e.g., YY, YG and GG) is called its genotype, while the physical expression of these traits (e.g., yellow or green) is called a phenotype.
Dominant/Recessive: For the pea plants, if the Y allele is dominant and the G allele is recessive, only two phenotypes are possible. Both the plants with YG and YY genotypes will have the yellow color phenotype, while the plants with the GG genotype will have the green color phenotype.
A trait is the general aspect of physiology being shown in the phenotype. So, for example, the trait here is the seed color of the pea plant. The phenotype can be either yellow or green, depending on the genotype.
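The 3:1 ratio in the F2 generation follows from segregation plus dominance. A small Python sketch that enumerates the equally likely offspring of a cross, assuming Y is dominant over G as in the text; the function names are illustrative:

```python
from collections import Counter
from itertools import product

DOMINANT = "Y"  # yellow allele, assumed dominant as in the text


def cross(parent1: str, parent2: str) -> Counter:
    """Enumerate the four equally likely offspring genotypes of a cross.

    Each parent passes on one of its two alleles (law of segregation).
    """
    offspring = Counter()
    for a1, a2 in product(parent1, parent2):
        # Sort so that "GY" and "YG" count as the same genotype.
        offspring["".join(sorted((a1, a2), reverse=True))] += 1
    return offspring


def phenotype(genotype: str) -> str:
    """Yellow if at least one dominant allele is present, otherwise green."""
    return "yellow" if DOMINANT in genotype else "green"


def phenotype_counts(genotypes: Counter) -> Counter:
    """Collapse genotype counts into phenotype counts."""
    counts = Counter()
    for genotype, n in genotypes.items():
        counts[phenotype(genotype)] += n
    return counts


f1 = cross("YY", "GG")  # P generation: every offspring is YG
f2 = cross("YG", "YG")  # F1 x F1: YY, YG, GG in ratio 1:2:1
print(dict(f1))
print(dict(f2))
print(dict(phenotype_counts(f2)))  # 3 yellow to 1 green
```

The genotype ratio 1:2:1 collapses to the phenotype ratio 3:1 because YY and YG look the same.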
Example: Genotype vs Phenotype
The ABO locus is on chromosome 9
The (main) alleles at the locus are A, B, and O.
The 6 genotypes are AA, AO, BB, BO, AB and OO
Homozygotes are AA, BB, OO; heterozygotes are AO, BO and AB.
The 4 phenotypes are blood types A, B, AB and O
O allele is recessive to A and to B; A and B are each dominant to O
AO and AA are blood type A; BB and BO are blood type B.
A and B are codominant: AA, AB and BB are distinguishable.
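The genotype-to-phenotype rules in this example can be written as a lookup. A short Python sketch, assuming only the three main alleles A, B and O listed above; `blood_type` is an illustrative name:

```python
from itertools import combinations_with_replacement

ALLELES = ("A", "B", "O")


def blood_type(genotype: str) -> str:
    """Map an ABO genotype such as "AO" to its blood type."""
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"  # A and B are codominant
    if "A" in alleles:
        return "A"   # AA or AO: O is recessive to A
    if "B" in alleles:
        return "B"   # BB or BO: O is recessive to B
    return "O"       # only OO remains


# Unordered pairs of alleles: AA, AB, AO, BB, BO, OO.
genotypes = ["".join(pair) for pair in combinations_with_replacement(ALLELES, 2)]
table = {g: blood_type(g) for g in genotypes}
for genotype, phenotype in table.items():
    print(genotype, "->", phenotype)
```

Six genotypes map onto four phenotypes, so blood type alone does not distinguish AA from AO or BB from BO.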
Mendel’s Principles of Genetic Inheritance
Law of Independent Assortment: In the gametes, alleles of one gene separate independently of those of another gene, and thus all possible combinations of alleles are equally probable.
Law of Dominance: Each trait is determined by two factors (alleles), inherited one from each parent. These factors each exhibit a characteristic dominant, co-dominant, or recessive expression, and those that are dominant will mask the expression of those that are recessive.
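Independent assortment and dominance together predict the classic 9:3:3:1 phenotype ratio in a cross of two double heterozygotes. The Python sketch below assumes a second gene for seed shape with alleles R (round, dominant) and r (wrinkled), which is not introduced in these slides and is used here only for illustration:

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """List the equally likely gametes for a genotype such as ("YG", "Rr").

    Independent assortment: one allele is drawn per gene, and every
    combination across genes is equally probable.
    """
    return list(product(*genotype))


def phenotype(color_alleles: str, shape_alleles: str) -> str:
    """Apply dominance at each gene (Y over G; R over r, an assumed gene)."""
    color = "yellow" if "Y" in color_alleles else "green"
    shape = "round" if "R" in shape_alleles else "wrinkled"
    return f"{color} {shape}"


def dihybrid_cross(parent1, parent2):
    """Return the probability of each offspring phenotype."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        counts[phenotype(g1[0] + g2[0], g1[1] + g2[1])] += 1
    total = sum(counts.values())
    return {p: Fraction(n, total) for p, n in counts.items()}


f1 = ("YG", "Rr")  # double heterozygote
ratios = dihybrid_cross(f1, f1)
for name, prob in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(name, prob)
```

The enumeration only holds when the two genes assort independently; linked genes on the same chromosome violate this assumption.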
Transmission of Alleles
Basic Concepts
[Figure: Parent 1 and Parent 2 are crossed; haplotypes are built from alleles A/a at SNP1 and B/b at SNP2.
High LD -> No Recombination (r² = 1): only haplotypes A B and a b are observed, so SNP1 “tags” SNP2.
Low LD -> Recombination: many possibilities (A B, A b, a B, a b, etc.).]
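The r² value quoted in the figure can be computed from haplotype counts. A minimal Python sketch using the standard definitions D = p(AB) - p(A)p(B) and r² = D² / (p(A)p(a)p(B)p(b)), plus the normalized measure D'; the haplotype lists are made up to mirror the two panels of the figure:

```python
from collections import Counter


def ld_stats(haplotypes):
    """Return (D, r2, D') for two biallelic SNPs.

    Each haplotype is a 2-character string: the SNP1 allele ("A" or "a")
    followed by the SNP2 allele ("B" or "b").
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_a = sum(c for h, c in counts.items() if h[0] == "A") / n
    p_b = sum(c for h, c in counts.items() if h[1] == "B") / n
    p_ab = counts["AB"] / n

    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    r2 = d * d / denom

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d, r2, abs(d) / d_max


high_ld = ["AB"] * 3 + ["ab"] * 3    # only two haplotypes observed
low_ld = ["AB", "Ab", "aB", "ab"]    # all four haplotypes equally common
print(ld_stats(high_ld))
print(ld_stats(low_ld))
```

With only A B and a b present, knowing the SNP1 allele fixes the SNP2 allele, which is what r² = 1 expresses.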
Quantitative Genetics
Quantitative genetics is the study of polygenic traits. Quantitative genetic variation can be described in three ways:
Traits are influenced by multiple genes, i.e., they are polygenic.
They are usually influenced more easily by environmental factors than simple Mendelian traits.
Both of the factors above usually lead to a continuous distribution of the trait. For example, heights in a sample from a population follow a near-normal distribution.
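A quick simulation illustrates why many genes plus environment give a near-normal trait. The Python sketch below assumes a purely additive model with equal effects at every locus and Gaussian environmental noise; all parameter values are arbitrary choices for illustration, not estimates for any real trait:

```python
import random
import statistics


def simulate_trait(n_individuals, n_loci=100, allele_freq=0.5,
                   effect=1.0, env_sd=2.0, seed=0):
    """Simulate a polygenic trait under an additive model (an assumption).

    Each individual carries 0, 1 or 2 copies of a trait-increasing allele
    at each locus; the trait is the sum of the allele effects plus noise.
    """
    rng = random.Random(seed)
    traits = []
    for _ in range(n_individuals):
        allele_count = sum(
            (rng.random() < allele_freq) + (rng.random() < allele_freq)
            for _ in range(n_loci)
        )
        genetic_value = effect * allele_count
        environment = rng.gauss(0.0, env_sd)
        traits.append(genetic_value + environment)
    return traits


traits = simulate_trait(2000)
print("mean:", round(statistics.mean(traits), 2))
print("sd:  ", round(statistics.stdev(traits), 2))
```

Under these settings the expected mean is 2 × 100 × 0.5 = 100, and a histogram of `traits` is bell-shaped even though each locus contributes only 0, 1 or 2.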
Allelic architecture and mapping strategy
[Figure: Steps in positional cloning. Source: Schuler (1996), Science.]
Broad Genetic Epidemiology Study Design Categories:
Linkage Analysis: follows meiotic events through families for co-segregation of disease and particular genetic variants.
  Large families
  Sibling pairs (or other family pairs)
  Works VERY well for “Mendelian” diseases
Association Studies: detect association between genetic variants and disease across families; exploit linkage disequilibrium.
  Case-control designs
  Cohort designs
  Parents with affected child trios (TDT)
  May be more appropriate for complex diseases
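As one concrete instance of an association analysis in a case-control design, allele counts in cases and controls can be compared with a 2×2 chi-square test. The Python sketch below is a simplified illustration with made-up counts; it is one common choice of test, not necessarily the method used later in the course:

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value, odds_ratio).
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts: 100 cases and 100 controls, 2 alleles each.
chi2, p_value, odds_ratio = allelic_chi_square(60, 140, 40, 160)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}, OR = {odds_ratio:.3f}")
```

In a genome-wide setting the same test is repeated at every SNP, so the significance threshold has to be far stricter than 0.05.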
Allelic architecture and mapping strategy
[Figure: Magnitude of effect (vertical axis) against frequency in population (horizontal axis), with regions labeled “Family-based linkage studies”, “Association studies in populations”, “Unlikely to exist”, and “Fn. Studies”.]
Slide thanks to D. Altshuler
Gene Discovery
[Figure: Gene discovery by year, 1980 to 2000, with two series, human Mendelian traits and human complex traits, plotted on separate vertical scales (ticks up to 1800 and up to 45).]
Source: Glazier AM, Nadeau JH, Aitman TJ. ‘Finding genes that underlie complex traits’. Science, 2002.
World-wide distribution of the IB (ABO) allele
HapMap Project
Scientists thought the mutations that caused common diseases would themselves be common.
They first identified the common mutations in the human population in a $100 million project called the HapMap. Then they compared patients’ genomes with those of healthy people. The comparisons relied on ingenious devices called SNP chips, which scan just a tiny portion of the genome.
These comparisons, called genome-wide association studies (GWAS), each cost several million dollars.
The results of this costly international exercise have been disappointing. About 2,000 sites on the human genome have been statistically linked with various diseases, but in many cases the sites are not inside working genes, suggesting there may be some conceptual flaw in the statistics.
HapMap Project
View variation patterns
Triangle plot shows LD values using r² or D’/LOD scores in one or more HapMap populations
Phased haplotype track shows all 120 chromosomes with alleles colored yellow and blue
1000 Genomes Project
[Photo: Dr. James R. Lupski, a medical geneticist with a nerve disease, had his whole genome decoded. Credit: Michael Stravato for The New York Times]
Disease Cause Is Pinpointed With Genome
By NICHOLAS WADE
Published: March 10, 2010
Two research teams have independently decoded the entire genome of patients to find the exact genetic cause of their diseases. The approach may offer a new start in the so far disappointing effort to identify the genetic roots of major killers like heart disease, diabetes and Alzheimer’s.
In the decade since the first full genetic code of a human was sequenced for some $500 million, less than a dozen genomes had been decoded, all of healthy people.
Geneticists said the new research showed it was now possible to sequence the entire genome of a patient at reasonable cost and with sufficient accuracy to be of practical use to medical researchers. One subject’s genome cost just $50,000 to decode.
“We are finally about to turn the corner, and I suspect that in the next few years human genetics will finally begin to systematically deliver clinically meaningful findings,” said
Source: http://www.nytimes.com/2010/03/11/health/research/11gene.html (2/21/2011)
Why Statistical Genetics?
Extremely interesting and fun projects.
Rewarding and gratifying to see the importance of learning statistical techniques.
Huge demand for statisticians and lots of money currently invested in developing statistical techniques to facilitate the discovery of personalised medicines.