Applied research in human genetics
Weibin Shi, Michele Sale
Identification of genes that cause disease
The central focus of human genetics research:
Which polymorphisms in
Which genes in
Which individuals
Exposed to which environmental factors
Increase risk of developing disease?
Defining what to study
As in any biomedical study, need to precisely define the disease under study
Define primary phenotype and secondary phenotypes
Understanding risk factors
Genetic or environmental?
• Ethnic differences
• Age/gender influence
Refining whether the disease under study is genetic
Family studies: familial aggregation
Twin studies: concordance rate of the disorder for monozygotic (MZ) twins vs. the rate for dizygotic (DZ) twins
Adoption studies: disease frequency in adoptees’ biological vs. adoptive parents or siblings
Ethnic differences
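The twin-study comparison above can be made concrete with a small calculation. This is a minimal sketch assuming pairwise concordance (concordant pairs divided by all pairs with at least one affected twin); the function name and the counts are hypothetical and are not taken from the slides.

```python
def pairwise_concordance(concordant_pairs, discordant_pairs):
    """Fraction of twin pairs (with at least one affected twin) in which both twins are affected."""
    total = concordant_pairs + discordant_pairs
    if total == 0:
        raise ValueError("need at least one twin pair with an affected member")
    return concordant_pairs / total


# Hypothetical counts, for illustration only.
mz_rate = pairwise_concordance(concordant_pairs=30, discordant_pairs=70)
dz_rate = pairwise_concordance(concordant_pairs=10, discordant_pairs=90)

print(f"MZ concordance: {mz_rate:.2f}")
print(f"DZ concordance: {dz_rate:.2f}")
print("MZ exceeds DZ:", mz_rate > dz_rate)
```

A clearly higher MZ than DZ concordance is the pattern expected when genetic factors contribute to the disorder, since MZ twins share all of their genetic variation and DZ twins share about half on average.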
Best proof of all?
Connect genetic variation to the disease!
But how do we find the gene?
Linkage analysis and association analysis are effective in identifying Mendelian disorder genes but are less effective in identifying complex disease genes
Complex diseases are often caused by multiple genes and environmental factors
Difficulties of genetic studies of complex disease in humans
Heterogeneity of human populations
Several to many genes involved
Modest effects for any single gene
Environmental influences
Mouse model of human genetic disease
Advantages over other mammals:
- Small size (<40 g), short generation time (8-9 wks), large litter size (5-10 pups)
- Numerous inbred strains and gene-targeted mice
- Easy control of environmental factors
Mouse genome shares great similarity with the human genome
Mouse-Human Comparison
2.5 vs. 3.2 billion bp long
> 99% of genes have homologs
> 95% of genome “syntenic” (relative gene-order conservation)
Variation among mouse strains in susceptibility to diet-induced atherosclerosis
Atherosclerotic Vascular Disease
Terminology
Discrete/qualitative trait - traits that are present or absent.
Continuous/quantitative trait - traits that have measurable characteristics across a range of values. This class includes the vast majority of diseases afflicting humans.
[Figure: diagram labeled Gene 1 through Gene 6]
Quantitative trait locus (QTL) analysis
[Figure: breeding scheme; labels: B6 x C3H, F1 x F1, F2…]
QTL analysis starts with selection of two phenotypically different strains
All F2s are analyzed for trait values
All F2s are typed for genetic markers spanning the whole genome
Statistical analysis
Map Manager QTXb20 (http://mapmgr.roswellpark.org/) and R/qtl (http://www.biostat.jhsph.edu/~kbroman/software) are available for testing the association of a phenotype with each marker.
The logarithm of the odds (LOD) score is used to define the significance of the association of a genetic marker with a trait.
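To illustrate what a single-marker LOD score measures, here is a minimal sketch of marker regression in an F2 cross: the trait is compared across the three genotype classes at one marker, and LOD = (n/2) × log10(RSS_null / RSS_marker). This is not the code used by Map Manager QTX or R/qtl; the function name, genotype labels, and trait values are hypothetical.

```python
import math
from collections import defaultdict


def marker_lod(genotypes, phenotypes):
    """Single-marker LOD score from a one-way comparison of trait values by genotype class."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    # Residual sum of squares under the null model (one mean for all mice).
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Group trait values by marker genotype.
    groups = defaultdict(list)
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)

    # Residual sum of squares when each genotype class has its own mean.
    rss_marker = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        rss_marker += sum((y - group_mean) ** 2 for y in values)

    return (n / 2) * math.log10(rss_null / rss_marker)


# Hypothetical lesion sizes for 12 F2 mice (arbitrary units).
trait = [10, 12, 11, 13, 20, 22, 19, 21, 30, 29, 31, 32]

# Marker whose genotype tracks the trait (B = B6 allele, C = C3H allele).
linked = ["BB"] * 4 + ["BC"] * 4 + ["CC"] * 4
# Marker whose genotype is unrelated to the trait.
unlinked = ["BB", "BC", "CC"] * 4

lod_linked = marker_lod(linked, trait)
lod_unlinked = marker_lod(unlinked, trait)
print(f"LOD at linked marker:   {lod_linked:.2f}")
print(f"LOD at unlinked marker: {lod_unlinked:.2f}")
```

In a real genome scan the significance threshold for the LOD score is usually set by permutation testing rather than fixed in advance.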
Genome-wide scan for atherosclerotic lesions
Interval mapping provides the best estimate of the location of genes affecting atherosclerotic lesions
Dissect major QTL by construction and analysis of congenic strains
Congenic strain: identical to an inbred strain except for a differential chromosomal segment
Sequence Comparison
If crosses include sequenced strains, search databases for polymorphisms of positional candidate genes in the QTL regions.
Sequences of 15 common inbred strains (B6, AJ, 129, DBA, C3H …) are now available at MGI, NCBI, and Ensembl
Re-sequence coding and promoter regions of strong candidate genes.
Gene expression database
Where is your gene expressed?
http://www.informatics.jax.org/javawi2/servlet/WIFetch?page=expressionQF
Is there microarray data for your gene?
http://www.ncbi.nlm.nih.gov/geo/
Conduct functional studies to prove the identity of promising candidate genes
Test the significance of QTL genes found in mouse by association analysis using human populations
Table 2 Genotyping results for genes in the human Chr 1 region homologous to the mouse Ath1 locus
Affected and Control columns: rare alleles, % (total alleles)
Gene            RefSNP ID   SNP  Position in gene (bp in Ensembl)   Affected    Control     P
PIGC            rs1063412   C/T  Exon 2 coding (169650343)          40.7 (684)  41.7 (734)  0.69
C1orf9          rs1053381   A/G  3' UTR (169819913)                 6.3 (694)   8.0 (672)   0.22
TNFSF6 (FASL)   rs763110    C/T  687 bp upstream (169866874)        31.6 (728)  32.0 (744)  0.87
Intergenic      rs983514    A/T  (170111828)                        3.0 (708)   3.5 (714)   0.57
TNFSF18         rs1883477   A/G  Intron 1 (170258429)               19.0 (694)  18.3 (706)  0.72
TNFSF4 (OX40L)  rs1234315   C/T  1,992 bp upstream (170417839)      45.9 (754)  43.3 (778)  0.31
TNFSF4 (OX40L)  rs3850641   A/G  Intron 1 (170415208)               15.5 (766)  12.1 (784)  0.05
TNFSF4 (OX40L)  rs1234313   A/G  Intron 1 (170405623)               29.6 (766)  33.4 (784)  0.11
TNFSF4 (OX40L)  rs3861950   C/T  Intron 2 (170395668)               33.4 (710)  30.4 (746)  0.23
TNFSF4 (OX40L)  rs1234312   C/T  1,809 bp downstream (170390438)    3.0 (766)   2.6 (772)   0.62
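A case-control comparison of allele frequencies like the one in Table 2 can be checked with a 2x2 chi-square test on allele counts. The sketch below assumes an uncorrected Pearson chi-square with 1 degree of freedom; the slides do not state which test produced the P column, so this is an illustration rather than a reproduction of the authors' analysis. The function name is hypothetical, and allele counts are reconstructed by rounding the reported percentages.

```python
import math


def allelic_chi_square(rare_pct_cases, total_cases, rare_pct_controls, total_controls):
    """Pearson chi-square (1 df, no continuity correction) on a 2x2 table of allele counts."""
    a = round(rare_pct_cases / 100 * total_cases)        # rare alleles, affected
    b = total_cases - a                                  # common alleles, affected
    c = round(rare_pct_controls / 100 * total_controls)  # rare alleles, control
    d = total_controls - c                               # common alleles, control
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# rs3850641 (TNFSF4): 15.5% of 766 alleles in affected vs. 12.1% of 784 in controls.
chi2, p_value = allelic_chi_square(15.5, 766, 12.1, 784)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

Under these assumptions the result for rs3850641 comes out close to the P of 0.05 shown in the table.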
Applied research in human genetics
Michèle Sale, Ph.D.
Center for Public Health Genomics
[email protected] (virginia.edu)
Tel: 982-0368
National DNA Day!
April 25
Commemorates the discovery of the structure of DNA in 1953 and the sequencing of the human genome 50 years later
Genetic Information Non-Discrimination Act of 2007 (GINA)
A version was first introduced in 1995
GINA would:
• Prohibit access to individuals' personal genetic information by insurance companies making health coverage plan enrollment decisions, and by employers making hiring decisions;
• Prohibit insurance companies from requesting that applicants for group or individual health coverage plans be subjected to genetic testing or screening, and prohibit them from discriminating against health plan applicants based on individual genetic information; and
• Prohibit employers from using genetic information to refuse employment, and prohibit them from collecting employees' personal genetic information without their explicit consent.
Nearly 40 states have had individual forms of the legislation in place
Passed by House:
• April 25, 2007 (420-3), and again
• March 7, 2008 (as part of the Paul Wellstone Mental Health and Addiction Equity Act, 268-148)
Senator Tom Coburn (R, Oklahoma) had placed a hold on the bill in the Senate
April 24, 2008: GINA passes in Senate (95-0)
Some examples from GWAS for type 2 diabetes
The first type 2 diabetes GWAS papers…
Sladek et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature. 2007 Feb 22; 445:881-5.
Frayling et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007 May 11; 316:889-94.
Steinthorsdottir et al. A variant in CDKAL1 influences insulin response and risk of type 2 diabetes. Nat Genet. 2007 Jun; 39:770-5.
Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007 Jun 7; 447:661-78.
Saxena et al. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science. 2007 Jun 1; 316:1331-6.
Zeggini et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science. 2007 Jun 1; 316:1336-41.
Scott et al. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science. 2007 Jun 1; 316:1341-5.
Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, Lund University, and Novartis Institutes of BioMedical Research. Science. 2007 Jun 1; 316:1331-6.
Association results from WTCCC replication study
Zeggini, E. et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science 316, 1336–1341 (2007). Frayling TM. Nat Rev Genet 2007 Sep; 8:657-62
Transcription-factor 7-like 2 (TCF7L2)
Major new diabetes gene
Identified as a diabetes gene by Grant et al. Nat Genet 2006 March; 38: 320-323
Not previously suspected to be involved in diabetes
Known to influence levels of at least 60 other genes!
Shown to have a role in insulin secretion (Lyssenko et al. J Clin Invest. 2007 Aug; 117:2155-63)
Replicated GWAS diabetes genes
Gene                   Chr  Reference
Previously known diabetes genes
TCF7L2                 10   Sladek, Steinthorsdottir, Scott
PPARG                  3    WTCCC, Scott
KCNJ11                 11   WTCCC, Scott, Saxena
Novel diabetes genes
SLC30A8                8    Sladek, Scott, Zeggini, Saxena
IGF2BP2                3    Saxena, WTCCC, Scott, Zeggini
CDKAL1                 6    Steinthorsdottir, Scott, Zeggini, Saxena
HHEX/IDE               10   Sladek, Scott, Zeggini, Saxena
CDKN2A/CDKN2B region   9    Saxena, WTCCC, Scott
FTO                    16   WTCCC, Scott, Zeggini
Frayling TM. Nat Rev Genet 2007 Sep; 8:657-62
Effect sizes of 11 confirmed diabetes variants
Frayling TM. Nat Rev Genet 2007 Sep; 8:657-62
TCF7L2 results
SNP: rs7903146
Population               Case frequency  Control frequency  P-value       Odds ratio
Iceland                  39%             30%                1.6 x 10^-9   1.50
Denmark                  36%             27%                0.0018        1.46
U.S. (Caucasians)        40%             28%                1.6 x 10^-7   1.71
U.K. (Caucasians)        38%             31%                1.3 x 10^-11  1.35
Finland                  22%             18%                0.00042       1.33
France                   43%             31%                6.0 x 10^-35  1.69
Netherlands              37%             29%                4.4 x 10^-5   1.41
Europe (Caucasians)      36%             28%                <0.0001       1.54
U.K. (Indian)            34%             27%                0.002         1.53
U.S. (African American)  37%             28%                4.1 x 10^-6   1.51
West Africa              41%             21%                0.0021        1.45
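The odds-ratio column relates to the case and control allele frequencies. As a minimal sketch, an unadjusted allelic odds ratio can be computed directly from the two frequencies; the function name is hypothetical, and the published odds ratios may come from adjusted or genotype-based models, so they need not match this simple calculation in every row.

```python
def allelic_odds_ratio(case_freq, control_freq):
    """Unadjusted odds ratio for carrying the allele, from case and control allele frequencies."""
    case_odds = case_freq / (1 - case_freq)
    control_odds = control_freq / (1 - control_freq)
    return case_odds / control_odds


# Iceland row: 39% in cases vs. 30% in controls.
iceland_or = allelic_odds_ratio(0.39, 0.30)
print(f"Iceland, unadjusted allelic OR: {iceland_or:.2f}")
```

For the Iceland row this simple estimate lands near the tabulated odds ratio of 1.50.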
But this variant is rarer in East Asian and Native American populations
SNP: rs7903146
Population               Case frequency  Control frequency  P-value       Odds ratio
Mexico                   19%             16%                0.16          1.25
Hong Kong (Chinese)      3%              2%                 0.42          1.27
• However, other variants in the same gene are associated with diabetes
Investigation of “European” diabetes alleles in African Americans
*Dominant model (<10 counts for minor allele homozygote)
Lewis et al. Diabetes 2008 (in press)
Columns: Gene; SNP; European reported risk allele; admixture-adjusted additive P-value; admixture-adjusted OR (95% CI)
PKN2            rs6698181   T  0.388         1.08 (0.91-1.29)
IGF2BP2         rs4402960   T  0.803         0.98 (0.87-1.11)
FLJ39370        rs17044137  A  0.747         0.98 (0.86-1.12)
CDKAL1          rs10946398  C  0.110         1.11 (0.98-1.26)
CDKAL1          rs7754840   C  0.136         1.10 (0.97-1.25)
SLC30A8         rs13266634  C  0.543*        1.46 (0.43-4.89)
CDKN2B/CDKN2A   rs564398    T  0.320*        2.99 (0.34-25.98)
CDKN2B/CDKN2A   rs10811661  T  0.128*        0.18 (0.02-1.64)
IDE/KIF11/HHEX  rs1111875   C  0.767         1.02 (0.88-1.19)
IDE/KIF11/HHEX  rs5015480   C  0.400         0.95 (0.83-1.08)
IDE/KIF11/HHEX  rs7923837   G  0.303*        1.87 (0.57-6.12)
Intragenic      rs9300039   C  0.029*        0.42 (0.19-0.91)
LOC387761       rs7480010   G  0.084         1.18 (0.98-1.44)
EXT2/ALX4       rs1113132   C  0.221*        0.47 (0.14-1.57)
EXT2/ALX4       rs11037909  T  0.511         0.94 (0.79-1.13)
EXT2/ALX4       rs3740878   A  0.129*        0.46 (0.17-1.26)
FTO             rs8050136   A  0.783         1.02 (0.90-1.15)
TCF7L2*         rs7903146   T  1.59 x 10^-6  1.39 (1.21-1.60)
Allele frequencies differ
Lewis et al. Diabetes 2008 (in press)
Columns: Gene; SNP; European reported risk allele; African American data: risk allele frequency in controls, in cases; Reported European data: reported risk allele frequency in controls, in cases; Power to detect association in African Americans: α=0.05, α=0.10
PKN2            rs6698181   T  0.153  0.156  0.290  0.320  0.237  0.345
IGF2BP2         rs4402960   T  0.525  0.528  0.304  0.341  0.555  0.675
FLJ39370        rs17044137  A  0.329  0.326  0.230  0.270  0.060  0.115
CDKAL1          rs10946398  C  0.582  0.615  0.319  0.361  0.427  0.522
CDKAL1          rs7754840   C  0.585  0.616  0.360  0.387  0.427  0.552
SLC30A8         rs13266634  C  0.914  0.916  0.609  0.649  0.169  0.263
CDKN2B/CDKN2A   rs564398    T  0.934  0.943  0.558  0.595  0.140  0.225
CDKN2B/CDKN2A   rs10811661  T  0.933  0.927  0.850  0.872  0.304  0.422
IDE/KIF11/HHEX  rs1111875   C  0.766  0.774  0.522  0.546  0.371  0.495
IDE/KIF11/HHEX  rs5015480   C  0.633  0.621  0.425  0.379  0.470  0.595
IDE/KIF11/HHEX  rs7923837   G  0.917  0.929  0.597  0.622  0.143  0.229
Intragenic      rs9300039   C  0.889  0.884  0.892  0.924  0.584  0.701
LOC387761       rs7480010   G  0.858  0.890  0.301  0.336  0.062  0.117
EXT2/ALX4       rs1113132   C  0.915  0.920  0.733  0.763  0.475  0.600
EXT2/ALX4       rs11037909  T  0.862  0.859  0.729  0.760  0.913  0.953
EXT2/ALX4       rs3740878   A  0.907  0.914  0.728  0.760  0.760  0.846
FTO             rs8050136   A  0.446  0.452  0.398  0.455  0.711  0.808
TCF7L2*         rs7903146   T  0.284  0.354  0.181  0.227  0.997  0.999
Can genetic information change practice in the clinic?
Neonatal diabetes
Mutations of the ATP-sensitive inwardly-rectifying potassium channel subunit Kir6.2 (KCNJ11) gene cause 30-58% of cases of diabetes diagnosed in patients under six months of age
The majority of cases (80-90%) are de novo mutations, so won’t be identified on the basis of family history
Neonatal diabetes – KCNJ11 mutations
Pearson ER et al. N Engl J Med 2006, 355 (5), 467-477
In the beta-cell, glucose metabolism increases intracellular ATP production from ADP
This leads to the closure of ATP-sensitive potassium channels and membrane depolarization
Subsequent activation of voltage-dependent calcium channels and influx of calcium results in insulin granule exocytosis
Patients with KCNJ11 mutations have KATP channels with decreased sensitivity to ATP
• Channels remain open in the presence of glucose
• Reducing insulin secretion
Neonatal diabetes
Since patients present with hyperglycemia and undetectable C-peptide, and frequently (30%) have ketoacidosis, they are often initially treated with insulin
A study of 49 patients showed that 90% could successfully be treated with sulfonylureas
Pearson ER et al. N Engl J Med 2006, 355 (5), 467-477
Pharmacogenetics
Cytochrome P450 table
Stamer and Stuber. Genetic factors in pain and its treatment. Curr Opin Anaesthesiol. 2007 Oct;20(5):478-84.
http://medicine.iupui.edu/flockhart/table.htm
Lanfear and McLeod. Pharmacogenetics: using DNA to optimize drug therapy. Am Fam Physician. 2007 Oct 15;76(8):1179-82.
Clinical trials
Genetic testing may allow selective recruitment of participants in whom the drug is expected to be most efficacious
• Lower costs to bring drug to market
• Will it be approved for a select genetic group?
Ethical issues
• Privacy
• Insurance
  - Health
  - Life
  - Disability
• Employment
You can’t change your genes – Why does genetics matter?
Identify new pathways involved in disease predisposition
• New “druggable” targets
More specific diagnosis
Pharmacogenetics
• Identify genetic factors that influence an individual’s response to a particular therapy
• Selection of therapies
• Clinical trial design
Outcomes
• Recovery rates
• Long-term sequelae
Era of “personalized medicine”
You can’t change your genes – Why does genetics matter?
Better prediction of who is at greatest risk and targeted early intervention
PREVENTION
J. Craig Venter
Results from Venter’s Genome
After QC filtering, 4.1 million variants; 1.288M are novel to dbSNP (30%)
• SNPs, indels, inversions, segmental duplication, and more complex variation
78% of 4.1M are SNPs; the other 22% cover 9 Mb of variant bases
62 copy number variants = 10 Mb
Total variation = 0.5% of genome
Heterozygous indels range from 1 to 321 bp
Levy et al, PLoS Biology, 2007
J. Craig Venter
Carries:
• A gene variant linked to moist ear wax production
• Genes linked to both heart disease (SORL1) and longevity
• Genes linked to
  - Alzheimer’s (APOE)
  - Macular degeneration
  - High cholesterol
• Carries up to seven gene types linked to tobacco addiction
‘Project Jim’
Bio-IT World June 2007
1.3 percent of Watson’s genome did not match the existing reference genome.
> 600,000 novel SNPs
< 68,000 insertions and deletions compared to the reference sequence, 3 bp - 7 kb
http://www.personalgenomes.org/
23andMe - Genetics Just Got Personal.
Navigenics Home