J. B. Cole, Animal Improvement Programs Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705-2350, USA [email protected]

Genomic selection and systems biology – lessons from dairy cattle breeding

Presentation made to the staff of Keygene N.V. in Wageningen, The Netherlands.

Page 1: Genomic selection and systems biology – lessons from dairy cattle breeding

J. B. Cole Animal Improvement Programs Laboratory Agricultural Research Service, USDA Beltsville, MD 20705-2350, USA [email protected]

Genomic selection and systems biology – lessons from dairy cattle breeding

Page 2: Genomic selection and systems biology – lessons from dairy cattle breeding

Keygene  N.V.,  Wageningen,  The  Netherlands,  28  May  2013  (2)   Cole  

Dairy Cattle

  9 million cows in US

  Attempt to have a calf born every year

  Replaced after 2 or 3 years of milking

  Bred using artificial insemination

  Popular bulls have 10,000+ progeny

  Cows can have many progeny through superovulation and embryo transfer

Page 3: Genomic selection and systems biology – lessons from dairy cattle breeding

Lifecycle of bull

Parents selected

Dam inseminated

Embryo transferred to recipient

Bull born

Genomic test

Semen collected (1 y)

Daughters born (9 m later)

Daughters have calves (2 y later)

Bull receives progeny test (5 y)

Page 4: Genomic selection and systems biology – lessons from dairy cattle breeding

Phenotypes recorded

  Monthly recording

  Milk, fat, and protein yields

  Somatic cell count (udder health)

  Visual appraisal for type traits

  Breed associations record pedigree

  Calving difficulty and stillbirth

Page 5: Genomic selection and systems biology – lessons from dairy cattle breeding

Available data

Type of data                  Number of records
Cows with lactation data             28,394,976
Lactations                           68,373,863
Individual test days                508,574,532
Dystocia records                     20,770,758
Animals in pedigree file             58,893,009
Genotyped bulls                         105,654
Genotyped cows                          276,173

Page 6: Genomic selection and systems biology – lessons from dairy cattle breeding

Many animals have been genotyped

[Figure: number of genotypes (y-axis, 0 to 300,000) by evaluation date in YYMM format (x-axis, 1004 to 1304), with separate series for bulls and cows.]

381,827 genotyped animals

Page 7: Genomic selection and systems biology – lessons from dairy cattle breeding

How does genetic selection work?

  ΔG = genetic gain each year

  reliability = how certain we are about our estimate of an animal’s genetic merit (genomics can ↑)

  selection intensity = how “picky” we are when making mating decisions (management can ↑)

  genetic variance = variation in the population due to genetics (we can’t really change this)

  generation interval = time between generations (genomics can ↓)
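
The equation on this slide did not survive extraction. As a rough illustration of how the four quantities combine, the sketch below uses a common textbook form of the breeder's equation, in which accuracy is the square root of reliability and the genetic standard deviation is the square root of genetic variance. The function name and all example numbers are assumptions for illustration, not values from the presentation.

```python
import math


def genetic_gain_per_year(reliability, selection_intensity,
                          genetic_variance, generation_interval):
    """Textbook breeder's equation: gain = (i * r * sigma_g) / L.

    r (accuracy) is taken as sqrt(reliability) and sigma_g as
    sqrt(genetic_variance); this form is an assumption, since the
    slide's own equation is not recoverable.
    """
    accuracy = math.sqrt(reliability)
    genetic_sd = math.sqrt(genetic_variance)
    return selection_intensity * accuracy * genetic_sd / generation_interval


# Hypothetical inputs: identical except for the generation interval.
long_interval = genetic_gain_per_year(0.81, 2.0, 100.0, 5.0)
short_interval = genetic_gain_per_year(0.81, 2.0, 100.0, 2.5)
print(long_interval, short_interval)
```

Under this form, halving the generation interval doubles the yearly gain when everything else is held constant, which is why the interval is such an attractive target for genomics.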

Page 8: Genomic selection and systems biology – lessons from dairy cattle breeding

Calculation of genomic evaluations

  Deregressed PTA derived from traditional evaluations of predictor animals

  Allele substitution effects estimated for 45,188 SNP

  Polygenic effect estimated for genetic variation not captured by SNP

  Selection index combines genomic evaluations with traditional information not included in the genomic evaluation
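
To make the second and third bullets concrete, here is a minimal sketch of how estimated allele substitution effects and a polygenic term might be summed into a genomic prediction for one animal. It assumes genotypes coded as allele counts (0, 1, 2); the function name, coding, and numbers are hypothetical and are not taken from the evaluation software described in the talk.

```python
def direct_genomic_value(genotypes, snp_effects, polygenic_effect=0.0):
    """Sum allele substitution effects over SNP, plus a polygenic term.

    genotypes: allele counts (0, 1, or 2), one per SNP (assumed coding).
    snp_effects: estimated allele substitution effects, same order.
    polygenic_effect: variation not captured by the SNP.
    """
    if len(genotypes) != len(snp_effects):
        raise ValueError("genotypes and snp_effects must have equal length")
    snp_part = sum(g * a for g, a in zip(genotypes, snp_effects))
    return polygenic_effect + snp_part


# Hypothetical animal with four SNP (the real evaluations use 45,188).
dgv = direct_genomic_value([0, 1, 2, 1], [0.5, -0.2, 0.1, 0.3],
                           polygenic_effect=0.4)
print(dgv)
```

Estimating the effects themselves requires a statistical model fitted to the deregressed PTA of the predictor animals; that step is not shown here.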

Page 9: Genomic selection and systems biology – lessons from dairy cattle breeding

Many chips are available

[Chip images: HD, 50K v2, LD, GGP HD]

  BovineSNP50

    Version 1: 54,001 SNP

    Version 2: 54,609 SNP

    45,188 used in evaluations

  High-density (HD)

    777,962 SNP

    Only 50K SNP used

  Low-density (LD)

    6,909 SNP

  Geneseek Genomic Profiler (GGP) & GGP-HD

Page 10: Genomic selection and systems biology – lessons from dairy cattle breeding

What is a SNP genotype worth?

For protein yield (h² = 0.30), the SNP genotype provides information equivalent to an additional 34 daughters

Pedigree is equivalent to information on about 7 daughters

Page 11: Genomic selection and systems biology – lessons from dairy cattle breeding

What is a SNP genotype worth?

And for daughter pregnancy rate (h² = 0.04), SNP = 131 daughters
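
The daughter counts quoted on the two preceding slides can be related to reliability with a standard progeny-test approximation, REL = n / (n + k) with k = (4 − h²) / h². The sketch below applies that approximation; it is an assumption that this matches how the slide's figures were derived, and the function names are illustrative.

```python
def reliability_from_daughters(n_daughters, heritability):
    """Approximate reliability of a sire evaluated on n daughters,
    each with one record: REL = n / (n + k), k = (4 - h2) / h2."""
    k = (4.0 - heritability) / heritability
    return n_daughters / (n_daughters + k)


def daughters_from_reliability(reliability, heritability):
    """Inverse relation: daughter equivalents = k * REL / (1 - REL)."""
    k = (4.0 - heritability) / heritability
    return k * reliability / (1.0 - reliability)


for trait, h2, n in [("protein yield", 0.30, 34),
                     ("daughter pregnancy rate", 0.04, 131)]:
    print(trait, round(reliability_from_daughters(n, h2), 2))
```

With this approximation, 34 daughters at h² = 0.30 correspond to a reliability of about 0.73, and 131 daughters at h² = 0.04 to about 0.57. Low-heritability traits need many more daughters for the same reliability, which is consistent with the larger daughter count reported for daughter pregnancy rate.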

Page 12: Genomic selection and systems biology – lessons from dairy cattle breeding

High density SNP chip

  Currently only 50K subset of SNP used

  Some increase in accuracy from better tracking of QTL possible

  Realized gains have been small

  Potential for across-breed evaluations

  Requires few new HD genotypes once adequate base for imputation developed

Page 13: Genomic selection and systems biology – lessons from dairy cattle breeding

Low density SNP chip

  6,909 SNP, mostly from the SNP50 chip

  Evenly spaced across 30 chromosomes

  Addresses performance issues with 3K while providing low-cost genotyping

  Provides over 98% accuracy imputing 50K genotypes

Page 14: Genomic selection and systems biology – lessons from dairy cattle breeding

Parentage validation and discovery

  Parent-progeny conflicts detected

  Animal checked against all other genotypes

  Reported to breeds and requesters

  Correct sire usually detected

  Maternal grandsire checking

  SNP-at-a-time checking

  Haplotype checking more accurate

  Breeds moving to accept SNP in place of microsatellites
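
One simple way to detect parent-progeny conflicts SNP by SNP is to count opposite homozygotes: a parent homozygous for one allele cannot have progeny homozygous for the other. The sketch below illustrates that idea with genotypes coded as allele counts and `None` for missing calls; the coding and the function name are assumptions, not the laboratory's actual implementation.

```python
def count_conflicts(parent, progeny):
    """Count SNP where parent and progeny are opposite homozygotes.

    Genotypes are allele counts (0, 1, 2); None marks a missing call.
    Returns (conflicts, snp_compared).
    """
    conflicts = 0
    compared = 0
    for p, o in zip(parent, progeny):
        if p is None or o is None:
            continue  # skip SNP not called in both animals
        compared += 1
        if {p, o} == {0, 2}:
            conflicts += 1
    return conflicts, compared


# Hypothetical genotypes for a candidate sire and a calf.
sire = [0, 2, 1, 2, None, 0]
calf = [0, 0, 1, 2, 1, 2]
print(count_conflicts(sire, calf))
```

A true parent should show a conflict rate close to zero (allowing for genotyping error), so comparing an animal against all other genotypes and keeping candidates with very few conflicts is one way parent discovery can work.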

Page 15: Genomic selection and systems biology – lessons from dairy cattle breeding

Imputation

  Based on splitting the genotype into individual chromosomes

  Missing SNP assigned by tracking inheritance from ancestors and descendants

  Imputed dams increase predictor population

  3K, LD, & 50K genotypes merged by imputing SNP not on LD or 3K

Page 16: Genomic selection and systems biology – lessons from dairy cattle breeding

Genotypes and haplotypes

  Genotypes indicate how many copies of each allele were inherited

  Haplotypes indicate which alleles are on which chromosome

  Observed genotypes partitioned into the two unknown haplotypes

  Pedigree haplotyping uses relatives

  Population haplotyping finds matching allele patterns

Page 17: Genomic selection and systems biology – lessons from dairy cattle breeding

Haplotyping program – findhap.f90

  Begin with population haplotyping

  Divide chromosomes into segments, ~250 to 75 SNP / segment

  List haplotypes by genotype match

  Similar to fastPhase, IMPUTE

  End with pedigree haplotyping

  Detect crossover, fix noninheritance

  Impute nongenotyped ancestors
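
The population haplotyping step can be pictured as follows: split each chromosome into segments, then look for a haplotype in a library that is compatible with the observed genotype, and take the remainder as the second haplotype. The sketch below is a deliberately simplified illustration of that matching idea, not the findhap.f90 algorithm itself; the library, segment length, and function names are hypothetical.

```python
def split_into_segments(genotype, segment_length):
    """Cut a chromosome's genotype (allele counts 0/1/2) into segments."""
    return [genotype[i:i + segment_length]
            for i in range(0, len(genotype), segment_length)]


def is_compatible(genotype, haplotype):
    """A haplotype (0/1 per SNP) fits a genotype if removing it leaves
    a valid second haplotype, i.e. every difference is 0 or 1."""
    return all((g - h) in (0, 1) for g, h in zip(genotype, haplotype))


def match_segment(genotype, haplotype_library):
    """Return (first, second) haplotypes for one segment, or None.

    The library is searched in order (for example most frequent first);
    the first compatible haplotype is kept and its complement becomes
    the second haplotype.
    """
    for hap in haplotype_library:
        if is_compatible(genotype, hap):
            complement = [g - h for g, h in zip(genotype, hap)]
            return hap, complement
    return None


library = [[1, 1, 0], [0, 1, 1]]           # hypothetical known haplotypes
print(match_segment([1, 2, 1], library))   # genotype = sum of both entries
print(match_segment([0, 0, 2], library))   # no library haplotype fits
```

Pedigree haplotyping would then refine these assignments by checking which haplotype came from which parent and locating crossovers, which this sketch does not attempt.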

Page 18: Genomic selection and systems biology – lessons from dairy cattle breeding

O-Style Haplotypes Chromosome 15

Page 19: Genomic selection and systems biology – lessons from dairy cattle breeding

We’re working on new tools

Page 20: Genomic selection and systems biology – lessons from dairy cattle breeding

Recessive defect discovery

  Check for homozygous haplotypes

  7 to 90 expected but none observed

  5 of top 11 are potentially lethal

  936 to 52,449 carrier sire-by-carrier MGS fertility records

  3.1% to 3.7% lower conception rates

  Some slightly higher stillbirth rates

  Confirmed Brachyspina same way
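
The logic of the first two bullets is that a haplotype common enough to be expected in homozygous form, yet never observed that way, is a candidate lethal recessive. The sketch below shows the arithmetic under a simple random-mating (Hardy-Weinberg) assumption; the slide does not say how its expected counts were computed, so this is an illustration rather than the method used, and the numbers are hypothetical.

```python
def expected_homozygotes(n_animals, carrier_frequency):
    """Expected number of homozygous animals under random mating.

    carrier_frequency is the share of animals carrying one copy; the
    haplotype frequency is approximated as half of that.
    """
    hap_freq = carrier_frequency / 2.0
    return n_animals * hap_freq ** 2


def prob_none_observed(n_animals, carrier_frequency):
    """Probability of seeing zero homozygotes if the haplotype were
    harmless, treating animals as independent draws."""
    hap_freq = carrier_frequency / 2.0
    return (1.0 - hap_freq ** 2) ** n_animals


# Hypothetical example: 100,000 genotyped animals, 4% carriers.
print(expected_homozygotes(100_000, 0.04))
print(prob_none_observed(100_000, 0.04))
```

When dozens of homozygotes are expected and none are seen, chance becomes an implausible explanation, which is why the fertility and stillbirth records of carrier-by-carrier matings are then examined.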

Page 21: Genomic selection and systems biology – lessons from dairy cattle breeding

Impact on producers

  Young-bull evaluations with accuracy of early first-crop evaluations

  AI organizations marketing genomically evaluated 2-year-olds

  Rate of genetic improvement may increase by up to 50%

  Studs reducing progeny-test programs

Page 22: Genomic selection and systems biology – lessons from dairy cattle breeding

Why genomics works in dairy

  Extensive historical data available

  Well-developed genetic evaluation program

  Widespread use of AI sires

  Progeny test programs

  High-valued animals, worth the cost of genotyping

  Long generation interval which can be reduced substantially by genomics

Page 23: Genomic selection and systems biology – lessons from dairy cattle breeding

Where do we go from here?

  We found a few QTL

  Most traits show infinitesimal inheritance

  Dominance effects also are small

  What about epistasis?

  Systems biology – gene/protein/transcription factor networks

Page 24: Genomic selection and systems biology – lessons from dairy cattle breeding

We confirmed known QTL

Cole, J.B. et al. 2009. Distribution and location of genetic effects for dairy traits. ICAR Tech Ser. 13:355–360.

Page 25: Genomic selection and systems biology – lessons from dairy cattle breeding

Gene set enrichment analysis-SNP

[Figure: GSEA-SNP workflow. Holden et al., 2008 (Bioinformatics 24:2784–2785)]

  Inputs: GWAS results and gene pathways (G)

  SNP ranked by significance (L)

  SNP in pathway genes (S): includes all SNP that are included in L

  Score increases for each Li in S; the increase is proportional to the SNP test statistic

  The more SNP in S that appear near the top of L, the higher the Enrichment Score

  Permutation test and FDR: nominal p-value corrected for multiple testing

  Result: pathways with moderate effects
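
The enrichment score described on this slide can be sketched as a running sum over the ranked SNP list: it steps up (in proportion to the test statistic) whenever a SNP belongs to the pathway set S and steps down otherwise, and the score is the largest deviation from zero. The code below follows the general GSEA running-sum recipe; it is a minimal sketch rather than the GSEA-SNP software, and the names and data are hypothetical.

```python
def enrichment_score(ranked_snps, statistics, pathway_snps, weight=1.0):
    """Running-sum enrichment score in the style of GSEA.

    ranked_snps: SNP ids ordered from most to least significant (L).
    statistics: dict mapping SNP id to its test statistic.
    pathway_snps: set of SNP ids located in the pathway's genes (S).
    """
    in_set = [snp in pathway_snps for snp in ranked_snps]
    n_hits = sum(in_set)
    n_miss = len(ranked_snps) - n_hits
    hit_total = sum(abs(statistics[s]) ** weight
                    for s, hit in zip(ranked_snps, in_set) if hit)
    if n_hits == 0 or n_miss == 0 or hit_total == 0:
        return 0.0
    running = 0.0
    best = 0.0
    for snp, hit in zip(ranked_snps, in_set):
        if hit:
            running += abs(statistics[snp]) ** weight / hit_total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the largest deviation from zero
    return best


ranked = ["s1", "s2", "s3", "s4", "s5", "s6"]
stats = {"s1": 5.0, "s2": 4.0, "s3": 3.0, "s4": 2.0, "s5": 1.0, "s6": 0.5}
print(enrichment_score(ranked, stats, {"s1", "s2"}))  # set at top of list
print(enrichment_score(ranked, stats, {"s5", "s6"}))  # set at bottom
```

Significance would then be assessed by recomputing the score on permuted data and applying an FDR correction across pathways, as the slide indicates; that step is omitted here.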

Page 26: Genomic selection and systems biology – lessons from dairy cattle breeding

Association weight matrix

  Find gene coexpression networks (Fortes et al., 2010)

  Select SNP by significance, correlation, distribution, etc.

−  Favor intragenic SNP significant across traits

  Construct weight matrix

−  Rows are SNP, columns are traits

−  Cells are normalized z-score of the additive effect of ith SNP on jth trait

  Significant correlations are identified using PCIT (Reverter and Chan, 2008) and visualized

−  Cells randomly permuted as control
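
The weight-matrix construction can be illustrated with a toy example: standardize the additive effects within each trait, then correlate SNP rows across traits to propose network edges. In the sketch below a plain Pearson correlation with an arbitrary threshold stands in for PCIT, which is a different and more careful algorithm; the data, threshold, and function names are all hypothetical.

```python
import math


def standardize_columns(effects):
    """effects[i][j] is the additive effect of SNP i on trait j.
    Returns z-scores computed within each trait (column)."""
    n_rows, n_cols = len(effects), len(effects[0])
    z = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [row[j] for row in effects]
        mean = sum(column) / n_rows
        sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n_rows - 1))
        for i in range(n_rows):
            z[i][j] = (column[i] - mean) / sd if sd > 0 else 0.0
    return z


def row_correlation(a, b):
    """Pearson correlation between two SNP rows across traits."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Hypothetical effects: 4 SNP (rows) by 3 traits (columns).
effects = [[0.8, 0.6, -0.1],
           [0.7, 0.5, 0.0],
           [-0.9, -0.4, 0.3],
           [0.1, -0.2, 0.9]]
awm = standardize_columns(effects)
edges = [(i, j, row_correlation(awm[i], awm[j]))
         for i in range(len(awm)) for j in range(i + 1, len(awm))]
strong = [e for e in edges if abs(e[2]) >= 0.9]  # arbitrary cut-off
print(strong)
```

Repeating the same calculation on a matrix whose cells have been randomly permuted gives the control mentioned in the last bullet.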

Page 27: Genomic selection and systems biology – lessons from dairy cattle breeding

Can we identify regulatory networks?

Fortes et al., 2011 (J. Animal Sci. 89:1669-1683. doi:10.2527/jas.2010-3681)

Candidate genes and pathways that affect age at puberty common to both breeds

Page 28: Genomic selection and systems biology – lessons from dairy cattle breeding

Network analysis

Fortes et al., 2011 (J. Animal Sci. 89:1669-1683. doi:10.2527/jas.2010-3681)

Gene network – the red center identifies highly connected nodes.

Subnetwork of interacting transcription factors from the puberty network.

Subnetwork of interacting transcription factors from a collection of mouse and human data. (Validation step.)

Page 29: Genomic selection and systems biology – lessons from dairy cattle breeding

Enriched pathways

Fortes et al., 2011 (J. Animal Sci. 89:1669-1683. doi:10.2527/jas.2010-3681)

Page 30: Genomic selection and systems biology – lessons from dairy cattle breeding

Transcription factor network

Fortes et al., 2011 (J. Animal Sci. 89:1669-1683. doi:10.2527/jas.2010-3681)

Yellow genes were submitted to database. Other nodes were mined from FunCoup.

Red: protein-protein interaction

Blue: mRNA coexpression

Page 31: Genomic selection and systems biology – lessons from dairy cattle breeding

How do we rank allele effects?

  GSEA and AWM require that we order SNP on some criterion

  p-values (actual or nominal)

  q-values (false discovery rate)

  Not all models provide p-values

  Allele substitution effects (not so good)

  Scaled substitution effects (better)

  It’s not clear (to me) which is best
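
As one concrete option for the ranking criteria listed above, the sketch below converts nominal p-values into Benjamini-Hochberg adjusted values, which control the false discovery rate and are often used in a q-value-like way, and then orders SNP by them. This is only one of several possible approaches (it is not Storey's q-value estimator), and the SNP names and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


snps = ["snp_a", "snp_b", "snp_c", "snp_d"]   # hypothetical SNP ids
p_values = [0.01, 0.04, 0.03, 0.20]           # hypothetical p-values
adjusted = benjamini_hochberg(p_values)
ranking = sorted(zip(snps, adjusted), key=lambda pair: pair[1])
print(ranking)
```

Note that the adjustment preserves the ordering of the nominal p-values (apart from ties), so the two criteria differ mainly in where a significance cut-off falls rather than in the ranking itself.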

Page 32: Genomic selection and systems biology – lessons from dairy cattle breeding

Aren’t P-values easy?

  Single SNP, fixed-effects model

  Inflation of error variances

  Spurious associations

  e.g., PLINK

  Multiple SNP, mixed-effects model

  Accounts for population structure

  e.g., TASSEL, GoldenHelix SVS

Page 33: Genomic selection and systems biology – lessons from dairy cattle breeding

A recent example from dairy

  Extreme birth weights are associated with increased risk of stillbirth and calving difficulty

  Birth weights are not measured on most dairy farms in the US

  With German colleagues, we developed a predictor based on traits we do measure

Page 34: Genomic selection and systems biology – lessons from dairy cattle breeding

GWAS for birth weight PTA

Cole et al. (2013), unpublished data

Page 35: Genomic selection and systems biology – lessons from dairy cattle breeding

KEGG pathways for birth weight

What does regulation of the actin cytoskeleton have to do with birth weight in cattle? That is, do these results make sense?

Maybe… these pathways may be involved in establishment & maintenance of pregnancy, as well as coordination of growth and development.

Cole et al. (2013), unpublished data

Page 36: Genomic selection and systems biology – lessons from dairy cattle breeding

A new project

  The Brown Swiss, Holstein, and Jersey breeds experience dystocia at different rates

  We are applying the AWM method of Fortes et al. to these data

  The goal is to identify gene networks…

  Common to all breeds

  Different by breed

Page 37: Genomic selection and systems biology – lessons from dairy cattle breeding

We have divergent populations

Cole et al., 2005 (J. Dairy Sci. 88(4):1529–1539)

Page 38: Genomic selection and systems biology – lessons from dairy cattle breeding

Challenges

  Annotation

  This is a mess in the cow

  The reference assembly may not be representative of all taurine cows

  Validation

  Doing functional genomics with large mammals is expensive – who pays?

  When have we proven something?

Page 39: Genomic selection and systems biology – lessons from dairy cattle breeding

Conclusions

  We’re not going to find big QTL for most traits

  We may identify gene networks affecting complex phenotypes

  We’re learning how much we don’t know about functional genomics in the cow

  Validation remains a problem

Page 40: Genomic selection and systems biology – lessons from dairy cattle breeding

Partners

iBMAC Consortium

  Illumina

    Marylinn Munson

    Cindy Lawley

    Christian Haudenschild

  BARC

    Curt Van Tassell

    Lakshmi Matukumalli

    Tad Sonstegard

  Missouri

    Jerry Taylor

    Bob Schnabel

    Stephanie McKay

  Alberta

    Steve Moore

  USMARC – Clay Center

    Tim Smith

    Mark Allan

Funding Agencies

  USDA/NRI/CSREES

    2006-35616-16697

    2006-35205-16888

    2006-35205-16701

  USDA/ARS

    1265-31000-081D

    1265-31000-090D

    5438-31000-073D

  Merial

    Stewart Bauck

  NAAB

    Gordon Doak

    ABS Global

    Accelerated Genetics

    Alta Genetics

    CRI/Genex

    Select Sires

    Semex Alliance

    Taurus Service

Page 41: Genomic selection and systems biology – lessons from dairy cattle breeding

Questions?

http://gigaom.com/2012/05/31/t-mobile-pits-its-math-against-verizons-the-loser-common-sense/shutterstock_76826245/