

Pharmacogenomics (2014) 15(9), 1223–1234 ISSN 1462-2416

Research Article

10.2217/PGS.14.102 © 2014 Future Medicine Ltd

Aim: Pharmacogenomics holds promise to rationalize drug use by minimizing drug toxicity while increasing drug efficacy. There are currently several assays to screen for known pharmacogenomic biomarkers for the most commonly prescribed drugs. However, these genetic screening assays cannot account for other known or novel pharmacogenomic markers.

Materials & methods: We analyzed whole-genome sequences of 482 unrelated individuals of various ethnic backgrounds to obtain their personalized pharmacogenomics profiles.

Results: Bioinformatics analysis revealed 408,964 variants in 231 pharmacogenes, of which 26,807 resided in exons and proximal regulatory sequences and 16,487 were novel. In silico analyses indicated that 1012 novel pharmacogene-related variants possibly abolish protein function. We also performed whole-genome sequencing analysis in a seven-member family of Greek origin in an effort to explain the variable response rate to acenocoumarol treatment in two family members.

Conclusion: Overall, our data demonstrate that whole-genome sequencing, unlike conventional genetic screening methods, is necessary to determine an individual’s pharmacogenomics profile in a more comprehensive manner, which, combined with the gradually decreasing whole-genome sequencing costs, would expedite bringing personalized medicine closer to reality.

Original submitted 25 November 2013; Revision submitted 30 June 2014

Keywords: drug metabolism • gene variants • personalized pharmacogenomics profile • pharmacogenomics • whole-genome sequencing

Background

Pharmacogenomics aims to correlate genomic variation and gene expression with drug efficacy and/or toxicity [1]. To date, there are several genes, also referred to as pharmacogenes, which are associated with absorption, distribution, metabolism, excretion and toxicity of several drugs (ADMET) [1–3]. Currently, there are a number of genotyping methods, namely PCR or even microarray-based assays, to perform genetic screening of known pharmacogenomic markers in well-documented pharmacogenes. The most commonly used microarray-based platforms are the AmpliChip CYP450 Test™ (Roche Diagnostics, Basel, Switzerland), a CE in vitro diagnostic certified assay that analyses 29 known CYP2D6 gene variants, including gene duplication and deletion, as well as two frequent CYP2C19 gene variants [4]. Also, the DMET+ assay (Affymetrix, CA, USA) allows simultaneous analysis of 1936 pharmacogenomic biomarkers in 231 pharmacogenes [5,6]. In addition, a high-throughput genotyping assay is also available [7]. However, these platforms, like every other microarray-based genetic screening approach, have the inherent danger of missing novel unique or rare variants in ADMET-related genes, which may affect either ADMET gene expression or enzyme structure and may hence lead to variable response or increased toxicity with commonly prescribed drugs.

Personalized pharmacogenomics profiling using whole-genome sequencing

Clint Mizzi¹, Brock Peters², Christina Mitropoulou³, Konstantinos Mitropoulos⁴, Theodora Katsila⁵, Misha R Agarwal², Ron HN van Schaik³, Radoje Drmanac², Joseph Borg‡,⁶,⁷ & George P Patrinos*,‡,⁵

¹Laboratory of Molecular Genetics, Department of Physiology & Biochemistry, University of Malta, Msida, Malta
²Complete Genomics, Inc., Mountain View, CA, USA
³Erasmus MC, Faculty of Medicine & Health Sciences, Department of Clinical Chemistry, Rotterdam, The Netherlands
⁴The Golden Helix Foundation, London, UK
⁵University of Patras, School of Health Sciences, Department of Pharmacy, University Campus, Rion, GR-26504, Patras, Greece
⁶Department of Applied Biomedical Science, Faculty of Health Sciences, University of Malta, Msida, Malta
⁷Erasmus MC, Faculty of Medicine & Health Sciences, Department of Cell Biology, Rotterdam, The Netherlands

*Author for correspondence: Tel.: +30 2610 9692339; [email protected]
‡Authors contributed equally

For reprint orders, please contact: [email protected]


The advent of next-generation sequencing has created unprecedented opportunities to analyze whole genomes, which, unlike conventional medium- or even high-throughput genetic screening approaches, allow us to obtain a full picture with respect to people’s variomes. To date, whole-exome and/or whole-genome sequencing can be easily performed using several commercially available or proprietary platforms, to comprehensively analyze genome variation with a high degree of accuracy and at reasonable cost, compared with the not so distant past. Despite the fact that whole-exome sequencing is currently more cost effective than whole-genome sequencing, it has a number of limitations [8], particularly in pharmacogenomics [9,10]: our knowledge of all truly protein-coding exons in the genome is still incomplete, so current capture probes can only target known exons; there is a degree of variability between the various commercial target enrichment kits; regulatory and untranslated regions are not sequenced; there is a significant bias in the target enrichment step, since the efficiency of capture probes varies considerably and some sequences fail to be targeted by capture probe design altogether. As such, not all templates are sequenced with equal efficiency and not all sequences can be aligned to the reference genome so as to allow base calling; therefore, a significant proportion of variants may go undetected. This may be particularly problematic for paralog genes, such as in the case of CYP2D6 gene resequencing [9,10], using certain next-generation sequencing platforms.

Here, we exploited whole-genome sequencing to identify novel and putatively causative genomic variants affecting the structure and function of 231 pharmacogenes in a large number of human genomes from various ethnic backgrounds, aiming to demonstrate the advantages of this method over the commonly used conventional genetic screening approaches. Also, we have used whole-genome sequencing to determine personalized pharmacogenomics profiles of a seven-member family of Greek origin and to correlate these profiles with anticoagulation treatment response in two of these family members.

Materials & methods

Case selection

We have analyzed whole-genome sequences from 482 non-diseased individuals of various ethnicities, as previously described [11,12]. Analysis was performed in two stages: a pilot study of 69 publicly available genomes [13] from various ethnic backgrounds (Supplementary Table 1; see online at: www.futuremedicine.com/doi/suppl/10.2217/pgs.14.102); and a follow-up study including 413 Caucasian genomes from a collection of healthy octogenarians analyzed on the same sequencing platform (Wellderly Study). In addition, a seven-member family of Greek origin was independently analyzed in an effort to correlate acenocoumarol treatment efficacy in two family members with their underlying pharmacogenomic profile (Supplementary Figure 1). Informed consent was obtained from all individuals that took part in this study and the study was approved by the ethics committees of the partnering institutions.

Prothrombin time & international normalized ratio measurement

A total of 5 ml of peripheral blood obtained by venous puncture from two family members (family members 4 and 7) receiving oral anticoagulant treatment was collected into Vacutainer tubes containing 0.105 M trisodium citrate. Prothrombin times were subsequently determined as previously described [14].

DNA isolation & whole-genome sequencing

Genomic DNA isolation was performed from saliva using the Oragene collection kit (DNA Genotek, Ontario, Canada). Whole-genome sequencing was performed using Complete Genomics’ (CA, USA) DNA nanoarray platform [11]. DNA sequencing coverage was 110×.

Bioinformatics & in silico analyses

Genomes were aligned to the hg19 reference genome. A list of nonredundant variants in 231 ADMET-related genes [5] was generated from all 482 human genomes. The list was annotated with Annovar, which was used to assess the novelty of the variants against dbSNP 137. These selected variants were compared against the 1936 known pharmacogenomics variants documented in these 231 genes [5]. Combined initial analysis resulted in a list of ADMET gene-related variants found in exons, introns, gene promoters and proximal regulatory regions of the whole-genome sequenced DNA, taking the gene definition for hg19 found on the Complete Genomics public server [15] and extending it by 200 bases upstream and 200 bases downstream. Functional variants (exonic and regulatory) are defined as variants found in coding sequences, splicing sites, upstream and downstream regions, and 5′- and 3′-UTRs. The variants were filtered according to the analysis required using custom scripts and Complete Genomics Analysis Tools (CGA™ Tools, available from [16]).
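The region-based selection described above can be illustrated with a minimal Python sketch. It is not the authors' custom scripts or CGA Tools: the `Gene` and `Variant` classes, the function names and the toy coordinates are all hypothetical, and the only figure taken from the text is the 200-base extension on each side of a gene. Novelty is modelled simply as absence from a catalogue of known variants, standing in for the dbSNP 137 comparison.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

PAD = 200  # bases added on each side of a gene, as stated in the text


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str


def padded_interval(gene: Gene, pad: int = PAD) -> Tuple[int, int]:
    """Return the gene interval widened by `pad` bases on both sides."""
    return max(1, gene.start - pad), gene.end + pad


def assign_to_genes(variants: Iterable[Variant], genes: List[Gene],
                    pad: int = PAD) -> List[Tuple[str, Variant]]:
    """Keep only variants that fall inside a padded gene interval."""
    hits = []
    for v in variants:
        for g in genes:
            lo, hi = padded_interval(g, pad)
            if v.chrom == g.chrom and lo <= v.pos <= hi:
                hits.append((g.name, v))
                break  # one gene per variant is enough for this sketch
    return hits


def split_by_novelty(hits: List[Tuple[str, Variant]], known: Set[Variant]):
    """Separate catalogued variants from those absent from the catalogue."""
    known_hits = [h for h in hits if h[1] in known]
    novel_hits = [h for h in hits if h[1] not in known]
    return known_hits, novel_hits


# Toy data: the gene name and all coordinates are invented, not real hg19 positions.
genes = [Gene("GENE_A", "chr1", 1_000, 2_000)]
variants = [
    Variant("chr1", 850, "A", "G"),    # inside the 200-base upstream extension
    Variant("chr1", 1_500, "C", "T"),  # inside the gene body
    Variant("chr1", 2_300, "G", "A"),  # beyond the downstream extension
]
known_catalogue = {Variant("chr1", 1_500, "C", "T")}

hits = assign_to_genes(variants, genes)
known_hits, novel_hits = split_by_novelty(hits, known_catalogue)
```

A linear scan over genes is adequate for a list of 231 genes; an interval tree or sorted index would be the natural replacement at genome scale.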

Figure 1. Depiction of the known and novel genomic variants in absorption, distribution, metabolism, excretion and toxicity related genes. Pharmacogenes are found throughout the whole human genome and, using next-generation sequencing technology at a whole-genome level, we identified common and rare variants from the whole genome. To depict these findings, a karyotype-like graphical representation was prepared of variants in coding and regulatory (e.g., 200 bp up- and down-stream of transcription start and polyadenylation sites, respectively) regions found in the absorption, distribution, metabolism, excretion and toxicity related genes of all 482 individuals whose genomes have been analyzed. Variants are displayed at their exact physical location in the genome, where a dark blue line indicates an area containing known variants and sites of novel variants are marked in red. The locations of the most important pharmacogenes are depicted with arrows. Chr: Chromosome.

[Figure 1 image: ideogram of chr1 to chr22 and chrX; arrows mark UGT1A1, CYP2C19, CYP2C9, VKORC1, CYP2D6 and TPMT; legend: known variant, novel variant.]

To obtain predictions for a list of possible functional variants, we have employed the SIFT (available from [17]) and PROVEAN (available from [18]) algorithms that calculate the probability of a certain sequence variant to affect protein function. PROVEAN is able to make predictions for any type of protein sequence alterations, including single or multiple amino acid substitutions, deletions and insertions [19].
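How two such predictors might be combined can be sketched as follows. The text does not state which score cutoffs were applied, so this example assumes the commonly documented defaults (SIFT score at or below 0.05 predicted damaging; PROVEAN score at or below -2.5 predicted deleterious). The `Prediction` class, the `classify` function, the variant labels and all scores are hypothetical and for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed cutoffs (documented tool defaults, not stated in the article).
SIFT_CUTOFF = 0.05      # SIFT: score <= 0.05 is predicted damaging
PROVEAN_CUTOFF = -2.5   # PROVEAN: score <= -2.5 is predicted deleterious


@dataclass(frozen=True)
class Prediction:
    variant: str
    sift: Optional[float]     # None when the tool returned no score
    provean: Optional[float]  # None when the tool returned no score


def classify(p: Prediction) -> str:
    """Combine the two predictors into a single coarse label."""
    calls = []
    if p.sift is not None:
        calls.append(p.sift <= SIFT_CUTOFF)
    if p.provean is not None:
        calls.append(p.provean <= PROVEAN_CUTOFF)
    if not calls:
        return "unscored"
    if all(calls):
        return "damaging"
    if any(calls):
        return "discordant"
    return "tolerated"


# Invented scores for placeholder variants; not results from the study.
examples = [
    Prediction("variant_A", sift=0.01, provean=-4.2),
    Prediction("variant_B", sift=0.40, provean=-3.0),
    Prediction("variant_C", sift=0.80, provean=0.5),
    Prediction("variant_D", sift=None, provean=None),
]
labels = {p.variant: classify(p) for p in examples}
```

Keeping discordant calls as a separate label, rather than forcing a majority vote, leaves the decision on such variants to manual review.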

Results

Whole-genome sequencing reveals a large number of genomic variants in the 231 pharmacogenes

We performed a first round of analysis on 69 publicly available genomes to estimate the extent of genome variation in the 231 ADMET-related genes. In the second round of analysis, we included 413 additional human genomes from the Wellderly Study to verify our findings from the first study. From the total of 482 genomes that were analyzed, we identified 408,964 variants in ADMET-related genes, of which 239,194 variants (58.5%) were found only once (allele frequency = 0.2%) and 38,636 reached frequencies of over 20%. Also, 26,807 variants were discovered in exons and proximal regulatory regions. The quality of the called variants was taken into account: from the available public genomes we chose the variants considered of high quality in the data file, as deduced from the ti/tv, homo/het and dN/dS ratios and missingness analysis (Supplementary Table 2).

On average, 17,733 variants were found for each individual in these 231 ADMET-related genes. By comparison with the latest version of dbSNP (Release 137), we identified a total of 16,487 variants (4% of all the variants) in exons and regulatory regions that were not annotated in dbSNP and that are likely to have functional significance (Figure 1), of which the majority (72.99%), namely 12,034, are found only once on one allele. These data are summarized in Table 1. It must be noted that analysis with the DMET+ chip would reveal, on average, a mere 249.52 variants per individual in their 231 pharmacogenes.

Table 1. Overview of the number of genomic variants that have been identified in the CG69 and Wellderly genome collections, by whole-genome sequencing analysis.

Variants                                                Pilot (CG69)              Follow-up (413 Wellderly)
                                                        n            %            n            %
Total                                                   87,027       100          321,924      100
Functional variants                                     4958         5.6          21,849       6.78
Unique variants                                         31,637       36.35        207,557      50.75
Total, average in each individual                       15,791       n/a          18,058       n/a
Novel, potentially functional                           682          0.78         15,803       4.9
Novel, potentially functional found only on one allele  562          0.65         11,472       3.56
Novel, potentially functional with frequency >1%        123          0.14         738          0.23
Tandem substitutions                                    2631         3.02         39,806       12.36
Insertions                                              5096         5.85         39,497       12.27
Deletions                                               6426         7.38         38,482       11.95
Number of variants found in DMET+ microarray (range)    285–366                   307–397
DMET+ coverage, %† (range)                              33.05–41.00               26.39–33.64

Novel variants in exome: 4480 (a single value; its column assignment is not recoverable from the extracted layout).
†Please see text for details.
n/a: Not applicable.

To further demonstrate the applicability of this approach, we have focused on the most important pharmacogenes, for which genetic tests are integrated in several drug labels by different regulatory bodies. We have decided to focus on the CYP2D6, CYP2C9, VKORC1, UGT1A1 and TPMT genes, since these genes are not only the most well-documented pharmacogenes but have also been previously shown to be involved in the metabolism of a variety (over 250) of the most commonly prescribed drugs, such as antiepileptic drugs, antidepressants and proton-pump inhibitors, while variants in these genes are well-established pharmacogenomic markers, also used in clinical applications and incorporated in drug labels. Our analysis revealed 2521 novel variants in these five pharmacogenes, of which 202 reside in exons and proximal regulatory regions. Also, 11 of these variants reach frequencies of over 1% and as such they might constitute important pharmacogenomic markers for the most commonly prescribed drugs, since they might affect the overall enzyme activity. An overview of these findings is displayed in Table 2 & Figure 2.

Subsequently, we sought to determine the rate of the novel versus known genomic variants present in these 231 ADMET-related genes, and in particular the location (e.g., exons, introns and regulatory regions) and nature of these variants (e.g., substitutions and indels). Our data showed that the ratio of ADMET gene-related variants found on the DMET+ microarray (Affymetrix; which includes variants in intron regions) versus functional variants (namely variants found in exons, 5′-UTR, 3′-UTR, and variants that overlap the 1-kb region up- and down-stream of transcription start sites) was between 26.39 and 40% in our study sample (Figure 3A). Out of all genomes, 1709 variants, which were either novel or known and annotated in dbSNP databases but not present on the DMET array, were found with frequencies >20%. Taken together with the previous data, these data underline the potential of whole-genome sequencing to capture several novel and potentially important ADMET-related variants in individual patients.

Furthermore, we have used the data available from the pilot study sample to analyze the occurrence of novel ADMET gene-related variants and calculated the novel versus known variant ratio among different ethnicities. Our analysis indicated that novel ADMET gene-related variations occur significantly more frequently in certain ethnic groups compared with others (Figure 4A & B). In particular, in individuals of Gujarati Indians in Houston, Texas (GIH), Japanese in Tokyo, Japan (JPT) and Maasai in Kinyawa, Kenya (MKK) origin, the ratio of novel versus known ADMET gene-related variants is almost double that of other ethnicities. This latter finding further highlights the importance of determining the incidence of pharmacogenomic markers in distinct ethnic groups and populations, hence moving from personalized to ‘populationalized’ medicine [20].
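The proportions quoted earlier in this section can be re-derived from the reported counts with a few lines of Python. This is only a worked arithmetic check using the figures from the text and Table 1; the variable and function names are illustrative. The comments also note that the quoted singleton frequency of 0.2% corresponds to one occurrence among 482 genomes, whereas counting both chromosomes of each diploid genome would give roughly half that value.

```python
# Counts as reported in the text and in Table 1.
TOTAL_VARIANTS = 408_964
SINGLETONS = 239_194
NOVEL_FUNCTIONAL = 16_487
NOVEL_FUNCTIONAL_ONE_ALLELE = 12_034
N_GENOMES = 482


def pct(part: float, whole: float, digits: int = 2) -> float:
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


singleton_share = pct(SINGLETONS, TOTAL_VARIANTS, 1)                  # quoted as 58.5%
novel_share = pct(NOVEL_FUNCTIONAL, TOTAL_VARIANTS, 0)                # quoted as 4%
one_allele_share = pct(NOVEL_FUNCTIONAL_ONE_ALLELE, NOVEL_FUNCTIONAL)  # quoted as 72.99%

# Frequency of a variant seen once: per genome versus per chromosome.
once_per_genome = pct(1, N_GENOMES, 1)          # matches the quoted 0.2%
once_per_chromosome = pct(1, 2 * N_GENOMES, 1)  # diploid count, about 0.1%

# Table 1 cross-checks: pilot (CG69) plus follow-up (Wellderly) counts.
functional_sum = 4_958 + 21_849   # functional variants
unique_sum = 31_637 + 207_557     # unique variants
one_allele_sum = 562 + 11_472     # novel, potentially functional, one allele
```

The three Table 1 sums reproduce the 26,807 functional variants, the 239,194 variants found only once and the 12,034 single-allele novel variants given in the text.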

In silico analysis

In order to determine whether a fraction of all novel exome variants have a functional significance, for example, by diminishing gene expression or by exerting a damaging effect on the resulting protein product, we have performed in silico analysis using the PROVEAN and SIFT algorithms [19,21]. In this analysis, we paid particular attention to the five very well-known pharmacogenes, namely CYP2D6, CYP2C9, UGT1A1, TPMT and VKORC1. Our in silico analysis using the SIFT and PROVEAN algorithms showed that a number of missense mutations in these genes are likely to have a damaging effect on their respective protein products. This in turn may directly impact on drug metabolism, which may lead to reduced or diminished drug efficacy and/or increased drug toxicity. Analysis of novel variants found in the CYP2D6 gene revealed three missense mutations, one of which affects two amino acids (p.Leu91Met and p.His94Arg) in the protein since it is a substitution DNA mutation, and six frameshift mutations. In TPMT, analysis revealed three missense, one nonsense and one frameshift mutation. The frameshift and nonsense mutations, as evidenced by SIFT and PolyPhen analysis, putatively have a deleterious and damaging effect on the enzymes, while missense mutations can impact the activity of the enzyme. A portrayal of a number of these missense mutations in a 3D protein structure was constructed to highlight the location occupied by the mutated amino acid (Supplementary Figure 1). Data files from the Protein Data Bank (PDB) and coordinates were obtained from the Research Collaboratory for Structural Bioinformatics (RCSB) for the TPMT (PDB 2H11), CYP2C9 (PDB 1OG2) and CYP2D6 (PDB 2F9Q) proteins and analyzed by a desktop version of PyMOL v1.6 [22,23]. PyMOL was used to highlight the significant amino acid changes between the wild-type (normal) protein sequence and the mutant (abnormal) protein sequence. The 3D structures depict a single polypeptide chain, out of a possible tetramer, such as the case in CYP2D6, or duplex, such as the case in CYP2C9 and TPMT. An interesting nonsense mutation in TPMT ablates the last nine amino acids, affecting the C-terminal end of the protein, and was also highlighted in the protein (Supplementary Figure 1).

Figure 2. Novel variants identified in the most important pharmacogenes. The total number of variants (depicted in blue above each gene) is compared with the number of novel variants (not annotated in dbSNP 137; depicted in red above each gene) identified in the most important pharmacogenes, namely CYP2D6, CYP2C9, VKORC1, UGT1A1 and TPMT. Green dots depict variants with a mutant allele frequency of >20%, while black dots depict variants found in the DMET+ Array and in the genomes. See also Table 2.

[Figure 2 image: one panel per gene, labeled CYP2D6, CYP2C9, VKORC1, TPMT and UGT1A1.]

Table 2. Overview of the number of the novel and total genomic variants identified in the CYP2D6, CYP2C9, VKORC1, UGT1A1 and TPMT genes.

                                                            CYP2D6                CYP2C9                VKORC1                UGT1A1                TPMT
Variants                                                    Wellderly  CG69       Wellderly  CG69       Wellderly  CG69       Wellderly  CG69       Wellderly  CG69
Total (N)                                                   113        16         1266       89         191        4          254        55         570        29
Total (T)                                                   241        68         1675       489        226        23         332        152        724        209
Total, average per individual (T)                           24.7       11.2       67.89      54.95      9.40       3.39       13.29      24.30      47.22      42.08
Novel, potentially functional found only on one allele (N)  15         1          28         2          13         0          19         4          70         3
Novel, potentially functional with frequency >1% (N)        1          0          1          0          1          0          5          2          2          2
Tandem substitutions (T)                                    28         3          235        8          27         0          32         44         82         5
Insertions (T)                                              11         0          190        25         34         3          40         0          105        14
Deletions (T)                                               19         5          140        16         2          0          37         0          91         15
DMET+ coverage (T)                                          16         11         9          5          11         10         9          10         7          5
DMET/total ratio, % (T)                                     6.64       16.18      0.54       1.02       4.87       43.48      2.71       6.58       0.97       2.39

N: Novel; n/a: Not applicable; T: Total. Each row shows the column (N or T) that carries a value; where only one is shown, the other is n/a. See also Figure 2.
The rows "Functional variants" and "Novel variants in exome" also belong to this table, but their cell boundaries are not recoverable from the extracted text; the digit runs read 23 70 132 31 50 219 18 32 05 31 39 613 91 106 5 27 and 10 50 129 12 26 211 6 11 03 24 20 613 7 14 07, respectively.
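The "DMET/total ratio (%)" row of Table 2 can be reproduced from two other rows of the same table, which gives a quick consistency check on the per-gene figures. The sketch below assumes that the ratio is the DMET+ coverage count divided by the total (T) variant count for the same gene and collection; that definition is inferred from the numbers rather than stated in the text, and the dictionary and function names are illustrative.

```python
# (gene, collection) -> (DMET+ coverage, total variants T, published ratio in %)
# Values as read from Table 2.
TABLE2_DMET = {
    ("CYP2D6", "Wellderly"): (16, 241, 6.64),
    ("CYP2D6", "CG69"): (11, 68, 16.18),
    ("CYP2C9", "Wellderly"): (9, 1675, 0.54),
    ("CYP2C9", "CG69"): (5, 489, 1.02),
    ("VKORC1", "Wellderly"): (11, 226, 4.87),
    ("VKORC1", "CG69"): (10, 23, 43.48),
    ("UGT1A1", "Wellderly"): (9, 332, 2.71),
    ("UGT1A1", "CG69"): (10, 152, 6.58),
    ("TPMT", "Wellderly"): (7, 724, 0.97),
    ("TPMT", "CG69"): (5, 209, 2.39),
}


def dmet_total_ratio(covered: int, total: int) -> float:
    """Share of a gene's variants that the array covers, in percent."""
    return round(100 * covered / total, 2)


recomputed = {
    key: dmet_total_ratio(covered, total)
    for key, (covered, total, _) in TABLE2_DMET.items()
}

# Entries whose recomputed ratio differs from the published one by more
# than a rounding step; an empty list means the rows agree.
mismatches = [
    key for key, (_, _, published) in TABLE2_DMET.items()
    if abs(recomputed[key] - published) > 0.005
]
```

Under this assumed definition every recomputed value agrees with the published row to two decimals, so `mismatches` stays empty.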

Figure 3. Distribution of absorption, distribution, metabolism, excretion and toxicity gene-related variants in the 482 human genomes. Percentage of absorption, distribution, metabolism, excretion and toxicity gene-related variants in each genome, classified into five categories: DMET coverage [5], novel functional variants, tandem substitutions, insertions and deletions (shown on the x-axis).

Delineating personalized pharmacogenomics profiles with anticoagulation treatment efficacy

Finally, in order to demonstrate the applicability of this approach in a real-life clinical scenario, we have performed whole-genome sequencing in seven members (three generations) of a family of Greek origin, to identify novel and putatively functional pharmacogenomic markers in the ADMET-related genes, and attempted to correlate these with response to acenocoumarol treatment. In particular, family members 4 and 7, who are unrelated (Supplementary Figure 2), are both diagnosed with atrial fibrillation and undergo acenocoumarol treatment. However, although family member 7 responds well to acenocoumarol treatment, as assessed by stable drug dosing (6 mg/week for the last 36 months) and stable international normalized ratio (INR) measurements (within 2–3; Supplementary Figure 3), family member 4 does not respond well to acenocoumarol treatment, as deduced from the fluctuating INR values, the underlying drug dosing scheme and the hospitalization for severe bleeding. Our data showed that around one-third of the variants were different between the two patients: 33.44% of functional variants found in family member 4 are not found in family member 7, and 34.29% of functional variants found in family member 7 are not found in family member 4. Similarly, our analysis revealed 38 novel putatively functional variants in family member 4 that are not found in family member 7, and 40 novel putatively functional variants found in family member 7 that are not found in family member 4 (Supplementary Table 3). We also compared novel variants found in the trio (father, mother and child) with data found in the other datasets. Around 5000 high-quality novel variants were identified from the trio. We filtered variants found in the CG69 and Wellderly datasets, resulting in 1700 variants; PolyPhen annotations were added to this list and only those variants that were annotated were kept, resulting in seven variants (Supplementary Table 4). Looking more closely at the genes relevant to acenocoumarol treatment, we found that family member 4 is heterozygous for the CYP2C9 rs1799853 variation (CYP2C9*2) and for the rs9332242 variant in the 3′-UTR of the CYP2C9 gene. On the contrary, family member 7 had no variations in the CYP2C9 gene (Table 3).
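The pairwise comparison and the trio filtering described above amount to set operations on variant identifiers, which can be sketched as follows. The variant keys and toy sets are invented, the function names are hypothetical, and the direction of the cohort filter (discarding trio variants that also occur in the CG69 and Wellderly datasets) is an assumption, since the text does not say explicitly whether such variants were removed or retained.

```python
from typing import Set, Tuple

# A variant is identified here by (chromosome, position, ref, alt).
VariantKey = Tuple[str, int, str, str]


def private_percentage(a: Set[VariantKey], b: Set[VariantKey]) -> float:
    """Percentage of the variants in `a` that are absent from `b`."""
    if not a:
        raise ValueError("the first variant set is empty")
    return 100.0 * len(a - b) / len(a)


def trio_candidates(trio_novel: Set[VariantKey],
                    cohort: Set[VariantKey],
                    annotated: Set[VariantKey]) -> Set[VariantKey]:
    """Trio variants not seen in the cohort that also carry an annotation.

    Assumption: cohort variants are removed rather than kept.
    """
    return (trio_novel - cohort) & annotated


# Toy data with invented coordinates.
v1 = ("chr10", 101, "C", "T")
v2 = ("chr10", 202, "A", "G")
v3 = ("chr16", 303, "G", "A")
v4 = ("chr16", 404, "T", "C")
v5 = ("chr7", 505, "G", "C")

member_4 = {v1, v2, v3}
member_7 = {v2, v3, v4, v5}

private_4 = private_percentage(member_4, member_7)  # 1 of 3 variants
private_7 = private_percentage(member_7, member_4)  # 2 of 4 variants

candidates = trio_candidates(
    trio_novel={v1, v4, v5},
    cohort={v4},          # stands in for the CG69 and Wellderly variants
    annotated={v1, v4},   # stands in for variants with a PolyPhen annotation
)
```

The comparison is asymmetric by construction, which is why the text reports two percentages (33.44% and 34.29%) for the same pair of individuals.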

Our approach identified CYP2C9 and CYP2C19 gene variants not found on the DMET+ chip and the DMET coverage was an average of 36.18%, similar to those found in the 482 genomes. However, although we failed to identify novel functional variants in the CYP2C9 and VKORC1 genes, related to acenocoumarol treatment, we identified several novel variants found in other pharmacogenes, some of them in genes involved in the clopidogrel pathway (Supplementary Figure 4) [24]. The lack of any sequence variants in genes involved in alternative anticoagulation therapies, such as clopidogrel, may suggest that family member 4 could consider altering her anticoagulation treatment modality from acenocoumarol to clopidogrel. Similarly, the plethora of known and novel genomic variants not only in the CYP2C19 gene (namely rs17885098 and rs3758581, both belonging to haplotypes linked to increased toxicity risk), but also in other genes, such as ABCB1, CYP2B6 and CYP3A5, all of which are involved in the clopidogrel pathway (Supplementary Figure 3), strongly suggest that family member 7 should, under no circumstances, modify the anticoagulation treatment modality. At this point, it should be stressed that although acenocoumarol and clopidogrel act via a different mechanism (acenocoumarol is an anticoagulant, while clopidogrel is an antiplatelet agent), they are both frequently used to treat patients with atrial fibrillation, but should not generally be considered as therapeutic substitutes.

Discussion

The postgenomic revolution, characterized by the advent of massively parallel whole-genome and -exome sequencing, has led to the correlation of specific genomic variants with disease predisposition and other clinical features, including response to some of the most commonly prescribed drugs. There are currently very few studies to demonstrate the usefulness of whole-genome sequencing in pharmacogenomics. Initially, Ashley and coworkers showed that variants in a person’s genome could suggest probable clopidogrel resistance, positive response to lipid-lowering therapy and indicated the need for a low initial dose for warfa-

DMET coverage

45

40

35

30

25

20

15

10

5

0

Per

cen

tag

e (%

)

Novelfunctional variants

Substitutions Insertions Deletions

Variant category

Page 8: Research Article Pharmacogenomics Research Article club/Mizzi et al.pdf · Research Article Mizzi, Peters, Mitropoulou et al. The advent of next-generation sequencing has cre-ated

1230 Pharmacogenomics (2014) 15(9)

Fig

ure

4. D

iffe

ren

ces

ob

serv

ed in

th

e ra

tes

of

no

vel a

nd

kn

ow

n v

aria

nts

am

on

g d

iffe

ren

t et

hn

icit

ies.

(A

) Pe

rcen

tag

es

of

kno

wn

ab

sorp

tio

n, d

istr

ibu

tio

n, m

etab

olis

m,

excr

etio

n a

nd

to

xici

ty (

AD

ME

T)

gen

e-r

elat

ed v

aria

nts

, pre

sen

t in

th

e D

ME

T+ p

latf

orm

, dis

trib

ute

d o

ver

the

12 d

iffe

ren

t et

hn

ic g

rou

ps

of

the

pilo

t C

G69

hu

man

gen

om

es

anal

ysis

. Ple

ase

also

see

th

e Su

pp

lem

enta

ry M

ater

ial)

. (B

) Pe

rcen

tag

es

of

no

vel A

DM

ET

gen

e va

rian

ts in

th

e 69

hu

man

gen

om

es

anal

yzed

, dis

trib

ute

d o

ver

the

dif

fere

nt

eth

nic

gro

up

s (p

leas

e al

so s

ee t

he

Sup

ple

men

tary

Mat

eria

l). (

C)

Kn

ow

n/n

ove

l AD

ME

T g

ene

-rel

ated

var

ian

ts in

th

e va

rio

us

eth

nic

gro

up

s an

alyz

ed. T

he

12 e

thn

ic g

rou

ps

com

pri

sin

g t

he

CG

69 c

olle

ctio

n is

sh

ow

n a

t th

e x-

axis

. A

SW: A

fric

an a

nce

stry

in S

ou

thw

est

USA

; CEP

H/U

TAH

: Uta

h r

esi

den

ts w

ith

an

cest

ry f

rom

no

rth

ern

an

d w

est

ern

Eu

rop

e; C

EU: U

tah

re

sid

ents

wit

h n

ort

her

n a

nd

we

ster

n

Euro

pea

n a

nce

stry

fro

m t

he

CEP

H c

olle

ctio

n; C

HB

: Han

Ch

ine

se in

Bei

jing

, Ch

ina

; GIH

: Gu

jara

ti In

dia

ns

in H

ou

sto

n, T

exas

; JP

T: J

apan

ese

in T

ok

yo, J

apan

; LW

K: L

uh

ya in

W

ebu

ye, K

enya

; MK

K: M

aasa

i in

Kin

yaw

a, K

enya

; MX

L: M

exic

an a

nce

stry

in L

os

An

gel

es,

Cal

ifo

rnia

; PU

R: P

uer

to R

ican

s in

Pu

erto

Ric

o; T

SI: T

osc

ani i

n It

alia

; YR

I: Y

oru

ba

in Ib

adan

, Nig

eria

.

future science group

Research Article Mizzi, Peters, Mitropoulou et al.D

ME

T c

over

age

42 40 38 36 34 32 30

Percentage (%)

PUR

CEU

YRI

CHB

JPT

LWK

MXL

ASW

TSI

GIH

MKK

PUR

CEU

YRI

CHB

JPT

LWK

MXL

ASW

TSI

GIH

MKK

6 5 4 3 2 1 0

Percentage (%)

Kn

ow

n/n

ovel

Nov

el v

aria

nts

PUR

CEU

CEPH/UTAH

CEPH/UTAH

CEPH/UTAH

YRI

CHB

JPT

LWK

MXL

ASW

TSI

GIH

MKK

024681012141618

Percentage (%)C

G69

co

llect

ion

eth

nic

gro

up

sC

G69

co

llect

ion

eth

nic

gro

up

s

CG

69 c

olle

ctio

n e

thn

ic g

rou

ps

Page 9: Research Article Pharmacogenomics Research Article club/Mizzi et al.pdf · Research Article Mizzi, Peters, Mitropoulou et al. The advent of next-generation sequencing has cre-ated

www.futuremedicine.com 1231future science group

Personalized pharmacogenomics profiling using whole-genome sequencing Research Article

rin [25]. In another study, Drögemöller and coworkers attempted to address for the first time the application of whole-genome resequencing in pharmaco genomics applications [26]. Indeed a year later, Nelson and coworkers identified a large number of rare functional variants in 202 drug target genes carried out in just over 14,000 individuals [27]. This study took a very large population genetics approach, and focused on a different set of genes (drug targets) than the ones described in this paper. Nelson and co workers suc-cessfully showed that mutations in such genes can very well reduce or completely abolish efficacy of cor-responding drugs, highlighting the importance and further support the use of whole-genome sequencing for personalized healthcare and clinical diagnostics [27]. In our study, we sought to identify the number of variants (both common and rare), not only in many more pharmacogenes (n = 231), but most importantly those that are directly involved in drug metabolism, while we have also taken a more personalized medicine approach by linking novel variants with drug response. Genetic disorders have already begun to greatly ben-efit from whole-genome sequencing applications and genomic data from selected patients with well-defined phenotypes are accummulating at a very rapid pace [28]. Studies making use of whole-genome sequencing technology in recent years, have led to an exponential increase in the generation of human genome variation data from research centers and diagnostic laboratories, thereby enriching our knowledge of the genetic hetero-geneity underlying both monogenic disorders and complex clinical traits. 
As previously discussed, it is obvious that analyzing only the exome fraction of our genome, though inexpensive, has several disadvantages over whole-genome sequencing (e.g., poor coverage of some coding regions as a result of the target enrichment step and zero coverage of potentially important and clinically relevant genomic variants lying in genomic regions other than exons).

In this study, we attempted to exploit the ultimate genetic analysis, namely the whole-genome nucleotide sequence analysis of individual genomes to determine personalized pharmacogenomics profiles. We have opted to use the proprietary DNA nanoball resequencing technology of Complete Genomics, instead of the other available next-generation sequencing technologies since the mate pair and unique mapping-based sequence assembly of Complete Genomics does not call any variant in a sequence if there is a conflicting repeated region, something that is particularly important for the CYP2D6 gene [28]. In other words, due to mate pairs, most of the genes (or most of their exons) can be properly sequenced in spite of having closely related pseudogenes [11].

Table 3. Genomic variants found in the CYP2C9 and CYP2C19 genes, related to anticoagulation treatment (acenocoumarol and clopidogrel, respectively) identified as being heterozygous in family members 4 and 7.

Chromosome | Start nucleotide | End nucleotide | Reference sequence | Variant sequence | dbSNP137 annotation | Location | Gene | Exonic effect | Found in the DMET+ array | Frequency in Greece
Found in family member 4 but not in family member 7
chr10 | 96702046 | 96702047 | C | T | rs1799853 | Exonic | CYP2C9 | Nonsynonymous SNV | Yes | 16%
chr10 | 96748892 | 96748893 | C | G | rs9332242 | 3’-UTR | CYP2C9 | NA | No | NA
chr10 | 96749326 | 96749327 | T | G | rs186217659 | Downstream | CYP2C9 | NA | No | NA
Found in family member 7 but not in family member 4
chr10 | 96522560 | 96522561 | T | C | rs17885098 | Exonic | CYP2C19 | Synonymous SNV | No | NA
chr10 | 96602622 | 96602623 | G | A | rs3758581 | Exonic | CYP2C19 | Nonsynonymous SNV | Yes | 7.4%

The family tree and corresponding international normalized ratio values are provided in Supplementary Figures 1 & 2, respectively.
Chr: Chromosome; NA: Not available; SNV: Single nucleotide variation.
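The comparison summarized in Table 3 amounts to a set difference between the heterozygous calls of the two family members. The sketch below is only an illustration and not the pipeline used in this study: the variant records are transcribed from Table 3, whereas the names `Variant`, `member4`, `member7` and `private_variants` are hypothetical, and only the variants listed in Table 3 (those that differ between the two individuals) are included.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    start: int
    end: int
    ref: str
    alt: str
    rsid: str
    gene: str


# Heterozygous calls transcribed from Table 3 (container names are hypothetical).
member4 = {
    Variant("chr10", 96702046, 96702047, "C", "T", "rs1799853", "CYP2C9"),
    Variant("chr10", 96748892, 96748893, "C", "G", "rs9332242", "CYP2C9"),
    Variant("chr10", 96749326, 96749327, "T", "G", "rs186217659", "CYP2C9"),
}
member7 = {
    Variant("chr10", 96522560, 96522561, "T", "C", "rs17885098", "CYP2C19"),
    Variant("chr10", 96602622, 96602623, "G", "A", "rs3758581", "CYP2C19"),
}


def private_variants(carrier, other):
    """Return variants present in `carrier` but absent from `other`, sorted by position."""
    return sorted(carrier - other, key=lambda v: (v.chrom, v.start))


only_in_4 = private_variants(member4, member7)
only_in_7 = private_variants(member7, member4)

for label, variants in (("member 4 only", only_in_4), ("member 7 only", only_in_7)):
    for v in variants:
        print(label, v.gene, v.rsid, f"{v.chrom}:{v.start}-{v.end}", f"{v.ref}>{v.alt}")
```

Keying each record on position and alleles, rather than on the dbSNP identifier alone, means the same comparison would also work for novel variants that have no identifier yet.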


Although the exact nature of intronic variants and of variants residing in intergenic regions has not yet been completely elucidated, failing to determine a comprehensive personalized pharmacogenomic profile and to detect rare, if not unique, but clinically significant pharmacogenomic markers may have serious implications in individual patients. This may, in turn, increase the chances of poor drug response and of adverse reactions, which will ultimately result in increased treatment costs.

Our data showed that the number of ADMET-related variants identified by whole-genome sequencing is significantly higher than the number that would have been identified if DMET+, the most comprehensive genotyping platform for pharmacogenomic biomarkers available to date, or any other currently available genetic screening platform had been used. Although this is to be expected, it clearly shows that, as with every other genetic screening method, a seemingly negative result from a pharmacogenomic screening platform does not necessarily mean that a person does not bear other novel and putatively deleterious variants in ADMET genes that would lead to reduced drug efficacy or increased drug toxicity.
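One way to make this comparison concrete is to ask what fraction of the variants found by sequencing is also interrogated by the genotyping panel. The following is a minimal sketch, assuming that coverage is simply the share of sequenced variants whose dbSNP identifier appears on the panel; the exact definition behind the DMET coverage figures reported in this paper may differ. The function `panel_coverage` is hypothetical, and the five identifiers and their DMET+ status are taken from Table 3 purely as a worked example.

```python
def panel_coverage(sequenced_rsids, panel_rsids):
    """Percentage of sequenced variants that are also present on the genotyping panel."""
    sequenced = set(sequenced_rsids)
    if not sequenced:
        raise ValueError("no sequenced variants supplied")
    on_panel = sequenced & set(panel_rsids)
    return 100.0 * len(on_panel) / len(sequenced)


# Variants listed in Table 3; the second list holds those marked "Yes" for DMET+.
table3_rsids = ["rs1799853", "rs9332242", "rs186217659", "rs17885098", "rs3758581"]
table3_on_dmet = ["rs1799853", "rs3758581"]

coverage = panel_coverage(table3_rsids, table3_on_dmet)
missed = sorted(set(table3_rsids) - set(table3_on_dmet))

print(f"coverage: {coverage:.1f}%")
print("missed by the panel:", missed)
```

In this five-variant example, three of the variants would be invisible to a panel-based test, which is the situation described above: a negative panel result says nothing about variants the panel does not interrogate.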

Our independent whole-genome sequencing analysis of the seven-member Greek family led to two notable findings. First, the analysis explained, at least in part, the variation in acenocoumarol treatment response between the two family members, as a result of the presence of the CYP2C9*2 allele in heterozygosity in family member 4 (Table 3). Of course, the involvement of other novel genomic variants in genes yet to be shown to relate to acenocoumarol metabolism cannot be excluded. Second, the analysis suggested that family member 4 may consider altering her anticoagulation treatment modality from acenocoumarol to clopidogrel, since this alternative treatment may yield better clinical results in her case, whereas family member 7 should, under no circumstances, modify the anticoagulation treatment modality, since she may then face adverse reactions.

Our study has a number of limitations. First of all, due to the large number of novel variants that were identified, we were unable to perform functional studies to confirm whether certain genomic variants in those 231 pharmacogenes are indeed deleterious to protein function and as such can be exploited as pharmacogenomic biomarkers. Although the SIFT-based in silico analysis provides an attractive alternative to predict pathogenicity of the novel variants, it cannot fully replace functional assays. Using these datasets, we could not create a set of cases and controls to study the response to acenocoumarol treatment. Furthermore, because the (rare) variants of interest have not been validated using an independent genotyping technology, such as Sanger sequencing, these findings should be treated with caution. Also, the analysis of just two members from a single family does not allow us to draw solid conclusions but only to highlight the importance and potential usefulness of whole-genome sequencing in pharmacogenomics. Finally, our study takes into consideration only those 231 pharmacogenes that have been previously shown to be implicated in predicting drug efficacy and/or toxicity and does not take into account genomic variation in other genes whose role as pharmacogenomic biomarkers is yet to be determined. Obviously, the fact that a large amount of human genome data is freely available allows replication of this exercise once evidence for novel pharmacogenes is provided.

Moreover, the benefit of such an analysis conducted on the Wellderly and public genomes available from Complete Genomics would be amplified much more if critical and well-defined clinical phenotypes were available. This would mean that novel genes that are predicted to act as pharmacogenes can be identified in this way, and could lead to the discovery of new functions and possibly new treatment options.

Conclusion & future perspective

As personalized drug treatment and genomic medicine move closer to becoming a reality, the use of medium- or even high-throughput microarray-based genetic screening tests that fit all ethnicities warrants reconsideration and tweaking to accommodate today’s patient needs. Also, as next-generation sequencing technology improves in both read length and accuracy, this approach will facilitate the more accurate genotyping of pharmacogenes in future studies and aid in the implementation of next-generation sequencing-based pharmacogenomic assays in the clinic.

This study clearly demonstrates that not only can whole-genome sequencing reveal a relatively large number of unique (or rare) pharmacogenomic markers that would otherwise go undetected by conventional genetic screening methods, such as PCR or microarray-based methods, but also that significant variation exists among different ethnic groups as far as novel ADMET gene-related variants are concerned. At this point, one should bear in mind that before such an approach and resulting data are used in a clinical setting, verification needs to be performed in a quality accredited (International Organization for Standardization [ISO] or Clinical Laboratory Improvement Amendments [CLIA] certified) laboratory.

An important aspect of next-generation sequencing technology that would be critical for its early adoption in the clinic is cost–effectiveness. In other words, generating a comprehensive personalized pharmacogenomic profile using whole-genome sequencing (which currently costs US$3000 and is dropping), which will include almost all of the germline and de novo genomic variants needed to manage all current and future treatment modalities, is cost effective when compared against the costs of testing for a single marker or several markers in a few pharmacogenes (from US$300 up to US$1500, respectively). At present, setting up a (centralized) whole-genome sequencing facility and training clinicians to be able to translate pharmacogenomic data are two of the most important hurdles that need to be overcome [29], but sample outsourcing for data analysis and interpretation might be the answer to this obstacle using an economy-of-scale model. Ultimately, as pharmacogenomic testing costs using whole-genome sequencing and its cost–effectiveness become well documented, it will only be a matter of time until testing cost reimbursement is adopted by national insurance bodies.
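The cost argument can be illustrated with simple arithmetic on the figures quoted above. The sketch below assumes that each targeted test is billed separately at a flat price and ignores interpretation, storage and re-analysis costs, none of which are quantified in this paper; the function name `break_even_tests` is hypothetical.

```python
import math

WGS_COST_USD = 3000        # whole-genome sequencing price quoted in the text
SINGLE_MARKER_USD = 300    # lower figure quoted for a single-marker test
MULTI_MARKER_USD = 1500    # upper figure quoted for a several-marker test


def break_even_tests(wgs_cost, per_test_cost):
    """Smallest number of separate targeted tests whose total cost reaches the WGS cost."""
    if per_test_cost <= 0:
        raise ValueError("per-test cost must be positive")
    return math.ceil(wgs_cost / per_test_cost)


single_marker_break_even = break_even_tests(WGS_COST_USD, SINGLE_MARKER_USD)
multi_marker_break_even = break_even_tests(WGS_COST_USD, MULTI_MARKER_USD)

print(single_marker_break_even)  # 10
print(multi_marker_break_even)   # 2
```

Under these simplifying assumptions, a patient who needs ten single-marker tests, or two multi-marker tests, would already reach the quoted sequencing price.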

Overall, our findings undoubtedly highlight the value of whole-genome sequencing, to determine patients’ unique personalized pharmacogenomics profiles. Considering the fact that whole-genome sequencing costs continue to rapidly decline and at the same time genome-sequencing services become more accurate in delivering clinical-grade genome sequences, it is expected that whole-genome sequencing will gradually assume an integral part in genomic medicine in the not so distant future.

Financial & competing interests disclosure

This study was supported in part by ALTF 71-2011 grant to J Borg and the Golden Helix Foundation (P3; RD-201201) and European Commission grants (RD-CONNECT; FP7–305444) to GP Patrinos. This project was encouraged by the Genomic Medicine Alliance [30]. The authors declare the following conflict of interests: B Peters, MR Agarwal and R Drmanac are employees of Complete Genomics, Inc. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

No writing assistance was utilized in the production of this manuscript.

Ethical conduct of research

The authors state that they have obtained appropriate institutional review board approval or have followed the principles outlined in the Declaration of Helsinki for all human or animal experimental investigations. In addition, for investigations involving human subjects, informed consent has been obtained from the participants involved.

Executive summary

• Pharmacogenomics holds promise to rationalize drug use by minimizing drug toxicity and at the same time increasing drug efficacy.

• There are currently several genetic screening assays to screen for known pharmacogenomic biomarkers for the most commonly prescribed drugs; however, these cannot account for other known or novel pharmacogenomic markers.

• We analyzed whole-genome sequences of 482 unrelated individuals of various ethnic backgrounds to obtain their personalized pharmacogenomics profiles.

• Bioinformatics analysis revealed 408,964 variants in 231 pharmacogenes, from which 26,807 were residing on exons and proximal regulatory sequences, whereas 16,487 were novel, out of which 1012 variants possibly abolish protein function, as indicated by in silico analysis.

• We have also performed whole-genome sequencing analysis in a seven-member family of Greek origin in an effort to explain the variable response rate to acenocoumarol treatment in two family members.

• Overall, our data demonstrate that whole-genome sequencing, unlike conventional genetic screening methods, is necessary to determine an individual’s pharmacogenomic profile in a more comprehensive manner, which, combined with the gradually decreasing whole-genome sequencing costs, would expedite bringing personalized medicine closer to reality.

References

1 Evans WE, Relling MV. Pharmacogenomics: translating functional genomics into rational therapeutics. Science 286(5439), 487–491 (1999).

2 de Denus S, Letarte N, Hurlimann T et al. An evaluation of pharmacists’ expectations towards pharmacogenomics. Pharmacogenomics 14(2), 165–175 (2013).

3 PharmGKB. www.pharmgkb.org

4 Roche AmpliChip CYP450 Test. http://molecular.roche.com/assays/Pages/AmpliChipCYP450Test.aspx

5 Burmester JK, Sedova M, Shapero MH, Mansfield E. DMET microarray technology for pharmacogenomics-based personalized medicine. Methods Mol. Biol. 632, 99–124 (2010).

6 Affymetrix DMET+ microarray. www.affymetrix.com/estore/browse/products.jsp?productId=131412#1_1


7 Johnson JA, Burkley BM, Langaee TY et al. Implementing personalized medicine: development of a cost-effective customized pharmacogenetic genotyping array. Clin. Pharmacol. Ther. 92(4), 437–439 (2012).

8 Bamshad MJ, Ng SB, Bigham AW et al. Exome sequencing as a tool for Mendelian disease gene discovery. Nat. Rev. Genet. 12(11), 745–755 (2011).

9 Gamazon ER, Skol AD, Perera MA. The limits of genome-wide methods for pharmacogenomics testing. Pharmacogenet. Genomics 22(4), 261–272 (2012).

10 Altman RB, Whirl-Carrillo M, Klein TE. Challenges in the pharmacogenomic annotation of whole genomes. Clin. Pharmacol. Ther. 94(2), 211–213 (2013).

11 Drmanac R, Sparks AB, Callow MJ et al. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science 327(5961), 78–81 (2010).

12 Roach JC. Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328, 636–639 (2010).

13 69 Genomes Data. www.completegenomics.com/public-data/69-Genomes

14 Fritsma GA. Evaluation of Hemostasis. In: Hematology: Clinical Principles and Applications. Rodak B (Ed). WB Saunders Company, PA, USA, 719–753 (2002).

15 Complete Genomics public server. ftp://ftp2.completegenomics.com

16 CGA™ Tools. www.completegenomics.com/analysis-tools/cgatools

17 SIFT tool. http://sift.jcvi.org

18 PROVEAN tool webpage. http://provean.jcvi.org/about.php#about_3

19 Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. Predicting the functional effect of amino acid substitutions and indels. PLoS ONE 7(10), e46688 (2012).

20 Mette L, Mitropoulos K, Vozikis A, Patrinos GP. Pharmacogenomics and public health: implementing ‘populationalized’ medicine. Pharmacogenomics 13(7), 803–813 (2012).

21 Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nature Protocols 4(7), 1073–1081 (2009).

22 RCSB Protein Data Bank. www.rcsb.org/pdb/home/home.do

23 PyMOL. www.pymol.org

24 Sangkuhl K, Klein TE, Altman RB. Clopidogrel pathway. Pharmacogenet. Genomics 20(7), 463–465 (2010).

25 Ashley EA, Butte AJ, Wheeler MT et al. Clinical assessment incorporating a personal genome. Lancet 375(9725), 1525–1535 (2010).

26 Drögemöller BI, Wright GEB, Niehaus DJH, Emsley RA, Warnich L. Whole-genome resequencing in pharmacogenomics: moving away from past disparities to globally representative applications. Pharmacogenomics 12(12), 1717–1728 (2011).

27 Nelson MR, Wegmann D, Ehm MG et al. An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people. Science 337(6090), 100–104 (2012).

28 Drögemöller BI, Wright GE, Niehaus DJ et al. Next-generation sequencing of pharmacogenes: a critical analysis focusing on schizophrenia treatment. Pharmacogenet. Genomics 23(12), 666–674 (2013).

29 Kampourakis K, Vayena E, Mitropoulou C et al. Key challenges for next-generation pharmacogenomics. EMBO Rep. 15(5), 472–476 (2014).

30 Genomic Medicine Alliance. www.genomicmedicinealliance.org