TOXICOLOGICAL SCIENCES 115(1), 41–50 (2010)
doi:10.1093/toxsci/kfq027
Advance Access publication January 27, 2010
A Comprehensive Haplotype Analysis of the XPC Genomic Sequence Reveals a Cluster of Genetic Variants Associated with Sensitivity to
Tobacco-Smoke Mutagens
Catherine M. Rondelli,* Randa A. El-Zein,† Jeffrey K. Wickliffe,‡ Carol J. Etzel,† and Sherif Z. Abdel-Rahman*,1
*Department of Obstetrics and Gynecology, The University of Texas Medical Branch, Galveston, Texas 77555; †Department of Epidemiology, MD Anderson
Cancer Center, Houston, Texas 77030; and ‡Department of Environmental Health Sciences, Tulane University Health Sciences Center, School of Public Health
and Tropical Medicine, New Orleans, Louisiana 70112
1To whom correspondence should be addressed at Department of Obstetrics and Gynecology, The University of Texas Medical Branch, 11.104 A, Medical
Research Building, Galveston, TX 77555-1062. Fax: (409) 772-2261. E-mail: [email protected].
Received December 16, 2009; accepted January 22, 2010
The impact of single-nucleotide polymorphisms (SNPs) of the
DNA repair gene XPC on DNA repair capacity (DRC) and
genotoxicity has not been comprehensively determined. We
constructed a comprehensive haplotype map encompassing all
common XPC SNPs and evaluated the effect of Bayesian-inferred
haplotypes on DNA damage associated with tobacco smoking,
using chromosome aberrations (CA) as a biomarker. We also used
the mutagen-sensitivity assay, in which mutagen-induced CA in
cultured lymphocytes are determined, to evaluate the haplotype
effects on DRC. We hypothesized that if certain XPC haplotypes
have functional effects, a correlation between these haplotypes
and baseline and/or mutagen-induced CA would exist. Using
HapMap and single nucleotide polymorphism (dbSNP) databases,
we identified 92 SNPs, of which 35 had minor allele frequencies
≥ 0.05. Bayesian inference and subsequent phylogenetic analysis
identified 21 unique haplotypes, which segregated into six distinct
phylogenetically grouped haplotypes (PGHs A–F). A SNP tagging
approach identified 11 tagSNPs representing these 35 SNPs
(r² = 0.80). We utilized these tagSNPs to genotype a population of
smokers matched to nonsmokers (n = 123). Haplotypes for each
individual were reconstituted and PGH designations were
assigned. Relationships between XPC haplotypes and baseline
and/or mutagen-induced CA were then evaluated. We observed
significant interaction among smoking and PGH-C (p = 0.046) for
baseline CA where baseline CA was 3.5 times higher in smokers
compared to nonsmokers. Significant interactions among smoking
and PGH-D (p = 0.023) and PGH-F (p = 0.007) for mutagen-induced
CA frequencies were also observed. These data indicate
that certain XPC haplotypes significantly alter CA and DRC in
smokers and, thus, can contribute to cancer risk.
Key Words: DNA nucleotide excision repair; XPC gene;
polymorphism; haplotypes; biomarkers; chromosome; smoking;
cancer.
Smoking is associated with a high risk of cancer at many
organs (IARC, 1986). Not all smokers, however, develop cancer,
which clearly indicates a significant interindividual variation in
metabolism of tobacco carcinogens and in repair of the resulting
genetic damage (Liu et al., 2005). In fact, studies have
consistently shown a significant association between reduced
DNA repair capacity (DRC) and increased risk of tobacco-related
cancers (Shen et al., 2003; Zhu et al., 2007). The nucleotide
excision repair (NER) is the major DNA repair pathway that
removes genetic damage resulting from exposure to many
tobacco carcinogens (Friedberg, 2001). An important protein in
this pathway is the xeroderma pigmentosum complementation
group C (XPC) protein, which plays a key role as a part of the
DNA damage–recognition complex (Araki et al., 2001). XPC is
the only protein in this complex that directly binds to the damaged
DNA (Park and Choi, 2006) to initiate the NER process through
the recruitment of other proteins, including xeroderma pigmento-
sum complementation group A (XPA), transcription factor II H
(TFIIH), xeroderma pigmentosum complementation group G
(XPG), and replication protein A (RPA) (Bunick et al., 2006).
The XPC gene spans 33 kb and encodes a 940 amino acid
protein (Genbank accession No. AC090645). XPC is highly
polymorphic, with many single-nucleotide polymorphisms
(SNPs) in the exonic region and the intronic, 3′ and 5′ untranslated regions (UTRs), including the promoter region.
Only a few of these SNPs, namely the exon 16 variant K939Q
(rs2228001), exon 8 variant A499V (rs2228000), intron 11–5
splice site C/A (rs3729587), and intron 9 PolyAT insertion,
have been studied as potential modifiers of cancer risk in
humans. Many epidemiological studies have shown associa-
tions between these SNPs and risk for human cancer for many
organs (e.g., An et al., 2007; Guo et al., 2008; Hansen et al., 2007; Zhu et al., 2007).
The authors certify that all research involving human subjects was done
under full compliance with all government policies and the Helsinki
Declaration.
© The Author 2010. Published by Oxford University Press on behalf of the Society of Toxicology. All rights reserved. For permissions, please email: [email protected]
Over 90 SNPs in the XPC gene have been reported in the
International HapMap Project (www.hapmap.org) and
the National Center for Biotechnology Information (NCBI)
single nucleotide polymorphism (dbSNP) (www.ncbi.nlm.nih.gov/projects/SNP)
databases. The phenotypic and/or functional
effects of these SNPs have not yet been characterized,
including their impact on DNA damage response and DRC.
Analysis of the potential effect of each of these SNPs on
disease risk, or evaluation of their individual phenotypic
effects, is certainly impractical. However, it is well known that
genetic variation in human populations is not arrayed simply as
independent SNPs but, rather, as various combinations of SNPs
or ‘‘haplotypes’’. This is because some of the individual SNPs,
often those located in close proximity to one another, are
correlated and exist in degrees of linkage disequilibrium (LD).
This creates identifiable haplotypes, comprising several SNPs
(Gabriel et al., 2002). Therefore, the phenotypic effects of
haplotypes, rather than that of individual SNPs, should be
examined in studies designed to determine the role of genetic
variability in relation to disease outcome. A practical approach to
achieve this goal would be to identify subsets of SNPs that
accurately identify haplotypes. Such SNPs could be identified
using a ‘‘tagging SNPs (tagSNPs)’’ strategy (Johnson et al., 2001). Thus, a subset of all SNPs (i.e., tagSNPs) in a given gene
region, highly correlated with other SNPs, could then be selected
for analysis, significantly reducing the volume of genotyping
needed. This approach is biologically more plausible, and more
comprehensive, since it involves the evaluation of effects of
multiple SNPs that could jointly influence disease outcome.
To our knowledge, a comprehensive haplotype analysis of
the entire XPC genomic sequence has not been conducted.
Furthermore, an evaluation of the functional effects of the XPC
haplotypes, with regard to their effect on DRC, has not yet
been pursued. In the current study, we constructed a comprehensive
haplotype map encompassing SNPs of the XPC gene that are
reported to exist with a minor allele frequency (MAF) ≥ 0.05 in
the general population. We hypothesized that if certain XPC
haplotypes have phenotypic or functional effects, there would be
a correlation between these haplotypes and genetic damage in
individuals exposed to environmental carcinogens, such as those
found in tobacco smoke. Genetic damage was evaluated in our
study population using chromosome aberrations (CA) as a biomarker
since increased frequency of CA in circulating peripheral
blood lymphocytes (PBLs) is considered an indication of
increased cancer risk (Bonassi et al., 2000; Hagmar et al., 1998). In addition, we used the mutagen-sensitivity assay, in
which CA frequency is determined following exposure of
cultured PBLs to a known mutagen. This is a biomarker that
serves as an indirect measure for DRC and as an intermediate
phenotype for cancer risk (Hsu et al., 1991; Spitz et al., 1995).
MATERIALS AND METHODS
Study subjects and blood collection. The study protocol was approved by
the University of Texas Medical Branch (UTMB) Institutional Review Board.
All study subjects signed a written consent form that described the purpose of the
study. A total of 123 White non-Hispanic subjects participated in this study.
These subjects were a subset of a larger cohort recruited without regard
to age, sex, or ethnicity from the smoking and nonsmoking staff and student
population of UTMB in Galveston, TX. This cohort is composed of individuals
who had responded to posted notices and advertisements requesting volunteers
for studies aimed at understanding the functional and biological significance of
sequence variability in DNA repair genes. Participation in this study was open to
White non-Hispanics only to avoid potential problems with admixtures when
developing the tagSNPs analysis. TagSNPs are not applicable to all races/
ethnicities, and separate sets of tagSNPs would need to be developed for each
ethnic/racial group. White non-Hispanics are accurately represented in HapMap
by the CEPH population (Utah residents with ancestry from northern and western
Europe; abbreviated and hereafter referred to as CEU).
Individuals were defined as nonsmokers if they had smoked less than 100
cigarettes during their lifetime. Individuals were defined as current smokers if
they had smoked at least five cigarettes per day for at least 1 year prior to
enrollment in the study. Smokers (n = 62) were matched to nonsmokers
(n = 61) based on age (± 5 years) and sex. Participants were asked to fill out
a questionnaire that provided demographic, occupational, and medical
information. Also collected was information regarding smoking habits,
including number of cigarettes per day, preferred brand, duration of smoking,
former tobacco use, and use of other tobacco products. Exclusion criteria for all
volunteers included a recent acute viral or bacterial infection; a major chronic
illness, such as cancer or an autoimmune disorder; a recent blood transfusion;
treatment with mutagenic agents, such as chemotherapeutic drugs or radiation;
excessive alcohol consumption, defined as more than a 10 g serving per day (as
determined by nationwide standard practices); and employment involving
exposure to potentially mutagenic agents. Because of these criteria, only
apparently healthy volunteers were included in the study to control for potential
confounders. A blood sample (10 ml) was obtained from each volunteer for
genotype analysis and cytogenetic cultures.
Identification of tagSNPs. The HapMap Data Release 22 phase II
assembly (HapMap online database at www.hapmap.org data release 22 phase
II NCBI assembly B36 dbSNP b126) was used as the source of genotypes for
this study. Genotypes for all SNPs reported in the genomic region
encompassing XPC were obtained from the International HapMap Project
database representing the CEU population. The CEU population sample is the
one that is the most ethnically similar to our sample of self-reported White non-
Hispanic subjects from UTMB who were evaluated in this study. We examined
2 kb of the 5′ UTR of XPC since this region contains elements controlling XPC
gene expression, and we also examined the entire gene region and 2 kb of the
3′ UTR. Genotypes for the CEU population were screened using Haploview
ver. 4.1 to ensure that only SNPs with a MAF of 0.05 or greater were used
in the subsequent haplotype inference. Next, we used Tagger software
(www.broad.mit.edu/mpg/tagger) to identify tagSNPs for assay design and
subsequent haplotype determination. Specifically, we used an aggressive
multimarker approach (up to six markers) restricted to SNPs with a MAF ≥ 0.05.
We conservatively set the r² threshold to ≥ 0.8 (mean value 0.971) and
used a logarithm of odds score for estimating a recombination frequency
heterogeneity threshold of 2.
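The MAF filter and the r² threshold described above can be illustrated with a minimal Python sketch. It assumes phased, biallelic haplotype data held in plain lists, and it uses a simple greedy pairwise selection rather than the aggressive multimarker algorithm implemented in Tagger. The function names and the toy data are hypothetical and are not taken from the study.

```python
def minor_allele_frequency(alleles):
    """Frequency of the less common allele at one site (0.0 if monomorphic)."""
    counts = {}
    for allele in alleles:
        counts[allele] = counts.get(allele, 0) + 1
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / len(alleles)


def r_squared(site_a, site_b):
    """Pairwise LD (r^2) between two biallelic sites across phased haplotypes."""
    n = len(site_a)
    ref_a, ref_b = site_a[0], site_b[0]
    p_a = sum(x == ref_a for x in site_a) / n
    p_b = sum(y == ref_b for y in site_b) / n
    p_ab = sum(x == ref_a and y == ref_b for x, y in zip(site_a, site_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0
    d = p_ab - p_a * p_b
    return d * d / denom


def greedy_tags(sites, maf_min=0.05, r2_min=0.8):
    """Greedy pairwise tagging (an assumption, not Tagger's multimarker method)."""
    common = {name: col for name, col in sites.items()
              if minor_allele_frequency(col) >= maf_min}
    untagged = set(common)
    tags = []
    while untagged:
        # Pick the SNP that captures the largest number of still-untagged SNPs.
        best = max(sorted(untagged),
                   key=lambda s: sum(r_squared(common[s], common[t]) >= r2_min
                                     for t in untagged))
        tags.append(best)
        untagged -= {t for t in untagged
                     if r_squared(common[best], common[t]) >= r2_min}
    return tags


# Toy data: each list holds the allele carried by eight phased haplotypes.
sites = {
    "snp1": list("AAAATTTT"),
    "snp2": list("CCCCGGGG"),  # in perfect LD with snp1
    "snp3": list("AGAGAGAG"),  # uncorrelated with snp1 and snp2
    "snp4": list("TTTTTTTT"),  # monomorphic, removed by the MAF filter
}
tags = greedy_tags(sites)
```

In this toy example one of the two perfectly correlated SNPs is retained as a tag for both, which is the sense in which 11 tagSNPs can represent 35 common SNPs.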
Genotyping of tagSNPs. Custom-designed real-time PCR-based assay kits
using the TaqMan chemistry from Applied Biosystems (Foster City, CA) were
used for genotyping tagSNPs. Each kit was developed to our specifications using
fluorescent probes that were designed to anneal to the designated SNP,
dependent on its sequence as determined from the reference SNP (rs) number
designated for that SNP in the NCBI dbSNP database (http://www.ncbi.nlm
.nih.gov/SNP/). Allele-specific probes were labeled with either the FAM or the
VIC fluorophore and an appropriate quencher. The PCR consisted of TaqMan
universal master mix, template DNA, and target-assay mix in a total reaction
volume of 12 µl at concentrations recommended by Applied Biosystems.
Thermal cycling was carried out in our laboratory on an MJ Research DNA
Engine thermocycler (from a subsidiary of BioRad Labs) equipped with
a computerized BioRad Chromo4 real-time PCR detection system (Hercules,
CA), under recommended conditions (50°C, 2 min; 95°C, 10 min; and 40 cycles
at 95°C for 15 s and 58–61°C for 1 min). Designation of referent and
polymorphic forms was determined by the FAM to VIC ratio. For quality
control, all PCRs were run in duplicate, and, along with no-template negative
controls, positive controls for each possible genotypic combination were
included when possible. Samples were coded for case-control status so that the
operator interpreting the results was blinded to the smoking status of the subject.
Samples from smokers and nonsmokers were run together in mixed batches, and
10% of the samples were randomly selected and subjected to repeat analysis, as
another quality-control measure for verification of genotyping results.
Additionally, genotypes for all tagSNPs were analyzed for deviations from
Hardy-Weinberg equilibrium (HWE) on a locus-by-locus basis using two
methods implemented in LD Analyzer ver. 1.0. The first method is a standard
two-sided Pearson chi-squared test and is rapid and computationally simple. The
second method relies on a Monte Carlo permutation–based exact test to
estimate deviations from HWE. Any SNP failing the test was excluded from
the study as an added quality-control measure.
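The first of the two HWE checks, the Pearson chi-squared test, can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes a biallelic SNP with known genotype counts, the function name is hypothetical, and it does not reproduce LD Analyzer or the Monte Carlo exact test.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-squared test for Hardy-Weinberg equilibrium (1 degree of freedom)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Genotype counts that match HWE expectations exactly.
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)
# Genotype counts with no heterozygotes at all, a strong departure from HWE.
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)
```

A SNP whose p value falls below the chosen significance level would be excluded, as described above.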
Construction of XPC haplotypes and phylogenetic analysis. Our strategy
consisted of first using the HapMap data on the CEU population as a resource
to infer possible XPC haplotypes. These inferred haplotypes were then used to
develop a tagSNPs panel. These tagSNPs were subsequently used to genotype
our study participants. Based on the genotyping results, individuals were then
assigned to haplotypes corresponding to those inferred from the CEU
population. Haplotypes were inferred from the CEU population, using Bayesian
statistics implemented in PHASE ver. 2.1 software (www.stat.washington.edu
/stephens/phase.html), formatting the input file to account for the family trios
comprising the CEU sample. The number of iterations was increased to 10,000,
the thinning interval was increased to 10, and the burn-in was increased to 200
to improve the accuracy of the inferred haplotypes. The default setting was
selected with an output posterior probability threshold of 0.9. To ensure
accuracy of reported results, individuals lacking defined genotype data for more
than one SNP were excluded from the analysis. In addition, individuals lacking
identification of a single SNP that prevented the accurate assignment of full
haplotypes were also excluded from further analysis.
Because a substantial number of inferred haplotypes were expected, making
the number of potential statistical comparisons problematic, a phylogenetic
grouping approach was used to group or cluster evolutionarily related
haplotypes from the CEU population. Genetic distances were computed among
haplotypes using the maximum likelihood composite model implemented in
MEGA 4 (http://www.megasoftware.net/). Distances among haplotypes were
then phylogenetically clustered using the neighbor-joining method in MEGA 4.
Phylogenetically related haplotypes were given group designations for further
statistical comparisons and analysis. The use of the tagSNPs panel derived from
the CEU population allowed us to assign haplotypes to our population
corresponding to CEU-inferred haplotypes and subsequently to groups based
on the phylogenetic analysis of the complete SNP panel from the CEU
population. Grouping of haplotypes, based on genealogical or phenotypic
relationships, previously has been used successfully by many other
investigators (Bardel et al., 2009; Rieder et al., 2005; Veenstra et al., 2005).
Phylogenetically grouped haplotypes (PGHs), which share strong genealogical
similarities, serve to substantially increase the statistical power of analyses by
reducing the number of groups to be evaluated.
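The idea of a genetic distance between haplotypes can be illustrated with the simplest possible measure, the proportion of sites at which two haplotypes differ. The Python sketch below is an assumption made for illustration: it is not the maximum likelihood composite model of MEGA 4, and it does not perform neighbor-joining clustering. The three strings are haplotypes 1, 3, and 4 as listed in Table 2; the function names are hypothetical.

```python
def p_distance(hap_a, hap_b):
    """Proportion of sites at which two equal-length haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have the same number of sites")
    return sum(a != b for a, b in zip(hap_a, hap_b)) / len(hap_a)


def distance_matrix(haplotypes):
    """All pairwise p-distances, keyed by (name, name)."""
    names = sorted(haplotypes)
    return {(i, j): p_distance(haplotypes[i], haplotypes[j])
            for i in names for j in names}


# Haplotypes 1, 3, and 4 from Table 2 (35 SNP positions each).
haplotypes = {
    "hap1": "TTCACACCGGGGGCATGTTCGCGAAGCACCGGCGC",
    "hap3": "TTCACACACGCGGTCGTTTCGCACAGTGGTATCAG",
    "hap4": "TTCACACACGCGGTCGTTTCGCACAGTGGTATAAG",
}
distances = distance_matrix(haplotypes)
```

Haplotypes 3 and 4 differ at a single position, so they sit close together under any reasonable distance, whereas haplotype 1 is far from both. Clustering on such distances is what groups related haplotypes into a small number of PGHs.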
Cytogenetic cultures and the mutagen-sensitivity assay. Cultures for
cytogenetic assays were established according to standard procedures (Evans
and O’Riordan, 1975), as routinely done in our laboratory (Abdel-Rahman and
El-Zein, 2000; Affatato et al., 2004). Briefly, aliquots of 1 ml of PBLs were
cultured with 9 ml of RPMI 1640 medium supplemented with 100 U/ml
penicillin, 100 µg/ml streptomycin, 10% fetal bovine serum, and 2 mM
L-glutamine (Invitrogen, Carlsbad, CA). Stimulation of PBLs was accom-
plished by the addition of 0.18 mg/ml phytohemagglutinin (reagent grade;
Remel, Lenexa, KS). Two cultures were set up for each subject: one culture was
not treated to give a baseline in vivo CA frequency and the second culture was
used for the mutagen-sensitivity assay. After 46 h, the suspended cells in the
second culture were centrifuged and the growth medium reserved. The PBLs
were then resuspended in 5 ml serum-free RPMI 1640 supplemented with
0.24 mM of the mutagen 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone
(NNK) (CAS#64091-91-4, National Cancer Institute, Midwest Carcinogen
Repository, Kansas City, MO) and incubated at 37°C in the presence of 5%
CO2 for 1 h. Following NNK treatment, the PBLs were washed twice with
serum-free RPMI 1640, transferred to clean tubes and resuspended in the
original growth medium until harvested. Harvesting was performed 72 h after
NNK treatment. The mutagen concentration and harvest times had been
established from our previous studies and have been shown to produce
measurable levels of genetic damage and low levels of toxicity over a period of
time that allows the effects of DNA repair to be manifest (Abdel-Rahman and
El-Zein, 2000; Affatato et al., 2004).
Cell culture harvest and cytogenetic analysis. Prior to harvest, cells from
all cultures were treated with 0.1 µg/ml colcemid (Gibco-Invitrogen) for 1 h to
arrest the cells in metaphase. The cultures of PBLs were centrifuged and the
cells resuspended in hypotonic solution (0.075 M potassium chloride), fixed
with Carnoy’s fixative (three parts methanol/one part acetic acid, vol/vol), and
stored at 4°C. Slides for cytogenetic analysis were then prepared in duplicate by
spreading the fixed cells on the slides and staining them with Giemsa. One
hundred metaphase cells on each slide were scored for CAs using a Nikon 400
light microscope, according to standard procedures (ISCN, 1985). Aberrations
were recorded as chromosome breaks or frank chromatid breaks. Chromatid
breaks were counted as one break and chromosome breaks as two breaks. Total
aberrant cells were recorded as a percentage of aberrant cells (breaks per 100
cells). For quality control, slides were coded before scoring to protect against
scorer bias. Cells from slides prepared from both smokers and nonsmokers
were scored blindly in mixed batches. To ensure quality control, 20% of the
slides were randomly selected for blind rescoring. Agreement between the
original data and rescored data was measured using the Cohen’s kappa
statistical test. A statistically significant value of p < 0.001 was obtained for
both baseline and mutagen-induced CA, indicating that the agreement between
the original and rescored data was not attributable to random chance.
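For readers unfamiliar with Cohen's kappa, the following Python sketch shows how the statistic corrects raw agreement for agreement expected by chance. It is a minimal illustration with hypothetical scores and a hypothetical function name; it computes only the kappa statistic and not the associated p value reported above.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Cohen's kappa for two sets of categorical scores of the same items."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    # Chance agreement: product of the marginal proportions, summed over categories.
    expected = sum(counts_first[k] * counts_second[k] for k in counts_first) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical break counts per slide from an original scoring and a rescoring.
original = [0, 0, 1, 1]
rescored_same = [0, 0, 1, 1]
rescored_chance = [0, 1, 0, 1]

kappa_same = cohens_kappa(original, rescored_same)
kappa_chance = cohens_kappa(original, rescored_chance)
```

A kappa of 1 indicates perfect agreement, and a kappa of 0 indicates agreement no better than chance.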
Statistical analysis. Each individual was coded for the presence (+) or
absence (−) of each PGH. We used descriptive statistical analyses [mean (±
standard error of the mean; SEM)] for continuous variables and frequencies for
categorical variables to characterize the study population. We compared mean
CA frequencies for each PGH (present vs. absent) using preliminary Student’s
two-sample t-tests. In order to account for the fact that we were performing
multiple tests in our comparison of baseline and mutagen-induced CA
frequencies within each PGH group separately, we completed a permutation
test with 1000 replicates (PGH present/absent status was randomly permuted
within each replicate) to calculate empirical p values, respectively, for each
PGH comparison. Permutation test corrections are known to be robust and
have the benefit that the empirical p value is constructed directly from the
experimental data at hand (Cheverud, 2001). We completed the same procedure
upon stratification by smoking status (nonsmokers and smokers). Guided by
these preliminary results, a general linear statistical model that included the
final parameters estimated from the exploratory analysis was then fit to evaluate
differences in CA frequency involving interactions between each PGH and
smoking, separately for each PGH, adjusted for age and gender. We
constructed error-bar plots (depicting mean and 95% confidence interval
limits) to graphically visualize statistically significant interactions.
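The permutation step can be sketched as follows. This Python example is illustrative and rests on several assumptions not stated in the text: it uses the absolute difference in group means as the test statistic (the study used Student's t-tests), it adds one to the numerator and denominator of the empirical p value, and the function name, seed, and data are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(values, present, replicates=1000, seed=0):
    """Empirical p value for a difference in means between PGH-present and PGH-absent groups."""
    rng = random.Random(seed)

    def abs_mean_difference(labels):
        with_pgh = [v for v, flag in zip(values, labels) if flag]
        without_pgh = [v for v, flag in zip(values, labels) if not flag]
        return abs(mean(with_pgh) - mean(without_pgh))

    observed = abs_mean_difference(present)
    labels = list(present)
    hits = 0
    for _ in range(replicates):
        # Randomly permute present/absent status while keeping the group sizes fixed.
        rng.shuffle(labels)
        if abs_mean_difference(labels) >= observed:
            hits += 1
    return (hits + 1) / (replicates + 1)


# Hypothetical CA frequencies: ten individuals without and ten with a given PGH.
status = [False] * 10 + [True] * 10
p_separated = permutation_p_value([0] * 10 + [10] * 10, status)
p_identical = permutation_p_value([1] * 20, status)
```

Because the null distribution is built from the observed data, no distributional assumption about CA frequencies is needed.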
RESULTS
Characteristics of the Study Population
The study population consisted of 123 White non-Hispanic
subjects. We were able to obtain full haplotype data on 99 of
the 123 individuals, and therefore, only those 99 subjects were
included in all subsequent analyses. Of these individuals, 78
were females (78.8%) and 21 were males (21.2%). There were
50 smokers and 49 nonsmokers, who were matched with
respect to age (± 5 years) and sex. The smokers had smoked
between 5 and 50 cigarettes per day (mean ± SD: 17.4 ± 1.26)
for a minimum of 1 year (mean ± SD: 19.8 ± 1.62 years) before
participating in the study. The age of the participants ranged
from 20 to 72 years, with a median of 37 years and a mean
(± SD) of 39.0 (± 1.30) years. There was no significant
difference in the smoking habits (total number of smoking
years, number of cigarettes smoked per day, and pack years,
defined as packs smoked per day × the number of smoking
years) between males and females. The mean ± SEM
baseline CA frequency for the study population was
0.79 ± 0.10. After mutagen challenge, the mean ± SEM
frequency of mutagen-induced CA was 5.24 ± 0.29.
Identification of tagSNPs, Structures of XPC Haplotypes, and Phylogenetic Analysis
At the time of the analysis, using the information available
on the CEU population from HapMap, we identified 92 SNPs
encompassing the entire coding region, the introns, and 2 kb
upstream and 2 kb downstream of the coding region of the XPC gene.
Of these, 35 SNPs were predicted to occur with a MAF ≥ 0.05.
We identified 11 tagSNPs (the bolded and underlined rs
numbers in Table 1), which tagged the 35 SNPs with
a correlation coefficient of r² = 0.8. These 11 tagSNPs were
used for subsequent genotyping of the study population. This
linkage-based genotyping significantly reduced the volume of
unique genotyping assays and concurrently reduced the effort
and time required to evaluate the effect of all 35 SNPs on
genetic damage. The 35 SNPs tagged by these 11 tagSNPs and
their position on the XPC gene are presented in Table 1.
Using the information available on the CEU population,
we utilized the Bayesian-inference analysis implemented
in PHASE ver. 2.1 software (www.stat.washington.edu/
stephens/software/html), which revealed 21 unique haplotypes.
Table 2 shows the full 21 haplotypes, as generated by the
PHASE analysis used in this study. Phylogenetic analysis of
these haplotypes was used to assess genealogical relationships
among these 21 haplotypes, utilizing the maximum likelihood
model implemented in MEGA 4 software (http://www
.megasoftware.net/). Haplotypes were grouped based on clade
formation and percent sequence divergence. Six clades were
apparent in the midpoint-rooted cluster analysis corresponding
to the PGHs A–F. Percent sequence divergence within groups
ranged from 4.8% in PGH-F to 8.6% in PGH-D. Percent
sequence divergence between groups ranged from 18.6%
(PGH-A and PGH-B) to 57.3% (PGH-A and PGH-E). Since
there is no firm objective metric that exists for deciding what
should constitute acceptable levels of within- and between-
group or clade percent sequence divergence for this type of
analysis, the apparently ‘‘natural’’ groups and divisions in this
case were used, based on the phylogenetic structuring in the
tree. A bootstrap analysis (data not shown), using 10,000
replicates, strongly supported such groupings. There was
≥ 90% clade support for the selected PGHs. As shown in
Figure 1, PGH-A consisted of six haplotypes, PGH-B of one
haplotype, PGH-C of five haplotypes, PGH-D of two
haplotypes, PGH-E of three haplotypes, and PGH-F consisted
of four haplotypes.
TABLE 1
SNPs Existing with a MAF ≥ 0.05 in the XPC Gene
rs^a      Alleles  Ancestral allele  Haplotype position  Variation site
8516      C/T      T                 9                   3′ UTR
10468     C/T      T                 9                   3′ UTR
1126547   C/G      G                 1                   3′ UTR
2470352   A/T      A                 2                   3′ UTR
2229090   C/G      C                 9                   3′ UTR
2228001   A/C      C                 3                   Exon 16^b
2733532   C/T      T                 3                   Intron 15
2733533   A/C      C                 11                  Intron 15
2733534   C/G      G                 11                  Intron 15
2279017   G/T      T                 3                   Intron 12
2470353   C/G      G                 11                  Intron 12
2607734   A/G      A                 3                   Intron 11
2607736   A/G      A                 3                   Intron 11
2607737   C/T      C                 11                  Intron 11
3731149   A/C      A                 8                   Intron 10
3731146   G/T      T                 8                   Intron 10
9653966   G/T      T                 4                   Intron 10
1124303   G/T      T                 5                   Intron 10
3731143   C/T      T                 6                   Intron 10
2228000   C/T      C                 9                   Exon 9^c
2227999   A/G      G                 6                   Exon 9^d
3731127   C/T      C                 7                   Intron 8
3731125   A/G      A                 4                   Intron 7
3731124   A/C      A                 8                   Intron 7
13099160  A/G      A                 7                   Intron 7
1106087   G/T      G                 9                   Intron 5
3731108   C/T      C                 8                   Intron 5
3731106   A/G      A                 8                   Intron 5
3729587   C/G      C                 8                   Intron 5
3731093   C/T      T                 4                   Intron 3
2733537   A/G      A                 10                  Intron 3
3731081   G/T      G                 8                   Intron 3
3731068   A/C      C                 8                   Intron 2
1350344   A/G      G                 11                  Intron 1
2607775   C/G      C                 11                  5′ UTR
^a Reference SNP (rs) numbers are those designated by the dbSNP database of
the NCBI (http://www.ncbi.nlm.nih.gov/SNP/). Bold and underlined rs
numbers correspond to the 11 tagSNPs used in genotyping analysis of the 35
SNPs identified with MAF > 0.05 in the XPC gene.
^b The rs2228001 (A/C) SNP in exon 16 results in a lysine to glutamine amino
acid change in codon 939 (K939Q).
^c The rs2228000 (C/T) SNP in exon 9 results in a valine to arginine amino
acid change in codon 499 (V499R).
^d The rs2227999 (A/G) SNP in exon 9 results in a histidine to arginine amino
acid change at codon 492 (R492H).
Genotype Analysis of the Study Population
After all individuals were genotyped for the 11 tagSNPs, the
genotype data were analyzed for HWE. In this analysis, only
10 of the 11 SNPs passed. As a result of this analysis, we
subsequently excluded rs2470352, which was determined not
to be in LD with any of the other SNPs under study. We then
reconstituted haplotypes for each individual in our study
population using genotype data generated with the remaining
10 SNPs, which were compared to the CEU haplotypes. All
SNP genotyping reactions were performed with more than 95%
success rate. We excluded 24 subjects from the study who
lacked genotype data for one (n = 18) or more (n = 6) SNPs
due to repeated PCR failure since this prevented accurate
haplotype assignment for these individuals. Subsequently, we
reconstituted haplotypes for the remaining 99 individuals,
using the genotype data we generated and the CEU haplotypes
as a reference. For accuracy purposes, these 24 individuals
were also excluded from further analysis. The excluded subjects
did not differ in any other respect from the rest of
the study population.
A PGH designation was assigned to each individual
evaluated. The descriptive statistical results indicated that the
most common PGH in the study population was PGH-F
(40.4%), while the least common PGH was PGH-B (3.0%).
The frequencies of each of the PGHs are presented in Table 3.
TABLE 2
Individual Haplotypes Determined by PHASE Analysis for the
CEU Population (Utah Residents with Ancestry from Northern
and Western Europe) of HapMap^a
1 TTCACACCGGGGGCATGTTCGCGAAGCACCGGCGC
2 TTCACACCGGGGGCATGTTCGTGAGGCACCGGCGC
3 TTCACACACGCGGTCGTTTCGCACAGTGGTATCAG
4 TTCACACACGCGGTCGTTTCGCACAGTGGTATAAG
5 TTCACACACGCGGTCGTTTTGCACAGTGGTATAAG
6 TTCACACACGCGGTCGTGTCGCACAGTGGTATAAG
7 TTCACCCACGCGGTCGTTTCGCACAGTGGTATCAG
8 TTCACCTCGTGAGCATTTTCGCAAAGCACTAGCGC
9 TTCACCTCGTGAACATTTTCGCAAAGCACTAGCGC
10 TTCAGACCGGGGGCATTTTTGCAAATCACTGGCGC
11 TTCTCACCGGGGGCATGTTCGTGAGGCACCGGCGC
12 TTCTCACACGCGGTCGTGTCGCACAGTGGTATAAG
13 TTGACCCCGTGAACATTTTCGCAAAGCACTAGCGC
14 TTGACCTCGTGAACATTTTCGCAAAGCACTAGCGC
15 TCCACACACGCGGTAGTTTCGCAAAGCGGTAGCAG
16 CCCACACACGCGGTATTTCTACAAATCACTGGCAG
17 CCCAGACACGCGGTATTTTTGCAAATCACTGGCAG
18 CCCAGACACGCGGTATTTCTACAAATCACTGGCAG
19 CCCTGACCGGGGGCATTTTTGCAAATCACTGGCGC
20 CCCTGACACGCGGTATTTTTGCAAATCACTGGCAG
21 CCCTGACACGCGGTATTTCTACAAATCACTGGCAG
^a A total of 21 unique haplotypes were identified using Bayesian inference
implemented in PHASE v2.1.1. The 21 haplotypes presented in the table
represent the specific combinations of the 35 SNPs evaluated in the study.
FIG. 1. Haplotype structure of the XPC gene. A total of 21 unique haplotypes
were identified using Bayesian inference implemented in PHASE v2.1.1. A
maximum likelihood composite model of phylogenetic analysis in MEGA 4 was
conducted on these 21 haplotypes resulting in six PGH (PGH-A, PGH-B, PGH-C,
PGH-D, PGH-E, and PGH-F) based on genetic distances, as indicated by the
brackets. These six PGHs were used as individual units in further analyses.
TABLE 3
Frequencies of the PGH of the XPC Gene in the Study
Population
PGH   Status  n (%)
A     +       26 (26.3)
A     −       73 (73.7)
B^a   +       3 (3.0)
B^a   −       96 (97.0)
C     +       20 (20.2)
C     −       79 (79.8)
D     +       4 (4.0)
D     −       95 (96.0)
E     +       7 (7.0)
E     −       92 (92.9)
F^b   +       40 (40.4)
F^b   −       59 (59.6)
+, presence; −, absence.
^a The least common PGH.
^b The most common PGH.
Relationship between XPC Haplotypes and the Background (Baseline) CA Frequency
The background (baseline) and mutagen-induced CA
frequencies observed in the presence (+) and absence (−) of
haplotypes from the different PGHs identified in this study are
presented in Table 4. The PGH groups were first coded (PGH-A
to PGH-F) and then analyzed based on ‘‘haplotype group
copy’’ (HGC) using a dominant genetic model (0 HGCs = 0, 1
or 2 HGCs = 1). When the general linear model, adjusted for
age and sex, was fit to investigate interactions between each
PGH and smoking on baseline CA frequencies, we observed a
significant interaction between smoking and PGH-C (p = 0.046)
(Fig. 2). Nonsmokers who were negative for PGH-C had the
lowest level of baseline CA (mean ± SEM = 0.53 ± 0.192), while
smokers who were positive for PGH-C had significantly higher
baseline CA frequencies (mean ± SEM = 1.21 ± 0.29). Among
those positive for PGH-C, the baseline CA frequency was
3.5 times higher in smokers compared to nonsmokers. In
contrast, we observed no significant interactions between
smoking and PGH-A and PGH-F on baseline CA (data not
shown). Because of the small sample sizes of PGHs B, D, and E,
their interaction effect with smoking on baseline CA could not be
evaluated in the current study.
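The dominant coding of haplotype group copies described in this section (0 HGCs = 0, 1 or 2 HGCs = 1) can be written out as a short Python sketch. The function name and the example individuals are hypothetical and serve only to illustrate the coding rule.

```python
def dominant_code(haplotype_groups, target_pgh):
    """Return 1 if at least one of an individual's two haplotypes belongs to target_pgh, else 0."""
    copies = sum(group == target_pgh for group in haplotype_groups)
    return 1 if copies >= 1 else 0


# Hypothetical individuals, each described by the PGHs of their two haplotypes.
heterozygous = ("C", "F")
homozygous = ("F", "F")

code_c_het = dominant_code(heterozygous, "C")
code_a_het = dominant_code(heterozygous, "A")
code_f_hom = dominant_code(homozygous, "F")
```

Under this model, carrying one or two copies of a PGH is treated identically, so each individual contributes a single present or absent indicator per PGH to the general linear model.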
Relationship between XPC Haplotypes and Mutagen Sensitivity
Using the general linear statistical model, adjusted for age
and sex, to investigate interactions between each PGH and
smoking on mutagen-induced CA frequency, we observed no
significant interactions between smoking and PGHs A, B, C,
and E (data not shown). However, we observed significant
interactions between smoking and PGH-D (p = 0.023) and
PGH-F (p = 0.031) (Fig. 3). Nonsmokers who were positive
for PGH-D had a significantly lower level of mutagen-induced
CA frequencies (mean ± SEM = 3.75 ± 0.85) than smokers
who were positive for PGH-D (8.75 ± 2.43). Among those
positive for PGH-D, the mutagen-induced CA frequency was
2.3 times higher in smokers compared to nonsmokers, whereas
among those who were negative for PGH-D, this difference in
response in smokers compared to nonsmokers was not observed.
Likewise, nonsmokers who were positive for PGH-F had a
lower frequency of mutagen-induced CA (4.63 ± 0.47) compared
to smokers who were positive for PGH-F (6.03 ± 0.51).
Among those positive for PGH-F, the mutagen-induced CA
frequency was 1.3 times (30%) higher in smokers compared to
nonsmokers.
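The smoker-to-nonsmoker comparisons above can be reproduced from the unadjusted group means reported in Table 4. The short sketch below uses only those published means; the helper name `fold_and_percent` is hypothetical, and the calculation ignores the age and sex adjustment of the full model.

```python
# Mean mutagen-induced CA frequencies for PGH-positive subjects (Table 4).
means = {
    "PGH-D": {"nonsmokers": 3.75, "smokers": 8.75},
    "PGH-F": {"nonsmokers": 4.63, "smokers": 6.03},
}

def fold_and_percent(nonsmokers, smokers):
    """Return (fold change, percent increase) of smokers relative to nonsmokers."""
    fold = smokers / nonsmokers
    return fold, (fold - 1.0) * 100.0

results = {group: fold_and_percent(**vals) for group, vals in means.items()}

for group, (fold, pct) in results.items():
    print(f"{group}: {fold:.2f}-fold ({pct:.0f}% higher in smokers)")
```

Rounded to one decimal place, the ratios are 2.3 for PGH-D and 1.3 for PGH-F.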
DISCUSSION
To our knowledge, this is the first study to provide
a comprehensive evaluation of the relationship between XPC haplotypes and genetic damage associated with tobacco
smoking. Rather than addressing the effect of a few individual
SNPs, we determined the relationship between genetic damage
and haplotypes that comprise the common SNPs in the entire
genomic region of the XPC gene. This approach is comprehensive
and biologically more plausible since it allows for the
evaluation of the effect of multiple SNPs that could jointly
influence outcome. The relationship between XPC haplotypes
and genetic damage was evaluated using CA as a biomarker
because of the well-established strong association between
increased CA frequency and cancer risk. Of all biomarkers
available for human studies, CA is the only biomarker that has
been adequately validated in many independent prospective
TABLE 4
Effect of XPC Haplotype Groups on Background (Baseline) and Mutagen-Induced CA Frequencies

PGH^a  Status^b  Nonsmokers      Smokers
Baseline CA frequency
A      +         0.69 (0.17^c)   0.78 (0.21)
       −         0.76 (0.21)     0.91 (0.21)
B      +         0.25 (0.25)     0.00 (0.00)
       −         0.77 (0.14)     0.90 (0.16)
C      +         0.53 (0.19)     1.21 (0.29)
       −         0.81 (0.17)     0.65 (0.16)
D      +         0.75 (0.48)     1.75 (0.63)
       −         0.72 (0.14)     0.78 (0.15)
E      +         0.67 (0.33)     0.43 (0.20)
       −         0.73 (0.14)     0.93 (0.17)
F      +         0.92 (0.21)     0.82 (0.18)
       −         0.48 (0.12)     1.00 (0.25)
Mutagen-induced CA frequency
A      +         5.11 (0.51)     4.67 (0.72)
       −         4.73 (0.51)     6.03 (0.57)
B      +         5.25 (1.65)     2.50 (1.50)
       −         4.91 (0.37)     5.67 (0.46)
C      +         5.81 (0.48)     5.68 (0.69)
       −         4.52 (0.47)     5.45 (0.60)
D      +         3.75 (0.85)     8.75 (2.43)
       −         5.04 (0.38)     5.26 (0.43)
E      +         4.57 (1.23)     4.29 (1.25)
       −         5.00 (0.37)     5.74 (0.48)
F      +         4.63 (0.47)     6.03 (0.51)
       −         5.32 (0.56)     4.00 (0.86)

+, presence; −, absence.
^a PGH: phylogenetically grouped haplotype.
^b Status: presence (+) or absence (−) of the haplotype group.
^c SEM = standard error of the mean.
FIG. 2. Interaction between PGH-C and smoking, as related to CA frequency.
A general linear model adjusted for age and gender was fit to investigate
interactions between PGH-C and smoking on CA. A permutation test with 1000
replicates was used to calculate empirical p values to account for multiple testing.
Error-bar plots depict mean and 95% confidence interval limits. The round
symbols indicate nonsmokers and the triangular symbols indicate smokers. The
interaction between smoking and PGH-C was significant (p = 0.046).
studies as a risk factor for cancer (Bonassi et al., 1995, 2000;
Hagmar et al., 1994, 1998). Our data indicate a significant XPC haplotype–smoking interaction, which was observed between
smoking and PGH-C on frequencies of CA. Our data provide
support for results from previous association studies linking
certain XPC polymorphisms to smoking-associated cancer risk
(An et al., 2007; Guo et al., 2008; Hansen et al., 2007). Our
results suggest that certain XPC haplotypes could affect the
repair of genetic damage caused by tobacco-smoke carcinogens.
Our findings also suggest that certain smokers may be at greater
risk than others for the development of genomic instability,
a critical step in the carcinogenic process, as evidenced by the
increase in CA in PBLs from individuals with PGH-C.
Previous studies addressed associations between only four
XPC polymorphisms and cancer risk, and these studies
produced inconsistent results. For example, positive associations
between the rs2228000 SNP (A499V) and cancer risk
were reported in some studies (An et al., 2007; Sak et al., 2006;
Shen et al., 2005) but not in others (Guo et al., 2008; Weiss
et al., 2006). Similarly, an association between the rs2279017
in intron 12 of XPC and bladder cancer risk was reported (Sak
et al., 2006); however, this association remains to be
confirmed. The rs2228001 SNP in exon 16 (K939Q) was
associated with esophageal, colorectal, and lung cancers in
some studies (Guo et al., 2008; Hansen et al., 2007) but not in
others (An et al., 2007; Weiss et al., 2006; Zhu et al., 2008).
Inconsistencies between studies are not surprising and have
been reported before with polymorphisms of other genes.
Possible explanations for such inconsistencies were often
discussed and included differences in study design and
ethnicities of the studied populations (Au et al., 2004;
Manuguerra et al., 2006). Another possible explanation we
propose for such discrepancies is that the XPC polymorphisms
evaluated exist in variable degrees of LD with others that were
not evaluated in these investigations. Differences in sampling
procedures, coupled with incomplete LD in some cases, may
capture SNPs with functional effects that are not being directly
investigated, but in other cases, such SNPs may not be
captured. Such sampling inconsistencies, possibly influenced
by an inadequate number of studied subjects, may explain these
disparate results. It is also conceivable that the polymorphisms
previously studied have little or no biological effect independently,
but when present as part of a specific haplotype,
they exert a phenotypic effect. This hypothesis is supported by
recent findings from our laboratory indicating that the
ss74800505 SNP that we discovered in the NEIL2 gene had
no effect on expression levels when evaluated independently,
yet when evaluated as part of a haplotype, a significant
reduction (69%) in NEIL2 expression was observed (Kinslow
et al., 2008). Another possible reason for inconsistencies could
be that the phenotypic effect observed with a certain SNP was,
in fact, due to the effects evoked by other SNPs that exist in LD
with the studied SNP. Because of the variability in the degree
of LD existing in different populations, the effects observed for
a certain SNP in one study may not be the same in other studies
because of the population effect. Future research based on our
current study, addressing the effect of haplotypes rather than
the effects of individual SNPs, may clarify these issues and
may significantly reduce inconsistencies in the results currently
observed between different investigations.
Our findings with PGH-C are consistent with reports
indicating that certain XPC SNPs belonging to this phylogenetic
group of haplotypes are associated with increased cancer
risk. For example, the rs2228000 (A499V) SNP, uniformly
present in PGH-C, was associated with increased risk of head
and neck, bladder, and lung cancers (An et al., 2007; Sak et al., 2006;
Shen et al., 2005). Our data are also consistent with
a recent report indicating that the rs2228000 (A499V) SNP is
associated with decreased DRC (Zhu et al., 2008). Whether the
FIG. 3. Interaction between PGH-D and PGH-F and smoking as related to
mutagen-induced CA frequency. A general linear model adjusted for age and
gender was fit to investigate interactions between PGH-D (A) and PGH-F (B)
and smoking as related to mutagen-induced CA. A permutation test with 1000
replicates was used to calculate an empirical p value for each
outcome, to account for multiple testing. Error-bar plots depict mean and 95%
confidence interval limits. The round symbols indicate nonsmokers and the
triangular symbols indicate smokers. The interactions between smoking and
PGH-D and PGH-F were significant (p = 0.023 and 0.031 for PGH-D
and PGH-F, respectively).
effect observed is related to the particular SNP evaluated in
these earlier investigations or to other SNPs in PGH-C remains
to be determined.
We found a significant difference in mutagen sensitivity
between smokers who were positive compared to those who
were negative for PGH-D and PGH-F. Smokers with these
PGHs exhibited significantly higher mutagen sensitivity than
smokers who did not have one of these PGHs. This suggests
that smokers with PGH-D or PGH-F could be predisposed to
a greater risk for developing cancer, given the well-established
association between reduced DRC, as determined
by mutagen sensitivity, and cancer risk (An et al., 2007;
Cheng et al., 1998; Spitz et al., 1995; Wang et al., 2007). The
haplotype-smoking interaction is not surprising since reduced
repair would only be important in the presence of genotoxic
exposure. This gene-smoking interaction is consistent with
previous reports with other polymorphisms in other DNA
repair genes (e.g., Abdel-Rahman et al., 2000; Affatato et al., 2004). A plausible biological explanation for such interaction
is that, in smokers, continuous exposure to tobacco smoke
mutagens could overwhelm the DNA repair machinery,
making the effect of the polymorphisms that reduce repair
capacity more pronounced. Thus, the inheritance of polymorphisms
that result in even a slight decrease in DNA repair
could lead to more noticeable genetic damage in such
individuals compared to nonsmokers. It is noteworthy that
while PGH-C was associated with differences in baseline CA,
it was not associated with mutagen-induced genetic damage.
This is likely due to differences in the mechanism(s) by
which certain PGHs exert their effects with respect to chronic
and acute exposures. For example, in response to chronic
exposure to tobacco carcinogens, haplotypes belonging to PGH-C
could possibly affect XPC binding to the DNA lesion, thus
reducing overall DNA repair over time, which would manifest
as an increase in CA in smokers. Conversely, PGH-D and
PGH-F may exert their effect primarily in the presence of an
acute exposure to a mutagen, suggesting that XPC haplotypes
belonging to these PGHs could affect protein stability and/or
turnover at the translational and/or transcriptional levels.
Additional studies are warranted to support or refute these
potential mechanisms. It should be noted, however, that while
the exact mechanisms by which SNPs belonging to PGH-C,
-D, and -F influence genetic damage are not fully understood,
some of the previously studied SNPs belonging to these PGHs
(e.g., rs2228000, rs2279017) might have potentially significant
effects on protein structure and/or function. For example,
the rs2228000 (A499V) SNP of PGH-C is located at the 5′ end
of the hHR23B-binding region of the gene and may, thus, alter
the function of XPC by altering its binding with the hHR23B
protein that is necessary for XPC function. However, other
SNPs in other regions of the gene, which exist in LD with
rs2228000, may also contribute to the observed phenotypic
effect. For example, an SNP in the 3′ UTR can affect
posttranscriptional activity, such as messenger RNA (mRNA)
folding–directed rates of translation or mRNA half-life
(George Priya Doss et al., 2008). Similarly, intronic
SNPs that are at, or near, exonic boundaries can affect mRNA
translation through exon skipping and/or aberrant mRNA
folding (Cheng et al., 2006; Duan et al., 2007; Kinslow et al., 2008; Law et al., 2007), and SNPs in the 5′ UTR can affect
XPC gene expression via promoter modulation (Cheng et al., 2006). Taken together, our findings suggest that SNPs, in
coding as well as noncoding regions of the XPC gene, that are
in LD with each other as part of a given haplotype may act in
a collective manner to influence the phenotype. Mechanistic
studies examining the effects of haplotypes, rather than the
effects of individual SNPs, on XPC function are warranted to
clarify the role of XPC polymorphisms.
In summary, despite the small sample size of the current
study, a limitation that we acknowledge and which limited our
ability to conclusively evaluate the effect of some PGHs, our
data indicate that haplotypes belonging to PGH-C, -D, and -F
appear to confer sensitivity to the mutagenic effects of tobacco
carcinogens. Larger studies are needed to confirm our initial
findings, and mechanistic research investigating the effect of
XPC haplotypes on NER capacity and on the risk of developing
diseases is clearly warranted. These studies are currently in
progress in our laboratory.
FUNDING
This work was supported by a National Institute of Environmental Health Sciences (NIEHS)
Center award (ES06676) and a John Sealy Memorial
Endowment Foundation grant to S.A.-R.; a predoctoral fellowship
to C.M.R. from the NIEHS (T32-07454); a cancer
prevention fellowship funded by the National Cancer Institute
(K07CA093592) to C.J.E.; National Cancer Institute
(CA123208) to C.J.E.; CA129050 and CA098549 to R.E.-Z.
and by the National Institute of Neurological Disorders and
Stroke NS065392-01 to S.A.-R.; studies were conducted with
the assistance of the Institute for Translational Sciences—
Clinical Research Center at UTMB funded by a
1UL1RR029876-01 grant from the National Center for Research
Resources, National Institutes of Health.
ACKNOWLEDGMENTS
We thank Dr Marinel M. Ammenheuser for her critical
review of the manuscript.
REFERENCES
Abdel-Rahman, S. Z., and El-Zein, R. A. (2000). The 399Gln polymorphism in
the DNA repair gene XRCC1 modulates the genotoxic response induced in
human lymphocytes by the tobacco-specific nitrosamine NNK. Cancer Lett.
159, 63–71.
Abdel-Rahman, S. Z., Salama, S. A., Au, W. W., and Hamada, F. A. (2000).
Role of polymorphic CYP2E1 and CYP2D6 genes in NNK-induced
chromosome aberrations in cultured human lymphocytes. Pharmacogenetics
10, 239–249.
Affatato, A. A., Wolfe, K. J., Lopez, M. S., Hallberg, C.,
Ammenheuser, M. M., and Abdel-Rahman, S. Z. (2004). Effect of XPD/
ERCC2 polymorphisms on chromosome aberration frequencies in smokers
and on sensitivity to the mutagenic tobacco-specific nitrosamine NNK.
Environ. Mol. Mutagen. 44, 65–73.
An, J., Liu, Z., Hu, Z., Li, G., Wang, L. E., Sturgis, E. M., El-Naggar, A. K.,
Spitz, M. R., and Wei, Q. (2007). Potentially functional single nucleotide
polymorphisms in the core nucleotide excision repair genes and risk of
squamous cell carcinoma of the head and neck. Cancer Epidemiol.
Biomarkers Prev. 16, 1633–1638.
Araki, M., Masutani, C., Takemura, M., Uchida, A., Sugasawa, K., Kondoh, J.,
Ohkuma, Y., and Hanaoka, F. (2001). Centrosome protein centrin2/caltracin1
is part of the xeroderma pigmentosum group c complex that initiates global
genome nucleotide excision repair. J. Biol. Chem. 276, 18665–18672.
Au, W. W., Navasumrit, P., and Ruchirawat, M. (2004). Use of biomarkers to
characterize functions of polymorphic DNA repair genotypes. Int. J. Hyg.
Environ. Health 207, 301–313.
Bardel, C., Danjean, V., Morange, P., Genin, E., and Darlu, P. (2009). On the
use of phylogeny-based tests to detect association between quantitative traits
and haplotypes. Genet. Epidemiol. 33, 729–739.
Bonassi, S., Abbondandolo, A., Camurri, L., Dal Pra, L., De Ferrari, M.,
Degrassi, F., Forni, A., Lamberti, L., Lando, C., and Padovani, P. (1995).
Are chromosome aberrations in circulating lymphocytes predictive of future
cancer onset in humans? Preliminary results of an Italian cohort study.
Cancer Genet. Cytogenet. 79, 133–135.
Bonassi, S., Hagmar, L., Stromberg, U., Montagud, A. H., Tinnerberg, H.,
Forni, A., Heikkila, P., Wanders, S., Wilhardt, P., Hansteen, I. L., et al.
(2000). Chromosomal aberrations in lymphocytes predict human cancer
independently of exposure to carcinogens. European Study Group on
Cytogenetic Biomarkers and Health. Cancer Res. 60, 1619–1625.
Bunick, C. G., Miller, M. R., Fuller, B. E., Fanning, E., and Chazin, W. J.
(2006). Biochemical and structural domain analysis of xeroderma pigmentosum
complementation group C protein. Biochemistry 45, 14965–14979.
Cheng, A. J., Mao, Y. M., and Cui, R. Z. (2006). The effect of gene
polymorphism in promoter and intron 1 on human Apo A I expression.
Zhonghua Yi Xue Yi Chuan Xue Za Zhi 23, 610–613.
Cheng, L., Eicher, S. A., Guo, Z., Hong, W. K., Spitz, M. R., and Wei, Q.
(1998). Reduced DNA repair capacity in head and neck cancer patients.
Cancer Epidemiol. Biomarkers Prev. 7, 465–468.
Cheverud, J. M. (2001). A simple correction for multiple comparisons in
interval mapping genome scans. Heredity 87, 52–58.
Duan, Z. X., Zhu, P. F., Dong, H., Gu, W., Yang, C., Liu, Q., Wang, Z. G., and
Jiang, J. X. (2007). Functional significance of the TLR4/11367 polymorphism
identified in Chinese Han population. Shock 160, 160–164.
Evans, H. J., and O’Riordan, M. L. (1975). Human peripheral blood
lymphocytes for the analysis of chromosome aberrations in mutagen tests.
Mutat. Res. 31, 135–148.
Friedberg, E. C. (2001). How nucleotide excision repair protects against cancer.
Nat. Rev. Cancer 1, 22–33.
Gabriel, S. B., Schaffner, S. F., Nguyen, H., Moore, J. M., Roy, J.,
Blumenstiel, B., Higgins, J., DeFelice, M., Lochner, A., Faggart, M., et al.
(2002). The structure of haplotype blocks in the human genome. Science
296, 2225–2229.
George Priya Doss, C., Sundandiradoss, R., Rajasekaran, R., Choudhury, P.,
Sinha, P., Hota, P., Batra, U. P., and Rao, S. (2008). Applications of
computational algorithm tools to identify functional SNPs. Funct. Integr.
Genomics 9, 309–316.
Guo, W., Zhou, R. M., Wan, L. L., Wang, N., Li, Y., Zhang, X. J., and
Dong, X. J. (2008). Polymorphisms of the DNA repair gene xeroderma
pigmentosum groups A and C and risk of esophageal cell carcinoma in
a population of high incidence region of North China. J. Cancer Res. Clin.
Oncol. 134, 267–270.
Hagmar, L., Bonassi, S., Stromberg, U., Brogger, A., Knudsen, L. E.,
Norppa, H., and Reuterwall, C. (1998). Chromosomal aberrations in
lymphocytes predict human cancer: a report from the European Study
Group on Cytogenetic Biomarkers and Health (ESCH). Cancer Res. 58,
4117–4121.
Hagmar, L., Brogger, A., Hansteen, I. L., Heim, S., Hogstedt, B., Knudsen, L.,
Lambert, B., Linnainmaa, K., Mitelman, F., and Nordenson, I. (1994).
Cancer risk in humans predicted by increased levels of chromosomal
aberrations in lymphocytes: Nordic study group on the health risk of
chromosome damage. Cancer Res. 54, 2919–2922.
Hansen, R. D., Sorensen, M., Tjonneland, A., Overvad, K., Walling, H.,
Raaschou-Nielsen, O., and Vogel, U. (2007). XPA A23G, XPC Lys939Gln,
XPD Lys751Gln and XPD Asp312Asn polymorphisms, interactions with
smoking, alcohol and dietary factors, and risk of colorectal cancer. Mutat.
Res. 619, 68–80.
Hsu, T. C., Spitz, M. R., and Schantz, S. P. (1991). Mutagen sensitivity:
a biologic marker of cancer susceptibility. Cancer Epidemiol. Biomarkers
Prev. 1, 83–89.
IARC. (1986). Tobacco smoking. IARC Monographs for the Evaluation of the
Carcinogenic Risk of Chemicals to Humans (IARC), (World Health
Organization, Ed.), pp. 312–314. IARC, Lyon, France.
ISCN. (1985). An International System for Human Cytogenetic Nomenclature.
Report of the Standing Committee on Human Cytogenetic Nomenclature.
Birth Defects Orig. Artic. Ser. 21, 1–117.
Johnson, G. C., Esposito, L., Barratt, B. J., Smith, A. N., Heward, J., Di
Genova, G., Ueda, H., Cordell, H. J., Eaves, I. A., Dudbridge, F., et al.
(2001). Haplotype tagging for the identification of common disease genes.
Nat. Genet. 29, 233–237.
Kinslow, C. J., El-Zein, R. A., Hill, C. E., Wickliffe, J. K., and
Abdel-Rahman, S. Z. (2008). Single nucleotide polymorphisms 5’ upstream
the coding region of the NEIL2 gene influence gene transcription levels
and alter levels of genetic damage. Genes Chromosomes Cancer 47,
923–932.
Law, A. J., Kleinman, J. E., Weinberger, D. R., and Weickert, C. S. (2007).
Disease-associated intronic variants in the ErbB4 gene are related to altered
ErbB4 splice-variant expression in the brain in schizophrenia. Hum. Mol.
Genet. 16, 129–141.
Liu, G., Zhou, W., and Christiani, D. C. (2005). Molecular epidemiology of
non-small cell lung cancer. Semin. Respir. Crit. Care Med. 26, 265–272.
Manuguerra, M., Saletta, F., Karagas, M. R., Berwick, M., Veglia, F.,
Vineis, P., and Matullo, G. (2006). XRCC3 and XPD/ERCC2 single
nucleotide polymorphisms and the risk of cancer: a HuGE review. Am.
J. Epidemiol. 164, 297–302.
Park, C. J., and Choi, B. S. (2006). The protein shuffle: sequential interactions
among components of the human nucleotide excision repair pathway.
FEBS J. 273, 1600–1608.
Rieder, M. J., Reiner, A. P., Gage, B. F., Nickerson, D. A., Eby, C. S.,
McLeod, H. L., Blough, D. K., Thummel, K. E., Veenstra, D. L., and
Rettie, A. E. (2005). Effect of VKORC1 haplotypes on transcriptional
regulation and warfarin dose. N. Engl. J. Med. 352, 2285–2293.
Sak, S. C., Barrett, J. H., Paul, A. B., Bishop, T. D., and Kiltie, A. E. (2006).
Comprehensive analysis of 22 XPC polymorphisms and bladder cancer risk.
Cancer Epidemiol. Biomarkers Prev. 15, 2537–2541.
Shen, H., Spitz, M. R., Qiao, Y., Guo, Z., Wang, L. E., Bosken, C. H.,
Amos, C. I., and Wei, Q. (2003). Smoking, DNA repair capacity and risk of
nonsmall cell lung cancer. Int. J. Cancer 107, 84–88.
Shen, M., Berndt, S. I., Rothman, N., DeMarini, D. M., Mumford, J. L., He, X.,
Bonner, M. R., Tian, L., Yeager, M., Welch, R., et al. (2005). Polymorphisms
in the DNA nucleotide excision repair gene and lung cancer risk
in Xuan Wei, China. Int. J. Cancer 116, 768–773.
Spitz, M. R., Hsu, T. C., Wu, X. F., Fueger, J. J., Amos, C. I., and Roth, J. A.
(1995). Mutagen sensitivity as a biologic marker of lung cancer risk in
African Americans. Cancer Epidemiol. Biomarkers Prev. 4, 99–103.
Veenstra, D. L., You, J. H., Rieder, M. J., Farin, F. M., Wilkerson, H. W.,
Blough, D. K., Cheng, G., and Rettie, A. E. (2005). Association of Vitamin
K epoxide reductase complex 1 (VKORC1) variants with warfarin dose in
a Hong Kong Chinese patient population. Pharmacogenet. Genomics 15,
687–691.
Wang, Y., Spitz, M. R., Lee, J. J., Huang, M., Lippman, S. M., and Wu, X.
(2007). Nucleotide excision repair pathway genes and oral premalignant
lesions. Clin. Cancer Res. 13, 3753–3758.
Weiss, J. M., Weiss, N. S., Ulrich, C. M., Doherty, J. A., and Chen, C. (2006).
Nucleotide excision repair genotype and the incidence of endometrial cancer:
effect of other risk factors on the association. Gynecol. Oncol. 103, 891–896.
Zhu, Y., Lai, M., Yang, H., Lin, J., Huang, M., Grossman, H. B., Dinney, C. P.,
and Wu, X. (2007). Genotypes, haplotypes, and diplotypes of XPC and risk
of bladder cancer. Carcinogenesis 28, 698–703.
Zhu, Y., Yang, H., Chen, Q., Lin, J., Grossman, H. B., Dinney, C. P., Wu, X.,
and Gu, J. (2008). Modulation of DNA damage/DNA repair capacity by
XPC polymorphisms. DNA Repair 7, 141–148.