Structure and function of the nucleosome-binding PWWP domain

Su Qin 1 and Jinrong Min 1, 2

1 Structural Genomics Consortium, University of Toronto, 101 College Street, Toronto, Ontario M5G 1L7, Canada.

2 Department of Physiology, University of Toronto, Toronto, Ontario M5S 1A8, Canada.

Corresponding author: Min, J. ([email protected]).

Keywords: PWWP domain; nucleosome binding; histone binding; DNA binding; crosstalk; epigenetic code

Abstract

PWWP domain-containing proteins are often involved in chromatin-associated biological processes, such as transcriptional regulation and DNA repair, and recent studies have shown that the PWWP domain specifies chromatin localization. Mutations in the PWWP domain have been linked to various human diseases, emphasizing its importance. Structural studies reveal that PWWP domains possess a conserved aromatic cage for histone methyl-lysine recognition, and synergistically bind both histone and DNA, which contributes to their nucleosome binding ability and chromatin localization. Furthermore, the PWWP domain often cooperates with other histone and DNA “reader” or “modifier” domains to evoke crosstalk between various epigenetic marks. Here, we discuss these recent advances in understanding the structure and function of the PWWP domain.

Structural characteristics of the PWWP domain

The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif [1, 2]. However, the name can be misleading, as only the fourth residue, Pro, is absolutely conserved. The PWWP domain has also been named the HATH domain (homologous to the amino terminus of hepatoma-derived growth factor, HDGF) [3] and the RBB1NT domain (RBBP1 N-terminal domain) (PDB entry: 2YRV and Pfam entry: PF08169). It is found ubiquitously in eukaryotes, ranging from unicellular organisms to humans, and there are more than 20 PWWP domain-containing proteins in the human genome, most of which are chromatin-associated (Table 1).

The PWWP domain belongs to the Royal superfamily, which also includes the chromodomain, the Tudor domain, and the Malignant Brain Tumor (MBT) domain [4]. Members of the Royal superfamily share a common structural feature, an antiparallel β-barrel-like fold formed by 4-5 β-strands. The exception is the canonical chromodomain, which harbors only three β-strands and requires the bound ligand to complete the β-barrel fold by forming an extra β-strand [5]. The PWWP domain contains a complete β-barrel of 5 antiparallel β-strands (β1-β5), in which a short 3₁₀ helix is often inserted between β4 and β5, and a highly variable linker that may form additional secondary structure elements is inserted between β2 and β3 (Fig. 1 and Fig. 2) [6]. A unique structural feature of the PWWP domain is the presence of a helix bundle of 1-6 α-helices following the β-barrel (Fig. 1 and Fig. 3) [6]. This helix bundle region is highly variable at the sequence level; therefore, only the β-barrel subdomain of the PWWP domain can be reliably predicted in protein domain databases, such as SMART (Simple Modular Architecture Research Tool) [7] and the Human Protein Reference Database [8]. Nevertheless, a “V”-shaped motif consisting of two helices is relatively conserved in the helix bundle subdomain (Fig. 1 and Fig. 3). The Pro-Trp-Trp-Pro motif is located at the beginning of the β2 strand and is packed against the helix bundle (Fig. 1 and Fig. 2), underscoring its critical role in protein folding and stability. In most cases, the PWWP domain can fold as an independent functional unit; however, recent studies reveal that the PWWP domain of ZMYND11 (also known as BS69) folds together with the preceding bromodomain and zinc finger to form an integral functional module (Fig. 1H) [9, 10]. Interestingly, the PWWP domain of HDGF can form a homodimer through a domain exchange such that β1-β2 of one molecule is swapped with that of the other molecule (Fig. 1J) [11].

DNA binding ability of the PWWP domain

The first three-dimensional structure of a PWWP domain was determined for the murine DNA methyltransferase Dnmt3b [12]. Structural analysis of this PWWP domain revealed a prominent positively charged surface, suggesting a potential role in DNA binding, which was confirmed in vitro [12]. Sequence analyses revealed that PWWP domains are generally rich in lysine and arginine residues and have a theoretical isoelectric point of more than 9, suggesting a general role of the PWWP domain in DNA binding. This role was later confirmed for PWWP domains in other proteins, such as HDGF [13, 14], MSH6 [15], PSIP1 (also known as LEDGF and p75) [16, 17], and ZMYND11 [9, 10]. A DNA binding assay also revealed that the murine Dnmt3b-PWWP did not discriminate among different kinds of DNA: it showed no preference among non-CpG DNA, unmodified CpG, hemimethylated CpG, and fully methylated CpG DNA [12]. Based on a selected and amplified binding assay, the PWWP domain of HDGF also did not discriminate between AT and GC base pairs [13]. Electrophoretic mobility shift assays revealed that the MSH6-PWWP has similar affinity toward double-stranded DNA, double-stranded DNA with a G/T mismatch, and nicked double-stranded DNA, but weaker affinity toward single-stranded DNA [15]. Taken together, these results show that the PWWP domain binds DNA in a nonspecific manner.
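The connection between a high lysine/arginine content and a theoretical isoelectric point above 9 can be illustrated with a rough calculation. The following is a minimal sketch, not the procedure used in the cited studies: the pKa values are an assumed, textbook-style set, and both sequences are made-up toy peptides rather than real PWWP domains.

```python
# Approximate pKa values for ionizable groups (an assumed, illustrative set;
# published pKa scales differ slightly from one another).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 2.0}


def net_charge(seq, ph):
    """Henderson-Hasselbalch estimate of the net charge of a peptide at a given pH."""
    # Free amino terminus (positive when protonated) and carboxyl terminus.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE["N_TERM"]))
    charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE["C_TERM"] - ph))
    for aa in seq:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def theoretical_pi(seq, tol=1e-4):
    """Bisection search for the pH at which the estimated net charge is zero.

    The net charge decreases monotonically with pH, so bisection converges.
    """
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical toy sequences, for illustration only.
toy_basic = "MKRKGDLVWGKIKRYPWPAKV"   # Lys/Arg-rich
toy_acidic = "MEDEGDLVWGEIEDYPWPAEV"  # Asp/Glu-rich

print("Lys/Arg-rich toy peptide pI:", round(theoretical_pi(toy_basic), 2))
print("Asp/Glu-rich toy peptide pI:", round(theoretical_pi(toy_acidic), 2))
```

Under these assumptions, the lysine/arginine-rich toy peptide has an estimated pI well above 9 and carries a net positive charge at neutral pH, which is the property that favors electrostatic contact with the negatively charged DNA backbone.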

To date, no structure of a PWWP-DNA complex is available in the Protein Data Bank. However, several groups have used NMR chemical shift perturbation experiments to map the DNA binding site on different PWWP domains (HDGF [13], MSH6 [15], and PSIP1 [16, 17]). Notably, the residues potentially involved in DNA binding are consistently localized on one side of the protein, centering on the β1-β2 arch region and the Pro-Trp-Trp-Pro motif, which also overlaps with the patch of highly positively charged surface (Fig. 4). The HDGF-PWWP is also able to bind heparin, a linear polymer consisting of repeating units of 1→4-linked uronic acid and glucosamine residues. Like DNA, heparin is highly negatively charged, and it binds to a similar positively charged surface on the PWWP domain of HDGF [18]. These structural clues suggest that PWWP domains interact with the phosphate backbone of DNA through electrostatic interactions and thus lack sequence specificity. To specify chromatin localization, additional mechanisms may be required.

Histone binding ability of the PWWP domain

The structural similarity of the PWWP domain to other Royal superfamily members, which can recognize methylated lysine and arginine [19], prompted scientists to propose the PWWP domain as a potential histone “reader” in 2005 [20]. Later on, it was demonstrated that the zebrafish Brpf1-PWWP could bind histones directly [21] and the fission yeast Pdp1-PWWP could recognize H4K20me specifically [22, 23]. The crystal structures of a BRPF1-H3K36me3 complex fully established the notion that the PWWP domain can recognize methylated histones [6, 24]. Since then, many other PWWP domains have been reported to possess methyl-lysine recognition activity; for example, DNMT3A-PWWP binds H3K36me3 [25], PSIP1-PWWP binds H3K36me3 [16, 17, 26], MSH6-PWWP binds H3K36me3 [27], HDGF2-PWWP binds H3K79me3 and H4K20me3 [6], and ZMYND11-PWWP binds H3.3K36me3 [9].

Structural analysis of the PWWP-histone complex structures identified a conserved cage for methyl-lysine binding, formed by three aromatic residues (Fig. 5). The third residue (W/Y) of the P-W-W-P motif and the residue (F/Y/W) immediately preceding this motif are involved in forming this cage. The third aromatic cage residue (F/Y/W) comes from the end of the β3 strand [6, 9, 24]. Sequence alignment reveals that most PWWP domains have this conserved cage for potential methylated-histone binding (Fig. 1 and Fig. 2) [6]. Nevertheless, several exceptions exist. The PWWP domains of RBBP1, RBBP1L1, MBD5, and NSD1 (N-terminal) have an incomplete aromatic cage. Consistently, the RBBP1-PWWP does not show any binding to methylated histone peptides [28]. In the case of BRPF1, the peptide residues (G33GV35) that precede the trimethylated K36 occupy a shallow groove on the protein surface, which involves the long β-β-α insertion between β2 and β3. In HDGF2, this insertion is a short, disordered loop, which may be responsible for its nonspecific binding to both H3K79me3 and H4K20me3 (Fig. 5). Below, we discuss the histone binding ability of PWWP domains in a nucleosomal context.
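The sequence rule described above for two of the three cage residues (an F/Y/W immediately preceding the motif and a W/Y at the third motif position, followed by the conserved Pro) can be expressed as a simple pattern scan. The sketch below is only a heuristic illustration under stated assumptions: the pattern leaves the first two motif positions unconstrained, the function name is hypothetical, and the sequences are made-up toy fragments. The third cage residue, at the end of β3, cannot be located by this pattern and would require an alignment or a structure.

```python
import re

# One aromatic residue (F/Y/W) immediately before the motif, then a
# four-residue motif whose third position is W/Y and whose fourth is Pro.
# The lookahead allows overlapping candidates to be reported.
CAGE_PATTERN = re.compile(r"(?=([FYW])(..[WY]P))")


def find_cage_candidates(seq):
    """Return (motif_start, preceding_residue, motif) for each candidate.

    motif_start is the 0-based index of the first motif residue.
    """
    return [
        (m.start() + 1, m.group(1), m.group(2))
        for m in CAGE_PATTERN.finditer(seq)
    ]


# Hypothetical toy fragments, not real PWWP sequences.
complete_cage = "GDLVWGKIKGYPWWPARV"    # aromatic residue precedes the motif
incomplete_cage = "GDLVWGKIKGAPWWPARV"  # Ala precedes the motif instead

print(find_cage_candidates(complete_cage))
print(find_cage_candidates(incomplete_cage))
```

A hit from this scan says only that two of the three cage positions are compatible with methyl-lysine binding; it does not establish that the domain binds a methylated histone.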

Nucleosome binding ability of the PWWP domain

The aforementioned PWWP-binding histone lysine sites H3K36, H3K79, and H4K20 are all in close proximity to DNA in the nucleosomal context. That the PWWP domain can bind both DNA and methylated histone suggests a synergistic binding mechanism among these interactions. Similar phenomena were observed for other Royal superfamily modules; for example, compared to the chromodomain alone, the pre-formed complex of the MSL3 [29] or RBBP1 [28] chromodomain with DNA displayed enhanced binding ability to H4K20me1 or H4K20me3 peptides, respectively. In addition, the Tudor domain of PHF1 concomitantly interacts with both the H3K36me3 and DNA of the H3K36me3-nucleosome core particle with increased binding ability [30]. In the case of the PWWP domain, the binding affinity toward either a histone peptide or a DNA oligonucleotide alone is very weak, but binding to methylated nucleosomes is significantly enhanced. For example, the affinity of the PSIP1-PWWP for H3K36me3-methylated nucleosomes (Kd ~1.5 μM) is four orders of magnitude higher than for the H3K36me3 peptide (Kd ~17 mM), and two orders higher than for DNA only (Kd ~150 μM) [16]. Another group independently confirmed this finding, although the measured Kd values differ (H3K36me3 nucleosome Kd ~48 nM, H3K36me3 peptide Kd >6.5 mM, and DNA Kd ~1.5 μM) [17], possibly due to the different methods used in these two studies. However, this synergy was not detected in ZMYND11 [9] or fission yeast Pdp1 [23] when a simple mixture of histone peptide and DNA oligonucleotide was used for binding studies, so it is likely that the binding synergy between histones and DNA in PWWP interactions occurs only in the nucleosomal context.
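The "orders of magnitude" comparison follows directly from the ratios of the quoted dissociation constants. The short sketch below reproduces that arithmetic with the Kd values cited above, converted to molar units; the dictionary and function names are illustrative, and the peptide value from the second study is a lower bound, so the corresponding ratio is a minimum.

```python
import math


def orders_of_magnitude(kd_weak, kd_tight):
    """log10 of the ratio between a weaker (larger) and a tighter (smaller) Kd."""
    return math.log10(kd_weak / kd_tight)


# Kd values quoted in the text, in mol/L.
study_16 = {"nucleosome": 1.5e-6, "peptide": 17e-3, "dna": 150e-6}
# The peptide Kd in this study was reported as > 6.5 mM (a lower bound).
study_17 = {"nucleosome": 48e-9, "peptide": 6.5e-3, "dna": 1.5e-6}

for label, kd in (("ref. 16", study_16), ("ref. 17", study_17)):
    vs_peptide = orders_of_magnitude(kd["peptide"], kd["nucleosome"])
    vs_dna = orders_of_magnitude(kd["dna"], kd["nucleosome"])
    print(f"{label}: nucleosome vs peptide = {vs_peptide:.1f} orders, "
          f"nucleosome vs DNA = {vs_dna:.1f} orders")
```

Although the absolute Kd values differ between the two studies, both give the same qualitative picture: the nucleosome is bound far more tightly than either the isolated peptide or free DNA.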

In all the available histone-PWWP complex structures, the histone peptides reside in a structurally conserved binding groove perpendicular to the β4 strand of the PWWP domains (Fig. 5). NMR mapping studies suggest that DNA binds to the PWWP domain on the other side via a conserved patch of positively charged surface. The PWWP domains therefore adopt distinct and conserved interfaces to engage the histone and the DNA, respectively (Fig. 4).

Cooperation with other histone and DNA readers

Emerging evidence reveals that epigenetic codes consisting of multiple epigenetic modifications can be recognized cooperatively by multiple reader domains [31]. Of note, many epigenetic reader domains coexist in a single polypeptide or complex. This also holds true for the PWWP domain (Table 1), which often coexists with PHD-like zinc finger domains (PHD, ADD and CW). For example, the ADD domain in DNMT3A [32], the first PHD domain in BRPF2 [33], and the fifth PHD domain in NSD3 [34] are able to bind unmethylated H3K4, and the CW domain in the ZCWPW1/2 subfamily is able to recognize H3K4me3 [35]. Another histone reader that frequently coexists with the PWWP domain is the bromodomain, which is known to recognize acetylated lysine on histones [36]. Some PWWP domain-containing proteins also harbor other DNA binding domains, such as the MBD domain, the HMG box, and the AT hook (Table 1). An atypical PHD domain (PHD2) from BRPF2 also shows DNA binding ability [37]. Nevertheless, how these histone- and DNA-binding domains cooperate with the PWWP domain is largely unknown.

Recently, a study on ZMYND11, which harbors a PHD-Bromo-PWWP cassette, revealed an uncommon combination of histone reader modules that binds exclusively to the methylated K36 of histone variant H3.3 [9]. The histone variant H3.3 possesses a sequence motif ‘S31…A87AIG90’ that is distinct from the ‘A31…S87AVM90’ sequence motif of the canonical histone H3.1/H3.2. In addition to the K36me3 binding by the conserved aromatic cage of the PWWP domain, the unique residue S31 of H3.3 is specifically recognized by a second composite pocket at the junction of the bromodomain, PWWP, and an embedded zinc finger motif (Fig. 5D). This study is the first to define a critical role of H3.3 S31 in substrate recognition. However, several structural features imply that the ZMYND11 bromodomain is unlikely to be a histone acetyl-lysine-binding module [9, 10]. The crystal structure of the PHD-Bromo-PWWP cassette of ZMYND8, a potential tumor suppressor closely related to ZMYND11, has also been solved recently (PDB entry: 4COS). It would be interesting to determine the histone binding ability of ZMYND8, because it not only harbors an aromatic cage for methyl-lysine binding and a similar pocket for H3.3S31 recognition (Fig. 5D), but also contains a canonical bromodomain that has all the residues required for acetyl-lysine binding. The identification of ZMYND11 as a variant-specific reader opens up the possibility that other variant- and modification-specific readers exist to fine-tune their functions.

Functions of the PWWP domain-containing proteins

The PWWP domain mainly functions as a methylated-nucleosome binder, and it can specifically recognize H3K36me3 or H4K20me3. In general, H3K36me3 is deposited on the coding region of active genes and H4K20me3 is a hallmark of silenced heterochromatic regions [38]. These histone marks are also under dynamic regulation during the cell cycle and in response to external stimuli. PWWP domain-containing proteins often harbor other functional domains and/or reside in a complex containing other subunits (Table 1). By targeting specific chromatin regions, and depending on the protein or complex in which it resides, the PWWP domain is involved in various biological processes and functions, including DNA methylation, histone modification, DNA repair, and transcription regulation.

DNMT3A/B: Crosstalk of DNA methylation and histone H3K36me3 methylation

DNMT3A and DNMT3B are the de novo DNA methyltransferases responsible for the establishment of DNA methylation patterns during development [39]. They share three highly conserved domains: the PWWP domain at the N-terminal region, the ADD domain in the middle, and the catalytic domain at the C-terminus. Both DNMT3A and DNMT3B are associated with chromatin in vivo throughout the entire cell cycle, and they are highly concentrated in heterochromatic regions during interphase and at specific loci of chromosome arms in metaphase [40, 41]. Markedly, the PWWP domains in DNMT3A/B are essential for their chromatin association [40, 41], and the PWWP domain of DNMT3A specifically recognizes H3K36me3, which targets it to chromatin and guides DNA methylation [25]. Disruption of the PWWP domain abolishes the ability of DNMT3A/B to methylate the major satellite repeats in pericentric heterochromatin [40]. In addition to the PWWP domain, the ADD domain of DNMT3A, which recognizes the N-terminus of the histone H3 tail, is also required for de novo DNA methylation [42, 43].

A missense mutation (S282P, homozygous) in the PWWP domain of the human DNMT3B gene causes ICF (immunodeficiency, centromeric heterochromatin instability, and facial anomalies) syndrome [44]. Ser282 is located in the β4 strand and is in close proximity to the H3K36me3-binding cage. Chromatin association of DNMT3B is disrupted by this ICF mutation [41]; thus, deficiency in recognition of H3K36me3 by DNMT3B may be linked to the ICF syndrome directly, which further underlines the importance of the PWWP domain in DNMT3A/B.

Crosstalk of histone acetylation and histone H3K36me3 methylation

BRPF1/2/3 are the scaffold proteins of the MOZ/MORF histone acetyltransferase (HAT) complexes [45]. MOZ is specifically required for the H3K9 acetylation of active Hox gene loci, which is necessary for correct body segment identity [46]. During zebrafish development, Brpf1 is also required for histone acetylation, the maintenance of cranial Hox gene expression, and the proper determination of pharyngeal segmental identities [21]. Loss of the Brpf1 PWWP domain alone was as deleterious as a severe truncation of Brpf1 that resulted in a putative brpf1-null allele, underscoring the importance of the PWWP domain in proper Brpf1 function in vivo [21]. The PWWP domains of BRPF1/2/3 have been shown to recognize H3K36me3 [6, 24], suggesting a crosstalk mechanism between histone acetylation and H3K36me3.

The Saccharomyces cerevisiae NuA3 HAT complex shares similar components with the human MOZ/MORF complexes, but the PWWP domain is absent in the Nto1 protein, which is the homolog of human BRPF proteins [47]. However, the PWWP domain-containing protein Pdp3 can interact with members of the NuA3 HAT complex and form a distinct isoform, NuA3b (the isoform that contains Yng1 but not Pdp3 is referred to as NuA3a). Deletion of the PDP3 gene decreases NuA3-directed transcription and results in growth defects when combined with transcription elongation mutants, suggesting that Pdp3-associated NuA3 functions in the transcription elongation process [48]. It is proposed that NuA3a uses the PHD finger of Yng1 to interact with H3K4me3 at the promoter regions, and NuA3b may be located at the coding regions of genes through the interaction between the PWWP domain of Pdp3 and H3K36me3 [48]. Overall, the PWWP domain may evoke crosstalk between histone acetylation and H3K36me3 by targeting HAT complexes to the gene body regions.

The PWWP domain and regulation of histone methylation

Chromatin-modifying enzymes or complexes often contain histone “reader” domains to bind modified histones, which allows them to spread the resultant modification along the chromatin. NSD1/2/3 are histone mono- and di-methylases for H3K36 that contain two PWWP domains [49]. Sequence alignment reveals that all of the PWWP domains of NSD1/2/3, except the N-terminal PWWP domain of NSD1, have conserved aromatic residues to form a potential cage for methyl-lysine binding (Fig. 2). Indeed, the N-terminal PWWP domains of NSD2/3 are able to recognize H3K36me2/3 in vitro [6]. However, whether and how these PWWP domains contribute to their enzymatic activity and histone methylation propagation has not yet been established.

The fission yeast PWWP domain-containing protein Pdp1 is a binding partner of the histone methyltransferase Set9, which catalyzes mono-, di-, and tri-methylation of H4K20 [50]. The PWWP domain of Pdp1 binds to H4K20me3 [23], and mutations within the PWWP domain that abrogated this interaction in vitro reduced both the association of Set9 with chromatin and the H4K20 methylation level in vivo, establishing that the H4K20me binding ability of Pdp1 is essential for Set9’s H4K20 methylation activity [22].

Another PWWP domain-containing protein, NPAC/GLYR1, was recently found to be a binding partner of the H3K4 demethylase LSD2, and NPAC positively regulates the H3K4 demethylase activity of LSD2 [51]. Of note, NPAC was also identified as an H3K36me3 reader [52], probably through its PWWP domain. Overall, these data across different PWWP-containing proteins suggest potential links between the PWWP domain and regulation of histone methylation.

The PWWP domain and DNA repair

Environmental and endogenous DNA damaging agents can impair DNA integrity and threaten genomic stability. Unrepaired lesions in critical genes (such as tumor suppressor genes) can impede a cell's ability to carry out its function and appreciably increase the likelihood of tumor formation. PWWP domain-containing proteins are involved in DNA repair through regulating histone methylation and chromatin architecture and recruiting key proteins in response to double-strand breaks (DSBs).

Methylation of histone H4K20 (H4K20me) is essential for recruiting the DNA damage mediator 53BP1 to DNA lesions and subsequent activation of a DNA-damage checkpoint [53]. In fission yeast, this methylation mark is established by Set9, and the PWWP protein Pdp1 is required for Set9 chromatin localization (see above). Yeast cells without Pdp1 were deficient in all three methylation states of H4K20, sensitive to genotoxic treatments, and impaired in Crb2 (a 53BP1 homolog) recruitment [22]. In mammals, methylation of H4K20 and recruitment of 53BP1 at DSBs is instead mediated by a single PWWP-containing histone methyltransferase, NSD2 [54, 55]. Downregulation of NSD2 significantly decreases H4K20 methylation at DSBs and the subsequent accumulation of 53BP1 [54]. However, the contribution of its PWWP domains remains undetermined.

Mammalian interphase chromatin also responds to DNA damage by altering the compactness of its architecture, thereby permitting local access of DNA repair machineries [56]. The PWWP domain-containing protein MUM1 (also known as EXPAND1) was recently reported to be an architectural component of chromatin, and by its direct interaction with 53BP1, MUM1 plays an accessory role to facilitate DNA damage-induced chromatin changes and is important for efficient DNA repair and cell survival following DNA damage [57]. In vivo chromatin association of MUM1 relies on its PWWP domain-mediated binding to nucleosomes, and in vitro assays revealed its binding to H3K36me3 [6]. Ablation of this interaction impairs damage-induced chromatin decondensation, which is accompanied by sustained DNA damage and hypersensitivity to genotoxic stress [57].

An important mechanism to repair DSBs is the homologous recombination pathway, and a key step in this pathway is DNA-end resection and generation of single-strand DNA (ssDNA). Two factors, the Mre11–Rad50–Nbs1 (MRN) complex and Retinoblastoma binding protein 8 (RBBP8), are required in this process [58]. Whereas MRN functions in DSB sensing and associates with the chromatin compartment after DNA damage, RBBP8 is required for DNA-end processing in an MRN- and ATM-dependent manner [59]. The PWWP protein PSIP1 promotes the repair of DSBs through its interaction with RBBP8 [60]. PSIP1 is also constitutively associated with chromatin through its PWWP domain, which binds preferentially to H3K36me3. Depletion of PSIP1 impairs the recruitment of RBBP8 to DSBs and the subsequent RBBP8-dependent DNA-end resection. PSIP1 binds to RBBP8 in a DNA damage-dependent manner, thereby enhancing its tethering to active chromatin and facilitating its access to DSBs [60].

A recent study also revealed that the PWWP domain is involved in DNA mismatch repair (MMR). MMR maintains genome stability primarily by correcting base-base and small insertion-deletion (ID) mispairs generated during DNA replication. In human cells, these mispairs are recognized by two protein complexes: MSH2-MSH6 (MutSα) and MSH2-MSH3 (MutSβ) [61]. MSH6 binds to the H3K36me3 mark in a PWWP-dependent manner, and this interaction mediates MutSα association with chromatin in cells [27]. The histone methyltransferase SETD2, which is responsible for trimethylation of H3K36, is also required for human MMR in vivo [27]. H3K36me3 is cell-cycle regulated, with its methylation level peaking in late G1/early S and being largely depleted in late S/G2 [27]. DNA mismatches usually arise through occasional proofreading errors by DNA polymerases during DNA replication, and the enrichment of H3K36me3 during S phase may facilitate recruitment of the MMR machinery to where (chromatin) and when (during DNA replication) it is most needed [61]. The substitution S144I, which is located in the PWWP domain of MSH6 and has an impact on protein stability [15], has been linked to a cancer predisposition syndrome called HNPCC (hereditary non-polyposis colorectal cancer) [62]. Overall, PWWP domain-containing proteins are involved in distinct DNA repair mechanisms, with the PWWP domain specifying the chromatin localization.

The PWWP domain and transcription elongation

During transcription elongation, nucleosomes can be evicted or repositioned by ATP-dependent chromatin remodelers to allow passage of RNA polymerase II (RNAPII). Chromatin must be ‘reset’ in the wake of the polymerase passage in order to prevent production of cryptic transcripts from within the gene body. The histone methyltransferase Set2 adds the resetting mark H3K36me3 on histone H3 in coding regions in Saccharomyces cerevisiae, which is recognized by the PWWP domain of the Ioc4 subunit of the Isw1b chromatin-remodeling complex [63, 64]. Therefore, the chromatin remodelers Isw1 and Chd1 act synergistically in the Set2 pathway by antagonizing histone exchange and decreasing incorporation of acetylated histones within coding regions in the wake of transcription elongation by RNAPII, hence maintaining chromatin integrity during transcription elongation [63, 65].

However, the picture gets more complicated in higher eukaryotes, where the histone variant H3.3 is incorporated into the gene bodies in a transcription-coupled manner [66]. A recent study revealed that ZMYND11 is an H3.3-specific reader of H3K36me3 [9]. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) shows a genome-wide co-localization of ZMYND11 with H3K36me3 and H3.3 in gene bodies. The ZMYND11 occupancy also correlates with RNAPII density in the gene body, and ZMYND11 represses gene expression by preventing the transition of paused RNAPII to elongation. Although ZMYND11 is associated with highly expressed genes, it functions as an unconventional transcription co-repressor by modulating RNA polymerase II at the elongation stage [9].

Multifaceted PSIP1

PSIP1 is a 75-kDa protein consisting of an N-terminal PWWP domain, a functional nuclear localization signal (NLS), a tandem copy of the AT-hook DNA-binding motif, and a C-terminal domain that interacts with the HIV integrase [67, 68]. Although there are multiple chromatin-binding domains in PSIP1, the PWWP domain is essential for its in vivo functions. Based on the high sequence homology between the PWWP domain of PSIP1 and those of HDGF and HDGF-related proteins, PSIP1 has been categorized into the HDGF family [69]. In addition to its involvement in DNA repair (discussed earlier), PSIP1 was first isolated as a transcriptional co-activator [70], associating with transcriptional activators and components of the basal transcriptional machinery, including RNAPII subunits [71]. It is also an essential subunit of the MLL complex in MLL oncogenic transformations via HOX gene regulation [72]. The short isoform p52 (but not p75), which shares the N-terminal part, including the PWWP domain, with the other isoforms, co-localizes and interacts with the splicing factor Srsf1 and other proteins involved in mRNA processing, thereby contributing to the regulation of alternative splicing [26].

Significant research effort has been focused on understanding the involvement of the p75 variant of PSIP1 in HIV infection [73]. Upon HIV infection, PSIP1 binds to the HIV integrase through its C-terminal integrase-binding domain and facilitates viral cDNA integration into transcribed (and thus accessible) genes. In vitro, PSIP1 stimulates HIV-1 integrase activity toward both naked target DNAs and reconstituted polynucleosomes. Surprisingly, the requirement for the chromatin-binding domains of PSIP1 differed between naked DNA and polynucleosomes. With naked DNA, deletion of both the PWWP domain and the AT hooks was required to ablate PSIP1 cofactor function. With polynucleosomes, however, the activity of PSIP1 depended mainly on the PWWP domain, and to a lesser extent on the AT-hook DNA-binding motifs [74]. This highlights the importance of the PWWP domain in the nucleosomal context.

Finally, although PWWP domains mainly function to specify chromatin localization, they are involved in numerous chromatin-related processes, depending on their protein or complex context. It would not be surprising if new cellular functions were found for PWWP domain-containing proteins.

Concluding remarks

The PWWP domain plays a critical role in recruiting or tethering its associated chromatin-modifying activities to their target locations on chromatin. Despite significant advances in understanding the structure and function of the PWWP domain, many questions have yet to be answered. It is now well established that the PWWP domain is a nucleosome-binding domain, but the detailed structural basis of this interaction remains to be resolved. PWWP domains have been shown to read the H3K36me and H4K20me marks, yet it is unclear whether the PWWP domain can recognize other histone marks. Furthermore, it would be interesting to identify non-histone ligands of the PWWP domain, as such ligands have been identified for other Royal family members [75-77]. The PWWP domain often coexists with other chromatin reader and/or modifier domains, but the mechanisms by which the PWWP domain cooperates with these chromatin-associated domains are largely unexplored.

PWWP domain-containing proteins are involved in various biological processes, and malfunctions of these proteins have been implicated in different human diseases. Because epigenetic proteins play critical roles in gene regulation, great effort is currently devoted to developing chemical probes for them. Much progress has been made in the development of chemical probes and inhibitors for histone readers, such as the bromodomain [78] and MBT domains [79], but the PWWP domain is still an untouched target at this time. More structural and functional analyses of PWWP domains can be expected, which should shed further light on the functions of PWWP domain-containing proteins in chromatin biology.


Box 1. Outstanding questions

• Can the PWWP domain bind DNA with some degree of specificity (e.g., for base sequence and/or shape)?

• Does the interaction between the PWWP domain and H3K79me3 occur in vivo? If so, what is its functional significance?

• Can the PWWP domain recognize non-histone substrates?

• What is the detailed picture of a PWWP domain bound to a methylated nucleosome?

• Does binding of the PWWP domain to the nucleosome function solely as a signal transducer, or can it affect chromatin structure directly?

• Given that the PWWP domain-containing HAT complexes may function in the gene body region to promote transcription elongation, how do they cooperate with the HDAC complexes that can also recognize H3K36me3 and prevent cryptic initiation of transcription within the coding region?

Acknowledgements

We would like to thank Johnathan Lau for critical reading of the manuscript. The SGC is a registered charity (number 1097737) that receives funds from AbbVie, Boehringer Ingelheim, the Canada Foundation for Innovation, the Canadian Institutes of Health Research, Genome Canada through the Ontario Genomics Institute [OGI-055], GlaxoSmithKline, Janssen, Lilly Canada, the Novartis Research Foundation, the Ontario Ministry of Economic Development and Innovation, Pfizer, Takeda, and the Wellcome Trust [092809/Z/10/Z].


References

1 Stec, I., et al. (1998) WHSC1, a 90 kb SET domain-containing gene, expressed in early development and homologous to a Drosophila dysmorphy gene maps in the Wolf-Hirschhorn syndrome critical region and is fused to IgH in t(4;14) multiple myeloma. Hum Mol Genet 7, 1071-1082
2 Stec, I., et al. (2000) The PWWP domain: a potential protein-protein interaction domain in nuclear proteins influencing differentiation? FEBS Lett 473, 1-5
3 Izumoto, Y., et al. (1997) Hepatoma-derived growth factor belongs to a gene family in mice showing significant homology in the amino terminus. Biochem Biophys Res Commun 238, 26-32
4 Maurer-Stroh, S., et al. (2003) The Tudor domain 'Royal Family': Tudor, plant Agenet, Chromo, PWWP and MBT domains. Trends Biochem Sci 28, 69-74
5 Taverna, S.D., et al. (2007) How chromatin-binding modules interpret histone modifications: lessons from professional pocket pickers. Nat Struct Mol Biol 14, 1025-1040
6 Wu, H., et al. (2011) Structural and histone binding ability characterizations of human PWWP domains. PLoS One 6, e18919
7 Letunic, I., et al. (2011) SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Res 40, D302-305
8 Keshava Prasad, T.S., et al. (2009) Human Protein Reference Database--2009 update. Nucleic Acids Res 37, D767-772
9 Wen, H., et al. (2014) ZMYND11 links histone H3.3K36me3 to transcription elongation and tumour suppression. Nature
10 Wang, J., et al. (2014) Crystal structure of human BS69 Bromo-ZnF-PWWP reveals its role in H3K36me3 nucleosome binding. Cell Res
11 Sue, S.C., et al. (2007) PWWP module of human hepatoma-derived growth factor forms a domain-swapped dimer with much higher affinity for heparin. J Mol Biol 367, 456-472
12 Qiu, C., et al. (2002) The PWWP domain of mammalian DNA methyltransferase Dnmt3b defines a new family of DNA-binding folds. Nat Struct Biol 9, 217-224
13 Lukasik, S.M., et al. (2006) High resolution structure of the HDGF PWWP domain: a potential DNA binding domain. Protein Sci 15, 314-323
14 Yang, J., and Everett, A.D. (2007) Hepatoma-derived growth factor binds DNA through the N-terminal PWWP domain. BMC Mol Biol 8, 101
15 Laguri, C., et al. (2008) Human mismatch repair protein MSH6 contains a PWWP domain that targets double stranded DNA. Biochemistry 47, 6199-6207
16 van Nuland, R., et al. (2013) Nucleosomal DNA binding drives the recognition of H3K36-methylated nucleosomes by the PSIP1-PWWP domain. Epigenetics Chromatin 6, 12
17 Eidahl, J.O., et al. (2013) Structural basis for high-affinity binding of LEDGF PWWP to mononucleosomes. Nucleic Acids Res 41, 3924-3936
18 Sue, S.C., et al. (2004) Solution structure and heparin interaction of human hepatoma-derived growth factor. J Mol Biol 343, 1365-1377
19 Adams-Cioaba, M.A., and Min, J. (2009) Structure and function of histone methylation binding proteins. Biochem Cell Biol 87, 93-105
20 Nameki, N., et al. (2005) Solution structure of the PWWP domain of the hepatoma-derived growth factor family. Protein Sci 14, 756-764
21 Laue, K., et al. (2008) The multidomain protein Brpf1 binds histones and is required for Hox gene expression and segmental identity. Development 135, 1935-1946
22 Wang, Y., et al. (2009) Regulation of Set9-mediated H4K20 methylation by a PWWP domain protein. Mol Cell 33, 428-437
23 Qiu, Y., et al. (2012) Solution structure of the Pdp1 PWWP domain reveals its unique binding sites for methylated H4K20 and DNA. Biochem J 442, 527-538
24 Vezzoli, A., et al. (2010) Molecular basis of histone H3K36me3 recognition by the PWWP domain of Brpf1. Nat Struct Mol Biol 17, 617-619
25 Dhayalan, A., et al. (2010) The Dnmt3a PWWP domain reads histone 3 lysine 36 trimethylation and guides DNA methylation. J Biol Chem 285, 26114-26120
26 Pradeepa, M.M., et al. (2012) Psip1/Ledgf p52 binds methylated histone H3K36 and splicing factors and contributes to the regulation of alternative splicing. PLoS Genet 8, e1002717
27 Li, F., et al. (2013) The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSalpha. Cell 153, 590-600
28 Gong, W., et al. (2012) Structural insight into recognition of methylated histone tails by retinoblastoma-binding protein 1. J Biol Chem 287, 8531-8540
29 Kim, D., et al. (2010) Corecognition of DNA and a methylated histone tail by the MSL3 chromodomain. Nat Struct Mol Biol 17, 1027-1029
30 Musselman, C.A., et al. (2013) Binding of PHF1 Tudor to H3K36me3 enhances nucleosome accessibility. Nat Commun 4, 2969
31 Ruthenburg, A.J., et al. (2007) Multivalent engagement of chromatin modifications by linked binding modules. Nat Rev Mol Cell Biol 8, 983-994
32 Otani, J., et al. (2009) Structural basis for recognition of H3K4 methylation status by the DNA methyltransferase 3A ATRX-DNMT3-DNMT3L domain. EMBO Rep 10, 1235-1241
33 Qin, S., et al. (2011) Recognition of unmodified histone H3 by the first PHD finger of bromodomain-PHD finger protein 2 provides insights into the regulation of histone acetyltransferases monocytic leukemic zinc-finger protein (MOZ) and MOZ-related factor (MORF). J Biol Chem 286, 36944-36955
34 He, C., et al. (2012) The methyltransferase NSD3 has chromatin-binding motifs, PHD5-C5HCH, that are distinct from other NSD (nuclear receptor SET domain) family members in their histone H3 recognition. J Biol Chem 288, 4692-4703
35 He, F., et al. (2010) Structural insight into the zinc finger CW domain as a histone modification reader. Structure 18, 1127-1139
36 Poplawski, A., et al. (2014) Molecular insights into the recognition of N-terminal histone modifications by the BRPF1 bromodomain. J Mol Biol 426, 1661-1676
37 Liu, L., et al. (2012) Solution structure of an atypical PHD finger in BRPF2 and its interaction with DNA. J Struct Biol 180, 165-173
38 Barski, A., et al. (2007) High-resolution profiling of histone methylations in the human genome. Cell 129, 823-837
39 Chedin, F. (2011) The DNMT3 family of mammalian de novo DNA methyltransferases. Prog Mol Biol Transl Sci 101, 255-285
40 Chen, T., et al. (2004) The PWWP domain of Dnmt3a and Dnmt3b is required for directing DNA methylation to the major satellite repeats at pericentric heterochromatin. Mol Cell Biol 24, 9048-9058
41 Ge, Y.Z., et al. (2004) Chromatin targeting of de novo DNA methyltransferases by the PWWP domain. J Biol Chem 279, 25447-25454
42 Hu, J.L., et al. (2009) The N-terminus of histone H3 is required for de novo DNA methylation in chromatin. Proc Natl Acad Sci U S A 106, 22187-22192
43 Zhang, Y., et al. (2010) Chromatin methylation activity of Dnmt3a and Dnmt3a/3L is guided by interaction of the ADD domain with the histone H3 tail. Nucleic Acids Res 38, 4246-4253
44 Shirohzu, H., et al. (2002) Three novel DNMT3B mutations in Japanese patients with ICF syndrome. Am J Med Genet 112, 31-37
45 Ullah, M., et al. (2008) Molecular architecture of quartet MOZ/MORF histone acetyltransferase complexes. Mol Cell Biol 28, 6828-6843
46 Voss, A.K., et al. (2009) Moz and retinoic acid coordinately regulate H3K9 acetylation, Hox gene expression, and segment identity. Dev Cell 17, 674-686
47 Doyon, Y., et al. (2006) ING tumor suppressor proteins are critical regulators of chromatin acetylation required for genome expression and perpetuation. Mol Cell 21, 51-64
48 Gilbert, T.M., et al. (2014) An H3K36me3 binding PWWP protein targets the NuA3 acetyltransferase complex to coordinate transcriptional elongation at coding regions. Mol Cell Proteomics
49 Wagner, E.J., and Carpenter, P.B. (2012) Understanding the language of Lys36 methylation at histone H3. Nat Rev Mol Cell Biol 13, 115-126
50 Sanders, S.L., et al. (2004) Methylation of histone H4 lysine 20 controls recruitment of Crb2 to sites of DNA damage. Cell 119, 603-614
51 Fang, R., et al. (2013) LSD2/KDM1B and its cofactor NPAC/GLYR1 endow a structural and molecular model for regulation of H3K4 demethylation. Mol Cell 49, 558-570
52 Vermeulen, M., et al. (2010) Quantitative interaction proteomics and genome-wide profiling of epigenetic histone marks and their readers. Cell 142, 967-980
53 Panier, S., and Boulton, S.J. (2014) Double-strand break repair: 53BP1 comes into focus. Nat Rev Mol Cell Biol 15, 7-18
54 Pei, H., et al. (2011) MMSET regulates histone H4K20 methylation and 53BP1 accumulation at DNA damage sites. Nature 470, 124-128
55 Hajdu, I., et al. (2011) Wolf-Hirschhorn syndrome candidate 1 is involved in the cellular response to DNA damage. Proc Natl Acad Sci U S A 108, 13130-13134
56 Sy, S.M., et al. (2010) The 53BP1-EXPAND1 connection in chromatin structure regulation. Nucleus 1, 472-474
57 Huen, M.S., et al. (2010) Regulation of chromatin architecture by the PWWP domain-containing DNA damage-responsive factor EXPAND1/MUM1. Mol Cell 37, 854-864
58 Sartori, A.A., et al. (2007) Human CtIP promotes DNA end resection. Nature 450, 509-514
59 You, Z., et al. (2009) CtIP links DNA double-strand break sensing to resection. Mol Cell 36, 954-969
60 Daugaard, M., et al. (2012) LEDGF (p75) promotes DNA-end resection and homologous recombination. Nat Struct Mol Biol 19, 803-810
61 Schmidt, C.K., and Jackson, S.P. (2013) On your mark, get SET(D2), go! H3K36me3 primes DNA mismatch repair. Cell 153, 513-515
62 Kariola, R., et al. (2002) Functional analysis of MSH6 mutations linked to kindreds with putative hereditary non-polyposis colorectal cancer syndrome. Hum Mol Genet 11, 1303-1310
63 Smolle, M., et al. (2012) Chromatin remodelers Isw1 and Chd1 maintain chromatin structure during transcription by preventing histone exchange. Nat Struct Mol Biol 19, 884-892
64 Maltby, V.E., et al. (2012) Histone H3 lysine 36 methylation targets the Isw1b remodeling complex to chromatin. Mol Cell Biol 32, 3479-3485
65 Venkatesh, S., et al. (2012) Set2 methylation of histone H3 lysine 36 suppresses histone exchange on transcribed genes. Nature 489, 452-455
66 Elsaesser, S.J., et al. (2010) New functions for an old variant: no substitute for histone H3.3. Curr Opin Genet Dev 20, 110-117
67 Llano, M., et al. (2006) Identification and characterization of the chromatin-binding domains of the HIV-1 integrase interactor LEDGF/p75. J Mol Biol 360, 760-773
68 Cherepanov, P., et al. (2004) Identification of an evolutionarily conserved domain in human lens epithelium-derived growth factor/transcriptional co-activator p75 (LEDGF/p75) that binds HIV-1 integrase. J Biol Chem 279, 48883-48892
69 Dietz, F., et al. (2002) The family of hepatoma-derived growth factor proteins: characterization of a new member HRP-4 and classification of its subfamilies. Biochem J 366, 491-500
70 Ge, H., et al. (1998) Isolation of cDNAs encoding novel transcription coactivators p52 and p75 reveals an alternate regulatory mechanism of transcriptional activation. EMBO J 17, 6723-6729
71 Ge, H., et al. (1998) A novel transcriptional coactivator, p52, functionally interacts with the essential splicing factor ASF/SF2. Mol Cell 2, 751-759
72 Yokoyama, A., and Cleary, M.L. (2008) Menin critically links MLL proteins with LEDGF on cancer-associated target genes. Cancer Cell 14, 36-46
73 Christ, F., and Debyser, Z. (2013) The LEDGF/p75 integrase interaction, a novel target for anti-HIV therapy. Virology 435, 102-109
74 Botbol, Y., et al. (2008) Chromatinized templates reveal the requirement for the LEDGF/p75 PWWP domain during HIV-1 integration in vitro. Nucleic Acids Res 36, 1237-1246
75 Sims, R.J., 3rd, and Reinberg, D. (2008) Is there a code embedded in proteins that is based on post-translational modifications? Nat Rev Mol Cell Biol 9, 815-820
76 Qin, S., et al. (2014) Structural basis for histone mimicry and hijacking of host proteins by influenza virus protein NS1. Nat Commun 5, 3952
77 Marazzi, I., et al. (2012) Suppression of the antiviral response by an influenza histone mimic. Nature 483, 428-433
78 Filippakopoulos, P., and Knapp, S. (2014) Targeting bromodomains: epigenetic readers of lysine acetylation. Nat Rev Drug Discov 13, 337-356
79 Liu, Y., et al. (2014) Epigenetic targets and drug discovery: Part 1: Histone methylation. Pharmacol Ther
80 Gong, W., et al. (2014) Retinoblastoma-binding protein 1 has an interdigitated double Tudor domain with DNA binding activity. J Biol Chem 289, 4882-4895


Figure legends

Figure 1: 3D structures of representative human PWWP domains. The β barrel is colored in green and its strands are numbered 1 to 5 in black. The helix bundle is colored in blue and the two conserved helices are numbered 1 and 2 in red. The insertion between β4 and β5 is colored in cyan, and the insertion between β2 and β3 and structural elements that do not belong to a classic PWWP domain are colored in gray. The residues forming an aromatic cage (except RBBP1, which harbors an incomplete cage, panel G) are shown in stick mode. The protein names and their corresponding PDB codes are listed atop each panel (A-J). PSIP1, PC4 and SFRS1 interacting protein 1; DNMT3A, DNA (cytosine-5-)-methyltransferase 3 alpha; MSH6, mutS homolog 6; NSD3, nuclear receptor-binding SET domain-containing protein 3; MUM1, mutated melanoma-associated antigen 1; BRPF1, bromodomain and PHD finger-containing protein 1; RBBP1, retinoblastoma-binding protein 1; ZMYND11, zinc finger MYND domain-containing protein 11; PWWP2B, PWWP domain-containing protein 2B; HDGF, hepatoma-derived growth factor.

Figure 2: Sequence alignment of the human PWWP domains (β-barrel part). The human PWWP domains are grouped based on sequence similarity. The secondary structure elements of each representative are shown atop and colored as in Figure 1. The conserved PWWP motif is boxed, and the aromatic residues forming the methyllysine cage are highlighted in yellow.

Figure 3: Sequence alignment of the human PWWP domains (helix-bundle part). Two conserved helices are numbered as α1 and α2.


Figure 4: DNA-binding sites are located on one side of the PWWP domains. The potential DNA-binding sites were identified by NMR titration and are shown in purple (A, HDGF [13]; B, PSIP1 [17]; and C, MSH6 [15]). Cartoon models are on top, electrostatic potential surfaces on the bottom. The aromatic cage (stick mode) and a histone peptide (grey) are shown for comparison. The peptide is superimposed from the PDB entry 3QJ6.

Figure 5: Histone peptides bind to the PWWP domain in a similar direction. All available complex structures of different PWWP domains with their corresponding ligands are shown here. A, complex of BRPF1-PWWP with H3K36me3; B and C, complexes of HDGF2-PWWP with H4K20me3 and H3K79me3, respectively; D, complex of the Bromo-Zinc-PWWP cassette of ZMYND11 with H3.3K36me3. The residues forming the aromatic cage in the PWWP domains are colored in salmon and shown in stick mode. The histone peptides are colored in yellow and the methyllysine residue is shown in stick mode. The detailed interaction of ZMYND11 with H3.3S31 is also shown, in comparison with ZMYND8 (D).


Table 1. The PWWP domain-containing proteins in the human proteome

Protein names | Other histone/DNA binding domains* | Functions | Diseases | Refs
DNMT3A | ADD (H3K4me0) | de novo DNA methyltransferase | | [32, 39]
DNMT3B | ADD | de novo DNA methyltransferase | ICF syndrome | [39, 44]
BRPF1 | PHD, Bromo (H2AK5ac, H4K12ac, H3K14ac) | Subunit of MOZ/MORF histone acetyltransferase complex | | [36]
BRPF2/BRD1 | PHD1 (H3K4me0), PHD2 (DNA), Bromo | Subunit of MOZ/MORF histone acetyltransferase complex | | [33, 37]
BRPF3 | PHD, Bromo | Subunit of MOZ/MORF histone acetyltransferase complex | |
NSD1 | PHD | Histone lysine methyltransferase | Sotos syndrome 1; Beckwith-Wiedemann syndrome; cancers (AML, prostate, neuroblastoma, breast) |
NSD2/WHSC1/MMSET | PHD, HMG | Histone lysine methyltransferase | Wolf-Hirschhorn syndrome; multiple myeloma |
NSD3/WHSC1L1 | PHD5 (H3K4me0) | Histone lysine methyltransferase | Breast cancer; AML; myelodysplastic syndrome | [34]
ZMYND8/RACK7/PRKCBP1 | PHD, Bromo | | |
ZMYND11/BS69 | PHD, Bromo | Transcriptional repressor | | [9]
HDGF | | Transcriptional repressor | |
HDGFL1 | | | |
HDGFRP2/HDGF2 | | | |
HDGFRP3 | | | |
PSIP1/LEDGF/p75 | AT hook | DNA repair, coactivator | | [26, 60]
GLYR1/NPAC | AT hook | Cofactor of LSD2 | | [51]
ZCWPW1 | CW (H3K4me3) | | | [35]
ZCWPW2 | CW (H3K4me3) | | |
ARID4A/RBBP1 | ARID (DNA), Tudor (DNA), chromo (H4K20me3) | Leukemia and tumor suppressor | | [28, 80]
ARID4B/RBBP1L1 | ARID, Tudor, chromo | Leukemia and tumor suppressor | |
MSH6 | | Mismatch repair | Hereditary non-polyposis colorectal cancer 5; endometrial cancer; mismatch repair cancer syndrome | [27]
MUM1/EXPAND1 | | Accessory factor in the DNA damage response pathway | | [57]
PWWP2B | | | |
MBD5 | MBD | | Mental retardation, autosomal dominant 1 |

* The known binding ligands are shown in brackets.


Highlights:

1. PWWP domain exhibits strong binding affinity to histone lysine modified nucleosome

2. PWWP domain proteins are associated with chromatin in a PWWP-dependent manner.

3. The PWWP domain is involved in crosstalk of different epigenetic marks.

4. Mutations in the PWWP domain have been linked to various human diseases.

[Fig. 1 image. Panels: (A) PSIP1, 4FU6; (B) DNMT3A, 3LLR; (C) MSH6, 2GFU; (D) NSD3_C, 2DAQ; (E) MUM1, 3PMI; (F) BRPF1, 3L42; (G) RBBP1, 2YRV; (H) ZMYND11, 4NS5; (I) PWWP2B, 4LD6; (J) HDGF homodimer, 2NLU. In-figure labels: strand numbers 1-5, helix numbers 1-2, N and C termini, Bromodomain, Zinc finger.]

[Fig. 2 image: sequence alignment. In-figure labels mark the secondary structure elements β1-β5 above each group, with additional η, α and β elements in some groups.]

[Fig. 3 image: sequence alignment. In-figure labels mark helices α1 and α2, with additional α and η elements in some groups.]

[Fig. 4 image. Panels: (A) HDGF; (B) PSIP1; (C) MSH6. In-figure labels: N and C termini, DNA binding surface.]

[Fig. 5 image. Panels: (A) BRPF1-H3K36me3, 3MO8; (B) HDGF2-H4K20me3, 3QBY; (C) HDGF2-H3K79me3, 3QJ6; (D) ZMYND11-H3.3K36me3, 4N4I. In-figure labels: K36me3, K20me3, K79me3, A31, S31, E251, E248, N266, C263, ZMYND11, ZMYND8, N and C termini, Bromodomain, Zinc finger.]