

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1986 by The American Society of Biological Chemists, Inc.

Vol. 261, No. 24, Issue of August 25, pp. 11266-11273, 1986 Printed in U.S.A.

Purification and Domain Structure of Core hnRNP Proteins A1 and A2 and Their Relationship to Single-stranded DNA-binding Proteins*

(Received for publication, February 3, 1986)

Amalendra Kumar‡, Kenneth R. Williams§, and Wlodzimierz Szer‡

From the ‡Department of Biochemistry, New York University School of Medicine, New York, New York 10016 and the §Department of Molecular Biophysics and Biochemistry, Yale University School of Medicine, New Haven, Connecticut 06510

Protein A1 (Mr ~32,000), a major glycine-rich protein of heterogeneous nuclear ribonucleoproteins (hnRNP), was purified to near homogeneity under nondenaturing conditions from HeLa cells. Limited proteolysis of the native protein yields a trypsin-resistant N-terminal nucleic acid-binding domain about 195 amino acids long which has a primary structure nearly identical to that of the 195-amino acid-long single-stranded DNA (ssDNA)-binding protein UP1 (Mr 22,162) from calf thymus (Williams, K. R., Stone, K. L., LoPresti, M. B., Merrill, B. M., and Planck, S. R. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 5666-5670). 45 of the 61 glycine residues of A1 are present in the trypsin-sensitive C-terminal domain of the protein, which contains no sequences homologous to UP1. Protein A2, another major glycine-rich core hnRNP protein from HeLa, has a domain structure analogous to A1 and appears to be related to ssDNA-binding proteins UP1-B from calf liver and HDP-1 from mouse myeloma in a way similar to the A1/UP1 relationship. In contrast to ssDNA-binding proteins, A1 binds preferentially to RNA over ssDNA and exhibits no helix-destabilizing activity.

In eukaryotic cells, the newly synthesized hnRNA associates with a set of proteins and forms ribonucleoprotein particles (hnRNP¹). These particles can be seen in electron micrographs of transcriptionally active chromatin as fibrils containing 20-nm beads spaced along their length and are generally believed to be part of the hnRNA-processing apparatus (for review, see Refs. 1 and 2). The 20-nm substructures, often called monoparticles, can be recovered from purified nuclei as a fairly homogeneous peak sedimenting at about 30-40 S in the sucrose gradient. The monoparticles, which arise during isolation as a result of endonucleolytic cleavage of the larger hnRNP complexes, contain, in addition to pre-mRNA sequences, a set of about nine prominent polypeptides called core proteins and a number of minor protein components (3). In HeLa cells, the major core proteins are termed A1, A2, B1 (two or three subspecies), B2, C1, C2, and C3, in order of increasing molecular weights (4). Most of the core proteins are modified post-translationally, primarily by phosphorylation. The number of core proteins in a variety of organisms is similar, and they all have molecular weights between 32,000 and 42,000. Most are basic proteins that have similar amino acid compositions characterized by a high content of glycine (about 20%), a low content of cysteine, the presence of the unusual modified amino acid dimethylarginine, and a blocked N terminus. Immunological investigations, including the use of monoclonal antibodies, demonstrate that core proteins comprise a family of related and evolutionarily conserved polypeptides (5, 6). Furthermore, a cDNA clone of a major hnRNP protein from brine shrimp Artemia salina cross-hybridizes with DNA sequences of plant, avian, and mammalian origin (7).

* This work was supported by National Institutes of Health Grants GM 31539 (to K. R. W.), GM 23705, and CA 16239 (to W. S.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

¹ The abbreviations used are: hnRNP, heterogeneous nuclear ribonucleoprotein; ssDNA, single-stranded DNA; SDS, sodium dodecyl sulfate; HPLC, high pressure liquid chromatography; DTT, dithiothreitol; BSA, bovine serum albumin; PMSF, phenylmethylsulfonyl fluoride.

The core proteins appear to be devoid of any enzymatic activity and are believed to function as structural components that condense and organize hnRNA into the characteristic beads-on-a-string hnRNP particles. Nucleoprotein complexes that resemble native hnRNP can be reconstituted from exogenous RNA and these proteins (8-11). It has been shown by optical methods that the RNA in hnRNP, both native and reconstituted, is substantially devoid of any residual secondary structure, indicating that at least some of the core proteins may have RNA helix-destabilizing properties (12-14). A major hnRNP protein that was purified to homogeneity under nondenaturing conditions from A. salina was shown to be an RNA helix-destabilizing protein (15, 16).

It was recently demonstrated that the ssDNA-binding protein (also termed a DNA helix-destabilizing protein) from calf thymus, UP1 (17), shares some antigenic determinants in common with core hnRNP proteins from calf thymus and HeLa cells (18). Proteins analogous to UP1 were also isolated from mouse myeloma (19). In calf thymus, mouse myeloma, and probably in HeLa cells (18), there appears to be a group of proteins related structurally and antigenically to UP1 (UP1-type proteins) with molecular weights ranging from 22,000 to 28,000, and immunological investigations suggest that these proteins may arise by the action of an endogenous protease on the larger and more abundant core hnRNP proteins (20), e.g. the mouse myeloma UP1-type protein obtained by standard purification has a Mr of about 24,000, whereas an immunoblot of the crude homogenate detects a species of Mr ~36,000 (21). Furthermore, a full-length cDNA clone of a UP1-type protein from rat was found to contain an open reading frame coding for a protein of 320 rather than the expected ~200 amino acids (22). The clone was isolated by screening a cDNA library constructed with total poly(A)+ RNA of newborn rat brain with oligonucleotide probes based on peptide sequences of calf thymus UP1. The complete sequence of the 195 amino acids of calf thymus UP1 (Mr 22,162) has been determined (23) and was found to be identical to the predicted sequence of the 195-amino acid-long N-terminal fragment of the rat clone (22). Partial amino acid sequencing of an analogous protein from mouse myeloma also shows a high degree of sequence homology with calf thymus UP1 (23). The UP1-type proteins bind tightly to single-stranded but not double-stranded DNA and stimulate the activity of the homologous DNA polymerase α (24, 25). They also have a substantial affinity for single-stranded RNA. These properties of the UP1-type mammalian proteins are similar to those of the well-characterized prokaryotic ssDNA-binding proteins, the phage T4 gene 32 protein, and the Escherichia coli SSB DNA protein that play essential roles in replication, recombination, and repair. However, a fundamental property of prokaryotic ssDNA-binding proteins, and one thought to be critical for their function, viz. cooperative binding to single-stranded DNA, is not shared by UP1-type proteins.

In this report, we describe the purification of core proteins A1, B2, C1, and A2 from HeLa cells, the first three under nondenaturing conditions. We analyze the domain structure of A1 and A2 and their relationship to some ssDNA-binding proteins.

EXPERIMENTAL PROCEDURES AND RESULTS²

Primary Structures of the N-terminal Fragment of HeLa Core Protein A1 and the ssDNA-binding Protein UP1 from Calf Thymus Are Nearly Identical-Limited proteolysis of a native protein often generates large polypeptides that may give some insights into the domain structure of the protein. As seen from Fig. 4, partial digestion of A1 (Mr ~32,000) with trypsin at 0 °C yields, within a few minutes, a fragment of Mr ~22,000 that remains completely resistant to further proteolysis for at least 2 h, i.e. for about 1 h after the disappearance of the A1 substrate. In contrast, the remaining part of the A1 molecule, about 10,000-11,000 daltons, is sensitive to trypsin and seems to be completely degraded to small peptides that run off the gel. The Mr 22,000 tryptic fragment of A1 appears to have a blocked N-terminal amino acid since no discernible sequence was obtained when the material, electroeluted from SDS-polyacrylamide gels, was subjected to gas-phase sequencing. Protein A1 and other major core proteins apparently have blocked N termini, thus locating the Mr 22,000 tryptic fragment at the N-terminal end of A1. The trypsin-derived fragment of A1 migrates on the SDS-polyacrylamide gel at the same rate as authentic calf thymus UP1 (Fig. 4, lane 2) and has a similar amino acid composition (data not shown). Since calf thymus UP1 also has a blocked N-terminal amino acid, these results prompted us to further examine and compare these two proteins. The complete primary structure of UP1 has been recently determined (23); it contains 195 amino acids (Mr 22,162), has 2 cysteines, and contains a single NG,NG-dimethylarginine residue near its C terminus. A comparison of HPLC maps of peptides from exhaustive tryptic digests of A1 and UP1 (Fig. 5) demonstrates that virtually every UP1 tryptic peptide makes its appearance at the same position in the A1 map. The A1 pattern shows some 10-11 additional peptides absent from the UP1 map (indicated by arrows in Fig. 5) which are presumably derived from the trypsin-sensitive C-terminal fragments of the protein. There are about 43 peptides in the HPLC map of A1, in fairly good agreement with the number of lysine and arginine residues in the protein (Table I). The very high degree of homology between UP1 and the N-terminal domain of A1 indicated by the similarities of their HPLC maps has been confirmed by sequencing of six tryptic peptides of A1, four of which match the sequence of UP1 (Figs. 6 and 7). Of a total of 46 amino acids sequenced in the latter four peptides, 45 correspond exactly to sequences found in UP1. The one substitution involves the interchange of lysine in A1 for arginine in UP1 at position 30 (Fig. 6). The four matching A1 peptides cover UP1 sequences near the N terminus (positions 14-30), in the middle of the protein (positions 92-104), and near the C terminus (positions 146-153 and 183-194) of the protein. This, together with the HPLC data of Fig. 5, provides convincing evidence that the N-terminal Mr 22,000 fragment of HeLa A1 has a primary structure nearly identical to that of calf thymus UP1.

² Portions of this paper (including "Experimental Procedures," part of "Results," and Figs. 1-3) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are available from the Journal of Biological Chemistry, 9650 Rockville Pike, Bethesda, MD 20814. Request Document No. 86M-360, cite the authors, and include a check or money order for $3.20 per set of photocopies. Full size photocopies are also included in the microfilm edition of the Journal that is available from Waverly Press.

FIG. 4. Time course of digestion of protein A1 with trypsin followed by electrophoresis on a 15% polyacrylamide, 0.1% SDS gel. Digestion was at 0 °C at an enzyme-to-substrate ratio of 1:120. Lane 1, Mr markers as in Fig. 2b; lane 2, protein A1; lane 3, protein UP1 from calf thymus; lanes 4-9, digestion for 5, 15, 30, 60, 90, and 120 min, respectively. Each lane received about 10 µg of protein. An undialyzed preparation of A1 (see Fig. 2 and text) was used in this experiment. An identical time course of digestion was obtained with a dialyzed preparation (not shown).
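The identity figure quoted for the four matching A1 peptides can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and reading the gap between the 50 UP1 positions spanned by the peptides and the 46 residues actually sequenced as undetermined positions is an assumption based on the legend to Fig. 6.

```python
def percent_identity(matches: int, compared: int) -> float:
    """Percent of compared positions at which two sequences carry the same residue."""
    if compared <= 0 or not 0 <= matches <= compared:
        raise ValueError("need 0 <= matches <= compared and compared > 0")
    return 100.0 * matches / compared


# UP1 positions spanned by the four matching A1 peptides (inclusive ranges from the text).
spans = [(14, 30), (92, 104), (146, 153), (183, 194)]
positions_spanned = sum(end - start + 1 for start, end in spans)

# 46 residues were actually determined; 45 of them are identical to UP1.
a1_vs_up1 = percent_identity(45, 46)

print(f"positions spanned: {positions_spanned}")
print(f"A1 versus UP1 identity: {a1_vs_up1:.1f}%")
```

Only determined positions enter the denominator, which is why a single Lys/Arg interchange still leaves the two sequences close to 98% identical.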

UP1 is not a glycine-rich protein; it contains 7.7 mol % glycine, which is nearly identical to the 7.6 mol % frequency of glycine in the 2054 eukaryotic protein sequences in the PIR Protein Sequence Database. In contrast to UP1, A1

FIG. 5. Reverse-phase HPLC separation of tryptic peptides from the HeLa A1 (1.0 nmol) and the calf thymus UP1 (0.9 nmol) proteins (top) or from the HeLa A2 (1.0 nmol) and the calf liver UP1-B (0.7 nmol) proteins (bottom). The trichloroacetic acid-precipitated proteins were digested in 2 M urea, injected directly onto the Vydac C-18 column, and then eluted with increasing concentrations of acetonitrile as described under "Experimental Procedures." Arrows indicate peptides present in A1 or A2 chromatograms that are absent from the respective calf thymus UP1 or calf liver UP1-B profiles. The large absorbance peak at about 23 min is due to Tris-HCl.

TABLE I
Amino acid composition of HeLa proteins A1 and A2 and calf proteins UP1 (thymus) and UP1-B (liver)

                                  No. of residues
Amino acid    A1        UP1       ΔA1/UP1    A2        UP1-B     ΔA2/UP1-B
Asx           33.1      17        16         33.7      18.1      15.6
Thr           10.7      12         0          9.8      10.9       0
Ser           26.0      16        10         21.7      14.6       7.1
Glx           28.4      25         3         31.2      28.3       2.9
Pro           11.7       6         6         14.6       5.5       9.1
Gly           60.5      15        45         70.0      21.7      48.3
Ala           13.6      10         4         13.6      11.5       2.1
Val           15.4      17         0         13.9      14.1       0
Met            4.7       4         1          5.2       3.4       1.8
Ile            7.7       8         0          8.4       7.8       0.6
Leu           10.0       8         2         11.2       8.6       2.6
Tyr           13.2       4         9         18.0       4.4      13.6
Phe           17.5      10         8         16.6      11.6       5.0
His            6.7       8         0          7.1       7.1       0
Lys           18.4      16         2         16.3      14.5       1.8
Arg           22.3      15         7         26.4      16.7       9.7
Me2-Argᵃ       3.1       1         2          0.5                 0.5
Mr            32,000ᵇ   22,162               34,000ᵇ   22,000ᵇ

ᵃ Me2-Arg, dimethylarginine.
ᵇ Assumed monomer molecular weight.

contains 20.0 mol % glycine, which makes it unusually glycine-rich (Table I). In view of the near identity of UP1 and the N-terminal domain of A1, one would expect that those A1 peptides that come from the C terminus of the protein and do not match UP1 sequences should contain a high number of glycine residues. In fact, since the calculated mass of the C-terminal fragment of A1 is about 10,000-11,000 daltons

  1  Ser-Lys-Ser-Glu-Ser-Pro-Lys-Glu-Pro-Glu-Gln-Leu-Arg-Lys-Leu-Phe-Ile-Gly-Gly-Leu-   20
 21  Ser-Phe-Glu-Thr-Thr-Asp-Glu-Ser-Leu-Arg-Ser-His-Phe-Glu-Gln-Trp-Gly-Thr-Leu-Thr-   40
 41  Asp-Cys-Val-Val-Met-Arg-Asp-Pro-Asn-Thr-Lys-Arg-Ser-Arg-Gly-Phe-Gly-Phe-Val-Thr-   60
 61  Tyr-Ala-Thr-Val-Glu-Glu-Val-Asp-Ala-Ala-Met-Asn-Ala-Arg-Pro-His-Lys-Val-Asp-Gly-   80
 81  Arg-Val-Val-Glu-Pro-Lys-Arg-Ala-Val-Ser-Arg-Glu-Asp-Ser-Gln-Arg-Pro-Gly-Ala-His-  100
101  Leu-Thr-Val-Lys-Lys-Ile-Phe-Val-Gly-Gly-Ile-Lys-Glu-Asp-Thr-Glu-Glu-His-His-Leu-  120
121  Arg-Asp-Tyr-Phe-Glu-Gln-Tyr-Gly-Lys-Ile-Glu-Val-Ile-Glu-Ile-Met-Thr-Asp-Arg-Gly-  140
141  Ser-Gly-Lys-Lys-Arg-Gly-Phe-Ala-Phe-Val-Thr-Phe-Asp-Asp-His-Asp-Ser-Val-Asp-Lys-  160
161  Ile-Val-Ile-Gln-Lys-Tyr-His-Thr-Val-Asn-Gly-His-Asn-Cys-Glu-Val-Arg-Lys-Ala-Leu-  180
181  Ser-Lys-Gln-Glu-Met-Ala-Ser-Ala-Ser-Ser-Ser-Gln-Arg-Gly-Arg                       195
                                                    (CH3)2

FIG. 6. Amino acid sequence of the UP1 ssDNA-binding protein from calf thymus (23). Four of the six tryptic peptides from HeLa protein A1 that have been sequenced are indicated below the UP1 sequence. A solid line indicates identity; a dashed line indicates positions where an amino acid could not be determined. Note the Arg/Lys substitution at position 30.

I.  ...Ser-Gly-Ser-Gly-Asn-Phe-Gly-Gly-Gly-(X)-Gly-Gly-Gly-Phe-Gly-Gly-Asn-Asp...

II. ...Asn-Gln-Gly-Gly-Tyr-Gly-Gly-Ser-Ser-Ser-Ser-Ser-Ser-Tyr-Gly...

FIG. 7. Amino acid sequences of the two tryptic peptides of A1 that do not match UP1 sequences. The sequences match exactly positions 197-215 and 301-315 of the full-length rat clone (Ref. 22; see text).

(see Fig. 4 and Table I), its glycine content should be nearly 40 mol % (Table I). This prediction appears to be borne out by the sequences of two additional A1 tryptic peptides that show no homology to UP1 (Fig. 7). Of a total of 33 amino acids so far sequenced in these peptides, 15 (45%) are glycine residues. Also note from Table I that the calculated number of lysine and arginine residues in the C-terminal fragment of A1 is in fairly good agreement with the number of peptides in the HPLC map of A1 that are absent from the HPLC map of UP1 (Fig. 5). Finally, the amino acid sequences of the two peptides in Fig. 7 have recently been shown to match exactly residues 197-215 and 301-315 predicted from the sequence of the full-length rat UP1 cDNA clone which contains an open reading frame coding for a protein of 320 amino acids, rather than the expected 195 amino acids, of UP1 (Ref. 22; see Introduction). The predicted sequence of the 195-amino acid-long N-terminal fragment of the rat clone is identical to the sequence of calf thymus UP1 (Fig. 6), and the amino acid composition predicted for the full-length rat protein agrees with that of HeLa A1 (Table I). Hence, based on the matching sequences of Fig. 7, the clone appears to correspond to a rat A1-type protein.
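The glycine arithmetic behind this argument can be sketched in a few lines. The numbers below are copied from Table I and Fig. 7; the helper name and the choice to use the sum of the ΔA1/UP1 column (115 residues) as the size of the C-terminal fragment are assumptions made for illustration, not part of the original analysis.

```python
def mol_percent(count: float, total: float) -> float:
    """Residues of one kind as a percentage of all residues."""
    return 100.0 * count / total


# UP1: 15 Gly in 195 residues (sequence, Fig. 6).
up1_gly = mol_percent(15, 195)

# C-terminal part of A1, taken here as the A1 minus UP1 difference column of Table I:
# 45 Gly out of 115 residues in total (assumption: the column sum is the fragment size).
c_terminal_gly = mol_percent(45, 115)

# The two A1 peptides of Fig. 7 that have no counterpart in UP1 (Xaa = undetermined).
peptide_i = "Ser-Gly-Ser-Gly-Asn-Phe-Gly-Gly-Gly-Xaa-Gly-Gly-Gly-Phe-Gly-Gly-Asn-Asp"
peptide_ii = "Asn-Gln-Gly-Gly-Tyr-Gly-Gly-Ser-Ser-Ser-Ser-Ser-Ser-Tyr-Gly"
residues = peptide_i.split("-") + peptide_ii.split("-")
gly_in_peptides = residues.count("Gly")
peptide_gly = mol_percent(gly_in_peptides, len(residues))

print(f"UP1: {up1_gly:.1f} mol % Gly")
print(f"A1 C-terminal fragment: {c_terminal_gly:.0f} mol % Gly")
print(f"Fig. 7 peptides: {gly_in_peptides}/{len(residues)} = {peptide_gly:.0f}% Gly")
```

Under these assumptions the composition-based estimate (about 39 mol %) and the directly sequenced peptides (45%) both point to a C-terminal domain several times richer in glycine than UP1.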

Tryptic Digests of Core Proteins A2, B2, and C1-Protein A2 (Mr ~34,000), a major glycine-rich core protein which is present in 40 S monoparticles at a molar ratio of approximately 1:1 to protein A1, was obtained pure by CM-cellulose chromatography in 6.0 M urea or electroelution from SDS-polyacrylamide gels (Fig. 3 and text). The amino acid compositions of A1 and A2 are similar (Table I; cf. Ref. 3). The HPLC map of peptides from an exhaustive tryptic digest of A2 (Fig. 5) contains about 41 peptides, in good agreement with the number of lysine and arginine residues in A2 (Table I). An HPLC map of tryptic peptides from an exhaustive digest of protein UP1-B (Mr ~22,000) is shown in Fig. 5. This ssDNA-binding protein was isolated along with UP1 from calf liver according to the procedure for the purification of UP1; it has an amino acid composition similar to UP1 (Table I). In addition, UP1 and UP1-B share a high degree of sequence homology. Of the 103 amino acids that have so far been sequenced in UP1-B, 84 have been shown to be identical to sequences in UP1.³ A comparison of the peptide maps of A2 and UP1-B shows that essentially each of the UP1-B tryptic peptides has its counterpart which elutes at the same position in the A2 map. In addition, the A2 map contains about 11 peptides that are absent from the UP1-B pattern (indicated by arrows in Fig. 5), in good agreement with the calculated number of lysine and arginine residues which can be assumed to be present in the 10,000-11,000-dalton fragment(s) of A2 remaining after subtracting the molecular mass of UP1-B from that of A2 (Table I). In view of the similarities of the peptide maps of the two proteins, these are presumably the glycine-rich peptides since the glycine content of UP1-B and A2 is 18 and 70 residues, respectively.
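The comparison between peptide counts and Lys + Arg content rests on the fact that trypsin cleaves on the C-terminal side of lysine and arginine, so a complete digest yields at most one peptide more than the number of such residues. A minimal sketch of that bookkeeping follows; the residue numbers are copied from Table I, while the function names are hypothetical and the treatment ignores resistant bonds (for example before proline or at dimethylarginine), which is a simplifying assumption.

```python
# (Lys, Arg) residues per molecule, copied from Table I.
LYS_ARG = {
    "A1": (18.4, 22.3),
    "A2": (16.3, 26.4),
    "A1 minus UP1": (2.0, 7.0),
    "A2 minus UP1-B": (1.8, 9.7),
}


def cleavage_sites(lys: float, arg: float) -> float:
    """Potential tryptic cleavage sites: one per Lys or Arg residue."""
    return lys + arg


def max_peptides(lys: float, arg: float) -> float:
    """Upper bound on peptides from a complete digest of an intact chain."""
    return cleavage_sites(lys, arg) + 1


for name, (lys, arg) in LYS_ARG.items():
    print(f"{name}: {cleavage_sites(lys, arg):.1f} Lys + Arg residues")

print(f"A2, upper bound on peptides: {max_peptides(*LYS_ARG['A2']):.1f}")
```

For the difference rows the number of extra peptides is compared with the number of extra sites directly, since the C-terminal region is part of the same chain; the roughly 11 additional A2 peptides and the 11.5 additional Lys + Arg residues agree on that basis.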

Three A2 tryptic peptides have been sequenced so far. One of these sequences proved to match exactly residues 113-121 in UP1, whereas a second matched UP1 residues 166-172 with the exception that the A2 sequence has an isoleucine corresponding to valine 169 in UP1. The third peptide has a sequence, Glu-Glu-Ser-Gly-X-Pro-Gly-Ala-His-Val-Thr-Val, which is highly homologous to residues 92-103 in UP1 (Fig. 6) and is identical to the corresponding region in an analogous ssDNA-binding protein, HDP-1, isolated from mouse myeloma cells (23). Taken together with a previous study³ that indicates that the UP1 and UP1-B proteins may share as much as 82% sequence homology, these data indicate that the A1 and A2 hnRNP proteins are encoded by two different genes that nonetheless share a high degree of sequence homology. The UP1 and UP1-B ssDNA-binding proteins could result from limited proteolysis of the A1 and A2 hnRNP proteins, respectively (see "Discussion"). The HDP-1 ssDNA-binding protein that was previously reported in mouse myeloma cells (19) appears to result from cleavage of an A2 hnRNP protein. Thus, previous sequence differences between the calf thymus UP1 and mouse myeloma HDP-1 proteins result from the fact that these proteins are related to A1 and A2, respectively, and not, as was originally proposed (23), from interspecies sequence differences in UP1.

Proteins B2 and C1 were purified under nondenaturing conditions (see Miniprint) and subjected to partial trypsin digestion. As seen from Fig. 8, regardless of whether the proteins were digested at 22 or 37 °C, the results are analogous to those obtained with A1 (Fig. 4) in that each protein loses uniformly about 12,000-13,000 daltons and a large trypsin-resistant fragment remains intact. The remaining part of the C1 molecule appears to be completely degraded to small peptides that run off the gel, as is the case with A1. Protein B2 (Mr ~38,000) seems to have a single trypsin-sensitive bond under our conditions of digestion (enzyme-to-substrate ratio of 1:30, w/w) since it yields two fragments of Mr ~26,000 and ~13,000.

³ B. Merrill, K. Stone, S. Riva, and K. R. Williams, unpublished data.

FIG. 8. Digestion of core proteins C1 (lanes 2-4) and B2 (lanes 5-7) with trypsin followed by electrophoresis on a 12% polyacrylamide, 0.1% SDS gel. The enzyme-to-substrate ratio was 1:30 (w/w); incubation time was 30 min. Lane 1, Mr markers, as in Fig. 2; lanes 2 and 5, proteins C1 (4 µg) and B2 (5 µg), respectively; lanes 3 and 6, digestion at 22 °C; lanes 4 and 7, digestion at 37 °C. Lanes 3, 4, 6, and 7 received 8 µg of proteins each.

Binding of A1 to Polynucleotides-We have examined some of the nucleic acid binding properties of A1, and we find that they differ considerably from those of UP1. We did not find any perturbation of CD spectra of HeLa hnRNA, coliphage MS2 RNA, poly[d(A-T)], poly[r(A+U)], and poly(rA) at 25-35 µM nucleotide upon the addition of A1 up to a molar ratio of one protein per 8-12 nucleotides. On the other hand, experiments employing the nitrocellulose filter binding assay (Table II) have shown that, at 10 µM nucleotide, A1 forms complexes with a variety of labeled natural and synthetic RNAs and single-stranded and double-stranded DNA. This suggests that, in contrast to UP1 (28), binding of A1 is not accompanied by conformational changes indicative of the unfolding of the secondary structure of the polynucleotide. Whereas the UV and CD spectra of the homopolymer pair (rA + rU) are not perturbed by A1, addition of the protein affects the rate and extent of duplex formation at low ionic strength. In a typical experiment shown in Fig. 9, the addition of A1 to poly(rU) in a buffer containing 20 mM NaCl at a ratio of one protein per 12 nucleotides prior to the addition of poly(rA) increases the t1/2 for duplex formation from 0.95 min in the absence of A1 to 4.1 min in the presence of A1. Essentially the same results are obtained regardless of

TABLE II
Binding of protein A1 to labeled polynucleotides

The nitrocellulose binding assay was as described under "Experimental Procedures" with protein A1 at 0.8 µM and polynucleotides at 10.0 µM (nucleotide).

                            Polynucleotide retained on filter at the indicated
                            concentration of NaCl in the binding buffer
Polynucleotide              80 mM       300 mM      500 mM
                                        % input
hnRNA                       100         32          0
Coliphage MS2 RNA           100         38          >1
Poly(rU)                     78         28          0
Poly(rC)                     69         NDᵃ         ND
Poly(rA)                     75         ND          0
Poly[r(A+U)]                 73         36          0
pBR322 ssDNA                 76         26          >2
pBR322 dsDNA, linear         73         21          >2

ᵃ ND, not determined; dsDNA, double-stranded DNA.

FIG. 9. Effect of protein A1 on the formation of the poly[r(U+A)] duplex. Poly(rU) (4.0 nmol) was mixed with poly(rA) (4.0 nmol) in the presence (curve 1) or absence (curve 2) of A1 (0.33 nmol) at 20 °C in a buffer containing 10.0 mM Tris-HCl, pH 7.5, and 20 mM NaCl in a final volume of 0.25 ml. Changes in absorbance at 260 nm were recorded as a function of time (min). The arrow indicates addition of NaCl from a 4.0 M stock solution. The hyperchromicity of the random coil versus duplex was 44%.

whether the protein is initially mixed with the polypurine or the polypyrimidine. The decrease in the rate of annealing suggests that some rearrangement of the initial A1-single-stranded polynucleotide complex is required to allow duplex formation and could be the rate-limiting step. Dissociation of the protein from the initial complex seems unlikely since under these conditions A1 binds to synthetic duplex RNA (Table II). Also note from Fig. 9 that the presence of A1 allows about 60% of the duplex to be formed at 20 mM NaCl. A somewhat higher NaCl concentration, which favors duplex formation, is required to complete the reaction.
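The quantities quoted for the annealing experiment follow directly from the amounts given in the legend to Fig. 9. The sketch below only reworks those published numbers; the helper name is hypothetical, and no kinetic model is assumed beyond comparing the two half-times.

```python
def micromolar(nmol: float, volume_ml: float) -> float:
    """Concentration in µM; nmol per ml is numerically equal to µmol per liter."""
    return nmol / volume_ml


# Amounts from the legend to Fig. 9.
poly_rU_nmol = 4.0      # nucleotide
poly_rA_nmol = 4.0      # nucleotide
a1_nmol = 0.33          # protein
volume_ml = 0.25

# A1 was added to poly(rU) before poly(rA), so the ratio refers to poly(rU).
nucleotides_per_a1 = poly_rU_nmol / a1_nmol

total_nucleotide_um = micromolar(poly_rU_nmol + poly_rA_nmol, volume_ml)
a1_um = micromolar(a1_nmol, volume_ml)

# Half-times for duplex formation quoted in the text (min).
t_half_without_a1 = 0.95
t_half_with_a1 = 4.1
slowdown = t_half_with_a1 / t_half_without_a1

print(f"{nucleotides_per_a1:.1f} poly(rU) nucleotides per A1")
print(f"{total_nucleotide_um:.0f} µM total nucleotide, {a1_um:.2f} µM A1")
print(f"annealing is about {slowdown:.1f}-fold slower with A1")
```

The computed ratio of about 12 nucleotides per protein matches the figure given in the text, and the half-times correspond to a slowdown of roughly fourfold.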

The binding of A1 to polynucleotides is sensitive to elevated concentrations of NaCl (Table II). Whereas complexes of UP1 with single-stranded DNA are stable in buffers containing up to 0.4-0.5 M NaCl (17), dissociation of A1-polynucleotide complexes starts at about 0.09-0.1 M NaCl (not shown) and is complete at 0.5 M NaCl (Table II). The nitrocellulose binding assay was also used to measure the relative affinity of A1 for various polynucleotides. Table III shows the results of competition experiments in which a 3-fold excess of unlabeled polynucleotide was mixed with HeLa [3H]hnRNA or with coliphage MS2 [3H]RNA (each at 10.0 µM nucleotide) prior to the addition of A1. The results indicate a preference for the binding of natural RNA since none of the synthetic polymers nor DNA competed effectively for A1 with MS2 RNA or with hnRNA, whereas MS2 RNA reduced the binding to hnRNA by more than half.

TABLE III
Competition of polynucleotides for protein A1

Labeled [3H]hnRNA or coliphage MS2 [3H]RNA, each at 5.0-8.0 µM, was mixed with a 3-fold molar excess of unlabeled polynucleotide in the binding buffer before the addition of A1. The molar ratio of protein-to-nucleotide was kept at approximately 1:16 with respect to labeled polynucleotides.

Polynucleotide retained on filter (% input)

Competing polynucleotide    [3H]hnRNA    MS2 [3H]RNA
None                        100          100
rU                          72           67
rC                          100          97
rA                          95           89
rI                          ND(a)        93
r(A+U) (1:1)                92           85
MS2 RNA                     43
ssDNA (calf thymus)         78           83
dsDNA (calf thymus)         90           89
d(A-T)                      95           ND

(a) ND, not determined; dsDNA, double-stranded DNA.
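To make the competition data easier to compare, the retained fraction can be converted into the fraction displaced by each competitor. The sketch below uses values for the [3H]hnRNA column as read from Table III; the assignment of numbers to rows is a reading of the flattened table and should be treated as an assumption, and the helper name is hypothetical.

```python
def percent_displaced(retained_percent, control_percent=100.0):
    """Percent of labeled RNA displaced from A1, relative to the no-competitor control."""
    return 100.0 * (control_percent - retained_percent) / control_percent


# [3H]hnRNA retained on the filter (% input) with each unlabeled competitor.
# Row assignment is an assumption based on the flattened Table III.
retained_hnrna = {
    "rU": 72,
    "rC": 100,
    "rA": 95,
    "r(A+U)": 92,
    "MS2 RNA": 43,
}

displaced = {name: percent_displaced(value) for name, value in retained_hnrna.items()}
best_competitor = max(displaced, key=displaced.get)

for name, value in displaced.items():
    print(f"{name:8s} displaces {value:5.1f}% of bound hnRNA")
print("most effective competitor:", best_competitor)
```

On this reading only MS2 RNA displaces more than half of the bound hnRNA, in line with the statement in the text.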

DISCUSSION

The data presented in this paper show that the primary structure of the ssDNA-binding protein UP1 from calf thymus is nearly identical to that of the N-terminal domain of HeLa core hnRNP protein A1. The degree of homology, based on partial sequences of A1 shown in Fig. 6, is 98%, the only difference being the interchange of the chemically similar amino acids lysine and arginine. This latter difference may result from species-specific sequence differences between the HeLa cell A1 and the calf thymus UP1 protein. The sequence of the N-terminal 195 amino acids of the 320-amino acid-long presumed UP1-type protein from rat predicted from the sequence of the cDNA clone of Cobianchi et al. (22) is identical to that of calf thymus UP1. Since those HeLa A1 peptides from the C-terminal domain which have been sequenced (Fig. 7, about a third of the domain) match exactly to the rat sequence that extends beyond the N-terminal domain, the clone most likely corresponds to a rat A1 protein. Although interspecies relatedness of A1 was inferred from chemical and immunological analysis of core proteins, such a high degree of homology was unexpected and is indicative of profound evolutionary constraints. Partial amino acid sequencing of a major hnRNP protein from A. salina also demonstrates considerable homology to UP1 (footnote 4). Our analysis of protein A2 (Fig. 5 and sequences in text), taken together with sequence data on the ssDNA-binding protein UP1-B from calf liver (footnote 3) and on the HDP-1 ssDNA-binding protein from mouse myeloma (23), indicates that these two proteins are related to A2 in a way analogous to the A1/UP1 relationship. These results suggest that at least some ssDNA-binding proteins, until recently considered a separate class, and some core hnRNP proteins could be products of the same genes, viz. A1 and UP1 is one example; and A2, UP1-B, and HDP-1 is another. 
In view of the extensive sequence homologies between UP1, UP1-B, and HDP-1, it is obvious that A1 and A2 are closely related proteins. In fact, as can be seen from Fig. 5, the two proteins share about 23 tryptic peptides of a total of about 43.

Whereas A1 and A2 are not products of the same gene, they share a similar domain structure: an N-terminal domain, apparently identical to what is assumed to be the sequence of a ssDNA-binding protein, and a glycine-rich C-terminal domain of about 10,000-12,000 daltons. The division between the N-terminal and C-terminal domain of A1 can be taken to

A. Kumar, K. R. Williams, and W. Szer, unpublished data.


occur after glutamic acid 194 which is the last amino acid that is present in a region of internal sequence homology in UP1 (29). Hence, when residues 3-93 and 94-194 in UP1 are aligned, 32% of the amino acids in these two regions are identical and an additional 39% of these changes that are seen could be accomplished by single-base changes (Fig. 6). Preliminary results of oligonucleotide-UP1 covalent cross-linking and limited proteolysis studies (footnote 5) suggest that residues 3-93 and 94-194 represent two globular domains that can independently bind to single-stranded nucleic acids. This postulated domain structure (30) appears to be exactly analogous to that of the high mobility group protein E where limited proteolysis has been used to release two fragments corresponding to the approximately 92-amino acid residue repeat found in this protein; each of the resulting high mobility group protein E fragments is a water-soluble, ssDNA-binding protein with no apparent tendency to interact with the other (31). The apparent existence of multiple nucleic acid-binding sites in a number of eukaryotic ssDNA- and sRNA-binding proteins suggests that this structural feature may prove to be a widely shared property of this class of proteins (30). Whereas it seems certain that the N-terminal domain of A1 and A2 is involved in interactions with RNA, the role of the C-terminal domain is not clear, but it seems to exert an effect on the nucleic acid binding properties of the protein (see below). This domain has no obvious amino acid sequence homology with the two internal repeats in the N-terminal domain.
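The 32% identity quoted for the two internal repeats is a count of matching residues over aligned positions. The sketch below shows one way such a figure could be computed from a pre-aligned pair of sequences. The function name, the gap convention, and the toy sequences are all hypothetical; they are not the UP1 sequences of Fig. 6, and how gaps were scored in the paper is not stated.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned, ungapped positions at which two sequences are identical.

    Both sequences must already be aligned (equal length, gaps marked with `gap`).
    Positions where either sequence has a gap are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared


# Toy aligned sequences (made up for illustration only).
repeat_1 = "ACDEFGHIKL"
repeat_2 = "ACDQFGHVKL"

print(f"{percent_identity(repeat_1, repeat_2):.0f}% identical")
```

Applied to the real alignment of residues 3-93 and 94-194, the same count would be expected to reproduce the reported value only if gaps were handled in the same way.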

The N-terminal domain of A1 is remarkably resistant to degradation by trypsin (Fig. 4), and this is probably true for A2 as well since limited trypsin digestion of a mixture of native A1 and A2 (Fig. 1, fractions 76-79) yields two large fragments of Mr ~21,000 and ~22,000 (not shown). It was observed by others that treatment of 40 S monoparticles with trypsin yields a set of polypeptides that appear to be uniformly reduced in size by 11,000-13,000 daltons with respect to core proteins; these polypeptides react in Western blots with monoclonal antibodies against core proteins (5) and with antibodies against UP1 (18). The results of limited trypsin digestion of purified B2 and C1 (Fig. 8) are in line with these observations and suggest that most, perhaps all, core proteins contain a peculiarly trypsin-sensitive bond, analogous to that at Arg195 in A1 (Fig. 6), and it is the cleavage of this bond that could give rise to the set of polypeptides of Mr between 22,000 and 28,000 regarded as ssDNA-binding proteins. It is possible, but certainly not proven, that ssDNA-binding proteins are artifacts that arise by proteolytic degradation of core proteins during cell fractionation (20, 21). Such proteolytically degraded forms of proteins that arise during isolation and retain certain in vitro activities have been reported, e.g. the calf thymus terminal deoxynucleotidyltransferase (32) and the eukaryotic fatty-acid synthetase (33). It is also possible that ssDNA-binding proteins do have an autonomous function in the cell and arise by controlled in vivo cleavage of core proteins or by differential splicing of the corresponding primary transcripts. The number of copies of a UP1-type protein has been estimated at 0.8-1.0 x 10^6 per cell (17, 19); the number of copies of a major core protein is probably about four times greater (34). In view of the results reported here, as well as (i) the immunological investigations by Pandolfo et al. (20), and (ii) the lower Mr of the mouse myeloma UP1-type protein isolated biochemically as compared to the Mr of the species detected immunologically in crude extracts (21), the in vivo role of UP1-type ssDNA-binding proteins and the

B. M. Merrill and K. R. Williams, unpublished observations.

mechanism by which they are formed will have to be re-examined.

The purification of the A1 protein under nondenaturing conditions made possible an examination of some of its nucleic acid binding properties. Comparing A1 and UP1 in this respect reveals that the presence of the glycine-rich C-terminal domain modifies considerably the binding properties of the protein. A1 binds preferentially to RNA, whereas UP1 binds somewhat better to ssDNA than to RNA (28). The binding of UP1 disrupts base stacking interactions, a typical feature of RNA and DNA helix-destabilizing proteins (35), whereas the binding of A1 has no effect on the conformation of the polynucleotide ligand as judged from UV and CD spectra. Most probably related to its unwinding activity is the increased resistance of UP1-polynucleotide complexes to salt-induced dissociation as compared to A1 complexes which presumably involve primarily ionic bonds. Interactions of some aromatic amino acids with polynucleotides, in addition to ionic bonds, appear to be part of a general mechanism of the binding of helix-destabilizing proteins (36) and render their complexes relatively salt-resistant. The sensitivity of A1-polynucleotide complexes to salt is in line with the results of an earlier study which showed that A1 is among the first core proteins to dissociate when native 40 S monoparticles are sedimented in a sucrose gradient containing 0.2-0.3 M NaCl (4).

The modulatory effect of the C-terminal region of the protein on binding observed here in the A1/UP1 system suggests that this region is similar to analogous functional domains in the high mobility group protein 1 (31) and in phage T4 gene 32 (37) and E. coli SSB (38) single-stranded DNA-binding proteins. In all three cases, the proteolytic removal of a C-terminal fragment results in an increased helix-destabilizing ability of the truncated protein. In addition to limiting the unwinding activity, the C-terminal domain appears to be involved in functional protein/protein interactions, e.g. gene 32 protein and possibly also E. coli SSB protein, with components of the in vitro replication complex, and high mobility group protein 1 with histones. That the C-terminal domains of the T4 gene 32 protein and A1 hnRNP protein might be functionally similar is further suggested by the observation that both proteins contain an unusual serine-rich region within these postulated domains; gene 32 protein contains 8 serines between residues 280 and 288 (39), whereas A1 contains 6 serines extending from residue 308 to 313 (Fig. 7). Whereas it remains to be established whether the glycine-rich C-terminal domain of A1 is essential for interactions with other proteins forming the hnRNP particle, it is intriguing to find glycine-rich clusters in several nucleolar proteins (40, 41). One of these, the 110,000-dalton C23 protein, is thought to be a component of preribosomal particles.

Acknowledgments-We are grateful to S. Wilson and his colleagues and to G. Reeck and D. Teller for communicating their work prior to publication. We especially thank M. LoPresti, K. Stone, and N. Fernando for their excellent technical assistance. K. R. W. thanks W. Konigsberg for his continued interest in these studies.

REFERENCES

1. Holoubek, V. (1983) in Chromosomal Nonhistone Proteins (Hnilica, L. S., ed) Vol. 4, pp. 21-117, CRC Press, Inc., Boca Raton, FL
2. Knowler, J. T. (1983) Int. Rev. Cytol. 84, 103-153
3. Wilk, H. E., Werr, H., Friedrich, D., Klitz, H. H., and Schafer, K. P. (1985) Eur. J. Biochem. 146, 71-81
4. Beyer, A. L., Christensen, M. E., Walker, B. W., and LeStourgeon, W. M. (1977) Cell 11, 127-138
5. Lesser, G. P., Escara-Wilke, L., and Martin, T. E. (1984) J. Biol. Chem. 259, 1827-1833
6. Dreyfuss, G., Choi, Y. D., and Adam, S. A. (1984) Mol. Cell Biol. 4, 1104-1114
7. Cruz-Alvarez, M., Szer, W., and Pellicer, A. (1985) Nucleic Acids Res. 13, 3917-3930


8. Wilk, H. E., Angeli, G., and Schafer, K. P. (1983) Biochemistry 22, 4592-4600
9. Pullman, J. M., and Martin, T. E. (1983) J. Cell Biol. 97, 99-111
10. Thomas, J. O., Glowacka, S. K., and Szer, W. (1983) J. Mol. Biol. 171, 439-455
11. Economidis, I. V., and Pederson, T. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 4296-4300
12. Northemann, W., Seifert, H., and Heinrich, P. C. (1979) Hoppe-Seyler's Z. Physiol. Chem. 360, 877-888
13. Karn, J., Vidali, G., Boffa, L. C., and Allfrey, V. G. (1977) J. Biol. Chem. 252, 7307-7322
14. Nowak, L., Marvil, D. K., Thomas, J. O., Boublik, M., and Szer, W. (1980) J. Biol. Chem. 255, 6473-6478
15. Marvil, D. K., Nowak, L., and Szer, W. (1980) J. Biol. Chem. 255, 6466-6472
16. Thomas, J. O., Raziuddin, Sobota, A., Boublik, M., and Szer, W. (1981) Proc. Natl. Acad. Sci. U. S. A. 78, 2888-2892
17. Herrick, G., and Alberts, B. (1976) J. Biol. Chem. 251, 2124-2132
18. Valentini, O., Biamonti, G., Pandolfo, M., Morandi, C., and Riva, S. (1985) Nucleic Acids Res. 13, 337-346
19. Planck, S. R., and Wilson, S. H. (1980) J. Biol. Chem. 255, 11547-11556
20. Pandolfo, M., Valentini, O., Biamonti, G., and Riva, S. (1985) Nucleic Acids Res. 13, 6577-6590
21. Planck, S. R., and Wilson, S. H. (1985) Biochem. Biophys. Res. Commun. 131, 362-369
22. Cobianchi, F., Sen-Gupta, D., Zmudzka, B. Z., and Wilson, S. (1986) J. Biol. Chem. 261, 3536-3543
23. Williams, K. R., Stone, K. L., LoPresti, M. B., Merrill, B. M., and Planck, S. R. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 5666-5670
24. Herrick, G., Delius, H., and Alberts, B. (1976) J. Biol. Chem. 251, 2142-2146

Supplemental Material to

PURIFICATION AND DOMAIN STRUCTURE OF CORE hnRNP PROTEINS A1 AND A2, AND THEIR RELATIONSHIP TO ssDNA BINDING PROTEINS

Amalendra Kumar, Kenneth R. Williams and Wlodzimierz Szer

EXPERIMENTAL PROCEDURES

Materials

Polynucleotides were from Miles and coliphage 3H MS2 RNA was prepared as described (26); its specific radioactivity was 14,500 cpm/nmol. Trypsin and DNase I were from Worthington. pBR322 DNA was a gift from Dr. P. D'Eustachio.

Cell growth and fractionation

HeLa cells were grown in suspension culture in minimal essential medium (Joklik's modified, Gibco) supplemented with 5% fetal calf serum and kept at a density between 3 x 10^8 and 6 x 10^8 cells/l by sequential dilution every 24 h. Nuclei were obtained as described (3) and extracted by gentle stirring for four periods of 30 min each at 22° with 5 volumes of a buffer (RSB) containing 10 mM Tris-HCl, pH 8.0 (two extractions) or pH 9.0 (two subsequent extractions), 0.1 M NaCl, 1.5 mM MgCl2, 0.1 mM DTT and 0.1 mM phenylmethylsulphonyl fluoride. The nuclear extracts (see Fig. 1b, lane E) were combined, pelleted at 100,000 g for 16 h and the pellets suspended in 2-3 volumes of the same buffer at pH 7.6 with or without DTT (see Results) except that NaCl was adjusted to 2.0 M. The suspension was dialyzed overnight against the high-salt buffer, clarified by brief centrifugation at 10,000 g and fractionated by gel permeation chromatography (Figs. 1 and 2). If 40S particles were desired, the pellets were suspended in RSB (pH 7.6), layered on 15-30% sucrose gradients in the same buffer and centrifuged at 21,000 rpm for 16 h at 4° in the Beckman SW28 rotor. The fractions of the gradient containing the hnRNP particles were combined, pelleted by high-speed centrifugation and prepared for gel permeation chromatography as described above for the nuclear extracts. The yield was about 2.0-2.5 mg of total hnRNP protein from 10^9 cells.

For the isolation of 3H hnRNA from 40S particles, cells were concentrated 5-fold, resuspended in fresh medium with 10.0 µCi/ml of 3H uridine, incubated for 20 min at 37°, and the 40S particles isolated as described. The hnRNA was extracted in the presence of 0.1% SDS twice with phenol, once with chloroform, then ethanol precipitated, dissolved in RSB, pH 7.0, and incubated for 90 min at 37° with 60 µg/ml of RNase-free DNase and phenol/chloroform extracted. The specific activity was 28,000-43,000 cpm/nmol.

Protein chemistry studies

Trichloroacetic acid precipitated proteins were redissolved in 0.2 ml of 2 M urea, 2EmM NH4HCO3 and then digested with trypsin at a protein/enzyme (w/w) ratio of 25 for 24 h at 37°C. The resulting peptides were separated on a 5 micron Vydac C-18 column (0.46 cm x 25 cm) that was equilibrated with 0.05% trifluoroacetic acid at a flow rate of 0.7 ml/min. Peptides were then eluted by increasing the concentration of solvent B (80% (v/v) acetonitrile in 0.05% trifluoroacetic acid) as follows: 0-90 min (0-37.5% B) and 90-135 min (37.5-75% B). Suitable aliquots of the purified peptides were hydrolyzed in 6 N HCl for 16 hrs at 115°C and then analyzed on a Beckman 121M amino acid analyzer. Those peptides selected for sequencing were dried in vacuo prior to redissolving in 0.1 ml 100% trifluoroacetic acid and loading onto an Applied Biosystems model 470A gas phase protein sequencer. The OPNRUN program supplied with this instrument was used without modification and the resulting phenylthiohydantoin derivatives were analyzed as previously described (27). In the case of the 22,000 dalton fragment from the HeLa A1 protein, 291 picomoles were applied to the gas phase sequencer in trifluoroacetic acid. Subsequent HPLC analysis failed to detect an increase above about 0.5 pmole in any phenylthiohydantoin derivative through 10 cycles of Edman degradation. These results suggest that both the HeLa A1 protein and its 22,000 dalton fragment have blocked NH2-termini.
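The two-segment elution program described above (0-90 min, 0-37.5% B; 90-135 min, 37.5-75% B) can be written as a piecewise function of time. The sketch below assumes each segment is linear, which the text implies but does not state, and the function names are hypothetical. Because solvent B is 80% (v/v) acetonitrile, the acetonitrile content is taken as 0.8 times the percentage of B.

```python
# Gradient segments: (start min, end min, %B at start, %B at end).
GRADIENT_SEGMENTS = [
    (0.0, 90.0, 0.0, 37.5),
    (90.0, 135.0, 37.5, 75.0),
]


def percent_solvent_b(t_min):
    """Percent solvent B at time t_min, assuming linear change within each segment."""
    for t_start, t_end, b_start, b_end in GRADIENT_SEGMENTS:
        if t_start <= t_min <= t_end:
            fraction = (t_min - t_start) / (t_end - t_start)
            return b_start + (b_end - b_start) * fraction
    raise ValueError("time is outside the 0-135 min gradient")


def percent_acetonitrile(t_min):
    """Percent (v/v) acetonitrile, given that solvent B is 80% acetonitrile."""
    return 0.8 * percent_solvent_b(t_min)


for t in (0, 45, 90, 112.5, 135):
    print(f"t = {t:6.1f} min  %B = {percent_solvent_b(t):5.2f}  "
          f"% acetonitrile = {percent_acetonitrile(t):5.2f}")
```

On this reading the slope is shallower in the first segment (37.5% B over 90 min) than in the second (37.5% B over 45 min).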

Filter binding assays

The standard binding buffer for filter binding assays was 5.0 mM Tris-HCl, pH 7.6, 60 mM NaCl, 0.1 mM DTT and 10 µg/ml BSA. Samples (0.1 ml) were incubated with protein A1 for 1-2 min at room temperature, filtered through Millipore filters, washed, dried and counted. Retention of free polynucleotides was typically 0.1-0.5% of input and these values were used to correct measurements of complex formation. In view of the polydispersity of most of the polynucleotide preparations used, their concentrations are expressed in mononucleotides in legends to Tables and Figure 9.
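The background correction mentioned above amounts to subtracting the retention of free polynucleotide from the raw retention measured with protein. The sketch below illustrates one plausible form of that arithmetic; the subtraction formula, the function name, and the example counts are assumptions, since the text gives only the typical background range (0.1-0.5% of input).

```python
def corrected_percent_retained(sample_cpm, input_cpm, background_percent):
    """Percent of input retained as complex after subtracting free-polynucleotide retention.

    Assumption: a simple subtraction of the background percentage, floored at zero.
    """
    if input_cpm <= 0:
        raise ValueError("input counts must be positive")
    raw_percent = 100.0 * sample_cpm / input_cpm
    return max(raw_percent - background_percent, 0.0)


# Input per sample: 0.1 ml at 10 µM nucleotide is 1 nmol of nucleotide.
input_nmol = 10.0e-6 * 0.1e-3 * 1e9   # mol/l x l x nmol/mol
specific_activity = 14500             # cpm/nmol, reported for 3H MS2 RNA
input_cpm = input_nmol * specific_activity

# Hypothetical filter count and a background within the reported 0.1-0.5% range.
example = corrected_percent_retained(sample_cpm=5600, input_cpm=input_cpm,
                                     background_percent=0.3)
print(f"input: {input_cpm:.0f} cpm, corrected retention: {example:.1f}% of input")
```

Because the background is well below 1% of input, the correction matters mainly for the near-zero values measured at high salt.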

RESULTS

Purification of proteins A1, A2, B2 and C1

Isolation and purification of individual core hnRNP proteins under nondenaturing conditions has been hampered by the fact that the major species, about 6-9 in number, have close isoelectric points (pIs about 8-9) and their molecular weights fall within a relatively narrow range from 32,000 to 42,000. Furthermore, the application of ion exchange chromatography for the purification is difficult because free core proteins tend to precipitate from solutions of low ionic strength in the absence of RNA. The protocol developed for the purification of protein A1 (Figs. 1 and 2, and Experimental Procedures) is, in essence, a one-step procedure that relies on gel filtration chromatography under conditions where most core proteins, but not A1, form oligo- and polymeric aggregates. The

25. Detera, S. O., Becerra, S. P., Swack, J. A., and Wilson, S. H. (1981) J. Biol. Chem. 256, 6933-6943
26. Weissmann, C., and Feix, G. (1966) Proc. Natl. Acad. Sci. U. S. A. 55, [page range illegible in source]
27. Merrill, B. M., Williams, K. R., Chase, J. W., and Konigsberg, W. H. (1984) J. Biol. Chem. 259, 10850-10856
28. Herrick, G., and Alberts, B. (1976) J. Biol. Chem. 251, 2133-2141
29. Merrill, B. M., LoPresti, M. B., Stone, K. L., and Williams, K. R. (1986) J. Biol. Chem. 261, 878-883
30. Chase, J. W., and Williams, K. R. (1986) Annu. Rev. Biochem. 55, 103-136
31. Reeck, G. R., and Teller, D. C. (1986) in Progress in Nonhistone Protein Research (Bekhor, I., ed) Vol. 2, CRC Press, Inc., Boca Raton, FL, in press
32. Chang, L. M. S., Plevani, P., and Bollum, F. J. (1982) J. Biol. Chem. 257, 5700-5706
33. Wakil, J. S., Stoops, J. K., and Joshi, V. C. (1983) Annu. Rev. Biochem. 52, 537-579
34. Comings, D. E., and Peters, K. E. (1981) in The Nucleus (Busch, H., ed) Vol. 9, pp. 89-118, Academic Press, New York
35. Thomas, J. O., and Szer, W. (1982) Prog. Nucleic Acid Res. Mol. Biol. 27, 157-187
36. Prigodich, R. V., Casas-Finet, J., Williams, K. R., Konigsberg, W., and Coleman, J. E. (1984) Biochemistry 23, 522-529
37. Burke, R. L., Alberts, B., and Hosoda, J. (1980) J. Biol. Chem. 255, 11484-11493
38. Williams, K. R., Spicer, E. K., LoPresti, M. B., Guggenheimer, R. A., and Chase, J. W. (1983) J. Biol. Chem. 258, 3346-3355
39. Williams, K. R., LoPresti, M. B., Setoguchi, M., and Konigsberg, W. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 4614-4617
40. Lischwe, M. A., Ochs, R. L., Reddy, R., Cook, R. G., Tan, E. M., Reichlin, M., and Busch, H. (1985) J. Biol. Chem. 260, 14304-14310
41. Lischwe, M. A., Cook, R. G., Ahn, Y. S., Yeoman, L. C., and Busch, H. (1985) Biochemistry 24, 6025-6028

standard nuclear extract containing hnRNP particles or, in some preparations, 40S particles isolated by sucrose density gradient centrifugation, is first dialyzed into a buffer containing 2.0 M NaCl to dissociate RNA and core proteins. If the extract is then passed through a gel filtration column in the high-salt buffer in the presence of SH-reagents (Fig. 1), core proteins elute in fractions no. 55-79 in order of decreasing molecular weight (Fig. 1, panel b). Some partial purification can be achieved by this protocol, viz. an approximately 1:1 mixture of nearly homogenous A1 and A2 elutes in fractions no. 76-79, and fractions no. 55-58 are enriched in protein C1. The use of other gel filtration media (Biogels, Sepharoses and FPLC media, not shown) has no significant effect on the fractionation.

[Figure 1: panel a, elution profile plotted against Fraction Number; panel b, gel lanes labeled M, E, and by fraction number.]

FIG. 1. Partial purification of core proteins by gel permeation chromatography in the presence of SH-reagents. a, Sephadex G100 chromatography (1.6 x 200 cm column, Vo 104 ml) of the nuclear extract (5.2 mg/ml protein, 3.5 ml) previously dialyzed (16 h, 3 changes) into a buffer containing 10 mM Tris-HCl, pH 7.6, 1.5 mM MgAc, 2.0 M NaCl, 0.5 mM DTT, 0.1 mM PMSF, and developed at a rate of 6 ml/h with the same buffer. 1.0 ml fractions were collected. Elution pattern at 260 (solid line), 280 (dashed line), and 320 nm (dotted line). A 10-fold reduced scale is shown for the peak in fractions no. 20-32 that contains mostly RNA and some high molecular weight proteins. b, Electrophoresis of the nuclear extract (lane E) and column eluates on 10% polyacrylamide/0.1% SDS gels. Lane numbers correspond to fraction numbers in panel a. Each lane received 10-30 µg protein. Lanes M, molecular weight markers (Mr x 10^-3: 94, 67, 43, 30 and 20.1).

When gel filtration is carried out in the same high-salt buffer in the absence of SH-reagents (Fig. 2), protein A2 and some of the proteins of the B group form high molecular weight polymers that elute near the void volume (panel b, fraction 12), and protein B2 and C1 apparently form oligomers that elute much earlier than expected from their monomeric molecular weights (panel b, fractions no. 71-83, cf. the position of these proteins in Fig. 1, panel b). Nearly homogenous A1 appears in fractions no. 115-123 and migrates as a closely spaced doublet in the SDS/polyacrylamide gel.


different polypeptides. The yield of nearly homogenous A1 varies from 0.25 to 0.4 mg from 16 mg protein in the nuclear extract. The protein can be kept frozen for at least 4-5 months at -20°; it gradually loses binding activity on thawing and freezing.

[Figure 2: panel a, elution profile plotted against Fraction Number; panels b and c, gel lanes labeled M and by fraction number.]

FIG. 2. Purification of protein A1 by gel permeation chromatography in the absence of SH-reagents. a, Sepharose 6B chromatography (1.6 x 200 cm column) of the nuclear extract (8.0 mg/ml protein, 2.0 ml) previously dialyzed into and developed with the same buffer as in Fig. 1 except for the absence of DTT. Elution pattern at 260 (solid line), 280 (dashed line), and 320 nm (dotted line). A 5-fold reduced scale is shown for the peak in fractions no. 5-35. b, Electrophoresis of column eluates as in Fig. 1. Lane numbers correspond to fraction numbers in panel a. Lane M, molecular weight markers 94, 67, 43, 30, 20.1 and 14. c, Electrophoresis of column fractions no. 115-128 previously dialyzed for 18 h into a buffer containing 10 mM Tris-HCl, pH 7.6, 250 mM NaCl and 10 mM DTT on a 10% polyacrylamide/0.1% SDS gel (5-8 per lane). The purified protein was concentrated to 0.5-1.5 mg/ml on Amicon 10 filters and kept at -20° in the dialysis buffer except that DTT was reduced to 1.0 mM.

Wilk et al. (3), who isolated protein A1 by electroelution from SDS/polyacrylamide gels, observed the somewhat erratic appearance of the A1 doublet and, after analyzing the amino acid composition and protease V8-generated peptide maps of the two bands, concluded that they are identical. They also showed that A1 migrates as a single band if 8.0 M urea is included in the gel in addition to SDS. We do not observe the A1 doublet when core proteins are fractionated in the presence of SH-reagents (Fig. 1b), and we find that the doublet virtually disappears, i.e., a single band is seen in the standard SDS/polyacrylamide gel (Fig. 2c) if the protein purified in the absence of SH-reagents is dialyzed overnight against a buffer containing a high concentration of DTT (10 mM Tris-HCl, pH 7.6, 250 mM NaCl, 10 mM DTT). The appearance of the doublet could be the result of cystine formation. Further analysis and sequencing of A1 confirms that the material purified as described is not a mixture of

For the purification of protein A2, fractions of the Sepharose 6B column enriched in A2 (Fig. 2b, fractions 6 to 15, total protein 2.5 mg) were dialyzed against a buffer containing 25 mM sodium acetate, pH 5.5, 1.0 mM DTT, 0.1 mM PMSF and 6.0 M urea for 16 h with 3 changes, and layered on a CM-cellulose column (0.5 x 10 cm) equilibrated in the same buffer. The column was developed with 40 ml of a linear gradient of 0-200 mM NaCl in the same buffer. 1.2 ml fractions were collected and aliquots (about 10 µg protein) were analyzed as in Fig. 1. Fraction nos. 14 and 15 (Fig. 3a), eluting at about 70 mM NaCl, were pooled, concentrated on Amicon 10 filters, and dialyzed against a buffer containing 10 mM Tris-HCl, pH 7.6, 250 mM NaCl, 1.0 mM DTT, 0.1 mM PMSF and 25% glycerol. Final yield was 0.9 mg protein. Attempts to dissociate the high molecular weight complex containing A2 (Fig. 2b, lane 12) with SH-reagents and/or high salt in the absence of urea were unsuccessful. Some preparations of A2 used in this study were obtained by electroelution from SDS-polyacrylamide gels (Fig. 3b).

FIG. 3. Purification of protein A2. a) Gel electrophoresis of carboxymethyl-cellulose column fractions on 10% polyacrylamide/0.1% SDS (see text for details). Lane M, molecular weight markers as in Fig. 1; lanes W, column washes, equilibration buffer; lanes 1 to 32 correspond to fraction numbers in the region of the 0-200 mM NaCl gradient. b) Electroelution of A2 from SDS polyacrylamide gels. A sample (4 mg protein) enriched in A2 (Fig. 2b), processed as described for CM-cellulose chromatography (see text), was electrophoresed on a 3.0 mm thick 8% polyacrylamide/0.1% SDS gel. After staining, the band corresponding to A2 was excised, electroeluted and destained by repeated acetone precipitations (1.0 mg protein). Lane M, molecular weight markers; lane 1, electroeluted A2 (about 15 µg).

For the purification of protein B2, fractions of the Sepharose 6B column enriched in B2 (Fig. 2b, lanes 71 to 83, total protein 3.2 mg) were pooled, dialyzed for 16 h against a buffer containing 50 mM Tris-HCl, pH 7.6, 250 mM ammonium bicarbonate, concentrated to 1.0 ml by lyophilization, clarified by low-speed centrifugation and dialyzed for 16 h against 10 mM Tris-HCl, pH 7.6, 500 mM NaCl, 1.0 mM DTT, 0.1 mM EDTA and 0.1 mM PMSF. The sample was chromatographed on a Sephadex G100 column equilibrated in the same buffer and analyzed as described in Fig. 1. Fractions containing about 85% pure B2 (0.6 mg) were pooled and concentrated on Amicon 10 filters (Fig. 8, lane 5).

For the purification of protein C1, fractions of the Sepharose 6B column enriched in C1 (Fig. 2b, fractions 99 to 105 combined with analogous fractions from another preparation, total protein 7.0 mg) were processed exactly as described for B2. Fractions of the Sephadex G100 column containing about 85% pure C1 (0.27 mg) were pooled and concentrated on Amicon 10 filters (Fig. 8, lane 2).