Eur. J. Biochem. 225, 1181-1194 (1994) © FEBS 1994

Structural characterisation of human stefin A in solution and implications for binding to cysteine proteinases

John R. MARTIN¹, Roman JERALA², Louise KROON-ŽITKO², Eva ŽEROVNIK², Vito TURK² and Jonathan P. WALTHO¹

¹ Krebs Institute, Department of Molecular Biology and Biotechnology, University of Sheffield, England
² Department of Biochemistry and Molecular Biology, Jožef Stefan Institute, Slovenia

(Received June 21/August 24, 1994) - EJB 94 0898/3

Stefin A is a member of the cystatin superfamily of proteins, which are tight and reversibly binding inhibitors of the papain-like cysteine proteinases. The ¹H-NMR and ¹⁵N-NMR resonances of human stefin A have been sequentially assigned using two-dimensional homonuclear and heteronuclear NMR techniques in conjunction with three-dimensional heteronuclear methods. Characteristic sequential and medium-range NOE contacts, J constants and hydrogen exchange data have been used to identify the secondary structural elements of the protein, which consists of five anti-parallel β-strands and a single α-helix. There is much similarity between the secondary structural features of stefin A and the homologous protein stefin B in its complex with papain [Stubbs, M. T., Laber, B., Bode, W., Huber, R., Jerala, R., Lenarčič, B. & Turk, V. (1990) EMBO J. 9, 1939-1947] but also some important differences in regions which are fundamental to the binding event. The principal difference is the presence of two conformationally unrestricted regions in stefin A that form two of the components of the tripartite wedge which docks into the active site of the target proteinase. Specifically, these regions are the five N-terminal residues and the second binding loop, which form a turn and a short helix, respectively, in the bound conformation of stefin B.

Correspondence to J. P. Waltho, Krebs Institute, Department of Molecular Biology and Biotechnology, University of Sheffield, P.O. Box 594, Sheffield, England S10 2UH
Phone: +44 742 824224. Fax: +44 742 728697.

Abbreviations. COSY, correlated spectroscopy; DOUBLE-RELAY, double-relayed spectroscopy; DQF, double quantum filtered; HMQC, heteronuclear multiple quantum correlation; HSQC, heteronuclear single quantum correlation; NOESY, nuclear Overhauser effect spectroscopy; RELAY, relayed spectroscopy; ROESY, rotating-frame NOE spectroscopy; TOCSY, total correlation spectroscopy; tm, mixing time.

Human stefin A is a member of the cystatin superfamily of proteins, which are tight and reversibly binding inhibitors of the papain-like cysteine proteinases. These inhibitors are believed to help protect cells from inappropriate endogenous or external proteolysis, and are involved in the control mechanism responsible for protein breakdown (Turk and Bode, 1991). The cystatin superfamily has been subdivided into three families on the basis of sequence identity, the number of disulphide bonds present, and the molecular mass of the protein (Barrett et al., 1986). Recently determined protein sequences indicate that the classification could be extended to include additional families (e.g. Kondo et al., 1991). Stefin A is a member of family I. This family (also known as the stefins) consists of single-chain proteins with a molecular mass of approximately 11 kDa, which lack disulphide bonds and carbohydrates. Members of family II (the cystatins) have a molecular mass of approximately 13 kDa and are characterised by the presence of two disulphide bonds, which are located towards the C-terminus, and a lack of carbohydrates. Family III (the kininogens) comprises plasma proteins which are of a larger molecular size (60-120 kDa) than members of the other two families. The members of this family contain three repeats similar to the family II inhibitors (including the same conserved disulphide bonds in addition to a further three, to give a total of nine) and are glycosylated (Barrett et al., 1986).

The most abundant source of stefin A is polymorphonuclear leucocytes in the liver (Davies and Barrett, 1984). Stefin A has also been localised to the strata corneum and granulosum of the epidermis (Räsänen et al., 1978) and has been found in extracts of squamous epithelia from the oesophagus, vagina (Rinne et al., 1978) and mouth (Järvinen et al., 1983). This selective distribution of stefin A correlates with those tissues which form the first line of defence against infective agents. It has thus been suggested (Barrett et al., 1986) that stefin A provides an important protective function as an inhibitor of cysteine proteinases which are utilised as invasive tools by many pathogenic organisms. In addition, stefin A is implicated in a number of disease states. The inhibitor has been detected in several epidermoid carcinomas, including squamous cell carcinomata of the lung, skin, vulva, cervix and oesophagus (Rinne, 1979, 1980), but was absent from a variety of other carcinomas (Rinne, 1980; Rinne et al., 1984). It has been proposed that stefin A plays a key role in tumour invasion which is due either to its expression as a less active isoform or to a lower level of expression (Lah et al., 1990). Stefin A has also been found in the upper spinous layer of psoriatic cells (Hopsu-Havu et al., 1983a), and the serum level of this inhibitor has been shown to increase significantly in patients with cardiovascular disease (Hopsu-Havu et al., 1983b).

The three-dimensional structures of two members of the cystatin superfamily (namely stefin B complexed with papain, and chicken cystatin) have been determined. The structure of stefin B in its complex with papain has been solved by X-ray crystallography (Stubbs et al., 1990) and forms a five-stranded anti-parallel β-sheet wrapped around a central five-turn α-helix, with a C-terminal loop running along the convex face of the sheet. The structure of chicken cystatin has been solved by both X-ray crystallography (Bode et al., 1988) and NMR spectroscopy (Dieckmann et al., 1993) and in each case exhibits essentially the same global fold as stefin B. The cystatin structures determined by the two different techniques also contained some significant differences, most notably in the region comprising residues 69-91 (as discussed in detail by Engh et al., 1993). A topological comparison of the two smaller families of the cystatin superfamily indicates that this region is not present in the stefins (Bode et al., 1988).

The amino acid sequence (Machleidt et al., 1983) and cDNA sequence (Kartasova et al., 1987) of stefin A have been determined, and the recombinant protein has been expressed in Escherichia coli (Jerala et al., 1994a). This protein, which forms the focus of our current study, exhibits essentially identical biochemical properties to the native protein (Jerala, R., unpublished results). Stefin A is a potent, reversible and competitive inhibitor of the papain-like cysteine proteinases; the Ki values for inhibition of papain and cathepsins B, H and L are 0.019, 8.2, 0.31 and 1.3 nM, respectively (Barrett et al., 1986).

In order to understand the factors which lead to the formation of stable proteinase-inhibitor complexes, high-resolution structural information is required for the free and bound states of the proteins involved in binding. In particular, detailed information concerning the dynamic flexibility of the proteins is required in order to understand the strain and conformational entropy loss which they undergo on binding. In this regard, the X-ray crystal structures of free papain (Drenth et al., 1971; Priestle et al., 1984) and the stefin-B-papain complex (Stubbs et al., 1990) have been determined, although no structure of a free stefin has been reported. In this study we present the ¹H-NMR and ¹⁵N-NMR assignment and secondary structure of free stefin A in solution.

EXPERIMENTAL PROCEDURES

Protein production

Unlabelled recombinant stefin A was expressed in E. coli strain DH5α and the labelled protein was expressed in strain TG1. The plasmid construction (Jerala et al., 1994b) and method of growth and expression (Jerala et al., 1994a) have previously been reported. For the preparation of ¹⁵N-labelled stefin A, cells were grown on M9 minimal medium containing 0.62 g/l ¹⁵N-labelled ammonium sulphate as the sole source of nitrogen, supplemented with 0.3% glucose, 0.01 mg/l thiamine and 50 mg/l ampicillin. Cells were grown at 37°C in a 10-l fermentor and, once harvested, were lysed using a combination of freeze-thaw and sonication. The inhibitor was isolated from the cell extract using affinity chromatography on carboxymethyl-papain-Sepharose, followed by FPLC on a Mono Q column (Jerala et al., 1994b). Cells grown on rich media typically expressed stefin A at approximately 30 mg/10 l, from which a post-purification yield of 20 mg was obtained. The expression level in minimal media was approximately 20 mg/10 l, from which a yield of 14 mg pure protein was extracted. Protein concentrations were determined by absorbance measurement using an A280 value of 0.87 for a 1 mg/ml solution of stefin A. Protein purity was checked using SDS/polyacrylamide gel electrophoresis and isoelectric focusing.
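The concentration and yield figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, and the molecular mass of 11 kDa is an assumption taken from the approximate value quoted for family I proteins.

```python
A280_PER_MG_ML = 0.87       # from the text: absorbance of a 1 mg/ml solution
MOLECULAR_MASS_DA = 11_000  # assumption: ~11 kDa, the approximate family I mass


def concentration_mg_per_ml(absorbance, dilution_factor=1.0):
    """Protein concentration from a measured absorbance (1 cm path assumed)."""
    return absorbance / A280_PER_MG_ML * dilution_factor


def mg_per_ml_to_millimolar(conc_mg_per_ml, mass_da=MOLECULAR_MASS_DA):
    """mg/ml equals g/l; dividing by g/mol gives mol/l, times 1000 gives mM."""
    return conc_mg_per_ml / mass_da * 1000.0


def recovery_percent(purified_mg, expressed_mg):
    """Fraction of expressed protein recovered after purification, in percent."""
    return 100.0 * purified_mg / expressed_mg


rich_media_recovery = recovery_percent(20, 30)     # rich media figures from the text
minimal_media_recovery = recovery_percent(14, 20)  # minimal media figures from the text
```

Under the 11 kDa assumption, a 3 mM NMR sample corresponds to roughly 33 mg/ml, which gives a sense of how much of a 14-20 mg preparation a single 0.5 ml sample consumes.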

Sample preparation

Protein samples for NMR measurements (0.5 ml) were concentrated to 1.5-4 mM using Centricon-SR3 ultrafiltration tubes (Amicon). The buffers contained 50 mM potassium phosphate and 50 mM potassium chloride adjusted to pH 5.5, and either 9:1 (by vol.) ¹H₂O/²H₂O or 100% ²H₂O. All samples contained 5 mM sodium azide. The ²H₂O samples were prepared by dialysis against 100 vol. buffered ²H₂O.

NMR spectroscopy

NMR experiments were recorded on a Bruker AMX-500 spectrometer operating at 500.13 MHz. ¹H-NMR data were collected over a range of temperatures (290-318 K) and pH values (pH 5.25, 5.5 and 6.5). The water resonance was suppressed by low-power presaturation (typically applying a 50-Hz field for 800 ms) during the relaxation delay, followed by a SCUBA sequence (Brown et al., 1988) employing two composite π pulses separated by 30-ms delays. For two-dimensional spectra, 128 signal-averaged transients were typically collected; acquisition times were normally 48 × 328 ms. Normal pulse sequences were used throughout except for the inclusion of a spin-echo sequence prior to acquisition, for homonuclear experiments cosine modulated in J (Waltho and Cavanagh, 1993). A delay of 2.54 ms preceded the π pulse and acquisition commenced 32 dwell times prior to the top of the spin-echo; the data were oversampled to 12.5 kHz in the acquisition dimension. Quadrature detection in the non-acquisition dimensions was achieved using time-proportional phase incrementation. The two-dimensional homonuclear experiments performed included double-quantum-filtered correlated spectroscopy (DQF-COSY), clean total correlation spectroscopy (TOCSY), nuclear Overhauser effect spectroscopy (NOESY), rotating-frame NOE spectroscopy (ROESY), double quantum spectroscopy, relayed spectroscopy (RELAY) and double-relayed spectroscopy (DOUBLE-RELAY). Mixing times of 60, 90 and 120 ms were used for TOCSY experiments, and NOESY data were acquired with mixing times of either 70 ms or 200 ms. Heteronuclear experiments were acquired at 308 K with the sample at pH 5.5. The heteronuclear single quantum correlation spectrum (HSQC) was recorded with the ¹⁵N carrier in the centre of the amide region with a ¹⁵N spectral width of 2 kHz. The ¹⁵Nε resonance of the arginine side chain was folded in the spectrum. Three-dimensional TOCSY-heteronuclear multiple quantum correlation (HMQC) (tm = 90 ms) and NOESY-HMQC (tm = 150 ms) data sets were collected with acquisition times of 40.96, 23.04 and 40.96 ms, respectively.
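Two of the acquisition parameters above lend themselves to a quick numerical check. The sketch below is illustrative only: it assumes one complex point per 1/SW when converting a spectral width to a dwell time, and it uses an approximate ¹⁵N/¹H frequency ratio of 0.101329 that is not stated in the text.

```python
PROTON_FREQUENCY_MHZ = 500.13         # from the text
N15_TO_H1_FREQUENCY_RATIO = 0.101329  # assumption: approximate 15N/1H ratio


def dwell_time_us(spectral_width_hz):
    """Dwell time in microseconds, assuming one complex point per 1/SW."""
    return 1.0e6 / spectral_width_hz


def spectral_width_ppm(spectral_width_hz, carrier_mhz):
    """Convert a spectral width in Hz to ppm at the given carrier frequency."""
    return spectral_width_hz / carrier_mhz


# 32 dwell times at the 12.5 kHz oversampled rate, expressed in milliseconds.
echo_lead_ms = 32 * dwell_time_us(12_500.0) / 1000.0

# 15N observation frequency and the width of the 2 kHz window in ppm.
n15_frequency_mhz = PROTON_FREQUENCY_MHZ * N15_TO_H1_FREQUENCY_RATIO
n15_width_ppm = spectral_width_ppm(2_000.0, n15_frequency_mhz)
```

Under these assumptions, 32 dwell times amount to 2.56 ms, close to the 2.54 ms delay quoted before the π pulse, and the 2 kHz ¹⁵N window spans a little under 40 ppm. A resonance more than about 20 ppm from the carrier therefore lies outside the window, which is consistent with the arginine side-chain resonance appearing folded.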

The acquired data were processed using the program FELIX (Biosym Technologies Inc.) running on a Silicon Graphics Indigo workstation. Prior to transformation a low-frequency deconvolution filter was applied to the acquisition dimension to remove the residual water signal (Marion et al., 1989). The initial delay in the non-acquisition dimensions was set to compensate for precession during the pulses flanking the evolution periods. Data were acquired with cosine modulation in the non-acquisition dimensions, and were shifted right by one point during the processing in order that the first time point in the Fourier transform was equivalent to zero and thus sine modulated. Two-dimensional ¹H-NMR datasets were typically processed by applying a sinebell phase shifted by 40° in each dimension, and Fourier transformed to produce matrices consisting of 2048 × 2048 real points. The three-dimensional heteronuclear spectra were processed to yield frequency-domain matrices consisting of 512 × 512 × 64 real data points. The time-domain data in each dimension were multiplied by a 60° shifted sinebell and zero-filled to 1024, 128 and 1024 points in t1, t2 and t3, respectively. A convolution filter was applied to t3 and the data were shifted left by 25 points prior to Fourier transformation (Waltho and Cavanagh, 1993). The upfield half of the acquisition dimension of the three-dimensional spectra was discarded.
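The apodisation, zero-filling and transformation steps described above can be sketched on a synthetic signal. This is a minimal illustration in plain Python, not the FELIX implementation: the window definition (a sine running from the shift angle to 180°), the helper names and the synthetic free induction decay are all assumptions made for the example, and a naive discrete Fourier transform stands in for a fast one.

```python
import cmath
import math


def shifted_sinebell(n_points, shift_deg):
    """Sine-bell window that starts at shift_deg and ends at 180 degrees."""
    start = math.radians(shift_deg)
    if n_points == 1:
        return [math.sin(start)]
    step = (math.pi - start) / (n_points - 1)
    return [math.sin(start + i * step) for i in range(n_points)]


def apodize_and_zero_fill(fid, shift_deg, final_size):
    """Multiply the FID by the window, then pad with zeros to final_size."""
    window = shifted_sinebell(len(fid), shift_deg)
    weighted = [x * w for x, w in zip(fid, window)]
    return weighted + [0.0] * (final_size - len(weighted))


def dft(signal):
    """Naive discrete Fourier transform (adequate for a short example)."""
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


# Synthetic complex FID: one decaying resonance at 4 cycles per 32 points.
fid = [cmath.exp(2j * math.pi * 4 * k / 32) * math.exp(-k / 20) for k in range(32)]

window = shifted_sinebell(len(fid), 40.0)        # 40 degree shift, as for the 2D data
padded = apodize_and_zero_fill(fid, 40.0, 64)    # zero-fill to twice the length
spectrum = dft(padded)
peak_bin = max(range(len(spectrum)), key=lambda j: abs(spectrum[j]))
```

Zero-filling to twice the original length doubles the digital resolution, so the resonance that sat at point 4 of a 32-point transform appears at point 8 of the 64-point spectrum.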

In order to identify the slowly exchanging amide protons, a 2 mM sample was dialysed for 30 min against 100 vol. buffered ²H₂O and a TOCSY spectrum (spin locking period 60 ms) was acquired over 12 h. J constants were measured by fitting simulated lineshapes to both in-phase and anti-phase crosspeaks representing the J coupling. Proton frequencies were calibrated by placing the 3-trimethylsilyl(2,2,3,3-²H₄)propionate signal at 0.00 ppm; nitrogen chemical shifts were referenced to ¹⁵NH₃ at 0.00 ppm (Live et al., 1984).

RESULTS AND DISCUSSION

Preliminary NMR characterisation

Preliminary one-dimensional ¹H-NMR experiments were performed on a 1 mM sample of stefin A in 50 mM potassium phosphate containing 20 mM potassium chloride. Spectra were recorded at pH 6.5 over a range of temperatures (290-318 K) and the protein remained stable throughout. The sample was stable during the pH titration from 6.5 to 5.0 (at 308 K) but below this pH a slight precipitate developed, and the resulting sensitivity of the NMR spectra was decreased by approximately 10%. At this stage, the pH was retitrated to pH 6.5 and a TOCSY experiment (spin locking period 90 ms) was performed. The one-dimensional spectra of stefin A exhibit relatively well-dispersed signals and sharp lines (approximately 6 Hz).

The HSQC spectrum and the two-dimensional fingerprint region

Fig. 1. One-dimensional ¹H-NMR spectrum of human stefin A at 500 MHz, recorded at 308 K on a 3 mM sample in 50 mM potassium phosphate, pH 5.5, containing 50 mM KCl. The residual H₂O signal was removed by applying a time-domain convolution filter; following Fourier transformation, the amplitude of the resonances reduced by the filter was corrected (Waltho and Cavanagh, 1993). Peaks arising from glycerol contamination are present at approximately 3.7 ppm. (Spectrum not reproduced; horizontal axis in ppm, 10.0 to 0.0.)

Fig. 2. HSQC spectrum of ¹⁵N-labelled human stefin A at 500 MHz. For sample conditions, see legend of Fig. 1. Cross-peaks arising from backbone amide groups are labelled according to the residue type and sequence position. Side-chain amide groups (for asparagine and glutamine residues) are labelled by sequence position only. (Spectrum not reproduced; horizontal axis F2 in ppm, 9.0 to 7.0.)

Initially, the NH/N crosspeaks in the HSQC spectrum and the backbone NH/CαH crosspeaks in the DQF-COSY spectrum were examined. An HSQC spectrum is presented in Fig. 2. The backbone NH peaks are labelled according to residue type and sequence position, and the side chain amides of asparagine and glutamine residues are denoted by the residue number only. The numbering of the residues is based on sequence homology with the archetype of the cystatin superfamily, namely chicken cystatin (Stubbs et al., 1990). Of an expected 93 crosspeaks (with five proline residues, and the N-terminus blocked by an acyl group; Ritonja, A., unpublished results), 92 were identified from this spectrum. The final peak was accounted for during the process of sequential assignment, whereby T36 and G124 were identified as having identical NH and ¹⁵N frequencies. An additional peak of low intensity was also observed; this peak was later identified as a second peak for I7 and is likely to be the result of cis-trans isomerisation of the peptide bond of the succeeding proline residue. 21 out of an expected 22 peaks were observed for the primary amide protons. It was deduced from the two-dimensional NOESY spectrum that the frequencies of one of the primary amide protons of N92 and one of N105A were degenerate. The fingerprint region of the DQF-COSY spectrum contained 89 crosspeaks out of the

expected 93. The four peaks which were not observed in the DQF-COSY spectrum were identified in the HSQC spectrum. In two cases (namely V64/F100 and K27/S102), the absence of the peaks was demonstrated to be due to exact degeneracy of the amide and α proton frequencies. In the cases of K29 and K118, the low values of ³JHNα (less than 4 Hz) had rendered them invisible in the amide-α region of the J-correlated spectra.

Table 1. ¹H-NMR and ¹⁵N-NMR assignments for recombinant human stefin A. The ¹H chemical shift values (±0.01 ppm) were derived from two-dimensional DQF-COSY and TOCSY spectra at 308 K and pH 5.5. The ¹⁵N chemical shift values were obtained from the two-dimensional HSQC spectrum recorded under the same conditions. The side chain ¹⁵N resonances of the asparagine and glutamine residues are given in parentheses after the main chain ¹⁵N resonance. n.a., not assigned; n.o., not observed; (t), tentatively assigned; [ ], resonances assigned to spin system but not to specific proton. All chemical shifts are in ppm.

Residue  15N             NH     CαH         CβH               Others
M6       125.7           8.24   4.41        2.00, 2.00        γCH2 2.56, 2.56; εCH3 2.08
I7       123.7           8.17   4.52        1.89              γCH2 1.51, 1.20; γCH3 0.95; δCH3 0.86
P8       -               -      4.43        2.28, 2.10        γCH2 1.95, 1.95; δCH2 3.93, 3.70
G9       110.0           8.55   4.06, 4.06  -                 -
G10      108.0           8.35   4.14, 3.86  -                 -
L11      122.6           8.25   4.67        1.72, 1.72(t)     γCH 1.46; δCH3 0.57, 0.27
S12      118.4           8.87   4.42        4.08, 4.08        -
E13      117.3           8.72   4.36        2.12, 2.01        γCH2 2.37, 2.31
A14      125.0           8.88   4.30        1.22              -
K15      121.9           8.52   4.80        0.91, n.a.        [1.79, 1.65, 1.32, 1.28]; εCH2 2.93, 2.93
P16      -               -      4.71        2.55, n.a.        γCH2 n.a.; δCH2 3.88, 3.73
A17      120.4           7.37   4.03        1.16              -
T18      109.9           6.28   4.91        4.77              γCH3 1.43
P19      -               -      4.31        n.a.              [2.45]; δCH2 3.93, n.a.
E20      117.1           8.42   4.02        1.99, 1.81        γCH2 2.23, 2.13
I21      119.3           7.36   3.55        1.91              γCH2 1.35, 0.26; γCH3 0.57; δCH3 -0.06
Q22      120.7 (118.7)   8.02   3.67        2.23, 2.21        γCH2 2.46, 2.35; εNH2 8.13, 7.77
E23      117.9           8.06   4.09        2.10, 2.03        γCH2 2.40, 2.18
I24      118.7           7.06   3.38        1.83              γCH2 1.68, 0.71; γCH3 0.53; δCH3 -0.05
V25      118.2           7.45   3.21        2.28              γCH3 0.90, 0.76
D26      118.8           8.66   4.31        2.85, 2.68        -
K27      118.2           8.10   4.15        2.02, 1.92(t)     γCH2 1.52(t), 1.52(t); δCH2 1.68, 1.68; εCH2 3.00, 3.00
V28      109.8           8.00   4.56        2.45              γCH3 1.00, 0.92
K29      126.2           7.97   3.97        2.14, 2.05        [1.45, 1.45, 1.90, 1.85]; εCH2 3.23, 3.07
P30      -               -      4.62        2.45, 1.97        γCH2 2.15, 2.15; δCH2 3.79, 3.70
Q31      115.6 (112.5)   7.49   4.31        2.35, 2.35        γCH2 2.65, 2.57; εNH2 8.22, 7.06
L32      120.7           7.78   3.78        2.19, 1.12        γCH 1.43; δCH3 0.50, 0.29
E33      119.0           8.49   4.25        2.12, 2.08        γCH2 2.81, 2.71
E34      118.2           7.77   4.14        2.25, 2.25        γCH2 2.46, 2.32
K35      116.3           7.63   4.27        1.89, 1.89        γCH2 1.75, 1.75; δCH2 1.92, 1.70; εCH2 3.22, 3.12
T36      106.4           8.23   4.32        4.33              γCH3 0.97
N37      117.2 (111.4)   8.09   4.35        3.16, 2.80        δNH2 7.42, 6.73
E38      117.5           7.61   4.42        1.51, 3.53        γCH2 2.02, 2.02
T39      111.1           7.63   4.20        3.84              γCH3 1.02
Y40      123.3           8.89   4.69        3.03, 2.43        2,6H 7.04; 3,5H 6.43
G43      111.8           8.66   3.97, 3.77  -                 -
K44      120.4           7.96   4.21        1.72, 1.72(t)     γCH2 1.32(t), 1.28(t); δCH2 1.65, 1.65; εCH2 2.92, 2.92
L45      125.9           8.53   4.73        1.46, 1.08        γCH 1.33; δCH3 0.52, 0.52
E46      124.1           8.54   4.71        2.05, 2.05        γCH2 2.18, 1.88
A47      128.7           9.71   4.43        1.10              -
V48      118.3           9.13   4.53        2.03              γCH3 1.14, 1.14
Q49      117.8 (112.2)   7.76   5.43        2.10, 2.05        γCH2 2.28(t), 2.16; εNH2 7.57, 6.85
Y50      121.3           9.73   6.03        3.18, 2.96        2,6H 6.83; 3,5H 6.91
K51      118.0           9.43   4.38        1.34, 1.34        [1.90, 1.12]
T52      109.1           8.81   5.51        3.86              γCH3 1.02
Q53      121.0 (112.5)   8.74   4.75        2.37, 2.37        γCH2 2.00(t), 1.95(t); εNH2 7.73, 7.01
V54      130.0           9.15   4.33        2.10              γCH3 1.06, 1.03
V55      127.2           8.20   4.49        2.28              γCH3 0.85, 0.75
A56      129.6           8.18   4.73        1.43              -
G57      105.9           7.58   4.53, 3.61  -                 -
T58      116.8           8.47   4.83        3.70              γCH3 0.56
N59      122.7 (109.8)   9.01   5.68        2.88, 1.85        δNH2 7.08, 6.69
Y60      121.6           9.92   4.89        2.95, 2.53        2,6H 6.83; 3,5H 6.49
Y61      122.4           9.21   5.52        2.65, 2.63        2,6H 6.67; 3,5H 6.27
I62      121.3           9.58   4.93        1.55              γCH2 1.72, 0.84; γCH3 0.84; δCH3 0.66
K63      129.6           9.23   5.22        1.81, 1.81        n.o.
V64      123.3           9.59   5.51        1.78              γCH3 0.80, 0.80
R65      124.1           9.23   4.46        1.70, 1.70        γCH2 1.36, 0.93; δCH2 3.17, 3.07; εNH 6.93
A66      130.6           8.79   4.91        1.13              -
G67      107.8           8.08   4.03, 3.93  -                 -
D68      119.6           8.55   4.43        2.67, 2.67        -
N92      115.4 (112.8)   8.58   4.63        3.09, 2.86        δNH2 7.63, 6.93
K93      118.8           7.20   4.84        1.72, 1.72(t)     γCH2 1.54(t), 1.36(t); δCH2 1.76, 1.76; εCH2 3.08, 3.08
Y94      120.6           8.52   5.57        2.65, 2.52        2,6H 6.97; 3,5H 6.78
M95      116.6           9.01   4.98        2.24, 1.65        γCH2 2.56, 2.11; εCH3 1.54
H96      120.6           9.14   6.09        3.29, 3.02        C2H 7.47; C4H 6.99
L97      122.6           9.88   5.03        1.81, 1.47        γCH 1.81; δCH3 0.90, 0.77
K98      124.1           8.56   4.97        1.32, 1.32(t)     [0.92]
V99      126.7           9.22   4.91        1.88              γCH3 0.83, 0.79
F100     128.4           9.60   5.50        3.35, 2.90        2,6H 7.18; 3,5H 7.52; 4H 7.18
K101     129.9           8.72   4.73        1.92, n.a.        [1.70, 1.60, 1.55, 1.38]
S102     121.2           8.12   4.14        4.00, 4.00        -
L102A    121.2           8.37   4.48        1.59, 1.59        γCH 1.63; δCH3 0.97, 0.93
P103     -               -      4.36        2.33, 1.88        γCH2 n.a.; δCH2 3.83, 3.62
G104     109.9           8.55   4.21, 3.79  -                 -
Q105     119.3 (110.9)   8.00   4.43        2.25, 2.10        γCH2 2.39, 2.36; εNH2 7.55, 6.77
N105A    117.3 (112.8)   8.34   4.63        2.91, 2.79        δNH2 7.62, 6.91
E106     120.2           8.17   4.26        2.09, 1.91        γCH2 2.22, 2.18
D107     121.7           8.27   4.67        2.69, 2.60        -
L108     122.0           8.22   4.93        1.57, 1.18        γCH 1.57; δCH3 0.87, 0.79
V109     122.3           8.83   4.64        2.19              γCH3 1.07, 0.99
L110     128.8           8.92   4.89        2.21, 2.21        γCH 1.63; δCH3 1.18, 0.86
T111     121.6           8.98   4.58        4.31              γCH3 1.22
G112     108.3           8.12   4.50, 4.17  -                 -
Y113     115.2           8.37   5.70        3.39, 2.71        2,6H 6.67; 3,5H 6.67
Q114     117.7 (111.5)   9.22   4.60        n.a.              [1.72, 1.60, 1.02]; εNH2 6.85, 6.53
V115     115.0           8.36   4.67        2.46              γCH3 1.12, 0.97
D115A    116.6           8.91   4.56        2.95, 2.73        -
K116     114.3           7.39   4.48        1.02, 0.91        [1.37]
N117     118.9 (114.8)   8.90   4.99        2.17, 2.73        δNH2 7.82, 6.88
K118     123.8           8.70   4.98(t)     n.o.              n.o.
D119     113.2           8.03   4.57        2.77, 2.48        -
D120     119.5           7.12   4.51        2.75, 2.43        -
E121     125.4           9.02   4.13        1.98, 1.84        γCH2 2.23, 2.23
L122     126.0           9.14   4.21        1.41, 1.10        γCH 1.20; δCH3 0.71, 0.35
T123     117.2           7.78   4.30        4.21              γCH3 1.11
G124     106.4           8.24   3.31, 3.31  -                 -
F125     124.0           7.28   4.44        3.18, 3.08        2,6H 7.36; 3,5H 7.15; 4H 6.88
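The expected peak counts quoted for the HSQC and fingerprint spectra follow from simple bookkeeping, sketched below. The inputs are assumptions read from the text and Table 1 rather than stated outright by the authors: a chain of 98 residues (the sum of the residue counts quoted in the spin system discussion plus five prolines and one arginine), five asparagines and six glutamines. The function names are hypothetical.

```python
def expected_backbone_amides(n_residues, n_prolines, n_terminus_blocked):
    """Backbone NH groups: every residue except proline.

    A free N-terminal amine is normally not counted, but an acylated
    N-terminus carries an ordinary amide proton and therefore is.
    """
    free_terminus_penalty = 0 if n_terminus_blocked else 1
    return n_residues - n_prolines - free_terminus_penalty


def expected_side_chain_amide_protons(n_asn, n_gln):
    """Each Asn and Gln side chain contributes two primary amide protons."""
    return 2 * (n_asn + n_gln)


# Assumed composition (see lead-in): 98 residues, 5 Pro, 5 Asn, 6 Gln.
backbone_peaks = expected_backbone_amides(98, 5, n_terminus_blocked=True)
side_chain_peaks = expected_side_chain_amide_protons(5, 6)
```

With these inputs the sketch reproduces the 93 backbone crosspeaks and 22 primary amide proton peaks that the text gives as the expected totals.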

Spin system identification

A number of side chain spin systems were initially identified from the two-dimensional J-correlated homonuclear spectra on the basis of well-established procedures (Wüthrich, 1986). These spin systems included 7 of the 8 glycine residues, all of the 5 alanines, 5 of the 9 valines, 5 of the 7 threonines, 4 of the 8 leucines, 3 of the 4 isoleucines, 4 of the 12 lysines, 17 of the 22 AMX-type spin systems and 7 of the 17 AM(PT)X-type spin systems. The identification of the remaining residues proved to be more difficult, and was achieved using the three-dimensional TOCSY-HMQC or on the basis of sequential connectivities, as outlined below, in ascending order of side chain mass.
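The counts in the paragraph above can be tallied to see how much of the protein was categorised from the two-dimensional spectra alone. The sketch below only restates the numbers given in the text; the dictionary layout and variable names are illustrative.

```python
# (identified from 2D J-correlated spectra, total in the protein), from the text.
spin_systems = {
    "Gly": (7, 8),
    "Ala": (5, 5),
    "Val": (5, 9),
    "Thr": (5, 7),
    "Leu": (4, 8),
    "Ile": (3, 4),
    "Lys": (4, 12),
    "AMX": (17, 22),
    "AM(PT)X": (7, 17),
}

identified = sum(found for found, _ in spin_systems.values())
total = sum(count for _, count in spin_systems.values())
remaining = {
    name: count - found
    for name, (found, count) in spin_systems.items()
    if count > found
}
fraction_identified = identified / total
```

On these figures 57 of the 92 listed spin systems were identified at this stage, leaving 35 (plus the prolines and the single arginine, which are treated separately in the text) to be resolved by the three-dimensional and sequential methods described next.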

G124 was identified via a medium-intensity dαN (i,i+1) NOE to F125 in the two-dimensional NOESY spectrum. No remote peaks were observed for G124 in the double quantum spectrum; this may be due to a small increase in the linewidth resulting from conformational exchange in connection with T123 (see below). All five proline residues were categorised from the two-dimensional homonuclear spectra once their α or δ proton frequencies had been elucidated via sequential connectivity in the NOESY spectra. Amide proton degeneracy involving the four remaining unidentified valine residues (V28, V64, V99 and V115) was resolved in the three-dimensional spectra. The spin system of T36 was tentatively assigned as an alanine/threonine residue (due to exact degeneracy of the α and β proton resonances) in the J-correlated spectra, and later confirmed as threonine during the sequential assignment. The amide resonance of T123 was uncharacteristically broad, indicating that it could be involved in a conformational exchange process; this resonance was also degenerate with the amide frequencies of L32, E34 and Q49, and was eventually identified in the three-dimensional TOCSY-HMQC spectrum.
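The reason the three-dimensional ¹⁵N-edited spectra resolve these overlaps can be illustrated with the Table 1 shifts of the four residues just mentioned. This is a sketch only: the shift values are as read from Table 1, while the tolerances (0.03 ppm for ¹H, 0.3 ppm for ¹⁵N) and the helper name are arbitrary choices made for the example.

```python
# (amide 1H shift, backbone 15N shift) in ppm, as read from Table 1.
shifts = {
    "L32": (7.78, 120.7),
    "E34": (7.77, 118.2),
    "Q49": (7.76, 117.8),
    "T123": (7.78, 117.2),
}


def overlapping_pairs(shift_table, index, tolerance):
    """Residue pairs whose shifts in one dimension differ by at most tolerance."""
    names = sorted(shift_table)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(shift_table[a][index] - shift_table[b][index]) <= tolerance
    ]


proton_overlaps = overlapping_pairs(shifts, 0, 0.03)    # illustrative 1H tolerance
nitrogen_overlaps = overlapping_pairs(shifts, 1, 0.3)   # illustrative 15N tolerance
```

All six residue pairs overlap in the amide proton dimension, yet none overlaps in ¹⁵N under these tolerances, so each spin system falls on a different ¹⁵N plane of the TOCSY-HMQC spectrum.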

The amide frequency of L45 resonates in a particularly crowded region of the two-dimensional spectrum (nine amide frequencies occur within 0.1 ppm) but this spin system was readily identified in the three-dimensional TOCSY-HMQC spectrum. Less extreme problems of resonance degeneracy for the remaining leucines (L102A, L110 and L122) were resolved in the same manner. The degeneracy of the amide proton resonances of I62, V64 and F100 was resolved in the three-dimensional TOCSY-HMQC spectrum. In addition, a further five lysine residues (K15, K27, K63, K98 and K101) were identified from this spectrum. The final three lysine residues were identified on the basis of connectivities to their sequential partners; K29 and K118 both exhibit small ³JHNα values and thus their spin systems were difficult to identify using this amide-based strategy. In the case of K116, coherence transfer had only correlated the amide through to the β protons, thus making it difficult to categorise this spin system to a particular residue type.

The five remaining AMX-type spin systems were identified in the three-dimensional spectra once the resonance overlap, which had hindered their prior categorisation, had been resolved. This overlap included Y61 (amide degeneracy with K63, R65, V99 and Q114), D68 and N92 (which both exhibited amide resonance overlap with G9, K98 and G104), F100 (which exhibits both α and amide resonance overlap with V64, and amide degeneracy with I62), and S102 (amide degeneracy with K27, N37 and G112). A further eight AM(PT)X-type residues (E13, E33, E46, Q53, Q105, E106, Q114 and E121) were categorised in the three-dimensional TOCSY-HMQC spectrum. The final two residues, namely E34 and Q49, exhibited amide resonance overlap with each other and with L32 and T123, which hindered their assignment in the two-dimensional J-correlated spectra. It was possible to identify Q49 on the basis of a strong dNN (i,i+1) NOE to Y50, and E34 was identified from its sequential NOE in the three-dimensional NOESY-HMQC spectrum. R65 was provisionally assigned as an arginine or a lysine in the three-dimensional TOCSY-HMQC; this assignment was precluded from the two-dimensional spectra due to severe resonance overlap involving five amide protons within 0.02 ppm. This assignment was confirmed by connectivities to its sequential partners in the NOESY spectra.

Fig. 3. A two-dimensional ¹H TOCSY spectrum (90-ms spin locking period) of 3 mM human stefin A, showing correlations between aliphatic protons in F1 and amide and aromatic protons in F2. The two inset boxes in the amide-α region encapsulate regions of the spectrum which are relatively crowded. The cross-peaks within these boxes are labelled according to the sequence position of the residue giving rise to the crosspeak. (Spectrum not reproduced.)

Sequential assignment

Spin systems were assigned to their sequential positions on the basis of characteristic inter-residue distances, using well-established procedures (e.g. Chazin et al., 1988). The resonance assignments are given in Table 1. In the first instance, two-dimensional NOESY spectra recorded at 298, 308 and 318 K (all with tm = 200 ms) were examined in conjunction with their equivalent two-dimensional TOCSY spectra (acquired with spin locking periods of 60-120 ms). This allowed a significant number of residues to be sequentially assigned. The amide-aliphatic region of a two-dimensional TOCSY spectrum is given in Fig. 3. The amide-α region of this figure contains two inset boxes which encapsulate sections of the spectrum that are relatively crowded; the cross-peaks within these inset boxes are labelled with the sequence position of the residue giving rise to the cross-peak. Furthermore, corresponding F3:F1 planes from three-dimensional TOCSY-HMQC and NOESY-HMQC spectra demonstrating the sequential connectivity from N92 to L97 are given in Fig. 4.

A chain of unbroken sequential NOEs can be traced for the entire backbone, as shown in Fig. 5. A strong or medium intensity dαN or dNN (i,i+1) NOE can be observed between 77 sequential neighbours in the 70-ms NOESY spectrum. A further seven connectivities consist of weak-intensity dαN (i,i+1) NOEs (Q22/E23, V28/K29, V55/A56, D68/N92, T111/G112, D119/D120 and T123/G124), with the sequential connectivity confirmed by the presence of dβN (i,i+1) NOEs. In the case of N92/K93, weak dαN (i,i+1) and dNN (i,i+1) NOEs are observed, and for S102/L102A and K118/D119 weak dNN and dβN (i,i+1) NOEs are present. For the final 12 sequential connectivities (M6-L11, T18/P19, G43/K44, G67/D68, L102A/P103, G104/Q105, G112/Y113 and L122/T123), only weak-intensity dαN (i,i+1) [or dαδ (i,i+1) for proline residues in the (i+1) position] NOEs were observed. A description of the identification of the most difficult sequential connectivities is given below. In each case, the cause of the difficulty was resonance overlap, which was resolved following examination of the NOESY-HMQC spectrum.

Fig. 4. Corresponding F3:F1 planes from three-dimensional TOCSY-HMQC (A) and three-dimensional NOESY-HMQC (B) spectra (spin locking period of 90 ms and mixing time of 150 ms, respectively) of 3 mM uniformly 15N-labelled human stefin A, demonstrating the sequential connectivities from N92-L97. The residue label is given above each section, and the 15N chemical shift of each plane is given near to the top of each section. The NH chemical shifts are indicated along the F3 dimension and the aliphatic shifts along F1. In A, the intraresidue amide-α cross-peaks are denoted in the F1 dimension. In B, the pathway of sequential dαN (i,i+1) connectivities is indicated (-). Panel labels (residue, 15N plane in ppm, NH shift in ppm): N92, 115.35, 8.58; K93, 118.40, 7.20; Y94, 120.14, 8.53; M95, 116.22, 9.00; H96, 120.57, 9.15; L97, 122.31, 9.88.

In terms of residue number, the first obstacle to the sequential assignment arose with the connectivity between S12 and E13 owing to amide proton resonance overlap involving E13 and K101. This overlap was easily resolved in the three-dimensional NOESY-HMQC spectrum as the 15N frequencies of these residues differ by 12.6 ppm, thus allowing the S12/E13 and F100/K101 sequential connectivities to be established. The sequential link between E20 and I21 was complicated by the presence of the intraresidue dNα NOE of A17 at the same position as the sequential dαN NOE. The sequential connectivity between these residues was first identified in the three-dimensional spectra, in which the 15N resonances of A17 and I21 were suitably resolved. The possibility of the spin system eventually identified as A17 being the preceding residue to I21 was eliminated on the basis of unambiguous connectivities between A17 and its sequential partners. The amide proton resonance frequencies of Q22 and E23 were separated by only 0.04 ppm, and the amides of V28 and K29 by 0.03 ppm. This precluded the observation of the expected sequential dNN NOEs; the NOE patterns of the residues in this region of stefin A strongly suggest that they adopt an α-helical conformation, as outlined in Fig. 5. The sequential connectivities between Q22/E23 and V28/K29 were in each case affirmed by the presence of weak dαN and dβN (i,i+1) NOEs.

Another region of spectral overlap in the two-dimensional homonuclear data sets is centred at approximately 7.77 ppm, with the backbone amide proton resonance frequencies of L32, E34, Q49 and T123, in addition to one of the side chain amide proton resonances of Q22, being present within 0.01 ppm of the central frequency. Fortunately, the 15N resonances of these residues were sufficiently well-resolved to allow them to be confidently assigned to their respective sequential positions in the three-dimensional spectra.
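The way a 15N dimension separates residues whose amide protons are degenerate can be sketched in a few lines of Python. This is an illustration only, not the procedure used for the assignment: the function name, both tolerances and every chemical shift in the example are hypothetical, chosen so that E13 and K101 differ by the 12.6 ppm in 15N quoted above.

```python
from itertools import combinations


def overlapping_pairs(shifts, h_tol=0.02, n_tol=0.5):
    """List residue pairs whose amide 1H shifts coincide within h_tol (ppm).

    shifts maps a residue label to (1H ppm, 15N ppm). Each result is
    (residue_a, residue_b, resolved), where resolved is True when the
    15N shifts differ by more than n_tol (ppm).
    """
    results = []
    for (a, (ha, na)), (b, (hb, nb)) in combinations(sorted(shifts.items()), 2):
        if abs(ha - hb) <= h_tol:
            results.append((a, b, abs(na - nb) > n_tol))
    return results


# Hypothetical shifts (not taken from Table 1); only the 12.6 ppm
# 15N separation between E13 and K101 reflects the text.
example = {
    "E13": (8.40, 118.0),
    "K101": (8.41, 130.6),
    "S12": (8.05, 116.3),
}

pairs = overlapping_pairs(example)
print(pairs)
```

With these made-up numbers the only 1H-degenerate pair is E13/K101, and it is flagged as resolved by 15N, which mirrors the argument made for the three-dimensional NOESY-HMQC spectrum.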

The amide proton frequencies of T36 and G124 were overlapped at all measured temperatures (290, 298, 308 and 318 K) and pH values (pH 5.25, 5.5 and 6.5). It can be observed in the HSQC spectrum (Fig. 2) that these residues also have identical 15N frequencies at 308 K and pH 5.5. T36 is connected to K35 and N37 via medium-intensity dNN (i,i+1) NOEs, and a confirmatory dγN (i,i+1) NOE is also observed between T36 and N37. Weak-intensity dαN, dβN and dγN (i,i+1) NOEs between T123 and G124, and a medium-intensity dαN NOE from G124 to F125, allowed the sequential position of this inextricably overlapped penultimate residue to be established. A less severe case of resonance overlap involved L45 and E46. Although the degenerate amide proton resonance frequencies of these two residues proved to be an obstacle to their identification in the two-dimensional NOESY spectra, their 15N resonances were clearly resolved, thus facilitating their sequential assignment.

The most difficult stretch of sequential connectivity to establish was Y60-R65, principally because of severe amide proton resonance overlap. The amide protons of Y61, K63, R65, V99 and Q114 all resonate within 0.02 ppm (centred at approximately 9.22 ppm) and the amide protons of I62, V64 and F100 also resonate within 0.02 ppm (centred at approximately 9.59 ppm). The problem of amide proton resonance degeneracy within this region was accentuated by the presence of α proton resonance overlap. For example, the dαN (i,i+1) NOE between Y60 and Y61 was degenerate with the dNα (i,i) NOE of V99. Similarly, the dαN (i,i+1) NOE between Y61 and I62 was overlapped with the dNα (i,i) NOE of both V64 and F100. Furthermore, the I62α/K63N NOE coincides with the dNα (i,i) NOE of V99, and the V64α/R65N NOE with the dNα (i,i) cross-peak of Y61. In addition to hindering the assignment of the sequence Y60-R65, the amide proton resonance overlap centred at approximately 9.22 ppm and 9.59 ppm was also an obstacle to the identification of V99, F100 and Q114. The 15N frequencies of all of these residues, however, were well-resolved from each other in the three-dimensional spectra, thus allowing the sequential connectivities to be identified confidently.

Fig. 5. Summary of sequential and medium-range NOE connectivities, slowly exchanging amide protons, 3JHNα coupling constants and secondary structure for human stefin A. The height of the bars corresponding to the sequential NOEs gives a measure of the strength of the NOE, which was classified as strong, medium or weak from a two-dimensional NOESY spectrum with a mixing time of 70 ms. Connectivities that could not be resolved due to resonance degeneracy are marked (*). Slowly exchanging amide protons are marked, and values of 3JHNα are marked as greater than 8 Hz or less than 5 Hz. Sequence shown in the figure: MIPGGLSEAKPATPEIQEIVDKVKPQLEEKTNETYGKLEAVQYKTQVVAGTNYYIKVRAGDNKYMHLKVFKSLPGQNEDLVLTGYQVDKNKDDELTGF. Secondary structure labels shown: strand 1, helix, strand 2, strand 3.

Secondary structure

A large number of sequential and medium-range NOEs could be identified in the two-dimensional NOESY and three-dimensional NOESY-HMQC spectra. Cross-peaks in a two-dimensional NOESY spectrum, which had been recorded with a mixing time of 70 ms, were classified as having strong, medium or weak intensity. These sequential and medium-range NOEs, in addition to 3JHNα coupling constants and the presence of slowly exchanging amide protons, are summarised in Fig. 5. A diagram representing the backbone inter-strand NOE connectivities and slowly exchanging protons is given in Fig. 6. The information contained within these two figures demonstrates the secondary structural elements that are present in the protein, as discussed below.

N-terminus: Met6-Gly10

The first five residues adopt no regular stable conformation. In each case a weak sequential dαN (i,i+1) NOE [for I7-P8 a dαδ (i,i+1) NOE] was observed, but no other sequential or medium-range connectivities were present.

Strand I: Leu11-Ala17

The NOEs observed in the sequence L11-A17 suggest that the conformation of this strand is not regular; a stretch of strong dαN (i,i+1) NOEs (beginning at L11) is present, but inter-strand NOEs only occur from K15-A17. They include a cross-sheet dNN (i,j) NOE between K15 and Y50, a dαα (i,j) NOE between P16 and Q49, and dαN (i,j) connectivities between P16 and Y50, and Q49 and A17 (Fig. 6). The amide protons of K15 and Y50 are resistant to exchange with the solvent, and are deduced to form a hydrogen bond pair.

Helix: Pro19-Thr36

The pattern of medium-range NOEs, which defines the α-helix, begins at P19. A stretch of medium-intensity sequential dNN (i,i+1) NOEs begins at E20 and continues to N37 [for K29 to P30 two dNδ (i,i+1) NOEs are observed, and for P30 to Q31 a pair of dδN (i,i+1) NOEs are present]. The expected sequential dNN (i,i+1) NOEs between Q22/E23 and V28/K29 are not observed due to amide proton degeneracy between the two partners. Seven of the amide protons in the helix (Q22-V28) exchange slowly with the solvent. Low values of 3JHNα (less than 5.5 Hz) were measured for residues E20-K29 (except for V28, which had a measured value of 6.4 Hz) and E33-K35.

Strand II: Lys44-Val54

Strong sequential dαN (i,i+1) NOEs are present from K44 to A47. E46 is likely to form an inter-strand hydrogen bond pair with R65; both amide protons are resistant to exchange with the solvent and a dNN (i,j) NOE is observed between them. The values of 3JHNα (5.8, 9.8 and 5.1 Hz for K44, L45 and A47, respectively; it was not possible to measure this value for E46 because of resonance overlap) suggest that the early part of the strand is not regular. This irregularity continues with V48, which has a weak dαN (i,i+1) but strong dNN (i,i+1) NOE to Q49. The pattern of strong sequential dαN (i,i+1) NOEs begins again at Q49 and continues through to V54. This section of the strand appears to be more regular; residues Y50-V54 all have 3JHNα values of greater than 8 Hz, thus confirming the presence of an extended backbone conformation. The amide protons of Q49, Y50 and K51 exchange slowly with the solvent. The pattern of inter-strand NOEs (Fig. 6) suggests that these residues form hydrogen bond pairs with residues K63, K15 and Y61, respectively. The amide protons of T52, Q53 and V54 are not resistant to exchange with the solvent. For residues T52 and V54, there are no NOEs with which to define potential hydrogen bond acceptors, thus these amide protons are probably not involved in stable hydrogen bonds. In the case of Q53, however, the pattern of observed NOEs (Fig. 6) suggests that a pair of inter-strand hydrogen bonds should be formed with N59. It is possible that this pair of hydrogen bonds is populated to a significant extent but that the dynamic properties of this part of the molecule allow the amide protons to exchange with the solvent.
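The reasoning that links 3JHNα to backbone conformation can be illustrated with a Karplus-type relation. The sketch below assumes a commonly used parameterisation, 3JHNα = 6.4 cos²θ - 1.4 cos θ + 1.9 Hz with θ = φ - 60°; the coefficients, the function names and the idealised φ angles are assumptions introduced for illustration and are not taken from this paper. The 5.5 Hz and 8 Hz thresholds are those quoted in the text.

```python
import math


def karplus_j_hn_alpha(phi_deg, a=6.4, b=-1.4, c=1.9):
    """Estimate 3J(HN,Halpha) in Hz from the backbone angle phi in degrees.

    Assumed Karplus-type form: J = a*cos^2(theta) + b*cos(theta) + c,
    with theta = phi - 60 degrees.
    """
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


def classify_j(j_hz, low=5.5, high=8.0):
    """Sort a coupling constant into the ranges used in the text."""
    if j_hz < low:
        return "small (helix-compatible)"
    if j_hz > high:
        return "large (extended)"
    return "intermediate"


# Idealised phi angles (assumed values, for orientation only).
for label, phi in (("ideal alpha-helix", -57.0), ("ideal beta-strand", -120.0)):
    j = karplus_j_hn_alpha(phi)
    print(f"{label}: phi = {phi:.0f} deg, J = {j:.1f} Hz, {classify_j(j)}")

# Couplings quoted in the text for the early part of strand II.
for residue, j in (("K44", 5.8), ("L45", 9.8), ("A47", 5.1)):
    print(residue, j, classify_j(j))
```

Under these assumptions only L45 falls in the extended range, while K44 and A47 do not, which is in line with the description of the early part of strand II as irregular.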

Strand III: Thr58-Ala66

Strand III adopts a regular extended conformation from T58-A66. In each case a strong sequential dαN (i,i+1) NOE is observed and the 3JHNα value is greater than 8 Hz. The amide protons of Y60-R65 are all resistant to exchange with the solvent for a period in excess of three months, suggesting that this strand forms part of the stable core of the molecule. The hydrogen bonding partners of these residues are presented in Fig. 6.

Strand IV: Lys93-Lys101

Strand IV also adopts a regular extended conformation. An unbroken sequence of strong sequential dαN (i,i+1) NOEs is present from K93-K101; each of these residues has a 3JHNα value of greater than 8 Hz. Residues M95-F100 exhibit slowly exchanging amide protons; the proposed hydrogen bond network formed by these residues, with the supporting evidence of the NOE connectivities, is given in Fig. 6.

Loop: Ser102-Asn105A

The pattern of weak NOEs connecting the sequence S102-N105A, and the lack of medium-range NOEs in this region, suggest that this part of the molecule does not adopt a stable regular secondary structure.

Strand V: Glu106-Asn117

The final strand begins at E106 and ends at N117. The observed pattern of NOEs suggests that this strand is not regular throughout. An unbroken stretch of strong dαN (i,i+1) NOEs is present from E106-L110. No backbone inter-strand NOEs are observed from E106 or D107, and the amide protons of these residues are not protected against exchange with the solvent. The first inter-strand NOE is the dαα (i,j) between L108 and K101. The amide of V109 is resistant to exchange with the solvent: it can be deduced from the observed pattern of NOEs between strands IV and V that V109 and F100 form a hydrogen bond pair. The sequence of strong dαN (i,i+1) NOEs is broken for two residues (between T111 and Y113) and a medium-intensity dNN (i,i+1) NOE is observed between T111 and G112. This demonstrates that the extended conformation has been distorted. The amide of G112 is, however, protected against solvent exchange; the network of NOEs suggests that G112 and K98 form a pair of hydrogen bonds. The regularity of the strand is resumed at Y113 until V115, as demonstrated by the presence of strong sequential dαN (i,i+1) NOEs. The amide proton of Q114 is protected, and the complement of NOEs indicates that this residue forms a pair of hydrogen bonds with H96. The pattern of NOEs from D115A to N117 suggests that the final three residues of the strand form a more distorted backbone conformation.

Fig. 6. Schematic diagram of the anti-parallel β-sheet of human stefin A. Residues are labelled at the α carbon. NOE connectivities between backbone protons are represented (---). Hydrogen bonds (deduced from the presence of slowly exchanging amide protons and NOE patterns) are shown (=) connecting amide protons and carbonyl oxygens. The double slash through the peptide bond between 115 and 116 indicates that residue 115A has been omitted from this diagram.
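A simple consistency check can be run on the hydrogen bond pairs deduced in the preceding sections. In a regular anti-parallel pairing of two strands, the residue numbers of the partners sum to a constant, so a change in that sum flags a local irregularity such as a bulge. The Python sketch below is illustrative and not part of the original analysis; the function name is hypothetical, and the pairs are those stated in the text, all lying in stretches where the residue numbering is consecutive.

```python
from collections import defaultdict


def register_sums(pairs):
    """Group hydrogen bond pairs (i, j) by the sum of their residue numbers."""
    groups = defaultdict(list)
    for i, j in pairs:
        groups[i + j].append((i, j))
    return dict(groups)


# Pairs stated in the text for strands II/III:
# E46/R65, Q49/K63, K51/Y61 and Q53/N59.
strands_2_3 = [(46, 65), (49, 63), (51, 61), (53, 59)]

# Pairs stated in the text for strands IV/V:
# F100/V109, K98/G112 and H96/Q114.
strands_4_5 = [(100, 109), (98, 112), (96, 114)]

print(register_sums(strands_2_3))
print(register_sums(strands_4_5))
```

The E46/R65 pair sits one residue out of register with the other strand II/III pairs, which is consistent with the irregular early part of strand II, and the register shift between F100/V109 and K98/G112 coincides with the distortion described at T111-G112.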

C-terminal residues: Lys118-Phe125

The pattern of sequential NOEs and the linewidth of the resonances for the final eight residues suggest that the C-terminus is conformationally restricted, but that the conformation is not regular strand or helix. Six of these residues exhibit values of 3JHNα of less than 5.5 Hz. Furthermore, the amide proton resonance of T123 is uncharacteristically broad, indicating that this residue might be involved in a conformational exchange process.

Comparison with stefin B

Stefins A and B exhibit 54% sequence identity. The structure of stefin B in its complex with carboxymethylated papain has been solved by X-ray crystallography (Stubbs et al., 1990). This molecule forms a tripartite wedge which slots into the active site of papain. The three components of this wedge are the N-terminal trunk, the first hairpin loop (containing the highly conserved QVVAG region) and a second hairpin loop. The structure of this complex validated the inhibitory model proposed on the basis of docking the crystal structure of chicken cystatin into the active site cleft of papain (Bode et al., 1988). According to the model, the mechanism of binding of cystatin inhibitors to papain-like proteinases is fundamentally different from the standard mechanism defined for serine proteinases and most of their protein inhibitors (Laskowski and Kato, 1980). In this novel mechanism, two conserved inhibitor loops (which have significantly different conformations from a bound substrate) bind to a region of the target enzyme which is remote from the active site C25; G9 and A10 of the inhibitor are in close spatial proximity to this catalytic residue, but are in a conformation which renders them invulnerable to proteolytic attack. A comparison of the secondary structural features of free stefin A and stefin B in its complex with carboxymethyl-papain is outlined below.

Stefin B begins with an open type II turn (M7-A10) for which the presence of a glycine in the third position is highly favourable; this glycine is conserved throughout the cystatin superfamily. There is no evidence to suggest that this region of stefin A in solution populates any single conformation to a significant extent. This region of the stefin molecule forms part of the tripartite wedge which docks into the active site of the target proteinase. The side chain of the catalytic residue (C25, which has been carboxymethylated) of papain is enclosed by stefin residues S8-G9 and Q53-V54. A matrix of proteinase-inhibitor (papain-stefin B) intermolecular contacts of less than 0.4 nm (Stubbs et al., 1990) indicates that 12, 18 and 13 such contacts are made by M7, S8 and G9, respectively. In addition, three proteinase-inhibitor main chain-main chain hydrogen bonds are made by the N-terminal trunk, namely G67N-S8O, G66O-S8N and D158O-G9N. It is clear that this turn is integral to the binding. However, since the turn is not populated significantly in solution, it is probably induced as a result of complexation.
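Counting intermolecular contacts below a distance cutoff, as in the contact matrix cited above, reduces to a pairwise distance test between the atoms of the two molecules. The Python sketch below uses made-up coordinates (in nm) and a hypothetical function name; only the 0.4 nm cutoff comes from the text, and a real analysis would take atom positions from the crystal structure of the complex.

```python
import math


def count_contacts(inhibitor_atoms, proteinase_atoms, cutoff_nm=0.4):
    """Count inhibitor/proteinase atom pairs closer than cutoff_nm.

    Each atom is an (x, y, z) tuple in nm.
    """
    return sum(
        1
        for a in inhibitor_atoms
        for b in proteinase_atoms
        if math.dist(a, b) < cutoff_nm
    )


# Made-up coordinates for illustration only (nm).
inhibitor = [(0.00, 0.00, 0.00), (0.15, 0.00, 0.00)]
proteinase = [(0.30, 0.00, 0.00), (0.00, 0.50, 0.00), (0.10, 0.10, 0.35)]

n_contacts = count_contacts(inhibitor, proteinase)
print(n_contacts)
```

Summing such counts per inhibitor residue would give per-residue totals of the kind quoted for M7, S8 and G9.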

The N-terminal trunk is followed by an extended first strand which is a common feature in both molecules. A five-turn α-helix ensues; the NMR data are consistent with the helix of stefin A beginning and terminating at the equivalent positions as in stefin B. The helix of stefin B is kinked at the fourth turn; the interrupted pattern of medium-range NOEs suggests that a similar kink is induced by the presence of P30 in the fourth turn of the helix of stefin A.

There is a great deal of similarity in the secondary structural features of the two molecules in the 40 residues that follow the helix. In each case, a short loop connects the helix to the second strand. The conformation of the early part of this strand is irregular until residue 49, when a regular extended conformation begins. The second strand is followed by the first hairpin loop which forms part of the tripartite wedge. There are 47 intermolecular contacts less than 0.4 nm (Stubbs et al., 1990) and six potential hydrogen bonds between this loop of stefin B and carboxymethyl-papain. In addition, V55 within this loop is in a slightly strained conformation in the proteinase-inhibitor complex (φ = -117°, ψ = -146°). The NOE intensity between the sequential residues in the locality of the loop of free stefin A in solution, and the observation of inter-strand NOEs beginning at V54:T58, suggest that it is conformationally restricted. The first hairpin loop is succeeded by a regular third strand followed by a loop and a regular fourth strand; the secondary structural features of this region are common to the free and bound inhibitors.

The conformation of the second hairpin loop (which is the final component of the tripartite wedge) is significantly different in the free inhibitor when compared with its complexed counterpart. Residues L102A-N105A form 1.5 turns of a 3₁₀ helix in the bound conformation of stefin B; these residues collectively provide 13 intermolecular (proteinase-inhibitor) contacts less than 0.4 nm apart. In contrast, there is no evidence to suggest that this region of the free inhibitor in solution populates any single conformation to a significant degree. All of the sequential NOEs in this region are of weak intensity, suggesting that a general lowering of the NOE build-up rates has occurred as a result of mobility. The similarity in the secondary structural features of stefins A and B is resumed with the fifth strand through to the C-terminus.

Concluding remarks

Complete backbone and side chain 15N-NMR resonance assignments have been obtained for recombinant human stefin A. In addition, complete backbone 1H-NMR assignments have been made, and 88% of the side chain proton resonances have been identified. 3JHNα values have been obtained for the majority of the residues and the slowly exchanging amide protons have been identified. The secondary structure of the protein has been determined on the basis of short inter-residue and medium-range NOEs, J-constants and amide exchange data, and consists of five strands of anti-parallel β-sheet, an α-helix and a C-terminal loop. There is a great deal of similarity between these features of free stefin A and the secondary structure of stefin B in its complex with carboxymethyl-papain, but also some noteworthy differences. These differences occur in two of the three components of the tripartite wedge of stefin which confers the inhibitory activity of the molecule. Specifically, there is no evidence to suggest that either the N-terminus or the second binding loop of free stefin A populates any single conformation to a significant degree. This contrasts with the observation of an open type II turn and 1.5 turns of a 3₁₀ helix for these two regions, respectively, in the bound conformation of stefin B. It seems likely, therefore, that the stefin molecule undergoes a significant conformational selection on binding, i.e. a turn is formed at the N-terminus and a helix induced in the second binding loop as a result of the formation of the inhibitory complex. Full structure determination and an analysis of the dynamic properties of free stefin A in solution are currently in progress.

The Krebs Institute is a Science and Engineering Research Council Biomolecular Sciences Centre for Molecular Recognition. J. R. M. acknowledges the receipt of a research studentship from the SERC. J. P. W. thanks The Royal Society and the University of Sheffield for equipment grants. Financial help from the Molecular Recognition Initiative of the SERC, the Ministry for Science of the Republic of Slovenia and the British Council is acknowledged.

REFERENCES

Barrett, A. J., Rawlings, N. D., Davies, M. E., Machleidt, W., Salvesen, G. & Turk, V. (1986) in Proteinase inhibitors (Barrett, A. J. & Salvesen, G., eds) pp. 515-569, Elsevier, Amsterdam.

Bode, W., Engh, R., Musil, D., Thiele, U., Huber, R., Karshikov, A., Brzin, J., Kos, J. & Turk, V. (1988) EMBO J. 7, 2593-2599.

Brown, S. C., Weber, P. L. & Mueller, L. (1988) J. Magn. Reson. 77, 166-169.

Chazin, W. J., Rance, M. & Wright, P. (1988) J. Mol. Biol. 202, 603-622.

Davies, M. E. & Barrett, A. J. (1984) Histochemistry 80, 373-377.

Dieckmann, T., Mitschang, L., Hofman, M., Kos, J., Turk, V., Auerswald, E. A., Jaenicke, R. & Oschkinat, H. (1993) J. Mol. Biol. 234, 1048-1059.

Drenth, J., Jansonius, J. N., Koekoek, R. & Wolthers, B. G. (1971) Adv. Protein Chem. 25, 79-115.

Engh, R., Dieckmann, T., Bode, W., Auerswald, E. A., Turk, V., Huber, R. & Oschkinat, H. (1993) J. Mol. Biol. 234, 1060-1069.

Hopsu-Havu, V. K., Järvinen, M. & Rinne, A. (1983a) Br. J. Dermatol. 109, 77-85.

Hopsu-Havu, V. K., Joronen, I., Järvinen, M. & Rinne, A. (1983b) Eur. Rev. Med. Pharmacol. Sci. 5, 1-4.

Järvinen, M., Pernu, H., Rinne, A., Hopsu-Havu, V. K. & Altonen, M. (1983) Acta Histochem. 73, 279-282.

Jerala, R., Žerovnik, E., Lohner, K. & Turk, V. (1994a) Protein Eng. 7, 977-984.

Jerala, R., Kroon-Žitko, L. & Turk, V. (1994b) Prot. Exp. Purif. 5, 65-69.

Kartasova, T., Cornelissen, B. J. C., Belt, P. & van de Putte, P. (1987) Nucleic Acids Res. 15, 5945-5962.

Kondo, H., Abe, K., Emori, Y. & Arai, S. (1991) FEBS Lett. 278, 87-90.

Lah, T. T., Kokalj-Kunovar, M. & Turk, V. (1990) Biol. Chem. Hoppe-Seyler 371, 199-203.

Laskowski, M. Jr & Kato, I. (1980) Annu. Rev. Biochem. 49, 593-626.

Live, D. H., Davis, D. G., Agosta, W. C. & Cowburn, D. (1984) J. Am. Chem. Soc. 106, 1939-1941.

Machleidt, W., Borchart, U., Fritz, H., Brzin, J., Ritonja, A. & Turk, V. (1983) Hoppe-Seyler's Z. Physiol. Chem. 364, 1481-1486.

Marion, D., Ikura, M. & Bax, A. (1989) J. Magn. Reson. 84, 425-428.

Priestle, J. P., Ford, G. C., Glor, M., Mehler, E. L., Smit, J. D. G., Thaller, C. & Jansonius, J. N. (1984) Acta Crystallogr. Sect. A 40, Hamburg Congress Abstracts C-17.

Räsänen, O., Järvinen, M. & Rinne, A. (1978) Acta Histochem. 63, 193-196.

Rinne, A., Järvinen, M. & Räsänen, O. (1978) Acta Histochem. 63, 183-192.

Rinne, A. (1979) Acta Univ. Oulu. 041, Anat. Pathol. Microbiol. 4.

Rinne, A., Järvinen, M., Räsänen, O. & Dorn, A. (1980) Acta Histochem. 22, 325-329.

Rinne, A., Räsänen, O., Järvinen, M., Dammert, K., Kallioinen, M. & Hopsu-Havu, V. K. (1984) Acta Histochem. 74, 75-79.

Stubbs, M. T., Laber, B., Bode, W., Huber, R., Jerala, R., Lenarčič, B. & Turk, V. (1990) EMBO J. 9, 1939-1947.

Turk, V. & Bode, W. (1991) FEBS Lett. 285, 213-219.

Waltho, J. P. & Cavanagh, J. (1993) J. Magn. Reson. 103, 338-348.

Wüthrich, K. (1986) NMR of proteins and nucleic acids, John Wiley and Sons, New York.