Supporting Online Materials For:

Structural and Functional Characterization of an Archaeal CASCADE Complex for CRISPR-Mediated Viral Defense

Nathanael G. Lintner1,2†, Melina Kerou3†, Susan K. Brumfield1,4, Shirley Graham3, Huanting Liu3, James H. Naismith3, Matthew Sdano1,2, Nan Peng5, Qunxin She5, Valérie Copié1,2, Mark J. Young1,4, Malcolm F. White3* and C. Martin Lawrence1,2*

1Thermal Biology Institute, 2Department of Chemistry and Biochemistry, 4Department of Plant Sciences and Plant Pathology, Montana State University, Bozeman, MT 59717, USA

3Biomedical Sciences Research Complex, University of St Andrews, North Haugh, St Andrews, Fife KY16 9ST, UK

5Archaea Center, Department of Biology, University of Copenhagen, DK-2200, Copenhagen N, Denmark

† These authors contributed equally to the work

*For correspondence: C. Martin Lawrence ([email protected]), Malcolm F White ([email protected])

Supplemental Figures

Figure S1: Characterization of aCASCADE. (A) Chromatogram of aCASCADE purified from S. solfataricus on a Superose-6 column. The major elution peak is centered at 13.6 mL and corresponds to a molecular weight of ~350-500 kDa, with an additional “tail” of higher molecular weight material. (B) aCASCADE co-purifies with a small amount of DNA. SYBR Gold-stained urea-PAGE gel showing nucleic acids extracted from the proteins using basic phenol-chloroform after the StrepTactin step (lane 2), and the same material treated with DNase I (lane 3), RNase A (lane 4) or both DNase I and RNase A (lane 5). Lanes 2 and 3 are overloaded to demonstrate the relative amounts of DNA and RNA in the complex. (C) The bound crRNA is highly protected. RNase-protection assay showing no degradation of the bound RNA after 24 hours of incubation with RNase A.

Figure S2: The Csa2 fold. The ferredoxin or RRM-like subdomain is colored violet, the 1-3 domain is red, the 2-4 domain is orange and the C-terminal subdomain is yellow. The disordered loops are depicted by dotted lines. Secondary structures are labelled as in Fig. 4. The four insertions into the core RRM-like fold are labelled and numbered.

Figure S3: Multiple sequence alignment of Csa2 orthologs. Csa2 orthologs encoded in genomes with CRISPRs, Cas1-6 and Csa proteins were aligned using T-COFFEE. The secondary structural elements were assigned with DSSP using chain A of the Sso1442 structure and are named as in figure 4. The disordered residues in chain A are depicted with dotted lines. TK0453 has an extra 80-residue insertion between helices four and five which is not shown in the alignment. Identified Csa2 orthologs are from Sulfolobus solfataricus (SSO), Sulfolobus tokodaii (ST), Metallosphaera sedula (Msed), Methanocaldococcus jannaschii (MJ), Pyrobaculum aerophilum (PAE), Candidatus Korarchaeum cryptofilum (Kcr), Pyrococcus furiosus (PF), Pyrococcus horikoshii (PH), Pyrococcus abyssi (PAB), Aeropyrum pernix (APE), Archaeoglobus fulgidus (AF), and Thermococcus kodakarensis (TK).

Figure S4: SSM superposition of Csa2 chain B (cyan) onto chain A (orange) displaying the conformational flexibility of the 1-3 domain. In chain B the end of the β-hairpin is shifted by 10.4 Å relative to chain A. This conformational change is accompanied by the disordering of an additional 18-23 residues.

Figure S5: Hypothetical Csa2 hexamer. The size and shape of a modeled Csa2 hexamer are similar to those of the hexameric CasC core of E. coli CASCADE (Jore et al., 2011).

Figure S6: Distribution of spacer sizes among CRISPRs. Histograms showing (A) the spacer lengths in S. solfataricus and E. coli, and (B) the distribution of lengths for all spacers in the CRISPR database (Grissa et al., 2007) as of 12/22/10. The open symmetry of CASCADE may allow it to accommodate crRNA of variable length.

Supplemental Tables

Table S2: Proteins co-purifying with Sso1442 (Csa2) as identified by in-solution trypsin digestion followed by LC-MS/MS

Protein   Family               Expt A             Expt B             Expt C
                               score   coverage   score   coverage   score   coverage
Sso1442   Csa2(Cas7)           24937   78%        33795   80%        32149   82%
Sso1441   Cas5                 2040    53%        1770    57%        1726    63%
Sso1399   Csa2(Cas7)           999     66%        1189    40%        912     50%
Sso2466   Biotin carboxylase   593     33%        67      6%         377     59%
Sso1400   Cas5                 396     41%        622     22%        327     22%
Sso1443   Csa5                 130     25%        372     65%        377     59%
Sso0962   Alba                 95      53%        173     72%        173     46%
Sso1437   Cas6                 92      17%        132     17%        61      18%
Sso1401   Csa4(Cas8a2)         -       -          153     10%        84      8%

Table S1: High-abundance proteins identified using in-gel trypsin digestion followed by LC-MS/MS

Band (Fig 1A)   Protein   Cas Family   Mascot Score   Unique peptides   Sequence Coverage
a.              AccC      N/A          289            8                 16%
                PpcB                   274            5                 12%
b.              Sso1442   Csa2(Cas7)   329            8                 31%
c.              Sso1442   Csa2(Cas7)   492            10                34%
d.              Sso1442   Csa2(Cas7)   85             2                 6%
e.              Sso1441   Cas5         414            5                 31%

Table S3: Oligonucleotides used in this study (CRISPR repeat derived sequences are in bold and the PAM is underlined.)

Oligonucleotide name   Sequence

crRNA-A1               5’-AUUGAAAGGAACUAGCUUAUAGUUUAGAAGAAAACAAACAAAUAAUGAUUAAUCCCAAAA

U15 CRISPR repeat      5’-UUUUUUUUUUUUUUUGAUUAAUCCCAAAAGGAAUUGAAAG

Target-A1f             5’-TAATACGACTCACTATAGGGTATTATTTGTTTGTTTTCTTCTAAACTATAAGCTAGTTCTGGAGAGAAGGTG

Target-A1r             5’-CACCTTCTCTCCAGAACTAGCTTATAGTTTAGAAGAAAACAAACAAATAATACCCTATAGTGAGTCGTATTA

crRNA transcript       5’-GGGAUAGGAAGUAUAAAAACACAACAGAUUAAUCCCAAAAGGAAUUGAAAGGAACUAGCUUAUAGUUUAGAAGAAAACAAACAAAUAAUGAUUAAUCCCAAAAGGAAUUGAAAGAUUUUCAGCUGAAAAUUUGAAAUCUGUAGAUUUGGAUG
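Target-A1f and Target-A1r in Table S3 appear to be the two strands of one double-stranded target. A minimal sketch of how that can be checked from the sequences alone is given below; the function and variable names are illustrative and not part of the original methods.

```python
# Complement table for upper-case DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return seq.translate(_COMPLEMENT)[::-1]


# Sequences copied from Table S3, written 5' to 3'.
TARGET_A1F = (
    "TAATACGACTCACTATAGGGTATTATTTGTTTGTTTTCTTCTAAACTAT"
    "AAGCTAGTTCTGGAGAGAAGGTG"
)
TARGET_A1R = (
    "CACCTTCTCTCCAGAACTAGCTTATAGTTTAGAAGAAAACAAACAAAT"
    "AATACCCTATAGTGAGTCGTATTA"
)

# True when the two oligonucleotides can anneal over their full length.
strands_anneal = reverse_complement(TARGET_A1R) == TARGET_A1F
print("Target-A1f / Target-A1r fully complementary:", strands_anneal)
```

The same helper can be applied to any other oligonucleotide pair to confirm that it was transcribed correctly before ordering.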

Supplemental Materials and Methods

S. solfataricus protein expression and purification. For transformation, 1 µg of expression vector was combined with 50 µL of electrocompetent S. solfataricus PH1-16 cells and electroporated in a 1 mm cuvette using a Bio-Rad GenePulser Xcell with a pulse controller at 1.5 kV, 400 Ω and 25 µF. The cells were immediately diluted in 1 mL of ice-cold H2O, incubated on ice for 10 min, and then added to 25 mL of 80 °C Brock’s minimal media (Brock et al., 1972) supplemented with 0.1% tryptone and 0.2% sucrose. The presence of the vector was confirmed by PCR. For protein expression, 50 mL of starter culture at an OD600 of 0.8-1.2 was used to seed 1 L of Brock’s minimal media supplemented with 0.2% arabinose, and the culture was grown for 3 days at 80 °C.
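The electroporation settings above (1.5 kV, 400 Ω, 25 µF, 1 mm cuvette) fix two derived quantities that are often quoted when comparing protocols: the nominal field strength and the nominal RC time constant. The sketch below computes both. It assumes an ideal exponential-decay pulse in which the set resistance dominates; the time constant actually reported by the instrument also depends on the sample conductivity, which is not given here.

```python
def field_strength_kv_per_cm(voltage_kv: float, gap_mm: float) -> float:
    """Nominal field strength across the cuvette gap, in kV/cm."""
    return voltage_kv * 10.0 / gap_mm


def time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """Nominal RC time constant in milliseconds (ohm * microfarad = microseconds)."""
    return resistance_ohm * capacitance_uf / 1000.0


# Settings taken from the transformation protocol above.
field = field_strength_kv_per_cm(voltage_kv=1.5, gap_mm=1.0)
tau = time_constant_ms(resistance_ohm=400.0, capacitance_uf=25.0)
print(f"field strength: {field:.1f} kV/cm, nominal time constant: {tau:.1f} ms")
```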

Isolation of aCASCADE. A cell pellet (8-10 g) of Csa2-expressing S. solfataricus PH1-16 was resuspended in lysis buffer [20 mM NaH2PO4, pH 7.5, 150 mM NaCl, 0.1% NP-40, 1 mM EDTA, 0.1 mM PMSF, protease inhibitor cocktail set III (Calbiochem)] and passed through a French Press. The lysate was centrifuged for 30 min at 30,000 × g. The supernatant was combined with 400 µL of StrepTactin resin (Sigma) and inverted end-over-end for 4 hours at 4 °C. The beads were washed four times with 800 µL of 20 mM NaH2PO4, pH 7.5, 150 mM NaCl, 0.1% NP-40, 1 mM EDTA and twice with 800 µL of 20 mM NaH2PO4, pH 7.5, 150 mM NaCl, 0.1% NP-40. Bound proteins were eluted from the StrepTactin resin by washing four times with StrepTactin elution buffer [20 mM NaH2PO4, pH 7.5, 150 mM NaCl, 0.1% NP-40, 2.5 mM desthiobiotin]. aCASCADE was further purified using nickel-chelate affinity chromatography: 100 µL of Ni-NTA-agarose (Qiagen) was added to the pooled elution fractions and the sample was rotated end-over-end for 30 min at 4 °C. The beads were washed four times with 200 µL of 20 mM NaH2PO4, pH 7.5, 150 mM NaCl, 0.1% NP-40 and twice with 200 µL of 20 mM NaH2PO4, pH 7.5, 150 mM NaCl. The proteins were eluted by washing four times with 40 µL of Ni-NTA elution buffer [10 mM Tris-Cl, pH 8.0, 50 mM NaCl, 200 mM imidazole]. aCASCADE was then purified further and analyzed by gel-filtration chromatography using a Superose-6 column equilibrated in 20 mM NaH2PO4, pH 7.5, 150 mM NaCl and calibrated with thyroglobulin, ferritin, catalase and aldolase (Amersham Pharmacia). For transmission electron microscopy, gel filtration was done in 20 mM NaH2PO4, pH 7.5, 150 mM NaCl.
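A gel-filtration column calibrated with standards, as described above, is commonly converted into a size estimate by fitting log10(molecular weight) against elution volume. The sketch below shows one such fit by ordinary least squares. It is an illustration only: the elution volumes of the standards are not reported in this supplement, so the volumes in the example are placeholders, and the molecular weights are the commonly quoted nominal values for these proteins rather than values taken from this study.

```python
import math


def fit_calibration(standards):
    """Least-squares fit of log10(MW) against elution volume.

    standards: list of (elution_volume_mL, molecular_weight_kDa) pairs.
    Returns (slope, intercept) for log10(MW) = slope * volume + intercept.
    """
    xs = [volume for volume, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    n = len(standards)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(volume_ml, slope, intercept):
    """Molecular weight (kDa) predicted by the calibration line."""
    return 10 ** (slope * volume_ml + intercept)


# PLACEHOLDER elution volumes (mL); replace with the measured values.
# Nominal molecular weights (kDa) are commonly quoted figures, not study data.
example_standards = [
    (12.0, 669.0),  # thyroglobulin
    (13.5, 440.0),  # ferritin
    (15.0, 232.0),  # catalase
    (16.0, 158.0),  # aldolase
]
slope, intercept = fit_calibration(example_standards)
print(f"estimated MW at 13.6 mL: {estimate_mw_kda(13.6, slope, intercept):.0f} kDa")
```

Because the volumes are placeholders, the printed estimate has no bearing on the ~350-500 kDa value reported in Figure S1; the point is only the shape of the calculation.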

Expression and purification of recombinant proteins in E. coli. The genes encoding S. solfataricus P2 Cas6 (sso2004), Cas5a (sso1441) and Csa2 (sso1442) were amplified by PCR from genomic DNA. Sso2004 was cloned into pDEST14 using Gateway recombination cloning (Oke et al., 2010). For co-expression of sso1441 and sso1442, a dual expression vector, pRSFDuetHISTEV, was constructed by adding a DNA sequence coding for six histidines, a spacer, and a TEV protease cleavage site upstream of the first multiple cloning site of the pRSFDuet-1 vector (Novagen). The Sso1441 and Sso1442 dual expression construct pRSFD1441-1442 was made by cloning sso1441 and sso1442 into pRSFDuetHISTEV using the restriction sites BspHI/BamHI and NdeI/XhoI, respectively. The cloned sso1441 and sso1442 genes were sequenced to confirm their integrity. This vector expresses Sso1441 with an N-terminal TEV-cleavable His tag, while Sso1442 is expressed in its native form.

Expression of both Cas6 and the Csa2/Cas5a complex was achieved in E. coli BL21(DE3) cells by induction at OD600 = 0.6 with 0.4 mM IPTG and overnight incubation at 25 °C. For the purification of Cas6, cells were harvested and disrupted by sonication in 50 mM Tris-HCl pH 7.5, 500 mM NaCl, 10 mM imidazole, 10 % glycerol, complete EDTA-free protease inhibitor tablets (Roche) and 1 mg/ml lysozyme. The recombinant protein was purified from the soluble fraction by nickel-chelate affinity chromatography on a 5 ml HisTrap HP column (GE Healthcare) and gel filtration on a Superdex 200 10/300 column (GE Healthcare) in 50 mM Tris-HCl pH 7.5, 150 mM NaCl, 10 % glycerol. For the purification of the Csa2/Cas5a complex, the cells were lysed by sonication in 20 mM NaHPO4 pH 7.4, 500 mM NaCl, 10 % glycerol, complete EDTA-free protease inhibitor tablets (Roche) and 1 mg/ml lysozyme. The soluble fraction was heated at 65 °C for 20 min and centrifuged at 40,000 rpm, and the recombinant complex was purified from the supernatant by successive chromatography steps on 5 ml HisTrap HP, Superdex 200 10/300 and HiTrap Heparin columns (GE Healthcare).
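The centrifugation step for the Csa2/Cas5a complex is stated in rpm, whereas the other spins in these methods are given in × g. The two are related by the standard expression RCF = 1.118 × 10^-5 × r × rpm^2, with r the rotor radius in cm. The sketch below converts in both directions. The rotor is not specified in the text, so the radius used in the example is purely hypothetical.

```python
import math

# Standard conversion constant for radius in cm and speed in rpm.
_RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) at the given radius."""
    return _RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (_RCF_CONSTANT * radius_cm))


# HYPOTHETICAL rotor radius; substitute the value for the rotor actually used.
ASSUMED_RADIUS_CM = 8.0

print(f"40,000 rpm at r = {ASSUMED_RADIUS_CM} cm: "
      f"{rcf_from_rpm(40_000, ASSUMED_RADIUS_CM):,.0f} x g")
print(f"30,000 x g at r = {ASSUMED_RADIUS_CM} cm: "
      f"{rpm_from_rcf(30_000, ASSUMED_RADIUS_CM):,.0f} rpm")
```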

For crystallographic studies, sso1442 was cloned with a minimal N-terminal non-cleavable His6 tag into pDEST14 using site-specific recombination (Gateway, Invitrogen) and a nested PCR protocol described previously (Kraft et al., 2004). The internal forward and reverse primers were CCATGCATCACCATCACCATCACATGATAAGCGGTTCAGTTAGGTTTTTGGTA and CACTTTGTACAAGAAAGCTGGGTCCTACTCTTCCTCTAATTTAACTACTAAGTC, respectively, while the external forward and reverse primers were GGGGACAAGTTTGTACAAAAAAGCAGGCTTCGAAGGAGATAGAACCATGCATCACCATCACCATCAC and GGGGACCACTTTGTACAAGAAAGCTGGGTCCTA, respectively. Sso1442 was expressed in E. coli BL21(DE3)-pLysS using ZYP-5025 autoinduction media (Studier, 2005). Cells were resuspended in lysis buffer [20 mM Tris, 400 mM NaCl, pH 8.0, 0.1 mM PMSF] and lysed by passage through a French Press. The lysate was incubated at 65 °C for 20 minutes and then clarified by centrifugation at 22,000 × g for 30 minutes. The supernatant was applied to a gravity-flow column containing a 1-3 ml bed volume of Ni-NTA agarose (Qiagen). The column was washed with 8 column volumes of wash buffer (20 mM Tris, 400 mM NaCl, pH 8.0) and Sso1442 was eluted in 10 mM Tris, pH 8.0, 200 mM NaCl and 200 mM imidazole. Gel-filtration chromatography was done on a Superdex S-75 column (GE Healthcare Life Sciences) equilibrated with 10 mM Tris (pH 8.0) and 200 mM NaCl. Protein concentrations were determined by the Bradford assay (Bradford, 1976) using Protein Assay Reagent (Bio-Rad) with BSA as a standard. The purity and molecular weight of Sso1442 were confirmed by SDS-PAGE.
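In a nested PCR of the kind described for the sso1442 crystallography construct, each external primer has to anneal to the end introduced by the corresponding internal primer. One quick way to sanity-check the primer sequences listed above is to measure how many bases at the 3' end of each external primer match the 5' end of the internal primer. The sketch below does this; the function name is illustrative and the check is not part of the original protocol.

```python
def suffix_prefix_overlap(outer: str, inner: str) -> int:
    """Length of the longest suffix of `outer` that is also a prefix of `inner`."""
    for k in range(min(len(outer), len(inner)), 0, -1):
        if outer.endswith(inner[:k]):
            return k
    return 0


# Primer sequences copied from the paragraph above (5' to 3').
INTERNAL_FWD = "CCATGCATCACCATCACCATCACATGATAAGCGGTTCAGTTAGGTTTTTGGTA"
INTERNAL_REV = "CACTTTGTACAAGAAAGCTGGGTCCTACTCTTCCTCTAATTTAACTACTAAGTC"
EXTERNAL_FWD = (
    "GGGGACAAGTTTGTACAAAAAAGCAGGCTTCGAAGGAGATAGAACCATGCATCACC"
    "ATCACCATCAC"
)
EXTERNAL_REV = "GGGGACCACTTTGTACAAGAAAGCTGGGTCCTA"

fwd_overlap = suffix_prefix_overlap(EXTERNAL_FWD, INTERNAL_FWD)
rev_overlap = suffix_prefix_overlap(EXTERNAL_REV, INTERNAL_REV)
print(f"forward primers overlap by {fwd_overlap} nt: {INTERNAL_FWD[:fwd_overlap]}")
print(f"reverse primers overlap by {rev_overlap} nt: {INTERNAL_REV[:rev_overlap]}")
```

An overlap of zero for either pair would indicate a transcription error in one of the primers.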

Supplemental References

Bradford, M.M. (1976) A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem, 72, 248-254.

Brock, T.D., Brock, K.M., Belly, R.T. and Weiss, R.L. (1972) Sulfolobus: a new genus of sulfur-oxidizing bacteria living at low pH and high temperature. Arch Mikrobiol, 84, 54-68.

Grissa, I., Vergnaud, G. and Pourcel, C. (2007) The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats. BMC Bioinformatics, 8, 172.

Jore, M.M., Lundgren, M., van Duijn, E., Bultema, J.B., Westra, E.R., Waghmare, S.P., Wiedenheft, B., Pul, U., Wurm, R., Wagner, R., Beijer, M.R., Barendregt, A., Zhou, K., Snijders, A.P.L., Dickman, M.J., Doudna, J.A., Boekema, E.J., Heck, A.J.R., van der Oost, J. and Brouns, S.J. (2011) Structural basis for CRISPR RNA-guided DNA recognition by CASCADE. Nature Structural and Molecular Biology, In Press.

Kraft, P., Kummel, D., Oeckinghaus, A., Gauss, G.H., Wiedenheft, B., Young, M. and Lawrence, C.M. (2004) Structure of D-63 from Sulfolobus spindle-shaped virus 1: surface properties of the dimeric four-helix bundle suggest an adaptor protein function. J Virol, 78, 7438-7442.

Oke, M., Carter, L.G., Johnson, K.A., Liu, H., McMahon, S.A., Yan, X., Kerou, M., Weikart, N.D., Kadi, N., Sheikh, M.A., Schmelz, S., Dorward, M., Zawadzki, M., Cozens, C., Falconer, H., Powers, H., Overton, I.M., van Niekerk, C.A., Peng, X., Patel, P., Garrett, R.A., Prangishvili, D., Botting, C.H., Coote, P.J., Dryden, D.T., Barton, G.J., Schwarz-Linek, U., Challis, G.L., Taylor, G.L., White, M.F. and Naismith, J.H. (2010) The Scottish Structural Proteomics Facility: targets, methods and outputs. J Struct Funct Genomics, 11, 167-180.

Studier, F.W. (2005) Protein production by auto-induction in high density shaking cultures. Protein Expr Purif, 41, 207-234.
