Supplemental Data for: Profiling of engineering hotspots identifies an allosteric CRISPR-Cas9 switch

Benjamin L. Oakes1, Dana C. Nadler1, Avi Flamholz1, Christof Fellmann1, Brett T. Staahl1, Jennifer A. Doudna1-5 & David F. Savage1,5,6.

1Department of Molecular and Cell Biology, University of California, Berkeley, California, USA. 2Howard Hughes Medical Institute, University of California, Berkeley, California, USA. 3Innovative Genomics Initiative, University of California, Berkeley, California, USA. 4Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA. 5Department of Chemistry, University of California, Berkeley, California, USA. 6Energy Biosciences Institute, University of California, Berkeley, California, USA.

* To whom correspondence should be addressed. Email: [email protected].

Nature Biotechnology: doi:10.1038/nbt.3528



[Figure S1, panels a-d. Workflow labels (steps 1-6): in vitro reaction with engineered transposon, Mu transposase and a dCas9-bearing plasmid; insertion; subclone dCas9 to recover only Cas9s with an insertion event; remove transposon via BsaI digest; ligate in ORF of interest with proper sticky ends and desired linkers; domain insertion library for screening. Linker labels in panel d: Ala followed by Pro/Ala/Ser; Ser followed by Pro/Ala/Ser.]

S1 | Construction of a transposition library. (a) Schematic of the engineered Mu transposon used for transposition. (b) Schematic showing the inserted transposon. (c) Excision of an intra-Cas9 transposon using the Type II-S restriction enzyme BsaI. Any ORF can subsequently be ligated into these sticky ends. (d) Description of the sticky ends and linkers used to ligate in the PDZ and ER-LBD domains. Ala and Ser are hardcoded on each side, providing the correct sticky ends for a Golden Gate ligation, and additional diversity is provided by BCT codons (encoding Ala, Ser, or Pro). In total, there are 13 possible amino acid variations per terminus.
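As a small illustration of the degenerate BCT codon mentioned in panel (d), the sketch below expands it with the IUPAC nucleotide code (B = C, G or T) and translates the resulting codons with the standard genetic code. The helper name `expand` is hypothetical, and the sketch does not attempt to reproduce the count of 13 variations per terminus, since the number of BCT codons per linker is not stated here.

```python
# IUPAC nucleotide codes needed here: B means "C, G or T" (not A).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "B": "CGT"}

# Standard genetic code entries for the three codons that match BCT.
CODON_TO_AA = {"CCT": "Pro", "GCT": "Ala", "TCT": "Ser"}


def expand(degenerate_codon):
    """Return every concrete codon matched by a degenerate codon (hypothetical helper)."""
    codons = [""]
    for base in degenerate_codon:
        codons = [prefix + b for prefix in codons for b in IUPAC[base]]
    return codons


bct_codons = expand("BCT")
bct_residues = sorted(CODON_TO_AA[c] for c in bct_codons)
print(bct_codons, bct_residues)
```

Each BCT position therefore contributes one of three small residues, which is consistent with the caption's "Ala, Ser, or Pro".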


[Figure S2. Panel a: normalized read counts for productive insertions in the transposition library. Panel b: normalized read counts for productive insertions in the PDZ insertion libraries, arranged by positive selection round.]

S2 | Deep sequencing of the transposition, naïve PDZ and selected PDZ libraries. (a) Sequencing and alignment of the transposon insertion library indicates coverage across the Cas9 coding sequence, with a bias for insertion towards the C-terminus. We observe 973 productive insertions covering 71% of all possible sites. (b) Deep sequencing of the PDZ libraries: naïve (pre-screened) and after one and two rounds of CRISPRi screening. Sequencing of the naïve PDZ library indicates that good coverage of the protein is maintained upon cloning in the PDZ domain, with 953 productive insertions observed. Upon screening, many clones are cleared out of the library while a smaller number are enriched. All read counts represent the corrected replicate averages generated from DESeq.
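The coverage figure in panel (a) can be sanity-checked with a short calculation. The sketch below assumes that the number of possible insertion sites equals the length of S. pyogenes Cas9 (1,368 amino acids); the caption does not state how "all possible sites" was counted, so this denominator is an assumption, and the function name is hypothetical.

```python
# Assumption: one possible insertion site per residue of S. pyogenes Cas9.
SPCAS9_LENGTH_AA = 1368

# Counts of productive insertions quoted in the caption.
PRODUCTIVE_TRANSPOSITION = 973
PRODUCTIVE_NAIVE_PDZ = 953


def coverage_percent(observed_sites, possible_sites):
    """Percentage of possible sites with at least one observed insertion."""
    return 100.0 * observed_sites / possible_sites


transposition_cov = coverage_percent(PRODUCTIVE_TRANSPOSITION, SPCAS9_LENGTH_AA)
naive_pdz_cov = coverage_percent(PRODUCTIVE_NAIVE_PDZ, SPCAS9_LENGTH_AA)
print(f"transposition: {transposition_cov:.1f}%, naive PDZ: {naive_pdz_cov:.1f}%")
```

Under that assumption the transposition library lands at roughly the 71% quoted above, and the naïve PDZ library is only slightly lower.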


[Figure S3. Panel a: Alpha-syntrophin PDZ protein interaction domain with recognition peptide; N- and C-termini marked. Panel b: PDZ-based recruitment. Labels: protein of interest, linker, signal sequence (-GVKESLV-COOH), PDZ domain, Cas9/dCas9. Recruited cargo shown: processive enzymes; HR enzymes; NHEJ proteins; stacked recruitment (chromatin remodeling and any of the above). Outcome label: modulation of cleavage and repair, epigenetic state, gene repression/activation.]

S3 | The PDZ domain as a potential scaffolding element. (a) The alpha-syntrophin PDZ protein interaction domain. The adjacent N- and C-termini and its peptide ligand are depicted (PDB ID: 2PDZ; ref. 16). (b) PDZ-based recruitment. The PDZ domain is a protein interaction domain that specifically recognizes a seven amino acid C-terminal motif that can be modularly attached to any protein of interest. This provides a mechanism by which one or many different proteins can be recruited to a cleavage site or binding site to increase their local concentration. This may allow for the recruitment of processive proteins (such as helicases), DNA repair enzymes (such as HR and NHEJ machinery, Rad51 and Ku70/80), activators/repressors or epigenetic modifying machinery, and even libraries of multiple protein domains fused to the PDZ recruitment amino acid sequence.


[Figure S4, panels a-c. Labels: CRISPRi E. coli screening platform; functional dCas9 insertion clones: GFP only, no RFP; defective dCas9 insertion clones: RFP & GFP expression. Flow cytometry histogram axes: RFP (au) vs count; series: dCas9, vector control. Plate images: non-functional dCas9 (RFP & GFP expression); functional dCas9 (GFP expression only).]

S4 | CRISPRi screening protocol controls. (a) Schematic of the E. coli screening platform for determining DNA-binding competent dCas9 insertion mutants. (b) Flow cytometry of RFP repression by dCas9. A 20-fold change in RFP fluorescence is observed when dCas9 is present for 6+ hours, similar to the results reported in Qi et al. 2013 (ref. 18); this provides a clear signal by which to select functional Cas9 insertion mutants. (c) On-plate screening of RFP repression by dCas9. The lack of RFP signal is visible by eye when screening colonies after overnight growth on plates.


[Figure S5. Panel a: histogram of fold changes for insertions upon cloning from transposition to the naïve PDZ library. Panel b: histogram of fold changes for the PDZ library insertions upon two rounds of screening. Fold-change axis ticks run from -15 to 10; y-axes: normalized counts. Series: productive insertions, non-productive insertions. Inset: productive insertions (%) vs selection round (0, 1, 2), with values 1.9, 23.6 and 42.1.]

S5 | Enrichment for productive clones during domain insertion profiling of the PDZ-insertion library. (a) Histogram of fold changes from the transposition to naïve PDZ libraries. The transposition library technique will theoretically create out-of-frame or reverse insertions that do not code for full-length proteins ~5/6 of the time. During library outgrowth and cloning of the PDZ library we observe depletion of clones with in-frame, forward ('productive') insertions. This is presumably due to the cost of producing full-length, >1400 amino acid, DNA-binding proteins. (b) Passage of the PDZ insertion library through rounds of screening enriches productive insertions and substantially depletes non-productive insertions. Thus the CRISPRi-based screen selects for full-length coding sequences that can translate into full Cas9 proteins. Inset: percentage of productive insertions in the library during each round. All counts represent the normalized and corrected replicate averages generated from DESeq; error bars represent the standard deviation of four technical replicates.
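The "~5/6" figure in panel (a) follows from counting insertion outcomes. The sketch below enumerates three reading frames and two orientations and assumes, for illustration only, that all six combinations are equally likely; labeling frame 0 as the in-frame case is an arbitrary convention, and the names are hypothetical.

```python
from fractions import Fraction
from itertools import product

FRAMES = (0, 1, 2)                      # frame 0 is taken to be "in frame"
ORIENTATIONS = ("forward", "reverse")


def is_productive(frame, orientation):
    """An insertion is productive only if it is in frame and in the forward direction."""
    return frame == 0 and orientation == "forward"


outcomes = list(product(FRAMES, ORIENTATIONS))
productive = [o for o in outcomes if is_productive(*o)]
fraction_nonproductive = Fraction(len(outcomes) - len(productive), len(outcomes))
print(fraction_nonproductive)
```

Only one of the six combinations encodes a full-length fusion, so five in six insertions are expected to be non-productive before any selection is applied.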


[Figure S6. Panel a: selection against the PAM pocket. Panel b: selection against core packing regions (view rotated 180°). Color legend, log2 fold change: 6.00 to 4.50; 4.49 to 3.00; 2.99 to 1.00; 0.99 to -0.99; -1.00 to -4.99; -5.00 to -10.0; no longer observed; N.S. Structure labels: target DNA, sgRNA, Helical-I, Helical-II, Helical-III, RuvC, CTD.]

S6 | PDZ-insertion sites avoid critical structural motifs. (a) Mapping the log2 fold change of the PDZ insertions onto the RNA:DNA holo Cas9 crystal structure (PDB ID: 4UN3; ref. 11) demonstrates how domain insertion profiling can experimentally delineate regions of critical structure and function. The PAM binding pocket (PAM residues in orange) and the RNA:DNA channel are selected against (blue). (b) PDZ insertions are not readily observed in core packing regions of the RuvC and Helical-III domains and are also selected against in the sgRNA binding grooves (Fig. 1C) and Arg helix.


[Figure S7. Panel a: insertion sites by domain, shown in two views related by a 180° rotation. Legend: enrichment ≥ 2-fold. Primary sequence bar, N-terminus to C-terminus: helical recognition (REC) lobe and nuclease (NUC) lobe segments. Structure labels: REC lobe, NUC lobe.]

S7 | Map of enriched PDZ-insertion sites by Cas9 domain. (a) Mapping the enriched insertion sites onto an RNA:DNA-bound Cas9 crystal structure (PDB ID: 4UN3). Red denotes statistically enriched sites (p < 0.01, Figure 1B) with greater than two-fold enrichment. Domains are colored according to the primary sequence bar. Many sites with high fold change mapped to amino acids that are unresolved in the crystal structures and are presumably in unstructured loops (ref. 11).


[Figure S8. Panel a: PDZ domain insertions into secondary structures, by model.]

S8 | Secondary structure of enriched PDZ insertions. (a) Secondary structure annotation for each atomic model (PDB IDs: 4CMP (ref. 48), 4OO8 (ref. 10), 4UN3 (ref. 11) and 4ZT0 (ref. 34)) was determined using STRIDE (ref. 49). In each structure, insertions occur in all annotated types of structural elements. The fraction of insertions into each element was also compared to the overall prevalence of that element in the model, and P-values were calculated using a null model in which insertions were picked at random from the structure. For every structure except 4ZT0, insertions into regions that could not be modeled (not resolved) were statistically overrepresented. This is an interesting finding that could inform future rational engineering efforts. In 4ZT0, where many more residues were resolved, insertions into turn elements were found to be overrepresented.
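One way to picture the null model described above is an exact hypergeometric tail: if a given number of distinct insertion sites are drawn at random from all residues in a model, how often would at least the observed number fall in a particular structural element? The sketch below is only an illustration under that assumption (sampling without replacement); the caption does not say which test was actually used, and the function name and example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(total_sites, element_sites, n_insertions, observed_in_element):
    """P(X >= observed) when n_insertions distinct sites are drawn uniformly at random.

    total_sites: residues in the model; element_sites: residues annotated with the
    structural element of interest (both hypothetical inputs for this sketch).
    """
    denominator = comb(total_sites, n_insertions)
    upper = min(element_sites, n_insertions)
    tail = sum(
        comb(element_sites, k) * comb(total_sites - element_sites, n_insertions - k)
        for k in range(observed_in_element, upper + 1)
    )
    return tail / denominator


# Toy numbers, not data from the figure: 10 sites, 4 in the element, 3 insertions.
p_toy = enrichment_p_value(10, 4, 3, 2)
print(p_toy)
```

A small P-value from a calculation of this kind would indicate that the element receives more insertions than its prevalence in the model alone would predict.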


[Figure S9. Panel a: protein activity vs enrichment. X-axis: enrichment (fold), log scale. Left y-axis (binding): GFP (au)/OD600. Right y-axis (cleavage): CFU/mL. Controls: vector and dCas9 (binding); dCas9 and Cas9 (cleavage). Regions marked: depleted clones, enriched clones. Points are labeled by insertion site: 127, 204, 208, 495, 732, 757, 801, 953, 1016, 1027, 1049, 1183, 1227, 1240, 1260, 1267, 1291.]

S9 | PDZ insertion clone activity vs fold change. (a) The data from Figures 1D and 1E have been graphed according to the enrichment (fold change) score for each construct (Supplementary Table 1). Binding data, based on repression of a genomic GFP, are graphed on the left y-axis in blue; the positive and negative controls, dCas9 and vector, are dark blue. Cleavage data for each construct are mapped onto the right y-axis and colored in green, with the positive and negative controls, Cas9 and dCas9 respectively, in dark green. PDZ-Cas9 activity levels plateau (at > ~4-fold enrichment) near wt Cas9 activity levels for each assay despite further increases in enrichment. A monotonic relationship between fold change and binding/cleavage ability is readily apparent. The Spearman correlation for the cleavage data is r = -0.85 (p < 0.0001) and for the binding data r = -0.76 (p = 0.0006), indicating a strong correlation between enrichment and increased ability to both bind and cleave the genome of E. coli.
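For readers unfamiliar with the statistic, the Spearman coefficient is the Pearson correlation computed on ranks, which is why it suits a monotonic but non-linear (plateauing) relationship. The sketch below is a minimal pure-Python version with average ranks for ties; the function names and the example values are hypothetical and are not data from the figure.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for illustration only: the reporter signal falls as enrichment rises.
fold_enrichment = [0.5, 1.0, 2.0, 4.0, 8.0]
reporter_signal = [900.0, 700.0, 300.0, 120.0, 100.0]
rho = spearman(fold_enrichment, reporter_signal)
print(rho)
```

Because both readouts in this figure decrease as Cas9 activity increases (GFP is repressed, and colony counts are plotted as CFU/mL after cleavage), a negative coefficient corresponds to higher activity at higher enrichment.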


[Figure S10. Panel a: SH3 insertions. Panel b: stacked PDZ and SH3 insertions. Y-axis: GFP (au)/OD600, log scale (10^2 to 10^4). Groups: 0.2, 2, 20 and 200 nM aTc. Panel a x-axis (AA site): vector, dCas9, 202, 238, 468, 804, 1058, 1071, 1191, 1244. Panel b x-axis: vector; dCas9; PDZ-208, PDZ-1027; PDZ-208, PDZ-1260; PDZ-1027, PDZ-1260; SH3-468, PDZ-1027; PDZ-208, SH3-1244; PDZ-208, PDZ-1027, PDZ-1260; SH3-468, PDZ-1027, PDZ-1260; PDZ-208, PDZ-1027, SH3-1244; PDZ-1027, SH3-1244, PDZ-C termini; PDZ-208, PDZ-1027, SH3-1244, PDZ-C termini.]

S10 | Testing of the SH3 and stacked domain insertions. (a) The crk SH3 domain (ref. 17) was cloned into 8 enriched sites in dCas9 with linkers that mimic those used for the PDZ domain insertions (Supplementary Table 2). These clones were then tested for GFP repression ability at increasing levels of Tet promoter induction. While the SH3 insertions are less functional than expected at very low levels of induction, all but two, at sites 238 and 804, are able to attain GFP repression levels at or near (±5%) WT dCas9 at higher levels of induction (based on mean repression, 238 maintains 51% and 804 maintains 88% of dCas9 activity). (b) Previously validated PDZ and SH3 domain insertions were cloned into a single dCas9 and tested for GFP repression ability. These stacked domain insertions are able to maintain DNA binding and gene repression ability, with 3 at or near (±5%) WT dCas9, 3 at >80% of dCas9 and 4 between 58 and 79% of dCas9 activity. Error bars represent one standard deviation of biological triplicates. Vector and dCas9 controls at each induction level are shown next to the group in grey and orange, respectively.


[Figure S11. Panel a: ER-LBD conformations: apo, agonist (β-E, estrogen) and antagonist (4-HT, 4-hydroxytamoxifen). The 'hormone arm' and the N- and C-termini are marked in each structure. Distance labels: 64 Å, 37 Å, 21 Å.]

S11 | Conformations of the estrogen receptor alpha ligand binding domain (ER-LBD). (a) Conformations of the ER-LBD. Crystal structures of the apo, 17-beta-estradiol-bound and 4-hydroxytamoxifen-bound ER-LBD (PDB IDs: 1A52, 1GWR and 3ERT, respectively; refs. 22-24) clearly demonstrate the range of conformational change this domain can undergo. The 'hormone arm,' or helix 12, places the modeled N- and C-termini up to 63 Å apart in the apo form, 37 Å in the agonist-bound form (β-E, blue), and 21 Å in the antagonist-bound structure (4-HT, red).


[Figure S12, panels a and b. Label: arC9 insertion site, AA 231.]

S12 | arC9 hit. (a) After rounds of screening and counter-screening (Figure 2A), clones were picked from plates and tested in a 96-well assay for switch-like behavior. One clone with the ER-LBD insertion site at amino acid 231 (indicated in red) demonstrated a change in activity upon addition of 4-HT and was further validated (PDB ID: 4UN3; ref. 11). (b) Position 231 was also found to be enriched in the PDZ domain insertion profile (Supplementary Table 1).


[Figure S13, panels a-c. Panel a: 75 ng transfection into HEK 293T d2EGFP cells, 48 hr timepoint; wtCas9 and arC9:231 with GFP-targeting guide + DMSO, GFP-targeting guide + 4-HT, and non-targeting guide; GFP-disrupted and GFP-positive gates; controls overlay and arC9 dose overlay. Panel b: mCherry (au) vs GFP (au) plots with transfection gate; Lipofectamine-only transfection; arC9:231 DMSO only; arC9:231 with DMSO and 4-HT overlay. Panel c: optimizing transfections into HEK 293T; ideal expression window; wtCas9 controls, GFP-targeting arC9, overlay; GFP-targeting guide and non-targeting guide; GFP-disrupted and transfected gates.]

S13 | Optimization of arC9+NLS expression levels. (a) Upon transfection of 75 ng of plasmid into the HEK293T cell line and measurement of EGFP fluorescence via flow cytometry at 48 hours, we observed that 2xNLS-Cas9 disrupts ~70-80% of EGFP signal regardless of treatment condition. 2xNLS-arC9 also disrupted GFP signal, with a 2-fold increase upon the addition of 4-HT. (b) We detected that 4-HT-induced activation is seen at low levels of arC9-mCherry expression, but higher levels of arC9-mCherry expression cause GFP disruption regardless of the presence of ligand. Specifically, the mCherry signal for the arC9-transfected cells in which GFP was disrupted regardless of 4-HT treatment was 8.5x greater than that of the non-disrupted cells. Therefore we optimized transfection conditions and were able to reduce the expression levels to approximate the ideal gates posed in (c) by lowering the plasmid transfection to 5 ng of DNA. This resulted in significantly less background activity while maintaining 4-HT activation (main Fig. 3B).


[Figure S14. Panel a: bar chart. Y-axis: EGFP disruption (%), 0 to 100. Ligand conditions per group: DMSO, B-E, 4-HT. Protein constructs, each with EGFP-targeting and non-targeting sgRNA: wtCas9 2xNLS; wtCas9: no NLS; wtCas9-ER LBD C-fusion; arC9: no NLS. Significance marks: * < .01, *** < .001, N.S.]

S14 | Comparison between termini and domain insertion fusions. (a) To compare the capabilities of terminal fusions with the arC9 domain insertion, the ER-LBD was cloned as a C-terminal fusion to wtCas9 without an NLS. Examination of the ability to control Cas9 activity in the presence of distinct ligands reveals that the ER-LBD C-terminal fusion can increase Cas9-based EGFP disruption from 50.7 ± 2.5% in DMSO to 61.3 ± 1.5% in 4-HT (p = 0.003). This induction is compared to that of arC9, which induces Cas9 EGFP disruption from background levels (equivalent to non-targeting guide) in DMSO and B-E to 40.3 ± 1.5% in 4-HT (p < 0.001). All data are represented as the mean of biological triplicates; error is one standard deviation.


[Figure S15. Panel a: DYRK1 genomic locus, gel with quantification. Y-axis: indel (%), 0 to 60. Lane key (rows Cas9, arC9, sgRNA, 4-HT): +---, +--+, +-+-, +-++, -+--, -+-+, -++-, -+++. L: ladder lane. Five lanes are marked N.D.]

S15 | T7EI assay of the Cas9 and arC9 cleaved DYRK1 genomic locus. (a) Cas9 and arC9 targeting of a second genomic locus was carried out for 72 hrs in triplicate. T7EI assays were then performed on the recovered genomic DNA. Cas9 with targeting guides causes indels regardless of treatment condition, while arC9 cleaves a genomic locus only in the presence of 4-HT. Quantification and error bars represent the standard deviation of biological replicates. N.D. signifies not detected (below the detection limit of the assay). L is a lane with a 100 bp ladder.


[Figure S16, panels a-d. Lentiviral vector maps: pBC2101 lenti: U6 sgRNA, EFS, 3xFLAG, NLS, SpyCas9, NLS, T2A, mCherry, P2A, Hygro; pBC2102 lenti: U6 sgRNA, EFS, 3xFLAG, SpyCas9, T2A, mCherry, P2A, Hygro; pBC2103 lenti: U6 sgRNA, EFS, 3xFLAG, arC9, T2A, mCherry, P2A, Hygro. Timeline (days 0, 4, 8, 12, 16, 20, 24, 26, 28, 32): infect reporter cells; (selection); flow; induction +/- 4-HT; infect second sgRNA; T7E1; phases: assess leakiness of arC9, assess induction, reversibility (recovery). Leakiness plots (BNL-LMP-15 with pBC2101, pBC2102, pBC2103; guides sgGFP1, sgGFP2, sgRen71): GFP-negative cells (%) at days 8, 12, 16, 20 and 24, and median fluorescence [AU] at day 12. Induction plot (BNL-LMP-15 pBC2103, arC9 induction): GFP-negative cells (%) at day 8 and day 12 for DMSO, β-E and 4-HT with guides sgGFP1, sgGFP2, sgRen71.]

S16 | 4-hydroxy-tamoxifen inducible arC9 lentiviral vector analysis. (a) Lentiviral vector maps. (b) Timeline of arC9 assessment with regard to leakiness, inducibility and reversibility of arCas9 activity. (c) Quantification of arC9 leakiness. The fraction of GFP-negative BNL-LMP-15 reporter cells with either Cas9 or arC9 was quantified at the indicated time points. For arC9 without ligand, no leakiness is observed. sgRen71 is a non-targeting, negative control guide (n > 10,000 events for each measurement). (d) Quantification of arC9 induction after testing for leakiness. The BNL-LMP-15 pBC2103 cells were treated with DMSO, beta-estradiol (β-E) or 4-hydroxy tamoxifen (4-HT) for the indicated amount of time and the amount of GFP disruption was quantified. sgRen71 is a non-targeting, negative control guide (n > 10,000 events for each measurement). (e) T7E1 quantification of editing at the Pcsk9 locus with secondary sgRNAs, 6 days after infection and start of secondary arC9 re-induction (see timeline for details; secondary induction in yellow). Regardless of the three previous guide conditions, cleavage with sgPcsk9-7 is observed in the presence of 4-HT. Small amounts of cleavage are also detected in the DMSO control. sgPcsk9-1 does not show activity. Arrows indicate the size of the canonical cleavage products for sgPcsk9-7 (414 bp, 180 bp).

[Figure S16, panel e: arCas9 reversibility and re-induction, BNL-LMP-15 pBC2103. Primary sgRNA: sgGFP1, sgGFP2, sgRen71 (repeated within each group). Secondary sgRNA: Pcsk9-1, Pcsk9-7, Ren71, under DMSO and under 4-HT. Editing (%), in lane order: 0, 0, 0, 0.5, 1.4, 0.2, 0, 0, 0, 0, 0, 0.9, 8.6, 5.1, 15.0, 0, 0, 0.]


[Figure S17, panels a-c. PDB IDs: 4ZT0, 4UN3. Labels: Helical-II domain overlay; DNA binding channel occlusion by Helical-II domain; arC9 insertion site in Helical-II domain; N- and C-termini; RNA; DNA. Annotations: translation and rotation of the Helical-II domain is required to prevent steric occlusion of the DNA binding channel; ER-LBD bound by 4-HT is permissive of normal Helical-II translocation, while apo or B-E bound ER-LBD alters proper Helical-II function.]

S17 | Model for arC9-based regulation. (a) PDB models 4ZT0 (ref. 34; black) and 4UN3 (ref. 11; white) are Cas9 structures that are, respectively, sgRNA-bound only and sgRNA- and DNA-bound. When the two are aligned, it is possible to observe that in the RNA-only bound structure the Helical-II domain, highlighted using surface representation, sterically occludes the channel in which the DNA will be unwound and bind to the sgRNA. It is therefore assumed that this domain must rearrange and vacate this channel in order to unwind DNA. This provides a possible mechanism by which arC9 ER-LBD control may be accomplished. (b) Within the Helical-II domain there is an alpha helix and adjacent residues (residues 255-283) that must translate ~20-30 Å and rotate from the position highlighted in red to that in blue in order to vacate the DNA binding channel. (c) The insertion of the apo or B-E bound ER-LBD at site 231 (highlighted in purple) could affect the ability of Cas9 to undergo this conformational change, thus rendering it unable to unwind, bind and cleave DNA targets. Upon the addition of 4-HT, the insertion no longer perturbs the un-shielding of the DNA channel by the Helical-II domain, and thus DNA binding and cleavage are restored.


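[Figure S18. Panel a: previously reported sites marked on the hotspot map. Site labels: Nihongaki et al. 713; Wright et al. 714; Davis et al. 219; Zetsche et al. 573; Davis et al. 574; Truong et al. 638. Legend: intein insertion/fusion; split protein.]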

S18 | Comparison of previously identified sites with hotspots identified in this study. (a) Amino acid sites rationally identified and utilized in previous studies that have split Cas9, or introduced inteins or split inteins into the protein, have been mapped onto the hotspots identified with the PDZ insertion (refs. 13, 38-40, 50).


48. Jinek, M. et al. Structures of Cas9 Endonucleases Reveal RNA-Mediated Conformational Activation. Science 343, 1247997 (2014).

49. Heinig, M. & Frishman, D. STRIDE: a web server for secondary structure assignment from known atomic coordinates of proteins. Nucleic Acids Res. 32, W500–2 (2004).

50. Wright, A. V. et al. Rational design of a split-Cas9 enzyme complex. Proc. Natl. Acad. Sci. U. S. A. 112, 2984–9 (2015).
