1 Investigating Alkaline Phosphatase and Ketosteroid Isomerase by Rational Design A thesis presented by Nicholas A. DeLateur to The Department of Chemistry and Chemical Biology in partial fulfillment of the requirements for the degree of Master of Science in the field of Chemistry Northeastern University Boston, Massachusetts August 8, 2013


  • 1

    Investigating Alkaline Phosphatase and Ketosteroid Isomerase

    by Rational Design

    A thesis presented

    by

    Nicholas A. DeLateur

    to

    The Department of Chemistry and Chemical Biology

    in partial fulfillment of the requirements for the degree of

    Master of Science in the field of Chemistry

    Northeastern University

    Boston, Massachusetts

    August 8, 2013

  • 2

    © Copyright 2013

    Nicholas A. DeLateur

    All rights reserved

  • 3

    Investigating Alkaline Phosphatase and Ketosteroid Isomerase

    by Rational Design

    by

    Nicholas A. DeLateur

    ABSTRACT OF THESIS

    Submitted in partial fulfillment of the requirements for the degree

    of Master of Science in Chemistry and Chemical Biology

    in the College of Science of Northeastern University,

    August 8, 2013

  • 4

    Abstract

    Enzymes catalyze chemical reactions many orders of magnitude faster than the

    uncatalyzed reaction and are capable of doing so at physiological pH and temperature. As

    enzymes consist of hundreds of amino acids, the ability to identify which residues contribute to

    catalysis with high recall and low false positive rates is of critical importance to characterizing

    and engineering enzymes. Theoretical Microscopic Anomalous Titration Curve Shapes

    (THEMATICS) and Partial Order Optimum Likelihood (POOL) are programs developed at

    Northeastern University that can identify the residues contributing to catalysis. THEMATICS

    finds anomalous titration behavior, which correlates with catalytic activity. POOL combines the

    THEMATICS input with geometric and evolutionary predictions to rank each residue by the

    likelihood of its importance for catalysis.

    Alkaline phosphatase (AP) is a protein found in all domains of life which cleaves

    phosphate groups from a broad range of substrates. Ketosteroid isomerase performs an important

    biological function in the metabolism of many bacteria by degrading steroids. THEMATICS and

    POOL predict that alkaline phosphatase and ketosteroid isomerase contain most of their catalytic

    power in the residues directly surrounding the reacting substrate molecule; there is very little

    contribution from distal or remote residues of the protein. This example is in

    stark contrast to phosphoglucose isomerase (PGI) and nitrile hydratase (NH), where

    THEMATICS and POOL predict a multi-layer active site, with residues in the second and third

    shells contributing to activity. The predictions for KSI, PGI, and NH have been experimentally

    validated.

  • 5

    Pseudomonas putida KSI (PpKSI) is strikingly efficient and selective. Three putative

    KSIs identified from Structural Genomics were analyzed by THEMATICS and POOL and then

    characterized in vitro to determine the presence of, or lack of, KSI activity. A putative KSI from

    Mycobacterium tuberculosis (MtKSI) was predicted to have isomerase activity and biochemical

    experiments reveal that the putative M. tuberculosis KSI does indeed possess KSI activity,

    although with reduced efficiency compared to PpKSI.

    To investigate this lower efficiency in the correctly annotated KSI, we engineered the

    MtKSI active site to resemble more closely that of PpKSI under the hypothesis that these

    mutations would increase the activity of MtKSI. However, we found that most of these mutations

    alone or in tandem significantly lowered rather than increased activity. Variants S16Y, F111D,

    S16Y/F64Y, S16Y/F111D, F64Y/F111D, and S16Y/F64Y/F111D lost catalytic power and were

    essentially inactive. Variant F64Y retained catalytic power similar to the wild-type enzyme.

    Although the active sites of MtKSI and PpKSI are similar, our attempts to increase the catalytic

    efficiency by creating a more PpKSI-like active site of MtKSI were not successful.

    Protein engineering relies on the ability to accurately predict sites of function. The best

    predictor for active-site residues is POOL using THEMATICS, INTREPID, and ConCavity

    inputs. We have shown that not only can POOL correctly predict the residues required for

    catalysis, but these predictions can also be used to assign function to proteins whose function is

    unknown or putatively assigned. Even if the residues required for catalysis are known, the ability

    to engineer improved or novel function is still difficult and may require multiple approaches.

  • 6

    Acknowledgments

    I am blessed with not one, but two advisors of extraordinary talent and patience. I am

    forever grateful to Professor Penny Beuning for allowing me to begin work in her lab as a young

    freshman with no experience in chemistry or biology. She has been an unending source of

    mentoring and teaching. Professor Mary Jo Ondrechen has trusted me with project after project,

    encouraging me to investigate and grow as a scientist, for which I will be always grateful.

    Dr. Srinivas Somarowthu performed the herculean task of teaching me both the

    computational and experimental aspects of THEMATICS/POOL, alkaline phosphatase, and

    ketosteroid isomerase. I owe most of my practical knowledge in these areas to Sri, and am

    thankful for the pleasure of meeting and working with him over these past years.

    I want to thank the numerous past and present DNA and ORG lab members, with

    emphasis towards Judith Hollander and Ramya Parasarum for graciously sharing bench space

    and wisdom. Mark Naniong and Colleen Shea experimented on MtKSI as undergraduate

    researchers and their impressive work contributed to the data contained in this thesis.

    Neither this work—nor even my graduation—would be possible without Richard

    Pumphrey, Cara Shockley, Andrew Bean, Jordan Keefe, and Katie Cameron assisting me

    through the NU shuffle and my own shortcomings. Jeff Peterson, Professor Graham Jones,

    Professor Carla Mattos, and Professor O’Doherty have provided me with immensely valuable

    discussion and direction. I believe John Bottomy has forgiven me more than anyone on Earth; I

    cherish his friendship and kindness.

  • 7

    I owe my inspiration and aptitude to my ever-supportive family, especially my parents

    Sandra and Joe. They have been a never-ending source of love. Thank you so much Mom, Dad,

    and Matt, along with Cole and Tiffany.

    Funding that allowed these projects and my research to happen was provided by the

    Office of the Provost at Northeastern University, the Matz Co-op Scholarship, and grants NSF:

    MCB-0843603, CAREER MCB-0845033, and REU MCB-0843603.

  • 8

    Table of Contents

    Abstract ........................................................................................................................................... 3

    Acknowledgments........................................................................................................................... 6

    Table of Contents ............................................................................................................................ 8

    List of Figures ................................................................................................................................. 9

    List of Tables ................................................................................................................................ 11

    List of Abbreviations .................................................................................................................... 12

    Chapter 1. Protein Engineering ..................................................................................................... 16

    1.1. Proteins as catalysts ............................................................................................................ 16

    1.2. Design vs. Redesign; Directed Evolution vs. Rational Design .......................................... 18

    1.3. Functional Site Prediction with THEMATICS and POOL ................................................ 20

    1.4. Catalysis by remote residues .............................................................................................. 24

    Chapter 2. Alkaline Phosphatase .................................................................................................. 26

    2.1. Introduction ........................................................................................................................ 26

    2.2. Computational Predictions ................................................................................................. 28

    2.3. Materials and Methods ....................................................................................................... 33

    2.4. Results ................................................................................................................................ 35

    2.5. Conclusions ........................................................................................................................ 39

    Chapter 3. Ketosteroid Isomerase ................................................................................................. 43

    3.1. Introduction ........................................................................................................................ 43

    3.2. Computational Predictions ................................................................................................. 45

    3.3. Materials and Methods ....................................................................................................... 46

    3.4. Results ................................................................................................................................ 48

    3.5. Conclusions ........................................................................................................................ 53

    Chapter 4. Future Work ................................................................................................................ 56

    4.1. POOL-rank cut-offs............................................................................................................ 56

    Appendix A. Propagation of error in calculating catalytic efficiency .......................................... 59

    References ..................................................................................................................................... 60

  • 9

    List of Figures

    Figure 1.1. Alanine, aspartate, glutamate, and asparagine at pH 7. .............................................. 19

    Figure 1.2. Phenylalanine, tyrosine, and serine at pH 7. .............................................................. 20

    Figure 1.3. A titration curve of mean net charge as a function of pH for select lysine residues in

    E. coli β-lactamase. ................................................................................................................. 22

    Figure 1.4. Diagram of a multi-layered active site. ...................................................................... 25

    Figure 2.1. The active site of alkaline phosphatase based on PDB ID: 1ALK.. ........................... 27

    Figure 2.2. Diagram of Evolutionary Trace and THEMATICS predictions for AP. .................... 28

    Figure 2.3. A POOL plot of POOL score vs. POOL rank for alkaline phosphatase. ................... 30

    Figure 2.4. The 2nd and 3rd shell residues predicted by THEMATICS. ........................................ 32

    Figure 2.5. Primers for site-directed mutagenesis of E. coli alkaline phosphatase.. .................... 33

    Figure 2.6. Standard curve for 4-nitrophenol phosphate .............................................................. 36

    Figure 2.7. Michaelis-Menten plots for AP in 1 M Tris-HCl pH 8.0 buffer. ............................... 37

    Figure 2.8. Catalytic efficiencies of wild-type and variant alkaline phosphatases ....................... 38

    Figure 2.9. AP residues investigated in this work.. ...................................................................... 39

    Figure 2.10. A plot of Table 3 and Table 4 showing effects on catalytic efficiency based on

    POOL rank for AP. ................................................................................................................. 40

    Figure 3.1. Mechanism of KSI based on PpKSI numbering......................................................... 43

    Figure 3.2. Primers for site-directed mutagenesis of MtKSI in plasmid pGST-Rv0760c. ........... 46

    Figure 3.3. Standard curve for 4-androstene-3,17-dione (4AND). ............................................... 49

    Figure 3.4. Michaelis-Menten plots for MtKSI WT and variants................................................. 49

    Figure 3.5. WT and F64Y individual Michaelis-Menten plots. .................................................... 49

    Figure 3.6. Single run of Michaelis-Menten plot for MtKSI F111D. ........................................... 50

  • 10

    Figure 3.7. “Top-down” view of PpKSI. ...................................................................................... 53

    Figure 3.8. Three residues of interest in PpKSI without surrounding secondary structure. ......... 54

    Figure 4.1. POOL plots for AP, KSI, PGI, NH, DnaE and DinB. ................................................ 58

  • 11

    List of Tables

    Table 2.1. POOL predictions for alkaline phosphatase. ............................................................... 29

    Table 2.2. Kinetic assays for alkaline phosphatase....................................................................... 35

    Table 2.3. WT and variant AP kinetic parameters. ....................................................................... 37

    Table 2.4. Summary calculations for WT alkaline phosphatase and variants. ............................. 38

    Table 2.5. 1st shell variants of AP and their catalytic efficiency under comparable conditions to

    our experiments. ...................................................................................................................... 41

    Table 2.6. 2nd and 3rd shell variants of AP and their catalytic efficiency under comparable

    conditions to our experiments. ................................................................................................ 42

    Table 3.1. SALSA alignment of POOL predicted residues for known KSI proteins and proteins

    annotated as putative KSIs.. .................................................................................................... 45

    Table 3.2. Kinetic assays for MtKSI.. ........................................................................................... 48

    Table 3.3. Vmax and KMapp for MtKSI WT and variants. ................................................................ 50

    Table 3.4. Comparison between the WT MtKSI and F111D variant at 90 μM 5AND. ............... 50

    Table 3.5. Catalytic efficiency for MtKSI WT and variants......................................................... 51

    Table 3.6. MtKSI WT and variants based on initial velocities at 30 μM 5AND. ......................... 52

    Table A.1. Concentrations of enzymes used to gather kinetic data for alkaline phosphatase. ..... 59

  • 12

    List of Abbreviations

    % Percent

    °C Degrees Celsius

    4AND 4-androstene-3,17-dione

    5AND 5-androstene-3,17-dione

    Å Ångströms

    AP Alkaline phosphatase

    BSA Bovine Serum Albumin

    cm Centimeter

    Da Dalton

    DinB DNA Polymerase IV

    DNA Deoxyribonucleic acid

    DTT Dithiothreitol

    E. coli Escherichia coli

    ET Evolutionary Trace

    FPLC Fast protein liquid chromatography

    GST Glutathione S-transferase

    h Hours

    HEPES 4-(2-Hydroxyethyl)-1-Piperazineethanesulfonic Acid

    kcat First order rate constant

    kDa Kilodalton

    KM Michaelis constant

  • 13

    KSI Ketosteroid isomerase

    L Liter

    M Molar

    min Minutes

    mL Milliliters

    Ml Mesorhizobium loti

    mM Millimolar

    mmol Millimoles

    Mt Mycobacterium tuberculosis

    NH Nitrile hydratase

    nM Nanomolar

    nm Nanometers

    NTF2 Nuclear Transport Factor 2

    OD Optical density

    Pa Pectobacterium atrosepticum

    PDB Protein Data Bank

    PGI Phosphoglucose isomerase

    PhoA Alkaline phosphatase

    PNP para-nitrophenol

    PNPP para-nitrophenyl phosphate

    POOL Partial Order Optimum Likelihood

    PSI Protein Structure Initiative

  • 14

    R2 Regression co-efficient

    rcf Relative centrifugal force

    RNA Ribonucleic acid

    SALSA Structurally Aligned Local Sites of Activity

    SDS-PAGE Sodium dodecyl sulfate poly-acrylamide gel electrophoresis

    SG Structural Genomics

    SVM Support vector machine

    TEV Tobacco etch virus

    THEMATICS Theoretical Microscopic Anomalous Titration Curve Shapes

    TM Melting temperature

    Tris-HCl 2-amino-2-hydroxymethyl-propane-1,3-diol

    μ3 3rd central moment

    μ4 4th central moment

    v/v Volume by volume

    V0 Initial velocity

    Vmax Maximum velocity

    WT Wild-type

    YT Yeast extract and Bacto Tryptone

    μL Microliter

    μM Micromolar

    σ Error

  • 15

    A Ala Alanine

    C Cys Cysteine

    D Asp Aspartic Acid

    E Glu Glutamic Acid

    F Phe Phenylalanine

    G Gly Glycine

    H His Histidine

    I Ile Isoleucine

    K Lys Lysine

    L Leu Leucine

    M Met Methionine

    N Asn Asparagine

    P Pro Proline

    Q Gln Glutamine

    R Arg Arginine

    S Ser Serine

    T Thr Threonine

    V Val Valine

    W Trp Tryptophan

    Y Tyr Tyrosine

  • 16

    Chapter 1. Protein Engineering

    1.1. Proteins as catalysts

    All known life forms create polymers of various combinations of 20 different amino

    acids. These polymers are known as proteins and frequently act as catalysts, in which case they

    are then referred to as enzymes. The linear chain of amino acids (primary structure) folds to form

    local order such as α-helices and β-strands (secondary structure). These helices, strands, loops,

    and other local structures fold into a single overall arrangement (tertiary structure); multiple

    chains can associate with each other (quaternary structure). Enzymes catalyze reactions under

    physiological conditions, such as neutral pH and room temperature, with extreme specificity and

    high efficiency. With few exceptions, enzymes are responsible for catalyzing every important

    chemical reaction in biology, giving rise to statements such as Orgel’s First Law:

    Whenever a spontaneous process is too slow or too inefficient

    a protein will evolve to speed it up or make it more efficient.

    Most proteins are on the order of 100 to 1000 amino acids. With 20 canonical amino

    acids with which to build, the number of possible protein sequences quickly becomes

    unfathomable. For a protein on the smaller end, the number of possible sequences is 20^100.

    This number, however, includes sequences that are nothing more than 100 prolines in a row, a

    sequence that would generally be considered non-functional. Estimates put the fraction of

    “functional” folds at 1 in 10^77[1].
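The size of this sequence space can be checked with a few lines of Python. This is only an illustrative sketch: the function name `sequence_space` is hypothetical, and the chain length of 100 is taken from the "smaller end" of the range quoted above.

```python
import math


def sequence_space(length: int, alphabet: int = 20) -> int:
    """Count the distinct linear sequences of a given length.

    Each position is assumed to be chosen independently from `alphabet`
    canonical amino acids, so the count is alphabet ** length.
    """
    return alphabet ** length


n = sequence_space(100)
# Python integers are arbitrary precision, so the count is exact.
print(f"decimal digits in 20^100: {len(str(n))}")
print(f"log10(20^100): {math.log10(n):.2f}")
```

The count depends only on chain length and alphabet size; it says nothing about which of those sequences fold or function, which is the separate estimate cited in the text.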

    How enzymes are capable of achieving the remarkable feats of chemistry required of

    them represents a central area of research in biochemistry. While a protein may be composed of

    hundreds of amino acids, generally only a small handful of those amino acids are directly

  • 17

    involved in the performance of catalysis. These residues compose the “active site” of the

    enzyme. In 1946 Linus Pauling postulated that the catalytic power of enzymes lies in their ability

    to lower the energy of the transition state between substrate and product[2], a theory that

    essentially still holds today[3]. To explain the ability of enzymes to perform reactions only on

    their specific cognate substrates, the “lock and key”[4] theory was proposed, eventually giving

    way to a more nuanced “induced fit”[5] theory taking into account the transition state geometry

    and realistic expectation of a dynamic system. The lock-key and induced fit models generally

    assume a globular fold with a solvent accessible active site. In many cases the active site is

    buried within the protein or protein cavity[6]. Recently, Jiri Damborsky has pursued a “keyhole-

    lock-key” model to address this complication[6, 7].

    The Central Dogma[8] of biology, that DNA is transcribed into RNA which is then

    translated into protein, provides a natural scheme in which to probe hypotheses about protein

    sequence-structure-function relationships. By manipulation of an organism’s DNA, a variant

    protein product is produced, which can then either be examined at an in vitro functional level

    after isolation, or kept in the organism and the phenotype of the organism observed under

    varying conditions to elucidate in vivo function. The sequence-structure relationship is a folding

    problem, and while interesting in its own right, will not be addressed here in favor of the

    structure-function relationship. One reason is that many sequences result in the same overall

    structure. Another reason is that the active site is a structural feature and is the focus of protein

    engineering endeavors.

    Two things required for protein engineering as a field to emerge were:

    -A method to change the protein sequence, and thus structure, with exquisite control

  • 18

    -A falsifiable hypothesis of how a change in protein structure will change protein function

    Both requirements were finally met in 1982, exemplified by a foundational study on

    tyrosyl-transfer RNA synthetase[9], after the advent of site-directed mutagenesis, which allowed

    the researcher to specify controlled changes at the DNA level. Protein engineering as a

    field today produces marvelous work that ranges from designing enzymes to catalyze

    Diels-Alder[10] and Kemp[11] reactions, to building a fully functional enzyme from a 9-amino acid

    alphabet[12], to creating a completely new fold never seen in nature[13].

    1.2. Design vs. Redesign; Directed Evolution vs. Rational Design

    To be strict, protein engineering, or protein design, would refer to the process of creating

    a functional protein de novo (also referred to as “artificial” enzymes). Most protein engineering

    however utilizes already functional enzymes, such as those extracted from organisms of research

    interest, and manipulates them in a way to make them more functional, different in function, or

    to make them lose their functionality. These are examples of protein reengineering and can often

    be seen designated as such in the literature (for example, see recent review by Hilvert[14]).

    Presently, the protein engineering paradigm of creating mutations and examining

    resulting changes in function is well established. Implementation on the other hand is constrained

    by the unfathomably high permutation level proteins occupy; it is impossible for an experimental

    lab to investigate every residue in a protein especially if multiple mutations at the same residue

    are desired. There are two main approaches to deal with this dilemma: directed evolution and

    rational design.

    Directed evolution draws upon Darwinian evolution concepts to discover mutations of

    interest by iterating rounds of mutagenesis and selection. At its simplest, a gene encoding a

  • 19

    protein undergoes non-specific mutagenesis to introduce a large array of mutations, the

    resulting library is expressed, and a certain phenotype is selected. The survivors of the first round of

    selection return to the mutagenesis step to repeat the process until a satisfying level of function is

    attained. Since its inception, directed evolution has proven to be a powerful technique for protein

    engineering[15, 16] for developing new or improved function.

    Rational design represents the oldest method of protein engineering. Using hypotheses

    about the roles of particular residues, specific mutations to specific amino acids are chosen and

    created, and the resulting change (or lack thereof) is examined. The residues of interest can be

    chosen based on crystal structures, previous experiments, sequence comparison, structural

    comparison, etc. To determine which residues in a protein contribute to catalysis, for example,

    one would identify the residues suspected of contributing to catalysis and create one or

    more mutations that probe this hypothesis.

    Figure 1.1. Alanine, aspartate, glutamate, and asparagine at pH 7.

    Alanine contains a mere methyl group as its side-chain, whereas aspartic acid is a short

    acid. Glutamic acid is another acid with a side chain longer than aspartic acid by a single

    methylene, and asparagine contains an amide group instead of the carboxyl group. Often a

    residue is changed to alanine due to alanine’s simple nature, consisting of a single methyl group


  • 20

    for a side chain. This approximates a loss of both functional group and bulk. Charge and

    size are two of the most important characteristics to investigate for an amino acid’s contribution.

    Figure 1.1 illustrates how an aspartic acid can be changed to asparagine to alter

    charge, or to glutamic acid to alter size.

    Many times a residue will contain multiple functionalities. For example, tyrosine contains

    both a hydroxyl functional group and an aromatic functional group. To investigate the

    contributions of these moieties as separately as possible, a series of mutations such as visualized

    in Figure 1.2 could be made. A mutation from tyrosine to serine, while a drastic change in size,

    would remove the aromatic functionality. A mutation from tyrosine to phenylalanine would

    remove just the hydroxyl group, leaving the 6-membered aromatic ring intact.


    Figure 1.2. Phenylalanine, tyrosine, and serine at pH 7.

    Tyrosine provides both an aromatic ring and a hydroxyl group to the active site of an

    enzyme. Phenylalanine provides only the aromatic moiety whereas serine adds only a hydroxyl

    group without aromaticity. These mutations allow us to test hypotheses pertaining to an

    enzyme’s stability, mechanism, selectivity, or efficiency by rational change of the protein. This

    method of investigation underpins protein engineering as a powerful tool to investigate the active

    sites of proteins.

    1.3. Functional Site Prediction with THEMATICS and POOL

    The active site of the protein is commonly termed “where the chemistry happens”. For

    our purposes we sometimes use a stricter definition of “residues within 5 Å of the site of


  • 21

    reaction”. These residues interact with the substrate directly, whether by hydrophobic

    interactions, π-π interactions, (de)protonation, hydrogen bonds, Coulomb forces, dipole-dipole

    interactions, or covalent bonding. It is of great interest to predict accurately and quickly the

    active site of a given protein structure. To that end, the active site prediction method Theoretical

    Microscopic Anomalous Titration Curve Shapes (THEMATICS) was published in 2001[17].

    THEMATICS uses computational methods to calculate a theoretical titration curve for every

    ionizable residue (K, R, D, E, H, Y, C) in a protein structure. A small minority of these titration

    curves will show behavior that significantly differs from the ideal Henderson-Hasselbalch

    behavior (Figure 1.3). While a single outlier may be a fluke, a “cluster”, defined as two or more

    residues with deviant behavior within 6 Å of each other, is considered a positive hit for

    identifying the active site.
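The cluster criterion described above can be sketched in Python. This is not the THEMATICS implementation: the function name, the use of one representative coordinate per residue, the inclusive 6 Å comparison, and the single-linkage grouping of residues into larger clusters are all assumptions made for illustration, and the residue labels and coordinates are invented.

```python
import math
from itertools import combinations


def find_clusters(flagged, cutoff=6.0):
    """Group flagged residues that lie within `cutoff` angstroms of each other.

    `flagged` maps a residue label to one (x, y, z) coordinate. Residues are
    merged by single linkage using a small union-find structure; only groups
    of two or more residues are returned, matching the "two or more" rule.
    """
    parent = {name: name for name in flagged}

    def find(name):
        # Follow parent links to the group representative, compressing paths.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(flagged, 2):
        if math.dist(flagged[a], flagged[b]) <= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for name in flagged:
        groups.setdefault(find(name), []).append(name)
    return [sorted(members) for members in groups.values() if len(members) >= 2]


# Hypothetical residues with anomalous titration curves (coordinates in angstroms).
example = {
    "R1": (0.0, 0.0, 0.0),
    "R2": (4.0, 0.0, 0.0),
    "R3": (8.0, 0.0, 0.0),
    "R4": (30.0, 0.0, 0.0),
}
print(find_clusters(example))
```

In this toy input R1 and R3 are 8 Å apart but are joined through R2, while the isolated R4 is discarded as a possible fluke.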

    THEMATICS exploits a defining property of a catalyst to help find active sites: a catalyst

    must return to its original state at the end of a chemical reaction. The many enzymes that

    donate or accept a proton face a fundamental problem. A residue acidic enough to donate a

    proton to the substrate would be too weak a base to take that proton back, and a residue basic

    enough to abstract a proton from the substrate would be too weak an acid to return the

    proton it borrowed[18].

    If a residue could be both an acceptor and donor of a proton simultaneously, or near

    simultaneously, the paradox would be resolved. The residue would have to be ionizable over a

    wide range of pH values and not follow Henderson-Hasselbalch behavior: the type of behavior

    THEMATICS calculates for known residues of catalytic importance.

  • 22

    Figure 1.3. A titration curve of mean net charge as a function of pH for select lysine residues in E. coli β-lactamase.

    In Figure 1.3[19] the two filled symbols show the titration curves of two lysines (K146

    and K215) that do not contribute to catalysis. The two unfilled symbols in Figure 1.3 show the

    titration curves of active site lysines K73 and K234. Note the classic, sharp transition of charge

    states as modeled by the Henderson-Hasselbalch equation for the non-catalytic lysines contrasted

    to the perturbed, anomalous behavior of the curves for catalytic lysines.
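For reference, the ideal Henderson-Hasselbalch curve that the non-catalytic lysines follow can be written in a few lines of Python. This is a sketch of the textbook equation only, not of the THEMATICS calculation; the function name is hypothetical and the default pKa of 10.5 is an assumed, typical model value for a lysine side chain rather than a number taken from this thesis.

```python
def mean_charge_lysine(pH, pKa=10.5):
    """Mean net charge of an isolated lysine side chain at a given pH.

    For a base the protonated (charged) fraction under Henderson-Hasselbalch
    is 1 / (1 + 10 ** (pH - pKa)), so the mean charge runs from +1 at low pH
    to 0 at high pH, passing through +0.5 at pH = pKa.
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


# Sample the curve two pH units either side of the assumed pKa.
for pH in (8.5, 9.5, 10.5, 11.5, 12.5):
    print(f"pH {pH:4.1f}  mean charge {mean_charge_lysine(pH):.3f}")
```

Nearly all of the charge change in this ideal curve occurs within about two pH units of the pKa, which is the sharp transition the anomalous active-site curves depart from.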

    THEMATICS offers additional advantages beyond predicting active sites. Because the

    criteria for prediction are based purely on computed chemical properties from the three-

    dimensional coordinates for the query protein and are not dependent on homology,

    THEMATICS remains immune to false positives due to homology or database misannotation. A

    structure of an enzyme could be the only structure in existence, such as a novel or artificial fold,

    and THEMATICS will still perform just as powerfully. It has been shown that THEMATICS

    works well using a homology model as an input rather than empirical structures[20], and finds

    both catalysis and recognition sites of enzymes[21]. Quantitation of the deviation from

    Henderson-Hasselbalch behavior was implemented by examining the 3rd and 4th central moments

  • 23

    of the curves, which correspond to asymmetry and kurtosis respectively[22]. Residues scoring

    more than one standard error higher than the average for residues of their type were considered positive

    hits (Z >1 for μ3 or μ4)[22]. The Z-score cut-off was later refined to Z >0.99 for μ3 or μ4 after it

    was found to improve performance on the reference data set[23].
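The moment-based scoring described above can be sketched as follows. This is an illustration under stated assumptions, not the published procedure: it assumes the charge lost between adjacent pH points is treated as probability mass on the pH axis, it uses the population standard deviation for the Z-score, and it expects the caller to pass only residues of a single type. All function names are hypothetical.

```python
import statistics


def central_moments(pH, charge):
    """Return (mu3, mu4) of the distribution implied by a titration curve.

    Assumption: the charge change between adjacent pH points is used as the
    probability mass located at the midpoint of that pH interval.
    """
    mids = [0.5 * (a + b) for a, b in zip(pH, pH[1:])]
    drops = [abs(a - b) for a, b in zip(charge, charge[1:])]
    total = sum(drops)
    probs = [d / total for d in drops]
    mean = sum(p * x for p, x in zip(probs, mids))
    mu3 = sum(p * (x - mean) ** 3 for p, x in zip(probs, mids))
    mu4 = sum(p * (x - mean) ** 4 for p, x in zip(probs, mids))
    return mu3, mu4


def z_scores(values):
    """Standardize values against their own mean and population spread."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return [0.0 for _ in values]
    return [(v - mean) / spread for v in values]


def positive_hits(names, mu3_values, mu4_values, cutoff=0.99):
    """Flag residues whose Z-score exceeds the cutoff for mu3 or for mu4."""
    z3 = z_scores(mu3_values)
    z4 = z_scores(mu4_values)
    return [n for n, a, b in zip(names, z3, z4) if a > cutoff or b > cutoff]
```

A symmetric Henderson-Hasselbalch curve gives a third moment close to zero under this construction, so a large μ3 or μ4 relative to other residues of the same type signals the kind of perturbed curve the text describes.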

    Originally, THEMATICS titration curves were inspected manually for non-Henderson-

    Hasselbalch behavior, which raises both resource-commitment and scientist-bias issues.

    Automation[24] alleviated both of these concerns and paved the way to add Support Vector

    Machines (SVM) as a potential way of raising THEMATICS recall and precision even

    higher[25].

    Partial-Order Optimum Likelihood (POOL) combines THEMATICS with other

    predictors to create the best functional site predictor to date[26]. Originally using CASTp for

    geometric features and ConSurf for sequence-based features[26], POOL has since[27]

    incorporated ConCavity[28] for geometric features and INTREPID[29] for sequence-based

    phylogenetics features. POOL provides many advantages over THEMATICS: the ability to

    predict non-ionizable residues, include sequence/geometric information, and improved

    performance. POOL allows non-ionizable residues to be predicted by assigning all residues an

    environmental μ3 and μ4 based on the behavior of nearby residues. In addition to the 3rd and 4th

    central moments, the buffer range[27] (BR) was added as a feature to quantitate the wide range

    of buffering capability that is typical of active site residues. POOL is publicly available

    via web at http://www.pool.neu.edu/wPOOL/[30].


  • 24

    1.4. Catalysis by remote residues

    Earlier we defined the active site as “residues within 5 Å of the site of reaction”. Even

    during the seminal work on tyrosyl-transfer RNA synthetase the concept of residues remote from

    the site of chemical transformation contributing to catalysis seemed evident and was validated by

    showing that T40 and H45 contributed to catalysis by binding of the tail phosphate groups of the

    ATP moiety[9]. However, here a stricter definition of remote residues is adopted, and we

    redefine the active site as “residues within 5 Å of the substrate”, regardless of whether that

    particular residue is directly involved in chemical reactions. With this definition, residues such as

    T40 and H45 in tyrosyl-transfer RNA synthetase would not be considered remote, but rather it

    could be said that the active site of tyrosyl-transfer RNA synthetase is particularly large to

    accommodate a particularly large substrate.

    As soon as THEMATICS was created, it was noted that certain predictions by

    THEMATICS included residues that were not in direct contact with the substrate[17]. These

    residues were not only far away from the site of the reaction, but did not have any interaction

    with the substrate. Whether these predictions were false positives, or correct predictions yet to be

    tested remained an open question[17, 18]. One could imagine an active site to be composed of

    layers, or shells: the first shell comprises those residues that are in contact with the substrate, the second

    shell is composed of the residues in contact with, but behind, the first shell, and the third shell

    is composed of the residues in contact with, but behind, the second shell.

    Figure 1.4 abstractly shows a multi-layered active site consisting of a 1st shell that

    interacts with the substrate, a 2nd shell of residues interacting with the 1st shell, and a 3rd shell of

    residues interacting with the 2nd shell. Each shell is approximately 5 Å in depth.

  • 25

    Figure 1.4. Diagram of a multi-layered active site.

    These predicted residues were in the second, or even third, shell of the active site. Nitrile

    Hydratase (NH), Phosphoglucose Isomerase (PGI) and DNA Polymerase IV (DinB) were

    predicted to contain 2nd shell residues contributing to catalysis. Alternatively, there are some

    enzymes such as Ketosteroid Isomerase (KSI) where no second-shell residues are predicted to be

    important for catalysis. It was found that indeed NH[31], PGI[32], and DinB[33] all contain

    remote residues contributing to catalysis, whereas KSI[32] possesses a mostly single-layered

    active site.

    These results show that many, but not all, enzymes contain active sites that are extended,

    utilizing remote residues to contribute towards catalysis. THEMATICS and POOL accurately

    predict the contributions of remote residues to catalysis by a wide range of enzymes. Thus, the

    extent of an enzyme’s active site can be predicted using POOL and THEMATICS.

  • 26

    Chapter 2. Alkaline Phosphatase

    2.1. Introduction

    Alkaline phosphatase (AP) appears across all domains of life releasing phosphate groups

    from a wide range of substrates. AP is of great interest for use in diagnostic assays but the

    bacterial enzyme is considered too slow compared to the mammalian enzyme, although the

    temperature stability of the mammalian enzyme is much lower than the bacterial enzyme (65 °C

    and 95 °C TM, respectively)[34]. Alkaline phosphatase has been a staple of enzymology studies

    for decades[35] although it is under constant revision and further investigation as to its

    mechanism[35-38]. Its thermostability, ubiquity in both nature and the chemical literature, and

    ease of kinetic assay present an excellent learning opportunity. As such, the senior-level

    Chemical Biology course at Northeastern University utilizes the site-directed mutagenesis and

    Michaelis-Menten parameterization of alkaline phosphatase as a long term lab experiment. Some

    mutations in this work were designed by undergraduates partaking in this course.

    E. coli AP is encoded by the phoA gene, which encodes a 471 amino acid precursor

    protein; the first 21 amino acids constitute a periplasmic signal sequence that is then

    removed from the protein, resulting in a 450 amino acid enzyme that naturally dimerizes in

    solution. Each monomer contains its own active site with three metal ions: two zinc and one

    magnesium[39]. These metal ions are held in place by various residues and with no substrate

    present are coordinated with three water molecules[40]. The magnesium ion is held in place by

    D51, D153, T155, and E322; the zinc1 ion is held in place by R166, D327, H331, and H412; the

    zinc2 ion is held in place by D51, R166, D369 and H370[35, 39-49]. K328 interacts with the

  • 27

    phosphate moiety through a water molecule[49] and S102 performs the nucleophilic attack on

    the substrate[40, 50, 51].

    Figure 2.1. The active site of alkaline phosphatase based on PDB ID: 1ALK. Zinc: purple; magnesium: yellow;

    phosphate: green/red.

  • 28

    2.2. Computational Predictions

    Alkaline phosphatase as analyzed by THEMATICS was predicted to have mostly 1st shell

    residues, with two predicted 2nd shell residues and one 3rd shell residue. In addition to analysis by

    THEMATICS, analysis by Evolutionary Trace (ET)[52, 53] predicted a much larger population

    of residues that fully included those predicted by THEMATICS.

    Figure 2.2. Diagram of Evolutionary Trace and THEMATICS predictions for AP.

  • 29

    Each group in Figure 2.2 represents a shell of the active site for AP. Residues predicted

    by ET but not predicted by THEMATICS could be non-ionizable (such as the case of S102)

    and/or simply not predicted by THEMATICS.

    POOL Rank   Residue   Raw POOL Score   Normalized POOL Score

    1 ASP 51 2.06E-02 1.00E+00

    2 ASP 369 1.42E-02 6.87E-01

    3 HIS 370 1.06E-02 5.15E-01

    4 ASP 327 1.06E-02 5.15E-01

    5 GLU 322 7.89E-03 3.83E-01

    6 HIS 412 3.57E-03 1.73E-01

    7 HIS 331 5.18E-04 2.52E-02

    8 ASP 101 3.57E-04 1.73E-02

    9 HIS 372 3.36E-04 1.63E-02

    10 ASP 153 1.54E-04 7.50E-03

    11 ARG 166 1.10E-04 5.36E-03

    12 GLU 341 8.08E-05 3.92E-03

    13 LYS 328 7.37E-05 3.58E-03

    14 HIS 86 6.85E-05 3.33E-03

    15 PRO 156 4.25E-05 2.06E-03

    16 GLY 52 3.80E-05 1.85E-03

    17 GLU 57 2.20E-05 1.07E-03

    18 THR 155 2.11E-05 1.03E-03

    19 SER 102 2.11E-05 1.03E-03

    20 HIS 162 1.77E-05 8.59E-04

    21 MET 53 1.69E-05 8.18E-04

    22 ASP 330 9.87E-06 4.79E-04

    23 ASN 44 8.80E-06 4.27E-04

    24 PHE 317 8.80E-06 4.27E-04

    25 GLY 207 8.80E-06 4.27E-04

    Table 2.1. POOL predictions for alkaline phosphatase. THEMATICS predictions include those colored. Blue: 1st

    shell; yellow: 2nd shell.

  • 30

    Shown here are only the 25 most highly ranked residues of 450. Residues in blue are

    known residues contributing to catalysis via ligand or metals; residues in yellow are predicted by

    THEMATICS to be 2nd or 3rd shell residues of interest. E341 helps form the dimer interface.

    Figure 2.3. A POOL plot of POOL score vs. POOL rank for alkaline phosphatase.

    The POOL plot in Figure 2.3 extends out towards a rank of 449, asymptotically

    approaching a POOL score of 0. There are quite a few interesting predictions by POOL for

    alkaline phosphatase (Table 2.1). It performs well in predicting the first shell of residues, as well

    as the dimer-interface forming residue. Residues predicted by THEMATICS all reside in the top

    22 (top 5%) of the POOL ranking, including the 2nd and 3rd shell residues predicted by

    THEMATICS. Threonine 155 and serine 102 are both essential for catalysis but rank only 18

    and 19, respectively; because neither serine nor threonine is considered ionizable, THEMATICS

    would not predict these residues directly. The computational predictions shown above suggest

    that alkaline phosphatase may have a few 2nd and 3rd shell residues important for catalysis,

    namely E57, D330, and H372 (Figure 2.4).

    [Figure 2.3 plot. y-axis: POOL score (0 to 1); x-axis: POOL rank (0 to 50).]

  • 31

    POOL discards the THEMATICS Boolean approach of assigning discrete yes/no values

    to predictions of functional importance for residues in exchange for a ranking system (Figure

    2.3), complete with its own advantages and disadvantages (see discussion[26]). Traditionally a

    %-based cut-off, such as top 8%, 10%, or as low as top 5%, is utilized to determine what the user

    should investigate as a residue important for catalysis. However the exact cut-off is still an area

    of investigation (see Further Work) and can be dependent on the size and type of protein of

    interest.
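
    The percentage-based cut-off described above can be illustrated with a short Python sketch. It
    assumes the cut-off rank is the total residue count multiplied by the chosen fraction and rounded
    down, which reproduces the "top 22 (top 5%)" figure quoted for the 450-residue enzyme; the
    function name is hypothetical and the residue list is simply the rank order of Table 2.1.

```python
import math

def top_fraction(ranked_residues, total_residues, fraction):
    """Return the residues whose rank lies within the top `fraction` of the protein."""
    cutoff_rank = math.floor(total_residues * fraction)
    return ranked_residues[:cutoff_rank]

# The 25 most highly ranked residues of Table 2.1, in rank order.
table_2_1 = [
    "ASP 51", "ASP 369", "HIS 370", "ASP 327", "GLU 322",
    "HIS 412", "HIS 331", "ASP 101", "HIS 372", "ASP 153",
    "ARG 166", "GLU 341", "LYS 328", "HIS 86", "PRO 156",
    "GLY 52", "GLU 57", "THR 155", "SER 102", "HIS 162",
    "MET 53", "ASP 330", "ASN 44", "PHE 317", "GLY 207",
]

selected = top_fraction(table_2_1, total_residues=450, fraction=0.05)
print(len(selected), selected[-1])
```

    With a 10% cut-off the same function would keep 45 residues, more than the 25 listed here, which
    is one reason the choice of cut-off changes what is carried forward for mutagenesis.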

  • 32

    Figure 2.4. The 2nd and 3rd shell residues predicted by THEMATICS: (top) H372, (middle) D330, and (bottom) E57.

    Zinc: purple; magnesium: yellow; phosphate: green/red.

  • 33

    2.3. Materials and Methods

    To investigate these predictions by THEMATICS and POOL pertaining to the possible

    outer-shell residues in alkaline phosphatase, we employed site-directed mutagenesis to construct

    mutants, expressed and purified them, and assayed their activities in reference to the wild-type

    protein.

    2.3.1. Materials

    Quik-Change® site-directed mutagenesis kits (Agilent, CA) were used to make mutations

    in pEK29[43] (provided by E. Kantrowitz, Boston College) using the primers below and confirmed

    by DNA sequencing (Massachusetts General Hospital DNA Core, Cambridge, MA).

    M75T

    5'-GGCGATGGGACGGGGGACTCGG-3'

    5'-CCGAGTCCCCCGTCCCATCGCC-3'

    H394D

    5'-CTGATCACGCCGACGCCAGCCAG-3'

    5'-CTGGCTGGCGTCGGCGTGATCAG-3'

    H108L

    5'-GGGCAATACACTCTCTATGCGCTG-3'

    5'-CAGCGCATAGAGAGTGTATTGCCC-3'

    E79Q

    5'-GGACTCGCAAATTACTGCCGCACG-3'

    5'-CGTGCGGCAGTAATTTGCGAGTCC-3'

    D352N

    5'-CGATAAACAGAATCATGCTGCCAATCC-3'

    5'-GGATTCGCAGCATGATTCTGTTTATCG-3'

    H394L

    5'-CGCTGATCACGCCCTCGCCAGCCAG-3'

    5'-CTGGCTGGCGAGGGCGTGATCAGCG-3'

    M75A

    5'-CTGATTGGCGATGGGGCAGGGGACTCG-3'

    5'-CGAGTCCCCTGCCCCATCGCCAATCAG-3'

    E172Q

    5'-GTTTCTACCGCACAGTTGCAGGATG-3'

    5'-CATCCTGCAACTGTGCGGTAGAAAC-3'

    S127A

    5'-GACTCGGCTGCAGCAGCAACCGCC-3'

    5'-GGCGGTTGCTGCTGCAGCCGAGTC-3'

    Q457E

    5'-GGACTGACCGACGAGACCGATCTC-3'

    5'-GAGATCGGTCTCGTCGGTCAGTCC-3'

    Figure 2.5. Primers for site-directed mutagenesis of E. coli alkaline phosphatase. Codons manipulated are

    underlined.
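
    Because each mutagenic primer pair is expected to be mutually reverse-complementary, a listing
    such as Figure 2.5 can be checked with a few lines of Python. The sketch below is illustrative
    only; the function names are hypothetical, and the M75T pair is used as the example input.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_complementary_pair(forward, reverse):
    """True if `reverse` is exactly the reverse complement of `forward`."""
    return reverse_complement(forward) == reverse.upper()

# M75T primer pair from Figure 2.5.
m75t_forward = "GGCGATGGGACGGGGGACTCGG"
m75t_reverse = "CCGAGTCCCCCGTCCCATCGCC"
print(is_complementary_pair(m75t_forward, m75t_reverse))  # prints True
```

    Running the same check over every pair before ordering primers is a cheap way to catch
    transcription errors in a primer table.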

    SM547 cells, lacking a chromosomal phoA gene, were provided by E. Kantrowitz and

    made competent by chemical treatment with CaCl2 and stored at -80 °C in aliquots. Primers were

  • 34

    hydrated to 100 μM concentration with sterile water, and a 5 μM stock was created by diluting

    20-fold into sterile water.

    2.3.2. Methods

    For protein purification, plasmids to express either WT or variant AP were transformed

    into SM547 competent cells and selected on LB agar containing 100 μg mL⁻¹ ampicillin. An

    overnight culture of 50 mL YT medium containing 100 μg mL⁻¹ ampicillin grown at 37 °C was

    sub-cultured to 1 L YT supplemented with 100 μg mL⁻¹ ampicillin and growth was continued for

    12 hours at 37 °C. The cells were harvested, washed, and osmotically shocked as previously

    described by Brockman & Heppel[54] and then precipitated, suspended, dialyzed, and purified

    on a HiTrap FastFlow Q column (GE Healthcare) by FPLC as described by Chaidaroglou et

    al.[43]. Purity of each fraction was determined by 10% SDS-PAGE and pure fractions were

    stored at -20 °C. Concentration of protein was determined by Bradford assay (Bio-Rad) against a

    bovine serum albumin standard.

    Formation of para-nitrophenol was measured at 410 nm at room temperature in High Tris

    buffer (1.0 M Tris-HCl pH 8.0) from the cleavage of para-nitrophenyl phosphate (PNPP) to calculate

    initial velocities with an extinction coefficient of 1.42 × 10⁴ M⁻¹ cm⁻¹ (Figure 2.6). Non-linear

    regression to calculate KM and kcat was performed using GraphPad Prism 5 version 5.02. At least

    three independent trials were performed for each protein. Reactions were initiated by addition of

    enzyme, and data were collected every 0.5 seconds from the 3rd second of the reaction for 2 min to

    construct the initial velocities. PNPP was kept in the dark as much as possible, and stored in

    light-resistant microcentrifuge tubes when aliquoted.
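
    As a rough illustration of how KM and Vmax are extracted from initial velocities, the Python
    sketch below fits the Michaelis-Menten equation by Hanes-Woolf linearization. This is a
    deliberately simpler technique swapped in for the non-linear regression performed in GraphPad
    Prism, and it will weight noisy data differently. The function names are hypothetical, and the
    velocities are noise-free synthetic values generated from the WT parameters of Table 2.3 at the
    PNPP concentrations of Table 2.2, not measured data.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)

def fit_hanes_woolf(substrate, velocity):
    """Fit S/v = S/Vmax + KM/Vmax by ordinary least squares; return (Vmax, KM)."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / Vmax
    intercept = mean_y - slope * mean_x    # equals KM / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax

pnpp_um = [1, 2, 5, 10, 20, 50, 100, 200]  # final PNPP concentrations, μM
synthetic_v = [michaelis_menten(s, vmax=4.8, km=26.7) for s in pnpp_um]
vmax_fit, km_fit = fit_hanes_woolf(pnpp_um, synthetic_v)
print(f"Vmax = {vmax_fit:.2f} μM/min, KM = {km_fit:.1f} μM")
```

    On real, noisy data a direct non-linear fit is generally preferred, because the linear
    transform distorts the error structure of the measurements.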

  • 35

    PNPP (μM)   Buffer (2X)   Water   PNPP 2 mM   Enzyme (variable nM)   Total

    1 500 483 2 15 1000

    2 500 481 4 15 1000

    5 500 475 10 15 1000

    10 500 465 20 15 1000

    20 500 445 40 15 1000

    50 500 385 100 15 1000

    100 500 285 200 15 1000

    200 500 85 400 15 1000

    Table 2.2. Kinetic assays for alkaline phosphatase. Bolded columns denote final concentrations, where all other

    numbers refer to μL added to the cuvette.

    2.4. Results

    To determine initial velocities by monitoring production of the product 4-nitrophenol

    (4-PNP), a standard curve was constructed from dilutions of 4-PNP, giving a molar extinction

    coefficient of 1.42 × 10⁴ M⁻¹ cm⁻¹, similar to the reported 1.62 × 10⁴ M⁻¹ cm⁻¹[43].

  • 36

    Figure 2.6. Standard curve for 4-nitrophenol.

    Each alkaline phosphatase variant was tested concurrently with wild-type alkaline

    phosphatase on the same day. Initial velocities, V0, for each substrate concentration (1-200 μM

    PNPP) were calculated by taking the slope of the product formation (in a.u. min⁻¹) and dividing by

    the 4-PNP molar extinction coefficient to give μM PNP min⁻¹.
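
    The conversion from an absorbance trace to an initial velocity can be sketched as follows. The
    sketch assumes a 1 cm path length, so that the standard-curve slope of 0.0142 absorbance units
    per μM can be used directly, and it uses a synthetic, perfectly linear trace sampled every 0.5
    seconds from 3 seconds to 2 minutes; the function names and the trace itself are hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def initial_velocity(times_min, absorbances, epsilon_per_um=0.0142, path_cm=1.0):
    """Convert an absorbance trace (a.u. vs. minutes) into μM product per minute."""
    slope_au_per_min = least_squares_slope(times_min, absorbances)
    return slope_au_per_min / (epsilon_per_um * path_cm)

# Synthetic trace: readings every 0.5 s from 3 s to 120 s, rising at 0.0284 a.u. per minute.
times_min = [(3.0 + 0.5 * i) / 60.0 for i in range(235)]
trace = [0.05 + 0.0284 * t for t in times_min]
v0 = initial_velocity(times_min, trace)
print(f"V0 = {v0:.2f} μM/min")
```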

    [Figure 2.6 plot. y-axis: absorbance (410 nm); x-axis: 4-PNP concentration (μM). Linear fit: y = 0.0142x + 0.0304, R² = 0.9993.]

  • 37

    Figure 2.7. Michaelis-Menten plots for AP in 1 M Tris-HCl pH 8.0 buffer. Error bars represent standard error of at

    least three independent trials.

    AP Variant   Vmax (μM min⁻¹)   KM (μM)   R²

    WT 4.8 (0.1) 26.7 (2.6) 0.94

    M53A 8.9 (0.4) 26.2 (4.1) 0.96

    M53T 1.7 (0.1) 22.7 (2.1) 0.98

    E57Q 7.5 (0.3) 25.2 (3.5) 0.97

    H86L 3.2 (0.1) 13.5 (0.9) 0.99

    S105A 3.1 (0.2) 14.1 (3.1) 0.91

    E150Q 5.2 (0.5) 33.2 (8.7) 0.90

    D330N 2.0 (0.1) 22.0 (5.4) 0.87

    H372D 3.0 (0.3) 53.1 (14.4) 0.91

    H372L 4.1 (0.2) 10.6 (1.5) 0.95

    Q435E 8.4 (0.1) 20.9 (1.1) 0.99

    Table 2.3. WT and variant AP kinetic parameters. Standard errors are in parentheses and consist of at least three

    independent trials.

    Vmax is not proportional to kcat between enzymes due to the enzymes being at different

    concentrations (Appendix A). None of the variants showed a dramatic decrease in activity. While

    there are some small differences in individual kcat or KM values, the catalytic efficiencies are all

    similar (Table 2.4).

  • 38

    PhoA Variant   POOL Rank   Å to PO4   Shell   kcat (s⁻¹)   KM (μM)   Catalytic Efficiency (10⁶ M⁻¹ s⁻¹)   Fold Decrease

    WT -- -- -- 40 (7.3) 28 (9) 1.43 (0.53) --

    H372D 9 6.7 2nd 27 (10) 63 (44) 0.43 (0.34) 3.33 (2.91)

    H372L 9 6.7 2nd 6.3 (0.1) 11 (2.9) 0.57 (0.15) 2.49 (1.13)

    H86L 14 11.2 2nd 6.3 (0.1) 14 (0.7) 0.45 (0.02) 3.17 (1.19)

    S105A 16 7.2 2nd 9.7 (0.7) 14 (3.2) 0.69 (0.17) 2.06 (0.91)

    E57Q 17 12.3 3rd 21 (0.8) 26 (4) 0.81 (0.13) 1.77 (0.71)

    M53A 21 14.6 3rd 25 (4.4) 27 (16) 0.93 (0.17) 1.54 (0.64)

    M53T 21 14.6 3rd 14 (0.8) 23 (3.4) 0.61 (0.10) 2.36 (0.94)

    D330N 22 11.0 2nd 17 (4.2) 25 (2.5) 0.68 (0.18) 2.1 (0.96)

    Q435E 44 11.1 2nd 17.4 (0.4) 21 (3) 0.83 (0.12) 1.72 (0.68)

    E150Q 136 10.2 2nd 13 (8.8) 34 (6.4) 0.38 (0.27) 3.74 (2.97)

    Table 2.4. Summary calculations for WT alkaline phosphatase and variants. Standard errors are in parentheses and

    consist of at least three independent trials.
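
    The derived columns of Table 2.4 can be reproduced with a short Python sketch. It assumes that
    the catalytic efficiency is kcat/KM, that the fold decrease is the WT efficiency divided by the
    variant efficiency, and that standard errors are propagated in quadrature through each ratio.
    The thesis does not state the propagation method, so that last point is an assumption, although
    the WT and H372D rows are consistent with it. The function name is hypothetical.

```python
import math

def ratio_with_error(a, a_err, b, b_err):
    """Return a/b and its standard error by first-order (quadrature) propagation."""
    value = a / b
    relative_error = math.sqrt((a_err / a) ** 2 + (b_err / b) ** 2)
    return value, value * relative_error

# kcat in s^-1 and KM in μM, so kcat/KM comes out in units of 10^6 M^-1 s^-1.
wt_eff, wt_err = ratio_with_error(40, 7.3, 28, 9)      # WT row of Table 2.4
mut_eff, mut_err = ratio_with_error(27, 10, 63, 44)    # H372D row of Table 2.4
fold, fold_err = ratio_with_error(wt_eff, wt_err, mut_eff, mut_err)

print(f"WT    kcat/KM = {wt_eff:.2f} ({wt_err:.2f})")
print(f"H372D kcat/KM = {mut_eff:.2f} ({mut_err:.2f})")
print(f"Fold decrease = {fold:.2f} ({fold_err:.2f})")
```

    The large propagated errors on the fold decreases illustrate why differences of 2- to 4-fold
    between variants are difficult to distinguish from one another.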

    There is no correlation between the change in catalytic efficiency and either POOL rank or

    distance (Å) to the phosphate substrate for the residues tested, nor is there a significant

    difference in results between the 2nd shell and the 3rd shell residues as groups. Distances from the PO4 (Å) are based on

    PDB:1ALK[39] and measured from the tip of the residue side chain to the phosphorus atom.

    Figure 2.8. Catalytic efficiencies of wild-type and variant alkaline phosphatases. Error bars represent standard error

    over at least three independent trials.

    [Figure 2.8 plot. y-axis: catalytic efficiency (10⁶ M⁻¹ s⁻¹), 0.00 to 2.50; x-axis: AP variant (WT, H372D, H372L, H86L, S105A, E57Q, M53A, M53T, D330N, Q435E, E150Q).]

  • 39

    2.5. Conclusions

    WT alkaline phosphatase is catalytically efficient with a kcat/KM of 1.5 × 10⁶ M⁻¹ s⁻¹,

    which is similar to the literature values reported across various experiments[34, 43, 55, 56].

    Mutations disrupting interactions of the active site at the first shell commonly decrease the

    catalytic efficiency of alkaline phosphatase by many orders of magnitude. In contrast, throughout

    this work and from the compiled literature of single-mutation variants, mutations in the second

    or third shell have little to no effect on catalysis. Alkaline phosphatase seems to have a compact

    active site comprising solely first-shell residues that contribute significantly to catalysis as

    measured by single-point mutants.

    Figure 2.9. AP residues investigated in this work. Zinc: purple; magnesium: yellow; phosphate: green/red.

    It has been shown that the turnover rate of AP can be increased substantially by multiple

    mutations, including 2nd shell mutations. A D153G/D330N double mutant was reported to have

    over 50-fold higher kcat than the WT AP; the KM was also raised by about 30-fold, leaving the

  • 40

    enzyme with less than a 2-fold higher overall catalytic efficiency, however[34]; similarly,

    D101A gives 2-fold increases in both kcat and KM that negate each other[57]. D153A by itself, while

    resulting in almost no change in catalytic efficiency, resulted in a 7-fold increase in each of kcat and

    KM[42]. D101, D153, and D330 are all predicted by POOL and rank 8th, 10th, and 22nd,

    respectively. Multiple mutations in close space achieved modest 2- to 6-fold increases in kcat/KM

    including V99A, T100V, T100I, and D101S[58].

    Figure 2.10. A plot of Table 2.5 and Table 2.6 showing effects on catalytic efficiency (fold decrease relative to

    the respective WT AP, log scale) as a function of POOL rank for AP.

    The largest losses of activity are seen for mutations of D327 and S102, with no mutations of residues

    outside the top 20 residues predicted by POOL causing a large (>1 order of magnitude) decrease in

    catalytic efficiency. It is important to note that this compilation only examines single mutations

    where both subunits are affected. Alkaline phosphatase has been known to show intragenic

    complementation where a heterodimer of variants A and B, AB, will have higher activity than

  • 41

    AA or BB[56]. While the two active sites per dimer are more than 30 Å apart, there seems to be

    molecular communication between them.

    Variant   Shell   POOL Rank   POOL percentile   (kcat/KM) wild-type / (kcat/KM) mutant   Reference

    D51E 1 1 99 231 [44]

    D369N 1 2 99 95 [56]

    D327N 1 4 99 4350 [45]

    D327N 1 4 99 100 [46]

    D327A 1 4 99 >600,000 [45]

    D327A 1 4 99 >1,000,000 [46]

    E322K 1 5 99 1520 [56]

    H412Y 1 6 99 >12,000 [56]

    H412E 1 6 99 2237 [44]

    H331E 1 7 98 972 [44]

    D101S 1 8 98 0.2 [59]

    D101A 1 8 98 1 [57]

    D153G 1 10 98 0.2 [59]

    D153H 1 10 98 1.1 [34]

    D153H 1 10 98 3.5 [47]

    D153E 1 10 98 1.3 [44]

    D153A 1 10 98 1.1 [42]

    D153N 1 10 98 1.1 [42]

    R166A 1 11 98 313 [43]

    R166S 1 11 98 125 [43]

    R166Q 1 11 98 166 [48]

    R166K 1 11 98 4 [48]

    K328R 1 13 97 0.9 [58]

    K328C 1 13 97 10 [51]

    K328H 1 13 97 0.5 [34]

    K328H 1 13 97 3.2 [49]

    K328A 1 13 97 3.8 [49]

    T155M 1 18 96 678 [56]

    S102G 1 19 96 >300,000 [60]

    S102A 1 19 96 >60,000 [60]

    S102C 1 19 96 >19,000 [60]

    Table 2.5. 1st shell variants of AP and their catalytic efficiency under comparable conditions to our experiments.

  • 42

    Variant   Shell   POOL Rank   POOL percentile   (kcat/KM) wild-type / (kcat/KM) mutant   Reference

    H372A 2 9 98 2.9 [61]

    H372D 2 9 98 3.3 This Work

    H372L 2 9 98 2.5 This Work

    H86L 2 14 97 3.1 This Work

    E57Q 2 17 96 1.8 This Work

    M53A 2 21 95 1.5 This Work

    M53T 2 21 95 2.4 This Work

    D330N 2 22 95 0.2 [34]

    D330N 2 22 95 2.1 This Work

    Q435E 2 44 90 1.7 This Work

    A103C 2 50 89 0.9 [58]

    A103D 2 50 89 2.2 [58]

    T100V 2 51 89 0.3 [58]

    T100I 2 51 89 0.3 [58]

    V99A 2 100 78 0.2 [58]

    S105L 2 136 70 6.3 [56]

    S105A 2 136 70 2.1 This Work

    E150Q 3 106 76 3.7 This Work

    E341K * 12 97 1407 [56]

    T59A * 169 37 1.5 [62]

    T59R * 169 37 >600,000 [62]

    Table 2.6. 2nd and 3rd shell variants of AP and their catalytic efficiency under comparable conditions to our

    experiments.

  • 43

    Chapter 3. Ketosteroid Isomerase

    3.1. Introduction

    Ketosteroid isomerase (KSI) moves a double bond to convert Δ⁵-3-ketosteroids to Δ⁴-3-ketosteroids

    by cleavage of the C-H bond at C4 and reattaching the proton at C6. This reaction is

    characteristic of many biological processes of intramolecular abstraction and reprotonation

    (Figure 3.1). Considering that some known KSI enzymes reach diffusion-limited rates of

    reaction[63, 64], KSI is an attractive model for studying enzyme kinetics and active site

    engineering[32, 63, 65]. There are two well-studied sources of Δ⁵-3-ketosteroid isomerase:

    Pseudomonas putida (PpKSI) and Comamonas testosteroni (CtKSI). These two enzymes have

    practically identical active sites and catalytic residue placement, while sharing only 34% amino

    acid sequence identity[63]. This fold is not entirely uncommon in nature[66], being

    superimposable on Nuclear Transcription Factor 2[67] despite lack of sequence homology or

    functional similarity[66, 68].


    Figure 3.1. Mechanism of KSI based on PpKSI numbering.

    The active site of KSI is particularly hydrophobic which is reasonable for an enzyme that

    binds steroid ligands[66]. The mechanism for KSI involves abstraction of a proton at the C4


  • 44

    position by D40 (PpKSI numbering) followed by stabilization of the intermediate by D103 and

    Y16[32, 63, 65, 66, 69]. Regeneration of the catalyst is achieved by the C6 carbon abstracting

    the hydrogen from D40. The ability of an aspartic acid to act as a base is of particular interest,

    especially with a nearby aspartic acid requiring protonation to stabilize the resulting enolate ion.

    With the advent of the Protein Structure Initiative many crystal structures are uploaded to

    the Protein Data Bank with putative, predicted, or unknown function. These structures often have

    function assignments based purely on sequence or structural similarity. With misannotation in

    databases becoming an increasing problem[70], we recently developed a method, called SALSA

    (Structurally Aligned Local Sites of Activity)[71], to help assign function to structures that lack

    biochemical data. Because THEMATICS and POOL allow the active site of any protein to be

    predicted regardless of existing homology and based solely upon the tertiary structure of the

    enzyme, they are well suited to predicting the function of proteins that may be incorrectly annotated.

    There are three putative KSI proteins from structural genomics centers from three

    organisms: Mycobacterium tuberculosis (MtKSI), Pectobacterium atrosepticum (PaKSI), and

    Mesorhizobium loti (MlKSI). Previous work in our group by Dr. Srinivas Somarowthu has

    shown that of these three, only MtKSI possesses KSI activity. However, the catalytic efficiency

    of MtKSI was found to be on the order of 10⁵ M⁻¹ s⁻¹, a thousand times lower than PpKSI’s

    efficiency of 10⁸ M⁻¹ s⁻¹. This raises the question: what are the key differences that lead to this

    loss of activity between MtKSI and PpKSI? Can the activity of MtKSI be brought to PpKSI

    levels by making the MtKSI active site more PpKSI-like?


  • 45

    3.2. Computational Predictions

    For each known KSI and putative KSI, POOL ranked each residue’s importance for

    catalysis and the top 10% for each was used as a cut-off. The structures were aligned based on

    their active sites and a structural alignment table was created (see Table 3.1). Nuclear

    Transcription Factor 2 (NTF2) has a very similar overall fold without sharing any

    function with KSI and thus was used as a negative control.

    PDB Structurally aligned POOL predicted residues

    PpKSI 1oh0 Y32 Y57 Y16 D40 W120 F56 G49 P41 D103 D35 G43 E39 M116

    CtKSI 8cho F30 Y55 Y14 D38 F116 F54 G47 P39 D99 D33 G41 E37 M112

    MtKSI 2z76 M32 F64 S16 D40 W128 F63 G56 P41 F111 D35 G43 E39 M124

    MlKSI 3hx8 Y52 W76 F36 P60 S146 L75 G68 P61 Y125 D55 - F59 D142

    PaKSI 3d9r Y35 Y59 Y19 G43 K131 V58 G51 P44 E110 D38 - M42 Y127

    NTF2 1oun Y33 L56 Y18 W41 A122 K55 G48 E42 Q101 A36 - T40 D117

    Table 3.1. SALSA alignment of POOL predicted residues for known KSI proteins and proteins annotated as putative

    KSIs. Bold: POOL-predicted; underlined: literature annotated.

    POOL predictions are based on the top 10% of rankings. The proteins in Table 3.1 are, in order from

    top to bottom: two known KSIs, three structural genomics (SG) putative KSIs, and a nuclear transcription factor of

    similar structure, shown for comparison. Of the three putative KSIs, only MtKSI has an active site that is

    both predicted by POOL and similar to the known KSI active sites; neither MlKSI nor PaKSI has a

    similar active site, nor are their residues in the same spatial positions as the KSI active site

    predicted to be important for activity. The match between MtKSI and PpKSI / CtKSI is not

    100%. While a tyrosine to phenylalanine mismatch is somewhat conservative, it is of note that

    F64 of MtKSI is not predicted by POOL to be important for catalysis. The same

    can be said for S16, where PpKSI and CtKSI also have a tyrosine. The essential aspartic

    acid at D40 is conserved, but curiously the other aspartic acid at PpKSI-D103 / CtKSI-D99,

    which is thought to be essential, is replaced by a non-POOL-predicted F111 in MtKSI.
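
    The comparison of aligned positions can be made explicit with a small Python sketch that lists
    where the amino acid type differs between two rows of Table 3.1. The data are the PpKSI and
    MtKSI rows as printed; the function name is hypothetical, and the bold and underline annotations
    of the table (POOL-predicted, literature annotated) are not represented.

```python
alignment = {
    "PpKSI": "Y32 Y57 Y16 D40 W120 F56 G49 P41 D103 D35 G43 E39 M116".split(),
    "MtKSI": "M32 F64 S16 D40 W128 F63 G56 P41 F111 D35 G43 E39 M124".split(),
}

def mismatched_positions(reference, query):
    """Pairs of structurally aligned residues whose one-letter codes differ."""
    return [(ref, qry) for ref, qry in zip(reference, query) if ref[0] != qry[0]]

differences = mismatched_positions(alignment["PpKSI"], alignment["MtKSI"])
for ref, qry in differences:
    print(f"PpKSI {ref} -> MtKSI {qry}")
```

    Three of the four differing positions (Y57/F64, Y16/S16, and D103/F111) are the ones discussed
    above and targeted for mutagenesis in the next section.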

  • 46

    3.3. Materials and Methods

    Wild-type MtKSI DNA was obtained in the form of a plasmid pGST-Rv0760c (Craig

    Garen and Prof. Michael James, Department of Biochemistry, University of Alberta) encoding MtKSI with a

    GST-tag, as well as an ampicillin resistance marker gene. Steroids were purchased from

    Steraloids Inc, RI, USA. Primers were hydrated to 100 μM concentration with sterile water, and

    a 5 μM stock was created by diluting 20-fold into sterile water. Codons manipulated are

    underlined.

    3.3.1. Methods

    QuikChange (Agilent Technologies) site-directed mutagenesis was used to mutate the

    wild-type KSI gene with the following mutations: S16Y, F64Y, F111D, S16Y/F64Y,

    S16Y/F111D, F64Y/F111D, and S16Y/F64Y/F111D. Since the amino acids of interest are coded

    by codons far enough apart, multiple mutations can be introduced using single-mutation primers

    in succession.
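
    The seven variants listed above are every non-empty combination of the three single mutations,
    which can be enumerated with a brief Python sketch. The function name is hypothetical; only the
    mutation names are taken from the text.

```python
from itertools import combinations

single_mutations = ["S16Y", "F64Y", "F111D"]

def all_variants(mutations):
    """Enumerate every single, double, and higher-order combination of mutations."""
    variants = []
    for size in range(1, len(mutations) + 1):
        for combo in combinations(mutations, size):
            variants.append("/".join(combo))
    return variants

variants = all_variants(single_mutations)
print(variants)
```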

    MtKSI.F111D-F GGCGTGGACACCTACCGGGTG

    MtKSI.F111D-R CACCCGGTAGGTGTCCACGCC

    MtKSI.F64Y-F GGCGCCTTCTACGACACACAC

    MtKSI.F64Y-R GTGTGTGTCGTAGAAGGCGCC

    MtKSI.S16Y-F CGCAGTCGTACTGGCGGTGCG

    MtKSI.S16Y-R CGCACCGCCAGTACGACTGCG

    Figure 3.2. Primers for site-directed mutagenesis of MtKSI in plasmid pGST-Rv0760c.

    BL21 DE3 pLysS competent cells were transformed with pGST-Rv0760c encoding

    either WT or variant MtKSI, and after streaking a transformed colony, a single colony was used to

    inoculate 50 mL of LB liquid culture which was grown overnight with 100 μg mL⁻¹ ampicillin.

    The next day, the 50 mL culture was transferred to 500 mL of LB liquid culture with 100 μg mL⁻¹ ampicillin

    and grown with shaking for 2 h at 37 °C. Once an OD of 0.5-0.8 at 600 nm was obtained, the

  • 47

    culture was brought to 0.5 mmol L⁻¹ IPTG to induce expression and agitated at room temperature

    overnight. After overnight growth, the culture was harvested by centrifugation at 6000 RPM for

    10 minutes, suspended in 1X Phosphate Buffered Saline (PBS) pH 7.3 with 1 mM DTT and ½ a

    tablet of Roche Protease Inhibitor cocktail (Buffer A) and stored at -80°C.

    Frozen pellets from the -80 °C freezer were thawed overnight on ice. The suspended,

    thawed cells were subjected to sonication for 2 min (multiple rounds of 10 sec on followed by 10

    sec off) and then clarified by centrifugation at 14,000 rcf for 60 min. The supernatant was

    collected and loaded onto a disposable 4B Sepharose GST column resin (GE Healthcare). The

    column was washed with Buffer A extensively, and then the GST-tagged MtKSI gradually eluted

    with 1 to 10 mM reduced glutathione. Fractions containing MtKSI determined by SDS-PAGE

    were collected and combined with histidine-tagged TEV protease overnight and then dialyzed

    against Buffer A to remove any reduced glutathione. The solution was then run through a 4B

    Sepharose GST column, except this time the KSI was collected in the initial flow through, and

    then filtered onto a Nickel FPLC column to remove the histidine-tagged TEV protease, and the

    MtKSI was collected in the flow through. Fractions containing MtKSI were determined by SDS-

    PAGE, and then concentrated using Viva-spin tubes with a 5000 Da Molecular Weight Cut Off

    (SartoriusStedim biotech) while being exchanged into KSI storage buffer (50 mM NaCl, 10 mM

    Tris-HCl, 1 mM DTT, pH 8.0). Purity was determined by SDS-PAGE and protein concentration

    determined by Bradford Assay against a BSA standard.

    Activity of MtKSI was determined by following the formation of 4-androstene-3,17-dione (4AND)

    from the isomerization of 5-androstene-3,17-dione (5AND), measured at 248 nm on a UV/Vis

    instrument at a fixed enzyme concentration and substrate concentrations varying between 30 and

    300 μM 5AND, while keeping the final methanol concentration at 3.3% v/v (Table 3.2; Figure 3.3).

    Enzyme concentration was fixed at a final concentration of 10 nM from a 1.2 μM stock that was

    made by diluting purified KSI with a dilution buffer (34 mM KCl, 2.5 mM EDTA, 1% BSA, pH 7.0).

    Reactions were blanked with all reagents except the substrate, 5AND. 5AND was then added and

    mixed quickly and completely, and the absorbance at 248 nm was recorded every 0.5 seconds for

    60 seconds, starting after 3 seconds.

    [5AND] final (μM)   [KSI] final (nM)   2X Buffer   Water   Methanol   KSI (1200 nM)   5AND (3 mM)   5AND (10 mM)   Total
    10                  10                 1500        1375    90         25              10            -              3000
    20                  10                 1500        1375    80         25              20            -              3000
    30                  10                 1500        1375    70         25              30            -              3000
    60                  10                 1500        1375    40         25              60            -              3000
    90                  10                 1500        1375    10         25              90            -              3000
    120                 10                 1500        1375    64         25              -             36             3000
    180                 10                 1500        1375    46         25              -             54             3000
    300                 10                 1500        1375    10         25              -             90             3000

    Table 3.2. Kinetic assays for MtKSI. The first two columns give final concentrations; all other numbers are

    volumes in μL added to the cuvette.
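    The final concentrations in Table 3.2 follow from simple dilution arithmetic (stock concentration × volume added / total volume). The sketch below recomputes them from the pipetted volumes. It assumes that the 5AND stocks are dissolved in methanol, so that the substrate volume counts toward the methanol percentage; that assumption is inferred from the constant 3.3% v/v figure and is not stated explicitly in the text. All names are illustrative.

```python
# Recompute the final concentrations in Table 3.2 from the pipetted volumes.
# Stock concentrations and volumes come from the table; names are illustrative.

KSI_STOCK_NM = 1200.0                                  # KSI stock, nM
AND_STOCKS_UM = {"3 mM": 3000.0, "10 mM": 10000.0}     # 5AND stocks, in uM

# (target [5AND] uM, buffer, water, methanol, KSI, 5AND volume, 5AND stock)
# All volumes are in microliters.
ROWS = [
    (10, 1500, 1375, 90, 25, 10, "3 mM"),
    (20, 1500, 1375, 80, 25, 20, "3 mM"),
    (30, 1500, 1375, 70, 25, 30, "3 mM"),
    (60, 1500, 1375, 40, 25, 60, "3 mM"),
    (90, 1500, 1375, 10, 25, 90, "3 mM"),
    (120, 1500, 1375, 64, 25, 36, "10 mM"),
    (180, 1500, 1375, 46, 25, 54, "10 mM"),
    (300, 1500, 1375, 10, 25, 90, "10 mM"),
]


def final_concentrations(buffer, water, methanol, ksi, substrate, stock):
    """Return (total uL, [5AND] uM, [KSI] nM, methanol % v/v) for one cuvette."""
    total = buffer + water + methanol + ksi + substrate
    and_um = substrate * AND_STOCKS_UM[stock] / total
    ksi_nm = ksi * KSI_STOCK_NM / total
    # Assumption: the 5AND stock is in methanol, so its volume counts as methanol.
    methanol_pct = 100.0 * (methanol + substrate) / total
    return total, and_um, ksi_nm, methanol_pct


if __name__ == "__main__":
    for target, *volumes_and_stock in ROWS:
        total, and_um, ksi_nm, meoh = final_concentrations(*volumes_and_stock)
        print(target, total, round(and_um, 1), round(ksi_nm, 1), round(meoh, 2))
```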

    3.4. Results

    Figure 3.3. Standard curve for 4-androstene-3,17-dione (4AND). [Plot: absorbance at 248 nm

    versus 4AND concentration (0-70 μM); linear fits y = 0.0142x - 0.0088 (R² = 0.9994) and

    y = 0.0147x + 0.0076 (R² = 0.9971).]

    A molar extinction coefficient of 1.4 × 10⁴ M⁻¹ cm⁻¹ was used for kinetic analysis. Both

    trials are shown in Figure 3.3.
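    The extinction coefficient converts a measured absorbance slope into an initial velocity through the Beer-Lambert law (A = ε·c·l). The sketch below shows that conversion. It assumes a 1 cm path length, which is not stated in the text, and the example slope is a made-up number used only for illustration.

```python
# Convert an absorbance slope at 248 nm into an initial velocity using the
# Beer-Lambert law, A = epsilon * c * l.

EPSILON_4AND = 1.4e4    # M^-1 cm^-1, value used in the text
PATH_LENGTH_CM = 1.0    # assumption: standard 1 cm cuvette (not stated in the text)


def initial_velocity_um_per_min(slope_abs_per_s: float) -> float:
    """Return V0 in uM min^-1 from a slope in absorbance units per second."""
    molar_per_s = slope_abs_per_s / (EPSILON_4AND * PATH_LENGTH_CM)
    return molar_per_s * 1e6 * 60.0   # M s^-1 -> uM min^-1


if __name__ == "__main__":
    example_slope = 0.0014   # hypothetical slope, absorbance units per second
    v0 = initial_velocity_um_per_min(example_slope)
    print(f"V0 = {v0:.2f} uM/min")
```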

    Figure 3.4. Michaelis-Menten plots for WT MtKSI and variants. Error bars represent standard error of at least three

    independent trials.

    WT and F64Y KSI show increasing V0 with increasing substrate concentration, although

    the V0 values do not approach Vmax because of the poor solubility of the substrate. Therefore,

    all KM values for KSI are reported as KMapp. Vmax can still be extrapolated by non-linear

    regression, but with less accuracy, as reflected in lower regression coefficients and higher

    standard errors.
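    To see how far the measurements stay from saturation, the Michaelis-Menten equation V0 = Vmax[S]/(KM + [S]) can be evaluated at the highest substrate concentration assayed (300 μM), using the fitted KMapp values reported in Table 3.3. This is a minimal illustrative sketch; the function name is not from the original analysis.

```python
def fraction_of_vmax(substrate_um: float, km_um: float) -> float:
    """Michaelis-Menten ratio V0/Vmax = [S] / (KM + [S])."""
    return substrate_um / (km_um + substrate_um)


KM_APP_UM = {"WT": 453.6, "F64Y": 231.1}   # fitted KMapp values from Table 3.3, uM
MAX_SUBSTRATE_UM = 300.0                   # highest 5AND concentration assayed


if __name__ == "__main__":
    for variant, km in KM_APP_UM.items():
        frac = fraction_of_vmax(MAX_SUBSTRATE_UM, km)
        print(f"{variant}: V0 is about {100 * frac:.0f}% of Vmax at 300 uM 5AND")
```

    With these fitted values, even the highest substrate concentration reaches only roughly 40% (WT) and 56% (F64Y) of the extrapolated Vmax, which is why the plateau is never observed directly.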

    Figure 3.5. Individual Michaelis-Menten plots for WT MtKSI and MtKSI F64Y.

    Any variant containing S16Y and/or F111D, however, shows significantly diminished activity

    and does not display classic Michaelis-Menten behavior. Purification of MtKSI-F111D was

    problematic, with low yields and loss of protein after concentration. Only a small amount of

    data could be obtained for F111D, but its behavior does not appear to deviate from that of

    the other non-F64Y mutants.

    MtKSI Variant      Vmax (μM min⁻¹)    KMapp (μM)       R²
    WT                 63.67 (29.08)      453.6 (297.3)    0.7611
    S16Y               10.04 (25.41)      1607 (4635)      0.5909
    F64Y               57.83 (3.993)      231.1 (27.89)    0.98
    S16Y/F64Y          18.65 (29.19)      2485 (4247)      0.8972
    S16Y/F111D         4.094 (1.148)      273.4 (127)      0.8289
    F64Y/F111D         0.3873 (0.104)     3.235 (20.3)     0.2783
    S16Y/F64Y/F111D    3.821 (11.98)      1674 (5973)      0.5009

    Table 3.3. Vmax and KMapp for WT MtKSI and variants. Standard errors are in parentheses and are based on

    at least three independent trials.

    Trial      WT V0 (μM min⁻¹)    F111D V0 (μM min⁻¹)
    Trial 1    11.8                0.68
    Trial 2    4.9                 0.70
    Trial 3    11.8                0.88
    Trial 4    13.8                0.85

    Table 3.4. Comparison between the WT MtKSI and F111D variant at 90 μM 5AND.

    Figure 3.6. Single experiment of Michaelis-Menten plot for MtKSI F111D.

    MtKSI Variant      kcat (s⁻¹)    KMapp (μM)     Catalytic efficiency (10³ M⁻¹ s⁻¹)    Fold decrease to WT
    WT                 106 (48)      454 (297)      234 (187)                             --
    S16Y               17 (42)       1607 (4635)    10 (40)                               23 (88)
    F64Y               96 (6.7)      231 (28)       417 (58)                              0.6 (0.45)
    F111D              1.5 (--)      70 (--)        36 (--)                               6.4 (5.15)
    S16Y/F64Y          31 (49)       2485 (4247)    13 (29)                               18.7 (46)
    S16Y/F111D         6.8 (1.9)     273 (127)      25 (14)                               9.4 (9.1)
    F64Y/F111D         0.6 (0.2)     3.2 (20)       200 (1253)                            1.2 (7.4)
    S16Y/F64Y/F111D    6.4 (20)      1674 (5973)    4 (18)                                62 (296)

    Table 3.5. Catalytic efficiency for WT MtKSI and variants. Where available, standard errors are in

    parentheses and are based on at least three independent trials.

    For comparison, in the units of Table 3.5, PpKSI’s catalytic efficiency is 100,000 × 10³ M⁻¹ s⁻¹.

    F64Y retained the same kcat while having a lower KMapp, giving it a higher catalytic efficiency

    than the WT. All other mutants lack the signal-to-noise ratio required for an accurate analysis

    of their Michaelis-Menten parameters or catalytic efficiency.

    For any variant tested besides F64Y, the Michaelis-Menten parameters KM and kcat could not

    be determined reliably, as shown by standard errors that exceed the fitted values themselves for

    most of these variants. Enzyme efficiencies may nevertheless be compared without separating the

    KM and kcat variables. If the concentration of substrate is negligible compared to the KM, the

    additive substrate-concentration term in the denominator of the Michaelis-Menten equation can be

    dropped. Assuming [S] ≪ KM:

    $$V_0 = \frac{k_{cat}[E][S]}{K_M + [S]} \approx \frac{k_{cat}}{K_M}[E][S]$$

    MtKSI Variant      V0             Fold decrease to WT
    WT                 3.87 (1.4)     --
    S16Y               0.27 (0.24)    14.3 (13.7)
    F64Y               6.80 (3.3)     0.6 (0.35)
    F111D              0.41 (--)      9.4 (--)
    S16Y/F64Y          0.26 (0.02)    14.8 (5.7)
    S16Y/F111D         0.47 (0.28)    8.2 (5.8)
    F64Y/F111D         0.41 (0.23)    9.4 (6.4)
    S16Y/F64Y/F111D    0.42 (--)      9.2 (--)

    Table 3.6. WT MtKSI and variants compared solely based on initial velocities at 30 μM 5AND. Where

    available, standard deviations are in parentheses and represent at least three independent trials.

    These results report only ratios of catalytic efficiency, without examining kcat or KMapp

    individually. The F64Y results are similar between this method and the full Michaelis-Menten

    kinetic analysis.
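    When the substrate concentration is well below KM, V0 is proportional to kcat/KM at fixed enzyme and substrate concentrations, so a ratio of initial velocities estimates a ratio of catalytic efficiencies. The sketch below recomputes two of the fold decreases in Table 3.6. The uncertainty uses the standard relative-error formula for a quotient, which is an assumption about how the tabulated uncertainties were obtained; the names are illustrative.

```python
import math


def fold_decrease(v0_wt, sd_wt, v0_var, sd_var):
    """Return (fold decrease relative to WT, propagated uncertainty)."""
    fold = v0_wt / v0_var
    # Relative uncertainties of a quotient add in quadrature.
    rel = math.sqrt((sd_wt / v0_wt) ** 2 + (sd_var / v0_var) ** 2)
    return fold, fold * rel


WT = (3.87, 1.4)   # V0 and its uncertainty for WT, from Table 3.6
VARIANTS = {"S16Y": (0.27, 0.24), "F64Y": (6.80, 3.3)}   # from Table 3.6


if __name__ == "__main__":
    for name, (v0, sd) in VARIANTS.items():
        fold, err = fold_decrease(WT[0], WT[1], v0, sd)
        print(f"{name}: fold decrease {fold:.2f} ({err:.2f})")
```

    Because the inputs are the rounded values printed in the table, the results can differ from the tabulated fold decreases in the last digit.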


    3.5. Conclusions

    With the exception of F64Y, every mutation tested in MtKSI, alone or in combination, produced

    a variant with little to no KSI activity on 5AND, and proper Michaelis-Menten curves could not

    be constructed for these variants. Why did a more “PpKSI-like” active site not increase the

    catalytic efficiency toward that of PpKSI and CtKSI?

    Figure 3.7. “Top-down” view of PpKSI (PDB ID: 1OHO; Red), CtKSI (PDB ID: 8CHO; Orange), and

    MtKSI (PDB ID: 2Z76; Yellow).

    The steroid-binding pocket and active site are at the front of Figure 3.7. The left group of

    residues is Y57, Y55, and F64 in PpKSI, CtKSI, and MtKSI, respectively. The top group is Y16,

    Y14, and S16, and the right group is D103, D99, and F111, in the same order. MtKSI-F64,

    while spatially aligned with PpKSI-Y57 in many structural alignments, is actually swiveled

    almost 180° away from where the phenol group points in either PpKSI or CtKSI (Figure 3.7).

    There are few possible replacements for PpKSI-Y57, and MtKSI-Y113 is too far away to take

    over its role[68]. This seems to be a limitation of structural alignments more than of SALSA,

    but it calls attention to the importance of human verification. In this respect, it makes sense

    for the F64Y variant to have unmodified catalytic activity.

    Figure 3.8. Three residues of interest in PpKSI (PDB ID: 1OHO; Red), CtKSI (PDB ID: 8CHO; Orange),

    and MtKSI (PDB ID: 2Z76; Yellow) without surrounding secondary structure.

    F111 and S16 from MtKSI overlap well with their SALSA partners in PpKSI and CtKSI

    (Figure 3.8). However, mutations that made the side chains similar resulted in loss of activity.

    The natural substrate for MtKSI is unknown. Because the position occupied by Y16 in PpKSI and

    Y14 in CtKSI is used in recognition of the steroid ligand[68], MtKSI could very well act on a

    different steroid. If the natural substrate for MtKSI is a different steroid, this would explain

    both the reduced catalytic efficiency on 5AND and the sensitivity to changes in the binding

    recognition pocket.

    The identification of F111 in MtKSI as spatially equivalent to PpKSI-D103 does not

    seem to be an alignment error; no residues in the MtKSI structure seem capable of replicating

    the essential catalytic role of D99/D103. Indeed, the authors of the crystal structure report

    used this to argue against Rv0760c having KSI activity and reported no activity on 5AND[68].

    The peculiarity of the POOL and SALSA predictions stands out after these results. What

    does it mean for an enzyme not only to have a strikingly different residue at a catalytic

    position, but also for that residue not to be predicted as active by POOL? Clearly this does

    not rule out a particular functional activity, such as ketosteroid isomerization of 5AND, but it

    may correspond to different substrate recognition, or even a different mechanism.

    How many differences are required to declare two enzymes to have different functions,

    and how many similarities must there be before they are declared similar? This is a current area

    of investigation[71].


    Chapter 4. Future Work

    4.1. POOL-rank cut-offs

    THEMATICS is a Boolean predictor giving either a yes or no for each residue in a

    protein structure. In contrast, POOL assigns a ranking to every residue in a given protein

    structure, and it is up to the user to determine what cut-off to implement for best results.

    Traditionally, the top 5, 8, or even 10% of POOL predictions are considered to be positive

    predictions[26, 27, 30, 32, 33, 71]. There remain two open questions:

    1) Should POOL prediction be based on percentage or POOL score?

    2) Can we use POOL to predict single-layer vs. multi-layer enzyme active sites?

    Recent work has shown convincingly that a POOL normalized score cut-off is superior to

    flat percentage cut-offs. By itself, a percentage cut-off makes the odd assumption that the

    number of residues participating in an active site is directly proportional to the total number

    of amino acids. Assigning an absolute cut-off on the normalized POOL score (such as 0.01)

    seems more rational, and is currently being investigated as the next-generation cut-off for POOL.
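    The difference between the two cut-off styles can be illustrated with a small sketch. The scores below are synthetic and hypothetical (they are not POOL output), and the helper names are illustrative. The point is only that a percentage cut-off selects more residues as the protein gets longer, whereas an absolute score cut-off does not.

```python
import math


def top_percent(scores, percent):
    """Indices of the top `percent` % of residues by score (at least one)."""
    n_keep = max(1, math.ceil(len(scores) * percent / 100.0))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:n_keep]


def above_score(scores, cutoff):
    """Indices of residues whose normalized score is at or above `cutoff`."""
    return [i for i, s in enumerate(scores) if s >= cutoff]


def synthetic_scores(n_residues, n_active=6):
    """Hypothetical normalized scores: a few high residues, the rest near zero."""
    active = [1.0 / (rank + 1) for rank in range(n_active)]   # 1.0, 0.5, 0.33, ...
    background = [0.001] * (n_residues - n_active)
    return active + background


if __name__ == "__main__":
    for length in (100, 500):
        scores = synthetic_scores(length)
        n_pct = len(top_percent(scores, 5))
        n_abs = len(above_score(scores, 0.01))
        print(f"{length} residues: top 5% selects {n_pct}, score >= 0.01 selects {n_abs}")
```

    In this toy case both proteins have the same six high-scoring residues, yet the 5% rule selects 5 residues from the 100-residue protein and 25 from the 500-residue one, while the score cut-off selects 6 in both.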

    The second question is much more difficult and is central to our work on active-site

    catalysis, engineering, and understanding. It has been shown that there are some multi-layer

    active site enzymes[31-33] and some single-layer active site enzymes[32][this work]. How to

    distinguish the two easily, let alone without examining each predicted residue, is still under

    discussion. It has been proposed that the shape of the POOL plot itself may provide

    predictive power regarding the extent of an enzyme’s active site. This hypothesis comes from an

    empirical observation, across the few proteins studied so far, that the POOL plots seem to drop

    much more sharply for enzymes with single-layer active sites than for those with multi-layer

    active sites (Figure 4.1).

    The single-layered active site proteins alkaline phosphatase and ketosteroid isomerase show

    sharp decreases immediately, flat-lining by the 10th residue for AP and by the 5th residue

    for KSI. The multi-layered active site proteins phosphoglucose isomerase, cobalt-type nitrile

    hydratase, the α subunit of pol III (DnaE), and pol IV (DinB) have extended tails on their POOL

    plots and start flat-lining farther out than AP and KSI.
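    One hypothetical way to put a number on "flat-lining" is to report the first rank at which the normalized score falls below a small floor. The sketch below is only an illustration of that idea, not a method used in this work. The floor value and both score lists are invented and are not read from Figure 4.1.

```python
def flatline_rank(scores_by_rank, floor=0.05):
    """Return the first 1-based rank whose score is below `floor`, else None."""
    for rank, score in enumerate(scores_by_rank, start=1):
        if score < floor:
            return rank
    return None


# Hypothetical normalized scores ordered by rank (not taken from Figure 4.1).
SHARP_DROP = [1.0, 0.6, 0.2, 0.08, 0.03, 0.02, 0.01, 0.01]
LONG_TAIL = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1, 0.07, 0.04, 0.02]


if __name__ == "__main__":
    print("sharp drop flat-lines at rank", flatline_rank(SHARP_DROP))
    print("long tail flat-lines at rank", flatline_rank(LONG_TAIL))
```

    A profile that flat-lines at a low rank would correspond to the compact, single-layer pattern described above, and a later flat-line rank to the extended, multi-layer pattern.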

    Figure 4.1. POOL plots for AP, KSI, PGI, NH, DnaE and DinB. [Six panels, one per protein,

    each plotting normalized POOL score (0 to 1) against POOL rank (0 to 20).]

    Appendix A. Propagation of error in calculating catalytic efficiency

    Both the Vmax value and the KM value are calculated with their respective standard errors

    based on the inputs to the GraphPad Prism program. Vmax values are converted to kcat values by

    the following transformation, where kcat is in s⁻¹, Vmax is in μM min⁻¹, and [enzyme] is in μM:

    $$k_{cat} = \frac{V_{max}}{60 \times [\text{enzyme}]}$$

    AP Variant [e] (μM)

    WT 0.002

    H372D 0.011

    H372L 0.011

    H86L 0.0083

    S105A 0.0018

    E57Q 0.006

    M53A 0.006

    M53T 0.002

    D330N 0.002

    Q453E 0.008

    E150Q 0.0023

    Table A.1. Concentrations of enzymes used to gather kinetic data for alkaline phosphatase.

    All MtKSI kinetic experiments were done with 0.010 μM enzyme. Enzymes were diluted

    from stock concentrations measured by Bradford assays using a BSA standard. Catalytic

    efficiency is defined as kcat divided by KM. To propagate the error in each measurement, I

    used (where σx is the standard error of variable x):

    $$\frac{\sigma_Z}{Z} = \sqrt{\left(\frac{\sigma_X}{X}\right)^2 + \left(\frac{\sigma_Y}{Y}\right)^2}$$

    where in our case Z is catalytic efficiency, X is kcat, and Y is KM.
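    As a worked check of the procedure in this appendix, the sketch below converts Vmax to kcat at 0.010 μM enzyme and propagates the standard errors into the catalytic efficiency, using the WT and F64Y values from Table 3.3. It assumes the enzyme concentration carries no uncertainty, so the standard error of kcat scales by the same factor as Vmax. Function names are illustrative.

```python
import math

ENZYME_UM = 0.010   # MtKSI concentration in all kinetic experiments, uM


def kcat_from_vmax(vmax_um_per_min, enzyme_um=ENZYME_UM):
    """Convert Vmax (uM min^-1) to kcat (s^-1)."""
    return vmax_um_per_min / (60.0 * enzyme_um)


def catalytic_efficiency(vmax, vmax_se, km, km_se, enzyme_um=ENZYME_UM):
    """Return (kcat/KM, standard error), both in units of 10^3 M^-1 s^-1."""
    kcat = kcat_from_vmax(vmax, enzyme_um)
    # Assumption: [enzyme] is exact, so the SE scales exactly like Vmax.
    kcat_se = kcat_from_vmax(vmax_se, enzyme_um)
    eff = kcat / km * 1000.0   # uM^-1 s^-1 -> 10^3 M^-1 s^-1
    rel = math.sqrt((kcat_se / kcat) ** 2 + (km_se / km) ** 2)
    return eff, eff * rel


if __name__ == "__main__":
    # (Vmax, SE, KMapp, SE) from Table 3.3
    table_3_3 = {
        "WT": (63.67, 29.08, 453.6, 297.3),
        "F64Y": (57.83, 3.993, 231.1, 27.89),
    }
    for name, values in table_3_3.items():
        eff, se = catalytic_efficiency(*values)
        print(f"{name}: kcat/KM = {eff:.0f} ({se:.0f}) x 10^3 M^-1 s^-1")
```

    The results agree, after rounding, with the WT and F64Y catalytic efficiencies listed in Table 3.5.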


    References

    1. Axe, D.D., Estimating the prevalence of protein sequences adopting functional enzyme

    folds. J Mol Biol, 2004. 341(5): p. 1295-315.

    2. Pauling, L., Molecular architecture and biological reactions. Chem. Eng. News, 1946.

    24(10): p. 1375-1377.

    3. Garcia-Viloca, M., et al., How enzymes work: analysis by modern rate theory and

    computer simulations. Science, 2004. 303(5655): p. 186-95.

    4. Fischer, E., Einfluss der Configuration auf die Wirkung der Enzyme. Berichte der

    deutschen chemischen Gesellschaft, 1894. 27(3): p. 2985-2993.

    5. Koshland, D.E., Application of a Theory of Enzyme Specificity to Protein Synthesis.

    Proceedings of the National Academy of Sciences, 1958. 44(2): p. 98-104.

    6. Damborsky, J. and J. Brezovsky, Computational tools for designing and engineering

    biocatalysts. Curr Opin Chem Biol, 2009. 13(1): p. 26-34.

    7. Gora, A., J. Brezovsky, and J. Damborsky, Gates of Enzymes. Chemical Reviews, 2013.

    8. Crick, F., Central Dogma of Molecular Biology. Nature, 1970. 227(5258): p. 561-563.

    9. Brannigan, J.A. and A.J. Wilkinson, Protein engineering 20 years on. Nat Rev Mol Cell

    Biol, 2002. 3(12): p. 964-70.

    10. Siegel, J.B., et al., Computational design of an enzyme catalyst for a stereoselective

    bimolecular Diels-Alder reaction. Science, 2010. 329(5989): p. 309-13.

    11. Rothlisberger, D., et al., Kemp elimination catalysts by computational enzyme design.

    Nature, 2008. 453(7192): p. 190-5.

    12. Walter, K.U., K. Vamvaca, and D. Hilvert, An active enzyme constructed from a 9-amino

    acid alphabet. J Biol Chem, 2005. 280(45): p. 37742-6.

    13. Kuhlman, B., et al., Design of a novel globular protein fold with atomic-level accuracy.

    Science, 2003. 302(5649): p. 1364-8.

    14. Hilvert, D., Design of protein catalysts. Annu Rev Biochem, 2013. 82: p. 447-70.

    15. Turner, N.J., Directed evolution drives the next generation of biocatalysts. Nat Chem

    Biol, 2009. 5(8): p. 567-73.

    16. Jackel, C. and D. Hilvert, Biocatalysts by evolution. Curr Opin Biotechnol, 2010. 21(6):

    p. 753-9.

    17. Ondrechen, M.J., J.G. Clifton, and D. Ringe, THEMATICS: a simple computational

    predictor of enzyme function from structure. Proc Natl Acad Sci U S A, 2001. 98(22): p.

    12473-8.

    18. Shehadi, I.A., H. Yang, and M.J. Ondrechen, Future directions in protein function

    prediction. Mol Biol Rep, 2002. 29(4): p. 329-35.

    19. Shehadi, I.A., et al., Active site prediction for comparative model structures with

    thematics. J Bioinform Comput Biol, 2005. 3(1): p. 127-43.

    20. Shehadi, I.A., et al., THEMATICS is effective for active site prediction in comparative

    model structures, in Proceedings of the second conference on Asia-Pacific bioinformatics

    - Volume 29. 2004, Australian Computer Society, Inc.: Dunedin, New Zealand. p. 209-215.

    21. Ringe, D., et al., Protein structure to function: insights from computation. Cell Mol Life

    Sci, 2004. 61(4): p. 387-92.


    22. Ko, J., et al., Prediction of active sites for protein structures from computed chemical

    properties. Bioinformatics, 2005. 21 Suppl 1: p. i258-65.

    23. Wei, Y., et al., Selective prediction of interaction sites in protein structures with

    THEMATICS. BMC Bioinformatics, 2007. 8(1): p. 119.

    24. Ko, J., et al., Statistical criteria for the identification of protein active sites using

    theoretical microscopic titration curves. Proteins: Structure, Function, and

    Bioinformatics, 2005. 59(2): p. 183-195.

    25. Tong, W., et al., Enhanced performance in prediction of protein active sites with

    THEMATICS and support vector machines. Protein Sci, 2008. 17(2): p. 333-41.

    26. Tong, W., et al., Partial Order Optimum Likelihood (POOL): Maximum Likelihood

    Prediction of Protein Active Site Residues Using 3D Structure and Sequence Properties.

    PLoS Comput Biol, 2009. 5(1): p. e1000266.

    27. Somarowthu, S., et al., High-performance prediction of functional residues in proteins

    with machine learning and computed input features. Biopolymers, 2011. 95(6): p. 390-400.

    28. Capra, J.A., et al., Predicting Protein Ligand Binding Sites by Combining Evolutionary

    Sequence Conservation and 3D Structure. PLoS Comput Biol, 2009. 5(12): p. e1000585.

    29. Sankararaman, S. and K. Sjölander, INTREPID—INformation-theoretic TREe traversal

    for Protein functional site IDentification. Bioinformatics, 2008. 24(21): p. 2445-2452.

    30. Somarowthu, S. and M.J. Ondrechen, POOL server: machine learning application for

    functional site prediction in proteins. Bioinformatics, 2012. 28(15): p. 2078-2079.

    31. Brodkin, H.R., et al., Evidence of the participation of remote residues in the catalytic

    activity of Co-type nitrile hydratase from Pseudomonas putida. Biochemistry, 2011.

    50(22): p. 4923-35.

    32. Somarowthu, S., et al., A tale of two isomerases: compact versus extended active sites in

    ketosteroid isomerase and phosphoglucose isomerase. Biochemistry, 2011. 50(43): p.

    9283-95.

    33. Walsh, J.M., et al., Effects of non-catalytic, distal amino acid residues on activity of E.

    coli DinB (DNA polymerase IV). Environ Mol Mutagen, 2012. 53(9): p. 766-76.

    34. Muller, B.H., et al., Improving Escherichia coli alkaline phosphatase efficacy by

    additional mutations inside and outside the catalytic pocket. Chembiochem, 2001. 2(7-8):

    p. 517-23.

    35. Coleman, J.E., Structure and mechanism of alkaline phosphatase. Annu Rev Biophys

    Biomol Struct, 1992. 21: p. 441-83.

    36. Lassila, J.K., J.G. Zalatan, and D. Herschlag, Biological Phosphoryl-Transfer Reactions:

    Understanding Mechanism and Catalysis. Annu Rev Biochem, 2011. 80(1): p. 669-702.

    37. Andrews, L.D., T.D. Fenn, and D. Herschlag, Ground State Destabilization by Anionic

    Nucleophiles Contributes to the Activity of Phosphoryl Transfer Enzymes. PLoS Biol,

    2013. 11(7): p. e1001599.

    38. Andrews, L.D., H. Deng, and D. Herschlag, Isotope-edited FTIR of alkaline phosphatase

    resolves paradoxical ligand binding properties and suggests a role for ground-state

    destabilization. J Am Chem Soc, 2011. 133(30): p. 11621-31.

    39. Kim, E.E. and H.W. Wyckoff, Reaction mechanism of alkaline phosphatase based on

    crystal structures. Two-metal ion catalysis. J Mol Biol, 1991. 218(2): p. 449-64.


    40. Stec, B., K.M. Holtz, and E.R. Kantrowitz, A revised mechanism for the alkaline

    phosphatase reaction involving three metal ions. J Mol Biol, 2000. 299(5): p. 1303-11.

    41. Murphy, J.E., X. Xu, and E.R. Kantrowitz, Conversion of a magnesium binding site into

    a zinc binding site by a single amino acid substitution in Escherichia coli alkaline