Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
27
CHAPTER 2
2. MOLECULAR MODELING
Drug discovery is a multidisciplinary approach where drugs are designed
and/or discovered. The drug design & discovery (DDD) process involves
the candidate identification, synthesis, screening studies, and pre-
clinical and clinical studies. The total expenditure for R&D to bring a
molecule (new chemical entity) to market (drug) is estimated to be ~ $1.8
billion dollars and time involved in drug discovery is ~12-15 years with
low rate of new discoveries. The lack in efficiency in converting lead
molecule to preclinical candidate and the ratio of converting of candidate
molecule from phase II to phase III trials are major issues.81 This
situation demands an alternate strategies to cut down the cost and time
involved in the drug discovery and as well as increase the success rate.
2.1 Rational Drug Design (RDD)
A quick look into the recent drug discoveries in the pharmaceutical
industry were there by luck and most of them by chance; also traditional
approaches fails to explain why any compound is resulting into active or
inactive and similarly how to optimize the compound, so that we can
enhance its activity and as well as its properties such as solubility. The
knowledge based approaches (RDD) concept originally developed from the
finding of chemoreceptor concept (Paul Ehrlich, 1872) and lock and key
concept (Emil Fischer, 1894) respectively.82 The further development in
28
the area of molecular biology, bioinformatics tools, crystallography leads
to the development of RDD approaches.
The development of pathophysiology and pathogen biochemistry,
leads to the development of rational approaches towards the drug design.
In early 1900s,Combinatorial chemistry approach made a lager impact in
the drug research and developmental strategiesarea and unlocked the
chemical-bottleneckduring optimization of lead moleculefrom “what can
we make” to “which should we make”.83-85In generalized manner, RDD
(Rational drug design) approach starts with the detection of key
biomolecular target(protein) and the associated affected biochemical
pathways due to the protein over-expression/malfunction and followed
by the designing of small molecules based on the lock & key approach
and which leads to modifying/inhibiting the function of these protein
target.Figure 2.1shows the different strategies involved in the RDD
inorder to identify the novel and potent NCE’s.
Figure 2.1: Different
Role of Computer Aided Molecular Design in Drug
The computer
knowledge based models and used to screen and eliminate many
irrelevant compounds and will help in the identification of new leads.
Ability of the identification of structurally diverse comp
binding affinities that can be correlate its biological activities are the
crucial for in any lead discovery programs/approaches. Many
computational approaches and algorithms are developed for
29
Different Strategies of Rational Drug Design
Role of Computer Aided Molecular Design in Drug Discovery
he computer-aided approaches are used to generate the
knowledge based models and used to screen and eliminate many
irrelevant compounds and will help in the identification of new leads.
Ability of the identification of structurally diverse compounds with good
binding affinities that can be correlate its biological activities are the
crucial for in any lead discovery programs/approaches. Many
computational approaches and algorithms are developed for
Discovery
aided approaches are used to generate the
knowledge based models and used to screen and eliminate many
irrelevant compounds and will help in the identification of new leads.
ounds with good
binding affinities that can be correlate its biological activities are the
crucial for in any lead discovery programs/approaches. Many
computational approaches and algorithms are developed for
30
combinational chemistry to generate a library with molecular diversity,
available vendor databases and fragment based approaches can be used
to generate an in-house library and these databases can be used for
screening purpose. Target based/structure based approaches are one of
the important approach in RDD, it’s about protein structure,
identification of binding site and analysis and docking approaches to
identify the binding mode of the newly designed compound. One of the
major issues in the SBDD is accurate prediction of the binding mode and
its binding affinities are difficult.86 The thermodynamics properties of
binding cannot be modeled analytically without simplifying the entropic
and associated energetic factors.87
RDDApproaches
Rational design approaches (RDD) are divided based on the
availability/existence of 3D-structure of the biomolecular target. These
are Analog based strategies (ABDD) and Structure based strategies
(SBDD).
2.1.1 Analog Based Studies
In analogue based strategies, it will explore the already
known/reported molecules against the biomolecular target and activity.
Based on the analysis of this reported compounds, set of rules
(hypothesis) are developed to identify/design a NCE's and also suggest
31
the modifications to existing compounds. Analog based approach mainly
depends on the available data (known/reported compounds with specific
activity against that particular target).
2.1.1.1 Quantitative Structure Activity Relationship (QSAR)
QSAR is the most common approach in ABDD. It generates the
correlation between the molecular descriptors (physico-chemical and
structural properties) of known/reported compounds and their
corresponding experimental activities. These generated equations
(correlations) are validated with known compounds and later this
information used in the design of potent NCEs and to optimize the
activity of the already known compounds.
At the earlier stages of QSAR development, used to study the effect
of the functional groups (presence/absence) in a series of compounds
(Free-Wilson model) with its experimental activity and later they used to
compare with the physico-chemical properties (Molecular Descriptors
such as., electronic properties, lipophilicity) of the molecules with its
experimental activity of the dataset (Hansch analysis). 3D-QSAR methods
are developed.88 To generate QSAR equation; (1) Enough no of active
ligands against particular target with structure and activity diverse, to
32
develop the structure activity relationships (SAR); (2) the equations
generated from QSAR, parameterized only for the one specific targetand
(3) also address molecular properties (Descriptors) influencing the
activity and how modify those properties.
2.1.1.1.1 Pharmacophore Model Generation
Pharmacophore modeling is another approach of theanalog based
method. The word “pharmacophore” was coined by Paul Ehrlich as the
skeleton (framework)carrying the crucial features (phoros) necessary for
biological activity of a compound (pharmacon). Soon after in 1977,Peter
Gund defined it as “a set of features (structural) in a molecule that is
recognized at a receptor site and is responsible for that molecule’s
biological activity”. Using reported compounds with its biological activity,
pharmacophore models are constructed and are refined using further
data. In addition to the ligand-based models, pharmacophore model can
constructed using from the known receptor-ligand interactions
(Structure based).In a similar manner, dynamic pharmacophore model
can be generated using molecular dynamics trajectories,which consider
the binding site dynamics (flexibility).89 The generated pharmacophore
models (ligand/structure-based)are used in the optimization of the
known compounds and also to for screening databases in order to get
new leads suitable for further development.90
33
Pharmacophore features
1. Hydrophobic
2. Hydrophobic aliphatic
3. Hydrophobic aromatic
4. Hydrogen bond acceptor
5. Hydrogen bond donor
6. Positive ionizable
7. Negative ionizable
8. Ring aromatic
Manual Pharmacophore Generation: Visual Pattern Recognition
Steps for the development of Pharmacophore model
1 Manualassessment and identification of the common features
shared by a bunch of activecompounds and those features
which are absent in the inactive compounds
2 Identification of common features its spatial arrangements with
other models
3 Improvement of a manual Pharmacophore models and followed
by the validation study of the generated models,to study the
pharmacophore mapping to the active ones and the followed by
the mapping to the inactive compounds.
34
4 Generation of Quantitative Pharmacophore modelsusing the
compounds by known biological activity (IC50), until the
preferred model is obtained.
Compound is considered as inactive, when
a) There is a short of the important groups necessary for
biomolecular target recognition, which are not completely mapping
on to the essential Pharmacophore.
b) Though it contains the pharmacophore features, nevertheless it
also contains the unnecessary groups which obstruct with target
recognition/binding and that can be optimized using these
pharmacophore models.
c) It can be less soluble when compared to its bioactive conformation
of the compound.
d) It has bulky groups which will sterically stop them to interact with
the target biomolecule, it’s an additional asset that might be
detected by these models.
If the protein crystallographic structure is reported and the complete
information about the active site and ligand binding pattern is well
explored, thenthe medicinal chemist should depends on the structure
activity for the ligandsgiven. If all the givendataset compounds are active
35
against the receptor and by studying the interaction pattern, we can
describe the commonality between the data set.
The Pharmacophore model can be used for
� To generate new ideas (search for a new candidate)
� Rank/prioritizing hits
� Develop the virtual compound library
� Estimate the activity
� Using the resulting alignments for further studies
� Finalized models are considered asqueries for mining
Using Catalystpackage (Accelry’s Suite), two types of hypotheses or
pharmacophore models can be generated, based on the
availability/accessibility to the reported compounds and its activities.91
While activity is not considered and needs searchfor common
chemical features present in the set of compounds, in that caseHipHop
module in Catalyst can be used for this function.
If activity data is availablefor set compounds, in that case HypoGen
modulein Catalyst is used. Quantitative (Feature based) models can be
36
generated by means of HypoGen can be used as query for virtual
screening process to recognize as potential hits.
HipHop: (Qualitative/Common feature based)
Hypothesis, or Pharmacophore model, contains of a 3D-arrangement of
chemical features encircled by spheres. Each sphere has a tolerance,
defined by the area in cartesian space which can be occupied by a
particular chemical significance. Each and every feature is assigned with
a weight that indicates its significance within the hypothesis. Whereas,
feature with high weight, indicates that the feature is responsible for the
activity compare other features of the hypothesis.
While the data set (ligand) is small (<15 compounds), with
biological activity range also not sufficient, one can generate a
hypotheses based on the common feature alignments with HipHop
module inCatalyst suite.91,92 HipHop module generates a common
chemical features pharmacophore models share by a set of actives.
HipHop findsthe 3D-dimensional spatial arrangements of chemical
features for common set of actives. The configurations are recognized by
pruned exhaustive search,it begins with the small feature combination
37
and widen them till no further configuration is found. Onehas to define
the number of actives that should map the features entirely or number of
actives can moderately ignore the feature mapping. On basis of the user-
defined options and chosen features, it generates the simplest to varied
hypotheses.
� Principle
– It represents the top active/reference compound(s)
– reference configuration models
0 don’t consider these compounds
1 consider this compounds configurations
2 it consider this as active/reference compound
– only for generation of HipHop model
� Max Omit Features
– It specifies number of features can be omitted for each one
0 map all features
1 all features but at least one/more feature should map to
hypotheses
2 no need of mapping all the features
38
– only for generation of HipHop model
HypoGen: (Quantitative Pharmacophore Models)
It generatesa variety of Structure Activity Relationship (SAR)
hypothesis models by making use of data set moleculeswithits
experimental activity data. HypoGen creates many hypotheses models
and recognizes the features which are commonly sharedin the actives,
but they are not shared by the inactives. The top pharmacophore model
is taken for the prediction of the activity of unknown compounds or to
look for new possible leads contained in 3D chemical databases.93,94
HypoGen generates hypotheses with a set of chemical features in
3D space, each feature contains a certain weight and tolerance that fit
entirely to the hypotheses features of the training set, and the
experimental activity is correlated to it.
The HypoGen has three phases:
1. Constructive phase
2. Subtractive phase
3. Optimization phase
In the constructive phase, it locates pharmacophore models that
are common and shared by actives, while the subtractive phase
withdraws pharmacophore models which are common to the inactives,
and lastly the optimization phase will develop the initial models. The
39
resulting models should contain a set of best features with regression
data. The validated models are utilized as queries in screening to explore
new hits (Figure 2.2) from a virtual database.
Figure 2.2: Pharmacophore guided lead optimization
Running HypoGen
Preparation of Training set:
� Training set should contain minimum of 16 molecules to assure
the statistical power of the model
� Experimental Activities rangeshould be minimum of 4 orders
magnitude
� It should contain 3-4 compounds (every order of magnitude)
40
� No redundant information in activity & No problems in
excluded volume
HypoGen goes through3 phases, a constructive, subtractive &
optimization phase (shown in Figure 2.3).
HypoGen calculates the cost of two observed hypotheses,Fixedcost
(minimum cost), and the Null cost (high cost). Ideal hypothesis cost
should contain the value between these two cost values and it must be
close to the Fixed (Ideal) cost compare to the Null cost.
As per the pharmacophoreguidelines and studies shows that if the
cost difference between a returned hypothesis and the Null hypothesis is
around 40-60 bits, there is a 75-90% chance of true correlation.
Another useful number in the pharmacophore is the Entropy of
hypothesis. It should bebelow 17, a careful inspection of pharmacophore
models will be necessary to select the best model out of it.
Constructive phase
The concept of constructive phase is same as HipHopalgorithm.
Steps:
1) It consider the top 2 actives
41
2) Hypotheses (maximum 5 features) from the top two
activeswill begenerated and stored
3) the fit (values) the other actives are same
Subtractive phase
The hypotheses models generated during the constructive phase
are examined and if any models found common to the most of the
inactives, they will be removed.In this phase, the program deletes all
hypotheses which areunlikely to be of use.
Figure 2.3: Hypotheses generation
42
Optimization phase
During the optimization phase, itutilizes small adjustments to the
hypotheses features formed in the constructive phase, in order to
enhance score and regression values.During the optimization phase, it
implements simulated annealing algorithm.
HypoRefine
This algorithm is an additional algorithm for Catalyst HypoGen algorithm
for making Structure activity relationship (SAR)-based pharmacophore
models required to predict the activities of new molecules. HypoRefine
assist to develop the predictive hypothesis models with improved
regression data that correlates hypothesis which has steric properties.
Besides, it can assist to overcome prediction of inactives (false positives)
with pharmacophore features which are common to the other actives
present in the dataset.
Analysis of cost parameters from the HypoGen output files
During the automated hypothesis generation runs, Catalyst
generates and discards no of models. It differentiates the models by
applying costparameters. The general assumption is based on Occam’s
razor; the simplest model is best with good regression parameters. In
general, if the difference (Null and Fixed cost) is above 60 bits, there is
43
avery good chance of the model to represent a true correlation. The list of
details of remaining cost parameters are described in Table 2.1.
Table 2.1 Pharmacophore cost parameters
S.No Parameter name Description
1 Fixed cost Cost of the simplest possible hypothesis
(initial)
2 Null cost
Costs when each molecule estimated as
mean activity acts like a hypothesis with no
features
3 Weight cost
A value that increases in a Gaussian form as
the feature weight in a model deviates from
an idealized value of 2.0. This cost factor
favors hypotheses in which the feature
weights are close to 2. The standard
deviation of this parameter is given by the
weight variation parameter.
4 Error cost
A value that increases as the rms difference
between estimated and measured activities
for the training set molecules increases. This
cost factor is designed to favor models for
which the correlation between estimated and
measured activities is better. The standard
44
deviation of this parameter is given by the
uncertainty parameter.
5 Configuration cost
A fixed cost depends on the complexity of
the hypothesis space being optimized. It is
equal to the entropy of the hypothesis space.
This parameter is constant among all the
hypotheses.
The main goal made by hypotheses is it should map all the actives
present in the dataset and compared to the inactives. In a simple
statement, the compound is inactive because of a) it missing (lack of)
important feature and/or b) the partial/incomplete pharmacophore
mapping because of the orientation.
Metrics for analyzing Pharmacophores and Hit Lists
The generated pharmacophore model further validated based on its
predictive power and its ability to identify/retrieve actives from the
inactives (decoys) (Figure 2.4).
45
Actives
Database molecules Database molecules
Database molecules
Actives
Actives
Hits
Ha
Ht
Figure 2.4: Database searching using pharmacophore models
D = Total no of compounds in database,
A = No of active compounds in database,
Ht = No of compounds in search hit list and
Ha = No of active compounds in hit list
Pharmacophore Validation
Percentage of yield (actives):
% Y = Ha / Ht x 100
Percent ratio of the activities in the hit list:
% A = Ha / Ax 100
Enrichment (enhancement)
46
E = Ha / Ht = Ha x D
A/ D Ht x A
False negatives: A - Ha
False positives: Ht - Ha
When these two conditions (Ha = A and Ha = Ht), and hence (Ha =
Ht= A), are fulfilled, that is impossible to get the true situation.
In actual situation, no of database compounds which are actives
but not short listed as actives, or not tested for activity against this
target. In both cases, these are end up as “False positives”.
When “False negatives” is nil but not able to retrieve the actives
from the database.
The prefect/ideal hit list which retrieves all actives & nothing else
(False negatives = 0, false positives = 0) (i.e., Ht = Ha= A).
Whereas, the worst hit list is which retrieves all data set else but
the known actives (False positives = D-A, False negatives = A) (i.e., Ha =
0, Ht = D-A).
47
The Goodness of Hit (GH) score indicates a how fine the short
listed hit-list is in all aspects total yield and percentage of actives
retrieval. Table 2.2 provides an example of acceptable range of the hit
lists, from ideal& worst (based on the GH score/enrichment value).
Case % Y % A Enrichment False
negatives
False
positives
GH
score
Best 100 100 500 0 0 1
Typical
Good 40 80 200 20 120 0.60
Extreme
Y 100 1 500 99 0 0.50
Extreme
A 0.2 100 1 0 49,900 0.50
Typical
Bad 5 50 25 50 950 0.26
Worst 0 0 0 100 49,900 0
Table 2.2: Goodness of Hit score values
48
2.1.2 Structure Based Studies
Structure based studies (SBDD) uses 3D-structure of the
biomolecular target has many advantages compared to analog based
studies.Itassist to acquire a theoretical depiction of the protein-ligand
interactions and active site orientations, which enable to design new
leads.95A antihypertensive drug, a Angiotensin Converting Enzyme
(ACE)inhibitor was the first lead designed from SBDD.96Table 2.3shows
the complete list of drugs which are identified from structure based
approaches (SBDD).
Table 2.3: List of drugs identified from structure based studies
Enzyme Disease Drugs Trade name
Neuraminidase
Influenza Oseltamivir
Zanamivir
Tamiflu
Relenza®
Carbonic Anhydrase II Glaucoma Dorzolamide Cosopt®
5-Hydroxy Tryptamine
1B Migraine Zolmitriptan Zomig®
Angiotensin II Hypertension Losartan Cozaar®
EGFR Kinase
Bcr-Abl Kinase
Cancer
CML
Erlotinib
Imatinab
Tarceva®
Gleevac®
49
HIV-protease
HIV-Reverse
transcriptase
AIDS
Indinavir
Nelfinavir
Saquinavir
Ritonavir
Lopinavir
Amprenavir
Tipranavir
Rilpivirine
Etravirine
Crixivan
Viracept
Invirase®
Norvir®
Kaletra®
Agenerase®
Aptivus®
Phase II*
Phase III*
* in clinical trials
2.1.2.1 Docking
The protein data bank (PDB) is a freely accessible repository
maintained by RCSB for the 3D structures of biomolecular
targets,determined by x-ray crystallography& NMR
spectroscopy.96,97,98Large-scale protein structureprojects led by various
genomicconsortiums throughout the world has enabled selection of many
target proteins for their therapeutic potential.99
Docking is defined asinsilicoprediction of non-bonded (non-
covalent) protein-ligand complex. The main objectiveof the docking
algorithms to predict the binding mode and orientation of ligand, from
given the protein structure and ligand. During the docking process, it
identifies the binding mode (with lowest free energy) of a ligand within
50
the protein cavity. Docking mainly deals with two components: binding
pose identification and scoring. Protein simulation process is
computationally expensive due to its high flexibility; because of this
reason, most of the docking algorithmare rigid docking [consider protein
as rigid]or flexible docking [allow side chainsflexibility (active site amino
acids)]. Docking algorithm can be classified as: fragment based approach
(as variant 3) and whole molecule approach [variant 1 and 2] shownin
Figure 2.5. Table 2.4list the existing docking methodologies and the
strategies they follow.In general docking algorithmcalculates the forces
associated and concerned in the protein−ligand bindingsuch as.,Van Der
Waals, electrostatic and hydrogen bonding. 100
Figure 2.5: flexible ligand docking process
51
Table 2.4: Different conformational methods used for docking
Searching
Algorithm Description Examples
Monte Carlo
(MC)
Stochastic method of generating
conformations. Selection based on
Metropolis criterion
Ligand Fit
Simulated
Annealing
(SA)
Random thermal motions are induced,
through high temperatures, to explore
the local search space. System is
driven to a minimum energy
conformation by decreasing
temperature. SA usually combined
with MC
MC-DOCK,
ICM-DOCK,
AutoDock
Genetic
Algorithm
(GA)
Based on Darwin principles of
evolution. ‘Chromosome’ encoding
model parameters (like torsion angles)
is varied stochastically. Populations
generated through genetic operations
(crossover, mutation, migration). The
fittest survives in the population.
GOLD,
DARWIN,
AutoDock
Tabu search Stochastic method of generating PRO_LEADS
52
conformation by keeping a record of
previous conformations (tabu).
Generated conformation is retained if
it is not tabu or if it scores better than
that in tabu
Incremental
construction
Systematic method where ligand is
broken into rigid fragments at
rotatable bonds. Fragments docked in
all possible ways and assembled
piecewise to regenerate ligand
Flex-X, DOCK,
HOOK, LUDI,
Hammerhead
Matching
methods
Based on clique detection technique
from graph theory. Ligand atoms
matched to the complimentary atoms
in the receptor
FLOG, DOCK
Simulation
methods
Molecular dynamics simulations used
to generate conformations DOCK
Docking algorithmwill be implemented in the different phases of
the discovery process:
(1) To predict the binding mode of a known actives
(2) To identify the new ligands in virtual screening process
53
(3) To predict/estimate the binding affinities of series of
known actives
Scoring
Scoring functions in a docking program often have twoaims: to
assist the conformational search algorithm to identify the correct
bindingmodeand to predict the binding energy. This dock score and
binding free energy values allowsthe user to compare binding modes of
same ligand and also to prioritize the ligands during the screening
process.101
Scoring functions are divided into3 types.100: (a) Empirical Scoring;
(b)knowledge based and (c) force field basedscoring methods. Empirical
scoring functions are depends on the functions/rules are derived from a
known crystal structures with its affinities for the co-crystals. Force field
based scoring functions usesparameterized terms (with force field)that
score the electrostatic interactions and Van Der Waals interactions
between protein and ligand. It includesinteraction energy (receptor-
ligand) and internal ligand energy (strain energy induced due to the
binding). Knowledge based scoring methodsis based on the assessment of
the frequency of each non-bonded interaction and the observed distance
between specific type of atoms in of protein–ligand databases. The
probability & occurrence of an each type of interaction and well
compared with the idealvalue.
54
2.1.2.1.1 Docking algorithms
A. GOLD
B. GLIDE
C. Ligand Fit
GOLD
Genetic Optimization for Ligand Docking102 is an rigid docking,
that implementsGA (genetic algorithm) to investigate the widerange of
ligand conformations and gold allows partial flexibility (side chain) of the
protein.
GOLD scoring functions
A. The Goldscore function similar to molecular mechanics
parameterized function with 4 terms:
Shb_ext calculates thehydrogen-bond interactions
Svdw_ext it calculates the Van Der Waals score
Shb_int it calculates the intramolecular hydrogen bonds with in the
ligand; by default this term is not considered (GOLD default),
and which gives the better results;
Svdw_int it calculates the intramolecular strain in the ligand.
GOLD Fitness = Shb_ext + Svdw_ext + Shb_int + Svdw_int ---(1)
55
B. GOLD uses the genetic algorithm (GA) with an adjustable
parameters such as., (a) ligand torsion angle adjustments (b) flip ring
corners of the ligand (c) dihedral angle adjustments OH groups and NH3
groupsof protein; and (d) the ligand mapping on to the fitting points (i.e.,
the orientation/position of the ligand within the site). In addition to these
parameters, user can also adjust docking parameters.
C. GOLD uses fitting points to place/orient the ligand molecule in
the site; it generates fitting points which represents the groups involved
in the hydrogen bonding interactions in protein &ligand, and it
superimpose the acceptor points on to the ligand and donor points on to
the protein and vice-versa. In addition to that, GOLD also generates
Steric (hydrophobic fitting points) in the protein cavity, which maps onto
the ligand CH groups.
ChemScore was developed and derived empirically (a set of 82
protein-ligand complexes) which measures binding affinities 103, 104.
The ChemScore function also added with regression data against
measured affinity, but still it has to be improved in all aspects and
GoldScore is superior compare to Chemscore in predicting binding
affinities.
ChemScore :
56
Here, the no ofterms(regression coefficients) and P terms represents the
physical contributions required to binding.
In addition to above terms, in a clash penalty and internal torsion
terms, also added to ChemScore value, which gives penalty for close
contacts and poor conformations. Covalent &constraint scores are also
included as shown in the below equation:
GLIDE
Grid-based ligand docking with energetic (Glide)docking
calculations is performed with version v3.6. The glide docking
involvesthree steps, Protein preparation, Grid generation and Ligand
docking.
Using the protein preparation wizard, contains charged side
chainsrefinements of improper side chains, adjust bond order, and
monomer from dimer. During the grid generation, grid center can be
defined using the bound ligand or specifying the X, Y, Z coordinates. The
box dimensions (12X12X12), which are automatically adjustable from the
57
bound ligand size, which covers the entire active site pocket. A scaling
factor of Van Der Waals radii is 0.9 was applied to ligand atoms.
Scoring Functions 105, 106
GLIDE 3.5 employs two differentscoring systems:
(i) GLIDE Score 3.5 SP(Standard-Precision)
(ii) GLIDE Score 3.5 XP (Extra-Precision)
These two scoring (SP & XP) functions uses the similar terms but
are used forspecial objectives. Specifically, GLIDE SP 3.5 is a “softer”,
has more forgiving function which allows ligands that have a reasonable
binding. Glide SP version will reduce false negatives and it is an
appropriate choice for database screening.GLIDE score 3.5 XP gives the
severe penalties for poses which violatesknown physical chemistry rules
GLIDE XP scorewill reduce the false positives and especially useful in
optimization of lead.
GLIDE score(also known as modified ChemScore) function as follows:
LigandFit
58
LigandFit module (Accelrys Suite)is another rigid docking algorithm
where protein is considered as rigid and ligand is flexible allowing for
conformation search and docked within the pocket.
The steps involved in the Ligand Fit
1. Site search:
The site can be defined by two ways.
a. Protein Shape (Cavity based): it identifies the sitesusing
proteinshape alone. An “eraser” algorithm used to adjust and
editthe grid points.
b. LigandDocking: Site points are defined usingbound ligand.
2. Conformational Search:
The Monte Carlo method used to generateligand conformational
search.
Ligand Fittings
The ligand fitting is carried out in twosteps:
1. The principle moment of inertia (PMI) of active site pocket is
compared with PMI of ligand. Ligand conformation is selected
based on the PMI fit value (threshold).
2. Further optimization will carried out to adjust ligand positions
and orientation and to improve the docking scores using Rigid
body minimization.
59
2.1.3 De Novo Ligand Design
New drug(De novo) design uses active site information to “develop” a
fragment to whole compoundusing fragment databasebased on the active
site shape and key interactions 107.
Figure 2.6: De novo ligand design flowchart
De novo design carried out in different approaches like site point
approach, fragment based approach. Figure 2.6 gives a schematic
description of few of these strategies.
2.2 Virtual Screening(VS)
Virtual Screening (VS) is theone of key technique in CADD, to search
analyze chemical databases (Vendor and in-house) in order to
60
rank/identify possible new candidate molecules (Figure 2.7). VS
(Structure-based and analog based) is used to screen databases to limit
the no of compounds to be screened experimentally.109 The choice of
virtual screening (SBVS/LBVS) methods is depends on the three
parameters: no of available actives, no of crystal structures
available/not..110-112
Figure 2.7The virtual screening methods
2.3 Molecular Dynamics
Molecular Dynamic simulations (MD) are widely used to study the
conformations changes of biological molecules with the associated kinetic
61
and thermodynamic properties within the time steps (Table 2.5). The one
of basic feature of MD is the generation of a molecular trajectory, i.e. a
series of conformations captured at time steps (regular-intervals) with
complete information.113-115
Table 2.5List of various MD techniques
MD Methods Description
Brownian MD accounts for Brownian motion of molecules
particularly evident with solvents of high
viscosity
Langevin MD Uses Langevin equations of motion where
additional frictional and random forces are
added to Newtonian equation
Activated MD Simulates activated processes allowing to
cross barriers existing between the initial and
final stages
Accelerated MD Time treated as statistical quantity.
Conformational sampling through i) modified
potential energy surface ii) non-Boltzmann
type iii) degrees of freedom at the expense of
faster degrees of freedom
Steered MD Based on Atomic Force Microscopy (AFM)
62
principle. Introduces a time and position
dependent force to steer systems along
particular degrees of freedom
Targeted MD Force used to drive simulation towards target
conformation
SWARM-MD Multiple simulations using molecules with
each one subjected to force introducing
cooperative behavior and drives the trajectory
to the mean trajectory of the entire swarm
Replica
Exchange MD
(REMD)
Series of simultaneous non-interacting
simulations (replica) carried over a range of
temperatures and swapped at particular
intervals of temperature
Self Guided MD
(SGMD)
Introduces an additional guiding force that is
a continuously updated time average of the
force of the current simulation
Leap Dynamics A combination of MD and essential dynamics.
Leaps applied to the system to force it over
energy barriers
Multiple Body
O(N) Dynamics
[MBO(N)D]
Combines rigid body dynamics with multiple
time steps where highest frequency harmonic
motions are removed retaining the low-
63
frequency anharmonic motions
λ Dynamics A multiple copy method with reduced
interaction potential of the ligand, lowering
the barriers of conformational transitions.
Fraction of interacting potential, λi2
(additional degree of freedom) allows ligand to
explore conformational space due to reduced
barriers
2.4 ADMET Prediction
ADMET (Absorption, Distribution, Metabolism, Elimination, and
Toxicity) profiles major challenge in discovery. Table 2.6 lists drugs
which are withdrawn due to their unsatisfactory ADMET profile. Various
InsilicoADME models developed to filter compounds with low ADME
profile.116, 117
Table 2.6: Drugs withdrawn from market due to ADMET failure
Drug Therapeutic
Utility
Year of
withdrawal
Withdrawn due
to
Thalidomide Morning
sickness in
pregnancy
1960s Teratogenicity
leading to
deformities in
64
fetal development
Ticrynafen Diuretic 1982 Hepatitis
Sumatriptan Migraine drug-drug
interactions with
MAO inhibitors
Terfinadine Antihistamine 1997 Cardiotoxicity
Mebifradil Anti
hypertension
Angina
1998 Interferes with
metabolism of
other drugs used
for hypertension
Troglitazone Anti diabetic 2000 Hepatotoxicity
Cerivastatin Anti
hyperlipidemic
2001 Rhabdomyolysis
Rofecoxib Anti
inflammatory
2004 Myocardial
Infarction
Phenyl
proponalamine
In cough as
decongestant
2005 Hemorrhagic
stroke
Simulations One Step Ahead−Future Directions
Advances in computations in structural biology have made it
possible to carry out virtual cell simulations that mimic the cell
environment and the cellular events therein.120 A number of programs
65
have been developed in this area. For example NEURON and GENESIS
simulate the electrophysiological behavior of single neurons and
neuronal networks. One step ahead of these, E-CELL constructs a model
of a hypothetical self-sustaining whole-cell with 127 genes sufficient for
transcription, translation and energy production.121-122