GENERAL INFORMATION - kpfu.ru · 2015-09-14 · GENERAL INFORMATION ORGANIZERS Kazan (Volga region) Federal University Russian Scientific Foundation ... The program of the Second

GENERAL INFORMATION

ORGANIZERS

Kazan (Volga region) Federal University

Russian Scientific Foundation

Dynasty Foundation

D.I. Mendeleev Chemical Society of Republic of Tatarstan

SPONSORS

Chemical Abstracts Service (https://www.cas.org/)

BIOCAD (http://biocad.ru/)

Elsevier (http://elsevierscience.ru/products/reaxys/)

PARTNERS

Inte:Ligand Software-Entwicklungs und Consulting GmbH (http://www.inteligand.com/)

BioSolveIT (http://www.biosolveit.de/)

Computational Chemistry List (http://www.ccl.net/)

ORGANIZING COMMITTEE

Chairmen of the Organizing Committee:

Prof. Igor S. Antipin (Kazan, Russia)

Prof. Alexandre Varnek (Strasbourg, France)

Scientific Secretary: T.I. Madzhidov (Kazan, Russia)

Local Committee: M.A. Kazymova, V.A. Afonina, R.I. Nugmanov, T.R. Gimadiev, N.R. Khafizov, O.P. Varlamov, N.I. Ivanova, M.D. Misarov

SCIENTIFIC PROGRAM

The program of the Second Kazan Summer School on Chemoinformatics includes 12 plenary lectures, 7 key-note lectures, 4 oral reports, 4 tutorials and 23 poster presentations.

OFFICIAL LANGUAGE

The official School language is English. No translation will be provided.

OFFICE OF THE ORGANIZING COMMITTEE

The office of the organizing committee is located in the A.M. Butlerov Chemical Institute building, Auditorium No. 218. Participants will be able to use telephone, Internet and printing facilities there.

VENUE

The event will be held mainly in the A.M. Butlerov Chemical Institute building (Lobachevskogo St. 1). Plenary and key-note lectures, oral presentations and tutorials will take place in Hall No. 401. Only a few stand-alone computers will be installed for use by participants. However, a Wi-Fi connection and power supply for participants' personal laptops will be provided.

REAXYS PRIZE

The best presentations by young participants (whether oral or poster) will be awarded prizes from Elsevier for their valuable contribution to science. The award commission consists of the plenary and key-note lecturers of the School. The award ceremony will take place during the closing ceremony.

SOCIAL EVENTS

Coffee-breaks

All participants are invited to take part in the coffee breaks sponsored by Chemical Abstracts Service. Coffee breaks are free for all and will take place in Hall No. 409.

Welcome Party

All participants are invited to a welcome party that will be held on July 6 at 18.00 on the second floor of the Cafe of the Institute of Physics (Kremlyovskaya St. 16A). Participation is free.

The meeting point is outside the A.M. Butlerov Chemical Institute building at 18.00.

Excursion

The attendees are welcome to take part in the excursion program. The excursion will begin on July 9 at 14.00. Visits to the Kazan Kremlin and the Zilant Monastery are planned. The tour of the Kazan Kremlin includes a walk through the Kremlin, where you will see the Savior's Tower and the leaning Syuyumbike Tower, and visit the Kul Sharif Mosque and the Annunciation Cathedral. In the final part you will visit the Assumption Zilant Nunnery near the city of Kazan. The excursion will last 3.5 hours. A dress code must be observed when visiting the religious buildings. Please note that no food (only water) will be provided during the excursion program. The excursion language will be English.

The meeting point is outside the A.M. Butlerov Chemical Institute building at 15.00.

PROGRAM of the Second Kazan Summer School on Chemoinformatics

July 6, 2015

9:00-14:00 Registration

Chair of the session - A. Varnek

14:00-14:15 Opening Ceremony

14:15-14:35 Plenary Lecture 1 / I. Antipin (Russia)

"Kazan School of Chemistry"

14:35-15:20 Plenary Lecture 2 / J. Gasteiger (Germany)

"Solved and Unsolved Problems in Chemoinformatics"

15:20-15:50 Coffee - break

Chair of the session - I. Tetko

15:50-16:35 Plenary Lecture 3 / A. Varnek (France)

“Chemical Space paradigm in Chemoinformatics”

16:35-17:20 Plenary Lecture 4 / V. Tkachenko (UK)

"Chemical databasing. State of the art and current challenges"

17:20-17:45 Key-note Lecture 1 / V. Palyulin (Russia)

"Molecular Field Topology Analysis as an Advanced Tool for QSAR/QSPR Studies"

18:00-21:00 Welcome-Party

July 7, 2015

Chair of the session - J. Gasteiger

9:00-9:45 Plenary Lecture 5 / H. Senderowitz (Israel)

"Statistical Modeling in Material Sciences"

9:45-10:30 Plenary Lecture 6 / J. Aires de Sousa (Portugal)

"Machine learning with large datasets of quantum chemistry calculations"

10:30-11:00 Coffee - break

Chair of the session - T. Langer

11:00-11:45 Plenary Lecture 7 / I. Tetko (Germany)

"Prediction-driven Matched Molecular Pairs to interpret and compare QSPR/QSRR models"

11:45-11:55 Oral Presentation 1 / D. Shulga (Russia)

“Perspectives of direct modeling of halogen bonding in early drug discovery”

11:55-12:05 Oral Presentation 2 / V.B. Siramshetty (Germany)

“Potential drug repositioning opportunities for ebola virus disease”

12:05-12:15 Oral Presentation 3 / O. Tarasova (Russia)

“Application of the large-scale database to the QSAR modeling of the HIV-1 reverse transcriptase inhibitors”

12:15-12:25 Oral Presentation 4 / P. Sidorov (France)

“Mappability of Drug-like Space: towards a polypharmacologically competent map of drug-relevant compounds”

12:30-14:30 Lunch

Chair of the session - T. Madzhidov

14:30-15:30 Tutorial 1 / D. Erokhin, T. Khristova (CAS, USA)

"CAS SciFinder - Nr. 1 Search Tool for Chemistry and Life Science Information"

15:30-16:00 Coffee - break

Chair of the session - T. Madzhidov

16:00-17:00 Tutorial 2 / A. Khudoshin (Elsevier, Netherlands)

"How to manage information challenges in chemistry and medicinal chemistry?"

17:00-19:00 Poster Session/Coffee Break

July 8, 2015

Chair of the session - A. Tropsha

9:00-9:45 Plenary Lecture 8 / T. Langer (Austria)

"Chemical Feature-Based 3D Pharmacophore Models For Drug Design: Current And Future Aspects"

9:45-10:30 Plenary Lecture 9 / I. Baskin (Russia)

"3D QSAR: Achievements and Perspectives"

10:30-11:00 Coffee - break

Chair of the session - H. Senderowitz

11:00-11:45 Plenary Lecture 10 / A. Tropsha (USA)

"Current trends in QSAR Modeling"

11:45-12:10 Key-note Lecture 2 / D. Horvath (France)

"Conformational Sampling & Docking: State-of-the-art and Challenges"

12:10-12:35 Key-note Lecture 3 / P. Polischuk (Ukraine)

"Applications of the Mixtures Representation Approach in QSAR Modeling"

12:35-14:30 Lunch

Chair of the session - T. Madzhidov

14:30-16:00 Tutorial 3 / T. Langer (Inte:Ligand, Germany)

"LigandScout - efficient tool for pharmacophore modeling"

16:00-16:30 Coffee - break

16:30-18:00 Tutorial 4 / M. Gastreich (BioSolveIT, Germany)

"Staying Critical All the Way: Where to Pay Extra Attention in Lead Generation and Optimization"

July 9, 2015

Chair of the session - A. Varnek

9:00-9:45 Plenary Lecture 11 / V. Tsirelson (Russia)

"In Search of Up-To-Date Bonding Descriptors Based on Electron Density"

9:45-10:10 Key-note Lecture 4 / B. Creton (France)

"Structure-Property modelling in the oil industry"

10:10-10:35 Key-note Lecture 5 / T. Madzhidov (Russia)

"Condensed graph of reaction approach for protective group reactivity"

10:35-11:05 Coffee - break

Chair of the session - J. Aires de Sousa

11:05-11:30 Key-note Lecture 6 / V. Solov'ev (Russia)

"QSPR models for halogen bond strength"

11:30-11:55 Key-note Lecture 7 / G. Marcou (France)

"A Measure for QSAR Modelability and for Data Organization"

11:55-12:40 Plenary Lecture 12 / V. Poroikov (Russia)

"Drug Discovery: Science, Art, Business"

12:40-13:00 Closing Ceremony

15:00-19:30 Excursion

Plenary lectures

J. Gasteiger

SOLVED AND UNSOLVED PROBLEMS OF CHEMOINFORMATICS

Computer-Chemie-Centrum, University of Erlangen-Nuremberg, D-91052 Erlangen, Germany

From early beginnings in the 1960s, chemoinformatics has developed into a scientific field of its own. Without the many tools provided by chemoinformatics, modern scientific research and development in chemistry and related fields would simply not be possible [1,2].

Mature as the field has now become, there are nevertheless many problems that still wait to be satisfactorily solved: the efficient search for the bioactive conformation, the representation of polymers, the prediction of the course of chemical reactions, the analysis of biochemical reaction networks, the design of organic syntheses, and automatic structure elucidation, to name just a few. In addition, dramatic changes are occurring in the way we get access to data and information; the role of publishing houses is being challenged by information becoming freely available on the internet.

Furthermore, it must be realized that chemoinformatics has not yet found its proper place in the chemical community; the power of chemoinformatics has not yet been utilized in every corner of chemistry. As of now, most efforts have been directed to drug design, but chemoinformatics methods could increase our understanding of chemistry across all fields, from analytical chemistry through organic chemistry to physical chemistry.

Thus, there is still a lot of work to be done to prove the importance of chemoinformatics, from producing excellent chemoinformatics applications in all areas of chemistry, through training chemoinformatics specialists to incorporating chemoinformatics topics into regular chemistry curricula.

1. Handbook of Chemoinformatics - From Data to Knowledge, 4 volumes, J. Gasteiger, Editor, Wiley-VCH, Weinheim, 2003. ISBN: 3-527-30681-1

2. Chemoinformatics - A Textbook, J. Gasteiger, T. Engel (Editors), Wiley-VCH, Weinheim, 2003. ISBN: 3-527-30680-3

A. Varnek

CHEMICAL SPACE PARADIGM IN CHEMOINFORMATICS

University of Strasbourg, France

Although the expression “Chemical Space” (CS) is widely used in the literature, it is still not well defined. Generally speaking, the notion of “space” stands for a set of objects with some particular properties and some relationships between them (a metric). Chemoinformatics generally considers two main types of chemical objects: graphs and descriptor vectors. Graph-based CS is traditionally described in terms of the scaffold/R-group concept, which is exploited in the popular Scaffold Tree and Scaffold Net approaches. Different sets of molecular descriptors can be generated from the same molecular graph and, therefore, the same graph-based CS may correspond to several different descriptor-based CS. In this presentation, we discuss different methods for the analysis and visualization of graph-based and descriptor-based CS. Particular attention is paid to the Generative Topographic Mapping (GTM) approach, which can be used efficiently to visualize chemical data, to predict activity profiles, to conduct virtual screening and to compare chemical databases.
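The point that one set of molecules can give rise to several descriptor-based chemical spaces can be illustrated with a minimal Python sketch. It is not taken from the lecture: the molecule names, the two descriptor sets and the choice of the Tanimoto distance as the metric are all assumptions made for illustration only.

```python
def tanimoto_distance(a: set, b: set) -> float:
    """Tanimoto (Jaccard) distance between two sets of present features."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


# Hypothetical molecules described by two different, made-up descriptor sets.
fragment_space = {
    "mol_A": {"C-C", "C-O", "C=O"},
    "mol_B": {"C-C", "C-O", "C-N"},
    "mol_C": {"C-C", "C-O", "C=O", "C-Cl"},
}
ring_space = {
    "mol_A": {"benzene"},
    "mol_B": {"benzene"},
    "mol_C": {"pyridine"},
}


def nearest_neighbour(space: dict, query: str) -> str:
    """Return the molecule closest to `query` under the Tanimoto distance."""
    others = [name for name in space if name != query]
    return min(others, key=lambda name: tanimoto_distance(space[query], space[name]))


# The same molecule has different neighbours in the two descriptor spaces.
print(nearest_neighbour(fragment_space, "mol_A"))
print(nearest_neighbour(ring_space, "mol_A"))
```

In this toy setting the neighbourhood of a molecule depends on the descriptors chosen, which is why the graph-based and descriptor-based views of chemical space need to be distinguished.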

1. Varnek A., Baskin I.I. Mol. Informatics, 2011, 30: 20-32.

V. Tkachenko

CHEMICAL DATABASING. STATE OF THE ART AND CURRENT CHALLENGES

Royal Society of Chemistry, Burlington House, Piccadilly, London, UK

Chemical databasing technologies and databases have been available for many years. At this point it would be expected that trends stabilize. While there are dominant commercial chemical databases such as CAS and Reaxys, in terms of technologies and publicly accessible databases there has been a continuous increase in the variety of database engines and chemistry content offerings via databases. This presentation will review the history of chemical databases and will cover present trends. We will analyse the challenges in general databasing technologies which are attributed to the current trends in Big Data and Semantic Web and will show how those challenges and trends affect technical solutions applied to chemistry.

The Royal Society of Chemistry has participated in a number of national and international grants allocated to the construction of large-scale chemical information management systems, including examples such as the Innovative Medicines Initiative project known as Open PHACTS, the PharmaSea project to identify novel antibiotics from the ocean, and the National Chemical Database Service, a hub of cheminformatics services and related content made available to chemists in UK academia. We will talk about the specifics of each project and present the technical and scientific solutions that have helped make these successful.

H. Senderowitz

STATISTICAL MODELING IN MATERIAL SCIENCES

Department of Chemistry, Bar Ilan University, Ramat-Gan 52900, Israel.

Statistical modeling (also termed Quantitative Structure Activity Relationship, QSAR, or Quantitative Structure Property Relationship, QSPR) is a general name for a host of methods that attempt to correlate a specific activity for a set of compounds with their structure-derived descriptors by means of a mathematical model. Statistical modeling has been widely applied in many fields including chemistry, biology, and environmental sciences. In particular, the role of QSAR models in the identification of new compounds and in their subsequent optimization has been constantly growing and is now recognized by many practitioners of computer aided drug design and computer aided material design methodologies.

This lecture will focus on the application of statistical modeling techniques in material sciences, discussing methods typically used in this field (e.g., Principal Component Analysis, cluster analysis, and linear and non-linear regression), material descriptors (e.g., material composition, material spectra) and the challenges in obtaining them, and highlighting selected applications.

Special emphasis will be put on the application of statistical modeling in the newly emerging field of photovoltaic (PV) cells made entirely of metal oxides (MO). Such cells have the potential to provide clean and affordable energy if their power conversion efficiencies are improved. Such improvements require the development of new MOs, which in turn could benefit from combining combinatorial material sciences for producing solar cell libraries with statistical tools to direct synthesis efforts. With this in mind, we developed a QSAR workflow and applied it to the analysis of several solar cell libraries. Our results demonstrate that QSAR models with good prediction statistics for multiple solar cell properties could be developed and that these models highlight important factors affecting these properties in accord with experimental findings. The resulting models are therefore suitable for designing better solar cells. We further demonstrate that the similar property principle commonly invoked in pharmaceutical design could be extended to PV cells.

1. Yosipof A., Nahum O.E., Anderson A.Y., Barad H., Zaban A., Senderowitz H. Molecular Informatics, 2015, accepted for publication.

2. Yosipof A., Senderowitz H. J. Comput. Chem. 2014, accepted for publication.

3. Yosipof A., Senderowitz H. J. Chem. Inf. Model. 2014, 54: 1567-77.

J. Aires-de-Sousa

MACHINE LEARNING WITH LARGE DATASETS OF QUANTUM CHEMISTRY CALCULATIONS

LAQV and REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal

Rapid access to intrinsic physicochemical properties of molecules is highly desired for large-scale data mining explorations, e.g., for the discovery of new materials and drugs, toxicity risk assessment, or mass spectrum prediction in metabolomics. Data can be obtained by quantum chemistry calculations, which provide increasingly accurate estimations of several properties but are too computationally expensive for large-scale uses. Even so, high-throughput quantum chemistry calculations are being performed in projects employing enormous computational resources [1,2]. A big data scenario can be envisaged in which computational analytic techniques extract innovative knowledge from the large volumes of data produced by quantum calculations, so that the calculated properties can be predicted 5–6 orders of magnitude faster in new situations. Studies have been reported in which machine learning algorithms were trained with thousands of data points to predict ab initio- or DFT-calculated properties (molecular, bond, or atomic properties) [3,4]. Our lab trained machine learning algorithms with >8,000 bond energies and >35,000 atomic charges, calculated by DFT, to enable extremely fast predictions [5,6].

In this lecture, machine learning of condensed Fukui indices is presented first. From a chemoinformatics perspective, modeling Fukui indices presents a specific challenge: the competition of all the atoms in the molecule for a charge, so that atoms in the same substructure can exhibit very different Fukui indices in different molecules. The problem was approached both as a Random Forests regression/classification and as a ranking of atom types with the Bradley-Terry model.
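As a rough illustration of how a Bradley-Terry ranking works, the following Python sketch fits strengths p_i such that P(i ranks above j) = p_i / (p_i + p_j), using the standard minorization-maximization update. It is not the authors' implementation: the atom-type names and the win counts are invented, with a "win" assumed to mean that an atom of one type had the higher Fukui index than an atom of another type in the same molecule.

```python
from collections import defaultdict


def bradley_terry(wins, n_iter=200):
    """Fit Bradley-Terry strengths from pairwise outcomes.

    wins[(i, j)] is the number of times item i was ranked above item j.
    Every item is assumed to have at least one win.
    """
    items = sorted({item for pair in wins for item in pair})
    strength = {item: 1.0 for item in items}
    total_wins = defaultdict(float)
    n_compared = defaultdict(float)
    for (i, j), count in wins.items():
        total_wins[i] += count
        n_compared[frozenset((i, j))] += count

    for _ in range(n_iter):
        updated = {}
        for i in items:
            denom = sum(
                n_compared[frozenset((i, j))] / (strength[i] + strength[j])
                for j in items
                if j != i
            )
            updated[i] = total_wins[i] / denom if denom > 0 else strength[i]
        norm = sum(updated.values())
        strength = {i: s / norm for i, s in updated.items()}
    return strength


# Invented pairwise outcomes between three hypothetical atom types.
wins = {
    ("type_A", "type_B"): 8, ("type_B", "type_A"): 2,
    ("type_B", "type_C"): 7, ("type_C", "type_B"): 3,
    ("type_A", "type_C"): 9, ("type_C", "type_A"): 1,
}
strength = bradley_terry(wins)
ranking = sorted(strength, key=strength.get, reverse=True)
print(ranking)
```

Because only pairwise outcomes within a molecule enter the fit, the ranking formulation does not require the absolute index of an atom type to be the same across molecules.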

A second project is also presented involving machine learning of bond properties calculated by DFT for ca. 150,000 covalent bonds, covering a large range of molecular sizes and chemical elements. Most of the currently available QSAR/QSPR algorithms, molecular descriptors, and software have been typically designed for data sets at least 10 times smaller than this. The results obtained with various strategies to handle large data sets are presented, namely for the selection of training sets, bond descriptors, and ML algorithms.

1. Hachmann J. et al. J. Phys. Chem. Lett., 2011, 2: 2241-2251.

2. Jain A. et al. APL Materials, 2013, 1: 011002; doi: 10.1063/1.4812323.

3. Hansen K. et al. J. Chem. Theory Comput., 2013, 9: 3404-3419.

4. Raj B.K., Bakken G.A. J. Comput. Chem., 2013, 34: 1661-1671.

5. Qu X., Latino D.A.R.S., Aires-de-Sousa J. J. Cheminform., 2013, 5: 34.

6. Zhang Q.-Y. et al. Chemom. Intell. Lab. Syst., 2014, 134: 158-163.

Financial support from Fundação para a Ciência e a Tecnologia (FCT), Portugal, under Project PEst-C/EQB/LA0006/2013 is acknowledged.

I.V. Tetko1,2

PREDICTION-DRIVEN MATCHED MOLECULAR PAIRS TO INTERPRET AND COMPARE QSPR/QSRR MODELS

1 Helmholtz-Zentrum München - German Research Centre for Environmental Health (GmbH), Institute of Structural Biology, Munich, Germany

2 A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya St. 18, 420008 Kazan, Russia

The Matched Molecular Pairs (MMP) approach uses pairs of molecules that differ by only one structural group to analyze chemical data and derive the rules hidden in them. This approach is very useful for identifying “activity cliffs”, i.e., small changes in chemical structure leading to large changes in the activity/properties of molecules. MMPs are extensively used in the scientific literature to analyze experimental measurements [1]. In this work, MMPs are extended to the analysis of Quantitative Structure Property/Reactivity Relationship (QSPR/QSRR) models. The statistically significant MMPs derived from the predictions of a model (prediction-driven MMPs) allow us to understand the rules that were learned by the model as well as to identify dependencies that were not correctly captured by it. The analysis using MMPs does not depend on the machine learning algorithms and descriptors used and can be applied to any model. Prediction-driven MMPs are also a useful approach for comparing different models and for providing a mechanistic interpretation of individual predictions. In this presentation, MMPs will be used as an important tool to interpret models of physico-chemical properties as well as of chemical reactivity for predicting the outcomes of chemical reactions. Moreover, it will be discussed how MMPs can be used to drive the optimization of chemical reactions. The developed method [2] is publicly available at http://ochem.eu.
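The idea of deriving statistically significant transformations from model predictions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method available at ochem.eu: the transformation labels and predicted values are invented, the pairs are assumed to have been identified already, and a two-sided sign test is used as a stand-in because the abstract does not say which significance test is applied.

```python
from collections import defaultdict
from math import comb


def sign_test_p(n_up: int, n_down: int) -> float:
    """Two-sided exact binomial sign test p-value (ties are excluded)."""
    n = n_up + n_down
    if n == 0:
        return 1.0
    k = min(n_up, n_down)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def summarize_transformations(pairs):
    """Aggregate predicted property changes per structural transformation.

    pairs: iterable of (transformation, prediction_a, prediction_b), where
    molecules a and b differ only by the given transformation.
    """
    deltas = defaultdict(list)
    for transformation, pred_a, pred_b in pairs:
        deltas[transformation].append(pred_b - pred_a)
    summary = {}
    for transformation, values in deltas.items():
        n_up = sum(v > 0 for v in values)
        n_down = sum(v < 0 for v in values)
        summary[transformation] = {
            "n": len(values),
            "mean_delta": sum(values) / len(values),
            "p_value": sign_test_p(n_up, n_down),
        }
    return summary


# Invented model predictions for already-matched pairs (hypothetical labels).
pairs = [("H>>Cl", x, x + 0.7) for x in (1.0, 1.5, 2.0, 2.5, 0.5, 3.0, 1.2, 2.2)]
pairs += [("H>>OH", 2.0, 1.5), ("H>>OH", 1.0, 1.4),
          ("H>>OH", 3.0, 2.8), ("H>>OH", 0.5, 0.9)]

summary = summarize_transformations(pairs)
for name, stats in summary.items():
    print(name, stats)
```

A transformation whose predicted effect is consistent in sign across many pairs yields a small p-value and can be read as a rule the model has learned; because only predictions enter the analysis, the same procedure applies to any model.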

1. Griffen E, Leach AG, Robb GR, Warner DJ: Matched molecular pairs as a medicinal chemistry tool. J Med Chem 2011, 54(22):7739-7750

2. Sushko, Y.; Novotarskyi, S.; Korner, R.; Vogt, J.; Abdelaziz, A.; Tetko, I.V. Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process. J. Cheminform. 2014, 6, 48.

This study was supported by the Russian Scientific Foundation (Agreement No 14-43-00024, dated October 1, 2014). We thank ChemAxon (http://www.chemaxon.com) for providing the Standardizer and calculator plugins. We also thank Molecular Networks GmbH (http://www.molecular-networks.com) and Talete Srl (http://www.talete.mi.it) for contributing their software tools used in this study.

T. Langer1

S. D. Bryant2

G. Ibis2

T. Seidel1

M. Wieder1

CHEMICAL FEATURE-BASED 3D PHARMACOPHORE MODELS FOR DRUG DESIGN: CURRENT AND FUTURE ASPECTS

1 Department of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Althanstr. 14, 1090 Vienna, Austria

2 Inte:Ligand GmbH, Mariahilferstr. 72a/11, 1060 Vienna, Austria

Pharmacophore-based compound modeling, virtual screening, and bio-activity profiling have become popular in silico techniques for supporting medicinal chemists in their hit finding, hit expansion, hit-to-lead, and lead optimization programs [1].

At Inte:Ligand GmbH, we have developed the LigandScout [2] platform as an integrated software solution containing rapid and efficient tools for automatic interpretation of ligand-protein interactions and subsequent transformation of this information into 3D chemical feature-based pharmacophore models. Additionally, pattern recognition-based algorithms were developed for ligand-based pharmacophore modeling in the absence of a target 3D structure, as well as for establishing novel accurate virtual screening procedures. Our recent interest is to incorporate the results of molecular dynamics simulation trajectories into the pharmacophore description, in order to develop pharmacophore ensembles representing the dynamic event of binding and to make this functionality available as LigandScout KNIME [3] extensions.

In the presentation, we will highlight successful applications of chemical feature-based 3D pharmacophore modeling within the early drug discovery process. Such examples range from drug repurposing to in silico fragment-based hit discovery and discovery of novel protein-protein interface inhibitors.

In the hands-on workshop, we will show typical workflows using the novel LigandScout 4.0 Expert platform, including structure- and ligand-based pharmacophore modeling, virtual screening, hit list filtering, ligand profiling, and molecular docking.

1. Langer, T., Pharmacophores in Drug Research, Mol. Inf. 2010, 29, 470-475.

2. Wolber, G., Langer, T.; LigandScout: 3D Pharmacophores Derived from Protein-Bound Ligands and their Use as Virtual Screening Filters, J. Chem. Inf. Model. 2005, 45, 160-169.

3. KoNstanz Information MinEr, available from KNIME.COM AG, Zurich, Switzerland (http://knime.org)

I. I. Baskin1,2,3

N. I. Zhokhova1

G. V. Sitnikov3,4

A. Varnek2,3

3D QSAR: ACHIEVEMENTS AND PERSPECTIVES

1 Chair of Polymer and Crystal Physics, Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow, Russia;

2 A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya St. 18, Kazan, Russia;

3 Laboratoire de Chémoinformatique, UMR 7140 CNRS, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, France;

4 A.N. Nesmeyanov Institute of Organoelement Compounds of Russian Academy of Sciences, Vavilova St. 28, Moscow, Russia

The lecture reviews the state of the art in 3D QSAR methods, covering both alignment-based and alignment-free approaches. Three types of molecular fields are considered: molecular interaction fields, atomic property fields, and fields based on the electron density function. The main problems and challenges for 3D QSAR analysis are outlined. The perspectives of 4D QSAR modeling will be discussed.

The second part of the lecture is devoted to the theory and applications of the continuous molecular field approach [1-2] to building 3D QSAR regression and one-class classification models. Particular attention is paid to continuous indicator fields [3], which combine the 3D QSAR methodology and the substructural approach. Continuous indicator fields (CIF) can be considered as 3D analogues of topological substructural fragments. CIF models can be interpreted in terms of preferable and undesirable positions of certain types of atoms in space. This helps to understand which changes in chemical structure should be made in order to design a compound possessing desirable properties. The applications considered in the lecture concern 3D QSAR models for protein-ligand interaction, metal complexation and absorption of coloring dyes.

The last part of the lecture describes the perspectives of 3D QSAR. The possibilities of using machine learning for solving 3D QSAR problems will be discussed.

1. Baskin I.I., Zhokhova N.I. J. Comput.-Aided Mol. Des., 2013, 27(5): 427-442.

2. Baskin I.I., Zhokhova N.I. In: Challenges and Advances in Computational Chemistry and Physics, 2014, 17: 432-459.

3. Sitnikov G.V., Zhokhova N.I., Ustynyuk Yu.A., Varnek A., Baskin I.I. J. Comput.-Aided Mol. Des., 2015, 29(3): 233-247.

Most of this work was supported by the Russian Foundation for Basic Research (Grant 13-07-00511). A part of this study was supported by the Russian Scientific Foundation (Agreement No 14-43-00024, dated October 1, 2014).

A. Tropsha

CURRENT TRENDS IN QSAR MODELING

CB # 7568 Beard Hall, UNC Eshelman School of Pharmacy, UNC Chapel Hill, NC 27599

More than fifty years of continuous improvements, interdisciplinary breakthroughs, and community-driven developments were needed to make QSAR modeling one of the commonly employed approaches to modeling the physical and biological properties of chemicals in use today. In fact, the analysis of published literature indicates that the continuing growth of chemical data and databases, especially in the public domain, has stimulated the concurrent growth in QSAR publications. However, throughout its entire history the QSAR approach has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. In this presentation, we will discuss: (i) the development and evolution of QSAR; (ii) the current trends, unsolved problems, and pressing challenges; and (iii) several novel and emerging applications of QSAR modeling [1]. Throughout the discussion, we will provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models. We emphasize the importance of communication between computational and experimental chemists towards collaborative development and use of QSAR models. We also address the issues of data accuracy and reproducibility, which are particularly important for computational scientists such as bio- and cheminformaticians whose success inherently depends on the quality of experimental data used as inputs for their models. We stress that the exploitation of modern chemogenomics repositories containing huge sets of heterogeneous data requires the use of powerful, transparent, and robust data curation workflows. Supplementing and enriching our previous chemical curation protocols [2], we will describe the enhanced chemical and biological data curation workflow. This global data curation workflow can be utilized to improve the quality of data analysis and interpretation as well as boost the prediction performance of computational models built with available chemical genomics data. We will also discuss guidelines for applying stringent scientific standards to manuscripts reporting new QSAR studies, which should ease manuscript acceptance by journal reviewers and editors.

1. Cherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, Dearden J, Gramatica P, Martin YC, Todeschini R, Consonni V, Kuz'min VE, Cramer R, Benigni R, Yang C, Rathman J, Terfloth L, Gasteiger J, Richard A, Tropsha A. J Med Chem. 2014, 57: 4977-5010.

2. Fourches D, Muratov E, Tropsha A. Trust, But Verify: On the Importance of Chemical Structure Curation in Cheminformatics and QSAR Modeling Research. J Chem Inf Model. 2010, 50: 1189–1204.

The support from NIH grant GM066940 is acknowledged.

V. G. Tsirelson

IN SEARCH OF UP-TO-DATE BONDING DESCRIPTORS BASED ON ELECTRON DENSITY

Quantum Chemistry Department, Mendeleev University of Chemical Technology, Miusskaya Sq., 9, Moscow 125047, Russia

The concept of bonding, which allows atomic and molecular interactions to be recognized and classified, is one of the basic concepts in chemistry. Competitive development of X-ray diffraction and quantum-chemical methods has led to reliable derivation of the electron density and electrostatic potential in molecules, molecular complexes and crystals. As a result, many bonding descriptors based on electron density have been suggested in order to arrive at a consistent bonding picture. In particular, the Quantum Theory of Atoms in Molecules and Crystals (QTAIMC) has made it possible to establish which atoms, in terms of electron density, are chemically bonded and which are not, and to quantify the atomic and molecular interactions. In 1985, we applied the QTAIMC to experimental electron density; the approach is nowadays a common one in structural chemistry. Later, we also suggested combining the formulae from density functional theory with experimental electron density: this allowed the electronic (total, exchange and correlation) energy characteristics to be extracted from X-ray and electron diffraction experiments. The electron localization function, the localized orbital locator and other bonding descriptors also became derivable directly from experiment. In this talk, we report on recent developments in the search for up-to-date bonding descriptors based on electron density and electrostatic potential in molecules, molecular complexes and crystals; these descriptors are equally applicable to theoretical and experimental densities. We will also demonstrate how these developments provide new insights into the nature of atomic and molecular interactions. Note that the electron-density-based descriptors reflecting the properties of both atoms and bonds are derived using certain approximations; therefore the limits of their applicability will be discussed as well.

Crystalline picolinic acid N-oxide consists of molecular layers linked by weak π-stacking interactions. The density-based bonding ellipsoids explicitly show how these interlayer C2…O1 and O2…N1 interactions differ from intralayer C3-H3…O1 hydrogen bond.

This work is supported by Russian Foundation for Basic Research, grant 13-03-00767a.


V. Poroikov

DRUG DISCOVERY: SCIENCE, ART, BUSINESS

1 Laboratory of Structure-Function Based Drug Design, Department for Bioinformatics, Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow, 119121, Russia [email protected]

Drug research and development is an extremely expensive and time-consuming process with a high risk of negative results. The key stage of this process is the discovery of novel pharmaceutical agents. Nowadays drug discovery is based on investigation of the pathological mechanisms of diseases, identification of relevant pharmacological targets, finding or design of active compounds (hits), and their further optimization into lead compounds. We will consider how the four stages of the drug discovery process listed above are executed by pharmaceutical science and industry. Examples of successful medicines, either found through empirical search or created using computer-aided drug design methods, will be presented (Figure 1).

Figure 1. The number of pharmaceuticals launched in 1899-2014 (source: Thomson Reuters Integrity).

Due to tough competition, the pharmaceutical industry is an exceptionally innovative business. It has to carry out R&D of new pharmaceuticals based on the most recent achievements of biomedicine, biotechnology and chemistry. On the other hand, many biomedical and chemical studies are stimulated by unmet medical needs.

Despite many efforts to prove the safety of investigational drugs during preclinical and clinical trials, some drugs have been removed from the market due to unexpected life-threatening effects. Using such examples, one may study whether the application of modern cheminformatics methods [1] could help to filter out potentially dangerous compounds at the early stages of research.

Many current drugs have been discovered through serendipity. Thus, some potential biological activities of such pharmaceutical agents remain unstudied. Rational application of cheminformatics methods helps to find novel biological activities of these compounds, which leads to drug repurposing. A similar approach makes it possible to uncover the hidden pharmacological potential of natural compounds from medicinal plants, pointing to the most promising directions for their study [2].

1. Filimonov D.A. et al. Chemistry of Heterocyclic Compounds, 2014, 50: 444-457. 2. Lagunin A.A. et al. Natural Product Reports, 2014, 31: 1585-1611.

Acknowledgement. The work is supported by the Program of Basic Research of the Russian State Academies of Sciences (2013-2020).


Key-note lectures


V.A. Palyulin E.V. Radchenko N.S. Zefirov

MOLECULAR FIELD TOPOLOGY ANALYSIS AS AN ADVANCED TOOL FOR QSAR/QSPR STUDIES

Department of Chemistry, Lomonosov Moscow State University, Leninskie Gory, 1/3, Moscow 119991, Russia [email protected]

There exist numerous techniques for the evaluation of quantitative structure-activity/property relationships (QSAR/QSPR) based on various descriptors of a chemical structure. Most of the approaches employ so-called global descriptors characterizing the molecule as a whole. However, the application of local (atomic) descriptors allows one to consider in detail how a structure can be optimized to increase both the activity and the selectivity of a potential drug candidate, or to improve any other useful property that depends on local features of the chemical structure.

Molecular Field Topology Analysis (MFTA) [1-3] is a QSAR approach based on local descriptors. First, a so-called molecular supergraph is constructed: a simple graph such that the molecular graphs of all training set structures can be represented as its subgraphs. A uniform descriptor set for the statistical analysis is obtained by superimposing each training set structure onto the molecular supergraph. Each supergraph vertex is assigned the values of effective atomic charge, van der Waals radius, H-bond donor and H-bond acceptor ability, local lipophilicity and other parameters for the corresponding atom of the training set structure. For unoccupied vertices, neutral descriptor values are used. The predictive QSAR/QSPR models are derived using Partial Least Squares regression or Artificial Neural Networks. The analysis of the impact of local descriptors on the activity/property at different supergraph positions is very helpful in the search for new promising structures as well as in understanding their action. In addition, MFTA models can be used for the virtual screening of promising structures in chemical databases or in compound libraries built by means of specially designed structure generators.
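The superimposition step can be illustrated with a minimal Python sketch. It assumes that the supergraph vertices are simply numbered and that each training structure has already been mapped onto them. The function name `mfta_descriptor_vector`, the two property names and the neutral values are hypothetical placeholders chosen for illustration; they are not taken from the MFTA software.

```python
# Hypothetical neutral values for supergraph vertices that a structure leaves unoccupied.
NEUTRAL = {"charge": 0.0, "vdw_radius": 0.0}


def mfta_descriptor_vector(n_vertices, mapping, properties=("charge", "vdw_radius")):
    """Build a uniform descriptor vector for one structure.

    mapping: dict {supergraph vertex index: {property name: value}} holding the
    local descriptors of the atom superimposed on that vertex.
    """
    vector = []
    for vertex in range(n_vertices):
        atom = mapping.get(vertex)
        for prop in properties:
            # Unoccupied vertices contribute the neutral value of each property.
            vector.append(NEUTRAL[prop] if atom is None else atom[prop])
    return vector


# Toy example: a 4-vertex supergraph; the structure occupies vertices 0, 1 and 3.
structure = {
    0: {"charge": -0.12, "vdw_radius": 1.70},
    1: {"charge": 0.05, "vdw_radius": 1.70},
    3: {"charge": -0.38, "vdw_radius": 1.52},
}
row = mfta_descriptor_vector(4, structure)
```

Stacking one such row per training structure would give a matrix with the same columns for every compound, which is the form a PLS or neural network model expects.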

Recent advances of MFTA allow one to apply this tool to the design of new catalysts, antioxidants, dyes, enzyme inhibitors, receptor agonists, antagonists and modulators, virus entry inhibitors, etc. Special approaches were proposed for the molecular design of drugs having optimal activity and selectivity profile, including multi-target drugs.

The application of MFTA in conjunction with other QSAR/QSPR approaches (e.g. ADMET models based on fragmental descriptors) and molecular modeling techniques is especially fruitful.

1. Palyulin V.A., Radchenko E.V., Zefirov N.S. J. Chem. Inf. Comp. Sci., 2000, 40: 659-667. 2. Radchenko E.V., Palyulin V.A., Zefirov N.S. Molecular Field Topology Analysis in drug design and virtual screening, in Chemoinformatics Approaches to Virtual Screening, A. Varnek, A. Tropsha, eds, RSC, 2008, 150-181. 3. Makhaeva G.F., Radchenko E.V., Palyulin V.A., Rudakova E.V., Aksinenko A.Yu., Sokolov V.B., Zefirov N.S., Richardson R.J. Chem.-Biol. Int., 2013, 203: 231-237.

The work was supported by the Russian Foundation for Basic Research (projects no. 14-03-00851 and 15-03-09084).


D. Horvath L. Hoffer G. Marcou A. Varnek

S4MPLE – SAMPLER FOR MULTIPLE PROTEIN-LIGAND ENTITIES: SIMULTANEOUS DOCKING OF SEVERAL ENTITIES

Laboratoire de Chemoinformatique, UMR 7140; 1 rue Blaise Pascal, 67000 Strasbourg, France; [email protected]

S4MPLE [1,2], a conformational sampling tool based on a genetic algorithm coupled to an AMBER/GAFF core force field complemented by a continuum desolvation term, is able to address various problems, from folding/sampling of individual molecules to simultaneous docking of multiple entities. This latter, original ability is assessed in two important contexts:

First, the key problem of predicting water-mediated interactions is addressed by considering explicit water molecules as additional entities to be docked in the presence of the “main” ligand. Blind prediction of solvent molecule positions, reproducing relevant ligand-water-site mediated interactions, is achieved in 76% of cases over saved poses. S4MPLE also successfully predicted the displacement of crystallographic water by a purposely added functional group. However, water localization is a delicate issue in terms of the weighting of electrostatic and desolvation terms, and it also significantly increases the required sampling effort.

Second, simultaneous docking of two fragment-like ligands was attempted, as such ternary complexes are the basis of fragment-based drug design by linkage of independent binders. S4MPLE successfully met the challenge of predicting the locations of fragments involved in ternary complexes by means of multi-entity docking. The results reported here, obtained without massively parallel deployment of the software, are very encouraging.

1. Hoffer L. et al. J Chem Inf Model, 2012, 53: 88-102. 2. Hoffer L. et al. J Chem Inf Model, 2013, 53: 836-51


P. Polishchuk1 E. Mokshyna1 E. Varlamova2 E. Muratov3 T. Madzhidov4 V. Kuz’min1

APPLICATIONS OF THE MIXTURES REPRESENTATION APPROACH IN QSAR MODELING

1 A.V. Bogatsky Physico-Chemical Institute of National Academy of Sciences of Ukraine, Lustdorfskaya doroga 86, 65080 Odessa, Ukraine; 2 Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goiás, Rua 240, Qd. 87, Setor Leste Universitário, Goiânia, Goiás 74605-170, Brazil; 3 University of North Carolina, Beard Hall 301, CB#7568, Chapel Hill, NC, 27599, USA; 4 A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kremlevskaya st. 29/1, 420008 Kazan, Russia.

[email protected]

Various approaches towards mixture representation for QSAR modeling have been developed and published. Those works mainly dealt with binary mixtures of traditional chemicals (drugs, solvents, etc.), and most of them were limited to additive modeling. Due to the limitations of existing approaches, we developed our own approach based on the simplex representation of molecular structure. It is a non-additive scheme that can be applied to QSAR modeling of binary and more complex mixtures.

Earlier we demonstrated the application of this approach to modeling vapor-liquid equilibrium curves of binary mixtures of organic solvents and antiviral properties of drug combinations. The mixture representation paradigm, and particularly the developed approach to mixture representation, was also effectively applied to other QSAR tasks:

1) modeling of chemical reactions; 2) modeling of ligands binding to artificial receptors; 3) modeling of bulk properties of pure chemicals.

Part of the work was funded by grant No. 14-43-00024 of the Russian Scientific Foundation.


B. Creton

STRUCTURE-PROPERTY MODELLING IN THE OIL INDUSTRY

IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852 Rueil-Malmaison, France. [email protected]

The oil industry involves numerous scientific domains, each applied to various operations such as extraction, refining, synthesis of chemicals... Numerical simulation tools are now widely used during these operations, and IFP Energies nouvelles (IFPEN) recently investigated the use of Quantitative Structure-Property Relationships (QSPR) to predict properties of petrochemicals, materials... The purpose here is to present some recent work performed at IFPEN, with a focus on the QSPR modelling of mixtures: surfactants, alternative fuels, and adsorption of gases in nanoporous materials.

Over its duration, the process of crude oil extraction can comprise three stages, the third being enhanced oil recovery (EOR). Chemical EOR consists in the injection of alkaline/surfactant/polymer (ASP), and the formulation of ASP combinations is a challenging and time-consuming task, considering that each potentially eligible reservoir exhibits its own conditions. The optimal salinity is one of the key properties to consider when selecting a surfactant formulation, and we proposed QSPR-based models to assist the formulation [1-2].

Property predictions for alternative fuels can be used to assist the formulation of biofuels. We developed QSPR models to predict some fuel specifications (physical properties such as cetane number, flash point, enthalpy of combustion, melting point, density and viscosity) for families of compounds similar to those found in biofuels: hydrocarbons and oxygenated compounds [3-4]. First, QSPR-based models of pure compound properties were developed. Then, the case of mixtures was examined and two types of approaches were investigated: (i) the direct application of machine learning methods to mixture property data; (ii) the use of the previously developed pure compound property models in combination with theoretically based mixing rules. The compatibility of materials with fuel components is of major concern, especially as the fuel composition varies within a year and oxygenated compounds are being considered in the pool of renewable molecules. Machine learning approaches have been used to model the sorption of neat compounds and of up to quinary mixtures of some hydrocarbons, alcohols and ethers in a semicrystalline poly(ethylene) [5]. QSPR-based models were further tested for surrogate gasolines, and predictions were in excellent agreement with experimental sorption values.
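Approach (ii) can be sketched in a few lines of Python. The abstract does not state which mixing rules were used, so the sketch below assumes the simplest possible one, a mole-fraction-weighted average; the function name `linear_mixing_rule` and the numerical values are hypothetical and serve only as an illustration.

```python
def linear_mixing_rule(mole_fractions, pure_values):
    """Mole-fraction-weighted average of pure-compound property values.

    This linear rule is an assumption made for illustration; real properties
    often require property-specific, non-linear mixing rules.
    """
    if len(mole_fractions) != len(pure_values):
        raise ValueError("one property value is needed per component")
    if abs(sum(mole_fractions) - 1.0) > 1e-9:
        raise ValueError("mole fractions must sum to 1")
    return sum(x * p for x, p in zip(mole_fractions, pure_values))


# Arbitrary property values, standing in for the outputs of pure-compound
# QSPR models for the two components of an equimolar blend.
predicted_pure_values = [10.0, 20.0]
blend_value = linear_mixing_rule([0.5, 0.5], predicted_pure_values)
```

In such a scheme the QSPR models are trained on pure compounds only, and all information about composition enters through the mixing rule.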

In the adsorption field, promising porous materials have already been identified for carbon dioxide (CO2) capture applications, among them Zeolitic Imidazolate Frameworks (ZIF). The design of new nanoporous materials could be greatly accelerated using QSPR models [6]. However, the development of relevant descriptors for such materials remains a challenge.

1. Moreau P. et al. SPE International Symposium on Oilfield Chemistry, 2013, 164091-MS. 2. Muller C. et al., in preparation. 3. Saldana D.A. et al. Oil Gas Science and Technology, 2013, 68: 651-662. 4. Saldana D.A. et al. Energy Fuels, 2013, 27: 3811-3820. 5. Villanueva N. et al. Journal of Membrane Science, 2015, submitted. 6. Amrouche H. et al. RSC Advances, 2012, 2: 6028-6035.


T.I. Madzhidov1 A.I. Lin1,2 I. Antipin1 O. Klimchuk2 A. Varnek2

CONDENSED GRAPH OF REACTION APPROACH FOR PROTECTIVE GROUP REACTIVITY

1 Kazan Federal University, Kazan, Russia 2 Université de Strasbourg, Strasbourg, France [email protected]

In chemoinformatics, a chemical reaction is a difficult object because it involves several species of two types (reactants and products) and depends on experimental conditions (catalyst, solvent, temperature, etc.). This complexity often prevents one from applying to chemical reactions approaches developed for the treatment of individual molecules, e.g. similarity search in databases or structure-property modeling. These problems can be solved using the Condensed Graph of Reaction (CGR) approach, which represents any chemical reaction as a pseudo-molecule characterized by both conventional (single, double, etc.) bonds and “dynamical” bonds describing chemical transformations [1]. A set of fragment descriptors can easily be generated for a CGR, thus opening the way to any type of chemoinformatics modeling of chemical reactions [1]. However, atom-to-atom mapping (AAM) is required to generate a CGR for a given reaction.
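The idea of a CGR can be illustrated with a minimal Python sketch. It assumes that atom-to-atom mapping has already been done, so that every atom carries the same map number on both sides of the reaction. The data layout (a dictionary from atom-number pairs to bond orders) and the function names are hypothetical simplifications, not the in-house tools mentioned in this abstract.

```python
def condensed_graph(reactant_bonds, product_bonds):
    """Merge mapped reactant and product bonds into one pseudo-molecule.

    Both arguments map a pair of atom-map numbers to a bond order. The result
    maps each pair to (order before, order after); 0 means "no bond".
    """
    def normalise(bonds):
        # Store each bond under a sorted pair so (2, 4) and (4, 2) coincide.
        return {tuple(sorted(pair)): order for pair, order in bonds.items()}

    before, after = normalise(reactant_bonds), normalise(product_bonds)
    return {pair: (before.get(pair, 0), after.get(pair, 0))
            for pair in sorted(set(before) | set(after))}


def dynamic_bonds(cgr):
    """Bonds whose order changes in the reaction (formed, broken or transformed)."""
    return {pair: change for pair, change in cgr.items() if change[0] != change[1]}


# Toy mapped reaction: CH3-CH2-Cl + OH(-) -> CH3-CH2-OH + Cl(-)
# Atom-map numbers: 1 = CH3 carbon, 2 = CH2 carbon, 3 = Cl, 4 = O
reactants = {(1, 2): 1, (2, 3): 1}
products = {(1, 2): 1, (4, 2): 1}
cgr = condensed_graph(reactants, products)
changed = dynamic_bonds(cgr)
```

Here the C-C bond is conventional (order 1 before and after), while the broken C-Cl bond and the formed C-O bond are the dynamical bonds; fragment descriptors would then be counted on this single graph as for an ordinary molecule.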

In the presentation, the application of the CGR approach to different tasks of reaction chemoinformatics will be shown: reaction search, quantitative structure-reactivity modeling, and data mining.

Most attention will be paid to the analysis of a large database of protection/deprotection reactions. A key problem in dealing with them is choosing optimal experimental conditions (catalyst, solvent, additives, etc.) leading to selective protection or deprotection of a given group in a particular chemical environment. Reactivity charts from the well-known book by Greene [2], which has become a recognized guide in protection/deprotection chemistry, have been widely used for this purpose. On the other hand, these reactivity charts resulted from manual analysis of a relatively small amount of data and may therefore miss important information hidden in large reaction databases.

In this presentation we report a statistical analysis of a large dataset of hydrogenation reactions (142,111 reactions) extracted from the Reaxys database. For this purpose, we built a workflow involving numerous in-house tools for reaction data processing based on the Condensed Graph of Reaction (CGR) approach [1].

Three separate applications of the developed tools will be shown:
1. Statistical analysis of data, showing how many deprotection reactions of a given type proceeded and did not proceed under certain conditions.
2. Statistical analysis showing selectivity of deprotection on a given catalyst: which group is cleaved first and which is second.
3. An expert system predicting the best conditions for deprotection.

The full procedure of analysis and results are also presented in the poster presentation of A. Lin et al., “Toward an expert system for assessment of optimal conditions for selective deprotection reactions”.

1. Varnek A., Fourches D., Hoonakker F., Solov'ev V.P. J. Comput. Aided. Mol. Des., 2005, 19: 693-703.

2. Peter G. M. Wuts, Theodora W. Greene Greene's Protective Groups in Organic Synthesis / Edition 4, Wiley, 2006.


The research was supported by the Ministry of Education and Science of the Tatarstan Republic and the French Embassy in the Russian Federation. This work was performed in the framework of the Russian Scientific Foundation grant, agreement No. 14-43-00024, signed 1.10.2014.


V. Solov'ev1 M. Glavatskikh2,3 T. Madzhidov2,3 M. Gilles2 D. Horvath2 A. Varnek2

QSPR MODELS FOR HALOGEN BOND STRENGTH

1 Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, Leninskiy prosp., 31, 119071 Moscow, Russia; 2 Laboratoire de Chémoinformatique, UMR 7140 CNRS, Université de Strasbourg, 1, rue Blaise Pascal, 67000 Strasbourg, France; 3 Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlevskaya 18, Kazan, Russia; [email protected]

Predicting the affinity for halogens of electron-enriched groups in organic molecules, a property related but not identical to basicity, H-bond acceptor strength and nucleophilicity, represents a complementary approach for better characterization of intermolecular interactions in various contexts, including protein-ligand binding. Here, we report SVM and ensemble MLR models of the halogen bond basicity scale based on the stability constant of halogen bonding (log KBI2) and the labeled ISIDA fragment descriptors. These models display good predictive performance (RMSE = 0.39-0.47 log KBI2 units) in external cross-validation. Moreover, they are able to reliably estimate stability constants of the complexes of I2 with polyfunctional molecules (RMSE = 0.49-0.59). Consensus SVM and ensemble MLR models are available to users on our websites infochim.u-strasbg.fr/webserv/VSEngine.html (SVM) and http://vpsolovev.ru/programs/ (eMLR). Details of this work are given in the poster presentation by M. Glavatskikh et al. at this school.


G. Marcou D. Horvath A. Varnek

A MEASURE FOR QSAR MODELABILITY AND FOR DATA ORGANIZATION

Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 1, rue Blaise Pascal, 67000 Strasbourg, France; [email protected]

The lack of adequacy between a target property and the molecular descriptors encoding chemical structures may significantly deteriorate structure-activity models. Thus, an a priori assessment of data modelability within a particular descriptor space may significantly reduce the computational costs of QSAR modeling [1]. In this work, we propose to assess chemical data modelability using the Hilbert-Schmidt Independence Criterion (HSIC) [2], which is still rarely used in chemoinformatics. HSIC is an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator. Here we demonstrate that HSIC can be used efficiently to estimate a priori the adequacy of an ensemble of molecular descriptors to a given chemical library, both in simple QSAR and in multi-task learning studies.
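For readers unfamiliar with HSIC, the following Python sketch computes the standard biased empirical estimator from reference [2], tr(KHLH)/(n-1)^2, where K and L are kernel matrices over descriptors and property values and H is the centring matrix. The choice of Gaussian kernels, the bandwidth values and the function names are assumptions made for illustration; the abstract does not specify which kernels the authors used.

```python
import math


def gaussian_kernel(points, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel over a list of equal-length vectors."""
    n = len(points)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return gram


def center(gram):
    """Double-centre a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]
    col_mean = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[gram[i][j] - row_mean[i] - col_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(descriptors, activities, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between descriptor vectors and scalar activities."""
    n = len(descriptors)
    k_centered = center(gaussian_kernel(descriptors, sigma_x))
    l_gram = gaussian_kernel([[y] for y in activities], sigma_y)
    # tr(K H L H) = tr((H K H) L), written out as a double sum.
    trace = sum(k_centered[i][j] * l_gram[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Toy data: one descriptor per compound and two candidate "properties".
toy_descriptors = [[0.0], [1.0], [2.0], [3.0]]
dependent = hsic(toy_descriptors, [0.0, 1.0, 2.0, 3.0])
uninformative = hsic(toy_descriptors, [5.0, 5.0, 5.0, 5.0])
```

A property that varies with the descriptors gives a clearly positive value, whereas a constant property gives zero (up to rounding), which is the sense in which HSIC can flag descriptor spaces that carry no information about the target.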

The HSIC could also be used for data visualization purposes. Thus, this parameter helps to reorder a dataset in such a way that chemical structures located near a given position are chemically related (see Figure 1).

Figure 1. Ordering of ligands of the human dopamine D5 receptor around a central scaffold. The dotted lines indicate the minimum level of similarity of the enclosed compounds with the central scaffold.

1. Golbraikh, A.; Muratov, E.; Fourches, D.; Tropsha, A., Journal of chemical information and modeling 2014, 54 (1), 1-4. 2. Gretton, A.; Bousquet, O.; Smola, A.; Schölkopf, B. In Measuring statistical dependence with Hilbert-Schmidt norms, Algorithmic learning theory, Springer: 2005; pp 63-77.

We thank Dr I. Baskin from Lomonosov Moscow State University (Russia) for stimulating discussion.


Oral Presentations


D. Shulga O. Titov V. Palyulin N. Zefirov

PERSPECTIVES OF DIRECT MODELING OF HALOGEN BONDING IN EARLY DRUG DISCOVERY

Chair of Medicinal Chemistry, Department of Chemistry, Lomonosov Moscow State University, Leninskie Gory, Moscow, Russia [email protected]

Halogen bonding (XB), defined as an attractive interaction between a heavy halogen atom in a molecule and an electron-rich donor, has received considerable attention recently [1] as a promising new interaction pattern (alongside, e.g., hydrogen bonding, hydrophobic interactions and pi-pi stacking) that could be used rationally in the early stages of drug discovery [2]. XB uniquely combines hydrophobic properties and directional electrostatic interactions. Moreover, halogen atoms are abundant in drugs and drug candidates [3]; therefore, the route to additional optimization of a molecule is open.

Despite its potential usefulness, the description of XB in the computer-aided tools used in early drug discovery has not reached a level that ensures its comprehensive utilization. To date, only a few molecular mechanics descriptions of XB have been reported, as well as a few attempts to incorporate XB explicitly in scoring functions used for molecular docking and virtual screening. Thus, XB interactions are not properly represented in the current tools for the early stages of computer-aided drug discovery.

We review the prospects and means for incorporating the XB description at different levels of approximation, consistent with the approaches that have proved fruitful in drug discovery and have hence been incorporated in a series of tools used in drug discovery practice. The spectrum of levels of XB description should span from force fields to empirical scoring functions.

Starting from the nature of the XB interactions, namely the high anisotropy of the molecular electrostatic potential, we have previously compared and optimized two approaches to incorporating XB in force field modeling [4,5]. Our current focus is on the explicit incorporation of the XB description in the current scoring functions used for molecular docking and virtual screening studies. The main challenge is to quickly (avoiding quantum chemical calculations) assign proper values of anisotropic electrostatic parameters to a molecule belonging to a broad and diverse space of pharmaceutically relevant compounds. On the way to building an empirical scheme for assigning the electrostatic XB parameters to diverse organic molecules, we investigated the extent and the nature of the dependence of those parameters on the substitution pattern for an archetypal set of mono- and di-substituted phenyl halides. Now that the relative importance of resonance and inductive effects has been estimated, our current aim is to check the findings within the molecular docking environment.

1. R. Wilcken, M.O. Zimmermann et al. J. Med. Chem. 2013, 56, 1363-1388. 2. Y. Lu, Y. Liu et al. Expert Opin. Drug Discov. 2012, 7: 375-383. 3. Z. Xu, Z. Yang et al. J. Chem. Inf. Model. 2014, 54: 69-78. 4. O.I. Titov, D.A. Shulga et al. Dokl. Chem. 2013, 450: 139–143. 5. O.I. Titov, D.A. Shulga et al. Mol. Inf. 2015, in press.

This work was supported by Russian Foundation for Basic Research (Project No. 14-03-00851-a).


V. B. Siramshetty1 P. Banerjee1,2 A. Olubunmi3 R. Preissner1,4

POTENTIAL DRUG REPOSITIONING OPPORTUNITIES FOR EBOLA VIRUS DISEASE

1 Structural Bioinformatics Group, Institute for Physiology, Charité – University Medicine Berlin, Berlin, Germany; 2 Graduate School of Computational Systems Biology, Humboldt-Universität zu Berlin, Berlin, Germany; 3 Department of Chemistry, University of Ilorin, Ilorin, Nigeria; 4 BB3R – Berlin Brandenburg 3R Graduate School, Freie Universität Berlin, Berlin, Germany; [email protected]

Introduction: The recent outbreak of the Ebola virus (EBOV), which causes Ebola virus disease (EVD), has killed more than 5,000 individuals within a span of 10 months. This presents an important challenge to exploit existing computational methodologies in a quest for immediate solutions against EBOV. Drug repositioning has been a key approach among the repertoire of computational methods for proposing new indications for approved drugs. Methods: We present an integrated approach that combines chemical similarity methods with a protein-ensemble similarity method to repurpose approved drugs for EVD. A dataset containing over 32,500 drug-target interactions (5,244 drugs and 6,549 targets) was constructed and used as a template space. A total of 26 reference drugs that demonstrated promising activity against the EBOV proteins in various cell-based assays, and against those host proteins that mediate the entry of virus-like particles into cells and their further replication, were extracted from the literature. Pairwise binding-site similarities between all template and reference proteins were calculated using the SMAP software [1]. Substructure and physico-chemical property based fingerprints were generated to compute pairwise chemical similarities between template and reference drugs. Furthermore, we developed a scoring scheme to integrate the similarity scores and ranked drugs in the order of their potential to be repositioned for EVD. Results: We identified a total of 137 drugs that demonstrated the highest similarities to the experimental drugs for Ebola in both ligand and target spaces. One of the top-ranked approved drugs, amiodarone, is currently in a Phase 3 clinical trial (NCT02307591) aimed at evaluating its safety and efficacy in the treatment of patients afflicted with EVD [2]. Conclusions: We propose a novel drug repositioning approach to identify and prioritize drugs that require immediate pre-clinical investigation against EVD. However, this approach is not limited to EVD but has potential implications for a wide range of indications.
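The general shape of such a ranking can be sketched in Python. The abstract does not describe the fingerprints or the scoring scheme in detail, so everything below is an assumption made for illustration: fingerprints are represented as sets of on-bits compared with the Tanimoto coefficient, binding-site similarity is taken as a precomputed number scaled to [0, 1], and the two are combined by a weighted mean. The function names and the toy data are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def combined_score(chemical_similarity, binding_site_similarity, weight=0.5):
    """Hypothetical integration: weighted mean of ligand- and target-space similarity."""
    return weight * chemical_similarity + (1.0 - weight) * binding_site_similarity


def rank_candidates(candidates, reference_fp):
    """Rank template drugs by combined score against one reference drug.

    candidates: {name: (fingerprint set, binding-site similarity in [0, 1])}
    """
    scores = {
        name: combined_score(tanimoto(fp, reference_fp), site_similarity)
        for name, (fp, site_similarity) in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: bit indices and binding-site similarities are invented.
reference = {1, 2, 3, 4}
candidates = {
    "drug_A": ({1, 2, 3, 5}, 0.9),
    "drug_B": ({7, 8, 9}, 0.4),
}
ranking = rank_candidates(candidates, reference)
```

A drug ranks highly only when it resembles a reference drug in ligand space and its targets resemble the reference targets in binding-site space, which mirrors the "both in ligand and target spaces" criterion stated in the results.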

1. Bourne PE. et al. Nucleic Acids Research, 2010, 38(Web Server issue):W441-W444.

2. Clinical Study to Assess Efficacy and Safety of Amiodarone in Treating Patients With Ebola Virus Disease (EVD) in Sierra Leone. EASE (EMERGENCY Amiodarone Study Against Ebola). [http://clinicaltrials.gov/ct2/show/NCT02307591]

We acknowledge the funding from TWAS-DFG 2014 funding program. The funders had no role in study design, data collection and analysis.


O. Tarasova1 A. Urusova1 A. Zakharov2 M. Nicklaus2 V. Poroikov1

APPLICATION OF THE LARGE-SCALE DATABASE TO THE QSAR MODELING OF THE HIV-1 REVERSE TRANSCRIPTASE INHIBITORS

1 Laboratory for Structure-Function Based Drug Design, Institute of Biomedical Chemistry, Pogodinskaya Str., 10 Building 8, Moscow, Russia, 119121; 2 CADD Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, NCI-Frederick, Building 376, Room 205, 376 Boyles St., Frederick, MD 21702 [email protected]

Many publicly and commercially accessible databases contain information about the chemical structure and biological activity of drug-like organic compounds [1]. Several methods have been suggested to reduce inconsistency in publicly available bioactivity databases [1, 2]. Typically, these approaches are based on selecting the compounds investigated by a single team of authors to reduce the impact of different assays on the activity measurements. However, it remains an open question how to create consistent data sets for QSAR modeling from large-scale databases of chemical compounds. In our study we investigated ways to automatically prepare modeling sets using the Integrity and ChEMBL databases as examples of commercially and publicly available databases, respectively. We selected HIV-1 reverse transcriptase (RT) inhibitors for this research because this target provides a good case due to the presence of results from multiple assays in the databases. The structures of all HIV-1 RT inhibitors available from ChEMBL and Integrity were collected, including compounds assayed against both wild type and mutants of RT. Integrity provided a data set of 1,327 records (564 unique compounds) tested in more than 1,300 bioassays. ChEMBL yielded 3,787 records (2,297 unique compounds) tested in about 100 bioassays. For each of the two general subclasses of HIV-1 RT (wild type and mutant forms) we suggested several different ways to compile data sets for creating QSAR models: (1) selection of all compounds tested against a specific end-point; (2) selection of the compounds tested using one method and material (biological assay); (3) selection of the compounds derived from one scientific publication. We used the GUSAR program to build QSAR models. We tested the performance of the obtained QSAR models with leave-30%-out cross-validation (LMO) and five-fold cross-validation procedures; we then discussed the compatibility of the data from ChEMBL and Integrity. For most of the modelling sets from the Integrity database we observed an increase in the performance of the QSAR models created by the second compiling method in comparison to the first one (characteristics of the best model from the second compiling method, the data set “Antigen assay, Mononuclear cells (blood) as a material”: N=52; R²=0.85; Q²=0.76; F=7.7; SD=0.91; R²(LMO)=0.75; R²(5-fold)=0.64). However, for the data sets from ChEMBL we did not observe a similar trend. We suggest that this results from either insufficient annotation or incomplete description of the assays in the scientific publications, which leads to a very fuzzy classification of the assay types in ChEMBL that is not meaningful in terms of consistency. That observation agrees with the conclusions of Kalliokoski et al. [1]. We could not create modelling sets using the third compiling method (selection of the compounds derived from one scientific publication) for the data sets from the Integrity database, whereas the third compiling method led to an increase in the performance of the models built on the data sets from the ChEMBL database. We also proposed an algorithm to automatically match data from ChEMBL and Integrity for compounds that were tested under similar experimental conditions.

1. Kalliokoski T. et al. PloS One, 2013, 8: e61007. 2. Muresan S. et al. Drug Discov. Today, 2011, 16, 1019–1030.

This work was supported by the Russian Foundation of Basic Research (grant No. 13-04-91455_NIH-a).


P. Sidorov1,2 H. Gaspar1 G. Marcou1 D. Horvath1 A. Varnek1

MAPPABILITY OF DRUG-LIKE SPACE: TOWARDS A POLYPHARMACOLOGICALLY COMPETENT MAP OF DRUG-RELEVANT COMPOUNDS

1 Laboratoire de Chémoinformatique, UMR 7140 CNRS - Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France; 2 Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia [email protected]

This work addresses the question of whether a “Universal model” of the Chemical Space exists and proposes a representation of it. A universal model is understood as a probability distribution of compounds that is set-independent. The probabilistic model is built as a Generative Topographic Map (GTM). The claim of “universality” is quantitatively justified with respect to all the structure-activity information available so far. To this end, an evolutionary map growth and selection procedure treated both the choice of meta-parameters (poling molecule sets, descriptor types) and map-specific parameters (size, RBF function controls, etc.) as degrees of freedom. It was associated with a fitness function measuring the polypharmacological performance of the map in a multi-target (144 targets) quantitative affinity prediction challenge. Under the pressure of Darwinian selection, the emerging maps were pushed to find (a) the best descriptor type among the proposed substructural molecular fragment descriptor schemes, and (b) the specific non-linear “recipe” for generating a model GTM probability distribution that enhances the information contained in certain descriptor elements but suppresses descriptor “noise”. The fittest manifolds were seen to “grow” in rather low-resolution molecular descriptor spaces: pharmacophore- or force-field-type colored atom pairs and triplets, rather than the more specific sequence or circular fragment counts included in the pool of competing ISIDA descriptor types.

The obtained maps were well suited to solving classification problems: overall, more than 80% of the more than 600 distinct and varied classification problems, chosen so as to cover a maximum of exploitable SAR data, were successfully solved. This justifies, in our view, the claim of "Universality" of the constructed GTMs.

In addition, intuitive 2D representations were shown to provide an insightful analysis of drug-like space and to open broad perspectives for target- and therapeutic range-related compound collections. Thanks to the quantitative validation, the user may gain confidence in the rendered visual patterns and draw meaningful conclusions from them.

P. Sidorov thanks the Program of Competitive Growth of Kazan Federal University for the support.


Poster Presentations


P. Banerjee1,2 V. B. Siramshetty1 M. N. Drwal1 A. Goede1 M. Dunkel1 R. Preissner1,3

SYSTEMATIC ASSESSMENT OF DIFFERENT COMPUTATIONAL APPROACHES FOR PREDICTION OF TOXIC EFFECTS OF CHEMICAL STRUCTURES

1 Structural Bioinformatics Group, Institute for Physiology, Charité – University Medicine Berlin, Berlin, Germany; 2 Graduate School of Computational Systems Biology, Humboldt-Universität zu Berlin, Berlin, Germany; 3 BB3R – Berlin Brandenburg 3R Graduate School, Freie Universität Berlin, Berlin, Germany; [email protected]

Approximately 20% of failures in late drug development are caused by toxicities and adverse drug reactions (ADR). Animal trials are currently the major method for determining the possible toxic effects of drug candidates. As an alternative, several traditional chemoinformatics approaches, such as Quantitative Structure-Activity Relationship modeling and ligand- and structure-based approaches, have been proposed that perform well in silico, thus enabling a reduction of cost, time and animal experiments. Molecular similarity analysis combined with identification of toxic fragments was reported to show promising performance in the prediction of rodent oral toxicity [1,2]. Furthermore, pharmacophore models (toxicophores) were reported to indicate possible toxicity targets associated with adverse drug effects. Predicting in vitro effects solely on the basis of structural descriptors has also received great attention in recent years.

Here, we describe different computational approaches and their intrinsic limitations while comparing their performance across the data sets provided in the Tox21 Data Challenge 2014. In particular, we examine different methodologies including analysis of toxic fragments, pan-assay interference compound (PAINS) substructures and toxicophore mapping. Additionally, a case study has been reported in which two different drugs with similar toxic class effects cause similar ADR as a result of sharing similar toxicological pathways or networks [3]. Proper understanding of tissue specificity is necessary to detect relevant genes and pathways in a specific organ and to identify the key nodes underlying the organ-specific safety profile of a particular drug.

1. Drwal M.N., Banerjee P. et al. Nucleic Acids Res., 2014, 42: W53-8.
2. Drwal M.N., Banerjee P. et al. Altex Proceedings, 9th World Congress on Alternatives and Animal Use in the Life Sciences, 2014.
3. Metushi I.G., Banerjee P., Gohlke B.O., English A.M., Lucas A., Moore C., Sidney J., Buus S., Ostrov D.A., Mallal S., Phillips E., Shabanowitz J., Hunt D., Preissner R., Peters B. PLoS ONE, 2015.

We acknowledge the funding from DFG, GRK 1172, GRK 1360, Innovative Toxicology for the Reduction of Animal Experimentation (e:ToP), Immunotox Project (BMBF) [031A268B].


P. N. D’yachkov1 E. P. D’yachkov1

INTERACTION OF CARBON NANOTUBES WITH ANTITUMOR DRUGS DOXORUBICIN AND PACLITAXEL CALCULATED USING MOLECULAR DOCKING AND DYNAMICS TECHNIQUES

1 Kurnakov Institute of General and Inorganic Chemistry of the Russian Academy of Science, Leninskii pr. 31, 119991 Moscow, Russia [email protected]

Carbon nanotubes are attractive for many biomedical uses, and antitumor drug delivery is an important one. Paclitaxel and doxorubicin are drugs used for the chemotherapeutic treatment of cancer, but they have very low solubility in aqueous solution. The drugs can be used more widely and efficiently in practice if they are delivered directly to diseased tissue. Nanotubes can provide this, since they accumulate in diseased tissue and form complexes with the drugs. Nanotubes are almost insoluble in water, but their complexation with poly(ethylene glycol) leads to water-soluble nanotube derivatives. The present study is aimed at estimating how the energy of complex formation of nanotubes with the drugs and poly(ethylene glycol) depends on the nanotube diameter and chirality, and at recommending which type of nanotubes would form the most stable binary and ternary complexes with these molecules. To assess the structure and stability of the complexes, we used molecular docking and dynamics techniques. It is shown that for nanotubes of small diameter (d < 14.9 Å) of any chirality, the paclitaxel molecule is located on the outside of the nanotube. For nanotubes of larger diameter, paclitaxel is located inside the tubule. The dependence of the paclitaxel binding energy on the tube diameter is nonmonotonic: the paclitaxel complexes with nanotubes ~15 Å in diameter are the most stable, with the paclitaxel molecule located inside the nanotube. For doxorubicin, the complexes with nanotubes having d ~ 13.5 Å are the most stable; the drug is also located inside the nanotube. The external or internal location of poly(ethylene glycol) is likewise dictated by the tube diameter; however, the boundary diameter is different (d ~ 11 Å). This makes it possible to obtain excellent drug carriers with the poly(ethylene glycol) molecule inside and the drug molecules outside the nanotube. This pertains to the complexes of paclitaxel and doxorubicin with nanotubes with 11 Å ≤ d ≤ 14.9 Å and 11 Å ≤ d ≤ 12.5 Å, respectively.

1. D’yachkov E.P., D’yachkov P.N. Nanosci. and Nanotechnol. Lett., 2013, 5: 1–3.
2. Kiruta N.V., D’yachkov E.P., D’yachkov P.N. Computer Modelling and New Technologies, 2011, 15, No. 4: 16–22.

This work was supported by the Russian Foundation for Basic Research (grant 03-14-00493).


T. Gimadiev1,2 H. Gaspar1 G. Marcou1 D. Horvath1 A. Varnek1

GENERATIVE TOPOGRAPHIC MAPPING APPROACH TO MODELING AND CHEMICAL SPACE VISUALIZATION OF HUMAN INTESTINAL TRANSPORTERS

1 Laboratoire de Chémoinformatique, Université de Strasbourg, France 2 A.M. Butlerov Institute of Chemistry, Kazan Federal University, Russia [email protected]

Members of the ATP-binding cassette superfamily and the solute carrier family are among the most common cellular membrane transport proteins. These transporters play key roles in drug design because they regulate the influx or efflux of xenobiotic and endogenous chemicals. The selection of viable drug candidates among biologically active compounds requires the assessment of their transporter interaction profiles.

In this work, the Generative Topographic Mapping (GTM) and Kernel Generative Topographic Mapping (KGTM) methods are used both to visualize the chemical space of Human Intestinal Transporters and to build classification models linking chemical structure with biological activities. The modeling was performed using ISIDA and MOE 2D descriptors. KGTM calculations were performed using linear and Tanimoto kernels. Benchmarking studies for individual transporter subsets show that the performance of GTM models involving ISIDA descriptors is similar to that obtained with the SVM method, whereas SVM outperforms GTM if MOE descriptors are used. GTM and KGTM perform similarly for MOE descriptors. The model performance can be significantly improved using a GTM-based applicability domain (AD) [1], which discards the areas of uncertain predictions (see Figure 1).

The entire chemical space of the ligands against 11 different targets has been analyzed by considering the areas populated by active and inactive compounds. Comparing activity zones for different targets allowed us to identify the areas populated by compounds selectively binding particular proteins.

Figure 1. Graphical interpretation of the applicability domain for GTM classification models. On the map prepared for the entire set of 1568 molecules of inhibitors (dark grey) and non-inhibitors (light grey) of P-glycoprotein 1, the color stands for the class having the highest probability in a given node. Black points correspond to incorrectly classified molecules. Increasing the class prevalence factor (CPF = probability of the major class / probability of the other class) from CPF = 1 (left) to 4 (right) shrinks the AD area. This decreases the number of molecules inside the AD (coverage), on the one hand, and increases the model’s performance (Balanced Accuracy, BA), on the other hand.

1. Gaspar H. A., Marcou G., Arault A., Lozano S., Vayer P., Varnek A. J. Chem. Inf. Model., 2013, 53 (4): 763-772.

Figure 1 panels: left, CPF = 1, BA = 0.84, coverage = 100%; right, CPF = 4, BA = 0.90, coverage = 78%.
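The trade-off between coverage and balanced accuracy described in the Figure 1 caption can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: each molecule is reduced to a pair of class probabilities plus its true label, the function `evaluate_with_ad` is hypothetical, and the toy probabilities are invented.

```python
def evaluate_with_ad(molecules, cpf_threshold):
    """Return (coverage, balanced accuracy) for molecules inside the AD.

    molecules: list of (p_active, p_inactive, is_active) tuples, where the
    probabilities are those of the map node a molecule is assigned to.
    A molecule is inside the AD when CPF = p_major / p_minor >= threshold.
    """
    inside = []
    for p_active, p_inactive, is_active in molecules:
        p_major = max(p_active, p_inactive)
        p_minor = min(p_active, p_inactive)
        cpf = float("inf") if p_minor == 0 else p_major / p_minor
        if cpf >= cpf_threshold:
            inside.append((p_active > p_inactive, is_active))
    coverage = len(inside) / len(molecules)
    # Balanced accuracy: mean of the per-class recalls over classes present.
    recalls = []
    for label in (True, False):
        preds = [pred for pred, truth in inside if truth == label]
        if preds:
            recalls.append(sum(pred == label for pred in preds) / len(preds))
    ba = sum(recalls) / len(recalls) if recalls else float("nan")
    return coverage, ba

# Invented toy data: (p_active, p_inactive, true class is active?)
toy = [
    (0.90, 0.10, True), (0.85, 0.15, True), (0.60, 0.40, False),
    (0.10, 0.90, False), (0.45, 0.55, True), (0.12, 0.88, False),
]
cov1, ba1 = evaluate_with_ad(toy, cpf_threshold=1)
cov4, ba4 = evaluate_with_ad(toy, cpf_threshold=4)
```

On the toy data, raising the threshold from 1 to 4 removes the two low-confidence molecules, so coverage falls while balanced accuracy rises, which is the qualitative behaviour the caption describes.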


T. R. Gimadiev1,2 R. I. Nugmanov1 T. I. Madzhidov1 P. G. Polishchuk1,3 A. S. Petrovsky1 I. I. Baskin4 I. S. Antipin1 A. A. Varnek1,2

PREDICTION OF TAUTOMER EQUILIBRIUM CONSTANTS USING CONDENSED GRAPHS OF REACTION

1 Laboratory of Chemoinformatics and Molecular Modeling, A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya Str., 18, Kazan, Russia; 2 Chemoinformatics Laboratory, University of Strasbourg, B. Pascal Str., 4, Strasbourg, France; 3 A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lustdorfskaya doroga, 86, Odessa, Ukraine; 4 Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia; [email protected]

Chemical reactivity, spectral and physico-chemical properties of compounds are highly dependent on the equilibrium between different tautomeric forms. Typically, in chemical databases or QSAR/QSPR modeling the form most stable in water is considered. On the other hand, many experiments are carried out in non-aqueous media or water-organic solvent mixtures. Prediction of the tautomer populations as a function of solvent therefore represents a real challenge for chemoinformatics.

In this work, we report QSPR modeling of tautomer equilibrium constants (logKT) in different solvents and water-organic solvent mixtures. A dataset containing logKT values for 1074 tautomer equilibria (801 of them were used in modeling and testing) in 12 pure solvents and 7 types of water-organic solvent mixtures was compiled from the literature. These data cover 12 types of tautomer transformations. Two external test sets (Test1 and Test2) were selected for examination of the produced model and comparison with other methods. Each equilibrium was encoded as a Condensed Graph of Reaction (CGR), a pseudo-molecule containing both conventional bonds (e.g., single, double, aromatic) and dynamical bonds (e.g., single to double) [1, 2]. For all CGRs, ISIDA fragment descriptors were generated [3]. Solvents were encoded by 15 descriptors representing solvent polarity, polarizability, H-acidity and basicity [2]. Temperature was encoded by its value in degrees Celsius. QSPR models were built using the SVM method, with a genetic algorithm selecting the best models. Both universal (for all data) and specific (for each transformation type) models were obtained. They have reasonable predictive performance: consensus RMSE was about 0.65 logKT units in 30 x 5-fold cross-validation for universal models and 0.34-0.97 for specific models. The RMSE of the predicted tautomer percentage for the universal model is about 17% (based on consensus logKT predicted for test sets during cross-validation). As a score of correctly predicted dominant tautomer we used the percentage of correctly predicted signs of logKT (CPS), which was 84%. Test1 was predicted with RMSE 1.63 and CPS 54%, and Test2 with RMSE 0.73 and CPS 87%. However, when we excluded the zone of high model noise from -0.7 to 0.7 logKT (which corresponds to the test RMSE in cross-validation of the models), CPS increased to 84% and 100% for Test1 and Test2, respectively.
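The evaluation scores used above (RMSE, the share of correctly predicted logKT signs, and the tautomer percentage) can be written down in a few lines. This is a sketch, not the authors' code: the function names and toy values are hypothetical, KT is assumed to be the ratio [B]/[A] of the two tautomer concentrations, and the noise zone is assumed to be applied to the predicted values, which the abstract does not specify.

```python
import math

def rmse(pred, obs):
    """Root-mean-square error between predicted and observed logKT."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def correct_sign_rate(pred, obs, noise_zone=0.0):
    """Fraction of equilibria whose dominant tautomer (sign of logKT) is
    predicted correctly. Predictions with |logKT| <= noise_zone are
    skipped (assumption: the zone is applied to predicted values)."""
    kept = [(p, o) for p, o in zip(pred, obs) if abs(p) > noise_zone]
    hits = sum((p > 0) == (o > 0) for p, o in kept)
    return hits / len(kept), len(kept)

def tautomer_percent(log_kt):
    """Population (%) of tautomer B for KT = [B]/[A]."""
    k = 10.0 ** log_kt
    return 100.0 * k / (1.0 + k)

# Invented toy values, for illustration only.
observed = [1.2, -0.3, 2.0, -1.5, 0.4]
predicted = [0.9, 0.2, 1.6, -1.1, -0.5]

error = rmse(predicted, observed)
cps_all, n_all = correct_sign_rate(predicted, observed)
cps_filtered, n_filtered = correct_sign_rate(predicted, observed, noise_zone=0.7)
```

In the toy example the two sign errors both fall inside the noise zone, so excluding it raises the sign score at the cost of leaving fewer equilibria with a prediction.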

For comparison, logKT in several pure solvents was also assessed by quantum chemical (QC) calculations at the DFT B3LYP/6-311++G(d,p) level, using the IEFPCM model for solvent description with SMD non-electrostatic corrections. The RMSE of prediction for Test1 and Test2 was 5.8 and 1.62, with CPS 48% and 74%, respectively. Thus, our QSPR models perform much better than the QC calculations.

Tautomeric equilibria in water at a temperature of about 25 °C were selected to test the quality of predictions made by the ChemAxon/Tautomerizer plugin. ChemAxon does not detect tautomerism for more than half of the set, and for the rest of the transformations its predictions have RMSE 4.6. At the same time, our QSPR model based on CGRs yields reasonable predictions for all compounds, with RMSE 1.69.

1. A. Varnek, D. Fourches, F. Hoonakker, V. P. Solov’ev. J. Computer-Aided Molecular Design, 2005, 19: 693-703.

2. T. I. Madzhidov, P. G. Polishchuk, R. I. Nugmanov, A. V. Bodrov, A. I. Lin, I. I. Baskin, A. A. Varnek, I. S. Antipin. Russian Journal of Organic Chemistry, 2014, 50 (4): 459-463.

3. A. Varnek, D. Fourches, D. Horvath, O. Klimchuk, C. Gaudin, P. Vayer, V. Solov’ev, F. Hoonakker, I. V. Tetko, G. Marcou. Current Computer-Aided Drug Design, 2008, 4 (3): 191-198.

This work was supported by Russian Scientific Foundation grant, agreement No. 14-43-00024, signed 1.10.2014.


M. Glavatskikh1,3 T. Madzhidov1,3 V. Solov’ev2 D. Horvath1 G. Marcou1 A. Varnek1

PREDICTIVE MODELS FOR HALOGEN BOND BASICITY SCALE PKI2

1 Laboratory of Cheminformatics, University of Strasbourg, Street Blaise Pascal 1, Strasbourg, France; 2 Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, Leninskiy prosp. 31a, Moscow, Russia; 3 Laboratory of Cheminformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kremlevskaya 18, Kazan, Russia; [email protected]

This work is devoted to building QSPR (Quantitative Structure-Property Relationship) models of the halogen bond basicity scale pKI2. Understanding and predicting the affinity for halogens of electron-enriched groups in organic molecules (a property related, but not identical, to basicity, H-bond acceptor strength and nucleophilicity) represents a complementary approach for better characterization of intermolecular interactions in various contexts, including protein-ligand binding. The scale is based on the experimental 1:1 (B:I2) complexation constant logKI2 of organic compounds (B) with diiodine (I2) as a reference halogen-bond donor in hexane at 298 K [1]. Models based on ISIDA fragment descriptors were obtained and cross-validated using support vector machine (SVM) and ensemble multiple linear regression (eMLR) methods on a set of 598 organic compounds. Some of these successfully passed the cross-validation test, at RMSE = 0.45-0.56 logKI2 units. A consensus model returning the mean of the values predicted by the most successful individual SVM models, based on various ISIDA fragmentation schemes [2,3,4] and including applicability domain assessment strategies (bounding box, standard deviation of consensus prediction), was made publicly accessible on our web server infochim.u-strasbg.fr/webserv/VSEngine.html. It may optionally perform an automated detection of putative halogen bonding centers, or alternatively consider the centers labeled as such (as ChemAxon "marked atoms") by the user.
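A consensus prediction with the two applicability domain checks named above (bounding box and standard deviation of the individual predictions) might look like the sketch below. It is a simplification under stated assumptions: the individual models are stand-in Python functions rather than trained SVM models, all models share one descriptor vector (the real models use different ISIDA fragmentation schemes), and `max_std` is a hypothetical threshold.

```python
from statistics import mean, pstdev

def in_bounding_box(x, training_X):
    """True if every descriptor of x lies within the range seen in training."""
    for j, value in enumerate(x):
        column = [row[j] for row in training_X]
        if not (min(column) <= value <= max(column)):
            return False
    return True

def consensus_predict(x, models, training_X, max_std):
    """Mean prediction of all models plus a reliability flag.

    The prediction is flagged reliable only if x is inside the training
    bounding box and the models agree to within max_std.
    """
    preds = [model(x) for model in models]
    value = mean(preds)
    spread = pstdev(preds)
    reliable = in_bounding_box(x, training_X) and spread <= max_std
    return value, spread, reliable

# Stand-in models and descriptors, invented for illustration.
training_X = [[0, 1], [2, 3], [4, 0]]
models = [
    lambda x: 0.5 * x[0] + 0.1 * x[1],
    lambda x: 0.4 * x[0] + 0.3,
    lambda x: 0.6 * x[0] - 0.1,
]

value, spread, reliable = consensus_predict([2, 1], models, training_X, max_std=0.3)
value_out, spread_out, reliable_out = consensus_predict([10, 1], models, training_X, max_std=0.3)
```

The second query lies outside the descriptor ranges of the training set and the stand-in models disagree on it, so it is returned with `reliable_out` set to `False`.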

The developed models were then challenged on an external test set of 11 polyfunctional compounds, for which the measured I2 complexation equilibrium constants could not be unambiguously assigned to any one of the conceivable halogen bond acceptor groups. To this end, the models were used to predict pKI2 of the individual centers, followed by an estimation of the effective complexation constant with the help of the ChemEqui [5] program. With a prediction error of RMSE = 0.55, only slightly larger than the cross-validation results, this external prediction challenge is considered a success.

1. Laurence C., Graton J., Berthelot M., El Ghomari M. J. Chemistry - A European Journal, 2011, 17: 10431-10444.

2. Varnek A., Fourches D., Hoonakker F., Solov’ev V. P. J. Computer-Aided Mol. Design, 2005, 19: 693-703.

3. Ruggiu F., Marcou G., Varnek A., Horvath D. Mol. Informatics, 2010, 29: 855–868.

4. Ruggiu F., Solov’ev V., Marcou G., Horvath D., Graton J., Le Questel J.-Y., Varnek A. Mol. Informatics, 2014, 33: 477–487.

5. Solov’ev V. P., Tsivadze A. Y. Protection of Metals and Physical Chemistry of Surfaces, 2015, 51: 1-35.

This work was supported by Russian Scientific Foundation grant, agreement No. 14-43-00024, signed 1.10.2014.


V. Gordeeva 1, D. I. Osolodkin2

SYSTEM OF STRUCTURE-BASED VIRTUAL SCREENING OF MYCOBACTERIUM TUBERCULOSIS PROTEIN KINASE PKNB INHIBITORS

1 Chair of Bioinformatics, Department of Biological and Medical Physics, MIPT, Institutskiy per. 9, Dolgoprudny 141700, Moscow Region, Russia; 2 Department of Chemistry, Lomonosov Moscow State University, Leninskie Gory 1/3, Moscow 119991, Russia; [email protected]

Tuberculosis (TB) is a widespread infectious disease caused by Mycobacterium tuberculosis (Mtb). The emergence of multi-drug resistant strains of Mtb has stressed the need for new drugs to treat tuberculosis [1].

The serine/threonine protein kinase PknB is an important enzyme that plays a role in cell division and elongation of Mtb [2]. Inhibition of PknB therefore represents a potential therapeutic approach for the treatment of TB. Although about 60 small molecules have been shown to inhibit PknB, none of them is used in clinical practice.

In this study we develop virtual screening techniques based on molecular docking for ATP-competitive PknB kinase inhibitors, which could find use in tuberculosis treatment.

Nine available X-ray structures of PknB were obtained from the Protein Data Bank (PDB) and aligned by the backbone of the hinge residues (residues 92–97). The database of true active compounds contained the 57 known PknB inhibitors, which were drawn in ChemAxon’s MarvinSketch application [3]. A decoy set (600 compounds) was built in the program DecoyFinder on the basis of high similarity of physico-chemical properties and topological dissimilarity to known inhibitors of PknB kinase [4]. Docking was performed with the flexible anchor-and-grow algorithm available in DOCK 6.7 [5].

Analysis of the docking results for the known PknB inhibitors revealed characteristic interactions between the ligands and the target: hydrogen bonds with GLU93 and VAL95 and, in some cases, close contact with LEU17. To evaluate the performance of ranking molecules, the ROC (Receiver Operating Characteristic) method was employed, based on the assessment of the binding energy between target and ligand.
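The ROC evaluation of a docking run can be summarized by the area under the ROC curve (AUC), which equals the probability that a randomly chosen active is ranked above a randomly chosen decoy. The sketch below computes it by direct pairwise comparison; it assumes that a lower (more negative) score means stronger predicted binding, and the function name and scores are invented for illustration.

```python
def roc_auc(active_scores, decoy_scores):
    """AUC as the fraction of (active, decoy) pairs in which the active
    has the better score. Lower score = better (assumption); ties count
    as half a win."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))

# Invented docking scores for one receptor structure.
actives = [-9.1, -8.4, -6.0]
decoys = [-7.5, -6.2, -5.0, -8.0]

auc = roc_auc(actives, decoys)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation; comparing the AUC obtained with each of the X-ray structures is one way to judge which structures are suitable for screening.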

As a result of the study, the suitability of the individual structures was analyzed, and an algorithm for a statistically reliable virtual screening system based on nondegenerate ensemble docking was developed.

1. Av-Gay Y., Everett M. The eukaryotic-like Ser/Thr protein kinases of Mycobacterium tuberculosis. Trends Microbiol.,2000,Vol. 8, № 5: 238– 244.

2. Lougheed K.E. a et al. Effective inhibitors of the essential kinase PknB and their potential as anti-mycobacterial agents. Tuberculosis (Edinb). Elsevier Ltd, 2011, Vol. 91, № 4: 277–286.

3. Cereto-Massagué A. et al. DecoyFinder: an easy-to-use python GUI application for building target-specific decoy sets. Bioinformatics, 2012, Vol. 28, № 12:1661–1662.

4. Brozell S.R. et al. Evaluation of DOCK 6 as a pose generation and database enrichment tool. J. Comput. Aided Mol. Des., 2012, Vol. 6: 749-773.

5. Lougheed K.E. a et al. Effective inhibitors of the essential kinase PknB and their potential as anti-mycobacterial agents. Tuberculosis (Edinb). Elsevier Ltd, 2011, Vol. 91, № 4: 277–286


G. Jonusauskas1 S. A. Denisov1,2 N. D. McClenaghan2

EQUILIBRATION BETWEEN ELECTRONIC STATES AND REVERSIBLE ELECTRONIC ENERGY TRANSFER IN BICHROMOPHORIC COMPOUNDS

1 Laboratoire Ondes et Matière d’Aquitaine, UMR CNRS 5798, Bordeaux University, 351 cours de la Libération, 33405 Talence France; 2 Institut des Sciences Moléculaires, UMR CNRS 5255, Bordeaux University, 351 cours de la Libération, 33405 Talence France; [email protected]

The efficient use of energy following light absorption is of extreme importance in natural photosynthetic assemblies as well as in artificial systems. Small supramolecular systems have been used successfully to absorb light energy and transfer it to a specific site, while reversible energy transfer processes in polypyridine complexes of transition metals have been reported to temporarily store energy and prolong excited-state lifetimes.

Strategies to increase luminescence lifetimes and quantum yields have relied on disfavoring thermally activated loss by increasing the energy gap between states through lowering of the emissive 3MLCT levels (achieved by incorporating highly electron-poor terpyridine-like ligands) and/or by increasing the steric rigidity of complexes in excited states. Additionally, coupling with an organic auxiliary with a matched lowest triplet state energy led to a spectacular increase of the metal-complex-centered luminescence lifetime and quantum yield.

Here we report the unique excited-state equilibration between three different excited states in a structurally simple bichromophoric copper(I)-phenanthroline complex coupled through a short spacer to an auxiliary anthracene chromophore acting as an energy reservoir [1] (Figure 1), as well as an unprecedented increase of luminescence lifetimes in ruthenium(II) complexes based on tridentate polypyridine ligands linked to an anthracene chromophore [2] and in an emissive cyclometallated iridium(III) centre connected to a pyrene chromophore [3].

Figure 1. Equilibration between excited states in the Cu(I)-phenanthroline – anthracene system.

The choice of auxiliary chromophores, the energetic schemes, and the rate constants and efficiencies will be discussed in detail in this report.

1. Leydet Y. et al. Journal of the American Chemical Society, 2007, 129: 8688-8689. 2. Ragazzon G. et al. Chemical Communications, 2013, 49: 9110-9112. 3. Denisov S. A. et al. Inorganic Chemistry, 2014, 53: 2677–2682.

The authors wish to acknowledge financial support from the CNRS, Université Bordeaux I, Région Aquitaine, the European Union (HetIridium, CIG322280), and the programme IdEx Bordeaux – LAPHIA (ANR-10-IDEX-03-02).


A. Kosinskaya1,2 P. Polishchuk1 L. Ognichenko1 O. Lebed2 I. Burdina2 V. Kuz’min1

2D-QSAR MODELS OF BLOOD-BRAIN BARRIER PERMEABILITY OF ORGANIC COMPOUNDS

1 A.V. Bogatsky Physical-Chemical Institute of NAS of Ukraine, Lustdorfskaya doroga 86, 65080 Odessa, Ukraine; 2 Odessa National Medical University, Valikhovskiy lane 2, 65026 Odessa, Ukraine [email protected]

Brain penetration is one of the major parameters taken into consideration in chemical toxicological studies, in the development of new neurotherapeutic drugs and in more effective treatment of brain diseases. The blood-brain barrier (BBB) separates the brain from the blood stream and limits the transport of substances from the systemic circulation into the brain tissue.

BBB permeability is often expressed as the BBB permeability-surface area product (PS, quantified as logPS). PS represents the uptake clearance across the BBB. LogPS can be determined by in vivo intravenous administration, indicator diffusion, brain uptake index techniques, and in situ brain perfusion.

At present, QSAR/QSPR methods are widely used for estimating the permeability of substances through the BBB. The known models are inherently additive and do not take into account the mutual influence of atoms in a molecule. Structurally or functionally homogeneous datasets of compounds were used for the development of these models, which limits their predictive ability and applicability. Thus, the development of models with satisfactory predictive ability for structurally diverse datasets is a timely task. An even more important task is the interpretation of the obtained QSAR models in terms of structural features and atomic properties. This information can be used for drug design and fast filtering of undesirable compounds.

The goal of this work was to build QSAR models using the simplex representation of molecular structure [1] and to analyze the influence of various structural factors on the penetration of substances through the BBB. The object of the study was a dataset of 182 compounds belonging to different classes of organic compounds with measured logPS values. QSAR models were developed using the Random Forest [2] statistical method.

1. Kuz’min V.E. et al. Virtual screening and molecular design based on hierarchical QSAR technology. In Recent Advances in QSAR Studies, Eds. T. Puzyn, J. Leszczynski, M. Cronin, Springer, London, 2010: 127–176, 422 p.
2. Breiman L. Machine Learning, 2001, 45: 5-32.


A. I. Lin1,2 T. I. Madzhidov1 I. Antipin1 O. Klimchuk2 A. Varnek2

TOWARD AN EXPERT SYSTEM FOR ASSESSMENT OF OPTIMAL CONDITIONS FOR SELECTIVE DEPROTECTION REACTIONS

1 Kazan Federal University, Kazan, Russia 2 Université de Strasbourg, Strasbourg, France [email protected]

Protection/deprotection reactions play an important role in synthetic organic chemistry. A key problem is to choose optimal experimental conditions (catalyst, solvent, additives, etc.) leading to selective deprotection of a given group in a particular environment. Up to now, chemists have used for this purpose the reactivity charts from Greene’s well-known book [1], which has become a recognized guide in protection/deprotection chemistry. However, these reactivity charts resulted from manual analysis of a relatively small amount of data and may therefore miss important information hidden in large reaction databases. In this presentation we report a statistical analysis of a large dataset of hydrogenation reactions (142,111 reactions) extracted from the Reaxys database. For this purpose, we built a workflow involving numerous in-house tools for reaction data processing based on the Condensed Graph of Reaction (CGR) approach [2]. Raw reaction data were curated, normalized and annotated, thus forming a well-structured database. Its analysis clearly shows some disagreements with Greene’s reactivity charts with respect to (i) the reactivity of particular protective groups and (ii) the selectivity of deprotection of a given group in the presence of other groups or chemical functions.

We have developed a prototype of an expert system able to provide chemists with detailed recommendations of experimental conditions leading to desired chemical transformations. This tool applies CGR-based similarity searching to the reaction database obtained from raw data processing and could easily be implemented in any database system.

1. Peter G. M. Wuts, Theodora W. Greene. Greene's Protective Groups in Organic Synthesis, Edition 4, Wiley, 2006.

2. Varnek A., Fourches D., Hoonakker F., Solov’ev V.P. J. Comput. Aided Mol. Des., 2005, 19: 693–703.

The research was supported by the Ministry of Education and Science of the Tatarstan Republic and the French embassy in the Russian Federation. This work was performed in the framework of Russian Scientific Foundation grant, agreement No. 14-43-00024, signed 1.10.2014.


E. Mokshyna P. Polishchuk V. Nedostup V. Kuz’min

ATOM-PAIR DESCRIPTORS, TEMPERATURE-DEPENDENT PROPERTY AND BINARY MIXTURES: QSPR FOR SECOND VIRIAL CROSS-COEFFICIENTS

Department of Molecular Structure and Chemoinformatics, A.V. Bogatsky Physico-Chemical Institute NAS of Ukraine, 86 Lustdorfskaya doroga, 65080 Odessa, Ukraine [email protected]

Among all equations of state, only one has a rigorous theoretical foundation: the virial equation of state, in which the compressibility factor (Z) is expressed as an infinite series expansion in density (reciprocal molar volume) [1]:

$$Z = \frac{pV_m}{RT} = 1 + \frac{B_m}{V_m} + \frac{C_m}{V_m^2} + \cdots \qquad (1)$$

where $p$ is the pressure, $V_m$ is the molar volume, $T$ is the absolute temperature, and $R$ is the gas constant. For mixtures, the virial coefficients depend on temperature $T$ and mixture composition. The second virial coefficient describes the departure from ideality due to pair interactions. The second virial cross-coefficient $B_{12}$ in equation (2) is unique in that it relates only to interactions of molecules of compound 1 with molecules of compound 2.

$$B_m = x_1^2 B_{11} + 2 x_1 x_2 B_{12} + x_2^2 B_{22} \qquad (2)$$

where $x_i$ is the mole fraction of compound i, $B_m$ is the virial coefficient of the mixture as a whole, and $B_{ii}$ is the virial coefficient of pure compound i.

Simplex representation [2] was used to describe molecular structure. To describe interactions between two molecules, pair-potential-based descriptors were used. For all possible atom pairs between the two molecules (see the scheme), Lennard-Jones constants are calculated using UFF (universal force field) values [3]. Depending on the absolute value of the calculated constant, each two-atom fragment is labeled as strong-interacting, medium-interacting, weak-interacting, etc.
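As an illustration of equations (1) and (2), the following minimal Python sketch evaluates the mixture coefficient and the compressibility factor truncated after the third virial term. The function names and all numerical inputs are hypothetical and chosen only for illustration; they are not data from this work.

```python
def mixture_second_virial(x1, b11, b12, b22):
    """Equation (2): B_m of a binary mixture, with x2 = 1 - x1."""
    if not 0.0 <= x1 <= 1.0:
        raise ValueError("mole fraction must lie in [0, 1]")
    x2 = 1.0 - x1
    return x1 * x1 * b11 + 2.0 * x1 * x2 * b12 + x2 * x2 * b22


def compressibility_factor(b_m, v_m, c_m=0.0):
    """Equation (1) truncated after the third virial term."""
    return 1.0 + b_m / v_m + c_m / v_m ** 2


# Hypothetical inputs in cm^3/mol (illustrative values, not measured data).
b_m = mixture_second_virial(0.5, -100.0, -60.0, -40.0)
z = compressibility_factor(b_m, v_m=24000.0)
```

Only the middle term of equation (2) involves the cross-coefficient, which is why an equimolar mixture gives it the largest weight.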

Temperature-dependent data are always a challenge for QSAR/QSPR and require a careful and thoughtful approach. In the case of virial coefficients, the temperature ranges are all unequal and frequently do not even intersect. The naïve statistical approach of including temperature as a single descriptor gave quite poor results. We therefore suggested a two-layer approach that relies on a physics-based methodology. It uses equation (3), derived from the Van der Waals equation of state for real gases [4]. The procedure can be described by equations (4)-(6).

$$B_{12} = b - \exp\left(\frac{a}{RT}\right) \qquad (3)$$

$$B_{12} = f(a, b) \qquad (4)$$

$$a = f(D_1, D_2, \ldots, D_n) \qquad (5)$$

$$b = f(D_1, D_2, \ldots, D_m) \qquad (6)$$

where a, b are the coefficients of the Van der Waals equation, D are descriptors, and T is the temperature.

The QSPR models obtained were robust, with Q² = 0.75-0.85, and showed a prediction error comparable with the experimental error of the data. As expected, both the experimental error and the prediction error grow from mixtures of non-polar compounds (about 80 cm³/mol) towards mixtures of polar compounds (about 220 cm³/mol). Despite the simplicity of the equations and of the descriptors used, the models are not only accurate but also interpretable in simple physico-chemical terms. Interpretation of the consensus model shows the prevailing importance of electrostatic factors and a lesser importance of van der Waals interactions, with electronic polarizability in third place.
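The two-layer scheme of equations (3)-(6) can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not specify the functional form of f in (5) and (6), so a linear model is used here as a hypothetical stand-in, and all weights and descriptor values are placeholders. Equation (3) is used as printed.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def linear_model(weights, bias, descriptors):
    """First layer: a hypothetical linear stand-in for f in (5) and (6)."""
    return bias + sum(w * d for w, d in zip(weights, descriptors))


def b12(a, b, temperature):
    """Second layer: equation (3) as printed, B12 = b - exp(a / (R*T))."""
    return b - math.exp(a / (R * temperature))


# Hypothetical descriptor vector and model parameters (not from the abstract).
descriptors = [1.0, 2.0]
a = linear_model([1000.0, 2000.0], 0.0, descriptors)  # eq. (5)
b = linear_model([10.0, 5.0], 30.0, descriptors)      # eq. (6)

# One (a, b) pair per mixture gives B12 at any temperature.
curve = {t: b12(a, b, t) for t in (300.0, 400.0, 500.0)}
```

The point of the design is that temperature never enters the descriptor-based models directly: the first layer predicts temperature-independent coefficients, and the physical equation supplies the temperature dependence.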

1. Dymond J. H., Marsh K., et al. Virial coefficients of pure gases and mixtures. Springer, Heidelberg, Germany, 2002, Vol. 4.

2. Kuz'min V. E., Artemenko A. G., et al. Journal of Computer-Aided Molecular Design, 2008, 22: 403-21.

3. Rappe A. K., Casewit C. J., et al. J. Am. Chem. Soc., 1992, 114: 10024-10035.

4. Kaye & Laby Online Tables of Physical & Chemical Constants, Version 1.1. www.kayelaby.npl.co.uk, 2008.


R. I. Nugmanov1 G. R. Sabirova1 T. I. Madzhidov1 A. A. Varnek1,2

DESCRIPTORS OF COUNTER-ION EFFECT IN BIMOLECULAR NUCLEOPHILIC SUBSTITUTION REACTIONS

1 Laboratory of chemoinformatics and molecular modeling, A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya Str., 18, Kazan, Russia; 2 Chemoinformatics Laboratory, University of Strasbourg, B. Pascal Str., 4, Strasbourg, France; [email protected]

Bimolecular nucleophilic substitution reactions are among the most prevalent in modern organic chemistry. Neutral and/or ionic substrates may be involved in these reactions. Modeling reactions of neutral substrates is fairly straightforward [1]. In reactions involving counterions, however, the reaction rates depend strongly on the nature of the counterion.

In this work we have developed quantum chemical descriptors of counterions that describe the electrostatic and non-electrostatic energy of solvation in polar and non-polar solvents, as well as the distribution of the electrostatic potential over the surface and volume of the ion [2].

We compared the calculated descriptors with indicator variables and with a null model, in which information about the counterion was dropped, on a set of 1600 reactions including neutral and ionic substrates.

On a mixed set including both neutral and ionic substrates, the model with the developed descriptors performed better than the model with indicator variables, which in turn performed better than the null model. On a set including only ionic substrates, however, the model with the new descriptors performed the same as the model with indicator variables.
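The three counterion representations compared above can be illustrated with a minimal Python sketch. The counterion vocabulary and the numerical descriptor values are hypothetical placeholders; they are not the descriptors or data used in this study.

```python
COUNTERIONS = ["none", "Na+", "K+", "Li+"]  # hypothetical vocabulary


def null_features(counterion):
    """Null model: information about the counterion is dropped."""
    return []


def indicator_features(counterion):
    """Indicator variables: one 0/1 flag per known counterion."""
    return [1 if counterion == name else 0 for name in COUNTERIONS]


# Placeholder numbers standing in for computed quantum chemical descriptors
# (e.g. solvation energies); they are NOT values from the study.
DESCRIPTOR_TABLE = {
    "none": [0.0, 0.0],
    "Na+": [-1.0, 0.5],
    "K+": [-0.8, 0.4],
    "Li+": [-1.3, 0.7],
}


def descriptor_features(counterion):
    """Continuous descriptors: similar ions get similar feature vectors."""
    return list(DESCRIPTOR_TABLE[counterion])
```

Unlike indicator variables, continuous descriptors can in principle be computed for a counterion that is absent from the training set.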

We found that the descriptors of solvation energy make the most significant contribution to the model.

1. Madzhidov T.I., Polishchuk P.G., Nugmanov R.I., Bodrov A. V., Lin A.I., Baskin I.I. et al. Russ. J. Org. Chem., 2014, 50(4): 459–463.

2. Nugmanov R.I., Madzhidov T.I., Khaliullina G.R., Baskin I.I., Antipin I.S., Varnek A.A. J. Struct. Chem., 2014, 55(6): 1026–1032.

This work was supported by the Russian Scientific Foundation grant, agreement No. 14-43-00024, signed 1.10.2014.


I. Piyanzina1,2 B. Minisini2 D. Tayurskii1

DENSITY FUNCTIONAL THEORY CHARACTERISATION OF AZOBENZENE DERIVATIVES

1 Laboratory for Computer Design of New Materials, Institute of Physics, Kazan Federal University, Kremlyovskaya Str., 18, Kazan, Russia; 2 Institut Supérieur des Matériaux et Mécaniques Avancés du Mans, Avenue Bartholdi, 44, Le Mans, France. [email protected]

Since 1937, when Hartley published his work [1] on the cis isomerization of azobenzene, this photochemical phenomenon has been widely studied. The idea behind it is the photoisomerization of azobenzene under irradiation, whereby non-polar trans-azobenzene is converted into polar cis-azobenzene. Owing to their relatively simple molecular structure and unique characteristics, azobenzene and its derivatives have been investigated in various studies as photoswitchable substances [2]. Moreover, azobenzene derivatives have vivid colors, which has led to their use as dyes. Owing to the formation of the polar cis-isomer, the contact angle can be significantly decreased. This feature can be used to control the wettability of a surface [2]. Furthermore, changes in dipole moment cause changes in the surface potential, which can be used to create a surface with controlled motion and net mass transport [2].

The physical and chemical properties of azobenzene derivatives depend both on the molecular groups used as ring substituents and on the stability of the trans/cis configurations. In the present study, we have analyzed the effect of structural diversity on electronic properties. We have selected 10 molecules with different types of activating groups, with azobenzene as the parent compound:

1. Azobenzene derivatives of the aminoazobenzene type, i.e., azobenzene substituted with an electron-donating group: CH3, C6H5, NH2, SO2-NH2, N-(CH3)2;

2. Azobenzene derivatives substituted with an electron-acceptor group: OH, NO2, CH2-CH2-OH;

3. Azobenzene derivatives of the pseudostilbene type, i.e., substituted with an electron acceptor at the para-position of one phenyl ring and an electron-donating group at the para-position of the other phenyl ring. This type is also known as push-pull azobenzene.

For this purpose we performed density functional calculations with the GAUSSIAN03 program [3]. All properties were obtained using the B3LYP functional and the 6-31++G(d,p) basis set, which were selected as the most appropriate in terms of time, accuracy, and computational cost for determining structural, electronic, molecular and vibrational properties [4]. For each molecule we examined the trans and cis forms and all possible configurations with respect to the spatial position of the functional group. More details can be found in our previous works [4-6].

A planar trans structure was obtained for all the molecules considered. In both configurations, an asymmetry in structural parameters was obtained for all molecules. The trans configurations were found to be more stable than the cis ones. The difference in dipole moment between the trans and cis configurations was found to be lower than that of azobenzene for all the molecules considered, except for the molecule with N-(CH3)2 and NH2 groups, for which the difference was 4.7 Debye. The largest polarizabilities were also obtained for this molecule. As regards molecular properties, the highest reactivities were likewise found for this molecule and for the molecule with NH2 and NO2 groups.

1. Hartley G. Nature, 1937, 140: 281.

2. Halabieh El. et al. Pure Appl. Chem., 2004, 76: 14453.

3. Frisch M. J. et al. Gaussian 03, Revision C.02. Gaussian, Inc., Wallingford, CT, 2004.

4. Minisini B. et al. J. Mol. Model., 2007, 13: 1227-1235.

5. Piyanzina I. et al. J. Mol. Model., 2015, 21: 34.

6. Minisini B. et al. J. Str. Chem., 2014, 5: 843-851.

This work was performed according to the Russian Government Program of Competitive Growth of Kazan Federal University.


D. Podshivalov1,2 V. Timofeev2,3 I. Kuranova2,3

COMPUTER-AIDED SEARCH OF NEW INHIBITORS OF PHOSPHOPANTETHEINE ADENYLYLTRANSFERASE FROM MYCOBACTERIUM TUBERCULOSIS

1 Faculty of Physics M.V. Lomonosov Moscow State University Leninskie Gory, Moscow 119991 Russia; 2 Laboratory of X-ray Analysis Methods and Synchrotron Radiation, Shubnikov Institute of Crystallography, Russian Academy of Sciences, Leninsky Prospect 59, Moscow 119333, Russian Federation;

3 NBICS Center, National Research Centre "Kurchatov Institute", Akad. Kurchatova sqr, 1, Moscow, Moscow region, 123182, Russian Federation [email protected]

Phosphopantetheine adenylyltransferase from Mycobacterium tuberculosis (PPAT Mt) catalyzes the penultimate step of the biosynthesis of coenzyme A (CoA): the reversible transfer of an adenylyl group from ATP to 4'-phosphopantetheine, resulting in the formation of 3'-dephosphocoenzyme A (dPCoA) and pyrophosphate. The reaction catalyzed by PPAT is crucial for the biosynthesis of CoA, a metabolite necessary for the survival of mycobacteria, so PPAT Mt is a suitable target for the development of anti-tuberculosis drugs.

The 3D structure of PPAT Mt was studied by X-ray analysis [1]. Here, potential inhibitors of PPAT were identified on the basis of the X-ray structure of the enzyme by virtual screening with the Mcule service [2]. The 10 best hits were chosen for the next steps. The key binding amino acid residues were determined for each ligand.

In the next step, all ligands were prepared for molecular dynamics simulation. A cycle of energy minimization and equilibration was performed for each ligand. The trajectory length of each simulation was 10 ns. As a result, the positions of the ligands in the active site of PPAT were refined and compared with the positions of the functional ligands of PPAT: ATP, 4'-phosphopantetheine and dPCoA. This information will be used to estimate the binding constants of the studied inhibitors.

The reported study was supported by the Supercomputing Center of Lomonosov Moscow State University [3].

1. Timofeev V.I., Smirnova E.A., Chupova L.A., Esipov R.S., Kuranova I.P. Crystallography Reports, 2012, 57: 96–104.

2. https://mcule.com/

3. Sadovnichy V., Tikhonravov A., Voevodin Vl., Opanasenko V. "Lomonosov": Supercomputing at Moscow State University. In Contemporary High Performance Computing: From Petascale toward Exascale (Chapman & Hall/CRC Computational Science), pp. 283-307, Boca Raton, USA, CRC Press, 2013.

This work was supported by RFBR grant 14-02-31110-mol_a


A. Prokhorov V. Tumanov

THE USE OF MAMDANI'S METHOD FOR PREDICTING THE VALUES OF THE CLASSICAL POTENTIAL BARRIER FOR RADICAL REACTIONS

Department of Computing and Information Resources, Institute of Problems of Chemical Physics, Russian Academy of Sciences, Academician Semenov avenue 1, Chernogolovka, Moscow region, Russian Federation [email protected]

Experimental determination of the rate constants and activation energies of radical reactions is a difficult task, and in many practical cases their theoretical estimation is urgently needed. Databases of rate constants of radical reactions make it possible to predict the reactivity of organic compounds in different classes of radical reactions using the methods of applied artificial intelligence.

The purpose of this paper is to use the Mamdani method [1] to identify the empirical dependence of the classical potential barrier of reactions between substituted phenyl radicals (4-CH3-C6H5•, 4-Br-C6H5•, 4-Cl-C6H5•, etc.) and hydrocarbons in the liquid phase on the basis of experimental kinetic data.

To define the parameter space (input data) for identifying the activation energy $E_e$ of radical reactions, we use the intersecting Morse curves approach, which is defined by the correlation ratio [2].

It is assumed that the classical potential barrier of activation is specified by the nonlinear function:

$$E_e = F(S_1, S_2) \qquad (1)$$

where $S_1$ is the vector of numerical variables (bond dissociation energies, force constants) and $S_2$ is the vector of linguistic variables (type of solvent, electrophilic and electron-donating properties of substituents). The classification of substituents by electrophilic and electron-donating properties is taken from [3]. The sample of 97 reactions was obtained from the database on rate constants of liquid-phase radical reactions [4].

In constructing the fuzzy knowledge base, the input and output variables in (1) are regarded as linguistic variables defined on their respective universal sets. The membership function of an element x in a fuzzy term G is chosen as:

$$\mu_G(x) = \frac{1}{1 + \left(\frac{x - b}{c}\right)^2} \qquad (2)$$

where b and c are adjustable parameters of the function.

For the experimental sample of reactions of substituted phenyl radicals with hydrocarbons, a fuzzy knowledge base was built by artificial neural network training [4]. It currently includes 63 fuzzy rules.
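A minimal Python sketch of Mamdani-style inference with the bell-shaped membership function (2) is given below for illustration. It assumes the usual min operator for combining rule premises and max for aggregating rules; the defuzzification step is simplified to a weighted average of output-term centres. The two rules, their parameters and the input values are hypothetical and are not taken from the knowledge base described here.

```python
def bell_membership(x, b, c):
    """Membership function (2): 1 / (1 + ((x - b) / c)**2)."""
    return 1.0 / (1.0 + ((x - b) / c) ** 2)


def rule_strength(inputs, terms):
    """Mamdani AND: firing strength is the minimum membership over inputs."""
    return min(bell_membership(x, b, c) for x, (b, c) in zip(inputs, terms))


def infer(inputs, rules):
    """Aggregate rules with max per output term, then defuzzify.

    Defuzzification is a weighted average of the output-term centres,
    a simplification of the centroid step chosen for brevity.
    """
    strength = {}
    for terms, centre in rules:
        w = rule_strength(inputs, terms)
        strength[centre] = max(w, strength.get(centre, 0.0))
    total = sum(strength.values())
    return sum(centre * w for centre, w in strength.items()) / total


# Two hypothetical rules: (b, c) per input, then the output-term centre.
rules = [
    ([(400.0, 20.0), (1.0, 0.5)], 40.0),
    ([(440.0, 20.0), (2.0, 0.5)], 60.0),
]
barrier = infer([420.0, 1.5], rules)
```

In this reading, b locates the maximum of a fuzzy term and c controls its width, so both can be tuned during training.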

Using the reactions between substituted phenyl radicals and hydrocarbons as an example, an attempt was made to identify the dependence of the classical potential barrier of activation of radical reactions by means of a fuzzy knowledge base built on quantitative and qualitative parameters.

1. Yager R. et al. Essentials of Fuzzy Modeling and Control. USA: John Wiley & Sons, 1994, 387 p.

2. Denisov E.T. et al. Zhurnal fizicheskoi khimii, 1994, 68(4): 719-725.

3. Henry D.J. et al. J. Phys. Chem. A, 2001, 105: 6750-6756.

4. Tumanov V. et al. Information Resources of Russia, 2010, 5: 16-21.

The reported study was partially supported by RFBR, research project No. 15-07-08645.


E. A. Sosnina D. I. Osolodkin E. V. Radchenko S. B. Sosnin V. A. Palyulin N. S. Zefirov

QSAR STUDY OF GLYCOGEN SYNTHASE KINASE 3 INHIBITORS

Division of Medicinal Chemistry and Advanced Organic Synthesis, Department of Chemistry, Lomonosov Moscow State University, Leninskie gory 1/3, Moscow 119991 Russia [email protected]

Glycogen synthase kinase 3 (GSK-3) inhibitors have been a focus of attention for about 30 years because of their therapeutic potential in the treatment of neurodegenerative disorders, type 2 diabetes, inflammatory processes, and certain types of cancer [1]. More than two thousand small-molecule GSK-3β inhibitors from different chemical classes have been discovered in previous studies, but only lithium salts, which have a low therapeutic index, are used in clinical practice. Nevertheless, this area is a topic of intense research, and several GSK-3β inhibitors are now in preclinical studies.

Thanks to numerous publications, plentiful experimental data are available on the inhibitory effect of small-molecule compounds on GSK-3β. It is therefore possible to search for and design new potential inhibitors by means of chemoinformatics methods and quantitative structure-activity relationship analysis.

In our QSAR study, artificial neural network models were built using the NASAWIN software [2]. Biological and chemical data from published articles and patents were used to construct the databases. From these data two main databases were created: the first, containing 1920 compounds, was used to build predictive regression models; the second, containing 3378 compounds, was used to generate classification models that predict whether a new chemical structure would have an inhibitory effect at 10 μM concentration. For all models, fragmental descriptors were used, selected by a stepwise multiple linear regression procedure. To prevent overfitting, the optimal number of descriptors in each case was determined on the basis of model validation, using both the data that served to derive the model (internal methods) and separate test sets (external methods).

The first dataset of 1920 compounds was divided into two parts: 10% of the compounds were randomly selected to form the external test set, while the remaining 1728 structures were used to build the regression model on 260 descriptors. Cross-validation gave good results (Q² = 0.817, RMSE = 0.517), while the accuracy of prediction on the external test set was lower (R² = 0.516, RMSE = 0.849). For more accurate prediction, all compounds were clustered according to their Tanimoto similarity. The clusters represented all classes of GSK-3 inhibitors, in particular maleimides (288 compounds), pyrazolo[1,5-b]pyridazines (91), indirubins (76), paullones (107), pyrazolo[3,4-d]pyrimidines (65), aminoindazoles (71), 1,3,4-oxadiazoles (77), etc. All subsets were divided in the same manner, and models were built and validated. The models have Q² from 0.750 to 0.895 in cross-validation and R² for the external test set from 0.622 to 0.754.
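Clustering by Tanimoto similarity can be illustrated with the following minimal Python sketch. The clustering algorithm used in this study is not specified, so a simple greedy leader clustering is shown as a stand-in; the fragment sets and the threshold are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two sets of fragment identifiers."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def leader_cluster(fingerprints, threshold):
    """Greedy leader clustering: join the first cluster whose leader is
    at least `threshold` similar, otherwise start a new cluster."""
    clusters = []  # list of (leader fingerprint, member indices)
    for i, fp in enumerate(fingerprints):
        for leader, members in clusters:
            if tanimoto(fp, leader) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((fp, [i]))
    return [members for _, members in clusters]


# Hypothetical fragment sets standing in for real structural fingerprints.
fps = [{"C=O", "N-H", "c1ccccc1"}, {"C=O", "N-H"}, {"S=O", "Cl"}]
groups = leader_cluster(fps, threshold=0.6)
```

Building a separate model per cluster, as described above, restricts each model to one chemotype, which is consistent with the improved external R² reported for the subsets.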

The second database was used to build the classification models. The optimal model contains 280 descriptors. The quality of classification was estimated by ROC analysis, which gave an AUC value of 0.98.
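For readers unfamiliar with the AUC metric, a minimal Python sketch is given below. It computes the area under the ROC curve as the fraction of active/inactive pairs that the model ranks correctly, with ties counted as one half. The labels and scores are hypothetical and unrelated to the models reported here.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random active outranks a random
    inactive; ties count as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = active at 10 uM) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from inactives.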

These results can be helpful in the development and optimization of new promising GSK-3β inhibitors. The developed QSAR models will be used for prediction of inhibitory activity of compounds belonging to certain structural classes and for identification of new inhibitory scaffolds.

1. Glycogen Synthase Kinase 3 (GSK-3) and Its Inhibitors: Drug Discovery and Development. Martinez A., Castro A., Medina M., Eds.; John Wiley & Sons, Hoboken, NJ, USA, 2006.

2. Baskin I. I., Halberstam N. M., Artemenko N. V., Palyulin V. A., Zefirov N. S. EuroQSAR 2002 – Designing drugs and crop protectants: Processes, problems and solutions, 2003: 260–263.


K. Taborskaya1 M. Petrosyan2 E. Fedorova3 K. Pats4

USE OF MATHEMATICAL MODELING TO PREDICT THE PROGESTATIONAL ACTIVITY

1Department of Biology, Saint-Petersburg State University, Universitetskaya nab. 7-9, St.-Petersburg, Russia 2 Laboratory of Pharmacology, Research Institute of Obstetrics, Gynecology and Reproductology named after D. O. Ott., Mendeleyevskaya line 3, St.-Petersburg, Russia 3,4 Saint-Petersburg State Chemical Pharmaceutical Academy, Prof. Popova St. 14, St.-Petersburg, Russia [email protected]

Computer prediction of the biological activity of compounds is a topical and rapidly developing area of drug development. This is because empirical methods for determining the biological activity of substances are inefficient in view of the large variety of chemical compounds and their targets. Today, tens of gestagen drugs of the pregnane class are used in the treatment of various diseases. Despite the variety of progesterone analogues, the development of more effective and safer drugs remains an important task. Using the accumulated data on the structure and activity of progestogens, it is possible to create a training set and then, with the help of special software, to predict the probability of activity of new compounds against the progesterone receptor. The aim of this study was to create a training set of progesterone analogues with their activities and to apply it to predict the probability of activity of new drugs with respect to the progesterone receptor.

Methods: Data on progestin activity were collected from information resources (NCBI), from the Research Institute of Obstetrics, Gynecology and Reproductology named after D. O. Ott, and from other laboratories. We used the program ISIS Base to manage our databases and the programs GUSAR and PASS for prediction [1].

Results: The full spectrum of biological activity of the 7 test compounds was obtained with the PASS software. A training set of 28 compounds with known activity on oral and subcutaneous administration was created. The prediction results show that the training set and the GUSAR program can be used to predict the progestin activity of new compounds. The predicted data may be used in planning future experimental research.

1. Filimonov D. A., Poroikov V. V. Ros. Khim. Zh. (Russian Chemical Journal), 2006, 1 (2).


R. Toichuev

ENERGY ASPECTS IN MODELING

Institute of Medical Problems, South Branch of the National Academy of Sciences of the Kyrgyz Republic, Uzgenskaya Str, 130-a, Osh, Kyrgyzstan [email protected]

Objective: To develop energy approaches in modeling, including those for the production of biologically active substances.

Materials and methods: Advanced integrated methods based on mass, energy (including signals and sounds), space and field, and the time and speed required for a process to take place, taking into account quantity and quality, charges, and the spin states of atoms and compounds.

Results: In modern science, much attention is paid to modeling structure, relationships, charge aspects, spin states and the fields formed by atoms, taking into account Van der Waals forces, the Connolly surface and the range of molecular compounds. At the same time, dark energy, which accounts for about 70% of the total energy of our universe and which is responsible not only for interactions but also for the storage of information, still takes a back seat in modeling. In modeling, energy will play a major role, first and foremost information energy (Janay, a piece of soul), which exists in water and silicon. Silicon, which is found in the DNA structure (0.26 to 0.31%) [1], is a repository of information not only in computers but also in DNA.

In modeling, little attention is paid to energy consumption, energy absorption, energy preservation, energy conductivity, energy efficiency, the energy exchange of organic molecules, and changes in the emitted energy depending on the type of compounds affecting the functional state of the connections and maintaining the laws of renovation and equilibrium [2,3]. Introducing these concepts would help to resolve not only some issues in molecular biology but also other processes related to the development of diseases. Taking into account the electron affinity and spins of atoms and the "saturation" of compounds that are components of therapeutic agents could increase the therapeutic efficacy of newly obtained compounds. One example is our research on obtaining therapeutic agents from plants growing in Kyrgyzstan for the neutralization and removal from the gastrointestinal tract of organochlorine pesticides (OCPs), which have a total neutral charge. During the research, the electron affinity of atoms and all stages of OCP movement, from the gastrointestinal tract through blood and lymph circulation and the cell to release from the body, were taken into account. After 10 days of administration of the agents obtained, the concentration of OCPs in the blood, urine and breast milk of nursing women decreased twofold. After a month of administration of the agents, only traces of OCPs were detected in the sperm of men with male infertility.

1. Berezov T.T., Korovkin B.F. Biological chemistry: 3rd ed., Stereotypical. M.: Publisher: “Medicine”, 2007: 704.

2. Toichuev R.M. Energy aspects of tumor development and mutagenesis (manuscript). Certificate No. 1653, State Patent Service of the Kyrgyz Republic, April 15, 2011.

3. Toichuev R.M. Scientific and Technical Journal "Achievements in modern science", 2013, 4: 37-41.


B. Viira A. T. Garcia-Sosa U. Maran

HUNTING PROMISING STRUCTURE MOTIFS FOR HIV1 RT FROM PUBLICLY AVAILABLE DATA

Institute of Chemistry, University of Tartu, Ravila 14A, 50411, Tartu, Estonia [email protected]

HIV is one of the deadliest infectious diseases in the world. Globally, an estimated 35.3 million people were living with HIV in 2012 [1]. The growing interest in HIV1 reverse transcriptase (RT) inhibitors and drug-resistant mutations over the past years has led to an increasing amount of data regarding chemical and the corresponding biological activity space. This data allows chemoinformatics to understand the structural patterns of known active and inactive chemicals and their chemotypes.

All data on HIV1 RT inhibitors were extracted from the ChEMBL database. After extensive curation, the final dataset consisted of compounds with reported Kd and Ki values, which were measured against the wild type and 13 different HIV1 RT mutants. The curated data were analyzed using a hierarchical classification of common core structures based on rules that take synthetic and medicinal chemistry rationales into account, resulting in a scaffold tree.

To date, 13 approved drugs for HIV1 RT are in use: abacavir, delavirdine, didanosine, efavirenz, emtricitabine, etravirine, lamivudine, nevirapine, rilpivirine, stavudine, tenofovir disoproxil, zalcitabine and zidovudine. A scaffold tree was also constructed for the approved drugs in order to analyze their structures and compare them with the previous dataset. Of the 13 molecules, 12 contain N-heterocycles and 1 contains an N,O-heterocycle as the “parent” structure. The scaffold tree of the approved drugs coincides with the scaffold tree of the above-mentioned data. Analyzing the virtual scaffolds may reveal “holes” that are not covered by the compounds in the database and that are promising starting points for further investigation.

1. Global Report, UNAIDS report on the global aids epidemic 2013. UNAIDS, November 2013. Page 4.

Acknowledgement: Estonian Research Council (grants IUT34-14)


Balakin K.V.1, 2, 3 Lapushkin G.I.2 Savilova A.G.2 Voronkov A.2

PREDICTION OF THE CLINICAL ADVERSE EFFECTS OF SEROTONIN AND NOREPINEPHRINE REUPTAKE INHIBITORS BASED ON ANALYSIS OF THEIR MULTITARGET ACTIVITY PROFILE

1 Institute of physiologically active compounds, Russian Academy of Sciences, Chernogolovka, Moscow region; 2 Moscow Institute of physics and technology (State University), Dolgoprudnyi, Moscow region. [email protected]

Early identification and control of adverse drug effects (AEs) is an important and very difficult issue in modern drug discovery and development. It is therefore very tempting to develop a methodology for quantitatively estimating relationships between drug side effects and molecular properties such as structural features, pharmacokinetics, absorption, distribution, metabolism and excretion (ADME), as well as the physico-chemical profile and the influence on specific targets. In the current work we set out to evaluate the possibility of predicting the side effects of drugs from the SNRI group on the basis of their target specificity profile. This task is also relevant to the development of new antidepressants from the SNRI group, especially at the clinical trial stage, where it can potentially minimize risks for patients.

In this study, we suggest a new approach to evaluating the potential adverse effect profile of a series of antidepressants, namely serotonin and norepinephrine reuptake inhibitors (SNRIs), based on analysis of their in vitro multitarget activity data. Our approach is based on computational data mining methods. In the first step, we analyzed the adverse effects of 10 SNRIs using data extracted from the FDA AERS database. We also collected data on their in vitro multitarget activity from the ChEMBL database. Calculation of pairwise similarity coefficients allowed us to obtain quantitative measures of the (dis)similarities of the adverse effect and multitarget activity profiles of the 10 SNRIs. Our work revealed a clear correspondence between the two hierarchies, thus demonstrating the dependence of the adverse effects of the SNRIs on their multitarget activity profile. The results of this work pave the way for finding more general dependencies connecting the spaces of adverse effects and multitarget activity of drugs.
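The idea of comparing pairwise similarities in two profile spaces can be sketched in Python as follows. This is a simplified stand-in, not the procedure used in this work: it uses cosine similarity and a Pearson correlation between the two sets of pairwise similarities instead of comparing hierarchies, and the drug names and profile values are hypothetical rather than FDA AERS or ChEMBL data.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length numeric profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def pairwise(profiles):
    """Similarity for every unordered pair of drugs, in a fixed order."""
    names = sorted(profiles)
    return [cosine(profiles[a], profiles[b]) for a, b in combinations(names, 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical profiles for three placeholder drugs.
adverse = {"drug_a": [5, 1, 0], "drug_b": [4, 2, 0], "drug_c": [0, 1, 6]}
targets = {"drug_a": [9, 7, 1], "drug_b": [8, 6, 2], "drug_c": [1, 2, 9]}

agreement = pearson(pairwise(adverse), pairwise(targets))
```

A value of `agreement` close to 1 would indicate that drugs with similar target profiles also tend to have similar adverse effect profiles.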


L.S. Gredyagina1 N.I. Baranova1 E.V. Fedorova2

THE CORRELATION BETWEEN LOGP OF LIGANDS AND TOXICITY OF THEIR COMPLEXES WITH VANADIUM

1 Laboratory of Pharmacological Research, Saint Petersburg State Chemical Pharmaceutical Academy, Professora Popova str. 14, Saint Petersburg, Russia; 2 Research and Development Department, Saint Petersburg State Chemical Pharmaceutical Academy, Professora Popova str. 14, Saint Petersburg, Russia; [email protected]

There are several approaches to studying the relationships between the structure of substances and their pharmacological and chemical properties (LD50, IC50, solubility, melting point, etc.). However, most contemporary QSAR methods were developed for molecular graphs of organic molecules with only covalent bonds. Applying QSAR methods to complex compounds can be difficult because of their unique structure: the bonds inside a ligand are covalent, whereas the bonds between the ligand and the metal atom are coordinate bonds; in addition, the polarity and strength of these bonds may vary greatly. This makes it difficult to carry out directed synthesis and in silico screening of biologically active complex compounds [1].

We therefore attempted to establish relationships between descriptors of ligands and properties of complexes on a small sample and to create a QSAR model for predicting the properties of coordination compounds. We selected 15 complexes of vanadium in the oxidation state +4 (classical Werner complexes) from different toxicity groups. A physico-chemical parameter, the partition coefficient, was selected as the descriptor. The dependence between the logP of the ligands and the logLD50 of the complexes was plotted. This dependence is described by a straight line (y = 0.2001x + 2.3593) with a Pearson linear correlation coefficient of 0.728 (p < 0.01).
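The reported straight line can be applied as in the following minimal Python sketch. The slope and intercept are the values given above; the function names and the data points passed to the least-squares helper are hypothetical. Since the base of the logarithm and the units are not stated here, the sketch returns logLD50 and does not convert it back to LD50.

```python
SLOPE = 0.2001      # from the reported line y = 0.2001x + 2.3593
INTERCEPT = 2.3593


def predict_log_ld50(log_p):
    """Predicted logLD50 of a complex from the logP of its ligand."""
    return SLOPE * log_p + INTERCEPT


def fit_line(xs, ys):
    """Ordinary least-squares line through (x, y) points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical points lying exactly on a line, to show how a fit is obtained.
slope, intercept = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])

predicted = [predict_log_ld50(x) for x in (0.8, 3.8)]
```

The two logP values in the last line are the limits of the range covered by the external validation set described in the abstract.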

11 novel complexes of vanadium (IV) were used for the external validation of the model; logP value varied from 0.8 to 3.8, and the LD50 varied from 950 to 1500 mg/kg. Acute toxicity was studied by the method of Prozorovskiy on albino male mice [2]. When comparing the predicted and experimentally determined LD50 was carried out, prognosis accuracy was more than 75%, which indicates that the model is stable and suitable for predicting the toxicity of complex compounds of vanadium (IV).

Thus, ligand descriptors may be used to predict the properties of the whole complex. This may help to solve a fundamental problem of cheminformatics, namely saving synthetic resources and time through virtual screening that filters out potentially low-activity and highly toxic molecules, including coordination compounds, despite their structural features.

1. Fedorova E.V., Buryakina A.V., Zakharov A.V., Filimonov D.A., Lagunin A.A. Design, Synthesis and Pharmacological Evaluation of Novel Vanadium-Containing Complexes as Antidiabetic Agents. PLoS ONE, 2014, 9(7): e100386. doi: 10.1371/journal.pone.0100386
2. Prozorovsky V.B. Psychopharmacology and Biological Narcology, 2007, 7(3-4): 2090-2120.

Funded by a grant from Committee on Science and Higher Education of the Government of St. Petersburg


V.R. Khayrullina1 A.Ja. Gerchikov1 M. N. Vasiliew1 T.R. Nasrtdinova1 R.F. Nasjirova1 A.S. Zigangirov1 F.S. Zarudiy2 I.A. Taipov3

QSAR MODELING OF ANTITUMOR ACTIVITY OF QUINAZOLINE DERIVATIVES USING GUSAR

1 Department of Chemistry, Bashkir State University, Z. Validi Str., 32, Ufa, Russia; 2 Department of Pharmacology № 1, Bashkir State Medical University, Lenin Str. 3, Ufa, Russia; 3 "Branch in Ufa Scientific Production Association Microgen Immunopreparat", Novorossijiskaja Str. 105, Ufa, Russia; [email protected]

A quantitative analysis of the relationships between the structures of different classes of biologically active compounds and the efficiency with which they inhibit the catalytic activity of thymidylate synthase of inbred albino mice was performed with the program GUSAR (General Unrestricted Structure Activity Relationships) [1-2]. Biological data from ChEMBL [3] were used to create the QSAR models. In total, 6 statistically significant QSAR models ($R^2_{\text{train set}} > 0.6$, $R^2_{\text{test set}} > 0.5$, $Q^2 > 0.6$) for predicting IC50 values of various quinazoline derivatives against thymidylate synthase of outbred white mice and rats were created based on MNA and QNA descriptors, as well as on a consensus of their combinations. The characteristics of the models are shown in Table 1. Training set TrS2 and test set TS included 22 and 11 structures of thymidylate synthase inhibitors, respectively. They were obtained by sorting the compounds of TrS1 in ascending order of IC50 and splitting them in a 2:1 ratio, i.e. every third compound was moved from TrS1 to TS. These models can be used for quantitative prediction of the activity of potential antitumor drugs against thymidylate synthase. Atoms and structural fragments that increase or decrease thymidylate synthase inhibition were identified by GUSAR visualization of the quantitative structure-activity relationships in the created models. The results of this structural analysis of the contributions of different functional groups to thymidylate synthase inhibition can be taken into account in the molecular design of the active substances of known anticancer drugs in order to enhance their inhibitory action on thymidylate synthase.
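The 2:1 division described above can be sketched as follows. This is an illustration only: it assumes that "every third compound" means the 3rd, 6th, 9th, ... entries of the list sorted by ascending IC50, and the function name and the placeholder data are hypothetical.

```python
from typing import List, Tuple

Compound = Tuple[str, float]  # (identifier, IC50)


def split_every_third(compounds: List[Compound]) -> Tuple[List[Compound], List[Compound]]:
    """Sort compounds by ascending IC50 and move every third one to the test set."""
    ordered = sorted(compounds, key=lambda item: item[1])
    train: List[Compound] = []
    test: List[Compound] = []
    for position, compound in enumerate(ordered, start=1):
        # Assumption: positions 3, 6, 9, ... of the sorted list form the test set.
        if position % 3 == 0:
            test.append(compound)
        else:
            train.append(compound)
    return train, test


if __name__ == "__main__":
    # Placeholder data: 33 made-up compounds, NOT the ChEMBL records from the abstract.
    demo = [(f"cpd{i:02d}", float(i)) for i in range(33, 0, -1)]
    train_set, test_set = split_every_third(demo)
    print(len(train_set), len(test_set))
```

Sorting before splitting keeps both subsets spread over the whole activity range, which is presumably why the split was done on the ordered list.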

Table 1. Characteristics and prediction accuracy of IC50 values for consensus models M1 - M6. pIC50 activity in TrS1 and TrS2 lies in the range 5-9.

Training set   Model   N    R2TrS   R2TS    F        S.D.    Q2      V
QSAR models based on MNA descriptors
TrS1           M1      33   0.900   -       22.215   0.207   0.832   6
TrS2           M2      22   0.747   0.528   7.211    0.320   0.587   4
QSAR models based on QNA descriptors
TrS1           M3      33   0.876   -       21.836   0.220   0.791   6
TrS2           M4      22   0.829   0.635   9.461    0.273   0.702   5
QSAR models based on MNA and QNA descriptors
TrS1           M5      33   0.897   -       23.943   0.207   0.828   6
TrS2           M6      22   0.872   0.725   12.091   0.242   0.761   5

N – number of structures in the training set; R2TrS – multiple coefficient of determination calculated for compounds from the training set; R2TS – multiple coefficient of determination calculated for compounds from the test set; Q2 – cross-validated R2 calculated by the leave-one-out cross-validation procedure on the training set; F – Fisher's coefficient; S.D. – standard deviation; V – number of variables in the final regression equation.

1. Filimonov D.A. et al. SAR and QSAR in Environmental Research, 2009. 20 (7–8): 679–709. 2. Masanda V.H. et al. Der Pharma Chemica, 2011, 3 (4): 517–525. 3. ChEMBL: https://www.ebi.ac.uk/chembl/.

This work was supported by RFBR grant 14-04-97035 and by project No. 4.299.2014/K, carried out as part of the state assignment for scientific activity in the field of education and science of the Russian Federation.


Khayrullina V.R.1 Gerchikov A.Ja.1 Nasjirova R.F.1 Zigangirov A.S.1 Nasrtdinova T.R.1 Zarudiy F.S.2

CORRELATION ANALYSIS OF CYCLOOXYGENASE-2 INHIBITORS IN A SERIES OF 1,2-DIPHENYLIMIDAZOLE DERIVATIVES

1 Department of Chemistry, Bashkir State University, Z. Validi Str., 32, Ufa, Russia; 2 Department of Pharmacology № 1, Bashkir State Medical University, Lenin Str. 3, Ufa, Russia; [email protected]

The aim of this work was to study how the IC50 of COX-2 inhibitors of general formulas I and II depends on the nature and physicochemical characteristics of the para-substituents on their benzene fragments.

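[Figure: general formulas I and II of the 1,2-diphenylimidazole derivatives, with rings A and B and substituents R1 and X; formula I carries an SO2Me group and formula II an SO2NH2 group.]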

Experimental biological data for these compounds were taken from [1]. Table 1 shows the results of linearizing the experimental data [1] in the coordinates of one-parameter equations [2]. These data suggest that introducing electron-withdrawing substituents into the para-position of the ring increases the activity of COX-2 inhibitors of general formula I. The IC50 values of these compounds were also found to correlate with the hydrophobicity parameter π: in this series, lipophilic functional groups with π > 0 enhance COX-2 inhibition. Our results can be used in the molecular design of new highly active COX-2 inhibitors.

Table 1. Linearization of the experimental data in the coordinates of equations (1) - (8).

№   Equation                                                                      S.D.     N    R       F        P (α = 0.05)
Compounds of the general formula I
1   $\mathrm{pIC}_{50(i)} = (6.82 \pm 0.075) + (1.020 \pm 0.170)\,\sigma_p$      0.1811   8    0.926   36.09    < 0.05
2   $\mathrm{pIC}_{50(i)} = (6.950 \pm 0.077) + (0.559 \pm 0.0811)\,\sigma_p^{+}$   0.1606   8    0.942   47.45    < 0.05
3   $\mathrm{pIC}_{50(i)} = (7.053 \pm 0.117) + (1.425 \pm 0.280)\,\sigma_R$     0.208    8    0.901   25.90    < 0.05
4   $\mathrm{pIC}_{50(i)} = (6.432 \pm 0.107) + (0.740 \pm 0.160)\,\pi$          0.322    9    0.870   21.76    < 0.05
Compounds of the general formula II
5   $\mathrm{pIC}_{50(i)} = (7.390 \pm 0.026) + (1.400 \pm 0.080)\,\sigma_I$     0.380    4    0.996   248.80   < 0.05
Pooled sample comprising compounds of formulas I-II
6   $\mathrm{pIC}_{50(i)} = (7.170 \pm 0.140) + (1.480 \pm 0.380)\,\sigma_p$     0.446    12   0.778   15.306   < 0.05
7   $\mathrm{pIC}_{50(i)} = (7.330 \pm 0.160) + (0.810 \pm 0.25)\,\sigma_p^{+}$   0.442    12   0.72    9.70     < 0.05
8   $\mathrm{pIC}_{50(i)} = (6.722 \pm 0.170) + (0.906 \pm 0.270)\,\pi$          0.590    13   0.712   11.32    < 0.05

N – number of structures in the training set; R – correlation coefficient; F – Fisher's coefficient; S.D. – standard deviation; P – probability value assessed at the significance level α = 0.05.

1. Khanna I.K. et al. J. Med. Chem., 1997, 40: 1634-1647.
2. Hansch C. et al. Chem. Rev., 1991, 91: 165-195.
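The statistics in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below is not from the authors; it assumes that R is the correlation coefficient of a one-parameter linear regression, in which case F follows from R and N by the standard relation F = R²(N − 2)/(1 − R²), and the reported F values appear consistent with that reading. The function names are hypothetical, and the central estimate of equation (1), pIC50 = 6.82 + 1.020·σp, is used as the example.

```python
def f_statistic(r: float, n: int, k: int = 1) -> float:
    """F = (R^2 / k) / ((1 - R^2) / (n - k - 1)) for a regression with k predictors."""
    r_squared = r * r
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


def pic50_equation_1(sigma_p: float) -> float:
    """Central estimate of equation (1): pIC50 = 6.82 + 1.020 * sigma_p."""
    return 6.82 + 1.020 * sigma_p


if __name__ == "__main__":
    # Equation (1): R = 0.926, N = 8; the table reports F = 36.09.
    print(f"F from R and N: {f_statistic(0.926, 8):.2f}")
    # A substituent with sigma_p = 0 (hydrogen, by definition of the scale)
    # gives the intercept.
    print(f"pIC50 at sigma_p = 0: {pic50_equation_1(0.0):.2f}")
```

The positive slope in equation (1) is what supports the conclusion that electron-withdrawing para-substituents (σp > 0) raise pIC50.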

This work was supported by RFBR grant 14-04-97035 and by project No. 4.299.2014/K, carried out as part of the state assignment for scientific activity in the field of education and science of the Russian Federation.


M. Kulinsky P. Polishchuk V. Kuz’min

CLASSIFICATION QSAR MODELS OF PHARMACOKINETIC PROPERTIES OF DRUG SUBSTANCES AND THEIR STRUCTURAL INTERPRETATION

A.V. Bogatsky Physico-Chemical Institute of the NAS of Ukraine, Lustdorfskaya doroga, 86, 65080 Odessa, Ukraine. [email protected]

In the present study we performed a QSAR analysis of the elimination half-life, total body clearance and volume of distribution at steady state of drug substances. The elimination half-life of a drug is the time during which its concentration in the body is reduced by 50%. Total body clearance is the volume of plasma cleared of the drug per unit time and describes how quickly drugs are eliminated, metabolized or distributed throughout the body. The volume of distribution at steady state represents the actual blood and tissue volume into which a drug is distributed and the relative binding of the drug to protein in these spaces.

Data for studied pharmacokinetic properties was collected from literature [1]. The modeling dataset included 551 structurally diverse drug substances of various classes and mechanisms of action. All compounds from the dataset were divided into two classes.

In this study, classification QSAR models based on 2D simplex descriptors were developed using the gradient boosting machine (GBM), support vector machine (SVM) and Random Forest (RF) statistical approaches. The models were developed with the software tool SIRMS-QSAR [2]. Five-fold cross-validation was carried out for all models.

For each pharmacokinetic property we obtained classification QSAR models with an accuracy of around 0.70, balanced values of sensitivity and specificity, and a fair Cohen's kappa coefficient. In the last step, structural interpretation of the obtained QSAR models was performed, which allowed us to identify the fragments that affect the investigated ADME properties the most.
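For reference, the quality measures named above can all be computed from a two-class confusion matrix. The sketch below uses the standard definitions of accuracy, sensitivity, specificity and Cohen's kappa; the function name and the example counts are hypothetical and are not results from this study.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity and Cohen's kappa
    from the four cells of a binary confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    # Agreement expected by chance, from the row and column totals.
    p_positive = ((tp + fn) / total) * ((tp + fp) / total)
    p_negative = ((fp + tn) / total) * ((fn + tn) / total)
    expected = p_positive + p_negative
    kappa = (accuracy - expected) / (1.0 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(classification_metrics(tp=40, fn=10, fp=20, tn=30))
```

Kappa corrects accuracy for chance agreement, so a model with an accuracy of 0.70 on two equally sized classes has a kappa of only 0.40.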

1. Obach R.S. et al. Drug Metab. Dispos., 2008, 36(7): 1385-1405. 2. http://www.qsar4u.com


V. Kuz’min1 A. Mouats2 A. Artemenko1 P. Polishchuk1

(2.0+0.X)D-QSAR MODELS. STEREOCHEMICAL DESCRIPTION OF MOLECULES WITHIN THE “SiRMS” APPROACH

1 Department of Molecular Structure and Chemoinformatics, O.V. Bogatsky Physico-Chemical Institute NAS of Ukraine, 86 Lustdorfskaja doroga, 65080 Odessa, Ukraine; 2 Chair of Organic Chemistry, I.I. Mechnikov Odessa National University, Dvorianskaja Str. 2, 65000 Odessa, Ukraine; [email protected]

The stereochemistry of molecules is considered explicitly in 3D-QSAR models. These models are usually applied to one fixed conformer of each molecule, which can be undesirable if the spatial structure of the molecule interacting with the biological target is unknown. Commonly used 2D-QSAR models frequently have comparable or better predictive performance than 3D ones, but they lack a stereochemical representation. Thus the question arises: how can the stereochemistry of molecules (different types of chirality, geometric isomers, etc.) be described without specifying their conformation?

In this study we show how to solve this problem in terms of the simplex representation of molecular structure (SiRMS) [1]. All molecular fragments that do not determine the stereochemistry of a molecule are described in terms of the 2D molecular representation (structural formula). Structural elements that determine molecular stereoisomerism are described by 3D chiral simplexes [2]. This procedure is illustrated by the example of the enantiomers of alanine:

[Figure: chiral 3D simplexes for the S isomer, common 2D achiral simplexes, and chiral 3D simplexes for the R isomer.]

It should be noted that chiral simplexes allow us to describe a molecular system of any stereochemical complexity. In the proposed (2.0+0.X)D-QSAR approach, the parameter (0.X) is determined by the ratio of 2D achiral and 3D chiral simplexes involved in the corresponding QSAR model.

The effectiveness of the developed approach was demonstrated on various examples of QSAR tasks for chiral molecules of different types.

1. Kuz'min, V. E., Artemenko, A. G., et al., Journal of Computer-Aided Molecular Design, 2008, 22: 403-21.
2. Kuz'min, V. E., Chelombitko, V. A., et al., Journal of Structural Chemistry, 1998, 39: 452-56.


A.G. Maldonado

THE ROLE OF PREDICTIVE MODELING ON INDUSTRIAL CHEMICAL INNOVATION

TNO Technical Sciences / Industrial Innovation, PoBox 6235, 5600 HE, Eindhoven, the Netherlands; [email protected]

The general idea of predictive modeling (PM) is simple: using available data, predict a future response with as much accuracy as possible. But even though predictive modeling is all about making practical use of real data to solve real problems, in most cases the data are not clean and the model is not quite perfect. Often the problem that the predictive model tries to solve is too difficult to be addressed experimentally and will benefit greatly even from a "trends" model.

PM combined with other data mining and chemometric tools has proven to be a powerful asset in modern industrial molecular discovery. Using PM in industrial chemistry can also help us to understand why a certain response behaves as it does, and which molecular features govern (most of) its behavior. It can help us to understand why, among a large set of experiments, we did not succeed in finding the optimal molecule. This helps us to pinpoint good regions in chemical space in which to search for novel optimal molecules. PM is thus key in substitution projects and in chemical innovation.

In this lecture, we will show how to get the best out of your experimental data by combining theoretical calculations, QSPR models and experimental validations.

1. A.G. Maldonado, G. Rothenberg. Chem. Soc. Rev, 2010. DOI: 10.1039/b921393g. http://www.rsc.org/publishing/journals/CS/article.asp?doi=b921393g 2. A.G. Maldonado, G. Rothenberg. Chem. Eng. Proc., June 2009.

[Figure: workflow linking Experimental Data, Predictive Model, Experimental Validation and Optimal Performance.]


N. Rusakova1 A. Kotomkin1 V. Turovtsev1,2 Yu. Orlov1

COMPARISON OF ELECTRON STRUCTURE OF THE RADICAL ISOMERS OF THIOCARBOXYLIC ACIDS n-Alk-CH●-CSOH

1 General Physics Department, Tver State University, Sadoviy Str. 35, Tver, Russia; 2 Department of Physics, Mathematics and Medical Informatics, Tver State Medical Academy, Sovetskaya Str. 4, Tver, Russia; [email protected]; [email protected]

Studying radicals by classical chemical methods is technically complicated and expensive because of their high reactivity and short lifetime. The aim of this work is to study in detail the electronic structure of compounds in the homologous series of the tautomeric thiocarboxylic acids CH3(CH2)nC●HC(S)OH and CH3(CH2)nC●HC(O)SH within QTAIM [1] and to analyze the inductive effect of the free-valence group C●H on the alkyl chain.

Geometry optimizations of CH3(CH2)nC●HC(S)OH and CH3(CH2)nC●HC(O)SH, where 0 ≤ n ≤ 7, were carried out with the GAUSSIAN 03 program [2] at the B3LYP/6-311++G(3df,3pd) 6d 10f level. Charges q(Ω) and spin densities σ(Ω) of the «topological atoms» Ω were calculated using AIMALL [3]. The parameters of the functional groups, q(R) and σ(R), were obtained by summing the corresponding characteristics of the atoms Ω. Group electronegativities χ(R) in the radicals were determined qualitatively by comparing their q(R) values.
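The group parameters are plain sums over the topological atoms of each group, q(R) = Σ q(Ω) for Ω in R. A minimal sketch of that bookkeeping is given below; the function name, atom labels and numerical values are hypothetical placeholders and are not AIMALL output from this work.

```python
from typing import Dict, List


def group_totals(atomic_values: Dict[str, float],
                 groups: Dict[str, List[str]]) -> Dict[str, float]:
    """Sum a per-atom property (charge q or spin density sigma) over each group R."""
    return {
        group: sum(atomic_values[atom] for atom in atoms)
        for group, atoms in groups.items()
    }


if __name__ == "__main__":
    # Hypothetical atomic charges in a.u., for illustration only.
    atomic_charges = {"C1": -0.10, "H1": 0.04, "H2": 0.04, "H3": 0.04,
                      "C2": -0.02, "H4": 0.05}
    groups = {"CH3": ["C1", "H1", "H2", "H3"], "CH": ["C2", "H4"]}
    for name, charge in group_totals(atomic_charges, groups).items():
        print(f"q({name}) = {charge:+.3f} a.u.")
```

The same function serves for spin densities σ(R) if per-atom σ(Ω) values are passed instead of charges.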

Calculation of q(R) for CH3(CH2)nC●HC(S)OH and CH3(CH2)nC●HC(O)SH revealed that q(C(O)SH) is lower than q(C(S)OH) by 0.024 a.u. Thus, the electronegativities are related as χ(C(S)OH) < χ(C(O)SH). Comparison of the group charges of C●H, CH2, CH3 and CSOH gave the common scale χ(R):

χ(CH2) < χ(CH3) < χ(C●H) < χ(C(S)OH) < χ(C(O)SH)

It has been shown that the inductive effect of C●H in the isomers CH3(CH2)nC●HC(O)SH and CH3(CH2)nC●HC(S)OH extends over the nearest three CH2 groups of the hydrocarbon chain. No steric effects of the COSH groups were observed in the radicals CH3(CH2)nC●HC(O)SH and CH3(CH2)nC●HC(S)OH.

Analysis of the σ(R) values in CH3(CH2)nC●HC(O)SH allows the fragment C●H to be identified as the radical center. Although a small part of the spin density is localized on the oxygen atom of C(O)SH, the free valence (marked with the symbol ●) can be ascribed to the carbon atom of the C●H group. In CH3(CH2)nC●HC(S)OH, the distribution of σ(R) does not allow the fragment C●H to be identified as the sole radical center, because σ(C●H) = σ(S) = 0.5. Thus, in this case the free valence can be formally ascribed to two atoms: carbon in the C●H group and sulfur in the C(S●)OH group.

1. R.F.W. Bader, Atoms in molecules. Quantum theory, Oxford University Press, Oxford (1990).
2. Frisch M.J., Trucks G.W. et al. Gaussian 03 (Revision E.01 SMP). Gaussian Inc., Pittsburgh PA, 2007.
3. AIMAll (Version 11.09.18, Professional), Todd A. Keith, 2010 (http://aim.tkgristmill.com).

This work was partially supported by RFBR grant (project 14-03-97502)


A.V. Semenov1 I.V. Tarasova1 A.V. Tutushkina1 E.V. Semenova2

THEORETICAL EVALUATION OF BINDING EFFICIENCY OF TRPV1 WITH SOME CAPSAICIN HETEROANALOGUES

1 Chair of Organic Chemistry, Physics and Chemistry Institute, Ogarev Mordovian State University, Bolshevistskaya Str., 68, Saransk, Russia; 2 Chair of Pharmacology, Institute of Medicine, Ogarev Mordovian State University, Bolshevistskaya Str., 68, Saransk, Russia; [email protected]

According to recent investigations, the main component of hot peppers, the alkaloid capsaicin, can be considered a multimodal analgesic drug. It can influence both the nociceptive component of pain (through direct action on the vanilloid receptor TRPV1) and the neuropathic component (by normalizing metabolic processes in nerve fibers and stimulating the regeneration of damaged structures owing to its proven antioxidant effect).

Using molecular docking, we evaluated the binding of the TRPV1 receptor to potential ligands, the capsaicin heteroanalogues (1), which offer better ADME characteristics and antioxidant activity than the prototype.

The structure of the target receptor TRPV1 was taken from the Protein Data Bank as a pdb file (entry 3J5R). The target and ligand structures were prepared for docking with the UCSF Chimera software package, version 1.10.1 [1].

Molecular docking was carried out in the capsaicin binding region of the TRPV1 receptor established in [2] and was implemented with the software package EADock2 [3]. The docking process was based on a multiobjective evolutionary algorithm that includes incremental reconstruction of the ligand based on the dihedral space sampling (DSS) procedure [3]. The search for binding sites was based on the grid-based LIGSITE algorithm [4]. The scoring function was based on the CHARMM22 force field [5]. The screening results were filtered by visual comparison with the electron microscopy data for capsaicin [2].

Selected results are shown in the Table, which presents the binding parameters for the top ligand conformations within the clusters selected according to the visual filtering criteria.

Table – Results of molecular docking of ligands (1) with the TRPV1 receptor

[Structures: capsaicin and its heteroanalogues 1a-k, which carry a variable substituent R.]

Compound    R            ΔG, kcal/mol
capsaicin                -6.59
1a          H            -6.27
1b          CH3          -6.39
1c          CH3CH2       -6.69
1d          CH3(CH2)2    -6.44
1e          CH3(CH2)3    -6.67
1f          CH3(CH2)4    -6.76
1g          CH3(CH2)5    -6.98
1h          CH3(CH2)6    -6.97
1i          CH3(CH2)7    -6.99
1j          CH3(CH2)8    -7.63
1k          CH3(CH2)9    –*

* The compound does not form stable complexes within the investigated binding site.

Analysis of the data suggests that most of the investigated ligands have binding energies comparable to that of the prototype. Ligand (1j) shows the best binding energy.
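The comparison with the prototype can be reproduced from the tabulated ΔG values. The short sketch below only re-reads the numbers reported in the Table; the function and variable names are hypothetical, and compound 1k is stored as None because no stable complex was found for it.

```python
from typing import Dict, List, Optional, Tuple

# Binding energies from the Table, kcal/mol (None = no stable complex reported).
DOCKING_DG: Dict[str, Optional[float]] = {
    "capsaicin": -6.59,
    "1a": -6.27, "1b": -6.39, "1c": -6.69, "1d": -6.44, "1e": -6.67,
    "1f": -6.76, "1g": -6.98, "1h": -6.97, "1i": -6.99, "1j": -7.63,
    "1k": None,
}


def better_than_reference(scores: Dict[str, Optional[float]],
                          reference: str = "capsaicin") -> List[Tuple[str, float]]:
    """Return ligands with a more negative dG than the reference, best first."""
    ref_value = scores[reference]
    hits = [
        (name, dg) for name, dg in scores.items()
        if name != reference and dg is not None and dg < ref_value
    ]
    return sorted(hits, key=lambda item: item[1])


if __name__ == "__main__":
    for name, dg in better_than_reference(DOCKING_DG):
        print(f"{name}: {dg:.2f} kcal/mol")
```

With these values, seven of the eleven heteroanalogues score below capsaicin, and 1j has the most negative ΔG.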

1. http://www.cgl.ucsf.edu/chimera/
2. Cao E. et al. Nature. 2013. 504: 113–118.
3. Grosdidier A. L. et al. J. Comput. Chem. 2011. 32: 2149–2159.
4. Hendlich M. et al. J. Mol. Graph. Model. 1997. 15: 359–363.
5. Brooks B. R. et al. J. Comput. Chem. 2009. 30: 1545–1614.


G. V. Sitnikov1,2 N. I. Zhokhova3 A. Varnek1,4 I. I. Baskin1,3,4

CONTINUOUS INDICATOR FIELDS IN MODELLING OF Am3+/Eu3+ SEPARATION FACTORS

1 Laboratoire de Chémoinformatique, UMR 7140 CNRS, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, France; 2 A.N.Nesmeyanov Institute of Organoelement Compounds of Russian Academy of Sciences, Vavilova St. 28, Moscow, Russia; 3 Chair of Polymer and Crystal Physics, Faculty of Physics, M.V.Lomonosov Moscow State University, Leninskie Gory, Moscow, Russia; 4 A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya St. 18, Kazan, Russia;

[email protected]

In the framework of the Continuous Molecular Fields (CMF) approach [1] for building 3D-QSAR models, continuous functions are used to describe molecular fields instead of the finite sets of molecular descriptors (such as interaction energies computed at grid nodes) commonly applied for this purpose (in CoMFA, CoMSIA, GRID, etc.). In this work, a novel type of molecular field, Continuous Indicator Fields (CIFs) [2], is suggested to provide a 3D structural description of molecules. The values of CIFs are calculated as the degree to which a point with given 3D coordinates belongs to an atom of a certain type. CIFs can be considered a 3D analog of atom-centered fragment descriptors.
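To make the idea of a "degree of belonging" concrete, the sketch below evaluates a continuous indicator value at a point as a sum of Gaussian functions centred on the atoms of one type. The Gaussian form, the width parameter and all names are assumptions made for illustration; the abstract does not give the functional form actually used in [2].

```python
import math
from typing import List, Sequence, Tuple

Atom = Tuple[str, Tuple[float, float, float]]  # (atom type, xyz coordinates)


def indicator_field(point: Sequence[float], atoms: List[Atom],
                    atom_type: str, width: float = 1.0) -> float:
    """Degree to which `point` belongs to atoms of `atom_type`.

    ASSUMPTION: each atom contributes exp(-d^2 / (2 * width^2)), where d is the
    distance from the point to the atom centre.
    """
    total = 0.0
    for a_type, center in atoms:
        if a_type != atom_type:
            continue
        d2 = sum((p - c) ** 2 for p, c in zip(point, center))
        total += math.exp(-d2 / (2.0 * width ** 2))
    return total


if __name__ == "__main__":
    # Hypothetical two-atom fragment, coordinates in angstroms.
    fragment = [("N", (0.0, 0.0, 0.0)), ("C", (1.4, 0.0, 0.0))]
    print(indicator_field((0.0, 0.0, 0.0), fragment, "N"))
    print(indicator_field((1.4, 0.0, 0.0), fragment, "N"))
```

A field of this kind is defined at every point of space for each atom type, which is what distinguishes it from descriptors sampled only at grid nodes.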

This approach is applied to building 3D-QSAR models of separation factors of Am3+/Eu3+ for the datasets of 47 polyazaheterocyclic ligands. It is shown that even in the simplest form this method provides either comparable or enhanced predictive performance of models in comparison with state-of-the-art 3D-QSAR methods based on interaction molecular fields of physico-chemical nature.

Graphical analysis of 3D-QSAR models based on the use of CIFs shows the preferable positions (in physical 3D space) of structural features important for strong binding of ligands to proteins. This allows for a clear interpretation of 3D-QSAR models by visualizing the overlap between the fields of regression coefficients specifying the preferable positions of atoms with CIFs describing their actual position for a given molecule. This directs the process of lead optimization towards the better overlap between them.

[1]. Baskin I.I., Zhokhova N.I. J. Comput.-Aided Mol. Des., 2013, 27(5): 427-442. [2]. Sitnikov G.V., Zhokhova N.I., Ustynyuk Y.A., Varnek A., Baskin I.I. J. Comput.-Aided Mol. Des., 2015, 29(3): 233-247.


A.V. Petrova1 B. Minisini2 O.V. Nedopekin1 D.A. Tayurskii1,3

AB-INITIO INVESTIGATION OF RARE-EARTH FLUORITES GdLiF4 AND LuLiF4 UNDER HYDROSTATIC PRESSURE

1 Laboratory for Computer Design of New Materials, Institute of Physics, Kazan Federal University, Kremlevskaya Str., 16a, Kazan, Russia; 2 Institut Supérieur des Matériaux et Mécaniques Avancés du Mans, Avenue Bartholdi, 44, Le Mans, France; 3 Centre for Quantum Technologies, Kazan Federal University, Kremlevskaya Str., 16a, Kazan, Russian Federation; [email protected]

Interest in fluorite rare-earth compounds (with the scheelite CaWO4 structure) has increased significantly owing to their possible application in laser technologies and microelectronics [1,2]. When stress is applied, these materials undergo phase transitions. To investigate the phase transitions in the LuLiF4 and GdLiF4 compounds, we performed ab initio calculations by means of DFT [3, 4] using the VASP 5.2 (Vienna Ab-Initio Simulation Package) program [5], which is part of the MedeA¹ interface. The ferroelastic phase transition of the LuLiF4 scheelite (I41/a, Z=4) from the tetragonal structure to the fergusonite one (C12/c1, Z=4) was found at 10.5 GPa [6]. It was identified as a second-order transition from the pressure dependence of the structural parameters, order parameter and cell volume. The absence of phase transitions to the P21/c and P12/c1 structure symmetries was shown. In order to find similar phase transitions in GdLiF4, the behavior of the order parameter for the C12/c1 symmetry, as well as the enthalpy difference between the two symmetries I41/a and P12/c1 as a function of pressure, were investigated. One can conclude that the order parameter of the GdLiF4 structure changes at a pressure close to 16 GPa. The enthalpies of the I41/a and P12/c1 symmetries coincide near 18 GPa. Thus, the transitions to the C12/c1 and P12/c1 symmetries compete with each other. The fact that the GdLiF4 compound undergoes structural decomposition can be explained on the basis of this assumption.

¹ Materials Design, S.A.R.L.

1. Errandonea D., Manjón F.J., Somayazulu M., Häusermann D., J. Solid State Chem., 2004, 177 (4-5): 1087-1097.
2. Minisini B., Bonnaud P., Wang Q., Tsobnang F., Comput. Mater. Sci., 2008, 42(1): 156-160.
3. Hohenberg P., Kohn W., Phys. Rev., 1964, 136: B864.
4. Kohn W., Sham L.J., Phys. Rev., 1965, 140: A1133.
5. Kresse G., Furthmüller J., Phys. Rev. B, 1996, 54: 11169.
6. Petrova A., Minisini B., Nedopekin O., Tayurskii D., J. Phys.: Conf. Ser., 2012, 394: 012021.

The work is performed according to the Russian Government Program of Competitive Growth of Kazan Federal University.


O. P. Varlamov1 R. I. Nugmanov1 T. I. Madzhidov1 G. Marcou2 D. Horvath2 A. Varnek1,2

DEVELOPMENT OF WEB SERVICE FOR MODELLING OF CHEMICAL REACTIONS

1 Laboratory of Chemoinformatics and Molecular Modeling, A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya Str., 18, Kazan, Russia; 2 Chemoinformatics Laboratory, University of Strasbourg, B. Pascal Str., 4, Strasbourg, France; [email protected]

Modern information technologies, such as machine learning, are widely used in chemistry to search for "structure - property" relationships. A large number of resources for working with molecules have been created: there are tools for creating and managing chemical information in databases (JChem, Bingo) and web resources for "structure - property" modeling of molecules (OChem).

At the same time, no accessible tools have yet been created for chemical reactions that would enable experts in chemoinformatics to provide access to their models.

In this work, a web service for working with chemical reactions was created. A mechanism is offered for checking the quality of atom-to-atom mapping, both with software and in interactive mode. An algorithm was developed for identifying reaction types.

The project was developed with Python3/Flask-restful and Pony.ORM/PostgreSQL on an NGINX/uwsgi host (server side) and JavaScript with Bootstrap (client side).

In the course of this research, a model for predicting the rate constants of bimolecular nucleophilic substitution reactions of azides was implemented. Each chemical reaction was represented as a Condensed Graph of Reaction (CGR). The temperature and solvent characteristics were also included in the set of descriptors. Relationships between the reaction parameters and the rate constants were found using the support vector machine (SVM) method with a Gaussian kernel. Model performance was assessed by the determination coefficient Q2 (squared correlation coefficient) between the predicted and experimental values as well as by the root mean square error (RMSE). A high correlation between the predicted and experimental rate constants was observed (Q2 = 0.69, RMSE = 1.07).
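The two performance measures can be sketched in a few lines. This is an illustration under an assumption: the determination coefficient is computed here as 1 − SS_res/SS_tot, which may differ from a squared Pearson correlation coefficient; the abstract does not specify which formula was used. Function names and the example numbers are hypothetical.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def determination_coefficient(observed: Sequence[float],
                              predicted: Sequence[float]) -> float:
    """ASSUMED definition: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up values standing in for experimental and predicted rate constants.
    y_exp = [-3.0, -2.0, -1.0, 0.0]
    y_pred = [-2.5, -2.0, -1.5, 0.0]
    print(f"Q2 = {determination_coefficient(y_exp, y_pred):.2f}, "
          f"RMSE = {rmse(y_exp, y_pred):.2f}")
```

When these statistics are computed on cross-validation predictions rather than fitted values, they estimate predictive rather than fitting performance.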

The web service is available at http://arsole.u-strasbg.fr

This work was supported by Russian Scientific Foundation grant, agreement No. 14-43-00024, signed 1.10.2014.


CONTENT

pages General Information 1 Program of the Second Kazan Summer School on Chemoinformatics 3 Plenary lectures 5 J. Gasteiger Solved and Unsolved Problems of Chemoinformatics 6

A. Varnek

Chemical Space Paradigm in Chemoinformatics 7

V. Tkachenko Chemical Databasing. State of the Art and Current Challenges 8

H. Senderowitz Statistical Modeling in Material Sciences 9

J. Aires-de-Sousa Machine Learning with Large Datasets of Quantum Chemistry Calculations

10

I.V. Tetko Prediction-Driven Matched Molecular Pairs to Interpret and Compare QSPR/QSRR Models

11

T. Langer, S. D. Bryant, G. Ibis, T. Seidel, M. Wieder

Chemical Feature-Based 3D Pharmacophore Models for Drug Design: Current and Future Aspects

12

I. Baskin, N. I. Zhokhova, G. V. Sitnikov, A. Varnek

3D QSAR: Achievements and Perspectives 13

A. Tropsha

Current Trends in QSAR Modeling 14

V. G. Tsirelson

In Search of Up-To-Date Bonding Descriptors Based on Electron Density 15

V. Poroikov

Drug Discovery: Science, Art, Business 16

Key-note lectures 17 V.A. Palyulin, E.V. Radchenko, N.S. Zefirov Molecular Field Topology Analysis as an Advanced Tool for QSAR/QSPR Studies

18

D.Horvath, L.Hoffer, G.Marcou, A.Varnek S4MPLE – Sampler for Multiple Protein-Ligand Entities: Simultaneous Docking of Several Entities

19

P. Polishchuk, E. Mokshyna, E. Varlamova, E. Muratov, T. Madzhidov, V.Kuz’min Applications of The Mixtures Representation Approach in QSAR Modeling

20

B. Creton Structure-Property Modelling in The Oil Industry 21

T. Madzhidov Reaction Mining Basics: Database Search and Structure-Reactivity Modeling

22

61

Page 63: GENERAL INFORMATION - kpfu.ru · 2015-09-14 · GENERAL INFORMATION ORGANIZERS Kazan (Volga region) Federal University Russian Scientific Foundation ... The program of the Second

V. Solov'ev QSPR Models for Halogen Bond Strength 23

G. Marcou, D. Horvath, A. Varnek A Measure for QSAR Modelability and for Data Organization 24

Oral Presentations 25 D. Shulga, O. Titov, V. Palyulin, N. Zefirov Perspectives of Direct Modeling of Halogen Bonding in Early Drug Discovery

26

V. B. Siramshetty, P. Banerjee, A. Olubunmi, R. Preissner Potential Drug Repositioning Opportunities for Ebola Virus Disease 27

O. Tarasova, A. Urusova, A. Zakharov, M. Nicklaus, V. Poroikov Application of The Large-Scale Database to The QSAR Modeling of The HIV-1 Reverse Transcriptase Inhibitors

28

P.Sidorov, H.Gaspar, G.Marcou, D.Horvath, A.Varnek Mappability of Drug-Like Space: Towards A Polypharmacologically Competent Map of Drug-Relevant Compounds

29

Poster Presentations 30 P. Banerjee, V. B. Siramshetty, M. N. Drwal, A. Goede, M. Dunkel, R.Preissner Systematic Assessment of Different Computational Approaches for Prediction of Toxic Effects of Chemical Structures

31

P. N. D’yachkov, E. P. D’yachkov Interaction of Carbon Nanotubes with Antitumor Drugs Doxorubicin and Paclitaxel Calculated Using Molecular Docking and Dynamics Techniques

32

T. Gimadiev , H. Gaspar , G. Marcou, D. Horvath , A. Varnek Generative Topographic Mapping Approach to Modeling and Chemical Space Visualization of Human Intestinal Transporters

33

T.R. Gimadiev, R. I. Nugmanov, T. I. Madzhidov, P.G. Polishuk, A.S. Petrovsky, I. I. Baskin, I. S. Antipin, A. A. Varnek Prediction of Tautomer Equilibrium Constants Using Condensed Graphs of Reaction 34

M. Glavatskikh, T. Madzhidov, V. Solov’ev, D. Horvath, G. Marcou, A. Varnek, et al. Predictive Models for Halogen Bond Basicity Scale pKi2 35

V. Gordeeva, D. I. Osolodkin System of Structure-Based Virtual Screening of Mycobacterium Tuberculosis Protein Kinase PknB Inhibitors 36

G. Jonusauskas, S. A. Denisov, N. D. McClenaghan Equilibration Between Electronic States and Reversible Electronic Energy Transfer in Bichromophoric Compounds 37

A. Kosinskaya, P. Polishchuk, L. Ognichenko, O. Lebed, I. Burdina, V. Kuz’min 2D-QSAR Models of Blood-Brain Barrier Permeability of Organic Compounds 38

A. I. Lin, T. I. Madzhidov, I. Antipin, O. Klimchuk, A. Varnek Toward An Expert System for Assessment of Optimal Conditions for Selective Deprotection Reactions 39

E. Mokshyna, P. Polishchuk, V. Nedostup, V. Kuz’min Atom-Pair Descriptors, Temperature-Dependent Property and Binary Mixtures: QSPR for Second Virial Cross-Coefficients 40

R. I. Nugmanov, G. R. Sabirova, T. I. Madzhidov, A. A. Varnek Descriptors of Counter-Ion Effect in Bimolecular Nucleophilic Substitution Reactions 41

I. Piyanzina, B. Minisini, D. Tayurskii Density Functional Theory Characterisation of Azobenzene Derivatives 42

D. Podshivalov, V. Timofeev, I. Kuranova Computer-Aided Search of New Inhibitors of Phosphopantetheine Adenylyltransferase from Mycobacterium Tuberculosis 43

A. Prokhorov, V. Tumanov The Use of Mamdani's Method for Predicting The Values of The Classical Potential Barrier for Radical Reactions 44

E. A. Sosnina, D. I. Osolodkin, E. V. Radchenko, S. B. Sosnin, V. A. Palyulin, N. S. Zefirov QSAR Study of Glycogen Synthase Kinase 3 Inhibitors 45

K. Taborskaya, M. Petrosyan, E. Fedorova, K. Pats Use of Mathematical Modeling to Predict The Progestational Activity 46

R. Toichuev Energy Aspects in Modeling 47

B. Viira, A. T. Garcia-Sosa, U. Maran Hunting Promising Structure Motifs for HIV1 RT from Publicly Available Data 48

Balakin K.V., Lapushkin G.I., Savilova A.G., Voronkov A. Prediction of The Clinical Adverse Effects of Serotonin and Norepinephrine Reuptake Inhibitors Based on Analysis of Their Multitarget Activity Profile 49

L.S. Gredyagina, N.I. Baranova, E.V. Fedorova The Correlation Between logP of Ligands and Toxicity of Their Complexes with Vanadium 50

V.R. Khayrullina, A.Ja. Gerchikov, M. N. Vasiliew, T.R. Nasrtdinova, R.F. Nasjirova, A.S. Zigangirov, F.S. Zarudiy, I.A. Taipov QSAR-Modeling of Antitumor Activity of Quinazoline Derivatives Using GUSAR 51

Khayrullina V.R., Gerchikov A.Ja., Nasjirova R.F., Zigangirov A.S., Nasrtdinova T.R., Zarudiy F.S. Correlation Analysis of Cyclooxygenase-2 Inhibitors in A Series of 1,2-Diphenylimidazole Derivatives 52

M. Kulinsky, P. Polishchuk, V. Kuz’min Classification QSAR Models of Pharmacokinetic Properties of Drug Substances and Their Structural Interpretation 53

V. Kuz’min, A. Mouats, A. Artemenko, P. Polishchuk (2.0+0.X)D-QSAR Models. Stereochemical Description of Molecules within The “SiRMS” Approach 54

A.G. Maldonado The Role of Predictive Modeling on Industrial Chemical Innovation 55

N. Rusakova, A. Kotomkin, V. Turovtsev, Yu. Orlov Comparison of Electron Structure of The Radical Isomers of Thiocarboxylic Acids n-Alk-CH●-CSOH 56


A.V. Semenov, I.V. Tarasova, A.V. Tutushkina, E.V. Semenova Theoretical Evaluation of Binding Efficiency of TRPV1 with Some Capsaicin Heteroanalogues 57

G. V. Sitnikov, N. I. Zhokhova, A. Varnek, I. I. Baskin Continuous Indicator Fields in Modelling of Am3+/Eu3+ Separation Factors 58

A.V. Petrova, B. Minisini, O.V. Nedopekin, D.A. Tayurskii Ab-Initio Investigation of Rare-Earth Fluorites GdLiF4 and LuLiF4 Under Hydrostatic Pressure 59

O. P. Varlamov, R. I. Nugmanov, T. I. Madzhidov, G. Marcou, D. Horvath, A. Varnek Development of Web Service for Modelling of Chemical Reactions 60
