Online chemical database with modeling environment: a summer school course. Sergii Novotarskyi, Iurii Sushko

Online Chemical Database with Modelling Environment

DESCRIPTION

AACIMP 2009 Summer School lecture by Yuriy Sushko and Sergii Novotarskyi. "Environmental Chemoinformatics" course.

Page 1: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: a summer school course

Sergii Novotarskyi, Iurii Sushko

Page 2: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: Chemical databases

1. PubChem – a database that provides information on the biological activities of small molecules

2. ChemSpider – a free-access service providing a structure-centric community for chemists

3. ChemIDplus – a tool that provides chemical structure, property, and toxicity searching

4. ChemBank – a database of chemical structures and assays

5. ChemDB – a set of chemoinformatics tools

Page 3: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: Literature databases

6. PubMed – a service that includes over 19 million citations from MEDLINE and other life science journals for biomedical articles back to 1948

7. Toxicology Literature Online (TOXLINE) – references from toxicology literature

8. ScienceDirect – a full-text scientific database offering articles/chapters from more than 2,500 peer-reviewed journals and more than 10,000 books

9. ACS Publications – a worldwide scientific community with a collection of the most cited peer-reviewed journals in the chemical and related sciences.

Page 4: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: PubChem – start page

URL: http://pubchem.ncbi.nlm.nih.gov/ (or search the web for «PubChem»)

Page 5: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: PubChem – search results

Page 6: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: PubChem – compound details

Page 7: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: PubChem – bioassay search results

Page 8: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: ChemSpider – start page

URL: http://www.chemspider.com/ (or search the web for «ChemSpider»)

Page 9: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: ChemSpider – search results

Page 10: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: ChemIDplus – main page

URL: http://chem.sis.nlm.nih.gov/chemidplus/ (or search the web for «ChemIDplus»)

Page 11: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: ChemIDplus – search results

Page 12: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: ChemBank – main page

URL: http://chembank.broadinstitute.org/ (or search the web for «ChemBank»)

Page 13: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: ChemBank – search results

Page 14: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: ChemDB – main page

URL: http://cdb.ics.uci.edu/ (or search the web for «ChemDB»)

Page 15: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: ChemDB – search results

Page 16: Online Chemical Database with Modelling Environment

Chemoinformatics – overview of online resources: PubMed – main page

URL: http://www.ncbi.nlm.nih.gov/pubmed/ (or search the web for «PubMed»)

Page 17: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: The subject of development

The web-based service

The database of physical, chemical and biological properties

• Accumulating experimentally verified data

• Providing user-friendly web-based access to this data

The QSPR modeling environment

• Providing web-based tools for QSPR modeling

• Storing and “publishing” created models

Page 18: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Motivation

Our motivation

The importance of QSPR modeling

The importance of web-based tools for QSPR modeling

The importance of building one more service in this field

Page 19: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Motivation - QSPR

Structure-property relationship hypothesis:

“Similar structures - similar properties”

QSPR modeling:

Predicting properties based on available data for structurally similar molecules.

Structures are represented by a set of descriptors (e.g. atom count, molecular weight).

[Figure: three structures with known values, log(IC50) = 0.64 log(µM), log(IC50) = 1.87 log(µM) and log(IC50) = 1.87 log(µM), and a structurally similar query structure with log(IC50) = ?]
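The “similar structures - similar properties” idea can be sketched as a nearest-neighbour prediction in descriptor space. This is only an illustration and not the modeling method used by the service: the log(IC50) values are the ones shown on the slide, while the descriptor values (atom count, molecular weight), the query molecule and the function names are invented for the example.

```python
import math

# Toy training set: (descriptor vector, log(IC50) in log(µM)).
# Descriptors are [atom count, molecular weight]. The property values come
# from the slide; the descriptor values are HYPOTHETICAL, made up for illustration.
TRAINING = [
    ([12, 170.2], 0.64),
    ([21, 310.4], 1.87),
    ([22, 324.4], 1.87),
]


def scale_params(vectors):
    """Per-descriptor (minimum, range), so that no single descriptor dominates."""
    columns = list(zip(*vectors))
    return [(min(c), (max(c) - min(c)) or 1.0) for c in columns]


def normalise(vector, params):
    """Rescale a descriptor vector with the (minimum, range) pairs."""
    return [(v - lo) / rng for v, (lo, rng) in zip(vector, params)]


def predict(query, training, k=2):
    """Average the property over the k training molecules closest to the query."""
    params = scale_params([descriptors for descriptors, _ in training])
    q = normalise(query, params)
    by_distance = sorted(
        (math.dist(q, normalise(descriptors, params)), value)
        for descriptors, value in training
    )
    nearest = by_distance[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical query molecule, close to the second and third training molecules.
print(predict([20, 300.0], TRAINING))
```

With k=2 the prediction is driven only by the two most similar molecules; increasing k pulls in less similar structures and smooths the estimate.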

Page 20: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: QSPR – Similarity in descriptor space

Number of specific fragments in a molecule

Page 21: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Motivation - web-based tools for modeling

Main benefits of web-based tools:

• Availability and accessibility: only a computer with Internet access and a modern web browser is required to start working; work materials can be shared among several locations; works on any platform (Linux, Win, Mac)

• Communication and collaboration: possibility to work on common topics, publish your own results and use new results of other people

Page 22: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Motivation - one more web-based tool

Reasons to build one more service:

• Different approach to data modification: a completely open database; any user can add, delete and edit data (constrained only by a set of simple rules)

• Different approach to data organization: data in the database is organized in a way suitable for QSPR modeling

• Integration of a database with modeling tools: data from the database can be used for model creation and property prediction

Page 23: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Distinctive features

The features that make our service different:

• “Wiki” approach to data handling: users can add, modify and delete data

• Mandatory reference to an article: every record in the database should contain a reference to the article where the data was published

• Storing additional information: we store measurement conditions to increase data quality

• Several tools to support decision making: integration with other web services (validation of molecule names against the PubChem database, automatic fetching of article information from PubMed), duplicate records management

• Aimed at model building: convenient to build training sets from data (filter by property or article) and either export the data to the internal modeling tools or download it as an Excel file

Page 24: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Data structure

Page 25: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Simplified data structure

[Diagram of database entities: Records, Properties, Molecules, Articles, Units, Journals, Conditions, Users]
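The entities named on this slide could be modelled roughly as follows. The diagram's actual links and field names are not recoverable from the text, so every attribute and relation below is an assumption; the only relations taken from the lecture are that each record must reference an article and that measurement conditions are stored with the record. The example record uses row 3 of the practical-course table; the user login is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Journal:
    title: str


@dataclass
class Article:
    title: str
    pubmed_id: Optional[int] = None
    journal: Optional[Journal] = None


@dataclass
class Unit:
    name: str


@dataclass
class Property:
    name: str
    default_unit: Optional[Unit] = None  # assumed link between properties and units


@dataclass
class Molecule:
    name: str
    smiles: Optional[str] = None


@dataclass
class Condition:
    name: str
    value: str


@dataclass
class User:
    login: str


@dataclass
class Record:
    molecule: Molecule
    prop: Property
    value: str
    article: Article  # mandatory: every record points to its source article
    introducer: User  # assumed: the user who entered the record
    unit: Optional[Unit] = None
    conditions: List[Condition] = field(default_factory=list)


record = Record(
    molecule=Molecule("Mexiletine"),
    prop=Property("CYP450 Modulation"),
    value="Inhibitor",
    article=Article("Involvement of CYP1A2 in mexiletine metabolism", pubmed_id=9690950),
    introducer=User("student"),  # hypothetical login
    conditions=[Condition("CYP450 Type", "CYP1A2")],
)
print(record.molecule.name, record.value, record.article.pubmed_id)
```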

Page 26: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: User interface conventions

Browser-based interface

Page 27: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: User interface conventions

Browser-based interface

Page 28: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: User interface conventions

Icons

Edit current record (item, article, unit, etc.)

Delete current record

In most places: open a record-specific submenu; in some places: view the profile

Open a wiki page with additional explanations

Send a message to the user

Download data in XLS format

Select item

Page 29: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Summary

The database currently contains:

More than 50000 records

Around 285 properties

More than 2700 articles

Page 30: Online Chemical Database with Modelling Environment

Thank you

Page 31: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Practical course - outline

• Collection of data from original literature

• Use of publicly available tools for literature and chemical structure lookup

• Introduction of data to OCHEM — single record

• Collection of data from benchmark literature

• Introduction of data to OCHEM — batch upload

Page 32: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Practical course – collection of data – before we start

# | Article name | PubMedID | Compound name | Value
1 | | | |
2 | | | |
3 | | | |
4 | | | |
5 | | | |

Page 33: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Practical course – collection of data

The goal: collect data on CYP450 1A2 inhibitors and noninhibitors

Cytochrome P450 (abbreviated CYP, P450, CYP450) is a very large and diverse superfamily of hemoproteins found in all domains of life. © Wikipedia

PubMed search terms: CYP1A2 inhibition
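The same PubMed search can also be run programmatically. The sketch below uses the public NCBI E-utilities `esearch` endpoint, which is not part of the course material; the function names are invented for the example, and the network call is kept in a separate function so that the URL can be inspected without contacting the server.

```python
import json
import urllib.parse
import urllib.request

# NCBI E-utilities search endpoint (an assumption: the course itself uses the PubMed web page).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(term, max_results=5):
    """Build an esearch URL that asks PubMed for article IDs in JSON form."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "term": term, "retmax": max_results, "retmode": "json"}
    )
    return f"{ESEARCH}?{query}"


def search_pubmed(term, max_results=5):
    """Return a list of PubMed IDs (as strings) for the search term. Needs network access."""
    with urllib.request.urlopen(build_search_url(term, max_results), timeout=30) as resp:
        data = json.load(resp)
    return data["esearchresult"]["idlist"]


if __name__ == "__main__":
    # Only prints the request URL; call search_pubmed("CYP1A2 inhibition") to run the query.
    print(build_search_url("CYP1A2 inhibition"))
```

The returned IDs are the PubMedID values that go into the data collection table.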

Page 34: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Practical course – data collection

# | Article name | PubMedID | Compound name | CYP Modulation
1 | Chemical genomics of cancer chemopreventive dithiolethiones | 19126641 | 3H-1,2-dithiole-3-thione | Inhibitor
  | | | 4-methyl-5-pyrazinyl-3H-1,2-dithiole-3-thione | Noninhibitor
  | | | 5-tert-butyl-3H-1,2-dithiole-3-thione | Noninhibitor
2 | Comprehensive in vitro analysis of voriconazole inhibition of eight cytochrome P450 (CYP) enzymes: major effect on CYPs 2B6, 2C9, 2C19, and 3A | 19029318 | Voriconazole | Noninhibitor
3 | Involvement of CYP1A2 in mexiletine metabolism | 9690950 | Mexiletine | Inhibitor
4 | Differential inhibition of cytochrome P450 isoforms by the protease inhibitors, ritonavir, saquinavir and indinavir | 9278209 | Indinavir | Noninhibitor
5 | An evaluation of potential mechanism-based inactivation of human drug metabolizing cytochromes P450 by monoamine oxidase inhibitors, including isoniazid. | 16669850 | Clorgyline | Inhibitor

Page 35: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Practical course – data introduction – cheat sheet

Good chemistry lookup engine: PubChem (find URL in Google.com)

We search by name, and want to get structure

Convenient structure representation - SMILES

Property: CYP450 Modulation

Condition: CYP450 Type = CYP1A2
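The name-to-structure lookup can be scripted as well. This sketch assumes the PubChem PUG REST interface, which the slides do not mention; the function names are invented, and the property name `CanonicalSMILES` is an assumption that may need adjusting, since PubChem has revised its SMILES property names over time.

```python
import urllib.parse
import urllib.request

# Base URL of PubChem PUG REST (an assumption: the course uses the PubChem web page).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_smiles_url(name):
    """Build a PUG REST URL that returns SMILES as plain text for a compound name."""
    encoded = urllib.parse.quote(name, safe="")  # names may contain commas or spaces
    return f"{PUG_REST}/compound/name/{encoded}/property/CanonicalSMILES/TXT"


def name_to_smiles(name):
    """Return the first SMILES string PubChem reports for the name, or None.

    Needs network access; urllib raises HTTPError if the name is not found.
    """
    with urllib.request.urlopen(build_smiles_url(name), timeout=30) as resp:
        lines = resp.read().decode("utf-8").split()
    return lines[0] if lines else None


if __name__ == "__main__":
    # Only prints the request URL; call name_to_smiles("Mexiletine") to query PubChem.
    print(build_smiles_url("Mexiletine"))
```

A name can match several PubChem compounds, so the returned structure should still be checked by eye before the record is saved.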

Page 36: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Practical course – batch data introduction – template

• CASRN – CAS Registry Number
• SMILES – SMILES string
• NAME – molecule name
• ARTICLEID – article identifier (PubMed or OCHEM)
• PAGE – article page
• TABLE – article table
• LINE – article line
• COMMENT – text comment
• REFERENCE – record reference
• CYP450 Modulation – value of the property
• Unit – measurement unit of the property
• Accuracy – measurement accuracy
• Interval – measurement interval
• CYP450 Type – record condition
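A batch sheet with these columns can be prepared by a short script. The sketch below writes CSV text as a stand-in; the real upload template is the spreadsheet linked on the cheat sheet, and whether OCHEM accepts CSV is an assumption. The column names come from the template slide, the two example rows from the data collection table, and the helper names are invented.

```python
import csv
import io

# Column names as listed on the template slide.
COLUMNS = [
    "CASRN", "SMILES", "NAME", "ARTICLEID", "PAGE", "TABLE", "LINE",
    "COMMENT", "REFERENCE", "CYP450 Modulation", "Unit", "Accuracy",
    "Interval", "CYP450 Type",
]


def make_row(name, article_id, modulation, cyp_type="CYP1A2", extra=None):
    """Return one template row; columns that are not supplied stay empty."""
    row = dict.fromkeys(COLUMNS, "")
    row.update({
        "NAME": name,
        "ARTICLEID": str(article_id),
        "CYP450 Modulation": modulation,
        "CYP450 Type": cyp_type,
    })
    for key, value in (extra or {}).items():
        if key not in row:
            raise KeyError(f"unknown template column: {key}")
        row[key] = value
    return row


def to_csv(rows):
    """Serialise rows to CSV text with the template header as the first line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    make_row("Voriconazole", 19029318, "Noninhibitor"),
    make_row("Mexiletine", 9690950, "Inhibitor"),
]
print(to_csv(rows))
```

SMILES and CASRN are left empty here because the slides do not give them; they would be filled in from the structure lookup.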

Page 37: Online Chemical Database with Modelling Environment

Online chemical database with modeling environment: Practical course – batch data introduction – cheat sheet

• Article URL: http://tinyurl.com/rendic
• Article title: «Summary of information on human CYP enzymes: human P450 metabolism data»
• Good chemistry lookup engine: PubChem (find URL in Google.com)
• We search by name, and want to get structure
• Convenient structure representation - SMILES
• Property: CYP450 Modulation
• Condition: CYP450 Type = CYP1A2
• Reference = 1
• ArticleID = Q1592
• Batch upload template URL: http://tinyurl.com/bu-template

Page 38: Online Chemical Database with Modelling Environment

Thank you (once more)