AACIMP 2009 Summer School lecture by Yuriy Sushko and Sergii Novotarskyi. "Environmental Chemoinformatics" course.
“Online chemical database with modeling environment”
A summer school course
Sergii Novotarskyi, Iurii Sushko
Chemoinformatics – overview of online resources
Chemical databases
1. PubChem: a database that provides information on the biological activities of small molecules
2. ChemSpider: a free-access service providing a structure-centric community for chemists
3. ChemIDplus: a tool that provides chemical structure, property, and toxicity searching
4. ChemBank: a database of chemical structures and assays
5. ChemDB: a set of chemoinformatics tools
Literature databases
6. PubMed: a service that includes over 19 million citations from MEDLINE and other life science journals for biomedical articles back to 1948
7. Toxicology Literature Online (TOXLINE): references from the toxicology literature
8. ScienceDirect: a full-text scientific database offering articles and chapters from more than 2,500 peer-reviewed journals and more than 10,000 books
9. ACS Publications: a worldwide scientific community with a collection of the most cited peer-reviewed journals in the chemical and related sciences
PubChem – start page
URL: http://pubchem.ncbi.nlm.nih.gov/ or search the web for «PubChem»
PubChem – search results
PubChem – compound details
PubChem – bioassay search results
ChemSpider – start page
URL: http://www.chemspider.com/ or search the web for «ChemSpider»
ChemSpider – search results
ChemIDplus – main page
URL: http://chem.sis.nlm.nih.gov/chemidplus/ or search the web for «ChemIDplus»
ChemIDplus – search results
ChemBank – main page
URL: http://chembank.broadinstitute.org/ or search the web for «ChemBank»
ChemBank – search results
ChemDB – main page
URL: http://cdb.ics.uci.edu/ or search the web for «ChemDB»
ChemDB – search results
PubMed – main page
URL: http://www.ncbi.nlm.nih.gov/pubmed/ or search the web for «PubMed»
Online chemical database with modeling environment
The subject of development
The web-based service
The database of physical, chemical and biological properties
• Accumulating experimentally verified data
• Providing user-friendly web-based access to this data
The QSPR modeling environment
• Providing web-based tools for QSPR modeling
• Storing and “publishing” created models
Motivation
Our motivation
The importance of QSPR modeling
The importance of web-based tools for QSPR modeling
The importance of building one more service in this field
Motivation - QSPR
Structure-property relationship hypothesis: “Similar structures - similar properties”
QSPR modeling: predicting properties based on available data for structurally similar molecules. Structures are represented by a set of descriptors (atom count, molecular weight).
[Figure: structures with log(IC50) = 0.64 log(µM), log(IC50) = 1.87 log(µM) and log(IC50) = 1.87 log(µM), and one structure with log(IC50) = ?]
QSPR – Similarity in descriptor space
[Figure: axis label “Number of specific fragments in a molecule”]
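The idea of similarity in descriptor space can be sketched as a nearest-neighbour prediction. This is only an illustration, not the modeling method used by the service: the descriptor values (atom count, molecular weight) are hypothetical, the property values echo the log(IC50) numbers from the previous slide, and the names `training`, `scale`, `normalize` and `predict` are made up for this sketch.

```python
import math

# Hypothetical training set: (atom count, molecular weight) -> log(IC50).
# Descriptor values are invented for illustration only.
training = [
    ((12, 160.2), 0.64),
    ((21, 290.4), 1.87),
    ((22, 304.4), 1.87),
]


def scale(rows):
    """Return (min, range) per descriptor so each one contributes comparably."""
    columns = list(zip(*rows))
    return [(min(c), (max(c) - min(c)) or 1.0) for c in columns]


def normalize(vector, scales):
    """Map each descriptor to the 0..1 range seen in the training set."""
    return [(v - low) / span for v, (low, span) in zip(vector, scales)]


def predict(query, training, k=2):
    """Average the property of the k nearest molecules in descriptor space."""
    scales = scale([descriptors for descriptors, _ in training])
    q = normalize(query, scales)
    ranked = sorted(
        training,
        key=lambda item: math.dist(q, normalize(item[0], scales)),
    )
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


# A query molecule close to the second and third training molecules.
prediction = predict((20, 280.3), training)
print(prediction)
```

The query sits next to the two molecules with log(IC50) = 1.87, so the prediction is their average. Real QSPR models use many more descriptors and more careful validation.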
Motivation - web-based tools for modeling
Main benefits of web-based tools:
Availability and accessibility: only a computer with Internet access and a modern web browser is required to start working; work materials can be shared among several locations; works on any platform (Linux, Windows, Mac)
Communication and collaboration: users can work on common topics, publish their own results and use new results from other people
Motivation - one more web-based tool
Reasons to build one more service:
Different approach to data modification: a completely open database; any user can add, delete and edit data (constrained only by a set of simple rules)
Different approach to data organization: data in the database is organized in a way suitable for QSPR modeling
Integration of a database with modeling tools: data from the database can be used for model creation and property prediction
Distinctive features
The features that make our service different:
“Wiki” approach to data handling: users can add, modify and delete data
Mandatory reference to an article: every record in the database should contain a reference to the article where the data was published
Storing additional information: we store measurement conditions to increase data quality
Several tools to support decision making: integration with other web services (validation of molecule names against the PubChem database, automatic fetching of article information from PubMed), duplicate record management
Aimed at model building: training sets are convenient to build from the data; filter by property or article, then export the data either to the internal modeling tools or as a downloadable Excel file
Data structure
Simplified data structure
Entities: Records, Properties, Molecules, Articles, Units, Journals, Conditions, Users
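The entities named on this slide could be modeled roughly as below. This is a sketch under assumptions: the slide only names the entities, so every field name and relationship here is hypothetical, except that a record references an article and stores measurement conditions, as stated under the distinctive features. The user `student1` is invented; the two example records come from the practical course table.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Journal:
    title: str


@dataclass
class Article:
    title: str
    pubmed_id: Optional[int] = None
    journal: Optional[Journal] = None


@dataclass
class Unit:
    name: str


@dataclass
class Property:
    name: str
    unit: Optional[Unit] = None


@dataclass
class Molecule:
    name: str
    smiles: Optional[str] = None


@dataclass
class User:
    login: str


@dataclass
class Record:
    molecule: Molecule
    prop: Property
    value: str
    article: Article  # every record cites the article that published the data
    introducer: User
    conditions: Dict[str, str] = field(default_factory=dict)


def select(records: List[Record], property_name: str,
           conditions: Dict[str, str]) -> List[Record]:
    """Keep records of one property measured under the given conditions."""
    return [
        r for r in records
        if r.prop.name == property_name
        and all(r.conditions.get(k) == v for k, v in conditions.items())
    ]


modulation = Property("CYP450 Modulation")
student = User("student1")  # hypothetical user
records = [
    Record(Molecule("Mexiletine"), modulation, "Inhibitor",
           Article("Involvement of CYP1A2 in mexiletine metabolism", 9690950),
           student, {"CYP450 Type": "CYP1A2"}),
    Record(Molecule("Indinavir"), modulation, "Noninhibitor",
           Article("Differential inhibition of cytochrome P450 isoforms by "
                   "the protease inhibitors, ritonavir, saquinavir and "
                   "indinavir", 9278209),
           student, {"CYP450 Type": "CYP1A2"}),
]
training = select(records, "CYP450 Modulation", {"CYP450 Type": "CYP1A2"})
print([r.molecule.name for r in training])
```

Filtering by property and condition mirrors how a training set would be assembled before export to modeling tools.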
User interface agreements
Browser-based interface
Icons
Edit current record (item, article, unit, etc.)
Delete current record
In most places: open the record-specific submenu; in some places: view the profile
Open a wiki page with additional explanations
Send a message to the user
Download data in XLS format
Select item
Summary
The database currently contains:
More than 50,000 records
Around 285 properties
More than 2700 articles
Thank you
Practical course - outline
• Collection of data from original literature
• Use of publicly available tools for literature and chemical structure lookup
• Introduction of data to OCHEM: single record
• Collection of data from benchmark literature
• Introduction of data to OCHEM: batch upload
Practical course – collection of data – before we start
# | Article name | PubMedID | Compound name | Value
1 | | | |
2 | | | |
3 | | | |
4 | | | |
5 | | | |
Practical course – collection of data
The goal: collect data on CYP450 1A2 inhibitors and noninhibitors
Cytochrome P450 (abbreviated CYP, P450, CYP450) is a very large and diverse superfamily of hemoproteins found in all domains of life. © Wikipedia
PubMed search terms: CYP1A2 inhibition
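The same PubMed search can also be scripted. The slides use the PubMed web page; the sketch below assumes the NCBI E-utilities `esearch` endpoint instead, which is not mentioned in the course. The helper names `esearch_url` and `search_pubmed` are hypothetical, and only the URL construction runs here because the actual query needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities search endpoint (an assumption; the course uses the web UI).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, retmax=20):
    """Build the search URL for a PubMed query."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


def search_pubmed(term, retmax=20):
    """Return the PubMed IDs matching the query (requires network access)."""
    with urlopen(esearch_url(term, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


url = esearch_url("CYP1A2 inhibition")
print(url)
```

Calling `search_pubmed("CYP1A2 inhibition")` would return a list of PubMed IDs to screen by hand, which is the step the worksheet above records.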
Practical course – data collection
# | Article name | PubMedID | Compound name | CYP Modulation
1 | Chemical genomics of cancer chemopreventive dithiolethiones | 19126641 | 3H-1,2-dithiole-3-thione | Inhibitor
1 | (same article) | 19126641 | 4-methyl-5-pyrazinyl-3H-1,2-dithiole-3-thione | Noninhibitor
1 | (same article) | 19126641 | 5-tert-butyl-3H-1,2-dithiole-3-thione | Noninhibitor
2 | Comprehensive in vitro analysis of voriconazole inhibition of eight cytochrome P450 (CYP) enzymes: major effect on CYPs 2B6, 2C9, 2C19, and 3A | 19029318 | Voriconazole | Noninhibitor
3 | Involvement of CYP1A2 in mexiletine metabolism | 9690950 | Mexiletine | Inhibitor
4 | Differential inhibition of cytochrome P450 isoforms by the protease inhibitors, ritonavir, saquinavir and indinavir | 9278209 | Indinavir | Noninhibitor
5 | An evaluation of potential mechanism-based inactivation of human drug metabolizing cytochromes P450 by monoamine oxidase inhibitors, including isoniazid. | 16669850 | Clorgyline | Inhibitor
Practical course – data introduction – cheat sheet
Good chemistry lookup engine: PubChem (find the URL via Google)
We search by name and want to get the structure
Convenient structure representation: SMILES
Property: CYP450 Modulation
Condition: CYP450 Type = CYP1A2
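The name-to-structure lookup can be sketched against PubChem's PUG REST interface. The course itself uses the PubChem web page, so this is an assumption-laden illustration: the helper names `smiles_url` and `lookup_smiles` are hypothetical, and the property name `CanonicalSMILES` should be checked against the current PUG REST documentation before use. Only the URL construction runs here; the lookup itself needs network access.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base address of PubChem PUG REST (assumed here; the slides use the web UI).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def smiles_url(name, smiles_property="CanonicalSMILES"):
    """Build a name-based property request that returns plain text."""
    encoded = quote(name, safe="")  # escape commas, spaces and slashes
    return f"{PUG_REST}/compound/name/{encoded}/property/{smiles_property}/TXT"


def lookup_smiles(name):
    """Return the SMILES strings PubChem reports for a name (needs network)."""
    with urlopen(smiles_url(name)) as response:
        return response.read().decode("utf-8").split()


url = smiles_url("Mexiletine")
print(url)
```

A name may match several compounds, so `lookup_smiles` returns a list and the right structure still has to be chosen by a person.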
Practical course – batch data introduction – template
• CASRN: CAS Registry Number
• SMILES: SMILES string
• NAME: molecule name
• ARTICLEID: article identifier (PubMed or OCHEM)
• PAGE: article page
• TABLE: article table
• LINE: article line
• COMMENT: text comment
• REFERENCE: record reference
• CYP450 Modulation: value of the property
• Unit: measurement unit of the property
• Accuracy: measurement accuracy
• Interval: measurement interval
• CYP450 Type: record condition
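The template columns can be filled programmatically. The sketch below writes rows as CSV purely for illustration; the actual file format of the batch upload template is not stated in the slides, so treat CSV as an assumption. The column names come from the template list, the two rows come from the data collection table, and `to_csv` is a hypothetical helper.

```python
import csv
import io

# Column names taken from the batch upload template slide.
COLUMNS = [
    "CASRN", "SMILES", "NAME", "ARTICLEID", "PAGE", "TABLE", "LINE",
    "COMMENT", "REFERENCE", "CYP450 Modulation", "Unit", "Accuracy",
    "Interval", "CYP450 Type",
]

# Two rows from the data collection table; unknown fields are left empty.
rows = [
    {"NAME": "Voriconazole", "ARTICLEID": "19029318",
     "CYP450 Modulation": "Noninhibitor", "CYP450 Type": "CYP1A2"},
    {"NAME": "Mexiletine", "ARTICLEID": "9690950",
     "CYP450 Modulation": "Inhibitor", "CYP450 Type": "CYP1A2"},
]


def to_csv(rows):
    """Serialize rows in template column order, leaving missing cells blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sheet = to_csv(rows)
print(sheet)
```

Keeping the column order in one list makes it harder to misalign values when many records are prepared at once.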
Practical course – batch data introduction – cheat sheet
• Article URL: http://tinyurl.com/rendic
• Article title: «Summary of information on human CYP enzymes: human P450 metabolism data»
• Good chemistry lookup engine: PubChem (find URL in Google.com)
• We search by name, and want to get structure
• Convenient structure representation - SMILES
• Property: CYP450 Modulation
• Condition: CYP450 Type = CYP1A2
• Reference = 1
• ArticleID = Q1592
• Batch upload template URL: http://tinyurl.com/bu-template
Thank you (once more)