Lexichem, a New Era
Dr Ed Cannon
Scientific Software Developer
Overview
Lexichem TK v2.1.1, released February 2012
Real world applications of Lexichem
Performance metric
New features
Lexichem Workbench
5/23/12 2012 OpenEye Scientific Software
Start by talking about some real-world applications of Lexichem: what you can do with it and who is using it.
Then move on to Lexichem TK v2.1.1, the current release.
Metric to assess how well Lexichem is performing.
New features since v2.0.2.
Finish off with the GUI.
Lexichem
Now let's talk about the main reason we're here: Lexichem.
Lexichem
Supported Nomenclature
IUPAC 79 / 93 / 2005
Chemical Abstract / CAS
Traditional
MDL / Beilstein AutoNom
OpenEye
Supported Languages (17)
English (American / British)
German
Japanese
Spanish
Swedish
Lexichem is OpenEye's chemical nomenclature software. You can:
-> convert names to molecules
-> convert molecules to names
-> translate names from one language to another
Lexichem comes in two flavours.
Command Line Applications
Glycinate
Standalone applications run from the command line.
Lexichem TK
For those who want to program against Lexichem: Lexichem TK is written in C++ and wrapped with SWIG (Simplified Wrapper and Interface Generator) for Python, Java and C#.
Applications
So what can Lexichem do, other than help you buy heroin?
Pipelines
Large scale conversion of structures to names and names to structures
Easy integration in workflows
Keith Taylor showed yesterday a Pipeline Pilot workflow with a node for converting structures to names.
Workflow integration with Pipeline Pilot.
Each node uses one of Lexichem's functions -> mol2nam, nam2mol, translate.
Matt Stahl is working on nodes that use OpenEye software.
The Lexichem node, highlighted by the square, converts structures to names.
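To make the idea of a conversion node concrete, here is a minimal Python sketch of a streaming batch converter. It is not Pipeline Pilot code and does not call Lexichem: `convert_stream` and the toy `toy_mol2nam` lookup are hypothetical names introduced for illustration, and any real structure-to-name or name-to-structure function could be passed in their place.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple


def convert_stream(
    records: Iterable[str],
    convert: Callable[[str], Optional[str]],
) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (input, output) pairs one record at a time.

    `convert` is any conversion function (hypothetical here); a failed
    conversion is reported as None rather than stopping the whole run.
    """
    for record in records:
        record = record.strip()
        if not record:
            continue  # skip blank lines in the input
        try:
            result = convert(record)
        except Exception:
            result = None  # one bad record should not abort a large batch
        yield record, result


# Toy stand-in for a real structure-to-name engine (illustrative only).
_TOY_NAMES = {"CCO": "ethanol", "CC(=O)O": "acetic acid"}


def toy_mol2nam(smiles: str) -> Optional[str]:
    return _TOY_NAMES.get(smiles)


if __name__ == "__main__":
    for smiles, name in convert_stream(["CCO\n", "", "XYZ"], toy_mol2nam):
        print(smiles, "->", name if name is not None else "<no name>")
```

Because the function is a generator, it processes one record at a time and never holds the full dataset in memory, which is the property that matters for large-scale runs.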
Webservices
Mol2Nam
PUG SOAP
Molecules
Hits in PubChem
Lexichem Webservice available
Integration with 3rd party webservices
PubChem uses Lexichem
Craig Bruce, hired recently, has been developing webservices for OpenEye, one of which is Lexichem.
Use Lexichem for pre- or post-processing around a webservice.
PUG (Power User Gateway): search across PubChem for structures with the names found.
Lexiparser
Automated extraction of structures and names from documents
Supported formats: .txt .html .doc .docx .rtf .pdf FUTURE
Chemical name extraction from patents/documents and structure generation (Lexiparser uses Lexichem).
-> After Lexiparser has extracted the chemical names, it uses Lexichem to convert them to molecules.
Extracting Structures from a Patent
Patent URL
Names extracted
Generate Structures
Desktop Applications
Electronic Laboratory Notebook
NEW! Lexichem Workbench
Lexichem is the engine under the hood: when a user draws a structure, a call is made to Lexichem, which generates a name that can then be rendered. Alternatively, you can import chemical names, convert them to molecules and visualize the resulting image.
Performance Metric
Purpose of the metric:
-> ensure we are improving Lexichem, not regressing
-> identify areas / features in need of improvement
-> provide a gold standard against which other companies can compare their chemical nomenclature software
Why a New Performance Metric?
Ensures consistent improvement of Lexichem
Identify areas in need of development
Gold standard for all chemical nomenclature software
-> Concept: take a start point and an end point after some processing, then compare the start point to the end point.
+ve: quick to calculate; gives a single figure for how accurate Lexichem is on a dataset.
-> Paper accepted.
Advantages of SMILES: human readable, less verbose than InChI, tautomer support.
Round Tripping
*E. O. Cannon. JCIM 2012, DOI: 10.1021/ci3000419
Compare the initial and final structure after name generation
Easy to calculate
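As a rough illustration of the round-trip idea, the sketch below counts how many structures come back unchanged after structure -> name -> structure. It is an assumption-laden toy, not the published method: `round_trip_rate`, `toy_mol2nam` and `toy_nam2mol` are hypothetical names, the comparison by (canonical) SMILES string is inferred from the notes on SMILES, and the lookup tables stand in for a real nomenclature engine.

```python
from typing import Callable, Iterable, Optional


def round_trip_rate(
    smiles_list: Iterable[str],
    mol2nam: Callable[[str], Optional[str]],
    nam2mol: Callable[[str], Optional[str]],
    canonicalize: Callable[[str], str] = lambda s: s,
) -> float:
    """Percentage of inputs whose regenerated structure matches the original.

    `canonicalize` defaults to the identity; with a real toolkit it would
    be a canonical SMILES writer (assumption, not shown here).
    """
    total = 0
    matched = 0
    for smiles in smiles_list:
        total += 1
        name = mol2nam(smiles)
        if name is None:
            continue  # no name generated: counts as a failed round trip
        regenerated = nam2mol(name)
        if regenerated is None:
            continue  # name could not be parsed back: also a failure
        if canonicalize(regenerated) == canonicalize(smiles):
            matched += 1
    return 100.0 * matched / total if total else 0.0


# Toy stand-ins for a real engine (illustrative only).
_NAMES = {"CCO": "ethanol", "CC(=O)O": "acetic acid", "c1ccccc1": "benzene"}
_STRUCTURES = {"ethanol": "CCO", "acetic acid": "CC(=O)O"}  # "benzene" left out on purpose


def toy_mol2nam(smiles: str) -> Optional[str]:
    return _NAMES.get(smiles)


def toy_nam2mol(name: str) -> Optional[str]:
    return _STRUCTURES.get(name)


if __name__ == "__main__":
    rate = round_trip_rate(["CCO", "CC(=O)O", "c1ccccc1"], toy_mol2nam, toy_nam2mol)
    print(f"{rate:.1f}% of structures round-tripped")
```

In the toy data two of the three structures survive the round trip, so the function returns a single percentage for the whole dataset, which is what makes the metric quick to calculate and easy to track between releases.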
Results
The concept of a benchmark is good, but do we have good results using it?
Performance
Speed
%RTCS
Whilst these results are good, is Lexichem feasible on a large scale?
We have seen that Lexichem performs well and is feasible on large datasets; now let's look at what features have been added.
----- Meeting Notes (3/28/12 11:38) -----
Mol2Nam -> canonicalize atoms & bonds, identify atom types, identify ring systems and size, bridges, locants and positions, identify stereo / walk the graph
New Features
So what have we added since v2.0.2?
Our main drive has been nam2mol features (as they're not quite as well supported as mol2nam), in particular the ability to generate molecules for large ring systems, one of Lexichem's weaker points in previous releases.
Nam2mol New Features
von Baeyer
Von Baeyer -> polyalicyclic ring systems; previously only bicyclic systems were supported.
-> Working hard on augmenting natural products.
Beta-carotene in carrots -> provitamin A carotenoid
Nam2mol New Features
von Baeyer
Poly-linear von Baeyer Spiro Compounds
Poly-branched von Baeyer Spiro Compounds
Steroids
Alkaloids
Terpenes
L/D-amino acids (L-Arginine, D-Arginine)
R-groups
Mol2Nam New Templates
Ring templates
5,6,6a,7-tetrahydro-4H-dibenzo[de,g]quinoline
Yohimban
Benzo[cd]indole
Octahydro-1H-4,7-epoxyisoindole
Ring templates for conversion of molecules to names.
Mainly bridged and fused ring templates have been added.
Lexichem Workbench
The primary goal of the GUI was to:
-> lower the bar to using Lexichem's functionality (for people not keen on programming against an API or using the command line)
Lexichem Workbench
Graphical user interface for Lexichem
Features:
Home page
LexiWeb
LexiParser
Nam2Mol
Mol2Nam
Translate
Results
Main Window
Converts input SMILES string or chemical name
Visual display of the structure
Chemical information:
Molecular weight
SMILES
IUPAC name
-> Primarily modeled on the command line tools, but provides numerous additional features.
2 menus: open a molecular file; import a name or SMILES, set default options, clear the view.
Results
Results history
Original input on display
Results
Text options:
Copy selected cells
Save table
Display selection
Results
Display options:
Save
Copy
Print
Substructure Search
Options:
Functional group from list
Custom SMILES/SMARTS pattern
Custom name
Filter the results
Conclusions
Large number of new features available in Lexichem TK v2.1.1
New! Performance metric
New! Lexichem Workbench desktop application available
That is the story of Lexichem. If you know you want it, please feel free to contact us.
Future work: continue to work on fused polycyclic ring systems and natural products.
OpenEye Scientific Software
For more information, please contact us:
[email protected]@[email protected]
+1-505-473-7385 (USA)
+81-3-6206-1425 (Japan)