Presentation to ORU College of Science and Engineering - November 27, 2012
COLLECTING, CURATING, AND MODELING MELTING POINTS
Andrew Lang
Professor of Mathematics
Oral Roberts University
Open Drug Discovery for Neglected Diseases
Malaria, Schistosomiasis, Gram-positive bacteria, Breast Cancer
Drugs for neglected diseases
need to be…
cheap and…
easy to make.
docking
combinatorial library
synthesis
solvent selection
recrystallization
biological assay
solubility models
solubility data
melting point models
melting point data
The big picture
Oral Roberts University undergraduate research
Cameron Neylon, Biophysicist, RAL
David Bulger, MD/PhD Student, Tennessee
Solubility Measurements and Ugi Product Synthesis at ORU, Drexel, and RAL
Submeta ONS Award Winner, BOE Award Winner
Supervisors: Robert Stewart, Lois Ablin, Bill Collier, Joel Gaikwad, Jean-Claude Bradley, and Cameron Neylon
Lizzie Clark, Nursing Major
Lacey Condron, Chemistry Major
Samantha Gaines, Lizzie Clark, and Lacey Condron
Solubility Measurements and Solubility Modeling at ORU
Supervisors: Ken Weed, Lois Ablin
Daryl Charron, Alejandro Hernandez, Maria Hernandez, Jesse Patsolic, Matthew Wilson
Cluster Computer Construction and In-Silico Docking at ORU
Supervisor: Ken Preston
docking
combinatorial library
synthesis
solvent selection
recrystallization
biological assay
solubility models
solubility data
melting point models
melting point data
Let’s focus
Early models, before 2005, were…
…specialized
1979 Martin – disubstituted benzenes
1987 Hanson – normal alkanes
1988 Needham – normal and branched alkanes
1990 Abramowitz – non-hydrogen bonded benzenes
1991 Dearden – anilines
1993 Katritzky – aldehydes, amines, and ketones
1994 Simamora – rigid aromatic
1996 Charlton – alkanes
1996 Katritzky – pyridines
1999 Zhao – aliphatic
2001 Chickos – homologous series
2003 Bergstrom – druglike (N = 277, r² = 0.54)
In 2005…
…everything changed
MDPI - cheminformatics.org
Karthikeyan 2005: N = 4173, r² = 0.65
PHYSPROP
Clark 2005: N = 6257, r² = 0.61
Recent melting point models use these datasets…
…never reproducing r² = 0.65 (reported values range from 0.47 to 0.56)
Even though [a] melting point can be measured accurately, its prediction has been a notoriously difficult problem.
We began measuring, collecting, and curating melting points in the Fall of 2010
Jean-Claude Bradley’s Chemical Information Retrieval course at Drexel
567 curated and referenced measurements from Fall 2010 Chemical Information Retrieval course
Most popular data sources…
…chemical vendors
Alfa Aesar donates ~13,000 melting points to the public domain
ONS melting point workflow
collection
curation
modeling
validation
measurement
Collection: Open Data
Source             Data points   Curated values   Source year   Data type
Bell                      2483             1631          1995   donated-CC0
Bergstrom                  277              277          2003   open
MDPI-Karthikeyan          4450             4084          2005   open
Hughes                     287              262          2008   open
Oxford-MSDS               3217             1481          2010   open
Drugbank                   875              875          2011   open
Griffiths                 3757              278          2011   donated-CC0
Alfa Aesar               12986             8739          2011   donated-CC0
PHYSPROP                 11645             9694          2011   donated-CC0
ONS                        471              471          2012   open
27792 curated measurements for 19410 compounds
Curation is…
…lots of hard, tedious work (Jean-Claude Bradley and Antony Williams)
Antony Williams – RSC ChemSpider
Inconsistencies and SMILES problems within the “high trust level” MDPI dataset
PHYSPROP Structure Errors (Incorrect Valence): 2315 out of 43543 contained pentavalent nitrogens
PHYSPROP Errors: Structure displayed is for the neutral compound dopamine but the associated CAS Number and chemical name in the file are for the hydrobromide salt.
Common errors
unit errors: Kelvin/Celsius, Fahrenheit/Celsius
bad SMILES (non-rendering, hypervalency)
salts associated with SMILES for free base
using boiling point for melting point
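One way the unit errors listed above could be caught automatically is sketched here. This is an illustration only, not the curation procedure used in the project: the function names, the 1 °C tolerance, and the idea of comparing pairs of values reported for the same compound are all assumptions.

```python
from itertools import combinations


def kelvin_from_celsius(c):
    return c + 273.15


def celsius_from_fahrenheit(f):
    return (f - 32.0) * 5.0 / 9.0


def suspect_unit_errors(values_c, tol=1.0):
    """Flag pairs of melting points (all claimed to be in Celsius, same
    compound) whose disagreement matches a unit mix-up.

    Returns a list of (low, high, reason) tuples. `tol` is an assumed
    tolerance in degrees Celsius, not a value taken from the slides.
    """
    flags = []
    for a, b in combinations(values_c, 2):
        if abs(a - b) <= tol:
            continue  # the two sources agree; nothing to flag
        low, high = min(a, b), max(a, b)
        # A Kelvin value recorded as Celsius is 273.15 too high.
        if abs(high - kelvin_from_celsius(low)) <= tol:
            flags.append((low, high, "Kelvin/Celsius"))
        # A Fahrenheit value recorded as Celsius converts back to the other value.
        elif (abs(celsius_from_fahrenheit(a) - b) <= tol
              or abs(celsius_from_fahrenheit(b) - a) <= tol):
            flags.append((low, high, "Fahrenheit/Celsius"))
    return flags
```

Calling `suspect_unit_errors` on the values collected for one compound returns only the pairs worth a manual look. Pairs that disagree for other reasons (bad structure, boiling point entered as melting point) are left to the other curation checks.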
Some melting points can’t be resolved only with literature: 4-benzyltoluene
Open lab notebook page measuring the melting point of 4-benzyltoluene
Modeling – All Data
Melting Point Model
CDK: descriptor calculator
R: statistical computing
melting point data
MP Model: N = 19515, r² = 0.80
use this model
Modeling – Highly Curated Subset
compounds
doubleplusgood
single
CDK: descriptor calculator
R: statistical computing
data
Melting Point Model
MP Model: N = 2704, r² = 0.83
Straight chain carboxylic acids from 1 to 10 carbons
Straight chain alcohols from 1 to 10 carbons
Comparison of model with double+ validated measurements
Cyclic primary amines from 3 to 6 carbons: cyclobutylamine flagged for measurement (only a single source available)
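The split between double+ validated and single-source compounds can be sketched as below. The slides do not state the exact criterion, so this is a guess at its shape: it assumes "double+" means at least two measurements from different sources that agree within a tolerance, and the 5 °C tolerance, the "conflict" label, and the function name are hypothetical.

```python
from collections import defaultdict


def classify_compounds(measurements, tol=5.0):
    """Group (compound, source, mp_celsius) records and label each compound.

    "double+"  : two or more sources, at least one cross-source pair within tol
    "single"   : only one source available
    "conflict" : several sources, but none agree within tol (hypothetical label)
    """
    by_compound = defaultdict(list)
    for compound, source, mp in measurements:
        by_compound[compound].append((source, mp))

    labels = {}
    for compound, rows in by_compound.items():
        sources = {source for source, _ in rows}
        if len(sources) < 2:
            labels[compound] = "single"
            continue
        # Look for any pair of values from different sources that agree.
        agree = any(
            s1 != s2 and abs(m1 - m2) <= tol
            for i, (s1, m1) in enumerate(rows)
            for s2, m2 in rows[i + 1:]
        )
        labels[compound] = "double+" if agree else "conflict"
    return labels
```

Under this reading, a compound labeled "single" (as cyclobutylamine was) is a candidate for a new measurement, and only "double+" compounds would enter the highly curated modeling subset.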
Publication of double+ validated melting point dataset
…as a preprint
Publication of double+ validated melting point dataset
…as a book
Data and model deployed…
…on the web
web service
…in Google spreadsheets
…as an app
Use case: recrystallizing dibenzalacetone
Can the solvents used to recrystallize compounds in organic teaching labs be improved?
Trans-dibenzalacetone
Aldol condensation between two molecules of benzaldehyde and one molecule of acetone
[Matthew McBride: Undergraduate Research Assistant - Drexel]
Dibenzalacetone
First recrystallized in ethyl acetate in 1906: Straus and Ecker, Ber. 39, 2988 (1906)
Recrystallized in ethyl acetate in Organic Syntheses
Organic Teaching Labs
Recommended recrystallization solvent: ethyl acetate.
(http://classes.kvcc.edu/chm230/mixed%20aldol%20condensation.pdf)
(http://www.xula.edu/chemistry/documents/orgleclab/Aldol_notes.pdf)
Recrystallization App
Enter compound identification and desired parameters
How does it work?
1. Look up the solvent boiling point
2. Look up the room temperature solubility or predict it via measured or predicted Abraham descriptors
3. Look up the solute melting point or predict it via a model
4. Use the melting point and the solubility at room temperature to predict the solubility at boiling
5. Calculate the predicted recrystallization yield
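Steps 4 and 5 can be sketched as follows. The slides do not give the equations the app uses, so the model here is an assumption: ln(mole-fraction solubility) is taken to be linear in 1/T, passing through the room-temperature solubility and reaching full miscibility (mole fraction 1) at the solute melting point. The yield is the fraction of solute that comes back out when a solution saturated at the solvent boiling point is cooled to room temperature. Function and parameter names are hypothetical.

```python
import math


def mole_fraction_solubility(T, x_room, T_melt, T_room=298.15):
    """Assumed solubility curve: ln(x) linear in 1/T (temperatures in kelvin).

    Passes through (T_room, x_room) and (T_melt, 1.0).
    """
    if not 0.0 < x_room < 1.0:
        raise ValueError("x_room must be a mole fraction between 0 and 1")
    if T_melt <= T_room:
        raise ValueError("solute must be a solid at room temperature")
    fraction = (1.0 / T - 1.0 / T_melt) / (1.0 / T_room - 1.0 / T_melt)
    return math.exp(math.log(x_room) * fraction)


def predicted_yield(x_room, T_melt, T_boil, T_room=298.15):
    """Fraction of solute recovered on cooling from T_boil to T_room."""
    if T_boil >= T_melt:
        raise ValueError("solvent boils at or above the solute melting point")
    x_hot = mole_fraction_solubility(T_boil, x_room, T_melt, T_room)
    # Moles of solute dissolved per mole of solvent, cold and hot.
    cold = x_room / (1.0 - x_room)
    hot = x_hot / (1.0 - x_hot)
    return 1.0 - cold / hot
```

Ranking solvents then amounts to calling `predicted_yield` once per solvent with that solvent's boiling point and the solute's room-temperature solubility in it. The app reports concentrations in mol/L, so a real implementation would also need the mole-fraction to molarity conversion, which is omitted here.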
Results
Lists solvents and their predicted recrystallization yields.
Predictions are generated from the temperature-dependent solubility curves.
Comparison: ethyl acetate (predicted yield of 72%) vs ethanol (predicted yield of 93%)
[Figure: solubility curves for ethyl acetate and ethanol; concentrations marked on the plot: 0.09 M, 1.1 M, 0.62 M, 2.06 M]
Dibenzalacetone derivatives docking against tubulin (paclitaxel site)
Example
Derivatives of dibenzalacetone may be synthesized by altering the aldehyde used
From a library of derivatives, the following compound was the top hit for the docking site of Taxol
Uses phenanthrene-9-carboxaldehyde
Search Literature
Perform a Reaxys search to determine availability of synthesis procedures
No results
Synthesis and recrystallization solvents chosen using ONS models
Used methanol and benzene
Melting Point: 264-265°C
(http://usefulchem.wikispaces.com/EXP286)
Acknowledgements
ORU Biology and Chemistry Faculty
Jean-Claude Bradley (Drexel)
Cameron Neylon (RAL)
Antony Williams (RSC ChemSpider)
Evan Curtin (Drexel)
Matthew McBride (Drexel)
ORU research assistants: David Bulger, Daryl Charron, Lizzie Clark, Lacey Condron, Samantha Gaines, Alejandro Hernandez, Maria Hernandez, Jesse Patsolic, and Matthew Wilson