COLLECTING, CURATING, AND MODELING MELTING POINTS Andrew Lang Professor of Mathematics Oral Roberts University

Presentation to ORU College of Science and Engineering - November 27, 2012


Page 1: modeling melting points

COLLECTING, CURATING, AND MODELING MELTING POINTS

Andrew Lang

Professor of Mathematics

Oral Roberts University

Page 2: modeling melting points
Page 3: modeling melting points

Open Drug Discovery for Neglected Diseases

Malaria, Schistosomiasis, Gram-positive bacteria, Breast Cancer

Page 4: modeling melting points

Drugs for neglected diseases

need to be…

Page 5: modeling melting points

cheap and…

Page 6: modeling melting points

easy to make.

Page 7: modeling melting points

docking

combinatorial library

synthesis

solvent selection

recrystallization

biological assay

solubility models

solubility data

melting point models

melting point data

The big picture

Page 8: modeling melting points

docking

combinatorial library

synthesis

solvent selection

recrystallization

biological assay

solubility models

solubility data

melting point models

melting point data

Oral Roberts University undergraduate research

Page 9: modeling melting points

Cameron Neylon, Biophysicist, RAL

David Bulger, MD/PhD Student, Tennessee

Solubility Measurements and Ugi Product Synthesis at ORU, Drexel, and RAL

Submeta ONS Award Winner, BOE Award Winner

Supervisors: Robert Stewart, Lois Ablin, Bill Collier, Joel Gaikwad, Jean-Claude Bradley, and Cameron Neylon

Page 10: modeling melting points

Lizzie Clark, Nursing Major

Lacey Condron, Chemistry Major

Samantha Gaines, Lizzie Clark, and Lacey Condron

Solubility Measurements and Solubility Modeling at ORU

Supervisors: Ken Weed, Lois Ablin

Page 11: modeling melting points

Daryl Charron, Alejandro Hernandez, Maria Hernandez, Jesse Patsolic, Matthew Wilson

Cluster Computer Construction and In-Silico Docking at ORU

Supervisor: Ken Preston

Page 12: modeling melting points

docking

combinatorial library

synthesis

solvent selection

recrystallization

biological assay

solubility models

solubility data

melting point models

melting point data

Let’s focus

Page 13: modeling melting points

Early models, before 2005, were…

Page 14: modeling melting points

…specialized

1979 Martin – disubstituted benzenes
1987 Hanson – normal alkanes
1988 Needham – normal and branched alkanes
1990 Abramowitz – non-hydrogen bonded benzenes
1991 Dearden – anilines
1993 Katritzky – aldehydes, amines, and ketones
1994 Simamora – rigid aromatic
1996 Charlton – alkanes
1996 Katritzky – pyridines
1999 Zhao – aliphatic
2001 Chickos – homologous series
2003 Bergstrom – druglike (N = 277, r² = 0.54)

Page 15: modeling melting points

In 2005…

…everything changed

Page 16: modeling melting points

MDPI - cheminformatics.org

Karthikeyan 2005: N = 4173, r² = 0.65

Page 17: modeling melting points

PHYSPROP

Clark 2005: N = 6257, r² = 0.61

Page 18: modeling melting points

Recent melting point models use these datasets…

…never reproducing r² = 0.65 (reported values: 0.47 – 0.56)

Page 19: modeling melting points

Even though [a] melting point can be measured accurately, its prediction has been a notoriously difficult problem.

Page 20: modeling melting points

We began measuring, collecting, and curating melting points in the Fall of 2010

Page 21: modeling melting points

Jean-Claude Bradley’s Chemical Information Retrieval Course at Drexel

567 curated and referenced measurements from Fall 2010 Chemical Information Retrieval course

Page 22: modeling melting points

Most popular data sources…

…chemical vendors

Page 23: modeling melting points

Alfa Aesar donates ~13,000 melting points to the public domain

Page 24: modeling melting points

ONS melting point workflow

collection

curation

modeling

validation

measurement

Page 25: modeling melting points

Collection: Open Data

source             data points   curated values   source year   data type
Bell                      2483             1631          1995   donated-CC0
Bergstrom                  277              277          2003   open
MDPI-Karthikeyan          4450             4084          2005   open
Hughes                     287              262          2008   open
Oxford-MSDS               3217             1481          2010   open
Drugbank                   875              875          2011   open
Griffiths                 3757              278          2011   donated-CC0
Alfa Aesar               12986             8739          2011   donated-CC0
PHYSPROP                 11645             9694          2011   donated-CC0
ONS                        471              471          2012   open

27792 curated measurements for 19410 compounds

Page 26: modeling melting points

Curation is…

…lots of hard, tedious work (Jean-Claude Bradley and Antony Williams)

Antony Williams – RSC ChemSpider

Page 27: modeling melting points

Inconsistencies and SMILES problems within the “high trust level” MDPI dataset

Page 28: modeling melting points

PHYSPROP Structure Errors (Incorrect Valence)

2315 out of 43543 contained pentavalent nitrogens

Page 29: modeling melting points

PHYSPROP Errors: Structure displayed is for the neutral compound dopamine, but the associated CAS Number and chemical name in the file are for the hydrobromide salt.

Page 30: modeling melting points

Common errors

unit errors: Kelvin/Celsius, Fahrenheit/Celsius

bad SMILES (non-rendering, hypervalency)

salts associated with SMILES for free base

using boiling point for melting point
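
Unit errors of the kind listed above leave a recognizable signature when a compound has more than one reported value: two values that differ by about 273.15, or that are related by the Celsius-to-Fahrenheit conversion. A minimal Python sketch of such a check follows. The function names, the 1-degree tolerance, and the sample values are illustrative assumptions, not the curation tooling actually used for this dataset.

```python
from itertools import combinations

KELVIN_OFFSET = 273.15


def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9.0 / 5.0 + 32.0


def suspect_unit_error(a, b, tol=1.0):
    """Compare two values that are both nominally in Celsius.

    Returns a label when the pair looks like one melting point reported
    in two different units, otherwise None. `tol` is an assumed tolerance.
    """
    if abs(abs(a - b) - KELVIN_OFFSET) <= tol:
        return "kelvin/celsius"
    if abs(celsius_to_fahrenheit(a) - b) <= tol or abs(celsius_to_fahrenheit(b) - a) <= tol:
        return "fahrenheit/celsius"
    return None


def flag_compounds(measurements, tol=1.0):
    """measurements maps a compound name to its list of reported values.

    Returns only the compounds with at least one suspicious pair.
    """
    flags = {}
    for name, values in measurements.items():
        for a, b in combinations(values, 2):
            label = suspect_unit_error(a, b, tol)
            if label:
                flags.setdefault(name, []).append((a, b, label))
    return flags


if __name__ == "__main__":
    # Made-up values for illustration only, not real measurements.
    sample = {
        "compound A": [80.0, 353.15],
        "compound B": [100.0, 212.0],
        "compound C": [50.0, 50.5],
    }
    for name, hits in flag_compounds(sample).items():
        print(name, hits)
```

A flag from a check like this is only a prompt to go back to the original source; it cannot decide which of the two values is the correct one.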

Page 31: modeling melting points

Some melting points can’t be resolved from the literature alone: 4-benzyltoluene

Page 32: modeling melting points

Open lab notebook page measuring the melting point of 4-benzyltoluene

Page 33: modeling melting points

Modeling – All Data

melting point data

CDK descriptor calculator

R statistical computing

Melting Point Model

Page 34: modeling melting points

MP Model: N = 19515, r² = 0.80

use this model

Page 35: modeling melting points

Modeling – Highly Curated Subset

compounds

doubleplusgood

single

CDK descriptor calculator

R statistical computing

data

Melting Point Model

Page 36: modeling melting points

MP Model: N = 2704, r² = 0.83

Page 37: modeling melting points

Straight chain carboxylic acids from 1 to 10 carbons

Straight chain alcohols from 1 to 10 carbons

Comparison of model with double+ validated measurements

Page 38: modeling melting points

Cyclic primary amines from 3 to 6 carbons

Cyclobutylamine flagged for measurement: only a single source available
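
Sorting compounds by how many sources agree, and flagging the rest for measurement, can be sketched in a few lines of Python. The 5 °C agreement window, the category labels, and the function names are assumptions made for illustration; the slides do not state the exact criteria used for the double+ validated set.

```python
def validation_status(values_c, agree_within=5.0):
    """Classify a compound from its reported melting points (in Celsius).

    "single"   : only one reported value
    "double+"  : two or more values that agree within `agree_within`
    "conflict" : two or more values that disagree
    The 5 degree default window is an assumed threshold.
    """
    if not values_c:
        return "no data"
    if len(values_c) == 1:
        return "single"
    spread = max(values_c) - min(values_c)
    return "double+" if spread <= agree_within else "conflict"


def needs_measurement(values_c, agree_within=5.0):
    """A compound is a candidate for a new measurement unless it is double+."""
    return validation_status(values_c, agree_within) != "double+"


if __name__ == "__main__":
    # Made-up values for illustration only, not real measurements.
    for values in ([12.0], [12.0, 13.5], [12.0, 40.0]):
        print(values, validation_status(values), needs_measurement(values))
```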

Page 39: modeling melting points

Publication of double+ validated melting point dataset

…as a preprint

Page 40: modeling melting points

Publication of double+ validated melting point dataset

…as a book

Page 41: modeling melting points

Data and model deployed…

…on the web

web service

Page 42: modeling melting points

…in Google spreadsheets

Page 43: modeling melting points

…as an app

Page 44: modeling melting points

Use case: recrystallizing dibenzalacetone

Can the solvents used to recrystallize compounds in organic teaching labs be improved?

Trans-dibenzalacetone

Aldol condensation between two molecules of benzaldehyde and one molecule of acetone

[Matthew McBride: Undergraduate Research Assistant - Drexel]

Page 45: modeling melting points

Dibenzalacetone

First recrystallized in ethyl acetate in 1906: Straus and Ecker, Ber. 39, 2988 (1906)

Recrystallized in ethyl acetate in Organic Syntheses

Page 46: modeling melting points

Organic Teaching Labs

Recommended recrystallization solvent: ethyl acetate.

(http://classes.kvcc.edu/chm230/mixed%20aldol%20condensation.pdf)

(http://www.xula.edu/chemistry/documents/orgleclab/Aldol_notes.pdf)

Page 47: modeling melting points

Recrystallization App

Enter compound identification and desired parameters

Page 48: modeling melting points

How does it work?

1. Look up the solvent boiling point

2. Look up the room temperature solubility or predict it via measured or predicted Abraham descriptors

3. Look up the solute melting point or predict it via a model

4. Use the melting point and the solubility at room temperature to predict the solubility at boiling

5. Calculate the predicted recrystallization yield
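
Steps 4 and 5 can be sketched in Python under stated assumptions. The sketch uses the ideal-solubility temperature dependence with a constant entropy of fusion taken from Walden's rule (about 56.5 J/(mol K)), applies it directly to a room-temperature solubility, and takes the yield as the fraction of solute that comes out of a solution saturated at the boiling point when it is cooled to room temperature. The slides do not give the app's actual equations, so the formula, the constants, and the function names here are all illustrative.

```python
import math

GAS_CONSTANT = 8.314              # J/(mol K)
WALDEN_ENTROPY_OF_FUSION = 56.5   # J/(mol K); Walden's rule estimate (assumption)


def to_kelvin(t_c):
    return t_c + 273.15


def solubility_at_boiling(s_room, mp_c, bp_c, room_c=25.0,
                          ds_fus=WALDEN_ENTROPY_OF_FUSION):
    """Extrapolate a room-temperature solubility to the solvent boiling point.

    Assumes ln(s_hot / s_room) = (dS_fus * Tm / R) * (1/T_room - 1/T_hot),
    which follows from the ideal-solubility equation with dH_fus = Tm * dS_fus.
    The hot temperature is capped at the melting point, above which the
    solute would melt rather than dissolve as a solid.
    """
    t_m = to_kelvin(mp_c)
    t_room = to_kelvin(room_c)
    t_hot = min(to_kelvin(bp_c), t_m)
    exponent = ds_fus * t_m / GAS_CONSTANT * (1.0 / t_room - 1.0 / t_hot)
    return s_room * math.exp(exponent)


def recrystallization_yield(s_room, s_hot):
    """Fraction recovered when a solution saturated hot is cooled to room temperature."""
    if s_hot <= 0:
        raise ValueError("hot solubility must be positive")
    return max(0.0, 1.0 - s_room / s_hot)


if __name__ == "__main__":
    # Hypothetical inputs: 0.1 M at room temperature, melting point 100 C,
    # solvent boiling point 78 C.
    s_hot = solubility_at_boiling(0.1, mp_c=100.0, bp_c=78.0)
    print(round(s_hot, 3), round(recrystallization_yield(0.1, s_hot), 3))
```

The design choice worth noting is that only the ratio of hot to cold solubility matters for the yield, so an error in the room-temperature solubility cancels out, while an error in the melting point does not.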

Page 49: modeling melting points

Results

Lists solvents and their predicted recrystallization yield.

Prediction is generated by the temperature-dependent solubility curves.

Page 50: modeling melting points

Comparison: ethyl acetate (predicted yield of 72%) vs ethanol (predicted yield of 93%)

Plot labels: ethyl acetate, ethanol; 0.09M, 1.1M, 0.62M, 2.06M

Page 51: modeling melting points

Dibenzalacetone derivatives docking against tubulin (paclitaxel site)

Page 52: modeling melting points

Example

Derivatives of dibenzalacetone may be synthesized by altering the aldehyde used

From a library of derivatives, the following compound was the top hit for the docking site of Taxol

Uses phenanthrene-9-carboxaldehyde

Page 53: modeling melting points

Search Literature

Perform a Reaxys search to determine availability of synthesis procedures

No results

[Matthew McBride: Undergraduate Research Assistant - Drexel]

Page 54: modeling melting points

Synthesis and recrystallization solvents chosen using ONS models

Used methanol and benzene

Melting Point: 264-265°C

(http://usefulchem.wikispaces.com/EXP286)

[Matthew McBride: Undergraduate Research Assistant - Drexel]

Page 55: modeling melting points

Acknowledgements

ORU Biology and Chemistry Faculty
Jean-Claude Bradley (Drexel)
Cameron Neylon (RAL)
Antony Williams (RSC ChemSpider)
Evan Curtin (Drexel)
Matthew McBride (Drexel)

ORU research assistants: David Bulger, Daryl Charron, Lizzie Clark, Lacey Condron, Samantha Gaines, Alejandro Hernandez, Maria Hernandez, Jesse Patsolic, and Matthew Wilson