“Good annotation practice” for chemical data: ChEBI experience
Kirill Degtyarenko, European Patent Office

Page 1: “ Good annotation practice ” for chemical data: ChEBI experience Kirill Degtyarenko European Patent Office

Page 2

Good anNODation practice

• good Naming practice: how to give most appropriate names
• good Ontology practice: how to link the entity of interest by defined logical relationships to other entities
• good Drawing practice: how to draw unambiguous 2-D diagrams

Page 3

Good Naming Practice
or
How to Give Most Appropriate Names

Page 4

Systematic Name (IUPAC)

2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid

[Figure: structure diagram with ring positions numbered 1 to 6 on each ring]

Page 5

Common Name

• flufenamic acid (INN English)
• acide flufénamique (INN French)
• ácido flufenámico (INN Spanish)
• acidum flufenamicum (INN Latin)
• Flufenaminsäure (German)

[Figure: structure diagram]

Page 6

The Unpronounceables

CHEBI:48935 (E)-roxithromycin

IUPAC name: (3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one

[Figure: structure diagram of (E)-roxithromycin]

Page 7

What is the common name of roxithromycin?

• CHEBI:32109 (Z)-roxithromycin
• CHEBI:48935 (E)-roxithromycin (INN: roxithromycin)

[Figure: structure diagrams of (Z)-roxithromycin and (E)-roxithromycin]

Page 8

CHEBI:48844 roxithromycin

(E)-roxithromycin

(Z)-roxithromycin

[Figure: structure diagrams of roxithromycin, (E)-roxithromycin and (Z)-roxithromycin]

Page 9

What is thiamine?

• CHEBI:18385 thiamine(1+), aka thiamine
• CHEBI:33283 thiamine(1+) chloride, INN: thiamine
• CHEBI:49105 thiamine(2+) dichloride, aka thiamine chloride hydrochloride, aka thiamine hydrochloride

[Figure: structure diagrams of the three entities]
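
The naming problem on this slide can be made concrete with a small lookup. The following is a minimal sketch in Python, not part of the original talk: it indexes the three entries above by every name they carry and reports names that point to more than one ChEBI identifier. The function names `build_name_index` and `ambiguous_names` are hypothetical, and treating the INN as just another name is an assumption made for illustration.

```python
from collections import defaultdict

# (ChEBI ID, ChEBI name, other names) as listed on the slide above.
# The INN is treated here as one more name; that is an assumption of this sketch.
ENTRIES = [
    ("CHEBI:18385", "thiamine(1+)", ["thiamine"]),
    ("CHEBI:33283", "thiamine(1+) chloride", ["thiamine"]),
    ("CHEBI:49105", "thiamine(2+) dichloride",
     ["thiamine chloride hydrochloride", "thiamine hydrochloride"]),
]


def build_name_index(entries):
    """Map every lower-cased name to the set of ChEBI IDs that carry it."""
    index = defaultdict(set)
    for chebi_id, name, other_names in entries:
        for label in [name, *other_names]:
            index[label.lower()].add(chebi_id)
    return index


def ambiguous_names(index):
    """Return only the names that resolve to more than one ChEBI ID."""
    return {name: sorted(ids) for name, ids in index.items() if len(ids) > 1}


NAME_INDEX = build_name_index(ENTRIES)
AMBIGUOUS = ambiguous_names(NAME_INDEX)
```

With the slide's data, the only ambiguous name is "thiamine", which resolves to both the cation and its chloride; the unambiguous names each resolve to a single identifier.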

Page 10

Plurals and singulars

• Problem is not unique to ChEBI
• Cf. phenol vs phenols; phenol metabolism vs phenols metabolism
• Bad solution: article use (a phenol metabolism?)
• Solution: prepositional phrases (metabolism of phenols)

Page 11

Good Drawing Practice
or
How to Draw Unambiguous 2-D Diagrams

Page 12

Linear forms of monosaccharides

[Figure: structure diagrams of open-chain monosaccharides]

Page 13

Pyranose forms of monosaccharides

[Figure: structure diagrams of pyranose forms]

Page 14

Fused systems

(R)-camphor

[Figure: two drawings of (R)-camphor, labelled "ambiguous" and "unambiguous"]

Page 15

Square planar geometry

cisplatin, transplatin

InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2

SMILES: [H][N]([H])([H])[Pt](Cl)(Cl)[N]([H])([H])[H]

[Figure: structure diagrams of cisplatin and transplatin]
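
The InChI and SMILES strings on this slide record which atoms are present and bonded, but they appear to carry no cis/trans information, so the 2-D drawing is what tells cisplatin and transplatin apart. The sketch below, in Python, illustrates that idea under stated assumptions: the coordinates are idealised values invented for the example (not taken from any ChEBI molfile), and `classify_square_planar` is a hypothetical helper, not a ChEBI or toolkit function.

```python
import math


def angle_at_center(center, a, b):
    """Angle a-center-b in degrees, computed from 2-D coordinates."""
    ax, ay = a[0] - center[0], a[1] - center[1]
    bx, by = b[0] - center[0], b[1] - center[1]
    dot = ax * bx + ay * by
    cross = ax * by - ay * bx
    return abs(math.degrees(math.atan2(cross, dot)))


def classify_square_planar(center, ligands, element="Cl", tolerance=30.0):
    """Return 'cis' or 'trans' for the two ligands of the given element."""
    positions = [xy for symbol, xy in ligands if symbol == element]
    if len(positions) != 2:
        raise ValueError(f"expected exactly two {element} ligands")
    angle = angle_at_center(center, positions[0], positions[1])
    if abs(angle - 90.0) <= tolerance:
        return "cis"
    if abs(angle - 180.0) <= tolerance:
        return "trans"
    raise ValueError(f"angle of {angle:.0f} degrees is neither cis nor trans")


# Idealised, made-up drawing coordinates with Pt at the origin.
PT = (0.0, 0.0)
CISPLATIN = [("N", (-1.0, 0.0)), ("N", (0.0, 1.0)),
             ("Cl", (1.0, 0.0)), ("Cl", (0.0, -1.0))]
TRANSPLATIN = [("N", (0.0, 1.0)), ("N", (0.0, -1.0)),
               ("Cl", (1.0, 0.0)), ("Cl", (-1.0, 0.0))]


def composition(ligands):
    """Sorted list of ligand symbols: all that a connectivity-only view sees."""
    return sorted(symbol for symbol, _ in ligands)
```

In this sketch `composition(CISPLATIN)` and `composition(TRANSPLATIN)` are identical, while `classify_square_planar` separates the two drawings by the Cl-Pt-Cl angle. This only works if the diagram is drawn so that the angle is unambiguous.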

Page 16

Uncertainty and ambiguity in chemistry

• Compositional uncertainty
• Positional uncertainty
• Configurational uncertainty
• Conformational uncertainty

Page 17

Compositional uncertainty

Examples
• an alkali metal cation
• vanadate(V) anion
• [2H]ethanol

Page 18

Positional uncertainty

Examples
• L-bromohistidine residue
• pteroic acid (several tautomers)

Page 19

Configurational uncertainty

Examples
• androstane
• rel-(2R,3R)-2-amino-3-methylpentanoic acid
• tetradec-11-enoic acid

Page 20

Conformational uncertainty

Examples
• cyclohexane: chair, boat, twist
• protein secondary structure: α, β, …

Page 21

Good Ontology Practice
or
How to Link the Entity of Interest by Defined Logical Relationships to Other Entities

Page 22

ChEBI ontology

• Molecular structure ontology
• Subatomic particle ontology
• Biological role ontology
• Application ontology

Page 23

Relationships in ChEBI

• ∆ Is A (generic)
• ⋄ Is Part Of (generic)
• ♯ Is Conjugate Acid Of (specific)
• ♭ Is Conjugate Base Of (specific)
• Is Enantiomer Of (specific)
• Is Tautomer Of (specific)
• ℛ Is Substituent Group From (specific)
• ℋ Has Parent Hydride (specific)
• ℱ Has Functional Parent (specific)
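
The relationship list above can be held as a small controlled vocabulary, so that only these nine relationship names can be used when linking entities. The following is a minimal sketch in Python; the class `RelationType` and the helpers `make_statement` and `names_by_scope` are hypothetical names chosen for illustration and do not describe how ChEBI itself stores relationships. Symbols are recorded only where the slide shows one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RelationType:
    name: str                     # lower-cased relationship name
    scope: str                    # "generic" or "specific", as on the slide
    symbol: Optional[str] = None  # slide symbol, where one is shown


RELATION_TYPES = (
    RelationType("is a", "generic", "∆"),
    RelationType("is part of", "generic", "⋄"),
    RelationType("is conjugate acid of", "specific", "♯"),
    RelationType("is conjugate base of", "specific", "♭"),
    RelationType("is enantiomer of", "specific"),
    RelationType("is tautomer of", "specific"),
    RelationType("is substituent group from", "specific", "ℛ"),
    RelationType("has parent hydride", "specific", "ℋ"),
    RelationType("has functional parent", "specific", "ℱ"),
)

BY_NAME = {rt.name: rt for rt in RELATION_TYPES}


def make_statement(subject, relation, obj):
    """Build a (subject, relation, object) triple, rejecting unknown relations."""
    key = relation.lower()
    if key not in BY_NAME:
        raise ValueError(f"unknown relationship: {relation!r}")
    return (subject, key, obj)


def names_by_scope(scope):
    """List relationship names with the given scope, in slide order."""
    return [rt.name for rt in RELATION_TYPES if rt.scope == scope]
```

For example, `make_statement("L-cysteine", "Is A", "cysteine")` yields a normalised triple, while a misspelt relationship name raises an error instead of silently entering the data.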

Page 24

Is A relationship

L-cysteine ∆ cysteine (L-cysteine is a cysteine)

[Figure: structure diagrams of L-cysteine and cysteine]

Page 25

Is Part Of

• L-cysteinium is part of L-cysteine hydrochloride
• L-cysteine hydrochloride has part L-cysteinium

[Figure: structure diagrams of L-cysteinium and L-cysteine hydrochloride]

Page 26

Is Enantiomer Of

L-cysteine is enantiomer of D-cysteine

[Figure: structure diagrams, with two ∆ (Is A) links]

Page 27

Is Tautomer Of

1H-pyrrole, 2H-pyrrole, 3H-pyrrole

[Figure: structure diagrams of the three tautomers]

Page 28

Is Conjugate Acid Of

L-cysteinium ♯ L-cysteine ♯ L-cysteinate(1–) ♯ L-cysteinate(2–)

L-cysteine is conjugate acid of L-cysteinate(1–)

[Figure: structure diagrams of the four species]

Page 29

Is Conjugate Base Of

Species shown: L-cysteinium, L-cysteine, L-cysteinate(1–), L-cysteinate(2–), linked by ♭

[Figure: structure diagrams of the four species]

Page 30

Acid/base relationships

Species shown: L-cysteinium, L-cysteine, L-cysteinate(1–), L-cysteinate(2–), linked by ♯ and ♭

[Figure: structure diagrams of the four species]
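
One way to sanity-check conjugate acid/base links is by net charge: a conjugate acid carries one more proton than its base, so its charge is higher by exactly one. The sketch below, in Python, applies that check to the four cysteine species. Only the link between L-cysteine and L-cysteinate(1–) is spelled out on the slides; the other two pairs and the charge table are inferred here from the names and the drawn charges, and the helper names are hypothetical.

```python
# Net charges inferred from the names and the charges drawn on the slides.
CHARGE = {
    "L-cysteinium": +1,
    "L-cysteine": 0,
    "L-cysteinate(1–)": -1,
    "L-cysteinate(2–)": -2,
}

# (acid, base) pairs: "acid is conjugate acid of base".
CONJUGATE_ACID_OF = [
    ("L-cysteinium", "L-cysteine"),
    ("L-cysteine", "L-cysteinate(1–)"),
    ("L-cysteinate(1–)", "L-cysteinate(2–)"),
]


def is_valid_conjugate_pair(acid, base, charge=CHARGE):
    """True if the acid's net charge is exactly one higher than the base's."""
    return charge[acid] - charge[base] == 1


def deprotonation_series(start, pairs=CONJUGATE_ACID_OF):
    """Follow 'is conjugate acid of' links from start until none is left."""
    next_base = dict(pairs)
    series = [start]
    while series[-1] in next_base:
        series.append(next_base[series[-1]])
    return series
```

Here `deprotonation_series("L-cysteinium")` walks through all four species in order, and a pair that skips a step, such as L-cysteinium with L-cysteinate(1–), fails the charge check.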

Page 31

Is Substituent Group From

• L-cysteinyl
• L-cysteine residue
• L-cysteino
• L-cysteine

[Figure: structure diagrams of L-cysteine and the three groups, with * markers]

Page 32

Has Parent Hydride

• salutaridinol has parent hydride (ℋ) morphinan
• morphinan is parent hydride of salutaridinol

[Figure: structure diagrams of salutaridinol and morphinan]

Page 33

Has Functional Parent

• 7-O-acetylsalutaridinol has functional parent salutaridinol
• salutaridinol is functional parent of 7-O-acetylsalutaridinol

[Figure: structure diagrams of 7-O-acetylsalutaridinol and salutaridinol]
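
Several slides name a relationship together with its reverse reading (is part of / has part, has parent hydride / is parent hydride of, has functional parent / is functional parent of). A minimal Python sketch of deriving the reverse statement from the forward one follows. The `INVERSE` table and the `invert` function are hypothetical names; treating "is enantiomer of" and "is tautomer of" as their own inverses is an assumption based on these relations being symmetric, not something the slides state.

```python
# Forward and reverse readings. The first eight entries pair names that
# appear on the slides; the last two assume the relation is symmetric.
INVERSE = {
    "is part of": "has part",
    "has part": "is part of",
    "has parent hydride": "is parent hydride of",
    "is parent hydride of": "has parent hydride",
    "has functional parent": "is functional parent of",
    "is functional parent of": "has functional parent",
    "is conjugate acid of": "is conjugate base of",
    "is conjugate base of": "is conjugate acid of",
    "is enantiomer of": "is enantiomer of",
    "is tautomer of": "is tautomer of",
}


def invert(statement):
    """Return the reverse reading of a (subject, relation, object) triple."""
    subject, relation, obj = statement
    return (obj, INVERSE[relation], subject)


FORWARD = ("salutaridinol", "has parent hydride", "morphinan")
REVERSE = invert(FORWARD)
```

Storing only one direction and deriving the other keeps the two readings from drifting apart; inverting twice returns the original statement.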

Page 34

Live annotation demo

Page 35

Going to SourceForge…

Page 36

Reading a request…

Page 37

Going to curator tool…

Page 38

Search result…

Page 39

Adding new entry…

Page 40

Editing new entry…

Page 41

Success!

Page 42

Let’s draw

Page 43

Page 44

Success again!

Page 45

Using ACD/Name (1)

Page 46

Using ACD/Name (2)

Page 47

Adding IUPAC name (1)

Page 48

Adding IUPAC name (2)

Page 49

Classifying (1)

Page 50

Classifying (2)

Page 51

Classifying (3)

Page 52

Classifying (4)

Page 53

The last touch (1)

Page 54

The last touch (2)

Page 55

Responding to request…

Page 56

A job well done…

Page 57

The team

• Rafael Alcántara
• Michael Ashburner
• Volker Ast *
• Michael Darsow *
• Paula de Matos
• Marcus Ennis
• Janna Hastings
• Alan McNaught *
• Chris Steinbeck
• Martin Zbinden *

Page 58

Thanks

• Kristian Axelsen
• Hélène Courrier
• Anne Morgat
• Ian Unwin
• Our faithful Users
• EU: funding