22
Drug legislation with data fusion rules using multiple fingerprints measurements The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23- 25, 2014 Moustafa Zein http://www.egyptscience.net

Drug legislation with data fusion rules using multiple fingerprints measurements

Embed Size (px)

DESCRIPTION

The orphan drug certification process from the European committee is depending on experts opinions that it is not similar to any other drug, this stage is very complicated and those opinions differ based on the expertise. So, this paper introduces computational model that gives one accurate probability of similarity, using multiple fingerprints measurements to similarity, and fuse these measurements by data fusion rules, that give one probability of similarity helping experts to determine that drug is similar to existing anyone or not.

Citation preview

Page 1: Drug legislation with data fusion rules using multiple fingerprints measurements

Drug legislation with data fusion rules using multiple fingerprints measurements

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

Moustafa Zein

http://www.egyptscience.net

Page 2: Drug legislation with data fusion rules using multiple fingerprints measurements

Scientific Research Group in Egyptwww.egyptscience.net

Page 3: Drug legislation with data fusion rules using multiple fingerprints measurements

Overview

IntroductionIdentify orphan drugs legislation.Structure similarity measurements.Data fusion and Fusion rules.

Problem Discussion Methodologies Numerical experiments Conclusion and future works

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

Page 4: Drug legislation with data fusion rules using multiple fingerprints measurements

Definition Of Orphan Drugs

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• A medicine that has been developed specifically to treat a rare medical condition.

• These medical conditions are normally referred to as rare diseases and there is much current interest in the development of orphan drugs for the treatment of such diseases.

• Orphan drugs generally follow the same regulatory development path as any other pharmaceutical product.

• Example of orphan Drug : Trade name: Cofactor .Generic Name: (6R,S)5,10‐methylene‐ tetrahydrofolic acid

Page 5: Drug legislation with data fusion rules using multiple fingerprints measurements

Identify Orphan Drugs Legislation

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• In the European Union (EU), the evaluation of orphan drugs is coordinated by the European Medicines Agency (hereafter the EMA).

• A medicine must meet a number of criteria if it is to qualify as an orphan drug.

• No similar medicinal product can be brought to the European market for a period of ten years.

• The new medicine is considered to be similar to another orphan drug if defined a similar active substance

• We used structure similarity as comparison criteria.

Page 6: Drug legislation with data fusion rules using multiple fingerprints measurements

Structure similarity measurements

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Similarity search:Is one of the most common techniques for ligand-based virtual screening and involves scanning a chemical database to identify those molecules that are most similar to a user-defined reference structure .

• Similarity measurements:Any similarity measure has three principal components.

• Similarity coefficients:The similarity coefficient that is used to quantify the degree of resemblance between two suitably weighted representations.

Page 7: Drug legislation with data fusion rules using multiple fingerprints measurements

Data Fusion

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Is the name given to a body of techniques that combine multiple sources of data into a single source.

• Data fusion has been widely used as way of enhancing the effectiveness of similarity-based virtual screening.

• Data fusion can take place at three different levels:• Data level fusion, the raw data generated by

multiple sensors are combined directly• Feature-level fusion, feature extraction methods are

used to generate derived representations• Decision-level fusion involves combining decisions

that have been arrived at independently by the available sensors

Page 8: Drug legislation with data fusion rules using multiple fingerprints measurements

Fusion Rules

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Fusion rules:• A fusion rule takes as input n (n ≥ 2) sets of N

similarities or ranks and produces as output a single such set, normally as a ranked list from which the top-ranked database structures can be selected for further analysis.

• Fusion rules can either be unsupervised, meaning that the fusion rule operates directly on the similarity or rank information, or supervised, meaning that an additional training procedure is required

Page 9: Drug legislation with data fusion rules using multiple fingerprints measurements

Fusion Rules cont

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• The unsupervised fusion rules :The unsupervised fusion rules require as input just the n sets of N similarity scores An early study of the use of data fusion methods was undertaken by Belkin et al., who described a range of fusion rules that are based on simple arithmetic operations

• Supervised Fusion Rules:Are based on sets of structures for which bioactivity data are available. Specifically, these approaches seek to identify a quantitative relationship between the structural similarity of a pair of molecules, as determined using some particular similarity measure, and their corresponding biological similarities

Page 10: Drug legislation with data fusion rules using multiple fingerprints measurements

Problem Discussion

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Problem overview:

• In the European Union, medicines are authorized for some rare disease only if they are judged to be dissimilar to authorized orphan drugs for that disease.

• When a company applies to register a new medicine for an indication that has already been granted for an orphan medicine, it is the responsibility of the EMA’s Committee for Orphan Medicinal Products (COMP) to decide if the new drug is indeed similar to an existing orphan drug.

Page 11: Drug legislation with data fusion rules using multiple fingerprints measurements

Problem Core

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Using manual system to measure of similarity of drugs give a low level of efficiency and need more time.

• There are considerable differences between individual chemists in decision making that lead to do more efforts to reach a final decision about drug authority.

• From previous related work that be used some of fingerprint similarity coefficients to measure the similarity gives limited results that can’t be help the chemist to take a final right decision.

• In addition if change some similarity measurements, the decision of chemist changes.

Page 12: Drug legislation with data fusion rules using multiple fingerprints measurements

Related Work

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Using fingerprint measurements:• Using 2D fingerprint methods to support the assessment of

structural similarity in orphan drug by Pedro Franco 2014.

• Measures of structural similarity based on 2D fingerprints can provide a useful source of information for the assessment of orphan drug status by regulatory authorities.

Page 13: Drug legislation with data fusion rules using multiple fingerprints measurements

Related Work Cont.

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Using fusion rules:• By (Peter Willett 2013) described the use of supervised

fusion rules and techniques in cheminformatics to solve the similarity problem.

• Data fusion is the name given to a body of techniques that combine multiple sources of data into a single source, with the expectation that the resulting fused source will be more informative than will the individual input sources.

Page 14: Drug legislation with data fusion rules using multiple fingerprints measurements

Proposed Methodologies

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Overview:• There is wide variety of fingerprinting algorithms, there are

multiple methods for evaluating virtual screening (VS) performance, and little consensus as to which is best.

• Fingerprints measurements:• we used similarity measures for comparing chemical

structures represented by means of fingerprints are (Dice, Tanimoto, Cosine, Kaczynski, McConnaughey) coefficient s for the following fingerprints ('ECFP6', 'MACCS', 'FCFP6', 'AVALON', 'RDK7', 'RDK6', RDK5', 'HASHAP', 'HASHTT' ).

• All fingerprints were calculated using the RDKit (http://www.rdkit.org/).

Page 15: Drug legislation with data fusion rules using multiple fingerprints measurements

Data Fusion

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Using Data fusion rules to give one probability of similarity to molecules is methodology to solve the problem of more one probability to the same structure similarity.

• Examples of such arithmetic fusion rules where, as before, dj denotes the jth database structure and where there are n sets of similarity scores or ranks to be fused (when considering these rules, it should be remembered that a large similarity score, e.g., 0.95 for a Tanimoto coefficient, will correspond to a small rank, i.e., at or near to the top of a ranked list).

MAX

MIN

SUM

MED

Page 16: Drug legislation with data fusion rules using multiple fingerprints measurements

Data Set

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• The data that used in as a training set form DrugBank with 100 molecule-pairs that compare between molecules structure of authorized drug A and molecules structure of new drug B with chemist opinion about molecules similarity.

Molecule A Molecule B Chemist yes opinion

Chemist No opinion

      .78

      .22

Page 17: Drug legislation with data fusion rules using multiple fingerprints measurements

Results And Discussion

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• Before using Data fusion rules :• There are some results of fingerprints (ECFP6, MACCS,

and RDK7). It includes the chemist opinion with acceptance and refuse the similarity between molecules in authorized drug and the molecules of new drug, in addition to the probability of similarity that calculated by fingerprint algorithms.

• We can simulate the results and compare its similarity with the chemist similarity that show differences in these similarity like number (4, 5) the probability similarity fingerprint is larger than chemist yes opinion and in number (1, 3, 7) the chemist yes opinion larger than the probability of similarity

Page 18: Drug legislation with data fusion rules using multiple fingerprints measurements

Results And Discussion Cont.

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

Fingerprint Yes opinion No opinion Probability of similarity

ECFP6

0.707514969 0.292485031 0.5521749690.417283406 0.582716594 0.4273765940.980550353 0.019449647 0.8252103530.141990953 0.858009047 0.7026690470.122763846 0.877236154 0.521896154

MACCS

0.707514969 0.292485031 0.7525291080.417283406 0.582716594 0.4476478760.980550353 0.019449647 0.9535795290.141990953 0.858009047 0.5822560930.122763846 0.877236154 0.407872497

RDK6

0.707514969 0.292485031 0.5721749220.417283406 0.582716594 0.7059905090.980550353 0.019449647 0.8106520850.141990953 0.858009047 0.3905768830.122763846 0.877236154 0.421270038

Page 19: Drug legislation with data fusion rules using multiple fingerprints measurements

Results And Discussion Cont.

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• With applying Data fusion rules:• Now we have unique probability of similarity for every

compound that describe chemist opinion and data fusion rules output of the probability of similarity between molecules structure similarity.

• There are three fusion rules that show one of probability of similarity between molecules structure of authorized and new drugs. In this study. we found that SUM rule give probability of similarity is closer to chemist yes opinion.

Page 20: Drug legislation with data fusion rules using multiple fingerprints measurements

Results And Discussion Cont.

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

Chemist yes opinion

Chemist no opinion

SUM Fusion Rule

MED Fusion Rule

Max Fusion Rule

0.70 0.29 0.69 0.63 0.75

0.41 0.58 0.48 0.43 0.60

0.98 0.01 0.91 0.81 0.95

0.14 0.85 0.15 0.39 0.58

0.12 0.87 0.17 0.42 0.52

0.56 0.43 0.53 0.42 0.48

0.93 0.06 0.86 0.81 0.83

0.10 0.89 0.19 0.37 0.43

0.33 0.66 0.36 0.62 0.67

0.96 0.03 0.93 0.80 0.80

Page 21: Drug legislation with data fusion rules using multiple fingerprints measurements

Conclusion And Future Work

The 5th International Conference on Innovations in Bio-Inspired Computing and Applications. June 23-25, 2014

• In this study we introduce a solution to the differences in similarity measurements of fingerprint algorithms and how these differences cause to the chemist some difficult in decision not help him.

• We use fusion rules to solve and give him an accurate

probability of similarity to molecule-pairs structure in authorized drug and new drug that need to take a legislation to share in market.

• In this study we did not use the activity similarity and how can their activity has an effect in the decision.

Page 22: Drug legislation with data fusion rules using multiple fingerprints measurements

Thanks and Acknowledgement

Authors:  Moustafa Zein, Ahmed Abdo, Ammar Adl, Aboul Ella Hassanien Ali, Mohamed F. Tolba and Vaclav Snasel