Performance Metrics (Proposed Method 2)

GPU based virtual screening techniques for faster drug discovery, 12/15/2016

Source: people.cse.nitc.ac.in/jayaraj/files/thesis_ppt2.pdf


Running Time of serial & parallel SOM implementations (Proposed Method 2)

The parallelized algorithm speeds up the process considerably when implemented on a GPU.


Proposed Method-3

Substructure similarity based virtual screening Using GPU


Substructure similarity based Virtual Screening

• One known active molecule becomes the query molecule.

• Look for compounds that are most similar to the query molecule.

• Actives in the unknown set can be identified by matching the structural similarity of these molecules against the known active molecule.

(Figure labels: Active, unknowns)


Similar Property Principle

Substructure based virtual screening rests on the widely accepted Similar Property Principle (SPP).

SPP states that “chemicals of similar structure frequently share similar biological activities and physio-chemical properties” [55].

The size of the maximum common subgraph (MCS) between two graphs is a good metric for compound similarity [53][54].

The size of the MCS can therefore be taken as a metric for ligand based virtual screening.
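As a rough illustration of using MCS size as a screening metric, the sketch below ranks library compounds by the size of their MCS with the query. The function name `rank_by_mcs`, the compound names and the MCS sizes are all hypothetical; computing the sizes is left to the MCS algorithm.

```python
def rank_by_mcs(mcs_sizes, top_k=None):
    """Rank compounds by MCS size with the query, largest first.

    mcs_sizes: dict mapping compound name -> MCS size (hypothetical input).
    Ties are broken alphabetically so the ordering is deterministic.
    """
    ranked = sorted(mcs_sizes.items(), key=lambda item: (-item[1], item[0]))
    return ranked if top_k is None else ranked[:top_k]


# Made-up MCS sizes (number of common edges), for illustration only.
example_sizes = {"cmpd_A": 6, "cmpd_B": 2, "cmpd_C": 6, "cmpd_D": 4}
top_hits = rank_by_mcs(example_sizes, top_k=2)
```

Compounds with the largest MCS against the query would be the first candidates to label as active.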


The Maximum Common Subgraph (MCS) problem asks for the largest subgraph common to the graphs under observation.


MCS algorithm: related work

McGregor [51] introduced a maximal common subgraph algorithm that uses a backtrack search.

The MCS of two graphs can also be found by transforming the MCS problem into the Maximal Clique Enumeration (MCE) problem [56].

The compatibility information between the two graphs is stored as a new graph called the “product graph”.

The MCE algorithm works on the edge product graph: the maximum clique in the edge product graph corresponds to the maximum common subgraph.


Finding the edge product graph for MCE [56]

Consider two graphs G1 = (V1, E1) and G2 = (V2, E2).

The edge product graph uses the vertex set V = E1 × E2.

An edge is formed between two vertices (e1, e2) and (f1, f2) if

e1 ≠ f1 and e2 ≠ f2; and

either e1, f1 are connected in G1 via a vertex with the same label as the vertex connecting e2, f2 in G2; or

e1, f1 are not adjacent in G1 and e2, f2 are not adjacent in G2.

• The Bron-Kerbosch (BK) algorithm finds all maximal cliques in a graph exactly once [50].
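A minimal Python sketch of the edge product graph construction, assuming each undirected edge is listed once as a (u, v) tuple and vertex labels are given as dictionaries. The function names `shared_vertex` and `edge_product_graph` and the two small example graphs are illustrative and not taken from the thesis code.

```python
from itertools import combinations, product


def shared_vertex(e, f):
    """Return the vertex common to two distinct edges, or None if they are not adjacent."""
    common = set(e) & set(f)
    return next(iter(common)) if len(common) == 1 else None


def edge_product_graph(edges1, labels1, edges2, labels2):
    """Build the edge product graph as a dict: node -> set of neighbouring nodes."""
    nodes = list(product(edges1, edges2))  # vertex set V = E1 x E2
    adjacency = {node: set() for node in nodes}
    for (e1, e2), (f1, f2) in combinations(nodes, 2):
        if e1 == f1 or e2 == f2:
            continue  # requires e1 != f1 and e2 != f2
        s1, s2 = shared_vertex(e1, f1), shared_vertex(e2, f2)
        if s1 is not None and s2 is not None:
            # Both pairs are adjacent: the shared vertices must carry the same label.
            connect = labels1[s1] == labels2[s2]
        else:
            # Otherwise both pairs must be non-adjacent in their own graphs.
            connect = s1 is None and s2 is None
        if connect:
            adjacency[(e1, e2)].add((f1, f2))
            adjacency[(f1, f2)].add((e1, e2))
    return adjacency


# Two 2-edge paths with every vertex labelled "C" (made-up example).
g1_edges, g1_labels = [(1, 2), (2, 3)], {1: "C", 2: "C", 3: "C"}
g2_edges, g2_labels = [(5, 6), (6, 7)], {5: "C", 6: "C", 7: "C"}
ep_graph = edge_product_graph(g1_edges, g1_labels, g2_edges, g2_labels)
```

The resulting adjacency dictionary can be handed to any clique enumeration routine; a clique in it corresponds to a set of pairwise compatible edge pairings.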


Serial BK algorithm[50]
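The serial algorithm can be sketched in Python as below. This is the basic Bron-Kerbosch recursion without pivoting, written for illustration; the graph is assumed to be a dictionary mapping each vertex to the set of its neighbours, and the function name and example graph are made up.

```python
def bron_kerbosch(adjacency):
    """Return every maximal clique of an undirected graph exactly once."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidate vertices, x: already processed vertices.
        if not p and not x:
            cliques.append(r)  # nothing can extend r, so it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adjacency[v], x & adjacency[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later branches

    expand(frozenset(), frozenset(adjacency), frozenset())
    return cliques


# Triangle 1-2-3 plus a pendant edge 3-4 (made-up example graph).
example_graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
maximal_cliques = bron_kerbosch(example_graph)
largest_clique = max(maximal_cliques, key=len)
```

Each recursive call corresponds to one node of the solution tree, which is why the tree grows quickly for large graphs.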


Example: input graph G and the solution tree generated by the BK algorithm (figure)

EP-Graph (figure): vertices (1,5), (1,6), (1,7), (2,5), (2,6), (2,7), (3,5), (3,6), (3,7), (4,5), (4,6), (4,7)

One of the maximal cliques (a clique of size 3)

a) Backtrack to form the common subgraph from the first graph with edges 2, 4, 3

b) Maximum common subgraph of the above two graphs

Parallel BK algorithm

Clique enumeration is done by constructing a solution tree.

For large graphs, the serial solution spends a large amount of time constructing the tree.

Parallel BK: related work

Matthew C. Schmidt [57] proposed a parallel MCE algorithm for shared memory high performance computing architectures using OpenMP.

It decomposes the search tree into search subtrees.

It is implemented using OpenMP/POSIX threads for shared memory and MPI for distributed memory.

It was tested on a Cray XT4 machine.


Proposed GPU based MCE

Branch and bound techniques are used in the design.

The proposed parallel method for MCE also decomposes the search tree into search subtrees.

The proposed solution uses a breadth-first technique for branching.

Current Active Set (currentActiveSet)

• Contains the list of nodes to be branched.

• Each node in currentActiveSet becomes a BFS root.

• Each thread takes a node from currentActiveSet and evaluates a subregion of the solution space.
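The breadth-first branching can be simulated sequentially in Python, as sketched below. This is not the CUDA kernel from the thesis: the loop over `current_active_set` stands in for the work that one GPU thread per node would do, the bounding step is omitted, and the names `expand_node` and `bfs_maximal_cliques` are hypothetical.

```python
def expand_node(node, adjacency):
    """Work of one simulated thread: branch a single search-tree node."""
    r, p, x = node
    children, found = [], []
    if not p and not x:
        found.append(r)  # leaf node: r is a maximal clique
    for v in list(p):
        children.append((r | {v}, p & adjacency[v], x & adjacency[v]))
        p = p - {v}
        x = x | {v}
    return children, found


def bfs_maximal_cliques(adjacency):
    """Enumerate maximal cliques level by level instead of recursively."""
    current_active_set = [(frozenset(), frozenset(adjacency), frozenset())]
    cliques = []
    while current_active_set:
        next_active_set = []
        for node in current_active_set:  # on a GPU: one thread per node
            children, found = expand_node(node, adjacency)
            next_active_set.extend(children)
            cliques.extend(found)
        current_active_set = next_active_set  # nodes to branch at the next level
    return cliques


# Triangle 1-2-3 plus a pendant edge 3-4 (made-up example graph).
example_graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
level_order_cliques = bfs_maximal_cliques(example_graph)
```

Because every node in a level is independent of the others, the inner loop is the part that lends itself to parallel execution.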

Input graph G and the solution tree generated by the BK algorithm (figures)

Test runs for different EP graphs

Experiments were conducted using 6 randomly generated graphs (items 1-6 in the table) and 7 DIMACS graphs (established benchmark graphs).

The run time and the size of the maximum cliques found are summarized in the table.

Proposed parallel virtual screening method

Parallelism in MCS based VS

Results of the proposed MCS based algorithm when benzene was the query compound

Comparison of proposed machine learning based VS techniques

Performance comparison on various aspects:

• Efficiency metrics

• Run time

• Ability to label molecules

Observations:

• Since SOM takes multiple iterations to converge, the accuracy of SOM is found to be better than that of the RF method.

• SOM takes a large execution time, since multiple iterations are required to complete the screening.

• The SOM based method can reduce false positives, since it can also classify molecules as undefined.


Performance comparison on efficiency metrics of RF and SOM based methods


Comparison of running time of Parallel RF and Parallel SOM classification for virtual screening

List of Compounds which are predicted as Active by RFC and Undefined by SOM for GDB17 test set, which is a data set of unknown chemical compounds


Tools Developed

Based on the proposed methods, the following tools were developed to make the drug discovery process more efficient:

• GPURFSCREEN

• SOMSCREEN

• GRAPHSCREEN

These tools are built using Python and C on the CUDA framework.

Source code, README files, etc. are available at http://ccc.nitc.ac.in/project/


Conclusion

Given the large volume of data involved in ligand based drug design, the proposed parallel methods can reduce the running time.

The cost of installation, power consumption and maintenance of a GPU based system is lower than that of other multi-core systems.

GPU based virtual screening is a viable alternative for quickly screening large quantities of ligand data at a lower cost.

As part of the thesis, three new tools for faster virtual screening were developed.


Future Work


GPU based RF can be further parallelized by using the multiple cores available in CPUs.

Variants of random forest classifiers that implement balanced decision trees can be explored.

Other distance measures, such as Manhattan distance, can be used as the discriminant function for winner neuron prediction in SOM based VS.

GRAPHSCREEN is currently limited because it considers only MCS related properties when screening compounds.
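To illustrate the distance-measure item above, the sketch below selects the SOM winner neuron with a pluggable distance function. The function names, the two weight vectors and the sample are made up, and Euclidean distance is assumed here as the default to compare against.

```python
def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def winner_neuron(weights, sample, distance=euclidean):
    """Index of the neuron whose weight vector is closest to the sample."""
    return min(range(len(weights)), key=lambda i: distance(weights[i], sample))


# Two made-up neuron weight vectors and one sample descriptor vector.
weights = [(3.0, 0.0), (2.0, 2.0)]
sample = (0.0, 0.0)
winner_euclidean = winner_neuron(weights, sample)
winner_manhattan = winner_neuron(weights, sample, distance=manhattan)
```

In this example the two measures pick different winners, which shows that the choice of discriminant function can change how a molecule is mapped.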
