
Performance Analysis of Molecular Dynamics Simulation of PfENR Enzyme using AMBER on Cluster and GPU computing environment

1Heru Suhartanto, 2Arry Yanuar, 3Alhadi Bustamam, 1Aruni Yasmin Azizah, 1Ari Wibisono, 1M. H. Hilman

1 Faculty of Computer Science, Universitas Indonesia, [email protected]

*2 Faculty of Pharmacy, Universitas Indonesia, [email protected]

3 Faculty of Mathematics and Natural Sciences, [email protected]

Abstract

Molecular dynamics (MD) is one of the processes that requires a high-performance computing (HPC) environment. In tropical countries, MD is very important in preparing virtual screening experiments for the search for anti-malarial compounds. Earlier virtual screening for anti-malarials, conducted by the WISDOM project, used a grid infrastructure of 1,700 CPUs provided across 15 countries [13]. In silico searches for anti-malarial compounds from Indonesian medicinal plants using virtual screening are urgently required, because they reduce the cost and time needed compared with examining each compound directly in vitro and in vivo. However, using thousands of processors is difficult for researchers with limited resources in developing countries such as Indonesia. Our previous studies using MD with GROMACS showed improved simulation time on a cluster. That was not the case for some of our previous work with AMBER on a cluster, where we did not obtain a significant speed-up. Our previous work running GROMACS on GPUs, however, gave a significant speed-up, about 12 times faster than on the cluster. In this study, we build a GPU-based computing environment and run MD simulations with AMBER. We used several computing environments: a cluster with 16 cores and the GPUs GeForce GTX 465, GTX 470, GTX 560, GTX 680, and GTX 780. In addition to the PfENR (Plasmodium falciparum Enoyl Acyl Carrier Protein Reductase) enzyme, we also ran MD benchmark experiments on the myoglobin, dihydrofolate reductase (DHFR), and Ras-Raf proteins. All experimental results showed that the slowest MD runs occurred on the cluster, followed in order of increasing speed by the GTX 560, GTX 465, GTX 470, GTX 680, and GTX 780. The GPU speed-ups relative to the cluster are about 24, 26, 32, 77, and 101 times, respectively.

Keywords: Molecular Dynamics Simulation, GPU, Cluster

1. Introduction

Drug design is a field of study that draws on many disciplines, including chemistry, biology, pharmacy, and computer science. Many reported successes in drug design are based on computational methods. McCammon produced the antiretroviral raltegravir for HIV-1 using AutoDock [19]; Merck released the product after it received US FDA approval on October 12, 2007. Another success story is a drug for schistosomiasis released by Chen Yuzong, who used InvDock in the drug design process [20].

Molecular dynamics (MD) plays an important role in drug design. Researchers use MD to study the structure and properties of molecules, protein folding, and protein structure [1,3]. Previous studies have run MD on multi-core supercomputers [1] and on special-purpose clusters such as the MD-Engine II [1,3]; a similar, quite popular project is the volunteer distributed computing effort Folding@Home [26].

In the era of new drug design, natural compounds from plants will be the basis for further investigation. Indonesia is one of the world's centers of biodiversity and is ranked the second richest in the world after Brazil. If marine life is taken into account, Indonesia ranks first. Of about 40,000 plant species on Earth, 30,000 live in the Indonesian archipelago, and at least 9,600 of these are known to be medicinal plants. Several compounds from natural sources have the potential to become lead (model) compounds that inhibit the action of an enzyme, such as that of the HIV virus [7,10]. We have proposed a portal, the Medicinal Plants Database and Three Dimensional Structure of Chemical Compounds from Medicinal Plants in Indonesia [28], for in silico screening of natural compounds. This database is needed to find inhibitors of HIV-1 enzymes by virtual screening.

International Journal of Advancements in Computing Technology (IJACT) Volume 6, Number 1, January 2014

Furthermore, computing-based drug design requires large computing resources. Supercomputers are one of the high-performance computing infrastructures commonly used to meet this need. For example, the WISDOM project ran virtual screening for anti-malarials on a grid infrastructure of 1,700 CPUs in 15 countries [13]. In silico searches for compounds from Indonesian natural materials with potential anti-malarial activity using virtual screening are urgently required, because this method can reduce the cost and time of in vitro and in vivo experiments [14].

Developing a supercomputer as the main high-performance computing resource poses a particular problem for the research community, especially for research institutes with limited budgets and for developing countries whose national budgets do not provide adequate research funding. One alternative is to use cluster, grid, and GPU computing environments. These technologies are among the best options for researchers who need high-performance computing resources for drug design experiments.

In our previous work, we ran simulations on simple compounds or molecules taken from the literature and from our Medicinal Plants Database portal. Initially, we used GROMACS on our cluster computing environment and obtained a significant speed-up in experiments with five nodes [27]. We also ran MD with GROMACS on a cluster computing environment named Cluster05 and on computing facilities equipped with a GPU (Graphics Processing Unit), which gave a speed-up of 11 to 12 times [23]. However, our preliminary experiments using MD with AMBER on the cluster showed only a modest speed-up [22]. In this work, therefore, we intend to improve the performance of MD simulations with AMBER in a GPU computing environment in order to generate the conformational ensemble of PfENR, an important enzyme in Plasmodium falciparum. Furthermore, we plan to run virtual screening of all the molecules in our portal database against PfENR as the target, seeking inhibitors from which to build a drug candidate for malaria.

Malaria is a life-threatening disease caused by parasites that are transmitted to humans through the bite of an infected female Anopheles mosquito or through blood transfusion. In 2008, there were more than 500 million cases of malaria in the world and about 2 million people died [25]. The disease is caused by four species of Plasmodium: Plasmodium falciparum, Plasmodium ovale, Plasmodium malariae, and Plasmodium vivax. Of these four species that cause malaria in humans, Plasmodium falciparum causes the most severe and fatal form, commonly called cerebral malaria.

Antimalarial drugs widely used in Indonesia include quinine, primaquine, chloroquine, and pyrimethamine-sulfadoxine. Quinine is a cinchona alkaloid antimalarial that acts as a blood schizontocide in humans and as a gametocide against Plasmodium vivax and Plasmodium malariae. It is an alternative antimalarial for radical treatment of Plasmodium falciparum that is resistant to chloroquine and pyrimethamine-sulfadoxine (multi-drug resistance) [29]. Currently, most Plasmodium falciparum is resistant to the existing antimalarial drugs. This resistance arises from spontaneous mutations that alter the structure and activity of the drug targets in the malaria parasite.

Searching for antimalarial drugs that act specifically on a parasite target is therefore very important. Specific targets at present include PfENR (Plasmodium falciparum Enoyl Acyl Carrier Protein Reductase), PM (plasmepsin), and FTase (farnesyl transferase). In the past decade, a potential new antimalarial target was found: the type II fatty acid biosynthesis pathway, which is known to operate in Plasmodium falciparum, with PfENR as the specific target [25]. PfENR is an enzyme that plays an important role in this pathway. It catalyzes the final step in the elongation cycle of fatty acid biosynthesis by reducing the carbon-carbon double bond in the enoyl moiety that is covalently bound to the acyl carrier protein [24].

Inhibitors of type II fatty acid synthesis in Plasmodium falciparum are sought in order to shorten the parasite's life cycle. Terminating chain elongation during synthesis leaves the Plasmodium unable to metabolize, and it dies. The fewer Plasmodium protozoa there are, the fewer will infect humans through the bite of the female Anopheles mosquito. Virtual screening is more efficient than other methods and requires relatively less time and lower cost.

In silico searches for compounds from Indonesian natural materials with potential antiplasmodial or anti-malarial activity using virtual screening are urgently required, because this method can reduce the cost and time required to search for or examine each compound directly by in vitro and in vivo methods.

In the era of designing new drugs, the availability of the Indonesian Medicinal Plants Database will become the basis for further investigation, particularly in the search for anti-malarials that target the enzyme PfENR. Cluster or grid computing is the best solution for speeding up the screening process; however, because infrastructure is limited and operational costs are high, an alternative is needed. An attractive option today is the use of computers equipped with Graphics Processing Unit (GPU) technology as a tool in drug development.

2. Literature Study

A GPU computing system is very fast: its bandwidth and computational power are approximately 10 times those of a CPU. Microbenchmark results are convincing. Elementary arithmetic instructions reach 472 GFLOPS on the 8800 Ultra GPU and 1581 GFLOPS on the newer-generation GTX 580, while basic memory bandwidth is 86 GB per second on the Tesla C870 GPU and 144 GB per second on the newer-generation Tesla C2050. Some applications gain substantially; for example, N-body computation can achieve 240 GFLOPS, which corresponds to 12 billion interactions per second. Case studies have been carried out on Molecular Dynamics (MD) and seismic data processing. The power of the GPU computing environment has also been demonstrated in the analysis of protein-protein interactions [2].
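The N-body figures quoted above imply a fixed floating-point cost per pairwise interaction. The short check below makes that arithmetic explicit; the variable names are ours, and the result is simply the ratio of the two quoted numbers.

```python
# Figures quoted in the text for the N-body benchmark.
sustained_flops = 240e9          # 240 GFLOPS
interactions_per_second = 12e9   # 12 billion pairwise interactions per second

# Implied floating-point operations per pairwise interaction.
flops_per_interaction = sustained_flops / interactions_per_second
print(f"{flops_per_interaction:.0f} floating-point operations per interaction")
```

On these figures, each pairwise interaction accounts for 20 floating-point operations.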

General-Purpose Programming on GPUs (GPGPU) is the development of non-graphical applications on the GPU. Initially, this technique was quite complicated: the problem had to be treated as a graphics problem, data had to be mapped into images (texture maps), and algorithms had to be adapted to image synthesis [9]. GPGPU developed rapidly into GPU computing after NVIDIA launched the Compute Unified Device Architecture (CUDA) programming tools in 2007. CUDA allows a very significant leap in computational performance by moving computations that run serially on the CPU to massively parallel execution using thousands of threads on the hundreds or even thousands of cores of the GPU. With continued strong support from NVIDIA, and with system installation costs much lower than those of supercomputers and clusters, GPU computing via CUDA continues to spread rapidly to almost all areas that require high-performance computing.

Libraries are provided for a programming language so that an application can call existing subprograms instead of the programmer writing them repeatedly; the programmer simply uses the pre-defined subprograms in the library. Many libraries have been built on CUDA, such as CUBLAS (Basic Linear Algebra Subprograms in CUDA) and CUFFT (Fast Fourier Transform in CUDA) [12].

Researchers have developed GPU applications in medicine and other research fields. One example is the use of GPU computing to improve the performance of the Markov clustering algorithm for protein-protein interaction networks [2]. Another is breast cancer scanning, which needs a machine that can produce highly detailed images very quickly: TechniScan, a developer of automated ultrasound imaging systems, has ported its CPU-based implementation to CUDA™ and NVIDIA® Tesla™ GPUs [6].

We have run simulations on simple compounds or molecules taken from the literature and from our portal. Initially, we used GROMACS on a cluster computing environment and obtained significant results in trials with five nodes [27]. We also ran GROMACS in a cluster computing environment named Cluster05 and on computing facilities equipped with a GPU (graphics processing unit), which gave a speed-up of 11 to 12 times [23]. However, preliminary experiments using AMBER on the cluster showed only a modest speed-up [22]. In this work, therefore, we want to improve the performance of MD simulations using AMBER in a GPU computing environment to generate the PfENR conformational ensemble. Given the conformational ensemble, we will then conduct virtual drug screening to find anti-malarial candidates among the molecules in the Indonesian Medicinal Plants Database [28].

Plasmodium falciparum Enoyl Acyl Carrier Protein Reductase (PfENR) is located in the apicoplast, an organelle that is the site of several metabolic pathways in Plasmodium falciparum, one of which is fatty acid biosynthesis. The biosynthesis of fatty acids is essential for living organisms: fatty acids are a major component of cell membranes and are essential for energy needs. In Plasmodium falciparum, fatty acid biosynthesis is also required for cell growth, cell division, and homeostasis. Fatty acid biosynthesis increases during the erythrocytic phase, when the parasites grow and divide very quickly [25].

Fatty acid biosynthesis in Plasmodium falciparum is of type II. One of the enzymes involved is PfENR, a key enzyme in the type II fatty acid biosynthesis pathway [15]. It is involved in the final reduction step of fatty acid elongation. It has a unique advantage as a target because it does not exist in humans, only in a few specific bacteria and protozoa. It is therefore a good target for virtual screening [18].

Virtual screening is the computational, or in silico, analog of biological screening. Its purpose is to score, rank, or filter a set of one or more structures using computational procedures. Virtual screening is used to help decide which compounds to screen or to synthesize [16]. The WISDOM project ran virtual screening for anti-malarials on a grid infrastructure of 1,700 CPUs in 15 countries [13].

Virtual screening based on a collection (ensemble) of protein conformations, obtained from crystal structures, NMR studies, or molecular dynamics simulations, gives better results [14]. A conformational ensemble is necessary because a protein is flexible and can undergo thermodynamic folding and unfolding [11], so virtual screening against a single structure is not sufficient.

3. Objectives of the activities

This work aims to develop competence in inexpensive HPC technology to support research computing that processes large amounts of data and has long processing times. It also aims to obtain the conformational ensemble of Plasmodium falciparum Enoyl Acyl Carrier Protein Reductase (PfENR) from molecular dynamics simulations, making maximum use of the GPU as a computing environment. A GPU-based HPC environment can be relied upon by research that requires high computing power at low cost. Information on plant species or compounds with the potential to guide the development of PfENR inhibitors against Plasmodium falciparum is urgently needed. Virtual screening protocols are established on the basis of the target molecule's conformational ensemble generated on the GPU. This would be one of the cheap and quick solutions in drug design research.

4. The methodology

Our work combines a literature study, preparation of the GPU computing environment, simulations, and molecular modeling.

Preparation of the Graphics Processing Unit computing environment

Performing molecular dynamics simulations of proteins in a GPU-based computing environment requires hardware that can accommodate the GPU card. The supporting software is the CUDA toolkit, CUDA SDK, OpenMM, and Amber 11. For hardware, we investigate which motherboards match suitable GPUs that are available in the local market. Our previous experience showed that high-end GPU types are not yet available locally and must be imported. For software, the AMBER 11 and CUDA parameters should be adjusted, in particular to a suitable grid size and number of threads.
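As a rough illustration of what choosing a grid size and thread count involves, the sketch below computes how many thread blocks a CUDA launch would need if one thread were assigned to each atom. Both the one-thread-per-atom mapping and the block size of 256 are assumptions made for illustration; the text does not state how AMBER 11 configures its kernels. The atom counts are those reported for the four proteins in Section 5.

```python
THREADS_PER_BLOCK = 256  # assumed block size, not taken from AMBER

# Atom counts reported for the benchmark proteins in Section 5.
atom_counts = {
    "myoglobin": 2492,
    "DHFR": 23558,
    "Ras-Raf": 42193,
    "PfENR": 37873,
}


def blocks_needed(n_items: int, threads_per_block: int) -> int:
    """Smallest number of blocks whose threads cover n_items (ceiling division)."""
    return (n_items + threads_per_block - 1) // threads_per_block


# Hypothetical mapping: one thread per atom.
grid_sizes = {name: blocks_needed(n, THREADS_PER_BLOCK) for name, n in atom_counts.items()}
for name, blocks in grid_sizes.items():
    print(f"{name}: {blocks} blocks of {THREADS_PER_BLOCK} threads")
```

The ceiling division matters because the last block is usually only partly filled; a kernel launched this way would need a bounds check so that the surplus threads do no work.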


Search and Download of Protein Structures

Macromolecular structures of the PfENR target were sought on the PDB (Protein Data Bank) website. Macromolecules were selected on the inclusion criteria that they be wild-type (non-mutant) and bound to a ligand. The exclusion criteria were a resolution greater than 2.5 Å and an incomplete chain. The macromolecules were downloaded in text format and then saved in PDB format for further processing.
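The selection rules above can be written as a simple filter. The following is a minimal sketch, assuming each candidate structure has already been annotated by hand with the four properties the criteria mention; the `PdbEntry` record, its field names, and the placeholder identifiers are hypothetical and do not correspond to real PDB entries or to any PDB API.

```python
from dataclasses import dataclass

MAX_RESOLUTION_ANGSTROM = 2.5  # exclusion threshold stated in the text


@dataclass
class PdbEntry:
    pdb_id: str           # placeholder identifier, not a real PDB code
    is_wild_type: bool    # True if the macromolecule is non-mutant
    has_ligand: bool      # True if the structure is bound to a ligand
    resolution: float     # resolution in angstroms
    chain_complete: bool  # False if part of the chain is missing


def passes_criteria(entry: PdbEntry) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    included = entry.is_wild_type and entry.has_ligand
    excluded = entry.resolution > MAX_RESOLUTION_ANGSTROM or not entry.chain_complete
    return included and not excluded


# Hypothetical candidates for illustration only.
candidates = [
    PdbEntry("XXX1", True, True, 2.1, True),
    PdbEntry("XXX2", True, True, 2.8, True),   # resolution too coarse
    PdbEntry("XXX3", False, True, 1.9, True),  # mutant
    PdbEntry("XXX4", True, True, 2.5, False),  # incomplete chain
]

selected = [entry.pdb_id for entry in candidates if passes_criteria(entry)]
print(selected)
```

Because the exclusion rule is "greater than 2.5 Å", a structure at exactly 2.5 Å with a complete chain is retained.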

5. Experiment results

Before performing molecular dynamics simulations on the PfENR enzyme, we first ran experiments on other proteins. The experiments covered four proteins: myoglobin, dihydrofolate reductase (DHFR), Ras-Raf, and PfENR, with 2,492, 23,558, 42,193, and 37,873 atoms, respectively. Each protein's molecular dynamics simulation was run for 100 picoseconds (ps, 1 ps = 10^-12 seconds), 200 ps, 300 ps, 400 ps, and 500 ps in the GPU computing environment. Table 1 lists the specifications, obtained from the NVIDIA site, of the GPUs used in the simulations.

| Description | GeForce GTX465 | GeForce GTX470 | GeForce GTX560 | GeForce GTX680 | GeForce GTX780 |
|---|---|---|---|---|---|
| Processor Clock | 1215 MHz | 1215 MHz | 1620-1900 MHz | 1006 MHz (base clock), 1058 MHz (boost clock) | 863 MHz (base clock), 900 MHz (boost clock) |
| CUDA Cores | 352 | 448 | 336 | 1536 | 2304 |
| Memory | 1024 MB GDDR5 | 1280 MB GDDR5 | 1024 MB GDDR5 | 2048 MB GDDR5 | 3072 MB GDDR5 |
| Memory Clock | 1603 MHz | 1674 MHz | 2002-2200 MHz | n/a | n/a |
| Memory Interface Width | 256-bit | 320-bit | 256-bit | 256-bit | 384-bit |
| Memory Bandwidth | 102.6 GB/sec | 133.9 GB/sec | 128 GB/sec | 192.2 GB/sec | 288.4 GB/sec |
| DirectX | 11 | 11 | 11 | 11 | 11 |
| CUDA Support | Yes | Yes | Yes | Yes | Yes |
| Bus Type | PCI-E 2.0 x16 | PCI-E 2.0 | PCI-E 2.0 x16 | PCI Express 3.0 | PCI Express 3.0 |
| Height | 4.3 inches | 4.3 inches | 4.3 inches | 4.376 inches | 4.376 inches |
| Length | 9.5 inches | 9.5 inches | 8.25 inches | 10.0 inches | 10.5 inches |
| Width | Dual slot | Dual slot | Dual slot | Dual slot | Dual slot |
| Power Requirement | 200 W | 215 W | 150 W | 195 W | 250 W |
| Recommended Power Supply | 550 W | 550 W | 450 W | 550 W | 600 W |
| Supplementary Power Connector | 6-pin x 2 | 6-pin x 2 | 6-pin x 2 | 6-pin x 2 | One 8-pin, one 6-pin |
| Maximum Temperature | 105 °C | 105 °C | 99 °C | 98 °C | 95 °C |

Table 1. NVIDIA consumer GPU graphics card specifications


Table 2 gives the specifications of the two computers that host the GTX465 and GTX470 GPUs, Table 3 those of the computer with the GTX560, and Table 4 those of the computer with the GTX 680 and GTX 780. All the computers run Ubuntu version 10.04.

Table 2. GPU GTX465 and GTX470 Computer Specifications

Table 3. GPU GTX560 Computer Specifications

| Computer Specification |
|---|
| CORSAIR TX V2 Series [TX750] 750W, Active PFC, Single Rail +12V |
| INTEL Processor Core [i7-3770] Quad Core, 3.4 GHz, 77W, 8MB Cache, Integrated Intel HD Graphics 4000, Socket LGA1155 |
| CORSAIR Memory PC 2 x 8GB DDR3 PC-12800: 2 x 8GB kit, DDR3, 1600MHz, Dual Channel XMP Profile, PC-12800 |
| Asus Motherboard [P8Z77-I DELUXE] Socket LGA1155, Intel® Z77, DDR3 Dual Channel, PCI-e 3.0 x16, SATA III, USB 3.0, Audio |

Table 4. GPU GTX680 and GTX780 Computer Specifications

5.1. Performance Analysis of Myoglobin Protein

The MD simulation results for the myoglobin protein are given in Table 5. Each entry is the CPU time in seconds, and the values in parentheses are the speed-up achieved by each GPU-equipped computer relative to the simulation on the cluster with 16 processors.

Table 5. Execution time (seconds) for the myoglobin simulation

| Simulation length (ps) | Cluster05 (16 CPUs) | GTX560 | GTX465 | GTX470 | GTX680 | GTX780 |
|---|---|---|---|---|---|---|
| 100 | 9076.67 | 361.20 (25.12) | 340.55 (26.65) | 274.41 (33.07) | 115.98 (78.26) | 87.94 (103.21) |
| 200 | 17959.07 | 723.06 (24.83) | 682.57 (26.31) | 548.98 (32.71) | 231.09 (77.71) | 175.69 (102.22) |
| 300 | 26821.47 | 1083.08 (24.76) | 1025.15 (26.16) | 822.77 (32.59) | 347.44 (77.20) | 264.34 (101.47) |
| 400 | 35773.05 | 1444.22 (24.76) | 1367.99 (26.15) | 1095.00 (32.66) | 462.93 (77.28) | 353.34 (101.24) |
| 500 | 44817.55 | 1803.76 (24.84) | 1711.61 (26.18) | 1370.91 (32.69) | 575.24 (77.91) | 442.78 (101.22) |

Table 5 shows that MD simulation in the GPU computing environment outperformed the cluster computing environment. Using a GPU shortens the simulation time for the myoglobin protein by a factor of about 24 to more than 100. CPU time also differs among the GPU types: from fastest to slowest, the order is GTX780, GTX680, GTX470, GTX465, and GTX560.
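The speed-up values in parentheses in Table 5 are the cluster time divided by the GPU time. The sketch below recomputes them for the 100 ps and 500 ps rows; the dictionary layout and names are ours, and the times are copied from Table 5. The recomputed values agree with the table to within 0.01 (some table cells appear to be truncated rather than rounded).

```python
# Execution times in seconds, copied from the 100 ps and 500 ps rows of Table 5.
cluster_seconds = {100: 9076.67, 500: 44817.55}
gpu_seconds = {
    "GTX560": {100: 361.20, 500: 1803.76},
    "GTX465": {100: 340.55, 500: 1711.61},
    "GTX470": {100: 274.41, 500: 1370.91},
    "GTX680": {100: 115.98, 500: 575.24},
    "GTX780": {100: 87.94, 500: 442.78},
}

# Speed-up = time on the 16-CPU cluster / time on the GPU, for the same run length.
speedups = {
    gpu: {ps: cluster_seconds[ps] / seconds for ps, seconds in times.items()}
    for gpu, times in gpu_seconds.items()
}

for gpu, by_length in speedups.items():
    cells = ", ".join(f"{ps} ps: {value:.2f}x" for ps, value in by_length.items())
    print(f"{gpu}: {cells}")
```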

The GTX 780 has the most CUDA cores, 2304. It is followed by the GTX680 (1536 CUDA cores), GTX470 (448 CUDA cores), GTX465 (352 CUDA cores), and finally the GTX560 (336 CUDA cores). Relating the number of CUDA cores to the MD simulation execution (CPU) time, the more cores a GPU has, the shorter the simulation's execution time.
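The relation between core count and execution time can be checked directly against the reported numbers. The sketch below, with names of our choosing, sorts the GPUs by the CUDA core counts in Table 1 and tests whether the 500 ps myoglobin times in Table 5 fall as the core count rises; it also prints each GPU's core count and speed relative to the GTX560.

```python
# (CUDA cores from Table 1, 500 ps myoglobin time in seconds from Table 5)
gpus = {
    "GTX560": (336, 1803.76),
    "GTX465": (352, 1711.61),
    "GTX470": (448, 1370.91),
    "GTX680": (1536, 575.24),
    "GTX780": (2304, 442.78),
}

# Order the GPUs from fewest to most CUDA cores.
by_cores = sorted(gpus, key=lambda name: gpus[name][0])
times_in_core_order = [gpus[name][1] for name in by_cores]

# "More cores, shorter time" holds if the times strictly decrease in this order.
monotonic = all(a > b for a, b in zip(times_in_core_order, times_in_core_order[1:]))
print("time decreases with core count:", monotonic)

base_cores, base_seconds = gpus["GTX560"]
for name in by_cores:
    cores, seconds = gpus[name]
    print(f"{name}: {cores / base_cores:.2f}x cores, {base_seconds / seconds:.2f}x faster")
```

On these numbers the ordering is monotonic but not proportional: the GTX780 has about 6.9 times the cores of the GTX560 and runs about 4.1 times faster. The cards also differ in clock rate and memory bandwidth (Table 1), so core count is unlikely to be the only factor.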


5.2. Performance Analysis of DHFR Protein

The DHFR protein simulation results are in Table 6. Each entry is the CPU time in seconds, and the values in parentheses are the speed-up achieved relative to the simulation on the 16-processor cluster, given to two decimal places.

Table 6. Execution time (seconds) for the DHFR MD simulation

| Simulation length (ps) | Cluster05 (16 CPUs) | GTX560 | GTX465 | GTX470 | GTX680 | GTX780 |
|---|---|---|---|---|---|---|
| 100 | 20932.76 | 670.87 (31.20) | 667.87 (31.34) | 533.07 (39.26) | 274.03 (76.39) | 204.42 (102.40) |
| 200 | 41900.07 | 1342.16 (31.21) | 1335.23 (31.38) | 1066.26 (39.29) | 547.79 (76.49) | 408.89 (102.47) |
| 300 | 61360.69 | 2012.60 (30.48) | 2002.04 (30.64) | 1601.95 (38.30) | 819.62 (74.86) | 609.77 (100.63) |
| 400 | 83912.49 | 2683.87 (31.26) | 2666.69 (31.46) | 2135.44 (39.29) | 1091.16 (76.90) | 811.65 (103.39) |
| 500 | 102855.07 | 3354.70 (30.65) | 3333.76 (30.85) | 2666.14 (38.57) | 1362.11 (75.51) | 1014.14 (101.42) |

Based on Table 6, the GPU computing environment also outperforms the cluster computing environment in the DHFR protein simulations. Compared with the myoglobin results, the speed up achieved by the GTX560, GTX465, and GTX470 increases to about 30 to 39 times. Furthermore, Table 6 shows the same performance order of the GPUs as in the myoglobin simulations: the GTX780 has the lowest CPU time, followed by the GTX680, GTX470, GTX465, and GTX560. The DHFR simulation takes longer than the myoglobin simulation because DHFR has more atoms than myoglobin.
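The speed up values in parentheses in Table 6 are the cluster time divided by the GPU time. As a hedged illustration (a sketch with assumed variable names, not code used in the study), the 100 ps row can be recomputed in Python as follows:

```python
# 100 ps row of Table 6: CPU time in seconds.
cluster_time = 20932.76
gpu_time = {"GTX560": 670.87, "GTX465": 667.87, "GTX470": 533.07,
            "GTX680": 274.03, "GTX780": 204.42}

# Speed up values printed in parentheses in Table 6 for the same row.
reported = {"GTX560": 31.20, "GTX465": 31.34, "GTX470": 39.26,
            "GTX680": 76.39, "GTX780": 102.40}

# Speed up = cluster time / GPU time.
speedup = {gpu: cluster_time / t for gpu, t in gpu_time.items()}

for gpu, s in speedup.items():
    print(f"{gpu}: computed {s:.2f}, reported {reported[gpu]:.2f}")
```

The recomputed values agree with the reported ones to within 0.01. The small last-digit difference for the GTX470 may come from truncating rather than rounding the second decimal place, which is an assumption about how the table was prepared.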

5.3. Performance Analysis for Ras-Raf and PfENR protein

MD simulation results for the Ras-Raf and PfENR proteins are shown in Table 7. For comparison purposes, the results of the other proteins for the 500 ps simulation are included. For Ras-Raf and PfENR we omit the performance of the cluster, which is again the worst one. Each entry in the table is the CPU time in seconds. The value in parentheses is the speed up achieved by the GPU relative to the cluster, rounded to two decimal places, while the value in parentheses marked with two stars (**) is the speed up of that GPU relative to the slowest GPU, the GTX 560.

Table 7. Execution time (seconds) of the 500 ps simulation for various proteins and computing environments

| Protein | Cluster05 (16 CPUs) | GTX560 | GTX465 | GTX470 | GTX680 | GTX780 |
|---|---|---|---|---|---|---|
| Myoglobin | 44817.55 | 1803.76 (24.84) | 1711.61 (26.18) (1.05)** | 1370.91 (32.69) (1.32)** | 575.24 (77.91) (3.14)** | 442.78 (101.22) (4.07)** |
| DHFR | 102855.07 | 3354.70 (30.65) | 3333.76 (30.85) (1.01)** | 2666.14 (38.57) (1.26)** | 1362.11 (75.51) (2.46)** | 1014.14 (101.42) (3.31)** |
| Ras-Raf | - | 3676.07 | 3237.19 (1.14)** | 2586.24 (1.42)** | 1588.96 (2.31)** | 1142.09 (3.22)** |
| PfENR | - | 3834.53 | 3078.30 (1.25)** | 2465.82 (1.56)** | 1488.38 (2.58)** | 1068.50 (3.59)** |
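The double-starred entries in Table 7 are the GTX560 time divided by the time on the other GPU. A minimal Python sketch of that calculation for the Ras-Raf and PfENR rows is given here; the function name `relative_speed` and the data layout are illustrative assumptions.

```python
# 500 ps CPU times in seconds, taken from Table 7.
times = {
    "Ras-Raf": {"GTX560": 3676.07, "GTX465": 3237.19, "GTX470": 2586.24,
                "GTX680": 1588.96, "GTX780": 1142.09},
    "PfENR": {"GTX560": 3834.53, "GTX465": 3078.30, "GTX470": 2465.82,
              "GTX680": 1488.38, "GTX780": 1068.50},
}


def relative_speed(row, baseline="GTX560"):
    """Return the baseline time divided by each other GPU's time, rounded to 2 places."""
    return {gpu: round(row[baseline] / t, 2)
            for gpu, t in row.items() if gpu != baseline}


for protein, row in times.items():
    print(protein, relative_speed(row))
```

Choosing the slowest GPU as the baseline keeps every ratio above 1, so the values read directly as "times faster than the GTX560".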

The simulation results for the Ras-Raf protein show that the performance order among the GPUs is unchanged from the previous simulations. The minimum CPU time is obtained on the computer with the GTX780 GPU, followed by the GTX680, GTX470, GTX465, and GTX560, which is the slowest. Relative to the GTX560, the GTX680 reduces the CPU time by a factor of about 2.3 and the GTX780 by a factor of about 3.2.

The 500 ps simulation of the Ras-Raf protein takes longer than the simulations of DHFR and myoglobin because Ras-Raf has more atoms (42,193).

Based on Table 7, the speed up of each GPU in the PfENR simulation does not change significantly compared with the Ras-Raf protein simulation. The computer with the GTX780 still ranks first, with the shortest time and the highest speed up, and the performance order among the GPUs is the same as in the Ras-Raf protein simulation.

The 500 ps MD simulation of the PfENR protein takes longer than the simulations of DHFR and myoglobin. This is due to the larger number of atoms in PfENR, about 37,873. Compared with the Ras-Raf simulation, the PfENR simulation should take less time because PfENR has fewer atoms than Ras-Raf. However, on the GTX 560 the PfENR simulation takes longer. This may be because the GTX 560 has fewer CUDA cores than the other GPUs.
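The comparison between PfENR and Ras-Raf can be made explicit per GPU. The short Python sketch below (an illustration with assumed variable names, using the 500 ps times from Table 7) lists the GPUs on which the smaller PfENR system nevertheless runs longer than Ras-Raf.

```python
# 500 ps CPU times in seconds, taken from Table 7.
ras_raf = {"GTX560": 3676.07, "GTX465": 3237.19, "GTX470": 2586.24,
           "GTX680": 1588.96, "GTX780": 1142.09}
pfenr = {"GTX560": 3834.53, "GTX465": 3078.30, "GTX470": 2465.82,
         "GTX680": 1488.38, "GTX780": 1068.50}

# GPUs where PfENR (fewer atoms) takes longer than Ras-Raf (more atoms).
slower_on = [gpu for gpu in pfenr if pfenr[gpu] > ras_raf[gpu]]

for gpu in pfenr:
    diff = pfenr[gpu] - ras_raf[gpu]
    print(f"{gpu}: PfENR - Ras-Raf = {diff:+.2f} s")
print("PfENR slower on:", slower_on)
```

With the reported times, only the GTX560 appears in the list, which matches the exception described above.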

6. Conclusion

The experimental results showed that in the MD simulation of myoglobin, the longest process occurred on the cluster environment. The GTX 560, GTX 465, GTX 470, GTX 680 and GTX 780 achieved speed ups of about 24, 26, 32, 77 and 101 times relative to the cluster, respectively. Among the GPUs, the slowest is the GTX 560, and the relative speeds of the GTX 465, GTX 470, GTX 680 and GTX 780 are 1.05, 1.32, 3.14, and 4.07, respectively (Table 7). The MD results for the DHFR protein show that the longest process is again on the cluster environment, and the speed ups of the GPUs in the same order are about 31, 31, 39, 77 and 103, respectively. Relative to the GTX 560, the speed ups of the other GPUs are 1.01, 1.26, 2.46, and 3.31. In the MD results for the Ras-Raf protein, the speed ups of the other GPUs relative to the GTX 560, in the same order, are 1.14, 1.42, 2.31, and 3.22, respectively. Finally, in the MD results for the PfENR protein, the speed ups in the same GPU order are 1.25, 1.56, 2.58 and 3.59.

7. References

[1] S. R. Alam et al., “Experimental evaluation of molecular dynamics simulations on multi-core systems,” in Proceedings of the 15th International Conference on High Performance Computing, Bangalore, India, 17-20 December, 2008.

[2] Alhadi Bustamam, Kevin Burrage, Nicholas A. Hamilton, "Fast Parallel Markov Clustering in Bioinformatics Using Massively Parallel Computing on GPU with CUDA and ELLPACK-R Sparse Format," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 3, pp. 679-692, May-June 2012, doi:10.1109/TCBB.2011.68

[3] T. Amisaki and S. Fujiwara, “Grid-enabled applications in molecular dynamics simulations using a cluster of dedicated computers,” in Proceedings of Applications and the Internet Workshops, pp. 616-622, 26-30 January, 2004.

[4] “AMBER 11 NVIDIA GPU Acceleration Support”, n.d., [Online]. Available: http://ambermd.org/gpus. [Last Access 19 July 2010]. (2010a)

[5] “AMBER Home Page”, 5 June 2010, [Online]. Available: http://ambermd.org. [Last Access 19 July 2010] (2010b)

[6] Cuda Medicine, http://www.nvidia.co.uk/object/cuda_medical_uk.html [accessed 13 Feb 2010] (2010c)

[7] Braz, VA (2010). Binding of the nonnucleoside Reverse Transcriptase Inhibitors Efavirenz to HIV-1 Reverse Transcriptase monomers and Dimers. Case Western Reserve University.

[8] Buck I (2007), Cuda Programming, the International Conference for High Performance Computing, Networking, Storage and Analysis, http://sc07.supercomputing.org/, 2007

[9] Case, D. A., T. A. Darden, T. E. Cheatham, III, C. L. Simmerling, J. Wang, R. E. Duke, R. Luo, M. Crowley, R. C. Walker, W. Zhang, K. M. Merz, B. Wang, S. Hayik, A. Roitberg, G. Seabra, I. Kolossváry, K. F. Wong, F. Paesani, J. Vanicek, X. Wu, S. R. Brozell, T. Steinbrecher, H. Gohlke, L. Yang, C. Tan, J. Mongan, V. Hornak, G. Cui, D. H. Mathews, M. G. Seetin, C. Sagui, V. Babin, and P. A. Kollman, “AMBER 10”, University of California, San Francisco, 2008, [Online]. Available: http://www.lulu.com/content/paperback-book/amber-10-users-manual/2369585. [Accessed 11 June 2010]

[10] Cotelle, P. Patented HIV-1 Integrase Inhibitors (1998-2005). Recent Patents on Anti-infective Drug Discovery, 1-15. 2, 2006.

[11] Daura, X., van Gunsteren, W., Mark, A. E., Folding-unfolding thermodynamics of a beta-heptapeptide from equilibrium simulations, Proteins (1999) 34(3) 269-80

[12] Fatica, M, CUDA Libraries, the International Conference for High Performance Computing, Networking, Storage and Analysis, 2007, http://sc07.supercomputing.org/

[13] Jacq, N., Salzemann, J., Jacq, F., Legré, Y., Medernach, E., Montagnat, J., Maaß, A., Reichstadt, M., Schwichtenberg, H., Sridhar, M., Kasam, V., Zimmermann, Hofmann, M. & Breton, V., Grid-enabled Virtual Screening Against Malaria, J Grid Computing (2008) 6: 29-43. DOI: 10.1007/s10723-007-9085-5

[14] Amaro, R.E., Li, W.W., Emerging Methods for Ensemble-Based Virtual Screening, Curr Topic Med Chem, (2010), 10(1): 3-13. DOI: 10.2174/156802610790232279

[15] Karioti, Anastasia., et al. Inhibiting enoyl-ACP reductase (FabI) across pathogenic microorganisms by linear sesquiterpene lactones from Anthemis auriculata. Phytomedicine 15: 1125–1129, 2008.

[16] Leach, Andrew R., Shoichet, Brian K., and Peishoff, Catherine E., Prediction of Protein-Ligand Interactions. Docking and Scoring: Successes and Gaps. Journal of Medicinal Chemistry, 49 (20), 5851-5855, DOI: 10.1002/chin.200650271, 2006.

[17] Lin, P., “Introduction to the AMBER Molecular Dynamics Package”, Tutorial from the Materials Simulation Center, The Pennsylvania State University, 30 April 2008. [Online]. Available: http://msc.psu.edu/tutorials/IntroAmber_course.pdf. [Accessed 17 June 2010].

[18] Morde, Varun., et al. (2009). Molecular modeling studies, synthesis, and biological evaluation of Plasmodium falciparum enoyl-acyl carrier protein reductase (PfENR) inhibitors. Mol Divers, 13:501–517

[19] Morris. “AutoDock's role in Developing the First Clinically-Approved HIV Integrase Inhibitor”, 17 December 2007, [Online]. http://autodock.scripps.edu/news/autodocks-role-in-developing-the-first-clinically-approved-hiv-integrase-inhibitor. [accessed 20 July 2010].

[20] NUS, “Customising drugs to suit individuals”, 20 January 2009, [Online]. http://www.nus.edu.sg/research/rg100.php. [accessed 20 July 2010].

[21] Onufriev, A., D. Bashford, and D. A. Case, “Modification of the generalized born model suitable for macromolecules,” The Journal of Physical Chemistry B, vol. 104, no. 15, pp. 3712-3720. [Online]. Available: http://dx.doi.org/10.1021/jp994072s, April 2000.

[22] Heru Suhartanto, Arry Yanuar, MH Hilman, Ari Wibisono and Toni Dermawan, Performance Analysis, Cluster Computing Environments on Molecular Dynamics Simulation of LOX-RAD GTPase and Curcumin Molecules with AMBER, International Journal of Computer Science Issues, Vol 9, issue 2, March, 2012.

[23] H. Suhartanto, A. Yanuar, A. Wibisono, “Performance Analysis Cluster and GPU Computing Environment on Molecular Dynamic Simulation of BRV-1 and REM2 with GROMACS”, International Journal of Computer Science Issues, vol. 8, no. 3, May 2011.

[24] Surolia, Namita., et al. (2006). Novel diphenyl ethers: design, docking studies, synthesis and inhibition of enoyl ACP reductase of Plasmodium falciparum and Escherichia coli. Bioorganic & Medicinal Chemistry 14: 8086–8098

[25] Tasdemir, Deniz. (2006). Type II Fatty Acid Biosynthesis, A New Approach In Antimalarial Natural Product Discovery. Phytochemistry Reviews, 5: 99-108

WHO. (2010). Last accessed 11 January 2011 at 13:55 from http://www.who.int/topics/malaria/en/

[26] V. Pande, “Folding@HOME”, 2010, Internet: http://folding.stanford.edu.

[27] Wibisono, Arry Yanuar, Heru Suhartanto, “Performance Analysis of Curcumin Molecular Dynamics Simulation using GROMACS on Cluster Computing Environment”, in Proceedings of International Conference on Advanced Computer Science and Information System, Bali, 2010.

[28] Arry Yanuar, Abdul Mun'im, Bertha Akma Aprima Lagho, Rezi Riadhi Syahdi, Marjuqi Rahmat and Heru Suhartanto, Medicinal Plants Database and Three Dimensional Structure of the Chemical Compounds from Medicinal Plants in Indonesia, IJCSI, Volume 8, Issue 5, September 2011.

[29] Zein, Umar. (2005). Penanganan Terkini Malaria Falciparum. Downloaded on 11 January 2011 at 14:01 from http://repository.usu.ac.id/bitstream/123456789/3372/1/penydalam-umar6.pdf
