Guest editorial Cluster computing - Gridbus and Cloudbus · Cluster computing Rajkumar Buyyaa,1, ... Cluster computing has become a hot topic of research among academic and industry

Future Generation Computer Systems 18 (2002) v–viii

Guest editorial

Cluster computing

Rajkumar Buyyaa,1, Hai Jinb,∗,2, Toni Cortesc,3a School of Computer Science and Software Engineering, Monash University, Melbourne, Australiab College of Computer, Huazhong University of Science and Technology, Wuhan 430074, PR China

c Department d’Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain

Cluster computing can be described as a fusion ofthe fields of parallel, high-performance, distributed,and high-availability computing. Cluster computinghas become a hot topic of research among academicand industry community including system designers,network developers, language designers, standardiz-ing forums, algorithm developers, graduate studentsand faculties. The use of clusters as computing plat-form is not just limited to scientific and engineeringapplications; there are many business applications thatcan benefit from the use of clusters. There are manyexciting areas of development in cluster computingwith new ideas as well as hybrids of old ones beingdeployed for production as well as research systems.The aim of this special issue is to bring together orig-inal and latest work from both academia and industryon various issues related to cluster computing.

This research, and development, is being done inmany areas but there are a few that are of specialinterest. The first one is, with no doubt, the network.Clusters are based on the communication betweennodes and designing fast and low latency networksis a must for clusters to become the configuration ofthe future. Plenty of work being done in this are in

∗ Corresponding author.E-mail addresses:[email protected] (R. Buyya),[email protected] (H. Jin), [email protected] (T. Cortes).

1 http://www.buyya.com.2 http://ceng.usc.edu/∼hjin.3 http://www.ac.upc.es/homes/toni.

both hardware (Infiniband, SCI, Myrinet, QSnet) andsoftware (Low latency protocols such as VIA).

Allowing heterogeneity in both hardware and soft-ware (OS) is extremely important and is becomingone of the key issues in cluster research. This het-erogeneity allows a better expandability (i.e. use oldand new components) and scalability and has to beaddressed at many levels. We need tools that allowus to develop applications for these configurations,as well as scheduling systems that offer mechanismsfor exploitation of heterogeneity and that deliverbetter application performance. The papers selectedfor inclusion in this special issue address theseissues.

File systems are also of great interest as clustersare being used for I/O intensive applications such ase-commerce, WEB servers or databases. This kind offile systems tight the I/O devices as well as offer highperformance and a single-system image. Achievingthese objectives is not easy and plenty of work is beingdevoted.

Clusters also provide an excellent platform forsolving a range of parallel and distributed applicationsin both scientific and commercial areas. For scientificapplication, clusters can be used in grand challengeor supercomputing applications, such as earthquakesor hurricanes prediction, complex crystallographicand microtomographic structural problems, pro-tein dynamics and biocatalysis, relativistic quantumchemistry of actinides, virtual materials design and

0167-739X/02/$ – see front matter © 2002 Published by Elsevier Science B.V.PII: S0167-739X(01)00053-X

vi R. Buyya et al. / Future Generation Computer Systems 18 (2002) v–viii

processing including crash simulations, and globalclimate modeling. For the commercial applications,cluster can be best used in e-commerce as superserver,which consolidate web server, ftp server, e-mail server,database server, etc. Clusters can also be used in datamining applications to provide the storage and datamanagement services for the data set being mined andcomputational services required by the data filtering,preparation and mining tasks. Other commercial ap-plications includes image rendering, network simula-tion, etc.

Due to the growing interest in cluster computing,the IEEE Task Force on Cluster Computing (TFCC)[8] was formed in early 1999. TFCC is acting as afocal point and guide to the current cluster comput-ing community and has been actively promoting thefield of cluster computing with the aid of a numberof novel projects. With the support of TFCC, a seriesof special sessions, workshop, symposiums and con-ferences were held to guide R&D work both in aca-demic and industrial settings. Further information onTFCC activities can be accessed from the web site:http://www.ieeetfcc.org/.

This special issue emerges from the seven bestpapers that we selected from the three conferencesin the area of cluster computing. They are the firstIEEE International Workshop/Conference on ClusterComputing4 (Melbourne, Australia), CC-TEA 2001(Las Vegas, USA), and the Asia–Pacific InternationalSymposium on Cluster Computing (Beijing, China).The last two meetings have been merged to createa major IEEE international conference series (calledCCGRID5 ) that focus on both cluster computingand grid6 technologies. Grid technology will helpin coupling multiple clusters in the same or differentorganizations to create computational grids (federatedor hyper clusters) for solving large-scale multidisci-plinary problems.

The papers that we have selected for inclusion inthis special issue have received the higher reviewratings for their research contributions. All the pa-pers are extended and revised to reflect the latestresearch achievements in various areas of clustercomputing. The topics of the papers cover different

4 http://www.clustercomp.org.5 http://www.ccgrid.org.6 http://www.gridcomputing.com.

aspects of cluster computing ranging from lightweight(low-latency and high-bandwidth) communicationprotocols to highly optimized sorting algorithms im-plemented on clusters. Topic areas covered includerapid application development environments, oper-ating system kernels, file systems, load-balancingmechanisms, communication subsystems, industrystandard oriented user-level communication inter-faces, and a number of parallel sorting schemes. Theworks presented have addressed mechanisms for han-dling heterogeneity in clusters, as this is an importantissue for handling design issues such as expandabilityof clusters.

The first paper entitled “MPI–Delphi: An MPI Im-plementation for Visual Programming Environmentsand Heterogeneous Computing” presents authors’ ef-forts to integrate MPI for parallel programming withDelphi visual programming environment. MPI–Delphiinterface makes it possible to manage a cluster of het-erogeneous PCs. This interface is also suitable forsome specific kind of programs, such as monitoringlong execution parallel programs, or computational in-tensive graphical simulations.

“PODOS — The Design and Implementation of aPerformance Oriented LINUX Cluster” reports thedesign issues of a performance-oriented operatingsystem, PODOS, which harness the performancecapabilities of a cluster-computing environment. PO-DOS added four new components to existing LINUXoperating system, they are Communication Man-ager, PODOS File System, Resource Manager, andGlobal Inter-Process Communication. The customdesigned communication protocol uses round-robinTransmission-Groups mechanism to multiplex pack-ets across multiple network interfaces. PODOS filesystem builds an efficient file sharing environmentson top of high-speed communication subsystem.

“On a Scheme for Parallel Sorting on Heteroge-neous Clusters” discusses the parallel sorting algo-rithms and their implementations suitable for clusterarchitectures in order to optimize cluster resources.By carefully studying the various sorting algorithmson heterogeneous cluster, the authors concluded thatthe software challenges to better utilize the heteroge-neous cluster platform are data decomposition tech-niques, scheduling and load balancing methods. Thealgorithm described in this paper combines very goodproperty for load balancing.

R. Buyya et al. / Future Generation Computer Systems 18 (2002) v–viii vii

“Cluster File Systems: A Case Study” presents ascalable single-image file system, called COSMOS,designed for Dawning 2000 superserver. COSMOSprovides location transparency and strong UNIXfile-sharing semantics. COSMOS uses serverless de-sign and introduces a dual-granularity cooperativecaching. Other characteristics of COSMOS includesheuristic replacement algorithm, network disk strip-ing, and distributed meta-data management. COS-MOS provides high I/O bandwidth for large files andgood I/O performance for small files and directoryreads.

“Load Balancing for Heterogeneous Clusters ofPCs” discusses the authors’ experience of using theasymmetric load balancing approach in heterogeneouscluster environments. By using the LU benchmarkfrom the NPB family, authors find that asymmetricload balancing approach is a general-purpose toolthat can be used with any data decomposed regularproblem, and with some extensions, it can be alsoused in irregular problem.

“Directed Point: A Communication Subsystem forCommodity Supercomputing with Gigabit Ethernet”studies the practical issues on the design of a newhigh performance communication subsystem, calledDirected Point (DP). The DP abstraction model de-picts the communication channels built among agroup of communicating processes. It supports bothpoint-to-point communication and various types ofgroup operations. The API of DP combines fea-tures from BSD sockets and MPI to facilitate thepeer-to-peer communication in a cluster. DP improvesthe communication performance by reducing protocolcomplexity through the use of directed message, byreducing the intermediate memory copies betweenprotocol layers through the use of token buffer pool,and by reducing the context switching and schedulingoverhead through the use of light-weight messagingcalls.

“VI Architecture Communication Features andPerformance on the Giganet Cluster LAN” presentsthe performance result study of Giganet cLAN VIarchitecture hardware implementation. VI architec-ture aims to close the performance gap between thebandwidths and latencies provided by the commu-nication hardware and visible to the application byminimizing the software overhead on the criticalpath of communication. The focus of this study is

to assess and compare the performance for differentVIA data-transfer modes and specific features thatare available to higher-level communication softwarelike MPI. The features investigated in this paperinclude the use of send/receive vs. RDMA data trans-fers, polling vs. blocking to check the completion ofcommunication operations, multiple VIs, completionqueues, and scatter capabilities of VIA.

From the above discussion, it is clear that the papersincluded in this special issue cover a broad range oftopics in cluster computing. We hope that the wholecluster computing community including researchers,developers, practitioners, and users find this specialissue of use and interest.

We would like to take this opportunity to thank ed-itors of the Future Generation Computer Systems, Pe-ter Sloot and Doutzen Abma for inviting us to editthis special issue. In fact, we are honored by their in-vitation. We congratulate all the authors whose papershave been included in this special issue and we thankthem for agreeing to extend their papers to reflect thelatest state of the art in their topical area. In the ref-erences section below, we have included pointers tofurther information [7], particularly a list of books[1–6] that have been published recently. We hope thatyou will find this special issue interesting and useful.Happy reading!

References

[1] R. Buyya (Ed.), High Performance Cluster Computing:Systems and Architectures, Vol. 1, 1st Edition, Prentice-Hall,Englewood Cliffs, NJ, 1999.

[2] R. Buyya (Ed.), High Performance Cluster Computing:Programming and Applications, Vol. 2, 1st Edition,Prentice-Hall, Englewood Cliffs, NJ, 1999.

[3] G. Pfister, In Search of Clusters, 2nd Edition, Prentice-Hall,Englewood Cliffs, NJ, 1998.

[4] K. Hwang, Z. Xu, Scalable Parallel Computing — Technology,Architecture, Programming, McGraw-Hill, New York, 1998.

[5] T. Sterling, J. Salmon, D. Becker, D. Savarrese, How to Builda Beowulf, MIT Press, Cambridge, MA, 1999.

[6] B. Wilkinson, M. Allen, Parallel Programming Techniquesand Applications Using Networked Workstations and ParallelComputers, Prentice-Hall, Englewood Cliffs, NJ, 1999.

[7] Cluster Computing Info Centre. http://www.buyya.com/cluster.

[8] IEEE Task Force on Cluster Computing. http://www.ieeetfcc.org.

viii R. Buyya et al. / Future Generation Computer Systems 18 (2002) v–viii

Rajkumar Buyya is an AustralianGovt. Research Scholar at the School ofComputer Science and Software Engi-neering, Monash University, Melbourne,Australia. He was awarded the DharmaRatnakara Memorial Trust Gold Medalfor his academic excellence during 1992by Mysore/Kuvempu University. He hasauthored three books,Microprocessorx86 Programming, Mastering C++, and

Design of PARAS Microkernel. He has edited a popular two-volume book onHigh Performance Cluster Computing: Architec-tures and Systems (Vol. 1); Programming and Application (Vol.2)published by the Prentice Hall, USA. He has edited proceedingsof six international conferences and served as guest editor formany research journals in the area of Parallel and DistributedComputing, Cluster Computing, and Grid Computing. He hascontributed to the development of system software for PARAMsupercomputers produced by the Centre for Development of Ad-vanced Computing, India. At Monash University, he is engagedin R&D on next generation Internet/Grid computing technologiesand its applications.

Rajkumar is a speaker in the IEEE Computer Society ChapterTutorials Program and Foundation Chair of the IEEE ComputerSociety Task Force on Cluster Computing (TFCC). He hasco-organised and chaired three major IEEE/ACM internationalconferences: IEEE TFCC International Cluster Computing Confer-ence (IEEE ClusterComp’99, Melbourne, Australia), ACM/IEEEInternational Workshop on Grid Computing (Grid 2000, Banga-lore, India), and IEEE/ACM International Symposium on ClusterComputing and the Grid (CCGrid 2001, Brisbane, Australia). Hehas chaired few other international events held in USA (CC-TEA,Las Vegas, USA, 1997-2000), China, and Germany that havemerged with IEEE conferences.

Rajkumar has lectured on advanced technologies such as Paral-lel, Distributed and Multithreaded Computing, Internet and Java,Cluster Computing, and Java and High Performance Computingin several international conferences and institutions located inKorea, Singapore, USA, Mexico, Australia, Norway, Spain, China,Canada, Germany, France, and India, of course. His researchpapers have appeared in international conferences and journals.For further information, please browse: http://www.buyya.com.

Hai Jin is a Professor of Computer Sci-ence and Engineering at Huazhong Univer-sity of Science and Technology (HUST)in China. He received his Ph.D. in com-puter engineering from HUST in 1994.In 1996, he was awarded a scholarshipby German Academic Exchange Service(DAAD) at Technical University of Chem-nitz in Germany. He worked at the Uni-versity of Hong Kong between 1998 and

2000, where he participated in the HKU Cluster project. He workedas a visiting scholar at the Internet and Cluster Computing Labo-ratory at the University of Southern California between 1999 and2000.

Dr. Jin is a member of IEEE and ACM. He chaired the2000 Asia-Pacific International Symposium on Cluster Computing(APSCC’01)andFirst International Workshop on Internet Comput-ing and E-Commerce (ICEC’01).He served as program vice-chairof 2001 International Symposium on Cluster Computing and Grid(CCGrid’01). He has co-authored four books and published morethan 50 papers. His research interests cover parallel I/O, RAIDarchitecture, fault tolerance, and cluster computing. Contact himat [email protected].

Toni Cortes is an associate professorat Universitat Politècnica de Catalunya,Barcelona, Spain. He obtained his Ph.D.degree in computer science in 1997 fromUniversitat Politècnica de Catalunya.He is currently the coordinator of thesingle-system image technical area in theIEEE CS Task Force on Cluster Comput-ing. Besides working in many academicprojects, he has been cooperating in

European industrial projects such as Paros, Nanos, and Dimemas.He organized (along with Rajkumar Buyya) theCluster Com-puting Technologies, Environments, and Applicationssessionorganized as part of the PDPTA’99 and PDPTA’00 Conferences.He has also been program vice chair for the IEE Cluster 2000conferences and has been a reviewer for important internationalconferences such as Cluster, CC-Grid, ISCA, ICS, SCCC, andinternational journals such asIEEE Transactions on Computers,Parallel and Distributed Computing Practices, ISCA InternationalJournal of Computers, andSoftware-Practice& Experience. Hehas also published more than 20 papers in international jour-nals and conferences (most of them about parallel I/O). He hasalso published a chapter on parallel I/O inHigh PerformanceCluster Computing(1999). His research interests cover com-puter architecture, parallel I/O, RAID architecture design, highperformance storage system, cluster computing, and operatingsystems.

Documents

Guest editorial Cluster computing - Gridbus and Cloudbus · Cluster computing Rajkumar Buyyaa,1, ... Cluster computing has become a hot topic of research among academic and industry