Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/ijdsn/2010/961591.pdf · Consumption (TEC) for sensor data gathering is critical to ensuring sustained

Hindawi Publishing CorporationInternational Journal of Distributed Sensor NetworksVolume 2010, Article ID 961591, 9 pagesdoi:10.1155/2010/961591

Research Article

Optimizing Cluster Heads for Energy Efficiency in Large-ScaleHeterogeneous Wireless Sensor Networks

Yi Gu,1 Qishi Wu,1 and Nageswara S. V. Rao2

1 Department of Computer Science, University of Memphis, Memphis, TN 38152, USA2 Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA

Correspondence should be addressed to Qishi Wu, [email protected]

Received 31 July 2010; Accepted 5 December 2010

Copyright © 2010 Yi Gu et al. This is an open access article distributed under the Creative Commons Attribution License, whichpermits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Many complex sensor network applications require deploying a large number of inexpensive and small sensors in a vastgeographical region to achieve quality through quantity. Hierarchical clustering is generally considered as an efficient and scalableway to facilitate the management and operation of such large-scale networks and minimize the total energy consumption forprolonged lifetime. Judicious selection of cluster heads for data integration and communication is critical to the success ofapplications based on hierarchical sensor networks organized as layered clusters. We investigate the problem of selecting sensornodes in a predeployed sensor network to be the cluster heads to minimize the total energy needed for data gathering. We rigorouslyderive an analytical formula to optimize the number of cluster heads in sensor networks under uniform node distribution, andpropose a Distance-based Crowdedness Clustering algorithm to determine the cluster heads in sensor networks under generalnode distribution. The results from an extensive set of experiments on a large number of simulated sensor networks illustrate theperformance superiority of the proposed solution over the clustering schemes based on k-means algorithm.

1. Introduction

Multiple sensor systems have been the target of activeresearch since the early 1990s due to their widespreaduse in many agricultural, civil, industrial, and militaryapplications that involve environmental monitoring, targettracking, and situational assessment [1, 2]. Recent devel-opments in Microelectromechanical Systems (MEMS) makeit now possible to deploy a large number of inexpensiveand small sensors to perform complex tasks by obtainingquality through quantity. In some important applications,sensor networks are deployed for remote operations invast unstructured geographical areas. In such deployments,wireless networks with low bandwidth are usually theonly means of communication among the sensors. Thesesensors are typically powered by irreplaceable batteries withlimited energy supply, a large portion of which has tobe spent on data communication among sensors and theBase Station (BS). Therefore, minimizing the Total EnergyConsumption (TEC) for sensor data gathering is critical toensuring sustained operations of these large-scale WirelessSensor Networks (WSNs), even though minimizing TECdoes not necessarily maximize network lifetime, which

also depends on the balance of residual energy across thenetwork.

It has been well recognized that clustering provides anefficient and scalable way to design and organize large-scale WSNs for energy efficiency of data communication.In a typical hierarchical WSN that deploys a large numberof homogeneous or heterogeneous sensors, clusters areformed around a set of strategically selected or randomlydesignated Cluster Heads (CHs). The sensors within eachcluster, often referred to as Leaf Nodes (LNs), collect andsend environmental measurements to their correspondingCH, which performs an appropriate form of data processing(i.e., aggregation and compression) and sends the resultto a BS for integration with data from other clusters. TheBS, which is located either inside or outside the sensornetwork region, could be wire connected or rechargeable,and hence is often considered to have unlimited energysupply. Apparently, the number and location of CHs inhierarchical WSNs have a significant effect on the energyconsumption for data communication from LNs to the BS.Many existing clustering algorithms assume that the numberof clusters is predetermined and each CH is also designated apriori or on a random basis [3, 4]. In practice, however, such

2 International Journal of Distributed Sensor Networks

information on CHs is not always readily available duringnetwork implementation and operation, especially when thenetwork is deployed in an unstructured environment withunpredictable disturbances and threats. As a matter of fact,determining the optimal number and location of CHs hasbeen raised as one of the most fundamental problems inclustering-based hierarchical WSNs.

The global knowledge of sensor locations is crucial fordetermining the optimal number and location of CHs.Once deployed, each sensor can acquire and report itsgeographical location through a certain location discoveryprocess. The Global Positioning System (GPS) is certainlya very effective but prohibitively expensive solution due toits high cost and the constrained energy supply of sensors.Many other location discovery approaches use received signalstrength, neighbor node position, or arrival time differenceto estimate the distance between two sensors [5]. Based onthe distance estimations, sensors can compute their locationsby distance or angle combining (hyperbolic trilateration,triangulation, or maximum likelihood estimation). To makesuch combining feasible, the computation requires at leasttwo known reference points, which can be obtained througheither a GPS or a deterministic deployment.

We investigate the problem of selecting an optimalsubset of sensors, which are designated as CHs to formclusters, to minimize the TEC for data communicationper round from LNs to the BS in a predeployed WSNunder a uniform or general node distribution. We rigorouslyderive an analytical solution to calculate the number of CHsfor minimum TEC in WSNs where nodes are uniformlydistributed. In WSNs where nodes are deployed according toan unknown probability distribution, we propose a heuristicalgorithm, Distance-based Crowdedness Clustering (DCC),to determine the optimal number and location of CHs. Theextensive experimental results on simulated WSNs rangingfrom small to large scales show the performance superiorityof the proposed methods compared to the clustering andoptimization schemes based on classical k-means algorithmin terms of TEC for data communication per round. How-ever, the main purpose of these performance comparisonsis to recognize the necessity of such CH optimization andillustrate the efficacy of the proposed research approaches.In fact, the proposed optimization schemes, which couldbe performed offline using global information prior to theactual deployment, are largely orthogonal to the main bodyof current research efforts on sensor network clustering, andthe optimization results obtained by our approaches canbe used to effectively guide the operation of, and thereforegreatly improve the performance of, the classical clusteringalgorithms that require a priori knowledge on CHs.

The rest of the paper is organized as follows. Wedescribe related work in Section 2. The energy consumptionmodels are presented and the optimization problems areformulated in Section 3. In Section 4, we derive an analyticalsolution for WSNs under uniform distribution and developa heuristic approach for WSNs under general distribution.Implementation details and performance evaluations areprovided in Section 5. We conclude our work and discussfuture efforts in Section 6.

2. Related Work

In WSNs, a CH, which may collect environmental sensingdata as well in some cases, is responsible for receiving,processing, and transmitting data from the LNs in its servicearea to the BS, and hence it consumes much more energythan an LN. The sensor nodes in the close proximity of aCH may also run out of battery quickly due to frequentdata forwarding. Therefore, designating an optimal subsetof sensor nodes as CHs at appropriate locations is critical tominimizing the TEC for prolonging the lifetime of the entirenetwork. There exist a large number of research efforts in theliterature that have been devoted to solving various clusteringproblems with different objectives [3, 6–15].

Several clustering algorithms have been proposed tominimize the energy consumption or satisfy the networkconnectivity requirement with the assumption that thenumber of CHs is known a priori [3, 4, 12, 16]. Some otherresearchers tackle the problem of optimizing the numberand location of CHs, which is closely related to our work.Kim et al. derived analytical formulas to estimate the optimalnumber of CHs to achieve the minimum TEC of the entirenetwork, based on the assumption of even partition (i.e.,each cluster covers an equal number of sensors) and one-hopcommunication in both intracluster and intercluster routing[6]. In [7], Wang et al. calculated the optimal number ofCHs for a WSN by applying a cross-layer approach fromboth perspectives of the power efficiency in the mediumaccess control (MAC) layer and the coverage performancein the physical layer. In [9], Xu and Cassandras determinedthe optimal location of CHs for minimum communicationenergy consumption in a two-level heterogeneous WSN.They formulated the optimization problem with the con-straint that each LN is connected to at least p CHs andeach CH has at most q LNs as a Mixed Integer NonlinearProgramming (MINLP) problem and achieved a globaloptimum based on an iterative decomposition algorithm anda randomized multistart technique. Srinivas et al. studiedthe problem of minimizing the number of backbone nodesand referred to it as the Connected Disk Cover problem in[17], where they controlled the mobility of the backbonenodes to maintain the connectivity of the network. In[18], Tao and Zheng proposed an improved algorithm thatcombines the optimal number of CHs with energy adaptiveCH selection algorithm based on the classical Low-EnergyAdaptive Clustering Hierarchy (LEACH) [3] protocol.

One commonly adopted way to ensure load balanceand meet energy constraint of the entire network is torotate the role of a CH and form a corresponding clus-ter on a random and periodical basis among all sensornodes. The classical energy-efficient algorithm, LEACH [3],employed randomized rotation of CHs to evenly distributethe energy consumption among the sensors in the network.Yang and Sikdar [8] applied a sleep-wakeup scheme basedon a decentralized MAC protocol to LEACH and furtherproposed an analytical framework for achieving the optimalprobability with which a sensor becomes a CH in orderto minimize network energy consumption and prolongnetwork lifetime, assuming uniform distribution of all the

International Journal of Distributed Sensor Networks 3

sensors. A similar node election strategy that considersthe amount of residual energy was also used in [19] withmultihop data communication. Du and Xiao presented anenergy-efficient Chessboard Clustering routing protocol forheterogeneous sensor networks [20] to balance node energyconsumption and increase network lifetime, in which somesensors are initially set to be in sleeping mode and canbe activated later on according to a certain procedure.Some research efforts along a different line allow sensorsto adjust their locations after initial deployment. Mao andWu proposed two new sensor location updating algorithms,VFSec and Weighted Centroid, to jointly optimize sensingcoverage and secure connectivity [21].

Nonlinear programming has been widely used to modelwireless sensor data communication as network flows tominimize TEC in WSNs. Besides [9], Ergen and Varaiyaalso formulated their problem as a nonlinear programmingproblem that determines the optimal location of relay nodesand the optimal energy provided to them so that the networkis alive during the desired lifetime with minimum totalenergy [22]. In [23], Dasgupta et al. presented the sensordeployment problem for maximum lifetime with coverageconstraints and proposed an algorithm motivated by force-directed/potential field-based approaches in robotics andgraph drawing to place and assign the role of each sensor inthe system to maximize network lifetime.

The probability of a sensor to be a CH is also consideredin some work. Bandyopadhyay and Coyle proposed adistributed and randomized clustering algorithm to organizethe sensors in a WSN into clusters, and they further extendedthis algorithm to generate a hierarchy of CHs and observedmore energy savings as the number of levels in the hierarchyincreases [24].

The main differences between our work and the afore-mentioned ones arise from the following aspects: (i) weinvestigate both uniform distribution and general randomdeployment for large-scale sensor networks; (ii) we proposeboth analytical derivation and heuristic algorithm to solvethe CH optimization problems; (iii) we consider completedata aggregation and use a compression ratio to reflect thereduction in data size at CHs; (iv) we do not assume evenpartition of sensor networks as in [6].

3. Network Model and Problem Formulation

3.1. Energy Consumption Model. We consider two differenttypes of energy consumption for data transmission andreceiving, respectively: a transmitter consumes energy torun both the radio electronics and the power amplifier,while a receiver only consumes energy to drive the radioelectronics. The mobile radio channels on typical sensornodes are predominantly in the VHF (frequency from30 MHz to 300 MHz, wavelength from 1 m to 10 m) andUHF (frequency from 300 MHz to 3 GHz, wavelength from10 cm to 1 m), respectively [25, 26]. Since the interclustercommunication distance is typically much longer thanthe intracluster communication distance, we employ (i)the free space (fs) fading channel model for intracluster

wireless communication that incurs a d2 power loss, and (ii)the multipath (mp) fading channel model for interclusterwireless communication that incurs a d4 power loss [3, 6,19, 25]. Multipath fading is mainly caused by the reflectionof radio waves off structures such as buildings, mountains,trees, and other landscape features in the environment wherethe nodes are deployed. The impact of multipath fading isparticularly strong in rich scattering environments such asoffices and other indoor locales, and could be also significantin some outdoor deployments [25].

In a real communication system, the transmission powercould be adjusted by suitably configuring the power ampli-fier. Therefore, the energy dissipation in transmitting oneunit of data message over a directed wireless communicationlink can be modeled as Et(i):

Et(i) = Eelec + Eamp

(di, j)

=⎧⎨⎩Eelec + εfs · di, j2 , intra cluster

Eelec + εmp · di, j4, inter cluster,

(1)

where Eelec denotes the energy for driving the electronics,which depends on various factors including digital coding,modulation, filtering, and spreading of the signals, forboth transmitter electronics and receiver electronics; εfs

and εmp are the coefficients for calculating the amplifierenergy Eamp, which depends on the Euclidean distance

di, j =√

(xi − xj)2 + (yi − yj)2 between transmitter vi located

at (xi, yi) and receiver vj located at (xj , yj) as well asthe acceptance bit-error rate. The energy consumed by asensor vi in receiving one unit of data packet is denoted asEr(i) = Eelec. Note that the above transmission and receivingenergy models assume a contention free MAC protocol,where interferences from simultaneous transmission can beavoided.

A CH, which also collects environment sensing data,receives data messages from LNs within the cluster and sendsall the data to the BS after performing a certain type of dataprocessing (such as data aggregation and data compression).We use a constant Ep to represent the energy spent inprocessing each unit of received or sensed data. We assumethat the CH performs complete data aggregation, that is, aninput of two k-bit messages produces an output of one k-bitmessage after aggregation. Furthermore, we use a parameterα, 0 < α ≤ 1, to denote the data compression ratio: an inputof k bits results in an output of α · k bits after compression.

3.2. Problem Formulation. The problem of determining theoptimal number and location of CHs for minimum TECin sensor networks is formulated as follows. We considera WSN where n sensor nodes have been deployed in abounded L×L (m2) square region and a single BS is located at(xBS, yBS), somewhere inside or outside the network region.The location of each sensor vi, i ∈ 0, 1, . . . ,n − 1, is denotedas (xi, yi). We assume a one-hop communication modelfor both intracluster (from LNs to their associated CHs)and intercluster (from CHs to the BS) communication:the transmission energy within each cluster is calculated


using the fs model; while from CHs to the BS, the mpmodel is used. We consider two different sensor deploymentscenarios: (i) uniform node distribution and (ii) generalnode distribution. The CH optimization problem is tostrategically designate an appropriate subset of sensor nodesin the network as CHs, each of which forms a cluster withits neighbor nodes, such that the TEC for the transmissionof each unit of data message from all LNs to CHs and to theBS per round is minimized. Here, the “round” is defined as atime period during which every sensor in the network sendsone unit of data message to the BS through its associatedCH. The optimization objective is to determine the optimalnumber and location of CHs in the given sensor network forminimum TEC.

We consider the following general conditions or assump-tions in our problem formulation.

(i) All sensors are predeployed and have constrainedenergy supply.

(ii) The BS is also predeployed and has unlimited energysupply.

(iii) The network is static, that is, neither the sensors northe BS has mobility once deployed.

(iv) The total number of sensors is known.

(v) Each CH forms exactly one cluster, and besides dataprocessing, also performs the same task of environ-mental sensing and data collection as a regular sensornode.

(vi) There exists a contention free MAC protocol forwireless communication.

We consider the energy consumption for data transmis-sion of each LN, and for data receiving, processing, andtransmission of each CH. Since the energy cost for environ-ment sensing is generally much less than communication andprocessing tasks, we do not consider sensing energy cost here.Obviously, the TEC depends on the network distribution, thenumber and location of CHs, and the compression ratio α atCHs.

4. Optimizing Number and Location of CHs

4.1. Analytical Derivation for Uniform Distribution. WhenCHs perform data processing or are responsible for certainintracluster management duties, uniform distribution ofsensors among the network is usually a goal for setting upthe network, which can also leverage data delay [27]. InLEACH and other similar clustering algorithms, the expectednumber k of CHs per round is considered as a prefixedsystem parameter [3, 4, 12, 16]. Here, we shall rigorouslyderive an analytical formula for calculating the optimal valueof k to achieve the minimum TEC of data transfer from LNsto the BS through their corresponding CHs. The optimalk value determined by our approach can be used to guidethe execution of the clustering algorithms that require suchinformation.

We first calculate the expected distances from an LN to itsCH and from a CH to the BS using an approach similar to the

one used in [8]. Since the CHs are uniformly distributed inan L×L (m2) sensor network region, the expected square areacovered by each cluster with the CH deployed at (xCH, yCH)can be calculated as:

√L2/k × √

L2/k based on Voronoitessellation. Furthermore, the LNs are also uniformly andindependently deployed in each cluster, where we haveE[xLN] = E[xCH] = E[yLN] = E[yCH] = (1/2)

√L2/k,

and E[(xLN)2] = E[(xCH)2] = E[(yLN)2] = E[(yCH)2] =(1/3k)L2. Therefore, the average squared distance from anLN to its CH within a cluster can be calculated in

r2 = E[

(xLN − xCH)2 +(yLN − yCH

)2]

= E[

(xLN)2]− 2E[xLN]E[xCH] + E

[(xCH)2

]

+ E[(yLN)2]− 2E

[yLN]E[yCH

]+ E[(yCH

)2]

= 13kL2.

(2)

Without loss of generality, we suppose that the BSlocation (xBS, yBS) is prefixed at (0, 0) to simplify calculation.For a coarse grained analysis, we assume identical expecteddistance from CHs to the BS. Therefore, the average squareddistance from a CH to the BS is similarly given by

R2 = E[

(xCH − xBS)2 +(yCH − yBS

)2]= 2

3L2. (3)

The TEC per round, denoted by ETot, is the sum of theenergy consumption ELN of all LNs for data transmission andthe energy consumption ECH of all CHs for data receiving,processing, and transmission in one round, which can bedefined as

ETot = ELN + ECH. (4)

Since ELN only contains transmission energy cost Et and thetotal number of LNs in the network is n − k, ELN can beestimated as

ELN = (n− k)Et = (n− k)(Eelec + εfsr

2). (5)

Similarly, ECH is the total energy cost for the transfer of oneunit of data from each CH to the BS in one round, whichincludes the energy cost Er for receiving, Ep for processing,and Et for transmission. Each of n− k LNs transfers one unitof data to its corresponding CH, which performs processing(aggregation and compression) on the received data andits own sensing data, and sends the compressed aggregatedresult to the BS. Since there are total n units of input data(including n − k units of data received from LNs and kunits of data collected by k CHs themselves), the total energyconsumed by k CHs is defined by

ECH = (n− k)Er + nEp + αkEt

= (n− k)Eelec + nEp + αk(Eelec + εmpR

4).

(6)


Using (2), (3), (4), (5), and (6), we obtain the TEC per roundETot as follows:

ETot=(2n− 2k + αk)Eelec + nEp + (n− k)εfsL2

3k+ αkεmp

4L4

9.

(7)

The TEC per round ETot can be minimized by selecting anoptimal value of k, which is a solution to the first derivationof (7). Following that, the optimal number k of CHs can becalculated as (the negative solution is ignored)

k =√√√√ 3nL2εfs

9(α− 2)Eelec + 4αL4εmp. (8)

We further verify that the solution to the second derivationof (7) is positive. Therefore, we conclude that the value of kdefined in (8) results in the minimum TEC in WSNs withuniform node distribution. Once the optimal number ofCHs is obtained, their locations can be determined based onVoronoi tessellation among uniformly distributed sensors.

4.2. Algorithm Design for General Distribution. Thoughuniform distribution is suited for achieving energy balance,it may not always be feasible for practical sensor networkapplications, especially for those deployed in large andharsh environments not accessible to humans. In suchenvironments, sensors may be airborne or dropped by otherappropriate means that could lead to a more general nodedistribution.

We develop a heuristic algorithm, Distance-basedCrowdedness Clustering (DCC), based on a cutoff distanceand the concept of crowdedness to solve the CH optimizationproblem in WSNs under general node distribution. Here, the“cutoff distance” is the threshold that decides which clusteran LN belongs to: if the distance from an LN to the CHis within the cutoff distance, the LN is considered to belocated inside this cluster; otherwise, it is not. The conceptof “crowdedness”, which is closely related to the number ofneighbor nodes of each sensor node, is used to describe thedensity of sensor nodes in a given region.

In the DCC algorithm, we first calculate the distancesdi, j(i, j ∈ V) between all pairs of sensor nodes, each ofwhich is selected based on the cutoff distance. Using thisdistance, we designate the sensor with the largest number ofneighbor nodes as a CH and form a cluster of this CH withall its unclustered neighbors. We repeatedly designate CHswith most neighbor nodes in the rest of the LNs using thesame cutoff distance until there is no LN left unclustered.We calculate the TEC using (7) in Section 4.1 for all cutoffdistances, from which, the minimum one is selected as thefinal result. The pseudocode of DCC is given in Algorithm 1.The complexity of this algorithm is O(n3).

5. Performance Evaluation

5.1. Implementation and Experimental Settings. The pro-posed DCC algorithm is implemented in C++ and runs ona Windows XP desktop equipped with a 3.0 GHz CPU and 2

Input: a sensor network G = (V ,E) with n LNsrandomly deployed in a L× L (m2) square regionand one BS deployed inside or outside the region.Output: the optimal number k and location of CHs withminimum TEC.

(1) Calculate all-pair distances di, j , for vi, vj ∈ V , in anarray Ad ;

(2) Initialize minimum TEC TECmin = +∞;(3) for all distances di, j ∈ Ad do(4) Set cutoff distance dcut = di, j ;(5) Set vm as a neighbor of vn if dm,n ≤ dcut for all m,

n ∈ V ;(6) Sort all v ∈ V according to the number of neighbors

in a decreasing order and place them in an array Av ;(7) Insert all v ∈ V in an unclustered sensor queue Qu;(8) Initialize a clustered sensor queue Qc = ∅;(9) Initialize the number of clusters nclusters = 0;(10) while Qu /=∅ do(11) Retrieve vk ∈ Qu from Av and designate it as a CH;(12) Form a cluster Ck of vk and its neighbors vl ∈ Qu;(13) Insert all v ∈ Ck in Qc;(14) Remove all v ∈ Ck from Qu;(15) nclusters ++;(16) Calculate the TEC;(17) if TECmin > TEC then(18) TECmin = TEC;(19) k = nclusters;(20) return k and location.

Algorithm 1: Distance-based Crowdedness Clustering.

Gbytes memory. We conduct an extensive set of experimentsusing a wide variety of simulated sensor networks, in whichtwo deployment scenarios are considered: uniform andgeneral distributions. We generate these simulation datasetsby varying the size of network regions and the number ofinitially deployed sensor nodes. The parameters used in thetesting sensor networks and the sensor radio characteristicsof the energy cost models for wireless communication [6] aretabulated in Table 1. For testing purposes, we consider a fixedvalue 0.8 for the compression ratio α, which has an impact onthe clustering process.

5.2. Case Study for Uniform Distribution. We first investigatethe optimization problem on the number and location ofCHs in a sensor network within a square region of 200 ×200 (m2) under uniform distribution. Figure 1 plots the TECoptimization curve that is calculated by (7) in the network ofn = 200 sensor nodes in response to the number k of CHsvarying from 1 to 50.

From (8) in Section 4.1, we obtain the optimal numberof CHs to be k = 6, which is consistent with the optimal oneobserved in Figure 1. The TEC increases as the number ofCHs moves away from the optimal point. We further studya case with a larger WSN of n = 900 sensor nodes. Again,the optimal number 13 observed in Figure 2 is consistentwith the one produced by (8). The unimodal property ofthe TEC optimization curves justifies the correctness of our


Table 1: WSN communication parameters.

Parameter Value

L× L 200× 200 (m2)

Eelec 50 (nJ/bit)

εfs 10 (pJ/bit/m2)

εmp 0.0013 (pJ/bit/m4)

Ep 5 (nJ/bit/signal)

α 0.8

derivation for the optimal number of CHs in WSNs underuniform node distribution.

To evaluate the robustness of our solution, we performmore experiments on networks under uniform distributionwith 10 different problem cases from small to large ones: inthe first case, the total number of initial sensor nodes is set tobe 10; in the rest 9 cases, it increases from 100 to 900 nodesat an interval of 100 nodes. The optimization results, that is,the optimal number of CHs, the expected distance from anLN to its associated CH, and the corresponding minimumTEC in each case, are plotted in Figure 3. We observe thatthe expected distance from an LN to its CH varies from 32 mto 116 m and it decreases when the optimal number of CHsincreases.

5.3. Case Study for General Distribution. To better illustratethe optimization process of DCC in WSNs under generalnode distribution, we calculate and plot the TEC per roundunder different cutoff distances at each optimization stepfor a network of n = 200 sensor nodes deployed in thesame region, as shown in Figure 4. This three-dimensionaloptimization curve also features a unimodal property: theTEC is minimized with an optimal number of CHs at acertain cutoff distance.

For performance comparison, we adapt the classical k-means clustering algorithm to our problem and compareits performance with that of the proposed DCC algorithmusing the same set of 10 problem cases previously consideredfor uniform distribution. We first determine the optimalnumber k of CHs and the corresponding minimum TECusing the proposed DCC algorithm, and then use k-meansalgorithm to perform clustering based on the k valueobtained by the DCC algorithm, referred to as deterministick-means algorithm. For visual comparison, we plot theperformance measurements of TEC produced by these twoalgorithms in Figure 5. The results produced by the DCCalgorithm outperform those produced by the deterministick-means algorithm in all 10 cases we studied, which showsthe performance superiority of the proposed DCC algorithm.

For illustration purposes, we lay out the node distribu-tion and clustering of the sensor network with 100 sensornodes in Figure 6, in which the unclustered solid nodedenotes the BS. The clustering results obtained by the DCCalgorithm are marked by the solid lines, while the resultsobtained by the deterministic k-means algorithm are markedby the dashed lines. The DCC algorithm produces 20 clustersin this instance.

25

30

35

40

45

50

55

60

TE

C(μJ/

rou

nd)

1 5 10 15 20 25 30 35 40 45 50

Number of clusters

L = 200 m, n = 200

Figure 1: Analytical calculation of TEC per round with n = 200.

100

120

140

160

180

200

220

TE

C(μJ/

rou

nd)

1 5 10 15 20 25 30 35 40 45 50

Number of clusters

L = 200 m, n = 900

Figure 2: Analytical calculation of TEC per round with n = 900.

We further compare DCC with a clustering methodcompletely based on the classical k-means algorithm, inwhich we iteratively use k-means algorithm to cluster nsensor nodes for n times by varying the parameter k from1 to n, and select the clustering result with the minimumenergy cost. We refer to this clustering method as adaptive k-means algorithm. Two WSNs of problem sizes of 200 and 900sensor nodes, respectively, are tested using adaptive k-meansalgorithm, and their partial optimization curves are plottedin Figures 7 and 8. The rest of the TEC values continuouslyincreases as k increases.

We observe some small random variations on the perfor-mance curves produced by the adaptive k-means algorithm.These variations are mainly caused by the following twofactors: (i) the k-means algorithm may be trapped in localoptimal solutions by only considering the distance fromLNs to CHs, while the TEC has to consider the distancesfrom LNs to CHs and from CHs to the BS; (ii) as widelyevidenced, the quality of the final solution produced by the


0

20

40

60

80

100

120

Z:M

inim

um

TE

C(μJ/

rou

nd)

150

100

50

0

Y : Expected distance

froma RS to a CH

( m)

03

69

1215

X : Optimal number of CHs for each case

X : 12.02Y : 33.31Z : 101.1

X : 13Y : 32.03Z : 112.4

X : 9.99Y : 36.53Z : 77.54

X : 11Y : 34.81Z : 89.38

X : 7.95Y : 40.96Z : 53.36

X : 9.054Y : 38.38Z : 66.52

X : 6Y : 47.14Z : 29.39

X : 7Y : 43.64Z : 41.84

X : 1Y : 115.5Z : 2.93

X : 4Y : 57.74Z : 16.42

Figure 3: Analytical calculation of minimum TEC per round in 10cases ranging from small to large scales.

0

1000

2000

3000

4000

5000

Z:T

EC

(μJ/

rou

nd)

300250

200150

10050

0

Y : cutoff distance ( m)

050

100150

200

X: Number of CHs for each distance

L = 200 m, n = 200

Figure 4: TEC per round at each optimization step using DCCalgorithm.

k-means algorithm also depends on the initially (randomly)selected set of CHs. We apply the adaptive k-means algorithmto all 10 cases of different problem sizes, and plot theperformance measurements in terms of minimum TEC andoptimal number of CHs produced by DCC, adaptive k-means, and deterministic k-means algorithms in Figure 9.We observe that DCC achieves the best performance amongthe three algorithms in comparison and the adaptive k-means algorithm outperforms the deterministic k-meansalgorithm.

0

20

40

60

80

100

120

140

160

Min

imu

mT

EC

(μJ/

rou

nd)

1 2 3 4 5 6 7 8 9 10

Index of problem cases

DCCDeterministic k-means

Figure 5: Performance comparison between DCC and determin-istic k-means algorithms in 10 cases ranging from small to largescales.

0

20

40

60

80

100

120

140

160

180

200

Y(m

)

0 50 100 150 200

X (m)

L = 200 m, n = 100 nodes

Figure 6: Visualization of the clustering in a network of 100 nodesusing DCC and deterministic k-means algorithms: the results ofDCC are marked by solid lines and the results of deterministic k-means are marked by dashed lines.

Using the same network instance of 100 sensor nodes, wevisually compare the clustering results obtained by the DCCalgorithm and the adaptive k-means algorithm, as shownin Figure 10. The 20 clusters marked by the solid lines areobtained by the DCC algorithm (the same DCC clusteringin Figure 6), and the 4 clusters marked by dashed lines areobtained by the adaptive k-means algorithm. In Figure 10, weobserve that the DCC algorithm achieves a more reasonableclustering result than the k-means-based algorithms in termsof local sensor density.


20

30

40

50

60

70

80

TE

C(μJ/

rou

nd)

1 5 10 15 20 25 30 35 40 45 50

Number of clusters

L = 200 m, n = 200

Figure 7: TEC per round for each k value using adaptive k-meansalgorithm.

100

120

140

160

180

200

220

240

TE

C(μJ/

rou

nd)

1 10 20 30 40 50 60 70 80 90 100

Number of clusters

L = 200 m, n = 900

Figure 8: TEC per round for each k value using adaptive k-meansalgorithm.

6. Conclusion and Future Work

We investigated the problem of selecting a subset of sensornodes as CHs in WSNs to achieve minimum TEC. We con-sidered two energy dissipation models, free space model forintracluster communication and multipath model for inter-cluster communication. We derived an analytical formulato determine the optimal number and location of CHs inWSNs under uniform distribution and proposed a heuristicclustering algorithm based on distance and crowdedness tooptimize the number and location of CHs in WSNs undergeneral node distribution. The simulation results illustratedthe performance superiority of the proposed solution incomparison with the clustering schemes based on classicalk-means algorithm.

In the present work, we only considered one-hopcommunications for both intracluster and intercluster datacommunication under uniform and general node distribu-

0

50

100

150

200

Z:M

inim

um

TE

C(μJ/

rou

nd)

60

40

20

0

Y : Number of CHs 24

68

10

X : Index of 10 cases

X : 10Y : 41Z : 152.3

X : 10Y : 41Z : 92.51X : 7

Y : 39Z : 93.23

X : 10Y : 5Z : 113.5

X : 4Y : 43Z : 95.32

X : 7Y : 39Z : 60.63

X : 4Y : 5Z : 41.89X : 1

Y : 5Z : 5.19

X : 7Y : 9Z : 72.53

X : 4Y : 43Z : 28.68

X : 1Y : 5Z : 1.61

X : 1Y : 2Z : 2.59

DCCAdaptive k-meansDeterministic k-means

Figure 9: Performance comparison among DCC, adaptive k-meansand deterministic k-means, algorithms in 10 cases ranging fromsmall to large scales.

0

20

40

60

80

100

120

140

160

180

200

Y(m

)

0 50 100 150 200

X (m)

L = 200 m, n = 100 nodes

Figure 10: Visualization of the clustering in a network of 100 nodesusing DCC and adaptive k-means algorithms: the results of DCCare marked by solid lines and the results of adaptive k-means aremarked by dashed lines.

tions. We will consider multihop routing methods and thebalance of sensor nodes in each cluster in our future work.It would also be of our future interest to derive an analyticalperformance bound of energy cost for the clustering of WSNsunder complex and general distributions.


Acknowledgments

This research is sponsored by National Science Foundationunder Grant no. CNS-0721980 and Oak Ridge NationalLaboratory, US Department of Energy, under contract no.PO 4000056349 with University of Memphis.

References

[1] A. Hyder, E. Shahbazian, and E. Waltz, Multisensor Fusion,Kluwer Academic, Boston, Mass, USA, 2002.

[2] D. N. Jayasimha, S. S. Iyengar, and R. L. Kashyap, “Infor-mation integration and synchronization in distributed sensornetworks,” IEEE Transactions on Systems, Man and Cybernetics,vol. 21, no. 5, pp. 1032–1043, 1991.

[3] W. Heinzelman, A. Chandrakasan, and H. Balakrish-nan, “Energy-efficient communication protocol for wirelessmicrosensor networks,” in Proceedings of the 33rd AnnualHawaii International Conference on System Sciences, vol. 2, pp.3005–3014, 2000.

[4] E. I. Oyman and C. Ersoy, “Multiple sink network design prob-lem in large scale wireless sensor networks,” in Proceedings ofthe IEEE International Conference on Communications, vol. 6,pp. 3663–3667, June 2004.

[5] A. Savvides, C. C. Han, and M. B. Strivastava, “Dynamicfine-grained localization in ad-hoc networks of sensors,” inProceedings of the 7th ACM Annual International Conferenceon Mobile Computing and Networking (MOBICOM ’01), pp.166–179, July 2001.

[6] H. Kim, S. Kim, S. Lee, and B. Son, “Estimation of the optimalnumber of cluster-heads in sensor network,” Knowledge-BasedIntelligent Information and Engineering Systems, pp. 87–94,2005.

[7] L. C. Wang, C. M. Liu, and C. W. Wang, “Optimizingthe number of clusters in a wireless sensor network usingcross-layer analysis,” in Proceedings of the IEEE InternationalConference on Mobile Ad-Hoc and Sensor Systems, pp. 585–587,October 2004.

[8] H. Yang and B. Sikdar, “Optimal cluster head selection in theLEACH architecture,” in Proceedings of the IEEE InternationalPerformance, Computing, and Communications Conference, pp.93–100, New Orleans, La, USA, April 2007.

[9] N. Xu and C. Cassandras, “Optimal cluster-head deploymentin wireless sensor networks with redundant link require-ments,” in Proceedings of the 2nd International Conferenceon Performance Evaluation Methodologies and Tools, pp. 1–9,2007.

[10] E. Lloyd and G. Xue, “Relay node placement in wireless sensornetworks,” IEEE Transactions on Computers, vol. 56, no. 1, pp.134–138, 2007.

[11] J. Pan, L. Cai, Y. Hou, Y. Shi, and S. Shen, “Optimalbasestationlocations in two-tiered wireless sensor networks,”IEEE Transactions on Mobile Computing, vol. 4, no. 5, pp. 458–473, 2005.

[12] G. K. Das, S. Das, S. C. Nandy, and B. P. Sinha, “Efficientalgorithm for placing a given number of base stations tocover a convex region,” Journal of Parallel and DistributedComputing, vol. 66, no. 11, pp. 1353–1358, 2006.

[13] H. Kim, Y. Seok, N. Choi, Y. Choi, and T. Kwon, “Optimalmulti-sink positioning and energy-efficient routing in wirelesssensor networks,” in Proceedings of the International Confer-ence on Information Networking (ICOIN ’05), pp. 264–274,January 2005.

[14] W. Heinzelman, Application-specific protocol architectures forwireless networks, Ph.D. thesis, Massachusetts Institute ofTechnology, June 2000.

[15] S. Guha and S. Khuller, “Approximation algorithms forconnected dominating sets,” Algorithmica, vol. 20, no. 4, pp.374–387, 1998.

[16] Z. Vincze, R. Vida, and A. Vidacs, “Deploying multiple sinksin multi-hop wireless sensor networks,” in Proceedings of theIEEE International Conference on Pervasive Services (ICPS ’07),pp. 55–63, July 2007.

[17] A. Srinivas, G. Zussman, and E. Modiano, “Mobile backbonenetworks—construction and maintenance,” in Proceedings ofthe International Symposium on Mobile Ad Hoc Networking andComputing (MobiHoc ’06), pp. 166–177, Florence, Italy, 2006.

[18] Y. Tao and Y. Zheng, “The combination of the optimal numberof cluster-heads and energy adaptive cluster-head selectionalgorithm in wireless sensor networks,” in Proceedings ofthe International Conference on Wireless Communications,Networking and Mobile Computing (WiCOM ’06), pp. 1–4,September 2006.

[19] S. Selvakennedy and S. Sinnappan, “An energy-efficientclustering algorithm for multihop data gathering in wirelesssensor networks,” Journal of Computers, vol. 1, no. 1, pp. 40–47, 2006.

[20] X. Du and Y. Xiao, “Energy efficient chessboard clusteringand routing in heterogeneous sensor network,” InternationalJournal of Wireless and Mobile Computing, vol. 1, no. 2, pp.121–130, 2006.

[21] Y. Mao and M. Wu, “Coordinated sensor deployment forimproving secure communications and sensing coverage,” inProceedings of the 3rd ACM Workshop on Security of Ad HocandSensor Networks, pp. 117–128, Alexandria, Va, USA, 2005.

[22] S. C. Ergen and P. Varaiya, “Optimal placement of relay nodesfor energy efficiency in sensor networks,” in Proceedings of theIEEE International Conference on Communications (ICC ’06),vol. 8, pp. 3473–3479, July 2006.

[23] K. Dasgupta, M. Kukreja, and K. Kalpakis, “Topology-aware placement and role assignment for energy-efficientinformation gathering in sensor networks,” in Proceedingsof the 8th IEEE International Symposium on Computers andCommunications, pp. 341–348, 2003.

[24] S. Bandyopadhyay and E. J. Coyle, “An energy efficient hier-archical clustering algorithm for wireless sensor networks,” inProceedings of the 22nd Annual Joint Conference on the IEEEComputer and Communications Societies (INFOCOM ’03), vol.3, pp. 1713–1723, April 2003.

[25] D. Puccinelli and M. Haenggi, “Multipath fading in wire-less sensor networks: measurements and interpretation,” inProceedings of the International Wireless Communications andMobile Computing Conference (IWCMC ’06), vol. 2006, pp.1039–1044, Vancouver, Canada, 2006.

[26] T. Rappaport, Wireless Communications: Principles and Prac-tice, Prentice Hall, Upper Saddle River, NJ, USA, 2002.

[27] A. A. Abbasi and M. Younis, “A survey on clustering algo-rithms for wireless sensor networks,” Computer Communica-tions, vol. 30, no. 14-15, pp. 2826–2841, 2007.

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttp://www.hindawi.com Volume 2010

RoboticsJournal of

Hindawi Publishing Corporationhttp://www.hindawi.com Volume 2014


Active and Passive Electronic Components

Control Scienceand Engineering

Journal of



RotatingMachinery


Hindawi Publishing Corporation http://www.hindawi.com

Journal ofEngineeringVolume 2014

Submit your manuscripts athttp://www.hindawi.com

VLSI Design



Shock and Vibration


Civil EngineeringAdvances in

Acoustics and VibrationAdvances in



Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation http://www.hindawi.com

Volume 2014

The Scientific World JournalHindawi Publishing Corporation http://www.hindawi.com Volume 2014

SensorsJournal of


Modelling & Simulation in EngineeringHindawi Publishing Corporation http://www.hindawi.com Volume 2014


Chemical EngineeringInternational Journal of Antennas and

Propagation




Navigation and Observation



DistributedSensor Networks


Documents

Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/ijdsn/2010/961591.pdf · Consumption (TEC) for sensor data gathering is critical to ensuring sustained