

Energy Aware Routing for Spatio-temporal Queries in Sensor Networks

Neha Jain, Ratnabali Biswas, Nagesh Nandiraju, and Dharma P. Agrawal

OBR Research Center for Distributed and Mobile Computing, ECECS Department, University of Cincinnati, Cincinnati, OH 45221-0030

Email: {njain,biswasr,nandirns,dpa}@ececs.uc.edu

Abstract: Wireless sensor networks are an emerging technology that can significantly improve the quality of spatio-temporal data monitoring because of their untethered operation and potential for large scale deployment. In this paper, we define a communication architecture that supports distributed query processing to evaluate spatio-temporal queries within the network. We represent these queries by query trees and distribute query operators to appropriate sensor nodes. As operator execution demands high computation capability, we propose use of a heterogeneous sensor network where query operators are assigned to sparsely deployed resource-rich nodes within a dense network of low power sensor nodes. We design an adaptive, decentralized, low communication overhead algorithm to determine an operator placement on the resource-rich nodes in the network to minimize the cost of transmitting data along a routing tree constructed to continuously retrieve data at the sink from a set of spatially distributed geographical regions.

To the best of our knowledge, this is the first attempt to build an energy aware routing infrastructure to enable in-network processing of spatio-temporal queries.

I. INTRODUCTION

A wireless sensor network is a dense network of a large number of low cost miniature devices driven by a limited battery resource. These devices, called sensor nodes, are equipped with multiple onboard sensors, a short range wireless interface and limited storage and processing capability. The ease of deployment, ad hoc connectivity and cost-effectiveness of a wireless sensor network are revolutionizing remote monitoring applications [1]. At the node level, data communication is the dominant component of energy consumption, and protocol design for sensor networks is geared towards reducing data traffic in the network.

As sensors close to the event being monitored sense similar data, the focus of existing research [7] has been to aggregate (combine, partially compute and compress) sensor data at a local level before transmitting it to a remote user called the sink. The number of nodes that sense attributes related to an event in a geographical region depends on the footprint of the event, referred to as the 'target region' in this paper.

Queries over such target regions are usually long-running and persist throughout the network lifetime. Hence, we treat data coming from target regions as a continuous data stream. In this paper, we consider complex queries computed over multiple target regions acting as data sources. We specifically address the problem of translating a 'query tree' at the sink to a corresponding 'routing tree' such that the cost of transferring data from the target regions to the sink, through intermediate resource-rich nodes that execute the operators, is minimized. We propose an algorithm that adapts the routing tree to fluctuations in data properties and scarcity of network resources in a decentralized manner.

In this work, we use a heterogeneous node environment for query evaluation. A large number of cheap sensor nodes sense and transmit data, and some powerful nodes, henceforth referred to as Query Processor (QP) nodes, execute query operators to combine and process data arriving from multiple target regions. The resource-rich nodes have additional storage, processing capability and power to execute computation intensive tasks and to buffer data from continuous data streams. Organization of heterogeneous sensor networks offers flexibility to employ more sophisticated and interesting algorithms [6], [8] for data processing. At the same time, the cost of network deployment remains low as a large fraction of nodes are cheap, low power sensor nodes. Latest technology trends reveal that increasing energy efficiency of processors through advances in embedded hardware design will alleviate the limitations in memory and computation capabilities of sensor nodes [10].

Spatio-temporal applications that can benefit from in-network query processing using heterogeneous sensor networks include monitoring of civil structures, machines, road traffic and the environment. We propose that the cost and delay involved in transmitting raw data to a remote server for offline result computation can be eliminated by processing data from spatially distributed regions within the network. For example, the sensor nodes may be randomly dispersed over a large geographical area to observe physical phenomena, such as a comparison of pollution or temperature levels among regions physically separated by large distances.

The organization of the paper is as follows. We begin with a description of the communication architecture and the query model in Sections II and III respectively. We derive the local optimal placement of query operators in the network in Section IV. The routing tree construction is discussed in Section V and its adaptive features in Section VI. Relevant simulation results are provided in Section VII. In the end, we provide related work in Section VIII, followed by the conclusion in Section IX.

IEEE Communications Society / WCNC 2005, 0-7803-8966-2/05/$20.00 © 2005 IEEE

II. THE NETWORK MODEL

Here, we present the proposed communication architecture to model spatio-temporal monitoring applications. In Figure 1(a), we show the bird's eye view of a sensor field with data streams emerging from multiple target regions and being merged at intermediate nodes in the network into a single output stream directed towards the sink. Observe that the area of a target region is much smaller than the area of the entire sensor field and than its distance from other target regions in the field. In Figure 1(b), we zoom into target region T1 to observe how data from the low power sensor nodes (white circles) is aggregated [7] at the QP nodes (black circles) to form a single input stream on behalf of the entire target region. In this paper, each target region is represented by its final aggregation node that transmits a single data stream, and a routing infrastructure is designed for processing continuous data streams from multiple target regions.

Fig. 1. Levels in computation hierarchy. (a) Sensor field with target regions T1, T2 and T3, operators O1 and O2, and the sink. (b) Target region T1, with the root of the aggregation tree acting as the source of the data stream. Dimension labels in the figure: 15000 m, 18000 m and 400 m.

III. THE QUERY MODEL

Instead of executing an entire query at the sink, we propose to evaluate a query tree by placing its operators at appropriate QP nodes in the network. We assume that a user query specified in a declarative language like SQL has been converted to a query tree (specifying the order of evaluation of operators) by using query optimization techniques based on power conservation [9]. Data sources for these queries are data streams emerging from target regions as a sequence of tuples or data packets. In a spatio-temporal data monitoring application, knowledge of the coordinates of the data and the time at which it was generated is as important as the data itself [5]. These tuples consist of the sensor-data attributes, a query identifier to associate the sensor data with its corresponding query, a region identifier and a timestamp to provide its spatio-temporal qualities. Each operator processes the tuples from the input streams, which usually results in either a reduction in the size of the tuples or elimination of tuples from the data streams. Therefore, the data rate of the output stream is a fraction of the data rate of the input streams, and we refer to this fraction as the data reduction factor, φ:

$$\phi = \frac{\text{data rate of output stream}}{\sum \text{data rate of all input streams}}, \qquad 0 \le \phi \le 1 \qquad (1)$$

A low value of φ implies high data reduction. The query operators are popular database operators such as select, join and project, or aggregation operators such as maximum, minimum and average. We now provide the motivation behind optimal operator placement to support such in-network processing of queries.
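As a small illustration of Equation 1, the following sketch computes φ for one operator from measured stream rates. The function name and the error handling are assumptions made for this example and do not come from the paper.

```python
def data_reduction_factor(output_rate, input_rates):
    """Equation 1: output data rate divided by the sum of all input data rates.

    `output_rate` and `input_rates` are hypothetical measurements (for example
    tuples per second) taken at the node hosting the operator.
    """
    total_in = sum(input_rates)
    if total_in <= 0:
        raise ValueError("the input streams must carry some data")
    phi = output_rate / total_in
    if not 0.0 <= phi <= 1.0:
        raise ValueError("an operator is assumed never to emit more than it receives")
    return phi
```

For instance, an operator that receives 50 tuples per second on each of two inputs and emits 30 tuples per second has φ = 0.3.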

A. In-network Query Processing

In-network query processing can lead to significant data reduction. Consider a balanced query tree as shown in Figure 2. Data tuples of size 'x' are generated by each of the 'n' leaf nodes, and at each higher level data is combined and reduced by a uniform reduction factor, φ. We define this data reduction $d_r$ as the ratio of the amount of data obtained at the sink by in-network query processing, $d_{qp}$, to that obtained by the naive approach of directly delivering data to the sink, $d_{direct}$.

Fig. 2. Data reduction in a balanced query tree. The eight target regions T1 to T8 each emit data of size x; successive levels of the tree carry 2xφ, 4xφ² and 8xφ³ towards the root and the sink.

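$$d_r = \frac{d_{qp}}{d_{direct}} = \frac{\sum_{i=0}^{k} n x \phi^{i}}{n x \log(n)} = \frac{1 - \phi^{\log(n)+1}}{(1 - \phi)\log(n)} \qquad (2)$$

where $k = \log(n)$, as the closed form on the right-hand side requires.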

We observe from Equation 2 that, for low values of φ, the reduction in data due to in-network query processing is very high. Unless supplemented by an energy efficient routing tree infrastructure, the benefit of reducing the routing load through in-network processing may be offset by the physical separation between the QP nodes of the routing tree. In this paper, we present an algorithm for optimal operator placement that minimizes data transfer from target regions to the sink via operatorQP nodes.
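To make Equation 2 concrete, the sketch below evaluates both the summation and the closed form. It assumes a base-2 logarithm and a number of leaves that is a power of two, which matches a balanced binary tree but is not stated explicitly in the text; the function names are illustrative only.

```python
def reduction_by_sum(n, phi, x=1.0):
    """d_r from the summation in Equation 2, with k = log2(n) (assumed base 2)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n is assumed to be a power of two, at least 2")
    k = n.bit_length() - 1                      # exact log2 for powers of two
    d_qp = sum(n * x * phi ** i for i in range(k + 1))
    d_direct = n * x * k
    return d_qp / d_direct


def reduction_closed_form(n, phi):
    """d_r from the closed form in Equation 2 (valid for phi != 1)."""
    k = n.bit_length() - 1
    return (1 - phi ** (k + 1)) / ((1 - phi) * k)
```

With n = 8 and φ = 0.5 both expressions give 0.625, and the ratio shrinks further as φ decreases.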

IV. OPTIMAL OPERATOR PLACEMENT

The objective of optimal operator placement is to map a tree of operators onto the QP nodes in the sensor network such that the cost of data transfer from the target regions to a fixed sink is minimized. We begin with the simple case of determining the optimal placement $O(x, y)$ of a single operator that combines data from two target regions, represented by the child nodes $C_l(x_l, y_l)$ and $C_r(x_r, y_r)$, and routes the output stream to the sink, represented by the parent node $P(x_p, y_p)$. As mentioned in [3], the data transfer cost between two nodes is proportional to the number of hops separating the two nodes and the rate of data transfer between them. Let us approximate the number of hops between two nodes by the Euclidean distance between the two nodes (this is indeed true if sensor nodes provide a good coverage of the field).

Fig. 3. Placement of operator O in $\triangle PC_rC_l$. The operator O(x, y) lies inside the triangle with vertices P, $C_r$ and $C_l$; the edges from O to these vertices carry the data rates $d_p$, $d_r$ and $d_l$.

If the data rate from $C_r$ to $O$ is $d_r$, the data rate from $C_l$ to $O$ is $d_l$, and the data rate from $O$ to $P$ is $d_p$, the cost of data transfer from $C_r$ and $C_l$ to $P$ through $O(x, y)$ is proportional to the following cost function $f(\bar{X} = (x, y))$:

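$$f(\bar{X}) = d_l \lVert O - C_l \rVert + d_r \lVert O - C_r \rVert + d_p \lVert O - P \rVert \qquad (3)$$

where

$$\lVert O - P \rVert = \sqrt{(x - x_p)^2 + (y - y_p)^2},$$

$$\lVert O - C_r \rVert = \sqrt{(x - x_r)^2 + (y - y_r)^2},$$

$$\lVert O - C_l \rVert = \sqrt{(x - x_l)^2 + (y - y_l)^2}$$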

The optimal placement of operator O is the location at which the value of the cost function $f(\bar{X})$ is minimum. We now present the following lemmas to prove that it is sufficient to find a local minimum of $f(\bar{X})$ to find the minimum value of $f(\bar{X})$.
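The cost function of Equation 3 translates directly into code. The sketch below is illustrative only; the function name and the tuple representation of points are assumptions.

```python
import math


def transfer_cost(o, cl, cr, p, dl, dr, dp):
    """Equation 3: rate-weighted Euclidean distances from O to C_l, C_r and P.

    Points are (x, y) tuples; dl, dr and dp are the data rates on the
    three edges incident to the operator.
    """
    return dl * math.dist(o, cl) + dr * math.dist(o, cr) + dp * math.dist(o, p)
```

For example, with O at the origin, C_l = (3, 4), C_r = (6, 8), P = (0, 5) and rates 1, 2 and 3, the cost is 1·5 + 2·10 + 3·5 = 40.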

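Lemma 1 (footnote 1): The minimum of the function $f(\bar{X})$ cannot lie outside $\triangle PC_rC_l$.

Lemma 2 (footnote 1): $S = \{\bar{X} \in R^2 \mid \bar{X} \in \triangle PC_rC_l\}$ is a convex set if $P$, $C_r$ and $C_l$ are not in a straight line.

Lemma 3 (footnote 2): If $\bar{X} \in S$, then $f(\bar{X})$ is strictly convex.

Lemma 4 (footnote 2): If $f'(\bar{X}_k) = 0$ for the convex function $f(\bar{X}_k)$, then $\bar{X}_k$ is a local minimum.

Lemma 5 (footnote 2): For the strictly convex function $f(\bar{X}_k)$, if there exists a local minimum $\bar{X}_k$, then it is also its global minimum.

Any of the constrained or unconstrained non-linear optimization methods may be used to minimize $f(\bar{X})$. We adapt a simple method, called 'steepest descent', to minimize $f(\bar{X})$. Given a location $\bar{X}_k(x_k, y_k)$ of the operator, the steepest descent method iteratively approaches the local minimum by computing $\bar{X}_{k+1}(x_{k+1}, y_{k+1})$ as

$$\bar{X}_{k+1} = \bar{X}_k - \alpha_k \bar{d}_k, \quad \text{where } \alpha_k \ge 0 \text{ and } \bar{d}_k = \nabla f(x_k, y_k)$$

$$\Rightarrow \quad x_{k+1} = x_k - \alpha_k\, dx_k, \qquad y_{k+1} = y_k - \alpha_k\, dy_k \qquad (4)$$

where

$$\bar{d}_k = \nabla f(x_k, y_k) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right)_{\bar{X}_k} = (dx_k, dy_k)$$

Footnote 1: Proof provided in [11].

Footnote 2: Proof available in [2].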

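Fig. 4. Line of steepest slope, L, in $\triangle PC_rC_l$. The figure shows the operator position $O(x_k, y_k)$ at $\alpha_k = 0$ and a point A at $\alpha_k = a$ on the line L, inside the triangle with vertices P, $C_r$ and $C_l$.

We minimize the value of $f(\bar{X}_{k+1})$ for values of $\alpha_k$ such that $\bar{X}_{k+1}$ lies on the portion of the line of steepest descent L within $\triangle PC_rC_l$, as shown in Figure 4.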
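For reference, the gradient used in Equation 4 can be written out by differentiating each distance term of Equation 3. This worked form is not spelled out in the text; it follows from the definitions above and holds wherever O does not coincide with $C_l$, $C_r$ or $P$ (where the distance terms are not differentiable).

```latex
\begin{align*}
dx_k = \frac{\partial f}{\partial x}\bigg|_{\bar{X}_k}
  &= d_l \frac{x_k - x_l}{\lVert O - C_l \rVert}
   + d_r \frac{x_k - x_r}{\lVert O - C_r \rVert}
   + d_p \frac{x_k - x_p}{\lVert O - P \rVert}, \\
dy_k = \frac{\partial f}{\partial y}\bigg|_{\bar{X}_k}
  &= d_l \frac{y_k - y_l}{\lVert O - C_l \rVert}
   + d_r \frac{y_k - y_r}{\lVert O - C_r \rVert}
   + d_p \frac{y_k - y_p}{\lVert O - P \rVert}.
\end{align*}
```

Each term is the data rate times a unit vector pointing from the neighbor towards O, so the gradient vanishes exactly where the three rate-weighted pulls balance.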

A. Selecting Initial Operator Placement

Since the optimal placement of the operator would be closer to the node receiving or sending data with the maximum data rate, we select the initial position $\bar{X}_0$ for the steepest descent method as the weighted centroid of $\triangle PC_rC_l$, given by Equation 5. In fact, we observed through our experiments that the local minimum was indeed arrived at in fewer iterations by placing operators initially at $\bar{X}_0$.

$$\bar{X}_0 = \left(\frac{d_p x_p + d_l x_l + d_r x_r}{d_p + d_l + d_r},\; \frac{d_p y_p + d_l y_l + d_r y_r}{d_p + d_l + d_r}\right) \qquad (5)$$
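Equation 5 is a rate-weighted average of the three vertex coordinates. A minimal sketch, with an illustrative function name and tuple-based points as assumptions:

```python
def weighted_centroid(cl, cr, p, dl, dr, dp):
    """Equation 5: centroid of C_l, C_r and P weighted by the data rates."""
    total = dl + dr + dp
    if total <= 0:
        raise ValueError("at least one data rate is assumed to be positive")
    x = (dp * p[0] + dl * cl[0] + dr * cr[0]) / total
    y = (dp * p[1] + dl * cl[1] + dr * cr[1]) / total
    return (x, y)
```

With C_l = (0, 0), C_r = (4, 0), P = (0, 4) and rates d_l = d_r = 1, d_p = 2, the starting point is (1, 2), pulled towards P because the edge to P carries the most data.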

B. Algorithm findLocalMinima

We list the steps involved in determining a local minimum by the steepest descent method in algorithm findLocalMinima.

Input Parameters: Current placement of operator ($\bar{X}_0 = (x_0, y_0)$), $C_l$, $C_r$, $P$, $d_l$, $d_r$, $d_p$

Output Parameter: $X(x, y)$ = optimal local placement of the operator to obtain input data from $C_l$ and $C_r$ and deliver output data to $P$. Let $f(x, y)$ = data transfer cost of placing the operator at $X(x, y)$.

findLocalMinima(X0, Cl, Cr, P, dl, dr, dp)
{
    if (||∇f(x0, y0)|| < δ)
        return X0    /* no need to adapt */
    k = 0
    (x0, y0) = ((dp*xp + dl*xl + dr*xr)/(dp + dl + dr), (dp*yp + dl*yl + dr*yr)/(dp + dl + dr))
    compute (dxk, dyk) = ∇f(xk, yk)
    while (||(dxk, dyk)|| > δ)
    {
        find αmax = max{α : α ≥ 0, (xk − α*dxk, yk − α*dyk) ∈ △PClCr}
        let g(α) = f(xk − α*dxk, yk − α*dyk)
        if (g′(0) ≤ 0 and g′(αmax) ≥ 0)
            find αmin_cost in [0, αmax] s.t. g′(αmin_cost) = 0
        else
            αmin_cost = αmax    /* g(α) decreases as α increases from 0 to αmax */
        (xk+1, yk+1) = (xk − αmin_cost*dxk, yk − αmin_cost*dyk)
        compute (dxk+1, dyk+1) = ∇f(xk+1, yk+1)
        k = k + 1
    }
    return Xk(xk, yk)
}
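The pseudocode above can be turned into a runnable sketch as follows. Several details are assumptions made for this example rather than the authors' implementation: both $\alpha_{max}$ and the root of $g'(\alpha)$ are found by bisection, a `max_iter` cap guarantees termination, and a vertex that coincides with the current point is skipped in the gradient.

```python
import math


def cost(pt, nodes, rates):
    """Equation 3: rate-weighted sum of distances from pt to C_l, C_r and P."""
    return sum(d * math.dist(pt, n) for n, d in zip(nodes, rates))


def gradient(pt, nodes, rates):
    """Gradient of the cost; a node coinciding with pt is skipped (assumption)."""
    gx = gy = 0.0
    for (nx, ny), d in zip(nodes, rates):
        dist = math.hypot(pt[0] - nx, pt[1] - ny)
        if dist > 1e-12:
            gx += d * (pt[0] - nx) / dist
            gy += d * (pt[1] - ny) / dist
    return gx, gy


def in_triangle(pt, tri, eps=1e-9):
    """True if pt lies in the closed triangle (edge cross products share a sign)."""
    signs = []
    for i in range(3):
        (ax, ay), (bx, by) = tri[i], tri[(i + 1) % 3]
        signs.append((bx - ax) * (pt[1] - ay) - (by - ay) * (pt[0] - ax))
    return all(s >= -eps for s in signs) or all(s <= eps for s in signs)


def find_local_minimum(x0, cl, cr, p, dl, dr, dp, delta=1e-6, max_iter=200):
    nodes, rates = (cl, cr, p), (dl, dr, dp)
    if math.hypot(*gradient(x0, nodes, rates)) < delta:
        return x0                                   # no need to adapt
    total = dl + dr + dp
    # Start from the weighted centroid of Equation 5.
    xk = tuple(sum(d * n[i] for n, d in zip(nodes, rates)) / total for i in (0, 1))
    longest = max(math.dist(a, b) for a, b in ((cl, cr), (cr, p), (p, cl)))
    for _ in range(max_iter):
        dx, dy = gradient(xk, nodes, rates)
        norm = math.hypot(dx, dy)
        if norm <= delta:
            break
        step = lambda a: (xk[0] - a * dx, xk[1] - a * dy)
        # alpha_max: largest step that keeps the operator inside the triangle.
        lo, hi = 0.0, 2.0 * longest / norm
        for _ in range(60):
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if in_triangle(step(mid), nodes) else (lo, mid)
        a_max = lo
        # g'(a): directional derivative of the cost along the descent line.
        dg = lambda a: -sum(u * v for u, v in zip(gradient(step(a), nodes, rates), (dx, dy)))
        if dg(a_max) >= 0:
            lo, hi = 0.0, a_max                     # g'(0) < 0 <= g'(a_max)
            for _ in range(60):
                mid = (lo + hi) / 2
                lo, hi = (mid, hi) if dg(mid) < 0 else (lo, mid)
            alpha = (lo + hi) / 2
        else:
            alpha = a_max                           # cost still falling at the edge
        if alpha * norm < 1e-12:                    # no measurable progress
            break
        xk = step(alpha)
    return xk
```

For an equilateral triangle with equal data rates, the weighted centroid is already the minimizer, so the loop exits on its first gradient check; for other inputs the cost at the returned point is never worse than at the weighted centroid, since each step is a line search along a descent direction.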

C. Global Optimal Placement (GOP) of Query Operators

A centralized approach to determine GOP would incur a lot of communication overhead and delay in responding to fluctuations in the network resources and data properties. Therefore, a decentralized solution for operator placement in the network that approximates GOP is desirable.

Lemma 6 (footnote 3): If the placement of operators in the routing tree is globally optimal, then the placement of each operator is also locally optimal with respect to its child and parent operators.

Motivated by Lemma 6, we propose a heuristic that decomposes the problem of determining an optimal placement of all the operators in a query tree into the problem of determining the placement of each operator such that it is optimal with respect to its child and parent operators. As indicated by the data transfer cost in Equation 3, the local optimal placements of operators in a query tree are correlated, i.e., they are affected by the placements of their child and parent operators. We propose a decentralized adaptation algorithm to attain an optimal placement of operators in a query tree by iteratively refining the optimal local placements of individual operators with respect to the updated locations of the child and parent operators.

D. Proposed decentralized adaptation

The proposed decentralized adaptation strategy is based on the following assumptions:

• The sink receives the complex query in the form of an optimized query tree that consists of either binary or unary operators (footnote 4).

• The sink is aware of the location of target regions.

To reduce data flow in the routing tree, it is always better to place a unary operator at the location of its child operator in the query tree. Hence, we consider optimal placement of only binary operators in the proposed decentralized operator placement. The sink computes the initial location of the operators based on the location of the leaf nodes as follows.

Initial Operator placement: Initially the data rates and φ are not known; therefore, operators are placed bottom-up at the center of the shortest path joining the child operators. Although this placement is far from optimal, we adapt it in the subsequent top-down phase, taking into account the parent operator's location at each operator and assuming equal data rates from and to each operator.
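The bottom-up part of the initial placement can be sketched as a post-order traversal. The tree encoding (a leaf is an `(x, y)` tuple, a binary operator is a dict with `left` and `right` entries) and the use of the Euclidean midpoint as the "center of the shortest path" are assumptions made for this example.

```python
def place_initial(node):
    """Return the position of `node`, storing 'pos' on each binary operator.

    Leaves are fixed target-region locations given as (x, y) tuples.
    Operators are dicts {'left': ..., 'right': ...}; each is placed at the
    midpoint of its two children once those children have been placed.
    """
    if isinstance(node, tuple):
        return node
    left = place_initial(node["left"])
    right = place_initial(node["right"])
    node["pos"] = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    return node["pos"]
```

For four target regions at the corners of a 4 by 8 rectangle, the two lower operators land on the midpoints of the short sides and the root lands at the center of the rectangle.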

As data starts flowing through this routing tree, data rates and φ are computed at the operatorQP nodes, and the operator placement is adapted to these factors and with respect to each other's placement. We propose a Two-Phase Decentralized Adaptation (TPDA) strategy to adjust operator placement in a query tree iteratively through alternate bottom-up and top-down phases using current data rates and φ.

Footnote 3: Proof provided in [11].

Footnote 4: The proposed local operator placement for a binary query tree can be extended to an 'n'-ary query tree.

Bottom-up Phase: In the bottom-up phase, we start from the deepest level of operators in the query tree and successively optimize the local placement of operators at upper levels in the tree. This phase propagates the effect of any change in the placement of the child operators to operators which are higher up in the tree. This is followed by a top-down phase.

Top-down Phase: This phase propagates changes in the position of parent operators to their child operators until they reach the leaf nodes of the routing tree.

Thus, a few iterations of a sequence of bottom-up and top-down phases place the operators in the query tree close to a GOP (as demonstrated by simulations in Section IV-F).
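The alternation of the two phases can be sketched centrally as below. This is a simplified, hypothetical illustration and not the authors' decentralized protocol: the local step uses Weiszfeld's iteration (a standard method for the weighted distance-sum problem) in place of the steepest descent routine, positions are continuous rather than snapped to QP nodes, and the `Node` class and all function names are invented for the example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    pos: tuple                      # (x, y); fixed for leaves, adapted for operators
    rate: float = 0.0               # output data rate of this node
    phi: float = 1.0                # data reduction factor (operators only)
    children: list = field(default_factory=list)


def set_rates(node):
    """Operator output rate = phi * sum of input rates (Equation 1)."""
    if node.children:
        node.rate = node.phi * sum(set_rates(c) for c in node.children)
    return node.rate


def local_update(op, parent_pos, iters=50):
    """Move op towards the rate-weighted optimum of its children and parent."""
    pts = [c.pos for c in op.children] + [parent_pos]
    wts = [c.rate for c in op.children] + [op.rate]
    x, y = op.pos
    for _ in range(iters):
        dists = [math.hypot(x - px, y - py) for px, py in pts]
        if min(dists) < 1e-9:       # sitting on a neighbor: keep the position
            break
        den = sum(w / d for w, d in zip(wts, dists))
        x = sum(w * q[0] / d for w, q, d in zip(wts, pts, dists)) / den
        y = sum(w * q[1] / d for w, q, d in zip(wts, pts, dists)) / den
    op.pos = (x, y)


def bottom_up(op, parent_pos):
    for c in op.children:           # deepest operators first
        if c.children:
            bottom_up(c, op.pos)
    local_update(op, parent_pos)


def top_down(op, parent_pos):
    local_update(op, parent_pos)    # parents first, then their children
    for c in op.children:
        if c.children:
            top_down(c, op.pos)


def total_cost(node, parent_pos):
    own = node.rate * math.dist(node.pos, parent_pos)
    return own + sum(total_cost(c, node.pos) for c in node.children)


def tpda(root, sink, rounds=3):
    for _ in range(rounds):
        bottom_up(root, sink)
        top_down(root, sink)


# Example: four target regions, two operators and a root; sink at (50, 300).
leaves = [Node((0.0, 0.0), 10), Node((100.0, 0.0), 10),
          Node((0.0, 100.0), 10), Node((100.0, 100.0), 10)]
a = Node((50.0, 0.0), phi=0.5, children=leaves[:2])
b = Node((50.0, 100.0), phi=0.5, children=leaves[2:])
root = Node((50.0, 50.0), phi=0.5, children=[a, b])
sink = (50.0, 300.0)
set_rates(root)
before = total_cost(root, sink)
tpda(root, sink)
after = total_cost(root, sink)
```

In this example the operators start at the midpoints of their children, and the total data transfer cost after three rounds is lower than before, because every local update only moves an operator when doing so lowers the cost of its own three edges.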

E. Iterative Improvement by TPDA

We call the operator placement at which the operators cease to cause a change in each other's placement the steady state placement. Although this strategy is not guaranteed to reach a steady state placement, we observed in our simulations that operator placements indeed improve and stabilize after a few iterations. Although the stable placement of operators does not guarantee the minimum data transfer cost or GOP, our simulation results, as illustrated in the next section, also indicate that the placement of operators always improves with respect to the original or previous placement and never gets worse.

F. Deviation from Global Optimal Placement

In this section, we compare the percentage deviation of the data transfer cost of the following schemes from GOP:

• Naive: Data is sent directly from each target region to the sink for query evaluation.

• Weighted centroid: The operator is simply placed at the weighted centroid of the child and parent operators following the initial placement.

• TPDA: This is the two-phase algorithm proposed in Section IV-D. In our experiments, we observed that it takes very few adaptive iterations for TPDA to reach a steady state.

• Fast TPDA (FTPDA): TPDA is modified to reduce its computation complexity. In fast TPDA, we do at most 2 iterations of the while loop in algorithm findLocalMinima. Our simulations revealed that we still get very close to TPDA in terms of overall data transfer cost.

The performance comparison is based on simulation. In our experimental setup, we placed 300 QP nodes in a field of size 800 m × 800 m with uniform random distribution. The GOP is computed by doing an exhaustive search of all possible operator placements on the QP nodes in the network. The data transfer cost is computed by using the Euclidean distance between the QP nodes as an indicator of the number of hops. Figure 5(a) and (b) show the deviations of the different schemes from GOP for a balanced and an unbalanced left-deep query tree respectively, each consisting of 4 target regions. The plots are averaged over 100 random selections of target regions and the sink node in the same network configuration, for values of φ between 0 and 1 in increments of 0.1, assuming the same value of φ for each operator. We now summarize the general observations from Figure 5.

• For small values of φ, the deviation of the naive scheme from GOP is the largest compared to the rest of the schemes because it routes the data generated at each target region directly to the sink. The naive scheme performs best when φ approaches 1, since it is then better to place the operators close to the sink.

• The weighted centroid scheme performs much better than the naive scheme because it considers the incoming and outgoing data rates of each operator in the query tree.

• For a wide range of values of φ, TPDA and FTPDA provide the least deviation from GOP as they keep refining the operator placements with respect to their parent and child operators.

Fig. 5. Operator placement strategies: deviation from GOP for (a) a balanced query tree and (b) an unbalanced left-deep query tree.

We conclude that the performance of FTPDA is close to GOP for a wide range of values of φ. For certain values of φ, FTPDA/TPDA has a relatively high (∼10%) deviation from GOP because of some approximations used for convergence. These can be tuned to reduce the deviation and are explained as follows:

• Operator relocation occurs only when its computed optimal position is more than one hop away from its current placement. This reduces unnecessary adaptations when the improvement by operator relocation is insignificant.

• Operator relocation does not occur for low values of the cost function gradient ($\nabla f(\bar{X})$). This is indeed the case when the data rate of one of the operatorQP nodes far exceeds the data rates of the remaining two and, therefore, the optimal placement of the operator is at or close to the operatorQP node with the maximum data rate.

V. INITIAL ROUTING TREE

In this section, we describe the translation of a given query tree at the sink into a corresponding routing tree, and the routing of tuples along the routing tree. Construction of the routing tree consists of the following two phases:

Determination of operator placement: In this phase, the sink computes the initial placement of all the operators in the query tree based on the location of the leaf nodes as described in Section IV-D.

Query Dissemination: In this phase, the operators are assigned to relevant QP nodes in the network, and forward and reverse paths in the tree are set up to connect operatorQP nodes to their child and parent operatorQP nodes respectively. The sink creates a query packet consisting of the sink's location and the entire query tree with the desirable location of each operator in it. The sink greedily forwards (footnote 5) the query packet in the direction of the root operator until it reaches a QP node which is near (within radio range of) the computed location of the root operator. This node, on receiving the query tree, becomes the root node of the query tree and extracts all operators that should reside on it. Then, it splits the rest of the query tree into right and left subtrees and propagates them towards the location of the root operators in their respective subtrees. This way nodes keep getting added to the routing tree, and the process continues until the query packet reaches the leaf nodes.

The reverse paths (node to its parent) are set up as the query packet propagates down towards the leaf nodes. On receiving a query packet, each new operatorQP node generates a location update message consisting of its location for its parent operatorQP node to initialize the forward paths (node to its children).
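The greedy forwarding of the query packet towards a computed operator location can be sketched as follows. This is a simplified illustration: it treats every node as a potential recipient (the paper requires the final node to be a QP node), and the rule of giving up when no neighbor is closer to the target is an assumption, as are the function and parameter names.

```python
import math


def forward_query(start, target, nodes, radio_range):
    """Greedily forward a packet from `start` towards `target`.

    At each hop the packet goes to the neighbor (a node within radio range)
    that is closest to the target. Forwarding stops once the current node is
    within radio range of the target location. Returns (path, delivered).
    """
    path, cur = [start], start
    while math.dist(cur, target) > radio_range:
        nbrs = [n for n in nodes if n != cur and math.dist(n, cur) <= radio_range]
        nxt = min(nbrs, key=lambda n: math.dist(n, target), default=None)
        if nxt is None or math.dist(nxt, target) >= math.dist(cur, target):
            return path, False      # no neighbor makes progress (assumed failure rule)
        path.append(nxt)
        cur = nxt
    return path, True
```

With nodes spaced 40 m apart on a line and a 50 m radio range, a packet sent from (0, 0) towards (130, 0) stops at (80, 0), the first node within radio range of the target location.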

The complexity of tree construction can be defined as the number of messages exchanged for tree construction in terms of the number of target regions, 'n', for a given query. It can be proved that the best case complexity is for constructing a balanced binary tree and is equal to $O(n \log(n))$. It can also be proved that the worst case complexity of routing tree construction is for a left-deep or right-deep tree and is equal to $O(n^2)$. Since the number of target regions anticipated in a query is typically much smaller (say 2 to 50) than the size of the sensor network (10000 nodes), this scheme clearly has much lower overhead than the naive flooding schemes used to construct tree-based communication topologies in existing in-network query processing architectures such as TinyDB [9].

Once the tree is constructed, data streams generated at the leaf nodes are routed up the tree using reverse paths. We use a single path routing protocol, GEAR (Geographic and Energy Aware Routing Algorithm) (footnote 6) [12], to route query responses between operatorQP nodes via intermediate nodes.
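The next-hop rule that the paper's footnote attributes to GEAR (the neighbor with maximum residual energy among all neighbors closer to the destination than the forwarding node) can be sketched as below. This covers only that one-line rule and is not an implementation of the full GEAR protocol; the data layout is an assumption.

```python
import math


def next_hop(current, dest, neighbors):
    """Pick the next hop for a data packet.

    `neighbors` is a list of (position, residual_energy) pairs. Only
    neighbors strictly closer to `dest` than `current` are candidates;
    among them the one with the most residual energy wins. Returns the
    chosen position, or None if no neighbor is closer to the destination.
    """
    here = math.dist(current, dest)
    closer = [(pos, energy) for pos, energy in neighbors if math.dist(pos, dest) < here]
    if not closer:
        return None
    return max(closer, key=lambda pe: pe[1])[0]
```

Note that the rule can prefer a neighbor that makes less geographic progress if it has more energy left, which spreads the forwarding load over several nodes.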

The initial routing tree construction is followed by decentralized adaptations in operator placement to fluctuations in network resources, data properties, etc., as described in the following section.

Footnote 5: To a neighbor closest to the destination location.

Footnote 6: The next hop for a data packet is the neighbor that has maximum residual energy among all neighbors closer than the forwarding node to the destination.

VI. THE DYNAMIC ROUTING TREE

After the initial tree is constructed, we implement each cycle of bottom-up and top-down phases in the TPDA strategy to adapt the routing tree after an adaptivity interval of $t_{adapt}$. Although new operator placements are computed at operatorQP nodes in the bottom-up phase, the operators are moved to their adapted locations based on the placements computed by the top-down phase following the bottom-up phase. This way operators adapt only once for a set of bottom-up and top-down phases. The value of $t_{adapt}$ can be adjusted depending on the expected frequency of fluctuation in data properties in the network. In order to minimize control messages sent up and down the tree while implementing TPDA, we use time synchronization.

A. Adaptation to Network Resources

For a long running query, the operatorQP nodes on the routing tree may get depleted of their energy, or their node degree may reduce due to failure of their neighboring nodes. We maintain the residual energy level, $E_R$, of each operatorQP node and its connectivity status with the neighboring nodes, n (node degree) (footnote 7), while the query is in execution. The resource deficient operatorQP node transfers its current operators and associated routing table to the neighboring QP node that has adequate available energy and a high node degree. If an operatorQP node in the tree stops functioning, it can be locally repaired by its parent operatorQP node.
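One way to read the handoff rule is sketched below. The paper does not specify how "adequate" energy or a "high" node degree is decided, so the energy threshold and the tie-breaking order are hypothetical choices made for this example.

```python
def pick_handoff_target(neighbors, min_energy):
    """Choose the neighboring QP node that should take over the operators.

    `neighbors` is a list of dicts with keys 'id', 'energy' (residual
    energy E_R) and 'degree' (node degree n). Nodes below `min_energy`
    (a hypothetical threshold) are excluded; among the rest, the highest
    node degree wins and residual energy breaks ties.
    """
    eligible = [n for n in neighbors if n["energy"] >= min_energy]
    if not eligible:
        return None
    return max(eligible, key=lambda n: (n["degree"], n["energy"]))["id"]
```

If no neighbor qualifies, the function returns `None`, which would correspond to falling back on the local repair by the parent operatorQP node mentioned above.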

B. Adaptation to Query Specifications

For a long running query, the user might want to stop querying some existing target regions, initiate monitoring of new target regions, or reorder operators in the query tree. If more than, say, 50% of the total number of operators need to be moved or updated, the existing routing tree is purged, and a fresh routing tree is constructed based on the modified query tree. Otherwise, the sink initiates local repair of a subtree to add or remove target regions through exchange of appropriate control messages.

We now present simulation results to illustrate the construction of a routing tree based on a given query tree and the adaptation of the routing tree to changing data properties and network resources.

VII. EXPERIMENTS

We conducted our simulations in a general discrete event simulator, Simjava [4]. For our experiments, we place 2500 low power sensor nodes and 300 QP nodes with a uniform random distribution in a square field of size 800 m × 800 m. This ensures that there is at least one QP node in a cluster of 8 to 10 low power sensor nodes, for a radio range of 50 m for any node in the network. Note that at a given QP node, we simulate data processing (operator evaluation) by assuming values of the data reduction factor (φ) of each operator residing on it. In a real application, φ for an operator may be computed at the operatorQP nodes by collecting meta-data such as the average number of input tuples eliminated per unit time.

⁷ In a sensor network, a node periodically transmits 'alive' messages to all its neighbors.
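The deployment figures quoted above can be checked with a little arithmetic and a small Monte Carlo run. This is a sketch, not the authors' Simjava setup: the helper names, the fixed seed, and the coverage measure (fraction of sensor nodes with a QP node in radio range) are assumptions made for illustration.

```python
import math
import random

FIELD_SIDE_M = 800.0
RADIO_RANGE_M = 50.0
NUM_SENSORS = 2500
NUM_QP = 300

# Average number of low power sensor nodes per QP node (about 8.3).
sensors_per_qp = NUM_SENSORS / NUM_QP

# Expected number of QP nodes inside one radio disk, ignoring field edges
# (about 3.7).
disk_area = math.pi * RADIO_RANGE_M ** 2
expected_qp_in_range = NUM_QP * disk_area / FIELD_SIDE_M ** 2


def deploy(count, rng):
    """Place `count` nodes uniformly at random in the square field."""
    return [(rng.uniform(0.0, FIELD_SIDE_M), rng.uniform(0.0, FIELD_SIDE_M))
            for _ in range(count)]


def fraction_covered(sensors, qps, radio_range):
    """Fraction of sensor nodes with at least one QP node within radio range."""
    covered = sum(
        1 for s in sensors
        if any(math.dist(s, q) <= radio_range for q in qps)
    )
    return covered / len(sensors)


rng = random.Random(7)  # fixed seed so the run is repeatable
sensors = deploy(NUM_SENSORS, rng)
qps = deploy(NUM_QP, rng)
coverage = fraction_covered(sensors, qps, RADIO_RANGE_M)

print(f"sensor nodes per QP node: {sensors_per_qp:.2f}")
print(f"expected QP nodes in range: {expected_qp_in_range:.2f}")
print(f"sensor nodes with a QP node in range: {coverage:.1%}")
```

The ratio 2500/300 matches the "8 to 10 low power sensor nodes" per QP node stated above. With uniform random placement, the presence of a QP node within range is a statistical property of the deployment, which the last printed figure estimates for one random layout.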

[Figure 6: query tree with a root, operators A to H, and target regions T1 to T5.]

Fig. 6. Query tree, QT, used in experiments.

We present a series of snapshot views of the routing tree at different times corresponding to the query tree, QT, in Figure 6. We select the locations of the sink and target regions of QT randomly in the network. The initial routing tree, its various adaptations, and the dynamic routing tree are shown in Figure 7, where we plot the co-ordinates of each operatorQP node in the routing tree and connect parent and child operatorQP nodes with line segments to show the direction of data flow in the tree. We have labeled the operators and their corresponding values of φ next to them at the operatorQP nodes.

Initial tree: The placement of operatorQP nodes in the initial routing tree, shown in Figure 7(a), is symmetric with respect to the locations of their child operators, as they are placed approximately at the center of the locations of their child operators (refer to expression 5).

Steady state tree: We generate random values of φ for each operator in the query tree, QT, and assign them to the operators during initial tree construction. The initial routing tree in Figure 7(a) is adapted to data rates and φ of operators using the TPDA strategy to obtain an optimal placement, as shown in Figure 7(b). The total number of iterations required for the operators to settle at the steady state placement for this example was observed to be 2. We label the amount of data transmitted up the routing tree; 100 packets are generated at each target region, and after data reductions at each operatorQP node, 45 packets finally reach the sink as the query results.

Fluctuation in φ: Once the query is in execution, the data generation rate at the target regions or the value of φ of each operator may vary with time. Figure 7(c) shows the routing tree adapted to the new value of φ for operator D in the routing tree shown in Figure 7(b).

Reduction in network resources: To simulate continuous queries, data in the tree is routed long enough to cause the energy level of a few nodes to drop below a critical threshold.
In Figure 7(d), operatorQP node 'X' evaluating operator D is almost depleted of its energy; therefore it initiates transfer of operator D to its neighboring QP node 'Y'.

[Figure 7: four plots of the routing tree in the field, with both axes spanning 0 to 800, showing the sink, target regions T1 to T5, and the operatorQP nodes. Panels: (a) Initial Operator Placement; (b) Steady State Operator Placement, with operators labeled A(0.5), B(0.5), C(0.4), D(0.5), E(0.8), F(0.8), G(0.9), H(0.9); (c) Adaptation to modified value of φ for operator D, now labeled D(0.35); (d) Relocation of operator D from QP node X to Y when E_R of X is critically low.]

Fig. 7. Routing tree for Query Tree, QT
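The packet counts in Figure 7 can be related to φ with a one-line calculation. The sketch below assumes that φ is the ratio of an operator's output volume to its input volume, which is consistent with the label H(0.9) next to the values 100 and 90 in Figure 7(b) but is not restated in this section. Both function names are hypothetical.

```python
import math


def output_packets(input_packets: float, phi: float) -> float:
    """Packets forwarded by an operator, assuming output = phi * input."""
    if input_packets < 0 or phi < 0:
        raise ValueError("input_packets and phi must be non-negative")
    return input_packets * phi


def estimate_phi(input_tuples: int, eliminated_tuples: int) -> float:
    """Estimate phi from meta-data: the fraction of input tuples that survive."""
    if input_tuples <= 0:
        raise ValueError("input_tuples must be positive")
    if not 0 <= eliminated_tuples <= input_tuples:
        raise ValueError("eliminated_tuples must lie in [0, input_tuples]")
    return (input_tuples - eliminated_tuples) / input_tuples


# Operator H with phi = 0.9 receiving 100 packets from its target region.
forwarded = output_packets(100, 0.9)
print(round(forwarded))  # 90

# An operator that eliminates 10 of every 100 input tuples has phi = 0.9.
assert math.isclose(estimate_phi(100, 10), 0.9)
```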

VIII. RELATED WORK

Bonfils et al. were the first to introduce the problem of operator placement [3] for supporting in-network query processing in homogeneous sensor networks. For optimal operator placement in a query tree, they proposed a neighbor exploration strategy, which uses a cost function that is the cumulative cost of the right and left subtrees of an operator but does not include the outgoing data transfer rate to the parent operator, which is crucial for minimizing local data transfer cost, as illustrated in Section IV. Besides, they have not evaluated the number of iterations required to reach a GOP, or the overhead of messages periodically transmitted up the tree to compute the costs of the subtrees. We have exploited the additional computation power at QP nodes to implement a non-linear optimization method that has low communication overhead and converges to GOP fairly fast.

IX. CONCLUSION

In this paper, we have proposed a network and query model for evaluating complex queries to serve spatio-temporal data monitoring applications. We have provided the motivation for designing an energy aware routing infrastructure that supports in-network query processing and proposed a decentralized algorithm, TPDA, with low communication overhead for the same. We have shown by simulations that TPDA maps a query tree to a routing tree that has operator placements close to the global optimum. We have demonstrated the construction of the initial routing tree and its adaptive features through experiments.

REFERENCES

[1] D. P. Agrawal, Q. Zeng, Introduction to Wireless and Mobile Systems, Brooks/Cole Publishing, 436 pages, ISBN 0534-40851-6, 2003.

[2] D. P. Bertsekas, Nonlinear Programming: 2nd Edition, Athena Scientific Publishing, 708 pages, ISBN 1-886529-00-0, 1999.

[3] B. J. Bonfils, P. Bonnet, “Adaptive and Decentralized Operator Placement for In-Network Query Processing,” In Proc. of 2nd International Workshop, IPSN, Apr. 2003.

[4] F. Howell and R. McNab, “Simjava: A Discrete Event Simulation Package for Java,” Web Page, http://www.dcs.ed.ac.uk/home/hase/simjava

[5] T. Imielinski and B. Nath, “Wireless Graffiti - Data, Data Everywhere,” In Intl. Conf. on Very Large Data Bases (VLDB), 2002.

[6] Intel Research Oregon, “Heterogeneous sensor networks,” Technical report, Intel Corporation, Web Page, http://www.intel.com/research/exploratory/heterogeneous.htm.

[7] C. Intanagonwiwat, D. Estrin, R. Govindan, and J. Heidemann, “Impact of Network Density on Data Aggregation in Wireless Sensor Networks,” Technical Report 01-750, University of Southern California, Nov. 2001.

[8] R. Kumar, V. Tsiatsis and M. Srivastava, “Computation Hierarchy for In-network Processing,” In WSNA 2003.

[9] S. Madden, M. J. Franklin, J. M. Hellerstein, W. Hong, “The Design of an Acquisitional Query Processor for Sensor Networks,” In ACM SIGMOD Conference, June 2003.

[10] G. E. Moore, “Cramming More Components onto Integrated Circuits,” In Electronics, pp. 114-117, April 1965.

[11] N. Jain, “Energy Aware and Adaptive Routing Protocols for Wireless Sensor Networks,” Ph.D. Thesis, Computer Science, University of Cincinnati, Summer 2004.

[12] Y. Yu, R. Govindan, and D. Estrin, “Geographical and Energy Aware Routing: A Recursive Data Dissemination Protocol for Wireless Sensor Networks,” UCLA Computer Science Department Technical Report UCLA/CSD-TR-01-0023, May 2001.

IEEE Communications Society / WCNC 2005 1866 0-7803-8966-2/05/$20.00 © 2005 IEEE