14
Privacy preservation in wireless sensor networks: A state-of-the-art survey Na Li a, * , Nan Zhang b , Sajal K. Das a , Bhavani Thuraisingham c a Department of Computer Science and Engineering, The University of Texas at Arlington, Box 19015, 416 Yates St., Room 300, Nedderman Hall, Arlington, TX 76019-0015, United States b Department of Computer Science, The George Washington University, 801 22nd Street NW, Suite 704, Washington DC 20052, United States c Department of Computer Science, Erik Jonsson School of Engineering & Computer Science, The University of Texas at Dallas, 800 W. Campbell Road, MS EC31, Richardson, TX 75080, United States article info Available online 22 April 2009 Keywords: Wireless sensor network Privacy abstract Much of the existing work on wireless sensor networks (WSNs) has focused on addressing the power and computational resource constraints of WSNs by the design of specific rout- ing, MAC, and cross-layer protocols. Recently, there have been heightened privacy concerns over the data collected by and transmitted through WSNs. The wireless transmission required by a WSN, and the self-organizing nature of its architecture, makes privacy pro- tection for WSNs an especially challenging problem. This paper provides a state-of-the- art survey of privacy-preserving techniques for WSNs. In particular, we review two main categories of privacy-preserving techniques for protecting two types of private informa- tion, data-oriented and context-oriented privacy, respectively. We also discuss a number of important open challenges for future research. Our hope is that this paper sheds some light on a fruitful direction of future research for privacy preservation in WSNs. Ó 2009 Elsevier B.V. All rights reserved. 1. Introduction In recent years, wireless sensor networks (WSNs) have drawn considerable attention from the research commu- nity on issues ranging from theoretical research to practi- cal applications. Special characteristics of WSNs, such as resource constraints on energy and computational power, have been well defined and widely studied [3]. What has received less attention, however, is the critical privacy con- cern on information being collected, transmitted, and ana- lyzed in a WSN. Such private information of concern may include payload data collected by sensors and transmitted through the network to a centralized data processing ser- ver. For example, a patient’s blood pressure, sugar level and other vital signs are usually of critical privacy concern when monitored by a medical WSN which transmits the data to a remote hospital or doctor’s office. Privacy con- cerns may also arise beyond data content and may focus on context information such as the location of a sensor ini- tiating data communication. Note that an alert communi- cation originating from a patient’s heart monitor in the medical WSN is enough for an adversary to infer that the patient suffers from heart problem. Effective countermea- sure against the disclosure of both data and context-ori- ented private information is an indispensable prerequisite for the broad application of WSNs to real-world applications. Privacy protection has been extensively studied in var- ious fields related to WSN such as wired and wireless net- working, databases and data mining. Nonetheless, the following inherent features of WSNs introduce unique challenges for privacy preservation in WSNs, and prevent the existing techniques from being directly transplanted: Uncontrollable environment: Sensors may have to be deployed to an environment uncontrollable by the defen- der, such as a battlefield, enabling an adversary to launch physical attacks to capture sensor nodes or deploy coun- terfeit ones. As a result, an adversary may retrieve private 1570-8705/$ - see front matter Ó 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.adhoc.2009.04.009 * Corresponding author. Tel.: +1 8172727409. E-mail addresses: [email protected] (N. Li), [email protected] (N. Zhang), [email protected] (S.K. Das), bhavani.thuraisingham@utdal- las.edu (B. Thuraisingham). Ad Hoc Networks 7 (2009) 1501–1514 Contents lists available at ScienceDirect Ad Hoc Networks journal homepage: www.elsevier.com/locate/adhoc

Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

Ad Hoc Networks 7 (2009) 1501–1514

Contents lists available at ScienceDirect

Ad Hoc Networks

journal homepage: www.elsevier .com/locate /adhoc

Privacy preservation in wireless sensor networks: A state-of-the-art survey

Na Li a,*, Nan Zhang b, Sajal K. Das a, Bhavani Thuraisingham c

a Department of Computer Science and Engineering, The University of Texas at Arlington, Box 19015, 416 Yates St., Room 300, Nedderman Hall,Arlington, TX 76019-0015, United Statesb Department of Computer Science, The George Washington University, 801 22nd Street NW, Suite 704, Washington DC 20052, United Statesc Department of Computer Science, Erik Jonsson School of Engineering & Computer Science, The University of Texas at Dallas, 800 W. Campbell Road,MS EC31, Richardson, TX 75080, United States

a r t i c l e i n f o

Available online 22 April 2009

Keywords:Wireless sensor networkPrivacy

1570-8705/$ - see front matter � 2009 Elsevier B.Vdoi:10.1016/j.adhoc.2009.04.009

* Corresponding author. Tel.: +1 8172727409.E-mail addresses: [email protected] (N. Li), nz

Zhang), [email protected] (S.K. Das), bhavani.thlas.edu (B. Thuraisingham).

a b s t r a c t

Much of the existing work on wireless sensor networks (WSNs) has focused on addressingthe power and computational resource constraints of WSNs by the design of specific rout-ing, MAC, and cross-layer protocols. Recently, there have been heightened privacy concernsover the data collected by and transmitted through WSNs. The wireless transmissionrequired by a WSN, and the self-organizing nature of its architecture, makes privacy pro-tection for WSNs an especially challenging problem. This paper provides a state-of-the-art survey of privacy-preserving techniques for WSNs. In particular, we review two maincategories of privacy-preserving techniques for protecting two types of private informa-tion, data-oriented and context-oriented privacy, respectively. We also discuss a numberof important open challenges for future research. Our hope is that this paper sheds somelight on a fruitful direction of future research for privacy preservation in WSNs.

� 2009 Elsevier B.V. All rights reserved.

1. Introduction

In recent years, wireless sensor networks (WSNs) havedrawn considerable attention from the research commu-nity on issues ranging from theoretical research to practi-cal applications. Special characteristics of WSNs, such asresource constraints on energy and computational power,have been well defined and widely studied [3]. What hasreceived less attention, however, is the critical privacy con-cern on information being collected, transmitted, and ana-lyzed in a WSN. Such private information of concern mayinclude payload data collected by sensors and transmittedthrough the network to a centralized data processing ser-ver. For example, a patient’s blood pressure, sugar leveland other vital signs are usually of critical privacy concernwhen monitored by a medical WSN which transmits thedata to a remote hospital or doctor’s office. Privacy con-

. All rights reserved.

[email protected] (N.uraisingham@utdal-

cerns may also arise beyond data content and may focuson context information such as the location of a sensor ini-tiating data communication. Note that an alert communi-cation originating from a patient’s heart monitor in themedical WSN is enough for an adversary to infer that thepatient suffers from heart problem. Effective countermea-sure against the disclosure of both data and context-ori-ented private information is an indispensable prerequisitefor the broad application of WSNs to real-worldapplications.

Privacy protection has been extensively studied in var-ious fields related to WSN such as wired and wireless net-working, databases and data mining. Nonetheless, thefollowing inherent features of WSNs introduce uniquechallenges for privacy preservation in WSNs, and preventthe existing techniques from being directly transplanted:

� Uncontrollable environment: Sensors may have to bedeployed to an environment uncontrollable by the defen-der, such as a battlefield, enabling an adversary to launchphysical attacks to capture sensor nodes or deploy coun-terfeit ones. As a result, an adversary may retrieve private

Page 2: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

1502 N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514

keys used for secure communication and decrypt anycommunication eavesdropped by the adversary.

� Sensor-node resource constraints: A battery-powered sen-sor node generally has severe constraints on its ability tostore, process, and transmit the sensed data. As a result,the computational complexity and resource consump-tion of public-key ciphers is usually considered unsuit-able for WSNs. This introduces additional challengesfor privacy preservation.

� Topological constraints: The limited communicationrange of sensor nodes in a WSN requires multiple hopsin order to transmit data from the source to the base sta-tion. Such a multi-hop scheme demands different nodesto take diverse traffic loads. In particular, a node closerto the base station (i.e., data collecting and processingserver) has to relay data from nodes further away frombase station in addition to transmitting its own gener-ated data, leading to higher transmission rate. Such anunbalanced network traffic pattern brings significantchallenges to the protection of context-oriented privacyinformation. Particularly, if an adversary holds the abil-ity of global traffic analysis, observing the traffic pat-terns of different nodes over the whole network, it caneasily identify the sink and compromise context privacy,or even manipulate the sink node to impede the properfunctioning of the WSN.

The unique challenges for privacy preservation in WSNscall for the development of effective privacy-preservingtechniques. In this paper, we provide a state-of-the-art sur-vey of existing privacy-preserving techniques in WSNs. Wereview two main categories of privacy-preserving tech-niques for protecting two types of private information,data-oriented and context-oriented privacy, respectively.In the category of data privacy, we mainly discuss how toenable the aggregation of sensed data without violatingthe privacy of the data being collected and guarantee theprivacy of data query initiated by users of the network.For context-based privacy, we analyze the protection oflocation privacy and temporal private information. In addi-tion, we build a table to compare different techniques interms of their effectiveness in practical applications. Lastbut not the least, we discuss some interesting and chal-lenging open issues on this topic, which are expected toshed light on a fruitful direction of future research on pri-vacy preservation in WSNs.

The rest of the paper is organized as follows. We reviewprivacy-preserving techniques in related fields in Section 2.In Section 3, we introduce our taxonomy of privacy-pre-serving techniques in WSNs. Sections 4 and 5 address tech-niques for data and context-oriented privacy protection,respectively. In Section 6, we evaluate and compare theperformance of different privacy-preserving techniques.Section 7 outlines the open challenges for future research,followed by final remarks in Section 8.

2. Related work

Research on issues related to WSNs requires multi-disciplinary studies spanning networking, databases,

distributed computing, etc. To properly understand thechallenges of privacy preservation in WSNs and the tech-niques necessary to address such challenges, it is importantto first examine the privacy issues and privacy-preservingtechniques in such related fields as databases, data min-ing and wireless networks, which we briefly review asfollows.

In the field of database and data mining, privacy con-cerns may arise from three types of systems [36]: The firstis an information sharing system which involves two ormore mutually untrusted parties. The objective is to guar-antee that no private information beyond the minimumnecessary is disclosed during information sharing. Crypto-graphic secure multi-party computation techniques areusually used for this type of systems [1,11,35,38]. The sec-ond is a data collection system where one centralized datacollector/analyzer collects and mines data from multipledistributed data providers. Random perturbation tech-niques [2,15,33,34] are usually applied to protecting pri-vacy in these systems. The third type is a data publishingsystem, the objective of which is to publish data to supportdata analytical application without compromising the ano-nymity of individual data owners. k-anonymity [27] and l-diversity based algorithms [19,20] are proposed for privacyprotection in these systems.

Privacy issues have also been extensively studied in thedomain of generic networking. Location privacy is of par-ticular concern with the pervasive development of ad-vanced wireless device, like PDA, and with the advent oflocation-based service (LBS). In an LBS system, a user hold-ing a wireless device queries the LBS server to obtain thenearest restaurant or hospital to the user. Nonetheless,the user would not willingly disclose his/her real location.To address such location privacy concerns, ANONYMIZER,a trusted-third-party based framework, was proposed[13]. With this framework, a user first sends his/her loca-tion to the centralized anonymizer which then queriesthe LBS server with not the user’s real location but a cloak-ing region which covers not only the user but also a num-ber of other users. This technique prevents the LBS serverfrom distinguishing one user from many others. However,it is unlikely to be practical for two reasons: First, it is notreasonable to assume the existence of a trustable thirdparty. Second, even if such a trusted third party exists, itcreates a single point-of-failure for the system because ifthe third party is compromised, the privacy preservationover the whole system will completely collapse. To removethe requirement of a trusted third party, a private-infor-mation-retrieval based technique was proposed [12].Nonetheless, this technique also suffers from significantcomputation and communication overhead. Besides thedirect disclosure of user’s location from query payload,traffic flow information may also breach location privacy.In particular, the server may pinpoint the location of a userbased on the user’s IP address. In order to provide trafficflow confidentiality, Tor [10] is designed, as the secondgeneration onion routing [22], and becomes a popularanonymous communication network which consists ofthousands of Tor routers to relay user traffic to or fromLBS server.

Page 3: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514 1503

3. Taxonomy of privacy-preserving techniques forWSNs

In this section, we introduce our taxonomy of privacy-preserving techniques for WSNs. Fig. 1 depicts the com-plete taxonomy which includes the problems defined (aselipses) as well as the techniques proposed (as rectangles)in the literature. One can see from the figure that there aretwo main types of privacy concerns, data-oriented andcontext-oriented concerns. Data-oriented concerns focuson the privacy of data collected from, or query posted to,a WSN. On the other hand, context-oriented concerns con-centrate on contextual information, such as the locationand timing of traffic flows in a WSN. Data- and context-ori-ented privacy concerns may be violated by data- and traf-fic-analysis attacks, respectively. Fig. 2 depicts a simpleillustration of the two types of attacks. One can see fromFig. 2 that, in the case of data analysis attack, a maliciousnode of the WSN abuses its ability of decrypting data tocompromise the payload being transmitted. In traffic-anal-ysis attacks, a third-party adversary does not have the abil-ity to decrypt data payloads. Instead, it eavesdrops thewirelessly transmitted data and tracks the traffic flowinformation hop-by-hop. In the following, we provide abrief overview of the problem definitions and main chal-lenges for these two categories, respectively.

Reshuffled data:-SMART [14]

Flooding:-Baseline flooding [18]-Probabilistic flooding [18]

R-Rgr-Pse

Hop bReenc

Privacy in

Data Privacy

Data Aggregation Data Query Locati

Data SourRandom noise:-CPDA [14]

Fake s

Approximate histogram:-GP2S [32]

Anonymity:-K-anonymity based [4]-DP2AC [37]

Random walk:-Phantom routing [18]-Greedy random walking [28]

Dummy injection:-Local dummy [18]-Basic global dummy [24]-Global dummy with filter [29]

Fig. 1. Taxonomy of privacy-prese

3.1. Overview of data-oriented privacy protection

Data-oriented privacy protection focuses on protectingthe privacy of data content. Here ‘‘data” refer to not onlysensed data collected within a WSN but also queries posedto a WSN by users. In the above-mentioned medical WSNexample, private information may include temperatureand blood pressure collected from the WSN, or querieson these vital signs posed to the WSN.

There are two types of adversaries which maycompromise data-oriented privacy. One is an externaladversary which eavesdrops the data communicationbetween sensor nodes in a WSN. This type of adversar-ies can be effectively defended against using the tradi-tional techniques of cryptographic encryption andauthentication.

The second type is an internal adversary which is also aparticipating node of the WSN, but has been captured andmanipulated by malicious entities to compromise privateinformation. Since a participating node is allowed to de-crypt data legally, the traditional encryption and authenti-cation techniques may no longer be effective. Thus, themain challenge for protecting data-oriented privacy is toprevent an internal adversary from compromising the pri-vate information, while maintaining the normal operationof the WSN.

Multiple parents routing [8][9]

andom walk routing:andom selection in two oup neighbors [16]robabilistic parent lection [8,9]

y hop ryption [9]

Transmission rate control [9]

Dummy propagation [8] [9]

WSNs

Context Privacy

Base Station

on Privacy Temporal Privacy

Global AnalysisLocal Analysis

ce

Random sending time [9]

ources [21]

Random delay [17]

rving techniques for WSNs.

Page 4: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

Fig. 2. Two scenarios for privacy attack in a WSN.

1504 N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514

3.2. Overview of context-oriented privacy protection

Context-oriented privacy protection focuses on pro-tecting contextual information, such as the location[7–9,16,18,21,24,28,29] and timing [17] information oftraffic transmitted in a WSN. Location privacy concernsmay arise for such special sensor nodes as the data source[18,21,24,28,29] and the base station [7–9,16]. As wementioned in Section 1, an adversary with knowledge ofthe location of the data source or base station locationmay be able to infer the content of the data being transmit-ted or destroy the sensor network.

Timing privacy, on the other hand, concerns the timewhen sensitive data is created at data source, collected bya sensor node and transmitted to the base station [17]. Thistype of privacy is also of primary importance, especially inthe mobile target tracking application of WSNs, because anadversary with knowledge of such timing information maybe able to pinpoint the nature and location of the trackedtarget without learning the data being transmitted in theWSN [17]. Furthermore, the adversary may be able to pre-dict the moving path of the mobile target in the future, vio-lating the privacy of the target.

Similar to data-oriented privacy, context-oriented pri-vacy may also be threatened by both external and internaladversaries. Nonetheless, existing research has mostly fo-cused on defending against external adversaries, becausesuch adversaries may be able to compromise context pri-vacy easily by monitoring wireless communication. Withinthe category of external adversaries, one can further clas-sify adversaries into two categories, local attackers andglobal attackers, based on the strength of attacks an adver-sary is capable of launching. Local attackers can only mon-itor a local area within the coverage area of a WSN, andtherefore have to analyze traffic hop-by-hop to compromisetraffic context information. On the other hand, a globalattacker has the capability (e.g., a high-gain antenna) ofmonitoring the global traffic in a WSN. One can see thata global attacker is much stronger than a local one.

We would like to remark that the classification of con-text-oriented privacy into the two categories of locationand timing privacy only reflects the current state-of-the-art, and should not be treated as a comprehensive classifi-cation. Context-oriented privacy also extends to other con-cerns, such as the hiding of time frequency ofcommunication between sensor nodes in a WSN, becausesuch frequency may expose information about the trafficflow in a WSN. In the above-mentioned medical WSN,knowing that a patient’s blood pressure sensor is havingfrequent communication with its neighboring nodes is en-ough for an adversary to infer that the patient may be suf-fering from high or low arterial pressure.

4. Data-oriented privacy protection techniques

Recall from Section 3 that two types of adversariesthreatening data-oriented privacy are external and internaladversaries. To defend against external adversaries, crypto-graphic techniques such as encryption and authenticationcan be used effectively. In particular, a multi-level privacyprotection scheme was proposed in [25] to assign differentencryption keys to sensed data which require different lev-els of privacy protection, in order to properly defendagainst external adversaries.

Compared with external adversaries, the internal adver-saries can be more powerful because they consist of nodes/base stations captured and controlled by malicious entitiesand therefore may have knowledge of encryption keysused in a WSN. As mentioned in Section 3, there are twotypes of data privacy we would like to protect againstinternal adversaries: the privacy of data being collectedand the privacy of queries being posed to a WSN. For thefirst class privacy, a straight forward approach to defendagainst internal adversaries is to apply end-to-end encryp-tion between the data source and the base station. Withthis approach, no intermediate node, including the internaladversaries, can compromise the privacy of data being

Page 5: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514 1505

transmitted without knowing the key shared by only thetwo end nodes. Nonetheless, this approach also impedesthe normal operation of a WSN. In particular, it rendersdata aggregation (i.e., node-by-node selection of transmit-ted data to reduce traffic volume) infeasible in intermedi-ate nodes, due to their incapability of decrypting thetransmitted data. This leads to a significant amount ofadditional traffic being transmitted, jeopardizing the en-ergy consumption of a power-constrained WSN. Thus, amain challenge for privacy-preserving data aggregation isto defend against internal adversaries under hop-by-hop(i.e., link) encryption, where nodes that are directly com-municating with each other share a private key for encryp-tion and decryption.

For protecting the privacy of queries posed to a WSN,the base station must collect not only the data useful foranswering a query, but also other data available in aWSN, in order to prevent an adversary from inferring theprivate query based on the data being accessed. Thus, amain challenge for protecting query privacy is to minimizethe amount of dummy tuple while maintaining the privacyof queries being issued [4]. In the following, we will pres-ent the existing techniques for protecting the privacy ofdata aggregated and queries posed, respectively.

4.1. Privacy protection during data aggregation

Data aggregation is designed to substantially reduce thevolume of traffic being transmitted in a WSN by fusing orcompressing data in the intermediate sensor nodes (calledaggregators). It is an important technique for preserving re-sources (e.g., energy consumption) in a WSN. Interestingly,it is also a common and effective method to preserve pri-vate data against an external adversary, because the pro-cess compresses large inputs to small outputs at theintermediate sensor nodes. However, as mentioned above,the usage of data aggregation also leads to vulnerabilityagainst internal adversaries. In particular, if an aggregatoris compromised, it may perform either passive or activeattacks:

� For passive attacks, the malicious aggregator properlyfollows the protocols defined by a WSN, with the onlyexception that it may record all intermediate computa-tion and communication. Since an aggregator is sup-posed to perform the aggregation operations (e.g.,max/min), it has the capability of decrypting the trans-mitted data and compromising the content privacy.

� Furthermore, an aggregator may launch active attacks toinject bogus data or tamper with raw data, leading to theinvalidity of collected data at base station. Unlike thepassive attacks which aim to compromise the confiden-tiality of private data, active attacks aim to destroy theintegrity of collected data. Existing techniques defendagainst such active attacks by utilizing trust-basedmechanisms to confirm the aggregated result [6,30,31].

In the following, we review the existing techniques forprivacy-preserving data aggregation [14,32]. A commonidea of existing techniques is to perturb the data beingtransmitted in a WSN. Two privacy-preserving techniques

were proposed in [14]: (i) cluster-based private data aggre-gation (CPDA), which adds random seeds into the originaldata, and (ii) slice-mixed aggregation (SMART), whichchops one data item into pieces and then rebuilds datapackage after exchanging those pieces randomly. Both ap-proaches rely on the cooperation of neighbors to hide indi-vidual raw data, and they concentrate on the SUMaggregation operation. On the other hand, in [32], a collu-sion-resilient privacy-preserving data-aggregation tech-nique is proposed, generalizing the transmitted data tosupport multiple aggregation operations, such as MIN/MAX, Median, etc. In the following, we review these threetechniques in detail.

4.1.1. Cluster-based privacy data aggregation (CPDA)The basic idea of CPDA is to introduce noise to the raw

data sensed from a WSN, such that although an aggregatorcan obtain accurate aggregated information but not indi-vidual data points. This is similar to the data perturbationapproach extensively used in privacy-preserving data min-ing. However, unlike in privacy-preserving data miningwhere noises are independently generated (at random)and therefore leads to imprecise aggregated results, thenoises in CPDA are carefully designed to leverage the coop-eration between different sensor nodes, such that the pre-cise aggregated values can be obtained by the aggregator.

In particular, CPDA classifies sensor nodes into two cat-egories: cluster heads and cluster members. There is a one-to-many mapping between the cluster heads and clustermembers. The cluster heads are responsible for directlyaggregating data from cluster members, with the commu-nication secured by a different shared key between anypair of communicating nodes. Consider a cluster with onehead and n members. Let the cluster head be denoted ass0, and let all other sensors be denoted as s1,. . ., sn, respec-tively. Let Pi be the private data values belonging to a sen-sor si. Each node is pre-assigned a non-zero number, Ai,which is known to all other members in the same cluster.Furthermore, each sensor si generates n private randomnumbers, Ri

1; . . . ;Rin and calculates the following numbers:

Vij ¼ Pi þ Ri

1Aj þ Ri2ðAjÞ2 þ . . .þ Ri

nðAjÞn ð1Þ

where 0 6 j 6 n. Then si will send Vij to sj, which is en-

crypted by the shared key between siand sj. After exchang-ing data, every sensor sj calculates a value Fj as follows

Fj ¼Xn

i¼0

Vij ¼ P þ R1Aj þ R2ðAjÞ2 þ . . .þ RnðAjÞn ð2Þ

where P ¼Pn

i¼0Pi and Rk ¼Pn

i¼0Rik for 1 6 k 6 n. Then sj

broadcasts Fj to the cluster head s0 which calculates thesum P according to U ¼ G�1F where G�1 is the inverse ofthe matrix

G ¼

1 A0 . . . ðA0Þn

1 A1 . . . ðA1Þn

. . . . . . . . . . . .

1 An . . . ðAnÞn

0BBB@

1CCCA: ð3Þ

Here F ¼ ðF0; F1; F2; . . . FnÞT and U ¼ ðP; R1; R2; . . . RnÞT

where XT is the transpose of a matrix X. So P is the first

Page 6: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

Fig. 4. Mixing: ðJ ¼ 2;h ¼ 1Þ.

Fig. 5. Aggregating.

1506 N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514

element of U: Note that the matrix G is of full rank becauseAi are distinct numbers. One can see that the cluster headsnow learn the sum of private data from their cluster mem-bers. Nonetheless, the cluster heads cannot compromisethe privacy of their respective members separately.

The random data mentioned in CPDA could be regardedas noise which assists to hide individual raw data itemsfrom being known by the cluster head. Due to the cooper-ation of nodes in the cluster, the negative effect of noise iseliminated, and the sum calculated is exactly the same asits original value.

4.1.2. Slice-mixed aggregation (SMART)SMART is another solution to protect individual data in

the SUM aggregation. The main idea is to slice original datainto pieces and recombine them randomly. There are threesteps in this scheme. In the first step (Slicing), each sensorrandomly selects J neighbor nodes within h hops to form aset S. Then it slices its data into J pieces, keeps one of thosepieces for itself and then sends the other ðJ � 1Þ encryptedpieces to ðJ � 1Þ sensors randomly selected from the set S.In the second step (Mixing), after a sensor receives piecesof data from some other sensors, it decrypts the data usingthe key shared with the data sender. Each sensor waits fora while to make sure that all of the round aggregation datahave already been sliced and received separately. In thethird step (Aggregation), the intermediate sensor aggre-gates all pieces of data and transmits it towards the basestation. Figs. 3–5 give a clear illustration of the basis ideaof SMART. In this example, the five sensors are denotedas s1 to s5, respectively. Let dii be the piece of data keptby si, where let dij mean the piece of data transmitted fromsi to sj and ri means the data aggregated by si.

4.1.3. Generic privacy-preservation solutions for approximateaggregation (GP^2S)

The basic idea of GP^2S is to generalize the values ofdata transmitted in a WSN, such that although individualdata content cannot be decrypted, the aggregator can stillobtain an accurate estimate of the histogram of data distri-bution, and thereby approximate the aggregates. In partic-ular, before transmission, each sensor node first uses aninteger range to replace the raw data. Then, with certain

Fig. 3. Slicing ðJ ¼ 2;h ¼ 1Þ.

granularity, the aggregator plots the histogram for datacollected, and then estimates aggregates such as MIX/MAX and Median according to the algorithms proposedin [32].

Similar to the generalization mechanism of GP^2S, abucketing scheme was proposed in [26] to first bucketizethe mixture of original data (into a few bins) before storingthem at storage nodes compromisable by an adversary. Be-fore transmitting the data to the storage nodes, the datasource first encrypts the data using the key shared withthe base station, and then attaches to the encrypted dataa tag indicating the bucket which the data falls into. Whenthe base station needs to process a range query over thedata stored in the storage nodes, instead of storage nodesdecrypting every data point and returning them, the basestation first obtains an approximate result based on thetags corresponding to the query range. Then, throughdecrypting at the base station, real data set in need couldbe derived accurately.

4.2. Private date query

Orthogonal to the protection of private data being col-lected in a WSN, the query issued to a WSN (to retrievethe collected data) is often also of critical privacy concerns.In the previous example of medical WSN, if an adversarylearns that queries have been frequently issued to the part

Page 7: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

Query Sequence: Q1, Q2, Q3, Q4

(0,0) (0,1) (0,2) (0,3) (0,4)

(1,0) (1,1) (1,2) (1,3) (1,4)

(2,0) (2,1) (2,2) (2,3) (2,4)

(3,0) (3,1) (3,2) (3,3) (3,4)

(4,0) (4,1) (4,2) (4,3) (4,4)

Fig. 7. Each query randomly maps m regions in the sequence (m = 4).

N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514 1507

of WSN that covers a patient’s house, the adversary cancertainly infer the health of the patient is getting moreattention probably due to his/her health problems. Privatedata query represents significant challenges to the designof a resource-constrained WSN. From the perspective ofenergy preservation, query processing should be restrictedto as small a targeted range as possible. Nonetheless, nar-rowing down the range also increases the chance for thequery to be inferred by adversaries.

To address this challenge, a target-region transforma-tion technique was proposed in [4] to fuzzy the targetregion of the query according to pre-defined transforma-tion functions. The main idea of the transformation func-tion is to map one region into m regions, such that thetarget region cannot be distinguished from the other unin-teresting regions. Multiple transformation functions suchas uniform, randomized and hybrid were introduced in [4].

In the Union Transform (UT), the interesting region ofeach query is transformed to the set of all regions that ap-pear in a query sequence. For instance, let ðQ1;Q2;Q3;Q4Þbe a query sequence with target regions (0,0), (2,3), (4,2)and (1,1), respectively. No matter which of those four que-ries is carried on, the four target regions are queried simul-taneously. Fig. 6 illustrates the mapping between querysequence and query regions under the UT scheme.

In the Randomized Transform (RT), each query ismapped into a randomized set of regions involving the tar-get region. Considering the same example as above, Fig. 7shows the association between queries and their query re-gions under this scheme. Regions with the same specificlines represent the query regions of the correspondingquery. Here each query maps four regions involving onlyone real target. Finally, the hybrid Transform (HT) is a com-bination of UT and RT.

In comparison with the data query problem in the do-main of databases, these schemes are similar to the k-ano-nymity algorithms (e.g., [27]) which hide the target’s realidentity using other ðk� 1Þ similar objects so that it isimpossible for the adversary to distinguish the target fromthe ðk� 1Þ other objects. But the cost of such an anonym-ity-based technique is high in a WSN, because query dis-semination and data collection in the uninteresting

Query Sequence: Q1, Q2, Q3, Q4

(0,0) (0,1) (0,2) (0,3) (0,4)

(1,0) (1,1) (1,2) (1,3) (1,4)

(2,0) (2,1) (2,2) (2,3) (2,4)

(3,0) (3,1) (3,2) (3,3) (3,4)

(4,0) (4,1) (4,2) (4,3) (4,4)

Fig. 6. Each query maps with all target.

regions consume a large amount of energy. The larger theregion saturated with query, the more effectively doessafeguarding privacy work, but at the cost of the more en-ergy. Consequently, a proper tradeoff between energy con-sumption and privacy protection is important.

Another query protection technique was proposed in[37]. Instead of enlarging the query regions as proposedin [4], this technique intends to disconnect the mappingbetween a user’s identity and the query issued by the user.With this technique, a user is able to access data in a WSNafter purchasing a certain amount of tokens from the WSNowner with blind signatures [5]. Such a token not only con-trols access to the sensed data, but also hides the user’sidentity and thereby guarantees query privacy.

5. Context-oriented privacy protection techniques

In this section, we review the existing techniques forcontext-oriented privacy protection that address the pro-tection of private information related to the characteristicsof traffic being saturated in a WSN. In particular, the exist-ing techniques in the literature have focused on protectingtwo types of contextual information: location of specialsensor nodes such as the data source and the base station,and timing of the generation of sensitive data.

5.1. Location privacy

A major challenge for context-oriented privacy protec-tion is that an adversary may be able to compromise pri-vate information even without the ability of decryptingthe transmitted data. In particular, since hop-by-hop trans-mission is required to address the limited transmissionrange of sensor nodes, an adversary may derive the loca-tions of base station and data source by observing and ana-lyzing the traffic patterns between different hops (to trackdown the base station and/or the data source). To addressthis challenge, the objective of context-oriented privacyprotection is to hide the real traffic pattern. Various traf-fic-pattern-obfuscation techniques have been proposed inthe literature.

Page 8: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

Base Station

Sensor

Random walk

Fig. 8. Random walk, h = 4.

Probabilistic flooding

Base Station

Sensor

Random walk

Fig. 9. Probabilistic flooding.

1508 N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514

In the following, we review the existing techniques forprotecting the locations of data source and base station.A common idea of such techniques is to introduce randomfactors to packet routing, in order to increase the uncer-tainty of traffic patterns observed by the adversary andto counteract the adversarial traffic analysis attacks. Notethat, after all, it is the routing path that exposes both typesof private information. Thus, protecting such private infor-mation should start from hiding routing informationagainst traffic analysis. With this common idea, the costassociated with privacy protection is the delay of deliver-ing data packets, the (possible) reduction of successfuldelivery probability, or even a huge amount of energyconsumption.

5.1.1. Location privacy of data sourceBefore reviewing the existing techniques for protecting

the location privacy of the data source, let us first brieflydiscuss the ‘‘Panda Hunter Game”, a classic formed prob-lem, based on which (data-source) location privacy inWSN has been extensively studied in the literature [18].In the Panda Hunter Game, a large number of panda-detect-ing sensors are deployed in a panda habitat. After sensorsdetect a panda, they will generate event messages andtransmit them toward the base station. Meantime, a pan-da-hunter also attempts to identify the location of the datasource to find the panda.

In this game, the objective of the defender is two-fold:On one hand, the defender needs to properly obtain thepanda’s moving information in order to enable biologicalresearch. On the other hand, it also intends to hide thelocation information from being known by the panda hunt-ers which have the ability to eavesdrop the wireless com-munication between different sensor nodes, but do nothave the key to decrypt the payload. The objective of thepanda hunter is to compromise the location of the datasource (and thereby the panda) by analyzing the trafficflow in the WSN.

Since a local adversary may only be able to monitor thetraffic within a small local area, generally, there are twoapproaches for adversary to start the attack, arbitrarilychoosing a place to stay in the network to monitor trafficor staying around base station with the prior knowledgeabout the location of the base station. In the following,we discuss four existing techniques, flooding [18], Randomwalk [18,28], dummy injection [18,24,29], and fake datasources [21], against the disclosure of the location of datasource in WSNs.

5.1.1.1. Baseline and probabilistic flooding mechanisms. Thebasic idea of baseline flooding is for each sensor to broad-cast the data it receives from one neighbor to all of its otherneighbors. The premise of this approach is that all sensorsparticipate in the data transmission so that it is unlikely foran attacker to track a path of transmission back to the datasource [18]. However, the effectiveness of baseline floodingon privacy protection critically depends on the number ofnodes on the transmission path between the data sourceand the base station. If the path is too short, after an adver-sary detects the arrival of the first packet at the base sta-tion, the adversary can infer that the routing path of this

packet is the shortest path between the data source andthe base station. Then, at a later time, the adversary cantrack back from the last forwarding sensor along the rout-ing path to the data source. Furthermore, the flooding con-sumes significant amount of energy in the whole networkand hence the lifetime of the WSN may be substantiallyreduced.

To address the side effect of baseline flooding, probabi-listic flooding is proposed in [18], in which not all sensorsare involved in forwarding data. Instead, each node for-wards/broadcasts a packet it receives with a pre-deter-mined probability. One can see that this scheme may notonly significantly save energy but also effectively limitthe adversary’s ability to deterministically track back tothe data source. Nonetheless, the reception of data by thebase station is not guaranteed due to the randomness in-volved in this approach.

5.1.1.2. Random walk mechanisms. As one of random walkapproaches, Phantom Routing is proposed in [18]. Figs. 8and 9 illustrate the basic idea of Phantom Routing. Data

Page 9: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514 1509

first performs a few steps of random walk from datasource, and then, by employing probabilistic floodingscheme, it is transmitted towards base station. The pre-mise of this approach is that even if an adversary is ableto track back along the routing path, it would only be ableto figure out the terminal node of the random walk insteadof the original data source.

Unfortunately, as indicated in [28], the pure randomwalk approach is not statistically secure for protectingthe location of the data source. In particular, it can beshown that a pure random walk tends to stay around thereal source [23]. To improve Phantom flooding, a two-way greedy random walk (GROW) scheme was proposedin [28]. Figs. 10 and 11 give a vivid illustration of runningGROW. In this scheme, a random path with a given numberof hops is initiated from base station first. Sensors locatedon the path will serve as the receptors. And then, each pack-et from a source is then randomly forwarded until itreaches one receptor. At that point, the packet is forwardedto the base station through the path pre-established bybase station.

Sensor

Data source

Base Station

Sink’ s path

Fig. 10. Sink builds the receptor path.

Data source’ s path

Data source

Sensor

Base Station

Sink’ s path

Fig. 11. Data randomly walk, intersect with a receptor and followreceptor path.

5.1.1.3. Dummy data mechanism. To further protect thelocation of the data source, fake data packets can be intro-duced to perturb the traffic patterns observed by theadversary. In particular, a simple scheme called Short-livedFake Source Routing was proposed in [18] for each sensor tosend out a fake packet with a pre-determined probability.Upon receiving a fake packet, a sensor node just discardsit. Although this approach perturbs the local traffic patternobserved by an adversary, it also has limitations on privacyprotection. Specifically, to maintain the energy-efficiencyof the WSN, the length of each path along which fake datais forwarded is only one hop, therefore, an adversary is ableto quickly identify fake paths and eliminate them fromconsideration.

Note that this technique is ineffective against globaladversaries that can monitor the transmission rate of eachsensor node and thereby identify those that are only send-ing out dummy data. To address this problem, one possibleapproach is to globally inject dummy data as well as keep-ing the transmission of real data the same as that of dum-my data. However, this approach may introduce significantdelay to the data transmission process. To cope with thisconcern, various techniques have been proposed in the lit-erature. For example, a special distribution of data trans-mission interval was suggested in [24], such that as longas every sensor node follows such a pre-determined distri-bution, the delay of real data can be minimized withoutallowing an adversary to identify the real traffic. Two otherschemes, Proxy-based Filtering Scheme (PFS) and Tree-based Filtering Scheme (TFS), were proposed in [29] to fil-ter partial dummy data without threatening the source pri-vacy. The whole network is divided into cells. The proxiesare responsible for relaying real data from cells aroundthem and filtering dummy data. After filtering all dummypackages from cells and buffering real data, the proxy willsend data package, including buffered real data and newgenerated dummy data, at the same rate of transmission.Thanks to the filtering procedure, a large number of dum-my data are removed so that energy consumption on over-head communication is greatly reduced.

5.1.1.4. Fake data sources mechanism. The basic idea of fakedata source is to choose one or more sensor node to simu-late the behavior of a real data source in order to confusethe adversaries [21]. The more the fake sources, the bettercan the identity of real source be protected. Such atechnique will also incur more power consumption inthe WSN. Furthermore, a significant challenge for thedesign of this technique is how to simulate the behaviorof data sources without being detected. This is an openproblem.

5.1.2. Location privacy of base stationBesides protecting the data-source location, another

important challenge for context-oriented privacy is theproper hiding of the base station’s location. In a WSN, abase station is not only in charge of collecting and analyz-ing data, but also used as the gateway connecting the WSNwith outside wireless or wired network. Consequently,destroying or isolating the base station may lead to themalfunction of the entire network. In the following, we

Page 10: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

1510 N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514

review the existing privacy-preserving techniques [7–9,16]for the location of the base station. In particular, we discusstechniques for defense against local and global adversaries.

5.1.2.1. Defense against local adversaries. The main chal-lenge for defending the location of base station against lo-cal adversaries is two-fold: First, the location informationof the base station is usually included in the data payloadbeing transmitted. Thus, one must provide the payloadconfidentiality to hide the base station’s location. Second,a local adversary may be able to infer the parent–childrelationship (i.e., which node within two communicatingones is nearer to the base station) from the time intervalbetween receiving and sending data at each sensor. Suchinformation can enable an adversary to efficiently tracedown to the base station along the data transmission route.For the first challenge, traditional security techniquescould work effectively. To handle the second challenge,the following four techniques have been proposed in theliterature [8,9,16].

� Changing data appearance by re-encryption: Similar toanonymous routing systems (e.g., [10]) in traditionalwired networks, a data re-encryption scheme was pro-posed in [9] such that a packet is re-encrypted hop-by-hop when transmitted through the routing path. Thisscheme eliminates the disclosure of base station loca-tion through changing the appearance of data.

� Routing with multiple parents: In [8,9], a multiple-parentscheme was introduced to balance the traffic loadbetween parents and children, such that an adversarycannot easily identify which one is nearer to the basestation. Also, this scheme allows each sensor to ran-domly select one of its multiple parents to transmit datatowards the base station. This also makes it more diffi-cult for the adversary to identify the location of the basestation by tracing the data transmission.

� Routing with random walk: In [16], a simple random walkdivides the neighbors of a sensor into two lists – closerand further lists – according to the hop count from thebase station. When sensor forwards data, it randomlyselects a next hop neighbor from one of those two lists.In the random walk schemes [8,9], each node selects itsparent as next hop with probability Pr or randomly for-wards data with probability (1 – PrÞ, which drasticallyreduces the probability of successful analysis by theadversary. In this sense, the scheme in [16] could beregarded as a special case of the strategy in [8,9] forPr ¼ 1=2. However, the cost of random walk is the delayof delivering the data package to the base station as wellas the extra energy consumption, since it is possible toselect a node as next hop which is further away fromsink. Or even, there is no guarantee to the arrival of dataat the base station.

� Decorrelating parent–child relationship by randomlyselecting sending time [9]: An adversary can figure outthe relation of parent–child through the short timeinterval between sending data at a sensor and receivingdata at its neighbor node. In order to de-correlate thisspecific relation, the period of time T is divided into mslots when there are one parent and (m – 1) children

for a sensor. The sensor assigns time slots to its childrenand each sensor will transmit its data at a random timewithin its slot.

The above privacy-preserving techniques aim to pre-vent an adversary from identifying the base station. Therehas also been work on how to improve the robustness of aWSN after the base station has been detected. In particular,a multiple-base station scheme was proposed in [7]. Evenwhen one of the base stations is destroyed, the otherscan continue to collect data and maintain the normal func-tions of the WSN. This scheme apparently enhances therobustness of the system, but offers no additional protec-tion for the base stations. Thus, it only delays, withouteliminating, the threat from a local adversary on identify-ing the location of the base stations.

5.1.2.2. Defense against global adversaries. The above-men-tioned techniques are ineffective against a global adversarywhich is capable of monitoring all traffic transmitted in theWSN. In particular, since an adversary is able to computethe transmission rate of each sensor node, it can easilyidentify the base station according to the specific topologyof WSNs as discussed in Section 1. To defend against globaladversaries, the injection of dummy traffic and/or dummysensor nodes is required. We review the existing tech-niques in the following:

� Hiding traffic pattern by controlling transmission rate. Theasymmetric data traffic flow in a WSN facilitates adver-sary to find the base station because a sensor close to thebase station needs to not only send its own data but alsorelay data from sensors further away from the base sta-tion, and therefore features a high transmission rate. Aprivacy-preserving technique was proposed in [9] tomaintain the same transmission rate among all sensorsby controlling delay of real data.

� Propagating dummy data. In [8,9], under the assumptionthat an adversary cannot distinguish real data from fakedata, a fake-packet injection scheme was proposed toprevent an adversary from identifying the real datatransmission pattern. In particular, if a sensor detectsthat its neighbor is transmitting a real packet to the basestation, the sensor generates a fake packet with proba-bility Pc , and forwards it to one of its neighbors. In[8,9], the authors introduced two fractal propagationmethods. One scheme is to control the probability Pc

according to the data forwarding rate. The higher therate is, the lower is Pc . The other scheme aims to simu-late a high transmission-rate area to mislead the adver-sary to believe it as the base station. The side effect offractal propagation is additional energy consumption,due to the fake data being transmitted over the network.Consequently, one has to make a proper tradeoffbetween energy consumption and privacy protection.

5.2. Temporal privacy problem

Let us now address the second type of context-basedprivacy, namely temporal privacy. Consider again the Pan-da Tracking application. When sensors detect the target,

Page 11: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514 1511

they generate event messages and forward them to thebase station. If an adversary can identify a specific timewhen an event message is generated, the adversary coulduse the information to track the target, or even predictthe target’s next move. Here the attacker’s confidence de-pends on the assumption that the delay time (tÞ of datapassing through each intermediate sensor along the rout-ing path is the same. When the adversary eavesdrops themessage near the base station, it can first obtain the arrivaltime z and then deduct from it the multiplication of aver-age delay t and the hop count h, according to the equationx ¼ z� h�t. The main idea of the counteraction in [17] is tolocally buffer the data for a random period of time at theintermediate sensors located along the routing path, suchthat an adversary cannot accurately estimate the originalgeneration time of the message.

Apparently, the random delay leads to increasing costsof buffer space at the intermediate nodes. How to make atradeoff between the protection of timing privacy and theefficiency of buffer space is of primary concern. For thispurpose, a rate-controlled adaptive delaying scheme isproposed in [17] to adjust the delay distribution as a func-tion of the incoming traffic rate and the available bufferspace.

6. Comparisons

In the previous sections, we followed a depth-first ap-proach to review the existing techniques for each categoryof problems respectively. In this section, we horizontallycompare all privacy-preserving techniques that have beenreviewed in this paper. Table 1 depicts a comprehensivecomparison of the performances of privacy-preservingtechniques in WSNs. We evaluate their performances interms of four metrics: privacy, accuracy, delay time, andpower consumption. Privacy refers to the degree of privacyprotection provided by the reviewed techniques. The accu-racy measure covers two perspectives: (i) the accuracy ofthe data obtained by the base station; and (ii) the availabil-ity of the (intended) data to the base station (i.e., whetherthe data can be delivered to the base station). The delaytime includes both the computation and communicationtime of data transmission at the intermediate sensors. Fi-nally, the power consumption measure focuses on the addi-tional messages required for transmission (i.e., additionalenergy consumed) in the WSN.

One can see from Table 1 that the benefit of privacy pro-tection usually comes at the cost of other metrics. Forexample, all data aggregation techniques shown in Table 1(i.e., CPDA, SMART and GP^2S) can provide perfect protec-tion for the privacy of data collected from individualsensors. Nonetheless, all of these techniques also have defi-ciencies on other measures. For CPDA and SMART, the costis mainly on the power consumption due to the exchangeof sliced data. On the other hand, for GP^2S, the cost comesfrom the accuracy of data being aggregated, as onlyapproximate values of aggregate functions, e.g., SUM,MIN and MEDIAN, can be obtained. One can see that a mainchallenge for the future design of privacy-preserving dataaggregation techniques is how to make a proper tradeoff

between three metrics: privacy, power consumption, andaccuracy.

Another observation from Table 1 concentrates on thecontext-oriented privacy-preserving techniques. In partic-ular, with the three techniques for location privacy protec-tion (i.e., random walk routing, flooding and dummyinjecting), the main cost of privacy protection is on powerconsumption. These techniques have to transmit a largeamount of additional traffic due to the flooding of real dataor the injection of dummy data. Such overhead leads to sig-nificant power consumption. Thus, a main challenge forthe future design of context-oriented privacy-preservingtechniques is to minimize the communication overheadwhich causes the costly power consumption.

7. Open problems

While some work has been proposed in privacy protec-tion in WSNs, there are still many open research issues inneed of future research. Here we list some important ones.

� In the scenario of Panda Hunting, the existing work [18]only addresses systems with a single mobile target(panda). An interesting open problem would be to tacklethe cases with multiple mobile targets.

� A significant challenge is to effectively protect the loca-tion of a mobile base station. Admittedly, by intuition,the mobility of base station is supposed to provide someprotection to its location privacy against external attack,however, it has to update its location to all or part ofnodes in the network so that they could forward datatowards it, which implicitly creates more chance to pinpoint it through internal attack. Accordingly, how toprotect the location privacy of a mobile base station aswell as guaranteeing the system functionality is chal-lenge of primary importance.

� In [9], re-encryption technique is proposed to changethe appearance of data in order to eliminate the possibil-ity of figuring out base station by tracking one datapackage in a dense network. But in a sparse WSN, it isstill highly likely for adversary to infer the moving ofone data package without the disturbing of dense datatraffic, in spite of disguising the data package by re-encrypting it. Therefore, how to make re-encryptiontechnique work effectively in a sparse network is stillchallenging.

� Considering the promising applications of WSNs to ourreal physical world, new types of contextual privacyinformation may come out, which are possibly com-bined with such potential research fields as social net-work. We are looking forward to the coming of themand their counteractions which gradually enhance theprivacy preservation in WSNs.

8. Final remarks

In this paper, we have presented a state-of-the-artsurvey on privacy-preserving techniques in WSNs. Wediscussed the existing privacy-preserving techniquesin two categories, which address data-oriented and

Page 12: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

Table 1An overview of solutions.

Privacy Accuracy Delay time Power consumption

CPDA Depend on the security ofshared key

Get the exact sum if no datais lost

Cummunicate data assistingwith calculation

Exchange a large amount ofcommunication data forcalculating the aggregationresult

SMART 100% protection for privacy Get the exact sum if no datais lost

Slice and recombine data Transmit and receive slicesof data

GP^2S 100% protection for privacy Approximately plot datahistogram

No extra delay time exceptaggregating

No extra energyconsumption on overheadand extra communication

K-Anonymity based Depend on the parameter k Get query results from oneinteresting cell and (k-l)other uninteresting cells

Query and collect data inuninteresting cells

Query and collect data inuninteresting cells

DP^2AC Depend on the k-bit randominteger generated at basestation

Get accurate query resultswith verified access right

Verify the token based onpublish-key mechanism,also token-reuse detection

Consume energy ondistributed token-reuse-detcction

Flooding to protect datasource

For baseline flooding, easilyfigure out the shortest pathbetween sink and datasourceFor probabilistic flooding,depend on the presetpribabilily

For baseline flooding, guar-antee data arrivalFor probabilistic flooding,not guarantee data arrival

For probabilistic flooding,not guatantee the datatransmission along theshortest path

Flood data over the wholenetwork

Random walk to protectdata source

Distract attention to the realdata source

For Phantom, 100% dataarrivalFor GROW, data arrivaldepends on intersection oftwo random walk

For Phantom, depend on thehops of the walkFor GROW, depend on therandomness of walk

Consume energy on randomwalking

Dummy injection to protectdata source

Disturb the traffic pattern ofthe whole network

No influence on data arrivaland accuracy

Delay real data to keep itstransmission rate follow thesame distribution as that offake data

Inject fake data

Fake data sou ice Distract attention from thereal source

No influence on data arrivaland accuracy

No extra delay Generate fake data from fakesources

Hop by hop re-encryption Re-encrypt data link to link No influence on data arrivaland accuracy

Spend time on encryptingand decrypting data at eachintermediate node

No extra energyconsumption on overheadand extra communication

Multiple-parent routing Route data from multiplerouting paths

No influence on data arrivaland accuracy

No extra delay No extra energyconsumption on overheadand extra communication

Random walk routing toprotect base station

Depend on the probability ofchoosing the parent as nexthop

Not guarantee data arrivalDepend on the probabilisticselection

Depend on the selection ofnext hop

Random walking

Random sending time Make parent–childrelationship ambiguous

No influence on data arrivaland accuracy

Randomly selecttransmission time in eachslot

Less energy consumption onsynchronization for slotcontrol

Transmission rate control Hide global traffic pattern bymaking transmission ratethe same

No influence on data arrivaland accuracy

Buffer data to keeptransmission rate the sameove the whole network

No extra energyconsumption on overheadand extra communication

Dummy propagation toprotect base station

Injecting fake data No influence on data arrivaland accuracy

No extra delay Inject fake data

Random delay for temporalprivacy

Destroy the deduction ofapproximate generationtime of data

No influence on data arrivaland accuracy

Randomly buffer data atintermediate sensors

No extra energyconsumption on overheadand extra communication

1512 N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514

context-oriented privacy concerns. The data-oriented tech-niques address the protection of private data transmittedin the WSN and that of sensitive queries executed. Con-text-oriented techniques, on the other hand, protect thelocation of the data source and the base station as wellas the timing of the generation and transmission of sensi-tive data. Also, we attempted to compare the existing tech-niques in terms of such metrics as privacy, accuracy, delayand power consumption. Through comprehensive analysisof privacy issues in WSNs, ranging from problem defini-tions to the existing techniques, we depicted a complete

picture of the state-of-the-art on privacy preservation inWSNs. Furthermore, based on the existing work, we listeda number of open issues which may intrigue the interest ofresearchers for future work. It is our hope that this papersheds lights on a fruitful direction of future research on pri-vacy preservation for WSNs.

Acknowledgement

The authors gratefully acknowledge the insightful com-ments of the anonymous reviewers which helped improve

Page 13: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

N. Li et al. / Ad Hoc Networks 7 (2009) 1501–1514 1513

the quality of the paper significantly. This work is partiallysupported by the AFOSR Grant A9550-08-1-0260, NSFGrants IIS-0326505, CNS-0721951, CNS-0852673, CCF-0852674, and IIS-0845644. The work of S.K. Das is alsosupported by (while serving at) the National Science Foun-dation. Any opinion, findings, and conclusions or recom-mendations expressed in this material are those of theauthors and do not necessarily reflect the views of theNational Science Foundation.

References

[1] R. Agrawal, A. Evfimievski, R. Srikant, Information sharingacross private databases, in: Proceedings of the 2003 ACMSIGMOD International Conference on Management of Data, 2003,pp. 86–97.

[2] R. Agrawal, R. Srikan, Privacy-preserving data mining, in:Proceedings of the 2000 ACM SIGMOD on Management of Data,Dallas, TX USA, May 15–18, 2000, pp. 439–450.

[3] I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, E. Cayirci, Wirelesssensor networks: a Survey, Computer Networks 38 (4) (2002) 393–422.

[4] B. Carbunar, Y. Yu, L. Shi, M. Pearce, V. Vasudevan, Query privacy inwireless sensor networks, in: 4th Annual IEEE CommunicationsSociety Conference on Sensor, Mesh and Ad Hoc Communicationsand Networks, 2007 (SECON’07), 18–21 June 2007, pp. 203–212.

[5] D. Chaum, Blind signatures for untraceable payments, in: Advancesin Cryptology – Crypto’82, Springer-Verlag (1983), 1982, pp. 199–203.

[6] E.D. Cristofaro, J. Bohli, D.Westho, FAIR: fuzzy-based aggregationproviding in-network Resilience for real-time wireless sensornetworks, to appear, in: Proceedings of the Second ACMConference Wireless Network Security (Wisec), 2009.

[7] J. Deng, R. Han, S. Mishra, Intrusion tolerance and anti-traffic analysisstrategies for wireless sensor networks, in: IEEE InternationalConference on Dependable Systems and Networks (DSN 2004),Florence, Italy, 28 June–1 July 2004, pp. 637-646.

[8] J. Deng, R. Han, S. Mishra, Countermeasures against traffic analysisattacks in wireless sensor networks, in: First InternationalConference on Security and Privacy for Emerging Areas inCommunications Networks (SecureComm) 2005, September 2005,pp. 113–126.

[9] J. Deng, R. Han, S. Mishra, Decorrelating wireless sensor networktraffic to inhibit traffic analysis attacks, Pervasive and MobileComputing Elsevier 2 (2) (2006) 159–186.

[10] R. Dingledine, N. Mathewson, P. Syverson, Tor: the second-generation Onion router, in: Proceedings of the 13th USENIXSecurity Symposium, 2004.

[11] M.J. Freedman, K. Nissim, B. Pinkas, Efficient private matching andset intersection, in: Advances in Cryptography, Proceedings ofEurocrypt 2004, 2004, pp. 1–19.

[12] G. Ghinita, P. Kalnis, A. Khoshgozaran, C. Shahabi, Tan, Kian-Lee,Private queries in location based services: anonymizers are notnecessary, in: Proceedings of the 2008 ACM SIGMOD internationalconference on Management of data, Vancouver, Canada, 2008, pp.121–132.

[13] M. Gruteser, D. Grunwald, Anonymous usage of location-basedservices through spacial and temporal cloaking, in: Proceedings ofthe International Conference on Mobile Systems, Applications, andServices (MobiSys), 2003, pp. 31–42.

[14] W.B. He, X. Liu, H. Nguyen, K. Nahrstedt, T. Abdelzaher, PDA:privacy-preserving data aggregation in wireless sensor networks, in:Proceedings of the 26th IEEE International Conference on ComputerCommunications (INFOCOM 2007), May 2007, pp. 2045–2053.

[15] Z. Huang, W. Du, B. Chen, Deriving private information fromrandomized data, in: Proceedings of 2005 ACM SIGMODInternational Conference on Management of Data (ACM SIGMOD2005), 2005, pp. 37–48.

[16] Y. Jian, S.G. Chen, Z. Zhang, L. Zhang, Protecting receiver-locationprivacy in wireless sensor networks, in: Proceedings of the 26th IEEEInternational Conference on Computer Communications (INFOCOM2007), May 2007, pp. 1955–1963.

[17] P. Kamat, W.Y. Xu, W. Trappe, Y.Y. Zhang, Temporal privacy inwireless sensor networks, in: Proceedings of the 27th International

Conference on Distributed Computing Systems (ICDCS 2007), June2007, pp. 23–23.

[18] P. Kamat, Y.Y. Zhang, W. Trappe, C. Ozturk, Enhancing source-location privacy in sensor network routing, in: Proceedings of the25th IEEE International Conference on Distributed ComputingSystems (ICDCS 2005), June 2005, pp. 599–608.

[19] K. LeFevre, D.J. DeWitt, R. Ramakrishnan, Workload-awareanonymization, in: Proceedings of the 12th ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining, 2006, pp. 277–286.

[20] A. Machanavajjhala, D. Kifer, J. Gehrke, M. Venkitasubramaniam, l-Diversity: privacy beyond k-anonymity, in: ACM Transactions onKnowledge Discovery from Data, 1 (2007).

[21] K. Mehta, D.G. Liu, M. Wright, Location privacy in sensor networksagainst a global eavesdropper, in: Proceedings of the IEEEInternational Conference on Network Protocols (ICNP 2007),October 2007, pp. 314–323.

[22] M.G. Reed, P.F. Syverson, D.M. Goldschlag, Anonymous connectionsand onion routing, IEEE Journal on Selected Areas inCommunications, 16 (4) (1998) 482–494.

[23] S.M. Ross, Stochastic Processes, second ed., John Wiley & sons, Inc.,1996.

[24] M. Shao, Y. Yang, S. Zhu, G. Cao, Towards statistically strong sourceanonymity for sensor networks, in: Proceedings of the 26th IEEEInternational Conference on Computer Communications (INFOCOM2007), May 2007, pp. 1298–1306.

[25] M. Shao, S. Zhu, W. Zhang, G. Cao, pDCS: security and privacysupport for data-centric sensor networks, in: Proceedings of the 26thIEEE International Conference on Computer Communications(INFOCOM 2007), May 2007, pp. 1298–1306.

[26] B. Sheng, Qun Li, Verifiable privacy-preserving range query in two-tiered sensor networks in: Proceedings of the 27th IEEEInternational Conference on Computer Communications (INFOCOM2008), 13–18 August 2008, pp. 46–50.

[27] L. Sweeney, K-anonymity: a model for protecting privacy,International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 2 (2) (2002) 557–570.

[28] Y. Xi, L. Schwiebert, W.S. Shi, Preserving source location privacy inmonitoring-based wireless sensor networks, in: Proceedings of the20th International Parallel and Distributed Processing Symposium(IPDPS 2006), April 2006.

[29] Y. Yang, M. Shao, S. Zhu, B. Urgaonkar, G. Cao, Towards event sourceunobservability with minimum network traffic in sensor networks,in: Proceedings of the first ACM Conference on Wireless NetworkSecurity (WiSec), 2008, pp.77–88.

[30] Y. Yang, X. Wang, S. Zhu, G. Cao, SDAP: a secure hop-by-hop dataaggregation protocol for sensor networks, in: ACM Transactions onInformation and System Security (TISSEC 2008), 11 (4) (2008).

[31] W. Zhang, Y. Liu, S.K. Das, P. De, Secure data aggregation in wirelesssensor networks: a watermark based authentication supportiveapproach, Elsevier Pervasive and Mobile Computing 4 (5) (2008)658–680.

[32] W.S. Zhang, C. Wang, T.M. Feng, GP^2S: generic privacy-preservationsolutions for approximate aggregation of sensor data, concisecontribution, in: Proceedings of the Sixth Annual IEEEInternational Conference on Pervasive Computing andCommunications (PerCom), Hong Kong, P.R.C., March 17–21, 2008,pp.179–184.

[33] N. Zhang, S. Wang, W. Zhao, A new scheme on privacy preservingassociation rule mining, in: Proceedings of the 8th Europeanconference on Principles and practice of knowledge discovery indatabases, 2004, pp. 484–495.

[34] N. Zhang, S. Wang, W. Zhao, A new scheme on privacy-preservingdata classification, in: Proceedings of the 11th ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining, 2005, pp. 374–383.

[35] N. Zhang, W. Zhao, Distributed Privacy Preserving InformationSharing, in: Proceedings of the 31st International Conference onVery Large Data Bases, 2005, pp. 889–900.

[36] N. Zhang, W. Zhao, Privacy-Preserving Data Mining Systems, IEEEComputer 40 (2007) 52–58.

[37] R. Zhang, Y. Zhang, K. Ren, DP^2AC: distributed privacy-preservingaccess control in sensor networks, to appear in: Proceedings of the28th IEEE International Conference on Computer Communications(INFOCOM 2009), pp.1298–1306.

[38] N. Zhang, W. Zhao, Privacy protection against malicious adversariesin distributed information sharing systems, IEEE Transactions onKnowledge and Data Engineering, 20 (8) (2008).

Page 14: Ad Hoc Networksprojects/295d/papers/sdarticle.pdfthe privacy of the data being collected and guarantee the privacy of data query initiated by users of the network. For context-based

tworks 7 (2009) 1501–1514

Na Li received her B.S. degree in ComputerScience & Technology from Nankai University,Tianjin, China, in 2005. From 2005–2007, Shewas a research assistant in Computer NetworkInformation Center, Chinese Academy of Sci-ences, Beijing, China. Now she is a PhD stu-dent at Computer Science and EngineeringDepartment in the University of Texas atArlington. Her interests involve routing andscheduling techniques, QoS and managingstrategies, security and privacy in WirelessSensor Network and Ad Hoc Network.

1514 N. Li et al. / Ad Hoc Ne

Nan Zhang is an Assistant Professor of Com-puter Science at the George Washington Uni-versity. He received the B.S. degree fromPeking University in 2001 and the Ph.D.degree from Texas A&M University in 2006,both in computer science. His currentresearch interests include security and pri-vacy issues in databases, data mining, andcomputer networks, in particular privacy andanonymity in data collection, publishing, andsharing, privacy-preserving data mining, andwireless network security and privacy.

Sajal K. Das is a University DistinguishedScholar Professor of Computer Science andEngineering and the Founding Director of theCenter for Research in Wireless Mobility andNetworking (CReWMaN) at the University ofTexas at Arlington (UTA). He is currently aProgram Director in the Computer and Net-work Systems division at the US NationalScience Foundation. He is also E.T.S. WaltonProfessor of Science Foundation of Ireland; aVisiting Professor at the Indian Institute ofTechnology (IIT) at Kanpur and IIT Guwahati;

an Honorary Professor of Fudan University in Shanghai and International

Advisory Professor of Beijing Jiaotong University, China; and a VisitingScientist at the Institute of Infocomm Research (I2R), Singapore. Hiscurrent research interests include wireless sensor networks, mobile and

pervasive computing, design and modeling of smart environments, per-vasive security, smart health care, resource and mobility management inwireless networks, mobile grid computing, biological networking, appliedgraph theory and game theory. He has published over 400 papers andover 35 invited book chapters in these areas. He holds five US patents inwireless networks and mobile Internet, and coauthored the books”SmartEnvironments: Technology, Protocols, and Applications” (Wiley, 2005)and”Mobile Agents in Distributed Computing and Networking (Wiley,2008). Dr. Das is a recipient of several Best Paper Awards in such con-ferences as EWSN’08, IEEE PerCom’06, and ACM MobiCom’99. He received2009 IEEE Technical Achievement Award for pioneering contributions towireless mobile and sensor networks. He is also a recipient of the IEEEEngineer of the Year Award (2007), UTA Academy of DistinguishedScholars Award (2006), University Award for Distinguished Record ofResearch (2005), College of Engineering Research Excellence Award(2003), and Outstanding Faculty Research Award in Computer Science(2001 and 2003). He is frequently invited as keynote speaker at inter-national conferences and symposia. Dr. Das serves as the Founding Editor-in-Chief of Pervasive and Mobile Computing (PMC) journal, and AssociateEditor of IEEE Transactions on Mobile Computing, ACM/Springer WirelessNetworks, IEEE Transactions on Parallel and Distributed Systems, andJournal of Peer-to-Peer Networking. He is the founder of IEEE WoWMoMand co-founder of IEEE PerCom conference. He has served as General orTechnical Program Chair as well as TPC member of numerous IEEE andACM conferences.

Bhavani Thuraisingham joined The Univer-sity of Texas at Dallas (UTD) in October 2004as a Professor of Computer Science andDirector of the Cyber Security Research Centerin the Erik Jonsson School of Engineering andComputer Science. She is an elected Fellow ofthree professional organizations: the IEEE(Institute for Electrical and Electronics Engi-neers), the AAAS (American Association forthe Advancement of Science) and the BCS(British Computer Society) for her work indata security. She received the IEEE Computer

Society’s prestigious 1997 Technical Achievement Award for ‘‘outstanding

and innovative contributions to secure data management.” She wasquoted by Silicon India Magazine as one of the top seven technologyinnovators of South Asian Origin in the USA in 2002. Dr. Thuraisinghamhas published over 80 journal papers and is the author of 8 books. Herresearch focussed on applying information management technologies fordata security and national security. She was educated in the UnitedKingdom both at the University of Bristol and the University of Wales.