Sensor Data


Jun-Ki Min, Korea University of Technology and Education

Wireless Sensor Network
◦ Limited energy supply
◦ Limited computing power

Sensor Data Management
◦ Naive Approach
  Each sensor sends its data to the base station
  Data processing is done at the base station
◦ Problem
  Each sensor quickly wastes its energy by sending its readings continuously
◦ Minimize Energy Consumption
◦ In-Network Processing

Sensor Data Management
◦ Data Aggregation
◦ Data Gathering
◦ Query Processing

Major Research Topics

TAG (Tiny Aggregation) [1]
◦ In-Network Aggregation
◦ Tree Routing Based
◦ Simple Approach
◦ Cost for Median is very high

Aggregation(1/5)

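[Figure: aggregation tree. The root reads 2, its children read 4 and 3, and the leaves read 5, 3 and 2, 2. The children forward Sum(4,5,3) = 12 and Sum(3,2,2) = 7, and the root computes Sum(2,12,7) = 21.]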
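The tree-based SUM in the figure can be sketched as follows. This is a minimal illustration, not the TAG implementation: the `Node` class and `aggregate_sum` function are hypothetical names, and radio communication is replaced by a recursive call.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A sensor in the routing tree (hypothetical structure for illustration)."""
    reading: int
    children: List["Node"] = field(default_factory=list)


def aggregate_sum(node: Node) -> int:
    """Partial sum a node forwards to its parent:
    its own reading plus the partial sums received from its children."""
    return node.reading + sum(aggregate_sum(child) for child in node.children)


# Tree from the slide: root 2, children 4 and 3, leaves (5, 3) and (2, 2).
left = Node(4, [Node(5), Node(3)])
right = Node(3, [Node(2), Node(2)])
root = Node(2, [left, right])

partial_left = aggregate_sum(left)    # Sum(4, 5, 3)
partial_right = aggregate_sum(right)  # Sum(3, 2, 2)
total = aggregate_sum(root)           # Sum(2, 12, 7)
```

Each node transmits a single partial sum, so the number of messages equals the number of tree edges, regardless of how many readings lie below a node.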

Q-Digest [2]
◦ Capture the distribution of sensor data approximately
◦ Digest property
  count(v) <= floor(n/k) (except leaf nodes)
  count(v) + count(vp) + count(vs) >= floor(n/k) (except the root node)
  where v is a node, vp is the parent of v, vs is the sibling of v,
  n is the number of data values, k is the compression parameter, and σ is the range of the data
◦ Size of q-Digest <= 3k
Each sensor builds a q-Digest
Parent node
◦ Merges the q-Digests of its children
◦ Compression

Aggregation(2/5)

[Figure: q-Digest compression]

Quantile Query
◦ Find the value whose rank among n values is qn, where q ∈ (0,1)
  If q = 0.5, find the median
  Digest: <[1,8],1> <[5,6],2> <[7,8],2> <[3,3],4> <[4,4],6>
  Sort by increasing right end point: <[3,3],4> <[4,4],6> <[5,6],2> <[7,8],2> <[1,8],1>
  The cumulative count first exceeds 0.5 * 15 = 7.5 at <[4,4],6> (4 + 6 = 10)
  Thus, 4 is the estimated median
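The quantile lookup in this example can be sketched as below. The tuple layout `(low, high, count)` and the function name `quantile` are assumptions made for illustration. Breaking ties on the right end point by putting the smaller range first is also an assumption, chosen so that the sort order matches the slide.

```python
from typing import List, Tuple

Bucket = Tuple[int, int, int]  # (low, high, count), i.e. <[low, high], count>


def quantile(digest: List[Bucket], q: float) -> int:
    """Estimate the value whose rank is q*n from a q-digest."""
    if not digest:
        raise ValueError("empty digest")
    n = sum(count for _, _, count in digest)
    # Sort by increasing right end point; on ties, smaller ranges come first.
    ordered = sorted(digest, key=lambda b: (b[1], b[1] - b[0]))
    running = 0
    for low, high, count in ordered:
        running += count
        if running > q * n:
            return high  # right end point of the bucket that crosses rank q*n
    return ordered[-1][1]


digest = [(1, 8, 1), (5, 6, 2), (7, 8, 2), (3, 3, 4), (4, 4, 6)]
median = quantile(digest, 0.5)
```

With the digest above, n = 15 and the running count goes 4, 10, so the scan stops at the bucket [4,4].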

Aggregation(3/5)

Multiple Aggregation
◦ Equivalence Class Reduction [3]
  Q = {q1 = {1+2+3}, q2 = {1+2}, q3 = {3}}
  Equivalence class = set of sensors that support the same query set
  EC1 = {1,2}, EC2 = {3}
  Bit vectors: EC1 = [1,1,0]T, EC2 = [1,0,1]T

        EC1  EC2
  q1     1    1    = v1 + v2
  q2     1    0    = v1
  q3     0    1    = v2

  Basis: v1 = {1+2} = [1 0], v2 = {3} = [0 1]
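Grouping sensors into equivalence classes and answering every query from per-class partial sums can be sketched as follows. The function name and the sensor readings are hypothetical. The sketch covers only the grouping step, not the full reduction algorithm of [3].

```python
from collections import defaultdict

# Query set from the slide: each query sums the readings of the listed sensors.
queries = {"q1": {1, 2, 3}, "q2": {1, 2}, "q3": {3}}


def equivalence_classes(queries):
    """Group sensors by bit vector: one bit per query, 1 if the sensor is used."""
    names = sorted(queries)
    sensors = set().union(*queries.values())
    groups = defaultdict(set)
    for sensor in sorted(sensors):
        bits = tuple(int(sensor in queries[name]) for name in names)
        groups[bits].add(sensor)
    return dict(groups)


classes = equivalence_classes(queries)

# Hypothetical readings; one partial sum per equivalence class is enough.
readings = {1: 10, 2: 20, 3: 5}
partial = {bits: sum(readings[s] for s in members)
           for bits, members in classes.items()}

# A query result is the sum of the partial sums of the classes it covers.
names = sorted(queries)
answers = {name: sum(value for bits, value in partial.items() if bits[i])
           for i, name in enumerate(names)}
```

Only two partial sums (one per class) travel through the network, yet all three queries can be answered from them.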

Aggregation(4/5)

Multiple Aggregation
◦ Segmentation Based Method [4]
  Dynamic routing, not tree routing
  Segment == equivalence class
  A sensor sends its data to a node in the same segment whenever possible
  STG vs STS
  Node 6 can send data to node 5 or node 7; in this case, node 6 sends data to node 7
  STG: node 4 sends data for q2 (=4, 7, 8) and q1+q2 (=4,5)
    node 1 receives 3 messages (1 message from node 2, 2 messages from node 4)
  STS: multiple routing
    node 4 sends data for q2 (=4,5,6,7) to node 1 and q1 (=4,5) to node 2
    node 1 receives 2 messages

Aggregation(5/5)

In-network aggregation provides a great opportunity for reducing the communication overhead

Since a single aggregated value represents the entire sensing field, it may be insufficient for analyzing the correlation among sub-regions of the sensor field

Sensor Data Gathering
◦ Exact data gathering wastes energy
◦ Solution: reduce the number of transmissions

Gathering(1/8)

Basic Approach
◦ Temporal Suppression
  A node does not transmit a value if it has not changed since it was last reported
◦ Spatial Suppression
  A node suppresses its value if it is identical to those of its neighboring nodes
Approximate Gathering
◦ Sensor readings have errors intrinsically
◦ Sensor readings have strong correlations

Gathering(2/8)

Approximate Data Gathering
◦ Each sensor has a tool to estimate future values
◦ The base station also keeps these tools
  If a sensor does not send data, the estimation is correct
  If a sensor sends data, the estimation is incorrect
    Update the tools of the sensor and the base station
◦ Model Based
  BBQ [5], KEN [6], PAQ [7]
◦ Filter Based
  Dual Kalman [9]
◦ Compression Based
  Wavelet, DFT, SBR [8], etc.
  A collection of readings of a sensor is transmitted periodically

Gathering(3/8)

Model Based Approach
◦ Linear Regression
  $X_{t+1} = aX_t + b$
◦ BBQ, KEN
  Multivariate Gaussian model
  Probability density function: $P(X_1, X_2, X_3, \ldots, X_n)$
    $X_i$: random variable for sensor readings
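For the linear regression model, the constants a and b can be estimated from past readings by ordinary least squares over consecutive pairs. The sketch below is one plain way to do that and is not taken from any of the cited systems; the function name and sample readings are hypothetical.

```python
from typing import List, Tuple


def fit_linear_predictor(xs: List[float]) -> Tuple[float, float]:
    """Least-squares fit of X_{t+1} = a * X_t + b over consecutive readings."""
    prev, nxt = xs[:-1], xs[1:]
    n = len(prev)
    if n < 2:
        raise ValueError("need at least three readings")
    mean_prev = sum(prev) / n
    mean_next = sum(nxt) / n
    var = sum((p - mean_prev) ** 2 for p in prev)
    if var == 0:
        raise ValueError("readings are constant; slope is undefined")
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    a = cov / var
    b = mean_next - a * mean_prev
    return a, b


# Hypothetical readings that follow X_{t+1} = 2 * X_t + 1 exactly.
history = [1.0, 3.0, 7.0, 15.0, 31.0]
a, b = fit_linear_predictor(history)
next_value = a * history[-1] + b
```

If the sensor and the base station run the same fit on the same reported readings, they arrive at the same prediction without exchanging a and b.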

Gathering(4/8)

Approximate Gathering
◦ PAQ
  Linear regression and Gaussian models require much time and much data to construct a correct model
  AutoRegression(3) model
    A data value: $V_t = m_t + X(t)$, i.e., $V_t - m_t = X(t)$
    $X(t) = aX(t-1) + bX(t-2) + cX(t-3) + b(w)N(0,1)$
    $m_t$ is the mean of V up to time t; a, b, c are real constants; $b(w)N(0,1)$ is white noise with standard deviation $b(w)$
  Predictor
    $P(t) = m_t + a(v_{t-1} - m_{t-1}) + b(v_{t-2} - m_{t-2}) + c(v_{t-3} - m_{t-3})$
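The predictor P(t) translates directly into code. The sketch below assumes the means and the coefficients a, b, c are already known; how PAQ learns them is not shown here. The function name and the numbers are hypothetical.

```python
from typing import Sequence


def predict_ar3(m_t: float,
                v_prev: Sequence[float],
                m_prev: Sequence[float],
                coeffs: Sequence[float]) -> float:
    """P(t) = m_t + a(v_{t-1} - m_{t-1}) + b(v_{t-2} - m_{t-2}) + c(v_{t-3} - m_{t-3}).

    v_prev = (v_{t-1}, v_{t-2}, v_{t-3}), m_prev = (m_{t-1}, m_{t-2}, m_{t-3}),
    coeffs = (a, b, c).
    """
    if not (len(v_prev) == len(m_prev) == len(coeffs) == 3):
        raise ValueError("an AR(3) predictor needs exactly three past values")
    return m_t + sum(k * (v - m) for k, v, m in zip(coeffs, v_prev, m_prev))


# Hypothetical values: mean 20, last three readings 21, 22, 20.
prediction = predict_ar3(
    m_t=20.0,
    v_prev=(21.0, 22.0, 20.0),
    m_prev=(20.0, 20.0, 20.0),
    coeffs=(0.5, 0.25, 0.125),
)
```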

Gathering(5/8)

PAQ
◦ Lemma) Let $e = v \cdot b(w)$, where $v > 1$. Then the actual value at time t is contained in $[P(t)-e, P(t)+e]$ with probability at least $1 - 1/v^2$.
  Proof) By the Chebyshev inequality,
  $P(|v_t - P(t)| > e) \le b(w)^2/e^2 = b(w)^2/(v^2 b(w)^2) = 1/v^2$
◦ Generally v is 6 or 7
◦ Using the above Lemma, PAQ decides when to update its model.
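The quantities in the Lemma are simple to compute. The sketch below uses hypothetical helper names, with `noise_std` standing in for b(w). It shows only the bound itself, not PAQ's model-update procedure.

```python
def error_bound(v: float, noise_std: float) -> float:
    """e = v * b(w), where noise_std plays the role of b(w)."""
    if v <= 1:
        raise ValueError("the Lemma assumes v > 1")
    return v * noise_std


def coverage_lower_bound(v: float) -> float:
    """Chebyshev guarantee: P(|v_t - P(t)| <= e) >= 1 - 1/v^2."""
    return 1.0 - 1.0 / (v * v)


def within_bound(actual: float, predicted: float, e: float) -> bool:
    """True if the actual reading lies in [P(t) - e, P(t) + e]."""
    return abs(actual - predicted) <= e


e = error_bound(6, 0.5)          # v = 6, b(w) = 0.5
bound = coverage_lower_bound(6)  # at least 35/36 of readings fall inside
```

With v = 6 the guarantee is already above 97%, which is consistent with the slide's remark that v is generally 6 or 7.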

Gathering(6/8)

[Figure: prediction error axis marked at -e, -d, d and e, with regions labeled Well fit, Partial fit and Outlier]

Filter Based
◦ Model-based approaches require much data to construct models
◦ Each node has a filter based on the last reported sensor reading
  If |Vnew - Vold| > e, the reading is sent to the base station
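A minimal sketch of such a filter follows. The class name is hypothetical, and sending the very first reading unconditionally is an assumption the slide does not state.

```python
class SuppressionFilter:
    """Sends a reading only if it differs from the last reported one by more than e."""

    def __init__(self, e: float):
        self.e = e
        self.last_reported = None  # nothing reported yet

    def should_send(self, reading: float) -> bool:
        # Assumption: the first reading is always sent so the base station
        # has a starting value.
        if self.last_reported is None or abs(reading - self.last_reported) > self.e:
            self.last_reported = reading
            return True
        return False


node_filter = SuppressionFilter(e=0.5)
readings = [20.0, 20.25, 20.75, 20.5, 21.5]
sent = [r for r in readings if node_filter.should_send(r)]
```

Setting e = 0 reduces the filter to exact temporal suppression: a reading is sent only when it differs from the last reported value.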

Gathering(7/8)

Dual Kalman Filter
◦ Base station has as many filters as the number of sensors
◦ Discrete Kalman Filter
◦ Ex) moving object
  State model: $x_t = x_{t-1} + v_{t-1} \cdot dt$, $v_t = v_{t-1}$
  Measurement model: z (observed position)
    $z_t = [1 \; 0] \, [x_t \; v_t]^T + n_t$
    where $n_t$ is white Gaussian measurement noise

Gathering(8/8)

[Figure: Kalman filter cycle]
Initial state
Prediction step
◦ Project current state
◦ Estimate next state
Correction step
◦ Compute Kalman gain
◦ Update system state
◦ Update error covariance
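The prediction and correction steps for the moving-object example can be sketched in plain Python as below. The noise parameters `q` and `r`, the initial state, and the rule "send only when the prediction misses by more than e" are assumptions chosen for illustration. This is a sketch of the dual-filter idea, not the algorithm of [9].

```python
def predict(x, v, P, dt, q):
    """Prediction step: x_t = x_{t-1} + v_{t-1}*dt, v_t = v_{t-1}.
    Covariance: P' = F P F^T + Q with F = [[1, dt], [0, 1]], Q = q * I."""
    (p00, p01), (p10, p11) = P
    n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
    n01 = p01 + dt * p11
    n10 = p10 + dt * p11
    n11 = p11 + q
    return x + v * dt, v, ((n00, n01), (n10, n11))


def correct(x_pred, v_pred, P, z, r):
    """Correction step with measurement z = [1 0][x v]^T + noise (variance r)."""
    (p00, p01), (p10, p11) = P
    s = p00 + r                      # innovation variance
    k0, k1 = p00 / s, p10 / s        # Kalman gain
    residual = z - x_pred
    x = x_pred + k0 * residual       # update system state
    v = v_pred + k1 * residual
    new_P = (((1 - k0) * p00, (1 - k0) * p01),   # P = (I - K H) P'
             (p10 - k1 * p00, p11 - k1 * p01))
    return x, v, new_P


def run(measurements, e, dt=1.0, q=1e-4, r=0.1):
    """Sensor and base station run the same filter; the sensor transmits a
    measurement only when the shared prediction is off by more than e."""
    x, v = 0.0, 0.0
    P = ((1.0, 0.0), (0.0, 1.0))
    sent = 0
    for z in measurements:
        x, v, P = predict(x, v, P, dt, q)
        if abs(z - x) > e:
            sent += 1
            x, v, P = correct(x, v, P, z, r)
    return sent, x, v


# Hypothetical object moving at constant speed 1: positions 1, 2, ..., 20.
zs = [float(t) for t in range(1, 21)]
sent_all, _, v_all = run(zs, e=0.0)
sent_some, x_some, _ = run(zs, e=0.25)
```

When a reading is suppressed, both sides simply keep the predicted state, so their filters stay identical without any message.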

Join Operation
◦ An important operator
◦ It allows measurements taken at different nodes to be related.

Query Processing(1/6)

[Figure: join between node groups L and R]

General Join Plans [12,13]
◦ Naive
◦ Sequential
◦ Centroid
[Figure: one diagram per plan, each showing the source regions L and R]

Optimal Join Location [14]
◦ Weighted Fermat Problem
  Find the point that minimizes the weighted sum of the distances from the point to the vertices of a triangle.
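One standard numerical method for the weighted Fermat problem is Weiszfeld's iteration, sketched below. It is shown only as an illustration and is not claimed to be the method used in [14]. The coordinates and weights are hypothetical: the vertices could stand for L, R and the base station, and the weights for their data rates.

```python
import math


def weighted_cost(point, points, weights):
    """Weighted sum of distances from point to every vertex."""
    return sum(w * math.hypot(point[0] - px, point[1] - py)
               for (px, py), w in zip(points, weights))


def weighted_centroid(points, weights):
    total = sum(weights)
    return (sum(w * p[0] for p, w in zip(points, weights)) / total,
            sum(w * p[1] for p, w in zip(points, weights)) / total)


def weighted_fermat_point(points, weights, iterations=200, eps=1e-12):
    """Weiszfeld's iteration, started at the weighted centroid."""
    x, y = weighted_centroid(points, weights)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for (px, py), w in zip(points, weights):
            d = math.hypot(x - px, y - py)
            if d < eps:
                # Simplification: stop if the iterate lands on a vertex.
                # A full implementation would test whether that vertex is optimal.
                return (px, py)
            num_x += w * px / d
            num_y += w * py / d
            denom += w / d
        x, y = num_x / denom, num_y / denom
    return (x, y)


# Hypothetical layout: L, R and the base station, with data-rate weights.
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
weights = [2.0, 2.0, 1.0]
start = weighted_centroid(points, weights)
join_location = weighted_fermat_point(points, weights)
```

Each iteration replaces the current point with an average of the vertices weighted by w/d, which never increases the weighted cost.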


Synopsis Join [13]
◦ Prunes non-candidate tuples and joins only candidate tuples
◦ Preliminary Join
  Eliminate non-candidate tuples
◦ Final Join
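The two stages can be sketched as follows, assuming an equality join and a synopsis that is simply the set of join-attribute values on each side. Both assumptions are made for illustration; the synopsis structure and routing used in [13] are not reproduced here, and all names and tuples are hypothetical.

```python
def synopsis(tuples, key):
    """Compact summary of one side: the set of its join-attribute values."""
    return {t[key] for t in tuples}


def synopsis_join(left, right, key):
    # Preliminary join: intersect the synopses to find candidate values,
    # then eliminate tuples that cannot contribute to the result.
    candidates = synopsis(left, key) & synopsis(right, key)
    left_candidates = [t for t in left if t[key] in candidates]
    right_candidates = [t for t in right if t[key] in candidates]
    # Final join: only candidate tuples are shipped and joined.
    result = [(l, r) for l in left_candidates for r in right_candidates
              if l[key] == r[key]]
    shipped = len(left_candidates) + len(right_candidates)
    return result, shipped


# Hypothetical readings from two regions, joined on temperature.
left = [{"node": 1, "temp": 20}, {"node": 2, "temp": 25}, {"node": 3, "temp": 30}]
right = [{"node": 7, "temp": 25}, {"node": 8, "temp": 40}]
result, shipped = synopsis_join(left, right, "temp")
```

Here only 2 of the 5 tuples are shipped for the final join. The saving depends on the synopses being much smaller than the tuples themselves.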


TPSJ [10]
◦ Preprocessing: Query Decomposition
  Query Q
  Decomposed queries: Q1, Q2



TPSJ
◦ First phase
  Query Q1 is executed
◦ Second phase
  Query Q2 is executed, with R1 injected into the network



Sensor
◦ Lightweight
◦ Wireless
Sensor Data Management
◦ Reduce energy consumption
  In-network processing
  Aggregation
  Gathering
  Query processing

Conclusion

[1] S. Madden et al., "TAG: Aggregation Service for Ad-Hoc Sensor Networks," OSDI, 2002.
[2] N. Shrivastava et al., "Medians and Beyond: New Aggregation Techniques for Sensor Networks," ACM SenSys, 2004.
[3] N. Trigoni et al., "Multi-Query Optimization for Sensor Networks," DCOSS, 2005.
[4] N. Trigoni et al., "Routing and Processing Multiple Aggregate Queries in Sensor Networks," ACM SenSys, 2006.
[5] A. Deshpande et al., "Model-Driven Data Acquisition in Sensor Networks," VLDB, 2004.
[6] D. Chu et al., "Approximate Data Collection in Sensor Networks using Probabilistic Models," ICDE, 2006.
[7] D. Tulone et al., "PAQ: Time Series Forecasting For Approximate Query Answering In Sensor Networks," European Conf. Wireless Sensor Networks, 2006.
[8] A. Deligiannakis et al., "Compressing Historical Information in Sensor Networks," ACM SIGMOD, 2004.
[9] A. Jain et al., "Adaptive Stream Resource Management Using Kalman Filters," ACM SIGMOD, 2004.
[10] X. Yang et al., "In-Network Execution of Monitoring Queries in Sensor Networks," ACM SIGMOD, 2007.
[11] M. Stern et al., "Towards Efficient Processing of General-Purpose Joins in Sensor Networks," ICDE, 2009.
[12] A. Pandit et al., "Communication-Efficient Implementation of Range-Joins in Sensor Networks," International Conference on Database Systems for Advanced Applications (DASFAA), 2006.
[13] H. Yu et al., "In-Network Join Processing for Sensor Networks," APWeb, 2006.
[14] A. Coman et al., "On Join Location in Sensor Networks," MDM, 2007.

References
