Partha Mukherjee & Sandip Sen Department of Math & CS University of Tulsa

Comparing Reputation Schemes for Detecting Malicious Nodes in Sensor Networks

Motivation

ASSUMPTION: A network of sensors is deployed for sensing data over a region

• Data sensed at different nodes are correlated

• The correlation pattern may change over time

• Colluding malicious nodes may attempt to subvert the data reported by the sensor network

GOAL: Compare the performance of the reputation mechanisms used to detect malicious / erroneous nodes in the network

Sensor Networks

Monitor physical / environmental conditions

Resource constraints

Sensed / aggregated data reported back to the base station

Susceptible to security breaches / compromise

Sensor Network Organization

Sensor field consists of nodes laid out on a grid

Nodes organized in a hierarchy

Assumption: time-varying data sensed by different nodes are correlated

• Example: temperatures at different grid points over the day

Schemes used to detect malicious nodes

Reinforcement learning: Q-learning approach

Statistically grounded scheme: β-reputation approach

Discount factors: weights on past / present experiences

• Un-weighted

• Linear

• Exponential

Varying parameters:

• Patterns in the sensed data

• Delay of onset of malicious data

Detecting Malicious Nodes

Collect sufficient data while the sensor network is operating normally, for mining correlation patterns

Use neural networks to model the correlation between data sensed by siblings in the sensor node hierarchy

• The value sensed at any node is predicted from the values sensed by its siblings

• Offline training of the nets using back-propagation

Use learning techniques to discover patterns

Each malicious node adds a random offset in the range [0,] to the reported value

Detecting Malicious Nodes

At each reporting time step, the error between the actual and predicted data sensed by a node is calculated

This sequence of “errors” is used to incrementally update the reputation of the node

Node labeled malicious if reputation falls below threshold
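The incremental loop described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the relative-error definition, the mapping `error_statistic` (here `exp(-k * error)`), the constant `k`, the weighted-average update, the weight and the threshold value are all hypothetical choices made for the example.

```python
import math

def relative_error(actual, predicted, eps=1e-9):
    """Assumed definition: |actual - predicted| scaled by |predicted|."""
    return abs(actual - predicted) / max(abs(predicted), eps)

def error_statistic(error, k=5.0):
    """Hypothetical mapping of an error to (0, 1]; 1 means a perfect match."""
    return math.exp(-k * error)

def first_flagged_step(actual, predicted, threshold=0.5, weight=0.3, initial=1.0):
    """Return the 1-based step at which reputation first drops below
    the threshold, or None if it never does."""
    reputation = initial
    for step, (a, p) in enumerate(zip(actual, predicted), start=1):
        score = error_statistic(relative_error(a, p))
        # Placeholder update: weighted average of old reputation and new score.
        reputation = (1.0 - weight) * reputation + weight * score
        if reputation < threshold:
            return step
    return None

# Illustrative data: predictions from siblings versus values a node reports.
predicted = [20.0] * 10
honest = [20.1] * 10                   # small sensing noise only
tampered = [20.1] * 4 + [26.0] * 6     # offset added from step 5 onwards

print(first_flagged_step(honest, predicted))
print(first_flagged_step(tampered, predicted))
```

The honest node is never flagged, while the tampered node is flagged a few steps after the offset begins; how many steps depends on the assumed weight and threshold.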

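Detecting Malicious Nodes

Choose reputation threshold $\Theta$. For each node:

Compute relative error at time $t$: $\varepsilon_t$

Compute error statistic: $f(\varepsilon_t)$

Update reputations:

Q-Learning: $R^{t}_{QL} = (1 - \alpha)\, R^{t-1}_{QL} + \alpha\, f(\varepsilon_t)$

• Balance factor: $\alpha$

β-Reputation: $R^{t}_{\beta} = (r_t + 1) / (r_t + s_t + 1)$

• Cooperative response: $r_t$, Non-cooperative response: $s_t$

– Un-weighted: $r_t = \sum_{j=1}^{t} f(\varepsilon_j)$, $s_t = \sum_{j=1}^{t} (1 - f(\varepsilon_j))$

– Linear: $r_t = \sum_{j=1}^{t} f(\varepsilon_j) \cdot \frac{1}{t - j + 1}$, $s_t = \sum_{j=1}^{t} (1 - f(\varepsilon_j)) \cdot \frac{1}{t - j + 1}$

– Exponential: $r_t = \sum_{j=1}^{t} f(\varepsilon_j) \cdot \lambda^{t - j + 1}$, $s_t = \sum_{j=1}^{t} (1 - f(\varepsilon_j)) \cdot \lambda^{t - j + 1}$

• Exponential discount factor: $\lambda$

Node is malicious if $R^{t}_{QL} < \Theta$ or if $R^{t}_{\beta} < \Theta$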
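The two update rules and the three discounting schemes can be sketched in Python as below. This is a hedged illustration, not the authors' code: it assumes the error statistic is a score in [0, 1] where larger values mean closer agreement with the prediction (so the score feeds the cooperative response and one minus the score feeds the non-cooperative response), and the function names and `scheme` strings are hypothetical.

```python
def q_learning_update(prev_reputation, score, alpha):
    """Q-learning style update: blend the old reputation with the new score."""
    return (1.0 - alpha) * prev_reputation + alpha * score

def discount_weight(scheme, age, lam=1.0):
    """Weight of an observation that is `age` = t - j + 1 steps old (age >= 1)."""
    if scheme == "unweighted":
        return 1.0
    if scheme == "linear":
        return 1.0 / age
    if scheme == "exponential":
        return lam ** age
    raise ValueError(f"unknown scheme: {scheme}")

def beta_reputation(scores, scheme="unweighted", lam=1.0):
    """Beta-style reputation from a history of scores, each assumed in [0, 1]."""
    t = len(scores)
    cooperative = 0.0
    non_cooperative = 0.0
    for j, score in enumerate(scores, start=1):
        w = discount_weight(scheme, t - j + 1, lam)
        cooperative += score * w
        non_cooperative += (1.0 - score) * w
    return (cooperative + 1.0) / (cooperative + non_cooperative + 1.0)

# Illustrative history: 20 well-matched reports followed by 3 poor ones.
scores = [1.0] * 20 + [0.2] * 3
for scheme, lam in [("unweighted", 1.0), ("linear", 1.0), ("exponential", 0.2)]:
    print(scheme, round(beta_reputation(scores, scheme, lam), 4))
```

With this history both discounted variants end up with a lower reputation than the un-weighted one, because the long run of earlier good reports carries less weight.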

Experiment

Computation of sensed data

• Based on generation function: g

Model fluctuation

• Add Gaussian noise: N

Variation of the sensed parameter is represented by the stochastic function ƒ

• ƒ(x, y, t) = g(x, y) + h(t) + N(0, σ)

• h : T → [l, u]

Experiment

Considered two generation functions g to generate data patterns over the 85-node sensor network

• g1: exp(-(x^2 + y^2))

• g2: (x + y) / 2

Considered the set of error-free time intervals D = {0, 10, 20, 30, 40, 50}

Considered the set of exponential discount factors λ ∈ {0.2, 0.4, 0.6, 0.8}
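The data-generation setup can be sketched as follows. This is a minimal illustration under assumptions: the sinusoidal form of `h`, its period and bounds, the noise level `sigma`, the offset bound `max_offset`, and the convention that tampering starts at step `onset_delay` are all hypothetical; only the two generation functions and the overall form g(x, y) + h(t) + Gaussian noise come from the slides.

```python
import math
import random

def g1(x, y):
    """First generation function from the slides."""
    return math.exp(-(x ** 2 + y ** 2))

def g2(x, y):
    """Second generation function from the slides."""
    return (x + y) / 2.0

def h(t, period=24.0, low=0.0, high=1.0):
    """Hypothetical time-varying component bounded in [low, high]."""
    return low + (high - low) * 0.5 * (1.0 + math.sin(2.0 * math.pi * t / period))

def sensed_value(g, x, y, t, sigma=0.01, rng=random):
    """Spatial pattern plus time component plus zero-mean Gaussian noise."""
    return g(x, y) + h(t) + rng.gauss(0.0, sigma)

def reported_value(g, x, y, t, malicious, onset_delay,
                   max_offset=0.5, sigma=0.01, rng=random):
    """A malicious node adds a random offset once the error-free interval ends."""
    value = sensed_value(g, x, y, t, sigma, rng)
    if malicious and t >= onset_delay:
        value += rng.uniform(0.0, max_offset)
    return value

rng = random.Random(0)  # fixed seed so the run is repeatable
for t in (0, 9, 10, 11):
    print(t, reported_value(g2, 1.0, 3.0, t, malicious=True, onset_delay=10, rng=rng))
```

Passing a seeded `random.Random` instance keeps runs repeatable, which helps when comparing detection times across the onset delays in D.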

Q-learning and β-reputation Schemes with Linear and Two Extreme Discount Factors

Q-learning scheme detects the erroneous nodes earlier than β-reputation for distribution exp(-(x^2 + y^2))

Q-learning and β-reputation Schemes with Linear and Two Extreme Discount Factors

Q-learning scheme detects the erroneous nodes earlier than β-reputation for distribution (x + y)/2

Comparison Between β-Reputation Schemes with Different Discount Factors

β-reputation schemes with lower discount factors detect the erroneous nodes earlier for distribution exp(-(x^2 + y^2))

Comparison Between β-Reputation Schemes with Different Discount Factors

β-reputation schemes with lower discount factors detect the erroneous nodes earlier for distribution (x + y)/2

Conclusions

Q-learning is more efficient than β-reputation for higher values of the initial error-free time steps

β-reputation is more efficient than Q-learning at detecting the first malicious node when the initial delay of attack is between 0 and 4 iterations

Among β-reputation schemes with discount factors, schemes with lower discount values exhibit higher efficiency. The un-weighted one (λ = 1) is the least efficient

The combination of learning and reputation management makes this scheme work, with the following observations:

• All faulty nodes are detected (no false negatives)

• No normal node is labeled faulty (no false positives)

Future Work

Testing with different complex data patterns

Testing with different topologies

Exploring the possibility of developing a more robust scheme

Handling sophisticated collusion

• Hierarchical structure: what if nodes at a higher level collude?

THANK YOU
