Collecting Correlated Information from a Sensor Network
Micah Adler
University of Massachusetts, Amherst
Fundamental Problem
• Collecting information from distributed sources.
  – Objective: exploit correlations to reduce the bits that must be sent.
• Correlation examples in sensor networks:
  – Weather in a geographic region.
  – Similar views of the same image.
• Our focus: information theory.
  – Count the number of bits sent.
  – Ignore network topology.
Modeling Correlation
• k sensor nodes, each holding an n-bit string.
• Input drawn from a distribution D.
  – A sample specifies all kn bits.
  – Captures correlations and a priori knowledge.
• Objective:
  – Inform the server of all k strings.
  – Ideally: nodes send H(D) bits.
  – H(D): binary entropy of D.
[Figure: server collecting strings x1, x2, …, xk from the k nodes]
Binary entropy of D
• Optimal code to describe a sample from D:
  – Expected number of bits required ≈ H(D).
• H(D) ranges from 0 to kn.
  – Easy if the entire sample is known by a single node.
    • Idea: shorter codewords for more likely samples.
• Challenge of our problem: the input is distributed.

  H(D) = −∑_{x∈D} Pr[x] log₂ Pr[x]
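As a concrete illustration (not from the talk), the entropy formula can be evaluated directly. The toy distribution `D` below is an invented example of two correlated one-bit sensors that almost always agree:

```python
import math

def entropy(dist):
    """H(D) = -sum over x of Pr[x] * log2(Pr[x]), in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Invented toy example: keys are the two concatenated sensor readings,
# values are Pr[x] under D; the sensors almost always agree.
D = {"00": 0.45, "11": 0.45, "01": 0.05, "10": 0.05}
print(entropy(D))  # about 1.47 bits, well below the kn = 2 raw bits
```

Because the readings are correlated, H(D) is noticeably smaller than kn, which is exactly the gap these protocols try to exploit.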
Distributed Source Coding
• [Slepian-Wolf, 1973]:
  – Simultaneously encode r independent samples.
  – As r → ∞:
    • Bits sent by nodes → rH(D).
    • Probability of error → 0.
• Drawback: relies on r → ∞.
  – Recent research: try to remove this.
[Figure: for Slepian-Wolf coding, each node i holds r samples xi^1, xi^2, …, xi^r]
Recent Research on DSC
• Survey [XLC 2004]: over 50 recent DSC papers.
  – All previous techniques require:
    • significant restrictions on D, and/or
    • large values of r.
      – Can also be viewed as a restriction on D.
  – Generalizing D: most important open problem.
• Impossibility result:
  – There are D such that no encoding for small r achieves O(H(D)+k) bits sent from the nodes.
• Our result: general D, r = 1, O(H(D)+k) bits.
New approach
• Allow interactive communication!
  – Nodes receive “feedback” from the server.
  – Also utilized for DSC in [CPR 2003].
  – Server at least as powerful as the nodes.
• Power utilization:
  – Central issue for sensor networks.
  – Node sending: power intensive.
  – Node receiving: requires less power.
  – Analogy: portable radio vs. cellphone.
[Figure: server and nodes holding x1, …, xk exchanging bits in both directions]
New approach
• Communication model:
  – Synchronous rounds:
    • Nodes send bits to the server.
    • Server sends bits back to the nodes.
    • Nothing sent directly between nodes.
• Objectives:
  – Minimize bits sent by the nodes.
    • Ideally O(H(D)+k).
  – Minimize bits sent by the server.
  – Minimize rounds of communication.
Asymmetric Communication Channels
• Adler-Maggs 1998: the k = 1 case.
• Subsequent work: [HL2002] [GGS2001] [W2000] [WAF2001]
• Other applications:
  – Circumventing web censorship [FBHBK2002]
  – Design of websites [BKLM2002]
• Sensor networks problem: a natural parallelization.
[Figure: the k = 1 setting: one node holding x, a server knowing D]
Who knows what?
• Nodes: only know their own string.
  – Can also assume they know the distribution D.
• Server: knows the distribution D.
  – Typical in work on DSC.
  – In some applications, D must be learned by the server.
    • In most such cases, D varies with time.
    • Crucial to have r as small as possible.
[Figure: server knows D; each node i knows its string Xi plus D]
New Contributions
• New technique to communicate interactively:
  – O(H(D)+k) node bits.
  – O(kn + H(D) log n) server bits.
  – O(log min(k, H(D))) rounds.
• Lower bound:
  – kn bits must be exchanged if no error is allowed.
• If the server is allowed error with probability ∆:
  – O(H(D)+k) node bits.
  – O(k log(kn/∆) + H(D) log n) server bits.
  – O(log min(k, H(D))) rounds.
General strategy
• Support-uniform case:
  – D is uniform over its set of possible inputs.
• General distributions:
  – Technique from [Adler-Maggs 1998]:
    • “Reduce” to the support-uniform case.
  – Requires modifying the support-uniform protocol.
• Allowing for error:
  – Same protocol with some enhancements.
Support Uniform Input
• D: a k-dimensional binary matrix.
  – Side length 2^n.
• Choose X: a uniformly random 1-entry of the matrix.
• Server is given the matrix, wants to know X.
• Node i is given the i-th coordinate of X.
• H(D) = log M.
  – M: number of possible inputs.
Basic Building Block
• Standard fingerprinting techniques:
  – Class of hash functions f:
    • n-bit string → 1 bit.
    • For a randomly chosen f:
      – If x ≠ y, then Pr[f(x) = f(y)] ≈ 1/2.
    • Description of f requires O(log n) bits.
[Figure: node 1’s fingerprint bits split the matrix into possible and not-possible inputs]
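A minimal sketch of such a fingerprint family, assuming a random-parity hash f(x) = ⟨r, x⟩ mod 2 for a random seed r (note the seed here takes n bits rather than the O(log n) the slide cites, which requires a smaller pairwise-independent family):

```python
import random

def make_fingerprint(n, rng):
    """Random parity hash f(x) = <r, x> mod 2 over n-bit strings.
    For any fixed x != y, Pr over the choice of f that f(x) == f(y)
    is exactly 1/2.  (The seed r is n bits; an O(log n)-bit
    description needs a smaller hash family, omitted in this sketch.)"""
    r = rng.getrandbits(n)
    return lambda x: bin(r & x).count("1") % 2

rng = random.Random(0)
x, y = 0b101101, 0b011100            # two distinct 6-bit strings
trials = 10_000
collisions = sum(f(x) == f(y)
                 for f in (make_fingerprint(6, rng) for _ in range(trials)))
print(collisions / trials)           # empirically close to 1/2
```

One fingerprint bit therefore distinguishes two fixed distinct strings with probability about 1/2, which is all the protocols below need.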
Protocol for k = 1
• Server sends the node log M fingerprint functions.
• Node sends back the resulting fingerprint bits.
• Intuition: each bit removes half of the remaining inputs.
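A runnable sketch of this k = 1 loop under the assumptions above (server knows the M candidate inputs, node holds one of them, random-parity fingerprints); the function and variable names are mine, not from the talk:

```python
import random

def k1_protocol(candidates, x, n, rng):
    """Server-driven k = 1 sketch: each round the server picks a random
    parity fingerprint (sending its n-bit seed), the node answers with
    one bit, and the server discards candidates whose fingerprint
    disagrees.  The node sends about log2(len(candidates)) bits."""
    remaining = set(candidates)
    node_bits = 0
    while len(remaining) > 1:
        r = rng.getrandbits(n)                     # server -> node
        bit = bin(r & x).count("1") % 2            # node -> server (1 bit)
        node_bits += 1
        remaining = {s for s in remaining
                     if bin(r & s).count("1") % 2 == bit}
    return remaining.pop(), node_bits

rng = random.Random(1)
candidates = rng.sample(range(2 ** 10), 16)        # M = 16 possible inputs
x = candidates[3]                                  # the node's actual string
recovered, bits = k1_protocol(candidates, x, 10, rng)
print(recovered == x, bits)
```

Each answered bit eliminates about half of the surviving wrong candidates in expectation, so the node sends on the order of log M bits rather than n.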
First step: allow many rounds.
• Each round:
  – Server chooses one node.
  – That node sends a single fingerprint bit.
• Objectives:
  – Ideal: each bit removes half of the remaining inputs.
  – Our goal: each bit removes a constant fraction of inputs in expectation.
  – Possibility: no node can send such a bit.
    • Need to distinguish “useful” bits from “not useful” bits.
Balanced Bits
• A fingerprint bit sent by node i is balanced if:
  – no value for i is consistent with > 1/2 of the possible inputs,
  – given all information considered so far.
• Balanced bits, in expectation, eliminate a constant fraction of the inputs.
• Protocol goal:
  – Collect O(log M) balanced bits.
  – Don’t collect Ω(log M) unbalanced bits.
[Figure: example of a balanced vs. an unbalanced fingerprint bit]
Objective: minimize rounds.
• Must send multiple bits from multiple nodes per round.
  – But shouldn’t send too many unbalanced bits!
• Difficulty:
  – Must decide how many bits at the start of each round.
  – As bits are processed, the matrix evolves.
    • A node may then only be able to send unbalanced bits.
Number of bits sent per round
• Defined node: only one possible value left.
  – Should no longer send bits.
• First try:
  – Round i: k_i undefined nodes, each sends ⌈(log M)/k_i⌉ bits.
• Possible problem:
  – Most nodes become defined at the start of round i.
  – Nodes might send Ω((log k · log M)/log log k) total bits.
Protocol Description
• Phases: a first round and a second round.
  – First round of phase i: undefined nodes each send b_i bits.
    • Server processes the bits in any order.
    • Server counts the number of balanced bits.
  – Second round: if a node had any unbalanced bits,
    • query whether it holds the first “heavy” string.
• Continue until Θ(log M) balanced bits have been collected,
  – or until the entire input is known.
  – Then send the nodes their remaining possibilities.

  b_i = min( (log M)/k_i , (3/2)·b_{i−1} )
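A small sketch of how the budget recurrence b_i = min((log M)/k_i, (3/2)·b_{i−1}) plays out; the ceilings and the sample trace of undefined-node counts k_i are my assumptions, since the real k_i depend on the input as it evolves:

```python
import math

def bit_schedule(log_M, undefined_counts):
    """Per-node budgets b_i = min(ceil(log_M / k_i), ceil(3 * b_prev / 2)),
    with b_1 = ceil(log_M / k_1).  undefined_counts is an assumed trace
    of the k_i values across phases."""
    budgets = []
    b_prev = None
    for ki in undefined_counts:
        b = math.ceil(log_M / ki)
        if b_prev is not None:
            b = min(b, math.ceil(3 * b_prev / 2))
        budgets.append(b)
        b_prev = b
    return budgets

# log M = 64 with the undefined-node count shrinking phase by phase:
print(bit_schedule(64, [16, 8, 4, 2, 1]))  # [4, 6, 9, 14, 21]
```

The min with (3/2)·b_{i−1} caps how fast the per-node budget can grow between phases, so a sudden drop in k_i cannot trigger a flood of unbalanced bits.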
Performance of protocol
• Theorem:
  – Expected bits sent by nodes: O(k + log M).
  – Expected bits sent by server: O(kn + log M · log n).
  – Expected number of rounds: O(min(log log M, log k)).
Proof: O(log M + k) node bits.
• Key: if node i sends an unbalanced bit in a phase,
  – Pr[i is defined by the end of the phase] ≥ 1/2.
• Expected answers to heavy queries: O(1).
• Accounting for unbalanced bits:
  – Charge each to a bit sent by the same node:
    • in the most recent balanced round,
    • if none, then in the first round.
  – Spread the bits charged to a round evenly.
Proof: O(log M + k) node bits.
• Expected unbalanced bits charged to any one bit:

  ∑_{i=1}^{∞} (1/2)^{i−1} (3/2)^i = O(1)

• Total balanced bits: O(log M).
• Total first-round bits: k·⌈(log M)/k⌉ = O(k + log M).
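The charging series above does converge to a constant: each term equals 2·(3/4)^i, so the sum is 2·3 = 6. A quick numeric check:

```python
# Partial sum of sum_{i>=1} (1/2)**(i-1) * (3/2)**i, which equals
# 2 * sum_{i>=1} (3/4)**i = 2 * 3 = 6, confirming the O(1) bound.
total = sum((1 / 2) ** (i - 1) * (3 / 2) ** i for i in range(1, 200))
print(total)  # ~6.0
```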
Proof: server bits and rounds
• Server bits: after O(log M) balanced bits,
  – Pr[an incorrect input is not eliminated] = 1/M.
  – Expected possible inputs remaining: O(1).
• Rounds:
  – Sparse phase: fewer than (2/3)·log M fingerprint bits.
  – O(1) non-sparse phases.
  – O(min(log k, log log M)) sparse phases.