7/30/2019 Chernoff Bound
Chernoff Bounds
Let X1, ..., Xn be independent 0-1 random variables with

Pr(Xi = 1) = pi,   Pr(Xi = 0) = 1 − pi.

Let X = Σ_{i=1}^n Xi, and

μ = E[X] = Σ_{i=1}^n E[Xi] = Σ_{i=1}^n pi.

We want a bound on

Pr(|X − μ| ≥ δμ).
The Basic Idea

Using Markov's inequality we have: for any t > 0,

Pr(X ≥ a) = Pr(e^{tX} ≥ e^{ta}) ≤ E[e^{tX}] / e^{ta}.

Similarly, for any t < 0,

Pr(X ≤ a) = Pr(e^{tX} ≥ e^{ta}) ≤ E[e^{tX}] / e^{ta}.

Thus,

Pr(X ≥ a) ≤ min_{t>0} E[e^{tX}] / e^{ta},

Pr(X ≤ a) ≤ min_{t<0} E[e^{tX}] / e^{ta}.
Moment Generating Function
Definition
The moment generating function of a random variable X is defined
for any real value t as
M_X(t) = E[e^{tX}].
Theorem
Let X be a random variable with moment generating function M_X(t). Assuming that exchanging the expectation and differentiation operands is legitimate, then for all n ≥ 1,

E[X^n] = M_X^{(n)}(0),

where M_X^{(n)}(0) is the n-th derivative of M_X(t) evaluated at t = 0.

Proof.

M_X^{(n)}(t) = E[X^n e^{tX}].

Computed at t = 0 we get

M_X^{(n)}(0) = E[X^n].
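As a quick numerical illustration (our own sketch, not part of the slides): for a Bernoulli(p) variable the MGF is M(t) = (1 − p) + p e^t, so finite-difference derivatives of M at t = 0 should recover the moments E[X] = p and E[X^2] = p.

```python
import math

# Numerical sketch (assumed example): for X ~ Bernoulli(p),
# M_X(t) = (1 - p) + p e^t, and derivatives of M_X at 0 give moments.
p = 0.3
M = lambda t: (1 - p) + p * math.exp(t)

h = 1e-4
first = (M(h) - M(-h)) / (2 * h)            # central difference ~ M'(0) = E[X]
second = (M(h) - 2 * M(0) + M(-h)) / h**2   # ~ M''(0) = E[X^2]

print(first, second)
```

Both values should be close to p = 0.3, since X^2 = X for a 0-1 variable.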
Theorem
Let X and Y be two random variables. If

M_X(t) = M_Y(t)

for all t ∈ (−δ, δ) for some δ > 0, then X and Y have the same distribution.

Theorem

If X and Y are independent random variables then

M_{X+Y}(t) = M_X(t) M_Y(t).

Proof.

M_{X+Y}(t) = E[e^{t(X+Y)}] = E[e^{tX}] E[e^{tY}] = M_X(t) M_Y(t).
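The product rule can be checked by exact enumeration; here is a minimal sketch (the parameter values are our own assumption) for two independent Bernoulli variables.

```python
import math

# Sketch (values assumed): verify M_{X+Y}(t) = M_X(t) M_Y(t) for two
# independent Bernoulli variables by summing over all four outcomes.
p, q, t = 0.3, 0.6, 0.7

mx = (1 - p) + p * math.exp(t)
my = (1 - q) + q * math.exp(t)

# E[e^{t(X+Y)}] by direct enumeration over (X, Y) in {0,1}^2
mxy = sum((p if x else 1 - p) * (q if y else 1 - q) * math.exp(t * (x + y))
          for x in (0, 1) for y in (0, 1))

print(mxy, mx * my)
```

The two printed values agree up to floating-point error.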
Chernoff Bound for Sum of Bernoulli Trials
Let X1, ..., Xn be a sequence of independent Bernoulli trials with Pr(Xi = 1) = pi. Let X = Σ_{i=1}^n Xi, and let

μ = E[X] = E[Σ_{i=1}^n Xi] = Σ_{i=1}^n E[Xi] = Σ_{i=1}^n pi.

For each Xi:

M_{Xi}(t) = E[e^{tXi}]
= pi e^t + (1 − pi)
= 1 + pi(e^t − 1) ≤ e^{pi(e^t − 1)},

using 1 + x ≤ e^x.
Taking the product of the n generating functions we get

M_X(t) = Π_{i=1}^n M_{Xi}(t)
≤ Π_{i=1}^n e^{pi(e^t − 1)}
= e^{Σ_{i=1}^n pi (e^t − 1)}
= e^{μ(e^t − 1)}.
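The inequality M_X(t) ≤ e^{μ(e^t − 1)} can be sanity-checked numerically; this sketch draws the pi at random (our own choice of values) and compares both sides for a few t.

```python
import math, random

# Sketch with assumed random p_i: check the slide's inequality
# M_X(t) = prod_i (1 + p_i (e^t - 1)) <= e^{mu (e^t - 1)}.
random.seed(1)
ps = [random.random() for _ in range(20)]
mu = sum(ps)

ok = True
for t in (0.1, 0.5, 1.0, 2.0):
    lhs = math.prod(1 + p * (math.exp(t) - 1) for p in ps)
    rhs = math.exp(mu * (math.exp(t) - 1))
    ok = ok and lhs <= rhs
print(ok)
```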
Theorem
Let X1, ..., Xn be independent Bernoulli random variables such that Pr(Xi = 1) = pi. Let X = Σ_{i=1}^n Xi and μ = E[X].

1. For any δ > 0,
Pr(X ≥ (1 + δ)μ) ≤ ( e^δ / (1 + δ)^{(1+δ)} )^μ.

2. For 0 < δ < 1,
Pr(X ≥ (1 + δ)μ) ≤ e^{−μδ²/3}.

3. For R ≥ 6μ,
Pr(X ≥ R) ≤ 2^{−R}.

Proof of 1. For any t > 0,

Pr(X ≥ (1 + δ)μ) = Pr(e^{tX} ≥ e^{t(1+δ)μ}) ≤ E[e^{tX}] / e^{t(1+δ)μ} ≤ e^{μ(e^t − 1)} / e^{t(1+δ)μ}.

Since δ > 0, we can set t = ln(1 + δ) > 0 to get:

Pr(X ≥ (1 + δ)μ) ≤ ( e^δ / (1 + δ)^{(1+δ)} )^μ.
We show that for 0 < δ < 1,

e^δ / (1 + δ)^{(1+δ)} ≤ e^{−δ²/3},

or, equivalently, that

f(δ) = δ − (1 + δ) ln(1 + δ) + δ²/3 ≤ 0

in that interval. Computing the derivatives of f(δ) we get

f′(δ) = 1 − (1 + δ)·(1/(1 + δ)) − ln(1 + δ) + (2/3)δ
= −ln(1 + δ) + (2/3)δ,

f″(δ) = −1/(1 + δ) + 2/3.

f″(δ) < 0 for 0 ≤ δ < 1/2, and f″(δ) > 0 for δ > 1/2, so f′(δ) first decreases and then increases over the interval [0, 1]. Since f′(0) = 0 and f′(1) = −ln 2 + 2/3 < 0, f′(δ) ≤ 0 in the interval [0, 1].

Since f(0) = 0, we have that f(δ) ≤ 0 in that interval.
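The calculus argument above can be corroborated by a simple grid check (our own sketch): f(δ) stays non-positive throughout (0, 1].

```python
import math

# Sketch: grid-check the inequality used above,
# f(d) = d - (1 + d) ln(1 + d) + d^2/3 <= 0 for 0 < d <= 1.
f = lambda d: d - (1 + d) * math.log(1 + d) + d * d / 3
ok = all(f(i / 1000) <= 1e-12 for i in range(1, 1001))
print(ok)
```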
For R ≥ 6μ we have δ = R/μ − 1 ≥ 5, so

Pr(X ≥ (1 + δ)μ) ≤ ( e^δ / (1 + δ)^{(1+δ)} )^μ ≤ ( e / (1 + δ) )^{(1+δ)μ} ≤ (e/6)^R ≤ 2^{−R}.
Theorem
Let X1, ..., Xn be independent Bernoulli random variables such that Pr(Xi = 1) = pi. Let X = Σ_{i=1}^n Xi and μ = E[X].

For 0 < δ < 1,

Pr(X ≤ (1 − δ)μ) ≤ ( e^{−δ} / (1 − δ)^{(1−δ)} )^μ ≤ e^{−μδ²/2}.
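Both tail bounds can be compared against exact binomial probabilities; this sketch (parameters are our own choice) does so for X ~ Bin(n, 1/2), where μ = n/2.

```python
import math

# Sketch (parameters assumed): compare exact binomial tails for
# X ~ Bin(n, 1/2) with the Chernoff bounds e^{-mu d^2/3} (upper tail)
# and e^{-mu d^2/2} (lower tail).
n, d = 100, 0.2
mu = n / 2

def tail_ge(k):  # Pr(X >= k) for X ~ Bin(n, 1/2)
    return sum(math.comb(n, j) for j in range(k, n + 1)) / 2**n

upper = tail_ge(math.ceil((1 + d) * mu))           # Pr(X >= (1+d)mu)
lower = 1 - tail_ge(math.floor((1 - d) * mu) + 1)  # Pr(X <= (1-d)mu)

ok = upper <= math.exp(-mu * d * d / 3) and lower <= math.exp(-mu * d * d / 2)
print(ok)
```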
We need to show that, for 0 < δ < 1,

f(δ) = −δ − (1 − δ) ln(1 − δ) + δ²/2 ≤ 0.

Differentiating f(δ) we get

f′(δ) = ln(1 − δ) + δ,
f″(δ) = −1/(1 − δ) + 1.

f′(0) = 0, and since f″(δ) ≤ 0 in the range [0, 1), f′(δ) is monotonically decreasing in that interval, so f′(δ) ≤ 0 there. Since f(0) = 0, we have that f(δ) ≤ 0 in [0, 1).
Example: Coin flips
Let X be the number of heads in a sequence of n independent fair
coin flips.
Pr( |X − n/2| ≥ (1/2)√(4n ln n) )

= Pr( X ≥ (n/2)(1 + √(4 ln n / n)) ) + Pr( X ≤ (n/2)(1 − √(4 ln n / n)) )

≤ e^{−(1/3)(n/2)(4 ln n / n)} + e^{−(1/2)(n/2)(4 ln n / n)}

= n^{−2/3} + n^{−1} ≤ 2/n^{2/3}.
Using the Chebyshev bound we had:

Pr( |X − n/2| ≥ n/4 ) ≤ 4/n.

Using the Chernoff bound in this case, we obtain

Pr( |X − n/2| ≥ n/4 ) = Pr( X ≥ (n/2)(1 + 1/2) ) + Pr( X ≤ (n/2)(1 − 1/2) )

≤ e^{−(1/3)(n/2)(1/4)} + e^{−(1/2)(n/2)(1/4)}

≤ 2 e^{−n/24}.
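The two bounds just derived are easy to tabulate side by side (a sketch with n values of our own choosing):

```python
import math

# Sketch: tabulate the Chebyshev bound 4/n against the Chernoff
# bound 2 e^{-n/24} derived above for Pr(|X - n/2| >= n/4).
for n in (50, 100, 200):
    print(n, 4 / n, 2 * math.exp(-n / 24))
```

For small n the polynomial Chebyshev bound can still be smaller, but the exponentially decaying Chernoff bound takes over quickly.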
Example: Estimating a Parameter
We want to evaluate the probability that a particular gene mutation occurs in the population.
Given a DNA sample, a lab test can determine if it carries the mutation.
The test is expensive and we would like to obtain a relatively reliable estimate from a minimum number of samples. Let p be the unknown value and n the number of samples; p̃n of the samples had the mutation.
Given a sufficient number of samples we expect the value p to be in the neighborhood of the sampled value p̃, but we cannot predict any single value with high confidence.
Confidence Interval
Instead of predicting a single value for the parameter we give an interval that is likely to contain the parameter.

Definition

A 1 − q confidence interval for a parameter p is an interval [p̃ − δ, p̃ + δ] such that

Pr( p ∈ [p̃ − δ, p̃ + δ] ) ≥ 1 − q.

We want to minimize the interval width 2δ and the error probability q, with minimum n. Using p̃ as our estimate for p, we need to compute δ and q such that

Pr( p ∈ [p̃ − δ, p̃ + δ] ) = Pr( np ∈ [n(p̃ − δ), n(p̃ + δ)] ) ≥ 1 − q.
The random variable here is the interval [p̃ − δ, p̃ + δ] (or the value p̃), while p is a fixed (unknown) value.
np̃ has a binomial distribution with parameters n and p, and E[p̃] = p. If p ∉ [p̃ − δ, p̃ + δ] then we have one of the following two events:

1. If p < p̃ − δ, then np̃ > n(p + δ) = np(1 + δ/p), i.e. np̃ is larger than its expectation by at least a δ/p factor.

2. If p > p̃ + δ, then np̃ < n(p − δ) = np(1 − δ/p), and np̃ is smaller than its expectation by at least a δ/p factor.
Pr( p ∉ [p̃ − δ, p̃ + δ] )

= Pr( np̃ ≤ np(1 − δ/p) ) + Pr( np̃ ≥ np(1 + δ/p) )

≤ e^{−(1/2) np (δ/p)²} + e^{−(1/3) np (δ/p)²}

= e^{−nδ²/(2p)} + e^{−nδ²/(3p)}.

But the value of p is unknown. A simple solution is to use the fact that p ≤ 1 to prove

Pr( p ∉ [p̃ − δ, p̃ + δ] ) ≤ e^{−nδ²/2} + e^{−nδ²/3}.

Setting q = e^{−nδ²/2} + e^{−nδ²/3}, we obtain a tradeoff between δ, n, and the error probability q.
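The tradeoff can be turned into a sample-size calculation; this sketch (the helper name and parameter values are our own, not the slides') doubles n until the loose p ≤ 1 bound drops below a target q.

```python
import math

# Sketch (helper name and parameters are ours, not the slides'):
# find n making the loose bound e^{-n d^2/2} + e^{-n d^2/3} <= q,
# doubling n until the tradeoff is met.
def samples_needed(delta, q):
    n = 1
    while math.exp(-n * delta ** 2 / 2) + math.exp(-n * delta ** 2 / 3) > q:
        n *= 2
    return n

print(samples_needed(0.05, 0.05))
```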
Better Bound
The binomial probabilities are monotonically increasing up to the expectation, and then monotonically decreasing. Hence

Pr( p ∉ [p̃ − δ, p̃ + δ] )

≤ Pr( np̃ ≤ np(1 − (p − p̃)/p) ) + Pr( np̃ ≥ np(1 + (p̃ − p)/p) )

≤ max_{p ≥ p̃+δ} e^{−np((p − p̃)/p)²/2} + max_{p ≤ p̃−δ} e^{−np((p̃ − p)/p)²/3}

= e^{−nδ²/(2(p̃ + δ))} + e^{−nδ²/(3(p̃ − δ))}.

Setting

q = e^{−nδ²/(2(p̃ + δ))} + e^{−nδ²/(3(p̃ − δ))}

gives a tighter tradeoff between δ, n, and q.
Application: Set Balancing
Given an n × n matrix A with entries in {0, 1}, let

[ a_11 a_12 ... a_1n ]   [ b_1 ]   [ c_1 ]
[ a_21 a_22 ... a_2n ] · [ b_2 ] = [ c_2 ]
[ ...  ...  ... ...  ]   [ ... ]   [ ... ]
[ a_n1 a_n2 ... a_nn ]   [ b_n ]   [ c_n ]

Find a vector b with entries in {−1, 1} that minimizes

||Ab||_∞ = max_{i=1,...,n} |c_i|.
Theorem
For a random vector b, with entries chosen independently and with equal probability from the set {−1, 1},

Pr( ||Ab||_∞ ≥ √(12 n ln n) ) ≤ 2/n.
Consider the i-th row a_i = (a_{i,1}, ..., a_{i,n}). Let k be the number of 1s in that row.
If k ≤ √(12 n ln n), then clearly |a_i · b| ≤ √(12 n ln n). If k > √(12 n ln n), let

X_i = |{ j | a_{i,j} = 1 and b_j = 1 }|

and

Y_i = |{ j | a_{i,j} = 1 and b_j = −1 }|.

Thus X_i counts the number of +1s in the sum Σ_{j=1}^n a_{i,j} b_j, Y_i counts the number of −1s, and X_i + Y_i = k.
If |X_i − Y_i| ≥ √(12 n ln n), then |X_i − (k − X_i)| ≥ √(12 n ln n), which implies

X_i ≥ (k/2)(1 + √(12 n ln n)/k)   or   X_i ≤ (k/2)(1 − √(12 n ln n)/k).
Using Chernoff bounds,

Pr( X_i ≥ (k/2)(1 + √(12 n ln n)/k) ) ≤ e^{−(k/2)(1/3)(12 n ln n / k²)} = e^{−2 n ln n / k} ≤ e^{−2 ln n},

Pr( X_i ≤ (k/2)(1 − √(12 n ln n)/k) ) ≤ e^{−(k/2)(1/2)(12 n ln n / k²)} = e^{−3 n ln n / k} ≤ e^{−3 ln n},

using k ≤ n. Hence, for a given row,

Pr( |X_i − Y_i| ≥ √(12 n ln n) ) ≤ e^{−2 ln n} + e^{−3 ln n} ≤ 2/n².

Since there are n rows, the probability that any row exceeds that bound is bounded by 2/n.
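A small experiment (our own sketch, with an assumed matrix size and seed) makes the guarantee concrete: a random ±1 vector b keeps the discrepancy of a random 0/1 matrix well under the √(12 n ln n) threshold.

```python
import math, random

# Sketch (assumed experiment): draw a random 0/1 matrix A and a
# random +/-1 vector b, then compare ||Ab||_inf with sqrt(12 n ln n).
random.seed(7)
n = 200
A = [[random.randint(0, 1) for _ in range(n)] for _ in range(n)]
b = [random.choice((-1, 1)) for _ in range(n)]

norm = max(abs(sum(a * x for a, x in zip(row, b))) for row in A)
threshold = math.sqrt(12 * n * math.log(n))
print(norm <= threshold)
```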
Chernoff Bound for Sum of {−1, +1} Random Variables
Theorem
Let X1, ..., Xn be independent random variables with

Pr(Xi = 1) = Pr(Xi = −1) = 1/2.

Let X = Σ_{i=1}^n Xi. For any a > 0,

Pr(X ≥ a) ≤ e^{−a²/2n}.
For any t > 0,

E[e^{tXi}] = (1/2) e^t + (1/2) e^{−t}.

Since

e^t = 1 + t + t²/2! + ... + t^i/i! + ...

and

e^{−t} = 1 − t + t²/2! + ... + (−1)^i t^i/i! + ...,

we have

E[e^{tXi}] = (1/2) e^t + (1/2) e^{−t} = Σ_{i≥0} t^{2i}/(2i)! ≤ Σ_{i≥0} (t²/2)^i / i! = e^{t²/2}.
E[e^{tX}] = Π_{i=1}^n E[e^{tXi}] ≤ e^{nt²/2},

Pr(X ≥ a) = Pr(e^{tX} ≥ e^{ta}) ≤ E[e^{tX}] / e^{ta} ≤ e^{t²n/2 − ta}.

Setting t = a/n yields

Pr(X ≥ a) ≤ e^{−a²/2n}.
By symmetry we also have
Corollary
Let X1, ..., Xn be independent random variables with

Pr(Xi = 1) = Pr(Xi = −1) = 1/2.

Let X = Σ_{i=1}^n Xi. Then for any a > 0,

Pr(|X| > a) ≤ 2 e^{−a²/2n}.
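The corollary is easy to test by simulation; this sketch (n, a, and the trial count are our own assumptions) compares an empirical tail frequency with the bound 2e^{−a²/2n}.

```python
import math, random

# Sketch (parameters assumed): estimate Pr(|X| > a) for a sum of
# n independent +/-1 signs and compare with the bound 2 e^{-a^2/2n}.
random.seed(0)
n, a, trials = 100, 30, 20000

hits = sum(
    abs(sum(random.choice((-1, 1)) for _ in range(n))) > a
    for _ in range(trials)
)
empirical = hits / trials
bound = 2 * math.exp(-a * a / (2 * n))
print(empirical <= bound)
```

Here a = 3√n, so the bound is about 2e^{−4.5} ≈ 0.022, while the true tail probability is far smaller.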
Application: Set Balancing Revisited
Theorem
For a random vector b, with entries chosen independently and with equal probability from the set {−1, 1},

Pr( ||Ab||_∞ ≥ √(4 n ln n) ) ≤ 2/n.

Consider the i-th row a_i = (a_{i,1}, ..., a_{i,n}), and let k be the number of 1s in that row. Let i_1, ..., i_k be the positions with a_{i,j} = 1, so that

Z_i = Σ_{j=1}^k a_{i,i_j} b_{i_j} = Σ_{j=1}^k b_{i_j}.

If k ≤ √(4 n ln n), then clearly Z_i satisfies the bound.
If k 4n ln n then clearly Zi satisfies the bound.
If k > √(4 n ln n), the k non-zero terms in the sum Z_i are independent random variables, each with probability 1/2 of being either +1 or −1. Using the Chernoff bound:

Pr( |Z_i| > √(4 n ln n) ) ≤ 2 e^{−4 n ln n / 2k} ≤ 2/n²,

where we use the fact that k ≤ n.
Packet Routing on a Parallel Computer
Communication network:
Nodes: processors, switching nodes. Edges: communication links.
The n-cube: N = 2^n nodes. Let x = (x_1, ..., x_n) be the label of node x in binary. Nodes x and y are connected by an edge iff their binary representations differ in exactly one bit.
Bit-wise routing: correct bit i in the i-th transition; a route has length at most n.
A permutation communication request: each node is the source and destination of exactly one packet. Up to one packet can cross an edge per step, and each packet can cross up to one edge per step.
What is the time to route an arbitrary permutation on the n-cube?
Two-phase routing algorithm:

1. Send the packet to a randomly chosen destination.
2. Send the packet from the random intermediate node to its real destination.

Path: correct the bits in order, from x_0 to x_{n−1}. Any greedy queuing method: if some packet can traverse an edge, one does.
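The bit-fixing path used in each phase can be sketched in a few lines (the function name and low-bit-first ordering are our own assumptions; the slides only fix that bit i is corrected at the i-th opportunity):

```python
# Sketch of the bit-fixing route on the n-cube: repeatedly correct
# the next differing bit until the destination is reached.
def bit_fixing_path(src, dst, n):
    path, cur = [src], src
    for i in range(n):          # consider bit i at the i-th opportunity
        bit = 1 << i
        if (cur ^ dst) & bit:   # bit i still wrong: traverse that edge
            cur ^= bit
            path.append(cur)
    return path

print(bit_fixing_path(0b000, 0b101, 3))
```

Each step flips exactly one bit, so every route has at most n edges, matching the m ≤ n used in the analysis below.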
Theorem
The two-phase routing algorithm routes an arbitrary permutation on the n-cube in O(log N) = O(n) parallel steps with high probability.
We focus first on phase 1. We bound the routing time of a given packet M.
Let e_1, ..., e_m be the m ≤ n edges traversed by the packet M in phase 1.
Let X(e) be the total number of packets that traverse edge e in that phase. Let T(M) be the number of steps until M finishes phase 1.
Lemma

T(M) ≤ Σ_{i=1}^m X(e_i).

We call any path P = (e_1, e_2, ..., e_m) of m ≤ n edges that follows the bit-fixing algorithm a possible packet path. We denote the corresponding nodes v_0, v_1, ..., v_m, with e_i = (v_{i−1}, v_i).
For any possible packet path P, let T(P) = Σ_{i=1}^m X(e_i). If phase 1 takes more than T steps, then for some possible packet path P,

T(P) ≥ T.

There are at most 2^n · 2^n = 2^{2n} possible packet paths.
Assume that e_k connects (a_1, ..., a_i, ..., a_n) to (a_1, ..., ā_i, ..., a_n). Only packets that started in an address

(*, ..., *, a_i, ..., a_n)

can traverse edge e_k, and only if their destination addresses are

(a_1, ..., a_{i−1}, ā_i, *, ..., *).
There are 2^{i−1} possible such packets, each with probability 2^{−i} of traversing e_k.
Hence

E[X(e_i)] ≤ 2^{i−1} · 2^{−i} = 1/2,

E[T(P)] ≤ Σ_{i=1}^m E[X(e_i)] ≤ m/2 ≤ n/2.

Problem: the X(e_i)'s are not independent.

A packet is active with respect to a possible packet path P if it ever uses an edge of P.
For k = 1, ..., N, let H_k = 1 if the packet starting at node k is active, and H_k = 0 otherwise.
The H_k are independent, since each H_k depends only on the choice of the intermediate destination of the packet starting at node k, and these choices are independent for all packets.
Let H = Σ_{k=1}^N H_k be the total number of active packets. Every active packet traverses at least one edge of P, so

E[H] ≤ E[T(P)] ≤ n.

Since H is the sum of independent 0-1 random variables, we can apply the Chernoff bound Pr(X ≥ R) ≤ 2^{−R} for R ≥ 6E[X]:

Pr(H ≥ 6n) ≤ 2^{−6n}.
For a given possible packet path P,

Pr(T(P) ≥ 36n) ≤ Pr(H ≥ 6n) + Pr(T(P) ≥ 36n | H < 6n)
≤ 2^{−6n} + Pr(T(P) ≥ 36n | H < 6n).
Lemma
If a packet leaves a path (of another packet) it cannot return to that path in the same phase.

Proof.

Leaving the path at the i-th transition implies a different i-th bit; this bit cannot be changed again in that phase.

Lemma

The number of transitions that a packet takes on a given path is distributed G(1/2).

Proof.

The packet has probability 1/2 of leaving the path in each transition.
Hence the probability that the active packets cross edges of P more than 36n times is less than the probability that a fair coin flipped 36n times comes up heads fewer than 6n times. Letting Z be the number of heads in 36n fair coin flips, we now apply the Chernoff bound:

Pr(T(P) ≥ 36n | H ≤ 6n) ≤ Pr(Z ≤ 6n) ≤ e^{−18n(2/3)²/2} = e^{−4n} ≤ 2^{−3n−1}.
Pr(T(P) ≥ 36n) ≤ Pr(H ≥ 6n) + Pr(T(P) ≥ 36n | H ≤ 6n) ≤ 2^{−6n} + 2^{−3n−1} ≤ 2^{−3n}.
As there are at most 2^{2n} possible packet paths in the hypercube, the probability that there is any possible packet path for which T(P) ≥ 36n is bounded by

2^{2n} · 2^{−3n} = 2^{−n} = O(N^{−1}).
The proof for phase 2 is by symmetry:
The proof for phase 1 argued about the number of packets crossing a given path, with no timing considerations.
Routing from one packet per node to random locations is similar to routing from random locations to one packet per node, in reverse order.
Thus, the distribution of the number of packets that cross a path of a given packet is the same.
Oblivious Routing
Definition
A routing algorithm is oblivious if the path taken by one packet is independent of the sources and destinations of any other packets in the system.

Theorem

Given an N-node network with maximum degree d, the routing time of any deterministic oblivious routing scheme is

Ω( √(N / d³) ).