
Page 1:

Randomization

Carmella Kroitoru

Seminar on Communication Complexity

Page 2:

What’s new?

Until now, Alice and Bob were all-powerful, but deterministic.

But what if Alice and Bob could act in a randomized fashion?

What's a 'randomized fashion'??

Well, just flipping a coin, for instance.

Page 3:

What now??

Now the players "toss coins".

The outcomes of the tosses determine how the protocol proceeds.

The communication over an input (x, y) is no longer fixed; it's a random variable.

And so is the protocol's output P(x,y).

Page 4:

So, how can we define the success of a protocol?

The conservative way – Las Vegas protocols:

Success = the protocol always outputs the correct f(x,y).

The liberal option – Monte Carlo protocols:

Success = with high probability, the protocol outputs the correct f(x,y).

Page 5:

Where do we get all the coins?

Denote: r(I) is a random string of arbitrary length.

                      Deterministic                     Randomized
Input                 Alice gets x, Bob gets y          Alice gets x, Bob gets y
Additional info       none                              Alice has r(A), Bob has r(B)
Output dependence     f(x,y) depends only on x and y    the output depends on x, y, r(A), r(B)

Page 6:

So what does the protocol tree look like?

[Figure: protocol trees before and after randomization. Nodes branch on x1 = 1 / x1 = 0 (Alice) and y1 = 1 / y1 = 0 (Bob); the leaves hold the output values z(1,1), z(1,0), z(0,1), z(0,0).]

Page 7:

But what if we get the 'wrong' r(I)?

Does that mean the output will be wrong?

Yes!

For the same input (x,y), the protocol's output can differ, depending on r(A) and r(B).

So how will we get the right answer?

Through the magic of probability.

Page 8:

Definitions: P – a randomized protocol.

Zero error – for every (x,y):
Pr [P(x,y) = f(x,y)] = 1

ε error – for every (x,y):
Pr [P(x,y) = f(x,y)] ≥ 1 − ε

One-sided ε error – for every (x,y):
if f(x,y) = 0 then Pr [P(x,y) = 0] = 1
if f(x,y) = 1 then Pr [P(x,y) = 1] ≥ 1 − ε

Page 9:

Now the protocol's output can vary depending on r(A) and r(B).

And what about the cost? Is the number of bits exchanged fixed for a given input (x,y)?

No!

So how will we measure the running time?

Page 10:

Running time – first method

The worst-case running time of a randomized protocol P on input (x,y) is the maximum, over all random strings r(A), r(B), of the number of bits exchanged.

The worst-case cost of P is the maximum, over all inputs (x,y), of the worst-case running time on (x,y).

Page 11:

Running time – second method

The average-case running time of a randomized protocol P on input (x,y) is the expected number of bits exchanged, over the choices of r(A), r(B).

The average-case cost of P is the maximum, over all inputs (x,y), of the average-case running time on (x,y).

Page 12:

Average over what?

r(A) and r(B) are chosen independently, according to some probability distribution; the average is taken over these random strings.

Should we also consider a distribution over the inputs (x,y)?

No! There is no 'average input' – we still take the maximum over all (x,y).

Page 13:

Now what's all that 'probability' talk? Didn't I pass that course already?

Let's do a quick review.

Page 14:

Binomial distribution

A binomial experiment – a sequence of Bernoulli trials – is a statistical experiment with these properties:

- n independent trials y1, …, yn
- each trial results in success or failure (denoted 1/0, or T/F)
- the probability of success, p, is the same on every trial

Denote: S – the number of successes.

Example: coin tosses, with success = 'Heads'.

trial:   y1  y2  y3  …  yn
result:   0   1   1  …   0        S = ∑ yi

Page 15:

Exercise: Given a box with red and blue balls, suppose red balls are at least a δ fraction of the total.

What is the probability that k draws (with replacement) don't see any red ball?

At most (1 − δ)^k ≤ e^(−δk) – for instance, k = ln(3)/δ draws bring this down to 1/3.

But what if 1/3 isn't good enough? What if we want < α?

Take k = ln(1/α)/δ draws: then the probability of seeing no red ball is at most e^(−δk) = α.
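
A quick numeric check of this bound (a minimal sketch in Python; the values of δ and α are illustrative):

```python
import math

delta = 0.1   # red balls are at least a 10% fraction of the box (illustrative)
alpha = 0.01  # target failure probability

# k draws with replacement miss every red ball with probability at most
# (1 - delta)^k <= e^(-delta * k), so k = ln(1/alpha)/delta draws suffice.
k = math.ceil(math.log(1 / alpha) / delta)
print(k, (1 - delta) ** k)  # 47, ~0.007 < alpha
```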

Page 16:

Expectation

Linearity:
E( a ) = a
E( a·X + b ) = a·E(X) + b
E( X + Y ) = E(X) + E(Y)

Non-multiplicativity:
E( X·Y ) = E(X) · E(Y) holds when X and Y are independent, but not in general.

Page 17:

What is the probability that a random variable deviates from its expectation?

We focus on sums of n bounded variables: S = ∑ yi, with yi ∈ {0,1}.

Note: similar bounds can be obtained for other bounded variables.

Page 18:

Don't know anything about the yi's? If the yi's are positive but we know nothing else, we can use Markov's inequality to bound S:

Pr [ S ≥ t·E(S) ] ≤ 1/t, for any t ≥ 1.

Some intuition: no more than 1/5 of the population can earn more than 5 times the average income.

Example: toss 100 coins. Pr (# of heads ≥ 60) ≤ 5/6 (E(S) = 50, t = 6/5).

So without knowing whether the yi's are bounded, or whether they are independent, we get an upper bound.

It's not a great bound, though, and it says nothing about the lower tail.
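
To see how loose Markov is here, a small simulation (a sketch; the trial count is arbitrary):

```python
import random

# 100 fair coins: Markov gives Pr[S >= 60] <= E[S]/60 = 5/6,
# while the true probability is about 0.028.
n, trials = 100, 20_000
exceed = sum(sum(random.randint(0, 1) for _ in range(n)) >= 60
             for _ in range(trials))
print("Markov bound:", 50 / 60)          # ~0.833
print("empirical   :", exceed / trials)  # ~0.028
```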

Page 19:

Bounding S = ∑ yi when the yi's are independent

Page 20:

An exact result: using the cumulative binomial probability.

This refers to the probability that a binomial random variable falls within a specified range.

Example: toss 100 coins. Pr (# of heads ≥ 60) = ?

Solution: we compute 41 individual probabilities using the binomial formula, and their sum is the answer we seek. Denote k = # of heads:

Pr(k ≥ 60; n = 100, p = 0.5) = Pr(k = 60; 100, 0.5) + Pr(k = 61; 100, 0.5) + … + Pr(k = 100; 100, 0.5)

Pr(k ≥ 60; n = 100, p = 0.5) ≈ 0.028
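
The same number computed exactly (a short Python check using the binomial formula):

```python
from math import comb

# Pr[k >= 60] for n = 100 fair coins: sum the 41 terms Pr[k = 60..100].
n = 100
tail = sum(comb(n, k) for k in range(60, n + 1)) / 2 ** n
print(tail)  # ~0.0284
```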

Page 21:

More generally, we can use Chernoff's inequality to bound S when the yi's are independent.

For any δ > 0:

Pr [ S ≥ (1 + δ)·E(S) ] ≤ e^(−δ²·E(S)/3)

Example: toss 100 coins independently.

Pr (# of heads ≥ 60) ≤ e^(−2/3) ≈ 0.51 (E(S) = 50, δ = 1/5)

Dramatically better bounds than Markov for large deviations.

Weaker than the exact cumulative probability, but much easier to use, and it works in more cases.

Gives bounds for both tails (S bigger or smaller than E(S)).
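
Putting the two bounds next to the exact value (a sketch assuming the multiplicative Chernoff form stated above):

```python
import math

mu = 50              # E[S] for 100 fair coins
delta = 60 / mu - 1  # 60 = (1 + delta) * mu, so delta = 1/5

markov = mu / 60                          # 5/6 ~ 0.833
chernoff = math.exp(-delta**2 * mu / 3)   # e^(-2/3) ~ 0.513
print(markov, chernoff)                   # the exact tail is ~0.028
```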

Back to running time

Page 22:

3 types of errors – 3 complexity measures (1)

Let f : X × Y → {0,1}. Let's define complexity measures for randomized protocols:

R0(f) is the minimum average-case cost of a randomized protocol that computes f with zero error.

Page 23:

3 types of errors – 3 complexity measures (2)

For 0 < ε < 1, Rε(f) is the minimum worst-case cost of a randomized protocol that computes f with error ε.

For 0 < ε < 1, R¹ε(f) is the minimum worst-case cost of a randomized protocol that computes f with one-sided error ε.

Page 24:

Wait!

What's the meaning of 'average-case cost' for a zero-error protocol?

And of 'worst-case cost' for a protocol with error?

Page 25:

Worst case = Θ(average case)

Reminder: DAVG(P) – the running time averaged over the random strings r(A), r(B), maxed over all possible inputs (x,y).

Suppose a protocol A makes error ε/2 and DAVG(A) = t. Define A':
~ Execute A as long as at most 2t/ε bits are exchanged
~ If A finished, return its answer
~ Else return 0

The number of bits exchanged by A' in the worst case is DWORST(A') = 2t/ε.

So we found k = 2/ε such that DWORST(A') = k · DAVG(A).
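
A toy simulation of this truncation (a sketch; the cost distribution of A is made up purely for illustration):

```python
import random

eps = 0.1
# Model A's communication cost as a random variable with mean ~40 (toy choice).
costs = [random.expovariate(1 / 40) for _ in range(100_000)]
t = sum(costs) / len(costs)  # empirical average cost of A

budget = 2 * t / eps  # A' cuts A off after 2t/eps bits
timeouts = sum(c > budget for c in costs) / len(costs)
print(timeouts, "<=", eps / 2)  # Markov: Pr[cost > (2/eps)*t] <= eps/2
```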

Page 26:

But do A and A' have the same error? A has error ε/2.

What's the error of A'? A' can return the answer that A output; that has probability ε/2 of being wrong.

Or A' can output 0 because A wasn't done after 2t/ε bits. What's the chance of that? We use Markov:

Pr [ A exchanges more than 2t/ε bits ] = Pr [ #{bits exchanged by A} > (2/ε)·t ] ≤ 1/(2/ε) = ε/2

So A' has error at most ε/2 + ε/2 = ε, i.e., error(A') = 2·error(A); and if A's error is one-sided, so is that of A'.

Page 27:

What if ε = 0?

For zero error protocols, using the worst case cost gives exactly the deterministic communication complexity.

How come?

A deterministic protocol can fix any values of r(A) and r(B) and proceed: with zero error, every such fixing still computes f correctly on all inputs.

So for zero error protocols, we only care about the average case cost.

Page 28:

Exercise (1):

A: a protocol that computes f(x, y), for some fixed x, y, with one-sided error ε.

A': Execute A t times.

Denote: ~ fi – the result of execution #i
~ if fi = 1 for some i, then res(A') = 1; else res(A') = 0.

Pr [ res(A') ≠ f(x, y) ] = ?

Solution: If res(A') = 1, that's the right answer, 100% (with one-sided error, A outputs 1 only if f(x,y) = 1).

If res(A') = 0, it might be the right answer, or we got a wrong result in all t rounds. What's the chance of that?

Pr [ mistake in one run of A ] ≤ ε, so Pr [ mistake in A' ] ≤ ε^t.
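
A simulation of this amplification (a sketch; the one-sided-error protocol is modeled as a biased coin on an input with f(x,y) = 1):

```python
import random

eps, t = 0.3, 5  # illustrative values

def run_A():
    # One run of A on an input with f(x,y) = 1: correct (1) w.p. 1 - eps.
    return 1 if random.random() > eps else 0

trials = 100_000
# A' errs only if all t runs return 0.
fails = sum(all(run_A() == 0 for _ in range(t)) for _ in range(trials))
print(fails / trials, "vs bound", eps ** t)  # ~0.0024
```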

Page 29:

Exercise (2):

A: a protocol that computes f(x, y), for some fixed x, y, with (two-sided) error ε.

A': Execute A t times. Denote: fi – the result of execution #i.

res(A') = maj {fi}

What is the probability that res(A') ≠ f(x, y)?

Solution (1):

Define Bernoulli trials: yi = 1 if fi ≠ f(x, y).

res(A') is wrong only if more than half of the yi's are 1.

E[∑ yi] = E[S] = εt. So what's the chance that S > t/2?

Hint: use Chernoff.

Page 30:

Solution (2): Let's fix ε ≤ ¼ (the argument generalizes to any ε < ½) and take δ = 1. Since E[S] = εt ≤ t/4, we have t/2 ≥ (1 + δ)·E[S], so by Chernoff (at ε = ¼, the worst case):

Pr [ S > t/2 ] ≤ Pr [ S ≥ (1 + δ)·E[S] ] ≤ e^(−δ²·E[S]/3) = e^(−t/12)

What if we want to bound the error probability by some smaller α?

Then we need to take t = 12·ln(1/α).

So the error probability can be reduced by taking a larger t, i.e., by paying only a small penalty in communication complexity.
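
A matching simulation of majority amplification (a sketch; the ε-error protocol is again modeled as a biased coin, with t = 12·ln(1/α) as derived above):

```python
import math
import random

eps, alpha = 0.25, 0.01
t = math.ceil(12 * math.log(1 / alpha))  # 56 repetitions

def run_A():
    # One run of an eps-error protocol: returns 1 (correct) w.p. 1 - eps.
    return 1 if random.random() > eps else 0

trials = 20_000
# The majority is wrong only when at least half of the t runs err.
wrong = sum(sum(run_A() for _ in range(t)) <= t / 2 for _ in range(trials))
print(wrong / trials, "<=", alpha)
```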

Page 31:

Field

A non-empty set F with two binary operations, + (addition) and * (multiplication), is called a field if it satisfies:

- closure of F under + and *
- associativity of + and *
- commutativity of + and *
- distributivity of * over +
- identities for + and * (0 and 1)
- inverses for + and for * (the latter for nonzero elements)

Examples: the rational numbers; GF[7] = {0, 1, 2, 3, 4, 5, 6} with arithmetic mod 7.
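
A quick look at GF[7] arithmetic in code (a sketch; the multiplicative inverse is computed via Fermat's little theorem, a^(p−2) mod p):

```python
p = 7  # GF[7] = {0, 1, ..., 6}, arithmetic mod 7

a, b = 3, 5
print((a + b) % p)       # 1 : closure under +
print((a * b) % p)       # 1 : closure under *
print((-a) % p)          # 4 : additive inverse of 3
print(pow(a, p - 2, p))  # 5 : multiplicative inverse of 3 (3 * 5 = 15 = 1 mod 7)
```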

Page 32:

Example: EQ

Denote Alice's input A = a0 a1 … a(n−1), and Bob's input B = b0 b1 … b(n−1).

Let's view these inputs as polynomials over GF[p], where p is a prime with n² < p < 2n²:

A(x) = a0 + a1·x + a2·x² + … + a(n−1)·x^(n−1) mod p
B(x) = b0 + b1·x + b2·x² + … + b(n−1)·x^(n−1) mod p

Alice picks t ∈ GF[p] uniformly at random and sends Bob both t and A(t). Bob outputs 1 if A(t) = B(t), and 0 otherwise.

#{bits exchanged} = O(log p) = O(log n)
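
A runnable sketch of this protocol (the brute-force prime search and the function names are just for illustration):

```python
import random

def prime_above(m):
    # Smallest prime > m; Bertrand's postulate guarantees one below 2m.
    m += 1
    while any(m % d == 0 for d in range(2, int(m ** 0.5) + 1)):
        m += 1
    return m

def eq_protocol(a_bits, b_bits):
    # Alice sends (t, A(t)); Bob outputs 1 iff A(t) = B(t) over GF[p].
    n = len(a_bits)
    p = prime_above(n * n)       # n^2 < p < 2n^2
    t = random.randrange(p)      # Alice's random evaluation point
    A_t = sum(a * pow(t, i, p) for i, a in enumerate(a_bits)) % p
    B_t = sum(b * pow(t, i, p) for i, b in enumerate(b_bits)) % p
    return 1 if A_t == B_t else 0

x = [1, 0, 1, 1, 0, 0, 1, 0]
print(eq_protocol(x, x))         # always 1
print(eq_protocol(x, x[::-1]))   # 0 with probability >= 1 - 1/n
```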

Page 33:

Correctness

Note that if A = B, then A(t) = B(t) for every t, so the protocol always outputs 1.

If A ≠ B, we have two distinct polynomials of degree at most n−1.

Such polynomials can agree on at most n−1 (out of p) elements of the field, since their difference is a nonzero polynomial of degree ≤ n−1, which has at most n−1 roots.

Hence the probability of error is at most (n−1)/p ≤ n/n² = 1/n.

So we have shown that R(EQ) = O(log n) – and in fact the error is one-sided (the protocol never errs when A = B), so the one-sided measure is O(log n) as well.

In contrast, D(EQ) = D(NE) = n + 1.

Page 34:

Exercise

Prove that the following protocol for EQ achieves similar performance:

Alice and Bob view their inputs A and B as n-bit integers (between 1 and 2^n).

Alice chooses a prime p uniformly at random from among the first n primes. (If n = 3, then p is 2, 3 or 5.)

She sends p and (A mod p) to Bob.

Bob outputs 1 if A mod p = B mod p, and 0 otherwise.
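
A runnable sketch of this variant too (the trial-division helper is illustrative):

```python
import random

def first_n_primes(n):
    # The n smallest primes, by trial division (illustration only).
    primes, c = [], 2
    while len(primes) < n:
        if all(c % q for q in primes):
            primes.append(c)
        c += 1
    return primes

def eq_mod_prime(A, B, n):
    # Alice sends (p, A mod p); Bob outputs 1 iff A = B (mod p).
    p = random.choice(first_n_primes(n))
    return 1 if A % p == B % p else 0  # errs only if p divides A - B

n = 16
A, B = 0xBEEF, 0xFACE  # two distinct 16-bit inputs
print(eq_mod_prime(A, A, n))  # always 1
print(eq_mod_prime(A, B, n))  # errs w.p. <= 1/2; repeat to amplify
```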

Page 35:

Solution

First note that if A = B, then of course A mod p = B mod p.

If A ≠ B and Bob accepts, it means that A ≡ B (mod p).

Define BAD(A,B) = { p : p is among the first n primes and A ≡ B (mod p) }.

For example, BAD(5,8) = {3}.

Claim (without proof): |BAD(A,B)| ≤ ½n. So when A ≠ B, the probability that Bob accepts is at most ½.

We can repeat the protocol 100 times to get error 2^(−100).

So we get R(EQ) = O(log n): the largest of the first n primes is less than n³, so p and (A mod p) each take O(log n³) = O(log n) bits.

Page 36:

So how much better are randomized protocols than deterministic ones?

Lemma: D(f) ≤ 2^O(R(f)) – at most an exponential gap.

We will prove a more delicate statement: if f has a randomized protocol of cost c with error ε < ½, then D(f) = O(2^c · (c + log(1/(½ − ε)))).

Page 37:

Proof 1:

We present a deterministic simulation of a given randomized protocol P.

For each leaf l of the protocol P, Alice sends Bob p(A, l) – the probability that, given x, she would respond in a way leading to l.

Bob computes p(B, l) – the probability that, given y, he would respond in a way leading to l.

Bob then computes p(l) = p(A, l) · p(B, l) – the probability of reaching l. (The probabilities multiply because r(A) and r(B) are independent.)

Page 38:

Proof 2: Alice and Bob do this for every leaf, thus computing the probability of reaching each leaf.

Bob then checks which output value has total probability 1 − ε, and that is the right f(x,y).

What's the problem??

We would need to send exact probability values, i.e., real numbers.

But we can use a precision of O(c + log(1/(½ − ε))) bits per value, where c is the cost of P (so P has at most 2^c leaves).

Page 39:

Proof 3: This guarantees a deviation of at most (½ − ε)/2^c in each transmitted value.

This implies that each computed p(l) is at most (½ − ε)/2^c away from the true p(l) (since p(B, l) ≤ 1).

So the total error over all (at most 2^c) leaves is at most ½ − ε.

Therefore Bob only needs to check which of the values (0 or 1) has probability more than ½, and that is the correct f(x,y) value.

Q.E.D.
