Probabilistic Analysis and Randomized Algorithm


Worst-case analysis

Probabilistic analysis: requires knowledge of the distribution of the inputs

Indicator random variables: Given a sample space S and an event A, the indicator random variable I{A} associated with event A is defined as

\[ I\{A\} = \begin{cases} 1 & \text{if } A \text{ occurs} \\ 0 & \text{otherwise} \end{cases} \]

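E.g.: Consider flipping a fair coin. The sample space is S = {H, T}. Define a random variable Y with Pr{Y = H} = Pr{Y = T} = 1/2. We can define an indicator r.v. $X_H$ associated with the coin coming up heads, i.e. Y = H:

\[ X_H = I\{Y = H\} = \begin{cases} 1 & \text{if } Y = H \\ 0 & \text{if } Y = T \end{cases} \]

\[ E[X_H] = E[I\{Y = H\}] = 1 \cdot \Pr\{Y = H\} + 0 \cdot \Pr\{Y = T\} = \frac{1}{2} \]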

Lemma: Given a sample space S and an event A in the sample space S, let $X_A = I\{A\}$. Then $E[X_A] = \Pr\{A\}$.

Proof:

\[ E[X_A] = E[I\{A\}] = 1 \cdot \Pr\{A\} + 0 \cdot \Pr\{\bar{A}\} = \Pr\{A\} \]
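The lemma can be checked on the coin example with a few lines of Python. This is only a sketch: the helper names `indicator` and `expectation` and the dictionary `coin` are made up for illustration and do not come from the slides.

```python
from fractions import Fraction

def indicator(event, outcome):
    """I{A}: 1 if the outcome lies in the event A, 0 otherwise."""
    return 1 if outcome in event else 0

def expectation(rv, distribution):
    """E[X] = sum over outcomes s of X(s) * Pr{s}."""
    return sum(rv(s) * p for s, p in distribution.items())

# Fair coin: S = {H, T}, each outcome has probability 1/2.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
heads = {"H"}  # the event A = {Y = H}

e_xh = expectation(lambda s: indicator(heads, s), coin)
pr_heads = sum(p for s, p in coin.items() if s in heads)
print(e_xh, pr_heads)
```

Both printed values agree, as the lemma predicts for any event over a finite sample space.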

Hire-Assistant(n)
    best = 0    // candidate 0 is a least-qualified dummy candidate
    for i = 1 to n
        interview candidate i
        if candidate i is better than candidate best
            best = i
            hire candidate i

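Let $X_i = I\{\text{candidate } i \text{ is hired}\}$ and $X = X_1 + X_2 + \cdots + X_n$.

Lemma: Assuming that the candidates are presented in a random order, algorithm Hire-Assistant has an average-case total hiring cost of $O(c_h \ln n)$, where $c_h$ is the cost of hiring one candidate.

Proof: In a random order, candidate i is hired exactly when it is the best of the first i candidates, so

\[ E[X_i] = \Pr\{\text{candidate } i \text{ is hired}\} = 1/i \]

\[ E[X] = \sum_{i=1}^{n} E[X_i] = 1 + 1/2 + 1/3 + \cdots + 1/n = \ln n + O(1). \]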
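The harmonic-sum result can be verified exactly for small n by enumerating every interview order. The sketch below assumes candidates are represented by distinct numeric scores (higher is better); the function names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

def hire_assistant(scores):
    """Return how many candidates are hired when interviewed in the given order."""
    best = float("-inf")  # plays the role of the dummy candidate 0
    hires = 0
    for score in scores:
        if score > best:
            best = score
            hires += 1
    return hires

def average_hires(n):
    """Exact average number of hires over all n! orderings of distinct scores."""
    orders = list(permutations(range(1, n + 1)))
    return Fraction(sum(hire_assistant(p) for p in orders), len(orders))

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

print(average_hires(4), harmonic(4))
```

Multiplying the average number of hires by the per-hire cost gives the average-case hiring cost stated in the lemma.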

Randomized-Hire-Assistant(n)
    randomly permute the list of candidates
    best = 0
    for i = 1 to n
        interview candidate i
        if candidate i is better than candidate best
            best = i
            hire candidate i

Lemma: The expected hiring cost of the algorithm Randomized-Hire-Assistant is $O(c_h \ln n)$.

Permute-By-Sorting(A)
    n = A.length
    let P[1..n] be a new array
    for i = 1 to n
        P[i] = Random(1, n^3)
    sort A, using P as sort keys

After sorting, if P[i] is the j-th smallest one, then A[i] lies in position j of the output.

Lemma: Procedure Permute-By-Sorting produces a uniform random permutation of the input, assuming that all entries of P (the priorities) are distinct.

Proof: Define event $E_i$: A[i] receives the i-th smallest priority.

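\[ \Pr\{E_1 \cap E_2 \cap \cdots \cap E_{n-1} \cap E_n\} = \Pr\{E_1\} \, \Pr\{E_2 \mid E_1\} \, \Pr\{E_3 \mid E_1 \cap E_2\} \cdots \Pr\{E_n \mid E_1 \cap E_2 \cap \cdots \cap E_{n-1}\} \]

\[ \Pr\{E_1\} = \frac{1}{n}, \qquad \Pr\{E_2 \mid E_1\} = \frac{1}{n-1}, \qquad \Pr\{E_i \mid E_1 \cap E_2 \cap \cdots \cap E_{i-1}\} = \frac{1}{n-i+1} \]

\[ \Pr\{E_1 \cap E_2 \cap \cdots \cap E_{n-1} \cap E_n\} = \frac{1}{n} \cdot \frac{1}{n-1} \cdots \frac{1}{2} \cdot \frac{1}{1} = \frac{1}{n!}, \]

which is the probability of obtaining the identity permutation.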

It holds for any permutation.
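A minimal Python sketch of Permute-By-Sorting follows. It returns a new list rather than sorting A in place, and the optional `rng` parameter is an assumption added here so that a seeded generator can be passed in. If two priorities happen to collide, the stable sort still returns a valid permutation, but the uniformity argument above no longer applies.

```python
import random

def permute_by_sorting(a, rng=random):
    """Return a new list: the items of `a` ordered by random priorities."""
    n = len(a)
    # Random(1, n^3): the large range makes duplicate priorities unlikely.
    priorities = [rng.randint(1, n ** 3) for _ in range(n)]
    # Sort the indices by priority, then read the items off in that order.
    order = sorted(range(n), key=lambda i: priorities[i])
    return [a[i] for i in order]

print(permute_by_sorting(["a", "b", "c", "d"], random.Random(0)))
```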

Randomize-In-Place(A): a better method
    n = A.length
    for i = 1 to n
        swap A[i] with A[Random(i, n)]

Lemma: The above procedure computes a uniform random permutation.
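The uniformity claim can be checked exhaustively for small inputs. In the sketch below, `pick` stands in for Random(i, n) with 0-based indices, and `all_outcomes` (a hypothetical helper, not from the slides) replays the procedure once for each of the n! equally likely sequences of random choices.

```python
import random
from itertools import product

def randomize_in_place(a, pick=random.randint):
    """Shuffle list `a` in place; pick(lo, hi) returns an index in [lo, hi]."""
    n = len(a)
    for i in range(n):
        j = pick(i, n - 1)  # 0-based analogue of Random(i, n)
        a[i], a[j] = a[j], a[i]
    return a

def all_outcomes(items):
    """Run the procedure once for every possible sequence of random choices."""
    n = len(items)
    results = []
    for choices in product(*(range(i, n) for i in range(n))):
        it = iter(choices)
        shuffled = randomize_in_place(list(items), lambda lo, hi: next(it))
        results.append(tuple(shuffled))
    return results

outcomes = all_outcomes("abc")
print(len(outcomes), len(set(outcomes)))
```

Every choice sequence yields a different permutation, so each permutation arises with probability 1/n!.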

The birthday paradox: How many people must there be in a room before there is a 50% chance that two of them were born on the same day of the year?

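(1) Suppose there are k people and n days in a year. Let $b_i$ be the i-th person's birthday, i = 1,…,k.

\[ \Pr\{b_i = r\} = 1/n \quad \text{for } i = 1,\ldots,k \text{ and } r = 1,2,\ldots,n \]

\[ \Pr\{b_i = r, \, b_j = r\} = \Pr\{b_i = r\} \cdot \Pr\{b_j = r\} = 1/n^2 \]

\[ \Pr\{b_i = b_j\} = \sum_{r=1}^{n} \Pr\{b_i = r, \, b_j = r\} = \sum_{r=1}^{n} \frac{1}{n^2} = \frac{1}{n} \]

Define event $A_i$: person i's birthday is different from person j's for all j < i.

Let $B_k = \bigcap_{i=1}^{k} A_i$ be the event that k people have distinct birthdays, so that $B_k = A_k \cap B_{k-1}$ and

\[ \Pr\{B_k\} = \Pr\{B_{k-1} \cap A_k\} = \Pr\{B_{k-1}\} \Pr\{A_k \mid B_{k-1}\}, \quad \text{where } \Pr\{B_1\} = \Pr\{A_1\} = 1. \]

Unrolling the recurrence:

\[ \Pr\{B_k\} = \Pr\{B_{k-1}\} \Pr\{A_k \mid B_{k-1}\} = \Pr\{B_{k-2}\} \Pr\{A_{k-1} \mid B_{k-2}\} \Pr\{A_k \mid B_{k-1}\} = \cdots = \Pr\{B_1\} \Pr\{A_2 \mid B_1\} \Pr\{A_3 \mid B_2\} \cdots \Pr\{A_k \mid B_{k-1}\} \]

\[ = 1 \cdot \left(\frac{n-1}{n}\right) \left(\frac{n-2}{n}\right) \cdots \left(\frac{n-k+1}{n}\right) = 1 \cdot \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) \]

\[ \le e^{-1/n} e^{-2/n} \cdots e^{-(k-1)/n} = e^{-\sum_{i=1}^{k-1} i/n} = e^{-k(k-1)/2n} \le \frac{1}{2}, \]

where the first inequality uses $1 + x \le e^x$ and the last holds when $-k(k-1)/2n \le \ln(1/2)$.

The probability that all k birthdays are distinct is therefore at most 1/2 when $k(k-1) \ge 2n \ln 2$, i.e. when $k \ge \left(1 + \sqrt{1 + (8 \ln 2)\, n}\right)/2$. For n = 365, we have k ≥ 23.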
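The threshold can be reproduced numerically. The sketch below computes the exact product for Pr{B_k} and compares the smallest qualifying k with the value obtained from the bound k(k-1) ≥ 2n ln 2; the function names are illustrative only.

```python
import math

def prob_all_distinct(k, n=365):
    """Pr{B_k}: k people all have different birthdays among n days."""
    p = 1.0
    for i in range(1, k):
        p *= 1 - i / n
    return p

def smallest_k_exact(n=365):
    """Smallest k with Pr{B_k} <= 1/2."""
    k = 1
    while prob_all_distinct(k, n) > 0.5:
        k += 1
    return k

def smallest_k_from_bound(n=365):
    """Smallest integer k with k(k-1) >= 2 n ln 2."""
    return math.ceil((1 + math.sqrt(1 + 8 * math.log(2) * n)) / 2)

print(smallest_k_exact(), smallest_k_from_bound())
```

For n = 365 the exact computation and the exponential bound happen to give the same threshold.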

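(2) Analysis using indicator random variables: For each pair (i, j) of the k people in the room, define the indicator r.v. $X_{ij}$, for 1 ≤ i < j ≤ k, by

\[ X_{ij} = I\{\text{person } i \text{ and person } j \text{ have the same birthday}\} = \begin{cases} 1 & \text{if person } i \text{ and person } j \text{ have the same birthday} \\ 0 & \text{otherwise} \end{cases} \]

\[ E[X_{ij}] = \Pr\{\text{person } i \text{ and person } j \text{ have the same birthday}\} = 1/n \]

Let $X = \sum_{i=1}^{k} \sum_{j=i+1}^{k} X_{ij}$. Then

\[ E[X] = E\left[\sum_{i=1}^{k} \sum_{j=i+1}^{k} X_{ij}\right] = \sum_{i=1}^{k} \sum_{j=i+1}^{k} E[X_{ij}] = \binom{k}{2} \frac{1}{n} = \frac{k(k-1)}{2n}. \]

When k(k-1) ≥ 2n, the expected number of pairs of people with the same birthday is at least 1:

\[ k^2 - k - 2n \ge 0 \;\Rightarrow\; k \ge \frac{1 + \sqrt{1 + 8n}}{2}. \]

For n = 365 and k ≥ 28, we expect to find at least one matching pair.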
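As a quick check of the indicator-variable result, the following sketch evaluates E[X] = k(k-1)/(2n) exactly and searches for the smallest k at which it reaches 1. The function names are hypothetical.

```python
from fractions import Fraction

def expected_matching_pairs(k, n=365):
    """E[X] = C(k, 2) / n = k(k-1) / (2n)."""
    return Fraction(k * (k - 1), 2 * n)

def smallest_k_with_expected_pair(n=365):
    """Smallest k for which the expected number of matching pairs is >= 1."""
    k = 1
    while expected_matching_pairs(k, n) < 1:
        k += 1
    return k

print(smallest_k_with_expected_pair(), expected_matching_pairs(28))
```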

Balls and bins problem: Randomly toss identical balls into b bins, numbered 1,2,…,b. The probability that a tossed ball lands in any given bin is 1/b.

(a) How many balls fall in a given bin? If n balls are tossed, the expected number of balls that fall in the given bin is n/b.

(b) How many balls must one toss, on the average, until a given bin contains a ball? The number of tosses follows a geometric distribution with success probability 1/b, so the expected number of tosses is

\[ \sum_{i=1}^{\infty} i \left(1 - \frac{1}{b}\right)^{i-1} \frac{1}{b} = \frac{1}{b} \left[ 1 + 2\left(1 - \frac{1}{b}\right) + 3\left(1 - \frac{1}{b}\right)^2 + \cdots \right] = \frac{1}{b} \cdot \frac{1}{\left(1 - (1 - \frac{1}{b})\right)^2} = b. \]

(c) (Coupon collector’s problem) How many balls must one toss until every bin contains at least one ball?

Call a toss in which the ball lands in an empty bin a hit. We want to know the expected number n of tosses required to get b hits.

The ith stage consists of the tosses after the (i-1)st hit until the ith hit

For each toss during the ith stage, there are i-1 bins that contain balls and b-i+1 empty bins

Thus, for each toss in the ith stage, the probability of obtaining a hit is (b-i+1)/b

Let $n_i$ be the number of tosses in the ith stage. Thus the number of tosses required to get b hits is $n = \sum_{i=1}^{b} n_i$.

Each $n_i$ has a geometric distribution with probability of success (b-i+1)/b, so $E[n_i] = b/(b-i+1)$.

\[ E[n] = E\left[\sum_{i=1}^{b} n_i\right] = \sum_{i=1}^{b} E[n_i] = \sum_{i=1}^{b} \frac{b}{b-i+1} = b \sum_{i=1}^{b} \frac{1}{i} = b(\ln b + O(1)) = O(b \ln b) \]
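The stage-by-stage sum can be evaluated exactly, and a small simulation shows the process it describes. This is a sketch under the assumption that bins are numbered 0 to b-1 and that `rng` is a `random.Random` instance; neither detail comes from the slides.

```python
from fractions import Fraction
import random

def expected_tosses(b):
    """E[n] = sum over stages i = 1..b of b / (b - i + 1), which equals b * H_b."""
    return sum(Fraction(b, b - i + 1) for i in range(1, b + 1))

def simulate_tosses(b, rng):
    """Toss balls uniformly at random until every one of the b bins is occupied."""
    occupied = set()
    tosses = 0
    while len(occupied) < b:
        occupied.add(rng.randrange(b))
        tosses += 1
    return tosses

rng = random.Random(1)
trials = [simulate_tosses(10, rng) for _ in range(2000)]
print(float(expected_tosses(10)), sum(trials) / len(trials))
```

The simulated average should land close to the exact expectation, though the precise value depends on the seed.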

Streaks

Flip a fair coin n times. What is the longest streak of consecutive heads? Ans: Θ(lg n)

Let $A_{ik}$ be the event that a streak of heads of length at least k begins with the ith coin flip.

For j = 0,1,2,…,n, let $L_j$ be the event that the longest streak of heads has length exactly j, and let L be the length of the longest streak.

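First, the upper bound. Since the flips are independent,

\[ \Pr\{A_{ik}\} = 1/2^k. \]

For $k = 2\lceil \lg n \rceil$,

\[ \Pr\{A_{i,2\lceil \lg n \rceil}\} = 1/2^{2\lceil \lg n \rceil} \le 1/2^{2 \lg n} = 1/n^2. \]

A streak of that length can begin in at most $n - 2\lceil \lg n \rceil + 1$ positions, so the probability that a streak of heads of length at least $2\lceil \lg n \rceil$ begins anywhere is

\[ \Pr\left\{ \bigcup_{i=1}^{n - 2\lceil \lg n \rceil + 1} A_{i,2\lceil \lg n \rceil} \right\} \le \sum_{i=1}^{n - 2\lceil \lg n \rceil + 1} \frac{1}{n^2} < \sum_{i=1}^{n} \frac{1}{n^2} = \frac{1}{n}. \]

Note that the events $L_j$ for j = 0,1,…,n are disjoint, so this probability equals $\sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\}$. Thus,

\[ \sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} < \frac{1}{n}, \quad \text{while} \quad \sum_{j=0}^{n} \Pr\{L_j\} = 1 \text{ gives } \sum_{j=0}^{2\lceil \lg n \rceil - 1} \Pr\{L_j\} \le 1. \]

We have

\[ E[L] = \sum_{j=0}^{n} j \Pr\{L_j\} = \sum_{j=0}^{2\lceil \lg n \rceil - 1} j \Pr\{L_j\} + \sum_{j=2\lceil \lg n \rceil}^{n} j \Pr\{L_j\} \]

\[ < \sum_{j=0}^{2\lceil \lg n \rceil - 1} (2\lceil \lg n \rceil) \Pr\{L_j\} + \sum_{j=2\lceil \lg n \rceil}^{n} n \Pr\{L_j\} \]

\[ = 2\lceil \lg n \rceil \sum_{j=0}^{2\lceil \lg n \rceil - 1} \Pr\{L_j\} + n \sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} \]

\[ < 2\lceil \lg n \rceil \cdot 1 + n \cdot (1/n) = O(\lg n). \]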

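More generally, for any constant r ≥ 1,

\[ \Pr\{A_{i,r\lceil \lg n \rceil}\} = 1/2^{r\lceil \lg n \rceil} \le 1/n^r, \]

so the probability is at most $n/n^r = 1/n^{r-1}$ that the longest streak is at least $r\lceil \lg n \rceil$.

Claim: The expected length of the longest streak of heads in n coin flips is Ω(lg n).

We look for streaks of length s by partitioning the n flips into approximately n/s groups of s flips each. Take $s = \lfloor (\lg n)/2 \rfloor$.

The probability that the group beginning in position i comes up all heads is

\[ \Pr\{A_{i,\lfloor (\lg n)/2 \rfloor}\} = 1/2^{\lfloor (\lg n)/2 \rfloor} \ge 1/\sqrt{n}, \]

so the probability that a streak of heads of length at least $\lfloor (\lg n)/2 \rfloor$ does not begin in position i is at most $1 - 1/\sqrt{n}$.

The groups are formed from mutually exclusive, independent coin flips, so the probability that every one of the groups fails to be a streak of length $\lfloor (\lg n)/2 \rfloor$ is at most

\[ \left(1 - 1/\sqrt{n}\right)^{\lfloor n / \lfloor (\lg n)/2 \rfloor \rfloor} \le \left(1 - 1/\sqrt{n}\right)^{n / \lfloor (\lg n)/2 \rfloor - 1} \le \left(1 - 1/\sqrt{n}\right)^{2n/\lg n - 1} \le e^{-(2n/\lg n - 1)/\sqrt{n}} = O(e^{-\lg n}) = O(1/n), \]

using $1 + x \le e^x$ and the fact that $(2n/\lg n - 1)/\sqrt{n} \ge \lg n$ for sufficiently large n.

Thus, the probability that the longest streak is at least $\lfloor (\lg n)/2 \rfloor$ is

\[ \sum_{j=\lfloor (\lg n)/2 \rfloor}^{n} \Pr\{L_j\} \ge 1 - O(1/n). \]

Why? If any one group comes up all heads, the longest streak has length at least $\lfloor (\lg n)/2 \rfloor$, so the longest streak can be shorter only if every group fails.

\[ E[L] = \sum_{j=0}^{n} j \Pr\{L_j\} = \sum_{j=0}^{\lfloor (\lg n)/2 \rfloor - 1} j \Pr\{L_j\} + \sum_{j=\lfloor (\lg n)/2 \rfloor}^{n} j \Pr\{L_j\} \]

\[ \ge \sum_{j=0}^{\lfloor (\lg n)/2 \rfloor - 1} 0 \cdot \Pr\{L_j\} + \sum_{j=\lfloor (\lg n)/2 \rfloor}^{n} \lfloor (\lg n)/2 \rfloor \Pr\{L_j\} \]

\[ = \lfloor (\lg n)/2 \rfloor \sum_{j=\lfloor (\lg n)/2 \rfloor}^{n} \Pr\{L_j\} \]

\[ \ge \lfloor (\lg n)/2 \rfloor \left(1 - O(1/n)\right) = \Omega(\lg n). \]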

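Using indicator r.v.:

Let $X_{ik} = I\{A_{ik}\}$ and let $X = \sum_{i=1}^{n-k+1} X_{ik}$.

\[ E[X] = E\left[\sum_{i=1}^{n-k+1} X_{ik}\right] = \sum_{i=1}^{n-k+1} E[X_{ik}] = \sum_{i=1}^{n-k+1} \Pr\{A_{ik}\} = \sum_{i=1}^{n-k+1} \frac{1}{2^k} = \frac{n-k+1}{2^k} \]

If k = c lg n, for some constant c,

\[ E[X] = \frac{n - c \lg n + 1}{2^{c \lg n}} = \frac{n - c \lg n + 1}{n^c} = \frac{1}{n^{c-1}} - \frac{c \lg n - 1}{n^c} = \Theta(1/n^{c-1}). \]

If c is large, the expected number of streaks of length c lg n is very small, so a streak of such a length is unlikely to occur.

If c = 1/2, then we obtain $E[X] = \Theta(1/n^{1/2 - 1}) = \Theta(n^{1/2})$, and we expect that there will be a large number of streaks of length (1/2) lg n. Therefore, one streak of such a length is very likely to occur.

Conclusion: The length of the longest streak is Θ(lg n). ■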
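For small n the quantities in the streak analysis can be computed exactly by brute force. The sketch below assumes flips are encoded as the characters "H" and "T"; all function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

def longest_streak(flips):
    """Length of the longest run of consecutive heads ('H') in `flips`."""
    best = run = 0
    for f in flips:
        run = run + 1 if f == "H" else 0
        best = max(best, run)
    return best

def expected_longest_streak(n):
    """Exact E[L], enumerating all 2^n equally likely flip sequences."""
    total = sum(longest_streak(seq) for seq in product("HT", repeat=n))
    return Fraction(total, 2 ** n)

def expected_streak_starts(n, k):
    """E[X] = (n - k + 1) / 2^k: expected number of positions where k heads begin."""
    return Fraction(n - k + 1, 2 ** k)

for n in (4, 8, 16):
    print(n, float(expected_longest_streak(n)))
```

The enumeration is exponential in n, so it only illustrates the slow growth of E[L] for small inputs and does not replace the asymptotic argument.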

The on-line hiring problem:

To hire an assistant, an employment agency sends one candidate each day. After interviewing that person you decide to either hire that person or not. The process stops when a person is hired.

What is the trade-off between minimizing the

amount of interviewing and maximizing the quality of the candidate hired?

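Strategy: interview and reject the first k applicants, then hire the first applicant thereafter whose score is higher than that of every applicant seen so far; if no such applicant appears, hire the last one. What is the best k?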

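Let $M(j) = \max_{1 \le i \le j} \{score(i)\}$.

Let S be the event that the best-qualified applicant is chosen.

Let $S_i$ be the event that the best-qualified applicant is chosen and is the i-th one interviewed.

The $S_i$ are disjoint and we have $\Pr\{S\} = \sum_{i=1}^{n} \Pr\{S_i\}$.

If the best-qualified applicant is one of the first k, it is never chosen, so $\Pr\{S_i\} = 0$ for i = 1,…,k and thus

$\Pr\{S\} = \sum_{i=k+1}^{n} \Pr\{S_i\}$.

Let $B_i$ be the event that the best-qualified applicant is in position i.

Let $O_i$ denote the event that none of the applicants in positions k+1 through i-1 are chosen.

$S_i$ happens exactly when $B_i$ and $O_i$ both happen.

$B_i$ and $O_i$ are independent! Why? $O_i$ depends only on the relative order of the applicants in positions 1 through i-1, whereas $B_i$ depends only on whether the applicant in position i is better than all the others.

$\Pr\{S_i\} = \Pr\{B_i \cap O_i\} = \Pr\{B_i\} \Pr\{O_i\}$.

Clearly, $\Pr\{B_i\} = 1/n$.

$\Pr\{O_i\} = k/(i-1)$. Why? $O_i$ happens exactly when the best of the first i-1 applicants is among the first k, and that applicant is equally likely to be in any of the i-1 positions.

Thus $\Pr\{S_i\} = k/(n(i-1))$.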

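\[ \Pr\{S\} = \sum_{i=k+1}^{n} \Pr\{S_i\} = \sum_{i=k+1}^{n} \frac{k}{n(i-1)} = \frac{k}{n} \sum_{i=k+1}^{n} \frac{1}{i-1} = \frac{k}{n} \sum_{i=k}^{n-1} \frac{1}{i} \]

Bounding the sum by integrals,

\[ \int_{k}^{n} \frac{dx}{x} \le \sum_{i=k}^{n-1} \frac{1}{i} \le \int_{k-1}^{n-1} \frac{dx}{x}, \]

so

\[ \frac{k}{n} (\ln n - \ln k) \le \Pr\{S\} \le \frac{k}{n} (\ln(n-1) - \ln(k-1)). \]

Differentiate $\frac{k}{n} (\ln n - \ln k)$ with respect to k. We have

\[ \frac{1}{n} (\ln n - \ln k - 1) = 0. \]

Thus k = n/e and $\Pr\{S\} \ge 1/e$.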
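The closed form for Pr{S} can be compared against brute-force enumeration for small n. The sketch below assumes applicants are given as distinct integer scores 1..n (n is the best) and that the last applicant is hired when nobody beats the first k; `online_hire`, `success_probability` and `formula` are hypothetical names.

```python
from fractions import Fraction
from itertools import permutations
import math

def online_hire(scores, k):
    """Reject the first k applicants, then hire the first one who beats them all.
    Returns the index of the hired applicant (the last one if nobody qualifies)."""
    best_seen = max(scores[:k], default=float("-inf"))
    for i in range(k, len(scores)):
        if scores[i] > best_seen:
            return i
    return len(scores) - 1

def success_probability(n, k):
    """Exact Pr{S} over all n! orderings, by brute-force enumeration."""
    wins = sum(p[online_hire(p, k)] == n for p in permutations(range(1, n + 1)))
    return Fraction(wins, math.factorial(n))

def formula(n, k):
    """Pr{S} = (k/n) * sum_{i=k}^{n-1} 1/i, valid for 1 <= k < n."""
    return Fraction(k, n) * sum(Fraction(1, i) for i in range(k, n))

n = 6
best_k = max(range(1, n), key=lambda k: formula(n, k))
print(best_k, round(n / math.e, 2), float(formula(n, best_k)))
```

For small n the best integer k sits near n/e, and the success probability stays above 1/e, in line with the lower bound derived above.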