Probabilistic analysis
Wooram Heo
The birthday paradox
• How many people must there be in a room before there is a 50% chance that two of them were born on the same day of the year?
• Index the people with integers 1, 2, …, k
• $b_i$ : the day of the year on which person i’s birthday falls, where $1 \le b_i \le n$
• Birthdays are uniformly distributed across the n days of the year and are independent: $\Pr\{b_i = r\} = 1/n$ for r = 1, 2, …, n
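The birthday paradox
• Then, the prob. that i’s birthday and j’s birthday both fall on day r is $\Pr\{b_i = r \text{ and } b_j = r\} = \Pr\{b_i = r\}\,\Pr\{b_j = r\} = 1/n^2$
• Thus, the prob. that they both fall on the same day is $\Pr\{b_i = b_j\} = \sum_{r=1}^{n} \Pr\{b_i = r \text{ and } b_j = r\} = \sum_{r=1}^{n} 1/n^2 = 1/n$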
The birthday paradox
• Pr{at least 2 out of k people have matching birthdays} = 1 – Pr{all k people have distinct birthdays}
• $A_i$ : event that i’s birthday is different from j’s birthday for all j < i
• $B_k$ : event that k people have distinct birthdays, i.e. $B_k = \bigcap_{i=1}^{k} A_i$
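The birthday paradox
• Since $B_k = A_k \cap B_{k-1}$, we have $\Pr\{B_k\} = \Pr\{B_{k-1}\}\,\Pr\{A_k \mid B_{k-1}\}$, with $\Pr\{B_1\} = 1$
• If $B_{k-1}$ holds, the first k – 1 birthdays occupy k – 1 distinct days, so $\Pr\{A_k \mid B_{k-1}\} = (n - k + 1)/n$
• Hence $\Pr\{B_k\} = \prod_{i=1}^{k-1} (1 - i/n) \le \prod_{i=1}^{k-1} e^{-i/n} = e^{-k(k-1)/2n}$, using $1 + x \le e^x$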
The birthday paradox
• Prob. that all k birthdays are distinct is at most ½ when $k(k-1) \ge 2n \ln 2$, since that prob. is at most $e^{-k(k-1)/2n}$
• For n = 365, this holds when k ≥ 23
• Thus, if at least 23 people are in a room, the prob. is at least ½ that two of them have the same birthday
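The threshold of 23 can also be checked against the exact product rather than the exponential bound. The sketch below is illustrative only: the function names `prob_all_distinct` and `smallest_group` are not from the slides, and it assumes uniform, independent birthdays over n days.

```python
from fractions import Fraction


def prob_all_distinct(k: int, n: int = 365) -> Fraction:
    """Exact probability that k people have pairwise distinct birthdays
    when each birthday is uniform and independent over n days."""
    p = Fraction(1)
    for i in range(k):
        # Person i + 1 must avoid the i days already taken.
        p *= Fraction(n - i, n)
    return p


def smallest_group(n: int = 365) -> int:
    """Smallest k for which the probability of a shared birthday is at least 1/2."""
    k = 1
    while prob_all_distinct(k, n) > Fraction(1, 2):
        k += 1
    return k


if __name__ == "__main__":
    k = smallest_group(365)
    print(k, float(1 - prob_all_distinct(k)))
```

Exact rational arithmetic is used so that the comparison with ½ is not affected by floating-point rounding.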
Balls and bins
• Consider the process of randomly tossing identical balls into b bins, numbered 1, 2, …, b
• Tosses are independent.
• Prob. that a tossed ball lands in any given bin is 1/b
• With respect to a given bin, the ball-tossing process is a sequence of Bernoulli trials, each with success prob. 1/b
• Useful for analyzing hashing
Balls and bins
• How many balls must one toss until every bin contains at least one ball?
• Call a toss in which a ball falls into an empty bin a “hit”
• What is the expected number n of tosses required to get b hits?
• The hits can be used to partition the n tosses into stages. The i th stage consists of the tosses after the (i – 1)st hit up to and including the i th hit.
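Balls and bins
• For each toss during the i th stage, the prob. of obtaining a hit is (b – i + 1) / b, since i – 1 bins already contain balls and b – i + 1 are still empty
• $n_i$ : the number of tosses in the i th stage, so that $n = \sum_{i=1}^{b} n_i$
• Each $n_i$ has a geometric distribution with success prob. (b – i + 1) / b, so $E[n_i] = b / (b - i + 1)$
[Figure: the sequence of tosses divided into stage 1, stage 2, stage 3, …, stage b]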
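Balls and bins
• By linearity of expectation, $E[n] = \sum_{i=1}^{b} E[n_i] = \sum_{i=1}^{b} \frac{b}{b - i + 1} = b \sum_{i=1}^{b} \frac{1}{i} = b(\ln b + O(1))$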
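The stage argument can be compared with a direct simulation. This is a minimal sketch under the slide's assumptions (independent tosses, each bin equally likely); the names `expected_tosses` and `simulate_tosses` are made up for illustration.

```python
import random
from fractions import Fraction


def expected_tosses(b: int) -> Fraction:
    """Sum of the stage expectations b / (b - i + 1) for i = 1, ..., b."""
    return sum(Fraction(b, b - i + 1) for i in range(1, b + 1))


def simulate_tosses(b: int, rng: random.Random) -> int:
    """Toss balls uniformly at random until every one of the b bins is non-empty."""
    filled = set()
    tosses = 0
    while len(filled) < b:
        filled.add(rng.randrange(b))  # a "hit" only when the bin was empty
        tosses += 1
    return tosses


if __name__ == "__main__":
    rng = random.Random(0)
    b, trials = 10, 20000
    average = sum(simulate_tosses(b, rng) for _ in range(trials)) / trials
    print(float(expected_tosses(b)), average)
```

With enough trials the simulated average should settle near the exact sum, which grows like b ln b.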
Streaks
• Suppose you flip a fair coin n times. The longest streak of consecutive heads that you expect to see is $\Theta(\lg n)$
• Proof consists of two steps: showing that the expected length is $O(\lg n)$ and that it is $\Omega(\lg n)$
• $A_{ik}$ : the event that a streak of heads of length at least k begins with the i th coin flip, i.e. coin flips i, i + 1, …, i + k – 1 yield only heads.
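Streaks
• Since the flips are mutually independent, $\Pr\{A_{ik}\} = 1/2^k$; for $k = 2\lceil \lg n \rceil$, $\Pr\{A_{i,2\lceil \lg n \rceil}\} = 1/2^{2\lceil \lg n \rceil} \le 1/2^{2 \lg n} = 1/n^2$
• Prob. that a streak of heads of length at least $2\lceil \lg n \rceil$ begins anywhere is $\Pr\{\bigcup_{i=1}^{n - 2\lceil \lg n \rceil + 1} A_{i,2\lceil \lg n \rceil}\} \le \sum_{i=1}^{n - 2\lceil \lg n \rceil + 1} 1/n^2 < 1/n$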
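Streaks
• $L_j$ : event that the longest streak of heads has length exactly j
• L : the length of the longest streak
• $E[L] = \sum_{j=0}^{n} j \Pr\{L_j\}$
• Events $L_j$ for j = 0, 1, …, n are disjoint, so the prob. that a streak of heads of length at least $2\lceil \lg n \rceil$ begins anywhere is $\sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} < 1/n$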
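Streaks
• Split the sum for E[L] at $j = 2\lceil \lg n \rceil$, using $\sum_{j=0}^{n} \Pr\{L_j\} = 1$ and $\sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} < 1/n$: $E[L] < 2\lceil \lg n \rceil \sum_{j=0}^{2\lceil \lg n \rceil - 1} \Pr\{L_j\} + n \sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} < 2\lceil \lg n \rceil \cdot 1 + n \cdot (1/n) = O(\lg n)$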
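A quick simulation gives a feel for how the longest streak of heads grows with n. This is only a sketch: the helper names are hypothetical, the coin is modeled with Python's pseudo-random generator, and the printed column 2⌈lg n⌉ is the threshold used in the upper-bound argument, not the expected value itself.

```python
import math
import random


def longest_head_streak(flips):
    """Return the length of the longest run of consecutive True (heads) values."""
    longest = current = 0
    for is_head in flips:
        current = current + 1 if is_head else 0
        longest = max(longest, current)
    return longest


def average_longest_streak(n, trials, rng):
    """Average the longest streak of heads over several runs of n fair flips."""
    total = 0
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(n)]
        total += longest_head_streak(flips)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (64, 1024, 4096):
        avg = average_longest_streak(n, 200, rng)
        # Columns: n, simulated average, threshold 2 * ceil(lg n).
        print(n, round(avg, 2), 2 * math.ceil(math.log2(n)))
```

The simulated averages should grow by roughly a constant each time n is multiplied by a constant factor, which is what logarithmic growth looks like.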
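The hiring problem
• The manager interviews n candidates one at a time and hires a candidate whenever that candidate is better than the current office assistant
• In the worst case, the manager hires every one of the n candidates, for a total hiring cost of $O(c_h n)$, where $c_h$ is the cost of one hire
• What is the expected number of times that the manager hires a new office assistant?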
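The hiring problem
• Assume the candidates arrive in uniformly random order, and let $X_i = I\{\text{candidate } i \text{ is hired}\}$
• $X = X_1 + X_2 + \dots + X_n$ : the number of times the manager hires
• Candidate i is hired exactly when it is the best of the first i candidates, so $E[X_i] = \Pr\{\text{candidate } i \text{ is hired}\} = 1/i$
• $E[X] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} 1/i = \ln n + O(1)$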
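Under the standard assumption that the candidates arrive in uniformly random order, the expected number of hires is the harmonic number 1 + 1/2 + … + 1/n, which is ln n + O(1). The sketch below compares that value with a simulation; `count_hires`, `harmonic`, and `average_hires` are illustrative names, and distinct scores 0, 1, …, n – 1 stand in for candidate quality.

```python
import random
from fractions import Fraction


def count_hires(scores):
    """Count how often a candidate beats every earlier candidate (and is hired)."""
    best = float("-inf")
    hires = 0
    for score in scores:
        if score > best:
            best = score
            hires += 1
    return hires


def harmonic(n: int) -> Fraction:
    """Exact harmonic number 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def average_hires(n, trials, rng):
    """Average number of hires over random arrival orders of n distinct scores."""
    total = 0
    for _ in range(trials):
        order = list(range(n))
        rng.shuffle(order)  # uniformly random arrival order (assumption)
        total += count_hires(order)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    print(float(harmonic(50)), average_hires(50, 20000, rng))
```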
The On-line hiring problem
• Manager is willing to settle for a candidate who is close to the best, in exchange for hiring exactly once.
• After each interview, the manager must either immediately offer the position to the applicant or immediately reject the applicant.
• After the manager has seen j applicants, he knows which of the j has the highest score, but he does not know whether any of the remaining n – j applicants will receive a higher score.
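The On-line hiring problem
• Strategy: select a positive integer k < n, interview and then reject the first k applicants, and hire the first applicant thereafter who has a higher score than all preceding applicants. If no such applicant appears, hire the n th applicant.
• We wish to determine, for each possible value of k, the probability that we hire the most qualified applicant.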
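The On-line hiring problem
• $M(j) = \max_{1 \le i \le j} \{score(i)\}$ : the maximum score among applicants 1 through j
• S : event that we succeed in choosing the best-qualified applicant
• $S_i$ : event that we succeed when the best-qualified applicant is the i th one interviewed.
• Since the $S_i$ are disjoint, $\Pr\{S\} = \sum_{i=1}^{n} \Pr\{S_i\}$
• We never succeed when the best-qualified applicant is one of the first k, all of whom are rejected, so $\Pr\{S_i\} = 0$ for i = 1, 2, …, k and $\Pr\{S\} = \sum_{i=k+1}^{n} \Pr\{S_i\}$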
The On-line hiring problem
• $B_i$ : event that the best-qualified applicant is in position i.
• $O_i$ : event that none of the applicants in positions k + 1 through i – 1 is chosen, i.e. all of the values score(k + 1) through score(i – 1) must be less than M(k).
• $B_i$ and $O_i$ are independent, and $S_i = B_i \cap O_i$.
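The On-line hiring problem
• $\Pr\{B_i\} = 1/n$ , $\Pr\{O_i\} = k/(i - 1)$
• $\Pr\{S_i\} = \Pr\{B_i \cap O_i\} = \Pr\{B_i\}\,\Pr\{O_i\} = k/(n(i - 1))$
• $\Pr\{S\} = \sum_{i=k+1}^{n} \Pr\{S_i\} = \sum_{i=k+1}^{n} \frac{k}{n(i - 1)} = \frac{k}{n} \sum_{i=k}^{n-1} \frac{1}{i}$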
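The On-line hiring problem
• Bound the summation from above and below by integrals: $\int_{k}^{n} \frac{1}{x}\,dx \le \sum_{i=k}^{n-1} \frac{1}{i} \le \int_{k-1}^{n-1} \frac{1}{x}\,dx$
• Evaluating these integrals gives us the bounds $\frac{k}{n}(\ln n - \ln k) \le \Pr\{S\} \le \frac{k}{n}(\ln(n - 1) - \ln(k - 1))$
• To maximize the probability of success, focus on choosing the value of k that maximizes the lower bound on Pr{S}.
• Differentiating the expression (k / n) (ln n – ln k) with respect to k gives (1 / n) (ln n – ln k – 1); setting the derivative equal to 0 gives ln k = ln n – 1, i.e. k = n / e. With this k, the lower bound equals (1 / e) ln e = 1/e, so we will succeed in hiring our best-qualified applicant with prob. at least 1/e.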
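The choice k ≈ n/e can be checked numerically. The sketch below assumes the closed form Pr{S} = (k/n) · (1/k + 1/(k+1) + … + 1/(n–1)) for the reject-the-first-k strategy and searches all cutoffs; `success_probability` and `best_cutoff` are hypothetical names introduced here.

```python
import math


def success_probability(n: int, k: int) -> float:
    """Pr{S} = (k / n) * sum_{i=k}^{n-1} 1/i, valid for 1 <= k < n."""
    return (k / n) * sum(1.0 / i for i in range(k, n))


def best_cutoff(n: int) -> int:
    """Cutoff k in 1..n-1 that maximizes the success probability."""
    return max(range(1, n), key=lambda k: success_probability(n, k))


if __name__ == "__main__":
    n = 100
    k = best_cutoff(n)
    # Compare the best cutoff with n / e and its probability with 1 / e.
    print(k, n / math.e, success_probability(n, k), 1 / math.e)
```

For moderate n the best cutoff should land next to n/e, with a success probability slightly above 1/e.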
END