DZone Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world

Snippets has posted 5883 posts at DZone. View Full User Profile

09.17.2008
| 3436 views |
        The classical birthday paradox calculates the probability that any two items within a set share the same value.  For example, in a room of 23 people, there is a 50% chance that there exists a pair of people with the same birthday.  This is because there are n(n-1)/2 pairs of people in the room to form a collision.

The two-way birthday paradox occurs when there are two groups of items.  Specifically, it describes the probability that there exists an item in set A with the same value as an item in set B.  At first this sounds easy to calculate, but better mathematicians than I have proven me wrong. So I present to you the algorithm for calculating this probability.

p_0=\frac{1}{d^{m+n}}\sum_{i=1}^{m}\sum_{j=1}^{n}S_2(m,i)S_2(n,j)\prod_{k=0}^{i+j-1}d-k

http://en.wikipedia.org/wiki/Birthday_paradox (Under: Generalization to multiple types)


# This calculates the Stirling numbers of the second kind (http://en.wikipedia.org/wiki/Stirling_number)
def stirling2(n,m)
if n <= 0 or m <= 0
return []
end

s2 = [[1]]
1.upto(m-1) do |i|
s2[0] << 0
end

1.upto(n-1) do |i|
s2 << []
s2[i][0] = 1
1.upto(m-1) do |j|
s2[i][j] = (j+1)*s2[i-1][j] + s2[i-1][j-1]
end
end

s2[n-1][m-1]
end

# this calculates the probability of NO collisions between set A (m members) and set B (n members) where there
# are t distinct values (t is d in the Wikipedia article, e.g., 365)
def birthday2(m,n,t)
b2 = 0
1.upto(m) do |i|
1.upto(n) do |j|
acc = 1
0.upto(i+j-1) do |k|
acc *= t-k
end
acc *= stirling2(m,i)*stirling2(n,j)
b2 += acc
end
end
b2 /= (t**(m+n)).to_f;
end


"Starting from scratch" is seductive but disease ridden