DZone Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world

Snippets has posted 5883 posts at DZone. View Full User Profile

Two-way Birthday Paradox

09.17.2008
| 3357 views |
  • submit to reddit
        The classical birthday paradox calculates the probability that any two items within a set share the same value.  For example, in a room of 23 people, there is a 50% chance that there exists a pair of people with the same birthday.  This is because there are n(n-1)/2 pairs of people in the room to form a collision.  

The two-way birthday paradox occurs when there are two groups of items.  Specifically, it describes the probability that there exists an item in set A with the same value as an item in set B.  At first this sounds easy to calculate, but better mathematicians than I have proven me wrong. So I present to you the algorithm for calculating this probability.

p_0=\frac{1}{d^{m+n}}\sum_{i=1}^{m}\sum_{j=1}^{n}S_2(m,i)S_2(n,j)\prod_{k=0}^{i+j-1}d-k

http://en.wikipedia.org/wiki/Birthday_paradox (Under: Generalization to multiple types)

# This calculates the Stirling numbers of the second kind (http://en.wikipedia.org/wiki/Stirling_number)
def stirling2(n,m)
  if n <= 0 or m <= 0
    return []
  end
  
  s2 = [[1]]
  1.upto(m-1) do |i|
    s2[0] << 0
  end

  1.upto(n-1) do |i|
    s2 << []
    s2[i][0] = 1
    1.upto(m-1) do |j|
      s2[i][j] = (j+1)*s2[i-1][j] + s2[i-1][j-1]
    end
  end
  
  s2[n-1][m-1]
end

# this calculates the probability of NO collisions between set A (m members) and set B (n members) where there
# are t distinct values (t is d in the Wikipedia article, e.g., 365)
def birthday2(m,n,t)
  b2 = 0
  1.upto(m) do |i|
    1.upto(n) do |j|
      acc = 1
      0.upto(i+j-1) do |k|
        acc *= t-k
      end
      acc *= stirling2(m,i)*stirling2(n,j)
      b2 += acc
    end
  end
  b2 /= (t**(m+n)).to_f;
end