Login     Register

        Contact Us     Search

The Birthday Paradox

Nov 21

Written by: Charles Flock
11/21/2008 6:44 PM  RssIcon

What are the chances that any two people in the room have the same birthday?

As the holidays approach, you may find yourself in a room filled with celebratory people. If you are anything like me, you will survey the room and ask yourself, what are the chances that any two people in this room have the same birthday? Not the same birthday as me, and not the same birth date, just the same birthday.
 Let’s say that there are 30 people in the room and, to make it more interesting, I decide to place a bet that there are 2 people in the room (and I of course will disclaim any prior knowledge about birthdays) that have the same birthday. It would be even better if I could get odds, let’s say 2 to 1, but I would take even money. What are the odds?
The answer to this question is a classic probability problem called the birthday paradox, though it’s not really a paradox, it’s more of a puzzle since the answer seems to be so far removed from what intuition tells us. And the answer is that if there are 30 people in the room, the probability that 2 of them have the same birthday is 70.6%, a better than even-money bet.
 
How can this be? Let’s frame the problem the following way. If there are two people in the room, then the first person’s birthday can fall on any of 365 days (we are going to ignore leap years). For the second person’s birthday to not be the same day it can only fall on any of 364 days. If we add a third person, then his birthday can only fall on one of 363 days. To calculate the probability of them not having the same birthday we would use the following equation:
 
1 – (365/365 * 364/365 * 363/365) = 0.82%
 
So, in a room with 3 people in, the probability that 2 of them have the same birthday is less than 1% (0.82%).
 
It turns out that this is quite easy to calculate in SQL Server using common table expressions (CTE) and the XleratorDB PRODUCT function. First, we create a CTE that consists of the number of people and the number of unique days available and we add a select at the end to display the contents of the CTE:
;with birthdays as
(
    select cast(1 as float) as people
      ,365 as num_bd
    union all
    select people + 1
      ,num_bd - 1
      from birthdays
      where people < 366
   
)
select *
from birthdays
where people < 4
option (maxrecursion 366)
 
which returns the following:
 
people                 num_bd
---------------------- -----------
1                      365
2                      364
3                      363
 
(3 row(s) affected)
 
If we then just change the select statement, we can calculate the probability:
 
select 1 - wct.product(cast(num_bd as float)/365)
from birthdays
where people < 4
option (maxrecursion 366)
 
which returns the following result:
 
----------------------
0.00820416588478134
 
(1 row(s) affected)
 
It turns out that with just a minor adjustment, we can return the probability for any number of people, though we never have to do more than 366, because at that point the probability is 1. We will construct a CTE from the birthday CTE and then select the probabilities at 10, 20, 30, 40 and 50 people.
 
;with birthdays as
(
    select cast(1 as float) as people
      ,365 as num_bd
    union all
    select people + 1
      ,num_bd - 1
      from birthdays
      where people < 366
   
), bpd as  
(
select b.people
,(1 - wct.product(cast(a.num_bd as float)/365)) as prob
from birthdays a, birthdays b
where a.people between 1 and b.people
group by b.people
)
select people, prob
from bpd
where people in (10,20,30,40,50)
option (maxrecursion 366)
 
which returns the following result:
 
people                 prob
---------------------- ----------------------
10                     0.116948177711078
20                     0.41143838358058
30                     0.706316242719269
40                     0.891231809817949
50                     0.970373579577988
 
Now, we can ask ourselves, at what point does this become an even-money bet? In other words, at what point does the probability become greater than .5? We can find this simply by changing our select statement to find the smallest number of people where the probability is > .5. Using the same CTE:
 
select min(people) from bpd
where prob > .50
option (maxrecursion 366)
 
which returns the following result:
 
----------------------
23
 
(1 row(s) affected)
 
At 23 people it becomes an even-money bet. Obviously we could simply test for .99 instead of .50 to see how many people would need to be in the room for the probability to be greater than 99%:
 
select min(people) from bpd
where prob > .99
option (maxrecursion 366)
 
which returns the following result:
 
----------------------
57
 
(1 row(s) affected)
 
Finally, I have included this select to show all the points of interest we have talked about and their probabilities:
 
select people, prob
from bpd
where people in (3, 10,20,27,30,40,50,57)
option (maxrecursion 366)
 
which returns the following result:
 
people                 prob
---------------------- ----------------------
3                      0.00820416588478134
10                     0.116948177711078
20                     0.41143838358058
27                     0.626859282263242
30                     0.706316242719269
40                     0.891231809817949
50                     0.970373579577988
57                     0.99012245934117
 
(8 row(s) affected)
 
The next time that you are at a gathering and people discover that they have the same birthday, and they say ‘what a coincidence!’ you will know better. It wasn’t fate; it was math.
 
Happy Thanksgiving!

Tags:
Categories:
Copyright 2008-2017 Westclintech LLC         Privacy Policy        Terms of Service